CN111159400B - Product comment emotion classification method and system - Google Patents

Publication number
CN111159400B
Authority
CN
China
Prior art keywords
comment
attribute
classification
emotion
comments
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911315255.6A
Other languages
Chinese (zh)
Other versions
CN111159400A (en
Inventor
章红
王中卿
贾俊铖
那幸仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Application filed by Suzhou University
Priority to CN201911315255.6A
Publication of CN111159400A
Application granted
Publication of CN111159400B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/12 Hotels or restaurants

Abstract

The invention relates to a product comment emotion classification method comprising the following steps: obtaining the text information of comments related to an object product and converting it into a unified vector form; distinguishing and extracting at least one attribute tag of the object product based on the comments converted into vector form; predicting the emotion label of each comment based on the comments converted into vector form; and combining the attribute tag of the object product with the predicted emotion label of the comment to carry out joint learning of emotion classification and attribute classification for each comment of the object product, so as to find the commonality between the two. The method takes full account of the relation between the text and other factors, and improves the accuracy of comment star-rating prediction by extracting attribute information of the object product associated with each comment. Experiments show that the method improves the accuracy of comment star-rating classification to a certain extent.

Description

Product comment emotion classification method and system
Technical Field
The invention relates to the field of emotion classification, and in particular to a method and a system for classifying the emotion of product comments.
Background
With the rapid development of ubiquitous systems such as smartphones, tablet computers and wireless sensor networks, people now live in a hybrid network environment in which communication networks, the internet and sensor networks are integrated with one another. Comment-based emotion analysis is an important branch and research direction in the field of opinion mining; it mainly studies how to automatically extract a user's positive or negative attitude towards things from the user's comments.
Because emotion classification can, to a certain extent, bring order to the disorganized comment information on the internet and help users locate the information they need conveniently and accurately, it has become a key technology of great practical value and a powerful means of organizing and managing data, and it is valued by many researchers in the field. There are two kinds of emotion classification methods: those based on a semantic dictionary and those based on machine learning. Although emotion classification based on machine learning has been studied extensively, most studies work from the text alone: they use only text information to perform emotion classification in the traditional machine learning manner and do not consider other factors beyond the text, such as the attribute information of the product the text is about, even though these latent factors related to the text are also important. Taking a restaurant as an example, a user's evaluation of the restaurant is often related to factors such as its dining environment, price range, type and noise level. If these factors are not considered when classifying the emotion of comments, the accuracy and pertinence of the classification result are bound to suffer.
Disclosure of Invention
The invention aims to provide a method and a system for classifying the emotion of product comments that take full account of the relation between the text and other factors and improve the accuracy of comment star-rating prediction by extracting attribute information of the object product associated with each comment. Experiments show that the method improves the accuracy of comment star-rating classification to a certain extent.
In order to achieve the above purpose, the present invention provides the following technical solutions:
Referring to fig. 1, a method for classifying the emotion of product comments comprises:
S1: obtaining the text information of comments related to the object product and converting it into a unified vector form;
S2: distinguishing and extracting at least one attribute tag of the object product based on the comments converted into vector form;
S3: predicting the emotion label of each comment based on the comments converted into vector form;
S4: combining the attribute tag of the object product with the predicted emotion label of the comment, and carrying out joint learning of emotion classification and attribute classification for each comment of the object product to find the commonality between the two.
In a further embodiment, in step S1, obtaining the text information of comments related to the object product and converting it into a unified vector form comprises the following steps:
S11: collecting all words of every comment sentence $x$ in the training set into a vocabulary set $V$, in which each word has a corresponding number; $x_i$ denotes the $i$-th word of sentence $x$, and each word $x_i \in V$ consists of a sequence of characters $C_i = [c_1, c_2, \ldots, c_l]$, where $l$ is the maximum word length and every word is padded to this length;
S12: representing each sentence $x$ as $[V[x_1], V[x_2], \ldots, V[x_n]]$, where $n$ is the number of words in the sentence, $Q \in \mathbb{R}^{d \times |C|}$ is the representation matrix of the words, $C$ is the character vocabulary, and $d$ is the dimension of the character vector;
S13: passing the word vectors and the vocabulary set $V$ as input to an Embedding layer, which constructs the character vectors $C_i \in \mathbb{R}^{d \times l}$; the $C_i$ serve as the inputs to the LSTM.
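Steps S11 and S12 can be illustrated with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation used in the experiments: the function names, the `PAD` symbol and the toy sentences are hypothetical, and the Embedding layer and LSTM of step S13 are left out.

```python
from typing import Dict, List

PAD = "<pad>"  # hypothetical padding symbol; the source does not name one


def build_vocab(sentences: List[List[str]]) -> Dict[str, int]:
    """Collect every word of the training sentences into a vocabulary V,
    numbering each word in order of first appearance (S11)."""
    vocab: Dict[str, int] = {}
    for sentence in sentences:
        for word in sentence:
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode_sentence(sentence: List[str], vocab: Dict[str, int]) -> List[int]:
    """Represent a sentence x as [V[x_1], V[x_2], ..., V[x_n]] (S12)."""
    return [vocab[word] for word in sentence]


def pad_characters(word: str, max_len: int) -> List[str]:
    """Split a word into characters C_i = [c_1, ..., c_l], padded to length l."""
    chars = list(word)[:max_len]
    return chars + [PAD] * (max_len - len(chars))


# Toy training set (hypothetical data, already tokenized into words).
train = [["great", "tacos"], ["great", "price"]]
V = build_vocab(train)
l = max(len(word) for word in V)    # maximum word length in the vocabulary
ids = encode_sentence(train[1], V)  # word numbers of the second sentence
chars = pad_characters("ok", l)     # a short word padded to length l
```

In this sketch a word that is missing from `V` raises a `KeyError`; how unseen words are handled at test time is not stated in the source.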
In a further embodiment, in step S2, distinguishing and extracting at least one attribute tag of the object product based on the comments converted into vector form means predicting the attribute polarity of the object product with a binary classifier trained using a multi-layer perceptron, and comprises the following steps:
S21: representing the text information of the comment as an input vector $H$, and using the hidden layer of the LSTM to learn the higher-level features $H_B$;
S22: feeding the $H_B$ generated by the hidden layer into the softmax output layer, which predicts the attribute label of the commodity as $P_B$:
$P_B = \mathrm{softmax}(W_b H_B + B_b)$
where $W_b$ and $B_b$ are parameters of the experimental model.
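The softmax output layer of step S22 can be sketched in plain Python as follows. The weights, bias and feature vector are made-up numbers chosen only to show the computation of P_B = softmax(W_b H_B + B_b); in the method itself H_B would come from the LSTM hidden layer and W_b, B_b would be learned.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_head(W_b: List[List[float]], H_B: List[float],
                   B_b: List[float]) -> List[float]:
    """Compute P_B = softmax(W_b H_B + B_b) for one comment."""
    logits = [sum(w * h for w, h in zip(row, H_B)) + b
              for row, b in zip(W_b, B_b)]
    return softmax(logits)


# Hypothetical values: 2 hidden features, 3 attribute classes.
H_B = [0.5, -1.0]
W_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B_b = [0.0, 0.0, 0.0]
P_B = attribute_head(W_b, H_B, B_b)
predicted_class = P_B.index(max(P_B))
```

The entries of `P_B` are non-negative and sum to 1, so the largest entry can be read as the predicted attribute class.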
In a further embodiment, in step S3, predicting the emotion label of the comment based on the comments converted into vector form means predicting the emotion polarity of a given comment with a binary classifier trained using a multi-layer perceptron, and comprises the following steps:
S31: taking the text information of the comment as an input vector $H$, and using the hidden layer of the LSTM to learn the higher-level features $H_p$;
S32: feeding the $H_p$ generated by the hidden layer into the normal output layer, which predicts the emotion polarity label $P_p$:
$P_p = \mathrm{normal}(W_p H_p + B_p)$
where $W_p$ and $B_p$ are parameters of the experimental model.
In a further embodiment, in step S4, combining the attribute tag of the object product with the predicted emotion label of the comment, carrying out joint learning of emotion classification and attribute classification for each comment of the object product, and finding the commonality between the two comprises the following steps:
S41: taking the star rating $P$ of a comment and the attributes $Q(q_1, q_2, \ldots, q_n)$ as parameters for input, where $P$ is computed with a Gaussian distribution and $Q$ is classified with a standard softmax function; $P$ is used to predict the star rating of the comment and so obtain classification ability, and $Q$ is used to distinguish between different attributes and so obtain domain-invariant features;
S42: letting $F(x_i)$ be the feature vector corresponding to the input comment sentence, the star classifier $P$ maps the feature vector $F(x_i)$ to its corresponding label, and the parameters of this mapping are denoted $\theta_y$; the star classifier is trained with a cross-entropy loss on the $N_s$ source-domain samples, giving the star classification loss $L_{task}$,
where $y_i$ is the star classification label corresponding to $x_i \in S(x)$, $\hat{y}_i$ is the output of the top softmax layer, and $S(x)$ is the set of sample pairs of input comments and their star classification labels;
during training, the parameters $\theta_f$ and $\theta_y$ are optimized to minimize the classification loss $L_{task}$, where $\theta_f$ denotes the parameters of the mapping from $x_i$ to the feature vector $F(x_i)$;
S43: the attribute tag classifier maps the same hidden state $h$ to an attribute tag with parameters $\theta_d$, and distinguishes attribute tags by means of the attribute loss function $L_{type}$,
where $d_i$ is the correct attribute label of sample $i$, $\hat{d}_i$ is the output of the top layer, and $N_t$ refers to the $N_t$ samples from the target domain;
the attribute classifier is trained to a saddle point of the loss function by choosing $\theta_d$ to minimize the loss while choosing $\theta_f$ to maximize it;
optimizing $\theta_f$ ensures that the attribute tag classifier cannot distinguish between attributes; in other words, a commonality is found between star classification and attribute classification.
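The formulas for the losses $L_{task}$ and $L_{type}$ are not reproduced in this text. Assuming the standard cross-entropy form that the description names, they could be sketched as follows. The normalization by $N_s$ and $N_t$, the expression for $\hat{y}_i$, and the hat notation are assumptions made for illustration and are not taken from the patent's own formulas.

```latex
% Assumed reconstruction (standard cross-entropy), not the original formulas.
L_{task} = -\frac{1}{N_s} \sum_{i=1}^{N_s} y_i \log \hat{y}_i,
\qquad
\hat{y}_i = \mathrm{softmax}\big(\theta_y F(x_i)\big)

L_{type} = -\frac{1}{N_t} \sum_{i=1}^{N_t} d_i \log \hat{d}_i

% Saddle point: \theta_d minimizes L_{type} while \theta_f maximizes it.
\hat{\theta}_d = \arg\min_{\theta_d} L_{type}(\theta_f, \theta_d),
\qquad
\hat{\theta}_f = \arg\max_{\theta_f} L_{type}(\theta_f, \theta_d)
```

Under this reading, $\theta_f$ and $\theta_y$ are still trained to minimize $L_{task}$, so the shared features remain useful for star classification while carrying as little attribute-specific information as possible.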
Based on the product comment emotion classification method, the invention also provides a product comment emotion classification system, which comprises a text converter, an attribute marking classifier and a comment star classifier;
the text converter is used for converting text information into a unified vector form, the attribute marking classifier is used for distinguishing attribute information of products, and the comment star class classifier is used for distinguishing star class information of comments.
In a further embodiment, the emotion classification system is built by adopting a Multi-LSTM neural network, and comprises a plurality of LSTM layers, wherein the plurality of LSTM layers are divided into a main LSTM layer and an auxiliary LSTM layer;
the main LSTM layer is used for classifying the comments in star level on the basis of comment sentences; the auxiliary LSTM layer is used for distinguishing various attribute information of the object product on the basis of comment sentences.
The invention has the beneficial effects that:
1) The relation between the text and other factors is taken fully into account, and the accuracy of comment star-rating prediction is improved by extracting the attribute information of the object product associated with each comment. Experiments show that the method improves the accuracy of comment star-rating classification to a certain extent; moreover, the more samples are obtained, the better balanced the samples are, and the more attribute information takes part in the classification, the better the emotion classification of the comments.
2) A Multi-LSTM neural network is used to classify the emotion of the text. The Multi-LSTM neural network uses multiple LSTM layers, divided into a main LSTM layer and an auxiliary LSTM layer. Taking the experiment of the invention as an example, the main LSTM layer classifies the star rating based on the comment sentences, and the auxiliary LSTM layer distinguishes the various attribute information of the object product based on the comment sentences.
The foregoing is only an overview of the technical solution of the present invention. To give a better understanding of the invention, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a product review emotion classification method of the invention.
Fig. 2 is a schematic diagram of a product comment emotion classification system according to the present invention.
Fig. 3 is a schematic diagram of restaurant category distribution in the second embodiment.
Fig. 4 is a schematic diagram of distribution of selling price intervals in a restaurant in the second embodiment.
FIG. 5 is a graph comparing training effects of LSTM and Multi-LSTM at different training sample numbers.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
Example 1
The invention adopts a Multi-LSTM neural network for learning common characteristics between an attribute domain, namely an attribute classification block of a product, and a target domain, namely a star classification block of a comment. The architecture of the Multi-LSTM is shown in FIG. 2. The model consists of three parts: text converters, attribute-tag classifiers, and comment star classifiers. The text converter is used for converting the text information into a unified vector form, the attribute marking classifier is used for distinguishing the attribute information of the product, and the comment star class classifier is used for distinguishing the star class information of the comment. Next, we will describe the operation and implementation method of each part in detail.
1. Representation of text
The text converter takes a sentence as input. We denote the sentence by $x$ and its $i$-th word by $x_i$; the vocabulary set of all words in the training set is denoted $V$, and $C$ denotes the character vocabulary. $d$ is the dimension of the character vector, and $Q \in \mathbb{R}^{d \times |C|}$ is the representation matrix of the words. Each word $x_i \in V$ consists of a sequence of characters $C_i = [c_1, c_2, \ldots, c_l]$, where $l$ is the maximum word length and every word is padded to this length. $C_i \in \mathbb{R}^{d \times l}$ becomes the input to the LSTM. Concretely, all words of all sentences in the training set are collected into the vocabulary set $V$, in which each word has a corresponding number, so that each sentence can be expressed as $[V[x_1], V[x_2], \ldots, V[x_n]]$, where $n$ is the number of words in the sentence. The word vectors and the vocabulary set $V$ are passed as input to an Embedding layer, which constructs the character vectors $C_i$; the $C_i$ then serve as the inputs to the LSTM.
2. Classification of emotion
Emotion classification is the process of analyzing, processing, summarizing and reasoning about subjective, emotionally colored text, and finally classifying the attitude of the text as positive or negative. In the invention, emotion classification is treated as a regression problem. Emotion analysis aims to predict the emotion label of a comment text and to find the attitude of a reviewer or author towards a topic or product, so that the comment information of massive numbers of products can be analyzed and mined. In daily life, many consumers look up comments about a commodity on the internet before buying it. The amount of information on the internet is huge, and browsing all of it wastes a great deal of effort; emotion classification technology can sort through the comments effectively and give the user an overall impression of the product. Taking the experiment of the invention as an example, the emotion classification we perform is star classification: it judges the emotional tendency of a comment towards a restaurant, and the higher the star rating, the more positive the comment is about the restaurant. The input star labels are (0.2, 0.4, 0.6, 0.8, 1.0), corresponding to comments of 1, 2, 3, 4 and 5 stars respectively. Training the LSTM yields a continuous star rating. In the experiments, we predict the polarity of a given comment with a binary classifier trained using a multi-layer perceptron: the text information of the comment serves as the input vector $H$, and the hidden layer of the LSTM learns higher-level features. The $H_p$ generated by the hidden layer is fed into the normal output layer, which predicts the emotion polarity label $P_p$:
$P_p = \mathrm{normal}(W_p H_p + B_p)$
where $W_p$ and $B_p$ are parameters of the experimental model.
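The star labels described above can be illustrated with a short Python sketch. The mapping from stars to the labels 0.2 to 1.0 follows the text; the helper `label_to_star`, which rounds a continuous prediction back to the nearest star, is a hypothetical convenience and is not described in the source.

```python
# Star labels from the text: 1 to 5 stars map to 0.2, 0.4, 0.6, 0.8, 1.0.
STAR_LABELS = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0}


def star_to_label(stars: int) -> float:
    """Regression target for a comment with the given number of stars."""
    return STAR_LABELS[stars]


def label_to_star(score: float) -> int:
    """Hypothetical helper: round a continuous predicted score to the
    nearest star and clamp the result to the range 1..5."""
    return min(5, max(1, round(score * 5)))
```

For example, `star_to_label(4)` gives `0.8`, and a continuous prediction of `0.83` is read back as 4 stars by `label_to_star`.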
3. Classification of commodity attributes
Different commodities have different attribute information, and this attribute information sometimes becomes an advantage or disadvantage of the commodity and thus has a positive or negative effect on users' comments. Analyzing the relationship between commodity attributes and comments can therefore help the emotion classification of comments to a certain extent. First, however, the attribute information of the commodity must be distinguished. The classification of commodity attributes can be regarded as a classification task. In the experiments, we predict the attribute polarity of a given commodity with a binary classifier trained using a multi-layer perceptron: the text information of the comment is represented as an input vector $H$, and the hidden layer of the LSTM learns higher-level features. The $H_B$ generated by the hidden layer is fed into the softmax output layer, which predicts the attribute label of the commodity as $P_B$:
$P_B = \mathrm{softmax}(W_b H_B + B_b)$
where $W_b$ and $B_b$ are parameters of the experimental model.
Example two
Taking a restaurant as an example, classifying commodity attributes means classifying the attribute information of the restaurant. For the experiments, two relatively representative types of attribute information were selected: category information (Category) and price range information (Price Range).
1. Category information
Category information divides restaurants into different kinds. In this experiment, for example, restaurants are divided into ten cuisines: Mexican cuisine (Mexican), New American cuisine (American New), Traditional American cuisine (American Traditional), Italian cuisine (Italian), Chinese cuisine (Chinese), Japanese cuisine (Japanese), Mediterranean cuisine (Mediterranean), Thai cuisine (Thai), Vietnamese cuisine (Vietnamese) and other cuisines (Others). The category information of the restaurants is used as input for training, and at test time the category of the restaurant that a comment refers to is obtained. The emotion of a comment may be affected by a variety of factors, one of which is the category of the restaurant. As shown in examples 1 and 2 of Table 1 below, the comment in example 1 describes a good experience with Mexican food and praises the Mexican food at the restaurant. The comment in example 2 describes a self-service experience at a Japanese sushi restaurant; the reviewer likes sushi but did not like this experience.
The comment in example 1 is itself a positive comment, and the comment in example 2 is itself a neutral comment. Both comments also carry restaurant attribute information, Mexican cuisine in one and Japanese cuisine in the other, and in these examples both attributes have a positive effect on the emotional tendency of the comment.
Table 1 restaurant category attribute impact example
2. Price range information
Price range information distinguishes the prices of different restaurants. In this experiment, for example, restaurant prices are divided into five ranges (low, normal, high, very high, other). The price information of the restaurants is used as input for training, and at test time the price range of the restaurant that a comment refers to is obtained. Price is also a key factor influencing the emotion of a comment. As shown in examples 3 and 4 of Table 2 below, the comment in example 3 shows that the reviewer praises the restaurant's prices highly and cites a specific example to highlight how affordable they are. The comment in example 4 shows that the reviewer is slightly dissatisfied with the restaurant's prices and states the price he or she considers appropriate. Clearly, the price attribute information in example 3 has a positive effect on the emotional tendency of that comment, whereas the price attribute information in example 4 has a negative effect on the emotional tendency of that comment.
Table 2 Restaurant price range attribute impact example
3. Joint learning of emotion classification and commodity attribute classification
The star rating $P$ of a comment and the attributes $Q(q_1, q_2, \ldots, q_n)$ are taken as parameters for input; $P$ is computed with a Gaussian distribution and $Q$ is classified with a standard softmax function. $P$ is used to predict the star rating of the comment and so obtain classification ability, and $Q$ is used to distinguish between different attributes and so obtain domain-invariant features.
Let $F(x_i)$ be the feature vector corresponding to the input comment sentence. The star classifier $P$ maps the feature vector $F(x_i)$ to its corresponding label, and we denote the parameters of this mapping by $\theta_y$. The star classifier is trained with a cross-entropy loss $L_{task}$ on the $N_s$ source-domain samples,
where $y_i$ is the star classification label corresponding to $x_i \in S(x)$ and $\hat{y}_i$ is the output of the top softmax layer. During training, the parameters $\theta_f$ and $\theta_y$ are optimized to minimize the classification loss $L_{task}$. Thus $P(F(x_i))$ can be classified correctly in the target domain.
Conversely, the attribute classifier maps the same hidden state $h$ to an attribute label with parameters $\theta_d$. The purpose of the attribute classifier is to distinguish attribute tags by means of a loss function $L_{type}$,
where $d_i$ is the correct attribute label of sample $i$, $\hat{d}_i$ is the output of the top layer, and $N_t$ refers to the $N_t$ samples from the target domain. The attribute classifier is trained to a saddle point of the loss function by choosing $\theta_d$ to minimize the loss while choosing $\theta_f$ to maximize it. Optimizing $\theta_f$ ensures that the attribute tag classifier cannot distinguish between attributes; in other words, a commonality is found between star classification and attribute classification.
In general, the method starts from a comment about a restaurant and predicts its star rating, taking the comment itself as the basis and the attributes of the restaurant as an aid. It looks for the intrinsic relation between the restaurant's attributes and the comment and learns their common features, so as to obtain a better classification result than emotion classification based on the comment text alone.
Example III
Experiments were carried out with a long short-term memory neural network (LSTM) and the Multi-LSTM neural network, and the classification performance of our proposed model was verified by comparing the results.
1. Experimental setup
The experimental data come from Yelp, the largest review website in the United States. Attribute information about restaurants and the related comment information were collected, and suitable data were selected from them for training and testing. The specific settings of the different models are described below.
(1.1) evaluation criteria
The experiments use a single unified metric, the RMSE (root mean square error). The root mean square error is the square root of the sum of squared deviations between the predicted values and the true values divided by the number of observations $n$; it reflects well how far the predictions deviate from the true values. The RMSE is computed as:
$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(gr_i - pr_i)^2}$
where $gr_i$ is the true comment emotion score and $pr_i$ is the predicted comment emotional tendency.
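As a minimal sketch, the RMSE metric can be computed in plain Python as follows. The argument names `gr` and `pr` follow the symbols in the text; the sample scores are hypothetical and are not taken from the experiments.

```python
import math
from typing import Sequence


def rmse(gr: Sequence[float], pr: Sequence[float]) -> float:
    """Root mean square error between true scores gr and predictions pr."""
    if len(gr) != len(pr) or len(gr) == 0:
        raise ValueError("gr and pr must be non-empty and of equal length")
    squared_errors = [(g - p) ** 2 for g, p in zip(gr, pr)]
    return math.sqrt(sum(squared_errors) / len(gr))


# Hypothetical scores on the 0.2..1.0 star-label scale.
true_scores = [0.2, 0.6, 1.0]
predicted_scores = [0.4, 0.6, 0.8]
error = rmse(true_scores, predicted_scores)
```

A lower value means the predicted star scores lie closer to the true ones, and a value of 0 means every prediction is exact.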
(1.2) number of data sets and data distribution
The training set contains 1891 comments and the test set contains 3310 comments. To ensure effective training, the comments were screened for a balanced distribution; the resulting distribution is shown below.
Table 3 Star level distribution of comments
Fig. 3 is a schematic diagram of restaurant category distribution in the second embodiment. Fig. 4 is a schematic diagram of distribution of selling price intervals in a restaurant in the second embodiment.
(1.3) parameter setting
The neural network activation functions used in the experiments were normal and softmax, corresponding to the emotion prediction and the attribute classification of the comments, respectively. The Adam algorithm was used to optimize the model parameters; the other parameter settings are shown in Table 4.
Table 4 experimental parameter set table
2. Experimental result comparison and analysis
(2.1) Comparison with the baseline result
To evaluate the performance of our proposed Multi-LSTM, we compared its results with those of the LSTM alone; the comparison is shown in Table 5.
Table 5 comparison with reference System
Table 5 shows that the two models perform differently: the RMSE of the LSTM model is 0.260 and that of the Multi-LSTM model is 0.249, a difference of 0.011. The Multi-LSTM result is about 4% better than the LSTM result, a clear improvement. This shows that restaurant attribute information beyond the comment itself can assist the emotion regression prediction of comments, and that the prediction becomes more accurate on a better data set, that is, one in which the comment information and the restaurant attribute information are highly correlated.
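The figure of about 4% can be checked from the two RMSE values reported above, taking the LSTM result as the baseline. The variable names in this short sketch are illustrative.

```python
rmse_lstm = 0.260        # RMSE of the LSTM baseline, as reported
rmse_multi_lstm = 0.249  # RMSE of the Multi-LSTM model, as reported

absolute_gain = rmse_lstm - rmse_multi_lstm
relative_gain = absolute_gain / rmse_lstm  # improvement relative to the baseline

print(f"{absolute_gain:.3f}")  # 0.011
print(f"{relative_gain:.1%}")  # 4.2%
```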
(2.2) influence of different factors
In the experiments, the category and the price range of the restaurant were selected to improve the accuracy of emotion prediction for comments. To further compare the influence of the different factors on emotion prediction, we carried out a second set of experiments in which each factor was used on its own; the comparison is shown in Table 6.
TABLE 6 comparison of effects of different factors
The comparison shows that both the category information and the price range information of the restaurant help the emotion prediction of comments somewhat, although each improvement on its own is very slight. When the two factors are combined, more attribute information is captured and more comments can be distinguished, and the effect is clear. It can be expected that when a comment contains more attribute information about the restaurant, using several attribute classifiers will distinguish the comments better and so make the emotion regression prediction more effective.
(2.3) Effect of different training samples
To study the influence of the number of training samples on the experimental results, we divided the training set and took 10%, 20%, 30%, ..., 100% of it in turn, comparing how the Multi-LSTM model and the LSTM model perform with different numbers of samples. The comparison is shown in fig. 5.
It can be seen that the training results of both the LSTM model and the Multi-LSTM model fluctuate with the size of the training set, but the Multi-LSTM is generally better. The fluctuation may arise because the samples are not balanced at each new data set size, so that the correlation between the restaurant attribute information and the comment information in the training samples is sometimes high and sometimes low, leading to better or worse results. In general, however, the attribute information of the restaurant helps the emotion regression prediction of comments.
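The division of the training set into 10%, 20%, ..., 100% portions can be sketched as follows. How the portions were actually drawn is not stated in the text; the random shuffle, the fixed seed and the nesting of the subsets in this sketch are assumptions, and `training_subsets` is a hypothetical helper.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def training_subsets(samples: Sequence[T], seed: int = 0) -> List[List[T]]:
    """Return nested subsets holding 10%, 20%, ..., 100% of the samples.
    The samples are shuffled once (assumed) so each subset extends the last."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[: len(shuffled) * pct // 100]
            for pct in range(10, 101, 10)]


# With the 1891 training comments mentioned in the text (indices stand in
# for the comments themselves):
subsets = training_subsets(range(1891))
sizes = [len(s) for s in subsets]
```

Because a plain shuffle does not keep the star ratings balanced within each subset, a sketch like this would also show the kind of fluctuation the text attributes to unbalanced samples.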
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
The above examples represent only a few embodiments of the invention. They are described specifically and in detail, but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention. Accordingly, the scope of protection of the invention is determined by the appended claims.

Claims (4)

1. A product comment emotion classification method, characterized by comprising the following steps:
S1: obtaining text information of comments related to the object product, and converting the text into a unified vector form;
S2: distinguishing and extracting at least one attribute tag of the object product based on the comments converted into the vector form;
S3: predicting the emotion label of each comment based on the comments converted into the vector form;
S4: combining the attribute tag of the object product with the predicted emotion label of the comment, performing joint learning of emotion classification and attribute classification for each comment of the object product, and finding the commonality relation between the two;
in step S2, distinguishing and extracting at least one attribute tag of the object product based on the comments converted into the vector form means training a binary classifier with a multi-layer perceptron to predict the attribute polarity of the object product, and comprises the following steps:
S21: representing the text information of the comment as an input vector $H$, and using the hidden layer of an LSTM to learn a higher-level feature $H_B$;
S22: taking $H_B$ generated by the hidden layer as the input of the softmax output layer, and predicting the attribute label of the commodity through the softmax layer as $P_B$:
$P_B = \mathrm{softmax}(W_b H_B + B_b)$
wherein $W_b$ and $B_b$ are parameters of the experimental model;
in step S3, predicting the emotion label of the comment based on the comments converted into the vector form means training a binary classifier with a multi-layer perceptron to predict the emotion polarity of a given comment, and comprises the following steps:
S31: taking the text information of the comment as an input vector $H$, and using the hidden layer of an LSTM to learn a higher-level feature $H_p$;
S32: taking $H_p$ generated by the hidden layer as the input of the normal output layer, and predicting the emotion polarity label through the normal layer as $P_p$:
$P_p = \mathrm{normal}(W_p H_p + B_p)$
wherein $W_p$ and $B_p$ are parameters of the experimental model;
in step S4, combining the attribute tag of the object product with the predicted emotion label of the comment, performing joint learning of emotion classification and attribute classification for each comment of the object product, and finding the commonality relation between the two comprises the following steps:
S41: judging the star rating $P$ of a comment and each attribute $Q = (q_1, q_2, \ldots, q_n)$ from the input parameters, wherein $P$ is computed using a Gaussian distribution and $Q$ is classified using a standard softmax function; $P$ is used to predict the star rating of the comment so as to obtain classification ability, and $Q$ is used to distinguish the different attributes so as to obtain domain-invariant features;
S42: let $F(x_i)$ denote the feature vector corresponding to the input comment sentence; the star class classifier Q maps the feature vector $F(x_i)$ to its corresponding label, and the mapping parameters are denoted $\theta_y$; the star class classifier is trained on the $N_s$ samples of the source domain based on cross-entropy loss, and the star classification loss $L_{task}$ is expressed as:
[equation not reproduced in the source text]
wherein $y_i$ is the star class classification label corresponding to $x_i \in S(x)$, the prediction is the output of the top softmax layer, and $S(x)$ is the set of pairs of input comments and corresponding star class classification samples;
during training, the parameters $\theta_f$ and $\theta_y$ are optimized to minimize the classification loss $L_{task}$, where $\theta_f$ is the mapping parameter that maps $x_i$ to the feature vector $F(x_i)$;
S43: the attribute tag classifier maps the same hidden state $h$ to an attribute tag with parameter $\theta_d$, and aims to distinguish attribute tags through the following attribute loss function $L_{type}$:
[equation not reproduced in the source text]
wherein $d_i$ is the correct attribute label of sample $i$, the prediction is the output of the top layer, and $N_t$ refers to the $N_t$ samples from the target domain;
by minimizing the loss with respect to $\theta_d$ while maximizing the loss with respect to $\theta_f$, the attribute classifier is trained to the saddle point of the loss function;
optimizing $\theta_f$ ensures that the attribute tag classifier cannot distinguish between attributes; in other words, a commonality relation is found between star class classification and attribute classification.
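The softmax head of step S22 and the two losses of steps S42 and S43 can be sketched as follows, in plain Python. This is an illustration under stated assumptions, not the claimed implementation: the formulas for $L_{task}$ and $L_{type}$ are not reproduced in the text, so the standard cross-entropy form is assumed for both; the helper names `linear` and `joint_objective` and the trade-off weight `lam` are hypothetical; and combining the losses as `l_task - lam * l_type` follows the usual domain-adversarial formulation, which the claim does not state explicitly.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(W, h, b):
    """Compute W h + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, h)) + bias
            for row, bias in zip(W, b)]

def cross_entropy(probs, label):
    """Cross-entropy loss of one sample against its integer class label."""
    return -math.log(probs[label])

def joint_objective(h, star_label, attr_label, W_y, b_y, W_d, b_d, lam=0.1):
    """Return (l_task, l_type, objective) for one comment's hidden state h.

    Assumed saddle-point reading: the feature extractor (theta_f) and the
    star classifier (theta_y) would descend on `objective`, while the
    attribute classifier (theta_d) would descend on `l_type` alone.
    """
    p_star = softmax(linear(W_y, h, b_y))   # star class prediction
    p_attr = softmax(linear(W_d, h, b_d))   # attribute tag prediction
    l_task = cross_entropy(p_star, star_label)
    l_type = cross_entropy(p_attr, attr_label)
    return l_task, l_type, l_task - lam * l_type
```

Because `l_type` enters the objective with a negative sign, lowering the objective with respect to the shared features raises the attribute loss, which is one way to read "minimizing the loss with respect to θ_d while maximizing it with respect to θ_f".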
2. The product comment emotion classification method according to claim 1, wherein in step S1, obtaining text information of comments related to the object product and converting the text into a unified vector form comprises the following steps:
S11: collecting all words of each comment sentence $x$ in the training set into a vocabulary set $V$, wherein each word has a corresponding index, $x_i$ denotes the $i$-th word in sentence $x$, and each word $x_i \in V$ is composed of a sequence of characters $C_i = [c_1, c_2, \ldots, c_l]$, wherein $l$ is the maximum word length and each word is padded to this length;
S12: representing each sentence $x$ as $[V[x_1], V[x_2], \ldots, V[x_n]]$, where $n$ is the number of words contained in the sentence, $Q \in R^{d \times |C|}$ is the representation matrix, $C$ is the character vocabulary, and $d$ is the dimension of the character vector;
S13: taking the word vectors and the vocabulary set $V$ as input and passing them to the Embedding layer, the Embedding layer constructing the character vector $C_i \in R^{d \times l}$, and using $C_i$ as the input of the LSTM.
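The vocabulary construction and padding of steps S11 and S12 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, words are split on whitespace (Chinese comments would need a word segmenter first), and the padding symbol is an arbitrary choice; the Embedding layer of step S13 is not shown.

```python
def build_vocab(sentences):
    """Collect every word of the training sentences into a word -> index map."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            # First occurrence of a word fixes its index.
            vocab.setdefault(word, len(vocab))
    return vocab

def encode_sentence(sentence, vocab):
    """Represent a sentence as the list of indices of its words."""
    return [vocab[word] for word in sentence.split()]

def pad_characters(word, max_len, pad="_"):
    """Split a word into characters and pad (or truncate) to max_len."""
    chars = list(word[:max_len])
    return chars + [pad] * (max_len - len(chars))
```

With the training sentences "good food" and "good price", `build_vocab` assigns the indices good=0, food=1, price=2, so "good price" is encoded as `[0, 2]`.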
3. A product comment emotion classification system based on the product comment emotion classification method of claim 1, wherein the emotion classification system comprises a text converter, an attribute tag classifier and a comment star class classifier;
the text converter is used for converting text information into a unified vector form, the attribute tag classifier is used for distinguishing the attribute information of products, and the comment star class classifier is used for distinguishing the star class information of comments.
4. The product comment emotion classification system of claim 3, wherein the emotion classification system is built from a Multi-LSTM neural network and comprises a plurality of LSTM layers, wherein the plurality of LSTM layers are divided into a main LSTM layer and an auxiliary LSTM layer;
the main LSTM layer is used for classifying the comments by star class on the basis of the comment sentences; the auxiliary LSTM layer is used for distinguishing the various attribute information of the object product on the basis of the comment sentences.
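How the components named in claims 3 and 4 fit together can be sketched as follows. This shows only the wiring, not an LSTM implementation: the class name `MultiLSTMClassifier` and its parameters are hypothetical, the encoders and heads are plain callables standing in for trained LSTM layers and output layers, and the use of one auxiliary encoder shared by several attribute heads is an assumption.

```python
class MultiLSTMClassifier:
    """Wiring sketch: text converter, main branch, auxiliary branch."""

    def __init__(self, text_converter, main_encoder, aux_encoder,
                 star_head, attr_heads):
        self.text_converter = text_converter  # comment text -> token ids
        self.main_encoder = main_encoder      # stands in for the main LSTM layer
        self.aux_encoder = aux_encoder        # stands in for the auxiliary LSTM layer
        self.star_head = star_head            # hidden state -> star class
        self.attr_heads = attr_heads          # attribute name -> (hidden state -> label)

    def predict(self, comment):
        """Return (star class, {attribute name: label}) for one comment."""
        token_ids = self.text_converter(comment)
        star = self.star_head(self.main_encoder(token_ids))
        aux_state = self.aux_encoder(token_ids)
        attributes = {name: head(aux_state)
                      for name, head in self.attr_heads.items()}
        return star, attributes
```

In a real system the callables would be replaced by trained network layers; the sketch only makes explicit that both branches read the same converted comment.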
CN201911315255.6A 2019-12-19 2019-12-19 Product comment emotion classification method and system Active CN111159400B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911315255.6A CN111159400B (en) 2019-12-19 2019-12-19 Product comment emotion classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911315255.6A CN111159400B (en) 2019-12-19 2019-12-19 Product comment emotion classification method and system

Publications (2)

Publication Number Publication Date
CN111159400A (en) 2020-05-15
CN111159400B (en) 2023-09-26

Family

ID=70557348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911315255.6A Active CN111159400B (en) 2019-12-19 2019-12-19 Product comment emotion classification method and system

Country Status (1)

Country Link
CN (1) CN111159400B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800184B (en) * 2021-01-13 2021-08-06 华东师范大学 Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction
CN113806545B (en) * 2021-09-24 2022-06-17 重庆理工大学 Comment text emotion classification method based on label description generation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066445A (en) * 2017-04-11 2017-08-18 华东师范大学 The deep learning method of one attribute emotion word vector
CN107862343A (en) * 2017-11-28 2018-03-30 南京理工大学 The rule-based and comment on commodity property level sensibility classification method of neutral net
CN108363753A (en) * 2018-01-30 2018-08-03 南京邮电大学 Comment text sentiment classification model is trained and sensibility classification method, device and equipment
CN109492229A (en) * 2018-11-23 2019-03-19 中国科学技术大学 A kind of cross-cutting sensibility classification method and relevant apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080249764A1 (en) * 2007-03-01 2008-10-09 Microsoft Corporation Smart Sentiment Classifier for Product Reviews

Also Published As

Publication number Publication date
CN111159400A (en) 2020-05-15

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant