CN113343665A - Commodity comment sentiment analysis method and system based on aspect-level fine granularity - Google Patents
- Publication number
- CN113343665A (CN202110529175.1A)
- Authority
- CN
- China
- Prior art keywords
- commodity comment
- root node
- comment data
- acquiring
- commodity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Business, Economics & Management (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Marketing (AREA)
- Economics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a commodity comment sentiment analysis method and system based on aspect-level fine granularity, a storage medium and electronic equipment, and relates to the field of sentiment analysis. The method adopts an aspect dependency tree algorithm to construct a syntax tree with the aspect words as root nodes, and stores the syntax tree with an image matrix; acquiring initial characteristic representation of a root node by adopting a recursive algorithm according to the image matrix; introducing a pre-constructed emotion dictionary according to the initial feature representation of the syntax tree and the root node, and acquiring the final feature representation of the root node; and acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node. The method constructs a dependency tree algorithm with the aspect words as root nodes, carries out emotion prediction on each aspect word, breaks through the limitation that comments only cover one emotion tendency, and improves the accuracy of predicting the aspect level emotion tendency; the merchant can accurately know the information for management; the method is helpful for customers to better select commodities.
Description
Technical Field
The invention relates to the technical field of sentiment analysis, in particular to a commodity comment sentiment analysis method and system based on aspect-level fine granularity, a storage medium and electronic equipment.
Background
The commodity comment data is the evaluation of the customers on the aspects of product quality, price, service and the like after the electronic commerce transaction is completed. The commodity comment data becomes an important information source for merchants to acquire customer consumption preference and develop accurate marketing. Such evaluation sets tend to have strong emotional tendencies. The emotional tendency of the client can be researched, so that the acceptance of the client to an enterprise can be measured, and the consumption preference of the client can be mined.
Currently, two techniques are mainly used for predicting the commodity evaluation emotion: firstly, based on traditional machine learning, the main form is to obtain a commodity comment data set and predict emotion by using a machine learning algorithm; secondly, based on a sequence neural network model, the main form is that a time sequence neural network is used for obtaining sentence-level feature vectors for direct prediction based on word vectors.
However, a traditional machine learning model is relatively difficult to construct. Traditional machine learning algorithms usually require complex feature engineering (its content is scattered and loosely defined, and roughly covers feature extraction and feature selection); that is, implementing feature extraction and feature selection for machine learning is complex, whereas deep learning extracts and represents information through a multilayer neural network. The sequence neural network model cannot process data with a graph structure: syntax, and even sentences, can be regarded as graph-structured data, yet the sequence neural network model does not use this structure effectively. The sequence neural network also cannot effectively process long-range grammatical dependencies. Therefore, how to provide a more convenient and more comprehensive commodity comment sentiment analysis scheme has become a problem that urgently needs to be solved.
Disclosure of Invention
(I) Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a commodity comment sentiment analysis method, a system, a storage medium and electronic equipment based on aspect-level fine granularity, and solves the technical problems that the existing commodity comment sentiment analysis scheme is complex and comment data cannot be fully utilized.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
A commodity comment sentiment analysis method based on aspect-level fine granularity comprises the following steps:
S1, acquiring and preprocessing commodity comment data;
S2, obtaining the aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
S3, constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and storing the syntax tree with an image matrix;
S4, acquiring initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
S5, introducing a pre-constructed emotion dictionary according to the initial feature representation of the syntax tree and the root node, and acquiring the final feature representation of the root node;
and S6, acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
Preferably, the preprocessing procedure in step S1 includes:
analyzing the commodity comment data by adopting a Stanford Parser to obtain entity information in the commodity comment data, marking an entity as 1 and marking the rest as 0; and mapping each part-of-speech tag after the commodity comment data is analyzed into a corresponding vector, and splicing the vector with the obtained mark of each word to obtain an intermediate vector.
Preferably, the step S2 specifically includes:
S21, acquiring an embedded vector matrix of the commodity comment data by adopting a GloVe model, splicing the embedded vector matrix with the intermediate vector acquired in the preprocessing step, inputting the spliced vector matrix into an LSTM network, and acquiring a final vector of each word;
and S22, obtaining the position of the aspect word in the commodity comment data by adopting a deep learning method according to the final vector, and extracting the aspect word according to the position.
Preferably, the step S3 specifically includes:
S31, setting the aspect words as root nodes of the grammar tree;
S32, determining a next-layer node of the root node according to a grammar dependency tree obtained after the commodity comment data are analyzed and a grammar dependency relationship of the aspect words in the grammar dependency tree;
S33, determining the next layer of nodes according to the grammar dependency relationship of the corresponding words of the nodes in the grammar dependency tree;
and S34, repeatedly executing the step S33 until all words in the commodity comment data are traversed, acquiring the grammar tree, and storing the grammar tree in an image matrix.
Preferably, the step S4 specifically includes:
S41, fusing the image matrix and the embedded vector matrix to obtain a fusion matrix;
S42, averaging the numerical values of each layer of the fusion matrix according to the syntax tree structure;
S43, acquiring a correction matrix according to the averaged fusion matrix and the step function;
and S44, inputting the correction matrix into an LSTM network to obtain the initial characteristic representation of the root node.
Preferably, the step S5 specifically includes:
according to the emotion dictionary, searching for the emotion label scores corresponding to the direct child nodes of the root node in the grammar tree, and splicing the initial feature representation of the root node with the search result to obtain the final feature representation of the root node; the emotion label score is in the range of -1 to 1, wherein -1 represents the most negative and 1 represents the most positive.
Preferably, the emotional tendencies include positive, negative, neutral.
A commodity comment sentiment analysis system based on aspect level fine granularity comprises:
the comment acquisition module is used for acquiring and preprocessing commodity comment data;
the aspect word acquisition module is used for acquiring aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
the syntax tree construction module is used for constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm and storing the syntax tree with an image matrix;
the initial characteristic representation module is used for obtaining the initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
the final feature representation module is used for introducing a pre-constructed emotion dictionary according to the grammar tree and the initial feature representation of the root node to acquire the final feature representation of the root node;
and the emotion analysis module is used for acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
A storage medium storing a computer program for commodity comment sentiment analysis based on aspect-level fine granularity, wherein the computer program causes a computer to execute the commodity comment sentiment analysis method as described above.
An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the method for sentiment analysis of merchandise reviews as described above.
(III) advantageous effects
The invention provides a commodity comment sentiment analysis method, a commodity comment sentiment analysis system, a storage medium and electronic equipment based on aspect level fine granularity. Compared with the prior art, the method has the following beneficial effects:
the method comprises the steps of obtaining aspect words contained in preprocessed commodity comment data; constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and storing the syntax tree with an image matrix; acquiring initial characteristic representation of a root node by adopting a recursive algorithm according to the image matrix; introducing a pre-constructed emotion dictionary according to the initial feature representation of the syntax tree and the root node, and acquiring the final feature representation of the root node; and acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node. According to the method, each aspect word in the commodity comment data is positioned, the dependency tree algorithm with the aspect word as a root node is constructed, emotion prediction is carried out on each aspect word, the limitation that the comment only covers one emotion tendency is broken, and the accuracy of predicting the emotion tendency of the aspect level is improved; the merchant can accurately know the information for management; the method is helpful for customers to better select commodities.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flowchart of a commodity comment sentiment analysis method based on aspect-level fine granularity according to an embodiment of the present invention;
FIG. 2 is a diagram of a syntax dependency tree according to an embodiment of the present invention;
FIG. 3(a) is a schematic diagram of a syntax tree whose root node is an aspect word according to an embodiment of the present invention;
FIG. 3(b) is a diagram of another syntax tree whose root node is an aspect word according to an embodiment of the present invention;
FIG. 4 is a structural block diagram of a commodity comment sentiment analysis system based on aspect-level fine granularity according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
By providing the commodity comment sentiment analysis method, the commodity comment sentiment analysis system, the storage medium and the electronic equipment based on the aspect-level fine granularity, the technical problems that the existing commodity comment sentiment analysis scheme is complex and comment data cannot be fully utilized are solved, the limitation that comments only cover one sentiment tendency is broken, and the accuracy of predicting the aspect-level sentiment tendency is improved; the merchant can accurately know the information for management; the method is helpful for customers to better select commodities.
In order to solve the technical problems, the general idea of the embodiment of the application is as follows:
The embodiment of the application acquires the aspect words contained in the preprocessed commodity comment data; constructs a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and stores the syntax tree with an image matrix; acquires the initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix; introduces a pre-constructed emotion dictionary according to the syntax tree and the initial feature representation of the root node, and acquires the final feature representation of the root node; and acquires the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node. The embodiment of the invention locates each aspect word in the commodity comment data, constructs a dependency tree with the aspect word as the root node, and carries out emotion prediction on each aspect word, which breaks through the limitation that a comment only covers one emotion tendency and improves the accuracy of predicting aspect-level emotion tendency; the merchant can accurately obtain this information for management; and customers are helped to better select commodities.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
In a first aspect, as shown in fig. 1, an embodiment of the present invention provides a commodity comment sentiment analysis method based on aspect-level fine granularity, including:
S1, acquiring and preprocessing commodity comment data;
S2, obtaining the aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
S3, constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and storing the syntax tree with an image matrix;
S4, acquiring initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
S5, introducing a pre-constructed emotion dictionary according to the initial feature representation of the syntax tree and the root node, and acquiring the final feature representation of the root node;
and S6, acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
The embodiment of the invention positions each aspect word in the commodity comment data, constructs a dependency tree algorithm taking the aspect word as a root node, and carries out emotion prediction on each aspect word, breaks through the limitation that the comment only covers one emotion tendency, and improves the accuracy of predicting the emotion tendency of the aspect level; the merchant can accurately know the information for management; the method is helpful for customers to better select commodities.
The following describes the steps of the embodiment of the present invention with reference to specific contents:
S1, acquiring and preprocessing commodity comment data.
The embodiment of the invention adopts a crawler tool to obtain a commodity comment data set from an electronic commerce platform, wherein the data set comprises the commodity comment data 'great food but the service was dreadful'.
The preprocessing process comprises the following steps:
analyzing the commodity comment data, namely 'great food but the service was dreadful', by using a Stanford Parser to obtain entity information in the commodity comment data, marking each entity as 1 and marking the rest as 0; mapping each part-of-speech tag obtained by analyzing the commodity comment data into a corresponding vector, and splicing the vector with the obtained mark of each word to obtain an intermediate vector; and meanwhile obtaining the syntax dependency tree shown in fig. 2.
In addition, the embodiment of the present invention may also construct an entity set related to the commodity comment data and directly query whether each word in 'great food but the service was dreadful' is in the set; if it is, the word is marked as 1, and otherwise it is marked as 0.
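The set-lookup variant of the preprocessing step can be sketched as follows. This is only an illustration under stated assumptions: the entity set, the part-of-speech tags, and the one-hot encoding of the tags are hypothetical choices, since the text only says that each tag is mapped to "a corresponding vector".

```python
# Hypothetical entity set and tag inventory, chosen only for this example.
ENTITY_SET = {"food", "service"}
POS_TAGS = ["JJ", "NN", "CC", "DT", "VBD"]

# Example comment with assumed part-of-speech tags (not taken from the patent).
TAGGED = [
    ("great", "JJ"), ("food", "NN"), ("but", "CC"), ("the", "DT"),
    ("service", "NN"), ("was", "VBD"), ("dreadful", "JJ"),
]


def intermediate_vectors(tagged_tokens, entity_set, pos_tags):
    """Return one vector per word: a one-hot tag vector spliced with the 0/1 entity mark."""
    result = []
    for word, tag in tagged_tokens:
        # Assumption: each tag is mapped to a one-hot vector over pos_tags.
        one_hot = [1.0 if tag == candidate else 0.0 for candidate in pos_tags]
        # Entity mark: 1 if the word is in the entity set, otherwise 0.
        mark = 1.0 if word.lower() in entity_set else 0.0
        result.append(one_hot + [mark])
    return result


vectors = intermediate_vectors(TAGGED, ENTITY_SET, POS_TAGS)
entity_marks = [vector[-1] for vector in vectors]
```

Here the last component of each intermediate vector is the entity mark, so "food" and "service" carry 1 and every other word carries 0.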
S2, obtaining the aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data. The method specifically comprises the following steps:
S21, acquiring an embedded vector matrix of the commodity comment data by adopting a GloVe model, splicing the embedded vector matrix with the intermediate vector acquired in the preprocessing step, inputting the spliced vector matrix into an LSTM network, and acquiring a final vector of each word.
Principle of the GloVe model: words are represented as vectors so that the vectors carry as much semantic and grammatical information as possible. The model is trained on a large amount of text data and maps each English word to a vector. glove.840B.300d is a pre-trained model that maps each word to a 300-dimensional vector. The embodiment of the invention directly introduces this pre-trained word vector model.
The embedded vector matrix size is 7 x 300 dimensions, expressed as:
S22, obtaining the position of the aspect word in the commodity comment data by adopting a deep learning method according to the final vector, and extracting the aspect word according to the position.
S3, constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and storing the syntax tree with an image matrix; the method specifically comprises the following steps:
S31, setting the aspect word as the root node of the syntax tree, for example, taking "food" as the root node, as shown in fig. 3(a).
In the embodiment of the present invention, the word index of the above commodity comment data starts from 0, that is, 0: great, 1: food, and so on. When "food" is used as the root node, graph[0][1] = 1 is initialized first.
S32, as shown in fig. 2, determining a next-level node of the root node "food" according to the syntax dependency tree obtained by parsing the product review data and the syntax dependency relationship of the aspect word in the syntax dependency tree.
According to the syntax dependency tree shown in FIG. 2, first find the nodes that "food" points to, namely "great" (index: 0) and "dreadful" (index: 6); then check whether any node points to "food", and no such node exists; so graph[1][0] = 1 and graph[1][6] = 1.
S33, determining the next layer of nodes according to the grammar dependency relationship of the corresponding words of the nodes in the grammar dependency tree.
Among the remaining words "but, the, service, was", find the words that have grammatical dependencies on the words already accessed, which gives "but" (index: 2), "service" (index: 4) and "was" (index: 5).
S34, repeatedly executing step S33 until all words in the commodity comment data are traversed, obtaining the grammar tree shown in fig. 3(a), and storing the grammar tree in an image matrix.
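The image matrix has 4 x 7 dimensions (one row per layer of the syntax tree, one column per word index), expressed as:
\[
\mathrm{graph} =
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\]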
Of course, a syntax tree with "service" as the root node, as shown in fig. 3(b), can be constructed in the same way using the aspect dependency tree algorithm described above, and details are not repeated here.
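Steps S31 to S34 amount to a layer-by-layer traversal outward from the aspect word. The sketch below is one possible reading, not the patented implementation: the dependency pairs in `DEPENDENCY_EDGES` are assumed from the description of the example (FIG. 2 itself is not reproduced here), and the function name `build_layer_matrix` is hypothetical.

```python
from collections import deque

WORDS = ["great", "food", "but", "the", "service", "was", "dreadful"]

# Assumed dependency pairs (word index, word index), inferred from the text:
# food-great, food-dreadful, dreadful-but, dreadful-service, dreadful-was, service-the.
DEPENDENCY_EDGES = [(1, 0), (1, 6), (6, 2), (6, 4), (6, 5), (4, 3)]


def build_layer_matrix(num_words, edges, root):
    """Return a layers x words 0/1 matrix; row k marks the words at depth k from root."""
    # Dependencies are followed in both directions, as in steps S32 and S33.
    neighbours = {index: set() for index in range(num_words)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first traversal assigns each reachable word its layer.
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours[node]):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    graph = [[0] * num_words for _ in range(max(depth.values()) + 1)]
    for node, layer in depth.items():
        graph[layer][node] = 1
    return graph


graph_food = build_layer_matrix(len(WORDS), DEPENDENCY_EDGES, root=1)
graph_service = build_layer_matrix(len(WORDS), DEPENDENCY_EDGES, root=4)
```

With "food" (index 1) as the root, this gives four layers for the seven words, which matches the 4 x 7 size stated above; calling the same function with index 4 builds the tree rooted at "service".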
S4, acquiring initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix; the method specifically comprises the following steps:
S41, fusing the image matrix and the embedded vector matrix to obtain a fusion matrix.
This step carries out feature fusion on the obtained grammar tree rooted at the aspect word. In the previous step, the embodiment of the present invention maps each layer of the syntax tree to a node. Taking fig. 3(a) as an example, the tree is mapped into 4 layers, one node per layer, and fused with the embedded vector matrix by the following formula:
S42, averaging the values of each layer of the fusion matrix according to the syntax tree structure.
In order to retain the original characteristics, this step averages each layer of the fusion matrix Node_Feature by the following method:
S43, acquiring a correction matrix according to the averaged fusion matrix and the step function.
When the LSTM network is adopted for recursive updating, the main idea of the embodiment of the invention is that the farther a node is from the tree root, the smaller its influence on the root node; conversely, the closer a node is to the root node, the greater its influence on the root node. This step therefore proposes the step function 1/distance (distance is the distance of the layer feature from the root node, taking the values 1, 2, 3, …). The correction matrix is:
S44, inputting the correction matrix into an LSTM network to obtain the initial characteristic representation of the root node.
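Steps S41 to S43 can be illustrated with a small numeric sketch. The formulas themselves are not reproduced in this text, so everything below is an assumption: fusion is taken to be the matrix product of the 4 x 7 image matrix with the word embedding matrix, averaging divides each layer by its number of words, and the step function assigns the root layer distance 1, the next layer distance 2, and so on. The 2-dimensional toy embeddings stand in for the 300-dimensional GloVe vectors, and the LSTM of step S44 is omitted.

```python
# Layer matrix for the "food" example (rows: layers, columns: word indices).
GRAPH = [
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0],
]

# Toy 7 x 2 embeddings used only for illustration.
EMBEDDINGS = [[float(index), 1.0] for index in range(7)]


def fuse(graph, embeddings):
    """Assumed fusion: matrix product (layers x words) times (words x dim)."""
    dim = len(embeddings[0])
    return [
        [sum(row[w] * embeddings[w][k] for w in range(len(row))) for k in range(dim)]
        for row in graph
    ]


def average_layers(graph, fused):
    """Divide each layer's summed feature by the number of words in that layer."""
    averaged = []
    for row, feature in zip(graph, fused):
        count = sum(row) or 1  # guard against an empty layer
        averaged.append([value / count for value in feature])
    return averaged


def apply_step_function(averaged):
    """Scale layer k by 1/distance, assuming the root layer has distance 1."""
    return [
        [value / (layer + 1) for value in feature]
        for layer, feature in enumerate(averaged)
    ]


fused = fuse(GRAPH, EMBEDDINGS)
averaged = average_layers(GRAPH, fused)
corrected = apply_step_function(averaged)
```

Each row of `corrected` is one layer feature; under these assumptions the rows would then be fed to the LSTM in order to obtain the initial feature representation of the root node.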
And S5, introducing a pre-constructed emotion dictionary according to the grammar tree and the initial feature representation of the root node, and acquiring the final feature representation of the root node.
In order to improve the efficiency of the model and the interpretability of the neural network, an emotion dictionary is constructed; the emotion dictionary mainly selects words with emotional color from the commodity comment data set. According to the embodiment of the invention, the words are scored on the basis of knowledge from psychology, linguistics and other fields, each word is scored on the two labels of positive and negative, and the emotion label score is in the range of -1 to 1, wherein -1 represents the most negative and 1 represents the most positive.
Using the syntax tree constructed above with the aspect word as the root, the embodiment of the present invention finds that if the child nodes of the tree root have strong emotional color, they largely determine the emotional color of that aspect of the commodity.
Therefore, in this step, firstly, the emotion label scores corresponding to the direct child nodes of the root node in the grammar tree are looked up in the emotion dictionary; then the initial feature representation of the root node is spliced with the lookup result; finally, the final feature representation of the root node is obtained.
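The lookup-and-splice step can be sketched as below. The dictionary scores are made up for illustration, and padding the looked-up scores to a fixed length (`max_children`) is an assumption introduced here so that the spliced vector always has the same size; the text does not say how a variable number of child nodes is handled.

```python
# Hypothetical emotion dictionary with scores in the range -1 to 1.
EMOTION_DICTIONARY = {"great": 0.8, "dreadful": -0.9}


def final_feature(initial_feature, child_words, dictionary, max_children=4):
    """Splice the root's initial feature with the scores of its direct child words."""
    # Words missing from the dictionary are treated as neutral (score 0.0).
    scores = [dictionary.get(word, 0.0) for word in child_words[:max_children]]
    # Pad so that the result has a fixed length regardless of the child count.
    scores += [0.0] * (max_children - len(scores))
    return list(initial_feature) + scores


# Direct children of the root "food" in the example: "great" and "dreadful".
example = final_feature([0.1, 0.2], ["great", "dreadful"], EMOTION_DICTIONARY)
```

In this toy case the two-dimensional initial feature stands in for the LSTM output of step S44.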
S6, acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
Specifically, in this step, the final feature representation of the root node is input into a pre-trained emotion classification model to acquire the emotional tendency of the commodity comment data.
The emotion classification model is trained in advance, and the training process can include the following steps S110 to S140:
S110, acquiring and preprocessing a plurality of pieces of commodity comment data;
Since the larger the amount of training data, the more accurate the trained model's classification, the more pieces of commodity comment data the better, for example, 500 or 1000 pieces.
S120, determining the final feature representation of the root node corresponding to each piece of data in the plurality of pieces of commodity comment data, and acquiring the real emotional tendency of the root node corresponding to the piece of data. The embodiment of the invention divides the real emotional tendency into positive, negative and neutral.
The process of determining the final feature representation of the root node corresponding to each piece of data in this step may refer to steps S2 to S5, which are not described herein again.
S130, taking the final feature representation of the root node corresponding to each piece of commodity comment data in the plurality of pieces of commodity comment data as a first training sample, taking the real emotion tendency of the root node corresponding to the piece of data as an emotion label of the first training sample, and constructing an emotion classification training database.
In the emotion classification training database, each first training sample comprises the final feature representation of the root node corresponding to one piece of commodity comment data, and each first training sample carries one emotion label.
S140, performing model training by adopting a deep learning method according to the emotion classification training database to obtain the emotion classification model.
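The training procedure above can be illustrated with a small sketch. The embodiment does not specify the network used for the deep learning method, so a plain softmax (multinomial logistic) classifier trained by gradient descent is swapped in here purely as a stand-in; the function names, the hyperparameters, and the toy two-dimensional features are hypothetical.

```python
import numpy as np

LABELS = ["negative", "neutral", "positive"]


def build_training_database(features, tendencies):
    """Pair each final root feature with its emotion label (cf. S130)."""
    X = np.asarray(features, dtype=float)
    y = np.array([LABELS.index(t) for t in tendencies])
    return X, y


def train_softmax_classifier(X, y, epochs=500, lr=0.5):
    """Stand-in for S140: softmax regression fitted by batch gradient descent."""
    n, d = X.shape
    k = len(LABELS)
    W, b = np.zeros((d, k)), np.zeros(k)
    onehot = np.eye(k)[y]
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # gradient of cross-entropy
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(X, W, b):
    scores = np.asarray(X, dtype=float) @ W + b
    return [LABELS[i] for i in np.argmax(scores, axis=1)]


# Toy data standing in for final root features and their real tendencies.
toy_features = [[1.0, 0.8], [0.9, 0.7], [-1.0, -0.8], [-0.9, -0.6],
                [0.05, -0.05], [-0.05, 0.05]]
toy_tendencies = ["positive", "positive", "negative", "negative",
                  "neutral", "neutral"]
X, y = build_training_database(toy_features, toy_tendencies)
W, b = train_softmax_classifier(X, y)
```

In practice the features would come from steps S2 to S5 and the classifier would be whatever network the implementer chooses; only the structure of the database (one feature vector, one label per sample) follows the text.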
In a second aspect, as shown in fig. 4, an embodiment of the present invention provides a commodity comment sentiment analysis system based on aspect level fine granularity, including:
the comment acquisition module is used for acquiring and preprocessing commodity comment data;
the aspect word acquisition module is used for acquiring aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
the syntax tree construction module is used for constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm and storing the syntax tree with an image matrix;
the initial characteristic representation module is used for obtaining the initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
the final feature representation module is used for introducing a pre-constructed emotion dictionary according to the grammar tree and the initial feature representation of the root node to acquire the final feature representation of the root node;
and the emotion analysis module is used for acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
It can be understood that the product comment sentiment analysis system provided by the invention corresponds to the product comment sentiment analysis method provided by the invention, and the explanation, exemplification, beneficial effect and other parts of the relevant content can refer to the corresponding parts in the product comment sentiment analysis method, and are not described herein again.
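As an illustration of how the six modules could be wired together, the following sketch models each module as a pluggable callable and runs them in order for every aspect word. The class name `AspectSentimentSystem`, the field names, and the call signatures are assumptions made for this example; the embodiment only names the modules and their roles.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class AspectSentimentSystem:
    acquire_comment: Callable[[str], Any]        # comment acquisition module
    acquire_aspects: Callable[[Any], List[str]]  # aspect word acquisition module
    build_tree: Callable[[Any, str], Any]        # syntax tree construction module
    initial_feature: Callable[[Any], Any]        # initial feature representation module
    final_feature: Callable[[Any, Any], Any]     # final feature representation module
    analyze: Callable[[Any], str]                # emotion analysis module

    def run(self, raw_comment: str) -> List[Tuple[str, str]]:
        """Return one (aspect word, emotional tendency) pair per aspect word."""
        data = self.acquire_comment(raw_comment)
        results = []
        for aspect in self.acquire_aspects(data):
            tree = self.build_tree(data, aspect)
            initial = self.initial_feature(tree)
            final = self.final_feature(tree, initial)
            results.append((aspect, self.analyze(final)))
        return results
```

Keeping the modules as separate callables mirrors the one-to-one correspondence between the system modules and method steps S1 to S6, so each step can be replaced or tested independently.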
In a third aspect, the present invention provides a storage medium storing a computer program for commodity comment sentiment analysis based on aspect level fine granularity, wherein the computer program causes a computer to execute the commodity comment sentiment analysis method as described above.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the method for sentiment analysis of merchandise reviews as described above.
In summary, compared with the prior art, the method has the following beneficial effects:
The embodiment of the invention acquires the aspect words contained in the preprocessed commodity comment data; constructs a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and stores the syntax tree with an image matrix; acquires the initial feature representation of the root node by adopting a recursive algorithm according to the image matrix; introduces a pre-constructed emotion dictionary according to the syntax tree and the initial feature representation of the root node, and acquires the final feature representation of the root node; and acquires the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node. The embodiment of the invention locates each aspect word in the commodity comment data, constructs a dependency tree taking the aspect word as a root node, and performs emotion prediction for each aspect word, thereby breaking through the limitation that a comment covers only one emotional tendency and improving the accuracy of aspect-level emotional tendency prediction; merchants can accurately obtain this information for management, and customers are helped to better select commodities.
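The construction of a tree rooted at the aspect word can be sketched as a breadth-first traversal of the dependency relations. This is an assumption-laden illustration: it treats the dependency edges as undirected so that the tree can be re-rooted at the aspect word, it interprets the "image matrix" as an adjacency-style parent-to-child matrix, and the function name and the hand-written edges are hypothetical rather than actual parser output.

```python
from collections import deque

import numpy as np


def build_aspect_tree(num_words, dependency_edges, aspect_index):
    """Re-root a dependency tree at the aspect word.

    dependency_edges: (head, dependent) word-index pairs from a parser.
    Returns the parent-to-child matrix and the layer (depth) of every word,
    with -1 for words not connected to the aspect word.
    """
    neighbours = [[] for _ in range(num_words)]
    for head, dependent in dependency_edges:
        neighbours[head].append(dependent)
        neighbours[dependent].append(head)

    matrix = np.zeros((num_words, num_words), dtype=int)
    depth = [-1] * num_words
    depth[aspect_index] = 0
    queue = deque([aspect_index])
    while queue:  # each pass adds the next layer of nodes
        node = queue.popleft()
        for nxt in neighbours[node]:
            if depth[nxt] == -1:
                depth[nxt] = depth[node] + 1
                matrix[node, nxt] = 1
                queue.append(nxt)
    return matrix, depth


# "the screen is great": hand-written edges great->screen, great->is, screen->the.
words = ["the", "screen", "is", "great"]
edges = [(3, 1), (3, 2), (1, 0)]
tree_matrix, layers = build_aspect_tree(len(words), edges, aspect_index=1)
```

With "screen" as the root, "great" and "the" fall in the first layer and "is" in the second, which matches the layer-by-layer expansion the method describes.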
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A commodity comment sentiment analysis method based on aspect level fine granularity is characterized by comprising the following steps:
S1, acquiring and preprocessing commodity comment data;
S2, obtaining the aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
S3, constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm, and storing the syntax tree with an image matrix;
S4, acquiring initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
S5, introducing a pre-constructed emotion dictionary according to the initial feature representation of the syntax tree and the root node, and acquiring the final feature representation of the root node;
and S6, acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
2. The method for analyzing sentiment of reviews of commodities according to claim 1, wherein the preprocessing in step S1 includes:
analyzing the commodity comment data by adopting a StanfordParser analyzer to obtain entity information in the commodity comment data, marking an entity as 1 and marking the rest as 0; and mapping each part-of-speech tag after the commodity comment data is analyzed into a corresponding vector, and splicing the vector with the obtained mark of each word to obtain an intermediate vector.
3. The method for analyzing sentiment of commodity comments, according to claim 2, wherein the step S2 specifically includes:
S21, acquiring an embedded vector matrix of the commodity comment data by adopting a GloVe model, splicing the embedded vector matrix with the intermediate vector acquired in the preprocessing step, inputting the spliced vector matrix into an LSTM network, and acquiring a final vector of each word;
and S22, obtaining the position of the aspect word in the commodity comment data by adopting a deep learning method according to the final vector, and extracting the aspect word according to the position.
4. The method for analyzing sentiment of commodity comments, according to claim 3, wherein the step S3 specifically includes:
S31, setting the aspect words as root nodes of the grammar tree;
S32, determining a next-layer node of the root node according to a grammar dependency tree obtained after the commodity comment data are analyzed and a grammar dependency relationship of the aspect words in the grammar dependency tree;
S33, determining the next layer of nodes according to the grammar dependency relationship of the corresponding words of the nodes in the grammar dependency tree;
and S34, repeatedly executing the step S33 until all words in the commodity comment data are traversed, acquiring the grammar tree, and storing the grammar tree in an image matrix.
5. The method for analyzing emotion of commodity comment data according to claim 3, wherein said step S4 specifically includes:
S41, fusing the image matrix and the embedded vector matrix to obtain a fusion matrix;
S42, averaging the numerical values of each layer of the fusion matrix according to the syntax tree structure;
S43, acquiring a correction matrix according to the averaged fusion matrix and the step function;
and S44, inputting the correction matrix into an LSTM network to obtain the initial characteristic representation of the root node.
6. The method for analyzing sentiment of commodity comments, according to claim 3, wherein the step S5 specifically includes:
according to the emotion dictionary, searching for emotion label scores corresponding to direct child nodes of the grammar tree, splicing the initial feature representation of the root node with the search result, and obtaining the final feature representation of the root node; the emotion label score is in the range [-1, 1], wherein -1 represents the highest degree of negative sentiment, and 1 represents the highest degree of positive sentiment.
7. The method for analyzing commodity comment emotion of any one of claims 1 to 6, wherein the emotional tendency includes positive, negative, and neutral.
8. A commodity comment sentiment analysis system based on aspect level fine granularity is characterized by comprising:
the comment acquisition module is used for acquiring and preprocessing commodity comment data;
the aspect word acquisition module is used for acquiring aspect words contained in the preprocessed commodity comment data according to the preprocessed commodity comment data;
the syntax tree construction module is used for constructing a syntax tree with the aspect words as root nodes by adopting an aspect dependency tree algorithm and storing the syntax tree with an image matrix;
the initial characteristic representation module is used for obtaining the initial characteristic representation of the root node by adopting a recursive algorithm according to the image matrix;
the final feature representation module is used for introducing a pre-constructed emotion dictionary according to the grammar tree and the initial feature representation of the root node to acquire the final feature representation of the root node;
and the emotion analysis module is used for acquiring the emotional tendency of the commodity comment data by adopting a deep learning method according to the final feature representation of the root node.
9. A storage medium storing a computer program for commodity comment emotion analysis based on aspect level fine granularity, wherein the computer program causes a computer to execute the commodity comment emotion analysis method according to any one of claims 1 to 7.
10. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the method of merchandise review sentiment analysis of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110529175.1A CN113343665B (en) | 2021-05-14 | 2021-05-14 | Commodity comment emotion analysis method and system based on aspect-level fine granularity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110529175.1A CN113343665B (en) | 2021-05-14 | 2021-05-14 | Commodity comment emotion analysis method and system based on aspect-level fine granularity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113343665A (en) | 2021-09-03 |
CN113343665B (en) | 2024-07-26 |
Family
ID=77470045
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110529175.1A Active CN113343665B (en) | 2021-05-14 | 2021-05-14 | Commodity comment emotion analysis method and system based on aspect-level fine granularity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113343665B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561323A (en) * | 2023-06-30 | 2023-08-08 | 西安邮电大学 | Emotion analysis method based on aspect word embedding graph convolution network |
CN117592483A (en) * | 2023-11-21 | 2024-02-23 | 合肥工业大学 | Implicit emotion analysis method and device based on thinking tree |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180032507A1 (en) * | 2016-07-28 | 2018-02-01 | Abbyy Infopoisk Llc | Aspect-based sentiment analysis and report generation using machine learning methods |
CN111783474A (en) * | 2020-07-16 | 2020-10-16 | 厦门市美亚柏科信息股份有限公司 | Comment text viewpoint information processing method and device and storage medium |
CN111858945A (en) * | 2020-08-05 | 2020-10-30 | 上海哈蜂信息科技有限公司 | Deep learning-based comment text aspect level emotion classification method and system |
CN112347248A (en) * | 2020-10-30 | 2021-02-09 | 山东师范大学 | Aspect-level text emotion classification method and system |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180032507A1 (en) * | 2016-07-28 | 2018-02-01 | Abbyy Infopoisk Llc | Aspect-based sentiment analysis and report generation using machine learning methods |
CN111783474A (en) * | 2020-07-16 | 2020-10-16 | 厦门市美亚柏科信息股份有限公司 | Comment text viewpoint information processing method and device and storage medium |
CN111858945A (en) * | 2020-08-05 | 2020-10-30 | 上海哈蜂信息科技有限公司 | Deep learning-based comment text aspect level emotion classification method and system |
CN112347248A (en) * | 2020-10-30 | 2021-02-09 | 山东师范大学 | Aspect-level text emotion classification method and system |
Non-Patent Citations (2)
Title |
---|
YAOWEI ZHANG等: "Replicate, Walk, and Stop on Syntax: An Effective Neural Network Model for Aspect-Level Sentiment Classification", PROCEEDINGS OF THE AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 3 April 2020 (2020-04-03) * |
冯仓龙;白宇;蔡东风;: "面向商品评价的情感要素抽取", 沈阳航空航天大学学报, no. 06, 25 December 2016 (2016-12-25) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116561323A (en) * | 2023-06-30 | 2023-08-08 | 西安邮电大学 | Emotion analysis method based on aspect word embedding graph convolution network |
CN116561323B (en) * | 2023-06-30 | 2023-09-19 | 西安邮电大学 | Emotion analysis method based on aspect word embedding graph convolution network |
CN117592483A (en) * | 2023-11-21 | 2024-02-23 | 合肥工业大学 | Implicit emotion analysis method and device based on thinking tree |
CN117592483B (en) * | 2023-11-21 | 2024-05-28 | 合肥工业大学 | Implicit emotion analysis method and device based on thinking tree |
Also Published As
Publication number | Publication date |
---|---|
CN113343665B (en) | 2024-07-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10896392B2 (en) | Methods and systems for generating supply chain representations | |
CN108733748B (en) | Cross-border product quality risk fuzzy prediction method based on commodity comment public sentiment | |
US20120041901A1 (en) | System and Method for Knowledge Pattern Search from Networked Agents | |
Ansari | Cost-based text understanding to improve maintenance knowledge intelligence in manufacturing enterprises | |
CN113343665B (en) | Commodity comment emotion analysis method and system based on aspect-level fine granularity | |
CN111563384A (en) | Evaluation object identification method and device for E-commerce products and storage medium | |
CN109345272A (en) | One kind is based on the markovian shop credit risk forecast method of improvement | |
CN112507691A (en) | Interpretable financial subject matter generating method and device fusing emotion, industrial chain and case logic | |
CN112990973A (en) | Online shop portrait construction method and system | |
LU506520B1 (en) | A sentiment analysis method based on multimodal review data | |
JP2023008943A (en) | Data mining method, apparatus, electronic apparatus, and storage medium | |
CN112632377A (en) | Recommendation method based on user comment emotion analysis and matrix decomposition | |
CN117251570A (en) | Bidding purchasing demand analysis method and system based on text analysis | |
Addepalli et al. | A proposed framework for measuring customer satisfaction and product recommendation for ecommerce | |
Wang et al. | Factuality of large language models in the year 2024 | |
Nasereddin | A Business Analytics Approach to Strategic Management using Uncovering Corporate Challenges through Topic Modeling | |
Catanese et al. | Natural language processing in official statistics: The social mood on economy index experience | |
CN109684473A (en) | A kind of automatic bulletin generation method and system | |
CN115422453B (en) | Project recommendation method and project recommendation device | |
CN115828914A (en) | Satisfaction evaluation method considering user attribute preference | |
Lapeña et al. | Exploring new directions in traceability link recovery in models: The process models case | |
CN114911940A (en) | Text emotion recognition method and device, electronic equipment and storage medium | |
Liu | Dataset Analysis and House Price Prediction | |
CN113869058A (en) | LC-GCN (LC-GCN-based) aspect level emotion analysis method and system, storage medium and electronic equipment | |
Islam et al. | Stock market prediction of Bangladesh using multivariate long short-term memory with sentiment identification. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||