CN110175325A - Comment analysis method based on word vectors and syntactic characteristics, and visual interactive interface - Google Patents
- Publication number: CN110175325A
- Application number: CN201910343337.5A
- Authority: CN (China)
- Prior art keywords: word, words, evaluation, emotion, sentence
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9532—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides, in the field of data analysis, a comment analysis method based on word vectors and syntactic characteristics, comprising: acquiring commodity-page comment data from an e-commerce website; preprocessing the acquired target data set; extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary; training word vectors on the preprocessed data set with the Word2Vec tool; establishing a probability transition matrix from the semantic similarity matrix; processing the acquired commodity comment text based on the core sentence rules; preprocessing the resulting redundancy-free text; extracting <commodity attribute, negative word, degree word, emotion word> evaluation collocation pairs from the resulting dependency relation pairs by part of speech; and combining the evaluation collocation pairs with the emotion dictionary to calculate a commendatory/derogatory value for each evaluation object and rank the objects, with the results finally presented through a visual interactive interface. The method processes and analyzes commodity comment data accurately, in real time, automatically and conveniently, and can be used on e-commerce platforms.
Description
Technical Field
The invention belongs to the technical field of data analysis, and in particular relates to an emotion dictionary and an attribute recognition algorithm suited to commodity comments, constructed using word vectors trained by a neural network model, and to a comment analysis system based on word vectors and syntactic characteristics.
Background
With the popularization of the internet and the development of electronic commerce, e-commerce websites such as JD.com (Jingdong) and Taobao have developed rapidly, and more and more consumers choose to shop online. These websites carry a huge number of commodities and serve a large user base, and therefore generate a huge volume of comment data. The comments written by consumers often carry their subjective feelings about a purchase, including how much they like the purchased commodity and how satisfied they are with the merchant's service. For consumers, comment texts help them learn about the relevant goods or services more objectively and so make more suitable choices; for merchants, the experience fed back by users helps them improve the quality of their services or goods in a targeted manner and so win more customers and profit. However, with the explosive growth of data volume, the cost of acquiring useful information from massive comment data keeps rising, so how to process and analyze user comment text quickly and effectively and extract the valuable information has important application value and research significance.
At present, a large amount of comment data cannot be fully utilized, and consumers cannot obtain valuable information from it. A comment analysis system based on word vectors and syntactic characteristics is therefore studied: the user's satisfaction with each attribute of a commodity is obtained from the analysis result, the advantages and disadvantages of the commodity are summarized, and the analysis result is then visualized.
Disclosure of Invention
The invention aims to solve the technical problem of how to realize accurate, real-time, automatic and convenient processing and analysis of commodity comment data, and provides a comment analysis method based on word vectors and syntactic characteristics to overcome the defects of the prior art.
The invention provides a comment analysis method based on word vectors and syntactic characteristics, which comprises the following steps:
1) acquiring commodity page comment data of an e-commerce website;
2) preprocessing the acquired target data set and constructing a candidate emotion word set;
3) extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary;
4) training word vectors on the preprocessed data set with the Word2Vec tool to obtain word vectors, and generating a semantic similarity matrix;
5) establishing a probability transition matrix from the semantic similarity matrix, and generating a final emotion dictionary by combining a seed word set, applying the label propagation algorithm (LPA), and checking against the basic emotion dictionary;
6) processing the obtained commodity comment text based on the core sentence rules to obtain comment text with redundancy removed;
7) preprocessing the redundancy-free text, forming a dependency tree from the resulting word segmentation data set based on dependency relations and syntactic characteristics, and generating SBV, VOB, ATT, CMP and COO dependency relation pairs;
8) extracting, by part of speech, <commodity attribute, negative word, degree word, emotion word> evaluation collocation pairs from the obtained dependency relation pairs;
9) combining the obtained evaluation collocation pairs with the emotion dictionary, calculating the commendatory/derogatory value of each evaluation object and ranking the objects by it, and finally presenting the results through a visual interactive interface.
As a further limitation of the present invention, step 2) specifically comprises:
2-1) removing illegal characters by using a character matching algorithm;
2-2) carrying out word segmentation and part-of-speech tagging on the original data set by using LTP;
2-3) extracting the words with the required part of speech, and forming candidate emotion word set 1 after deduplication;
2-4) carrying out word segmentation and part-of-speech tagging on the original data set by using NLPIR;
2-5) extracting the words with the required part of speech, and forming candidate emotion word set 2 after deduplication;
2-6) combining the candidate emotion word set 1 and the candidate emotion word set 2, and removing duplication to obtain a candidate emotion word set.
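Steps 2-1) to 2-6) can be sketched as follows. This is a minimal illustration, not the patented implementation: the character whitelist, the function names and the `(word, tag)` input format are assumptions, and the two tagged token lists stand in for the output of LTP and NLPIR, which are not called here. The tag `'a'` (adjective) is the one named in the detailed description.

```python
import re

# Assumed whitelist: CJK characters, ASCII letters/digits and common punctuation.
_ILLEGAL = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9,.!?;:，。！？；： ]")

def remove_illegal_chars(text):
    """Step 2-1: drop every character outside the assumed whitelist."""
    return _ILLEGAL.sub("", text)

def candidate_words(tagged_tokens, wanted_pos=("a",)):
    """Steps 2-3 / 2-5: keep words with a wanted POS tag; a set removes duplicates."""
    return {word for word, pos in tagged_tokens if pos in wanted_pos}

def build_candidate_set(ltp_tagged, nlpir_tagged):
    """Step 2-6: merge the two candidate sets; the union removes duplicates."""
    return candidate_words(ltp_tagged) | candidate_words(nlpir_tagged)

# Hypothetical tagger output for the same comment from two different tools.
ltp_tagged = [("手机", "n"), ("好", "a"), ("不错", "a")]
nlpir_tagged = [("好", "a"), ("给力", "a"), ("快递", "n")]
candidates = build_candidate_set(ltp_tagged, nlpir_tagged)
```

Using two segmenters and taking the union trades some precision for recall: a word tagged as an adjective by either tool becomes a candidate.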
As a further limitation of the present invention, step 3) specifically comprises: using the HowNet emotion dictionary and the NTU evaluation word dictionary, extracting the commendatory and derogatory words from each, merging them, and removing duplicates to form the basic emotion dictionary.
As a further limitation of the present invention, step 4) specifically comprises:
4-1) utilizing a Word2Vec training data set to obtain Word vectors of words;
4-2) combining the candidate emotion word sets, and calculating the semantic similarity between the words by adopting the following formula:
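4-3) for example, for two n-dimensional word vectors $a = (x_{11}, x_{12}, \ldots, x_{1n})$ and $b = (x_{21}, x_{22}, \ldots, x_{2n})$, the semantic similarity calculation formula is as follows:
$$\mathrm{SIM}(a, b) = \frac{\sum_{k=1}^{n} x_{1k}\, x_{2k}}{\sqrt{\sum_{k=1}^{n} x_{1k}^{2}}\;\sqrt{\sum_{k=1}^{n} x_{2k}^{2}}}$$
wherein $\mathrm{SIM}(a, b)$ represents the semantic similarity value; $x_{1k}$ represents the k-th dimension value of the word vector a; $x_{2k}$ represents the k-th dimension value of the word vector b;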
4-4) constructing a semantic similarity matrix according to the calculated semantic similarity.
As a further limitation of the present invention, step 5) specifically comprises:
5-1) taking each word as a node of the graph, wherein the weight of an edge between two nodes is represented by the semantic similarity between the represented words;
5-2) establishing a probability transition matrix P according to the following formula:
$$P[i][j] = \frac{\mathrm{SIM}(w_i, w_j)}{\sum_{k=1}^{m} \mathrm{SIM}(w_i, w_k)}$$
wherein $P[i][j]$ represents the similarity transition probability from word i to word j, $\mathrm{SIM}(w_i, w_j)$ represents the similarity of words i and j, and m represents the number of words with the highest semantic similarity to word i;
5-3) counting the word frequency, in the original comment data, of all emotion words in the candidate emotion word set, and selecting the N words with the highest frequency to form seed word set 1; using the emotion vocabulary ontology library, selecting the words whose emotion intensity is greater than m and which appear in the candidate emotion word set to form seed word set 2; combining seed word set 1 and seed word set 2, removing duplicates to form the seed word set, and labeling its emotions manually;
5-4) building an $L \times C$ label matrix $Y_L$ from the small number of manually labeled seed words, wherein L represents the number of seed words and C represents the number of classes, here 3: commendatory, derogatory and neutral;
5-5) at the same time building a $U \times C$ label matrix $Y_U$ from the unlabeled sample words, wherein U represents the number of unlabeled sample words and C represents the number of classes as above;
5-6) finally, labeling the emotion class of the sample words with the label propagation algorithm (LPA), and forming the final emotion dictionary after checking against the basic emotion dictionary.
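Step 5-2) can be illustrated with the sketch below. It rests on assumptions not stated in the text: each row is normalized over the m words most similar to word i, entries outside that neighbourhood are set to 0, and similarities are non-negative. The function name `transition_matrix` is hypothetical.

```python
def transition_matrix(sim, m):
    """Row-normalize a similarity matrix over each word's m nearest neighbours.

    sim is a square list-of-lists; sim[i][j] is the similarity of words i and j.
    ASSUMPTION: entries outside the top-m neighbourhood of word i become 0.
    """
    n = len(sim)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the m words most similar to word i, excluding i itself.
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: sim[i][j], reverse=True)[:m]
        total = sum(sim[i][j] for j in neighbours)
        if total > 0:
            for j in neighbours:
                P[i][j] = sim[i][j] / total
    return P

sim = [[0.0, 0.8, 0.2],
       [0.8, 0.0, 0.4],
       [0.2, 0.4, 0.0]]
P = transition_matrix(sim, m=2)
```

Each non-empty row of `P` sums to 1, so it can be read as the probability of a label moving from word i to each of its neighbours.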
As a further limitation of the present invention, step 6) specifically comprises:
the core sentence is obtained by deleting redundancy and retaining the main components relevant to evaluation collocation; if the original sentence matches none of the rules, it is kept unchanged. The purpose of using core sentences is to improve the accuracy of syntactic dependency analysis of the evaluation text. The rules are as follows:
rule 1: delete heading-like components at the start of a sentence, such as the sequences "advantage of …", "disadvantage of …", "deficiency of …", "advantage of …", "benefit of …";
rule 2: delete sentences with a hypothetical tendency, such as "say …", "wish …", "if …", "wish …", "suggest …";
rule 3: delete the sequences "exactly", "naturally", "particularly", "also exactly", "especially";
rule 4: delete the expressions "feel" and "think";
rule 5: delete consecutive punctuation marks other than the first one, and abnormal characters such as emoticons, special characters and brackets.
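The five rules can be approximated with regular expressions, as in the sketch below. The English cue phrases are hypothetical stand-ins for the Chinese expressions a real system would match, and for rules 1 to 4 the sketch removes only the cue phrase itself, following the worked example in the detailed description; `core_sentence` and `RULE_PHRASES` are illustrative names.

```python
import re

# Hypothetical English stand-ins for the cue phrases of rules 1-4.
RULE_PHRASES = [
    ["the only deficiency is", "the advantage is", "the disadvantage is"],  # rule 1
    ["i hope", "i wish", "i suggest"],                                      # rule 2
    ["exactly", "naturally", "particularly", "especially"],                 # rule 3
    ["i feel", "i think"],                                                  # rule 4
]
PUNCT = r"[,.!?;]"

def core_sentence(text):
    """Apply rules 1-5 in order; text matching no rule is returned unchanged."""
    out = text
    for phrases in RULE_PHRASES:
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b\s*"
            out = re.sub(pattern, "", out, flags=re.IGNORECASE)
    out = re.sub(r"\([^)]*\)", "", out)               # rule 5: bracketed text
    out = re.sub(r"\s+(" + PUNCT + ")", r"\1", out)   # no space before punctuation
    # Rule 5: of several consecutive punctuation marks keep only the first.
    out = re.sub("(" + PUNCT + r")(?:\s*" + PUNCT + ")+", r"\1", out)
    return re.sub(r"\s+", " ", out).strip()

comment = ("The phone arrived, especially the delivery was fast (next day), "
           "the only deficiency is the packaging is poor, "
           "I hope the shop improves. . .")
core = core_sentence(comment)
```

Word boundaries (`\b`) keep a cue such as "exactly" from matching inside a longer word; a Chinese implementation would not need them because the cues are matched as literal character sequences.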
As a further limitation of the present invention, step 7) specifically comprises:
five axioms of dependency syntax:
(1) one and only one component of a sentence is independent;
(2) every other component of the sentence must depend on some component;
(3) any component in the sentence cannot depend on two or more components at the same time;
(4) if the component a directly depends on the component b and the component c is positioned between the components a and b in the sentence, the component c depends on the component a or the component b or other components between the components a and b;
(5) the components on the left side and the right side of the central component do not have dependency relationship with each other;
the dependency tree has the following characteristics:
(1) the nodes of the tree are the components of the sentence;
(2) the root node of the tree is the central component of the whole sentence;
(3) the edges between nodes in the tree are directed, reflecting the asymmetric dependency relations among the components;
(4) the five axioms of dependency syntax are satisfied;
most dependency relations in comment sentences fall into five types: the subject-verb relation (SBV), the verb-object relation (VOB/FOB), the attributive relation (ATT), the verb-complement relation (CMP) and the coordinate relation (COO). Dependency syntax analysis can be carried out with the LTP dependency parser, and dependency relation pairs are extracted in combination with a COO algorithm that identifies coordinated evaluation objects and coordinated evaluation words; the COO algorithm specifically comprises the following steps:
traversing, in the dependency syntax tree obtained based on dependency relations and syntactic characteristics, all words that have a dependency relation with either of the two nodes of each SBV, VOB, ATT and CMP dependency relation pair;
judging whether each traversed word is in a COO relation;
expanding the pairs with the coordinated evaluation objects and evaluation words found through the COO relation.
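The COO expansion can be sketched on a flat list of dependency pairs. The `(relation, dependent, head)` tuple format and the function name are assumptions for illustration; a real system would walk the parser's tree. Only direct coordination is followed here, not chains of COO links.

```python
def expand_coordination(pairs):
    """Add a copy of each SBV/VOB/ATT/CMP pair for every coordinated word.

    pairs: list of (relation, dependent, head) tuples (assumed format).
    """
    # Map each word to the words it is coordinated with (COO, both directions).
    coo = {}
    for rel, dep, head in pairs:
        if rel == "COO":
            coo.setdefault(head, set()).add(dep)
            coo.setdefault(dep, set()).add(head)
    expanded = list(pairs)
    for rel, dep, head in pairs:
        if rel not in ("SBV", "VOB", "ATT", "CMP"):
            continue
        for d in [dep] + sorted(coo.get(dep, ())):
            for h in [head] + sorted(coo.get(head, ())):
                candidate = (rel, d, h)
                if candidate not in expanded:
                    expanded.append(candidate)
    return expanded

pairs = [("SBV", "pixel", "good"), ("COO", "sound quality", "pixel")]
expanded = expand_coordination(pairs)
```

In the example, "sound quality" is coordinated with "pixel", so the evaluation "good" is carried over to it as a new SBV pair.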
As a further limitation of the present invention, step 8) specifically comprises:
8-1) according to the characteristics of Chinese language, the evaluation objects are mostly nouns or verbs, and the evaluation words are mostly adjectives or verbs;
8-2) extracting the evaluation objects and evaluation words, that is, the commodity attributes and emotion words, according to part of speech;
8-3) traversing the dependency syntax tree between each evaluation object and its evaluation word to check for negative words, incrementing the negative word count by 1 for each one found; when the traversal ends, the parity of the count is judged: if the count is odd, the negation value is -1, and if it is even, the negation value is +1;
8-4) traversing the dependency syntax tree between each evaluation object and its evaluation word to check for degree words, and accumulating the number found to obtain the degree word count of the collocation pair;
8-5) finally forming the <commodity attribute, negative word, degree word, emotion word> evaluation collocation pairs.
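Steps 8-3) to 8-5) reduce to a parity check and a count, sketched below. The word lists and the `between_words` argument (the words found on the tree path between the evaluation object and the evaluation word) are hypothetical; the patent does not list its negative or degree words.

```python
# Hypothetical word lists, for illustration only.
NEGATIONS = {"not", "no", "never"}
DEGREE_WORDS = {"very", "quite", "extremely"}

def collocation_pair(attribute, between_words, emotion_word):
    """Build an <attribute, negation, degree, emotion word> tuple."""
    neg_count = sum(1 for w in between_words if w in NEGATIONS)
    negation = -1 if neg_count % 2 == 1 else 1   # odd count flips polarity
    degree = sum(1 for w in between_words if w in DEGREE_WORDS)
    return (attribute, negation, degree, emotion_word)

single = collocation_pair("packaging", ["is", "not", "very"], "good")
double = collocation_pair("screen", ["is", "not", "never"], "bad")
```

An even number of negative words cancels out, which is why `double` keeps a negation value of +1.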
As a further limitation of the present invention, step 9) specifically comprises:
for a commodity attribute a that appears n times, the commendatory/derogatory value is calculated by the following formula:
where score is the emotion value of the commodity attribute a; the index i refers to the i-th occurrence of the commodity attribute; privative is the value (-1 or +1) of the negative word corresponding to the i-th occurrence; and degree is the number of degree adverbs corresponding to the i-th occurrence. The emotion value of each commodity attribute is calculated in this way, accumulating over all occurrences of the same evaluation object;
each extracted evaluation object is then judged to be commendatory or derogatory, and the final results are ranked using bubble sort.
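The scoring formula itself is not reproduced in this text, so the sketch below uses an assumed combination: each occurrence contributes polarity × negation × (1 + degree), with polarity taken from a hypothetical lexicon. Only the accumulation per attribute and the use of bubble sort come from the description; everything else is illustrative.

```python
def attribute_scores(pairs, lexicon):
    """Accumulate an emotion value per commodity attribute.

    pairs: (attribute, negation, degree, emotion_word) tuples.
    lexicon: emotion word -> polarity (+1 commendatory, -1 derogatory, 0 neutral).
    """
    scores = {}
    for attribute, negation, degree, emotion_word in pairs:
        polarity = lexicon.get(emotion_word, 0)
        # ASSUMED combination; the patent's own formula is not available here.
        scores[attribute] = scores.get(attribute, 0) + polarity * negation * (1 + degree)
    return scores

def bubble_sort_desc(items):
    """Bubble sort (attribute, score) tuples by score, highest first."""
    items = list(items)
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if items[j][1] < items[j + 1][1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:   # already ordered, stop early
            break
    return items

pairs = [("pixel", 1, 0, "good"), ("packaging", -1, 1, "good"), ("pixel", 1, 1, "good")]
scores = attribute_scores(pairs, {"good": 1})
ranking = bubble_sort_desc(scores.items())
```

A positive total marks an attribute as commendatory and a negative total as derogatory; the sorted list then gives the advantage/disadvantage ranking.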
A visual interactive interface, which can execute all the steps of the method above, displays the emotion values in the form of a bar chart and adds several user-friendly interactive functions, including: loading, login, logout, password modification, display of the user's login status, and the like.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
according to the method, commodity-page comment data are obtained from an e-commerce website and preprocessed, and a basic emotion dictionary is constructed; word vectors are trained on the preprocessed data set with the Word2Vec tool and a semantic similarity matrix is generated, from which a probability transition matrix is established, and the final emotion dictionary is generated with the label propagation algorithm (LPA) in combination with a seed word set; the obtained commodity comment text is processed based on the core sentence rules to obtain comment text with redundancy removed; the redundancy-free text is preprocessed, a dependency tree is formed from the resulting word segmentation data set based on dependency relations and syntactic characteristics, SBV, VOB, ATT, CMP and COO dependency relation pairs are generated, and <commodity attribute, negative word, degree word, emotion word> evaluation collocation pairs are extracted; in combination with the emotion dictionary, the commendatory/derogatory value of each commodity attribute is calculated and the attributes are ranked, and the results are finally presented through a visual interactive interface. Comment data analysis is thereby made accurate, real-time, automatic and convenient at the same time.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
the technical scheme of the invention constructs an emotion dictionary suited to commodity comments by using word vectors trained by a neural network model in combination with the label propagation algorithm (LPA); designs a commodity attribute recognition and extraction algorithm based on the core sentence rules, dependency relations and syntactic characteristics; and, combining these, constructs a comment analysis system based on word vectors and syntactic characteristics, which obtains the user's satisfaction with each attribute of a commodity from the analysis result, summarizes the advantages and disadvantages of the commodity, and then visualizes the analysis result.
Referring to fig. 1, the comment analyzing method based on word vectors and syntactic characteristics according to the present invention includes the following specific steps:
Step S101: obtain commodity-page comment data from an e-commerce website.
In a specific implementation, a comment data crawling algorithm is designed to obtain comment data for various commodities on the e-commerce website and generate the original comment data set.
Step S102: preprocess the acquired target data set and construct the candidate emotion word set.
In a specific implementation, a character matching algorithm is applied to the original data set to remove illegal characters. First, word segmentation and part-of-speech tagging are performed with LTP, words with the part-of-speech tag 'a' (adjective) are extracted, and duplicates are removed to form candidate emotion word set 1. Then word segmentation and part-of-speech tagging are performed with NLPIR, words with the part-of-speech tag 'a' (adjective) are extracted, and duplicates are removed to form candidate emotion word set 2. Candidate emotion word sets 1 and 2 are merged and duplicates removed to form the final candidate emotion word set.
Step S103: extract the commendatory and derogatory word sets provided by HowNet and NTU to form the basic emotion dictionary.
In a specific implementation, the commendatory and derogatory words are extracted from the HowNet emotion dictionary and the NTU evaluation word dictionary respectively, merged, and deduplicated to form the basic emotion dictionary.
Step S104: train word vectors on the preprocessed data set with the Word2Vec tool to obtain word vectors, and generate the semantic similarity matrix.
In a specific implementation, Word2Vec is trained on the data set with the training parameters size=100, window=5, sg=0 and min_count=0, yielding a word vector for each word.
For the candidate emotion word set, the semantic similarity between words is calculated with the following formula.
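For example, for two n-dimensional word vectors $a = (x_{11}, x_{12}, \ldots, x_{1n})$ and $b = (x_{21}, x_{22}, \ldots, x_{2n})$, the semantic similarity calculation formula is as follows:
$$\mathrm{SIM}(a, b) = \frac{\sum_{k=1}^{n} x_{1k}\, x_{2k}}{\sqrt{\sum_{k=1}^{n} x_{1k}^{2}}\;\sqrt{\sum_{k=1}^{n} x_{2k}^{2}}}$$
wherein $\mathrm{SIM}(a, b)$ represents the semantic similarity value; $x_{1k}$ represents the k-th dimension value of the word vector a; $x_{2k}$ represents the k-th dimension value of the word vector b;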
All emotion words in the candidate emotion word set are traversed in turn: one emotion word is fixed and its similarity to every other emotion word is calculated. Supposing there are m candidate emotion words, m rounds of calculation yield an $m \times m$ semantic similarity matrix.
For the convenience of the following operations, the similarity of an emotion word with itself is specified to be 0.
The semantic similarity matrix is constructed from the calculated semantic similarities.
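The construction of the semantic similarity matrix can be sketched in plain Python. Cosine similarity is assumed as the measure, which fits the per-dimension definitions given for the formula; the function names and the toy two-dimensional vectors are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_matrix(vectors):
    """m x m symmetric matrix; the diagonal is fixed at 0 as the text specifies."""
    m = len(vectors)
    sim = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            sim[i][j] = sim[j][i] = cosine_similarity(vectors[i], vectors[j])
    return sim

# Toy 2-dimensional "word vectors" standing in for 100-dimensional ones.
sim = similarity_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because the matrix is symmetric, only the upper triangle is computed and mirrored, which halves the number of similarity evaluations.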
Step S105: establish the probability transition matrix from the semantic similarity matrix, and generate the final emotion dictionary by combining the seed word set, applying the label propagation algorithm (LPA), and checking against the basic emotion dictionary.
In a specific implementation, each word is regarded as a node of a graph, and the weight of the edge between two nodes is the semantic similarity between the words they represent.
The probability transition matrix P is established according to the following formula:
$$P[i][j] = \frac{\mathrm{SIM}(w_i, w_j)}{\sum_{k=1}^{m} \mathrm{SIM}(w_i, w_k)}$$
wherein $P[i][j]$ represents the similarity transition probability from word i to word j, $\mathrm{SIM}(w_i, w_j)$ represents the similarity of words i and j, and m represents the manually set number of words with the highest semantic similarity to word i.
The word frequency, in the original comment data, of every emotion word in the candidate emotion word set is counted, and the 100 words with the highest frequency are selected to form seed word set 1. Using the emotion vocabulary ontology library of Dalian University of Technology, the words whose emotion intensity is greater than 7 and which appear in the candidate emotion word set are selected to form seed word set 2. Seed word sets 1 and 2 are merged and deduplicated to form the seed word set, whose emotions are then labeled manually.
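The seed word selection can be sketched as follows. The function name, the token list and the `intensity` mapping (a stand-in for the ontology library's intensity field) are assumptions; the defaults of 100 words and intensity greater than 7 are the values given above.

```python
from collections import Counter

def seed_words(comment_tokens, candidates, intensity, top_n=100, min_intensity=7):
    """Union of the most frequent candidates and the high-intensity candidates."""
    # Seed set 1: the top_n candidate words by frequency in the comments.
    freq = Counter(t for t in comment_tokens if t in candidates)
    seed1 = {word for word, _ in freq.most_common(top_n)}
    # Seed set 2: candidates whose lexicon intensity exceeds the threshold.
    seed2 = {w for w in candidates if intensity.get(w, 0) > min_intensity}
    return seed1 | seed2

tokens = ["good", "good", "bad", "ok", "phone", "bad", "good"]
candidates = {"good", "bad", "ok", "awful"}
seeds = seed_words(tokens, candidates, {"awful": 9, "ok": 3}, top_n=2)
```

The two sources complement each other: frequency captures words typical of the comment domain, while the intensity filter adds strongly polarized words even if they are rare in the data.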
Then an $L \times C$ label matrix $Y_L$ is established from the small number of manually labeled seed words, wherein L represents the number of seed words and C represents the number of classes, generally 3 (commendatory, derogatory, neutral). At the same time, a $U \times C$ label matrix $Y_U$ is established from the unlabeled sample words, wherein U represents the number of unlabeled sample words and C again represents the number of classes. The two label matrices are combined to obtain an $N \times C$ soft label matrix $F = [Y_L; Y_U]$, where $N = L + U$.
The label propagation algorithm is then executed as follows: 1) propagate: $F = PF$; 2) reset the labels of the labeled samples in F: $F_L = Y_L$; 3) repeat steps 1) and 2) until F converges.
Step 1 propagates the label (emotion attribute) of each node (emotion word) to the other nodes with the probabilities given by the probability transition matrix; the higher the similarity of two nodes, the higher the propagation probability. Step 2 resets the labels of the labeled seed words to their labeled values, undoing any change introduced by step 1. In step 3, convergence is determined by calculating the similarity between the latest F and the F from the previous iteration, $F_0$; when the similarity no longer changes, F is considered to have converged.
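The propagate / clamp / repeat loop can be sketched in plain Python. Two choices here are assumptions rather than the patent's: the unlabeled rows $Y_U$ start at zero, and convergence is tested by the largest absolute change between iterations instead of a similarity measure between successive F matrices. Labeled words are assumed to occupy the first L rows of P.

```python
def label_propagation(P, Y_L, num_unlabeled, tol=1e-6, max_iter=1000):
    """Iterate F = P F, clamping the first len(Y_L) rows to their known labels."""
    L, C = len(Y_L), len(Y_L[0])
    N = L + num_unlabeled
    # F = [Y_L; Y_U], with Y_U initialised to zeros (assumption).
    F = [list(row) for row in Y_L] + [[0.0] * C for _ in range(num_unlabeled)]
    for _ in range(max_iter):
        # Step 1: propagate labels along the transition probabilities.
        new_F = [[sum(P[i][k] * F[k][c] for k in range(N)) for c in range(C)]
                 for i in range(N)]
        # Step 2: reset the labeled rows to their manual labels.
        for i in range(L):
            new_F[i] = list(Y_L[i])
        # Step 3: stop when no entry changes by more than tol.
        change = max(abs(new_F[i][c] - F[i][c]) for i in range(N) for c in range(C))
        F = new_F
        if change < tol:
            break
    return F

# Words 0 and 1 are seeds (commendatory, derogatory); word 2 is unlabeled.
P = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.8, 0.2, 0.0]]
F = label_propagation(P, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], num_unlabeled=1)
```

In the example, word 2 is mostly similar to the commendatory seed, so its row settles at roughly 0.8 commendatory and 0.2 derogatory.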
Finally, the three values in each row of the matrix F are the propagated attribute values of the corresponding emotion word; the largest value is selected and its class is taken as the attribute of the emotion word.
The emotion words with confirmed attributes are exported to form emotion dictionary 1. All emotion words in emotion dictionary 1 are traversed: if an emotion word also appears in the basic emotion dictionary of step S103 with a contradictory attribute, its attribute is changed to the one in the basic emotion dictionary; otherwise the attribute is unchanged.
After the above steps, the modified emotion dictionary 1 is the final emotion dictionary.
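Reading the labels off F and checking them against the basic emotion dictionary can be sketched as below. The column order of `CLASSES` and the dictionary format (word mapped to class name) are assumptions made for the illustration.

```python
# Assumed column order of the soft label matrix F.
CLASSES = ("commendatory", "derogatory", "neutral")

def final_dictionary(words, F, basic_dictionary):
    """Pick the largest value per row, then let the basic dictionary override."""
    result = {}
    for word, row in zip(words, F):
        best = max(range(len(row)), key=row.__getitem__)   # index of the largest value
        propagated = CLASSES[best]
        # If the basic dictionary knows the word, its attribute prevails.
        result[word] = basic_dictionary.get(word, propagated)
    return result

words = ["good", "poor", "fine"]
F = [[0.9, 0.1, 0.0],
     [0.6, 0.3, 0.1],
     [0.2, 0.2, 0.6]]
dictionary = final_dictionary(words, F, {"poor": "derogatory"})
```

Here propagation mislabels "poor" as commendatory, and the basic dictionary corrects it, which is the purpose of the final check.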
Step S106: process the obtained commodity comment text based on the core sentence rules to obtain comment text with redundancy removed.
In a specific implementation, a commodity URL is entered on the interactive web interface of the system; a web crawler designed in the background crawls the comment data for that commodity from the e-commerce platform, and the system is set to crawl the top 1000 high-quality comments for the commodity.
Redundancy removal is performed on the obtained commodity comment data based on the core sentence rules, retaining the trunk components related to evaluation matching. For example, the comment "The mobile phone has been received, very good, the pixel and the tone quality are both good, and particularly the express delivery is great (next day); the only deficiency is that the package is not very good, and I hope the shop can improve it. . ." is processed as follows:
(1) Rule 1 matches "deficiency of …" in the example sentence; after processing, the sentence becomes "The mobile phone has been received, very good, the pixel and the tone quality are both good, and particularly the express delivery is great (next day), namely the package is not very good, and I hope the shop can improve it. . .";
(2) Rule 2 matches "hope" in the example sentence; after processing, the sentence becomes "The mobile phone has been received, very good, the pixel and the tone quality are both good, and particularly the express delivery is great (next day), namely the package is not very good, and the shop can improve it. . .";
(3) Rule 3 matches "namely" and "particularly" in the example sentence; after processing, the sentence becomes "The mobile phone has been received, very good, the pixel and the tone quality are both good, the express delivery is great (next day), the package is not very good, and the shop can improve it. . .";
(4) Rule 5 deletes the consecutive punctuation marks from the example sentence, finally giving the core sentence "The mobile phone has been received, very good, the pixel and the tone quality are both good, the express delivery is great, the package is not very good, and the shop can improve it." In this embodiment, it is denoted as the example sentence.
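The rule-based processing above can be sketched roughly as follows. This is only an illustration: the helper names, the punctuation set and the idea of passing marker phrases in as plain strings are assumptions, and the sketch only removes the matched phrases and collapses runs of punctuation; it does not reproduce the full rule set.

```python
import re

# Assumed set of punctuation marks (ASCII and common full-width forms).
PUNCT = r"[,.!?;:，。！？；：、…~]"

def collapse_punctuation(text):
    """Keep only the first mark of any run of consecutive punctuation marks."""
    return re.sub(rf"({PUNCT})(?:\s*{PUNCT})+", r"\1", text)

def remove_markers(text, markers):
    """Delete each marker phrase (hypothetical stand-ins for the rule phrases)."""
    for marker in markers:
        text = text.replace(marker, "")
    return text

def core_sentence(text, markers):
    """Apply phrase deletion first, then punctuation cleanup."""
    return collapse_punctuation(remove_markers(text, markers))

cleaned = core_sentence("particularly the delivery is fast. . .", ["particularly "])
```

Phrase deletion runs before the punctuation cleanup so that any punctuation left adjacent after a deletion is also collapsed.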
Step S107: preprocess the redundancy-removed text, form a dependency relationship tree from the resulting word segmentation data set based on dependency relationships and syntactic features, and generate SBV, VOB, ATT, CMP and COO dependency relationship pairs.
In a specific implementation, the redundancy-removed text obtained in step S106 is preprocessed by splitting it into clauses at punctuation marks, giving 6 clauses. Each clause is segmented and part-of-speech tagged with the LTP tool, and a dependency relationship tree is formed based on dependency relationships and syntactic features. The dependency relationship pairs obtained are SBV<mobile phone, receiving>, SBV<pixel, good>, COO<tone quality, pixel>, SBV<express, give force>, SBV<package, good> and SBV<shop, improvement>.
For example, for the clause "both the pixel and the tone quality are good", after the above steps the dependency relationship pairs are extracted again in combination with the COO algorithm for identifying parallel evaluation objects and parallel evaluation words, giving <pixel, good> and <tone quality, good>.
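The COO expansion in this example can be sketched as below. It is a simplified illustration under stated assumptions: the function name `expand_coo` and the tuple-based input format are hypothetical, the parser output is supplied by hand rather than obtained from LTP, and only directly coordinated words are expanded.

```python
def expand_coo(pairs, coo_pairs):
    """pairs: list of (evaluation object, evaluation word) tuples.
    coo_pairs: list of (word, coordinated word) tuples from COO relations."""
    coordinated = {}
    for a, b in coo_pairs:
        coordinated.setdefault(a, set()).add(b)
        coordinated.setdefault(b, set()).add(a)
    expanded = []
    for obj, word in pairs:
        # The original word comes first, followed by its coordinated words.
        objs = [obj] + sorted(coordinated.get(obj, ()))
        words = [word] + sorted(coordinated.get(word, ()))
        for o in objs:
            for w in words:
                if (o, w) not in expanded:
                    expanded.append((o, w))
    return expanded

# SBV<pixel, good> plus COO<tone quality, pixel> from the example clause.
result = expand_coo([("pixel", "good")], [("tone quality", "pixel")])
```

Here `result` contains both <pixel, good> and <tone quality, good>, matching the pairs given in the text.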
Step S108: from the obtained dependency relationship pairs, extract evaluation matching pairs of commodity attribute, negative word, degree word and emotion word according to part of speech.
In a specific implementation, for each extracted relation pair, the words between the evaluation object and the evaluation word are traversed to check for negative words, and the negative words are counted. The parity of this count gives the sign of the negation: if the number of negative words is odd, the negation value is -1; if it is even, the negation value is +1. The words between the evaluation object and the evaluation word are then traversed for degree words, and the degree words are counted. Finally, an evaluation matching pair <commodity attribute, private, degree, emotion word> is formed, where private is the negation value and degree is the number of degree words. In the example sentence from step S106, the negative word "not" is recognized within the relation pair <package, good>, so the corresponding private value is -1; traversing the degree adverbs between "package" and "good" identifies one degree adverb, so the corresponding degree value is 1. The evaluation matching pair extracted from this clause is <package, -1, 1, good>.
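The parity rule and the degree count can be sketched as follows. The word lists and the function name `build_matching_pair` are hypothetical, and the sketch scans the tokens that lie linearly between the two words, which is a simplification of traversing the dependency syntax tree.

```python
# Hypothetical, deliberately tiny word lists for illustration only.
NEGATION_WORDS = {"not", "no", "never"}
DEGREE_WORDS = {"very", "quite", "extremely"}

def build_matching_pair(tokens, obj, word):
    """Return (commodity attribute, private, degree, emotion word)."""
    i, j = tokens.index(obj), tokens.index(word)
    between = tokens[min(i, j) + 1:max(i, j)]
    negations = sum(t in NEGATION_WORDS for t in between)
    # Odd number of negative words -> -1, even number (including zero) -> +1.
    private = -1 if negations % 2 == 1 else 1
    degree = sum(t in DEGREE_WORDS for t in between)
    return (obj, private, degree, word)

pair = build_matching_pair(["package", "is", "not", "very", "good"],
                           "package", "good")
```

For the tokenized clause above, `pair` is `("package", -1, 1, "good")`, the same matching pair as in the text.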
Step S109: combine the obtained evaluation matching pairs with the emotion dictionary, calculate the commendatory/derogatory value of each evaluation object and rank the evaluation objects from good to bad, and finally present the results through a visual interactive interface.
In a specific implementation, the extracted evaluation matching pairs are combined with the emotion dictionary to obtain the commendatory or derogatory attribute of each emotion word. The commendatory/derogatory value of the commodity attribute is then calculated according to the following formula:
For the evaluation matching pair <package, -1, 1, good> obtained in step S108, the commendatory/derogatory value of the commodity attribute "package" is calculated to obtain its emotion value.
All the obtained comment data for the commodity is traversed and the above steps are performed; the values for identical evaluation objects are accumulated, so that all commodity attributes of the commodity are finally extracted. The attributes are then classified as commendatory or derogatory, and the final result is obtained by ordering them with bubble sort. Finally, the web page is implemented with a visual interactive interface through the front end and the back end.
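The accumulation, classification and bubble sort steps can be sketched as below. Because the scoring formula is not reproduced in this text, the sketch takes the per-occurrence emotion values as given inputs; the function names and the choice to sort both groups in descending order are assumptions made for illustration.

```python
def bubble_sort_desc(items):
    """Bubble sort of (attribute, score) tuples by score, largest first."""
    items = list(items)
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if items[j][1] < items[j + 1][1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # already ordered, stop early
            break
    return items

def rank_attributes(scored):
    """scored: iterable of (attribute, per-occurrence emotion value)."""
    totals = {}
    for attr, score in scored:
        totals[attr] = totals.get(attr, 0) + score  # accumulate identical objects
    # Attributes whose total is exactly zero fall into neither group.
    commendatory = bubble_sort_desc([(a, s) for a, s in totals.items() if s > 0])
    derogatory = bubble_sort_desc([(a, s) for a, s in totals.items() if s < 0])
    return commendatory, derogatory

good, bad = rank_attributes([("pixel", 2), ("package", -1), ("pixel", 1),
                             ("express", 1), ("package", -2)])
```

With these made-up input values, "pixel" accumulates to 3 and ranks ahead of "express", while "package" accumulates to -3 and lands in the derogatory group.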
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can understand that the modifications or substitutions within the technical scope of the present invention are included in the scope of the present invention, and therefore, the scope of the present invention should be subject to the protection scope of the claims.
Claims (10)
1. A comment analysis method based on word vectors and syntactic features, characterized by comprising the following steps:
1) acquiring commodity page comment data of an e-commerce website;
2) preprocessing the acquired target data set and constructing a candidate emotion word set;
3) extracting the commendatory and derogatory word sets provided by HowNet and NTU to form a basic emotion dictionary;
4) carrying out word vector training on the obtained preprocessed data set with the Word2Vec tool to obtain word vectors and generate a semantic similarity matrix;
5) establishing a probability transition matrix from the semantic similarity matrix, and generating a final emotion dictionary by combining a seed word set with the LPA label propagation algorithm and checking the result against the basic emotion dictionary;
6) processing the obtained commodity comment text based on the core sentence rules to obtain comment text with the redundancy removed;
7) preprocessing the redundancy-removed text, forming a dependency relationship tree from the resulting word segmentation data set based on dependency relationships and syntactic features, and generating SBV, VOB, ATT, CMP and COO dependency relationship pairs;
8) extracting, from the obtained dependency relationship pairs, evaluation matching pairs of commodity attribute, negative word, degree word and emotion word according to part of speech;
9) combining the obtained evaluation matching pairs with the emotion dictionary, calculating the commendatory/derogatory value of each evaluation object and ranking the evaluation objects from good to bad, and finally presenting the results through a visual interactive interface.
2. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein the step 2) specifically comprises:
2-1) removing illegal characters by using a character matching algorithm;
2-2) carrying out word segmentation and part-of-speech tagging on the original data set by using LTP;
2-3) extracting words with the required parts of speech, and forming candidate emotion word set 1 after removing duplicates;
2-4) carrying out word segmentation and part-of-speech tagging on the original data set by using NLPIR;
2-5) extracting words with the required parts of speech, and forming candidate emotion word set 2 after removing duplicates;
2-6) combining the candidate emotion word set 1 and the candidate emotion word set 2, and removing duplication to obtain a candidate emotion word set.
3. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein step 3) specifically comprises: using the HowNet emotion dictionary and the NTU evaluation word dictionary, respectively extracting the commendatory and derogatory words in them, merging these words, and removing duplicates to form a basic emotion dictionary.
4. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein the step 4) specifically comprises:
4-1) training on the data set with Word2Vec to obtain the word vectors of the words;
4-2) combining the candidate emotion word set, and calculating the semantic similarity between words by the following formula:
4-3) for two n-dimensional word vectors a(x_11, x_12, …, x_1n) and b(x_21, x_22, …, x_2n), the semantic similarity calculation formula is as follows:
wherein the value of the formula represents the semantic similarity; x_1k represents the k-th dimension value of the word vector a; x_2k represents the k-th dimension value of the word vector b;
4-4) constructing a semantic similarity matrix according to the calculated semantic similarity.
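As an illustration of steps 4-2) to 4-4), the sketch below builds a similarity matrix from word vectors. The similarity formula itself is not reproduced in this text, so cosine similarity, a common choice for Word2Vec vectors, is assumed here; the function name `similarity_matrix` is likewise hypothetical, and the vectors are toy values rather than trained embeddings.

```python
import numpy as np

def similarity_matrix(vectors):
    """Pairwise cosine similarity (assumed measure) between row vectors.
    Every vector must have a non-zero norm."""
    V = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    U = V / norms          # normalize each word vector to unit length
    return U @ U.T         # entry [i, j] is the similarity of words i and j

S = similarity_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The resulting matrix is symmetric with ones on the diagonal, so each word has maximal similarity to itself.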
5. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein the step 5) specifically comprises:
5-1) taking each word as a node of the graph, wherein the weight of an edge between two nodes is represented by the semantic similarity between the represented words;
5-2) establishing a probability transition matrix P according to the following formula:
wherein P[i][j] represents the similarity transition probability from word i to word j, SIM(w_i, w_j) represents the similarity of words i and j, and m represents the number of words with the highest semantic similarity to word i;
5-3) counting the word frequency, in the original comment data, of all the emotion words in the candidate emotion word set, and screening out the N words with the highest word frequency to form seed word set 1; using the emotion vocabulary ontology library, screening out the words that are in the candidate emotion word set and whose intensity in the emotion vocabulary ontology is greater than m to form seed word set 2; combining seed word set 1 and seed word set 2, removing duplicates to form the seed word set, and labeling its emotions manually;
5-4) building an L×C label matrix Y_L from the small number of manually labeled seed words, wherein: L represents the number of seed words; C represents the number of classes, of which there are 3: commendatory, derogatory and neutral;
5-5) at the same time building a U×C label matrix Y_U from the unlabeled sample words, wherein: U represents the number of unlabeled sample words; C represents the number of classes, of which there are 3: commendatory, derogatory and neutral;
5-6) finally, labeling the emotion attributes of the sample words with the LPA label propagation algorithm, and forming the final emotion dictionary after checking the result against the basic emotion dictionary.
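One plausible reading of step 5-2) can be sketched as follows. The formula for P is not reproduced in this text, so the normalization used here (dividing each similarity by the sum over the m most similar words) is an assumption, as are the function name `transition_matrix`, the exclusion of a word from its own neighbor list and the clipping of negative similarities to zero.

```python
import numpy as np

def transition_matrix(S, m):
    """Assumed construction: row i spreads probability over the m words most
    similar to word i, in proportion to their similarity."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    P = np.zeros_like(S)
    for i in range(n):
        sims = S[i].copy()
        sims[i] = -np.inf                       # assumption: exclude the word itself
        top = np.argsort(sims)[::-1][:m]        # indices of the m most similar words
        weights = np.clip(S[i, top], 0, None)   # assumption: negative similarity -> 0
        total = weights.sum()
        if total > 0:
            P[i, top] = weights / total         # normalize so the row sums to 1
    return P

S = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
P = transition_matrix(S, m=2)
```

Restricting each row to its m nearest neighbors keeps the graph sparse, so labels propagate mainly between semantically close words.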
6. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein step 6) specifically comprises:
forming a core sentence mainly means deleting redundancy and retaining the trunk components related to evaluation matching; if the original sentence matches none of the rules, it is kept unchanged; the purpose of using core sentences is to improve the accuracy of syntactic dependency analysis of the evaluation text; the rules comprise:
rule 1: delete sentence-heading components such as "advantage of …", "disadvantage of …", "deficiency of …", "advantage of …" and "benefit of …";
rule 2: delete sentences with a hypothetical tendency, such as "say …", "wish …", "if …", "wish …" and "suggest …";
rule 3: delete word sequences such as "exactly", "naturally", "particularly", "also exactly" and "especially";
rule 4: delete statements such as "feel" and "think";
rule 5: delete consecutive punctuation marks other than the first one, as well as abnormal characters such as emoticons, symbols and brackets.
7. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein step 7) specifically comprises:
the five axioms of dependency syntax:
(1) a sentence has one and only one independent component;
(2) every other component in the sentence must depend on some component;
(3) no component in the sentence can depend on two or more components at the same time;
(4) if component a directly depends on component b, and component c is positioned between components a and b in the sentence, then component c depends on component a, on component b, or on some other component between a and b;
(5) the components on the left and right sides of the central component have no dependency relationship with each other;
the dependency tree is characterized in that:
(1) the nodes of the tree are the components of the sentence;
(2) the root node of the tree is the central component of the whole sentence;
(3) edges formed between nodes in the tree have directionality, and asymmetric dependency relationships among the components are reflected;
(4) five axioms of dependency syntax are satisfied;
most dependency relations in comment sentences fall into five types, namely the subject-verb relation (SBV), the verb-object relation (VOB/FOB), the attributive relation (ATT), the verb-complement relation (CMP) and the coordinate relation (COO); dependency syntax analysis can be carried out with the LTP dependency parser, and dependency relationship pairs are extracted in combination with the COO algorithm for identifying parallel evaluation objects and parallel evaluation words; this COO algorithm specifically comprises the following steps:
traversing, in the dependency syntax tree obtained based on dependency relationships and syntactic features, all words that have a dependency relationship with either of the two nodes of each SBV, VOB, ATT or CMP dependency relationship pair;
judging whether any of the traversed words is in a COO relation;
and expanding the pairs with the parallel evaluation objects and evaluation words of the COO relation.
8. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein step 8) specifically comprises:
8-1) according to the characteristics of Chinese language, the evaluation objects are mostly nouns or verbs, and the evaluation words are mostly adjectives or verbs;
8-2) extracting an evaluation object and an evaluation word according to the part of speech, namely commodity attribute and emotion word;
8-3) traversing, according to the dependency syntax tree, the words between the obtained evaluation object and evaluation word to determine whether negative words exist; each time a negative word is encountered, the negative word count is incremented by 1, and when the traversal is finished, the parity of the count is judged;
if the number of negative words is odd, the corresponding negation value is -1, and if it is even, the corresponding negation value is +1;
8-4) traversing, according to the dependency syntax tree, the words between the obtained evaluation object and evaluation word to determine whether degree words exist; if several degree words exist, they are accumulated to give the number of degree words of the matching pair;
8-5) finally forming the evaluation matching pairs of the commodity attributes, the negative words, the degree words and the emotional words.
9. The method for analyzing comments based on word vectors and syntactic characteristics according to claim 1, wherein step 9) specifically comprises:
for a commodity attribute a that appears n times, the commendatory/derogatory calculation formula is as follows:
where score is the emotion value of the commodity attribute a; X_i corresponds to the i-th occurrence of the commodity attribute; private is the value (-1 or +1) of the negative words corresponding to the i-th occurrence of the commodity attribute; and degree is the number of degree adverbs corresponding to the i-th occurrence of the commodity attribute; the emotion value of each commodity attribute is calculated in this way, and the values for the same evaluation object are accumulated;
then whether each extracted evaluation object is commendatory or derogatory is judged, and the final results are ordered using bubble sort.
10. A visual interactive interface, characterized in that it can execute all the steps of claims 1 to 9 and, besides displaying the emotional values clearly in the form of a bar graph, adds a number of user-friendly interactive functions, including: loading, logging in, logging out, modifying the password, logging the use state of the user, and the like.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910343337.5A CN110175325B (en) | 2019-04-26 | 2019-04-26 | Comment analysis method based on word vector and syntactic characteristics and visual interaction interface |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110175325A | 2019-08-27 |
CN110175325B | 2023-07-11 |
Family ID: 67690209
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910343337.5A (Active; CN110175325B) | 2019-04-26 | 2019-04-26 | Comment analysis method based on word vector and syntactic characteristics and visual interaction interface |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110175325B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110175325B (en) | 2023-07-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||