CN106250365A - Method for extracting product attribute feature words from consumer reviews based on text analysis - Google Patents
- Publication number: CN106250365A
- Application number: CN201610580612.1A
- Authority: CN (China)
- Prior art keywords: feature words, word, feature, words, review data
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
The invention discloses a method for extracting product attribute feature words from consumer reviews based on text analysis, comprising: determining a target product and obtaining review data for the target product; preprocessing the review data; obtaining part-of-speech (POS) sequence samples from the preprocessed review data; matching all review data against the POS sequence samples, extracting feature words from the review data according to the position of the feature word in the formal representation model of the POS sequence samples, and recording the frequency of each feature word, all feature words forming a pre-selected feature word set; preprocessing the pre-selected feature word set; and computing the similarity of every pair of feature words in the pre-selected feature word set and merging any two feature words whose similarity exceeds a threshold. The invention merges similar feature words using a semantic similarity based on information content, which removes redundant feature words and reduces the amount of data involved in analyzing the feature words.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a method for extracting product attribute feature words from consumer reviews based on text analysis.
Background
The development of the Internet and information technology has given ordinary consumers the opportunity to share their product experiences online. The resulting large volume of review data gives platforms a good opportunity to analyze the market, learn how users evaluate products, and make recommendations to users. For consumers, knowing how other users regard a product helps them make better purchasing decisions. Extracting attribute feature words from product review data is an important step in this data mining.
The quality of the attribute feature words extracted from product review data matters greatly to both platforms and users. Good feature words let a platform understand which product characteristics users care about, so that it can improve or maintain those characteristics and increase sales; they also let users learn the true state of the product characteristics they care about.
Many methods already exist for extracting feature words from product review data. They fall into two broad categories: rule-based feature extraction and probability-based feature extraction. Examples include part-of-speech template matching extended with grammatical rules, and hidden Markov models and conditional random fields based on word sequence labeling. All of these perform only a preliminary extraction of the feature words in review data. Research has found that, because consumers differ in education level, cultural background, and writing style, the same attribute of the same product may be described in different words that are nevertheless close in overall meaning. If feature words are extracted by rule-based matching alone, the extracted feature words will be redundant.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a method for extracting product attribute feature words from consumer reviews based on text analysis, which merges similar feature words using a semantic similarity based on information content, removes redundant feature words, and reduces the amount of data involved in analyzing the feature words.
The object of the invention is achieved by the following technical solution: a method for extracting product attribute feature words from consumer reviews based on text analysis, comprising: determining a target product and obtaining review data for the target product; preprocessing the review data; obtaining POS sequence samples from the preprocessed review data; matching all review data against the POS sequence samples, extracting feature words from the review data according to the position of the feature word in the formal representation model of the POS sequence samples, and recording the frequency of each feature word, all feature words forming a pre-selected feature word set; and computing the similarity of every pair of feature words in the pre-selected feature word set and merging any two feature words whose similarity exceeds a threshold.
The review data of the target product is obtained by using a crawler to crawl the review data of the target product from a preset website.
The review data is preprocessed by: splitting each review into multiple sentences at punctuation marks; segmenting each sentence into individual words; and tagging each word with its part of speech.
The preprocessing of the review data further includes removing stop words.
The POS sequence samples are obtained as follows:
A product review sentence that contains a product attribute feature word is defined as a feature sentence, and preprocessed feature sentences are selected as POS sequence samples.
The formal representation model of a POS sequence sample is:
(BF3, BF2, BF1, feature_i, AF1, AF2, AF3, Pos: i)
where feature_i is the feature word, BFi is the i-th word before the feature word, AFi is the i-th word after the feature word, and Pos is the position of the feature word in the feature sentence.
Further, the method also includes a step of preprocessing the pre-selected feature word set:
It is determined whether each feature word in the pre-selected feature word set satisfies a preset rule; if so, the feature word is retained, and otherwise it is deleted.
The preset rule is: the length of the word is at most four characters, and the frequency of the word is within a preset range.
The similarity of the feature words in the pre-selected feature word set is computed as follows: the information content of each feature word in the set is calculated based on HowNet, and the similarity of every pair of feature words in the set is then calculated.
Feature words are merged as follows: two feature words whose similarity exceeds the threshold are merged into a single feature word, namely whichever of the two has the higher frequency.
The beneficial effect of the invention is that it merges similar feature words using a semantic similarity based on information content, removes redundant feature words, and reduces the amount of data involved in analyzing the feature words.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the present invention.
Detailed description
The technical solution of the invention is described in further detail below with reference to the accompanying drawing, but the scope of protection of the invention is not limited to what follows.
As shown in Fig. 1, the method for extracting product attribute feature words from consumer reviews based on text analysis comprises the following steps:
Step 1: determine a target product and obtain the review data of the target product.
The review data of the target product is obtained by using a crawler to crawl it from a preset website.
Step 2: preprocess the review data.
The review data is preprocessed as follows. Each review is split into multiple sentences at punctuation marks. Word segmentation: each sentence is segmented into individual words. POS tagging: each word is tagged with its part of speech. Word segmentation means cutting a sentence into individual words, that is, recombining a continuous character sequence into a word sequence according to a given specification. POS tagging means assigning the correct part of speech to each word in the segmentation result, that is, determining whether each word is a noun, verb, adjective, or other part of speech.
The preprocessing of the review data further includes removing stop words. Stop words are words that carry no real meaning in the sentence, such as pronouns, numerals, and mathematical symbols. The invention may use the open-source tool HanLP or the word segmentation system NLPIR to preprocess the review data. For example, after preprocessing with HanLP, the review "The phone feels good, the sound quality is good, and charging is fast" becomes "phone/n feel/n good/a sound-quality/n good/a charge/v speed/n fast/a", where n denotes a noun, a an adjective, v a verb, and d an adverb. In addition to the POS tags defined in the HanLP tag set, custom words may be added as needed.
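The sentence-splitting step and the handling of tagger output described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class and method names are hypothetical, the punctuation set is an assumption, and the tagged string is assumed to use a space-separated word/tag form. In practice a tool such as HanLP would produce the segmentation and the tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; all names here are illustrative assumptions.
class ReviewPreprocessSketch {

    /** One segmented word together with its part-of-speech tag. */
    static final class Token {
        final String word;
        final String tag;
        Token(String word, String tag) { this.word = word; this.tag = tag; }
    }

    /** Splits a review into clauses at full-width and ASCII punctuation (assumed set). */
    static List<String> splitSentences(String review) {
        List<String> clauses = new ArrayList<>();
        // The five escapes are the full-width comma, period, exclamation mark, question mark, semicolon.
        for (String part : review.split("[\uFF0C\u3002\uFF01\uFF1F\uFF1B,.!?;]+")) {
            String clause = part.trim();
            if (!clause.isEmpty()) clauses.add(clause);
        }
        return clauses;
    }

    /** Parses tagger output of the assumed form "word/tag word/tag ..." into tokens. */
    static List<Token> parseTagged(String tagged) {
        List<Token> tokens = new ArrayList<>();
        for (String item : tagged.trim().split("\\s+")) {
            int slash = item.lastIndexOf('/');
            if (slash <= 0 || slash == item.length() - 1) continue; // skip malformed items
            tokens.add(new Token(item.substring(0, slash), item.substring(slash + 1)));
        }
        return tokens;
    }

    /** Drops tokens whose word appears in a caller-supplied stop-word set. */
    static List<Token> removeStopWords(List<Token> tokens, Set<String> stopWords) {
        List<Token> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (!stopWords.contains(t.word)) kept.add(t);
        }
        return kept;
    }

    public static void main(String[] args) {
        for (String clause : splitSentences("feels good, sound is fine. charging is fast!")) {
            System.out.println(clause);
        }
        for (Token t : parseTagged("phone/n feel/n good/a")) {
            System.out.println(t.word + " -> " + t.tag);
        }
    }
}
```

The stop-word list is left to the caller because the text does not specify one.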
Step 3: obtain POS sequence samples from the preprocessed review data.
The POS sequence samples are obtained as follows: a product review sentence that contains a product attribute feature word is defined as a feature sentence, and preprocessed feature sentences are selected as POS sequence samples. The formal representation model of a POS sequence sample is:
(BF3, BF2, BF1, feature_i, AF1, AF2, AF3, Pos: i)
where feature_i is the feature word, BFi is the i-th word before the feature word, AFi is the i-th word after the feature word, and Pos is the position of the feature word in the feature sentence.
Step 4: match all review data against the POS sequence samples, extract feature words from the review data according to the position of the feature word in the formal representation model of the POS sequence samples, and record the frequency of each feature word. All extracted feature words form the pre-selected feature word set.
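One way to picture the matching in step 4 is the sketch below. It is an assumption-laden illustration rather than the patent's algorithm: the Template type, the greedy left-to-right scan, and the "first matching template wins" policy are all hypothetical choices. Each template lists the expected POS tags and marks which slots belong to the feature word, which mirrors the feature and AF slots of the formal model.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the Template type and the greedy matching strategy are assumptions.
class PosTemplateMatchSketch {

    /** A POS template: tags[i] is the expected tag, feature[i] marks feature-word slots. */
    static final class Template {
        final String[] tags;
        final boolean[] feature;
        Template(String[] tags, boolean[] feature) { this.tags = tags; this.feature = feature; }

        boolean matchesAt(String[] sentenceTags, int start) {
            if (start + tags.length > sentenceTags.length) return false;
            for (int i = 0; i < tags.length; i++) {
                if (!tags[i].equals(sentenceTags[start + i])) return false;
            }
            return true;
        }
    }

    /**
     * Scans one tagged clause left to right. At each position the first matching
     * template (callers should list longer templates first) yields one feature
     * word, whose frequency is incremented in freq.
     */
    static void extract(String[] words, String[] tags, List<Template> templates,
                        Map<String, Integer> freq) {
        int pos = 0;
        while (pos < tags.length) {
            Template hit = null;
            for (Template t : templates) {
                if (t.matchesAt(tags, pos)) { hit = t; break; }
            }
            if (hit == null) { pos++; continue; }
            StringBuilder featureWord = new StringBuilder();
            for (int i = 0; i < hit.tags.length; i++) {
                if (!hit.feature[i]) continue;
                if (featureWord.length() > 0) featureWord.append(' '); // a space suits English glosses only
                featureWord.append(words[pos + i]);
            }
            if (featureWord.length() > 0) freq.merge(featureWord.toString(), 1, Integer::sum);
            pos += Math.max(1, hit.tags.length); // always advance, even for an empty template
        }
    }

    public static void main(String[] args) {
        List<Template> templates = Arrays.asList(
            new Template(new String[] {"n", "n", "a"}, new boolean[] {true, true, false}),
            new Template(new String[] {"n", "a"}, new boolean[] {true, false}));
        Map<String, Integer> freq = new LinkedHashMap<>();
        extract(new String[] {"phone", "feel", "good"}, new String[] {"n", "n", "a"}, templates, freq);
        extract(new String[] {"feel", "good"}, new String[] {"n", "a"}, templates, freq);
        System.out.println(freq);
    }
}
```

Listing longer templates first keeps a two-noun feature such as "phone feel" from being split into a shorter match.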
Step 5: preprocess the pre-selected feature word set. It is determined whether each feature word in the set satisfies a preset rule; feature words that satisfy the rule are retained in the set, and feature words that do not are deleted from it. The preset rule is: the length of the word is at most four characters, and the frequency of the word is within a preset range.
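The preset rule of step 5 amounts to a simple filter over the word-to-frequency map. The sketch below assumes hypothetical parameter names (maxLength, minFreq, maxFreq) and treats the frequency range as inclusive; the text does not state the bounds of the preset range.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the step 5 filter; parameter names and inclusive bounds are assumptions.
class CandidateFilterSketch {

    /** Keeps words with at most maxLength characters and a frequency in [minFreq, maxFreq]. */
    static Map<String, Integer> filter(Map<String, Integer> candidates,
                                       int maxLength, int minFreq, int maxFreq) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : candidates.entrySet()) {
            String word = e.getKey();
            // Count code points so that each Chinese character counts as one.
            int length = word.codePointCount(0, word.length());
            int freq = e.getValue();
            if (length <= maxLength && freq >= minFreq && freq <= maxFreq) {
                kept.put(word, freq);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> candidates = new LinkedHashMap<>();
        candidates.put("abcd", 3);   // placeholder four-character word
        candidates.put("abcde", 3);  // too long under a limit of four
        candidates.put("ab", 0);     // frequency below the assumed minimum
        System.out.println(filter(candidates, 4, 1, 100));
    }
}
```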
Step 6: compute the similarity of every pair of feature words in the pre-selected feature word set, and merge any two feature words whose similarity exceeds a threshold.
The similarity of the feature words in the pre-selected feature word set is computed as follows: the information content of each feature word in the set is calculated based on HowNet, and the similarity of every pair of feature words in the set is then calculated.
Feature words are merged as follows: two feature words whose similarity exceeds the threshold are merged into a single feature word, namely whichever of the two has the higher frequency.
Embodiment 1
The following reviews are selected for analysis from the review text for a mobile phone on an e-commerce website:
A. "The phone feels good, the sound quality is good, and charging is fast; it is the same one my best friend bought."
B. "The phone's pixels are very good, fingerprint unlocking is super fast, and the quality is also good."
C. "The phone screen is big enough, the pixel count is high, the performance is good, and the customer service attitude is excellent. I really like it, and next time I buy a phone I will come to this shop again."
D. "After using it for a while: the screen size is suitable, it feels good, the earphone sound quality is very good, the volume is loud enough, it is very good, and the battery is also durable."
E. "Logistics was fast, the phone screen is suitable, I am very satisfied with the clarity, the pixel count is high, and the customer service is very good."
Each review is split into sentences at punctuation marks and preprocessed with HanLP, for example: "phone/n pixel/n very-good/a fingerprint/n unlock/v super/d fast/a quality/n also/d good/a", where n denotes a noun, a an adjective, v a verb, and d an adverb.
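A brief example of using HanLP for preprocessing is as follows:
import java.util.List;
import com.hankcs.hanlp.seg.common.Term;
import com.hankcs.hanlp.tokenizer.NLPTokenizer;
List<Term> termList = NLPTokenizer.segment(sentence);  // each Term holds a word and its POS tag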
Of the five examples A, B, C, D, and E chosen above, each sentence in A, B, and C is selected as a feature sentence.
All the text in the examples is preprocessed:
"phone/n, feel/n, good/a, sound-quality/n, good/a, charge/vi, speed/n, fast/a, and/cc, best-friend/nz, buy/v, /ude1, same/uyy".
"phone/n, pixel/n, very-good/a, fingerprint/n, unlock/v, super/d, fast/a, quality/n, also/d, good/a".
"phone/n, screen/n, enough/v, big/a, pixel/n, high/a, performance/n, good/a, customer-service/n, attitude/n, super/d, good/a, super/b, like/vi, next-time/t, buy/v, phone/n, also/d, come/vf, this/rzv, shop/q".
"use/v, /ule, a-period/mq, time/n, /ule, screen/n, size/n, suitable/a, feel/n, good/a, earphone/n, sound-quality/n, very/d, good/a, volume/n, enough/v, big/a, very/d, good/a, battery/n, also/d, durable/a".
"logistics/n, very/d, fast/a, phone/n, screen/n, suitable/a, clarity/n, very/d, satisfied/v, pixel/n, high/a, customer-service/n, very/d, good/a".
The formal POS sequence models of examples A, B, C, D, and E can be expressed as follows, respectively (for sentences that contain no feature word, only the POS tags are given):
A: {feature1/n feature2/n AF1/a, Pos:1,2}, {feature/n AF1/a, Pos:1}, {feature1/vi feature2/n AF1/a, Pos:1,2}, {/cc, /nz, /v, /ude1, /uyy}.
B: {feature1/n feature2/n AF1/a, Pos:1,2}, {feature1/n feature2/v AF1/d AF2/a, Pos:1,2}, {feature/n AF1/d AF2/a, Pos:1}.
C: {feature1/n feature2/n AF1/v AF2/a, Pos:1,2}, {feature/n AF1/a, Pos:1}, {feature/n AF1/a, Pos:1}, {BF1/n feature/n AF1/d AF2/a, Pos:1,2}, {/b, /vi}, {/t, /v, /n, /d, /v, /rzv, /q}.
D: {feature/n AF1/a, Pos:1}, {feature/n AF1/a, Pos:1}, {feature1/n feature2/n AF1/d AF2/a, Pos:1,2}, {feature/n AF1/v AF2/a, Pos:1}, {/d, /a}, {feature/n AF1/d AF2/a, Pos:1}.
E: {feature/n AF1/d AF2/a, Pos:1}, {feature1/n feature2/n AF1/a, Pos:1,2}, {feature/n AF1/d AF2/v, Pos:2}, {feature/n AF1/a, Pos:1}, {feature/n AF1/d AF2/a, Pos:1}.
After matching against the sample POS sequences, the feature words in the pre-selected set and their frequencies are: {mobile phone screen: 2, sound quality: 1, charging speed: 1, mobile phone pixel: 1, fingerprint unlock: 1, quality: 1, pixel: 2, performance: 1, customer service attitude: 1, screen: 1, earphone sound quality: 1, volume: 1, battery: 1, logistics: 1, phone screen: 1, clarity: 1, customer service: 1}.
The following rule is then applied: if one word is contained in another word, the shorter word is used as the feature word; that is, if word1.contains(word2), word2 is retained as the feature word. As the frequencies below show, the count of the longer word is added to that of the shorter one. After this preliminary processing, the pre-selected set becomes {screen: 4, sound quality: 2, charging speed: 1, pixel: 3, fingerprint unlock: 1, quality: 1, performance: 1, customer service: 2, volume: 1, battery: 1, logistics: 1, clarity: 1}.
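The containment rule can be sketched as below. This is an illustrative reading, not the patent's code: the class name is hypothetical, and the choice to fold each word into the shortest word it contains (with frequencies summed) is inferred from the example counts. Note that plain substring matching behaves differently on English glosses than on the original Chinese words; for instance, the English gloss "sound quality" contains "quality", so the sample data below deliberately avoids such collisions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the containment rule; the tie-breaking policy is an assumption.
class ContainmentCollapseSketch {

    /** Folds every word into the shortest word of the set that it contains, summing frequencies. */
    static Map<String, Integer> collapse(Map<String, Integer> freq) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            String target = e.getKey();
            for (String other : freq.keySet()) {
                // A shorter contained word replaces the current target.
                if (!other.isEmpty() && e.getKey().contains(other)
                        && other.length() < target.length()) {
                    target = other;
                }
            }
            out.merge(target, e.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        freq.put("mobile phone screen", 2);
        freq.put("screen", 1);
        freq.put("phone screen", 1);
        freq.put("mobile phone pixel", 1);
        freq.put("pixel", 2);
        freq.put("battery", 1);
        System.out.println(collapse(freq));
    }
}
```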
The basic record format of the HowNet dictionary is:
Word: W_C=
Word example: E_C=
Part of speech: G_C=
Concept definition (sense): DEF=
An example HowNet record is as follows: [record not reproduced in the source text]
Basic concepts in HowNet: a sememe is the basic unit used to describe a sense; a sense is one of the different meanings of a word.
Assume sense n_1 has n sememes, N_1 = {P_11, P_12, ..., P_1n}, and sense n_2 has m sememes, N_2 = {P_21, P_22, ..., P_2m}. Let des(P) be the number of descendant sememes of sememe P, and let max(P) be the number of sememes in the sememe system to which the sememe tree of P belongs, that is, the sememe sample space. The sememes of the entity, event, attribute, attribute value, and secondary feature classes in HowNet, 2216 in total, are selected as the sample space. The information content of sememe P is computed by the following formula: [formula not reproduced in the source text]
The similarity of two sememes depends on what they have in common and on how they differ. What they have in common is defined as follows: on a sememe tree, if the nearest common ancestor node of sememes P_1 and P_2 is P_a, then P_a is the minimum commonality of P_1 and P_2. The sememe similarity is computed by the following formula: [formula not reproduced in the source text]
By computing the information content of each sememe in senses n_1 and n_2, the similarity between the senses, Sim_L(N_1, N_2), can be obtained: the similarity of two sets equals the arithmetic mean of the similarities of their element pairs. With C_1 and C_2 denoting the number of records in senses n_1 and n_2 respectively, the similarity between senses is computed by the following formula: [formula not reproduced in the source text]
For two words w_1 and w_2, assume w_1 has k senses, w_1 = (n_11, n_12, ..., n_1k), and w_2 has r senses, w_2 = (n_21, n_22, ..., n_2r). The similarity of w_1 and w_2 is then obtained from the sense similarities computed above by the following formula: [formula not reproduced in the source text]
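The formulas themselves are not reproduced in this text, so the sketch below does not claim to be the patent's computation. It substitutes a commonly used formulation that fits the definitions given: an intrinsic information content IC(P) = 1 - log(des(P) + 1) / log(max(P)), and a Lin-style similarity sim(P1, P2) = 2 * IC(Pa) / (IC(P1) + IC(P2)), where Pa is the nearest common ancestor. The sense similarity is taken as the arithmetic mean over all sememe pairs, as the text states, and the word similarity is assumed to be the maximum over sense pairs. The tiny sememe tree and its node names are hypothetical and are not HowNet data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the formulas are assumed stand-ins, not confirmed to be the patent's.
class SememeSimilaritySketch {
    private final Map<String, String> parent;            // child -> parent, root -> null
    private final Map<String, Integer> descendants = new HashMap<>();

    /** Every node, including the root, must appear as a key of the parent map. */
    SememeSimilaritySketch(Map<String, String> parent) {
        this.parent = parent;
        for (String s : parent.keySet()) descendants.put(s, 0);
        for (String s : parent.keySet()) {
            for (String a = parent.get(s); a != null; a = parent.get(a)) {
                descendants.merge(a, 1, Integer::sum);    // s is a descendant of each ancestor a
            }
        }
    }

    /** Assumed intrinsic information content: 1 - log(des(P) + 1) / log(total sememes). */
    double ic(String p) {
        return 1.0 - Math.log(descendants.get(p) + 1) / Math.log(parent.size());
    }

    /** Nearest common ancestor of p1 and p2, or null if they share none. */
    String nearestCommonAncestor(String p1, String p2) {
        Set<String> ancestors = new HashSet<>();
        for (String a = p1; a != null; a = parent.get(a)) ancestors.add(a);
        for (String a = p2; a != null; a = parent.get(a)) {
            if (ancestors.contains(a)) return a;
        }
        return null;
    }

    /** Assumed Lin-style sememe similarity. */
    double sememeSim(String p1, String p2) {
        String pa = nearestCommonAncestor(p1, p2);
        double denom = ic(p1) + ic(p2);
        if (pa == null || denom == 0.0) return p1.equals(p2) ? 1.0 : 0.0;
        return 2.0 * ic(pa) / denom;
    }

    /** Sense similarity: arithmetic mean over all sememe pairs of the two senses. */
    double senseSim(List<String> n1, List<String> n2) {
        double sum = 0.0;
        for (String p : n1) for (String q : n2) sum += sememeSim(p, q);
        return sum / (n1.size() * n2.size());
    }

    /** Word similarity: assumed to be the maximum over all sense pairs. */
    double wordSim(List<List<String>> w1, List<List<String>> w2) {
        double best = 0.0;
        for (List<String> a : w1) for (List<String> b : w2) best = Math.max(best, senseSim(a, b));
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> tree = new HashMap<>();
        tree.put("attribute", null);      // hypothetical root
        tree.put("sound", "attribute");
        tree.put("timbre", "sound");
        tree.put("loudness", "sound");
        tree.put("size", "attribute");
        SememeSimilaritySketch s = new SememeSimilaritySketch(tree);
        System.out.println(s.sememeSim("timbre", "loudness"));
        System.out.println(s.sememeSim("timbre", "size"));
    }
}
```

Under these assumptions, sememes that share a deep, specific ancestor score higher than sememes whose only common ancestor is the root, whose information content is zero.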
Using the formulas above, the similarity of every pair of words in the pre-selected set is computed; the results are as follows: [similarity table not reproduced in the source text]
A similarity threshold β is set by comparing the similarity values; here it is assumed that β = 0.310. Because the similarity between the feature words "sound quality" and "volume" exceeds β, the two are merged into the one with the higher frequency, namely "sound quality", and its frequency becomes the sum of the two words' frequencies. The resulting feature word set is {screen: 4, sound quality: 3, charging speed: 1, pixel: 3, fingerprint unlock: 1, quality: 1, performance: 1, customer service: 2, battery: 1, logistics: 1, clarity: 1}.
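The threshold merge can be sketched as follows. The class name, the pairwise scan order, and the similarity function passed in are hypothetical; in particular, the similarity value 0.4 used in the usage example is made up for illustration, since the similarity table is not reproduced in the text. The merge rule itself (keep the higher-frequency word, sum the two frequencies) follows the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of the merge step; scan order and tie handling are assumptions.
class FeatureMergeSketch {

    /** Merges each pair whose similarity exceeds beta into its higher-frequency word. */
    static Map<String, Integer> merge(Map<String, Integer> freq,
                                      ToDoubleBiFunction<String, String> sim, double beta) {
        Map<String, Integer> out = new LinkedHashMap<>(freq);
        List<String> words = new ArrayList<>(freq.keySet());
        for (int i = 0; i < words.size(); i++) {
            for (int j = i + 1; j < words.size(); j++) {
                String a = words.get(i);
                String b = words.get(j);
                if (!out.containsKey(a) || !out.containsKey(b)) continue; // already merged away
                if (sim.applyAsDouble(a, b) <= beta) continue;
                String keep = out.get(a) >= out.get(b) ? a : b;           // ties keep the earlier word
                String drop = keep.equals(a) ? b : a;
                out.put(keep, out.get(a) + out.get(b));                   // frequencies are summed
                out.remove(drop);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        freq.put("sound quality", 2);
        freq.put("volume", 1);
        freq.put("battery", 1);
        // Made-up similarity: only the pair (sound quality, volume) exceeds the threshold.
        ToDoubleBiFunction<String, String> sim =
            (a, b) -> (a + "|" + b).equals("sound quality|volume") ? 0.4 : 0.1;
        System.out.println(merge(freq, sim, 0.310));
    }
}
```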
The above is only a preferred embodiment of the invention. It should be understood that the invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; the invention may be used in various other combinations, modifications, and environments, and may be modified within the scope of the concept described herein through the above teaching or through the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the scope of protection of the appended claims.
Claims (9)
1. A method for extracting product attribute feature words from consumer reviews based on text analysis, characterized by comprising:
determining a target product and obtaining review data of the target product;
preprocessing the review data;
obtaining POS sequence samples from the preprocessed review data;
matching all review data against the POS sequence samples, extracting feature words from the review data according to the position of the feature word in the formal representation model of the POS sequence samples, and recording the frequency of each feature word, all feature words forming a pre-selected feature word set; and
computing the similarity of every pair of feature words in the pre-selected feature word set, and merging any two feature words whose similarity exceeds a threshold.
2. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized in that the review data of the target product is obtained by using a crawler to crawl the review data of the target product from a preset website.
3. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized in that the review data is preprocessed by:
splitting each review into multiple sentences at punctuation marks;
segmenting each sentence into individual words; and
tagging each word with its part of speech.
4. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 3, characterized in that the preprocessing of the review data further includes removing stop words.
5. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized in that the POS sequence samples are obtained as follows:
a product review sentence that contains a product attribute feature word is defined as a feature sentence, and preprocessed feature sentences are selected as POS sequence samples;
the formal representation model of a POS sequence sample is:
(BF3, BF2, BF1, feature_i, AF1, AF2, AF3, Pos: i)
where feature_i is the feature word, BFi is the i-th word before the feature word, AFi is the i-th word after the feature word, and Pos is the position of the feature word in the feature sentence.
6. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized by further comprising a step of preprocessing the pre-selected feature word set:
determining whether each feature word in the pre-selected feature word set satisfies a preset rule; if so, retaining the feature word, and otherwise deleting it.
7. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 6, characterized in that the preset rule is: the length of the word is at most four characters, and the frequency of the word is within a preset range.
8. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized in that the similarity of the feature words in the pre-selected feature word set is computed as follows: the information content of each feature word in the set is calculated based on HowNet, and the similarity of every pair of feature words in the set is then calculated.
9. The method for extracting product attribute feature words from consumer reviews based on text analysis according to claim 1, characterized in that feature words are merged as follows: two feature words whose similarity exceeds the threshold are merged into a single feature word, namely whichever of the two has the higher frequency.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610580612.1A CN106250365A (en) | 2016-07-21 | 2016-07-21 | Method for extracting product attribute feature words from consumer reviews based on text analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106250365A (en) | 2016-12-21 |
Family
ID=57603270
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610580612.1A Pending CN106250365A (en) | 2016-07-21 | 2016-07-21 | Method for extracting product attribute feature words from consumer reviews based on text analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106250365A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161221 |