CN104915848A - Evaluation content recognition based false evaluation judgment system - Google Patents

Info

Publication number
CN104915848A
Authority
CN
China
Prior art keywords
evaluation
content
evaluation content
false
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510354936.9A
Other languages
Chinese (zh)
Inventor
吴雨浓
何宏靖
刘世林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Business Big Data Technology Co Ltd
Original Assignee
Chengdu Business Big Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Business Big Data Technology Co Ltd
Priority to CN201510354936.9A
Publication of CN104915848A
Legal status: Pending

Abstract

The invention relates to the field of Internet, and particularly relates to an evaluation content recognition based false evaluation judgment system. The evaluation content recognition based false evaluation judgment system comprises a client, a network connection device, a same evaluation content judgment module, a similar evaluation content judgment module and a false evaluation marking module, wherein the client acquires relevant evaluation data information of a target commodity through the network connection device, and sequentially inputs the relevant evaluation data information to the same evaluation content judgment module and the similar evaluation content judgment module; same evaluation content and similar evaluation content in the evaluation information are judged respectively through the same evaluation content judgment module and the similar evaluation content judgment module; and judgment results are inputted into the false evaluation marking module respectively. The evaluation content recognition based false evaluation judgment system realizes automatic recognition for false evaluation content in target commodity evaluation. Compared with an analysis mode of only judging the same evaluation content in the prior art, the false evaluation judgment system provided by the invention is more comprehensive and more accurate in recognition for the false content; and a simple and reliable evaluation identification tool is provided for E-commerce environment managers and commodity consumers.

Description

False evaluation judgment system based on evaluation content recognition
Technical field
The present invention relates to the field of the Internet, and in particular to a false evaluation judgment system based on evaluation content recognition.
Background art
With the spread of the Internet, e-commerce has become a widely used mode of trade. Buyers and sellers transact mainly through an e-commerce platform's web pages or software. Because e-commerce has no traditional physical storefront and needs few sales staff, it controls operating costs better than the traditional transaction model and therefore has a larger price advantage. However, many dishonest merchants, in order to raise their sales, hire professional review-brushing teams to manufacture large numbers of false evaluations that falsely promote their goods, deceiving consumers and thereby raising their real sales.
E-commerce is developing rapidly and at enormous scale, and the number of sellers in the e-commerce environment is very large. When making a purchase decision, users find it hard to judge whether a commodity description is truthful, so they depend heavily on commodity evaluation content. Evaluation cheating by sellers inflates the apparent performance and favorable rating of commodities, and the resulting losses to buyers are serious. To identify false evaluations forged by merchants, the main prior-art method is to count the number of evaluations with identical content: if the same evaluation appears too many times, it is judged to be a false evaluation. This recognition method has a serious problem. Because it discriminates only on identical evaluation content, it misses many false evaluations, since some evaluations differ by only a few words. For example, evaluation 1: "These commodities are pretty good"; evaluation 2: "These things are pretty good". Such a pair is not judged to be false. In practice, professional review-brushing teams deliberately introduce slight differences into evaluation content to evade the identical-content recognition of the prior art. The prior art cannot recognize these false evaluations, so large numbers of them are missed. There is an urgent need for a device that can comprehensively and accurately identify false evaluation content and help commodity buyers avoid transaction risk.
Summary of the invention
To solve the problems in the prior art, the invention provides a false evaluation judgment system based on evaluation content recognition. It judges not only the identical evaluation content in the evaluation information of a target commodity, but also identifies the similar evaluation content through a similar evaluation content judgment module, so that false evaluation content in the target commodity's evaluation information is recognized comprehensively. This overcomes the failure of existing recognition techniques to detect large numbers of similar false evaluations.
To achieve the above object, the invention provides the following technical solution:
A false evaluation judgment system based on evaluation content recognition comprises a client, a network connection device, an identical and similar evaluation content judgment module, and a false evaluation marking module. The client obtains the relevant evaluation data of the target commodity through the network connection device. (Crawler technology can now readily fetch the relevant information from a target web page: extraction is fast, the volume of data that can be analyzed is huge, the methods for analyzing the extracted data are mature, and the cost is low. The client performs the analysis and data acquisition for the target commodity.) The client outputs this evaluation information to the identical and similar evaluation content judgment module, which identifies the identical evaluation content and the similar evaluation content in the target commodity's evaluation data; the false evaluation marking module then marks the recognition results. Further, the identical and similar evaluation content judgment module comprises an identical evaluation content judgment module and a similar evaluation content judgment module. When the system is working, the client outputs the target commodity's evaluation data to the identical evaluation content judgment module, which outputs the identical evaluation content it finds to the false evaluation marking module and outputs the remaining evaluation information to the similar evaluation content judgment module. The similar evaluation content judgment module judges which evaluations are similar in content and outputs its judgment results to the false evaluation marking module.
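The hand-off described above, in which identical evaluations go to the marking module and the remaining evaluations go on to the similar-content stage, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `split_identical` and the rule that a whitespace-trimmed exact match counts as "identical" are both hypothetical.

```python
from collections import defaultdict


def split_identical(evaluations):
    """Separate evaluations whose text occurs more than once from the rest.

    Returns (identical, remaining):
      identical maps each repeated text to the positions where it occurs;
      remaining lists the positions of evaluations with unique text, which
      would be passed on to the similar-content stage.
    """
    positions = defaultdict(list)
    for index, text in enumerate(evaluations):
        # Assumption: a whitespace-trimmed exact match counts as identical.
        positions[text.strip()].append(index)

    identical = {text: idx for text, idx in positions.items() if len(idx) > 1}
    remaining = [idx[0] for idx in positions.values() if len(idx) == 1]
    return identical, remaining


reviews = [
    "These commodities are pretty good",
    "These things are pretty good",
    "These commodities are pretty good",
]
identical, remaining = split_identical(reviews)
print(identical)  # {'These commodities are pretty good': [0, 2]}
print(remaining)  # [1]
```

Note that the near-duplicate at position 1 is not caught here; it is exactly the kind of evaluation that the similar-content stage is meant to handle.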
If a merchant wants to raise a commodity's displayed sales volume and favorable rating through fake transactions and fake evaluations, the number of false evaluations required is very large, and fabricated evaluations in this situation often appear with identical content. To evade the prior art's screening of identical evaluation content, professional review-brushing teams deliberately vary the evaluation content; but because the demand for false evaluations is large, the similarity between these evaluations remains high. In the present invention, the identical evaluation content judgment module judges identical evaluation content, and the similar evaluation content judgment module compares the similarity of the content text in the target commodity's evaluation data. (Text similarity comparison algorithms are mature; for example, cosine similarity can be used to compute the similarity between the texts being compared.) The module calculates the content similarity of the evaluations being compared and compares the result with a threshold preset in the module; if the content similarity exceeds the threshold, the evaluations being compared are judged to be similar evaluation content. The invention thus recognizes false evaluation content in the target commodity's evaluation information comprehensively, overcoming the failure of existing recognition techniques to detect large numbers of similar false evaluations.
Further, the similar content judgment module, together with the similar ID judgment module, outputs its judgment results to the false evaluation marking module. From the number of identical and similar evaluations, the false evaluation marking module calculates the suspected false evaluation rate: (number of identical and similar evaluations) / (number of all evaluations of the target commodity). When the suspected false evaluation rate is greater than a preset threshold, the false evaluation marking module judges these identical and similar evaluations to be false evaluations and marks them.
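The ratio test described above is simple enough to write down directly. The sketch below is illustrative only: the function names are hypothetical, and the default threshold of 0.3 is an assumed placeholder, since the text does not state a threshold value.

```python
def suspected_false_rate(num_identical_or_similar, num_total):
    """(identical and similar evaluations) / (all evaluations of the commodity)."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return num_identical_or_similar / num_total


def should_mark_false(num_identical_or_similar, num_total, threshold=0.3):
    """True when the suspected rate is strictly greater than the threshold.

    Assumption: 0.3 is a placeholder; the source only says the threshold
    is preset.
    """
    return suspected_false_rate(num_identical_or_similar, num_total) > threshold


print(suspected_false_rate(40, 100))  # 0.4
print(should_mark_false(40, 100))     # True
print(should_mark_false(20, 100))     # False
```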
Preferably, the identical evaluation content judgment module is an identical evaluation content judgment server, the similar evaluation content judgment module is a similar evaluation content judgment server, and the false evaluation marking module is a false evaluation marking server. Servers perform well in processing power, stability, reliability, security, scalability, and manageability. Having servers carry out the ID similarity and content similarity judgments allows the relevant data of a large number of e-commerce target commodities to be processed quickly and efficiently. Because the invention processes identical evaluation content and similar evaluation content on different servers, it can handle a larger volume of data and the system's performance is more stable.
Compared with the prior art, the beneficial effects of the invention are as follows. The invention provides a false evaluation judgment system based on evaluation content recognition. The client accesses the web address of the target commodity, crawls the evaluation data from the corresponding commodity page, and inputs the relevant evaluation data into the identical evaluation content judgment module, which judges the evaluations with identical content; the similar evaluation content judgment module compares the similarity of the content text in the target commodity's evaluation data. The invention recognizes false evaluation content in the target commodity's evaluation information comprehensively, overcoming the failure of existing recognition techniques to detect large numbers of similar false evaluations. The false evaluation marking module marks the false evaluations found (both identical and similar evaluation content), so users can take the marked false evaluations into account and avoid the transaction risk caused by sellers' evaluation cheating.
Brief description of the drawings
Fig. 1 is a modular connection diagram of the false evaluation judgment system based on evaluation content recognition.
Fig. 2 is a detailed connection diagram of the false evaluation judgment system based on evaluation content recognition.
Fig. 3 is a preferred connection diagram of the false evaluation judgment system based on evaluation content recognition.
Detailed description of the embodiments
The invention is described in further detail below with reference to test examples and embodiments. This should not be interpreted as limiting the scope of the above subject matter of the invention to the following embodiments; all techniques realized on the basis of the content of the invention fall within its scope.
The invention provides a false evaluation judgment system based on evaluation content recognition. It judges not only the identical evaluation content in the evaluation information of a target commodity, but also identifies the similar evaluation content through a similar evaluation content judgment module, so that false evaluation content in the target commodity's evaluation information is recognized comprehensively. This overcomes the failure of existing recognition techniques to detect large numbers of similar false evaluations.
To achieve the above object, the invention provides the following technical solution:
As shown in Fig. 1 and Fig. 2, the false evaluation judgment system based on evaluation content recognition comprises a client, a network connection device, an identical and similar evaluation content judgment module, and a false evaluation marking module (shown inside the dashed frame). The client obtains the relevant evaluation data of the target commodity through the network connection device. (Crawler technology can now readily fetch the relevant information from a target web page: extraction is fast, the volume of data that can be analyzed is huge, the methods for analyzing the extracted data are mature, and the cost is low. The client performs the analysis and data acquisition for the target commodity.) The client outputs this evaluation information to the identical and similar evaluation content judgment module, which identifies the identical evaluation content and the similar evaluation content in the target commodity's evaluation data; the false evaluation marking module then marks the recognition results. Further, the identical and similar evaluation content judgment module comprises an identical evaluation content judgment module and a similar evaluation content judgment module. When the system is working, the client outputs the target commodity's evaluation data to the identical evaluation content judgment module, which outputs the identical evaluation content it finds to the false evaluation marking module and outputs the remaining evaluation information to the similar evaluation content judgment module. The similar evaluation content judgment module judges which evaluations are similar in content and outputs its judgment results to the false evaluation marking module.
If a merchant wants to raise a commodity's displayed sales volume and favorable rating through fake transactions and fake evaluations, the number of false evaluations required is very large, and fabricated evaluations in this situation often appear with identical content. To evade the prior art's screening of identical evaluation content, professional review-brushing teams deliberately vary the evaluation content; but because the demand for false evaluations is large, the similarity between these evaluations remains high. In the present invention, the identical evaluation content judgment module judges identical evaluation content, and the similar evaluation content judgment module compares the similarity of the content text in the target commodity's evaluation data and counts the number of evaluations with similar content. (Text similarity comparison algorithms are mature; for example, cosine similarity can be used to compute the similarity between the texts being compared.) The module calculates the content similarity of the evaluations being compared and compares the result with a threshold preset in the module; if the content similarity exceeds the threshold, the evaluations being compared are judged to be similar evaluation content.
To compute the cosine similarity for all evaluations, the complete evaluation data of an e-commerce website can be crawled in advance. Based on word frequency, after deleting some function words (for example, punctuation) and some low-frequency words, an effective content-word vocabulary is established, as shown in Table 1.
Table 1
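One way to build such a vocabulary is sketched below. It is an illustration under assumptions rather than the procedure used by the authors: tokens are obtained by splitting on whitespace (Chinese evaluation text would first need a word segmenter), and the `min_count` cutoff and `stop_tokens` set are hypothetical parameters standing in for the deletion of low-frequency words and function words.

```python
import string
from collections import Counter


def build_vocabulary(evaluations, min_count=2, stop_tokens=frozenset()):
    """Map each retained content word to its position in the vocabulary.

    Punctuation is stripped, tokens in stop_tokens are dropped, and words
    seen fewer than min_count times across the corpus are discarded.
    """
    counts = Counter()
    for text in evaluations:
        for token in text.lower().split():
            token = token.strip(string.punctuation)
            if token and token not in stop_tokens:
                counts[token] += 1

    # Sort so that positions are deterministic across runs.
    kept = sorted(token for token, count in counts.items() if count >= min_count)
    return {token: position for position, token in enumerate(kept)}


corpus = [
    "The goods are pretty good.",
    "These goods are pretty good!",
    "Fast delivery, good price",
]
vocabulary = build_vocabulary(corpus, stop_tokens={"the", "these", "are"})
print(vocabulary)  # {'good': 0, 'goods': 1, 'pretty': 2}
```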
For a specific evaluation, the TF-IDF value of every content word is calculated. (TF-IDF is a statistical method for assessing how important a word is to one document in a document set or corpus. A word's importance increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to its frequency in the corpus.) The values are arranged into a vector according to the words' positions in the vocabulary; for a word that does not appear, the corresponding value is zero, as shown in Table 2.
Table 2
The n values thus calculated form an n-dimensional vector, which represents the evaluation.
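The vector construction can be sketched as follows. TF-IDF has several common variants and the text does not say which one is used, so the weighting here (term frequency as count divided by evaluation length, inverse document frequency as log(N / df)) is an assumption, as are the function name and the pre-tokenized input format.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_evaluations, vocabulary):
    """Return one n-dimensional TF-IDF vector per evaluation.

    tokenized_evaluations: list of token lists, one per evaluation.
    vocabulary: dict mapping each content word to its vector position.
    Words absent from an evaluation keep the value 0.0.
    """
    n_docs = len(tokenized_evaluations)

    # Document frequency: number of evaluations containing each word.
    doc_freq = Counter()
    for tokens in tokenized_evaluations:
        doc_freq.update(t for t in set(tokens) if t in vocabulary)

    vectors = []
    for tokens in tokenized_evaluations:
        counts = Counter(t for t in tokens if t in vocabulary)
        vector = [0.0] * len(vocabulary)
        for token, count in counts.items():
            tf = count / len(tokens)                    # assumed TF variant
            idf = math.log(n_docs / doc_freq[token])    # assumed IDF variant
            vector[vocabulary[token]] = tf * idf
        vectors.append(vector)
    return vectors


vocabulary = {"good": 0, "goods": 1, "fast": 2}
docs = [["good", "goods"], ["good", "fast"]]
vectors = tfidf_vectors(docs, vocabulary)
print(vectors)
```

With this variant, a word that appears in every evaluation (here "good") receives a weight of zero, which reflects the idea that words common across the corpus carry little distinguishing information.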
To calculate the cosine similarity of evaluation A and evaluation B, first obtain the vectors corresponding to the two evaluations, respectively:
$(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$
The similarity probability $p$ of the two evaluations is then obtained from the cosine formula:
$$p = \cos\theta = \frac{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} \cdot \sqrt{b_1^2 + b_2^2 + \cdots + b_n^2}}$$
where $\theta$ is the angle between the two vectors. The larger the probability, the more similar the two evaluations; the smaller the probability, the less similar they are. The calculated similarity probability is compared with a threshold; if it is greater than the threshold, the evaluations are judged to be similar.
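The cosine formula and the threshold comparison can be written out directly. The sketch below is a plain-Python illustration; the function names, the zero-vector convention (returning 0.0), and the default threshold of 0.8 are assumptions, since the text only says that the threshold is preset.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an evaluation with no vocabulary words matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(a, b, threshold=0.8):
    """True when the similarity is strictly greater than the threshold.

    Assumption: 0.8 is a placeholder value, not taken from the source.
    """
    return cosine_similarity(a, b) > threshold


print(cosine_similarity([3, 4], [4, 3]))  # 0.96
print(is_similar([3, 4], [4, 3]))         # True
print(is_similar([1, 0], [0, 1]))         # False
```

Because TF-IDF values are non-negative, the similarity computed on such vectors lies between 0 and 1, which is consistent with the text treating it as a probability-like score.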
Further, the similar content judgment module, together with the similar ID judgment module, outputs its judgment results to the false evaluation marking module; the false evaluation marking module returns these judgment results to the client, and the client marks the evaluations judged to be false.
Preferably, as shown in Fig. 3, the identical evaluation content judgment module is an identical evaluation content judgment server, the similar evaluation content judgment module is a similar evaluation content judgment server, and the false evaluation marking module is a false evaluation marking server. Servers perform well in processing power, stability, reliability, security, scalability, and manageability. Having servers carry out the ID similarity and content similarity judgments allows the relevant data of a large number of e-commerce target commodities to be processed quickly and efficiently. Because the invention processes identical evaluation content and similar evaluation content on different servers, it can handle a larger volume of data and the system's performance is more stable.

Claims (6)

1. A false evaluation judgment system based on evaluation content recognition, characterized in that it comprises a client, a network connection device, an identical and similar evaluation content judgment module, and a false evaluation marking module; wherein the client obtains the relevant evaluation data of a target commodity through the network connection device and outputs said evaluation data to the identical and similar evaluation content judgment module; the identical and similar evaluation content judgment module identifies the identical evaluation content and the similar evaluation content in the target commodity's evaluation data, and the recognition results are marked by the false evaluation marking module.
2. The false evaluation judgment system based on evaluation content recognition according to claim 1, characterized in that the identical and similar evaluation content judgment module comprises an identical evaluation content judgment module and a similar evaluation content judgment module; wherein the client outputs the target commodity's evaluation data to the identical evaluation content judgment module, and the identical evaluation content judgment module outputs the identical evaluation content found in the target commodity's evaluation data to the false evaluation marking module; the identical evaluation content judgment module outputs the evaluation information other than the identical evaluation content to the similar evaluation content judgment module, and the similar evaluation content judgment module judges which evaluations are similar in content and outputs the judgment results to the false evaluation marking module.
3. The false evaluation judgment system based on evaluation content recognition according to claim 2, characterized in that the identical evaluation content judgment module judges identical evaluation content by performing text recognition on the evaluation content in the evaluation data.
4. The false evaluation judgment system based on evaluation content recognition according to claim 2, characterized in that the similar evaluation content judgment module judges similar evaluation content by performing text similarity comparison on the evaluation content in the evaluation data.
5. The false evaluation judgment system based on evaluation content recognition according to claim 3 or 4, characterized in that the false evaluation marking module marks the identical and similar evaluation content that has been judged.
6. The false evaluation judgment system based on evaluation content recognition according to claim 5, characterized in that the identical evaluation content judgment module is an identical evaluation content judgment server; the similar evaluation content judgment module is a similar evaluation content judgment server; and the false evaluation marking module is a false evaluation marking server.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510354936.9A CN104915848A (en) 2015-05-16 2015-06-25 Evaluation content recognition based false evaluation judgment system

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201510250951 2015-05-16
CN2015102509519 2015-05-16
CN201510354936.9A CN104915848A (en) 2015-05-16 2015-06-25 Evaluation content recognition based false evaluation judgment system

Publications (1)

Publication Number Publication Date
CN104915848A true CN104915848A (en) 2015-09-16

Family

ID=54084894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510354936.9A Pending CN104915848A (en) 2015-05-16 2015-06-25 Evaluation content recognition based false evaluation judgment system

Country Status (1)

Country Link
CN (1) CN104915848A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102339445A (en) * 2010-07-23 2012-02-01 阿里巴巴集团控股有限公司 Method and system for evaluating credibility of network trade user
CN102915501A (en) * 2012-10-29 2013-02-06 江苏乐买到网络科技有限公司 Method for optimizing online shopping evaluating information
CN103150378A (en) * 2013-03-13 2013-06-12 珠海市君天电子科技有限公司 Method for identifying false favorable comments in microblog advertisements
CN103198161A (en) * 2013-04-28 2013-07-10 中国科学院计算技术研究所 Microblog ghostwriter identifying method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280560A (en) * 2017-01-06 2018-07-13 广州市动景计算机科技有限公司 A kind of anti-brush method and device of subject evaluation
CN111681075A (en) * 2020-05-27 2020-09-18 引众传媒(苏州)有限公司 Commodity information modification system for electronic commerce and working method thereof
CN111681075B (en) * 2020-05-27 2023-05-26 深圳市艾利艾文化科技有限公司 Commodity information modification system for electronic commerce and working method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150916