WO2016086724A1 - Method and apparatus for determining quality information of a candidate item - Google Patents

Method and apparatus for determining quality information of a candidate item

Info

Publication number
WO2016086724A1
WO2016086724A1 PCT/CN2015/091925 CN2015091925W WO2016086724A1 WO 2016086724 A1 WO2016086724 A1 WO 2016086724A1 CN 2015091925 W CN2015091925 W CN 2015091925W WO 2016086724 A1 WO2016086724 A1 WO 2016086724A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
quality
comment
quality information
content
Prior art date
Application number
PCT/CN2015/091925
Inventor
王薇薇
李光
牟海东
李书鹏
张振平
徐光勇
张祎轶
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司
Priority to KR1020177014844A (patent KR20170080645A)
Priority to JP2017529656A (patent JP2017536632A)
Publication of WO2016086724A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services

Definitions

  • this application claims priority to a Chinese patent application.
  • the filing date of the priority application is December 5, 2014, and its application number is 201410743141.2.
  • the invention is entitled "A Method and Apparatus for Determining Quality Information of a Candidate Item".
  • the present invention relates to the field of Internet technologies, and in particular to a technology for determining quality information of a candidate item.
  • the page can present 5 stars to the user.
  • the user lights up stars, and the number of stars lit by the user represents the user's degree of satisfaction with the candidate item.
  • the user can rate the candidate item on a 100-point scale, and the score indicates the degree of satisfaction with the candidate item.
  • this scoring mechanism is easily abused. For example, a promoter of a candidate item may hire a large number of reviewers to give it high scores. As another example, a reviewer venting dissatisfaction may give the candidate item a low score. Therefore, the score of a candidate item cannot accurately reflect its merits.
  • a method for determining quality information of a candidate item, wherein each candidate item has one or more pieces of comment information, and the method includes:
  • each candidate item has one or more pieces of comment information
  • the device includes:
  • the present invention determines the quality information of each piece of comment information of the candidate item based on the quality of the comment content and/or the credibility of the reviewer, and integrates the quality information of each piece of comment information to determine the quality information of the candidate item.
  • the present invention proposes a new evaluation index for the candidate item, namely the quality information of the candidate item.
  • the quality information of the candidate item can be more accurate and is not easily abused by reviewers.
  • the quality information of candidate items can guide reviewers to comment on candidate items truthfully and objectively, and gradually eliminate promotional comments and malicious attacks, thereby creating a better commenting atmosphere.
  • the quality information of candidate items can be used to sort a plurality of candidate items; compared with ranking by reviewers' ratings, ranking by quality information is more accurate, more objective, and more realistic.
  • FIG. 1 shows a flow chart of a method for determining quality information of a candidate item according to an embodiment of the present invention
  • FIG. 2 is a diagram showing an apparatus for determining quality information of a candidate item according to another embodiment of the present invention.
  • the invention can be implemented by any device having computing processing capabilities, such as network devices, user equipment, and the like.
  • in the following, network devices are mostly used as examples.
  • each candidate item has one or more pieces of comment information, and the network device determines the quality information of each piece of comment information. Then, the network device integrates the quality information of each piece of comment information to determine the quality information of the candidate item.
  • the network device includes, but is not limited to, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers.
  • the cloud is composed of a large number of hosts or network servers based on cloud computing, which is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
  • the quality information of the candidate item may be determined by the network device as a whole, by some of its network hosts/servers, or even by a specific device installed in one or more network hosts/servers, such as a determining device.
  • the user equipment includes, but is not limited to, any electronic product that can interact with the user through an input device such as a keyboard, a virtual keyboard, a touch pad, a touch screen, or a voice control device, for example a PC, a notebook computer, a mobile phone, a smart phone, a PDA, or a tablet computer.
  • in step S1, the network device determines the quality information of each piece of comment information of the candidate item; in step S2, the network device integrates the quality information of each piece of comment information to determine the quality information of the candidate item.
  • in step S1, the network device determines the quality information of each piece of comment information of the candidate item.
  • each candidate item has one or more pieces of comment information.
  • a candidate item means an item whose quality is to be evaluated. Candidate items include but are not limited to restaurants, music, movies, books, and websites.
  • comment information means information that a reviewer posts to comment on a candidate item.
  • the form of the comment information includes but is not limited to text, pictures, and videos.
  • a website can serve as a candidate item.
  • the website can be any Internet website, such as Baidu, Google, Sina, Sohu, Jingdong, Taobao, Ctrip, Qunar, Dianping, or Meituan.
  • the website should be understood in a broad sense: it covers not only the aforementioned first-level websites but also lower-level channels of a first-level website, for example Sina News, Baidu Map, and Sohu Sports.
  • the comment information of a website includes, but is not limited to, any information posted by a reviewer in the comment area of the website, such as the reviewer's impressions from browsing the website, comments on the website's page design, comments on the website's features, and experience of using the website.
  • the quality information of the comment information is used to measure how good or bad the comment information is.
  • the quality information can be represented by a numerical value, which can sometimes be regarded as a quality score.
  • the quality information of the comment information can be measured in a variety of ways, for example based on factors such as the comment content and the reviewer.
  • the quality information of the comment information may be determined based on at least one of the following:
  • the quality of the comment content indicates how good or bad the comment content is.
  • this quality can be expressed as a numerical value and thus can sometimes be regarded as a quality value.
  • the reviewer means the user who posted the comment information.
  • the credibility of the reviewer indicates how trustworthy the reviewer is. This credibility can be expressed as a numerical value and can sometimes be regarded as a credibility value.
  • the quality information of the comment information may be determined based only on the quality of the comment content.
  • for example, the network device may directly take the quality of the comment content as the quality information of the comment information; alternatively, the quality information of the comment information may be determined based only on the credibility of the reviewer, in which case the network device may directly take the credibility of the reviewer as the quality information of the comment information; or the quality information of the comment information may be determined based on both the quality of the comment content and the credibility of the reviewer.
  • for example, the network device may take the sum or the average of the quality of the comment content and the credibility of the reviewer as the quality information of the comment information.
  • the quality of the comment content and the credibility of the reviewer can themselves be determined in various ways. The following discusses how the quality of the comment content is determined and how the credibility of the reviewer is determined.
  • the quality of the comment content may be determined based on at least one of the following parameters:
  • the relevance of the comment content to the candidate item indicates how closely the two are related.
  • the relevance can be expressed as a numerical value.
  • the relevance of text-form comment content to the candidate item can be determined by semantic analysis of the comment content. For example, the more keywords in the comment content match the keyword table corresponding to the candidate item, the higher the relevance of the comment content to the candidate item; the keyword table holds one or more keywords corresponding to the candidate item.
  • for example, the keywords corresponding to Jingdong include "commodity type", "commodity price", "price-performance ratio", "shopping experience", and the like.
  • the keyword table of each candidate item may be preset or extracted from the page content of the candidate item, for example by adding words that occur frequently in the page to the keyword table of the candidate item.
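
The keyword matching described above can be sketched as follows. This is only an illustration: the function name `keyword_relevance` and the choice to report relevance as the fraction of matched keywords are assumptions, since the text only states that more matches mean higher relevance.

```python
def keyword_relevance(comment_text, keyword_table):
    """Return the fraction of the candidate item's keywords found in the comment.

    Assumption: relevance is the share of matched keywords, a value in [0, 1].
    """
    if not keyword_table:
        return 0.0
    text = comment_text.lower()
    matched = sum(1 for kw in keyword_table if kw.lower() in text)
    return matched / len(keyword_table)


# Keyword table taken from the Jingdong example in the text.
jd_keywords = ["commodity type", "commodity price",
               "price-performance ratio", "shopping experience"]
score = keyword_relevance(
    "Great shopping experience and a fair commodity price.", jd_keywords)
```

Here two of the four keywords match, so `score` is 0.5.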
  • comment content in picture or video form can be processed by OCR (Optical Character Recognition) on each image or video frame to obtain the corresponding text-form comment content, which is then used to determine the relevance of the comment content to the candidate item.
  • alternatively, this relevance can be determined manually, or pictures or videos that the network device cannot judge can be submitted for manual review.
  • comment content in picture or video form does not appear on its own but together with text-form comment content, so the relevance can also be determined from the accompanying text alone.
  • whether text-form comment content contains an advertisement can be determined by checking whether the comment content contains strings with advertising features.
  • strings with advertising features include, for example, strings that conform to predetermined number rules and other advertising keywords.
  • the former are specifically strings that conform to phone number rules, such as a ten-digit string beginning with "400" or "800" or an eleven-digit string beginning with "13" or "15", or strings that conform to the account rules of instant messaging software.
  • comment content in picture or video form can likewise be converted by OCR into the corresponding text-form comment content, after which the method described above for text-form comment content determines whether it contains an advertisement.
  • since advertisement pictures and videos are generally highly repetitive, whether comment content in picture or video form contains an advertisement can also be determined by querying an advertisement picture library and/or an advertisement video library.
  • whether comment content in picture or video form contains an advertisement can also be determined manually, or pictures or videos that the network device cannot judge can be submitted for manual review.
  • if the comment content contains an advertisement, its quality can be determined to be zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, and combined with other parameters to determine the quality of the comment content, so as to eliminate the influence of the advertisement on quality.
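
A minimal sketch of the advertisement check for text-form comment content, using Python's standard `re` module. The two phone-number patterns follow the rules quoted above; the `AD_KEYWORDS` list and the function name are hypothetical placeholders, not part of the original description.

```python
import re

# Phone-number rules quoted in the text.
PHONE_PATTERNS = [
    re.compile(r"(?<!\d)(?:400|800)\d{7}(?!\d)"),  # ten digits starting with 400 or 800
    re.compile(r"(?<!\d)1[35]\d{9}(?!\d)"),        # eleven digits starting with 13 or 15
]
# Hypothetical advertising keywords, for illustration only.
AD_KEYWORDS = ["discount code", "add me on"]


def contains_advertisement(text):
    """Return True if the text contains a string with advertising features."""
    if any(pattern.search(text) for pattern in PHONE_PATTERNS):
        return True
    lowered = text.lower()
    return any(keyword in lowered for keyword in AD_KEYWORDS)
```

A comment flagged this way could then be given a quality of zero, as the text allows.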
  • feedback information means feedback from reviewers on the comment content.
  • feedback information includes, for example, likes, objections, and replies.
  • the number of pieces of feedback information serves as a parameter for evaluating the quality of the comment content; it can be converted into a parameter value by some conversion method, such as normalization to a value in the range 0-100 or 0-1.
  • whether the number of pieces of feedback information is used on its own or in combination with other parameters to determine the quality of the comment content, the more likes and replies, the higher the quality of the comment content; the more objections, the lower the quality of the comment content.
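
One possible conversion of feedback counts into a 0-1 parameter value is sketched below. The text does not specify the conversion method, so the ratio used here, the neutral value 0.5 for comments without feedback, and the name `feedback_score` are all assumptions.

```python
def feedback_score(likes, replies, objections):
    """Map feedback counts to a value in [0, 1].

    Assumption: the score is the share of positive feedback (likes and
    replies) among all feedback; 0.5 is returned when there is none.
    """
    total = likes + replies + objections
    if total == 0:
        return 0.5
    return (likes + replies) / total
```

With this choice, more likes and replies raise the value and more objections lower it, which matches the direction described in the text.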
  • sensitive words include, but are not limited to, strings used to express pornography, reactionary content, violence, and the like.
  • comment content in picture or video form can be converted by OCR into the corresponding text-form comment content, after which the method described above for text-form comment content determines whether it contains sensitive information.
  • whether comment content in picture or video form contains sensitive information can also be determined manually, or pictures or videos that the network device cannot judge can be submitted for manual review.
  • if the comment content contains sensitive information, its quality can be determined to be zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, and combined with other parameters to determine the quality of the comment content, so as to eliminate the influence of sensitive information on quality.
  • the quality of the comment content may be determined based on any one of the above parameters, for example by taking the value of that parameter as the quality of the comment content; alternatively, it may be determined based on a combination of at least two of the above parameters, in which case the quality of the comment content is obtained by a calculation on the values of those parameters.
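
As an illustration of combining several parameters, the sketch below merges relevance and feedback into one content quality and forces the result to zero when an advertisement or sensitive information is present. The 0.6/0.4 weights, the 0-100 output scale, and the function name are assumptions; the text leaves the calculation open.

```python
def content_quality(relevance, feedback, has_ad, has_sensitive,
                    weights=(0.6, 0.4)):
    """Combine parameter values in [0, 1] into a content quality in [0, 100].

    Assumption: a weighted sum of relevance and feedback; the weights are
    illustrative defaults, not values from the original description.
    """
    if has_ad or has_sensitive:
        return 0.0  # the text allows setting the quality to zero here
    w_relevance, w_feedback = weights
    return 100.0 * (w_relevance * relevance + w_feedback * feedback)
```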
  • the credibility of the reviewer is an evaluation of the reviewer himself or herself, and can be determined based on at least one of the following parameters:
  • the reviewer's identity credibility evaluates the credibility of the reviewer from an identity perspective.
  • the identity credibility can be expressed as a numerical value.
  • the identity credibility of the reviewer can be determined by whether the reviewer has passed various verifications, such as ID card verification, mobile phone verification, and payment platform transaction verification; the more verifications the reviewer has passed, the higher the identity credibility.
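
A small sketch of identity credibility as a function of passed verifications. The three verification names come from the examples in the text; scoring by the share of verifications passed is an assumption made for illustration.

```python
# Verification types named in the text; the identifiers are illustrative.
VERIFICATIONS = ("id_card", "mobile_phone", "payment_transaction")


def identity_credibility(passed):
    """Return the share of known verifications the reviewer has passed, in [0, 1]."""
    return sum(1 for v in VERIFICATIONS if v in passed) / len(VERIFICATIONS)
```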
  • the reviewer's behavior credibility evaluates the credibility of the reviewer from a behavioral perspective.
  • the behavior credibility can be expressed as a numerical value.
  • the behavior credibility of the reviewer can be determined from the reviewer's historical behavior record. For example, if the reviewer's historical comment content is highly relevant to the candidate items reviewed, contains no advertisements, and includes the reviewer's personal experience, the reviewer's behavior credibility is high.
  • the behavior credibility of the reviewer can also be determined by the reviewer's level.
  • the reviewer's level can be distinguished along various dimensions: for example, by professional level into expert, ordinary, and so on, or by trust level into whitelist, blacklist, and so on. Each level corresponds to a different behavior credibility.
  • the credibility of the reviewer can be determined based on either of the above parameters, for example by taking the value of that parameter as the credibility of the reviewer; alternatively, it can be determined based on a combination of the two, in which case the credibility of the reviewer is obtained by a calculation on the values of the two parameters.
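
The level-based behavior credibility and its combination with identity credibility might look like the sketch below. The numeric values assigned to each level, the default for unknown levels, and the use of a plain average are assumptions; the text only says that each level corresponds to a different behavior credibility and that the two parameters can be combined by calculation.

```python
# Hypothetical behavior credibility per reviewer level, on a 0-1 scale.
LEVEL_CREDIBILITY = {
    "whitelist": 1.0,
    "expert": 0.9,
    "ordinary": 0.5,
    "blacklist": 0.0,
}


def reviewer_credibility(identity, level):
    """Average identity credibility with the level's behavior credibility."""
    behavior = LEVEL_CREDIBILITY.get(level, 0.5)  # unknown levels treated as ordinary
    return (identity + behavior) / 2
```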
  • the network device may determine the quality information of the comment information according to the quality of the comment content, the credibility of the reviewer, and their respective weights.
  • for example, the network device may determine the quality information of one piece of comment information based on the following Formula 1:
  • $\mathrm{CommentQuality} = \mathrm{Content} \times W_{\mathrm{content}} + \mathrm{User} \times W_{\mathrm{user}}$ (Formula 1)
  • where CommentQuality represents the quality information of the comment information
  • Content represents the quality of the comment content
  • $W_{\mathrm{content}}$ represents the weight of the quality of the comment content
  • User represents the credibility of the reviewer
  • $W_{\mathrm{user}}$ represents the weight of the credibility of the reviewer.
  • the values of $W_{\mathrm{content}}$ and $W_{\mathrm{user}}$ may be preset or dynamically determined. For example, when the quality of the comment content is high, its weight is increased and the weight of the reviewer's credibility is lowered; or, when the credibility of the reviewer is high, its weight is increased and the weight of the comment content quality is lowered.
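
Formula 1 translates directly into code. In the sketch below the default weights of 0.5 are an assumption (they reproduce the plain average mentioned earlier); the text only says that the weights may be preset or determined dynamically.

```python
def comment_quality(content, user, w_content=0.5, w_user=0.5):
    """Formula 1: weighted combination of content quality and reviewer credibility.

    The default weights are illustrative, not values from the original text.
    """
    return content * w_content + user * w_user


average = comment_quality(80, 60)                   # equal weights
content_heavy = comment_quality(80, 60, 0.75, 0.25)  # content weighted more
```

With both weights set to 1 the same function yields the plain sum of the two values.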
  • in step S2, the network device integrates the quality information of each piece of comment information to determine the quality information of the candidate item.
  • the quality information of the candidate item is used to measure how good or bad the candidate item is.
  • the quality information can be represented by a numerical value, which can sometimes be regarded as a quality score.
  • the network device can integrate the quality information in a variety of ways to determine the quality information of the candidate item. For example, the network device may use the sum or the average of the quality information of all the comment information as the quality information of the candidate item. As another example, the highest and lowest values of the quality information of all the comment information may be removed, and the average of the quality information of the remaining comment information used as the quality information of the candidate item.
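
The third aggregation option, averaging after removing the highest and lowest values, can be sketched as below. The handling of lists with two or fewer scores is an assumption, since the text does not cover that case.

```python
def trimmed_mean(scores):
    """Average the scores after dropping one highest and one lowest value.

    Assumption: with two or fewer scores nothing is dropped, and an empty
    list yields 0.0.
    """
    if not scores:
        return 0.0
    if len(scores) <= 2:
        return sum(scores) / len(scores)
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)


item_quality = trimmed_mean([10, 50, 60, 70, 100])
```

In this example the outliers 10 and 100 are discarded and the result is 60.0.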
  • the network device may determine the quality information of the candidate item by weighting, according to the quality information of each piece of comment information and the weight of the evaluation index information corresponding to each piece of comment information.
  • evaluation index information means a reviewer's index-type evaluation of the candidate item.
  • the evaluation index information can express the reviewer's likes and dislikes.
  • the evaluation index information can generally be regarded as the reviewer's rating of the candidate item.
  • the evaluation index information can be expressed as the number of stars lit by the reviewer, a score on a 100-point scale, and the like.
  • by setting comment rules, each piece of comment information can be made to correspond to one piece of evaluation index information.
  • the present invention uses the number of lit stars only as an example of the form of the evaluation index information.
  • other forms of evaluation index information that are applicable to the present invention are also intended to be included within the scope of protection of the present invention and are hereby incorporated by reference.
  • the ways in which the network device determines the quality information of the candidate item include, but are not limited to, the following two:
  • the network device can directly determine the quality information of the candidate item according to the quality information of each piece of comment information and the weight of the evaluation index information corresponding to each piece of comment information.
  • for example, the network device may determine the quality information of the candidate item based on the following Formula 2:
  • $\mathrm{ItemQuality} = \sum_{n=1}^{z} \mathrm{CommentQuality}_n \times P_n$ (Formula 2)
  • where $\mathrm{CommentQuality}_n$ represents the quality information of any piece of comment information n
  • $P_n$ represents the weight of the evaluation index information corresponding to the comment information n
  • z represents the total number of pieces of comment information corresponding to the candidate item
  • ItemQuality represents the quality information of the candidate item.
  • the meaning of Formula 2 is that a weighted sum is computed from the quality information of each piece of comment information and the weight of the evaluation index information corresponding to each piece of comment information, and this weighted sum is the quality information of the candidate item.
  • Formula 2 further considers the influence of the weight of the evaluation index information, that is, the reviewer's intuitive likes and dislikes regarding the candidate item.
  • for example, a reviewer may write rich comment content while giving the candidate item a low evaluation index, such as only one star; in that case the quality score of the comment information may be high, and if the quality score of the candidate item were determined based only on the quality score of the comment information, the quality score of the candidate item would also be higher, which obviously does not match the reviewer's intent.
  • therefore, the evaluation index information is used to weight the quality scores that go into the quality score of the candidate item.
  • when the evaluation index is lower, the corresponding weight is also lower, so weighting the quality score of the comment information by this weight balances the high quality score caused by the reviewer's richer comment content against the reviewer's negative evaluation of the candidate item.
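
The weighted sum of Formula 2 can be sketched as follows, using lit stars as the evaluation index. Mapping a star count to the weight `stars / 5` is an assumption made for illustration; the text does not say how the weight is derived from the evaluation index.

```python
def item_quality_weighted_sum(comments):
    """Formula 2 sketch: sum of comment quality times evaluation-index weight.

    comments: list of (comment_quality, stars) pairs with stars in 1..5.
    Assumption: the weight of a rating is stars / 5.
    """
    return sum(quality * (stars / 5) for quality, stars in comments)


# A detailed one-star comment counts far less than its raw quality of 90.
total = item_quality_weighted_sum([(90, 1), (60, 5)])
```

The one-star comment contributes 18 instead of 90, so the total is about 78, which illustrates the balancing effect described above.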
  • when the quality information of a candidate item is calculated based on Formula 2, the more comment information the candidate item has, the higher its quality information generally is. This reflects to a certain extent the recognition of the candidate item by a large number of reviewers. However, in some application scenarios, this can lead to inconsistent metrics.
  • the network device may therefore also take the total number of pieces of comment information of the candidate item into account: after computing the weighted sum of the quality information of each piece of comment information, it uses the mean to represent the quality information of the candidate item.
  • for example, the network device may determine the quality information of the candidate item based on the following Formula 3:
  • $\mathrm{ItemQuality} = \frac{1}{z} \sum_{n=1}^{z} \mathrm{CommentQuality}_n \times P_n$ (Formula 3)
  • where $\mathrm{CommentQuality}_n$ represents the quality information of any piece of comment information n
  • $P_n$ represents the weight of the evaluation index information corresponding to the comment information n
  • z represents the total number of pieces of comment information corresponding to the candidate item
  • ItemQuality represents the quality information of the candidate item.
  • the quality information determined based on Formula 3 is averaged over the number of pieces of comment information, which is sometimes more accurate and avoids the quality information of a candidate item being high simply because it has many pieces of comment information.
  • the quality information of the candidate items may be normalized so that it always lies in the range 0-1 or 0-100, which makes it possible to compare candidate items effectively based on their quality information.
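
A sketch of the mean-based variant and a simple way to keep results in the 0-100 range. The pairs carry an already computed weight in [0, 1]; the clamping helper `clamp_0_100` is a hypothetical stand-in for normalization, which the text does not specify further.

```python
def item_quality_mean(comments):
    """Formula 3 sketch: weighted sum divided by the number of comments.

    comments: list of (comment_quality, index_weight) pairs.
    """
    if not comments:
        return 0.0
    return sum(q * p for q, p in comments) / len(comments)


def clamp_0_100(value):
    """Assumed normalization: clip the value into the range 0-100."""
    return max(0.0, min(100.0, value))


one_comment = item_quality_mean([(80, 1.0)])
ten_comments = item_quality_mean([(80, 1.0)] * 10)
```

Unlike the weighted sum, the mean does not grow with the number of comments: both values above are 80.0.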
  • the network device can determine the quality information of the candidate item based on the following steps:
  • the network device grades the comment information by quality according to the quality information of each piece of comment information, obtaining comment information belonging to different quality levels.
  • for example, the network device may grade the comment information according to the quality information of each piece of comment information to obtain comment information at three quality levels, high, medium, and low, each quality level containing different comment information.
  • the criteria for quality grading may be preset or dynamically determined. For example, a uniform grading standard can be set that applies to the quality grading of the comment information of all candidate items.
  • the comment information belonging to each quality level depends on the number of levels, for example 2 levels or 3 levels.
  • the network device determines the weight of each quality level according to the evaluation index information corresponding to the comment information in that quality level.
  • for example, the network device can determine the weight corresponding to any one quality level based on Formula 4, in which:
  • $\mathrm{PLevel}_m$ represents the weight of the evaluation index information corresponding to the comment information m in a quality level
  • a represents the total number of pieces of comment information m in the quality level
  • PLevel represents the weight corresponding to the quality level
  • the network device may determine the weight PLevel of each quality level based on Formula 4, for example the weight $\mathrm{PLevel}_{\mathrm{high}}$ of the high quality level, the weight $\mathrm{PLevel}_{\mathrm{midd}}$ of the medium quality level, and the weight $\mathrm{PLevel}_{\mathrm{low}}$ of the low quality level.
  • the network device determines the quality information of the candidate item by weighting the mean of the quality information of the comment information in each quality level with the weight of that quality level.
  • for example, the network device can determine the quality information of the candidate item based on the following Formula 5:
  • $\mathrm{ItemQuality} = \frac{\sum_{h=1}^{b} \mathrm{HighCommentQuality}_h}{b} \times \mathrm{PLevel}_{\mathrm{high}} + \frac{\sum_{j=1}^{c} \mathrm{MiddCommentQuality}_j}{c} \times \mathrm{PLevel}_{\mathrm{midd}} + \frac{\sum_{k=1}^{d} \mathrm{LowCommentQuality}_k}{d} \times \mathrm{PLevel}_{\mathrm{low}}$ (Formula 5)
  • where $\mathrm{HighCommentQuality}_h$ represents the quality information of the high quality level comment information h
  • b represents the total number of pieces of high quality level comment information h
  • $\mathrm{PLevel}_{\mathrm{high}}$ represents the weight of the high quality level
  • $\mathrm{MiddCommentQuality}_j$ represents the quality information of the medium quality level comment information j
  • c represents the total number of pieces of medium quality level comment information j
  • $\mathrm{PLevel}_{\mathrm{midd}}$ represents the weight of the medium quality level
  • $\mathrm{LowCommentQuality}_k$ represents the quality information of the low quality level comment information k
  • d represents the total number of pieces of low quality level comment information k
  • $\mathrm{PLevel}_{\mathrm{low}}$ represents the weight of the low quality level
  • ItemQuality represents the quality information of the candidate item.
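
The level-based steps above can be sketched end to end. Several details are assumptions rather than statements from the text: the grading thresholds 70 and 40, the treatment of empty levels, and in particular the level weight, which is taken here as the mean of the evaluation-index weights of the comments in that level.

```python
def tiered_item_quality(comments, high=70.0, low=40.0):
    """Grade comments into high/midd/low levels and combine per-level means.

    comments: list of (comment_quality, index_weight) pairs.
    Assumptions: fixed thresholds; each level weight is the mean of the
    index weights in that level; empty levels contribute nothing.
    """
    levels = {"high": [], "midd": [], "low": []}
    for quality, weight in comments:
        name = "high" if quality >= high else "midd" if quality >= low else "low"
        levels[name].append((quality, weight))

    total = 0.0
    for members in levels.values():
        if not members:
            continue
        mean_quality = sum(q for q, _ in members) / len(members)
        level_weight = sum(p for _, p in members) / len(members)  # assumed form
        total += mean_quality * level_weight
    return total


result = tiered_item_quality([(80, 1.0), (90, 0.5), (50, 0.5), (20, 0.0)])
```

For this input the high level contributes 85 × 0.75, the medium level 50 × 0.5, and the low level 20 × 0.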
  • the present specification describes a scheme for determining the quality information of one candidate item; those skilled in the art should understand that the network device can determine the quality information of every candidate item based on this scheme.
  • the network device may update the comment information of each candidate item according to a predetermined condition, and then update the quality information of each candidate item according to the calculation methods above.
  • the comment information may be updated on a predetermined period, such as once a week, or on a predetermined event, for example updating the quality information of the corresponding candidate item as soon as newly posted comment information is detected.
  • the network device may establish a candidate item database that stores the calculated quality information of each candidate item and the calculation parameters used to calculate it. The calculation parameters, such as the quality information of each piece of comment information and the corresponding weights, depend on the calculation method the network device uses for the quality information of the candidate item. Taking Formula 2 as an example, when the update period is reached, the network device obtains the new comment information of each candidate item, calculates the quality information of each newly added piece of comment information and its corresponding weight, and then, combining the stored quality information and weights of the previous comment information, recalculates the quality information of the corresponding candidate item.
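
The update procedure can be illustrated with an in-memory stand-in for the candidate item database. The class name `CandidateItemStore`, its methods, and the use of a dictionary instead of a real database are assumptions; the recalculation uses the weighted sum described for Formula 2.

```python
class CandidateItemStore:
    """Keeps per-comment parameters and the latest quality of each item."""

    def __init__(self):
        self._params = {}   # item id -> list of (comment_quality, weight)
        self._quality = {}  # item id -> stored item quality

    def update(self, item_id, new_comments):
        """Add newly posted comments and recalculate the item's quality."""
        params = self._params.setdefault(item_id, [])
        params.extend(new_comments)
        self._quality[item_id] = sum(q * p for q, p in params)
        return self._quality[item_id]

    def get(self, item_id):
        """Return the stored ("offline") quality, or None if never computed."""
        return self._quality.get(item_id)


store = CandidateItemStore()
first = store.update("site-a", [(80, 1.0)])
second = store.update("site-a", [(60, 0.5)])
```

The second call combines the stored parameters with the new comment, so the quality moves from 80.0 to 110.0.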
  • the network device may determine and store the quality information of each candidate item "offline”; or the network device may determine the quality information of each candidate item in "online” in real time. That is, when the quality information of the candidate item needs to be called, the network device can use the quality information of the candidate item that is calculated “offline” and stored in the candidate item database, or is calculated in real time through “online”. Quality information of the assessment.
  • the network device may further sort the plurality of candidate items according to the quality information of each candidate item, and present the ranked plurality of candidate items to the user.
  • For example, the network device can arrange the candidate items in descending order of quality information, placing candidate items with high quality information toward the front and those with low quality information toward the back, and present the ranked candidate items to the user.
  • For example, the network device may first obtain each candidate item, that is, each site under a category, then determine their order according to the quality information of each candidate item, and generate the corresponding site classification page in that order for presentation to the user.
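The ranking step can be sketched in a few lines of Python. This is only an illustration: the function name `rank_candidates`, the site identifiers and the scores are hypothetical and do not come from the patent.

```python
from typing import Dict, List


def rank_candidates(quality_by_item: Dict[str, float]) -> List[str]:
    """Return candidate item identifiers ordered from highest to lowest quality."""
    # sorted() is stable, so items with equal quality keep their input order.
    return sorted(quality_by_item, key=quality_by_item.get, reverse=True)


# Hypothetical sites under one category, with made-up quality scores.
sites = {"site_a": 62.5, "site_b": 88.0, "site_c": 71.3}
ranking = rank_candidates(sites)
```

The resulting list could then drive the order in which sites appear on a classification page.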
  • FIG. 2 shows a schematic diagram of a device according to another embodiment of the present invention, which specifically shows a device for determining quality information of a candidate item, that is, a determining device 10.
  • the determining device 10 is installed in a network device, and specifically includes the device 11 and the device 12.
  • the device 11 determines the quality information of each piece of comment information of the candidate item (for ease of distinction, the device 11 is hereinafter referred to as the comment quality determining means 11); the device 12 integrates the quality information of each piece of comment information to determine the quality information of the candidate item (for ease of distinction, the device 12 is hereinafter referred to as the candidate item quality determining means 12).
  • the comment quality determining means 11 determines the quality information of each piece of comment information of the item to be evaluated.
  • each candidate has one or more pieces of comment information.
  • A candidate item means an item whose quality is to be reviewed. Candidate items include but are not limited to restaurants, music, movies, books, websites, etc.
  • Comment information means the information that a reviewer posts about a candidate item.
  • The form of the comment information includes but is not limited to text, pictures, videos, and the like.
  • Take a website as an example of a candidate item.
  • The website can be any Internet website, such as Baidu, Google, Sina, Sohu, Jingdong, Taobao, Ctrip, Qunar, Dianping, Meituan, etc.
  • The website should be understood in a broad sense: it covers not only the aforementioned first-level websites but also the lower-level channels of a first-level website, such as Sina News, Baidu Map, Sohu Sports, etc.
  • The comment information of a website includes, but is not limited to, any information posted by reviewers in the comment area of the website, such as a reviewer's experience of browsing the website, comments on the website's page design, comments on the website's features, and the reviewer's experience of using the website.
  • the quality information of the review information is used to measure the pros and cons of the review information.
  • the quality information can be represented by a numerical value, which can sometimes be regarded as a quality score.
  • the quality information of the review information can be measured in a variety of ways or factors, such as based on factors such as the content of the review, the reviewer, and the like.
  • The quality information of the comment information may be determined based on at least one of the following: 1) the quality of the comment content; 2) the credibility of the reviewer.
  • The quality of the comment content indicates how good or bad the comment content is.
  • This quality can be expressed as a numerical value and thus can sometimes be regarded as a quality value.
  • The reviewer means the user who posted the comment information.
  • The credibility of the reviewer indicates the degree to which the reviewer can be trusted. It can be expressed as a numerical value and can sometimes be regarded as a trust value.
  • The quality information of the comment information may be determined based only on the quality of the comment content; for example, the comment quality determining means 11 may directly take the quality of the comment content as the quality information of the comment information.
  • Alternatively, the quality information of the comment information may be determined based only on the credibility of the reviewer; for example, the comment quality determining means 11 may directly take the credibility of the reviewer as the quality information of the comment information.
  • Alternatively, the quality information of the comment information may be determined based on both the quality of the comment content and the credibility of the reviewer; for example, the comment quality determining means 11 may take the mean or the sum of the two as the quality information of the comment information.
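The three options described here (content quality alone, reviewer credibility alone, or the mean or sum of both) can be sketched as one small function. The name `combine_quality`, the `mode` parameter and the use of `None` for a missing factor are illustrative assumptions, not part of the patent.

```python
from typing import Optional


def combine_quality(content: Optional[float],
                    credibility: Optional[float],
                    mode: str = "mean") -> float:
    """Combine content quality and reviewer credibility into one score.

    Pass None for a factor that is not used (hypothetical convention).
    """
    if content is None and credibility is None:
        raise ValueError("at least one factor is required")
    if credibility is None:
        return content          # content quality used directly
    if content is None:
        return credibility      # reviewer credibility used directly
    if mode == "sum":
        return content + credibility
    if mode == "mean":
        return (content + credibility) / 2
    raise ValueError(f"unknown mode: {mode}")
```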
  • the quality of the review content and the credibility of the reviewer can also be determined based on other means.
  • the manner in which the quality of the review content is determined and the manner in which the credibility of the reviewer is determined will be separately discussed below.
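  • The quality of the comment content may be determined based on at least one of the following parameters: 1) the relevance of the comment content to the candidate item; 2) whether the comment content contains an advertisement; 3) the amount of feedback information on the comment content; 4) whether the comment content contains sensitive information.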
  • the relevance of the comment content and the candidate item is used to indicate the degree of correlation between the two.
  • the correlation can be expressed by a numerical value.
  • The relevance of text comment content to the candidate item can be determined by semantic analysis of the comment content. For example, the more keywords in the comment content that match the keyword table of the candidate item, the higher the relevance of the comment content to the candidate item; the keyword table holds one or more keywords corresponding to the candidate item.
  • For example, the keywords corresponding to Jingdong include "commodity type", "commodity price", "price-performance ratio", "shopping experience" and the like.
  • The keyword table of each candidate item may be preset or extracted from the page content of the candidate item, for example by adding words that occur frequently in the page to the keyword table.
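A minimal sketch of keyword-table matching follows. The text only states that more matched keywords mean higher relevance; expressing relevance as the fraction of table keywords found in the comment is an assumption made here to obtain a value between 0 and 1, and the function name is hypothetical.

```python
from typing import List


def keyword_relevance(comment_text: str, keywords: List[str]) -> float:
    """Fraction of the candidate item's keywords that occur in the comment text."""
    if not keywords:
        return 0.0
    matched = sum(1 for kw in keywords if kw in comment_text)
    return matched / len(keywords)


# Keyword table for a shopping site, using the example keywords from the text.
jd_keywords = ["commodity type", "commodity price",
               "price-performance ratio", "shopping experience"]
comment = "The commodity price is fair and the shopping experience was smooth."
relevance = keyword_relevance(comment, jd_keywords)
```

Plain substring matching is the simplest choice; a real system would likely tokenize the text first.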
  • Comment content in the form of a picture or video can be converted to text by OCR (Optical Character Recognition) applied to the picture or to each frame of the video, and the resulting text is used to determine the relevance of the comment content to the candidate item.
  • Alternatively, the relevance of picture or video comment content to the candidate item may be determined manually, or the comment quality determining means 11 may submit pictures or videos it cannot judge for manual review.
  • In addition, comment content in the form of a picture or video usually does not appear alone but accompanies text comment content; in that case, only the relevance of the text comment content to the candidate item need be determined.
  • Whether text comment content contains an advertisement can be determined by checking whether it contains a character string with advertisement features.
  • Strings with advertisement features include strings that conform to predetermined number rules and other advertisement keywords.
  • The former are, for example, strings that conform to telephone number rules, such as ten-digit strings beginning with "400" or "800" and eleven-digit strings beginning with "13" or "15", or strings that conform to instant messaging account rules, such as strings beginning with "QQ:".
  • The latter can be determined by querying an advertisement keyword table.
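The number rules above map naturally onto regular expressions. The sketch below is one possible reading of them; the pattern names, the `AD_KEYWORDS` entries and the function name are illustrative assumptions, and the keyword list would in practice come from the advertisement keyword table.

```python
import re

# Ten-digit strings beginning with 400 or 800, and eleven-digit strings
# beginning with 13 or 15, not embedded in a longer run of digits.
PHONE_PATTERNS = [
    re.compile(r"(?<!\d)(?:400|800)\d{7}(?!\d)"),
    re.compile(r"(?<!\d)1[35]\d{9}(?!\d)"),
]
# Instant messaging account marker; both ASCII and full-width colons accepted.
IM_ACCOUNT_PATTERN = re.compile(r"QQ[::]")
# Hypothetical advertisement keywords, for illustration only.
AD_KEYWORDS = {"discount code", "add me"}


def contains_advertisement(text: str) -> bool:
    """Return True if the text matches a number rule or an ad keyword."""
    if any(p.search(text) for p in PHONE_PATTERNS):
        return True
    if IM_ACCOUNT_PATTERN.search(text):
        return True
    lowered = text.lower()
    return any(kw in lowered for kw in AD_KEYWORDS)
```

The same check can be applied to text obtained from pictures or video frames by OCR.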
  • Comment content in the form of a picture or video may also be converted to text by OCR, after which the advertisement recognition described above for text content determines whether it contains an advertisement. Alternatively, since advertisement pictures and videos are usually highly repetitive, whether picture or video comment content contains an advertisement can be determined by querying an advertisement picture library and/or an advertisement video library. Further, this can be determined manually, or the comment quality determining means 11 may submit pictures or videos it cannot judge for manual review.
  • If the comment content contains an advertisement, its quality can be set to zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, so that when it is combined with other parameters to determine the quality of the comment content, the influence of the advertisement on the quality is eliminated.
  • Feedback information means the feedback that reviewers give on the comment content.
  • Feedback information includes, for example, likes, objections and replies.
  • The amount of feedback information serves as one parameter for evaluating the quality of the comment content; it can be converted into the value of that parameter by some transformation, such as normalization to a value in 0-100 or 0-1.
  • When the amount of feedback information is used, alone or in combination with other parameters, to determine the quality of the comment content, the more likes and replies there are, the higher the quality of the comment content; the more objections there are, the lower the quality.
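The patent does not fix a particular transformation from feedback counts to a parameter value. One simple possibility, shown purely as an assumption, is the share of positive feedback (likes plus replies) among all feedback, which lies in the 0-1 range, rises with likes and replies and falls with objections. The neutral value of 0.5 for comments without feedback is likewise an assumption.

```python
def feedback_score(likes: int, replies: int, objections: int) -> float:
    """Map feedback counts to a value in [0, 1] (illustrative transformation)."""
    total = likes + replies + objections
    if total == 0:
        return 0.5  # assumed neutral value when there is no feedback yet
    return (likes + replies) / total
```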
  • Sensitive words include, but are not limited to, strings used to express pornographic, reactionary, violent and similar content.
  • Comment content in the form of a picture or video can be converted to text by OCR, after which the sensitive information recognition described for text content determines whether the comment content contains sensitive information.
  • If the comment content contains sensitive information, its quality can be set to zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, so that when it is combined with other parameters to determine the quality of the comment content, the influence of the sensitive information on the quality is eliminated.
  • In summary, the quality of the comment content may be determined based on any one of the above parameters, for example by taking the value of that parameter as the quality of the comment content; alternatively, it may be determined based on a combination of at least two of the above parameters, for example by computing it from the values of those parameters.
  • the credibility of the reviewer is an evaluation of the reviewer itself, which can be determined based on at least one of the following parameters:
  • The reviewer's identity credibility evaluates the trustworthiness of the reviewer from the perspective of identity.
  • The identity credibility can be represented by a numerical value.
  • The identity credibility can be determined by which identity verifications the reviewer has passed, such as ID card verification, mobile phone verification and payment platform transaction verification; the more verifications the reviewer has passed, the higher the identity credibility.
  • The reviewer's behavioral credibility evaluates the trustworthiness of the reviewer from the perspective of behavior.
  • The behavioral credibility can be expressed as a numerical value.
  • The behavioral credibility can be determined from the reviewer's historical behavior record. For example, if the reviewer's past comment content is highly relevant to the candidate items, contains no advertisements and includes the reviewer's personal experience, the reviewer's behavioral credibility is high.
  • The behavioral credibility can also be determined by the reviewer's level.
  • Reviewer levels can be distinguished along various dimensions: by skill level, reviewers can be divided into experts, ordinary users and so on; by trust level, they can be divided into whitelisted, blacklisted and so on. Each level corresponds to a different behavioral credibility.
  • In summary, the credibility of the reviewer can be determined based on either of the above parameters, for example by taking the value of that parameter as the credibility of the reviewer; alternatively, it can be determined based on a combination of the two, for example by computing it from the values of the two parameters.
  • Preferably, the comment quality determining means 11 may determine the quality information of the comment information as a weighted combination of the quality of the comment content and the credibility of the reviewer, using their respective weights.
  • the comment quality determining means 11 can determine the quality information of one piece of comment information based on the above formula 1.
  • the candidate item quality determining means 12 integrates the quality information of each piece of comment information to determine the quality information of the item to be evaluated.
  • the quality information of the candidate is used to measure the merits of the candidate.
  • the quality information can be represented by a numerical value, which can sometimes be regarded as a quality score.
  • The candidate item quality determining means 12 can determine the quality information of the candidate item in a variety of ways.
  • the candidate item quality determining means 12 may use the sum or average of the quality information of all the comment information as the quality information of the candidate item.
  • the highest value and the lowest value of the quality information of all the comment information may be removed, and the average value of the quality information of the remaining comment information may be used as the quality information of the candidate item.
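Dropping the highest and lowest values before averaging can be sketched as follows. The function name is hypothetical, and falling back to the plain mean when fewer than three scores are available is an assumption, since the text does not cover that case.

```python
from typing import Sequence


def trimmed_mean(scores: Sequence[float]) -> float:
    """Mean of the scores after removing one highest and one lowest value."""
    if not scores:
        raise ValueError("no comment quality scores given")
    if len(scores) <= 2:
        # Assumption: with too few scores nothing is trimmed.
        return sum(scores) / len(scores)
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)
```

Compared with the plain sum or mean, this variant limits the effect of a single extreme comment score.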
  • Preferably, the candidate item quality determining means 12 may determine the quality information of the candidate item by weighting the quality information of each piece of comment information with the weight of the evaluation index information corresponding to that piece of comment information.
  • Evaluation index information means an index-type evaluation that the reviewer gives the candidate item.
  • The evaluation index information can express how much the reviewer likes or dislikes the candidate item.
  • The evaluation index information can generally be regarded as the reviewer's rating of the candidate item.
  • It can take the form of the number of stars lit by the reviewer, a score on a 100-point scale, and the like.
  • In general, the commenting rules can be set so that each piece of comment information corresponds to one piece of evaluation index information.
  • For simplicity, the present invention uses only the number of lit stars as an example of the form of the evaluation index information.
  • Those skilled in the art will understand that other existing or future forms of evaluation index information, if applicable to the present invention, are also intended to be included within the scope of the present invention and are hereby incorporated by reference.
  • For example, lighting 5 stars corresponds to the highest evaluation index information, and lighting 1 star corresponds to the lowest.
  • the correspondence between the evaluation index information and the weight is as shown in Table 1 above.
  • the manner in which the candidate item quality determining means 12 determines the quality information of the candidate item includes, but is not limited to, the following two types:
  • the candidate item quality determining means 12 may directly determine the quality information of the candidate item based on the quality information of each piece of comment information and the weight of the evaluation index information corresponding to each piece of comment information.
  • the candidate item quality determining means 12 can determine the quality information of the candidate item based on the above formula 2.
  • Formula 2 means that a weighted sum is computed from the quality information of each piece of comment information and the weight of the evaluation index information corresponding to each piece of comment information, and that weighted sum is the quality information of the candidate item.
  • the influence of the weight of the evaluation index information is further considered, that is, the commentator's intuitive likes and dislikes for the candidate items are considered.
  • For example, a reviewer may write rich comment content, so that the quality score of the comment information is high, while giving the candidate item a low evaluation index, such as only one star. If the quality score of the candidate item were determined only from the quality score of the comment information, the quality score of the candidate item would also be high, which clearly does not match the reviewer's intent.
  • The evaluation index information is therefore used to weight the quality scores that contribute to the candidate item.
  • When the evaluation index is lower, the corresponding weight is also lower, so weighting the quality score of the comment information strikes a balance between the high quality score produced by the reviewer's rich comment content and the reviewer's negative evaluation of the candidate item.
  • When the quality information of a candidate item is calculated with Formula 2, a candidate item with more comment information generally receives higher quality information. To a certain extent this reflects the recognition of the candidate item by a large number of reviewers. In some application scenarios, however, it makes the measure inconsistent across candidate items.
  • Therefore, the candidate item quality determining means 12 may further take into account the total number of pieces of comment information of the candidate item: it computes the weighted sum of the quality information of the comment information and uses the mean to represent the quality information of the candidate item.
  • the candidate item quality determining means 12 can determine the quality information of the candidate item based on the above formula 3.
  • the average value is obtained based on the number of comment information, which is sometimes more accurate, and it can be avoided that the quality information of the item to be evaluated is high simply because the number of comment information is large.
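The effect of averaging can be illustrated with a small comparison. The sketch assumes that the mean-based variant divides the weighted sum of comment quality scores by the number of comments, which is how the surrounding text describes it; the exact form of Formula 3 is not reproduced here, and the function names and numbers are made up.

```python
from typing import List, Tuple

# Each entry is (comment quality, weight of its evaluation index).
Comments = List[Tuple[float, float]]


def weighted_sum(comments: Comments) -> float:
    """Weighted sum of comment quality scores."""
    return sum(quality * weight for quality, weight in comments)


def weighted_mean(comments: Comments) -> float:
    """Weighted sum divided by the number of comments (assumed reading)."""
    if not comments:
        raise ValueError("candidate item has no comments")
    return weighted_sum(comments) / len(comments)


few_good = [(80.0, 1.0), (60.0, 0.8)]   # two strong comments
many_weak = [(30.0, 0.6)] * 10           # ten mediocre comments
```

With these made-up numbers the item with ten mediocre comments wins on the weighted sum, while the item with two strong comments wins on the mean.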
  • the quality information of the evaluation items may be normalized so that they are always in the range of 0-1 or 0-100. Therefore, it is possible to effectively evaluate each candidate item based on the quality information.
  • The candidate item quality determining means 12 can determine the quality information of the candidate item based on the following steps:
  • the candidate item quality determining means 12 classifies the quality of each piece of comment information according to the quality information of each piece of comment information to obtain comment information belonging to different quality levels.
  • the candidate item quality determining device 12 may perform quality classification on the comment information according to the quality information of each piece of comment information to obtain three levels of high, medium, and low quality review information, each quality level corresponding to a different comment. information.
  • the criteria for quality grading may be preset or dynamically determined. For example, a uniform grading standard can be set that can be applied to the quality grading of the comment information of all the candidates.
  • Alternatively, the comment information belonging to each quality level may be determined according to the number of levels, for example by dividing the comment information into 2 levels, 3 levels, and so on.
  • the candidate item quality determining means 12 determines the weight corresponding to the corresponding quality level based on the evaluation index information corresponding to the comment information in each quality level.
  • the candidate item quality determining means 12 may determine the weight corresponding to any one of the quality levels based on the above formula 4.
  • The candidate item quality determining means 12 may separately determine the weight PLevel corresponding to each quality level based on Formula 4: specifically, the weight PLevel_high corresponding to the high quality level, the weight PLevel_midd corresponding to the medium quality level, and the weight PLevel_low corresponding to the low quality level.
  • The candidate item quality determining means 12 determines the quality information of the candidate item by weighting the mean of the quality information of the comment information in each quality level with the weight of the corresponding quality level.
  • the candidate item quality determining means 12 may determine the quality information of the candidate item based on the above formula 5.
  • the review quality determining means 11 and the candidate item quality determining means 12 can be integrated.
  • the present specification describes a scheme for determining quality information of one candidate item, and those skilled in the art should understand that the determining apparatus 10 can determine quality information of each candidate item based on the foregoing scheme.
  • The determining device 10 may update the comment information of each candidate item according to a predetermined condition, and then update the quality information of each candidate item using any of the above calculation methods.
  • The comment information may be updated on a predetermined period, such as once a week, or on a predetermined event: for example, when newly posted comment information is detected, the quality information of the corresponding candidate item is updated.
  • The determining device 10 may establish a candidate item database that stores the calculated quality information of each candidate item together with the calculation parameters used to compute it, such as the quality information of each piece of comment information and its corresponding weight. Which parameters are stored depends on the calculation method the determining device 10 uses for the quality information of the candidate item.
  • Taking Formula 2 as an example, when the update period is reached, the determining device 10 obtains the new comment information of each candidate item, calculates the quality information and corresponding weight of each newly added piece of comment information, and then combines them with the stored quality information and weights of the earlier comment information to recalculate the quality information of the corresponding candidate item.
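The update procedure can be sketched with a small in-memory record that keeps the stored calculation parameters. `CandidateRecord` and its fields are hypothetical names, the weighted sum stands in for Formula 2 as described in the text, and a real system would persist the record in the candidate item database.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CandidateRecord:
    """Stored state for one candidate item (illustrative structure)."""
    # One (comment quality, evaluation-index weight) pair per comment.
    params: List[Tuple[float, float]] = field(default_factory=list)
    item_quality: float = 0.0

    def add_comments(self, new_params: List[Tuple[float, float]]) -> float:
        """Merge newly scored comments and recalculate the item quality."""
        self.params.extend(new_params)
        # Weighted sum over stored and new comments.
        self.item_quality = sum(q * w for q, w in self.params)
        return self.item_quality


record = CandidateRecord()
record.add_comments([(80.0, 1.0)])   # first update period
record.add_comments([(50.0, 0.2)])   # next period: one new comment
```

Because only new comments need to be scored, the stored parameters keep each periodic update cheap.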
  • The determining device 10 may determine and store the quality information of each candidate item offline; alternatively, it may determine the quality information of each candidate item online in real time. That is, when the quality information of a candidate item is needed, the determining device 10 can either use the value that was calculated offline and stored in the candidate item database or calculate the value online in real time.
  • The determining device 10 may further include a sorting device (not shown in FIG. 2), which may sort a plurality of candidate items according to the quality information of each candidate item and present the sorted candidate items to the user.
  • For example, the sorting device may arrange the candidate items in descending order of quality information, placing candidate items with high quality information toward the front and those with low quality information toward the back, and present the ranked candidate items to the user.
  • For example, the sorting device may first obtain each candidate item, that is, each site under a category, then determine their order according to the quality information of each candidate item, and generate the corresponding site classification page in that order for presentation to the user.
  • the present invention can be implemented in software and/or a combination of software and hardware, for example, using an application specific integrated circuit (ASIC), a general purpose computer, or any other similar hardware device.
  • the software program of the present invention may be executed by a processor to implement the steps or functions described above.
  • the software program (including related data structures) of the present invention can be stored in a computer readable recording medium such as a RAM memory, a magnetic or optical drive or a floppy disk and the like.
  • some of the steps or functions of the present invention may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform various steps or functions.
  • a portion of the invention can be applied as a computer program product, such as computer program instructions, which, when executed by a computer, can invoke or provide a method and/or solution in accordance with the present invention.
  • The program instructions for invoking the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted in a data stream over a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions.
  • An embodiment in accordance with the present invention includes a device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to operate based on the aforementioned methods and/or technical solutions in accordance with various embodiments of the present invention.

Abstract

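A method and apparatus for determining quality information of a candidate item. The method comprises determining the quality information of each piece of comment information of the candidate item based on the quality of the comment content and/or the credibility of the reviewer (S1), and integrating the quality information of each piece of comment information to determine the quality information of the candidate item (S2). The method and apparatus propose a new evaluation metric for candidate items, namely the quality information of the candidate item. Unlike existing scoring mechanisms for candidate items, the quality information of a candidate item can be more accurate and is less easily abused by reviewers. Moreover, it can guide reviewers to comment on candidate items truthfully and objectively and gradually eliminate promotion, malicious attacks and similar behavior, thereby fostering a better commenting environment.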

Description

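Method and apparatus for determining quality information of a candidate item
This application claims priority to a Chinese patent application filed on December 5, 2014, with application number 201410743141.2 and entitled "Method and apparatus for determining quality information of a candidate item".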
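Technical Field
The present invention relates to the field of Internet technologies, and in particular to a technique for determining quality information of a candidate item.
Background
At present there are many review websites on which reviewers can rate candidate items such as restaurants, music, movies, books and sites. Rating takes many forms. For example, a page may present five stars for the user to light, and the number of stars lit represents the user's degree of satisfaction with the candidate item; or the user may rate the candidate item on a 100-point scale, where the score represents the user's degree of satisfaction.
However, such rating mechanisms are easily abused. For example, the promoter of a candidate item may hire a large number of reviewers to give it high scores. Likewise, a reviewer may repeatedly give a candidate item low scores to vent dissatisfaction. The score of a candidate item therefore does not accurately reflect how good it is.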
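Summary of the Invention
An object of the present invention is to provide a method and apparatus for determining quality information of a candidate item.
According to one aspect of the present invention, there is provided a method for determining quality information of a candidate item, wherein each candidate item has one or more pieces of comment information, the method comprising:
- determining the quality information of each piece of comment information;
- integrating the quality information of each piece of comment information to determine the quality information of the candidate item.
According to another aspect of the present invention, there is further provided an apparatus for determining quality information of a candidate item, wherein each candidate item has one or more pieces of comment information, the apparatus comprising:
means for determining the quality information of each piece of comment information;
means for integrating the quality information of each piece of comment information to determine the quality information of the candidate item.
Compared with the prior art, the present invention determines the quality information of each piece of comment information of a candidate item based, for example, on the quality of the comment content and/or the credibility of the reviewer, and integrates the quality information of each piece of comment information to determine the quality information of the candidate item. The present invention proposes a new evaluation metric for candidate items, namely the quality information of the candidate item. Unlike existing scoring mechanisms for candidate items, the quality information of a candidate item can be more accurate and is less easily abused by reviewers. Moreover, it can guide reviewers to comment on candidate items truthfully and objectively and gradually eliminate promotion, malicious attacks and similar behavior, thereby fostering a better commenting environment.
According to a preferred embodiment of the present invention, the quality information of candidate items can be used to rank multiple candidate items; compared with ranking by reviewers' scores, ranking candidate items by quality information is more accurate, more objective and more truthful.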
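Brief Description of the Drawings
Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
FIG. 1 shows a flowchart of a method for determining quality information of a candidate item according to one embodiment of the present invention;
FIG. 2 shows a schematic diagram of an apparatus for determining quality information of a candidate item according to another embodiment of the present invention.
The same or similar reference numerals in the drawings denote the same or similar components.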
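Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings.
The present invention can be implemented by any device with computing capability, such as a network device or a user device. Most examples herein use a network device. Specifically, each candidate item has one or more pieces of comment information; the network device determines the quality information of each piece of comment information, and then integrates the quality information of each piece of comment information to determine the quality information of the candidate item.
Here, the network device includes but is not limited to a network host, a single network server, a set of multiple network servers, or a cloud formed of multiple servers. The cloud consists of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer made up of a group of loosely coupled computers.
Further, in the present invention, the quality information of a candidate item may be determined by the network device as a whole, by some of its network hosts/servers, or even by a specific apparatus, such as a determining device, installed in one or more network hosts/servers.
User devices include but are not limited to any electronic product that can interact with a user through input devices such as a keyboard, virtual keyboard, touchpad, touch screen or voice control device, for example a PC, notebook computer, mobile phone, smartphone, PDA or tablet computer.
FIG. 1 shows a method flowchart according to one embodiment of the present invention, illustrating a process for determining quality information of a candidate item. As shown in FIG. 1, in step S1 the network device determines the quality information of each piece of comment information of the candidate item; in step S2 the network device integrates the quality information of each piece of comment information to determine the quality information of the candidate item.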
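Specifically, in step S1 the network device determines the quality information of each piece of comment information of the candidate item.
Here, each candidate item has one or more pieces of comment information.
A candidate item means an item whose quality is to be reviewed. Candidate items include but are not limited to restaurants, music, movies, books, websites, etc.
For simplicity, the present invention uses only websites as examples of candidate items. Those skilled in the art will understand that other existing or future candidate items, if applicable to the present invention, are also intended to be included within the scope of the present invention and are hereby incorporated by reference.
Comment information means the information that a reviewer posts about a candidate item. The form of the comment information includes but is not limited to text, pictures, videos, etc.
Taking a website as the candidate item, the website can be any Internet website, such as Baidu, Google, Sina, Sohu, Jingdong, Taobao, Ctrip, Qunar, Dianping, Meituan, etc.
In this specification, "website" should be understood in a broad sense: it covers not only the aforementioned first-level websites but also the lower-level channels of a first-level website, such as Sina News, Baidu Map, Sohu Sports, etc.
The comment information of a website includes but is not limited to any information posted by reviewers in the comment area of the website, typically a reviewer's experience of browsing the website, comments on the website's page design, comments on the website's features, the reviewer's experience of using the website, and so on.
The quality information of the comment information measures how good or bad the comment information is. It can be expressed as a numerical value and can therefore sometimes be regarded as a quality score. The quality information of the comment information can be measured in a variety of ways or by a variety of factors, such as the comment content and the reviewer.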
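Preferably, the quality information of the comment information may be determined based on at least one of the following:
1) The quality of the comment content.
Here, the quality of the comment content indicates how good or bad the comment content is. It can be expressed as a numerical value and can therefore sometimes be regarded as a quality value.
2) The credibility of the reviewer.
The reviewer means the user who posted the comment information.
Here, the credibility of the reviewer indicates the degree to which the reviewer can be trusted. It can be expressed as a numerical value and can therefore sometimes be regarded as a trust value.
The quality information of the comment information may be determined based only on the quality of the comment content; for example, the network device may directly take the quality of the comment content as the quality information of the comment information. Alternatively, it may be determined based only on the credibility of the reviewer; for example, the network device may directly take the credibility of the reviewer as the quality information of the comment information. Alternatively, it may be determined based on both; for example, the network device may take the mean or the sum of the quality of the comment content and the credibility of the reviewer as the quality information of the comment information.
In addition, the quality of the comment content and the credibility of the reviewer may also be determined in other ways. The ways of determining the quality of the comment content and the credibility of the reviewer are discussed separately below.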
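Quality of the comment content
Here, the quality of the comment content may be determined based on at least one of the following parameters:
1) The relevance of the comment content to the candidate item.
Here, the relevance of the comment content to the candidate item indicates how closely the two are related. It can be expressed as a numerical value.
The relevance of text comment content to the candidate item can be determined by semantic analysis of the comment content. For example, the more keywords in the comment content that match the keyword table of the candidate item, the higher the relevance of the comment content to the candidate item; the keyword table holds one or more keywords corresponding to the candidate item. For example, the keywords corresponding to Jingdong include "commodity type", "commodity price", "price-performance ratio", "shopping experience" and the like. The keyword table of each candidate item may be preset or extracted from the page content of the candidate item, for example by adding words that occur frequently in the page to the keyword table.
Comment content in the form of a picture or video can be converted to text by OCR (Optical Character Recognition) applied to the picture or to each frame of the video, and the resulting text is used to determine the relevance of the comment content to the candidate item. Alternatively, the relevance of picture or video comment content to the candidate item may be determined manually, or pictures or videos that the network device cannot judge may be submitted for manual review. In addition, comment content in the form of a picture or video usually does not appear alone but accompanies text comment content; in that case, only the relevance of the text comment content to the candidate item need be determined.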
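2) Whether the comment content contains an advertisement.
Whether text comment content contains an advertisement can be determined by checking whether it contains a character string with advertisement features. Strings with advertisement features include strings that conform to predetermined number rules and other advertisement keywords. The former are, for example, strings that conform to telephone number rules, such as ten-digit strings beginning with "400" or "800" and eleven-digit strings beginning with "13" or "15", or strings that conform to instant messaging account rules, such as strings beginning with "QQ:". The latter can be determined by querying an advertisement keyword table.
Comment content in the form of a picture or video can also be converted to text by OCR, after which the advertisement recognition described above for text content determines whether it contains an advertisement. Alternatively, since advertisement pictures and videos are usually highly repetitive, whether picture or video comment content contains an advertisement can be determined by querying an advertisement picture library and/or an advertisement video library. Furthermore, this can be determined manually, or pictures or videos that the network device cannot judge may be submitted for manual review.
If the comment content contains an advertisement, its quality can be set to zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, so that when it is combined with other parameters to determine the quality of the comment content, the influence of the advertisement on the quality is eliminated.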
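3) The amount of feedback information on the comment content.
Here, feedback information means the feedback that reviewers give on the comment content, such as likes, objections and replies. The amount of feedback information serves as one parameter for evaluating the quality of the comment content; it can be converted into the value of that parameter by some transformation, such as normalization to a value in 0-100 or 0-1.
When the amount of feedback information is used, alone or in combination with other parameters, to determine the quality of the comment content, the more likes and replies there are, the higher the quality of the comment content; the more objections there are, the lower the quality.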
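4) Whether the comment content contains sensitive information.
Whether text comment content contains sensitive information can be determined by querying a sensitive word table. Here, sensitive words include but are not limited to strings used to express pornographic, reactionary, violent and similar content.
Comment content in the form of a picture or video can be converted to text by OCR, after which the sensitive information recognition described above for text content determines whether the comment content contains sensitive information. Alternatively, since pictures and videos containing sensitive information are usually highly repetitive, whether picture or video comment content contains sensitive information can be determined by querying a sensitive picture library and/or a sensitive video library. Furthermore, this can be determined manually, or pictures or videos that the network device cannot judge may be submitted for manual review.
If the comment content contains sensitive information, its quality can be set to zero; alternatively, the value of this parameter can be set to a low value, such as zero or a negative number, so that when it is combined with other parameters to determine the quality of the comment content, the influence of the sensitive information on the quality is eliminated.
In summary, the quality of the comment content may be determined based on any one of the above parameters, for example by taking the value of that parameter as the quality of the comment content; alternatively, it may be determined based on a combination of at least two of the above parameters, for example by computing it from the values of those parameters.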
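Credibility of the commenter
Here, the credibility of the commenter is an evaluation of the commenter himself or herself, and may be determined based on at least any one of the following parameters:
1) The identity credibility of the commenter.
The identity credibility evaluates how far the commenter can be trusted from the perspective of identity. It may be expressed as a numerical value.
The identity credibility may be determined by whether the commenter has passed various identity verifications, such as ID-card verification, mobile-phone verification and payment-platform transaction verification; the more identity verifications the commenter has passed, the higher the identity credibility.
2) The behavior credibility of the commenter.
The behavior credibility evaluates how far the commenter can be trusted from the perspective of behavior. It may be expressed as a numerical value.
The behavior credibility may be determined from the commenter's historical behavior records. For example, if the commenter's past comment content was highly relevant to the items to be evaluated, contained no advertising, and included the commenter's personal experience, the commenter's behavior credibility is high.
Alternatively, the behavior credibility may be determined from the commenter's level. Levels may be distinguished along various dimensions: by skill, for example expert and ordinary; or by trust, for example whitelist and blacklist. Each level corresponds to a different behavior credibility.
In summary, the credibility of the commenter may be determined based on either of the above parameters, for example by taking the value of either parameter as the credibility of the commenter; alternatively, it may be determined based on a combination of the two, for example by computing it from the values of both parameters.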
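The two credibility parameters might be combined as in the sketch below. The verification names, the level-to-value mapping and the equal weights are all hypothetical; the source only states that more passed verifications mean higher identity credibility and that each level maps to a different behavior credibility.

```python
# Hypothetical identifiers for the three verification types named in the text.
IDENTITY_CHECKS = ("id_card", "phone", "payment")

# Hypothetical behavior credibility per commenter level.
LEVEL_CREDIBILITY = {"expert": 1.0, "ordinary": 0.5, "whitelist": 1.0, "blacklist": 0.0}


def identity_credibility(passed: set[str]) -> float:
    """Fraction of the known identity verifications that the commenter passed."""
    return sum(1 for check in IDENTITY_CHECKS if check in passed) / len(IDENTITY_CHECKS)


def reviewer_credibility(passed: set[str], level: str,
                         w_identity: float = 0.5, w_behavior: float = 0.5) -> float:
    """Weighted combination of identity and behavior credibility (weights assumed)."""
    return w_identity * identity_credibility(passed) + w_behavior * LEVEL_CREDIBILITY[level]


print(reviewer_credibility({"id_card", "phone"}, "ordinary"))
```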
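Preferably, the network device may determine the quality information of the comment information by weighting the quality degree of the comment content and the credibility of the commenter with their respective weights.
Specifically, the network device may determine the quality information of a piece of comment information based on Formula 1 below:
$$\mathrm{CommentQuality} = \mathrm{Content} \times W_{content} + \mathrm{User} \times W_{user} \qquad \text{(Formula 1)}$$
where CommentQuality denotes the quality information of the comment information, Content denotes the quality degree of the comment content, W_content denotes the weight of the quality degree of the comment content, User denotes the credibility of the commenter, and W_user denotes the weight of the credibility of the commenter.
Here, the values of W_content and W_user may be preset or may be determined dynamically. For example, when the quality degree of the comment content is high, its weight is raised and the weight of the commenter's credibility is lowered accordingly; or, when the credibility of the commenter is high, its weight is raised and the weight of the quality degree of the comment content is lowered accordingly.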
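Formula 1 and one possible dynamic weighting rule can be sketched as follows. The default weights, the threshold and the size of the shift are assumptions; the source only says that a high value may have its weight raised while the other weight is lowered.

```python
def comment_quality(content: float, user: float,
                    w_content: float = 0.5, w_user: float = 0.5) -> float:
    """Formula 1: CommentQuality = Content * W_content + User * W_user."""
    return content * w_content + user * w_user


def dynamic_weights(content: float, user: float, base: float = 0.5,
                    shift: float = 0.1, threshold: float = 0.8) -> tuple[float, float]:
    """Return (w_content, w_user), shifting weight toward whichever input is high.

    Assumption: 'high' means at or above `threshold`; content is checked first.
    """
    if content >= threshold:
        return base + shift, base - shift
    if user >= threshold:
        return base - shift, base + shift
    return base, base


w_c, w_u = dynamic_weights(0.9, 0.4)
print(comment_quality(0.9, 0.4, w_c, w_u))
```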
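In step S2, the network device combines the quality information of each piece of comment information to determine the quality information of the item to be evaluated.
Here, the quality information of the item to be evaluated measures how good or poor the item is. It may be expressed as a numerical value and may therefore sometimes be regarded as a quality score.
The network device may determine the quality information of the item in a variety of ways. For example, it may take the sum or the mean of the quality information of all the comment information as the quality information of the item. As another example, the highest and lowest values among the quality information of all the comment information may be removed, and the mean of the remaining values may be taken as the quality information of the item.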
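The three simple aggregation options just listed (sum, mean, and mean after dropping the highest and lowest values) can be sketched as below. The function names and the fallback for fewer than three comments are assumptions.

```python
from statistics import mean


def item_quality_sum(scores: list[float]) -> float:
    """Sum of all comment quality scores."""
    return sum(scores)


def item_quality_mean(scores: list[float]) -> float:
    """Mean of all comment quality scores (raises StatisticsError if empty)."""
    return mean(scores)


def item_quality_trimmed(scores: list[float]) -> float:
    """Mean after removing one highest and one lowest score.

    Assumption: with fewer than three scores nothing is removed.
    """
    if len(scores) < 3:
        return mean(scores)
    return mean(sorted(scores)[1:-1])


scores = [0.1, 0.5, 0.7, 0.9]
print(item_quality_sum(scores), item_quality_mean(scores), item_quality_trimmed(scores))
```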
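Preferably, the network device may determine the quality information of the item to be evaluated by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that comment information.
Here, rating index information means an index-type evaluation that the commenter gives to the item to be evaluated. It expresses how much the commenter likes or dislikes the item and can generally be regarded as the commenter's score for the item; specifically, it may take the form of the number of stars the commenter lights up, a score on a 100-point scale, and so on. In general, through the setting of the commenting rules, each piece of comment information can have one corresponding piece of rating index information.
For simplicity, the present invention uses only the number of stars lit up as an example of rating index information. Those skilled in the art will understand that other existing or future forms of rating index information, where applicable to the present invention, also fall within its scope of protection and are incorporated herein by reference.
For example, lighting up 5 stars corresponds to the highest rating index information, and lighting up 1 star corresponds to the lowest. The correspondence between rating index information and weight is shown in Table 1 below:
Rating index information | Weight
1 star | 0.2
2 stars | 0.4
3 stars | 0.6
4 stars | 0.8
5 stars | 1
Table 1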
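Here, the ways in which the network device determines the quality information of the item to be evaluated include, but are not limited to, the following two:
1) The network device may directly determine the quality information of the item by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that comment information.
Specifically, the network device may determine the quality information of the item based on Formula 2 below:
$$\mathrm{ItemQuality} = \sum_{n=1}^{z} \mathrm{CommentQuality}_n \times P_n \qquad \text{(Formula 2)}$$
where CommentQuality_n denotes the quality information of any piece of comment information n, P_n denotes the weight of the rating index information corresponding to comment information n, z denotes the total number of pieces of comment information corresponding to the item, and ItemQuality denotes the quality information of the item.
Formula 2 thus computes a weighted sum from the quality information of each piece of comment information and the weight of its corresponding rating index information; this weighted sum is the quality information of the item.
The quality information determined by Formula 2 additionally takes into account the weight of the rating index information, that is, the commenter's direct like or dislike of the item. Consider a comment in which the commenter writes in detail and the content is closely related to the item, but the content is entirely negative about the item and the commenter's rating index is also low, say only 1 star. The quality score of this comment information may be high, and if the quality score of the item were determined from the quality scores of the comment information alone, the resulting quality score of the item would also be high, which clearly contradicts the commenter's intent. For such scenarios it is beneficial to introduce the rating index information as a weight when calculating the quality score of the item. The lower the rating index, the lower its weight, so weighting the quality score of the comment information by this weight strikes a balance between the high quality score that results from the commenter's rich comment content and the commenter's negative evaluation of the item.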
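The weighted sum of Formula 2, using the star weights of Table 1, can be sketched as follows. Representing each comment as a `(comment_quality, stars)` pair is an assumption made for the example.

```python
# Star-count weights from Table 1.
STAR_WEIGHTS = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0}


def item_quality_weighted_sum(comments: list[tuple[float, int]]) -> float:
    """Formula 2: sum of CommentQuality_n * P_n over all comments of an item.

    Each comment is a (comment_quality, stars) pair; P_n is looked up from stars.
    """
    return sum(quality * STAR_WEIGHTS[stars] for quality, stars in comments)


# A detailed but negative 1-star comment contributes far less than its raw score.
comments = [(0.9, 1), (0.6, 5)]
print(item_quality_weighted_sum(comments))
```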
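When the quality information of the item is calculated with Formula 2, the more comment information there is, the higher the quality information of the item usually is. To some extent this reflects the approval of the item by a large number of commenters. In some application scenarios, however, it may lead to inconsistent measurement standards.
Therefore, building on Formula 2, the network device may also divide the weighted sum of the quality information of the comment information by the total number of pieces of comment information corresponding to the item, and use the resulting mean to represent the quality information of the item.
Specifically, the network device may determine the quality information of the item based on Formula 3 below:
$$\mathrm{ItemQuality} = \frac{\sum_{n=1}^{z} \mathrm{CommentQuality}_n \times P_n}{z} \qquad \text{(Formula 3)}$$
where CommentQuality_n denotes the quality information of any piece of comment information n, P_n denotes the weight of the rating index information corresponding to comment information n, z denotes the total number of pieces of comment information n corresponding to the item, and ItemQuality denotes the quality information of the item.
Because the quality information determined by Formula 3 is averaged over the number of pieces of comment information, it is sometimes more accurate: it avoids an item receiving high quality information merely because it has many comments.
In addition, to prevent a large number of comments from inflating the quality information of an item, the quality information of the item may also be normalized so that it always lies in the range 0-1 or 0-100, allowing items to be evaluated effectively against one another by their quality information.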
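The averaging of Formula 3 and a normalization step can be sketched as below. The source does not say which normalization is meant; min-max scaling across items is an assumption, as is the `(comment_quality, stars)` representation.

```python
# Star-count weights from Table 1.
STAR_WEIGHTS = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0}


def item_quality_weighted_mean(comments: list[tuple[float, int]]) -> float:
    """Formula 3: the weighted sum of Formula 2 divided by the number of comments z."""
    if not comments:
        raise ValueError("an item needs at least one comment")
    weighted_sum = sum(quality * STAR_WEIGHTS[stars] for quality, stars in comments)
    return weighted_sum / len(comments)


def min_max_normalize(values: list[float], upper: float = 100.0) -> list[float]:
    """Scale item scores into [0, upper]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) * upper for v in values]


print(item_quality_weighted_mean([(0.9, 1), (0.6, 5)]))
print(min_max_normalize([2.0, 4.0, 6.0]))
```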
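2) Preferably, the network device may determine the quality information of the item to be evaluated through the following steps:
2.1) The network device grades each piece of comment information according to its quality information, to obtain comment information belonging to different quality levels.
For example, the network device may grade the comment information according to the quality information of each piece, obtaining comment information at three quality levels, high, medium and low, with each quality level containing different comment information. The grading standard may be preset or determined dynamically. For example, a single unified grading standard may be set that applies to the comment information of all items. As another example, the comment information belonging to each quality level may be determined from the quality information of all the comment information of each item according to the number of levels, such as 2 levels or 3 levels.
2.2) Next, the network device determines the weight of each quality level according to the rating index information corresponding to the comment information in that quality level.
For example, the network device may determine the weight of any one quality level based on Formula 4 below:
[Figure PCTCN2015091925-appb-000003] (Formula 4)
where PLevel_m denotes the weight of the rating index information corresponding to comment information m in a quality level, a denotes the total number of pieces of comment information m in that quality level, and PLevel denotes the weight of the quality level.
Based on Formula 4, the network device may separately determine the weight PLevel of each quality level, namely the weight PLevel_high of the high quality level, the weight PLevel_midd of the medium quality level, and the weight PLevel_low of the low quality level.
2.3) Then, the network device determines the quality information of the item by weighting the mean of the quality information of the comment information in each quality level with the weight of that quality level.
For example, the network device may determine the quality information of the item based on Formula 5 below:
$$\mathrm{ItemQuality} = \frac{\sum_{h=1}^{b} \mathrm{HighCommentQuality}_h}{b} \times \mathrm{PLevel}_{high} + \frac{\sum_{j=1}^{c} \mathrm{MiddCommentQuality}_j}{c} \times \mathrm{PLevel}_{midd} + \frac{\sum_{k=1}^{d} \mathrm{LowCommentQuality}_k}{d} \times \mathrm{PLevel}_{low} \qquad \text{(Formula 5)}$$
where HighCommentQuality_h denotes the quality information of high-level comment information h, b denotes the total number of pieces of high-level comment information h, and PLevel_high denotes the weight of the high quality level; MiddCommentQuality_j denotes the quality information of medium-level comment information j, c denotes the total number of pieces of medium-level comment information j, and PLevel_midd denotes the weight of the medium quality level; LowCommentQuality_k denotes the quality information of low-level comment information k, d denotes the total number of pieces of low-level comment information k, and PLevel_low denotes the weight of the low quality level; ItemQuality denotes the quality information of the item.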
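Steps 2.1 to 2.3 can be sketched as follows. Two points are assumptions rather than something the text confirms: Formula 4 is available here only as a figure reference, so the level weight is taken to be the mean of the star weights of the comments in that level; and the fixed cut-off values used for grading are hypothetical.

```python
from statistics import mean

# Star-count weights from Table 1.
STAR_WEIGHTS = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0}

Comment = tuple[float, int]  # (comment_quality, stars)


def split_into_levels(comments: list[Comment], low_cut: float = 0.4,
                      high_cut: float = 0.7) -> dict[str, list[Comment]]:
    """Step 2.1: grade comments into high/mid/low using hypothetical fixed cut-offs."""
    levels: dict[str, list[Comment]] = {"high": [], "mid": [], "low": []}
    for quality, stars in comments:
        if quality >= high_cut:
            levels["high"].append((quality, stars))
        elif quality >= low_cut:
            levels["mid"].append((quality, stars))
        else:
            levels["low"].append((quality, stars))
    return levels


def level_weight(members: list[Comment]) -> float:
    """Step 2.2. ASSUMPTION: PLevel is the mean star weight within the level."""
    return mean(STAR_WEIGHTS[stars] for _, stars in members)


def item_quality_tiered(comments: list[Comment]) -> float:
    """Step 2.3 (Formula 5): sum over levels of mean quality times level weight."""
    total = 0.0
    for members in split_into_levels(comments).values():
        if members:  # assumed: an empty level contributes nothing
            total += mean(q for q, _ in members) * level_weight(members)
    return total


print(item_quality_tiered([(0.9, 5), (0.8, 3), (0.5, 4), (0.2, 1)]))
```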
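The foregoing describes a scheme for determining the quality information of one item to be evaluated; those skilled in the art will understand that the network device can determine the quality information of every item based on that scheme.
Because the comment information of an item changes dynamically, the network device may update the comment information of each item according to a predetermined condition and then update the quality information of each item using the calculation methods above. The update condition may be a predetermined period, for example once a week, or a predetermined event, for example updating the quality information of an item as soon as newly posted comment information is detected for it.
To reduce its computational load, the network device may build an item database that stores the quality information already calculated for each item together with the parameters used to calculate it, such as the quality information of each piece of comment information and the corresponding weights, depending on the calculation method the network device uses. Taking Formula 2 as an example, when the update period arrives, the network device obtains the newly added comment information of each item, calculates the quality information and corresponding weight of each new piece of comment information, and then recalculates the quality information of the item by combining these with the stored quality information and weights of the earlier comment information.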
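The stored-parameter update described above can be sketched with an in-memory store and Formula 2. The class name `ItemStore`, its methods and the use of dictionaries in place of a real database are hypothetical.

```python
class ItemStore:
    """In-memory stand-in for the item database (hypothetical structure)."""

    def __init__(self) -> None:
        # item_id -> list of (comment_quality, weight) pairs already computed
        self._params: dict[str, list[tuple[float, float]]] = {}
        # item_id -> cached ItemQuality
        self._quality: dict[str, float] = {}

    def update(self, item_id: str, new_comments: list[tuple[float, float]]) -> float:
        """Add newly scored comments and recompute the item's quality with Formula 2."""
        stored = self._params.setdefault(item_id, [])
        stored.extend(new_comments)
        self._quality[item_id] = sum(q * w for q, w in stored)
        return self._quality[item_id]

    def get(self, item_id: str) -> float:
        """Return the cached quality without recomputing anything."""
        return self._quality[item_id]


store = ItemStore()
store.update("site-a", [(0.5, 1.0)])
print(store.update("site-a", [(0.5, 0.2)]))
```

Only the new comments need to be scored at each update; the earlier scores and weights are reused from the store.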
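Here, the network device may determine and store the quality information of each item "offline", or it may determine the quality information of each item "online" in real time. That is, when the quality information of an item is needed, the network device may use the quality information it calculated "offline" and stored in the item database, or it may calculate the quality information "online" in real time.
Preferably, the network device may also sort multiple items according to the quality information of each item and present the sorted items to the user.
When pushing multiple items to a user, the network device may order them from high to low, placing items with high quality information toward the front and items with low quality information toward the back, and present the items to the user in that order.
For example, for a site category page, the network device may first obtain the items, that is, the sites under that category, then determine their order according to the quality information of each item, generate the corresponding site category page in that order, and present it to the user.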
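The ordering step can be sketched in a few lines. The mapping from item name to quality score is assumed to be available already, for example from the item database.

```python
def rank_items(item_quality: dict[str, float]) -> list[str]:
    """Return item names ordered from highest to lowest quality information."""
    return sorted(item_quality, key=item_quality.get, reverse=True)


# Hypothetical scores for three sites in one category.
print(rank_items({"site-a": 0.3, "site-b": 0.9, "site-c": 0.5}))
```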
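Fig. 2 shows a schematic diagram of an apparatus according to another embodiment of the present invention, specifically an apparatus for determining the quality information of an item to be evaluated, namely the determining apparatus 10. As shown in Fig. 2, the determining apparatus 10 is installed in a network device and specifically comprises means 11 and means 12.
Means 11 determines the quality information of each piece of comment information of the item to be evaluated (for ease of distinction, means 11 is hereinafter referred to as the comment quality determining means 11); means 12 combines the quality information of each piece of comment information to determine the quality information of the item to be evaluated (for ease of distinction, means 12 is hereinafter referred to as the item quality determining means 12).
Specifically, the comment quality determining means 11 determines the quality information of each piece of comment information of the item to be evaluated.
Here, each item to be evaluated has one or more pieces of comment information.
The item to be evaluated means an item whose quality is to be assessed. Items to be evaluated include, but are not limited to, restaurants, music, films, books and websites.
For simplicity, the present invention uses only websites as an example of items to be evaluated. Those skilled in the art will understand that other existing or future items to be evaluated, where applicable to the present invention, also fall within its scope of protection and are incorporated herein by reference.
Comment information means information that a commenter posts about an item to be evaluated. Its forms include, but are not limited to, text, pictures and video.
Taking websites as the items to be evaluated, a website may be any Internet website, such as Baidu, Google, Sina, Sohu, JD.com, Taobao, Ctrip, Qunar, Dianping and Meituan.
In this specification, "website" is to be understood broadly: it covers not only first-level websites such as those above but also their lower-level channels, such as Sina News, Baidu Maps and Sohu Sports.
The comment information of a website includes, but is not limited to, any information posted by commenters in the comment area of that website, typically the commenter's experience of browsing the website, comments on its page design, comments on its functions, the commenter's experience of using the website, and so on.
The quality information of the comment information measures how good or poor the comment information is. It may be expressed as a numerical value and may therefore sometimes be regarded as a quality score. It may be measured in various ways or by various factors, for example based on the comment content, the commenter and other factors.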
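Preferably, the quality information of the comment information may be determined based on at least any one of the following:
1) The quality degree of the comment content.
Here, the quality degree of the comment content indicates how good or poor the comment content is. It may be expressed as a numerical value and may therefore sometimes be regarded as a quality value.
2) The credibility of the commenter.
Here, the commenter is the user who posted the comment information.
The credibility of the commenter indicates the degree to which the commenter can be trusted. It may be expressed as a numerical value and may therefore sometimes be regarded as a credibility value.
The quality information of the comment information may be determined based only on the quality degree of the comment content; for example, the comment quality determining means 11 may directly take the quality degree of the comment content as the quality information of the comment information. Alternatively, it may be determined based only on the credibility of the commenter; for example, the comment quality determining means 11 may directly take the credibility of the commenter as the quality information of the comment information. Alternatively, it may be determined based on both; for example, the comment quality determining means 11 may take the mean or the sum of the quality degree of the comment content and the credibility of the commenter as the quality information of the comment information.
In addition, the quality degree of the comment content and the credibility of the commenter may also be determined in other ways. The following discusses in turn how the quality degree of the comment content and the credibility of the commenter are determined.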
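Quality degree of the comment content
Here, the quality degree of the comment content may be determined based on at least any one of the following parameters:
1) The relevance between the comment content and the item to be evaluated.
This relevance indicates how closely the two are related, and may be expressed as a numerical value.
For comment content in text form, the relevance may be determined by semantic analysis of the comment content. For example, each item to be evaluated has a keyword table that stores one or more keywords corresponding to that item; the more keywords of that table the comment content matches, the higher the relevance between the comment content and the item. The keywords corresponding to JD.com (京东), for instance, include "商品种类" (product range), "商品价格" (product price), "性价比" (value for money) and "购物体验" (shopping experience). The keyword table of each item may be preset, or may be extracted from the page content of the item, for example by adding words that occur frequently on the page to the keyword table.
For comment content in picture or video form, OCR (Optical Character Recognition) may be applied to the picture, or to each frame of the video, to obtain corresponding text-form comment content, which is then used to determine the relevance. Alternatively, the relevance of picture-form or video-form comment content may be determined manually, or the comment quality determining means 11 may submit pictures and videos that it cannot judge for manual review. Moreover, picture-form or video-form comment content usually does not appear alone but together with text-form comment content; when it appears together with text, it is also possible to determine only the relevance between the text comment content and the item.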
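2) Whether the comment content contains advertising.
Whether text-form comment content contains advertising may be determined by checking whether it contains character strings with advertising features. Such strings include strings that conform to predetermined number rules, as well as other advertising keywords. Examples of the former are strings that conform to telephone-number rules, such as ten-digit numeric strings beginning with "400" or "800" and eleven-digit numeric strings beginning with "13" or "15", and strings that conform to the account rules of instant-messaging software, such as strings beginning with "QQ:". The latter may be identified by looking up an advertising keyword table.
Picture-form and video-form comment content may likewise be converted by OCR into corresponding text-form comment content, to which the advertising identification described above for text is then applied. Alternatively, because advertising pictures and videos are usually highly repetitive, whether picture-form or video-form comment content contains advertising may also be determined by querying an advertising picture library and/or an advertising video library. In addition, this may be determined manually, or the comment quality determining means 11 may submit pictures and videos that it cannot judge for manual review.
If the comment content contains advertising, its quality degree may be set to zero; alternatively, the value of this parameter may be set to a low value, such as zero or a negative number, so that when this parameter is combined with other parameters to determine the quality degree of the comment content, the influence of the advertising on the quality degree is eliminated.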
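3) The amount of feedback information on the comment content.
Here, feedback information means the feedback that commenters give on the comment content, for example likes, dislikes and replies. As one parameter for evaluating the quality degree of the comment content, the amount of feedback information may be converted into a parameter value in some way, for example by normalizing it to a value in the range 0-100 or 0-1.
Whether the amount of feedback information is used alone or in combination with other parameters to determine the quality degree of the comment content, the more likes and replies there are, the higher the quality degree of the comment content; the more dislikes there are, the lower the quality degree.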
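4) Whether the comment content contains sensitive information.
Whether text-form comment content contains sensitive information may be determined by looking up a sensitive-word table. Here, sensitive words include, but are not limited to, character strings expressing pornographic, reactionary or violent content.
Picture-form and video-form comment content may be converted by OCR into corresponding text-form comment content, to which the sensitive-information identification described above for text is then applied. Alternatively, because pictures and videos containing sensitive information are usually highly repetitive, whether picture-form or video-form comment content contains sensitive information may also be determined by querying a sensitive picture library and/or a sensitive video library. In addition, this may be determined manually, or the comment quality determining means 11 may submit pictures and videos that it cannot judge for manual review.
If the comment content contains sensitive information, its quality degree may be set to zero; alternatively, the value of this parameter may be set to a low value, such as zero or a negative number, so that when this parameter is combined with other parameters to determine the quality degree of the comment content, the influence of the sensitive information on the quality degree is eliminated.
In summary, the quality degree of the comment content may be determined based on any one of the above parameters, for example by taking the value of any one parameter as the quality degree of the comment content; alternatively, the quality degree of the comment content may be determined based on a combination of at least two of the above parameters, for example by computing it from the values of those parameters.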
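Credibility of the commenter
Here, the credibility of the commenter is an evaluation of the commenter himself or herself, and may be determined based on at least any one of the following parameters:
1) The identity credibility of the commenter.
The identity credibility evaluates how far the commenter can be trusted from the perspective of identity. It may be expressed as a numerical value.
The identity credibility may be determined by whether the commenter has passed various identity verifications, such as ID-card verification, mobile-phone verification and payment-platform transaction verification; the more identity verifications the commenter has passed, the higher the identity credibility.
2) The behavior credibility of the commenter.
The behavior credibility evaluates how far the commenter can be trusted from the perspective of behavior. It may be expressed as a numerical value.
The behavior credibility may be determined from the commenter's historical behavior records. For example, if the commenter's past comment content was highly relevant to the items to be evaluated, contained no advertising, and included the commenter's personal experience, the commenter's behavior credibility is high.
Alternatively, the behavior credibility may be determined from the commenter's level. Levels may be distinguished along various dimensions: by skill, for example expert and ordinary; or by trust, for example whitelist and blacklist. Each level corresponds to a different behavior credibility.
In summary, the credibility of the commenter may be determined based on either of the above parameters, for example by taking the value of either parameter as the credibility of the commenter; alternatively, it may be determined based on a combination of the two, for example by computing it from the values of both parameters.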
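Preferably, the comment quality determining means 11 may determine the quality information of the comment information by weighting the quality degree of the comment content and the credibility of the commenter with their respective weights.
Specifically, the comment quality determining means 11 may determine the quality information of a piece of comment information based on Formula 1 above.
Subsequently, the item quality determining means 12 combines the quality information of each piece of comment information to determine the quality information of the item to be evaluated.
Here, the quality information of the item to be evaluated measures how good or poor the item is. It may be expressed as a numerical value and may therefore sometimes be regarded as a quality score.
The item quality determining means 12 may determine the quality information of the item in a variety of ways. For example, it may take the sum or the mean of the quality information of all the comment information as the quality information of the item. As another example, the highest and lowest values among the quality information of all the comment information may be removed, and the mean of the remaining values may be taken as the quality information of the item.
Preferably, the item quality determining means 12 may determine the quality information of the item by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that comment information.
Here, rating index information means an index-type evaluation that the commenter gives to the item to be evaluated. It expresses how much the commenter likes or dislikes the item and can generally be regarded as the commenter's score for the item; specifically, it may take the form of the number of stars the commenter lights up, a score on a 100-point scale, and so on. In general, through the setting of the commenting rules, each piece of comment information can have one corresponding piece of rating index information.
For simplicity, the present invention uses only the number of stars lit up as an example of rating index information. Those skilled in the art will understand that other existing or future forms of rating index information, where applicable to the present invention, also fall within its scope of protection and are incorporated herein by reference.
For example, lighting up 5 stars corresponds to the highest rating index information, and lighting up 1 star corresponds to the lowest. The correspondence between rating index information and weight is shown in Table 1 above.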
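Here, the ways in which the item quality determining means 12 determines the quality information of the item to be evaluated include, but are not limited to, the following two:
1) The item quality determining means 12 may directly determine the quality information of the item by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that comment information.
Specifically, the item quality determining means 12 may determine the quality information of the item based on Formula 2 above.
Formula 2 computes a weighted sum from the quality information of each piece of comment information and the weight of its corresponding rating index information; this weighted sum is the quality information of the item.
The quality information determined by Formula 2 additionally takes into account the weight of the rating index information, that is, the commenter's direct like or dislike of the item. Consider a comment in which the commenter writes in detail and the content is closely related to the item, but the content is entirely negative about the item and the commenter's rating index is also low, say only 1 star. The quality score of this comment information may be high, and if the quality score of the item were determined from the quality scores of the comment information alone, the resulting quality score of the item would also be high, which clearly contradicts the commenter's intent. For such scenarios it is beneficial to introduce the rating index information as a weight when calculating the quality score of the item. The lower the rating index, the lower its weight, so weighting the quality score of the comment information by this weight strikes a balance between the high quality score that results from the commenter's rich comment content and the commenter's negative evaluation of the item.
When the quality information of the item is calculated with Formula 2, the more comment information there is, the higher the quality information of the item usually is. To some extent this reflects the approval of the item by a large number of commenters. In some application scenarios, however, it may lead to inconsistent measurement standards.
Therefore, building on Formula 2, the item quality determining means 12 may also divide the weighted sum of the quality information of the comment information by the total number of pieces of comment information corresponding to the item, and use the resulting mean to represent the quality information of the item.
Specifically, the item quality determining means 12 may determine the quality information of the item based on Formula 3 above.
Because the quality information determined by Formula 3 is averaged over the number of pieces of comment information, it is sometimes more accurate: it avoids an item receiving high quality information merely because it has many comments.
In addition, to prevent a large number of comments from inflating the quality information of an item, the quality information of the item may also be normalized so that it always lies in the range 0-1 or 0-100, allowing items to be evaluated effectively against one another by their quality information.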
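2) Preferably, the item quality determining means 12 may determine the quality information of the item to be evaluated through the following steps:
2.1) The item quality determining means 12 grades each piece of comment information according to its quality information, to obtain comment information belonging to different quality levels.
For example, the item quality determining means 12 may grade the comment information according to the quality information of each piece, obtaining comment information at three quality levels, high, medium and low, with each quality level containing different comment information. The grading standard may be preset or determined dynamically. For example, a single unified grading standard may be set that applies to the comment information of all items. As another example, the comment information belonging to each quality level may be determined from the quality information of all the comment information of each item according to the number of levels, such as 2 levels or 3 levels.
2.2) Next, the item quality determining means 12 determines the weight of each quality level according to the rating index information corresponding to the comment information in that quality level.
For example, the item quality determining means 12 may determine the weight of any one quality level based on Formula 4 above.
Specifically, based on Formula 4, the item quality determining means 12 may separately determine the weight PLevel of each quality level, namely the weight PLevel_high of the high quality level, the weight PLevel_midd of the medium quality level, and the weight PLevel_low of the low quality level.
2.3) Then, the item quality determining means 12 determines the quality information of the item by weighting the mean of the quality information of the comment information in each quality level with the weight of that quality level.
For example, the item quality determining means 12 may determine the quality information of the item based on Formula 5 above.
Preferably, the comment quality determining means 11 and the item quality determining means 12 may be integrated together.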
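The foregoing describes a scheme for determining the quality information of one item to be evaluated; those skilled in the art will understand that the determining apparatus 10 can determine the quality information of every item based on that scheme.
Because the comment information of an item changes dynamically, the determining apparatus 10 may update the comment information of each item according to a predetermined condition and then update the quality information of each item using the calculation methods above. The update condition may be a predetermined period, for example once a week, or a predetermined event, for example updating the quality information of an item as soon as newly posted comment information is detected for it.
To reduce its computational load, the determining apparatus 10 may build an item database that stores the quality information already calculated for each item together with the parameters used to calculate it, such as the quality information of each piece of comment information and the corresponding weights, depending on the calculation method the determining apparatus 10 uses. Taking Formula 2 as an example, when the update period arrives, the determining apparatus 10 obtains the newly added comment information of each item, calculates the quality information and corresponding weight of each new piece of comment information, and then recalculates the quality information of the item by combining these with the stored quality information and weights of the earlier comment information.
Here, the determining apparatus 10 may determine and store the quality information of each item "offline", or it may determine the quality information of each item "online" in real time. That is, when the quality information of an item is needed, the determining apparatus 10 may use the quality information it calculated "offline" and stored in the item database, or it may calculate the quality information "online" in real time.
Preferably, the determining apparatus 10 may further comprise a sorting means (not shown in Fig. 2), which may sort multiple items according to the quality information of each item and present the sorted items to the user.
When pushing multiple items to a user, the sorting means may order them from high to low, placing items with high quality information toward the front and items with low quality information toward the back, and present the items to the user in that order.
For example, for a site category page, the sorting means may first obtain the items, that is, the sites under that category, then determine their order according to the quality information of each item, generate the corresponding site category page in that order, and present it to the user.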
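It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, for example using an application-specific integrated circuit (ASIC), a general-purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) may be stored in a computer-readable recording medium, for example RAM, a magnetic or optical drive, a floppy disk or a similar device. In addition, some steps or functions of the present invention may be implemented in hardware, for example as circuitry that cooperates with a processor to perform the individual steps or functions.
In addition, part of the present invention may be applied as a computer program product, for example computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted as a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment according to the present invention comprises an apparatus that includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present invention.
It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments above, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in all respects as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the present invention. No reference sign in a claim shall be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or means recited in a system claim may also be implemented by a single unit or means through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.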

Claims (19)

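  1. A method for determining quality information of an item to be evaluated, wherein each item to be evaluated has one or more pieces of comment information, the method comprising:
    - determining the quality information of each piece of the comment information;
    - combining the quality information of each piece of comment information to determine the quality information of the item to be evaluated.
  2. The method according to claim 1, wherein the quality information of each piece of comment information is determined based on at least any one of the following:
    - the quality degree of the comment content;
    - the credibility of the commenter.
  3. The method according to claim 2, wherein the quality degree of the comment content is determined based on at least any one of the following:
    - the relevance between the comment content and the item to be evaluated;
    - whether the comment content contains advertising;
    - the amount of feedback information on the comment content;
    - whether the comment content contains sensitive information.
  4. The method according to claim 2 or 3, wherein the credibility of the commenter is determined based on at least any one of the following:
    - the identity credibility of the commenter;
    - the behavior credibility of the commenter.
  5. The method according to any one of claims 2 to 4, wherein the step of determining the quality information of each piece of comment information specifically comprises:
    - determining the quality information of each piece of comment information by weighting the quality degree of the comment content and the credibility of the commenter with their respective weights.
  6. The method according to any one of claims 1 to 5, wherein the step of combining the quality information of each piece of comment information specifically comprises:
    - determining the quality information of the item to be evaluated by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that piece of comment information.
  7. The method according to claim 6, wherein the step of combining the quality information of each piece of comment information specifically comprises:
    - grading each piece of comment information according to its quality information, to obtain comment information belonging to different quality levels;
    - determining the weight of each quality level according to the rating index information corresponding to the comment information in that quality level;
    - determining the quality information of the item to be evaluated by weighting the mean of the quality information of the comment information in each quality level with the weight of that quality level.
  8. The method according to any one of claims 1 to 7, wherein the method further comprises:
    - sorting multiple items to be evaluated according to the quality information of each item, and presenting the sorted items to the user.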
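  9. An apparatus for determining quality information of an item to be evaluated, wherein each item to be evaluated has one or more pieces of comment information, the apparatus comprising:
    means for determining the quality information of each piece of the comment information;
    means for combining the quality information of each piece of comment information to determine the quality information of the item to be evaluated.
  10. The apparatus according to claim 9, wherein the quality information of each piece of comment information is determined based on at least any one of the following:
    - the quality degree of the comment content;
    - the credibility of the commenter.
  11. The apparatus according to claim 10, wherein the quality degree of the comment content is determined based on at least any one of the following:
    - the relevance between the comment content and the item to be evaluated;
    - whether the comment content contains advertising;
    - the amount of feedback information on the comment content;
    - whether the comment content contains sensitive information.
  12. The apparatus according to claim 10 or 11, wherein the credibility of the commenter is determined based on at least any one of the following:
    - the identity credibility of the commenter;
    - the behavior credibility of the commenter.
  13. The apparatus according to any one of claims 10 to 12, wherein the means for determining the quality information of each piece of comment information is specifically configured to:
    - determine the quality information of each piece of comment information by weighting the quality degree of the comment content and the credibility of the commenter with their respective weights.
  14. The apparatus according to any one of claims 9 to 13, wherein the means for combining the quality information of each piece of comment information is specifically configured to:
    - determine the quality information of the item to be evaluated by weighting the quality information of each piece of comment information with the weight of the rating index information corresponding to that piece of comment information.
  15. The apparatus according to claim 14, wherein the means for combining the quality information of each piece of comment information is specifically configured to:
    - grade each piece of comment information according to its quality information, to obtain comment information belonging to different quality levels;
    - determine the weight of each quality level according to the rating index information corresponding to the comment information in that quality level;
    - determine the quality information of the item to be evaluated by weighting the mean of the quality information of the comment information in each quality level with the weight of that quality level.
  16. The apparatus according to any one of claims 9 to 15, wherein the apparatus further comprises:
    means for sorting multiple items to be evaluated according to the quality information of each item, and presenting the sorted items to the user.
  17. A computer-readable storage medium storing computer instructions, wherein, when the computer instructions are executed, the method according to any one of claims 1 to 8 is performed.
  18. A computer program product, wherein, when the computer program product is executed, the method according to any one of claims 1 to 8 is performed.
  19. A computer device comprising a memory and a processor, the memory storing computer instructions, and the processor being configured to perform the method according to any one of claims 1 to 8 by executing the computer instructions.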
PCT/CN2015/091925 2014-12-05 2015-10-14 一种确定候评项的质量信息的方法与装置 WO2016086724A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020177014844A KR20170080645A (ko) 2014-12-05 2015-10-14 코멘트 항목의 품질 정보를 확정하는 방법 및 장치
JP2017529656A JP2017536632A (ja) 2014-12-05 2015-10-14 評価項目の品質情報を決定する方法及び装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410743141.2A CN104536980A (zh) 2014-12-05 2014-12-05 一种确定候评项的质量信息的方法与装置
CN201410743141.2 2014-12-05

Publications (1)

Publication Number Publication Date
WO2016086724A1 true WO2016086724A1 (zh) 2016-06-09

Family

ID=52852508

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091925 WO2016086724A1 (zh) 2014-12-05 2015-10-14 一种确定候评项的质量信息的方法与装置

Country Status (4)

Country Link
JP (1) JP2017536632A (zh)
KR (1) KR20170080645A (zh)
CN (1) CN104536980A (zh)
WO (1) WO2016086724A1 (zh)


Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104536980A (zh) * 2014-12-05 2015-04-22 百度在线网络技术(北京)有限公司 一种确定候评项的质量信息的方法与装置
CN105893432A (zh) * 2015-12-09 2016-08-24 乐视网信息技术(北京)股份有限公司 视频评论分类方法、视频评论显示系统以及服务器
CN105898336A (zh) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 热评确定方法、热评显示系统及服务器
CN105468790B (zh) * 2015-12-30 2019-10-29 北京奇艺世纪科技有限公司 一种评论信息检索方法和装置
CN105683947A (zh) * 2016-01-11 2016-06-15 程强 餐饮评论分析方法及系统
CN105809476A (zh) * 2016-03-03 2016-07-27 网易无尾熊(杭州)科技有限公司 一种设置对象特征参数的方法和装置
CN107305551A (zh) * 2016-04-18 2017-10-31 百度在线网络技术(北京)有限公司 推送信息的方法和装置
CN106131643A (zh) * 2016-07-13 2016-11-16 乐视控股(北京)有限公司 一种弹幕处理方法、处理装置及其电子设备
CN107390983B (zh) * 2017-04-28 2020-08-04 阿里巴巴集团控股有限公司 业务指令执行方法、客户端和存储介质
CN107133221A (zh) * 2017-06-09 2017-09-05 北京京东尚科信息技术有限公司 信息审核方法、装置、计算机可读介质和电子设备
CN110019942B (zh) * 2017-09-11 2021-07-09 阿里巴巴(中国)有限公司 一种视频鉴别方法及系统
JP6933112B2 (ja) * 2017-11-30 2021-09-08 富士通株式会社 サイバー攻撃情報処理プログラム、サイバー攻撃情報処理方法および情報処理装置
JP6609079B2 (ja) * 2018-04-05 2019-11-20 裕一郎 河野 記事評価システム
CN108768840A (zh) * 2018-06-12 2018-11-06 北京京东金融科技控股有限公司 一种账号管理的方法和装置
CN109583958A (zh) * 2018-12-01 2019-04-05 深圳市润隆实业有限公司 一种用于积分商城的点评系统
CN109977403B (zh) * 2019-03-18 2020-04-14 北京金堤科技有限公司 恶意评论信息识别方法及装置
CN110084373B (zh) * 2019-04-22 2021-08-24 腾讯科技(深圳)有限公司 信息处理方法、装置、计算机可读存储介质和计算机设备
KR102665314B1 (ko) * 2022-02-17 2024-05-13 정성종 악성 고객 관리 시스템

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120158753A1 (en) * 2010-12-15 2012-06-21 He Ray C Comment Ordering System
CN103020482A (zh) * 2013-01-05 2013-04-03 南京邮电大学 一种基于关系的垃圾评论检测方法
CN103389971A (zh) * 2013-07-04 2013-11-13 北京卓易讯畅科技有限公司 一种确定应用对应的评论内容的优质等级的方法与设备
CN104123358A (zh) * 2014-07-17 2014-10-29 广州金山网络科技有限公司 一种展示用户评论的方法及系统
CN104123328A (zh) * 2013-04-28 2014-10-29 北京千橡网景科技发展有限公司 用于在网站中抑制垃圾评论的方法和设备
CN104536980A (zh) * 2014-12-05 2015-04-22 百度在线网络技术(北京)有限公司 一种确定候评项的质量信息的方法与装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100771142B1 (ko) * 2006-03-07 2007-11-19 오피니티 에이피(주) 사용자의 평판 스코어를 제공하는 리뷰 스코어링 방법 및시스템
JP5264813B2 (ja) * 2010-03-16 2013-08-14 ヤフー株式会社 評価装置、評価方法及び評価プログラム
JP2012079035A (ja) * 2010-09-30 2012-04-19 Sony Corp 情報処理装置、投稿情報評価システム、投稿情報評価方法、及びプログラム

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017220190A (ja) * 2016-06-12 2017-12-14 智子 前田 情報信憑性評価システム、情報信憑性評価方法及び情報信憑性評価プログラム
JP2018206211A (ja) * 2017-06-07 2018-12-27 Line株式会社 情報処理方法、情報処理装置、及びプログラム
CN115068957A (zh) * 2022-08-11 2022-09-20 杭银消费金融股份有限公司 一种多维度应用系统管控方法及设备
CN115068957B (zh) * 2022-08-11 2022-11-11 杭银消费金融股份有限公司 一种多维度应用系统管控方法及设备

Also Published As

Publication number Publication date
JP2017536632A (ja) 2017-12-07
KR20170080645A (ko) 2017-07-10
CN104536980A (zh) 2015-04-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15864963

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20177014844

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2017529656

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15864963

Country of ref document: EP

Kind code of ref document: A1