CN111563787A - Recommendation system and method based on user comments and scores - Google Patents
- Publication number
- CN111563787A (application number CN202010197884.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- score
- commodity
- scores
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
Abstract
The invention discloses a recommendation system and method based on user comments and scores. The system consists of a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300). The data preprocessing module (100) extracts user-comment data and user-score data. The score prediction module (200) performs comment-based CF prediction and score-based CF prediction, then combines the two prediction results through a regression model into a hybrid recommendation, giving the predicted scores of the commodity items in the final candidate set. The recommendation generation module (300) ranks the candidate set of commodity items for the user by score, selects the top-N items and recommends them to the user as the recommendation set. By combining the user's scores for purchased commodities with the corresponding comment information, the method reduces the error of the predicted item scores, improves recommendation accuracy and the efficiency of the recommendation system, and yields recommended commodities that meet the user's needs.
Description
Technical Field
The invention relates to the field of collaborative filtering recommendation systems, in particular to a collaborative filtering recommendation system and method based on user comment and score.
Background
With the development of electronic commerce in recent years, recommendation systems have been widely applied to relieve information overload and improve the user experience. A recommendation system aims to recommend new products that may interest the target user, thereby helping the user decide which products to purchase.
Collaborative filtering (CF) is a technique widely used in recommendation systems. The basic principle of user-based collaborative filtering is to use the similarity between users' behavior toward items to recommend to each user the items that similar users may find interesting. For example, after the neighbor set of a user U is found (users who share purchase records with U), the similarity between U and each neighbor is calculated with a similarity function from the scores or comments that the users gave to commodities; the scores of the items in the candidate set are then predicted with a score prediction function.
In much existing research, the use of user comments in collaborative filtering has not received enough attention; this is the technical problem that the invention seeks to solve.
Disclosure of Invention
The invention aims to provide a recommendation system and method based on user comments and scores, which make combined use of users' scores for commodities and the corresponding comments, predict the commodities a user is most likely to be interested in, together with their scores, from the purchases and overall evaluations of similar users, and so provide more accurate recommendations.
The invention relates to a recommendation system based on user comment and score, which comprises a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:
a data pre-processing module (100) for extracting user-comment data and extracting user-rating data;
the score prediction module (200) is used for realizing comment-based collaborative filtering prediction and score-based collaborative filtering prediction, and then obtaining mixed recommendation from the prediction result through a regression model to obtain the score of the commodity prediction in the final candidate set;
and the recommendation generation module (300) ranks the commodity item candidate set recommended by the user according to the scores, selects top-N in the commodity item candidate set and recommends the top-N to the user to obtain a recommendation set.
The invention relates to a recommendation method based on user comment and score, which comprises the following specific steps:
step 1, extracting required data from a data set, wherein one group of the required data is data related to users, commodity items and commodity item scores, and the other group of the required data is data related to users, commodity items and commodity item comments, and the data are respectively expressed as a user-score matrix and a user-comment matrix;
step 2, determining users who score or comment the same commodity item as similar users, and adding the similar users into a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, wherein the calculation formula is as follows
wherein $r_{a,i}$ represents the score of user a for item i, $\bar{r}_a$ represents the average score of user a, $r_{u,i}$ represents the score of user u for item i, $\bar{r}_u$ represents the average score of user u, $\sigma_a$ represents the number of items commented on by user a, and $\sigma_u$ represents the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: representing the commodity item comments as vectors by using Doc2vec, and then obtaining the similarity between the user and each similar user by cosine similarity;
the user comment is represented by a high-dimensional vector, and the calculation formula is as follows:
$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$
wherein U and b represent the softmax parameters, h is constructed from the word vectors extracted from W and the document vector extracted from D, k represents the size of the sliding window, and W represents the word vector matrix;
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity: using a score prediction function, score prediction is carried out on the commodity items separately from the data relating users, commodity items and commodity item scores and from the data relating users, commodity items and commodity item comments, wherein the calculation formula is as follows:
wherein $\bar{r}_a$ represents the average score of user a, $\bar{r}_u$ represents the average score of user u, and $w_{a,u}$ represents the similarity between user a and user u;
step 6, training a regression model in advance, wherein the formula is as follows:
$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$
wherein $x_1$ and $x_2$ are, respectively, the user-score-based prediction result and the user-comment-based prediction result for the commodity item;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, whose expression is as follows:
step 7, inputting the two groups of CF-based score prediction results into the trained regression model to obtain the final item predictions and scores;
step 8, ranking the finally computed items by score and recommending the top-N items.
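The formula for the score-based similarity of step 3 is not reproduced in this text. The symbols that survive (the scores $r_{a,i}$ and $r_{u,i}$ and the two user averages) are consistent with a Pearson-style correlation, so the following is a minimal sketch under that assumption rather than the patent's exact expression. The function name, the dictionary layout and the normalization by the root of the summed squared deviations are all hypothetical; in particular, the text defines $\sigma_a$ and $\sigma_u$ as item counts, which this sketch does not use.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_u):
    """Pearson-style similarity between two users (assumed form).

    ratings_a and ratings_u map an item id to a score (hypothetical layout).
    Each mean is taken over that user's own rated items; the sums run
    over the items both users have rated.
    """
    common = set(ratings_a) & set(ratings_u)
    if not common:
        return 0.0  # no co-rated item, so no evidence of similarity
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0  # a user with constant scores carries no signal
    return num / (den_a * den_u)
```

Under this assumed form, two users whose scores rise and fall together get a similarity close to 1, and users with opposite tastes get a value close to -1.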
According to the method, the scoring of the purchased commodities of the user and the comment information are combined, so that the scoring error of the predicted project is reduced, the accuracy of recommendation is improved, and the efficiency of a recommendation system is improved; the recommended commodity can meet the requirements of the user.
Drawings
FIG. 1 is a schematic diagram of a recommendation system architecture based on user comments and scores according to the present invention;
FIG. 2 is a flow chart of a recommendation method based on user comments and scores in accordance with the present invention;
FIG. 3 is a schematic diagram of the specific processing of the score prediction module.
Detailed Description
The technical solutions of the present invention are further described below with reference to the drawings and examples, but the present invention is not limited thereto.
FIG. 1 is a schematic diagram of the architecture of the recommendation system based on user comments and scores according to the present invention. The system is composed of a data preprocessing module, a score prediction module and a recommendation generation module, wherein:
the data preprocessing module 100 is used for extracting user-comment data and user-score data. The data are divided into train data, develop data and test data in a ratio of 8:1:1. The train data is used for prediction with the CF models, the develop data is used for learning the parameters of the regression model, and the test data is used for testing the accuracy of the recommendation method.
The score prediction module 200 is configured to perform comment-based CF prediction and score-based CF prediction, and then to combine the prediction results through a regression model into a hybrid recommendation, giving the predicted scores of the commodity items in the final candidate set.
The recommendation generation module 300 ranks the candidate set of commodity items for the user by score, selects the top-N items and recommends them to the user as the recommendation set.
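The 8:1:1 grouping performed by the data preprocessing module can be sketched as follows. The text does not say how records are assigned to the three groups, so the seeded random shuffle, the function name `split_records` and its parameters are assumptions made for illustration only.

```python
import random


def split_records(records, ratios=(8, 1, 1), seed=0):
    """Split records into train / develop / test parts by the given ratio.

    The shuffle is an assumption; the source only states the 8:1:1 ratio.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_dev = n * ratios[1] // total
    train = shuffled[:n_train]
    develop = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # the remainder goes to the test part
    return train, develop, test
```

With 100 records this gives 80 train, 10 develop and 10 test records, and every record lands in exactly one group.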
The recommendation system based on user comments and scores measures the similarity between users by comparing the scores given by two users, and then recommends the items preferred by similar users to the target user. It further combines two collaborative filtering (CF) methods, one based on scores and one based on user comments, to train a more accurate model that infers the items a user is likely to prefer and the scores of those items. User scores and comments are extracted from the collected data; the corresponding collaborative filtering modules perform the preference-consistency calculation and then the similarity calculation, yielding the items of interest and their predicted scores. These are output to the recommendation module, which sorts them by preference and outputs the predicted top-N candidate set of commodities. To overcome the shortcomings of the prior art, the recommendation system considers both user score data and commodity comments, which improves the accuracy of the recommendation result.
As shown in fig. 2, a recommendation method based on user comments and scores according to the present invention includes the following specific steps:
step 1, extracting required data from a data set, wherein one group of the required data is data related to users, commodity items and commodity item scores, and the other group of the required data is data related to users, commodity items and commodity item comments, and the data are respectively expressed as a user-score matrix and a user-comment matrix;
step 2, determining users who score or comment the same commodity item as similar users, and adding the similar users into a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, wherein the calculation formula is as follows
wherein $r_{a,i}$ represents the score of user a for item i, $\bar{r}_a$ represents the average score of user a, $r_{u,i}$ represents the score of user u for item i, $\bar{r}_u$ represents the average score of user u, $\sigma_a$ represents the number of items commented on by user a, and $\sigma_u$ represents the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: representing the commodity item comments as vectors by using Doc2vec, and then obtaining the similarity between the user and each similar user by cosine similarity.
User comments are represented by high-dimensional vectors. The calculation formula is as follows:
$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$
wherein U and b represent the softmax parameters, h is constructed from the word vectors extracted from W and the document vector extracted from D, k represents the size of the sliding window, and W represents the word vector matrix.
In Doc2vec, each document is mapped to a unique vector represented by a column in matrix D, and each word is mapped to a unique vector represented by a column in matrix W.
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity: using a score prediction function, score prediction is carried out on the commodity items separately from the data relating users, commodity items and commodity item scores and from the data relating users, commodity items and commodity item comments, wherein the calculation formula is as follows:
wherein $\bar{r}_a$ represents the average score of user a, $\bar{r}_u$ represents the average score of user u, and $w_{a,u}$ represents the similarity between user a and user u;
step 6, training a regression model in advance, wherein the formula is as follows:
$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$
wherein $x_1$ and $x_2$ are, respectively, the user-score-based prediction result and the user-comment-based prediction result for the commodity item;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, whose expression is as follows:
step 7, inputting the two groups of CF-based score prediction results into the trained regression model to obtain the final item predictions and scores.
step 8, ranking the finally computed items by score and recommending the top-N items.
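Steps 5 and 8 can be illustrated with a short sketch. The score prediction formula is not reproduced in this text; the surviving symbols ($\bar{r}_a$, $\bar{r}_u$, $w_{a,u}$) match the common mean-centered weighted average $\bar{r}_a + \sum_u w_{a,u}(r_{u,i} - \bar{r}_u) / \sum_u |w_{a,u}|$, which is assumed here and may differ from the patent's expression. The function names and the tuple layout are hypothetical.

```python
def predict_score(mean_a, neighbours):
    """Mean-centred weighted prediction for one user-item pair (assumed form).

    neighbours is a list of (w_au, r_ui, mean_u) tuples, one per similar
    user u who scored item i (hypothetical layout).
    """
    norm = sum(abs(w) for w, _, _ in neighbours)
    if norm == 0:
        return mean_a  # no usable neighbour: fall back to the user's mean
    offset = sum(w * (r_ui - mean_u) for w, r_ui, mean_u in neighbours)
    return mean_a + offset / norm


def top_n(predicted, n):
    """Return the n item ids with the highest predicted scores."""
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:n]]
```

The same `predict_score` sketch can be called twice per item, once with score-based similarities and once with comment-based similarities, which gives the two inputs that the regression model of step 6 combines.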
Table 1 shows an example of a user-score matrix for a collaborative filtering recommendation algorithm.
TABLE 1
Item1 | Item2 | … | Itemj | Itemn | |
User1 | 5 | 2 | 4 | 2 | |
User2 | 2 | 3 | 2 | 3 | |
… | |||||
Useri | 4 | ? | |||
Userm | 3 | 5 |
FIG. 3 is a schematic diagram of the specific processing of the score prediction module. Two groups of collaborative-filtering-based prediction scores are obtained with the score prediction function, one from the user-score data and the other from the user-comment data; the two groups of predictions are then combined by the trained regression model to obtain the final scores of all predicted items. Word embedding technology is used to represent documents as vectors. A document embedding is a vector representation of a document; it is used to calculate the similarity between different texts and thereby measure text similarity.
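Once each comment has a document vector, the text similarity between two users' comments reduces to a cosine computation. The sketch below assumes the vectors have already been produced by some Doc2vec model and are passed in as plain lists of floats; the function name and the zero-vector fallback are illustrative choices, not part of the source.

```python
from math import sqrt


def cosine_similarity(vec_a, vec_u):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(vec_a) != len(vec_u):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(vec_a, vec_u))
    norm_a = sqrt(sum(x * x for x in vec_a))
    norm_u = sqrt(sum(y * y for y in vec_u))
    if norm_a == 0 or norm_u == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_u)
```

Vectors pointing in the same direction score close to 1 regardless of their length, which is why cosine is a common choice for comparing embeddings of comments that differ in length.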
Model training is performed against the users' actual scores, and the parameters are solved by the least-squares method under the requirement that $\sum e^2$ be minimal.
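The least-squares fit of the three regression parameters can be sketched without any external library by solving the normal equations. This is one standard way to minimize the sum of squared errors and is offered as an assumption-laden illustration: the function name, the argument layout (two lists of CF predictions and a list of actual scores) and the use of Gauss-Jordan elimination are not taken from the source.

```python
def fit_regression(x1, x2, y):
    """Least-squares fit of y = w1*x1 + w2*x2 + b via the normal equations.

    x1: score-based CF predictions, x2: comment-based CF predictions,
    y: the users' actual scores (all hypothetical argument names).
    """
    rows = [(a, c, 1.0) for a, c in zip(x1, x2)]
    # Augmented 3x4 system [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: the inputs are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [v - factor * p for v, p in zip(aug[k], aug[col])]
    w1, w2, b = (aug[i][3] / aug[i][i] for i in range(3))
    return w1, w2, b
```

In the data split described for the preprocessing module, the develop data would supply `x1`, `x2` and `y`; the fitted parameters are then applied to the two CF predictions of each candidate item.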
The invention therefore uses both user scores and user comments to calculate similarity separately. Score-based user similarity is measured from the scores that different users give to the same item. Comment-based user similarity is calculated from a text similarity measure between the comments written by two users.
The invention relies on offline similarity calculation, so the similarities between different users need to be stored.
Claims (2)
1. A recommendation system based on user comments and scores, the system comprising a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:
a data pre-processing module (100) for extracting user-comment data and extracting user-rating data;
the score prediction module (200) is used for realizing comment-based collaborative filtering prediction and score-based collaborative filtering prediction, and then obtaining mixed recommendation from the prediction result through a regression model to obtain the score of the commodity prediction in the final candidate set;
and the recommendation generation module (300) ranks the commodity item candidate set recommended by the user according to the scores, selects top-N in the commodity item candidate set and recommends the top-N to the user to obtain a recommendation set.
2. A recommendation method based on user comments and scores is characterized by comprising the following specific steps:
step 1, extracting required data from a data set, wherein one group of the required data is data related to users, commodity items and commodity item scores, and the other group of the required data is data related to users, commodity items and commodity item comments, and the data are respectively expressed as a user-score matrix and a user-comment matrix;
step 2, determining users who score or comment the same commodity item as similar users, and adding the similar users into a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, wherein the calculation formula is as follows
wherein $r_{a,i}$ represents the score of user a for item i, $\bar{r}_a$ represents the average score of user a, $r_{u,i}$ represents the score of user u for item i, $\bar{r}_u$ represents the average score of user u, $\sigma_a$ represents the number of items commented on by user a, and $\sigma_u$ represents the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: representing the commodity item comments as vectors by using Doc2vec, and then obtaining the similarity between the user and each similar user by cosine similarity;
the user comment is represented by a high-dimensional vector, and the calculation formula is as follows:
$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$
wherein U and b represent the softmax parameters, h is constructed from the word vectors extracted from W and the document vector extracted from D, k represents the size of the sliding window, and W represents the word vector matrix;
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity: using a score prediction function, score prediction is carried out on the commodity items separately from the data relating users, commodity items and commodity item scores and from the data relating users, commodity items and commodity item comments, wherein the calculation formula is as follows:
wherein $\bar{r}_a$ represents the average score of user a, $\bar{r}_u$ represents the average score of user u, and $w_{a,u}$ represents the similarity between user a and user u;
step 6, training a regression model in advance, wherein the formula is as follows:
$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$
wherein $x_1$ and $x_2$ are, respectively, the user-score-based prediction result and the user-comment-based prediction result for the commodity item;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, whose expression is as follows:
step 7, inputting the two groups of CF-based score prediction results into the trained regression model to obtain the final item predictions and scores;
step 8, ranking the finally computed items by score and recommending the top-N items.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010197884.XA CN111563787A (en) | 2020-03-19 | 2020-03-19 | Recommendation system and method based on user comments and scores |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111563787A true CN111563787A (en) | 2020-08-21 |
Family
ID=72069899
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010197884.XA Pending CN111563787A (en) | 2020-03-19 | 2020-03-19 | Recommendation system and method based on user comments and scores |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111563787A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104715399A (en) * | 2015-04-09 | 2015-06-17 | 苏州大学 | Grading prediction method and grading prediction system |
CN105574003A (en) * | 2014-10-10 | 2016-05-11 | 华东师范大学 | Comment text and score analysis-based information recommendation method |
CN106202519A (en) * | 2016-07-22 | 2016-12-07 | 桂林电子科技大学 | A kind of combination user comment content and the item recommendation method of scoring |
US20190080383A1 (en) * | 2017-09-08 | 2019-03-14 | NEC Laboratories Europe GmbH | Method and system for combining user, item and review representations for recommender systems |
CN110321485A (en) * | 2019-06-19 | 2019-10-11 | 淮海工学院 | A kind of proposed algorithm of combination user comment and score information |
CN110648163A (en) * | 2019-08-08 | 2020-01-03 | 中山大学 | Recommendation algorithm based on user comments |
Non-Patent Citations (1)
Title |
---|
尤苡名: "基于用户评论的推荐算法研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112884551A (en) * | 2021-02-19 | 2021-06-01 | 武汉大学 | Commodity recommendation method based on neighbor users and comment information |
CN112884551B (en) * | 2021-02-19 | 2023-08-18 | 武汉大学 | Commodity recommendation method based on neighbor users and comment information |
CN113011942A (en) * | 2021-03-10 | 2021-06-22 | 浙江大学 | Customized product demand collaborative filtering recommendation method based on three-layer neighbor selection framework |
CN113011942B (en) * | 2021-03-10 | 2023-11-03 | 浙江大学 | Customized product demand collaborative filtering recommendation method based on three-layer neighbor selection framework |
CN113538106A (en) * | 2021-07-26 | 2021-10-22 | 王彬 | Commodity refinement recommendation method based on comment integration mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200821 |