CN111563787A - Recommendation system and method based on user comments and scores - Google Patents

Recommendation system and method based on user comments and scores

Info

Publication number
CN111563787A
Authority
CN
China
Prior art keywords
user
score
commodity
scores
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010197884.XA
Other languages
Chinese (zh)
Inventor
周晓波
陈桐
李克秋
邱铁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202010197884.XA
Publication of CN111563787A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention discloses a recommendation system and method based on user comments and scores. The system consists of a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300). The data preprocessing module (100) extracts user-comment data and user-score data. The score prediction module (200) performs comment-based CF prediction and score-based CF prediction, then combines the two prediction results through a regression model into a hybrid recommendation, yielding the predicted scores of the commodity items in the final candidate set. The recommendation generation module (300) ranks the candidate set of commodity items for the user by predicted score and recommends the top-N items to the user as the recommendation set. By combining the user's scores for purchased commodities with the corresponding comment information, the method reduces the error of the predicted item scores and improves the accuracy of recommendation and the efficiency of the recommendation system, so that the recommended commodities better meet the user's needs.

Description

Recommendation system and method based on user comments and scores
Technical Field
The invention relates to the field of collaborative filtering recommendation systems, and in particular to a collaborative filtering recommendation system and method based on user comments and scores.
Background
With the development of electronic commerce in recent years, recommendation systems have been widely applied to address information overload and improve user experience. A recommendation system aims to recommend new products that may interest the target user, thereby helping the user decide which products to purchase.
Collaborative filtering (CF) is a technique widely used in recommendation systems. The basic principle of the user-based collaborative filtering algorithm is to use the similarity between users to recommend to each user the items that similar users are interested in. For example, after the neighbor set of a user U is found (the users who share purchase records with U), a similarity function computes the similarity between U and each neighbor from the scores or comments that the users have given to commodities; the scores of the items in the candidate set are then predicted by a score prediction function.
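The neighbor-set idea described above can be illustrated with a minimal sketch. The dictionary layout (user mapped to an item-to-score dictionary) and the name `neighbor_set` are assumptions made for illustration; they do not come from the patent text.

```python
def neighbor_set(ratings, target):
    """Return the users who share at least one rated item with `target`.

    `ratings` maps user -> {item: score}; this layout is hypothetical.
    """
    target_items = set(ratings[target])
    return {
        user
        for user, items in ratings.items()
        if user != target and target_items & set(items)
    }


ratings = {
    "U": {"item1": 5, "item2": 2},
    "A": {"item2": 3, "item3": 4},  # shares item2 with U
    "B": {"item4": 1},              # shares nothing with U
}
neighbors = neighbor_set(ratings, "U")
```

Only user A ends up in the neighbor set here, because B has no item in common with U.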
In much existing research, the use of user comments in collaborative filtering has not received enough attention; this is the technical problem that the invention seeks to solve.
Disclosure of Invention
The invention aims to provide a recommendation system and method based on user comments and scores that make combined use of users' scores for commodities and the corresponding comments, predict the commodities most likely to interest a user together with their scores from the purchases and overall evaluations of similar users, and thereby provide more accurate recommendations.
The recommendation system based on user comments and scores according to the invention comprises a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:
the data preprocessing module (100) extracts user-comment data and user-score data;
the score prediction module (200) performs comment-based collaborative filtering prediction and score-based collaborative filtering prediction, and then combines the prediction results through a regression model into a hybrid recommendation to obtain the predicted scores of the commodity items in the final candidate set;
and the recommendation generation module (300) ranks the candidate set of commodity items for the user by score and recommends the top-N items in the candidate set to the user as the recommendation set.
The recommendation method based on user comments and scores according to the invention comprises the following specific steps:
step 1, extracting the required data from a data set, wherein one group is the data on users, commodity items and commodity item scores and the other group is the data on users, commodity items and commodity item comments, the two groups being expressed as a user-score matrix and a user-comment matrix respectively;
step 2, determining users who have scored or commented on the same commodity item as similar users, and adding them to a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, with the calculation formula:
[formula image BDA0002418283190000021]
where $r_{a,i}$ denotes the rating of item i by user a, $\bar{r}_a$ denotes the average score of user a, $r_{u,i}$ denotes the rating of item i by user u, $\bar{r}_u$ denotes the average score of user u, $\sigma_a$ denotes the number of items commented on by user a, and $\sigma_u$ denotes the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: Doc2vec is used to represent the commodity item comments as vectors, and the similarity between the user and each similar user is then obtained by cosine similarity;
the user comments are represented by high-dimensional vectors, calculated as follows:
$y = b + U h(w_{t-k}, \ldots, w_{t+k}; W, D)$
where $U$ and $b$ denote the softmax parameters, $h$ is constructed from the word vectors extracted from $W$, $k$ denotes the size of the sliding window, and $W$ denotes the word vectors;
given two documents [symbol image BDA0002418283190000031] and [symbol image BDA0002418283190000032], their similarity is calculated by the formula:
[formula image BDA0002418283190000033]
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity, a score prediction function being used to predict the commodity item scores separately from the data on users, commodity items and commodity item scores and from the data on users, commodity items and commodity item comments, with the calculation formula:
[formula image BDA0002418283190000034]
where $\bar{r}_a$ denotes the average score of user a, $\bar{r}_u$ denotes the average score of user u, and $w_{a,u}$ denotes the similarity between user a and user u;
step 6, training a regression model in advance, with the formula:
$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$
where $x_1$ and $x_2$ are the user-score-based prediction result and the user-comment-based prediction result for the commodity item, respectively;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, expressed as follows:
[formula image BDA0002418283190000037]
where [symbol image BDA0002418283190000038] denotes the true value for the regression model;
step 7, obtaining the final item predictions and scores by passing the two groups of CF-based score prediction results through the trained regression model;
step 8, sorting the finally calculated items by score and recommending the top-N items.
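The first user similarity of step 3 can be sketched as follows. The formula image is not reproduced in this text, so the sketch assumes the standard Pearson correlation over co-rated items, which matches the rating and mean-score symbols defined in step 3. It works on rating data because those symbols are ratings; the function name and the dictionary layout are hypothetical. The two sample users are the User1 and User2 rows of Table 1.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_u):
    """Pearson-style similarity between two users' {item: score} dicts.

    Assumption: means are taken over all of a user's ratings and the
    sums run over the items both users have rated.
    """
    common = set(ratings_a) & set(ratings_u)
    if not common:
        return 0.0
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in common)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    if den_a == 0 or den_u == 0:
        return 0.0  # a user with constant ratings carries no signal
    return num / (den_a * den_u)


user1 = {"item1": 5, "item2": 2, "item3": 4, "item4": 2}
user2 = {"item1": 2, "item2": 3, "item3": 2, "item4": 3}
sim_opposite = pearson_similarity(user1, user2)
sim_self = pearson_similarity(user1, user1)
```

The two sample users rate the items in opposite directions, so their similarity is negative, while a user compared with itself scores 1.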
By combining the user's scores for purchased commodities with the comment information, the method reduces the error of the predicted item scores, improves the accuracy of recommendation, and improves the efficiency of the recommendation system, so that the recommended commodities better meet the user's needs.
Drawings
FIG. 1 is a schematic diagram of a recommendation system architecture based on user comments and scores according to the present invention;
FIG. 2 is a flow chart of a recommendation method based on user comments and scores in accordance with the present invention;
FIG. 3 is a schematic diagram of the specific processing of the score prediction module.
Detailed Description
The technical solutions of the present invention are further described below with reference to the drawings and examples, but the present invention is not limited thereto.
FIG. 1 is a schematic diagram of the architecture of the recommendation system based on user comments and scores according to the present invention. The system is composed of a data preprocessing module, a score prediction module and a recommendation generation module, wherein:
The data preprocessing module 100 extracts user-comment data and user-score data; the data are divided into train data, develop data and test data in the ratio 8:1:1. The train data are used for prediction with the CF models, the develop data are used for parameter learning of the regression model, and the test data are used for testing the accuracy of the recommendation method.
The score prediction module 200 performs comment-based CF prediction and score-based CF prediction, and then combines the prediction results through a regression model into a hybrid recommendation to obtain the predicted scores of the commodity items in the final candidate set.
The recommendation generation module 300 ranks the candidate set of commodity items for the user by score and recommends the top-N items in the candidate set to the user as the recommendation set.
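The 8:1:1 grouping performed by the data preprocessing module can be sketched as below. The text does not say how records are assigned to the three groups, so the seeded random shuffle, the function name `split_8_1_1` and the synthetic records are all assumptions.

```python
import random


def split_8_1_1(records, seed=0):
    """Shuffle records and split them into train/develop/test at 8:1:1."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = len(shuffled) * 8 // 10
    n_dev = len(shuffled) // 10
    train = shuffled[:n_train]
    develop = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, develop, test


# Hypothetical (user, item, score) records.
records = [(f"user{i % 7}", f"item{i % 13}", 1 + i % 5) for i in range(100)]
train, develop, test = split_8_1_1(records)
```

With 100 records this gives 80 train, 10 develop and 10 test records, and every record lands in exactly one group.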
The recommendation system based on user comments and scores measures the similarity between users by comparing the scores given by two users, and then recommends items preferred by similar users to the target user. It combines two collaborative filtering (CF) methods, one based on scores and one based on user comments, to train a more accurate model that infers the items a user is likely to prefer and the scores of those items. User scores and comments are extracted from the collected data; the corresponding collaborative filtering modules perform preference-consistency calculation and then similarity calculation for the users, yielding the items of interest and their predicted scores; these are output to the recommendation module, which sorts them by preference and outputs the predicted top-N commodity candidate set. To overcome the shortcomings of the prior art, the recommendation system considers both user score data and commodity comments, which improves the accuracy of the recommendation results.
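The final preference sorting and top-N selection can be sketched in a few lines. The function name `top_n` and the sample predicted scores are hypothetical.

```python
def top_n(predicted_scores, n):
    """Return the n items with the highest predicted score, best first."""
    ranked = sorted(predicted_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:n]]


predicted_scores = {"item1": 3.2, "item2": 4.7, "item3": 4.1, "item4": 2.5}
recommendation_set = top_n(predicted_scores, 2)
```

Here the recommendation set contains item2 followed by item3, the two highest-scoring candidates.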
As shown in FIG. 2, the recommendation method based on user comments and scores according to the present invention includes the following specific steps:
step 1, extracting the required data from a data set, wherein one group is the data on users, commodity items and commodity item scores and the other group is the data on users, commodity items and commodity item comments, the two groups being expressed as a user-score matrix and a user-comment matrix respectively;
step 2, determining users who have scored or commented on the same commodity item as similar users, and adding them to a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, with the calculation formula:
[formula image BDA0002418283190000051]
where $r_{a,i}$ denotes the rating of item i by user a, $\bar{r}_a$ denotes the average score of user a, $r_{u,i}$ denotes the rating of item i by user u, $\bar{r}_u$ denotes the average score of user u, $\sigma_a$ denotes the number of items commented on by user a, and $\sigma_u$ denotes the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: Doc2vec is used to represent the commodity item comments as vectors, and the similarity between the user and each similar user is then obtained by cosine similarity.
User comments are represented by high-dimensional vectors, calculated as follows:
$y = b + U h(w_{t-k}, \ldots, w_{t+k}; W, D)$
where $U$ and $b$ denote the softmax parameters, $h$ is constructed from the word vectors extracted from $W$, $k$ denotes the size of the sliding window, and $W$ denotes the word vectors;
in Doc2vec, each document is mapped to a unique vector represented by a column in matrix $D$, and each word is mapped to a unique vector represented by a column in matrix $W$.
Given two documents [symbol image BDA0002418283190000061] and [symbol image BDA0002418283190000062], their similarity is calculated by the formula:
[formula image BDA0002418283190000063]
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity, a score prediction function being used to predict the commodity item scores separately from the data on users, commodity items and commodity item scores and from the data on users, commodity items and commodity item comments, with the calculation formula:
[formula image BDA0002418283190000064]
where $\bar{r}_a$ denotes the average score of user a, $\bar{r}_u$ denotes the average score of user u, and $w_{a,u}$ denotes the similarity between user a and user u;
step 6, training a regression model in advance, with the formula:
$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$
where $x_1$ and $x_2$ are the user-score-based prediction result and the user-comment-based prediction result for the commodity item, respectively;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, expressed as follows:
[formula image BDA0002418283190000067]
where [symbol image BDA0002418283190000068] denotes the true value for the regression model;
step 7, obtaining the final item predictions and scores by passing the two groups of CF-based score prediction results through the trained regression model;
step 8, sorting the finally calculated items by score and recommending the top-N items.
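The score prediction of step 5 can be sketched as follows. Since the formula image is not reproduced here, the sketch assumes the common mean-centered weighted average, in which the target user's mean score is corrected by the similarity-weighted deviations of the neighbors' scores from their own means. The function name, the data layout and the sample similarities are hypothetical.

```python
def predict_score(target, item, ratings, similarity):
    """Predict `target`'s score for `item` from neighbors' mean-centered scores.

    `ratings` maps user -> {item: score}; `similarity` maps
    (target, neighbor) -> weight w_{a,u}. Both layouts are assumptions.
    """
    mean_a = sum(ratings[target].values()) / len(ratings[target])
    num = 0.0
    den = 0.0
    for user, user_ratings in ratings.items():
        if user == target or item not in user_ratings:
            continue
        w = similarity.get((target, user), 0.0)
        mean_u = sum(user_ratings.values()) / len(user_ratings)
        num += w * (user_ratings[item] - mean_u)
        den += abs(w)
    if den == 0:
        return mean_a  # no usable neighbor: fall back to the user's mean
    return mean_a + num / den


ratings = {
    "a": {"item1": 4, "item2": 2},
    "u1": {"item1": 5, "item2": 3, "item3": 4},
    "u2": {"item1": 1, "item2": 2, "item3": 3},
}
similarity = {("a", "u1"): 1.0, ("a", "u2"): 0.5}
predicted = predict_score("a", "item3", ratings, similarity)
```

The same function can be called once with score-based similarities and once with comment-based similarities, giving the two groups of predictions that step 6 combines.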
Table 1 shows an example of a user-score matrix for the collaborative filtering recommendation algorithm.
TABLE 1
Item1 Item2 Itemj Itemn
User1 5 2 4 2
User2 2 3 2 3
Useri 4
Userm 3 5
FIG. 3 is a schematic diagram of the specific processing of the score prediction module. Two groups of prediction scores based on collaborative filtering are obtained with the score prediction function: one group from the user-score data and the other from the user-comment data. The two groups of predictions are then passed through the trained regression model to obtain the final scores of all predicted items. Word embedding technology is used to represent the documents as vectors. A document embedding is a vector representation of a document; it is used to calculate the similarity between different texts and thereby measure text similarity.
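The text similarity measurement between two document embeddings can be sketched with cosine similarity. The vectors below are hypothetical stand-ins; in the described method they would come from a trained Doc2vec model, which this sketch does not include.

```python
from math import sqrt


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = sqrt(sum(x * x for x in vec_a))
    norm_b = sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction
    return dot / (norm_a * norm_b)


# Hypothetical document embeddings of two users' comments.
doc_a = [1.0, 2.0, 2.0]
doc_b = [2.0, 4.0, 4.0]   # same direction as doc_a
doc_c = [2.0, -1.0, 0.0]  # orthogonal to doc_a
same_direction = cosine_similarity(doc_a, doc_b)
orthogonal = cosine_similarity(doc_a, doc_c)
```

Vectors pointing the same way score 1 regardless of length, and orthogonal vectors score 0, which is why the measure suits comparing comments of different lengths.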
Model training is performed against the users' actual scores, and the parameters are solved by the least squares method under the requirement that $\sum e^2$ is minimal.
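A least squares fit of the three regression parameters can be sketched by solving the normal equations directly. This is a minimal illustration using only the standard library; the function name and the synthetic training data (generated from known parameters 0.6, 0.3 and 0.4) are assumptions, and the patent does not specify a solver.

```python
def fit_least_squares(x1, x2, y):
    """Fit y = w1*x1 + w2*x2 + b by solving the 3x3 normal equations."""
    rows = [(p, q, 1.0) for p, q in zip(x1, x2)]
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        for row in range(3):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    w1, w2, b = (aug[i][3] / aug[i][i] for i in range(3))
    return w1, w2, b


# Hypothetical score-based (x1) and comment-based (x2) predictions.
x1 = [3.0, 4.0, 2.0, 5.0, 1.0]
x2 = [2.5, 4.5, 3.0, 4.0, 2.0]
# Synthetic "actual" scores built from known parameters for checking.
y = [0.6 * p + 0.3 * q + 0.4 for p, q in zip(x1, x2)]
w1, w2, b = fit_least_squares(x1, x2, y)
```

Because the synthetic scores are an exact linear function of the two predictions, the fit recovers the generating parameters up to rounding; real score data would leave a nonzero residual.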
The invention therefore uses both user scores and user comments to calculate similarity separately. The score-based user similarity is measured from the scores that different users give to the same items. The comment-based user similarity is calculated through a text similarity measure between the comments written by the two users.
The invention relies on offline similarity calculation, so the similarities between different users need to be stored.
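Storing similarities computed offline can be sketched as a table keyed by user pair. The storage layout, the function names and the Jaccard stand-in similarity are all hypothetical; any of the similarity measures described in this document could be passed in instead.

```python
from itertools import combinations


def build_similarity_table(users, similarity_fn):
    """Precompute the similarity of every unordered user pair."""
    table = {}
    for a, u in combinations(sorted(users), 2):
        table[(a, u)] = similarity_fn(users[a], users[u])
    return table


def lookup(table, a, u):
    """Read a stored similarity regardless of argument order."""
    return table[(a, u)] if (a, u) in table else table[(u, a)]


def jaccard(items_a, items_u):
    """Stand-in similarity: overlap of the two users' item sets."""
    return len(items_a & items_u) / len(items_a | items_u)


users = {
    "u1": {"item1", "item2"},
    "u2": {"item2", "item3"},
    "u3": {"item4"},
}
table = build_similarity_table(users, jaccard)
```

Each pair is stored once, so a table for m users holds m(m-1)/2 entries, and lookups at recommendation time avoid recomputing similarities.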

Claims (2)

1. A recommendation system based on user comments and scores, the system comprising a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:
the data preprocessing module (100) extracts user-comment data and user-score data;
the score prediction module (200) performs comment-based collaborative filtering prediction and score-based collaborative filtering prediction, and then combines the prediction results through a regression model into a hybrid recommendation to obtain the predicted scores of the commodity items in the final candidate set;
and the recommendation generation module (300) ranks the candidate set of commodity items for the user by score and recommends the top-N items in the candidate set to the user as the recommendation set.
2. A recommendation method based on user comments and scores, characterized by comprising the following specific steps:
step 1, extracting the required data from a data set, wherein one group is the data on users, commodity items and commodity item scores and the other group is the data on users, commodity items and commodity item comments, the two groups being expressed as a user-score matrix and a user-comment matrix respectively;
step 2, determining users who have scored or commented on the same commodity item as similar users, and adding them to a similar user set;
step 3, calculating a first user similarity according to the user-comment matrix, with the calculation formula:
[formula image FDA0002418283180000011]
where $r_{a,i}$ denotes the rating of item i by user a, $\bar{r}_a$ denotes the average score of user a, $r_{u,i}$ denotes the rating of item i by user u, $\bar{r}_u$ denotes the average score of user u, $\sigma_a$ denotes the number of items commented on by user a, and $\sigma_u$ denotes the number of items commented on by user u;
step 4, calculating a second user similarity according to the user-comment matrix: Doc2vec is used to represent the commodity item comments as vectors, and the similarity between the user and each similar user is then obtained by cosine similarity;
the user comments are represented by high-dimensional vectors, calculated as follows:
$y = b + U h(w_{t-k}, \ldots, w_{t+k}; W, D)$
where $U$ and $b$ denote the softmax parameters, $h$ is constructed from the word vectors extracted from $W$, $k$ denotes the size of the sliding window, and $W$ denotes the word vectors;
given two documents [symbol image FDA0002418283180000021] and [symbol image FDA0002418283180000022], their similarity is calculated by the formula:
[formula image FDA0002418283180000023]
step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity, a score prediction function being used to predict the commodity item scores separately from the data on users, commodity items and commodity item scores and from the data on users, commodity items and commodity item comments, with the calculation formula:
[formula image FDA0002418283180000024]
where $\bar{r}_a$ denotes the average score of user a, $\bar{r}_u$ denotes the average score of user u, and $w_{a,u}$ denotes the similarity between user a and user u;
step 6, training a regression model in advance, with the formula:
$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$
where $x_1$ and $x_2$ are the user-score-based prediction result and the user-comment-based prediction result for the commodity item, respectively;
the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, expressed as follows:
[formula image FDA0002418283180000027]
where [symbol image FDA0002418283180000028] denotes the true value for the regression model;
step 7, obtaining the final item predictions and scores by passing the two groups of CF-based score prediction results through the trained regression model;
step 8, sorting the finally calculated items by score and recommending the top-N items.
CN202010197884.XA 2020-03-19 2020-03-19 Recommendation system and method based on user comments and scores Pending CN111563787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010197884.XA CN111563787A (en) 2020-03-19 2020-03-19 Recommendation system and method based on user comments and scores

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010197884.XA CN111563787A (en) 2020-03-19 2020-03-19 Recommendation system and method based on user comments and scores

Publications (1)

Publication Number Publication Date
CN111563787A true CN111563787A (en) 2020-08-21

Family

ID=72069899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010197884.XA Pending CN111563787A (en) 2020-03-19 2020-03-19 Recommendation system and method based on user comments and scores

Country Status (1)

Country Link
CN (1) CN111563787A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104715399A (en) * 2015-04-09 2015-06-17 苏州大学 Grading prediction method and grading prediction system
CN105574003A (en) * 2014-10-10 2016-05-11 华东师范大学 Comment text and score analysis-based information recommendation method
CN106202519A (en) * 2016-07-22 2016-12-07 桂林电子科技大学 A kind of combination user comment content and the item recommendation method of scoring
US20190080383A1 (en) * 2017-09-08 2019-03-14 NEC Laboratories Europe GmbH Method and system for combining user, item and review representations for recommender systems
CN110321485A (en) * 2019-06-19 2019-10-11 淮海工学院 A kind of proposed algorithm of combination user comment and score information
CN110648163A (en) * 2019-08-08 2020-01-03 中山大学 Recommendation algorithm based on user comments


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
尤苡名: "基于用户评论的推荐算法研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884551A (en) * 2021-02-19 2021-06-01 武汉大学 Commodity recommendation method based on neighbor users and comment information
CN112884551B (en) * 2021-02-19 2023-08-18 武汉大学 Commodity recommendation method based on neighbor users and comment information
CN113011942A (en) * 2021-03-10 2021-06-22 浙江大学 Customized product demand collaborative filtering recommendation method based on three-layer neighbor selection framework
CN113011942B (en) * 2021-03-10 2023-11-03 浙江大学 Customized product demand collaborative filtering recommendation method based on three-layer neighbor selection framework
CN113538106A (en) * 2021-07-26 2021-10-22 王彬 Commodity refinement recommendation method based on comment integration mining


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200821
