CN111563787A

CN111563787A - Recommendation system and method based on user comments and scores

Info

Publication number: CN111563787A
Application number: CN202010197884.XA
Authority: CN
Inventors: 周晓波; 陈桐; 李克秋; 邱铁
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2020-03-19
Filing date: 2020-03-19
Publication date: 2020-08-21

Abstract

The invention discloses a recommendation system and a recommendation method based on user comments and scores. The system consists of a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300). The data preprocessing module (100) extracts user-comment data and user-score data. The score prediction module (200) performs comment-based collaborative filtering (CF) prediction and score-based CF prediction, then combines the two prediction results through a regression model into a hybrid recommendation, yielding the predicted scores of the commodity items in the final candidate set. The recommendation generation module (300) ranks the candidate set of commodity items for the user by predicted score and recommends the top-N items to the user as the recommendation set. By combining the user's scores for purchased commodities with the corresponding comment information, the method reduces the error of the predicted item scores, improves the accuracy of recommendation and the efficiency of the recommendation system, and yields recommended commodities that better meet the user's needs.

Description

Recommendation system and method based on user comments and scores

Technical Field

The invention relates to the field of collaborative filtering recommendation systems, in particular to a collaborative filtering recommendation system and method based on user comment and score.

Background

With the development of electronic commerce in recent years, in order to solve information overload and improve user experience, a recommendation system is widely applied. The recommendation system is intended to recommend new products that may be of interest to the target user, thereby helping the user decide what products should be purchased.

Collaborative filtering (CF) is a technique widely used in recommendation systems. The basic principle of the user-based collaborative filtering algorithm is to use the similarity of users' behavior towards items to recommend to each user the items that similar users found interesting. For example, the neighbor set of a user U (the users who share a purchase record with U) is found first; the similarity between U and each neighbor is then calculated with a similarity function from the users' scores or comments on commodities; finally, the scores of the items in the candidate set are predicted by a score prediction function.

In much existing research, the use of user comments in collaborative filtering has not received enough attention; this is the technical problem that the invention addresses.

Disclosure of Invention

The invention aims to provide a recommendation system and method based on user comments and scores, which make combined use of users' scores and corresponding comments on commodities, predict the commodities most likely to interest a user, together with their scores, according to the purchases and overall evaluations of similar users, and thereby provide more accurate recommendations.

The invention relates to a recommendation system based on user comment and score, which comprises a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:

the data preprocessing module (100) is used for extracting user-comment data and user-score data;

the score prediction module (200) is used for performing comment-based collaborative filtering prediction and score-based collaborative filtering prediction, and then combining the two prediction results through a regression model into a hybrid recommendation, to obtain the predicted scores of the commodity items in the final candidate set;

and the recommendation generation module (300) ranks the candidate set of commodity items recommended to the user according to the scores, selects the top-N items in the candidate set and recommends them to the user to obtain a recommendation set.

The invention relates to a recommendation method based on user comment and score, which comprises the following specific steps:

step 1, extracting required data from a data set, wherein one group of the required data is data related to users, commodity items and commodity item scores, and the other group of the required data is data related to users, commodity items and commodity item comments, and the data are respectively expressed as a user-score matrix and a user-comment matrix;

step 2, determining users who score or comment the same commodity item as similar users, and adding the similar users into a similar user set;

step 3, calculating a first user similarity according to the user-score matrix, wherein the calculation formula is as follows:

$$w_{a,u} = \frac{\sum_{i}\left(r_{a,i}-\bar{r}_a\right)\left(r_{u,i}-\bar{r}_u\right)}{\sigma_a\,\sigma_u}$$

wherein $r_{a,i}$ denotes the rating of item $i$ by user $a$, $\bar{r}_a$ denotes the average score of user $a$, $r_{u,i}$ denotes the rating of item $i$ by user $u$, $\bar{r}_u$ denotes the average score of user $u$, $\sigma_a$ denotes the number of items commented on by user $a$, and $\sigma_u$ denotes the number of items commented on by user $u$;

step 4, calculating a second user similarity according to the user-comment matrix: the commodity item comments are represented as vectors using Doc2vec, and the similarity between the user and each similar user is then obtained by cosine similarity;

the user comment is represented by a high-dimensional vector, and the calculation formula is as follows:

$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$

wherein $U$ and $b$ are the softmax parameters, $h$ is constructed from the word vectors extracted from $W$ and the document vector extracted from $D$, $k$ is the size of the sliding window, $W$ is the word-vector matrix, and $D$ is the document-vector matrix;

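given two documents represented by the vectors $d_1$ and $d_2$, their similarity is calculated as:

$$\mathrm{sim}(d_1, d_2) = \frac{d_1 \cdot d_2}{\lVert d_1 \rVert\,\lVert d_2 \rVert}$$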

step 5, calculating the scores of the commodity items in the candidate set according to the user-score matrix and the user similarity: a score prediction function is applied separately to the score-based data (users, commodity items and item scores) and to the comment-based data (users, commodity items and item comments), wherein the calculation formula is as follows:

$$p_{a,i} = \bar{r}_a + \frac{\sum_{u}\left(r_{u,i}-\bar{r}_u\right) w_{a,u}}{\sum_{u}\left|w_{a,u}\right|}$$

wherein $p_{a,i}$ denotes the predicted score of user $a$ for item $i$, $\bar{r}_a$ denotes the average score of user $a$, $\bar{r}_u$ denotes the average score of user $u$, and $w_{a,u}$ denotes the similarity between user $a$ and user $u$;

step 6, training a regression model in advance, wherein the formula is as follows:

$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$

wherein $x_1$ and $x_2$ are, respectively, the user-score-based prediction result of the commodity item and the user-comment-based prediction result of the commodity item;

the three parameters $w_1$, $w_2$ and $b$ of the regression model are trained by minimizing a loss function, expressed as follows:

$$\min_{w_1, w_2, b} \sum_{i=1}^{n} \left(y^{(i)} - \hat{y}^{(i)}\right)^2$$

wherein $\hat{y}^{(i)}$ represents the true value of the regression model for sample $i$, and $n$ is the number of training samples;

step 7, obtaining the final item predictions and scores by passing the two groups of CF-based score prediction results through the trained regression model;

step 8, ranking the finally calculated items according to their scores, and recommending the top-N items.
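Steps 3 and 5 can be illustrated with a minimal Python sketch. It assumes the common mean-centered form of user-based collaborative filtering: a Pearson-style correlation over co-rated items for the similarity, and a similarity-weighted average of the neighbors' deviations from their own means for the prediction. The normalization used here (root sum of squared deviations, and the sum of absolute similarities) is an assumption and may differ from the patent's own formula; the function names and the toy ratings are hypothetical.

```python
from math import sqrt

# Toy user-score matrix: user -> {item: rating}. Hypothetical data.
ratings = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "u": {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "v": {"i1": 1, "i2": 5, "i4": 2},
}


def mean_rating(user):
    """Average score of a user over all items the user rated."""
    values = ratings[user].values()
    return sum(values) / len(values)


def rating_similarity(a, u):
    """Pearson-style similarity over the items rated by both users (assumed form)."""
    common = sorted(set(ratings[a]) & set(ratings[u]))
    if not common:
        return 0.0
    mean_a, mean_u = mean_rating(a), mean_rating(u)
    dev_a = [ratings[a][i] - mean_a for i in common]
    dev_u = [ratings[u][i] - mean_u for i in common]
    numerator = sum(x * y for x, y in zip(dev_a, dev_u))
    denominator = sqrt(sum(x * x for x in dev_a)) * sqrt(sum(y * y for y in dev_u))
    return numerator / denominator if denominator else 0.0


def predict(a, item):
    """Mean-centered, similarity-weighted prediction of user a's score for item."""
    numerator = 0.0
    denominator = 0.0
    for u in ratings:
        if u == a or item not in ratings[u]:
            continue
        w = rating_similarity(a, u)
        numerator += w * (ratings[u][item] - mean_rating(u))
        denominator += abs(w)
    base = mean_rating(a)
    return base + numerator / denominator if denominator else base


print(rating_similarity("a", "u"), rating_similarity("a", "v"))
print(predict("a", "i4"))
```

In this toy data, user `u` rates items much like user `a` while user `v` rates them in the opposite direction, so the first similarity is positive and the second negative. The same `predict` shape could be reused with a comment-based similarity in place of `rating_similarity`.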

By combining the user's scores for purchased commodities with the corresponding comment information, the method reduces the error of the predicted item scores, improves the accuracy of recommendation and the efficiency of the recommendation system, and yields recommended commodities that better meet the user's needs.

Drawings

FIG. 1 is a schematic diagram of a recommendation system architecture based on user comments and scores according to the present invention;

FIG. 2 is a flow chart of a recommendation method based on user comments and scores in accordance with the present invention;

FIG. 3 is a schematic diagram of the specific processing of the score prediction module.

Detailed Description

The technical solutions of the present invention are further described below with reference to the drawings and examples, but the present invention is not limited thereto.

FIG. 1 is a schematic diagram of the architecture of the recommendation system based on user comments and scores according to the present invention. The system is composed of a data preprocessing module, a score prediction module and a recommendation generation module, wherein:

the data preprocessing module 100 extracts user-comment data and user-rating data; the data are divided into train data, develop data and test data in the ratio 8:1:1. The train data are used for prediction by the CF models, the develop data for learning the parameters of the regression model, and the test data for testing the accuracy of the recommendation method;

the score prediction module 200 performs the comment-based CF prediction and the score-based CF prediction, and then combines the two prediction results through a regression model into a hybrid recommendation, to obtain the predicted scores of the commodity items in the final candidate set;

and the recommendation generation module 300 ranks the candidate set of commodity items recommended to the user according to the scores, selects the top-N items in the candidate set and recommends them to the user to obtain a recommendation set.
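The 8:1:1 grouping performed by the data preprocessing module can be sketched as follows. The source states only the ratio; shuffling with a fixed seed, the record layout and the function name `split_8_1_1` are assumptions made for illustration.

```python
import random


def split_8_1_1(records, seed=0):
    """Shuffle records and split them into train/develop/test at 8:1:1."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_train = n * 8 // 10
    n_develop = n // 10
    train = shuffled[:n_train]
    develop = shuffled[n_train:n_train + n_develop]
    test = shuffled[n_train + n_develop:]  # remainder goes to test
    return train, develop, test


# Hypothetical (user, item, score, comment) records.
records = [
    (f"user{k % 7}", f"item{k % 13}", 1 + k % 5, "comment text")
    for k in range(100)
]
train, develop, test = split_8_1_1(records)
print(len(train), len(develop), len(test))  # 80 10 10
```

Under this sketch the train part would feed the two CF predictors, the develop part would supply the samples for fitting the regression parameters, and the test part would be held out for measuring accuracy.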

The recommendation system based on user comments and scores measures the similarity between users by comparing the scores given by two users, and then recommends to the target user the items preferred by similar users. It further combines two collaborative filtering (CF) methods, one based on scores and one based on user comments, to train a more accurate model that predicts the items a user is likely to prefer and the scores of those items.

User scores and comments are extracted from the collected data. Each collaborative filtering module calculates the preference consistency of the users and then their similarity, producing the items of interest and their predicted scores. These are output to the recommendation module, which sorts them by preference and outputs the predicted top-N commodity candidate set. To overcome the shortcomings of the prior art, the recommendation system considers both user rating data and commodity comments, which improves the accuracy of the recommendation result.
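The final ranking and top-N selection can be sketched with the standard library. Excluding items the user has already purchased from the candidate set is an assumption of this sketch, and the function name and the scores are hypothetical.

```python
import heapq


def recommend_top_n(predicted_scores, purchased, n):
    """Return the n highest-scoring candidate items the user has not purchased."""
    candidates = {
        item: score
        for item, score in predicted_scores.items()
        if item not in purchased
    }
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(n, candidates.items(), key=lambda pair: pair[1])


# Hypothetical predicted scores for one user.
predicted_scores = {"i1": 4.6, "i2": 3.1, "i3": 4.9, "i4": 2.2, "i5": 4.0}
top = recommend_top_n(predicted_scores, purchased={"i3"}, n=2)
print(top)  # [('i1', 4.6), ('i5', 4.0)]
```

`heapq.nlargest` avoids sorting the whole candidate set when N is small relative to the number of candidates.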

As shown in FIG. 2, the recommendation method based on user comments and scores according to the present invention comprises steps 1 to 8 as set out in the Disclosure of Invention, of which the following are elaborated.

In the similarity formula of step 3, $r_{a,i}$ denotes the rating of item $i$ by user $a$.

Step 4, calculating a second user similarity according to the user-comment matrix: the commodity item comments are represented as vectors using Doc2vec, and the similarity between the user and each similar user is then obtained by cosine similarity.

User comments are represented by high-dimensional vectors. The calculation formula is as follows:

$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$

wherein $U$ and $b$ are the softmax parameters, $h$ is constructed from the word vectors extracted from $W$ and the document vector extracted from $D$, $k$ is the size of the sliding window, and $W$ is the word-vector matrix;

in Doc2vec, each document is mapped to a unique vector represented by a column in matrix $D$, and each word is mapped to a unique vector represented by a column in matrix $W$.

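Given two documents represented by the vectors $d_1$ and $d_2$, their similarity is calculated as:

$$\mathrm{sim}(d_1, d_2) = \frac{d_1 \cdot d_2}{\lVert d_1 \rVert\,\lVert d_2 \rVert}$$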

In the score prediction function of step 5, $\bar{r}_a$ denotes the average score of user $a$, $\bar{r}_u$ denotes the average score of user $u$, and $w_{a,u}$ denotes the similarity between user $a$ and user $u$.

The regression model of step 6 is:

$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$

and its loss function compares the model output $y^{(i)}$ with the true value of the regression model.

Step 7, obtaining the final item predictions and scores by passing the two groups of CF-based score prediction results through the trained regression model.
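The comment-based similarity of step 4 reduces to a cosine between two comment vectors once the vectors exist. The sketch below assumes the Doc2vec vectors have already been computed elsewhere and uses short hand-written vectors as stand-ins; the names `comment_vectors` and `comment_similarity` are hypothetical.

```python
from math import sqrt


def cosine_similarity(d1, d2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(d1, d2))
    norm1 = sqrt(sum(x * x for x in d1))
    norm2 = sqrt(sum(y * y for y in d2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm1 * norm2)


# Hypothetical per-user comment vectors, standing in for Doc2vec output.
comment_vectors = {
    "a": [0.20, 0.70, 0.10],
    "u": [0.25, 0.60, 0.05],
    "v": [-0.50, 0.10, 0.90],
}


def comment_similarity(a, u):
    """Similarity of two users based on their comment vectors."""
    return cosine_similarity(comment_vectors[a], comment_vectors[u])


print(comment_similarity("a", "u"), comment_similarity("a", "v"))
```

Real Doc2vec vectors would typically have tens or hundreds of dimensions; the cosine computation is unchanged.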

Table 1 shows an example of a user-score matrix for a collaborative filtering recommendation algorithm. Empty cells are items the user has not scored, and "?" marks the score to be predicted.

TABLE 1

          Item₁   Item₂   …       Item_j  Item_n
User₁     5       2       4       2
User₂     2       3               2       3
…
User_i            4               ?
User_m    3               5
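A sparse user-score matrix like Table 1 can be held as a nested dictionary, which also makes step 2 (collecting the users who scored the same item) a simple set operation. The values below are hypothetical and are not those of Table 1; treating the candidate set as the items scored by similar users but not by the target user is an assumption of this sketch.

```python
# Sparse user-score matrix: only observed scores are stored. Hypothetical values.
user_scores = {
    "user1": {"item1": 5, "item2": 2},
    "user2": {"item1": 2, "item2": 3, "item5": 3},
    "user3": {"item2": 4},
    "user4": {"item6": 1},
}


def similar_users(target):
    """Users who scored at least one item that the target user also scored."""
    target_items = set(user_scores[target])
    return {
        user
        for user, scores in user_scores.items()
        if user != target and target_items & set(scores)
    }


def candidate_items(target):
    """Items scored by similar users but not yet scored by the target user."""
    seen = set(user_scores[target])
    pooled = {item for user in similar_users(target) for item in user_scores[user]}
    return pooled - seen


print(sorted(similar_users("user3")))    # ['user1', 'user2']
print(sorted(candidate_items("user3")))  # ['item1', 'item5']
```

A user with no item in common with anyone else, such as `user4` here, gets an empty similar-user set and therefore an empty candidate set.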

FIG. 3 is a schematic diagram of the specific processing of the score prediction module. Two groups of collaborative-filtering prediction scores are obtained with the score prediction function: one from the user-score data and the other from the user-comment data. The two groups of predictions are then combined by the trained regression model to obtain the final score of each predicted item. Word embedding technology is used to represent documents as vectors: a document embedding is a vector representation of a document that allows the similarity between different texts to be calculated, providing a measure of text similarity.

The model is trained against the users' actual scores: the parameters are solved by the least-squares method, that is, under the requirement that the sum of squared errors $\sum e^2$ be minimal.
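A minimal sketch of the least-squares fit for the three parameters is given below. It solves the normal equations with a small Gaussian elimination so that it needs no third-party library; that solver choice, the function names and the synthetic data are assumptions of the sketch, not the patent's stated procedure.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(x1, x2, y):
    """Fit y ~ w1*x1 + w2*x2 + b by minimizing the sum of squared errors."""
    rows = [[p, q, 1.0] for p, q in zip(x1, x2)]
    # Normal equations: (X^T X) theta = X^T y, with theta = (w1, w2, b).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    w1, w2, b = solve_linear_system(xtx, xty)
    return w1, w2, b


# Hypothetical develop-set inputs: the two CF predictions per sample.
x1 = [4.2, 3.1, 2.5, 4.8, 1.9]  # score-based CF predictions
x2 = [3.9, 3.4, 2.0, 4.5, 2.6]  # comment-based CF predictions
# Synthetic "actual" scores generated from known parameters, to check the fit.
y = [0.6 * p + 0.4 * q + 0.1 for p, q in zip(x1, x2)]

w1, w2, b = fit_least_squares(x1, x2, y)
print(round(w1, 6), round(w2, 6), round(b, 6))
```

Because the synthetic targets are an exact linear function of the inputs, the fit should recover the generating parameters up to rounding; with real scores the residuals would be non-zero. The solver assumes the two prediction columns are not collinear, otherwise the normal equations are singular.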

The invention therefore calculates user similarity separately from user scores and from user comments. Score-based user similarity is measured from the scores that different users give to the same item. Comment-based user similarity is calculated by a text similarity measure between the comments written by two users.

The similarity calculation of the invention is performed offline, so the similarities between different users need to be stored.
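One way to store offline similarities is a table keyed by the unordered user pair, computed once and then looked up at recommendation time. The sketch assumes a symmetric similarity function and user identifiers that do not contain the `|` separator; the JSON serialization, the function names and the toy similarity are all hypothetical choices for illustration.

```python
import json
from itertools import combinations


def pair_key(a, u):
    """Order-independent key for a pair of users."""
    return "|".join(sorted((a, u)))


def precompute_similarities(users, similarity):
    """Compute every pairwise similarity once and store it by pair key."""
    return {pair_key(a, u): similarity(a, u) for a, u in combinations(users, 2)}


def lookup(table, a, u):
    """Read a stored similarity regardless of argument order."""
    return table[pair_key(a, u)]


def toy_similarity(a, u):
    """Hypothetical stand-in for the score- or comment-based similarity."""
    return 1.0 / (1 + abs(len(a) - len(u)))


table = precompute_similarities(["ann", "bob", "carol"], toy_similarity)
serialized = json.dumps(table)      # could be written to a file or a cache
restored = json.loads(serialized)
print(lookup(restored, "carol", "ann"))
```

Storing one entry per unordered pair halves the table size compared with a full matrix, at the cost of normalizing the key on every lookup.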

Claims

1. A recommendation system based on user comments and scores, the system comprising a data preprocessing module (100), a score prediction module (200) and a recommendation generation module (300), wherein:

2. A recommendation method based on user comments and scores is characterized by comprising the following specific steps:

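wherein $r_{a,i}$ denotes the rating of item $i$ by user $a$,

$$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D)$$

given two documents represented by the vectors $d_1$ and $d_2$, their similarity is calculated as:

$$\mathrm{sim}(d_1, d_2) = \frac{d_1 \cdot d_2}{\lVert d_1 \rVert\,\lVert d_2 \rVert}$$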

wherein $\bar{r}_a$ denotes the average score of user $a$,

$$y^{(i)} = w_1 x_1^{(i)} + w_2 x_2^{(i)} + b$$

wherein $x_1$ and $x_2$ are, respectively, the user-score-based prediction result of the commodity item and the user-comment-based prediction result of the commodity item;

wherein the loss function compares the model output $y^{(i)}$ with the true value of the regression model;