CN109635200B

CN109635200B - Collaborative filtering recommendation method based on intermediary truth degree measurement and user

Info

Publication number: CN109635200B
Application number: CN201811548134.1A
Authority: CN
Inventors: 周宁宁; 陆荣
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2018-12-18
Filing date: 2018-12-18
Publication date: 2022-02-01
Anticipated expiration: 2038-12-18
Also published as: CN109635200A

Abstract

The invention discloses a user-based collaborative filtering recommendation method based on intermediary truth degree measurement. After the interest similarity of users is calculated in the traditional way, the method uses intermediary truth degree measurement to measure the similarity of the feedback behaviors of the neighbor objects with higher interest similarity, thereby improving the traditional way of calculating user interest similarity and ultimately improving the accuracy and recall rate of the recommendation results. By applying intermediary truth degree measurement to the user interest similarity calculation of the user-based collaborative filtering recommendation method, the method solves the problem that the accuracy and recall rate of recommendation results are low because conventional methods lack scientific consideration of user feedback when calculating user interest similarity.

Description

Collaborative filtering recommendation method based on intermediary truth degree measurement and user

Technical Field

The invention relates to the technical field of user-based collaborative filtering recommendation methods (UserCF), in particular to a user-based collaborative filtering recommendation method based on the Measure of Medium Truth Degree (MMTD), referred to here as intermediary truth degree measurement.

Background

The user-based collaborative filtering recommendation method, put simply, uses some method to find groups of users with shared interests and common experiences, and recommends the preferences of such a group to users of the same type. By analyzing the responses (such as ratings and favorites) that users make to information, the method filters that information and in turn helps other users filter information.

The traditional user-based collaborative filtering recommendation method mainly comprises two steps: interest similarity calculation and result recommendation. Interest similarity calculation is the most important of these, and commonly used methods include the Jaccard formula and the cosine formula. The core idea of both is to take the intersection of the item sets on which the users have given positive feedback and divide its size by a normalizing value; the result is the interest similarity between the users.
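The two set-based measures named above can be sketched as follows. This is a minimal illustration using the textbook forms of the Jaccard and cosine formulas on item sets; the function names and the sample sets are hypothetical and are not taken from the patent.

```python
from math import sqrt


def jaccard(n_i: set, n_j: set) -> float:
    """|N(i) ∩ N(j)| / |N(i) ∪ N(j)|; 0.0 when both sets are empty."""
    union = n_i | n_j
    return len(n_i & n_j) / len(union) if union else 0.0


def cosine(n_i: set, n_j: set) -> float:
    """|N(i) ∩ N(j)| / sqrt(|N(i)| * |N(j)|); 0.0 when either set is empty."""
    if not n_i or not n_j:
        return 0.0
    return len(n_i & n_j) / sqrt(len(n_i) * len(n_j))


# Hypothetical positive-feedback item sets for two users.
n_i = {"a", "b", "c"}
n_j = {"b", "c", "d", "e"}
jaccard_ij = jaccard(n_i, n_j)  # 2 shared items out of 5 distinct items
cosine_ij = cosine(n_i, n_j)    # 2 shared items over sqrt(3 * 4)
```

Both measures look only at which items received positive feedback, not at how the users scored them, which is the limitation the following paragraph describes.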

In summary, the traditional user-based collaborative filtering method lacks scientific consideration of user feedback when calculating user interest similarity, so the recommendation results do not fully match the user's interests and habits, which leads to low accuracy and recall. The way user interest similarity is calculated therefore needs to be improved so that it better matches the user's interests and habits and raises the accuracy and recall rate of the recommendation results. The present invention addresses these problems.

Disclosure of Invention

The purpose of the invention is as follows: to overcome the defects in the prior art, the invention provides a user-based collaborative filtering recommendation method based on intermediary truth degree measurement. It solves the problem that, because the traditional user-based collaborative filtering method lacks scientific consideration of user feedback when calculating user interest similarity, the recommendation results do not fully match the user's interests and habits and their accuracy and recall rate are low.

The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:

a collaborative filtering recommendation method based on intermediary truth degree measurement and users comprises the following steps:

step 1, calculating the interest similarity of users:

searching the whole training set to record the number of times each item has been purchased, sorting the items from large to small by that count, and selecting the first F/n items according to the number of recommended items n to form a popular item set G′, where 0 < F/n < 0.05F and F represents the total number of items in the training set; obtaining a common item set S from the item sets on which the users have given positive feedback,

S＝N(i)∩N(j)，

wherein, N (i) represents the item set which has positive feedback by the user i, and N (j) represents the item set which has positive feedback by the user j;

introducing the popular item set G′ to obtain the user feedback sets with the popular items removed:

N(i)′＝N(i)-N(i)∩G′

N(j)′＝N(j)-N(j)∩G′

building an item-to-user inverted table:

C[i][j]＝|N(i)′∩N(j)′|

if users i and j both appear in the user lists of M₁ items in the inverted table, then C[i][j]＝M₁;

Calculating interest similarity of users i and j through an improved cosine similarity value formula:

step 2, calculating the score similarity based on the intermediary truth degree measurement:

the score that users i and j give an item object may be any positive integer between n₁ and n₂; the predicate P(x(i,j)) indicates that the users i and j under consideration give the same item object the same score, ╕P(x(i,j)) indicates that the scores of i and j differ, and ~P(x(i,j)) indicates that the scores of i and j lie between the same and different; the similarity degree of the scores of users i and j for item g is obtained by calculating the distance proportional function h_Tg(x(i,j));

obtaining a relative score f (x (i, j)) according to the score of the user on the item:

f(x(i，j))＝|Q_ig-Q_jg|；

wherein Q_ig is the score of user i for item g, and Q_jg is the score of user j for item g;

on the number axis, y＝f(x(i,j)) is symmetric about ~P, with P and ╕P on the left and right sides respectively, and the value range of f(x(i,j)) is [0, n₂-n₁];

the value of y＝f(x(i,j)) falls within one of three ranges, (α_r+ε_r, α_l-ε_l), (0, α_r+ε_r) and (α_l-ε_l, n₂-n₁): the region of ~P(x(i,j)) is (α_r+ε_r, α_l-ε_l), the region of P(x(i,j)) is (0, α_r+ε_r), and the region of ╕P(x(i,j)) is (α_l-ε_l, n₂-n₁); the truth value of P(x(i,j)) is 1, and the truth value of ╕P(x(i,j)) is 0;
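To make the three regions concrete, the relative score and its region can be sketched as below. This is an illustrative sketch only: the parameters `lower` and `upper` stand for the two boundaries α_r+ε_r and α_l-ε_l, the numeric thresholds in the example are hypothetical, and assigning boundary values to the intermediate region is an assumption, since the text gives open intervals.

```python
def relative_score(q_ig: float, q_jg: float) -> float:
    """f(x(i, j)) = |Q_ig - Q_jg| for one item g."""
    return abs(q_ig - q_jg)


def predicate_region(f: float, lower: float, upper: float) -> str:
    """Classify a relative score into one of the three regions.

    lower plays the role of alpha_r + eps_r, upper of alpha_l - eps_l.
    Values exactly on a boundary are put in the intermediate region
    (an assumption; the source uses open intervals).
    """
    if f < lower:
        return "same"          # region of P, truth value 1
    if f > upper:
        return "different"     # region of the opposite predicate, truth value 0
    return "intermediate"      # region between same and different


# Hypothetical scale n1 = 1 .. n2 = 5, so f ranges over [0, 4].
LOWER, UPPER = 0.8, 2.8        # hypothetical thresholds
regions = [predicate_region(relative_score(a, b), LOWER, UPPER)
           for a, b in [(3, 3), (5, 3), (1, 5)]]
```

With these hypothetical thresholds, identical scores fall in the "same" region, a gap of 2 is intermediate, and the maximal gap of 4 is "different".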

the distance proportional function relative to P(x(i,j)):

wherein,

the similarity degree of the scores of users i and j for item g is obtained by calculating the distance proportional function h_Tg(x(i,j));

traversing the common item set S, summing the similarity degrees of the scores of user i and user j over all common item objects, and dividing the sum by the size n of the common item set S to obtain the comprehensive score similarity degree h_Tn(x(i,j)):
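The averaging step described above can be sketched as follows. Because the per-item distance proportional function is not reproduced in this text, the sketch takes it as a caller-supplied function; `linear_placeholder` is a hypothetical stand-in used only to make the example runnable and is not the patent's h_Tg. Treating every item that both users have scored as a member of S is also an assumption.

```python
from typing import Callable, Dict


def comprehensive_similarity(
    scores_i: Dict[str, float],
    scores_j: Dict[str, float],
    item_similarity: Callable[[float, float], float],
) -> float:
    """Mean of the per-item similarity over the common item set S."""
    common = set(scores_i) & set(scores_j)      # S
    if not common:
        return 0.0                              # no shared items (assumed convention)
    total = sum(item_similarity(scores_i[g], scores_j[g]) for g in common)
    return total / len(common)                  # divide by n = |S|


def linear_placeholder(q_ig: float, q_jg: float, n1: int = 1, n2: int = 5) -> float:
    """Hypothetical stand-in for h_Tg: 1 at equal scores, 0 at the maximal gap."""
    return 1.0 - abs(q_ig - q_jg) / (n2 - n1)


scores_i = {"a": 5, "b": 3, "c": 1}
scores_j = {"a": 4, "b": 3, "d": 2}
h_tn = comprehensive_similarity(scores_i, scores_j, linear_placeholder)
```

In the example the common items are "a" and "b", whose placeholder similarities 0.75 and 1.0 average to 0.875.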

step 3, sorting the recommendation results from small to large according to the comprehensive score similarity degree and selecting the neighbor objects.

Traversing the training set to extract data to form a user-item set and a user-item-score set, calculating the interest similarity between the user and other users, and selecting the top 2K users by similarity as the neighbor user set M of the user, wherein K is the number of recommended users.

Traversing the neighbor user set M, extracting the corresponding user-item-scores of the neighbor users from the user-item-score set, calculating the relative scores, calculating the item score similarity degrees of the candidate users, and finally calculating the comprehensive item score similarity degrees of the users.

Selecting the neighbor user objects according to the number of recommended users K, ranked from small to large by comprehensive item score similarity degree.

The area represented by P(x(i,j)) accounts for 20%, the area represented by ╕P(x(i,j)) accounts for 50%, and the area represented by ~P(x(i,j)) accounts for 30%.

Compared with the prior art, the invention has the following beneficial effects:

By applying intermediary truth degree measurement to the user interest similarity calculation of the user-based collaborative filtering recommendation method, the invention solves the problem of low accuracy and recall of recommendation results caused by the lack of scientific consideration of user feedback when conventional methods calculate user interest similarity, and thereby improves the accuracy and recall rate of the recommendation results.

Drawings

FIG. 1 is a flow chart of a collaborative filtering recommendation method based on intermediary truth degree metrics and users.

FIG. 2 shows the correspondence between the different predicates and the similarity value intervals.

FIG. 3 is a flow chart for similar object selection optimization using intermediary truth degree metrics.

Detailed Description

The present invention is further illustrated by the following description in conjunction with the accompanying drawings and the specific embodiments, it is to be understood that these examples are given solely for the purpose of illustration and are not intended as a definition of the limits of the invention, since various equivalent modifications will occur to those skilled in the art upon reading the present invention and fall within the limits of the appended claims.

The user-based collaborative filtering recommendation method based on intermediary truth degree measurement is a strategic method: after the interest similarity of users is calculated in the traditional way, intermediary truth degree measurement is used to measure the similarity of the feedback behaviors of the neighbor objects with higher interest similarity. This improves the traditional way of calculating user interest similarity and ultimately improves the accuracy and recall rate of the recommendation results.

When calculating user interest similarity, the traditional method, in order to improve the accuracy of recommendation results as much as possible, focuses on the items on which users have given positive feedback, that is, items on which users share a common behavior, and lacks scientific consideration of the feedback itself. This introduces errors into the interest similarity calculation and ultimately leads to low accuracy and recall of the recommendation results. For example, user A gives movie Z a score of 1 after watching it and user B gives it a score of 5; the traditional method treats movie Z as a common interest of users A and B and includes it in the interest similarity calculation, although user A does not actually like movie Z. To solve this problem, the method evaluates the feedback similarity of the neighbors selected by the traditional method through intermediary truth degree measurement and re-ranks the interest similarity among users according to that feedback similarity, improving the accuracy and recall rate of the recommendation results.
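The movie Z example can be worked through in a few lines. This is a small sketch under the assumption of a 1 to 5 scoring scale (the text only gives the two scores 1 and 5); the variable names are hypothetical.

```python
# Scores from the example: user A gives movie Z a 1, user B gives it a 5.
scores = {"A": {"Z": 1}, "B": {"Z": 5}}
N1, N2 = 1, 5  # assumed scoring scale

# A set-based measure only sees that both users acted on movie Z.
common_items = set(scores["A"]) & set(scores["B"])

# The relative score |Q_Ag - Q_Bg| shows how far apart the feedback really is.
score_gap = {g: abs(scores["A"][g] - scores["B"][g]) for g in common_items}
max_gap = N2 - N1
```

Here `common_items` contains Z, so a set-based similarity counts it as shared interest, while `score_gap["Z"]` equals `max_gap`, the largest disagreement the scale allows.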

The method comprises the following steps:

as shown in fig. 1, the present invention provides a UserCF method based on intermediary truth degree measurement, which comprises the following steps:

1. user interest similarity calculation

Defining the total number of articles in the training set as F, searching the whole training set to record the times of purchasing each article, sorting the articles from large to small, and selecting the front F/n (0 < F/n < 0.05F) articles according to the recommended number of articles n to form a popular article set G'.
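A minimal sketch of the popular item selection described above is given below. It assumes the training set is a list of (user, item) purchase records, that F/n is taken by integer division, and that ties in purchase count are broken by item name; none of these details are specified in the text, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def popular_items(purchases: Iterable[Tuple[str, str]], n: int) -> Set[str]:
    """Return the top F // n items by purchase count (the set G')."""
    counts = Counter(item for _user, item in purchases)
    total_items = len(counts)                      # F
    size = total_items // n                        # F/n, integer division assumed
    # Sort from large to small by count; item name breaks ties (assumption).
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return set(ranked[:size])


# Hypothetical training set: 50 items, item g0 bought 50 times, g1 49 times, ...
purchases = [(f"u{t}", f"g{k}") for k in range(50) for t in range(50 - k)]
hot = popular_items(purchases, n=25)               # F/n = 2, and 0 < 2 < 0.05 * 50
```

With F = 50 and n = 25 the constraint 0 < F/n < 0.05F holds, and the two most purchased items form G′.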

Let N (i) denote the item set that user i has positive feedback, and N (j) denote the item set that user j has positive feedback.

Then the shared item set S:

S＝N(i)∩N(j) (1)

N(i)′＝N(i)-N(i)∩G′ (2)

N(j)′＝N(j)-N(j)∩G′ (3)

To reduce the time complexity of the method, we build an item-to-user look-up table:

C[i][j]＝|N(i)′∩N(j)′| (4)

If users i and j both appear in the user lists of M₁ items in the inverted table, then C[i][j]＝M₁.
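Formulas (2) to (4) and the inverted table can be sketched together as follows. This is an illustrative sketch, assuming each user's positive feedback is stored as a Python set; the function and variable names are hypothetical. Going through the item-to-user table means only pairs of users who actually share an item are visited, which is the time saving the text refers to.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set


def co_occurrence(user_items: Dict[str, Set[str]], hot: Set[str]):
    """Remove popular items, build the inverted table, and count C[i][j]."""
    # Formulas (2)/(3): N(u)' = N(u) - N(u) ∩ G'
    filtered = {u: items - hot for u, items in user_items.items()}

    # Inverted table: item -> set of users with positive feedback on it.
    item_users = defaultdict(set)
    for u, items in filtered.items():
        for g in items:
            item_users[g].add(u)

    # Formula (4): C[i][j] = |N(i)' ∩ N(j)'|, accumulated item by item.
    c = defaultdict(lambda: defaultdict(int))
    for users in item_users.values():
        for i, j in combinations(sorted(users), 2):
            c[i][j] += 1
            c[j][i] += 1
    return filtered, c


user_items = {"i": {"a", "b", "c"}, "j": {"a", "b", "d"}, "k": {"a", "e"}}
filtered, c = co_occurrence(user_items, hot={"a"})
```

In the example, item "a" is popular and is removed first, so users i and j share only item "b" and user k shares nothing with either.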

2. scoring similarity calculation based on intermediary truth degree measurement

The score that user objects i and j give an item object may be any positive integer between n₁ and n₂. Through the distance proportional function h_Tg(x(i,j)), the degree of similarity of the scores of i and j for item g can be calculated.

The predicate P(x(i,j)) indicates that the users i and j under consideration give the same item object the same score, ╕P(x(i,j)) indicates that the scores of i and j differ, and ~P(x(i,j)) indicates that the scores of i and j lie between the same and different. The correspondence between the different predicates and the similarity value intervals is shown in fig. 2, wherein the area represented by P(x(i,j)) accounts for 10%, the area represented by ╕P(x(i,j)) accounts for 60%, and the area represented by ~P(x(i,j)) accounts for 30%. By calculating the distance proportional function h_Tg(x(i,j)), the degree to which the scores of i and j for an item object are similar is obtained.

Suppose Q_ig is the score of user i for item g and Q_jg is the score of user j for item g, and define:

f(x(i，j))＝|Q_ig-Q_jg| (6)

As can be seen on the number axis, y＝f(x(i,j)) is symmetric about ~P, with P and ╕P on the left and right sides respectively, and the value range of f(x(i,j)) is [0, n₂-n₁].

The truth value of P(x(i,j)) is 1, and the truth value of ╕P(x(i,j)) is 0.

Distance proportional function with respect to P (x (i, j)):

wherein

Through the second distance proportional function h_Tg(x(i,j)), the similarity degree of the scores of users i and j for item g can be obtained.

3. Sort the recommendation results in descending order of comprehensive score similarity degree and select the neighbor objects.

As shown in fig. 3, the specific implementation steps are as follows:

1. Traverse the training set to extract data and form a user-item set and a user-item-score set. Traverse the user-item set to obtain the popular item set, then loop over all user objects and, according to formula (2) and formula (3), remove the popular items from each user's positive-feedback item set.

2. According to formula (4), traverse the user-item set to build the item-user inverted table C[i][j].

3. According to the item-user inverted table C[i][j], exclude users who have no behavior on any item in common with the current user. Traverse the remaining user-item set and calculate the interest similarity between users who share items according to formula (5), until every pair of users has been processed.

4. For each user, sort the other users from large to small by the interest similarity values obtained in step 3, select the top 2K users, and build the candidate neighbor object set.

5. According to formula (1), formulas (6) to (8) and the user-item-score set, traverse the candidate neighbor object set and calculate the comprehensive score similarity degree of the users.

6. According to the comprehensive score similarity degree between users obtained in step 5, preferentially select the users in P(x(i,j)) as the source of the recommendation results, then select the neighbor user objects in order from small to large by comprehensive item score similarity degree.
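Steps 4 to 6 amount to a two-stage neighbor selection, which can be sketched as below. This is a simplified illustration, not the full procedure: both similarity tables are assumed to be precomputed nested dictionaries, the preferential treatment of users in the P region is omitted, and the final sort direction is exposed as a parameter because it depends on how the score similarity is defined. All names are hypothetical.

```python
from typing import Dict, List


def select_neighbors(
    user: str,
    interest_sim: Dict[str, Dict[str, float]],
    score_sim: Dict[str, Dict[str, float]],
    k: int,
    ascending: bool = True,
) -> List[str]:
    """Pick K neighbors: top 2K by interest similarity, then re-rank by score similarity."""
    # Step 4: candidate set = top 2K users by interest similarity (large to small).
    by_interest = sorted(interest_sim[user],
                         key=lambda v: (-interest_sim[user][v], v))
    candidates = by_interest[: 2 * k]

    # Steps 5-6: re-rank candidates by comprehensive score similarity degree.
    sign = 1 if ascending else -1
    ranked = sorted(candidates, key=lambda v: (sign * score_sim[user][v], v))
    return ranked[:k]


interest_sim = {"u": {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.1}}
score_sim = {"u": {"a": 0.5, "b": 0.2, "c": 0.9, "d": 0.1, "e": 0.0}}
neighbors = select_neighbors("u", interest_sim, score_sim, k=2)
```

In the example, user e has the smallest score value but never reaches the second stage, because it is not among the top 2K users by interest similarity.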

The method improves the recall rate of the user-based collaborative filtering recommendation method while also improving its accuracy, effectively solving the related problems, and belongs to the research field of recommendation methods. Because the traditional user-based collaborative filtering recommendation method focuses on the similarity between the items on which users give positive feedback and lacks a scientific way of evaluating the feedback itself, the recall rate and accuracy of its recommendation results are low. To address this, after the user interest similarity is calculated by the traditional method, the method of the invention uses intermediary truth degree measurement to evaluate user feedback, calculates the comprehensive score similarity degree of the users, and selects better neighbor objects accordingly, ultimately improving the accuracy and recall rate of the recommendation results.

The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims

1. A collaborative filtering recommendation method based on intermediary truth degree measurement and a user is characterized by comprising the following steps:

step 1, calculating the interest similarity of users:

searching the whole training set to record the times of purchasing each article, sorting from large to small, selecting the first F/n articles according to the recommended article number n to form a popular article set G', 0< F/n <0.05F, wherein F represents the total number of articles in the training set, obtaining a common article set S according to the article set which has positive feedback by a user,

S＝N(i)∩N(j)，

N(i)′＝N(i)-N(i)∩G′

N(j)′＝N(j)-N(j)∩G′

building a look-up table of items to users:

C[i][j]＝|N(i)′∩N(j)′|

the score that users i and j give an item object is any positive integer between n₁ and n₂; the predicate P(x(i,j)) indicates that the users i and j under consideration give the same item object the same score, ╕P(x(i,j)) indicates that the scores of i and j differ, and ~P(x(i,j)) indicates that the scores of i and j lie between the same and different; the similarity degree of the scores of users i and j for item g is obtained by calculating the distance proportional function h_Tg(x(i,j));

f(x(i,j))＝|Q_ig-Q_jg|；

wherein Q_ig is the score of user i for item g, and Q_jg is the score of user j for item g;

the value range of f(x(i,j)) is [0, n₂-n₁];

the value of y＝f(x(i,j)) falls within one of three ranges, (α_r+ε_r, α_l-ε_l), (0, α_r+ε_r) and (α_l-ε_l, n₂-n₁): the region of ~P(x(i,j)) is (α_r+ε_r, α_l-ε_l), the region of P(x(i,j)) is (0, α_r+ε_r), and the region of ╕P(x(i,j)) is (α_l-ε_l, n₂-n₁); the truth value of P(x(i,j)) is 1, and the truth value of ╕P(x(i,j)) is 0;

distance proportional function with respect to P (x (i, j)):

wherein,

traversing the common item set S, summing the similarity degrees of the scores of user i and user j over all common item objects, and dividing the sum by the size n of the common item set S to obtain the comprehensive score similarity degree h_Tn(x(i,j)):

2. The collaborative filtering recommendation method based on intermediary truth degree measure and user according to claim 1, wherein: the training set is traversed to extract data to form a user-item set and a user-item-score set, the interest similarity between the user and other users is calculated, and the top 2K users by similarity are selected as the neighbor user set M of the user, wherein K is the number of recommended users.

3. The collaborative filtering recommendation method based on intermediary truth degree measure and user according to claim 2, wherein: traversing the neighbor user set M, extracting corresponding user-item-scores in the neighbor user set from the user-item-score set, calculating relative scores, calculating item score similarity degrees of candidate users, and finally calculating comprehensive item score similarity degrees of the users.

4. The collaborative filtering recommendation method based on intermediary truth degree measure and user according to claim 3, wherein: and selecting the neighbor user objects according to the recommended user number K and the ranking from small to large according to the comprehensive article scoring similarity degree.

5. The collaborative filtering recommendation method based on intermediary truth degree measure and user according to claim 1, wherein: the area represented by P(x(i,j)) accounts for 20%, the area represented by ╕P(x(i,j)) accounts for 50%, and the area represented by ~P(x(i,j)) accounts for 30%.