CN111008334B

CN111008334B - Top-K recommendation method and system based on local pairwise ordering and global decision fusion

Info

Publication number: CN111008334B
Application number: CN201911232316.2A
Authority: CN
Inventors: 王邦; 杨雪娇
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2019-12-04
Filing date: 2019-12-04
Publication date: 2023-04-18
Anticipated expiration: 2039-12-04
Also published as: CN111008334A

Abstract

The invention discloses a Top-K recommendation method and system based on local pairwise ordering and global decision fusion, belonging to the field of personalized recommendation. The method comprises: constructing a user homogeneous graph and an item homogeneous graph from the user-item bipartite graph; selecting user anchors from the user homogeneous graph and item anchors from the item homogeneous graph; obtaining the user subgroup corresponding to each user anchor according to the degree of correlation between users and that anchor, and the item subgroup corresponding to each item anchor according to the degree of correlation between items and that anchor; pairing the user subgroups with the item subgroups to obtain A user-item subgroups; for each user-item subgroup, training a pairwise-gap-aware Bayesian personalized ranking algorithm to obtain each user's predicted score for each item and, from it, the user's local recommendation list in that subgroup; and fusing the user's local recommendation lists from the different user-item subgroups into a global recommendation list for the user, thereby obtaining a more accurate recommendation list.

Description

Top-K recommendation method and system based on local pairwise ordering and global decision fusion

Technical Field

The invention belongs to the field of personalized recommendation, and particularly relates to a Top-K recommendation method and system based on local pairwise ordering and global decision fusion.

Background

With the advent of the Web 2.0 era and the large increase in network bandwidth, social network platforms of all kinds have appeared, and fragmented information has begun to flood people's lives. Personalized recommendation systems are increasingly valuable as a way to address this information overload. For example, in e-commerce, a recommendation system builds an interest model of a user from the user's historical behavior, estimates how much the user would like items they have not yet purchased, and then recommends items the user may like.

A recommendation method that aims to generate a list of K items a user may like is called a Top-K recommendation method. Common Top-K methods fall into two types: pointwise models and pairwise models. A pointwise model is optimized so that a user's predicted score for an item approaches the user's true score for that item; the items the user has not rated are then sorted by predicted score to generate a recommendation list. A pairwise model learns the order relationship between a pair of items. Bayesian personalized ranking (BPR) is one of the most common pairwise models: each time, it picks a positive sample i from the items purchased by user u and a negative sample j from the items the user has not purchased, and predicts the pairwise preference that user u scores item i higher than item j.

However, these algorithms all model the entire data set under an implicit assumption: all users (or all items) share one uniform feature space. Under this global-feature-space assumption, the learned features tend to be too coarse to suit every user (or item). Moreover, the recommended items are often dominated by the user's most prominent interests, and it is difficult to capture the different aspects of user interests or item attributes, so the generated recommendation list is not accurate enough.

Disclosure of Invention

To address the technical problem that the accuracy of the Top-K recommendation lists produced by prior-art recommendation systems is limited, the invention provides a Top-K recommendation method and system based on local pairwise ordering and global decision fusion, with the aim of obtaining a more accurate recommendation list.

To achieve the above object, according to a first aspect of the present invention, there is provided a Top-K recommendation method based on local pairwise ordering and global decision fusion, the method comprising the steps of:

S1, constructing a user homogeneous graph and an item homogeneous graph from the user-item bipartite graph;

S2, selecting A user anchors from the user homogeneous graph and A item anchors from the item homogeneous graph;

S3, obtaining the user subgroup corresponding to each user anchor according to the degree of correlation between users and that user anchor, and obtaining the item subgroup corresponding to each item anchor according to the degree of correlation between items and that item anchor;

S4, pairing the user subgroups corresponding to the A user anchors with the item subgroups corresponding to the A item anchors to obtain A user-item subgroups;

S5, for each user-item subgroup, training a pairwise-gap-aware Bayesian personalized ranking algorithm to obtain each user's predicted score for each item, and taking the K highest-scoring items as the user's local recommendation list in that user-item subgroup;

S6, fusing the user's local recommendation lists from the different user-item subgroups, according to each subgroup's contribution to the global result and the order of the items in the local recommendation lists, to generate the user's global recommendation list.

Specifically, step S1 includes the steps of:

S11, all M users in the training set form a set U and all N items form a set V; the sets U and V together form the node set of the user-item bipartite graph, the set U forms the node set of the user homogeneous graph, and the set V forms the node set of the item homogeneous graph;

S12, if the training set contains a rating r_uv of item v ∈ V by user u ∈ U, the bipartite graph has an edge connecting user u and item v, and the weight of that edge is r_uv;

S13, if two users in the bipartite graph are connected to a common item, an edge exists between the two users in the user homogeneous graph; otherwise no edge exists. The edge weight is computed from the sets of items rated by users u₁ and u₂ and from user u₁'s rating of item v;

S14, if two items in the bipartite graph are connected to a common user, an edge exists between the two items in the item homogeneous graph; otherwise no edge exists. The edge weight is computed from the sets of users who have rated items v₁ and v₂ and from user u's rating of item v₁.

Specifically, step S3 includes the steps of:

S31, taking each user anchor in turn as the restart node, performing a random walk with restart to obtain the user convergence probability matrix C_U (M rows, one column per user anchor); taking each item anchor in turn as the restart node, performing a random walk with restart to obtain the item convergence probability matrix C_V (N rows, one column per item anchor);

S32, for each user u, sorting the u-th row of C_U in descending order, taking the first ρ×A user anchors in the ordering, and assigning the user to those user anchors; for each item v, sorting the v-th row of C_V in descending order, taking the first ρ×A item anchors in the ordering, and assigning the item to those item anchors;

S33, for user anchor u_a, all users assigned to it constitute a user subgroup; for item anchor v_a, all items assigned to it constitute an item subgroup;

where M is the number of users in the training set, N is the number of items in the training set, and ρ is the subgroup size control parameter.

Specifically, in step S5, the user's local recommendation list in a user-item subgroup is obtained as follows:

S51, constructing preference pairs from the observed data, the preference pairs of all users forming the training set;

S52, for each preference pair (u, i, j), computing the positive level PL of the positive sample and the negative level NL of the negative sample, and from them the gap value G between the positive and negative samples;

S53, using the gap value G between the positive and negative samples as the weight, training on the training set by gradient descent to obtain the user factor matrix U^a, the item factor matrix V^a and the item bias vector B^a by minimizing a regularized loss function;

S54, for the items that user u has not rated, sorting by predicted score in descending order and taking the first K items as user u's local recommendation list in the user-item subgroup;

where r_ui denotes user u's rating of item i; the count term denotes the number of users in the subgroup who have rated item j; b is a translation bias; the superscript a denotes the corresponding subgroup; U^a and V^a are the user factor matrix and the item factor matrix of the user-item subgroup; B^a is the item bias vector of the user-item subgroup; λ is the regularization coefficient; θ denotes the set of all parameters to be trained, namely U^a, V^a and B^a; and ||·|| denotes the matrix norm.

Specifically, step S6 includes the steps of:

S61, defining the ranking score of item v in a local recommendation list;

S62, defining the feature weight ω_u,v of item v in a subgroup;

S63, computing the subgroup's local score for user u on each item v from the ranking score of item v in the local recommendation list and the feature weight of the subgroup;

S64, obtaining the global score of item v from the local scores of item v in the local recommendation lists that the subgroups generate for user u;

S65, sorting user u's global scores for the items in descending order and taking the K highest-scoring items as user u's global recommendation list L_u;

where order(·) denotes the rank of item v in the local recommendation list; N_u and its subgroup counterpart denote, respectively, the number of all items rated by user u and the number of items rated by user u within the subgroup; and M_v and its subgroup counterpart denote, respectively, the total number of users who have rated item v and the number of users within the subgroup who have rated item v.

To achieve the above object, according to a second aspect of the present invention, there is provided a Top-K recommendation system based on local pairwise ordering and global decision fusion, the system comprising:

a homogeneous graph construction module, configured to construct a user homogeneous graph and an item homogeneous graph from the user-item bipartite graph;

an anchor selection module, configured to select A user anchors from the user homogeneous graph and A item anchors from the item homogeneous graph;

a subgroup acquisition module, configured to obtain the user subgroup corresponding to each user anchor according to the degree of correlation between users and that user anchor, and to obtain the item subgroup corresponding to each item anchor according to the degree of correlation between items and that item anchor;

a subgroup pairing module, configured to pair the user subgroups corresponding to the A user anchors with the item subgroups corresponding to the A item anchors to obtain A user-item subgroups;

a local recommendation list acquisition module, configured to train, for each user-item subgroup, a pairwise-gap-aware Bayesian personalized ranking algorithm to obtain each user's predicted score for each item, and to take the K highest-scoring items as the user's local recommendation list in that user-item subgroup;

and a global recommendation list acquisition module, configured to fuse the user's local recommendation lists from the different user-item subgroups, according to each subgroup's contribution to the global result and the order of the items in the local recommendation lists, to generate the user's global recommendation list.

Specifically, the homogeneous graph construction module obtains the user homogeneous graph and the item homogeneous graph as follows:

S13, if two users in the bipartite graph are connected to a common item, an edge exists between the two users in the user homogeneous graph; otherwise no edge exists. The edge weight is computed from the sets of items rated by the two users and from user u₁'s rating of item v;

likewise, the weight of an edge between two items in the item homogeneous graph is computed from the sets of users who have rated the two items and from user u's rating of item v₁.

Specifically, the subgroup acquisition module obtains the user subgroup corresponding to each user anchor and the item subgroup corresponding to each item anchor as follows:

S32, for each user u, sorting the u-th row of C_U in descending order, taking the first ρ×A user anchors in the ordering, and assigning the user to those user anchors; for each item v, sorting the v-th row of C_V in descending order, taking the first ρ×A item anchors in the ordering, and assigning the item to those item anchors;

S33, for user anchor u_a, all users assigned to it constitute a user subgroup; for item anchor v_a, all items assigned to it constitute an item subgroup.

Specifically, the local recommendation list acquisition module obtains the user's local recommendation list in a user-item subgroup as follows:

S52, for each preference pair (u, i, j), computing the positive level PL of the positive sample and the negative level NL of the negative sample, and from them the gap value G between the positive and negative samples;

training to obtain the user factor matrix U^a, the item factor matrix V^a and the item bias vector B^a by minimizing a loss function;

sorting the predicted scores in descending order and taking the first K items as user u's local recommendation list in the user-item subgroup;

where r_ui denotes user u's rating of item i; the count term denotes the number of users in the subgroup who have rated item j; b is a translation bias; the superscript a denotes the corresponding subgroup; and U^a and V^a are the user factor matrix and the item factor matrix of the user-item subgroup.

Specifically, the global recommendation list obtaining module generates the global recommendation list of the user by the following method:

S61, defining the ranking score of item v in a local recommendation list;

S62, defining the feature weight ω_u,v of item v in a subgroup;

S63, computing the subgroup's local score for user u on each item v from the ranking score of item v in the local recommendation list and the feature weight of the subgroup;

S65, sorting user u's global scores for the items in descending order and taking the K highest-scoring items as user u's global recommendation list L_u;

where order(·) denotes the rank of item v in the local recommendation list; N_u and its subgroup counterpart denote, respectively, the total number of items rated by user u and the number of items rated by user u within the subgroup; and M_v and its subgroup counterpart denote, respectively, the total number of users who have rated item v and the number of users within the subgroup who have rated item v.

Generally, compared with the prior art, the technical scheme conceived by the invention has the following beneficial effects:

(1) The invention uses a pairwise-gap-aware Bayesian personalized ranking algorithm to learn the preference order between item pairs, and distinguishes the relative preference strength of different item pairs by computing the gap between the positive and negative samples. Using this gap as the training weight to guide training makes the final recommendation list more accurate.

(2) According to the invention, through training in subgroups and global fusion, the users and the articles are assumed to belong to a plurality of different feature spaces, so that different aspects of user interests and article attributes can be captured from different perspectives, the problem that the learned features are too rough under the assumption of the global feature space is avoided, and the final recommendation list is more accurate.

(3) The invention adopts a global decision fusion method, and simultaneously considers the contribution of different subgroups to the global and the ordering of the articles in the local recommendation list in the fusion process, so that the fusion result is more accurate. The global decision fusion only needs to consider the local recommendation list, and does not need to consider all articles, so that the fusion efficiency is higher.

(4) The invention performs random walks on the user homogeneous graph and the item homogeneous graph separately, and the space complexity of the two homogeneous graphs is lower than that of the user-item bipartite graph. In addition, the homogeneous graphs have higher connectivity than the user-item bipartite graph, which makes them better suited to the subsequent random walks.

Drawings

FIG. 1 is a flowchart of a Top-K recommendation method based on local pairwise ordering and global decision fusion according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of splitting a user-item bipartite graph into a user homogeneous graph and an item homogeneous graph according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of fusing local recommendation lists into a global recommendation list according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. In addition, the technical features involved in the respective embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The overall concept of the invention is as follows. The user-item bipartite graph is split into a user homogeneous graph and an item homogeneous graph. A random walk is first performed on each of the two homogeneous graphs to obtain a number of anchors, and random walks with restart are then started from the anchors to generate a number of user subgroups and item subgroups. User-item subgroups are generated by pairing user subgroups with item subgroups. Within each subgroup, a gap-aware Bayesian personalized ranking algorithm is trained: positive and negative sample pairs are sampled for each user, and the gap value between the positive and negative samples is computed and used as the training weight. The optimization objective is for positive samples to score higher than negative samples. A local recommendation list is generated for each user from the predicted scores obtained after training. Finally, the local recommendation lists of the different subgroups are fused, according to each subgroup's contribution to the global result and the order of the items in the local recommendation lists, to generate the final recommendation list.

As shown in FIG. 1, the present invention provides a Top-K recommendation method based on local pairwise ordering and global decision fusion, which comprises the following steps:

Step S1, constructing a user homogeneous graph and an item homogeneous graph from the user-item bipartite graph.

As shown in fig. 2, step S1 includes the steps of:

S11, all M users in the training set form a set U and all N items form a set V; the sets U and V together form the node set of the user-item bipartite graph, the set U forms the node set of the user homogeneous graph, and the set V forms the node set of the item homogeneous graph.

S12, if the training set contains a rating r_uv of item v ∈ V by user u ∈ U, the bipartite graph has an edge connecting user u and item v, and the weight of that edge is r_uv.

In the user homogeneous graph, two users are connected by an edge if they are connected to a common item in the bipartite graph; the edge weight is computed from the sets of items rated by the two users and from user u₁'s rating of item v.

In the item homogeneous graph, two items are connected by an edge if they are connected to a common user in the bipartite graph; the edge weight is computed from the sets of users who have rated the two items and from user u's rating of item v₁.
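As an illustration of steps S11 to S14, the following Python sketch derives which edges exist in the two homogeneous graphs from a list of ratings. It only decides edge existence (two users sharing at least one rated item, two items sharing at least one rater) and does not compute edge weights. The function name `homogeneous_edges` and the `(user, item, rating)` tuple layout are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

def homogeneous_edges(ratings):
    """Return (user_edges, item_edges) implied by a list of (user, item, rating)."""
    items_of = defaultdict(set)   # user -> items the user rated
    users_of = defaultdict(set)   # item -> users who rated the item
    for user, item, _rating in ratings:
        items_of[user].add(item)
        users_of[item].add(user)

    # Two users are linked when some item was rated by both of them.
    user_edges = set()
    for raters in users_of.values():
        user_edges.update(combinations(sorted(raters), 2))

    # Two items are linked when some user rated both of them.
    item_edges = set()
    for rated in items_of.values():
        item_edges.update(combinations(sorted(rated), 2))
    return user_edges, item_edges

# Hypothetical toy ratings, invented for illustration.
ratings = [("u1", "i1", 5), ("u2", "i1", 3), ("u2", "i2", 4), ("u3", "i3", 2)]
user_edges, item_edges = homogeneous_edges(ratings)
```

In this toy data, u1 and u2 both rated i1, and u2 rated both i1 and i2, so each homogeneous graph has exactly one edge; u3 and i3 remain isolated.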

Step S2, selecting A user anchors from the user homogeneous graph and A item anchors from the item homogeneous graph.

Step S2 includes the following steps:

S21, constructing the user transition probability matrix P_U from the user homogeneous graph and the item transition probability matrix P_V from the item homogeneous graph.

S211, obtaining the weighted adjacency matrix A_U of the user homogeneous graph from the user homogeneous graph, and the weighted adjacency matrix A_V of the item homogeneous graph from the item homogeneous graph.

Corresponding to FIG. 2, each element of A_U is the weight of the edge between the corresponding two user nodes in the user homogeneous graph, and each element of A_V is the weight of the edge between the corresponding two item nodes in the item homogeneous graph.

S212, normalizing A_U and A_V by columns to obtain the user transition probability matrix P_U and the item transition probability matrix P_V.

The i-th column of P_U gives the transition probabilities from the i-th user to each user, and the i-th column of P_V gives the transition probabilities from the i-th item to each item. For example, from the second user, there is a probability of 0.5/4.3 of transferring to the first item and a probability of 3.8/4.3 of transferring to the third item.
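The column normalization in S212 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the helper name `column_normalize` and the 3×3 matrix values are invented for the example, with the second column chosen so that its sum is 4.3 as in the probabilities quoted above.

```python
import numpy as np

def column_normalize(adj):
    """Divide each column of a weighted adjacency matrix by its column sum."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0, keepdims=True)
    # Assumption: an all-zero column (isolated node) is left as zeros
    # instead of dividing by zero.
    safe_sums = np.where(col_sums == 0, 1.0, col_sums)
    return adj / safe_sums

# Hypothetical weighted adjacency matrix of a 3-node homogeneous graph.
A_U = np.array([
    [0.0, 0.5, 1.2],
    [0.7, 0.0, 2.0],
    [1.5, 3.8, 0.0],
])
P_U = column_normalize(A_U)
# Column 2 (index 1) now holds 0.5/4.3, 0 and 3.8/4.3.
```

Each column of `P_U` sums to 1, so column i can be read as the distribution over the next node when the walk is currently at node i.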

S22, performing a random walk on the user homogeneous graph and selecting the A users with the largest node convergence probability after the walk as the A user anchors; performing a random walk on the item homogeneous graph and selecting the A items with the largest node convergence probability after the walk as the A item anchors.

The random walk algorithm assigns each node a random initial value and lets the walk move randomly through the graph; after iteration, each node obtains a convergence probability that reflects its importance in the graph.

For the random walk on the user homogeneous graph, the user probability vector u^(0) is first initialized randomly, and the walk is then carried out by iteratively updating this vector.

Here u^(t+1) denotes the user probability vector after the (t+1)-th iteration, M denotes the number of users, and α is the random access probability: in each iteration, with probability α, the walk visits one of the M users chosen at random. α is typically small, e.g., 0.2. The random access probability ensures that at every step the walk picks its next node at random with some probability, which prevents the walk from being trapped in a cycle.

When the difference between the node probability distributions of two successive iterations, ||u^(t+1) - u^(t)||, is less than a threshold ε (ε is typically 1e-8), the random walk is considered to have converged. Each node in the graph then has its own convergence probability, and the A users with the largest convergence probability are selected as the user anchors.

Similarly, for the random walk on the item homogeneous graph, the item probability vector v^(0) is first initialized randomly, and the walk is then carried out by iteratively updating this vector.

Here v^(t+1) denotes the item probability vector after the (t+1)-th iteration, and N denotes the number of items. When the difference between the node probability distributions of two successive iterations, ||v^(t+1) - v^(t)||, is less than the threshold ε, the random walk is considered to have converged, and the A items with the largest convergence probability are selected as the item anchors. In the invention, A takes values in the range 5 < A < min(M, N), and 50 is typically used.
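A possible reading of the anchor selection in S22 is sketched below. The exact update formula is not reproduced in this text, so the sketch assumes a PageRank-style update, p ← (1-α)·P·p + α/n, which matches the description of α as the probability of visiting a uniformly chosen node; this form, the function name `select_anchors` and the toy matrix are all assumptions.

```python
import numpy as np

def select_anchors(P, num_anchors, alpha=0.2, eps=1e-8, max_iter=10_000, seed=0):
    """Return (anchor indices, convergence probabilities) for a column-stochastic P."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    p = rng.random(n)
    p /= p.sum()                      # random initial probability vector
    for _ in range(max_iter):
        # Assumed update: follow an edge with prob. 1-alpha, jump uniformly with prob. alpha.
        nxt = (1.0 - alpha) * (P @ p) + alpha / n
        converged = np.linalg.norm(nxt - p) < eps
        p = nxt
        if converged:
            break
    anchors = np.argsort(-p)[:num_anchors]   # nodes with the largest probability
    return anchors, p

# Hypothetical 4-node graph: nodes 0, 1 and 3 all point to node 2,
# and node 2 splits its probability between nodes 0 and 1.
P = np.array([
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])
anchors, p = select_anchors(P, num_anchors=1)
```

Node 2 receives probability mass from every other node, so it ends up with the largest convergence probability and is selected as the single anchor; node 3, which no node points to, keeps only the uniform-jump share.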

Step S3, obtaining the user subgroup corresponding to each user anchor according to the degree of correlation between users and that user anchor, and obtaining the item subgroup corresponding to each item anchor according to the degree of correlation between items and that item anchor.

Preferably, the degree of correlation between a user and a user anchor, or between an item and an item anchor, is computed with a random walk with restart. The walk starts from a chosen node; in each iteration it continues walking with probability a and returns directly to the starting node with probability 1-a. After iteration, each node again obtains a convergence probability, which here reflects its degree of correlation with the starting node.

Taking user anchor u_a as the restart node, a random walk is performed on the user homogeneous graph. The user probability vector u^(0) is initialized with a one-hot encoding: u^(0)(i) = 1 when i = u_a, and u^(0)(i) = 0 otherwise, for i ∈ [1, M]. The random walk with restart is then performed by iterating the following formula:

u^(t+1) = (1-β)·P_U·u^(t) + β·r_U

where u^(t+1) denotes the user probability vector after the (t+1)-th iteration and β is the restart probability, typically 0.5: in each iteration the walk continues along the current path with probability 1-β and returns directly to the restart node u_a with probability β. r_U is the restart vector: r_U(i) = 1 when i = u_a, and r_U(i) = 0 otherwise.

When the difference between the node probability distributions of two successive iterations is less than a threshold (typically 1e-8), the random walk is considered to have converged. Each node's convergence probability reflects its degree of correlation with the restart node: the larger a user's convergence probability, the closer that user's relationship with the user anchor u_a. After a random walk with restart has been run with each user anchor as the restart node, the user convergence probability matrix C_U is obtained.

The a-th column of C_U is the users' convergence vector when user anchor u_a is the restart node, and the u-th row of C_U is user u's vector of convergence probabilities when the different user anchors are used as restart nodes.

For each user u ∈ U, the u-th row of C_U is sorted in descending order, the first ρ×A user anchors in the ordering are taken, and the user is assigned to those user anchors, where ρ is the subgroup size control parameter and 0.5 < ρ < 1. For user anchor u_a, all users assigned to it form a user subgroup.

In the same way, taking item anchor v_a as the restart node, a random walk is performed on the item homogeneous graph. The item probability vector v^(0) is initialized with a one-hot encoding: v^(0)(i) = 1 when i = v_a, and v^(0)(i) = 0 otherwise, for i ∈ [1, N]. The random walk with restart is then performed by iterating the following formula:

v^(t+1) = (1-β)·P_V·v^(t) + β·r_V

where v^(t+1) denotes the item probability vector after the (t+1)-th iteration and β is the restart probability: in each iteration the walk continues along the current path with probability 1-β and returns directly to the restart node v_a with probability β. r_V is the restart vector: r_V(i) = 1 when i = v_a, and r_V(i) = 0 otherwise.

When the difference between the node probability distributions of two successive iterations is less than a threshold (typically 1e-8), the random walk is considered to have converged. Each node's convergence probability reflects its degree of correlation with the restart node: the larger an item's convergence probability, the closer that item's relationship with the item anchor v_a. After a random walk with restart has been run with each item anchor as the restart node, the item convergence probability matrix C_V is obtained.

The a-th column of C_V is the items' convergence vector when item anchor v_a is the restart node, and the v-th row of C_V is item v's vector of convergence probabilities when the different item anchors are used as restart nodes.

For each item v ∈ V, the v-th row of C_V is sorted in descending order, the first ρ×A item anchors in the ordering are taken, and the item is assigned to those item anchors, where ρ is the subgroup size control parameter. For item anchor v_a, all items assigned to it form an item subgroup.
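The random walk with restart and the subgroup assignment of step S3 can be sketched as follows, using the update formula given above. The sketch works for either homogeneous graph. The function names, the ring-shaped toy graph, and the choice to round ρ×A up to an integer are assumptions; the text does not say how a non-integer ρ×A is handled.

```python
import math
import numpy as np

def random_walk_with_restart(P, anchor, beta=0.5, eps=1e-8, max_iter=10_000):
    """Iterate p <- (1-beta)*P@p + beta*r from a one-hot start at `anchor`."""
    n = P.shape[0]
    r = np.zeros(n)
    r[anchor] = 1.0                   # restart vector
    p = r.copy()                      # one-hot initialization
    for _ in range(max_iter):
        nxt = (1.0 - beta) * (P @ p) + beta * r
        converged = np.linalg.norm(nxt - p) < eps
        p = nxt
        if converged:
            break
    return p

def build_subgroups(P, anchors, rho=0.6, beta=0.5):
    """Return (C, subgroups): C[n, a] is node n's probability under anchor a."""
    C = np.column_stack([random_walk_with_restart(P, a, beta) for a in anchors])
    # Assumption: rho * A is rounded up to an integer number of anchors.
    k = max(1, math.ceil(rho * len(anchors)))
    subgroups = {int(a): [] for a in anchors}
    for node in range(C.shape[0]):
        for col in np.argsort(-C[node])[:k]:   # k most correlated anchors
            subgroups[int(anchors[col])].append(node)
    return C, subgroups

# Hypothetical homogeneous graph: a ring of 6 nodes with unit edge weights.
adj = np.zeros((6, 6))
for i in range(6):
    adj[i, (i + 1) % 6] = adj[(i + 1) % 6, i] = 1.0
P = adj / adj.sum(axis=0, keepdims=True)       # column normalization
C, subgroups = build_subgroups(P, anchors=[0, 2, 4], rho=0.6)
```

With three anchors and ρ = 0.6, each node is assigned to two anchors, so the subgroups overlap; this overlap is what lets one user or item contribute to several local models.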

Step S4, pairing the user subgroups corresponding to the A user anchors with the item subgroups corresponding to the A item anchors to obtain A user-item subgroups.

Each time, a user anchor u_a is drawn at random without replacement from the set of user anchors, and an item anchor v_a is drawn at random without replacement from the set of item anchors; the two are paired into an anchor pair (u_a, v_a). This is repeated until the user anchor set and the item anchor set are both empty. For each anchor pair (u_a, v_a), the corresponding user subgroup and item subgroup are likewise paired, forming a user-item subgroup.
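Drawing one anchor from each set without replacement until both sets are empty is equivalent to shuffling both lists and pairing them position by position, which the following sketch does. The function name `pair_anchors` and the example anchor labels are assumptions for illustration.

```python
import random

def pair_anchors(user_anchors, item_anchors, seed=None):
    """Randomly pair each user anchor with exactly one item anchor."""
    if len(user_anchors) != len(item_anchors):
        raise ValueError("both anchor sets must contain the same number A of anchors")
    rng = random.Random(seed)
    users = list(user_anchors)
    items = list(item_anchors)
    rng.shuffle(users)    # random draw order for user anchors
    rng.shuffle(items)    # independent random draw order for item anchors
    return list(zip(users, items))

# Hypothetical anchor identifiers.
pairs = pair_anchors(["u3", "u7", "u9"], ["i2", "i5", "i8"], seed=42)
```

Each resulting pair identifies one user-item subgroup: the user subgroup of the first element combined with the item subgroup of the second.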

And S5, for each user-item subgroup, training by using a Bayes personalized sorting algorithm with paired difference perception to obtain the prediction score of each item by the user, and obtaining the top K items with the largest score as a local recommendation list of the user in the user-item subgroup.

For each user-item subgroup

Obtaining a local recommendation list of the user in the user-item subgroup by:

S51, for each user-item subgroup, each time randomly selecting a user u from the training set, randomly selecting an item from those the user has rated (that is, has interacted with) as the positive sample i, and randomly selecting an item from those the user has not rated (has not interacted with) as the negative sample j; the user together with the positive and negative samples forms a triple (u, i, j), as required for training the Bayesian personalized ranking algorithm with pairwise difference awareness. This operation is repeated until the number of triples equals the number of user-item ratings in the original training set.
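The triple construction can be illustrated with the short sketch below. The data layout (`ratings` as a dictionary mapping each user to a dictionary of rated items) and the name `sample_triples` are assumptions made for the example; users who have rated nothing or everything are skipped because no positive or negative sample exists for them.

```python
import random

def sample_triples(ratings, all_items, seed=None):
    """Draw (u, i, j) triples; ratings maps user -> {item: rating}."""
    rng = random.Random(seed)
    # Only users with at least one rated and one unrated item can be sampled.
    users = [u for u, r in ratings.items() if r and len(r) < len(all_items)]
    n_target = sum(len(r) for r in ratings.values())  # number of ratings
    triples = []
    while users and len(triples) < n_target:
        u = rng.choice(users)
        rated = ratings[u]
        i = rng.choice(sorted(rated))                      # positive sample
        j = rng.choice([v for v in all_items if v not in rated])  # negative sample
        triples.append((u, i, j))
    return triples

toy_ratings = {"u1": {"a": 5, "b": 3}, "u2": {"c": 4}}
toy_items = ["a", "b", "c", "d"]
triples = sample_triples(toy_ratings, toy_items, seed=1)
```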

S52, for each preference pair (u, i, j), calculating the positive level PL of the positive sample and the negative level NL of the negative sample, and from them the gap value G between the positive and negative samples.

where r_ui denotes user u's rating of item i; the number of users in the subgroup who have rated item j is also used; b is a translation bias; and the superscript a denotes the corresponding subgroup.

S53, taking the gap value G between the positive and negative samples as a weight, training on the training set by gradient descent to obtain the user factor matrix U^a, the item factor matrix V^a and the item bias vector B^a.

The optimized loss function is:

where U^a and V^a are the user factor matrix and the item factor matrix of the user-item subgroup, B^a is the item bias vector, λ is the regularization coefficient (generally 0.01), and Θ denotes the set of all parameters to be trained, namely U^a, V^a and B^a. ||·|| denotes the matrix norm. The optimization objective is for positive samples to score higher than negative samples. The U^a, V^a and B^a at the end of training are the user factor matrix, item factor matrix and item bias vector sought by the invention.
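One plausible form of a gap-weighted update is sketched below. It rests on assumptions that are not confirmed by the text: the predicted score is taken to be U_u · V_v + B_v, the per-triple loss is taken to be −G · ln σ(x_ui − x_uj) plus an L2 penalty, and the gap G is supplied by the caller rather than computed from PL and NL. The names `predict` and `weighted_bpr_step` and the learning rate are hypothetical.

```python
import numpy as np

def predict(U, V, B, u, v):
    """Assumed score model: dot product of factors plus an item bias."""
    return float(U[u] @ V[v] + B[v])

def weighted_bpr_step(U, V, B, triple, gap, lr=0.05, lam=0.01):
    """One in-place SGD step on an assumed loss -gap * ln(sigmoid(x_ui - x_uj))."""
    u, i, j = triple
    x_uij = predict(U, V, B, u, i) - predict(U, V, B, u, j)
    # Derivative of gap * ln(sigmoid(x)) with respect to x.
    coeff = gap * (1.0 - 1.0 / (1.0 + np.exp(-x_uij)))
    uu, vi, vj = U[u].copy(), V[i].copy(), V[j].copy()
    # Regularization gradient taken as lam * theta (a lam/2 squared-norm penalty).
    U[u] += lr * (coeff * (vi - vj) - lam * uu)
    V[i] += lr * (coeff * uu - lam * vi)
    V[j] += lr * (-coeff * uu - lam * vj)
    B[i] += lr * (coeff - lam * B[i])
    B[j] += lr * (-coeff - lam * B[j])
    return x_uij

rng = np.random.default_rng(0)
U = rng.normal(0.0, 0.1, size=(2, 4))   # 2 users, 4 latent factors
V = rng.normal(0.0, 0.1, size=(3, 4))   # 3 items
B = np.zeros(3)
for _ in range(200):
    weighted_bpr_step(U, V, B, (0, 1, 2), gap=1.0)
```

Under this assumed loss, a larger gap produces a larger step for the same triple, so pairs with a stronger preference difference pull the positive and negative scores apart faster.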

The local recommendation list is then obtained from the score that the trained model predicts for user u on each item v within the subgroup.

The length of the local recommendation list is equal to the length K of the global recommendation list, which is typically 10. FIG. 3 shows the local recommendation list of user u in a user-item subgroup.

The ranking of each item in the local recommendation list can reflect how well the item is liked by user u in this subgroup.
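Selecting the local list from predicted scores amounts to a top-K selection, sketched below. The helper name `local_top_k` is hypothetical, and whether items the user has already rated are excluded from the candidates is left to the caller, since the text does not say.

```python
import numpy as np

def local_top_k(scores, candidate_items, k=10):
    """Return the k candidate items with the highest predicted scores, best first."""
    items = np.asarray(candidate_items)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")[:k]
    return items[order].tolist()

local_list = local_top_k([0.2, 1.5, -0.3, 0.9], [10, 11, 12, 13], k=2)
# local_list == [11, 13]
```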

By calculating the preference gap between the positive and negative samples and using it as a weight to guide training, the preference strengths of different positive-negative sample pairs can be distinguished, so that the model learns finer-grained preference relations among items.

S6, fusing the user's local recommendation lists from the different user-item subgroups, according to the contribution of each subgroup to the global result and the order of the items within the local recommendation lists, to generate the user's global recommendation list.

Each subgroup contains different members and uses different feature spaces, so that each subgroup has its own characteristics, and thus the different subgroups contribute different degrees to the global recommendation list.

S61, defining the ranking score of item v in a local recommendation list.

Here, order(·) denotes the rank of item v in the local recommendation list. The ranking score takes one extreme value when v is at the head of the list and the other when v is at the end.

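S62, defining the feature weight ω_u,v of item v in the subgroup.

Here, N_u and its subgroup counterpart denote, respectively, the total number of items rated by user u and the number of items rated by user u within the subgroup; M_v and its subgroup counterpart denote, respectively, the total number of users who have rated item v and the number of users in the subgroup who have rated item v.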

The intuition behind this formula is that the number of items user u has rated within the subgroup reflects how much the user's local recommendation list in that subgroup contributes to the final recommendation list. Similarly, the number of users in the subgroup who have rated item v affects how much v's local ranking contributes to its global ranking.

S63, calculating, from the ranking score of item v in the local recommendation list and the feature weight ω_u,v of the subgroup, the local score that user u gives to each item v in the subgroup.

S64, obtaining the global score of item v from the local scores of item v in the local recommendation lists generated for user u by each subgroup.

S65, sorting user u's global scores for all items in descending order and taking the K items with the highest scores as the global recommendation list L_u of user u.
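Steps S61 to S65 can be sketched as follows. The functional forms are placeholders chosen only so that the example runs and are not the patent's definitions: the ranking score is assumed to be (K − order + 1) / K, the feature weight is assumed to be the product of the user's and the item's rating shares inside the subgroup, the local score is assumed to be their product, and the global score is assumed to be the sum of local scores. All function and variable names are hypothetical.

```python
from collections import defaultdict

def rank_score(order, k):
    """Placeholder ranking score: 1.0 at the head (order=1), 1/k at the end."""
    return (k - order + 1) / k

def feature_weight(n_u_sub, n_u, m_v_sub, m_v):
    """Placeholder weight: rating share of user u times rating share of item v."""
    return (n_u_sub / n_u) * (m_v_sub / m_v)

def fuse(local_lists, n_u, n_u_sub, m_v, m_v_sub, k):
    """Fuse one user's local lists {subgroup: [items, best first]} into a top-k list."""
    global_score = defaultdict(float)
    for a, items in local_lists.items():
        for order, v in enumerate(items, start=1):
            w = feature_weight(n_u_sub[a], n_u, m_v_sub[a][v], m_v[v])
            global_score[v] += w * rank_score(order, k)   # assumed: sum of local scores
    ranked = sorted(global_score, key=lambda v: (-global_score[v], v))
    return ranked[:k]

local_lists = {"a1": ["x", "y"], "a2": ["y", "z"]}
global_list = fuse(
    local_lists,
    n_u=10, n_u_sub={"a1": 6, "a2": 4},
    m_v={"x": 20, "y": 10, "z": 5},
    m_v_sub={"a1": {"x": 10, "y": 5}, "a2": {"y": 5, "z": 5}},
    k=2,
)
# global_list == ["y", "x"]
```

In this toy case item y appears in both local lists, so its accumulated score (0.35) exceeds that of x (0.30), even though x heads the list of the subgroup in which the user is most active.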

The global decision fusion method takes into account both the order of the items in the local recommendation lists and the contribution of the different subgroups to the global result, which makes the fused result more accurate. Moreover, decision fusion only needs the local recommendation list of each subgroup rather than all items in each subgroup, so the fusion is more efficient.

To verify the effectiveness of the method provided by the invention, the MovieLens-100K data set was selected as the research object, and the quality of the recommendation lists produced by the present method was compared with that of a recommendation method based on random walk with restart, a pairwise-ranking recommendation method based on Bayesian personalized ranking, and a collaborative filtering ranking recommendation method based on a neural network. The comparison results are shown in Table 1, where Method 1 is the recommendation method based on random walk with restart, Method 2 is the pairwise-ranking recommendation method based on Bayesian personalized ranking, Method 3 is the collaborative filtering recommendation method based on a neural network, and Method 4 is the method provided by the invention.

TABLE 1

The comparison shows that the Top-K recommendation method based on local pairwise ordering and global decision fusion provided by the invention clearly improves on the existing Top-K recommendation methods in both the MAP and NDCG evaluation indexes. This is because the Bayesian ranking algorithm with pairwise difference awareness distinguishes the preference strength between different item pairs, so the model can learn finer-grained preference relations among items. In addition, training in subgroups followed by global fusion assumes that users and items belong to several different feature spaces, so different aspects of user interests and item attributes are captured from different perspectives; this avoids the overly coarse features learned under the assumption of a single global feature space and makes the final recommendation list more accurate.

It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention, and is not intended to limit the invention to the particular forms disclosed, since various modifications, substitutions and improvements within the spirit and scope of the invention are possible and within the scope of the appended claims.

Claims

1. The Top-K recommendation method based on local pairwise ordering and global decision fusion is characterized by comprising the following steps of:

2. The method of claim 1, wherein step S1 comprises the steps of:

S11, all M users in the training set form a set U and all N items form a set V; the sets U and V form the vertex set of the user-item bipartite graph, the set U forms the vertex set of the user homogeneous graph, and the set V forms the vertex set of the item homogeneous graph;

S12, if the training set contains a rating r_uv by user u ∈ U of item v ∈ V, then there is an edge connecting user u and item v in the bipartite graph, and the weight of that edge is r_uv;

wherein the two item sets denote the items rated by user u₁ and by user u₂, respectively, and the rating terms denote the ratings of item v by user u₁ and by user u₂;

wherein the two user sets denote the users who have rated item v₁ and item v₂, respectively, and the rating terms denote user u's rating of item v₁ and user u's rating of item v₂.

3. The method of claim 1, wherein step S3 comprises the steps of:

after a random walk with restart has been carried out with each item anchor in turn as the restart node, an item convergence probability matrix C_V is obtained;

S32, for each user u, sorting row u of C_U in descending order, taking the first ρ × A user anchors in the ordering, and assigning the user to those user anchors; for each item v, sorting row v of C_V in descending order, taking the first ρ × A item anchors in the ordering, and assigning the item to those item anchors;

for item anchor v_a, all the items assigned to it constitute an item subgroup.

4. The method of claim 1, wherein in step S5 the user's local recommendation list in a user-item subgroup is obtained by:

obtaining the local recommendation list;

wherein r_ui denotes user u's rating of item i; the number of users in the subgroup who have rated item j is also used; b is a translation bias; the superscript a denotes the corresponding subgroup; U^a and V^a are the user factor matrix and the item factor matrix of the user-item subgroup; B^a is the item bias vector; λ is a regularization coefficient; Θ denotes the set of all parameters to be trained, namely U^a, V^a and B^a; and ||·|| denotes the matrix norm.

5. The method of claim 1, wherein step S6 comprises the steps of:

S61, defining the ranking score of item v in a local recommendation list;

S62, defining the feature weight ω_u,v of item v in the subgroup;

S63, calculating, from the ranking score of item v in the local recommendation list and the feature weight of the subgroup, the local score that user u gives to each item v in the subgroup;

S65, sorting user u's global scores for the items in descending order and taking the K items with the highest scores as the global recommendation list L_u of user u;

wherein order(·) denotes the rank of item v in the local recommendation list; N_u and its subgroup counterpart denote, respectively, the total number of items rated by user u and the number of items rated by user u within the subgroup; and M_v and its subgroup counterpart denote, respectively, the total number of users who have rated item v and the number of users in the subgroup who have rated item v.

6. A Top-K recommendation system based on local pairwise ordering and global decision fusion, characterized by comprising:

the subgroup matching module is used for matching the user subgroups corresponding to the A user anchor points with the article subgroups corresponding to the A article anchor points to obtain A user-article subgroups;

7. The system of claim 6, wherein the homogeneous graph construction module derives the user homogeneous graph and the item homogeneous graph by:

wherein the two item sets denote the items rated by user u₁ and by user u₂, respectively, and the rating terms denote user u₁'s rating of item v and user u₂'s rating of item v;

and the two user sets denote the users who have rated item v₁ and item v₂, respectively, with the rating term denoting user u's rating of item v₁.

8. The system of claim 6, wherein the subgroup acquisition module obtains the subgroup of items corresponding to each item anchor by:

after a random walk with restart has been carried out with each item anchor in turn as the restart node, an item convergence probability matrix C_V is obtained;

S32, for each user u, sorting row u of C_U in descending order, taking the first ρ × A user anchors in the ordering, and assigning the user to those user anchors; for each item v, sorting row v of C_V in descending order, taking the first ρ × A item anchors in the ordering, and assigning the item to those item anchors;

S33, for user anchor u_a, all the users assigned to it constitute a user subgroup.

9. The system of claim 6, wherein the local recommendation list acquisition module obtains the user's local recommendation list in a user-item subgroup by:

obtaining the local recommendation list;

wherein r_ui denotes user u's rating of item i; the number of users in the subgroup who have rated item j is also used; b is a translation bias; the superscript a denotes the corresponding subgroup; and U^a and V^a are the user factor matrix and the item factor matrix of the user-item subgroup.

10. The system of claim 6, wherein the global recommendation list acquisition module generates the global recommendation list for the user by:

S61, defining the ranking score of item v in a local recommendation list;

S62, defining the feature weight ω_u,v of item v in the subgroup;

S63, calculating, from the ranking score of item v in the local recommendation list and the feature weight of the subgroup, the local score that user u gives to each item v in the subgroup;

S64, obtaining the global score of item v from the local scores of item v in the local recommendation lists generated for user u by each subgroup;

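wherein order(·) denotes the rank of item v in the local recommendation list; N_u and its subgroup counterpart denote, respectively, the total number of items rated by user u and the number of items rated by user u within the subgroup; and M_v and its subgroup counterpart denote, respectively, the total number of users who have rated item v and the number of users in the subgroup who have rated item v.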