CN112650920A - Bayesian ordering-based recommendation method fusing social networks


Info

Publication number
CN112650920A
Authority
CN
China
Prior art keywords
user
item
node
items
implicit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011435734.4A
Other languages
Chinese (zh)
Other versions
CN112650920B (en
Inventor
印鉴
蒙权
高静
方国鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Hengdian Information Technology Co ltd
Sun Yat Sen University
Original Assignee
Guangdong Hengdian Information Technology Co ltd
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Hengdian Information Technology Co ltd, Sun Yat Sen University
Priority to CN202011435734.4A
Publication of CN112650920A
Application granted
Publication of CN112650920B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G06F16/9536: Search customisation based on social or collaborative filtering
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G06F18/24155: Bayesian classification
    • G06N7/01: Probabilistic graphical models, e.g. probabilistic networks
    • G06Q50/01: Social networking

Abstract

The invention provides a Bayesian-ranking-based recommendation method that fuses a social network. First, a heterogeneous graph is formed from the items consumed by users, their rating feedback, and the social network; the heterogeneous graph is then sampled by a novel heterogeneous-graph walking method, and the sampled data are input into a Skip-Gram neural network to learn vector representations of users and items. The cosine similarity between user vectors is then calculated, and the implicit friends most likely to share similar preferences are identified from the similarity between users. Finally, based on each user's implicit-friend relationships, the items are subdivided into several mutually exclusive parts, modeled with a Bayesian personalized ranking algorithm, and a personalized recommendation list is generated for each user.

Description

Bayesian ordering-based recommendation method fusing social networks
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a recommendation method fusing social networks based on Bayesian ranking.
Background
Recommendation systems alleviate the difficulty users face in quickly selecting items of interest from a huge number of items. However, on recommendation platforms such as online shopping malls, most users purchase only a small fraction of the large number of goods on offer, so the data on user-item interactions are extremely sparse, and traditional recommendation systems consequently perform poorly.
With the development of online social platforms such as QQ, WeChat, and Weibo, researchers have found that a user's preferences tend to be related to the preferences of the user's social friends. Using these observed social relationships to infer a user's preferences from those of the user's social friends can alleviate the data-sparsity problem of recommendation platforms.
Building on this insight, many researchers have tried to integrate social relationships into their respective fields. This research has shown that applying social relationships to recommendation systems raises the following problems:
1) explicit social relationships contain a great deal of noise, in particular fake users, advertising and marketing accounts, and the like; the presence of such noise means that explicit social relationships may even harm the effectiveness of a recommendation system;
2) the meaning of a social relationship is complex and is not limited to preference similarity; for example, some people become friends because of similar preferences, while others become friends through work; if explicit friend relationships are treated directly as preference relationships without deeper filtering, the effectiveness of the recommendation system may suffer, so directly using explicit friend relationships for recommendation is of correspondingly limited value.
Disclosure of Invention
The invention provides a comparatively accurate Bayesian-ranking-based recommendation method that fuses social networks.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
a recommendation method based on Bayesian ranking and fusing social networks comprises the following steps:
s1: fusing a social network of a user and an interactive network of the user and an article to construct a heterogeneous information network graph;
s2: identifying implicit friends of each user based on the heterogeneous graph embedded representation;
s3: according to each user's rating feedback on items and the interaction records between the user's implicit friends and items, all items are classified at a fine granularity for each user into 6 mutually exclusive groups: like items, mediocre items, aversion items, implicit friend like items, implicit friend aversion items, and remaining items;
s4: modeling the preference order of each user based on a Bayesian ranking algorithm: for each user, the model assumes the preference order like item > implicit friend like item > mediocre item > remaining item > implicit friend aversion item > aversion item; the model is optimized so that the joint probability of this assumption is as large as possible, and the personalized ranking list of the user at the maximum joint probability is obtained;
s5: when a user logs in the recommendation platform, the system searches the personalized ranking list of the user in the training result by using the id of the user, and recommends the top N items to the user.
Further, the process of step S1 is:
s11: constructing an interaction graph of users and articles, wherein each user or article is independently represented by a node, if the user has an interaction relationship with the article, an edge is used for connection, the weight of the edge represents the score of the user on the article, and after all the interaction relationships are connected, the interaction graph of the user and the article is obtained;
s12: on an interaction graph of the user and the article, adding the user to a social network through a combination unit; if a social relationship exists between the users, connecting the nodes of the two users; and when the social relations of all the users are connected, obtaining a social heterogeneous information network graph.
Further, the process of step S2 is:
s21: on a social heterogeneous information network, performing wandering sampling, and converting information from a graph form into a node sequence form;
s22: inputting the node sequence corpus into a Skip-Gram neural network for embedding representation learning, and training to obtain a vector representation of each node;
s23: and calculating the similarity between the embedded vectors of the users, and measuring the similarity of the preference between the users by using the similarity.
Further, the process of step S21 is:
s211: setting a wandering rule from the current node to the next node:
1) if the current node is an item, the next node can only be a user, since there are no edges between items; if the current item node is connected to many users, one of them is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j), the sum running over the neighbours v_j of v_i,
which is proportional to the rating weight on the edge, that is, the higher the user's rating, the higher the probability of being selected;
2) if the current node is a user, the next node may be either a user or an item; a probability α^(n-1) is first used to decide whether the next node type is a user or an item, where n denotes the number of consecutive visits to user-type nodes, so the probability of continuing to select users decreases as n increases, which prevents the walk from staying too long on one node type and keeps the sampling balanced; after the next node type is chosen, if the user type was chosen, the next user node is selected with equal probability from the user nodes connected to the current user node; if the item type was chosen, an item node connected to the current user node is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),
which guarantees that items the current user has rated higher are selected with greater probability;
the walk selection algorithm is expressed by the following equation (1):
p(v_{i+1} | v_i) =
    e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),                   if v_i ∈ m;
    α^(n-1) · 1 / |N_{i+1}(v_i)|,                             if v_i ∈ u and v_{i+1} ∈ u;
    (1 - α^(n-1)) · e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),    if v_i ∈ u and v_{i+1} ∈ m.        (1)
where v_i denotes the current node; v_{i+1} denotes the next node of the walk; m denotes the item type; u denotes the user type; α ∈ [0,1] is the initial probability of selecting the user type; e(v_i, v_j) denotes the weight of the edge between nodes v_i and v_j; |N_{i+1}(v_i)| denotes the number of social friends of node v_i; and n is the number of consecutive visits to user-type nodes;
s212: acquiring the set U of all N user ids, and performing the following for each user id in U: starting from that user's node, walking along the edges of the graph according to the designed walking rule; the first step moves to a neighbour of the user node, the second step to a neighbour of that neighbour, and so on, each step sampling according to the designed probabilities, until the specified walk length L is reached, where L is set according to the complexity of the heterogeneous information network, thereby obtaining N node sequences of length L;
s213: repeating the operation of step S212 W times so that the heterogeneous graph is sampled comprehensively enough, where W is set according to the complexity of the heterogeneous information network, thereby obtaining W × N node sequences of length L, namely the corpus obtained by walk sampling on the heterogeneous graph.
Further, in step S22 the node sequence corpus is input into a Skip-Gram neural network for embedding representation learning, and training yields a vector representation of each node; for each current node v_k, the objective function to be optimized is
max_θ Σ_{v_m ∈ C(v_k)} log p(v_m | v_k; θ),
where C(v_k) denotes the set of nodes within a window of w positions before and after node v_k in a sequence, V denotes the nodes of the corpus, and p(v_m | v_k; θ) is a Softmax function, specifically
p(v_m | v_k; θ) = exp(y_{v_m} · y_{v_k}) / Σ_{v ∈ V_n} exp(y_v · y_{v_k}),
where θ is a weight parameter, V_n denotes the node type, and y_v is the embedding vector of node v.
Further, in step S23 the similarity between users' embedding vectors is calculated and used to measure how similar the users' preferences are; the similarity between two vectors is the cosine similarity
sim(u1, u2) = (y_{u1} · y_{u2}) / (‖y_{u1}‖ · ‖y_{u2}‖);
for each user, this formula is used to compute the similarity to all other users, the Top-K users with the highest similarity are taken as that user's implicit friends, and finally each user's Top-K implicit friends are obtained.
Further, the process of step S3 is:
s31: the items the user has interacted with are divided into 3 levels according to the scores the user gave them: like items P_U, mediocre items O_U, and aversion items N_U; items are rated on a 5-point scale: if the user scored the item 4-5, the user is considered to like it and it is placed in the user's like items; if the score is 3, the user is considered to find the item mediocre and it is placed in the user's mediocre items; if the score is 1-2, the user is considered to dislike the item and it is placed in the user's aversion items;
s32: the items the user has not interacted with are divided into 3 levels according to the evaluations of the user's implicit friends: implicit friend like items PS_U, implicit friend aversion items NS_U, and remaining items E_U; items that some implicit friend of the user has watched and liked, and that do not belong to P_U, O_U or N_U, are placed in the user's implicit friend like items; items that some implicit friend has watched with a score of at most 3, and that do not belong to P_U, O_U, N_U or PS_U, are placed in the user's implicit friend aversion items; the items belonging to none of P_U, O_U, N_U, PS_U and NS_U are placed in the remaining items;
s33: through these two classification steps, each user has 6 mutually exclusive classes of items; P_U + O_U + N_U + PS_U + NS_U + E_U is the set of all items, and P_U, O_U, N_U, PS_U, NS_U, E_U are pairwise independent and do not intersect;
like items P_U: for each user u, P_U denotes the items that user u has viewed and liked;
mediocre items O_U: O_U denotes the items that user u has viewed and finds mediocre;
aversion items N_U: N_U denotes the items that user u has viewed and dislikes;
implicit friend like items PS_U: PS_U denotes the items that some implicit friend of user u has watched and liked and that do not belong to P_U, O_U or N_U;
implicit friend aversion items NS_U: NS_U denotes the items that some implicit friend of user u has watched with a score of at most 3 and that do not belong to P_U, O_U, N_U or PS_U;
remaining items E_U: the items that user u has not viewed and that do not belong to P_U, O_U, N_U, PS_U or NS_U.
Further, the process of step S4 is:
s41: for the fine-grained classification of each user's items into 6 classes, the following assumption is proposed: the user's preference level is assumed to be like > implicit friend like > mediocre > remaining > implicit friend aversion > aversion; this assumption is then converted into a mathematical model:
f: x_ui ≥ x_uj ≥ x_uk ≥ x_ul ≥ x_um ≥ x_un,
where i ∈ P_U, j ∈ PS_U, k ∈ O_U, l ∈ E_U, m ∈ NS_U, n ∈ N_U,
and where x_ui denotes user u's preference for like item i rated by the user, x_uj denotes user u's preference for implicit friend like item j, x_uk denotes user u's preference for mediocre item k rated by the user, x_ul denotes user u's preference for remaining item l, x_um denotes user u's preference for implicit friend aversion item m, and x_un denotes user u's preference for aversion item n rated by the user;
s42: the above basic assumption is used to maximize the AUC; the larger the AUC value, the larger the joint probability that the assumed order holds; training is carried out with the following Bayesian personalized ranking optimization objective:
max_Θ Σ_{u ∈ U} Σ_{(a,b): x_ua ≻ x_ub} ln σ(x_ua - x_ub) - λ_Θ ‖Θ‖²,
where σ is the logistic sigmoid function, Θ denotes the model parameters, λ_Θ is a regularization coefficient, and the inner sum runs over the item pairs (a, b) whose relative order is fixed by the assumed preference levels;
when the optimization objective reaches its maximum, an item list ordered by preference degree is obtained for each user;
s43: the item ordered list results of each user are stored in a database for easy query.
Further, in step S5, when a user logs in to the platform, the system reads the user's id information, retrieves the user's item recommendation list from the offline database according to that id, and feeds back the Top-N items at the head of the list to the user.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
according to the method, firstly, an abnormal figure is formed by the articles consumed by the user, the grading feedback and the social network, then the abnormal figure is sampled through a novel abnormal figure walking method, and the sampled data is input into a Skip-Gram neural network to learn the vector representation of the user and the articles. And then calculating the similarity of the vectors of the users by using a cosine similarity formula, and identifying the implicit friends which are most likely to have similar preference according to the similarity between the users. And finally, based on the implicit friend relationship of each user, subdividing the articles into a plurality of mutually exclusive parts, modeling through a Bayesian personalized sorting algorithm, and generating a personalized recommendation list for each user.
Drawings
FIG. 1 is a general flow chart of the process of the present invention;
FIG. 2 is a simplified social heterogeneous information network;
fig. 3 is a schematic diagram of a node walk sample corpus.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present embodiments;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
As shown in fig. 1, a recommendation method for fusing social networks based on bayesian ranking includes the following steps:
s1: fusing the social network of the user and the interactive network of the user and the article to construct a heterogeneous information network (the step includes a combination unit of 101 in the flow chart);
s2: implicit friends of each user are identified based on the heterogeneous graph embedded representation. Specifically, in this embodiment a new heterogeneous graph sampling method is designed: node walk sequences are obtained by walking on the heterogeneous graph, and a vector representation of each user and each item is obtained by feeding the node sequences into a Skip-Gram neural network. Finally, the similarity between every pair of users is calculated from the user vectors, and the K users most similar to a user are taken as that user's implicit friends (this step comprises the node walk sampling unit 102, the Skip-Gram neural network 103, and the similarity calculation unit 104 in the flowchart);
s3: according to each user's rating feedback on items and the interaction records between the user's implicit friends and items, all items are classified at a fine granularity for each user into 6 mutually exclusive groups: like items, mediocre items, aversion items, implicit friend like items, implicit friend aversion items, and remaining items (this step comprises the item fine-grained classification unit 105 in the flowchart);
s4: the preference order of each user is modeled with a Bayesian ranking algorithm. For each user, the model assumes the preference order like item > implicit friend like item > mediocre item > remaining item > implicit friend aversion item > aversion item. The model is optimized so that the joint probability of this assumption is as large as possible, and the personalized ranking list of the user at the maximum joint probability is obtained (this step comprises the Bayesian ranking unit 106 in the flowchart);
s5: when a user logs in the recommendation platform, the system searches the personalized ranking list of the user in the training result by using the id of the user, and recommends the top N items to the user. (this step includes a retrieval unit of 107 in the flowchart)
(1) S1: and fusing the social network of the user and the interactive network of the user and the article to construct a heterogeneous information network. (this step includes the combination unit of 101 in the flowchart)
S11: and constructing an interaction graph of the users and the articles, wherein each user or article is represented by a node, if the user and the article have an interaction relationship, an edge is connected, and the weight of the edge represents the score of the user on the article. When all the interactive relations are connected, an interactive graph of the user and the article is obtained.
S12: and joining the social network between the users through the combination unit on the interaction graph of the users and the items. Specifically, if there is a social relationship between the users, the nodes of the two users are connected. When the social relationships of all users are connected, a social heterogeneous information network diagram is obtained, and the structure of the social heterogeneous information network diagram is shown in fig. 2, wherein u _ id represents the id of the user, and m _ id represents the id of the item.
(2) S2: Implicit friends of each user are identified based on the heterogeneous graph embedded representation. Specifically, in this embodiment a new heterogeneous graph sampling method is designed: node walk sequences are obtained by walking on the heterogeneous graph, and a vector representation of each user and each item is obtained by feeding the node sequences into a Skip-Gram neural network. Finally, the similarity between every pair of users is calculated from the user vectors, and the K users most similar to a user are taken as that user's implicit friends. (This step includes the node walk sampling unit 102, the Skip-Gram neural network 103, and the similarity calculation unit 104 in the flowchart.)
S21: on the social heterogeneous information network, the wandering sampling is carried out, and the information is converted from the form of a graph into the form of a node sequence (the step comprises a node wandering sampling unit of 102 in the flow chart). The method comprises the following steps:
s211: a migration rule from the current node to the next node is set.
1) If the current node is an item, the next node can only be a user, since there are no edges between items. If the current item node is connected to many users, one of them is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j), the sum running over the neighbours v_j of v_i,
i.e. the probability is proportional to the rating weight on the edge: the higher the user's rating, the more likely that user is to be selected.
2) If the current node is a user, the next node may be either a user or an item. In this case a probability α^(n-1) is first used to decide whether the next node type is a user or an item, where n denotes the number of consecutive visits to user-type nodes. Thus the probability of continuing to select users decreases as n increases, which prevents the walk from staying too long on one node type and keeps the sampling balanced. Once the next node type has been chosen: if the user type was chosen, the next user node is selected with equal probability from the user nodes connected to the current user node; if the item type was chosen, an item node connected to the current user node is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),
which guarantees that items the current user has rated higher are selected with greater probability.
In summary, the walk selection algorithm designed in this embodiment can be expressed by the following formula (1):
p(v_{i+1} | v_i) =
    e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),                   if v_i ∈ m;
    α^(n-1) · 1 / |N_{i+1}(v_i)|,                             if v_i ∈ u and v_{i+1} ∈ u;
    (1 - α^(n-1)) · e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),    if v_i ∈ u and v_{i+1} ∈ m.        (1)
In formula (1):
v_i denotes the current node;
v_{i+1} denotes the next node of the walk;
m denotes the item type;
u denotes the user type;
α ∈ [0,1] is the initial probability of selecting the user type;
e(v_i, v_j) denotes the weight of the edge between nodes v_i and v_j;
|N_{i+1}(v_i)| denotes the number of social friends of node v_i;
n is the number of consecutive visits to user-type nodes.
S212: acquiring a set U of all N user ids, and executing the following operations on each user id in the user set U: starting from the user node, the wandering rule designed in the embodiment is wandered along the edge in the graph, the first step is wandered to the adjacent node of the user node according to the rule, the second step is wandered to the adjacent node of the adjacent node according to the rule, the steps are repeated, each step needs to carry out wandering sampling according to the designed probability until the specified step length L is wandered, wherein the size of the L is set according to the complexity of the heterogeneous information network. Thus, N node sequences with L steps are obtained.
S213: repeating the operation of step S212W times ensures that the sampling of the heterogeneous graph is sufficiently comprehensive, wherein the size of W is set according to the complexity of the heterogeneous information network. In this way, a node sequence of W × N L steps is obtained, which is referred to as a corpus obtained by migrating samples from a heterogeneous graph. The node sequence corpus of fig. 3 below can be referred to in detail.
S22: and inputting the node sequence corpus into a Skip-Gram neural network for embedding, characterizing and learning, and training to obtain a vector representation of each node (the step comprises the Skip Gram neural network of 103 in the flow chart). Specifically, for each current node vk, the optimized objective function is:
Figure BDA0002828615560000091
wherein C (vk) represents a node set of upper and lower w windows of a node vk, V represents each node in a corpus, and p (vnm | vk; theta) is a Softmax function, and specifically comprises the following steps:
Figure BDA0002828615560000092
where θ is a weight parameter, Vn represents a node type, yvDisplay sectionThe embedded vector of point v.
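A hedged sketch of the embedding training of S22, using gensim's skip-gram Word2Vec as a stand-in for the Skip-Gram network described above; the dimension, window size and epoch count are placeholders, not values taken from this embodiment.

```python
# Sketch of S22: learn node embeddings from the walk corpus with skip-gram Word2Vec.
from gensim.models import Word2Vec

def learn_embeddings(corpus, dim=128, window=5, epochs=5):
    # gensim expects token strings, so each ("u"/"m", id) node is serialized as "u_1", "m_7", ...
    sentences = [[f"{kind}_{nid}" for kind, nid in walk] for walk in corpus]
    model = Word2Vec(sentences, vector_size=dim, window=window,
                     sg=1, min_count=1, epochs=epochs)    # sg=1 selects the skip-gram model
    return {tok: model.wv[tok] for tok in model.wv.index_to_key}
```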
S23: the similarity between the user's embedded vectors is calculated and used to measure the degree of similarity in preferences between users (this step includes the calculate similarity unit of 104 in the flow chart). The similarity between two vectors is calculated as:
Figure BDA0002828615560000093
for each user, the formula is used for calculating the similarity between the user and all other users, the Top-K users with the highest similarity are taken as the implicit friends of the user, and finally the Top-K implicit friends of each user are obtained
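The implicit-friend selection of S23 can then be sketched as a cosine-similarity computation over the learned user vectors; the "u_" token convention follows the previous sketch and is an assumption.

```python
# Sketch of S23: cosine similarity between user embeddings and Top-K implicit friends.
import numpy as np

def implicit_friends(embeddings, K=10):
    users = [t for t in embeddings if t.startswith("u_")]
    M = np.stack([embeddings[u] for u in users])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)      # unit-normalize each user vector
    sim = M @ M.T                                         # pairwise cosine similarities
    friends = {}
    for idx, u in enumerate(users):
        order = np.argsort(-sim[idx])                     # most similar first
        friends[u] = [users[j] for j in order if j != idx][:K]
    return friends
```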
(3) S3: According to each user's rating feedback on items and the interaction records between the user's implicit friends and items, all items are classified at a fine granularity for each user into 6 mutually exclusive groups: like items, mediocre items, aversion items, implicit friend like items, implicit friend aversion items, and remaining items. (This step includes the item fine-grained classification unit 105 in the flowchart.)
S31: for certain items that the user has interacted with, the items are set to 3 levels according to the scores given to the items by the user: like item (P)U) The plain term (O)U) Aversion item (N)U). The user scores 5 points for the item, if the user scores 4-5 points for the item, the user is considered to like the item, and the item is classified into the favorite of the user; if the score is 3, the user is considered to feel that the item is mediocre, and the item is classified as the mediocre item of the user; if the score is 1-2, the user is considered to dislike the item, and the item is classified as dislike for the user.
S32: for the items which have not been interacted by the user, the items are set to 3 grades according to the evaluation of the implicit friends: implicit friend like item (PS)U) Implicit friend aversion item (NS)U) The remainder (E)U). The implicit friends of the user have watched and liked, and do not belong to (P)U、OUAnd NU) The items of (1) fall into the implicit friend like items of the user; the implicit friends of the user watch the information, the score is less than or equal to 3, and the information does not belong to (P)U、OU、NUAnd PSU) The items of (1) are classified into implicit friend aversion items of the user; will not belong to (P)U、OU、NU、PSUAnd NSU) The remaining items of (a) are included in the remaining items.
S33: through the two steps of classification, each user has mutually exclusive 6 types of articles, PU+OU+NU+PSU+NSU+EUSet of all items, and PU、OU、NU、PSU、NSU、EUIndependent in pairs and not mutually intersected.
(P)U): for all users, order PURepresenting items that user u has viewed and liked by itself;
② plain term (O)U): let O beURepresenting items that user u has viewed and feels mediocre by itself;
③ aversion item (N)U): let NUAn item that represents what user u has viewed and disliked itself;
fourthly, implicit friend like item (PS)U): let PSUImplicit friends representing user u are watched and liked by someone, and do not belong to (P)U、OUAnd NU) The article of (1);
fifth hidden friend aversion item (NS)U): let NSU denote that someone in the implicit friends of user u watched and scored ≦ 3, and not (P)U、OU、NUAnd PSU) The article of (1);
sixthly item (E)U): has not been viewed by user u and does not belong to (P)U、OU、NU、PSUAnd NSU) The remainder of the article.
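For a single user, the fine-grained partition of S31-S33 can be sketched with set operations; the data layout (a ratings dictionary keyed by (user, item) on a 1-5 scale, the implicit-friend mapping from the previous step, and a set of all item ids) is assumed for illustration, and identifiers are assumed to be consistent across steps.

```python
# Sketch of S31-S33: split all items into the six mutually exclusive groups for user u.
def partition_items(u, ratings, friends, all_items):
    P = {i for i in all_items if ratings.get((u, i), 0) >= 4}        # like items P_U
    O = {i for i in all_items if ratings.get((u, i), 0) == 3}        # mediocre items O_U
    N = {i for i in all_items if 1 <= ratings.get((u, i), 0) <= 2}   # aversion items N_U
    seen = P | O | N
    PS, NS = set(), set()
    for f in friends.get(u, []):
        for i in all_items - seen:
            s = ratings.get((f, i), 0)
            if s >= 4:
                PS.add(i)                                # implicit friend like items PS_U
            elif 1 <= s <= 3:
                NS.add(i)                                # implicit friend rated it at most 3
    NS -= PS                                             # PS_U takes precedence, keeping groups disjoint
    E = all_items - seen - PS - NS                       # remaining items E_U
    return P, O, N, PS, NS, E
```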
(4) S4: and modeling the preference sequence of each user based on a Bayesian sorting algorithm. For each user, the model assumes its degree of preference: like item > implicit friend like item > meditation item > remaining item > implicit friend aversion item > aversion item. The model is optimized according to the larger the joint probability of the established assumptions, and the personalized ranking list of the users when the joint probability is maximum is obtained. (this step includes a Bayesian ranking unit of 106 in the flow chart)
S41: for a fine-grained classification of 6 items per user, the following assumptions are proposed: it is assumed that the user's preference level is like > implicit friend like > mediocre > remaining > implicit friend dislike > dislike. This assumption is then converted to a mathematical formula model:
f:xui≥xuj≥xuk≥xul≥xum≥xun
i∈Pu,j∈PSu,k∈Ou,l∈Eu,m∈NSu,n∈Nu
where Xui denotes the preference of the user u for the favorite item i of the self-evaluation, Xuj denotes the preference of the user u for the implicit friend favorite item j, Xuk denotes the preference of the user u for the mediocre item k of the self-evaluation, Xul denotes the preference of the user u for the remaining items 1, Xum denotes the preference of the user u for the implicit friend aversion item m, and Xun denotes the preference of the user u for the aversion item n of the self-evaluation.
S42: the above basic assumption can be used to maximize AUC, a larger AUC value, meaning that the probability of the above assumptions being combined is larger. Training is carried out according to the following optimized formula:
Figure BDA0002828615560000111
when the optimization goal is maximized, a list of items that each user has ranked by preference can be obtained.
S43: the item ordered list results of each user are stored in a database for easy query.
(5) S5: when a user logs in the recommendation platform, the system searches the personalized ranking list of the user in the training result by using the id of the user, and recommends the top N items to the user.
Items are recommended to the user online: when a user logs in to the platform, the system reads the user's id information, retrieves the user's item recommendation list from the offline database according to that id, and feeds back the Top-N items at the head of the list to the user.
This completes the Bayesian-ranking-based item recommendation process that fuses the social network.
The same or similar reference numerals correspond to the same or similar parts;
the positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting in the present embodiment;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A recommendation method fusing social networks based on Bayesian ranking is characterized by comprising the following steps:
s1: fusing a social network of a user and an interactive network of the user and an article to construct a heterogeneous information network graph;
s2: identifying implicit friends of each user based on the heterogeneous graph embedded representation;
s3: according to each user's rating feedback on items and the interaction records between the user's implicit friends and items, all items are classified at a fine granularity for each user into 6 mutually exclusive groups: like items, mediocre items, aversion items, implicit friend like items, implicit friend aversion items, and remaining items;
s4: modeling the preference order of each user based on a Bayesian ranking algorithm: for each user, the model assumes the preference order like item > implicit friend like item > mediocre item > remaining item > implicit friend aversion item > aversion item; the model is optimized so that the joint probability of this assumption is as large as possible, and the personalized ranking list of the user at the maximum joint probability is obtained;
s5: when a user logs in the recommendation platform, the system searches the personalized ranking list of the user in the training result by using the id of the user, and recommends the top N items to the user.
2. The Bayesian-ranking-based converged social network recommendation method according to claim 1, wherein the step S1 is implemented by:
s11: constructing an interaction graph of users and articles, wherein each user or article is independently represented by a node, if the user has an interaction relationship with the article, an edge is used for connection, the weight of the edge represents the score of the user on the article, and after all the interaction relationships are connected, the interaction graph of the user and the article is obtained;
s12: on an interaction graph of the user and the article, adding the user to a social network through a combination unit; if a social relationship exists between the users, connecting the nodes of the two users; and when the social relations of all the users are connected, obtaining a social heterogeneous information network graph.
3. The Bayesian-ranking-based converged social network recommendation method according to claim 2, wherein the step S2 is implemented by:
s21: on a social heterogeneous information network, performing wandering sampling, and converting information from a graph form into a node sequence form;
s22: inputting the node sequence corpus into a Skip-Gram neural network for embedding representation learning, and training to obtain a vector representation of each node;
s23: and calculating the similarity between the embedded vectors of the users, and measuring the similarity of the preference between the users by using the similarity.
4. The Bayesian-ranking-based converged social network recommendation method according to claim 3, wherein the step S21 is implemented by:
s211: designing a wandering rule from a current node to a next node:
1) if the current node is an item, the next node can only be a user, since there are no edges between items; if the current item node is connected to many users, one of them is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j), the sum running over the neighbours v_j of v_i,
which is proportional to the rating weight on the edge, that is, the higher the user's rating, the higher the probability of being selected;
2) if the current node is a user, the next node may be either a user or an item; a probability α^(n-1) is first used to decide whether the next node type is a user or an item, where n denotes the number of consecutive visits to user-type nodes, so the probability of continuing to select users decreases as n increases, which prevents the walk from staying too long on one node type and keeps the sampling balanced; after the next node type is chosen, if the user type was chosen, the next user node is selected with equal probability from the user nodes connected to the current user node; if the item type was chosen, an item node connected to the current user node is selected with probability
p(v_{i+1} | v_i) = e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),
which guarantees that items the current user has rated higher are selected with greater probability;
the walk selection algorithm is expressed by the following equation (1):
p(v_{i+1} | v_i) =
    e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),                   if v_i ∈ m;
    α^(n-1) · 1 / |N_{i+1}(v_i)|,                             if v_i ∈ u and v_{i+1} ∈ u;
    (1 - α^(n-1)) · e(v_i, v_{i+1}) / Σ_{v_j} e(v_i, v_j),    if v_i ∈ u and v_{i+1} ∈ m.        (1)
where v_i denotes the current node; v_{i+1} denotes the next node of the walk; m denotes the item type; u denotes the user type; α ∈ [0,1] is the initial probability of selecting the user type; e(v_i, v_j) denotes the weight of the edge between nodes v_i and v_j; |N_{i+1}(v_i)| denotes the number of social friends of node v_i; and n is the number of consecutive visits to user-type nodes;
s212: acquiring the set U of all N user ids, and performing the following for each user id in U: starting from that user's node, walking along the edges of the graph according to the designed walking rule; the first step moves to a neighbour of the user node, the second step to a neighbour of that neighbour, and so on, each step sampling according to the designed probabilities, until the specified walk length L is reached, where L is set according to the complexity of the heterogeneous information network, thereby obtaining N node sequences of length L;
s213: repeating the operation of step S212 W times so that the heterogeneous graph is sampled comprehensively enough, where W is set according to the complexity of the heterogeneous information network, thereby obtaining W × N node sequences of length L, namely the corpus obtained by walk sampling on the heterogeneous graph.
5. The Bayesian-ranking-based recommendation method fusing social networks according to claim 4, wherein in step S22 the node sequence corpus is input into a Skip-Gram neural network for embedding representation learning, and training yields a vector representation of each node; for each current node v_k, the objective function to be optimized is
max_θ Σ_{v_m ∈ C(v_k)} log p(v_m | v_k; θ),
where C(v_k) denotes the set of nodes within a window of w positions before and after node v_k in a sequence, V denotes the nodes of the corpus, and p(v_m | v_k; θ) is a Softmax function, specifically
p(v_m | v_k; θ) = exp(y_{v_m} · y_{v_k}) / Σ_{v ∈ V_n} exp(y_v · y_{v_k}),
where θ is a weight parameter, V_n denotes the node type, and y_v is the embedding vector of node v.
6. The Bayesian-ranking-based recommendation method fusing social networks according to claim 5, wherein in step S23 the similarity between users' embedding vectors is calculated and used to measure how similar the users' preferences are; the similarity between two vectors is the cosine similarity
sim(u1, u2) = (y_{u1} · y_{u2}) / (‖y_{u1}‖ · ‖y_{u2}‖);
for each user, this formula is used to compute the similarity to all other users, the Top-K users with the highest similarity are taken as that user's implicit friends, and finally each user's Top-K implicit friends are obtained.
7. The Bayesian-ranking-based converged social network recommendation method according to claim 6, wherein the step S3 is implemented by:
s31: the items the user has interacted with are divided into 3 levels according to the scores the user gave them: like items P_U, mediocre items O_U, and aversion items N_U; items are rated on a 5-point scale: if the user scored the item 4-5, the user is considered to like it and it is placed in the user's like items; if the score is 3, the user is considered to find the item mediocre and it is placed in the user's mediocre items; if the score is 1-2, the user is considered to dislike the item and it is placed in the user's aversion items;
s32: the items the user has not interacted with are divided into 3 levels according to the evaluations of the user's implicit friends: implicit friend like items PS_U, implicit friend aversion items NS_U, and remaining items E_U; items that some implicit friend of the user has watched and liked, and that do not belong to P_U, O_U or N_U, are placed in the user's implicit friend like items; items that some implicit friend has watched with a score of at most 3, and that do not belong to P_U, O_U, N_U or PS_U, are placed in the user's implicit friend aversion items; the items belonging to none of P_U, O_U, N_U, PS_U and NS_U are placed in the remaining items;
s33: through these two classification steps, each user has 6 mutually exclusive classes of items; P_U + O_U + N_U + PS_U + NS_U + E_U is the set of all items, and P_U, O_U, N_U, PS_U, NS_U and E_U are pairwise independent and do not intersect.
8. The Bayesian-ranking-based converged social network recommendation method according to claim 7, wherein,
like items P_U: for each user u, P_U denotes the items that user u has viewed and liked;
mediocre items O_U: O_U denotes the items that user u has viewed and finds mediocre;
aversion items N_U: N_U denotes the items that user u has viewed and dislikes;
implicit friend like items PS_U: PS_U denotes the items that some implicit friend of user u has watched and liked and that do not belong to P_U, O_U or N_U;
implicit friend aversion items NS_U: NS_U denotes the items that some implicit friend of user u has watched with a score of at most 3 and that do not belong to P_U, O_U, N_U or PS_U;
remaining items E_U: the items that user u has not viewed and that do not belong to P_U, O_U, N_U, PS_U or NS_U.
9. The Bayesian-ranking-based converged social network recommendation method according to claim 8, wherein the step S4 is implemented by:
s41: for the fine-grained classification of each user's items into 6 classes, the following assumption is proposed: the user's preference level is assumed to be like > implicit friend like > mediocre > remaining > implicit friend aversion > aversion; this assumption is then converted into a mathematical model:
f: x_ui ≥ x_uj ≥ x_uk ≥ x_ul ≥ x_um ≥ x_un,
where i ∈ P_U, j ∈ PS_U, k ∈ O_U, l ∈ E_U, m ∈ NS_U, n ∈ N_U,
and where x_ui denotes user u's preference for like item i rated by the user, x_uj denotes user u's preference for implicit friend like item j, x_uk denotes user u's preference for mediocre item k rated by the user, x_ul denotes user u's preference for remaining item l, x_um denotes user u's preference for implicit friend aversion item m, and x_un denotes user u's preference for aversion item n rated by the user;
s42: the above basic assumption is used to maximize the AUC; the larger the AUC value, the larger the joint probability that the assumed order holds; training is carried out with the following Bayesian personalized ranking optimization objective:
max_Θ Σ_{u ∈ U} Σ_{(a,b): x_ua ≻ x_ub} ln σ(x_ua - x_ub) - λ_Θ ‖Θ‖²,
where σ is the logistic sigmoid function, Θ denotes the model parameters, λ_Θ is a regularization coefficient, and the inner sum runs over the item pairs (a, b) whose relative order is fixed by the assumed preference levels;
when the optimization target reaches the maximum, an item list ordered by each user according to the preference degree can be obtained;
s43: the item ordered list results of each user are stored in a database for easy query.
10. The Bayesian-ranking-based recommendation method for converged social networks according to claim 9, wherein in step S5, when a user logs in to the platform, the system reads the user's id information, retrieves the user's item recommendation list from the offline database according to that id, and feeds back the Top-N items at the head of the list to the user.
CN202011435734.4A 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting Active CN112650920B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011435734.4A CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011435734.4A CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Publications (2)

Publication Number Publication Date
CN112650920A true CN112650920A (en) 2021-04-13
CN112650920B CN112650920B (en) 2022-11-11

Family

ID=75350667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011435734.4A Active CN112650920B (en) 2020-12-10 2020-12-10 Recommendation method fusing social networks based on Bayesian sorting

Country Status (1)

Country Link
CN (1) CN112650920B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934071A (en) * 2017-04-27 2017-07-07 北京大学 Recommendation method and device based on Heterogeneous Information network and Bayes's personalized ordering
CN107403390A (en) * 2017-08-02 2017-11-28 桂林电子科技大学 A kind of friend recommendation method for merging Bayesian inference and the upper random walk of figure
CN109726747A (en) * 2018-12-20 2019-05-07 西安电子科技大学 Recommend the data fusion sort method of platform based on social networks
CN110910218A (en) * 2019-11-21 2020-03-24 南京邮电大学 Multi-behavior migration recommendation method based on deep learning
CN111428147A (en) * 2020-03-25 2020-07-17 合肥工业大学 Social recommendation method of heterogeneous graph volume network combining social and interest information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHENGSHEN JIANG ET AL.: "Recommendation in heterogeneous information networks based on generalized random walk model and bayesian personalized ranking", WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining *
ZHENGSHEN JIANG ET AL.: "Recommendation in heterogeneous information networks based on generalized random walk model and bayesian personalized ranking", WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 28 February 2018 (2018-02-28), pages 288-296, XP058386564, DOI: 10.1145/3159652.3159715 *
尚燕敏 et al.: "Event recommendation based on heterogeneous social network information and content information" (基于异构社交网络信息和内容信息的事件推荐), Journal of Software (《软件学报》) *
尚燕敏 et al.: "Event recommendation based on heterogeneous social network information and content information" (基于异构社交网络信息和内容信息的事件推荐), Journal of Software (《软件学报》), vol. 31, no. 4, 30 April 2020 (2020-04-30), pages 1212-1223 *

Also Published As

Publication number Publication date
CN112650920B (en) 2022-11-11

Similar Documents

Publication Publication Date Title
CN111428147B (en) Social recommendation method of heterogeneous graph volume network combining social and interest information
Feng et al. Personalized recommendations based on time-weighted overlapping community detection
Kuo et al. Integration of ART2 neural network and genetic K-means algorithm for analyzing Web browsing paths in electronic commerce
CN104462383B (en) A kind of film based on a variety of behavior feedbacks of user recommends method
Forouzandeh et al. Presentation a Trust Walker for rating prediction in recommender system with Biased Random Walk: Effects of H-index centrality, similarity in items and friends
CN108920527A (en) A kind of personalized recommendation method of knowledge based map
Bok et al. Social group recommendation based on dynamic profiles and collaborative filtering
CN109190030B (en) Implicit feedback recommendation method fusing node2vec and deep neural network
CN106296312A (en) Online education resource recommendation system based on social media
Zhang et al. TopRec: domain-specific recommendation through community topic mining in social network
Arevalillo-Herráez et al. Distance-based relevance feedback using a hybrid interactive genetic algorithm for image retrieval
Bidoki et al. A3CRank: An adaptive ranking method based on connectivity, content and click-through data
Park et al. Uniwalk: Explainable and accurate recommendation for rating and network data
CN108470075A (en) A kind of socialization recommendation method of sequencing-oriented prediction
US20200226493A1 (en) Apparatus and Method for Training a Similarity Model Used to Predict Similarity Between Items
CN107391582A (en) The information recommendation method of user preference similarity is calculated based on context ontology tree
CN111475724A (en) Random walk social network event recommendation method based on user similarity
Sun et al. Leveraging friend and group information to improve social recommender system
Wang et al. Link prediction in heterogeneous collaboration networks
Zhang et al. Recommendation over a heterogeneous social network
Najafabadi et al. Tag recommendation model using feature learning via word embedding
CN112883289B (en) PMF recommendation method based on social trust and tag semantic similarity
CN107784095B (en) Learning resource automatic recommendation method based on mobile learning
Bozyiğit et al. Collaborative filtering based course recommender using OWA operators
CN112650920B (en) Recommendation method fusing social networks based on Bayesian sorting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant