CN117171447A - Online interest group recommendation method based on self-attention and contrast learning - Google Patents
Online interest group recommendation method based on self-attention and contrast learning
- Publication number
- CN117171447A (application CN202310432747.3A)
- Authority
- CN
- China
- Prior art keywords
- user
- interest group
- interest
- online
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention discloses an online interest group recommendation method based on self-attention and contrastive learning, which recommends online interest groups to a user through an interest group recommendation model, SCL4GR. The model consists of a node embedding module, a sequence coding module, a recommendation generation module and a contrastive learning module. The recommendation method comprises the following steps: first, three social networks are constructed, and characterization vectors of the users and the online interest groups are obtained with a graph convolutional network (GCN); next, the characterization vectors are fed into a Transformer model to capture the pattern of changes in user interest; finally, the recommendation generation module concatenates the dynamic interests with the user characterization vector to obtain the user's current preference, inputs the preference into a multi-layer fully connected neural network with a softmax function to obtain a probability distribution over all candidate interest groups, and recommends online interest groups to the user accordingly. Compared with traditional methods such as recurrent neural networks and Markov chains, the method noticeably improves the adaptability and execution efficiency of the model.
Description
Technical Field
The invention relates to the technical field of online interest group recommendation, in particular to an online interest group recommendation method based on self-attention and contrast learning.
Background
Event-Based Social Networks (EBSNs), such as Meetup.com in the United States and Douban in China, have become popular over the past few years and offer convenience to users who interact both in online interest groups and in offline activities (also called events). As of February 2023, Meetup, one of the largest EBSNs, had over 50 million users, who created over 10,000 events per week in 190 countries through the platform.
Since Liu et al. [1] first proposed the EBSN concept, extensive research has been conducted on different types of recommendation problems in EBSNs [2]. Event recommendation and interest group recommendation are the two main focuses. Event recommendation aims to recommend interesting events to individual users, while interest group recommendation focuses on how to recommend events to groups of users who share a common interest. Other types of recommendation also exist, such as venue recommendation [3], event-partner pair recommendation [4], and bilateral recommendation [5]. Although much has been achieved in the field of EBSN recommendation, one recommendation problem is still neglected: online interest group recommendation. Unlike traditional interest group recommendation, online interest group recommendation aims to recommend online interest groups to individual users.
On an EBSN platform, a user who wants to attend an event must first join the online interest group that published the event. A user's interests may change over time, which creates the need to join new interest groups. Joining new interest groups can satisfy users' psychological needs to make new friends, explore new things, and pursue their passions. However, there are hundreds of thousands of online interest groups on the platform, and it is not easy for a user to find a suitable one. Helping users solve this problem is therefore of great practical significance for improving their satisfaction with and loyalty to the platform.
In general, solving the online interest group recommendation problem faces three challenges. 1) Dynamic interest. A user's interest in joining interest groups may change over time, so an effective method is needed to capture the pattern of change. 2) Sparse supervisory signals. Most models perform recommendation tasks under a supervised learning paradigm. An EBSN typically has hundreds of thousands of online interest groups, but most users join only a few, which makes the interaction data extremely sparse. The lack of sufficient training data prevents a model from reaching its full performance. 3) Heterogeneous social networks. Unlike users of traditional social networks (e.g., Facebook, Twitter), EBSN users have both an online and an offline social network. The online network is formed by users joining online interest groups, while the offline network is formed by their participation in offline events. Social activities in the two networks interact with and reinforce each other. Modeling the collaborative associations between these networks is critical to improving the performance of online interest group recommendation.
To address the above challenges, a novel online interest group recommendation model based on self-attention and contrastive learning, named SCL4GR, is presented herein. First, considering that the historical interest group sequence reflects the dynamic changes of user interest, and inspired by the great success of the Transformer sequence model in machine translation [6], the self-attention mechanism of the Transformer is used to capture the transition pattern of user interest. Compared with traditional sequence models such as recurrent neural networks and Markov chains, Transformer-based models offer a clear improvement in model adaptability and execution efficiency. Second, to address the sparsity of supervisory signals, a learning paradigm based on contrastive self-supervised learning is presented, motivated by the successful application of self-supervised learning (SSL) in computer vision [7] and natural language processing [8]. In this paradigm, the classical supervised task is complemented by auxiliary self-supervised tasks that enhance user/group/event characterization learning through self-discrimination. Finally, to model the collaborative associations of different social networks, different social views of an instance (e.g., a user, interest group, or event) are generated, and the collaborative associations are captured through contrastive self-supervised learning. Specifically, three social networks are first built: an online network, an offline network, and an integrated network (a combination of the first two). These three networks are treated as different views of the user's social relationships. A graph convolutional network (GCN) [9] is then applied to each of the three networks to obtain view-aware representations of the instances in that network. Finally, the views of the same instance are pulled closer together in the embedding space, which encourages the unique information contained in each view to be distilled into the other views; at the same time, the views of different instances are pushed apart, which enhances the discriminative power of the characterization.
Reference is made to:
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an online interest group recommendation method based on self-attention and contrastive learning. Three social networks, namely an online network, an offline network and an integrated network (a combination of the first two), are first constructed and regarded as different views of the user's social relationships. A graph convolutional network (GCN) [9] is then applied to each of the three networks to obtain view-aware representations of the instances in that network. Finally, the views of the same instance are pulled closer together in the embedding space, which encourages the unique information contained in each view to be distilled into the other views, while the views of different instances are pushed apart, thereby enhancing the discriminative power of the characterization.
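The construction of the three networks can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_networks`, the pair-list input format and the node identifiers are assumptions, and the integrated network is taken to be the plain union of the online and offline edges.

```python
from collections import defaultdict

def build_networks(memberships, attendances):
    """Build adjacency sets for the online, offline and integrated networks.

    memberships: iterable of (user, group) pairs (user joined an online group)
    attendances: iterable of (user, event) pairs (user attended an offline event)
    Node identifiers are assumed to be unique across users, groups and events.
    """
    online = defaultdict(set)
    offline = defaultdict(set)
    for user, group in memberships:
        online[user].add(group)
        online[group].add(user)
    for user, event in attendances:
        offline[user].add(event)
        offline[event].add(user)
    # Integrated network (assumption): union of the edges of both networks.
    integrated = defaultdict(set)
    for net in (online, offline):
        for node, neighbours in net.items():
            integrated[node] |= neighbours
    return dict(online), dict(offline), dict(integrated)

memberships = [("u1", "g1"), ("u2", "g1"), ("u2", "g2")]
attendances = [("u1", "e1"), ("u2", "e1")]
online, offline, integrated = build_networks(memberships, attendances)
```

Only user nodes appear in both the online and the offline network, which is why the online/offline contrast described later is restricted to users.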
In order to achieve the above purpose, the invention adopts the following technical scheme:
an online interest group recommendation method based on self-attention and contrastive learning recommends an online interest group to a user through an interest group recommendation model (named SCL4GR), wherein the interest group recommendation model consists of a node embedding module, a sequence coding module, a recommendation generation module and a contrastive learning module;
the node embedding module converts each node of the online network into a vector through a graph convolutional network (GCN), encoding the node's feature and structure information to obtain the characterization vectors of the user and the online interest group;
the sequence coding module feeds the characterization vectors of the converted online network nodes into a Transformer model to capture the pattern of the user's interest changes and obtain the dynamic interest hidden in the user behavior sequence;
the recommendation generation module concatenates the dynamic interest and the user characterization vector to obtain the current preference of the user, and inputs the preference into a multi-layer fully connected neural network with a softmax function to obtain a probability distribution over all candidate interest groups;
the contrastive learning module treats the different social networks as different views of the nodes and strengthens the supervised task through contrastive learning tasks;
the method for recommending the online interest group to the user through the interest group recommendation model comprises the following steps:
step 1, constructing an online network, an offline network and an integrated network as different views of the user's social relationships, acquiring all node information in the user's online network and inputting it into the node embedding module, which converts each node of the online network into a vector and encodes the node's feature and structure information to obtain the characterization vectors of the user and the online interest group;
step 2, feeding the characterization vectors of the user and the online interest group obtained after node conversion into the sequence coding module to capture the pattern of the user's interest changes and obtain the dynamic interest hidden in the user behavior sequence;
step 3, concatenating the dynamic interest and the user characterization vector with the recommendation generation module to obtain the current preference of the user, inputting the preference into a multi-layer fully connected neural network with a softmax function to obtain a probability distribution over all candidate interest groups, and recommending the corresponding online interest groups to the user.
Specifically, in step 1, the node embedding module converts each node of the online network into a vector, and encodes the feature and structure information thereof to obtain the characterization of the user and the online interest group, and the specific process is as follows:
let the online network be denoted as $G_{on} = \langle U, G, A_{on} \rangle$, where $U = \{u_1, u_2, \dots, u_n\}$ is the user set; $G = \{g_1, g_2, \dots, g_m\}$ is the online interest group set; $E = \{e_1, e_2, \dots, e_k\}$ is the event set; and $A_{on}$ is the edge set: when a user joins an online interest group, an edge connects the user and the interest group;
let the embedding vector of user u be $e_u^{(0)} \in \mathbb{R}^d$ and the embedding vector of interest group g be $e_g^{(0)} \in \mathbb{R}^d$, where d is the vector dimension and the superscript (0) marks the initial vector;
because the interest groups a user joins reflect the user's interests, and the users who join an interest group can in turn serve as features of that group for measuring the similarity of two interest groups, this property is used to pass information between connected users and interest groups; through the graph convolutional network (GCN), the neighbors of user node u are aggregated by an aggregation function to obtain the characterization vector of the user after the first GCN layer:
in the above formula, $e_u^{(1)}$ is the characterization vector of user u after the first GCN layer; N(u) is the set containing the first-order neighbors of user u and user u itself; $d_u$ is the degree of user node u plus 1; $d_g$ is the degree of node g; $W^{(1)} \in \mathbb{R}^{d' \times d}$ is the weight matrix used to extract information; d' is the transformed dimension; $b^{(1)}$ is the bias; and σ(·) is the activation function;
similarly, for an interest group node g, its neighbors are aggregated by an aggregation function through the GCN to obtain the characterization vector of interest group g after the first GCN layer;
because high-order correlation information has an important impact on evaluating the correlation between users and interest groups, it is often necessary to stack multiple GCN layers to obtain final characterization vectors for user u and interest group g;
assume that l GCN layers are stacked, so that a user or an interest group can receive information propagated from its l-hop neighbors; at the l-th step, the characterization of user u is computed iteratively by the following formula:
where $W^{(l)} \in \mathbb{R}^{d_l \times d_{l-1}}$ is a trainable weight matrix; $d_l$ and $d_{l-1}$ are the transformed dimensions of layer l and layer l-1, respectively; and $e_g^{(l-1)}$ is the representation of the interest group output by layer l-1;
similarly, the characterization vector of interest group g at the l-th GCN layer is calculated.
By stacking multiple GCN layers, the output of the last GCN layer is taken as the final characterization vector of user u and of interest group g, respectively.
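The neighborhood aggregation described above can be sketched in plain Python. The exact propagation formula is not legible in this text, so the sketch assumes the common GCN form with self-loops, symmetric normalization by the square root of the product of the two node degrees (each degree incremented by 1), a linear transform and a ReLU activation. The names `gcn_layer` and `gcn` are illustrative.

```python
import math

def gcn_layer(neighbours, features, weight, bias):
    """One GCN propagation step with self-loops and symmetric normalisation.

    neighbours: dict node -> set of adjacent nodes (no self-loops)
    features:   dict node -> list of floats (length d)
    weight:     d_out x d matrix as a list of rows; bias: list of length d_out
    """
    deg = {v: len(neighbours[v]) + 1 for v in neighbours}  # degree plus 1
    out = {}
    for v in neighbours:
        d_in = len(features[v])
        agg = [0.0] * d_in
        for n in list(neighbours[v]) + [v]:  # N(v): neighbours and v itself
            coeff = 1.0 / math.sqrt(deg[v] * deg[n])
            for i in range(d_in):
                agg[i] += coeff * features[n][i]
        # Linear transform followed by ReLU (the choice of ReLU is an assumption).
        out[v] = [max(0.0, sum(w * a for w, a in zip(row, agg)) + b)
                  for row, b in zip(weight, bias)]
    return out

def gcn(neighbours, features, layers):
    """Stack several layers; the output of the last one is the final vector."""
    for weight, bias in layers:
        features = gcn_layer(neighbours, features, weight, bias)
    return features

neighbours = {"u1": {"g1"}, "u2": {"g1"}, "g1": {"u1", "u2"}}
features = {"u1": [1.0, 0.0], "u2": [1.0, 1.0], "g1": [0.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
one_hop = gcn_layer(neighbours, features, identity, [0.0, 0.0])
two_hop = gcn(neighbours, features, [(identity, [0.0, 0.0])] * 2)
```

With two stacked layers, `u1` receives information from `u2` through the shared group `g1`, which is the higher-order correlation the text refers to.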
Specifically, in step 2, the characterization vector of the online interest group obtained after the node conversion is input to a sequence encoding module to capture the pattern of interest change of the user, so as to obtain the dynamic interest hidden in the user behavior sequence, and the specific process is as follows:
let $S_u = [g_{u,1}, \dots, g_{u,i}, \dots, g_{u,t}]$ denote the sequence of online interest groups that user u has interacted with. To exploit the order information of the input sequence, a position vector is added to each interest group characterization vector by positional encoding. The Transformer encoder then computes, at each Transformer layer, the interest group characterization vectors of all positions simultaneously and stacks them into a matrix $G^{(l)} \in \mathbb{R}^{t \times d}$; the attention weights of the user over the interest groups are computed to obtain the dynamic interests hidden in the user behavior sequence;
the stacked matrix of interest group characterization vectors at each layer, $G^{(l)} \in \mathbb{R}^{t \times d}$, is calculated as follows:
$B^{(l-1)} = \mathrm{LN}(G^{(l-1)} + \mathrm{Dropout}(\mathrm{MH}(G^{(l-1)})))$ (3)
$\mathrm{Trm}(G^{(l-1)}) = \mathrm{LN}(B^{(l-1)} + \mathrm{Dropout}(\mathrm{PFFN}(B^{(l-1)})))$ (4)
where d is the transformation dimension; $l \in \{1, \dots, L\}$ indexes the layer; LN(·), Dropout(·) and MH(·) denote layer normalization, dropout and multi-head attention, respectively; and PFFN(·) denotes the position-wise feed-forward network.
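Equations (3) and (4) can be illustrated with a small pure-Python sketch. Several simplifications are assumptions made only for brevity: attention uses a single head with identity projections instead of learned multi-head projections, dropout is treated as the identity (as at inference time), layer normalization has no learned scale or shift, the feed-forward network uses ReLU, and position vectors are taken to be already added to the rows of `G`.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalise one vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(G):
    """Single-head scaled dot-product self-attention (identity projections)."""
    d = len(G[0])
    out = []
    for q in G:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                           for k in G])
        out.append([sum(w * v[i] for w, v in zip(weights, G)) for i in range(d)])
    return out

def pffn(x, W1, b1, W2, b2):
    """Position-wise feed-forward network applied to one position."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def transformer_layer(G, W1, b1, W2, b2):
    """Residual + layer norm around attention (3), then around the PFFN (4)."""
    attended = self_attention(G)
    B = [layer_norm([g + a for g, a in zip(g_row, a_row)])
         for g_row, a_row in zip(G, attended)]
    return [layer_norm([b + f for b, f in zip(b_row, pffn(b_row, W1, b1, W2, b2))])
            for b_row in B]

# Three positions (t = 3) with dimension d = 4; the values are arbitrary.
G = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0], [2.0, 2.0, 0.0, 1.0]]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zeros = [0.0] * 4
G_next = transformer_layer(G, identity, zeros, identity, zeros)
```

Because every position attends to every other position, the layer is bidirectional, which matches the Cloze-style training described later in the text.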
Specifically, in step 3, the recommendation generation module is used to splice the dynamic interests and the user characterization vector to obtain the current preference of the user, and the preference is input into the multi-layer fully-connected neural network with the softmax function to obtain the probability distribution of all candidate interest groups, so as to recommend the online interest groups to the user, and the specific process is as follows:
to learn more complex transition patterns, multiple Transformer layers are typically stacked; let $G^{(L)}$ be the final output matrix of all interest groups in the user behavior sequence, obtained after L Transformer layers.
Suppose that the interest group $g_t$ at step t is masked; the interest group characterization obtained at position t of the final output matrix then serves as the predicted characterization of the masked interest group $g_t$. Because this characterization encodes only the dynamic interest preference of user u and does not account for the user's own characteristics, such as ID, age and gender, it is further concatenated with the user characterization vector to obtain the final characterization of the user's preference at time step t.
The user preference characterization is input into a K-layer fully connected network with ReLU activation functions to generate the probability distribution of user u's interest over the candidate interest groups,
where the weights and biases of the fully connected layers are learnable parameters and d is the dimension of the embedding vector;
and recommending the corresponding online interest group to the user according to the generated probability distribution of interest of the user u to each candidate interest group.
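The recommendation generation step can be sketched as below. The layer shapes, the helper names (`dense`, `recommend`) and the final linear projection before the softmax are assumptions for illustration; the text only states that the concatenated preference vector passes through fully connected layers with ReLU and a softmax over the candidate interest groups.

```python
import math

def dense(x, W, b, relu=True):
    """Fully connected layer; W is a list of rows, b a list of biases."""
    z = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
    return [max(0.0, v) for v in z] if relu else z

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def recommend(dynamic_interest, user_vector, hidden_layers, out_W, out_b, top_k=2):
    """Concatenate, apply ReLU layers, then softmax over candidate groups."""
    h = dynamic_interest + user_vector  # list concatenation
    for W, b in hidden_layers:
        h = dense(h, W, b, relu=True)
    probs = softmax(dense(h, out_W, out_b, relu=False))
    ranked = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    return probs, ranked[:top_k]

identity4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
out_W = [[1.0, 0.0, 0.0, 0.0],   # candidate group 0
         [0.0, 1.0, 0.0, 0.0],   # candidate group 1
         [0.0, 0.0, 1.0, 1.0]]   # candidate group 2
probs, top = recommend([0.2, 0.8], [0.5, 0.1],
                       [(identity4, [0.0] * 4)], out_W, [0.0] * 3)
```

In this toy setup the logits are 0.2, 0.8 and 0.6, so candidate groups 1 and 2 are recommended first.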
Further, step 1 also includes deriving self-supervision signals from the original user behavior sequences by contrastive learning: the online network, the offline network and the integrated network are taken as three different views of the user's social relationships so as to capture the unique information in each; the same instance in different views forms a positive sample pair, and different instances in different views form negative sample pairs; the InfoNCE loss is then used to maximize the agreement between positive pairs and minimize the agreement between negative pairs, so that all views cooperate to maximize model capacity. The specific process is as follows:
consider the contrast between the online and offline network views: since only user nodes are present in both networks, only the contrastive loss between user embeddings, $L_{on\text{-}off}$, is calculated:
where the per-user term is the loss for a user positive sample pair:
where sim(·) is a similarity function measuring the similarity between two vectors; the anchor is the characterization vector of user u in the online network view, and the negative samples are characterization vectors in the offline network view;
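A minimal sketch of an InfoNCE-style contrast between two views is given below. The formulas themselves are not legible in this text, so the cosine similarity, the temperature value and the function names (`info_nce`, `contrast_loss`) are assumptions; only the positive/negative pairing follows the description above.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one anchor; the temperature is an assumed parameter."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

def contrast_loss(view_a, view_b, temperature=0.5):
    """Sum the loss over instances present in both views.

    The same instance in the two views is the positive pair; the other
    instances of view_b act as negatives.
    """
    shared = sorted(set(view_a) & set(view_b))
    total = 0.0
    for u in shared:
        negatives = [view_b[v] for v in shared if v != u]
        total += info_nce(view_a[u], view_b[u], negatives, temperature)
    return total

online_view = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
aligned_offline = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
swapped_offline = {"u1": [0.0, 1.0], "u2": [1.0, 0.0]}
loss_aligned = contrast_loss(online_view, aligned_offline)
loss_swapped = contrast_loss(online_view, swapped_offline)
```

The loss is small when each user's two views agree and large when a user's vector looks more like another user's view, which is the behavior the positive/negative pairing is meant to enforce.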
consider the contrast between the online network view and the integrated network view: the losses over user positive pairs and over interest group positive pairs are first calculated separately and then fused to obtain the contrastive loss between the two views, $L_{on\text{-}int}$:
where the hyperparameter $\lambda_1$ controls the strength of the contrast between interest group characterization vectors;
similarly, for the contrast between the offline network view and the integrated network view, the contrastive losses over user positive pairs and over event positive pairs are calculated separately and fused to obtain the loss $L_{off\text{-}int}$:
where $\lambda_2$ is a hyperparameter;
fusing the contrastive losses under the different views gives the objective function of the self-supervised learning task:
$L_c = L_{on\text{-}off} + \beta_1 L_{on\text{-}int} + \beta_2 L_{off\text{-}int}$ (14)
where $\beta_1$ and $\beta_2$ are hyperparameters.
Further, because a bidirectional Transformer model is used to predict, from the user's interaction sequence, the next interest group the user wants to join, the Cloze (fill-in-the-blank) task is applied to sequential recommendation; the specific process is as follows:
for each input sequence, a number of interest groups are randomly masked, that is, replaced with the special token "[mask]", and the original IDs of the masked interest groups are then predicted based only on their left and right contexts; the hidden vector corresponding to each "[mask]" token is input into a softmax function over the interest group set to obtain the preference probability of the masked interest group, and the final objective function is defined as follows:
where $S_u'$ is the masked version of the interest group sequence $S_u$; $G_u^m$ is the set of masked interest groups; the target at each masked position is the true interest group that was masked; and P(·) is the probability defined by equation (8);
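The Cloze-style masking can be sketched as follows. The masking probability, the rule that at least one position is always masked, and the averaging of the negative log-likelihood over the masked positions are assumptions; the objective formula itself is not legible in this text. The sequence is assumed to be non-empty.

```python
import math
import random

MASK = "[mask]"

def mask_sequence(sequence, mask_prob=0.2, rng=None):
    """Randomly replace interest groups with the mask token.

    Returns the masked sequence and a dict position -> original group.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for pos, group in enumerate(sequence):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[pos] = group
        else:
            masked.append(group)
    if not targets:  # assumption: always mask at least one position
        pos = rng.randrange(len(sequence))
        targets[pos] = masked[pos]
        masked[pos] = MASK
    return masked, targets

def cloze_loss(predicted_probs, targets):
    """Average negative log-likelihood of the true group at masked positions.

    predicted_probs: dict position -> dict group -> probability
    """
    nll = sum(-math.log(predicted_probs[pos][group])
              for pos, group in targets.items())
    return nll / len(targets)

sequence = ["g1", "g2", "g3", "g4"]
masked, targets = mask_sequence(sequence, rng=random.Random(0))
restored = [targets.get(i, g) for i, g in enumerate(masked)]
example_loss = cloze_loss({2: {"g3": 0.5, "g1": 0.5}}, {2: "g3"})
```

At prediction time, a common way to use such a model is to append a single "[mask]" token to the end of the user's sequence and rank the candidate groups by the probability predicted at that position.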
to improve recommendation performance, a multi-task training strategy is adopted in step 3, training the classical recommendation task and the self-supervised learning task jointly:
$L = L_s + \rho L_c$ (16)
where ρ is a hyperparameter controlling the strength of the self-supervised learning task.
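Equations (14) and (16) combine the individual losses linearly. The short sketch below shows the arithmetic; the function names and the numeric loss and hyperparameter values are purely illustrative assumptions.

```python
def self_supervised_loss(l_on_off, l_on_int, l_off_int, beta1, beta2):
    """Equation (14): fuse the contrastive losses of the three view pairs."""
    return l_on_off + beta1 * l_on_int + beta2 * l_off_int

def total_loss(l_s, l_c, rho):
    """Equation (16): recommendation loss plus weighted self-supervised loss."""
    return l_s + rho * l_c

# Illustrative values only; real values come from training.
l_c = self_supervised_loss(0.9, 0.6, 0.4, beta1=0.5, beta2=0.5)
loss = total_loss(l_s=1.2, l_c=l_c, rho=0.1)
```

Setting ρ to zero recovers the purely supervised recommendation task, so ρ can be read as the weight given to the auxiliary signal.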
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a new online interest group recommendation model, SCL4GR, which captures the dynamic interest patterns of users and the high-order social interactions between users by using the Transformer and GCN techniques; compared with traditional sequence models such as recurrent neural networks and Markov chains, the Transformer-based model offers a clear improvement in model adaptability and execution efficiency.
2. The invention provides an online interest group recommendation method based on self-attention and contrastive learning, built on a contrastive self-supervised learning paradigm for the heterogeneous networks of an EBSN; it uses a self-discrimination task as the self-supervised task to provide an auxiliary supervisory signal for characterization learning. Meanwhile, the collaborative associations among the different social networks in the EBSN are modeled through contrastive learning, which promotes mutual enhancement among the different social views.
Drawings
FIG. 1 is a general architecture diagram of an online interest group recommendation model SCL4GR of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by one of ordinary skill in the art without inventive faculty, are intended to be within the scope of the present invention, based on the embodiments of the present invention.
The invention provides an online interest group recommendation method based on self-attention and contrast learning, which is characterized in that an online interest group is recommended to a user through an interest group recommendation model SCL4GR shown in figure 1, wherein the interest group recommendation model consists of a node embedding module, a sequence coding module, a recommendation generating module and a contrast learning module;
the node embedding module converts each node of the online network into a vector through a graph convolutional network (GCN), encoding the node's feature and structure information to obtain the characterization vectors of the user and the online interest group;
the sequence coding module feeds the characterization vectors of the converted online network nodes into a Transformer model to capture the pattern of the user's interest changes and obtain the dynamic interest hidden in the user behavior sequence;
the recommendation generation module concatenates the dynamic interest and the user characterization vector to obtain the current preference of the user, and inputs the preference into a multi-layer fully connected neural network with a softmax function to obtain a probability distribution over all candidate interest groups;
the contrastive learning module treats the different social networks as different views of the nodes and strengthens the supervised task through contrastive learning tasks;
the method for recommending the online interest group to the user through the interest group recommendation model comprises the following steps:
step 1, constructing an online network, an offline network and an integrated network as different views of the user's social relationships, acquiring all node information in the user's online network and inputting it into the node embedding module, which converts each node of the online network into a vector and encodes the node's feature and structure information to obtain the characterization vectors of the user and the online interest group;
step 2, feeding the characterization vectors of the user and the online interest group obtained after node conversion into the sequence coding module to capture the pattern of the user's interest changes and obtain the dynamic interest hidden in the user behavior sequence;
step 3, concatenating the dynamic interest and the user characterization vector with the recommendation generation module to obtain the current preference of the user, inputting the preference into a multi-layer fully connected neural network with a softmax function to obtain a probability distribution over all candidate interest groups, and recommending the corresponding online interest groups to the user.
Specifically, in step 1, the node embedding module converts each node of the online network into a vector, and encodes the feature and structure information thereof to obtain the characterization of the user and the online interest group, and the specific process is as follows:
let the online network be denoted as $G_{on} = \langle U, G, A_{on} \rangle$, where $U = \{u_1, u_2, \dots, u_n\}$ is the user set; $G = \{g_1, g_2, \dots, g_m\}$ is the online interest group set; $E = \{e_1, e_2, \dots, e_k\}$ is the event set; and $A_{on}$ is the edge set: when a user joins an online interest group, an edge connects the user and the interest group;
let the embedding vector of user u be $e_u^{(0)} \in \mathbb{R}^d$ and the embedding vector of interest group g be $e_g^{(0)} \in \mathbb{R}^d$, where d is the vector dimension and the superscript (0) marks the initial vector;
because the interest groups a user joins reflect the user's interests, and the users who join an interest group can in turn serve as features of that group for measuring the similarity of two interest groups, this property is used to pass information between connected users and interest groups; through the graph convolutional network (GCN), the neighbors of user node u are aggregated by an aggregation function to obtain the characterization vector of the user after the first GCN layer:
in the above formula, $e_u^{(1)}$ is the characterization vector of user u after the first GCN layer; N(u) is the set containing the first-order neighbors of user u and user u itself; $d_u$ is the degree of user node u plus 1; $d_g$ is the degree of node g; $W^{(1)} \in \mathbb{R}^{d' \times d}$ is the weight matrix used to extract information; d' is the transformed dimension; $b^{(1)}$ is the bias; and σ(·) is the activation function;
similarly, for an interest group node g, its neighbors are aggregated by an aggregation function through the GCN to obtain the characterization vector of interest group g after the first GCN layer;
because high-order correlation information has an important impact on evaluating the correlation between users and interest groups, it is often necessary to stack multiple GCN layers to obtain final characterization vectors for user u and interest group g;
assuming that $l$ GCN layers are stacked, a user or interest group can receive information propagated from its $l$-hop neighbors; at step $l$, the iterative formula for the characterization of user $u$ is as follows:
where the weight matrix is trainable; $d_l$ and $d_{l-1}$ are the transformed dimensions of layer $l$ and layer $l-1$, respectively; and the interest group representation on the right-hand side is the one output by layer $l-1$;
similarly, the characterization vector of interest group $g$ at the $l$-th GCN layer is calculated.
Multiple GCN layers are stacked, and the outputs of the last GCN layer are taken as the final characterization vectors of user $u$ and interest group $g$, respectively.
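The first-layer aggregation for a user node can be sketched as follows. The formula itself is not reproduced in this text, so the sketch assumes the standard symmetric GCN normalization $1/\sqrt{d_u d_j}$, which is consistent with the definitions of $N(u)$, $d_u$ and $d_g$ given above; the function and argument names (`gcn_user_update`, `groups_of_user`, `users_of_group`) are hypothetical, and ReLU stands in for the unspecified activation $\sigma$.

```python
import math

def matvec(W, x):
    """Multiply a (d' x d) matrix, given as a list of rows, by a length-d vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gcn_user_update(u, groups_of_user, users_of_group, user_emb, group_emb, W, b):
    """One GCN layer for user u. ASSUMPTION: symmetric 1/sqrt(d_u * d_j) normalization."""
    d_u = len(groups_of_user[u]) + 1            # degree of user node u plus 1
    # N(u) contains the user itself and the interest groups it participates in.
    messages = [(user_emb[u], d_u)]
    for g in groups_of_user[u]:
        d_g = len(users_of_group[g])            # degree of interest group node g
        messages.append((group_emb[g], d_g))
    agg = [0.0] * len(W)
    for vec, d_j in messages:
        coeff = 1.0 / math.sqrt(d_u * d_j)
        agg = [a + coeff * t for a, t in zip(agg, matvec(W, vec))]
    # Add the bias, then apply the activation (ReLU is an assumption).
    return [max(0.0, a + bi) for a, bi in zip(agg, b)]

# Toy graph: u1 participates in g1 and g2; g2 also contains u2.
groups_of_user = {"u1": ["g1", "g2"]}
users_of_group = {"g1": ["u1"], "g2": ["u1", "u2"]}
user_emb = {"u1": [1.0, 0.0]}
group_emb = {"g1": [0.0, 1.0], "g2": [1.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
out = gcn_user_update("u1", groups_of_user, users_of_group,
                      user_emb, group_emb, identity, [0.0, 0.0])
```

Stacking several such layers, with the interest group update computed symmetrically, lets each node receive information from multi-hop neighbors, as described above.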
Specifically, in step 2, the characterization vector of the online interest group obtained after the node conversion is input to a sequence encoding module to capture the pattern of interest change of the user, so as to obtain the dynamic interest hidden in the user behavior sequence, and the specific process is as follows:
let $S_u = [g_{u,1}, \ldots, g_{u,i}, \ldots, g_{u,t}]$ denote the sequence of online interest groups that user $u$ has interacted with; to make use of the order information of the input sequence, a position vector is added to each interest group characterization vector by positional encoding; a Transformer encoder then computes the interest group characterization vectors for all positions simultaneously in each Transformer layer, the vectors of all positions are stacked into a matrix, and the attention weights of the user over the interest groups are calculated to obtain the dynamic interests hidden in the user behavior sequence;
for each layer, the stacked matrix of interest group characterization vectors is computed as follows:
$$B^{(l-1)} = \mathrm{LN}\big(G^{(l-1)} + \mathrm{Dropout}(\mathrm{MH}(G^{(l-1)}))\big) \tag{3}$$
$$\mathrm{Trm}(G^{(l-1)}) = \mathrm{LN}\big(B^{(l-1)} + \mathrm{Dropout}(\mathrm{PFFN}(B^{(l-1)}))\big) \tag{4}$$
where $d$ is the transformed dimension; $l \in \{1, \ldots, L\}$ indexes the layer; $\mathrm{LN}(\cdot)$, $\mathrm{Dropout}(\cdot)$ and $\mathrm{MH}(\cdot)$ denote layer normalization, dropout and multi-head attention, respectively; and $\mathrm{PFFN}(\cdot)$ denotes the position-wise feed-forward network.
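The residual structure of equations (3) and (4) can be illustrated with a deliberately reduced sketch. It makes several simplifying assumptions that are not in the source: multi-head attention is replaced by a single head with identity projections, the position-wise feed-forward network is replaced by a per-position ReLU, and dropout is treated as the identity (as at inference time). All function names are hypothetical.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize one position's vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def self_attention(G):
    """Stand-in for MH: one head, identity query/key/value projections."""
    d = len(G[0])
    out = []
    for q in G:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in G]
        w = softmax(scores)  # attention weights of this position over all positions
        out.append([sum(wi * v[j] for wi, v in zip(w, G)) for j in range(d)])
    return out

def pffn(B):
    """Stand-in for PFFN: ReLU applied to each position independently."""
    return [[max(0.0, v) for v in row] for row in B]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trm(G):
    # Eq. (3): B = LN(G + Dropout(MH(G))), with dropout as the identity.
    B = [layer_norm(r) for r in add(G, self_attention(G))]
    # Eq. (4): Trm(G) = LN(B + Dropout(PFFN(B))).
    return [layer_norm(r) for r in add(B, pffn(B))]

# Two positions (interest groups), each a 3-dimensional vector.
H = trm([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])
```

Applying `trm` repeatedly corresponds to stacking layers; a full implementation would use learned projections in each sub-layer.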
Specifically, in step 3, the recommendation generation module is used to splice the dynamic interests and the user characterization vector to obtain the current preference of the user, and the preference is input into the multi-layer fully-connected neural network with the softmax function to obtain the probability distribution of all candidate interest groups, so as to recommend the online interest groups to the user, and the specific process is as follows:
to learn more complex transition patterns, multiple Transformer layers are typically stacked; after $L$ Transformer layers, the final output matrix for all interest groups in the user behavior sequence is obtained, whose row for step $t$ is $h_{u,t}^{L}$.
Suppose the interest group $g_t$ at step $t$ is masked; the interest group characterization $h_{u,t}^{L}$ obtained at step $t$ is then taken as the predicted characterization of the masked interest group $g_t$; because $h_{u,t}^{L}$ encodes only the dynamic interest preferences of user $u$ and does not account for the user's own features, such as the user's ID, age and gender, $h_{u,t}^{L}$ is further concatenated with the user characterization vector to obtain the final characterization of the user's preference at time step $t$.
The user preference characterization is input into a $K$-layer fully connected network with the ReLU activation function to generate the probability distribution of user $u$'s interest over the candidate interest groups,
where the network parameters are learnable and $d$ is the dimension of the embedding vector;
and recommending the corresponding online interest group to the user according to the generated probability distribution of interest of the user u to each candidate interest group.
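The recommendation step (concatenation, fully connected layers, softmax) can be sketched as below. The corresponding formulas are not reproduced in this text, so the layer layout is an assumption: hidden layers use ReLU and a final linear layer feeds the softmax. The names `predict`, `hidden_layers`, `W_out` and `b_out` are hypothetical.

```python
import math

def dense(x, W, b):
    """Affine map: W is a list of rows, b a list of biases."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(h_t, user_vec, hidden_layers, W_out, b_out):
    """Return a probability for every candidate interest group."""
    x = h_t + user_vec                       # concatenate dynamic interest and user vector
    for W, b in hidden_layers:               # fully connected layers with ReLU
        x = [max(0.0, v) for v in dense(x, W, b)]
    return softmax(dense(x, W_out, b_out))   # distribution over candidate groups

# Toy example: 2-d dynamic interest, 2-d user vector, three candidate groups.
h_t = [1.0, 0.0]
user_vec = [0.0, 1.0]
hidden = [([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]], [0.0, 0.0])]
W_out = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = predict(h_t, user_vec, hidden, W_out, [0.0, 0.0, 0.0])
best = max(range(len(probs)), key=lambda i: probs[i])  # index of the recommended group
```

Ranking the candidates by `probs` and returning the top entries gives the recommendation list.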
Further, step 1 further includes deriving self-supervision signals from the original user behavior sequence by contrast learning, capturing unique information of social relationships of users by taking an online network, an offline network and an integrated network as three different views of the social relationships, taking the same instance of the different views as positive sample pairs, taking the different instances of the different views as negative sample pairs, and then maximizing consistency between the positive sample pairs and minimizing consistency between the negative sample pairs by adopting InfoNCE loss, wherein all views cooperate with each other to maximize model capacity, and the specific process is as follows:
considering the contrast between the online network view and the offline network view, since only user nodes are present in both networks at the same time, only the contrast loss $L_{on\text{-}off}$ between user embeddings is calculated:
where the per-user term is the loss computed for that user's positive sample pair:
where $\mathrm{sim}(\cdot)$ is a similarity function measuring the similarity between two vectors; the anchor is the characterization vector of user $u$ in the online network view; and the negatives are negative-sample characterization vectors in the offline network view;
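A minimal sketch of an InfoNCE-style contrast loss between the online and offline views follows. The exact formulas are not reproduced in this text, so several choices are assumptions: cosine similarity for $\mathrm{sim}(\cdot)$, a temperature `tau`, negatives drawn from the other users' offline vectors, and summation over users. Function names are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def info_nce(anchor, positive, negatives, tau=0.2):
    """-log( exp(sim(a,p)/tau) / (exp(sim(a,p)/tau) + sum_n exp(sim(a,n)/tau)) )."""
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

def contrast_loss_on_off(online, offline, tau=0.2):
    """Sum the per-user loss; the same user in both views is the positive pair."""
    users = sorted(online)
    total = 0.0
    for u in users:
        negatives = [offline[v] for v in users if v != u]  # other users, offline view
        total += info_nce(online[u], offline[u], negatives, tau)
    return total

online = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
aligned = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
swapped = {"u1": [0.0, 1.0], "u2": [1.0, 0.0]}
loss_aligned = contrast_loss_on_off(online, aligned)
loss_swapped = contrast_loss_on_off(online, swapped)
```

When the two views agree on each user, the loss is small; when the views are mismatched, it grows, which is the behavior the contrast objective relies on.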
considering the contrast between the online network view and the integrated network view, the losses for the positive user pairs and for the positive interest group pairs are first calculated separately, and the two losses are then fused to obtain the contrast loss $L_{on\text{-}int}$ between the two views:
where the hyperparameter $\lambda_1$ controls the strength of the contrast between interest group characterization vectors;
similarly, considering the contrast between the offline network view and the integrated network view, the contrast losses for the positive user pairs and for the positive event pairs are calculated separately and fused to obtain the loss $L_{off\text{-}int}$:
where $\lambda_2$ is a hyperparameter;
the contrast losses under the different views are fused to obtain the objective function of the self-supervised learning task:
$$L_c = L_{on\text{-}off} + \beta_1 L_{on\text{-}int} + \beta_2 L_{off\text{-}int} \tag{14}$$
where $\beta_1$ and $\beta_2$ are hyperparameters.
Further, because a bidirectional Transformer model is used to predict, from the user's interaction sequence, the next interest group the user wants to join, the cloze (fill-in-the-blank) task is applied to sequential recommendation, and the specific process is as follows:
for each input sequence, a number of interest groups in the sequence are randomly replaced with the "[mask]" token, and the original IDs of the masked interest groups are then predicted based only on their left and right contexts; the hidden vector corresponding to each "[mask]" token is input into a softmax function over the interest group set to obtain the preference probability of the masked interest group, and the final objective function is defined as follows:
where $S_u'$ is the masked version of the interest group sequence $S_u$; $g_u^m$ denotes the set of masked interest groups; the prediction target is the true interest group at each masked position; and $P(\cdot)$ is the probability defined by equation (8);
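The cloze procedure can be sketched as follows. The objective formula is not reproduced in this text, so the sketch assumes the loss is the negative log-likelihood of the true interest groups averaged over the masked positions; `mask_sequence`, `cloze_loss`, `mask_prob` and `probs_at` are hypothetical names, and the probabilities are supplied by hand rather than by a trained model.

```python
import math
import random

MASK = "[mask]"

def mask_sequence(seq, mask_prob, rng):
    """Randomly replace items with the mask token; remember the true item per position."""
    masked, targets = [], {}
    for i, g in enumerate(seq):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = g
        else:
            masked.append(g)
    return masked, targets

def cloze_loss(targets, probs_at):
    """Average negative log-likelihood of the true groups at the masked positions.

    probs_at[i] maps each candidate interest group to its predicted probability
    at position i (in a real model, the softmax output for that position).
    """
    if not targets:
        return 0.0
    nll = -sum(math.log(probs_at[i][g]) for i, g in targets.items())
    return nll / len(targets)

rng = random.Random(0)  # fixed seed so the masking is reproducible
masked, targets = mask_sequence(["g1", "g2", "g3", "g4"], 0.5, rng)

# Hand-written prediction for a single masked position.
loss = cloze_loss({1: "g2"}, {1: {"g2": 0.5, "g3": 0.5}})
```

At inference time, a common choice with this kind of objective is to append a single "[mask]" token to the end of the sequence and rank the candidates by the probability predicted at that position; whether the patent does exactly this is not stated here.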
to improve recommendation performance, a multi-task training strategy is adopted in step 3, and the classical recommendation task and the self-supervised learning task are trained jointly:
$$L = L_s + \rho L_c \tag{16}$$
where $\rho$ is a hyperparameter controlling the strength of the self-supervised learning task.
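Equations (14) and (16) combine into a single scalar training objective. A small sketch with made-up loss values (the numbers and the function name `total_loss` are purely illustrative):

```python
def total_loss(l_s, l_on_off, l_on_int, l_off_int, beta1, beta2, rho):
    """Combine the recommendation loss with the fused contrast loss."""
    l_c = l_on_off + beta1 * l_on_int + beta2 * l_off_int   # Eq. (14)
    return l_s + rho * l_c                                   # Eq. (16)

# Illustrative values only: L_c = 0.5 + 0.5*0.2 + 0.25*0.4 = 0.7, L = 1.0 + 0.1*0.7.
loss = total_loss(l_s=1.0, l_on_off=0.5, l_on_int=0.2, l_off_int=0.4,
                  beta1=0.5, beta2=0.25, rho=0.1)
```

Setting `rho` to zero recovers the plain recommendation objective, which makes it easy to check how much the contrast task contributes.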
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. An online interest group recommending method based on self-attention and contrast learning is characterized in that an online interest group is recommended to a user through an interest group recommending model, wherein the interest group recommending model consists of a node embedding module, a sequence coding module, a recommending generating module and a contrast learning module;
the node embedding module converts each node of the online network into a vector through a graph convolutional network (GCN), and encodes the feature and structure information of the node to obtain the characterization vectors of the user and the online interest group;
the sequence coding module inputs the converted online network node embeddings into a Transformer model to capture the pattern of change in the user's interests and obtain the dynamic interest hidden in the user behavior sequence;
the recommendation generation module is used for splicing the dynamic interests and the user characterization vectors to obtain the current preference of the user, and inputting the preference into the multi-layer fully-connected neural network with the softmax function to obtain probability distribution of all candidate interest groups;
the contrast learning module is used for strengthening the supervision task through contrast learning tasks by taking different social networks as different views of the nodes;
the method for recommending the online interest group to the user through the interest group recommendation model comprises the following steps:
step 1, constructing an online network, an offline network and an integrated network according to different views of the social relationship of a user, acquiring all node information in the online network of the user, inputting the node information into a node embedding module, converting each node of the online network into a vector by the node embedding module, and encoding the characteristic and structure information of the vector to obtain characterization vectors of the user and an online interest group;
step 2, inputting the characterization vectors of the user and the online interest group obtained after node conversion into the sequence coding module to capture the pattern of change in the user's interests and obtain the dynamic interest hidden in the user behavior sequence;
and 3, splicing the dynamic interests and the user characterization vectors by using a recommendation generation module to obtain the current preference of the user, inputting the preference into a multi-layer fully-connected neural network with a softmax function to obtain probability distribution of all candidate interest groups, and recommending corresponding online interest groups to the user.
2. The online interest group recommendation method based on self-attention and contrast learning according to claim 1, wherein in step 1, the node embedding module converts each node of the online network into a vector, and encodes the feature and structure information thereof to obtain the characterization of the user and the online interest group, and the specific process is as follows:
let the online network be denoted as $G_{on} = \langle U, G, A_{on} \rangle$, where $U = \{u_1, u_2, \ldots, u_n\}$ is the user set; $G = \{g_1, g_2, \ldots, g_m\}$ is the set of online interest groups; $E = \{e_1, e_2, \ldots, e_k\}$ is the event set; and $A_{on}$ is the edge set, in which an edge connects a user and an interest group whenever the user participates in that online interest group;
let the initial embedding vectors of user $u$ and of interest group $g$ each be $d$-dimensional real vectors, where $d$ is the vector dimension and the superscript $(0)$ marks an initial vector;
because the interest groups a user participates in reflect the user's interests, and the users participating in an interest group can in turn serve as features of that group for measuring the similarity of two interest groups, this property is used to propagate information between connected users and interest groups; through the graph convolutional network (GCN), an aggregation function aggregates the surrounding neighbors of user node $u$ to obtain the characterization vector of the user after the first GCN layer:
in the above formula, the left-hand side is the characterization vector of the user after the first GCN layer; $N(u)$ is the set consisting of the first-order neighbors of user $u$ and user $u$ itself; $d_u$ is the degree of user node $u$ plus 1; $d_g$ is the degree of node $g$; the weight matrix extracts information and maps to the transformed dimension $d'$; a bias term is also included; and $\sigma(\cdot)$ is the activation function;
similarly, for interest group node $g$, the graph convolutional network (GCN) aggregates its surrounding neighbors with the aggregation function to obtain the characterization vector of interest group $g$ after the first GCN layer;
because high-order correlation information has an important impact on evaluating the correlation between users and interest groups, it is often necessary to stack multiple GCN layers to obtain final characterization vectors for user u and interest group g;
assuming that $l$ GCN layers are stacked, a user or interest group can receive information propagated from its $l$-hop neighbors; at step $l$, the iterative formula for the characterization of user $u$ is as follows:
where the weight matrix is trainable; $d_l$ and $d_{l-1}$ are the transformed dimensions of layer $l$ and layer $l-1$, respectively; and the interest group representation on the right-hand side is the one output by layer $l-1$;
similarly, the characterization vector of interest group $g$ at the $l$-th GCN layer is calculated;
multiple GCN layers are stacked, and the outputs of the last GCN layer are taken as the final characterization vectors of user $u$ and interest group $g$, respectively.
3. The online interest group recommendation method based on self-attention and contrast learning according to claim 1, wherein in step 2, the characterization vector of the online interest group obtained after node conversion is input to a sequence encoding module to capture the pattern of user interest change, and the specific process is as follows:
let $S_u = [g_{u,1}, \ldots, g_{u,i}, \ldots, g_{u,t}]$ denote the sequence of online interest groups that user $u$ has interacted with; to make use of the order information of the input sequence, a position vector is added to each interest group characterization vector by positional encoding; a Transformer encoder then computes the interest group characterization vectors for all positions simultaneously in each Transformer layer, the vectors of all positions are stacked into a matrix, and the attention weights of the user over the interest groups are calculated to obtain the dynamic interests hidden in the user behavior sequence; for each layer, the stacked matrix of interest group characterization vectors is computed as follows:
$$B^{(l-1)} = \mathrm{LN}\big(G^{(l-1)} + \mathrm{Dropout}(\mathrm{MH}(G^{(l-1)}))\big) \tag{3}$$
$$\mathrm{Trm}(G^{(l-1)}) = \mathrm{LN}\big(B^{(l-1)} + \mathrm{Dropout}(\mathrm{PFFN}(B^{(l-1)}))\big) \tag{4}$$
where $d$ is the transformed dimension; $l \in \{1, \ldots, L\}$ indexes the layer; $\mathrm{LN}(\cdot)$, $\mathrm{Dropout}(\cdot)$ and $\mathrm{MH}(\cdot)$ denote layer normalization, dropout and multi-head attention, respectively; and $\mathrm{PFFN}(\cdot)$ denotes the position-wise feed-forward network.
4. The online interest group recommendation method based on self-attention and contrast learning according to claim 1, wherein in step 3, the recommendation generation module is used for splicing the dynamic interest and the user characterization vector to obtain the current preference of the user, and inputting the preference into a multi-layer fully-connected neural network with a softmax function to obtain probability distribution of all candidate interest groups, so as to recommend the online interest groups to the user, and the specific process is as follows:
to learn more complex transition patterns, multiple Transformer layers are typically stacked; after $L$ Transformer layers, the final output matrix for all interest groups in the user behavior sequence is obtained, whose row for step $t$ is $h_{u,t}^{L}$;
suppose the interest group $g_t$ at step $t$ is masked; the interest group characterization $h_{u,t}^{L}$ obtained at step $t$ is then taken as the predicted characterization of the masked interest group $g_t$; because $h_{u,t}^{L}$ encodes only the dynamic interest preferences of user $u$ and does not account for the user's own features, such as the user's ID, age and gender, $h_{u,t}^{L}$ is further concatenated with the user characterization vector to obtain the final characterization of the user's preference at time step $t$;
the user preference characterization is input into a $K$-layer fully connected network with the ReLU activation function to generate the probability distribution of user $u$'s interest over the candidate interest groups,
where the network parameters are learnable and $d$ is the dimension of the embedding vector;
and recommending the corresponding online interest group to the user according to the generated probability distribution of interest of the user u to each candidate interest group.
5. The online interest group recommendation method based on self-attention and contrast learning of claim 1, further comprising deriving self-supervision signals from the original user behavior sequence using contrast learning, capturing unique information of social relationships of users by using an online network, an offline network and an integrated network as three different views of the social relationships, using the same instance of the different views as positive sample pairs, using different instances of the different views as negative sample pairs, and then using InfoNCE loss to maximize consistency between the positive sample pairs and minimize consistency between the negative sample pairs, all views cooperating with each other to maximize model capacity, comprising:
considering the contrast between the online network view and the offline network view, since only user nodes are present in both networks at the same time, only the contrast loss $L_{on\text{-}off}$ between user embeddings is calculated:
where the per-user term is the loss computed for that user's positive sample pair:
where $\mathrm{sim}(\cdot)$ is a similarity function measuring the similarity between two vectors; the anchor is the characterization vector of user $u$ in the online network view; and the negatives are negative-sample characterization vectors in the offline network view;
considering the contrast between the online network view and the integrated network view, the losses for the positive user pairs and for the positive interest group pairs are first calculated separately, and the two losses are then fused to obtain the contrast loss $L_{on\text{-}int}$ between the two views:
where the hyperparameter $\lambda_1$ controls the strength of the contrast between interest group characterization vectors;
similarly, considering the contrast between the offline network view and the integrated network view, the contrast losses for the positive user pairs and for the positive event pairs are calculated separately and fused to obtain the loss $L_{off\text{-}int}$:
where $\lambda_2$ is a hyperparameter;
the contrast losses under the different views are fused to obtain the objective function of the self-supervised learning task:
$$L_c = L_{on\text{-}off} + \beta_1 L_{on\text{-}int} + \beta_2 L_{off\text{-}int} \tag{14}$$
where $\beta_1$ and $\beta_2$ are hyperparameters.
6. The online interest group recommendation method based on self-attention and contrast learning according to claim 4, wherein, because a bidirectional Transformer model is used to predict, from the user's interaction sequence, the next interest group the user wants to join, the cloze (fill-in-the-blank) task is applied to sequential recommendation, and the specific process is as follows:
for each input sequence, a number of interest groups in the sequence are randomly replaced with the "[mask]" token, and the original IDs of the masked interest groups are then predicted based only on their left and right contexts; the hidden vector corresponding to each "[mask]" token is input into a softmax function over the interest group set to obtain the preference probability of the masked interest group, and the final objective function is defined as follows:
where $S_u'$ is the masked version of the interest group sequence $S_u$; $g_u^m$ denotes the set of masked interest groups; the prediction target is the true interest group at each masked position; and $P(\cdot)$ is the probability defined by equation (8);
to improve recommendation performance, a multi-task training strategy is adopted in step 3, and the classical recommendation task and the self-supervised learning task are trained jointly:
$$L = L_s + \rho L_c \tag{16}$$
where $\rho$ is a hyperparameter controlling the strength of the self-supervised learning task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310432747.3A CN117171447A (en) | 2023-04-21 | 2023-04-21 | Online interest group recommendation method based on self-attention and contrast learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310432747.3A CN117171447A (en) | 2023-04-21 | 2023-04-21 | Online interest group recommendation method based on self-attention and contrast learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117171447A true CN117171447A (en) | 2023-12-05 |
Family
ID=88932427
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310432747.3A Pending CN117171447A (en) | 2023-04-21 | 2023-04-21 | Online interest group recommendation method based on self-attention and contrast learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117171447A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117390295A (en) * | 2023-12-13 | 2024-01-12 | 深圳须弥云图空间科技有限公司 | Method and device for recommending objects based on mask module |
CN117390295B (en) * | 2023-12-13 | 2024-03-15 | 深圳须弥云图空间科技有限公司 | Method and device for recommending objects based on mask module |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114817663B (en) | Service modeling and recommendation method based on class perception graph neural network | |
CN111709518A (en) | Method for enhancing network representation learning based on community perception and relationship attention | |
CN112925977A (en) | Recommendation method based on self-supervision graph representation learning | |
CN110751188B (en) | User label prediction method, system and storage medium based on multi-label learning | |
CN113378047A (en) | Multi-aspect enhancement-based graph neural network recommendation method | |
CN111241394A (en) | Data processing method and device, computer readable storage medium and electronic equipment | |
CN117171447A (en) | Online interest group recommendation method based on self-attention and contrast learning | |
CN111324773A (en) | Background music construction method and device, electronic equipment and storage medium | |
CN114942998B (en) | Knowledge graph neighborhood structure sparse entity alignment method integrating multi-source data | |
Yin et al. | GS-InGAT: An interaction graph attention network with global semantic for knowledge graph completion | |
CN115618128A (en) | Collaborative filtering recommendation system and method based on graph attention neural network | |
Wang et al. | SCANET: Improving multimodal representation and fusion with sparse‐and cross‐attention for multimodal sentiment analysis | |
Chen et al. | Integrating user-group relationships under interest similarity constraints for social recommendation | |
CN113918711A (en) | Academic paper-oriented classification method based on multi-view and multi-layer attention | |
CN116433800B (en) | Image generation method based on social scene user preference and text joint guidance | |
CN117539999A (en) | Cross-modal joint coding-based multi-modal emotion analysis method | |
CN113191482A (en) | Heterogeneous graph neural network representation method based on element path | |
CN116186309A (en) | Graph convolution network recommendation method based on interaction interest graph fusing user intention | |
Zhou et al. | Multi-modal multi-hop interaction network for dialogue response generation | |
Cai et al. | RI-GCN: Review-aware interactive graph convolutional network for review-based item recommendation | |
CN115564013B (en) | Method for improving learning representation capability of network representation, model training method and system | |
CN116204628A (en) | Logistics knowledge neural collaborative filtering recommendation method with enhanced knowledge graph | |
CN115310004A (en) | Graph nerve collaborative filtering recommendation method fusing project time sequence relation | |
Niu et al. | Gcmcsr: A new graph convolution matrix complete method with side-information reconstruction | |
Wilson et al. | A recommendation model based on deep feature representation and multi-head self-attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||