CN114547276A - Three-channel graph neural network-based session recommendation method - Google Patents
- Publication number: CN114547276A
- Application number: CN202210082137.0A
- Authority: CN (China)
- Prior art keywords: item, session, embedding, items, graph
- Legal status: Pending (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06F16/335: Electric digital data processing; information retrieval; querying; filtering based on additional data, e.g. user or group profiles
- G06N3/045: Computing arrangements based on biological models; neural networks; architectures; combinations of networks
Abstract
The invention relates to the technical field of session recommendation, in particular to a session recommendation method based on a three-channel graph neural network, which comprises the following steps: (1) converting the session sequence data into session graph, hypergraph and global graph data; (2) learning three item embeddings from the graph data through the graph neural networks of three channels, the three channels comprising a session graph channel, a hypergraph channel and a global graph channel, where the session graph channel captures the transition relations among items within a session, the hypergraph channel captures the high-order relations among items within a session, and the global graph channel captures the relations among items across different sessions; (3) fusing the item representations formed by the three channels to obtain more complete item transition information; (4) outputting the predicted probabilities of the items through the prediction layer. The invention enables effective session recommendation.
Description
Technical Field
The invention relates to the technical field of session recommendation, in particular to a session recommendation method based on a three-channel graph neural network.
Background
In recent years, Internet information has grown rapidly, and recommendation systems have become an effective means of relieving users' information overload, playing an important role in consumption, services, decision making and more. Most existing recommendation methods recommend based on long-term historical user interactions and user profiles. In many services, however, the user's identity may be unknown, and only the behavior within the current session is available. Session-based recommendation, an emerging form of recommendation, remedies these deficiencies.
Because of the high practical value of session-based recommendation, many session-based recommendation methods have been proposed. The Markov chain is a classical example: it predicts the user's next behavior from the previous one, but its strong independence assumption, which treats past interactions as independent, can limit recommendation accuracy. Deep learning-based recommendation methods have developed rapidly, and many have been proposed. For example, one recurrent neural network approach enhances the model through data augmentation and by accounting for temporal shifts in user behavior. GRU4REC makes recommendations by using GRUs to model users' short-term preferences. NARM combines GRUs with an attention mechanism to learn sequential behavior and the user's main intent simultaneously. The Transformer model achieves state-of-the-art results on translation tasks; it uses no recurrent or convolutional networks, instead modeling sequences with an encoder-decoder structure built from stacked self-attention networks, and its success stems from the application of self-attention, a particular attention mechanism widely used for sequence modeling. SASRec was the earliest model to apply self-attention to the recommendation field and achieved state-of-the-art results. These methods model the preference of a given session with pairwise item-transition information and achieve good results. However, they still face problems: first, a single session contains too little user behavior to reliably estimate a user representation; second, when modeling transitions between items they model only one-way transitions and ignore transitions informed by context.
SR-GNN addressed these problems by first applying graph neural networks to session recommendation, modeling the sequence data as a graph structure and capturing complex item transitions through the graph neural network. GCE-GNN also recommends based on a graph neural network, considering item transitions both within the target session and across different sessions. Session recommendation based on graph neural networks has achieved remarkable results. However, problems remain: existing graph-based session recommenders model the session sequence either as pairwise graph-structured data or as hypergraph-structured data, and modeling the session sequence as a single graph cannot capture complete item-transition information, which reduces recommendation accuracy.
Disclosure of Invention
It is an object of the present invention to provide a session recommendation method based on a three-channel graph neural network that overcomes some or all of the deficiencies of the prior art.
The session recommendation method based on the three-channel graph neural network comprises the following steps:
(1) converting the session sequence data into session graph, hypergraph and global graph data;
(2) the graph data is learned to be embedded into three items through a graph neural network of three channels; the three channels comprise a session graph channel, a hypergraph channel and a global graph channel; the session graph channel is used for capturing conversion relations among items in the session, the hypergraph channel is used for capturing high-order relations among items in the session, and the global graph channel is used for capturing relations among items in different sessions;
(3) the item representations formed by the three channels are fused to obtain more complete item conversion information;
(4) the predicted probabilities of the items are output through the prediction layer.
Preferably, in the session graph, a given session is denoted $S = [s_1, s_2, \ldots, s_L]$, where each $s_i$ indicates an item $v_i$ clicked in session $S$ and $L$ is the session length. $G_s = (V_s, E_s)$ denotes the session graph, where each item $s_i \in V_s$ is a node and each pair of adjacent items $(s_{i-1}, s_i) \in E_s$ is an edge.
$G_h = (V_h, E_h)$ denotes the hypergraph, where $V_h$ is the set of $N$ distinct vertices in the hypergraph and $E_h$ is the set of $M$ hyperedges. Each hyperedge $\epsilon \in E_h$ contains at least two vertices, with $\epsilon \subseteq V_h$ and $\bigcup_{\epsilon \in E_h} \epsilon = V_h$. Each hyperedge is assigned a weight $w_{hh}$, and all weights form a diagonal matrix $W \in \mathbb{R}^{M \times M}$. The hypergraph is represented by an incidence matrix $H \in \mathbb{R}^{N \times M}$, where $H_{ih} = 1$ if hyperedge $h$ contains vertex $v_i \in V_h$ and $H_{ih} = 0$ otherwise. The vertex degrees and hyperedge degrees are $D_{ii} = \sum_{h=1}^{M} w_{hh} H_{ih}$ and $B_{hh} = \sum_{i=1}^{N} H_{ih}$ respectively, where $D$ and $B$ are diagonal matrices.
The global graph captures item-transition information between different sessions. $G_g = (V_g, E_g)$ denotes the global graph, where $V_g$ is the set of graph nodes of all items and $E_g$ is the set of all edges, each edge corresponding to a pair of adjacent items in any session.
Preferably, the method for computing the session graph embedding comprises the following steps:
For a target item, different adjacent items have different degrees of importance, so an attention mechanism captures the weights between nodes. The attention coefficient is:

$$s_{ij} = \text{LeakyReLU}\left(a_{r_{ij}}^{\top}\left(h_{v_i} \odot h_{v_j}\right)\right)$$

where $s_{ij}$ indicates the importance of item $v_j$ to item $v_i$, $r_{ij}$ is the relation (edge type) between $v_i$ and $v_j$, $h_{v_i}$ and $h_{v_j}$ are the embeddings of items $v_i$ and $v_j$, and $a_{r_{ij}} \in \mathbb{R}^d$ is the weight vector of relation $r_{ij}$. Then, to make the coefficients comparable between different nodes, the attention weights are normalized with the softmax function:

$$\alpha_{ij} = \text{softmax}(s_{ij}) = \frac{\exp(s_{ij})}{\sum_{v_k \in N_{v_i}} \exp(s_{ik})}$$

where $\alpha_{ij}$ is the normalized weight between nodes $v_i$ and $v_j$. Finally, the attention coefficients are linearly combined with the corresponding item embeddings to obtain the output of each node:

$$h'_{v_i} = \sum_{v_j \in N_{v_i}} \alpha_{ij} h_{v_j}$$
Preferably, the hypergraph embedding method comprises the following steps:
The hypergraph channel captures the high-order relations between items; the hypergraph convolution is defined as:

$$x_i^{(l+1)} = \sum_{h=1}^{M} \sum_{j=1}^{N} H_{ih} H_{jh} w_{hh} x_j^{(l)}$$

where $x_i^{(l+1)}$ is the layer-$(l+1)$ item embedding, $H_{ih}$ and $H_{jh}$ are entries of the incidence matrix, $w_{hh}$ is the hyperedge weight, and $x_j^{(l)}$ is the layer-$l$ item embedding. Written in matrix form:

$$X^{(l+1)} = D^{-1} H W B^{-1} H^{\top} X^{(l)}$$

where $X^{(l+1)}$ and $X^{(l)}$ are the item embeddings of layers $l+1$ and $l$. $H^{\top} X^{(l)}$ aggregates information from nodes to hyperedges, and the further multiplication by $H$ aggregates information back from hyperedges to nodes. Starting from the initial item embedding $X^{(0)}$ and applying $L$ layers of hypergraph convolution, the layer embeddings are summed and averaged to give the final item embedding $X_h = \frac{1}{L+1}\sum_{l=0}^{L} X^{(l)}$.
Position information of each item is added to the item embedding through position embedding. $P = [p_1, p_2, \ldots, p_k]$ is a learnable position embedding matrix, where $k$ is the length of the current session. The item embedding with position information is:

$$x_i^{*} = \tanh\left(W_1 \left[x_i \,\|\, p_{k-i+1}\right] + b\right)$$

where $x_i^{*}$ is the item embedding with position information, $x_i$ is the item embedding without position information, $p_{k-i+1}$ is the position embedding of the item, and $W_1 \in \mathbb{R}^{d \times 2d}$ and $b \in \mathbb{R}^d$ are learnable parameters. The session embedding is generated by aggregating the item representations in the session, enhanced through soft attention:

$$\alpha_i = f^{\top}\sigma\left(W_2 x_s^{*} + W_3 x_i^{*} + c\right), \qquad S_h = \sum_{i=1}^{k} \alpha_i x_i^{*}$$

where $\alpha_i$ is the weight of each item in the sequence, $S_h$ is the resulting session embedding, $c$ is a trainable coefficient, $x_s^{*} = \frac{1}{k}\sum_{i=1}^{k} x_i^{*}$ is the average embedding of all items in session $s$, $x_i^{*}$ is the embedding of the $i$-th item in session $s$, and $f \in \mathbb{R}^d$, $W_2 \in \mathbb{R}^{d \times d}$, $W_3 \in \mathbb{R}^{d \times d}$ are attention parameters.
Preferably, the method for embedding the global graph comprises the following steps:
The neighbours of each item are linearly combined according to the scores generated by session-aware attention:

$$h_{N_{v_i}} = \sum_{v_j \in N_{v_i}^{g}} \pi(v_i, v_j)\, h_{v_j}$$

where $h_{N_{v_i}}$ is the attention-weighted neighbourhood representation, $h_{v_j}$ is the embedding of item $v_j$, and $\pi(v_i, v_j)$ computes the weight of each neighbour; items closer to the preference of the current session receive larger weights. $\pi(v_i, v_j)$ is computed as:

$$\pi(v_i, v_j) = q_1^{\top}\,\text{LeakyReLU}\left(W_1 \left[\left(s \odot h_{v_j}\right) \|\, w_{ij}\right]\right)$$

and then normalized over all neighbours connected to $v_i$ with the softmax function. Here LeakyReLU is the activation function, $\odot$ denotes element-wise multiplication, $\|$ denotes the concatenation operation, $w_{ij} \in \mathbb{R}^{1}$ is the weight of each edge in the global graph, $W_1$ and $q_1$ are trainable parameters, and $s$ is the feature of the target session, obtained by averaging the item embeddings of the current session. Through this attention, the relevant neighbouring items are attended to.
Finally, the target item's information is aggregated with the information of its neighbouring items through a nonlinear transformation:

$$h_{v}^{agg} = \text{ReLU}\left(W_2 \left[h_v \,\|\, h_{N_v}\right]\right)$$

where $h_v$ is the item representation, $h_v^{agg}$ is the aggregated representation of the item and its neighbours, ReLU is the activation function, and $W_2 \in \mathbb{R}^{d \times 2d}$ is a trainable parameter. To obtain higher-order information, the single-layer aggregator is extended to multiple layers:

$$h_v^{(k)} = \text{Agg}\left(h_v^{(k-1)},\, h_{N_v}^{(k-1)}\right)$$

where $h_v^{(k-1)}$ is the representation of item $v$ generated at the previous layer and Agg is the aggregation operation. The $k$-th order representation of an item thus mixes its initial representation with those of its neighbours up to $k$ hops away.
Preferably, the item representations formed by the three channels are fused to obtain more complete item-transition information as follows:

First, the item representations generated by the session graph channel and the global graph channel are fused. For each item, the final item representation combines its global representation and its session representation:

$$h'_{v} = \text{dropout}\left(h_v^{g}\right) + h_v^{s}$$

where dropout is applied to the global representation to avoid overfitting.
A session embedding is then generated from the fused item embeddings. The position information of each item, taken from the learnable position matrix $P = [p_1, p_2, \ldots, p_k]$, is added to the item embedding and integrated with the item representation through a concatenation operation and a nonlinear transformation:

$$z_i = \tanh\left(W \left[h'_{v_i} \,\|\, p_{k-i+1}\right] + b\right)$$

For the session sequence $Z = [z_1, z_2, \ldots, z_k]$ with position information added, a self-attention layer captures the dependencies between items:

$$F = \text{softmax}\left(\frac{(Z W_Q)(Z W_K)^{\top}}{\sqrt{d}}\right) Z W_V$$

where $F$ is the session representation after attention, $d$ is a hyperparameter, and $W_Q, W_K, W_V \in \mathbb{R}^{2d \times d}$ are projection matrices.
Since the self-attention above is purely linear, nonlinearity is added to the model with a ReLU activation, and a residual connection is added after the feed-forward network:

$$E = \text{ReLU}\left(F W_1 + b_1\right) W_2 + b_2 + F$$

where $E$ is the session representation after the residual connection, $W_1$ and $W_2$ are $d \times d$ matrices, and $b_1$ and $b_2$ are $d$-dimensional bias vectors. To prevent overfitting, Dropout regularization is added during training. The whole self-attention block is written as $E = \text{SAN}(Z)$. Finally, single self-attention is extended to multi-layer self-attention:

$$E^{(k)} = \text{SAN}\left(E^{(k-1)}\right)$$

where $E^{(k)}$ is the session representation after $k$ attention layers and $E^{(k-1)}$ is the session representation after $k-1$ attention layers.
The session sequence after self-attention is denoted $M = [m_1, m_2, \ldots, m_k]$. The weight of each item in the sequence is learned through soft attention, reflecting that different items have different degrees of importance to the sequence:

$$\beta_i = f^{\top}\sigma\left(W_4 m_s + W_5 m_i + c\right), \qquad S_{s,g} = \sum_{i=1}^{k} \beta_i m_i$$

where $f \in \mathbb{R}^d$, $W_4 \in \mathbb{R}^{d \times d}$, $W_5 \in \mathbb{R}^{d \times d}$ and $c$ are trainable attention parameters, $\sigma$ is the sigmoid function, $m_s = \frac{1}{k}\sum_{i=1}^{k} m_i$ is the average embedding of all items in session $s$, and $m_i$ is the embedding of the $i$-th item in session $s$. Finally $S_{s,g}$ and $S_h$ are combined to form the final session representation $S = S_{s,g} + S_h$.
Preferably, in step (4), the recommendation probability is obtained by taking the dot product of each initial candidate item embedding with the session representation obtained above and applying softmax:

$$\hat{y}_i = \text{softmax}\left(S^{\top} h_{v_i}\right)$$

where $S$ is the session representation and $\hat{y}_i$ is the probability that item $v_i$ is selected next in the target session. The model is trained by minimizing the objective function:

$$\mathcal{L} = -\sum_{i=1}^{N}\left[y_i \log\left(\hat{y}_i\right) + \left(1 - y_i\right)\log\left(1 - \hat{y}_i\right)\right]$$

where $y$ is the one-hot encoding of the ground-truth item, $\hat{y}_i$ is the predicted value, and $\mathcal{L}$ is the loss function.
The present invention fuses the item representations generated by the three channels to improve session-based recommendation performance. It models the session sequence as three kinds of graph-structured data simultaneously to capture richer item-transition relations. Extensive experiments on two real datasets demonstrate the effectiveness and superiority of the model.
Drawings
Fig. 1 is a flowchart of a session recommendation method based on a three-channel graph neural network in embodiment 1;
fig. 2 is a model structure diagram of a three-channel graph neural network (MCG-SR) in example 1.
Detailed Description
For a further understanding of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples. It is to be understood that the examples are illustrative of the invention and not limiting.
Example 1
As shown in Fig. 1, the present embodiment provides a session recommendation method based on a three-channel graph neural network, which comprises the following steps:
(1) converting the session sequence data into session graph, hypergraph and global graph data;
(2) the graph data is learned to be embedded into three items through a graph neural network of three channels; the three channels comprise a session graph channel, a hypergraph channel and a global graph channel; the session graph channel is used for capturing conversion relations among items in the session, the hypergraph channel is used for capturing high-order relations among items in the session, and the global graph channel is used for capturing relations among items in different sessions;
(3) the item representations formed by the three channels are fused to obtain more complete item conversion information;
(4) the predicted probabilities of the items are output through the prediction layer.
This embodiment provides a three-channel graph neural network (MCG-SR) model for session recommendation, which captures richer item-transition relations of the session. Fig. 2 shows the overall structure of the model (where s1, s2, s3 denote sessions and v denotes items): the session sequence data is first converted into three graph-structured data forms, then the graph data is learned through the graph neural networks of the three channels to produce three item embeddings, the three representations are fused, and finally the prediction probabilities of the items are output through the prediction layer.
In the session graph, each session sequence is converted into graph-structured data, and the item embeddings of a given session are learned through the GNN. A given session is denoted $S = [s_1, s_2, \ldots, s_L]$, where each $s_i$ indicates an item $v_i$ clicked in session $S$ and $L$ is the session length. $G_s = (V_s, E_s)$ denotes the session graph, where each item $s_i \in V_s$ is a node and each pair of adjacent items $(s_{i-1}, s_i) \in E_s$ is an edge. There are four types of edge relations in the session graph: $r_{in}$, $r_{out}$, $r_{in\text{-}out}$ and $r_{self}$; a self-loop is added to each item.
In the hypergraph, the session sequence is converted into hypergraph-structured data to learn the high-order transition relations between items. $G_h = (V_h, E_h)$ denotes the hypergraph, where $V_h$ is the set of $N$ distinct vertices and $E_h$ is the set of $M$ hyperedges. Each hyperedge contains at least two vertices, and notably any two vertices on the same hyperedge are connected, with $\epsilon \subseteq V_h$ and $\bigcup_{\epsilon \in E_h} \epsilon = V_h$. Each hyperedge is assigned a weight $w_{hh}$, and all weights form a diagonal matrix $W \in \mathbb{R}^{M \times M}$. The hypergraph is represented by an incidence matrix $H \in \mathbb{R}^{N \times M}$, where $H_{ih} = 1$ if hyperedge $h$ contains vertex $v_i \in V_h$ and $H_{ih} = 0$ otherwise. The vertex degrees and hyperedge degrees are $D_{ii} = \sum_{h=1}^{M} w_{hh} H_{ih}$ and $B_{hh} = \sum_{i=1}^{N} H_{ih}$ respectively, where $D$ and $B$ are diagonal matrices.
The global graph is constructed mainly to capture cross-session item-transition information: the items involved in all sessions are converted into one graph structure, the global graph. Notably, the session graph is directed while the global graph is undirected. $G_g = (V_g, E_g)$ denotes the global graph, where $V_g$ is the set of graph nodes of all items and $E_g$ is the set of all edges, each edge corresponding to a pair of adjacent items in any session. The different edges of each node $v_i$ carry different weights, and each weight depends on the frequency of occurrence of the corresponding edge across sessions.
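The three graph constructions described above can be sketched in code. The following is a minimal illustration (function and variable names are ours, not the patent's): it builds directed session-graph edges, a hypergraph incidence matrix with one hyperedge per session, and frequency-weighted undirected global-graph edges.

```python
from collections import Counter

def build_graphs(sessions, n_items):
    """Sketch: derive the three graph structures from raw session
    sequences. Names and layout are illustrative, not the patent's."""
    session_edges = []                       # directed (prev -> next) edges per session
    incidence = [[0] * len(sessions) for _ in range(n_items)]  # H: items x hyperedges
    global_edges = Counter()                 # undirected edge -> co-occurrence count

    for h, seq in enumerate(sessions):
        session_edges.append([(seq[i - 1], seq[i]) for i in range(1, len(seq))])
        for v in set(seq):                   # one hyperedge per session
            incidence[v][h] = 1
        for i in range(1, len(seq)):         # global weight = edge frequency
            a, b = sorted((seq[i - 1], seq[i]))
            global_edges[(a, b)] += 1
    return session_edges, incidence, global_edges
```

For example, two sessions [0, 1, 2] and [1, 2, 3] share the pair (1, 2), so that global edge receives weight 2 while all others receive weight 1.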
The method for embedding the session graph comprises the following steps:
The transitions of paired items in the current session are learned through the session graph, and these in-session transitions are more important than the cross-session item transitions in the global graph. For a target item, different adjacent items have different degrees of importance, so an attention mechanism captures the weights between nodes. The attention coefficient is:

$$s_{ij} = \text{LeakyReLU}\left(a_{r_{ij}}^{\top}\left(h_{v_i} \odot h_{v_j}\right)\right)$$

where $s_{ij}$ represents the importance of node $v_j$ to node $v_i$, $r_{ij}$ is the relation (edge type) between $v_i$ and $v_j$, $a_{r_{ij}} \in \mathbb{R}^d$ is the corresponding weight vector, and LeakyReLU is used as the activation function. Then, to make the coefficients between different nodes comparable, the attention weights are normalized with the softmax function:

$$\alpha_{ij} = \text{softmax}(s_{ij}) = \frac{\exp(s_{ij})}{\sum_{v_k \in N_{v_i}} \exp(s_{ik})}$$

Finally, the attention coefficients are linearly combined with the corresponding item embeddings to obtain the output of each node:

$$h'_{v_i} = \sum_{v_j \in N_{v_i}} \alpha_{ij} h_{v_j}$$
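As an illustration, the attention step for a single node can be sketched as follows, with the relation-specific weight vector folded into a single vector `a` (a simplification of the relation-aware form; names are ours):

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attend(h_i, h_nb, a):
    """Session-graph attention for one node: score, normalise, aggregate."""
    # s_ij = LeakyReLU(a^T (h_i * h_j)) for every neighbour j
    s = leaky_relu(h_nb @ (a * h_i))
    # softmax-normalise the coefficients across neighbours
    w = np.exp(s - s.max())
    w = w / w.sum()
    # output of node i: weighted sum of neighbour embeddings
    return w @ h_nb
```

With one neighbour aligned to the target and one orthogonal, the aligned neighbour receives the larger normalized weight, as the softmax over the LeakyReLU scores intends.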
The hypergraph embedding method comprises the following steps:
The hypergraph channel captures the high-order relations between items; the hypergraph convolution is defined as:

$$x_i^{(l+1)} = \sum_{h=1}^{M} \sum_{j=1}^{N} H_{ih} H_{jh} w_{hh} x_j^{(l)}$$

where $x_i^{(l+1)}$ is the layer-$(l+1)$ item embedding, $H_{ih}$ and $H_{jh}$ are entries of the incidence matrix, $w_{hh}$ is the hyperedge weight, and $x_j^{(l)}$ is the layer-$l$ item embedding. Written in matrix form:

$$X^{(l+1)} = D^{-1} H W B^{-1} H^{\top} X^{(l)}$$

where $X^{(l+1)}$ and $X^{(l)}$ are the item embeddings of layers $l+1$ and $l$. $H^{\top} X^{(l)}$ aggregates information from nodes to hyperedges, and the further multiplication by $H$ aggregates information back from hyperedges to nodes. Starting from the initial item embedding $X^{(0)}$ and applying $L$ layers of hypergraph convolution, the layer embeddings are summed and averaged to give the final item embedding $X_h = \frac{1}{L+1}\sum_{l=0}^{L} X^{(l)}$.
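The matrix-form convolution can be sketched with NumPy as follows: a single layer of $D^{-1} H W B^{-1} H^{\top} X$ under the stated degree definitions, with no learned transform shown (a simplifying assumption of this sketch).

```python
import numpy as np

def hypergraph_conv(X, H, w=None):
    """One hypergraph-convolution layer, X' = D^-1 H W B^-1 H^T X.
    H is the N x M incidence matrix, w the M hyperedge weights."""
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else w
    D = H @ w                      # vertex degrees D_ii = sum_h w_hh H_ih
    B = H.sum(axis=0)              # hyperedge degrees B_hh = sum_i H_ih
    edge_msg = H.T @ X             # aggregate node info onto hyperedges
    return (H * (w / B)) @ edge_msg / D[:, None]   # back to nodes, normalised
```

Each vertex thus receives the degree-normalized average of the items sharing a hyperedge with it, which is the high-order smoothing the channel relies on.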
Position information of each item is added to the item embedding through position embedding. $P = [p_1, p_2, \ldots, p_k]$ is a learnable position embedding matrix, where $k$ is the length of the current session. The item embedding with position information is:

$$x_i^{*} = \tanh\left(W_1 \left[x_i \,\|\, p_{k-i+1}\right] + b\right)$$

where $x_i^{*}$ is the item embedding with position information, $x_i$ is the item embedding without position information, $p_{k-i+1}$ is the position embedding of the item, and $W_1 \in \mathbb{R}^{d \times 2d}$ and $b \in \mathbb{R}^d$ are learnable parameters. The session embedding is generated by aggregating the item representations in the session, enhanced through soft attention:

$$\alpha_i = f^{\top}\sigma\left(W_2 x_s^{*} + W_3 x_i^{*} + c\right), \qquad S_h = \sum_{i=1}^{k} \alpha_i x_i^{*}$$

where $\alpha_i$ is the weight of each item in the sequence, $S_h$ is the resulting session embedding, $c$ is a trainable coefficient, $x_s^{*} = \frac{1}{k}\sum_{i=1}^{k} x_i^{*}$ is the average embedding of all items in session $s$, $x_i^{*}$ is the embedding of the $i$-th item in session $s$, and $f \in \mathbb{R}^d$, $W_2 \in \mathbb{R}^{d \times d}$, $W_3 \in \mathbb{R}^{d \times d}$ are attention parameters.
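A minimal sketch of this soft-attention pooling (parameter names follow the text; arrays and shapes are our assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_attention_pool(X, f, W2, W3, c):
    """alpha_i = f^T sigmoid(W2 x_s + W3 x_i + c); S_h = sum_i alpha_i x_i.
    X is the (k, d) matrix of position-aware item embeddings."""
    x_s = X.mean(axis=0)                     # average embedding of the session
    alpha = np.array([f @ sigmoid(W2 @ x_s + W3 @ x_i + c) for x_i in X])
    return alpha @ X                         # session embedding S_h
```

Unlike a plain mean, the learned weights let informative items dominate the pooled session embedding.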
The method for embedding the global graph comprises the following steps:
An item may appear in multiple sessions, and transition information is obtained from different sessions to capture cross-session item transitions. It is important to measure whether the information in the global graph is related to the user preference of the target session, so the importance of items is differentiated with session-aware attention. The neighbours of each item are linearly combined according to the scores generated by session-aware attention:

$$h_{N_{v_i}} = \sum_{v_j \in N_{v_i}^{g}} \pi(v_i, v_j)\, h_{v_j}$$

where $h_{N_{v_i}}$ is the attention-weighted neighbourhood representation, $h_{v_j}$ is the embedding of item $v_j$, and $\pi(v_i, v_j)$ computes the weight of each neighbour; items closer to the preference of the current session receive larger weights. $\pi(v_i, v_j)$ is computed as:

$$\pi(v_i, v_j) = q_1^{\top}\,\text{LeakyReLU}\left(W_1 \left[\left(s \odot h_{v_j}\right) \|\, w_{ij}\right]\right)$$

and then normalized over all neighbours connected to $v_i$ with the softmax function. Here LeakyReLU is the activation function, $\odot$ denotes element-wise multiplication, $\|$ denotes the concatenation operation, $w_{ij} \in \mathbb{R}^{1}$ is the weight of each edge in the global session graph, $W_1$ and $q_1$ are trainable parameters, and $s$ is the feature of the target session, obtained by averaging the item embeddings of the current session. Through this attention, the relevant neighbouring items are attended to.
Finally, the target item's information is aggregated with the information of its neighbouring items through a nonlinear transformation:

$$h_{v}^{agg} = \text{ReLU}\left(W_2 \left[h_v \,\|\, h_{N_v}\right]\right)$$

where $h_v$ is the item representation, $h_v^{agg}$ is the aggregated representation of the item and its neighbours, ReLU is the activation function, and $W_2 \in \mathbb{R}^{d \times 2d}$ is a trainable parameter. To obtain higher-order information, the single-layer aggregator is extended to multiple layers:

$$h_v^{(k)} = \text{Agg}\left(h_v^{(k-1)},\, h_{N_v}^{(k-1)}\right)$$

where $h_v^{(k-1)}$ is the representation of item $v$ generated at the previous layer and Agg is the aggregation operation. The $k$-th order representation of an item thus mixes its initial representation with those of its neighbours up to $k$ hops away.
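The multi-layer aggregation can be sketched as below, with a mean over neighbours standing in for the session-aware attention weights (a simplification of this sketch, not the patent's exact aggregator; `Ws` is assumed to hold one (d, 2d) weight matrix per hop):

```python
import numpy as np

def global_propagate(H, adj, Ws):
    """k-hop propagation over the global graph: per hop, take the mean
    neighbour message and mix it with the node state via ReLU(W [h || h_N])."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                      # guard isolated nodes
    for W in Ws:                             # one aggregation per hop
        h_nb = adj @ H / deg                 # mean message from neighbours
        H = np.maximum(0.0, np.concatenate([H, h_nb], axis=1) @ W.T)
    return H
```

After `len(Ws)` hops, each node's representation mixes its initial embedding with its k-hop neighbourhood, matching the layered Agg recursion above.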
The item representations formed by the three channels are fused to obtain more complete item-transition information as follows.

First, the item representations generated by the session graph channel and the global graph channel are fused: for each item, the final item representation combines its global representation and its session representation, with dropout applied to the global representation to avoid overfitting:

$$h'_{v} = \text{dropout}\left(h_v^{g}\right) + h_v^{s}$$
A session embedding is then generated from the fused item embeddings. The position information of each item, taken from the learnable position matrix $P = [p_1, p_2, \ldots, p_k]$, is added to the item embedding and integrated with the item representation through a concatenation operation and a nonlinear transformation:

$$z_i = \tanh\left(W \left[h'_{v_i} \,\|\, p_{k-i+1}\right] + b\right)$$

For the session sequence $Z = [z_1, z_2, \ldots, z_k]$ with position information added, a self-attention layer captures the dependencies between items:

$$F = \text{softmax}\left(\frac{(Z W_Q)(Z W_K)^{\top}}{\sqrt{d}}\right) Z W_V$$

where $F$ is the session representation after attention, $d$ is a hyperparameter, and $W_Q, W_K, W_V \in \mathbb{R}^{2d \times d}$ are projection matrices.
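A single-head scaled dot-product self-attention layer of the kind described can be sketched as:

```python
import numpy as np

def self_attention(H, WQ, WK, WV, d):
    """F = softmax(Q K^T / sqrt(d)) V with Q, K, V = H W_{Q,K,V}."""
    Q, K, V = H @ WQ, H @ WK, H @ WV
    scores = Q @ K.T / np.sqrt(d)
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    A = scores / scores.sum(axis=1, keepdims=True)   # row-wise softmax
    return A @ V
```

Each row of the attention matrix sums to 1, so every position's output is a convex combination of the value vectors of all positions in the session.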
Since the self-attention above captures only linear relations, to enhance the expression of the session sequence nonlinearity is added to the model with a ReLU activation, and a residual connection is added after the feed-forward network:

$$E = \text{ReLU}\left(F W_1 + b_1\right) W_2 + b_2 + F$$

where $E$ is the session representation after the residual connection, $W_1$ and $W_2$ are $d \times d$ matrices, and $b_1$ and $b_2$ are $d$-dimensional bias vectors. To prevent overfitting, Dropout regularization is added during training. The whole self-attention block is written as $E = \text{SAN}(Z)$. Finally, single self-attention is extended to multi-layer self-attention:

$$E^{(k)} = \text{SAN}\left(E^{(k-1)}\right)$$

where $E^{(k)}$ is the session representation after $k$ attention layers and $E^{(k-1)}$ is the session representation after $k-1$ attention layers.
The session sequence after self-attention is denoted $M = [m_1, m_2, \ldots, m_k]$. The weight of each item in the sequence is learned through soft attention, reflecting that different items have different degrees of importance to the sequence:

$$\beta_i = f^{\top}\sigma\left(W_4 m_s + W_5 m_i + c\right), \qquad S_{s,g} = \sum_{i=1}^{k} \beta_i m_i$$

where $f \in \mathbb{R}^d$, $W_4 \in \mathbb{R}^{d \times d}$, $W_5 \in \mathbb{R}^{d \times d}$ and $c$ are trainable attention parameters, $\sigma$ is the sigmoid function, $m_s = \frac{1}{k}\sum_{i=1}^{k} m_i$ is the average embedding of all items in session $s$, and $m_i$ is the embedding of the $i$-th item in session $s$. Finally $S_{s,g}$ and $S_h$ are combined to form the final session representation $S = S_{s,g} + S_h$.
In step (4), the embedding of each initial candidate item and the session representation obtained in the previous section are combined through a dot product operation, followed by softmax, to obtain the recommendation probability:

$$\hat{y}_i = \text{softmax}\left(S^{\top} h_{v_i}\right)$$

where $S$ is the session representation and $\hat{y}_i$ is the probability that item $v_i$ is selected next in the target session. The model is trained by minimizing the objective function:

$$\mathcal{L} = -\sum_{i=1}^{N}\left[y_i \log\left(\hat{y}_i\right) + \left(1 - y_i\right)\log\left(1 - \hat{y}_i\right)\right]$$

where $y$ is the one-hot encoding of the ground-truth item, $\hat{y}_i$ is the predicted value, and $\mathcal{L}$ is the loss function.
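The prediction layer and training objective can be sketched as follows (an illustration with our own function names; the real model scores all candidate items in a batch):

```python
import numpy as np

def predict_scores(session_repr, item_embeddings):
    """Dot-product each candidate item with the session representation,
    then softmax over all items to get next-item probabilities."""
    z = item_embeddings @ session_repr
    z = np.exp(z - z.max())                  # numerically stable softmax
    return z / z.sum()

def cross_entropy(y_true, y_pred, eps=1e-12):
    # the minimised objective over the one-hot target, as in the text
    return -np.sum(y_true * np.log(y_pred + eps)
                   + (1.0 - y_true) * np.log(1.0 - y_pred + eps))
```

The candidate whose embedding aligns best with the session representation receives the highest probability, and the loss penalizes both a low probability on the true item and high probabilities on the rest.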
Experiments
We describe the datasets used in the experiments and the comparison models, compare the proposed method with the other methods, and also design an ablation study investigating the contribution of each channel to the model.
Data set
We evaluated the proposed method on two real-world datasets, the Tmall dataset and the Nowplaying dataset. The Tmall dataset comes from the IJCAI-15 competition and contains anonymous users' shopping logs on the Tmall online shopping platform.
For a fair comparison, we preprocess both datasets: we filter out sessions of length 1 and items that occur fewer than 5 times, split the data into training and test sets, and segment each session to generate the corresponding labels. The statistics of the datasets are summarized in Table 1.
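The stated preprocessing can be sketched as follows (illustrative; the threshold and splitting follow the text, the function names are ours):

```python
from collections import Counter

def preprocess(sessions, min_count=5):
    """Filter rare items and length-1 sessions, then split each session
    into (prefix, next-item) training pairs."""
    counts = Counter(v for s in sessions for v in s)
    kept = [[v for v in s if counts[v] >= min_count] for s in sessions]
    kept = [s for s in kept if len(s) > 1]           # drop length-1 sessions
    return [(s[:i], s[i]) for s in kept for i in range(1, len(s))]
```

A session [v1, v2, v3] thus yields the labeled pairs ([v1], v2) and ([v1, v2], v3).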
Table 1: Statistics of the datasets
To evaluate the proposed method, we compare it with representative baseline and state-of-the-art methods:
POP: recommends the top-N most popular items in the training set.
Item-KNN: recommends items according to the similarity between the items of the current session and the items of other sessions, where similarity is defined by cosine similarity.
FPMC: a hybrid model combining matrix factorization and first-order Markov chains, an early model for processing sequence data.
GRU4Rec: an RNN-based session recommendation model that models the user sequence.
STAMP: considers both the user's long-term preference and current preference.
SR-GNN: an early model applying graph neural networks to recommendation; it uses a gated graph neural network to obtain item embeddings and make recommendations.
DHCN: applies a hypergraph to model the session sequence.
GCE-GNN: considers both the item transitions within a session and the transitions among items of all sessions, achieving state-of-the-art performance in session recommendation.
Evaluation metrics and parameter settings
The dimension of the latent vectors is set to 100, and the mini-batch size is also 100. For a fair comparison, we set the hyper-parameters of each model to be the same. We use the Adam optimizer with an initial learning rate of 0.001, decayed by a factor of 0.1 every three epochs. The L2 penalty is set to 10^-5, and the dropout rate is searched in {0.1, 0.2, …}.
Model performance
The results of the experiment are shown in table 2:
TABLE 2 results of the experiment
Our MCG-SR model is evaluated against the other 8 baseline models on two evaluation metrics across the two datasets, and the best results are highlighted in bold. Analyzing the results in Table 2, we can draw the following conclusions.
The traditional methods (POP and Item-KNN) show a marked gap from the recently proposed methods (GRU4Rec, STAMP, SR-GNN, and DHCN). Unlike the traditional models, the recently proposed methods model sequential dependencies, which demonstrates the importance of sequence information for session recommendation. Moreover, all the recently proposed methods apply deep-learning techniques, which illustrates the key role of deep learning in session recommendation.
Among the recently proposed methods, STAMP performs significantly better than GRU4Rec. GRU4Rec is based on a recurrent neural network, while STAMP is a purely attention-based approach. GRU4Rec considers only sequential behavior and struggles to cope with shifts in user preference; in contrast, an attention-based model assigns different importance to different items and can therefore predict the user's behavior more accurately.
The latest graph-neural-network-based session recommenders outperform the models based on recurrent neural networks and attention mechanisms. Our proposed MCG-SR model represents a session sequence as three graphs: a session graph, a hypergraph, and a global graph. The transition information among items is captured in each graph by a graph neural network, and the information from the three graphs is then fused to capture richer item-transition information. As shown in Table 2, our model outperforms all baseline models, including the latest graph-neural-network-based models (SR-GNN, GCE-GNN, and DHCN), by a clear margin, which indicates the effectiveness of our approach.
Ablation study
To investigate the contributions of the different channels to our model, we designed three variants: MCG-SR-S, MCG-SR-G, and MCG-SR-H. MCG-SR-S is the version without the session graph channel, MCG-SR-G the version without the global graph channel, and MCG-SR-H the version without the hypergraph channel. We compared these three variants with MCG-SR, DHCN, and GCE-GNN on both the Tmall and Nowplaying datasets.
Table 3 shows the different effects of the three channels on the two datasets. When the session graph channel is removed, performance on the Tmall dataset drops sharply compared with MCG-SR, while on the Nowplaying dataset P@20 drops slightly but MRR@20 rises. When the global graph channel is removed, P@20 increases and MRR@20 decreases on both datasets compared with MCG-SR. When the hypergraph channel is removed, both evaluation metrics drop on both datasets compared with MCG-SR. From Table 3 we can conclude that our model improves on both metrics of both datasets only when all three channels are present, which verifies that the model captures more complete item-transition relationships and improves generalization.
TABLE 3
This embodiment provides a three-channel graph neural network model for session recommendation. Previous graph-neural-network-based session recommenders typically model the session sequence as a single graph, which prevents the model from capturing richer item transitions. Our model represents the session sequence simultaneously as three graphs: a session graph, a hypergraph, and a global graph. Transition information among items is captured in each graph by a graph neural network, and the information from the three graphs is then fused to capture richer item-transition information. Experiments on two datasets demonstrate the advantages and effectiveness of the model.
The present invention and its embodiments have been described above schematically, and the description is not limiting; what is shown in the drawings is only one embodiment of the present invention, and the actual structure is not limited thereto. Therefore, if a person of ordinary skill in the art, enlightened by this teaching and without departing from the spirit of the invention, devises structures or embodiments similar to this technical solution without inventive effort, they shall fall within the scope of protection of the invention.
Claims (7)
1. A session recommendation method based on a three-channel graph neural network, characterized in that it comprises the following steps:
(1) converting the session sequence data into session graph, hypergraph, and global graph data;
(2) learning three item embeddings from the graph data through a three-channel graph neural network, the three channels comprising a session graph channel, a hypergraph channel, and a global graph channel; the session graph channel captures the transition relations among items within a session, the hypergraph channel captures the high-order relations among items within a session, and the global graph channel captures the relations among items across different sessions;
(3) fusing the item representations formed by the three channels to obtain more complete item-transition information;
(4) outputting the predicted probabilities of the items through a prediction layer.
2. The three-channel graph neural network-based session recommendation method of claim 1, wherein: in the session graph, a session S = [s_1, s_2, …, s_L] is given, where s_i denotes the item v_i clicked in session S and L is the session length; G_s = (V_s, E_s) represents the session graph, where each item s_i ∈ V_s is a node and each pair of adjacent items (s_{i-1}, s_i) ∈ E_s is an edge;
G_h = (V_h, E_h) represents a hypergraph, where V_h denotes the set of N non-repeating vertices in the hypergraph and E_h denotes the set of M hyperedges; each hyperedge ε ∈ E_h contains at least two vertices, with ε ⊆ V_h and ∪_{ε ∈ E_h} ε = V_h; each hyperedge is assigned a weight w_hh, and all the weights form a diagonal matrix W ∈ R^{M×M}; the hypergraph is represented by an incidence matrix H ∈ R^{N×M}, in which H_ih = 1 when hyperedge ε_h contains vertex v_i ∈ V_h and H_ih = 0 otherwise; the degrees of the vertices and hyperedges are D_ii = Σ_h w_hh H_ih and B_hh = Σ_i H_ih respectively, where D and B are diagonal matrices;
the global graph is used for acquiring item-transition information between different sessions, where G_g = (V_g, E_g) denotes the global graph, V_g denotes the set of graph nodes for all items, and E_g denotes the set of all edges, each edge corresponding to a pair of adjacent items from any session.
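As a concrete illustration of step (1), the three graph structures can be derived from raw session sequences roughly as follows. This is a minimal NumPy sketch; the function names and the dense-matrix representation are illustrative, not taken from the patent.

```python
import numpy as np

def build_session_graph(session):
    """Session graph G_s: nodes are the unique items of one session,
    edges connect consecutively clicked items (s_{i-1} -> s_i)."""
    nodes = sorted(set(session))
    idx = {v: i for i, v in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for a, b in zip(session, session[1:]):
        A[idx[a], idx[b]] = 1.0
    return nodes, A

def build_hypergraph_incidence(sessions):
    """Hypergraph G_h: each session forms one hyperedge over its items.
    Returns the incidence matrix H in R^{N x M} with H[i, h] = 1 iff
    hyperedge h contains item i."""
    items = sorted({v for s in sessions for v in s})
    idx = {v: i for i, v in enumerate(items)}
    H = np.zeros((len(items), len(sessions)))
    for h, s in enumerate(sessions):
        for v in set(s):
            H[idx[v], h] = 1.0
    return items, H

def build_global_edges(sessions):
    """Global graph G_g: edges are item pairs adjacent in any session."""
    edges = set()
    for s in sessions:
        edges.update(zip(s, s[1:]))
    return edges
```

The dense adjacency and incidence matrices are chosen for clarity; a real implementation would typically use sparse matrices.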
3. The three-channel graph neural network-based session recommendation method of claim 2, wherein the method for embedding the session graph comprises the following steps:
for a target item, different adjacent items have different degrees of importance to it, and an attention mechanism is used to capture the weights between different nodes; the attention coefficient is as follows:
where s_ij indicates the importance of item v_j to item v_i, h_vi is the embedding of item v_i, h_vj is the embedding of item v_j, r_ij is the relation between v_i and v_j (i.e., the edge relation), and a ∈ R^d is a weight vector; then, to make the coefficients comparable between different nodes, the attention weights are normalized by the softmax function, as shown in the following equation:
α_ij = softmax(s_ij)
where α_ij is the normalized weight between nodes v_i and v_j;
finally, the obtained attention coefficients are linearly combined with the corresponding items to obtain the output of each node, as follows:
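The score/normalize/combine scheme above can be sketched as follows. Note the scoring function here is a simplified dot-product form chosen purely for illustration; the patent's actual score s_ij also involves the edge relation r_ij, and its exact formula is given in the omitted figure.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def session_graph_attention(h, neighbors, a):
    """For each node i: score each neighbor j (illustrative score
    s_ij = a . (h_i * h_j)), softmax-normalize the scores into
    alpha_ij, then output the weighted sum of neighbor embeddings."""
    n, d = h.shape
    out = np.zeros_like(h)
    for i in range(n):
        nb = neighbors.get(i, [])
        if not nb:
            out[i] = h[i]          # isolated node keeps its embedding
            continue
        s = np.array([a @ (h[i] * h[j]) for j in nb])  # scores s_ij
        alpha = softmax(s)                             # weights alpha_ij
        out[i] = sum(w * h[j] for w, j in zip(alpha, nb))
    return out
```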
4. The three-channel graph neural network-based session recommendation method of claim 3, wherein the hypergraph embedding method comprises the following steps:
the hypergraph channel is used to capture the high-order relationships among items, and the hypergraph convolution is defined as:
where x_i^(l+1) is the embedding of item i at layer l+1, H_ih and H_jh are entries of the incidence matrix, w_hh are the weighting coefficients, and x_j^(l) is the embedding of item j at layer l;
writing the above equation in matrix form is as follows:
where X^(l+1) and X^(l) are the item embeddings of layer l+1 and layer l, respectively; H^T X^(l) aggregates information from the nodes onto the hyperedges, and left-multiplying by H propagates the aggregated information from the hyperedges back to the nodes; starting from the original item embedding X^(0), after L layers of hypergraph convolution, the embeddings of all the layers are averaged to obtain the final item embedding X_h.
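Under the definitions of H, W, D, and B above, the matrix-form convolution with layer averaging can be sketched as follows. This NumPy sketch assumes every item appears in at least one hyperedge (so no degree is zero) and stores the hyperedge weights w_hh as a vector.

```python
import numpy as np

def hypergraph_conv(X, H, w=None, num_layers=2):
    """Matrix-form hypergraph convolution
        X^(l+1) = D^-1 H W B^-1 H^T X^(l).
    H^T X^(l) gathers node information onto the hyperedges; left-
    multiplying by H sends it back to the nodes. The final embedding
    is the average of the layer outputs X^(0)..X^(L)."""
    N, M = H.shape
    w = np.ones(M) if w is None else np.asarray(w, dtype=float)
    D = (H * w).sum(axis=1)               # vertex degrees D_ii = sum_h w_hh H_ih
    B = H.sum(axis=0)                     # hyperedge degrees B_hh = sum_i H_ih
    P = (H * (w / B)) @ H.T / D[:, None]  # propagation matrix D^-1 H W B^-1 H^T
    layers = [X]
    for _ in range(num_layers):
        layers.append(P @ layers[-1])
    return sum(layers) / len(layers)      # averaged item embedding X_h
```

Because each row of the propagation matrix sums to 1, the convolution is a degree-normalized averaging of item embeddings over shared hyperedges.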
position information of an item is added into the item embedding through position embedding; P = [p_1, p_2, …, p_k] is a learnable position embedding matrix, where k is the length of the current session; the item embeddings with added position information are as follows:
where x*_i is the item embedding with position information, x_i is the item embedding without position information, p_{k-i+1} is the position information of the item, and W_1 ∈ R^{d×2d} and b ∈ R^d are learnable parameters; the session embedding is generated by aggregating the item representations in the session; the enhanced session representation is then obtained as:
where α_i is the weight of each item in the sequence, S_h is the resulting session embedding, c is a trainable coefficient, s̄ is the average embedding of all the items in the session s, x*_i is the embedding of the i-th item in session s, and f ∈ R^d, W_2 ∈ R^{d×d}, and W_3 ∈ R^{d×d} are attention parameters.
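The reversed position embedding followed by soft attention can be sketched as below. Parameter names mirror the text (W_1 ∈ R^{d×2d}, W_2, W_3 ∈ R^{d×d}, f ∈ R^d); the exact combination inside the sigmoid is an assumption based on the stated shapes, since the formula images are omitted.

```python
import numpy as np

def session_soft_attention(X, P, W1, b, f, W2, W3, c):
    """Hypergraph-channel readout sketch:
    x*_i = tanh(W1 [x_i ; p_{k-i+1}] + b)   (reversed position embedding)
    alpha_i = f . sigmoid(W2 x*_i + W3 s_bar + c)
    S_h = sum_i alpha_i x*_i                (session embedding)"""
    k = X.shape[0]
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    # item i (1-indexed) receives the reversed position p_{k-i+1}
    Xp = np.tanh(np.concatenate([X, P[:k][::-1]], axis=1) @ W1.T + b)
    s_bar = Xp.mean(axis=0)                        # average item embedding
    alpha = np.array([f @ sigmoid(W2 @ x + W3 @ s_bar + c) for x in Xp])
    return (alpha[:, None] * Xp).sum(axis=0)       # S_h
```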
5. The three-channel graph neural network-based session recommendation method of claim 4, wherein the method for embedding the global graph comprises the following steps:
each item is linearly combined according to the scores generated by session-aware attention, as follows:
h_N(vi) = Σ_{v_j ∈ N_g(v_i)} π(v_i, v_j) h_vj
where h_N(vi) is the aggregated neighbor representation, h_vj is the embedding of item v_j, and π(v_i, v_j) computes the weights of the different neighbors: items that are more consistent with the preference of the current session are more important and receive larger weights; π(v_i, v_j) is normalized as follows:
π(v_i, v_j) = softmax(π(v_i, v_j))
where LeakyReLU is used as the activation function, ⊙ denotes element-wise multiplication of corresponding positions, ‖ denotes the concatenation operation, w_ij ∈ R^1 is the weight of each edge in the global session graph, W_1 and q_1 are trainable parameters, and s is the feature of the target session, obtained by averaging the item embeddings of the current session; the coefficients of all neighboring items connected to v_i are then normalized by the softmax function; this attention determines which neighboring items should be attended to;
finally, the information of the target item and of its neighboring items is aggregated through a nonlinear transformation, as follows:
where h_v is the item representation, h'_v is the aggregated representation of the item and its neighbors, ReLU is the activation function, and W_2 ∈ R^{d×2d} is a trainable parameter;
to obtain higher-order information, the single-layer aggregator is extended to multiple layers; the formula of the previous step is then expressed as follows:
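A single aggregation step of the global graph channel might look like this sketch. The raw attention scores π(v_i, v_j) are taken as given inputs, since their exact computation depends on the omitted formula; the function name is illustrative.

```python
import numpy as np

def aggregate_global(h_v, h_neighbors, scores, W2):
    """One global-graph aggregation step: softmax the session-aware
    attention scores pi(v_i, v_j) over the neighbors, linearly combine
    the neighbor embeddings, then fuse the target item with its
    neighborhood via h'_v = ReLU(W2 [h_v ; h_N])."""
    s = np.asarray(scores, dtype=float)
    e = np.exp(s - s.max())
    alpha = e / e.sum()                                # normalized weights
    h_N = (alpha[:, None] * h_neighbors).sum(axis=0)   # weighted neighbor sum
    z = W2 @ np.concatenate([h_v, h_N])                # W2 in R^{d x 2d}
    return np.maximum(z, 0.0)                          # ReLU

```

Stacking this step (feeding h'_v back in as h_v with the next hop's neighbors) gives the multi-layer aggregator described above.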
6. The three-channel graph neural network-based session recommendation method of claim 5, wherein the method for fusing the item representations formed by the three channels to obtain more complete item-transition information comprises the following steps:
first, the item representations generated by the session graph channel and the global graph channel are fused;
for each item, the global representation and the session representation are combined to obtain the final item representation h*_v, calculated as follows:
a session embedding is then generated from the fused item embeddings; the position information of each item is added into the item embedding through a learnable position matrix P = [p_1, p_2, …, p_k]; the position information is integrated with the item representation through a concatenation operation and a nonlinear transformation, as follows:
for the session sequence after position information is added, to capture the dependencies between items, the session sequence is input into the self-attention layer:
where F is the session representation after attention, d is a scaling hyper-parameter, and W_Q, W_K, W_V ∈ R^{2d×d} are projection matrices;
a ReLU activation function adds nonlinearity to the model, and a residual connection is added after the feed-forward network, as follows:
E = ReLU(FW_1 + b_1)W_2 + b_2 + F
where E is the session representation after the residual connection, W_1 and W_2 are d×d matrices, and b_1 and b_2 are d-dimensional bias vectors; to prevent overfitting, Dropout regularization is applied during training; the self-attention mechanism is expressed as:
E=SAN(H)
finally, single-head self-attention is extended to multi-head self-attention, as follows:
E^(k) = SAN(E^(k-1))
where E^(k) is the session representation after k attention layers and E^(k-1) is the session representation after k-1 attention layers;
the session sequence after self-attention is represented as M = [m_1, m_2, …, m_k]; the weight of each node in the sequence is learned through soft attention, reflecting that different nodes have different degrees of importance to the sequence, as follows:
where α_i is the learned weight of each item, f, W_4 ∈ R^{d×d}, and W_5 ∈ R^{d×d} are trainable attention parameters, σ represents the soft-attention (sigmoid) function, s̄ is the average embedding of all the items in the session s, and m_i is the embedding of the i-th item in session s; finally, S_{s,g} and S_h are combined to form the final session representation S = S_{s,g} + S_h.
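One self-attention layer with the residual feed-forward network E = ReLU(FW_1 + b_1)W_2 + b_2 + F can be sketched as follows (a single-head NumPy sketch with dropout omitted; applying the function repeatedly yields the stacked form E^(k) = SAN(E^(k-1))).

```python
import numpy as np

def san_block(H, WQ, WK, WV, W1, b1, W2, b2):
    """One SAN layer: scaled dot-product self-attention
        F = softmax(Q K^T / sqrt(d)) V,  Q = H WQ, K = H WK, V = H WV,
    followed by the position-wise feed-forward network with a
    residual connection: E = ReLU(F W1 + b1) W2 + b2 + F."""
    d = WQ.shape[1]
    Q, K, V = H @ WQ, H @ WK, H @ WV
    scores = Q @ K.T / np.sqrt(d)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A = e / e.sum(axis=-1, keepdims=True)      # attention weights
    F = A @ V                                  # attended representation
    return np.maximum(F @ W1 + b1, 0.0) @ W2 + b2 + F
```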
7. The three-channel graph neural network-based session recommendation method of claim 6, wherein: in step (4), the embedding of each initial candidate item and the session representation obtained in the previous section are combined through a dot-product operation, followed by softmax, to obtain the recommendation probability, calculated as follows:
where S is the representation of the session and y'_i represents the probability that item v_i is selected next in the target session;
the model is trained by minimizing the objective function:
where y represents the one-hot encoding of the ground-truth item and y'_i is the predicted value.
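The prediction layer and training objective can be sketched as below, assuming the standard cross-entropy objective implied by the one-hot encoding (the patent does not spell out the exact loss formula; the function name is illustrative).

```python
import numpy as np

def predict_and_loss(S, item_embs, y_onehot):
    """Prediction layer: dot product of the session representation S
    with each candidate item embedding, softmax into probabilities
    y_hat, and cross-entropy L = -sum_i y_i log(y_hat_i)."""
    scores = item_embs @ S                    # z_i = S . x_i
    e = np.exp(scores - scores.max())
    y_hat = e / e.sum()                       # softmax probabilities
    loss = -float(np.sum(y_onehot * np.log(y_hat + 1e-12)))
    return y_hat, loss
```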
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210082137.0A CN114547276A (en) | 2022-01-24 | 2022-01-24 | Three-channel diagram neural network-based session recommendation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114547276A true CN114547276A (en) | 2022-05-27 |
Family
ID=81670731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210082137.0A Pending CN114547276A (en) | 2022-01-24 | 2022-01-24 | Three-channel diagram neural network-based session recommendation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114547276A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115545098A (en) * | 2022-09-23 | 2022-12-30 | 青海师范大学 | Node classification method of three-channel graph neural network based on attention mechanism |
CN115545098B (en) * | 2022-09-23 | 2023-09-08 | 青海师范大学 | Node classification method of three-channel graph neural network based on attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112035746A (en) | Session recommendation method based on space-time sequence diagram convolutional network | |
CN112364976B (en) | User preference prediction method based on session recommendation system | |
CN112989064B (en) | Recommendation method for aggregating knowledge graph neural network and self-adaptive attention | |
CN111581520B (en) | Item recommendation method and system based on item importance in session | |
CN108876044B (en) | Online content popularity prediction method based on knowledge-enhanced neural network | |
WO2021139415A1 (en) | Data processing method and apparatus, computer readable storage medium, and electronic device | |
CN114493755B (en) | Self-attention sequence recommendation method fusing time sequence information | |
Zarzour et al. | RecDNNing: a recommender system using deep neural network with user and item embeddings | |
CN112364242A (en) | Graph convolution recommendation system for context-aware type | |
CN111259264B (en) | Time sequence scoring prediction method based on generation countermeasure network | |
CN113487018A (en) | Global context enhancement graph neural network method based on session recommendation | |
Zhou et al. | Recommendation via collaborative autoregressive flows | |
CN114547276A (en) | Three-channel diagram neural network-based session recommendation method | |
Mu et al. | Auxiliary stacked denoising autoencoder based collaborative filtering recommendation | |
CN114780841B (en) | KPHAN-based sequence recommendation method | |
CN114842247B (en) | Characteristic accumulation-based graph convolution network semi-supervised node classification method | |
CN115470406A (en) | Graph neural network session recommendation method based on dual-channel information fusion | |
CN114741597A (en) | Knowledge-enhanced attention-force-diagram-based neural network next item recommendation method | |
CN114625969A (en) | Recommendation method based on interactive neighbor session | |
CN116263794A (en) | Double-flow model recommendation system and algorithm with contrast learning enhancement | |
CN114610862A (en) | Conversation recommendation method for enhancing context sequence of graph | |
CN112801076A (en) | Electronic commerce video highlight detection method and system based on self-attention mechanism | |
Mohan et al. | Representation learning for temporal networks using temporal random walk and deep autoencoder | |
Zhu et al. | Influential Recommender System | |
CN116485501B (en) | Graph neural network session recommendation method based on graph embedding and attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||