CN114547276A - Three-channel graph neural network-based session recommendation method - Google Patents


Info

Publication number
CN114547276A
Authority
CN
China
Prior art keywords
item
session
embedding
items
graph
Prior art date
Legal status
Pending
Application number
CN202210082137.0A
Other languages
Chinese (zh)
Inventor
杨青
张文祥
王逸丰
张敬伟
Current Assignee
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202210082137.0A
Publication of CN114547276A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of session recommendation, and in particular to a session recommendation method based on a three-channel graph neural network, comprising the following steps: (1) convert the session sequence data into session graph, hypergraph and global graph data; (2) learn three item embeddings from the graph data through a three-channel graph neural network, where the three channels comprise a session graph channel, a hypergraph channel and a global graph channel; the session graph channel captures transition relations among items within a session, the hypergraph channel captures high-order relations among items within a session, and the global graph channel captures relations among items across different sessions; (3) fuse the item representations produced by the three channels to obtain more complete item transition information; (4) output the predicted probabilities of the items through the prediction layer. The invention enables improved session recommendation.

Description

Three-channel graph neural network-based session recommendation method
Technical Field
The invention relates to the technical field of session recommendation, in particular to a session recommendation method based on a three-channel graph neural network.
Background
In recent years, Internet information has grown rapidly, and recommendation systems have become an effective means of relieving users of the information overload problem, playing an important role in consumption, services, decision making and other areas. Most existing recommendation methods make recommendations based on long-term historical user interactions and user profiles. In many services, however, the user identity may be unknown, or only the behavior within the current user session is available. Session-based recommendation, an emerging form of recommendation, remedies these deficiencies.
Because of the high practical value of session-based recommendation, many session-based recommendation methods have been proposed. The Markov chain is a classical example: it predicts the user's next behavior from the previous one, but the independent combination of past interactions may limit recommendation accuracy because of the strong independence assumption of Markov chains. Recommendation methods based on deep learning have developed rapidly, and many such methods are now available. For example, one approach built on recurrent neural networks improves the model through data augmentation and by accounting for temporal shifts in user behavior. GRU4REC makes recommendations by using a GRU to model short-term user preferences. NARM combines a GRU with an attention mechanism to learn sequential behavior and the user's main intent simultaneously. The Transformer model achieves state-of-the-art results on translation tasks; instead of recurrent or convolutional networks, it models sequences with an encoder-decoder structure built from stacked self-attention layers, and its success stems from this use of self-attention. Self-attention is a particular attention mechanism that has been widely used for modeling sequence data. SASRec was the earliest model to apply self-attention to recommendation, and it achieved state-of-the-art results. These methods obtain good results by modeling the preferences of a given session with pairwise item transition information. However, they still face problems: first, a single session may not contain enough user behavior to estimate a reliable user representation; second, these works model only one-way transitions between items and ignore transitions involving the context.
SR-GNN addresses these problems and was the first to apply graph neural networks to session recommendation: it models the sequence data as a graph structure and captures complex item transitions through a graph neural network. GCE-GNN also performs session recommendation based on graph neural networks, considering item transitions both within the target session and across different sessions. Session recommendation based on graph neural networks has achieved remarkable results. However, problems remain: existing graph-based session recommendation systems model the session sequence either as pairwise graph-structured data or as hypergraph-structured data, and modeling the session sequence as a single graph cannot capture complete item transition information, which reduces recommendation accuracy.
Disclosure of Invention
It is an object of the present invention to provide a session recommendation method based on a three-channel graph neural network that overcomes some or all of the deficiencies of the prior art.
The session recommendation method based on a three-channel graph neural network comprises the following steps:
(1) convert the session sequence data into session graph, hypergraph and global graph data;
(2) learn three item embeddings from the graph data through a three-channel graph neural network; the three channels comprise a session graph channel, a hypergraph channel and a global graph channel; the session graph channel captures transition relations among items within a session, the hypergraph channel captures high-order relations among items within a session, and the global graph channel captures relations among items across different sessions;
(3) fuse the item representations produced by the three channels to obtain more complete item transition information;
(4) output the predicted probabilities of the items through the prediction layer.
Preferably, in the session graph, a session is given as $S = [s_1, s_2, \ldots, s_L]$, where $s_i$ denotes the item $v_i$ clicked in session $S$ and $L$ is the session length; $G_s = (V_s, E_s)$ denotes the session graph, with each item $s_i \in V_s$ as a node and each pair of adjacent items $(s_{i-1}, s_i) \in E_s$ as an edge.

$G_h = (V_h, E_h)$ denotes the hypergraph, where $V_h$ is the set of $N$ distinct vertices and $E_h$ is the set of $M$ hyperedges. Each hyperedge $\epsilon \in E_h$ contains at least two vertices, with $\epsilon \subseteq V_h$. Each hyperedge is assigned a weight $w_{hh}$, and all weights form a diagonal matrix $W \in \mathbb{R}^{M \times M}$. The hypergraph is represented by an incidence matrix $H \in \mathbb{R}^{N \times M}$, where $H_{ih} = 1$ when hyperedge $h$ contains vertex $v_i \in V_h$, and $H_{ih} = 0$ otherwise. The degrees of the vertices and hyperedges are $D_{ii} = \sum_{h=1}^{M} w_{hh} H_{ih}$ and $B_{hh} = \sum_{i=1}^{N} H_{ih}$ respectively, where $D$ and $B$ are diagonal matrices.

The global graph is used to capture item transition information across different sessions: $G_g = (V_g, E_g)$ denotes the global graph, where $V_g$ is the set of graph nodes for all items and $E_g$ is the set of all edges, each edge corresponding to a pair of adjacent items in some session.
Preferably, the method for learning session graph embeddings comprises the following steps:

For a target item, different adjacent items have different degrees of importance, and an attention mechanism is used to capture the weights between nodes. The attention coefficient is:

$s_{ij} = \text{LeakyReLU}\big(\mathbf{a}_{r_{ij}}^{\top}(h_{v_i} \odot h_{v_j})\big)$

where $s_{ij}$ indicates the importance of item $v_j$ to item $v_i$, $h_{v_i}$ and $h_{v_j}$ are the embeddings of items $v_i$ and $v_j$, $r_{ij}$ is the relation (edge type) between $v_i$ and $v_j$, and $\mathbf{a} \in \mathbb{R}^{d}$ is a weight vector. Then, to make the coefficients comparable across nodes, the attention weights are normalized with the softmax function:

$\alpha_{ij} = \text{softmax}(s_{ij})$

where $\alpha_{ij}$ is the normalized weight between nodes $v_i$ and $v_j$.

Finally, the attention coefficients are linearly combined with the corresponding item embeddings to obtain the output of each node:

$h'_{v_i} = \sum_{v_j \in N_{v_i}} \alpha_{ij} h_{v_j}$

where $h'_{v_i}$ is the new representation of node $v_i$.
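The attention aggregation for a single target item can be sketched in numpy as follows; using one shared weight vector `a` and a LeakyReLU slope of 0.2 are hypothetical simplifications of the per-relation weights in the text:

```python
import numpy as np

def session_graph_attention(h, neighbors, a):
    """Attention over a node's neighbors in the session graph.

    h: (n, d) item embeddings; neighbors: indices of the target's neighbors;
    a: (d,) weight vector (one per edge-relation type in the full model).
    Returns the attention-weighted combination of the neighbor embeddings.
    """
    target = h[0]                                  # treat node 0 as the target item
    scores = np.array([
        np.maximum(0.2 * z, z)                     # LeakyReLU with slope 0.2
        for z in (a @ (target * h[j]) for j in neighbors)
    ])
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()                    # softmax-normalized weights
    return (alpha[:, None] * h[neighbors]).sum(axis=0)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
out = session_graph_attention(h, [1, 2, 3], rng.normal(size=8))
```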
Preferably, the hypergraph embedding method comprises the following steps:

The hypergraph channel is used to capture the high-order relations between items, and the hypergraph convolution is defined as:

$x_i^{(l+1)} = \sum_{j=1}^{N} \sum_{h=1}^{M} H_{ih} H_{jh} w_{hh}\, x_j^{(l)}$

where $x_i^{(l+1)}$ is the item embedding at layer $l+1$, $H_{ih}$ and $H_{jh}$ are entries of the incidence matrix, $w_{hh}$ is the hyperedge weight, and $x_j^{(l)}$ is the item embedding at layer $l$.

Written in matrix form, the above equation becomes:

$X^{(l+1)} = D^{-1} H W B^{-1} H^{\top} X^{(l)}$

where $X^{(l+1)}$ and $X^{(l)}$ are the item embeddings of layers $l+1$ and $l$ respectively; $H^{\top} X^{(l)}$ aggregates information from the nodes to the hyperedges, and multiplying by $H$ aggregates the information back from the hyperedges to the nodes. Starting from the initial item embedding $X^{(0)}$, after $L$ layers of hypergraph convolution the embeddings of all layers are summed and averaged as the final item embedding:

$X_h = \frac{1}{L+1} \sum_{l=0}^{L} X^{(l)}$
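Under the definitions above, and assuming unit hyperedge weights ($W = I$), the matrix-form convolution with layer averaging can be sketched as:

```python
import numpy as np

def hypergraph_convolution(X0, H, num_layers=2):
    """Apply L layers of X^(l+1) = D^-1 H W B^-1 H^T X^(l), then average.

    X0: (N, d) initial item embeddings; H: (N, M) incidence matrix.
    Hyperedge weights W are taken as the identity here (an assumption).
    """
    D_inv = np.diag(1.0 / H.sum(axis=1))          # inverse vertex-degree matrix
    B_inv = np.diag(1.0 / H.sum(axis=0))          # inverse hyperedge-degree matrix
    P = D_inv @ H @ B_inv @ H.T                   # row-stochastic propagation operator
    layers = [X0]
    for _ in range(num_layers):
        layers.append(P @ layers[-1])
    return sum(layers) / len(layers)              # average of layers 0..L

H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])                        # 3 items, 2 hyperedges
X0 = np.eye(3)
Xh = hypergraph_convolution(X0, H, num_layers=2)
```

Because the propagation operator is row-stochastic, each convolution layer preserves row sums, which makes the averaging step well behaved.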
The position information of the items is added to the item embeddings through position embedding: $P = [p_1, p_2, \ldots, p_k]$ is a learnable position embedding matrix, where $k$ is the length of the current session. The item embedding with position information added is:

$x_i^{*} = \tanh\big(W_1 [\,x_i \,\|\, p_{k-i+1}\,] + b\big)$

where $x_i^{*}$ is the item embedding with position information, $x_i$ is the item embedding without position information, $p_{k-i+1}$ is the (reversed-order) position embedding of the item, and $W_1 \in \mathbb{R}^{d \times 2d}$ and $b \in \mathbb{R}^{d}$ are learnable parameters. The session embedding is generated by aggregating the item representations in the session; the enhanced session representation $S_h$ is then obtained as:

$\alpha_i = f^{\top} \sigma\big(W_2 x_s + W_3 x_i^{*} + c\big)$

$S_h = \sum_{i=1}^{k} \alpha_i x_i^{*}$

where $\alpha_i$ is the weight of each node in the sequence, $x_s = \frac{1}{k}\sum_{i=1}^{k} x_i^{*}$ is the average embedding of all items in session $s$, $x_i^{*}$ is the embedding of the $i$-th item in session $s$, $c$ is a trainable coefficient, and $f \in \mathbb{R}^{d}$, $W_2 \in \mathbb{R}^{d \times d}$, $W_3 \in \mathbb{R}^{d \times d}$ are attention parameters.
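A minimal sketch of the reverse position embedding and soft-attention pooling above, taking $\sigma$ to be the sigmoid function (an assumption) and using random stand-ins for all learned parameters:

```python
import numpy as np

def session_embedding(X, P, W1, b, W2, W3, f, c):
    """Reverse position embedding followed by soft-attention pooling.

    X: (k, d) item embeddings of one session; P: (k, d) position embeddings.
    All weight arguments are random stand-ins for learned parameters.
    """
    k = X.shape[0]
    # x_i* = tanh(W1 [x_i || p_{k-i+1}] + b)
    Xs = np.tanh(np.concatenate([X, P[::-1]], axis=1) @ W1.T + b)
    xs = Xs.mean(axis=0)                          # average session embedding
    # alpha_i = f^T sigmoid(W2 x_s + W3 x_i* + c)
    alpha = (1.0 / (1.0 + np.exp(-(xs @ W2.T + Xs @ W3.T + c)))) @ f
    return (alpha[:, None] * Xs).sum(axis=0)      # S_h = sum_i alpha_i x_i*

rng = np.random.default_rng(1)
d, k = 8, 5
Sh = session_embedding(rng.normal(size=(k, d)), rng.normal(size=(k, d)),
                       rng.normal(size=(d, 2 * d)), rng.normal(size=d),
                       rng.normal(size=(d, d)), rng.normal(size=(d, d)),
                       rng.normal(size=d), rng.normal(size=d))
```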
Preferably, the method for learning global graph embeddings comprises the following steps:

Each neighboring item is linearly combined according to the score generated by session-aware attention:

$h_{N_{v_i}} = \sum_{v_j \in N_{v_i}^{g}} \pi(v_i, v_j)\, h_{v_j}$

where $h_{N_{v_i}}$ is the attention-weighted neighbor representation, $h_{v_j}$ is the embedding of item $v_j$, and $\pi(v_i, v_j)$ computes the weight of each neighbor; the more relevant an item is to the current session, the larger its weight. $\pi(v_i, v_j)$ is computed as:

$\pi(v_i, v_j) = q_1^{\top} \text{LeakyReLU}\big(W_1 [\,s \odot h_{v_j} \,\|\, w_{ij}\,]\big)$

$\pi(v_i, v_j) = \text{softmax}\big(\pi(v_i, v_j)\big)$

where LeakyReLU is used as the activation function, $\odot$ denotes element-wise multiplication, $\|$ denotes concatenation, $w_{ij} \in \mathbb{R}^{1}$ is the weight of the corresponding edge in the global graph, $W_1$ and $q_1$ are trainable parameters, and $s$ is the feature of the target session, obtained by averaging the item embeddings of the current session. The softmax function then normalizes the coefficients over all neighboring items connected to $v_i$; this attention determines which neighboring items should be focused on.

Finally, the information of the target item and of its neighboring items is aggregated through a non-linear transformation:

$h_{v}^{g} = \text{ReLU}\big(W_2 [\,h_v \,\|\, h_{N_v}\,]\big)$

where $h_v$ is the item representation, $h_{v}^{g}$ is the aggregation of the item representation and the neighbor representation, ReLU is the activation function, and $W_2 \in \mathbb{R}^{d \times 2d}$ is a trainable parameter.

To obtain higher-order information, the single-layer aggregator is extended to multiple layers:

$h_{v}^{(k)} = \text{Agg}\big(h_{v}^{(k-1)}, h_{N_v}^{(k-1)}\big)$

where $h_{v}^{(k-1)}$ is the representation of item $v$ generated in the previous step, $h_{N_v}^{(k-1)}$ is the neighbor embedding of the previous layer, Agg is the aggregation operation, and $h_{v}^{(k)}$ is the aggregated $k$-th order item representation; the $k$-th order representation of an item mixes its initial representation with those of its neighbors up to $k$ hops away.
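A hedged numpy sketch of one session-aware aggregation layer of the global graph channel, with random stand-ins for the trainable parameters $W_1$, $q_1$ and $W_2$:

```python
import numpy as np

def leaky_relu(z, slope=0.2):
    return np.where(z > 0, z, slope * z)

def global_graph_layer(h, neighbors, edge_w, s, W1, q1, W2):
    """One aggregation layer of the global graph channel for every node.

    h: (n, d) item embeddings; neighbors[i]: neighbor indices of node i;
    edge_w[i]: matching global-graph edge weights; s: (d,) session feature
    (mean of the current session's item embeddings).
    """
    out = np.zeros_like(h)
    for i in range(h.shape[0]):
        js, ws = neighbors[i], edge_w[i]
        # pi(v_i, v_j) = q1^T LeakyReLU(W1 [s * h_j || w_ij]), then softmax
        feats = np.stack([np.concatenate([s * h[j], [w]]) for j, w in zip(js, ws)])
        pi = leaky_relu(feats @ W1.T) @ q1
        pi = np.exp(pi - pi.max()); pi /= pi.sum()
        h_neigh = (pi[:, None] * h[js]).sum(axis=0)
        # h_v^g = ReLU(W2 [h_v || h_neigh])
        out[i] = np.maximum(0.0, W2 @ np.concatenate([h[i], h_neigh]))
    return out

rng = np.random.default_rng(2)
d, n = 8, 4
h = rng.normal(size=(n, d))
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
edge_w = {0: [2.0, 1.0], 1: [2.0, 1.0], 2: [1.0], 3: [1.0]}
s = h[:2].mean(axis=0)                            # target session = items 0 and 1
hg = global_graph_layer(h, neighbors, edge_w, s,
                        rng.normal(size=(d, d + 1)), rng.normal(size=d),
                        rng.normal(size=(d, 2 * d)))
```

Calling the layer repeatedly on its own output corresponds to the multi-layer aggregator $h_v^{(k)} = \text{Agg}(h_v^{(k-1)}, h_{N_v}^{(k-1)})$.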
Preferably, the method for fusing the item representations formed by the three channels to obtain more complete item transition information comprises the following steps:

First, the item representations generated by the session graph channel and the global graph channel are fused. For each item, the final item representation is obtained by merging its global representation and its session representation:

$h'_{v} = h_{v}^{g} + h_{v}^{s}$

A session embedding is then generated from the fused item embeddings. The position information of the items is added through the learnable position matrix $P = [p_1, p_2, \ldots, p_k]$ and integrated with the item representations through a concatenation operation and a non-linear transformation:

$h_i^{*} = \tanh\big(W [\,h'_{v_i} \,\|\, p_{k-i+1}\,] + b\big)$

The session sequence with position information added is $H = [h_1^{*}, h_2^{*}, \ldots, h_k^{*}]$. To capture the dependencies between items, the session sequence is fed into a self-attention layer:

$F = \text{softmax}\Big(\frac{(H W^{Q})(H W^{K})^{\top}}{\sqrt{d}}\Big)(H W^{V})$

where $F$ is the session representation after attention, $d$ is a hyperparameter, and $W^{Q}, W^{K}, W^{V} \in \mathbb{R}^{2d \times d}$ are projection matrices.

A ReLU activation function adds non-linearity to the model, and a residual connection is added after the feed-forward network:

$E = \text{ReLU}(F W_1 + b_1) W_2 + b_2 + F$

where $E$ is the session representation after the residual connection, $W_1$ and $W_2$ are $d \times d$ matrices, and $b_1$ and $b_2$ are $d$-dimensional bias vectors. To prevent overfitting, Dropout regularization is added during training. The self-attention mechanism is written as:

$E = \text{SAN}(H)$

Finally, single self-attention is extended to multi-head (stacked) self-attention:

$E^{(k)} = \text{SAN}\big(E^{(k-1)}\big)$

where $E^{(k)}$ is the session representation after $k$ attention layers and $E^{(k-1)}$ is the session representation after $k-1$ attention layers.

The session sequence after self-attention is denoted $M = [m_1, m_2, \ldots, m_k]$. The weight of each node in the sequence is learned through soft attention, reflecting that different nodes have different degrees of importance to the sequence:

$\alpha_i = f^{\top} \sigma\big(W_4 m_s + W_5 m_i + c\big)$

$S_{s,g} = \sum_{i=1}^{k} \alpha_i m_i$

where $f \in \mathbb{R}^{d}$, $W_4 \in \mathbb{R}^{d \times d}$ and $W_5 \in \mathbb{R}^{d \times d}$ are trainable attention parameters, $\sigma$ denotes the soft attention activation, $m_s = \frac{1}{k}\sum_{i=1}^{k} m_i$ is the average embedding of all items in session $s$, and $m_i$ is the embedding of the $i$-th item in session $s$. Finally, $S_{s,g}$ and $S_h$ are combined to form the final session representation $S = S_{s,g} + S_h$.
Preferably, in step (4), the probability of each item being recommended is obtained by taking the dot product of each candidate item's initial embedding with the session representation obtained above, followed by softmax:

$\hat{y}_i = \text{softmax}\big(S^{\top} h_{v_i}\big)$

where $S$ is the session representation and $\hat{y}_i$ denotes the probability that item $v_i$ will be selected next in the target session. The model is trained by minimizing the cross-entropy objective:

$\mathcal{L} = -\sum_{i=1}^{N} y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)$

where $y_i$ is the one-hot encoding of the ground-truth item, $\hat{y}_i$ is the predicted value, and $\mathcal{L}$ is the loss function.
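The prediction layer and training objective can be illustrated as follows; this is a sketch using the cross-entropy form written above, not the patented implementation:

```python
import numpy as np

def predict_and_loss(S, item_emb, target):
    """Score candidates by dot product with the session vector, apply softmax,
    then compute the cross-entropy against the one-hot target."""
    logits = item_emb @ S                          # y_hat_i ~ S . h_vi
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                          # softmax over all candidates
    y = np.zeros_like(probs); y[target] = 1.0     # one-hot ground truth
    eps = 1e-12                                   # numerical guard for log
    loss = -np.sum(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps))
    return probs, loss

rng = np.random.default_rng(3)
probs, loss = predict_and_loss(rng.normal(size=8), rng.normal(size=(10, 8)), target=4)
```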
The present invention fuses the item representations generated by the three channels to improve the performance of session-based recommendation. It models the session sequence as three kinds of graph-structured data simultaneously in order to capture richer item transition relations. Extensive experiments on two real-world datasets demonstrate the effectiveness and superiority of the model.
Drawings
Fig. 1 is a flowchart of a session recommendation method based on a three-channel graph neural network in embodiment 1;
fig. 2 is a model structure diagram of a three-channel graph neural network (MCG-SR) in example 1.
Detailed Description
For a further understanding of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples. It is to be understood that the examples are illustrative of the invention and not limiting.
Example 1
As shown in fig. 1, the present embodiment provides a session recommendation method based on a three-channel graph neural network, which comprises the following steps:
(1) convert the session sequence data into session graph, hypergraph and global graph data;
(2) learn three item embeddings from the graph data through a three-channel graph neural network; the three channels comprise a session graph channel, a hypergraph channel and a global graph channel; the session graph channel captures transition relations among items within a session, the hypergraph channel captures high-order relations among items within a session, and the global graph channel captures relations among items across different sessions;
(3) fuse the item representations produced by the three channels to obtain more complete item transition information;
(4) output the predicted probabilities of the items through the prediction layer.
This embodiment provides a three-channel graph neural network model for session recommendation (MCG-SR), which captures richer item transition relations of the session. Fig. 2 shows the overall structure of the model (where s1, s2 and s3 are sessions and v denotes items): first the session sequence data is converted into three graph-structured data forms; then three item embeddings are learned from the graph data through the three-channel graph neural network; the three representations are then fused; and finally the prediction probabilities of the items are output through the prediction layer.
In the session graph, each session sequence is converted into graph-structured data, and the item embeddings of a given session are learned through a GNN. A session is given as $S = [s_1, s_2, \ldots, s_L]$, where $s_i$ denotes the item $v_i$ clicked in session $S$ and $L$ is the session length; $G_s = (V_s, E_s)$ denotes the session graph, with each item $s_i \in V_s$ as a node and each pair of adjacent items $(s_{i-1}, s_i) \in E_s$ as an edge. There are four types of edge relations in the session graph: $r_{in}$, $r_{out}$, $r_{in\text{-}out}$ and $r_{self}$; each item is given a self-loop.
In the hypergraph, the session sequences are converted into hypergraph-structured data, and the high-order transition relations between items are learned. $G_h = (V_h, E_h)$ denotes the hypergraph, where $V_h$ is the set of $N$ distinct vertices and $E_h$ is the set of $M$ hyperedges. Each hyperedge contains at least two vertices, and it is noted that any two vertices on the same hyperedge are connected, with $\epsilon \in E_h$ and $\epsilon \subseteq V_h$. Each hyperedge is assigned a weight $w_{hh}$, and all weights form a diagonal matrix $W \in \mathbb{R}^{M \times M}$. The hypergraph is represented by an incidence matrix $H \in \mathbb{R}^{N \times M}$, where $H_{ih} = 1$ when hyperedge $h$ contains vertex $v_i \in V_h$, and $H_{ih} = 0$ otherwise. The degrees of the vertices and hyperedges are $D_{ii} = \sum_{h=1}^{M} w_{hh} H_{ih}$ and $B_{hh} = \sum_{i=1}^{N} H_{ih}$ respectively, where $D$ and $B$ are diagonal matrices.
The global graph is constructed mainly to capture item transition information across sessions: converting the items involved in all sessions into graph-structured data yields the global graph. It is worth noting that the session graph is directed while the global graph is undirected. $G_g = (V_g, E_g)$ denotes the global graph, where $V_g$ is the set of graph nodes for all items and $E_g$ is the set of all edges, each edge corresponding to a pair of adjacent items in some session. Different edges of a node $v_i$ should have different weights, and the magnitude of a weight depends on the frequency of occurrence of the corresponding edge across sessions.
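A simplified sketch of this construction (the actual model may use a wider co-occurrence window; here only directly adjacent pairs are counted, and the function name is hypothetical):

```python
from collections import Counter

def build_global_graph(sessions):
    """Undirected global graph: the weight of edge {u, v} counts how often
    items u and v appear adjacently in any session."""
    weights = Counter()
    for s in sessions:
        for a, b in zip(s, s[1:]):
            if a != b:
                weights[frozenset((a, b))] += 1   # undirected: {u, v} == {v, u}
    return weights

w = build_global_graph([[1, 2, 3], [2, 1, 4], [3, 2]])
```

The pair {1, 2} appears adjacently in two sessions, so its edge weight is 2.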
The method for learning session graph embeddings comprises the following steps:

The transitions of paired items in the current session can be learned through the session graph, and they are more important than transitions of the same items appearing in the global graph. For a target item, different adjacent items have different degrees of importance, and an attention mechanism is used to capture the weights between nodes. The attention coefficient is:

$s_{ij} = \text{LeakyReLU}\big(\mathbf{a}_{r_{ij}}^{\top}(h_{v_i} \odot h_{v_j})\big)$

where $s_{ij}$ represents the importance of node $v_j$ to node $v_i$, $r_{ij}$ is the relation (edge type) between $v_i$ and $v_j$, $\mathbf{a} \in \mathbb{R}^{d}$ is a weight vector, and LeakyReLU is used as the activation function. Then, to make the coefficients comparable across nodes, the attention weights are normalized with the softmax function:

$\alpha_{ij} = \text{softmax}(s_{ij})$

Finally, the attention coefficients are linearly combined with the corresponding item embeddings to obtain the output of each node:

$h'_{v_i} = \sum_{v_j \in N_{v_i}} \alpha_{ij} h_{v_j}$

where $h'_{v_i}$ is the new representation of node $v_i$.
The hypergraph embedding method comprises the following steps:

The hypergraph channel is used to capture the high-order relations between items, and the hypergraph convolution is defined as:

$x_i^{(l+1)} = \sum_{j=1}^{N} \sum_{h=1}^{M} H_{ih} H_{jh} w_{hh}\, x_j^{(l)}$

where $x_i^{(l+1)}$ is the item embedding at layer $l+1$, $H_{ih}$ and $H_{jh}$ are entries of the incidence matrix, $w_{hh}$ is the hyperedge weight, and $x_j^{(l)}$ is the item embedding at layer $l$.

Written in matrix form, the above equation becomes:

$X^{(l+1)} = D^{-1} H W B^{-1} H^{\top} X^{(l)}$

where $X^{(l+1)}$ and $X^{(l)}$ are the item embeddings of layers $l+1$ and $l$ respectively; $H^{\top} X^{(l)}$ aggregates information from the nodes to the hyperedges, and multiplying by $H$ aggregates the information back from the hyperedges to the nodes. Starting from the initial item embedding $X^{(0)}$, after $L$ layers of hypergraph convolution the embeddings of all layers are summed and averaged as the final item embedding:

$X_h = \frac{1}{L+1} \sum_{l=0}^{L} X^{(l)}$
The position information of the items is added to the item embeddings through position embedding: $P = [p_1, p_2, \ldots, p_k]$ is a learnable position embedding matrix, where $k$ is the length of the current session. The item embedding with position information added is:

$x_i^{*} = \tanh\big(W_1 [\,x_i \,\|\, p_{k-i+1}\,] + b\big)$

where $x_i^{*}$ is the item embedding with position information, $x_i$ is the item embedding without position information, $p_{k-i+1}$ is the (reversed-order) position embedding of the item, and $W_1 \in \mathbb{R}^{d \times 2d}$ and $b \in \mathbb{R}^{d}$ are learnable parameters. The session embedding is generated by aggregating the item representations in the session; the enhanced session representation $S_h$ is then obtained as:

$\alpha_i = f^{\top} \sigma\big(W_2 x_s + W_3 x_i^{*} + c\big)$

$S_h = \sum_{i=1}^{k} \alpha_i x_i^{*}$

where $\alpha_i$ is the weight of each node in the sequence, $x_s = \frac{1}{k}\sum_{i=1}^{k} x_i^{*}$ is the average embedding of all items in session $s$, $x_i^{*}$ is the embedding of the $i$-th item in session $s$, $c$ is a trainable coefficient, and $f \in \mathbb{R}^{d}$, $W_2 \in \mathbb{R}^{d \times d}$, $W_3 \in \mathbb{R}^{d \times d}$ are attention parameters.
The method for learning global graph embeddings comprises the following steps:

An item may appear in multiple sessions, and its transition information can be obtained from different sessions in order to capture cross-session item transitions. It is important to measure whether the information in the global graph is related to the user preferences of the target session, so the importance of items is differentiated through session-aware attention. Each neighboring item is linearly combined according to the score generated by session-aware attention:

$h_{N_{v_i}} = \sum_{v_j \in N_{v_i}^{g}} \pi(v_i, v_j)\, h_{v_j}$

where $h_{N_{v_i}}$ is the attention-weighted neighbor representation, $h_{v_j}$ is the embedding of item $v_j$, and $\pi(v_i, v_j)$ computes the weight of each neighbor; the more relevant an item is to the current session, the larger its weight. $\pi(v_i, v_j)$ is computed as:

$\pi(v_i, v_j) = q_1^{\top} \text{LeakyReLU}\big(W_1 [\,s \odot h_{v_j} \,\|\, w_{ij}\,]\big)$

$\pi(v_i, v_j) = \text{softmax}\big(\pi(v_i, v_j)\big)$

where LeakyReLU is used as the activation function, $\odot$ denotes element-wise multiplication, $\|$ denotes concatenation, $w_{ij} \in \mathbb{R}^{1}$ is the weight of the corresponding edge in the global graph, $W_1$ and $q_1$ are trainable parameters, and $s$ is the feature of the target session, obtained by averaging the item embeddings of the current session. The softmax function then normalizes the coefficients over all neighboring items connected to $v_i$; this attention determines which neighboring items should be focused on.

Finally, the information of the target item and of its neighboring items is aggregated through a non-linear transformation:

$h_{v}^{g} = \text{ReLU}\big(W_2 [\,h_v \,\|\, h_{N_v}\,]\big)$

where $h_v$ is the item representation, $h_{v}^{g}$ is the aggregation of the item representation and the neighbor representation, ReLU is the activation function, and $W_2 \in \mathbb{R}^{d \times 2d}$ is a trainable parameter.

To obtain higher-order information, the single-layer aggregator is extended to multiple layers:

$h_{v}^{(k)} = \text{Agg}\big(h_{v}^{(k-1)}, h_{N_v}^{(k-1)}\big)$

where $h_{v}^{(k-1)}$ is the representation of item $v$ generated in the previous step, $h_{N_v}^{(k-1)}$ is the neighbor embedding of the previous layer, Agg is the aggregation operation, and $h_{v}^{(k)}$ is the aggregated $k$-th order item representation; the $k$-th order representation of an item mixes its initial representation with those of its neighbors up to $k$ hops away.
The method for obtaining more complete project conversion information by fusing project representations formed by three channels comprises the following steps:
firstly, fusing project representations generated by a session graph channel and a global graph channel;
for each project, the final project representation is obtained by combining the global representation and the session representation, and the specific calculation is as follows: (use of dropout on the global representation to avoid overfitting)
Figure BDA0003486431230000117
Generating a session insert via an item insert generated by fusing a session graph channel and a global graph channel
Figure BDA0003486431230000121
The position information of the item is determined by a learnable position matrix P ═ P1,p2,...,pk]Adding the position information of the project into the project embedding; then integrating the position information with the project expression through connection operation and nonlinear transformation; as follows:
Figure BDA0003486431230000122
for conversation sequences after the addition of position information
Figure BDA0003486431230000123
To capture location information between items, a sequence of sessions is input from the attention layer:
Figure BDA0003486431230000124
wherein F is the conversation representation after attention, d is the hyper-parameter, WQ、WK、WV∈R2d×dIs a projection matrix;
The self-attention above captures only linear relations; to enhance the representation of the session sequence, nonlinearity is added to the model through the ReLU activation function, and a residual connection is appended after the feed-forward network, as follows:
E = Relu(F W1 + b1) W2 + b2 + F
E is the session representation after the residual connection; W1 and W2 are d×d matrices, and b1 and b2 are d-dimensional bias vectors. To prevent overfitting, Dropout regularization is applied during training. The whole self-attention mechanism is denoted:
E = SAN(H)
Finally, the single-head self-attention is extended to multi-head self-attention as follows:
E^(k) = SAN(E^(k-1))
where E^(k) is the session representation after k attention layers and E^(k-1) is the session representation after k-1 attention layers;
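The stacked self-attention layer described above can be sketched in NumPy as follows; the dimensions, initialization, and the single d-to-d projection (standing in for the 2d-to-d projection matrices) are illustrative assumptions, not the exact parameterization:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def san_layer(Z, Wq, Wk, Wv, W1, b1, W2, b2):
    # Scaled dot-product self-attention over the session sequence Z (k x d)
    d = Wq.shape[1]
    F = softmax((Z @ Wq) @ (Z @ Wk).T / np.sqrt(d)) @ (Z @ Wv)
    # ReLU feed-forward network with a residual connection:
    # E = ReLU(F W1 + b1) W2 + b2 + F
    return np.maximum(F @ W1 + b1, 0.0) @ W2 + b2 + F

def stacked_san(Z, params, n_layers=2):
    # E^(k) = SAN(E^(k-1)): apply the layer repeatedly (shared params here)
    E = Z
    for _ in range(n_layers):
        E = san_layer(E, *params)
    return E
```

A run on random inputs preserves the (k, d) shape of the session sequence, as the residual form requires.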
The session sequence after self-attention is denoted M = [m1, m2, ..., mk]. The weight of each node with respect to the sequence is learned through soft attention, reflecting that different nodes have different degrees of importance for the sequence, as follows:
αi = f^T σ(W4 xs + W5 mi + c)
Ss,g = Σ_{i=1}^{k} αi mi
where f and αi are trainable parameters and σ denotes the sigmoid function; xs is the embedding of session s, represented by the average embedding of all items in the session, xs = (1/k) Σ_{i=1}^{k} mi; mi is the embedding of the i-th item in session s; f ∈ R^d and W4, W5 ∈ R^{d×d} are attention parameters. Finally, Ss,g and Sh are combined to form the final session representation S = Ss,g + Sh.
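A minimal sketch of this soft-attention readout, under the assumption that σ is the sigmoid function and xs is the mean of the item embeddings (function names are our own):

```python
import numpy as np

def soft_attention_readout(M, f, W4, W5, c):
    # x_s: session embedding as the average of all item embeddings
    x_s = M.mean(axis=0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    # alpha_i = f^T sigmoid(W4 x_s + W5 m_i + c)
    alpha = np.array([f @ sigmoid(W4 @ x_s + W5 @ m + c) for m in M])
    # S_{s,g} = sum_i alpha_i m_i
    return alpha @ M
```

The readout collapses the (k, d) sequence into a single d-dimensional session vector.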
In step (4), the embedding of each initial candidate item and the session representation obtained above are combined by a dot-product operation, followed by softmax, to obtain the recommendation probability, specifically calculated as follows:
ŷi = softmax(S^T hvi)
where S is the session representation and ŷi denotes the probability that item vi is selected next in the target session. The model is trained by minimizing the objective function:
L = - Σ_{i=1}^{N} [ yi log(ŷi) + (1 - yi) log(1 - ŷi) ]
where y denotes the one-hot encoding of the ground-truth item, L is the loss function, and ŷi is the predicted value.
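The prediction layer and a plain cross-entropy training objective over the softmax scores can be sketched as follows (names are illustrative):

```python
import numpy as np

def predict(session_repr, item_embeddings):
    # y_hat_i = softmax(S . h_vi) over all candidate items
    logits = item_embeddings @ session_repr
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cross_entropy(y_onehot, y_hat, eps=1e-12):
    # L = -sum_i [ y_i log(y_hat_i) + (1 - y_i) log(1 - y_hat_i) ]
    return -np.sum(y_onehot * np.log(y_hat + eps)
                   + (1 - y_onehot) * np.log(1 - y_hat + eps))
```

The softmax guarantees the candidate scores form a probability distribution, so the top-N items can be read off directly for recommendation.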
Experiments
We describe the datasets used in the experiments and the comparison models, compare the proposed method with other methods, and design an ablation study to investigate the contribution of each channel to the model.
Data set
We evaluate the proposed method on two real-world datasets, Tmall and Nowplaying. The Tmall dataset comes from the IJCAI-15 competition and contains anonymous users' shopping logs on the Tmall online shopping platform.
To make a fair comparison, we preprocess both datasets. In the preprocessing we filter out sessions of length 1 and items that occur fewer than 5 times. We split the data into training data and test data, and segment each session to generate the corresponding labels. The statistics of the datasets are summarized in Table 1.
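The preprocessing described above can be sketched as follows; the thresholds follow the text, while the function and variable names are our own:

```python
from collections import Counter

def preprocess(sessions, min_count=5):
    # Count item occurrences over all sessions
    counts = Counter(item for s in sessions for item in s)
    # Filter out items occurring fewer than min_count times,
    # then drop sessions of length 1 (or shorter)
    kept = [[i for i in s if counts[i] >= min_count] for s in sessions]
    kept = [s for s in kept if len(s) > 1]
    # Segment each session into (prefix, label) training pairs
    return [(s[:t], s[t]) for s in kept for t in range(1, len(s))]
```

For example, a session [v1, v2, v3] yields the pairs ([v1], v2) and ([v1, v2], v3), each labeled with the next clicked item.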
TABLE 1 data set
To evaluate the proposed method, we compare it with representative baselines and state-of-the-art methods.
POP: recommends the top-N most popular items in the training set.
Item-KNN: recommends items according to the similarity between items of the current session and items of other sessions, where similarity is defined by cosine similarity.
FPMC: a hybrid model combining matrix factorization and first-order Markov chains, one of the earlier models for sequential data.
GRU4Rec: an RNN-based session recommendation model that models user sequences.
STAMP: a model that considers both the user's long-term preferences and current preferences.
SR-GNN: an early model applying graph neural networks to recommendation; it uses a gated graph neural network to obtain item embeddings and make recommendations.
DHCN: a model that applies hypergraphs to model session sequences.
GCE-GNN: considers not only the item transitions within a session but also the transitions among items of all sessions, achieving state-of-the-art results in session recommendation.
Evaluation index
The dimension of the latent vectors is set to 100, and the mini-batch size of the model is also 100. For a fair comparison, we set the hyper-parameters of each model to the same values. We use the Adam optimizer with an initial learning rate of 0.001, decayed by a factor of 0.1 every three epochs. The L2 penalty is set to 10^-5, and the dropout rate is searched in {0.1, 0.2, ...}.
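The stated learning-rate schedule (initial 0.001, decayed by a factor of 0.1 every three epochs) corresponds to a simple step decay, sketched here as a standalone function:

```python
def step_lr(epoch, base_lr=0.001, decay=0.1, step=3):
    # lr = base_lr * decay^(epoch // step): decay by 0.1 every `step` epochs
    return base_lr * decay ** (epoch // step)
```

In a PyTorch training loop, the same schedule would typically be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)`.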
Model performance
The results of the experiment are shown in table 2:
TABLE 2 results of the experiment
Our MCG-SR model is evaluated against the 8 baseline models on two evaluation metrics over the two datasets; the best results are highlighted in bold. By analyzing the results in Table 2, we can draw the following conclusions.
The traditional methods (such as POP and Item-KNN) show a marked gap from the recently proposed methods (such as GRU4Rec, STAMP, SR-GNN and DHCN). Unlike the traditional models, the recent methods model sequential dependencies, which demonstrates the importance of sequence information for session recommendation. The recently proposed methods all apply deep learning techniques, which illustrates the key role of deep learning in session recommendation.
Among the recently proposed methods, STAMP performs significantly better than GRU4Rec. GRU4Rec is based on recurrent neural network modeling, while STAMP is a purely attention-based method. GRU4Rec only considers sequential behavior and has difficulty coping with shifts in user preference, whereas an attention-based model assigns different importance to different items and can predict the user's behavior more accurately.
The latest graph neural network-based session recommendation methods outperform the models based on recurrent neural networks and attention mechanisms. Our MCG-SR model models the session sequence as three graphs: a session graph, a hypergraph, and a global graph. The transition information among items in the three graphs is captured by graph neural networks, and the information captured from the three graphs is then fused to obtain richer item transition information. As shown in Table 2, our model outperforms all baseline models, including the latest graph neural network-based models (e.g., SR-GNN, GCE-GNN, DHCN), by a large margin, which indicates the effectiveness of our approach.
Ablation study
To investigate the contribution of each channel to our model, we design three variants: MCG-SR-S, MCG-SR-G and MCG-SR-H. MCG-SR-S is the version without the session graph channel, MCG-SR-G the version without the global graph channel, and MCG-SR-H the version without the hypergraph channel. We compare these three variants with MCG-SR, DHCN and GCE-GNN on both the Tmall and Nowplaying datasets.
Table 3 shows the different effects of the three channels on the two datasets. When the session graph channel is removed, performance drops substantially on the Tmall dataset compared with MCG-SR, while on the Nowplaying dataset P@20 drops slightly but MRR@20 rises. When the global graph channel is removed, P@20 rises and MRR@20 drops on both datasets compared with MCG-SR. When the hypergraph channel is removed, both evaluation metrics drop on both datasets compared with MCG-SR. We can therefore conclude that our model improves on both metrics of both datasets only when all three channels are present, which verifies that our model can capture more complete item transition relationships and improves its generalization ability.
TABLE 3
This embodiment provides a three-channel graph neural network model for session recommendation. Previous graph neural network-based session recommendation methods typically model the session sequence as a single graph, which prevents the model from capturing richer item transitions. Our model simultaneously models the session sequence as three graphs: a session graph, a hypergraph, and a global graph. The transition information among items in the three graphs is captured by graph neural networks, and the information captured from the three graphs is then fused to capture richer item transition information. Experiments on two datasets demonstrate that the model achieves substantial advantages, proving its effectiveness.
The present invention and its embodiments have been described above schematically, and the description is not limiting; what is shown in the drawings is only one of the embodiments of the present invention, and the actual structure is not limited thereto. Therefore, similar structures and embodiments designed by a person skilled in the art in light of this teaching, without inventive effort and without departing from the spirit of the invention, shall fall within the protection scope of the invention.

Claims (7)

1. A session recommendation method based on a three-channel graph neural network, characterized by comprising the following steps:
(1) converting the session sequence data into session graph, hypergraph and global graph data;
(2) learning three item embeddings from the graph data through the graph neural networks of three channels, the three channels comprising a session graph channel, a hypergraph channel and a global graph channel, wherein the session graph channel is used for capturing transition relations among items within a session, the hypergraph channel is used for capturing high-order relations among items within a session, and the global graph channel is used for capturing relations among items across different sessions;
(3) fusing the item representations formed by the three channels to obtain more complete item transition information;
(4) outputting the predicted probabilities of the items through the prediction layer.
2. The session recommendation method based on a three-channel graph neural network according to claim 1, wherein: in the session graph, a session S = [v1, v2, ..., vL] is given, where vi denotes an item clicked in session S and L is the session length; Gs = (Vs, Es) denotes the session graph, with each item si ∈ Vs as a node and each pair of adjacent items (si-1, si) ∈ Es as an edge;
Gh = (Vh, Eh) denotes the hypergraph, where Vh is the set of N non-repeating vertices in the hypergraph and Eh is the set of M hyperedges; each hyperedge ε ∈ Eh contains at least two vertices, with ε ⊆ Vh and |ε| ≥ 2; each hyperedge is assigned a weight whh, and all the weights form a diagonal matrix W ∈ R^{M×M}; the hypergraph is represented by an incidence matrix H ∈ R^{N×M}, where Hih = 1 if hyperedge εh contains vertex vi ∈ Vh and Hih = 0 otherwise; the degrees of the vertices and hyperedges are Dii = Σ_{h=1}^{M} whh Hih and Bhh = Σ_{i=1}^{N} Hih respectively, where D and B are diagonal matrices;
the global graph is used to obtain item transition information between different sessions; Gg = (Vg, Eg) denotes the global graph, where Vg is the set of graph nodes of all items and Eg is the set of all edges, each edge corresponding to a pair of adjacent items in any session.
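Under the definitions above, treating each session as one hyperedge with unit weight, the incidence and degree matrices can be constructed as follows (helper names and the unit weights are our own illustrative assumptions):

```python
import numpy as np

def hypergraph_matrices(sessions, n_items):
    # H in R^{N x M}: H[i, h] = 1 iff hyperedge (session) h contains vertex i
    M = len(sessions)
    H = np.zeros((n_items, M))
    for h, s in enumerate(sessions):
        for i in set(s):
            H[i, h] = 1.0
    w = np.ones(M)                 # hyperedge weights w_hh (unit here)
    W = np.diag(w)
    D = np.diag(H @ w)             # D_ii = sum_h w_hh H_ih
    B = np.diag(H.sum(axis=0))     # B_hh = sum_i H_ih
    return H, W, D, B
```

Repeated clicks on the same item within a session contribute a single vertex to that hyperedge, matching the non-repeating vertex set Vh.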
3. The session recommendation method based on a three-channel graph neural network according to claim 2, wherein: the session graph embedding method comprises the following steps:
for a target item, different adjacent items have different degrees of importance to it, and an attention mechanism is used to capture the weights between different nodes; the attention coefficient is as follows:
sij = a_rij^T (hvi ⊙ hvj)
where sij indicates the importance of item vj to item vi, hvi ⊙ hvj represents the relationship between items, hvi is the embedding of item vi, hvj is the embedding of item vj, rij is the relation between vi and vj, i.e. the edge relation, and a ∈ R^d is a weight vector; then, to make the coefficients comparable across different nodes, the attention weights are normalized by the softmax function, as shown in the following equation:
αij = softmax(sij)
where αij is the normalized weight between nodes vi and vj;
finally, the obtained attention coefficients are linearly combined with the corresponding items to obtain the output of each node, with the formula as follows:
h'vi = Σ_{vj ∈ N(vi)} αij hvj
where h'vi is the representation of node vi.
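A simplified sketch of this session-graph attention; folding the edge-relation term rij into a single weight vector is our simplification, not the exact parameterization:

```python
import numpy as np

def session_graph_attention(h, neighbors, a):
    # h: (n, d) node embeddings; neighbors: {i: [j, ...]} adjacency lists
    out = h.copy()
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue
        # s_ij = a^T (h_i * h_j): importance of item v_j to item v_i
        s = np.array([a @ (h[i] * h[j]) for j in nbrs])
        # alpha_ij = softmax(s_ij) over the neighbors of v_i
        alpha = np.exp(s - s.max())
        alpha /= alpha.sum()
        # output: attention-weighted combination of neighbor embeddings
        out[i] = sum(w * h[j] for w, j in zip(alpha, nbrs))
    return out
```

With a single neighbor the softmax weight is 1, so the node output reduces to that neighbor's embedding.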
4. The session recommendation method based on a three-channel graph neural network according to claim 3, wherein: the hypergraph embedding method comprises the following steps:
the hypergraph channel is used to capture the high-order relationships between items, and the hypergraph convolution is defined as:
xi^(l+1) = Σ_{h=1}^{M} Σ_{j=1}^{N} Hih Hjh whh xj^(l)
where xi^(l+1) is the item embedding of layer l+1, Hih and Hjh are values in the incidence matrix, whh is the weight coefficient, and xj^(l) is the item embedding of layer l;
writing the above equation in matrix form gives:
X^(l+1) = D^{-1} H W B^{-1} H^T X^(l)
where X^(l+1) and X^(l) are the item embeddings of layer l+1 and layer l respectively; H^T X^(l) aggregates information from the nodes to the hyperedges, and premultiplying by H aggregates the information back from the hyperedges to the nodes; starting from the original item embedding X^(0), after L layers of hypergraph convolution, the embeddings of all layers are averaged to obtain the final item embedding X = (1/(L+1)) Σ_{l=0}^{L} X^(l);
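The matrix-form convolution and layer averaging above can be sketched as follows (a direct transcription under the assumption that D and B are invertible, i.e. every item and hyperedge has nonzero degree):

```python
import numpy as np

def hypergraph_conv(X0, H, W, D, B, n_layers=2):
    # One propagation step: X^(l+1) = D^-1 H W B^-1 H^T X^(l)
    P = np.linalg.inv(D) @ H @ W @ np.linalg.inv(B) @ H.T
    layers = [X0]
    for _ in range(n_layers):
        layers.append(P @ layers[-1])
    # Final item embedding: average of the per-layer embeddings
    return sum(layers) / len(layers)
```

In practice the inverses of the diagonal matrices D and B are just element-wise reciprocals, so the propagation is cheap and sparse-friendly.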
the position information of an item is added into the item embedding through position embedding; P = [p1, p2, ..., pk] is a learnable position embedding matrix, where k is the length of the current session; the item embedding with position information added is as follows:
xi* = tanh(W1 [xi || pk-i+1] + b)
where xi* is the item embedding with position information, xi is the item embedding without position information, pk-i+1 is the position information of the item, and W1 ∈ R^{d×2d} and b ∈ R^d are learnable parameters; the session embedding is generated by aggregating the item representations in the session; the enhanced session embedding Sh is then computed as:
αi = f^T σ(W2 xs + W3 xi* + c)
Sh = Σ_{i=1}^{k} αi xi*
where αi is the weight of each node for the sequence, Sh is the weighted combination of the item embeddings, c is a trainable coefficient, xs is the embedding of session s represented by the average embedding of all items in the session, xs = (1/k) Σ_{i=1}^{k} xi*, xi* is the embedding of the i-th item in session s, and f ∈ R^d, W2 ∈ R^{d×d}, W3 ∈ R^{d×d} are attention parameters.
5. The session recommendation method based on a three-channel graph neural network according to claim 4, wherein: the global graph embedding method comprises the following steps:
the items are linearly combined according to scores generated by session-aware attention, specifically:
h_N(vi) = Σ_{vj ∈ N(vi)} π(vi, vj) hvj
where h_N(vi) is the aggregated neighbor representation, hvj is the embedding of item vj, and π(vi, vj) computes the weights of different neighbors; the more important an item is to the current session, the larger its corresponding weight; π(vi, vj) is specifically as follows:
π(vi, vj) = q1^T LeakyRelu(W1 [s ⊙ hvj || wij])
π(vi, vj) = softmax(π(vi, vj))
where LeakyRelu is the activation function, ⊙ denotes element-wise multiplication, || denotes the concatenation operation, wij ∈ R^1 is the weight of each edge in the global session graph, W1 and q1 are trainable parameters, and s is the feature of the target session, obtained by averaging the item embeddings of the current session; the coefficients of all neighbor items connected to vi are then normalized by the softmax function; this attention determines which neighbor items should be attended to;
finally, the information of the target item and of its neighboring items is aggregated, which is completed through a nonlinear transformation, specifically:
hv^g = Relu(W2 [hv || h_N(v)])
where hv is the item representation, hv^g is the aggregated representation of the item representation and the neighbor representation, Relu is the activation function, and W2 ∈ R^{d×2d} is a trainable parameter;
to obtain higher-order information, the single-layer aggregator is extended to multiple layers, and the formula in the previous step is expressed as:
hv^(k) = Agg(hv^(k-1), h_N(v)^(k-1))
where hv^(k-1) is the representation of item v generated in the previous step, h_N(v)^(k-1) is the neighbor embedding of the previous layer, and Agg is the aggregation operation; the k-order representation of an item is a mixture of its initial representation and its neighboring items up to k hops.
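A minimal sketch of this multi-layer neighbor aggregation; uniform neighbor weights stand in for the session-aware attention π(vi, vj), and the names are illustrative:

```python
import numpy as np

def global_aggregate(h, adj, W2, n_hops=2):
    # h: (n, d) item embeddings; adj: {v: [neighbor ids]}; W2: (d, 2d)
    cur = h
    for _ in range(n_hops):
        nxt = np.zeros_like(cur)
        for v in range(len(cur)):
            nbrs = adj.get(v, [])
            # Mean of neighbor embeddings approximates the attention-weighted sum
            hn = (np.mean([cur[u] for u in nbrs], axis=0)
                  if nbrs else np.zeros_like(cur[v]))
            # h_v^(k) = ReLU(W2 [h_v^(k-1) || h_N(v)^(k-1)])
            nxt[v] = np.maximum(W2 @ np.concatenate([cur[v], hn]), 0.0)
        cur = nxt
    return cur
```

After k hops, each item's representation mixes information from its k-hop neighborhood, as the claim describes.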
6. The session recommendation method based on a three-channel graph neural network according to claim 5, wherein: the method for fusing the item representations formed by the three channels to obtain more complete item transition information comprises the following steps:
firstly, the item representations generated by the session graph channel and the global graph channel are fused;
for each item, the global representation and the session representation are combined to obtain the final item representation h'v, specifically calculated as follows:
h'v = hv^s + Dropout(hv^g)
a session embedding is then generated from the item embeddings produced by fusing the session graph channel and the global graph channel; the position information of each item, given by a learnable position matrix P = [p1, p2, ..., pk], is added into the item embedding and then integrated with the item representation through a concatenation operation and a nonlinear transformation, as follows:
zi = tanh(W [h'i || pk-i+1] + b)
the session sequence with position information added is denoted Z = [z1, z2, ..., zk]; to capture the relationships between items, the session sequence is input into a self-attention layer:
F = softmax( (Z W^Q)(Z W^K)^T / sqrt(d) ) (Z W^V)
where F is the session representation after attention, d is a hyper-parameter, and W^Q, W^K, W^V ∈ R^{2d×d} are projection matrices;
using the ReLU activation function to add nonlinearity to the model, a residual connection is appended after the feed-forward network, as follows:
E = Relu(F W1 + b1) W2 + b2 + F
E is the session representation after the residual connection; W1 and W2 are d×d matrices, and b1 and b2 are d-dimensional bias vectors; to prevent overfitting, Dropout regularization is applied during training; the self-attention mechanism is denoted:
E = SAN(H)
finally, the single-head self-attention is extended to multi-head self-attention as follows:
E^(k) = SAN(E^(k-1))
where E^(k) is the session representation after k attention layers and E^(k-1) is the session representation after k-1 attention layers;
the session sequence after self-attention is denoted M = [m1, m2, ..., mk]; the weight of each node with respect to the sequence is learned through soft attention, reflecting that different nodes have different degrees of importance for the sequence, as follows:
αi = f^T σ(W4 xs + W5 mi + c)
Ss,g = Σ_{i=1}^{k} αi mi
where f and αi are trainable parameters and σ denotes the sigmoid function; xs is the embedding of session s, represented by the average embedding of all items in the session, xs = (1/k) Σ_{i=1}^{k} mi; mi is the embedding of the i-th item in session s; f ∈ R^d, W4 ∈ R^{d×d}, W5 ∈ R^{d×d} are attention parameters; finally, Ss,g and Sh are combined to form the final session representation S = Ss,g + Sh.
7. The session recommendation method based on a three-channel graph neural network according to claim 6, wherein: in step (4), the embedding of each initial candidate item and the session representation obtained above are combined by a dot-product operation, followed by softmax, to obtain the recommendation probability, specifically calculated as follows:
ŷi = softmax(S^T hvi)
where S is the session representation and ŷi denotes the probability that item vi is selected next in the target session;
the model is trained by minimizing the objective function:
L = - Σ_{i=1}^{N} [ yi log(ŷi) + (1 - yi) log(1 - ŷi) ]
where y denotes the one-hot encoding of the ground-truth item, L is the loss function, and ŷi is the predicted value.
CN202210082137.0A 2022-01-24 2022-01-24 Three-channel diagram neural network-based session recommendation method Pending CN114547276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210082137.0A CN114547276A (en) 2022-01-24 2022-01-24 Three-channel diagram neural network-based session recommendation method

Publications (1)

Publication Number Publication Date
CN114547276A true CN114547276A (en) 2022-05-27

Family

ID=81670731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210082137.0A Pending CN114547276A (en) 2022-01-24 2022-01-24 Three-channel diagram neural network-based session recommendation method

Country Status (1)

Country Link
CN (1) CN114547276A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115545098A (en) * 2022-09-23 2022-12-30 青海师范大学 Node classification method of three-channel graph neural network based on attention mechanism
CN115545098B (en) * 2022-09-23 2023-09-08 青海师范大学 Node classification method of three-channel graph neural network based on attention mechanism

Similar Documents

Publication Publication Date Title
CN112035746A (en) Session recommendation method based on space-time sequence diagram convolutional network
CN112364976B (en) User preference prediction method based on session recommendation system
CN112989064B (en) Recommendation method for aggregating knowledge graph neural network and self-adaptive attention
CN111581520B (en) Item recommendation method and system based on item importance in session
CN108876044B (en) Online content popularity prediction method based on knowledge-enhanced neural network
WO2021139415A1 (en) Data processing method and apparatus, computer readable storage medium, and electronic device
CN114493755B (en) Self-attention sequence recommendation method fusing time sequence information
Zarzour et al. RecDNNing: a recommender system using deep neural network with user and item embeddings
CN112364242A (en) Graph convolution recommendation system for context-aware type
CN111259264B (en) Time sequence scoring prediction method based on generation countermeasure network
CN113487018A (en) Global context enhancement graph neural network method based on session recommendation
Zhou et al. Recommendation via collaborative autoregressive flows
CN114547276A (en) Three-channel diagram neural network-based session recommendation method
Mu et al. Auxiliary stacked denoising autoencoder based collaborative filtering recommendation
CN114780841B (en) KPHAN-based sequence recommendation method
CN114842247B (en) Characteristic accumulation-based graph convolution network semi-supervised node classification method
CN115470406A (en) Graph neural network session recommendation method based on dual-channel information fusion
CN114741597A (en) Knowledge-enhanced attention-force-diagram-based neural network next item recommendation method
CN114625969A (en) Recommendation method based on interactive neighbor session
CN116263794A (en) Double-flow model recommendation system and algorithm with contrast learning enhancement
CN114610862A (en) Conversation recommendation method for enhancing context sequence of graph
CN112801076A (en) Electronic commerce video highlight detection method and system based on self-attention mechanism
Mohan et al. Representation learning for temporal networks using temporal random walk and deep autoencoder
Zhu et al. Influential Recommender System
CN116485501B (en) Graph neural network session recommendation method based on graph embedding and attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination