CN114117229A - Project recommendation method of graph neural network based on directed and undirected structural information

Info

Publication number
CN114117229A
CN114117229A (application CN202111447363.6A)
Authority
CN
China
Prior art keywords
vector
item
conversation
matrix
sequence
Prior art date
Legal status
Pending
Application number
CN202111447363.6A
Other languages
Chinese (zh)
Inventor
Wang Qingmei (王庆梅)
Wang Zheng (王铮)
Hu Chengzuo (胡承佐)
Jin Bowen (靳博文)
Current Assignee
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date
Filing date
Publication date
Application filed by University of Science and Technology Beijing (USTB)
Priority to CN202111447363.6A
Publication of CN114117229A
Status: Pending

Classifications

    • G06F16/9535 Search customisation based on user profiles and personalisation (G06F16/953 Querying, e.g. by the use of web search engines; G06F16/95 Retrieval from the web)
    • G06N3/045 Combinations of networks (G06N3/04 Architecture, e.g. interconnection topology; G06N3/02 Neural networks)
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/084 Backpropagation, e.g. using gradient descent (G06N3/08 Learning methods)


Abstract

The invention discloses an item recommendation method of a graph neural network based on directed and undirected structural information, which comprises the following steps: extracting the undirected structural information in the graph with a graph convolution network, extracting the directed structural information in the graph with a gated graph neural network, computing intermediate item implicit vectors, and performing a linear transformation on the obtained intermediate item implicit vectors according to the adjacency relations of the session sequence graph to obtain the final item implicit vectors; assigning higher attention to repeatedly clicked items appearing in the session sequence, introducing an attention mechanism when generating the item implicit vectors, and modifying the weight coefficients of the corresponding items according to the degree of dependence among the items. The method enables the generated session vector to yield more accurate predictions in the recommendation process.

Description

Project recommendation method of graph neural network based on directed and undirected structural information
Technical Field
The invention relates to the technical field of recommendation methods, and in particular to an item recommendation method of a graph neural network based on directed and undirected structural information.
Background
Recommendation systems are among the most important downstream applications in the field of data mining and machine learning. They help platform users alleviate information overload and sort out valuable information in many web applications, such as e-commerce platforms and music websites. In most recommendation systems the sequence of user actions is time-ordered and characterized by anonymity and large data volumes. In order to predict the user's behavior at the next moment, session-based recommendation learns the user's preference by mining the sequential-order characteristics in the user's historical behavior.
A session sequence refers to a sequence of items generated by a user's clicks within a time interval, and session-based recommendation captures the importance of the dependencies within the sequence for predicting it. In other words, users often have a common purpose within a given session sequence, such as purchasing summer clothing, while the user's behavior characteristics across different sequences may be unrelated; for example, the user may aim to purchase phone accessories in other sessions. Session-based recommendation predicts the user's next click, i.e., the sequence label $v_{n+1}$ of session $s$. Using a session-based recommendation model, the probabilities of all possible items can be obtained for each session $s$ as a probability vector $\hat{\mathbf{y}}$, which covers all possible next-click items after the current session; the value of each element represents the recommendation score of the corresponding item, and the top $K$ items in $\hat{\mathbf{y}}$ are the candidate items to be recommended.
In view of its high practical value, session-based recommendation has received great attention in recent years, and many solid research results have emerged. Early methods were based primarily on Markov chains and recurrent neural networks. With the recent rise of graph neural networks and their outstanding performance on many downstream tasks, some research efforts have applied GNNs to session-based sequence recommendation. Although these GNN-based methods perform well, they still have some problems.
(1) Items that appear repeatedly in the click sequence are ignored. In fact, items that appear multiple times do not carry the same importance as other items and can, to some extent, reflect user preference information.
(2) The structural information in the session sequence graph is not well utilized when generating the vector representations of items. Considering only the directionality between items is in fact insufficient; introducing undirected relations between items enables better learning of the user's behavior information.
For example, the paper "Improving music recommendation in session-based collaborative filtering by using temporal context" by Fonseca et al. proposes converting the sparse session vector into a dense vector with a clustering method, and "Session-based collaborative filtering for predicting the next song" by Park, S. E. et al. proposes converting session sequences into vectors and then computing the cosine similarity between the session vectors. It can be seen that such neighborhood-based approaches are simple but effective; at the same time, they are affected by data sparsity and, more importantly, they do not consider the complex transition relations among the items within a session vector.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an item recommendation method of a graph neural network based on directed and undirected structural information, so as to solve the prior-art problems that items appearing repeatedly in the click sequence are ignored and that the structural information in the session sequence graph is not well utilized when generating the vector representations of items.
In order to solve the above technical problem, the present invention provides an item recommendation method of a graph neural network based on directed and undirected structural information, the overall framework of which is shown in fig. 1. The method comprises the following steps:

S1. Let $V$ denote the set of items appearing in all session sequences, i.e. $V = \{v_1, v_2, \ldots, v_m\}$. An anonymous session sequence of length $n$ can then be expressed as $s = [v_1, v_2, \ldots, v_n]$, with the items in session $s$ arranged chronologically; each $v_i \in V$ represents an item clicked by the user in session $s$. The recommendation task is to predict the user's next click, i.e., the sequence label $v_{n+1}$ of session $s$;

S2. Receive the historical session sequence and convert it into a directed session sequence graph $G_s = (\mathcal{V}_s, \mathcal{E}_s, A_s)$, where $\mathcal{V}_s$ is the node set, $\mathcal{E}_s$ the edge set, and $A_s$ the set of adjacency matrices; $A_s$ is defined as the splicing of three adjacency matrices $\tilde{A}_s$, $A_s^{\mathrm{in}}$ and $A_s^{\mathrm{out}}$, where $\tilde{A}_s$ is the weighted adjacency matrix of the undirected graph, and $A_s^{\mathrm{in}}$ and $A_s^{\mathrm{out}}$ are the weighted in-degree and out-degree adjacency matrices, respectively (a construction sketch follows this step list);

S3. Map each node $v_i \in V$ into a random embedding vector space to obtain a $d$-dimensional vector representation $x_i \in \mathbb{R}^d$; extract a first intermediate implicit vector of the items in the session sequence graph with a graph convolution network, extract a second intermediate implicit vector of the item transitions in the session sequence graph with a gated graph neural network, and obtain the item implicit vectors through a first linear transformation;

S4. Input the item implicit vectors into the target attention network to obtain the session implicit vector corresponding to session sequence $s$;

S5. Acquire the global information and local information of session sequence $s$, and construct the session vector representation through a second linear transformation;

S6. Predict the probability of every possible item being clicked in session sequence $s$ with the softmax function, and recommend the items with the highest probabilities.
Further, the first intermediate implicit vectors of the items in the session sequence graph are extracted with the graph convolution network, i.e., the attention-weighted undirected structural information is extracted with the graph convolution network, as follows:

S31. Generate the feature matrix $X$ of the session sequence graph: the stack of the $d$-dimensional feature vectors $x_i \in \mathbb{R}^d$ corresponding to the nodes $v_i$ of the session sequence graph constitutes its feature matrix $X = [x_1, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$;

S32. For the $k$-th graph convolution layer, the matrix $H^{(k-1)}$ denotes the input vectors of all nodes and $H^{(k)}$ their output vectors; the initial $d$-dimensional node vectors are the features initially input into the first layer of the graph convolution network:

$$H^{(0)} = X \qquad (1)$$

Before the input of each graph convolution layer, the feature of each node $v_i$ is averaged with the feature vectors of its local neighbors:

$$\bar{h}_i^{(k)} \leftarrow \frac{1}{d_i + 1}\Big(h_i^{(k-1)} + \sum_{j=1}^{n} a_{ij}\, h_j^{(k-1)}\Big) \qquad (2)$$

where $a_{ij}$ is the edge weight between nodes $v_i$ and $v_j$, and $d_i = \sum_j a_{ij}$;

S33. The output of the graph convolution network is the first intermediate implicit vector.
Further, equation (2) can be simplified into a simple matrix operation over the whole graph. Let $S$ denote the result after symmetric normalization:

$$S = \tilde{D}^{-\frac{1}{2}}\, \tilde{A}\, \tilde{D}^{-\frac{1}{2}} \qquad (3)$$

$$\hat{S} = \alpha S + \beta I \qquad (4)$$

$$H^{(k)} = \hat{S}\, H^{(k-1)} \qquad (5)$$

where, for equation (3), $\tilde{A} = \tilde{A}_s + I$ and $\tilde{D}$ is the degree matrix of $\tilde{A}$; $\odot$ denotes the point-wise operator, $\tilde{A}_s$ is the weighted adjacency matrix of the undirected graph, and $I$ is the identity matrix.
Further, in order to increase the weights of connected items and reduce interference from the noise of other items, the items connected by edges in the propagation matrix $\hat{S}$ have their weights increased through the left half of equation (4), i.e., the attention on self-information in the matrix is raised, as shown in fig. 2.

Further, in equation (4), $\alpha$ and $\beta$ are hyperparameters that control the proportions of the propagation-matrix information and the identity-matrix information, thereby controlling how much attention-weighted node information is absorbed during propagation. As shown in fig. 2, in the adjacency matrix $\tilde{A}_s$ and the propagation matrix $\hat{S}$, the repeatedly clicked item $v_2$ and the item $v_3$ visited during the repeated clicks carry higher attention information, i.e., larger weights.
The above steps locally smooth the implicit vector representations of the nodes along the graph; after the graph convolution network propagates the features as a feature preprocessing method, each node absorbs the attention information of its adjacent nodes, and locally connected nodes finally exhibit similar prediction behavior.
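The propagation just described can be sketched as follows (Python/NumPy); we read equation (4) as $\hat{S} = \alpha S + \beta I$, which matches the stated roles of $\alpha$ and $\beta$, and the hyperparameter values and layer count below are placeholders rather than values from the patent.

    import numpy as np

    def gcn_propagate(x, a_und, alpha=0.8, beta=0.2, num_layers=2):
        n = a_und.shape[0]
        a_tilde = a_und + np.eye(n)                  # add self-loops: A~ = A~_s + I
        d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
        s = d_inv_sqrt @ a_tilde @ d_inv_sqrt        # eq. (3): symmetric normalization
        s_hat = alpha * s + beta * np.eye(n)         # eq. (4): propagation matrix
        h = x                                        # H^(0) = X
        for _ in range(num_layers):
            h = s_hat @ h                            # eq. (5): one smoothing round
        return h                                     # first intermediate implicit vectors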
further, a second intermediate implicit vector of item conversion in the session sequence diagram is extracted by using a gated graph neural network, namely, the directed structure information with attention is extracted by using the gated graph neural network, and the steps are as follows:
for nodes in the session sequence chart, the node vector updating steps are as follows:
Figure BDA00033851399700000511
Figure BDA0003385139970000061
Figure BDA0003385139970000062
Figure BDA0003385139970000063
Figure BDA0003385139970000064
wherein, for the formula (6),
Figure BDA0003385139970000065
and
Figure BDA0003385139970000066
the weights and the magnitude of the bias terms are controlled,
Figure BDA0003385139970000067
to represent the result of the interaction between a node and an adjacent node,
Figure BDA0003385139970000068
and
Figure BDA0003385139970000069
reset gate and update gate, respectively, weight matrix Wz、Uz,Wr、UrAnd Wo、UoRespectively represent learnable network parameters in the reset gate, the update gate and the output gate,
Figure BDA00033851399700000610
represents node viIs used to generate the second intermediate implicit vector of (c),
Figure BDA00033851399700000611
is a sequence of node vectors in the session, and
Figure BDA00033851399700000612
Figure BDA00033851399700000613
for the first intermediate implicit vector output by the graph convolution network, σ (·) is a sigmoid function, which is a point-by-point operator. Adjacency matrix
Figure BDA00033851399700000614
Representing the communication of the nodes in the graph,
Figure BDA00033851399700000615
representative node viIn that
Figure BDA00033851399700000616
Two columns of matrix blocks.
Wherein, the matrix
Figure BDA00033851399700000617
Is defined as an in-degree matrix
Figure BDA00033851399700000618
Sum degree matrix
Figure BDA00033851399700000619
Which represent weighted connections of the input and output edges, respectively, in the session sequence diagram. For example, given a session sequence s ═ v1,v2,v3,v2,v4]Corresponding session sequence diagram GsAnd adjacency matrix
Figure BDA00033851399700000620
As shown in fig. 3;
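A single gated update under equations (6)-(10) might look as follows (Python/NumPy); splitting the weight $H$ and bias $b$ of equation (6) into one pair per edge direction is our assumption, since the patent only names $H$, $b$ and the gate parameters.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def ggnn_step(h, a_in, a_out, p):
        # Eq. (6): interaction of each node with its neighbors through the
        # spliced in-/out-degree connectivity; p holds the learnable parameters.
        a = np.concatenate([a_in @ h @ p["H_in"] + p["b_in"],
                            a_out @ h @ p["H_out"] + p["b_out"]], axis=1)
        z = sigmoid(a @ p["Wz"] + h @ p["Uz"])              # gate z, eq. (7)
        r = sigmoid(a @ p["Wr"] + h @ p["Ur"])              # gate r, eq. (8)
        h_cand = np.tanh(a @ p["Wo"] + (r * h) @ p["Uo"])   # candidate state, eq. (9)
        return (1.0 - z) * h + z * h_cand                   # eq. (10)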
further, a project implicit vector is generated, which comprises the following steps:
GCN and GCN over graph convolution networkAfter the information of the gate control graph neural network GGNN is processed, a second intermediate implicit vector is obtained
Figure BDA00033851399700000621
In order to balance the proportion of the non-directional structural information with attention and the directional structural information, the following formula is adopted for control:
Figure BDA00033851399700000622
wherein gamma is a hyper-parameter, thus obtaining a final accurate item implicit vector H;
after the implicit vector representation of each item is obtained, a target vector is further constructed, so that the correlation of historical behaviors can be analyzed on the premise of considering the target item. The target items are all items to be predicted.
Using a local target attention model, the attention score $\beta_{i,t}$ of every item $v_i$ in session $s$ with respect to each target item $v_t \in V$ is computed, where $h_i'$ and $h_t'$ are the second intermediate implicit vector representations of items $v_i$ and $v_t$, respectively:

$$\beta_{i,t} = \operatorname{softmax}\big((h_t')^{\top} W_{\mathrm{att}}\, h_i'\big) \qquad (12)$$

In equation (12), all items in the session sequence are matched with the target item respectively, and the weight matrix $W_{\mathrm{att}} \in \mathbb{R}^{d \times d}$ performs a pair-wise nonlinear transformation; the resulting self-attention scores are then normalized by the softmax function to obtain the final attention scores $\beta_{i,t}$.

Finally, for each session sequence $s$, the user's interest with respect to a target item $v_t$ can be expressed as the target-attention-based vector

$$s_{\mathrm{target}}^{t} = \sum_{i=1}^{n} \beta_{i,t}\, h_i' \qquad (13)$$

which represents the level of interest the user develops toward different target items;
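Equations (12) and (13) amount to a bilinear matching followed by a per-target softmax; a minimal sketch (Python/NumPy, names ours):

    import numpy as np

    def target_attention(h_items, h_targets, w_att):
        scores = h_targets @ w_att @ h_items.T        # pairwise matching, eq. (12)
        scores -= scores.max(axis=1, keepdims=True)   # numerical stability
        beta = np.exp(scores)
        beta /= beta.sum(axis=1, keepdims=True)       # softmax over session items
        return beta @ h_items                         # one interest vector per target, eq. (13)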
further, a conversation sequence vector is generated, which comprises the following steps:
representing a user's short-term interest as a local vector
Figure BDA0003385139970000078
By the last item in the conversation sequence
Figure BDA0003385139970000079
Represents the local vector, as shown in equation (14):
Figure BDA00033851399700000710
defining a user's long-term preferences as a global vector
Figure BDA00033851399700000711
Where all the appearing item vectors in session s are aggregated. While using a mechanism of attention to introduce items of last interaction
Figure BDA0003385139970000081
With items [ v ] appearing throughout the conversation1,v2,…,vn]The dependency relationship between them.
Figure BDA0003385139970000082
Figure BDA0003385139970000083
Wherein the ratio of q,
Figure BDA0003385139970000084
and W1,
Figure BDA0003385139970000085
Is a corresponding weight parameter, αiRepresenting the dependency between the last item and the items that appear in the entire sequence of conversations.
And finally, splicing the local vector, the global vector and the vector based on the target attention obtained in the previous step, and obtaining a session vector corresponding to the session sequence s by utilizing linear conversion.
Figure BDA0003385139970000086
Wherein the weight parameter
Figure BDA0003385139970000087
Projecting the results of three vector concatenations into vector space
Figure BDA0003385139970000088
Performing the following steps;
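The assembly of equations (14)-(17) can be sketched as follows (Python/NumPy); the parameter dictionary mirrors the names $q$, $c$, $W_1$, $W_2$, $W_3$ above, with `W3` stored as the transpose of the $W_3$ in equation (17) for convenience.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def session_vector(h, s_target, p):
        s_local = h[-1]                               # eq. (14): last item's vector
        # Eq. (15): attention of each item w.r.t. the last interacted item.
        alpha = sigmoid(h[-1] @ p["W1"] + h @ p["W2"] + p["c"]) @ p["q"]
        s_global = alpha @ h                          # eq. (16): long-term preference
        # Eq. (17): splice local, global and target vectors, project back to d dims.
        return np.concatenate([s_local, s_global, s_target]) @ p["W3"]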
further, the step of generating the recommendation at step S6 is as follows:
all the items viSecond implicit vector of e.V
Figure BDA0003385139970000089
The session vector s corresponding theretohThe multiplication is carried out, and the result is obtained,
Figure BDA00033851399700000810
then obtaining an output vector of the model through a softmax function pair
Figure BDA00033851399700000811
Figure BDA00033851399700000812
Wherein
Figure BDA00033851399700000813
Represents the predicted recommendation scores of all the target items, and
Figure BDA00033851399700000814
representing the probability that the target item is clicked at the next moment in the session sequence s.
Figure BDA00033851399700000815
The top K items with the highest ranking are the items to be recommended.
For each session sequence diagram, defining a loss function as the cross entropy of the predicted value and the actual value,
Figure BDA00033851399700000816
wherein y isiA one-hot encoded vector representing the actual click item at the next instance in the session sequence.
Finally, the steps S2-S6 are iteratively trained by using a time-based back propagation algorithm BPTT algorithm to generate parameters related thereto, such as W, α, β, and the like.
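Equations (18)-(20) reduce to an inner-product scoring pass, a softmax, a top-$K$ cut and a cross-entropy loss; a minimal sketch (Python/NumPy, names ours):

    import numpy as np

    def predict_and_loss(s_h, item_emb, y_true=None, top_k=20):
        z = item_emb @ s_h                        # eq. (18): score every candidate item
        z = z - z.max()                           # numerical stability
        y_hat = np.exp(z) / np.exp(z).sum()       # eq. (19): click probabilities
        top_items = np.argsort(-y_hat)[:top_k]    # top-K items to recommend
        loss = None
        if y_true is not None:                    # eq. (20): cross entropy vs. one-hot label
            eps = 1e-12
            loss = -np.sum(y_true * np.log(y_hat + eps)
                           + (1.0 - y_true) * np.log(1.0 - y_hat + eps))
        return top_items, y_hat, loss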
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of the overall framework of the model provided by the present invention;
FIG. 2 is a schematic diagram of an exemplary session sequence graph and its corresponding weighted undirected adjacency matrix $\tilde{A}_s$ and weighted propagation matrix $\hat{S}$ provided by the present invention;
FIG. 3 is a schematic diagram of an exemplary session sequence graph and its corresponding adjacency matrix $A_s$ provided by the present invention;
FIG. 4 shows the performance of different combinations of structural-information components on the P@20 index in the ablation experiments of the present invention;
FIG. 5 shows the performance of different combinations of structural-information components on the MRR@20 index in the ablation experiments of the present invention;
FIG. 6 shows the performance of different combinations of attention-information components on the P@20 index in the ablation experiments of the present invention;
FIG. 7 shows the performance of different combinations of attention-information components on the MRR@20 index in the ablation experiments of the present invention;
FIG. 8 is a flowchart illustrating an item recommendation method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Aiming at the problems that existing methods ignore items appearing repeatedly in the click sequence and do not make good use of the structural information in the session sequence graph when generating the vector representations of items, the invention provides an item recommendation method of a graph neural network based on undirected and directed structural information. According to the user's current session sequence data, the method gives a prediction of the item the user will click at the next moment, and in this process it does not depend on the user's long-term preference information.
For session-based recommendation, a directed session sequence graph is first constructed from the information of the historical session sequence; the graph convolution network GCN and the gated graph neural network GGNN are used to extract, respectively, the undirected and directed structural information of the item transitions in the session sequence graph, generating accurate item implicit vectors; the obtained item implicit vectors are then input into an attention network, and the global and local information of the session are considered simultaneously, so that a more reliable session representation is constructed and the next click item is inferred.
As shown in fig. 1 and fig. 8, the item recommendation method of the graph neural network based on undirected and directed structural information includes:

S1. Let $V$ denote the set of items appearing in all session sequences, i.e. $V = \{v_1, v_2, \ldots, v_m\}$. An anonymous session sequence of length $n$ can then be expressed as $s = [v_1, v_2, \ldots, v_n]$, with the items in session $s$ arranged chronologically; each $v_i \in V$ represents an item clicked by the user in session $s$, and the recommendation task is to predict the user's next click, i.e., the sequence label $v_{n+1}$ of session $s$.

S2. Convert the historical session sequence into a directed session sequence graph $G_s = (\mathcal{V}_s, \mathcal{E}_s, A_s)$, where $\mathcal{V}_s$ is the node set, $\mathcal{E}_s$ the edge set, and $A_s$ the set of adjacency matrices. Each node in the session sequence graph $G_s$ represents an item $v_i \in V$, and each edge $(v_{i-1}, v_i) \in \mathcal{E}_s$ represents that the user clicked item $v_{i-1}$ and then item $v_i$ in succession; $A_s$ is defined as the splicing of three adjacency matrices $\tilde{A}_s$, $A_s^{\mathrm{in}}$ and $A_s^{\mathrm{out}}$, where $\tilde{A}_s$ is the weighted adjacency matrix of the undirected graph, and $A_s^{\mathrm{in}}$ and $A_s^{\mathrm{out}}$ are the weighted in-degree and out-degree adjacency matrices, respectively.

S3. Process the session sequence graph $G_s$ to obtain the item implicit vectors of all nodes in the session sequence graph;

S4. Input the item implicit vectors into a target attention network to obtain the target-attention-based vector of the session sequence;

S5. Acquire the global information and local information of session sequence $s$, and construct the session vector representation through a second linear transformation;

S6. Predict the probability of every target item being clicked in session sequence $s$ with the softmax function, and recommend the items with the highest probabilities.
In step S3, each node $v_i \in s$ of the session sequence graph $G_s$ is mapped into a random embedding vector space to obtain a $d$-dimensional vector representation $x_i \in \mathbb{R}^d$, and the item implicit vectors $H$ are obtained through the graph convolution network, the gated graph neural network and linear processing.
The method specifically comprises the following steps:
(1) According to the session sequence graph $G_s$, generate the initial item implicit vector $x_i$ corresponding to each item $v_i \in V$, where $V$ denotes the set of items appearing in all session sequences; the stack of the $d$-dimensional feature vectors $x_i \in \mathbb{R}^d$ corresponding to the nodes $v_i$ of the session sequence graph constitutes its feature matrix $X = [x_1, \ldots, x_n]^T$.

The specific steps of generating $x_i$ are:

1) The weighted undirected adjacency matrix $\tilde{A}_s$ of the session sequence graph $G_s$ is a sparse and symmetric adjacency matrix in which $a_{ij}$ represents the edge weight between nodes $v_i$ and $v_j$; no connection between two nodes is expressed as $a_{ij} = 0$.

2) The degree matrix $D$ is defined as the diagonal matrix $D = \operatorname{diag}(d_1, \ldots, d_n)$, where the values on the diagonal equal the row sums of the adjacency matrix, $d_i = \sum_j a_{ij}$.

3) Through the embedding mapping, every node $v_i$ in the graph has a corresponding $d$-dimensional feature vector $x_i \in \mathbb{R}^d$, so the feature matrix $X \in \mathbb{R}^{n \times d}$ of the session sequence is the stack of the feature vectors corresponding to the nodes in the graph, i.e., $X = [x_1, \ldots, x_n]^T$.
(2) Input $x_i$ into the graph convolution network with undirected structural information to obtain the first intermediate implicit vectors of all nodes in the graph (carrying the undirected structural information).

(3) Obtain the second intermediate implicit vectors of all nodes in the graph (carrying the directed structural information) through the gated graph neural network with directed structural information.

(4) Input the second intermediate implicit vectors into the first linear transformation to obtain the accurate item implicit vectors.
The graph neural network is naturally suited to session-based recommendation, because it can automatically extract the features of the session sequence graph while taking the rich node connection relations into account.
Similar to convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs), the graph convolution network GCN learns the features of each node $v_i$ through a multi-layer structure, obtains a new feature representation, and then inputs it into the corresponding linear classifier.
In step (2), the first intermediate implicit vectors are generated as follows:

S31. For the $k$-th graph convolution layer, the matrix $H^{(k-1)}$ denotes the input vectors $h_i$ of all nodes and $H^{(k)}$ their output vectors. The initial $d$-dimensional node vectors are the initially input features fed into the first GCN layer:

$$H^{(0)} = X \qquad (1)$$

A GCN with $K$ layers is equivalent to applying a $K$-layer MLP model to the feature vectors $x_i$ of all nodes in the graph, except that the implicit vector representation of each node is averaged with those of its neighbor nodes at the beginning of each layer. The vector representation of the nodes in each graph convolution layer has three update stages: feature propagation, linear transformation and point-wise nonlinear activation. Only the feature propagation stage is used in the present invention for learning the item implicit vectors.

S32. At the beginning of each layer, the feature of each node $v_i$ is averaged with the feature vectors of its local neighbors:

$$\bar{h}_i^{(k)} \leftarrow \frac{1}{d_i + 1}\Big(h_i^{(k-1)} + \sum_{j=1}^{n} a_{ij}\, h_j^{(k-1)}\Big) \qquad (2)$$

The output of the graph convolution network is the first intermediate implicit vector.
Preferably, equation (2) is simplified with a simple matrix operation over the whole graph, as follows.

Let $S$ denote the adjacency matrix with self-loops after symmetric normalization:

$$S = \tilde{D}^{-\frac{1}{2}}\, \tilde{A}\, \tilde{D}^{-\frac{1}{2}} \qquad (3)$$

$$\hat{S} = \alpha S + \beta I \qquad (4)$$

$$H^{(k)} = \hat{S}\, H^{(k-1)} \qquad (5)$$

where $\tilde{A} = \tilde{A}_s + I$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $\odot$ denotes the point-wise operator, and $I$ is the identity matrix. Because the adjacency matrix $\tilde{A}_s$ takes the weights of repeated items into account, the matrix $S$ pays additional attention to repeatedly clicked items. Meanwhile, since $S$ carries self-loops, the symmetric normalization makes the weights of items connected by multiple edges smaller than those of items connected by a single edge or by no edge. $\alpha$ and $\beta$ are hyperparameters.

In order to increase the weights of connected items and reduce interference from the noise of other items, the items connected by edges in the propagation matrix $\hat{S}$ have their weights increased through the left half of equation (4), i.e., the attention on self-information in the matrix is raised.

The hyperparameters $\alpha$ and $\beta$ control the proportions of the propagation-matrix information and the identity-matrix information, thereby controlling how much attention-weighted node information is absorbed during propagation. As the specific example in fig. 2 shows, in the adjacency matrix $\tilde{A}_s$ and the propagation matrix $\hat{S}$, the repeatedly clicked item $v_2$ and the item $v_3$ visited during the repeated clicks carry higher attention information, i.e., larger weights.

Thus, the equivalent update form of equation (2) becomes a simple sparse matrix multiplication over all nodes.

The above steps locally smooth the implicit vector representations of the nodes along the graph; after the graph convolution network propagates the features as a feature preprocessing method, each node absorbs the attention information of its adjacent nodes, and locally connected nodes finally exhibit similar prediction behavior.
In step (3), a gated graph neural network GGNN is constructed using the method of Li et al.
For node $v_i$ in the session sequence graph $G_s$, the node vector is updated as follows:

$$a_{s,i}^{(t)} = A_{s,i:}\,[h_1^{(t-1)}, \ldots, h_n^{(t-1)}]^{\top} H + b \qquad (6)$$

$$z_{s,i}^{(t)} = \sigma\big(W_z a_{s,i}^{(t)} + U_z h_i^{(t-1)}\big) \qquad (7)$$

$$r_{s,i}^{(t)} = \sigma\big(W_r a_{s,i}^{(t)} + U_r h_i^{(t-1)}\big) \qquad (8)$$

$$\tilde{h}_i^{(t)} = \tanh\big(W_o a_{s,i}^{(t)} + U_o (r_{s,i}^{(t)} \odot h_i^{(t-1)})\big) \qquad (9)$$

$$h_i^{(t)} = (1 - z_{s,i}^{(t)}) \odot h_i^{(t-1)} + z_{s,i}^{(t)} \odot \tilde{h}_i^{(t)} \qquad (10)$$

In equation (6), the adjacency matrix $A_s$ represents the connectivity of the nodes in the graph, $A_{s,i:}$ denotes the two columns of matrix blocks in $A_s$ corresponding to node $v_i$, and $[h_1^{(t-1)}, \ldots, h_n^{(t-1)}]$ is the sequence of node vectors in the session, with $H^{(0)} = \bar{H}$, i.e., the final output of the graph convolution network serves as the initial input of the gated graph neural network; $H$ and $b$ control the weights and the magnitude of the bias term, and $a_{s,i}^{(t)}$ represents the result of the interaction between a node and its adjacent nodes.

For equation (7), the reset gate $z_{s,i}^{(t)}$ and the update gate $r_{s,i}^{(t)}$ are obtained through the sigmoid function $\sigma(\cdot)$; the weight matrices $W_z, U_z$, $W_r, U_r$ and $W_o, U_o$ are the learnable network parameters of the reset gate, the update gate and the output gate, respectively, and $\odot$ is the point-wise operator. Finally, $h_i'$ represents the implicit vector of node $v_i$ generated by the GGNN gated graph neural network.

The matrix $A_s \in \mathbb{R}^{n \times 2n}$ is defined as the concatenation of the in-degree matrix $A_s^{\mathrm{in}}$ and the out-degree matrix $A_s^{\mathrm{out}}$, which represent the weighted connections of the incoming and outgoing edges in the session sequence graph, respectively. For example, given a session sequence $s = [v_1, v_2, v_3, v_2, v_4]$, the corresponding session sequence graph $G_s$ and adjacency matrix $A_s$ are shown in fig. 3. It can be seen that the weights in the directed adjacency matrix are set according to the degree of closeness between nodes: for example, $v_2$ has an edge to each of $v_3$ and $v_4$, but the two weights differ, because more edges connect $v_2$ and $v_3$, which means the similarity between the two is higher. To achieve a better prediction effect, the updated vector of $v_2$ should absorb more of the information of $v_3$, so the model should pay more attention to $v_3$ than to $v_4$.

Therefore, for each session sequence graph $G_s$, the GGNN model propagates node information with attention between adjacent nodes, while the reset gate and the update gate respectively determine which information should be discarded or retained next.
In step (4), after the information processing of the graph convolution network GCN and the gated graph neural network GGNN, $\bar{H}$ and $H'$ are obtained respectively: the former performs attention-weighted undirected structural information processing on the initial embedding vectors, while the latter extracts the attention-weighted directed structural information in the graph structure more finely on that basis.

In order to balance the proportions of the attention-weighted undirected structural information and the directed structural information, the following formula is used for control:

$$H = \gamma \bar{H} + (1 - \gamma) H' \qquad (11)$$

where $\gamma$ is a hyperparameter; this yields the final, accurate item implicit vectors $H$.
In step S4, the specific steps are:

S41. Using a local target attention model (which is prior art and is not described in detail here), compute the attention score $\beta_{i,t}$ of every item $v_i$ in session $s$ with respect to each target item $v_t \in V$, where $h_i'$ and $h_t'$ are the implicit vector representations of items $v_i$ and $v_t$, respectively:

$$\beta_{i,t} = \operatorname{softmax}\big((h_t')^{\top} W_{\mathrm{att}}\, h_i'\big) \qquad (12)$$

In the above equation, the items in the session are matched with the candidate targets respectively, and the weight matrix $W_{\mathrm{att}} \in \mathbb{R}^{d \times d}$ performs a pair-wise nonlinear transformation; the resulting self-attention scores are then normalized by the softmax function to obtain the final attention scores.

S42. For each session sequence $s$, the user's interest with respect to a target item $v_t$ can be expressed as

$$s_{\mathrm{target}}^{t} = \sum_{i=1}^{n} \beta_{i,t}\, h_i' \qquad (13)$$

Finally, the target-attention-based vector $s_{\mathrm{target}}^{t}$ is obtained, which represents the level of interest the user develops toward different target items.
In step S5, the item vectors involved in session $s$ are used to further explore the user's short-term and long-term preferences, so as to obtain the local vector and the global vector of the session; the final session vector is then generated by combining them with the target-attention-based vector computed above.

S51. Obtain the local vector. In a session sequence $s$, the user's final behavior is often determined by the last interacted item in the current sequence. Therefore, the user's short-term interest is represented as a local vector $s_l$, given by the vector of the last item $v_n$ in the session sequence:

$$s_l = h_n \qquad (14)$$

S52. Obtain the global vector. The user's long-term preference is defined as a global vector $s_g$ in which all item vectors appearing in session $s$ are aggregated; at the same time, an attention mechanism is used to introduce the dependencies between the last interacted item $v_n$ and the items $[v_1, v_2, \ldots, v_n]$ appearing in the whole session:

$$\alpha_i = q^{\top} \sigma\big(W_1 h_n + W_2 h_i + c\big) \qquad (15)$$

$$s_g = \sum_{i=1}^{n} \alpha_i\, h_i \qquad (16)$$

where $q, c \in \mathbb{R}^d$ and $W_1, W_2 \in \mathbb{R}^{d \times d}$ are the corresponding weight parameters.

S53. Splice the obtained local vector, global vector and target-attention-based vector, and obtain the session vector corresponding to session sequence $s$ by linear transformation:

$$s_h = W_3\,[s_l;\, s_g;\, s_{\mathrm{target}}^{t}] \qquad (17)$$

where the weight parameter $W_3 \in \mathbb{R}^{d \times 3d}$ projects the result of splicing the three vectors into the vector space $\mathbb{R}^d$. It is worth noting that different session vectors are correspondingly generated for different target items (items in the session sequence).
In step S6, after the session vector $s_h$ corresponding to each session sequence $s$ is obtained, the score $\hat{z}_i$ of every item $v_i \in V$ is computed, i.e., the vector $h_{v_i}$ of each candidate item is multiplied with the session vector $s_h$:

$$\hat{z}_i = s_h^{\top}\, h_{v_i} \qquad (18)$$

and the output vector $\hat{\mathbf{y}}$ of the model is then obtained through the softmax function:

$$\hat{\mathbf{y}} = \operatorname{softmax}(\hat{\mathbf{z}}) \qquad (19)$$

where $\hat{\mathbf{z}}$ represents the predicted recommendation scores of all target items, and $\hat{\mathbf{y}}$ represents the probabilities that the target items are clicked at the next moment in session sequence $s$; the top $K$ items with the highest ranking are the items to be recommended.

For each session sequence graph $G_s$, the loss function is defined as the cross entropy of the predicted and actual values:

$$\mathcal{L}(\hat{\mathbf{y}}) = -\sum_{i=1}^{m} \big(y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\big) \qquad (20)$$

where $y_i$ is an element of the one-hot encoding vector (one-hot embedding) representing the item actually clicked at the next moment in the session sequence.

In use, the datasets can be used to iteratively train steps S2-S6, e.g., with the time-based back propagation (BPTT) algorithm, to obtain the parameters such as $W$, $W_1$, $W_2$ and $W_3$ in the above steps, which can be randomly initialized and then learned during training.
In training, each sequence is used as a training sample, so the total error is the sum of the errors at each time step (each recommendation). Note that in session-based recommendation scenarios most sessions are relatively short sequences; to prevent overfitting, a smaller number of training epochs is used.
Experimental analysis:
1. Experimental datasets
The method was evaluated in real scenarios using the public dataset Yoochoose and the public dataset Diginetica released by CIKM Cup 2016. The Yoochoose dataset contains user click streams on an electronic shopping platform over 6 months, while the Diginetica dataset contains only the data of successful transactions, i.e., users' purchase streams.
At the same time, the corresponding sequences and labels are further generated by slicing the input sequence data. For an input session sequence $s = [v_1, v_2, \ldots, v_n]$, a series of sequences and labels $([v_1], v_2), ([v_1, v_2], v_3), \ldots, ([v_1, v_2, \ldots, v_{n-1}], v_n)$ is generated as a data augmentation strategy, where $[v_1, v_2, \ldots, v_{n-1}]$ is the generated sequence and $v_n$ is the label of the sequence, i.e., the item clicked at the next moment.
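This slicing is straightforward to express in code; a minimal sketch (Python, function name ours):

    def augment_session(session):
        # ([v1], v2), ([v1, v2], v3), ..., ([v1, ..., v_{n-1}], v_n)
        return [(session[:i], session[i]) for i in range(1, len(session))]

    # augment_session([1, 2, 3, 4]) -> [([1], 2), ([1, 2], 3), ([1, 2, 3], 4)]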
The details of the data set finally used are shown in table 1.
TABLE 1 Experimental data set statistics
(table reproduced as an image in the original publication)
2. Evaluation criteria
After the datasets are determined, two metrics that are very common in session-based recommendation are adopted as the evaluation indexes of the algorithm.

(1) P@20 (Precision) is a widely used measure of prediction accuracy. It represents the proportion of correct recommendations among the top 20 items recommended by the algorithm.

(2) MRR@20 (Mean Reciprocal Rank) is the mean of the reciprocal ranks of the correctly recommended items. When the true result is ranked beyond 20 in the algorithm's recommendations, the corresponding reciprocal rank is 0. The MRR metric takes the recommendation order into account: a larger MRR value indicates that the true result sits near the top of the recommendation list, which demonstrates the effectiveness of the recommendation system.
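Both metrics reduce to two numbers per test case that are then averaged over the test set; a minimal sketch (Python, names ours):

    def hit_and_reciprocal_rank(ranked_items, target, k=20):
        # ranked_items: the model's recommendation list, best first.
        top = list(ranked_items[:k])
        if target not in top:
            return 0.0, 0.0                   # a miss contributes 0 to both metrics
        return 1.0, 1.0 / (top.index(target) + 1)

    # Averaging the first value over the test set gives P@20; the second gives MRR@20.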
3. Experimental setup
The dimension of the implicit vectors is set to $d = 100$ on both datasets. All hyperparameters are initialized with a Gaussian distribution function with mean 0 and standard deviation 0.1. The parameters are optimized with a mini-batch Adam optimizer; the initial learning rate $\eta$ is set to 0.001 and is decayed by 0.1 every three training epochs. Furthermore, the batch size is set to 100 and the L2 regularization parameter to $10^{-5}$.
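Under these settings, the training configuration might look as follows in PyTorch (the model object below is a placeholder stand-in for the full network, shown for illustration only):

    import torch

    model = torch.nn.Linear(100, 100)   # placeholder for the GCN + GGNN + attention network

    for p in model.parameters():        # Gaussian initialization, mean 0, std 0.1
        torch.nn.init.normal_(p, mean=0.0, std=0.1)

    # Mini-batch Adam: lr = 0.001, L2 regularization 1e-5, decayed by 0.1 every 3 epochs.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)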
4. Analysis of Experimental results
The performance of several methods on the P@20 and MRR@20 indexes is shown in table 2, where bold marks the best results. The method provided by the invention can flexibly construct the relations between items on the session sequence graph and extract the attention-weighted directed and undirected structural information, so that the subsequent learning of target attention is more accurate, and the final recommendation is given by integrating the user's global and local interests within the session. From the experimental data in table 2 it is clear that the method achieves the best performance on both indexes on all three datasets, which proves its effectiveness.
Conventional recommendation methods such as POP and S-POP do not perform well on session-based problems because they ignore the user's preferences in the current session and only consider the top-K most popular items. BPR-MF shows that it is meaningful to use the semantic information in a session, while the better-performing FPMC shows that modeling session sequences with first-order Markov chains is a relatively effective method. Also a traditional recommendation method, Item-KNN is superior to the former two. It is worth noting that Item-KNN relies only on computing the similarity between items, which suggests that the co-occurrence of items is also a relatively important piece of information. However, Item-KNN does not take the temporal information in the session into account and has no way to capture the transition information between items.
Unlike the traditional methods, the deep-learning-based methods generally perform better on the index results across all datasets. GRU4Rec is a recurrent-neural-network-based method that performs comparably to, and partly better than, some conventional methods, which demonstrates that recurrent neural networks have a certain modeling capability for sequence data. However, GRU4Rec focuses mainly on modeling the session sequence and cannot capture the user's preferences within the session. Later methods such as NARM and STAMP both significantly improve on GRU4Rec: NARM explicitly captures the user's main preference in the session, whereas STAMP uses the attention mechanism to consider the user's short-term interest, which is why they are superior to GRU4Rec. RepeatNet is also a recurrent-neural-network-based algorithm that achieves a better prediction effect by considering the user's repeated click behavior, which indicates that modeling the user's behavior habits has a certain importance. However, RepeatNet's improvement over NARM and STAMP is limited, possibly because modeling the user's repeated click habits through item features alone is inadequate and RNN-based structures cannot capture some common dependencies within a session.
The graph-neural-network-based approaches construct each session sequence as a subgraph and encode all the items in the session through the graph neural network. Both SR-GNN and TAGNN give better results than all RNN-based models. SR-GNN uses a gated graph neural network to learn the dependencies between items within the session sequence, while TAGNN further exploits the dependencies between the items within a session and the target items with an attention mechanism. However, these methods learn entirely from the directed relations of the session sequence graph and do not comprehensively consider its undirected relations: the relations between the items in a session sequence are sometimes not unidirectional but bidirectional, and using undirected structural information can capture more comprehensive relations between items. Moreover, they ignore the repeated clicks occurring in session sequences, whereas intuitively the importance of an item that appears repeatedly in a sequence should be greater. In addition, in actual recommendation scenarios the degree of association between items is variable, and these methods average the dependencies between the items in the session, so they cannot reflect, through a weighting or attention method, the differing degrees to which a given item depends on other items.
The method presented herein performs better than the methods described above. Specifically, on the three datasets it achieves relative improvements of 3.55%, 1.38% and 1.18% in P@20 over the best-performing related method, and of 1.92%, 4.34% and 1.98% in MRR@20. The method can well extract the structural information in the session sequence graph: it sequentially extracts the undirected and directed structural information in the graph with the graph convolution network and the gated graph neural network and combines them linearly, so as to achieve an accurate vector representation. In addition, repeatedly clicked items in the session sequence are taken into account, and the weight of repeated information is raised through the attention network; meanwhile, the proportion of the nodes' self-information in the session sequence graph is raised by adding self-loops and through matrix operations, so that the nodes are less easily disturbed by the noise of other nodes. Different weights are then assigned with the attention network according to the different dependency relations among items, so that the network can generate accurate vector representations.
TABLE 2 Comparison of the results
(table reproduced as an image in the original publication)
Ablation experiment:
the method can flexibly capture the relationship between the structural information and the items in the conversation sequence diagram. In order to verify the actual effect of each composition in the model, several model variants were set up for ablation experiments. In the experimental link, SR-GNN is selected as a reference method for comparison, and data in the experiment are displayed in the form of relative promotion percentage of comparison SR-GNN.
First, the combined analysis of directed and undirected structural information: (a) GCN: only the undirected structural information in the session sequence graph is extracted. (b) GNN: only the directed structural information in the session sequence graph is extracted. (c) GCN+GNN: random initial vectors are input into the two neural networks simultaneously, and the model outputs are then combined linearly. (d) GCN+GNN(GCN): the random initial vectors are input into the GCN, the output vectors of the GCN model serve as the input of the GNN model, and the outputs of the two models are finally combined linearly. The experimental comparisons are shown in fig. 4 and fig. 5.
Here AVG represents the average performance of the four combinations over the three datasets. As can be seen from fig. 4 and fig. 5, the method GCN+GNN(GCN), which integrates the directed and undirected structural information, obtains the best results on both the P@20 and MRR@20 indexes on all three datasets, which proves the importance of considering directed and undirected structural information together. The average data AVG in fig. 5 also show that undirected structural information alone performs better than directed structural information alone, whereas on single datasets the directed structural information performs slightly better on the MRR@20 index of the Yoochoose 1/4 and Diginetica datasets. This reflects, to some extent, the connection between the user's preferences and the items in session-based recommendation: the direction of the transitions between items has different importance in different scenarios, but on average the undirected structural information is more important. This shows that, in session-based sequence recommendation, the direction of the user's transitions between items is worth considering, but the relations among the items the user has viewed must also be considered in order to learn the user's preference better. Considering directed and undirected structural information together outperforms either alone: the combined and average performance AVG of the GCN+GNN methods in fig. 4 and fig. 5 is essentially better than that of GCN or GNN used alone. The input of both network models in the GCN+GNN method is the random embedding vectors, while the input of the GNN model in the GCN+GNN(GCN) method is the vectors whose undirected structural information has been extracted by the GCN model; this shows that, compared with using random vectors directly, extracting the undirected structural information first and then the directed structural information yields more accurately represented embedding vectors.
Next, the combined analysis of the repeated-click attention information and the inter-item dependencies: (a) GCN+GNN: neither the repeated-click attention information nor the differing dependencies between items are considered. (b) AttGCN+GNN: the attention information of repeatedly clicked items is considered only in the GCN. (c) GCN+AttGNN: only the varying degrees of dependency between items are considered, in the GNN. (d) AttGCN+AttGNN: the repeated-click attention information in the GCN and the attention-weighted item dependencies in the GNN are fused simultaneously. The experimental results are shown in fig. 6 and fig. 7.
As the data in fig. 6 and fig. 7 show, AttGCN+AttGNN, which considers the repeated-click attention information and the inter-item dependencies together, obtains the best experimental results on both indexes on all three datasets, indicating that repeated click behavior and inter-item dependencies have a certain importance in session-based recommendation. Meanwhile, according to the P@20 results in fig. 6, considering the repeated-click attention or the inter-item relations alone already performs better than GCN+GNN, which considers no attention information, and AttGCN+AttGNN, which considers both, performs best; this shows that attention information helps retain important information as far as possible and express it more accurately in the embedding vectors. However, according to the MRR@20 results in fig. 7, although AttGCN+AttGNN still obtains the best results, considering only one of the two kinds of attention performs slightly worse than GCN+GNN with no attention information; that is, attention information used alone can improve the accuracy of the recommendation results but not the predicted recommendation ranking. A possible reason is that if one of the two models considers attention while the other does not when the vectors pass through the GCN and GNN models, the representation patterns of the adjacency matrices in the two models are not unified, so the vectors cannot consistently use attention to retain structural information as they flow between the two models; the structural information is disturbed by the inconsistent attention patterns, an accurately represented embedding vector is not generated in the end, and an accurate prediction score for each item cannot be computed in the prediction stage.
In recommendation scenarios based on conversation sequences, both the user's repeated-click behavior and the graph structure information are well worth considering, because they allow the user's behavior to be predicted well even without knowing the user's historical preferences. The invention not only uses the GCN and GNN models to extract the directed and the undirected structural information in the conversation sequence graph and combines them linearly, but also introduces an attention mechanism when generating the hidden vectors of the items, effectively capturing the user's repeated clicks and the complex transitions between items, so that the generated conversation vector yields more accurate predictions in the recommendation process. On real data sets from three realistic scenarios, the invention verifies that the proposed algorithm outperforms other state-of-the-art methods, and validates the effectiveness of the attention mechanism and of the complex structural information through thorough ablation experiments.
Finally, it should be noted that while the foregoing describes preferred embodiments of the invention, it will be appreciated by those skilled in the art that, once the basic inventive concepts are understood, numerous changes and modifications may be made without departing from the principles of the invention, and such changes and modifications shall be deemed to fall within the scope of the invention. Therefore, it is intended that the appended claims be interpreted as covering the preferred embodiments and all alterations and modifications falling within the scope of the embodiments of the invention.

Claims (10)

1. A project recommendation method based on a graph neural network with directed and undirected structural information is characterized by comprising the following steps:
s1, let V represent the collection of items appearing in all conversation sequences, i.e.
Figure FDA0003385139960000011
Then an anonymous session sequence of length n may be used
Figure FDA0003385139960000012
Is expressed and the items in the conversation s are chronologically arranged, each vie.V represents that the user is in a meetingItem clicked in words s;
s2, receiving the historical conversation sequence, and converting the historical conversation sequence into a directed conversation sequence chart Gs=(Vss,As) In which V issRepresents a set of points,. epsilonsRepresents the edge set, AsRepresents a set of adjacency matrices, AsDefined as three adjacency matrixes
Figure FDA0003385139960000013
Figure FDA0003385139960000014
And
Figure FDA0003385139960000015
the splicing of (a), wherein,
Figure FDA0003385139960000016
representing weighted adjacency matrices of undirected graphs, and
Figure FDA0003385139960000017
and
Figure FDA0003385139960000018
respectively representing an in-degree adjacency matrix and an out-degree adjacency matrix with weights;
s3, connecting point viMapping the e to the random embedded vector space to obtain d-dimensional vector representation
Figure FDA0003385139960000019
Extracting a first intermediate implicit vector of an item in the conversation sequence diagram by using a graph convolution network, extracting a second intermediate implicit vector of item conversion in the conversation sequence diagram by using a gated diagram neural network, and obtaining an item implicit vector through first linear transformation;
s4, inputting the item implicit vector into a target attention network, and obtaining a vector of a conversation sequence S based on the target attention;
s5, acquiring global information and local information of the conversation sequence S, and constructing a conversation vector representation through second linear transformation;
s6, predicting the probability of all the target items being clicked on the conversation sequence S by the softmax function, and recommending items with a high probability.
2. The item recommendation method based on the graph neural network with directed and undirected structural information according to claim 1, wherein the step of extracting the first intermediate implicit vector of the items in the conversation sequence graph by using the graph convolution network comprises the following steps:
s31, generating a feature matrix X of the conversation sequence diagram; each node v in the session sequence diagramiCorresponding d-dimensional feature vector
Figure FDA0003385139960000021
The stack of (2) constituting a feature matrix of a session sequence diagram
Figure FDA0003385139960000022
Figure FDA0003385139960000023
Figure FDA0003385139960000024
S32, for the $k$-th graph convolution layer, the matrix $H^{(k-1)}$ denotes the input vectors of all nodes and $H^{(k)}$ denotes their output vectors; the initial $d$-dimensional node vectors are the features fed into the first graph convolution layer:

$$H^{(0)} = X \qquad (1)$$

before the input of each graph convolution layer, the feature of each node $v_i$ is averaged with the feature vectors of its local neighbors:

$$\bar{x}_i = \frac{1}{d_i + 1}\Big(x_i + \sum_{j} a_{ij}\, x_j\Big) \qquad (2)$$

wherein $a_{ij}$ is the edge weight between nodes $v_i$ and $v_j$, and $d_i = \sum_j a_{ij}$;
S33, the output of the graph convolution network is a first intermediate implicit vector.
3. The item recommendation method according to claim 2, wherein the graph convolution network extracts the first intermediate implicit vectors of the items in the conversation sequence graph as follows:

$$S = \tilde{D}^{-\frac{1}{2}}\, \tilde{A}\, \tilde{D}^{-\frac{1}{2}}, \quad \tilde{A} = A + I \qquad (3)$$

$$P = \alpha I + \beta S \qquad (4)$$

$$H^{(k)} = \sigma\big(P H^{(k-1)} W^{(k)}\big) \qquad (5)$$

wherein $S$ represents the adjacency matrix with self-loops after symmetric normalization, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $\sigma(\cdot)$ is a point-wise operator, $P \in \mathbb{R}^{n \times n}$ is the propagation matrix of the graph convolution network, $\alpha$ and $\beta$ are hyper-parameters, $I$ is the identity matrix, $W^{(k)}$ is the learnable weight matrix of the $k$-th layer, and the output $H^{GCN} \in \mathbb{R}^{n \times d}$ of the last layer is the first intermediate implicit vector.
4. The item recommendation method according to claim 3, wherein, for items connected by an edge, the attention paid to a node's own information is raised by increasing the weight of the left half of formula (4), and the hyper-parameters $\alpha$ and $\beta$ control the ratio between the identity-matrix information and the propagated-adjacency information in the propagation matrix, thereby controlling the proportion in which node information is absorbed with attention during propagation.
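A minimal PyTorch-style sketch of the graph-convolution branch of claims 2-4, under the reconstruction of formulas (3)-(5) given above; the choice of the sigmoid as the point-wise operator $\sigma(\cdot)$ and the sample values of $\alpha$ and $\beta$ are assumptions:

```python
import torch

def gcn_forward(A_w, X, weights, alpha=0.2, beta=0.8):
    """Sketch of formulas (1)-(5): symmetric normalization with self-loops,
    propagation matrix P = alpha*I + beta*S, then layer-wise propagation.
    A_w: weighted undirected adjacency (n x n); X: node features (n x d)."""
    n = A_w.size(0)
    A_tilde = A_w + torch.eye(n)                 # add self-loops
    d = A_tilde.sum(dim=1)                       # degrees of A_tilde
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt        # eq. (3), symmetric normalization
    P = alpha * torch.eye(n) + beta * S          # eq. (4), propagation matrix
    H = X                                        # eq. (1), H^(0) = X
    for W in weights:                            # one pass per graph convolution layer
        H = torch.sigmoid(P @ H @ W)             # eq. (5); sigma assumed sigmoid
    return H                                     # first intermediate implicit vectors

n, dim = 4, 8
H_gcn = gcn_forward(torch.rand(n, n), torch.randn(n, dim),
                    [torch.randn(dim, dim) * 0.1 for _ in range(2)])
print(H_gcn.shape)   # torch.Size([4, 8])
```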
5. The item recommendation method according to any one of claims 2 to 4, wherein the step of extracting the second intermediate implicit vector of the item transitions in the conversation sequence graph by using the gated graph neural network is:

$$a_{s,i}^{t} = A_{s,i:}\, \big[h_1^{t-1}, \dots, h_n^{t-1}\big]^{\top} H + b \qquad (6)$$

$$z_{s,i}^{t} = \sigma\big(W_z a_{s,i}^{t} + U_z h_i^{t-1}\big) \qquad (7)$$

$$r_{s,i}^{t} = \sigma\big(W_r a_{s,i}^{t} + U_r h_i^{t-1}\big) \qquad (8)$$

$$\tilde{h}_i^{t} = \tanh\big(W_o a_{s,i}^{t} + U_o\,(r_{s,i}^{t} \odot h_i^{t-1})\big) \qquad (9)$$

$$h_i^{t} = (1 - z_{s,i}^{t}) \odot h_i^{t-1} + z_{s,i}^{t} \odot \tilde{h}_i^{t} \qquad (10)$$

wherein $H$ and $b$ control the weights and the magnitude of the bias term, $a_{s,i}^{t}$ represents the result of the interaction between a node and its adjacent nodes, $z_{s,i}^{t}$ and $r_{s,i}^{t}$ are the update gate and the reset gate, respectively, the weight matrices $W_z, U_z$, $W_r, U_r$ and $W_o, U_o$ represent the learnable network parameters of the update gate, the reset gate and the output gate, respectively, $h_i^{t}$ represents the second intermediate implicit vector of node $v_i$, $[h_1^{t-1}, \dots, h_n^{t-1}]$ is the sequence of node vectors in the conversation, with the initialization $h_i^{0} = h_i^{GCN}$, the first intermediate implicit vector output by the graph convolution network, $\sigma(\cdot)$ is the sigmoid function and $\odot$ is the point-wise product; the adjacency matrix $A_s \in \mathbb{R}^{n \times 2n}$ represents the communication of the nodes in the graph, and $A_{s,i:} \in \mathbb{R}^{1 \times 2n}$ denotes the two columns of matrix blocks corresponding to node $v_i$ in $A_s$; the matrix $A_s$ is defined as the concatenation of the in-degree matrix $A_s^{in}$ and the out-degree matrix $A_s^{out}$.
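A sketch of one gated update of claim 5, formulas (6)-(10). The dimension bookkeeping is simplified (the neighbour aggregation is folded into a single weight W_a standing in for H and b), so the shapes are illustrative assumptions rather than the claimed ones:

```python
import torch

def ggnn_step(A_s, H_prev, params):
    """One gated graph neural network step (eqs. (6)-(10) as reconstructed).
    A_s: (n x 2n) concatenation of in- and out-degree matrices;
    H_prev: node vectors of the previous step, initialized with H_gcn."""
    W_a, b, W_z, U_z, W_r, U_r, W_o, U_o = params
    n = H_prev.size(0)
    # eq. (6): interaction with neighbours along both edge directions
    a = torch.cat([A_s[:, :n] @ H_prev, A_s[:, n:] @ H_prev], dim=1) @ W_a + b
    z = torch.sigmoid(a @ W_z + H_prev @ U_z)            # eq. (7), update gate
    r = torch.sigmoid(a @ W_r + H_prev @ U_r)            # eq. (8), reset gate
    h_tilde = torch.tanh(a @ W_o + (r * H_prev) @ U_o)   # eq. (9), candidate state
    return (1.0 - z) * H_prev + z * h_tilde              # eq. (10), new node vectors

d, n = 8, 4
params = [torch.randn(2 * d, d) * 0.1, torch.zeros(d)] + \
         [torch.randn(d, d) * 0.1 for _ in range(6)]
H_gnn = ggnn_step(torch.rand(n, 2 * n), torch.randn(n, d), params)
print(H_gnn.shape)   # torch.Size([4, 8])
```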
6. The item recommendation method according to claim 1, wherein in step S3 the first linear transformation is:

$$h = \gamma\, h^{GCN} + (1 - \gamma)\, h^{GNN} \qquad (11)$$

wherein $\gamma$ is a hyper-parameter for balancing the ratio of undirected structural information to directed structural information with attention, $h$ is the item implicit vector, and $h^{GNN}$ is the output of the gated graph neural network.
7. The item recommendation method according to claim 1, wherein in step S4 the target-attention-based vector of the conversation sequence $s$ is obtained as follows:

S41, computing, with the local target attention module, the attention distribution $e_{i,t}$ of every item $v_i$ in the conversation sequence $s$ with respect to each target item $v_t \in V$, and obtaining the attention score $\beta_{i,t}$ through the softmax(·) function:

$$\beta_{i,t} = \mathrm{softmax}(e_{i,t}) = \frac{\exp\big(h_i^{\top} W_{target}\, h_t\big)}{\sum_{j=1}^{n} \exp\big(h_j^{\top} W_{target}\, h_t\big)} \qquad (12)$$

wherein $h_i$ and $h_t$ are the item implicit vector representations of item $v_i$ and of the target item $v_t$, respectively, $W_{target} \in \mathbb{R}^{d \times d}$ is a weight matrix generated through training, and $\exp(\cdot)$ raises the constant $e$ to the given power;

S42, for each conversation sequence $s$, expressing the user's interest with respect to the target item $v_t$ as the vector $s_{target}^{t}$:

$$s_{target}^{t} = \sum_{i=1}^{n} \beta_{i,t}\, h_i \qquad (13)$$
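A sketch of the target attention of claim 7, formulas (12)-(13); for brevity the candidate (target) items are taken to be the session's own items, and the weight matrix is randomly initialized:

```python
import torch

def target_attention(H_item, W_target):
    """e[i, t] = h_i^T W_target h_t (eq. (12)); softmax over the session
    items i; row t of the result is s_target^t (eq. (13))."""
    e = H_item @ W_target @ H_item.t()     # pairwise item/target attention logits
    beta = torch.softmax(e, dim=0)         # normalize over session items i
    return beta.t() @ H_item               # row t = sum_i beta[i, t] * h_i

d = 8
s_target = target_attention(torch.randn(5, d), torch.randn(d, d) * 0.1)
print(s_target.shape)   # torch.Size([5, 8]) -- one vector per target item
```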
8. The item recommendation method according to claim 7, wherein said step S5 comprises:

S51, obtaining the local vector $s_l$: the local vector is the vector representation of the last item $v_n$ in the conversation sequence $s$, i.e. its implicit vector:

$$s_l = h_n \qquad (14)$$

S52, obtaining the global vector $s_g$: the global vector aggregates the vectors of all items appearing in the conversation sequence $s$, and an attention mechanism is also introduced to capture the dependency of the items $[v_1, v_2, \dots, v_n]$ appearing in the whole conversation sequence $s$ on the last item:

$$\alpha_i = q^{\top} \sigma\big(W_1 h_n + W_2 h_i + c\big) \qquad (15)$$

$$s_g = \sum_{i=1}^{n} \alpha_i\, h_i \qquad (16)$$

wherein $h_n$ is the implicit vector of the last item in the conversation, $q \in \mathbb{R}^{d}$ and $W_1, W_2 \in \mathbb{R}^{d \times d}$ are the corresponding weight parameters, and $\alpha_i$ represents the dependency between the last item and each item appearing in the entire conversation sequence;

S53, splicing the local vector, the global vector and the target-attention-based vector, and obtaining the conversation vector corresponding to the conversation sequence $s$ by the second linear transformation:

$$s_h = W_3\, [s_l;\ s_g;\ s_{target}^{t}] \qquad (17)$$

wherein the weight parameter $W_3 \in \mathbb{R}^{d \times 3d}$ projects the result of the concatenation of the three vectors into the vector space $\mathbb{R}^{d}$; $W_1$, $W_2$ and $W_3$ are weight parameter matrices generated through training.
9. The item recommendation method according to claim 8, wherein said step S6 comprises:

S61, after the conversation vector $s_h$ corresponding to each conversation sequence $s$ is obtained, computing the score $\hat{z}_i$ of every item $v_i \in V$, i.e. multiplying the second implicit vector $h_i$ of every item $v_i \in V$ by the corresponding conversation vector $s_h$:

$$\hat{z}_i = s_h^{\top} h_i \qquad (18)$$

S62, outputting the vector $\hat{y}$ through a second softmax function:

$$\hat{y} = \mathrm{softmax}(\hat{z}) \qquad (19)$$

wherein $\hat{z} \in \mathbb{R}^{m}$ represents the predicted recommendation scores of all target items, and $\hat{y} \in \mathbb{R}^{m}$ represents the probabilities that the target items are clicked at the next moment in the conversation sequence $s$; the top $K$ items with the highest probabilities are the items to be recommended; for the conversation sequence graph $G_s$, the loss function is defined as the cross entropy of the predicted values and the ground-truth values.
10. The item recommendation method according to any one of claims 1 to 9, wherein steps S2 to S6 are repeated and the parameters are trained by using a back-propagation-through-time algorithm.
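Finally, a minimal training-loop sketch for claim 10. The model body is a placeholder to be replaced by the chained steps S2-S6 from the sketches above; with an autograd framework, back-propagation through the gated (recurrent) updates realizes the time-based back-propagation the claim refers to:

```python
import torch

# Assumed setup: `sessions` yields (click_sequence, next_item) pairs and
# `params` collects all learnable tensors of the model.
params = [torch.randn(8, 8).mul_(0.1).requires_grad_(True)]
optimizer = torch.optim.Adam(params, lr=1e-3)

def model_loss(seq, next_item):
    # Placeholder objective: a full model would chain steps S2-S6 here
    # (graph build, GCN, gated GNN, target attention, session vector, softmax).
    return (params[0].sum() - float(next_item)) ** 2

sessions = [([3, 8, 3, 5], 8), ([1, 2], 4)]
for epoch in range(3):
    for seq, nxt in sessions:
        optimizer.zero_grad()
        loss = model_loss(seq, nxt)
        loss.backward()        # gradients flow back through the time steps
        optimizer.step()
print(float(loss))
```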
CN202111447363.6A 2021-12-01 2021-12-01 Project recommendation method of graph neural network based on directed and undirected structural information Pending CN114117229A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111447363.6A CN114117229A (en) 2021-12-01 2021-12-01 Project recommendation method of graph neural network based on directed and undirected structural information


Publications (1)

Publication Number Publication Date
CN114117229A 2022-03-01

Family

ID=80369156



Similar Documents

Publication Publication Date Title
Wu et al. Session-based recommendation with graph neural networks
CN111080400B (en) Commodity recommendation method and system based on gate control graph convolution network and storage medium
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN112364976B (en) User preference prediction method based on session recommendation system
CN110796313B (en) Session recommendation method based on weighted graph volume and item attraction model
CN112733018B (en) Session recommendation method based on graph neural network GNN and multi-task learning
CN111753209B (en) Sequence recommendation list generation method based on improved time sequence convolution network
CN111241425B (en) POI recommendation method based on hierarchical attention mechanism
CN115048586B (en) Multi-feature-fused news recommendation method and system
CN111563770A (en) Click rate estimation method based on feature differentiation learning
CN112765461A (en) Session recommendation method based on multi-interest capsule network
CN114637911A (en) Next interest point recommendation method of attention fusion perception network
CN113297487A (en) Attention mechanism-based sequence recommendation system and method for enhancing gated cyclic unit
CN111159242B (en) Client reordering method and system based on edge calculation
CN116051175A (en) Click rate prediction model and prediction method based on depth multi-interest network
Abugabah et al. Dynamic graph attention-aware networks for session-based recommendation
CN114925270A (en) Session recommendation method and model
Zeng et al. Collaborative filtering via heterogeneous neural networks
CN115618079A (en) Session recommendation method, device, electronic equipment and storage medium
CN115470406A (en) Graph neural network session recommendation method based on dual-channel information fusion
CN113010774B (en) Click rate prediction method based on dynamic deep attention model
CN112559905B (en) Conversation recommendation method based on dual-mode attention mechanism and social similarity
CN114117229A (en) Project recommendation method of graph neural network based on directed and undirected structural information
CN114625969A (en) Recommendation method based on interactive neighbor session
CN114547276A (en) Three-channel diagram neural network-based session recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination