Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a recommendation method based on a heterogeneous graph neural network.
The invention is realized by adopting the following technical scheme:
a recommendation method based on a heterogeneous graph neural network comprises the following steps:
collecting a data set containing the social relationships among users, the user-commodity interaction history data and the commodity category information, filtering out invalid data and performing negative sampling;
randomly selecting a user set and a related commodity set from the data set, and performing multi-level graph sampling and graph construction;
node feature extraction: inputting the constructed graph into a heterogeneous graph neural network comprising a plurality of heterogeneous graph memory network layers to obtain the fusion node embedded vector of each node; for commodity nodes, which do not require the recalibration step, the fusion node embedded vectors are directly the commodity fusion embedded vectors;
recalibration: recalibrating the user fusion node embedded vector to obtain the user final representation embedded vector;
recommendation and prediction: performing preference prediction with the user final representation embedded vector and the commodity fusion embedded vector, and producing a recommendation sequence.
Preferably, the multi-level graph sampling and graph construction process includes:
(1) randomly selecting a user-commodity pair from the data set; if the pair has not yet been negatively sampled, further performing commodity negative sampling; the user, the commodity and the negatively sampled commodities in the selected set serve as the seed nodes for graph sampling;
(2) performing multi-order graph sampling from the seed nodes: determining the number of orders and the number of sampling points per order, and then sampling the graph.
Preferably, the heterogeneous graph neural network comprises a multi-layer skip-connection neural network and a user embedded vector recalibration network, wherein:
the multi-layer skip-connection neural network uses a multi-layer skip structure in which every graph network layer has a skip connection to the final output;
the user embedded vector recalibration network recalibrates the user fusion node embedded vectors.
Preferably, the step of extracting the fusion node embedding vector of the node comprises:
(1) inputting the graph constructed in step S2 into the heterogeneous graph neural network, and obtaining the initial embedded vector of each node from an initial embedded vector table;
(2) obtaining, through the plurality of heterogeneous graph memory network layers, the node output embedded vector of each heterogeneous graph memory network layer;
(3) fusing the initial embedded vector of the node with the node output embedded vectors of each heterogeneous graph memory network layer to obtain the fusion node embedded vector of the node.
Preferably, the processing procedure of the heterogeneous graph memory network layer comprises:
1) for each edge in the graph, performing feature extraction at the edge's endpoint with a memory-enhanced heterogeneous relation encoder;
2) fusing the messages received by each node to obtain the node output embedded vector;
3) either taking the fused message output directly as the node output embedded vector of the heterogeneous graph memory network layer, or normalizing it with Layernorm and adding the output of a memory-enhanced heterogeneous relation encoder applied over a self-loop, the sum serving as the final node output embedded vector of the layer.
Preferably, the memory-enhanced heterogeneous relation encoder performs the following operations:
a) obtaining the coefficient of each memory unit from the embedded vector of the target endpoint under the edge's relation, via a learnable linear transformation with bias followed by an activation function;
b) using the sum of the products of all memory units and their corresponding coefficients as a transformation matrix to linearly transform the embedded vector of the start point under the edge's relation.
Preferably, the message fusion function that fuses the messages received by each node takes a different form for different node types:
for a user node, the message fusion function is expressed as the sum of the normalized messages obtained, through the memory-enhanced heterogeneous relation encoder, from the node's social relations and from its user-commodity interaction history;
for a commodity node, the message fusion function is expressed as the sum of the normalized messages obtained, through the memory-enhanced heterogeneous relation encoder, from the node's user-commodity interaction history and from its commodity category relation;
for a commodity category node, the message fusion function is expressed as the normalized sum of the messages obtained, through the memory-enhanced heterogeneous relation encoder, over the node's commodity category relation.
Preferably, the fusion node embedded vector of a node is obtained by concatenating the node's initial embedded vector with the output of each heterogeneous graph memory network layer, and normalizing with Layernorm.
Preferably, the recalibration method comprises: adding the user fusion node embedded vector to the result of a first-order graph convolution, with self-loop, of the user node over the social relation, to obtain the user final representation embedded vector.
Preferably, the recommendation method further comprises:
in the heterogeneous graph neural network training phase, using BPRLoss as the loss function for supervision and back-propagating the gradient into the heterogeneous graph neural network.
Compared with the prior art, the invention has the following advantages:
1. The method simultaneously considers the social relationships among users, the characteristics of commodities, and the user-commodity interaction history. It therefore retains the strong performance of collaborative filtering while gaining the content-based method's resistance to data sparsity, and, by accounting for the influence of social relationships on user preference, achieves more accurate recommendation and alleviates the problem of information overload.
2. The method copes well with the data sparsity problem encountered by collaborative filtering. For users and commodities with few interaction records, traditional collaborative filtering can characterize an object only through its neighbors, and the scarcity of those neighbors biases the characterization. Introducing social information and category information allows the target object to be characterized from both the user side and the commodity side, alleviating the data sparsity and missing-data problems of collaborative filtering.
3. The invention introduces skip connections to overcome the vanishing gradient problem of conventional graph neural network architectures: each graph neural network layer has a skip connection to the final output layer, so gradients do not vanish as the network deepens.
4. The heterogeneous graph neural network layer adopts a memory-enhanced relation encoder to capture heterogeneous relation features. Different connection types use different memory parameters, which differentiates message passing at the parameter level, reduces the mutual information among messages of different types, and increases the information carried by message passing; heterogeneous relations such as user social information and commodity category information can thus be better exploited to recommend commodities to users.
5. The invention adopts a user embedded vector recalibration mechanism. The feature components of the user node embedded vectors output by the graph neural network come from heterogeneous sources; recalibration strengthens the influence of social connections on the user fusion node embedded vector, so that the user final representation embedded vector describes personal characteristics more strongly.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments.
The working principle of the invention is as follows: the data set is first preprocessed to obtain the social relationships among users, the user-commodity interaction history data and the commodity category data; multi-level graph sampling is then performed and a heterogeneous graph is constructed; finally, the constructed heterogeneous graph is input into a trained heterogeneous graph neural network for recommendation prediction.
A recommendation method based on a heterogeneous graph neural network comprises the following steps:
S1, constructing a data set: collecting, in scenarios such as e-commerce and review websites, a data set containing the social relationships among users, the user-commodity interaction history data and the commodity category information, filtering out invalid data and performing negative sampling.
Most currently disclosed recommendation methods are based on collaborative filtering, but in real e-commerce, review and video services a provider usually holds information beyond the user-commodity interaction history, such as the social relationships among users and the commodity categories. Exploiting this additional information allows the characteristics of users and commodities to be described more accurately, yielding more accurate user preferences, reducing the time users spend on commodities they are not interested in, and increasing their willingness to interact. Many collaborative-filtering neural network methods perform graph neural network inference on the user-commodity interaction bipartite graph alone, but such single-relation homogeneous-graph methods are not applicable to the present invention, which involves multiple relations. To use the additional information, the collected data set must therefore contain, in addition to the user-commodity interaction history used by common collaborative filtering methods, the user social relationships and the commodity category data. Data set collection is thus critical to network training.
When constructing the data set, data sets containing the social relationships among users, the user-commodity interaction history and the commodity category information can be collected from e-commerce websites, review websites and similar scenarios. Several published review data sets, Ciao, Epinions and Yelp, can serve as candidate data sets for model training and validation tests. Data sets with more relations, such as those containing users' personal characteristics or interrelations between commodities, may also serve as the model target data set. Based on the work done so far, data set collection can proceed as follows:
(1) directly collecting existing public data sets, such as the Ciao, Epinions and Yelp data sets, which already contain the above relations and can be used in the heterogeneous graph neural network after simple preprocessing;
(2) collecting or generating the data set from a network service provider: the data can be collected by the provider, and the user social relationships or commodity category relations can be generated from other data, such as the user-commodity interaction data.
After data collection, if there is no cold-start requirement, users and commodities without any user-commodity interaction are removed from the data set during its construction, which avoids polluting the valid data to some extent. The user-commodity interaction records are then sorted by time; the last two interaction records of each user serve as the validation set and the test set respectively, and the remaining interaction records form the training set. Users with fewer than three user-commodity interactions are not placed into the validation or test set. For the validation and test sets, negative sampling is performed while constructing the data set to ensure the reproducibility of subsequent results.
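The split just described can be sketched in plain Python as follows (an illustrative sketch only; the function name `split_interactions` and the data layout are hypothetical and not part of the invention):

```python
import random

def split_interactions(history, all_items, n_neg=99, seed=0):
    """Leave-two-out split per user, with fixed negative sampling.

    history:   dict user -> list of (item, timestamp) pairs
    all_items: set of every item id in the data set
    Users with fewer than three interactions stay entirely in the
    training set and never enter the validation or test set.
    """
    rng = random.Random(seed)
    train, valid, test = {}, {}, {}
    for user, inter in history.items():
        items = [it for it, _ in sorted(inter, key=lambda x: x[1])]
        if len(items) < 3:
            train[user] = items
            continue
        train[user] = items[:-2]
        # negatives are drawn once and stored with the data set so that
        # later evaluation runs are reproducible
        pool = sorted(all_items - set(items))
        negatives = rng.sample(pool, min(n_neg, len(pool)))
        valid[user] = (items[-2], negatives)   # second-to-last interaction
        test[user] = (items[-1], negatives)    # last interaction
    return train, valid, test
```

Storing the sampled negatives alongside each validation/test positive is what makes later evaluations reproducible, as the text requires.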
S2, preprocessing: randomly selecting a user set and a related commodity set from the data set, performing multi-level graph sampling and construction, and taking the resulting graph as the input of the heterogeneous graph neural network.
The input of the heterogeneous graph neural network is a graph containing the social relationships among users, the user-commodity interaction history and the commodity category information, so the samples must be sampled and assembled into a graph, which mainly comprises the following two steps:
(1) randomly selecting user-commodity pairs from the data set; if a pair has not yet been negatively sampled, further performing commodity negative sampling; the users, commodities and negatively sampled commodities in the selected set serve as the seed nodes for graph sampling.
(2) performing multi-order graph sampling from the seed nodes: determining the number of orders and the number of sampling points per order, and then sampling. Finally, the graph formed by the sampled nodes and the edges of the relations involved in the sampling process is taken as the input graph of the heterogeneous graph neural network.
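The two-step sampling procedure above can be sketched as follows (a minimal illustration; the adjacency layout and the name `sample_subgraph` are hypothetical):

```python
import random

def sample_subgraph(adjacency, seeds, fanouts, seed=0):
    """Multi-order graph sampling from seed nodes.

    adjacency: dict node -> list of (neighbour, relation) pairs
    seeds:     seed nodes (user, commodity, negative-sample commodities)
    fanouts:   number of sampling points per order, e.g. [10, 5]
    Returns the sampled node set and the relation edges involved,
    which together form the input graph of the network.
    """
    rng = random.Random(seed)
    nodes, edges = set(seeds), []
    frontier = list(seeds)
    for fanout in fanouts:                  # one pass per sampling order
        nxt = []
        for node in frontier:
            nbrs = adjacency.get(node, [])
            for nb, rel in rng.sample(nbrs, min(fanout, len(nbrs))):
                edges.append((node, nb, rel))
                if nb not in nodes:
                    nodes.add(nb)
                    nxt.append(nb)
        frontier = nxt
    return nodes, edges
```

The length of `fanouts` fixes the number of orders, and each entry fixes the number of sampling points of that order, matching step (2).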
S3, node feature extraction: inputting the constructed graph into a heterogeneous graph neural network comprising a plurality of heterogeneous graph memory network layers to obtain the fusion node embedded vector of each node.
For node feature extraction, all nodes of the graph constructed in preprocessing first obtain an initial embedded matrix from the initial embedded vector table. The initial embedded matrix is then input into a heterogeneous graph memory network layer to obtain an output node embedded matrix, and this process is repeated several times to obtain node embedded matrices at different levels; the output of each heterogeneous graph memory network layer corresponds to different node features. Shallow features better describe the node itself and the neighbors directly connected to it, while deep features capture high-level abstract features of the node. The final output of the heterogeneous graph neural network for a node is obtained by normalizing and fusing the node's initial embedded vector with the embedded vectors output by each heterogeneous graph memory network layer, yielding the fusion node embedded vector of the node.
In a preferred embodiment, as shown in FIG. 3, the heterogeneous graph neural network comprises a multi-layer skip-connection neural network and a user embedded vector recalibration network connected in series.
The multi-layer skip-connection neural network connects every graph network layer (heterogeneous graph memory network layer) to the final output through a skip connection, which avoids vanishing gradients and lets the output retain information from more layers. The output of each graph neural network layer corresponds to aggregated information from neighbors of a given order. Low-order features describe more of the user or commodity object itself, differ more between nodes of the same type, and discriminate objects more strongly; high-order features describe more of the features common to nodes within a certain-order neighborhood of the object, differ relatively less between nodes of the same type, and better describe class-level features of the user or commodity object. Finally, the input features, low-order features and high-order features are fused; the fused features describe users and commodities better and make their features more distinctive.
Specifically, for node feature extraction, as shown in FIG. 2, all nodes of the graph first obtain an initial embedded matrix H(0) through the initial embedded vector table, which is then processed by a stack of graph neural network layers. Each layer corresponds to feature representations within neighborhoods of a different order: lower-order features describe more of the user or commodity object itself, while higher-order features describe more of the features common to nodes within a certain-order neighborhood of the object. The final output feature is the fusion of the multi-layer graph neural network outputs, so multi-order features of different orders are obtained from the network output.
Specifically, the step of extracting the fusion node embedding vector of the node includes:
(1) inputting the graph constructed in step S2 into the heterogeneous graph neural network, and obtaining the initial embedded vector of each node from an initial embedded vector table;
(2) passing the nodes through the plurality of heterogeneous graph memory network layers, where the operation of each heterogeneous graph memory network layer comprises:
1) performing feature extraction at the edge's endpoint, for each edge in the graph, with a memory-enhanced heterogeneous relation encoder, the encoder comprising the following operations:
a) obtaining the coefficient of each memory unit from the embedded vector of the target endpoint under the edge's relation, via a learnable linear transformation with bias followed by an activation function;
b) using the sum of the products of all memory units and their corresponding coefficients as a transformation matrix to linearly transform the embedded vector of the start point under the edge's relation.
2) fusing the messages received by each node, where the message fusion function takes a different form for different node types:
For a user node, the message fusion function is expressed as the sum of the normalized messages obtained, through the memory-enhanced heterogeneous relation encoder, from the node's social relations and from its user-commodity interaction history.
For a commodity node, the message fusion function is expressed as the sum of the normalized messages obtained, through the memory-enhanced heterogeneous relation encoder, from the node's user-commodity interaction history and from its commodity category relation.
For a commodity category node, the message fusion function is expressed as the normalized sum of the messages obtained, through the memory-enhanced heterogeneous relation encoder, over the node's commodity category relation.
3) The output embedded vector obtained after fusing each node's messages can be used directly as the node output embedded vector of the heterogeneous graph memory network layer. Alternatively, it can be normalized with Layernorm and added to the output of a memory-enhanced heterogeneous relation encoder applied over a self-loop, the sum serving as the final node output embedded vector of the layer; this enhances the stability of the model during training.
(3) After the node output embedded vectors of each heterogeneous graph memory network layer are obtained, they are fused to obtain the fusion node embedded vector of the node. The fusion method is: concatenating the node initial embedded vector with the output of each heterogeneous graph memory network layer, and normalizing with Layernorm.
The specific structure of each graph neural network layer is as follows (taking the l-th layer as an example):
(1) Input: a graph G = (V, E, A, B) and an input graph node embedded matrix H(l-1), where each node v ∈ V and each edge e ∈ E is associated with its type through the mapping functions V → A and E → B, A and B being the set of node types and the set of edge relations, respectively. The input graph node embedded matrix is the output of the previous graph neural network layer; for the first layer it is the initial embedded matrix obtained from the initial embedded vector table.
(2) For a node t of the graph, the output embedded vector H(l)[t] of the l-th graph neural network layer can be described by the following formula:
H(l)[t] = Aggre({Enc_(e(s,t))(H(l-1)[s]) | s ∈ N(t)})
wherein: N(t) denotes the (start-point) neighbor set of node t; e(s,t) denotes the edge connecting node s and node t; Enc(·) denotes the memory-enhanced heterogeneous relation encoder, and Aggre(·) is the message fusion function; H(l-1)[t] is the input embedded vector of node t, and H(l-1)[s] is the input embedded vector of the start point s passed along edge e(s,t).
The memory-enhanced heterogeneous relation encoder is used for capturing heterogeneous relation features, introducing different memory units for the heterogeneous relations determined by the node types and edge types. Given the number of memory units M of a particular relation type, the memory-enhanced heterogeneous relation encoder can be described by the following formula:
Enc(H(l-1)[s]) = (Σ_{m=1..M} η_m(H(l-1)[t]) · M_m) · H(l-1)[s],  with η_m(x) = σ(w_m · x + b_m)
wherein: η(·) denotes the target-node-specific coefficient function, namely the activation value of a learnable linear transformation with bias applied to the target node's embedded vector; w_m and b_m are learnable parameters, and the memory units M_m of the memory-enhanced heterogeneous relation encoder are likewise learnable parameters.
The output of the memory-enhanced heterogeneous relation encoder is thus the product of the embedded vector of the given relation's start point with the sum, over all memory units, of each memory unit weighted by its target-node-specific coefficient. The activation function σ(·) may be chosen freely; this embodiment takes LeakyReLU as an example, σ(x) = max(x, αx), where the negative slope α may take any value within the interval [0, 1].
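Operations a) and b) of the encoder can be sketched in plain Python as follows (an illustrative sketch only; the class name, dimensions and random initialisation are hypothetical, and a real implementation would use a deep-learning framework with learnable tensors):

```python
import random

def leaky_relu(x, alpha=0.2):
    return x if x > 0.0 else alpha * x

class MemoryRelationEncoder:
    """One encoder instance per relation type. `memories` holds M
    learnable d x d memory units; `w` and `b` parameterise the
    target-node-specific coefficient function
    eta_m(h_t) = LeakyReLU(w_m . h_t + b_m)."""

    def __init__(self, dim, n_memory, seed=0):
        rng = random.Random(seed)
        r = lambda: rng.uniform(-0.1, 0.1)
        self.dim = dim
        self.memories = [[[r() for _ in range(dim)] for _ in range(dim)]
                         for _ in range(n_memory)]
        self.w = [[r() for _ in range(dim)] for _ in range(n_memory)]
        self.b = [r() for _ in range(n_memory)]

    def __call__(self, h_start, h_target):
        # a) coefficient of each memory unit from the edge's target endpoint
        coef = [leaky_relu(sum(wi * hi for wi, hi in zip(wm, h_target)) + bm)
                for wm, bm in zip(self.w, self.b)]
        # b) transformation matrix = sum of coefficient-weighted memory units
        trans = [[sum(c * M[i][j] for c, M in zip(coef, self.memories))
                  for j in range(self.dim)] for i in range(self.dim)]
        # linearly transform the start-point embedding with that matrix
        return [sum(trans[i][j] * h_start[j] for j in range(self.dim))
                for i in range(self.dim)]
```

Note that the output is linear in the start-point embedding (the memory coefficients depend only on the target endpoint), which is the defining property of the transformation-matrix construction in b).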
The message fusion function Aggre(·) also differs between node types:
In a preferred embodiment, for a user node u_i, with N_S(u_i) the set of its neighbors under the social relation S and N_Y(u_i) the set of its neighbors under the user-commodity interaction history Y, the message fusion function can be described by the following formula:
Aggre(u_i) = (1/|N_S(u_i)|) Σ_{s∈N_S(u_i)} Enc_S(H(l-1)[s]) + (1/|N_Y(u_i)|) Σ_{v∈N_Y(u_i)} Enc_Y(H(l-1)[v])
wherein: |N_S(u_i)| is the number of social neighbors of user node u_i, and |N_Y(u_i)| is the number of its interaction histories.
For a commodity node v_j, the messages comprise two parts, from the users and from the category relation, and the message fusion function can be expressed by the following formula:
Aggre(v_j) = (1/|N_Y(v_j)|) Σ_{u∈N_Y(v_j)} Enc_Y(H(l-1)[u]) + Enc_C(H(l-1)[r])
wherein: N_Y(v_j) denotes the set of users interacting with commodity v_j, and r is the category relation node associated with commodity v_j.
The commodity category relation node r can be described by the following formula:
H(l)[r] = (1/|N_r|) Σ_{v∈N_r} Enc_C(H(l-1)[v])
wherein: N_r denotes the neighbors of the category relation node r.
Preferably, the fused message output embedded vector can be normalized with Layernorm and a self-loop added to enhance the stability of the model during training. For a node t of the graph, this process can be expressed by the following formula:
H(l)[t] = w_1 ⊙ (H(l)[t] − μ)/σ + w_2 + Enc_self(H(l-1)[t])
wherein: w_1 and w_2 are learnable scaling and bias parameters, and μ and σ denote the mean and standard deviation of the output embedded vector H(l)[t], respectively.
(3) After the node output embedded vectors of each graph neural network layer are obtained, they are fused to obtain the fusion node embedded vector of the node.
For a node t of the graph, the fusion node embedded vector can be described by the following formula:
H*[t] = H(0)[t] || H(1)[t] || … || H(l)[t]
wherein H(0)[t] denotes the node initial embedded vector and || denotes concatenation.
If Layernorm is not selected for use in (2), the fusion process can be described using the following equation:
H*[t]=Layernorm(H(0)[t]||H(1)[t]||…||H(l)[t])
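The fusion step can be sketched as follows (a minimal illustration; the plain Layernorm omits the learnable scale and bias for brevity):

```python
import math

def layernorm(vec, eps=1e-5):
    # plain Layernorm; learnable scale and bias omitted for brevity
    mu = sum(vec) / len(vec)
    var = sum((x - mu) ** 2 for x in vec) / len(vec)
    return [(x - mu) / math.sqrt(var + eps) for x in vec]

def fuse_layers(layer_outputs, already_normalised=False):
    """Concatenate H(0)[t], H(1)[t], ..., H(l)[t]; when the per-layer
    outputs were not Layernorm-ed in step (2), normalise the result."""
    concat = [x for layer in layer_outputs for x in layer]
    return concat if already_normalised else layernorm(concat)
```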
For commodity nodes, which do not require the recalibration step, the fusion node embedded vectors are directly the commodity fusion embedded vectors used in the prediction stage.
S4, recalibration: recalibrating the user fusion node embedded vector obtained in the feature extraction stage to obtain the user final representation embedded vector.
The user fusion node embedded vector obtained from the multi-layer skip-connection neural network is recalibrated by the user embedded vector recalibration network to obtain the user final representation embedded vector. The recalibration method adds the user fusion node embedded vector to the result of a first-order graph convolution, with self-loop, of the user node over the social relation.
Recalibrating the user fusion node embedded vector amounts to a pooling operation over the user and its social-relation neighbor nodes; the recalibration output is the user final representation embedded vector.
Recalibrating the user fusion node embedded vector makes the user embedded vector more strongly influenced by the user's social friends, which improves the final recommendation effect in practice. The recalibration result of user u_i can be described by the following formula:
τ(H*[u_i]) = H*[u_i] + (1/(|N_S(u_i)|+1)) Σ_{s∈N_S(u_i)∪{u_i}} H*[s]
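The recalibration can be sketched as follows (illustrative only; `recalibrate` is a hypothetical name):

```python
def recalibrate(h_user, h_social_neighbours):
    """Add a self-loop first-order graph convolution over the social
    relation (mean of the user's and its social neighbours' fusion
    embedded vectors) to the user fusion node embedded vector."""
    group = [h_user] + h_social_neighbours      # self-loop included
    conv = [sum(col) / len(group) for col in zip(*group)]
    return [a + b for a, b in zip(h_user, conv)]
```

A user with no social neighbours simply adds its own (self-loop) mean, so the mechanism degrades gracefully.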
and S5, recommending and predicting, performing preference prediction by using the user final expression embedded vector and the commodity fusion embedded vector, and obtaining a recommendation sequence.
In the recommendation prediction stage, the embedded vectors of the nodes corresponding to the target users and the target commodities are used for prediction, and the logic value prediction can be performed in the simplest inner product mode. And the commodity recommendation sequence is arranged according to the logic value descending order.
The result of the network prediction is determined by the inner product of the user final expression embedded vector and the commodity fusion embedded vector after recalibration, and the larger the value is, the higher the preference degree of the user to the commodity is, and vice versa. And finally, determining the result of the recommendation sequence according to the value in a descending order, wherein the commodity corresponding to the prediction result with the maximum value is the commodity with the strongest preference of the user. For a data set with negative sampling, the forecast commodity set is a union set of a positive sample set and a negative sample set, and the final recommendation result is determined according to the recommendation value of the set.
In the recommendation prediction stage, for a target user u_i and a target commodity v_j, the prediction function is defined as follows:
ξ(u_i, v_j) = τ(H*[u_i]) · H*[v_j]
The larger the predicted value, the stronger user u_i's preference for commodity v_j; the recommended commodity sequence is ordered by descending predicted value.
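The inner-product scoring and descending sort can be sketched as follows (illustrative only; `recommend` is a hypothetical name):

```python
def recommend(h_user_final, item_embeddings):
    """Score each candidate commodity by the inner product with the
    user final representation embedded vector and return the commodity
    ids in descending order of predicted preference."""
    scores = {item: sum(a * b for a, b in zip(h_user_final, h))
              for item, h in item_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)
```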
In one embodiment, the method further comprises:
and S6, monitoring the prediction result in a heterogeneous graph neural network training stage so as to optimize network parameters.
The heterogeneous neural network training phase uses BPRLoss as a loss function to supervise and then backpropagate the gradient into the network.
In the training phase, for a user u_i, a positive sample commodity v_(j+) and a negative sample commodity v_(j-), the output predicted values are supervised, and the training loss can be defined as follows:
Loss = Σ_{(i,j+,j-)∈O} −ln δ(ξ(u_i, v_(j+)) − ξ(u_i, v_(j-))) + λ‖Θ‖²
wherein the training data set O = {(i, j+, j-) | (i, j+) ∈ Y+, (i, j-) ∈ Y-}, i.e. the triples of subscripts formed by user u_i, a positive sample commodity v_(j+) appearing in the user-commodity interaction history Y+ of the data set, and a negative sample commodity v_(j-) not appearing in the data set; Θ denotes the training parameters, δ(·) is the sigmoid function, and λ is a regularization adjustment factor whose value can be chosen according to the actual situation.
The optimization goal is to minimize the training loss; the gradient of the loss with respect to each parameter is computed and the model is optimized by the back propagation algorithm.
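The loss above can be sketched as follows (illustrative only; `score` stands for the prediction function ξ, and the names are hypothetical):

```python
import math

def bpr_loss(triples, score, params, lam=1e-4):
    """BPRLoss over (user, positive item, negative item) triples.
    score(u, v) stands for the prediction xi(u, v); params is a flat
    list of model parameters used for L2 regularisation."""
    loss = 0.0
    for u, pos, neg in triples:
        margin = score(u, pos) - score(u, neg)
        loss += -math.log(1.0 / (1.0 + math.exp(-margin)))   # -ln sigmoid
    return loss + lam * sum(p * p for p in params)
```

Minimising this loss pushes each positive commodity's predicted value above the paired negative's, which is exactly the pairwise ranking objective the training stage requires.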
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, simplifications, substitutions, and combinations which do not depart from the spirit and principle of the present invention should be regarded as equivalent substitutions and are included in the scope of the present invention.