CN115525836A - Graph neural network recommendation method and system based on self-supervision - Google Patents


Info

Publication number
CN115525836A
CN115525836A
Authority
CN
China
Prior art keywords
data
vector
popularity
neural network
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210969801.3A
Other languages
Chinese (zh)
Inventor
孙爱晶
王国庆
魏帆
李益佳
王欣茹
杨凯琳
任丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications filed Critical Xian University of Posts and Telecommunications
Priority to CN202210969801.3A priority Critical patent/CN115525836A/en
Publication of CN115525836A publication Critical patent/CN115525836A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a graph neural network recommendation method and system based on self-supervision. The recommendation method comprises the following steps: S1, acquiring user and item interaction data to form a bipartite graph; S2, enhancing the data through a popularity bias reduction method to generate enhanced data for two views; S3, encoding the enhanced data to generate vectors Z′ and Z″ respectively; S4, training the vectors Z′ and Z″ with a loss function to obtain a graph neural network recommendation model; and S5, pushing item information of interest to the user based on the graph neural network recommendation model. By adopting a domain-related data enhancement mode, the enhanced data preserve the information required by the recommendation task to the greatest extent; the enhanced data are then encoded and iteratively trained with a loss function to obtain the graph neural network recommendation model.

Description

Graph neural network recommendation method and system based on self-supervision
Technical Field
The invention relates to the technical field of recommendation systems, in particular to a graph neural network recommendation method and system based on self-supervision.
Background
With the rapid development of the internet, information has grown explosively, and how to filter massive amounts of information has become a concern of many researchers. A recommendation system can effectively filter the original information, generating a personalized recommendation result for each user and alleviating the problem of information overload. Recommendation systems are currently widely applied in e-commerce, social networks, intelligent healthcare, and online education.
Recommendation systems based on graph neural networks have become an industry focus in recent years. Compared with classical collaborative filtering algorithms, graph neural networks can capture high-order relations between users and items, raising the performance ceiling of the model by exploiting deep collaborative filtering signals during training. Meanwhile, to alleviate the cold-start problem of recommendation systems and mine more supervision information from the original data, some scholars have proposed introducing self-supervised training into graph-based recommendation systems. The self-supervised algorithm, a branch of unsupervised learning, originated in the field of image processing. Unlike image classification, which trains on manually labeled image labels, its loss function essentially makes the vectors generated by two views supervise each other; it is called self-supervision because its supervision signal comes from the data itself and does not depend on manual labeling. Existing work combining self-supervised algorithms with graph neural networks for recommendation mainly enhances the user-item interaction bipartite graph randomly and then trains with the traditional contrastive loss function from the image field, without optimizing for the recommendation setting, which mainly causes the following problems:
(1) Domain-independent data enhancement. Current self-supervision-based graph recommendation algorithms typically perform data enhancement by random edge dropping or random feature masking; such domain-independent enhancement has been shown to destroy the internal structure of the original graph data, causing information loss and producing coupled vector representations.
(2) Random negative sampling. Contrastive loss functions usually adopt a strategy of pushing negative samples apart to prevent the network model from collapsing, which requires a large amount of high-quality negative sample data. However, existing models usually sample negatives randomly from all samples, or treat all samples in a batch other than the target sample as negatives. Both approaches mislead the optimization direction of the loss function, reducing the performance of the algorithm.
(3) Joint learning strategy. Existing methods usually learn the model by combining the self-supervised task with the recommendation system's main score prediction task, essentially treating the self-supervised task as an auxiliary task that constrains the distribution of vectors. This joint learning mode increases the complexity of the algorithm and thus the difficulty of model training.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a graph neural network recommendation method and system based on self-supervision.
In order to achieve the above purpose, the invention provides the following technical scheme:
a graph neural network recommendation method based on self-supervision comprises the following steps:
s1, obtaining user and project interaction data, including user data, project data and user and project interaction record data, to form a bipartite graph G = (X, A), wherein X is a node attribute matrix and belongs to R (n+m)×d A is an adjacency matrix, and A is belonged to R (n+m)×(n+m) N represents the number of user nodes, m represents the number of project nodes, and d represents the vector dimension;
s2, generating enhanced data G '= (X, a') and G = (X, a ") for two views by enhancing the data G = (X, a) by a popularity bias reduction method based on item popularity;
s3, encoding enhancement data G '= (X, A') and G '= (X, A'), and generating vectors Z 'and Z' respectively;
s4, training the vectors Z 'and Z' by adopting a loss function to obtain a graph neural network recommendation model, wherein the loss function comprises a covariant part and an invariant part, and the formula of the loss function is as follows:
L(Z′,Z″)=λs(R′,R″)+μ[c(Z′)+c(Z″)]
wherein λ and μ are hyper-parameters, controlling the weights of the invariant and covariant partial loss functions, respectively, s (R ', R') being the invariant partial loss function, c (Z '), c (Z') being the covariant partial loss functions of vector Z 'and vector Z', respectively;
and S5, pushing interested item information to the user based on the graph neural network recommendation model.
According to the technical scheme provided by the invention, the graph neural network recommendation model is trained, and item information is then recommended to the user according to the model. Training collects the original data and enhances it into data for two views using a domain-related data enhancement mode, so that the enhanced data preserve the information required by the recommendation task to the greatest extent; the enhanced data are then encoded and iteratively trained with a loss function to obtain the graph neural network recommendation model. The loss function effectively solves the random negative sampling problem while avoiding joint training of the model.
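To make the overall objective concrete, below is a minimal NumPy sketch of a loss of the form L(Z′, Z″) = λ·s(R′, R″) + μ·[c(Z′) + c(Z″)], assuming the invariant term is a mean squared difference between the two views' prediction-score matrices and the covariant term penalizes off-diagonal covariance entries, consistent with the description above. All names and normalizations are illustrative, not the patent's implementation:

```python
import numpy as np

def total_loss(Zu1, Zi1, Zu2, Zi2, lam=1.0, mu=1.0):
    """Sketch of L(Z', Z'') = lam * s(R', R'') + mu * [c(Z') + c(Z'')]:
    a score-level invariant term plus covariance-based covariant terms."""
    def cov_term(Z):
        # Sum of squared off-diagonal covariance entries, scaled by 1/d.
        N, d = Z.shape
        Zc = Z - Z.mean(axis=0, keepdims=True)
        C = (Zc.T @ Zc) / (N - 1)
        off = C - np.diag(np.diag(C))
        return (off ** 2).sum() / d

    R1, R2 = Zu1 @ Zi1.T, Zu2 @ Zi2.T      # prediction scores under each view
    s = ((R1 - R2) ** 2).mean()            # invariant part: pull scores together
    Z1 = np.concatenate([Zu1, Zi1])        # full vector set of view 1
    Z2 = np.concatenate([Zu2, Zi2])        # full vector set of view 2
    return float(lam * s + mu * (cov_term(Z1) + cov_term(Z2)))
```

With identical views the invariant term vanishes, so only the covariant (decorrelation) penalty remains; gradient-based training would minimize both jointly.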
Further, step S2 specifically includes the following steps:
s21, calculating the popularity of each edge in the bipartite graph G = (X, A), wherein the popularity of each edge is the popularity of the item node connected with the edge, namely
Figure BDA0003796249740000021
Wherein S u,i For the popularity size of the edge connecting the user node u and the project node i in the bipartite graph,
Figure BDA0003796249740000022
representing the popularity of the project node i;
s22, calculating the discarding probability of each edge based on the popularity of the edge, wherein the discarding probability calculation formula is as follows:
Figure BDA0003796249740000031
in the formula p u,i Probability of dropping as an edge, S u,i -s min /s min -s max Is to s u,i Performing a normalization process, s min And s max Respectively minimum and maximum of all edge popularity, p e For a self-defined global drop probability, p, of each edge τ Is a hyper-parameter;
s23, based on p u,i Enhancing the data G = (X, A) to obtain enhanced data G '= (X, A') and G '= (X, A'), A '= mask.A, R' = mask.A, wherein the Mask is a Mask matrix, and the Mask belongs to R (n+m)×(n+m) Each item m in Mask u,i Obedience probability of p u,i Bernoulli distribution of (a).
The data enhancement mode of the invention can dynamically update the discarding probability of the edges in the bipartite graph according to the popularity of the project, thereby enhancing the data.
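To make the popularity-bias-reduction augmentation concrete, below is a minimal NumPy sketch of steps S21 to S23. The drop-probability formula is partly garbled in this copy of the text; the form p_{u,i} = min(p_e · norm(s_{u,i}), p_τ) used here is one plausible reading of S22, and all function and variable names are illustrative:

```python
import numpy as np

def pbr_views(A, n_users, p_e=0.2, p_tau=0.7, seed=0):
    """Popularity-bias-reduction edge dropping (sketch): each edge (u, i)
    inherits the degree of its item node as popularity s_{u,i}, gets drop
    probability min(p_e * normalized popularity, p_tau), and two
    independently masked adjacency matrices A', A'' are sampled."""
    rng = np.random.default_rng(seed)
    block = A[:n_users, n_users:]              # user-item block of A
    item_deg = block.sum(axis=0)               # item popularity d_i (S21)
    users, items = np.nonzero(block)           # existing edges (u, i)
    s = item_deg[items]                        # s_{u,i} = d_i
    s_min, s_max = s.min(), s.max()
    norm = (s - s_min) / (s_max - s_min) if s_max > s_min else np.zeros_like(s)
    p_drop = np.minimum(p_e * norm, p_tau)     # popular edges drop more (S22)
    views = []
    for _ in range(2):                         # two augmented views (S23)
        keep = rng.random(len(users)) >= p_drop   # Bernoulli mask per edge
        Av = A.copy()
        drop_u, drop_i = users[~keep], items[~keep]
        Av[drop_u, n_users + drop_i] = 0.0
        Av[n_users + drop_i, drop_u] = 0.0     # keep the matrix symmetric
        views.append(Av)
    return views
```

Because p_drop is 0 for the least popular edges under this reading, cold items keep all of their interactions, which is exactly the balancing behavior the method aims for.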
Further, in step S3, the enhanced data G′ = (X, A′) and G″ = (X, A″) are encoded with the encoder LightGCN, generating the user and item vectors Z′ and Z″ for the two views as follows:

Z′ = LightGCN(X, A′), Z″ = LightGCN(X, A″)

where Z′ = [z′_{u1}, ..., z′_{un}, z′_{i1}, ..., z′_{im}] and Z″ = [z″_{u1}, ..., z″_{un}, z″_{i1}, ..., z″_{im}] are the sets of user and item vectors under the two views generated by the encoder.
Further, the covariant partial loss functions are computed from the vectors Z′ and Z″; c(Z′) and c(Z″) are calculated as follows:

c(Z′) = (1/d) Σ_{j≠k} [C(Z′)]²_{j,k}

c(Z″) = (1/d) Σ_{j≠k} [C(Z″)]²_{j,k}

where d is the dimension of vector Z′ or Z″, C(Z′) is the covariance matrix of Z′, and C(Z″) is the covariance matrix of Z″, computed as:

C(Z′) = (1/(N−1)) Σ_{k=1}^{N} (z′_k − z̄′)(z′_k − z̄′)^T

C(Z″) = (1/(N−1)) Σ_{k=1}^{N} (z″_k − z̄″)(z″_k − z̄″)^T

where z̄′ and z̄″ are respectively the means of Z′ and Z″, and z′_k, z″_k are the terms of vectors Z′ and Z″.
Through the covariant partial loss function, the off-diagonal elements of the vector covariance matrix approach 0 during model training, maintaining independence among the vector dimensions and preventing model collapse.
Further, the invariant partial loss function is calculated from the prediction scores as follows:

s(R′, R″) = (1/N²) Σ_{k=1}^{N} Σ_{j=1}^{N} (r′_{k,j} − r″_{k,j})²

where r′_{k,j} is an entry of the prediction score matrix R′ corresponding to vector Z′, R′ = Z′_u (Z′_i)^T; r″_{k,j} is an entry of the prediction score matrix R″ corresponding to vector Z″, R″ = Z″_u (Z″_i)^T; R′ ∈ R^(N×N), R″ ∈ R^(N×N); N is the batch size, i.e. the number of samples per training step; Z′_u and Z′_i are the user and item parts of Z′, with Z′ = Z′_u || Z′_i; Z″_u and Z″_i are the user and item parts of Z″, with Z″ = Z″_u || Z″_i. The invariant partial loss function follows the idea of contrastive learning: pulling the prediction scores under the two views together yields the score-based invariant loss.
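A minimal sketch of the score-level invariant term, assuming it is the mean squared difference between the two views' prediction matrices (consistent with "pulling the prediction scores together"; the exact normalization is not recoverable from this copy, and the function name is illustrative):

```python
import numpy as np

def invariance_loss(Zu1, Zi1, Zu2, Zi2):
    """Score-level invariant term s(R', R''): mean squared difference
    between the two views' predicted score matrices R = Z_u @ Z_i.T."""
    R1, R2 = Zu1 @ Zi1.T, Zu2 @ Zi2.T
    return float(((R1 - R2) ** 2).mean())
```

Because the positive pair is defined at the score level rather than the vector level, two views that induce the same user-item scores incur zero loss even if their embeddings differ.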
The invention also provides a graph neural network recommendation system based on self-supervision, which is used for realizing the graph neural network recommendation method based on self-supervision, and the recommendation system comprises:
the data acquisition module is used for acquiring user and item interaction data;
the data processing module is used for forming a bipartite graph G = (X, A) from the user and item interaction data;
a data enhancement module for enhancing the data G = (X, A) through the item-popularity-based popularity bias reduction method to generate enhanced data G′ = (X, A′) and G″ = (X, A″) for two views;
a data encoding module for encoding the enhanced data to generate vectors Z′ and Z″ respectively;
the model training module is used for training the vectors Z′ and Z″ with a loss function to obtain a graph neural network recommendation model;
and the item recommendation module is used for pushing interested item information to the user based on the graph neural network recommendation model.
Compared with the prior art, the invention has the following beneficial effects:
1. The recommendation method of the invention starts from the popularity bias problem that has long existed in recommendation systems. Original data are enhanced into data for two views using a domain-related data enhancement mode; specifically, the enhancement operates on the input data based on the natural popularity index in the original input data, so that the enhanced data preserve the information required by the recommendation task to the greatest extent. The combination effects of different data enhancement modes were verified experimentally and the optimal combination was selected, raising the ceiling of graph neural network recommendation model training while preserving the internal structure of the original graph data from the input end without losing data information. Unlike random data enhancement, popularity bias reduction mitigates the popularity bias problem of the recommendation system at the data enhancement stage, thereby improving recommendation performance.
2. The recommendation method of the invention aims to unify the self-supervised task with the recommendation system's main task, designing a brand-new loss function comprising an invariant part and a covariant part. In the invariant part, the definition of positive samples is shifted from the vector level to the score level, combining the self-supervised task with the scoring task of the recommendation system; training of the graph neural network recommendation model thus breaks out of the joint learning mode, avoiding loss of vector information across different tasks, simplifying the model structure, reducing model complexity, improving training efficiency, and achieving end-to-end unified training. In the covariant part, constraining the covariance matrix of the vectors drives the off-diagonal elements toward 0, ensuring independence among the vector dimensions and preventing model collapse. The proposed loss function effectively solves the random negative sampling problem while avoiding joint training of the model.
Description of the drawings:
FIG. 1 is a flow chart of the self-supervision-based graph neural network recommendation method of the present invention;
FIG. 2 is the basic framework of the self-supervision-based graph neural network recommendation model of embodiment 1;
FIG. 3 is a flowchart of the item-popularity-based popularity bias reduction method of embodiment 1;
FIG. 4 is a schematic diagram of the different data enhancement modes of embodiment 1;
FIG. 5 shows performance results of the recommendation method under different data enhancement modes of embodiment 1.
Detailed Description
The present invention will be described in further detail with reference to test examples and specific embodiments. It should be understood that the scope of the above-described subject matter is not limited to the following examples, and any techniques implemented based on the disclosure of the present invention are within the scope of the present invention.
Example 1
The embodiment provides a self-supervision-based graph neural network recommendation method, as shown in fig. 1, comprising the following steps:
s1, obtaining user and project interaction data which comprise user data, project data and user and project interaction recording data to form a bipartite graph G = (X, A), wherein X is a node attribute matrix and belongs to R (n+m)×d A is an adjacency matrix, and A is belonged to R (n+m)×(n+m)
X represents the attribute value set of all nodes in the graph, historical interaction of users and items can naturally form a bipartite graph G = (V, epsilon), V is a node, V = U ≡ I and is formed by all user nodes U and item nodes I, epsilon is an edge and represents the historical interaction of the users and the items, and epsilon belongs to O + . For adjacency matrix A, where if node i and node j have historical interactions, i.e., (v) i ,v j ) E is epsilon, then A i,j =1, otherwise A i,j =0, n represents the number of user nodes, m represents the number of item nodes, and d represents the vector dimension. Table 1 below shows the data to be analyzed in this example. The three data set items acquired in this embodiment are items.
Table 1 public data set details
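As a concrete illustration of step S1, a minimal NumPy sketch of assembling the bipartite adjacency matrix A from interaction records; the function name and toy data are illustrative, not taken from the patent:

```python
import numpy as np

def build_adjacency(interactions, n_users, n_items):
    """Assemble the (n+m) x (n+m) adjacency matrix A of the user-item
    bipartite graph; users occupy indices [0, n), items [n, n+m)."""
    size = n_users + n_items
    A = np.zeros((size, size), dtype=np.float32)
    for u, i in interactions:
        A[u, n_users + i] = 1.0  # A_{u,i} = 1 for a historical interaction
        A[n_users + i, u] = 1.0  # symmetric entry: item -> user
    return A

# Toy data: 2 users, 2 items, 3 interactions.
A = build_adjacency([(0, 0), (0, 1), (1, 1)], n_users=2, n_items=2)
```

In practice a sparse matrix would be used for the large user-item graphs in Table 1; a dense array keeps the sketch short.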
S2, enhancing the data G = (X, A) through the item-popularity-based popularity bias reduction method to generate enhanced data G′ = (X, A′) and G″ = (X, A″) for two views, as shown in fig. 2.
Recommendation systems have been shown to suffer from a serious popularity bias problem: they prefer to recommend items with high popularity, ignoring large numbers of cold items. Over time this aggravates the homogeneity of recommended items, causing information-cocoon and long-tail problems. Therefore, a brand-new popularity bias reduction data enhancement method is proposed; it enhances the original data based on item popularity, reducing the popularity differences of candidate items at the data enhancement stage and effectively mitigating the popularity bias of the recommendation system.
The popularity bias reduction method defines the popularity of each edge in the bipartite graph based on the historical interaction frequency between users and items, and dynamically updates the drop probability of each edge according to its popularity, so that edges with high popularity are more likely to be dropped. This balances the popularity differences of the nodes in the newly generated enhanced graphs, thereby alleviating the popularity bias of the recommendation system. The flowchart is shown in fig. 3, and step S2 specifically comprises the following steps:
s21, calculating the popularity of each edge in the bipartite graph G = (X, A), wherein the popularity of each edge is the popularity of the item node connected with the edge, namely
Figure BDA0003796249740000062
Wherein s is u,i For the popularity size of the edge connecting user node u and item node i in the bipartite graph,
Figure BDA0003796249740000063
representing the popularity of the project node i;
for each project, the popularity (popularity) is determined by the frequency of interaction of the project history, and in the bipartite graph, the degree of a node is defined as the number of edges connected with the node, so that in the bipartite graph formed by users and projects, the degree of each project node can be used for defining the projectThe degree of popularity. The popularity of each edge in the bipartite graph is defined as the size of the popularity of the item nodes connected to it, i.e. the popularity of each edge in the bipartite graph
Figure BDA0003796249740000064
Wherein s is u,i For the popularity size of the edge connecting user node u and item node i in the bipartite graph,
Figure BDA0003796249740000065
representing the popularity of the project node i. For example, in FIG. 1, when computing user node u 1 With article node i 1 When the popularity of the edge is large, u 1 Having 1 side connected thereto, u 1 Degree of (1), user node u 1 With article node i 1 The popularity of an edge is 1.
S22, calculating the drop probability of each edge based on its popularity, with the drop probability given by:

p_{u,i} = min( p_e · (s_{u,i} − s_min) / (s_max − s_min), p_τ )

where p_{u,i} is the drop probability of the edge, (s_{u,i} − s_min) / (s_max − s_min) normalizes s_{u,i}, s_min and s_max are respectively the minimum and maximum popularity over all edges, and p_e is a user-defined global drop probability for each edge. The drop probability is dynamically updated based on the normalized s_{u,i}, ensuring that edges of highly popular items are dropped with higher probability and reducing the popularity differences among nodes in the enhanced graph; p_τ is a hyper-parameter that prevents excessively high drop probabilities from destroying the information of the original bipartite graph, with p_τ < 1.

S23, based on p_{u,i}, enhancing the data G = (X, A) to obtain enhanced data G′ = (X, A′) and G″ = (X, A″), with A′ = Mask′ ⊙ A and A″ = Mask″ ⊙ A, where Mask is a mask matrix, Mask ∈ R^((n+m)×(n+m)), and each entry m_{u,i} of Mask follows a Bernoulli distribution such that edge (u, i) is dropped with probability p_{u,i}.
Unlike the domain-independent random enhancement used in existing algorithms, popularity bias reduction (PBR) starts from the background of the recommendation system, taking the popularity bias problem as its starting point, and dynamically updates the drop probability of each edge based on item popularity. The popularity differences among nodes in the bipartite graphs generated after data enhancement become small. While generating two differentiated bipartite graphs for contrastive learning, the key information in the bipartite graph that affects recommendation performance is retained as much as possible. Mitigating the popularity bias problem at the data input end thus improves the performance of the recommendation system.
S3, encoding the enhanced data G′ = (X, A′) and G″ = (X, A″) to generate vectors Z′ and Z″ respectively.
This embodiment adopts a weight-sharing graph-neural-network encoder, LightGCN, to encode the enhanced data G′ = (X, A′) and G″ = (X, A″). LightGCN is a graph neural network encoder designed for recommendation systems; it removes the feature transformation and nonlinear activation modules of the original graph neural network, making it a more lightweight and better-performing encoder for recommendation. The user and item vectors Z′ and Z″ for the two views are generated as follows:

Z′ = LightGCN(X, A′), Z″ = LightGCN(X, A″)

where Z′ = [z′_{u1}, ..., z′_{un}, z′_{i1}, ..., z′_{im}] and Z″ = [z″_{u1}, ..., z″_{un}, z″_{i1}, ..., z″_{im}] are the sets of user and item vectors under the two views generated by the encoder.
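A minimal sketch of LightGCN-style propagation, assuming the standard symmetric normalization D^(−1/2) A D^(−1/2) and layer averaging that LightGCN uses; this is an illustration, not the patent's implementation:

```python
import numpy as np

def light_gcn(X, A, n_layers=2):
    """LightGCN-style propagation (sketch): symmetric degree-normalized
    neighborhood aggregation with no feature transform or nonlinearity;
    the final embedding averages the outputs of all layers."""
    deg = A.sum(axis=1).astype(float)
    deg[deg == 0] = 1.0                     # avoid division by zero for isolated nodes
    d_inv_sqrt = deg ** -0.5
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2
    layers = [X]
    for _ in range(n_layers):
        layers.append(A_hat @ layers[-1])   # E^(l+1) = A_hat @ E^(l)
    return np.mean(layers, axis=0)
```

Running it on the two masked adjacency matrices A′ and A″ with shared input features X would yield the two view embeddings Z′ and Z″ described above.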
S4, training the vectors Z′ and Z″ with a loss function to obtain the graph neural network recommendation model, the loss function comprising a covariant part and an invariant part, with the formula:

L(Z′, Z″) = λ·s(R′, R″) + μ·[c(Z′) + c(Z″)]

where λ and μ are hyper-parameters controlling the weights of the invariant and covariant partial loss functions respectively, s(R′, R″) is the invariant partial loss function, and c(Z′), c(Z″) are the covariant partial loss functions of vectors Z′ and Z″ respectively.
s(R′, R″) is calculated from the prediction scores as follows:

s(R′, R″) = (1/N²) Σ_{k=1}^{N} Σ_{j=1}^{N} (r′_{k,j} − r″_{k,j})²

where r′_{k,j} is an entry of the prediction score matrix R′ corresponding to vector Z′, R′ = Z′_u (Z′_i)^T; r″_{k,j} is an entry of R″ corresponding to vector Z″, R″ = Z″_u (Z″_i)^T; R′ ∈ R^(N×N), R″ ∈ R^(N×N); N is the batch size, i.e. the number of samples per training step; Z′_u and Z′_i are the user and item parts of Z′, with Z′ = Z′_u || Z′_i; Z″_u and Z″_i are the user and item parts of Z″, with Z″ = Z″_u || Z″_i. The separation of Z′ and Z″ into user and item vectors is based on the indices of the users and items in the original graph. The invariant partial loss function follows the idea of contrastive learning: pulling the prediction scores under the two views together yields the score-based invariant loss.
The covariant partial loss functions are computed from the vectors Z′ and Z″; c(Z′) and c(Z″) are calculated as follows:

c(Z′) = (1/d) Σ_{j≠k} [C(Z′)]²_{j,k}

c(Z″) = (1/d) Σ_{j≠k} [C(Z″)]²_{j,k}

where d is the dimension of vector Z′ or Z″, C(Z′) is the covariance matrix of Z′, and C(Z″) is the covariance matrix of Z″, computed as:

C(Z′) = (1/(N−1)) Σ_{k=1}^{N} (z′_k − z̄′)(z′_k − z̄′)^T

C(Z″) = (1/(N−1)) Σ_{k=1}^{N} (z″_k − z̄″)(z″_k − z̄″)^T

where z̄′ and z̄″ are respectively the means of Z′ and Z″, and z′_k, z″_k are the terms of vectors Z′ and Z″. Through the covariant partial loss function, the off-diagonal elements of the vector covariance matrix approach 0 during model training, maintaining independence among the vector dimensions and preventing model collapse. The vectors Z′ and Z″ are iteratively optimized with the loss function to train the graph neural network recommendation model.
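A minimal sketch of the covariant term as reconstructed above (sum of squared off-diagonal covariance entries, scaled by 1/d); the function name is illustrative:

```python
import numpy as np

def covariance_loss(Z):
    """Covariant term c(Z): squared off-diagonal entries of the embedding
    covariance matrix, scaled by 1/d, pushing the dimensions of Z toward
    decorrelation and thus preventing collapse."""
    N, d = Z.shape
    Zc = Z - Z.mean(axis=0, keepdims=True)  # center each dimension
    C = (Zc.T @ Zc) / (N - 1)               # covariance matrix C(Z)
    off_diag = C - np.diag(np.diag(C))      # zero out the diagonal
    return float((off_diag ** 2).sum() / d)
```

The term is zero exactly when all pairs of embedding dimensions are uncorrelated over the batch, and grows as dimensions become redundant.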
In conventional methods, the model is trained with the traditional contrastive learning loss: contrastive learning treats the different vectors of the same node under the two views as positive samples and pulls them together through a constraint in the loss function. Since simply pulling positive samples together causes model collapse, contrastive learning introduces a large number of negative samples and pushes positive and negative samples apart. However, existing contrastive learning often treats all samples in a batch other than the target sample as negatives. This definition pushes apart many essentially similar samples in the vector space and reduces recommendation performance. In addition, existing self-supervised recommendation algorithms generally treat the self-supervised model as an auxiliary task constraining the spatial distribution of vectors, and ultimately must be trained jointly with the recommendation system's main score prediction task. This joint learning mode increases the complexity of the loss function and burdens model training.
Unlike the traditional contrastive loss, which defines the vectors Z′ and Z″ as positive samples, this application computes the users' predicted scores for items and directly treats the score matrices under the two views as positive samples; we call these task-related positive samples, which is also the source of the term "task-oriented" in the loss function. Based on this definition of positive samples, the self-supervised task is directly connected with the recommendation system's main score prediction task, so no additional joint learning strategy is needed: the definition of positive samples is shifted from the vector level to the score level, unifying the self-supervised task with the recommendation system's main task, enabling the graph neural network recommendation model to learn accurate user and item vectors through the loss function, improving training efficiency, and reducing hardware load. In addition, by computing the covariance matrix of the vectors and constraining it in the loss function so that its off-diagonal elements approach 0, the independence among the vector dimensions is well maintained and model collapse is avoided, removing the uncertainty of random sampling and improving recommendation performance.
And S5, pushing interested item information to the user based on the graph neural network recommendation model.
To illustrate the effects of the proposed item-popularity-based popularity bias reduction method and loss function, ablation experiments were performed on each. For the popularity bias reduction method, keeping the rest of the model unchanged, four typical data enhancement modes were used in its place to verify its role in the invention; the four modes are shown in fig. 4 and described as follows:
(a) Node dropping (N): a certain proportion of nodes in the bipartite graph are randomly discarded; in fig. 4 (a), the dotted circles denote the randomly discarded nodes.
(b) Edge dropping (E): a certain proportion of edges in the bipartite graph are randomly discarded; in fig. 4 (b), the dashed lines denote the randomly discarded edges.
(c) Feature masking (F): some dimensions of the user and item feature vectors are randomly masked; in fig. 4 (c), "×" denotes a masked dimension.
(d) Adaptive dropping (A): a certain proportion of edges in the bipartite graph are discarded according to node degree; in fig. 4 (d), the dashed lines denote edges discarded based on node degree.
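The baseline modes above can be illustrated with a short NumPy sketch (illustrative only, not from the patent; node dropping and adaptive dropping follow the same pattern as edge dropping with a different choice of what to remove):

```python
import numpy as np

def edge_dropping(A, drop_rate, rng):
    # Edge dropping (E): randomly zero out a fraction of the edges of a
    # symmetric bipartite adjacency matrix, keeping it symmetric.
    A = A.copy()
    rows, cols = np.where(np.triu(A) > 0)      # each undirected edge once
    n_drop = int(len(rows) * drop_rate)
    idx = rng.choice(len(rows), size=n_drop, replace=False)
    A[rows[idx], cols[idx]] = 0
    A[cols[idx], rows[idx]] = 0
    return A

def feature_masking(X, mask_rate, rng):
    # Feature masking (F): zero out randomly chosen feature dimensions
    # for all nodes at once.
    keep = rng.random(X.shape[1]) >= mask_rate
    return X * keep
```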
The recommendation system was applied to the ML-100k data set with each of the four data enhancement modes. The experimental results are shown in fig. 5 (a): the recall of the popularity bias reduction (P) mode proposed by the present invention is higher than that of the other four modes, which demonstrates the effectiveness of the popularity bias reduction method. In addition, different data enhancement modes were combined; the results are shown in fig. 5 (b). The best combination is P & P, and combinations of PBR with the other data enhancement modes also achieve good results, again demonstrating the superiority of the proposed popularity bias reduction method.
For the loss function, with the other parts of the model kept unchanged, the recommendation system was applied to the ML-100k data set using the proposed loss function and three others: InfoNCE, Barlow Twins, and ICT-variants, where ICT-variants is a variant of the proposed loss function (ICT) that keeps the covariant part unchanged but modifies the invariant part so that positive samples are defined at the vector level. The experimental results are shown in table 2. The data show that the proposed loss function outperforms the others both when jointly optimizing the self-supervision task with the main task of the recommendation system and when optimizing the self-supervision task alone; in the latter case it is far superior to the other loss functions, even the ICT variant. This result demonstrates that the modified definition of positive samples in the proposed loss function helps unify the self-supervision task with the scoring task of the recommendation system, mainly because the self-supervision task is moved from the embedding level to the score level, so the model's performance no longer needs to depend on the scoring task of the recommendation system.
TABLE 2 recommendation algorithm Performance under different penalty functions
Loss function            Self-supervision task (Recall)    Joint learning (Recall)
InfoNCE                  0.1145                            0.2014
BT (Barlow Twins)        0.0955                            0.1954
ICT-variants             0.1254                            0.2145
Proposed loss (ICT)      0.2226                            0.2237
To further verify the effect of the invention, multiple sets of comparative experiments were conducted on three public data sets using the invention and five additional methods, as follows:
NGCF: a graph-neural-network-based recommendation system;
LightGCN: on the basis of NGCF, removes two network components unsuited to recommendation systems, the nonlinear feature transformation and the activation function, improving the efficiency of the recommendation algorithm with a simplified model;
SGL: introduces a self-supervision algorithm into the recommendation system and performs data enhancement in three random modes (SGL-ND, SGL-ED, and SGL-RW);
SelfCF: introduces an asymmetric network into the recommendation system, improving model training efficiency;
GCA-DE: dynamically updates the edge-dropping probability according to node degree in the graph, avoiding excessive damage to the original graph from data enhancement.
The parameters of the comparative experiments are shown in table 3, and the results of the comparative experiments on the three data sets are shown in table 4.
Table 3 comparative experimental parameter settings
TABLE 4 comparison of recommended Performance for different models
As shown in table 4, compared with the existing baseline models, the method of the present invention achieves improvements in both recall and normalized discounted cumulative gain (NDCG). On the one hand, self-supervised training relies on the data enhancement mechanism to expand the range of input data, increasing the amount of training the model receives and improving its performance. On the other hand, the PBR data enhancement mode and the ICT loss function proposed by the invention are designed specifically for the characteristics of recommendation systems: PBR relieves the popularity bias problem at the data input end, and ICT combines the self-supervision task with the main scoring task of the recommendation system, reducing the training difficulty of the model and improving its training efficiency.
Example 2
The embodiment provides a self-supervision-based graph neural network recommendation system that implements the recommendation method of embodiment 1. The system comprises:
a data acquisition module, for acquiring user and item interaction data;
a data processing module, for forming the user and item interaction data into a bipartite graph G = (X, A);
a data augmentation module, for generating enhanced data G′ = (X, A′) and G″ = (X, A″) for two views by enhancing the data G = (X, A) with a popularity bias reduction method based on item popularity;
a data encoding module, for encoding the enhanced data to generate vectors Z′ and Z″ respectively;
a model training module, for training the vectors Z′ and Z″ with the loss function to obtain the graph neural network recommendation model;
an item recommendation module, for pushing item information of interest to the user based on the graph neural network recommendation model.
The above description covers only preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. A graph neural network recommendation method based on self-supervision is characterized by comprising the following steps:
S1, obtaining user and item interaction data, including user data, item data, and user-item interaction records, to form a bipartite graph G = (X, A), wherein X is the node attribute matrix, X ∈ R^((n+m)×d), A is the adjacency matrix, A ∈ R^((n+m)×(n+m)), n denotes the number of user nodes, m the number of item nodes, and d the vector dimension;
S2, generating enhanced data G′ = (X, A′) and G″ = (X, A″) for two views by enhancing the data G = (X, A) with a popularity bias reduction method based on item popularity;
S3, encoding the enhanced data G′ = (X, A′) and G″ = (X, A″) to generate vectors Z′ and Z″ respectively;
S4, training the vectors Z′ and Z″ with a loss function to obtain a graph neural network recommendation model, wherein the loss function comprises an invariant part and a covariant part, and its formula is:
L(Z′, Z″) = λ·s(R′, R″) + μ·[c(Z′) + c(Z″)]
where λ and μ are hyperparameters, s(R′, R″) is the invariant-part loss function, and c(Z′), c(Z″) are the covariant-part loss functions of vector Z′ and vector Z″, respectively;
S5, pushing item information of interest to the user based on the graph neural network recommendation model.
2. The self-supervision-based graph neural network recommendation method of claim 1, wherein step S2 specifically comprises the following steps:
S21, calculating the popularity of each edge in the bipartite graph G = (X, A), the popularity of an edge being the popularity of the item node that the edge connects, i.e.
s_{u,i} = d_i
where s_{u,i} is the popularity of the edge connecting user node u and item node i in the bipartite graph, and d_i denotes the popularity of item node i;
S22, calculating the drop probability of each edge based on its popularity, by the formula:
p_{u,i} = min( (s_{u,i} − s_min)/(s_max − s_min) · p_e , p_τ )
where p_{u,i} is the drop probability of the edge, (s_{u,i} − s_min)/(s_max − s_min) normalizes s_{u,i}, s_min and s_max are respectively the minimum and maximum popularity over all edges, p_e is a user-defined global drop probability for each edge, and p_τ is a hyperparameter;
S23, enhancing the data G = (X, A) based on p_{u,i} to obtain enhanced data G′ = (X, A′) and G″ = (X, A″), where A′ = Mask′ ⊙ A, A″ = Mask″ ⊙ A, ⊙ denotes element-wise multiplication, Mask is a mask matrix with Mask ∈ R^((n+m)×(n+m)), and each entry m_{u,i} of Mask follows a Bernoulli distribution with probability p_{u,i}.
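A minimal NumPy sketch of the popularity-based drop-probability computation (illustrative only, not part of the claims; it assumes item popularity equals the item node's degree, and it assumes the normalized popularity is scaled by p_e and capped at p_τ, i.e. p_{u,i} = min(norm(s_{u,i})·p_e, p_τ)):

```python
import numpy as np

def popularity_drop_probs(R, p_e=0.3, p_tau=0.7):
    # R: binary user-item interaction matrix (n_users x n_items).
    # Returns per-edge drop probabilities; zero where there is no edge.
    item_pop = R.sum(axis=0)                   # item popularity = degree (assumption)
    S = R * item_pop[None, :]                  # s_{u,i} on edges, 0 elsewhere
    s = S[R > 0]
    s_min, s_max = s.min(), s.max()
    norm = (S - s_min) / max(s_max - s_min, 1e-12)   # min-max normalization
    P = np.minimum(norm * p_e, p_tau)                # cap at p_tau
    return np.where(R > 0, P, 0.0)
```

A Bernoulli mask for one enhanced view can then be drawn as `rng.random(R.shape) < P` to select the edges to discard, so popular edges are removed more often and popularity bias is reduced at the input.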
3. The self-supervision-based graph neural network recommendation method of claim 1, wherein in step S3 the enhanced data G′ = (X, A′) and G″ = (X, A″) are encoded with the encoder LightGCN, generating the user and item vectors Z′ and Z″ of the two enhanced views as follows:
Z′ = LightGCN(X, A′),  Z″ = LightGCN(X, A″)
where Z′ = [z′_{u1}, …, z′_{un}, z′_{i1}, …, z′_{im}] and Z″ = [z″_{u1}, …, z″_{un}, z″_{i1}, …, z″_{im}] are the sets of user and item vectors under the two views generated by the encoder.
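The LightGCN propagation used by the encoder can be sketched as follows (illustrative only, not part of the claims; the real LightGCN learns the embedding matrix X end-to-end, while here X is taken as given):

```python
import numpy as np

def lightgcn_encode(X, A, num_layers=2):
    # Symmetrically normalized adjacency: D^{-1/2} A D^{-1/2}.
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # LightGCN: no nonlinearity, no feature transform -- just repeated
    # neighborhood averaging, with the outputs of all layers averaged.
    H, acc = X, X.copy()
    for _ in range(num_layers):
        H = A_hat @ H
        acc = acc + H
    return acc / (num_layers + 1)
```

Running this on A′ and A″ yields Z′ and Z″; the rows corresponding to user and item nodes give the per-view user and item vectors.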
4. The graph neural network recommendation method of claim 1, wherein the covariant-part loss function is computed from the vectors Z′ and Z″, with c(Z′) and c(Z″) calculated as follows:
c(Z′) = (1/d)·||C(Z′) − I||_F²
c(Z″) = (1/d)·||C(Z″) − I||_F²
where d is the dimension of vector Z′ or vector Z″, I is the identity matrix, C(Z′) is the covariance matrix of vector Z′, and C(Z″) is the covariance matrix of vector Z″; the covariance matrices are calculated as follows:
C(Z′) = (1/(N−1)) · Σ_{k=1}^{N} (z′_k − z̄′)(z′_k − z̄′)^T
C(Z″) = (1/(N−1)) · Σ_{k=1}^{N} (z″_k − z̄″)(z″_k − z̄″)^T
where z̄′ = (1/N)·Σ_{k=1}^{N} z′_k and z̄″ = (1/N)·Σ_{k=1}^{N} z″_k are the means of Z′ and Z″ respectively, and z′_k, z″_k are the entries of vectors Z′ and Z″.
5. The self-supervision-based graph neural network recommendation method of claim 4, wherein the invariant-part loss function is calculated from the prediction scores, by the following formula:
s(R′, R″) = (1/N²) · Σ_{k=1}^{N} Σ_{j=1}^{N} (r′_{k,j} − r″_{k,j})²
where r′_{k,j} is an entry of the prediction score matrix R′ corresponding to vector Z′, R′ = Z′_u (Z′_i)^T; r″_{k,j} is an entry of the prediction score matrix R″ corresponding to vector Z″, R″ = Z″_u (Z″_i)^T; R′ ∈ R^(N×N), R″ ∈ R^(N×N), N is the size of each batch; Z′_u and Z′_i are the user and item parts obtained by splitting vector Z′, Z′ = Z′_u ∥ Z′_i; and Z″_u and Z″_i are the user and item parts obtained by splitting vector Z″, Z″ = Z″_u ∥ Z″_i.
6. A self-supervision-based graph neural network recommendation system for implementing the self-supervision-based graph neural network recommendation method according to any one of claims 1 to 5, characterized in that the recommendation system comprises:
a data acquisition module, for acquiring user and item interaction data;
a data processing module, for forming the user and item interaction data into a bipartite graph G = (X, A);
a data augmentation module, for generating enhanced data G′ = (X, A′) and G″ = (X, A″) for two views by enhancing the data G = (X, A) with a popularity bias reduction method based on item popularity;
a data encoding module, for encoding the enhanced data to generate vectors Z′ and Z″ respectively;
a model training module, for training the vectors Z′ and Z″ with the loss function to obtain the graph neural network recommendation model;
an item recommendation module, for pushing item information of interest to the user based on the graph neural network recommendation model.
CN202210969801.3A 2022-08-12 2022-08-12 Graph neural network recommendation method and system based on self-supervision Pending CN115525836A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210969801.3A CN115525836A (en) 2022-08-12 2022-08-12 Graph neural network recommendation method and system based on self-supervision


Publications (1)

Publication Number Publication Date
CN115525836A true CN115525836A (en) 2022-12-27

Family

ID=84695280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210969801.3A Pending CN115525836A (en) 2022-08-12 2022-08-12 Graph neural network recommendation method and system based on self-supervision

Country Status (1)

Country Link
CN (1) CN115525836A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116610857A (en) * 2023-04-10 2023-08-18 南京邮电大学 Personalized post recommendation method based on user preference for post popularity
CN116610857B (en) * 2023-04-10 2024-05-03 南京邮电大学 Personalized post recommendation method based on user preference for post popularity

Similar Documents

Publication Publication Date Title
CN110046656B (en) Multi-mode scene recognition method based on deep learning
CN110390017B (en) Target emotion analysis method and system based on attention gating convolutional network
CN108985457B (en) Deep neural network structure design method inspired by optimization algorithm
CN103745482B (en) A kind of Dual-threshold image segmentation method based on bat algorithm optimization fuzzy entropy
CN105160249A (en) Improved neural network ensemble based virus detection method
CN109948717B (en) Self-growth training method for generating countermeasure network
CN114610897A (en) Medical knowledge map relation prediction method based on graph attention machine mechanism
CN111626291B (en) Image visual relationship detection method, system and terminal
CN115525836A (en) Graph neural network recommendation method and system based on self-supervision
CN111353534B (en) Graph data category prediction method based on adaptive fractional order gradient
CN110990498A (en) Data fusion method based on FCM algorithm
CN107229945A (en) A kind of depth clustering method based on competition learning
Liu et al. EACP: An effective automatic channel pruning for neural networks
CN113806559B (en) Knowledge graph embedding method based on relationship path and double-layer attention
CN117149952A (en) Multi-scene content generation system based on AIGC
CN114154070A (en) MOOC recommendation method based on graph convolution neural network
Yang et al. Skeleton neural networks via low-rank guided filter pruning
Zhou et al. Ec-darts: Inducing equalized and consistent optimization into darts
CN117037176A (en) Pre-training language model adaptation method for vision-language task
WO2021059527A1 (en) Learning device, learning method, and recording medium
CN116226547A (en) Incremental graph recommendation method based on stream data
CN116628524A (en) Community discovery method based on adaptive graph attention encoder
CN111046181B (en) Actor-critic method for automatic classification induction
CN110147393B (en) Entity analysis method for data space in movie information data set
CN113434668A (en) Deep learning text classification method and system based on model fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination