CN112925977A - Recommendation method based on self-supervision graph representation learning - Google Patents

Recommendation method based on self-supervision graph representation learning

Info

Publication number
CN112925977A
Authority
CN
China
Prior art keywords
node
graph
matrix
data enhancement
embedding
Prior art date
Legal status
Pending
Application number
CN202110219147.XA
Other languages
Chinese (zh)
Inventor
何向南
吴剑灿
王翔
冯福利
陈亮
Current Assignee
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202110219147.XA
Publication of CN112925977A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90: Details of database functions independent of the retrieved data types
    • G06F 16/95: Retrieval from the web
    • G06F 16/953: Querying, e.g. by the use of web search engines
    • G06F 16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods


Abstract

The invention discloses a recommendation method based on self-supervised graph representation learning. The method can be adapted to all existing graph-neural-network-based recommendation models; it uses the designed data enhancement strategies to perform self-supervised learning that assists the supervised recommendation task, and constructs higher-quality vector representations for users and items, thereby providing users with higher-quality and more accurate personalized recommendation content.

Description

Recommendation method based on self-supervision graph representation learning
Technical Field
The invention relates to the technical field of recommendation systems, and in particular to a recommendation method based on self-supervised graph characterization learning.
Background
A recommendation system aims to learn high-quality representations of users and items from user-item interaction data. Collaborative filtering models such as Matrix Factorization map the ID of each user and item into a vector (ID embedding) to represent the characteristics of the user or item.
In recent years, Graph Convolution Networks (GCNs) have brought a great leap in characterization learning by providing an efficient end-to-end way to fuse the features of multi-hop neighbors into node characterizations. Inspired by graph convolution networks, characterization learning in recommendation systems has evolved from using only ID features and interaction histories to exploiting the high-order connectivity in the user-item bipartite graph, achieving a huge improvement in recommendation accuracy.
A comprehensive analysis of recent graph-neural-network-based recommendation models reveals the following defects. 1) Sparse supervision signals: most current recommendation models adopt a supervised learning paradigm whose supervision signal comes from the observed interactions between users and items; however, compared with the whole interaction space, the observed interactions are extremely sparse, so the models cannot learn high-quality characterizations. 2) Skewed data distribution: observed interactions tend to follow a power-law distribution whose long tail consists of low-frequency items lacking supervision signals; in contrast, high-frequency items appear frequently in neighbor aggregation and in the loss function and exert a larger influence on characterization learning, so the model is biased toward recommending high-frequency items at the expense of the exposure of long-tail items. 3) Noisy interaction data: since most user feedback is implicit (e.g., clicks and views) rather than explicit (e.g., ratings and likes/dislikes), the observed interactions are usually noisy (for example, a user is misled into clicking an item and only later finds it unsuitable); moreover, a GCN amplifies the effect of noisy interactions during neighbor aggregation, making the learned characterizations exceptionally sensitive to noise.
Unlike supervised learning, which relies on labeled data, self-supervised learning (SSL) can actively extract supervision signals from the data itself by transforming the observed input data, and it has been widely used in fields such as Natural Language Processing (NLP) and Computer Vision (CV). For example, BERT (Devlin et al., NAACL-HLT 2019) captures inter-token dependencies by randomly masking certain tokens in a sentence and letting the predictor recover the masked tokens; RotNet (Gidaris et al., ICLR 2018) randomly rotates labeled images and then learns on images at different rotation angles so as to still recognize the original images. However, SSL is rarely applied in the recommendation field, especially in graph-network-based recommendation models. The main difficulty is that, unlike the CV and NLP fields, the data of a recommendation system is discrete and interrelated. Therefore, research on how to introduce self-supervised graph characterization learning into recommendation systems has great value.
Disclosure of Invention
The invention aims to provide a recommendation method based on self-supervised graph characterization learning, which can improve the accuracy and robustness of graph-neural-network-based recommendation models and alleviate the long-tail problem of existing models.
The purpose of the invention is realized by the following technical scheme:
a recommendation method based on self-supervision graph characterization learning comprises the following steps:
in the training stage, a data enhancement operation is selected, and data enhancement is performed on the embedding matrix or the adjacency matrix of the user-item bipartite graph based on the selected operation, so as to generate several enhanced views of each node; positive sample pairs are constructed from different enhanced views of the same node, and negative sample pairs from enhanced views of different nodes. The positive and negative sample pairs are input into a current graph-neural-network-based recommendation model for training, with the objectives of maximizing the consistency of characterizations between the different enhanced views in a positive sample pair and the difference of characterizations between the enhanced views in a negative sample pair. The nodes comprise the user nodes and item nodes of the user-item bipartite graph;
after training is finished, the final characterization vectors of all nodes are obtained through a single forward propagation. In the inference stage, for a user to be recommended, the matching score between the user and each item is computed as the inner product of their final characterization vectors; the items are sorted in descending order of matching score, and the first K items the user has not interacted with are selected as the recommendation result, where K is the size of the recommendation list.
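The inference stage described above can be sketched as follows; the function name `recommend_top_k`, the toy 2-dimensional characterization vectors, and the example data are illustrative assumptions, not part of the patent:

```python
import numpy as np

def recommend_top_k(user_vec, item_matrix, interacted, k):
    """Score every item by the inner product with the user's final
    characterization vector, mask out already-interacted items, and
    return the indices of the top-K remaining items."""
    scores = item_matrix @ user_vec          # one matching score per item
    scores[list(interacted)] = -np.inf       # exclude items already seen
    order = np.argsort(-scores)              # sort scores descending
    return order[:k].tolist()

# toy example: 4 items with 2-dimensional characterizations;
# the user has already interacted with item 0
user_vec = np.array([1.0, 0.0])
items = np.array([[0.9, 0.1],   # item 0 (already seen)
                  [0.8, 0.2],   # item 1
                  [0.1, 0.9],   # item 2
                  [0.5, 0.5]])  # item 3
top2 = recommend_top_k(user_vec, items, {0}, 2)  # → [1, 3]
```

Item 0 has the highest raw score but is excluded as already interacted, so items 1 and 3 are returned.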
According to the technical scheme provided by the invention, all existing graph-neural-network-based recommendation models can be adapted; the designed data enhancement strategies are used for self-supervised learning that assists the supervised recommendation task, and higher-quality vector characterizations are constructed for users and items, thereby providing users with higher-quality and more accurate personalized recommendation content.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic diagram of the recommendation method based on self-supervised graph characterization learning according to an embodiment of the present invention;
FIG. 2 is a graph comparing the performance of SGL-ED and LightGCN in different groups according to an embodiment of the present invention;
FIG. 3 is a graph of training curves for SGL-ED and LightGCN according to an embodiment of the present invention;
fig. 4 is a graph illustrating the relationship between the model performance and the noise interaction ratio in the training set according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a recommendation method based on self-supervised graph characterization learning. In the training stage, a data enhancement operation is selected, and data enhancement is performed on the embedding matrix or the adjacency matrix of the user-item bipartite graph based on the selected operation, generating several enhanced views of each node; positive sample pairs are constructed from different enhanced views of the same node, and negative sample pairs from enhanced views of different nodes. The positive and negative sample pairs are input into a current graph-neural-network-based recommendation model for training, with the objectives of maximizing the consistency of characterizations between the different enhanced views in a positive sample pair and the difference of characterizations between the enhanced views in a negative sample pair; the nodes comprise the user nodes and item nodes of the user-item bipartite graph. After training is finished, the final characterization vectors of all nodes are obtained through a single forward propagation. In the inference stage, for a user to be recommended, the matching score between the user and each item is computed as the inner product of their final characterization vectors; the items are sorted in descending order of matching score, and the first K non-interacted items are selected as the recommendation result, where K (a positive integer) is the size of the recommendation list; its specific value is set according to the actual situation or experience, and the invention is not limited to any particular value.
As can be understood by those skilled in the art, the user-item bipartite graph is an existing data structure, users and items in the bipartite graph are two types of nodes respectively, and the connection edges of the nodes represent the interaction relationship between the users and the corresponding items.
In the above scheme of the embodiment of the present invention, SSL is used to construct supervision signals from unlabeled data and to mine the internal associations of the data, so as to supplement and enhance the main supervised learning task. The scheme is named the Self-supervised Graph Learning paradigm (SGL), and it comprises two main parts: data enhancement and contrastive learning. 1) Data enhancement is responsible for generating different views of each node. Since a graph-neural-network-based recommendation model can be understood as node-embedding propagation on the bipartite graph, it can be abstractly defined by the following formula:

Z = H(E, A)

where A is the bipartite-graph adjacency matrix constructed from the node set V (consisting of all user nodes and item nodes) and the observed interaction set ε; H(·) is the graph convolution function, which encodes the connection information into the node characterizations; E ∈ R^(|V|×d) is the ID embedding matrix of all nodes on the bipartite graph; and d is the dimension of the embedding space. It can be seen that the inputs of a graph-neural-network-based recommendation model are the embedding matrix E and the adjacency matrix A. From this perspective, unlabeled data are constructed by performing data enhancement on these two inputs, and four different data enhancement operations are designed, namely ID embedding Mask (IM), ID embedding Dropout (ID), Node Dropout (ND), and Edge Dropout (ED). 2) The goal of contrastive learning is to maximize the consistency between different views of the same node and the difference between views of different nodes.
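As a concrete illustration of the abstract form Z = H(E, A), the graph convolution function can be instantiated as LightGCN-style linear propagation over the symmetrically normalized adjacency matrix. This is a sketch under the assumption of a LightGCN-like model; the patent does not fix a specific instantiation of H(·):

```python
import numpy as np

def graph_convolution(E, A, n_layers=2):
    """Minimal LightGCN-style instantiation of H(E, A): propagate the ID
    embeddings over the symmetrically normalized adjacency matrix and
    average the layer outputs (illustrative, not the patented model)."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    layers = [E]
    for _ in range(n_layers):
        layers.append(A_norm @ layers[-1])   # one propagation step
    return np.mean(layers, axis=0)           # final characterizations Z

# toy bipartite graph: users {0, 1}, items {2, 3}; edges (0,2), (1,2), (1,3)
A = np.zeros((4, 4))
for u, i in [(0, 2), (1, 2), (1, 3)]:
    A[u, i] = A[i, u] = 1.0
E = np.random.default_rng(0).normal(size=(4, 8))
Z = graph_convolution(E, A)
```

With zero propagation layers the function degenerates to the ID embeddings themselves, which matches the view of GNN recommenders as embedding propagation on top of a matrix-factorization-style embedding table.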
In fig. 1, the first layer depicts the workflow of the supervised recommendation task, while the second and third layers depict the workflow of the self-supervised learning task, performing data enhancement from the two different angles of ID embedding and graph structure respectively. In the embodiment of the invention, the embedding matrix E is enhanced by a data enhancement operation and then combined with the adjacency matrix A through the graph convolution function H(·) to obtain several enhanced views; alternatively, the adjacency matrix A is enhanced and then combined with the embedding matrix E through H(·) to obtain several enhanced views. Specifically, the embedding matrix and the adjacency matrix serve as the inputs of the graph convolution function; the two operations ID embedding Mask and ID embedding Dropout act on the embedding matrix, while Node Dropout and Edge Dropout act on the adjacency matrix. Therefore, if the selected operation is ID embedding Mask or ID embedding Dropout, the adjacency-matrix input of the graph convolution function is unchanged and the embedding matrix undergoes data enhancement; similarly, if the selected operation is Node Dropout or Edge Dropout, the embedding-matrix input is unchanged and the adjacency matrix undergoes data enhancement. Finally, several enhanced views are output by the graph convolution function.
The process of generating the enhanced views is formulated below, together with the corresponding data enhancement operations, from the two perspectives of ID embedding and graph structure.
First, data enhancement of the embedding matrix.
Before neighbor aggregation is performed, the embedding of each node describes the node's intrinsic characteristics. Two enhanced views Z′ and Z″ are generated from the embedding matrix by:

Z′ = H(t′(E), A),  Z″ = H(t″(E), A)

where t′ and t″ are defined as follows:

t′(E) = M′ ⊙ E,  t″(E) = M″ ⊙ E

In the above formulas, T denotes the set of random processes for masking the embedding matrix; t′ and t″ are two independent random masking processes sampled from T (corresponding to ID embedding Mask or ID embedding Dropout); M′ and M″ are the two correspondingly generated mask matrices; and ⊙ denotes the Hadamard product.
As will be appreciated by those skilled in the art, a view can be regarded as a result of graph characterization learning; the invention modifies the intrinsic characteristics of the nodes or the connection relationships between nodes (i.e., "data enhancement") to obtain different graph characterization results (i.e., "enhanced views").
The enhanced embedding matrices t' (E) and t "(E) are obtained by either of the following two types of data enhancement operations:
1) ID embedding Mask: with probability p, a subset of the dimensions of the embedding matrix E (a subset made up of certain dimensions of the embedding space) is masked out, so that the graph-neural-network-based recommendation model uses only part of the dimensions for characterization learning. With the two independent random masking processes t′ and t″, this design can focus on different embedding dimensions and learn the intrinsic associations between different dimension subsets.
2) ID embedding Dropout: with probability ρ, a subset of the elements of each node's embedding in E is set to 0, so that the node's remaining elements are used to distinguish it from other nodes. With the two independent dropout processes t′ and t″ (a special case of the random masking process), this design encourages recovering the node embedding information from partial information, reduces the dependence of the characterization on specific elements, and improves the robustness of the model.
The principle of both types of data enhancement operations is that the graph convolution function H(·) changes its output (i.e., the enhanced views) when its inputs change. Taking ID embedding Dropout as an example: the input ID embedding matrix E is the initial ID embedding matrix, and the enhanced ID embedding matrices are t′(E) and t″(E); the two random mask matrices M′ and M″ introduced by the data enhancement operation set part of the elements of the input matrix to 0 via the Hadamard product, thereby changing the input of the function.
The corresponding t′(E) and t″(E) are obtained according to the selected data enhancement operation and substituted into the expressions of Z′ and Z″ to obtain the corresponding enhanced views.
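A minimal sketch of the two embedding-side enhancement operations, assuming NumPy arrays; the shared-dimension mask and the element-wise dropout follow the descriptions above, and the function names are illustrative:

```python
import numpy as np

def embedding_mask(E, p, rng):
    """ID embedding Mask (sketch): drop a random subset of embedding
    dimensions, shared across all nodes, each with probability p."""
    keep = rng.random(E.shape[1]) >= p           # per-dimension keep flags
    M = np.broadcast_to(keep, E.shape)           # column-structured mask M
    return M * E                                 # t(E) = M ⊙ E

def embedding_dropout(E, rho, rng):
    """ID embedding Dropout (sketch): zero individual embedding elements
    independently with probability rho."""
    M = (rng.random(E.shape) >= rho).astype(E.dtype)
    return M * E                                 # t(E) = M ⊙ E

# two independent masking processes t' and t'' give two views of E
rng = np.random.default_rng(0)
E = np.ones((5, 8))
E1, E2 = embedding_mask(E, 0.3, rng), embedding_mask(E, 0.3, rng)
```

Note the structural difference: the mask operation zeroes whole columns (dimensions) of E, while the dropout operation zeroes individual elements.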
For a single node k, its enhanced views z′_k and z″_k are the k-th rows of the enhanced views Z′ and Z″. The pair (z′_k, z″_k) is a positive sample pair, while different enhanced views of different nodes constitute negative sample pairs; for example, the enhanced view z″_l of node l and z′_k form a negative sample pair.
Generally speaking, the nodes involved in positive sample pairs are all nodes on the bipartite graph; that is, node k may be either a user node or an item node. For negative sample pairs, the node type is theoretically unrestricted, but the node types are distinguished when defining the loss function: a user node forms negative sample pairs only with other user nodes, and an item node only with other item nodes; that is, node l and node k are nodes of the same type.
Second, data enhancement of the adjacency matrix.
Since the user-item bipartite graph contains collaborative filtering signals (for example, one-hop neighbors directly represent the historical interactions of users and items, two-hop neighbors describe users with similar behavior or items with the same audience, and higher-order connectivity from a user node to an item node reflects the user's potential interest in the item), mining the inherent patterns of the graph structure benefits characterization learning. Based on this, in the embodiment of the present invention, two enhanced views Z′ and Z″ are generated from the adjacency matrix by:
Z′ = H(E, s′(A)),  Z″ = H(E, s″(A))

where S denotes the set of random processes for masking the adjacency matrix, and s′ and s″ are two independent random masking processes sampled from S (corresponding to Node Dropout or Edge Dropout).
The enhanced adjacency matrices s′(A) and s″(A) are obtained by either of the following two types of data enhancement operations:
1) Node Dropout: with probability ρ, each node on the user-item bipartite graph is discarded together with its connected edges, formulated as follows:

s′(A) = M′ ⊙ A,  s″(A) = M″ ⊙ A

where M′ and M″ are the two correspondingly generated mask matrices (the same concept as before, except that they act on the adjacency matrix instead of the embedding matrix, so the mask matrices differ in size).
The above operation enables important nodes to be identified from different enhanced perspectives, reducing the sensitivity of the learned representations to structural changes.
2) Edge Dropout: with probability ρ, each edge on the bipartite graph is discarded, formulated as follows:

s′(A) = M′ ⊙ A,  s″(A) = M″ ⊙ A

where the mask matrices M′ and M″ are now sampled edge-wise rather than node-wise.
the operation only aggregates partial neighbors to generate the node representation, aims to capture the useful mode of the node part, gives stronger robustness to the representation, and resists the influence of noise interaction on the representation.
Similarly to before, after the adjacency matrix is processed by Node Dropout or Edge Dropout, the two enhanced views Z′ and Z″ are obtained. Likewise, for a single node k, the enhanced views z′_k and z″_k form a positive sample pair, and the enhanced view z″_l of another node l forms a negative sample pair with z′_k.
In general, to reduce training complexity, at the beginning of each epoch the two mask matrices M′ and M″ are regenerated according to the selected data enhancement operation, and the embedding matrix or the adjacency matrix is enhanced accordingly, yielding the enhanced matrices t′(E) and t″(E), or s′(A) and s″(A), from which the two enhanced views Z′ and Z″ are finally obtained. Then the positive and negative sample pairs are constructed in the manner described above, and training proceeds according to the training objective.
After the different enhanced views of the nodes are constructed, the InfoNCE loss function is used as the objective function of SSL, so that the auxiliary supervision signals of positive sample pairs promote consistent characterization between different views of the same node, while those of negative sample pairs promote differentiated characterization between different nodes. Here, "auxiliary" means that, in the proposed multi-task framework, the SSL task serves as an auxiliary task, and "supervision signal" means that each constructed sample pair carries a positive or negative label.
Since the bipartite graph is a heterogeneous graph composed of two different types of nodes (user nodes and item nodes), the objective function of self-supervised learning L_ssl is constructed by defining separate self-supervised losses for the user side and the item side:

L_ssl^user = Σ_{u∈U} -log( exp(s(z′_u, z″_u)/τ) / Σ_{v∈U} exp(s(z′_u, z″_v)/τ) )

L_ssl^item = Σ_{i∈I} -log( exp(s(z′_i, z″_i)/τ) / Σ_{j∈I} exp(s(z′_i, z″_j)/τ) )

L_ssl = L_ssl^user + L_ssl^item

where U and I denote the user node set and the item node set respectively; u and v are user node indices; i and j are item node indices; s(·) is the cosine similarity function, used to measure the similarity of two enhanced views; and τ is a hyperparameter similar to the temperature coefficient in SoftMax. As before, the enhanced views z′ and z″ can be obtained by any of the four data enhancement operations, forming the four SGL variants SGL-ID, SGL-IM, SGL-ND, and SGL-ED in Table 2. Following the earlier definitions of positive and negative sample pairs, (z′_u, z″_u) and (z′_i, z″_i) in the above losses are positive sample pairs, while (z′_u, z″_v) and (z′_i, z″_j) are negative sample pairs.
Finally, a multi-task training framework is designed to jointly optimize the classical recommendation task and the SSL task:

L = L_main + λ1·L_ssl + λ2·||Θ||_2^2

where L_main, corresponding to the right side of the first layer of fig. 1, is the BPR (Bayesian Personalized Ranking) loss function; λ1 and λ2 are hyperparameters controlling the strengths of the SSL task (L_ssl) and of the L2 regularization (||Θ||_2^2) respectively; and Θ denotes the trainable parameters of the graph-neural-network-based recommendation model.
With the scheme provided by the invention, an existing graph-neural-network-based recommendation model is trained (illustratively, training can be stopped if the recommendation metric does not improve in 10 consecutive tests). After training, the final characterization vectors of all nodes are obtained with a single forward propagation, so that the recommendation model can construct higher-quality characterization vectors for users and items and provide users with higher-quality and more accurate personalized recommendation content.
Compared with existing methods, the scheme (SGL) provided by the embodiment of the invention has the following advantages: 1) significantly improved recommendation accuracy; 2) effective alleviation of the long-tail problem; 3) greatly improved training efficiency; 4) markedly enhanced noise resistance. To demonstrate these advantages, extensive experiments were performed on three real data sets, using LightGCN (He et al., SIGIR 2020) as the GCN model for SGL. Table 1 gives the statistics of the three data sets.
Data set          Users    Items   Interactions  Sparsity
Yelp2018          31,668   38,048  1,561,406     0.00130
Amazon-Book       52,643   91,599  2,984,108     0.00062
Alibaba-iFashion  300,000  81,614  1,607,813     0.00007

TABLE 1. Statistics of the data sets
1) The recommendation precision is obviously improved.
By using SSL to assist the characterization learning of the recommendation task, the invention obtains higher-quality user and item characterizations: NDCG@20 is improved by 5.7%, 19.2%, and 6.5% on the three data sets respectively.
The results of the experiment are shown in table 2.
TABLE 2 SGL Performance comparison with LightGCN
In Table 2, the metric Recall@20 denotes the recall of the first 20 recommended items, and NDCG@20 denotes the normalized discounted cumulative gain of the first 20 recommended items; the suffixes ID, IM, ND, and ED in the model names correspond to the four data enhancement operations described above. As can be seen from Table 2, all four data enhancement schemes of SGL achieve higher recommendation accuracy than LightGCN, validating the above conclusion. Among the four schemes, SGL-ED is consistently superior to the other three, though each of the others has its own advantages; in practice, the type of data enhancement operation can therefore be selected according to need or experience. In addition, for single-layer models, the performance improvement on the two sparser data sets (Amazon-Book and Alibaba-iFashion) is more pronounced, showing that SGL better mines the intrinsic characteristics of sparse data and verifying the superiority of introducing SSL into graph-neural-network-based recommendation models.
2) The long tail problem is effectively alleviated.
As mentioned above, GNN-based recommendation models suffer from a serious long-tail problem. To verify that SGL effectively alleviates it, the following experiment was performed: the item pool is first divided into 10 groups according to item popularity, with smaller group IDs indicating lower popularity, while the total number of exposures of the items within each group is the same. Then, the Recall metric is decomposed into the sum of per-group recalls, formulated as follows:
Recall^(g) = (1/m) Σ_{u=1}^{m} |R_u^(g) ∩ T_u| / |T_u|

where (g) indexes the group, m denotes the total number of users, R_u^(g) is the portion of user u's recommendation list that falls in group g, and T_u is the list of relevant items of user u in the test set.
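The group-wise decomposition of Recall can be sketched as follows; each hit is credited to the popularity group of the hit item, so the per-group values sum to the overall Recall (the function name and toy data are illustrative):

```python
def group_recall(rec_lists, test_lists, item_group, n_groups):
    """Per-group decomposition of Recall (sketch):
    Recall^(g) = (1/m) * sum_u |R_u^(g) ∩ T_u| / |T_u|,
    so that the per-group values sum to the overall Recall."""
    m = len(rec_lists)
    out = [0.0] * n_groups
    for R_u, T_u in zip(rec_lists, test_lists):
        for item in set(R_u) & set(T_u):         # hits of user u
            out[item_group[item]] += 1.0 / (m * len(T_u))
    return out

# toy example: item 0 in group 0; items 1 and 2 in group 1
item_group = {0: 0, 1: 1, 2: 1}
recalls = group_recall([[0, 1], [2, 0]], [{0, 1}, {2}], item_group, 2)
```

In the toy example both users have perfect recall, so the two group contributions add up to an overall Recall of 1.0.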
FIG. 2 shows the results of this experiment; the suffixes denote the number of graph convolution layers, and each set of bars corresponds, from left to right, to SGL-ED-L1, SGL-ED-L2, SGL-ED-L3, LightGCN-L1, LightGCN-L2, and LightGCN-L3. As can be seen from fig. 2, LightGCN prefers to recommend head items while barely exposing long-tail items: for group 10 on the three data sets, the items in that group account for only 0.83% and 0.22% of the total item pool, respectively, yet contribute 39.72%, 39.92%, and 51.92% of the recall, indicating that LightGCN has difficulty learning high-quality characterizations of long-tail items, mainly due to sparse supervision signals. For SGL, these contribution rates drop to 36.27%, 29.15%, and 35.07% respectively, indicating that SGL is relatively more likely to recommend non-head items. Meanwhile, comparing with the results in Table 2, the improvement in recommendation accuracy brought by SGL mainly comes from accurate recommendation of mid-popularity ("waist") items, which also shows that characterization learning benefits from the auxiliary supervision signals to construct better node characterizations.
3) The training efficiency is greatly improved.
SSL has proven to offer great advantages in the pre-training of natural language models and graph structures, so its impact on training efficiency is investigated here. To this end, the training curves of SGL-ED and LightGCN are shown in FIG. 3, where the upper three plots show the BPR loss versus epoch on the three data sets, and the lower three plots show the recall on the corresponding test sets. It can be seen that SGL-ED converges much faster than LightGCN on Yelp2018 and Amazon-Book: specifically, SGL reaches its best performance at epochs 18 and 16 on the two data sets, respectively, whereas LightGCN requires 720 and 700 epochs, which indicates that SGL can greatly shorten the required training time while achieving a considerable performance improvement. Although a different curve trend is observed on Alibaba-iFashion, SGL-ED still converges faster than LightGCN there, again validating the benefit of introducing SSL.
4) The noise resistance is obviously enhanced.
The robustness of SGL to noisy interactions is verified experimentally. To this end, for the Yelp2018 and Amazon-Book data sets, a certain proportion (5%, 10%, 15% and 20%, respectively) of unobserved interactions is randomly added to the training set so as to corrupt it to different degrees, and the recall on the test set is then measured. The experimental results are shown in FIG. 4, where the bars represent the recall of each model and the lines represent the rate of performance degradation. As can be seen from FIG. 4, the accuracy of both SGL and LightGCN decreases as the noise ratio increases, but the decrease of SGL is significantly smaller than that of LightGCN, and the performance gap between the two widens as the noise ratio increases. This shows that, by contrasting different augmented views of nodes, SGL can discover useful patterns, especially graph-structure information, and reduce the model's dependency on particular edges. On the Amazon-Book data set, even with 20% noise interactions added, the performance of SGL is still higher than that of LightGCN without any added noise, which further verifies the advantage of SGL.
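The corruption protocol described above, adding a fixed proportion of random unobserved interactions to the training set, can be sketched as follows; the function name and the (user, item) pair representation are assumptions made for illustration:

```python
import numpy as np

def inject_noise(train_pairs, num_users, num_items, ratio, seed=0):
    """Corrupt a training set by appending ratio * |train| random
    user-item interactions that were not observed in the data."""
    rng = np.random.default_rng(seed)
    observed = set(map(tuple, train_pairs))
    n_fake = int(len(train_pairs) * ratio)
    fake = []
    while len(fake) < n_fake:
        u = int(rng.integers(num_users))
        i = int(rng.integers(num_items))
        if (u, i) not in observed:  # keep only truly unobserved pairs
            observed.add((u, i))
            fake.append((u, i))
    return train_pairs + fake
```

Training on the corrupted set and evaluating recall on the clean test set, at each noise ratio, reproduces the comparison reported in FIG. 4.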
Through the above description of the embodiments, those skilled in the art will clearly understand that the above embodiments can be implemented by software, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A recommendation method based on self-supervised graph representation learning, characterized by comprising the following steps:
in a training stage, selecting a data enhancement operation, performing data enhancement on an embedding matrix or an adjacency matrix of a user-item bipartite graph based on the selected data enhancement operation to generate a plurality of enhanced views of each node, and constructing positive sample pairs from different enhanced views of the same node and negative sample pairs from enhanced views of different nodes; inputting the positive sample pairs and the negative sample pairs into a current graph neural network based recommendation model for training, wherein the training target is to maximize the consistency between the representations of the different enhanced views in each positive sample pair and the difference between the representations of the enhanced views in each negative sample pair; the nodes comprise user nodes and item nodes in the user-item bipartite graph;
after training is finished, obtaining final representation vectors of all nodes through one forward propagation; in an inference stage, for a user to be recommended, calculating matching scores between the user and each item as the inner product of their final representation vectors, sorting items in descending order of matching score, and selecting the top K items the user has not interacted with as the recommendation result, wherein K is the size of the recommendation list.
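The inference-stage scoring of claim 1 (inner-product matching, descending sort, exclusion of already-interacted items) can be sketched as follows; the function name and the dense-matrix layout are illustrative assumptions:

```python
import numpy as np

def recommend_top_k(user_vec, item_embs, interacted, k):
    """Score every item by the inner product with the user's final
    embedding, mask out already-interacted items, return top-K ids."""
    scores = item_embs @ user_vec          # matching score per item
    scores[list(interacted)] = -np.inf     # exclude seen items
    return np.argsort(-scores)[:k].tolist()
```

Because scoring is a single matrix-vector product over the frozen final embeddings, inference requires no further graph propagation.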
2. The recommendation method based on self-supervised graph representation learning according to claim 1, characterized in that the graph neural network based recommendation model performs node embedding propagation on the user-item bipartite graph, defined as:
$$Z = H(E, \hat{A})$$
wherein $\hat{A} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}|}$ is the adjacency matrix constructed from the node set $\mathcal{V}$ and the observed interaction set $\mathcal{E}$, $H(\cdot)$ is a graph convolution function, and $E$ is the embedding matrix;
the plurality of enhanced views are obtained either by performing data enhancement on the embedding matrix $E$ through the data enhancement operation and then combining it with the adjacency matrix $\hat{A}$ through the graph convolution function $H(\cdot)$, or by performing data enhancement on the adjacency matrix $\hat{A}$ and then combining it with the embedding matrix $E$ through the graph convolution function $H(\cdot)$.
3. The recommendation method based on self-supervised graph representation learning according to claim 2, characterized in that the data enhancement operations comprise: ID embedding mask, ID embedding Dropout, node Dropout, and edge Dropout; the object acted on by the ID embedding mask and ID embedding Dropout operations is the embedding matrix, and the object acted on by the node Dropout and edge Dropout operations is the adjacency matrix; only one data enhancement operation is used in each training run: two mask matrices M' and M'' are generated according to the selected data enhancement operation, data enhancement is then performed on the embedding matrix or the adjacency matrix to obtain enhanced matrices, and finally two enhanced views Z' and Z'' are obtained; for a single node k, its enhanced views z'_k and z''_k are the k-th rows of the enhanced views Z' and Z'', respectively; z'_k and z''_k form a positive sample pair, while enhanced views of different nodes form negative sample pairs.
4. The recommendation method based on self-supervised graph representation learning according to claim 2 or 3, characterized in that, if the object acted on by the selected data enhancement operation is the embedding matrix, the two enhanced views $Z'$ and $Z''$ are generated from the embedding matrix by:
$$Z' = H\left(t'(E), \hat{A}\right), \qquad Z'' = H\left(t''(E), \hat{A}\right)$$
wherein $\hat{A} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}|}$ is the bipartite graph adjacency matrix constructed from the node set $\mathcal{V}$ and the observed interaction set $\mathcal{E}$, $H(\cdot)$ is the graph convolution function, $E$ is the embedding matrix, $\mathcal{T}$ is the set of random processes that mask the embedding matrix, $M'$ and $M''$ are the two correspondingly generated mask matrices, $\odot$ denotes the Hadamard product, and $t'$ and $t''$ are two mutually independent random mask processes sampled from $\mathcal{T}$, defined as follows:
$$t'(E) = M' \odot E, \qquad t''(E) = M'' \odot E$$
the enhanced embedding matrices $t'(E)$ and $t''(E)$ are obtained by either of the following two types of data enhancement operations:
ID embedding mask: with probability $\rho$, a subset of the dimensions of the embedding matrix $E$ is masked out, so that the graph neural network based recommendation model uses only part of the dimensions for representation learning; this data enhancement operation uses the two independent random mask processes $t'$ and $t''$, which can focus on different embedding dimensions to learn the intrinsic associations between different dimension subsets;
ID embedding Dropout: with probability $\rho$, a subset of the elements of a node's embedding in the embedding matrix $E$ is set to 0, so that the node's remaining elements are used to distinguish the node from other nodes; this data enhancement operation uses two independent dropout random processes $t'$ and $t''$ to recover the node embedding information from partial information; the dropout random process is a special case of the random mask process.
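A minimal numpy sketch of the two embedding-side operations is given below; it assumes dense embeddings and per-view independent masks, and the function names are illustrative. Note the key difference: dimension masking shares one column mask across all nodes of a view, while Dropout masks individual entries:

```python
import numpy as np

def embedding_mask_views(E, rho, seed=0):
    """Two augmented embeddings via the ID-embedding-mask operation:
    each view independently zeroes a random subset of embedding
    *dimensions* (columns) with probability rho."""
    rng = np.random.default_rng(seed)
    d = E.shape[1]
    M1 = (rng.random(d) >= rho).astype(E.dtype)  # column mask, view 1
    M2 = (rng.random(d) >= rho).astype(E.dtype)  # column mask, view 2
    return E * M1, E * M2

def embedding_dropout_views(E, rho, seed=0):
    """Two augmented embeddings via ID-embedding-Dropout: each view
    independently zeroes individual embedding entries with prob. rho."""
    rng = np.random.default_rng(seed)
    M1 = (rng.random(E.shape) >= rho).astype(E.dtype)
    M2 = (rng.random(E.shape) >= rho).astype(E.dtype)
    return E * M1, E * M2
```

Both functions realize t'(E) = M' ⊙ E and t''(E) = M'' ⊙ E; only how the mask matrices are drawn differs.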
5. The recommendation method based on self-supervised graph representation learning according to claim 2 or 3, characterized in that, if the object acted on by the selected data enhancement operation is the adjacency matrix, the two enhanced views $Z'$ and $Z''$ are generated from the adjacency matrix by:
$$Z' = H\left(E, s'(\hat{A})\right), \qquad Z'' = H\left(E, s''(\hat{A})\right)$$
wherein $\hat{A} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}|}$ is the bipartite graph adjacency matrix constructed from the node set $\mathcal{V}$ and the observed interaction set $\mathcal{E}$, $H(\cdot)$ is the graph convolution function, $E$ is the embedding matrix, $\mathcal{T}$ is the set of random processes that mask the adjacency matrix, and $s'$ and $s''$ are two mutually independent random mask processes sampled from $\mathcal{T}$;
the enhanced adjacency matrices $s'(\hat{A})$ and $s''(\hat{A})$ are obtained by either of the following two types of data enhancement operations:
node Dropout: with probability $\rho$, each node on the user-item bipartite graph is discarded together with its connected edges, which is formulated as follows:
$$s'(\hat{A}) = M' \odot \hat{A}, \qquad s''(\hat{A}) = M'' \odot \hat{A}$$
where the mask matrices zero out the rows and columns of $\hat{A}$ corresponding to the discarded nodes;
edge Dropout: with probability $\rho$, each edge on the bipartite graph is discarded, which is formulated as follows:
$$s'(\hat{A}) = M' \odot \hat{A}, \qquad s''(\hat{A}) = M'' \odot \hat{A}$$
where the mask matrices zero out the individual entries of $\hat{A}$ corresponding to the discarded edges; in both cases, $M'$ and $M''$ are the two correspondingly generated mask matrices, and $\odot$ denotes the Hadamard product.
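A minimal sketch of the adjacency-side augmentation, shown here for edge Dropout as a Hadamard mask on a dense symmetric adjacency matrix; the function name and the dense representation are illustrative assumptions (a production system would typically use sparse matrices):

```python
import numpy as np

def edge_dropout_views(A, rho, seed=0):
    """Two augmented adjacency matrices via edge Dropout: each view
    independently drops every edge (nonzero entry) with probability
    rho, keeping the mask symmetric for an undirected bipartite graph."""
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(2):
        keep = rng.random(A.shape) >= rho
        # mirror the upper triangle so (u, i) and (i, u) agree
        keep = np.triu(keep) | np.triu(keep, 1).T
        views.append(A * keep)
    return views
```

Node Dropout differs only in how the mask is built: a dropped node's entire row and column are zeroed instead of individual entries.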
6. The recommendation method based on self-supervised graph representation learning according to claim 2 or 3, characterized in that the training target is expressed as:
$$\mathcal{L} = \mathcal{L}_{BPR} + \lambda_1 \mathcal{L}_{ssl} + \lambda_2 \|\Theta\|_2^2$$
$$\mathcal{L}_{ssl} = \mathcal{L}_{ssl}^{user} + \mathcal{L}_{ssl}^{item}$$
$$\mathcal{L}_{ssl}^{user} = \sum_{u\in\mathcal{U}} -\log \frac{\exp\left(s(z'_u, z''_u)/\tau\right)}{\sum_{v\in\mathcal{U}} \exp\left(s(z'_u, z''_v)/\tau\right)}$$
$$\mathcal{L}_{ssl}^{item} = \sum_{i\in\mathcal{I}} -\log \frac{\exp\left(s(z'_i, z''_i)/\tau\right)}{\sum_{j\in\mathcal{I}} \exp\left(s(z'_i, z''_j)/\tau\right)}$$
wherein $\mathcal{L}_{BPR}$ is the BPR loss function, $\lambda_1$ and $\lambda_2$ are hyper-parameters, and $\Theta$ is the set of trainable parameters of the graph neural network based recommendation model; $\mathcal{L}_{ssl}$ is the objective function of self-supervised learning, and $\mathcal{L}_{ssl}^{user}$ and $\mathcal{L}_{ssl}^{item}$ denote the self-supervised learning loss functions on the user side and the item side, respectively; $\mathcal{U}$ and $\mathcal{I}$ denote the user node set and the item node set respectively, $u$ and $v$ are user node indices, $i$ and $j$ are item node indices, $s(\cdot)$ is the cosine similarity function, $\tau$ is a temperature hyper-parameter, and $z'_*$ and $z''_*$ denote the two enhanced views of node $* \in \{u, v, i, j\}$.
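The self-supervised objective of claim 6 can be sketched per node type as follows, assuming the k-th rows of Z' and Z'' are a node's two augmented representations; this is an illustrative dense-numpy version of the InfoNCE-style loss, not the patented implementation itself:

```python
import numpy as np

def ssl_infonce_loss(Z1, Z2, tau=0.2):
    """Contrastive loss over two augmented views: row k of Z1 and
    row k of Z2 are a positive pair; all other rows of Z2 act as
    negatives. Similarity s(.,.) is cosine similarity."""
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    sim = Z1 @ Z2.T / tau                    # pairwise cosine / tau
    sim -= sim.max(axis=1, keepdims=True)    # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).sum()
```

Calling this once on the user rows and once on the item rows and summing gives the two terms of the self-supervised objective; the stability shift cancels in the log-softmax, so it does not change the loss value.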
CN202110219147.XA 2021-02-26 2021-02-26 Recommendation method based on self-supervision graph representation learning Pending CN112925977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110219147.XA CN112925977A (en) 2021-02-26 2021-02-26 Recommendation method based on self-supervision graph representation learning


Publications (1)

Publication Number Publication Date
CN112925977A true CN112925977A (en) 2021-06-08

Family

ID=76172315

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110219147.XA Pending CN112925977A (en) 2021-02-26 2021-02-26 Recommendation method based on self-supervision graph representation learning

Country Status (1)

Country Link
CN (1) CN112925977A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299373A (en) * 2018-10-20 2019-02-01 上海交通大学 Recommender system based on figure convolution technique
CN111611472A (en) * 2020-03-31 2020-09-01 清华大学 Binding recommendation method and system based on graph convolution neural network
CN112084407A (en) * 2020-09-08 2020-12-15 辽宁工程技术大学 Collaborative filtering recommendation method fusing graph neural network and attention mechanism
CN112232925A (en) * 2020-11-02 2021-01-15 哈尔滨工程大学 Method for carrying out personalized recommendation on commodities by fusing knowledge maps


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIANCAN WU等: "Self-supervised Graph Learning for Recommendation", 《HTTPS://ARXIV.ORG/ABS/2010.10783V1》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469289A (en) * 2021-09-01 2021-10-01 成都考拉悠然科技有限公司 Video self-supervision characterization learning method and device, computer equipment and medium
CN113469289B (en) * 2021-09-01 2022-01-25 成都考拉悠然科技有限公司 Video self-supervision characterization learning method and device, computer equipment and medium
CN113961816A (en) * 2021-11-26 2022-01-21 重庆理工大学 Graph convolution neural network session recommendation method based on structure enhancement
CN114491283A (en) * 2022-04-02 2022-05-13 浙江口碑网络技术有限公司 Object recommendation method and device and electronic equipment
CN114491283B (en) * 2022-04-02 2022-07-22 浙江口碑网络技术有限公司 Object recommendation method and device and electronic equipment
CN114897161B (en) * 2022-05-17 2023-02-07 中国信息通信研究院 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium
CN114897161A (en) * 2022-05-17 2022-08-12 中国信息通信研究院 Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium
CN115329211A (en) * 2022-08-01 2022-11-11 山东省计算中心(国家超级计算济南中心) Personalized interest recommendation method based on self-supervision learning and graph neural network
CN115329211B (en) * 2022-08-01 2023-06-06 山东省计算中心(国家超级计算济南中心) Personalized interest recommendation method based on self-supervision learning and graph neural network
CN115659176A (en) * 2022-10-14 2023-01-31 湖南大学 Training method of intelligent contract vulnerability detection model and related equipment
CN116151892A (en) * 2023-04-20 2023-05-23 中国科学技术大学 Item recommendation method, system, device and storage medium
CN116151892B (en) * 2023-04-20 2023-08-29 中国科学技术大学 Item recommendation method, system, device and storage medium
CN117131282A (en) * 2023-10-26 2023-11-28 江西财经大学 Multi-view graph contrast learning recommendation method and system integrating layer attention mechanism
CN117131282B (en) * 2023-10-26 2024-01-05 江西财经大学 Multi-view graph contrast learning recommendation method and system integrating layer attention mechanism
CN117934891A (en) * 2024-03-25 2024-04-26 南京信息工程大学 Image contrast clustering method and system based on graph structure
CN117934891B (en) * 2024-03-25 2024-06-07 南京信息工程大学 Image contrast clustering method and system based on graph structure

Similar Documents

Publication Publication Date Title
CN112925977A (en) Recommendation method based on self-supervision graph representation learning
Darban et al. GHRS: Graph-based hybrid recommendation system with application to movie recommendation
Deng et al. On deep learning for trust-aware recommendations in social networks
Lake et al. Human-level concept learning through probabilistic program induction
CN109389151B (en) Knowledge graph processing method and device based on semi-supervised embedded representation model
Ni et al. A two-stage embedding model for recommendation with multimodal auxiliary information
Xiao et al. Uprec: User-aware pre-training for recommender systems
CN113869424A (en) Semi-supervised node classification method based on two-channel graph convolutional network
Ji et al. Relationship-aware contrastive learning for social recommendations
Xie et al. TPNE: topology preserving network embedding
Pham et al. Unsupervised training of Bayesian networks for data clustering
Liu et al. Neural matrix factorization recommendation for user preference prediction based on explicit and implicit feedback
Ye et al. Multi-granularity sequential three-way recommendation based on collaborative deep learning
Wu et al. Heterogeneous representation learning and matching for few-shot relation prediction
Chen et al. Approximate personalized propagation for unsupervised embedding in heterogeneous graphs
Hong et al. DSER: Deep-sequential embedding for single domain recommendation
Wen et al. Session‐Based Recommendation with GNN and Time‐Aware Memory Network
Chen et al. Poverty/investment slow distribution effect analysis based on Hopfield neural network
Tu et al. Joint implicit and explicit neural networks for question recommendation in CQA services
Li et al. Capsule neural tensor networks with multi-aspect information for Few-shot Knowledge Graph Completion
Chen et al. Gaussian mixture embedding of multiple node roles in networks
CN115310004A (en) Graph nerve collaborative filtering recommendation method fusing project time sequence relation
Yin et al. Deep collaborative filtering: a recommendation method for crowdfunding project based on the integration of deep neural network and collaborative filtering
Han et al. An effective heterogeneous information network representation learning framework
Sangeetha et al. An Enhanced Neural Graph based Collaborative Filtering with Item Knowledge Graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210608