CN115438258A - User preference modeling method based on lightweight graph convolution attention network - Google Patents
- Publication number
- CN115438258A (application CN202211044168.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention provides a user preference modeling method based on a lightweight graph convolution attention network, comprising the following steps: S1, modeling static user preferences with a lightweight GCN that performs only neighborhood aggregation; S2, modeling dynamic user preferences with a time-aware GAT over the most recently interacted items; and S3, combining the static and dynamic user preferences, inputting them into a two-channel deep neural network model, and performing feature interaction learning and matching score prediction. The invention captures the static and dynamic preferences of the user simultaneously, in an end-to-end manner, by using different GNN methods, and significantly outperforms current state-of-the-art recommendation methods.
Description
[ Technical Field ]
The invention relates to the technical field of user preference modeling and personalized recommendation, in particular to a user preference modeling method based on a lightweight graph convolution attention network.
[ background of the invention ]
With the rapid development of computing resources and the availability of large amounts of training data, researchers have begun applying deep learning (DL) techniques to machine learning tasks such as speech recognition, machine translation, and recommendation. DL models have had great success on Euclidean data. However, data from non-Euclidean domains (such as graph structures) are now widespread and require additional analysis. For example, in chemistry, molecules are represented as graph-based structures whose biological activities must be determined for drug discovery. In e-commerce, interactions between users and products are modeled as graph structures for accurate product recommendation. A graph is a data format in which "nodes" represent individuals in a network and "edges" represent connections between individuals. Following the success of DL, researchers have attempted to design GNN architectures based on the ideas of deep auto-encoders, recursive networks, and convolutional networks. Early studies learned the representation of a target node by iteratively propagating neighborhood information until a steady state was reached. This learning process, also known as graph embedding, aims to convert graph nodes into low-dimensional vectors while preserving the network topology and node characteristics, so that subsequent graph processing tasks (classification, clustering, recommendation, etc.) can be carried out with simple statistical and machine learning methods (dot product, cosine similarity, etc.).
The graph neural network (GNN) is a promising learning technique for graph data representation. The graph convolution network (GCN) and the graph attention network (GAT) are its two main representative techniques; both learn an embedded representation of a target node by aggregating the embedded representations of neighboring nodes. Thanks to the advantages of GNNs in graph representation learning, many GNN-based recommendation techniques have emerged to address different challenges on various graphs. The first is the GCN, which learns user and item embeddings by aggregating information from graph neighbors through convolution and pooling operations; owing to its powerful feature extraction and learning capabilities, it has gained wide application (for example in remote sensing) with great success. The second is the GAT, which introduces an attention mechanism into the GNN to learn the different degrees of influence that neighbor nodes exert on a target node in an interaction graph; owing to its good discriminative power, it has been widely used to build high-performance recommendation models. Although GNN-based recommendation methods have been highly successful, the complex structure of existing GNN methods makes them inefficient on tasks with many graph nodes; the computation of GCN and GAT is unfriendly to complex graph structures. In addition, conventional GCN and GAT ignore the time factor in representation learning, so the resulting user preference model is static and cannot reflect dynamic changes in user preferences.
[ summary of the invention ]
The invention aims to solve the problems in the prior art and provides a user preference modeling method based on a lightweight graph convolution attention network, which can effectively capture static and dynamic user preferences by using different GNN methods.
In order to achieve the above purpose, the invention provides a user preference modeling method based on a lightweight graph convolution attention network, which comprises the following steps:
S1, modeling static user preferences with a lightweight GCN that performs only neighborhood aggregation;
S2, modeling dynamic user preferences with a time-aware GAT over the most recently interacted items;
and S3, combining the static and dynamic user preferences, inputting them into a two-channel deep neural network model, and performing feature interaction learning and matching score prediction.
Preferably, the method is implemented as a time-aware lightweight graph convolution attention network (LightGCAN), comprising: an input layer, an embedding layer, a representation layer, an interaction layer and an output layer. The input layer consists of two matrices: a user-item interaction matrix R ∈ {0,1}^(m×n) and an interaction time matrix T, where m and n denote the numbers of users and items, respectively. R is an implicit feedback matrix: r_ui = 1 if there is an interaction between user u and item i, and r_ui = 0 otherwise; T records the interaction time between users and items as timestamps and has the same dimensions as R. The input layer provides initial feature representations x_u and x_i of the user and the item; both are multi-hot vectors, corresponding to the u-th row and the i-th column of R, respectively. The embedding layer is a fully connected layer that converts the sparse user and item representations into dense latent embeddings, which then serve as input to the representation layer for user preference modeling. The representation layer contains two GNN models, LightGCN and TGAT, used for static and dynamic user preference modeling, respectively; the resulting static and dynamic user preferences are combined and sent to the interaction layer for high-order feature interaction learning. The interaction layer contains two DNN models, DMF and MLP, which learn different feature interactions with different deep learning strategies; finally, the resulting feature interaction vectors are concatenated and passed to the output layer to predict the user-item matching score.
Preferably, the attention elements of the time-aware-based lightweight graph convolutional attention network LightGCAN include a current user, a target item, the last k interactive items, and an interaction time.
Preferably, the hyper-parameters of the time-aware-based lightweight graph convolutional attention network LightGCAN include the number of potential factors, the number of recent interactive items, and the number of hidden layers in the DL model.
Preferably, the number of hidden layers of the DMF and the MLP model is set to 3.
Preferably, in step S1, static user preference modeling is performed with the lightweight GCN model LightGCN, which retains only the neighbor aggregation operation of the GCN, with no feature transformation or nonlinear activation.
Preferably, in step S1, the modeling process of the static user preference includes the following steps:
S1.1 lightweight graph convolution: the user and item embeddings at layer k+1 are defined as weighted-sum aggregations:

e_u^(k+1) = Σ_{i∈N_u} e_i^(k) / √(|N_u||N_i|)
e_i^(k+1) = Σ_{u∈N_i} e_u^(k) / √(|N_i||N_u|)

where e_u^(k) and e_i^(k) denote the embeddings of user u and item i at layer k, N_u and N_i denote the neighbor nodes of user u and item i, respectively, and 1/√(|N_u||N_i|) is a normalization term; the only trainable parameters are the first-layer user and item embeddings, since the higher-layer embeddings are computed automatically, layer by layer, through this iterative process;
the user and item embedding of the first layer is represented as:
wherein, W u And W v Respectively representing a weight matrix for converting the initial feature vectors of the users and the items into potential embedded representations;
S1.2 layer aggregation: after K lightweight graph convolution layers, K+1 different user/item embeddings are obtained (one per layer), each representing different latent semantic information; the per-layer embeddings are combined to generate the embedding of the target user/item:

e_u^s = Σ_{k=0}^{K} α_k e_u^(k),  e_i^s = Σ_{k=0}^{K} α_k e_i^(k)

where e_u^s and e_i^s denote the static user preference and item feature representations, respectively, and α_k ≥ 0 denotes the importance weight of the layer-k embedding.
Preferably, in step S1.2, α_k is set to 1/(K+1).
Preferably, in step S2, the modeling process of the dynamic user preference includes the following steps:
S2.1, the embeddings of the current user, the target item, the most recently interacted items and the interaction times are combined and input into an attention network, which learns the importance weight of each recent item for dynamic user preference modeling:

x_j = e_u ∥ e_v ∥ e_j ∥ e_j^t
a_j^(k) = σ(W_k a_j^(k-1) + b_k),  a_j^(0) = x_j

where W_k and b_k denote the weight matrix and bias vector of the k-th layer of the attention network, σ(·) denotes the activation function, and e_u, e_v, e_j and e_j^t denote the embeddings of the current user, the target item, the recently interacted item j and its interaction time, respectively; x_j, the concatenation of these four embeddings, is the input of the attention network, and the output of its last layer is the raw attention coefficient a_j;
S2.2, the time interval between a historical interaction and the current time is discretized in minutes, and the embedded representation of the interaction time is then obtained through a linear transformation:

ts_j = min((T - t_j)/60, δ)
e_j^t = W_t ts_j

where t_j denotes the current user's interaction time with item j, T is the current time at prediction, ts_j denotes the time interval between the interaction time and the prediction time, the min function caps the interval at the threshold δ, and W_t denotes the time-embedding transformation matrix;
S2.3, the attention coefficients are normalized with the softmax function:

α_j = exp(a_j) / Σ_{j'∈RK_u} exp(a_{j'})

where RK_u denotes the last k interacted items of user u;
S2.4, the dynamic user preference vector is modeled as the weighted sum of the embedded representations of the current user's last k interacted items:

e_u^d = Σ_{j∈RK_u} α_j e_j

where e_u^d denotes the dynamic preference of user u and e_j denotes the embedding of historical interaction item j.
Preferably, step S3 specifically includes the steps of:
S3.1, the static and dynamic user preference representations are concatenated to obtain the user preference representation; the item embeddings obtained from the embedding layer and from LightGCN are likewise combined to obtain the item feature representation:

e_u = e_u^s ∥ e_u^d,  e_i = e_i^s ∥ e_i^(0)

where e_u^s and e_i^s denote the static user preference and the item feature, respectively, e_u^d denotes the dynamic user preference, e_i^(0) denotes the embedding-layer representation of item i, and e_u and e_i are the resulting user preference vector and item feature vector; the generated user and item representations serve as input to the high-order feature interaction learning models DMF and MLP;
S3.2, DMF-based feature interaction learning: the DMF model has a multi-layer two-channel structure with a user component and an item component; within each component, the output of the current layer is the input of the next layer, and in each layer the input vector is projected to a hidden vector by a linear transformation followed by a nonlinear activation:

p_u^(k) = σ(W_u^(k) p_u^(k-1) + b_u^(k))
q_i^(k) = σ(W_i^(k) q_i^(k-1) + b_i^(k))

where p_u^(k) and q_i^(k) denote the hidden representations of user u and item i at layer k; W_u^(k) and b_u^(k) denote the weight matrix and bias vector of the k-th layer of the user component, and W_i^(k) and b_i^(k) those of the item component;
through iterative learning of the multi-layered DMF model, user and item representations are mapped to a low-dimensional potential embedding space:
wherein L is 1 Number of layers, p, representing DMF model u And q is i Respectively representing potential representations of the learned user u and the item i;
the user-item feature interaction is defined as the product of the user and the item potential representation vector:
S3.3, MLP-based feature interaction learning: the MLP is a typical deep learning model that first concatenates the user and item feature vectors and then stacks several hidden layers on top to learn high-order user-item feature interactions:

z_0 = [e_u ∥ e_i]
z_k = σ_k(W_k z_{k-1} + b_k),  k = 1, …, L_2
φ_MLP = σ(H_1^T z_{L_2})

where W_k, b_k and σ_k denote the weight matrix, bias vector and activation function of the k-th layer, respectively, H_1 denotes the weight matrix of the output layer, L_2 denotes the number of layers of the model, and φ_MLP denotes the high-order feature interaction vector learned by the MLP model;
S3.4, matching score prediction: DMF and MLP are first run separately, their output vectors are then concatenated, and finally the combined embedding vector is input into the output layer of LightGCAN for matching score prediction:

ŷ_ui = σ(H_2^T [φ_DMF ∥ φ_MLP])

where H_2 denotes the weight matrix of the output layer;
S3.5, model training: LightGCAN is a CF model based on implicit feedback and uses the binary cross-entropy loss as its objective function, minimizing the difference between the predicted matching scores and the implicit feedback:

L = - Σ_{(u,i)∈R+∪R-} [ r_ui log ŷ_ui + (1 - r_ui) log(1 - ŷ_ui) ]

where ŷ_ui is the predicted matching score between user u and item i, r_ui is the implicit feedback observed in the interaction matrix R, R+ and R- are the positive and negative sample sets, respectively, and Θ denotes the set of trainable model parameters.
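As a minimal illustration of the objective described above, the binary cross-entropy over predicted scores and implicit-feedback labels can be sketched in pure Python (function name and the clamping epsilon are illustrative choices, not from the patent):

```python
import math

def bce_loss(preds, labels, eps=1e-12):
    """Binary cross-entropy over observed positive and sampled negative
    user-item pairs: -sum(r*log(y_hat) + (1-r)*log(1-y_hat))."""
    total = 0.0
    for y_hat, r in zip(preds, labels):
        y_hat = min(max(y_hat, eps), 1 - eps)  # clamp for numerical safety
        total -= r * math.log(y_hat) + (1 - r) * math.log(1 - y_hat)
    return total
```

A confident correct prediction yields a loss near zero, while a confident wrong one is penalized heavily, which is what drives the scores toward the observed feedback.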
The present invention proposes a time-aware graph convolution attention network that efficiently captures static and dynamic user preferences by using different GNN methods. Specifically, static user preferences are captured by a lightweight GNN with node aggregation only, dynamic user preferences are captured by a time-aware GAT over the most recently interacted items, and the two preference representations are combined and input into a two-channel DNN model composed of DMF and MLP for feature interaction learning and matching score prediction.
The invention has the beneficial effects that:
1. an efficient user preference learning model is designed, and the framework can simultaneously capture the static and dynamic preferences of a user in an end-to-end mode.
2. A time-aware attention network model is proposed that estimates the contribution weight of each historical interaction item to modeling of dynamic user preferences in terms of the current user, target item, recent historical interaction items, and their interaction times.
3. Experiments on four datasets evaluate the performance of the method in collaborative filtering (CF) recommendation; the results show that it significantly outperforms current state-of-the-art recommendation methods.
The features and advantages of the present invention will be described in detail by embodiments in conjunction with the accompanying drawings.
[ description of the drawings ]
FIG. 1 is an overall architecture diagram of the time-aware lightweight graph convolution attention network LightGCAN;
FIG. 2 is a comparison graph of HR @10 with different numbers of potential factors;
FIG. 3 is a comparison graph of NDCG @10 with different numbers of potential factors;
FIG. 4 is a HR @10 comparison graph with different historical interaction terms;
FIG. 5 is a comparison graph of NDCG @10 with different historical interaction terms;
FIG. 6 is a comparison graph of HR @10 with different numbers of hidden layers;
FIG. 7 is a comparison graph of NDCG @10 with different numbers of hidden layers.
[ Detailed Description ]
1 basic knowledge
Some of the symbols used herein are described below. Bold italic capital letters (e.g., X) and bold italic lowercase letters (e.g., x) denote matrices and vectors, respectively; X_ij denotes the entry in row i and column j of matrix X; and the symbols ⊙ and ∥ denote element-wise multiplication and vector concatenation, respectively. Table 1 summarizes further symbols and their descriptions:
TABLE 1. Description of the symbols used
1.1 Graph Convolution Network (GCN)
The GCN is a neural network model for graph data structures, which takes graph data as input, learns graph node representations in a low-dimensional embedding space for each layer, and uses the output of the previous layer as input for the next layer. For simplicity, only the implementation details of one layer of the GCN will be presented next.
Node embedding in a GCN layer involves two main operations: node aggregation and feature transformation. The node embedding process can be abstracted as:

x_i' = f_t( f_a( x_i, {x_j : j ∈ neighbor(i)} ) )    (1)

where x_i and x_j denote the embedded representations of target node i and its neighboring nodes j, x_i' ∈ R^d denotes the latent embedding of the target node (d < f), and f_a(·) and f_t(·) denote the node aggregation function and the feature transformation function, respectively.
Node aggregation improves the representation of a target node by collecting information from its neighboring nodes. The rationale is that the attributes of a target node can usually be reflected, to some extent, by the attributes of its neighbors. In recent years, research on GCN has focused on constructing different node aggregation functions to capture information from the neighborhood. For example, an average pooling function filters out the common attributes of neighboring nodes, while a max pooling function extracts representative features from the neighbors.
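The two pooling aggregators mentioned above can be sketched in pure Python over lists of neighbor feature vectors (function names are illustrative, not from the patent):

```python
def mean_pool(neighbor_vecs):
    """Average the neighbors' feature vectors element-wise,
    filtering out their common attributes."""
    n = len(neighbor_vecs)
    dim = len(neighbor_vecs[0])
    return [sum(v[d] for v in neighbor_vecs) / n for d in range(dim)]

def max_pool(neighbor_vecs):
    """Keep the element-wise maximum across neighbors,
    extracting representative features."""
    dim = len(neighbor_vecs[0])
    return [max(v[d] for v in neighbor_vecs) for d in range(dim)]
```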
By mapping the target node from the input representation space to the latent embedding space, feature transformation makes the representation of the target node more comprehensive. In a conventional GCN, feature transformation is generally defined as a matrix mapping followed by a nonlinear activation function, abstracted as:

x_i' = σ(W x_i + b)    (2)

where W and b denote the mapping matrix and the bias vector, respectively, and σ is the activation function.
1.2 Graph attention network (GAT)
In embedding learning for a target node, the GAT distinguishes the different roles of its neighbor nodes with an attention mechanism. Like the GCN, the GAT is also multi-layered, learning hidden node representations layer by layer. For simplicity, only one GAT layer is described in detail below.
Taking a set of node features as input, each GAT layer generates a new set of node representations, possibly of a different dimension, as its output:

x_i' = σ( Σ_{j∈neighbor(i)} α_ij W x_j )    (3)

where W ∈ R^{f'×f} denotes a shared weight matrix for feature transformation, f and f' denote the dimensions of the input and output node feature vectors, respectively, and α_ij denotes the learned importance weight of neighbor node j for target node i.
In general, a shared self-attention mechanism a : R^{f'} × R^{f'} → R is used to compute the influence weight (attention coefficient) of each neighbor:

e_ij = a(W x_i, W x_j)    (4)

To make the attention coefficients comparable across the different neighbors of a target node, they are typically normalized with the softmax function:

α_ij = exp(e_ij) / Σ_{k∈neighbor(i)} exp(e_ik)    (5)
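The softmax normalization of raw attention coefficients described above can be sketched in pure Python (the max-subtraction is a standard numerical-stability trick, an implementation choice not stated in the patent):

```python
import math

def softmax_normalize(scores):
    """Normalize raw attention coefficients over a node's neighbors
    so that the weights are positive and sum to 1."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```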
2 Lightweight graph convolution attention network (LightGCAN)
2.1 Overall framework
The overall framework of LightGCAN is shown in FIG. 1 and comprises five layers: an input layer, an embedding layer, a representation layer, an interaction layer, and an output layer. The input layer consists of two matrices: the user-item interaction matrix R ∈ {0,1}^(m×n) and the interaction time matrix T, where m and n denote the numbers of users and items, respectively. R is an implicit feedback matrix: r_ui = 1 if there is an interaction between user u and item i, and r_ui = 0 otherwise. T records the interaction time between users and items as timestamps and has the same dimensions as R. The input layer provides the initial feature representations x_u and x_i of the user and the item; they are multi-hot vectors, corresponding to the u-th row and the i-th column of R, respectively. The embedding layer is a fully connected layer that converts the sparse user and item representations into dense latent embeddings, which are then used as input to the representation layer for user preference modeling. The representation layer contains two GNN models, LightGCN and TGAT, for static and dynamic user preference modeling, respectively; the resulting static and dynamic user preferences are combined and sent to the interaction layer for high-order feature interaction learning. The interaction layer comprises two DNN models, DMF and MLP, which learn different feature interactions with different deep learning strategies. Finally, the resulting feature interaction vectors are concatenated and sent to the output layer to predict the user-item matching score.
2.2 static user preference modeling
User preferences can generally be divided into two categories: static preferences and dynamic preferences. Static preferences refer to the relatively fixed interests and hobbies that users develop over time. Dynamic preferences refer to the user's short-term preferences at the current time. In the CF recommendation, the most common way to capture the user's static preferences is to use historical interaction information. Implicit feedback information is used in the present invention as a data source for static user preference modeling.
In the CF recommendation system, the relationship between a user and an item is generally described as a graph structure. GCN has been widely used for interactive graph-based user and item representation learning in view of its powerful representation learning capabilities in graph structured data. In the invention, a lightweight GCN model LightGCN is used for static user preference modeling. LightGCN only retains neighbor aggregation operations in GCN, and discards complex feature transformation and nonlinear activation operations which are meaningless for recommendation tasks.
2.2.1 lightweight graph convolution
LightGCN is a simplified, multi-layer GCN. Formally, the user (item) embedding at layer k+1 is defined as a weighted-sum aggregation:

e_u^(k+1) = Σ_{i∈N_u} e_i^(k) / √(|N_u||N_i|)    (6)
e_i^(k+1) = Σ_{u∈N_i} e_u^(k) / √(|N_i||N_u|)    (7)

where e_u^(k) and e_i^(k) denote the embeddings of user u and item i at layer k, and N_u and N_i denote the neighbor nodes of user u and item i, respectively. The factor 1/√(|N_u||N_i|) serves as a normalization term to avoid scale explosion during feature aggregation. The only trainable parameters are the first-layer user and item embeddings, since the higher-layer embeddings can be computed automatically, layer by layer, through the above iterative process.
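One propagation step of this aggregation can be sketched in pure Python with dictionary-based adjacency (function and variable names are illustrative, not from the patent; embeddings are plain lists of floats):

```python
import math

def lightgcn_layer(user_emb, item_emb, u_neighbors, i_neighbors):
    """One lightweight-graph-convolution step: each user embedding becomes
    the normalized sum of its neighbor items' embeddings (and vice versa),
    with no feature transformation and no nonlinearity."""
    def agg(target_neigh, source_emb, other_neigh):
        out = {}
        dim = len(next(iter(source_emb.values())))
        for node, neigh in target_neigh.items():
            vec = [0.0] * dim
            for j in neigh:
                # symmetric normalization 1/sqrt(|N_target| * |N_j|)
                norm = math.sqrt(len(neigh) * len(other_neigh[j]))
                for d in range(dim):
                    vec[d] += source_emb[j][d] / norm
            out[node] = vec
        return out

    new_users = agg(u_neighbors, item_emb, i_neighbors)
    new_items = agg(i_neighbors, user_emb, u_neighbors)
    return new_users, new_items
```

Because the step is a plain normalized sum, stacking it K times only ever mixes existing embeddings; all learning pressure falls on the layer-0 vectors.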
Specifically, the first-layer user and item embeddings are given by:

e_u^(0) = W_u x_u    (8)
e_i^(0) = W_v x_i    (9)

where W_u and W_v denote the weight matrices that convert the initial user and item feature vectors into latent embedded representations.
2.2.2 layer polymerization
After K lightweight graph convolution layers, K+1 different user (item) embeddings are produced, each representing different latent semantic information. An intuitive idea is to combine the per-layer embeddings to generate the embedding of the target user (item):

e_u^s = Σ_{k=0}^{K} α_k e_u^(k)    (10)
e_i^s = Σ_{k=0}^{K} α_k e_i^(k)    (11)

where e_u^s and e_i^s denote the static user preference and the item feature representation, respectively, and α_k ≥ 0 denotes the importance weight of the layer-k embedding. In our experiments, α_k is uniformly set to 1/(K+1), which achieves good performance; this setting strategy avoids complicating LightGCN while maintaining its simplicity and efficiency.
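With the uniform weights described above, the layer combination reduces to a simple element-wise average of the per-layer embeddings, as this small sketch shows (names are illustrative):

```python
def combine_layers(layer_embs):
    """Combine a node's K+1 per-layer embeddings with uniform weights
    alpha_k = 1/(K+1), i.e. an element-wise average."""
    k_plus_1 = len(layer_embs)
    dim = len(layer_embs[0])
    return [sum(e[d] for e in layer_embs) / k_plus_1 for d in range(dim)]
```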
2.3 dynamic user preference modeling
Static user preference modeling is based on the entire interaction history of the target user and completely ignores the drift of user preferences over time. A user facing different items at different times has different interests and preferences; these are short-term preferences. To capture them, one should rely on the items the user has interacted with recently, which reflect the user's current interests and preferences better than items interacted with long ago. Furthermore, the interaction times should be taken into account in dynamic user preference modeling, as they can reveal the drift of user preferences over time.
Herein, we propose a time-aware GAT model (TGAT) to capture dynamic user preferences. Unlike existing attention networks that model dynamic user preferences from the entire interaction history or only from the current session, TGAT uses the user's last k interacted items. In addition, using fewer items reduces the computational complexity of user preference modeling. To make the historical interaction vectors of all users equal in length, when a user has fewer than k historical items, the vector is padded with a placeholder constant (e.g., -1).
First, the embeddings of the current user, the target item, the recently interacted items and their interaction times are combined and input to the attention network, which learns the importance weight of each recent item for dynamic user preference modeling:

x_j = e_u ∥ e_v ∥ e_j ∥ e_j^t    (12)
a_j^(k) = σ(W_k a_j^(k-1) + b_k),  a_j^(0) = x_j    (13)

where W_k and b_k denote the weight matrix and bias vector of the k-th layer of the attention network, σ(·) denotes the activation function, and e_u, e_v, e_j and e_j^t denote the embeddings of the current user, the target item, recently interacted item j and its interaction time, respectively. x_j, the concatenation of these four embeddings, is the input of the attention network, and the output of its last layer is the raw attention coefficient a_j.
The time interval between the interaction time of a historical item and the current time is discretized in minutes, and the embedded representation of the interaction time is then obtained through a linear transformation:

ts_j = min((T - t_j)/60, δ)    (14)
e_j^t = W_t ts_j    (15)

where t_j denotes the current user's interaction time (in seconds) with item j, T is the current time at prediction, ts_j denotes the time interval (in minutes) between the interaction time and the prediction time, the min function caps the interval at the threshold δ, and W_t denotes the time-embedding transformation matrix.
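The discretization step above can be sketched in one line of pure Python; the default threshold value is an assumed placeholder, since the patent does not fix δ here:

```python
def discretize_interval(t_now, t_j, delta=480):
    """Map an interaction timestamp (seconds) to a minute-level interval
    capped at delta: ts_j = min((T - t_j)/60, delta).
    delta=480 is an illustrative assumption, not a value from the patent."""
    return min((t_now - t_j) / 60, delta)
```

Capping the interval keeps very old interactions from producing unbounded time features while preserving the ordering of recent ones.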
Then, to make them comparable across the user's different historical interaction items, the attention coefficients are normalized with the softmax function:

α_j = exp(a_j) / Σ_{j'∈RK_u} exp(a_{j'})    (16)

where RK_u denotes the last k interacted items of user u.
Finally, the dynamic user preference vector is modeled as the weighted sum of the embedded representations of the current user's last k interacted items:

e_u^d = Σ_{j∈RK_u} α_j e_j    (17)
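The weighted sum above can be sketched in pure Python, pairing each recent item's embedding with its normalized attention weight (names are illustrative):

```python
def dynamic_preference(attn_weights, item_embs):
    """Dynamic user preference as the attention-weighted sum of the
    last-k interacted items' embedding vectors."""
    dim = len(item_embs[0])
    return [sum(a * e[d] for a, e in zip(attn_weights, item_embs))
            for d in range(dim)]
```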
2.4 feature Interactive learning and Scoring prediction
The static and dynamic user preference representations are concatenated to obtain the user preference representation. Likewise, the item embeddings obtained from the embedding layer and from LightGCN are combined to obtain the item feature representation:

e_u = e_u^s ∥ e_u^d    (18)
e_i = e_i^s ∥ e_i^(0)    (19)

where e_u^s and e_i^s denote the static user preference and the item feature, respectively, e_u^d denotes the dynamic user preference, e_i^(0) denotes the embedding-layer representation of item i, and e_u and e_i are the resulting user preference vector and item feature vector. The generated user and item representations are used as input to the high-order feature interaction learning models DMF and MLP.
2.4.1 DMF-based feature interaction learning
The DMF model is a multi-layer two-channel structure built from a user component and an item component. In each component, the output of the current layer serves as the input of the next layer. In each layer, the input vector is projected to a hidden vector by a linear transformation followed by a nonlinear activation:
wherein, the first two symbols denote the hidden representations of user u and item i at the k-th layer, respectively; the next two denote the weight matrix and bias vector of the k-th layer of the user component; and the last two denote the weight matrix and bias vector of the k-th layer of the item component.
Through the iterative learning of the multi-layer DMF model, the user and item representations are mapped into a low-dimensional latent embedding space:
wherein, L_1 denotes the number of layers of the DMF model, and p_u and q_i denote the learned latent representations of user u and item i, respectively.
The user-item feature interaction is then defined as the product of the user and item latent representation vectors:
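A minimal sketch of the two-channel DMF computation, under stated assumptions: plain Python, ReLU assumed as the activation, and the user-item interaction taken as an element-wise product. The helper names are illustrative, not from the invention:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    # One layer: linear transformation followed by nonlinear activation.
    return relu([sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
                 for row, b_i in zip(W, b)])

def dmf_channel(x, layers):
    # One channel (user or item): the output of each layer feeds the next.
    for W, b in layers:
        x = dense(x, W, b)
    return x

def dmf_interaction(p_u, q_i):
    # Feature interaction as the element-wise product of the two
    # latent representations (assumed interpretation of "product").
    return [p * q for p, q in zip(p_u, q_i)]
```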
2.4.2 MLP-based feature interaction learning
DMF learns the latent representations of the user and the item in two independent channels and finally connects them, which is intuitive. However, the interactions between the latent factors of users and items cannot be adequately captured by vector concatenation alone. Therefore, another deep learning model, the MLP, is used: it first combines the feature vectors of the user and the item, and then stacks several hidden layers on the combined vector to learn high-level user-item feature interactions:
z_0 = [e_u || e_i] (25)
wherein, W_k, b_k and α_k denote the weight matrix, bias vector and activation function of the k-th layer, respectively; H_1 denotes the weight matrix of the output layer; L_2 denotes the number of layers of the model; and the final output is the higher-order feature interaction vector learned by the MLP model.
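A minimal sketch of the MLP branch (plain Python; ReLU is assumed as the activation α_k, and the function name is illustrative):

```python
def mlp_interaction(e_u, e_i, layers):
    # z_0 = [e_u || e_i]: concatenate the user and item vectors, then
    # pass the result through the hidden layers (linear + ReLU) to learn
    # higher-order user-item feature interactions.
    z = list(e_u) + list(e_i)
    for W, b in layers:
        z = [max(0.0, sum(w * x for w, x in zip(row, z)) + b_k)
             for row, b_k in zip(W, b)]
    return z
```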
2.4.3 Matching score prediction
So far, two types of higher-order feature interaction vectors have been obtained. The DMF model uses a two-channel structure to model the latent representations of users and items and then computes the interaction vector; the MLP model first integrates the user and item representations and then learns the interaction vector with a typical DNN. To retain the advantages of both models, DMF and MLP are fused into one integrated model. To give the integrated model maximum flexibility, DMF and MLP are run separately, their outputs are combined by vector concatenation, and the combined embedding vector is finally fed to the output layer of LightGCAN for matching score prediction:
wherein, H_2 denotes the weight matrix of the output layer.
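The fusion step can be sketched as follows (plain Python; a sigmoid output activation is assumed for producing a score in [0, 1], and the function name is illustrative):

```python
import math

def predict_score(phi_dmf, phi_mlp, h, bias=0.0):
    # Concatenate the DMF and MLP interaction vectors, then apply the
    # output layer (weight vector h) with a sigmoid to obtain the
    # predicted user-item matching score.
    z = list(phi_dmf) + list(phi_mlp)
    logit = sum(w * x for w, x in zip(h, z)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```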
2.4.4 Model training
The point-wise and pair-wise objective functions are both commonly used for training recommendation models; for simplicity, the point-wise approach is adopted in the present invention. Since LightGCAN is a CF model based on implicit feedback, the binary cross-entropy loss is used as the objective function to minimize the difference between the predicted matching scores and the implicit feedback:
wherein, the first symbol is the predicted matching score between user u and item i, r_ui is the implicit feedback observed in the interaction matrix R, R⁺ and R⁻ denote the positive and negative sample sets, respectively, and Θ denotes the parameters of the model.
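A sketch of the binary cross-entropy objective over positive and sampled negative instances (plain Python; the eps clamp is an implementation detail added here for numerical safety, not part of the original formulation):

```python
import math

def bce_loss(preds, labels, eps=1e-12):
    # Mean binary cross entropy between predicted matching scores and
    # implicit feedback labels (r = 1 for positives, r = 0 for negatives).
    n = len(preds)
    return -sum(r * math.log(p + eps) + (1 - r) * math.log(1 - p + eps)
                for p, r in zip(preds, labels)) / n
```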
3 Experiments and analysis
To answer the following research questions, a series of experiments is performed below and the results are analyzed in detail:
RQ1. Does the proposed recommendation model LightGCAN outperform existing CF recommendation models on the Top-k recommendation task?
RQ2. How do the different components of LightGCAN contribute to the recommendation task?
RQ3. How do different hyper-parameter settings of LightGCAN affect the recommendation performance?
3.1 Experimental setup
3.1.1 Datasets
To evaluate the recommendation performance of LightGCAN, experiments are performed on four real-world datasets of different domains and scales: MovieLens 100K (ml-100k), MovieLens 1M (ml-1m), Amazon Music (Amusic) and Amazon Toys (Atoy). Table 2 gives detailed statistics of the experimental datasets:
TABLE 2 Statistical information of the datasets
3.1.2 Evaluation strategy and performance metrics
In the experiments, the leave-one-out strategy is used for performance evaluation: the most recent interaction item of each user is held out as the test sample, and the remaining interaction items are used for training. Since ranking all items for every test case is time-consuming, 100 non-interacted items are randomly sampled for each user and ranked together with the test item according to the predicted matching scores. Two popular metrics are used to evaluate ranking performance: Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG). The length of the ranking list is set to 10 to evaluate Top-10 recommendation performance.
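With a single held-out test item per user, the two metrics reduce to simple per-user formulas, sketched here in plain Python (function names are illustrative):

```python
import math

def hit_rate(rank_list, test_item):
    # HR@k: 1 if the held-out test item appears in the top-k list, else 0.
    return 1.0 if test_item in rank_list else 0.0

def ndcg(rank_list, test_item):
    # NDCG@k with one relevant item: 1 / log2(pos + 2) at 0-based rank pos
    # (the ideal DCG is 1, so no further normalization is needed).
    for pos, item in enumerate(rank_list):
        if item == test_item:
            return 1.0 / math.log2(pos + 2)
    return 0.0
```

Averaging these per-user values over all test users gives the reported HR@10 and NDCG@10.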
3.1.3 Compared methods
The following CF methods are chosen for comparison with the method of the present invention:
ItemPop is a statistics-based method that ranks items by their popularity (interaction frequency). It is commonly used as a baseline for CF recommendation.
eALS (Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua, 2016. Fast matrix factorization for online recommendation with implicit feedback. In SIGIR. 549-558.) is a matrix factorization (MF) based CF method with a fast parameter learning technique. It treats observed and unobserved interactions as positive and negative samples, respectively.
DMF (Hong-Jian Xue, Xinyu Dai, Jianbing Zhang, Shujian Huang, and Jiajun Chen, 2017. Deep matrix factorization models for recommender systems. In IJCAI. 3203-3209.) is a DL-based representation learning CF method. It trains the model using a normalized binary cross-entropy loss function.
NeuMF (Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua, 2017. Neural collaborative filtering. In WWW. 173-182.) is a DL-based matching score prediction CF method that learns feature interactions with two sub-models, GMF and MLP, using different learning strategies. The outputs of the two models are combined for matching score prediction.
DeepCF (Zhi-Hong Deng, Ling Huang, Chang-Dong Wang, Jian-Huang Lai, and Philip S. Yu, 2019. DeepCF: A unified framework of representation learning and matching function learning in recommender system. In AAAI. 61-68.) is a DL-based integrated CF method that combines representation learning and matching function learning using two different deep learning strategies.
3.2 Recommendation performance comparison (RQ1)
Table 3 shows the Top-10 recommendation performance of the different methods. The tests on the Amusic and Atoy datasets do not include the time-embedding component, because these two datasets contain no time information. The following observations can be made from the table:
All methods perform best on the ml-100k dataset, followed by ml-1m, then Amusic, and finally Atoy. Since the data sparsity of these datasets increases in that order, this result indicates that the sparsity of the training data has a large impact on the recommendation performance of CF methods.
The MF-based method eALS outperforms the statistics-based method ItemPop. This shows that, although the latent factor model is simple in principle and has only linear modeling capability, it is effective for feature interaction learning and CF recommendation.
The advantage of the DL-based methods (DMF and NeuMF) over the traditional MF method eALS suggests that deep learning has strong advantages in both representation learning and matching function learning, which is very beneficial for CF.
The integrated methods (NeuMF and DeepCF) perform better on all datasets than the independent method (DMF). This is because each component of an integrated model makes predictions according to a different strategy, and combining them preserves the advantages of each component, making the model more robust.
Among the DL-based methods, DeepCF outperforms NeuMF. Analysis shows that DeepCF uses two different deep learning strategies for feature interaction learning, while NeuMF combines a linear model with a DL model. This further indicates that deep learning models are clearly better than linear models at feature interaction learning.
The proposed method LightGCAN performs best on all datasets, with significant and stable improvements over the compared methods. On the ml-100k dataset, the HR and NDCG of LightGCAN are 6.3% and 9.7% higher, respectively, than those of the state-of-the-art DeepCF method; on the ml-1m dataset, they are 4.7% and 16.1% higher, respectively.
TABLE 3 Performance comparison of different recommendation methods
3.3 Ablation analysis (RQ2)
The proposed method LightGCAN contains two key parts: static user preference modeling and dynamic user preference modeling. To verify the role of these two components in the CF recommendation task, ablation experiments were performed in which recommendations are made using only static modeling, only dynamic modeling, and the integration of both; the results are shown in Table 4. As the table shows, dynamic modeling yields better recommendation performance than static modeling, indicating that dynamic preference modeling that accounts for temporal factors better captures a user's current preferences. However, when time information is unavailable, dynamic modeling cannot be applied, and only static modeling, whose data sources are readily available, can be used to obtain long-term user preferences. Static modeling is therefore the common modeling approach in CF recommendation, which is why the model of the present invention combines static and dynamic modeling. The experimental results show that combining the two modeling methods achieves the best recommendation performance.
TABLE 4 Performance comparison of different user modeling methods
In LightGCAN, a new attention network is designed for dynamic user preference modeling that jointly considers several attention elements: the current user, the target item, the last k interaction items, and the interaction time. To demonstrate the effectiveness of these elements, experiments were conducted on different combinations of them. Since conventional methods use only the historical interaction items for user preference modeling, each of the other three elements is combined with the historical interaction items to observe its effect; the results are shown in Table 5. As the table shows, every attention element has a positive impact on dynamic user preference modeling, and combining all of them yields the best performance. Among them, the interaction time plays the most critical role, yet existing methods often ignore it; this is the novelty of the attention mechanism proposed by the present invention.
TABLE 5 Performance comparison of different attention mechanisms
3.4 Parameter sensitivity analysis (RQ3)
The impact of different hyper-parameter settings on recommendation performance is investigated below. The hyper-parameters of the model of the present invention include the number of latent factors, the number of recent interaction items, and the number of hidden layers in the DL models. Figures 2 and 3 show the HR and NDCG performance with different numbers of latent factors in user (item) representation learning. As the figures show, recommendation performance improves as the number of latent factors increases, indicating that the model achieves better performance by considering more influencing factors in representation learning. However, more factors inevitably lead to higher computational complexity, so it is common practice to stop increasing the number of latent factors once the performance gain becomes insignificant. In the present invention, the number of latent factors in the user and item embeddings is set to 64.
Figures 4 and 5 show the HR and NDCG for different numbers of historical interaction items during dynamic user preference modeling on the ml-100k dataset. As the figures show, the most suitable number of historical interaction items on ml-100k is 20; using too few or too many items degrades performance. With too few historical interaction items there is not enough information to learn user preferences, while too many inevitably introduce noise, since items interacted with long ago do not reflect the user's current preferences. The best number of historical interaction items varies across datasets and must be determined experimentally.
In general, the number of hidden layers of a DL model has a significant impact on its predictions. To show the effect of model depth on representation learning and feature interaction learning, the performance of LightGCAN was evaluated with different numbers of hidden layers. Figures 6 and 7 show the HR and NDCG performance with different numbers of hidden layers in DMF and MLP, respectively. Initially, performance improves as the number of hidden layers increases; once the number reaches three, performance begins to degrade. This indicates that with three hidden layers the learning capacity of the model is saturated, and adding more layers leads to overfitting. The number of hidden layers of the DMF and MLP models is therefore set to 3 in the present invention.
4 Conclusion
As is well known, CF recommendation comprises two stages: representation learning and matching function learning. Representation learning has evolved from matrix factorization to deep learning, and matching function learning has evolved from dot products and factorization to deep learning. The best-performing strategy for matching function learning is currently the dual-channel DL-based structure, which is therefore retained in the present invention. The innovation of the present invention lies in static and dynamic user preference modeling: the possibility of combining long-term and short-term user preferences for user representation learning and CF recommendation is explored, and a new user representation learning method comprising static and dynamic user preference modeling is proposed. In the static user preference modeling stage, a lightweight GCN extracts long-term user preferences; in the dynamic user preference modeling stage, a time-aware GAT models short-term user preferences. The long-term and short-term preferences are then combined and fed to a dual-channel DL model for feature interaction learning and matching score prediction. Experimental results on four datasets show that the method of the present invention significantly outperforms existing CF recommendation methods.
The above embodiments are illustrative of the present invention and are not intended to limit it; any simple modification of the present invention falls within the scope of protection of the present invention.
Claims (10)
1. A user preference modeling method based on a lightweight graph convolution attention network, characterized by comprising the following steps:
s1, modeling static user preferences by using a lightweight GCN with only neighborhood aggregation;
s2, modeling dynamic user preference by using time perception GAT based on recent interactive items;
and S3, combining the static user preference and the dynamic user preference, inputting them into a dual-channel deep neural network model, and performing feature interaction learning and matching score prediction.
2. The user preference modeling method based on a lightweight graph convolution attention network of claim 1, characterized in that: the time-aware lightweight graph convolution attention network LightGCAN comprises: an input layer, an embedding layer, a representation layer, an interaction layer and an output layer;
the input layer includes two matrices: user-item interaction matrixAnd interaction time matrixWhere m and n represent the number of users and items, respectively, R is an implicit feedback matrix, and R is the number of users and items if there is an interaction between user u and item i ui =1, noThen r is ui =0,T records the time of interaction between the user and the item by means of a time stamp, with dimensions identical to R, said input layer providing an initial characteristic representation of the user and the itemAndx u 、x i all the heat vectors are multiple heat vectors and respectively correspond to the u-th row and the i-th column of the R;
the embedding layer is a fully connected layer that converts the sparse user and item representations into dense latent embedding representations, which serve as input to the representation layer for user preference modeling;
the representation layer comprises two GNN models, LightGCN and TGAT, used for static and dynamic user preference modeling, respectively; the resulting static and dynamic user preferences are combined and sent to the interaction layer for higher-order feature interaction learning;
the interaction layer includes two DNN models: and the DMF and the MLP are used for learning different feature interactions according to different deep learning strategies, and finally serially connecting the obtained feature interaction vectors and transmitting the feature interaction vectors to an output layer for predicting the user-project matching score.
3. The user preference modeling method based on a lightweight graph convolution attention network of claim 2, characterized in that: the attention elements of the time-aware lightweight graph convolution attention network LightGCAN comprise the current user, the target item, the last k interaction items, and the interaction time.
4. The user preference modeling method based on a lightweight graph convolution attention network of claim 2, characterized in that: the hyper-parameters of the time-aware lightweight graph convolution attention network LightGCAN comprise the number of latent factors, the number of recent interaction items, and the number of hidden layers in the DL models.
5. The user preference modeling method based on a lightweight graph convolution attention network of claim 2, characterized in that: the number of hidden layers of the DMF and MLP models is set to 3.
6. The user preference modeling method based on a lightweight graph convolution attention network of claim 1, characterized in that: in step S1, the lightweight GCN model LightGCN is used for static user preference modeling; LightGCN retains only the neighbor aggregation operation of the GCN and removes the feature transformation and nonlinear activation operations.
7. The user preference modeling method based on a lightweight graph convolution attention network of claim 6, characterized in that: in step S1, the modeling process of the static user preference comprises the following steps:
s1.1 convolution of lightweight graph: user embedding at layer k +1 is defined as an aggregation operation based on weighted summation:
wherein, the first two symbols denote the embeddings of user u and item i at the k-th layer, respectively; N_u and N_i denote the neighbor nodes of user u and item i, respectively; the next symbol is the normalization term; the only parameters that require learning are the first-layer user and item embeddings, and the embeddings of higher layers are obtained automatically, layer by layer, through this iterative process;
the user and item embedding of the first layer is represented as:
wherein, W_u and W_v denote the weight matrices that convert the initial feature vectors of the user and the item into latent embedding representations, respectively;
s1.2 layer polymerization: after K layers of lightweight graph convolution operation, K different user/project embeddings are generated, and each layer of embeddings represent different potential semantic information; combining the embeddings of each layer to generate an embeddings of the target user/item:
8. The user preference modeling method based on a lightweight graph convolution attention network of claim 7, characterized in that: in step S1.2, α_k is set to 1/(K+1).
9. The user preference modeling method based on a lightweight graph convolution attention network of claim 1, characterized in that: in step S2, the modeling process of the dynamic user preference comprises the following steps:
s2.1, combining the current user, the target item, the latest interactive item and the embedded expression of the interactive time, and inputting the combined expression into an attention network; the attention network is responsible for learning importance weights of recent interactive items for user dynamic preference modeling:
wherein, W_k and b_k denote the weight matrix and bias vector of the k-th layer of the attention network; B_k denotes the number of layers of the attention network; σ(·) denotes the activation function; e_u and e_v denote the embeddings of the current user and the target item, respectively, and together with the embeddings of the most recent interaction item and of the interaction time they form x_j, the concatenation of the four embeddings, which serves as the input to the attention network;
s2.2, dividing the time interval between the interaction time of the historical interaction item and the current time, and then obtaining the embedded expression of the interaction time through linear transformation:
ts j =min((T-t j )/60,δ)
wherein, t_j denotes the current user's interaction time with item j, T is the current time at prediction, ts_j denotes the time interval between the interaction time and the prediction time, the min function caps the interval at the threshold δ, and W_t denotes the transformation matrix of the time embedding;
s2.3 normalization of the attention coefficient by the softmax function:
wherein RK_u denotes the last k interaction items of user u;
s2.4 the user dynamic preference vector is modeled as a weighted sum of the embedded representations of the current user' S most recent k interactive items:
10. The user preference modeling method based on a lightweight graph convolution attention network of claim 2, characterized in that: step S3 specifically comprises the following steps:
s3.1, combining static and dynamic user preference representations in a vector cascade mode to obtain a user preference representation; the item embedding vectors obtained from the embedding layer and the LightGCN are also combined to obtain an item feature representation:
wherein, the first two symbols denote the static user preference and the item features, respectively; the next denotes the embedding of user u; e_u and e_i denote the user preference vector and the item feature vector, respectively; and the last symbol denotes the embedding-layer representation of item i; the generated user and item representations serve as input to the higher-order feature interaction models DMF and MLP;
s3.2 feature interactive learning based on DMF: the DMF model has a multi-layer two-channel structure based on user components and project components, in each of which the output of the current layer is used as the input of the next layer; in each layer, the input vector is projected as a hidden vector by linear transformation and nonlinear activation operations:
wherein, the first two symbols denote the hidden representations of user u and item i at the k-th layer, respectively; the next two denote the weight matrix and bias vector of the k-th layer of the user component; and the last two denote the weight matrix and bias vector of the k-th layer of the item component;
through the iterative learning of the multi-layer DMF model, the user and item representations are mapped into a low-dimensional latent embedding space:
wherein, L_1 denotes the number of layers of the DMF model, and p_u and q_i denote the learned latent representations of user u and item i, respectively;
the user-item feature interaction is defined as the product of the user and item latent representation vectors;
s3.3, feature interactive learning based on MLP: MLP is a typical deep learning model that first combines the feature vectors of users and projects, and then learns the high-level user-project feature interactions through multiple hidden layers on top of it:
z 0 =[e u ||e i ]
wherein, W_k, b_k and α_k denote the weight matrix, bias vector and activation function of the k-th layer, respectively; H_1 denotes the weight matrix of the output layer; L_2 denotes the number of layers of the model; and the final output is the higher-order feature interaction vector learned by the MLP model;
s3.4 matching score prediction: firstly, respectively operating DMF (dimethyl formamide) and MLP (multi-level prediction), then combining output vectors of two models in a vector cascade mode, and finally inputting the combined embedded vector into an output layer of LightGCAN to perform matching fraction prediction:
wherein, H_2 denotes the weight matrix of the output layer;
s3.5 model training: lightGCAN is a CF model based on implicit feedback information, using binary cross entropy loss as an objective function to minimize the difference between the predicted matching score and the implicit feedback information:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211044168.3A CN115438258A (en) | 2022-08-30 | 2022-08-30 | User preference modeling method based on lightweight graph convolution attention network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115438258A true CN115438258A (en) | 2022-12-06 |
Family
ID=84244994
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117235375A (en) * | 2023-07-20 | 2023-12-15 | 重庆理工大学 | User multi-behavior recommendation method based on graphic neural network and element learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111931062B (en) | Training method and related device of information recommendation model | |
Liu et al. | Contextualized graph attention network for recommendation with item knowledge graph | |
TWI754033B (en) | Generating document for a point of interest | |
US11074289B2 (en) | Multi-modal visual search pipeline for web scale images | |
CN106663124B (en) | Generating and using knowledge-enhanced models | |
CN111881342A (en) | Recommendation method based on graph twin network | |
CN113722611B (en) | Recommendation method, device and equipment for government affair service and computer readable storage medium | |
Tian et al. | When multi-level meets multi-interest: A multi-grained neural model for sequential recommendation | |
Yang et al. | Microblog sentiment analysis via embedding social contexts into an attentive LSTM | |
CN113378938B (en) | Edge transform graph neural network-based small sample image classification method and system | |
CA3004097A1 (en) | Methods and systems for investigation of compositions of ontological subjects and intelligent systems therefrom | |
Liao et al. | Coronavirus pandemic analysis through tripartite graph clustering in online social networks | |
CN110598084A (en) | Object sorting method, commodity sorting device and electronic equipment | |
CN114065048A (en) | Article recommendation method based on multi-different-pattern neural network | |
CN111506821A (en) | Recommendation model, method, device, equipment and storage medium | |
Mahmood et al. | Using artificial neural network for multimedia information retrieval | |
CN115438258A (en) | User preference modeling method based on lightweight graph convolution attention network | |
Tsinganos et al. | Utilizing convolutional neural networks and word embeddings for early-stage recognition of persuasion in chat-based social engineering attacks | |
Liu et al. | Joint user profiling with hierarchical attention networks | |
CN111079011A (en) | Deep learning-based information recommendation method | |
Xu et al. | Towards annotating media contents through social diffusion analysis | |
Zakir et al. | Convolutional neural networks method for analysis of e-commerce customer reviews | |
CN117688390A (en) | Content matching method, apparatus, computer device, storage medium, and program product | |
CN115408605A (en) | Neural network recommendation method and system based on side information and attention mechanism | |
Liu | POI recommendation model using multi-head attention in location-based social network big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||