CN115438258A - User preference modeling method based on lightweight graph convolution attention network

User preference modeling method based on lightweight graph convolution attention network

Info

Publication number
CN115438258A
CN115438258A (application CN202211044168.3A)
Authority
CN
China
Prior art keywords: user, layer, item, interaction, representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211044168.3A
Other languages
Chinese (zh)
Inventor
王瑞琴
蒋云良
楼俊钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huzhou University
Original Assignee
Huzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huzhou University
Priority to CN202211044168.3A
Publication of CN115438258A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90: Details of database functions independent of the retrieved data types
    • G06F 16/95: Retrieval from the web
    • G06F 16/953: Querying, e.g. by the use of web search engines
    • G06F 16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention provides a user preference modeling method based on a lightweight graph convolution attention network, comprising the following steps: S1, modeling static user preferences using a lightweight GCN with only neighborhood aggregation; S2, modeling dynamic user preferences using a time-aware GAT based on recent interaction items; S3, combining the static and dynamic user preferences, inputting them into a two-channel deep neural network model, and performing feature interaction learning and matching score prediction. The invention captures the static and dynamic preferences of a user simultaneously in an end-to-end manner, effectively modeling each with a different GNN method, and significantly outperforms current state-of-the-art recommendation methods.

Description

User preference modeling method based on lightweight graph convolution attention network
[ technical field ]
The invention relates to the technical field of user preference modeling and personalized recommendation, in particular to a user preference modeling method based on a lightweight graph convolution attention network.
[ background of the invention ]
With the rapid development of computing resources and the availability of large amounts of training data, researchers have begun to apply deep learning (DL) techniques to machine learning tasks such as speech recognition, machine translation, and recommendation. DL models have had great success on Euclidean data. However, data generated in non-Euclidean domains (such as graph structures) are now widely used and require further analysis. For example, in chemistry, molecules are represented as graph-based structures whose biological activities need to be determined for drug discovery; in e-commerce, interactions between users and products are modeled as graph structures for accurate product recommendation. A graph is a data format in which "nodes" represent individuals in a network and "edges" represent connections between individuals. With the great success of DL, researchers have attempted to design GNN architectures based on the ideas of deep auto-encoders, recursive networks, and convolutional networks. Early studies learned the representation of a target node by iteratively propagating neighborhood information until a steady state was reached. This learning process, also known as graph embedding, aims to convert graph nodes into low-dimensional vectors that preserve the network topology and node characteristics, so that subsequent graph processing tasks (classification, clustering, recommendation, etc.) can be accomplished by simple statistical and machine learning methods (dot product, cosine similarity, etc.).
Graph Neural Networks (GNN) are a promising learning technique for graph data representation. The Graph Convolution Network (GCN) and the Graph Attention Network (GAT) are the two main representative techniques in GNN; both learn an embedded representation of a target node by aggregating the embedded representations of neighboring nodes. Thanks to the advantages of GNN in graph representation learning, many GNN-based recommendation techniques have emerged to address different challenges on various graphs. The first is GCN, which learns user and item embeddings by aggregating information from neighbors in the graph through convolution and pooling operations. Owing to its powerful feature extraction and learning capabilities, GCN has gained wide application in remote sensing and has enjoyed great success. Another GNN-based technique is GAT, which introduces an attention mechanism into GNN to learn the different degrees of influence of neighbor nodes on a target node in an interaction graph. Owing to its good discriminative power, GAT has been widely used to build high-performance recommendation models. Although GNN-based recommendation methods have been highly successful, the complex structure of existing GNN methods makes them inefficient in tasks with large numbers of graph nodes, and the computation of GCN and GAT is unfriendly to tasks with complex graph structures. In addition, conventional GCN and GAT ignore the time factor in representation learning, so the resulting user preference model is static and cannot reflect the dynamic change of user preferences.
[ summary of the invention ]
The invention aims to solve the problems in the prior art and provides a user preference modeling method based on a lightweight graph convolution attention network, which can effectively capture static and dynamic user preferences by using different GNN methods.
In order to achieve the above purpose, the invention provides a user preference modeling method based on a lightweight graph convolution attention network, which comprises the following steps:
S1, modeling static user preferences using a lightweight GCN with only neighborhood aggregation;
S2, modeling dynamic user preferences using a time-aware GAT based on recent interaction items;
S3, combining the static user preferences and the dynamic user preferences, inputting them into a two-channel deep neural network model, and performing feature interaction learning and matching score prediction.
Preferably, the method is implemented as a time-aware lightweight graph convolutional attention network LightGCAN, which includes: an input layer, an embedding layer, a representation layer, an interaction layer and an output layer; the input layer includes two matrices: a user-item interaction matrix $R \in \{0,1\}^{m \times n}$ and an interaction time matrix $T \in \mathbb{R}^{m \times n}$, where m and n represent the number of users and items, respectively; R is an implicit feedback matrix: if there is an interaction between user u and item i then $r_{ui} = 1$, otherwise $r_{ui} = 0$; T records the interaction time between the user and the item by means of a timestamp and has the same dimensions as R; the input layer provides the initial feature representations of the user and the item, $x_u \in \{0,1\}^n$ and $x_i \in \{0,1\}^m$, which are multi-hot vectors corresponding to the u-th row and the i-th column of R, respectively; the embedding layer is a fully connected layer used to convert the sparse user and item representations into dense latent embeddings, which serve as the input of the representation layer for user preference modeling; the representation layer includes two GNN models, LightGCN and TGAT, used for static and dynamic user preference modeling, respectively, and the resulting static and dynamic user preferences are combined and sent to the interaction layer for high-order feature interaction learning; the interaction layer includes two DNN models, DMF and MLP, which learn different feature interactions according to different deep learning strategies; finally, the resulting feature interaction vectors are concatenated and passed to the output layer for predicting the user-item matching score.
Preferably, the attention elements of the time-aware lightweight graph convolutional attention network LightGCAN include the current user, the target item, the last k interaction items, and the interaction time.
Preferably, the hyper-parameters of the time-aware lightweight graph convolutional attention network LightGCAN include the number of latent factors, the number of recent interaction items, and the number of hidden layers in the DL models.
Preferably, the number of hidden layers of the DMF and MLP models is set to 3.
Preferably, in step S1, static user preference modeling is performed using the lightweight GCN model LightGCN, which retains only the neighbor aggregation operation of GCN, without feature transformation or nonlinear activation.
Preferably, in step S1, the modeling process of the static user preference includes the following steps:

S1.1 lightweight graph convolution: user and item embeddings at layer k+1 are defined as an aggregation operation based on weighted summation:

$$e_u^{(k+1)} = \sum_{i \in N_u} \frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}} e_i^{(k)}$$

$$e_i^{(k+1)} = \sum_{u \in N_i} \frac{1}{\sqrt{|N_i|}\sqrt{|N_u|}} e_u^{(k)}$$

where $e_u^{(k)}$ and $e_i^{(k)}$ represent the embeddings of user u and item i at the k-th layer, $N_u$ and $N_i$ represent the neighbor nodes of user u and item i, respectively, and $1/(\sqrt{|N_u|}\sqrt{|N_i|})$ serves as a normalization term; the only parameters to be learned are the first-layer user and item embeddings, since higher-layer embeddings are learned automatically, layer by layer, through this iterative process;

the user and item embeddings of the first layer are expressed as:

$$e_u^{(0)} = W_u x_u, \qquad e_i^{(0)} = W_v x_i$$

where $W_u$ and $W_v$ represent the weight matrices that convert the initial feature vectors of users and items into latent embeddings;

S1.2 layer aggregation: after K layers of lightweight graph convolution, K different user/item embeddings are generated, with each layer's embedding representing different latent semantic information; the embeddings of all layers are combined to generate the embedding of the target user/item:

$$e_u^s = \sum_{k=0}^{K} \alpha_k e_u^{(k)}, \qquad e_i^s = \sum_{k=0}^{K} \alpha_k e_i^{(k)}$$

where $e_u^s$ and $e_i^s$ represent the static user preference and the item features, respectively, and $\alpha_k \geq 0$ represents the importance weight of the k-th layer embedding.

Preferably, in step S1.2, $\alpha_k$ is set to 1/(K+1).
Preferably, in step S2, the modeling process of the dynamic user preference includes the following steps:

S2.1, the embeddings of the current user, the target item, a recent interaction item and its interaction time are combined and input to an attention network, which is responsible for learning the importance weights of recent interaction items for dynamic user preference modeling:

$$x_j = [e_u \,\|\, e_v \,\|\, e_j \,\|\, e_{ts_j}]$$

$$\hat{a}_{uj} = \sigma(W_L(\cdots \sigma(W_1 x_j + b_1) \cdots) + b_L)$$

where $W_k$ and $b_k$ represent the weight matrix and bias vector of the k-th layer of the attention network, L represents the number of layers of the attention network, and $\sigma(\cdot)$ represents the activation function; $e_u$, $e_v$, $e_j$ and $e_{ts_j}$ represent the embeddings of the current user, the target item, the recent interaction item and the interaction time, respectively, and $x_j$ concatenates the four embedding vectors into a combined vector used as the input of the attention network;

S2.2, the time interval between the interaction time of a historical interaction item and the current time is discretized, and the embedding of the interaction time is then obtained through a linear transformation:

$$ts_j = \min((T - t_j)/60, \delta)$$

$$e_{ts_j} = W_t \, ts_j$$

where $t_j$ represents the current user's interaction time with item j, T is the current time at prediction, $ts_j$ represents the time interval between the interaction time and the prediction time, the min function caps the time interval at the threshold $\delta$, and $W_t$ represents the transformation matrix of the time embedding;

S2.3, the attention coefficients are normalized by the softmax function:

$$a_{uj} = \frac{\exp(\hat{a}_{uj})}{\sum_{j' \in RK_u} \exp(\hat{a}_{uj'})}$$

where $RK_u$ represents the last k interaction items of user u;

S2.4, the dynamic user preference vector is modeled as a weighted sum of the embeddings of the current user's last k interaction items:

$$e_u^d = \sum_{j \in RK_u} a_{uj} \, e_j$$

where $e_u^d$ and $e_j$ represent the dynamic preference embedding of user u and the embedding of historical interaction item j, respectively.
Preferably, step S3 specifically includes the following steps:

S3.1, the static and dynamic user preference representations are combined by vector concatenation to obtain the user preference representation; the item embedding vectors obtained from the embedding layer and from LightGCN are likewise combined to obtain the item feature representation:

$$e_u = [e_u^s \,\|\, e_u^d], \qquad e_i = [e_i^s \,\|\, e_i^{(0)}]$$

where $e_u^s$ and $e_i^s$ represent the static user preference and the item features, $e_u^d$ denotes the dynamic preference embedding of user u, $e_i^{(0)}$ represents the embedding-layer embedding of item i, and $e_u$ and $e_i$ represent the user preference vector and the item feature vector, respectively; the generated user and item representations are used as the input of the high-order feature interaction learning models DMF and MLP;

S3.2 DMF-based feature interaction learning: the DMF model has a multi-layer two-channel structure based on a user component and an item component; in each component, the output of the current layer is used as the input of the next layer, and in each layer the input vector is projected to a hidden vector by a linear transformation and a nonlinear activation:

$$h_u^{(k)} = \sigma(W_u^{(k)} h_u^{(k-1)} + b_u^{(k)})$$

$$h_i^{(k)} = \sigma(W_i^{(k)} h_i^{(k-1)} + b_i^{(k)})$$

where $h_u^{(k)}$ and $h_i^{(k)}$ represent the hidden representations of user u and item i at the k-th layer; $W_u^{(k)}$ and $b_u^{(k)}$ represent the weight matrix and bias vector of the k-th layer of the user component, while $W_i^{(k)}$ and $b_i^{(k)}$ represent the weight matrix and bias vector of the k-th layer of the item component;

through iterative learning of the multi-layer DMF model, the user and item representations are mapped into a low-dimensional latent embedding space:

$$p_u = h_u^{(L_1)}, \qquad q_i = h_i^{(L_1)}$$

where $L_1$ represents the number of layers of the DMF model, and $p_u$ and $q_i$ represent the learned latent representations of user u and item i, respectively;

the user-item feature interaction is defined as the element-wise product of the user and item latent representation vectors:

$$\phi^{DMF} = p_u \odot q_i$$

where $\phi^{DMF}$ represents the high-order feature interaction vector learned by the DMF model;

S3.3 MLP-based feature interaction learning: the MLP is a typical deep learning model that first combines the feature vectors of the user and the item and then learns high-order user-item feature interactions through multiple hidden layers stacked on top:

$$z_0 = [e_u \,\|\, e_i]$$

$$z_k = a_k(W_k z_{k-1} + b_k), \quad k = 1, \dots, L_2$$

$$\phi^{MLP} = H_1^T z_{L_2}$$

where $W_k$, $b_k$ and $a_k$ represent the weight matrix, bias vector and activation function of the k-th layer, respectively, $H_1$ represents the weight matrix of the output layer, $L_2$ represents the number of layers of the model, and $\phi^{MLP}$ represents the high-order feature interaction vector learned by the MLP model;

S3.4 matching score prediction: DMF and MLP are first run separately, the output vectors of the two models are then combined by vector concatenation, and finally the combined embedding vector is input to the output layer of LightGCAN for matching score prediction:

$$\hat{r}_{ui} = \sigma\big(H_2^T [\phi^{DMF} \,\|\, \phi^{MLP}]\big)$$

where $H_2$ represents the weight matrix of the output layer;

S3.5 model training: LightGCAN is a CF model based on implicit feedback information, and binary cross entropy loss is used as the objective function to minimize the difference between the predicted matching scores and the implicit feedback information:

$$L(\Theta) = -\sum_{(u,i) \in R^+ \cup R^-} \big( r_{ui} \log \hat{r}_{ui} + (1 - r_{ui}) \log(1 - \hat{r}_{ui}) \big)$$

where $\hat{r}_{ui}$ is the predicted matching score between user u and item i, $r_{ui}$ is the implicit feedback observed in the interaction matrix R, $R^+$ and $R^-$ are the positive and negative sample sets, respectively, and $\Theta$ denotes the parameters of the model.
The present invention proposes a time-aware graph convolutional attention network that effectively captures static and dynamic user preferences by using different GNN methods. Specifically, static user preferences are captured by a lightweight GNN with node aggregation only, dynamic user preferences are captured by a time-aware GAT over the most recent interaction items, and the two kinds of user preferences are combined and input into a two-channel DNN model composed of DMF and MLP for feature interaction learning and matching score prediction.
The invention has the beneficial effects that:
1. an efficient user preference learning model is designed, and the framework can simultaneously capture the static and dynamic preferences of a user in an end-to-end mode.
2. A time-aware attention network model is proposed that estimates the contribution weight of each historical interaction item to modeling of dynamic user preferences in terms of the current user, target item, recent historical interaction items, and their interaction times.
3. Experiments were conducted on four data sets to evaluate the performance of the method of the present invention in Collaborative Filtering (CF) recommendations, and the experimental results show that the method of the present invention is significantly superior to the most advanced current recommendation methods.
The features and advantages of the present invention will be described in detail by embodiments in conjunction with the accompanying drawings.
[ description of the drawings ]
FIG. 1 is the overall architecture diagram of the time-aware lightweight graph convolutional attention network LightGCAN;
FIG. 2 is a comparison of HR@10 with different numbers of latent factors;
FIG. 3 is a comparison of NDCG@10 with different numbers of latent factors;
FIG. 4 is a comparison of HR@10 with different numbers of historical interaction items;
FIG. 5 is a comparison of NDCG@10 with different numbers of historical interaction items;
FIG. 6 is a comparison of HR@10 with different numbers of hidden layers;
FIG. 7 is a comparison of NDCG@10 with different numbers of hidden layers.
[ detailed description ]
1 basic knowledge
Some of the symbols used herein are described below. Bold italic capital letters (e.g., X) and bold italic lowercase letters (e.g., x) are used to represent matrices and vectors, respectively; $X_{ij}$ denotes the entry in row i and column j of matrix X; the symbols $\odot$ and $\|$ denote element-wise multiplication and vector concatenation, respectively. Table 1 summarizes the other symbols used in the rest of the text and their descriptions:

TABLE 1 Description of the symbols used herein
1.1 Graph Convolution Network (GCN)
The GCN is a neural network model for graph data structures, which takes graph data as input, learns graph node representations in a low-dimensional embedding space for each layer, and uses the output of the previous layer as input for the next layer. For simplicity, only the implementation details of one layer of the GCN will be presented next.
Node embedding in a GCN layer involves two main operations: node aggregation and feature transformation. The node embedding process can be abstracted as:

$$x_i' = f_{trans}\big(f_{agg}(x_i, \{x_j \mid j \in neighbor(i)\})\big)$$ (1)

where $x_i \in \mathbb{R}^f$ and $x_j$ represent the embedded representations of a target node i and its neighboring nodes j, $x_i' \in \mathbb{R}^d$ with d < f represents the latent embedding of the target node, and $f_{agg}(\cdot)$ and $f_{trans}(\cdot)$ represent the node aggregation function and the feature transformation function, respectively.
Node aggregation improves the representation of a target node by collecting information from its neighboring nodes. The rationale behind node aggregation is that the attributes of a target node can typically be reflected, to some extent, by the attributes of its neighbors. In recent years, research on GCN has focused on constructing different node aggregation functions to capture information from the neighborhood. For example, an average pooling function is used to filter out common attributes of neighboring nodes, and a max pooling function is used to extract representative features from the neighbors.
By mapping the target node from the input representation space to the latent embedding space, the feature transformation makes the representation of the target node more comprehensive. In a conventional GCN, the feature transformation is generally defined as a matrix mapping followed by a nonlinear activation function, abstracted as follows:

$$f_{trans}(x) = \sigma(Wx + b)$$ (2)

where $W \in \mathbb{R}^{d \times f}$ and $b \in \mathbb{R}^d$ represent the mapping matrix and the bias vector, respectively, and $\sigma$ is the activation function.
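For illustration only, the following is a minimal NumPy sketch of one GCN layer as abstracted in equations (1)-(2); the mean-pooling aggregator, the ReLU activation and all dimensions are illustrative assumptions of the sketch, not choices prescribed by the text.

```python
import numpy as np

def f_agg(x_i, neighbors):
    # Node aggregation: here, mean pooling over the target node and its neighbors
    return np.mean(np.vstack([x_i] + neighbors), axis=0)

def f_trans(x, W, b):
    # Feature transformation (Eq. 2): linear map to d < f dimensions plus ReLU
    return np.maximum(W @ x + b, 0.0)

rng = np.random.default_rng(0)
f, d = 8, 4                                      # input dimension f, latent dimension d
W, b = rng.normal(size=(d, f)), np.zeros(d)

x_i = rng.normal(size=f)                         # target node representation
neighbors = [rng.normal(size=f) for _ in range(3)]
x_i_new = f_trans(f_agg(x_i, neighbors), W, b)   # one GCN layer (Eq. 1)
print(x_i_new.shape)                             # (4,)
```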
1.2 Graph Attention Network (GAT)

In embedding learning for a target node, GAT distinguishes the different roles of its neighbor nodes based on an attention mechanism. Like GCN, GAT is a multi-layer model that learns hidden node representations layer by layer. For simplicity, only one layer of GAT is described in detail below.
Taking a set of node features as input, each GAT layer generates a new set of node representations, possibly of a different dimension, as its output:

$$x_i' = \sigma\Big(\sum_{j \in neighbor(i)} \alpha_{ij} W x_j\Big)$$ (3)

where $W \in \mathbb{R}^{f' \times f}$ represents a shared weight matrix for feature transformation; f and f' represent the dimensions of the input and output node feature vectors, respectively; $\alpha_{ij}$ denotes the learned importance weight of neighbor node j for the representation of target node i.
In general, a shared self-attention mechanism $a: \mathbb{R}^{f'} \times \mathbb{R}^{f'} \rightarrow \mathbb{R}$ is used to calculate the influence weights (attention coefficients) of the neighborhood:

$$e_{ij} = a(Wx_i, Wx_j)$$ (4)
in order to make the attention coefficient more convenient to compare between different neighbors of the target node, the attention coefficient is typically normalized using the softmax function:
Figure BDA0003821711440000073
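As an illustration of equations (3)-(5), the sketch below instantiates the abstract attention a(·,·) of equation (4) with the concatenation-based additive form used in the original GAT paper; that particular choice, like all names and dimensions here, is an assumption of the sketch.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(1)
f, f_out = 8, 4
W = rng.normal(size=(f_out, f))          # shared weight matrix of Eq. (3)
a = rng.normal(size=2 * f_out)           # additive form assumed for a(., .)

x_i = rng.normal(size=f)                 # target node
neighbors = [rng.normal(size=f) for _ in range(3)]

# Eq. (4): unnormalized attention coefficients e_ij = a(W x_i, W x_j)
e = np.array([a @ np.concatenate([W @ x_i, W @ x_j]) for x_j in neighbors])
alpha = softmax(e)                       # Eq. (5): normalized attention weights

# Eq. (3): new representation as attention-weighted sum of transformed neighbors
x_i_new = np.tanh(sum(w * (W @ x_j) for w, x_j in zip(alpha, neighbors)))
```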
2 Lightweight graph convolutional attention network (LightGCAN)
2.1 Overall framework
The overall framework of LightGCAN is shown in FIG. 1 and comprises five layers: an input layer, an embedding layer, a representation layer, an interaction layer, and an output layer. The input layer is composed of two matrices, namely a user-item interaction matrix $R \in \{0,1\}^{m \times n}$ and an interaction time matrix $T \in \mathbb{R}^{m \times n}$, where m and n represent the number of users and items, respectively. R is an implicit feedback matrix: $r_{ui} = 1$ if there is an interaction between user u and item i, otherwise $r_{ui} = 0$. T records the interaction time between users and items through timestamps, and has the same dimensions as R. The input layer provides the initial feature representations of the user and the item, $x_u \in \{0,1\}^n$ and $x_i \in \{0,1\}^m$, which are multi-hot vectors corresponding to the u-th row and the i-th column of R, respectively. The embedding layer is a fully connected layer that converts the sparse user and item representations into dense latent embeddings, which are then used as the input of the representation layer for user preference modeling. The representation layer contains two GNN models, LightGCN and TGAT, used for static and dynamic user preference modeling, respectively. The static and dynamic user preferences are combined and sent to the interaction layer for high-order feature interaction learning. The interaction layer contains two DNN models, DMF and MLP, which learn different feature interactions according to different deep learning strategies. Finally, the resulting feature interaction vectors are concatenated and sent to the output layer for predicting the user-item matching score.
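As a minimal sketch of the input layer described above, the following builds R and T from a hypothetical log of timestamped user-item interactions; the log format and all sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical interaction log of (user_id, item_id, unix_timestamp) triples
log = [(0, 2, 1_660_000_000), (0, 5, 1_660_100_000),
       (1, 2, 1_660_050_000), (2, 0, 1_660_200_000)]

m, n = 3, 6                    # number of users m and items n
R = np.zeros((m, n))           # implicit feedback matrix: r_ui = 1 iff u interacted with i
T = np.zeros((m, n))           # interaction time matrix, same dimensions as R
for u, i, ts in log:
    R[u, i] = 1.0
    T[u, i] = ts

x_u = R[0, :]                  # multi-hot initial representation of user 0 (u-th row of R)
x_i = R[:, 2]                  # multi-hot initial representation of item 2 (i-th column of R)
```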
2.2 static user preference modeling
User preferences can generally be divided into two categories: static preferences and dynamic preferences. Static preferences refer to the relatively fixed interests and hobbies that users develop over time. Dynamic preferences refer to the user's short-term preferences at the current time. In the CF recommendation, the most common way to capture the user's static preferences is to use historical interaction information. Implicit feedback information is used in the present invention as a data source for static user preference modeling.
In the CF recommendation system, the relationship between a user and an item is generally described as a graph structure. GCN has been widely used for interactive graph-based user and item representation learning in view of its powerful representation learning capabilities in graph structured data. In the invention, a lightweight GCN model LightGCN is used for static user preference modeling. LightGCN only retains neighbor aggregation operations in GCN, and discards complex feature transformation and nonlinear activation operations which are meaningless for recommendation tasks.
2.2.1 lightweight graph convolution
LightGCN is a simplified, multi-layer GCN. Formally, the user (item) embedding at the (k+1)-th layer is defined as an aggregation operation based on weighted summation:

$$e_u^{(k+1)} = \sum_{i \in N_u} \frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}} e_i^{(k)}$$ (6)

$$e_i^{(k+1)} = \sum_{u \in N_i} \frac{1}{\sqrt{|N_i|}\sqrt{|N_u|}} e_u^{(k)}$$ (7)

where $e_u^{(k)}$ and $e_i^{(k)}$ represent the embeddings of user u and item i at the k-th layer, and $N_u$ and $N_i$ represent the neighbor nodes of user u and item i, respectively. The term $\frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}}$ is used as a normalization term to avoid the explosion phenomenon in feature aggregation. The only parameters to be learned here are the first-layer user and item embeddings, since the higher-layer embeddings can be learned automatically, layer by layer, through the above iterative process.

Specifically, the user and item embeddings of the first layer are expressed as:

$$e_u^{(0)} = W_u x_u$$ (8)

$$e_i^{(0)} = W_v x_i$$ (9)

where $W_u$ and $W_v$ represent the weight matrices that convert the initial feature vectors of users and items into latent embeddings.
2.2.2 Layer aggregation

After K layers of lightweight graph convolution, K different user (item) embeddings are generated, each layer representing different latent semantic information. An intuitive idea is to combine the embeddings of all layers to generate the embedding of the target user (item):

$$e_u^s = \sum_{k=0}^{K} \alpha_k e_u^{(k)}$$ (10)

$$e_i^s = \sum_{k=0}^{K} \alpha_k e_i^{(k)}$$ (11)

where $e_u^s$ and $e_i^s$ represent the static user preference and item features, respectively, and $\alpha_k \geq 0$ represents the importance weight of the k-th layer embedding. In our experiments, $\alpha_k$ was uniformly set to 1/(K+1), which achieves good performance. This setting avoids complicating LightGCN while maintaining its simplicity and efficiency.
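A compact NumPy sketch of the whole static-preference pipeline of equations (6)-(11) is given below; the toy interaction matrix, the random first-layer embeddings, and the clamping of degrees to at least 1 (to avoid division by zero for isolated nodes) are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, d, K = 4, 5, 8, 3
R = (rng.random((m, n)) < 0.4).astype(float)       # toy interaction matrix

E_u = rng.normal(scale=0.1, size=(m, d))           # first-layer user embeddings (Eq. 8)
E_i = rng.normal(scale=0.1, size=(n, d))           # first-layer item embeddings (Eq. 9)

deg_u = np.maximum(R.sum(axis=1), 1.0)             # |N_u|, clamped to avoid /0
deg_i = np.maximum(R.sum(axis=0), 1.0)             # |N_i|, clamped to avoid /0
A = R / np.sqrt(np.outer(deg_u, deg_i))            # normalization 1/(sqrt|N_u| sqrt|N_i|)

layers_u, layers_i = [E_u], [E_i]
for _ in range(K):                                 # Eqs. (6)-(7): aggregation only,
    eu_k, ei_k = layers_u[-1], layers_i[-1]        # no transformation, no activation
    layers_u.append(A @ ei_k)
    layers_i.append(A.T @ eu_k)

alpha = 1.0 / (K + 1)                              # uniform layer weights alpha_k
e_u_static = alpha * sum(layers_u)                 # Eq. (10): static user preferences
e_i_static = alpha * sum(layers_i)                 # Eq. (11): item features
```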
2.3 dynamic user preference modeling
Static user preference modeling is based on the entire interaction history of the target user and completely ignores the drift of user preferences over time. A user facing different items at different times has different interests and preferences; these are short-term preferences. To capture short-term user preferences, one should rely on the items the user has interacted with recently, which reflect the user's current interests better than items interacted with long ago. Furthermore, the interaction time between the user and items should be taken into account in dynamic user preference modeling, since it can reveal the drift of user preferences over time.

Herein, we propose a time-aware GAT model (TGAT) to capture dynamic user preferences. Unlike existing attention networks that model dynamic user preferences based on the entire interaction history or only on the current session, TGAT models the user's dynamic preferences using the user's last k interaction items. Recently interacted items reflect the user's current interests and preferences better than items interacted with long ago, and using fewer items also reduces the computational complexity of user preference modeling. To make the historical interaction vectors of all users equal in length, when the number of a user's historical interaction items is less than k, a meaningless constant (e.g., -1) is used to pad the historical interaction vector, as sketched below.
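A minimal sketch of the fixed-length padding just described; the function name is illustrative, and the -1 placeholder follows the example in the text.

```python
def recent_items(history, k, pad=-1):
    """Return the ids of the last k interacted items, left-padded with a
    placeholder constant when the user's history is shorter than k."""
    recent = history[-k:]
    return [pad] * (k - len(recent)) + list(recent)

print(recent_items([3, 7, 42], k=5))   # [-1, -1, 3, 7, 42]
```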
First, the embeddings of the current user, the target item, a recent interaction item and its interaction time are combined and input to the attention network. The attention network is responsible for learning the importance weights of recent interaction items for dynamic user preference modeling:

$$x_j = [e_u \,\|\, e_v \,\|\, e_j \,\|\, e_{ts_j}]$$ (12)

$$\hat{a}_{uj} = \sigma(W_L(\cdots \sigma(W_1 x_j + b_1) \cdots) + b_L)$$ (13)

where $W_k$ and $b_k$ represent the weight matrix and bias vector of the k-th layer of the attention network, L denotes the number of layers of the attention network, and $\sigma(\cdot)$ represents the activation function; $e_u$, $e_v$, $e_j$ and $e_{ts_j}$ represent the embeddings of the current user, the target item, the recent interaction item and the interaction time, respectively; $x_j$ concatenates the four embeddings into a combined vector that serves as the input of the attention network.

The time interval between the interaction time of a historical interaction item and the current time is discretized in minutes, and the embedding of the interaction time is then obtained through a linear transformation:

$$ts_j = \min((T - t_j)/60, \delta)$$ (14)

$$e_{ts_j} = W_t \, ts_j$$ (15)

where $t_j$ represents the current user's interaction time with item j (in seconds), T is the current time at prediction, $ts_j$ represents the time interval (in minutes) between the interaction time and the prediction time, the min function caps the time interval at the threshold $\delta$, and $W_t$ represents the transformation matrix of the time embedding.

Then, to facilitate comparison among the user's different historical interaction items, the attention coefficients are normalized by the softmax function:

$$a_{uj} = \frac{\exp(\hat{a}_{uj})}{\sum_{j' \in RK_u} \exp(\hat{a}_{uj'})}$$ (16)

where $RK_u$ represents the last k interaction items of user u.

Finally, the dynamic user preference vector is modeled as a weighted sum of the embeddings of the current user's last k interaction items:

$$e_u^d = \sum_{j \in RK_u} a_{uj} \, e_j$$ (17)

where $e_u^d$ and $e_j$ represent the dynamic preference embedding of user u and the embedding of historical interaction item j, respectively.
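The sketch below walks through equations (12)-(17) for one user in NumPy; it assumes a single-hidden-layer attention network with a tanh activation and random parameters, all of which are illustrative rather than prescribed by the text.

```python
import numpy as np

def sigma(x):                          # activation of the attention network (assumed tanh)
    return np.tanh(x)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(3)
d, k, delta = 8, 4, 10_000.0
e_u = rng.normal(size=d)               # current user embedding
e_v = rng.normal(size=d)               # target item embedding
E_recent = rng.normal(size=(k, d))     # embeddings of the last k interaction items
t_j = 1_660_000_000.0 + 600.0 * np.arange(k)   # interaction timestamps (seconds)
T_now = 1_660_100_000.0                # prediction time (seconds)

ts = np.minimum((T_now - t_j) / 60.0, delta)   # Eq. (14): minutes, capped at delta
W_t = rng.normal(size=(d, 1))
E_time = (W_t @ ts[None, :]).T                 # Eq. (15): one time embedding per item

W1, b1 = rng.normal(size=(d, 4 * d)), np.zeros(d)   # one attention layer (assumed)
w2 = rng.normal(size=d)
a_hat = np.array([w2 @ sigma(W1 @ np.concatenate([e_u, e_v, E_recent[j], E_time[j]]) + b1)
                  for j in range(k)])          # Eqs. (12)-(13)
a = softmax(a_hat)                             # Eq. (16)
e_u_dyn = a @ E_recent                         # Eq. (17): dynamic user preference
```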
2.4 Feature interaction learning and matching score prediction

The static and dynamic user preference representations are combined by vector concatenation to obtain the user preference representation. Similarly, the item embedding vectors obtained from the embedding layer and from LightGCN are combined to obtain the item feature representation:

$$e_u = [e_u^s \,\|\, e_u^d]$$ (18)

$$e_i = [e_i^s \,\|\, e_i^{(0)}]$$ (19)

where $e_u^s$ and $e_i^s$ represent the static user preference and the item features, $e_u^d$ denotes the dynamic preference embedding of user u, $e_i^{(0)}$ represents the embedding-layer embedding of item i, and $e_u$ and $e_i$ represent the user preference vector and the item feature vector, respectively. The generated user and item representations are used as the input of the high-order feature interaction learning models DMF and MLP.
2.4.1 feature interactive learning based on DMF
The DMF model is a multi-layer two-channel structure based on user components and project components. In each component, the output of the current layer is used as the input for the next layer. In each layer, the input vector is projected as a hidden vector by linear transformation and nonlinear activation operations:
Figure BDA0003821711440000101
Figure BDA0003821711440000102
wherein the content of the first and second substances,
Figure BDA0003821711440000103
and
Figure BDA0003821711440000104
hidden representations representing users u and items i in the k-th layer, respectively; here, the
Figure BDA0003821711440000105
Figure BDA0003821711440000106
And
Figure BDA0003821711440000107
respectively representing weights of k-th layers of user elementsA matrix and an offset vector;
Figure BDA0003821711440000108
and
Figure BDA0003821711440000109
the weight matrix and bias vector representing the k-th layer of the project component.
Through iterative learning of the multi-layered DMF model, user and item representations are mapped to a low-dimensional potential embedding space:
Figure BDA00038217114400001010
Figure BDA00038217114400001011
wherein L is 1 Number of layers, p, representing DMF model u And q is i Respectively representing potential representations of learned users u and items i.
The user-item feature interaction is defined as the product of the user and the item potential representation vector:
Figure BDA00038217114400001012
wherein
Figure BDA00038217114400001013
Representing the higher order feature interaction vector learned by the DMF model.
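A minimal NumPy sketch of the two-channel DMF interaction learning of equations (20)-(24), under the assumptions of a ReLU activation, equal layer widths and random weights:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(4)
d, L1 = 16, 3
e_u = rng.normal(size=d)       # user preference vector (Eq. 18)
e_i = rng.normal(size=d)       # item feature vector (Eq. 19)

# Two independent channels, L1 layers each (Eqs. 20-21)
h_u, h_i = e_u, e_i
for _ in range(L1):
    Wu, bu = rng.normal(scale=0.3, size=(d, d)), np.zeros(d)
    Wi, bi = rng.normal(scale=0.3, size=(d, d)), np.zeros(d)
    h_u = relu(Wu @ h_u + bu)
    h_i = relu(Wi @ h_i + bi)

p_u, q_i = h_u, h_i            # Eqs. (22)-(23): latent representations
phi_dmf = p_u * q_i            # Eq. (24): element-wise product interaction vector
```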
2.4.2 MLP-based feature interaction learning

DMF learns the latent representations of the user and the item through two independent channels, and an intuitive final step is to concatenate the two latent representations. However, the interactions between the latent factors of users and items are not well captured by vector concatenation alone. Therefore, another deep learning model, MLP, is used: it first combines the feature vectors of the user and the item, and then stacks multiple hidden layers on top of the combined vector to learn high-order user-item feature interactions:

$$z_0 = [e_u \,\|\, e_i]$$ (25)

$$z_k = a_k(W_k z_{k-1} + b_k), \quad k = 1, \dots, L_2$$ (26)

$$\phi^{MLP} = H_1^T z_{L_2}$$ (27)

where $W_k$, $b_k$ and $a_k$ represent the weight matrix, bias vector and activation function of the k-th layer, respectively, $H_1$ represents the weight matrix of the output layer, $L_2$ represents the number of layers of the model, and $\phi^{MLP}$ represents the high-order feature interaction vector learned by the MLP model.
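Correspondingly, a sketch of the MLP channel of equations (25)-(27), with the same illustrative assumptions (ReLU activation, equal hidden widths, random weights):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(5)
d, L2 = 16, 3
e_u, e_i = rng.normal(size=d), rng.normal(size=d)

z = np.concatenate([e_u, e_i])            # Eq. (25): z_0 = [e_u || e_i]
width = z.size
for _ in range(L2):                       # Eq. (26): stacked hidden layers
    W, b = rng.normal(scale=0.3, size=(width, width)), np.zeros(width)
    z = relu(W @ z + b)

H1 = rng.normal(scale=0.3, size=(d, width))
phi_mlp = H1 @ z                          # Eq. (27): MLP interaction vector
```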
2.4.3 Matching score prediction

To date, two kinds of high-order feature interaction vectors have been obtained. The DMF model uses a two-channel structure to model the latent representations of users and items and then computes the interaction vector, while the MLP model first integrates the user and item representations and then learns the interaction vector using a typical DNN model. To retain the advantages of both models, DMF and MLP are fused into one integrated model. To give the integrated model greater flexibility, DMF and MLP are run separately, the outputs of the two models are then combined by vector concatenation, and finally the combined embedding vector is fed into the output layer of LightGCAN for matching score prediction:

$$\hat{r}_{ui} = \sigma\big(H_2^T [\phi^{DMF} \,\|\, \phi^{MLP}]\big)$$ (28)

where $H_2$ represents the weight matrix of the output layer.
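A sketch of equation (28), assuming random vectors in place of the DMF and MLP outputs and a sigmoid as the output activation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(6)
phi_dmf = rng.normal(size=16)     # interaction vector from the DMF channel
phi_mlp = rng.normal(size=16)     # interaction vector from the MLP channel

h2 = rng.normal(size=32)          # output-layer weights H_2
r_hat = sigmoid(h2 @ np.concatenate([phi_dmf, phi_mlp]))   # Eq. (28): score in (0, 1)
```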
2.4.4 model training
The point-wise and pair-wise objective functions are both commonly used for model training in recommendation systems. For simplicity, the point-wise method is used in the present invention. Since LightGCAN is a CF model based on implicit feedback information, binary cross entropy loss is used as the objective function to minimize the difference between the predicted matching scores and the implicit feedback information:

$$L(\Theta) = -\sum_{(u,i) \in R^+ \cup R^-} \big( r_{ui} \log \hat{r}_{ui} + (1 - r_{ui}) \log(1 - \hat{r}_{ui}) \big)$$ (29)

where $\hat{r}_{ui}$ is the predicted matching score between user u and item i, $r_{ui}$ is the implicit feedback observed in the interaction matrix R, $R^+$ and $R^-$ are the positive and negative sample sets, respectively, and $\Theta$ denotes the parameters of the model.
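A sketch of the binary cross entropy objective of equation (29) over a toy batch of positive and sampled negative pairs; the clipping constant is a numerical-stability assumption of the sketch:

```python
import numpy as np

def bce_loss(r_hat, r, eps=1e-10):
    """Eq. (29): binary cross entropy over positive and sampled negative pairs."""
    r_hat = np.clip(r_hat, eps, 1.0 - eps)     # guard against log(0)
    return -np.mean(r * np.log(r_hat) + (1.0 - r) * np.log(1.0 - r_hat))

r     = np.array([1.0, 1.0, 0.0, 0.0])         # implicit feedback: R+ then sampled R-
r_hat = np.array([0.9, 0.6, 0.2, 0.4])         # predicted matching scores
loss = bce_loss(r_hat, r)
```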
3 experiments and analyses
To answer the following research questions, a series of experiments are conducted below and the results analyzed in detail:

RQ1. Does the proposed recommendation model LightGCAN outperform existing CF recommendation models on the Top-k recommendation task?

RQ2. How do the different components of LightGCAN contribute to the recommendation task?

RQ3. How do different settings of the hyper-parameters of LightGCAN affect recommendation performance?
3.1 Experimental setup
3.1.1 Datasets

To evaluate the recommendation performance of LightGCAN, experiments were performed on four real datasets of different domains and scales: MovieLens 100K (ml-100k), MovieLens 1M (ml-1m), Amazon Music (Amusic) and Amazon Toys (Atoy). Table 2 gives detailed statistics of the experimental datasets:

TABLE 2 Statistics of the datasets
3.1.2 Evaluation strategy and performance metrics

In this experiment, a leave-one-out strategy was used for performance evaluation: the most recent interaction item of each user is used as the test sample, and the remaining interaction items are used for training. Since ranking all items for every test case is time-consuming, 100 non-interacted items are randomly sampled for each user and ranked together with the test item according to the predicted matching scores. Two popular metrics are used to evaluate ranking performance: Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG). The length of the ranking list is set to 10 to evaluate Top-10 recommendation performance.
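A sketch of the leave-one-out evaluation just described, computing HR@10 and NDCG@10 for one user whose held-out item is ranked against 100 sampled negatives; the single-relevant-item NDCG formula 1/log2(rank+2) follows common practice for this protocol.

```python
import numpy as np

def hr_ndcg_at_k(ranked_items, test_item, k=10):
    """HR@k and NDCG@k for one user under the leave-one-out protocol."""
    topk = list(ranked_items[:k])
    if test_item not in topk:
        return 0.0, 0.0
    rank = topk.index(test_item)            # 0-based position in the top-k list
    return 1.0, 1.0 / np.log2(rank + 2)     # single relevant item => IDCG = 1

rng = np.random.default_rng(7)
candidates = np.arange(101)                 # item 0 is the held-out test item
scores = rng.random(101)                    # predicted matching scores
ranked = candidates[np.argsort(-scores)]    # rank candidates by score, descending
hr, ndcg = hr_ndcg_at_k(ranked, test_item=0)
```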
3.1.3 Comparison methods

The following CF methods were chosen for comparison with the method of the present invention:

ItemPop is a statistics-based method that ranks items according to their popularity (access frequency). It is commonly used as a baseline for CF recommendation.

eALS (Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua, 2016. Fast matrix factorization for online recommendation with implicit feedback. In SIGIR. 549-558.) is a matrix factorization (MF) based CF method with a fast parameter learning technique. It takes observed and unobserved interactions as positive and negative samples, respectively.

DMF (Hong-Jian Xue, Xinyu Dai, Jianbing Zhang, Shujian Huang, and Jiajun Chen, 2017. Deep matrix factorization models for recommender systems. In IJCAI. 3203-3209.) is a DL-based CF method for representation learning. It performs model training using a normalized binary cross entropy loss.

NeuMF (Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua, 2017. Neural collaborative filtering. In WWW. 173-182.) is a DL-based CF method for matching score prediction, which learns feature interactions using different learning strategies based on two component models, GMF and MLP. The outputs of the two models are combined for matching score prediction.

DeepCF (Zhi-Hong Deng, Ling Huang, Chang-Dong Wang, Jian-Huang Lai, and Philip S. Yu, 2019. DeepCF: A unified framework of representation learning and matching function learning in recommender system. In AAAI. 61-68.) is a DL-based CF framework that fuses representation learning and matching function learning using two deep learning components.
3.2 Recommendation performance comparison (RQ1)

Table 3 shows the Top-10 recommendation performance of the different methods; the tests on the Amusic and Atoy datasets do not include the time embedding component, because these two datasets contain no time information. The following can be observed from the table:

All methods perform best on the ml-100k dataset, followed by ml-1m, then Amusic, and finally Atoy. Since the data sparsity of these datasets increases in that order, this result indicates that the sparsity of the training data has a large impact on the recommendation performance of CF methods.

The MF-based method eALS outperforms the statistics-based method ItemPop. This suggests that although the latent factor model is simple in principle and has only linear modeling capability, it is effective for feature interaction learning and CF recommendation.

The advantage of the DL-based MF methods (DMF and NeuMF) over the traditional MF method eALS suggests that deep learning has strong advantages in both representation learning and matching function learning, which is very beneficial for CF.

The integrated methods (NeuMF and DeepCF) perform better on all datasets than the independent method (DMF). This is because each component of an integrated model makes predictions according to a different strategy; combining them preserves the advantages of each component and makes the model more robust.

Among the DL-based methods, DeepCF outperforms NeuMF. Analysis shows that DeepCF uses two different deep learning strategies for feature interaction learning, while NeuMF uses one linear model and one DL model. This further indicates that deep learning models are significantly better than linear models for feature interaction learning.

The method of the present invention, LightGCAN, performs best on all datasets, with significant and stable performance improvements over the comparison methods. On the ml-100k dataset, the HR and NDCG of LightGCAN are 6.3% and 9.7% higher, respectively, than those of the current state-of-the-art DeepCF method; on the ml-1m dataset, HR and NDCG are 4.7% and 16.1% higher, respectively.
TABLE 3 Performance comparison of different recommendation methods
3.3 Ablation analysis (RQ2)

The proposed method LightGCAN comprises two key parts: static user preference modeling and dynamic user preference modeling. To verify the role of these two components in the CF recommendation task, ablation experiments were performed in which recommendations were made using only static modeling, using only dynamic modeling, and integrating both; the results are shown in Table 4. As can be seen, dynamic modeling yields better recommendation performance than static modeling, indicating that dynamic preference modeling that considers the time factor better captures the user's current preferences. However, when time information is unavailable, dynamic modeling cannot work, and only static modeling, whose data sources are readily available, can be used to obtain long-term user preferences. Static modeling is therefore the common modeling approach in CF recommendation, which is why the model of the present invention combines static and dynamic modeling. The experimental results show that combining the two modeling methods achieves the best recommendation performance.
TABLE 4 Performance comparison of different user modeling methods
In LightGCAN, a new attention network is designed for dynamic user preference modeling that comprehensively considers several attention elements: the current user, the target item, the last k interaction items and the interaction time. To demonstrate the effectiveness of the different attention elements, experiments were conducted on different combinations of these elements. Since conventional methods use only historical interaction items for user preference modeling, each of the other three elements is combined with the historical interaction items to observe its effect; the results are shown in Table 5. As can be seen, all attention elements have a positive impact on dynamic user preference modeling, and combining all of them yields the best performance. Among them, the interaction time plays the most critical role, yet existing methods often ignore it; this is the novelty of the attention mechanism proposed by the present invention.
TABLE 5 Performance comparison of different attention mechanisms
3.4 Parameter sensitivity analysis (RQ3)

The impact of different hyper-parameter settings on recommendation performance is investigated below. The hyper-parameters of the model include the number of latent factors, the number of recent interaction items, and the number of hidden layers in the DL models. FIGS. 2 and 3 show HR and NDCG performance with different numbers of latent factors in user (item) representation learning. As the figures show, recommendation performance improves as the number of latent factors increases, indicating that the model can achieve better performance by considering more influencing factors in representation learning. However, considering more factors inevitably leads to higher computational complexity, so it is common practice to stop increasing the number of latent factors once the performance improvement becomes insignificant. In the present invention, the number of latent factors in the user and item embeddings is set to 64.

FIGS. 4 and 5 show HR and NDCG for different numbers of historical interaction items during dynamic user preference modeling on the ml-100k dataset. As the figures show, the most suitable number of historical interaction items on ml-100k is 20; both too many and too few interaction items degrade performance. With too few historical interaction items there is not enough information to learn user preferences, while using too many inevitably introduces noise; after all, items interacted with long ago do not reflect the user's current interests. The optimal number of historical interaction items varies across datasets and needs to be determined experimentally.
In general, the number of hidden layers of a DL model has a significant impact on model prediction. To show the impact of model depth on representation learning and feature interaction learning, the performance of LightGCAN with different numbers of hidden layers was evaluated. FIGS. 6 and 7 show HR and NDCG performance with different numbers of hidden layers in DMF and MLP, respectively. As the figures show, performance first improves as the number of hidden layers increases, and begins to degrade beyond three hidden layers. This indicates that with three hidden layers the learning capacity of the model is saturated, and adding more hidden layers leads to overfitting. Therefore, the number of hidden layers of the DMF and MLP models is set to 3 in the present invention.
4 conclusion
As is well known, CF recommendation includes two phases: representation learning and matching function learning. Representation learning has evolved from matrix factorization to deep learning, and matching function learning has evolved from dot products and factorization to deep learning. The best-performing strategy for matching function learning at present is the dual-channel DL-based strategy, so this structure is retained in the present invention. Static and dynamic user preference modeling is the innovation of the present invention: the possibility of combining long-term and short-term user preferences for user representation learning and CF recommendation is explored, and a new user representation learning method comprising static and dynamic user preference modeling is proposed. In the static user preference modeling stage, a lightweight GCN is used to extract long-term user preferences; in the dynamic user preference modeling stage, a time-aware GAT is used to model short-term user preferences. The long-term and short-term user preferences are combined and sent to a two-channel DL model for feature interaction learning and matching score prediction. Experimental results on four datasets show that the method of the present invention significantly outperforms existing CF recommendation methods.
The above embodiments are illustrative of the present invention, and are not intended to limit the present invention, and any simple modifications of the present invention are within the scope of the present invention.

Claims (10)

1. A user preference modeling method based on a lightweight graph convolution attention network, characterized by comprising the following steps:

S1, modeling static user preferences using a lightweight GCN with only neighborhood aggregation;

S2, modeling dynamic user preferences using a time-aware GAT based on recent interaction items;

S3, combining the static user preferences and the dynamic user preferences, inputting them into a two-channel deep neural network model, and performing feature interaction learning and matching score prediction.
2. The user preference modeling method based on the lightweight graph convolution attention network of claim 1, characterized in that: the method is implemented as a time-aware lightweight graph convolutional attention network LightGCAN, comprising: an input layer, an embedding layer, a representation layer, an interaction layer and an output layer;

the input layer includes two matrices: a user-item interaction matrix $R \in \{0,1\}^{m \times n}$ and an interaction time matrix $T \in \mathbb{R}^{m \times n}$, where m and n represent the number of users and items, respectively; R is an implicit feedback matrix: if there is an interaction between user u and item i then $r_{ui} = 1$, otherwise $r_{ui} = 0$; T records the interaction time between the user and the item by means of a timestamp and has the same dimensions as R; the input layer provides the initial feature representations of the user and the item, $x_u \in \{0,1\}^n$ and $x_i \in \{0,1\}^m$, which are multi-hot vectors corresponding to the u-th row and the i-th column of R, respectively;
the embedding layer is a fully connected layer used to convert the sparse user and item representations into dense latent embeddings, which serve as the input of the representation layer for user preference modeling;

the representation layer includes two GNN models: LightGCN and TGAT, used for static and dynamic user preference modeling, respectively; the resulting static and dynamic user preferences are combined and sent to the interaction layer for high-order feature interaction learning;

the interaction layer includes two DNN models: DMF and MLP, which learn different feature interactions according to different deep learning strategies; finally, the resulting feature interaction vectors are concatenated and passed to the output layer for predicting the user-item matching score.
3. The user preference modeling method based on the lightweight graph convolution attention network of claim 2, characterized in that: the attention elements of the time-aware lightweight graph convolutional attention network LightGCAN include the current user, the target item, the last k interaction items, and the interaction time.

4. The user preference modeling method based on the lightweight graph convolution attention network of claim 2, characterized in that: the hyper-parameters of the time-aware lightweight graph convolutional attention network LightGCAN include the number of latent factors, the number of recent interaction items, and the number of hidden layers in the DL models.

5. The user preference modeling method based on the lightweight graph convolution attention network of claim 2, characterized in that: the number of hidden layers of the DMF and MLP models is set to 3.
6. The user preference modeling method based on the lightweight graph convolution attention network of claim 1, characterized in that: in step S1, the lightweight GCN model LightGCN is used for static user preference modeling; LightGCN retains only the neighbor aggregation operation of GCN and discards the two operations of feature transformation and nonlinear activation.
7. The method of lightweight graph convolution attention network based user preference modeling of claim 6, characterized by: in step S1, the modeling process of the static user preference includes the following steps:
S1.1 lightweight graph convolution: the user and item embeddings at layer $k+1$ are defined by a weighted-sum aggregation:

$$e_u^{(k+1)} = \sum_{i \in N_u} \frac{1}{\sqrt{|N_u|}\sqrt{|N_i|}}\, e_i^{(k)}$$

$$e_i^{(k+1)} = \sum_{u \in N_i} \frac{1}{\sqrt{|N_i|}\sqrt{|N_u|}}\, e_u^{(k)}$$

wherein $e_u^{(k)}$ and $e_i^{(k)}$ denote the embeddings of user $u$ and item $i$ at the $k$-th layer, respectively; $N_u$ and $N_i$ denote the neighbor nodes of user $u$ and item $i$, respectively; $1/(\sqrt{|N_u|}\sqrt{|N_i|})$ is a normalization term; the only parameters that require learning are the first-layer user and item embeddings, from which the embeddings of the deeper layers are obtained automatically, layer by layer, through the above iterative process;

the user and item embeddings of the first layer ($k = 0$) are obtained from the initial feature vectors as:

$$e_u^{(0)} = W_u^{\mathsf{T}} x_u, \qquad e_i^{(0)} = W_v^{\mathsf{T}} x_i$$

wherein $W_u$ and $W_v$ denote the weight matrices that convert the initial feature vectors of users and items into latent embedded representations;
S1.2 layer aggregation: after $K$ layers of lightweight graph convolution, $K$ further user/item embeddings are generated, the embedding of each layer carrying different latent semantic information; the embeddings of all layers are combined to generate the embedding of the target user/item:

$$e_u^{s} = \sum_{k=0}^{K} \alpha_k\, e_u^{(k)}, \qquad e_i^{s} = \sum_{k=0}^{K} \alpha_k\, e_i^{(k)}$$

wherein $e_u^{s}$ and $e_i^{s}$ denote the static user preference and the item feature, respectively, and $\alpha_k \geq 0$ denotes the importance weight of the $k$-th layer embedding.
8. The method of lightweight graph convolution attention network based user preference modeling of claim 7, characterized by: in step S1.2, $\alpha_k$ is set to $1/(K+1)$.
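A minimal sketch of S1.1-S1.2, consistent with the formulas above but not the patented code itself; `adj_norm` is assumed to be the symmetrically normalized $(m+n)\times(m+n)$ user-item adjacency matrix with entries $1/\sqrt{|N_u||N_i|}$, built elsewhere:

```python
# Illustrative lightweight graph convolution with uniform layer aggregation.
import torch

def light_gcn(e0: torch.Tensor, adj_norm: torch.Tensor, K: int) -> torch.Tensor:
    """e0: (m+n, d) learned first-layer user+item embeddings (the only
    trainable parameters). Returns static embeddings aggregated over all
    layers with uniform weights alpha_k = 1/(K+1), as in claim 8."""
    layers = [e0]
    e = e0
    for _ in range(K):
        # neighbor aggregation only: no feature transform, no nonlinearity
        e = torch.sparse.mm(adj_norm, e)
        layers.append(e)
    return torch.stack(layers).mean(dim=0)  # sum_k e^(k) / (K+1)
```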
9. The method of lightweight graph convolution attention network based user preference modeling of claim 1, characterized by: in step S2, the modeling process of the dynamic user preference includes the following steps:
S2.1, the embeddings of the current user, the target item, a recently interacted item and the interaction time are combined and input into an attention network; the attention network is responsible for learning the importance weights of the recently interacted items for dynamic user preference modeling:

$$x_j = [\, e_u \,\|\, e_v \,\|\, e_j \,\|\, e_{t_j} \,]$$

$$\hat{a}_j = W_K\, \sigma\big( W_{K-1} \cdots \sigma(W_1 x_j + b_1) \cdots + b_{K-1} \big) + b_K$$

wherein $W_k$ and $b_k$ denote the weight matrix and bias vector of the $k$-th layer of the attention network, respectively; $K$ denotes the number of layers of the attention network; $\sigma(\cdot)$ denotes the activation function; $e_u$, $e_v$, $e_j$ and $e_{t_j}$ denote the embeddings of the current user, the target item, the recently interacted item $j$ and the interaction time, respectively; $x_j$ concatenates the four embedding vectors into a combined vector that serves as the input of the attention network;
S2.2, the time interval between the interaction time of a historical interaction item and the current time is computed, and the embedded representation of the interaction time is then obtained through a linear transformation:

$$ts_j = \min\big( (T - t_j)/60,\ \delta \big)$$

$$e_{t_j} = W_t\, ts_j$$

wherein $t_j$ denotes the time of the current user's interaction with item $j$; $T$ is the current time at prediction; $ts_j$ denotes the time interval between the interaction time and the prediction time; the min function caps the time interval at the threshold $\delta$; and $W_t$ denotes the transformation matrix of the time embedding;
S2.3, the attention coefficients are normalized by the softmax function:

$$a_j = \frac{\exp(\hat{a}_j)}{\sum_{j' \in RK_u} \exp(\hat{a}_{j'})}$$

wherein $RK_u$ denotes the last $k$ interacted items of user $u$;
S2.4, the dynamic user preference vector is modeled as the weighted sum of the embedded representations of the current user's last $k$ interacted items:

$$e_u^{d} = \sum_{j \in RK_u} a_j\, e_j$$

wherein $e_u^{d}$ and $e_j$ denote the dynamic preference embedding of user $u$ and the embedding of historical interaction item $j$, respectively.
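A minimal sketch of the time-aware attention of S2.1-S2.4; the two-layer attention MLP, layer widths and all names are assumptions for illustration, not the patented implementation:

```python
# Illustrative time-aware attention over the last-k interacted items.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TimeAwareAttention(nn.Module):
    def __init__(self, dim: int, delta: float = 1e6):
        super().__init__()
        self.delta = delta                      # cap on the time interval
        self.time_proj = nn.Linear(1, dim)      # W_t: interval -> time embedding
        self.att = nn.Sequential(               # attention network over x_j
            nn.Linear(4 * dim, dim), nn.ReLU(),
            nn.Linear(dim, 1))

    def forward(self, e_u, e_v, e_items, t_items, t_now):
        # e_u, e_v: (d,); e_items: (k, d) last-k item embeddings; t_items: (k,)
        ts = torch.clamp((t_now - t_items) / 60.0, max=self.delta)   # S2.2
        e_t = self.time_proj(ts.unsqueeze(-1))                       # (k, d)
        k = e_items.size(0)
        x = torch.cat([e_u.expand(k, -1), e_v.expand(k, -1),
                       e_items, e_t], dim=-1)                        # S2.1
        a = F.softmax(self.att(x).squeeze(-1), dim=0)                # S2.3
        return (a.unsqueeze(-1) * e_items).sum(dim=0)                # S2.4
```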
10. The method of lightweight graph convolution attention network based user preference modeling of claim 2, characterized by: the step S3 specifically includes the following steps:
S3.1, the static and dynamic user preference representations are combined by vector concatenation to obtain the user preference representation; the item embedding vectors obtained from the embedding layer and from LightGCN are likewise combined to obtain the item feature representation:

$$e_u = [\, e_u^{s} \,\|\, e_u^{d} \,], \qquad e_i = [\, e_i^{s} \,\|\, e_i^{0} \,]$$

wherein $e_u^{s}$ and $e_i^{s}$ denote the static user preference and item feature, respectively; $e_u^{d}$ denotes the dynamic preference embedding of user $u$; $e_i^{0}$ denotes the base embedding of item $i$ from the embedding layer; $e_u$ and $e_i$ denote the resulting user preference vector and item feature vector, respectively; the generated user and item representations serve as input to the high-order feature interaction learning models DMF and MLP;
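A one-line sketch of the vector concatenation in S3.1 (illustrative; the argument names are assumptions):

```python
import torch

def fuse(e_u_static, e_u_dynamic, e_i_static, e_i_base):
    e_u = torch.cat([e_u_static, e_u_dynamic], dim=-1)  # user preference vector
    e_i = torch.cat([e_i_static, e_i_base], dim=-1)     # item feature vector
    return e_u, e_i
```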
S3.2, feature interaction learning based on DMF: the DMF model has a multi-layer two-channel structure consisting of a user component and an item component; in each channel, the output of the current layer serves as the input of the next layer, and in each layer the input vector is projected to a hidden vector by a linear transformation followed by a nonlinear activation:

$$h_u^{(k)} = \sigma\big( W_u^{(k)} h_u^{(k-1)} + b_u^{(k)} \big)$$

$$h_i^{(k)} = \sigma\big( W_i^{(k)} h_i^{(k-1)} + b_i^{(k)} \big)$$

wherein $h_u^{(k)}$ and $h_i^{(k)}$ denote the hidden representations of user $u$ and item $i$ at the $k$-th layer, respectively; $W_u^{(k)}$ and $b_u^{(k)}$ denote the weight matrix and bias vector of the $k$-th layer of the user component; $W_i^{(k)}$ and $b_i^{(k)}$ denote the weight matrix and bias vector of the $k$-th layer of the item component;

through the iterative learning of the multi-layer DMF model, the user and item representations are mapped into a low-dimensional latent embedding space:

$$p_u = h_u^{(L_1)}, \qquad q_i = h_i^{(L_1)}$$

wherein $L_1$ denotes the number of layers of the DMF model, and $p_u$ and $q_i$ denote the learned latent representations of user $u$ and item $i$, respectively;

the user-item feature interaction is defined as the element-wise product of the user and item latent representation vectors:

$$\phi^{DMF} = p_u \odot q_i$$

wherein $\phi^{DMF}$ denotes the high-order feature interaction vector learned by the DMF model;
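A minimal sketch of the two-channel DMF branch of S3.2; the layer widths are assumptions (with three hidden layers, as in claim 5), and the class is illustrative rather than the patented code:

```python
# Illustrative two-tower DMF interaction learner.
import torch
import torch.nn as nn

def tower(dims):
    layers = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU()]
    return nn.Sequential(*layers)

class DMF(nn.Module):
    def __init__(self, user_dim, item_dim, hidden=(128, 64, 32)):
        super().__init__()
        self.user_tower = tower((user_dim,) + hidden)  # user component
        self.item_tower = tower((item_dim,) + hidden)  # item component

    def forward(self, e_u, e_i):
        p_u = self.user_tower(e_u)   # latent user representation
        q_i = self.item_tower(e_i)   # latent item representation
        return p_u * q_i             # element-wise product: phi_DMF
```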
S3.3, feature interaction learning based on MLP: the MLP is a typical deep learning model that first combines the feature vectors of the user and the item and then learns the high-order user-item feature interactions through multiple hidden layers stacked on top:

$$z_0 = [\, e_u \,\|\, e_i \,]$$

$$z_k = \alpha_k\big( W_k z_{k-1} + b_k \big), \quad k = 1, \dots, L_2$$

$$\phi^{MLP} = H_1^{\mathsf{T}} z_{L_2}$$

wherein $W_k$, $b_k$ and $\alpha_k$ denote the weight matrix, bias vector and activation function of the $k$-th layer, respectively; $H_1$ denotes the weight matrix of the output layer; $L_2$ denotes the number of layers of the model; and $\phi^{MLP}$ denotes the high-order feature interaction vector learned by the MLP model;
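A minimal sketch of the MLP branch of S3.3, under the same assumptions (illustrative widths, three hidden layers):

```python
# Illustrative MLP interaction learner over the concatenated features.
import torch
import torch.nn as nn

class InteractionMLP(nn.Module):
    def __init__(self, user_dim, item_dim, hidden=(128, 64, 32)):
        super().__init__()
        dims = (user_dim + item_dim,) + hidden
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.ReLU()]
        self.net = nn.Sequential(*layers)

    def forward(self, e_u, e_i):
        z0 = torch.cat([e_u, e_i], dim=-1)  # z_0 = [e_u || e_i]
        return self.net(z0)                 # phi_MLP
```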
S3.4, matching score prediction: DMF and MLP are first run separately; the output vectors of the two models are then combined by vector concatenation; finally, the combined embedding vector is fed into the output layer of LightGCAN for matching score prediction:

$$\hat{y}_{ui} = \sigma\big( H_2^{\mathsf{T}}\, [\, \phi^{DMF} \,\|\, \phi^{MLP} \,] \big)$$

wherein $H_2$ denotes the weight matrix of the output layer;
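A minimal sketch of the fusion and output layer of S3.4 (illustrative; the class name and dimensions are assumptions):

```python
# Illustrative output layer: concatenate both interaction vectors and map
# them to a matching score in (0, 1).
import torch
import torch.nn as nn

class OutputLayer(nn.Module):
    def __init__(self, dmf_dim, mlp_dim):
        super().__init__()
        self.H2 = nn.Linear(dmf_dim + mlp_dim, 1)  # output weight matrix H_2

    def forward(self, phi_dmf, phi_mlp):
        fused = torch.cat([phi_dmf, phi_mlp], dim=-1)
        return torch.sigmoid(self.H2(fused)).squeeze(-1)  # y_hat
```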
S3.5, model training: LightGCAN is a CF model based on implicit feedback information; binary cross-entropy loss is used as the objective function to minimize the difference between the predicted matching scores and the implicit feedback:

$$L(\Theta) = -\sum_{(u,i) \in R^{+} \cup R^{-}} \Big( r_{ui} \log \hat{y}_{ui} + (1 - r_{ui}) \log\big(1 - \hat{y}_{ui}\big) \Big)$$

wherein $\hat{y}_{ui}$ is the predicted matching score between user $u$ and item $i$; $r_{ui}$ is the implicit feedback observed in the interaction matrix $R$; $R^{+}$ and $R^{-}$ denote the positive and negative sample sets, respectively; and $\Theta$ denotes the parameters of the model.
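A minimal sketch of the training objective in S3.5 (illustrative; the model and the sampling of the negative set $R^{-}$ are assumed to be defined elsewhere):

```python
# Illustrative binary cross-entropy objective over R+ and sampled R-.
import torch
import torch.nn.functional as F

def lightgcan_loss(y_pred: torch.Tensor, r_ui: torch.Tensor) -> torch.Tensor:
    """y_pred: predicted matching scores in (0, 1); r_ui: 0/1 implicit labels."""
    return F.binary_cross_entropy(y_pred, r_ui)

# typical step: loss = lightgcan_loss(model(u, i), labels); loss.backward()
```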
CN202211044168.3A 2022-08-30 2022-08-30 User preference modeling method based on lightweight graph convolution attention network Pending CN115438258A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211044168.3A CN115438258A (en) 2022-08-30 2022-08-30 User preference modeling method based on lightweight graph convolution attention network

Publications (1)

Publication Number Publication Date
CN115438258A true CN115438258A (en) 2022-12-06

Family

ID=84244994

Country Status (1)

Country Link
CN (1) CN115438258A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235375A (en) * 2023-07-20 2023-12-15 重庆理工大学 User multi-behavior recommendation method based on graph neural network and meta-learning


Similar Documents

Publication Publication Date Title
CN111931062B (en) Training method and related device of information recommendation model
Liu et al. Contextualized graph attention network for recommendation with item knowledge graph
TWI754033B (en) Generating document for a point of interest
US11074289B2 (en) Multi-modal visual search pipeline for web scale images
CN106663124B (en) Generating and using knowledge-enhanced models
CN111881342A (en) Recommendation method based on graph twin network
CN113722611B (en) Recommendation method, device and equipment for government affair service and computer readable storage medium
Tian et al. When multi-level meets multi-interest: A multi-grained neural model for sequential recommendation
Yang et al. Microblog sentiment analysis via embedding social contexts into an attentive LSTM
CN113378938B (en) Edge transform graph neural network-based small sample image classification method and system
CA3004097A1 (en) Methods and systems for investigation of compositions of ontological subjects and intelligent systems therefrom
Liao et al. Coronavirus pandemic analysis through tripartite graph clustering in online social networks
CN110598084A (en) Object sorting method, commodity sorting device and electronic equipment
CN114065048A (en) Article recommendation method based on multiple heterogeneous graph neural networks
CN111506821A (en) Recommendation model, method, device, equipment and storage medium
Mahmood et al. Using artificial neural network for multimedia information retrieval
CN115438258A (en) User preference modeling method based on lightweight graph convolution attention network
Tsinganos et al. Utilizing convolutional neural networks and word embeddings for early-stage recognition of persuasion in chat-based social engineering attacks
Liu et al. Joint user profiling with hierarchical attention networks
CN111079011A (en) Deep learning-based information recommendation method
Xu et al. Towards annotating media contents through social diffusion analysis
Zakir et al. Convolutional neural networks method for analysis of e-commerce customer reviews
CN117688390A (en) Content matching method, apparatus, computer device, storage medium, and program product
CN115408605A (en) Neural network recommendation method and system based on side information and attention mechanism
Liu POI recommendation model using multi-head attention in location-based social network big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination