CN114187077A - Sequence recommendation method based on edge-enhanced global decoupling graph neural network - Google Patents
- Publication number
- CN114187077A (application number CN202111603830.XA)
- Authority
- CN
- China
- Prior art keywords
- item
- sequence
- decoupling
- neural network
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a sequence recommendation method based on an edge-enhanced global decoupling graph neural network, which comprises the following steps. Input layer: take all items in the training data as nodes and the user-item interaction sequences as edges, construct a global link graph, and input it to a graph neural network. Decoupling learning layer: aggregate the influence probabilities of a preset number of factors from all neighbor items of an item v_i, and aggregate, in interaction-time order, the influence probabilities of a preset number of factors from the items the target user interacted with before item v_i. Prediction layer: sum the global-level and local-level decoupled item representations, and multiply by the initial embedded representation of each candidate item to obtain the probability that the candidate appears as the target user's next interactive item after item v_i. The graph neural network is trained iteratively. Among the prediction outputs of the trained graph neural network, the item with the highest probability is the item recommended to the target user. The method perceives shifts in user intent and recommends products to users more accurately.
Description
Technical Field
The invention relates to the technical field of sequence recommendation. More specifically, the invention relates to a sequence recommendation method based on an edge-enhanced global decoupling graph neural network.
Background
Recommendation systems play a crucial role in the rapidly developing internet age, and sequence recommendation is one of their important components. Sequence recommendation models user behavior as a sequence of items rather than a set of items. The Markov chain is a classical method that models short-term item transitions to predict the next item a user may like. With the development of deep learning, recurrent neural networks have succeeded in sequence recommendation. For example, the long short-term memory network is a common variant of the recurrent neural network that enhances the model's ability to retain sequence information through memory units. GRU4REC applies gated recurrent units to session-based recommendation in parallel mini-batches. However, recurrent-network-based approaches have difficulty retaining long-range information, so self-attention networks have recently been applied to sequence recommendation to capture both long-term and short-term dependencies.
However, these previous works model user intent through the historical sequence of sequential interactions, ignoring the dynamic latent relationships behind the items. The edges linking pairs of items contain rich semantic information that explains why and how a user selects one item after another. These latent factors all relate to real-world concepts, and one factor often dominates in a single instance. For example, assume two users interact with six items, as shown in FIG. 2. The link graph shows that item 2 is adjacent to all five other items, but these link edges are intuitively driven by different factors. Item 2 is linked to items 1 and 4 because they are the same color, while items 5 and 6 are both short-sleeved. Item 3 is connected to item 2 because it can be worn over it as an outer layer. These different factors reveal the intent transitions behind user behavior as well as the shared characteristics of paired items. It is therefore desirable to identify and distinguish these latent factors.
Decoupled (disentangled) representation learning is popular in many areas of computer vision and has recently been applied to recommendation systems. Its general purpose is to separate the unique and informative factors of variation in data, where each unit is associated with a single real-world concept, so that a change in one factor results in a change only in the related units. The best-known decoupled-representation networks are β-VAE and InfoGAN. On the recommendation side, the macro-VAE infers high-level concepts of user intent at the macro level and applies VAEs to enhance decoupling at the micro level. The authors also propose a self-supervised seq2seq training strategy for sequence recommendation that generates sub-sequences using an intent-decoupling encoder and compares user intent between the two sub-sequences. However, these studies neither consider the link-relationship patterns of items nor distinguish the different user intents behind a sequence; the learned sequence model is therefore sensitive to noisy data and lacks interpretability.
Disclosure of Invention
An object of the present invention is to solve at least the above problems and to provide at least the advantages described later.
The invention also aims to provide a sequence recommendation method based on the edge-enhanced global decoupling graph neural network, which can identify the latent intent behind transitions between a user's interacted items, thereby recommending products to users accurately and improving transaction speed and transaction success rate.
To achieve these objects and other advantages in accordance with the purpose of the invention, there is provided a method for sequence recommendation based on an edge-enhanced global decoupling graph neural network, comprising:
an input layer: taking all items in the training data as nodes and the user-item interaction sequences as edges, constructing a global link graph, and inputting the global link graph to a graph neural network;
a decoupling learning layer: at the global level, adopting a decoupling representation method based on the graph neural network to aggregate the influence probabilities of a preset number of factors from all neighbor items of item v_i, obtaining a global-level decoupled item representation of item v_i;
at the local level, first preprocessing the target user's item interaction sequence with a sequential neural network, and then adopting a decoupling representation method based on the graph neural network to aggregate, in interaction-time order, the influence probabilities of a preset number of factors from the L-1 items preceding the target user's item v_i, obtaining the target user's local-level decoupled item representation of item v_i;
a prediction layer: summing the global-level decoupled item representation of item v_i and the target user's local-level decoupled item representation of item v_i, multiplying by the initial embedded representation of each candidate item, and obtaining the probability that the candidate appears as the target user's next interactive item after item v_i;
iterative training: setting a training target, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network;
and arranging the output results of the prediction layer of the trained graph neural network in descending order, wherein the item with the highest probability is the item recommended to the target user.
Preferably, the global-level decoupling representation method includes: presetting that item v_i is influenced by K factors of its neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; calculating, based on a channel-aware mechanism, the influence probability of each of the K influencing factors of every neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the item representation of item v_i, obtaining z_i^g.
Preferably, the influence probability of neighbor item v_j on item v_i is calculated using Formula 1:

p_k^(i,j) = exp( σ(W_k^T h_i)^T σ(W_k^T h_j) ) / Σ_{k'=1..K} exp( σ(W_k'^T h_i)^T σ(W_k'^T h_j) )    (Formula 1)

wherein p_k^(i,j) represents the probability that items v_i and v_j are adjacently linked and that item v_j influences item v_i through factor k; W_k ∈ R^(d_in × d_channel) represents the parameter of channel k, and W_k' the parameter of channel k'; d_in represents the dimension of the input item embedding and d_channel the dimension of each channel's embedded representation; h_i and h_j represent the initial embedded representations of items v_i and v_j respectively; σ represents a non-linear activation function; and W_k^T and W_k'^T respectively denote the transposes of the matrices W_k and W_k'.
Preferably, Formula 2 is used to accumulate the influence probability of the k-th influencing factor of the neighborhood of item v_i into the item representation of item v_i:

z_i^g(k) = σ(W_k^T h_i) + Σ_{v_j ∈ N(v_i)} p_k^(i,j) σ(W_k^T h_j)    (Formula 2)

wherein z_i^g(k) represents the item representation of item v_i after accumulating the influence probability of the k-th influencing factor of its neighborhood, and N(v_i) represents the neighborhood of item v_i, i.e., the set of all its neighbor items;

the K channels are combined using Formula 3:

z_i^g = [ z_i^g(1) ; z_i^g(2) ; ... ; z_i^g(K) ]    (Formula 3)

wherein z_i^g represents the item representation of item v_i after accumulating the influence probabilities of the K channels, i.e., the global-level decoupled representation of item v_i.
Preferably, the local-level decoupling representation method includes position embedding of the item sequence, specifically:

selecting an item v_i that target user u interacted with at a given moment, forming an item sequence from item v_i and the L-1 items interacted with before it, embedding the position of each item in the item sequence in sequence order, inputting the item sequence into a self-attention encoder for encoding, and aggregating the position information into the item representations of the item sequence;

the position-embedded item sequence is represented by Formula 4:

H = [ h_1 + p_1 ; h_2 + p_2 ; ... ; h_L + p_L ]    (Formula 4)

wherein p_1 ~ p_L represent the embedded representations of the L positions, h_1 ~ h_L represent the initial embeddings of the L items, and H represents the item sequence representation with positions embedded;
the output of the single-head self-attention encoder is calculated using Formula 5:

D = softmax( (H W^Q)(H W^K)^T / sqrt(d_in) ) (H W^V)    (Formula 5)

wherein D represents the output of the single-head self-attention network, W^Q, W^K and W^V all represent parameters of the self-attention network, and softmax is the normalized exponential function;

the representation h^s of the item sequence after the single-head self-attention network is calculated by Formula 6:

h^s = D + H    (Formula 6).
Preferably, the position-embedded item sequence is processed by a variational autoencoder, specifically:

the item sequence representation h^s output by the self-attention encoder is input into the variational autoencoder for encoding, and item representations of the item sequence that obey a normal distribution are output;

the calculation uses Formula 7, Formula 8 and Formula 9:

μ_v = l_1(h^s)    (Formula 7)
σ_v = l_2(h^s)    (Formula 8)
z_v = μ_v + σ_v ⊙ ε    (Formula 9)

wherein l_1 and l_2 represent linear transformation functions, z_v represents the item sequence representation after the variational autoencoder network, μ_v represents the mean of the Gaussian distribution, σ_v represents its standard deviation, and ε ~ N(0, I).
Preferably, the aggregation of the influence probability of the k-th influencing factor of each item in the item sequence on item v_i is calculated using Formula 10:

z_i^l(k) = σ(W_k^T z_L) + Σ_{j=1..L-1} p_k^(i,j) σ(W_k^T z_j)    (Formula 10)

wherein z_i^l(k) represents the item representation after accumulating the influence probabilities of the k-th influencing factor of the L-1 items, and z_j denotes the representation of the j-th item in the sequence output by the variational autoencoder;

the z_i^l(k) of the K channels are combined using Formula 11:

z_i^l = [ z_i^l(1) ; z_i^l(2) ; ... ; z_i^l(K) ]    (Formula 11)

wherein z_i^l represents the combination of the accumulated influence probabilities of the K channels for item v_i, i.e., the local-level decoupled representation of item v_i.
Preferably, Formula 12 is used to calculate the probability of an item appearing as the target user's next interactive item after item v_i:

ŷ_i = softmax( z_L^T h_i )    (Formula 12)

wherein ŷ_i indicates the probability of the candidate appearing as the next interactive item after item v_i for the target user, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i represents the initial embedded representation of a candidate item, and softmax is the normalized exponential function.
Preferably, with the evidence lower bound as the training target, Formula 13 is used for calculation:

L = Σ_i [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ] − KL( q_φ(z|y) ‖ p(z) )    (Formula 13)

wherein the first term represents the reconstruction error and KL(q_φ(z|y) ‖ p(z)) represents the KL divergence; ŷ_i represents the probability of item v_i appearing as the next interactive item in the target user's sequence, and y_i indicates whether item v_i actually appears as the next interactive item in the target user's sequence: y_i takes the value 1 if it actually appears, and 0 otherwise.
The invention has at least the following beneficial effects: the invention provides an edge-enhanced global decoupling graph neural network model for capturing the link information of items. Item representations and user intent are modeled at both the global and local levels. On one hand, a global item link graph is constructed over all sequences, and a channel-aware mechanism decomposes the item edges into multiple channels, each corresponding to one influencing factor. Each channel extracts factor-specific features from an item's neighbors, and the different factors are jointly aggregated into the target item. On the other hand, a decoupled sequence representation is modeled on the current sequence: latent variables are first inferred as Gaussian distributions for decoupled representation learning, following the statistical perspective of the variational autoencoder, and a graph neural network is then applied to aggregate the item information of the preceding items and model user intent. The method can therefore identify the latent intent behind transitions between a user's interacted items, recommend products to users accurately, and improve transaction speed and transaction success rate.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is an architecture diagram of the sequence recommendation model according to one embodiment of the present invention;
fig. 2 is a partial example of a global link diagram according to one embodiment of the present invention.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
It is to be noted that the experimental methods described in the following embodiments are all conventional methods unless otherwise specified, and the reagents and materials, if not otherwise specified, are commercially available; in the description of the present invention, the terms indicating orientation or positional relationship are based on the orientation or positional relationship shown in the drawings only for the convenience of description and simplification of description, and do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
As shown in fig. 1-2, the invention provides a sequence recommendation method based on an edge-enhanced global decoupling graph neural network, which includes:
an input layer: taking all items in the training data as nodes and the user-item interaction sequences as edges, constructing a global link graph, and inputting the global link graph to a graph neural network;
a decoupling learning layer: at the global level, adopting a decoupling representation method based on the graph neural network to aggregate the influence probabilities of a preset number of factors from all neighbor items of item v_i, obtaining a global-level decoupled item representation of item v_i;
at the local level, first preprocessing the target user's item interaction sequence with a sequential neural network, and then adopting a decoupling representation method based on the graph neural network to aggregate, in interaction-time order, the influence probabilities of a preset number of factors from the L-1 items preceding the target user's item v_i, obtaining the target user's local-level decoupled item representation of item v_i;
a prediction layer: summing the global-level decoupled item representation of item v_i and the target user's local-level decoupled item representation of item v_i, multiplying by the initial embedded representation of each candidate item, and obtaining the probability that the candidate appears as the target user's next interactive item after item v_i;
iterative training: setting a training target, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network;
and arranging the output results of the prediction layer of the trained graph neural network in descending order, wherein the item with the highest probability is the item recommended to the target user.
In this technical solution, by establishing the relations between users and items in the training data on the basis of the interaction sequences, the global link graph exposes the items and determines each item's neighborhood information, providing an information basis for the later mining of interaction intent. The interaction intent between items is then learned at two levels by decoupled learning. On one hand, through an item's neighborhood in the global link graph, the influence probabilities of multiple factors are mined and aggregated from all of the item's neighbor items, i.e., the influence that the neighborhood exerts on the item through each factor, so as to find the factor with the largest influence. On the other hand, the information of the target user's item interaction sequence within a recent time period is processed by the sequential neural network, and the decoupled representation method of the graph neural network then learns the correlations among the target user's recently interacted items, so that the influence of each factor from the target user's recent interactions can be aggregated and the factor with the largest influence found.

Based on the mining of latent intent in the item interaction sequences by these two steps, the most likely next interactive item can be predicted and recommended to the target user, significantly increasing recommendation accuracy and the likelihood of promoting a transaction.
The sequence recommendation method disclosed by the invention has a particularly prominent effect in product recommendation of shopping websites, and the speed and success rate of successful transaction are remarkably improved.
In another technical solution, the global-level decoupling representation method includes: presetting that item v_i is influenced by K factors of its neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; calculating, based on a channel-aware mechanism, the influence probability of each of the K influencing factors of every neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the item representation of item v_i, obtaining z_i^g.
In the above technical solution, the number of influencing factors underlying given training data is unknown in advance, though some definite number exists in reality. Iterative training is therefore performed with a preset number of influencing factors; if the training result is poor, the number of influencing factors can be changed and training repeated, until a number suitable for the given training data is obtained.
In another technical solution, the influence probability of neighbor item v_j on item v_i is calculated using Formula 1:

p_k^(i,j) = exp( σ(W_k^T h_i)^T σ(W_k^T h_j) ) / Σ_{k'=1..K} exp( σ(W_k'^T h_i)^T σ(W_k'^T h_j) )    (Formula 1)

wherein p_k^(i,j) represents the probability that items v_i and v_j are adjacently linked and that item v_j influences item v_i through factor k; W_k ∈ R^(d_in × d_channel) represents the parameter of channel k, and W_k' the parameter of channel k'; d_in represents the dimension of the input item embedding and d_channel the dimension of each channel's embedded representation; h_i and h_j represent the initial embedded representations of items v_i and v_j respectively; σ represents a non-linear activation function; and W_k^T and W_k'^T respectively denote the transposes of the matrices W_k and W_k'.

In the above technical solution, the influence of each neighbor item on item v_i can be calculated through Formula 1, quantifying the influencing factors.
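As an illustration, the channel-aware influence probability of Formula 1 can be sketched as follows. This is a hedged reconstruction, not the patent's implementation: the softmax-over-channels form, the tanh activation, and all identifiers are assumptions.

```python
import numpy as np

def channel_influence_probs(h_i, h_j, W, activation=np.tanh):
    # W: (K, d_in, d_channel) channel parameters; h_i, h_j: (d_in,)
    # Project both item embeddings into each of the K factor channels.
    zi = activation(np.einsum('kdc,d->kc', W, h_i))   # (K, d_channel)
    zj = activation(np.einsum('kdc,d->kc', W, h_j))   # (K, d_channel)
    # Per-channel similarity, normalised over channels with softmax,
    # giving the probability that factor k drives the edge (v_i, v_j).
    scores = np.sum(zi * zj, axis=1)                  # (K,)
    scores = scores - scores.max()                    # numerical stability
    return np.exp(scores) / np.exp(scores).sum()      # (K,), sums to 1

rng = np.random.default_rng(0)
K, d_in, d_channel = 4, 16, 8
W = rng.normal(size=(K, d_in, d_channel))
p = channel_influence_probs(rng.normal(size=d_in), rng.normal(size=d_in), W)
```

The channel with the largest entry of p is the factor assumed to dominate this particular item pair.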
In another technical solution, Formula 2 is used to accumulate the influence probability of the k-th influencing factor of the neighborhood of item v_i into the item representation of item v_i:

z_i^g(k) = σ(W_k^T h_i) + Σ_{v_j ∈ N(v_i)} p_k^(i,j) σ(W_k^T h_j)    (Formula 2)

wherein z_i^g(k) represents the item representation of item v_i after accumulating the influence probability of the k-th influencing factor of its neighborhood, and N(v_i) represents the neighborhood of item v_i, i.e., the set of all its neighbor items;

the K channels are combined using Formula 3:

z_i^g = [ z_i^g(1) ; z_i^g(2) ; ... ; z_i^g(K) ]    (Formula 3)

wherein z_i^g represents the item representation of item v_i after accumulating the influence probabilities of the K channels, i.e., the global-level decoupled representation of item v_i.

In the above technical solution, the influences of the multiple influencing factors of all neighbor items can be aggregated into item v_i through Formula 2 and Formula 3, achieving the purpose of global-level decoupled learning for item v_i.
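The neighborhood aggregation of Formulas 2 and 3 can be sketched as below; the additive per-channel update and the concatenation across channels follow the text, while the exact update rule and all names are assumptions made for the sketch.

```python
import numpy as np

def aggregate_global(h, neighbors, W, i, activation=np.tanh):
    # h: (N, d_in) initial item embeddings; W: (K, d_in, d_channel)
    K = W.shape[0]
    # Each channel starts from v_i's own projection (first term of Formula 2).
    z = [activation(h[i] @ W[k]) for k in range(K)]
    zi = np.stack(z)
    for j in neighbors[i]:
        zj = np.stack([activation(h[j] @ W[k]) for k in range(K)])
        # Channel probabilities for the edge (v_i, v_j), as in Formula 1.
        s = (zi * zj).sum(axis=1)
        s = s - s.max()
        p = np.exp(s) / np.exp(s).sum()
        for k in range(K):
            z[k] = z[k] + p[k] * zj[k]        # weighted neighbour message
    # Formula 3: concatenate the K channel vectors into z_i^g.
    return np.concatenate(z)

rng = np.random.default_rng(1)
K, d_in, d_channel, N = 3, 8, 4, 5
h = rng.normal(size=(N, d_in))
W = rng.normal(size=(K, d_in, d_channel))
z_g = aggregate_global(h, {0: [1, 2, 3]}, W, 0)   # item 0 with 3 neighbours
```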
In another technical solution, the local-level decoupling representation method includes position embedding of the item sequence, specifically:

selecting an item v_i that target user u interacted with at a given moment, forming an item sequence from item v_i and the L-1 items interacted with before it, embedding the position of each item in the item sequence in sequence order, inputting the item sequence into a self-attention encoder for encoding, and aggregating the position information into the item representations of the item sequence;

the position-embedded item sequence is represented by Formula 4:

H = [ h_1 + p_1 ; h_2 + p_2 ; ... ; h_L + p_L ]    (Formula 4)

wherein p_1 ~ p_L represent the embedded representations of the L positions, h_1 ~ h_L represent the initial embeddings of the L items, and H represents the item sequence representation with positions embedded;
the output of the single-head self-attention encoder is calculated using Formula 5:

D = softmax( (H W^Q)(H W^K)^T / sqrt(d_in) ) (H W^V)    (Formula 5)

wherein D represents the output of the single-head self-attention network, W^Q, W^K and W^V all represent parameters of the self-attention network, and softmax is the normalized exponential function;

the representation h^s of the item sequence after the single-head self-attention network is calculated by Formula 6:

h^s = D + H    (Formula 6).

In this technical solution, the target user's recent item interaction information is propagated into the current item representation through the self-attention encoder, combined with the position embeddings, so that the information of the target user's item interaction sequence is fully represented, providing an information basis for subsequent decoupled learning.
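The position-embedding and single-head self-attention steps (Formulas 4 to 6) can be sketched as follows; the 1/sqrt(d) scaling is the usual self-attention convention and, like the parameter names, an assumption of the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encode_sequence(h, p, WQ, WK, WV):
    # h, p: (L, d) item and position embeddings; WQ/WK/WV: (d, d)
    H = h + p                                        # Formula 4: add positions
    Q, Km, V = H @ WQ, H @ WK, H @ WV
    D = softmax(Q @ Km.T / np.sqrt(H.shape[1])) @ V  # Formula 5
    return D + H                                     # Formula 6: residual h^s

rng = np.random.default_rng(2)
L, d = 6, 8
h_s = encode_sequence(rng.normal(size=(L, d)), rng.normal(size=(L, d)),
                      *(rng.normal(size=(d, d)) for _ in range(3)))
```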
In another technical solution, the variational autoencoder is used to process the item sequence with position information, specifically:

the item sequence representation h^s output by the self-attention encoder is input into the variational autoencoder for encoding, and item representations of the item sequence that obey a normal distribution are output;

the calculation uses Formula 7, Formula 8 and Formula 9:

μ_v = l_1(h^s)    (Formula 7)
σ_v = l_2(h^s)    (Formula 8)
z_v = μ_v + σ_v ⊙ ε    (Formula 9)

wherein l_1 and l_2 represent linear transformation functions, z_v represents the item sequence representation after the variational autoencoder network, μ_v represents the mean of the Gaussian distribution, σ_v represents its standard deviation, and ε ~ N(0, I).

In this technical solution, the item representations are inferred as a Gaussian distribution through the variational autoencoder, which introduces uncertainty modeling into the system and further improves the independence between the dimensions of the item representations in decoupled representation learning.
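The reparameterisation of Formulas 7 to 9 can be sketched as below; modelling l_1 and l_2 as plain matrix multiplications and keeping the standard deviation positive via softplus are assumptions made for the sketch.

```python
import numpy as np

def reparameterize(h_s, W_mu, W_sigma, rng):
    # h_s: (L, d) encoder output; W_mu, W_sigma: (d, d) linear maps l1, l2
    mu = h_s @ W_mu                           # Formula 7: mu_v = l1(h^s)
    sigma = np.log1p(np.exp(h_s @ W_sigma))   # Formula 8, softplus keeps it > 0
    eps = rng.normal(size=mu.shape)           # eps ~ N(0, I)
    return mu + sigma * eps                   # Formula 9: z_v

rng = np.random.default_rng(3)
L, d = 6, 8
z_v = reparameterize(rng.normal(size=(L, d)),
                     rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng)
```

Sampling through mu and sigma rather than from the distribution directly keeps the noise term outside the computation graph, which is what makes the Gaussian latent trainable by gradient descent.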
In another technical solution, the aggregation of the influence probability of the k-th influencing factor of each item in the item sequence on item v_i is calculated using Formula 10:

z_i^l(k) = σ(W_k^T z_L) + Σ_{j=1..L-1} p_k^(i,j) σ(W_k^T z_j)    (Formula 10)

wherein z_i^l(k) represents the item representation after accumulating the influence probabilities of the k-th influencing factor of the L-1 items, and z_j denotes the representation of the j-th item in the sequence output by the variational autoencoder;

the z_i^l(k) of the K channels are combined using Formula 11:

z_i^l = [ z_i^l(1) ; z_i^l(2) ; ... ; z_i^l(K) ]    (Formula 11)

wherein z_i^l represents the combination of the accumulated influence probabilities of the K channels for item v_i, i.e., the local-level decoupled representation of item v_i.

In this technical solution, local intent mining can be performed for the target user through the decoupled learning method: the influences of multiple influencing factors on the target user's recently interacted items are learned in a decoupled manner according to the interaction order and aggregated into item v_i, achieving the purpose of local-level decoupled learning of item v_i for the target user.
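The local-level aggregation of Formulas 10 and 11 can be sketched in the same shape as the global one; treating the last item of the sequence as v_i and aggregating the previous L-1 items is an assumption consistent with the text, and all names are illustrative.

```python
import numpy as np

def aggregate_local(z_seq, W, activation=np.tanh):
    # z_seq: (L, d) item representations from the VAE; W: (K, d, d_channel)
    K = W.shape[0]
    # Channels of the current item v_i (taken as the last item, z_L).
    zi = np.stack([activation(z_seq[-1] @ W[k]) for k in range(K)])
    out = [zi[k].copy() for k in range(K)]
    for z_j in z_seq[:-1]:                       # the previous L-1 items
        zj = np.stack([activation(z_j @ W[k]) for k in range(K)])
        s = (zi * zj).sum(axis=1)
        s = s - s.max()
        p = np.exp(s) / np.exp(s).sum()          # channel probabilities
        for k in range(K):
            out[k] = out[k] + p[k] * zj[k]       # Formula 10
    return np.concatenate(out)                   # Formula 11: z_i^l

rng = np.random.default_rng(5)
z_l = aggregate_local(rng.normal(size=(5, 8)), rng.normal(size=(3, 8, 4)))
```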
In another technical solution, Formula 12 is used to calculate the probability of an item appearing as the target user's next interactive item after item v_i:

ŷ_i = softmax( z_L^T h_i )    (Formula 12)

wherein ŷ_i indicates the probability of the candidate appearing as the next interactive item after item v_i for the target user, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i represents the initial embedded representation of a candidate item, and softmax is the normalized exponential function.

In the above technical solution, the candidate items are in fact all items in the training data; through Formula 12, the probability of each candidate item being a target user's next interaction can be calculated, and the item with the highest probability represents the most likely next interactive item.
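Scoring all candidates as in Formula 12 can be sketched as follows; using the inner product of z_L with every candidate's initial embedding follows the text, while the variable names are assumptions.

```python
import numpy as np

def next_item_probs(z_L, item_embeddings):
    # z_L: (d,) final sequence representation; item_embeddings: (N, d)
    scores = item_embeddings @ z_L                 # z_L^T h_i per candidate
    scores = scores - scores.max()                 # numerical stability
    return np.exp(scores) / np.exp(scores).sum()   # Formula 12 softmax

rng = np.random.default_rng(4)
probs = next_item_probs(rng.normal(size=8), rng.normal(size=(10, 8)))
recommended = int(np.argmax(probs))                # highest-probability item
```

Sorting probs in descending order reproduces the prediction-layer output described above, with the argmax being the recommended item.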
In another technical solution, with the evidence lower bound as the training target, Formula 13 is used for calculation:

L = Σ_i [ y_i log ŷ_i + (1 − y_i) log(1 − ŷ_i) ] − KL( q_φ(z|y) ‖ p(z) )    (Formula 13)

wherein the first term represents the reconstruction error and KL(q_φ(z|y) ‖ p(z)) represents the KL divergence; ŷ_i represents the probability of item v_i appearing as the next interactive item in the target user's sequence, and y_i indicates whether item v_i actually appears as the next interactive item in the target user's sequence: y_i takes the value 1 if it actually appears, and 0 otherwise.

In this technical solution, using the evidence lower bound as the training target enables the graph neural network to be trained faster and closer to the optimum, avoids overfitting, and improves prediction accuracy.
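A minimal sketch of the training objective of Formula 13, assuming binary cross-entropy as the reconstruction error and the closed-form KL divergence against a standard normal prior (consistent with the Gaussian latent of Formulas 7 to 9); the loss returned is the negative evidence lower bound to be minimised.

```python
import numpy as np

def elbo_loss(y_true, y_pred, mu, sigma):
    eps = 1e-12
    # Reconstruction error: binary cross-entropy over candidate items.
    recon = -np.sum(y_true * np.log(y_pred + eps)
                    + (1.0 - y_true) * np.log(1.0 - y_pred + eps))
    # KL( N(mu, sigma^2) || N(0, I) ), closed form for diagonal Gaussians.
    kl = 0.5 * np.sum(mu**2 + sigma**2 - 2.0 * np.log(sigma + eps) - 1.0)
    return recon + kl

# Toy check: three candidates, the second is the true next item; the
# latent exactly matches the prior, so the KL term vanishes.
loss = elbo_loss(np.array([0.0, 1.0, 0.0]), np.array([0.2, 0.7, 0.1]),
                 np.zeros(4), np.ones(4))
```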
In another technical solution, the method further comprises regularizing the global-level and local-level decoupled representations with the L2 norm; regularizing the data ensures numerical stability.
Specifically, the method comprises the following steps:
fig. 1 shows a model architecture of an entire edge-enhanced global decoupling graph neural network.
1. Problem definition
Given M users and N items, we denote the user set as U = {u_1, u_2, ..., u_M} and the item set as V = {v_1, v_2, ..., v_N}. For each user, S^u = [v_1^u, v_2^u, ..., v_t^u] represents the sequential behavior of user u interacting with items. Given the historical sequence at time t, the sequence recommendation model aims to predict the next item that the user may be interested in at time t+1.

In the inventive method, a global-level link graph is provided to capture item link transition information. The global link graph is defined as G = <V, E>, where V is the set of all items in the training data and E is the set of edges. Each edge <v_i, v_j> ∈ E indicates that a user interacted with v_j immediately after v_i. N(v_i) denotes the neighborhood of item v_i, i.e., the items adjacent to v_i in the sequences.
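Building the global link graph from interaction sequences can be sketched as follows; making the neighborhood undirected is an assumption, since the text only states that adjacent items in a sequence are linked.

```python
from collections import defaultdict

def build_global_graph(sequences):
    """Items are nodes; each consecutive pair (v_i, v_j) in any user's
    interaction sequence adds an edge, so neighbors[v] collects N(v)
    across all users."""
    neighbors = defaultdict(set)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):   # consecutive interactions -> edge
            neighbors[a].add(b)
            neighbors[b].add(a)          # undirected neighbourhood (assumed)
    return neighbors

# Two toy user sequences over items 1..4.
g = build_global_graph([[1, 2, 3], [2, 4]])
```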
2. Channel awareness mechanism
Given a history sequence S^u, h_i denotes the latent intent behind user u's interaction with item v_i. Assuming there are K factors related to the user's intent, the latent representation h_i is divided into K channels, i.e., h_i = [h_i^(1), ..., h_i^(K)], where the k-th channel independently corresponds to the k-th factor. For each pair of adjacent items, the correlation between h_i^(k) and h_j^(k) represents the similarity of items v_i and v_j with respect to factor k and reveals why and how the two items are related to each other.
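The channel split above amounts to partitioning one d-dimensional embedding into K equal slices. A minimal sketch, assuming equal channel widths d/K (the function name is hypothetical):

```python
import numpy as np

def split_into_channels(h, K):
    """Split a d-dimensional latent representation h_i into K channels of
    dimension d / K; the k-th channel corresponds to the k-th intent factor."""
    d = h.shape[-1]
    assert d % K == 0, "embedding dimension must be divisible by K"
    return h.reshape(K, d // K)

h_i = np.arange(8.0)                      # a toy embedding with d = 8
channels = split_into_channels(h_i, K=4)  # 4 channels of dimension 2
```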
3. Global level decoupled representation learning
Assume there are K concepts related to user intent, which means there are K latent factors to decouple. Given the global graph G = <V, E> constructed from the training data, the nodes (i.e., items) are partitioned into K components in the latent space, and the link edges are correspondingly partitioned into K channels. The k-th component relates to the k-th factor of user intent, and the k-th channel represents how factor k affects the links between items.
For a single node v_i in the graph, the goal is to aggregate information from its neighbors N(v_i). The item representation is first split into K components, and the probability that factor k influences item v_i is computed from its neighbors:
This reveals why items v_i and v_j are adjacently linked and how item v_j influences item v_i with respect to factor k. Information can then be accumulated from the neighbors of item v_i to update the representation of v_i:
By projecting the item representations into different channels, item information can be aggregated from the perspective of the different conceptual factors, so that the global-level item representation of the model can be written as the combination of K channels, i.e., z_i^g = [z_i^g(1), ..., z_i^g(K)].
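The global-level aggregation can be sketched as below. Since formulas 1 and 2 are not reproduced in this text, the affinity sigma(W_k^T h_i) · sigma(W_k^T h_j), normalized over the K channels, is an assumed form consistent with the symbols the text does define (W_k, σ, h_i, h_j); all function names are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def channel_aware_aggregate(h_i, neighbor_hs, Ws):
    """Global-level decoupled aggregation for one node v_i.

    Ws[k] is the (d_in, d_channel) parameter matrix of channel k. For each
    neighbor v_j, a per-channel affinity sigma(W_k^T h_i) . sigma(W_k^T h_j)
    is computed and normalized over the K channels (assumed reading of
    formula 1); neighbor information is then accumulated channel by channel
    and the K channel results are concatenated into z_i^g."""
    K = len(Ws)
    proj_i = [np.tanh(W.T @ h_i) for W in Ws]      # sigma(W_k^T h_i)
    z = [np.zeros_like(p) for p in proj_i]
    for h_j in neighbor_hs:
        proj_j = [np.tanh(W.T @ h_j) for W in Ws]  # sigma(W_k^T h_j)
        # probability that factor k explains the link v_i -> v_j
        p = softmax(np.array([proj_i[k] @ proj_j[k] for k in range(K)]))
        for k in range(K):
            z[k] = z[k] + p[k] * proj_j[k]
    return np.concatenate(z)                        # z_i^g, K channels joined

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(4, 2)) for _ in range(2)]    # K = 2 channels, d_in = 4
h_i = rng.normal(size=4)
neighbors = [rng.normal(size=4) for _ in range(3)]  # N(v_i) has 3 items
z_g = channel_aware_aggregate(h_i, neighbors, Ws)
```

The softmax over channels (rather than over neighbors) is what makes the mechanism "channel-aware": each neighbor distributes its influence across the K factors.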
4. Local-level decoupled representation learning
Given that items appearing in a sequence are rarely repeated, the local learning model is based on a sequence method rather than a graph neural network. Given a user's historical behavior sequence, the most recent L interacted items are selected. For users whose sequence length is less than L, zero vectors are repeatedly padded on the left side of the sequence. To distinguish item representations at different positions in a sequence, a learnable position embedding is added to the initial item embeddings and serves as the final input to the learning layer:
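The truncation, left-padding, and position-embedding step can be sketched as follows; the additive combination H = [h_1 + p_1; ...; h_L + p_L] matches the symbols defined for formula 4 later in the text, while the function name is an assumption.

```python
import numpy as np

def pad_and_position(seq_embs, pos_embs, L):
    """Keep the most recent L item embeddings, left-pad shorter sequences
    with zero vectors, and add a position embedding to each slot:
    H = [h_1 + p_1; ...; h_L + p_L]."""
    d = pos_embs.shape[1]
    seq = seq_embs[-L:]                    # most recent L interacted items
    pad = np.zeros((L - len(seq), d))      # zero vectors on the left
    return np.vstack([pad, seq]) + pos_embs

L, d = 4, 3
seq_embs = np.ones((2, d))     # a user with only two interacted items
pos_embs = np.zeros((L, d))    # position embeddings (zeros here for clarity)
H = pad_and_position(seq_embs, pos_embs, L)
```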
the self-attention network is first applied to the learning model of the present invention to take advantage of its ability to capture the long-term and short-term dependencies of items in sequence. Order toFor the output of the single-headed self-attention encoder, hsAs input to a variational self-encoder, in which
In the variational autoencoder, the posterior distribution is inferred as a Gaussian; its mean vector and variance vector are computed from the self-attention output as follows:
μ_v = l_1(h_s)
σ_v = l_2(h_s)
Following the conventional variational autoencoder model, the output of the variational autoencoder layer is obtained with the "reparameterization trick" and written as z_v = μ_v + σ_v ⊙ ε, where ε ~ N(0, I).
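The reparameterization trick above can be sketched in a few lines (function name assumed):

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    """Reparameterization trick: z_v = mu_v + sigma_v * eps with
    eps ~ N(0, I). The randomness lives in eps, so gradients can flow
    through mu and sigma during training."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

rng = np.random.default_rng(0)
# with sigma = 0 the sample collapses deterministically to the mean
z = reparameterize(np.zeros(3), np.zeros(3), rng)
```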
After the self-attention variational autoencoder, an item representation of the entire sequence that follows a normal distribution is obtained. The channel-aware aggregation mechanism is then applied for local decoupled learning.
To capture transitions in user intent, a channel-aware sliding-window strategy based on the graph neural network is used. The sliding-window length is set to L, which means that for each target item v_i, information is aggregated from its preceding L-1 items. The probability between items v_i and v_j is computed with the channel-aware mechanism as follows:
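The sliding-window neighborhoods can be illustrated as below: each target item's "neighbors" at the local level are simply the items that precede it in the sequence, up to L-1 of them. A minimal sketch with a hypothetical function name:

```python
def sliding_windows(seq, L):
    """For each target item v_i in the sequence, return the preceding
    L-1 items, from which its local-level information is aggregated."""
    return {seq[i]: seq[max(0, i - (L - 1)):i] for i in range(len(seq))}

wins = sliding_windows([10, 20, 30, 40], L=3)
```

The channel-aware aggregation itself is then the same mechanism as at the global level, applied to these window neighborhoods instead of graph neighborhoods.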
5. Prediction layer
Based on the representations z_g and z_l learned at the global and local levels, the final sequence representation is obtained as z = z_g + z_l.
The final recommendation probability of each candidate item is estimated from the current sequence embedding and the candidates' initial item embeddings. Let ŷ_i denote the probability that item v_i appears as the next interaction in the current sequence:
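The scoring step can be sketched as an inner product between the sequence representation and every candidate's initial embedding, normalized by softmax; this is a hedged reading of formula 12 (which is not reproduced in the text), and the function name is an assumption.

```python
import numpy as np

def next_item_probs(z, item_embs):
    """Score each candidate v_i by z^T h_i, where z = z_g + z_l is the
    final sequence representation and h_i the candidate's initial
    embedding; softmax turns the scores into probabilities."""
    logits = item_embs @ z
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = next_item_probs(np.array([1.0, 0.0]),
                        np.array([[2.0, 0.0],    # candidate 0
                                  [0.0, 2.0],    # candidate 1
                                  [1.0, 1.0]]))  # candidate 2
best = int(np.argmax(probs))   # item recommended to the target user
```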
the training targets are calculated from the lower evidence bound and the reconstruction error is defined as the cross entropy, as shown below.
6. Iterative training
The model of the entire edge-enhanced global decoupling graph neural network is trained with the evidence lower bound as the training objective to obtain the trained model. The trained model then predicts the next item that a given target user may be interested in at the current moment and recommends that item to the target user.
While embodiments of the invention have been described above, the invention is not limited to the applications set forth in the description and the embodiments; it is fully applicable in various fields of endeavor to which it pertains, and further modifications may readily be made by those skilled in the art. The invention is therefore not limited to the details shown and described herein, provided they do not depart from the general concept defined by the appended claims and their equivalents.
Claims (10)
1. A sequence recommendation method based on an edge-enhanced global decoupling graph neural network, characterized by comprising the following steps:
an input layer: constructing a global link graph with all items in the training data as nodes and the interaction sequences between users and items as edges, and inputting the global link graph to a graph neural network;
a decoupled learning layer: the global layer adopts a decoupled representation method based on the graph neural network to aggregate the influence probabilities of a preset number of factors of all neighbor items of item v_i, obtaining the global-level decoupled item representation of item v_i;
the local layer first preprocesses the target user's item interaction sequence with a sequence neural network and then adopts the decoupled representation method based on the graph neural network to aggregate the influence probabilities of the preset number of factors of the L-1 items preceding item v_i in interaction order, obtaining the target user's local-level decoupled item representation of item v_i;
a prediction layer: summing the global-level decoupled item representation of item v_i and the target user's local-level decoupled item representation of item v_i, multiplying the result by the initial embedded representation of each candidate item, and taking the outcome as the probability that the candidate appears as the target user's next interacted item after item v_i;
iterative training: setting a training objective, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network;
sorting the prediction-layer outputs of the trained graph neural network in descending order, the item with the highest probability being the item recommended to the target user.
2. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the global-level decoupled representation method comprises: presetting that item v_i is influenced by K factors of each neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; computing, based on a channel-aware mechanism, the influence probabilities of the K factors of each neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the item representation of item v_i, obtaining z_i^g.
3. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the influence probability of a neighbor item v_j on item v_i is calculated using formula 1:
where the left-hand side denotes the probability that items v_i and v_j are adjacently linked and that item v_j influences item v_i through factor k; W_k denotes the parameters of channel k and W_k' those of channel k'; d_in denotes the dimension of the input item embedding; d_channel denotes the dimension of each channel's embedded representation; h_i and h_j denote the initial embedded representations of items v_i and v_j; σ denotes a nonlinear activation function; and W_k^T and W_k'^T denote the transposes of the matrices W_k and W_k'.
4. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 2, wherein formula 2 is used to aggregate onto item v_i the influence probabilities of the k-th influencing factor of the neighborhood of item v_i:
where z_i^g(k) denotes the representation of item v_i after aggregating the influence probabilities of the k-th factor of its neighborhood, and N(v_i) denotes the neighborhood of item v_i, i.e., the set of all its neighbor items;
5. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the local-level decoupled representation method comprises position embedding of the item sequence, specifically as follows:
selecting an item v_i interacted with by target user u at a given moment and forming an item sequence from item v_i and the L-1 items interacted with before it; adding to each item in the sequence a position embedding in sequence order; and inputting the item sequence into a self-attention encoder for encoding, whereby the position information is aggregated into the item representations of the sequence;
the position-embedded item sequence is represented by equation 4:
where p_1 to p_L denote the embedded representations of the L positions, h_1 to h_L denote the initial embeddings of the L items, and H denotes the item-sequence representation after position embedding;
the output of the single-headed self-attention encoder is calculated using equation 5:
where D denotes the output of the single-head self-attention network, W^Q, W^K and W^V denote the parameters of the self-attention network, and softmax is the normalized exponential function;
the representation of the item sequence after the single-head self-attention network, h_s, is calculated with formula 6:
h_s = D + H    (formula 6).
6. The sequence recommendation method based on the edge-enhanced global decoupling graph neural network of claim 5, wherein a variational autoencoder is adopted to process the item sequence carrying position information, specifically as follows:
inputting the item-sequence representation h_s output by the self-attention encoder into a variational autoencoder for encoding, and outputting an item representation of the item sequence that follows a normal distribution;
calculating by using formula 7, formula 8 and formula 9:
μ_v = l_1(h_s)    (formula 7)
σ_v = l_2(h_s)    (formula 8)
z_v = μ_v + σ_v ⊙ ε    (formula 9)
where l_1 and l_2 denote linear transformation functions, z_v denotes the item-sequence representation after the variational autoencoder network, μ_v denotes the mean of the Gaussian distribution, σ_v denotes its variance, and ε ~ N(0, I).
7. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 6, wherein formula 10 is used to aggregate onto item v_i the influence probabilities of the k-th influencing factor of each item in the item sequence:
where z_i^l(k) denotes the item representation after aggregating the influence probabilities of the k-th influencing factor of the preceding L-1 items;
formula 11 is used to combine the z_i^l(k) of the K channels:
where z_i^l denotes the combination of the aggregated influence probabilities of the K channels of item v_i, i.e., the local-level decoupled representation of item v_i.
8. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein formula 12 is adopted to calculate the probability that an item appears as the target user's next interacted item after item v_i:
where ŷ_i denotes the probability that the item appears as the next interacted item after item v_i, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i denotes the initial embedded representation of a candidate item, and softmax is the normalized exponential function.
9. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the evidence lower bound is taken as the training objective and is calculated using formula 13:
where the first term denotes the reconstruction error and KL(q_φ(z|y)‖p(z)) denotes the KL divergence;
where ŷ_i denotes the probability that item v_i appears as the next interacted item in the target user's sequence, and y_i indicates whether item v_i actually appears as the next interacted item in that sequence: y_i = 1 if it actually appears, and 0 otherwise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111603830.XA CN114187077A (en) | 2021-12-24 | 2021-12-24 | Sequence recommendation method based on edge-enhanced global decoupling graph neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114187077A true CN114187077A (en) | 2022-03-15 |
Family
ID=80544955
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111603830.XA Pending CN114187077A (en) | 2021-12-24 | 2021-12-24 | Sequence recommendation method based on edge-enhanced global decoupling graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114187077A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112000689A (en) * | 2020-08-17 | 2020-11-27 | 吉林大学 | Multi-knowledge graph fusion method based on text analysis |
CN112000689B (en) * | 2020-08-17 | 2022-10-18 | 吉林大学 | Multi-knowledge graph fusion method based on text analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Abdar et al. | A review of uncertainty quantification in deep learning: Techniques, applications and challenges | |
CN110119467B (en) | Project recommendation method, device, equipment and storage medium based on session | |
Ma et al. | End-to-end incomplete time-series modeling from linear memory of latent variables | |
Silva-Ramírez et al. | Missing value imputation on missing completely at random data using multilayer perceptrons | |
CN111079931A (en) | State space probabilistic multi-time-series prediction method based on graph neural network | |
CN113590900A (en) | Sequence recommendation method fusing dynamic knowledge maps | |
JP7474446B2 (en) | Projection Layer of Neural Network Suitable for Multi-Label Prediction | |
Jun et al. | Uncertainty-gated stochastic sequential model for EHR mortality prediction | |
Chen et al. | Fast approximate geodesics for deep generative models | |
CN114298851A (en) | Network user social behavior analysis method and device based on graph sign learning and storage medium | |
CN112085293A (en) | Method and device for training interactive prediction model and predicting interactive object | |
Nasiri et al. | A node representation learning approach for link prediction in social networks using game theory and K-core decomposition | |
CN113821724B (en) | Time interval enhancement-based graph neural network recommendation method | |
CN114187077A (en) | Sequence recommendation method based on edge-enhanced global decoupling graph neural network | |
CN114896515A (en) | Time interval-based self-supervision learning collaborative sequence recommendation method, equipment and medium | |
CN115953215B (en) | Search type recommendation method based on time and graph structure | |
Janković Babić | A comparison of methods for image classification of cultural heritage using transfer learning for feature extraction | |
Liang et al. | Paying deep attention to both neighbors and multiple tasks | |
CN116257691A (en) | Recommendation method based on potential graph structure mining and user long-short-term interest fusion | |
Ann et al. | Parameter estimation of Lorenz attractor: A combined deep neural network and K-means clustering approach | |
Zhang | An English teaching resource recommendation system based on network behavior analysis | |
Kim | Active Label Correction Using Robust Parameter Update and Entropy Propagation | |
Kalina et al. | Robust training of radial basis function neural networks | |
Chen et al. | Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | |
Oldenhof et al. | Self-labeling of fully mediating representations by graph alignment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||