CN114187077A - Sequence recommendation method based on edge-enhanced global decoupling graph neural network - Google Patents

Sequence recommendation method based on edge-enhanced global decoupling graph neural network

Info

Publication number
CN114187077A
Authority
CN
China
Prior art keywords: item, sequence, decoupling, neural network, representing
Prior art date: 2021-12-24
Legal status: Pending
Application number
CN202111603830.XA
Other languages
Chinese (zh)
Inventors
沈利东 (Shen Lidong)
沈利辉 (Shen Lihui)
赵朋朋 (Zhao Pengpeng)
李蕴祎 (Li Yunyi)
Current Assignee: Jiangsu Yiyou Huiyun Software Co., Ltd.
Original Assignee: Jiangsu Yiyou Huiyun Software Co., Ltd.
Priority date: 2021-12-24
Filing date: 2021-12-24
Publication date: 2022-03-15
Application filed by Jiangsu Yiyou Huiyun Software Co., Ltd.
Priority to CN202111603830.XA
Publication of CN114187077A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods


Abstract

The invention discloses a sequence recommendation method based on an edge-enhanced global decoupling graph neural network, comprising the following steps. Input layer: take all items in the training data as nodes and the users' item-interaction sequences as edges, construct a global link graph, and input it to a graph neural network. Decoupled learning layer: aggregate the influence probabilities of a preset number of factors from all neighbor items of an item, and from the items that precede item v_i in the target user's interaction order. Prediction layer: sum the global-level and local-level decoupled item representations and multiply by the initial embedded representation of each candidate item to obtain the probability that the candidate appears as the next item the target user interacts with after item v_i. Train the graph neural network iteratively; among the prediction outputs of the trained network, the item with the highest probability is the item recommended to the target user. The method can perceive shifts in user intent and recommend products to users more accurately.

Description

Sequence recommendation method based on edge-enhanced global decoupling graph neural network
Technical Field
The invention relates to the technical field of sequence recommendation. More specifically, the invention relates to a sequence recommendation method based on an edge-enhanced global decoupling graph neural network.
Background
Recommendation systems play a crucial role in the rapidly developing internet age, and sequence recommendation is one of their important components. Sequence recommendation models user behavior as a sequence of items rather than a set of items. Markov chains are a classical method that models short-term item transitions to predict the next item a user may like. With the development of deep learning, recurrent neural networks have been successful in sequence recommendation. For example, long short-term memory networks are a common variant of recurrent neural networks that enhance the model's ability to retain sequence information through memory units, and GRU4REC applies gated recurrent units to session-based recommendation with parallel mini-batch training. However, recurrent-network-based approaches struggle to retain long-range information, and self-attention networks have recently been applied to sequence recommendation to capture both long-term and short-term dependencies.
However, these previous works model user intent through the historical sequence of successive interactions, ignoring the dynamic latent relationships behind the items. The edges linking pairs of items contain rich semantic information that explains why and how a user selects one item after another. These latent factors all relate to real-world concepts, and one factor often dominates in a single instance. For example, assume two users interact with six items, as shown in FIG. 2. The link graph shows that item 2 is adjacent to all five other items, but these link edges are intuitively driven by different factors: item 2 is linked to items 1 and 4 because they are the same color, to items 5 and 6 because both are short-sleeved, and item 3 is connected to item 2 because it can be worn over a T-shirt as a jacket. These different factors reveal the intent shifts in user behavior and also reveal the shared characteristics of paired items. It is therefore desirable to identify and distinguish these latent factors.
Decoupled (disentangled) representation learning is an active topic in many areas of computer vision and has recently been applied to recommendation systems. Its general purpose is to separate the distinct, informative factors of variation in the data, where each unit is associated with a single real-world concept, so that a change in one factor results in a change only in the related units. β-VAE and InfoGAN are the most representative decoupling networks. On the recommendation side, macro-VAE infers high-level concepts of user intent at the macro level and applies VAEs to enhance decoupling at the micro level; its authors also propose a self-supervised seq2seq training strategy for sequence recommendation that generates sub-sequences with an intent-decoupling encoder and compares user intent between two sub-sequences. However, these studies consider neither the link-relation patterns of items nor the different user intents behind a sequence, so the learned sequence models are sensitive to noisy data and lack interpretability.
Disclosure of Invention
An object of the present invention is to solve at least the above problems and to provide at least the advantages described later.
The invention also aims to provide a sequence recommendation method based on an edge-enhanced global decoupling graph neural network that can identify the latent intent shifts behind the transitions between a user's interacted items, thereby recommending products to users accurately and improving transaction speed and transaction success rate.
To achieve these objects and other advantages in accordance with the purpose of the invention, there is provided a sequence recommendation method based on an edge-enhanced global decoupling graph neural network, comprising:

an input layer: taking all items in the training data as nodes and the users' item-interaction sequences as edges, constructing a global link graph, and inputting it to a graph neural network;

a decoupled learning layer: at the global level, aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of a preset number of factors from all neighbor items of item v_i, to obtain a global-level decoupled item representation of item v_i;

at the local level, first preprocessing the target user's item interaction sequence with a sequential neural network, and then aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of the preset number of factors from the L-1 items that precede item v_i in the target user's interaction order, to obtain the target user's local-level decoupled item representation of item v_i;

a prediction layer: summing the global-level decoupled representation of item v_i and the target user's local-level decoupled representation of item v_i, multiplying the result by the initial embedded representation of each candidate item, and obtaining the probability that the candidate appears as the next item the target user interacts with after item v_i;

iterative training: setting a training target, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network;

arranging the prediction-layer outputs of the trained graph neural network in descending order, the item with the highest probability being the item recommended to the target user.
Preferably, the global-level decoupling representation method includes: presetting that item v_i is influenced by K factors of its neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; computing, based on a channel-aware mechanism, the influence probability of each of the K influencing factors of every neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the representation of item v_i, obtaining z_i^g.
Preferably, the influence probability of neighbor item v_j on item v_i is calculated by formula 1:

$$\hat{p}_{(v_i,v_j)}^{(k)} = \frac{\exp\left(\sigma(W_k^T h_i)^T \sigma(W_k^T h_j)\right)}{\sum_{k'=1}^{K} \exp\left(\sigma(W_{k'}^T h_i)^T \sigma(W_{k'}^T h_j)\right)} \quad \text{(formula 1)}$$

where p̂_{(v_i,v_j)}^{(k)} denotes the probability that item v_j, adjacently linked to item v_i, influences item v_i through factor k; W_k denotes the parameter of channel k and W_{k'} the parameter of channel k'; d_in denotes the dimension of the input item embeddings and d_channel the dimension of each channel's embedded representation; h_i denotes the initial embedded representation of item v_i and h_j that of item v_j; σ denotes a nonlinear activation function; and W_k^T and W_{k'}^T denote the transposes of the matrices W_k and W_{k'}, respectively.
Preferably, the representation of item v_i after accumulating the influence of the k-th factor of its neighborhood on item v_i is obtained by formula 2:

$$z_i^{g(k)} = \sum_{v_j \in \mathcal{N}(v_i)} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T h_j) \quad \text{(formula 2)}$$

where z_i^{g(k)} denotes the representation of item v_i after accumulating the influence probabilities of the k-th factor of its neighborhood, and N(v_i) denotes the neighborhood of item v_i, i.e., the set of all its neighbor items;

the z_i^{g(k)} of the K channels are accumulated by formula 3:

$$z_i^{g} = \left[ z_i^{g(1)};\ z_i^{g(2)};\ \dots;\ z_i^{g(K)} \right] \quad \text{(formula 3)}$$

where z_i^g denotes the representation of item v_i after accumulating the influence probabilities of the K channels, i.e., the global-level decoupled representation of item v_i.
Preferably, the local-level decoupling representation method includes position embedding of the item sequence, specifically:

selecting an item v_i that the target user u interacted with at a given moment, forming an item sequence from item v_i and the L-1 items interacted with before it, adding to each item in the sequence a position embedding according to its order, inputting the sequence into a self-attention encoder for encoding, and aggregating the position information into the item representations of the item sequence;
the position-embedded item sequence is represented by equation 4:
Figure BDA0003432912180000041
wherein p is1~pLAn embedded representation representing L positions, h1~hLIndicating initial embedding of L items, and H indicating item sequence representation after embedding with position;
the output of the single-headed self-attention encoder is calculated using equation 5:
Figure BDA0003432912180000042
where D represents the output of a single-ended self-attention network, WQ,WK,WVAll represent parameters of the self-attention network, and softmax is a normalized exponential function;
the expression of the sequence of items after the single-head self-attention network is calculated by formula 6, namely hs
hsD + H equation 6.
Preferably, the item sequence carrying position information is processed by a variational autoencoder, specifically:

inputting the item sequence h_s output by the self-attention encoder into the variational autoencoder for encoding, and outputting item representations of the item sequence that follow a normal distribution;
calculated by formulas 7, 8 and 9:

$$\mu_v = l_1(h_s) \quad \text{(formula 7)}$$

$$\sigma_v = l_2(h_s) \quad \text{(formula 8)}$$

$$z_v = \mu_v + \sigma_v \odot \varepsilon \quad \text{(formula 9)}$$

where l_1 and l_2 denote linear transformation functions, z_v denotes the sequence representation after the variational autoencoder network, μ_v denotes the mean of the Gaussian distribution, σ_v denotes the variance of the Gaussian distribution, and ε ~ N(0, I).
Preferably, the aggregation of the influence probabilities of the k-th influencing factor of each item in the item sequence on item v_i is calculated by formula 10:

$$z_i^{l(k)} = \sum_{j=i-L+1}^{i-1} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T z_j) \quad \text{(formula 10)}$$

where z_i^{l(k)} denotes the item representation after accumulating the influence probabilities of the k-th influencing factor of the L-1 items;

the z_i^{l(k)} of the K channels are accumulated by formula 11:

$$z_i^{l} = \left[ z_i^{l(1)};\ z_i^{l(2)};\ \dots;\ z_i^{l(K)} \right] \quad \text{(formula 11)}$$

where z_i^l denotes the combination of the accumulated influence probabilities of the K channels for item v_i, i.e., the local-level decoupled representation of item v_i.
Preferably, formula 12 is used to calculate the probability that a candidate item appears as the next item the target user interacts with after item v_i:

$$\hat{y}_i = \mathrm{softmax}\left( z_L^T h_i \right) \quad \text{(formula 12)}$$

where ŷ_i denotes the probability that the candidate appears as the next interacted item after item v_i, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i denotes the initial embedded representation of a candidate item, and softmax is the normalized exponential function.
Preferably, with the evidence lower bound as the training target, formula 13 is used:

$$\mathcal{L} = \mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] - KL\left(q_\phi(z \mid y) \,\|\, p(z)\right) \quad \text{(formula 13)}$$

where the expectation term denotes the reconstruction error and KL(q_φ(z|y) || p(z)) denotes the KL divergence; the reconstruction error is defined as the cross entropy

$$\mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] = \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$$

where ŷ_i denotes the probability that item v_i appears as the next interacted item in the target user's sequence, and y_i denotes whether item v_i truly appears as the next interacted item: y_i takes the value 1 if it truly appears, and 0 otherwise.
Preferably, the method further comprises regularizing z_i^{g(k)} and z_i^{l(k)} with the L2 norm, i.e., replacing each channel representation z by z / ||z||_2.
the invention at least comprises the following beneficial effects: the invention provides an edge-enhanced global decoupling graph neural network model for capturing link information of items. The project representation and user intent are modeled from both a global and a local level. In one aspect, a global item linkage graph is constructed over all sequences. And decomposing the project edge into a plurality of channels by applying a channel perception mechanism, wherein each channel corresponds to one influence factor. The channel extracts specific factor features from the project neighbors and jointly aggregates different factors to the target project. On the other hand, the decoupled sequence representation is modeled on the current sequence. The latent variables are first inferred as gaussian distributions for decoupled representation learning from the statistical perspective of the variational autoencoder, and then a graph neural network is applied to aggregate the project information with the previous projects and model the user intent. Therefore, the conversion of potential intention of identifying the conversion of the user interaction item can be achieved, products can be accurately recommended to the user, and the transaction speed and the transaction success rate are improved.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is an architecture diagram of the sequence recommendation model according to one embodiment of the present invention;
FIG. 2 is a partial example of a global link graph according to one embodiment of the present invention.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
It is to be noted that the experimental methods described in the following embodiments are all conventional methods unless otherwise specified, and the reagents and materials, if not otherwise specified, are commercially available; in the description of the present invention, the terms indicating orientation or positional relationship are based on the orientation or positional relationship shown in the drawings only for the convenience of description and simplification of description, and do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
As shown in FIGS. 1-2, the invention provides a sequence recommendation method based on an edge-enhanced global decoupling graph neural network, comprising:

an input layer: taking all items in the training data as nodes and the users' item-interaction sequences as edges, constructing a global link graph, and inputting it to a graph neural network;

a decoupled learning layer: at the global level, aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of a preset number of factors from all neighbor items of item v_i, to obtain a global-level decoupled item representation of item v_i;

at the local level, first preprocessing the target user's item interaction sequence with a sequential neural network, and then aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of the preset number of factors from the L-1 items that precede item v_i in the target user's interaction order, to obtain the target user's local-level decoupled item representation of item v_i;

a prediction layer: summing the global-level decoupled representation of item v_i and the target user's local-level decoupled representation of item v_i, multiplying the result by the initial embedded representation of each candidate item, and obtaining the probability that the candidate appears as the next item the target user interacts with after item v_i;

iterative training: setting a training target, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network;

arranging the prediction-layer outputs of the trained graph neural network in descending order, the item with the highest probability being the item recommended to the target user.
In this technical scheme, by establishing the relation between users and items in the training data on the basis of the interaction sequences, the global link graph represents the items and determines each item's neighborhood information, providing an information basis for the later mining of interaction intent. The interaction intent between items is then learned at two levels by decoupled learning. On one hand, through an item's neighborhood in the global link graph, the influence probabilities of multiple factors are mined and aggregated from all of the item's neighbor items, i.e., the influence the neighborhood exerts on the item through each factor, so that the most influential factor can be found. On the other hand, the target user's item interaction sequence within a recent time window is processed by the sequential neural network, and the decoupled representation method of the graph neural network then learns the correlations among the user's recent item interactions, so that the influence of the factors from the target user's recently interacted items can be aggregated and the most influential factor found.
Based on these two levels of latent-intent mining over the item interaction sequence, the most likely next interaction can be predicted and recommended to the target user, markedly increasing recommendation accuracy and the likelihood of completed transactions.
The sequence recommendation method disclosed by the invention is particularly effective in product recommendation on shopping websites, where it remarkably improves the speed and success rate of transactions.
In another technical solution, the global-level decoupling representation method includes: presetting that item v_i is influenced by K factors of its neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; computing, based on a channel-aware mechanism, the influence probability of each of the K influencing factors of every neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the representation of item v_i, obtaining z_i^g.
In the above technical solution, the number of influencing factors behind given training data is unknown in advance, although a definite number exists. Iterative training is therefore performed with a preset number of factors; if the training result is poor, the number of factors can be changed and training repeated, until a number suitable for the given training data is obtained.
In another technical scheme, the influence probability of neighbor item v_j on item v_i is calculated by formula 1:

$$\hat{p}_{(v_i,v_j)}^{(k)} = \frac{\exp\left(\sigma(W_k^T h_i)^T \sigma(W_k^T h_j)\right)}{\sum_{k'=1}^{K} \exp\left(\sigma(W_{k'}^T h_i)^T \sigma(W_{k'}^T h_j)\right)} \quad \text{(formula 1)}$$

where p̂_{(v_i,v_j)}^{(k)} denotes the probability that item v_j, adjacently linked to item v_i, influences item v_i through factor k; W_k denotes the parameter of channel k and W_{k'} the parameter of channel k'; d_in denotes the dimension of the input item embeddings and d_channel the dimension of each channel's embedded representation; h_i denotes the initial embedded representation of item v_i and h_j that of item v_j; σ denotes a nonlinear activation function; and W_k^T and W_{k'}^T denote the transposes of the matrices W_k and W_{k'}, respectively.
In the above technical solution, formula 1 yields the influence probability of each neighbor item on item v_i, which quantifies the influencing factors.
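As a concrete illustration of how formula 1 could be computed, a minimal PyTorch sketch follows; the function name channel_influence_probs, the tensor layout, and the choice of tanh for the nonlinear activation σ are assumptions of this sketch, not details fixed by the patent:

```python
import torch

def channel_influence_probs(h_i, h_j, W, sigma=torch.tanh):
    """Formula 1 sketch: probability that neighbor v_j influences v_i
    through each of the K factors (channels).

    h_i, h_j : (d_in,) initial item embeddings.
    W        : (K, d_in, d_channel) stacked per-channel matrices W_k.
    Returns  : (K,) probabilities that sum to 1 over the channels.
    """
    # Project both items into each channel's factor subspace: sigma(W_k^T h).
    zi = sigma(torch.einsum('d,kdc->kc', h_i, W))  # (K, d_channel)
    zj = sigma(torch.einsum('d,kdc->kc', h_j, W))  # (K, d_channel)
    # Per-channel similarity, normalized across the K channels (softmax).
    scores = (zi * zj).sum(dim=-1)                 # (K,)
    return torch.softmax(scores, dim=0)
```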
In another solution, the representation of item v_i after accumulating the influence of the k-th factor of its neighborhood on item v_i is obtained by formula 2:

$$z_i^{g(k)} = \sum_{v_j \in \mathcal{N}(v_i)} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T h_j) \quad \text{(formula 2)}$$

where z_i^{g(k)} denotes the representation of item v_i after accumulating the influence probabilities of the k-th factor of its neighborhood, and N(v_i) denotes the neighborhood of item v_i, i.e., the set of all its neighbor items;

the z_i^{g(k)} of the K channels are accumulated by formula 3:

$$z_i^{g} = \left[ z_i^{g(1)};\ z_i^{g(2)};\ \dots;\ z_i^{g(K)} \right] \quad \text{(formula 3)}$$

where z_i^g denotes the representation of item v_i after accumulating the influence probabilities of the K channels, i.e., the global-level decoupled representation of item v_i.
In the above technical solution, formulas 2 and 3 aggregate the influence of the multiple influencing factors of all neighbor items into item v_i, achieving the purpose of global decoupled learning for item v_i.
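A sketch of the global aggregation of formulas 2 and 3, reusing channel_influence_probs from the sketch above, might look as follows; treating formula 3's accumulation as channel concatenation is an assumption here, since the patent only states that the K channels are combined:

```python
# Reuses torch and channel_influence_probs from the previous sketch.

def global_decoupled_repr(h, neighbors, W, sigma=torch.tanh):
    """Formulas 2-3 sketch: global-level decoupled representation z_i^g.

    h         : (N, d_in) initial embeddings of all items.
    neighbors : dict item index -> list of neighbor indices (global graph).
    W         : (K, d_in, d_channel) channel projections.
    """
    K, _, d_ch = W.shape
    out = []
    for i in range(h.shape[0]):
        channels = []
        for k in range(K):
            acc = torch.zeros(d_ch)
            for j in neighbors.get(i, []):
                p_k = channel_influence_probs(h[i], h[j], W, sigma)[k]  # formula 1
                acc = acc + p_k * sigma(h[j] @ W[k])  # formula 2: weighted message
            channels.append(acc)
        out.append(torch.cat(channels))  # formula 3: join the K channels
    return torch.stack(out)              # (N, K * d_channel)
```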
In another technical solution, the local-level decoupling representation method includes position embedding of the item sequence, specifically:

selecting an item v_i that the target user u interacted with at a given moment, forming an item sequence from item v_i and the L-1 items interacted with before it, adding to each item in the sequence a position embedding according to its order, inputting the sequence into a self-attention encoder for encoding, and aggregating the position information into the item representations of the item sequence;
the position-embedded item sequence is represented by formula 4:

$$H = \left[ h_1 + p_1;\ h_2 + p_2;\ \dots;\ h_L + p_L \right] \quad \text{(formula 4)}$$

where p_1 ~ p_L denote the embedded representations of the L positions, h_1 ~ h_L denote the initial embeddings of the L items, and H denotes the item sequence representation after position embedding;

the output of the single-head self-attention encoder is calculated by formula 5:

$$D = \mathrm{softmax}\left( \frac{(H W^Q)(H W^K)^T}{\sqrt{d}} \right) H W^V \quad \text{(formula 5)}$$

where D denotes the output of the single-head self-attention network, W^Q, W^K and W^V all denote parameters of the self-attention network, d denotes the embedding dimension, and softmax is the normalized exponential function;

the representation h_s of the item sequence after the single-head self-attention network is calculated by formula 6:

$$h_s = D + H \quad \text{(formula 6)}$$
In this technical scheme, the self-attention encoder propagates the target user's recent item-interaction information to the current item and, combined with the position embeddings, represents the information of the target user's item interaction sequence, providing an information basis for the subsequent decoupled learning.
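The position-embedding and single-head self-attention step (formulas 4 to 6) might be sketched as below; the parameter shapes and the scaling by sqrt(d) follow standard self-attention practice and are assumptions where the patent text is silent:

```python
import torch

def self_attention_encode(h_seq, p_seq, WQ, WK, WV):
    """Formulas 4-6 sketch: position-aware single-head self-attention.

    h_seq : (L, d) initial item embeddings of the window.
    p_seq : (L, d) learnable position embeddings p_1..p_L.
    WQ, WK, WV : (d, d) self-attention parameters.
    """
    H = h_seq + p_seq                        # formula 4: add position embeddings
    Q, K_, V = H @ WQ, H @ WK, H @ WV
    attn = torch.softmax(Q @ K_.T / H.shape[1] ** 0.5, dim=-1)
    D = attn @ V                             # formula 5: attention output
    return D + H                             # formula 6: residual gives h_s
```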
In another technical scheme, the item sequence carrying position information is processed by a variational autoencoder, specifically:

inputting the item sequence h_s output by the self-attention encoder into the variational autoencoder for encoding, and outputting item representations of the item sequence that follow a normal distribution;
calculated by formulas 7, 8 and 9:

$$\mu_v = l_1(h_s) \quad \text{(formula 7)}$$

$$\sigma_v = l_2(h_s) \quad \text{(formula 8)}$$

$$z_v = \mu_v + \sigma_v \odot \varepsilon \quad \text{(formula 9)}$$

where l_1 and l_2 denote linear transformation functions, z_v denotes the sequence representation after the variational autoencoder network, μ_v denotes the mean of the Gaussian distribution, σ_v denotes the variance of the Gaussian distribution, and ε ~ N(0, I).
In this technical scheme, the variational autoencoder infers the item representations as Gaussian distributions, which introduces uncertainty modeling into the system and further improves the independence between the dimensions of the item representations in decoupled representation learning.
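A sketch of the variational step (formulas 7 to 9) follows. One deliberate deviation, flagged as an assumption: the second head predicts the log-variance rather than σ_v directly, which is the standard way to keep the variance positive, whereas the patent's l_2 maps straight to σ_v:

```python
import torch
import torch.nn as nn

class SeqVAE(nn.Module):
    """Formulas 7-9 sketch: Gaussian sequence representation via the
    reparameterization trick, from the self-attention output h_s."""

    def __init__(self, d):
        super().__init__()
        self.l1 = nn.Linear(d, d)  # mean head, formula 7
        self.l2 = nn.Linear(d, d)  # log-variance head, cf. formula 8

    def forward(self, h_s):
        mu = self.l1(h_s)                  # formula 7
        log_var = self.l2(h_s)             # formula 8 (log sigma^2 variant)
        sigma = torch.exp(0.5 * log_var)
        eps = torch.randn_like(sigma)      # epsilon ~ N(0, I)
        return mu + sigma * eps, mu, log_var  # formula 9: z_v
```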
In another technical scheme, the aggregation of the influence probabilities of the k-th influencing factor of each item in the item sequence on item v_i is calculated by formula 10:

$$z_i^{l(k)} = \sum_{j=i-L+1}^{i-1} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T z_j) \quad \text{(formula 10)}$$

where z_i^{l(k)} denotes the item representation after accumulating the influence probabilities of the k-th influencing factor of the L-1 items;

the z_i^{l(k)} of the K channels are accumulated by formula 11:

$$z_i^{l} = \left[ z_i^{l(1)};\ z_i^{l(2)};\ \dots;\ z_i^{l(K)} \right] \quad \text{(formula 11)}$$

where z_i^l denotes the combination of the accumulated influence probabilities of the K channels for item v_i, i.e., the local-level decoupled representation of item v_i.
In this technical scheme, decoupled learning mines the target user's local intent: the influence of the multiple influencing factors on the target user's recently interacted items is learned in interaction order and aggregated into item v_i, achieving local decoupled learning of item v_i for the target user.
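The local sliding-window aggregation (formulas 10 and 11) could then be sketched as below, again reusing channel_influence_probs; aggregating over the window's VAE outputs and concatenating the K channels are assumptions consistent with the global-level sketch:

```python
# Reuses torch and channel_influence_probs from the earlier sketch.

def local_decoupled_repr(z_seq, W, sigma=torch.tanh):
    """Formulas 10-11 sketch: local-level decoupled representation z_i^l
    of the target item (last position) from its preceding L-1 items.

    z_seq : (L, d_in) sequence representations from the VAE; z_seq[-1]
            corresponds to the target item v_i.
    W     : (K, d_in, d_channel) channel projections.
    """
    K, _, d_ch = W.shape
    target, history = z_seq[-1], z_seq[:-1]
    channels = []
    for k in range(K):
        acc = torch.zeros(d_ch)
        for z_j in history:  # the L-1 items before v_i, in interaction order
            p_k = channel_influence_probs(target, z_j, W, sigma)[k]
            acc = acc + p_k * sigma(z_j @ W[k])  # formula 10
        channels.append(acc)
    return torch.cat(channels)                   # formula 11
```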
In another technical scheme, formula 12 is used to calculate the probability that a candidate item appears as the next item the target user interacts with after item v_i:

$$\hat{y}_i = \mathrm{softmax}\left( z_L^T h_i \right) \quad \text{(formula 12)}$$

where ŷ_i denotes the probability that the candidate appears as the next interacted item after item v_i, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i denotes the initial embedded representation of a candidate item, and softmax is the normalized exponential function.
In the above technical solution, the candidate items are in fact all items in the training data. Formula 12 computes, for each candidate, the probability that it is a given target user's next interacted item, and the item with the highest probability is the most likely next interaction.
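Formula 12 then reduces to one matrix-vector product plus a softmax over all candidates, as in this sketch (assuming the final representation and the item embeddings share one dimension, which holds when d_in = K * d_channel):

```python
def next_item_probs(z_L, item_embeddings):
    """Formula 12 sketch: probability of each candidate being the next
    item after v_i; the argmax is the recommendation.

    z_L             : (d,) final representation of the last sequence item.
    item_embeddings : (N, d) initial embeddings h_i of all candidates.
    """
    logits = item_embeddings @ z_L      # z_L^T h_i for every candidate
    return torch.softmax(logits, dim=0)
```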
In another technical solution, the evidence lower bound is taken as the training target and formula 13 is used:

$$\mathcal{L} = \mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] - KL\left(q_\phi(z \mid y) \,\|\, p(z)\right) \quad \text{(formula 13)}$$

where the expectation term denotes the reconstruction error and KL(q_φ(z|y) || p(z)) denotes the KL divergence; the reconstruction error is defined as the cross entropy

$$\mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] = \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$$

where ŷ_i denotes the probability that item v_i appears as the next interacted item in the target user's sequence, and y_i denotes whether item v_i truly appears as the next interacted item: y_i takes the value 1 if it truly appears, and 0 otherwise.
In this technical scheme, using the evidence lower bound as the training target trains the graph neural network faster and closer to the optimum, avoids overfitting, and improves prediction accuracy.
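The objective of formula 13 could be sketched as follows; using the closed-form KL divergence between the learned Gaussian and a standard normal prior is the conventional choice and is assumed here:

```python
def elbo_loss(y_hat, y, mu, log_var):
    """Formula 13 sketch: cross-entropy reconstruction error plus
    KL(q_phi(z|y) || N(0, I)), minimized during training.

    y_hat       : (N,) predicted next-item probabilities (formula 12).
    y           : (N,) one-hot vector of the true next item.
    mu, log_var : Gaussian posterior parameters from the VAE.
    """
    eps = 1e-8  # numerical floor inside the logarithms
    recon = -(y * torch.log(y_hat + eps)
              + (1 - y) * torch.log(1 - y_hat + eps)).sum()
    # Closed-form KL between N(mu, sigma^2) and the standard normal prior.
    kl = 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum()
    return recon + kl
```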
In another technical scheme, the method further comprises regularizing z_i^{g(k)} and z_i^{l(k)} with the L2 norm, i.e., replacing each channel representation z by z / ||z||_2;
processing the data with this regularization ensures numerical stability.
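The L2 step is a one-liner in most frameworks; a hypothetical helper:

```python
import torch.nn.functional as F

def l2_regularize(z):
    """Rescale a channel representation z_i^{g(k)} or z_i^{l(k)} to unit
    L2 norm, z / ||z||_2, for numerical stability."""
    return F.normalize(z, p=2, dim=-1)
```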
Specifically, the method comprises the following steps:
FIG. 1 shows the model architecture of the entire edge-enhanced global decoupling graph neural network.
1. Problem definition

Given M users and N items, we denote the user set as U = {u_1, u_2, ..., u_M} and the item set as V = {v_1, v_2, ..., v_N}. For each user, S^u = (v_1^u, v_2^u, ..., v_t^u) denotes the sequential behavior of user u interacting with items. Given the historical sequence at time t, the sequence recommendation model aims to predict the next item the user may be interested in at time t+1.

In the method of the invention, a global-level link graph is provided to capture item link-transition information. The global link graph is defined as G = <V, E>, where V is the set of all items in the training data and E is the set of edges. Each edge <v_i, v_j> ∈ E indicates that a user interacted with v_j immediately after v_i. N(v_i) denotes the neighborhood of item v_i, i.e., the items adjacent to v_i in some sequence.
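As an illustration of the input layer, the global link graph could be built from the raw sequences as sketched below; treating the link edges as undirected, so that N(v_i) contains both predecessors and successors, is an assumption of this sketch:

```python
from collections import defaultdict

def build_global_graph(sequences):
    """Input-layer sketch: global link graph G = <V, E> from all users'
    interaction sequences; consecutive items become linked neighbors.

    sequences : iterable of per-user item-id lists, ordered by time.
    Returns   : dict item id -> sorted list of neighbor ids, i.e. N(v_i).
    """
    neighbors = defaultdict(set)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):  # user interacted with b right after a
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {v: sorted(ns) for v, ns in neighbors.items()}

# Example: two short user histories sharing item 2.
print(build_global_graph([[1, 2, 3], [4, 2, 5, 6]]))
# {1: [2], 2: [1, 3, 4, 5], 3: [2], 4: [2], 5: [2, 6], 6: [5]}
```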
2. Channel-aware mechanism

Given a historical sequence S^u = (v_1^u, v_2^u, ..., v_t^u), define z ∈ R^d as the latent intent of user u when interacting with the items in S^u. Assuming there are K factors related to the user's intent, the latent representation z is divided into K channels, i.e., z = [z^(1); z^(2); ...; z^(K)], where the k-th channel independently corresponds to the k-th factor. For each pair of adjacent items, the correlation between z_i^(k) and z_j^(k) represents the similarity of items v_i and v_j with respect to influencing factor k, and reveals why and how the two items are related to each other.
3. Global-level decoupled representation learning

Assume there are K concepts related to user intent, which means there are K latent factors to be decoupled. Given the global graph G = <V, E> constructed from the training data, the nodes (i.e., items) are partitioned into K components in the latent space, and the link edges are correspondingly divided into K channels. The k-th component relates to the k-th factor of user intent, and the k-th channel expresses how factor k affects the links between items.

For a single node v_i in the graph, the goal is to aggregate the information from its neighbors N(v_i). The item representation is first split into K components, and the probability that factor k of a neighbor v_j influences item v_i is computed as:

$$\hat{p}_{(v_i,v_j)}^{(k)} = \frac{\exp\left(\sigma(W_k^T h_i)^T \sigma(W_k^T h_j)\right)}{\sum_{k'=1}^{K} \exp\left(\sigma(W_{k'}^T h_i)^T \sigma(W_{k'}^T h_j)\right)}$$

p̂_{(v_i,v_j)}^{(k)} discloses why items v_i and v_j are adjacently linked, and how item v_j influences item v_i through factor k. Information can then be accumulated from the neighbors of v_i and the representation of v_i updated:

$$z_i^{g(k)} = \sum_{v_j \in \mathcal{N}(v_i)} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T h_j)$$

To ensure numerical stability, z_i^{g(k)} is regularized with the L2 norm, writing z_i^{g(k)} as z_i^{g(k)} / ||z_i^{g(k)}||_2. By projecting the item representations into different channels, item information can be aggregated from the perspective of different conceptual factors, so the global-level item representation of the model can be written as the combination of the K channels, i.e., z_i^g = [z_i^{g(1)}; z_i^{g(2)}; ...; z_i^{g(K)}].
4. Local-level decoupled representation learning

Since items appearing in a sequence are rarely repeated, the local learning model is based on a sequential method rather than the graph neural network alone. Given a user's historical behaviors S^u, the most recent L interacted items are selected; for users whose sequence is shorter than L, zero vectors are repeatedly padded on the left of the sequence. To distinguish item representations at different positions in the sequence, a learnable position embedding p_i is added to each initial item embedding h_i, giving the final input to the learning layer:

$$H = \left[ h_1 + p_1;\ h_2 + p_2;\ \dots;\ h_L + p_L \right]$$

The self-attention network is applied first, exploiting its ability to capture the long-term and short-term dependencies of items in the sequence. Let D be the output of the single-head self-attention encoder; then h_s = D + H serves as the input to the variational autoencoder. In the variational autoencoder, the posterior distribution is inferred, with the mean vector and variance vector computed from the self-attention output as follows:

μ_v = l_1(h_s)

σ_v = l_2(h_s)

Following the conventional variational autoencoder model, the output of the variational autoencoder layer is written, via the reparameterization trick, as z_v = μ_v + σ_v ⊙ ε, where ε ~ N(0, I).

After the self-attention variational autoencoder, an item representation is obtained in which the whole sequence follows a normal distribution. The channel-aware aggregation mechanism is then applied for local decoupled learning.

To capture user intent shifts, a channel-aware sliding-window strategy based on the graph neural network is used, with the sliding-window length set to L. This means that for each target item v_i, information is aggregated from its preceding L-1 items. The influence probability between items v_i and v_j is computed by the channel-aware mechanism as in formula 1, and the factor-k information of the preceding items is aggregated as:

$$z_i^{l(k)} = \sum_{j=i-L+1}^{i-1} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T z_j)$$

The sequence representation of the local layer is then the combination of the K factors: z_i^l = [z_i^{l(1)}; z_i^{l(2)}; ...; z_i^{l(K)}].
5. Prediction layer

Based on the representations z^g and z^l learned from the global and local layers, the final sequence representation is obtained as z = z^g + z^l.

The final recommendation probability of each candidate item is estimated from the current sequence embedding and the initial item embedding. Let ŷ_i denote the probability that item v_i appears as the next interaction in the current sequence:

$$\hat{y}_i = \mathrm{softmax}\left( z_L^T h_i \right)$$

The training target is calculated from the evidence lower bound, with the reconstruction error defined as the cross entropy, as shown below:

$$\mathcal{L} = \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right] - KL\left(q_\phi(z \mid y) \,\|\, p(z)\right)$$
6. Iterative training

The whole edge-enhanced global decoupling graph neural network model is trained with the evidence lower bound as the training target, yielding the trained model, which predicts the next item a given target user may be interested in at the current moment and recommends it to that target user.
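Finally, a hypothetical training step tying the pieces together; the model interface, the optimizer choice, and the batch layout are all assumptions for illustration, and elbo_loss is reused from the sketch above:

```python
import torch

def train_step(model, optimizer, batch):
    """One iteration of the iterative-training stage (sketch)."""
    optimizer.zero_grad()
    # Forward pass through the input, decoupled-learning and prediction layers.
    y_hat, mu, log_var = model(batch['sequence'])
    loss = elbo_loss(y_hat, batch['target_one_hot'], mu, log_var)
    loss.backward()   # gradients for item representations and internal parameters
    optimizer.step()  # update toward the evidence-lower-bound target
    return loss.item()
```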
While embodiments of the invention have been described above, it is not limited to the applications set forth in the description and the embodiments, which are fully applicable in various fields of endeavor to which the invention pertains, and further modifications may readily be made by those skilled in the art, it being understood that the invention is not limited to the details shown and described herein without departing from the general concept defined by the appended claims and their equivalents.

Claims (10)

1. A sequence recommendation method based on an edge-enhanced global decoupling graph neural network, characterized by comprising the following steps:

an input layer: taking all items in the training data as nodes and the users' item-interaction sequences as edges, constructing a global link graph, and inputting it to a graph neural network;

a decoupled learning layer: at the global level, aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of a preset number of factors from all neighbor items of item v_i, to obtain a global-level decoupled item representation of item v_i;

at the local level, first preprocessing the target user's item interaction sequence with a sequential neural network, and then aggregating, by a graph-neural-network-based decoupled representation method, the influence probabilities of the preset number of factors from the L-1 items that precede item v_i in the target user's interaction order, to obtain the target user's local-level decoupled item representation of item v_i;

a prediction layer: summing the global-level decoupled representation of item v_i and the target user's local-level decoupled representation of item v_i, multiplying the result by the initial embedded representation of each candidate item, and obtaining the probability that the candidate appears as the next item the target user interacts with after item v_i;

iterative training: setting a training target, training the graph neural network, and updating the item representations and internal parameters to obtain a trained graph neural network; and

arranging the prediction-layer outputs of the trained graph neural network in descending order, the item with the highest probability being the item recommended to the target user.
2. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the global-level decoupling representation method comprises: presetting that item v_i is influenced by K factors of its neighbor item v_j, i.e., there are K channels from item v_i to neighbor item v_j, each channel corresponding to one influencing factor; computing, based on a channel-aware mechanism, the influence probability of each of the K influencing factors of every neighbor item v_j on item v_i; and aggregating the influence probabilities from all neighbor items v_j of item v_i to update the representation of item v_i, obtaining z_i^g.
3. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the influence probability of neighbor item v_j on item v_i is calculated by formula 1:

$$\hat{p}_{(v_i,v_j)}^{(k)} = \frac{\exp\left(\sigma(W_k^T h_i)^T \sigma(W_k^T h_j)\right)}{\sum_{k'=1}^{K} \exp\left(\sigma(W_{k'}^T h_i)^T \sigma(W_{k'}^T h_j)\right)} \quad \text{(formula 1)}$$

where p̂_{(v_i,v_j)}^{(k)} denotes the probability that item v_j, adjacently linked to item v_i, influences item v_i through factor k; W_k denotes the parameter of channel k and W_{k'} the parameter of channel k'; d_in denotes the dimension of the input item embeddings and d_channel the dimension of each channel's embedded representation; h_i denotes the initial embedded representation of item v_i and h_j that of item v_j; σ denotes a nonlinear activation function; and W_k^T and W_{k'}^T denote the transposes of the matrices W_k and W_{k'}, respectively.
4. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 2, wherein the representation of item v_i after accumulating the influence of the k-th factor of its neighborhood on item v_i is obtained by formula 2:

$$z_i^{g(k)} = \sum_{v_j \in \mathcal{N}(v_i)} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T h_j) \quad \text{(formula 2)}$$

where z_i^{g(k)} denotes the representation of item v_i after accumulating the influence probabilities of the k-th factor of its neighborhood, and N(v_i) denotes the neighborhood of item v_i, i.e., the set of all its neighbor items;

the z_i^{g(k)} of the K channels are accumulated by formula 3:

$$z_i^{g} = \left[ z_i^{g(1)};\ z_i^{g(2)};\ \dots;\ z_i^{g(K)} \right] \quad \text{(formula 3)}$$

where z_i^g denotes the representation of item v_i after accumulating the influence probabilities of the K channels, i.e., the global-level decoupled representation of item v_i.
5. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the local-level decoupling representation method comprises position embedding of the item sequence, specifically:

selecting an item v_i that the target user u interacted with at a given moment, forming an item sequence from item v_i and the L-1 items interacted with before it, adding to each item in the sequence a position embedding according to its order, inputting the sequence into a self-attention encoder for encoding, and aggregating the position information into the item representations of the item sequence;

the position-embedded item sequence is represented by formula 4:

$$H = \left[ h_1 + p_1;\ h_2 + p_2;\ \dots;\ h_L + p_L \right] \quad \text{(formula 4)}$$

where p_1 ~ p_L denote the embedded representations of the L positions, h_1 ~ h_L denote the initial embeddings of the L items, and H denotes the item sequence representation after position embedding;

the output of the single-head self-attention encoder is calculated by formula 5:

$$D = \mathrm{softmax}\left( \frac{(H W^Q)(H W^K)^T}{\sqrt{d}} \right) H W^V \quad \text{(formula 5)}$$

where D denotes the output of the single-head self-attention network, W^Q, W^K and W^V all denote parameters of the self-attention network, d denotes the embedding dimension, and softmax is the normalized exponential function;

the representation h_s of the item sequence after the single-head self-attention network is calculated by formula 6:

$$h_s = D + H \quad \text{(formula 6)}$$
6. The sequence recommendation method based on the edge-enhanced global decoupling graph neural network of claim 5, wherein the item sequence carrying position information is processed by a variational autoencoder, specifically:

inputting the item sequence h_s output by the self-attention encoder into the variational autoencoder for encoding, and outputting item representations of the item sequence that follow a normal distribution;

calculated by formulas 7, 8 and 9:

$$\mu_v = l_1(h_s) \quad \text{(formula 7)}$$

$$\sigma_v = l_2(h_s) \quad \text{(formula 8)}$$

$$z_v = \mu_v + \sigma_v \odot \varepsilon \quad \text{(formula 9)}$$

where l_1 and l_2 denote linear transformation functions, z_v denotes the sequence representation after the variational autoencoder network, μ_v denotes the mean of the Gaussian distribution, σ_v denotes the variance of the Gaussian distribution, and ε ~ N(0, I).
7. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 6, wherein the aggregation of the influence probabilities of the k-th influencing factor of each item in the item sequence on item v_i is calculated by formula 10:

$$z_i^{l(k)} = \sum_{j=i-L+1}^{i-1} \hat{p}_{(v_i,v_j)}^{(k)} \, \sigma(W_k^T z_j) \quad \text{(formula 10)}$$

where z_i^{l(k)} denotes the item representation after accumulating the influence probabilities of the k-th influencing factor of the L-1 items;

the z_i^{l(k)} of the K channels are accumulated by formula 11:

$$z_i^{l} = \left[ z_i^{l(1)};\ z_i^{l(2)};\ \dots;\ z_i^{l(K)} \right] \quad \text{(formula 11)}$$

where z_i^l denotes the combination of the accumulated influence probabilities of the K channels for item v_i, i.e., the local-level decoupled representation of item v_i.
8. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein formula 12 is used to calculate the probability that a candidate item appears as the next item the target user interacts with after item v_i:

$$\hat{y}_i = \mathrm{softmax}\left( z_L^T h_i \right) \quad \text{(formula 12)}$$

where ŷ_i denotes the probability that the candidate appears as the next interacted item after item v_i, z_L^T denotes the transpose of z_L, the final embedded representation of the last item in the sequence, h_i denotes the initial embedded representation of a candidate item, and softmax is the normalized exponential function.
9. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 1, wherein the evidence lower bound is taken as the training target and formula 13 is used:

$$\mathcal{L} = \mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] - KL\left(q_\phi(z \mid y) \,\|\, p(z)\right) \quad \text{(formula 13)}$$

where the expectation term denotes the reconstruction error and KL(q_φ(z|y) || p(z)) denotes the KL divergence; the reconstruction error is defined as the cross entropy

$$\mathbb{E}_{q_\phi(z|y)}\left[\log p_\theta(y \mid z)\right] = \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$$

where ŷ_i denotes the probability that item v_i appears as the next interacted item in the target user's sequence, and y_i denotes whether item v_i truly appears as the next interacted item: y_i takes the value 1 if it truly appears, and 0 otherwise.
10. The edge-enhanced global decoupling graph neural network-based sequence recommendation method of claim 4, further comprising regularizing z_i^{g(k)} and z_i^{l(k)} with the L2 norm, i.e., replacing each channel representation z by z / ||z||_2.
Application CN202111603830.XA, priority and filing date 2021-12-24: Sequence recommendation method based on edge-enhanced global decoupling graph neural network (publication CN114187077A, pending).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111603830.XA | 2021-12-24 | 2021-12-24 | CN114187077A (en) Sequence recommendation method based on edge-enhanced global decoupling graph neural network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111603830.XA | 2021-12-24 | 2021-12-24 | CN114187077A (en) Sequence recommendation method based on edge-enhanced global decoupling graph neural network

Publications (1)

Publication Number | Publication Date
CN114187077A | 2022-03-15

Family

ID=80544955

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111603830.XA | CN114187077A (en) Sequence recommendation method based on edge-enhanced global decoupling graph neural network | 2021-12-24 | 2021-12-24

Country Status (1)

Country Link
CN (1) CN114187077A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112000689A * | 2020-08-17 | 2020-11-27 | 吉林大学 (Jilin University) | Multi-knowledge graph fusion method based on text analysis
CN112000689B * | 2020-08-17 | 2022-10-18 | 吉林大学 (Jilin University) | Multi-knowledge graph fusion method based on text analysis

Similar Documents

Publication Publication Date Title
Abdar et al. A review of uncertainty quantification in deep learning: Techniques, applications and challenges
CN110119467B (en) Project recommendation method, device, equipment and storage medium based on session
Ma et al. End-to-end incomplete time-series modeling from linear memory of latent variables
Silva-Ramírez et al. Missing value imputation on missing completely at random data using multilayer perceptrons
CN111079931A (en) State space probabilistic multi-time-series prediction method based on graph neural network
CN113590900A (en) Sequence recommendation method fusing dynamic knowledge maps
JP7474446B2 (en) Projection Layer of Neural Network Suitable for Multi-Label Prediction
Jun et al. Uncertainty-gated stochastic sequential model for EHR mortality prediction
Chen et al. Fast approximate geodesics for deep generative models
CN114298851A (en) Network user social behavior analysis method and device based on graph sign learning and storage medium
CN112085293A (en) Method and device for training interactive prediction model and predicting interactive object
Nasiri et al. A node representation learning approach for link prediction in social networks using game theory and K-core decomposition
CN113821724B (en) Time interval enhancement-based graph neural network recommendation method
CN114187077A (en) Sequence recommendation method based on edge-enhanced global decoupling graph neural network
CN114896515A (en) Time interval-based self-supervision learning collaborative sequence recommendation method, equipment and medium
CN115953215B (en) Search type recommendation method based on time and graph structure
Janković Babić A comparison of methods for image classification of cultural heritage using transfer learning for feature extraction
Liang et al. Paying deep attention to both neighbors and multiple tasks
CN116257691A (en) Recommendation method based on potential graph structure mining and user long-short-term interest fusion
Ann et al. Parameter estimation of Lorenz attractor: A combined deep neural network and K-means clustering approach
Zhang An English teaching resource recommendation system based on network behavior analysis
Kim Active Label Correction Using Robust Parameter Update and Entropy Propagation
Kalina et al. Robust training of radial basis function neural networks
Chen et al. Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Oldenhof et al. Self-labeling of fully mediating representations by graph alignment

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination