CN114780841B

CN114780841B - KPHAN-based sequence recommendation method

Info

Publication number: CN114780841B
Application number: CN202210416700.3A
Authority: CN
Inventors: 杨超; 阮书琪
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2022-04-20
Filing date: 2022-04-20
Publication date: 2024-04-30
Anticipated expiration: 2042-04-20
Also published as: CN114780841A

Abstract

The invention relates to the technical field of recommendation, in particular to a KPHAN-based sequence recommendation method. KPHAN consist essentially of KGSP and LSPF modules. At KGSP, the item corresponding entity and entity context are encoded using knowledge embedding techniques; the project representation is enhanced by adopting a knowledge bidirectional fusion mode, and the global short-term preference characteristics of the user are captured by modeling the project sequence. In the LSPF, the long-term preference of the user is captured by using a personalized hierarchy attention mechanism, and the short-term preference of the user is fused to complete the training and prediction of the user preference. The beneficial effects are that: semantic association is provided for the sequence items, so that the item identification degree is improved, and the similarity between the items is well revealed. The personalized hierarchical attention network merges the long-term preference of the user and obtains more accurate and comprehensive personalized user preference. The hit rate is improved by 10.7% on average, and the normalized folding loss cumulative gain is improved by 13.5% on average.

Description

KPHAN-based sequence recommendation method

Technical Field

The invention relates to the technical field of recommendation, in particular to a KPHAN-based sequence recommendation method.

Background

With the advent of the big data age, information overload is urgent to be solved, and recommendation systems have been developed. The main objective of the recommendation system is to search from mass online content and services, find partial products meeting the interest preferences of the user and recommend the partial products to the user. The current recommendation technology has been widely applied to a plurality of e-commerce platforms such as Taobao, amazon, yelp and the like, plays an important role in promoting the service growth and dynamic decision process of the platform, and improves the user satisfaction to a great extent. Although recommendation systems have achieved great success, conventional recommendation techniques generally assume that the user's preferences are stable and constant, and only capture the user's long-term general preferences, resulting in recommendation results with certain limitations; in fact, over time, both the user's preferences and the popularity of the merchandise are dynamic. Sequential recommendation aims at recommending items to be interacted next time for a user according to the historical sequence of the user interaction items, and can flexibly capture useful sequential modes and interest drift of the user to make more accurate and dynamic recommendation. Due to its utility, more and more researchers began study sequence recommendations.

The existing sequence recommendation methods are mainly divided into two main categories, namely, the first category: the traditional method mainly comprises sequence pattern mining, markov chain and matrix decomposition. The Markov chain-based method mainly deduces the future interests of the user according to the last or last few behaviors, the next item is deduced by utilizing the last interactive item based on first-order Markov, and FPMC is a representative model which fuses matrix decomposition and Markov chains to carry out personalized next sequence basket recommendation. FOSSIL, FISM make sequence recommendations by combining a similarity-based approach with a higher order markov chain approach that takes into account the dependence of more behavior. Such methods cannot capture long-term complex dependencies and are susceptible to data sparseness. The second category is sequence recommendation based on deep learning models. The cyclic neural network is very effective in modeling data with sequence characteristics, the GRU4Rec captures long-term dependence in a session by using the GRU for the first time, and the interaction in the session is used as sequence history for sequence modeling, however, the methods such as RNN have limited capture of long-term preference; a convolutional neural network-based method has also emerged, primarily for capturing local features in a sequence. Models based on self-attention mechanisms achieve the most advanced performance in machine translation tasks, so the SASRec model first applies self-attention to sequence recommendations, which can solve the limitations set forth above and becomes the mainstream framework.

Although the self-attention mechanism based approach improves the performance of sequence recommendations to some extent, two problems remain. Firstly, the method only considers the sequence conversion mode among item features, ignores the importance of auxiliary information on improving the sequence recommendation performance, in fact, semantic association among items has a certain auxiliary effect on improving the identification of the items, revealing the similarity of the items, mining the preference of a user, such as a movie buddha and a finger ring king, which are both directors by the Jackson, and the two items have stronger semantic association at the level of directors, and although the prior art utilizes a knowledge-enhanced memory network to capture the preference of the attribute level of the user, the semantic association among the sequence items cannot be dynamically captured. Second, previous self-attention mechanism based methods all use a representation of the last time step of the sequence model to represent the sequential behavior of the entire sequence and as a final preference representation for the user, whereas the representation of the last time step can only represent the current user's preference, i.e. a short-term preference, ignoring the long-term static preferences of the user in the sequence, i.e. the user's long-term unchanged preferences, and more importantly the invention herein devised a sequence-aware penalty function to make the recommendation more accurate and efficient.

Disclosure of Invention

The invention aims to provide a KPHAN-based sequence recommendation method and a KPHAN-based sequence recommendation device, so that the defects in the prior art are overcome, and KPHAN refers to a knowledge-enhanced personalized hierarchical attention network.

The technical scheme of the invention is that a knowledge-enhanced personalized hierarchical attention network (KPHAN) is provided, knowledge information is utilized, and long-term and short-term preferences of users are considered to improve recommendation performance.

The method comprises two main modules: a knowledge-enhanced global short-term preference module (KGSP) and a personalized hierarchical attention-aware long-term preference fusion module (LSPF), comprising the steps of:

Step one: in order to enrich the representation of the items, the semantic association between the items is enhanced, the item identification is improved, and the knowledge information of the items is integrated into the items. Therefore, the invention proposes to embed the knowledge graph into the representation of the modeling rich item, and encode the entities and the relations in the knowledge graph into low-dimensional dense vectors on the premise of retaining the structural information and semantic association of the knowledge graph;

Specifically, (1.1) firstly, using an entity linking method in KB4Rec (specific implementation reference [1 ]), finding an entity E E corresponding to an item v in a knowledge graph, wherein E is a set of all entities of the knowledge graph;

(1.2) then finding out a first-order sub-graph (e, R, e _t) e G from the entity, wherein (e, e _t) represents the head node and the tail node in the first-order sub-graph G, respectively, and R e R represents all the relations in the first-order sub-graph;

(1.3) training the first-order subgraph through TransE model (specific implementation reference [2 ]) to obtain low-dimensional dense vector of entity corresponding to the project The rest of the entities in the first-order subgraph are embedded/>The entity and the relation are expressed in the same space through TransE model, so that the closer the sum of the head node vector and the relation vector is to the tail node, namely, e+r is approximately equal to e _t, the score function of TransE is as follows:

Wherein the method comprises the steps of Represents an L2 norm;

(1.4) in order to train TransE models, the optimized objective function uses a loss function of negative sampling and maximum spacing strategy, so that the positive sample score is higher and the negative sample score is lower and the loss function is as follows:

wherein dpos represents the score of the positive triplet, dneg represents the score of the negative triplet, margin represents the maximum interval, and max is the maximum function;

Step two: since the first-order neighbors of an entity (the first-order neighbor entity is called a context entity in the present invention) are usually important features for providing semantic association, the context entity is introduced on the basis of embedding the entity corresponding to the item in the step one, and the context entity of the entity corresponding to the item is defined as:

context(e)＝{e_t|(e,r,e_t)∈G}

the corresponding context entity embedding is obtained by calculating an average value of the context entity embedding:

Wherein the method comprises the steps of Context entity embedding of the entity corresponding to the item is represented, wherein I context (e) is the number of the context entities, and sigma represents summation operation;

step three: the maximum length of the model interaction sequence is set as n, the maximum sequence length of the user interaction sequence is valued in a sliding window mode, and if the interaction sequence is less than n, 0 is supplemented on the left side of the sequence; creating an item embedding matrix M epsilon R ^|V|×d, wherein d represents the item embedding dimension, v represents the number of items, and each time the sequence modeling retrieves the corresponding input embedding matrix Adding a leachable location embedding/>, for each itemThe final sequence item is embedded as/>Previous methods will directly/>As an input of the adaptive attention module, however, the semantic features of the item are ignored only by adopting the mode of item embedding, so that the entity embedding corresponding to the item in the first step and the context embedding corresponding to the item in the second step are integrated into the item to enhance the representation of the item and improve the item identification, and the bidirectional integration f _{Bi-interaction} is adopted to embed/>, on the final sequence itemSequence item embedding/>, obtained by integrating knowledge and enhanced by knowledgeAnd dropoutlayer (neurons are discarded with a certain probability during training) is used for relieving the problem of overfitting of the deep neural network, wherein leakyReLU is a nonlinear activation function, W ₁ and W ₂ are learnable parameters, and e represents vector corresponding element multiplication;

Wherein the method comprises the steps of Is the entity embedding matrix corresponding to the sequence item v,/>Is the context embedding matrix of the entity corresponding to item v,/>And/>Embedding a context entity corresponding to the project into the same space of the project through the same full-connection network as the context entity and applying a tanh nonlinear activation function, wherein W, W1, W2 and b are learnable parameters;

step four: embedding the sequence item with enhanced knowledge on the basis of the third step Acquiring short-term interests of a user as input to an adaptive attention module; specifically, the adaptive attention module includes an adaptive attention and feed forward neural network (FFN); wherein the self-attention portion employs scalar dot product defined as: /(I)

S＝Attention(Q,K,V)

Wherein Q represents queries, K represents keys, V represents values, Is a learnable parameter, layerNorm represents layer normalization; notably, to ensure the chronological order of the sequences, a mask mechanism is adopted in the computation of Q, K and V, Q _i and K _j satisfy the dependency relationship between item i and item j calculated by the attention weights between i < j, Q _i and K _j, and the sequential change of the sequences is captured by calculating the attention weighted sum of the previous items; in order to add nonlinearity to the model, the feedforward neural network is performed after the attention to obtain deeper features, and a two-layer feedforward neural network is adopted, specifically:

F＝FFN(S)＝Relu(dropout(Relu(dropout(SW⁽¹⁾+b)))W⁽²⁾+b))

Wherein Relu is a nonlinear activation function, dropout means that neurons are discarded according to a certain probability in training, W ⁽¹⁾,W⁽²⁾, b is a learnable parameter of a neural network, and S is a sequence representation after self-attention; experiments show that propagating low-level features to high-level is more beneficial for model learning, so residual connection is used in both the self-attention module and the feedforward neural network part, and in order to learn more complex dependency relationships, omega self-adaptive attention blocks (self-attention and FFN) are stacked, and b < th > block (b > 1) is defined as:

S^(b)＝Attention(F^(b-1))

F^(b)＝FFN(S^(b))

when b=1, s=attention (Q, K, V), f=ffn (S), For the feature vector of the last item in the last block, it merges the features of all previous items to obtain more accurate current item features, which are important for user preference prediction, so will/>Consider short-term preferences of the user; after the global dynamic short-term preference module with enhanced knowledge is described, the short-term preference fusion module with personalized level attention perception is described in the following steps;

Step five: in addition to the user's current preferences acting on the user's next item interactions, the user's long-term static preferences, i.e., preferences that the user has not changed for a long time, also play an indispensable role in predicting the user's next item interactions, the user's long-term preferences being present in items that the user interacted with, where it will be And carrying out similarity calculation with the candidate items to obtain long-term static preference of the user, wherein the similarity between the candidate items and the item at each moment is calculated as follows:

where q is the number of candidate items, Representing the first n-1 item representations after stacking ω adaptive attention modules,/>For the attention score of the project at the time t of the user, softmax is a normalization function, and finally the long-term preference/>, of the user is obtainedExpressed as a weighted sum of the first n-1 moments:

Step six: long-term and short-term preferences play a different role in predicting the next interaction item of the user, so it is more important to explore which of the long-term and short-term preferences is in predicting the next item further by using the attention:

Wherein the method comprises the steps of Is an implicit user preference extracted from items that the user interacted with, consists of long-term preferences and short-term preferences,/>Weight for short-term preference,/>Q is the weight of the long-term preference, q is the item to be interacted with next;

Step seven: to further provide personalized sequence recommendations, an explicit user preference matrix is introduced on a step six basis Where d represents the dimension, |u| represents the number of users, and fusing implicit user preferences with explicit user preferences via a weight parameter α to obtain the final user preference representation/>

To predict the next possible item of the user, a dot product calculation is performed on the end user preference and the candidate set of items to obtain the probability of the user interacting with the next item, whereinFor candidate item set/>For the score of the ith user to the candidate jth item, the higher the score is, the greater the probability of the user interacting with the item is, the candidate item sets are ordered in a descending order, and the first K items are selected as recommended items;

the whole sequence recommendation flow is described; the model is trained in an end-to-end manner, and a binary cross entropy loss function of sequence perception is designed:

Where k is the number of sequences that the user ui can maximally divide, j is the positive sample of each divided sequence, For the negative samples sampled by the user ui when taking positive samples j, here 100 negative examples are sampled for each positive example,/>Representation regularizes all parameters and embedding (item embedding) of model,/>Is positive sample interaction probability,/>Is a negative sample interaction probability.

The invention has the beneficial effects that (1) the optimal performance is obtained on two evaluation indexes, especially on a music data set, the hit rate HR is improved by 32%, and the normalized folding cumulative gain NDCG is improved by 33%. Illustrating the effectiveness of the knowledge-enhanced personalized hierarchical attention network presented herein. The knowledge-enhanced global short-term preference module provides semantic association between sequential items, improving item recognition and revealing similarity between items well. The personalized hierarchical attention network merges the long-term preference of the user and obtains more accurate and comprehensive personalized user preference. (2) Compared with the prior models FDSA and KSR added with auxiliary information, the hit rate is improved by 16.3 percent on average, and the NDCG is improved by 18.4 percent on average. Compared with the prior method, the method can capture the sequence conversion of the items and the semantic association between the items, consider the preference of long and short periods and improve the recommendation performance. (3) To verify the validity of the knowledge enhancement project presentation module and the personalized hierarchical attention module, ablation experiments were performed on three variants of KPHAN, KPHAN-K & a representing a basic model that ignores the knowledge enhancement module and the personalized hierarchical attention module. KPHAN-A is a personalized hierarchical attention ignoring the model, discussing the usefulness of knowledge. The effect of the basic model is obviously improved after knowledge content is added, the hit rate is averagely improved by 7.8% on three different data sets, and the normalized folding cumulative gain is averagely improved by 7.7%. KPHAN-K is a knowledge enhancement module of the neglect model, discusses the usefulness of the level attention, improves the hit rate on three different data sets by 1.4% on average, and improves the normalized break cumulative gain by 2.4% on average. KPHAN is a model presented herein, with significantly improved effects compared to the previous three variants. Indicating that the knowledge enhancement module and the personalized hierarchical attention module complement each other and that the model achieves the best performance.

Drawings

FIG. 1 is a flow chart of the proposed sequence recommendation model.

FIG. 2 shows the comparison result of the present invention and the reference model.

FIG. 3 shows the results of an ablation experiment according to the present invention.

Detailed Description

The objects, technical solutions and advantages of the present invention will become more apparent by the following detailed description of the present invention with reference to the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention; in addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other. The invention will be described in more detail below with reference to the accompanying drawings. Like elements are denoted by like reference numerals throughout the various figures. For purposes of clarity, the various parts of the drawings are not drawn to scale; a KPHAN-based sequence recommendation method according to an embodiment of the present invention is described below with reference to fig. 1 to 3:

s1, acquiring a first-order sub-graph knowledge graph corresponding to an item and implementing embedded coding;

s2, acquiring entity and context entity codes corresponding to the items;

s3, project features are enhanced by fusing project, entity and context entity features;

s4, acquiring global short-term preference characteristics of a user;

S5, acquiring long-term static preference characteristics of a user;

s6, fusing the long-period preference characteristics to obtain implicit user preference characteristics;

s7, merging the explicit user preference characteristics and the implicit user preference characteristics to obtain final user preference for training and prediction;

where S1-S4 are knowledge-enhanced global short-term preference modules (KGSP) and S5-S7 are personalized hierarchical attention-aware long-short-term preference fusion modules (LSPF).

The invention comprises the following specific steps:

step one: knowledge-graph can produce semantic associations between items, providing more accurate recommendations, and enhancing representations of items by modeling using knowledge-graph embedding. The aim of the knowledge graph embedding is to encode the entities and the relations in the knowledge graph into low-dimensional dense vectors on the premise of retaining the structural information and semantic association of the knowledge graph;

Specifically, (1.1) first, using KB4Rec (implementing the entity linking method in reference [1]：Zhao,Wayne Xin,et al."Kb4rec:A data set for linking knowledge bases with recommender systems."Data Intelligence 1.2(2019):121-136.Bordes,Antoine,et al.), find entity E corresponding to item v in the knowledge graph, where E is the set of all entities in the knowledge graph;

(1.3) finally training the first-order subgraph through TransE model (specific implementation reference [2]："Translating embeddings for modeling multi-relational data."Advances in neural information processing systems 26(2013).) to obtain low-dimensional dense vector of entity corresponding to the project The rest of the entities in the first-order subgraph are embedded/>The entity and the relation are expressed in the same space through TransE model, so that the closer the sum of the head node vector and the relation vector is to the tail node, namely, e+r is approximately equal to e _t, the score function of TransE is as follows:

Wherein the method comprises the steps of Represents an L2 norm;

context(e)＝{e_t|(e,r,e_t)∈G}

step three: the maximum length of the model interaction sequence is set as n, the maximum sequence length of the user interaction sequence is valued in a sliding window mode, and if the interaction sequence is less than n, 0 is supplemented on the left side of the sequence; creating an item embedding matrix M epsilon R ^|V|×d, wherein d represents the item embedding dimension, v represents the number of items, and each time the sequence modeling retrieves the corresponding input embedding matrix Adding a leachable location embedding/>, for each itemThe final sequence item is embedded as/>Previous methods will directly/>As an input of the adaptive attention module, however, the semantic features of the item are ignored only by adopting the mode of item embedding, so that the entity embedding corresponding to the item in the first step and the context embedding corresponding to the item in the second step are integrated into the item to enhance the representation of the item and improve the item identification, and the bidirectional integration f _{Bi-interaction} is adopted to embed/>, on the final sequence itemSequence item embedding/>, obtained by integrating knowledge and enhanced by knowledgeAnd dorpout layer (discarding neurons with a certain probability during training) is used for alleviating the problem of deep neural network overfitting, wherein leakyReLU is a nonlinear activation function, W ₁ and W ₂ are learnable parameters, and the element multiplication corresponding to the vector is represented by the element multiplication;

S＝Attention(Q,K,V)

F＝FFN(S)＝Relu(dropout(Relu(dropout(SW⁽¹⁾+b)))W⁽²⁾+b))

Wherein Relu is a nonlinear activation function, dropout means that neurons are discarded according to a certain probability in training, W ⁽¹⁾,W⁽²⁾, b is a learnable parameter of a neural network, and S is a sequence representation after self-attention; experiments show that propagating low-level features to high-level is more beneficial for model learning, so residual connections are used in both the self-attention module and the feedforward neural network part, and in order to learn more complex dependency relationships, ω self-attention blocks, namely (self-attention and FFN), are stacked, and the b < th > block (b > 1) is defined as:

S^(b)＝Attention(F^(b-1))

F^(b)＝FFN(S^(b))

when b=1, s=attention (Q, K, V), f=ffn (S), For the feature vector of the last item in the last block, it merges the features of all previous items to obtain more accurate current item features, which are important for user preference prediction, so will/>Consider short-term preferences of the user; after the global short-term preference module with enhanced knowledge is described, the following steps begin to describe the long-term preference fusion module with personalized level attention perception;

Step seven: to further provide personalized sequence recommendations, an explicit user preference matrix is introduced on a step six basis Wherein d represents the dimension, u represents the number of users, and the implicit user preference and the explicit user preference are fused through a weight parameter alpha to obtain the final user preference representation/>

To predict the next possible item of the user, a dot product calculation is performed on the end user preference and the candidate set of items to obtain the probability of the user interacting with the next item, whereinFor candidate item set,/>For the score of the ith user to the candidate jth item, the higher the score is, the greater the probability of the user interacting with the item is, the candidate item sets are ordered in a descending order, and the first K items are selected as recommended items;

Where k is the number of sequences that the user ui can maximally divide, j is the positive sample of each divided sequence, For the negative samples sampled by the user ui when taking positive samples j, here 100 negative examples are sampled for each positive example,/>Representing regularization of all parameters and embedding of the model (i.e., item embedding)/>Is positive sample interaction probability,/>Is a negative sample interaction probability.

The invention has the beneficial effects that as shown in fig. 2, the optimal performance is obtained on both evaluation indexes, especially on a music data set, the hit rate HR is improved by 32%, and the normalized folding cumulative gain NDCG is improved by 33%. Illustrating the effectiveness of the knowledge-enhanced personalized hierarchical attention network presented herein. The knowledge-enhanced global short-term preference module provides semantic association between sequential items, improving item recognition and revealing similarity between items well. The personalized hierarchical attention network merges the long-term preference of the user and obtains more accurate and comprehensive personalized user preference.

Compared with the prior models FDSA and KSR added with auxiliary information, the hit rate is improved by 16.3 percent on average, and the NDCG is improved by 18.4 percent on average. Compared with the prior method, the method can capture the sequence conversion of the items and the semantic association between the items, consider the preference of long and short periods and improve the recommendation performance.

Further, to verify the validity of the knowledge-enhanced project presentation module and the personalized hierarchical attention module, ablation experiments were performed on three variants of KPHAN, as shown in fig. 3. KPHAN-K & a represent the basic model of ignoring the knowledge enhancement module and the personalized hierarchical attention module. KPHAN-A is a personalized hierarchical attention ignoring the model, discussing the usefulness of knowledge. From the table, it can be seen that the effect of the basic model is obviously improved after knowledge content is added, the hit rate is averagely improved by 7.8% on three different data sets, and the normalized break cumulative gain is averagely improved by 7.7%. KPHAN-K is a knowledge enhancement module of the neglect model, discusses the usefulness of the level attention, improves the hit rate on three different data sets by 1.4% on average, and improves the normalized break cumulative gain by 2.4% on average. KPHAN is a model presented herein, with significantly improved effects compared to the previous three variants. Indicating that the knowledge enhancement module and the personalized hierarchical attention module complement each other and that the model achieves the best performance.

It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Claims

1. A KPHAN-based sequence recommendation method, which is characterized in that KPHAN refers to a knowledge-enhanced personalized hierarchical attention network;

The method specifically comprises the following steps:

Step one: the method comprises the steps that a knowledge graph is used for embedding the representation of modeling enhancement items, and entities and relations in the knowledge graph are encoded into low-dimensional dense vectors on the premise that knowledge graph structural information and semantic association are reserved;

firstly, using an entity linking method in KB4Rec to find an entity E E corresponding to a project v in a knowledge graph, wherein E is a set of all entities of the knowledge graph;

(1.3) training the first-order subgraph through TransE model to obtain low-dimensional dense vector of entity corresponding to the project The rest of the entities in the first-order subgraph are embedded/>The entity and relationship are represented in the same space by TransE model, so that the sum of the head node vector and the relationship vector, i.e. e+r≡e _t, therefore the scoring function of TransE is:

wherein/> Representing L2 norms

Step two: since the first-order neighbors of the entity are usually important features for providing semantic association, a context entity is introduced on the basis of embedding the entity corresponding to the item in the step one, and the context entity of the entity corresponding to the item is defined as:

context(e)＝{e_t|(e,r,e_t)∈G}

corresponding context entity embedding is obtained by calculating an average value of the context entity embedding

Wherein the method comprises the steps ofContext entity embedding of the entity corresponding to the item is represented, wherein I context (e) is the number of the context entities, and sigma represents summation operation;

Step three: setting the maximum length of the model interaction sequence as n, and carrying out maximum sequence length value on the user interaction sequence in the form of a sliding window, if the interaction sequence is less than n, supplementing 0 on the left side of the sequence; creating an item embedding matrix M epsilon R ^|V|×d, wherein d represents the item embedding dimension, v represents the number of items, and each time the sequence modeling retrieves the corresponding input embedding matrix Adding a leachable location embedding/>, for each itemThe final sequence item is embedded as/>Previous methods will directly/>As an input of the adaptive attention module, however, the semantic features of the item are ignored only by adopting the mode of item embedding, so that the entity embedding corresponding to the item in the first step and the context embedding corresponding to the item in the second step are integrated into the item to enhance the representation of the item and improve the item identification, and the bidirectional integration f _{Bi-interaction} is adopted to embed/>, on the final sequence itemSequence item embedding/>, obtained by integrating knowledge and enhanced by knowledgeAnd dorpout layer is used for solving the problem of overfitting of the deep neural network by discarding neurons with a certain probability during training, wherein leakyReLU is a nonlinear activation function, W ₁ and W ₂ are learnable parameters, and e represents multiplication of corresponding elements of vectors;

Wherein the method comprises the steps of Is the entity embedding matrix corresponding to the sequence item v,/>Is the context embedding matrix of the entity corresponding to item v,And/>Embedding a context entity corresponding to the project into the same space of the project through the same full-connection network as the context entity and applying a tanh nonlinear activation function, wherein W, W1, W2 and b are learnable parameters;

step four: embedding the sequence item with enhanced knowledge on the basis of the third step Acquiring short-term interests of a user as input to an adaptive attention module; the adaptive attention module includes an adaptive attention and feed forward neural network (FFN); wherein the self-attention portion employs scalar dot product defined as:

S＝Attention(Q,K,V)

wherein Q represents queries, K represents keys, V represents values, Is a learnable parameter, layerNorm represents layer normalization; in order to ensure the time sequence of the sequence, mask is adopted in Q, K and V calculation, Q _i and K _j meet the requirement that i < j, Q _i and K _j are attention weight calculation, namely the dependency relationship between item i and item j is calculated, and the sequence change is captured by calculating the attention weighted sum of the previous items;

In order to add nonlinearity to the model, the feedforward neural network is performed after the attention to obtain deeper features, and a two-layer feedforward neural network is adopted, specifically:

F＝FFN(S)＝Relu(dropout(Relu(dropout(SW⁽¹⁾+b)))W⁽²⁾+b))

Wherein Relu is a nonlinear activation function, dropout means that neurons are discarded according to a certain probability in training, W ⁽¹⁾,W⁽²⁾, b is a learnable parameter of a neural network, and S is a sequence representation after self-attention; residual connections are used in both the self-attention module and the feedforward neural network part, and in order to learn more complex dependencies, ω self-attention blocks, self attention and FFN, b-th block, and b >1 are stacked, defined as:

S^(b)＝Attention(F^(b-1))

F^(b)＝FFN(S^(b))

when b=1, s=attention (Q, K, V), f=ffn (S), For the feature vector of the last item in the last block, it merges the features of all previous items to obtain more accurate current item features, which are important for user preference prediction, so will/>Consider short-term preferences of the user;

Step five: in addition to the user's current preferences acting on the user's next item interactions, the user's long-term static preferences also acting on predicting the user's next item interactions, the user's long-term preferences existing in the items that the user interacted with, as will be described herein And carrying out similarity calculation with the candidate items to obtain long-term static preference of the user, wherein the similarity between the candidate items and the item at each moment is calculated as follows:

Step six: exploring which of the long-term and short-term preferences is more important in predicting the next item with attention:

Wherein the method comprises the steps of Is an implicit user preference extracted from items that the user interacted with, consists of long-term preferences and short-term preferences,/>Weight for short-term preference,/>For the weight of long-term preference, q is the next item to be interacted with, softmax is the normalization function;

Where k is the number of sequences that the user ui can maximally divide, j is the positive sample of each divided sequence, For the negative samples sampled by the user ui when taking positive samples j, here 100 negative examples are sampled for each positive example,/>Representing all parameters and embedding of the model, namely embedding the items, and regularizing;

is positive sample interaction probability,/> Is a negative sample interaction probability.

2. The KPHAN-based sequence recommendation method as claimed in claim 1, comprising two modules: a global short-term preference module with enhanced knowledge, KGSP for short, and a long-term preference fusion module with personalized level attention perception, LSPF for short; using knowledge information to enhance the representation of the items in KGSP, modeling semantic association among the items, obtaining more accurate short-term preference characteristics of the user, and realizing the steps one to four; in the LSPF, capturing the long-term preference characteristics of the user through an attention mechanism and fusing the short-term preference characteristics of the user to obtain the final preference characteristics of the user, so as to realize the steps five to seven.