CN111932336A - Commodity list recommendation method based on long-term and short-term interest preference - Google Patents
- Publication number
- CN111932336A (application CN202010693227.4A)
- Authority
- CN
- China
- Prior art keywords
- term
- user
- interest
- vector
- long
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention belongs to the field of personalized recommendation systems, and particularly relates to a commodity list recommendation method based on long-term and short-term interest preferences. In the method, a data set of commodities clicked by a user is input into an embedding layer, which embeds every input element into a fixed-size low-dimensional vector; an interest extraction layer outputs the user's short-term interest preference by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities; the user's long-term interest low-dimensional vectors are input into an interaction layer, which uses a multi-head attention mechanism to find high-order features and outputs the user's long-term behavior preference; the short-term and long-term features extracted by the interest extraction layer and the interaction layer, respectively, are input into a fusion layer for fusion, and the fusion result is taken as the recommended commodity list. The invention can obtain a user's long-term and short-term preferences from e-commerce user behavior information and thereby recommend a suitable commodity list to the user.
Description
Technical Field
The invention belongs to the field of personalized recommendation systems, and particularly relates to a commodity list recommendation method based on long-term and short-term interest preferences.
Background
Large industrial recommendation systems need to predict users' preferences accurately and respond quickly to their current needs. A large e-commerce website, with billions of items and users, first retrieves a set of candidate items for the user and then applies a ranking module to generate the final recommendations. Currently, most matching models deployed by commercial websites are based on item-based Collaborative Filtering (CF). However, such models capture only static user-item interactions and do not capture the dynamic transitions within a user's action sequence well, which typically leads to generic recommendations. To understand the user's interests and preferences accurately, short-term session information should be incorporated into the matching module.
The dynamic evolution of user interests is therefore considered by introducing a deep sequential recommendation model in the matching phase in place of item-based CF. As people use online shopping services on e-commerce websites, their behaviors accumulate into a relatively long sequence. The sequence consists of sessions. A session is a list of user actions that occur within a given time frame. A user typically has a specific shopping need within a session, and his or her interests may change dramatically when a new session starts. Modeling the sequence directly while ignoring this internal structure can hurt performance. Therefore, we refer to the user's latest interactive session as short-term behavior and to the earlier behaviors as long-term behavior. The two parts are modeled separately to encode their intrinsic information, which represents different levels of the user's interest. Our goal is to recall the top N candidate items that match the user's behavior sequence.
For short-term session modeling, Recurrent Neural Network (RNN) based approaches have performed well in session-based recommendation. In addition, Li et al. and Liu et al. proposed attention models that emphasize, respectively, the main purpose of a short-term session and the effect of its last click, so that random user actions do not cause interest drift. However, they ignore the fact that a user has more than one point of interest within a session. We observe that customers care about multiple aspects of goods, such as category, brand, color, style and shop reputation. A user repeatedly compares many items before making a final decision on the preferred item. Thus, a single dot-product attention head cannot reflect the variety of interests that arise at different times during shopping. In contrast, multi-head attention, first proposed for machine translation, allows the model to attend jointly to different information at different positions. A multi-head structure can naturally address the multi-interest problem by representing preferences from different perspectives. Therefore, we propose a multi-interest module that uses multi-head attention to enhance the RNN-based sequential recommender. Meanwhile, thanks to self-attention, the module can represent user preferences accurately by filtering out casual clicks.
The current decision is also always influenced by the user's long-term general preferences. Intuitively, if a user is a football fan, he may view or click on merchandise related to football stars. When he now chooses to buy shoes, the sports shoes associated with famous stars will attract him more than ordinary shoes. Therefore, it is important to consider both long-term preferences and short-term behavior. Ying et al. and Li et al. take the customer's long-term preferences into account by simply combining them with the current session. In practice, however, customers have a variety of shopping needs, and their long-term behavior is complex and diverse. Football-related items account for only a very small fraction of long-term behavior, so the long-term preferences that are relevant to the current short-term session are not prominent in the overall long-term behavior. Simply concatenating the long-term and short-term representations, or aggregating them with a weighted attention, is therefore not an effective way of fusing them. The information in the long-term vector that relates to the current short-term session should be retained.
Disclosure of Invention
To address the problem that existing recommendation algorithms have difficulty balancing a user's long-term and short-term interest preferences, the invention provides a commodity list recommendation method based on long-term and short-term interest preferences. Commodities to be recommended are input into a network comprising an embedding layer, an interest extraction layer, an interaction layer and an interest fusion layer, and the commodity list recommended to the user is selected through the following steps:
S1, using the list of commodities clicked by the user in the short-term session as a data set, and inputting the data set into the embedding layer, which embeds every input element into a fixed-size low-dimensional vector;
S2, the interest extraction layer outputs the user's short-term interest preference vector by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities;
S3, using the long-term historical data as a data set, and inputting it into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
S4, inputting the user's long-term interest low-dimensional vectors into the interaction layer, finding high-order features in the interaction layer with a multi-head attention mechanism, and outputting the user's long-term behavior preference;
S5, inputting the short-term features extracted by the interest extraction layer and the long-term features extracted by the interaction layer into the fusion layer for fusion, and taking the fusion result as the recommended commodity list.
Further, the process by which the interest extraction layer extracts the user's short-term interest features comprises:
encoding the order of the commodity sequence clicked by the user, this operation being referred to as position encoding, and performing bias encoding on the basis of the position encoding;
updating the user's behavior sessions after adding the bias encoding;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the user's k-th session interest, which is taken as the user's feature.
Further, the user's interest in the k-th session is expressed as:
$$I_k^Q = \mathrm{FFN}\big(\mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_H)\, W^O\big), \qquad I_k = \mathrm{Avg}\big(I_k^Q\big)$$
wherein $I_k$ represents the user's interest in the k-th session; Avg is the average pooling operation; $I_k^Q$ is the interest vector of the user's k-th session computed from Q; FFN is a feed-forward network; Concat represents the concatenation of vectors; $\mathrm{head}_h$ represents the h-th head vector inside the Transformer; $W^O$ is the linear transformation matrix of the output O.
Further, the h-th head vector $\mathrm{head}_h$ in the Transformer model is represented as:
$$\mathrm{head}_h = \mathrm{Attention}\big(Q_{kh}W^Q,\, Q_{kh}W^K,\, Q_{kh}W^V\big) = \mathrm{softmax}\!\left(\frac{Q_{kh}W^Q\,(W^K)^{\top} Q_{kh}^{\top}}{\sqrt{d_{model}}}\right) Q_{kh}W^V$$
wherein $Q_{kh}$ is the h-th head of $Q_k$; $Q_k$ is the Query of the k-th session; $W^Q$ is the linear transformation matrix of the Query vector; $W^K$ is the linear transformation matrix of the Key vector; $W^V$ is the linear transformation matrix of the Value vector; $d_{model}$ is the dimension of the model input.
Further, the process by which the interaction layer extracts the user's long-term interest features comprises:
obtaining the correlation between feature m and feature k defined under a specific attention head h;
updating the representation of feature m in subspace h by combining all relevant features, guided by the attention coefficients between feature m and each feature k under head h;
and adding a standard residual connection to the network to obtain the final long-term interest features.
Further, the correlation between feature m and feature k under a specific attention head h is defined as:
$$\alpha^{(h)}_{m,k} = \frac{\exp\big(\psi^{(h)}(e_m, e_k)\big)}{\sum_{l=1}^{M} \exp\big(\psi^{(h)}(e_m, e_l)\big)}, \qquad \psi^{(h)}(e_m, e_k) = \big\langle W^{(h)}_{Query}\, e_m,\; W^{(h)}_{Key}\, e_k \big\rangle$$
wherein $\alpha^{(h)}_{m,k}$ represents the attention output for features m and k under head h; $\psi^{(h)}(e_m, e_k)$ represents the aggregated output of the embedding vectors of feature m and feature k, here their inner product after projection; $W^{(h)}_{Query}$ represents the linear transformation matrix of the Query vector under head h; $W^{(h)}_{Key}$ represents the linear transformation matrix of the Key vector under head h; $e_m$ represents the embedding vector of feature m; M is the number of features.
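Further, updating the representation of feature m in subspace h by combining all relevant features, guided by the attention coefficients under head h, comprises:
$$\tilde{e}^{(h)}_m = \sum_{k=1}^{M} \alpha^{(h)}_{m,k}\,\big(W^{(h)}_{Value}\, e_k\big), \qquad \tilde{e}_m = \tilde{e}^{(1)}_m \oplus \tilde{e}^{(2)}_m \oplus \cdots \oplus \tilde{e}^{(H)}_m$$
wherein $\tilde{e}^{(h)}_m$ represents the embedding vector of feature m under head h; $W^{(h)}_{Value}$ represents the linear transformation matrix of the Value vector under head h; $\tilde{e}_m$ represents the overall embedding vector of feature m across all heads; $\oplus$ is the concatenation operator, and H is the total number of heads.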
Further, the process by which the fusion layer fuses the user's long-term and short-term interest features is as follows:
encoding the user's long-term interest preference from different feature scales, the long-term interest preference including at least item ID, leaf category, first-level category, shop and brand;
modeling the different feature sets through an attention layer: because a user is likely to prefer, for example, different shops or different categories to different degrees, the user's attention scores under the different feature sets are calculated with a query vector;
and constructing a gated neural network that takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them inside the gated neural network, and finally outputs a recommendation list.
Further, the attention scores are used as the contribution degrees of the long-term and short-term interest preferences input into the gated neural network, wherein the contribution degree is calculated according to the following formulas:
$$\alpha_k = \frac{\exp\big((g^u_k)^{\top} e_u\big)}{\sum_{j} \exp\big((g^u_j)^{\top} e_u\big)}, \qquad z^u_f = \sum_{k} \alpha_k\, g^u_k, \qquad z_u = \mathrm{concat}\big(\{ z^u_f \mid f \in F \}\big)$$
$$p_u = \tanh(W_p z_u + b)$$
wherein $\alpha_k$ represents the attention score of entry k; $e_u$ represents the embedding vector of the user; $g^u_k$ and $g^u_j$ represent the long-term session interest vectors of entries k and j under a feature f; $z^u_f$ represents the user's fused interest vector under feature f; $z_u$ represents the user's fused interest vector, which is passed through a fully connected neural network; $p_u$ represents the long-term behavior vector of the user; $W_p$ represents the linear transformation matrix of the vector $z_u$; b represents a constant matrix; F denotes the set of features f.
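The attention pooling that produces the long-term behavior vector can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes dot-product attention with the user embedding e_u as the query and a softmax over the entries of each feature set. The function name `long_term_vector`, the dictionary layout of the feature sets and the example weights are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def long_term_vector(feature_sets, e_u, W_p, b):
    """feature_sets maps a feature name (e.g. 'shop') to a list of embeddings.

    Assumed form: alpha_k = softmax(g_k . e_u), z_f = sum_k alpha_k g_k,
    z_u = concat of all z_f, p_u = tanh(W_p z_u + b).
    """
    z_u = []
    for embeddings in feature_sets.values():
        alphas = softmax([dot(g, e_u) for g in embeddings])
        dim = len(embeddings[0])
        z_f = [sum(a * g[c] for a, g in zip(alphas, embeddings)) for c in range(dim)]
        z_u.extend(z_f)
    return [math.tanh(v + bias) for v, bias in zip(matvec(W_p, z_u), b)]

identity = [[1.0, 0.0], [0.0, 1.0]]
p_u = long_term_vector({"shop": [[1.0, 0.0], [0.0, 1.0]]}, [0.0, 0.0], identity, [0.0, 0.0])
```

With a zero user embedding every entry receives the same attention weight, so the pooled vector is the plain average of the entries before the tanh projection.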
Further, the gated neural network includes:
$$G^u_t = \mathrm{sigmoid}\big(W_1 e_u + W_2 s^u_t + W_3 p_u + b\big), \qquad o^u_t = (1 - G^u_t) \odot p_u + G^u_t \odot s^u_t$$
wherein $G^u_t$ represents the gate vector, which controls the contribution of long-term and short-term interest to the overall commodity recommendation; $W_1$ denotes the linear transformation matrix of the vector $e_u$; $e_u$ represents the user embedding vector; $W_2$ denotes the linear transformation matrix of the vector $s^u_t$; $s^u_t$ represents the short-term behavior vector of the user; $W_3$ denotes the linear transformation matrix of the vector $p_u$; b represents a constant matrix; $o^u_t$ represents the user behavior vector; $\odot$ denotes element-wise multiplication; $p_u$ represents the long-term behavior vector of the user.
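A gate of this kind can be sketched in Python as follows. The sketch assumes a sigmoid gate computed from the user embedding, the short-term vector and the long-term vector, and an output that mixes the two interest vectors element-wise with weights G and 1 - G. The function name `gated_fusion` and the example values are hypothetical, and the exact gate form is an assumption rather than a statement of the patented network.

```python
import math

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def gated_fusion(e_u, s_t, p_u, W1, W2, W3, b):
    """Assumed form: G = sigmoid(W1 e_u + W2 s_t + W3 p_u + b),
    output = (1 - G) * p_u + G * s_t, element-wise."""
    pre = [a + c + d + e for a, c, d, e in
           zip(matvec(W1, e_u), matvec(W2, s_t), matvec(W3, p_u), b)]
    gate = [1.0 / (1.0 + math.exp(-v)) for v in pre]
    return [(1.0 - g) * p + g * s for g, p, s in zip(gate, p_u, s_t)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
fused = gated_fusion([1.0, 1.0], [1.0, 3.0], [3.0, 1.0], zeros, zeros, zeros, [0.0, 0.0])
```

With all-zero weights the gate equals 0.5 everywhere, so the fused vector is the midpoint of the short-term and long-term vectors. Trained weights would shift each dimension toward one side or the other.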
The invention has the following beneficial technical effects:
(1) The invention belongs to the field of personalized recommendation systems and achieves high precision with few parameters; it can perform dynamic, personalized commodity recommendation for users of an e-commerce platform.
(2) A novel network architecture for extracting short-term user interest preferences is provided, with which session-based user interest preferences can be captured more accurately and dynamically.
(3) A network architecture for automatically extracting high-order features of long-term interest preferences is provided. It removes the need to rely on expert domain knowledge to extract practically meaningful feature combinations, and it also addresses the lack of interpretability of traditional low-order feature combination algorithms, which can only learn implicit feature combinations.
(4) A gated network is provided that fuses long-term and short-term interest preferences, so that recommendations follow the user's dynamic interests more accurately.
Drawings
FIG. 1 is a schematic structural diagram of a long-short interest preference network according to the present invention;
FIG. 2 is a diagram illustrating the effect of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a commodity list recommendation method based on long-term and short-term interest preferences, in which commodities to be recommended are input into a network comprising an embedding layer, an interest extraction layer, an interaction layer and an interest fusion layer, and the commodity list recommended to a user is selected through the following steps:
S1, using the list of commodities clicked by the user in the short-term session as a data set, and inputting the data set into the embedding layer, which embeds every input element into a fixed-size low-dimensional vector;
S2, the interest extraction layer outputs the user's short-term interest preference vector by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities;
S3, using the long-term historical data as a data set, and inputting it into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
S4, inputting the user's long-term interest low-dimensional vectors into the interaction layer, finding high-order features in the interaction layer with a multi-head attention mechanism, and outputting the user's long-term behavior preference;
S5, inputting the short-term features extracted by the interest extraction layer and the long-term features extracted by the interaction layer into the fusion layer for fusion, and taking the fusion result as the recommended commodity list.
In particular, a short-term session refers to a user's session within a short period of time, and a long-term session refers to sessions over a relatively long period. Unless otherwise specified in the embodiments of the present invention, a short-term session refers to the user's commodity click sequence within 15 minutes in one session, and a long-term session refers to the user's commodity click sequence over one week or more.
Example 1
This embodiment describes how the data set is acquired and processed.
As shown in FIG. 1, the data sets of clicked commodities, namely the long-term session data and the short-term session data, are each input into the embedding layer, which embeds every input element into a fixed-size low-dimensional vector; the interest extraction layer outputs the user's short-term interest preference by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities; the user's long-term interest low-dimensional vectors are input into the interaction layer, which finds high-order features with a multi-head attention mechanism and outputs the user's long-term behavior preference; the short-term and long-term features extracted by the interest extraction layer and the interaction layer, respectively, are input into the fusion layer for fusion, and the fusion result is taken as the recommended commodity list.
In this embodiment, the data set is sampled from Taobao and consists of randomly selected active users who interacted with more than 40 commodities during 8 consecutive days in December 2018. In addition, users with more than 1000 interactions are filtered out, since this embodiment regards them as fake users.
In the collected historical interaction data, the first 7 days are used for training and the 8th day is used for testing. This embodiment also filters out commodities that appear fewer than 5 times in the interaction data set.
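The two filtering rules of this embodiment can be sketched in Python as follows. The thresholds 1000 and 5 come from the text; the order of the two filters (users first, then commodities counted on the remaining data), the function name `filter_interactions` and the (user, item) pair layout are assumptions made for illustration.

```python
from collections import Counter

def filter_interactions(interactions, max_user_interactions=1000, min_item_count=5):
    """interactions: list of (user_id, item_id) pairs.

    Drops users with more than max_user_interactions interactions, then drops
    items that appear fewer than min_item_count times in what remains.
    """
    user_counts = Counter(user for user, _ in interactions)
    kept = [(u, i) for u, i in interactions
            if user_counts[u] <= max_user_interactions]
    item_counts = Counter(item for _, item in kept)
    return [(u, i) for u, i in kept if item_counts[i] >= min_item_count]

# Tiny example with lowered thresholds so the effect is visible.
data = [("u1", "a")] * 3 + [("u2", "b")] * 2 + [("bot", "a")] * 6
cleaned = filter_interactions(data, max_user_interactions=5, min_item_count=3)
```

In the example the user "bot" is removed for exceeding the interaction limit, and item "b" is removed because it appears only twice.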
This embodiment generates sessions according to the following rules:
interactions with the same session ID belong to the same session;
adjacent interactions separated by less than 10 minutes are merged into one session;
the maximum length of a session is set to 50, which means that a new session starts when the session length exceeds 50;
the latest session of user u is regarded as short-term behavior, where m is the length of the sequence;
the behaviors of user u that occurred in sessions during the week before it are regarded as long-term behavior;
given the short-term behavior and the long-term behavior of user u, this embodiment recommends items for user u;
during training, the maximum length of each is limited to 20, and sessions with length less than 2 are deleted.
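The session generation rules above can be sketched in Python as follows. This is a simplified illustration for the click stream of a single user. The event layout (timestamp in seconds, session ID, item ID) and the function name `build_sessions` are hypothetical, and the long-term length limit of 20 is not modeled here.

```python
def build_sessions(events, gap_seconds=600, max_len=50, min_len=2):
    """events: list of (timestamp_seconds, session_id, item_id), sorted by time.

    An event joins the current session if it shares the session ID of the
    previous event or follows it by less than gap_seconds. A session that has
    reached max_len is closed and a new one is started. Sessions shorter than
    min_len are dropped.
    """
    sessions, current = [], []
    last_ts, last_sid = None, None
    for ts, sid, item in events:
        joins = bool(current) and (sid == last_sid or ts - last_ts < gap_seconds)
        if not joins or len(current) >= max_len:
            if current:
                sessions.append(current)
            current = []
        current.append(item)
        last_ts, last_sid = ts, sid
    if current:
        sessions.append(current)
    return [s for s in sessions if len(s) >= min_len]

events = [(0, "s1", "a"), (100, "s1", "b"), (5000, "s2", "c"),
          (5100, "s3", "d"), (99999, "s4", "e")]
sessions = build_sessions(events)
short_term = sessions[-1]   # latest session: short-term behavior
long_term = sessions[:-1]   # earlier sessions: long-term behavior
```

In the example, "c" and "d" carry different session IDs but are only 100 seconds apart, so they are merged; the isolated click "e" forms a session of length 1 and is removed.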
In the testing phase, this embodiment selects about 1 million active users on day 8 for rapid evaluation. For these users, the first 25% of the short-term sessions on day 8 are fed into the model, and the remaining interactions serve as the ground truth. In addition, a customer may browse certain items many times in a day, and repeated recommendations should be discouraged; therefore each such item is retained only once in the user's test data.
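The test-time split can be sketched in Python as follows. The sketch reads "the first 25%" as the first quarter of a user's day-8 sessions by count and keeps at least one session as model input; both readings, together with the function name `split_test_day`, are assumptions made for illustration.

```python
def split_test_day(sessions, input_fraction=0.25):
    """sessions: the user's day-8 sessions in time order, each a list of item IDs.

    Returns (model_input, ground_truth): the leading sessions fed to the model
    and the remaining items, each kept only once in first-seen order.
    """
    n_input = max(1, int(len(sessions) * input_fraction))
    model_input = sessions[:n_input]
    seen, ground_truth = set(), []
    for session in sessions[n_input:]:
        for item in session:
            if item not in seen:
                seen.add(item)
                ground_truth.append(item)
    return model_input, ground_truth

day8 = [["a", "b"], ["c", "a"], ["c", "d"], ["e", "e"]]
model_input, ground_truth = split_test_day(day8)
```

Items that the user browsed repeatedly during the rest of the day ("c" and "e" in the example) appear only once in the ground truth.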
In particular, a short-term session as referred to in embodiments of the present invention generally refers to a user's click sequence of items within a short period of time, e.g., within 15 minutes, in a session, while a long-term session refers to a user's click sequence of items within a week or more.
Example 2
On the basis of embodiment 1, this embodiment provides a data processing method.
Non-serialized data, which comprises user attribute information and commodity attributes, is processed at the Embedding layer. The Embedding layer handles three kinds of features: single-valued discrete features, multi-valued discrete features and continuous features. For a single-valued discrete feature, the corresponding Embedding representation is obtained directly from the Embedding vocabulary:
$$e_i = V_i x_i$$
wherein $V_i$ represents the Embedding matrix of feature field i. For discrete features, $x_i$ is a one-hot vector or a multi-hot vector whose entries are either 0 or 1. For a continuous feature, $x_m$ is a scalar whose value is multiplied directly by the corresponding Embedding:
$$e_m = v_m x_m$$
wherein $e_m$ represents the Embedding vector of feature m, $v_m$ represents the Embedding vector of the field of feature m, and $x_m$ represents the scalar value of feature m.
For a multi-valued discrete feature, after the corresponding Embeddings are obtained from the Embedding vocabulary, the Embeddings of the same field are also averaged by average pooling:
$$e_i = \frac{1}{q} V_i x_i$$
wherein q is the number of values taken by the multi-valued discrete feature and $x_i$ is its multi-hot vector.
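The three cases handled by the Embedding layer can be sketched in Python as follows. The Embedding vocabulary is stored here as a list of vectors, one per vocabulary entry, so that multiplying by a one-hot vector reduces to an index lookup. The function names and the example vocabulary are hypothetical.

```python
def embed_single(vocab, index):
    """Single-valued discrete feature: one-hot lookup of one vocabulary entry."""
    return list(vocab[index])

def embed_multi(vocab, indices):
    """Multi-valued discrete feature: average pooling over the q active entries."""
    q = len(indices)
    dim = len(vocab[0])
    return [sum(vocab[i][d] for i in indices) / q for d in range(dim)]

def embed_continuous(field_vector, value):
    """Continuous feature: the scalar value scales the field's Embedding vector."""
    return [value * component for component in field_vector]

vocab = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
e_single = embed_single(vocab, 1)                 # one category
e_multi = embed_multi(vocab, [0, 2])              # two categories, averaged
e_cont = embed_continuous([0.5, -1.0], 2.0)       # scalar value 2.0
```

All three functions return a vector of the same fixed size, which is what allows the later layers to treat every feature uniformly.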
Example 3
In this embodiment, the process by which the interest extraction layer extracts the user's short-term interest features specifically includes:
encoding the order of the commodity sequence clicked by the user, this operation being referred to as position encoding, and performing bias encoding on the basis of the position encoding;
updating the user's behavior sessions after adding the bias encoding;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the user's k-th session interest, which is taken as the user's feature.
Serialized data is then processed. The serialized data is a segment of a user's commodity interaction behaviors within one session ID. In the session interest extractor layer, behaviors in the same session are closely related to each other. However, a user's random behaviors within a session may make the session interest deviate from its original expression. To capture the internal relationships between behaviors in the same session and reduce the impact of these unrelated behaviors, a multi-head self-attention mechanism is applied in each session, with some improvements to the self-attention mechanism to better achieve this goal.
To exploit the order of the sequence, the self-attention mechanism applies position encoding to the input embeddings. In addition, the order of the sessions and the biases that exist in the different representation subspaces also need to be captured. Therefore, bias encoding is performed on the basis of position encoding, wherein each element of the bias encoding BE is defined as follows:
$$BE_{(k,t,c)} = w^K_k + w^T_t + w^C_c$$
wherein $w^K_k$ is the bias of the session and the subscript k is the index of the session; $w^T_t$ is the bias of the position in the session and the subscript t is the index of the behavior in the session; $w^C_c$ is the bias of the unit position in the behavior embedding and the subscript c is the index of the unit in the behavior embedding. After adding the bias encoding, the user's behavior sessions Q are updated as follows:
$$Q = Q + BE$$
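The bias encoding can be sketched in Python as follows, assuming that each element of BE is the sum of the three bias terms described above (session, position in session, unit in the behavior embedding). The function names and the nested-list layout Q[k][t][c] are hypothetical.

```python
def bias_encoding(w_session, w_position, w_unit):
    """Builds BE with BE[k][t][c] = w_session[k] + w_position[t] + w_unit[c]."""
    return [[[wk + wt + wc for wc in w_unit]
             for wt in w_position]
            for wk in w_session]

def add_bias(Q, BE):
    """Element-wise update Q = Q + BE for sessions x behaviors x units."""
    return [[[q + b for q, b in zip(q_row, b_row)]
             for q_row, b_row in zip(q_sess, b_sess)]
            for q_sess, b_sess in zip(Q, BE)]

# 2 sessions, 3 behaviors per session, embedding size 1.
BE = bias_encoding([1.0, 2.0], [0.1, 0.2, 0.3], [10.0])
Q = [[[0.0] for _ in range(3)] for _ in range(2)]
Q = add_bias(Q, BE)
```

Because the three bias vectors are learned separately, the number of extra parameters grows with the sum of the three dimensions rather than their product.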
Multi-head self-attention mechanism: in a recommendation system, the user's click behavior is influenced by various factors (e.g., color, style and price). Multi-head self-attention can capture relationships in different representation subspaces. Mathematically, let $Q_k = [Q_{k1}; \ldots; Q_{kh}; \ldots; Q_{kH}]$, where $Q_{kh}$ is the h-th head of $Q_k$ and H is the number of heads. The output of $\mathrm{head}_h$ is calculated as follows:
$$\mathrm{head}_h = \mathrm{Attention}\big(Q_{kh}W^Q,\, Q_{kh}W^K,\, Q_{kh}W^V\big) = \mathrm{softmax}\!\left(\frac{Q_{kh}W^Q\,(W^K)^{\top} Q_{kh}^{\top}}{\sqrt{d_{model}}}\right) Q_{kh}W^V$$
wherein $Q_{kh}$ is the h-th head of $Q_k$; $Q_k$ is the Query of the k-th session; $W^Q$ is the linear transformation matrix of the Query vector; $W^K$ is the linear transformation matrix of the Key vector; $W^V$ is the linear transformation matrix of the Value vector; $d_{model}$ is the dimension of the model input.
The vectors of the different heads are then concatenated and fed into a feed-forward network:
$$I_k^Q = \mathrm{FFN}\big(\mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_H)\, W^O\big)$$
wherein FFN(·) is a feed-forward network and $W^O$ is a linear matrix.
After residual connection and layer normalization are applied in succession, the user's k-th session interest $I_k$ is calculated as follows:
$$I_k = \mathrm{Avg}\big(I_k^Q\big)$$
wherein Avg is average pooling; in particular, the weights are shared among the self-attention mechanisms of the different sessions.
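The computation of one session interest can be sketched in Python as follows. This is a simplified illustration under stated assumptions: scaled dot-product attention per head with scaling by the square root of the model dimension, a plain ReLU as a stand-in for the feed-forward network, and no residual connection or layer normalization. The function names and the identity weights in the example are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(Q_h, W_Q, W_K, W_V, d_model):
    """softmax((Q_h W_Q)(Q_h W_K)^T / sqrt(d_model)) (Q_h W_V)"""
    q, k, v = matmul(Q_h, W_Q), matmul(Q_h, W_K), matmul(Q_h, W_V)
    k_t = [list(col) for col in zip(*k)]
    weights = [softmax([s / math.sqrt(d_model) for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)

def session_interest(Q_k, heads, W_O, ffn=None):
    """Q_k: T behaviors x d_model units; heads: list of (W_Q, W_K, W_V) per head."""
    d_model, H = len(Q_k[0]), len(heads)
    d_h = d_model // H
    outputs = []
    for h, (W_Q, W_K, W_V) in enumerate(heads):
        Q_h = [row[h * d_h:(h + 1) * d_h] for row in Q_k]   # slice of head h
        outputs.append(attention_head(Q_h, W_Q, W_K, W_V, d_model))
    concat = [[v for out in outputs for v in out[t]] for t in range(len(Q_k))]
    projected = matmul(concat, W_O)
    if ffn is None:
        ffn = lambda row: [max(0.0, v) for v in row]        # stand-in FFN (assumption)
    hidden = [ffn(row) for row in projected]
    return [sum(row[c] for row in hidden) / len(hidden) for c in range(len(hidden[0]))]

eye2 = [[1.0, 0.0], [0.0, 1.0]]
eye4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
I_k = session_interest([[1.0, 2.0, 3.0, 4.0]] * 2, [(eye2, eye2, eye2)] * 2, eye4)
```

When every behavior in the session has the same embedding, any attention weighting returns that embedding, so with identity weights the session interest equals the behavior vector itself. This gives a quick sanity check for the slicing and concatenation.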
Example 4
This embodiment provides the process by which the interaction layer extracts the user's long-term interest features, which specifically comprises:
obtaining the correlation between feature m and feature k defined under a specific attention head h;
updating the representation of feature m in subspace h by combining all relevant features, guided by the attention coefficients between feature m and each feature k under head h;
and adding a standard residual connection to the network to obtain the final long-term interest features.
The interaction layer is the encoder part of a Transformer and is stacked in multiple layers to learn high-order combinations of features. The invention defines the correlation between feature $m$ and feature $k$ under a specific attention head $h$, which can be expressed as:
$$\alpha_{m,k}^{(h)} = \frac{\exp\left(\psi^{(h)}(e_m, e_k)\right)}{\sum_{l=1}^{M}\exp\left(\psi^{(h)}(e_m, e_l)\right)}, \qquad \psi^{(h)}(e_m, e_k) = \left\langle W_{Query}^{(h)} e_m,\; W_{Key}^{(h)} e_k \right\rangle$$
where $\alpha_{m,k}^{(h)}$ is the attention output for features $m$ and $k$ under head $h$; $\psi^{(h)}(e_m, e_k)$ is the aggregated output of the embedding vectors of feature $m$ and feature $k$ (here, the inner product of their projections); $M$ is the number of features; $W_{Query}^{(h)}$ is the linear transformation matrix of the Query vector under head $h$; and $W_{Key}^{(h)}$ is the linear transformation matrix of the Key vector under head $h$. Both matrices map the original embedding space $R^{d}$ to a new space $R^{d'}$.
The representation of feature $m$ in subspace $h$ is then updated by combining all relevant features, guided by the coefficients $\alpha_{m,k}^{(h)}$:
$$\tilde{e}_m^{(h)} = \sum_{k=1}^{M} \alpha_{m,k}^{(h)} \left(W_{Value}^{(h)} e_k\right), \qquad \tilde{e}_m = \tilde{e}_m^{(1)} \oplus \tilde{e}_m^{(2)} \oplus \cdots \oplus \tilde{e}_m^{(H)}$$
where $\tilde{e}_m^{(h)}$ is the embedding vector of feature $m$ under head $h$; $W_{Value}^{(h)}$ is the linear transformation matrix of the Value vector under head $h$; $\tilde{e}_m$ is the overall vector representation of feature $m$ formed from its embedding vectors under all heads; $\oplus$ is the concatenation operator; and $H$ is the total number of heads.
To preserve the previously learned combined features, including the original individual features, a standard residual connection is added to the network. Formally,
$$e_m^{Res} = \mathrm{ReLU}\left(\tilde{e}_m + W_{Res}\, e_m\right)$$
where $e_m^{Res}$ is the activated embedding vector of feature $m$ output by the interest interaction layer; $W_{Res}$ is the linear transformation matrix applied to feature $m$ in the interest interaction layer; and $\mathrm{ReLU}(z) = \max(0, z)$ is a non-linear activation function. With such an interaction layer, each feature $e_m$ is updated to a new feature representation $e_m^{Res}$, which is a representation of higher-order features. Multiple such layers can be stacked, with the output of one interaction layer used as the input of the next. In this way, combined features of any order can be modeled.
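The interaction layer described in this example can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the similarity between two features is taken to be the inner product of their Query and Key projections, and the function names, parameter shapes, and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def interacting_layer(E, Wq, Wk, Wv, Wres):
    """E: (M, d) feature embeddings. Wq/Wk/Wv: (H, d_prime, d). Wres: (H*d_prime, d)."""
    heads = []
    for h in range(Wq.shape[0]):
        q = E @ Wq[h].T                  # (M, d') Query projections
        k = E @ Wk[h].T                  # (M, d') Key projections
        v = E @ Wv[h].T                  # (M, d') Value projections
        psi = q @ k.T                    # psi[m, k]: inner-product similarity (assumed)
        alpha = softmax(psi, axis=1)     # correlation of feature m with every feature k
        heads.append(alpha @ v)          # updated feature m in subspace h
    e_tilde = np.concatenate(heads, axis=1)          # concatenate the H heads
    return np.maximum(0.0, e_tilde + E @ Wres.T)     # residual connection + ReLU

rng = np.random.default_rng(0)
M, d, d_prime, H = 4, 6, 3, 2

def random_params(d_in):
    """Hypothetical random initialisation for one layer with input width d_in."""
    Wq, Wk, Wv = (rng.normal(size=(H, d_prime, d_in)) for _ in range(3))
    Wres = rng.normal(size=(H * d_prime, d_in))
    return Wq, Wk, Wv, Wres

E = rng.normal(size=(M, d))
out1 = interacting_layer(E, *random_params(d))                  # second-order combinations
out2 = interacting_layer(out1, *random_params(H * d_prime))     # stacked layer, higher order
```

Stacking works because the output width of one layer, `H * d_prime`, becomes the input width of the next.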
Example 5
The short-term interest features of the user obtained in Example 3 and the long-term interest features of the user obtained in Example 4 are used as the input of the interest fusion layer for feature fusion, which specifically comprises the following steps:
encoding the user's long-term interest preference from different feature scales, the user's long-term interest preference including at least item ID, leaf category, first-level category, shop, and brand;
and constructing a gated neural network that takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them within the gated neural network, and finally outputs a recommendation list.
The long-term behavior $L_u$ is encoded from different feature scales. $L_u$ is formed of multiple subsets, namely item ID, leaf category, first-level category, shop, and brand. The attention score is calculated using the user profile embedding as the query vector, and the resulting representation is:
$$\alpha_k = \frac{\exp\left(g_k^{\top} e_u\right)}{\sum_{j} \exp\left(g_j^{\top} e_u\right)}, \qquad z_f = \sum_{k} \alpha_k\, g_k$$
$$z_u = \mathrm{concat}\left(\{z_f \mid f \in F\}\right)$$
$$p_u = \tanh(W_p z_u + b)$$
where $\alpha_k$ is the attention score of the $k$-th entity of a feature set; $g_k$ and $g_j$ are the embedding vectors of the $k$-th and $j$-th entities of the user's long-term behavior under feature $f$, and $g_k^{\top}$ is the transpose of $g_k$; $e_u$ is the user's embedding vector; $z_f$ is the user's long-term interest vector under feature $f$; $z_u$ is the user's fused interest vector, which is fed to a fully connected neural network; $p_u$ is the user's long-term behavior vector; $W_p$ is the linear transformation matrix applied to $z_u$; $b$ is a constant (bias) term; and $f$ ranges over the set of features $F$.
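The long-term encoding described above, attention over each feature subset with the user embedding as the query followed by a tanh layer, can be sketched as follows. The dictionary keys, function names, and sizes are hypothetical, and the use of a dot product between the user embedding and each entity embedding as the attention logit is an assumption.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(G, e_u):
    """G: (n, d) entity embeddings of one feature subset; e_u: (d,) user embedding (query)."""
    alpha = softmax(G @ e_u)          # one attention score per entity
    return alpha @ G, alpha           # weighted sum of the entity embeddings

def long_term_vector(feature_sets, e_u, W_p, b):
    """Pool every feature subset, concatenate, and apply p_u = tanh(W_p z_u + b)."""
    z_u = np.concatenate([attention_pool(G, e_u)[0] for G in feature_sets.values()])
    return np.tanh(W_p @ z_u + b)

rng = np.random.default_rng(0)
d = 8
# Hypothetical long-term behavior: each subset holds a different number of entities.
feature_sets = {
    "item_id": rng.normal(size=(20, d)),
    "leaf_category": rng.normal(size=(7, d)),
    "first_level_category": rng.normal(size=(4, d)),
    "shop": rng.normal(size=(9, d)),
    "brand": rng.normal(size=(6, d)),
}
e_u = rng.normal(size=d)
W_p = 0.1 * rng.normal(size=(d, d * len(feature_sets)))
b = np.zeros(d)

p_u = long_term_vector(feature_sets, e_u, W_p, b)
_, alpha_shop = attention_pool(feature_sets["shop"], e_u)
```

Pooling each subset separately lets a user weight, for example, shops differently from brands before the subsets are merged.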
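To incorporate short-term behavior, the present embodiment designs a gated neural network that takes the long-term session interest preference and the short-term session interest preference as input. A gate vector $G_t$, computed with a sigmoid, determines the contribution percentages of the short-term and long-term interests at time $t$:
$$G_t = \mathrm{sigmoid}\left(W_1 e_u + W_2 s_t + W_3 p_u + b\right)$$
$$o_t = (1 - G_t) \odot p_u + G_t \odot s_t$$
where $s_t$ is the user's short-term behavior vector, $o_t$ is the user's fused behavior vector, and $\odot$ denotes element-wise multiplication.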
After the user's interest representation $o_t$ is obtained, the next item the user interacts with is taken from the log as the positive example, and $K-1$ items are sampled as negative examples. The inner product between the user's interest representation and the embedding of each of the $K$ items is computed as the score of that item. Finally, the loss is calculated through softmax and cross entropy, and the model is trained:
$$\hat{y} = \mathrm{softmax}(z), \qquad L(\hat{y}) = -\sum_{i \in \mathcal{K}} y_i \log(\hat{y}_i)$$
where $\hat{y}$ is the predicted probability distribution over the sampled items; $z$ is the vector of item scores; $L(\hat{y})$ is the cross-entropy loss function; $\mathcal{K}$ is the set of $K$ items; $y_i$ is the true probability of item $i$; and $\hat{y}_i$ is the predicted probability of item $i$.
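The gate fusion and the sampled softmax loss described in this example can be sketched together. The sketch assumes the gate weights the short-term vector and its complement weights the long-term vector; the function names, the convention that the positive item sits at index 0, and all sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate_fusion(e_u, s_t, p_u, W1, W2, W3, b):
    """Fuse short-term s_t and long-term p_u with an element-wise gate."""
    gate = sigmoid(W1 @ e_u + W2 @ s_t + W3 @ p_u + b)
    fused = (1.0 - gate) * p_u + gate * s_t      # assumed direction of the gate
    return fused, gate

def sampled_softmax_loss(o_t, item_embs, positive_index=0):
    """item_embs: (K, d) with one positive and K-1 sampled negatives."""
    z = item_embs @ o_t                          # inner-product score per item
    z = z - z.max()                              # stabilise the softmax
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[positive_index], np.exp(log_probs)

rng = np.random.default_rng(0)
d, K = 8, 5
e_u, s_t, p_u = (rng.normal(size=d) for _ in range(3))
W1, W2, W3 = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
b = np.zeros(d)

o_t, gate = gate_fusion(e_u, s_t, p_u, W1, W2, W3, b)
item_embs = rng.normal(size=(K, d))              # row 0 is the positive item here
loss, probs = sampled_softmax_loss(o_t, item_embs)
```

Because the gate is applied per dimension, each component of the fused vector lies between the corresponding long-term and short-term components.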
As shown in FIG. 2, in a case from the real data set, the user's short-term session $S_u$ contains red wine glasses and champagne glasses. The network model of this patent directly recommends champagne glasses, which are related to the last click in the short-term session, meaning that the user is currently more likely to be interested in champagne glasses. Meanwhile, the gate fusion module of the model captures the most relevant item, red wine, from the user's large volume of long-term behavior $L_u$, which contains many unrelated clicks such as beer, paring knives, and small plates, and combines it with the red wine glasses in the short-term session to generate the recommended item, a red wine decanter. This case shows that the gate module of the model fuses long-term and short-term interests effectively and accurately.
The invention provides a commodity list recommendation method based on long-term and short-term interest preferences. Serialized data and non-serialized data are modeled separately and then combined through a gate fusion mechanism to recommend the commodity list, which yields better recommendation results.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (10)
1. A commodity list recommendation method based on long-term and short-term interest preferences, characterized in that the commodities to be recommended are input to a commodity list recommendation model comprising an embedding layer, an interest extraction layer, an interaction layer, and an interest fusion layer, and the commodity list recommended to a user is selected through the following steps:
S1, taking the information on the commodity list clicked by the user in the short-term session as a data set and inputting the data set into the embedding layer, the embedding layer embedding all input elements into low-dimensional vectors of fixed size;
S2, the interest extraction layer outputs the short-term interest preference of the user by capturing the relationship between the user's short-term interest low-dimensional vectors and the candidate commodities;
S3, taking the long-term historical data as a data set and inputting the data into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
S4, inputting the user's long-term interest low-dimensional vectors into the interaction layer, searching for high-order features in the interaction layer by using a multi-head attention mechanism, and outputting the long-term behavior preference of the user;
and S5, inputting the short-term and long-term features extracted by the interest extraction layer and the interaction layer, respectively, into the fusion layer for fusion, and taking the fusion result as the recommended commodity list.
2. The method as claimed in claim 1, wherein the process of extracting the short-term interest features of the user by the interest extraction layer comprises:
sequentially encoding the sequence of commodities clicked by the user, this operation being recorded as position encoding, and performing bias encoding on the basis of the position encoding;
updating the behavior sessions of the user after adding the bias encoding;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the k-th session interest of the user, wherein the k-th session interest is the short-term interest feature of the user.
3. The method as claimed in claim 2, wherein the interest of the k-th session of the user is expressed as:
$$I_k = \mathrm{Avg}(I_k^Q), \qquad I_k^Q = \mathrm{FFN}(\mathrm{Concat}(head_1, \ldots, head_H)\,W^O)$$
where $I_k$ is the user's interest in the $k$-th session; Avg is the average pooling operation; $I_k^Q$ is the user's $k$-th session interest vector computed from $Q$; $\mathrm{FFN}(\cdot)$ is the feed-forward network; Concat denotes the concatenation of vectors; $head_H$ is the $H$-th head vector in the Transformer model; and $W^O$ is the output linear transformation matrix.
4. The method as claimed in claim 3, wherein the $h$-th head vector $head_h$ ($h = 1, \ldots, H$) in the Transformer model is expressed as:
$$head_h = \mathrm{Attention}(Q_{kh}W^Q, Q_{kh}W^K, Q_{kh}W^V) = \mathrm{softmax}\left(\frac{Q_{kh}W^Q\,(Q_{kh}W^K)^{\top}}{\sqrt{d_{model}}}\right) Q_{kh}W^V$$
where $Q_{kh}$ is the $h$-th head vector of $Q_k$; $Q_k$ is the Query vector of the $k$-th session; $W^Q$ is the linear transformation matrix of the Query vector; $W^K$ is the linear transformation matrix of the Key vector; $W^V$ is the linear transformation matrix of the Value vector; and $d_{model}$ is the dimension of the model input.
5. The method as claimed in claim 1, wherein the process of extracting the long-term interest features of the user by the interaction layer comprises:
defining the correlation between feature m and feature k under a specific attention head h;
updating the representation of feature m in subspace h by combining all relevant features, guided by the attention coefficients between feature m and the other features under head h;
and adding a standard residual connection to the network to obtain the final long-term interest features.
6. The method as claimed in claim 5, wherein the correlation between feature m and feature k is defined as follows:
$$\alpha_{m,k}^{(h)} = \frac{\exp\left(\psi^{(h)}(e_m, e_k)\right)}{\sum_{l=1}^{M}\exp\left(\psi^{(h)}(e_m, e_l)\right)}, \qquad \psi^{(h)}(e_m, e_k) = \left\langle W_{Query}^{(h)} e_m,\; W_{Key}^{(h)} e_k \right\rangle$$
where $\alpha_{m,k}^{(h)}$ is the attention output for features $m$ and $k$ under head $h$; $\psi^{(h)}(e_m, e_k)$ is the aggregated output of the embedding vectors of feature $m$ and feature $k$; $M$ is the number of features; $W_{Query}^{(h)}$ is the linear transformation matrix of the Query vector under head $h$; $W_{Key}^{(h)}$ is the linear transformation matrix of the Key vector under head $h$; and $e_m$ is the embedding vector of feature $m$.
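7. The method of claim 5, wherein updating the representation of feature m in subspace h by combining all relevant features, guided by the coefficients $\alpha_{m,k}^{(h)}$, comprises:
$$\tilde{e}_m^{(h)} = \sum_{k=1}^{M} \alpha_{m,k}^{(h)} \left(W_{Value}^{(h)} e_k\right), \qquad \tilde{e}_m = \tilde{e}_m^{(1)} \oplus \tilde{e}_m^{(2)} \oplus \cdots \oplus \tilde{e}_m^{(H)}$$
where $\tilde{e}_m^{(h)}$ is the embedding vector of feature $m$ under head $h$; $W_{Value}^{(h)}$ is the linear transformation matrix of the Value vector under head $h$; $\tilde{e}_m$ is the overall embedding vector of feature $m$; $\oplus$ is the concatenation operator; and $H$ is the total number of heads.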
8. The method for recommending a commodity list based on long-term and short-term interest preferences as claimed in claim 1, wherein the process of fusing the long-term and short-term interest features of the user by the fusion layer comprises:
encoding the user's long-term interest preference from different feature scales, the user's long-term interest preference including at least item ID, leaf category, first-level category, shop, and brand;
modeling the different feature sets through an attention layer and, since the user may have different degrees of preference for different categories and different shops, calculating the attention scores of the user under the different feature sets with the user's preference as the query vector;
and constructing a gated neural network that takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them within the gated neural network, and finally outputs a recommendation list.
9. The method of claim 8, wherein, according to the attention scores of claim 8, the contributions of the long-term interest preference and the short-term interest preference to the gated neural network are calculated as follows:
$$\alpha_k = \frac{\exp\left(g_k^{\top} e_u\right)}{\sum_{j} \exp\left(g_j^{\top} e_u\right)}, \qquad z_f = \sum_{k} \alpha_k\, g_k$$
$$z_u = \mathrm{concat}\left(\{z_f \mid f \in F\}\right)$$
$$p_u = \tanh(W_p z_u + b)$$
where $\alpha_k$ is the attention score of the $k$-th entity of a feature set; $g_k$ and $g_j$ are the embedding vectors of the $k$-th and $j$-th entities of the user's long-term behavior under feature $f$, and $g_k^{\top}$ is the transpose of $g_k$; $e_u$ is the user's embedding vector; $z_f$ is the user's long-term interest vector under feature $f$; $z_u$ is the user's fused interest vector, which is fed to a fully connected neural network; $p_u$ is the user's long-term behavior vector; $W_p$ is the linear transformation matrix applied to $z_u$; $b$ is a constant (bias) term; and $f$ ranges over the set of features $F$.
10. The method of claim 8, wherein the gated neural network is expressed as:
$$G_t = \mathrm{sigmoid}\left(W_1 e_u + W_2 s_t + W_3 p_u + b\right)$$
$$o_t = (1 - G_t) \odot p_u + G_t \odot s_t$$
where $G_t$ is the gate vector that controls the contributions of the long-term and short-term interests to the overall commodity recommendation; $W_1$ is the linear transformation matrix applied to $e_u$; $e_u$ is the user embedding vector; $W_2$ is the linear transformation matrix applied to $s_t$; $s_t$ is the user's short-term behavior vector; $W_3$ is the linear transformation matrix applied to $p_u$; $b$ is a constant (bias) term; $o_t$ is the user's fused behavior vector; $\odot$ denotes element-wise multiplication; and $p_u$ is the user's long-term behavior vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010693227.4A CN111932336A (en) | 2020-07-17 | 2020-07-17 | Commodity list recommendation method based on long-term and short-term interest preference |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111932336A (en) | 2020-11-13 |
Family
ID=73313786
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010693227.4A Pending CN111932336A (en) | 2020-07-17 | 2020-07-17 | Commodity list recommendation method based on long-term and short-term interest preference |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111932336A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807156A (en) * | 2019-10-23 | 2020-02-18 | 山东师范大学 | Interest recommendation method and system based on user sequence click behaviors |
CN110929164A (en) * | 2019-12-09 | 2020-03-27 | 北京交通大学 | Interest point recommendation method based on user dynamic preference and attention mechanism |
Non-Patent Citations (3)
Title |
---|
FUYU LV ET AL.: "SDM: Sequential Deep Matching Model for Online Large-scale Recommender System", 《PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 * |
WEIPING SONG ET AL.: "AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks", 《PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 * |
YUFEI FENG ET AL.: "Deep Session Interest Network for Click-Through Rate Prediction", 《HTTPS://DOI.ORG/10.48550/ARXIV.1905.06482》 * |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112559905A (en) * | 2020-12-24 | 2021-03-26 | 北京理工大学 | Conversation recommendation method based on dual-mode attention mechanism and social similarity |
CN112559905B (en) * | 2020-12-24 | 2022-09-06 | 北京理工大学 | Conversation recommendation method based on dual-mode attention mechanism and social similarity |
CN112818224A (en) * | 2021-01-26 | 2021-05-18 | 北京百度网讯科技有限公司 | Information recommendation method and device, electronic equipment and readable storage medium |
CN112818224B (en) * | 2021-01-26 | 2024-02-20 | 北京百度网讯科技有限公司 | Information recommendation method and device, electronic equipment and readable storage medium |
CN112905887A (en) * | 2021-02-22 | 2021-06-04 | 中国计量大学 | Conversation recommendation method based on multi-interest short-term priority model |
CN112948716A (en) * | 2021-03-05 | 2021-06-11 | 桂林电子科技大学 | Continuous interest point package recommendation method based on multi-head attention mechanism |
CN112948716B (en) * | 2021-03-05 | 2023-02-28 | 桂林电子科技大学 | Continuous interest point package recommendation method based on multi-head attention mechanism |
CN112948683A (en) * | 2021-03-16 | 2021-06-11 | 山西大学 | Socialized recommendation method with dynamic fusion of social information |
CN112950325A (en) * | 2021-03-16 | 2021-06-11 | 山西大学 | Social behavior fused self-attention sequence recommendation method |
CN112950325B (en) * | 2021-03-16 | 2023-10-03 | 山西大学 | Self-attention sequence recommendation method for social behavior fusion |
CN112862007A (en) * | 2021-03-29 | 2021-05-28 | 山东大学 | Commodity sequence recommendation method and system based on user interest editing |
CN112862007B (en) * | 2021-03-29 | 2022-12-13 | 山东大学 | Commodity sequence recommendation method and system based on user interest editing |
CN113407819B (en) * | 2021-05-20 | 2022-06-17 | 桂林电子科技大学 | Sequence recommendation method, system and storage medium based on residual error network |
CN113407819A (en) * | 2021-05-20 | 2021-09-17 | 桂林电子科技大学 | Sequence recommendation method, system and storage medium based on residual error network |
CN113505215A (en) * | 2021-06-30 | 2021-10-15 | 北京明略软件系统有限公司 | Product recommendation method and device, electronic equipment and storage medium |
CN113722599A (en) * | 2021-09-06 | 2021-11-30 | 中国计量大学 | Conversation recommendation method based on user long-term interest and short-term interest modeling |
CN113868542B (en) * | 2021-11-25 | 2022-03-11 | 平安科技(深圳)有限公司 | Attention model-based push data acquisition method, device, equipment and medium |
CN113868542A (en) * | 2021-11-25 | 2021-12-31 | 平安科技(深圳)有限公司 | Attention model-based push data acquisition method, device, equipment and medium |
WO2023108324A1 (en) * | 2021-12-13 | 2023-06-22 | 中国科学院深圳先进技术研究院 | Comparative learning enhanced two-stream model recommendation system and algorithm |
CN115099886A (en) * | 2022-05-25 | 2022-09-23 | 华南理工大学 | Long and short interest sequence recommendation method and device and storage medium |
CN115099886B (en) * | 2022-05-25 | 2024-04-19 | 华南理工大学 | Long-short interest sequence recommendation method, device and storage medium |
CN116562992A (en) * | 2023-07-11 | 2023-08-08 | 数据空间研究院 | Method, device and medium for recommending items for modeling uncertainty of new interests of user |
CN116562992B (en) * | 2023-07-11 | 2023-09-29 | 数据空间研究院 | Method, device and medium for recommending items for modeling uncertainty of new interests of user |
CN116595157A (en) * | 2023-07-17 | 2023-08-15 | 江西财经大学 | Dynamic interest transfer type session recommendation method and system based on user intention fusion |
CN116595157B (en) * | 2023-07-17 | 2023-09-19 | 江西财经大学 | Dynamic interest transfer type session recommendation method and system based on user intention fusion |
CN117455629A (en) * | 2023-11-24 | 2024-01-26 | 美服数字科技(广州)有限公司 | Live broadcast and cargo carrying intelligent pushing method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111932336A (en) | Commodity list recommendation method based on long-term and short-term interest preference | |
CN109299396B (en) | Convolutional neural network collaborative filtering recommendation method and system fusing attention model | |
CN111523047B (en) | Multi-relation collaborative filtering algorithm based on graph neural network | |
CN111797321B (en) | Personalized knowledge recommendation method and system for different scenes | |
CN111310063B (en) | Neural network-based article recommendation method for memory perception gated factorization machine | |
CN109785062B (en) | Hybrid neural network recommendation system based on collaborative filtering model | |
CN112364976B (en) | User preference prediction method based on session recommendation system | |
CN110781409B (en) | Article recommendation method based on collaborative filtering | |
CN109087178A (en) | Method of Commodity Recommendation and device | |
CN110930219B (en) | Personalized merchant recommendation method based on multi-feature fusion | |
WO2003088107A2 (en) | Determination of attributes based on product descriptions | |
CN112712418B (en) | Method and device for determining recommended commodity information, storage medium and electronic equipment | |
CN111127146A (en) | Information recommendation method and system based on convolutional neural network and noise reduction self-encoder | |
CN106157156A (en) | A kind of cooperation recommending system based on communities of users | |
CN114861050A (en) | Feature fusion recommendation method and system based on neural network | |
CN116228368A (en) | Advertisement click rate prediction method based on deep multi-behavior network | |
CN116932896A (en) | Attention mechanism-based multimode fusion personalized recommendation architecture | |
CN113704439B (en) | Conversation recommendation method based on multi-source information heteromorphic graph | |
CN115293812A (en) | E-commerce platform session perception recommendation prediction method based on long-term and short-term interests | |
CN114519600A (en) | Graph neural network CTR estimation algorithm fusing adjacent node variances | |
CN115525819A (en) | Cross-domain recommendation method for information cocoon room | |
Ahsan et al. | Complementary Recommendations Using Deep Multi-modal Embeddings For Online Retail | |
CN110956528A (en) | Recommendation method and system for e-commerce platform | |
Sharma et al. | Recommendation system for movies using improved version of som with hybrid filtering methods | |
Sharma et al. | A ResNet-101 Based Recommendation System for E-Commerce |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20201113 |