CN111932336A - Commodity list recommendation method based on long-term and short-term interest preference - Google Patents

Commodity list recommendation method based on long-term and short-term interest preference

Info

Publication number
CN111932336A
CN111932336A
Authority
CN
China
Prior art keywords
term
user
interest
vector
long
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010693227.4A
Other languages
Chinese (zh)
Inventor
钟福金
陈良操
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202010693227.4A priority Critical patent/CN111932336A/en
Publication of CN111932336A publication Critical patent/CN111932336A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of personalized recommendation systems, and particularly relates to a commodity list recommendation method based on long-term and short-term interest preferences. A data set of commodities clicked by the user is input into an embedding layer, which embeds all input elements into fixed-size low-dimensional vectors. The interest extraction layer outputs the user's short-term interest preference by capturing the relation between the user's short-term interest vectors and the candidate commodities. The low-dimensional vectors of the user's long-term interests are input into an interaction layer, which searches for high-order features using a multi-head attention mechanism and outputs the user's long-term behavior preference. The short-term and long-term features extracted by the interest extraction layer and the interaction layer are then input into a fusion layer for fusion, and the fusion result is used as the recommended commodity list. Based on the user behavior information of an e-commerce platform, the invention can obtain the user's long-term and short-term preferences and recommend a suitable commodity list to the user.

Description

Commodity list recommendation method based on long-term and short-term interest preference
Technical Field
The invention belongs to the field of personalized recommendation systems, and particularly relates to a commodity list recommendation method based on long-term and short-term interest preferences.
Background
Large recommendation systems in industry need to accurately predict users' preferences and respond quickly to their current needs. A large e-commerce website with billions of commodities and users first retrieves a set of candidate commodities for the user and then applies a ranking module to generate the final recommendations. Currently, most matching models deployed on commercial websites are based on item-based Collaborative Filtering (CF). However, such models capture only static user-item interactions and do not model the dynamic transitions within the whole sequence of user actions, which typically results in generic recommendations. To accurately understand a user's interests and preferences, short-term session information should be incorporated into the matching module.
The dynamic evolution of user interests can be taken into account by introducing a deep sequential recommendation model in the matching phase instead of item-based CF. As people use online shopping services on e-commerce websites, their behavior accumulates over a relatively long time. Such a behavior sequence consists of sessions, where a session is a list of user actions that occur within a given time frame. A user typically has a specific shopping need within a session, and his or her interests may change dramatically when a new session starts. Directly modeling the sequence while ignoring this internal structure can compromise performance. We therefore refer to the user's latest interactive session as short-term behavior and to the earlier behaviors as long-term behavior. The two parts are modeled separately to encode their intrinsic information, which represents different levels of the user's interests. Our goal is to recall the top-N candidate items that match the user's behavior sequence.
For short-term session modeling, Recurrent Neural Network (RNN) based approaches have shown effective performance in session-based recommendation. Li et al. and Liu et al. further proposed attention models that emphasize, respectively, the main purpose of a short-term session and the effect of the final click, in order to avoid interest drift caused by the user's random actions. However, they ignore the user's points of interest across more than one session. We observe that customers pay attention to multiple aspects of goods, such as category, brand, color, style, and store reputation. A user repeatedly compares many items before making a final decision on a favorite item. Therefore, single-head attention over products cannot reflect the variety of interests that appear at different moments of a purchase. In contrast, multi-head attention, first proposed for the machine translation task, allows the model to jointly attend to different information at different positions. A multi-head structure can naturally address the multi-interest problem by representing preferences from different perspectives. We therefore propose a multi-interest module that leverages multi-head attention to enhance the RNN-based sequential recommender. Meanwhile, thanks to the self-attention function, the module can represent accurate user preferences by filtering out casual clicks.
The current decision is also influenced by the user's long-term general preferences. Intuitively, if the user is a football fan, he may view or click commodities related to football stars. When he later chooses to buy shoes, sports shoes of famous football stars will be more attractive to him than ordinary shoes. It is therefore important to consider long-term preferences together with short-term behavior. Ying et al. and Li et al. take the customer's long-term preferences into account through a simple combination with the current session. In practice, however, customers have diverse shopping needs, and their long-term behavior is also complex and diverse: football-related items may account for only a very small fraction of the long-term behavior, so the long-term preferences related to the current short-term session cannot be expressed significantly within the overall long-term behavior. Simply concatenating the long-term and short-term representations, or aggregating them with weighted attention, is therefore not an efficient fusion method. The information in the long-term vector that relates to the current short-term session should be retained.
Disclosure of Invention
Aiming at the problem that existing recommendation algorithms struggle to balance a user's long-term and short-term interest preferences, the invention provides a commodity list recommendation method based on long-term and short-term interest preferences, in which the commodities to be recommended are input into a network comprising an embedding layer, an interest extraction layer, an interaction layer and an interest fusion layer, and the commodity list recommended to the user is selected through the following steps:
s1, using the commodity list information of commodities clicked by the user in the short-term session as a data set, inputting the data set into the embedding layer, the embedding layer embedding all input elements into fixed-size low-dimensional vectors;
s2, the interest extraction layer outputting the user's short-term interest preference vector by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities;
s3, using the long-term historical data as a data set and inputting it into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
s4, inputting the low-dimensional vectors of the user's long-term interests into the interaction layer, which searches for high-order features using a multi-head attention mechanism and outputs the user's long-term behavior preference;
and S5, inputting the short-term and long-term features extracted by the interest extraction layer and the interaction layer respectively into the fusion layer for fusion, and using the fusion result as the recommended commodity list.
Further, the process of extracting the short-term interest features of the user by the interest extraction layer comprises the following steps:
sequentially encoding the sequence of commodities clicked by the user, denoting this operation as position encoding, and applying bias encoding on the basis of the position encoding;
updating the user's behavior session after the bias encoding is added;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the user's k-th session interest, which serves as the user's short-term interest feature.
Further, the user's interest in the k-th session is expressed as:
I_k = Avg(I_k^Q),  I_k^Q = Concat(head_1, ..., head_H) W^O
wherein I_k represents the user's interest in the k-th session; Avg is the average pooling operation; I_k^Q is the interest vector of the user's k-th session derived from Q; Concat denotes the concatenation of vectors; head_h denotes the h-th head vector inside the Transformer; W^O is the linear transformation matrix of the output.
Further, the h-th head vector head_h in the Transformer model is expressed as:
head_h = Attention(Q_kh W^Q, Q_k W^K, Q_k W^V) = softmax( Q_kh W^Q (Q_k W^K)^T / sqrt(d_model) ) Q_k W^V
wherein Q_kh is the h-th head of Q_k; Q_k is the Query matrix under the k-th session; W^Q is the linear transformation matrix of the Query vectors; W^K is the linear transformation matrix of the Key vectors; W^V is the linear transformation matrix of the Value vectors; d_model is the dimension of the model input.
Further, the process of extracting the long-term interest features of the user by the interaction layer comprises the following steps:
acquiring the correlation between feature m and feature k under a specific attention head h;
updating feature m in the subspace h by combining all relevant features guided by the coefficients α_{m,k}^(h);
and adding a standard residual connection in the network to obtain the final long-term interest features.
Further, the correlation between feature m and feature k under a particular attention head h is defined as:
α_{m,k}^(h) = exp( ψ^(h)(e_m, e_k) ) / Σ_l exp( ψ^(h)(e_m, e_l) )
ψ^(h)(e_m, e_k) = < W_Query^(h) e_m , W_Key^(h) e_k >
wherein α_{m,k}^(h) is the attention weight of features m and k under the h-th head; ψ^(h)(e_m, e_k) is the similarity of the embedding vectors of feature m and feature k, computed here as an inner product; W_Query^(h) is the linear transformation matrix of the Query vectors under the h-th head; W_Key^(h) is the linear transformation matrix of the Key vectors under the h-th head; e_m is the embedding vector of feature m; and the sum in the denominator runs over all features l.
Further, updating feature m in the subspace h by combining all relevant features guided by the coefficients α_{m,k}^(h) comprises:
ẽ_m^(h) = Σ_k α_{m,k}^(h) ( W_Value^(h) e_k )
ẽ_m = ẽ_m^(1) ⊕ ẽ_m^(2) ⊕ ... ⊕ ẽ_m^(H)
wherein ẽ_m^(h) is the updated embedding vector of feature m under the h-th head; W_Value^(h) is the linear transformation matrix of the Value vectors under the h-th head; e_k is the embedding vector of feature k; ẽ_m is the representation of feature m obtained by concatenating the vectors of all heads; ⊕ is the concatenation operator; and H is the total number of heads.
Further, the process by which the fusion layer fuses the long-term and short-term user interest features is expressed as follows: the user's long-term interest preference L_u is encoded from different feature scales, L_u comprising at least the item ID, leaf category, first-level category, shop and brand subsets; the different feature sets are modeled through an attention layer, and, since the user may have different degrees of preference for different shops and different categories, the user embedding is used as the query vector to calculate the attention scores of the user under the different feature sets;
and a gated neural network is constructed, which takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them inside the gated neural network, and finally outputs the recommendation list.
Further, the attention scores determine the contribution degrees of the long-term interest preference and the short-term interest preference when they are input into the gated neural network, the contribution degrees being calculated according to the following formulas:
α_k = exp( (g_k^u)^T e_u ) / Σ_j exp( (g_j^u)^T e_u )
z_f^u = Σ_k α_k g_k^u
z_u = Concat( { z_f^u | f ∈ F } )
p_u = tanh( W^p z_u + b )
wherein α_k is the attention weight of the k-th entry; z_f^u is the user's fused interest vector under feature f; e_u is the embedding vector of the user; g_k^u and g_j^u are the long-term session interest vectors of entries k and j under feature f; z_u is the concatenation of the fused interest vectors over all feature fields F and is fed into a fully connected layer; p_u is the user's long-term behavior vector; W^p is the linear transformation matrix applied to z_u; b is a bias term.
Further, the gated neural network comprises:
G_t^u = sigmoid( W^1 e_u + W^2 s_t^u + W^3 p_u + b )
o_t^u = (1 - G_t^u) ⊙ p_u + G_t^u ⊙ s_t^u
wherein G_t^u is the gate vector that controls the contributions of the long-term and short-term interests to the final commodity recommendation; W^1 is the linear transformation matrix applied to e_u; e_u is the user embedding vector; W^2 is the linear transformation matrix applied to s_t^u; s_t^u is the user's short-term behavior vector; W^3 is the linear transformation matrix applied to p_u; b is a bias term; o_t^u is the user behavior vector; ⊙ denotes element-wise multiplication; p_u is the user's long-term behavior vector.
The invention has the following beneficial technical effects:
(1) the invention belongs to the category of personalized recommendation systems and has the advantages of a low parameter count and high precision; it can perform dynamic, personalized commodity recommendation for users of an e-commerce platform.
(2) A novel short-term user interest preference extraction network architecture is provided, by which session-based user interest preferences can be more accurately and dynamically captured.
(3) An automatic high-order feature extraction network architecture for long-term interest preferences is provided. It removes the dependence on expert domain knowledge for constructing meaningful combinations of the original features, and it also avoids the problem that traditional low-order feature combination algorithms can only learn implicit feature combinations that lack interpretability.
(4) A gating network is provided, which fuses the long-term and short-term interest preferences so that the user's dynamic interests can be recommended more accurately.
Drawings
FIG. 1 is a schematic structural diagram of a long-short interest preference network according to the present invention;
FIG. 2 is a diagram illustrating the effect of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a commodity list recommendation method based on long-term and short-term interest preferences, in which the commodities to be recommended are input into a network comprising an embedding layer, an interest extraction layer, an interaction layer and an interest fusion layer, and the commodity list recommended to the user is selected through the following steps:
s1, using the commodity list information of commodities clicked by the user in the short-term session as a data set, inputting the data set into the embedding layer, the embedding layer embedding all input elements into fixed-size low-dimensional vectors;
s2, the interest extraction layer outputting the user's short-term interest preference vector by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities;
s3, using the long-term historical data as a data set and inputting it into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
s4, inputting the low-dimensional vectors of the user's long-term interests into the interaction layer, which searches for high-order features using a multi-head attention mechanism and outputs the user's long-term behavior preference;
and S5, inputting the short-term and long-term features extracted by the interest extraction layer and the interaction layer respectively into the fusion layer for fusion, and using the fusion result as the recommended commodity list.
In particular, a short-term session refers to a session of the user within a short time, and a long-term session refers to sessions over a relatively long time. Preferably, and unless otherwise specified in the embodiments of the present invention, a short-term session generally refers to the user's commodity click sequence within 15 minutes in one session, and a long-term session generally refers to the user's commodity click sequences within one week or more.
Example 1
The embodiment gives a specific embodiment of acquiring a data set and processing the data set.
As shown in FIG. 1, the data sets of clicked commodities, namely the long-term session data and the short-term session data, are respectively input into the embedding layer, which embeds all input elements into fixed-size low-dimensional vectors. The interest extraction layer outputs the user's short-term interest preference by capturing the relation between the user's short-term interest vectors and the candidate commodities. The low-dimensional vectors of the user's long-term interests are input into the interaction layer, which searches for high-order features using a multi-head attention mechanism and outputs the user's long-term behavior preference. The short-term and long-term features extracted by the interest extraction layer and the interaction layer are then input into the fusion layer for fusion, and the fusion result is used as the recommended commodity list.
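To make the data flow of FIG. 1 concrete, the following is a minimal numpy sketch of how the four layers could be wired together. It is an illustrative assumption rather than the patented implementation: every dimension and weight is made up for the example, and simple self-attention pooling stands in for the full extraction and interaction layers described below.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                        # embedding size (illustrative)

# Toy inputs: 5 short-term clicked-item embeddings, 4 long-term feature-field embeddings.
short_session = rng.normal(size=(5, d))      # output of the embedding layer for S1
long_features = rng.normal(size=(4, d))      # output of the embedding layer for S3
user_emb      = rng.normal(size=(d,))
candidate_emb = rng.normal(size=(10, d))     # 10 candidate commodities

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# S2: interest extraction layer (placeholder: one self-attention pass, pooled to a vector).
att = softmax(short_session @ short_session.T / np.sqrt(d))
s_u = (att @ short_session).mean(axis=0)     # short-term preference vector

# S4: interaction layer (placeholder: one attention pass over the feature fields).
att_l = softmax(long_features @ long_features.T / np.sqrt(d))
p_u = (att_l @ long_features).mean(axis=0)   # long-term preference vector

# S5: fusion layer (placeholder gate) and scoring of the candidate commodities.
gate = 1.0 / (1.0 + np.exp(-(user_emb + s_u + p_u)))   # sigmoid gate
o_u = (1.0 - gate) * p_u + gate * s_u                   # fused user representation
scores = candidate_emb @ o_u
print("recommended order:", np.argsort(-scores))
```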
In this embodiment, the data set is sampled from Taobao and includes randomly selected active users who interacted with at least 40 commodities during 8 consecutive days in December 2018. In addition, users whose interactions exceed 1000 items are filtered out, since such users are regarded as false users in this embodiment.
Of the historical interaction data collected in this example, the first 7 days are used for training and the 8th day is used for testing; commodities that appear fewer than 5 times in the interaction data set are filtered out.
The present embodiment adopts the following session generation rules (a sketch of this preprocessing is given after the list):
interactions with the same session ID belong to the same session;
adjacent interactions whose time gap is less than 10 minutes are merged into one session;
the maximum length of a session is set to 50, which means that a new session starts when the session length exceeds 50;
the latest session of user u is regarded as the short-term behavior, whose sequence length is denoted m;
the behaviors of user u that occur in the sessions of the preceding week are regarded as the long-term behavior;
taking both the short-term behavior and the long-term behavior of user u into consideration, the present embodiment recommends items for user u;
the maximum limit for each is 20 during training, and sessions with a length of less than 2 are deleted.
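A minimal sketch of the session-generation rules above, under the assumption that the raw log for one user is a time-sorted list of (timestamp, item_id) pairs; it covers the 10-minute merging rule, the 50-action cap and the split into the latest (short-term) versus earlier (long-term) sessions, while the session-ID rule and the remaining limits are left out for brevity.

```python
from datetime import datetime, timedelta

GAP = timedelta(minutes=10)   # adjacent interactions closer than this share a session
MAX_LEN = 50                  # a session is cut when it reaches this length

def split_sessions(events):
    """events: list of (timestamp, item_id) sorted by time -> list of sessions."""
    sessions, current = [], []
    for ts, item in events:
        if current and (ts - current[-1][0] > GAP or len(current) >= MAX_LEN):
            sessions.append(current)
            current = []
        current.append((ts, item))
    if current:
        sessions.append(current)
    return sessions

def short_and_long_term(sessions):
    """Latest session = short-term behavior; all earlier sessions = long-term behavior."""
    short_term = [item for _, item in sessions[-1]]
    long_term = [item for s in sessions[:-1] for _, item in s]
    return short_term, long_term

# toy log for one user
t0 = datetime(2018, 12, 1, 9, 0)
log = [(t0, 11), (t0 + timedelta(minutes=3), 12),
       (t0 + timedelta(hours=5), 13), (t0 + timedelta(hours=5, minutes=2), 14)]
print(short_and_long_term(split_sessions(log)))   # ([13, 14], [11, 12])
```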
In the testing phase, this embodiment selects about 1 million active users on day 8 for rapid evaluation. For each selected user, the first 25% of the short-term session on day 8 is fed into the model, and the remaining interactions are used as the ground truth. In addition, a customer may browse certain items many times within a day; since repeated recommendations are discouraged, such items are kept only once in the user's test data.
In particular, a short-term session as referred to in embodiments of the present invention generally refers to a user's click sequence of items within a short period of time, e.g., within 15 minutes, in a session, while a long-term session refers to a user's click sequence of items within a week or more.
Example 2
On the basis of embodiment 1, this embodiment provides a data processing method.
The non-serialized data, which comprises the user attribute information and the commodity attributes, is processed in the Embedding layer, where three different kinds of features are handled: single-value discrete features, multi-value discrete features and continuous features. For a single-value discrete feature, the corresponding embedding representation is obtained directly through an embedding lookup table and is expressed as e_i = V_i x_i, where V_i is the embedding matrix of field i. For discrete features, x_i is a one-hot or multi-hot vector whose non-zero entries equal 1; for continuous features, x_i is directly a scalar, and its value is multiplied by the corresponding embedding, i.e. e_m = v_m x_m, where e_m is the embedding vector of feature m, v_m is the embedding vector associated with the continuous field m, and x_m is the scalar value of feature m.
For a multi-value discrete feature, after the corresponding embeddings are obtained through the embedding lookup table, the embeddings of the same field are further averaged by avg-pooling, which is expressed as:
e_m = V_m x_m / q
wherein q is the number of values taken by the multi-value discrete feature, V_m is the embedding matrix of field m, and x_m is the multi-hot vector of feature m.
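The three feature types can be embedded as in the following numpy sketch, which mirrors the formulas above (e_i = V_i x_i for single-value discrete features, e_m = v_m x_m for continuous features, and the avg-pooling V_m x_m / q for multi-value features); the field names, vocabulary sizes and embedding dimension are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding dimension (illustrative)

# Embedding lookup tables (V_i) for two discrete fields and a weight vector (v_m)
# for one continuous field.
V_category = rng.normal(size=(10, d))   # single-value discrete: category id in [0, 10)
V_tags     = rng.normal(size=(20, d))   # multi-value discrete: tag ids in [0, 20)
v_price    = rng.normal(size=(d,))      # continuous: price

def embed_single(field_table, idx):
    # e_i = V_i x_i with x_i one-hot: a single row lookup.
    return field_table[idx]

def embed_multi(field_table, idx_list):
    # e_m = V_m x_m / q with x_m multi-hot: average of the q looked-up rows.
    return field_table[idx_list].mean(axis=0)

def embed_continuous(weight_vec, value):
    # e_m = v_m x_m with x_m a scalar.
    return weight_vec * value

e_cat   = embed_single(V_category, 3)
e_tags  = embed_multi(V_tags, [2, 5, 7])
e_price = embed_continuous(v_price, 19.9)
print(e_cat.shape, e_tags.shape, e_price.shape)   # (4,) (4,) (4,)
```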
Example 3
In this embodiment, the process of extracting the short-term interest features of the user by the interest extraction layer specifically includes:
sequentially encoding the sequence of commodities clicked by the user, denoting this operation as position encoding, and applying bias encoding on the basis of the position encoding;
updating the user's behavior session after the bias encoding is added;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the user's k-th session interest, which serves as the user's short-term interest feature.
The serialized data, i.e. the commodity interaction behaviors of a user within one session id, are then processed. In the session interest extraction layer, behaviors within the same session are closely related to each other; moreover, the user's random behaviors within a session may bias the session interest away from its original expression. To capture the internal relationships between behaviors in the same session and to reduce the impact of these unrelated behaviors, a multi-head self-attention mechanism is employed within each session, with some improvements made to the self-attention mechanism to better achieve our goals.
In order to exploit the sequential relationship of the sequences, the self-attention mechanism applies position encoding to the input embeddings. In addition, the sequential relationship among sessions, as well as the bias that exists in the different representation subspaces, needs to be captured. Therefore, a bias encoding BE is applied on the basis of the position encoding, where each element of BE is defined as:
BE_(k,t,c) = w_k^K + w_t^T + w_c^C
wherein w^K is the bias of the sessions, with the subscript k indexing the session; w^T is the bias of the positions within a session, with the subscript t indexing the behavior in the session; w^C is the bias of the unit positions within a behavior embedding, with the subscript c indexing the unit in the behavior embedding. After the bias encoding is added, the user's behavior session Q is updated as:
Q = Q + BE
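A small numpy sketch of the bias encoding, assuming K sessions of T behaviors with embeddings of size d_model; BE[k, t, c] = w^K_k + w^T_t + w^C_c is formed by broadcasting and then added to Q as in Q = Q + BE. The shapes and random initialisation are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T, d_model = 3, 5, 8                # sessions, behaviors per session, embedding size

Q = rng.normal(size=(K, T, d_model))   # session behavior embeddings

# Learnable bias parameters: one per session index, per position, per embedding unit.
w_K = rng.normal(size=(K, 1, 1))
w_T = rng.normal(size=(1, T, 1))
w_C = rng.normal(size=(1, 1, d_model))

BE = w_K + w_T + w_C                   # BE[k, t, c] = w_K[k] + w_T[t] + w_C[c] by broadcasting
Q = Q + BE                             # bias-encoded sessions fed to the self-attention blocks
print(Q.shape)                         # (3, 5, 8)
```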
the multi-point attention mechanism, in the recommendation system, the user's click behavior is influenced by various factors (e.g., color, style, and price). Multi-headed self-attention may capture relationships in different representation subspaces. Mathematically let Qk=[Qk1;....;Qkh;....QkH]Wherein
Figure BDA0002590091980000096
Is QkH is the sum of the number of heads
Figure BDA0002590091980000097
headhThe output of (c) is calculated as follows:
Figure BDA0002590091980000101
wherein Q iskhIs QkThe h head vector of (1); qkIs a Query vector; wQOutputting a matrix for linear transformation of the Query vector; wKOutputting a matrix for linear transformation of the Key vector; wVOutputting a matrix for linear transformation of Value vectors; dmodelA matrix is input for the model.
The vectors of the different heads are then concatenated and fed into a feed-forward network:
I_k^Q = FFN( Concat(head_1, ..., head_H) W^O )
wherein FFN(·) is the feed-forward network and W^O is a linear transformation matrix.
After residual connection and layer normalization, the user's k-th session interest I_k is calculated as:
I_k = Avg( I_k^Q )
wherein Avg is the average pooling operation. In particular, the weights of the self-attention mechanism are shared across the different sessions.
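The multi-head self-attention, feed-forward network and average pooling for one session can be sketched in numpy as below; head splitting along the embedding dimension and scaling by the per-head dimension are standard choices assumed here, and all sizes and weights are illustrative stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_model, H = 6, 8, 2                 # behaviors in the session, embedding size, heads
d_h = d_model // H

Q = rng.normal(size=(T, d_model))       # bias-encoded behaviors of session k

# Projection matrices W^Q, W^K, W^V, W^O and feed-forward weights (random stand-ins).
W_q, W_k, W_v, W_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
W_ffn1 = rng.normal(size=(d_model, 4 * d_model)) * 0.1
W_ffn2 = rng.normal(size=(4 * d_model, d_model)) * 0.1

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(x):                     # (T, d_model) -> (H, T, d_h)
    return x.reshape(T, H, d_h).transpose(1, 0, 2)

q, k, v = split_heads(Q @ W_q), split_heads(Q @ W_k), split_heads(Q @ W_v)
att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_h))     # (H, T, T) attention weights
heads = (att @ v).transpose(1, 0, 2).reshape(T, d_model)    # Concat(head_1, ..., head_H)

x = heads @ W_o                                  # output projection W^O
x = np.maximum(0.0, x @ W_ffn1) @ W_ffn2 + x     # feed-forward network + residual
I_k = x.mean(axis=0)                             # average pooling -> session interest I_k
print(I_k.shape)                                 # (8,)
```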
Example 4
The embodiment provides a process for extracting long-term interest features of a user by an interaction layer, which specifically comprises the following steps:
acquiring the correlation between feature m and feature k under a specific attention head h;
updating feature m in the subspace h by combining all relevant features guided by the coefficients α_{m,k}^(h);
and adding a standard residual connection in the network to obtain the final long-term interest features.
The interaction layer is the encoder part of a Transformer, and multiple such layers are stacked to learn high-order combinations between features. The invention defines the correlation between feature m and feature k under a particular attention head h, which can be expressed as:
α_{m,k}^(h) = exp( ψ^(h)(e_m, e_k) ) / Σ_l exp( ψ^(h)(e_m, e_l) )
ψ^(h)(e_m, e_k) = < W_Query^(h) e_m , W_Key^(h) e_k >
wherein α_{m,k}^(h) is the attention weight of features m and k under the h-th head; ψ^(h)(e_m, e_k) is the similarity of the embedding vectors of feature m and feature k, computed here as an inner product; W_Query^(h) is the linear transformation matrix of the Query vectors under the h-th head and W_Key^(h) is the linear transformation matrix of the Key vectors under the h-th head; both are transformation matrices that map the original embedding space R^d into a new space R^d'.
The representation of feature m in the subspace h is then updated by combining all relevant features guided by the coefficients α_{m,k}^(h):
ẽ_m^(h) = Σ_k α_{m,k}^(h) ( W_Value^(h) e_k )
ẽ_m = ẽ_m^(1) ⊕ ẽ_m^(2) ⊕ ... ⊕ ẽ_m^(H)
wherein ẽ_m^(h) is the updated embedding vector of feature m under the h-th head; W_Value^(h) is the linear transformation matrix of the Value vectors under the h-th head; ẽ_m is the overall representation of feature m obtained by concatenating the vectors of all heads; ⊕ is the concatenation operator; and H is the total number of heads.
In order to preserve the previously learned combined features, including the original individual features, a standard residual connection is added in the network:
e_m^Res = ReLU( ẽ_m + W_Res e_m )
wherein e_m^Res is the activated output of the embedding vector of feature m in the interest interaction layer; W_Res is the linear transformation matrix of feature m in the interest interaction layer; and ReLU(z) = max(0, z) is the non-linear activation function. With such an interaction layer, each feature e_m is updated to a new representation e_m^Res that encodes higher-order feature combinations. Multiple such layers can be stacked, with the output of the previous interaction layer used as the input of the next one; in this way, combined features of arbitrary order can be modeled.
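An illustrative numpy sketch of one interaction layer over the long-term feature embeddings, following the formulas above (per-head attention weights α, aggregation of the Value projections, concatenation over heads, residual connection and ReLU); the number of features, heads and projection sizes are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, d, H, d_p = 5, 8, 2, 4        # features, embedding dim, heads, per-head projection dim

E = rng.normal(size=(M, d))      # long-term feature embeddings e_1 .. e_M

W_query = rng.normal(size=(H, d, d_p)) * 0.1
W_key   = rng.normal(size=(H, d, d_p)) * 0.1
W_value = rng.normal(size=(H, d, d_p)) * 0.1
W_res   = rng.normal(size=(d, H * d_p)) * 0.1   # residual projection W_Res

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

heads = []
for h in range(H):
    q, k, v = E @ W_query[h], E @ W_key[h], E @ W_value[h]   # (M, d_p) each
    alpha = softmax(q @ k.T, axis=-1)     # alpha[m, k]: attention of feature m on feature k
    heads.append(alpha @ v)               # updated feature vectors under head h
E_tilde = np.concatenate(heads, axis=-1)  # concatenation over heads, shape (M, H*d_p)

E_res = np.maximum(0.0, E_tilde + E @ W_res)   # ReLU(e~_m + W_Res e_m): higher-order features
print(E_res.shape)                              # (5, 8)
```

Stacking several such layers, each consuming the previous layer's output, is what lets the combined features grow to higher orders, as described above.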
Example 5
The short-term interest features of the user obtained in Embodiment 3 and the long-term interest features of the user obtained in Embodiment 4 are used as the input of the interest fusion layer for feature fusion, which specifically comprises the following steps:
encoding the user's long-term interest preference L_u from different feature scales, where L_u comprises at least the item ID, leaf category, first-level category, shop and brand subsets;
and constructing a gated neural network that takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them inside the gated network, and finally outputs the recommendation list.
The long-term behavior L_u is encoded from different feature scales. L_u consists of several subsets: the item ID subset, the leaf category subset, the first-level category subset, the shop subset and the brand subset. The attention scores are calculated using the user profile embedding as the query vector, and the resulting representation is obtained as:
α_k = exp( (g_k^u)^T e_u ) / Σ_j exp( (g_j^u)^T e_u )
z_f^u = Σ_k α_k g_k^u
z_u = Concat( { z_f^u | f ∈ F } )
p_u = tanh( W^p z_u + b )
wherein α_k is the attention weight of the k-th entry; z_f^u is the user's fused interest vector under feature f; e_u is the embedding vector of the user; g_k^u and g_j^u are the long-term session interest vectors of entries k and j under feature f; z_u is the concatenation of the fused interest vectors over all feature fields F and is fed into a fully connected layer; p_u is the user's long-term behavior vector; W^p is the linear transformation matrix applied to z_u; b is a bias term.
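The attention pooling over each long-term feature subset and the fully connected fusion into p_u can be sketched as follows; the subset names, sizes and weights are toy assumptions standing in for the learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
e_u = rng.normal(size=(d,))                 # user embedding, used as the query vector

# Long-term behavior split into feature subsets (item ID, leaf category, shop, ...),
# each a matrix of entry embeddings g_k^u.
subsets = {"item_id": rng.normal(size=(6, d)),
           "leaf_cate": rng.normal(size=(4, d)),
           "shop": rng.normal(size=(3, d))}

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

z_parts = []
for name, G in subsets.items():
    alpha = softmax(G @ e_u)                # alpha_k proportional to exp(g_k^u . e_u)
    z_parts.append(alpha @ G)               # z_f^u = sum_k alpha_k g_k^u
z_u = np.concatenate(z_parts)               # concatenation over the feature fields F

W_p = rng.normal(size=(d, z_u.shape[0])) * 0.1
b = np.zeros(d)
p_u = np.tanh(W_p @ z_u + b)                # long-term behavior vector p_u
print(p_u.shape)                            # (8,)
```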
To incorporate the short-term behavior, this embodiment designs a gated neural network that takes the long-term session interest preference and the short-term session interest preference as input. The gate vector G_t^u determines, through a sigmoid, the contribution percentages of the short-term and long-term interests at time t:
G_t^u = sigmoid( W^1 e_u + W^2 s_t^u + W^3 p_u + b )
o_t^u = (1 - G_t^u) ⊙ p_u + G_t^u ⊙ s_t^u
After the user interest representation o_t^u is obtained, the user's next interacted item is taken from the log as the positive example and K-1 negative items are sampled. The inner products between the user interest representation and the embeddings of these K items are computed as the score of each item. Finally the loss is calculated through softmax and cross entropy, and the model is trained:
ŷ = softmax(z)
L = − Σ_{i∈K} y_i log ŷ_i
wherein ŷ is the predicted probability distribution over the sampled items; z is the vector of item scores; L is the cross-entropy loss function; K is the set of sampled items; y_i is the true label of item i; ŷ_i is the predicted probability of item i.
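The gate fusion and the sampled softmax cross-entropy used for training can be sketched as below; s_u, p_u, the sampled item embeddings and all weights are random placeholders, and index 0 plays the role of the positive (next-clicked) item.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 8, 5                                  # embedding size, 1 positive + K-1 negatives

e_u = rng.normal(size=(d,))                  # user embedding
s_u = rng.normal(size=(d,))                  # short-term behavior vector (interest extractor)
p_u = rng.normal(size=(d,))                  # long-term behavior vector (interaction layer)

W1, W2, W3 = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
b = np.zeros(d)

gate = 1.0 / (1.0 + np.exp(-(W1 @ e_u + W2 @ s_u + W3 @ p_u + b)))   # gate vector G_t^u
o_u = (1.0 - gate) * p_u + gate * s_u        # fused user behavior vector o_t^u

items = rng.normal(size=(K, d))              # embeddings of the K sampled items
z = items @ o_u                              # inner-product scores
y_hat = np.exp(z - z.max()); y_hat /= y_hat.sum()   # softmax over the K items

y = np.zeros(K); y[0] = 1.0                  # item 0 is the ground-truth next click
loss = -np.sum(y * np.log(y_hat))            # cross-entropy loss
print(float(loss))
```

Sampling only K-1 negatives keeps the softmax tractable compared with normalising over the full item corpus, which is presumably why the sampled cross-entropy is used here.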
As shown in FIG. 2, in the real data set the user's short-term session S_u includes a red wine glass and a champion wine glass. The network model of this patent directly recommends the champion wine glass, which is related to the last click in the short-term session and indicates that the user is currently more likely to be interested in it. Meanwhile, the gating network module of the model captures, from the user's long-term session L_u, the items most relevant to red wine, even though L_u contains many unrelated clicks such as beer, paring knives and small dishes, and, combined with the red wine glass in the short-term session, produces the recommended item, a red wine decanter. This case illustrates the gate module of the network model of this patent and shows its effectiveness and accurate convergence.
The invention provides a commodity list recommendation method based on long-term and short-term interest preferences; by modeling the serialized data and the non-serialized data separately and finally generating the recommended commodity list through a gate fusion mechanism, better recommendation results are obtained.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A commodity list recommendation method based on long-term and short-term interest preferences, characterized in that commodities to be recommended are input into a network comprising an embedding layer, an interest extraction layer, an interaction layer and an interest fusion layer, and the commodity list recommended to the user is selected through the following steps:
s1, using the commodity list information of commodities clicked by the user in the short-term session as a data set, inputting the data set into the embedding layer, the embedding layer embedding all input elements into fixed-size low-dimensional vectors;
s2, the interest extraction layer outputting the user's short-term interest preference by capturing the relation between the user's short-term interest low-dimensional vectors and the candidate commodities;
s3, using the long-term historical data as a data set and inputting it into the embedding layer to obtain fixed-size low-dimensional vectors of the long-term historical data;
s4, inputting the low-dimensional vectors of the user's long-term interests into the interaction layer, which searches for high-order features using a multi-head attention mechanism and outputs the user's long-term behavior preference;
and S5, inputting the short-term and long-term features extracted by the interest extraction layer and the interaction layer respectively into the fusion layer for fusion, and using the fusion result as the recommended commodity list.
2. The method as claimed in claim 1, wherein the process of extracting the short-term interest features of the user by the interest extraction layer comprises:
sequentially encoding the sequence of commodities clicked by the user, denoting this operation as position encoding, and applying bias encoding on the basis of the position encoding;
updating the user's behavior session after the bias encoding is added;
and constructing a feed-forward network based on a multi-head attention mechanism and calculating the user's k-th session interest, which serves as the user's short-term interest feature.
3. The method as claimed in claim 2, wherein the user's interest in the k-th session is expressed as:
I_k = Avg( I_k^Q ),  I_k^Q = Concat(head_1, ..., head_H) W^O
wherein I_k represents the user's interest in the k-th session; Avg is the average pooling operation; I_k^Q is the interest vector of the user's k-th session derived from Q; Concat denotes the concatenation of vectors; head_h denotes the h-th head vector in the Transformer model; W^O is the linear transformation matrix of the output.
4. The method as claimed in claim 3, wherein the h-th head vector head_h in the Transformer model is expressed as:
head_h = Attention(Q_kh W^Q, Q_k W^K, Q_k W^V) = softmax( Q_kh W^Q (Q_k W^K)^T / sqrt(d_model) ) Q_k W^V
wherein Q_kh is the h-th head of Q_k; Q_k is the Query matrix under the k-th session; W^Q is the linear transformation matrix of the Query vectors; W^K is the linear transformation matrix of the Key vectors; W^V is the linear transformation matrix of the Value vectors; d_model is the dimension of the model input.
5. The method as claimed in claim 1, wherein the process of extracting the long-term interest features of the user by the interaction layer comprises:
acquiring the correlation between feature m and feature k under a specific attention head h;
updating feature m in the subspace h by combining all relevant features guided by the coefficients α_{m,k}^(h);
and adding a standard residual connection in the network to obtain the final long-term interest features.
6. The method as claimed in claim 5, wherein the correlation between feature m and feature k under a particular attention head h is defined as follows:
α_{m,k}^(h) = exp( ψ^(h)(e_m, e_k) ) / Σ_l exp( ψ^(h)(e_m, e_l) )
ψ^(h)(e_m, e_k) = < W_Query^(h) e_m , W_Key^(h) e_k >
wherein α_{m,k}^(h) is the attention weight of features m and k under the h-th head; ψ^(h)(e_m, e_k) is the similarity of the embedding vectors of feature m and feature k, computed here as an inner product; W_Query^(h) is the linear transformation matrix of the Query vectors under the h-th head; W_Key^(h) is the linear transformation matrix of the Key vectors under the h-th head; e_m is the embedding vector of feature m; and the sum in the denominator runs over all features l.
7. The method of claim 5, wherein updating feature m in the subspace h by combining all relevant features guided by the coefficients α_{m,k}^(h) comprises:
ẽ_m^(h) = Σ_k α_{m,k}^(h) ( W_Value^(h) e_k )
ẽ_m = ẽ_m^(1) ⊕ ẽ_m^(2) ⊕ ... ⊕ ẽ_m^(H)
wherein ẽ_m^(h) is the updated embedding vector of feature m under the h-th head; W_Value^(h) is the linear transformation matrix of the Value vectors under the h-th head; e_k is the embedding vector of feature k; ẽ_m is the representation of feature m obtained by concatenating the vectors of all heads; ⊕ is the concatenation operator; H is the total number of heads.
8. The method for recommending a commodity list based on the long-term and short-term interest preferences as claimed in claim 1, wherein the process by which the fusion layer fuses the long-term and short-term user interest features is expressed as follows:
the user's long-term interest preference L_u is encoded from different feature scales, L_u comprising at least the item ID, leaf category, first-level category, shop and brand subsets; the different feature sets are modeled through an attention layer, and, since the user may have different degrees of preference for different shops and different categories, the user embedding is used as the query vector to calculate the attention scores of the user under the different feature sets;
and a gated neural network is constructed, which takes the long-term session interest preference vector and the short-term session interest preference vector as input, weights them inside the gated neural network, and finally outputs the recommendation list.
9. The method of claim 8, wherein the attention scores of claim 8 are used as the contribution degrees of the long-term interest preference and the short-term interest preference when they are input into the gated neural network, the contribution degrees being calculated according to the following formulas:
α_k = exp( (g_k^u)^T e_u ) / Σ_j exp( (g_j^u)^T e_u )
z_f^u = Σ_k α_k g_k^u
z_u = Concat( { z_f^u | f ∈ F } )
p_u = tanh( W^p z_u + b )
wherein α_k is the attention weight of the k-th entry; z_f^u is the user's fused interest vector under feature f; e_u is the embedding vector of the user; g_k^u and g_j^u are the long-term session interest vectors of entries k and j under feature f; z_u is the concatenation of the fused interest vectors over all feature fields F and is fed into a fully connected layer; p_u is the user's long-term behavior vector; W^p is the linear transformation matrix applied to z_u; b is a bias term.
10. The method of claim 8, wherein the gated neural network comprises:
G_t^u = sigmoid( W^1 e_u + W^2 s_t^u + W^3 p_u + b )
o_t^u = (1 - G_t^u) ⊙ p_u + G_t^u ⊙ s_t^u
wherein G_t^u is the gate vector that controls the contributions of the long-term and short-term interests to the final commodity recommendation; W^1 is the linear transformation matrix applied to e_u; e_u is the user embedding vector; W^2 is the linear transformation matrix applied to s_t^u; s_t^u is the user's short-term behavior vector; W^3 is the linear transformation matrix applied to p_u; b is a bias term; o_t^u is the user behavior vector; ⊙ denotes element-wise multiplication; p_u is the user's long-term behavior vector.
CN202010693227.4A 2020-07-17 2020-07-17 Commodity list recommendation method based on long-term and short-term interest preference Pending CN111932336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010693227.4A CN111932336A (en) 2020-07-17 2020-07-17 Commodity list recommendation method based on long-term and short-term interest preference

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010693227.4A CN111932336A (en) 2020-07-17 2020-07-17 Commodity list recommendation method based on long-term and short-term interest preference

Publications (1)

Publication Number Publication Date
CN111932336A true CN111932336A (en) 2020-11-13

Family

ID=73313786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010693227.4A Pending CN111932336A (en) 2020-07-17 2020-07-17 Commodity list recommendation method based on long-term and short-term interest preference

Country Status (1)

Country Link
CN (1) CN111932336A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807156A (en) * 2019-10-23 2020-02-18 山东师范大学 Interest recommendation method and system based on user sequence click behaviors
CN110929164A (en) * 2019-12-09 2020-03-27 北京交通大学 Interest point recommendation method based on user dynamic preference and attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FUYU LV ET AL.: "SDM: Sequential Deep Matching Model for Online Large-scale Recommender System", 《PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 *
WEIPING SONG ET AL.: "AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks", 《PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT》 *
YUFEI FENG ET AL.: "Deep Session Interest Network for Click-Through Rate Prediction", 《HTTPS://DOI.ORG/10.48550/ARXIV.1905.06482》 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559905A (en) * 2020-12-24 2021-03-26 北京理工大学 Conversation recommendation method based on dual-mode attention mechanism and social similarity
CN112559905B (en) * 2020-12-24 2022-09-06 北京理工大学 Conversation recommendation method based on dual-mode attention mechanism and social similarity
CN112818224A (en) * 2021-01-26 2021-05-18 北京百度网讯科技有限公司 Information recommendation method and device, electronic equipment and readable storage medium
CN112818224B (en) * 2021-01-26 2024-02-20 北京百度网讯科技有限公司 Information recommendation method and device, electronic equipment and readable storage medium
CN112905887A (en) * 2021-02-22 2021-06-04 中国计量大学 Conversation recommendation method based on multi-interest short-term priority model
CN112948716A (en) * 2021-03-05 2021-06-11 桂林电子科技大学 Continuous interest point package recommendation method based on multi-head attention mechanism
CN112948716B (en) * 2021-03-05 2023-02-28 桂林电子科技大学 Continuous interest point package recommendation method based on multi-head attention mechanism
CN112948683A (en) * 2021-03-16 2021-06-11 山西大学 Socialized recommendation method with dynamic fusion of social information
CN112950325A (en) * 2021-03-16 2021-06-11 山西大学 Social behavior fused self-attention sequence recommendation method
CN112950325B (en) * 2021-03-16 2023-10-03 山西大学 Self-attention sequence recommendation method for social behavior fusion
CN112862007A (en) * 2021-03-29 2021-05-28 山东大学 Commodity sequence recommendation method and system based on user interest editing
CN112862007B (en) * 2021-03-29 2022-12-13 山东大学 Commodity sequence recommendation method and system based on user interest editing
CN113407819B (en) * 2021-05-20 2022-06-17 桂林电子科技大学 Sequence recommendation method, system and storage medium based on residual error network
CN113407819A (en) * 2021-05-20 2021-09-17 桂林电子科技大学 Sequence recommendation method, system and storage medium based on residual error network
CN113505215A (en) * 2021-06-30 2021-10-15 北京明略软件系统有限公司 Product recommendation method and device, electronic equipment and storage medium
CN113722599A (en) * 2021-09-06 2021-11-30 中国计量大学 Conversation recommendation method based on user long-term interest and short-term interest modeling
CN113868542B (en) * 2021-11-25 2022-03-11 平安科技(深圳)有限公司 Attention model-based push data acquisition method, device, equipment and medium
CN113868542A (en) * 2021-11-25 2021-12-31 平安科技(深圳)有限公司 Attention model-based push data acquisition method, device, equipment and medium
WO2023108324A1 (en) * 2021-12-13 2023-06-22 中国科学院深圳先进技术研究院 Comparative learning enhanced two-stream model recommendation system and algorithm
CN115099886A (en) * 2022-05-25 2022-09-23 华南理工大学 Long and short interest sequence recommendation method and device and storage medium
CN115099886B (en) * 2022-05-25 2024-04-19 华南理工大学 Long-short interest sequence recommendation method, device and storage medium
CN116562992A (en) * 2023-07-11 2023-08-08 数据空间研究院 Method, device and medium for recommending items for modeling uncertainty of new interests of user
CN116562992B (en) * 2023-07-11 2023-09-29 数据空间研究院 Method, device and medium for recommending items for modeling uncertainty of new interests of user
CN116595157A (en) * 2023-07-17 2023-08-15 江西财经大学 Dynamic interest transfer type session recommendation method and system based on user intention fusion
CN116595157B (en) * 2023-07-17 2023-09-19 江西财经大学 Dynamic interest transfer type session recommendation method and system based on user intention fusion
CN117455629A (en) * 2023-11-24 2024-01-26 美服数字科技(广州)有限公司 Live broadcast and cargo carrying intelligent pushing method and system

Similar Documents

Publication Publication Date Title
CN111932336A (en) Commodity list recommendation method based on long-term and short-term interest preference
CN109299396B (en) Convolutional neural network collaborative filtering recommendation method and system fusing attention model
CN111523047B (en) Multi-relation collaborative filtering algorithm based on graph neural network
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN111310063B (en) Neural network-based article recommendation method for memory perception gated factorization machine
CN109785062B (en) Hybrid neural network recommendation system based on collaborative filtering model
CN112364976B (en) User preference prediction method based on session recommendation system
CN110781409B (en) Article recommendation method based on collaborative filtering
CN109087178A (en) Method of Commodity Recommendation and device
CN110930219B (en) Personalized merchant recommendation method based on multi-feature fusion
WO2003088107A2 (en) Determination of attributes based on product descriptions
CN112712418B (en) Method and device for determining recommended commodity information, storage medium and electronic equipment
CN111127146A (en) Information recommendation method and system based on convolutional neural network and noise reduction self-encoder
CN106157156A (en) A kind of cooperation recommending system based on communities of users
CN114861050A (en) Feature fusion recommendation method and system based on neural network
CN116228368A (en) Advertisement click rate prediction method based on deep multi-behavior network
CN116932896A (en) Attention mechanism-based multimode fusion personalized recommendation architecture
CN113704439B (en) Conversation recommendation method based on multi-source information heteromorphic graph
CN115293812A (en) E-commerce platform session perception recommendation prediction method based on long-term and short-term interests
CN114519600A (en) Graph neural network CTR estimation algorithm fusing adjacent node variances
CN115525819A (en) Cross-domain recommendation method for information cocoon room
Ahsan et al. Complementary Recommendations Using Deep Multi-modal Embeddings For Online Retail
CN110956528A (en) Recommendation method and system for e-commerce platform
Sharma et al. Recommendation system for movies using improved version of som with hybrid filtering methods
Sharma et al. A ResNet-101 Based Recommendation System for E-Commerce

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201113

RJ01 Rejection of invention patent application after publication