CN115293812A - E-commerce platform session perception recommendation prediction method based on long-term and short-term interests - Google Patents


Publication number
CN115293812A
Authority
CN
China
Prior art keywords
user
term
long
item
short
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210967561.3A
Other languages
Chinese (zh)
Inventor
肖云鹏
黄于洋
李暾
王蓉
贾朝龙
陶禹冲
朱宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202210967561.3A
Publication of CN115293812A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Abstract

The invention belongs to the technical field of Internet applications, and in particular relates to a session-aware recommendation prediction method for an e-commerce platform based on long-term and short-term interests. The method comprises: acquiring online data, the data comprising basic user information, basic item information, and session sequences of user behavior; extracting user behaviors and user preferences from the acquired online data and constructing a long-term interest set for the user; obtaining the user's short-term interest at the current stage from the user's long-term interest set through interest matching; and constructing a prediction model that takes as input the user's click sequence within a session (the item input) and the user's short-term interest at the current stage, and that outputs the predicted items for recommendation. The method can effectively mine the user's long-term and short-term interest information from the user behavior sequence, express the user's interest preferences more accurately, and improve the recommendation accuracy of the e-commerce platform.

Description

E-commerce platform session perception recommendation prediction method based on long-term and short-term interests
Technical Field
The invention belongs to the technical field of Internet applications, and in particular relates to a session-aware recommendation prediction method for an e-commerce platform based on long-term and short-term interests.
Background
In recent years, with the rapid growth of artificial intelligence technology, it is the "intelligence" factor that has made recommendations more interesting and useful. Intelligence is the core of personalization: it can learn a user's preferences, predict preferences the user is not yet aware of, and ultimately provide recommendations that go beyond simple search by matching queries to content. Recommender system research combines a variety of Artificial Intelligence (AI) techniques, including machine learning, data mining, user modeling, and case-based reasoning. The idea of an intelligent system that can think and learn like humans has led to a more human-like family of techniques called Computational Intelligence (CI). CI is a branch of AI that explores adaptive mechanisms for intelligent operation in complex and changing environments. Such intelligent recommendations may draw on a variety of factors, including the user's digital habits and the history, preferences, interests, and behaviors of similar users. Recommender systems have rapidly become one of the most important sources of traffic for modern e-commerce websites and for any website with a large amount of content and many users. In short, a recommender system is a complex filtering system that predicts consumer preferences in a digital environment.
When recommender systems were first devised, they relied on explicit similarities between people and products, which are easy to discover. Later systems used matrix factorization to examine similarities among latent attributes. Briefly, all attributes of an item or a customer are combined in a way that reveals relationships that have not yet been observed. This approach is nevertheless quite limited, and the advent of artificial intelligence allows recommender systems to discover more latent attributes and hidden relationships.
Although many researchers have studied session-aware recommendation models, which reduce the information loss that existing recommendation models incur by ignoring short-term transactions, several challenges remain:
1. The interests of the user evolve dynamically over time. User preferences for items on e-commerce platforms are typically dynamic rather than static, and may change abruptly or gradually over time.
2. The user's selection of an item depends not only on long-established preferences but also on recent short-term preferences. Both the long-term and the short-term interests of the user are important, and how to distinguish and exploit the two at the same time is an open problem.
3. Different interests have different impacts. Treating all user interests uniformly and with equal weight makes the model unrealistic and reduces the accuracy of the recommendation results.
Disclosure of Invention
To solve the above problems, the application provides a session-aware recommendation prediction method for an e-commerce platform based on long-term and short-term interests, which specifically comprises the following steps:
acquiring online data comprising basic user information, basic item information, and session sequences of user behavior;
extracting user behaviors and user preferences from the acquired online data, and constructing a long-term interest set for the user;
obtaining the user's short-term interest at the current stage from the user's long-term interest set through interest matching;
and constructing a prediction model that takes as input the user's click sequence within a session (the item input) and the user's short-term interest at the current stage, and that outputs the predicted items for recommendation.
Further, constructing the user's long-term interest set includes: obtaining a historical item matrix from the session sequences of historical user behavior; performing convolution on the historical item matrix using convolution kernels of different scales, each kernel yielding one preference feature map of the user; concatenating all preference feature maps and converting them into the user's long-term interest through a fully connected layer; and forming the long-term interest set from the long-term interests obtained from all sessions of the user.
Further, when the convolution operation is performed on each historical item matrix using τ convolution kernels, the set of convolution kernels is expressed as Ω = {ω_1, ω_2, …, ω_τ}, and the long-term interest of the user obtained in the i-th session is expressed as:
Figure BDA0003795294370000021
where ReLU(·) denotes the ReLU activation function; concat(·) denotes the concatenation operation; H is the historical item matrix, expressed as H = [v_1, v_2, …, v_i, …]^T; v_i denotes the interacted-item vector mapped from the i-th session; W_l is the weight matrix of the fully connected layer; and b is the bias term of the fully connected layer.
Further, each value in the historical item matrix is forced to 0 with a probability expressed as:
Figure BDA0003795294370000031
where O denotes the probability of setting a given value in the historical item matrix to 0; U_u is the user vector representation; V_v is the item vector representation;
Figure BDA0003795294370000032
denotes the user vector representation learned from the user's interactions with items; and
Figure BDA0003795294370000033
denotes the item vector representation learned from the user's interactions with items.
Further, while the user and item vector representations are being learned from the user's interactions with items, the loss function is gradually reduced until it reaches a set threshold. The loss function is expressed as:
Figure BDA0003795294370000034
where l is the loss function; y ∈ {0, 1}, where y = 1 indicates that the user has interacted with the current item and y = 0 indicates that the user has not interacted with the current item;
Figure BDA0003795294370000035
denotes the learned item vector representation for an item on which the user's dwell time exceeds the average dwell time; and
Figure BDA0003795294370000036
denotes the learned item vector representation for an item on which the user's dwell time does not exceed the average dwell time.
Further, the user's short-term interest at the current stage is obtained from the user's long-term interest set through interest matching, and is expressed as:
Figure BDA0003795294370000037
where
Figure BDA0003795294370000038
is the short-term interest at the n-th stage of the s-th session (a user may interact with multiple items in one session, and each interaction with an item within the session is defined as one stage);
Figure BDA0003795294370000039
denotes the long-term interest obtained in the i-th session; m denotes the number of elements in the long-term interest set; N_s is the number of stages; and
Figure BDA00037952943700000310
denotes the weight of the short-term interest at the n-th stage of the s-th session.
Further, an attention mechanism is used to calculate the influence of the vector matched from the user's long-term interest set,
Figure BDA00037952943700000311
on
Figure BDA00037952943700000312
where the vector matched from the user's long-term interest set,
Figure BDA00037952943700000313
is expressed as:
Figure BDA00037952943700000314
where
Figure BDA0003795294370000041
denotes the difference between an embedding vector of the user's long-term interest set and the session sequence of the current user behavior,
Figure BDA0003795294370000042
projected into the embedding space;
Figure BDA0003795294370000043
denotes the long-term interest obtained in the i-th session;
Figure BDA0003795294370000044
denotes the quantized
Figure BDA0003795294370000045
<·,·> denotes the inner product; w(·) denotes a weight function;
Figure BDA0003795294370000046
denotes taking the average of
Figure BDA0003795294370000047
over the set Q; and Q denotes all preceding behaviors of the user within the current session.
Further, the prediction model is built on a gated recurrent unit (GRU). The output of the GRU is passed through a softmax layer to predict the probability of each item being clicked, and items are recommended to the user in descending order of probability. The process is expressed as:
Figure BDA0003795294370000048
Figure BDA0003795294370000049
Figure BDA00037952943700000410
Figure BDA00037952943700000411
Figure BDA00037952943700000412
Figure BDA00037952943700000413
where
Figure BDA00037952943700000414
is the item input vector;
Figure BDA00037952943700000415
is the user's short-term interest;
Figure BDA00037952943700000416
is the intermediate hidden state;
Figure BDA00037952943700000417
is the hidden state;
Figure BDA00037952943700000418
is the update gate;
Figure BDA00037952943700000419
is the reset gate;
Figure BDA00037952943700000420
is the output vector; W_p, W_q, W_h denote the transformation matrices of the GRU unit; W_1, W_2, W_3 denote the weight matrices of the GRU unit; σ denotes the sigmoid activation function; ⊙ denotes element-wise multiplication;
Figure BDA00037952943700000421
denotes the probability that the |V|-th item in the item set V is clicked; |V| denotes the number of items in the item set V; and W denotes the weight matrix connecting the hidden layer to the output layer.
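The prediction step described above can be sketched as follows. This is a minimal illustration rather than the claimed model: the gate equations are only available as images, so the sketch assumes a standard GRU cell and assumes that the item input vector and the short-term interest vector are concatenated before entering the gates. The names `gru_step`, `recommend`, `W_z`, `W_r`, `W_c` and `W_out`, the dimensions, and the random weights are all hypothetical.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def softmax(scores):
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gru_step(x, h, W_z, W_r, W_c):
    """One standard GRU step (assumed form, not the patent's exact equations)."""
    z = [sigmoid(a) for a in matvec(W_z, x + h)]        # update gate
    r = [sigmoid(a) for a in matvec(W_r, x + h)]        # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)]           # element-wise product
    c = [math.tanh(a) for a in matvec(W_c, x + gated)]  # candidate hidden state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

def recommend(item_vecs, interest_vecs, weights, W_out, top_k):
    """Run the GRU over one session, then rank items by softmax probability."""
    h = [0.0] * len(W_out[0])
    for item_vec, interest_vec in zip(item_vecs, interest_vecs):
        # Assumption: item input and short-term interest are concatenated.
        h = gru_step(item_vec + interest_vec, h, *weights)
    probs = softmax(matvec(W_out, h))
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:top_k], probs

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_item, d_interest, d_hidden, n_items = 2, 2, 3, 5
d_in = d_item + d_interest + d_hidden
weights = [rand_matrix(d_hidden, d_in, rng) for _ in range(3)]  # W_z, W_r, W_c
W_out = rand_matrix(n_items, d_hidden, rng)

item_vecs = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]      # one vector per click
interest_vecs = [[0.3, 0.3], [0.5, 0.1], [0.2, 0.6]]  # one vector per stage
top_items, probs = recommend(item_vecs, interest_vecs, weights, W_out, top_k=3)
```

Here `top_items` holds the indices of the three items with the highest predicted click probability, in descending order. With untrained random weights the ranking is meaningless; the sketch only shows the data flow from click sequence to ranked recommendation.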
The method and the system can effectively mine the long-term and short-term interest information of the user in the user behavior sequence, more accurately express the interest preference of the user, and simultaneously improve the recommendation accuracy of the e-commerce platform.
Drawings
FIG. 1 is a flow chart of a long-short term interest-based session-aware recommendation prediction model of the present invention;
FIG. 2 is a schematic diagram of a probabilistic model for predicting purchased items at each stage of a session according to the present invention;
FIG. 3 is a schematic diagram of a CNN convolutional network according to the present invention;
FIG. 4 is a schematic diagram of the SiM algorithm calculating short-term interest preferences according to the present invention;
FIG. 5 is a diagram of a long-term interest evolution model according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a session-aware recommendation prediction method for an e-commerce platform based on long-term and short-term interests, comprising the following steps:
acquiring online data comprising basic user information, basic item information, and session sequences of user behavior;
extracting user behaviors and user preferences from the acquired online data, and constructing a long-term interest set for the user;
obtaining the user's short-term interest at the current stage from the user's long-term interest set through interest matching;
and constructing a prediction model that takes as input the user's click sequence within a session (the item input) and the user's short-term interest at the current stage, and that outputs the predicted items for recommendation.
In this embodiment, as shown in fig. 1, recommendation prediction mainly includes the following steps:
S1: Acquire the data online. The data may be obtained from a public dataset website or by directly querying an enterprise database that provides real-time sales data. The data to be obtained are the user's basic information, the item's basic information, and the session sequences of user behavior. A session sequence contains multiple elements; each session sequence is converted to a fixed length and the data are preprocessed.
S2: Extract the relevant attributes. Relevant attributes are extracted from the acquired basic user information, basic item information, and historical user session sequences on the e-commerce platform. User preference is described by fusing features from two aspects, the user's long-term interest and short-term interest.
S3: Build the model. A prediction model is constructed to predict a candidate item sequence, which is pushed to the user.
This embodiment provides a specific method for acquiring the data source, which mainly comprises the following steps:
S11: Acquire the raw data. The raw data can be obtained by real-time queries against an enterprise database or from a public dataset website.
S12: Perform simple data cleaning. Raw data as acquired are typically unstructured and cannot be used directly for analysis. Most unstructured data can be structured through simple cleaning, for example by deleting duplicate records and clearing invalid information.
S13: Store the data. The cleaned data are stored in a database, whose table structure further normalizes the data; the database also greatly improves retrieval efficiency and makes the relationships among tables explicit.
On an e-commerce platform, a user's ordering behavior in response to a marketing campaign is influenced by various factors, such as the user's own interests, the participation of similar users, and the effect of the campaign's reward mechanism on impulsive consumption. Accordingly, this embodiment extracts the factors affecting user behavior from both external and internal perspectives, treating user behavior as the external factor and user preference as the internal factor. The specific extraction process includes:
S21: Extract the user behavior.
S211: User session set E
Each interaction record between a user and an item is associated with a session number s. Using e_s to denote the set of all items a given user interacted with in the s-th session, all interaction information up to and including the current session s is expressed as E = {e_1, e_2, …, e_s}.
S212: User-item interaction matrix T
The user set and item set are denoted U = {u_1, u_2, …} and V = {v_1, v_2, …}, respectively. Based on the users' interactions with the items, an interaction matrix between users and items is defined as T = {t_uv | u ∈ U, v ∈ V}; that is, if user u clicked on item v, then t_uv = 1, otherwise t_uv = 0.
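The session set E and the interaction matrix T defined in S211 and S212 can be built from raw click records as in the following sketch. The record layout (user, item, session number) and the function names `session_sets` and `interaction_matrix` are assumptions for illustration.

```python
from collections import defaultdict

# (user, item, session number) click records; all values are made up.
clicks = [("u1", "v1", 1), ("u1", "v2", 1), ("u1", "v3", 2), ("u2", "v2", 1)]

def session_sets(clicks, user):
    """E = {e_1, ..., e_s}: items the given user interacted with per session."""
    sets = defaultdict(set)
    for u, v, s in clicks:
        if u == user:
            sets[s].add(v)
    return dict(sets)

def interaction_matrix(clicks, users, items):
    """T[u][v] = 1 if user u clicked item v, else 0."""
    clicked = {(u, v) for u, v, _ in clicks}
    return [[1 if (u, v) in clicked else 0 for v in items] for u in users]

users, items = ["u1", "u2"], ["v1", "v2", "v3"]
E = session_sets(clicks, "u1")
T = interaction_matrix(clicks, users, items)
```

Rows of `T` follow the order of `users` and columns follow the order of `items`, so `T[0][2]` is the entry t_uv for user u1 and item v3.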
S22: Extract the user preference.
S221: User-item attention (Attention)
The user's attention to an item is determined by two quantities, the dwell time t of the user on the item page and the number of clicks Num(click), and is expressed as Attention = t × Num(click).
S222: user long-term interest set P
With the first s-1 sessions, we prepare a long-term interest set P for each user to characterize the user's long-term interest preferences. The long-term interest set P stores m hidden preference vectors for a user, denoted as
Figure BDA0003795294370000071
S223: short-term user interest
Figure BDA0003795294370000072
The long-term interest set of the user before the s-th session is expressed as
Figure BDA0003795294370000073
Short-term interest of the user is matched to the user's predecessor behavior by each stage n within the s-th session
Figure BDA0003795294370000074
Derived from the long-term interest set P and can be defined as
Figure BDA0003795294370000075
The constructed prediction model mines and memorizes the user's long-term interests and estimates the candidate item sequence based on the evolution of the user's interests, through the following steps:
mining fine-grained long-term interests of the user from the user behavior sequence within a session, and storing them in the long-term interest set;
matching a dynamic representation of the user's short-term interest at the current stage from the user's long-term interest set, using a purpose-designed SiM (Short-interest Match) interest matching method;
modeling, with a GRU, how the user's interests change over time;
introducing a gated recurrent neural network to model the user's click sequence within a session and estimate the candidate item sequence.
In obtaining the user's long-term interest set, a long-term interest set module P is defined for each user over s sessions. At the feature level, the preference features of each user are described by extracting features from the historical items the user interacted with and storing m hidden interest vectors. As shown in fig. 3, this specifically includes the following steps:
The user's historical session sequence is fed into an embedding layer to obtain a historical item matrix H = [v_1, v_2, …, v_i, …]^T, where
Figure BDA0003795294370000076
Multiple convolution kernels
Figure BDA0003795294370000077
are used to perform convolution on the matrix H, and each convolution kernel yields one preference feature map p_i of the user, namely:
Figure BDA0003795294370000078
where w_t denotes the element of the convolution kernel ω with index t; H_{t,i} denotes the element in row t and column i of the matrix H;
Figure BDA0003795294370000081
is the bias term; n_I is the dimension of the embedding vector; and v_i denotes the interacted-item vector mapped from the i-th session. The vector contains L features, each feature being one element of the vector.
Multiple convolution kernels are used to generate different user preference feature maps. Let Ω and τ denote the set of convolution kernels and their number, respectively. The τ kernels yield τ preference feature maps p of the user, which are concatenated and converted into the user's long-term interest through a fully connected layer, calculated as follows:
Figure BDA0003795294370000082
where * and concat denote the convolution and vector concatenation operations, respectively;
Figure BDA0003795294370000083
is the weight matrix of the fully connected layer;
Figure BDA0003795294370000084
is the bias term of the fully connected layer; and
Figure BDA0003795294370000085
denotes the user's long-term interest. The convolutional neural network can capture global relationships among features of the same dimension in the user's behavior, and can therefore effectively extract the user's long-term interest.
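The long-term interest computation described above (convolve H with kernels of several scales, concatenate, apply a fully connected layer and ReLU) can be sketched as follows. The exact kernel shape is only given in an equation image, so this sketch assumes one-dimensional kernels that slide down each column of H, which matches the statement that the network relates features of the same dimension. As is usual for CNNs, the kernel is applied without flipping. The function names, sizes, and random values are hypothetical.

```python
import random

def conv_columns(H, kernel, bias=0.0):
    """Slide `kernel` down each column of H, "valid" positions only."""
    k, rows, cols = len(kernel), len(H), len(H[0])
    return [
        sum(kernel[t] * H[r + t][c] for t in range(k)) + bias
        for r in range(rows - k + 1)
        for c in range(cols)
    ]

def long_term_interest(H, kernels, W_l, b):
    """ReLU(W_l . concat(H * w_1, ..., H * w_tau) + b)."""
    # Concatenate the feature maps produced by all kernels.
    features = [x for kernel in kernels for x in conv_columns(H, kernel)]
    # Fully connected layer followed by ReLU.
    return [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b_j)
        for row, b_j in zip(W_l, b)
    ]

rng = random.Random(0)
n_rows, n_I, out_dim = 5, 4, 3  # 5 interacted items, embedding size 4
H = [[rng.uniform(-1, 1) for _ in range(n_I)] for _ in range(n_rows)]
kernels = [[0.5, 0.5], [1.0, 0.0, -1.0]]  # tau = 2 kernels of different scales
n_features = sum((n_rows - len(k) + 1) * n_I for k in kernels)
W_l = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(out_dim)]
b = [0.0] * out_dim
p = long_term_interest(H, kernels, W_l, b)
```

The result `p` is one long-term interest vector for the session; repeating the computation for each session and collecting the results would give the long-term interest set P.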
Because the user's long-term interests keep growing with the user's behavior data, the user's preference features are written to the set after each session ends, which further improves the accuracy of the recommendation system. At the same time, to prevent overfitting, this embodiment introduces Dropout to reduce the model's sensitivity to noise. For example, a user may occasionally click by accident on items of no interest, and the model may overfit to this noisy portion of the data. Specifically, input item features are forced to 0 with a certain probability, expressed as:
Figure BDA0003795294370000086
where
Figure BDA0003795294370000087
is the user vector representation learned by the model;
Figure BDA0003795294370000088
is the item vector representation learned by the model; U_u and V_v are the user and item vector representations, respectively, input externally as supervisory signals;
Figure BDA0003795294370000089
is the predicted user vector;
Figure BDA00037952943700000810
is the predicted item vector;
Figure BDA00037952943700000811
is the probability of the user clicking;
Figure BDA00037952943700000812
is the probability of the item being clicked; U denotes the value range of users; V denotes the value range of items; and T denotes matrix transposition.
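The zeroing step can be sketched as follows. The probability in this embodiment is computed from the learned user and item vectors by a formula that is only available as an image, so the sketch takes the probability as a given number; the function name `drop_values` and the value 0.3 are assumptions for illustration.

```python
import random

def drop_values(H, prob, rng):
    """Set each entry of H to 0 independently with probability `prob`."""
    return [[0.0 if rng.random() < prob else x for x in row] for row in H]

rng = random.Random(42)
H = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
H_dropped = drop_values(H, prob=0.3, rng=rng)
```

Each entry of `H_dropped` is either the original value or 0. Unlike the common "inverted dropout" variant, the surviving values are not rescaled here, since the text only states that values are forced to 0.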
To simplify training, this embodiment uses the interaction duration between the user and an item as the supervisory signal: if the user's dwell time on an item exceeds the average duration, the user and the item form a positive sample; otherwise they form a negative sample. With this loss design, the model learns the user's long-term interest from positive samples more accurately. The loss function is as follows:
Figure BDA0003795294370000091
where l is the loss function; y ∈ {0, 1}, where y = 1 indicates that the user has interacted with the current item and y = 0 indicates that the user has not interacted with the current item;
Figure BDA0003795294370000092
denotes the learned item vector representation for an item on which the user's dwell time exceeds the average dwell time;
Figure BDA0003795294370000093
denotes the learned item vector representation for an item on which the user's dwell time does not exceed the average dwell time; v+ denotes an item on which the user's dwell time exceeds the average duration; and v− denotes an item on which the user's dwell time does not exceed the average duration.
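The construction of positive and negative samples from dwell time can be sketched as follows. The sketch assumes the average is taken over one user's own interactions, which the text does not state explicitly; the function name `label_by_dwell` and the dwell times are made up. The loss function itself is not reproduced because its form is only available as an image.

```python
def label_by_dwell(dwell_by_item):
    """Split items into positives (dwell > average) and negatives (otherwise)."""
    average = sum(dwell_by_item.values()) / len(dwell_by_item)
    positives = [v for v, t in dwell_by_item.items() if t > average]
    negatives = [v for v, t in dwell_by_item.items() if t <= average]
    return average, positives, negatives

# Dwell time in seconds of one user on each interacted item (made-up values).
dwell = {"v1": 30.0, "v2": 5.0, "v3": 70.0, "v4": 15.0}
average, v_plus, v_minus = label_by_dwell(dwell)
```

An item whose dwell time exactly equals the average (`v1` here) falls into the negative set, following the wording "does not exceed the average".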
The user's long-term interest set serves two purposes. First, at each stage of a session, the user's short-term interest can be matched from the long-term interest set through SiM; combined with the user's current hidden state, the matched short-term interest vector describes the user's current behavior at a finer granularity. Second, after a session ends, the long-term interest set stores the user's information from that session, and the evolution of the user's long-term interest is modeled through the sequence learning capability of the recurrent neural network.
During interaction with the platform, the user conveys a focused and clear intention to the model through the currently clicked item, so the recommended results should not diverge too far from it. At the same time, results that are too narrowly focused reduce distribution efficiency and cause the user to tire while browsing. The policy of "clear intention, moderate divergence" should therefore be followed.
Given the long-term interest set P of a user at the s-th session, the task of this embodiment is to extract the user's short-term interest at the current stage from the long-term interest set using the preceding behavior within the session. This operation is abstracted as a matching (Match) operation:
Figure BDA0003795294370000094
where
Figure BDA0003795294370000095
denotes the user's short-term interest at the n-th stage of the s-th session. One common practice is to average all of the user's interest vectors:
Figure BDA0003795294370000096
where N_s is the number of stages, which is also the number of item interactions in the current session.
Taking the average makes the short-term interest static throughout the session. To simulate the user's current preference in a real scenario more faithfully, a finer-grained short-term interest preference is needed. The model therefore includes a module that matches the user's short-term interest against the user's long-term interests: the user's preceding behavior sequence
Figure BDA0003795294370000101
is used to perform an efficient vector similarity search (ScaNN) over the user's long-term interest set P, which is the operation shown in fig. 4. This specifically includes the following steps:
First,
Figure BDA0003795294370000102
is mapped into an embedding space, and the embedding vector closest to the query is then found among all embedding vectors of P. The vector search process is as follows:
Suppose there are two embedding vectors x_1 and x_2, each to be quantized to one of two centers, c_1 or c_2.
Each x_i is quantized as
Figure BDA0003795294370000103
so that the inner product
Figure BDA0003795294370000104
is as close as possible to the original inner product <q, x_i>. This can be visualized as making the magnitude of the projection of
Figure BDA0003795294370000105
onto q as similar as possible to the magnitude of the projection of x_i onto q.
In the conventional quantization method, selecting the nearest center for each x_i can produce an incorrect relative ranking of the two points:
Figure BDA0003795294370000106
is greater than
Figure BDA0003795294370000107
whereas in fact <q, x_1> is less than <q, x_2>. Anisotropic vector quantization is therefore used: assigning x_1 to c_1 and x_2 to c_2 reduces the inner product error and improves precision, as expressed below:
Figure BDA0003795294370000108
where h_∥(w, ||x_i||) and h_⊥(w, ||x_i||) denote the scale parameters of x_i;
Figure BDA0003795294370000109
denotes the parallel residual between x_i and
Figure BDA00037952943700001010
while
Figure BDA00037952943700001011
denotes the orthogonal residual between x_i and
Figure BDA00037952943700001012
; || · || denotes the vector norm (length).
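The effect of weighting the two residuals differently can be illustrated with a small sketch. It assumes the loss takes the weighted form h_par * ||r_par||^2 + h_orth * ||r_orth||^2 used in anisotropic vector quantization; the function names, the toy vectors, and the constant weights are hypothetical and chosen only for illustration (in the text the weights depend on w and ||x_i||).

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def split_residual(x, x_quantized):
    """Split r = x - x_quantized into parts parallel and orthogonal to x."""
    r = [xi - qi for xi, qi in zip(x, x_quantized)]
    scale = dot(r, x) / dot(x, x)  # projection coefficient of r onto x
    r_par = [scale * xi for xi in x]
    r_orth = [ri - pi for ri, pi in zip(r, r_par)]
    return r_par, r_orth


def anisotropic_loss(x, x_quantized, h_par, h_orth):
    """Weighted sum of squared parallel and orthogonal residual norms."""
    r_par, r_orth = split_residual(x, x_quantized)
    return h_par * dot(r_par, r_par) + h_orth * dot(r_orth, r_orth)


def assign_center(x, centers, h_par, h_orth):
    """Index of the center with the smallest weighted loss."""
    losses = [anisotropic_loss(x, c, h_par, h_orth) for c in centers]
    return losses.index(min(losses))


# Toy data (hypothetical): center 0 errs only orthogonally to x,
# center 1 errs only along x and is closer in Euclidean distance.
x = [2.0, 0.0]
centers = [[2.0, 1.0], [1.2, 0.0]]

# Equal weights reduce the loss to squared Euclidean distance.
nearest = assign_center(x, centers, h_par=1.0, h_orth=1.0)
# Penalizing the parallel residual more strongly changes the assignment.
anisotropic = assign_center(x, centers, h_par=4.0, h_orth=1.0)
```

With equal weights the nearer center (index 1) is chosen; with the parallel residual weighted four times as heavily, the center whose error is purely orthogonal (index 0) wins, because an orthogonal error does not change the inner product with queries pointing along x.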
In this embodiment,
Figure BDA00037952943700001013
is taken as the embedding vector and
Figure BDA00037952943700001014
as the difference. The user interest vector that best matches the long-term interest set is the one that minimizes the difference
Figure BDA00037952943700001015
The resulting matched vector
Figure BDA00037952943700001016
is expressed as:
Figure BDA00037952943700001017
finally, f_att is used as the attention function, followed by a softmax function, to compute the normalized influence values:
Figure BDA00037952943700001018
The attention function f_att takes as input the vector matched from the user's long-term interest set,
Figure BDA00037952943700001019
and
Figure BDA00037952943700001020
and outputs the influence value between them. A larger value of α indicates that the interest has a more dominant influence on the user at the current stage. Using the values of α as weights, the user's short-term interest can be expressed as a weighted average:
Figure BDA0003795294370000111
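A minimal sketch of the softmax-normalized weighted average described above. The exact form of f_att is not given in the text, so a plain dot product is used here as a stand-in; that choice, the function names, and the convention of scoring each long-term interest against the matched vector are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(p, q):
    """Stand-in for f_att (assumption): inner product of two vectors."""
    return sum(a * b for a, b in zip(p, q))


def short_term_interest(long_term_set, matched, f_att=dot_attention):
    """Return (short-term interest, alpha weights).

    long_term_set: list of long-term interest vectors, one per past session.
    matched: the vector obtained by matching against the long-term set.
    """
    alphas = softmax([f_att(p, matched) for p in long_term_set])
    dim = len(matched)
    interest = [
        sum(a * p[d] for a, p in zip(alphas, long_term_set))
        for d in range(dim)
    ]
    return interest, alphas


# Two toy long-term interests; the first is aligned with the matched vector.
interest, alphas = short_term_interest(
    [[1.0, 0.0], [0.0, 1.0]], matched=[1.0, 0.0]
)
```

Because the weights sum to one, the result stays inside the convex hull of the long-term interests, and the interest most similar to the matched vector receives the largest α.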
A user's interests usually change from session to session. To model this interest evolution, an interest evolution module is provided. The module supplies more relevant historical information to the final interest representation, and the evolution trend of the interests allows the user's click sequence to be predicted more accurately.
The user's long-term interests exhibit two characteristics as they evolve. First, as the diversity of interests increases, the user's interests may drift, and this drift affects the user's behavior. Second, although a user's interests may influence one another, each interest has its own evolution process, so this embodiment focuses only on the evolution related to the target item. As shown in fig. 5, the interest evolution module adopts an attention mechanism to counter interest drift, expressed as:
Figure BDA0003795294370000112
where e_a is the vector representation of the candidate item,
Figure BDA0003795294370000113
is a weight matrix, and the computed a_t is the weight of the candidate item with respect to each interest; nA is the embedding dimension, i.e. the size of the item vector; nH is the dimension of the hidden layer.
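As an illustration, the attention weights can be sketched as a softmax over bilinear scores h_t · W · e_a, with W of shape nH x nA. This bilinear form is an assumption borrowed from common attention-based interest evolution models, since the formula itself is only available as an image; the function name and the toy values are hypothetical.

```python
import math


def attention_scores(hidden_states, W, e_a):
    """Softmax-normalized scores a_t for each interest state h_t.

    hidden_states: list of vectors of length nH (one per interest state).
    W: weight matrix with nH rows and nA columns.
    e_a: candidate item embedding of length nA.
    """
    # Project the candidate item into the hidden space: W @ e_a (length nH).
    projected = [sum(w * e for w, e in zip(row, e_a)) for row in W]
    # Bilinear score h_t . (W e_a) for every interest state.
    logits = [sum(h_i * p_i for h_i, p_i in zip(h, projected))
              for h in hidden_states]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example with nH = 2 and nA = 3.
a = attention_scores(
    hidden_states=[[1.0, 0.0], [0.0, 1.0]],
    W=[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    e_a=[2.0, 0.0, 0.0],
)
```

Here the first interest state aligns with the projected candidate item and therefore receives the larger weight.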
After the attention mechanism is added, the weight of the update gate is controlled by the attention score. The original update direction is preserved, while the strength of the hidden-state update is controlled by the degree of correlation with the candidate item. Controlling precisely when to update, as well as the strength and direction of the update, avoids the information loss that would result from directly multiplying each interest vector by the attention score and feeding the result into an ordinary GRU at the next layer. This is expressed as:
u'_t = u_t * a_t
Figure BDA0003795294370000114
In the AUGRU, the computed update gate u_t is multiplied by the weight factor a_t to obtain the final update gate u'_t, where
Figure BDA0003795294370000115
denotes the hidden state of the AUGRU, and the multiplication between vectors is element-wise.
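The scaling of the update gate by the attention score can be sketched as follows. The gate scaling u'_t = u_t * a_t comes from the text; the state update h_t = (1 - u'_t) * h_{t-1} + u'_t * h_candidate is the usual AUGRU form and is assumed here, because the corresponding formula is only available as an image. The function name is hypothetical.

```python
def augru_update(h_prev, h_candidate, update_gate, a_t):
    """One AUGRU state update with an attention-scaled update gate.

    h_prev, h_candidate, update_gate: vectors of equal length.
    a_t: scalar attention score in [0, 1].
    """
    # u'_t = u_t * a_t: the attention score scales every gate component.
    scaled_gate = [a_t * u for u in update_gate]
    # Element-wise interpolation between the old state and the candidate.
    return [
        (1.0 - u) * hp + u * hc
        for u, hp, hc in zip(scaled_gate, h_prev, h_candidate)
    ]


# An attention score of 0 leaves the previous state untouched.
unchanged = augru_update([3.0, -1.0], [9.0, 9.0], [0.7, 0.2], a_t=0.0)
# A partial score moves the state only part of the way to the candidate.
partial = augru_update([0.0, 0.0], [2.0, 4.0], [1.0, 0.5], a_t=0.5)
```

An interest that is unrelated to the candidate item (a_t close to 0) therefore barely disturbs the hidden state, which is how the mechanism limits the effect of interest drift.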
In this model, a GRU is chosen as the RNN implementation to model the user's click sequence within a session. Owing to its gate design, the GRU is an RNN architecture that can efficiently extract information from time series with long-range dependencies. Specifically:
at the n-th stage of the input sequence (n = 1, 2, …, N_s), the GRU unit is defined by the following quantities: the item input
Figure BDA0003795294370000121
the user short-term interest input
Figure BDA0003795294370000122
the intermediate hidden state
Figure BDA0003795294370000123
the hidden state
Figure BDA0003795294370000124
the update gate
Figure BDA0003795294370000125
the reset gate
Figure BDA0003795294370000126
and the output vector
Figure BDA0003795294370000127
The gate vectors
Figure BDA0003795294370000128
and
Figure BDA0003795294370000129
take values in the range [0, 1]. The transfer equations of the GRU unit are as follows:
Figure BDA00037952943700001210
Figure BDA00037952943700001211
Figure BDA00037952943700001212
Figure BDA00037952943700001213
Figure BDA00037952943700001214
where W_p, W_q and W_h denote the transformation matrices of the GRU unit; W_1, W_2 and W_3 are the matrices used to compute the output values; σ is the sigmoid activation function; and ⊙ denotes element-wise multiplication.
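A rough sketch of one gated step with the extra short-term interest input. The transfer equations are only available as images, so this follows the standard GRU layout and makes several assumptions: the gates are computed from the concatenation of the item input, the short-term interest input, and the previous hidden state; W_p is used for the update gate and W_q for the reset gate; and the output computation with W_1, W_2, W_3 is omitted. All function names are hypothetical.

```python
import math


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sigmoid_vec(v):
    """Element-wise logistic sigmoid; every output lies in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def gru_step(x, p, h_prev, W_p, W_q, W_h):
    """One GRU step. x: item input, p: short-term interest input.

    Each weight matrix has len(h_prev) rows and
    len(x) + len(p) + len(h_prev) columns.
    """
    z = x + p + h_prev  # list concatenation of the three inputs
    update = sigmoid_vec(matvec(W_p, z))
    reset = sigmoid_vec(matvec(W_q, z))
    # The reset gate scales the previous state inside the candidate.
    gated = x + p + [r * h for r, h in zip(reset, h_prev)]
    candidate = [math.tanh(v) for v in matvec(W_h, gated)]
    h = [(1.0 - u) * hp + u * c
         for u, hp, c in zip(update, h_prev, candidate)]
    return h, update, reset


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous state.
zeros = [[0.0] * 4 for _ in range(2)]
h, update, reset = gru_step([1.0], [2.0], [4.0, -2.0], zeros, zeros, zeros)
```

The sigmoid keeps every gate component between 0 and 1, consistent with the value range stated for the gate vectors.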
As shown in fig. 2, the GRU unit adopted in this embodiment differs from the conventional GRU unit in three respects:
first, the initial state of the GRU,
Figure BDA00037952943700001215
is assigned by the interest evolution module. Because the interest evolution module carries the user's interest information from previous sessions, namely the long-term interest set, this mitigates the cold-start problem of the conventional GRU;
second, each GRU unit has an additional input that serves as a supervisory signal, namely the user's short-term interest input
Figure BDA00037952943700001216
third, unlike the conventional GRU, which uses the hidden state directly as the output, this unit additionally uses
Figure BDA00037952943700001217
and
Figure BDA00037952943700001218
to construct the output vector.
Experimental results show that, compared with the conventional model, these modifications effectively improve the performance of the model.
After
Figure BDA00037952943700001219
has been obtained at the n-th stage, a softmax function layer is used to compute the predicted probabilities of all candidate items:
Figure BDA00037952943700001220
where |V| is the number of items in the item set V, and
Figure BDA0003795294370000131
is the predicted probability that item i is clicked at the n-th stage of the s-th session. The items are sorted in descending order of probability and recommended to the user.
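The final scoring and ranking step can be sketched as follows, assuming one logit per candidate item obtained by multiplying a weight matrix W (|V| rows) with the output vector. The function names and the toy numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_items(output_vector, W, top_k):
    """Return the top_k (item index, click probability) pairs.

    W has one row per candidate item, so len(W) == |V|.
    """
    logits = [sum(w * o for w, o in zip(row, output_vector)) for row in W]
    probs = softmax(logits)
    # Sort item indices by predicted probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:top_k]]


# Three candidate items and a 2-dimensional output vector.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
top = rank_items([1.0, 2.0], W, top_k=2)
```

In this toy case the logits are 1, 2 and 3, so item 2 is ranked first and item 1 second.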
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. A long-term and short-term interest-based E-commerce platform session awareness recommendation prediction method, characterized by comprising the following steps:
acquiring online data comprising basic information of users, basic information of items, and session sequences of user behaviors;
extracting user behaviors and user preferences from the acquired online data, and constructing a long-term interest set of the user;
acquiring the short-term interest of the user at the current stage from the long-term interest set of the user through interest matching;
constructing a prediction model that takes as input the user's click sequence within one session, i.e. the item input, together with the user's short-term interest at the current stage, and that outputs and recommends the predicted items.
2. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 1, wherein constructing the user long-term interest set comprises: acquiring a historical item matrix from the session sequence of historical user behaviors; performing convolution on the historical item matrix with convolution kernels of different scales, each convolution kernel yielding a preference feature map of the user; concatenating all preference feature maps and converting them into a long-term interest of the user through a fully connected layer; and forming the long-term interest set from the long-term interests obtained from all sessions of the user.
3. The method as claimed in claim 2, wherein τ convolution kernels are used to perform convolution on the historical item matrix respectively, the convolution kernel matrices being denoted Ω = {ω_1, ω_2, …, ω_τ}, and the long-term interest of the user obtained in the i-th session is expressed as:
Figure FDA0003795294360000011
where ReLU() denotes the ReLU activation function; concat() denotes the concatenation operation; H is the historical item matrix, expressed as H = [v_1, v_2, …, v_i, …]^T; v_i denotes the interacted-item vector obtained by mapping the i-th sub-session; W_l is the weight matrix of the fully connected layer; and b is the bias term of the fully connected layer.
4. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 2, wherein each value in the historical item matrix is forcibly set to 0 with a probability expressed as:
Figure FDA0003795294360000021
where O denotes the probability that a given value in the historical item matrix is set to 0; U_u is the user vector representation; V_v is the item vector representation;
Figure FDA0003795294360000022
denotes the user vector representation learned from user-item interactions;
Figure FDA0003795294360000023
denotes the item vector representation learned from user-item interactions.
5. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 3, wherein the user vector representation and the item vector representation are learned from user-item interactions by progressively reducing a loss function until a set threshold is reached, the loss function being expressed as:
Figure FDA0003795294360000024
where l is the loss function; y ∈ {0, 1}, with y = 1 indicating that the user has interacted with the current item and y = 0 indicating that the user has not interacted with the current item;
Figure FDA0003795294360000025
denotes the item vector representation learned from user-item interactions in which the user's dwell time on the item exceeds the average dwell time;
Figure FDA0003795294360000026
denotes the item vector representation learned from user-item interactions in which the user's dwell time on the item does not exceed the average dwell time.
6. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 1, wherein the short-term interest of the user at the current stage is obtained from the user's long-term interest set through interest matching and is expressed as:
Figure FDA0003795294360000027
where
Figure FDA0003795294360000028
denotes the short-term interest at the n-th stage of the s-th session;
Figure FDA0003795294360000029
denotes the long-term interest obtained in the i-th session; M denotes the number of elements in the long-term interest set; N_s is the number of stages; and
Figure FDA00037952943600000210
denotes the weight of the short-term interest at the n-th stage of the s-th session.
7. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 5, wherein an attention mechanism is used to compute the influence of the vector matched from the user's long-term interest set,
Figure FDA00037952943600000211
on
Figure FDA00037952943600000212
and the vector matched from the user's long-term interest set,
Figure FDA00037952943600000213
is expressed as:
Figure FDA00037952943600000214
where
Figure FDA0003795294360000031
denotes the difference between the embedding vectors of the user's long-term interest set and the session sequence of current user behaviors
Figure FDA0003795294360000032
projected into the embedding space;
Figure FDA0003795294360000033
denotes the long-term interest obtained in the i-th session;
Figure FDA0003795294360000034
denotes the quantized form of
Figure FDA0003795294360000035
with <·,·> denoting the inner product and w() denoting the weight function;
Figure FDA0003795294360000036
denotes taking the mean of
Figure FDA0003795294360000037
over the set Q; and Q denotes all preceding behaviors of the user within the current session.
8. The E-commerce platform session awareness recommendation prediction method based on long-term and short-term interests as claimed in claim 1, wherein the prediction model is built on a gated recurrent unit (GRU); the output of the output gate of the gated recurrent unit is passed through a softmax function layer to predict the probability of each item being clicked, and the items are recommended to the user in descending order of probability, the process being expressed as:
Figure FDA0003795294360000038
Figure FDA0003795294360000039
Figure FDA00037952943600000310
Figure FDA00037952943600000311
Figure FDA00037952943600000312
Figure FDA00037952943600000313
where
Figure FDA00037952943600000314
is the item input vector;
Figure FDA00037952943600000315
is the user's short-term interest;
Figure FDA00037952943600000316
is the intermediate hidden state;
Figure FDA00037952943600000317
is the hidden state;
Figure FDA00037952943600000318
is the update gate;
Figure FDA00037952943600000319
is the reset gate;
Figure FDA00037952943600000320
is the output vector; W_p, W_q and W_h denote the transformation matrices of the GRU unit; W_1, W_2 and W_3 denote the weight matrices of the GRU unit; σ denotes the sigmoid activation function; ⊙ denotes element-wise multiplication;
Figure FDA00037952943600000321
denotes the probability that the |V|-th item in the item set V is clicked; |V| denotes the number of items in the item set V; and W denotes the weight matrix connecting the hidden layer to the output layer.
CN202210967561.3A 2022-08-12 2022-08-12 E-commerce platform session perception recommendation prediction method based on long-term and short-term interests Pending CN115293812A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210967561.3A CN115293812A (en) 2022-08-12 2022-08-12 E-commerce platform session perception recommendation prediction method based on long-term and short-term interests

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210967561.3A CN115293812A (en) 2022-08-12 2022-08-12 E-commerce platform session perception recommendation prediction method based on long-term and short-term interests

Publications (1)

Publication Number Publication Date
CN115293812A true CN115293812A (en) 2022-11-04

Family

ID=83828588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210967561.3A Pending CN115293812A (en) 2022-08-12 2022-08-12 E-commerce platform session perception recommendation prediction method based on long-term and short-term interests

Country Status (1)

Country Link
CN (1) CN115293812A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455629A (en) * 2023-11-24 2024-01-26 美服数字科技(广州)有限公司 Live broadcast and cargo carrying intelligent pushing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination