CN108648049B - Sequence recommendation method based on user behavior difference modeling

Info

Publication number: CN108648049B
Application number: CN201810414330.3A
Authority: CN
Inventors: 陈恩红; 刘淇; 李徵; 赵洪科; 张凯
Original assignee: University of Science and Technology of China USTC
Current assignee: Chen Enhong; Huang Zhenya; Liu Qi; University of Science and Technology of China USTC
Priority date: 2018-05-03
Filing date: 2018-05-03
Publication date: 2022-03-01
Anticipated expiration: 2038-05-03
Also published as: CN108648049A

Abstract

The invention discloses a sequence recommendation method based on user behavior difference modeling, comprising the following steps: acquiring historical behavior information of a user; calculating commodity feature vectors from the acquired historical behavior information; performing sequence modeling with a behavior difference modeling method in combination with the commodity feature vectors, and obtaining the current demand and the historical preference of the user through two different neural network architectures; and predicting the user's next commodity of interest through joint learning on the current purchase demand and the historical preference, matching the prediction in the commodity vector space, finding the several commodities most similar to the prediction result, and generating a commodity recommendation sequence. By modeling the user's time-ordered behaviors differentially, the method captures both the current demand and the long-term preference behind the user's purchasing decisions and can provide accurate sequence recommendation service for the user.

Description

Sequence recommendation method based on user behavior difference modeling

Technical Field

The invention relates to the technical field of machine learning and electronic commerce, in particular to a sequence recommendation method based on user behavior difference modeling.

Background

With the continuous development of online shopping platforms, recommendation systems have become an indispensable component of e-commerce. A recommendation system learns the preference information hidden in a user's historical behaviors, predicts the user's further shopping behavior, helps customers select satisfactory commodities, and thereby increases the revenue of the e-commerce platform. How to provide personalized commodity recommendation service efficiently and accurately has therefore long been an important research topic in both academia and industry.

Currently, there are two main categories of research on recommendation systems:

1) recommendation system based on user static preference

Content-based, collaborative-filtering, and hybrid algorithms all belong to this category. These methods treat the user's commodity information as static features and mine the commodity similarity, personalized user preference, and other information hidden in those features through clustering, matrix factorization, and similar techniques, so as to recommend similar content or content matching similar preferences. Under this model, the user's historical behavior data is regarded as a static feature of the user, and the user's preference is regarded as stable over the long term and as influencing the user's future decisions. On this basis, the recommendation system only needs to learn the user's historical preference and recommend similar commodities accordingly.

2) Short-session-based sequence recommendation method

Some online platforms, especially small retail platforms and multimedia content providers, lack sufficient user history, but their back ends accumulate a large amount of short-session user data. Because such scenarios lack long-term preference features, researchers have proposed short-session-based sequence recommendation methods. These methods are usually based on the user's short-term operations: a deep neural network is constructed to model the short-term dynamic changes in the user's behavior, which are then used to predict and recommend the commodities the user will be interested in next.

Of the above recommendation methods for online platforms, the recommendation system based on static user preference can learn the user's stable preference well and can recommend favorite goods or services fairly accurately. However, the method is static: it treats the user's preference as unchanged over the long term and considers neither how that preference changes dynamically nor what the user currently needs, so the recommended goods or services are ones the user likes but not necessarily ones the user needs. The short-session-based sequence recommendation method records the user's interactions over a short period and analyzes the sequential characteristics of the current decision process from these short-term behaviors in order to judge the next commodity or service of interest. It can model short-term dynamic changes in user behavior with a deep neural network, but it ignores the user's preference, so the recommendation often meets the user's need without being of a type the user likes. Moreover, neither method models in depth the dynamic changes across the user's whole decision process, and neither analyzes the different degrees of preference expressed by different user behaviors. It is therefore difficult for conventional recommendation methods to model accurately the complete decision process by which a user selects goods or services, and the user's needs and preferences cannot be combined, so the recommended content fails to meet the user's expectations.

Disclosure of Invention

The invention aims to provide a sequence recommendation method based on user behavior difference modeling, which intelligently understands the current requirements and long-term preference in user purchasing decisions and can provide accurate sequence recommendation service for users by difference modeling of user time sequence behaviors.

The purpose of the invention is realized by the following technical scheme:

a sequence recommendation method based on user behavior difference modeling comprises the following steps:

acquiring historical behavior information of a user;

calculating a commodity feature vector according to historical behavior information of a user;

performing sequence modeling by using a behavior difference modeling method in combination with commodity feature vectors, and acquiring the current demand and the historical preference of a user through two different neural network architectures;

and predicting the next interested commodity of the user through combined learning according to the current purchase demand and the historical preference of the user, matching in a commodity vector space, finding a plurality of commodities which are most similar to the prediction result in the commodity vector space, and generating a commodity recommendation sequence.

According to the technical scheme provided by the invention, differential modeling of user behaviors makes it possible to examine in depth the factors that influence the user's decision process, to take into account the different needs and preferences reflected by different types of user behavior, and to predict the user's next operation effectively, so that commodities more satisfactory to the user are recommended. Sequence recommendation can be carried out dynamically through interaction with the user, which overcomes the lack of dynamics and of personalization in existing methods.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a sequence recommendation method based on user behavior difference modeling according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a sequence recommendation method based on user behavior difference modeling, which, as shown in Fig. 1, mainly comprises the following steps:

Step 1: obtaining historical behavior information of a user.

Each user leaves a series of log records in the back end when browsing the online platform. These records have a definite temporal order and cover commodity-related operations such as browsing, clicking, adding to the shopping cart, collecting (adding to favorites), and purchasing. This data may be collected directly from the online shopping platform or online service provider.

In the embodiment of the present invention, the acquired historical behavior information of the user is data in the form of an interactive commodity sequence, and the interactive commodity sequence of user u is represented as:

S_u^b = {(x_1, b_1), (x_2, b_2), ..., (x_N, b_N)}

wherein x represents a commodity and its subscript is the serial number of the commodity; b represents the user operation behavior and is a one-hot vector whose length is the number of interaction types.
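
As a rough illustration of this data format, the following minimal sketch builds a time-ordered sequence of (commodity, one-hot behavior) pairs from raw log rows. The behavior vocabulary, its ordering, the `Interaction` class, and the log row layout are assumptions made for the example and are not specified by the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical behavior vocabulary; the patent names these operations
# but does not fix their order in the one-hot vector.
BEHAVIORS = ["browse", "click", "add_to_cart", "collect", "purchase"]


def one_hot(behavior: str) -> List[int]:
    """Return the one-hot vector b; its length is the number of interaction types."""
    vec = [0] * len(BEHAVIORS)
    vec[BEHAVIORS.index(behavior)] = 1
    return vec


@dataclass
class Interaction:
    item_id: str         # commodity x_i
    behavior: List[int]  # one-hot behavior vector b_i


def build_sequence(log: List[Tuple[int, str, str]]) -> List[Interaction]:
    """Sort (timestamp, item_id, behavior) rows by time and build the sequence S_u^b."""
    return [Interaction(item, one_hot(beh)) for _, item, beh in sorted(log)]


# Example log rows, deliberately out of order.
log = [(3, "x3", "purchase"), (1, "x1", "click"), (2, "x2", "add_to_cart")]
sequence = build_sequence(log)
```

Keeping the behavior as a separate one-hot vector, rather than folding it into the commodity identifier, lets the later modeling steps feed the commodity vector and the behavior vector to the network as distinct inputs.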

Step 2: calculating commodity feature vectors according to the historical behavior information of the user.

In the embodiment of the invention, a Skip-gram model with negative sampling is constructed by modeling the sequential relations among commodities in the user's behaviors, and feature vectors of the commodities are generated. The main process is as follows:

according to the interactive commodity sequence S_u of user u, commodity feature vectors are established with the aim of maximizing the following objective:

wherein N is the length of the interactive commodity sequence S_u, and p(x_j | x_i), defined as the probability that commodities x_j and x_i are related, takes the form known in the field as the softmax function:

wherein w_i and v_i are, respectively, the latent vector and the target vector corresponding to commodity x_i; w_j is the latent vector corresponding to commodity x_j; w_k' is the latent vector corresponding to commodity x_k'; and k' takes values from 1 to N;

to alleviate the computational complexity of the gradient, the above equation is replaced by the following procedure:

where σ(r) = 1/(1 + exp(-r)) is the sigmoid function and E is the number of negative samples drawn per positive sample; a positive sample is a commodity in the context of x_i, a negative sample is an unrelated commodity, and the value of E can be set according to actual conditions or experience;

considering that different commodities occur with different frequencies, which introduces noise into the negative sampling process, the above formula is redefined by weighting according to the number of occurrences of each commodity:

wherein θ(x_i) is the frequency with which commodity x_i appears in the interactive commodity sequence; the goal of the commodity embedded representation is then to maximize the following objective function:

then, the commodity feature vectors P_u = {v_1, v_2, ..., v_N} are obtained by gradient descent, wherein v_j represents the d-dimensional feature vector of commodity x_j.
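
The formulas for this step are not reproduced in the text, so the following is only a sketch of the standard Skip-gram negative-sampling objective for one (center, context) pair, using the sigmoid σ(r) and E negative samples as described above. Which of the latent and target vectors plays which role, and the frequency-based sampling rule (the count raised to the power 0.75, a common word2vec heuristic), are assumptions; the patent's own frequency weighting via θ(x_i) may differ.

```python
import math
import random


def sigmoid(r):
    """sigma(r) = 1 / (1 + exp(-r))."""
    return 1.0 / (1.0 + math.exp(-r))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pair_log_likelihood(w_i, v_pos, v_negs):
    """log sigma(w_i . v_pos) + sum over negatives of log sigma(-w_i . v_neg).

    This is the standard negative-sampling surrogate for log p(x_j | x_i);
    it is to be maximized.
    """
    score = math.log(sigmoid(dot(w_i, v_pos)))
    for v_neg in v_negs:
        score += math.log(sigmoid(-dot(w_i, v_neg)))
    return score


def draw_negatives(counts, exclude, E, rng):
    """Draw E negative commodities, weighted by occurrence count ** 0.75 (assumed heuristic)."""
    candidates = [x for x in counts if x not in exclude]
    weights = [counts[x] ** 0.75 for x in candidates]
    return rng.choices(candidates, weights=weights, k=E)


rng = random.Random(0)
counts = {"x1": 5, "x2": 3, "x3": 1, "x4": 1}  # occurrence counts per commodity
negatives = draw_negatives(counts, exclude={"x1", "x2"}, E=2, rng=rng)

# A context vector aligned with w_i scores higher than an opposed one.
good = pair_log_likelihood([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
bad = pair_log_likelihood([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
```

In a full training loop these per-pair scores would be summed over all context pairs in the sequence and the vectors updated by gradient ascent on the sum (or descent on its negative).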

Step 3: performing sequence modeling by using a behavior difference modeling method in combination with the commodity feature vectors, and obtaining the current demand and the historical preference of the user through two different neural network architectures.

After the commodity feature vectors are obtained, differential behavior modeling takes the user's consecutive behaviors as prior knowledge and aims to recommend the items the target user is most likely to access next. The user's decision process is mainly influenced by two factors: current motivation and historical preference. More specifically, the user's current consumption motivation is dynamic in the short term, and recent fluctuations are important for reflecting short-term characteristics. Since all recent actions (e.g., click, collect, add to cart, purchase) may indicate the user's current short-term motivation, the present invention uses all types of recent behavior to represent the current consumption motivation. For the user's historical preference, on the other hand, not all types of behavior are informative. To model the long-term preference of a user, the present invention retains from the interaction history only the behaviors that clearly describe the user's underlying preference, i.e., purchasing behavior. In effect, the user's interaction process is a series of implicit feedback over time. Thus, unlike conventional recommendation systems that explore user-item interactions statically, the invention handles next-item recommendation through sequence modeling. Specifically, two distinct behavior modeling processes are designed: session behavior modeling and preference behavior modeling, which separately learn the user's current consumption motivation and long-term stable preference. On this basis, two LSTM-based deep recurrent neural networks are used to jointly learn representations of the current motivation and the preference behavior.

First, session behavior modeling is performed. For the commodity feature vectors P_u = {v_1, v_2, ..., v_N}, the corresponding interactive commodity sequence is S_u^b = {(x_1, b_1), (x_2, b_2), ..., (x_N, b_N)}.

The following indicator function is defined to determine whether commodity x_i falls within the scope of the current session behavior:

D_SBL(x_i, x_N) = Φ((N - i) ≤ Ts);

wherein Φ(a) is a Boolean function whose value is 1 when a is true and 0 otherwise; Ts is the control time step of the session behavior, which controls the length of the session; and x_N is the last commodity in the current interactive commodity sequence S_u^b;
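
The indicator function can be sketched directly. The example below applies D_SBL with 1-based positions to select the recent part of a sequence; the function names and the sample data are hypothetical.

```python
def d_sbl(i: int, n: int, ts: int) -> int:
    """Indicator D_SBL(x_i, x_N) = Phi((N - i) <= Ts), with 1-based positions i and N."""
    return 1 if (n - i) <= ts else 0


def session_window(sequence, ts):
    """Keep the pairs whose position lies within Ts steps of the last commodity x_N."""
    n = len(sequence)
    return [pair for i, pair in enumerate(sequence, start=1) if d_sbl(i, n, ts)]


sequence = [
    ("x1", "click"),
    ("x2", "click"),
    ("x3", "add_to_cart"),
    ("x4", "collect"),
    ("x5", "purchase"),
]
recent = session_window(sequence, ts=2)
```

Because the condition is (N - i) ≤ Ts, the window contains the last commodity plus the Ts commodities before it, that is, Ts + 1 items when the sequence is long enough.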

After the LSTM weight matrices are initialized, in the t-th iteration step the hidden state h_t depends on the hidden state h_{t-1} of the previous time step, the currently input commodity feature vector v_t, and the behavior vector b_t. The update steps are as follows:

h_t = o_t tanh(c_t)

wherein i_t, f_t, and o_t are, respectively, the input gate, the forget gate, and the output gate in the t-th iteration step; c_t is the memory cell of the network unit; b_t is the user operation behavior corresponding to the t-th commodity, input in the t-th iteration step; W_vi, W_hi, W_ci, and W_bi are the weights of v_t, h_{t-1}, c_{t-1}, and b_t in the input gate i_t; W_vf, W_hf, W_cf, and W_bf are the weights of v_t, h_{t-1}, c_{t-1}, and b_t in the forget gate f_t; W_vc, W_hc, and W_bc are the weights of v_t, h_{t-1}, and b_t in the memory cell; W_vo, W_ho, W_co, and W_bo are the weights of v_t, h_{t-1}, c_{t-1}, and b_t in the output gate o_t;

the bias terms correspond, respectively, to the input gate i_t, the forget gate f_t, the output gate o_t, and the memory cell c_t; h_t is the output of the current state; and tanh is the hyperbolic tangent function.
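
Only the last update equation survives in the text. One plausible reading of the full update, assembled from the weights listed above and the usual LSTM form with connections from the memory cell to the gates, is sketched below. The bias symbols β_i, β_f, β_c, and β_o are placeholders introduced here because the original symbols are not reproduced, and the exact form in the patent may differ.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{vi} v_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + W_{bi} b_t + \beta_i\right) \\
f_t &= \sigma\left(W_{vf} v_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + W_{bf} b_t + \beta_f\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{vc} v_t + W_{hc} h_{t-1} + W_{bc} b_t + \beta_c\right) \\
o_t &= \sigma\left(W_{vo} v_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + W_{bo} b_t + \beta_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

Here σ is the sigmoid function and ⊙ is element-wise multiplication. The output gate is written with c_{t-1} because that is the input the text lists for it; some LSTM variants use c_t at that position instead.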

The current purchase demand of the user is expressed as:

Ψ_SBL＝h_N；

In the above process, the number of iterations equals the number of commodities in the interactive commodity sequence, i.e., t = 1, 2, ..., N; h_N is the output after the last commodity x_N of the sequence is input, that is, the output of the N-th iteration step.

Second, the historical preference of the user is modeled. For each commodity-behavior pair (v_i, b_i) ∈ S_u^b of a user, the indicator function is expressed as:

D_PBL(v_i, b_i) = Φ(b_i ∈ P);

wherein P is the set of preference behaviors, mainly comprising the purchase, collect, and add-to-cart operations;
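
The preference filter can be sketched in the same way. The set P below follows the three behaviors named above; the string labels and the sample sequence are hypothetical.

```python
# The set P of preference behaviors named in the text; the labels are illustrative.
PREFERENCE_BEHAVIORS = {"purchase", "collect", "add_to_cart"}


def d_pbl(behavior: str) -> int:
    """Indicator D_PBL(v_i, b_i) = Phi(b_i in P)."""
    return 1 if behavior in PREFERENCE_BEHAVIORS else 0


def preference_pairs(sequence):
    """Keep only the commodity-behavior pairs whose behavior belongs to P, in order."""
    return [(item, beh) for item, beh in sequence if d_pbl(beh)]


sequence = [
    ("x1", "click"),
    ("x2", "add_to_cart"),
    ("x3", "browse"),
    ("x4", "purchase"),
]
kept = preference_pairs(sequence)
```

Unlike the session window, this filter looks at the whole history and ignores recency, which matches its role of capturing long-term preference.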

A bidirectional LSTM network is used to learn the user's preference representation, so there are two hidden-layer outputs at each time step of historical preference modeling. For the s-th time step, the forward output is determined by the forward output of the previous time step and the current commodity-behavior pair (v_s, b_s); the backward output is determined by the backward output of the next time step and the current commodity-behavior pair (v_s, b_s). The corresponding formula is as follows:

h_s = o_s tanh(c_s)

wherein i_s, f_s, and o_s are, respectively, the input gate, the forget gate, and the output gate at the s-th time step; c_s is the memory cell of the network unit; b_s is the user operation behavior corresponding to the s-th commodity, input in the s-th iteration step; W_vi', W_hi', W_ci', and W_bi' are the weights of v_s, h_{s-1}, c_{s-1}, and b_s in the input gate i_s; W_vf', W_hf', W_cf', and W_bf' are the weights of v_s, h_{s-1}, c_{s-1}, and b_s in the forget gate f_s; W_vc', W_hc', and W_bc' are the weights of v_s, h_{s-1}, and b_s in the memory cell; W_vo', W_ho', W_co', and W_bo' are the weights of v_s, h_{s-1}, c_{s-1}, and b_s in the output gate o_s;

the bias terms correspond, respectively, to the input gate i_s, the forget gate f_s, the output gate o_s, and the memory cell c_s; h_s is the output of the current state: in the forward pass h_s is the forward output, and in the backward pass h_s is the backward output.

Through the bidirectional LSTM network, a preference representation vector of the current user is obtained at each time step from the forward and backward outputs.

The historical preference of the user, Ψ_PBL, is then obtained from these per-step vectors by average pooling:
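
The pooling formula is not reproduced in the text. The sketch below assumes that the forward and backward outputs of each time step are concatenated into one per-step vector and that the historical preference is the element-wise mean of those vectors over all time steps; both the concatenation and the function name are assumptions.

```python
def average_pool(forward_outputs, backward_outputs):
    """Concatenate forward/backward outputs per step, then average over steps."""
    if not forward_outputs or len(forward_outputs) != len(backward_outputs):
        raise ValueError("need the same non-zero number of forward and backward outputs")
    # Per-step representation: list concatenation of the two hidden outputs (assumed).
    per_step = [f + b for f, b in zip(forward_outputs, backward_outputs)]
    steps = len(per_step)
    dim = len(per_step[0])
    return [sum(vec[k] for vec in per_step) / steps for k in range(dim)]


# Two time steps, hidden size 2 in each direction.
forward = [[1.0, 2.0], [3.0, 4.0]]
backward = [[0.0, 1.0], [2.0, 3.0]]
psi_pbl = average_pool(forward, backward)
```

Average pooling gives every retained preference behavior equal weight, in contrast to the session branch, which keeps only the final hidden state h_N.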

Step 4: predicting the user's next commodity of interest through joint learning according to the current purchase demand and the historical preference of the user, matching the prediction in the commodity vector space, finding the several commodities most similar to the prediction result in the commodity vector space, and generating a commodity recommendation sequence.

In the embodiment of the invention, the current purchase demand Ψ_SBL and the historical preference Ψ_PBL of the user are combined through a fully connected layer, and the prediction vector of the user's next commodity of interest is calculated:

wherein the two weights correspond, respectively, to the current purchase demand and the historical preference, and bias represents the model bias.
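
A minimal sketch of this step is given below, assuming the fully connected layer is a weighted sum of the two representations plus a bias and that similarity in the commodity vector space is measured by cosine similarity. The names `W_sbl` and `W_pbl`, the choice of cosine similarity, and the toy vectors are all assumptions; the patent does not reproduce the formula or name the similarity measure.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def predict_vector(W_sbl, psi_sbl, W_pbl, psi_pbl, bias):
    """Assumed fully connected combination: W_sbl * psi_sbl + W_pbl * psi_pbl + bias."""
    a = matvec(W_sbl, psi_sbl)
    b = matvec(W_pbl, psi_pbl)
    return [ai + bi + ci for ai, bi, ci in zip(a, b, bias)]


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def top_k(pred, item_vectors, k):
    """Return the k commodities whose feature vectors are most similar to the prediction."""
    ranked = sorted(item_vectors, key=lambda item: cosine(pred, item_vectors[item]), reverse=True)
    return ranked[:k]


identity = [[1.0, 0.0], [0.0, 1.0]]
pred = predict_vector(identity, [1.0, 0.0], identity, [0.0, 0.5], [0.0, 0.0])
item_vectors = {"x1": [1.0, 0.4], "x2": [0.0, 1.0], "x3": [-1.0, 0.0]}
recommended = top_k(pred, item_vectors, k=2)
```

Because the prediction lives in the same d-dimensional space as the commodity feature vectors from Step 2, the recommendation list is simply a nearest-neighbor lookup around the predicted vector.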

In the model training process, the vector of the commodity the user is actually interested in next is denoted v_{T+1} = (y_1, y_2, ..., y_d). The loss function of the model can be defined as:

where d is the dimension of the vector.
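
The loss formula itself is not reproduced in the text. Since it compares a predicted vector with the true vector v_{T+1} and is parameterized by the dimension d, a mean squared error over the d components is a natural candidate; the sketch below assumes that form and should not be read as the patent's exact definition.

```python
def mse_loss(predicted, target):
    """Assumed loss: mean over the d components of the squared difference."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target vectors must have the same dimension")
    d = len(target)
    return sum((p - y) ** 2 for p, y in zip(predicted, target)) / d


# Predicted next-commodity vector versus the true vector v_{T+1}.
loss = mse_loss([1.0, 0.5], [1.0, 0.0])
```

Minimizing such a loss pulls the predicted vector toward the feature vector of the commodity the user actually interacted with next, which is what the nearest-neighbor matching in Step 4 relies on.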

In the scheme of the embodiment of the invention, the sequence information of different users is divided in time order based on the users' historical behavior records. The scheme is embodied in the method for constructing commodity feature vectors and in the method for modeling user behavior differences: commodity feature vectors are generated with a commodity embedding representation method, the different behaviors of users toward commodities are modeled as differentiated sequences, the current demand and the historical preference of the user are learned separately, and the next commodity the user is interested in is predicted. The method combines the user's historical preference with the current demand, models the different preferences for commodities expressed by different user behaviors, dynamically learns the user's decision process through recurrent neural networks, and generates personalized sequence recommendations for the user, thereby overcoming the lack of dynamics and of personalization in existing methods.

Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A sequence recommendation method based on user behavior difference modeling is characterized by comprising the following steps:

acquiring historical behavior information of a user;

predicting the next interested commodity of the user through combined learning according to the current purchasing demand and the historical preference of the user, matching in a commodity vector space, finding a plurality of commodities which are most similar to the prediction result in the commodity vector space, and generating a commodity recommendation sequence;

wherein performing sequence modeling by using the behavior difference modeling method in combination with the commodity feature vectors, and obtaining the current demand and the historical preference of the user through two different neural network architectures, comprises:

performing session behavior modeling, wherein for the commodity feature vectors P_u = {v_1, v_2, ..., v_N} the corresponding interactive commodity sequence is S_u^b = {(x_1, b_1), (x_2, b_2), ..., (x_N, b_N)}, and an indicator function is defined as:

D_SBL(x_i, x_N) = Φ((N - i) ≤ Ts);

wherein x represents a commodity, its subscript represents the serial number of the commodity, b represents a user operation behavior, N is the length of the interactive commodity sequence, and Φ(a) is a Boolean function whose value is 1 when a is true and 0 otherwise; Ts represents the control time step of the session behavior and is used for controlling the length of the session; x_N is the last commodity in the current interactive commodity sequence S_u^b;

h_t = o_t tanh(c_t)

wherein the bias terms correspond, respectively, to the input gate i_t, the forget gate f_t, the output gate o_t, and the memory cell c_t; h_t is the output of the current state; tanh is the hyperbolic tangent function;

the current purchase demand of the user is expressed as:

Ψ_SBL＝h_N；

in the above operation process, the number of iterations is the same as the number of commodities in the interactive commodity sequence, i.e., t = 1, 2, ..., N; h_N is the output after the last commodity x_N of the sequence is input;

modeling the historical preference of the user, wherein for each commodity-behavior pair (v_i, b_i) ∈ S_u^b of a user, the indicator function is expressed as:

D_PBL(v_i,b_i)＝Φ(b_i∈P)；

wherein P is a set of preference behaviors;

the forward output at the s-th time step is determined by the output of its previous time step, and the backward output is determined by the output of the next time step;

h_s = o_s tanh(c_s)

2. The sequence recommendation method based on user behavior difference modeling as claimed in claim 1, wherein the obtained historical behavior information of the user is data in the form of an interactive commodity sequence, and the interactive commodity sequence of user u is represented as:

S_u^b = {(x_1, b_1), (x_2, b_2), ..., (x_N, b_N)}

wherein x represents a commodity, its subscript is the serial number of the commodity, and b represents the user operation behavior.

3. The sequence recommendation method based on the user behavior difference modeling according to claim 2, wherein the step of calculating the commodity feature vector according to the historical behavior information of the user comprises:

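according to the interactive commodity sequence S_u of user u, establishing commodity feature vectors with the aim of maximizing the following objective:

wherein N is the length of the interactive commodity sequence S_u, and p(x_j | x_i) is defined as the softmax function, which is of the form: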

the above formula is replaced by the following procedure:

where σ(r) = 1/(1 + exp(-r)) is the sigmoid function, and E is the number of negative samples drawn per positive sample;

based on the mode of weighting the occurrence times of the commodity individuals, the formula is defined again as follows:

wherein θ(x_i) is the frequency with which commodity x_i appears in the interactive commodity sequence, and the goal of the commodity embedded representation is then to maximize the following objective function:

4. The sequential recommendation method based on user behavior difference modeling according to claim 1 or 3, wherein the predicting the next interested commodity of the user through the joint learning according to the current purchase demand and the historical preference of the user comprises:

combining the current purchase demand Ψ_SBL and the historical preference Ψ_PBL of the user through a fully connected layer, and calculating a prediction vector of the user's next commodity of interest:

wherein the two weights correspond, respectively, to the current purchase demand and the historical preference, and bias represents the model bias;

in the model training process, the vector of the commodity the user is actually interested in next is denoted v_{T+1} = (y_1, y_2, ..., y_d); the loss function of the model is defined as:

where d is the dimension of the vector.