CN110245299B

CN110245299B - Sequence recommendation method and system based on dynamic interaction attention mechanism

Info

Publication number: CN110245299B
Application number: CN201910533753.1A
Authority: CN
Inventors: 蔡飞; 陈洪辉; 刘俊先; 罗爱民; 舒振; 陈涛; 罗雪山
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2019-06-19
Filing date: 2019-06-19
Publication date: 2022-02-08
Anticipated expiration: 2039-06-19
Also published as: CN110245299A

Abstract

The invention provides a sequence recommendation method based on a dynamic interaction attention mechanism, which comprises the steps of obtaining initial short-term preference and initial long-term preference of a user; obtaining a long-term preference and a short-term preference according to the interactive attention network in combination with the initial short-term preference and the initial long-term preference; and according to the sequence recommendation model, the long-term preference and the short-term preference are combined to score the corresponding articles, and the articles are recommended to the user according to the scoring result. The dynamic interaction attention mechanism network model for sequential recommendation (DCN-SR) constructed by the invention can learn the common dependency representation of long-term and short-term interaction of the user through the model, and combines long-term preference and short-term preference to enable the recommendation result to be more accurate.

Description

Sequence recommendation method and system based on dynamic interaction attention mechanism

Technical Field

The invention belongs to the field of sequence recommendation, and particularly relates to a sequence recommendation method and a sequence recommendation system based on a dynamic interaction attention mechanism.

Background

Recommendation systems are an effective solution to help people deal with increasingly complex information environments. Conventional recommendation systems typically ignore sequential information and focus on mining static correlations between users and items from interactions. For example, a typical conventional recommendation system based on matrix factorization may learn from the entire interaction history to effectively model the general preferences of a user, but it does not model the user's short-term, sequential interaction behavior. Unlike the conventional recommendation system, the sequential recommendation system predicts the next object that may be interested by the user according to the interaction history of the user over a period of time.

The modeling method of the existing sequential recommendation system mainly comprises a Markov chain and a Recurrent Neural Network (RNN). For example, decomposing a personalized Markov chain (FPMC) model combines Markov chains with matrix decomposition to achieve good recommendation performance. Also included in the prior art is a Hierarchical Representation Model (HRM) model that extends the idea of FPMC by building representations of users and items using a two-layer structure. The markov chain based approach can only model the local sequence pattern between every two adjacent interactions. RNN-based models can effectively model multi-step sequential behavior. The Hierarchical Recurrent Neural Network (HRNN) model and the dynamic bAsket model (DREAM) model combine the long and short term tastes of users. Significant improvement was achieved compared to both HRM and FPMC. However, these methods do not capture the change in relative importance of the user's long-term preferences versus short-term preferences.

Disclosure of Invention

The invention aims to provide a sequence recommendation method and a system thereof based on a dynamic interaction attention mechanism, so as to solve the technical problems in the prior art.

The invention discloses a sequence recommendation method based on a dynamic interactive attention mechanism, which comprises the following steps:

acquiring initial short-term preference and initial long-term preference of a user;

obtaining a long-term preference and a short-term preference according to the interactive attention network in combination with the initial short-term preference and the initial long-term preference;

and according to the sequence recommendation model, the long-term preference and the short-term preference are combined to score the corresponding articles, and the articles are recommended to the user according to the scoring result.

Preferably, after the corresponding article is scored, the scoring result needs to be corrected through a loss function, where the loss function is:

indicating the degree of preference, y, of the predicted user u for item i_uiRepresenting the real result and V representing the number of all items.

Preferably, the way to obtain the initial short-term preference of the user is through a context-based GRU model: the CGRU performs initial short-term preference queries, the CGRU being:

z_t＝σ(W_zx_t+V_za_t+U_zh_t-1)

r_t＝σ(W_rx_t+V_ra_t+U_rh_t-1)。

W_z，V_z，U_z，U_z，W_r，V_r，U_rw, V, U are model parameters, which can be obtained by training, x_tVector representation representing items input at time t, a_tThe final calculated h is the vector representation of the behavior action_tσ represents a sigmoid function and tan h represents a tangent function, which are hidden layer state representations at time.

Preferably, the method for obtaining the initial long-term preference of the user is to use a multi-layer perceptron to model the long-term preference of the user, and the modeling process of the multi-layer perceptron is as follows:

W_mand b_mParameters representing the m-th layer perceptron, phi the activation function, X_iDenotes x_iThe final output state vector.

Preferably, the interactive attention network is:

H^l＝tan h[W_lU_l+(W_sU_s+W_th_s,T)C^T]

H^s＝tan h[W_sU_s+W_th_s,T+(W_lU_l)C]

U_land U_sIs the user's initial long and short term state vector representation, W_c,W_l,W_s,W_t,W_hl,W_hsAs model parameters, can be obtained by training, h_s,TRepresenting the last hidden layer state vector of the CGRU output, a of the final output_lAnd a_SA weight of attention that is a long-short term preference of the user.

Preferably, the short-term preference and the long-term preference of the user are respectively:

U_co-sand U_co-lIndicating passing of interactive attentionAnd (4) making the final long-term and short-term preference representation of the user obtained by calculation.

Preferably, the manner of scoring the pairs of articles is:

B_l，B_s，B^Tall model parameters can be obtained through training and finally output

Indicating the degree of preference of user u for all items.

The invention also provides a sequence recommendation system based on the dynamic interaction attention mechanism, which is based on the method, and comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.

The beneficial effects of the invention include:

1. the invention constructs a dynamic interaction attention mechanism network model for sequential recommendation (DCN-SR), and the model can learn the common dependency representation of long-term and short-term interaction of a user and combine the relationship between the long term and the short term to ensure that the recommendation result is more accurate.

2. The context-gated cyclic unit CGRU is constructed, contains different types of short-term user action behaviors, and can better estimate the next interest preference of the user.

Drawings

FIG. 1 is a frame diagram of a sequence recommendation method based on a dynamic interaction attention mechanism according to the present invention;

FIG. 2 is a CGRU framework diagram of the present invention;

FIG. 3 is a schematic diagram of an interactive attention network of the present invention;

FIG. 4 is a comparison graph of the recommendation effect of different models at different session lengths according to the preferred embodiment of the present invention;

FIG. 5 is a diagram illustrating a comparison of recommendation effects of different models for different lengths of history information according to an embodiment of the present invention;

fig. 6 is a schematic diagram illustrating the influence of the session importance on the recommendation effect in the preferred embodiment of the present invention.

Detailed Description

Example 1:

the sequence recommendation is to predict the next object possibly interested by the user according to the interaction history of the user in a period of time. A reasonably efficient sequence recommendation model should be able to efficiently combine the long-term and short-term preferences of the user. The existing method always regards the long-term interest preference of the user as a static fixed vector, and the long-term preference of the user is not changed when different dynamic preferences of the user are processed. But this is not reasonable because in the face of different dynamic preferences of the user, their long-term preferences should have different importance. The relative importance of events in a user's long-term interaction history depends on the events in their short-term interaction history, and vice versa. If a user has searched for a camera in the current session, the long-term interaction with the electronic product should be more important than her clothing-related interaction in the user's long-term search history when deciding what to recommend next. Conversely, if the user's past interactions indicate a greater interest in the Sony brand, then during the current session, interactions related to that brand may be more important than other interactions in predicting the next recommended item. At the same time, different user actions (e.g., clicking, adding to a shopping cart, or purchasing) provide different types of information about the user's interests. Clicking on a camera may indicate that the current recommendation is unsatisfactory and thus may proceed to recommend another camera, whereas if a user purchases a camera, we should later recommend some camera related products instead of the camera, e.g. memory cards, etc. Based on this consideration, we propose a recommendation model DCN-SR based on a dynamic interaction attention mechanism, which employs an interactive attention network to model the interaction between the user's long-term and short-term preferences, and includes the following three parts:

(1) short-term preferences of the user are modeled. The present invention uses a context-gated cycle unit (CGRU) network to incorporate information contained in user operations. The short-term preference is expressed as a combination of the interaction states hidden in the current session.

(2) A multi-layered perceptron (MLP) is used to process the user's historical interactions and infer its general preferences.

(3) Using the outputs of the first two parts, we apply an interaction attention mechanism network to capture the interaction between the user's behaviors in the long-term and short-term interaction histories and generate a co-dependent representation of their long-term and short-term preferences. We compute a recommendation score for each candidate based on these co-dependent representations.

Referring to fig. 1, fig. 1 is a block diagram of the method of the present invention, comprising the three components described above: short term query preferences, long term query preferences, interactive attention networks.

Referring to fig. 2, the present invention considers different behaviors of users and the time-sequence characteristics of the behaviors, and therefore proposes a context-based GRU model: CGRU. Namely, the behavior information is added to the input gate, the forgetting gate and the updating gate of the GRU, and the CGRU is expressed as follows:

z_t＝σ(W_zx_t+V_za_t+U_zh_t-1)

r_t＝σ(W_rx_t+V_ra_t+U_rh_t-1)

W_z，V_z，U_z，U_z，W_r，V_r，U_rw, V, U are model parameters, which can be obtained by training, x_tVector representation representing items input at time t, a_tThe final calculated h is the vector representation of the behavior action_tIs a hidden layer state representation of the time of day. σ denotes a sigmoid function, and tan h denotes a tangent function.

This may result in a hidden layer representation for each interaction representing the user's initial short-term preference with their set.

The long-term preferences of the user are modeled using a multi-tier perceptron, where we use only the items that the user has collected and purchased, as these behaviors are more representative of the user's preferences. The modeling process of the multilayer perceptron is as follows:

W_mand b_mRepresenting the parameters of the m-th layer perceptron and phi the activation function. X_iDenotes x_iThe final output state vector.

The multilayer perceptron modeling is adopted mainly because the multilayer perceptron modeling has good nonlinear modeling capability and is widely applied to a collaborative filtering method.

It is beneficial to combine the short-term and long-term preferences of the user when making recommendations. However, conventional approaches treat these two types of preferences as independent, ignoring the (potential) interdependencies between them. Furthermore, conventional attention mechanisms assign weights to events in the user's history and recent interactions, respectively. Historical interactions and recent interactions should provide context for each other in calculating the importance of each event. Therefore, the present invention designs an interactive attention network to model the long and short term interest preferences of users. As shown in fig. 3. The interactive attention network is:

H^l＝tan h[W_lU_l+(W_sU_s+W_th_s,T)C^T]

H^s＝tan h[W_sU_s+W_th_s,T+(W_lU_l)C]

U_land U_sIs the user's initial long and short term state vector representation, W_c,W_l,W_s,W_t,W_hl,W_hsAs model parameters, can be obtained by training, h_s,TRepresenting the last hidden layer state vector output by the CGRU. A of final output_lAnd a_sA weight of attention that is a long-short term preference of the user.

And finally, respectively obtaining the short-term preference and the long-term preference of the user as follows:

U_co-sand U_co-lThe final user long-short term preference representation obtained through the calculation of the interactive attention mechanism is represented.

In the prediction stage, the long-term and short-term interest preferences of the user are combined to score the corresponding articles:

Indicating the degree of preference of user u for all items.

Wherein the loss function at training is as follows:

Comparing the DCN-SR model provided by the invention with GRURec and NARM models, the DCN-SR model can be well popularized to the GRURec and NARM models.

GRU4Rec represents the user's interest preference with the last hidden layer state of the GRU, namely:

h_T＝GRU_sess(v_T,h_T-1)。

GRU_sessdenotes a GRU calculation unit, v_TRepresenting the input state vector at time t.

When the DCN-SR model does not consider the user's long-term retrieval history and behavior information, the DDCN-SR model can simplify:

h_s,Trepresenting the last hidden lamina vector of the CGRU. v. of_iRepresenting the ith item State vector representation

For NARM, it uses the attention mechanism and combines the attention mechanism-based vector and the last GRU hidden layer vector to form the user's interest preference representation:

a_iindicating the attention weight of the ith item.

When the DCN-SR model does not consider the long-term history of the user and simultaneously sets reasonable parameters and activation functions, the following results can be obtained:

therefore, the DCN-SR model is a relatively general model.

In conclusion, the invention designs a dynamic interaction attention mechanism network model for sequential recommendation (DCN-SR), which can learn the co-dependent representation of the long-term and short-term interactions of a user and utilize the relationship between her long-term and short-term; the present invention designs a context-gated loop unit CGRU to incorporate different types of short-term user action behavior in order to better estimate the user's next interest preferences.

Example 2:

in this embodiment, a heaven and cat electronic commerce data set and a heaven and battery electronic commerce data set are selected, and specific information of the data set is shown in table 1 below:

TABLE 1

The following models were selected as comparison targets in this experiment: item-pop is a method of sorting items according to their number of interactions and then recommending according to the sort, which is a non-personalized method. FPMC is a recommendation method based on markov chains and collaborative filtering. GRU4Rec, a recommendation model based on RNN and query sessions, is a non-personalized approach. The HRNN is used for a personalized recommendation method based on query sessions, adopts a layered RNN-structure, namely a session-level RNN and a user-level RNN, and is used for simulating short-term and long-term preferences of a user. NARM is based on a model of RNN, applying an attention mechanism to capture user preference information, but it is also a non-personalized approach. STAMP is a query recommendation method that combines a memory network and an attention mechanism.

The overall effect of these several reference models and the DDCN-SR model proposed by the present invention is shown in table 2 below:

table 2:

as can be seen from Table 2, the RNN-based methods, i.e., Item-pop and FPMC, are superior to the conventional methods. At the same time, HRNN's results are only higher than RNN-based methods, such as GRU4Rec, which means that combining the user's history with recent interactions can help improve recommendation performance. In addition NARM and STAMP also showed improvement on HRNN. STAMP achieves better performance than others. We therefore used STAMP as the best benchmark for comparison in later experiments. Next, comparison was made with the DCN-SR model. In Recall @10, both NARM and STAMP are weak DCN-SR. This indicates that applying the dynamic interactive attention network helps to improve the recommendation effect. This is because the dynamic interaction attention network can capture the relationship between the user's history and recent interactions. Improvement of DCN-SR over the best reference model: in Recall @10, the data set for the skatecat was 2.58% and the data set for the skull was 3.08%. The improvement in the day cat dataset at MRR @10 was 3.78% and the improvement in the day pool dataset at MRR @10 was 4.05%.

To demonstrate the utility of the CGRU network, the different behaviors of the user were treated as search Contextual meetings in a short time, and the recommended performance of DCN-SR, DCN-SRGRU (with simple GRU network) working and DCN-SR CGRU (with context GRU network), was examined under different settings. Table 3 compares its performance to the best baseline model (STAMP):

TABLE 3

Experimental results show that the model of the invention has better recommendation effect all the time. For different numbers of recommended items, it can be seen that the overall performance of Recall and MRR improves, with the size of the recommendation list ranging from 5 to 15.

The short-term conversation of the user is divided into short, medium and long according to different lengths, and then the recommendation effects of all models are compared. The comparison results are shown in FIGS. 4(a) to 4 (d). From fig. 4 we can see that the performance of all models improves on the tianmao dataset as the session length increases. The DCN-SR model always achieves the highest score for different session lengths on the two data sets. Specifically, for Recall @10, STAMP is superior to NARM, except for short-term sessions; this may be because short-term sessions contain less information than long sessions, and therefore location and order information may be provided based on the RNN model, i.e., NARM. STAMP lacks such position and sequence information.

The user history information is divided into eight types according to different lengths, and then recommendation effects of all models are compared. The comparison results are shown in FIGS. 5(a) to 5 (d). From fig. 5 the best performance can be achieved with DCN-SR, all eight groups in both datasets. For the day cat dataset, as the user's historical number of interactions increased, the performance of all models began to fluctuate first, but showed an overall upward trend.

For the sky-pool dataset, the performance of all models will degrade, in terms of two metrics. However, DCN-SR and HRNN descent speeds were slower than STAMP and NARM models, consistent with our findings shown in FIG. 5(a) and FIG. 5 (b). These results demonstrate that the DCN-SR model can improve the recommendation effect in combination with the advantages of long-term and short-term interest preferences.

To illustrate the role of the common focus mechanism, the present embodiment presents an example of two users. We randomly chosen two sessions from the test set of the day pool dataset because the day pool dataset contains category information of the items, which helps us to evaluate the recommendation effect to some extent. In fig. 6, the depth of the color indicates the importance of the event, and the darker the color, the more important the event. The number above the column is the category of the corresponding item. As shown in fig. 6, first, although two sessions of a single user share the same historical interactions, the historical interactions are weighted differently. In addition, some items have greater attention in the same category as the target item. Secondly, the interactions in the session also have different weights for predicting the user's preferences, which indicates that the DCN-SR can select important events and ignore unexpected interactions. Third, there are some important interactions in the session that are not near the user's last click. For example, in Session2, the sixth event is more important than the last event. This may be due to a shift in the interest of the user. However, DCN-SR may also give them higher weight.

In summary, the DCN-SR applies a common interest network to capture the relationship between the long-term and short-term of the user and generate a common dependency representation of the user that represents the long-term and short-term preferences. The model is tested on two electric quotient data sets, and the result shows that the method has better effect than the existing method, on the Tmax data set, the Recall @10 is improved by 2.58%, the MRR @10 is improved by 3.78%, on the Tianchi data set, the Recall @10 is improved by 3.08%, and the MRR @10 is improved by 4.05%. The improvement of the DCN-SR model on short-session and active users is more obvious in the sensitivity and stability of the model.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A sequence recommendation method based on a dynamic interaction attention mechanism is characterized by comprising the following steps:

combining the initial short-term preference and the initial long-term preference to derive a long-term preference and a short-term preference according to an interactive attention network;

according to a sequence recommendation model, combining the long-term preference and the short-term preference to score corresponding articles, and recommending the articles for the user according to a scoring result;

the way to obtain the initial short-term preference of the user is through a context-based GRU model: performing the initial short-term preference query by a CGRU, the CGRU being:

z_t＝σ(W_zx_t+V_za_t+U_zh_t-1)

r_t＝σ(W_rx_t+V_ra_t+U_rh_t-1)

wherein, W_z，V_z，U_z，W_r，V_r，U_rW, V and U are model parameters and are obtained by training, x_tVector representation representing items input at time t, a_tThe final calculated h is the vector representation of the behavior action_tRepresenting hidden layer state at moment, wherein sigma represents a sigmoid function, and tan h represents a tangent function;

the method for obtaining the initial long-term preference of the user is to adopt a multi-layer perceptron to model the long-term preference of the user, and the modeling process of the multi-layer perceptron is as follows:

wherein, W_mAnd b_mParameters representing the m-th layer perceptron, phi the activation function, tan h the tangent function, x_iIndicating the ith item, X, in the user's long-term interaction_iDenotes x_iThe final output state vector of (2);

the interactive attention network is:

H^l＝tan h[W_lU_l+(W_sU_s+W_th_s，T)C^T]

H^s＝tan h[W_sU_s+W_th_s，T+(W_lU_l)C]

wherein, U_lAnd U_sIs the user's initial long and short term state vector representation,

transposed matrix, W, representing a representation of a user's long-term interactive items_c，W_l，W_s，W_t，W_hl，W_hsObtained by training as a model parameter, h_s，TThe last hidden layer state vector, representing the CGRU output, CT represents the transpose of the correlation matrix between long and short term interactions of the user,

represents a weight parameter for learning the importance of the user's long-term interactive items,

a representing a weight parameter for learning the importance of user short-term interactive articles, and a finally output_lAnd a_SAttention weight that is a long-short term preference of the user;

the short-term preference and long-term preference of the user are respectively:

wherein, U_co-sAnd U_co-lRepresenting calculation through an interactive attention mechanismThe resulting final user long-short term preference representation, T represents the number of items in the user's short term interaction,

representing attention weight, h, of the t-th item in the user's short-term interaction_s，tA representation representing the t-th item in the user's short-term interaction, N representing the number of items in the user's long-term interaction,

representing the attention weight, X, of the nth item in the user's long-term interaction_nA representation representing the nth item in the user's long-term interaction output by the multi-tier perceptron;

the mode of scoring the article pairs is as follows:

wherein, B_l，B_s，B^TAll the parameters are obtained by training the model parameters,

representing a score on item i predicted from the user's long-term preference,

representing a predicted score on item i based on the user's short-term preferences,

represents the score on item i of the final prediction, sigma represents the sigmoid function,

middle v_iRepresents the ith item state vector, T represents the number of items in the user's short-term interaction, U_longIndicating the user's long-term preference, U_shortIndicating short-term preferences of the user, finally output

Indicates the preference degree of the user U for all the items, U_longBy U_co-lIs calculated to obtain U_shortBy U_co-sConcatenated with the representation of the last item in the short-term interaction.

2. The sequence recommendation method based on the dynamic interaction attention mechanism as claimed in claim 1, wherein the scoring result is further modified by a loss function after the corresponding item is scored, and the loss function is:

3. A sequence recommendation system based on a dynamic interactive attention mechanism, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of the preceding claims 1 or 2 when executing the computer program.