CN113902518A - Depth model sequence recommendation method and system based on user representation - Google Patents
- Publication number: CN113902518A
- Application number: CN202111107990.5A
- Authority
- CN
- China
- Prior art keywords
- term
- sequence
- long
- user
- short
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
The invention provides a depth model sequence recommendation method and system based on user representation. The method comprises the following steps: respectively acquiring a long-term sequence reflecting the user's overall preference and a short-term sequence reflecting the user's current dynamic preference; and obtaining a recommendation result through the depth sequence recommendation model MLUR. The model comprises a short-term representation learning module, a long-term representation learning module, and a gating fusion module that combines the two. A gating module based on a multi-layer perceptron (MLP) decides the contribution ratio of the long-term and short-term representations by considering the relationships among hot-selling items, recently interacted items, and long-term preferences. By taking hot-selling item information into account, the module balances the long-term and short-term representations so that dynamics in the user's intent can be handled.
Description
Technical Field
The disclosure relates to the technical field of data mining and recommendation, and in particular to a depth model sequence recommendation method and system based on user representation.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Information overload means that, because a user's knowledge and cognitive capacity are limited, the user facing massive and complex network information cannot quickly and accurately find the required information, or even understand and use it. Given such huge amounts of data, how to capture the data useful to the user is therefore a central concern. Recommendation systems emerged to solve the problem of information overload, helping consumers find the products they need in a vast sea of goods and providing them with a personalized shopping experience. Recommendation systems are increasingly popular with users and are receiving growing attention from researchers and e-commerce websites. In recent years especially, sequential recommendation systems have attracted more and more attention as an emerging topic, because they describe the characteristics of consumers more accurately and can therefore recommend more precisely, individually, and dynamically. Although current sequential recommendation systems have achieved certain results, the uncertainty of users' shopping behavior means that user-item interactions contain noise and irrelevant interactions that interfere with next-step prediction, so further improvement is needed.
A sequential recommendation system mainly models the sequential dependencies in user-item interactions to recommend items that may interest a user. From a technical point of view, its methods fall into three categories. The first is traditional sequence models: these, including sequential pattern mining and Markov chains, are relatively simple and intuitive solutions that exploit the natural advantages of sequential recommendation to establish sequential dependencies among the user-item interactions in a sequence; however, sequential pattern mining typically generates a large number of redundant patterns, which wastes time and space. The second is latent representation models, which first learn a latent representation of each user or item and then use the learned representations to predict subsequent user-item interactions; more implicit and complex dependencies are thereby captured in a latent space, which greatly improves recommendation quality. The third is deep neural networks for recommendation, which can be divided into two subcategories: sequential recommendation systems based on basic deep neural networks, and those based on neural networks combined with advanced models. The basic neural networks include sequential recommendation systems based on recurrent neural networks, convolutional neural networks, and graph neural networks.
To overcome the limitations of sequential recommendation systems built on basic neural network structures, advanced models such as attention models, memory networks, and hybrid models are usually combined with a basic deep neural network to build a more powerful sequential recommendation system. In general, memory networks have shown potential in sequential recommendation but have not been fully studied. Hybrid-model sequential recommendation systems combine different models, each good at capturing a different kind of dependency, which strengthens the whole model's ability to capture various dependencies and yields better recommendations. However, these models are still at an early stage.
Although deep learning has demonstrated its potential in sequential recommendation, previous work has rarely considered how the uncertainty of user intent caused by an uncertain external environment affects recommendation; this question has not been fully studied, so a depth model sequence recommendation method and system based on user representation is needed.
Disclosure of Invention
The present disclosure provides a depth model sequence recommendation method and system based on user representation to solve the above problems. The disclosed deep-learning-based method can effectively process long-term sequences and balances a user's long-term and short-term preferences while accounting for the uncertainty of user intent.
According to some embodiments, the following technical scheme is adopted in the disclosure:
a depth model sequence recommendation method based on user representation comprises the following steps:
respectively acquiring a long-term sequence of the overall preference of a user and a short-term sequence of the current dynamic preference of the user;
obtaining a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
Further, in the short-term representation learning module, item transitions of users in the short-term sequence are captured through a self-attention sequence recommendation model using a hierarchical self-attention network.

Further, the long-term representation learning module performs modeling with recurrent neural networks (RNNs), which are well suited to capturing sequential patterns.

Further, the recurrent neural network updates its hidden state as follows:

h_t = g(x_t W + h_{t-1} A);

where g is an activation function, x_t is the user's currently interacted item in the long-term sequence, and h_{t-1} is the previous hidden state.

Further, in the long-term representation learning module, a gated recurrent unit (GRU) is used as the recurrent unit for sequence recommendation.

Further, the GRU output is specifically:

h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h′_t;

where z_t and r_t denote the update and reset gates respectively, h′_t is the candidate state, h_t is the output hidden state, W_z, W_r, W_h, A_z and A_r are learnable weight matrices, σ and tanh are activation functions, and + and ⊙ denote element-wise addition and element-wise multiplication respectively.

Further, in the long-term representation learning module, a time-interval context is introduced to encode the long-term sequence, and the evolution of the time-interval context is modeled through a GRU network.

Further, in the gated fusion module, the weights of the short-term representation and the long-term representation are computed with an item-similarity gated fusion model to balance the contributions of the long-term and short-term sequences.

Further, in the gated fusion module, the MLUR is trained with an Adam optimizer by minimizing the binary cross-entropy loss.
a user representation-based depth model sequence recommendation system, comprising:
the sequence acquisition module is configured to respectively acquire a long-term sequence of the overall preference of the user and a short-term sequence of the current dynamic preference of the user;
the depth sequence recommendation module is configured to obtain a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
Compared with the prior art, the beneficial effect of this disclosure is:
the present disclosure can effectively process long-term sequences with a deep learning based approach, and balance the long-term and short-term preferences of the user using uncertainty of the user's intent; the contribution ratio of the long-term and short-term representation is decided by considering the relationship between hot-market items, recently interacted items and long-term preferences through a multi-layer perceptron MLP-based gating module. The module balances the long-term and short-term representations by taking into account information of the hot-sell item, so that the user's intent dynamics can be handled simultaneously.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is an architectural diagram of the present embodiment;
FIG. 2 is a network configuration diagram of the long-term preference representation learning of the present embodiment.
Detailed Description
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example 1.
The present invention is a depth model for sequence recommendation based on long-term and short-term user representations. The structure of the proposed depth model MLUR comprises three parts: short-term representation learning, long-term representation learning, and a gating model that balances the two representations.
As shown in fig. 1, a depth model sequence recommendation method based on user representation includes the following steps:
respectively acquiring a long-term sequence of the overall preference of a user and a short-term sequence of the current dynamic preference of the user;
obtaining a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
Further, in the short-term representation learning module, item transitions of users in the short-term sequence are captured through a self-attention sequence recommendation model using a hierarchical self-attention network.

Further, the long-term representation learning module performs modeling with recurrent neural networks (RNNs), which are well suited to capturing sequential patterns.

Further, the recurrent neural network updates its hidden state as follows:

h_t = g(x_t W + h_{t-1} A);

where g is an activation function, x_t is the user's currently interacted item in the long-term sequence, and h_{t-1} is the previous hidden state.

Further, in the long-term representation learning module, a gated recurrent unit (GRU) is used as the recurrent unit for sequence recommendation.

Further, the GRU output is specifically:

h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h′_t;

where z_t and r_t denote the update and reset gates respectively, h′_t is the candidate state, h_t is the output hidden state, W_z, W_r, W_h, A_z and A_r are learnable weight matrices, σ and tanh are activation functions, and + and ⊙ denote element-wise addition and element-wise multiplication respectively.

Further, in the long-term representation learning module, a time-interval context is introduced to encode the long-term sequence, and the evolution of the time-interval context is modeled through a GRU network.

Further, in the gated fusion module, the weights of the short-term representation and the long-term representation are computed with an item-similarity gated fusion model to balance the contributions of the long-term and short-term sequences.

Further, in the gated fusion module, the MLUR is trained with an Adam optimizer by minimizing the binary cross-entropy loss.
in particular, the method comprises the following steps of,
1. Short-term representation learning module.

In short-term representation learning, following the self-attention sequence recommendation model SASRec, a hierarchical self-attention network is used to capture item transitions of users in the short-term sequence.
Embedding layer. First, the short-term sequence S^u is taken as the input sequence, and M is a learnable d-dimensional item embedding matrix, so the input sequence can be represented as an embedding matrix E.

To capture the influence of position, a learnable position embedding matrix P is added to the input matrix E, obtaining the input of the self-attention network:

X^(0) = E + P (1)
self-attention block. Then, the sequence X is added(0)Self-attention blocks (SABS) are provided into a series of self-attention blocks, where the output of the b-th block is:
X(b)=SABS(b)(X(b-1)),b∈{1,2,…,B} (2)
ignoring the normalization layer for residual concatenation, each self-attention block can be represented as a self-attention layer SAL with a feed-forward layer FFL (-):
SABS(X)=FFL(SAL(X)) (3)
FFL(X′)=ReLU(X′W(1)+b(1))W(2)+b(2) (5)
x is the input matrix for position sensing, Q ═ XWQ,K=XWKAnd V ═ WV,WQ,WKAndthe projection query, key, and value matrix are separately represented to increase flexibility. Note that W is(1),And b(1),Is the weight and bias of the two-layer convolution. The normalization and dropout layers used in this module are the same as in the Attention mechanism approach.
In this module, the output vector of the last self-attention block, X^(B), is denoted j and taken as the representation of the short-term behavior sequence; it reflects the user's current needs.
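The block structure above can be sketched in NumPy as follows. This is a minimal single-head illustration, assuming residual connections, layer normalization, and dropout are omitted, and all weight shapes and the toy input are made up for demonstration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_block(X, Wq, Wk, Wv, W1, b1, W2, b2):
    """One block: scaled dot-product self-attention (SAL) followed by a
    two-layer feed-forward network (FFL); residuals/normalization omitted."""
    d = X.shape[-1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))              # attention weights over positions
    sal = attn @ V                                     # SAL(X)
    return np.maximum(0.0, sal @ W1 + b1) @ W2 + b2    # FFL: ReLU(X'W1+b1)W2+b2

# Toy usage: a short-term sequence of 4 items embedded in d=8 dimensions.
rng = np.random.default_rng(0)
seq_len, d = 4, 8
X = rng.normal(scale=0.1, size=(seq_len, d))           # stands in for X^(0) = E + P
Wq, Wk, Wv, W1, W2 = (rng.normal(scale=0.1, size=(d, d)) for _ in range(5))
b1 = b2 = np.zeros(d)
X_out = self_attention_block(X, Wq, Wk, Wv, W1, b1, W2, b2)
j = X_out[-1]                                          # short-term representation
print(X_out.shape)
```

Stacking B such blocks, with the output of each fed into the next, gives the hierarchical self-attention network described above.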
2. Long-term representation learning module.

Owing to the outstanding ability of recurrent neural networks (RNNs) to capture sequential patterns, they have achieved great success in user modeling. The sequence update procedure can be expressed as:

h_t = g(x_t W + h_{t-1} A) (6)

where g is an activation function, x_t is the user's currently interacted item in the long-term sequence, and h_{t-1} is the previous hidden state. The long short-term memory network (LSTM) is a popular RNN variant, and the gated recurrent unit (GRU) is a simplified version of the LSTM that mimics its structure while preserving its properties and works well, making it a very popular network today. The present invention uses the GRU network as the recurrent unit for sequence recommendation; the GRU can be described specifically as:
z_t = σ(W_z x_t + A_z h_{t-1}) (7)

r_t = σ(W_r x_t + A_r h_{t-1}) (8)

h′_t = tanh(W_h x_t + r_t ⊙ A_h h_{t-1}) (9)

h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h′_t (10)

where z_t and r_t denote the update and reset gates respectively, h′_t is the candidate state, h_t is the output hidden state, W_z, W_r, W_h and A_z, A_r, A_h are learnable weight matrices, σ and tanh are the activation functions, and + and ⊙ denote element-wise addition and element-wise multiplication respectively.
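A minimal NumPy sketch of one GRU step ending in the update of Eq. (10); the gate formulas follow the standard GRU, and the weight shapes and toy sequence are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Wr, Wh, Az, Ar, Ah):
    """One GRU update: update gate z_t, reset gate r_t, candidate state h'_t,
    then h_t = z_t * h_prev + (1 - z_t) * h'_t."""
    z = sigmoid(x_t @ Wz + h_prev @ Az)               # update gate
    r = sigmoid(x_t @ Wr + h_prev @ Ar)               # reset gate
    h_cand = np.tanh(x_t @ Wh + r * (h_prev @ Ah))    # candidate state
    return z * h_prev + (1.0 - z) * h_cand

# Toy usage: fold a long-term sequence of 5 interacted-item embeddings
# into a single hidden state, used below as the long-term representation.
rng = np.random.default_rng(1)
d = 6
Wz, Wr, Wh, Az, Ar, Ah = (rng.normal(scale=0.1, size=(d, d)) for _ in range(6))
h = np.zeros(d)
for x_t in rng.normal(scale=0.1, size=(5, d)):
    h = gru_step(x_t, h, Wz, Wr, Wh, Az, Ar, Ah)
k = h                                                  # long-term representation
print(k.shape)
```

Because h_t is a convex combination of the previous state and a tanh-bounded candidate, the hidden state stays in (-1, 1) regardless of sequence length.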
From a long-term perspective, the relationship between two interactions in a user behavior sequence is related not only to their relative positions but is also highly influenced by the time information between items in the sequence: a short interval between two items has a greater impact on the user than a long interval, and the time intervals between different items differ. The present invention therefore introduces a time-interval context to encode the long-term sequence. The temporal context c_t occurs between two adjacent items, and the invention models its evolution p_t through the GRU network as follows:

z_t = σ(W_z c_t + A_z p_{t-1}) (11)

r_t = σ(W_r c_t + A_r p_{t-1}) (12)

p′_t = tanh(W_p c_t + r_t ⊙ A_p p_{t-1}) (13)

p_t = z_t ⊙ p_{t-1} + (1 − z_t) ⊙ p′_t (14)
In the GRU, the recurrent connection weight matrix A propagates the sequence signal between the successive hidden representations p_t and p_{t-1}. Instead of using A, p_t is converted into M_t through an intermediate feed-forward layer F, lifting the low-dimensional hidden representation p_t to the higher-dimensional representation M_t. The two GRUs are then joined into one network, as shown in FIG. 2, so the dynamic d-dimensional hidden user state h_t can be expressed as follows:

M_t = σ(p_t F) (15)

h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h′_t (18)

where the intermediate gate and candidate-state computations (16)-(17) take the same form as Eqs. (11)-(13), with M_t modulating the recurrent term. The converted temporal context representation M_t is used to propagate the signal between h_{t-1} and h_t; replacing A with M_t is mainly intended to incorporate the constantly changing dynamic time-interval context.

Therefore, the last output hidden state h_t is taken as the long-term sequence representation, denoted k = h_t.
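The interval-context recurrence of Eqs. (11)-(14), followed by the feed-forward lift M_t = σ(p_t F), can be sketched as below. The dimensions and the interval-context embeddings are illustrative assumptions, and the coupling of M_t back into the item-level GRU is omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def interval_gru_step(c_t, p_prev, Wz, Wr, Wp, Az, Ar, Ap):
    """Evolve the time-interval context state p_t (Eqs. 11-14)."""
    z = sigmoid(c_t @ Wz + p_prev @ Az)               # Eq. (11): update gate
    r = sigmoid(c_t @ Wr + p_prev @ Ar)               # Eq. (12): reset gate
    p_cand = np.tanh(c_t @ Wp + r * (p_prev @ Ap))    # Eq. (13): candidate state
    return z * p_prev + (1.0 - z) * p_cand            # Eq. (14)

# Toy usage: encode 4 consecutive time-interval contexts, then lift p_t
# to the higher-dimensional temporal representation M_t.
rng = np.random.default_rng(2)
d_p, d_m = 5, 8                                        # low-dim p_t, higher-dim M_t
Wz, Wr, Wp, Az, Ar, Ap = (rng.normal(scale=0.1, size=(d_p, d_p)) for _ in range(6))
F = rng.normal(scale=0.1, size=(d_p, d_m))             # feed-forward lift layer
p = np.zeros(d_p)
for c_t in rng.normal(scale=0.1, size=(4, d_p)):       # interval-context embeddings
    p = interval_gru_step(c_t, p, Wz, Wr, Wp, Az, Ar, Ap)
M_t = sigmoid(p @ F)                                   # M_t = sigma(p_t F)
print(M_t.shape)
```

M_t is then used in place of the fixed recurrent matrix when propagating the signal between successive user hidden states, so the time-interval dynamics flow into the long-term representation.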
3. Gated fusion module.
To better combine the short-term and long-term representations, one may naturally think of concatenating the two parts or performing a weighted summation. These methods rest on the ideal assumption that all of the user's preferences are reflected in the historical sequence. In the real world, however, the user's intent can be influenced by many factors and become uncertain. For example, the SDM model proposed by the Alibaba group, after modeling the user's long-term and short-term behavior representations separately, also combines the long-term and short-term behaviors in order to represent user behavior more accurately. When combining them, SDM designs a gate neural network that takes not only the user's long-term and short-term behavior representations as inputs but also the user's own profile representation as a common input; the resulting gate vector then determines the contribution percentages of the long-term and short-term parts at time t. The model of the present invention shares the gated fusion idea of SDM, taking the user's long-term and short-term preference representations as inputs of the gate network while also considering other factors that influence recommendation: SDM considers the user's own profile, whereas the present invention considers the uncertainty of the user's intent.
To address the uncertainty of user intent in sequence recommendation, the invention designs a gating network based on a multi-layer perceptron. The invention focuses on how hot-selling items h_i ∈ I cause changes in user intent and, inspired by FISSA (fusing item similarity models with self-attention networks), proposes an item-similarity gated fusion model that computes the weights of the short-term and long-term representations to balance the contributions of the long-term and short-term sequences.
Specifically, the weights are computed with the hot-selling item similarity gating function HISG by modeling the similarity among the hot-selling item h_i, the most recently interacted item m_{s_l}, and the long-term history representation k, written as a single-layer gated MLP as follows:

G = HISG(h_i, m_{s_l}, k) = σ(W_g[h_i; m_{s_l}; k] + b_g) (19)

where HISG(·) is the hot-selling item gating function, [·] denotes the concatenation of the three elements, and W_g and b_g represent the weights and biases to be learned. Using the sigmoid function σ(ζ) = 1/(1 + e^{−ζ}) as the activation function limits the value of G to between 0 and 1.
The final representation of the user behavior sequence at step l is obtained as a weighted sum of the corresponding short-term and long-term representations, as follows:
o_l = G ⊙ j + (1 − G) ⊙ k (20)
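A sketch of this gated fusion follows. The single-layer gate over the concatenated hot-selling item, recent item, and long-term representation is an assumed form based on the description above, and all dimensions are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_hot, m_recent, k_long, j_short, Wg, bg):
    """Compute gate G from [hot item; recent item; long-term k] via an
    assumed single-layer MLP, then mix: o = G * j + (1 - G) * k (Eq. 20)."""
    g = sigmoid(np.concatenate([h_hot, m_recent, k_long]) @ Wg + bg)
    return g * j_short + (1.0 - g) * k_long

# Toy usage with made-up d=4 representations.
rng = np.random.default_rng(3)
d = 4
h_hot, m_recent, k_long, j_short = rng.normal(size=(4, d))
Wg = rng.normal(scale=0.1, size=(3 * d, d))
bg = np.zeros(d)
o_l = gated_fusion(h_hot, m_recent, k_long, j_short, Wg, bg)
print(o_l.shape)
```

Since every component of G lies in (0, 1), each coordinate of o_l is a convex combination of the corresponding short-term and long-term values, which is exactly the balancing behavior the gate is meant to provide.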
Finally, the probability that item i becomes a preferred item of user u at the next time step t+1 is predicted as:

p_{i+1} = σ(r_{i+1}) (21)

where σ is the sigmoid function and r_{i+1} is the predicted score of user u for item i at time t+1.
The MLUR is trained with an Adam optimizer by minimizing the binary cross-entropy loss.
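The training objective can be sketched as below. Both the inner-product score and the one-positive/one-sampled-negative pairing are assumptions, since the score definition and the loss formula are elided in the text; a full implementation would apply Adam over mini-batches of such terms:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(o_l, pos_item, neg_item, eps=1e-8):
    """Binary cross-entropy for one positive and one sampled negative item,
    with an assumed inner-product score r = o_l . e_i and p = sigmoid(r)."""
    p_pos = sigmoid(o_l @ pos_item)          # probability of the observed item
    p_neg = sigmoid(o_l @ neg_item)          # probability of the negative sample
    return -(np.log(p_pos + eps) + np.log(1.0 - p_neg + eps))

# Toy usage with made-up sequence representation and item embeddings.
o_l = np.array([0.2, -0.1, 0.4])
pos_item = np.array([0.5, 0.3, 0.1])
neg_item = np.array([-0.2, 0.4, 0.3])
loss = bce_loss(o_l, pos_item, neg_item)
```

Minimizing this pushes the score of the observed next item up and the score of the sampled negative down.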
example 2.
A user representation-based depth model sequence recommendation system, comprising:
the sequence acquisition module is configured to respectively acquire a long-term sequence of the overall preference of the user and a short-term sequence of the current dynamic preference of the user;
the depth sequence recommendation module is configured to obtain a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.
Claims (10)
1. A depth model sequence recommendation method based on user representation is characterized by comprising the following steps:
respectively acquiring a long-term sequence of the overall preference of a user and a short-term sequence of the current dynamic preference of the user;
obtaining a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
2. The user representation-based depth model sequence recommendation method of claim 1, wherein the short-term representation learning module captures item transitions of users in the short-term sequence through a self-attention sequence recommendation model using a hierarchical self-attention network.
3. The method as claimed in claim 1, wherein the long-term representation learning module performs modeling with recurrent neural networks (RNNs), which capture sequential patterns.
4. The user representation-based depth model sequence recommendation method of claim 3, wherein the recurrent neural network updates its hidden state over the sequence as follows:
h_t = g(x_t·W + h_{t-1}·A);
wherein g is an activation function, x_t is the item the user currently interacts with in the long-term sequence, and h_{t-1} is the previous hidden state.
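The update in claim 4 can be sketched directly in NumPy. The choice of tanh as the activation g, the dimensions, and the zero initial state are illustrative assumptions; the claim only fixes the form h_t = g(x_t W + h_{t-1} A):

```python
import numpy as np

def rnn_step(x_t, h_prev, W, A, g=np.tanh):
    # h_t = g(x_t W + h_{t-1} A), as stated in claim 4
    return g(x_t @ W + h_prev @ A)

rng = np.random.default_rng(1)
d_in, d_h = 4, 6                         # illustrative embedding/hidden sizes
W = rng.normal(scale=0.1, size=(d_in, d_h))
A = rng.normal(scale=0.1, size=(d_h, d_h))

h = np.zeros(d_h)                        # initial hidden state
for x_t in rng.normal(size=(10, d_in)):  # long-term interaction sequence
    h = rnn_step(x_t, h, W, A)
print(h.shape)  # (6,)
```

The final h serves as the long-term user representation summarizing the whole interaction sequence.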
5. The user representation-based depth model sequence recommendation method of claim 3, wherein the long-term modeling representation learning module utilizes a gated recurrent unit (GRU) as the recurrent unit for sequence recommendation.
6. The method of claim 5, wherein the GRU update is specifically:
h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h′_t;
wherein z_t is the update gate, h′_t is the candidate hidden state, and ⊙ denotes element-wise multiplication.
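Claim 6 gives only the final interpolation; the gates z_t, r_t and the candidate h′_t below follow the standard GRU formulation (Cho et al.) and are an assumption, not quoted from the patent. Note the claim's convention weights the previous state by z_t, where some references use 1 − z_t:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, P):
    # update gate z_t and reset gate r_t (standard GRU, assumed here)
    z = sigmoid(x_t @ P["Wz"] + h_prev @ P["Uz"])
    r = sigmoid(x_t @ P["Wr"] + h_prev @ P["Ur"])
    # candidate hidden state h'_t built from the reset-gated history
    h_cand = np.tanh(x_t @ P["Wh"] + (r * h_prev) @ P["Uh"])
    # claim 6: h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h'_t
    return z * h_prev + (1 - z) * h_cand

rng = np.random.default_rng(2)
d_in, d_h = 4, 6
P = {k: rng.normal(scale=0.1, size=(d_in if k.startswith("W") else d_h, d_h))
     for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

h = np.zeros(d_h)
for x_t in rng.normal(size=(8, d_in)):  # long-term interaction sequence
    h = gru_step(x_t, h, P)
print(h.shape)  # (6,)
```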
8. The method as claimed in claim 1, wherein the gated fusion module calculates the weights of the short-term and long-term representations by using an item-similarity gated fusion model, so as to balance the contributions of the long-term and short-term sequences.
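The patent does not specify the gate's parameterization; one common form, sketched here as an assumption, computes an element-wise sigmoid gate from both representations and takes their convex combination (the matrices Ws, Wl and bias b are hypothetical names):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(s_short, s_long, Ws, Wl, b):
    # learned gate in (0, 1) weighing short- vs long-term representations
    g = sigmoid(s_short @ Ws + s_long @ Wl + b)
    # convex combination balances the two sequences' contributions
    return g * s_short + (1 - g) * s_long

rng = np.random.default_rng(3)
d = 6
s_short, s_long = rng.normal(size=d), rng.normal(size=d)  # module outputs
Ws = rng.normal(scale=0.1, size=(d, d))
Wl = rng.normal(scale=0.1, size=(d, d))
b = np.zeros(d)

u = gated_fusion(s_short, s_long, Ws, Wl, b)
print(u.shape)  # (6,)
```

Because the gate output lies in (0, 1), each coordinate of the fused representation stays between the corresponding short-term and long-term values.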
10. A depth model sequence recommendation system based on user representation according to any one of claims 1 to 9, comprising:
the sequence acquisition module is configured to respectively acquire a long-term sequence of the overall preference of the user and a short-term sequence of the current dynamic preference of the user;
the depth sequence recommendation module is configured to obtain a recommendation result through a depth sequence recommendation model MLUR;
the depth sequence recommendation model comprises a short-term modeling representation learning module, a long-term modeling representation learning module and a gating fusion module, and the gating fusion module is combined with the short-term modeling representation learning module and the long-term modeling representation learning module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111107990.5A CN113902518A (en) | 2021-09-22 | 2021-09-22 | Depth model sequence recommendation method and system based on user representation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113902518A true CN113902518A (en) | 2022-01-07 |
Family
ID=79029028
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111107990.5A Pending CN113902518A (en) | 2021-09-22 | 2021-09-22 | Depth model sequence recommendation method and system based on user representation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113902518A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN114896512A (en) * | 2022-06-09 | 2022-08-12 | 陕西师范大学 | Learning resource recommendation method and system based on learner preference and group preference
CN114896512B (en) * | 2022-06-09 | 2024-02-13 | 陕西师范大学 | Learner preference and group preference-based learning resource recommendation method and system
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Espeholt et al. | Impala: Scalable distributed deep-rl with importance weighted actor-learner architectures | |
Chen et al. | Deep reinforcement learning in recommender systems: A survey and new perspectives | |
CN110955826B (en) | Recommendation system based on improved cyclic neural network unit | |
Van-Horenbeke et al. | Activity, plan, and goal recognition: A review | |
CN111242729A (en) | Serialization recommendation method based on long-term and short-term interests | |
KR20170136357A (en) | Apparatus and Method for Generating Prediction Model based on Artificial Neural Networks | |
CN108876044B (en) | Online content popularity prediction method based on knowledge-enhanced neural network | |
US8180715B2 (en) | Systems and methods for collaborative filtering using collaborative inductive transfer | |
CN113360762A (en) | Artificial intelligence based content recommendation method and artificial intelligence content recommendation system | |
Silvia Priscila et al. | Interactive artificial neural network model for UX design | |
Hassan et al. | Performance comparison of featured neural network trained with backpropagation and delta rule techniques for movie rating prediction in multi-criteria recommender systems | |
CN113570154B (en) | Multi-granularity interaction recommendation method and system integrating dynamic interests of users | |
Karny et al. | Dealing with complexity: a neural networks approach | |
Oneto et al. | Advances in artificial neural networks, machine learning and computational intelligence | |
CN114528490A (en) | Self-supervision sequence recommendation method based on long-term and short-term interests of user | |
Ortega-Zamorano et al. | FPGA implementation of neurocomputational models: comparison between standard back-propagation and C-Mantec constructive algorithm | |
CN117056595A (en) | Interactive project recommendation method and device and computer readable storage medium | |
CN114119151B (en) | Personalized recommendation method, system and medium for next shopping basket | |
CN111222993A (en) | Fund recommendation method and device | |
CN115293812A (en) | E-commerce platform session perception recommendation prediction method based on long-term and short-term interests | |
CN115168722A (en) | Content interaction prediction method and related equipment | |
CN114692012A (en) | Electronic government affair recommendation method based on Bert neural collaborative filtering | |
Wang et al. | Computationally efficient neural hybrid automaton framework for learning complex dynamics | |
CN115545738A (en) | Recommendation method and related device |
Legal Events
Date | Code | Title | Description
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||