CN112256971A

CN112256971A - Sequence recommendation method and computer-readable storage medium

Info

Publication number: CN112256971A
Application number: CN202011182145.XA
Authority: CN
Inventors: 袁春; 鲍维克; 李思楠; 张可; 张徐之
Original assignee: Shenzhen International Graduate School of Tsinghua University
Current assignee: Shenzhen International Graduate School of Tsinghua University
Priority date: 2020-10-29
Filing date: 2020-10-29
Publication date: 2021-01-22
Anticipated expiration: 2040-10-29
Also published as: CN112256971B

Abstract

The invention provides a sequence recommendation method and a computer readable storage medium, wherein the method comprises the following steps: acquiring data of a user, wherein the data is divided into a training set and a test set, the training set is used for training parameters of a sequence recommendation model, and the test set is used for testing the effect of the sequence recommendation model; constructing a sequence recommendation model; the sequence recommendation model adopts a long-term short-term self-attention sequence recommendation method; training the sequence recommendation model with a training set; testing the effect of the sequence recommendation model by using the test set; inputting the user characteristics, the recommended candidate item set characteristics and the feedback data of the user into a trained sequence recommendation model to obtain the comprehensive preference of the user, estimating the preference degree of the user on the recommended candidate items in the candidate recommended item set through the comprehensive preference, recommending items with high preference degree rank to the user, and finishing recommendation. And obtaining the comprehensive preference of the user through the user characteristics, the recommended candidate item set characteristics and the user feedback data.

Description

Sequence recommendation method and computer-readable storage medium

Technical Field

The present invention relates to the field of sequence recommendation technologies, and in particular, to a sequence recommendation method and a computer-readable storage medium.

Background

With the popularization of the internet, the number of users applying the internet is increased unprecedentedly, and many internet companies adopt an intelligent recommendation algorithm on the basis of huge user data to improve the usability and the user experience of products. However, the classical recommendation algorithm often has some problems, which in turn leads to low recommendation precision or repeated recommendation:

1) the interdependence and sequence analysis presented to the user feedback data is insufficient;

2) insufficient dynamic correspondence to feedback data and context (context);

3) the long term/general preference of the user is often expressed fixedly rather than based on user feedback data.

In the scenario of sequence recommendation, two important difficulties are: 1) learning high order dependencies; two main schemes are currently available: a high-order Markov chain model and a recurrent neural network model. However, the high-order markov chain model has limited historical state of analysis because the number of parameters grows exponentially with the order; and the single cyclic neural network model has difficulty in processing the user feedback data sequence with non-strict order correlation. 2) Learning long-term order dependence; the main scheme is a recurrent neural network model, and a single recurrent neural network model depends on strong correlation of adjacent items.

The sequence recommendation method in the prior art cannot solve the technical difficulty in the sequence recommendation scene.

The above background disclosure is only for the purpose of assisting understanding of the concept and technical solution of the present invention and does not necessarily belong to the prior art of the present patent application, and should not be used for evaluating the novelty and inventive step of the present application in the case that there is no clear evidence that the above content is disclosed at the filing date of the present patent application.

Disclosure of Invention

The present invention provides a sequence recommendation method and a computer-readable storage medium for solving the existing problems.

In order to solve the above problems, the technical solution adopted by the present invention is as follows:

a sequence recommendation method comprising the steps of: s1: acquiring data of a user, wherein the data is divided into a training set and a test set, the training set is used for training parameters of a sequence recommendation model, and the test set is used for testing the effect of the sequence recommendation model; s2: constructing the sequence recommendation model; the sequence recommendation model adopts a long-term short-term self-attention sequence recommendation method; s3: training the sequence recommendation model with the training set; testing the effect of the sequence recommendation model with the test set; s4: inputting the user characteristics, the recommended candidate item set characteristics and the feedback data of the user into the trained sequence recommendation model to obtain the comprehensive preference of the user, estimating the preference degree of the user on the recommended candidate items in the candidate recommended item set through the comprehensive preference, recommending items with high preference degree rank to the user, and completing recommendation.

Preferably, the long-term and short-term self-attention sequence recommendation method comprises the following steps: s21: extracting long term/general preferences of a user from user long term feedback data; s22: extracting sequence preferences of a user from user short-term feedback data; s23: and extracting the comprehensive preference of the user by combining the long-term/general preference, the sequential preference and the weighted non-linear expression of the user short-term feedback data participating in the attention mechanism.

Preferably, the sequence recommendation model comprises an embedded representation layer, a self-attention layer, a GRU layer and an attention layer.

Preferably, the embedding representation layer is used for embedding sparse representations of the features of the user, the features of the recommended candidate item set and the features of the feedback data of the user, and converting the sparse representations into dense embedded representations.

Preferably, the self-attention tier is a set of candidate recommendation items

Is embedded in the representation

Embedded representation of user u

User long-term feedback sequence

For input, a representation u of the user's long-term/general preferences is output^longThe formula is as follows:

wherein ,

item sequence length representing the user's long-term feedback sequence, d represents the dimension of the embedding representation layer,

denotes l, u,

The joint vector of (a) is calculated,

wherein

Is composed of

L, u as context and

union, dynamically representing long-term information, i.e. the same user feedback data, Query, Key, Value representing Query, index in attention mechanism, data to be weighted by attention mechanism,

namely Query, Key and Value all represent

Nonlinear representation of the layer's weight parameters for Query and Key, respectively, ReLU (·) represents the leakage _ ReLU excitation function, Q ' and K ' represent nonlinear representations of Query and Key, respectively, leakage _ ReLU being a variant of ReLU;

wherein ,

is a correlation matrix representation of Q 'and K', and serves as an attention weight matrix from the attention layer,

to scale the dot product;

wherein ,

as attention weight matrix and joint vector

The multiplication results in a weighted output

To pair

Aggregated, resulting in a representation of long-term/general preferences of the user

Preferably, the GRU layer is configured to extract sequence preference in the user short-term feedback data, and to apply the user short-term feedback data

As an input, the output short-term feedback data represents a sequential preference representation of the user

The formulation is as follows:

z_j＝σ(W_z[h_j-1，v_2j])

r_j＝σ(W_r[h_j-1，v_2j])

y_j＝σ(W_oh_j)

wherein ,

for the j' th item, h, in the short-term feedback data sequence of the user_jRepresents the hidden state of the jth unit in the GRU layer, sigma (-) and tanh (-) represent the Sigmoid activation function and tanh activation function respectively, and z_jUpdate gate entry and W for GRU_zTo update the door weight, r_jTo reset the gate entry and W_rIn order to reset the gate weight(s),

is a reset item in a hidden state and

as its weight, y_jDenotes the output of the jth cell in the GRU layer, W_oIn order to output the weight, the weight is output,

representing user short-term feedback data sequences

The length of (a) of (b),

representing the output of the last GRU, a sequential preference representation of the user

Preferably, the attention layer is used for combining user short-term feedback data sequences

Participating in the attention mechanism, the user's long-term/general preference indicates u^longUser's sequential preference representation u^seqUser short-term feedback data sequence

The three items are combined into

wherein

Is composed of

The length of (a) of (b),

inputting the attention layer ultimately results in a representation u of the integrated preferences of user u^comp(ii) a The formulation of the attention layer model is as follows:

wherein ,

and

for the weight parameter of the attention layer, here "+" means

W_AEach row of (a) and (b)_AAdd to obtain from the above formula

Is composed of

A non-linear representation of (d);

wherein ,u^longAs context vector (context) of attention layer, the context vector is combined using the softmax function

Is calculated to obtain

Attention weight vector of

Attention weight vector derived from the above equation

To pair

Weighted summation, finally obtaining the representation of the comprehensive preference of the user u

Preferably, when the sequence recommendation model is trained, the forward transfer of the sequence recommendation model obtains a representation u of the user's comprehensive preference^compExpressing u by inner product method^compAnd candidate item

To represent the similarity of user u to candidate itemv_3jDegree of preference of

The specific formula is as follows:

preferably, the loss function for training the sequence recommendation model is:

wherein D represents a training set constructed by a user, a positive sample and a negative sample,

indicating the degree of preference of the user u for the positive sample candidate itemj,

represents the preference degree of the user u to the negative sample candidate itemk, sigma (·) represents sigmoid function, and theta_eWeight parameter, Θ, representing the embedding layer_aWeight parameter, Θ, representing the self-attention layer and the attention layer_seqRepresents the weight parameter, λ, of the GRU layer_e、λ_a、λ_seqAre the corresponding regular term coefficients.

The invention also provides a computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of the above.

The invention has the beneficial effects that: a sequence recommendation method and a computer-readable storage medium are provided, by adopting a self-attention sequence recommendation method, extracting long-term/general preferences of a user from long-term feedback data of the user, extracting sequence preferences of the user from short-term feedback data of the user, and simultaneously considering dynamic influences of the user and a candidate recommendation item set; and obtaining the comprehensive preference of the user through the user characteristics, the recommended candidate item set characteristics and the user feedback data.

Further, the attention mechanism gives different weights to different feedback data to dynamically capture important information, and meanwhile, the dynamic influence of different users and different candidate recommendation item sets on recommendation results is considered.

Still further, the self-attention mechanism captures long-term interdependencies between long-term feedback data, and does not rely on neighboring strong correlations (or sequential correlations), accurately expressing the long-term/general preferences of the user, rather than fixedly representing the long-term/general preferences of the user based on user characteristic information.

Furthermore, the GRU captures the sequentiality of the short-term feedback data and participates in the weighting of the attention mechanism, the attention weight is adjusted according to the strength of the data sequence correlation, and the sequence preference of the user is accurately expressed.

Drawings

Fig. 1 is a schematic diagram of a sequence recommendation method according to an embodiment of the present invention.

Fig. 2 is a schematic diagram of a long-term and short-term self-attention sequence recommendation method according to an embodiment of the present invention.

FIG. 3 is a diagram of a sequence recommendation model in an embodiment of the invention.

FIG. 4 is a schematic diagram of a self-attention layer in an embodiment of the invention.

FIGS. 5(a) and 5(b) are schematic diagrams illustrating comparison results of the method of the present invention in the embodiment of the present invention and the method of the prior art with the recall ratio as an indicator.

Fig. 6(a) and fig. 6(b) are graphs showing the results of comparing the AUC of the method of the present invention in the embodiment of the present invention with the AUC of the method of the prior art.

Fig. 7(a) and 7(b) are ablation experiment maps in the embodiment of the invention.

Detailed Description

In order to make the technical problems, technical solutions and advantageous effects to be solved by the embodiments of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element. In addition, the connection may be for either a fixing function or a circuit connection function.

It is to be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in an orientation or positional relationship indicated in the drawings for convenience in describing the embodiments of the present invention and to simplify the description, and are not intended to indicate or imply that the referenced device or element must have a particular orientation, be constructed in a particular orientation, and be in any way limiting of the present invention.

Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "a plurality" means two or more unless specifically limited otherwise.

Noun interpretation in the field of recommendation systems:

(1) item: representing items in a recommendation system, such as merchandise, videos, etc.;

(2) candidate recommendation item: the candidate item that may be recommended to the user. The recommendation system estimates the preference degree of the user on the candidate recommendation items, ranks the preference degree of the user on each item in the candidate recommendation item set, and recommends the candidate recommendation items with high rank to the user;

(3) user-feedback (user-item interactions) data: item that the user has interacted with, such as goods purchased by the user, videos viewed by the user, and the like. The single item of feedback data is a single item, and the user feedback data list is a plurality of items;

(4) and (3) a recommendation algorithm: and estimating the preference of the user through the user feedback data, further estimating the preference degree of the user on the candidate recommended item, and recommending the item with high preference degree in the candidate recommended item set by the user to the user.

As shown in fig. 1, a sequence recommendation method includes the following steps:

s1: acquiring data of a user, wherein the data is divided into a training set and a test set, the training set is used for training parameters of a sequence recommendation model, and the test set is used for testing the effect of the sequence recommendation model;

in one embodiment of the invention, an open data set is adopted, the data set is divided into a training set and a test set according to a certain proportion, the training set is used for training sequence recommendation model parameters, and the test set is used for testing the sequence recommendation model effect.

S2: constructing the sequence recommendation model; the sequence recommendation model adopts a long-term short-term self-attention sequence recommendation method;

s3: training the sequence recommendation model with the training set; testing the effect of the sequence recommendation model with the test set;

s4: inputting the user characteristics, the recommended candidate item set characteristics and the feedback data of the user into the trained sequence recommendation model to obtain the comprehensive preference of the user, estimating the preference degree of the user on the recommended candidate items in the candidate recommended item set through the comprehensive preference, recommending items with high preference degree rank to the user, and completing recommendation.

In the invention, a self-attention sequence recommendation method is adopted, the long-term/general preference of the user is extracted from the long-term feedback data of the user, the sequence preference of the user is extracted from the short-term feedback data of the user, and the dynamic influence of the user and the candidate recommendation item set is considered; and obtaining the comprehensive preference of the user through the user characteristics, the recommended candidate item set characteristics and the user feedback data.

As shown in fig. 2, the long-term and short-term self-attention sequence recommendation method includes the following steps:

s21: extracting long term/general preferences of a user from user long term feedback data;

s22: extracting sequence preferences of a user from user short-term feedback data;

s23: and extracting the comprehensive preference of the user by combining the long-term/general preference, the sequential preference and the weighted non-linear expression of the user short-term feedback data participating in the attention mechanism.

The invention provides a long-term and short-term self-attention sequence recommendation method. The long-term and short-term representation divides the feedback data of the user into long-term and short-term, the long-term feedback data of the user reflects the long-term/general preference of the user, and the short-term feedback data of the user reflects the short-term preference and the sequence preference of the user; an Attention (Attention) mechanism can give different weights to different data to help an algorithm to dynamically capture important information in the data, and on the basis, a Self-Attention (Self-Attention) mechanism can effectively capture the interdependence between long sequence data. GRU (gatereccurrentunit, a type of recurrent neural network) can analyze sequence characteristics of sequence data. Two difficulties correspond to sequence recommendations: when the GRU learns the high-order dependence, the GRU combines with the empowerment of an attention mechanism to dynamically adjust the weight according to the strength of the sequence relevance of the data so as to accurately learn the high-order dependence; the self-attention mechanism accurately learns long-term sequential dependence without depending on the adjacent relation of data while capturing the interdependence between long-sequence data.

Further, the self-attention mechanism extracts the user's long-term/general preferences from the user's long-term feedback data, while taking into account the dynamic impact of the user and the set of candidate recommendation items. The GRU extracts the user's sequence preferences from the user's short-term feedback data. The long-term/general preference of the user and the sequence preference of the user obtained from the previous steps are combined with the weighting and nonlinear expression of the user short-term feedback data participating in the attention mechanism (the long-term/general preference of the user represents a context vector which is simultaneously used as the attention mechanism), so that the comprehensive preference of the user is obtained (based on the comprehensive preference, the preference scores of all items in the candidate recommendation item set are estimated, a recommendation list is obtained, and the recommendation is completed).

Fig. 3 is a schematic diagram of a sequence recommendation model in an embodiment of the present invention. The sequence recommendation model comprises an embedded representation layer, a self-attention layer, a GRU layer and an attention layer. The sequence recommendation model is responsible for forward propagation, and the user characteristics, the recommended candidate item set characteristics and the user feedback data are input into the model part, so that the comprehensive preference of the user can be obtained. The network parameters of the sequence recommendation model can accurately express the comprehensive preference of the user only through training and updating, and the parameter learning part represents the training and updating process of the model network weight parameters. The model after parameter learning/training can be used for reliable recommendation for users.

In an embodiment of the present invention, the step of constructing the sequence recommendation model is as follows, and is divided into the construction of an Embedding representation (Embedding) layer, a self-attention layer, a GRU layer, and an attention layer:

(1) build-Embedding of model representation (Embedding) layer: and embedding (embed) sparse representations of the characteristics of the user, the set of candidate recommendation items which can be recommended and the characteristics of the user feedback data, and converting the sparse representations into dense embedded representations (embedding).

(2) Construction of model-self-attention layer: fig. 4 is a schematic diagram of a self-attention layer according to an embodiment of the present invention. Self-attention tier of the invention with a set of candidate recommendation items

Is embedded in the representation

Embedded representation of user u

User long-term feedback sequence

Is an input (wherein,

item sequence length representing user long-term feedback sequence, d represents dimension embedded in representation layer, and is also global dimension parameter), and output representation u of user long-term/general preference^long(long-term representation)。

Denotes l, u,

In combination withThe vector of the vector is then calculated,

wherein

Is composed of

L, u as context and

jointly, dynamically representing long-term information, i.e. the same user feedback data, may have different effects on the recommendation result in case of different sets of candidate items or users. Query, Key, Value represent Query, index in attention mechanism, data to be weighted by attention mechanism. In the self-attention layer herein,

namely Query, Key and Value all represent

The self-attention layer is formulated as follows:

in the formula (1) (2),

the weight parameters of the Query and the Key nonlinear representation layers are respectively, ReLU (·) represents a Leaky _ ReLU excitation function, Q 'and K' represent nonlinear representations of Query and Key respectively, and Leaky _ ReLU is a variant of ReLU, so that the problem that neurons cannot learn after the ReLU function enters a negative interval is solved.

In the formula (3), the following steps are carried out,

is a correlation matrix representation of Q 'and K', and serves as an attention weight matrix (AttentionMap) from the attention layer,

to scale the dot product so that the gradient of the softmax function does not easily approach zero because d is too large.

In the formula (4), the first and second groups,

as attention weight matrix and joint vector

The multiplication results in a weighted output

In the formula (5), for

Aggregation (e.g., summing, taking the maximum, here taking the average) results in an indication of the user's long-term/general preference

The self-attention mechanism captures long-term interdependencies between long-term feedback data, does not rely on neighboring strong correlations (or sequential correlations), accurately expresses the long-term/general preferences of the user, rather than fixedly representing the long-term/general preferences of the user based on user characteristic information.

(3) Construction of the model-GRU layer: unlike the interdependence between extracting the user's long-term feedback data using the self-attention layer, the emphasis of the user's short-term feedback data is to extract the sequential preferences in the user's short-term feedback data. The GRU is a kind of circulating neural network, solves the problems of long-term memory, gradient in back propagation and the like, and is easy to calculate. Model feeds back user short-term feedback data

Inputting GRU, calculating short-term feedback data to show sequential preference expression u of user^seq. The formulation of the model GRU layer is as follows:

z_j＝σ(W_z[h_j-1，v_2j]) (6)

r_j＝σ(W_r[h_j-1，v_2j]) (7)

y_j＝σ(W_oh_j) (10)

in the formulae (6) to (11),

for the j' th item, h, in the short-term feedback data sequence of the user_jRepresenting the jth cell in a GRU networkHidden states (hiddenstate), σ () and tanh () represent the Sigmoid activation function and tanh activation function, respectively. In the formula (6), z_jUpdate gate (update) entry for GRU and W_zTo update the door weight; in the formula (7), r_jIs reset gate (resetgate) term and W_rTo reset the gate weight; in the formula (8), the first and second groups,

is a reset item in a hidden state and

is its weight; in the formula (10), y_jDenotes the output of the jth cell in the GRU network, W_oIs the output weight; in the formula (11), the reaction mixture,

representing user short-term feedback data sequences

The length of (a) of (b),

the output representing the last GRU, i.e. the output of the GRU layer of the model, exists only in the last GRU, and is a sequential preference representation for the user

The GRU captures the sequentiality of short-term feedback data and participates in the weighting of an attention mechanism, the attention weight is adjusted according to the strength of data sequence correlation, and the sequential preference of a user is accurately expressed.

(4) Construction of model-attention layer the user's long-term/general preference representation u was previously derived from the attention layer^longObtaining user sequence preference expression u by GRU layer^seq. It should be noted that the sequence preference of the user cannot sufficiently reflect the information of the short-term feedback data, and some user feedback data with partial non-strict sequence correlation often exist in some sequence recommendation scenes, and the sequence recommendation scenes may beResults u for GRU layer^seqAn influence is produced. Therefore, u^seqRequiring joint user short-term feedback data sequences

The attention mechanism is engaged to fully express the information contained in the short-term feedback data while structurally giving higher weight to the relatively important short-term data. User's long term/general preference representation u^longUser's sequential preference representation u^seqUser short-term feedback data sequence

The three items are combined into

wherein

Is composed of

The length of (a) of (b),

inputting the attention layer ultimately results in a representation u of the integrated preferences of user u^comp. The formulation of the attention layer model is as follows:

in the formula (12), the first and second groups,

and

for the weight parameter of the attention layer, here "+" means

W_AEach row of (a) and (b)_AAdded up, and obtained by the formula (11)

Is composed of

A non-linear representation of (d);

in the formula (13), u^longAs context vector (context) of attention layer, the context vector is combined using the softmax function

Is calculated to obtain

Attention weight vector of

Attention weight vector derived from equation (14)

To pair

The attention mechanism gives different weights to different feedback data to dynamically capture important information, and meanwhile, the dynamic influence of different users and different candidate recommendation item sets on recommendation results is considered.

The representation of the long-term/general preference, the representation of the sequence preference and the short-term feedback data are weighted by an attention mechanism, relatively important short-term feedback data are given higher weight from the structure, and the representation of the long-term/general preference is used as a context vector, so that the comprehensive preference of the user is accurately estimated.

In the process of training the model, the forward transmission of the model obtains the representation u of the comprehensive preference of the user^compNow, the inner product method is shown as formula (15) to express u^compAnd candidate item

In the recommendation system scenario of hidden feedback, users often do not have specific scores for items, but only record interactions. In this case, the recommendation system has only positive examples and lacks negative examples, and the training effect of the model is affected accordingly. Items that have no record of interaction with the user can simply be treated as negative examples of the user, thereby constructing a negative example set. The model only needs a negative sample set as large as the positive sample set, which results in a huge negative sample set and low quality of the negative sample set. The BPR method is a matrix decomposition-based method, a pair of user interaction and two item items which are not interacted form a partial order relation pair, the partial order relation between the items under one user forms a partial order matrix, a user set is traversed to establish a prediction sorting matrix, the BPR method decomposes the prediction sorting matrix to generate a user matrix and an item matrix, and the user matrix and the item matrix can obtain the preference degree of the user to each item. And generating a negative sample set with low preference degree by using a BPR method, wherein the size of the negative sample set is equal to that of the positive sample set, and participating in training.

The loss function in the training model is defined as follows:

in equation (16), D represents the training set constructed by the user, positive samples and negative samples,

represents the preference degree of the user u for the negative sample candidate itemk, and σ (·) represents the sigmoid function. The three terms after the first plus sign are regular terms (prevent overfitting), Θ_eWeight parameter, Θ, representing the embedding layer_aWeight parameter, Θ, representing the self-attention layer and the attention layer_seqRepresents the weight parameter, λ, of the GRU layer_e、λ_a、λ_seqAre the corresponding regular term coefficients. And selecting an optimizer (generally an Adam optimizer), and iteratively updating the network weight parameters in the algorithm model based on the gradient of the loss function to obtain the trained algorithm model.

And calculating the correlation between the comprehensive preference of the user obtained by the model and the candidate recommended item by using an inner product method, and representing the preference score of the user on the candidate recommended item. Then, a loss function of the algorithm is defined, and the construction of the loss function considers: preference scores of positive and negative examples are a gap, a regular term that prevents overfitting. Meanwhile, a negative sample is provided through the BPR algorithm, and the negative sample and the positive sample are combined to participate in the calculation of the loss function. And finally, the optimizer iteratively updates the parameters of the algorithm according to the gradient of the loss function to obtain the trained algorithm model.

In the face of a user needing to be recommended, inputting the user characteristics, the recommended candidate item set characteristics and the feedback data of the user into an algorithm model to obtain the comprehensive preference of the user, estimating the preference degree of the user on the recommended candidate items in the candidate recommended item set through the comprehensive preference, recommending items with high preference degree to the user, and completing the recommendation.

In one embodiment of the present invention, the recommendation of the memorable sequence using the method described above comprises the following procedures:

1: acquiring training data, such as public data sets or enterprise background data;

2: setting structural hyper-parameters of a sequence recommendation model, including dimension parameters, an optimizer and the like;

3: constructing a sequence recommendation model (an Embedding layer, a self-attention layer, a GRU layer and an attention layer), and initializing network weight parameters of the sequence recommendation model;

4: enumerating iteration times i from 1-N, training network weight parameters of a sequence recommendation model:

5: enumerating batches (batch, in the invention, batch is 1), and selecting one batch of users and corresponding feedback data to participate in training each time until the users and the corresponding feedback data of the whole training data set all participate in training in the ith iteration:

6: and obtaining a user feedback data list from the training data set, dividing the user feedback data list into a short-term feedback data list and a long-term feedback data list, and converting the user feedback data list, the user characteristic and the recommended candidate item set into respective embedded representations. Inputting an algorithm model, and obtaining a representation of the comprehensive preference of the user by the algorithm model;

7: calculating a loss function through preference representation, a positive sample, a negative sample and the current network weight of a user, and updating the network weight parameters of the algorithm through back propagation by an optimizer based on the gradient of the loss function;

8: ending enumeration;

9: after enumeration is finished, obtaining an algorithm model after training is finished;

10: and at the moment, recommending the user, inputting the user characteristics, the recommended candidate item set characteristics and the user feedback data into an algorithm model for forward propagation to obtain comprehensive preference, estimating several items of items with the highest user preference degree in the recommended candidate item set according to the comprehensive preference, and recommending the user.

As shown in fig. 5(a), fig. 5(b), fig. 6(a) and fig. 6(b), the performance of applying the above-mentioned distribution of long-term and short-term self-attention sequence recommendation method (LSSSAN) on the Tmall and Gowalla data sets is compared with the method in the prior art, wherein the comparison indexes of fig. 5(a) and fig. 5(b) are recall rates, and the comparison indexes of fig. 6(a) and fig. 6(b) are AUC. The result shows that the performance of the recommendation algorithm of the invention is better than that of the recommendation algorithm in the prior art on the whole:

(1) the long term short term self attention sequence recommendation algorithm (LSSSAN) was overall superior to the AttRec model of self attention, with Recall @20 and AUC 0.126, 0.797 on the Tmall dataset and 0.461, 0.982 on the Gowalla. Compared with the AttRec model, the LSSSAN index recall rate @20 is respectively improved by 6.07% and 20.49% on the two data sets, and the AUC index is respectively improved by 10.45% and 0.81% on the two data sets. Compared with an AttRec model which fixedly expresses the long-term/general preference and neglects the sequence preference of the user, the self-attention layer of the LSSSAN extracts the long-term/general preference of the user from long-term feedback data, and the GRU layer extracts the sequence preference of the user from short-term feedback data, structurally gives higher weight to short-term feedback, and is more beneficial to the recommendation result.

(2) LSSSAN performs better on the Gowalla data set than the SHAN model on the whole, and performs better and worse on the Tmax data set than the SHAN model. LSSSAN improved the indicator recall @20 and AUC by 1.51% and 0.37%, respectively, on the Gowalla dataset, by 1.48% on the Tmall dataset, and by 14.6% behind the SHAN model on the Tmall dataset. The reason for this is that the mutual dependence and order correlation between the user feedback data of the Gowalla data set is more strict than that of the Tmall data set, and the model captures the mutual dependence and sequence between the user feedback data by using self attention and GRU emphasis compared with the SHAN model, so that the LSSSAN performance on Gowalla is better than that of the SHAN model as a whole, and the performance on the Tmall data set is less stable than that of the SHAN model. In conclusion, compared with the SHAN that the interdependence analysis of the long-term data is insufficient and the sequence preference is ignored, the self-attention layer of the LSSSAN analyzes the interdependence of the long-term data and the GRU layer extracts the sequence preference, so that the recommendation result has better performance, especially under the condition that the data has stronger interdependence and sequence correlation.

Further, the present invention performed ablation experiments. The ablation experiment has the effect in the field of artificial intelligence/deep learning to verify the reasonability of the model structure, and the elimination of a certain part of the model is compared with the original model in the experiment to prove that the part has an irreplaceable effect in the model. For comparison of ablation experiments see table 1 and fig. 7(a), 7 (b):

TABLE 1 ablation experiment control Table

LSSSAN1 is a model after the long-term short-term Self-Attention sequence recommendation algorithm (LSSSAN) eliminated the Self-Attention layer, performing poorly on both data sets. LSSSAN1 reduced the index recall @20 by 26.98% and 38.83% on both datasets compared to LSSSAN, primarily because the model lacked an expression for long term/general preferences after eliminating the self-attention layer, and also reduced the weight of the relatively important short term feedback data in the model.

The LSSSAN2 is a model of the long-term short-term self-attention sequence recommendation algorithm (LSSSAN) after removing GRU layers, and the two indexes of LSSSAN2 on Gowalla are respectively reduced by 0.87% and 0.31% compared with LSSSAN, and the AUC of LSSSAN2 on Tmall is reduced by 0.89% compared with LSSSAN, although the index recall rate @20 of LSSSAN2 on the Tmall dataset is improved by 3.17% compared with LSSSAN, the overall performance of LSSSAN2 on the Tmall dataset is slightly inferior to that of LSSSAN as can be observed from fig. 5. LSSSAN2 with GRU layers removed performed better on the Tmall dataset than on the Gowalla basis because the order and interdependence of the Tmall dataset was less stringent than the Gowalla dataset. Compared with LSSSAN2, the index Recall is more stable when the N parameter is larger, and the advantage of the GRU layer extraction sequence preference for the recommendation result is greater than the disadvantage that the GRU layer is unstable due to the influence of non-strict sequence correlation and weak interdependence.

Thus, the ablation experiment verifies the important role played by the GRU layer and the self-attention layer in the invention.

An embodiment of the present application further provides a control apparatus, including a processor and a storage medium for storing a computer program; wherein a processor is adapted to perform at least the method as described above when executing the computer program.

Embodiments of the present application also provide a storage medium for storing a computer program, which when executed performs at least the method described above.

Embodiments of the present application further provide a processor, where the processor executes a computer program to perform at least the method described above.

The storage medium may be implemented by any type of volatile or non-volatile storage device, or combination thereof. Among them, the nonvolatile Memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an erasable Programmable Read-Only Memory (EPROM), an electrically erasable Programmable Read-Only Memory (EEPROM), a magnetic random Access Memory (FRAM), a Flash Memory (Flash Memory), a magnetic surface Memory, an optical Disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface storage may be disk storage or tape storage. Volatile Memory can be Random Access Memory (RAM), which acts as external cache Memory. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data rate Synchronous Dynamic Random Access Memory (DDRSDRAM, Double Data rate Synchronous Dynamic Random Access Memory), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), Synchronous link Dynamic Random Access Memory (SLDRAM, Synchronous Dynamic Random Access Memory (DRAM), Direct Memory (DRM, Random Access Memory). The storage media described in connection with the embodiments of the invention are intended to comprise, without being limited to, these and any other suitable types of memory.

In the several embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.

The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.

Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.

The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.

The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several equivalent substitutions or obvious modifications can be made without departing from the spirit of the invention, and all the properties or uses are considered to be within the scope of the invention.

Claims

1. A sequence recommendation method, comprising the steps of:

2. The sequence recommendation method of claim 1, wherein the long term short term self attention sequence recommendation method comprises the steps of:

3. The sequence recommendation method of claim 2, wherein the sequence recommendation model comprises an embedded representation layer, a self-attention layer, a GRU layer, an attention layer.

4. The sequence recommendation method of claim 3, wherein the embedding representation layer is used for embedding sparse representations of features of the user, features of the recommended candidate item set, and features of the feedback data of the user, and converting the sparse representations into dense embedded representations.

5. The sequence recommendation method of claim 4, wherein the self-attention layer is aggregated with a set of candidate recommendation items

Is embedded in the representation

Embedded representation of user u

User long-term feedback sequence

wherein ,

denotes l, u,

The joint vector of (a) is calculated,

wherein

Is composed of

L, u as context and

namely Query, Key and Value all represent

wherein ,

to scale the dot product;

wherein ,

as attention weight matrix and joint vector

The multiplication results in a weighted output

To X'_u ^longAggregated, resulting in a representation of long-term/general preferences of the user

6. The sequence recommendation method of claim 5, wherein the GRU layer is configured to extract sequence preferences in the user short-term feedback data and to apply the user short-term feedback data to the user short-term feedback data

As an input, the output short-term feedback data represents a sequential preferred representation u of the user^seq(ii) a The formulation is as follows:

z_j＝σ(W_z[h_j-1，v_2j])

r_j＝σ(W_r[h_j-1，v_2j])

y_j＝σ(W_oh_j)

wherein ,

is a reset item in a hidden state and

representing user short-term feedback data sequences

The length of (a) of (b),

7. The sequence recommendation method of claim 6, wherein the attention layer is used to combine user short-term feedback data sequences

The three items are combined into

wherein

Is composed of

The length of (a) of (b),

wherein ,

and

for the weight parameter of the attention layer, here "+" means

Each row of (a) and (b)_AAdd to obtain from the above formula

Is composed of

A non-linear representation of (d);

wherein ,u^longAs context vector of attention layer, joint using softmax function

Is calculated to obtain

Attention weight vector of

Attention weight vector derived from the above equation

To pair

8. The sequence recommendation method of claim 7, wherein the forward pass of the sequence recommendation model results in a representation u of the user's integrated preferences when training the sequence recommendation model^compExpressing u by inner product method^compAnd candidates

The specific formula is as follows:

9. the sequence recommendation method of claim 8, wherein a penalty function for training the sequence recommendation model is:

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.