CN115687772A - Sequence recommendation method based on sequence dependence enhanced self-attention network - Google Patents

Sequence recommendation method based on sequence dependence enhanced self-attention network

Info

Publication number
CN115687772A
Authority
CN
China
Prior art keywords
sequence
matrix
information
attention
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211398166.4A
Other languages
Chinese (zh)
Inventor
贾兆红
张虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202211398166.4A priority Critical patent/CN115687772A/en
Publication of CN115687772A publication Critical patent/CN115687772A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a sequence recommendation method based on a sequence dependence enhanced self-attention network, which comprises the following steps: 1. constructing the data set and the representation used for sequence recommendation; 2. obtaining the feature representation of the interaction sequence; 3. obtaining the order-dependence information and the user-preference-change information of the interaction sequence; 4. capturing the various kinds of feature information of the interaction sequence through the order-dependence enhanced self-attention network; 5. performing sequence recommendation with the finally captured representation of the interaction sequence. When the relationships among interacted items are processed, the constructed order-dependence enhanced self-attention network model takes both the order-dependence information of the user's interacted items and the user-preference-change information into account, so the recommendation accuracy can be improved.

Description

Sequence recommendation method based on sequence dependence enhanced self-attention network
Technical Field
The invention relates to the field of recommendation, in particular to a sequence recommendation method based on a sequence dependence enhanced self-attention network.
Background
Sequence recommendation is an important research topic in the field of recommendation systems. Traditional recommendation systems model the interactions between users and items in a static manner, yet user interaction behaviour tends to be continuous, and user preferences and item popularity tend to change over time. Sequence recommendation therefore treats the interactions between a user and items as a dynamic interaction sequence and mines the user's preferences by considering the contextual connections among the items the user interacts with.
Traditional sequence recommendation algorithms mainly rely on Markov chain (MC) based methods, which assume that the next item a user interacts with depends only on the few most recently interacted items; because of this assumption, MC-based methods cannot capture higher-order dependencies between items. In recent years, with the development of deep learning, models such as the recurrent neural network (RNN) and the convolutional neural network (CNN) have been widely applied to sequence recommendation. RNNs capture the order dependencies among items through a recursive structure, but suffer from low efficiency and difficulty in retaining long-term dependencies; CNN models capture local features of the input sequence through convolution operations, but owing to their own limitations they capture only local rather than global information. Later, with the success of the Transformer model in various fields, researchers began to introduce self-attention into sequence recommendation; self-attention captures the global dependencies of the sequence well and lets the model focus on the previously interacted items that have a larger influence on future interactions. User behaviour tends to be contextual, and user preferences tend to change over time. For example, after purchasing a concert ticket a user may buy a plane ticket to the concert city and then book a hotel; a user who previously liked carbonated beverages may now prefer fruit-juice carbonated beverages. Conventional methods cannot handle such scenarios well, so their recommendation accuracy is limited.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a sequence recommendation method based on a sequence dependence enhanced self-attention network, so that the order dependence of a user's interaction sequence and the information about changes in user preference can be captured better, and the accuracy of recommending items to the user can be improved.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a sequence recommendation method based on a sequence dependence enhanced self-attention network, which is characterized by comprising the following steps:
Step one: acquire a user set U and, for any user u in U, acquire the interaction sequence S^u = (s_1^u, s_2^u, ..., s_{|S^u|}^u), where s_t^u denotes the t-th item interacted with by user u and |S^u| denotes the length of the interaction sequence of user u; the item set I is formed by the interaction sequences of all users;
Step two: construct the order-dependence enhanced self-attention network, which comprises an embedding layer, a position information layer, a GRU module, a self-attention module, a feedforward layer and a prediction layer;
Step 2.1: the embedding layer uses the item vector matrix M ∈ R^{|I|×d} to convert the interaction sequence S^u of user u into an embedded vector matrix E^u ∈ R^{|S^u|×d}, with E^u = [e_1^u; e_2^u; ...; e_{|S^u|}^u], where e_t^u denotes the embedded vector representation of s_t^u, |I| denotes the length of the item set I, and d denotes the dimension of the item embedding vectors;
Step 2.2: the position information layer uses a position matrix P to add position information to the embedded vector matrix E^u, yielding the position-embedded vector matrix X^u ∈ R^{|S^u|×d} with X^u = [x_1^u; x_2^u; ...; x_{|S^u|}^u], where x_t^u = e_t^u + p_t denotes the vector obtained by adding e_t^u and the t-th position vector p_t of the position matrix P;
Step 2.3: the GRU module processes the position-embedded vector matrix X^u with equations (1)-(4) to obtain the order-dependence information and user-preference-change information matrix H^u ∈ R^{|S^u|×d}:
r_t^u = σ(W_r x_t^u + U_r h_{t-1}^u)  (1)
z_t^u = σ(W_z x_t^u + U_z h_{t-1}^u)  (2)
h̃_t^u = tanh(W_h x_t^u + U_h (r_t^u ⊙ h_{t-1}^u))  (3)
h_t^u = (1 - z_t^u) ⊙ h_{t-1}^u + z_t^u ⊙ h̃_t^u  (4)
In equations (1)-(4), W_r and U_r denote the weight matrices of the reset gate; h_{t-1}^u denotes the state information of the (t-1)-th position; r_t^u denotes the intermediate vector of the reset gate at the t-th position; σ denotes the sigmoid activation function; ⊙ denotes the element-wise product; z_t^u denotes the intermediate vector of the update gate at the t-th position; W_z and U_z denote the weight matrices of the update gate; tanh denotes the hyperbolic tangent function; h̃_t^u denotes the candidate state information of the t-th position; W_h and U_h denote the weight matrices to be learned for the candidate state; h_t^u denotes the order-dependence information and user-preference-change information captured at the t-th position;
Step 2.4: the self-attention module processes the position-embedded vector matrix X^u and the order-dependence information and user-preference-change information matrix H^u with equations (5)-(7) to obtain the self-attention information matrix Y^u ∈ R^{|S^u|×d}:
e_{ti} = ((x_t^u W_1^Q + h_t^u W_2^Q)(x_i^u)^T) / √d  (5)
a_{ti} = exp(e_{ti}) / Σ_{j=1}^{|S^u|} exp(e_{tj})  (6)
y_t^u = Σ_{i=1}^{|S^u|} a_{ti}(x_i^u W_1^V + h_i^u W_2^V)  (7)
In equations (5)-(7), W_1^Q and W_2^Q are the two weight matrices of the self-attention query vector; W_1^V and W_2^V are the two weight matrices of the self-attention value vector, applied respectively to the embedded vector x_i^u of the i-th position and to the order-dependence and preference-change information vector h_i^u of the i-th position; e_{ti} denotes the attention score between the interacted item at the t-th position and the interacted item at the i-th position; a_{ti} is the weight obtained by normalizing the attention score between the interacted item at the t-th position and the interacted item at the i-th position; y_t^u denotes the output self-attention information vector of the t-th position;
Step 2.5: the feedforward layer processes y_t^u with equation (8) to obtain the feedforward information FFN(y_t^u):
FFN(y_t^u) = relu(y_t^u W_1 + b_1)W_2 + b_2  (8)
In equation (8), W_1 and W_2 are weight matrices, b_1 and b_2 are biases, and relu is the activation function;
Step 2.6: the layer normalization of equation (9) is applied to y_t^u to obtain the normalized information LN(y_t^u), which is fed into the feedforward layer to obtain the feedforward information FFN(LN(y_t^u)); after a residual connection with y_t^u, i.e. F_t^u = y_t^u + FFN(LN(y_t^u)), the intermediate sequence representation F_t^u of the t-th position is obtained:
LN(y_t^u) = α ⊙ (y_t^u - μ_t) / √(σ_t² + ε) + β  (9)
In equation (9), μ_t and σ_t² denote the mean and variance of y_t^u; α and β are the scaling factor and the offset; ε denotes a small constant;
Step 2.7: the intermediate sequence representation F_t^u of the t-th position is fed into the self-attention module again and processed together with the order-dependence information and user-preference-change information matrix H^u to obtain the stacked self-attention information matrix Y^u; the processing of step 2.5 and step 2.6 is then applied again to obtain the final sequence representation F̂_t^u of the t-th position;
Step 2.8: the prediction layer applies equation (10) to the final sequence representation F̂_t^u of the t-th position to compute, for the t-th position of user u after passing through the order-dependence enhanced self-attention network, the score r_{i,t} of the i-th candidate item:
r_{i,t} = F̂_t^u m_i^T  (10)
In equation (10), m_i denotes the embedded vector representation of the i-th item in the item vector matrix M; r_{i,t} denotes the relevance score of the i-th item; T denotes transposition;
Step 2.9: the binary cross-entropy objective function Loss is constructed with equation (11):
Loss = -Σ_{S^u∈S} Σ_{z=1}^{|S^u|} [ log(σ(r_{o_z,z})) + log(1 - σ(r_{o'_z,z})) ]  (11)
In equation (11), S denotes the set of interaction sequences of all users; o_z denotes the number of the expected positive sample at the z-th position, the positive sample being the next item predicted to be interacted with by user u; o'_z denotes the number of the negative sample corresponding to the z-th position, the negative sample being an item that does not appear in the interaction sequence of user u; r_{o_z,z} denotes the score of the positive sample numbered o_z at the z-th position; r_{o'_z,z} denotes the score of the negative sample numbered o'_z at the z-th position;
Step 2.10: the order-dependence enhanced self-attention network is trained by back propagation and gradient descent so as to minimize the objective function Loss and update the network parameters; training stops when the number of iterations reaches the maximum number of iterations, yielding the optimal recommendation model, which outputs the scores of the candidate items for an input interaction sequence; the top items with the highest scores are then selected for recommendation.
The electronic device of the invention comprises a memory and a processor, and is characterized in that the memory is used for storing a program that supports the processor in executing the sequence recommendation method, and the processor is configured to execute the program stored in the memory.
The invention also relates to a computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, performs the steps of the sequence recommendation method.
Compared with the prior art, the invention has the beneficial effects that:
1. On the basis of considering the absolute position information of the items, the method further captures the order-dependence information of the user interaction sequence and the information about changes in user preference with a gating-based recurrent neural network (GRU), which strengthens the model's ability to capture multi-faceted information; when recommending to users, changes in user preference can therefore be detected more readily and the required items recommended more accurately.
2. The invention combines the self-attention method with the GRU and integrates the multi-faceted information captured by the GRU into the self-attention method; the attention mechanism also makes the model focus on information useful for future interacted items, so the interference of noise items can be filtered out and more useful suggestions provided to users.
3. The invention adopts a feedforward layer, which effectively compensates for the model's weakness in capturing non-linear features through two linear transformations and an activation function and improves the fault tolerance of the model, thereby strengthening the generalization ability of the method and adapting it to various application scenarios.
4. By stacking the self-attention network several times, the method captures the various kinds of dependence information of the interaction sequence more fully; meanwhile, layer normalization, residual connection and dropout regularization are adopted to prevent over-fitting of the model during training, which improves the training effect of the model and the accuracy of the recommendations made to users.
Drawings
Fig. 1 is a model diagram of a sequence recommendation method based on an order-dependent enhanced self-attention network according to the present invention.
Detailed Description
In this embodiment, a sequence recommendation method based on an order-dependence enhanced self-attention network mainly uses a gating-based recurrent neural network (GRU) and a self-attention network to extract the various kinds of information and dependency relationships contained in an interaction sequence. As shown in fig. 1, the input of the model is the historical interaction sequence of the user; the sequence is passed through an embedding layer to obtain an embedded vector representation of each interacted item; each interacted item is then added to the embedding of its corresponding position; the result of the addition is fed into the GRU to extract order-dependence information and user-preference-change information; self-attention is then applied to the GRU output together with the original information to extract the multi-faceted information of the interaction sequence; a feedforward layer improves the model's ability to capture non-linearity; stacking several self-attention and feedforward layers improves the model's ability to capture the various kinds of information, while layer normalization and residual connections prevent over-fitting; finally, the scores of the candidate items are computed from the extracted representation of the user's interaction sequence, and the top items with the highest scores are recommended to the user. Specifically, the method comprises the following steps:
Step one: acquire a user set U and, for any user u in U, acquire the interaction sequence S^u = (s_1^u, s_2^u, ..., s_{|S^u|}^u), ordered by timestamp; s_t^u denotes the t-th item interacted with by user u, and |S^u| denotes the length of the interaction sequence of user u; the item set I is thus formed by the interaction sequences of all users. Users with fewer than 5 interacted items and items with fewer than 5 interactions are not considered.
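For illustration, a minimal Python sketch of this data-construction step is given below; the function name, the input format and the way the 5-interaction threshold is applied are assumptions made for the example, not part of the patented method.

```python
from collections import defaultdict

def build_sequences(interactions, min_count=5):
    """Build the per-user interaction sequences S^u, ordered by timestamp.

    `interactions` is an iterable of (user_id, item_id, timestamp) triples.
    Users with fewer than `min_count` interactions and items with fewer than
    `min_count` occurrences are discarded, as described in this embodiment.
    """
    item_freq = defaultdict(int)
    for _, item, _ in interactions:
        item_freq[item] += 1

    per_user = defaultdict(list)
    for user, item, ts in interactions:
        if item_freq[item] >= min_count:          # drop rare items
            per_user[user].append((ts, item))

    sequences = {}
    for user, events in per_user.items():
        if len(events) < min_count:               # drop inactive users
            continue
        events.sort(key=lambda e: e[0])           # order S^u by timestamp
        sequences[user] = [item for _, item in events]
    return sequences
```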
Step two: constructing an order dependence-based enhanced self-attention network as shown in fig. 1 includes: the system comprises an embedded layer, a position information layer, a GRU module, a self-attention module, a feedforward layer and a prediction layer;
step 2.1, the embedding layer utilizes the project vector matrix
Figure BDA0003933958960000054
Interaction sequence S of user y u Conversion to an embedded vector matrix
Figure BDA0003933958960000055
And is provided with
Figure BDA0003933958960000056
Wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003933958960000057
to represent
Figure BDA0003933958960000058
The embedded vector representation of (a); i represents the length of the item set I; d represents the dimension of the item embedding vector;
step 2.2, the position information layer utilizes the position matrix
Figure BDA0003933958960000059
For embedded vector matrix E u Carrying out superposition processing to obtain a vector matrix X after position embedding u And is and
Figure BDA00039339589600000510
the position information can be displayed to show the context of each item in the interaction sequence of the user u. Wherein, the first and the second end of the pipe are connected with each other,
Figure BDA00039339589600000511
to represent
Figure BDA00039339589600000512
And the t-th position vector in the position matrix P
Figure BDA00039339589600000513
The added vector is represented.
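A minimal PyTorch sketch of the embedding layer and the position information layer (steps 2.1 and 2.2) follows; the padding convention and the maximum sequence length handling are assumptions of the example.

```python
import torch
import torch.nn as nn

class EmbeddingLayer(nn.Module):
    """Maps an item-id sequence to X^u = E^u + P (steps 2.1 and 2.2)."""

    def __init__(self, num_items, max_len, d):
        super().__init__()
        # item vector matrix M (index 0 reserved for padding, an assumption)
        self.item_emb = nn.Embedding(num_items + 1, d, padding_idx=0)
        # position matrix P
        self.pos_emb = nn.Embedding(max_len, d)

    def forward(self, seq):                        # seq: (batch, max_len) item ids
        positions = torch.arange(seq.size(1), device=seq.device)
        positions = positions.unsqueeze(0).expand_as(seq)
        return self.item_emb(seq) + self.pos_emb(positions)   # X^u
```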
Step 2.3: GRU module utilizes formula (1) -formula (4) to embed vector matrix X after position u Processing to obtain sequence dependence information and user preference change information matrix
Figure BDA00039339589600000514
Figure BDA00039339589600000515
Figure BDA00039339589600000516
Figure BDA00039339589600000517
Figure BDA00039339589600000518
In the formulae (1) to (4),W r and U r A weight matrix representing the reset gates is shown,
Figure BDA00039339589600000519
status information indicating the t-1 st position,
Figure BDA00039339589600000520
representing a t position middle vector of the reset gate, and sigma representing a sigmoid activation function; an element indicates that the number of lines,
Figure BDA00039339589600000521
represents the t-th position intermediate vector of the update gate; w z And U z A weight matrix representing the updated gate, tanh represents a hyperbolic tangent function,
Figure BDA00039339589600000522
candidate state information indicating the t-th position; w is a group of h And U h A weight matrix to be learned representing a candidate state;
Figure BDA00039339589600000523
indicating the order-dependent information and user preference variation information captured at the t-th position; the GRU module can well capture the sequence dependency relationship among the user interaction items and can instantly sense when the user preference changes so as to make better recommendation.
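A minimal PyTorch sketch of the GRU module of equations (1)-(4); an explicit cell is written out so that each line mirrors one equation, although torch.nn.GRU could equally be used. All class and variable names are assumptions of the sketch.

```python
import torch
import torch.nn as nn

class GRUModule(nn.Module):
    """Computes H^u from X^u following equations (1)-(4)."""

    def __init__(self, d):
        super().__init__()
        self.W_r, self.U_r = nn.Linear(d, d, bias=False), nn.Linear(d, d, bias=False)
        self.W_z, self.U_z = nn.Linear(d, d, bias=False), nn.Linear(d, d, bias=False)
        self.W_h, self.U_h = nn.Linear(d, d, bias=False), nn.Linear(d, d, bias=False)

    def forward(self, x):                                       # x: (batch, seq_len, d)
        h = x.new_zeros(x.size(0), x.size(2))
        outputs = []
        for t in range(x.size(1)):
            x_t = x[:, t]
            r = torch.sigmoid(self.W_r(x_t) + self.U_r(h))      # eq. (1), reset gate
            z = torch.sigmoid(self.W_z(x_t) + self.U_z(h))      # eq. (2), update gate
            h_cand = torch.tanh(self.W_h(x_t) + self.U_h(r * h))  # eq. (3), candidate state
            h = (1 - z) * h + z * h_cand                        # eq. (4), output state
            outputs.append(h)
        return torch.stack(outputs, dim=1)                      # H^u: (batch, seq_len, d)
```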
Step 2.4: the self-attention module utilizes a vector matrix X after embedding the positions by using an expression (5) to an expression (7) u And order dependency information and user preference change information matrix H u Processing to obtain a self-attention information matrix
Figure BDA0003933958960000061
Figure BDA0003933958960000062
Figure BDA0003933958960000063
Figure BDA0003933958960000064
In the formula (5), the reaction mixture is,
Figure BDA0003933958960000065
is a self-attentive query vector weight matrix,
Figure BDA0003933958960000066
a weight matrix that is a vector of values from attention; e.g. of the type ti Representing attention scores of the interactive items at the t positions and the interactive items at the ith position; a (a) to ti A weight value that is an attention score of the interactive item at the t-th position and the interactive item at the i-th position;
Figure BDA0003933958960000067
and
Figure BDA0003933958960000068
respectively, the i-th position embedding vector
Figure BDA0003933958960000069
And the order dependency information and user preference variation information vector of the ith position
Figure BDA00039339589600000610
The weight matrix of (a) is determined,
Figure BDA00039339589600000611
an output self-attention information vector representing the t-th position;
Figure BDA00039339589600000612
mainly for preventing e in the input formula (6) ij The value is too large, resulting in the partial derivative approaching 0. Some items which are irrelevant to the items to be interacted with by the user in the future exist in the interaction sequence of the user, and the attention mechanism can be well usedInformation of interaction items having a large influence on the items that we want to interact with in the future is captured, and interference of such noise data is reduced.
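A minimal PyTorch sketch of the order-dependence enhanced self-attention of equations (5)-(7), following the reconstruction above (query mixed from x_t and h_t, value mixed from x_i and h_i). The causal mask that forbids attending to future positions is an added assumption, common in sequence recommendation but not stated explicitly here.

```python
import math
import torch
import torch.nn as nn

class OrderEnhancedSelfAttention(nn.Module):
    """Self-attention over X^u enhanced with H^u, sketching equations (5)-(7)."""

    def __init__(self, d):
        super().__init__()
        self.w_q1 = nn.Linear(d, d, bias=False)   # W_1^Q, applied to x_t
        self.w_q2 = nn.Linear(d, d, bias=False)   # W_2^Q, applied to h_t
        self.w_v1 = nn.Linear(d, d, bias=False)   # W_1^V, applied to x_i
        self.w_v2 = nn.Linear(d, d, bias=False)   # W_2^V, applied to h_i
        self.scale = math.sqrt(d)

    def forward(self, x, h, causal=True):          # x, h: (batch, seq_len, d)
        q = self.w_q1(x) + self.w_q2(h)
        v = self.w_v1(x) + self.w_v2(h)
        scores = torch.matmul(q, x.transpose(-2, -1)) / self.scale   # e_{ti}, eq. (5)
        if causal:                                  # only attend to positions i <= t
            future = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
            scores = scores.masked_fill(future, float("-inf"))
        a = torch.softmax(scores, dim=-1)           # a_{ti}, eq. (6)
        return torch.matmul(a, v)                   # Y^u, eq. (7)
```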
Step 2.5, the feedforward layer utilizes the pair of formula (8)
Figure BDA00039339589600000613
Processing to obtain a feed-forward information matrix FFN (y) i ):
FFN(y i )=relu(W 1 y i +b 1 )W 2 +b 2 (8)
In the formula (8), W 1 And W 2 Is a weight matrix; b 1 And b 2 Is an offset; relu is an activation function; the feedforward layer is composed of a linear change function and an activation function, the nonlinear characteristics of the model can be increased, and the fault tolerance capability and the generalization capability of the model are improved.
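A minimal PyTorch sketch of the feedforward layer of equation (8); the dropout rate is an assumed value.

```python
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise feedforward layer: FFN(y) = relu(y W_1 + b_1) W_2 + b_2."""

    def __init__(self, d, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d, d),      # W_1, b_1
            nn.ReLU(),
            nn.Linear(d, d),      # W_2, b_2
            nn.Dropout(dropout),  # dropout regularization mentioned in this embodiment
        )

    def forward(self, y):
        return self.net(y)
```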
Step 2.6, utilizing the pair of the formula (9)
Figure BDA00039339589600000614
After normalization operation is carried out, a normalized normalization information matrix is obtained
Figure BDA00039339589600000615
Then inputting the data into a feedforward layer for processing to obtain a feedforward information matrix
Figure BDA00039339589600000616
And is combined with
Figure BDA00039339589600000617
After residual error connection, a middle sequence list representation matrix of the t-th position is obtained
Figure BDA00039339589600000618
Figure BDA00039339589600000619
In the formula (9), the reaction mixture is,
Figure BDA00039339589600000620
and
Figure BDA00039339589600000621
represent
Figure BDA00039339589600000622
Mean and variance of; α and β are a scaling factor and an offset; e =1e -8 Representing a constant. By adopting layer normalization and residual connection, the overfitting problem in the model training process and the information loss problem caused by the deepening of the network depth can be effectively reduced.
Step 2.7, represent the t-th position middle order
Figure BDA00039339589600000623
Input into the attention module and is related to the sequence dependency information and the user preference change information matrix H u Processing the two together to obtain a laminated self-attention information matrix Y u Then, the processing of step 2.5 and step 2.6 is carried out to obtain the final sequence representation of the t-th position
Figure BDA0003933958960000071
The two-time processing enables the model to better capture information in various aspects of the user interaction sequence, and improves the recommendation accuracy of the model.
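Step 2.7 amounts to applying the block a second time; a small wrapper that stacks the blocks might look like this (the number of blocks and the way H^u is shared between blocks are assumptions of the sketch):

```python
import torch.nn as nn

class OrderEnhancedEncoder(nn.Module):
    """Stacks the attention + feedforward block; two blocks reproduce steps 2.4-2.7."""

    def __init__(self, d, num_blocks=2, dropout=0.2):
        super().__init__()
        self.blocks = nn.ModuleList([EncoderBlock(d, dropout) for _ in range(num_blocks)])

    def forward(self, x, h):
        out = x
        for block in self.blocks:
            out = block(out, h)        # H^u is fed to every block
        return out                     # final sequence representation
```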
Step 2.8, the prediction layer represents the t-th position final sequence by using the formula (10)
Figure BDA0003933958960000072
Calculating to obtain the score r of the ith position interactive item of the user u to the ith item after the t position interactive item passes through the sequential self-attention-dependent network i,t
Figure BDA0003933958960000073
In the formula (10), the compound represented by the formula (10),
Figure BDA0003933958960000074
an embedded vector representation representing the ith item in the item vector matrix M; r is i,t Representing the relevance score of the ith item; t represents transposition;
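A one-line sketch of the prediction layer of equation (10): the final representation is matched against every row of the item vector matrix M.

```python
import torch

def score_items(final_repr, item_matrix):
    """Relevance scores r_{i,t} = F_t^u m_i^T of equation (10).

    final_repr:  (batch, seq_len, d) final sequence representations
    item_matrix: (num_items, d)      item vector matrix M
    returns:     (batch, seq_len, num_items) scores over all candidate items
    """
    return torch.matmul(final_repr, item_matrix.transpose(0, 1))
```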
Step 2.9: the binary cross-entropy objective function Loss is constructed with equation (11):
Loss = -Σ_{S^u∈S} Σ_{z=1}^{|S^u|} [ log(σ(r_{o_z,z})) + log(1 - σ(r_{o'_z,z})) ]  (11)
In equation (11), S denotes the set of interaction sequences of all users; o_z denotes the number of the expected positive sample at the z-th position, the positive sample being the next item predicted to be interacted with by user u; o'_z denotes the number of the negative sample corresponding to the z-th position, the negative sample being an item that does not appear in the interaction sequence of user u; r_{o_z,z} denotes the score of the positive sample numbered o_z at the z-th position; r_{o'_z,z} denotes the score of the negative sample numbered o'_z at the z-th position. In this embodiment, the data set is divided into a training set, a validation set and a test set: the most recent interacted item of each user forms the test set, the second most recent forms the validation set, and the rest form the training set.
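A minimal sketch of the binary cross-entropy objective of equation (11) with one sampled negative item per position; the padding mask and the sampling scheme are assumptions of the example.

```python
import torch
import torch.nn.functional as F

def bce_loss(final_repr, item_emb, pos_ids, neg_ids, valid_mask):
    """Equation (11): -sum[ log sigma(r_pos) + log(1 - sigma(r_neg)) ].

    final_repr:       (batch, seq_len, d) final sequence representations
    item_emb:         nn.Embedding holding the item vector matrix M
    pos_ids, neg_ids: (batch, seq_len) positive / sampled negative item ids
    valid_mask:       (batch, seq_len) 1.0 for real positions, 0.0 for padding
    """
    r_pos = (final_repr * item_emb(pos_ids)).sum(-1)       # r_{o_z, z}
    r_neg = (final_repr * item_emb(neg_ids)).sum(-1)       # r_{o'_z, z}
    loss = -(F.logsigmoid(r_pos) + F.logsigmoid(-r_neg))   # log(1 - sigma(x)) = logsigmoid(-x)
    return (loss * valid_mask).sum() / valid_mask.sum()
```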
Step 2.10, training the order dependence-based enhanced self-attention network by using a back propagation and gradient descent method, wherein the gradient descent method adopts a learning rate of 0.001 and an exponential decay rate beta 1 =0.9,β 2 =0.98, adam optimization algorithm is used; and minimizing the Loss of the target function to update the network parameters, stopping training when the iteration times reach the maximum iteration times of 600, so as to obtain the score of the optimal recommendation model for outputting the candidate items of the input interaction sequence, and selecting the top items with the maximum score from the optimal recommendation model for recommendation.
In this embodiment, an electronic device includes a memory for storing a program that supports a processor to execute the above-described sequence recommendation method, and a processor configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program, and the computer program is executed by a processor to execute the steps of the sequence recommendation method.

Claims (3)

1. A sequence recommendation method based on a sequence dependence enhanced self-attention network, characterized by comprising the following steps:
Step one: acquire a user set U and, for any user u in U, acquire the interaction sequence S^u = (s_1^u, s_2^u, ..., s_{|S^u|}^u), where s_t^u denotes the t-th item interacted with by user u and |S^u| denotes the length of the interaction sequence of user u; the item set I is formed by the interaction sequences of all users;
Step two: construct the order-dependence enhanced self-attention network, which comprises an embedding layer, a position information layer, a GRU module, a self-attention module, a feedforward layer and a prediction layer;
Step 2.1: the embedding layer uses the item vector matrix M ∈ R^{|I|×d} to convert the interaction sequence S^u of user u into an embedded vector matrix E^u ∈ R^{|S^u|×d}, with E^u = [e_1^u; e_2^u; ...; e_{|S^u|}^u], where e_t^u denotes the embedded vector representation of s_t^u, |I| denotes the length of the item set I, and d denotes the dimension of the item embedding vectors;
Step 2.2: the position information layer uses a position matrix P to add position information to the embedded vector matrix E^u, yielding the position-embedded vector matrix X^u ∈ R^{|S^u|×d} with X^u = [x_1^u; x_2^u; ...; x_{|S^u|}^u], where x_t^u = e_t^u + p_t denotes the vector obtained by adding e_t^u and the t-th position vector p_t of the position matrix P;
Step 2.3: the GRU module processes the position-embedded vector matrix X^u with equations (1)-(4) to obtain the order-dependence information and user-preference-change information matrix H^u ∈ R^{|S^u|×d}:
r_t^u = σ(W_r x_t^u + U_r h_{t-1}^u)  (1)
z_t^u = σ(W_z x_t^u + U_z h_{t-1}^u)  (2)
h̃_t^u = tanh(W_h x_t^u + U_h (r_t^u ⊙ h_{t-1}^u))  (3)
h_t^u = (1 - z_t^u) ⊙ h_{t-1}^u + z_t^u ⊙ h̃_t^u  (4)
In equations (1)-(4), W_r and U_r denote the weight matrices of the reset gate; h_{t-1}^u denotes the state information of the (t-1)-th position; r_t^u denotes the intermediate vector of the reset gate at the t-th position; σ denotes the sigmoid activation function; ⊙ denotes the element-wise product; z_t^u denotes the intermediate vector of the update gate at the t-th position; W_z and U_z denote the weight matrices of the update gate; tanh denotes the hyperbolic tangent function; h̃_t^u denotes the candidate state information of the t-th position; W_h and U_h denote the weight matrices to be learned for the candidate state; h_t^u denotes the order-dependence information and user-preference-change information captured at the t-th position;
Step 2.4: the self-attention module processes the position-embedded vector matrix X^u and the order-dependence information and user-preference-change information matrix H^u with equations (5)-(7) to obtain the self-attention information matrix Y^u ∈ R^{|S^u|×d}:
e_{ti} = ((x_t^u W_1^Q + h_t^u W_2^Q)(x_i^u)^T) / √d  (5)
a_{ti} = exp(e_{ti}) / Σ_{j=1}^{|S^u|} exp(e_{tj})  (6)
y_t^u = Σ_{i=1}^{|S^u|} a_{ti}(x_i^u W_1^V + h_i^u W_2^V)  (7)
In equations (5)-(7), W_1^Q and W_2^Q are the two weight matrices of the self-attention query vector; W_1^V and W_2^V are the two weight matrices of the self-attention value vector, applied respectively to the embedded vector x_i^u of the i-th position and to the order-dependence and preference-change information vector h_i^u of the i-th position; e_{ti} denotes the attention score between the interacted item at the t-th position and the interacted item at the i-th position; a_{ti} is the weight obtained by normalizing the attention score between the interacted item at the t-th position and the interacted item at the i-th position; y_t^u denotes the output self-attention information vector of the t-th position;
Step 2.5: the feedforward layer processes y_t^u with equation (8) to obtain the feedforward information FFN(y_t^u):
FFN(y_t^u) = relu(y_t^u W_1 + b_1)W_2 + b_2  (8)
In equation (8), W_1 and W_2 are weight matrices, b_1 and b_2 are biases, and relu is the activation function;
Step 2.6: the layer normalization of equation (9) is applied to y_t^u to obtain the normalized information LN(y_t^u), which is fed into the feedforward layer to obtain the feedforward information FFN(LN(y_t^u)); after a residual connection with y_t^u, i.e. F_t^u = y_t^u + FFN(LN(y_t^u)), the intermediate sequence representation F_t^u of the t-th position is obtained:
LN(y_t^u) = α ⊙ (y_t^u - μ_t) / √(σ_t² + ε) + β  (9)
In equation (9), μ_t and σ_t² denote the mean and variance of y_t^u; α and β are the scaling factor and the offset; ε denotes a small constant;
Step 2.7: the intermediate sequence representation F_t^u of the t-th position is fed into the self-attention module again and processed together with the order-dependence information and user-preference-change information matrix H^u to obtain the stacked self-attention information matrix Y^u; the processing of step 2.5 and step 2.6 is then applied again to obtain the final sequence representation F̂_t^u of the t-th position;
Step 2.8: the prediction layer applies equation (10) to the final sequence representation F̂_t^u of the t-th position to compute, for the t-th position of user u after passing through the order-dependence enhanced self-attention network, the score r_{i,t} of the i-th candidate item:
r_{i,t} = F̂_t^u m_i^T  (10)
In equation (10), m_i denotes the embedded vector representation of the i-th item in the item vector matrix M; r_{i,t} denotes the relevance score of the i-th item; T denotes transposition;
Step 2.9: the binary cross-entropy objective function Loss is constructed with equation (11):
Loss = -Σ_{S^u∈S} Σ_{z=1}^{|S^u|} [ log(σ(r_{o_z,z})) + log(1 - σ(r_{o'_z,z})) ]  (11)
In equation (11), S denotes the set of interaction sequences of all users; o_z denotes the number of the expected positive sample at the z-th position, the positive sample being the next item predicted to be interacted with by user u; o'_z denotes the number of the negative sample corresponding to the z-th position, the negative sample being an item that does not appear in the interaction sequence of user u; r_{o_z,z} denotes the score of the positive sample numbered o_z at the z-th position; r_{o'_z,z} denotes the score of the negative sample numbered o'_z at the z-th position;
Step 2.10: the order-dependence enhanced self-attention network is trained by back propagation and gradient descent so as to minimize the objective function Loss and update the network parameters; training stops when the number of iterations reaches the maximum number of iterations, yielding the optimal recommendation model, which outputs the scores of the candidate items for an input interaction sequence; the top items with the highest scores are then selected for recommendation.
2. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that enables the processor to perform the sequence recommendation method of claim 1, and the processor is configured to execute the program stored in the memory.
3. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the sequence recommendation method of claim 1.
CN202211398166.4A 2022-11-09 2022-11-09 Sequence recommendation method based on sequence dependence enhanced self-attention network Pending CN115687772A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211398166.4A CN115687772A (en) 2022-11-09 2022-11-09 Sequence recommendation method based on sequence dependence enhanced self-attention network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211398166.4A CN115687772A (en) 2022-11-09 2022-11-09 Sequence recommendation method based on sequence dependence enhanced self-attention network

Publications (1)

Publication Number Publication Date
CN115687772A true CN115687772A (en) 2023-02-03

Family

ID=85049622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211398166.4A Pending CN115687772A (en) 2022-11-09 2022-11-09 Sequence recommendation method based on sequence dependence enhanced self-attention network

Country Status (1)

Country Link
CN (1) CN115687772A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116645174A (en) * 2023-07-27 2023-08-25 山东省人工智能研究院 Personalized recommendation method based on decoupling multi-behavior characterization learning
CN116645174B (en) * 2023-07-27 2023-10-17 山东省人工智能研究院 Personalized recommendation method based on decoupling multi-behavior characterization learning
CN117172884A (en) * 2023-10-31 2023-12-05 上海为旌科技有限公司 Method, device, electronic equipment and storage medium for recommending places of interest

Similar Documents

Publication Publication Date Title
CN111079532B (en) Video content description method based on text self-encoder
US10540967B2 (en) Machine reading method for dialog state tracking
CN111339415B (en) Click rate prediction method and device based on multi-interactive attention network
CN113905391B (en) Integrated learning network traffic prediction method, system, equipment, terminal and medium
CN115687772A (en) Sequence recommendation method based on sequence dependence enhanced self-attention network
CN113673594B (en) Defect point identification method based on deep learning network
CN115082147B (en) Sequence recommendation method and device based on hypergraph neural network
CN111695779A (en) Knowledge tracking method, knowledge tracking device and storage medium
EP3411835B1 (en) Augmenting neural networks with hierarchical external memory
CN108182260B (en) Multivariate time sequence classification method based on semantic selection
CN114493755B (en) Self-attention sequence recommendation method fusing time sequence information
CN114519145A (en) Sequence recommendation method for mining long-term and short-term interests of users based on graph neural network
CN115222998B (en) Image classification method
CN112258262A (en) Conversation recommendation method based on convolution self-attention network
CN111178986B (en) User-commodity preference prediction method and system
CN114020964A (en) Method for realizing video abstraction by using memory network and gated cyclic unit
CN111027681B (en) Time sequence data processing model training method, data processing method, device and storage medium
Xing et al. Few-shot single-view 3d reconstruction with memory prior contrastive network
CN113821724B (en) Time interval enhancement-based graph neural network recommendation method
US20240037133A1 (en) Method and apparatus for recommending cold start object, computer device, and storage medium
CN111753995A (en) Local interpretable method based on gradient lifting tree
CN112069404A (en) Commodity information display method, device, equipment and storage medium
CN114511813B (en) Video semantic description method and device
CN113779244B (en) Document emotion classification method and device, storage medium and electronic equipment
CN114692012A (en) Electronic government affair recommendation method based on Bert neural collaborative filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination