CN113742596A

CN113742596A - Attention mechanism-based neural collaborative filtering recommendation method

Info

Publication number: CN113742596A
Application number: CN202111097754.XA
Authority: CN
Inventors: 张峰; 孟祥福
Original assignee: Liaoning Technical University
Current assignee: Liaoning Technical University
Priority date: 2021-09-18
Filing date: 2021-09-18
Publication date: 2021-12-03

Abstract

The invention discloses a neural collaborative filtering recommendation method based on an attention mechanism, which comprises the following steps: data acquisition and processing; dividing the data set, in which the processed data are split into a training set and a test set by a leave-one-out mechanism; constructing a neural collaborative filtering model based on an attention mechanism, in which a feature-fusion attention part is built using the local inference part of the published ESIM model, a feature interaction part between users and items is learned by a multilayer perceptron, and the two parts are finally fused for recommendation; and model training and result display, in which the obtained training and test data sets are used to train and evaluate the constructed model, and whether to recommend an item to a user is decided according to the user's predicted score for the item. The method introduces an attention mechanism to assign weights and combines it with a multilayer perceptron (MLP) to capture multi-layer interaction information between users and items, thereby improving recommendation performance.

Description

Attention mechanism-based neural collaborative filtering recommendation method

Technical Field

The invention belongs to the technical field of information recommendation, and particularly relates to a neural collaborative filtering recommendation method based on an attention mechanism.

Background

With the development of science and technology and the improvement of people's living standards, the scale and coverage of the Internet keep growing, and the amount of information generated is increasing explosively. The sheer volume of information forces users to spend a great deal of time filtering and integrating it to find what is useful. Information overload has become an urgent problem of the current information age.

To help people use information more effectively, recommendation systems have emerged as an important means of filtering information and stand out among the many filtering technologies. They meet mainstream needs and provide users with fast, efficient personalized services. The recommendation algorithm is the main component of a recommendation system; current algorithms can be broadly classified into content-based recommendation, collaborative-filtering-based recommendation, hybrid recommendation and the like. Collaborative filtering is the most widely and successfully applied of these, and its core idea is to recommend products that may interest a target user by using the behavior and feedback of other users. That is, a user is likely to like the resources favored by similar users, and a user who likes a resource is likely to also like other resources similar to it.

A key factor in modeling for traditional collaborative filtering is the interaction between user and item features. Traditional models handle it by matrix factorization, taking the inner product (dot product) of the user and item latent feature vectors. However, the implicit feedback obtained from user-item interactions tends to cause problems for the system. For example, a recorded interaction between user u and item i does not mean that user u really likes item i; likewise, the absence of an interaction record does not mean that user u dislikes item i, because user u may simply be unaware that item i exists. Sparse user-item data can cause large errors in the recommendation ranking, and using a large number of latent factors to compensate can harm the generalization ability of the recommendation system.

Disclosure of Invention

In view of the defects of the prior art, the technical problem to be solved by the invention is to provide a neural collaborative filtering recommendation method based on an attention mechanism that overcomes the limited expressiveness of the inner product operation in traditional collaborative filtering algorithms and the resulting large recommendation errors, and that offers good recommendation accuracy and interpretability.

In order to solve the technical problems, the invention is realized by the following technical scheme:

the invention provides a neural collaborative filtering recommendation method based on an attention mechanism, which comprises the following steps of:

step S1, data acquisition and processing: downloading the data sets from the official MovieLens-1M website and the official Pinterest website, and cleaning dirty data;

step S2, dividing the data set: dividing the processed data into a training set and a test set according to a leave-one-out mechanism;

step S3, constructing a neural collaborative filtering model based on an attention mechanism: building a feature-fusion attention part using the local inference part of the published ESIM model, learning a feature interaction part between users and items with a multilayer perceptron, and finally fusing the attention part and the feature interaction part for recommendation;

step S4, model training and result display: the training data set and the testing data set obtained in the step S2 are used for training and evaluating the neural collaborative filtering model constructed in the step S3, and whether to recommend the item to the user is judged according to the predicted score of the user for the item.

The specific steps of the data acquisition and processing of step S1 are as follows:

S11, retaining only the users of the MovieLens-1M data set who have at least 20 ratings, which yields 6040 users, 3706 items and 1000209 interactions; likewise, retaining only the users of the Pinterest data set who have at least 20 pins, which yields 5187 users, 9916 items and 1500809 interactions.
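As a non-limiting illustration, the user filtering of step S11 may be sketched as follows. The function name `filter_active_users` and the representation of the data as (user, item) pairs are assumptions made for this example; they are not the format of the original data files.

```python
from collections import Counter


def filter_active_users(interactions, min_interactions=20):
    """Keep only the interactions of users with at least `min_interactions` records.

    `interactions` is a list of (user_id, item_id) pairs (assumed layout).
    """
    counts = Counter(user for user, _ in interactions)
    return [(u, i) for u, i in interactions if counts[u] >= min_interactions]


# Toy data: user "a" has 20 interactions, user "b" has only 3.
toy = [("a", n) for n in range(20)] + [("b", n) for n in range(3)]
kept = filter_active_users(toy)
```

In this toy example only the records of user "a" remain, since user "b" falls below the threshold of 20.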

Further, the step S3 is a specific step of constructing a neural collaborative filtering model based on the attention mechanism, and the specific step is as follows:

S31, constructing an input layer: the input layer comprises two parts, namely the one-hot codes of the user and item IDs, and the user and item embedding vectors;

S32, constructing an ANCF layer: a learning framework for the attention-based neural collaborative filtering recommendation method is constructed, which is defined in the invention as the ANCF (Attention Neural Collaborative Filtering) layer;

S33, constructing an output layer: the user's rating of the item is predicted by taking the result obtained in step S32 as input and passing it through an activation function.

Further, the specific steps of constructing the ANCF layer in step S32 are as follows:

S32-1, constructing an Attention model: using the user and item embedding vectors, inference information about the user and item embeddings is extracted through a local inference mechanism to construct the Attention model;

S32-2, constructing a user-item interaction MLP model: the user and item embedding vectors are concatenated and fed into a multilayer perceptron, which learns the nonlinear interaction between the user embedding and the item embedding;

S32-3, constructing a fusion model: the outputs of the models obtained in steps S32-1 and S32-2 are concatenated into a new vector, which is fed into a fully connected layer; the output of the fully connected layer serves as the input vector of the output layer.

Further, the step S32-1 is a specific step of constructing an Attention model as follows:

S32-1-1, the user and item embedding vectors are used as the input of the Attention model;

S32-1-2, a max-pooling operation is applied to the inputs of the Attention model, and the pooled outputs are combined by several operations such as pairwise multiplication, summation and averaging to capture differences;

S32-1-3, the three results obtained by the combination operations of step S32-1-2 are each multiplied by one of three weight components, and the three weighted results are then multiplied together to obtain the output of the Attention model.

Further, the specific steps of constructing the user project interaction MLP model in step S32-2 are as follows:

S32-2-1, the user and item embedding vectors are concatenated and fed into the perceptron layers of a multilayer perceptron (MLP);

S32-2-2, the MLP learns the user-item interaction and converts the input vector into a user-item interaction vector, which serves as the input of the next layer of the model.

Further, the specific steps of constructing the output layer in step S33 are as follows:

S33-1, the output vector obtained in step S32-3 is activated with the sigmoid function, which compresses it into the range [0, 1].

Further, the specific steps of model training and result displaying in step S4 are as follows:

S41: constructing a loss function: the loss of the model is calculated using the binary cross-entropy (BCE) loss function. The formula is as follows:
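As a non-limiting illustration, the binary cross-entropy loss may be sketched in plain Python as follows. The mean over samples follows the default behaviour of `torch.nn.BCELoss`; the function name `bce_loss` and the clipping constant `eps` are assumptions made for this example.

```python
import math


def bce_loss(predictions, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(predictions, labels):
        # Clip the probability so that log() never receives 0 (assumed safeguard).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(predictions)


# A prediction of 0.5 is maximally uncertain for either label.
loss_value = bce_loss([0.5, 0.5], [1, 0])
```

Confident and correct predictions drive the loss toward zero, whereas confident and wrong predictions are penalized heavily.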

S42: constructing an optimization function: Adam is used as the optimizer of the model.

S43: calculating HR@10 and NDCG@10: the HR and NDCG metrics are used to evaluate model performance. HR measures the recall of the recommendation results, i.e., how many of the top N recommended items hit the user's actual preferences; the larger the value, the better. The expression is as follows:

the denominator is the total number of test items, and the numerator is the number of test items, summed over all users, that appear in each user's top N recommendation list, where N is set to 10 in the present application.
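As a non-limiting illustration, the hit ratio described above may be computed as sketched below. The sketch assumes the leave-one-out setting, in which each user has exactly one held-out test item; the function name `hit_ratio_at_n` and the dictionary layout are assumptions made for this example.

```python
def hit_ratio_at_n(ranked_lists, test_items, n=10):
    """Fraction of users whose held-out item appears in their top-n list.

    ranked_lists: user -> item IDs sorted by predicted score, best first.
    test_items:   user -> the single held-out test item of that user.
    """
    hits = sum(
        1 for user, item in test_items.items() if item in ranked_lists[user][:n]
    )
    return hits / len(test_items)


ranked = {"u1": [5, 3, 9], "u2": [7, 8, 2]}
held_out = {"u1": 3, "u2": 4}
hr = hit_ratio_at_n(ranked, held_out, n=10)
```

Here the held-out item of user "u1" is found in the list and that of user "u2" is not, so half of the test items are hit.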

NDCG is likewise often used to evaluate ranking quality because it discriminates well between rankings: when the model outputs a top-N recommendation list, NDCG measures the accuracy of the ordering, and a larger value is better. The expression is as follows:

$$\mathrm{NDCG@}k = Z_k \sum_{p=1}^{k} \frac{2^{r_p} - 1}{\log_2(p + 1)}$$

where $Z_k$ is the normalization coefficient, i.e., the reciprocal of the value of the sum in the ideal case, and $r_p$ denotes the graded relevance at the p-th position, defined as follows:

$$r_p = \begin{cases} 1, & i_p \in T \\ 0, & \text{otherwise} \end{cases}$$

where i denotes an item, $i_p$ denotes the item at the p-th position, and T denotes the experimental test set.
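As a non-limiting illustration, NDCG with binary relevance (r_p equal to 1 when the item at position p belongs to the test set, and 0 otherwise) may be computed for a single user as sketched below. The function name `ndcg_at_n` and the use of the standard discounted cumulative gain form are assumptions made for this example.

```python
import math


def ndcg_at_n(ranked, relevant, n=10):
    """NDCG@n for one user: `ranked` is best-first, `relevant` is the test set."""
    dcg = sum(
        (2 ** (1 if item in relevant else 0) - 1) / math.log2(p + 1)
        for p, item in enumerate(ranked[:n], start=1)
    )
    # Ideal case: all relevant items occupy the first positions.
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(p + 1) for p in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


score_second = ndcg_at_n([5, 3, 9], {3})   # test item ranked second
score_first = ndcg_at_n([3, 5, 9], {3})    # test item ranked first
```

The closer the test item is to the top of the list, the higher the score; a list that misses the test item entirely scores zero.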

From the above, the attention mechanism-based neural collaborative filtering recommendation method of the invention has at least the following advantages:

(1) From the perspective of implicit feedback, the method uses a max-pooling operation and combines several different data fusion modes to obtain more feature values, and models the interaction between the user and the data with a multilayer neural network, which overcomes the limitation of traditional matrix factorization in the rating prediction task and improves recommendation quality.

(2) The method uses and extends the local inference of the ESIM (Enhanced LSTM for Natural Language Inference) algorithm to capture differences, introduces an attention mechanism to assign weights, and combines a multilayer perceptron (MLP) to capture multi-layer interaction information between users and items, thereby improving the quality and accuracy of recommendation.

The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical means of the present invention more clearly understood, the present invention may be implemented in accordance with the content of the description, and in order to make the above and other objects, features, and advantages of the present invention more clearly understood, the following detailed description is given in conjunction with the preferred embodiments, together with the accompanying drawings.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings of the embodiments will be briefly described below.

FIG. 1 is a flow chart of a neural collaborative filtering recommendation method based on attention mechanism according to the present invention;

FIG. 2 is a flow chart of the neural collaborative filtering recommendation model based on attention mechanism of the present invention;

FIG. 3 is a flow diagram of an attention-based design collaborative filtering recommendation learning framework of the present invention;

FIG. 4 is a block diagram of the Attention model of the present invention;

FIG. 5 is a block diagram of a user project interaction MLP model of the present invention;

FIG. 6 is a general structural diagram of the attention mechanism-based neural collaborative filtering recommendation method of the present invention.

Detailed Description

Other aspects, features and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which form a part of this specification, and which illustrate, by way of example, the principles of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in fig. 1 to 6, the present invention provides a neural collaborative filtering recommendation method based on attention mechanism, including the following steps:

S1, data acquisition and processing: in this embodiment, to test how accurately user-item relationships predict recommended items, experiments are performed on the MovieLens-1M and Pinterest data sets to demonstrate the accuracy of the method. The MovieLens-1M data set is a widely used movie rating data set. It contains ratings of many movies by many users, together with movie metadata and user attribute information, and has been widely used to evaluate collaborative filtering algorithms. In processing the data set, inactive users are removed and only users with at least 20 rating records are retained, which finally yields a data set of about one million ratings in which every user has at least 20 ratings. Pinterest-20 is an implicit feedback data set for evaluating content-based image recommendation, in which a user expresses a preference by pinning a picture to his or her board. More than one million interactions are likewise obtained on the Pinterest-20 data set. The statistics of the two data sets are shown in Table 1 below.

Table 1: statistical information of the Experimental data set (after data processing)

S2, dividing the data set: in this embodiment, a leave-one-out mechanism is adopted to split the data into a training set and a test set: each user's latest interaction is held out as the test item, and the remaining interactions are used as the training set. During testing, 100 items with which the user has not interacted are randomly sampled, and the test item is ranked among these 100 items.
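As a non-limiting illustration, the leave-one-out split and the sampling of 100 non-interacted items may be sketched as follows. The function names, the (user, item, timestamp) record layout and the use of a timestamp to identify the latest interaction are assumptions made for this example.

```python
import random


def leave_one_out(interactions):
    """Hold out each user's latest (user, item, timestamp) record as the test item."""
    latest = {}
    for user, item, ts in interactions:
        if user not in latest or ts > latest[user][1]:
            latest[user] = (item, ts)
    test = {user: item for user, (item, _) in latest.items()}
    train = [(u, i) for u, i, ts in interactions if (i, ts) != latest[u]]
    return train, test


def sample_candidates(user, test_item, seen_items, all_items, n_neg=100, rng=random):
    """Return the test item followed by `n_neg` random items the user never touched."""
    pool = sorted(all_items - seen_items[user])
    return [test_item] + rng.sample(pool, n_neg)


records = [("u1", 1, 10), ("u1", 2, 30), ("u1", 3, 20)]
train, test = leave_one_out(records)
seen = {"u1": {1, 2, 3}}
candidates = sample_candidates(
    "u1", test["u1"], seen, set(range(1, 201)), rng=random.Random(0)
)
```

The model then scores every candidate, and the position of the test item in the resulting ranking determines the evaluation metrics.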

S3, constructing a neural collaborative filtering model based on an attention mechanism: a feature-fusion attention part is built using the local inference part of the published ESIM model, a feature interaction part between users and items is learned with a multilayer perceptron, and the attention part and the feature interaction part are finally fused for recommendation.

S4, model training and result display: the training data set and the testing data set obtained in the step S2 are used for training and evaluating the neural collaborative filtering model constructed in the step S3, and whether to recommend the item to the user is judged according to the predicted score of the user for the item.

As shown in fig. 2, the specific steps of constructing the attention mechanism-based neural collaborative filtering model in step S3 are as follows:

S31, constructing an input layer: the input layer comprises two parts, namely the one-hot codes of the user and item IDs, and the user and item embedding vectors.

In this embodiment, the user and item IDs are first taken as input features and converted into binary sparse vectors by one-hot encoding.

The conversion relationship is shown in Table 2 below:

Table 2: One-hot encoding of user/item IDs (MovieLens)

ID	One-hot code
0	000…001
1	000…010
…	…
6039	100…000
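As a non-limiting illustration, the encoding convention of Table 2, in which the set bit is counted from the right-hand end, may be sketched as follows. The function name `one_hot` and the use of a bit string as the output are assumptions made for this example.

```python
def one_hot(index, size):
    """Return a one-hot bit string of length `size` with bit `index` set,
    counting from the right-hand end as in Table 2."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    bits = ["0"] * size
    bits[size - 1 - index] = "1"
    return "".join(bits)


# With 4 IDs instead of 6040, the codes follow the same pattern as Table 2.
codes = [one_hot(i, 4) for i in range(4)]
```

For the MovieLens data the vector length would be 6040, so that ID 6039 sets the leftmost bit.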

The binary sparse vectors are then embedded to obtain the final latent vectors of the user and the item, i.e., the embedding vectors.

Suppose that $v_u$ and $v_i$ denote the feature vectors of user u and item i, respectively, and that their embedding vectors are $f_u$ and $h_i$. The embedding is then defined as follows:

$$f_u = F^{T} v_u, \qquad h_i = H^{T} v_i$$

where $F \in \mathbb{R}^{M \times K}$ is the embedding matrix of the user features, $H \in \mathbb{R}^{N \times K}$ is the embedding matrix of the item features, K is the embedding size, M is the number of user features, and N is the number of item features.

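In PyTorch, this can be implemented by the following code:

import torch

# Embedding layers map integer user and item IDs to dense latent vectors
self.embedding_user = torch.nn.Embedding(num_embeddings=self.n_user, embedding_dim=self.dim_latent)
self.embedding_item = torch.nn.Embedding(num_embeddings=self.m_item, embedding_dim=self.dim_latent)
# Look up the embedding vectors of the current batch of users and items
user_embedding = self.embedding_user(user)
item_embedding = self.embedding_item(item)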

S33, constructing an output layer: the user's rating of the item is predicted by taking the result obtained in step S32 as input and passing it through the Sigmoid activation function.

Therefore, for the neural collaborative filtering recommendation method based on the attention mechanism, the final prediction can be expressed as follows:

$$\hat{y}_{ui} = \sigma\left(\varphi^{ANCF}\right)$$

where σ denotes the Sigmoid activation function and φ^ANCF is the output of the ANCF layer constructed in step S32.

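In PyTorch, this can be implemented by the following code:

self.logistic = torch.nn.Sigmoid()
self.affine_out = torch.nn.Linear(in_features=in_feature_size, out_features=out_feature_size)
# Concatenate the attention output and the MLP output along the last dimension
acnf_vector = torch.cat([amf_vector, mlp_vector], dim=-1)
# Apply the fully connected layer, then squash the result into [0, 1]
y_pre = self.logistic(self.affine_out(acnf_vector))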

As shown in FIG. 3, the specific steps of constructing the ANCF layer in step S32 are as follows:

S32-1, constructing an Attention model: using the user and item embedding vectors, inference information about the user and item embeddings is extracted through a local inference mechanism to construct the Attention model. The objective function of the Attention model is defined as follows:

where the product operator denotes element-wise multiplication; P_mul, P_sum and P_avg are the vectors obtained by pairwise multiplication, pairwise addition and pairwise averaging of the input embedding vectors of the Attention model, as described in detail in the following steps; and W₁, W₂ and W₃ denote the weights of the three operation components of the attention mechanism.

S32-2, constructing a user-item interaction MLP model: the user and item embedding vectors are concatenated and fed into a multilayer perceptron, which learns the nonlinear interaction between the user embedding and the item embedding. The objective function is defined as follows:

where W_x, b_x and a_x denote the weight matrix, bias vector and activation function of the x-th perceptron layer, respectively, and the two input vectors are the user and item embedding vectors supplied by the input layer.
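As a non-limiting illustration, the forward pass of the user-item interaction MLP may be sketched in plain Python as follows. The helper names, the representation of a layer as a (weights, bias) pair and the toy weight values are assumptions made for this example; ReLU is used as the activation function of every layer.

```python
def relu(vec):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in vec]


def dense(vec, weights, bias):
    """Affine layer: `weights` holds one row of coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def mlp_forward(user_emb, item_emb, layers):
    """Concatenate the two embeddings and pass them through each layer in turn."""
    vector = user_emb + item_emb  # list concatenation
    for weights, bias in layers:
        vector = relu(dense(vector, weights, bias))
    return vector


# Toy network: 4 inputs -> 2 hidden units -> 1 output unit.
layers = [
    ([[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, 1.0]], [0.0, 0.5]),
    ([[1.0, 1.0]], [0.0]),
]
interaction = mlp_forward([1.0, 2.0], [3.0, 4.0], layers)
```

The resulting interaction vector is what the fusion step later concatenates with the output of the Attention model.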

S32-3, constructing a fusion model: the outputs of the models obtained in steps S32-1 and S32-2 are concatenated into a new vector, which is fed into a fully connected layer; the output of the fully connected layer serves as the input vector of the output layer. The output is thus defined as:

where W is the weight matrix of the fully connected layer, b is the bias vector of the fully connected layer, and the concatenation operator denotes vector concatenation.
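As a non-limiting illustration, the fusion of the two partial outputs followed by the Sigmoid output layer may be sketched in plain Python as follows. The function name `fuse_and_predict`, the single output unit and the toy weights are assumptions made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse_and_predict(att_vec, mlp_vec, weights, bias):
    """Concatenate both vectors, apply one fully connected unit, then Sigmoid."""
    fused = att_vec + mlp_vec  # list concatenation
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return sigmoid(logit)


neutral = fuse_and_predict([1.0], [2.0], [0.0, 0.0], 0.0)
positive = fuse_and_predict([1.0], [2.0], [1.0, 1.0], 0.0)
```

The predicted score always lies between 0 and 1, so it can be compared directly with the binary interaction label during training.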

As shown in FIG. 4, S32-1 includes the following steps:

S32-1-1, the user and item embedding vectors are used as the input of the Attention model.

S32-1-2, after a max-pooling operation is applied to the inputs of the Attention model, the pooled outputs are combined by several operations such as pairwise multiplication, summation and averaging to capture differences. Here P_mul, P_sum and P_avg denote the result vectors obtained by the multiplication, summation and averaging operations, respectively. The expressions are defined as follows:

where the first operator indicates that features are extracted by multiplication, the second indicates that features are extracted by summation, and ☉ indicates that features are extracted by averaging.

In PyTorch, this can be implemented by the following code:

# Define the max-pooling operations
self.maxpooling_user = torch.nn.MaxPool1d(2, stride=2)
self.maxpooling_item = torch.nn.MaxPool1d(2, stride=2)
# Add a dimension so that the embeddings can be pooled
user_embedding = user_embedding.unsqueeze(1)
item_embedding = item_embedding.unsqueeze(1)
user_maxpooling = self.maxpooling_user(user_embedding)
item_maxpooling = self.maxpooling_item(item_embedding)
# Multiply, add and average the pooled (dimension-reduced) vectors element-wise
p_mul = torch.mul(user_maxpooling, item_maxpooling)
p_sum = torch.add(user_maxpooling, item_maxpooling)
p_avg = torch.add(user_maxpooling, item_maxpooling) / 2

S32-1-3, the three results obtained by the combination operations of step S32-1-2 are each multiplied by one of three weight components, and the three weighted results are then multiplied together to obtain the output of the Attention model. The target expression is defined as follows:

where W₁, W₂ and W₃ denote the weights of the three operation components and are continuously updated through learning during model training.
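As a non-limiting illustration, the whole Attention part (max pooling, the three combination operations, and the weighted element-wise product) may be sketched in plain Python as follows. Treating W₁, W₂ and W₃ as scalar weights is an assumption made for this example, as are the function names; in the model the weights are learned during training.

```python
def max_pool_1d(vec, kernel=2, stride=2):
    """Max pooling over a flat list, mirroring a kernel of 2 with stride 2."""
    return [max(vec[i:i + kernel]) for i in range(0, len(vec) - kernel + 1, stride)]


def attention_output(user_emb, item_emb, w1=1.0, w2=1.0, w3=1.0):
    """Combine pooled embeddings by product, sum and average, then fuse them."""
    u = max_pool_1d(user_emb)
    v = max_pool_1d(item_emb)
    p_mul = [a * b for a, b in zip(u, v)]
    p_sum = [a + b for a, b in zip(u, v)]
    p_avg = [(a + b) / 2 for a, b in zip(u, v)]
    # Weight each component, then multiply the three results element-wise.
    return [(w1 * m) * (w2 * s) * (w3 * g) for m, s, g in zip(p_mul, p_sum, p_avg)]


att = attention_output([1.0, 2.0, 0.0, 4.0], [3.0, 1.0, 2.0, 2.0])
```

Pooling halves the length of each embedding, so the output of this part has half the embedding dimension.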

As shown in FIG. 5, the specific steps of constructing the user project interaction MLP model in step S32-2 are as follows:

S32-2-1, the user and item embedding vectors are concatenated and fed into the perceptron layers of a multilayer perceptron (MLP): as before, the binary sparse vectors of the user and the item are fed into the embedding layer to obtain their latent feature vectors, and the user embedding vector and the item embedding vector are then concatenated to obtain the joint user-item feature. The concatenated feature is passed through the hidden layers of the perceptron layer by layer. The model function is set as:

…

where p_u denotes the user embedding vector and q_i denotes the item embedding vector; p_u and q_i are concatenated at the embedding layer. W_x, b_x and a_x denote the weight matrix, bias vector and activation function of the x-th perceptron layer, respectively. The ReLU function is chosen as the activation function.

In PyTorch, this can be implemented by the following code:

# Concatenate the user and item embeddings
vector = torch.cat([user_embedding, item_embedding], dim=-1)
# Input and output sizes of each perceptron layer
self.fc_layers = torch.nn.ModuleList()
config['layers'] = [16, 64, 32, 16, 8]
# Define the perceptron layers from consecutive pairs of sizes
for idx, (in_size, out_size) in enumerate(zip(config['layers'][:-1], config['layers'][1:])):
    self.fc_layers.append(torch.nn.Linear(in_size, out_size))

In PyTorch, this is achieved by the following code:

# The perceptron learns the interaction and outputs the interaction vector
for idx, _ in enumerate(range(len(self.fc_layers))):
    # Apply the linear layer, then the ReLU activation
    vector = self.fc_layers[idx](vector)
    vector = torch.nn.ReLU()(vector)

The specific steps of the model training and result display of the step S4 are as follows:

In PyTorch, this is implemented by the following code:

# Define the binary cross-entropy loss function
self.crit = torch.nn.BCELoss()
# Train on each batch of samples
for batch_id, batch in enumerate(train_loader):
    # User, item and interaction information
    users, items, ratings = batch[0], batch[1], batch[2]
    # Predict the results with the model
    ratings_pre = self.model(users, items)
    # Compute the loss with the cross-entropy function
    loss = self.crit(ratings_pre.view(-1), ratings)

In PyTorch, this can be achieved by the following code:

# Define the Adam optimizer
self.opt = torch.optim.Adam(network.parameters(), lr=lr, weight_decay=weight_decay)
# Optimize the network: clear old gradients, backpropagate the loss, update the parameters
self.opt.zero_grad()
loss.backward()
self.opt.step()

S43: calculating HR@10 and NDCG@10: the HR and NDCG metrics are used to evaluate model performance. HR measures the recall of the recommendation results, i.e., how many of the top N recommended items hit the user's actual preferences; the larger the value, the better. The expression is as follows:

NDCG is likewise often used to evaluate ranking quality because it discriminates well between rankings: when the model outputs a top-N recommendation list, NDCG measures the accuracy of the ordering, and a larger value is better. The expression is as follows:

In summary, the attention mechanism-based neural collaborative filtering recommendation method of the invention comprises the following steps: step S1, data acquisition and processing; step S2, dividing the data set; step S3, constructing a neural collaborative filtering model based on an attention mechanism, which comprises (1) building a feature-fusion attention part using the local inference part of the published ESIM model, (2) building a user-item feature interaction model with a multilayer perceptron, and (3) fusing the two into the attention-based neural collaborative filtering model; and step S4, model training and result display.

The invention provides a neural collaborative filtering recommendation method based on an attention mechanism that takes into account both the way features are extracted from implicit data and the weight assignment of the attention mechanism. From the perspective of implicit feedback, it uses a max-pooling operation and combines several different data fusion modes to obtain more feature values. It models the interaction between the user and the data with a multilayer neural network, which overcomes the limitation of matrix factorization in the rating prediction task, improves the expressiveness of matrix factorization, and makes it possible to mine deep features of users and items. It uses and extends the local inference of ESIM (Enhanced LSTM for Natural Language Inference) to capture differences, introduces an attention mechanism to assign weights, and combines a multilayer perceptron (MLP) to capture multi-layer interaction information between users and items, thereby improving recommendation performance.

While the foregoing is directed to the preferred embodiment of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims

1. A neural collaborative filtering recommendation method based on an attention mechanism is characterized by comprising the following steps:

2. The attention-based neural collaborative filtering recommendation method according to claim 1, wherein the data acquisition and processing of step S1 includes the following specific steps:

S11, retaining only the users of the MovieLens-1M data set who have at least 20 ratings, which yields 6040 users, 3706 items and 1000209 interactions; likewise, retaining only the users of the Pinterest data set who have at least 20 pins, which yields 5187 users, 9916 items and 1500809 interactions.

3. The attention mechanism-based neural collaborative filtering recommendation method according to claim 1, wherein the specific steps of constructing the neural collaborative filtering-based model in the step S3 are as follows:

s31, constructing an input layer: the input layer comprises two parts, a user embedded vector and an item embedded vector;

s32, constructing an ANCF layer: constructing a learning framework of a neural collaborative filtering recommendation method based on an attention mechanism;

4. The attention mechanism-based neural collaborative filtering recommendation method according to claim 3, wherein the step S32 of constructing the ANCF layer comprises the following specific steps:

5. The Attention mechanism-based neural collaborative filtering recommendation method according to claim 4, wherein the step S32-1 specifically constructs the Attention model by the following steps:

6. The attention mechanism-based neural collaborative filtering recommendation method according to claim 4, wherein the step S32-2 is implemented by the following steps:

7. The attention mechanism-based neural collaborative filtering recommendation method according to claim 4, wherein the step S33 of constructing the output layer comprises the following specific steps:

the output vector obtained in step S32-3 is activated with the sigmoid function, which compresses it into the range [0, 1].