CN112016000B

CN112016000B - Movie group recommendation method and system based on convolution collaborative filtering

Info

Publication number: CN112016000B
Application number: CN202010696335.7A
Authority: CN
Inventors: 杨青; 李贺永
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2020-07-20
Filing date: 2020-07-20
Publication date: 2021-08-10
Anticipated expiration: 2040-07-20
Also published as: CN112016000A

Abstract

The invention relates to a movie group recommendation method and system based on convolution collaborative filtering. The recommendation method includes: obtaining user data and commodity content data through an operator to form a user group, and processing the data into a format that the model can recognize; processing the data with a recommendation algorithm based on convolution collaborative filtering to obtain a recommendation list for the user group; and recommending to the relevant user group while collecting the users' feedback data, returning the feedback data to the system, processing it into the corresponding format, recomputing the recommendation list with the convolution-based collaborative filtering algorithm, and continuing to recommend commodities to the user group. The advantage of the invention is that, after linear fusion of the user embedding and item embedding features, the fused embedding vector is sent directly to a single-layer convolutional neural network, which greatly reduces the number of parameters and can effectively improve the hit rate of the commodities recommended by the model.

Description

Movie group recommendation method and system based on convolution collaborative filtering

Technical Field

The invention relates to the technical field of recommendation services, in particular to a movie group recommendation method and system based on convolution collaborative filtering.

Background

As society develops ever faster, people must sift through a large amount of information every day when using the internet. To address this information overload, recommendation systems are widely applied in online information systems such as e-commerce platforms and mobile apps. An efficient recommendation system not only brings traffic and profit to the service provider, but also helps users pick out the goods that interest them most.

Traditional recommendation algorithms leave considerable room for improving system performance. In recent years, various neural networks have been applied to recommendation systems, but most are feed-forward networks such as the multilayer perceptron, which have numerous and complex parameters, so training the model takes a long time. Meanwhile, conventional recommendation systems usually recommend commodities of interest to a single user, which is less efficient than group recommendation.

Disclosure of Invention

The invention aims to overcome the defects of the prior art, provides a movie group recommendation method and system based on convolution collaborative filtering, and solves the defects of the conventional recommendation method.

The purpose of the invention is realized by the following technical scheme: a movie group recommendation method based on convolution collaborative filtering, the recommendation method comprising:

the user data and the commodity content data are obtained by an operator to form a user group and are processed into a format which can be identified by a model;

processing the data by using a recommendation algorithm based on convolution collaborative filtering to obtain a recommendation list of the user group;

recommending to the relevant user groups while acquiring feedback data from the users, returning the feedback data to the system, processing the feedback data into the corresponding format, then recomputing the recommendation list using the convolution-based collaborative filtering algorithm, and continuing to recommend commodities to the user groups.

Further, the data processing based on the convolution collaborative filtering recommendation algorithm comprises a user group embedding aggregation step using an attention neural network, an item group embedding aggregation step, and a feature interaction learning step of convolution collaborative filtering; the user group embedding aggregation step using the attention neural network comprises the following steps:

collecting and processing original data, and cleaning and recombining it to obtain user data and item data;

embedding the user data of the user group using an attention neural network, and learning, through the whole model, the weight with which each user belongs to a specific group;

and performing weighted aggregation of the users' embedded features according to the weights with which the users belong to a specific user group, thereby obtaining the embedded features of the user group.

Further, the feature interactive learning step of the convolution collaborative filtering includes:

fusing the embedded features of the user group and the embedded features of the item group by element-wise product, and concatenating the fused result with the embedded features of the user group and the embedded features of the item group column-wise to obtain the fused features and their data dimension;

sending the fused features of that data dimension into a convolutional neural network for convolution processing;

and inputting the data output by the convolution operation into a fully connected network, continuing to train the model, and continuously improving the accuracy of the model.

Further, the item group embedding aggregation step includes: according to the attribute data and item ID of each item, performing an embedding operation on the high-dimensional sparse data through a neural network embedding algorithm, reducing its dimensionality and converting it into low-dimensional, dense embedded feature vectors.

Further, the method further comprises: at irregular intervals, combining new valid users into a new user group according to rules, and obtaining the recommended commodity information of the new user group through the convolution collaborative filtering recommendation algorithm.

Further, the user data and the commodity content data include: user ID, user age, user gender, user selected preferences, user address information, user browsing information, user purchase information, item ID, and category to which the item belongs.

A movie group recommendation system based on convolutional collaborative filtering, the movie group recommendation system comprising:

a data acquisition module: the system is used for acquiring user data and commodity content data to form a user group and processing the user data and the commodity content data into a format which can be identified by a model;

a data processing module: used for processing the data with a recommendation algorithm based on convolution collaborative filtering to obtain a recommendation list of the user group;

a recommendation module: the system is used for recommending the relevant user groups, simultaneously acquiring feedback data of the users, returning the feedback data to the system, processing the feedback data into a corresponding format, then performing data processing based on a convolution collaborative filtering algorithm to calculate a recommendation list, and continuously recommending commodities to the user groups.

Further, the data processing module comprises: a user group embedding aggregation unit and a feature interaction learning unit;

the user group embedding aggregation unit: embedding the user data of the user group through an attention neural network to obtain user weight, and then performing weighted aggregation on the embedded characteristics of the users to obtain the embedded characteristics of the user group;

the feature interaction learning unit: acquiring the embedded features of the user group and the item group, fusing them, concatenating them column-wise, inputting the concatenated features into a single-channel convolutional neural network for the convolution operation, and finally inputting the data output by the convolution operation into a fully connected network.

The invention has the following beneficial effects: in the movie group recommendation method and system based on convolution collaborative filtering, after linear fusion of the user embedding (or user group embedding) and item embedding features, the fused embedding vector is sent directly to a single-layer convolutional neural network. Because neurons in a convolutional neural network are locally connected and a group of connections shares one weight, the number of parameters is greatly reduced. Meanwhile, the pooling characteristic of the convolutional neural network also learns the useful information in the user embedding (or user group embedding) and item embedding features well, so that the preference score of a user or user group for an item can be obtained more effectively. The hit rate of the commodities recommended by the model can therefore be effectively improved.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of an attention neural network-based user group embedding aggregation structure according to the present invention;

FIG. 3 is a schematic diagram of a feature interactive learning structure based on convolution collaborative filtering according to the present invention;

FIG. 4 is an enlarged view of the calculation details of the mapping from the original embedding (Original Embedding) to the convolutional embedding (CNN Embedding);

FIG. 5 is a schematic diagram of the element-wise product of the original embedding (Original Embedding) and the convolutional embedding (CNN Embedding).

Detailed Description

In order to make the purpose, technical solution and advantages of the embodiments of the present application clearer, the technical solution in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, but not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application. The invention is further described below with reference to the accompanying drawings.

As shown in fig. 1, the present invention relates to a movie group recommendation method based on convolution collaborative filtering, which includes:

S1, acquiring information on newly logged-in users through a big data operator, including the user ID, user age, user gender, user-selected preferences, the user's address (updated at irregular intervals), and the like;

S2, forming a new user group from the user data acquired in step S1 through data processing, and cleaning and computing statistics on the commodity data, which facilitates subsequent model processing and the first targeted recommendation of commodities to the new users;

S3, feeding the user and item data obtained in step S2 into the model. Through data cleaning and recombination, new valid users are combined into new user groups according to rules at irregular intervals. With the convolution collaborative filtering recommendation algorithm, the embedded features of the user group and the item group are obtained first, the features are then concatenated column-wise, the concatenated features are input into a single-channel convolutional neural network, and the convolution output is passed on to a fully connected network. Training the model continuously improves its accuracy. Finally, the trained model is used to obtain the commodity recommendation list of the user group;

S4, acquiring original data on new users and commodities from the operator's big data system, including user login information, user age and gender, the user's recent browsing data, the user's historical purchase data, new and old commodity IDs, and the like. The new click data of this batch of users is used to push further commodities to the users;

S5, acquiring data on the valid user groups' browsing time, number of views, valid operation types, and the like for the recommended commodities. Model learning and commodity recommendation continue for the existing user groups. Meanwhile, new users are obtained through the big data operator, new user groups are formed periodically, and targeted commodity recommendation is carried out.

As shown in fig. 2, the user group aggregation algorithm based on the attention neural network comprises the following steps:

A1, collecting and processing the original data, and cleaning and recombining it to obtain user data and item data. The user data includes the number, ID, and group membership of the users. The item data includes the number and ID of the items, and the like;

A2, based on step A1, performing weighted aggregation on the user data of each user group, with the weights produced by an attention neural network. Specifically, the data of each user is embedded, and the weight with which the user belongs to a given group is learned through the whole model. The weights of the items are learned as well. The learned user weights are obtained from h(i, t), the combination of user and item embeddings propagated through a fully connected layer: applying the softmax activation function to h(i, t) yields the attention weight of each individual user embedding.

A3, applying the weights learned in step A2, with which the users belong to a specific group, to perform weighted aggregation of the users' embedded features, thereby obtaining the embedded features of the group and facilitating the subsequent feature fusion. The group embedded feature aggregated from the user embedded features is denoted g_l(i).

Here, the group embedded feature g_l(i) obtained from the CAMRA2011 data has a dimension of 256 × 32.
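Steps A2 and A3 can be illustrated with a minimal NumPy sketch. The text only states that a fully connected layer combines user and item embeddings into h(i, t) and that softmax turns these scores into attention weights; the exact scoring network below (the matrices `W_u`, `W_v`, the bias `b`, the output vector `w_out`, and the hidden size 16) is an assumed form chosen for illustration, not taken from the patent.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def aggregate_group(user_emb, item_emb, W_u, W_v, b, w_out):
    """Attention-weighted aggregation of member embeddings into one group embedding.

    user_emb: (n_members, d) embeddings of the users in one group
    item_emb: (d,) embedding of the candidate item
    The scoring network is an assumed form (one hidden ReLU layer).
    """
    hidden = np.maximum(user_emb @ W_u + item_emb @ W_v + b, 0.0)  # (n_members, k)
    scores = hidden @ w_out           # plays the role of h(i, t), one score per user
    weights = softmax(scores)         # attention weights, sum to 1 over the group
    group_emb = weights @ user_emb    # weighted aggregation, shape (d,)
    return group_emb, weights

rng = np.random.default_rng(0)
d, k, n_members = 32, 16, 4           # d = 32 as in the text; k and n_members are hypothetical
user_emb = rng.normal(size=(n_members, d))
item_emb = rng.normal(size=d)
W_u = rng.normal(size=(d, k)) * 0.1
W_v = rng.normal(size=(d, k)) * 0.1
b = np.zeros(k)
w_out = rng.normal(size=k)

group_emb, weights = aggregate_group(user_emb, item_emb, W_u, W_v, b, w_out)
print(group_emb.shape, weights)
```

Stacking one such 32-dimensional group embedding per training example over a batch of 256 gives the 256 × 32 matrix mentioned above.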

Further, the embedded features of the item group are obtained as follows: according to the attribute data and item ID of each item, an embedding operation is performed on the high-dimensional sparse data through a neural network embedding algorithm, reducing its dimensionality and converting it into low-dimensional, dense embedded feature vectors.
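As a rough illustration of how an embedding turns high-dimensional sparse data into low-dimensional dense vectors, the sketch below shows that multiplying a one-hot item ID by an embedding table is the same as looking up one row of that table. The vocabulary size of 1000 and the random table `E` are hypothetical; in practice the table is learned with the rest of the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d = 1000, 32                  # hypothetical item count; d = 32 as in the text
E = rng.normal(size=(n_items, d))      # embedding table (random here, learned in practice)

item_ids = np.array([3, 17, 999])

# High-dimensional sparse representation: one row of length n_items per item.
one_hot = np.zeros((len(item_ids), n_items))
one_hot[np.arange(len(item_ids)), item_ids] = 1.0

dense_via_matmul = one_hot @ E         # (3, 32) dense vectors via matrix product
dense_via_lookup = E[item_ids]         # same result via direct row lookup

print(dense_via_lookup.shape)
```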

An item refers to a commodity or article in the recommendation list and covers many types of goods, such as commodities on apps such as Taobao, Jingdong, and Shuduo, or virtual goods such as movies and music.

As shown in fig. 3, the feature interactive learning algorithm based on convolution collaborative filtering is:

B1, fusing the acquired group features and item features by element-wise product, g_l(i)⊙v_j, and concatenating the fused result with the group features and the item features column-wise. The data dimension after concatenation is 256 × 96.

B2, sending the fused features of dimension 256 × 96 obtained in step B1 into a convolutional neural network for convolution processing.

B3, the model's convolution structure specifically comprises an embedding interaction mapping layer, a fully connected layer, and a prediction layer.

First, the convolution operation is performed. Assume the convolution kernel is k × k, the boundary padding is P, the stride is S, and the input dimension is H × W. Then the output height becomes

H′＝⌊(H−k+2P)/S⌋+1

and the output width becomes

W′＝⌊(W−k+2P)/S⌋+1

Let the convolution kernel size k = 3, the padding P = 1, and the stride S = 1. Then the input dimension 256 × 96 remains unchanged after convolution. After convolution, the convolutional embedding and the original embedding are multiplied element-wise:

V＝F(A,E)＝[a₁·e₁,···,a_f·e_f]＝[c₁,···,c_f]

where A denotes the original embedding, E denotes the convolutional embedding, and V denotes the embedding that, through convolution processing, models the importance of the original features.
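The claim that a 3 × 3 kernel with padding 1 and stride 1 leaves the 256 × 96 input size unchanged follows from the standard convolution output-size rule, out = ⌊(n − k + 2P)/S⌋ + 1, applied to each axis. A small check, with the helper name `conv_output_size` chosen here for illustration:

```python
def conv_output_size(n, k, p, s):
    """Output length along one axis for input length n, kernel k, padding p, stride s."""
    return (n - k + 2 * p) // s + 1

h_out = conv_output_size(256, k=3, p=1, s=1)
w_out = conv_output_size(96, k=3, p=1, s=1)
print(h_out, w_out)
```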

Then the fully connected layers are applied:

s＝ReLU(W₂(ReLU(W₁V+b₁))+b₂)

where V is the data obtained from the embedding interaction; a₁ is one column vector of the original embedded features (after the user embedded features and item embedded features are fused; there are f columns in total); e₁ is one column vector of the convolutional embedding, obtained after the original embedded features pass through the convolution layer (again f columns in total); c₁＝a₁·e₁ is the result computed in the vector calculation manner of FIG. 5; b₁ and b₂ are the bias vectors of the forward propagation of the neural network, trained by the optimization algorithm together with W₁ and W₂; and W₁ and W₂ are the mapping parameter matrices.

Finally, a prediction score can be obtained by using the prediction layer.
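The two fully connected layers can be sketched in NumPy as follows. This is only an illustration under assumptions: the weights are random rather than trained, and the code uses the row-vector convention (each of the 256 rows of V is one example, so the product is written V @ W1 rather than W₁V). The shapes 96 × 32 and 32 × 1 are the ones given in the text.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def predict_scores(V, W1, b1, W2, b2):
    """s = ReLU(W2 ReLU(W1 V + b1) + b2), written for a batch of row vectors."""
    hidden = relu(V @ W1 + b1)       # (batch, 32)
    return relu(hidden @ W2 + b2)    # (batch, 1), one score per example

rng = np.random.default_rng(0)
V = rng.normal(size=(256, 96))       # stand-in for the embedding-interaction output
W1 = rng.normal(size=(96, 32)) * 0.1
b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1)) * 0.1
b2 = np.zeros(1)

s = predict_scores(V, W1, b1, W2, b2)
print(s.shape)
```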

The number of parameters can be determined from the shapes of W₁ and W₂. A single-layer convolution adds one convolution kernel with 9 parameters, whereas with W₁ of size 96 × 32 and W₂ of size 32 × 1 the fully connected layers have 3104 (96 × 32 + 32 × 1) parameters. The number of parameters added by the convolution is therefore small.
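The arithmetic behind these counts is short enough to verify directly. The sketch counts weights only, as the text does; bias terms are left out of both figures.

```python
# One single-channel 3 x 3 convolution kernel, no bias term counted.
conv_params = 3 * 3

# Weight matrices W1 (96 x 32) and W2 (32 x 1), biases not counted.
fc_params = 96 * 32 + 32 * 1

print(conv_params, fc_params)
```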

The 256 × 96 data dimension is derived as follows. The batch size of the program is set to 256, that is, each training batch contains the data of 256 users (groups) and 256 items, and the feature dimension of each entity is set to 32. As shown in FIG. 3, the embedded features of the user (group), the item, and the user (group)-item interaction all have the same dimension:

g_l(i), v_j, g_l(i)⊙v_j ∈ R^(256×32)

In the fusion stage, the three groups of embedded features are concatenated by columns, so the number of rows stays unchanged while the columns add up, and the dimension after fusion becomes:

Original Embedding ∈ R^(256×96)
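The fusion step can be reproduced in a few lines of NumPy. The embeddings are random stand-ins, and the column order of the three blocks (group, item, interaction) is an assumption, since the text does not fix it.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.normal(size=(256, 32))        # group (or user) embeddings for one batch
v = rng.normal(size=(256, 32))        # item embeddings for the same batch

interaction = g * v                   # element-wise product, still 256 x 32

# Column-wise concatenation: rows unchanged, columns add up to 96.
original_embedding = np.concatenate([g, v, interaction], axis=1)
print(original_embedding.shape)
```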

As shown in FIG. 4, a 3 × 3 convolution kernel is used and the boundary is padded with zeros: with P = 1 and a convolution stride of 1, the dimension of the embedded features remains unchanged after the convolution. As shown in FIG. 5, the element-wise product is then taken between the original embedding (Original Embedding) and the convolutional embedding (CNN Embedding). The resulting formula is:

V＝F(A,E)＝A⊙E＝[a₁·e₁,···,a_f·e_f]＝[c₁,···,c_f]

the original embedded feature and the convolution embedded feature here are already features after user (group) and item interaction.
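The mapping from original embedding to convolutional embedding, followed by the element-wise product V = A⊙E, can be sketched with plain NumPy. The kernel values are random here (they would be learned), and the function name `conv2d_same_3x3` is made up for this illustration. As in most deep learning frameworks, the "convolution" is computed as a cross-correlation, without flipping the kernel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same_3x3(x, kernel):
    """Single-channel 3 x 3 convolution with zero padding 1 and stride 1."""
    padded = np.pad(x, 1, mode="constant")           # zeros on every border
    windows = sliding_window_view(padded, (3, 3))    # (H, W, 3, 3) local patches
    return np.einsum("hwij,ij->hw", windows, kernel)

rng = np.random.default_rng(0)
A = rng.normal(size=(256, 96))        # original embedding (already fused)
kernel = rng.normal(size=(3, 3))      # the 9 convolution parameters

E = conv2d_same_3x3(A, kernel)        # convolutional embedding, same shape as A
V = A * E                             # E acts as a per-feature weight on A
print(E.shape, V.shape)
```

Because E has the same shape as A, every entry of the original embedding receives its own weight, computed from its 3 × 3 neighbourhood.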

In a specific experiment, the data is divided into 290 user groups. The data comprises 113334 user rating training records, 3010 user rating test records, 143618 group rating training records, and 1450 group rating test records. In addition, 3010 negative-sample records were constructed for the user test data and 1450 for the user group test data, with each test record followed by 100 negative sample items (items not clicked by the group or the user). The negative sample data of users and user groups is used in evaluating the model.
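One common way to turn such a test set (one positive item plus 100 negatives per record) into a hit rate is to rank the positive item among its negatives and count a hit when it lands in the top K. The sketch below assumes this protocol and a cutoff of K = 10; neither the cutoff nor the tie handling is specified in the text, and the scores are toy values.

```python
import numpy as np

def hit_ratio_at_k(pos_scores, neg_scores, k=10):
    """Fraction of test records whose positive item ranks in the top k.

    pos_scores: (n,) predicted score of the positive item of each record
    neg_scores: (n, 100) predicted scores of that record's negative items
    """
    # Rank of the positive item = number of negatives scored strictly higher.
    rank = (neg_scores > pos_scores[:, None]).sum(axis=1)
    return float((rank < k).mean())

# Toy example: the first positive beats all negatives, the second beats none.
pos = np.array([0.9, 0.1])
neg = np.full((2, 100), 0.5)
hr = hit_ratio_at_k(pos, neg, k=10)
print(hr)
```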

The online data is obtained by collecting active user data of a big data platform, and the function of recommending proper commodities to groups or single users can be achieved by processing the data collected regularly in a streaming mode.

With the convolutional neural network, the increase in parameters from adding the convolution is negligible: the convolutional neural network (CNN) adds 9 parameters, compared with the 3104 parameters of the multilayer perceptron (MLP). The convolutional embedding is obtained by convolving the original embedding (Original Embedding) and serves as the feature weight of the original embedding; that is, the convolutional embedding and the original embedding are multiplied element-wise, and this product models the importance of the original embedded features.

The foregoing is illustrative of the preferred embodiments of this invention, and it is to be understood that the invention is not limited to the precise form disclosed herein and that various other combinations, modifications, and environments may be resorted to, falling within the scope of the concept as disclosed herein, either as described above or as apparent to those skilled in the relevant art. And that modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A movie group recommendation method based on convolution collaborative filtering is characterized in that: the recommendation method comprises the following steps:

recommending related user groups, simultaneously acquiring feedback data of the users, returning the feedback data to the system, processing the feedback data into a corresponding format, then performing data processing and calculation on a recommendation list by using a convolution-based collaborative filtering recommendation algorithm, and continuously recommending commodities to the user groups;

the data processing based on the convolution collaborative filtering recommendation algorithm comprises a user group embedding aggregation step using an attention neural network, an item group embedding aggregation step, and a feature interaction learning step of convolution collaborative filtering; the user group embedding aggregation step using the attention neural network comprises the following steps:

carrying out weighted aggregation on the embedded features of the users according to the weighted weights of the users belonging to a certain specific user group, thereby obtaining the embedded features of the user group;

the item group embedding aggregation step comprises: according to the attribute data and item ID of each item, performing an embedding operation on the high-dimensional sparse data through a neural network embedding algorithm, reducing its dimensionality and converting it into low-dimensional, dense embedded feature vectors;

the feature interactive learning step of the convolution collaborative filtering comprises the following steps:

fusing the obtained embedded features of the user group and the embedded features of the item group by element-wise product, and concatenating the fused result with the embedded features of the user group and the embedded features of the item group column-wise to obtain the data dimension; the data dimension after concatenation is 256 × 96;

inputting data output by convolution operation into a full-connection layer network, continuing to train the model and continuously improving the accuracy of the model;

the output width W becomes W′＝⌊(W−k+2P)/S⌋+1;

letting the convolution kernel size k = 3, the padding P = 1, and the stride S = 1, the input dimension 256 × 96 remains unchanged after convolution;

after convolution, the convolutional embedding and the original embedding are multiplied element-wise, V＝F(A,E)＝[a₁·e₁,···,a_f·e_f]＝[c₁,···,c_f], where A denotes the original embedding, E denotes the convolutional embedding, and V denotes the embedding that, through convolution processing, models the importance of the original features;

then the fully connected layers are applied: s＝ReLU(W₂(ReLU(W₁V+b₁))+b₂), where V is the data obtained from the embedding interaction; a₁ is one column vector of the original embedded features; e₁ is one column vector of the convolutional embedding, obtained after the original embedded features pass through the convolution layer; c₁＝a₁·e₁ is the computed result; b₁ and b₂ are the bias vectors of the forward propagation of the neural network, trained by the optimization algorithm together with W₁ and W₂; and W₁ and W₂ are the mapping parameter matrices.

2. The movie group recommendation method based on convolution collaborative filtering as claimed in claim 1, wherein the method further comprises: at irregular intervals, combining new valid users into a new user group according to rules, and obtaining the recommended commodity information of the new user group through the convolution collaborative filtering recommendation algorithm.

3. The movie group recommendation method based on convolution collaborative filtering as claimed in claim 2, wherein: the user data and the commodity content data include: user ID, user age, user gender, user selected preferences, user address information, user browsing information, user purchase information, item ID, and category to which the item belongs.

4. A movie group recommendation system based on convolution collaborative filtering, characterized in that: the movie group recommendation system includes:

a recommendation module: the system is used for recommending related user groups, simultaneously acquiring feedback data of the users, returning the feedback data to the system, processing the feedback data into a corresponding format, then performing data processing on the feedback data by using a convolution-based collaborative filtering algorithm to calculate a recommendation list, and continuously recommending commodities to the user groups;

the data processing module comprises: the system comprises a user group embedding aggregation unit, an item group embedding aggregation unit and a feature interaction learning unit;

the item group embedding aggregation unit: according to the attribute data and item ID of each item, performing an embedding operation on the high-dimensional sparse data through a neural network embedding algorithm, reducing its dimensionality and converting it into low-dimensional, dense embedded feature vectors;

the feature interaction learning unit: acquiring the embedded features of the user group and the item group, fusing them, concatenating them column-wise, inputting the concatenated features into a single-channel convolutional neural network for the convolution operation, and finally inputting the data output by the convolution operation into a fully connected network;

specifically: first, the convolution operation is performed; assuming the convolution kernel is k × k, the boundary padding is P, the stride is S, and the input dimension is H × W, the output height becomes H′＝⌊(H−k+2P)/S⌋+1

and the output width becomes W′＝⌊(W−k+2P)/S⌋+1;

after convolution, the convolutional embedding and the original embedding are multiplied element-wise, V＝F(A,E)＝[a₁·e₁,···,a_f·e_f]＝[c₁,···,c_f], where A denotes the original embedding, E denotes the convolutional embedding, and V denotes the embedding that, through convolution processing, models the importance of the original features;