CN110688585B - Personalized movie recommendation method based on neural network and collaborative filtering - Google Patents

Info

Publication number
CN110688585B
Authority
CN
China
Prior art keywords
matrix
bert
item
movie
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910912752.8A
Other languages
Chinese (zh)
Other versions
CN110688585A (en)
Inventor
杨新武
熊乐歌
王羽钧
董雨萌
杜欣钰
宋霖涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-25
Filing date
2019-09-25
Publication date
2022-04-19
Application filed by Beijing University of Technology
Priority to CN201910912752.8A
Publication of CN110688585A
Application granted
Publication of CN110688585B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a personalized movie recommendation method based on a neural network and collaborative filtering. A Bert neural network extracts features from movie plots to form an item feature matrix, which is linked with Funk-SVD; matrix decomposition then generates a complete U-I (user-item) matrix from which all predicted scores are obtained. First, the Bert neural network extracts the features of each movie plot, yielding a feature matrix for the movie items. Then the feature matrix is connected to the Funk-SVD algorithm and optimized by matrix decomposition and gradient descent to obtain the complete U-I matrix with the minimum error, and finally all predicted scores. On top of the original explicit and implicit feedback, the method adds auxiliary information, namely the movie plot, to obtain a more accurate item feature matrix, so that the minimum error is reduced by 2.40% and the prediction accuracy is improved.

Description

Personalized movie recommendation method based on neural network and collaborative filtering
Technical Field
The invention belongs to the field of artificial-intelligence-based personalized recommendation, and particularly relates to a quick and effective method that extracts features from movie plots with a Bert neural network, forms an item feature matrix that is linked with Funk-SVD, and generates a complete U-I matrix by matrix decomposition to obtain all predicted scores.
Background
Currently, three approaches to implementing recommendation systems are most widely used: content-based recommendation (CB), collaborative filtering recommendation (CF), and hybrid recommendation.
CB: compares candidate items with the items a user previously liked and recommends the best matches. The main problems with this approach are the cold start problem and the reliability of similar users.
CF: collaborative filtering is the most popular algorithm in recommendation systems; it models user-item interaction data to predict a user's preference for an item. Its main obstacle is the sparsity of the user-item interaction data.
Hybrid recommendation: combines several recommendation algorithms so that their weaknesses compensate for one another and the recommendation quality improves. In practice, a combination strategy suited to the specific problem can be chosen. Combining content-based recommendation with collaborative filtering is currently the most widely researched and applied combination.
When computing the item scoring matrix, natural language processing (NLP) is applied to the movie plot in order to extract more accurate feature vectors. Two strategies exist for applying pre-trained word representations to a specific downstream NLP task: feature-based processing (e.g., ELMo) and fine-tuning processing (e.g., the Generative Pre-trained Transformer, OpenAI GPT). Both approaches limit the quality of the pre-trained word vectors, mainly because their standard language models are unidirectional, which restricts the architectures usable in training and reduces the accuracy of feature extraction.
Disclosure of Invention
To address the problems in the prior art, the invention provides a personalized movie recommendation method based on a neural network and collaborative filtering. The overall idea is to use the Bert neural network to extract features from the movie plots and obtain a feature matrix for the movie items, then to connect this feature matrix to the Funk-SVD algorithm and optimize it by matrix decomposition and gradient descent to obtain the complete U-I matrix with the minimum error, and finally to obtain all predicted scores.
To achieve this purpose, the technical scheme adopted by the invention is a personalized recommendation method based on a neural network and collaborative filtering, which comprises the following:
Neural network Bert:
When training its bidirectional language model, Bert replaces a small fraction of words with a mask token or, with small probability, with another random word, forcing the model to rely more on the context. Bert's Transformer uses bidirectional self-attention to extract and encode sentence information in both directions, left to right and right to left. Pre-training on a very large data set and then transferring from the source domain to the target domain effectively improves the representation capability of the model.
The data in the experiment is movie plot text. Because the sentences describing a movie plot are long, the Transformer, unlike an RNN that extracts features in time-step order, can effectively ensure that earlier features do not vanish. Bert comprises two main steps: pre-training and fine-tuning. In pre-training, Bert masks 15% of the input movie plot text, runs the whole sequence through a Transformer encoder, and then predicts only the masked part, so as to achieve a deep bidirectional pre-trained representation.
In this method, Bert first converts the plot text into word vectors to obtain a feature matrix, and the resulting matrix is then used in the CF model. Bert uses the Transformer structure, which consists of several stacked layers, each consisting of an attention layer and a non-linear function applied to each input element. The Transformer iteratively applies syntactic parsing and semantic composition steps to resolve their interdependence, and thereby better generates a vector containing all movie features, i.e., the item feature matrix.
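The masking step described above can be illustrated with a minimal sketch. It is not the patent's implementation: the function name `mask_tokens`, the toy vocabulary, and the 80/10/10 split among masked, random, and unchanged tokens are assumptions (the split follows the published Bert pre-training recipe, which the patent text does not spell out).

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Select about mask_prob of the tokens as prediction targets (hypothetical helper)."""
    if not tokens:
        return [], []
    rng = random.Random(seed)
    n_targets = max(1, round(len(tokens) * mask_prob))
    positions = sorted(rng.sample(range(len(tokens)), n_targets))
    masked = list(tokens)
    for pos in positions:
        roll = rng.random()
        if roll < 0.8:
            masked[pos] = MASK               # most targets become the mask token
        elif roll < 0.9:
            masked[pos] = rng.choice(vocab)  # some become a random word
        # the remaining targets keep their original word but are still predicted
    return masked, positions

plot = "a young wizard discovers his past and confronts a dark lord at school".split()
masked, positions = mask_tokens(plot, vocab=sorted(set(plot)))
print(masked, positions)
```

Only the selected positions would contribute to the pre-training loss; every other token passes through unchanged and serves as context on both sides of the target.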
Collaborative filtering model:
The collaborative filtering model adopted in the proposed Bert-SVD model is Funk-SVD. From the perspective of the relationship between users and items, the first focus is explicit feedback, i.e., data that can be presented directly in numerical form, such as the rating a user gives to an item. In the following formulas, r(ui) represents the predicted score, u represents the overall average of all scoring data, b_u represents the rating bias of a specific user and captures the influence of subjective human factors on ratings, and b_i represents the rating bias of a specific item and captures the rating differences caused by item attributes. The bias terms thus differentiate individual users and items.
r(ui) = u + b_i + b_u    (1)
To compute the bias terms b_i and b_u, the average score n of a specific user (n_u) or item (n_i) is computed first; the bias is then the difference between that average and the overall average u.
b_u = n_u - u    (2)
b_i = n_i - u    (3)
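Equations (1) to (3) can be checked on a small example. The sketch below is illustrative only: the function names, the dictionary layout of the ratings, and the choice to give unseen users or items a zero bias are assumptions, not part of the patent.

```python
def baseline_terms(ratings):
    """ratings: dict mapping (user, item) -> known score."""
    mean = sum(ratings.values()) / len(ratings)  # overall average u
    by_user, by_item = {}, {}
    for (user, item), score in ratings.items():
        by_user.setdefault(user, []).append(score)
        by_item.setdefault(item, []).append(score)
    # equation (2): user average minus overall average
    b_user = {k: sum(v) / len(v) - mean for k, v in by_user.items()}
    # equation (3): item average minus overall average
    b_item = {k: sum(v) / len(v) - mean for k, v in by_item.items()}
    return mean, b_user, b_item

def baseline_predict(mean, b_user, b_item, user, item):
    # equation (1); an unseen user or item contributes zero bias (an assumption)
    return mean + b_item.get(item, 0.0) + b_user.get(user, 0.0)

ratings = {("u1", "m1"): 5, ("u1", "m2"): 3, ("u2", "m1"): 4, ("u2", "m3"): 2}
mean, b_user, b_item = baseline_terms(ratings)
prediction = baseline_predict(mean, b_user, b_item, "u1", "m3")
print(mean, b_user, b_item, prediction)
```

Here user u1 rates above the overall average and movie m3 is rated below it, so the two biases pull the baseline prediction in opposite directions.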
To further increase data utilization, implicit feedback, such as the user's comment, browsing, and purchase records, is added to the calculation. Matrix decomposition produces two K-dimensional matrices P and Q, which describe the implicit features of users and items respectively; using K dimensions reduces the algorithm's dependence on the number of implicit feedback types. With the implicit feedback term added, the prediction no longer depends 100% on the original explicit feedback; predicting from multiple dimensions improves the precision.
r(ui) = u + b_i + b_u + q_i^T p_u    (4)
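As a worked instance of equation (4), the sketch below adds the inner product of an item factor vector and a user factor vector to the baseline. The function name and the numbers are made up for illustration; K is 3 here rather than the 100 used in the experiment.

```python
def predict(mean, b_item, b_user, q_i, p_u):
    """Equation (4): baseline plus the inner product of two K-dimensional vectors."""
    if len(q_i) != len(p_u):
        raise ValueError("item and user factor vectors must share dimension K")
    return mean + b_item + b_user + sum(q * p for q, p in zip(q_i, p_u))

score = predict(mean=3.5, b_item=1.0, b_user=-0.5,
                q_i=[0.5, -1.0, 0.25], p_u=[1.0, 0.5, 2.0])
print(score)
```

The inner product is 0.5 - 0.5 + 0.5 = 0.5, so the latent factors shift the baseline of 4.0 up to 4.5.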
Gradient descent method:
Because the initial values of all elements in the P and Q matrices are set randomly by the system, the elements are updated iteratively by gradient descent until the system converges, reducing the error to reach an optimal solution. e_{ui} represents the error between a predicted score r(ui) and the known score R(ui), and SSE is the sum of squared errors.
e_{ui} = R(ui) - r(ui)    (5)
SSE = Σ_{u,i} e_{ui}^2 = Σ_{u,i} [R(ui) - Σ_{k=1}^{K} p_{uk} q_{ki}]^2    (6)
After solving for the gradient, the final update is given by the following formulas, where η is the learning rate and λ is the regularization parameter, which prevents over-fitting. The values p_{uk} and q_{ki} updated along the gradient are:
p_{uk} = p_{uk} + 2η(e_{ui} q_{ki} - λ p_{uk})    (7)
q_{ki} = q_{ki} + 2η(e_{ui} p_{uk} - λ q_{ki})    (8)
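The update rules (5) to (8) can be sketched as a small training loop. This is a toy under stated assumptions, not the patent's code: the function name, the six-rating data set, the initial value range, and the defaults (K = 2, η = 0.01, 200 passes) are chosen only so the example runs quickly, whereas the experiment in the text uses K = 100, a learning rate of 0.002 and 800 iterations. As in equation (6), the prediction inside the loop uses only the latent-factor term.

```python
import random

def train_funk_svd(ratings, n_users, n_items, k=2, eta=0.01, lam=0.01,
                   iterations=200, seed=0):
    """ratings: dict mapping (user_index, item_index) -> known score R(ui)."""
    rng = random.Random(seed)
    # Random initial values for P (users x K) and Q (items x K); Q[i][f] stores q_ki.
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    sse_history = []
    for _ in range(iterations):
        sse = 0.0
        for (u, i), known in ratings.items():
            predicted = sum(P[u][f] * Q[i][f] for f in range(k))
            e_ui = known - predicted  # equation (5)
            sse += e_ui ** 2          # one term of equation (6)
            for f in range(k):
                p_uk, q_ki = P[u][f], Q[i][f]
                P[u][f] = p_uk + 2 * eta * (e_ui * q_ki - lam * p_uk)  # equation (7)
                Q[i][f] = q_ki + 2 * eta * (e_ui * p_uk - lam * q_ki)  # equation (8)
        sse_history.append(sse)  # SSE accumulated during this pass
    return P, Q, sse_history

toy = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 2, (2, 2): 5}
P, Q, sse_history = train_funk_svd(toy, n_users=3, n_items=3)
print(f"SSE first pass: {sse_history[0]:.3f}, last pass: {sse_history[-1]:.3f}")
```

Both factors are updated from their values before the step, so equations (7) and (8) see the same e_{ui}, p_{uk} and q_{ki}.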
The method is characterized by comprising the following steps:
Step 1: In the experiment, a 943 × 1682 scoring matrix covering 943 users and 1682 items is obtained from MovieLens. Only 100000 ratings are present, so the sparsity of the U-I matrix is 93.7%. In the Bert-SVD model, the plot of each movie item is crawled from IMDB with a Python crawler, and the content features extracted from the plot summary serve as the key item factor vectors of the recommendation model; they influence both the training of the model and the prediction of unknown ratings of CS items.
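The sparsity figure follows directly from the matrix size and the number of ratings, as this short check shows (variable names are illustrative):

```python
n_users, n_items, n_ratings = 943, 1682, 100_000

density = n_ratings / (n_users * n_items)  # fraction of U-I entries that hold a rating
sparsity = 1.0 - density                   # fraction of U-I entries that are empty
print(f"sparsity = {sparsity:.1%}")
```

Of the 1,586,126 possible user-item pairs only about 6.3% carry a rating, which is why the remaining entries have to be predicted.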
Step 2: First, a 943 × 100 user decomposition matrix and a 1682 × 100 item decomposition matrix are initialized with random numbers; the K value of the CF algorithm is set to 100 for this data set. To verify how adding the Bert learning process affects the error, two comparison experiments are carried out: one uses the two randomly initialized matrices directly, and the other replaces the randomly initialized item feature matrix with the item feature matrix that Bert computes from the movie plots. In both cases, training is iterated 800 times so that the system converges to the minimum error, after which the two trained feature matrices can predict the empty entries of the U-I matrix in the data set.
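The two experimental groups differ only in how the item matrix is initialized, which a sketch can make concrete. The helper name `init_factors` and the stand-in feature values are hypothetical; in the method described here the `item_features` argument would hold the 1682 × 100 matrix derived from Bert.

```python
import random

def init_factors(n_users, n_items, k, item_features=None, seed=0):
    """Return (P, Q); Q is random unless precomputed item features are supplied."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(k)] for _ in range(n_users)]
    if item_features is None:
        # control group: both matrices start from random values
        Q = [[rng.random() for _ in range(k)] for _ in range(n_items)]
    else:
        # comparison group: item matrix starts from the supplied features
        if len(item_features) != n_items or any(len(row) != k for row in item_features):
            raise ValueError("item_features must have shape n_items x k")
        Q = [list(row) for row in item_features]
    return P, Q

features = [[0.1] * 4, [0.2] * 4]  # stand-in for Bert output (made-up numbers)
P_rand, Q_rand = init_factors(3, 2, 4)
P_bert, Q_bert = init_factors(3, 2, 4, item_features=features)
print(Q_rand[0], Q_bert[0])
```

Because the seed is fixed, both groups share the same initial user matrix, so any difference in the final error can be attributed to the item initialization.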
Comparing results after SVD is added into the Bert neural network:
When Bert is used to learn the movie plots, the expected output is a 1682 × 100 item matrix, which directly replaces the item feature matrix initialized with MATLAB random values.
The error is calculated using RMSE, the root mean square error, also known as the standard error. For a finite number of measurements i = 1, 2, 3, …, n, the root mean square error Re is given by Re = [Σ_{i=1}^{n} d_i^2 / n]^{1/2}, where n is the number of measurements and d_i is the deviation of the i-th measured value from the true value.
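The RMSE definition translates into a few lines of code. The function name and the sample values below are illustrative only.

```python
import math

def rmse(predicted, actual):
    """Root mean square error over n paired values; d_i = predicted_i - actual_i."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

print(rmse([3.0, 4.0], [0.0, 0.0]))
```

For the two deviations 3 and 4 the result is the square root of (9 + 16) / 2 = 12.5.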
Two experimental hypotheses are tested for the feature matrix output by Bert. In the first, the Bert output is used directly as the item feature matrix without iterative training; the error increases, so this hypothesis fails. In the second, the Bert output only replaces the random initial values of the original item matrix, which is then trained iteratively; the minimum error is reduced by 2.40%. The specific experimental results are as follows:
[Figures BDA0002215210350000061 and BDA0002215210350000062: experimental result images, not reproduced in this text.]
Step 3: After 800 iterations, the two decomposed feature matrices are obtained; they are then used to fill the empty rating entries of the real 943 × 1682 U-I matrix according to equation (4).
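Step 3 can be sketched as a pass over the U-I matrix that leaves known ratings alone and applies equation (4) to the empty entries. The representation of empty entries as `None`, the function name, and the tiny 2 × 2 example are assumptions made for illustration.

```python
def fill_matrix(R, mean, b_user, b_item, P, Q):
    """Return a copy of R where each None entry is replaced by the equation (4) score."""
    filled = []
    for u, row in enumerate(R):
        new_row = []
        for i, known in enumerate(row):
            if known is None:
                dot = sum(q * p for q, p in zip(Q[i], P[u]))  # q_i^T p_u
                new_row.append(mean + b_item[i] + b_user[u] + dot)
            else:
                new_row.append(known)  # existing ratings are kept as they are
        filled.append(new_row)
    return filled

R = [[5, None], [None, 2]]
filled = fill_matrix(R, mean=3.5, b_user=[0.5, -0.5], b_item=[1.0, -1.0],
                     P=[[1.0, 0.0], [0.0, 1.0]], Q=[[0.5, 0.5], [1.0, -1.0]])
print(filled)
```

Sorting each user's filled row by predicted score would then give the recommendation order for items that user has not rated.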
Drawings
FIG. 1 is a schematic diagram of a Bert neural network.
Fig. 2 is a flow chart of the NLP process.
FIG. 3 is a comparison of experimental error for Bert-SVD and Funk-SVD.
Detailed Description
The invention will be further explained with reference to the drawings and examples.
Fig. 3 compares the experimental errors of Bert-SVD and Funk-SVD. The invention is tested on the 100000 ratings generated by 943 users for 1682 items provided by MovieLens, with the implicit feedback dimension K set to 100, the learning rate to 0.002, the regularization parameter to 0.01, and the number of iterations to 800. The minimum error of the original Funk-SVD is 0.129, and the minimum error of the invention is 0.126.

Claims (2)

1. A personalized movie recommendation method based on combining a Bert neural network with the Funk-SVD model of the collaborative filtering algorithm, characterized by comprising the following steps:
step 1, obtaining a group of 943 × 1682 scoring matrixes from MovieLens in an experiment, wherein the scoring matrixes comprise 1682 items and 943 users, only 100000 groups of scoring data are correspondingly generated, and the sparsity of a U-I matrix is 93.7%; in the Bert-SVD model, the movie plot of each item is crawled from the IMDB in a Python crawler mode, and the content features extracted from the plot summary are used as key item factor vectors in the item recommendation model to become key components of the recommendation model;
step 2, firstly generating a 943 x 100 user random feature matrix and an 1682 x 100 item random feature matrix, selecting a K value in a CF algorithm as 100 in the data set, carrying out two groups of comparison experiments in order to verify the influence of the learning process added with bert on errors, wherein one group of comparison experiments is to directly use two random feature matrices, the other group of comparison experiments is to replace the item feature matrix generated by the original random value with the item feature matrix calculated by the bert through a movie plot, and the subsequent operation result is iterated for 800 times to make the system converge to obtain the minimum value of the errors, so that the two feature matrices obtained after training can predict the vacant items of the U-I matrix in the data set;
taking the output result of the Bert neural network as the input of the Funk-SVD:
learning the movie plot by using Bert, wherein an expected output result is an item matrix of 1682 x 100, and the item feature matrix generated by iteration of matlab random values is directly replaced;
the error is calculated using RMSE, the root mean square error, also known as the standard error; for a finite number of measurements i = 1, 2, 3, …, n, the root mean square error Re is given by Re = [Σ_{i=1}^{n} d_i^2 / n]^{1/2}, wherein n is the number of measurements and d_i is the deviation of the i-th measured value from the true value;
step 3, obtaining the two decomposed feature matrices through 800 iterations, and then using the two matrices to fill the empty rating entries of the real 943 x 1682 U-I matrix according to formula (4);
in the collaborative filtering model, the collaborative filtering model adopted in the put-forward Bert-SVD model is Funk-SVD, and from the perspective of the relationship between users and items, the first focus is on explicit feedback, namely data which can be directly presented in a digital form; in the following formula, r (ui) represents a predicted score value, u represents the overall average value of all score data, bu represents the score bias of a specific user, the influence of human subjective factors on the score in reality is restored, bi represents the score bias generated by a specific item, and the influence of different scores caused by item attributes in reality is restored, so that specific differentiation is realized through the difference of bias items;
r(ui) = u + b_i + b_u    (1)
for the calculation of the bias terms bi and bu, firstly, the average value n of scores generated by a specific user or project is solved, and then the bias is obtained through the difference value between the average value n and the total average value u;
b_u = n_u - u    (2)
b_i = n_i - u    (3)
in order to further increase the utilization rate of the data, the calculation of implicit feedback is added; decomposing to form two K-dimensional matrixes P and Q by means of a matrix decomposition technology, wherein the two K-dimensional matrixes P and Q are respectively used for describing the implicit characteristics of users and items, and the requirement on the number of implicit feedback types in the algorithm is reduced by K;
r(ui) = u + b_i + b_u + q_i^T p_u    (4)
in the gradient descent method, because the initial values of all elements in the P and Q matrixes are set randomly by the system, the values of all elements in the matrixes are updated iteratively by the gradient descent method until the system converges, the error is reduced to obtain the optimal solution, eui represents the error between a certain predicted score and a known score R (ui), and SSE is the sum of square errors;
e_{ui} = R(ui) - r(ui)    (5)
SSE = Σ_{u,i} e_{ui}^2 = Σ_{u,i} [R(ui) - Σ_{k=1}^{K} p_{uk} q_{ki}]^2    (6)
after gradient solution, the final result is represented by the following formula, where η is the learning rate, λ is the regularization parameter, excessive convergence is avoided, and puk and qki after updating according to the gradient are:
p_{uk} = p_{uk} + 2η(e_{ui} q_{ki} - λ p_{uk})    (7)
q_{ki} = q_{ki} + 2η(e_{ui} p_{uk} - λ q_{ki})    (8).
2. The personalized movie recommendation method based on combining a Bert neural network with the Funk-SVD model of the collaborative filtering algorithm as claimed in claim 1, wherein: in the neural network Bert, the data in the experiment is movie plot text, and because the sentences describing a movie plot are long, the Transformer, compared with an RNN that extracts features in time-step order, can effectively ensure that earlier features do not vanish; Bert comprises two steps: pre-training and fine-tuning; in the pre-training process, Bert masks 15% of the input movie plot text, runs the whole sequence through a Transformer encoder, and then predicts only the masked part, so as to achieve a deep bidirectional pre-trained representation; Bert first converts the plot text into word vectors to obtain a feature matrix, and the obtained matrix is then used in the CF model; Bert uses the Transformer structure, which consists of several stacked layers, each layer consisting of an attention layer and a non-linear function applied to each input element; the Transformer iteratively applies syntactic parsing and semantic composition steps to resolve their interdependence, thereby better generating a vector containing all movie features, i.e., the item feature matrix.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910912752.8A CN110688585B (en) 2019-09-25 2019-09-25 Personalized movie recommendation method based on neural network and collaborative filtering

Publications (2)

Publication Number Publication Date
CN110688585A CN110688585A (en) 2020-01-14
CN110688585B true CN110688585B (en) 2022-04-19

Family

ID=69110223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910912752.8A Active CN110688585B (en) 2019-09-25 2019-09-25 Personalized movie recommendation method based on neural network and collaborative filtering

Country Status (1)

Country Link
CN (1) CN110688585B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11995036B2 (en) * 2019-10-11 2024-05-28 Ikigai Labs Inc. Automated customized modeling of datasets with intuitive user interfaces
CN112364064A (en) * 2020-08-27 2021-02-12 南京信息职业技术学院 Movie recommendation system algorithm for improving prediction accuracy by using dynamic deviation value
CN113077313B (en) * 2021-04-13 2022-09-13 合肥工业大学 Complementary product recommendation method fusing user generated scene image and personalized preference
CN116664253B (en) * 2023-07-28 2023-10-24 江西财经大学 Project recommendation method based on generalized matrix decomposition and attention shielding

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064249A (en) * 2018-06-28 2018-12-21 中山大学 A kind of clothes recommendation optimization method and its system based on feature personalization modification
CN109918574A (en) * 2019-03-28 2019-06-21 北京卡路里信息技术有限公司 Item recommendation method, device, equipment and storage medium
CN110147452A (en) * 2019-05-17 2019-08-20 北京理工大学 A kind of coarseness sentiment analysis method based on level BERT neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Integrating Spectral-CF and FP-Growth for Recommendation; HuaXin Zhang et al.; EBIMCS '19: Proceedings of the 2019 2nd International Conference on E-Business, Information Management and Computer Science; 2019-08-30; pp. 1-7 *
SVD-based group recommendation approaches: an experimental study of Moviepilot; Xun Hu et al.; CAMRa '11: Proceedings of the 2nd Challenge on Context-Aware Movie Recommendation; 2011-10-30; pp. 1286-1290 *
Collaborative filtering recommendation algorithm based on ASVD; 李春春 et al.; 《小型微型计算机系统》; 2018-06-30; pp. 1286-1290 *

Also Published As

Publication number Publication date
CN110688585A (en) 2020-01-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant