CN111259133A - Personalized recommendation method integrating multiple information - Google Patents

Info

Publication number
CN111259133A
Authority
CN
China
Prior art keywords
user
project
algorithm
item
adopting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010054209.1A
Other languages
Chinese (zh)
Other versions
CN111259133B (en)
Inventor
乔少杰
韩楠
沈杰
宋学江
程维杰
魏军林
张小辉
丁超
肖月强
陈文林
李斌勇
张吉烈
张永清
何林波
元昌安
彭京
周凯
余华
范勇强
冉先进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Tianma Technology Co Ltd
Sichuan Jinkecheng Geographic Information Technology Co ltd
Chengdu University of Information Technology
Original Assignee
Chengdu Tianma Technology Co Ltd
Sichuan Jinkecheng Geographic Information Technology Co ltd
Chengdu University of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Tianma Technology Co Ltd, Sichuan Jinkecheng Geographic Information Technology Co ltd, Chengdu University of Information Technology
Priority to CN202010054209.1A
Publication of CN111259133A
Application granted
Publication of CN111259133B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a personalized recommendation method fusing multiple information, which comprises: obtaining the similarity between a user and an item by using the word2vec algorithm and the FM algorithm; obtaining the predicted click probability between the user and the item by using the RippleNet algorithm; obtaining a predicted score by using a dynamic fusion algorithm; and providing a personalized recommendation list for the user based on the predicted score. The invention uses the knowledge graph and the comment content as multi-source data, processes them with different algorithms, and combines the results with a dynamic fusion method, thereby providing users with a more accurate personalized recommendation service and mitigating the loss of recommendation accuracy caused by sparse data.

Description

Personalized recommendation method integrating multiple information
Technical Field
The invention belongs to the technical field of recommendation systems, and particularly relates to a personalized recommendation method fusing multiple information.
Background
With the rapid development of technologies such as artificial intelligence, cloud computing, big data, and the mobile internet, the volume of information data has grown explosively. While enjoying the convenience this data brings, users must also deal with the "information overload" caused by its excessive volume. A recommendation system is one effective way to address information overload: it discovers a user's points of interest from the attributes of the user and the items, and recommends the items the user is likely to be interested in as a personalized list.
Currently, recommendation systems based on collaborative filtering have achieved some success by taking into account users' historical interactions with items and then making recommendations based on their underlying characteristics. However, such systems typically suffer from the sparsity of historical interaction data between users and merchants and from the accompanying cold start problem. To address these limitations, researchers have incorporated auxiliary information, such as user/item attributes, social networks, images, and context, into collaborative filtering based recommendation systems.
Among the various kinds of auxiliary information, the Knowledge Graph (KG) has attracted wide attention from researchers because of its efficient description of facts and the interpretable associations it captures between items. A knowledge graph is a directed heterogeneous graph in which nodes correspond to entities and edges correspond to relations. Researchers have proposed a number of knowledge graphs, such as NELL and DBpedia, as well as commercial knowledge graphs such as Google Knowledge Graph and Microsoft Satori. These knowledge graphs have been applied successfully in a number of areas, such as knowledge graph completion, question answering, word embedding, and text classification.
Deep learning is a research hotspot of the current internet and artificial intelligence. The deep learning mainly generates high-level semantic abstraction from low-level attribute features, automatically digs out distributed feature representation of data, solves the problem that features need to be designed manually in the traditional machine learning, and makes great progress in the fields of image recognition, machine translation and the like. The deep learning based recommendation system has recently attracted much attention, and uses data related to users and commodity items as input, obtains hidden representations of the users and the items with corresponding attribute characteristics through a deep learning model, and recommends the items for the users based on the hidden representations.
Knowledge graphs are widely used in various fields, and researchers have tried to use them to improve the performance of recommendation systems. Existing knowledge-graph-based recommendation systems fall into two categories:
(1) Embedding-based methods. Methods of this type preprocess the KG with a Knowledge Graph Embedding (KGE) algorithm and incorporate the learned entity embeddings into the recommendation framework. Embedding-based methods use the KG to assist the recommendation system flexibly, but the KGE algorithms they adopt are better suited to link prediction than to recommendation.
(2) Path-based methods. Methods of this type explore the connection patterns between entities in the KG as additional auxiliary information for the recommendation system. Path-based methods use the KG in a more intuitive manner, but they depend heavily on manually designed meta-paths, so their generality cannot be guaranteed and different meta-paths must be designed for different application scenarios. Furthermore, in certain scenarios (e.g., news recommendation) where entities and relations do not belong to a single domain, meta-paths cannot be designed manually.
Earlier work applied graph embedding techniques to the recommendation field: the movies and user information in MovieLens were embedded into the same vector space, the spatial distance between users and movies was then calculated, and a recommendation list was generated. Wang et al. embedded a medical knowledge graph, a disease-patient bipartite graph, and a disease-drug bipartite graph into low-dimensional vector spaces, respectively, in order to recommend safer drug therapy for patients. Combining the knowledge graph with the bipartite graphs by weighted averaging yields patient and drug vectors that contain finer-grained attribute information, and finally produces a top-k list of drugs for a given patient.
Ostuni et al. fused the implicit semantic feedback information in KG paths and proposed SPrank, a path-based algorithm built on implicit semantic feedback. The data set is mined on the basis of path features to capture complex relationships between items. The main idea of SPrank is to explore paths in the semantic graph in order to find items related to the items the user is interested in. Path-based features are extracted by analyzing these paths, and the recommendation result is generated by a learning algorithm that combines random forests and gradient boosted regression trees.
Disclosure of Invention
In order to more effectively fuse various data information, solve the problem of data sparseness and improve the accuracy of a recommendation system, the invention provides a personalized recommendation method fusing multiple information.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
A personalized recommendation method fusing multiple information comprises the following steps:
S1. Acquire a user-item comment data set, obtain the feature word vectors of the user and the item respectively by using the word2vec algorithm, and obtain the similarity between the user and the item by using the FM algorithm;
S2. Construct an interaction matrix of the user and the item according to the historical click information of the user, and obtain the predicted click probability of the user and the item by using the RippleNet algorithm in combination with a knowledge graph;
S3. Dynamically fuse the similarity between the user and the item obtained in step S1 and the predicted click probability between the user and the item obtained in step S2 by using a dynamic fusion algorithm to obtain a predicted score, and provide a personalized recommendation list for the user based on the predicted score.
Further, the step S1 specifically includes the following sub-steps:
S1-1. Obtain all user-item comment information in the database, merge the comments of a user on all items into text data representing the user information, and merge the comments that an item has received from all users into the text data of the item;
S1-2. Vectorize the text data of the user information and the text data of the item obtained in step S1-1 with the word2vec algorithm to obtain the feature word vectors of the user and the item;
S1-3. Combine the feature word vectors of the user and the item obtained in step S1-2 pairwise by using the FM algorithm, and add cross-term features to obtain the similarity between the user and the item.
Further, in step S1-3, the model of the FM algorithm is represented as:
Figure BDA0002372249760000041
wherein $m_0$ represents the global bias term, $m$ is the coefficient vector of the feature vector $z_{uv}$ of user $u$ and item $v$, $M$ is the weight matrix of second-order interactions, $M_{j,c}$ is the value in row $j$ and column $c$ of $M$, and $i_j$ and $i_c$ are the hidden vectors associated with the $j$-th and $c$-th dimensions of $z_{uv}$.
Further, the step S1 takes the square loss as the objective function of the parameter optimization, and is expressed as:
Figure BDA0002372249760000042
where $O$ represents the set of observed user-item score pairs, $y_{u,v}$ represents the interaction history of user $u$ with item $v$, $\Theta$ represents all parameters, and $\lambda_\Theta$ is the L2 regularization parameter.
Further, the step S2 specifically includes the following sub-steps:
S2-1. Set the user set and the item set to $U=\{u_1,u_2,\dots,u_m\}$ and $V=\{v_1,v_2,\dots,v_n\}$, respectively, and construct the interaction matrix of the user and the item, expressed as:
$Y=\{y_{uv} \mid u\in U, v\in V\}$
wherein $y_{uv}$ represents the interaction history of user $u$ and item $v$, $m$ represents the number of users, and $n$ represents the number of items;
S2-2. According to the interaction matrix of the user and the item and the knowledge graph containing relation-entity triples, define the $k$-th associated entities of user $u$ as:
Figure BDA0002372249760000051
wherein $(h, r, t)$ represents a relation-entity triple contained in the knowledge graph, $h$ represents the head entity, $r$ represents the relation, $t$ represents the tail entity, and $H$ represents the farthest hop associated with the origin item;
the $k$-hop ripple set of user $u$ on the knowledge graph $G$ is defined as:
Figure BDA0002372249760000052
S2-3. Create a $d$-dimensional embedding vector $v$ for each item $v$; the correlation coefficient between each triple $(h_i, r_i, t_i)$ of the first-hop ripple set of user $u$ and $v$ is:
Figure BDA0002372249760000053
wherein $R_i$ represents the embedding of relation $r_i$, and $h_i$ represents the embedding of head entity $h_i$;
S2-4. According to the correlation coefficients, compute the weighted sum of the tail entities $t_i$ of the first-hop ripple set of user $u$ to obtain the multi-order responses of user $u$ to item $v$:
Figure BDA0002372249760000054
According to the multi-order responses of user $u$ to item $v$, the embedding vector of user $u$ with respect to item $v$ is defined as:
Figure BDA0002372249760000055
wherein $\alpha_i$ is a positive mixing parameter;
S2-5. Obtain the predicted click probability of the user and the item from the embedding vector of user $u$ with respect to item $v$, expressed as:
Figure BDA0002372249760000056
wherein $z_{KG}$ denotes the recommendation score based on the knowledge-graph data.
Further, the loss function of the RippleNet algorithm in step S2 is expressed as:
$$\Gamma = \sum_{(u,v)\in Y} -\left( y_{uv}\log\sigma(u^{\mathrm{T}}v) + (1-y_{uv})\log\left(1-\sigma(u^{\mathrm{T}}v)\right) \right)$$
Further, in step S3, the similarity between the user and the item and the predicted click probability between the user and the item are dynamically fused by using a dynamic fusion algorithm to obtain a predicted score, which is expressed as:
Figure BDA0002372249760000061
wherein,
Figure BDA0002372249760000062
$z_{review}$ denotes the recommendation score based on the text comment data.
The invention has the following beneficial effects: according to the invention, the knowledge graph and the comment content are used as multi-source data, different algorithms are used for processing the data, and a dynamic fusion method is adopted for effective combination, so that more accurate personalized recommendation service is provided for users, a better recommendation effect can be realized, and the problem of reduced recommendation accuracy caused by sparse data can be effectively solved.
Drawings
FIG. 1 is a flow chart of a personalized recommendation method fusing multiple information according to the present invention;
FIG. 2 is a schematic diagram of the ripple set structure in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the REME model structure in the embodiment of the present invention;
FIG. 4 is a graph showing a comparison of recall ratios of different models of a data set AZ according to an embodiment of the present invention;
FIG. 5 is a graph illustrating the recall ratio comparison between different models of the data set SC according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, an embodiment of the present invention provides a personalized recommendation method fusing multiple pieces of information, including the following steps S1 to S3:
S1. Acquire a user-item comment data set, obtain the feature word vectors of the user and the item respectively by using the word2vec algorithm, and obtain the similarity between the user and the item by using the FM algorithm.
In this embodiment, most current social media websites and e-commerce systems allow users to post text comments. The text contains rich information from which the potential interest points of the user can be discovered, so the method applies the user comment text to the recommendation system in order to improve its accuracy.
The invention applies a Word2vec model based on deep learning to a recommendation system, wherein the Word2vec model is a Word embedding model based on Skip-gram or CBOW (Continuous Bag-of-Words). Under the condition of no part-of-speech tagging, word2vec can be used for learning vector representation of words from original linguistic data, and semantic and syntactic similarities among the words are compared.
The method utilizes word2vec to process the text, synthesizes comments of a user on all merchants into text data representing user information, similarly integrates comments of all users received by a merchant into the text data of the merchant, extracts potential text features of the user and the project, matches the potential text features according to the features, and finally carries out reasonable recommendation.
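The aggregation step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_review_documents` and the `(user_id, item_id, text)` tuple layout are assumptions made for the example.

```python
from collections import defaultdict


def build_review_documents(reviews):
    """Group review texts into one document per user and one per item.

    `reviews` is an iterable of (user_id, item_id, text) tuples; this
    layout is an assumption of the sketch, not something the source fixes.
    """
    user_docs = defaultdict(list)
    item_docs = defaultdict(list)
    for user_id, item_id, text in reviews:
        user_docs[user_id].append(text)  # everything this user wrote
        item_docs[item_id].append(text)  # everything written about this item
    users = {u: " ".join(texts) for u, texts in user_docs.items()}
    items = {v: " ".join(texts) for v, texts in item_docs.items()}
    return users, items
```

The two dictionaries would then be passed to a word embedding model to obtain the user and item feature vectors.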
The step S1 specifically includes the following sub-steps:
S1-1. Obtain all user-item comment information in the database, merge the comments of a user on all items into text data representing the user information, and merge the comments that an item has received from all users into the text data of the item;
S1-2. Vectorize the text data of the user information and the text data of the item obtained in step S1-1 with the word2vec algorithm to obtain the feature word vectors of the user and the item;
Word2vec can be regarded as a neural network that trains each word of a natural language into a word vector through a three-layer structure. It thereby addresses two problems of the traditional bag-of-words (BOW) model, namely that BOW cannot represent the contextual semantics of text and that it suffers from the curse of dimensionality, and it gives semantically similar words similar vector representations.
The word2vec algorithm adopts the CBOW prediction model trained with hierarchical softmax (HS). CBOW predicts the posterior probability of a center word from its known context words, and the model structure is as follows:
1) Input layer: the word vectors of Context(w).
2) Projection layer: sums the 2c word vectors of Context(w) from the input layer.
3) Output layer: outputs the center word vector.
The training objective of CBOW is:
$\max \Phi = \sum_{w \in C} \log p(w \mid \mathrm{Context}(w))$
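To make one term of the objective concrete, the sketch below evaluates log p(w | Context(w)) for a toy vocabulary. It deliberately swaps in a full softmax over the vocabulary in place of the hierarchical softmax named in the text, purely to keep the example short. The function name and the two embedding tables `in_vecs` and `out_vecs` are assumptions of the sketch.

```python
import math


def cbow_log_prob(center, context, in_vecs, out_vecs):
    """Return log p(center | context) using a full softmax (illustrative only)."""
    dim = len(next(iter(in_vecs.values())))
    # Projection layer: sum the input vectors of the context words.
    proj = [sum(in_vecs[w][k] for w in context) for k in range(dim)]
    # Output layer: score every vocabulary word against the projection.
    scores = {w: sum(p * q for p, q in zip(proj, vec)) for w, vec in out_vecs.items()}
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(scores.values())
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return scores[center] - log_norm
```

Summing this quantity over every word of the corpus gives the value that training maximizes.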
S1-3. Combine the feature word vectors of the user and the item obtained in step S1-2 pairwise by using the FM algorithm, and add cross-term features to obtain the similarity between the user and the item.
The invention first sets the input of the FM algorithm by constructing the feature vectors of the user and the item with word2vec:
$t_u = \mathrm{word2vec}(T_u)$
$t_v = \mathrm{word2vec}(T_v)$
wherein $T_u$ and $T_v$ represent the comments of user $u$ and item $v$, respectively, and $t_u$ and $t_v$ are the corresponding user and item feature vectors.
The feature word vectors of the user and the item are combined pairwise, expressed as:
$z_{uv} = t_u \odot t_v$
where $\odot$ denotes the element-wise product of two vectors, and $z_{uv}$ is the vector of correlation coefficients between $u$ and $v$.
The invention adopts FM algorithm to combine the feature word vectors of the user and the project pairwise, and adds cross item features, thereby obviously improving the accuracy of the model.
The model for the FM algorithm is represented as:
Figure BDA0002372249760000091
wherein $m_0$ represents the global bias term, $m$ is the coefficient vector of the feature vector $z_{uv}$ of user $u$ and item $v$, $M$ is the weight matrix of second-order interactions, $M_{j,c}$ is the value in row $j$ and column $c$ of $M$, and $i_j$ and $i_c$ are the hidden vectors associated with the $j$-th and $c$-th dimensions of $z_{uv}$.
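The equation image for the FM model is not reproduced in this text, so the sketch below assumes the standard second-order factorization machine, in which each second-order weight is the inner product of two hidden vectors. The function names and the parameter `hidden` (one hidden vector per dimension of the input) are assumptions of the example.

```python
def hadamard(t_u, t_v):
    """Element-wise product of the user and item feature vectors."""
    return [a * b for a, b in zip(t_u, t_v)]


def fm_score(z, m0, m, hidden):
    """m0 + sum_j m[j]*z[j] + sum_{j<c} <hidden[j], hidden[c]> * z[j] * z[c].

    Assumes the standard FM factorization M[j][c] = <hidden[j], hidden[c]>.
    """
    linear = sum(mj * zj for mj, zj in zip(m, z))
    pairwise = 0.0
    for j in range(len(z)):
        for c in range(j + 1, len(z)):
            m_jc = sum(a * b for a, b in zip(hidden[j], hidden[c]))
            pairwise += m_jc * z[j] * z[c]
    return m0 + linear + pairwise
```

For example, `fm_score(hadamard(t_u, t_v), m0, m, hidden)` scores one user-item pair; the double loop is quadratic in the vector length, which is acceptable for a sketch but is usually rewritten in linear time in practice.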
Finally, the quadratic loss is taken as an objective function for parameter optimization, expressed as:
Figure BDA0002372249760000092
wherein $O$ represents the set of observed user-item score pairs, $y_{u,v}$ represents the interaction history of user $u$ with item $v$, $\Theta$ represents all parameters, and $\lambda_\Theta$ is the L2 regularization parameter; the second term, $\lambda_\Theta \|\Theta\|^2$, prevents the model from overfitting.
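A small numeric sketch of such an objective follows. The exact equation is an image that is not reproduced here, so the form below (sum of squared errors over the observed pairs plus an L2 penalty) is an assumption consistent with the definitions above; the function name and argument layout are illustrative.

```python
def squared_loss(predictions, targets, params, lam):
    """Sum of squared errors over observed pairs plus lam * ||params||^2."""
    data_term = sum((p - y) ** 2 for p, y in zip(predictions, targets))
    reg_term = lam * sum(theta ** 2 for theta in params)
    return data_term + reg_term
```

Here `params` stands for all trainable parameters flattened into one list, and `lam` plays the role of the L2 regularization parameter.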
S2. Construct an interaction matrix of the user and the item according to the historical click information of the user, and obtain the predicted click probability of the user and the item by using the RippleNet algorithm in combination with a knowledge graph.
In this embodiment, the existing RippleNet algorithm uses only a knowledge graph built from users' historical click records and structured knowledge, and does not consider user and item comment data, which contain rich knowledge. The invention therefore extracts the hidden features of users and merchants with word2vec, processes them with a Factorization Machine (FM) algorithm, and then calculates the user's click probability. The value obtained by the RippleNet algorithm and the value obtained by word2vec + FM are combined through a dynamic parameter to obtain the final click-through rate prediction.
The step S2 specifically includes the following sub-steps:
S2-1. Set the user set and the item set to $U=\{u_1,u_2,\dots,u_m\}$ and $V=\{v_1,v_2,\dots,v_n\}$, respectively, and construct the interaction matrix of the user and the item, expressed as:
$Y=\{y_{uv} \mid u\in U, v\in V\}$
wherein $y_{uv}$ represents the interaction history of user $u$ and item $v$, $m$ represents the number of users, and $n$ represents the number of items; $y_{uv} = 1$ indicates that there is a historical interaction between user $u$ and item $v$, that is, user $u$ has clicked to view item $v$.
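As an illustration of step S2-1, the sketch below builds a dense 0/1 interaction matrix from a list of click records. The function name and the `(user, item)` pair format are assumptions; a real system would likely use a sparse representation.

```python
def build_interaction_matrix(users, items, clicks):
    """Return Y as a list of rows, with Y[a][b] = 1 if users[a] clicked items[b]."""
    clicked = set(clicks)  # set of (user, item) pairs for O(1) lookup
    return [[1 if (u, v) in clicked else 0 for v in items] for u in users]
```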
S2-2, according to the interaction matrix of the user and the project and the knowledge graph containing the relation-entity triple, defining the kth associated entity of the user u as:
Figure BDA0002372249760000101
wherein $(h, r, t)$ represents a relation-entity triple contained in the knowledge graph, $h$ represents the head entity, $r$ represents the relation, and $t$ represents the tail entity; $h \in E$, $r \in R$, $t \in E$, where $E$ and $R$ are the entity set and the relation set of the knowledge graph $G$, respectively; and $H$ represents the farthest hop from the origin items, which is set experimentally.
The objective of the RippleNet algorithm is to obtain the click prediction scores of a user u and an undetermined item v under the condition of the existing interaction matrix Y and knowledge graph G. Namely, the user u and the item v are used as input, and the probability that the user u can click the item v is output.
The $k$-hop ripple set of user $u$ on the knowledge graph $G$ is defined as:
Figure BDA0002372249760000102
wherein $\mathcal{E}_u^0 = \{v \mid y_{uv} = 1\}$ is the set of items that user $u$ has clicked, i.e., the seed set of user $u$ in $G$; the superscript 0 indicates the seed nodes.
The meaning of "ripple" includes:
1) If the historical clicks of the user are regarded as individual water drops, they form multiple ripples on the water surface of the knowledge graph, and the propagation of the ripples can be used to represent the paths along which the potential interests of the user propagate.
2) The degree of potential interest of the user decreases as $k$ increases, i.e., the farther the propagation distance, the less similar an entity is to the initial items.
The "ripple set" is shown in FIG. 2: triangles represent the "seed set" that the user initially clicked on, squares represent the first Hop ripple set (Hop1) directly connected to the seed set, filled circles represent the second Hop ripple set (Hop2), and so on.
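The hop-by-hop expansion can be sketched as below. The defining equations are images that are not reproduced in this text, so the sketch assumes the definition used in the published RippleNet work: the k-hop ripple set contains the triples whose head entity was reached at hop k-1, and the tails of those triples form the next entity set. The function name and the list-of-tuples graph format are illustrative.

```python
def ripple_sets(seed_items, triples, hops):
    """Return [S^1, ..., S^H] for one user, given the user's clicked items.

    `triples` is a list of (head, relation, tail) tuples; this flat layout
    is an assumption of the sketch.
    """
    frontier = set(seed_items)  # hop 0: the seed set of clicked items
    result = []
    for _ in range(hops):
        hop = [(h, r, t) for (h, r, t) in triples if h in frontier]
        result.append(hop)
        frontier = {t for (_, _, t) in hop}  # tails become the next heads
    return result
```

Scanning the full triple list at every hop keeps the example short; indexing triples by head entity would avoid the repeated scan.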
S2-3. Create a $d$-dimensional embedding vector $v$ for each item $v$, where the item embedding represents the item through feature information such as its one-hot ID, distribution, or bag of words. Given the item embedding $v$ and the first-hop ripple set $S_u^1$ of user $u$, the correlation coefficient between each triple $(h_i, r_i, t_i)$ in $S_u^1$ and $v$ is:
Figure BDA0002372249760000111
wherein $R_i$ is the embedding of relation $r_i$, a $d \times d$ matrix; $h_i$ is the embedding of head entity $h_i$, a $d$-dimensional vector; and the correlation coefficient $p_i$ represents the degree of similarity between item $v$ and head entity $h_i$ under relation $R_i$.
S2-4. After the correlation coefficients $p_i$ are obtained, compute the weighted sum of the tail entities $t_i$ of $S_u^1$ to obtain the vector $O_u^1$:
Figure BDA0002372249760000112
The vector $O_u^1$ represents the first-order response of user $u$'s historical interactions to item $v$; this is equivalent to representing user $u$ with the characteristics of item $v$ rather than with a separate feature vector. The second-order and higher-order responses of user $u$ to $v$ are obtained in the same way.
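Steps S2-3 and S2-4 can be illustrated together. The two equation images are not reproduced in this text; the sketch assumes the form used in the published RippleNet work, namely a softmax over the scores $v^{\mathrm{T}} R_i h_i$ followed by a weighted sum of the tail embeddings. The function name and the plain-list data layout are assumptions.

```python
import math


def first_order_response(v, ripple_set):
    """Return (p, o): attention weights and the weighted sum of tail embeddings.

    `ripple_set` is a list of (h, R, t), where h and t are d-dimensional
    lists and R is a d x d list of lists (layout assumed for this sketch).
    """
    d = len(v)
    scores = []
    for h, R, _ in ripple_set:
        Rh = [sum(R[a][b] * h[b] for b in range(d)) for a in range(d)]
        scores.append(sum(v[a] * Rh[a] for a in range(d)))  # v^T R h
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # stabilized softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    response = [sum(p * t[k] for p, (_, _, t) in zip(probs, ripple_set)) for k in range(d)]
    return probs, response
```

Higher-order responses would be obtained by calling the same routine on the next ripple set, with the previous response in place of `v` in the published formulation; that substitution is likewise an assumption here.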
According to the multi-order responses of user $u$ to item $v$, the embedding vector of user $u$ with respect to item $v$ is defined as:
Figure BDA0002372249760000113
wherein $\alpha_i$ are positive trainable mixing parameters, with $\alpha_i > 0$ and $\sum_i \alpha_i = 1$;
S2-5. Obtain the predicted click probability of the user and the item from the embedding vector of user $u$ with respect to item $v$, expressed as:
Figure BDA0002372249760000121
wherein $z_{KG}$ denotes the recommendation score based on the knowledge-graph data.
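A compact sketch of the last two steps follows, under the assumption (suggested by the loss function, which uses $\sigma(u^{\mathrm{T}}v)$) that the user embedding is the $\alpha$-weighted sum of the responses and the click probability is the sigmoid of its inner product with the item embedding. The function name is illustrative.

```python
import math


def click_probability(responses, alphas, v):
    """sigmoid(u^T v), with u = sum_i alphas[i] * responses[i]."""
    u = [sum(a * o[k] for a, o in zip(alphas, responses)) for k in range(len(v))]
    logit = sum(uk * vk for uk, vk in zip(u, v))
    return 1.0 / (1.0 + math.exp(-logit))
```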
The loss function of the RippleNet algorithm follows from the above equation:
$$\Gamma = \sum_{(u,v)\in Y} -\left( y_{uv}\log\sigma(u^{\mathrm{T}}v) + (1-y_{uv})\log\left(1-\sigma(u^{\mathrm{T}}v)\right) \right)$$
wherein $\sigma$ is the sigmoid function, and $y_{uv} = 1$ indicates a historical interaction between user $u$ and item $v$, i.e., user $u$ has clicked on and viewed item $v$. The loss function is used to train and adjust the parameters.
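The loss is the usual binary cross-entropy over the observed user-item pairs, which can be sketched as below. The clipping constant `eps` is an addition of the sketch to avoid taking the logarithm of zero; it is not part of the source.

```python
import math


def cross_entropy(labels, probs, eps=1e-12):
    """Sum over pairs of -(y*log(p) + (1-y)*log(1-p)), with p clipped to (0, 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total
```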
S3. Dynamically fuse the similarity between the user and the item obtained in step S1 and the predicted click probability between the user and the item obtained in step S2 by using a dynamic fusion algorithm to obtain a predicted score, and provide a personalized recommendation list for the user based on the predicted score.
In order for the two kinds of hidden features to complement each other and produce a better prediction result, a linear interpolation parameter α is introduced and a dynamic fusion recommendation model, REME (RippleNet and word2 vecsuation Model), is proposed. As shown in fig. 3, the similarity between the user and the item obtained in step S1 and the predicted click probability between the user and the item obtained in step S2 are dynamically fused to obtain a prediction score, which is expressed as:
Figure BDA0002372249760000122
wherein,
Figure BDA0002372249760000123
$z_{review}$ denotes the recommendation score based on the text comment data.
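Because the fusion equations are images that are not reproduced in this text, the following is only a sketch of what a linear interpolation between the two scores could look like. The convex-combination form, the choice of which score α multiplies, and the helper names `fuse` and `top_k` are all assumptions, not the patent's stated formula.

```python
def fuse(z_kg, z_review, alpha):
    """Assumed form: alpha * z_kg + (1 - alpha) * z_review, with 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * z_kg + (1.0 - alpha) * z_review


def top_k(scores, k):
    """Return the k item ids with the highest fused score."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:k]]
```

Applying `fuse` to every candidate item of a user and passing the resulting dictionary to `top_k` yields a recommendation list of length k.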
The invention optimizes the parameters of the above formula by stochastic gradient descent and back propagation. The specific process is as follows:
first, collect the ripple set and the set of all comments of each user, and convert the comment set into the corresponding user feature vector with the word2vec algorithm;
within a preset number of iterations $T$, update the parameters $\{\alpha_i, i = 1, 2, \dots, H\}$ with stochastic gradient descent and back propagation;
compute the corresponding item feature vector for each item in the same way as the user feature vectors;
after all user and item feature vectors are computed, traverse the user-item pairs of the test set and compute the user-item correlation coefficient vector $z_{uv}$;
update the parameters $\Theta$ of the FM algorithm with stochastic gradient descent and back propagation;
finally, output the parameters $\{\alpha_i, i = 1, 2, \dots, H\}$ and $\Theta$.
To illustrate that the REME algorithm has better time performance while improving the accuracy of the algorithm, the invention analyzes the time complexity of the REME algorithm.
First, the user feature vectors are created. Computing the user ripple sets has time complexity $O(a \times m)$, where $a$ is the number of users and is treated as a constant; the word2vec algorithm has time complexity $O(\log n)$. Combining these steps, creating the user feature vectors has time complexity $O(a(m + \log n))$, which is approximately $O(\log n)$ because $n$ is much larger than $m$. Creating the item feature vectors is analogous and also has time complexity $O(\log n)$. Computing the cross vector of the user and item features has time complexity $O(\log^2 n)$, so the overall time complexity of REME is $O(\log^2 n)$.
The invention uses specific examples to compare the performance of the invention with different algorithms.
The public Yelp dataset was used in the experiments to analyze recommendation performance. The invention extracts the restaurant data of two regions in the Yelp dataset, Arizona (AZ) and Carolina (SC), including the comment data of users and the attribute data of merchants. The comment data mainly contain the comments and scores given by users; in the experiments, each user comment was treated as one check-in. The attribute data of a merchant mainly contain its ID, name, location (region, city, longitude and latitude, etc.), restaurant category, and tags. The experiments used Microsoft Satori to build a knowledge graph for the Yelp merchants.
The statistical information of the data sets of the two screened areas is shown in table 1.
Table 1. Statistics of the data sets
Figure BDA0002372249760000141
Table 1 shows that AZ has about twice as many users as SC and about five times as many merchants, which leads to differences in data sparsity and, in turn, to some differences in the final experimental results.
In RippleNet, the number of ripple hops H is set to 2; the experimental results show that a larger number of hops does not improve performance but instead adds computational overhead.
The complete experimental parameters are set as follows: the embedding dimension d of the merchants and the knowledge graph is 16, the learning rate η is 0.02, the regularization parameters are λ_1 = 10^-7 and λ_2 = 0.01, and H = 2.
For word2vec, the dimension of the resulting embedding vectors is also set to d. The hyperparameters were determined according to the AUC obtained on the validation data.
Each dataset was split into training, validation, and test sets at a ratio of 6:2:2. Each experiment was repeated 5 times and the average was taken as the final result.
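As an illustration only, the experimental setup described above can be sketched in Python. The class name `ExperimentConfig`, its field names, and the helpers `split_622` and `mean_over_repeats` are hypothetical and do not appear in the original; only the numeric values (d = 16, η = 0.02, λ_1 = 10^-7, λ_2 = 0.01, H = 2, the 6:2:2 ratio, and 5 repetitions) are taken from the text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters stated in the text; the field names are illustrative."""
    embedding_dim: int = 16      # d, shared by RippleNet and word2vec
    learning_rate: float = 0.02  # eta
    lambda_1: float = 1e-7       # regularization parameter lambda_1
    lambda_2: float = 0.01       # regularization parameter lambda_2
    n_hops: int = 2              # H
    n_repeats: int = 5           # each experiment is repeated 5 times


def split_622(records, seed):
    """Shuffle the records and split them 6:2:2 into train / validation / test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_valid = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def mean_over_repeats(run_once, config):
    """Average a metric over config.n_repeats runs, one seed per run."""
    scores = [run_once(seed) for seed in range(config.n_repeats)]
    return sum(scores) / len(scores)
```

Here `run_once` stands for any function that trains and evaluates a model for a given seed and returns a single metric value such as AUC.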
The invention adopts the following two evaluation metrics to evaluate the performance of the algorithm:
1) For click-through rate (CTR) prediction, accuracy (ACC) and AUC are used to evaluate performance.
2) For top-k recommendation, recall@k is used as the evaluation metric, defined by the formula:
Figure BDA0002372249760000151
wherein recall@k represents the recall rate in the top-k recommendation list, i.e., the probability that the user clicks on an item in the recommendation list; hit represents the number of times users in the test set click on a restaurant in the recommendation list, and recall represents the total number of check-ins in the test set.
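The two kinds of evaluation metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the experiments: the function names and data layouts are hypothetical, accuracy uses an assumed threshold of 0.5, AUC is computed by pairwise comparison, and recall@k is taken as hit divided by recall as the definitions above suggest, with each user-restaurant pair in the test set counted once.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a positive sample outranks a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_k(recommended, test_checkins, k):
    """hit / recall: hits within the top-k lists over all test check-ins.

    recommended:   dict user -> ranked list of restaurant ids
    test_checkins: dict user -> set of restaurant ids visited in the test set
    """
    hit = sum(len(set(recommended.get(u, [])[:k]) & visited)
              for u, visited in test_checkins.items())
    total = sum(len(visited) for visited in test_checkins.values())
    return hit / total if total else 0.0
```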
In the present invention, the following three classical recommendation algorithms are used for comparison:
1) CKE: CKE combines collaborative filtering with structural knowledge, textual knowledge, and image knowledge in a unified framework for recommendation.
2) DKN: DKN treats entity embeddings and word embeddings as multiple channels and combines them in a CNN for CTR prediction. In the experiments, the merchant tags were used as the text input to DKN.
3) PMF: PMF uses the users' check-in information, decomposing the "user-point of interest" check-in matrix into a user latent factor matrix and a point-of-interest latent factor matrix; it uses these latent factor matrices to predict a user's score for a point of interest and then generates a recommendation list for the user.
The results of top-k recommendations and CTR predictions for different algorithms are shown in FIGS. 4 and 5, and in Table 2.
Table 2. AUC and accuracy results for click-through rate prediction
Figure BDA0002372249760000152
Figure BDA0002372249760000161
The experimental results show that:
(1) The results on SC were consistently better than those on AZ, because the data sparsity of the two regions differs: the average traffic per merchant in AZ is lower than in SC.
(2) CKE uses only structural knowledge here and therefore performs worse than RippleNet. RippleNet performs better than the other baseline models, but it considers only the knowledge graph and does not make effective use of data such as comment text, so its recommendations are not as good as those of REME. DKN performs less well because only the tag information is used as its input here and no other useful information is considered.
(3) PMF gives the worst recommendations on both datasets because the users' check-in data are sparse and the PMF algorithm does not fuse any other content information.
(4) On both datasets, REME achieved the best recommendation results: compared with the baselines, its AUC was 7.8%-19.3% higher on the AZ dataset and 4.9%-20% higher on the SC dataset, and it also achieved the best results in the recall@k test.
Compared with existing typical models, the REME model provided by the invention clearly improves the recommendation effect by effectively fusing multiple kinds of data, and it obtains good recommendations even when data are sparse, so it can effectively mitigate the negative influence of data sparsity on recommendation results.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (7)

1. A personalized recommendation method fusing multiple information is characterized by comprising the following steps:
S1, acquiring a user-item comment data set, obtaining feature word vectors of the users and the items respectively by using the word2vec algorithm, and obtaining the similarity between users and items by using the FM algorithm;
S2, constructing an interaction matrix of users and items according to the users' historical click information, and obtaining the predicted click probability between users and items by using the RippleNet algorithm in combination with a knowledge graph;
and S3, dynamically fusing the similarity between users and items obtained in step S1 and the predicted click probability between users and items obtained in step S2 by using a dynamic fusion algorithm to obtain a predicted score, and providing a personalized recommendation list for the user based on the predicted score.
2. The method for personalized recommendation fusing multiple information according to claim 1, wherein the step S1 specifically comprises the following sub-steps:
S1-1, obtaining all user-item comment information in a database and, using the word2vec algorithm, combining a user's comments on all items into text data representing the user information, and combining the comments of all users received by an item into the text data of the item;
S1-2, vectorizing the text data of the user information and the text data of the item obtained in step S1-1, respectively, using the word2vec algorithm to obtain the feature word vectors of the user and the item;
and S1-3, combining the feature word vectors of the users and the items obtained in step S1-2 pairwise using the FM algorithm, and adding cross-term features to obtain the similarity between users and items.
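Sub-steps S1-1 and S1-2 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the `word_vectors` dictionary stands in for a trained word2vec model, tokenization is a plain whitespace split, and averaging word vectors into one document vector is a simplification, since the claim does not state how the word vectors of a document are pooled.

```python
from collections import defaultdict


def build_documents(reviews):
    """S1-1: concatenate review tokens per user and per item.

    reviews: iterable of (user_id, item_id, text) tuples.
    """
    user_docs, item_docs = defaultdict(list), defaultdict(list)
    for user, item, text in reviews:
        tokens = text.lower().split()
        user_docs[user].extend(tokens)
        item_docs[item].extend(tokens)
    return dict(user_docs), dict(item_docs)


def document_vector(tokens, word_vectors, dim):
    """S1-2 (simplified): average the vectors of the known words in a document."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim  # no known word: fall back to a zero vector
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
```

The resulting user and item vectors would then be concatenated into the feature vector that the FM step of S1-3 consumes.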
3. The method for personalized recommendation fusing multiple information according to claim 2, wherein in step S1-3, the model of FM algorithm is represented as:
Figure FDA0002372249750000011
wherein m_0 represents the global bias term, m is the weight vector for the feature vector z_uv of user u and item v, M is the weight matrix of second-order interactions, M_{j,c} is the value in row j and column c of M, and i_j, i_c are the i-dimensional hidden vectors corresponding to the j-th and c-th dimensions of z_uv.
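A second-order factorization machine of this kind can be sketched as follows. Because the formula itself is only available as an image, the sketch assumes the standard FM form, in which the score is the global bias plus the first-order term plus the sum over feature pairs j < c of M_{j,c} z_j z_c, and in which M_{j,c} is the inner product of the hidden vectors of dimensions j and c. The function names and argument layout are hypothetical.

```python
def fm_score(z, m0, m, hidden):
    """Second-order FM score for one concatenated user-item feature vector z.

    z:      list of n feature values (z_uv)
    m0:     global bias
    m:      list of n first-order weights
    hidden: list of n hidden vectors; M[j][c] is taken as dot(hidden[j], hidden[c])
    """
    n, k = len(z), len(hidden[0])
    linear = sum(m[j] * z[j] for j in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(hidden[j][f] * z[j] for j in range(n))
        s_sq = sum((hidden[j][f] * z[j]) ** 2 for j in range(n))
        pairwise += 0.5 * (s * s - s_sq)  # equals the sum over pairs j < c
    return m0 + linear + pairwise


def fm_score_naive(z, m0, m, hidden):
    """Same score via the explicit double sum over pairs j < c (for checking)."""
    n = len(z)
    total = m0 + sum(m[j] * z[j] for j in range(n))
    for j in range(n):
        for c in range(j + 1, n):
            m_jc = sum(a * b for a, b in zip(hidden[j], hidden[c]))
            total += m_jc * z[j] * z[c]
    return total
```

The first function uses the usual linear-time rewriting of the pairwise term; the second is kept only to show that both forms agree.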
4. The method for personalized recommendation fusing multiple information according to claim 3, wherein step S1 adopts the square loss as the objective function for parameter optimization, expressed as:
Figure FDA0002372249750000021
wherein O represents the set of observed user-item score pairs, y_uv represents the interaction history of user u with item v, Θ represents all parameters, and λ_Θ represents the L2 regularization parameter.
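As a hedged illustration of such an objective, the sketch below assumes the common form of a regularized square loss: the squared prediction error summed over the observed pairs in O, plus λ_Θ times the squared L2 norm of the parameters. The exact formula is only available as an image, and the function name and argument layout are hypothetical.

```python
def squared_loss(predictions, targets, params, l2_weight):
    """Sum of squared errors over observed pairs plus an L2 penalty.

    predictions, targets: dicts keyed by (user, item) for the observed set O
    params:               flat list of all model parameters (Theta)
    l2_weight:            regularization strength (lambda_Theta)
    """
    error = sum((predictions[pair] - targets[pair]) ** 2 for pair in targets)
    penalty = l2_weight * sum(p * p for p in params)
    return error + penalty
```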
5. The method for personalized recommendation fusing multiple information according to claim 4, wherein the step S2 specifically comprises the following sub-steps:
S2-1, denoting the user set and the item set as U = {u_1, u_2, ..., u_m} and V = {v_1, v_2, ..., v_n}, respectively, and constructing the interaction matrix of users and items, expressed as:
Y = {y_uv | u ∈ U, v ∈ V}
wherein y_uv represents the interaction history of user u and item v, m represents the number of users, and n represents the number of items;
S2-2, according to the interaction matrix of users and items and the knowledge graph containing relation-entity triples, defining the set of k-hop associated entities of user u as:
Figure FDA0002372249750000022
wherein (h, r, t) represents a relation-entity triple contained in the knowledge graph, h represents the head entity, r represents the relation, t represents the tail entity, and H represents the farthest hop associated with the original items;
defining the k-hop ripple set of user u on the knowledge graph G as:
Figure FDA0002372249750000023
S2-3, creating a d-dimensional embedding vector v for each item v; for each triple (h_i, r_i, t_i) in the 1-hop ripple set of user u, its correlation coefficient with v is:
Figure FDA0002372249750000031
wherein R_i represents the embedding of relation r_i, and h_i represents the embedding of head entity h_i;
S2-4, computing, according to the correlation coefficients, the weighted sum of the tail entities t_i of the 1-hop ripple set of user u to obtain the first-order response of user u to item v:
Figure FDA0002372249750000032
according to the multi-order responses of user u to item v, the embedding vector of user u with respect to item v is defined as:
Figure FDA0002372249750000033
wherein α_i is a positive mixing parameter;
S2-5, obtaining the predicted click probability between the user and the item according to the embedding vector of user u with respect to item v, expressed as:
Figure FDA0002372249750000034
wherein z_KG represents the recommendation based on knowledge graph data.
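Sub-steps S2-1 to S2-5 can be sketched in Python as follows. The formulas in this claim are only available as images, so the sketch follows the RippleNet paper listed under the non-patent citations and should be read as an assumption: y_uv is taken as 1 for a click and 0 otherwise, the correlation coefficients are a softmax over v^T R_i h_i, each relation embedding R_i is a d × d matrix, and for hops after the first the previous response replaces v in the correlation computation. All function names and data layouts are hypothetical.

```python
import math


def build_interaction_matrix(users, items, clicks):
    """S2-1: Y[a][b] = 1 if users[a] clicked items[b], else 0."""
    u_idx = {u: a for a, u in enumerate(users)}
    v_idx = {v: b for b, v in enumerate(items)}
    matrix = [[0] * len(items) for _ in users]
    for user, item in clicks:
        matrix[u_idx[user]][v_idx[item]] = 1
    return matrix


def ripple_sets(clicked_items, triples, n_hops):
    """S2-2: return [S_u^1, ..., S_u^H]; triples is a list of (h, r, t)."""
    frontier, result = set(clicked_items), []
    for _ in range(n_hops):
        hop = [(h, r, t) for (h, r, t) in triples if h in frontier]
        result.append(hop)
        frontier = {t for (_, _, t) in hop}  # k-hop associated entities
    return result


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def hop_response(query, hop, ent_emb, rel_emb):
    """S2-3/S2-4: sum_i p_i * t_i with p_i = softmax(query^T R_i h_i)."""
    logits = []
    for h, r, _ in hop:
        r_h = [dot(row, ent_emb[h]) for row in rel_emb[r]]  # R_i h_i
        logits.append(dot(query, r_h))
    weights = softmax(logits)
    dim = len(query)
    return [sum(w * ent_emb[t][i] for w, (_, _, t) in zip(weights, hop))
            for i in range(dim)]


def click_probability(item, hops, ent_emb, rel_emb, alphas):
    """S2-5: sigmoid(u^T v), with u = sum_k alpha_k * o_u^k."""
    v = ent_emb[item]
    user_vec, query = [0.0] * len(v), v
    for alpha, hop in zip(alphas, hops):
        if not hop:  # no triples reachable at this hop
            break
        query = hop_response(query, hop, ent_emb, rel_emb)
        user_vec = [u + alpha * o for u, o in zip(user_vec, query)]
    return 1.0 / (1.0 + math.exp(-dot(user_vec, v)))
```

In a trained model the entity and relation embeddings would be learned; here they are plain dictionaries passed in by the caller.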
6. The method for personalized recommendation fusing multiple information according to claim 5, wherein the loss function of the RippleNet algorithm in step S2 is expressed as:
Γ = Σ_{(u,v)∈Y} −( y_uv · log σ(u^T v) + (1 − y_uv) · log(1 − σ(u^T v)) ).
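The loss in this claim is the binary cross-entropy between the observed interactions y_uv and the predicted probabilities σ(u^T v). A minimal sketch, with a hypothetical function name and with each sample given as a (label, logit) pair where the logit is u^T v:

```python
import math


def click_loss(samples):
    """Binary cross-entropy summed over (label, logit) pairs, logit = u^T v."""
    total = 0.0
    for label, logit in samples:
        prob = 1.0 / (1.0 + math.exp(-logit))  # sigma(u^T v)
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total
```

The loss is small when clicked pairs receive large positive logits and unclicked pairs receive large negative ones.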
7. The method for personalized recommendation fusing multiple information according to claim 6, wherein in step S3 the similarity between users and items and the predicted click probability between users and items are dynamically fused by using a dynamic fusion algorithm to obtain a predicted score, expressed as:
Figure FDA0002372249750000035
wherein,
Figure FDA0002372249750000041
z_review represents the recommendation based on the text comment data.
CN202010054209.1A 2020-01-17 2020-01-17 Personalized recommendation method integrating multiple information Active CN111259133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010054209.1A CN111259133B (en) 2020-01-17 2020-01-17 Personalized recommendation method integrating multiple information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010054209.1A CN111259133B (en) 2020-01-17 2020-01-17 Personalized recommendation method integrating multiple information

Publications (2)

Publication Number Publication Date
CN111259133A true CN111259133A (en) 2020-06-09
CN111259133B CN111259133B (en) 2021-02-19

Family

ID=70952218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010054209.1A Active CN111259133B (en) 2020-01-17 2020-01-17 Personalized recommendation method integrating multiple information

Country Status (1)

Country Link
CN (1) CN111259133B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103995823A (en) * 2014-03-25 2014-08-20 南京邮电大学 Information recommending method based on social network
US20170098236A1 (en) * 2015-10-02 2017-04-06 Yahoo! Inc. Exploration of real-time advertising decisions
CN107330461A (en) * 2017-06-27 2017-11-07 安徽师范大学 Collaborative filtering recommending method based on emotion with trust
CN107562795A (en) * 2017-08-01 2018-01-09 广州市香港科大霍英东研究院 Recommendation method and device based on Heterogeneous Information network
CN109241424A (en) * 2018-08-29 2019-01-18 陕西师范大学 A kind of recommended method
WO2018226888A8 (en) * 2017-06-06 2019-02-21 Diffeo, Inc. Knowledge operating system
CN109388731A (en) * 2018-08-31 2019-02-26 昆明理工大学 A kind of music recommended method based on deep neural network
CN109871858A (en) * 2017-12-05 2019-06-11 北京京东尚科信息技术有限公司 Prediction model foundation, object recommendation method and system, equipment and storage medium
CN110245285A (en) * 2019-04-30 2019-09-17 中国科学院信息工程研究所 A kind of personalized recommendation method based on Heterogeneous Information network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HONGWEI WANG: "RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems", 《URL:HTTPS://ARXIV.ORG/PDF/1803.03467.PDF》 *
熊海涛: "《面向复杂数据推荐分析研究》", 31 January 2015 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782813A (en) * 2020-07-07 2020-10-16 支付宝(杭州)信息技术有限公司 User community evaluation method, device and equipment
CN111782813B (en) * 2020-07-07 2023-10-31 支付宝(杭州)信息技术有限公司 User community evaluation method, device and equipment
CN111859125A (en) * 2020-07-09 2020-10-30 威海天鑫现代服务技术研究院有限公司 Semantic network construction and service recommendation method oriented to intellectual property technical resource field
CN111784081B (en) * 2020-07-30 2022-03-01 南昌航空大学 Social network link prediction method adopting knowledge graph embedding and time convolution network
CN111784081A (en) * 2020-07-30 2020-10-16 南昌航空大学 Social network link prediction method adopting knowledge graph embedding and time convolution network
CN111932308A (en) * 2020-08-13 2020-11-13 中国工商银行股份有限公司 Data recommendation method, device and equipment
CN112163929A (en) * 2020-09-27 2021-01-01 中国平安财产保险股份有限公司 Service recommendation method and device, computer equipment and storage medium
CN112163929B (en) * 2020-09-27 2024-04-05 中国平安财产保险股份有限公司 Service recommendation method, device, computer equipment and storage medium
CN112487200A (en) * 2020-11-25 2021-03-12 吉林大学 Improved deep recommendation method containing multi-side information and multi-task learning
CN112633504A (en) * 2020-12-23 2021-04-09 北京工业大学 Wisdom cloud knowledge service system and method for fruit tree diseases and insect pests based on knowledge graph
CN112733040B (en) * 2021-01-27 2021-07-30 中国科学院地理科学与资源研究所 Travel itinerary recommendation method
CN112733040A (en) * 2021-01-27 2021-04-30 中国科学院地理科学与资源研究所 Travel itinerary recommendation method
CN113032618A (en) * 2021-03-26 2021-06-25 齐鲁工业大学 Music recommendation method and system based on knowledge graph
CN113190593A (en) * 2021-05-12 2021-07-30 《中国学术期刊(光盘版)》电子杂志社有限公司 Search recommendation method based on digital human knowledge graph
CN113392325A (en) * 2021-06-21 2021-09-14 电子科技大学 Deep learning-based information recommendation method
WO2023197910A1 (en) * 2022-04-12 2023-10-19 华为技术有限公司 User behavior prediction method and related device thereof
CN114925294A (en) * 2022-06-04 2022-08-19 上海交通大学 Position prediction system and method based on graph-enhanced time-space model
CN115270005B (en) * 2022-09-30 2022-12-23 腾讯科技(深圳)有限公司 Information recommendation method, device, equipment and storage medium
CN115270005A (en) * 2022-09-30 2022-11-01 腾讯科技(深圳)有限公司 Information recommendation method, device, equipment and storage medium
CN115982646A (en) * 2023-03-20 2023-04-18 西安弘捷电子技术有限公司 Multi-source test data management method and system based on cloud platform
CN116701772A (en) * 2023-08-03 2023-09-05 广东美的暖通设备有限公司 Data recommendation method and device, computer readable storage medium and electronic equipment
CN116701772B (en) * 2023-08-03 2024-03-19 广东美的暖通设备有限公司 Data recommendation method and device, computer readable storage medium and electronic equipment
CN117786234A (en) * 2024-02-28 2024-03-29 云南师范大学 Multimode resource recommendation method based on two-stage comparison learning
CN117786234B (en) * 2024-02-28 2024-04-26 云南师范大学 Multimode resource recommendation method based on two-stage comparison learning
CN118245849A (en) * 2024-05-21 2024-06-25 北京德和顺天科技有限公司 Automobile fault detection method based on big data

Also Published As

Publication number Publication date
CN111259133B (en) 2021-02-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant