Background
With the development of multimedia technology and the spread of handheld mobile devices, people can access information on the network at any time and in any place. The amount of information generated on the internet in a single day is more than one person could read in a lifetime. Recommender systems emerged in response to this increasingly serious information overload problem: they establish more efficient connections between users and information and save a great deal of time and cost. In 1994, the GroupLens research group in the computer science department of the University of Minnesota, Twin Cities, designed a news recommendation system and proposed the concept of collaborative filtering (CF) for the first time. Since then, CF has been studied intensively in academia and deployed in practice in industry. The method analyzes the historical interaction records between users and items to predict the items a user is likely to interact with next, and makes full use of collective intelligence to find the content that best matches the user's interests. Item-based collaborative filtering (ICF) makes recommendations according to the similarity among items. ICF holds that a user's personalized interests are embodied in the historical items the user has interacted with, which can therefore serve as an important component of user interest modeling.
The knowledge graph (KG) is a concept proposed by Google in 2012. From an academic point of view, a knowledge graph can be defined as follows: "a knowledge graph is essentially a knowledge base structured as a semantic network." Knowledge graphs are an important part of cognitive intelligence and have a range of applications in fields such as retrieval and recommendation; knowledge-enabled intelligent recommendation is becoming the mainstream of future recommendation. Introducing a knowledge graph into a recommender system as auxiliary information can effectively address the sparsity and cold-start problems of traditional recommender systems and gives the final recommendation results a degree of interpretability. To mine the key features of the data in a KG, a knowledge graph embedding (KGE) algorithm is required to encode information that adequately characterizes the original data as low-dimensional embedded features. KGE algorithms can generally be divided into two categories: translational distance models and semantic matching models. Translational distance models use a distance-based scoring function and measure the plausibility of a fact by the distance between two entities; examples include TransE, TransH, TransR and TransD. Semantic matching models use a similarity-based scoring function and measure the plausibility of a fact by matching the latent semantics of the entities and relations contained in their vector space representations; examples include DistMult and RESCAL.
In research on personalized recommendation algorithms, user interest modeling is a perennial topic, and how to express, compute and update user interests has become an important research question. User interests tend to follow a hierarchical pattern. In a movie recommendation scenario, for example, they run from higher-level attributes (e.g., genre, director, actors) down to specific lower-level items (a specific movie). Previous models for user interest extraction, such as Alibaba's DIN and DIEN, neglect this hierarchical pattern and extract user interest only at the item level. A user's historical behavior is important information: extracting user interest from it can effectively improve recommendation quality, and with the aid of an external knowledge graph, a multi-dimensional expression of the user's interests can be extracted from the user's historical interactions.
In summary, the expression of user interest in recommendation has not yet reached its best. Most research uses a single, mixed latent vector to represent the user and does not consider the user's multi-dimensional interests, so the performance gain is limited. A multi-dimensional expression of user interest is difficult to obtain because of the sparsity of interaction data and the cold-start problem; how to define the multiple dimensions is itself a difficult problem; and interpretability is lacking. The invention therefore provides a knowledge graph-assisted method for extracting the multi-dimensional interests of a user.
Disclosure of Invention
Most existing recommendation algorithms use a single, mixed vector to express user interest and cannot express it from multiple dimensions. This uniform way of modeling user interest ignores the entanglement within the latent vector, so it easily yields a suboptimal expression of user interest and also lacks interpretability. In view of the requirements of current recommender systems, the invention aims to provide a method for reasonably extracting a user's multi-dimensional interests from the user's historical interaction records, obtaining the expression of the user's interests in the different attribute spaces of an item with the assistance of an external knowledge base, as shown in fig. 1.
To express user interest from multiple dimensions, the invention discloses a knowledge graph-assisted method for extracting the multi-dimensional interests of a user. The overall framework is shown in fig. 2. The data sets used in the method are public data sets from academia and industry (Amazon-book: books; Last-FM: music). The method comprises three modules: an input layer, a multi-dimensional interest extraction layer and a maximum interest response layer. The input layer takes the user's historical interacted items and the item to be recommended as input, and the knowledge graph representation learning method used is TransR. The multi-dimensional interest extraction layer extracts the user's multi-dimensional interests by aggregating different entities under different connection relations, that is, the user has a different interest expression in each item attribute space; in essence, it clusters items after mapping them into different semantic spaces. The maximum interest response layer first maps the item to be recommended into the different item attribute spaces, then takes the inner product with the user's corresponding interest in each space, and selects the maximum value as the final prediction score.
The main modules of the method are described as follows:
1. Input layer
The input layer has two parts. First, the historical items a user has interacted with are obtained from the data set, and these historical items together with the target item to be recommended form the raw input of the model. Second, all items in the data set are put into one-to-one correspondence with entities in the knowledge graph, and the representation learning method TransR is applied, that is, entities are mapped into different relation spaces for comparison, as shown in fig. 3. A triplet (h, r, t) holds only when h + r ≈ t is satisfied within the same relation space. In essence, an entity (i.e., an item) in the knowledge graph is a composite of multiple attributes, different relations attend to different attributes of the entity, and two entities that are similar in the space of relation A may not be similar in the space of relation B.
2. Multi-dimensional interest extraction layer
The multi-dimensional interest extraction layer obtains the user's interest expression in each item attribute space. First, the entities in the knowledge graph that correspond to the user's historical interacted items are found. Taking movies as an example, the knowledge graph used does not contain users; different movie entities are connected through relations such as sharing a director or sharing an actor, forming triplets such as movie 1-director-movie 2 and movie 3-actor-movie 4, as shown in fig. 4. After the user's historical interacted items and the entities connected to them are obtained, the entities that share a director with the user's interacted entities are extracted. It is worth emphasizing that, because a user usually has more than one historical item, the resulting set of entities covers several directors rather than one, so the set expresses the user's interest in the director attribute of movies rather than in a specific director. Similarly, the entities that share an actor with the user's interacted entities express the user's interest in the actor attribute of movies, and so on. The method first determines how many relations the knowledge graph contains, that is, how many attributes this category of item (movies) has; for each attribute there is a corresponding expression of user interest. In addition, a characteristic of TransR is that even the same entity has different representations in different relation spaces. For the same movie, the director relation space focuses on its director attribute and the actor relation space focuses on its actor attribute, so the feature representations of the same movie differ across relation spaces.
3. Maximum interest response layer
The maximum interest response layer produces the final interaction prediction score of the user for an item. After the user's multi-dimensional interest expression is obtained, the method first maps the item to be recommended into the different item attribute spaces using the different relation matrices, then takes the inner product with the user's corresponding interest in each space, and selects the maximum value as the final prediction score. In essence, the user particularly likes a certain attribute of the item, so there is good reason to recommend the item to the user. After the maximum interest response layer has output the user's interaction prediction scores for all items to be recommended, the items are sorted in descending order of score and the top N items form a list that is recommended to the user, completing the recommendation.
Detailed Description
The invention discloses a knowledge graph-assisted personalized recommendation method that extracts the multi-dimensional interests of a user. The specific implementation steps are as follows:
Step one: data preprocessing and division into training and test sets: data preprocessing has two parts. The first is to select suitable public data sets and assign sequential IDs to all users and items. The second is to record user-item interactions as implicit feedback: if the interaction between a user and an item takes the form of a rating, any item the user has rated is labeled 1, and otherwise the item is labeled 0. All items are then put into one-to-one correspondence with entities in the knowledge graph. Finally, the training set and test set are divided. In the method, the two tasks of knowledge graph representation learning and recommendation prediction are trained jointly. For the recommendation prediction task, positive samples are split between the training set and the test set at a ratio of 4:1. Experiments show that performance is best when the ratio of positive to negative samples in the training sets of both the knowledge graph representation learning task and the recommendation prediction task is 1:1.
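The preprocessing in this step can be sketched as follows. This is a minimal illustration, not the invention's actual pipeline: the function names, the per-user split, the uniform random choice of negatives and the toy rating records are all assumptions made for the example; only the implicit labels, the 4:1 split of positives and the 1:1 positive-to-negative ratio come from the text.

```python
import random

def to_implicit(ratings):
    """Turn (user, item, rating) records into implicit-feedback positives.

    Any rated item is a positive (label 1); unrated items are label 0.
    """
    positives = {}
    for user, item, _rating in ratings:
        positives.setdefault(user, set()).add(item)
    return positives

def split_positives(positives, train_ratio=0.8, seed=0):
    """Split each user's positives 4:1 into training and test lists."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in positives.items():
        shuffled = sorted(items)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)
        train[user] = shuffled[:cut]
        test[user] = shuffled[cut:]
    return train, test

def sample_negatives(train, positives, all_items, seed=0):
    """Pair every training positive with one unseen item (1:1 ratio)."""
    rng = random.Random(seed)
    samples = []
    for user, items in train.items():
        candidates = sorted(set(all_items) - positives[user])
        for item in items:
            samples.append((user, item, 1))
            samples.append((user, rng.choice(candidates), 0))
    return samples

# Hypothetical toy data: u1 rated items 0..9, u2 rated items 5..9.
ratings = [("u1", i, 5) for i in range(10)] + [("u2", i, 3) for i in range(5, 10)]
positives = to_implicit(ratings)
train, test = split_positives(positives)
samples = sample_negatives(train, positives, all_items=range(20))
```

Negatives are drawn only from items the user has never interacted with, so a test positive can never appear as a training negative.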
Step two: model input and TransR training:
The input to the model is the set of historical item entities that a user has interacted with, denoted $V_u$, together with a target item to be recommended. Items correspond one-to-one with entities in the knowledge graph. First, to represent structured knowledge, the invention uses an undirected graph $G = (V, R)$, where $V = \{v_1, v_2, \ldots, v_N\}$ is the set of entities in the knowledge graph, $N$ is the total number of entities, $R = \{r_1, r_2, \ldots, r_M\}$ is the set of relations, and $M$ is the total number of relations. In the TransR algorithm, for each triplet $(v_h, r, v_t)$, $v_h$ and $v_t$ are two connected head and tail entities, and $r$ is the type of relation between them. First, $v_h$ and $v_t$ are mapped into a specific space, $v_h, v_t \in \mathbb{R}^k$, where $k$ is the dimension of the mapped vectors and is set to 64, with $\|v_h\|_2 \le 1$ and $\|v_t\|_2 \le 1$; the relation is represented as $r \in \mathbb{R}^d$, where $d$ is the dimension of the mapped vector and is set to 128, with $\|r\|_2 \le 1$. A transformation matrix $M_r \in \mathbb{R}^{k \times d}$ is then set for the current relation, which converts an entity into the corresponding relation space, as shown in fig. 3. The mapping is specifically:
The scoring function for this triplet is defined as:
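For orientation, the standard TransR formulation, written in the notation of this step, projects the entities as $v_h^r = v_h M_r$ and $v_t^r = v_t M_r$ and scores a triplet as $f_r(v_h, v_t) = \|v_h^r + r - v_t^r\|_2^2$, where a smaller value means a more plausible fact. That this textbook form is the one intended here is an assumption, as are the superscript notation and the toy vectors below. A minimal sketch:

```python
def project(vec, matrix):
    """Row vector (length k) times a k x d matrix gives a vector of length d."""
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec)))
            for j in range(len(matrix[0]))]

def transr_score(v_h, r, v_t, m_r):
    """Squared L2 distance ||v_h M_r + r - v_t M_r||^2 (lower = more plausible)."""
    h_r = project(v_h, m_r)
    t_r = project(v_t, m_r)
    return sum((h + rel - t) ** 2 for h, rel, t in zip(h_r, r, t_r))

# Toy sizes for readability (the text uses k = 64 and d = 128).
m_r = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]      # k = 2, d = 3
v_h = [0.5, 0.0]
v_t = [0.5, 0.5]
r = [0.0, 0.5, 0.5]          # chosen so that v_h M_r + r equals v_t M_r

score_true = transr_score(v_h, r, v_t, m_r)    # triplet that holds
score_false = transr_score(v_t, r, v_h, m_r)   # head and tail swapped
```

With these values the valid triplet scores 0 and the swapped one scores 2, which illustrates how the distance separates plausible from implausible facts within one relation space.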
Step three: extracting the multi-dimensional interests of the user: this layer obtains the user's interest expression in the different item attribute spaces: $I_u = f_{extractor}(V_u, G = (V, R))$. Assuming the knowledge graph used is complete, the relations in the knowledge graph partition the attribute space of the item category. Taking movies as an example, relations such as director and actor correspond to the attribute spaces of the movie entity, so under current knowledge the number of relations in the knowledge graph can be taken as the number of attribute spaces of the item category. Taking the historical entities the user has interacted with as center points, the user's interest expressions in the different attribute spaces of the items are obtained in turn according to the different relations. The specific calculation formulas are as follows:
......
where $r_1 \ne r_2 \ne \ldots \ne r_M$ and $r_1, r_2, \ldots, r_M \in R$. The method of the invention uses only 1-hop information in the knowledge graph and does not widen the information aggregation range, because introducing higher-order information when expressing user interest also introduces noise, which in turn affects performance. In addition, conventional methods usually take the historical items a user has interacted with as the direct embodiment of the user's interests; in the invention, when the user's interest expressions in the different attribute spaces of the items are obtained, the historical interacted items are also included in the interest expressions under the respective relations.
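The extraction described in this step can be illustrated with a short sketch. The aggregation function is not recoverable from the text, so mean pooling is used here purely as an assumption; the function names, the triplet list format and the toy embeddings are likewise hypothetical. What does follow the text is the use of 1-hop neighbours only, the inclusion of the historical items themselves, and one interest vector per relation obtained after projecting with that relation's matrix.

```python
def project(vec, matrix):
    """Row vector (length k) times a k x d matrix gives a vector of length d."""
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec)))
            for j in range(len(matrix[0]))]

def extract_interests(history, triplets, entity_emb, rel_matrices):
    """Return {relation: interest vector} for one user.

    For each relation r: collect the historical items plus their 1-hop
    neighbours under r, project each entity with M_r, then average.
    The mean is an assumed aggregator, not taken from the text.
    """
    interests = {}
    for rel, m_r in rel_matrices.items():
        members = set(history)            # historical items are included
        for head, r, tail in triplets:
            if r != rel:
                continue
            if head in history:           # undirected graph: check both ends
                members.add(tail)
            if tail in history:
                members.add(head)
        projected = [project(entity_emb[e], m_r) for e in sorted(members)]
        dim = len(projected[0])
        interests[rel] = [sum(p[j] for p in projected) / len(projected)
                          for j in range(dim)]
    return interests

# Hypothetical toy graph with two relations and 2-dimensional embeddings.
entity_emb = {"m1": [1.0, 0.0], "m2": [0.0, 1.0], "m3": [1.0, 1.0]}
rel_matrices = {"director": [[1.0, 0.0], [0.0, 1.0]],
                "actor":    [[0.0, 1.0], [1.0, 0.0]]}
triplets = [("m1", "director", "m2"), ("m1", "actor", "m3")]

interests = extract_interests({"m1"}, triplets, entity_emb, rel_matrices)
```

The same historical movie m1 contributes a different projected vector to each relation's interest, which is the property of TransR that the method relies on.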
Step four: obtaining the prediction score and the recommendation list by maximum interest response: after the user's interest expressions in the different attribute spaces of the items are obtained, to obtain the most accurate interaction prediction score, the item to be recommended is mapped into each of the same spaces as the user's multi-dimensional interests for comparison. The specific calculation formula is as follows:
Here, the vector being mapped is the entity feature expression of the item to be recommended in the knowledge graph, the mapping matrix may be that of any relation, and M is the number of relations in the knowledge graph. After the expressions of the item to be recommended in the different attribute spaces are obtained, the inner product of each is taken with the user's corresponding interest feature in that space, and the maximum value is selected as the final prediction score. The calculation formula is as follows:
Here, the function appearing in the formula is an activation function. In essence, the maximum interest response layer reflects that the user particularly likes a certain attribute of the item, so there is good reason to recommend the item to the user. After the maximum interest response layer has output the user's interaction prediction scores for all items to be recommended, the items are sorted in descending order of score and the top N items form a list that is recommended to the user.
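The scoring and ranking of this step can be sketched as below. The activation function is not named in the text, so it is left as a parameter that defaults to the identity, which is an assumption; the function names and toy vectors are also hypothetical. The projection per relation, the inner product with the matching interest, the maximum over relations and the descending top-N sort follow the description above.

```python
def project(vec, matrix):
    """Row vector (length k) times a k x d matrix gives a vector of length d."""
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec)))
            for j in range(len(matrix[0]))]

def predict_score(item_vec, interests, rel_matrices, activation=lambda x: x):
    """Maximum over relations of activation(<item M_r, interest_r>)."""
    scores = []
    for rel, interest in interests.items():
        item_r = project(item_vec, rel_matrices[rel])
        inner = sum(a * b for a, b in zip(item_r, interest))
        scores.append(activation(inner))
    return max(scores)

def recommend(candidates, interests, rel_matrices, n):
    """Sort candidate items by predicted score, descending, and keep the top n."""
    scored = {item: predict_score(vec, interests, rel_matrices)
              for item, vec in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:n]

# Hypothetical relation matrices, user interests and candidate embeddings.
rel_matrices = {"director": [[1.0, 0.0], [0.0, 1.0]],
                "actor":    [[0.0, 1.0], [1.0, 0.0]]}
interests = {"director": [0.5, 0.5], "actor": [0.5, 1.0]}
candidates = {"m4": [1.0, 0.0], "m5": [0.0, 1.0], "m6": [0.2, 0.2]}

top2 = recommend(candidates, interests, rel_matrices, n=2)
```

In this toy case m4 ranks first because of its response in the actor space alone, even though its response in the director space is no better than m5's.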
Step five: optimization method and loss function: the loss function of the invention has three parts. The first part is the loss function of the interaction prediction part. The two learning strategies commonly used in recommender systems are pointwise and pairwise optimization; the invention uses the pointwise strategy, which is widely applied in recommendation algorithms and achieves excellent results. It transforms the recommendation problem into a binary classification task that minimizes the following objective function:
where $\delta(\cdot)$ is the sigmoid function, which constrains the prediction score to lie between 0 and 1, $R^+$ is the positive sample set, i.e. the set labeled 1, and $R^-$ is the negative sample set, i.e. the set labeled 0.
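A pointwise objective of this kind is commonly written as binary cross-entropy, $L_{Rec} = -\sum_{R^+} \log \delta(\hat{y}) - \sum_{R^-} \log(1 - \delta(\hat{y}))$, where $\hat{y}$ is the raw prediction score. That this is the exact form used by the method is an assumption, as are the symbol $\hat{y}$ and the function names in the sketch below.

```python
import math

def sigmoid(x):
    """Squash a raw score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def pointwise_loss(scored_samples):
    """Binary cross-entropy summed over (raw_score, label) pairs."""
    loss = 0.0
    for score, label in scored_samples:
        p = sigmoid(score)
        loss -= math.log(p) if label == 1 else math.log(1.0 - p)
    return loss

good = pointwise_loss([(3.0, 1), (-3.0, 0)])   # confident and correct
bad = pointwise_loss([(-3.0, 1), (3.0, 0)])    # confident and wrong
```

Correct, confident predictions give a loss close to zero, while confident mistakes are penalized heavily, which is what drives the scores of positive and negative samples apart.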
The second part is the loss function of the knowledge graph representation learning (TransR) training part, for which the margin loss is chosen. The input sample pairs are drawn from a positive sample set $S$ and a negative sample set $S'$ obtained from the training set, and the difference between the scores of a positive and a negative sample is required to be larger than a threshold $\gamma$. The objective function is as follows:
where $\max(x, y)$ returns the larger of $x$ and $y$, and $\gamma$ is the threshold, set to 1.0. The third part is a regularization loss function for preventing model overfitting. It covers the parameters of both interaction prediction and knowledge graph representation learning (TransR) training, and is defined as follows:
$L_{reg} = L_{KG\_reg} + L_{Rec\_reg} = \lambda(\|\theta_{KG}\|_2^2 + \|\theta_{Rec}\|_2^2)$
The hyperparameter $\lambda$ controls the strength of the $L_2$ regularization and is set to $10^{-7}$. $\theta_{KG}$ and $\theta_{Rec}$ denote the model parameters regularized to prevent overfitting in knowledge graph training and interaction prediction, respectively, where $\theta_{KG}$ comprises the initialization vectors of the entities and relations, and $\theta_{Rec}$ comprises the initialization vectors of the users and items.
The loss function of the invention consists of these three parts, which are trained jointly: $L = L_{Rec} + L_{KG} + L_{reg}$
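The second and third parts and the joint sum can be sketched as follows. The margin loss is written in the usual TransR form $\sum \max(0, f(s) + \gamma - f(s'))$ over paired positive and negative triplets, where $f$ is a distance score and lower means more plausible; this form and the squared $L_2$ norm in the penalty are assumptions, and the function names and numbers are illustrative only. The values $\gamma = 1.0$ and $\lambda = 10^{-7}$ are those given in the text.

```python
def margin_loss(pos_scores, neg_scores, gamma=1.0):
    """Sum of max(0, f(pos) + gamma - f(neg)) over paired distance scores."""
    return sum(max(0.0, p + gamma - n) for p, n in zip(pos_scores, neg_scores))

def l2_penalty(params_kg, params_rec, lam=1e-7):
    """lambda * (||theta_KG||^2 + ||theta_Rec||^2) over flat parameter lists."""
    return lam * (sum(x * x for x in params_kg) + sum(x * x for x in params_rec))

def joint_loss(l_rec, pos_scores, neg_scores, params_kg, params_rec):
    """L = L_Rec + L_KG + L_reg, with L_Rec supplied by the caller."""
    l_kg = margin_loss(pos_scores, neg_scores)
    l_reg = l2_penalty(params_kg, params_rec)
    return l_rec + l_kg + l_reg

# First pair is already separated by more than gamma, second pair is not.
l_kg = margin_loss(pos_scores=[0.2, 1.5], neg_scores=[2.0, 1.0])
total = joint_loss(0.5, [0.2, 1.5], [2.0, 1.0], [1.0, 2.0], [3.0])
```

Only the pair whose negative sample is not at least $\gamma$ worse than the positive one contributes to the margin loss, so well-separated pairs produce no gradient.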
Step six: verifying the validity of the method: after model training is completed, to verify the effectiveness of the method, experiments were performed on the public data sets Amazon-book (books) and Last-FM (music). After the interaction prediction scores of the target users for the items to be recommended are obtained, the 20, 40, 60, 80 and 100 highest-scoring items are selected for each target user to form Top-N personalized recommendation lists. The evaluation metrics used in the experiments are Recall (the fraction of all positive samples that are recommended), Precision (the fraction of the Top-N list that is correct), Hit Rate (the probability that the Top-N list hits at least one positive sample), and Normalized Discounted Cumulative Gain (NDCG, which considers the position at which a positive sample appears in the Top-N list; the nearer the top, the larger the NDCG). Table 1 shows the performance of the method on the two public data sets.
Table 1: Performance of the method on the Amazon-book and Last-FM data sets
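The four metrics named in step six can be computed per user as in the sketch below. It uses the common binary-relevance definitions with a log2 position discount for NDCG; whether the experiments used exactly these variants is an assumption, and the function name and toy ranking are hypothetical. Per-user values would then be averaged over all test users.

```python
import math

def evaluate_top_n(ranked, positives, n):
    """Recall, Precision, Hit Rate and NDCG at cutoff n for one user.

    `ranked` is the recommendation list, best first; `positives` is the
    non-empty set of held-out test items for that user.
    """
    top = ranked[:n]
    hits = [1 if item in positives else 0 for item in top]
    num_hits = sum(hits)
    # Discounted gain: a hit at 0-based rank i is worth 1 / log2(i + 2).
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits))
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(positives), n)))
    return {
        "recall": num_hits / len(positives),
        "precision": num_hits / n,
        "hit_rate": 1.0 if num_hits > 0 else 0.0,
        "ndcg": dcg / ideal,
    }

metrics = evaluate_top_n(ranked=["a", "b", "c", "d"],
                         positives={"a", "d", "z"}, n=4)
better = evaluate_top_n(ranked=["a", "d", "b", "c"],
                        positives={"a", "d", "z"}, n=4)
```

The two rankings contain the same hits, so Recall, Precision and Hit Rate are identical, but the second places the hits nearer the top and therefore has the higher NDCG.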