Method for calculating item similarity based on a restricted Boltzmann machine
Technical Field
The invention relates to the technical field of recommendation systems, and in particular to a method for calculating item similarity in recommendation algorithms.
Background
In practice, each user views or understands an item from a particular angle, so items can relate to one another differently depending on the angle taken. In other words, similarity between items exists with respect to different viewing angles or points of emphasis.
For example, from a gender perspective, the movie "Avatar" is very similar to "The Expendables" because both contain many fight scenes, so from this angle the two movies are likely to seem similar to male viewers. In terms of overall theme, however, "Avatar" and "The Expendables" have little in common, whereas the theme of "Avatar" is very similar to that of "The Matrix". As this example shows, the similarity between items can be measured from multiple angles and then combined into an overall similarity.
Currently, collaborative filtering techniques that make recommendations based on item similarity and user behavior are widely used. Common item similarity calculations fall into two categories. The first is computed directly from the user-item rating matrix and includes cosine similarity, Pearson correlation, and the adjusted cosine similarity proposed by Sarwar et al. The second learns the item similarities and includes the SLIM method (Sparse Linear Methods for Top-N Recommender Systems) proposed by Ning et al. and the GLSLIM method (Local Item-Item Models for Top-N Recommendation) proposed by Christakopoulou et al.
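As an illustration of the first category, the three measures can be sketched in plain Python. This is a minimal sketch under simplifying assumptions: the rating values are hypothetical, every user has rated every item (so all entries are co-rated), and the function names are illustrative rather than taken from any library.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two item rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def pearson(a, b):
    """Pearson correlation: cosine after subtracting each item's mean rating."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine: cosine after subtracting each user's mean rating."""
    col_i, col_j = [], []
    for row in ratings:  # one row per user
        mean = sum(row) / len(row)
        col_i.append(row[i] - mean)
        col_j.append(row[j] - mean)
    return cosine(col_i, col_j)

# Hypothetical ratings: rows are users, columns are items.
ratings = [[5.0, 4.0, 1.0],
           [4.0, 5.0, 2.0],
           [1.0, 2.0, 5.0]]
item0 = [row[0] for row in ratings]
item2 = [row[2] for row in ratings]

print(cosine(item0, item2))
print(pearson(item0, item2))
print(adjusted_cosine(ratings, 0, 2))
```

All three measures compare only the overlapping rating entries of two items, which is the limitation discussed in the following paragraphs.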
In 2007, Salakhutdinov et al. proposed using a restricted Boltzmann machine for collaborative filtering, and their results demonstrated its usefulness for that task. In 2012, Hinton described how to train a usable restricted Boltzmann machine in practice, which greatly reduced the practical difficulty of training. In 2016, Christakopoulou et al. proposed an improvement to the SLIM algorithm that was very effective for Top-N recommendation and demonstrated the shortcomings of computing item similarity only from the perspective of the global user-item rating matrix, which indirectly illustrates the potential advantages of the present invention.
Both types of item similarity calculation have the following defects:
The first type (e.g., cosine similarity, the Pearson coefficient, and adjusted cosine similarity) relies mainly on the overlapping entries of the user-item rating matrix. It has two important disadvantages: first, the calculation is too simple, so the resulting similarity is inaccurate to some extent; second, when an item has few purchase or rating records, the calculation becomes unreliable.
The second type (e.g., SLIM and GLSLIM) learns a correlation matrix between items. Its disadvantages are that the relationship between items is considered only at first order, so the correlation between items is not explained from multiple angles, and that, for the sake of computational speed, the item similarity matrix is forced to be sparse, which loses part of the relationships between items.
In view of the above background, the invention provides a method for calculating item similarity using a restricted Boltzmann machine.
Disclosure of Invention
In view of the state of the prior art, the invention aims to: provide a new method and approach for calculating item similarity; calculate item similarity from multiple angles through the training of restricted Boltzmann machines; and, by means of this item similarity calculation, improve the performance of neighborhood-based collaborative filtering algorithms.
To achieve the above objects, the invention provides a method for calculating item similarity based on a restricted Boltzmann machine (RBM), which comprises the following steps:
S1: collecting the rating records of users for the items in the system, and establishing a user-item rating matrix;
S2: setting the hyper-parameters of the model according to the calculation task, the hyper-parameters comprising the number K of angles used for the item similarity calculation and the number of feature values of each angle;
S3: constructing a restricted Boltzmann machine model for each angle, and training it with the user rating record vectors of the items as input, to obtain the feature value activation probability vector of each item at each angle and the weight λ_k of each angle in the similarity measure;
S4: calculating the similarity between the subject item and the other items according to the feature value activation probability vectors of the items at all angles and the weights λ_k.
In the invention, restricted Boltzmann machine models are built from the input number K of angles for calculating item similarity and the number of feature values of each angle. Training yields the feature value activation probability vector of each item at each of the K angles and the weight λ_k of each angle in the similarity measure, and the similarity between the subject item and the other items is calculated from these vectors and weights. The resulting similarity takes multiple related angles between items into account, is no longer a simple first-order result, and can express the needs of users more accurately and comprehensively.
According to another embodiment of the invention, the calculation method performs parameter fusion on the restricted Boltzmann machine models.
According to another embodiment of the invention, the fusion of the restricted Boltzmann machines involves two levels of learning: parallel training of the restricted Boltzmann machines representing the K angles, and learning of the weight parameters λ_k.
According to another embodiment of the invention, the visible units in the visible layer of the restricted Boltzmann machine represent the user rating vector of an item.
According to another embodiment of the invention, each restricted Boltzmann machine model represents one angle for calculating item similarity, and different angles are distinguished by setting different numbers of hidden layer units.
According to another embodiment of the invention, the method builds the model from the rating record x_i ∈ {0,1} of user i for the item, where 0 represents unrated and 1 represents rated.
According to another specific embodiment of the invention, in the restricted Boltzmann machine fusion model, a sigmoid function is used as the activation function of the network in the hidden layer, and the sigmoid function has the form:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
According to another embodiment of the invention, the sigmoid function is a non-linear function.
According to another embodiment of the present invention, the similarity between the item i and the item j is calculated by the following formula:
where KL(F_k(i) || F_k(j)) denotes the KL divergence between the feature value activation probability distribution of item i and that of item j at the k-th angle.
According to another embodiment of the invention, the KL divergence in the restricted Boltzmann machine-based item similarity formula is asymmetric, so that in general S(i, j) ≠ S(j, i).
Compared with the prior art, the invention has the following beneficial effects:
1. The similarity between items is considered from multiple angles. The invention uses the strong learning ability of the restricted Boltzmann machine to learn features of different angles by setting different numbers of hidden layer units. The similarity of the feature value distributions at each angle is then computed using the KL divergence, which measures the similarity between distributions. Compared with traditional similarity calculation methods, the similarity between items can thus be calculated from multiple angles.
2. Abstract representations of items at different levels are provided. By setting different numbers of hidden layer units, the invention learns the position vectors of items in the spaces of different feature types, thereby giving the projections of the items in different feature spaces. The method can thus represent an item in spaces of different dimensions and provide abstractions of the item at different levels.
3. The non-linear relationship of similarity between items is considered. The restricted Boltzmann machine learns the dependencies among items. Compared with traditional similarity calculation methods, the model ultimately captures a non-linear relationship of item similarity, which is closer to the actual situation, so the calculation result is more accurate.
The present invention will be described in further detail with reference to the accompanying drawings.
Drawings
FIG. 1 shows the model structure of the restricted Boltzmann machine in Example 1;
FIG. 2 is a schematic diagram of calculating item similarity from three angles in Example 1;
FIG. 3 shows the feature value activation probability vectors for three angles in Example 1;
FIG. 4 shows the feature value activation probability vectors for K angles in Example 1;
FIG. 5 is a flowchart of the restricted Boltzmann machine-based item similarity calculation in Example 1.
Detailed Description
Example 1
This embodiment provides a method for calculating item similarity from multiple angles based on restricted Boltzmann machines. The restricted Boltzmann machine is a parameterized generative model with a two-layer structure. As shown in FIG. 1, the lower layer is called the visible layer and the upper layer the hidden layer; the two layers are fully connected by bidirectional connections, and there are no connections within a layer. The restricted Boltzmann machine may also be regarded as a stochastic neural network model or an undirected probabilistic graphical model.
In a restricted Boltzmann machine, each unit in the visible layer represents the value of an observed variable, and the units in the hidden layer model the interdependencies between the visible units. The restricted Boltzmann machine as a whole represents a joint probability distribution over the visible layer variables and the hidden layer variables.
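For reference, this joint distribution is conventionally written through an energy function. The expressions below are the standard form for a restricted Boltzmann machine with binary units, not formulas reproduced from this document; the symbols v (visible units), h (hidden units), W (weights), a and b (visible and hidden biases), and Z (normalizing constant) are assumed notation.

$$
E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j,
\qquad
P(v, h) = \frac{e^{-E(v, h)}}{Z},
\qquad
Z = \sum_{v', h'} e^{-E(v', h')}
$$

$$
P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big),
\qquad
P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
$$

Because there are no connections within a layer, the hidden units are conditionally independent given the visible units, which is why the hidden activation probabilities can be computed one unit at a time.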
In the item similarity calculation method of this embodiment, similarity at different angles is calculated mainly by controlling the number of hidden layer units. Take the similarity of movies as an example. From the angle of gender there are only two feature values, male and female, so the corresponding restricted Boltzmann machine has two hidden layer units, and the value of each unit represents the activation probability of one feature value. From the angle of theme, however, there are many feature values, such as science fiction, comedy, and romance, so the corresponding restricted Boltzmann machine has many hidden layer units, each corresponding to the activation probability of one theme feature value. Note that during training we do not know which feature value each unit corresponds to; different feature types can be distinguished only by the number of hidden layer units. Thus, when there are two hidden layer units, the feature may be regarded as gender, but it could equally be any other feature type with two feature values. In this method, therefore, the different angles from which item similarity is considered are distinguished by the number of hidden layer units of the restricted Boltzmann machine.
When item similarity is calculated with restricted Boltzmann machines, each restricted Boltzmann machine model represents one angle for calculating item similarity, and different angles require models with different numbers of hidden layer units. Based on an understanding of the items, K angles can be selected, corresponding to K restricted Boltzmann machine models whose hidden layers have different numbers of units, namely the number of feature values of the corresponding angle.
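The idea of one model per angle, each with its own hidden layer size, can be sketched as follows. This is a minimal illustration, not the training procedure of the invention: it assumes binary units and one-step contrastive divergence (CD-1), and the class name `TinyRBM`, the learning rate, the epoch count, and the toy item vectors are all hypothetical. Plain Python is used so that the sketch runs without external libraries.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyRBM:
    """Binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible, self.n_hidden = n_visible, n_hidden
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        """Activation probability of each hidden unit given a visible vector."""
        return [sigmoid(self.b[j] + sum(v[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.a[i] + sum(h[j] * self.W[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_step(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)
        h_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h_sample)  # reconstruction (probabilities)
        h1 = self.hidden_probs(v1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b[j] += lr * (h0[j] - h1[j])

    def fit(self, data, epochs=50, lr=0.1):
        for _ in range(epochs):
            for v in data:
                self.cd1_step(v, lr)
        return self

# Hypothetical item vectors: each row holds one item's ratings by 4 users.
items = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
hidden_sizes = [2, 3, 5]  # K = 3 angles, one hidden layer size per angle
models = [TinyRBM(4, h, seed=k).fit(items) for k, h in enumerate(hidden_sizes)]
# features[item][k] is the activation probability vector of the item at angle k.
features = [[m.hidden_probs(v) for m in models] for v in items]
```

The K models are independent of one another, so they can be trained in parallel; the sketch trains them sequentially only for brevity.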
For example, as shown in FIG. 2, the lower layer is the visible layer, whose units correspond to the ratings of the item by the n users in the system, x_i ∈ {0,1}, where 0 represents unrated and 1 represents rated. In this example, 3 angles are chosen to model the similarity of the items in the system, where F_i denotes the i-th angle and its j-th component is the activation probability of the j-th feature value at the i-th angle.
In calculating item similarity with a restricted Boltzmann machine, the non-linear sigmoid function is selected as the activation function of the model. The sigmoid function has the form:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
However, the output values of the hidden layer units do not by themselves form a valid probability distribution. Therefore, after the model is trained, when the feature value activation probability vector of an item is computed from the item's input vector, the resulting feature vector must be normalized to form a valid probability distribution.
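The normalization step can be illustrated with a short sketch. The text does not say which normalization is used; dividing each activation by the sum of all activations is assumed here, and the pre-activation values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def normalize(activations):
    """Rescale hidden unit outputs so that they sum to 1 (assumed scheme)."""
    total = sum(activations)
    if total == 0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(activations)] * len(activations)
    return [a / total for a in activations]

# Hypothetical pre-activations of three hidden units for one item.
raw = [sigmoid(z) for z in (2.0, 0.0, -2.0)]
dist = normalize(raw)

print(sum(raw))   # the raw sigmoid outputs do not sum to 1
print(dist)
```

After normalization each angle yields a proper discrete distribution, which a divergence between two such distributions requires.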
As described above, for a given item the model computes its feature value activation probabilities at three feature-type angles, which correspond to vectors in spaces of different dimensions, as shown in FIG. 3. The similarity between items at each of the three angles can then be calculated from these vectors.
Assume that the recommendation system has m users and n items and that the rating records of users for items have been collected. A triple (i, j, 1) represents a rating record of user i for item j. Filling all rating records into a matrix yields the user-item rating matrix B ∈ {0,1}^{m×n}, where B_ij = 1 indicates that user i has rated item j and B_ij = 0 indicates that user i has not rated item j. B_{·j}, the j-th column of B, represents the rating records of all users for item j.
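Building B from the triples can be sketched as follows. The function names and the sample records are hypothetical, and zero-based user and item indices are assumed.

```python
def build_rating_matrix(records, m, n):
    """Return the m x n binary matrix B with B[i][j] = 1 if user i rated item j."""
    B = [[0] * n for _ in range(m)]
    for user, item, _ in records:
        B[user][item] = 1
    return B

def item_column(B, j):
    """Column j of B: the rating records of all users for item j."""
    return [row[j] for row in B]

# Hypothetical rating records (user, item, 1) for m = 3 users and n = 4 items.
records = [(0, 0, 1), (0, 2, 1), (1, 0, 1), (2, 3, 1)]
B = build_rating_matrix(records, m=3, n=4)

print(B)
print(item_column(B, 0))
```

Each column returned by `item_column` has one entry per user and serves as the visible layer input for that item.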
Based on an understanding of the item domain in the system, K angles for calculating item similarity are specified, together with the number of feature values that each angle should have. A restricted Boltzmann machine model is then constructed for each angle, and the K restricted Boltzmann machines are trained. After training, the feature value activation probability vector of each item at each of the K angles and the weight λ_k of each angle in the similarity measure are obtained. For each item in the system we thus obtain feature value activation probability vectors at K angles, as shown in FIG. 4.
The feature value activation probability vectors at the K angles shown in FIG. 4 represent the probability distribution of feature value activation for an item at each angle, and the parameter λ_i to the right of each vector represents the weight of the i-th similarity angle specified in the system. For item i, the similarity between it and item j is:
For item j, the similarity between it and item i is:
where KL(F_k(i) || F_k(j)) denotes the KL divergence between the feature value activation distribution of item i and that of item j at the k-th angle; because the KL divergence is asymmetric, S(i, j) ≠ S(j, i) in general. The KL divergence is used here because computing it between the feature distribution of the subject item i and that of another item j amounts to approximating the feature distribution of item i by that of item j. Using KL(F_k(i) || F_k(j)) therefore emphasizes how well item j matches the feature values that have a high activation probability in the feature distribution of item i; that is, it considers whether the two items match on the most prominent features at each angle.
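The weighted, asymmetric comparison can be sketched as follows. The exact expression for S(i, j) is not shown in this text, so the sketch rests on assumptions: the per-angle divergences KL(F_k(i) || F_k(j)) are combined as a sum weighted by λ_k, and the resulting divergence is mapped to a score with exp(-d), a hypothetical choice made only so that smaller divergence gives higher similarity. The feature distributions and weights are also hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)

def weighted_divergence(feats_i, feats_j, weights):
    """Sum over angles k of lambda_k * KL(F_k(i) || F_k(j))."""
    return sum(lam * kl_divergence(fi, fj)
               for lam, fi, fj in zip(weights, feats_i, feats_j))

def similarity(feats_i, feats_j, weights):
    """Hypothetical mapping from divergence to a score in (0, 1]."""
    return math.exp(-weighted_divergence(feats_i, feats_j, weights))

# Hypothetical normalized feature distributions for K = 2 angles.
item_i = [[0.9, 0.1], [0.7, 0.2, 0.1]]
item_j = [[0.5, 0.5], [0.6, 0.3, 0.1]]
lambdas = [0.4, 0.6]  # assumed weights of the two angles

s_ij = similarity(item_i, item_j, lambdas)
s_ji = similarity(item_j, item_i, lambdas)
print(s_ij, s_ji)  # the two values differ because KL is asymmetric
```

Whatever the exact form of the formula, the asymmetry comes from the KL term alone: each term is weighted by the probabilities of the first argument, so the subject item decides which feature values matter most.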
FIG. 5 is a flowchart of the method for calculating item similarity using a restricted Boltzmann machine.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to limit the scope of the invention. It will be appreciated by those skilled in the art that changes may be made without departing from the scope of the invention, and it is intended that all matter contained in the above description or shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.