CN110188283B - Information recommendation method and system based on joint neural network collaborative filtering - Google Patents

Information recommendation method and system based on joint neural network collaborative filtering

Info

Publication number
CN110188283B
CN110188283B (application CN201910484886.4A; publication CN110188283A)
Authority
CN
China
Prior art keywords
user
neural network
article
collaborative filtering
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910484886.4A
Other languages
Chinese (zh)
Other versions
CN110188283A (en)
Inventor
蔡飞
陈洪辉
刘俊先
罗爱民
舒振
陈涛
罗雪山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201910484886.4A
Publication of CN110188283A
Application granted
Publication of CN110188283B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an information recommendation method and system based on joint neural network collaborative filtering, comprising the following steps: extracting user characteristic information and article characteristic information based on a user-article scoring matrix combined with a deep neural network A; modeling the interaction relation between the user and the article by taking the user characteristic information and the article characteristic information as input and combining a deep neural network B; and outputting a predicted value of the interaction behavior between the user and the article according to the model obtained by modeling, so as to provide data support for information recommendation. The recommendation method based on joint neural network collaborative filtering adopts a joint neural network to tightly combine the deep feature extraction process with the deep interaction behavior modeling process, and can obtain the predicted value quickly and accurately.

Description

Information recommendation method and system based on joint neural network collaborative filtering
Technical Field
The invention belongs to the field of information resource recommendation, and particularly relates to an information recommendation method and system based on joint neural network collaborative filtering.
Background
In the face of a complex and large amount of information resources, a recommendation system can effectively help users obtain the information they want. Collaborative filtering is one of the most widely used recommendation methods at present. Traditional collaborative filtering is based on matrix factorization; for example, the latent factor model LFM (Latent Factor Model) represents both the user and the article as latent vectors and then uses their inner product to represent the correlation between the user and the article, but the inner product captures only a linear correlation.
Deep learning methods have achieved good results in recommendation systems and overcome many problems of traditional methods, such as modeling complex user-item relationships and modeling users' dynamic interests. However, most current deep learning approaches in recommendation systems are used to mine auxiliary information, such as text or audiovisual information, in order to enrich the characteristic information of the items. For the user-item interaction itself, most methods still rely on conventional matrix factorization.
The RBM (Restricted Boltzmann Machine) was the first method to use a neural network to model the interaction between users and items, and performs better than traditional methods, but with only a two-layer network it cannot be regarded as a deep learning method. The CDAE (Collaborative Denoising Auto-Encoder) is also a neural-network-based method, but it is mainly used for predicting user scores. NCF (Neural Collaborative Filtering) adopts a deep neural network to model the interaction information between users and items, but does not mine the characteristic information of the users and items; moreover, neither CDAE nor NCF makes use of the users' explicit feedback. DMF (Deep Matrix Factorization) adopts a neural network to model the user-item scores and effectively extracts user features and item features, but for the user-item interaction behavior it still uses the same linear approach as LFM.
Disclosure of Invention
The invention aims to provide an information recommendation method and an information recommendation system based on joint neural network collaborative filtering, and solve the technical problems in the prior art.
The content of the invention comprises:
an information recommendation method based on joint neural network collaborative filtering is provided, which comprises the following steps:
extracting user characteristic information and article characteristic information based on a user-article scoring matrix and combined with a deep neural network A;
modeling the interaction relation between the user and the article by taking the user characteristic information and the article characteristic information as input and combining a deep neural network B;
and outputting a predicted value of the interaction behavior between the user and the article according to the model obtained by modeling to provide data support for information recommendation.
Preferably, the deep neural network A comprises two parallel networks Net_user and Net_item, and the scoring information of the user and the article is used as the input of the two networks respectively, represented as v_u = <y_{u1}, ..., y_{uN}> and v_i = <y_{1i}, ..., y_{Mi}>, wherein

$$ y_{ui} = \begin{cases} R_{ui}, & \text{if user } u \text{ has scored article } i \\ 0, & \text{otherwise.} \end{cases} $$
Preferably, a multilayer perceptron is adopted in Net_user to map this high-dimensional vector into a low-dimensional vector space, resulting in:

$$ z_u = f\big(W_X \cdots f\big(W_2\, f(W_1 v_u + b_1) + b_2\big) \cdots + b_X\big) $$

wherein W_x, b_x and f respectively represent the weight matrix, bias vector and activation function of the x-th layer perceptron, X represents the number of network layers of the deep neural network A, and z_u is the depth representation of the user features; in the same way, z_i, the depth representation of the article features, can be obtained.
Preferably, ReLU is employed as the activation function.
Preferably, the interaction between the user and the item is modeled in a combination of linear and non-linear ways.
Preferably, the interaction between the user and the item is represented as:
$$ a_{ui} = z_u \oplus z_i \quad \text{(concatenation)} \qquad \text{or} \qquad a_{ui} = z_u \odot z_i \quad \text{(element-wise multiplication)} $$
Preferably, a multilayer perceptron is used to process a_{ui}, obtaining:

$$ z_{ui} = f\big(W_Y \cdots f\big(W_2\, f(W_1 a_{ui} + b_1) + b_2\big) \cdots + b_Y\big) $$

wherein W_y, b_y and f respectively represent the weight matrix, bias vector and activation function of the y-th layer perceptron, Y represents the number of network layers of the deep neural network B, and ReLU is adopted as the activation function.
Preferably, the predicted value of the interaction behavior between the user and the article is output through a sigmoid function:

$$ \hat{y}_{ui} = \sigma\big(h^{\mathrm T} z_{ui}\big) $$

where the sigmoid function limits the output to the interval (0,1), and the vector h, which controls the weights of the different dimensions of z_{ui}, is obtained through training.
Preferably, the accuracy of the predicted value is further improved through a loss function, where the loss function is a combined loss function:

$$ L = \sum_{u\in U}\sum_{i\in V^{+}} \Big\{ \alpha\Big[-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\log\hat{y}^{+}_{ui} - \Big(1-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\Big)\log\big(1-\hat{y}^{+}_{ui}\big)\Big] + (1-\alpha)\,\tfrac{1}{|N_S|}\sum_{j\in N_S}\Big[\sigma\big(\hat{y}^{-}_{uj}-\hat{y}^{+}_{ui}\big)+\sigma\big((\hat{y}^{-}_{uj})^{2}\big)\Big]\Big\} + \Omega(\theta) $$

wherein α is a weight coefficient that controls the two losses, Max(R_u) represents the highest score among all the scores given by user u, Ω(θ) represents the regularization term, ŷ⁺_ui and ŷ⁻_uj respectively represent the predicted scores of a positive sample and a negative sample, and N_S represents the set of sampled negative samples.
By means of the method, the invention further provides an information recommendation system based on joint neural network collaborative filtering, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.
The beneficial effects of the invention include:
1. The invention provides a recommendation method based on joint neural network collaborative filtering, which adopts a joint neural network to tightly combine the deep feature extraction process and the deep interaction behavior modeling process, and can quickly and accurately obtain a predicted value.
2. The invention provides a new loss function, which considers both explicit and implicit feedback as well as point-wise and pair-wise losses, and significantly improves the accuracy of the predicted value.
Drawings
FIG. 1 is a flow chart of an information recommendation method based on joint neural network collaborative filtering according to the present invention;
FIG. 2 is a schematic structural diagram of the J-NCF model of the present invention;
FIG. 3 is a diagram illustrating the distribution of user interactions in the different data sets in accordance with a preferred embodiment of the present invention;
FIG. 4 is a graph showing a comparison of the performance of different models of the preferred embodiment of the present invention;
FIG. 5 is a graph showing the sensitivity of J-NCF to different data sparsity in a preferred embodiment of the present invention.
Detailed Description
Example 1:
traditional recommendation methods most recommendation systems are based on collaborative filtering for the first N recommended tasks, where recommendations depend on past recommendationsBehavior (rating) regardless of domain knowledge. the task of top-N recommendation is to recommend a list of articles to a user according to the history of the articles scored by the user, and to arrange the articles liked by the user at the front end of the list as much as possible. We denote the user as U ═ user1,…,userMItem is denoted I ═ item1,…,itemN},R∈iM·NIs a user-item scoring matrix, scoring is a display feedback information that shows the user's preference for items, while those items that are not scored are considered as unknown or dislike for the user, and thus are considered as implicit feedback information, both of which are important information in recommendations
There are two main traditional CF (Collaborative Filtering) approaches: neighborhood-based and latent-factor-based. Neighborhood methods rely on similarities between users or between items, which differs from the model-based, user-item approach of the present invention; accordingly, the present invention focuses on the latent factor approach. Most latent-factor studies are based on factorizing the user-item scoring matrix, for example by SVD (Singular Value Decomposition). SVD decomposes the user-item scoring matrix into the product of two lower-rank matrices, one containing the "user factors" and the other containing the "item factors"; a user's preference for an item can then be generated from the inner product plus bias terms. Another SVD-based model is SVD++, which incorporates both explicit and implicit feedback and shows improved performance over many MF (Matrix Factorization) models. This is consistent with the motivation of the present invention to combine explicit and implicit feedback in J-NCF (the joint neural network collaborative filtering based recommendation method). However, applying conventional MF methods to a sparse rating matrix is difficult. In addition, many conventional recommendation systems model user-item interactions with a linear kernel, i.e. the inner product of the user and item vectors; such a linear function may not accurately characterize users, items and their interactions, and non-linearity has potential advantages for improving the performance of a recommendation system.
Deep learning based recommendation methods: DL (Deep Learning) based recommendation systems can be divided into two categories, namely single neural network models and deep integration models, depending on whether they rely on deep learning techniques alone or combine them with a traditional recommendation model.
For the first category, the RBM is an early neural recommendation system. It uses a two-layer undirected graph to model tabular data, such as users' explicit ratings of movies. The RBM is aimed at rating prediction rather than at top-N recommendation, and its loss function only considers the observed ratings; incorporating negative samples into RBM training, which is necessary for top-N recommendation, is technically challenging. Using an auto-encoder for score prediction likewise only considers the observed ratings in the loss function, which does not guarantee good performance for top-N recommendation. In order to prevent the auto-encoder from learning the identity function and failing to generalize to unseen data, the DAE (Denoising Auto-Encoder) has been applied to learn from intentionally corrupted inputs. Most of the work listed so far focuses on explicit feedback and therefore cannot capture user preferences from implicit feedback. The CDAE extends the DAE, taking the user's partially observed implicit feedback as input. Unlike the present invention, both DAE and CDAE represent a user by the items that the user has scored, i.e. they are personalized through item-based models. For the deep integration models, CDL (Collaborative Deep Learning) is a hierarchical Bayesian model that integrates a stacked DAE into traditional probabilistic MF. It differs from the present invention in two ways: (1) it extracts the deep feature representation of an item from content information; (2) it still uses a linear kernel on the user and item vectors to model the relationship between the user and the item. Another integrated model directly relevant to the present invention is DMF. It uses a deep MF model with a neural network to map users and items into a common low-dimensional space. However, like LFM, it uses the inner product to compute the interaction between the user and the item. Unlike DMF, the present invention applies a multilayer perceptron that takes a combination of the user and item feature vectors as input to model user-item interactions. This not only makes the model more expressive than a linear inner product in modeling user-item interactions, but also helps to improve the accuracy of user and item feature extraction.
In order to solve the above-mentioned problems, the present invention provides an information recommendation method based on joint neural network collaborative filtering, referring to fig. 1, including the following steps:
and extracting user characteristic information and article characteristic information based on the user-article scoring matrix and combined with the deep neural network A.
A user-item scoring matrix may be constructed from the scoring records of each user for each item recorded by the system.
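As a concrete, non-limiting illustration (the records, array shapes and variable names below are assumptions made for this sketch, not details fixed by the patent), such a matrix and the per-user and per-item input vectors could be assembled as follows:

```python
import numpy as np

# Hypothetical raw records: (user_index, item_index, explicit score)
records = [(0, 1, 5.0), (0, 3, 3.0), (1, 0, 4.0), (2, 2, 2.0)]

num_users, num_items = 3, 4            # M users, N items
R = np.zeros((num_users, num_items))   # un-scored entries stay 0 (implicit feedback)
for u, i, score in records:
    R[u, i] = score                    # scored entries keep the explicit rating

v_u = R[0, :]   # input vector of user 0 for Net_user: <y_u1, ..., y_uN>
v_i = R[:, 1]   # input vector of item 1 for Net_item: <y_1i, ..., y_Mi>
print(R, v_u, v_i, sep="\n")
```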
And modeling the interaction relation between the user and the article by taking the user characteristic information and the article characteristic information as input and combining the deep neural network B.
The deep neural network A and the deep neural network B are both multilayer perceptron networks, because such networks can well depict the characteristics of users and articles and model the nonlinear relation between them. The deep neural network A comprises two parallel networks Net_user and Net_item; the scoring information of the user and of the article is input to the two networks, respectively. The output is a characteristic representation of the user and the article. The input of the deep neural network B is the combination of the feature vectors of the user and the article, and the output is a predicted value of the interaction behavior between the user and the article.
According to the model obtained by modeling, a predicted value of the interaction behavior between the user and the article is output; the articles can then be ranked by the user's predicted preference, which provides data support for information recommendation.
Referring to FIG. 2, FIG. 2 illustrates the structure of the J-NCF model, which contains two main neural networks: the DF (deep feature) network to mine the user and item feature information, and the DI (deep interaction) network to model the user-item interaction. The output of the DF network serves as the input of the DI network.
The DF network mainly uses two parallel networks to mine the characteristic information of the user and of the item respectively, and finally combines the outputs of these two networks; we therefore call the two networks Net_user and Net_item. We take the scoring information of the user and of the item as the input of the two networks respectively, represented as v_u = <y_{u1}, ..., y_{uN}> and v_i = <y_{1i}, ..., y_{Mi}>, wherein

$$ y_{ui} = \begin{cases} R_{ui}, & \text{if user } u \text{ has scored item } i \\ 0, & \text{otherwise.} \end{cases} $$
This high-dimensional vector is then mapped into a low-dimensional vector space using a multilayer perceptron. Because Net_user and Net_item differ only in their input, the invention takes Net_user as an example; Net_item is analogous. Using the multilayer perceptron yields:

$$ z_u = f\big(W_X \cdots f\big(W_2\, f(W_1 v_u + b_1) + b_2\big) \cdots + b_X\big) $$

wherein W_x, b_x and f respectively represent the weight matrix, bias vector and activation function of the x-th layer perceptron, and X denotes the number of network layers of the DF network. Here we use ReLU (Rectified Linear Unit) as the activation function f because it has better information expression ability and biological plausibility, and at the same time it alleviates the vanishing-gradient problem well. The output z_u is the depth representation of the user features; similarly, z_i, the depth representation of the item features, can be obtained.
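To make the DF stage concrete, the following PyTorch sketch builds the two parallel multilayer perceptrons Net_user and Net_item described above; the layer sizes, class names and the use of torch.nn are illustrative assumptions of this sketch rather than details fixed by the patent.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim, hidden_dims):
    """Stack Linear + ReLU layers: f(W_X ... f(W_1 v + b_1) ... + b_X)."""
    layers, prev = [], in_dim
    for h in hidden_dims:
        layers += [nn.Linear(prev, h), nn.ReLU()]
        prev = h
    return nn.Sequential(*layers)

class DFNetwork(nn.Module):
    """Deep feature network: maps rating vectors to low-dimensional z_u, z_i."""
    def __init__(self, num_users, num_items, hidden_dims=(256, 128, 64)):
        super().__init__()
        self.net_user = make_mlp(num_items, list(hidden_dims))  # v_u has length N
        self.net_item = make_mlp(num_users, list(hidden_dims))  # v_i has length M

    def forward(self, v_u, v_i):
        z_u = self.net_user(v_u)   # depth representation of the user features
        z_i = self.net_item(v_i)   # depth representation of the item features
        return z_u, z_i
```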
For modeling the user-item interaction behavior, most traditional methods use the dot product of the user vector and the item vector to measure the relationship between the user and the item. This, however, is a linear modeling approach, and non-linear modeling can be more expressive. We combine the feature vectors of the user and the item in two forms:

$$ a_{ui} = z_u \oplus z_i \qquad \text{or} \qquad a_{ui} = z_u \odot z_i $$

The first form directly concatenates the two vectors, which is a non-linear combination; the second form multiplies corresponding elements of the two vectors to obtain a new vector, which is a linear combination. We propose two variants of the J-NCF model based on these two forms (denoted J-NCF_c and J-NCF_m below).
Generating a_{ui} is only the first step of modeling the user-item interaction and is not sufficient to accurately characterize the interaction behavior, so we continue to process it with a multilayer perceptron:

$$ z_{ui} = f\big(W_Y \cdots f\big(W_2\, f(W_1 a_{ui} + b_1) + b_2\big) \cdots + b_Y\big) $$

wherein W_y, b_y and f respectively represent the weight matrix, bias vector and activation function of the y-th layer perceptron; here we also use ReLU as the activation function, and Y denotes the number of network layers of the DI network. Finally, the predicted value of the interaction behavior between the user and the item is output through a sigmoid function:

$$ \hat{y}_{ui} = \sigma\big(h^{\mathrm T} z_{ui}\big) $$

where the sigmoid function limits the output to the interval (0,1), and the vector h, which controls the weights of the different dimensions of z_{ui}, is obtained through training.
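Continuing the previous sketch (and reusing its imports and the make_mlp helper), an illustrative DI network could look as follows; the combination switch, layer sizes and class name are assumptions of this sketch.

```python
class DINetwork(nn.Module):
    """Deep interaction network: models the user-item interaction from (z_u, z_i)."""
    def __init__(self, feat_dim=64, hidden_dims=(64, 32), combine="concat"):
        super().__init__()
        self.combine = combine
        in_dim = 2 * feat_dim if combine == "concat" else feat_dim
        self.mlp = make_mlp(in_dim, list(hidden_dims))
        self.h = nn.Linear(hidden_dims[-1], 1, bias=False)  # the trainable vector h

    def forward(self, z_u, z_i):
        if self.combine == "concat":                 # J-NCF_c: non-linear combination
            a_ui = torch.cat([z_u, z_i], dim=-1)
        else:                                        # J-NCF_m: element-wise (linear) combination
            a_ui = z_u * z_i
        z_ui = self.mlp(a_ui)
        return torch.sigmoid(self.h(z_ui)).squeeze(-1)   # predicted value in (0, 1)
```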
The predicted values also need to be processed by a loss function. The loss functions generally used for training are point-wise and pair-wise loss functions. The point-wise loss function mainly aims at improving the accuracy of predicting a particular score value and is more suitable for the rating-prediction task, whereas the pair-wise loss function focuses on the user's different degrees of preference for different items and is more suitable for the top-N recommendation task.
Denoting a per-sample loss by l(·) and the regularization term by Ω(θ), a point-wise loss is generally computed as:

$$ L_{point} = \sum_{(u,i)\in V^{+}\cup V^{-}} l\big(\hat{y}_{ui}, Y_{ui}\big) + \Omega(\theta) $$

In particular, the squared error loss function is more suitable for explicit feedback information:

$$ L_{sqr} = \sum_{(u,i)\in V^{+}\cup V^{-}} \big(Y_{ui} - \hat{y}_{ui}\big)^{2} + \Omega(\theta) $$

For implicit feedback information, the cross-entropy loss function is more applicable:

$$ L_{ce} = -\sum_{(u,i)\in V^{+}\cup V^{-}} \Big[ Y_{ui}\log\hat{y}_{ui} + \big(1-Y_{ui}\big)\log\big(1-\hat{y}_{ui}\big) \Big] + \Omega(\theta) $$

where Y_{ui} = 1 if user u has interacted with item i and Y_{ui} = 0 otherwise, and V⁺ and V⁻ denote the sets of positive and (sampled) negative user-item pairs.
the result-pair loss function considers the different preference degrees of the user for the two items and the relative sequence of the two items, so that the result-pair loss function is more suitable for the top-N recommendation task.
A pair-wise loss function, TOP1, has also been proposed for the top-N recommendation task; it is computed as follows:

$$ L_{pair} = \frac{1}{|N_S|}\sum_{j\in N_S}\Big[ \sigma\big(\hat{y}^{-}_{uj} - \hat{y}^{+}_{ui}\big) + \sigma\big((\hat{y}^{-}_{uj})^{2}\big) \Big] $$

wherein ŷ⁺_ui and ŷ⁻_uj respectively represent the predicted scores of a positive sample and a negative sample, and N_S represents the set of sampled negative samples.
Most deep-learning-based recommendation methods adopt a point-wise loss function and leave the pair-wise loss function for later work. Both the point-wise and the pair-wise loss function have advantages and disadvantages: the point-wise loss ignores the correlation between item scores, while the pair-wise loss considers only this correlation and not the user's degree of preference for a specific item. The invention therefore combines the two loss functions to yield:

$$ L = \alpha\, l_{point\text{-}wise} + (1-\alpha)\, l_{pair\text{-}wise} $$

where α is the weighting coefficient that controls the two losses.
We further take both implicit and explicit feedback information into account, resulting in:

$$ L = \sum_{u\in U}\sum_{i\in V^{+}} \Big\{ \alpha\Big[-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\log\hat{y}^{+}_{ui} - \Big(1-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\Big)\log\big(1-\hat{y}^{+}_{ui}\big)\Big] + (1-\alpha)\,\tfrac{1}{|N_S|}\sum_{j\in N_S}\Big[\sigma\big(\hat{y}^{-}_{uj}-\hat{y}^{+}_{ui}\big)+\sigma\big((\hat{y}^{-}_{uj})^{2}\big)\Big]\Big\} + \Omega(\theta) $$

wherein the explicit score R_{ui} is normalized by Max(R_u), the highest score among all the scores given by user u, so that different score values have different effects on the loss function. We refer to the loss function proposed herein as the combined loss function.
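As one possible reading of this combined loss (the exact formulation in the patent is given only as an image, so the normalization and batching details below are assumptions), a PyTorch sketch could be:

```python
import torch

def combined_loss(y_pos, y_neg, r_pos, r_max, alpha=0.5):
    """y_pos: predicted scores of positive samples, shape (B,)
    y_neg: predicted scores of sampled negatives, shape (B, N_S)
    r_pos: explicit ratings R_ui of the positives, shape (B,)
    r_max: per-user maximum rating Max(R_u), shape (B,)"""
    eps = 1e-8
    y_norm = r_pos / r_max                                      # normalized explicit feedback
    point = -(y_norm * torch.log(y_pos + eps)
              + (1.0 - y_norm) * torch.log(1.0 - y_pos + eps))  # point-wise cross-entropy
    diff = y_neg - y_pos.unsqueeze(1)
    pair = (torch.sigmoid(diff) + torch.sigmoid(y_neg ** 2)).mean(dim=1)  # TOP1 pair-wise term
    return (alpha * point + (1.0 - alpha) * pair).mean()        # Ω(θ) can be added via weight decay
```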
The training of the J-NCF model is summarized by the algorithm below. Steps 1-4 initialize the parameters; steps 9 and 10 extract the characteristic information of users and articles; steps 11 and 12 predict the user-article interaction behavior in combination with the DI network; finally, in steps 13 and 14, the loss function and back-propagation are used to optimize the parameters of the network.
Input: Epochs: training iterations; R: the original rating matrix; U: user set; I: item set;
Output: W_user: weight matrices of Net_user; b_user: biases of Net_user; W_item: weight matrices of Net_item; b_item: biases of Net_item; W_DI: weight matrices of the DI network; b_DI: biases of the DI network.
(The numbered step-by-step listing of the algorithm is provided as an image in the original publication and is not reproduced here.)
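Since the numbered listing is only available as an image, the following sketch outlines a plausible training loop that is consistent with the description above (initialize the networks, sample negatives, extract features with the DF network, predict with the DI network, then back-propagate the combined loss). It reuses DFNetwork, DINetwork and combined_loss from the earlier sketches; the optimizer, learning rate and negative-sampling interface are assumptions.

```python
import torch

def train_jncf(df_net, di_net, R, positives, sample_negatives,
               epochs=20, lr=1e-3, alpha=0.5, num_neg=4):
    """positives: list of (u, i) pairs with R[u, i] > 0;
    sample_negatives(u, k): returns k item indices not rated by user u."""
    params = list(df_net.parameters()) + list(di_net.parameters())
    optim = torch.optim.Adam(params, lr=lr, weight_decay=1e-5)  # weight decay plays the role of Ω(θ)
    r_max = R.max(dim=1).values                                  # Max(R_u) for every user

    for _ in range(epochs):
        for u, i in positives:
            js = sample_negatives(u, num_neg)                    # negative item indices
            v_u = R[u].unsqueeze(0)                              # user rating vector, shape (1, N)
            v_pos = R[:, i].unsqueeze(0)                         # positive item rating vector, shape (1, M)
            v_negs = R[:, js].t()                                # negative item rating vectors, shape (k, M)

            z_u, z_i = df_net(v_u, v_pos)                        # DF: deep feature extraction
            y_pos = di_net(z_u, z_i)                             # DI: interaction prediction, shape (1,)
            z_u_rep, z_js = df_net(v_u.expand(len(js), -1), v_negs)
            y_neg = di_net(z_u_rep, z_js).unsqueeze(0)           # shape (1, k)

            loss = combined_loss(y_pos, y_neg,
                                 R[u, i].unsqueeze(0), r_max[u].unsqueeze(0), alpha)
            optim.zero_grad()
            loss.backward()
            optim.step()
```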
By means of the method, the invention further provides an information recommendation system based on joint neural network collaborative filtering, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.
The J-NCF model provided by the method adopts a joint neural network in which the deep feature extraction process and the deep interaction behavior modeling process are tightly combined. The deep feature extraction process extracts the feature information of users and articles from the user-article scoring matrix in combination with a deep neural network. The deep interaction behavior modeling process takes the extracted user and article features as input and models the nonlinear interaction relation between the user and the article in combination with a deep neural network. J-NCF combines these two processes and lets them continually optimize each other through training, thereby increasing the effectiveness of the recommendation.
The experimental results show that the method of the invention improves the HR@10 and NDCG@10 indices compared with existing methods on three data sets: HR@10 improves by 8.24%, 10.81% and 10.21% on the MovieLens100K, ML1M and AMovies data sets respectively, and NDCG@10 improves by 12.42%, 14.24% and 15.06% respectively. Meanwhile, the experimental results also show that the J-NCF model performs better than existing methods on sparse data sets and for inactive users.
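For reference, HR@K and NDCG@K as used in these experiments are standard top-K ranking metrics; a minimal sketch of their usual computation for a single held-out positive item per user (a leave-one-out style evaluation, assumed here purely for illustration) is:

```python
import math

def hit_ratio_at_k(ranked_items, target_item, k=10):
    """HR@K: 1 if the held-out item appears in the top-K list, else 0."""
    return 1.0 if target_item in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target_item, k=10):
    """NDCG@K with a single relevant item: 1/log2(rank+2) if ranked in the top K, else 0."""
    top_k = ranked_items[:k]
    if target_item in top_k:
        return 1.0 / math.log2(top_k.index(target_item) + 2)
    return 0.0

# Example: the held-out item 7 is ranked third
print(hit_ratio_at_k([5, 9, 7, 1], 7), ndcg_at_k([5, 9, 7, 1], 7))  # 1.0 0.5
```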
Example 2:
this embodiment mainly uses two data sets: (1) MoviesLens: which contains multiple scoring datasets for the MovieLens website. The data sets are collected at different time periods. MovieLens100K (ML100K) contains 10 ten thousand ratings of 943 users of 1,682 movies, and 3,706 movies contain more than 100 ten thousand ratings of 6040 users MovieLens 1M (ML 1M). (2) Amazon movies (AMovies), which contain click score information from amazon movie 4,607,047, are larger and more sparse than the MovieLens dataset, and are widely used for evaluation in recommendation systems. The data set information after the specific processing is shown in the following table 1:
TABLE 1
(Table 1 is provided as an image in the original publication and is not reproduced here.)
This embodiment plots the distribution of users with different numbers of interactions in all three data sets, as shown in FIG. 3. Most users in the three data sets have only a few ratings; these are considered "inactive users", while the few "active users" have many more ratings. In the ML100K data set, 61.72% of the users have fewer than 100 ratings, 32.66% have between 100 and 300 ratings, and only 5.6% have more than 300 ratings.
Data sets with different characteristics, i.e. different numbers of users and items, lead to different scores for the models considered in this embodiment. Thus, to evaluate the performance of the models on data sets with different degrees of sparsity, the number of users and items is kept the same. That is, for each of the three data sets, three versions with different sparsity are created. For each data set, a subset of users and items is first randomly selected from the main data set; this data set is indicated with the '-1' suffix. Keeping the same set of users and items, a first sparser version with the '-2' suffix is created by randomly deleting entries from the user-item matrix of the first data set. A second sparser version with the '-3' suffix is similarly created by randomly removing entries from the user-item matrix of the second data set (a sketch of this subsampling step is given after Table 2). Table 2 summarizes the characteristics of all data sets.
TABLE 2
(Table 2 is provided as an image in the original publication and is not reproduced here.)
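A minimal sketch of the random-deletion step described above (the keep fraction, random seed and NumPy representation are assumptions of this illustration):

```python
import numpy as np

def sparsify(R, keep_fraction, seed=0):
    """Randomly delete observed entries of a rating matrix so that only
    `keep_fraction` of them remain (the '-2'/'-3' style sub-data sets)."""
    rng = np.random.default_rng(seed)
    R_sparse = R.copy()
    rows, cols = np.nonzero(R_sparse)
    n_drop = int((1.0 - keep_fraction) * len(rows))
    drop = rng.choice(len(rows), size=n_drop, replace=False)
    R_sparse[rows[drop], cols[drop]] = 0.0
    return R_sparse
```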
It can be seen from Table 3 that DMF performs better than the other baselines in HR@10 and NDCG@10; therefore, only DMF is used as the best baseline for comparison in the later experiments. BPR (Bayesian Personalized Ranking) shows a clearly higher improvement over the Item-pop baseline in NDCG@10 than in HR@10, indicating that the pairwise loss performs strongly in ranking prediction. Both the NCF and the DMF model perform better than the two traditional CF models, indicating the utility of DL techniques for improving recommendation performance.
TABLE 3
(Table 3 is provided as an image in the original publication and is not reproduced here.)
The baselines are compared to the J-NCF model. Both NCF and DMF fall below the J-NCF model in HR@10 and NDCG@10. This suggests that a joint neural network structure that tightly combines deep feature learning and deep interaction modeling helps to improve recommendation performance. For the J-NCF model, regardless of the selected combination of the user and item vectors, the J-NCF implementations outperform the DMF baseline, with improvements in HR@10 ranging from 5.04% to 8.24% on the ML100K data set, from 5.62% to 10.81% on the ML1M data set, and from 7.21% to 10.21% on the AMovies data set. The improvements in NDCG@10 range from 7.22% to 12.42% on ML100K, from 6.25% to 14.24% on ML1M, and from 10.44% to 15.06% on AMovies.
By comparing J-NCF_c and J-NCF_m, we see that J-NCF_c achieves the best performance, with improvements over J-NCF_m of 3.05%, 3.51% and 2.81% in HR@10 and of 4.85%, 7.51% and 4.18% in NDCG@10 on the three data sets respectively. This is because the complex relationships between users and items can be better described by a non-linear kernel than by a linear kernel.
When the size N of the top-N recommendation list increases from 1 to 10, the overall performance in terms of HR and NDCG increases, because a larger N raises the probability that the user's preferred items are included in the recommendation list. J-NCF_hybrid consistently achieves improvements over DMF across the different list positions, demonstrating the utility of our loss function. On the ML100K data set, J-NCF_hybrid improves over J-NCF_point and J-NCF_pair by 2.68% and 7.61% respectively in HR@10, and by 3.99% and 2.36% respectively in NDCG@10; see FIG. 4. Comparing J-NCF_point and J-NCF_pair, we find that J-NCF_point is superior to J-NCF_pair in HR, while J-NCF_pair exhibits higher performance than J-NCF_point in NDCG; the pair-wise loss function performs strongly in ranking prediction, which is why the point-wise and pair-wise losses are combined in the hybrid loss function. All the models based on J-NCF_c, i.e. J-NCF_point, J-NCF_pair and J-NCF_hybrid, show better performance than DMF, which further justifies the joint neural structure. FIG. 4(a)-FIG. 4(f) show HR@N and NDCG@N on the ML100K, ML1M and AMovies data sets, respectively. J-NCF_point is the J-NCF model with the point-wise loss function, J-NCF_pair the model with the pair-wise loss function, and J-NCF_hybrid the model with the hybrid loss function.
In J-NCF_c, not only are the features of users and items learned by a DF network with multiple hidden layers, but the user-item interactions are also modeled with a multilayer perceptron in the DI network. It is therefore crucial to examine whether the depth of DL contributes to our model. This embodiment experimentally examines the performance of J-NCF_c with various numbers of layers in the DF and DI networks respectively. The results are shown in Table 4, where the i in DF-i and DI-i denotes the number of layers of the DF network and the DI network of J-NCF_c, respectively.
TABLE 4
(Table 4 is provided as an image in the original publication and is not reproduced here.)
As shown in Table 4, the recommendation performance improves as the number of layers in the DF network increases from 1 to 5 and the number of layers in the DI network increases from 1 to 4, indicating the effectiveness of the DL technique for the recommendation system. In particular, while adding more than two deep layers appears useless for DMF, J-NCF_c achieves further improvements when more layers are stacked in the DI network, in the DF network, or in both. Furthermore, stacking more layers in the DF network is more helpful for enhancing the recommendation performance than stacking more layers in the DI network. For example, on the ML100K data set, the improvement of (DF-3, DI-2) over (DF-2, DI-2) is 2.82% and 4.31% in HR@10 and NDCG@10 respectively, while the improvement of (DF-2, DI-3) over (DF-2, DI-2) is 1.05% and 2.62% respectively. When more than four layers are stacked in the DI network (DI-5), the performance of J-NCF_c does not increase further. However, stacking more layers in the DF network (DF-5) still appears helpful, and the best results for each data set are produced by the (DF-5, DI-4) configuration of J-NCF_c.
As shown in Table 5, J-NCF_c outperforms the best baseline model DMF at all activity levels, i.e. both for the "inactive" users who constitute the majority and for the relatively few "very active" users with higher numbers of ratings, on all data sets. In addition, the J-NCF_c model consistently achieves the best performance in terms of HR@10 and NDCG@10.
TABLE 5
(Table 5 is provided as an image in the original publication and is not reproduced here.)
In particular, J-NCF shows a much greater improvement over the DMF model for "inactive" users than for "very active" users. For example, on the ML100K data set, as users with more interactions are included, i.e. going from the 50% to the 90% level, the improvement drops from 11.08% to 7.85% in HR@10 and from 9.57% to 7.32% in NDCG@10. This is because "very active" users have many interactions with rarely scored items, and collaborative filtering based solely on the scoring matrix lacks the information needed to recommend such items; this suggests that J-NCF could be extended with more auxiliary information, such as content information, to explore more accurate relationships between items.
To study the sensitivity of J-NCF to different degrees of data sparsity, this embodiment also studies the recommendation performance on the sub-data sets with different sparsity described in Table 2. The results are shown in FIG. 5: all models achieve better performance when applied to a data set containing more users and items. For example, the overall performance of all models on the AMovies data set is better than on the other two data sets. Therefore, in order to study model sensitivity on data sets with different degrees of sparsity, it is necessary to keep the number of users and items at the same scale within each data set. FIGS. 5(a)-5(f) are HR@10 for ML100K, HR@10 for ML1M, HR@10 for AMovies, NDCG@10 for ML100K, NDCG@10 for ML1M, and NDCG@10 for AMovies, respectively.
In particular, for the ML100K, ML1M and AMovies data sets, the J-NCF model outperforms the baseline model DMF on all sub-data sets with different sparsity in terms of HR@10 and NDCG@10. In addition, the best model found in this embodiment, J-NCF_c, shows higher improvements on sparser data sets. For example, on the ML100K-1 subset (density 4.413%), the improvement of J-NCF_c over DMF reaches 4.91% and 9.12% in HR@10 and NDCG@10 respectively, while on the ML100K-3 subset (density 0.630%) the improvements are 7.77% and 12.02%, respectively.
In the above, J-NCF_m denotes the J-NCF variant that combines the user and item feature vectors into the input of the DI network using element-wise multiplication, i.e. with a linear kernel inside the DI network; J-NCF_c denotes the J-NCF variant that combines the user and item feature vectors into the input of the DI network using concatenation, which is a non-linear way.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. An information recommendation method based on joint neural network collaborative filtering is characterized by comprising the following steps:
extracting user characteristic information and article characteristic information based on a user-article scoring matrix and combined with a deep neural network A; the deep neural network A comprises two parallel networks Net_user and Net_item, and the scoring information of the user and the article is used as the input of the two networks respectively, represented as v_u = <y_{u1}, ..., y_{uN}> and v_i = <y_{1i}, ..., y_{Mi}>, wherein

$$ y_{ui} = \begin{cases} R_{ui}, & \text{if user } u \text{ has scored article } i \\ 0, & \text{otherwise;} \end{cases} $$
The output is a characteristic representation of the user and the item;
taking the user characteristic information and the article characteristic information as input, and combining a deep neural network B to model the interaction relation between the user and the article; the input of the deep neural network B is the combination of the characteristic vectors of the users and the articles, and the output is a predicted value of the interaction behavior between the users and the articles;
outputting a predicted value of the interaction behavior between the user and the article according to the model obtained by modeling to provide data support for information recommendation, and improving the accuracy of the predicted value through a loss function and a back-propagation algorithm, wherein the loss function is a combined loss function:

$$ L = \sum_{u\in U}\sum_{i\in V^{+}} \Big\{ \alpha\Big[-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\log\hat{y}^{+}_{ui} - \Big(1-\tfrac{R_{ui}}{\mathrm{Max}(R_u)}\Big)\log\big(1-\hat{y}^{+}_{ui}\big)\Big] + (1-\alpha)\,\tfrac{1}{|N_S|}\sum_{j\in N_S}\Big[\sigma\big(\hat{y}^{-}_{uj}-\hat{y}^{+}_{ui}\big)+\sigma\big((\hat{y}^{-}_{uj})^{2}\big)\Big]\Big\} + \Omega(\theta) $$

wherein α is a weight coefficient that controls the two losses, Max(R_u) represents the highest score among all the scores given by user u, Ω(θ) represents the regularization term, ŷ⁺_ui and ŷ⁻_uj respectively represent the predicted scores of a positive sample and a negative sample, N_S represents the set of sampled negative samples, U and V⁺ respectively represent the set of all users and the set of all articles scored by a given user, and σ represents the sigmoid function, i.e.

$$ \sigma(x) = \frac{1}{1+e^{-x}}. $$
2. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 1, wherein a multilayer perceptron is adopted in Net_user to map the high-dimensional vector into a low-dimensional vector space, resulting in:

$$ z_u = f\big(W_X \cdots f\big(W_2\, f(W_1 v_u + b_1) + b_2\big) \cdots + b_X\big) $$

wherein W_x, b_x and f respectively represent the weight matrix, bias vector and activation function of the x-th layer perceptron, X represents the number of network layers of the deep neural network A, and z_u is the depth representation of the user features.
3. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 2, wherein ReLU is used as the activation function.
4. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 1, wherein the interaction relationship between the user and the object is modeled in a linear and nonlinear combination manner.
5. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 4, wherein the interaction relationship between the user and the article is expressed as:

$$ a_{ui} = z_u \oplus z_i \qquad \text{or} \qquad a_{ui} = z_u \odot z_i $$

wherein z_i is the depth representation of the article features, ⊕ represents concatenation, and ⊙ represents element-wise multiplication.
6. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 5, wherein a multilayer perceptron is used to process a_{ui}, obtaining:

$$ z_{ui} = f\big(W_Y \cdots f\big(W_2\, f(W_1 a_{ui} + b_1) + b_2\big) \cdots + b_Y\big) $$

wherein W_y, b_y and f respectively represent the weight matrix, bias vector and activation function of the y-th layer perceptron, Y represents the number of network layers of the DI network used for deep interaction modeling in the deep neural network B, ReLU is adopted as the activation function, and z_{ui} is the depth representation of the user-article interaction.
7. The information recommendation method based on joint neural network collaborative filtering as claimed in claim 6, wherein the predicted value of the interaction behavior between the user and the article is output through a sigmoid function:

$$ \hat{y}_{ui} = \sigma\big(h^{\mathrm T} z_{ui}\big) $$

where the sigmoid function limits the output to the interval (0,1), and the vector h, which controls the weights of the different dimensions of z_{ui}, is obtained through training.
8. An information recommendation system based on joint neural network collaborative filtering, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of the preceding claims 1 to 7 when executing the computer program.
CN201910484886.4A 2019-06-05 2019-06-05 Information recommendation method and system based on joint neural network collaborative filtering Active CN110188283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910484886.4A CN110188283B (en) 2019-06-05 2019-06-05 Information recommendation method and system based on joint neural network collaborative filtering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910484886.4A CN110188283B (en) 2019-06-05 2019-06-05 Information recommendation method and system based on joint neural network collaborative filtering

Publications (2)

Publication Number Publication Date
CN110188283A CN110188283A (en) 2019-08-30
CN110188283B true CN110188283B (en) 2021-11-23

Family

ID=67720399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910484886.4A Active CN110188283B (en) 2019-06-05 2019-06-05 Information recommendation method and system based on joint neural network collaborative filtering

Country Status (1)

Country Link
CN (1) CN110188283B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991601B (en) * 2019-11-29 2023-04-21 中山大学 Neural network recommendation method based on multi-user behavior
CN110990624B (en) * 2019-12-13 2024-03-01 上海喜马拉雅科技有限公司 Video recommendation method, device, equipment and storage medium
CN111523960A (en) * 2020-03-16 2020-08-11 平安国际智慧城市科技股份有限公司 Product pushing method and device based on sparse matrix, computer equipment and medium
CN111459781B (en) * 2020-04-01 2022-08-26 厦门美图之家科技有限公司 Behavior relation determination method and device, computer equipment and readable storage medium
CN111553743A (en) * 2020-05-08 2020-08-18 深圳前海微众银行股份有限公司 Federal product recommendation method, device, equipment and computer storage medium
CN111859155A (en) * 2020-08-04 2020-10-30 深圳前海微众银行股份有限公司 Item recommendation method, equipment and computer-readable storage medium
CN112328908B (en) * 2020-11-11 2022-10-28 北京工业大学 Personalized recommendation method based on collaborative filtering
CN112395514B (en) * 2020-12-08 2022-07-29 杭州电子科技大学 Article collaborative filtering recommendation method based on memory network
CN112529414B (en) * 2020-12-11 2023-08-01 西安电子科技大学 Article scoring method based on multi-task neural collaborative filtering network
CN112765474B (en) * 2021-01-28 2023-05-02 武汉大学 Recommendation method and system based on depth collaborative filtering
CN112818256B (en) * 2021-02-05 2022-06-03 武汉大学 Recommendation method based on neural collaborative filtering
CN113158024B (en) * 2021-02-26 2022-07-15 中国科学技术大学 Causal reasoning method for correcting popularity deviation of recommendation system
CN112905895B (en) * 2021-03-29 2022-08-26 平安国际智慧城市科技股份有限公司 Similar item recommendation method, device, equipment and medium
CN113486257B (en) * 2021-07-01 2023-07-11 湖北工业大学 Coordinated filtering convolutional neural network recommendation system and method based on countermeasure matrix decomposition
CN113590964B (en) * 2021-08-04 2023-05-23 燕山大学 Deep neural network Top-N recommendation method based on heterogeneous modeling

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090093A (en) * 2016-11-22 2018-05-29 华为技术有限公司 The method and apparatus for generating recommendation results
WO2018148493A1 (en) * 2017-02-09 2018-08-16 Painted Dog, Inc. Methods and apparatus for detecting, filtering, and identifying objects in streaming video
CN108874914A (en) * 2018-05-29 2018-11-23 吉林大学 A kind of information recommendation method based on graph convolution and neural collaborative filtering
CN109740743A (en) * 2019-03-21 2019-05-10 中国人民解放军国防科技大学 Hierarchical neural network query recommendation method and device
CN109785062A (en) * 2019-01-10 2019-05-21 电子科技大学 A kind of hybrid neural networks recommender system based on collaborative filtering model

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102563752B1 (en) * 2017-09-29 2023-08-04 삼성전자주식회사 Training method for neural network, recognition method using neural network, and devices thereof
CN108809948B (en) * 2018-05-21 2020-07-10 中国科学院信息工程研究所 Abnormal network connection detection method based on deep learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090093A (en) * 2016-11-22 2018-05-29 华为技术有限公司 The method and apparatus for generating recommendation results
WO2018148493A1 (en) * 2017-02-09 2018-08-16 Painted Dog, Inc. Methods and apparatus for detecting, filtering, and identifying objects in streaming video
CN108874914A (en) * 2018-05-29 2018-11-23 吉林大学 A kind of information recommendation method based on graph convolution and neural collaborative filtering
CN109785062A (en) * 2019-01-10 2019-05-21 电子科技大学 A kind of hybrid neural networks recommender system based on collaborative filtering model
CN109740743A (en) * 2019-03-21 2019-05-10 中国人民解放军国防科技大学 Hierarchical neural network query recommendation method and device

Also Published As

Publication number Publication date
CN110188283A (en) 2019-08-30

Similar Documents

Publication Publication Date Title
CN110188283B (en) Information recommendation method and system based on joint neural network collaborative filtering
Khan et al. CNN with depthwise separable convolutions and combined kernels for rating prediction
Yuan et al. Expert finding in community question answering: a review
CN111931062B (en) Training method and related device of information recommendation model
Banerjee et al. Multi-way clustering on relation graphs
Li et al. Towards context-aware social recommendation via individual trust
Ahmadian et al. A deep learning based trust-and tag-aware recommender system
Mongia et al. Deep latent factor model for collaborative filtering
Hu et al. Movie collaborative filtering with multiplex implicit feedbacks
Wan et al. Deep matrix factorization for trust-aware recommendation in social networks
Huang et al. TNAM: A tag-aware neural attention model for Top-N recommendation
Aghdam et al. Collaborative filtering using non-negative matrix factorisation
Salehi An effective recommendation based on user behaviour: a hybrid of sequential pattern of user and attributes of product
Yue et al. Multiple auxiliary information based deep model for collaborative filtering
Grolman et al. Utilizing transfer learning for in-domain collaborative filtering
Hazrati et al. Entity representation for pairwise collaborative ranking using restricted Boltzmann machine
Wan et al. A dual learning-based recommendation approach
CN110795640B (en) Self-adaptive group recommendation method for compensating group member difference
Ifada et al. Do-rank: DCG optimization for learning-to-rank in tag-based item recommendation systems
Sangeetha et al. Predicting personalized recommendations using GNN
Yadav et al. Hybrid fuzzy collaborative filtering: an integration of item-based and user-based clustering techniques
Bu et al. Active learning in recommendation systems with multi-level user preferences
George et al. Hy-MOM: Hybrid recommender system framework using memory-based and model-based collaborative filtering framework
He et al. Dropout non-negative matrix factorization
Shi et al. Face-based age estimation using improved Swin Transformer with attention-based convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant