CN103106535B - Method for solving collaborative filtering recommendation data sparsity based on neural network - Google Patents


Info

Publication number
CN103106535B
Authority
CN
China
Prior art keywords
matrix
user
value
project
input
Prior art date
Legal status
Expired - Fee Related
Application number
CN201310055267.6A
Other languages
Chinese (zh)
Other versions
CN103106535A (en
Inventor
孙健
王晓丽
徐杰
隆克平
张毅
梁雪芬
李乾坤
姚洪哲
陈旭
陈小英
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201310055267.6A
Publication of CN103106535A
Application granted
Publication of CN103106535B
Anticipated expiration


Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for solving data sparsity in collaborative filtering recommendation based on a neural network. The method adopts a generalized regression neural network (GRNN): a network model is trained and used for score prediction so that the sparse data can be completely filled in. Before GRNN training, the input variables of the neural network are screened using the mean impact value (MIV), selecting the feature values with a large impact on the output as the effective input variables. The effective input variables are used to construct the input matrix of the GRNN; a K-fold cross-validation loop finds the optimal spread value of the GRNN; the optimal spread value and the corresponding input and output matrices are used to train the GRNN; the trained GRNN predicts scores for the sparse rating matrix; and the unrated entries of the sparse rating matrix are replaced with the predicted scores. The method can completely fill sparse recommendation data, alleviates the severe data-sparsity problem that is most prominent in existing collaborative filtering technology, and makes the recommendation results more accurate.

Description

A method for solving data sparsity in collaborative filtering recommendation based on a neural network
Technical field
The invention belongs to the fields of artificial neural networks and personalized recommendation technology, and specifically relates to a method for solving data sparsity in collaborative filtering recommendation based on a neural network.
Background technology
In the advanced information society, every industry accumulates massive amounts of information data over time, and the question of how to effectively extract useful information from such mass data has started a research boom in personalized recommendation technology. Collaborative filtering, as the main recommendation technique, has attracted wide attention and has been successfully applied in various recommender systems. However, as the number of resource categories and users keeps growing, the rating matrix used for recommendation becomes increasingly sparse, which severely degrades recommendation quality.
A neural network is a mathematical model for distributed parallel information processing that imitates the behavioral characteristics of biological neural networks. Its processing units generally fall into three classes: input units, hidden units, and output units. The input units connect the network to the outside world, the hidden layer performs the nonlinear transformation from the input space to the hidden space, and the output units produce the final network output. Common neural networks include back-propagation networks, self-organizing networks, recurrent networks, and radial basis function networks.
Compared with other neural networks, the generalized regression neural network (GRNN) has a simpler training process: only the training samples need to be determined, the corresponding network structure and the connection weights between neurons are then determined automatically, and training essentially reduces to determining the smoothing factor. The GRNN offers strong approximation ability, fast learning, robustness, fault tolerance, and nonlinear mapping capability, and is widely used in decision and control systems, structural analysis, education, signal analysis, and other fields.
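A GRNN's prediction step is essentially Nadaraya-Watson kernel regression, so it can be sketched in a few lines. The following is a minimal illustration only, not the patent's implementation; the function name and toy data are invented for the example:

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_query, spread):
    """Minimal GRNN prediction sketch (Nadaraya-Watson kernel regression).

    X_train: (n_samples, n_features); Y_train: (n_samples,);
    X_query: (n_queries, n_features); spread: the smoothing factor sigma.
    """
    # Squared Euclidean distance between every query and every training sample.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * spread ** 2))   # pattern-layer activations
    return (w @ Y_train) / w.sum(axis=1)    # summation layer: weighted mean

# With a very small spread each query effectively copies the nearest
# training target, so prediction at the training points recovers Y_train.
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([1.0, 3.0, 5.0])
recovered = grnn_predict(X, Y, X, spread=0.05)
```

Note how the spread value controls the behavior of the whole network: a small spread interpolates the training data, while a large spread averages over many samples, which is exactly why the method below searches for an optimal spread.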
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a method for solving data sparsity in collaborative filtering recommendation based on a neural network: a generalized regression neural network (GRNN) is trained and used for score prediction so that the sparse data are completely filled in, alleviating the severe data-sparsity problem of collaborative filtering.
To achieve this object, the method of the invention is characterized in that it comprises the following steps:
Step 1: For a sparse rating matrix A representing the scores of M users on N items, compute the sparsity of each user's scores over all items and the sparsity of all users' scores on each item, where the score of an item a user has not rated is uniformly replaced with 0 in A. (Throughout, the sparsity of a user or item is the fraction of its entries that are actually rated.)
Set a user sparsity threshold and an item sparsity threshold: if a user's sparsity is below the user threshold, delete that user; if an item's sparsity is below the item threshold, delete that item. Denote the resulting number of users by m and the number of items by n, and build the original rating matrix T from the scores of the m users U_i (1 ≤ i ≤ m) on the n items P_j (1 ≤ j ≤ n):
$$T = \begin{pmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{pmatrix}$$
where t_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is the score of user U_i on item P_j;
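As a concrete sketch of the filtering in step 1, the density computation and thresholding can be written as follows (a minimal illustration: the function name and the tiny demo matrix are invented; 0 marks an unrated entry, as in the text):

```python
import numpy as np

def filter_sparse(A, user_thresh, item_thresh):
    """Delete users, then items, whose sparsity (fraction of rated
    entries, with 0 marking 'not rated') falls below its threshold."""
    keep_users = (A > 0).mean(axis=1) >= user_thresh   # per-user density
    A = A[keep_users]
    keep_items = (A > 0).mean(axis=0) >= item_thresh   # per-item density
    return A[:, keep_items]

# Tiny demo: user 2 (density 0.25) and the last item (density 0) are dropped.
A = np.array([[5, 4, 0, 0],
              [3, 0, 0, 0],
              [4, 4, 5, 0]])
T = filter_sparse(A, user_thresh=0.5, item_thresh=0.5)
```

Note that item densities are recomputed after the user rows are removed, matching the order user deletion then item deletion described above.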
Step 2: Select f feature values according to the actual situation of the original rating matrix T, and compute f user feature values and f item feature values, where each user feature value is computed from that user's scores on all items, and each item feature value from all users' scores on that item; the f feature values of each user form a user feature vector, and the f feature values of each item form an item feature vector;
Construct the original input matrix I: each user feature vector is combined in turn with each of the n item feature vectors to form the columns of I, m*n columns in total, the column for user U_i and item P_j being (u_{i1}, …, u_{if}, p_{j1}, …, p_{jf})^T;
where u_ik (1 ≤ i ≤ m, 1 ≤ k ≤ f) is the k-th feature value of user U_i, and p_jk (1 ≤ j ≤ n, 1 ≤ k ≤ f) is the k-th feature value of item P_j;
Train a GRNN with the original input matrix I as the input matrix and the original rating matrix T as the output matrix, the smoothing factor of the GRNN being spread = 1, obtaining a trained GRNN;
Step 3: For the d-th input variable (1 ≤ d ≤ 2f) of the original input matrix I, i.e. its d-th row, increase or decrease the row's data by 10% of the original values while keeping the other input variables unchanged, obtaining two new input matrices I_increase_d and I_decrease_d; applying the same treatment to every input variable gives 4f input matrices in total;
Feed these 4f input matrices into the GRNN trained in step 2 for simulation prediction, obtaining 4f prediction output matrices R_increase_d and R_decrease_d; each prediction output matrix is an m*n matrix whose element r_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is the predicted score of user U_i on item P_j;
Step 4: For the prediction output matrices R_increase_d and R_decrease_d corresponding to the d-th input variable obtained in step 3, compute the mean impact value MIV_d:
$$\mathrm{MIV}_d = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \left( r_{ij\_increase\_d} - r_{ij\_decrease\_d} \right)}{m \times n}$$
where r_ij_increase_d is the score of user U_i on item P_j in the prediction matrix R_increase_d obtained after increasing the d-th input variable by 10%, and r_ij_decrease_d the corresponding score in R_decrease_d obtained after decreasing it by 10%;
Compute the mean impact value MIV_d of each of the 2f input variables, find the maximum max(MIV_d) among them, and compute the threshold Q = max(MIV_d) × 10%; select the input variables whose mean impact value exceeds Q in absolute value (|MIV_d| > Q) as the effective input variables, denoting their number by F; in the original input matrix I, keep the rows of the effective input variables and delete the rows of the others, generating the new input matrix I_w;
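Steps 3 and 4 together amount to a sensitivity analysis of the trained network. A compact sketch (illustrative only: `predict` stands in for the trained GRNN, and the toy model and data below are invented):

```python
import numpy as np

def miv_screen(I, predict, delta=0.10, q_ratio=0.10):
    """MIV screening: perturb each input variable (row of I) by +/-10%,
    average the change in the model output, and keep the variables whose
    mean impact value exceeds 10% of the largest MIV in magnitude.
    Returns a boolean mask over the input variables."""
    miv = np.empty(I.shape[0])
    for d in range(I.shape[0]):
        inc, dec = I.copy(), I.copy()
        inc[d] *= 1.0 + delta          # I_increase_d
        dec[d] *= 1.0 - delta          # I_decrease_d
        miv[d] = np.mean(predict(inc) - predict(dec))
    Q = np.max(miv) * q_ratio          # threshold Q = max(MIV_d) x 10%
    return np.abs(miv) > Q             # effective input variables

# Toy model whose output depends only on input variable 0:
mask = miv_screen(np.ones((3, 4)), lambda X: 2.0 * X[0])
```

The toy model responds only to variable 0, so only that variable's MIV is non-zero and only it survives the screening.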
Step 5: With the input matrix I_w generated in step 4 as the GRNN input matrix and the original rating matrix T as the GRNN output matrix, set the search step and search range of the spread value and run K-fold cross-validation of the GRNN; the spread value with the minimum error is the optimal spread value, whose corresponding input matrix is denoted I_s and output matrix T_s; retrain the GRNN with the optimal spread value, using I_s as the input matrix and T_s as the output matrix;
Step 6: Use the GRNN trained in step 5 to predict scores for the sparse rating matrix A: compute the feature vectors of the M users and N items of A; the GRNN input matrix has F rows, one per effective input variable, and M*N columns, each column combining the feature vector of one of the M users with the feature vector of one of the N items; prediction yields the predicted rating matrix of all M users on all N items, and the entries of A marked with the special symbol are replaced with the corresponding predicted scores.
The feature values in step 2 comprise the mean, standard deviation, range, sparsity, maximum, and minimum.
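These six feature values can be computed per user row (or, transposed, per item column) with a short helper. A sketch under the stated conventions (0 = unrated; sparsity = fraction of rated entries; the function name and demo data are invented):

```python
import numpy as np

def feature_vector(ratings, total_count):
    """Mean, standard deviation, range, sparsity, maximum and minimum of
    one user's (or item's) scores; 0 marks an unrated entry, and
    total_count is N (for a user) or M (for an item)."""
    rated = ratings[ratings > 0]
    return np.array([
        rated.mean(),                  # mean value
        rated.std(),                   # standard deviation (population form)
        rated.max() - rated.min(),     # range = max - min
        len(rated) / total_count,      # sparsity: fraction of rated entries
        rated.max(),                   # maximum
        rated.min(),                   # minimum
    ])

fv = feature_vector(np.array([5, 4, 0, 2, 0]), total_count=5)  # 3 of 5 rated
```

Whether the standard deviation uses the population or sample form is not specified in the text; the population form is assumed here.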
The object of the invention is achieved as follows:
In the method of the invention, the input variables of the network are screened: the feature values with a larger influence on the output are selected as effective input variables, while feature values with an insignificant effect are excluded, reducing the impact of secondary variables on the precision of the results. The effective input variables are used to construct the GRNN input matrix, with the original rating matrix as the output matrix; the search step and range of the spread value are set, and a K-fold cross-validation loop finds the optimal spread value of the GRNN. The GRNN is trained with the optimal spread value and the corresponding input and output matrices, the trained network predicts scores for the sparse rating matrix, and the unrated entries are replaced with the predicted scores.
With the method of the invention, sparse recommendation data can be completely filled in, the severe data-sparsity problem most prominent in existing collaborative filtering is alleviated, and the recommendation results become more accurate.
Accompanying drawing explanation
Fig. 1 is a flow chart of one embodiment of the method of the invention for solving data sparsity in collaborative filtering recommendation based on a neural network.
Embodiment
Specific embodiments of the invention are described below with reference to the accompanying drawing so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they would dilute the main content of the invention.
Embodiment
Fig. 1 is a flow chart of one embodiment of the method of the invention. As shown in Fig. 1, in this embodiment the device implementing the method comprises two main functional modules, a variable-screening module and a score-prediction module, and the embodiment comprises the following steps:
S101: data acquisition and data prediction.
For a sparse rating matrix A representing the scores of M users on N items, the score of an item a user has not rated is uniformly replaced with a special symbol. Compute the sparsity of each user's scores over all items and of all users' scores on each item, set an item sparsity threshold and a user sparsity threshold, and delete the data of any user or item whose sparsity is below its threshold, retaining the other users and items. Denote the resulting number of users by m and the number of items by n, and build the original rating matrix T from the scores of the m users U_i (1 ≤ i ≤ m) on the n items P_j (1 ≤ j ≤ n):
$$T = \begin{pmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{pmatrix}$$
where t_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is the score of user U_i on item P_j.
In this embodiment, the MovieLens data set is used to illustrate the flow of the invention; the special symbol replacing the score of an unrated item in the sparse rating matrix A is 0.
Compute the sparsity of each user's scores over all items and of all users' scores on each item. In this embodiment the item sparsity threshold is set to 0.25 and the user sparsity threshold to 0.7; screening yields a rating matrix of 60 users on 39 items. For ease of description, all matrices in this embodiment are shown in tabular form.
Table 1 is the original rating matrix T obtained from the MovieLens data set in this embodiment.
|      | P1 | P2 | P3 | … | P39 |
| U1   | 5  | 4  | 5  | … | 4   |
| U2   | 3  | 2  | 0  | … | 0   |
| U3   | 5  | 4  | 4  | … | 4   |
| U4   | 4  | 4  | 0  | … | 4   |
| U5   | 3  | 0  | 0  | … | 4   |
| …    | …  | …  | …  | … | …   |
| U60  | 4  | 3  | 5  | … | 1   |

Table 1
The original rating matrix T is fed into the variable-screening module, which processes T by training a GRNN and screening the variables with the mean impact value (MIV), finding the input variables that have a significant influence on the network output. This comprises the following concrete steps:
S102: Compute the GRNN input variables.
Select f feature values according to the actual situation of the rating matrix T, take the user feature values and the item feature values as input variables (2f input variables in total), and compute the feature vectors of the m users and n items.
In this embodiment, 6 feature values are chosen: mean, standard deviation, range, sparsity, maximum, and minimum, computed for each of the m users and n items. For the user feature values: if user U_i has rated n_i of the n items, compute the mean of those n_i scores, the standard deviation about that mean, and the maximum and minimum of the n_i scores; the range is the difference between the maximum and the minimum, and the sparsity is the percentage that n_i makes up of the number of items N in the sparse rating matrix A. For the item feature values: if m_j users have rated item P_j, compute the mean, standard deviation, range, sparsity, maximum, and minimum of the m_j scores. The feature vector of each user and each item is thus obtained.
Table 2 is the user feature vector matrix S_U.

|      | u_mean | u_sd   | u_range | u_sparsity | u_max | u_min |
| U1   | 4.5161 | 0.7980 | 3       | 0.7947     | 5     | 2     |
| U2   | 3.9286 | 0.7986 | 3       | 0.7179     | 5     | 2     |
| U3   | 4      | 0.9535 | 3       | 0.8462     | 5     | 2     |
| U4   | 4.1944 | 0.6591 | 2       | 0.9231     | 5     | 3     |
| …    | …      | …      | …       | …          | …     | …     |
| U60  | 3.7500 | 1.0564 | 4       | 0.7179     | 5     | 1     |

Table 2
Here u_mean is the mean of user U_i's scores on the n_i items rated, u_sd their standard deviation, u_range their range, u_sparsity the sparsity of U_i's ratings, u_max their maximum, and u_min their minimum.
Table 3 is the item feature vector matrix S_P.

|      | p_mean | p_sd   | p_range | p_sparsity | p_max | p_min |
| P1   | 3.9091 | 0.9586 | 3       | 0.9167     | 5     | 2     |
| P2   | 3.8654 | 0.8555 | 3       | 0.8667     | 5     | 2     |
| P3   | 3.9063 | 1.1280 | 4       | 0.5333     | 5     | 1     |
| P4   | 4.4746 | 0.7215 | 3       | 0.9833     | 5     | 2     |
| …    | …      | …      | …       | …          | …     | …     |
| P39  | 3.2973 | 1.0871 | 4       | 0.6167     | 5     | 1     |

Table 3
Here p_mean is the mean of the m_j users' scores on item P_j, p_sd their standard deviation, p_range their range, p_sparsity the sparsity of P_j's ratings, p_max their maximum, and p_min their minimum.
S103: Train the GRNN with the raw data.
Determine the original input matrix and original output matrix of the GRNN. The original input matrix I is built as follows: each row of I represents one input variable, 2f rows in total, and each column is the combination of the feature vector of one of the m users with the feature vector of one of the n items, m*n columns in total.
Here u_ik (1 ≤ i ≤ m, 1 ≤ k ≤ f) is the k-th feature value of user U_i, and p_jk (1 ≤ j ≤ n, 1 ≤ k ≤ f) is the k-th feature value of item P_j.
In this embodiment the original input matrix I comprises 12 rows, i.e. 12 input variables: 6 characterizing the user features and 6 characterizing the item features. Since the rating matrix T comprises 60 users and 39 items, I has 2340 columns.
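The column layout described above (each user feature vector stacked over each item feature vector, users varying slowest) can be sketched with numpy; the function name and the tiny feature matrices are invented for illustration:

```python
import numpy as np

def build_input_matrix(S_U, S_P):
    """Build the 2f x (m*n) original input matrix I.

    S_U: (m, f) user feature vectors; S_P: (n, f) item feature vectors.
    Column i*n + j stacks user i's features over item j's features,
    matching the column order of Table 4."""
    m, f = S_U.shape
    n = S_P.shape[0]
    users = np.repeat(S_U, n, axis=0)   # each user row repeated n times
    items = np.tile(S_P, (m, 1))        # the item rows cycled m times
    return np.hstack([users, items]).T  # shape (2f, m*n)

# m = 2 users with f = 2 features, n = 3 items -> I is 4 x 6.
I = build_input_matrix(np.arange(4).reshape(2, 2),
                       np.arange(6).reshape(3, 2))
```

In the embodiment the same construction with m = 60, n = 39 and f = 6 gives the 12 x 2340 matrix of Table 4.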
Table 4 shows the original input matrix I formed in this embodiment.

|            | 1st col | 2nd col | 3rd col | … | 2340th col |
| u_mean     | 4.5161  | 4.5161  | 4.5161  | … | 3.7500     |
| u_sd       | 0.7980  | 0.7980  | 0.7980  | … | 1.0564     |
| u_range    | 3       | 3       | 3       | … | 4          |
| u_sparsity | 0.7947  | 0.7947  | 0.7947  | … | 0.7179     |
| u_max      | 5       | 5       | 5       | … | 5          |
| u_min      | 2       | 2       | 2       | … | 1          |
| p_mean     | 3.9091  | 3.8654  | 3.9063  | … | 3.2973     |
| p_sd       | 0.9586  | 0.8555  | 1.1280  | … | 1.0871     |
| p_range    | 3       | 3       | 4       | … | 4          |
| p_sparsity | 0.9167  | 0.8667  | 0.5333  | … | 0.6167     |
| p_max      | 5       | 5       | 5       | … | 5          |
| p_min      | 2       | 2       | 1       | … | 1          |

Table 4
Train the GRNN with the original input matrix I as the input matrix and the rating matrix T as the output matrix; the smoothing factor of the GRNN is left at its default value, spread = 1, giving a trained GRNN.
S104: Simulate with increased and decreased input variables.
For the d-th input variable (1 ≤ d ≤ 2f) of the original input matrix I, increase or decrease the data of row d by 10% of the original values while keeping the other input variables unchanged, obtaining two new input matrices I_increase_d and I_decrease_d. Treating every input variable the same way gives 4f input matrices. Feed these 4f matrices into the GRNN trained in step S103 for simulation prediction, obtaining 4f prediction output matrices R_increase_d and R_decrease_d. Like the rating matrix T, each prediction output matrix is an m*n matrix whose element r_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is the predicted score of user U_i on item P_j. In this embodiment, 24 prediction output matrices (one increase/decrease pair per input variable) are obtained in total.
S105: Compute the MIV of the prediction output matrices.
For the prediction output matrices R_increase_d and R_decrease_d corresponding to the d-th input variable from step S104, compute the element-wise differences and average them over all elements to obtain the mean impact value MIV_d:
$$\mathrm{MIV}_d = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \left( r_{ij\_increase\_d} - r_{ij\_decrease\_d} \right)}{m \times n}$$
where r_ij_increase_d is the score of user U_i on item P_j in the prediction matrix obtained after increasing the d-th input variable by 10%, and r_ij_decrease_d the corresponding score after decreasing it by 10%. This gives 2f mean impact values MIV_d in total; in this embodiment, 12.
Table 5 shows the mean impact value MIV_d of each input variable in this embodiment.

| u_mean | u_sd    | u_range | u_sparsity | u_max  | u_min  |
| 0.1158 | -0.0031 | -0.0328 | 0.0041     | 0.0055 | 0.0232 |

| p_mean | p_sd    | p_range | p_sparsity | p_max | p_min  |
| 0.1552 | -0.0076 | -0.0250 | 0.0263     | 0     | 0.0049 |

Table 5
S106: Variable screening.
The mean impact value MIV_d represents the weight of the d-th input variable: the larger its magnitude, the greater the influence of that input variable on the output result, and the smaller its magnitude, the smaller the influence. The input variables are screened accordingly: find the maximum max(MIV_d) of the 2f mean impact values, take 10% of this maximum as the selection threshold Q = max(MIV_d) × 10%, and select the input variables whose mean impact value exceeds Q in absolute value (|MIV_d| > Q) as the effective input variables, denoting their number by F.
In this embodiment, Table 5 shows that the largest of the 12 mean impact values is that of p_mean, 0.1552, so the threshold Q is 0.01552; selecting the input variables with |MIV_d| > Q yields 6 effective input variables: u_mean, u_range, u_min, p_mean, p_range, and p_sparsity.
After the variable-screening module has selected the effective input variables, the score-prediction module uses the GRNN to perform collaborative filtering prediction, comprising the following concrete steps:
S107: Construct the GRNN input matrix.
In the original input matrix I, keep the rows of the effective input variables screened out in step S106 and delete the rows of the other input variables, generating the new input matrix I_w.
Table 6 is the new input matrix I_w generated in this embodiment.

|            | 1st col | 2nd col | 3rd col | … | 2340th col |
| u_mean     | 4.5161  | 4.5161  | 4.5161  | … | 3.7500     |
| u_range    | 3       | 3       | 3       | … | 4          |
| u_min      | 2       | 2       | 2       | … | 1          |
| p_mean     | 3.9091  | 3.8654  | 3.9063  | … | 3.2973     |
| p_range    | 3       | 3       | 4       | … | 4          |
| p_sparsity | 0.9167  | 0.8667  | 0.5333  | … | 0.6167     |

Table 6
S108: Train the network and find the optimal spread value.
With the input matrix I_w generated in step S107 as the GRNN input matrix and the rating matrix T as the GRNN output matrix, set the search step and range of the spread value and run K-fold cross-validation of the GRNN; the spread value with the minimum error is the optimal spread value, whose corresponding input matrix is denoted I_s and output matrix T_s.
In this embodiment, the input matrix I_w is split into an input training set and an input test set: the first 80% of the columns form the training set I_w_train and the remaining 20% the test set I_w_test; likewise, the first 80% of the columns of the rating matrix T form the output training set T_train and the remaining 20% the output test set T_test. With I_w_train as the input matrix and T_train as the output matrix, run K-fold cross-validation of the GRNN to find the optimal spread value: the search range of spread is 0 to 2, each loop increases spread by the step 0.1, and in each loop the mean squared error (MSE) between the network output and T_train is computed; after the loop ends, the spread value with the minimum MSE is the optimal spread value. In this embodiment the optimal spread value obtained is 0.5, with corresponding input matrix I_s and output matrix T_s.
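The spread search of step S108 reduces to a small grid search over candidate spread values. A self-contained sketch (illustrative only: a plain held-out split rather than full K-fold, an inline kernel-regression GRNN, and invented toy data):

```python
import numpy as np

def best_spread(X_train, Y_train, X_val, Y_val, grid):
    """Return the spread value from `grid` giving the smallest validation
    MSE for a GRNN trained on (X_train, Y_train)."""
    def grnn(Xq, spread):
        # Inline GRNN / kernel-regression prediction.
        d2 = ((Xq[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * spread ** 2))
        return (w @ Y_train) / w.sum(axis=1)
    errors = [np.mean((grnn(X_val, s) - Y_val) ** 2) for s in grid]
    return grid[int(np.argmin(errors))]

# Search range 0-2 with step 0.1, as in the embodiment.
grid = np.round(np.arange(0.1, 2.01, 0.1), 1)
X_tr = np.array([[0.0], [1.0], [2.0], [3.0]])
Y_tr = np.array([0.0, 1.0, 2.0, 3.0])
s_opt = best_spread(X_tr, Y_tr,
                    np.array([[0.5], [1.5]]), np.array([0.5, 1.5]), grid)
```

For this linear toy data a small spread interpolates best; on real rating data a moderate spread (0.5 in the embodiment) wins because it smooths noise.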
S109: Train the network with the optimal spread value.
Retrain the GRNN with the optimal spread value obtained in step S108, using the input matrix I_s obtained in step S108 as the input matrix and T_s as the output matrix.
S110: Score prediction.
Use the GRNN trained in step S109 to predict scores for the sparse rating matrix A that is to be filled: compute the feature vectors of the M users and N items of A; the input matrix for GRNN score prediction has F rows, one per effective input variable, and M*N columns, each column combining the feature vector of one of the M users with the feature vector of one of the N items; prediction yields the predicted rating matrix of all M users on all N items, and the entries of A marked with the special symbol are replaced with the corresponding predicted scores, realizing score prediction and complete filling of the sparse rating matrix A.
In this embodiment, the trained GRNN predicts scores for the input test set I_w_test, yielding the score prediction matrix shown in Table 7.

|      | P1 | P2 | P3 | P4 | P5 | … | P39 |
| U49  | 3  | 3  | 2  | 3  | 3  | … | 2   |
| U50  | 4  | 4  | 2  | 5  | 4  | … | 3   |
| U51  | 4  | 4  | 3  | 4  | 4  | … | 3   |
| U52  | 3  | 3  | 2  | 3  | 4  | … | 2   |
| U53  | 4  | 4  | 2  | 5  | 4  | … | 3   |
| U54  | 4  | 4  | 2  | 4  | 4  | … | 3   |
| U55  | 4  | 3  | 2  | 4  | 4  | … | 3   |
| U56  | 4  | 4  | 2  | 5  | 4  | … | 3   |
| U57  | 3  | 3  | 2  | 4  | 4  | … | 2   |
| U58  | 3  | 3  | 2  | 3  | 4  | … | 2   |
| U59  | 3  | 3  | 2  | 4  | 4  | … | 2   |
| U60  | 3  | 3  | 2  | 3  | 4  | … | 2   |

Table 7
Table 8 is the real rating matrix T_test.

|      | P1 | P2 | P3 | P4 | P5 | … | P39 |
| U49  | 3  | 4  | 5  | 4  | 5  | … | 0   |
| U50  | 5  | 5  | 5  | 5  | 5  | … | 0   |
| U51  | 4  | 3  | 0  | 5  | 5  | … | 4   |
| U52  | 4  | 4  | 3  | 3  | 1  | … | 3   |
| U53  | 5  | 4  | 0  | 5  | 4  | … | 5   |
| U54  | 3  | 5  | 4  | 4  | 5  | … | 5   |
| U55  | 3  | 3  | 4  | 4  | 5  | … | 0   |
| U56  | 5  | 4  | 0  | 5  | 4  | … | 0   |
| U57  | 4  | 4  | 4  | 5  | 0  | … | 0   |
| U58  | 2  | 5  | 5  | 4  | 5  | … | 0   |
| U59  | 4  | 4  | 5  | 5  | 5  | … | 2   |
| U60  | 4  | 3  | 5  | 3  | 0  | … | 1   |

Table 8
In this embodiment, the mean absolute percentage error (MAPE) is used to measure the accuracy of the predicted scores; its formula is:

$$\mathrm{MAPE} = \frac{1}{num} \sum_{x=1}^{num} \left| \frac{observed_x - predicted_x}{observed_x} \right| \times 100\%$$

where num is the number of non-zero entries in the real rating matrix T_test, i.e. the number of items the users actually rated, observed_x is the actual score in T_test, and predicted_x is the corresponding score in the prediction matrix produced by the network for the input test set I_w_test.
The computed MAPE in this embodiment is 24.86%, which shows that score prediction with the method of the invention achieves good precision; the method can effectively fill a sparse matrix and thus address the data-sparsity problem in collaborative filtering.
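The MAPE computation over the rated (non-zero) entries of T_test can be sketched as follows (illustrative only; the tiny matrices are invented, not the embodiment's data):

```python
import numpy as np

def mape(T_test, R_pred):
    """Mean absolute percentage error over rated entries only
    (0 marks 'not rated' in the real rating matrix T_test)."""
    mask = T_test > 0
    obs, pred = T_test[mask], R_pred[mask]
    return float(np.mean(np.abs(obs - pred) / obs) * 100.0)

err = mape(np.array([[4.0, 0.0], [2.0, 5.0]]),
           np.array([[3.0, 9.9], [2.0, 4.0]]))  # the unrated entry is ignored
```

Masking out the zero entries both matches the definition of num above and avoids division by zero.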
Although illustrative embodiments of the invention have been described above so that those skilled in the art can understand it, it should be clear that the invention is not restricted to the scope of those embodiments. To those skilled in the art, various changes are apparent as long as they fall within the spirit and scope of the invention as determined and limited by the appended claims, and all innovations and creations using the concept of the invention fall within the scope of protection.

Claims (2)

1. A method for solving data sparsity in collaborative filtering recommendation based on a neural network, characterized in that it comprises the following steps:
Step 1: For a sparse rating matrix A representing the scores of M users on N items, compute the sparsity of each user's scores over all items and the sparsity of all users' scores on each item, where the score of an item a user has not rated is uniformly replaced with 0 in A;
Set a user sparsity threshold and an item sparsity threshold: if a user's sparsity is below the user threshold, delete that user; if an item's sparsity is below the item threshold, delete that item; denote the resulting number of users by m and the number of items by n, and build the original rating matrix T from the scores of the m users U_i (1 ≤ i ≤ m) on the n items P_j (1 ≤ j ≤ n):
$$T = \begin{pmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{pmatrix};$$
Wherein t ij, 1≤i≤m, 1≤j≤n represents user U ito project P jscoring;
Step 2: the actual conditions according to iotave evaluation matrix T select f eigenwert, and calculate f user characteristics value and f item feature value, wherein each user characteristics value is according to the score calculation of user to all items, each item feature value according to all users to this project score calculation; F the eigenwert of each user forms the user characteristics vector of row, and f eigenwert of each project forms the item feature vector of row;
Construct original input matrix I, a m user characteristics vector to carry out being combined as row with n item feature vector successively, amount to m*n row and form original input matrix I:
Wherein, u ik, 1≤i≤m, 1≤k≤f represents user U ia corresponding kth eigenwert, p jk, 1≤j≤n, 1≤k≤f represents project P ja corresponding kth eigenwert;
With original input matrix I be input matrix, iotave evaluation matrix T for output matrix training GRNN network, now smoothing factor spread value=1 of GRNN network, the GRNN network of having been trained;
Step 3: Each row of the original input matrix I represents one input variable, 2f rows in total. For the d-th input variable (1 ≤ d ≤ 2f) in the original input matrix I, that is, the data in the d-th row, increase or decrease its values by 10% from their original values while keeping the other input variables unchanged, obtaining two new input matrices I_increase_d and I_decrease_d; apply the same processing to every input variable, obtaining 4f input matrices in total;
Use these 4f input matrices as input matrices of the GRNN network trained in Step 2 and carry out simulation prediction, obtaining 4f simulation prediction output matrices R_increase_d and R_decrease_d; each simulation prediction output matrix is an m*n matrix whose element r_ij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is the predicted rating of user U_i on item P_j obtained by simulation prediction;
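The ±10% perturbation of Step 3 can be sketched as a simple row-scaling operation; the 10% step size follows the claim, everything else is a minimal illustration.

```python
import numpy as np

def perturbed_inputs(I, d, delta=0.10):
    """Return (I_increase_d, I_decrease_d): copies of the original input
    matrix I with only row d scaled up/down by `delta`."""
    I_inc, I_dec = I.astype(float).copy(), I.astype(float).copy()
    I_inc[d] *= 1.0 + delta
    I_dec[d] *= 1.0 - delta
    return I_inc, I_dec
```

Calling this for every d in 0..2f-1 produces the 4f perturbed matrices that are then pushed through the trained GRNN.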
Step 4: For the simulation output matrices R_increase_d and R_decrease_d corresponding to the d-th input variable obtained in Step 3, calculate the Mean Impact Value MIV_d:
MIV_d = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \left( r_{ij\_increase\_d} - r_{ij\_decrease\_d} \right)}{m \times n}
where r_ij_increase_d denotes the rating of user U_i on item P_j in the simulation output matrix R_increase_d obtained by increasing the d-th input variable by 10%, and r_ij_decrease_d denotes the rating of user U_i on item P_j in the simulation output matrix R_decrease_d obtained by decreasing the d-th input variable by 10%;
Calculate the Mean Impact Value MIV_d corresponding to each of the 2f input variables, find the maximum value max(MIV_d) among the 2f Mean Impact Values, and calculate the threshold Q = max(MIV_d) × 10%; select the input variables whose MIV_d is greater than the threshold Q as effective input variables, and denote the number of effective input variables as F; in the original input matrix I, retain the rows of data corresponding to the effective input variables and delete the rows of data corresponding to the other input variables, generating a new input matrix I_w;
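The screening of Step 4 follows directly from the MIV formula; in this sketch the 2f pairs of prediction matrices are assumed to be stacked into two 3-D arrays of shape (2f, m, n).

```python
import numpy as np

def select_by_miv(R_inc, R_dec):
    """MIV_d = mean over all m*n entries of (R_increase_d - R_decrease_d);
    keep the variables whose MIV exceeds Q = max(MIV) x 10%.
    R_inc, R_dec: stacked prediction matrices of shape (2f, m, n)."""
    miv = (R_inc - R_dec).mean(axis=(1, 2))
    Q = miv.max() * 0.10                  # threshold from the largest impact
    effective = np.flatnonzero(miv > Q)   # indices of effective input variables
    return miv, effective
```

The reduced input matrix I_w is then simply `I[effective]`, keeping only the F rows whose perturbation measurably moves the predicted ratings.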
Step 5: Use the input matrix I_w generated in Step 4 as the input matrix of the GRNN network and the initial rating matrix T as the output matrix of the GRNN network, set the search step and search range of the spread value, and validate the GRNN network by K-fold cross-validation; select the spread value with the minimum error as the optimal spread value, and denote the input matrix corresponding to the optimal spread value as I_s and the output matrix as T_s; with the optimal spread value, retrain the GRNN network using I_s as the input matrix and T_s as the output matrix;
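The spread search of Step 5 can be sketched as a grid search scored by K-fold cross-validation. The candidate grid, the fold count, and the use of mean squared error as the error measure are assumptions of this sketch; a minimal GRNN predictor is defined inline so the fragment is self-contained.

```python
import numpy as np

def grnn_predict(Xtr, Ytr, Xq, spread):
    """Minimal GRNN: Gaussian-kernel weighted average of the training outputs."""
    d2 = ((Xq[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * spread ** 2))
    return (w @ Ytr) / np.maximum(w.sum(axis=1, keepdims=True), 1e-300)

def best_spread(X, Y, spreads, k=5, seed=0):
    """Grid-search the smoothing factor by K-fold cross-validation,
    returning the spread with the lowest mean squared prediction error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    errs = []
    for s in spreads:
        mse = 0.0
        for fold in folds:
            mask = np.ones(len(X), dtype=bool)
            mask[fold] = False           # hold out one fold, train on the rest
            pred = grnn_predict(X[mask], Y[mask], X[fold], s)
            mse += float(((pred - Y[fold]) ** 2).mean())
        errs.append(mse / k)
    return spreads[int(np.argmin(errs))], errs
```

Because a GRNN has no iterative training, each candidate spread is evaluated by a single pass over the held-out folds, which keeps the search cheap.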
Step 6: Use the GRNN network trained in Step 5 to perform rating prediction on the sparse rating matrix A. Compute the feature vectors corresponding to the M users and N items contained in the sparse rating matrix A; the input matrix of the GRNN network has F rows, each row representing one effective input variable, and M*N columns in total, each column being the combination of the feature vector of one of the M users and the feature vector of one of the N items; after rating prediction, the predicted rating matrix of all M users on all N items is obtained, and the rating values represented by the special symbol in the sparse rating matrix A are replaced with the corresponding predicted rating values.
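The final replacement in Step 6 touches only the unrated entries of A; a sketch, assuming (per Step 1) that 0 is the placeholder for an unrated cell and that the GRNN predictions have already been reshaped into an M x N matrix:

```python
import numpy as np

def fill_sparse(A, R_pred):
    """Replace only the unrated (zero) entries of the sparse rating matrix A
    with the corresponding predicted ratings from R_pred (same shape)."""
    filled = A.astype(float).copy()
    mask = A == 0                 # positions of the unrated placeholder
    filled[mask] = R_pred[mask]
    return filled
```

Known ratings are left untouched, so the densified matrix agrees with the observed data and only the gaps are imputed, which is what makes the result usable as input to a standard collaborative filtering recommender.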
2. The method for solving collaborative filtering recommendation data sparsity based on a neural network according to claim 1, characterized in that the f feature values in Step 2 are 6 feature values, namely the mean, standard deviation, range, sparsity, maximum value and minimum value.
CN201310055267.6A 2013-02-21 2013-02-21 Method for solving collaborative filtering recommendation data sparsity based on neural network Expired - Fee Related CN103106535B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310055267.6A CN103106535B (en) 2013-02-21 2013-02-21 Method for solving collaborative filtering recommendation data sparsity based on neural network


Publications (2)

Publication Number Publication Date
CN103106535A CN103106535A (en) 2013-05-15
CN103106535B true CN103106535B (en) 2015-05-13

Family

ID=48314380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310055267.6A Expired - Fee Related CN103106535B (en) 2013-02-21 2013-02-21 Method for solving collaborative filtering recommendation data sparsity based on neural network

Country Status (1)

Country Link
CN (1) CN103106535B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103353872B (en) * 2013-06-03 2017-03-01 大连理工大学 A kind of teaching resource personalized recommendation method based on neutral net
CN104854580B (en) * 2013-09-10 2018-09-28 华为技术有限公司 A kind of recommendation method and apparatus
TWI613604B (en) * 2013-10-15 2018-02-01 財團法人資訊工業策進會 Recommandation system, method and non-volatile computer readable storage medium for storing thereof
CN104008164A (en) * 2014-05-29 2014-08-27 华东师范大学 Generalized regression neural network based short-term diarrhea multi-step prediction method
CN104632188A (en) * 2014-12-04 2015-05-20 杭州和利时自动化有限公司 Prediction method and device for single oil well yield
US11423323B2 (en) * 2015-09-02 2022-08-23 Qualcomm Incorporated Generating a sparse feature vector for classification
CN107622427B (en) * 2016-07-13 2021-04-06 阿里巴巴集团控股有限公司 Deep learning method, device and system
CN107577736B (en) * 2017-08-25 2021-12-17 武汉数字智能信息科技有限公司 File recommendation method and system based on BP neural network
CN109509051B (en) * 2018-09-12 2020-11-13 北京奇艺世纪科技有限公司 Article recommendation method and device
CN111598627A (en) * 2020-05-26 2020-08-28 揭阳职业技术学院 Personalized advertisement pushing method for elevator media terminal
CN111693975A (en) * 2020-05-29 2020-09-22 电子科技大学 MIMO radar sparse array design method based on deep neural network
CN112749345B (en) * 2021-02-09 2023-11-14 上海海事大学 K neighbor matrix decomposition recommendation method based on neural network

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2005048185A1 (en) * 2003-11-17 2005-05-26 Auckland University Of Technology Transductive neuro fuzzy inference method for personalised modelling
CN101694652A (en) * 2009-09-30 2010-04-14 西安交通大学 Network resource personalized recommended method based on ultrafast neural network


Non-Patent Citations (2)

Title
Collaborative Filtering Using a Regression-Based Approach; Slobodan Vucetic, Zoran Obradovic; Knowledge and Information Systems; 2005-01-01; Vol. 7, No. 1; full text *
Using BP Neural Networks to Alleviate the Sparsity Problem of Collaborative Filtering Recommendation Algorithms; Zhang Feng, Chang Huiyou; Journal of Computer Research and Development; 2006-04-30; Vol. 43, No. 4; full text *

Also Published As

Publication number Publication date
CN103106535A (en) 2013-05-15

Similar Documents

Publication Publication Date Title
CN103106535B (en) Method for solving collaborative filtering recommendation data sparsity based on neural network
Cadenas et al. Short term wind speed forecasting in La Venta, Oaxaca, México, using artificial neural networks
US20170308934A1 (en) Management method of power engineering cost
CN103226741B (en) Public supply mains tube explosion prediction method
CN101480143B (en) Method for predicating single yield of crops in irrigated area
CN104835103A (en) Mobile network health evaluation method based on neural network and fuzzy comprehensive evaluation
CN108446794A (en) One kind being based on multiple convolutional neural networks combination framework deep learning prediction techniques
CN107169628A (en) A kind of distribution network reliability evaluation method based on big data mutual information attribute reduction
CN105512404B (en) Time-varying reliability Global sensitivity analysis method based on chaos polynomial expansion
CN112149873B (en) Low-voltage station line loss reasonable interval prediction method based on deep learning
CN104794361A (en) Comprehensive evaluation method for water flooding oil reservoir development effect
CN109647899B (en) Method for forecasting power consumption of multi-specification rolled pieces in hot rolling and finish rolling process of strip steel
CN107705556A (en) A kind of traffic flow forecasting method combined based on SVMs and BP neural network
Dong et al. Applying the ensemble artificial neural network-based hybrid data-driven model to daily total load forecasting
CN104865827B (en) A kind of pumping production optimization method based on multi-state model
CN107679660A (en) Based on SVMs by when building energy consumption Forecasting Methodology
CN104050547A (en) Non-linear optimization decision-making method of planning schemes for oilfield development
CN105893669A (en) Global simulation performance predication method based on data digging
CN103853939A (en) Combined forecasting method for monthly load of power system based on social economic factor influence
CN107644297A (en) A kind of energy-saving of motor system amount calculates and verification method
CN104881718A (en) Regional power business index constructing method based on multi-scale leading economic indicators
CN105160496A (en) Comprehensive evaluation method of enterprise electricity energy efficiency
CN106600037A (en) Multi-parameter auxiliary load forecasting method based on principal component analysis
Mathew et al. Demand forecasting for economic order quantity in inventory management
CN110880044A (en) Markov chain-based load prediction method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150513

Termination date: 20190221