CN113537619A - Power grid user evaluation sparse matrix scoring prediction method - Google Patents

Power grid user evaluation sparse matrix scoring prediction method

Info

Publication number
CN113537619A
CN113537619A
Authority
CN
China
Prior art keywords
matrix
user
scoring
similarity
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110868792.4A
Other languages
Chinese (zh)
Inventor
杨强
张云菊
郭明
史虎军
张玉罗
司胜文
杜秀举
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd filed Critical Guizhou Power Grid Co Ltd
Priority to CN202110868792.4A priority Critical patent/CN113537619A/en
Publication of CN113537619A publication Critical patent/CN113537619A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 - Market modelling; Market analysis; Collecting market data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 - Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power grid user evaluation sparse matrix scoring prediction method. The method calculates the semantic similarity between user evaluations by constructing an ontology-concept hierarchy tree and performs partial score prediction and filling in the sparse matrix for user evaluations whose similarity exceeds a similarity threshold. Based on the partially predicted scoring matrix, the matrix is then reduced in dimension and decomposed according to matrix factorization theory, and scores are further predicted for the user evaluations whose similarity is at or below the threshold, so that the missing score values in the sparse matrix are filled by prediction. This alleviates the overfitting that a matrix factorization algorithm suffers when the matrix is extremely sparse and improves the quality of the collaborative filtering recommendation algorithm.

Description

Power grid user evaluation sparse matrix scoring prediction method
Technical Field
The invention belongs to the technical field of software, and particularly relates to a power grid user evaluation sparse matrix scoring prediction method.
Background
An intelligent question-answering system organizes accumulated, unordered corpus information in an orderly and scientific way and builds a knowledge-based classification model; the classification model can then guide newly added consultation and service corpora, saving human resources and increasing the automation of information processing. The system retrieves answers to user questions from a knowledge base or the Internet and returns them directly to the user. Within such systems, collaborative filtering is currently the most studied and most effective personalized recommendation technique: it generates recommendations from a user's historical selections and similarity relations by collecting the evaluations of other users with the same or similar interests, and it has made important progress in both theoretical research and engineering practice.
In a big-data environment, the historical data generated by power grid user behaviour grows rapidly. Compared with the total number of users and of user evaluations, the data generated by any single user-evaluation pair is very small, and new users and new evaluations continuously enter the system and create new associations, so the data set is highly sparse.
Research shows that when user scoring data is sparse, the performance of a recommendation system drops sharply. For the sparsity of power grid user behaviour scores, one approach compresses the dimensionality of the user scoring matrix by iteratively learning the latent-variable distribution of user evaluations, and another reduces the dimensionality of the evaluation space through singular value decomposition; however, dimensionality reduction loses information and cannot guarantee the recommendation effect. Filling the user-item scoring matrix based on concept semantic similarity achieves good results, but such an algorithm can only predict scores for user evaluations with high similarity and cannot predict scores for the low-similarity user evaluations in the matrix.
Semantic similarity based on the degree of semantic overlap captures the common part between concepts but ignores the differing part. After the content similarity between user evaluations is obtained, several evaluations with higher similarity are selected for score prediction, and the predicted scores are used to fill the empty entries of the user-evaluation matrix, reducing its sparsity. Because the attribute descriptions of evaluations in different categories differ greatly, a semantics-based method cannot compute similarity across categories and therefore cannot make cross-category score predictions. In addition, semantics-based similarity computation requires extracting the attribute features of user evaluations and designing domain knowledge, so its scope of application is narrow.
SVD (Singular Value Decomposition) is a matrix factorization algorithm that can effectively extract key features and reveal the internal structure of a matrix; the decomposition is illustrated in FIG. 1. Sarwar et al. introduced SVD into collaborative filtering: the user evaluation scores are decomposed into user and evaluation feature-vector matrices, and the singular values of the scoring matrix are used, through the latent relations between users and evaluations, to extract the essential features. The SVD algorithm improves the recommendation quality of collaborative filtering on a sparse scoring matrix, and the eigenvalues of the filled scoring matrix differ little from those before filling; however, because the data in a recommendation system is sparse, the SVD algorithm often overfits during score prediction.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a power grid user evaluation sparse matrix scoring prediction method that addresses the overfitting of a matrix factorization algorithm caused by the sparsity of the power grid user scoring matrix and the resulting degradation of the collaborative filtering recommendation quality.
The technical scheme of the invention is as follows:
a power grid user evaluation sparse matrix scoring prediction method comprises the following steps: calculating semantic similarity between user evaluations by constructing a hierarchical structure tree of an ontology concept, and performing partial grading, prediction and filling on the user evaluations of which the similarity is greater than a similarity threshold in a sparse matrix; and on the basis of the predicted scoring matrix, performing dimensionality reduction and decomposition based on a matrix decomposition theory, and further performing scoring prediction on the user evaluation less than or equal to the similarity threshold value, so that the prediction filling of the scoring missing value in the sparse matrix is realized.
A power grid user evaluation sparse matrix scoring prediction method specifically comprises the following steps:
Step 1: according to the user's scores, divide the user evaluations into a set V_m of scored concept instances and a set W_m of concept instances whose scores are to be predicted;
Step 2: traverse the ontology hierarchy tree to obtain the depths depth(I_i), I_i ∈ V_m, and depth(I_j), I_j ∈ W_m, of the concept instances and their least common ancestor node lso(I_i, I_j), and calculate the user evaluation similarity;
Step 3: obtain the subset V_m' of V_m whose evaluation similarity is greater than the similarity threshold ε, and predict the score of the user evaluation to be predicted, I_j ∈ W_m;
Step 4: if V_m contains no element whose similarity to I_j is greater than the threshold ε, move to the next element of W_m and return to Step 2;
Step 5: take the scoring matrix predicted by the semantic similarity method as the input matrix, reduce its dimension by matrix factorization, decompose it into a low-rank matrix that best approximates the original matrix, compute the loss function, obtain the element values of the two factor matrices by iterative calculation, and finally obtain the evaluation score values to be predicted in the scoring matrix.
The user evaluation similarity in Step 2 is calculated as follows: the shorter the distance between two concept instances, the higher the similarity between the two concepts, and vice versa. The similarity formula is:
sim(I_i, I_j) = \frac{2\,depth(lso(I_i, I_j))}{len(I_i, I_j) + 2\,depth(lso(I_i, I_j))}   (1)
where depth is the length of the shortest path from a node to the root node, lso is the least common ancestor node of the two instances, and len is the length of the shortest path between the two nodes. From equation (1), the similarity between concepts increases with the depth of the least common ancestor node, and when I_i = I_j, sim(I_i, I_j) = 1.
In Step 3, the subset V_m' of V_m whose evaluation similarity exceeds the similarity threshold ε is obtained, and the score of the user evaluation to be predicted, I_j ∈ W_m, is predicted with the formula:
R_{mj} = \frac{\sum_{I_i \in V_m'} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m'} sim(I_i, I_j)}
the specific method for predicting the score is as follows:
Let the set of user evaluations be I = (I_1, ..., I_i, I_j, ..., I_n) and let ε be a threshold constant. If the semantic similarity of I_i and I_j is greater than the threshold, then I_i and I_j are similar; otherwise I_i and I_j are dissimilar.
Let the set of users be U = (U_1, ..., U_m, U_n, ..., U_k). For user U_m, the set of scored user evaluations is V_m and the set of unscored user evaluations is W_m; I_i and I_j are arbitrary elements of V_m and W_m respectively, R_mi is the score given by user U_m to the scored evaluation I_i, and ε is a threshold constant.
The score prediction R_mj of user U_m for the user evaluation I_j is as follows:

R_{mj} = \frac{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)}
Proof: suppose the scored set V_m contains t user evaluations, and that two of them whose similarity to the evaluation I_j to be predicted exceeds the threshold ε are I_p and I_q. According to the score prediction formula for the user evaluations I_p and I_q:

[equation image not reproduced]
Considering the similar user evaluations I_p and I_q together with their similarity to the evaluation I_j, the similarity weights w_p and w_q are:

w_p = \frac{sim(I_p, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}, \quad w_q = \frac{sim(I_q, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}
From the known scores R_mp, R_mq and the similarity weights, the prediction is:

R_{mj} = \frac{sim(I_p, I_j)\, R_{mp} + sim(I_q, I_j)\, R_{mq}}{sim(I_p, I_j) + sim(I_q, I_j)}
Extending further to the case where the scored set V_m of U_m contains more than one user evaluation similar to the evaluation I_j to be predicted, i.e. when there exists I_i ∈ V_m with sim(I_i, I_j) > ε, the evaluation to be predicted has a score prediction, and by the semantic similarity prediction method the score prediction R_mj of user U_m for I_j is given by the formula above.
In Step 5, the scoring matrix predicted by the semantic similarity method is taken as the input matrix, its dimension is reduced by matrix factorization, it is decomposed into a low-rank matrix that best approximates the original matrix, and the loss function is calculated as follows:
Let the scoring matrix predicted by the semantic similarity method be an m × n matrix R. For the sparse matrix R, matrix factorization is applied to predict the missing values: R can be decomposed into an m × k matrix U and a k × n matrix V, and the symbol \hat{R} denotes the approximate scoring matrix obtained after prediction of R:

R_{m \times n} \approx U_{m \times k} V_{k \times n} = \hat{R}
The matrix U captures the relation between the m users and k topics, and the matrix V the relation between the k topics and the n user evaluations; the number of topics k is a parameter chosen for the specific user evaluations. The element of the approximate scoring matrix in row i and column j is:

\hat{r}_{ij} = \sum_{t=1}^{k} u_{i,t} v_{t,j}
Taking as the loss function the sum of squared differences between the actual scores and the approximate scores, the loss function of the matrix factorization score prediction is:

E = \sum_{(i,j)} (r_{ij} - \hat{r}_{ij})^2 = \sum_{(i,j)} \left( r_{ij} - \sum_{t=1}^{k} u_{i,t} v_{t,j} \right)^2

where the sum runs over the observed entries of R.
Also in Step 5, the element values of the two factor matrices and finally the user evaluation score values to be predicted in the scoring matrix are obtained by iterative computation as follows:
Taking the partial derivatives of the loss function with respect to u_{i,k} and v_{k,j} gives:

\frac{\partial E}{\partial u_{i,k}} = -2 E_{i,j} v_{k,j}

\frac{\partial E}{\partial v_{k,j}} = -2 E_{i,j} u_{i,k}

where E_{i,j} = r_{ij} - \hat{r}_{ij} is the prediction error.
Moving along the direction of steepest descent with the gradient descent optimization algorithm, where α is the learning rate:

u_{i,k} = u_{i,k} + 2α E_{i,j} v_{k,j}, \quad v_{k,j} = v_{k,j} + 2α E_{i,j} u_{i,k}
To prevent overfitting of the scoring matrix, a regularization term β(||u_i||² + ||v_k||²) is added, where β is the regularization parameter, yielding:

u_{i,k} = u_{i,k} + α(2 E_{i,j} v_{k,j} - β u_{i,k})
v_{k,j} = v_{k,j} + α(2 E_{i,j} u_{i,k} - β v_{k,j})   (12)
after the matrix U and V is solved, the prediction scoring formula of the user i on the item j is as follows:
u(i,1)*v(1,j)+u(i,2)*v(2,j)+…+u(i,k)*v(k,j) (13)。
the method also comprises the step of verifying the feasibility and the effectiveness of the power grid user evaluation sparse matrix scoring prediction method, wherein the verification method specifically comprises the following steps: the algorithm is realized by C + + language under QT7.4.7 programming environment; data is from a dataset collected by the university of minnesota state computer science group research group for collaborative filtering algorithms; the sparsity of the data set is 100000/1682 multiplied by 943 ≈ 93.7%, firstly, the whole data set is subjected to randomized shuffling operation, then, experimental data are averagely divided into 5 mutually disjoint sub data sets, and the data proportion of the training set to the testing set is 4: 1; the scale values of the distribution of score 1 to score 5 for the 5 data sets are shown in table 1.
TABLE 1 score distribution ratio
The proportion of score-5 data differs by 1.87% between data set 1 and data set 3, the largest difference in the whole collection, while the score distributions of the remaining data sets differ less; data set 1 is therefore not used as a test set, and one of data sets 2 to 5 is chosen as the test set;
in the experiment, the number of levels used when classifying the data and constructing the ontology hierarchy is denoted hierarchy tree, the semantic similarity threshold is ε, the number of features (dimension) of the matrix factorization is F, the learning rate is α, and the regularization parameter is β; the parameter settings are shown in Table 2:
table 2 experimental parameter settings
To verify the performance of the algorithm, a score prediction algorithm based on semantic similarity, a score prediction algorithm based on singular value decomposition of the sparse matrix, and a prediction algorithm that fills the missing values with random numbers and then performs matrix factorization are each run on the data set and compared; the results are obtained by adjusting the similarity threshold and the number of iterations and are analysed statistically;
the mean absolute error (MAE) is adopted as the metric; the MAE measures prediction accuracy by computing the deviation between the predicted and the actual user scores. Suppose the scores of the N predicted items are represented by the vector {p_1, p_2, …, p_N} and the corresponding actual user scores are {r_1, r_2, …, r_N}; the MAE is then computed as:

MAE = \frac{1}{N} \sum_{i=1}^{N} |p_i - r_i|
the resulting similarity threshold was set to 0.75.
For effectiveness, a prediction-filling method that applies SVD (singular value decomposition) directly to the sparse matrix and a method that pre-fills the sparse matrix with random numbers in the range 1 to 5 before matrix factorization are compared with the proposed algorithm as the number of retained dimensions after factorization varies. The result is as follows: while preserving the properties of the original scoring data set, the proposed algorithm adjusts the similarity threshold during score prediction to pre-fill the entries with higher similarity, providing richer source information for the original sparse matrix before factorization; with the similarity threshold set to 0.75, the MAE of the proposed algorithm is smaller than that of the SVD decomposition algorithm.
The invention has the beneficial effects that:
In this method, the entries of the sparse matrix whose user evaluations have high semantic similarity are partially filled by score prediction based on ontology-concept similarity; on the basis of this partially predicted scoring matrix, a matrix factorization algorithm then predicts the missing values of the unscored items with low similarity. The sparse matrix thus receives two rounds of prediction filling, finally yielding a complete user-score matrix, which alleviates the overfitting produced by matrix factorization when the matrix is extremely sparse and improves the quality of the collaborative filtering recommendation algorithm.
Drawings
FIG. 1 is a schematic diagram of the matrix decomposition of the present invention;
FIG. 2 is a schematic diagram illustrating the effect of a semantic similarity threshold on an MAE value according to the present invention;
FIG. 3 is a graph comparing the change of MAE values with the preserved dimension after decomposition.
Detailed Description
The shorter the distance between two concept instances, the higher the similarity between the two concepts, and vice versa. The semantic-distance similarity of concepts is based on the overlapping paths of the concept terms in the ontology hierarchy tree, and the similarity model is given by formula (1).
sim(I_i, I_j) = \frac{2\,depth(lso(I_i, I_j))}{len(I_i, I_j) + 2\,depth(lso(I_i, I_j))}   (1)
where depth is the length of the shortest path from a node to the root node, lso is the least common ancestor node of the two instances, and len is the length of the shortest path between the two nodes. From equation (1), the similarity between concepts increases with the depth of the least common ancestor node, and when I_i = I_j, sim(I_i, I_j) = 1.
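As an illustration of how formula (1) can be evaluated on a concrete hierarchy, the following C++ sketch (C++ being the language in which the experiments are later implemented) computes depth, lso, len and sim on a small hand-built tree. The tree shape, node indices, and helper structure are hypothetical, and the closed form in sim() is the reconstruction of formula (1) given above, not a verbatim copy of the original image:

```cpp
#include <iostream>
#include <vector>

// Minimal ontology hierarchy: each node stores its parent index (-1 for the root).
struct Ontology {
    std::vector<int> parent;

    // depth(): number of edges on the shortest path from a node to the root.
    int depth(int node) const {
        int d = 0;
        while (parent[node] != -1) { node = parent[node]; ++d; }
        return d;
    }

    // lso(): least common ancestor of two concept instances.
    int lso(int a, int b) const {
        while (depth(a) > depth(b)) a = parent[a];   // lift the deeper node
        while (depth(b) > depth(a)) b = parent[b];
        while (a != b) { a = parent[a]; b = parent[b]; }
        return a;
    }

    // len(): shortest path length between two nodes through their lso.
    int len(int a, int b) const {
        int anc = lso(a, b);
        return depth(a) + depth(b) - 2 * depth(anc);
    }

    // sim(): grows with the depth of the lso and equals 1 when a == b.
    double sim(int a, int b) const {
        if (a == b) return 1.0;
        double d = depth(lso(a, b));
        return 2.0 * d / (len(a, b) + 2.0 * d);
    }
};

int main() {
    // Toy hierarchy: 0 = root, 1 and 2 are children of 0, 3 and 4 are children of 1.
    Ontology ont{{-1, 0, 0, 1, 1}};
    std::cout << "sim(3,4) = " << ont.sim(3, 4) << "\n";  // siblings under node 1
    std::cout << "sim(3,2) = " << ont.sim(3, 2) << "\n";  // only the root in common
    std::cout << "sim(3,3) = " << ont.sim(3, 3) << "\n";  // identical instances -> 1
    return 0;
}
```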
Let the set of user evaluations be I = (I_1, ..., I_i, I_j, ..., I_n) and let ε be a threshold constant. If the semantic similarity of I_i and I_j is greater than the threshold, then I_i and I_j are similar; otherwise I_i and I_j are dissimilar.
The set of users is U = (U_1, ..., U_m, U_n, ..., U_k). For user U_m, the set of scored user evaluations is V_m and the set of unscored user evaluations is W_m; I_i and I_j are arbitrary elements of V_m and W_m respectively, R_mi is the score given by user U_m to the scored evaluation I_i, and ε is a threshold constant.
The score prediction R_mj of user U_m for the user evaluation I_j is as follows:

R_{mj} = \frac{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)}
Proof: suppose the scored set V_m contains t user evaluations, and that two of them whose similarity to the evaluation I_j to be predicted exceeds the threshold ε are I_p and I_q. According to the score prediction formula for the user evaluations I_p and I_q:

[equation image not reproduced]
Considering the similar user evaluations I_p and I_q together with their similarity to the evaluation I_j, the similarity weights w_p and w_q are:

w_p = \frac{sim(I_p, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}, \quad w_q = \frac{sim(I_q, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}
From the known scores R_mp, R_mq and the similarity weights, the prediction is:

R_{mj} = \frac{sim(I_p, I_j)\, R_{mp} + sim(I_q, I_j)\, R_{mq}}{sim(I_p, I_j) + sim(I_q, I_j)}
Extending further to the case where the scored set V_m of U_m contains several user evaluations similar to the evaluation I_j to be predicted, i.e. when there exists I_i ∈ V_m with sim(I_i, I_j) > ε, the evaluation to be predicted has a score prediction, and by the semantic similarity prediction method the score prediction R_mj of user U_m for I_j is:

R_{mj} = \frac{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)}   (6)
Score prediction based on semantic similarity effectively fills the sparse user scoring matrix, but it can only predict scores for similar user evaluations; scores cannot be predicted for user evaluations with low similarity.
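Before turning to matrix factorization, a minimal C++ sketch of this first, similarity-based filling stage is given below. The data layout (a dense matrix in which 0 marks a missing score) and the assumption that a similarity matrix sim has already been computed, for example by the ontology-based measure above, are illustrative and not part of the claimed method:

```cpp
#include <cstddef>
#include <vector>

// First-stage filling: predict R[m][j] from already-scored evaluations whose
// ontology similarity to evaluation j exceeds the threshold epsilon (formula (6)).
// R is the user x evaluation score matrix; 0 marks a missing score.
void fillBySimilarity(std::vector<std::vector<double>>& R,
                      const std::vector<std::vector<double>>& sim,
                      double epsilon) {
    for (std::size_t m = 0; m < R.size(); ++m) {
        for (std::size_t j = 0; j < R[m].size(); ++j) {
            if (R[m][j] != 0.0) continue;                 // already scored
            double num = 0.0, den = 0.0;
            for (std::size_t i = 0; i < R[m].size(); ++i) {
                if (i == j || R[m][i] == 0.0) continue;   // only scored evaluations (V_m)
                if (sim[i][j] > epsilon) {                // similar subset V_m'
                    num += sim[i][j] * R[m][i];
                    den += sim[i][j];
                }
            }
            if (den > 0.0) R[m][j] = num / den;           // weighted-average prediction
            // otherwise the entry is left for the matrix-factorization stage
        }
    }
}
```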
Let the user evaluation scoring matrix be an m × n matrix R. For the sparse matrix R, matrix factorization is applied to predict the missing values: R can be decomposed into an m × k matrix U and a k × n matrix V, and the symbol \hat{R} denotes the approximate scoring matrix obtained after prediction of R:

R_{m \times n} \approx U_{m \times k} V_{k \times n} = \hat{R}
The matrix U captures the relation between the m users and k topics, and the matrix V the relation between the k topics and the n user evaluations; the number of topics k is a parameter chosen for the specific user evaluations and can be adjusted as required. The element of the approximate scoring matrix in row i and column j is:

\hat{r}_{ij} = \sum_{t=1}^{k} u_{i,t} v_{t,j}
Taking as the loss function the sum of squared differences between the actual scores and the approximate scores, the loss function of the matrix factorization score prediction is:

E = \sum_{(i,j)} (r_{ij} - \hat{r}_{ij})^2 = \sum_{(i,j)} \left( r_{ij} - \sum_{t=1}^{k} u_{i,t} v_{t,j} \right)^2   (9)

where the sum runs over the observed entries of R.
Taking the partial derivatives of the loss function with respect to u_{i,k} and v_{k,j} gives:

\frac{\partial E}{\partial u_{i,k}} = -2 E_{i,j} v_{k,j}, \quad \frac{\partial E}{\partial v_{k,j}} = -2 E_{i,j} u_{i,k}

where E_{i,j} = r_{ij} - \hat{r}_{ij} is the prediction error.
Moving along the direction of steepest descent with the gradient descent optimization algorithm, where α is the learning rate:

u_{i,k} = u_{i,k} + 2α E_{i,j} v_{k,j}, \quad v_{k,j} = v_{k,j} + 2α E_{i,j} u_{i,k}
To prevent overfitting of the scoring matrix, a regularization term β(||u_i||² + ||v_k||²) is added, where β is the regularization parameter, yielding:

u_{i,k} = u_{i,k} + α(2 E_{i,j} v_{k,j} - β u_{i,k})
v_{k,j} = v_{k,j} + α(2 E_{i,j} u_{i,k} - β v_{k,j})   (12)
after the matrix U and V is solved, the prediction scoring formula of the user i on the item j is as follows:
u(i,1)·v(1,j) + u(i,2)·v(2,j) + … + u(i,k)·v(k,j)   (13)
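A compact C++ sketch of this factorization stage under update rule (12) and prediction rule (13) follows; the random initialization, the fixed iteration count, and the treatment of 0 as a missing entry are illustrative assumptions rather than details taken from the description:

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Second stage: factor the (pre-filled) m x n score matrix R into U (m x k) and
// V (k x n) by gradient descent with L2 regularization, per update rule (12).
void factorize(const std::vector<std::vector<double>>& R,
               std::vector<std::vector<double>>& U,
               std::vector<std::vector<double>>& V,
               int k, int iterations, double alpha, double beta) {
    std::size_t m = R.size(), n = R[0].size();
    U.assign(m, std::vector<double>(k));
    V.assign(k, std::vector<double>(n));
    for (auto& row : U) for (auto& x : row) x = (double)std::rand() / RAND_MAX;
    for (auto& row : V) for (auto& x : row) x = (double)std::rand() / RAND_MAX;

    for (int it = 0; it < iterations; ++it) {
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                if (R[i][j] == 0.0) continue;              // only observed entries
                double pred = 0.0;
                for (int t = 0; t < k; ++t) pred += U[i][t] * V[t][j];
                double e = R[i][j] - pred;                 // E_{i,j} in (12)
                for (int t = 0; t < k; ++t) {
                    double u = U[i][t], v = V[t][j];
                    U[i][t] = u + alpha * (2.0 * e * v - beta * u);
                    V[t][j] = v + alpha * (2.0 * e * u - beta * v);
                }
            }
        }
    }
}

// Prediction rule (13): predicted score of user i for evaluation j.
double predict(const std::vector<std::vector<double>>& U,
               const std::vector<std::vector<double>>& V,
               std::size_t i, std::size_t j, int k) {
    double s = 0.0;
    for (int t = 0; t < k; ++t) s += U[i][t] * V[t][j];
    return s;
}
```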
On this basis, the power grid user evaluation sparse matrix scoring prediction algorithm built on the ontology-concept hierarchy tree can be stated as follows: first, the entries of the sparse matrix whose user evaluations have high semantic similarity are partially filled by score prediction based on ontology-concept similarity; then, on the basis of the partially predicted scoring matrix, a matrix factorization algorithm further predicts the missing values of the unscored items with low similarity. The sparse matrix thus receives two rounds of prediction filling and finally a complete user-score matrix is obtained, which alleviates the overfitting produced by matrix factorization when the matrix is extremely sparse and improves the quality of the collaborative filtering recommendation algorithm. The detailed steps of the algorithm are as follows:
Input: the sparse user-user evaluation scoring matrix.
Output: the predicted user-user evaluation scoring matrix.
Step 1: according to the user's scores, divide the user evaluations into a set V_m of scored concept instances and a set W_m of concept instances whose scores are to be predicted.
Step 2: traverse the ontology hierarchy tree to obtain the depths depth(I_i), I_i ∈ V_m, and depth(I_j), I_j ∈ W_m, of the concept instances and their least common ancestor node lso(I_i, I_j), and calculate the user evaluation similarity according to formula (1).
Step 3: obtain the subset V_m' of V_m whose evaluation similarity is greater than the similarity threshold ε, and predict the score of the user evaluation to be predicted, I_j ∈ W_m, according to formula (6).
Step 4: if V_m contains no element whose similarity to I_j is greater than the threshold ε, move to the next element of W_m and return to Step 2.
Step 5: take the scoring matrix predicted by the semantic similarity method as the input matrix, reduce its dimension by matrix factorization and decompose it into a low-rank matrix that best approximates the original matrix; compute the loss function according to formula (9), obtain the element values of the two factor matrices by iterative calculation according to formula (12), and substitute them into formula (13) to calculate the evaluation score values to be predicted in the scoring matrix.
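Putting the two stages together, a hypothetical top-level routine could look as follows; the function names reuse the illustrative sketches given earlier in this description and are not part of the claimed method:

```cpp
#include <cstddef>
#include <vector>

// Declarations of the illustrative helpers sketched above.
void fillBySimilarity(std::vector<std::vector<double>>& R,
                      const std::vector<std::vector<double>>& sim, double epsilon);
void factorize(const std::vector<std::vector<double>>& R,
               std::vector<std::vector<double>>& U,
               std::vector<std::vector<double>>& V,
               int k, int iterations, double alpha, double beta);
double predict(const std::vector<std::vector<double>>& U,
               const std::vector<std::vector<double>>& V,
               std::size_t i, std::size_t j, int k);

// Two-stage prediction: semantic pre-filling (Steps 1-4), then matrix
// factorization (Step 5) for the entries whose similarity never exceeded epsilon.
std::vector<std::vector<double>> predictAll(std::vector<std::vector<double>> R,
                                            const std::vector<std::vector<double>>& sim,
                                            double epsilon, int k, int iters,
                                            double alpha, double beta) {
    fillBySimilarity(R, sim, epsilon);
    std::vector<std::vector<double>> U, V;
    factorize(R, U, V, k, iters, alpha, beta);
    for (std::size_t i = 0; i < R.size(); ++i)
        for (std::size_t j = 0; j < R[i].size(); ++j)
            if (R[i][j] == 0.0) R[i][j] = predict(U, V, i, j, k);
    return R;   // complete user-score matrix
}
```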
To verify the feasibility and effectiveness of the algorithm, it is implemented in C++ under the QT 7.4.7 programming environment. The experimental data come from a data set collected for collaborative filtering algorithms by the computer science research group of the University of Minnesota (http://grouplens.org/datasets/movielens/100k/); the data set contains 100000 ratings out of 1682 × 943 possible entries, a sparsity of about 93.7%, and is therefore very sparse. In the experiment, the whole data set is first randomly shuffled, and the experimental data are then divided evenly into 5 mutually disjoint subsets, with a 4:1 ratio of training to test data.
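A brief C++ sketch of this shuffling and five-fold partitioning is shown below; the Rating record and the seed are hypothetical, and only the 4:1 split logic follows the description:

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// One rating record from a MovieLens-100K-style data set.
struct Rating { int user; int item; int score; };

// Shuffle the whole data set, then split it into 5 disjoint folds; using one
// fold as the test set and the other four as training data gives the 4:1 ratio.
std::vector<std::vector<Rating>> makeFolds(std::vector<Rating> data, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::shuffle(data.begin(), data.end(), gen);
    std::vector<std::vector<Rating>> folds(5);
    for (std::size_t i = 0; i < data.size(); ++i)
        folds[i % 5].push_back(data[i]);
    return folds;
}
```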
The proportions of scores 1 to 5 in the 5 data sets are shown in Table 1.
TABLE 1 score distribution ratio
As the score distribution ratios in Table 1 show, the proportion of score-5 data differs by 1.87% between data set 1 and data set 3, the largest difference in the whole collection, while the score distributions of the remaining data sets differ less; data set 1 is therefore not used as a test set, and one of data sets 2 to 5 is chosen as the test set.
In the experiment, the number of levels used when classifying the data and constructing the ontology hierarchy is denoted hierarchy tree, the semantic similarity threshold is ε, the number of features (dimension) of the matrix factorization is F, the learning rate is α, and the regularization parameter is β; the parameter settings are listed in Table 2.
Table 2 experimental parameter settings
To verify the performance of the algorithm, a score prediction algorithm based on semantic similarity, a score prediction algorithm based on singular value decomposition of the sparse matrix, and a prediction algorithm that fills the missing values with random numbers and then performs matrix factorization are each run on the data set and compared. The results are obtained by adjusting parameters such as the similarity threshold and the number of iterations, and are analysed statistically.
The mean absolute error (MAE) is used as the metric. The MAE measures prediction accuracy by computing the deviation between the predicted and the actual user scores. Suppose the scores of the N predicted items are represented by the vector {p_1, p_2, …, p_N} and the corresponding actual user scores are {r_1, r_2, …, r_N}; the MAE is then computed as follows. The curves of the MAE values of the different score prediction algorithms are shown in FIG. 3.

MAE = \frac{1}{N} \sum_{i=1}^{N} |p_i - r_i|
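For reference, a direct C++ sketch of the MAE computation above, assuming prediction and ground-truth vectors of equal length:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean absolute error between predicted scores p and actual scores r.
double mae(const std::vector<double>& p, const std::vector<double>& r) {
    double sum = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i)
        sum += std::abs(p[i] - r[i]);
    return p.empty() ? 0.0 : sum / p.size();
}
```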
In the experiment, the semantic similarity threshold is first adjusted to compare the trend of the MAE value of the semantic similarity algorithm with that of the proposed algorithm; the results are shown in FIG. 2. The experimental results show that the proposed algorithm and the item semantic similarity algorithm built on the ontology hierarchy tree follow a similar trend as the semantic similarity threshold varies, and that when the threshold exceeds 0.8 the MAE value no longer decreases as the threshold increases. The reason is that if the semantic similarity threshold is set too high, the algorithm only predicts scores for items with extremely high similarity: those predictions are relatively accurate, but items with low similarity still cannot be predicted, fewer items are predicted by semantic similarity, the data set remains sparse after prediction, and the purpose of the prediction algorithm is not achieved. Too small a semantic similarity threshold leads to inaccurate score predictions that damage the original properties of the data set. To avoid damaging the internal structure of the data set while still partially predicting the items with larger similarity in the extremely sparse matrix and providing more basic information for matrix factorization, analysis of the experimental data shows that a semantic similarity threshold of 0.75 is appropriate for this algorithm.
To verify the effectiveness of the proposed algorithm, a prediction-filling method that applies SVD directly to the sparse matrix and an algorithm that pre-fills the sparse matrix with random numbers (in the range 1 to 5) before matrix factorization are compared with the proposed algorithm as the number of retained dimensions after factorization varies; the line graph of the MAE values is shown in FIG. 3.
As the comparison in FIG. 3 shows, the MAE obtained by the conventional approach of pre-filling the scoring matrix with random numbers or score means before factorization can be larger than that of applying SVD directly to the original matrix, because filling the missing scores may change the essential properties of the original data. While preserving the properties of the original scoring data set, the proposed algorithm pre-fills the entries with higher similarity by adjusting the similarity threshold during score prediction, providing richer source information for the original sparse matrix before factorization. In FIG. 3, with the similarity threshold set to 0.75, the MAE of the proposed algorithm is smaller than that of the SVD decomposition algorithm.
Conclusion:
To address the sparsity of the scoring matrix in the power grid intelligent question-answering recommendation system, this method introduces ontology semantic similarity to perform partial score prediction for the items with higher similarity in the sparse scoring matrix, which serves as preprocessing of the sparse matrix before the remaining missing scores are predicted by matrix factorization; the two rounds of score prediction effectively resolve the sparsity of the scoring matrix. By adjusting the semantic similarity threshold, the ontology-concept semantic similarity algorithm avoids damaging the essential characteristics of the original data and provides more complete data support for the matrix factorization algorithm. The experimental results show that the sparse scoring matrix is completely filled and that the MAE of the matrix is reduced to a certain extent.

Claims (8)

1. A power grid user evaluation sparse matrix scoring prediction method, characterized in that it comprises: calculating the semantic similarity between user evaluations by constructing an ontology-concept hierarchy tree, and performing partial score prediction and filling in the sparse matrix for user evaluations whose similarity exceeds a similarity threshold; and, based on the partially predicted scoring matrix, reducing its dimension and decomposing it according to matrix factorization theory, then further predicting scores for the user evaluations whose similarity is at or below the threshold, so that the missing score values in the sparse matrix are filled by prediction.
2. The power grid user evaluation sparse matrix scoring prediction method according to claim 1, characterized in that it specifically comprises the following steps:
Step 1: according to the user's scores, divide the user evaluations into a set V_m of scored concept instances and a set W_m of concept instances whose scores are to be predicted;
Step 2: traverse the ontology hierarchy tree to obtain the depths depth(I_i), I_i ∈ V_m, and depth(I_j), I_j ∈ W_m, of the concept instances and their least common ancestor node lso(I_i, I_j), and calculate the user evaluation similarity;
Step 3: obtain the subset V_m' of V_m whose evaluation similarity is greater than the similarity threshold ε, and predict the score of the user evaluation to be predicted, I_j ∈ W_m;
Step 4: if V_m contains no element whose similarity to I_j is greater than the threshold ε, move to the next element of W_m and return to Step 2;
Step 5: take the scoring matrix predicted by the semantic similarity method as the input matrix, reduce its dimension by matrix factorization, decompose it into a low-rank matrix that best approximates the original matrix, compute the loss function, obtain the element values of the two factor matrices by iterative calculation, and finally obtain the evaluation score values to be predicted in the scoring matrix.
3. The power grid user evaluation sparse matrix scoring prediction method according to claim 2, characterized in that the user evaluation similarity in Step 2 is calculated as follows: the shorter the distance between two concept instances, the higher the similarity between the two concepts, and vice versa, the similarity formula being:

sim(I_i, I_j) = \frac{2\,depth(lso(I_i, I_j))}{len(I_i, I_j) + 2\,depth(lso(I_i, I_j))}   (1)

where depth is the length of the shortest path from a node to the root node, lso is the least common ancestor node of the two instances, and len is the length of the shortest path between the two nodes; from equation (1), the similarity between concepts increases with the depth of the least common ancestor node, and when I_i = I_j, sim(I_i, I_j) = 1.
4. The power grid user evaluation sparse matrix scoring prediction method according to claim 2, characterized in that in Step 3 the subset V_m' of V_m whose evaluation similarity exceeds the similarity threshold ε is obtained, and the score of the user evaluation to be predicted, I_j ∈ W_m, is predicted with the formula:

R_{mj} = \frac{\sum_{I_i \in V_m'} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m'} sim(I_i, I_j)}
5. The power grid user evaluation sparse matrix scoring prediction method according to claim 4, characterized in that the score is predicted as follows:
Let the set of user evaluations be I = (I_1, ..., I_i, I_j, ..., I_n) and let ε be a threshold constant. If the semantic similarity of I_i and I_j is greater than the threshold, then I_i and I_j are similar; otherwise I_i and I_j are dissimilar.
Let the set of users be U = (U_1, ..., U_m, U_n, ..., U_k). For user U_m, the set of scored user evaluations is V_m and the set of unscored user evaluations is W_m; I_i and I_j are arbitrary elements of V_m and W_m respectively, R_mi is the score given by user U_m to the scored evaluation I_i, and ε is a threshold constant.
The score prediction R_mj of user U_m for the user evaluation I_j is as follows:

R_{mj} = \frac{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)\, R_{mi}}{\sum_{I_i \in V_m,\ sim(I_i, I_j) > ε} sim(I_i, I_j)}
Proof: suppose the scored set V_m contains t user evaluations, and that two of them whose similarity to the evaluation I_j to be predicted exceeds the threshold ε are I_p and I_q. According to the score prediction formula for the user evaluations I_p and I_q:

[equation image not reproduced]
Considering the similar user evaluations I_p and I_q together with their similarity to the evaluation I_j, the similarity weights w_p and w_q are:

w_p = \frac{sim(I_p, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}, \quad w_q = \frac{sim(I_q, I_j)}{sim(I_p, I_j) + sim(I_q, I_j)}
From the known scores R_mp, R_mq and the similarity weights, the prediction is:

R_{mj} = \frac{sim(I_p, I_j)\, R_{mp} + sim(I_q, I_j)\, R_{mq}}{sim(I_p, I_j) + sim(I_q, I_j)}
Extending further to the case where the scored set V_m of U_m contains more than one user evaluation similar to the evaluation I_j to be predicted, i.e. when there exists I_i ∈ V_m with sim(I_i, I_j) > ε, the evaluation to be predicted has a score prediction, and by the semantic similarity prediction method the score prediction R_mj of user U_m for I_j is given by the above formula.
6. The power grid user evaluation sparse matrix scoring prediction method according to claim 2, characterized in that in Step 5 the scoring matrix predicted by the semantic similarity method is taken as the input matrix, its dimension is reduced by matrix factorization, it is decomposed into a low-rank matrix that best approximates the original matrix, and the loss function is calculated as follows: let the scoring matrix predicted by the semantic similarity method be an m × n matrix R; for the sparse matrix R, matrix factorization is applied to predict the missing values, and R can be decomposed into an m × k matrix U and a k × n matrix V, with the symbol \hat{R} denoting the approximate scoring matrix obtained after prediction of R:

R_{m \times n} \approx U_{m \times k} V_{k \times n} = \hat{R}

The matrix U captures the relation between the m users and k topics, and the matrix V the relation between the k topics and the n user evaluations; the number of topics k is a parameter chosen for the specific user evaluations. The element of the approximate scoring matrix in row i and column j is:

\hat{r}_{ij} = \sum_{t=1}^{k} u_{i,t} v_{t,j}

Taking as the loss function the sum of squared differences between the actual scores and the approximate scores, the loss function of the matrix factorization score prediction is:

E = \sum_{(i,j)} (r_{ij} - \hat{r}_{ij})^2 = \sum_{(i,j)} \left( r_{ij} - \sum_{t=1}^{k} u_{i,t} v_{t,j} \right)^2

where the sum runs over the observed entries of R.
7. The power grid user evaluation sparse matrix scoring prediction method according to claim 5, characterized in that in Step 5 the element values of the two factor matrices and finally the user evaluation score values to be predicted in the scoring matrix are obtained by iterative computation as follows:
Taking the partial derivatives of the loss function with respect to u_{i,k} and v_{k,j} gives:

\frac{\partial E}{\partial u_{i,k}} = -2 E_{i,j} v_{k,j}

\frac{\partial E}{\partial v_{k,j}} = -2 E_{i,j} u_{i,k}

where E_{i,j} = r_{ij} - \hat{r}_{ij} is the prediction error.
Moving along the direction of steepest descent with the gradient descent optimization algorithm, where α is the learning rate:

u_{i,k} = u_{i,k} + 2α E_{i,j} v_{k,j}, \quad v_{k,j} = v_{k,j} + 2α E_{i,j} u_{i,k}
To prevent overfitting of the scoring matrix, a regularization term β(||u_i||² + ||v_k||²) is added, where β is the regularization parameter, yielding:

u_{i,k} = u_{i,k} + α(2 E_{i,j} v_{k,j} - β u_{i,k})
v_{k,j} = v_{k,j} + α(2 E_{i,j} u_{i,k} - β v_{k,j})   (12)
after the matrix U and V is solved, the prediction scoring formula of the user i on the item j is as follows:
u(i,1)*v(1,j)+u(i,2)*v(2,j)+…+u(i,k)*v(k,j) (13)。
8. The power grid user evaluation sparse matrix scoring prediction method according to claim 2, characterized in that it further comprises verifying the feasibility and effectiveness of the power grid user evaluation sparse matrix scoring prediction method, the verification being as follows: the algorithm is implemented in C++ under the QT 7.4.7 programming environment; the data come from a data set collected for collaborative filtering algorithms by the computer science research group of the University of Minnesota; the data set contains 100000 ratings out of 1682 × 943 possible entries, corresponding to a sparsity of about 93.7%; the whole data set is first randomly shuffled, the experimental data are then divided evenly into 5 mutually disjoint subsets, and the ratio of training to test data is 4:1; the proportions of scores 1 to 5 in the 5 data sets are shown in Table 1.
TABLE 1 score distribution ratio
The proportion of score-5 data differs by 1.87% between data set 1 and data set 3, the largest difference in the whole collection, while the score distributions of the remaining data sets differ less; data set 1 is therefore not used as a test set, and one of data sets 2 to 5 is chosen as the test set;
in the experiment, the number of levels used when classifying the data and constructing the ontology hierarchy is denoted hierarchy tree, the semantic similarity threshold is ε, the number of features (dimension) of the matrix factorization is F, the learning rate is α, and the regularization parameter is β; the parameter settings are shown in Table 2:
table 2 experimental parameter settings
To verify the performance of the algorithm, a score prediction algorithm based on semantic similarity, a score prediction algorithm based on singular value decomposition of the sparse matrix, and a prediction algorithm that fills the missing values with random numbers and then performs matrix factorization are each run on the data set and compared; the results are obtained by adjusting the similarity threshold and the number of iterations and are analysed statistically;
the mean absolute error (MAE) is adopted as the metric; the MAE measures prediction accuracy by computing the deviation between the predicted and the actual user scores. Suppose the scores of the N predicted items are represented by the vector {p_1, p_2, …, p_N} and the corresponding actual user scores are {r_1, r_2, …, r_N}; the MAE is then computed as:

MAE = \frac{1}{N} \sum_{i=1}^{N} |p_i - r_i|
the similarity threshold is set to 0.75;
the effectiveness is verified as follows: a prediction-filling method that applies SVD directly to the sparse matrix and a method that pre-fills the sparse matrix with random numbers in the range 1 to 5 before matrix factorization are compared with the proposed algorithm as the number of retained dimensions after factorization varies; while preserving the properties of the original scoring data set, the similarity threshold used during score prediction is adjusted so that the entries with higher similarity are partially pre-filled, providing richer source information for the original sparse matrix before factorization, and with the similarity threshold set to 0.75 the MAE of the algorithm is smaller than that of the SVD decomposition algorithm.
CN202110868792.4A 2021-07-30 2021-07-30 Power grid user evaluation sparse matrix scoring prediction method Pending CN113537619A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110868792.4A CN113537619A (en) 2021-07-30 2021-07-30 Power grid user evaluation sparse matrix scoring prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110868792.4A CN113537619A (en) 2021-07-30 2021-07-30 Power grid user evaluation sparse matrix scoring prediction method

Publications (1)

Publication Number Publication Date
CN113537619A true CN113537619A (en) 2021-10-22

Family

ID=78089871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110868792.4A Pending CN113537619A (en) 2021-07-30 2021-07-30 Power grid user evaluation sparse matrix scoring prediction method

Country Status (1)

Country Link
CN (1) CN113537619A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740064A (en) * 2019-01-18 2019-05-10 北京化工大学 A kind of CF recommended method of fusion matrix decomposition and excavation user items information

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740064A (en) * 2019-01-18 2019-05-10 北京化工大学 A kind of CF recommended method of fusion matrix decomposition and excavation user items information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang Yang et al.: "Score prediction algorithm fusing semantic similarity and matrix factorization", Journal of Computer Applications *
Guo Ming: "Research on a collaborative filtering recommendation algorithm based on co-clustering", China Master's Theses Full-text Database, Information Science and Technology *

Similar Documents

Publication Publication Date Title
Luo Network text sentiment analysis method combining LDA text representation and GRU-CNN
CN109299396B (en) Convolutional neural network collaborative filtering recommendation method and system fusing attention model
Zhang et al. A recommendation model based on deep neural network
Zamani et al. Neural query performance prediction using weak supervision from multiple signals
Teh et al. Indian buffet processes with power-law behavior
CN111797321A (en) Personalized knowledge recommendation method and system for different scenes
CN112232087A (en) Transformer-based specific aspect emotion analysis method of multi-granularity attention model
Wu et al. Hypergraph collaborative network on vertices and hyperedges
CN112100439B (en) Recommendation method based on dependency embedding and neural attention network
Yildiz et al. Improving word embedding quality with innovative automated approaches to hyperparameters
Yildirim A novel grid-based many-objective swarm intelligence approach for sentiment analysis in social media
Raudhatunnisa et al. Performance Comparison of Hot-Deck Imputation, K-Nearest Neighbor Imputation, and Predictive Mean Matching in Missing Value Handling, Case Study: March 2019 SUSENAS Kor Dataset
JP2005078240A (en) Method for extracting knowledge by data mining
He et al. Knowledge base completion using matrix factorization
CN112734510B (en) Commodity recommendation method based on fusion improvement fuzzy clustering and interest attenuation
Zhang et al. Probabilistic matrix factorization recommendation of self-attention mechanism convolutional neural networks with item auxiliary information
CN113537619A (en) Power grid user evaluation sparse matrix scoring prediction method
Wu et al. Graph-based query strategies for active learning
Bahrkazemi et al. A strategy to estimate the optimal low-rank in incremental SVD-based algorithms for recommender systems
Rumbut et al. Topic modeling for systematic review of visual analytics in incomplete longitudinal behavioral trial data
Pei [Retracted] Construction of a Legal System of Corporate Social Responsibility Based on Big Data Analysis Technology
Priyati et al. The comparison study of matrix factorization on collaborative filtering recommender system
Nordström Unstructured pruning of pre-trained language models tuned for sentiment classification.
Wang Forecast model of TV show rating based on convolutional neural network
Singhal et al. Predicting Budget from Transportation Research Grant Description: An Exploratory Analysis of Text Mining and Machine Learning Techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211022
