CN104794250B - An item selection method based on adaptive active learning

Info

Publication number: CN104794250B
Application number: CN201510255684.4A
Authority: CN
Inventors: 吴健; 李承超; 张宇
Original assignee: SUZHOU RONGXI INFORMATION TECHNOLOGY Co Ltd
Current assignee: Suzhou Feiyu Mutual Entertainment Information Technology Co., Ltd.
Priority date: 2015-05-19
Filing date: 2015-05-19
Publication date: 2018-10-19
Anticipated expiration: 2035-05-19
Also published as: CN104794250A

Abstract

The invention discloses an item selection method based on adaptive active learning, comprising: calculating the uncertainty of candidate items; calculating the representativeness of candidate items; and, according to the uncertainty and the representativeness, adaptively and dynamically selecting the item with the highest information content. By jointly considering the uncertainty and the representativeness of items, the invention can pick out the item with the highest information content.

Description

An item selection method based on adaptive active learning

Technical field

The present invention relates to the technical field of recommender systems, and more particularly to an item selection method based on adaptive active learning.

Background art

In a collaborative filtering recommender system, the key to solving the user cold-start problem is to quickly build an interest preference model for the new user. When a user first uses the system, a rating-elicitation method based on active learning actively selects some items for the user to rate, which effectively captures the user's personalized preferences. Two points must be considered when selecting items for the user to rate: (1) by having the user rate items, the system obtains more of the user's rating data, and the more rating information it has, the more effective the recommender system is; (2) not all ratings are equally informative: some ratings represent the user's personal preferences and some do not, so different active learning rating-elicitation methods produce different results. For example, always selecting popular items for the user to rate yields more rating data but does little to help the system learn the user's personalized preferences, because most users like popular items. Therefore, how to design an effective active learning item selection strategy that selects as few items as possible, each with high information content, so that their ratings better represent the user's preferences, is a critical problem that urgently needs to be solved.

Summary of the invention

The present invention provides an item selection method based on adaptive active learning that jointly considers the uncertainty and the representativeness of items to pick out the item with the highest information content.

The present invention provides an item selection method based on adaptive active learning, comprising:

calculating the uncertainty of candidate items;

calculating the representativeness of candidate items;

selecting the item with the highest information content according to the uncertainty and the representativeness.

Preferably, calculating the uncertainty of candidate items comprises:

calculating the uncertainty of a candidate item according to a formula in which R_cx denotes user c's rating of item x, the mean term denotes the users' average rating, and U_x(sim) denotes the set of users who are similar to the current new user and have rated item x.

Preferably, calculating the representativeness of candidate items comprises:

on the training set T_c, calculating c's predicted rating y_cx(θ) of x according to the prediction model θ, estimating the probability p(U=c, R_cx=r) that c rates x as r, and taking r as the changed value of y_cx(θ);

updating the rating training set T_c by adding the changed predicted rating to c's list of rated items, obtaining a new rating training set T_c,r = T_c ∪ (x, r);

on the rating training sets T_c and T_c,r, predicting, according to the prediction model θ, c's ratings of the other unrated items x_i in c's unrated item set, obtaining one predicted rating on each of the two training sets;

under the probability p(U=c, R_cx=r) that the rating is r, estimating the influence of a change in the rating of the current candidate item x on the predicted ratings of the other items, with the rating change expressed as the square of the difference between the two predicted ratings, and calculating the representativeness rep(x) of the current candidate item x accordingly, where: c denotes the current new user; x denotes the current candidate item; the item sets involved are c's unrated item set, c's rated item set, and c's remaining unrated item set after x is removed, each item of which is denoted x_i; T_c is the training data set corresponding to c; and R_cx denotes c's rating of x.

Preferably, selecting the item with the highest information content according to the uncertainty and the representativeness comprises:

selecting, from c's unrated item set, the item with the highest information content as the item with the largest product uncertainty(x) × rep(x), where uncertainty(x) is the uncertainty, rep(x) is the representativeness, c denotes the current new user, and x denotes the current candidate item.

Preferably, after calculating the representativeness of candidate items, the method further comprises:

pre-specifying a weight set W = {w₁, w₂, …, w_(n-1), w_n} of size |W| = n;

initializing the candidate item set I to the empty set;

for each weight w_i ∈ W, selecting the top L candidate items to form an item set I_i;

updating the candidate item set I = I ∪ I_i;

training a prediction model θ on user c's existing rating set T_c, calculating c's predicted rating of item x according to θ, and updating the training set T_c;

calculating the prediction deviation ε(x) corresponding to each item;

selecting the item x* with the highest information content from the candidate item set I.

Preferably, selecting the top L candidate items for the current weight w_i ∈ W comprises:

calculating the combined information content info(x) of an item from its uncertainty uncertainty(x) and its representativeness rep(x) according to the formula info(x) = uncertainty(x)^w × rep(x)^(1-w);

ranking the items by info(x), the item x* with the highest information content being the one that maximizes info(x), and selecting the top L candidate items.

Preferably, training the prediction model θ on user c's existing rating set T_c, calculating c's predicted rating of item x according to θ, and updating the training set T_c comprises:

updating the training set T_c by adding item x together with its predicted rating.

Preferably, calculating the prediction deviation ε(x) corresponding to each item comprises:

training a new prediction model on the updated T_c, using the new model to predict c's rating of each item t (t ∈ T_c) in the training set of rated items, and estimating the deviation ε(x) between the true ratings and the predicted ratings, where the predicted term denotes the rating of item t by c as predicted by the updated collaborative filtering model.

Preferably, selecting the item x* with the highest information content from the candidate item set I comprises:

selecting as x* the item in I that minimizes the deviation ε(x).

As can be seen from the above scheme, the item selection method based on adaptive active learning provided by the invention calculates both the uncertainty and the representativeness of candidate items and considers them jointly. When selecting the item with the highest information content for the user to rate, it overcomes the shortcomings of item selection strategies based on uncertainty alone and can pick out the item with the highest information content.

Description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of an item selection method based on adaptive active learning disclosed in an embodiment of the present invention;

Fig. 2 is a flowchart of an item selection method based on adaptive active learning disclosed in another embodiment of the present invention;

Fig. 3 is a schematic diagram of uncertain items;

Fig. 4 is a schematic diagram of the defect of uncertainty sampling;

Fig. 5 is a schematic diagram of representative item selection;

Fig. 6 is a schematic diagram of the influence of a rating change of item x_i;

Fig. 7 is a schematic diagram of the influence of a rating change of item x_j.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

As shown in Fig. 1, an item selection method based on adaptive active learning disclosed in an embodiment of the present invention comprises:

S101, calculating the uncertainty of candidate items;

In active learning for classification, the main principle of uncertainty sampling is that, each time a data sample is selected from the unlabeled data set, the sample selected or constructed should be the one about which the current learning model is least certain. In a collaborative filtering recommender system, an uncertain item is one for which the system cannot judge whether the user likes it. If, from the user's rating history and the rating information of other users, the system can accurately predict that the user likes (or dislikes) an item, the item is certain; otherwise it is uncertain. Because users' ratings of items express their preferences, most research measures the uncertainty of an item from rating information: the less consistent the users' ratings of an item, the higher its uncertainty. As shown in Fig. 3, in the rating matrix of 3 users (User1, User2, User3) over 3 items (Item1, Item2, Item3), the 3 users' ratings of Item1 are the least consistent, so the uncertainty of Item1 is higher than that of Item2 and Item3.

The higher the uncertainty of an item, the more the users who have rated it disagree about it, and the less the recommender system can determine whether the user to be served is interested in it. Selecting a highly uncertain item for that user to rate obtains the user's concrete rating of such an item and thus a better understanding of the user's preferences. For an item x, the uncertainty uncertainty(x) is calculated as the variance of the ratings the item has received,

where U_x denotes the set of users who have rated item x, |U_x| denotes the number of such users, R_cx denotes user c's rating of item x, and the mean term denotes the users' average rating.
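Assuming the measure is the population variance of the ratings of x, which is consistent with the references to the item's variance in this description, the definition can be sketched as follows. The symbol \bar{R}_x for the mean of the ratings of x is assumed notation and is not taken from the text.

```latex
\mathrm{uncertainty}(x) = \frac{1}{|U_x|} \sum_{c \in U_x} \left( R_{cx} - \bar{R}_x \right)^2,
\qquad
\bar{R}_x = \frac{1}{|U_x|} \sum_{c \in U_x} R_{cx}
```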

Existing methods calculate the variance of the current candidate item from the rating data of all users in the system. Consider an item whose variance is low when calculated over all users but high when calculated over the users similar to the new user to be served. A method that measures variance over all users globally will not select this item for the user. Yet by the basic principle of collaborative filtering, if the user's similar users are uncertain about the item, the item's uncertainty remains high after rating prediction. To solve this problem, the variance of an item is calculated only from the rating distribution of the users similar to the user to be served. The improved item uncertainty measure is the same variance taken over U_x(sim) instead of U_x,

where U_x(sim) denotes the set of users who are similar to the current new user and have rated item x.
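A minimal Python sketch of the similar-user variance follows. It assumes the measure is the population variance; the function name, the nested-dictionary layout of `ratings`, and the sample rating values are hypothetical and chosen only for illustration (the values of Fig. 3 are not available in this text).

```python
from statistics import pvariance


def uncertainty(item, ratings, similar_users):
    """Population variance of the ratings given to `item` by similar users.

    ratings: dict mapping user -> {item: rating} (hypothetical layout).
    similar_users: users judged similar to the new user, i.e. a stand-in
        for U_x(sim) before filtering to those who rated `item`.
    Returns 0.0 when no similar user has rated the item.
    """
    scores = [
        ratings[u][item]
        for u in similar_users
        if item in ratings.get(u, {})
    ]
    return pvariance(scores) if scores else 0.0


# Hypothetical ratings: users disagree on Item1 and mostly agree on Item2.
ratings = {
    "User1": {"Item1": 1, "Item2": 4},
    "User2": {"Item1": 5, "Item2": 4},
    "User3": {"Item1": 3, "Item2": 5},
}
similar = ["User1", "User2", "User3"]
```

With these sample values, `uncertainty("Item1", ratings, similar)` is larger than `uncertainty("Item2", ratings, similar)`, so Item1 would be treated as the more uncertain item.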

S102, calculating the representativeness of candidate items;

In active learning for classification, uncertainty reflects only the influence of a candidate unlabeled sample on the current classifier; it does not account for the sample's information content relative to the large set of unlabeled samples. In many cases the most uncertain sample may be an isolated point or a noise point. As shown in Fig. 4, the triangles and diamonds represent samples in the labeled set, and the circles represent samples in the unlabeled set. Because x_A is nearest to the classification boundary and has the greatest influence on the classifier, uncertainty sampling inevitably selects x_A to be labeled by a human expert. Since x_A is an isolated point, it is likely to shift the classification boundary incorrectly, for example from the solid line to the dotted line in the figure, which may cause many of the remaining samples to be misclassified. In fact, sample x_B has higher information content because it better represents the overall distribution of the samples, so x_B should be selected for manual labeling. To address isolated points and noise points, the representativeness of the current sample within the unlabeled set must be considered.

A similar problem arises when selecting uncertain items in collaborative filtering recommendation. An item selected by an uncertainty criterion reduces only the uncertainty of that item: the user's rating reveals the user's preference for the selected item but not for other items. In other words, uncertainty reduction does not consider the relationship between the selected item and the large number of unrated items, so it cannot reduce the uncertainty of other items globally. Fig. 5 illustrates this. Solid circles represent rated items and hollow circles represent unrated items; items that are close together in the figure, and therefore in the same category, receive similar ratings from the user. After one item in a category is rated, the uncertainty of the other items in that category is reduced. For two items a and d in the system, if the ratings of d by the users who have rated it are less consistent, the uncertainty of d is greater than that of a, i.e. uncertainty(d) > uncertainty(a), and an uncertainty reduction strategy selects d for the user to rate. However, a is clearly more representative of the remaining unrated items. Having the user rate a reveals the user's preferences for many of the remaining unrated items, so by selecting a the system can give the user better recommendations.

Based on the above analysis, to account for the influence of the selected item on the other unrated items, and to overcome the problem that the least certain item may be a boundary point, the representativeness of the selected item must also be measured.

By the basic principle of collaborative filtering recommendation algorithms, when the system obtains the new user's rating of a candidate item, that rating affects the rating predictions for the other unrated items. This influence is used to measure the representativeness of the candidate item, and the resulting measure takes the new user's existing preference information into account. The more a candidate item's rating changes the predicted ratings of the other unrated items, the more representative the item is. Fig. 6 and Fig. 7 show the influence on the other unrated items after the ratings of candidate items x_i and x_j change, respectively. As the figures show, a change in the rating of x_i has a greater influence on the ratings of the other unrated items, so x_i is considered more representative than x_j, i.e. rep(x_i) > rep(x_j).

In a recommender system, the ratings a user can give are generally limited to a few values, such as 0 and 1 for dislike and like, or the 5-value (1-5) scale common in movie recommendation. A rating value that the user can give is denoted r, and the set of possible rating values is denoted R, with r ∈ R. Just as the expected error reduction strategy considers all possible classes of an unlabeled sample, all rating values that the user might give are considered. From the user's existing ratings, a collaborative filtering method calculates the user's predicted rating, and the probability distribution over the rating set R is estimated statistically; the predicted rating is regarded as the user's true rating of the candidate item, and each value r is a changed value of the predicted rating. The influence of the rating change on the predicted ratings of the other unrated items is estimated under each probability. The goal is to find the item whose rating has a large influence on the ratings of other items, i.e. an item with high representativeness. This analysis yields a method that measures item representativeness by the influence of rating changes, described in detail as follows:

First, the following notation is given: c denotes the current new user; x denotes the current candidate item; separate symbols denote c's unrated item set, c's rated item set, and c's remaining unrated item set after x is removed, each item of which is denoted x_i; T_c is the training data set corresponding to c; and R_cx denotes c's rating of x.

On the training set T_c, c's predicted rating y_cx(θ) of x is calculated according to the prediction model θ, the probability p(U=c, R_cx=r) that c rates x as r is estimated, and r is taken as the changed value of y_cx(θ).

The rating training set T_c is updated by adding the changed predicted rating to c's list of rated items, which gives a new rating training set:

T_c,r = T_c ∪ (x, r)

On the rating training sets T_c and T_c,r, c's ratings of the other unrated items x_i in c's unrated item set are predicted according to the prediction model θ, giving one predicted rating on each training set. Under the probability p(U=c, R_cx=r) that the rating is r, the influence of a change in the rating of the current candidate item x on the predicted ratings of the other items is estimated, with the rating change expressed as the square of the difference between the two predicted ratings. The representativeness rep(x) of the current candidate item x is the resulting probability-weighted measure of these squared differences.
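The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented formula: it sums, over every possible rating r, the probability of r times the summed squared change in the predictions for the other unrated items, with no normalization. The names `representativeness`, `toy_predict`, and `SIM`, the similarity values, and the default prediction of 3.0 are all hypothetical; `predict` stands in for the prediction model θ.

```python
def representativeness(x, train, unrated, rating_probs, predict):
    """Expected squared change in predictions for the other unrated items.

    train: dict item -> rating, the new user's training set T_c.
    unrated: the user's unrated items (x is one of them).
    rating_probs: dict r -> p(U=c, R_cx=r), estimated elsewhere.
    predict: callable (train, item) -> predicted rating.
    """
    others = [i for i in unrated if i != x]
    before = {i: predict(train, i) for i in others}   # predictions on T_c
    total = 0.0
    for r, p in rating_probs.items():
        simulated = dict(train)
        simulated[x] = r                              # T_c,r = T_c ∪ (x, r)
        total += p * sum(
            (before[i] - predict(simulated, i)) ** 2 for i in others
        )
    return total


# Hypothetical symmetric item-item similarities for a toy predictor.
SIM = {
    frozenset(pair): s
    for pair, s in {
        ("a", "b"): 0.9, ("a", "c"): 0.8, ("a", "d"): 0.1,
        ("d", "b"): 0.1, ("d", "c"): 0.1,
        ("k", "a"): 0.5, ("k", "b"): 0.5, ("k", "c"): 0.5, ("k", "d"): 0.5,
    }.items()
}


def toy_predict(train, item):
    """Similarity-weighted average of the rated items (toy stand-in for θ)."""
    weights = {j: SIM.get(frozenset((item, j)), 0.0) for j in train}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 3.0  # assumed neutral default on a 1-5 scale
    return sum(w * train[j] for j, w in weights.items()) / total_weight
```

In this toy setup item `a` is strongly similar to `b` and `c`, while `d` is only weakly similar to everything, so `representativeness("a", ...)` comes out larger than `representativeness("d", ...)`, mirroring the a-versus-d discussion of Fig. 5.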

S103, selecting the item with the highest information content according to the uncertainty and the representativeness.

The representativeness measure based on the influence of rating changes considers the representativeness of items while making full use of each user's existing rating information. Considering representativeness within the unrated item set overcomes the problem that uncertainty-based selection may choose an outlier or isolated point, so that the item presented to the user for rating is unrepresentative and more user preference information cannot be predicted effectively. Considering the uncertainty and the representativeness of items together, the item x* with the highest information content is selected; the common combination takes the item in c's unrated item set with the largest product uncertainty(x) × rep(x).
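As a small illustration of this fixed combination, the sketch below picks the candidate with the largest product of the two scores. The function name `select_fixed` and the score values are hypothetical; the scores are assumed to have been computed beforehand.

```python
def select_fixed(candidates, uncertainty, rep):
    """Return the candidate that maximizes uncertainty(x) * rep(x)."""
    return max(candidates, key=lambda x: uncertainty[x] * rep[x])


# Hypothetical precomputed scores for three unrated items.
uncertainty_scores = {"a": 0.8, "d": 1.5, "e": 0.4}
rep_scores = {"a": 2.0, "d": 0.3, "e": 1.0}
best = select_fixed(["a", "d", "e"], uncertainty_scores, rep_scores)
```

Here item `d` is the most uncertain, but item `a` has the largest product and is the one selected.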

The fixed combination method disclosed in the above embodiment considers both the uncertainty and the representativeness of items: an item with a larger product of the two is an item with higher information content. To some extent this overcomes the shortcomings of selecting items by an uncertainty criterion alone. However, every iteration measures both the uncertainty and the representativeness of every item in the unrated item set, so when the unrated item set is large or the representativeness measure is expensive to compute, the computational cost is very high. Since the aim is to present the new user with as few items as possible, each with high information content, so as to better capture the user's preferences, selecting items with low information content should be avoided. If the system can already determine the new user's liking for an item from existing rating information, there is no need to ask the user to rate it. A serial combination method may therefore be used to select items. First, the uncertainty of the unrated items is measured with the uncertainty reduction criterion, the items are sorted from highest to lowest uncertainty, and the items about which the system is most uncertain are selected to form the most uncertain item set (The Most Uncertain Item Set, MUIS). Then, to overcome the problem that the most uncertain items may be isolated points or outliers, the representativeness of the items in MUIS is calculated with the representativeness criterion, the items in MUIS are ranked by representativeness, and the highly representative items are presented to the user. This ensures that the items given to the user to rate have both high uncertainty and high representativeness.
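The serial combination can be sketched as a two-stage ranking. The function name `select_serial`, its parameters `muis_size` and `k`, and the score values are hypothetical. For brevity the sketch takes precomputed representativeness scores for all items; in practice representativeness would only need to be computed for the items that reach MUIS, which is where the cost saving comes from.

```python
def select_serial(candidates, uncertainty, rep, muis_size, k=1):
    """Two-stage selection: most uncertain items first, then most representative.

    muis_size: number of items kept in the most uncertain item set (MUIS).
    k: number of items finally shown to the user.
    """
    muis = sorted(candidates, key=lambda x: uncertainty[x], reverse=True)[:muis_size]
    return sorted(muis, key=lambda x: rep[x], reverse=True)[:k]


# Hypothetical scores; item "f" is very representative but barely uncertain.
uncertainty_scores = {"a": 0.8, "d": 1.5, "e": 0.4, "f": 0.1}
rep_scores = {"a": 2.0, "d": 0.3, "e": 1.0, "f": 5.0}
chosen = select_serial(["a", "d", "e", "f"], uncertainty_scores, rep_scores, muis_size=2)
```

With `muis_size=2` the MUIS is `d` and `a`, and `a` is chosen. Item `f` never reaches the second stage even though it has the highest representativeness.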

The serial combination method avoids selecting items about which the system is already fairly certain. Compared with the fixed combination method, it improves the efficiency of item selection and avoids presenting items with low information content to the user. It also has a drawback: an item with relatively low uncertainty but relatively high representativeness is necessarily excluded from MUIS, so the method sacrifices item representativeness to some extent. In practice, however, it is difficult to determine whether the uncertainty or the representativeness of an item deserves more emphasis. The fixed combination method, which treats uncertainty and representativeness equally, has a similar problem. In view of this, the present invention discloses, on the basis of the above embodiment, another item selection method based on adaptive active learning.

As shown in Fig. 2, an item selection method based on adaptive active learning disclosed in another embodiment of the present invention comprises:

S201, calculating the uncertainty of candidate items;

S202, calculating the representativeness of candidate items;

S203, pre-specifying a weight set W = {w₁, w₂, …, w_(n-1), w_n} of size |W| = n;

S204, initializing the candidate item set I to the empty set;

S205, for each weight w_i ∈ W, selecting the top L candidate items to form an item set I_i;

S206, updating the candidate item set I = I ∪ I_i;

S207, training a prediction model θ on user c's existing rating set T_c, calculating c's predicted rating of item x according to θ, and updating the training set T_c;

S208, calculating the prediction deviation ε(x) corresponding to each item;

S209, selecting the item x* with the highest information content from the candidate item set I.

Specifically, the working principle of the above embodiment is as follows. Neither the fixed combination method nor the serial combination method can dynamically adjust the weights assigned to item uncertainty and representativeness. Given the uncertainty and representativeness measures of an item, the goal is a combination framework that integrates the advantages of both, so that the selected candidate item is uncertain with respect to the current system and highly representative within the unrated item set. When such a candidate item is added to the rated item set, the updated collaborative filtering model can better predict the user's preferences and provide more accurate recommendations to the new user. A combination framework commonly used in research takes the form of a product. Suppose the uncertainty of the current candidate item x is uncertainty(x) and its representativeness is rep(x); the combined information content info(x) of the item is then:

info(x) = uncertainty(x)^w × rep(x)^(1-w);

The item x* with the highest information content is the one that maximizes info(x) over c's unrated items,

where w (0 ≤ w ≤ 1) is a weighting factor that controls the relative importance of item uncertainty and representativeness. When w > 0.5, uncertainty is weighted more heavily than representativeness when selecting an item; when w < 0.5, representativeness takes priority. In the extreme cases, w = 1 reduces the combination to a pure uncertainty-based selection method and w = 0 reduces it to a pure representativeness-based selection method. This combination framework has one unavoidable problem: the value of the weighting factor w is hard to determine. In different situations it is hard to decide whether uncertainty or representativeness should take priority, and the relative importance of the two criteria should also be adjusted dynamically during active item selection. To adjust w dynamically during item selection and select the currently most valuable item for the user to rate, an adaptive combination strategy is proposed, described in detail below:

(1) pre-specify a weight set W = {w₁, w₂, …, w_(n-1), w_n} of size |W| = n;

(2) in each item selection round, initialize the candidate item set I to the empty set;

(3) calculate the uncertainty uncertainty(x) and the representativeness rep(x) of each item x;

(4) for each value w_i (w_i ∈ W) in the weight set, rank the items by info(x) computed with w = w_i and select the top L candidate items, obtaining the current candidate item set I_i;

(5) the final candidate item set is then I = I₁ ∪ I₂ ∪ … ∪ I_(n-1) ∪ I_n;

(6) select the item with the highest information content from the candidate item set I, which amounts to selecting the optimal value of w.
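Steps (1) to (5) can be sketched in Python as follows. The function names `info` and `candidate_set`, the parameter `top_l`, and the score values are hypothetical; the scores are assumed to be non-negative and precomputed.

```python
def info(u, r, w):
    """Combined information content: uncertainty^w * rep^(1 - w)."""
    return (u ** w) * (r ** (1.0 - w))


def candidate_set(items, uncertainty, rep, weights, top_l):
    """Union of the top-L items under each weight in the weight set W."""
    candidates = set()
    for w in weights:
        ranked = sorted(
            items,
            key=lambda x: info(uncertainty[x], rep[x], w),
            reverse=True,
        )
        candidates.update(ranked[:top_l])   # I = I ∪ I_i
    return candidates


# Hypothetical precomputed scores and a three-value weight set.
uncertainty_scores = {"a": 0.8, "d": 1.5, "e": 0.4}
rep_scores = {"a": 2.0, "d": 0.3, "e": 1.0}
pool = candidate_set(
    ["a", "d", "e"], uncertainty_scores, rep_scores,
    weights=[0.0, 0.5, 1.0], top_l=1,
)
```

With w = 0 and w = 0.5 the top item is `a`, and with w = 1 it is `d`, so the pool is `{"a", "d"}`. Step (6) then chooses among the pooled items, which is how the weight is picked implicitly rather than fixed in advance.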

In collaborative filtering recommendation, the goal of active learning for item selection is to pick out the items whose ratings carry high information content, so as to better predict the user's preferences, i.e. to maximize user satisfaction. Following the idea of the expected error reduction strategy, an item selection method is designed that adaptively selects the optimal weight w to maximize user satisfaction. The basic idea is as follows. For each item x in the candidate item set I, the current collaborative filtering prediction model estimates the user's predicted rating of x. Each x, together with its predicted rating, is added in turn to the rated training set in simulation, and the model is retrained to obtain a new prediction model. The new model is used to estimate the user's ratings of the already rated items, and the item that minimizes the deviation between the user's true ratings and the predicted ratings is selected for the user to rate. User satisfaction maximization conforms to the basic principle of item-based collaborative filtering recommendation: if a user is interested in some item, the user can be expected to like other items similar to it. Selecting the item that minimizes the deviation between the user's true and predicted ratings selects the item that best reflects the user's preferences. The user satisfaction maximization method is described in detail below:

A prediction model θ is trained on user c's existing rating set T_c, c's predicted rating of item x is calculated according to θ, and the training set T_c is updated by adding x with its predicted rating.

A new prediction model is trained on the updated T_c and used to predict c's rating of each item t (t ∈ T_c) in the training set of rated items. The deviation ε(x) between the true ratings and the predicted ratings is then estimated,

where the predicted term denotes the rating of item t by c as predicted by the updated collaborative filtering model. The item that minimizes the deviation ε(x) best matches the user's preferences and is most satisfying to the user, i.e. it is the item with the highest information content. Under this user satisfaction maximization strategy, selecting the optimal weight w amounts to selecting, from the final candidate item set I, the item with the highest information content for the user to rate: the item x* is the item in I that minimizes ε(x).
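The selection by minimum deviation can be sketched as follows. Several choices here are assumptions rather than details from the text: ε(x) is taken as a sum of squared errors, each rated item t is held out when it is re-predicted so the model cannot simply reuse its known rating, and the predictor is a toy similarity-weighted average. The names `prediction_deviation`, `select_most_informative`, `toy_predict`, and `SIM`, and all numeric values, are hypothetical.

```python
def prediction_deviation(x, train, predict):
    """ε(x): squared error on the rated items after simulating the addition of x."""
    simulated = dict(train)
    simulated[x] = predict(train, x)      # add x with its predicted rating
    error = 0.0
    for t, true_rating in train.items():
        # Assumption: hold t out when re-predicting it from the updated set.
        held_out = {j: r for j, r in simulated.items() if j != t}
        error += (true_rating - predict(held_out, t)) ** 2
    return error


def select_most_informative(candidates, train, predict):
    """Return the candidate x* with the smallest deviation ε(x)."""
    return min(candidates, key=lambda x: prediction_deviation(x, train, predict))


# Hypothetical symmetric item-item similarities for the toy predictor.
SIM = {
    frozenset(pair): s
    for pair, s in {
        ("a", "k1"): 0.9, ("a", "k2"): 0.1,
        ("d", "k1"): 0.5, ("d", "k2"): 0.5,
        ("k1", "k2"): 0.2,
    }.items()
}


def toy_predict(train, item):
    """Similarity-weighted average of the rated items (toy stand-in for θ)."""
    weights = {j: SIM.get(frozenset((item, j)), 0.0) for j in train}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 3.0  # assumed neutral default on a 1-5 scale
    return sum(w * train[j] for j, w in weights.items()) / total_weight
```

A call such as `select_most_informative(["a", "d"], {"k1": 5, "k2": 1}, toy_predict)` evaluates each pooled candidate in turn and returns the one whose simulated addition leaves the rated items best explained.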

Thus, the uncertainty and representativeness of the items in the unrated item set are measured, and the user satisfaction maximization criterion then picks out the item that minimizes the prediction deviation on the new user's rated item set, i.e. the item with the highest information content, for the user to rate. After the user's rating is obtained, the rated item set, the unrated item set, and the collaborative filtering prediction model are updated, and the above process is iterated until a stopping criterion is reached (for example, the number of ratings provided by the new user reaches a given quantity).
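The outer iteration can be sketched as a simple loop. The function name `elicit_ratings` and the callables `choose_item` and `ask_user` are hypothetical placeholders: `choose_item` stands for the whole selection procedure described above, and `ask_user` for obtaining the user's actual rating. Retraining is implicit, since `choose_item` receives the updated training set on every round.

```python
def elicit_ratings(unrated, train, choose_item, ask_user, max_ratings):
    """Repeatedly select an item, obtain its rating, and update both sets.

    Stops when `max_ratings` ratings have been collected or no unrated
    items remain (one possible stopping criterion).
    """
    train = dict(train)          # rated items and their ratings
    unrated = set(unrated)       # items still without a rating
    asked = []
    while unrated and len(asked) < max_ratings:
        x = choose_item(unrated, train)   # most informative item this round
        train[x] = ask_user(x)            # the user's actual rating
        unrated.remove(x)
        asked.append(x)
    return train, asked
```

For example, with `choose_item=lambda items, t: min(items)` and `ask_user=lambda x: 4` as stand-ins, two rounds over the items `a`, `b`, `c` collect ratings for `a` and then `b`.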

Because the uncertainty and representativeness strategies studied in the present invention are measured directly from users' rating information, the prediction model θ uses a user-based collaborative filtering recommendation method. The similarity between users is measured with Pearson correlation similarity, and ratings are predicted with a weighted average method that accounts for differences in users' rating scales.
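A minimal sketch of such a user-based predictor follows. It uses one common variant, assumed here rather than taken from the text: Pearson correlation over co-rated items, and a mean-centered weighted average so that users with different rating scales are comparable. The function names, the dictionary layout, and the sample ratings are hypothetical.

```python
from math import sqrt


def pearson(ratings_u, ratings_v):
    """Pearson correlation of two users over their co-rated items."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    norm_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    norm_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    return cov / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(user, item, ratings):
    """Mean-centered, similarity-weighted prediction of user's rating of item.

    ratings: dict user -> {item: rating}; `user` must have at least one rating.
    """
    own = ratings[user]
    mean_user = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        s = pearson(own, theirs)
        mean_other = sum(theirs.values()) / len(theirs)
        num += s * (theirs[item] - mean_other)   # deviation from other's mean
        den += abs(s)
    return mean_user + num / den if den else mean_user


# Hypothetical ratings: u1 agrees with c, u2 rates in the opposite direction.
ratings = {
    "c": {"i1": 4, "i2": 2},
    "u1": {"i1": 5, "i2": 3, "i3": 5},
    "u2": {"i1": 1, "i2": 3, "i3": 1},
}
```

Subtracting each neighbor's own mean before weighting is what handles the rating-scale issue: a neighbor who rates everything high contributes only the amount by which the item exceeds that neighbor's usual level.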

In conclusion the present invention when selecting the highest project of information content and scoring to user, overcomes based on uncertain The deficiency of property items selection strategy, has considered the uncertainty and representativeness of project.Adaptive group frame, can optimal tune Whole uncertain and representative combination, ensures the candidate items picked out, is uncertain relative to current commending system, and And it is concentrated with higher representativeness in non-scoring item.Therefore, when the scoring item set for candidate items being added to user Afterwards, obtained updated collaborative filtering model can preferably predict the preference information of user, more accurate to provide to the user True recommendation.

If the function described in the present embodiment method is realized in the form of SFU software functional unit and as independent product pin It sells or in use, can be stored in a computing device read/write memory medium.Based on this understanding, the embodiment of the present invention The part of the part that contributes to existing technology or the technical solution can be expressed in the form of software products, this is soft Part product is stored in a storage medium, including some instructions are used so that computing device (can be personal computer, Server, mobile computing device or network equipment etc.) execute all or part of step of each embodiment the method for the present invention Suddenly.And storage medium above-mentioned includes：USB flash disk, read-only memory (ROM, Read-Only Memory), is deposited mobile hard disk at random The various media that can store program code such as access to memory (RAM, Random Access Memory), magnetic disc or CD.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.

The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An item selection method based on adaptive active learning, characterized by comprising:

Calculating the uncertainty of a candidate item according to the formula, wherein: R_cx denotes the rating given by user c to item x; the formula also uses the average rating of the user; and U_x(sim) denotes the set of users who are similar to the current new user and who have rated item x;

On the training set T_c, calculating the predicted rating y_cx(θ) of c for x according to the prediction model θ, estimating the probability p(U=c, R_cx=r) that c rates x as r, and taking r as the changed value of y_cx(θ), wherein

Updating the rating training set T_c by adding the changed predicted rating to the list of items already rated by c, thereby obtaining the new rating training set T_c,r = T_c ∪ {(x, r)};

On the rating training sets T_c and T_c,r, predicting, according to the prediction model θ, the ratings of c for the other unrated items x_i in the set of items not rated by c, thereby obtaining a predicted rating on each of the two corresponding training sets;

Under the probability p(U=c, R_cx=r) that the rating is r, estimating the influence of a change in the rating of the current candidate item x on the predicted ratings of the other items, the rating change being expressed as the square of the difference between the two predicted ratings, and calculating the representativeness rep(x) of the current candidate item x according to the formula, wherein: c denotes the current new user; x denotes the current candidate item; the formula refers to the set of items not rated by c, the set of items rated by c, and the remaining set of unrated items of c after x is removed; each item in the remaining set is denoted x_i; the training data set is the one corresponding to c; and R_cx denotes the rating of c for x.
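The representativeness computation described above can be sketched as follows. Because the formula itself is not reproduced in this text, the exact form used here (a probability-weighted sum of squared prediction changes, without any normalisation) is an assumption, as are the function name `representativeness` and the pluggable `predict` callable standing in for the prediction model θ.

```python
def representativeness(train, candidate, unrated, rating_probs, predict):
    """Expected squared change in predictions for the other unrated items.

    train        -- dict item -> rating, the training set T_c of user c
    candidate    -- the current candidate item x
    unrated      -- iterable of items not yet rated by c (may include x)
    rating_probs -- dict r -> p(U=c, R_cx=r)
    predict      -- hypothetical callable (train, item) -> predicted rating,
                    standing in for the prediction model theta
    """
    others = [i for i in unrated if i != candidate]
    # Predictions on the original training set T_c.
    base = {i: predict(train, i) for i in others}
    total = 0.0
    for r, p in rating_probs.items():
        augmented = dict(train)
        augmented[candidate] = r  # T_c,r = T_c ∪ {(x, r)}
        # Squared difference between predictions on T_c,r and on T_c.
        change = sum((predict(augmented, i) - base[i]) ** 2 for i in others)
        total += p * change
    return total
```

Under this reading, an item is representative when learning its rating would shift the predictions for many other unrated items.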

2. The method according to claim 1, characterized in that, after calculating the representativeness of the candidate items, the method further comprises:

Pre-specifying a weight set W = {w₁, w₂, …, w_n-1, w_n} of size |W| = n;

Setting the candidate item set I to the empty set;

Updating the candidate item set I = I ∪ I_i;

Training on the existing rating set T_c of user c to obtain the prediction model θ, calculating the predicted rating of c for item x according to θ, and updating the training set T_c.

3. The method according to claim 2, characterized in that said selecting the top L items for the current weight w_i, w_i ∈ W, is:

According to the uncertainty uncertainty(x) and the representativeness rep(x), calculating the combined information content info(x) of the item according to the formula info(x) = uncertainty(x)^w × rep(x)^(1-w).
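The combination formula info(x) = uncertainty(x)^w × rep(x)^(1-w) and the per-weight top-L selection can be illustrated with a short sketch. The function names `info` and `top_l_per_weight`, the dictionary layout of the scores, and the assumption that the scores are non-negative are introduced for this example only.

```python
def info(uncertainty, rep, w):
    """Combined information content: uncertainty^w * rep^(1 - w)."""
    return (uncertainty ** w) * (rep ** (1.0 - w))


def top_l_per_weight(scores, weights, l):
    """Build the candidate set I as the union of the top-l items per weight.

    scores  -- dict item -> (uncertainty, rep), both assumed non-negative
    weights -- the pre-specified weight set W, each weight in [0, 1]
    l       -- number of items kept for each weight
    """
    candidates = set()
    for w in weights:
        ranked = sorted(
            scores,
            key=lambda x: info(scores[x][0], scores[x][1], w),
            reverse=True,
        )
        candidates.update(ranked[:l])  # I = I ∪ I_i
    return candidates
```

With w = 1 the ranking depends on uncertainty alone, and with w = 0 on representativeness alone, so sweeping over W yields candidates that cover the range between the two criteria.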

4. The method according to claim 3, characterized in that said training on the existing rating set T_c of user c to obtain the prediction model θ, calculating the predicted rating of c for item x according to θ, and updating the training set T_c is:

Updating the training set T_c according to the formula.

5. The method according to claim 4, characterized in that said calculating the prediction error ε(x) corresponding to each item is:

Training a new prediction model θ_x on the updated T_c; based on θ_x, predicting the rating of c for each item t (t ∈ T_c) in the training set of rated items; and estimating, according to the formula, the deviation ε(x) between the true ratings and the predicted ratings, wherein the predicted term in the formula denotes the rating of c for item t predicted by the updated collaborative filtering model θ_x.

6. The method according to claim 5, characterized in that said selecting the item x^* with the most information content from the candidate item set I is:

Selecting the item x^* with the most information content according to the formula.
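The selection procedure of claims 4 to 6 can be sketched as follows. The formulas for ε(x) and for x^* are not reproduced in this text, so the sketch assumes that ε(x) is a sum of squared errors over the items already rated and that x^* is the candidate with the smallest ε(x). The function names and the `fit` callable, which stands in for training the prediction model, are likewise hypothetical.

```python
def prediction_error(train, model):
    """Sum of squared deviations between true and predicted ratings.

    train -- dict item -> true rating (the rated items of user c)
    model -- callable item -> predicted rating
    The squared-error form is an assumption of this sketch.
    """
    return sum((rating - model(item)) ** 2 for item, rating in train.items())


def select_most_informative(train, candidates, fit):
    """Pick the candidate whose addition yields the lowest prediction error.

    fit -- hypothetical callable: training dict -> model (item -> rating)
    Choosing the minimum of the error is an assumption of this sketch.
    """
    base_model = fit(train)  # prediction model theta trained on T_c
    best_item, best_error = None, float("inf")
    for x in candidates:
        updated = dict(train)
        updated[x] = base_model(x)  # add x with its predicted rating
        model_x = fit(updated)      # new prediction model theta_x
        error = prediction_error(train, model_x)
        if error < best_error:
            best_item, best_error = x, error
    return best_item
```

The function returns None when the candidate set is empty, and ties are resolved in favour of the candidate encountered first.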