CN110377841A

CN110377841A - Similarity calculation method and system applied to collaborative filtering recommendation

Info

Publication number: CN110377841A
Application number: CN201910478934.9A
Authority: CN
Inventors: 杨志明
Original assignee: Reflections On Artificial Intelligence Robot Technology (beijing) Co Ltd
Current assignee: Reflections On Artificial Intelligence Robot Technology (beijing) Co Ltd
Priority date: 2019-06-04
Filing date: 2019-06-04
Publication date: 2019-10-25
Anticipated expiration: 2039-06-04
Also published as: CN110377841B

Abstract

The invention discloses a similarity calculation method and system applied to collaborative filtering recommendation. The embodiments propose a collaborative filtering recommendation method that does not need users' rating data and instead relies on users' comment information. In particular, the similarity calculation step of existing collaborative filtering recommendation methods is improved so that the similarity model is built solely from users' comment information; in use, a user's comment information is input directly into the model to obtain the similarity result. In this way, similarity in a collaborative filtering recommendation method can be calculated without requiring users' rating data.

Description

Similarity calculation method and system applied to collaborative filtering recommendation

Technical field

The present invention relates to the field of computer technology, and in particular to a similarity calculation method and system applied to collaborative filtering recommendation.

Background art

With the rapid development of Internet technology, the Internet side makes personalized recommendations to users based on user data. Personalized recommendation relies on historical data about a user's preferences and behavior to provide the user with recommendation information of interest. A collaborative filtering recommendation method can be used to generate such recommendations.

Most current collaborative filtering recommendation methods build the recommendation model from users' explicit ratings of items; a user's rating information is then input into the recommendation model, which outputs the recommendation information.

A current collaborative filtering recommendation method includes the following steps:

Step 1: calculate the similarity between users

Many measures are currently used to calculate the similarity between users. Widely used ones include the Euclidean distance, cosine similarity, the Pearson correlation coefficient, and the Jaccard similarity coefficient. The Euclidean distance, cosine similarity, and the Pearson correlation coefficient all require users' ratings of items. The Jaccard similarity coefficient can calculate user similarity without ratings, because it considers only the number of items related to each user. Its calculation formula is:

$$Jaccard(u, v) = \frac{|I_{u,v}|}{|I_u \cup I_v|}$$

where Jaccard(u, v) denotes the similarity between user u and user v; I_u and I_v denote the sets of items related to user u and user v, respectively; and I_u,v denotes the intersection of the items related to user u and user v.
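As a minimal sketch of the Jaccard coefficient described above, in Python. The function name and the convention for two empty sets are assumptions made for illustration and are not part of the source.

```python
def jaccard(items_u: set, items_v: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = items_u | items_v
    if not union:
        # Assumption: two users with no related items get similarity 0.
        return 0.0
    return len(items_u & items_v) / len(union)


# Two users sharing 2 items out of 4 distinct items overall.
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))
```

No rating values appear anywhere in the computation, which is why this measure remains usable when only item sets are known.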

Step 2: obtain the K-nearest-neighbor set of the target user

Based on the user similarities calculated in step 1, select the K users with the highest similarity to the target user, that is, the K users most similar to the target user.
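The selection of the K most similar users can be sketched as follows. This is an illustrative Python fragment; the function name and the dictionary layout (user id mapped to similarity with the target user) are assumptions.

```python
import heapq


def k_nearest_neighbors(similarities: dict, k: int) -> list:
    """Return the k user ids with the largest similarity to the target user."""
    return heapq.nlargest(k, similarities, key=similarities.get)


sims = {"v1": 0.2, "v2": 0.9, "v3": 0.5}
print(k_nearest_neighbors(sims, 2))
```

If fewer than K users are available, `heapq.nlargest` simply returns all of them in descending order of similarity.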

Step 3: obtain the potential recommended item set of the target user

The potential recommended item set of the target user is obtained from the target user's K-nearest-neighbor set in three sub-steps: a. take the union of the items related to all users in the target user's K-nearest-neighbor set; b. delete from the union obtained in a all items related to the target user; c. the item set obtained in b is the target user's potential recommended item set.
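Sub-steps a to c amount to a set union followed by a set difference. A minimal Python sketch, with hypothetical names and a dictionary mapping each user id to that user's related item set:

```python
def potential_items(target: str, neighbors: list, user_items: dict) -> set:
    """Union of the neighbors' items minus the items of the target user."""
    union = set()
    for v in neighbors:                       # sub-step a: union over neighbors
        union |= user_items.get(v, set())
    return union - user_items.get(target, set())  # sub-steps b and c


user_items = {"u": {"a", "b"}, "v1": {"b", "c"}, "v2": {"c", "d"}}
print(sorted(potential_items("u", ["v1", "v2"], user_items)))
```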

Step 4: obtain the set of items recommended to the target user

For the potential recommended item set of the target user obtained in step 3, calculate the preference degree for each item in it. The calculation formula is:

where p_u,i denotes the preference of user u for item i; U_i denotes the set of users related to item i; U_u denotes the K-nearest-neighbor set of user u; s_u,v denotes the similarity between user u and user v; and r_vi denotes the rating of item i by user v.
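The formula itself is not available in this text. The sketch below therefore assumes the common user-based form, in which the preference is the sum of s_u,v · r_vi over the users v that are both among the K nearest neighbors of u and related to item i. That form, the function name, and the data layout are assumptions for illustration only.

```python
def preference(item, neighbor_sims, ratings):
    """Similarity-weighted sum of neighbor ratings for one item (assumed form).

    neighbor_sims: {v: s_u,v} for the K nearest neighbors of the target user u.
    ratings: {(v, item): r_vi}; a missing key means v is not related to the item.
    """
    return sum(
        s_uv * ratings[(v, item)]
        for v, s_uv in neighbor_sims.items()
        if (v, item) in ratings
    )


neighbor_sims = {"v1": 0.5, "v2": 0.25}
ratings = {("v1", "i"): 4, ("v2", "i"): 2, ("v3", "i"): 5}
print(preference("i", neighbor_sims, ratings))
```

In the usage example, user v3 rated the item but is not among the nearest neighbors, so that rating does not contribute. The dependence on r_vi is what makes this step unusable when no rating data exists.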

As can be seen, the entire flow of current collaborative filtering recommendation methods requires rating data actively provided by users, and collaborative filtering recommendation cannot be completed when no user rating data is available. However, an increasingly common situation is that the network side cannot obtain users' active rating data. For example, the network side offers no option for explicitly rating an item and provides only options such as commenting on, liking, or favoriting the item. In such cases current collaborative filtering recommendation methods cannot be used to recommend information to users.

Further, the calculation of similarity between users is the basis of collaborative filtering recommendation, and apart from the Jaccard similarity coefficient, the other calculation methods all require users' rating data.

Summary of the invention

In view of this, an embodiment of the present invention provides a similarity calculation method applied to collaborative filtering recommendation, which can calculate similarity in a collaborative filtering recommendation method without requiring users' rating data.

An embodiment of the present invention also provides a similarity calculation system applied to collaborative filtering recommendation, which can likewise calculate similarity in a collaborative filtering recommendation method without requiring users' rating data.

The embodiments of the present invention are implemented as follows:

A similarity calculation method applied to collaborative filtering recommendation, comprising:

modeling the similarity between users or between items based on users' comment information;

inputting the obtained comment information of users into the similarity model between users or between items to obtain the similarity result between users or the similarity result between items.

The comment information of a user includes the set of items related to the user and the obtained characteristic information of the user.

The modeling of similarity between users is based on: users' attention to items, the popularity of items, the number of non-commonly related items, and the number of commonly related items.

The formula used for modeling the similarity between users is:

where us_u,v denotes the similarity between user u and user v; I_u and I_v denote the sets of items commented on by user u and user v, respectively; I_u,v denotes the set of items commented on by both user u and user v; α_u (> 0) is the user similarity coefficient, set to 1; and β_u (> 0) is the user Jaccard coefficient, set to 0.5.

The modeling of similarity between items is based on: attention to items, the breadth of user interests, the number of non-commonly related users, and the number of commonly related users.

The formula used for modeling the similarity between items is:

where is_i,j denotes the similarity between item i and item j;

U_i and U_j denote the sets of users who commented on item i and item j, respectively;

U_i,j denotes the set of users who commented on both item i and item j;

α_i (> 0) is the item similarity coefficient, initially set to 1; β_i (> 0) is the item Jaccard coefficient, initially set to 0.5.

α_i and β_i are each updated in real time.

The method further comprises:

obtaining, based on the similarity results between users, a nearest-neighbor set of a set size for the user;

obtaining the potential recommended item set of the user from the related item sets of the users in that nearest-neighbor set;

inputting the obtained potential recommended item set of the user into a preset recommendation model to obtain the set of items recommended to the user.

The recommendation model is:

where candidateItem_u denotes the candidate recommended item set of target user u, and I_v denotes the item set related to user v;

a recommendation weight is calculated for each item in the candidate recommended item set of target user u, and the set number of items with the largest calculated weights are taken as the recommended items;

the calculation of the recommendation weight for each item includes:

p_u,i=mus_u,i·recognition_i·pml_u,i·ma_u,i·uic_u,i·heat_i

where p_u,i denotes the preference of user u for item i;

mus_u,i denotes the maximum user similarity of item i with respect to user u, U_i denotes the set of users who commented on item i, and s_u,v denotes the similarity between users u and v;

recognition_u,i denotes the degree of recognition of item i among the set number K of nearest neighbors of user u, and state_v,i is a status flag indicating whether item i is related to user v,

where I_v denotes the item set related to user v;

pml_u,i denotes the matching level between the label set of item i and the portrait set of user u, F_i denotes the attribute set of item i, and LUP_u denotes the implicit portrait set of user u;

ma_u,i denotes the maximum attention of item i with respect to user u; attention_v,i is calculated from the number of comments, where attention_u,i denotes the attention of user u to item i, noc_u,i denotes the number of comments by user u on item i, and k > 0 is the attention coefficient, set to 1;

where uic denotes the correlation between user u and item i, and s_i,j denotes the similarity between items i and j;

heat_i = noc_i, where heat_i denotes the total number of comments received by item i.

A similarity calculation system applied to collaborative filtering recommendation, comprising a model building module and a processing module, wherein

the model building module is configured to model the similarity between users or between items based on users' comment information;

the processing module is configured to input the obtained comment information of users into the similarity model between users or between items to obtain the similarity result between users or the similarity result between items.

As can be seen from the above, the embodiments of the present invention propose a collaborative filtering recommendation method that does not need users' rating data and instead relies on users' comment information. In particular, the similarity calculation step of existing collaborative filtering recommendation methods is improved so that the similarity model is built solely from users' comment information; in use, a user's comment information is input directly to obtain the similarity result. In this way, similarity in a collaborative filtering recommendation method can be calculated without requiring users' rating data.

Brief description of the drawings

Fig. 1 is a flowchart of a similarity calculation method applied to collaborative filtering recommendation provided by an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a similarity calculation system applied to collaborative filtering recommendation provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of the execution process of a collaborative filtering recommendation method provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described below with reference to the drawings and embodiments.

As the background art shows, the entire flow of existing collaborative filtering recommendation methods requires rating data actively provided by users, and collaborative filtering recommendation cannot be completed when no user rating data is available. Moreover, existing collaborative filtering recommendation methods do not specify how the similarity between users can be calculated without relying on users' rating data.

To overcome the above problem, the embodiments of the present invention propose a collaborative filtering recommendation method that does not need users' rating data and instead relies on users' comment information. In particular, the similarity calculation step of existing collaborative filtering recommendation methods is improved so that the similarity model is built solely from users' comment information; in use, a user's comment information is input directly to obtain the similarity result.

In this way, similarity in a collaborative filtering recommendation method can be calculated without requiring users' rating data.

In the embodiments of the present invention, the similarity calculation in the collaborative filtering recommendation method may target the similarity between users or the similarity between items.

The similarity calculation in the embodiments is based on user comment information rather than user rating data. The user comment information includes the set of items related to the user and the obtained characteristic information of the user. The characteristic information of a user can also be called the user's implicit portrait, and it can be obtained using natural language processing (NLP) techniques.

In the embodiments of the present invention, a user portrait (UP: User Profile) can be divided into an explicit user portrait (EUP: Explicit User Profile) and an implicit user portrait (LUP: Latent User Profile). In general, a user portrait refers to background information that reflects the user's individuality. The EUP in the embodiments refers to information the user provides, such as native place, age, gender, and/or taste, or information the user explicitly mentions, such as "spicy" in "I like spicy food" and "diabetes" in "I have diabetes". The characteristic information explicitly mentioned by a user is extracted from the user's comment information using NLP techniques. The LUP refers to portrait information the user has not explicitly mentioned. For example, if most of the recipes in a user's recipe history carry the "pregnant woman" label, it can be inferred that the user or someone in the user's household is pregnant, so the "pregnancy" label can be assigned to the user's LUP.

In the embodiments of the present invention, the LUP is obtained as follows:

1) Obtain the set of items related to the user:

The form of related items differs across the websites provided on the Internet side. On such a website, the related item set may consist of items the user has purchased, items in the user's shopping cart, or even items the user has browsed.

2) Obtain the implicit user portrait of the user

Count the information of all related items, such as item attribute information, and take the set of the N1 most frequent attributes as the user's LUP, where N1 is an algorithm parameter that needs to be determined from actual data.
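The two steps above can be sketched with a frequency count. This is an illustrative Python fragment; the function name, the argument names, and the mapping from item id to attribute labels are assumptions, and how ties between equally frequent attributes are broken is not specified by the source.

```python
from collections import Counter


def latent_user_profile(related_items, item_attributes, n1):
    """Return the n1 most frequent attributes among the user's related items."""
    counts = Counter()
    for item in related_items:
        # Each item contributes one count per attribute label it carries.
        counts.update(item_attributes.get(item, ()))
    return {attr for attr, _ in counts.most_common(n1)}


item_attributes = {
    "recipe1": {"pregnant woman", "soup"},
    "recipe2": {"pregnant woman", "spicy"},
    "recipe3": {"pregnant woman", "soup"},
}
print(latent_user_profile(["recipe1", "recipe2", "recipe3"], item_attributes, 2))
```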

Fig. 1 is a flowchart of a similarity calculation method applied to collaborative filtering recommendation provided by an embodiment of the present invention. The specific steps are:

Step 101: model the similarity between users or between items based on users' comment information;

Step 102: input the obtained comment information of users into the similarity model between users or between items to obtain the similarity result between users or the similarity result between items.

The following describes in detail how the similarity between users and the similarity between items are modeled.

Attention modeling

Attention refers to how much attention a user pays to a related item. The embodiments obtain the number of comments a user has made on an item from the user's comment information and model attention based on that number. The specific modeling form is:

where attention_u,i denotes the attention of user u to item i; noc_u,i denotes the number of comments by user u on item i; and k (> 0) is the attention coefficient, with a default value of 1. The actual value needs to be determined from specific data, because when k is too small the influence of the number of comments on preference intensity cannot be highlighted, and when k is too large the preference intensity of items with only one comment is severely underweighted.

Similarity modeling

Since the results of the similarity calculation will be applied to a collaborative filtering recommendation algorithm, the similarity relationship between users needs to be established, with the aim of finding the neighbor set whose preferences are similar to the target user's and thereby completing the collaborative filtering process.

In addition, the embodiments later use the similarity between an item and a user to weight the target user's candidate recommended item set. This item-user similarity is modeled on the similarity between the target item and each item the target user has commented on, so the similarity between items also needs to be measured.

In the background art, apart from user similarity calculation based on the Jaccard similarity, almost all methods require users' rating data for items. The embodiments instead model the similarity between users and between items more effectively based on users' comment information, as described separately below.

The similarity calculation between users and that between items have the same structure, each consisting of two parts: a main similarity part and a Jaccard factor part modeled on the Jaccard similarity.

Calculation of similarity between users

The similarity between users (us: user similarity) is modeled on users' comment information, specifically the information on the item sets related to the users, including the specific items, the number of items commented on, and the number of comments each item has received. The modeling considers the following four factors:

1) User attention to items: this factor considers the influence on user similarity of two users' attention to the same item. The higher the attention two different users both pay to the same item, the more similar the two users are to some extent. The specific form is: attention_u,i·attention_v,i.

2) Item popularity: this factor considers the influence of item popularity on user similarity. Item popularity is characterized by the number of users related to the item. The more related users an item has, the more popular it is; in other words, the item is liked by the general public and therefore cannot highlight the similarity between two users. In the specific form, us is negatively correlated with |U_i|, where U_i denotes the set of users related to item i and k₁ (> 0) is a parameter.

3) Number of non-commonly related items: this factor considers the influence of the number of non-commonly related items on user similarity. Among all items related to at least one of the two users, the fewer the commonly related items (that is, the more non-commonly related items), the smaller the overlap of the two users' preferences to some extent, and the lower the similarity the two users are likely to have; conversely, the similarity is higher. The specific form is modeled on the Jaccard similarity coefficient, that is, |I_u,v|/|I_u ∪ I_v|.

4) Number of commonly related items: this factor considers the influence of the number of items related to both users on the overall user similarity. Clearly, the more commonly related items two users have, the more likely they are to have a higher similarity. This is embodied by summing the per-item user similarity over the commonly related items.

The overall modeling form of us is as follows:

Define an intermediate quantity; then:

where us_u,v denotes the similarity between users u and v; I_u and I_v denote the sets of items commented on by users u and v, respectively; I_u,v denotes the set of items commented on by both u and v; α_u (> 0) is the user similarity coefficient, with a default value of 1; and β_u (> 0) is the user Jaccard coefficient, with a default value of 0.5. The actual values of α_u and β_u need to be determined from experimental results on real data.

Calculation of similarity between items

The similarity between items (is: item similarity) is modeled on the comment information items have received, including the specific commenting users, the number of commenting users, and the number of items each commenting user has commented on. The modeling form is the same as that of user similarity and likewise considers four factors:

1) Item attention: this factor considers the influence on item similarity of the same user's attention to two different items. The higher the attention the same user pays to both items, the more similar the two items are to some extent. The specific form is attention_u,i·attention_u,j.

2) Breadth of user interest: this factor considers the influence on item similarity of how broad the interests are of a user who is interested in both items. The more items a user has commented on, the broader the user's interests are to some extent. That is, even if a user has commented on both items, they may be just two of the many items the user has commented on, which does not show that the user has a special preference for these two items. In other words, such a user cannot highlight the similarity between the two items. Conversely, if a user has commented on only these two items, this indicates to some extent that the two items are similar in some dimension. In the specific form, is is negatively correlated with |I_u|, where I_u denotes the item set related to user u and k₂ (> 0) is a parameter.

3) Number of non-commonly related users: this factor considers the influence of the number of non-commonly related users on item similarity. Among all users related to at least one of the two items, the fewer the commonly related users (that is, the more non-commonly related users), the smaller the overlap of the two items' attributes to some extent, and the lower the similarity the two items are likely to have; conversely, the similarity is higher. The specific form is modeled on the Jaccard similarity coefficient, that is, |U_i,j|/|U_i ∪ U_j|.

4) Number of commonly related users: this factor considers the influence of the number of users related to both items on the overall item similarity. Clearly, the more commonly related users two items have, the more likely they are to have a higher similarity. This is embodied by summing the per-user item similarity over the commonly related users.

The overall modeling form of is is as follows:

Define an intermediate quantity; then:

where is_i,j denotes the similarity between items i and j; U_i and U_j denote the sets of users who commented on items i and j, respectively; U_i,j denotes the set of users who commented on both i and j; α_i (> 0) is the item similarity coefficient, with a default value of 1; and β_i (> 0) is the item Jaccard coefficient, with a default value of 0.5. The actual values of α_i and β_i need to be determined from experimental results on real data.

In the embodiments of the present invention, the user similarity calculation can be applied to a collaborative filtering recommendation method, with the other steps of the method carried out as described in the background art.

Fig. 2 is a schematic structural diagram of a similarity calculation system applied to collaborative filtering recommendation provided by an embodiment of the present invention. The system comprises a model building module and a processing module.

The similarity results between users provided by the embodiments can be applied to a collaborative filtering recommendation method to calculate the final recommended item set. The recommendation method of the background art can be used, or the recommendation process shown in Fig. 3 can be used, as described below.

As shown in Fig. 3, the specific steps are:

Step 1: obtain the neighbor set of the target user

The neighbor set of the target user is the set of users who have related items in common with the target user. It is obtained as follows:

where u denotes the target user; neighbor_u denotes the neighbor set of u; I_u denotes the item set related to u; and U_i denotes the set of users related to item i.
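Following the textual definition, the neighbor set can be sketched as the union of U_i over the items i in I_u. In this illustrative Python fragment the function name and the two dictionaries are hypothetical, and excluding the target user from the result is an assumption the source does not state.

```python
def neighbor_set(target, user_items, item_users):
    """Users sharing at least one related item with the target user."""
    neighbors = set()
    for item in user_items.get(target, set()):
        neighbors |= item_users.get(item, set())
    # Assumption: the target user is not counted as their own neighbor.
    neighbors.discard(target)
    return neighbors


user_items = {"u": {"a", "b"}}
item_users = {"a": {"u", "v1"}, "b": {"u", "v2"}, "c": {"v3"}}
print(sorted(neighbor_set("u", user_items, item_users)))
```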

Step 2: calculate the similarity between the target user and all neighbor users

Using the user similarity formula, calculate the similarity between each user in neighbor_u and the target user u.

Step 3: obtain the set of the K nearest neighbors of the target user

Sort the users in neighbor_u in descending order of the user similarity calculated in step 2, and take the K users with the highest similarity as the K-nearest-neighbor set of target user u, where K is an algorithm parameter that needs to be determined for the specific data set.

Step 4: obtain the candidate recommended item set of the target user

The candidate recommended items of target user u are obtained from the K-nearest-neighbor set as follows:

where candidateItem_u denotes the candidate recommended item set of target user u, and I_v denotes the item set related to user v.

Step 5: calculate the recommendation weights of the candidate recommended items

Initially, the recommendation weights of all items in candidateItem_u are equal, at the default value 1. This step assigns each item in candidateItem_u a differentiated weight, that is, it weights the default weight 1. The specific weighting method is described below.

Step 6: generate the recommended item list of the target user

Sort the items in candidateItem_u in descending order of the recommendation weights calculated in step 5; the N2 items with the largest weights form the recommended item list of the target user.
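Step 6 is a descending sort followed by truncation to N2 items. A minimal Python sketch, where the function name and the dictionary of item weights are assumptions for illustration:

```python
def recommended_list(weights, n2):
    """Return the n2 items with the largest recommendation weight, best first."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:n2]]


weights = {"a": 0.3, "b": 2.0, "c": 1.1}
print(recommended_list(weights, 2))
```

Python's sort is stable, so items with equal weights keep their original relative order; the source does not say how such ties should be resolved.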

To better recommend items to the target user, the embodiments need to find in candidateItem_u the items that best match the user's preference: the greater the user's preference for an item, the higher the item's recommendation weight. To this end, the embodiments design six factors to measure a user's preference for an item, comprising five basic factors and one extension factor. The five basic factors are: maximum user similarity (mus: maximum user similarity), item recognition (recognition), portrait matching level (pml: portrait matching level), maximum attention (ma: maximum attention), and user-item correlation (uic: user-item correlation). The extension factor is item heat (heat).

The specific modeling of the preference is as follows:

p_u,i=mus_u,i·recognition_i·pml_u,i·ma_u,i·uic_u,i·heat_i (4)

where p_u,i denotes the preference of user u for item i.
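Formula (4) multiplies the six factors together. As a sketch in Python, assuming the six factor values have already been computed and are passed in a dictionary keyed by the factor names (the function name and this layout are hypothetical):

```python
from math import prod

# Factor names as they appear in formula (4).
FACTORS = ("mus", "recognition", "pml", "ma", "uic", "heat")


def recommendation_weight(factors):
    """Product of the six factor values for one (user, item) pair."""
    return prod(factors[name] for name in FACTORS)


example = {"mus": 0.5, "recognition": 2, "pml": 1, "ma": 0.5, "uic": 0.5, "heat": 8}
print(recommendation_weight(example))
```

Because the form is a pure product, any factor equal to zero (for example, no overlap between the item's attributes and the user's portrait) drives the whole weight to zero.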

The five basic factors and the extension factor are described in detail below.

Maximum user similarity

Maximum user similarity (mus) refers to the maximum of the similarities between the target user and the users related to the recommended item. Specifically, the similarity of the neighbor most similar to the target user is used as the final mus weighting factor. The reason is that each item in the candidate set may correspond to multiple neighbor users and therefore to multiple user similarities. The specific formula for mus is:

$$mus_{u,i} = \max_{v \in U_i} s_{u,v}$$

where mus_u,i denotes the maximum user similarity of item i with respect to user u; U_i denotes the set of users who commented on item i; and s_u,v denotes the similarity between user u and user v.
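A minimal Python sketch of this factor, taking the maximum of s_u,v over the users who commented on the item. The function name, the callable `sim`, skipping the target user, and returning 0.0 when no other commenter exists are assumptions for illustration.

```python
def max_user_similarity(target, commenters, sim):
    """Largest similarity between the target user and the item's commenters."""
    others = [v for v in commenters if v != target]
    # Assumption: with no other commenters the factor defaults to 0.0.
    return max((sim(target, v) for v in others), default=0.0)


similarity_to_u = {"v1": 0.2, "v2": 0.7}
print(max_user_similarity("u", {"v1", "v2"}, lambda u, v: similarity_to_u[v]))
```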

Item recognition

For candidateItem_u, each item may be associated with more than one of the K nearest neighbors: it is related to at least one of them and, at most, to all of them. The recognition of an item therefore refers to the number of users in the K-nearest-neighbor set who are related to the recommended item, that is, the number of times the item appears while candidateItem_u is generated. Clearly, items associated with more of the K nearest neighbors should receive a higher recommendation weight. The specific formula for recognition is:

where recognition_u,i denotes the recognition of item i among the K nearest neighbors of user u, and state_v,i is a status flag indicating whether item i is related to user v, calculated as:

$$state_{v,i} = \begin{cases} 1, & i \in I_v \\ 0, & i \notin I_v \end{cases}$$

where I_v denotes the item set related to user v.
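Following the textual definition (the number of K nearest neighbors related to the item), recognition can be sketched as a sum of status flags. The function names and the dictionary layout below are hypothetical.

```python
def state(item, user, user_items):
    """Status flag: 1 if the item is related to the user, otherwise 0."""
    return 1 if item in user_items.get(user, set()) else 0


def recognition(item, k_nearest, user_items):
    """Number of the K nearest neighbors that are related to the item."""
    return sum(state(item, v, user_items) for v in k_nearest)


user_items = {"v1": {"a", "b"}, "v2": {"b"}, "v3": {"c"}}
print(recognition("b", ["v1", "v2", "v3"], user_items))
```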

Portrait matching is horizontal

Portrait matching degree (pml) refers to: the intersection size of attribute set and the implicit user portrait (LUP) of article.It is aobvious and It is clear to, those are higher (the intersection size of attribute set and the implicit user portrait of article is bigger) with target user's portrait matching degree Article ought to obtain higher recommendation weight.Specific pml calculation formula is as follows:

where pml_u,i denotes the matching level between the tag set of article i and the portrait set of user u; F_i denotes the attribute set of article i; and LUP_u denotes the implicit portrait set of user u.
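Since the portrait matching level is described as an intersection size, it can be sketched with Python sets. The function name and the sample attribute values are assumptions for illustration only; whether the embodiment normalizes this count is not stated here, so the sketch returns the raw size.

```python
def portrait_matching_level(article_attributes, user_portrait):
    """Size of the intersection between the article's attribute set F_i
    and the user's implicit portrait set LUP_u."""
    return len(set(article_attributes) & set(user_portrait))


# Hypothetical attribute set and implicit user portrait.
pml = portrait_matching_level({"sci-fi", "action", "3d"}, {"action", "3d", "comedy"})
```

The two sets share "action" and "3d", giving a matching level of 2.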

Maximum attention degree

Maximum attention degree (ma) is the maximum attention that the neighbor users of the target user pay to the recommended article. When a user comments on an article repeatedly, this shows how much attention the user pays to that article, which in turn means that the article better matches the user's preferences; it should therefore receive a higher weight in recommendation. The formula for ma is as follows:

where ma_u,i denotes the maximum attention degree of article i with respect to user u; attention_v,i is calculated by formula (1).
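A minimal sketch of the maximum attention degree is given below. Formula (1) for attention_v,i is not restated in this passage, so the sketch takes precomputed attention values as input rather than guessing their form. The function name, the dictionary keyed by (user, article) pairs, and the default of 0.0 for a neighbor with no recorded attention are all assumptions.

```python
def max_attention(article, neighbors, attention):
    """Largest attention paid to `article` by any nearest neighbor.

    `attention` maps (user, article) to a precomputed attention_v,i value;
    a missing pair is treated as 0.0 (assumption)."""
    values = [attention.get((v, article), 0.0) for v in neighbors]
    return max(values, default=0.0)


# Hypothetical precomputed attention values.
attention = {("a", "x"): 0.5, ("b", "x"): 0.75, ("b", "y"): 0.25}
ma = max_attention("x", ["a", "b", "c"], attention)
```

Neighbor "c" has no recorded attention for "x" and contributes 0.0, so the maximum comes from neighbor "b".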

User-article correlation

Here, correlation (uic) is the degree of correlation between a user and an article, modeled on article similarity: it is the average of the similarities between the target article i and all articles related to the target user u. The model is as follows:

where uic_u,i denotes the correlation between user u and article i; and s_i,j denotes the similarity between article i and article j.
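The averaging described for the user-article correlation can be sketched as follows. The sketch assumes that article similarities are available in a dictionary keyed by (i, j) pairs and that a user with no related articles yields a correlation of 0.0; both the layout and the fallback are assumptions for this example.

```python
def user_item_correlation(article, user_articles, item_similarity):
    """Average similarity between `article` and every article in the
    user's related article set I_u."""
    if not user_articles:
        return 0.0  # assumed fallback for a user with no related articles
    total = sum(item_similarity[(article, j)] for j in user_articles)
    return total / len(user_articles)


# Hypothetical article similarities s_i,j between candidate "i" and the user's articles.
item_sims = {("i", "a"): 0.25, ("i", "b"): 0.75}
uic = user_item_correlation("i", ["a", "b"], item_sims)
```

With the two sample similarities the average is 0.5.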

Additional weighting factor

Many factors can reflect the popularity (heat) of an article, such as the number of comments, favorites, or hits it receives. Different systems may choose different popularity factors; here it is provisionally taken to be the total number of comments the article receives.

heat_i=noc_i (11)

where heat_i, the popularity of article i, is set to noc_i, the total number of comments that article i has received.
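With the six factors defined, the recommendation weight stated in claim 9 as the product of mus, recognition, pml, ma, uic, and heat can be sketched in Python, together with the selection of the highest-weighted articles. The function names, the dictionary of precomputed factor values, and the sample numbers are hypothetical; the sketch only illustrates the product and the ranking step.

```python
from math import prod

FACTOR_NAMES = ("mus", "recognition", "pml", "ma", "uic", "heat")


def recommendation_weight(factors):
    """Product of the six weighting factors for one candidate article."""
    return prod(factors[name] for name in FACTOR_NAMES)


def top_n(candidates, n):
    """Return the n candidate articles with the largest recommendation weight."""
    ranked = sorted(candidates,
                    key=lambda item: recommendation_weight(candidates[item]),
                    reverse=True)
    return ranked[:n]


# Hypothetical precomputed factors for two candidate articles.
candidates = {
    "x": {"mus": 0.5, "recognition": 2, "pml": 1, "ma": 1.0, "uic": 0.5, "heat": 4},
    "y": {"mus": 0.5, "recognition": 1, "pml": 1, "ma": 1.0, "uic": 0.5, "heat": 4},
}
best = top_n(candidates, 1)
```

Because the weight is a pure product, any factor equal to zero removes the article from contention; a system that wants a softer effect would need smoothing, which is outside what is described here.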

In addition, so that the model can be personalized for different data, the embodiment of the present invention provides several parameters: the number of implicit user portraits N1, the attention coefficient k, the user and article similarity coefficients α_u and α_i, the user and article Jaccard coefficients β_u and β_i, and the number of nearest neighbors K. The scale of users, the scale of articles, and the scale of article attributes differ from system to system, and these parameters allow the model to fit such differences better.

The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A similarity calculation method applied in a collaborative filtering recommendation method, characterized by comprising:

inputting acquired comment information of users into a similarity model between users or between articles, to obtain a similarity result between users or a similarity result between articles.

2. The method according to claim 1, characterized in that the comment information of a user includes the set of articles related to the user and acquired user feature information.

3. The method according to claim 1, characterized in that the similarity modeling between users is based on: the user's attention to articles, the popularity of articles, the number of non-commonly related articles, and the number of commonly related articles.

4. The method according to claim 1 or 3, characterized in that the formula used for the similarity modeling between users is:

where us_u,v denotes the similarity between user u and user v; I_u and I_v denote the sets of articles commented on by user u and user v, respectively; I_u,v denotes the set of articles commented on by both user u and user v; α_u (> 0) is the user similarity coefficient, set to 1; and β_u (> 0) is the user Jaccard coefficient, set to 0.5.

5. The method according to claim 1, characterized in that the similarity modeling between articles is based on: the attention paid to articles, the popularity of user interest, the number of non-commonly related users, and the number of commonly related users.

6. The method according to claim 1 or 5, characterized in that the formula used for the similarity modeling between articles is:

where is_i,j denotes the similarity between article i and article j;

U_i,j denotes the set of users who commented on both article i and article j.

7. The method according to claim 4 or 6, characterized in that α_i and β_i are each updated in real time.

8. The method according to claim 1, characterized in that the method further comprises:

obtaining a potential recommended article set of a user from the recommended article sets of a set number of nearest neighbors of the user;

inputting the obtained potential recommended article set of the user into a set recommendation object model, to obtain the set of articles recommended to the user.

9. The method according to claim 8, characterized in that the recommendation object model is:

where candidateItem_u denotes the candidate recommended article set of target user u; I_v denotes the set of articles related to user v;

a recommendation weight is set for each article in the candidate recommended article set of target user u, and the set number of articles with the largest calculated weights are taken as the recommended article result;

p_u,i = mus_u,i · recognition_u,i · pml_u,i · ma_u,i · uic_u,i · heat_i

mus_u,i denotes the maximum user similarity of article i with respect to user u, U_i denotes the set of users who commented on article i, and s_u,v denotes the similarity between users u and v;

recognition_u,i denotes the degree to which the set number K of nearest neighbors of user u recognize article i, and state_v,i is a status flag indicating whether article i is related to user v,

where I_v denotes the set of articles related to user v;

pml_u,i denotes the matching level between the tag set of article i and the portrait set of user u, F_i denotes the attribute set of article i, and LUP_u denotes the implicit portrait set of user u;

ma_u,i denotes the maximum attention degree of article i with respect to user u; attention_v,i is calculated from the following quantities: attention_u,i denotes the attention of user u to article i; noc_u,i denotes the number of comments by user u on article i; k > 0 is the attention coefficient, set to 1;

heat_i = noc_i, where heat_i denotes the total number of comments received by article i.

10. A similarity calculation system applied in a collaborative filtering recommendation method, characterized by comprising a model building module and a processing module, wherein

the model building module is configured to perform similarity modeling between users or between articles based on the comment information of users;

the processing module is configured to input acquired comment information of users into the similarity model between users or between articles, to obtain a similarity result between users or a similarity result between articles.