CN102959539B - Item recommendation method and system for intersecting services - Google Patents

Item recommendation method and system for intersecting services

Info

Publication number
CN102959539B
Authority
CN
China
Prior art keywords
project
user
business
digital media
electronic commerce
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201180001057.8A
Other languages
Chinese (zh)
Other versions
CN102959539A (en
Inventor
杜家春
汪芳山
钟杰萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN102959539A publication Critical patent/CN102959539A/en
Application granted granted Critical
Publication of CN102959539B publication Critical patent/CN102959539B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

An embodiment of the invention discloses an item recommendation method and system for intersecting digital media or e-commerce services. The method comprises: obtaining, through a computer network interface, the identifier of the digital media or e-commerce service that a target user is currently using and the identifier of the target user, and obtaining pre-stored service source data from a memory according to the service identifier; generating a candidate recommendation item set for the target user; obtaining a predicted rating for each candidate recommendation item in the set; extracting, according to the predicted ratings, the qualifying candidate recommendation items to generate a final recommendation list for the target user; and sending the final recommendation list to the target user's client for display. The method or system of the embodiment can reduce the processing time of item recommendation and thereby improve recommendation efficiency.

Description

Item recommendation method and system for intersecting services
Technical field
The present invention relates to the field of communications and Internet technology, and in particular to an item recommendation method and system for intersecting digital media or e-commerce services.
Background
With the spread and rapid development of Internet technology, the Internet now holds a vast amount of data. When browsing web pages, users want to see only the data that interests them or that they need, yet in practice a great deal of irrelevant data appears alongside it. As a result, the amount of data keeps growing while its utilization keeps falling, a phenomenon known as information overload.
To prevent or reduce the impact of information overload on the interaction between users and the Internet, data can be analyzed and processed before it is shown to the user. Personalized recommendation, for example, reduces information overload by recommending resources that match a user's interests or needs. Personalized recommendation is now widely applied in fields such as e-commerce, digital libraries, music, video and news, and a single application may itself comprise several services. In the music field, for example, telecom operators offer ring-back tone, ringtone and full-track services; some items (here, pieces of music) are shared among these services, and their users may overlap as well. On an e-commerce website, similarly, each seller can be regarded as a service: sellers may offer the same goods and share the same customers. This partial overlap of items and/or users among multiple services within one application is called service intersection.
A traditional personalized recommendation technique is collaborative filtering, which makes recommendations based on user-item rating data. A rating expresses how much a user likes an item in a service: the higher the rating, the more the user likes the item. Ratings can be obtained explicitly, for example when a user rates an item directly, and/or implicitly. The sparsity of the data affects the final recommendation quality. In most applications a user rates only a small fraction of a large item set, so many ratings are missing and the user-item rating data is sparse.
In the prior art, one way to address the sparsity of user-item rating data is to fill in the user's ratings for unrated items, for example by setting them to the midpoint of the rating scale or to the user's average rating. Such default values are subjective and usually differ considerably from the actual ratings. Another way is to predict the user's ratings for unrated items with a recommendation algorithm. Because the prediction model is itself built on the original sparse data, however, this approach cannot guarantee that the resulting ratings are accurate or valid.
Moreover, the prior-art methods must obtain the user-item rating data dynamically each time a recommendation is made and only then compute the recommendation, which makes recommendation inefficient. And because the sparsity problem is not well solved and the data quality is poor, data that should not be shown to the user ends up being shown, which lowers the validity and accuracy of the recommendations.
Summary of the invention
Embodiments of the present invention provide an item recommendation method and system for intersecting digital media or e-commerce services, so as to address, in the context of practical applications, the sparsity of user-item rating data in the prior art and to reduce the processing time of item recommendation, thereby improving recommendation efficiency.
To solve the above technical problem, an embodiment of the invention provides an item recommendation method for intersecting digital media or e-commerce services, the method comprising:
Obtaining, through a computer network interface, the identifier of the digital media or e-commerce service that a target user is using and the identifier of the target user, and obtaining pre-stored service source data from a memory according to the service identifier;
Generating a candidate recommendation item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service the target user is using, and the service source data;
Obtaining a predicted rating for each candidate recommendation item in the candidate recommendation item set, at least according to the user similarity and/or item similarity in the service source data;
Extracting, according to the predicted ratings of the candidate recommendation items, the qualifying candidate recommendation items to generate a final recommendation list for the target user;
Sending, by the digital media or e-commerce service server, the final recommendation list to the target user's client for display.
An embodiment of the invention also provides an item recommendation system for intersecting digital media or e-commerce services, the system comprising:
An identifier obtaining unit, configured to obtain, through a computer network interface, the identifier of the digital media or e-commerce service that a target user is using and the identifier of the target user;
A service source data obtaining unit, configured to obtain pre-stored service source data from a memory according to the service identifier;
A candidate set generating unit, configured to generate a candidate recommendation item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service the target user is using, and the service source data;
A predicted rating obtaining unit, configured to obtain a predicted rating for each candidate recommendation item in the candidate recommendation item set, at least according to the user similarity and/or item similarity in the service source data;
A final list generating unit, configured to extract, according to the predicted ratings of the candidate recommendation items, the qualifying candidate recommendation items to generate a final recommendation list for the target user;
A display unit, configured to send the final recommendation list to the target user's client for display.
The embodiments of the present invention have the following advantages:
The item recommendation method disclosed in the embodiments obtains pre-stored user similarity and item similarity from a memory, so that the data needed for recommendation is available directly; this reduces the processing time of item recommendation and improves its efficiency. Furthermore, before the user similarity and item similarity are stored, they are computed from the mapped user-item rating data selected for the service concerned, which improves the validity and accuracy of the recommendation results. The embodiments therefore address the sparsity of user-item rating data well, reduce the processing time of item recommendation and thus improve its efficiency, and, by improving the validity and authenticity of the user-item rating data, increase the validity and accuracy of online recommendation results. Of course, an implementation of any one embodiment disclosed herein need not achieve all of the above effects at the same time.
Brief description of the drawings
To describe the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of obtaining the user similarity and/or item similarity in the item recommendation method for intersecting digital media or e-commerce services according to the invention;
Fig. 2 is a flowchart of step 102 in the flowchart shown in Fig. 1;
Fig. 3 is a flowchart of step 202 in the flowchart shown in Fig. 2;
Fig. 4 is a flowchart of step 304 in the flowchart shown in Fig. 3;
Fig. 5 is a flowchart of an embodiment of the item recommendation method for intersecting digital media or e-commerce services according to the invention;
Fig. 6 is a flowchart of another embodiment of the item recommendation method for intersecting digital media or e-commerce services according to the invention;
Fig. 7 is a flowchart of a further embodiment of the item recommendation method for intersecting digital media or e-commerce services;
Fig. 8 is a structural diagram of an embodiment for obtaining the user similarity and/or item similarity for intersecting digital media or e-commerce services according to the invention;
Fig. 9 is a structural diagram of the integration unit 802 in the embodiment shown in Fig. 8;
Fig. 10 is a structural diagram of the second matching subunit 902 in the integration unit 802 shown in Fig. 9;
Fig. 11 is a structural diagram of the service matching subunit 1004 in the second matching subunit 902 shown in Fig. 10;
Fig. 12 is a structural diagram of the item recommendation system for intersecting digital media or e-commerce services according to the invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art obtains from them without creative effort fall within the scope of protection of the invention.
The personalized recommendation technology mentioned in the background is first introduced in more detail. The most widely used personalized recommendation technique is collaborative filtering, which makes personalized recommendations for users based on user-item rating data. Suppose the rating of user $U_i$ for item $I_j$ is $r_{ij}$. This rating can be obtained explicitly (the user rates the item) or implicitly (it is computed by a scoring function built from the user's searching, browsing and/or purchasing behavior toward the item). Ratings are usually restricted to integers within a certain range, and a higher rating means the user likes the item more.
The most common form of collaborative filtering is memory-based, which comprises user-based and item-based collaborative filtering. The basic principle of user-based collaborative filtering is to use the similarity of users' ratings to recommend to each user the items that similar users may find interesting. For an active user U, for example, the system uses U's rating records and a chosen similarity function to find the k users whose rating behavior is closest to U's; these form U's nearest-neighbor set. The items that U's neighbors have rated but U has not form the candidate recommendation set. The system then computes U's predicted rating for each item i in the candidate set and takes the N items with the highest predicted ratings as U's top-N recommendation set. Item-based collaborative filtering instead compares the similarity between items and generates the candidate recommendation set from items similar to those the active user has already rated.
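The user-based procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix a similarity function or a prediction formula, so cosine similarity and a similarity-weighted average are assumed here, and the names `cosine_similarity` and `recommend_top_n` are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (assumed similarity function)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_top_n(ratings, target, k=2, n=2):
    """ratings: {user: {item: rating}}. Returns the top-n (item, predicted rating) pairs."""
    # Nearest-neighbour set: the k users most similar to the target user.
    neighbours = sorted(
        ((cosine_similarity(ratings[target], r), u)
         for u, r in ratings.items() if u != target),
        reverse=True,
    )[:k]
    # Candidate set: items the neighbours rated that the target has not rated.
    weighted, total = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue
        for item, score in ratings[user].items():
            if item in ratings[target]:
                continue
            weighted[item] = weighted.get(item, 0.0) + sim * score
            total[item] = total.get(item, 0.0) + sim
    # Predicted rating: similarity-weighted average of the neighbours' ratings.
    predicted = {item: weighted[item] / total[item] for item in weighted}
    return sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

Other similarity functions (for example Pearson correlation) would slot into the same structure without changing the neighbor, candidate and top-N steps.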
The item recommendation method of the invention consists of an offline part (which can be understood as the back-end system) and an online part (which can be understood as the front-end system). Item recommendation itself is performed in the online part, whereas the service source data used by the method is computed in the offline part and stored in a memory. The source data should include at least the user similarity and/or item similarity, so that the online system can make recommendations from them. The method for obtaining the user similarity and/or item similarity can be executed by a processor, which stores the results in the memory. The item recommendation method can be executed by a digital media or e-commerce server functionally connected to that processor; the server can exchange data with users through a human-computer interaction interface and can obtain and share original source data or service source data through a computer network interface.
To help those skilled in the art understand how the user similarity and/or item similarity stored in the memory are obtained, Fig. 1 shows a flowchart of the method for obtaining them in an embodiment of the invention. The method can comprise the following steps:
Step 101: obtain original source data of multiple different digital media or e-commerce services, the original source data comprising the initial user-item rating data of those services.
In practice, the original source data should include at least the initial user-item rating data of the multiple services. Optionally, it can also include the user attribute data and item attribute data of the services, call detail records, and so on.
The digital media or e-commerce services include, but are not limited to, music, application download, online bookstores, electronic reading, games and/or online shopping. User-item rating data is the data on how users of each service rate items in the course of using that service. Call detail records are records of calls between users over a period of time; SMS detail records, instant messaging records, e-mail records and similar data that reflect contact between users can also be used.
Step 102: according to the matching results for all users and items across the multiple digital media or e-commerce services and the standardization result of their initial user-item rating data, integrate the initial user-item rating data of the services into unified user-item rating data that covers all users and items of those services.
The original source data can also include the user attribute data and item attribute data of the multiple services. Referring to Fig. 2, step 102 can be implemented through the following steps:
Step 201: according to the initial user identifiers, initial user attributes and initial user attribute values in the user attribute data, match users to obtain the actual users that are unique across the multiple digital media or e-commerce services. An initial user identifier denotes a user who is unique within one service; an initial user attribute value denotes a user who is unique across all of the services.
The user attribute data describes the attributes of each user in each service, and the item attribute data describes the attributes of each item in each service. In the music field, for example, the user attribute data of each service includes the user's telephone number and the like, and the item attribute data of each service includes the music name, singer, genre, album name, release date, region, language, duration and/or format.
Step 201 thus performs the user matching referred to in step 102. User matching means determining which users in the different sets of user attribute data are the same user. It can be done by associating the user attribute values of each service with the user identifiers, where the user attribute value uniquely identifies the user; the association then reveals which users in the different services are the same person.
For example, suppose the initial user identifier is "lawyer" in service A and "Zhang San" in service B, but the contact number for both identifiers, that is, the user attribute value, is "1380000000". Then "lawyer" in service A and "Zhang San" in service B are the same person.
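The example above amounts to grouping per-service identifiers by a shared attribute value. A minimal sketch, assuming each service exposes a mapping from initial user identifier to a uniquely identifying attribute value such as a phone number (the function name `match_users` and the data layout are assumptions made for illustration):

```python
def match_users(services):
    """Group per-service user identifiers by their unique attribute value.

    services: {service_name: {initial_user_id: unique_attribute_value}}
    Returns {unique_attribute_value: [(service_name, initial_user_id), ...]},
    where each key stands for one actual user.
    """
    actual_users = {}
    for service, users in services.items():
        for user_id, attr_value in users.items():
            actual_users.setdefault(attr_value, []).append((service, user_id))
    return actual_users


services = {
    "A": {"lawyer": "1380000000"},
    "B": {"Zhang San": "1380000000", "Li Si": "1390000000"},
}
matched = match_users(services)
```

Here `matched["1380000000"]` lists both "lawyer" in service A and "Zhang San" in service B, so they are treated as one actual user; "Li Si" is an invented extra user who matches nobody.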
Step 202: according to the initial item identifiers, initial item attributes and initial item attribute values in the item attribute data, match items to obtain the actual items that are unique across the multiple digital media or e-commerce services. An initial item identifier denotes an item that is unique within one service.
Step 202 thus performs the item matching referred to in step 102. Item matching means determining which items in the different services are the same item.
Referring to Fig. 3, step 202 can be implemented through the following steps:
Step 301: use the initial item attributes of the multiple digital media or e-commerce services to match attributes and obtain the distinct actual item attributes across those services.
In this step, a name-matching relation between initial item attributes is used to match the attribute names that appear in the item attribute data of the various services, yielding the actual item attributes across the services. The name-matching relation can be established in advance for each field, or it can be built with manual involvement. During attribute name matching, for example, "music name" and "song title" can be treated as the same item attribute, as can two differently worded labels for the singer.
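Attribute name matching of this kind can be sketched with a pre-built synonym table. The table contents, the canonical names and the function name `normalize_attributes` below are assumptions made for illustration; the text only says that the matching relation is prepared in advance or with manual help.

```python
# Hypothetical name-matching relation: variant attribute name -> canonical name.
NAME_MAP = {
    "song title": "music name",
    "track name": "music name",
}


def normalize_attributes(item_attrs, name_map=NAME_MAP):
    """Rename an item's attributes to canonical names so services can be compared."""
    normalized = {}
    for name, value in item_attrs.items():
        key = name.strip().lower()
        normalized[name_map.get(key, key)] = value
    return normalized


item_in_service_b = {"Song Title": "Song X", "singer": "S1"}
canonical = normalize_attributes(item_in_service_b)
```

After normalization the item carries "music name" instead of "Song Title", so its attribute set can be compared directly with items from a service that already uses "music name".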
Step 302: from the distinct actual item attributes, obtain the item attribute set overlap between each pair of digital media or e-commerce services, and the mean item attribute set overlap of each service with the other services.
Let T be the number of distinct attribute names obtained after item attribute name matching. The item attribute set overlap between two digital media or e-commerce services is then calculated by formula (1):
$$C(S_i, S_j) = \frac{\left|\{Attr_t \mid Attr_t \in S_i\} \cap \{Attr_t \mid Attr_t \in S_j\}\right|}{\left|\{Attr_t \mid Attr_t \in S_i\} \cup \{Attr_t \mid Attr_t \in S_j\}\right|}, \quad 1 \le t \le T \tag{1}$$
Here K is the total number of services, $S_i$ and $S_j$ ($1 \le i \ne j \le K$) denote the i-th and j-th services respectively, $\{Attr_t \mid Attr_t \in S_i\}$ denotes the set of item attributes in the i-th service, and $Attr_t$ denotes the t-th attribute.
The mean item attribute set overlap of the i-th service with the other services is then calculated by formula (2):
$$\overline{C_{S_i}} = \frac{1}{K} \sum_{j=1}^{K} C(S_i, S_j) \tag{2}$$
Step 303: sort the multiple digital media or e-commerce services by their mean item attribute set overlap.
After sorting, services with a larger mean item attribute set overlap come first and services with a smaller mean come later.
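Formulas (1) and (2) and the ordering of step 303 can be sketched as below. Formula (1) is the Jaccard index of two attribute sets. One assumption is made loudly here: the sum in formula (2) is taken over the other services only (j ≠ i) and divided by K; including the self term would add the same constant 1/K to every service and so would not change the ordering. The function names are hypothetical.

```python
def attribute_overlap(attrs_i, attrs_j):
    """Formula (1): |intersection| / |union| of two item attribute sets."""
    union = attrs_i | attrs_j
    return len(attrs_i & attrs_j) / len(union) if union else 0.0


def mean_overlap(services, name):
    """Formula (2), assuming the sum runs over the other services (j != i)."""
    k = len(services)
    return sum(
        attribute_overlap(services[name], attrs)
        for other, attrs in services.items()
        if other != name
    ) / k


def sort_services(services):
    """Step 303: services with a larger mean overlap come first."""
    return sorted(services, key=lambda s: mean_overlap(services, s), reverse=True)


services = {
    "A": {"music name", "singer", "genre"},
    "B": {"music name", "singer"},
    "C": {"music name", "singer", "genre", "album"},
}
order = sort_services(services)
```

With these illustrative attribute sets, C(A,B) = 2/3, C(A,C) = 3/4 and C(B,C) = 1/2, so service A has the largest mean and is matched first.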
Step 304: following the order of the sorted digital media or e-commerce services, take the first service as the current service and perform the service matching flow, which comprises determining the matching items between the current service and the other services and then deleting the current service.
The first service in the sorted order is selected. Following the order of its item identifiers, each item of this first service is matched in turn against the other services to find its matching items, that is, the items in the other services that are the same item. Once every item of the first service has been matched, the first service is deleted and matching starts on the items of the second service, and so on until all services have been matched.
Referring to Fig. 4, the matching items between the current service and the other services in step 304 can be determined as follows:
Step 401: following the order of the initial item identifiers in the current service, select the first item as the current item and perform the item matching flow.
Among the items of the current digital media or e-commerce service (that is, the first service), the first item in initial item identifier order is selected as the current item for the item matching flow. Specifically, the item matching flow can comprise:
Sub-step 4011: use the initial item attribute values to calculate the item matching degree between the current item and each item in the other services;
The item matching degree can be calculated from the initial item attribute values using formula (3):
$$M(I_i, I_j) = \sum_{t=1}^{T} w_t \cdot \delta_{Attr_t(I_i),\, Attr_t(I_j)} \tag{3}$$
Here $I_i$ and $I_j$ denote two items, $w_t$ ($1 \le t \le T$) is the weight of the t-th attribute, and $Attr_t(I_i)$ denotes the value of the t-th attribute of item $I_i$. The function $\delta$ equals 1 when the values of the t-th attribute of $I_i$ and $I_j$ both exist and are equal, and 0 otherwise.
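Formula (3) is a weighted count of the attributes on which two items agree. A minimal sketch, assuming items are dictionaries of attribute values and that a missing attribute is represented by an absent key or `None` (the weights and the function name `matching_degree` are illustrative assumptions):

```python
def matching_degree(item_i, item_j, weights):
    """Formula (3): sum the weights of attributes that exist in both items and are equal."""
    return sum(
        weight
        for attr, weight in weights.items()
        if item_i.get(attr) is not None and item_i.get(attr) == item_j.get(attr)
    )


weights = {"music name": 0.5, "singer": 0.3, "album": 0.2}
item_a = {"music name": "Song X", "singer": "S1", "album": "Album 1"}
item_b = {"music name": "Song X", "singer": "S1"}  # album value missing
degree = matching_degree(item_a, item_b, weights)
```

In this example the name and singer agree while the album is missing on one side, so only the first two weights contribute to the matching degree.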
Sub-step 4012: for each of the other services, select the item matching degrees that satisfy a preset threshold condition, forming multiple item matching degree sets;
For each of the other services, all item matching degrees that are not less than the predetermined threshold are retained, forming multiple item matching degree sets. The threshold is service-specific; its value lies between 0 and 1 and may differ from service to service.
Sub-step 4013: in each item matching degree set, select the item with the highest matching degree as the matching item of the current item;
Sub-step 4014: record the matching relation between the current item and its matching items, and delete those matching items from the digital media or e-commerce services that contain them;
Sub-step 4015: delete the current item;
Sub-step 4016: determine whether the item set of this digital media or e-commerce service is empty; if so, end; otherwise perform step 402.
Step 402: take the second item of the first service as the current item and perform the item matching flow, and so on until the first service contains no more items.
After the first item of the first service has been matched, the second item becomes the current item and the item matching flow is performed again, until all items of the first service have been matched.
Step 305: take the second service as the current service and perform the service matching flow, and so on until no services remain in the sorted order; then obtain, from the matching items and the item matching relations, the actual items that are unique across the multiple digital media or e-commerce services.
Once all items of the first service have been matched, the second service becomes the current service and the service matching flow is performed, until all items of all digital media or e-commerce services have been matched.
When all items of all services have been matched, the actual items that are unique across the services are determined from the matching items and the item matching relations.
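Steps 304 and 305, together with sub-steps 4011 to 4016, describe a greedy matching loop. The sketch below is one possible reading under stated assumptions: services arrive already sorted as in step 303, each group of matched items is taken to be one actual item, and ties in matching degree are broken by whichever item is encountered first. The function names and data layout are hypothetical.

```python
def matching_degree(item_i, item_j, weights):
    """Formula (3): sum the weights of attributes present and equal in both items."""
    return sum(
        w for attr, w in weights.items()
        if item_i.get(attr) is not None and item_i.get(attr) == item_j.get(attr)
    )


def match_items(ordered_services, weights, thresholds):
    """Greedy cross-service item matching.

    ordered_services: list of (service_name, {item_id: attributes}), already
        sorted by descending mean attribute set overlap (step 303).
    thresholds: {service_name: minimum matching degree accepted for that service}.
    Returns a list of groups; each group lists the (service, item_id) pairs
    treated as one actual item.
    """
    remaining = {name: dict(items) for name, items in ordered_services}
    order = [name for name, _ in ordered_services]
    groups = []
    for pos, current in enumerate(order):
        # Items are visited in initial item identifier order (step 401).
        for item_id in sorted(remaining[current]):
            attrs = remaining[current][item_id]
            group = [(current, item_id)]
            for other in order[pos + 1:]:
                # Sub-step 4011: matching degree against every remaining item.
                scored = [
                    (matching_degree(attrs, o_attrs, weights), o_id)
                    for o_id, o_attrs in remaining[other].items()
                ]
                # Sub-step 4012: keep degrees not less than the service threshold.
                kept = [s for s in scored if s[0] >= thresholds[other]]
                if kept:
                    # Sub-steps 4013 and 4014: take the best match, record it,
                    # and delete it from the other service.
                    _, best_id = max(kept, key=lambda s: s[0])
                    group.append((other, best_id))
                    del remaining[other][best_id]
            groups.append(group)
        # All items of the current service are processed; delete the service.
        remaining[current].clear()
    return groups
```

Because matched items are removed from the later services, each initial item ends up in exactly one group, and an item with no match forms a group of its own.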
Step 203: according to the initial user-item rating data of the multiple digital media or e-commerce services, the rating ranges of those services and the minimum values of those rating ranges, obtain the standardization result of the initial user-item rating data of the services.
After the actual items have been determined, the rating ranges in the user-item rating data of the various services are standardized, and the standardized user-item rating data of each service is calculated. The standardization result is calculated by:
$$r_{ij}^{(k)\prime} = \left[ \frac{r_{ij}^{(k)} - \min\left(rate^{(k)}\right)}{range^{(k)}} \cdot \min_{1 \le k \le K}\left(range^{(k)}\right) + \min\left(rate^{\left(\arg\min_{1 \le k \le K}\left(range^{(k)}\right)\right)}\right) + 0.5 \right] \tag{4}$$
Here K is the total number of services; $r_{ij}^{(k)\prime}$ ($1 \le k \le K$) is the standardized rating of user $U_i$ for item $I_j$ in the k-th service; $r_{ij}^{(k)}$ ($1 \le k \le K$) is the original rating of user $U_i$ for item $I_j$ in the k-th service; $range^{(k)}$ ($1 \le k \le K$) is the rating range of the k-th service; and $\min(rate^{(k)})$ ($1 \le k \le K$) is the minimum value of the rating range of the k-th service.
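Formula (4) rescales every service's ratings onto the scale of the service with the narrowest rating range. The sketch below rests on two assumptions that the text does not state outright: the rating range of a service is its maximum minus its minimum, and the square brackets denote taking the integer part, so that adding 0.5 rounds to the nearest integer. The scale values and the function name `standardize` are illustrative.

```python
from math import floor


def standardize(score, service, scales):
    """Formula (4): map a rating onto the scale of the narrowest-range service.

    scales: {service_name: (min_rating, max_rating)}, with max > min assumed.
    """
    lo, hi = scales[service]
    # Reference service: the one with the smallest rating range.
    ref = min(scales, key=lambda s: scales[s][1] - scales[s][0])
    ref_lo, ref_hi = scales[ref]
    rescaled = (score - lo) / (hi - lo) * (ref_hi - ref_lo) + ref_lo
    # Assumed reading of the brackets: integer part after adding 0.5.
    return floor(rescaled + 0.5)


scales = {"A": (1, 5), "B": (1, 10)}
standardized_b = [standardize(s, "B", scales) for s in (1, 7, 10)]
```

With these scales, service A (range 4) is the reference, so a 7 out of 10 in service B becomes (7 − 1)/9 · 4 + 1 ≈ 3.67, which rounds to 4.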
Step 204: according to described actual user, actual items and standardization result, integrate the user-project score data of described multiple Digital Media or electronic commerce affair, generate unified user-project score data, described unified user-project score data comprises the user-project score data after the integration of all users and project in described multiple Digital Media or electronic commerce affair.
The user-project score data of the various Digital Media or electronic commerce affairs is integrated on the basis of the user matching, the project matching and the standardized user-project score data, so as to generate unified user-project score data. In the original user-project score data, users are duplicated across businesses, and so are projects, so the original user-project score data must be integrated. In the generated unified user-project score data, every user is an actual user and every project is an actual item; therefore, the same user has only one score value for the same project.
In this step, the unique score $r_{ij}$ of actual user $U_i$ for actual item $I_j$ can be obtained using any one of formulas (5), (6) or (7):
$$ r_{ij} = \max_{1 \le k \le K}\left(r_{ij}^{(k)}\right) \qquad (5) $$
there is at least one (6)
Wherein, $\alpha_{ik}$ (1≤k≤K) in formula (6) is the preference weight of user $U_i$ for the k-th business; it can be preset to the number of times user $U_i$ has scored in the k-th business, or to the length of time user $U_i$ has used the k-th business, and so on. It should be noted that formula (7) can be adopted when the attribute data includes call detail records. $NB(U_i)$ in formula (7) is the set of contacts of user $U_i$ within a period of time, which can be obtained from call detail records, and also from short message detail records, instant messaging records, e-mail communication records and the like. $\beta_{is}$ is the closeness between user $U_i$ and user $U_s$, which can be preset to the contact frequency or the contact duration between user $U_i$ and user $U_s$ within a period of time.
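As an illustration of the integration by formula (5) only, the following Python sketch merges standardized per-business scores into unified scores by taking the maximum. The function name, the dictionary layouts and the sample scores are hypothetical; formulas (6) and (7) are not covered.

```python
def integrate_max(service_scores, user_map, item_map):
    """Merge per-business scores into unified scores using formula (5).

    service_scores: {business: {(user, item): standardized score}}
    user_map / item_map: {(business, local id): unified id}
    """
    unified = {}
    for service, scores in service_scores.items():
        for (user, item), r in scores.items():
            key = (user_map[(service, user)], item_map[(service, item)])
            # Formula (5): keep the maximum score over all businesses.
            unified[key] = max(r, unified.get(key, r))
    return unified

# Hypothetical scores: the same actual user scored the same actual item twice.
service_scores = {
    "S1": {("U1", "I1"): 4},
    "S2": {("U3", "I1"): 5, ("U3", "I2"): 2},
}
user_map = {("S1", "U1"): "U'1", ("S2", "U3"): "U'1"}
item_map = {("S1", "I1"): "I'1", ("S2", "I1"): "I'1", ("S2", "I2"): "I'5"}
print(integrate_max(service_scores, user_map, item_map))
```

Because the keys of the result are pairs of unified identifiers, each actual user has at most one score for each actual item, as the integration step requires.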
Step 103: described unification user-project score data is mapped to described multiple Digital Media or electronic commerce affair successively, generates the user-project score data after described multiple Digital Media or electronic commerce affair mapping.
After obtaining unified user-project score data, need again unified user-project score data to be mapped to various Digital Media or electronic commerce affair successively, to generate the user-project score data after various Digital Media or electronic commerce affair mapping.
For each kind of business, mode one: all score data corresponding to all projects that the business comprises can be extracted from the unified user-project score data, forming the user-project score data related to that business after mapping;
Mode two: all score data corresponding to all users that the business comprises can also be extracted from the unified user-project score data, forming the user-project score data related to that business after mapping;
Mode three: all score data corresponding to all projects and all users that the business comprises can also be extracted from the unified user-project score data, forming the user-project score data related to that business after mapping.
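The three mapping modes can be sketched in Python as filters over the unified score data. The function name and data layout are hypothetical, and the reading of mode three as requiring both the user and the project to belong to the business is an assumption, since the text does not say whether the two conditions are combined by "and" or by "or".

```python
def map_to_service(unified, service_users, service_items, mode):
    """Extract the part of the unified scores that relates to one business.

    unified: {(user, item): score}; service_users and service_items are sets
    of unified identifiers that belong to the business. mode is 1, 2 or 3.
    """
    def keep(user, item):
        if mode == 1:   # all scores of the projects of the business
            return item in service_items
        if mode == 2:   # all scores of the users of the business
            return user in service_users
        if mode == 3:   # assumed reading: both conditions must hold
            return user in service_users and item in service_items
        raise ValueError("mode must be 1, 2 or 3")
    return {key: r for key, r in unified.items() if keep(*key)}

# Hypothetical unified scores and a business with one user and one project.
unified = {("U'1", "I'1"): 5, ("U'1", "I'5"): 2, ("U'6", "I'1"): 3}
print(map_to_service(unified, {"U'1"}, {"I'1"}, mode=1))
```

Mode one enriches a business with scores given by users of other businesses to its projects, which is what makes the mapped data denser than the original data.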
Step 104: according to the item similarity in the user's similarity in the user after described multiple Digital Media or electronic commerce affair mapping-project score data acquisition same business between different user and/or same business between disparity items.
Specifically, the user similarity between two different users is calculated according to the set of projects that the two users have both scored and the sets of projects that each of the two users has scored, in the same business, in the user-project score data after described mapping; and/or,
The item similarity between two different projects is calculated according to the set of users who have scored both projects and the sets of users who have scored each of the two projects, in the same business, in the user-project score data after described mapping.
It should be noted that, at step 104, it is possible to calculate only the user similarity between different users in the same business, only the item similarity between different projects in the same business, or both. Specifically, the cosine similarity of formula (8) can be adopted to calculate the user similarity between different users in the same business:
$$ \mathrm{sim}(U_i, U_j) = \frac{\sum_{I_t \in X_{U_i,U_j}} r_{it} \cdot r_{jt}}{\sqrt{\sum_{I_t \in X_{U_i}} r_{it}^2} \cdot \sqrt{\sum_{I_t \in X_{U_j}} r_{jt}^2}} \qquad (8) $$
Wherein, $X_{U_i,U_j}$ represents the set of projects that user $U_i$ and user $U_j$ have both scored, and $X_{U_i}$ represents the set of projects that user $U_i$ has scored.
The cosine similarity of formula (9) is adopted to calculate the item similarity between different projects in the same business:
$$ \mathrm{sim}(I_i, I_j) = \frac{\sum_{U_t \in X_{I_i,I_j}} r_{ti} \cdot r_{tj}}{\sqrt{\sum_{U_t \in X_{I_i}} r_{ti}^2} \cdot \sqrt{\sum_{U_t \in X_{I_j}} r_{tj}^2}} \qquad (9) $$
Wherein, $X_{I_i,I_j}$ represents the set of users who have scored both project $I_i$ and project $I_j$, and $X_{I_i}$ represents the set of users who have scored project $I_i$.
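The item similarity of formula (9) can be sketched in Python as follows. The function name and the layout of `scores` are hypothetical, and the sample scores are made up for illustration. The square roots in the denominator follow the usual definition of cosine similarity.

```python
import math

def item_similarity(scores, item_a, item_b):
    """Cosine similarity between two projects of one business, as in formula (9).

    scores: {(user, item): score} for a single business after mapping.
    """
    by_a = {u: r for (u, i), r in scores.items() if i == item_a}
    by_b = {u: r for (u, i), r in scores.items() if i == item_b}
    # Numerator: only users who scored both projects contribute.
    common = by_a.keys() & by_b.keys()
    num = sum(by_a[u] * by_b[u] for u in common)
    # Denominator: norms over all users who scored each project separately.
    den = math.sqrt(sum(r * r for r in by_a.values())) * \
          math.sqrt(sum(r * r for r in by_b.values()))
    return num / den if den else 0.0

# Hypothetical scores: u1 scored both projects, u2 scored only project "a".
scores = {("u1", "a"): 3, ("u1", "b"): 4, ("u2", "a"): 4}
print(item_similarity(scores, "a", "b"))
```

The user similarity of formula (8) has the same shape with the roles of users and projects exchanged.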
Step 105: described user's similarity and/or item similarity are stored in described storer.
After the user similarity and/or item similarity is obtained, it can first be stored in the storer, because it is needed when project recommendation is carried out. When a project later needs to be recommended to a user, the required user similarity and/or item similarity can be read directly from the storer. This directly provides service source data for the project recommendation performed subsequently, shortens the recommendation time, and thereby improves the efficiency of project recommendation.
Further, in the acquisition method of user similarity and/or item similarity disclosed in Fig. 1, the user similarity or item similarity is calculated after the user-project score data has been standardized, integrated and mapped. This not only provides data for project recommendation; by selecting the mapped user-project score data of the business in question and the corresponding user similarity and/or item similarity, the validity and accuracy of project recommendation can also be improved.
To help those skilled in the art better understand how the user similarity and/or item similarity is obtained, a concrete example of obtaining user similarity and/or item similarity is given with reference to Fig. 5; the method may comprise the following steps:
Step 501: the original source data obtaining multiple Digital Media or electronic commerce affair, described original source data comprises: the initial user-project score data of multiple Digital Media or electronic commerce affair.
Suppose there are 3 kinds of business in the music field, denoted S1, S2 and S3 respectively. In business S1, suppose the scoring score range is 1-5; the user-project score data in business S1 is shown in table 1:
Table 1
A position without data in table 1 indicates that the user has not scored the corresponding project, i.e. the user-project score data does not exist.
The UAD in business S1 is shown in table 2:
Table 2
| User ID | Telephone number |
|---|---|
| U1 | 134******** |
| U2 | 134******** |
| U3 | 138******** |
| U4 | 158******** |
| U5 | 137******** |
The item attribute data in business S1 is shown in table 3:
Table 3
| Project label | Music name | Singer |
|---|---|---|
| I1 | There is you on the way | Open schoolmate |
| I2 | In order to like that dream throughout one's life | Wang Jie |
| I3 | The Desert Is Be Lonely | Zhou Chuanxiong |
| I4 | Perfect interaction | Wang Lihong |
In business S2, the scoring score range is 1-10; the user-project score data in business S2 is shown in table 4:
Table 4
The UAD in business S2 is shown in table 5:
Table 5
| User ID | Telephone number |
|---|---|
| U1 | 138******** |
| U2 | 138******** |
| U3 | 134******** |
| U4 | 137******** |
| U5 | 150******** |
| U6 | 139******** |
The item attribute data in business S2 is shown in table 6:
Table 6
| Project label | Song title | Singer | Album name |
|---|---|---|---|
| I1 | There is you on the way | Open schoolmate | B&W |
| I2 | Do not cease to be faithful for one | Open schoolmate | Do not cease to be faithful for one |
| I3 | The Desert Is Be Lonely | Zhou Chuanxiong | Legend under starry sky |
| I4 | One cuts plum | Take Yuqin | It is big |
| I5 | Daphne odera | Zhou Jielun | Daphne odera |
In business S3, the scoring score range is 1-5; the user-project score data in business S3 is shown in table 7:
Table 7
The UAD in business S3 is shown in table 8:
Table 8
| User ID | Telephone number |
|---|---|
| U1 | 139******** |
| U2 | 137******** |
| U3 | 134******** |
| U4 | 138******** |
| U5 | 138******** |
| U6 | 150******** |
| U7 | 137******** |
The item attribute data in business S3 is shown in table 9:
Table 9
| Project label | Song title | Singer | School (genre) | Languages |
|---|---|---|---|---|
| I1 | Perfect interaction | Wang Lihong | Popular | Mandarin |
| I2 | One cuts plum | Take Yuqin | Classical old song | Mandarin |
| I3 | GOOD BYE,AUTUMN | Wang Qiang | Popular | Mandarin |
| I4 | Daphne odera | Zhou Jielun | Popular | Mandarin |
| I5 | Garden party | Zhou Jielun | Popular | Mandarin |
Step 502: according to initial user mark, initial user attribute and initial user property value in described UAD, coupling obtains actual user unique between multiple Digital Media or electronic commerce affair.
The user matching process across the multiple businesses is carried out first. According to the content of table 2, table 5 and table 8, users with the same telephone number are treated as the same user; all the user matching relationship data obtained for the 3 kinds of business is shown in table 10:
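Table 10
| User ID assigned after matching | Business S1 | Business S2 | Business S3 |
|---|---|---|---|
| U′1 | U1 | U3 | - |
| U′2 | U2 | - | U3 |
| U′3 | U3 | U2 | - |
| U′4 | U4 | - | - |
| U′5 | U5 | U4 | U2 |
| U′6 | - | U1 | U4 |
| U′7 | - | U5 | U6 |
| U′8 | - | U6 | U1 |
| U′9 | - | - | U5 |
| U′10 | - | - | U7 |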
The first column of table 10 represents the actual user mark, unique across all businesses, that is reassigned after user matching; it also represents the user in the unified user-project score data. Taking the second row of table 10 as an example of the user matching relationship: this row shows that user U1 of business S1 and user U3 of business S2 are the same user, represented by U′1 in the unified user-project score data.
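The user matching by telephone number can be sketched in Python as follows. The function name is hypothetical, and because the telephone numbers in the tables are masked, the full numbers below are made up for illustration. Unified identifiers are numbered in order of first appearance, which need not reproduce the numbering of table 10.

```python
def match_users(uad_by_service):
    """Assign one unified user mark per distinct telephone number.

    uad_by_service: {business: {local user id: telephone number}}
    Returns {(business, local user id): unified user mark}.
    """
    unified_of_phone = {}
    mapping = {}
    for service, users in uad_by_service.items():
        for local_id, phone in users.items():
            if phone not in unified_of_phone:
                # First time this number is seen: create a new actual user.
                unified_of_phone[phone] = f"U'{len(unified_of_phone) + 1}"
            mapping[(service, local_id)] = unified_of_phone[phone]
    return mapping

# Hypothetical (unmasked) numbers: S1/U1 and S2/U3 share one number.
uad = {
    "S1": {"U1": "13400000001"},
    "S2": {"U1": "13800000002", "U3": "13400000001"},
}
print(match_users(uad))
```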
Step 503: according to the initial item identification in described item attribute data, initial project attribute and initial project property value, coupling obtains actual items unique between described multiple Digital Media or electronic commerce affair.
The first task in this step is attribute name matching. In this example, the item attributes "music name" and "song title" refer to the same item attribute, and "singer" in each business likewise refers to the same item attribute. The item attribute collection registration average (that is, the average overlap of item attribute sets) of each business, obtained by formulas (1) and (2), is shown in table 11:
Table 11
| Service identification | Item attribute collection registration average |
|---|---|
| S1 | 0.58 |
| S2 | 0.53 |
| S3 | 0.45 |
Suppose that, in the project matching process, the weights of music name, singer, album name, school (genre) and languages are 0.5, 0.3, 0.1, 0.05 and 0.05 respectively, and that the project matching degree thresholds for business S1 and business S2, business S1 and business S3, and business S2 and business S3 are 0.8, 0.7 and 0.7 respectively. The project matching relationship obtained in this example is then as shown in table 12:
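Table 12
| Project label assigned after matching | S1 | S2 | S3 |
|---|---|---|---|
| I′1 | I1 | I1 | - |
| I′2 | I2 | - | - |
| I′3 | I3 | I3 | - |
| I′4 | I4 | - | I1 |
| I′5 | - | I2 | - |
| I′6 | - | I4 | I2 |
| I′7 | - | I5 | I4 |
| I′8 | - | - | I3 |
| I′9 | - | - | I5 |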
The first column of table 12 represents the actual item mark, unique across all businesses, that is reassigned after project matching; it also represents the project in the unified user-project score data. Taking the second row of table 12 as an example of the project matching relationship: this row shows that project I1 of business S1 and project I1 of business S2 are the same project, represented by I′1 in the unified user-project score data.
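The project matching between two businesses can be sketched in Python. The matching degree formula itself is not given in this part of the description, so the sketch assumes, purely for illustration, that the matching degree is the sum of the weights of the attributes that both projects have with equal values. The names `WEIGHTS`, `matching_degree` and `match_items` are hypothetical; the weights and the threshold are those of this example.

```python
WEIGHTS = {"name": 0.5, "singer": 0.3, "album": 0.1, "genre": 0.05, "language": 0.05}

def matching_degree(a, b):
    """Assumed measure: sum the weights of attributes with equal values in both."""
    return sum(w for attr, w in WEIGHTS.items()
               if attr in a and attr in b and a[attr] == b[attr])

def match_items(current, other, threshold):
    """For each project of `current`, pick the best unmatched project in `other`."""
    remaining = dict(other)
    matches = {}
    for item_id, attrs in current.items():
        scored = [(matching_degree(attrs, o), oid) for oid, o in remaining.items()]
        # Small tolerance so that 0.5 + 0.3 is accepted against a 0.8 threshold.
        scored = [s for s in scored if s[0] + 1e-9 >= threshold]
        if scored:
            best = max(scored)[1]
            matches[item_id] = best
            del remaining[best]  # a matched project cannot be matched again
    return matches

# Two projects of S1 (table 3) and two projects of S2 (table 6).
s1 = {"I3": {"name": "The Desert Is Be Lonely", "singer": "Zhou Chuanxiong"},
      "I4": {"name": "Perfect interaction", "singer": "Wang Lihong"}}
s2 = {"I3": {"name": "The Desert Is Be Lonely", "singer": "Zhou Chuanxiong",
             "album": "Legend under starry sky"},
      "I5": {"name": "Daphne odera", "singer": "Zhou Jielun", "album": "Daphne odera"}}
print(match_items(s1, s2, threshold=0.8))
```

With this assumed measure, two projects that agree on name and singer reach 0.5 + 0.3 = 0.8, which just meets the S1/S2 threshold of the example, while project I4 of S1 finds no match in S2.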
Step 504: according to the initial user-project score data of described multiple Digital Media or electronic commerce affair, the user-project scoring score range of the multiple Digital Media or electronic commerce affair and the minimum value of described score range, calculate the standardization result of the initial user-project score data of described multiple Digital Media or electronic commerce affair.
In this step, formula (4) is used to obtain the standardized user-project score data of each business. The standardized user-project score data of business S1 is shown in table 13:
Table 13
The standardized user-project score data of business S2 is shown in table 14:
Table 14
The standardized user-project score data of business S3 is shown in table 15:
Table 15
Step 505: according to described actual user, actual items and standardization result, integrate the user-project score data of described multiple Digital Media or electronic commerce affair, generate unified user-project score data, described unified user-project score data comprises the user-project score data after the integration of all users and project in described multiple Digital Media or electronic commerce affair.
In step 505, the aforementioned formula (5) is used to obtain the unification user-project score data, as shown in table 16:
Table 16
In table 16, every user and every project is unique across the businesses. As can be seen from table 16, the 3 kinds of business contain 10 different actual users and 9 different actual items.
Step 506: described unification user-project score data is mapped to described multiple Digital Media or electronic commerce affair successively, generates the user-project score data after described multiple Digital Media or electronic commerce affair mapping.
Using the aforementioned mode three, the user-project score data related to each business after mapping can be obtained. The mapped user-project score data of business S1 is shown in table 17:
Table 17
The mapped user-project score data of business S2 is shown in table 18:
Table 18
The mapped user-project score data of business S3 is shown in table 19:
Table 19
Step 507: according to the user-project score data after described multiple Digital Media or electronic commerce affair mapping, calculate the item similarity between different projects in the same business.
In step 507, suppose that the similarity between different projects in the same Digital Media or electronic commerce affair is to be calculated; the item similarity of each Digital Media or electronic commerce affair can then be calculated with the aforementioned formula (9). The item similarity data of business S1 is shown in table 20:
Table 20
| | I′1 | I′2 | I′3 | I′4 |
|---|---|---|---|---|
| I′1 | 1.00 | 0.44 | 0.73 | 0.00 |
| I′2 | 0.44 | 1.00 | 0.00 | 0.39 |
| I′3 | 0.73 | 0.00 | 1.00 | 0.35 |
| I′4 | 0.00 | 0.39 | 0.35 | 1.00 |
The item similarity data of business S2 is shown in table 21:
Table 21
| | I′1 | I′5 | I′3 | I′6 | I′7 |
|---|---|---|---|---|---|
| I′1 | 1.00 | 0.44 | 0.49 | 0.38 | 0.24 |
| I′5 | 0.44 | 1.00 | 0.00 | 0.77 | 0.64 |
| I′3 | 0.49 | 0.00 | 1.00 | 0.00 | 0.42 |
| I′6 | 0.38 | 0.77 | 0.00 | 1.00 | 0.73 |
| I′7 | 0.24 | 0.64 | 0.42 | 0.73 | 1.00 |
The item similarity data of business S3 is shown in table 22:
Table 22
| | I′4 | I′6 | I′8 | I′7 | I′9 |
|---|---|---|---|---|---|
| I′4 | 1.00 | 0.44 | 0.23 | 0.57 | 0.77 |
| I′6 | 0.44 | 1.00 | 0.41 | 0.54 | 0.56 |
| I′8 | 0.23 | 0.41 | 1.00 | 0.44 | 0.00 |
| I′7 | 0.57 | 0.54 | 0.44 | 1.00 | 0.24 |
| I′9 | 0.77 | 0.56 | 0.00 | 0.24 | 1.00 |
In this example, because the various Digital Media or electronic commerce affairs all belong to the music field, it is reasonable to standardize, integrate and map the user-project score data. With the item similarity acquisition method of the present embodiment, the mapped user-project score data of each business is richer than the original user-project score data of that business and is highly credible, so the sparsity problem of user-project score data can be solved well. In addition, selecting the mapped user-project score data of the business in question and the corresponding user similarity and/or item similarity when project recommendation is carried out can also improve the validity and accuracy of project recommendation.
Having introduced the acquisition flow of the user similarity and/or item similarity involved in the embodiment of the present invention, and with reference to Fig. 6, the item recommendation method disclosed in the embodiment of the present invention may specifically comprise:
Step 601: the service identification used by computer network interface acquisition targeted customer and targeted customer's mark.
In this step, the targeted customer is the user to whom projects are to be recommended; the service identification of the business that the targeted customer is using and the targeted customer's user mark are obtained first. It should be noted that the targeted customer's mark is unique within one business but not necessarily unique across different businesses. However, because a target business can be uniquely determined from the service identification, the targeted customer's mark uniquely determines a user within that target business.
Step 602: obtain the service source data prestored according to described service identification from storer.
Wherein, described service source data may specifically comprise: the user-project score data after this business is mapped, the similarity between different projects of this business, the user matching relationship data and the project matching relationship data; or, the user-project score data after this business is mapped, the similarity between different users of this business, the user matching relationship data and the project matching relationship data.
According to the service identification, the mapped user-project score data of this business, the similarity between different projects of this business, the user matching relationship data and the project matching relationship data can be obtained from the results of the data processing flow; alternatively, the mapped user-project score data of this business, the similarity between different users of this business, the user matching relationship data and the project matching relationship data can be obtained.
Suppose that in practice a project needs to be recommended to user U5 of the aforementioned business S3. The target service identification obtained is then business S3, and the targeted customer's mark is user U5 in business S3. The mapped user-project score data obtained according to the target service identification is the content of table 19, the similarity between different projects in this business is the content of table 22, the user matching relationship data is the content of table 10, and the project matching relationship data is the content of table 12.
Step 603: the service identification used according to described targeted customer's mark, targeted customer and service source data, for described targeted customer generates Candidate Recommendation project set.
In this step, the Candidate Recommendation project set is generated for the targeted customer according to the targeted customer's mark, the service identification that the targeted customer is using, and the service source data. The Candidate Recommendation project set can be obtained using either of the following two modes, or a combination of both:
Mode A: select the users whose user similarity with described targeted customer meets a preset condition, and form the Candidate Recommendation project set from the projects that those users have scored higher than a predetermined threshold and that described targeted customer has not scored;
Wherein, the Candidate Recommendation project in described Candidate Recommendation project set all belongs to the Digital Media or electronic commerce affair that described targeted customer is using.
Wherein, described Candidate Recommendation project can comprise: digital media content, an e-commerce product or a uniform resource locator (URL).
Mode B: select the projects whose user-project score by described targeted customer is higher than a predetermined threshold value, and form the Candidate Recommendation project set from the projects whose item similarity with those projects meets a preset condition and that described targeted customer has not scored; wherein, the Candidate Recommendation project in described Candidate Recommendation project set all belongs to the business that described targeted customer is using.
Wherein, whether a project belongs to the Digital Media or electronic commerce affair that the targeted customer is using can be judged according to the service identification that the targeted customer is using and the actual item mark.
For mode B, according to table 19 and table 22 obtained in step 602, the Candidate Recommendation project set is formed from the projects that have a high similarity with the projects scored highly by user U5 (namely U′9 in table 19) and that user U5 has not scored.
In this example, suppose a project that user U5 has scored no lower than 3 is defined as a high-score project; the projects obtained are then I′4 and I′8. Suppose further that high similarity means a similarity no lower than 0.4. Then the projects that have a high similarity with I′4 and that user U5 has not scored are I′6, I′7 and I′9, and the projects that have a high similarity with I′8 and that user U5 has not scored are I′6 and I′7. The Candidate Recommendation project set therefore comprises I′6, I′7 and I′9, which correspond to projects I2, I4 and I5 in business S3.
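Mode B can be sketched in Python using the similarities of table 22. Table 19 is not reproduced in this text, so the two scores of the targeted customer below (3 for I'4 and 5 for I'8) are assumed values that merely satisfy the "no lower than 3" condition of the example. The function name is hypothetical, and project marks are written with an ASCII apostrophe.

```python
def candidates_mode_b(user_scores, sim, score_min, sim_min):
    """Projects similar to the user's high-score projects and not yet scored."""
    liked = [i for i, r in user_scores.items() if r >= score_min]
    found = set()
    for item in liked:
        for other, s in sim[item].items():
            if s >= sim_min and other not in user_scores:
                found.add(other)
    return found

# Rows of table 22 for the two projects that the targeted customer has scored.
sim_s3 = {
    "I'4": {"I'6": 0.44, "I'8": 0.23, "I'7": 0.57, "I'9": 0.77},
    "I'8": {"I'4": 0.23, "I'6": 0.41, "I'7": 0.44, "I'9": 0.00},
}
# Assumed scores of user U'9 (table 19 is not available in this text).
scores_u9 = {"I'4": 3, "I'8": 5}
print(sorted(candidates_mode_b(scores_u9, sim_s3, score_min=3, sim_min=0.4)))
# prints ["I'6", "I'7", "I'9"]
```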
Step 604: the prediction scoring at least obtaining each Candidate Recommendation project in described Candidate Recommendation project set according to the user's similarity in described service source data and/or item similarity.
In actual applications, the prediction scoring of user $U_i$ for project $I_j$ can be calculated using any one of formulas (10), (11) and (12):
$$ P_{U_i,I_j} = \frac{\sum_{U_k \in NN_{U_i}} \mathrm{sim}(U_i, U_k) \cdot r_{kj}}{\sum_{U_k \in NN_{U_i}} \left|\mathrm{sim}(U_i, U_k)\right|} \qquad (10) $$
$$ P_{U_i,I_j} = \frac{\sum_{I_k \in NN_{I_j}} \mathrm{sim}(I_j, I_k) \cdot r_{ik}}{\sum_{I_k \in NN_{I_j}} \left|\mathrm{sim}(I_j, I_k)\right|} \qquad (11) $$
$$ P_{U_i,I_j} = \alpha \cdot \frac{\sum_{U_k \in NN_{U_i}} \mathrm{sim}(U_i, U_k) \cdot r_{kj}}{\sum_{U_k \in NN_{U_i}} \left|\mathrm{sim}(U_i, U_k)\right|} + (1 - \alpha) \cdot \frac{\sum_{I_k \in NN_{I_j}} \mathrm{sim}(I_j, I_k) \cdot r_{ik}}{\sum_{I_k \in NN_{I_j}} \left|\mathrm{sim}(I_j, I_k)\right|} \qquad (12) $$
Wherein, $NN_{U_i}$ in formula (10) represents the set of users that have a high similarity with user $U_i$, i.e. the neighbours of user $U_i$; $\mathrm{sim}(U_i, U_k)$ represents the similarity between user $U_i$ and user $U_k$; $NN_{I_j}$ in formula (11) represents the set of projects that have a high similarity with project $I_j$, i.e. the similar item set of project $I_j$; $\mathrm{sim}(I_j, I_k)$ represents the similarity between project $I_j$ and project $I_k$; $\alpha$ in formula (12) is a parameter between 0 and 1, which can be set manually from experience or learned from training data, for example by repeatedly adjusting the value of $\alpha$ and selecting the value that gives the smallest final error.
Suppose formula (11) is used to calculate the prediction scoring; the prediction scorings of user U5 for projects I2, I4 and I5 are then 3.96, 3.87 and 3.00 respectively.
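The item-based prediction of formula (11) can be sketched in Python. Table 19 is not reproduced in this text; the scores of the targeted customer used below (3 for I'4 and 5 for I'8) are assumed values, chosen because together with the similarities of table 22 they reproduce the three prediction scorings quoted above. Taking the neighbour set as the projects that the user has scored and that have a positive similarity with the target project is also an assumption. Project marks are written with an ASCII apostrophe.

```python
def predict_item_based(user_scores, sim, target):
    """Formula (11): similarity-weighted average of the user's own scores."""
    # Assumed neighbour set: scored projects with a positive similarity.
    neighbours = [k for k in user_scores if sim[target].get(k, 0.0) > 0.0]
    num = sum(sim[target][k] * user_scores[k] for k in neighbours)
    den = sum(abs(sim[target][k]) for k in neighbours)
    return num / den if den else None

# Similarities from table 22 between each candidate and the scored projects.
sim_s3 = {
    "I'6": {"I'4": 0.44, "I'8": 0.41},
    "I'7": {"I'4": 0.57, "I'8": 0.44},
    "I'9": {"I'4": 0.77, "I'8": 0.00},
}
scores_u9 = {"I'4": 3, "I'8": 5}  # assumed, see the note above

for candidate in ("I'6", "I'7", "I'9"):
    print(candidate, round(predict_item_based(scores_u9, sim_s3, candidate), 2))
# prints: I'6 3.96, then I'7 3.87, then I'9 3.0
```

Candidates I'6, I'7 and I'9 correspond to projects I2, I4 and I5 of business S3, so the ordering of the predictions matches the example: I2 first, then I4, then I5.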
Step 605: the final recommended project list that qualified Candidate Recommendation project generates described targeted customer is extracted in the prediction scoring according to described Candidate Recommendation project from described Candidate Recommendation project.
A final recommended project list is generated for the targeted customer according to the prediction scoring. This list comprises several projects with higher prediction scorings; how many projects are chosen can be adjusted according to actual conditions. Suppose in this example that the Candidate Recommendation project with the highest prediction scoring is taken as the final recommended project list; the list is then project I2. Of course, projects I2 and I4 can also be selected as the final recommended project list.
Step 606: described final recommended project list is sent, by the Digital Media or electronic commerce affair server, to the client of described targeted customer for display.
After the final recommended project list is generated, it is sent by the Digital Media or electronic commerce affair server to the client of described targeted customer for display.
In the present embodiment, when project recommendation is carried out, the mapped user-project score data of the business in question and the corresponding user similarity and/or item similarity can be selected according to the targeted customer's mark and the service identification. Because the user similarity and/or item similarity stored in the storer is used directly, the processing time of project recommendation is reduced and its efficiency is improved; and because the mapped user-project score data of the business and the corresponding similarity are well selected, the validity and accuracy of project recommendation can also be improved.
With reference to Fig. 7, the embodiment of the invention also discloses an item recommendation method for multiple businesses, which includes both the acquisition flow of user similarity and/or item similarity and the project recommendation flow. Specifically, this item recommendation method may comprise the following steps:
Step 701: the original source data obtaining multiple different digital media or electronic commerce affair, described original source data comprises: the initial user-project score data of multiple business.
Step 702: according to the matching result of the users and projects between the multiple Digital Media or electronic commerce affair and the standardization result of the initial user-project score data of described multiple Digital Media or electronic commerce affair, integrate the initial user-project score data of described multiple Digital Media or electronic commerce affair into unification user-project score data comprising the users and projects of described multiple Digital Media or electronic commerce affair.
Step 703: described unification user-project score data is mapped to described multiple Digital Media or electronic commerce affair successively, generates the user-project score data after described multiple Digital Media or electronic commerce affair mapping.
Step 704: according to the item similarity in the user's similarity in the user after described multiple Digital Media or electronic commerce affair mapping-project score data acquisition same business between different user and/or same business between disparity items.
Step 705: described user's similarity and/or item similarity are stored in described storer.
It should be noted that the process of obtaining and storing the user similarity and/or item similarity shown in steps 701 to 705 can be regarded as a preprocessing process, which can be carried out independently of the project recommendation process shown in subsequent steps 706 to 711; this also ensures the real-time performance and validity of project recommendation. The preprocessing process and the project recommendation process are introduced in sequence in the present embodiment merely for convenience.
Step 706: the service identification used by computer network interface acquisition targeted customer and targeted customer's mark.
Step 707: obtain the service source data prestored according to described service identification from storer.
Step 708: the Digital Media used according to described targeted customer's mark, targeted customer or electronic commerce affair identify and source data, for described targeted customer generates Candidate Recommendation project set.
Step 709: the prediction scoring at least obtaining each Candidate Recommendation project in described Candidate Recommendation project set according to the user's similarity in described service source data and/or item similarity.
Step 710: the final recommended project list that qualified Candidate Recommendation project generates described targeted customer is extracted in the prediction scoring according to described Candidate Recommendation project from described Candidate Recommendation project.
Step 711: described final project recommendation list is sent, by the Digital Media or electronic commerce affair server, to the client of described targeted customer for display.
Because the acquisition flow of user similarity and/or item similarity and the project recommendation flow have been introduced in detail in the preceding embodiments, for anything not fully described in the present embodiment, reference can be made to the related introduction of those flows.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
Corresponding to the acquisition method of user similarity and/or item similarity provided by the above embodiment of the present invention, and referring to Fig. 8, the embodiment of the present invention also provides a structural representation of an embodiment of an apparatus for acquiring user similarity and/or item similarity, which may specifically comprise:
Obtain original source data unit 801, for obtaining the original source data of different digital media or electronic commerce affair, described original source data comprises: the initial user-project score data of multiple Digital Media or electronic commerce affair;
Integral unit 802, for the standardization result according to the matching result of the user between multiple Digital Media or electronic commerce affair and project and the initial user-project score data of described multiple Digital Media or electronic commerce affair, the initial user-project score data of described multiple Digital Media or electronic commerce affair is integrated into the unification user-project score data of user and the project comprising described multiple Digital Media or electronic commerce affair;
Described original source data can also comprise: the UAD of multiple Digital Media or electronic commerce affair and item attribute data, and shown in figure 9, then described integral unit 802 specifically can comprise:
First coupling subelement 901, for matching, according to the initial user mark, initial user attribute and initial user property value in described UAD, to obtain the actual users unique across described multiple Digital Media or electronic commerce affair; described initial user mark represents a user unique within a certain business; described initial user property value is used to represent a user unique across described multiple businesses;
Second coupling subelement 902, for matching, according to the initial item identification, initial project attribute and initial project property value in described item attribute data, to obtain the actual items unique across the multiple Digital Media or electronic commerce affair; described initial item identification represents a project unique within a certain business;
With reference to shown in Figure 10, described second coupling subelement 902 specifically can comprise again:
3rd coupling subelement 1001, for utilizing actual items attributes different between the multiple Digital Media of initial project attributes match of multiple Digital Media or electronic commerce affair or electronic commerce affair;
Second obtains subelement 1002, for obtaining the item attribute collection registration average of item attribute collection registration between multiple Digital Media or electronic commerce affair and each business and other business according to described different actual items attribute;
Sequence subelement 1003, for sorting to described multiple Digital Media or electronic commerce affair according to the size of described item attribute collection registration average;
Service matching subunit 1004, configured to execute, in the order of the sorted digital media or e-commerce services, a service matching flow with the first service as the current service; the service matching flow includes: determining the matched items between the current service and the other services, and deleting the current service;
Referring to Figure 11, the service matching subunit 1004 may in turn specifically include:
Item matching subunit 1102, configured to select, in the order of the initial item identifiers contained in the first service, the first item as the current item and execute an item matching flow; the item matching flow includes: calculating the item matching degree between the current item and each item in the other services; for each other service, selecting the item matching degrees that satisfy a preset threshold condition, so as to form multiple item matching degree sets; in each item matching degree set, selecting the item with the highest matching degree as a matched item of the current item; recording the matching relationship between the current item and its matched items, and deleting the matched items from the other services; and deleting the current item;
Loop subunit 1103, configured to take the second item in the first service as the current item and execute the item matching flow, until the first service contains no more items.
Third obtaining subunit 1005, configured to take the second service as the current service and execute the service matching flow, until the sorted sequence contains no more services, and then obtain all the actual items that are unique across the multiple digital media or e-commerce services according to the matched items and the item matching relationships.
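The service matching flow and the nested item matching flow described for subunits 1004, 1102, 1103 and 1005 amount to a greedy matching loop, sketched below. The matching degree function is an assumption (fraction of attributes with equal values), as are the default threshold of 0.6 and all function and parameter names; the text only requires some matching degree and a preset threshold condition.

```python
def match_degree(a, b):
    """Assumed measure: fraction of attribute keys whose values are equal."""
    keys = set(a) | set(b)
    same = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return same / len(keys) if keys else 0.0


def match_items(ordered_services, items, threshold=0.6):
    """Greedy cross-service item matching.

    ordered_services: service names in processing order.
    items: {service: {initial_item_id: attribute dict}}.
    Returns a list of groups; each group lists the (service, item id)
    pairs that are treated as one actual item.
    """
    remaining = {s: dict(items[s]) for s in ordered_services}  # working copy
    groups = []
    for idx, current in enumerate(ordered_services):
        others = ordered_services[idx + 1:]  # earlier services are already deleted
        for item_id in sorted(remaining[current]):  # initial identifier order
            attrs = remaining[current][item_id]
            group = [(current, item_id)]
            for other in others:
                # Matching degrees that satisfy the threshold condition.
                scored = [
                    (match_degree(attrs, cand), cid)
                    for cid, cand in remaining[other].items()
                ]
                scored = [p for p in scored if p[0] >= threshold]
                if scored:
                    best = max(scored)[1]         # highest matching degree
                    group.append((other, best))   # record the match
                    del remaining[other][best]    # delete the matched item
            groups.append(group)
        remaining[current] = {}  # delete the current service
    return groups
```

Each returned group corresponds to one actual item; an item that matches nothing in any other service forms a group of its own.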
First obtaining subunit 903, configured to obtain the normalization result of the initial user-item rating data of the multiple services according to that rating data, the user-item rating score range of each service, and the minimum value of that score range;
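The normalization performed by subunit 903 takes three inputs: the rating, the score range of the service, and the minimum of that range. A minimal sketch, assuming ordinary min-max scaling onto the interval from 0 to 1 (the exact formula is an assumption, and the function name is hypothetical):

```python
def normalize_rating(rating, scale_min, scale_range):
    """Scale a rating to [0, 1] using the service's minimum score and score range.

    scale_range is assumed to be (maximum score - minimum score).
    """
    if scale_range <= 0:
        raise ValueError("score range must be positive")
    return (rating - scale_min) / scale_range
```

For example, a rating of 4 on a 1-to-5 scale and a rating of 7.5 on a 0-to-10 scale both normalize to 0.75, which makes ratings from differently scaled services comparable.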
Integration subunit 904, configured to integrate the user-item rating data of the multiple digital media or e-commerce services according to the actual users, the actual items and the normalization result, generating unified user-item rating data; the unified user-item rating data contains the integrated user-item rating data of the users and items in the multiple digital media or e-commerce services.
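The integration step of subunit 904 can be sketched as a re-keying of the normalized ratings by actual user and actual item. How to combine several ratings that land on the same (actual user, actual item) pair is not stated; averaging them is an assumption made here, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def integrate(normalized, user_map, item_map):
    """Build unified user-item rating data.

    normalized: iterable of (service, initial_user_id, initial_item_id, rating)
                with ratings already normalized.
    user_map:   {(service, initial_user_id): actual_user}
    item_map:   {(service, initial_item_id): actual_item}
    Returns {(actual_user, actual_item): rating}.
    """
    buckets = defaultdict(list)
    for service, uid, iid, rating in normalized:
        key = (user_map[(service, uid)], item_map[(service, iid)])
        buckets[key].append(rating)
    # Assumption: duplicate ratings for the same pair are averaged.
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}
```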
Rating data generation unit 803, configured to map the unified user-item rating data to each of the multiple digital media or e-commerce services in turn, generating the mapped user-item rating data of each service;
In practical applications, the rating data generation unit 803 may further be configured to:
For each service, extract from the unified user-item rating data all user-item rating data corresponding to all the items and/or all the users that the service contains, forming the mapped user-item rating data of that service.
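The mapping step can be sketched as a filter over the unified rating data. The sketch below keeps the ratings whose item belongs to the service, which is the "all the items" branch of the description; filtering by the service's users instead would be analogous. The function name and the dictionary layout are assumptions.

```python
def map_to_service(unified, service_items):
    """Extract the unified ratings that concern one service's items.

    unified:       {(actual_user, actual_item): rating}
    service_items: set of actual items offered by the service
    """
    return {
        (user, item): rating
        for (user, item), rating in unified.items()
        if item in service_items
    }
```

Because the items are actual items matched across services, the mapped data for one service can contain ratings that were originally given in another service, which is what makes the rating data denser than the service's own.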
Similarity obtaining unit 804, configured to calculate, according to the mapped user-item rating data of the multiple digital media or e-commerce services, the similarity between different users within the same service and/or the similarity between different items within the same service.
In practical applications, the similarity obtaining unit 804 may further be configured to:
Calculate the similarity between two different users according to, in the mapped user-item rating data of the same service, the set of items rated by both users and the sets of items rated by each of the two users; and/or,
Calculate the similarity between two different items according to, in the mapped user-item rating data of the same service, the set of users who rated both items and the sets of users who rated each of the two items.
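The description names the inputs of the similarity calculation (the jointly rated set and the individually rated sets) without fixing a formula at this point. One measure that uses exactly those inputs is the Jaccard coefficient, shown below as an assumed stand-in; the function names are hypothetical.

```python
def user_similarity(ratings, u, v):
    """Jaccard similarity of the item sets rated by users u and v.

    ratings: {(user, item): rating} for a single service (mapped data).
    """
    items_u = {i for (user, i) in ratings if user == u}
    items_v = {i for (user, i) in ratings if user == v}
    union = items_u | items_v
    return len(items_u & items_v) / len(union) if union else 0.0


def item_similarity(ratings, i, j):
    """Jaccard similarity of the user sets that rated items i and j."""
    users_i = {u for (u, item) in ratings if item == i}
    users_j = {u for (u, item) in ratings if item == j}
    union = users_i | users_j
    return len(users_i & users_j) / len(union) if union else 0.0
```

Both functions ignore the rating values themselves; a measure such as cosine similarity over the co-rated entries could be substituted if the values should count.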
Storage unit 805, configured to store the user similarity and/or item similarity in the memory.
The system for obtaining user similarity and/or item similarity disclosed in this embodiment of the present invention normalizes, integrates and maps the user-item rating data and then calculates the user similarity or item similarity. It therefore not only supplies data for item recommendation; by selecting the mapped user-item rating data of the service in question and the corresponding user similarity and/or item similarity, it can also improve the validity and accuracy of item recommendation.
Referring to Figure 12, an embodiment of the present invention further discloses an item recommendation system for service crossing, the item recommendation system including:
Identifier obtaining unit 1201, configured to obtain, through a computer network interface, the identifier of the digital media or e-commerce service that the target user is using and the target user identifier;
Service source data obtaining unit 1202, configured to obtain prestored service source data from the memory according to the service identifier;
Candidate set generation unit 1203, configured to generate a candidate recommended item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service that the target user is using, and the service source data;
In practical applications, the candidate set generation unit 1203 may further be configured to:
Select the users whose user similarity to the target user satisfies a preset condition, and form the candidate recommended item set from the items that those users rated above a predetermined threshold and that the target user has not rated; and/or,
Select the items whose user-item rating by the target user is above a predetermined threshold, and form the candidate recommended item set from the items whose item similarity to those highly rated items satisfies a preset condition and that the target user has not rated.
All of the candidate recommended items belong to the digital media or e-commerce service that the target user is using.
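The user-based branch of candidate generation can be sketched as follows. The concrete thresholds (`sim_min`, `rating_min`), the function name and the data layout are assumptions for illustration; the description only requires a preset similarity condition and a predetermined rating threshold. The rating data passed in is assumed to be the mapped data of the service the target user is using, so every candidate belongs to that service.

```python
def candidates_user_based(target, ratings, user_sim, sim_min=0.5, rating_min=0.6):
    """Candidate items for `target` drawn from similar users' high ratings.

    ratings:  {(user, item): normalized rating} for the service in use.
    user_sim: {user: {other_user: similarity}} as precomputed and stored.
    """
    already_rated = {i for (u, i) in ratings if u == target}
    neighbours = {
        v for v, s in user_sim.get(target, {}).items() if s >= sim_min
    }
    return {
        item
        for (user, item), rating in ratings.items()
        if user in neighbours
        and rating > rating_min
        and item not in already_rated
    }
```

The item-based branch has the same shape: start from the items the target user rated highly, look up their similar items, and drop whatever the target user has already rated.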
Predicted rating obtaining unit 1204, configured to obtain the predicted rating of each candidate recommended item in the candidate recommended item set at least according to the user similarity and/or item similarity in the service source data;
Final list generation unit 1205, configured to extract, according to the predicted ratings of the candidate recommended items, the candidate recommended items that satisfy a condition, and generate the final recommended item list of the target user;
Display unit 1206, configured to send the final recommended item list to the client of the target user for display.
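Units 1204 and 1205 can be illustrated with a similarity-weighted average followed by a top-N cut. Both choices are assumptions: the description only says that the prediction is based at least on the stored similarities and that the final list keeps the candidates that satisfy a condition. All names below are hypothetical.

```python
def predict_rating(target, item, ratings, user_sim):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = 0.0
    den = 0.0
    for other, sim in user_sim.get(target, {}).items():
        rating = ratings.get((other, item))
        if rating is not None and sim > 0:
            num += sim * rating
            den += sim
    return num / den if den else 0.0


def final_list(target, candidates, ratings, user_sim, top_n=10):
    """Rank candidates by predicted rating and keep the best `top_n`."""
    scored = [
        (predict_rating(target, item, ratings, user_sim), item)
        for item in candidates
    ]
    # Highest prediction first; ties broken by item identifier for stability.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:top_n]]
```

Since the similarities are read from storage rather than computed per request, the work at recommendation time is limited to these lookups and one sort.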
When performing item recommendation, the item recommendation system of this embodiment can select, according to the target user identifier and the service identifier, the mapped user-item rating data of the service and the corresponding user similarity and/or item similarity. Because it directly uses the user similarity and/or item similarity already stored in the memory, that is, the mapped user-item rating data and corresponding similarities selected for this service, it reduces the processing time of item recommendation, improves its efficiency, and can improve its validity and accuracy.
It should be noted that, in a practical item recommendation system, the system that obtains user similarity and/or item similarity and the system that performs item recommendation can work independently. Obtaining the similarities and performing item recommendation can proceed at the same time; item recommendation only needs to read the user similarity and/or item similarity already calculated, so the real-time performance and validity of the items recommended by the system can also be guaranteed.
It should be noted that the embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another. Because the system embodiments are basically similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the description of the method embodiments.
It should also be noted that, in this document, the terms "include", "comprise" and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
A person of ordinary skill in the art will understand that all or some of the steps in the methods of the above embodiments can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include ROM, RAM, a magnetic disk, an optical disc, and the like.
The item recommendation method and system for service crossing provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the embodiments and its core idea. A person of ordinary skill in the art may, following the idea of the embodiments of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (18)

1. An item recommendation method for crossing digital media or e-commerce services, characterized in that it comprises:
Obtaining, through a computer network interface, the identifier of the digital media or e-commerce service that a target user is using and the target user identifier, and obtaining prestored service source data from a memory according to the service identifier;
Generating a candidate recommended item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service that the target user is using, and the service source data;
Obtaining the predicted rating of each candidate recommended item in the candidate recommended item set at least according to the user similarity and/or item similarity in the service source data;
Extracting, according to the predicted ratings of the candidate recommended items, the candidate recommended items that satisfy a condition, and generating the final recommended item list of the target user;
Sending, by the digital media or e-commerce service server, the final recommended item list to the client of the target user for display.
2. The method according to claim 1, characterized in that generating a candidate recommended item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service that the target user is using, and the service source data comprises:
Selecting the users whose user similarity to the target user satisfies a preset condition, and forming the candidate recommended items from the items that those users rated above a predetermined threshold and that the target user has not rated; and/or,
Selecting the items whose user-item rating by the target user is above a predetermined threshold, and forming the candidate recommended items from the items whose item similarity to those highly rated items satisfies a preset condition and that the target user has not rated;
Wherein all of the candidate recommended items belong to the digital media or e-commerce service that the target user is using.
3. The method according to claim 1, characterized in that it further comprises:
Obtaining the original source data of multiple different digital media or e-commerce services, the original source data comprising the initial user-item rating data of the multiple digital media or e-commerce services;
Integrating the initial user-item rating data of the multiple digital media or e-commerce services into unified user-item rating data that covers the users and items of the multiple digital media or e-commerce services, according to the result of matching users and items across the services and the normalization result of the initial user-item rating data;
Mapping the unified user-item rating data to each of the multiple digital media or e-commerce services in turn, generating the mapped user-item rating data of the multiple digital media or e-commerce services;
Obtaining, according to the mapped user-item rating data of the multiple digital media or e-commerce services, the user similarity between different users within the same service and/or the item similarity between different items within the same service;
Storing the user similarity and/or item similarity in the memory.
4. The method according to claim 3, characterized in that the original source data further comprises user attribute data and item attribute data of the multiple digital media or e-commerce services, and integrating the initial user-item rating data of the multiple digital media or e-commerce services into unified user-item rating data that covers the users and items of the multiple digital media or e-commerce services comprises:
Matching, according to the initial user identifiers, initial user attributes and initial user attribute values in the user attribute data, the actual users that are unique across the multiple digital media or e-commerce services; an initial user identifier denotes a user that is unique within one service; an initial user attribute value is used to denote a user that is unique across all of the multiple digital media or e-commerce services;
Matching, according to the initial item identifiers, initial item attributes and initial item attribute values in the item attribute data, the actual items that are unique across the multiple digital media or e-commerce services; an initial item identifier denotes an item that is unique within one service;
Obtaining the normalization result of the initial user-item rating data of the multiple digital media or e-commerce services according to that rating data, the user-item rating score range of the multiple digital media or e-commerce services, and the minimum value of the score range;
Integrating the user-item rating data of the multiple digital media or e-commerce services according to the actual users, the actual items and the normalization result, generating unified user-item rating data; the unified user-item rating data contains the integrated user-item rating data of the users and items in the multiple digital media or e-commerce services.
5. The method according to claim 4, characterized in that matching, according to the initial item identifiers, initial item attributes and initial item attribute values in the item attribute data, the actual items that are unique across the multiple digital media or e-commerce services comprises:
Using the initial item attributes of the multiple digital media or e-commerce services to match the distinct actual item attributes across the multiple digital media or e-commerce services;
Obtaining, according to the distinct actual item attributes, the item attribute set overlap between the multiple digital media or e-commerce services, and the average item attribute set overlap of each service with the other services;
Sorting the multiple digital media or e-commerce services by the magnitude of the average item attribute set overlap;
Executing, in the order of the sorted digital media or e-commerce services, a service matching flow with the first service as the current service; the service matching flow comprises: determining the matched items between the current service and the other services, and deleting the current service;
Taking the second service as the current service and executing the service matching flow, until the sorted sequence contains no more services, and then obtaining all the actual items that are unique across the multiple digital media or e-commerce services according to the matched items and the item matching relationships.
6. The method according to claim 5, characterized in that determining the matched items between the current service and the other services comprises:
Selecting, in the order of the initial item identifiers contained in the current service, the first item as the current item and executing an item matching flow; the item matching flow comprises: calculating the item matching degree between the current item and each item in the other services; for each other service, selecting the item matching degrees that satisfy a preset threshold condition, so as to form multiple item matching degree sets; in each item matching degree set, selecting the item with the highest matching degree as a matched item of the current item; recording the matching relationship between the current item and its matched items, and deleting the matched items contained in the multiple services; and deleting the current item;
Taking the second item in the current service as the current item and executing the item matching flow, until the current service contains no more items.
7. The method according to claim 3, characterized in that mapping the unified user-item rating data to each of the multiple digital media or e-commerce services in turn, generating the mapped user-item rating data of the multiple digital media or e-commerce services, comprises:
For each service, extracting from the unified user-item rating data all user-item rating data corresponding to all the items and/or all the users that the service contains, forming the mapped user-item rating data of each service.
8. The method according to claim 3, characterized in that obtaining, according to the mapped user-item rating data of the multiple digital media or e-commerce services, the similarity between different users within the same service and/or the similarity between different items within the same service comprises:
Calculating the similarity between two different users according to, in the mapped user-item rating data of the same service, the set of items rated by both users and the sets of items rated by each of the two users; and/or,
Calculating the similarity between two different items according to, in the mapped user-item rating data of the same service, the set of users who rated both items and the sets of users who rated each of the two items.
9. The method according to claim 1, characterized in that the digital media or e-commerce service comprises: music, application download, electronic reading, games and/or online shopping.
10. The method according to claim 1, characterized in that the candidate recommended item comprises: digital media content, an e-commerce product, or a uniform resource locator (URL).
11. An item recommendation system for crossing digital media or e-commerce services, characterized in that it comprises:
An identifier obtaining unit, configured to obtain, through a computer network interface, the identifier of the digital media or e-commerce service that a target user is using and the target user identifier;
A service source data obtaining unit, configured to obtain prestored service source data from a memory according to the service identifier;
A candidate set generation unit, configured to generate a candidate recommended item set for the target user according to the target user identifier, the identifier of the digital media or e-commerce service that the target user is using, and the service source data;
A predicted rating obtaining unit, configured to obtain the predicted rating of each candidate recommended item in the candidate recommended item set at least according to the user similarity and/or item similarity in the service source data;
A final list generation unit, configured to extract, according to the predicted ratings of the candidate recommended items, the candidate recommended items that satisfy a condition, and generate the final recommended item list of the target user;
A display unit, configured to send the final recommended item list to the client of the target user for display.
12. The system according to claim 11, characterized in that the candidate set generation unit is further configured to:
Select the users whose user similarity to the target user satisfies a preset condition, and form the candidate recommended items from the items that those users rated above a predetermined threshold and that the target user has not rated; and/or,
Select the items whose user-item rating by the target user is above a predetermined threshold, and form the candidate recommended items from the items whose item similarity to those highly rated items satisfies a preset condition and that the target user has not rated;
Wherein all of the candidate recommended items belong to the digital media or e-commerce service that the target user is using.
13. The system according to claim 11, characterized in that it further comprises:
An original source data obtaining unit, configured to obtain the original source data of multiple digital media or e-commerce services, the original source data comprising the initial user-item rating data of the multiple digital media or e-commerce services;
An integration unit, configured to integrate the initial user-item rating data of the multiple digital media or e-commerce services into unified user-item rating data that covers the users and items of the multiple digital media or e-commerce services, according to the result of matching users and items across the services and the normalization result of the initial user-item rating data;
A rating data generation unit, configured to map the unified user-item rating data to each of the multiple digital media or e-commerce services in turn, generating the mapped user-item rating data of the multiple digital media or e-commerce services;
A similarity obtaining unit, configured to obtain, according to the mapped user-item rating data of the multiple digital media or e-commerce services, the user similarity between different users within the same service and/or the item similarity between different items within the same service;
A storage unit, configured to store the user similarity and/or item similarity in the memory.
14. The system according to claim 13, characterized in that the original source data further comprises user attribute data and item attribute data of the multiple digital media or e-commerce services, and the integration unit comprises:
A first matching subunit, configured to match, according to the initial user identifiers, initial user attributes and initial user attribute values in the user attribute data, the actual users that are unique across the multiple digital media or e-commerce services; an initial user identifier denotes a user that is unique within one service; an initial user attribute value is used to denote a user that is unique across all of the multiple digital media or e-commerce services;
A second matching subunit, configured to match, according to the initial item identifiers, initial item attributes and initial item attribute values in the item attribute data, the actual items that are unique across the multiple digital media or e-commerce services; an initial item identifier denotes an item that is unique within one service;
A first obtaining subunit, configured to obtain the normalization result of the initial user-item rating data of the multiple digital media or e-commerce services according to that rating data, the user-item rating score range of the multiple digital media or e-commerce services, and the minimum value of the score range;
An integration subunit, configured to integrate the user-item rating data of the multiple digital media or e-commerce services according to the actual users, the actual items and the normalization result, generating unified user-item rating data; the unified user-item rating data contains the integrated user-item rating data of the users and items in the multiple digital media or e-commerce services.
15. The system according to claim 14, characterized in that the second matching subunit comprises:
A third matching subunit, configured to use the initial item attributes of the multiple digital media or e-commerce services to match the distinct actual item attributes across the multiple digital media or e-commerce services;
A second obtaining subunit, configured to obtain, according to the distinct actual item attributes, the item attribute set overlap between the multiple digital media or e-commerce services, and the average item attribute set overlap of each service with the other services;
A sorting subunit, configured to sort the multiple digital media or e-commerce services by the magnitude of the average item attribute set overlap;
A service matching subunit, configured to execute, in the order of the sorted digital media or e-commerce services, a service matching flow with the first service as the current service; the service matching flow comprises: determining the matched items between the current service and the other services, and deleting the current service;
A third obtaining subunit, configured to take the second service as the current service and execute the service matching flow, until the sorted sequence contains no more services, and then obtain all the actual items that are unique across the multiple digital media or e-commerce services according to the matched items and the item matching relationships.
16. The system according to claim 15, characterized in that the service matching subunit specifically comprises:
An item matching subunit, configured to select, in the order of the initial item identifiers contained in the current service, the first item as the current item and execute an item matching flow; the item matching flow comprises: calculating the item matching degree between the current item and each item in the other services; for each other service, selecting the item matching degrees that satisfy a preset threshold condition, so as to form multiple item matching degree sets; in each item matching degree set, selecting the item with the highest matching degree as a matched item of the current item; recording the matching relationship between the current item and its matched items, and deleting the matched items contained in the multiple services; and deleting the current item;
A loop subunit, configured to take the second item in the current service as the current item and execute the item matching flow, until the current service contains no more items.
17. The system according to claim 13, characterized in that the rating data generation unit is specifically configured to:
For each service, extract from the unified user-item rating data all user-item rating data corresponding to all the items and/or all the users that the service contains, forming the mapped user-item rating data of the multiple digital media or e-commerce services.
18. The system according to claim 13, characterized in that the similarity obtaining unit is specifically configured to:
Calculate the similarity between two different users according to, in the mapped user-item rating data of the same service, the set of items rated by both users and the sets of items rated by each of the two users; and/or,
Calculate the similarity between two different items according to, in the mapped user-item rating data of the same service, the set of users who rated both items and the sets of users who rated each of the two items.
CN201180001057.8A 2011-06-29 2011-06-29 Item recommendation method during a kind of repeat in work and system Active CN102959539B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/076551 WO2012159308A1 (en) 2011-06-29 2011-06-29 Method and system for item recommendation in service crossing situation

Publications (2)

Publication Number Publication Date
CN102959539A CN102959539A (en) 2013-03-06
CN102959539B true CN102959539B (en) 2015-09-23

Family

ID=47216551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201180001057.8A Active CN102959539B (en) 2011-06-29 2011-06-29 Method and system for item recommendation in service crossing situation

Country Status (2)

Country Link
CN (1) CN102959539B (en)
WO (1) WO2012159308A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108355349A (en) * 2018-03-14 2018-08-03 张伟东 Games system

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105338408B (en) * 2015-12-02 2018-11-13 南京理工大学 Video recommendation method based on time factor
CN107656938B (en) * 2016-07-26 2022-01-11 北京搜狗科技发展有限公司 Recommendation method and device and recommendation device
CN106512405B (en) * 2016-12-06 2019-02-19 腾讯科技(深圳)有限公司 A kind of method and device of the plug-in resource acquisition of virtual objects
WO2018103516A1 (en) 2016-12-06 2018-06-14 腾讯科技(深圳)有限公司 Method of acquiring virtual resource of virtual object, and client
CN107807967B (en) * 2017-10-13 2021-10-22 平安科技(深圳)有限公司 Real-time recommendation method, electronic device and computer-readable storage medium
CN108536662B (en) * 2018-04-16 2022-04-12 苏州大学 Data labeling method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10339538B2 (en) * 2004-02-26 2019-07-02 Oath Inc. Method and system for generating recommendations
US7797197B2 (en) * 2004-11-12 2010-09-14 Amazon Technologies, Inc. Method and system for analyzing the performance of affiliate sites
US8566884B2 (en) * 2007-11-29 2013-10-22 Cisco Technology, Inc. Socially collaborative filtering
CN101459908B (en) * 2007-12-13 2012-04-25 华为技术有限公司 Service subscribing method, system, server
US8131732B2 (en) * 2008-06-03 2012-03-06 Nec Laboratories America, Inc. Recommender system with fast matrix factorization using infinite dimensions
CN101329683A (en) * 2008-07-25 2008-12-24 华为技术有限公司 Recommendation system and method
CN101685458B (en) * 2008-09-27 2012-09-19 华为技术有限公司 Recommendation method and system based on collaborative filtering
JP2010176327A (en) * 2009-01-28 2010-08-12 Sony Corp Learning device, learning method, information-processing device, data-selecting method, data-accumulating method, data-converting method, and program
US20110112981A1 (en) * 2009-11-09 2011-05-12 Seung-Taek Park Feature-Based Method and System for Cold-Start Recommendation of Online Ads
JP5740814B2 (en) * 2009-12-22 2015-07-01 ソニー株式会社 Information processing apparatus and method

Also Published As

Publication number Publication date
WO2012159308A1 (en) 2012-11-29
CN102959539A (en) 2013-03-06

Similar Documents

Publication Publication Date Title
CN102959539B (en) Method and system for item recommendation in service crossing situation
CN104899273B (en) A kind of Web Personalization method based on topic and relative entropy
CN103473230B (en) Service area determines that method, logistics service provider recommend method and related device
CN105335409B (en) A kind of determination method, equipment and the network server of target user
CN105005582B (en) The recommendation method and device of multimedia messages
CN106202331A (en) The commending system of secret protection and operational method based on this commending system by different level
CN103577549A (en) Crowd portrayal system and method based on microblog label
CN105446972A (en) Search method, device and system based on and fusing with user relation data
CN104579909B (en) Method and equipment for classifying user information and acquiring user grouping information
CN108446964B (en) User recommendation method based on mobile traffic DPI data
CN105430504A (en) Family member mix identification method and system based on television watching log mining
CN102279851A (en) Intelligent navigation method, device and system
CN104951468A (en) Data searching and processing method and system
CN106033415A (en) A text content recommendation method and device
CN104077415A (en) Searching method and device
WO2016101811A1 (en) Information arrangement method and apparatus
WO2021208583A1 (en) Recommendation information generation method and apparatus, electronic device and readable storage medium
Prando et al. Content-based Recommender System using Social Networks for Cold-start Users.
CN105488522A (en) Search engine user information demand satisfaction evaluation method capable of integrating multiple views and semi-supervised learning
Tiwari et al. Implicit preferences discovery for biography recommender system using twitter
CN108416645B (en) Recommendation method, device, storage medium and equipment for user
US20230281695A1 (en) Determining and presenting information related to a semantic context of electronic message text or voice data
US7890494B2 (en) System and/or method for processing events
Jia et al. Study on data sparsity in social network-based recommender system
CN107545039A (en) The index acquisition methods and device of keyword, computer equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant