CN105718488A - Computer system based recommendation method and apparatus - Google Patents

Computer system based recommendation method and apparatus

Info

Publication number
CN105718488A
Authority
CN
China
Prior art keywords
project
recommendation
user
targeted customer
items
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410736666.3A
Other languages
Chinese (zh)
Inventor
潘晓彤
金柯
刘忠义
魏虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201410736666.3A (patent CN105718488A)
Priority to PCT/CN2015/095834 (patent WO2016086802A1)
Publication of CN105718488A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to recommendation technology implemented with computer systems, and discloses a computer system based recommendation method and apparatus. In the recommendation method, clustering is first performed according to each user's item scoring records, so as to divide the user feature data into a plurality of categories; then, within the user feature data of each category, items are recommended to target users on an item basis. This yields an efficient recommendation method for big data while ensuring system stability and recommendation diversity. In addition, no compute node needs to store the user feature data of all categories, which avoids running out of memory.

Description

Recommendation method based on computer system and device thereof
Technical field
The present invention relates to recommendation technology implemented with computer systems, and in particular to a computer system based recommendation method and apparatus.
Background
Recommendation algorithms are generally divided into content-based recommendation, association-rule-based recommendation, collaborative-filtering-based recommendation, and combinations of these basic methods. However, the inventors of the present invention found that current CF (Collaborative Filtering) algorithms have several problems, some of which become more pronounced in a distributed environment. From the way CF operates, the algorithm has three main bottlenecks:
The first lies in data scale. For every recommendation, each compute node of the distributed framework must keep the global data, because a reducer cannot know in advance which users will be assigned to the current node, and storing only local data would hurt accuracy. Each reducer is thus effectively instantiated as a small recommendation scenario. Assuming t units of computing resources in total, the global data is stored redundantly t-1 times, yet each reducer touches only a small fraction of the data during the actual recommendation process, so the remaining data is a great waste of resources. When the data scale is large, this is a heavy burden on every compute node in both time and storage. In our experiments, owing to design limits of the programming language and compiler, array out-of-bounds problems inevitably occurred once the number of users or items exceeded the millions; when either reached the tens of millions, low-configuration nodes ran out of memory because the compute nodes in the cluster are unevenly configured.
The second is data skew. In a CF algorithm, whether item-based or user-based, we need to compute the similarity between items. A hidden problem arises here: in real application scenarios some items are "active" and others "inactive". For example, in the MapReduce framework, under the <key, value> data schema, some keys have many values and others very few; this inconsistent, uneven distribution is called data skew. When the number of values differs between keys by three or more orders of magnitude, computing the similarity between items causes severe data skew, and the "active" items give the computation time a long tail. Likewise, during recommendation some users have accumulated many behaviors and others few, so the "active users" slow down the whole computation.
The third is data sparsity: within a set of objects, few object pairs have a relation. All objects can be viewed as a matrix in which entry (i, j) represents the relation between the i-th user and the j-th item; if most entries are 0 (meaning no relation), the data is defined as sparse, and dense data is the opposite. Especially when the initial data is incomplete, sparsity arises easily when computing the similarity between items, i.e. most positions of the user-item matrix are 0.
Summary of the invention
The object of the present invention is to provide a computer system based recommendation method and apparatus that achieve efficient recommendation on big data while ensuring system stability and recommendation diversity.
To solve the above technical problem, an embodiment of the present invention discloses a computer system based recommendation method comprising the following steps:
obtaining each user's item scoring records for the items;
clustering according to each user's item scoring records to divide the user feature data into R categories, R being an integer greater than 1;
within the user feature data of each category, recommending items to a target user on an item basis.
An embodiment of the present invention also discloses a computer system based recommendation apparatus, comprising:
a user-item initial relation computation module, configured to obtain each user's item scoring records for the items;
a clustering module, configured to cluster according to the item scoring records of each user obtained by the user-item initial relation computation module and to divide the user feature data into R categories, R being an integer greater than 1; and
a recommendation module, configured to recommend items to a target user on an item basis within the user feature data of each category produced by the clustering module.
Compared with the prior art, the main differences and effects of the embodiments of the present invention are as follows:
In the recommendation method of the present invention, clustering is first performed according to each user's item scoring records to divide the user feature data into multiple categories, and items are then recommended to the target user on an item basis within the user feature data of each category. This achieves efficient recommendation on big data while ensuring system stability and recommendation diversity.
Further, no compute node needs to store the user feature data of all categories, which avoids running out of memory.
Further, for each item or each user in each category, only the few items most strongly related to it are kept rather than all related items, which avoids the data skew caused by weakly related items.
Further, a data sparsity measure is used to detect data sparsity, and once sparsity is found, similarities are completed through second-degree relations between items, so that sparsity does not affect recommendation accuracy.
Further, whether to cluster the users is decided according to the number of users, so as to better suit item recommendation on both small and big data.
Brief description of the drawings
Fig. 1 is a flowchart of a computer system based recommendation method in the first embodiment of the present invention;
Fig. 2 is a flowchart of the clustering decision in a computer system based recommendation method in the first embodiment of the present invention;
Fig. 3 is a flowchart of the recommendation step in a computer system based recommendation method in the second embodiment of the present invention;
Fig. 4 is a flowchart of the recommendation step in a computer system based recommendation method in the second embodiment of the present invention;
Fig. 5 is a flowchart of the recommendation step in a computer system based recommendation method in the second embodiment of the present invention;
Fig. 6 is a flowchart of data completion in a computer system based recommendation method in the second embodiment of the present invention;
Fig. 7 is a schematic diagram of existing user similarity computation;
Fig. 8 and Fig. 9 are schematic diagrams of existing user-based collaborative filtering;
Fig. 10 and Fig. 11 are schematic diagrams of existing item-based collaborative filtering;
Fig. 12 is a diagram of an existing MapReduce framework implementing a distributed CF algorithm;
Fig. 13 is a flowchart of a computer system based recommendation method in the second embodiment of the present invention;
Fig. 14 is a flowchart of a computer system based recommendation method in the second embodiment of the present invention;
Fig. 15 is a structural diagram of a computer system based recommendation apparatus in the third embodiment of the present invention;
Fig. 16 is a structural diagram of the recommendation module in a computer system based recommendation apparatus in the fourth embodiment of the present invention.
Detailed description of the invention
In the following description, many technical details are set forth to help the reader better understand the present application. However, those skilled in the art will understand that the technical solutions claimed in the claims of the present application can be realized even without these technical details and with various changes and modifications based on the following embodiments.
To make the objects, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The first embodiment of the present invention relates to a computer system based recommendation method. Fig. 1 is a flowchart of this method. As shown in Fig. 1, the method includes the following steps:
In step 101, each user's item scoring records for the items are obtained. It will be understood that, in the embodiments of the present invention, an item may be a commodity, a service, or any other object of recommendation.
The process then enters step 102, in which clustering is performed according to each user's item scoring records and the user feature data is divided into R categories, R being an integer greater than 1. It will be understood that, in the embodiments of the present invention, the K-means algorithm may be applied to the user feature data directly, or the Canopy algorithm may first be used for coarse clustering and the K-means algorithm then used for fine clustering.
Coarse clustering with Canopy followed by fine clustering with K-means improves clustering speed while preserving accuracy.
It will further be understood that the user feature data consists of user information, item information, and the users' scoring records for the items.
The process then enters step 103, in which items are recommended to the target user on an item basis within the user feature data of each category. It will be understood that, in the embodiments of the present invention, a recommendation algorithm based on collaborative filtering, on association rules, or on utility may be used to recommend items to the target user.
The process then ends.
Of course, in other embodiments of the present invention, clustering may instead take items as its object, with items then recommended to the target user on a user basis within the user feature data of each category; or clustering and recommendation may both be user-based or both be item-based.
In the recommendation method of this embodiment, clustering is first performed according to each user's item scoring records to divide the user feature data into multiple categories, and items are then recommended to the target user on an item basis within the user feature data of each category. This achieves efficient recommendation on big data while ensuring system stability and recommendation diversity.
Preferably, the computer system is a distributed system that includes at least two compute nodes.
In step 103, the user feature data of the categories is distributed across multiple compute nodes. Each compute node stores the user feature data of at most R-1 categories and, for each category it stores, recommends items to the target user on an item basis. No compute node needs to store the user feature data of all categories, which avoids running out of memory.
Preferably, each compute node stores and processes the user feature data of one category. It will also be understood that, in the embodiments of the present invention, the user feature data of two or more categories may be assigned to a high-configuration compute node according to the configuration of each node. Of course, when the amount of user feature data is not very large, it may also be processed by a single compute node.
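The distribution rule above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function name `assign_categories`, the `node_capacity` mapping (how many categories each node can hold) and the round-robin policy are all assumptions introduced here. The sketch enforces the one constraint the text states, namely that no node holds more than R-1 of the R categories.

```python
def assign_categories(categories, node_capacity):
    """Assign category ids to compute nodes (hypothetical helper).

    node_capacity maps a node name to the number of categories it can hold;
    this capacity model is an assumption made for illustration.
    """
    r = len(categories)
    # Visit the best-configured nodes first so they receive extra categories.
    nodes = sorted(node_capacity, key=node_capacity.get, reverse=True)
    assignment = {node: [] for node in nodes}
    pending = list(categories)
    while pending:
        progressed = False
        for node in nodes:
            # A node never stores all R categories: cap it at R - 1.
            limit = min(node_capacity[node], r - 1)
            if pending and len(assignment[node]) < limit:
                assignment[node].append(pending.pop(0))
                progressed = True
        if not progressed:
            raise ValueError("not enough node capacity for all categories")
    return assignment
```

With three categories and two nodes, the better-configured node receives two categories and the other receives one.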
As an optional embodiment, as shown in Fig. 2, the following steps are further included before step 102:
In step 201, it is determined whether the number of users is greater than a user scale threshold. If the number of users is less than the threshold, the process enters step 202; if it is greater than the threshold, the process enters step 102.
In step 202, items are recommended to the target user on an item basis directly within all the user feature data.
The process then ends.
Deciding whether to cluster the users according to the number of users makes the method better suited to item recommendation on both small and big data.
It will further be understood that, in other embodiments of the present invention, the number of users need not be checked and the user feature data may be clustered directly.
The second embodiment of the present invention relates to a computer system based recommendation method. Fig. 3 is a flowchart of the recommendation step in this method.
The second embodiment makes two main improvements over the first embodiment:
The first improvement is that, for each item or each user in each category, only the few items most strongly related to it are kept rather than all related items, which avoids the data skew caused by weakly related items. Specifically:
In step 103, item-based collaborative filtering is used to recommend items to the target user. As shown in Fig. 3, step 103 includes the following sub-steps:
In sub-step 301, according to the item scoring records of each user in the category, the similarity between all items in the category is computed, and for each item the M items with the highest similarity are selected, M being a predefined integer.
The process then enters sub-step 302, in which, according to the item scoring records of the target user in the category, the T items with the highest scores are selected for the target user, T being a predefined integer.
The process then enters sub-step 303, in which the T items selected for the target user are combined with the M items selected for each of those T items, and the items already in the target user's item list are removed, forming the initial recommendation result. For example, if the T items selected for the target user are A, B and C, and the M items selected for A, B and C are (D, E), (C, F) and (B, H) respectively, the initial recommendation result is (D, E, F, H).
The process then ends.
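Sub-steps 302 and 303 can be illustrated with a short sketch. It assumes the per-item top-M lists from sub-step 301 are already available as a dictionary; the names `initial_recommendation`, `user_scores` and `top_similar` are hypothetical and introduced only for this example.

```python
def initial_recommendation(user_scores, top_similar, t):
    """Form the initial recommendation result for one target user.

    user_scores: item -> score given by the target user (the user's item list).
    top_similar: item -> list of its M most similar items (from sub-step 301).
    t: number of highest-scored items to start from.
    """
    # Sub-step 302: the T items the target user scored highest.
    top_items = sorted(user_scores, key=user_scores.get, reverse=True)[:t]
    owned = set(user_scores)
    result = []
    # Sub-step 303: merge the neighbours, dropping items the user already has.
    for item in top_items:
        for candidate in top_similar.get(item, []):
            if candidate not in owned and candidate not in result:
                result.append(candidate)
    return result


scores = {"A": 5, "B": 4, "C": 3}
neighbours = {"A": ["D", "E"], "B": ["C", "F"], "C": ["B", "H"]}
print(initial_recommendation(scores, neighbours, t=3))  # ['D', 'E', 'F', 'H']
```

The call reproduces the worked example from the text: B and C are dropped because they are already in the target user's item list.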
Preferably, as shown in Fig. 4, the following sub-steps are further included after sub-step 303:
In sub-step 401, it is determined whether the number of items in the initial recommendation result is greater than N, N being a predefined integer. If it is greater than N, the process enters sub-step 402; if it is less than N, the process enters sub-step 403.
In sub-step 402, the N items with the highest similarity are selected from the initial recommendation result and recommended to the target user.
The process then ends.
In sub-step 403, all items in the target user's item list are combined with the M items selected for each item in that list, and all items in the target user's item list are removed, forming the user data completion recommendation result. It will be understood that this result is formed in a manner similar to the initial recommendation result, which is not repeated here.
The process then ends.
More preferably, as shown in Fig. 5, the following sub-steps are further included after sub-step 403:
In sub-step 501, it is determined whether the number of items in the user data completion recommendation result is greater than N, N being a predefined integer. If it is greater than N, the process enters sub-step 502; if it is less than N, the process enters sub-step 503.
In sub-step 502, the N items with the highest similarity are selected from the user data completion recommendation result and recommended to the target user.
The process then ends.
In sub-step 503, all items in the target user's item list are combined with all items that have a similarity relation with any item in that list, and all items in the target user's item list are removed, forming the item data completion recommendation result. It will be understood that this result is formed in a manner similar to the initial recommendation result, which is not repeated here.
The process then ends.
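The three-stage fallback of sub-steps 401 to 503 can be sketched as below. The sketch makes several assumptions that the text does not settle: the case where the count equals N is treated as "not greater", the ranking by "highest similarity" is delegated to a caller-supplied function `similarity_of`, and all names are hypothetical.

```python
def expand(seed_items, similar, owned):
    """Collect neighbours of seed_items that the user does not already have."""
    out = []
    for item in seed_items:
        for candidate in similar.get(item, []):
            if candidate not in owned and candidate not in out:
                out.append(candidate)
    return out


def recommend_with_fallback(user_scores, top_m, all_similar, t, n, similarity_of):
    """top_m: item -> its M most similar items; all_similar: item -> every
    item with a similarity relation; similarity_of: candidate -> ranking score
    (an assumed interface for 'highest similarity')."""
    owned = set(user_scores)

    def best_n(candidates):
        return sorted(candidates, key=similarity_of, reverse=True)[:n]

    # Sub-steps 301-303: start from the user's T highest-scored items.
    top_t = sorted(user_scores, key=user_scores.get, reverse=True)[:t]
    initial = expand(top_t, top_m, owned)
    if len(initial) > n:                      # sub-steps 401 and 402
        return best_n(initial)
    # Sub-step 403: user data completion uses every item in the user's list.
    user_completed = expand(list(user_scores), top_m, owned)
    if len(user_completed) > n:               # sub-steps 501 and 502
        return best_n(user_completed)
    # Sub-step 503: item data completion uses all similarity relations.
    return expand(list(user_scores), all_similar, owned)
```

Each stage widens the candidate pool only when the previous one yields too few items, so the cheaper, more selective stages are tried first.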
The second improvement is that a data sparsity measure is used to detect data sparsity, and once sparsity is found, similarities are completed through second-degree relations between items, so that sparsity does not affect recommendation accuracy. Specifically:
As shown in Fig. 6, the following steps are further included after step 103:
In step 601, it is determined whether the data sparsity is greater than a data sparsity threshold. The data sparsity is given by [formula missing from the source text], where k is the number of item pairs in the category for which a similarity relation has been computed and l is the number of items in the category [second expression missing from the source text]. If the data sparsity is less than the threshold, the process enters step 602; if it is greater than the threshold, the process enters step 603.
In step 602, a first item, a second item and a third item are taken as a group, where a similarity relation exists between the first item and the second item and between the second item and the third item. A similarity relation is established between the first item and the third item through the second item, and items are again recommended to the target user on an item basis within the category according to the supplemented similarity relations between items.
The process then ends.
In step 603, the items computed on an item basis are recommended to the target user.
The process then ends.
It will further be understood that if data sparsity persists after similarity completion through second-degree relations between items, similarity completion may continue through third-degree, fourth-degree or higher-degree relations between items, so that sparsity does not affect recommendation accuracy.
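The second-degree completion of step 602 can be illustrated with the sketch below. The text does not say how the similarity of a newly linked pair is derived, so multiplying the two existing similarities and keeping the largest value over all intermediate items is purely an assumption made for this example; `complete_second_degree` is a hypothetical name.

```python
def complete_second_degree(sim):
    """sim maps frozenset({a, b}) -> similarity of items a and b.

    Returns a copy in which every pair (first, third) that shares a
    neighbour (the 'second' item) but has no direct relation receives a
    derived similarity. The product rule is an illustrative assumption.
    """
    neighbours = {}
    for pair in sim:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    completed = dict(sim)
    for second, adjacent in neighbours.items():
        for first in adjacent:
            for third in adjacent:
                key = frozenset((first, third))
                if first == third or key in sim:
                    continue  # skip self-pairs and existing direct relations
                derived = sim[frozenset((first, second))] * sim[frozenset((second, third))]
                completed[key] = max(completed.get(key, 0.0), derived)
    return completed
```

Applying the function repeatedly to its own output would extend the completion to third-degree and higher relations, in line with the paragraph above.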
Recommendation algorithms are generally divided into content-based recommendation, association-rule-based recommendation, collaborative-filtering-based recommendation, and combinations of these basic methods. Content-based recommendation recommends according to how similar a user and an item are on certain attributes; the vector space model is a typical example. Association-rule-based recommendation is built on association rules, with the purchased items as the rule head and the recommended items as the rule body. Collaborative-filtering-based recommendation mines deep relations between items or between users and recommends according to the group behavior of users (that is, which other items do people who bought this item tend to choose?), for example by recommending strongly related items. A strong relation between two users (or items) means that the two are highly similar; a weak relation is the opposite.
Collaborative filtering has two implementations: the first is user-based and the second is item-based.
1. User-based collaborative filtering
As the name suggests, user-based collaborative filtering first computes the n neighbor users most similar to the current user and then draws on the preferred items of those n neighbors during recommendation. Computing the similarity between two users requires the item preferences of both users, as shown in Fig. 7.
The whole process builds connections through the relations between users, and the relation between two users is computed with items as the intermediary. As shown in Fig. 8 and Fig. 9, the steps may be as follows:
(1) Compute the neighbor list of the current user (i.e. the target user). The computation uses the item preference lists of the current user and of each candidate neighbor, with the relations between items serving as the bridge between users.
(2) Take the Top n neighbor users as recommendation candidates.
(3) Among the Top n neighbor users, find the items that do not appear in the current user's preference list and build the recommendation candidate list.
(4) For each item i in the candidate list, compute its preference relative to each item in the current user's preference list and derive a final score.
(5) Sort the items in the candidate list by final score and take the Top m items as the recommendation result.
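The five steps above can be sketched compactly. This is a sketch under stated assumptions rather than the method of any particular library: cosine similarity is used as the user similarity, and the final score of a candidate is taken to be the similarity-weighted sum of the neighbors' ratings, which simplifies step (4). All function names are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two item -> rating dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def user_based_recommend(prefs, target, n, m):
    """prefs: user -> {item: rating}. Returns up to m recommended items."""
    # Steps (1)-(2): rank the other users by similarity and keep the Top n.
    sims = {u: cosine(prefs[target], p) for u, p in prefs.items() if u != target}
    neighbours = sorted(sims, key=sims.get, reverse=True)[:n]
    # Steps (3)-(4): score items the target user has not rated yet.
    scores = {}
    for u in neighbours:
        for item, rating in prefs[u].items():
            if item not in prefs[target]:
                scores[item] = scores.get(item, 0.0) + sims[u] * rating
    # Step (5): the Top m candidates by final score.
    return sorted(scores, key=scores.get, reverse=True)[:m]


prefs = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 5, "B": 3, "C": 4},
    "u3": {"D": 5},
}
print(user_based_recommend(prefs, "u1", n=1, m=1))  # ['C']
```

Here u2 is the only neighbor sharing items with u1, so its unseen item C is the recommendation.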
2. Item-based collaborative filtering
Item-based collaborative filtering first computes the similarity between items from the user-item relations and then, according to the current user's existing behavior, recommends the n most similar items, as shown in Fig. 10 and Fig. 11.
The whole process builds connections through the similarity between items. The steps may be as follows:
(1) Using users as the bridge, compute the similarity between item i and item j.
(2) Build a matrix in which entry (i, j) represents the similarity between item i and item j.
(3) For each item in the current user's preference list, compute its Top n similar items.
(4) Sort all similar items by score and take the Top n items as the recommendation result.
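A small sketch of the four steps follows. It assumes cosine similarity over each item's user-rating vector and scores a candidate by its best similarity to any item the user already has; both choices, and all names, are illustrative assumptions rather than details taken from the text.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(prefs):
    """prefs: user -> {item: rating}. Returns a nested dict sims[i][j]."""
    item_users = defaultdict(dict)
    for user, ratings in prefs.items():
        for item, score in ratings.items():
            item_users[item][user] = score
    sims = defaultdict(dict)
    items = list(item_users)
    for x in range(len(items)):
        for y in range(x + 1, len(items)):
            i, j = items[x], items[y]
            # Step (1): users who rated both items act as the bridge.
            common = set(item_users[i]) & set(item_users[j])
            if not common:
                continue
            dot = sum(item_users[i][u] * item_users[j][u] for u in common)
            norm_i = sqrt(sum(v * v for v in item_users[i].values()))
            norm_j = sqrt(sum(v * v for v in item_users[j].values()))
            # Step (2): fill the symmetric similarity matrix.
            sims[i][j] = sims[j][i] = dot / (norm_i * norm_j)
    return sims


def item_based_recommend(prefs, target, n):
    sims = item_similarities(prefs)
    owned = prefs[target]
    scores = {}
    for i in owned:
        # Step (3): Top n similar items of each item the user already has.
        for j in sorted(sims[i], key=sims[i].get, reverse=True)[:n]:
            if j not in owned:
                scores[j] = max(scores.get(j, 0.0), sims[i][j])
    # Step (4): sort all candidates by score and keep the Top n.
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Because the pairwise loop is quadratic in the number of items, the sketch is only meant for small examples.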
Both CF algorithms require similarity computation, but the overall algorithm framework is not tied to any specific similarity method; the system simply exposes similarity computation as an open interface. In practice several similarity measures can be used, such as Euclidean distance and the Jaccard coefficient.
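The two measures named above can be written as interchangeable functions, which is one way to realize an open similarity interface. The conversion of Euclidean distance into a similarity via 1 / (1 + d) is a common convention assumed here, not something the text specifies.

```python
from math import sqrt


def euclidean_similarity(a, b):
    """a, b: item -> rating. Distance over shared items, mapped into (0, 1]."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    distance = sqrt(sum((a[k] - b[k]) ** 2 for k in common))
    return 1.0 / (1.0 + distance)  # assumed distance-to-similarity mapping


def jaccard_similarity(a, b):
    """Size of the intersection of the two item sets over size of their union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0
```

Either function can be passed wherever a recommender expects a callable taking two preference dictionaries.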
In application scenarios it is hard to say which algorithm is better; performance depends on the actual data distribution:
1. When the item-item matrix is fairly dense, so that the relation between most items can be expressed by a score, and the relation discriminates well (the scores are fairly evenly distributed rather than confined to a certain interval), the item-based algorithm tends to perform better.
2. Another scenario that favors the item-based algorithm is when the number of items is significantly smaller than the number of users; conversely, if there are fewer users than items, the user-based algorithm should be chosen.
3. Data stability is also a factor in choosing the algorithm: whichever of items and users is more stable, the corresponding algorithm tends to work better.
4. If we pursue recommendation diversity rather than accuracy, the user-based algorithm performs better.
These rules of thumb are not always valid; in practice a better recommendation scheme has to be found through extensive experiments.
There is no unified industry standard for evaluating how well a recommender system performs. Besides metrics such as precision and recall that are commonly used in machine learning, we also pay attention to the diversity of recommendations, i.e. how rich a user's recommendation results are.
In the big data era, single-machine recommendation algorithms can hardly cope. Using the MapReduce framework, the Hadoop ecosystem already provides a complete set of CF algorithms in a package named Mahout, which implements not only the item-based and user-based algorithms but also several similarity and neighborhood algorithms. Collaborative filtering can also be implemented on higher-level computing frameworks such as Spark.
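For reference, precision and recall for a single user's recommendation list can be computed as in the sketch below. The function name and the idea of a known set of `relevant` items (for example, items the user later interacted with) are assumptions made for illustration.

```python
def precision_recall(recommended, relevant):
    """Precision: share of recommended items that are relevant.
    Recall: share of relevant items that were recommended."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For a list of four recommendations of which two are among three relevant items, precision is 2/4 and recall is 2/3.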
User-based algorithm:
(1) Build the data model and initialize the user2item and item2user data structures.
(2) According to the user-item-neighborhood relationship, use a similarity algorithm to compute the Top n neighborhood of each user globally (over all users).
(3) Use the user-neighborhood-item relationship to compute the possible items.
(4) Use the similarity between the user's items and the possible items to recommend for the current user.
Item-based algorithm:
(1) Build the data model and initialize the user2item and item2user data structures.
(2) According to the user-item-user-item relationship, compute the possible items of each user.
(3) Compute the relevance between each possible item and the current user:
$$\mathrm{pref}_{i2i}(j) = \sum_{i=0}^{n} \mathrm{sim}_{i2i}(i, j) \cdot \mathrm{pref}(i)$$
$$\mathrm{sim}_{i2i}(j) = \sum_{i=0}^{n} \mathrm{sim}_{i2i}(i, j)$$
$$\mathrm{preference}(j) = \frac{\mathrm{pref}_{i2i}(j)}{\mathrm{sim}_{i2i}(j)}$$
(4) Sort by preference score and select the highest-scoring items as the recommended items.
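The preference score in step (3) is a similarity-weighted average of the current user's existing preferences. A minimal sketch, assuming a caller-supplied function `sim(i, j)` for the item-to-item similarity (all names here are hypothetical):

```python
def predicted_preference(j, user_prefs, sim):
    """Weighted average of the user's preferences, weighted by sim(i, j).

    user_prefs: item i -> pref(i) for the current user.
    sim: function (i, j) -> item-to-item similarity.
    """
    numerator = sum(sim(i, j) * pref for i, pref in user_prefs.items())
    denominator = sum(sim(i, j) for i in user_prefs)
    return numerator / denominator if denominator else 0.0


table = {("A", "X"): 0.5, ("B", "X"): 0.5}
score = predicted_preference("X", {"A": 4, "B": 2}, lambda i, j: table.get((i, j), 0.0))
print(score)  # 3.0
```

With equal similarities the score reduces to the plain mean of the user's preferences, here (4 + 2) / 2.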
The MapReduce framework mentioned above is a distributed computing framework that decomposes a task into a map phase and a reduce phase. The map phase outputs data in the <key, value> schema, and the reduce phase applies a specific algorithm to all values of each key. As shown in Fig. 12, to implement a distributed CF algorithm in the MapReduce framework, the map phase prepares the input data: it parses the input, loads the initial data schema, and normalizes the data into <key, value> form, where the key is the userID (user identifier) and the value is the itemID (item identifier) and score. The reduce phase initializes the Mahout data model and some global data structures (neighborhood object, recommender object, similarity object, etc.) and then carries out the actual recommendation (user-based or item-based recommendation).
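The map and reduce roles described above can be imitated in a few lines of plain Python. This is a single-process illustration of the data flow only, not Hadoop or Mahout code; the comma-separated input format and the `recommend` callback are assumptions introduced for the example.

```python
from collections import defaultdict


def map_phase(lines):
    """Parse raw 'userID,itemID,score' lines into <key, value> pairs."""
    for line in lines:
        user_id, item_id, score = line.strip().split(",")
        yield user_id, (item_id, float(score))


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, recommend):
    """Run the recommendation callback once per user key."""
    return {user: recommend(user, values) for user, values in grouped.items()}
```

In a real deployment the grouping step is performed by the framework, and each reducer sees only the keys assigned to it.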
But, existing CF algorithm there is also big data problem, data skew problem and Sparse and asks Topic.Problems above can be solved by above-mentioned recommendation method based on computer system.Below will be from This recommendation method based on computer system is further described in detail by these three aspect.
1. clustering method solves big data problem
In the actual application scenarios that data scale is bigger, such as in hundred million rank data volumes, we use Clustering method degrades problem.Cluster is a kind of unsupervised learning algorithm, for a certain class object, than Such as user or project, it is divided in multiple classification according to object properties, it is not necessary to manually mark, I.e. without under any manual intervention premise, we are expressed as a feature list (feature each item List), clustering algorithm can be automatically performed cluster (cluster) process.
Preferably, we choose user as clustering object, i.e. similar on feature User gathers in same class;The most why not choose item as clustering object?Reason be if We select item as clustering object, and in final cluster result, the items of certain classification only can limit to On certain several item, so run counter to recommending diversity index, affect the multiformity of recommendation results, So we are using user as cluster result.Another reason is that we use item-based algorithm to make For main body proposed algorithm, if in cluster process or use item to cluster, to a certain extent Can recommend to produce with item-based and repeat, the most also can affect the multiformity of arithmetic result.Certainly, In other embodiments of the invention, it would however also be possible to employ user-based is as main body proposed algorithm, choosing Take item as clustering object.
Prepare Feature: we each user as an object (object), then by this User characterization, every historical record of this user is counted as a feature, such as user i one Bar record<i, t, s>, represents that user i is s to the preference of item t, then we add a feature for it " t:s ", the most each user is characterized.
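The feature preparation just described can be sketched as follows: each record <i, t, s> contributes one feature string "t:s" to user i. The function name and the tuple representation of records are assumptions made for illustration.

```python
from collections import defaultdict


def build_features(records):
    """records: iterable of (user, item, score) tuples, one per <i, t, s>.

    Returns user -> list of "item:score" feature strings.
    """
    features = defaultdict(list)
    for user, item, score in records:
        features[user].append(f"{item}:{score}")
    return dict(features)


records = [("u1", "t1", 5), ("u1", "t2", 3), ("u2", "t1", 4)]
print(build_features(records))
# {'u1': ['t1:5', 't2:3'], 'u2': ['t1:4']}
```

The resulting feature lists can then be vectorized for whichever clustering algorithm is chosen.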
Optionally, the cluster size is chosen so that about 10,000,000 users are gathered into one category, which ensures that no deadlock occurs on the distributed computing platform. Of course, a different number of users per category may be set according to actual needs.
The bottleneck of the clustering algorithm is computing the similarity between items. Preferably, we use the Canopy algorithm to determine the initial centers and then perform the final clustering with K-means. The Canopy algorithm first divides the whole data set into r subsets, where any two subsets may overlap; K-means clustering is then run within each subset, and no similarity is computed between data in different subsets. The flow chart of the clustering method is shown in Figure 13. Of course, in other embodiments of the invention, K-means or another clustering algorithm may be applied directly to the whole data set.
The Canopy algorithm proceeds as follows:
(1) Vectorize the data set to obtain a list and load it into memory. Select two distance thresholds T1 and T2, where T1 > T2; the values of T1 and T2 can be determined by cross-validation;
(2) Take an arbitrary point P from the list and use a low-cost method to quickly compute the distance between P and every Canopy (if no Canopy exists yet, make P a Canopy). If the distance between P and some Canopy is within T1, add P to that Canopy;
(3) If the distance between P and some Canopy is within T2, delete P from the list. At this step P is considered close enough to that Canopy that it can no longer serve as the center of another Canopy;
(4) Repeat steps (2) and (3) until the list is empty.
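As a non-limiting illustration, the Canopy steps above can be sketched in Python as follows. Several details are assumptions made only for this sketch: the distance to a Canopy is taken to be the distance to its center, Euclidean distance stands in for the low-cost distance, points are visited once in list order, and a point that is not within T2 of any existing Canopy becomes the center of a new Canopy. The function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Stand-in for the low-cost distance measure (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy(points, t1, t2, distance=euclidean):
    """Single-pass Canopy grouping with thresholds t1 > t2.

    Returns a list of (center, members) pairs. A point may belong to
    several canopies, so the resulting subsets can overlap.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    canopies = []
    for p in points:
        strongly_bound = False
        for center, members in canopies:
            d = distance(p, center)
            if d <= t1:
                members.append(p)      # step (2): loosely bound to this canopy
            if d <= t2:
                strongly_bound = True  # step (3): too close to become a center
        if not strongly_bound:
            canopies.append((p, [p]))  # assumption: p starts a new canopy
    return canopies


# Hypothetical one-dimensional layout embedded in 2-D.
result = canopy([(0, 0), (0.5, 0), (3, 0), (10, 0)], t1=4, t2=1)
```

In this sample the point (3, 0) is within T1 of the first Canopy but not within T2, so it joins that Canopy and also becomes the center of a second one, which is the overlap between subsets mentioned in the text. The centers can then seed K-means within each subset.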
2. Reconstructing the CF algorithm: solving the data skew problem with a top-N method
As shown in Figure 14, the reconstructed CF algorithm is as follows:
(1) From each user's history records, compute the relations between the different items of the same user; the data schema is <item1, score1, item2, score2>.
(2) Using item1_item2 as the key, compute the similarity between the two items.
(3) For each item, retain only the top M most similar items to form topItemList; recommendations for users are also drawn from this topItemList.
(4) From userItemList (the user's item list), take only the top T items for each user to generate betterItemList (the user's preference list).
(5) Take the items from each user's betterItemList, combine them with each item's topItemList, and filter out the items the user has already interacted with, generating itemCandidateList (the initial recommendation result).
(6) If itemCandidateList contains fewer than N items, first restore betterItemList to userItemList; if itemCandidateList still contains too few items, restore topItemList to the full data.
(7) From itemCandidateList, compute the top N items according to similarity and user preference and use them as the recommendation result.
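As a non-limiting illustration, steps (3) to (7) can be sketched in Python as follows, assuming the item-item similarities of steps (1) and (2) are already available as a nested dictionary. The function names, the sample data, and the scoring rule (sum of preference times similarity) are assumptions made for this sketch; the text only states that the top N are computed according to similarity and user preference.

```python
import heapq
from collections import defaultdict


def top_similar(sim, m=None):
    """Keep the m most similar neighbours per item (all of them if m is None)."""
    out = {}
    for item, nbrs in sim.items():
        ranked = sorted(nbrs.items(), key=lambda kv: (-kv[1], kv[0]))
        out[item] = ranked if m is None else ranked[:m]
    return out


def candidates(seeds, neighbours, seen):
    """Score unseen neighbour items; the scoring rule is an assumption."""
    scores = defaultdict(float)
    for item, pref in seeds:
        for other, s in neighbours.get(item, []):
            if other not in seen:
                scores[other] += pref * s
    return scores


def recommend(user_items, sim, m, t, n):
    """user_items: {item: preference}; sim: {item: {other: similarity}}."""
    seen = set(user_items)
    all_seeds = sorted(user_items.items(), key=lambda kv: (-kv[1], kv[0]))
    top_item_list = top_similar(sim, m)
    # Step (6): widen the inputs only when fewer than n candidates are found.
    attempts = [
        (all_seeds[:t], top_item_list),   # betterItemList + topItemList
        (all_seeds, top_item_list),       # userItemList + topItemList
        (all_seeds, top_similar(sim)),    # userItemList + full similarity data
    ]
    for seeds, nbrs in attempts:
        scores = candidates(seeds, nbrs, seen)
        if len(scores) >= n:
            break
    # Step (7): highest score first, ties broken by item id.
    return heapq.nsmallest(n, scores, key=lambda i: (-scores[i], i))


# Hypothetical similarity data and one user's preferences.
sim = {
    "a": {"b": 0.9, "c": 0.5, "d": 0.1},
    "b": {"a": 0.9, "c": 0.4},
    "e": {"d": 0.8, "f": 0.3},
}
user_items = {"a": 5.0, "e": 1.0}
```

Truncating to the top M neighbours and top T seed items bounds the work per user, which is how the skew caused by very popular items or very active users is limited; the fallbacks only run for users whose candidate list would otherwise be too short.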
3. Solving the data sparsity problem
During experiments we found that some experimental data exhibit a serious data sparsity problem: when the similarity between items is computed, only a small fraction of item pairs have a relation, and most items have no direct relation to one another. We therefore define a data sparsity degree: [formula not reproduced in this text], where l is the number of i2i pairs computed by the CF algorithm and k is the number of distinct items; the smaller this metric, the sparser the data. It will be appreciated that in other embodiments of the invention, other definitions of data sparsity may be used to detect the data sparsity problem.
Preferably, the data sparsity problem is solved as follows:
(1) Compute CF with the traditional method.
(2) Compute the statistic DSP over the result. If DSP is less than the threshold (the data sparsity threshold), perform i2i completion. The threshold is defined as DST = α, where α is user-defined.
(3) The i2i completion algorithm is itemA -> itemB -> itemC: an intermediate item is used to establish a relation between the items on either side, where itemA and itemB are neighbors and itemB and itemC are neighbors. For example, if itemA and itemB have similarity SAB and itemB and itemC have similarity SBC, then itemA and itemC have similarity SAC = SAB * SBC, or [formula not reproduced in this text].
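As a non-limiting illustration, the i2i completion of step (3) with SAC = SAB * SBC can be sketched in Python as follows. The function name is hypothetical. Two choices are assumptions made for this sketch: existing direct relations are never overwritten, and when several intermediate items connect the same pair, the largest product is kept. The DSP check of step (2) is left out because the sparsity formula is not reproduced in this text.

```python
def complete_similarity(sim):
    """Add two-hop relations A -> B -> C with S_AC = S_AB * S_BC.

    sim: {item: {neighbour: similarity}}. Returns a new dictionary and
    leaves the input untouched.
    """
    completed = {a: dict(nbrs) for a, nbrs in sim.items()}
    for a, nbrs in sim.items():
        for b, s_ab in nbrs.items():
            for c, s_bc in sim.get(b, {}).items():
                if c == a or c in nbrs:
                    continue  # skip self-pairs and existing direct relations
                s_ac = s_ab * s_bc
                # Assumption: keep the strongest of several two-hop paths.
                if s_ac > completed[a].get(c, 0.0):
                    completed[a][c] = s_ac
    return completed


# Hypothetical data: A and C are both neighbours of B but not of each other.
sparse = {"A": {"B": 0.5}, "B": {"A": 0.5, "C": 0.4}, "C": {"B": 0.4}}
dense = complete_similarity(sparse)
```

Because the hops are read from the original `sim` rather than from `completed`, one call adds only second-degree relations; calling the function again on its own output would extend the completion to higher degrees.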
Experiments show that the completion algorithm typically adds about 30% new data, a strong supplement to the data available for recommendation.
The above is only one preferred embodiment of the present invention. The improvements, when combined, form the preferred embodiment of the invention, but each improvement may also be used on its own. In addition, each parameter mentioned in the above embodiment may be set as required.
Each method embodiment of the present invention may be implemented in software, hardware, firmware, and the like. Regardless of whether the invention is implemented in software, hardware, or firmware, the instruction code may be stored in any type of computer-accessible memory (for example permanent or modifiable, volatile or non-volatile, solid-state or non-solid-state, fixed or removable media). Likewise, the memory may be, for example, programmable array logic (Programmable Array Logic, "PAL"), random access memory (Random Access Memory, "RAM"), programmable read-only memory (Programmable Read Only Memory, "PROM"), read-only memory (Read-Only Memory, "ROM"), electrically erasable programmable read-only memory (Electrically Erasable Programmable ROM, "EEPROM"), a magnetic disk, an optical disc, a digital versatile disc (Digital Versatile Disc, "DVD"), and so on.
The third embodiment of the present invention relates to a recommendation apparatus based on a computer system. Figure 15 is a schematic structural diagram of this apparatus. As shown in Figure 15, the apparatus includes:
A user-item initial relation computing module, configured to obtain each user's item rating records for the items.
A clustering module, configured to cluster according to the item rating records of each user obtained by the user-item initial relation computing module, dividing the user feature data into R categories, where R is an integer greater than 1; and
A recommending module, configured to recommend items to a target user on an item basis within the user feature data of each category divided by the clustering module. It will be appreciated that in various embodiments of the invention, the recommending module may use a recommendation algorithm based on collaborative filtering, on association rules, or on utility to recommend items to the target user.
It will further be appreciated that in other embodiments of the invention, the clustering module may instead cluster items, with the recommending module then recommending items to the target user on a user basis within the user feature data of each category; alternatively, both the clustering and the recommendation may be user-based, or both may be item-based.
In the recommendation apparatus of this embodiment, the clustering module first clusters according to each user's item rating records, dividing the user feature data into multiple categories, and the recommending module then recommends items to the target user on an item basis within the user feature data of each category. This enables efficient recommendation on big data while ensuring the stability of the system and the diversity of the recommendations.
Preferably, the computer system is a distributed system that includes at least two computing nodes.
The recommending module is configured to distribute the user feature data of the categories to multiple computing nodes, each computing node holding the user feature data of at most R-1 categories, and each computing node recommending items to the target user on an item basis within the user feature data of each category it holds. No computing node needs to hold the user feature data of all categories, which avoids running out of memory.
Preferably, each computing node holds and processes the user feature data of one category. It will also be appreciated that, in embodiments of the invention, the user feature data of two or more categories may be distributed to a more highly configured computing node according to the configuration of each node. Of course, when the amount of user feature data is not very large, it may also be processed by a single computing node.
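As a non-limiting illustration, one possible way to distribute R categories across computing nodes while respecting the limit of at most R-1 categories per node is sketched below in Python. The greedy rule, the notion of a numeric capacity per node, and the function name `assign_categories` are all assumptions made for this sketch; the text does not prescribe a particular assignment policy.

```python
def assign_categories(num_categories, capacities):
    """Assign category ids 0..R-1 to nodes, favouring higher-capacity nodes.

    capacities: one positive number per node (a hypothetical measure of
    how highly configured the node is). No node receives more than R-1
    categories, so no node holds the data of every category.
    """
    if num_categories < 2 or len(capacities) < 2:
        raise ValueError("need at least 2 categories and 2 nodes")
    loads = [[] for _ in capacities]
    for category in range(num_categories):
        eligible = [
            i for i in range(len(capacities))
            if len(loads[i]) < num_categories - 1
        ]
        # Pick the node whose relative load would stay lowest.
        node = min(eligible, key=lambda i: ((len(loads[i]) + 1) / capacities[i], i))
        loads[node].append(category)
    return loads


even = assign_categories(4, [1, 1])      # two equally configured nodes
skewed = assign_categories(3, [10, 1])   # one node is far more capable
```

With equal capacities the categories alternate between nodes; with a much stronger first node, that node takes categories until it reaches the R-1 limit and the remainder goes elsewhere.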
In an optional embodiment, the apparatus further includes a user scale judging module, configured to judge, before the clustering module performs clustering, whether the number of users is greater than a user scale threshold.
The recommending module is configured to, if the user scale judging module confirms that the number of users is less than the user scale threshold, recommend items to the target user on an item basis directly within all the user feature data.
The clustering module is configured to, if the user scale judging module confirms that the number of users is greater than the user scale threshold, cluster according to each user's item rating records, dividing the user feature data into R categories, where R is an integer greater than 1.
Whether to cluster the users is decided according to the number of users, so as to better suit item recommendation on both small data and big data.
It will further be appreciated that in other embodiments of the invention, the number of users need not be judged, and the users may be clustered directly.
The first embodiment is the method embodiment corresponding to this embodiment, and the two may be implemented in cooperation with each other. The relevant technical details mentioned in the first embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the first embodiment.
The fourth embodiment of the present invention relates to a recommendation apparatus based on a computer system. Figure 16 is a schematic structural diagram of the recommending module in this apparatus.
The fourth embodiment makes the following two main improvements on the basis of the third embodiment:
The first improvement is that, for each item or each user in each category, only the few items most strongly related to it are selected, rather than retaining all related items; this avoids the data skew problem caused by weakly related items. Specifically:
The recommending module uses item-based collaborative filtering to recommend items to the target user. As shown in Figure 16, the recommending module includes:
An item similarity submodule, configured to compute, from the item rating records of each user in the category, the similarity between all items in the category, and to select for each item the M items with the highest similarity, where M is a predefined integer.
A user recommending submodule, configured to select for the target user, from the target user's item rating records in the category, the T items with the highest ratings, where T is a predefined integer; and
An initial recommendation submodule, configured to combine the T items selected for the target user by the user recommending submodule with the M items selected by the item similarity submodule for each of those T items, and to remove from the combination the items in the target user's item list, forming an initial recommendation result.
Preferably, the recommending module further includes:
An initial recommendation judging submodule, configured to judge whether the number of items in the initial recommendation result formed by the initial recommendation submodule is greater than N, where N is a predefined integer.
An initial recommendation screening submodule, configured to, if the initial recommendation judging submodule confirms that the number of items in the initial recommendation result is greater than N, select from the initial recommendation result the N items with the highest similarity and recommend them to the target user; and
A user data scale restoration submodule, configured to, if the initial recommendation judging submodule confirms that the number of items in the initial recommendation result is less than N, combine all items in the target user's item list with the M items selected for each item in that list, and remove from the combination all items in the target user's item list, forming a user data completion recommendation result.
More preferably, the recommending module further includes:
A completion recommendation judging submodule, configured to judge whether the number of items in the user data completion recommendation result formed by the user data scale restoration submodule is greater than N, where N is a predefined integer.
A completion recommendation screening submodule, configured to, if the completion recommendation judging submodule confirms that the number of items in the user data completion recommendation result is greater than N, select from the user data completion recommendation result the N items with the highest similarity and recommend them to the target user; and
An item data scale restoration submodule, configured to, if the completion recommendation judging submodule confirms that the number of items in the user data completion recommendation result is less than N, combine all items in the target user's item list with all items that have a similarity relation with any item in that list, and remove from the combination all items in the target user's item list, forming an item data completion recommendation result.
The second improvement is to use a data sparsity degree to detect the data sparsity problem and, once the problem is found, to perform similarity completion through second-degree relations between items, so as to avoid the effect of data sparsity on recommendation accuracy. Specifically:
The apparatus further includes:
A recommendation result data sparsity judging module, configured to judge whether the data sparsity degree is greater than a data sparsity threshold, the data sparsity degree being [formula not reproduced in this text], where k is the number of item pairs computed to have a similarity relation in the category, l is the number of items in the category, [formula not reproduced in this text]; and
A data sparsity completion module, configured to, if the recommendation result data sparsity judging module confirms that the data sparsity degree is less than the data sparsity threshold, take a first item, a second item, and a third item as a group, where a similarity relation exists between the first item and the second item and between the second item and the third item, and establish a similarity relation between the first item and the third item through the second item.
The recommending module is configured to recommend items to the target user again on an item basis within the category according to the inter-item similarity relations supplemented by the data sparsity completion module, and, if the recommendation result data sparsity judging module confirms that the data sparsity degree is greater than the data sparsity threshold, to recommend to the target user the recommended items computed on an item basis.
It will further be appreciated that if the data sparsity problem persists after similarity completion through second-degree relations between items, similarity completion may be continued through third-degree, fourth-degree, or higher-degree relations between items, so as to avoid the effect of data sparsity on recommendation accuracy.
The improvements above, when combined, form the preferred embodiment of the invention, but each improvement may also be used on its own.
The second embodiment is the method embodiment corresponding to this embodiment, and the two may be implemented in cooperation with each other. The relevant technical details mentioned in the second embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the second embodiment.
In summary, because the application scenario we face has both the number of users and the number of items on the order of hundreds of millions, traditional algorithms cannot meet our needs. In the recommendation method and apparatus based on a computer system described above, the two methods of clustering and reconstructing the CF algorithm solve this problem. After the improvement, with 600 reducers, recommendation over data on the order of hundreds of millions of records can be completed within 90 minutes. Furthermore, an evaluation metric for data sparsity is defined: when the item-item similarity computation finishes, if the result is below a certain threshold of the metric, higher-degree relations between items are computed to perform similarity completion, improving recommendation accuracy.
It should be noted that each module mentioned in the apparatus embodiments of the present invention is a logical module. Physically, a logical module may be a physical module, part of a physical module, or a combination of multiple physical modules; the physical implementation of these logical modules is not itself the most important thing, and the combination of the functions they implement is the key to solving the technical problem posed by the invention. In addition, to highlight the innovative part of the invention, the above apparatus embodiments do not introduce modules that are not closely related to solving the technical problem posed by the invention; this does not mean that no other modules exist in those embodiments.
It should be noted that in the claims and description of this patent, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relation or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a" does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Although the present invention has been illustrated and described with reference to certain preferred embodiments thereof, those of ordinary skill in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims (14)

1. A recommendation method based on a computer system, characterized in that the method comprises the following steps:
obtaining each user's item rating records for the items;
clustering according to the item rating records of each user, dividing user feature data into R categories, where R is an integer greater than 1;
within the user feature data of each of the categories, recommending items to a target user on an item basis.
2. The recommendation method based on a computer system according to claim 1, characterized in that the computer system comprises at least two computing nodes;
in the step of "within the user feature data of each of the categories, recommending items to a target user on an item basis", the user feature data of the categories is distributed to multiple computing nodes, each computing node holds the user feature data of at most R-1 of the categories, and each computing node recommends items to the target user on an item basis within the user feature data of each category it holds.
3. The recommendation method based on a computer system according to claim 1, characterized in that, in the step of "within the user feature data of each of the categories, recommending items to a target user on an item basis", item-based collaborative filtering is used to recommend items to the target user;
the step of "within the user feature data of each of the categories, recommending items to a target user on an item basis" comprises the following sub-steps:
computing, from the item rating records of each user in the category, the similarity between all items in the category, and selecting for each item the M items with the highest similarity, where M is a predefined integer;
selecting for the target user, from the item rating records of the target user in the category, the T items with the highest ratings, where T is a predefined integer;
combining the T items selected for the target user with the M items selected for each of the T items, and removing from the combination the items in the item list of the target user, forming an initial recommendation result.
4. The recommendation method based on a computer system according to claim 3, characterized in that the following sub-steps are further included after the sub-step of forming the initial recommendation result:
judging whether the number of items in the initial recommendation result is greater than N, where N is a predefined integer;
if the number of items in the initial recommendation result is greater than N, selecting from the initial recommendation result the N items with the highest similarity and recommending them to the target user;
if the number of items in the initial recommendation result is less than N, combining all items in the item list of the target user with the M items selected for each item in the item list of the target user, and removing from the combination all items in the item list of the target user, forming a user data completion recommendation result.
5. The recommendation method based on a computer system according to claim 4, characterized in that the following sub-steps are further included after the sub-step of forming the user data completion recommendation result:
judging whether the number of items in the user data completion recommendation result is greater than N, where N is a predefined integer;
if the number of items in the user data completion recommendation result is greater than N, selecting from the user data completion recommendation result the N items with the highest similarity and recommending them to the target user;
if the number of items in the user data completion recommendation result is less than N, combining all items in the item list of the target user with all items that have a similarity relation with any item in the item list of the target user, and removing from the combination all items in the item list of the target user, forming an item data completion recommendation result.
6. The recommendation method based on a computer system according to claim 1, characterized in that the following steps are further included after the step of "within the user feature data of each of the categories, recommending items to a target user on an item basis":
judging whether a data sparsity degree is greater than a data sparsity threshold, the data sparsity degree being [formula not reproduced in this text], where k is the number of item pairs computed to have a similarity relation in the category, l is the number of items in the category, [formula not reproduced in this text];
if the data sparsity degree is less than the data sparsity threshold, taking a first item, a second item, and a third item as a group, where a similarity relation exists between the first item and the second item and between the second item and the third item, establishing a similarity relation between the first item and the third item through the second item, and recommending items to the target user again on an item basis within the category according to the supplemented inter-item similarity relations;
if the data sparsity degree is greater than the data sparsity threshold, recommending to the target user the recommended items computed on an item basis.
7. The recommendation method based on a computer system according to any one of claims 1 to 6, characterized in that the following steps are further included before the step of "clustering according to the item rating records of each user, dividing user feature data into R categories, where R is an integer greater than 1":
judging whether the number of users is greater than a user scale threshold;
if the number of users is less than the user scale threshold, recommending items to the target user on an item basis directly within all the user feature data;
if the number of users is greater than the user scale threshold, proceeding to the step of "clustering according to the item rating records of each user, dividing user feature data into R categories, where R is an integer greater than 1".
8. A recommendation apparatus based on a computer system, characterized in that the apparatus comprises:
a user-item initial relation computing module, configured to obtain each user's item rating records for the items;
a clustering module, configured to cluster according to the item rating records of each user obtained by the user-item initial relation computing module, dividing user feature data into R categories, where R is an integer greater than 1; and
a recommending module, configured to recommend items to a target user on an item basis within the user feature data of each of the categories divided by the clustering module.
9. The recommendation apparatus based on a computer system according to claim 8, characterized in that the computer system comprises at least two computing nodes;
the recommending module is configured to distribute the user feature data of the categories to multiple computing nodes, each computing node holding the user feature data of at most R-1 of the categories, and each computing node recommending items to the target user on an item basis within the user feature data of each category it holds.
10. The recommendation apparatus based on a computer system according to claim 8, characterized in that the recommending module uses item-based collaborative filtering to recommend items to the target user;
the recommending module comprises:
an item similarity submodule, configured to compute, from the item rating records of each user in the category, the similarity between all items in the category, and to select for each item the M items with the highest similarity, where M is a predefined integer;
a user recommending submodule, configured to select for the target user, from the item rating records of the target user in the category, the T items with the highest ratings, where T is a predefined integer; and
an initial recommendation submodule, configured to combine the T items selected for the target user by the user recommending submodule with the M items selected by the item similarity submodule for each of the T items, and to remove from the combination the items in the item list of the target user, forming an initial recommendation result.
11. The recommendation apparatus based on a computer system according to claim 10, characterized in that the recommending module further comprises:
an initial recommendation judging submodule, configured to judge whether the number of items in the initial recommendation result formed by the initial recommendation submodule is greater than N, where N is a predefined integer;
an initial recommendation screening submodule, configured to, if the initial recommendation judging submodule confirms that the number of items in the initial recommendation result is greater than N, select from the initial recommendation result the N items with the highest similarity and recommend them to the target user; and
a user data scale restoration submodule, configured to, if the initial recommendation judging submodule confirms that the number of items in the initial recommendation result is less than N, combine all items in the item list of the target user with the M items selected for each item in the item list of the target user, and remove from the combination all items in the item list of the target user, forming a user data completion recommendation result.
12. The recommendation apparatus based on a computer system according to claim 11, characterized in that the recommending module further comprises:
a completion recommendation judging submodule, configured to judge whether the number of items in the user data completion recommendation result formed by the user data scale restoration submodule is greater than N, where N is a predefined integer;
a completion recommendation screening submodule, configured to, if the completion recommendation judging submodule confirms that the number of items in the user data completion recommendation result is greater than N, select from the user data completion recommendation result the N items with the highest similarity and recommend them to the target user; and
an item data scale restoration submodule, configured to, if the completion recommendation judging submodule confirms that the number of items in the user data completion recommendation result is less than N, combine all items in the item list of the target user with all items that have a similarity relation with any item in the item list of the target user, and remove from the combination all items in the item list of the target user, forming an item data completion recommendation result.
13. The computer system based recommendation apparatus according to claim 8, characterized in that the apparatus further comprises:
a recommendation result data sparsity judging module, configured to judge whether a data sparsity is greater than a data sparsity threshold, the data sparsity being computed from k and l, where k is the number of item pairs in the category that are calculated to have a similarity relation and l is the number of items in the category; and
a data sparsity completion module, configured to, if the recommendation result data sparsity judging module confirms that the data sparsity is less than the data sparsity threshold, take a first item, a second item, and a third item as a group, wherein a similarity relation exists between the first item and the second item and between the second item and the third item, and establish, through the second item, a similarity relation between the first item and the third item;
wherein the recommending module is configured to recommend items to the target user, again on an item basis within the category, according to the inter-item similarity relations supplemented by the data sparsity completion module; and, if the recommendation result data sparsity judging module confirms that the data sparsity is greater than the data sparsity threshold, to recommend to the target user the recommended items calculated on an item basis.
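The sparsity check and the completion step can be sketched as follows. The text above does not preserve the sparsity formula, so the sketch assumes one: k divided by the number of possible item pairs, l(l-1)/2. That assumption is consistent with the claim's logic, where a value below the threshold triggers completion, but it is not confirmed by the source. All function names are hypothetical, and the completion is applied once rather than repeated to a full transitive closure.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set

Pair = FrozenSet[str]  # an unordered pair of two distinct item ids


def data_sparsity(pairs: Set[Pair], item_count: int) -> float:
    # ASSUMPTION: the source formula is lost; this uses k / (l * (l - 1) / 2).
    possible = item_count * (item_count - 1) // 2
    return len(pairs) / possible if possible else 0.0


def complete_once(pairs: Set[Pair]) -> Set[Pair]:
    """Link first and third items that share a common second item."""
    neighbours: Dict[str, Set[str]] = {}
    for pair in pairs:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    completed = set(pairs)
    for linked in neighbours.values():
        # Every two neighbours of the same "second" item become similar.
        for first, third in combinations(sorted(linked), 2):
            completed.add(frozenset((first, third)))
    return completed


def maybe_complete(pairs: Set[Pair], item_count: int, threshold: float) -> Set[Pair]:
    # Complete only when the (assumed) sparsity value is below the threshold.
    if data_sparsity(pairs, item_count) < threshold:
        return complete_once(pairs)
    return set(pairs)
```

For example, with pairs (a, b) and (b, c) over three items, the assumed sparsity is 2/3; a threshold of 0.9 triggers completion and adds the pair (a, c).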
14. The computer system based recommendation apparatus according to any one of claims 8 to 13, characterized in that the apparatus further comprises a user scale judging module, configured to judge, before the clustering module performs clustering, whether the number of users is greater than a user scale threshold;
wherein the recommending module is configured to, if the user scale judging module confirms that the number of users is less than the user scale threshold, recommend items to the target user on an item basis directly over all user characteristic data; and
the clustering module is configured to, if the user scale judging module confirms that the number of users is greater than the user scale threshold, perform clustering according to the item rating records of each user and divide the user characteristic data into R categories, R being an integer greater than 1.
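The user scale check in claim 14 amounts to a dispatch step placed before clustering, which can be sketched as below. The names `partition_users` and `cluster_by_top_item` are hypothetical. The toy clustering function, which groups users by their highest-rated item, is only a stand-in: the claim does not specify a clustering algorithm, and the stand-in does not guarantee R > 1. The sketch also assumes every user has at least one rating.

```python
from typing import Callable, Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating


def cluster_by_top_item(ratings: Ratings) -> List[Ratings]:
    """Toy stand-in for clustering: group users by their highest-rated item."""
    groups: Dict[str, Ratings] = {}
    for user, scores in ratings.items():
        # Ties between equally rated items are broken alphabetically.
        top = max(sorted(scores), key=scores.get)
        groups.setdefault(top, {})[user] = scores
    return [groups[key] for key in sorted(groups)]


def partition_users(
    ratings: Ratings,
    user_scale_threshold: int,
    cluster_fn: Callable[[Ratings], List[Ratings]],
) -> List[Ratings]:
    """Return the categories over which item-based recommendation would run."""
    if len(ratings) > user_scale_threshold:
        # Large user base: split the user data into categories first.
        return cluster_fn(ratings)
    # Small user base: recommend directly over all user data as one category.
    return [ratings]
```

Passing the clustering step in as a function keeps the threshold logic separate from whichever clustering method is actually used.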
CN201410736666.3A 2014-12-04 2014-12-04 Computer system based recommendation method and apparatus Pending CN105718488A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410736666.3A CN105718488A (en) 2014-12-04 2014-12-04 Computer system based recommendation method and apparatus
PCT/CN2015/095834 WO2016086802A1 (en) 2014-12-04 2015-11-27 Computer system-based recommendation method and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410736666.3A CN105718488A (en) 2014-12-04 2014-12-04 Computer system based recommendation method and apparatus

Publications (1)

Publication Number Publication Date
CN105718488A true CN105718488A (en) 2016-06-29

Family

ID=56091009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410736666.3A Pending CN105718488A (en) 2014-12-04 2014-12-04 Computer system based recommendation method and apparatus

Country Status (2)

Country Link
CN (1) CN105718488A (en)
WO (1) WO2016086802A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106162529A (en) * 2016-07-08 2016-11-23 北京邮电大学 Indoor orientation method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455613A (en) * 2013-09-06 2013-12-18 南京大学 Interest aware service recommendation method based on MapReduce model
CN103678672A (en) * 2013-12-25 2014-03-26 北京中兴通软件科技股份有限公司 Method for recommending information

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685458B (en) * 2008-09-27 2012-09-19 华为技术有限公司 Recommendation method and system based on collaborative filtering
US9087123B2 (en) * 2009-12-18 2015-07-21 Toyota Jidosha Kabushiki Kaisha Collaborative filtering using evaluation values of contents from users
CN103389966A (en) * 2012-05-09 2013-11-13 阿里巴巴集团控股有限公司 Massive data processing, searching and recommendation methods and devices
CN103049488B (en) * 2012-12-05 2015-11-25 北京奇虎科技有限公司 A kind of collaborative filtering disposal route and system
CN103412948B (en) * 2013-08-27 2017-10-24 北京交通大学 The Method of Commodity Recommendation and system of collaborative filtering based on cluster

Also Published As

Publication number Publication date
WO2016086802A1 (en) 2016-06-09

Similar Documents

Publication Publication Date Title
CN104866474B (en) Individuation data searching method and device
CN102841946B (en) Commodity data retrieval ordering and Method of Commodity Recommendation and system
US20140108190A1 (en) Recommending product information
CN108121737A (en) A kind of generation method, the device and system of business object attribute-bit
CN103942712A (en) Product similarity based e-commerce recommendation system and method thereof
CN109636494A (en) Drug recommended method and system
CN107169052A (en) Recommend method and device
CN107515886A (en) A kind of recognition methods of tables of data, device and system
CN104615631B (en) A kind of method and device of information recommendation
CN104850567A (en) Method and device for identifying association between network users
CN104239324A (en) Methods and systems for user behavior based feature extraction and personalized recommendation
CN102609422A (en) Class misplacing identification method and device
US20110213786A1 (en) Generating recommended items in unfamiliar domain
CN103136683A (en) Method and device for calculating product reference price and method and system for searching products
Tibély et al. Extracting tag hierarchies
CN106919582A (en) The association of network articles and related information statistical method and device
TW201636877A (en) Filtering data objects
US20130304539A1 (en) User recommendation method and device
CN110019785A (en) A kind of file classification method and device
CN104915440A (en) Commodity de-duplication method and system
Riquelme et al. The neighborhood role in the linear threshold rank on social networks
Chen et al. Does product recommendation meet its Waterloo in unexplored categories? No, price comes to help
CN115659055A (en) Commodity recommendation method, system, equipment and storage medium based on event sequence
Liu et al. Detecting industry clusters from the bottom up based on co-location patterns mining: A case study in Dongguan, China
CN106469182A (en) A kind of information recommendation method based on mapping relations and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160629