CN109478190A - Search region reduction for a recommender system - Google Patents

Search region reduction for a recommender system

Publication number
CN109478190A
Authority
CN
China
Prior art keywords
subfunction
search
entry
objective function
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201680086991.7A
Other languages
Chinese (zh)
Inventor
马克西姆.谢尔盖耶维奇.克利诺夫
亚历山大.尼古拉耶维奇.菲利波夫
维克多.弗拉基米罗维奇.斯米尔诺夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN109478190A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles

Abstract

The present invention relates to an apparatus comprising at least one processing unit configured to: receive a request for identifying entries in a data set, the request specifying an objective function; decompose the objective function into a plurality of subfunctions; determine constant limits of at least one subfunction of the plurality of subfunctions; calculate bounds of the objective function using the constant limits of the at least one subfunction; define a search region of the data set using the calculated bounds; and evaluate the entries of the data set that lie in the search region by processing the objective function on those entries. A recommender system and a method for recommending entries are also disclosed.

Description

Search region reduction for a recommender system
Technical field
The present invention relates to an apparatus, a recommender system and a method for recommending entries. The invention further relates to a computer-readable storage medium storing instructions that configure a computing device to perform the method.
Background
Recommender systems are an important component of present-day communication and information processing systems. The large amount of information and data that users can retrieve, process and upload needs to be preselected or reduced to a quantity or size that a user can handle. Generating meaningful recommendations, however, requires analyzing and processing large data sets, which demands substantial computing resources and computation time. Search methods have been proposed that are tailored to individual data sets in order to handle particular tasks. These methods are difficult to adapt to different recommendation settings. For different data sets, they may fail to reduce the amount of computing resources or to shorten the computation time, and may even produce inaccurate results.
Summary of the invention
The object of the present invention is to provide an apparatus, a recommender system and a method for recommending entries that overcome one or more of the above problems of the prior art.
A first aspect of the present invention provides an apparatus comprising at least one processing unit configured to: receive a request for identifying entries in a data set, the request specifying an objective function; decompose the objective function into a plurality of subfunctions; determine constant limits of at least one subfunction of the plurality of subfunctions; calculate bounds of the objective function using the constant limits of the at least one subfunction; define a search region of the data set using the calculated bounds; and evaluate the entries of the data set that lie in the search region by processing (that is, calculating or applying) the objective function on those entries.
The apparatus of the first aspect is applicable to any kind of data set and query. By specifying the objective function, the received request defines how the entries in the data set are to be evaluated. Because evaluating the objective function on all entries in the data set is too expensive in terms of both computing resources and computation time, the apparatus decomposes the objective function and approximates it using the constant limits of at least one subfunction, which effectively reduces the search region. The search region is used to identify candidate entries, so that the relatively expensive (original) objective function is calculated only on the entries in the search region. A query result, such as a recommendation result, may include a subset of the entries evaluated according to the objective function. For example, the result may include any combination of one or more best or worst entries according to the value (or metric) defined by the objective function.
This approach is very flexible, because the objective function can represent a variety of query types applicable to any type of data set. In addition, by approximating at least some subfunctions with constant limits, for example the subfunctions that drive up the computational cost, the search region can be reduced effectively while an accurate approximation is still obtained, even for non-monotonic objective functions. Decomposing the objective function into a plurality of subfunctions is also advantageous because the individual parts can be processed on dedicated processing units of the apparatus or on external processing units, for example in a computing cloud or cluster, so as to balance computing resources and speed up query processing.
In a first implementation of the apparatus according to the first aspect, the at least one processing unit is configured to remove one or more subfunctions of the plurality of subfunctions that remain constant when the size of the search region changes. Removing at least some of the subfunctions that remain constant under changes in the size of the search region further speeds up subsequent calculations on the objective function. Preferably, the processing unit determines whether such constant subfunctions exist and removes the identified subfunctions from the decomposed objective function. Throughout the specification, the combination of the remaining subfunctions of the objective function is also referred to as the simplified objective function.
In a second implementation of the apparatus according to the first aspect as such or according to the first implementation of the first aspect, the constant limits of the at least one subfunction are determined using a maximum value and/or a minimum value of the input parameters of the at least one subfunction. Preferably, the at least one subfunction is identified as the part of the objective function that requires the most computing resources or computation time. Throughout the specification, the at least one subfunction is therefore also referred to as the computationally intensive part (CIP) of the objective function or of the simplified objective function. The CIP may be bounded by maximum and minimum values, or bounds, that may or may not be attained. The maximum and/or minimum values can be estimated using the minimum/maximum values of the inputs and/or mathematical methods for constructing upper-bound and lower-bound functions of the CIP. These values can be refined further. The maximum and/or minimum values (or their refined values) can be controlled by at least one parameter that characterizes the trade-off between the time required to estimate them and their accuracy. For example, the maximum and/or minimum values can first be estimated roughly and then refined to construct improved constant limits that more closely approximate the bounds of the at least one subfunction.
In a third implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the constant limits are determined by executing a procedure. The constant limits can be determined using a procedural method that can be executed automatically on the apparatus.
In a fourth implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the at least one processing unit is further configured to measure the size of the search region and, if the measured size exceeds a threshold, to determine further constant limits of at least one further subfunction of the plurality of subfunctions. The determined constant limits are used to calculate the bounds of the objective function, which in turn affect the search region. If the size of the search region exceeds the threshold, an iterative method can be initiated. The iterative method can make the search region more accurate by determining further constant limits of the at least one subfunction, of other subfunctions, or of combinations or arrangements thereof. Thus, if the reduced search region is too large, other subfunctions or combinations of subfunctions of the plurality of subfunctions can be identified and the corresponding constant limits determined. The search region is thereby refined automatically by identifying more suitable subfunctions until it has an acceptable size. In addition, the at least one processing unit can determine the constant limits of at least one group of subfunctions of the objective function in parallel and evaluate the sizes of the corresponding search regions in order to find a search region of acceptable size. This allows the computing resources to be fully utilized.
In a fifth implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the at least one processing unit is further configured to calculate further bounds of the objective function using the further constant limits and to define the search region using the calculated further bounds. This refines the search region, which can be controlled by a threshold preconfigured to a suitable size of the search region.
In a sixth implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the at least one further subfunction comprises the at least one subfunction and at least one other subfunction of the plurality of subfunctions. The at least one further subfunction can thus represent a larger fragment of the objective function than the at least one subfunction initially considered. It should be understood, however, that in one or more other implementations the at least one further subfunction can also represent a smaller fragment of the objective function, or an entirely different part, that can be used to further refine the search region.
In a seventh implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the at least one processing unit is configured to decompose the objective function into a sum of a plurality of subfunctions. This simplifies the processing of the individual summands and of their combinations in order to determine a suitable decomposition into the CIP and the remainder.
In an eighth implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the objective function corresponds to a class of requests for a recommender system. Preferably, the query is a top-N or bottom-N search based on the objective function, and the search result includes a set of N entries with the best or worst ratings for one or more users. Further request classes may include interval-based searches, in which values may lie inside (or outside) an interval of any kind, objective functions of several variables such as groups of users or groups of entries, and other conditions or requests with their corresponding subtasks.
In a ninth implementation of the apparatus according to the first aspect as such or according to any of the preceding implementations of the first aspect, the apparatus further comprises at least one data interface coupled to at least one database in order to retrieve the entries of the data set. Through the data interface, the apparatus can access the entries of the data set via one or more interfaces of the database, which can be used to identify, retrieve or store entries in the database. For example, such an interface can allow entries to be identified based on the search region and/or the objective function. Preferably, the database is included in a computing cloud. The computing cloud can be a dedicated cloud, or it can be the same cloud or cluster that is used to process at least part of the request.
A second aspect of the present invention relates to a recommender system comprising an apparatus according to the first aspect or any implementation of the first aspect and at least one database for storing the entries of the data set.
In a first implementation of the recommender system according to the second aspect, the apparatus is connected to, or arranged in, a computing cloud for processing at least part of the request. The computing cloud can provide sufficient computing resources to support the apparatus in processing queries. For example, the apparatus can offload one or more processing tasks of a request to the computing cloud. The computing cloud can be used to calculate the bounds of the objective function and/or to evaluate the objective function on at least some entries. The apparatus can also assess the complexity of a request. Based on the result of this assessment, the apparatus can execute simple requests itself and offload requests on larger data sets to the computing cloud. If the constant limits of at least one subfunction are calculated in the computing cloud, the computing cloud can store the calculated constant limits for further processing. The size of the computing cloud (and its processing resources) can be adapted to the estimated workload or number of users of the recommender system. This provides fast and efficient recommendations for very large data sets and can be further adapted to the expected number of users.
A third aspect of the present invention relates to a method for recommending entries, comprising the steps of: receiving a request for identifying entries in a data set, the request specifying an objective function; decomposing the objective function into a plurality of subfunctions; determining constant limits of at least one subfunction of the plurality of subfunctions; calculating bounds of the objective function using the constant limits of the at least one subfunction; defining a search region of the data set using the calculated bounds; and evaluating the entries of the data set that lie in the search region by processing the objective function on those entries.
The method according to the third aspect of the invention can be performed by the apparatus according to the first aspect or by the recommender system according to the second aspect. Further features or implementations of the method according to the third aspect can perform the functions of the apparatus according to the first aspect, or of the recommender system according to the second aspect, and of their different implementation forms. A fourth aspect of the invention relates to one or more computer-readable media, such as a computer-readable storage medium, storing instructions or program code which, when executed on a computing device, configure the computing device to perform the method according to the third aspect or any of its implementations.
Brief description of the drawings
To describe the technical features of the embodiments of the present invention in more detail, the drawings needed for the description of the embodiments are briefly introduced below. The drawings show only some embodiments of the invention, and these embodiments can be modified without departing from the scope of the invention as defined in the claims.
Fig. 1 is a block diagram of an apparatus according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a recommender system according to another embodiment of the present invention;
Fig. 3 is a flowchart of a method for recommending entries according to an embodiment of the present invention;
Fig. 4 shows an approximate representation of an objective function and a reduced search region as used in a method for recommending entries according to another embodiment of the present invention;
Fig. 5 is a further flowchart of a method for recommending entries according to another embodiment of the present invention.
Detailed description
Fig. 1 is a block diagram of an apparatus according to an embodiment of the present invention.
The apparatus 100 may include at least one processing unit 102 for processing requests concerning entries in a data set, for example search, identification or recommendation requests that produce recommendations of entries in the data set.
The apparatus 100 may include an interface 104 that connects the apparatus 100 to a computing cloud 106. The apparatus 100 may also include a data interface 108 for connecting the apparatus 100 to at least one database 110. As indicated by the dashed lines, the interface 104 and the data interface 108 are optional components.
The at least one processing unit 102 can receive a request for identifying entries in a data set and can retrieve the objective function from the request. The objective function can define a class of request, such as a top-N or bottom-N search. The at least one processing unit 102 can decompose the objective function into a plurality of subfunctions, determine constant limits of at least one subfunction of the plurality of subfunctions, and calculate bounds of the objective function of the search request using the constant limits of the at least one subfunction. The processing unit 102 can define the search region of the data set using the calculated bounds and can evaluate the entries of the data set that lie in the search region by processing (or calculating) the objective function on those entries.
The apparatus 100 can offload at least some processing tasks to the computing cloud 106. For example, the processing unit 102 can submit at least some constant limits to the computing cloud 106, which can then calculate the bounds of the objective function based on the submitted constant limits. In a preferred embodiment, the computing cloud 106 calculates the objective function for the entries. It should be understood, however, that the apparatus 100 is not restricted to offloading particular tasks. Rather, the processing unit 102 can offload any suitable task to the computing cloud 106, preferably taking into account all data transfers needed to offload the corresponding task to the computing cloud 106.
The apparatus 100 can efficiently approximate the objective function and reduce the search region, so that accurate recommendation requests can be executed efficiently for any class of objective function. The apparatus 100 can find accurate solutions for both monotonic and non-monotonic objective functions and for any kind of request.
Fig. 2 is a schematic diagram of a recommender system according to an embodiment of the present invention.
The system 200 may include a recommendation server 202, which can correspond to an apparatus according to an embodiment of the present invention, for example the apparatus 100 of Fig. 1. The recommendation server 202 can communicate with a computing cloud 204, which can be defined as a network or cluster of computing devices or computing units 206 and a cluster or array of storage devices 208. The recommendation server 202 can be accessed by one or more client devices 210 operated by respective users 212. Although only one client device 210 is shown, it should be understood that the recommendation server 202 can be accessed by any number of client devices operated by respective users.
The user 212 can retrieve, process and upload any kind of data and/or content, such as audio, video or other media, as well as other information, documents, files and streams. Throughout the specification, these are generally referred to as entries. The user 212 can, for example, play movie or music files or streams, rate them, and browse other documents and information. In addition, the user 212 can interact explicitly or implicitly with the recommendation server 202, for example via a user interface or an API provided by the client device 210, in order to receive recommendations of entries. The recommendation server 202 can also receive information from the user 212 and store that information as a data set 214.
The recommendation server 202 can use machine learning algorithms to train and build a model 216 that is used for subsequent recommendations. The recommendation server 202 can be accessed by an administrator 218, who can provide further data such as training data 220 in order to control the quality of the training and of the model 216, for example based on metrics such as the root-mean-square error (RMSE).
The recommendation server 202 can communicate with the computing cloud 204 in order to outsource or offload various operations. For example, the recommendation server 202 can use the computing cloud 204 to predict the entries that are to be presented to the user 212 as recommendations.
A recommendation request can be formulated as a query that specifies an objective function for evaluating each entry in the data set 214. For example, a top-N or bottom-N query can be executed to obtain the N best or worst entries according to the objective function.
As discussed below with reference to Fig. 3 and Fig. 5, the recommendation server 202 can process the objective function of the request using a method according to one or more embodiments of the present invention.
Fig. 3 is a flowchart of a method for recommending entries according to an embodiment of the present invention. The method 300 can be executed by the apparatus 100 of Fig. 1 or by the recommendation server 202 of Fig. 2. In particular, the apparatus 100 or the recommendation server 202 can use the method 300 to recommend entries to users based on optimized and flexible request processing. In addition, one or more processing steps of the method 300 can be offloaded to the computing cloud 106 of Fig. 1 or the computing cloud 204 of Fig. 2.
The method 300 can start in block 302 and proceed to block 304, in which a request for identifying entries in a data set is received. The request specifies an objective function, which can represent a metric or score for each entry in the data set. The objective function can (implicitly or explicitly) specify the type of request, such as a top-N request, a bottom-N request, an interval-based request for values within a specified interval, any other type of request, or a combination of conditions in a single request. The objective function can be defined in terms of a plurality of subfunctions, for example as the sum $f(x) = \sum_i f_i(x)$, where each $f_i(x)$ denotes a subfunction of the objective function.
For a top-N request over entries that have corresponding ratings from multiple users, the objective function can, for example, be established by the SVD++ method for recommendation as the following rating formula:
$$f(u, i) = \mu + b_u + b_i + q_i^{T}\left(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\right)$$
where $u$ and $i$ are the indices of the user and of the entry, respectively;
$\mu$ is the average of all (actual) ratings;
$w$ is the number of latent factors, which is a training parameter;
$b_u$ is the $u$-th element of the dense vector of user biases relative to $\mu$;
$b_i$ is the $i$-th element of the dense vector of entry biases relative to $\mu$;
$p_u$ is the $u$-th column of the dense matrix $P$ of user latent factors; the height of $P$ equals $w$ and its width equals the number of users;
$N(u)$ is the $u$-th row of the sparse matrix of the users' implicit, non-rating data about entries, which can comprise all entries the user has viewed; the height of the sparse matrix equals the number of users and its width equals the number of entries;
$y_j$ is the $j$-th column of the dense matrix $Y$ of latent factors for the implicit preferences about entries; the height of $Y$ equals $w$ and its width equals the number of entries;
$q_i$ is the $i$-th column of the dense matrix $Q$ of entry latent factors; the height of $Q$ equals $w$ and its width equals the number of entries.
The rating formula can be used in the training process of the recommender system and for recommending the top-N entries. It should be understood, however, that the present invention is not limited to a particular rating formula, objective function or query type. The rating formula for a top-N request is only an example.
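To make the symbol definitions above concrete, the following minimal sketch evaluates an SVD++-style rating for a single user and entry. It assumes the standard SVD++ form (global mean plus biases plus a scalar product with an implicit-feedback term scaled by $|N(u)|^{-1/2}$); the function name `svdpp_rating` and the list-based data layout are illustrative assumptions, not part of the patent text.

```python
import math

def svdpp_rating(mu, b_u, b_i, p_u, q_i, y_implicit):
    """Predicted rating for one (user, entry) pair (hypothetical helper).

    mu: average of all ratings; b_u, b_i: user and entry biases;
    p_u, q_i: latent-factor vectors of length w;
    y_implicit: list of vectors y_j for j in N(u) (may be empty).
    """
    w = len(p_u)
    # Implicit-feedback term: |N(u)|^(-1/2) * sum of y_j over j in N(u).
    implicit = [0.0] * w
    if y_implicit:
        scale = 1.0 / math.sqrt(len(y_implicit))
        for y_j in y_implicit:
            for k in range(w):
                implicit[k] += scale * y_j[k]
    # Scalar product q_i . (p_u + implicit term): the expensive part.
    dot = sum(q_i[k] * (p_u[k] + implicit[k]) for k in range(w))
    return mu + b_u + b_i + dot

rating = svdpp_rating(3.0, 0.5, -0.2, [1.0, 0.0], [2.0, 1.0],
                      [[1.0, 1.0]] * 4)
```

The scalar product is the only term that depends on both the user and the entry, which is why it is the natural candidate for approximation when the whole data set has to be scanned.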
The method 300 may proceed to block 306, in which the objective function is decomposed into a plurality of subfunctions. The decomposition in block 306 can be used to remove those parts (or subfunctions) of the objective function that remain constant with respect to the reduction of the search region. Processing a request can consist of at least two subtasks: the first subtask finds candidate entries, and the second subtask evaluates the objective function on the candidate entries. Therefore, if at least one subfunction only affects the second subtask and is not needed to identify candidate entries (the first subtask), it can be ignored in the optimization step.
For example, in the example above the subfunction $\mu + b_u$ can be removed, because it is constant with respect to the entries and has no effect on the resulting set of entries. After at least one subfunction of the plurality of subfunctions has been removed, the remaining subfunctions can be referred to as the simplified objective function (SOF). In the example above, the SOF can be given as:
$$\mathrm{SOF}(u, i) = b_i + q_i^{T}\left(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\right)$$
The decomposition can also be used to identify the computationally intensive part (CIP) of the objective function, which is subsequently bounded using constant limits. For example, the CIP can be identified within the SOF. The CIP can be represented by one of the remaining subfunctions obtained by decomposing the objective function, or by a combination of the remaining subfunctions.
For the example above, the CIP can be given as:
$$r(u, i) = q_i^{T}\left(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\right)$$
The objective function $f(u, i) = c + g(u) + h(i) + r(u, i)$ can therefore be decomposed into at least one constant part $c + g(u)$, at least one CIP $r(u, i)$ and at least one remaining subfunction $h(i)$, where $i$ denotes the index of the entry and $u$ denotes the index of the user. Because the exact values of $f(u, i)$ for all entries are unknown and too expensive to calculate, $h(i)$ can be calculated instead, with $r(u, i)$ replaced by the constant value of its lower limit and/or upper limit. In this way $f(u, i)$ can be estimated within the reduced search region.
To approximate the CIP, the method 300 may proceed to block 308, in which the constant limits of at least one subfunction of the plurality of subfunctions that represents the CIP are determined. Any suitable technique can be used to estimate and refine the constant upper and lower limits of the CIP. Table 1 below gives an example of a procedural method. Since the CIP is usually a function of the user ID and the entry ID, the input parameter $x$ can be regarded as a tuple of user ID and entry ID.
Table 1: Algorithm for determining the constant limits of the CIP
The constant limits are estimated from the maximum values of the vectors, which are used to bound the sums of vectors, the scalar product and the non-monotonic summands, yielding the monotonic term sqrt(mnz) * my in the procedural method shown in Table 1. This justifies using the maximum number of non-zero elements in a row of the matrix to estimate both the upper limit and the lower limit.
In the example above, the same lower and upper limit values are used for all entries and users. According to a preferred embodiment, the constant limits can be estimated separately for each entry or user, or for each group of entries or users. This may be governed by one or more parameters that reflect the trade-off between the accuracy of the bounds and the time needed to calculate them.
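The following is a minimal sketch, not the procedure of Table 1, of how a single pair of constant limits valid for all users and entries might be estimated from maximum values. It assumes the CIP has the SVD++ scalar-product form and interprets `mnz` as the largest number of non-zero elements in any row of the implicit-data matrix and `my` as the largest absolute value in $Y$; both interpretations, and the names `cip_value` and `cip_constant_limits`, are assumptions made for illustration.

```python
import math

def cip_value(p_u, q_i, Y, n_u):
    """Exact CIP: q_i . (p_u + |N(u)|^(-1/2) * sum of Y[j] for j in N(u))."""
    w = len(q_i)
    scale = 1.0 / math.sqrt(len(n_u)) if n_u else 0.0
    inner = [p_u[k] + scale * sum(Y[j][k] for j in n_u) for k in range(w)]
    return sum(q_i[k] * inner[k] for k in range(w))

def cip_constant_limits(P, Q, Y, implicit_rows):
    """Rough constant limits (r1, r2) valid for every (user, entry) pair.

    P, Q, Y: lists of latent-factor vectors, one per user / entry / entry.
    implicit_rows: for each user, the list of entry indices in N(u).
    """
    w = len(Q[0])
    mp = max(abs(v) for vec in P for v in vec)
    mq = max(abs(v) for vec in Q for v in vec)
    my = max(abs(v) for vec in Y for v in vec)
    mnz = max(len(row) for row in implicit_rows)
    # Each component of the implicit term is at most sqrt(mnz) * my in
    # absolute value, because |N(u)|^(-1/2) * |N(u)| * my = sqrt(|N(u)|) * my.
    per_component = mp + math.sqrt(mnz) * my
    bound = w * mq * per_component
    return -bound, bound

P = [[0.5, -1.0], [0.2, 0.3]]
Q = [[1.0, 0.5], [-0.4, 0.8], [0.3, -0.3]]
Y = [[0.1, 0.2], [-0.3, 0.4], [0.2, -0.1]]
N = [[0, 2], [1]]
r1, r2 = cip_constant_limits(P, Q, Y, N)
```

These limits are deliberately loose: they cost only a few passes over the factor matrices, and tighter per-user or per-entry limits can be traded against extra computation time as described above.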
Using the constant limits of the CIP determined in block 308, the method 300 may proceed to block 310, in which the bounds of the objective function are calculated using the constant limits of the CIP. The lower-bound and upper-bound functions can then be calculated, for example using a branch-and-bound method or any other suitable technique.
According to one example, for a top-N method the N best entries must be identified for each user. This can be optimized by finding $k$ entries, where $N \le k \ll \#i$ and $\#i$ is the number of entries. If the CIP is bounded by the constant limits $[r_1, r_2]$, then entry $i$ is guaranteed to score higher than entry $i'$ whenever the following holds for the remainder of the SOF: $h(i) + r_1 > h(i') + r_2$. This makes it possible to use $h$ to find the N best entries, whose minimum value is $h_{N,\min}$; the requested entries are then identified among the entries that satisfy $h(i) > h_{N,\min} - (r_2 - r_1)$. In one example, the lower and upper limits of the SOF can be calculated using the procedure shown in Table 2.
Table 2: Algorithm for calculating the lower and upper limits of the SOF
In Table 2, the same bounds are used for all users, so that Compute_limits_step3 of Table 1 can be called once for each entry.
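The candidate filter described above can be sketched as follows. This is an illustrative sketch rather than the procedure of Table 2: the function name `candidates_by_h` is hypothetical, and the comparison is made non-strict so that entries tied with the threshold are kept, which is a conservative assumption.

```python
import heapq

def candidates_by_h(h, n, r1, r2):
    """Indices i whose remainder h[i] could still place them in the top N.

    h: list with h(i) for every entry; n: requested number of entries;
    r1, r2: constant lower and upper limits of the CIP.
    """
    # h_N,min: the smallest h value among the N entries with largest h.
    h_n_min = min(heapq.nlargest(n, h))
    # Any entry below this threshold is beaten by at least N entries
    # regardless of the actual CIP values inside [r1, r2].
    threshold = h_n_min - (r2 - r1)
    return [i for i, value in enumerate(h) if value >= threshold]

region = candidates_by_h([5.0, 4.0, 3.0, 1.0, 0.5], n=2, r1=-0.6, r2=0.6)
# region == [0, 1, 2]
```

The narrower the interval between `r1` and `r2`, the fewer entries survive the filter, which is the motivation for refining the constant limits.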
The method 300 may proceed to block 312, in which the search region of the data set is defined using the calculated bounds of the objective function. The search region can be reduced by removing those entries or users that cannot be part of the result. How this is done can depend on the type of search request. For a top-N request, for example, the N largest lower limits can be found and the smallest of them identified; this value can be denoted $b$. The entries whose upper limit is greater than $b$ are then identified. Although the exact values of the objective function for these entries are unknown, they may be among the top N entries. The number of entries removed in order to reduce the search region also depends on the training data, which determines how large the interval between the lower and upper limits may be and how strongly $b_i$ varies from one entry to another.
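The top-N pruning rule described above can be written down for arbitrary per-entry bounds. The sketch below is a minimal illustration under assumptions: the name `search_region` is hypothetical, and the comparison is non-strict so that an entry whose upper limit equals the threshold is kept.

```python
import heapq

def search_region(lower, upper, n):
    """Indices of entries that may still belong to the top N.

    lower[i] and upper[i] bound the objective function for entry i.
    """
    # b: the smallest of the N largest lower limits.
    b = min(heapq.nlargest(n, lower))
    # An entry whose upper limit is below b is beaten by at least N entries.
    return [i for i in range(len(lower)) if upper[i] >= b]

region = search_region(lower=[4.0, 2.5, 1.0, 3.0],
                       upper=[5.0, 3.5, 2.9, 3.2], n=2)
# region == [0, 1, 3]
```

Entry 2 is dropped because even its upper limit (2.9) is below the second-largest lower limit (3.0), so it cannot reach the top 2 whatever its exact value is.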
In an optional step (not shown), the size of the search region can be determined. If the search region is found to be unsuitable or inaccurate, another CIP can be identified using another decomposition of the SOF or another selection of a subset of the plurality of subfunctions, in order to obtain a more suitable search region. For example, a larger fragment of the SOF, which includes other subfunctions of the plurality of subfunctions or a different selection of subfunctions, can be considered as the CIP. Considering a larger fragment of the SOF may lead to a larger interval between the lower and upper bounds, but to a less accurate calculation of the remainder of the objective function. However, a larger interval may lead to a smaller reduction of the search region and to more calculations for evaluating the SOF. As an alternative, a smaller fragment of the SOF, which may be another subfunction or a part of the previously considered subfunctions, can be considered as the CIP. This may increase the amount of calculation in subsequent steps. On the other hand, because the search region is reduced effectively, more entries may be excluded from further consideration, which reduces the overall calculation required. Since the extent to which the search region can be reduced can be checked quickly based on a rough estimate of the CIP, the processing efficiency of method 300 can be further improved.
Method 300 may proceed to box 314, where the exact objective function can be calculated for all entries in the search region in order to evaluate the entries and to provide search or recommendation results in response to the request.
Fig. 4 schematically shows the influence of constant CIP bounds, according to one or more embodiments, on the identification of candidate entries in the reduced search region. A method according to an embodiment of the invention, such as method 300 described with reference to Fig. 3, can be used to estimate the objective function and to reduce the search region.
Fig. 4 shows a plurality of entries 402, where the inner circles represent the exact values of the objective function f(x). Although a certain number of entries 402 is shown in Fig. 4, it should be understood that the invention is not limited to a particular number of entries. Because evaluating the objective function for every entry is computationally too intensive, the calculation can be approximated by bounding the CIP of the objective function with constant limits. The regions 404 depict this approximation for each entry 402. Although these regions do not correspond to exact values, they can be calculated in a more efficient way. By determining whether a region 404 falls within (or overlaps) the search region 406, suitable entries are identified using the regions 404 instead of the exact values. For the resulting candidate entries 408, the exact value of the objective function can be calculated.
The following results for top-N requests illustrate the advantageous processing and the effective acceleration of recommendations made using methods according to one or more embodiments of the invention.
The results were calculated in a system that uses the SVD++ method to build the recommendation model, where the Spark GraphX implementation was used for SVD++ training. The results were calculated on the MovieLens data set, which contains 20M ratings of 27K entries by 130K users. The method according to an embodiment of the invention accelerated top-10 queries by a factor of 2.6 and top-1000 queries by a factor of 2.1 relative to a baseline version using standard top-N queries. Further performance comparison results were calculated on larger data sets. These data sets are represented by sparse matrices, referred to as nlpkkt80 and nlpkkt200, selected from a matrix collection. Speed-up ratios of 90 and about 2000 were achieved using the method according to an embodiment of the invention. Table 3 shows more details.
Table 3: Performance for all users on the nlpkkt matrices
On the nlpkkt200 data set, the total evaluation time for all users was about 18 minutes, whereas the baseline would need about 13 days or more of total evaluation time to evaluate all entries for all users in the entire data set.
Fig. 5 is another flowchart of a method according to an embodiment of the invention. The method can be applied to the device 100 in Fig. 1 or to the recommendation server 202 in Fig. 2. In addition, the processing steps of method 500 can be merged with the processing steps of method 300 in any combination, and vice versa.
Method 500 can be based on a request that may include an objective function. Method 500 can start in box 502, where a simplified objective function (SOF) can be defined from the objective function by removing those parts of the objective function that remain unchanged with respect to the reduction of the search region. If there are no such parts, the SOF can be equal to the original objective function.
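The idea of the simplified objective function can be illustrated with a small sketch. It assumes, purely for illustration, a rating predictor of the biased matrix factorization form mu + b_u + b_i + q_i·p_u; this form, the function names, and the sample values are assumptions and are not taken from the description. For a fixed user, mu + b_u is the same for every entry, so removing it does not change the ranking.

```python
def dot(a, b):
    """Plain dot product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))

def objective(mu, b_u, b_i, q_i, p_u):
    # Assumed full objective: global mean + user bias + entry bias + interaction.
    return mu + b_u + b_i + dot(q_i, p_u)

def simplified_objective(b_i, q_i, p_u):
    # SOF: the parts that are constant for a fixed user (mu, b_u) are removed.
    return b_i + dot(q_i, p_u)

mu, b_u = 3.5, 0.2                      # hypothetical global mean and user bias
p_u = [0.5, -1.0]                       # hypothetical user factors
items = [(0.1, [1.0, 0.0]), (-0.3, [0.0, -2.0]), (0.4, [2.0, 1.0])]

full = [objective(mu, b_u, b_i, q_i, p_u) for b_i, q_i in items]
sof = [simplified_objective(b_i, q_i, p_u) for b_i, q_i in items]

rank_full = sorted(range(len(items)), key=lambda i: full[i], reverse=True)
rank_sof = sorted(range(len(items)), key=lambda i: sof[i], reverse=True)
```

Both rankings are identical, which is why the search region can be reduced using the SOF alone.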
The method may proceed to box 504, where a computationally intensive part (CIP) can be selected in the SOF. This decomposes the SOF into the CIP 506 and a remainder 508.
In box 510, constant lower and upper limits or bounds (attainable or unattainable) are determined for the CIP 506. Box 510 greatly reduces the computational intensity of the CIP by approximating the CIP with constant values. In addition, the processing in box 510 can be used to control the trade-off between the time needed to calculate the limits and the accuracy of the calculation. The processing in box 510 can be carried out using a procedural approach.
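One way to obtain such constant limits is to use the minimum and maximum values of the input parameters of the CIP. The sketch below assumes, hypothetically, that the CIP is a dot product between user factors and entry factors and bounds it by interval arithmetic over the per-component extremes of the entry factors. The resulting limits hold for every entry but need not be attained by any of them.

```python
def cip_limits(p_u, q_min, q_max):
    """Constant limits [r1, r2] of dot(q_i, p_u) valid for all entries.

    p_u          : user factors
    q_min, q_max : per-component minimum and maximum over all entry factors
    """
    r1 = r2 = 0.0
    for p, lo, hi in zip(p_u, q_min, q_max):
        a, b = p * lo, p * hi
        # The sign of p decides which end of [lo, hi] gives the smaller product.
        r1 += min(a, b)
        r2 += max(a, b)
    return r1, r2

items_q = [[1.0, 0.0], [0.0, -2.0], [2.0, 1.0]]   # hypothetical entry factors
q_min = [min(column) for column in zip(*items_q)]
q_max = [max(column) for column in zip(*items_q)]
p_u = [0.5, -1.0]                                  # hypothetical user factors
r1, r2 = cip_limits(p_u, q_min, q_max)
exact = [sum(q * p for q, p in zip(q_i, p_u)) for q_i in items_q]
```

The extremes `q_min` and `q_max` depend only on the entries, so they can be computed once and reused for every user.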
Method 500 may proceed to box 512, where the lower and upper limits or bounds of the entire SOF can be calculated. The calculation in box 512 can be based on the processing result of box 510. The processing in box 510 can be carried out using a procedural approach.
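As a minimal sketch of this step, assume that the SOF is the sum of an inexpensive remainder h(i) and the CIP, and that the CIP lies within the constant limits [r1, r2]. The bounds of the SOF then follow by shifting h. The function name and sample values are hypothetical.

```python
def sof_bounds(h, r1, r2):
    """Per-entry lower and upper bounds of SOF(i) = h(i) + CIP(i).

    h      : remainder values h(i), one per entry
    r1, r2 : constant limits of the CIP
    """
    lower = [value + r1 for value in h]
    upper = [value + r2 for value in h]
    return lower, upper

h = [0.5, -0.25, 0.75]                  # hypothetical remainder values
lower, upper = sof_bounds(h, r1=-1.0, r2=3.0)
```

Every interval has the same width r2 - r1, so tighter CIP limits translate directly into tighter SOF bounds.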
In box 514, the results from box 512 can be used to reduce the search region by removing, according to the calculated lower and upper bounds of the SOF, those entries/users that are most likely not part of the recommendation result.
In box 516, the size of the reduced search region can be determined or measured. If the search region is still large, method 500 may return to box 504, where another decomposition of the SOF or another selection of parts of the SOF can be used to determine another CIP for the subsequent processing in the next iteration.
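The loop over boxes 504 to 516 can be sketched as follows. The sketch assumes a list of bound-calculating callables, ordered from coarse to fine, each standing for one choice of CIP; the names `prune`, `iterative_reduction`, and `widened`, the size threshold, and the sample values are all hypothetical.

```python
import heapq

def prune(lower, upper, n):
    """Keep entries whose upper bound reaches the N-th largest lower bound."""
    b = min(heapq.nlargest(n, lower))
    return [i for i, ub in enumerate(upper) if ub >= b]

def iterative_reduction(bound_fns, n, max_size):
    """Try successive CIP choices until the search region is small enough."""
    region = None
    for compute_bounds in bound_fns:
        lower, upper = compute_bounds()
        region = prune(lower, upper, n)
        if len(region) <= max_size:
            break                       # region is acceptable, stop refining
    return region

exact = [5.0, 4.0, 3.5, 1.0, 0.5]       # hypothetical objective values

def widened(width):
    # Stand-in for a CIP choice: bounds lie `width` below and above the value.
    return lambda: ([v - width for v in exact], [v + width for v in exact])

region = iterative_reduction([widened(2.0), widened(0.25)], n=2, max_size=3)
```

With the coarse bounds all five entries remain, so the loop moves on to the finer bounds, which leave three candidates.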
In box 518, the request can be executed using the reduced search region, wherein candidate entries can be identified and the exact value of the objective function can be calculated for all candidate entries in order to generate a response to the request.
One or more steps or boxes of method 500 can be executed in a device according to an embodiment of the invention and/or on a computing cloud of a recommender system according to one or more embodiments of the invention. It should be understood that any suitable tasks can be offloaded from the device or the recommendation server to the computing cloud in any combination.
The above description covers only embodiments of the invention, and the scope of the invention is not limited to the disclosed examples and embodiments. Rather, changes or replacements can readily be made by those skilled in the art. Therefore, the scope of protection of the invention shall be subject to the scope of protection of the appended claims.
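The final step can be sketched as follows: the exact objective function is evaluated only for the candidate entries, and the N best of them are returned. The function name, the call counter, and the sample values are hypothetical and serve only to show that entries outside the reduced search region are never evaluated.

```python
import heapq

def answer_request(candidates, exact_objective, n):
    """Evaluate the exact objective on the candidates and return the top N."""
    scored = [(exact_objective(i), i) for i in candidates]
    return [i for _, i in heapq.nlargest(n, scored)]

exact_values = [4.9, 4.2, 3.6, 1.0, 0.5]   # hypothetical exact objective values
calls = []

def exact_objective(i):
    calls.append(i)                     # record which entries were evaluated
    return exact_values[i]

result = answer_request([0, 1, 2], exact_objective, n=2)
```

Only three of the five entries are evaluated exactly, which is where the saving over the baseline comes from.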

Claims (15)

1. a kind of device characterized by comprising
At least one processing unit, is used for:
The request of the entry in data set for identification is received, it is described to request specified objective function;
Objective function is resolved into multiple subfunctions;
Determine the constant limit of at least one subfunction in the multiple subfunction;
The boundary of the objective function is calculated using the constant limit of at least one subfunction;
Using data set described in calculated boundary definitions region of search;
Pass through the item in the entry in described search region in region of search of the processing target function to assess the data set Mesh.
2. The device according to claim 1, characterized in that the at least one processing unit is configured to remove one or more subfunctions of the plurality of subfunctions that remain unchanged with respect to changes in the size of the search region.
3. The device according to any one of the preceding claims, characterized in that the constant limits of the at least one subfunction are determined using the maximum and/or minimum values of the input parameters of the at least one subfunction.
4. The device according to any one of the preceding claims, characterized in that the constant limits are determined by executing a procedure.
5. The device according to any one of the preceding claims, characterized in that the at least one processing unit is further configured to measure the size of the search region, and, if the measured size exceeds a threshold, the at least one processing unit is further configured to determine other constant limits of at least one other subfunction of the plurality of subfunctions.
6. The device according to claim 5, characterized in that the at least one processing unit is further configured to calculate other bounds of the objective function using the other constant limits, and to define the search region using the calculated other bounds.
7. The device according to claim 5 or 6, characterized in that the at least one other subfunction comprises the at least one subfunction and at least one further subfunction of the plurality of subfunctions.
8. The device according to any one of the preceding claims, characterized in that the at least one processing unit is configured to decompose the objective function into a sum of a plurality of subfunctions.
9. The device according to any one of the preceding claims, characterized in that the objective function corresponds to a request directed to a recommender system.
10. The device according to any one of the preceding claims, characterized in that the query is a top-N or bottom-N search, wherein, based on the objective function, the search result comprises a set of N entries having the best or worst ratings of one or more users.
11. The device according to any one of the preceding claims, characterized by further comprising at least one data interface coupled to at least one database for retrieving the entries in the data set.
12. a kind of recommender system characterized by comprising
According to device described in preceding claims one of them;
At least one database for the entry concentrated for storing data.
13. recommender system according to claim 12, which is characterized in that described device is connected to or is arranged in for handling In the calculating cloud of at least part request.
14. A method for recommending entries, characterized by comprising:
receiving a request for identifying entries in a data set, the request specifying an objective function;
decomposing the objective function into a plurality of subfunctions;
determining constant limits of at least one subfunction of the plurality of subfunctions;
calculating bounds of the objective function using the constant limits of the at least one subfunction;
defining a search region of the data set using the calculated bounds; and
evaluating the entries in the search region of the data set by processing the objective function for the entries in the search region.
15. One or more computer-readable media storing instructions, characterized in that the instructions, when executed on a computing device, configure the computing device to perform the method according to claim 11.
CN201680086991.7A 2016-08-29 2016-08-29 The region of search of recommender system reduces Pending CN109478190A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/RU2016/000582 WO2018044189A1 (en) 2016-08-29 2016-08-29 Search region decreasing for recommendation systems

Publications (1)

Publication Number Publication Date
CN109478190A true CN109478190A (en) 2019-03-15

Family

ID=58402114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680086991.7A Pending CN109478190A (en) 2016-08-29 2016-08-29 The region of search of recommender system reduces

Country Status (2)

Country Link
CN (1) CN109478190A (en)
WO (1) WO2018044189A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7574422B2 (en) * 2006-11-17 2009-08-11 Yahoo! Inc. Collaborative-filtering contextual model optimized for an objective function for recommending items
CN103049523A (en) * 2012-12-20 2013-04-17 浙江大学 Method for solving social recommendation problem by low-rank semi-definite programming
CN103093376A (en) * 2013-01-16 2013-05-08 北京邮电大学 Clustering collaborative filtering recommendation system based on singular value decomposition algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WIKIPEDIA: "Recommender system: Difference", 《HTTPS://EN.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=RECOMMENDER_SYSTEM&DIFF=733707206&OLDID=731173706》 *
里奇(RICCI.F) 等: "《推荐系统:技术、评估及高效算法》", 31 July 2015 *

Also Published As

Publication number Publication date
WO2018044189A1 (en) 2018-03-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190315