CN109478190A - Search region reduction for a recommender system - Google Patents
Search region reduction for a recommender system
- Publication number
- CN109478190A CN201680086991.7A
- Authority
- CN
- China
- Prior art keywords
- subfunction
- search
- entry
- objective function
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
Abstract
The present invention relates to a device comprising at least one processing unit configured to: receive a request for identifying entries in a data set, the request specifying an objective function; decompose the objective function into a plurality of subfunctions; determine constant limits of at least one subfunction of the plurality of subfunctions; calculate bounds of the objective function using the constant limits of the at least one subfunction; define a search region of the data set using the calculated bounds; and evaluate the entries of the data set in the search region by processing the objective function on the entries in the search region. A recommender system and a method for recommending entries are also disclosed.
Description
Technical field
The present invention relates to a device, a recommender system and a method for recommending entries. The invention further relates to a computer-readable storage medium storing instructions for configuring a computing device to perform the method.
Background art
Recommender systems are an important component of present-day communication and information processing systems. The large volume of information and data that users can retrieve, process and upload needs to be preselected or reduced to an amount or size that a user can handle. However, generating meaningful recommendations requires analyzing and processing large data sets, which demands substantial computing resources and computing time. Search methods have been proposed that handle a specific task for a particular data set. These methods are, however, difficult to adapt to different recommendation settings. For different data sets, they may fail to reduce the amount of computing resources or to shorten the computing time, and may even provide inaccurate results.
Summary of the invention
The object of the present invention is to provide a device, a recommender system and a method for recommending entries that overcome one or more of the above-mentioned problems of the prior art.
A first aspect of the invention provides a device comprising at least one processing unit configured to: receive a request for identifying entries in a data set, the request specifying an objective function; decompose the objective function into a plurality of subfunctions; determine constant limits of at least one subfunction of the plurality of subfunctions; calculate bounds of the objective function using the constant limits of the at least one subfunction; define a search region of the data set using the calculated bounds; and evaluate the entries of the data set in the search region by processing (that is, calculating or applying) the objective function on the entries in the search region.
The device of the first aspect is applicable to any type of data set and query. By specifying the objective function, the received request defines how the entries in the data set are to be evaluated. However, evaluating the objective function on all entries in the data set is prohibitively expensive in terms of both computing resources and computing time. The device therefore decomposes the objective function and approximates it using the constant limits of at least one of its subfunctions, which effectively reduces the search region. The search region is used to identify candidate entries, so that the relatively expensive (original) objective function is calculated only on the entries in the search region. A query result, such as a recommendation result, may comprise a subset of the entries evaluated according to the objective function. For example, the result may comprise any combination of one or more best or worst entries according to the value (or metric) defined by the objective function.
This approach is very flexible, because the objective function can represent various query types applicable to any type of data set. Moreover, by approximating at least some subfunctions with constant limits, for example subfunctions that are expensive to calculate, the search region can be reduced effectively while still providing an accurate approximation, even for non-monotonic objective functions. In addition, decomposing the objective function into multiple subfunctions is advantageous because the individual parts can be processed on dedicated processing units of the device or on external processing units, for example in a computing cloud or cluster, which balances the computing resources and speeds up query processing.
According to the first aspect, in a first implementation of the device, the at least one processing unit is configured to remove one or more subfunctions of the plurality of subfunctions that remain constant with respect to changes in the size of the search region. Removing at least some of the subfunctions that remain constant with respect to changes in the size of the search region further speeds up subsequent calculations based on the objective function. Preferably, the processing unit can determine whether such constant subfunctions exist and remove the identified subfunctions from the decomposed objective function. Throughout the specification, the remaining subfunctions of the objective function may also be referred to collectively as the simplified objective function.
According to the first aspect as such or according to the first implementation of the first aspect, in a second implementation of the device, the constant limits of the at least one subfunction are determined using maximum and/or minimum values of the input parameters of the at least one subfunction. Preferably, the at least one subfunction can be identified as the part of the objective function that requires the most computing resources or computing time. Therefore, throughout the specification, the at least one subfunction may also be referred to as the computationally intensive part (CIP) of the objective function or of the simplified objective function. The CIP may be bounded by maximum and minimum values, or limits, that may or may not be attainable. The maximum and/or minimum values can be estimated using the minimum/maximum values of the inputs and/or mathematical methods for constructing upper and lower bound functions of the CIP. These values can be refined further. The maximum and/or minimum values (or their refined values) can be controlled by at least one parameter that characterizes the trade-off between the time required to estimate them and their accuracy. For example, as an initial step, the maximum and/or minimum values can be estimated roughly and then refined to construct improved constant limits that approximate the bounds of the at least one subfunction more closely.
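As a minimal sketch of how constant limits could be derived from the minimum and maximum values of the inputs, the following Python example bounds a scalar product of two vectors whose components are only known to lie in given ranges. The function names and the choice of a scalar product as the subfunction are assumptions made for illustration; the text does not prescribe a particular technique.

```python
def product_bounds(a_lo, a_hi, b_lo, b_hi):
    """Bounds of a*b when a lies in [a_lo, a_hi] and b lies in [b_lo, b_hi]."""
    # The extreme values of a product of two intervals occur at the corners.
    corners = (a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi)
    return min(corners), max(corners)


def dot_product_bounds(q_range, v_range, w):
    """Constant limits (r1, r2) for a scalar product of two length-w vectors.

    q_range and v_range are (min, max) pairs over all components of the
    two input vectors (hypothetical inputs for this sketch).
    """
    lo, hi = product_bounds(q_range[0], q_range[1], v_range[0], v_range[1])
    # Each of the w summands lies in [lo, hi], so the sum lies in [w*lo, w*hi].
    return w * lo, w * hi


# Components of q lie in [-0.5, 1.0], components of v in [0.0, 2.0], w = 3.
r1, r2 = dot_product_bounds((-0.5, 1.0), (0.0, 2.0), 3)
```

The limits obtained this way are coarse but cheap; a refinement step could replace the global ranges with per-user or per-entry ranges at the cost of more preprocessing.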
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a third implementation of the device, the constant limits are determined by executing a procedure. The constant limits can thus be determined using a procedural method that can be executed automatically on the device.
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a fourth implementation of the device, the at least one processing unit is further configured to measure the size of the search region and, if the measured size exceeds a threshold, to determine further constant limits of at least one further subfunction of the plurality of subfunctions. The determined constant limits are used to calculate the bounds of the objective function, which in turn affect the search region. If the size of the search region exceeds the threshold, an iterative method can be initiated. The iterative method can make the search region more precise by determining further constant limits of the at least one subfunction, of other subfunctions, or of combinations or arrangements thereof. Thus, if the reduced search region is too large, other subfunctions or combinations of subfunctions of the plurality of subfunctions can be identified, and corresponding constant limits of those subfunctions or combinations can be determined. This refines the search region automatically by identifying more suitable subfunctions until the search region has an acceptable size. In addition, the at least one processing unit can determine the constant limits of at least one group of subfunctions of the objective function in parallel and assess the size of the corresponding search regions in order to determine a search region of acceptable size. This allows the computing resources to be fully utilized.
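The iterative refinement described above can be sketched as a simple loop. The example below is an illustration under stated assumptions: `limit_candidates` is a hypothetical sequence of progressively tighter constant limits, and `region_for` is a hypothetical callable that maps a pair of limits to the resulting search region. Neither name comes from the source.

```python
def refine_search_region(limit_candidates, region_for, max_size):
    """Try successive constant limits until the search region is small enough.

    Returns the last region computed, or None if no candidates were given.
    """
    region = None
    for limits in limit_candidates:
        region = region_for(limits)
        if len(region) <= max_size:
            break  # acceptable size reached, stop refining
    return region


# Toy data: cheap scores h(i) for six entries; the second-best score is 4.0.
h = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]


def region_for(limits):
    r1, r2 = limits
    # Keep every entry that could still reach the two best ones.
    return [i for i, value in enumerate(h) if value >= 4.0 - (r2 - r1)]


# Rough limits first, refined limits second.
region = refine_search_region([(-2.0, 2.0), (-0.6, 0.6)], region_for, max_size=3)
```

With the rough limits all six entries remain in the region, so the loop moves on to the refined limits, which leave three entries.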
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a fifth implementation of the device, the at least one processing unit is further configured to calculate further bounds of the objective function using the further constant limits, and to define the search region using the calculated further bounds. This refines the search region, which can be controlled by a threshold preconfigured to a suitable size of the search region.
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a sixth implementation of the device, the at least one further subfunction comprises the at least one subfunction and at least one other subfunction of the plurality of subfunctions. The at least one further subfunction may represent a larger segment of the objective function than the at least one subfunction initially considered. It should be understood, however, that in one or more other implementations the at least one further subfunction may also represent a smaller segment, or an entirely different part, of the objective function that can be used to further refine the search region.
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a seventh implementation of the device, the at least one processing unit is configured to decompose the objective function into a sum of multiple subfunctions. This simplifies the processing of the individual summands and of combinations thereof in order to determine a suitable decomposition into the CIP and the remainder.
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in an eighth implementation of the device, the objective function corresponds to a type of request for a recommender system. Preferably, the query is a top-N or bottom-N search based on the objective function, and the search result comprises a set of N entries with the best or worst ratings for one or more users. Furthermore, the request types may include interval-based searches in which values may lie inside (or outside) an interval of any kind, objective functions of many variables such as groups of users or groups of entries, other conditions or requests, and corresponding subtasks.
According to the first aspect as such or according to any of the preceding implementations of the first aspect, in a ninth implementation of the device, the device further comprises at least one data interface coupled to at least one database in order to retrieve the entries in the data set. Through the data interface, the device can access the entries in the data set via one or more interfaces of the database, which can be used to identify, retrieve or store entries in the database. For example, such an interface may allow entries to be identified based on the search region and/or the objective function. Preferably, the database can be included in a computing cloud. The computing cloud may be a dedicated cloud, or may be the same cloud or cluster that is used to process at least part of the request.
A second aspect of the invention relates to a recommender system comprising a device according to the first aspect or any one of the implementations of the first aspect, and at least one database for storing the entries of the data set.
According to the second aspect, in a first implementation of the recommender system, the device is connected to, or arranged in, a computing cloud for processing at least part of the request. The computing cloud can provide sufficient computing resources to support the device in processing queries. For example, the device can offload one or more processing tasks of the request to the computing cloud. The computing cloud can be used to calculate the bounds of the objective function and/or to evaluate the objective function on at least some entries. The device can also assess the complexity of the request. Based on the assessment, the device can execute simple requests itself and offload requests on larger data sets to the computing cloud. If the constant limits of at least one subfunction are calculated in the computing cloud, the computing cloud can store the calculated constant limits for further processing. The size of the computing cloud (and its processing resources) can be adapted to the estimated workload or number of users of the recommender system. This provides fast and effective recommendations for very large data sets and can further adapt to the expected number of users.
A third aspect of the invention relates to a method for recommending entries, comprising the following steps: receiving a request for identifying entries in a data set, the request specifying an objective function; decomposing the objective function into a plurality of subfunctions; determining constant limits of at least one subfunction of the plurality of subfunctions; calculating bounds of the objective function using the constant limits of the at least one subfunction; defining a search region of the data set using the calculated bounds; and evaluating the entries of the data set in the search region by processing the objective function on the entries in the search region.
The method according to the third aspect of the invention can be performed by the device according to the first aspect or by the recommender system according to the second aspect of the invention. Further features or implementations of the method according to the third aspect can perform the functions of the device according to the first aspect, or of the recommender system according to the second aspect, and of their different implementation forms.
A fourth aspect of the invention relates to one or more computer-readable media, such as a computer-readable storage medium, storing instructions or program code that, when executed on a computing device, configure the computing device to perform the method according to the third aspect or one of the implementations of the third aspect.
Brief description of the drawings
To describe the technical features of the embodiments of the invention in more detail, the drawings required for describing the embodiments are briefly introduced below. The drawings represent only some embodiments of the invention, and these embodiments can be modified without departing from the scope of the invention as defined in the claims.
Fig. 1 is a block diagram of a device according to an embodiment of the invention;
Fig. 2 is a schematic diagram of a recommender system according to another embodiment of the invention;
Fig. 3 is a flow chart of a method for recommending entries according to an embodiment of the invention;
Fig. 4 shows an approximate representation of the objective function and of the reduced search region used in a method for recommending entries according to another embodiment of the invention;
Fig. 5 is a further flow chart of a method for recommending entries according to another embodiment of the invention.
Detailed description of the embodiments
Fig. 1 is a block diagram of a device according to an embodiment of the invention.
The device 100 may comprise at least one processing unit 102, which can be used to process requests for entries in a data set, for example search, identification or recommendation requests that can be used to generate recommendations of entries in the data set.
The device 100 may comprise an interface 104 that can connect the device 100 to a computing cloud 106. The device 100 may also comprise a data interface 108 for connecting the device 100 to at least one database 110. As indicated by the dashed lines, the interface 104 and the data interface 108 may be optional components.
The at least one processing unit 102 can be used to receive a request for identifying entries in the data set and can retrieve the objective function from the request. The objective function can define a type of request, such as a top-N or bottom-N search. The at least one processing unit 102 can be used to decompose the objective function into multiple subfunctions, to determine constant limits of at least one of the subfunctions, and to calculate the bounds of the objective function of the search request using the constant limits of the at least one subfunction. The processing unit 102 can define the search region of the data set using the calculated bounds, and can evaluate the entries of the data set in the search region by processing (or calculating) the objective function on the entries in the search region.
The device 100 can offload at least some processing tasks to the computing cloud 106. For example, the processing unit 102 can submit at least some constant limits to the computing cloud 106, which can then calculate the bounds of the objective function based on the submitted constant limits. In a preferred embodiment, the computing cloud 106 can be used to calculate the objective function for the entries. It should be understood, however, that the device 100 is not restricted to offloading particular tasks. Rather, the processing unit 102 can offload any suitable task to the computing cloud 106, preferably taking into account all data transfers required to offload the respective task to the computing cloud 106.
The device 100 can efficiently approximate the objective function and reduce the search region, and can thus efficiently execute accurate recommendation requests for any class of objective function. For any type of request, the device 100 can find accurate solutions for both monotonic and non-monotonic objective functions.
Fig. 2 is a schematic diagram of a recommender system according to an embodiment of the invention.
The system 200 may comprise a recommendation server 202, which may correspond to a device according to an embodiment of the invention, for example the device 100 of Fig. 1. The recommendation server 202 can communicate with a computing cloud 204, which can be defined as a network or cluster of computing devices or computing units 206 and a cluster or array of storage devices 208.
The recommendation server 202 can be accessed by one or more client devices 210 operated by respective users 212. Although only one client device 210 is shown, the recommendation server 202 can be accessed by any number of client devices operated by respective users.
The users 212 can retrieve, process and upload any kind of data and/or content, such as audio, video or media and other information, documents, files and streams. Throughout the specification, these are generally referred to as entries. A user 212 can, for example, play movie or music files or streams, rate them, and browse other documents and information. In addition, a user 212 can interact with the recommendation server 202 explicitly or implicitly, for example via a user interface or an API provided by the client device 210, to receive recommendations of entries. The recommendation server 202 can also receive information from the users 212 and can store this information as a data set 214.
The recommendation server 202 can use machine learning algorithms to train and build a model 216 for subsequent recommendations. The recommendation server 202 can be accessed by an administrator 218, who can provide further data such as training data 220 and can control the quality of the training and of the model 216, for example based on various metrics such as the root-mean-square deviation (also called root-mean-square error, RMSE).
The recommendation server 202 can communicate with the computing cloud 204 to outsource or offload various operations. For example, the recommendation server 202 can use the computing cloud 204 to predict entries to be presented to a user 212 as recommendations.
A recommendation request can be formulated as a query that specifies an objective function for evaluating each entry in the data set 214. For example, a top-N or bottom-N query can be executed to obtain the N best or worst entries according to the objective function.
As discussed below with reference to Fig. 3 and Fig. 5, the recommendation server 202 can process the objective function of the request using a method according to one or more embodiments of the invention.
Fig. 3 is a flow chart of a method for recommending entries according to an embodiment of the invention. The method 300 can be applied in the device 100 of Fig. 1 or in the recommendation server 202 of Fig. 2. In particular, the device 100 or the recommendation server 202 can use the method 300 to recommend entries to users based on optimized and flexible request processing. In addition, one or more processing steps of the method 300 can be offloaded to the computing cloud 106 of Fig. 1 or the computing cloud 204 of Fig. 2.
The method 300 can start at block 302 and proceed to block 304, where a request for identifying entries in a data set can be received. The request specifies an objective function, which can represent a metric or score for each entry in the data set. The objective function can (implicitly or explicitly) specify the request type, such as a top-N request, a bottom-N request, an interval-based request for values within a specified interval, any other type of request, or a combination of conditions in one request. The objective function can, for example, be defined in terms of multiple subfunctions by the sum f(x) = Σ_i f_i(x), where f_i(x) denotes a subfunction of the objective function.
For a top-N request for entries with corresponding ratings from multiple users, the objective function can, for example, be established as the following rating formula using the SVD++ method for recommendations:
where u and i are the indices of the user and the entry, respectively.
μ is the average of all (actual) ratings.
w is the number of latent factors, which is a training parameter.
b_u is the u-th element of the dense vector of user deviations from μ.
b_i is the i-th element of the dense vector of entry deviations from μ.
p_u is the u-th column of the dense matrix P of user latent factors; P has height w and width equal to the number of users.
N(u) is the u-th row of the sparse matrix of the users' implicit, non-rating data about entries, which can be all data on which entries a user has viewed; this sparse matrix has height equal to the number of users and width equal to the number of entries.
y_j is the j-th column of the dense matrix Y of latent factors for implicit preferences about entries; Y has height w and width equal to the number of entries.
q_i is the i-th column of the dense matrix Q of entry latent factors; Q has height w and width equal to the number of entries.
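The rating formula itself is not reproduced in this text. For reference, the standard SVD++ prediction, written with the symbols defined above and with the objective function denoted f(u, i) as in the rest of the description, reads as follows. That the source uses exactly this form is an assumption based on the listed symbols.

```latex
f(u, i) = \mu + b_u + b_i + q_i^{T} \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)
```

Here |N(u)| denotes the number of entries for which user u has implicit data, and the sum runs over those entries.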
The rating formula can be used in the training process of the recommender system and for recommending the top N entries. It should be understood, however, that the invention is not limited to a specific rating formula, objective function or query type. Rather, the rating formula for top-N requests is only an example.
The method 300 can proceed to block 306, where the objective function can be decomposed into multiple subfunctions. The decomposition in block 306 can be used to remove those parts (or subfunctions) of the objective function that remain constant with respect to the reduction of the search region. The processing of a request can consist of at least two subtasks. The first subtask finds candidate entries, and the second subtask evaluates the objective function on those candidate entries. Therefore, if at least one subfunction affects only the second subtask and is not needed to identify candidate entries (the first subtask), it can be ignored in the optimization step.
For the above example, the subfunction μ + b_u can be removed, because it is constant with respect to the entries and has no effect on the resulting set of entries. After at least one of the subfunctions has been removed, the remaining subfunctions can be referred to as the simplified objective function (SOF). In the above example, the SOF can be given as follows:
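The SOF equation is not reproduced in this text. Assuming the standard SVD++ rating formula with the symbols defined above, removing μ + b_u leaves the following expression; the name f_SOF is introduced here only for illustration.

```latex
f_{\mathrm{SOF}}(u, i) = b_i + q_i^{T} \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)
```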
The decomposition can also be used to identify the computationally intensive part (CIP) of the objective function, which is subsequently bounded using constant limits. For example, the CIP can be identified within the SOF. The CIP can be represented by one of the remaining subfunctions obtained by decomposing the objective function, or by a combination of the remaining subfunctions.
For the above example, the CIP can be given as follows:
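The CIP equation is not reproduced in this text. Under the same assumption of the standard SVD++ rating formula, and using the name r(u, i) that the description uses for the CIP, a consistent reading is the scalar-product term, which depends on both the user and the entry:

```latex
r(u, i) = q_i^{T} \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)
```

With this reading, the remaining entry-dependent subfunction is h(i) = b_i, which is cheap to look up.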
Thus, the objective function f(u, i) = c + g(u) + h(i) + r(u, i) can be decomposed into at least one constant part c + g(u), at least one CIP r(u, i) and at least one remaining subfunction h(i), where i denotes the index of the entry and u the index of the user. Since the exact values of f(u, i) for all entries are unknown and too expensive to calculate, h(i) can be calculated instead, with the lower and/or upper limit used as a constant value in place of r(u, i). f(u, i) can therefore be estimated within a reduced search region.
To approximate the CIP, the method 300 can proceed to block 308, where constant limits of at least one subfunction representing the CIP among the multiple subfunctions can be determined. The constant upper and lower limits of the CIP can be estimated and refined using any suitable technique. Table 1 below gives an example of a procedural method. Since the CIP is typically a function of a user ID and an entry ID, the input parameter x can be regarded as a tuple of a user ID and an entry ID.
Table 1: Algorithm for determining the constant limits of the CIP
The constant limits are estimated from the maximum values of the vectors in order to bound the vector sum, the scalar product and the non-monotonic summands, which yields the monotonic term sqrt(mnz) * my in the procedure shown in Table 1. This shows that it is correct to use the maximum number of non-zero elements in a row of the matrix to estimate the upper limit and, in turn, the lower limit.
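The contents of Table 1 are not reproduced in this text, so the following Python sketch is not the procedure of Table 1. It is a minimal illustration, assuming the SVD++ form of the CIP, of how a term of the form sqrt(mnz) * my can arise. Here `mnz` is taken to be the maximum number of non-zero elements in a row of the implicit-data matrix and `my` the largest absolute value in Y; both interpretations, and the names `mp`, `mq`, `cip_constant_limits` and `cip`, are assumptions of this sketch. Matrices are stored as lists of latent-factor vectors.

```python
import math


def max_abs(vectors):
    """Largest absolute component over a list of vectors."""
    return max(abs(x) for vector in vectors for x in vector)


def cip_constant_limits(P, Q, Y, N):
    """Constant limits (r1, r2) valid for every user/entry pair.

    P[u], Q[i], Y[j] are latent-factor vectors of length w;
    N[u] lists the entries for which user u has implicit data.
    """
    w = len(Q[0])
    mnz = max(len(items) for items in N)
    my, mp, mq = max_abs(Y), max_abs(P), max_abs(Q)
    # |N(u)|^(-1/2) * |sum of y_j| <= sqrt(|N(u)|) * my <= sqrt(mnz) * my
    per_component = mp + math.sqrt(mnz) * my
    r2 = w * mq * per_component
    return -r2, r2


def cip(u, i, P, Q, Y, N):
    """Exact value of the assumed CIP for user u and entry i."""
    w = len(Q[0])
    scale = len(N[u]) ** -0.5 if N[u] else 0.0
    v = [P[u][k] + scale * sum(Y[j][k] for j in N[u]) for k in range(w)]
    return sum(Q[i][k] * v[k] for k in range(w))


# Toy model: two users, three entries, w = 2 latent factors.
P = [[0.5, -1.0], [0.2, 0.3]]
Q = [[1.0, 0.5], [-0.5, 2.0], [0.0, 1.0]]
Y = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.5]]
N = [[0, 2], [1]]
r1, r2 = cip_constant_limits(P, Q, Y, N)
```

These limits are deliberately loose. They require only one pass over each matrix, which is the kind of trade-off between accuracy and estimation time that the description mentions.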
In the above example, all entries and users use the same lower and upper limit values. According to a preferred embodiment, constant limits can be estimated separately for each entry/user or for groups of entries/users. This may be governed by one or more parameters that reflect the trade-off between the accuracy of the bounds and the time needed to calculate them.
Using the constant limits of the CIP determined in block 308, the method 300 can proceed to block 310, where the bounds of the objective function are calculated using the constant limits of the CIP. The lower and upper bound functions can then be calculated, for example, using a branch-and-bound method or any other suitable technique.
According to one example, for a top-N method, the N best entries must be identified for each user. This can be optimized by finding k entries, where N ≤ k << #i and #i is the number of entries. If the CIP is bounded by the constant limits [r_1, r_2], then, for the remainder h of the SOF, an entry i is certain to outrank an entry i' whenever h(i) + r_1 > h(i') + r_2. This makes it possible to use h to find the N best entries, the smallest value among them being h_N,min; the requested entries are then identified among the entries with h(i) > h_N,min - (r_2 - r_1). In one example, the lower and upper limits of the SOF can be calculated using the procedure shown in Table 2.
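The threshold rule above can be sketched in a few lines of Python. The function name and the toy values are hypothetical. The sketch also uses `>=` instead of the strict `>` from the text, a conservative choice that keeps ties at the threshold, for example when r_1 equals r_2.

```python
import heapq


def candidates_from_h(h, r1, r2, n):
    """Indices of entries that can still be among the N best.

    h[i] is the cheap remainder of the SOF for entry i; the CIP is only
    known to lie in the constant interval [r1, r2].
    """
    h_n_min = heapq.nlargest(n, h)[-1]   # N-th largest value of h
    threshold = h_n_min - (r2 - r1)
    return [i for i, value in enumerate(h) if value >= threshold]


# Five entries, CIP known to lie in [-0.2, 0.2], top-2 request.
h = [0.9, 0.1, 0.5, 0.8, 0.3]
candidates = candidates_from_h(h, -0.2, 0.2, 2)
```

In this toy case three of the five entries remain, so the expensive CIP only has to be calculated for those three.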
Table 2: Algorithm for calculating the lower and upper limits of the SOF
In Table 2, the same limits are used for all users, so that compute_limits_step3 from Table 1 can be called once for each entry.
The method 300 can proceed to block 312, where the search region of the data set can be defined using the calculated bounds of the objective function. The search region can be reduced by removing those entries/users that cannot be part of the result. This may depend on the type of search request. For example, for a top-N request, the N highest lower bounds can be found and the smallest of them identified. This smallest lower bound can be denoted b. Then the entries whose upper bound is greater than b are identified. Although the exact value of the objective function for these entries is unknown, they are the entries that may be in the top N. The number of entries removed to reduce the search region may also depend on the training data, which can determine how large the gap between the lower and upper bounds may be and how much b_i varies from one entry to another.
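When each entry has its own lower and upper bound, the top-N reduction described above can be sketched as follows. The function name and data are hypothetical, and `>=` is used instead of a strict comparison so that an entry whose bounds coincide with b is not dropped.

```python
import heapq


def reduce_search_region(lower, upper, n):
    """Keep the entries whose upper bound reaches the N-th best lower bound."""
    b = heapq.nlargest(n, lower)[-1]     # smallest of the N highest lower bounds
    return [i for i in range(len(lower)) if upper[i] >= b]


# Per-entry bounds of the objective function for five entries, top-2 request.
lower = [4.0, 1.0, 3.5, 0.5, 2.0]
upper = [5.0, 3.0, 4.5, 1.5, 3.8]
region = reduce_search_region(lower, upper, 2)
```

Entries 1 and 3 are removed because even their upper bounds stay below b = 3.5, so they cannot displace the two entries that are guaranteed to score at least 3.5.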
In an optional step (not shown), the size of the search region can be determined. If the search region is found to be unsuitable or inaccurate, other CIPs can be identified using other decompositions of the SOF or other selections of subsets of the subfunctions, in order to obtain a more suitable search region. For example, a larger segment of the SOF that includes other subfunctions, or a selection of other subfunctions, of the plurality of subfunctions can be regarded as the CIP. Considering a larger segment of the SOF may lead to a larger gap between the lower and upper bounds, but a less accurate calculation of the remainder of the objective function. However, a larger gap may lead to a smaller reduction of the search region and to more calculations when evaluating the SOF. Alternatively, a smaller segment of the SOF, consisting of other subfunctions or of part of the subfunctions previously considered, can be regarded as the CIP. This may increase the amount of calculation in subsequent steps. On the other hand, because the search region is reduced effectively, more entries may be excluded from further consideration, which reduces the overall computational requirements. Since the extent to which the search region can be reduced can be checked quickly based on a rough estimate of the CIP, the processing efficiency of the method 300 can be improved further.
Method 300 may proceed to block 314, where the exact objective function can be calculated for all entries in the search region in order to evaluate the entries and to provide search or recommendation results in response to the request.
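The final evaluation step can be sketched as below, assuming that the search region is already available as a list of candidate entries and that the exact objective function is an ordinary callable. Both names are hypothetical and serve only as an illustration.

```python
import heapq


def top_n_exact(candidates, objective, n):
    """Evaluate the exact objective only for the candidate entries.

    Returns the n best (value, entry) pairs in descending order of value.
    """
    scored = ((objective(entry), entry) for entry in candidates)
    # The key avoids comparing entries with each other when values tie.
    return heapq.nlargest(n, scored, key=lambda pair: pair[0])


# Exact values are needed only for the entries left in the search region.
exact_values = {0: 5.5, 2: 3.2, 3: 3.4}
print(top_n_exact([0, 2, 3], exact_values.get, n=2))
```

The expensive objective function is called once per candidate, so the cost of this step grows with the size of the reduced search region rather than with the size of the whole data set.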
Fig. 4 schematically shows the effect of the constant bounds of the CIP, according to one or more embodiments, on the identification of candidate entries in the reduced search region. A method according to an embodiment of the present invention, such as method 300 described with reference to Fig. 3, can be used to estimate the objective function and to reduce the search region.
Fig. 4 shows multiple entries 402, where the inner circle indicates the exact value of the objective function f(x). Although a certain number of entries 402 is shown in Fig. 4, it should be understood that the invention is not limited to a particular number of entries. Since evaluating the objective function for each entry is computationally too intensive, the calculation can be approximated by bounding the CIP of the objective function with constant limits. This approximation is depicted as a region 404 for each entry 402. Although these regions do not correspond to exact values, they can be computed in a more efficient way. Suitable entries are identified using the regions 404 rather than the exact values, by determining whether a region 404 falls into (or overlaps with) the search region 406. For the resulting candidate entries 408, the exact value of the objective function can be calculated.
The following results for top-N requests illustrate the advantageous processing and the effective acceleration of recommendations made using a method according to one or more embodiments of the present invention.
The results were calculated in a system that uses the SVD++ method to construct the recommendation model, with a Spark GraphX implementation used for SVD++ training. The results were calculated on the MovieLens data set, which has 20M ratings of 27K entries by 130K users. The method according to an embodiment of the invention speeds up top-10 queries by a factor of 2.6 and top-1000 queries by a factor of 2.1 relative to a baseline version that uses standard top-N queries. Further performance comparison results were calculated on larger data sets. These data sets are represented by sparse matrices, referred to as nlpkkt80 and nlpkkt200, selected from a matrix collection. Speed-up ratios of 90 and about 2000 were achieved using the method according to an embodiment of the invention. Table 3 shows more details.
Table 3: Performance for all users on the nlpkkt matrices
On the nlpkkt200 data set, the total evaluation time for all users is about 18 minutes, whereas the baseline needs about 13 days or more of total evaluation time to evaluate all entries for all users in the entire data set.
Fig. 5 is another flow chart of a method according to an embodiment of the invention. The method can be applied in the device 100 of Fig. 1 or in the recommendation server 202 of Fig. 2. In addition, the processing steps of method 500 can be combined in any combination with the processing steps of method 300, and vice versa.
Method 500 can be based on a request that may include an objective function. Method 500 can start at block 502, where a simplified objective function (SOF) can be defined based on the objective function by removing those parts of the objective function that remain unchanged with respect to the reduction of the search region. If no such parts exist, the SOF can be equal to the original objective function.
The method may proceed to block 504, where a computationally intensive part (CIP) can be selected in the SOF. This can decompose the SOF into a CIP 506 and a remainder 508.
In block 510, constant lower and upper limits or bounds (attainable or unattainable) are determined for the CIP 506. Block 510 greatly reduces the computational intensity of the CIP by approximating the CIP with constant values. In addition, the processing in block 510 can be used to control the trade-off between the time needed to calculate the limits and the accuracy of the calculation. The processing in block 510 can be carried out using a procedural approach.
Method 500 may proceed to block 512, where the lower and upper limits or bounds of the entire SOF can be calculated. The calculation in block 512 can be based on the processing results of block 510. The processing in block 510 can be carried out using a procedural approach.
In block 514, the results from block 512 can be used to reduce the search region by removing, according to the calculated lower and upper limits of the SOF, those entries/users that are most likely not part of the recommendation results.
In block 516, the size of the reduced search region can be determined or measured. If the search region is still too large, method 500 may return to block 504, where another decomposition of the SOF or another selection of parts of the SOF can be used to determine a further CIP for subsequent processing in the next iteration.
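The iteration between measuring the search region and selecting another CIP can be sketched as a simple loop. The sketch assumes that each alternative decomposition of the SOF is available as a function returning a valid (lower, upper) pair for an entry, ordered from coarsest to finest, and that the request is a top-N request; the function and parameter names are hypothetical.

```python
def reduce_search_region(entries, bound_options, n, max_size):
    """Try successive CIP choices until the candidate set is small enough.

    bound_options is a list of functions. Each maps a hashable entry to a
    (lower, upper) pair and stands for one decomposition of the SOF.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = list(entries)
    for bounds_of in bound_options:
        bounds = {e: bounds_of(e) for e in candidates}
        lowers = sorted((lo for lo, _ in bounds.values()), reverse=True)
        if len(lowers) > n:
            b = lowers[n - 1]  # smallest of the n largest lower limits
            candidates = [e for e in candidates if bounds[e][1] >= b]
        if len(candidates) <= max_size:
            break  # the search region is small enough
    return candidates


exact = {"a": 10, "b": 8, "c": 5, "d": 1}
coarse = lambda e: (exact[e] - 4, exact[e] + 4)
fine = lambda e: (exact[e] - 0.5, exact[e] + 0.5)
print(reduce_search_region(exact, [coarse, fine], n=1, max_size=1))
```

Here the coarse limits leave three candidates, which exceeds the threshold, so the finer limits are computed for those three entries only. Entries removed in an earlier pass never need the more expensive limits.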
In block 518, the request can be executed using the reduced search region, wherein candidate entries can be identified and the exact value of the objective function can be calculated for all candidate entries in order to generate the response to the request.
One or more steps or blocks of method 500 can be executed in a device according to an embodiment of the invention and/or on a computing cloud of a recommender system according to one or more embodiments of the invention. It should be understood that any suitable task can be offloaded in any combination from the device or the recommendation server to the computing cloud.
The above description covers only embodiments of the invention, and the scope of the invention is not limited to the disclosed examples and embodiments. Rather, any change or replacement can readily be made by those skilled in the art. Therefore, the scope of protection of the invention shall be subject to the scope of protection of the appended claims.
Claims (15)
1. A device, characterized by comprising:
at least one processing unit configured to:
receive a request for identifying entries in a data set, the request specifying an objective function;
decompose the objective function into multiple subfunctions;
determine constant limits of at least one subfunction of the multiple subfunctions;
calculate bounds of the objective function using the constant limits of the at least one subfunction;
define a search region of the data set using the calculated bounds; and
evaluate the entries in the search region of the data set by processing the objective function on the entries in the search region.
2. The device according to claim 1, characterized in that the at least one processing unit is configured to remove one or more subfunctions of the multiple subfunctions that remain unchanged with respect to changes in the size of the search region.
3. The device according to any one of the preceding claims, characterized in that the constant limits of the at least one subfunction are determined using maximum and/or minimum values of input parameters of the at least one subfunction.
4. The device according to any one of the preceding claims, characterized in that the constant limits are determined by executing a procedure.
5. The device according to any one of the preceding claims, characterized in that the at least one processing unit is further configured to measure the size of the search region; and, if the measured size exceeds a threshold, the at least one processing unit is further configured to determine further constant limits of at least one further subfunction of the multiple subfunctions.
6. The device according to claim 5, characterized in that the at least one processing unit is further configured to calculate further bounds of the objective function using the further constant limits, and to define the search region using the calculated further bounds.
7. The device according to claim 5 or 6, characterized in that the at least one further subfunction comprises the at least one subfunction and at least one other subfunction of the multiple subfunctions.
8. The device according to any one of the preceding claims, characterized in that the at least one processing unit is configured to decompose the objective function into a sum of multiple subfunctions.
9. The device according to any one of the preceding claims, characterized in that the objective function corresponds to a request directed to a recommender system.
10. The device according to any one of the preceding claims, characterized in that the query is a top-N or bottom-N search, wherein, based on the objective function, the search result comprises a set of N entries having the best or worst ratings for one or more users.
11. The device according to any one of the preceding claims, characterized by further comprising: at least one data interface coupled to at least one database to retrieve the entries in the data set.
12. A recommender system, characterized by comprising:
a device according to any one of the preceding claims; and
at least one database for storing the entries of a data set.
13. The recommender system according to claim 12, characterized in that the device is connected to, or arranged in, a computing cloud for processing at least part of the request.
14. An entry recommendation method, characterized by comprising:
receiving a request for identifying entries in a data set, the request specifying an objective function;
decomposing the objective function into multiple subfunctions;
determining constant limits of at least one subfunction of the multiple subfunctions;
calculating bounds of the objective function using the constant limits of the at least one subfunction;
defining a search region of the data set using the calculated bounds; and
evaluating the entries in the search region of the data set by processing the objective function on the entries in the search region.
15. One or more computer-readable media storing instructions, characterized in that the instructions, when executed on a computing device, configure the computing device to perform the method according to claim 11.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/RU2016/000582 WO2018044189A1 (en) | 2016-08-29 | 2016-08-29 | Search region decreasing for recommendation systems |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109478190A true CN109478190A (en) | 2019-03-15 |
Family
ID=58402114
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201680086991.7A Pending CN109478190A (en) | 2016-08-29 | 2016-08-29 | The region of search of recommender system reduces |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109478190A (en) |
WO (1) | WO2018044189A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7574422B2 (en) * | 2006-11-17 | 2009-08-11 | Yahoo! Inc. | Collaborative-filtering contextual model optimized for an objective function for recommending items |
CN103049523A (en) * | 2012-12-20 | 2013-04-17 | 浙江大学 | Method for solving social recommendation problem by low-rank semi-definite programming |
CN103093376A (en) * | 2013-01-16 | 2013-05-08 | 北京邮电大学 | Clustering collaborative filtering recommendation system based on singular value decomposition algorithm |
Non-Patent Citations (2)
Title |
---|
WIKIPEDIA: "Recommender system: Difference", 《HTTPS://EN.WIKIPEDIA.ORG/W/INDEX.PHP?TITLE=RECOMMENDER_SYSTEM&DIFF=733707206&OLDID=731173706》 * |
里奇(RICCI.F) 等: "《推荐系统:技术、评估及高效算法》", 31 July 2015 * |
Also Published As
Publication number | Publication date |
---|---|
WO2018044189A1 (en) | 2018-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10671933B2 (en) | Method and apparatus for evaluating predictive model | |
US11423082B2 (en) | Methods and apparatus for subgraph matching in big data analysis | |
US20190362222A1 (en) | Generating new machine learning models based on combinations of historical feature-extraction rules and historical machine-learning models | |
CN108052394B (en) | Resource allocation method based on SQL statement running time and computer equipment | |
US9390142B2 (en) | Guided predictive analysis with the use of templates | |
US20170140278A1 (en) | Using machine learning to predict big data environment performance | |
CN110069502A (en) | Data balancing partition method and computer storage medium based on Spark framework | |
CN110825966A (en) | Information recommendation method and device, recommendation server and storage medium | |
US10545972B2 (en) | Identification and elimination of non-essential statistics for query optimization | |
CN115296984B (en) | Abnormal network node detection method and device, equipment and storage medium | |
CN111783810A (en) | Method and apparatus for determining attribute information of user | |
US10824956B1 (en) | System and method for price estimation of reports before execution in analytics | |
CN106874332B (en) | Database access method and device | |
US20140303933A1 (en) | Optimizing analytic flows | |
Kumar et al. | Scalable performance tuning of hadoop MapReduce: A noisy gradient approach | |
US20160189026A1 (en) | Running Time Prediction Algorithm for WAND Queries | |
CN113780287A (en) | Optimal selection method and system for multi-depth learning model | |
CN111510473B (en) | Access request processing method and device, electronic equipment and computer readable medium | |
US20090138237A1 (en) | Run-Time Characterization of On-Demand Analytical Model Accuracy | |
Tesser et al. | Selecting efficient VM types to train deep learning models on Amazon SageMaker | |
CN109478190A (en) | The region of search of recommender system reduces | |
CN113536085B (en) | Method and system for scheduling subject term search crawlers based on combined prediction method | |
CN113326203B (en) | Information recommendation method, equipment and storage medium | |
JP6203313B2 (en) | Feature selection device, feature selection method, and program | |
CN114638316A (en) | Data clustering method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190315 |