WO2016058485A2 - Method and apparatus for calculating a ranking score and establishing a model, and commodity recommendation system - Google Patents

Method and apparatus for calculating a ranking score and establishing a model, and commodity recommendation system

Info

Publication number
WO2016058485A2
Authority
WO
WIPO (PCT)
Prior art keywords
ranking
evaluated
sorting
factor
value
Prior art date
Application number
PCT/CN2015/091216
Other languages
English (en)
French (fr)
Other versions
WO2016058485A3 (zh)
Inventor
刘睿
吕韬
孙超
杨志雄
Original Assignee
阿里巴巴集团控股有限公司
刘睿
吕韬
孙超
杨志雄
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 刘睿, 吕韬, 孙超, 杨志雄
Publication of WO2016058485A2
Publication of WO2016058485A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present application relates to a sorting technique, and in particular to a method for calculating a ranking score of an object to be evaluated.
  • the present application also provides an apparatus for calculating a ranking score of an object to be evaluated, a method and apparatus for establishing a ranking calculation model, and a commodity recommendation system.
  • a sorting factor used in calculating the ranking score is a factor that affects the final ordering of the items.
  • attributes related to the goods can be selected as sorting factors, for example: price, sales volume, number of transactions, number of buyers, number of hot-search-word hits, and so on; the algorithms for calculating the ranking score from these sorting factors also vary.
  • when an application scenario needs to introduce a new sorting factor, the prior art can be divided into the following two implementation schemes:
  • in the first scheme, the sorting factors that affect the final ranking result are relatively stable and there is usually no new sorting factor to integrate, so a model is normally used. The model design is generally fairly complex: it takes into account the various possible relationships between each feature (i.e., each sorting factor) and the final goal, and uses machine learning to determine the weight coefficient of each sorting factor in the model. Adding a sorting factor in this scenario usually requires modifying the established model and re-solving the weight coefficients of all sorting factors in the model.
  • when a new sorting factor is introduced by the first method of the prior art, the model changes, so a large amount of training data must be re-acquired, a machine learning algorithm must be used for training again, and the weight coefficient value of every sorting factor in the new model must be recalculated before the ranking score of a commodity can be calculated according to the model. The whole process is rather complicated.
  • when a new sorting factor is introduced by the second method, manual intervention plays a relatively large role: the weight coefficient of the sorting factor is set entirely from the expert's subjective experience, so the calculated ranking scores are probably not accurate enough to reflect the actual ordering of the goods relatively objectively.
  • the present application provides a method and apparatus for calculating a ranking score of an object to be evaluated, so as to solve the problems that the prior art cannot easily introduce a new sorting factor and that setting weight coefficients purely from expert experience makes the ranking result inaccurate.
  • the present application further provides a method and apparatus for establishing a ranking calculation model, and a commodity recommendation system.
  • the present application provides a method for calculating a ranking score of an object to be evaluated, including:
  • taking minimization of the difference between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained according to the preset ranking score calculation model as the optimization target, and solving the weight coefficient of the newly added sorting factor in the ranking score calculation model.
  • each newly added sorting factor is represented as a sum of power terms;
  • the weight coefficient of the newly added ranking factor refers to a sequence of weight coefficients, and each weight coefficient in the sequence corresponds to one power term of the newly added ranking factor.
  • the sum-of-power-terms representation of each newly added sorting factor specifically refers to a sum of power terms up to the fourth power.
  • the specific sorting target is: a click quantity, a transaction volume, or a transaction amount.
  • the difference between the actual ranking distribution and the predicted ranking distribution specifically refers to a KL distance between the two distributions.
  • solving the weight coefficient of the newly added sorting factor in the ranking score calculation model, with minimization of the difference between the actual ranking distribution obtained according to the actual behavior data and the predicted ranking distribution obtained according to the preset ranking score calculation model as the optimization target, includes:
  • the current value of the newly added ranking factor weight coefficient refers to the weight coefficient value obtained the last time the method was used to solve it;
  • the value of the KL distance expression is minimized as the optimization target, and the value of the weight coefficient of the newly added ranking factor is solved.
  • minimizing the value of the KL distance expression as the optimization target and solving the value of the weight coefficient of the newly added ranking factor is performed using a stochastic gradient descent algorithm or a logistic regression optimization algorithm.
  • Obtaining a predicted ranking distribution of the object to be evaluated by calculating a ratio of a predicted ranking score of the object to be evaluated to a total of a predicted ranking score of all objects to be evaluated;
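  • The ratio-based distributions and the KL distance described above can be sketched as follows. This is a minimal illustration, not the filing's implementation: the function names `to_distribution` and `kl_distance`, the direction KL(actual || predicted), and the small `eps` guard are all assumptions.

```python
import math


def to_distribution(values):
    """Turn non-negative per-object values into shares that sum to 1.

    Applied to actual behavior data this gives the actual ranking
    distribution; applied to predicted ranking scores it gives the
    predicted ranking distribution.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


def kl_distance(actual, predicted, eps=1e-12):
    """KL(actual || predicted); eps (an assumption) guards against log(0)."""
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p, q in zip(actual, predicted)
        if p > 0
    )
```

  • For example, `to_distribution([1, 1, 2])` yields shares of one quarter, one quarter, and one half, and the KL distance between two identical distributions is zero.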
  • the step of solving the weight coefficient of the newly added sorting factor in the ranking score calculation model is then not executed; correspondingly, calculating the ranking score of the object to be evaluated with the ranking score calculation model, taking as input the original score data of the object to be evaluated, the value of the newly added ranking factor, and the solved weight coefficient of the newly added ranking factor, means taking as input the weight coefficient value of the newly added ranking factor obtained by the most recent solution.
  • when the step of calculating the predicted ranking score of the object to be evaluated by using the ranking score calculation model is performed for the first time, the current value of the newly added ranking factor weight coefficient is set to a preset initial value.
  • the predetermined number of objects to be evaluated are selected in descending order of their original score data and serve as the objects to be evaluated used by the method to solve the weight coefficient of the newly added ranking factor.
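  • The selection of a predetermined number of objects by original score can be sketched as below. The dictionary layout with an `original_score` key and the function name `select_top_objects` are hypothetical, chosen only for illustration.

```python
def select_top_objects(objects, limit):
    """Keep at most `limit` objects with the highest original scores.

    `objects` is a list of dicts with an "original_score" key (an assumed
    layout). If there are no more than `limit` objects, all are kept.
    """
    if len(objects) <= limit:
        return list(objects)
    ranked = sorted(objects, key=lambda o: o["original_score"], reverse=True)
    return ranked[:limit]
```

  • Limiting the sample this way keeps the cost of solving the weight coefficients bounded when the candidate set is large.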
  • the present application further provides an apparatus for calculating a ranking score of an object to be evaluated, including:
  • a data obtaining unit configured to obtain the original score data of the object to be evaluated, the value of the newly added sorting factor, and the actual behavior data corresponding to the specific sorting target, extracted from the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system;
  • a weight coefficient calculation unit configured to solve the weight coefficient of the newly added sorting factor in the ranking score calculation model, taking as the optimization target the minimization of the difference between the actual ranking distribution obtained according to the actual behavior data and the predicted ranking distribution obtained according to the preset ranking score calculation model;
  • a ranking score calculation unit configured to calculate the ranking score of the object to be evaluated by using the ranking score calculation model, taking as input the original score data of the object to be evaluated, the value of the newly added sorting factor, and the solved weight coefficient value of the newly added sorting factor.
  • each newly added ranking factor is represented as a sum of power terms.
  • the weight coefficient calculation unit is specifically configured to solve the weight coefficient of the newly added ranking factor in the ranking score calculation model, taking as the optimization target the minimization of the KL distance between the actual ranking distribution obtained according to the actual behavior data and the predicted ranking distribution obtained according to the preset ranking score calculation model.
  • the weight coefficient calculation unit includes:
  • An actual sorting distribution obtaining sub-unit configured to obtain an actual ranking distribution of the object to be evaluated by calculating a ratio of the actual behavior data of the object to be evaluated to the sum of the actual behavior data of all objects to be evaluated;
  • a predicted ranking score calculation subunit configured to calculate the predicted ranking score of the object to be evaluated by using the ranking score calculation model, taking as input the original score data of the object to be evaluated, the value of the newly added sorting factor, and the current value of the newly added ranking factor weight coefficient;
  • the current value of the newly added ranking factor weight coefficient refers to the weight coefficient value obtained the last time the method was used to solve it;
  • a predicted ranking distribution expression obtaining subunit configured to treat the weight coefficient of the newly added sorting factor as an unknown, substitute the original score data of the object to be evaluated and the value of the newly added sorting factor into the ranking score calculation model, and obtain, from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated, an expression of the predicted ranking distribution in terms of the weight coefficient of the newly added sorting factor;
  • a KL distance expression obtaining subunit configured to obtain an expression of a KL distance between the actual sorting distribution and the predicted sorting distribution
  • the weight coefficient solving subunit is configured to minimize the value of the KL distance expression as an optimization target, and solve the value of the weighting coefficient of the newly added sorting factor.
  • the weight coefficient solving subunit is specifically configured to use a stochastic gradient descent algorithm or a logistic regression optimization algorithm to solve the weight coefficient of the newly added sorting factor.
  • the weight coefficient calculation unit further includes:
  • a predicted ranking distribution obtaining subunit configured to, after the actual ranking distribution of the object to be evaluated has been obtained and its predicted ranking score has been calculated, obtain the predicted ranking distribution of the object to be evaluated by calculating the ratio of its predicted ranking score to the sum of the predicted ranking scores of all objects to be evaluated;
  • a KL distance value calculation subunit configured to calculate the KL distance value between the actual ranking distribution and the predicted ranking distribution output by the predicted ranking distribution obtaining subunit;
  • a KL distance value judging subunit configured to determine whether the ratio by which the KL distance value has decreased, compared with the KL distance value calculated the last time the method was used, is less than a preset threshold; if so, the subunits of the weight coefficient calculation unit are not triggered in subsequent uses of the apparatus to calculate the ranking score of the object to be evaluated, and correspondingly the ranking score calculation unit is specifically configured to calculate the ranking score taking as input the original score data of the object to be evaluated, the value of the newly added sorting factor, and the most recently solved weight coefficient value of the newly added ranking factor.
  • the current value of the newly added ranking factor weight coefficient is set to a preset initial value.
  • the device further includes:
  • an object number judging subunit configured to determine, before the weight coefficient calculation unit is triggered, whether the number of objects to be evaluated is greater than the predetermined number of objects to be evaluated required to solve the weight coefficient of the newly added ranking factor;
  • an object selection subunit configured to, when the output of the object number judging subunit is "Yes", select the predetermined number of objects to be evaluated in descending order of their original score data, as the objects to be evaluated used to solve the weight coefficient of the newly added sorting factor.
  • the present application also provides a method for establishing a ranking calculation model, including:
  • the predicted ranking distribution is obtained by taking as input the original score data, the value of the newly added ranking factor, and the current value of the newly added ranking factor weight coefficient, where the current value of the newly added ranking factor weight coefficient refers to the weight coefficient value obtained last time;
  • the step of obtaining the original score data, the value of the newly added ranking factor, and the actual behavior data continues to be performed.
  • the difference between the actual ranking distribution and the predicted ranking distribution specifically refers to the KL distance between the two distributions; correspondingly, the difference value between the two distributions specifically refers to the value of the KL distance.
  • the preset convergence requirement is that the ratio by which the KL distance value calculated this time has decreased, relative to the KL distance value calculated last time, is less than a preset threshold.
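  • The convergence requirement can be sketched as a relative-decrease check. The function name `has_converged`, the default threshold of 0.01, and the handling of the first round are assumptions made for illustration; the text only says that the threshold is preset.

```python
def has_converged(previous_kl, current_kl, threshold=0.01):
    """True when the KL distance decreased by less than `threshold`
    (as a fraction of the previous value) since the last round."""
    if previous_kl is None:
        return False  # first round: nothing to compare against yet
    if previous_kl == 0:
        return True  # distributions already matched exactly
    # A KL value that went up gives a negative ratio and also counts as
    # "no meaningful decrease" under this reading.
    return (previous_kl - current_kl) / previous_kl < threshold
```

  • With the assumed threshold, a drop from 0.10 to 0.0995 (0.5 percent) counts as converged, while a drop from 0.10 to 0.05 does not.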
  • calculating the difference value between the actual ranking distribution obtained according to the actual behavior data and the predicted ranking distribution obtained by using the preset ranking score calculation model includes:
  • Obtaining a predicted ranking distribution of the object to be evaluated by calculating a ratio of a predicted ranking score of the object to be evaluated to a total of a predicted ranking score of all objects to be evaluated;
  • a KL distance value between the actual ranking distribution and the predicted ranking distribution is calculated.
  • the difference between the predicted ranking distribution and the actual ranking distribution is minimized as an optimization target, and the weighting coefficient of the newly added ranking factor in the ranking calculation model is solved, including:
  • the value of the KL distance expression is minimized as an optimization target, and the value of the newly added ranking factor weight coefficient is solved.
  • minimizing the value of the KL distance expression as the optimization target and solving the value of the newly added ranking factor weight coefficient is performed using a stochastic gradient descent algorithm or a logistic regression optimization algorithm.
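  • A minimal sketch of the solving step follows. Two things here are assumptions rather than the filing's method: the model form S_j = S0_j * exp(sum_i w_i * x_j^i) for a single new factor (the model formula is not reproduced in this text), and the use of plain full-batch gradient descent in place of the stochastic gradient descent named above. All function names are hypothetical.

```python
import math


def features(x, degree=4):
    """Power terms x^0 .. x^degree of one newly added ranking factor."""
    return [x ** i for i in range(degree + 1)]


def predicted_distribution(s0, xs, w):
    """Share of each object's predicted score under the ASSUMED model
    S_j = S0_j * exp(sum_i w_i * x_j^i), computed stably in log space."""
    logits = [
        math.log(s) + sum(wi * f for wi, f in zip(w, features(x)))
        for s, x in zip(s0, xs)
    ]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for distributions with q_j > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def solve_weights(s0, xs, actual, w=None, lr=0.1, steps=500):
    """Full-batch gradient descent on KL(actual || predicted)."""
    w = list(w) if w is not None else [0.0] * 5  # preset initial value
    for _ in range(steps):
        q = predicted_distribution(s0, xs, w)
        # For the assumed model: dKL/dw_i = sum_j (q_j - p_j) * x_j^i
        grad = [
            sum((qj - pj) * x ** i for qj, pj, x in zip(q, actual, xs))
            for i in range(len(w))
        ]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w
```

  • Passing the previously solved weights as `w` mirrors the idea of starting each round from the last calculated weight coefficient values; the learning rate and step count are illustrative choices.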
  • the present application further provides an apparatus for establishing a ranking calculation model, including:
  • a data acquisition unit configured to acquire the original score data of the object to be evaluated, the value of the newly added sorting factor, and the actual behavior data corresponding to the specific sorting target, extracted from the historical behavior data corresponding to each object to be evaluated in the actual interaction behavior system;
  • a distribution difference value calculation unit configured to calculate a difference value between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained by using the preset ranking score calculation model;
  • the predicted ranking distribution is obtained by taking as input the original score data, the value of the newly added sorting factor, and the current value of the newly added sorting factor weight coefficient, where the current value of the newly added sorting factor weight coefficient refers to the weight coefficient value calculated last time;
  • a convergence determining unit configured to determine whether the difference value satisfies a preset convergence requirement
  • a weight coefficient optimization unit configured to, when the output of the convergence determining unit is "No", solve the weight coefficient of the newly added sorting factor in the ranking score calculation model, taking minimization of the difference between the predicted ranking distribution and the actual ranking distribution as the optimization target;
  • the loop control unit is configured to trigger the operation of each unit according to a preset time interval.
  • the difference between the predicted ranking distribution and the actual ranking distribution used by the weight coefficient optimization unit refers to the KL distance between the two distributions; and the difference value calculated by the distribution difference value calculation unit refers to the KL distance value between the two distributions.
  • the preset convergence requirement used by the convergence determining unit is that the ratio by which the KL distance value calculated this time has decreased, compared with the KL distance value calculated last time, is less than a preset threshold.
  • the distribution difference value calculation unit includes:
  • An actual sorting distribution obtaining sub-unit configured to obtain an actual ranking distribution of the object to be evaluated by calculating a ratio of the actual behavior data of the object to be evaluated to the sum of the actual behavior data of all objects to be evaluated;
  • a predicted ranking score calculation subunit configured to calculate the predicted ranking score of the object to be evaluated by using the ranking score calculation model, taking as input the original score data of the object to be evaluated, the value of the newly added sorting factor, and the current value of the newly added ranking factor weight coefficient; when the subunit is triggered for the first time, the current value of the newly added ranking factor weight coefficient is set to a preset initial value;
  • a predicted ranking distribution obtaining subunit configured to obtain the predicted ranking distribution of the object to be evaluated by calculating the ratio of the predicted ranking score of the object to be evaluated to the sum of the predicted ranking scores of all objects to be evaluated;
  • the KL distance value calculation subunit is configured to calculate a KL distance value between the actual ranking distribution and the predicted ranking distribution.
  • the weight coefficient optimization unit includes:
  • a predicted ranking distribution expression obtaining subunit configured to treat the weight coefficient of the newly added sorting factor as an unknown, substitute the original score data of the object to be evaluated and the value of the newly added sorting factor into the ranking score calculation model, and obtain the predicted ranking distribution expression from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated;
  • a KL distance expression obtaining subunit configured to obtain an expression of a KL distance between the actual sorting distribution and the predicted sorting distribution
  • the weight coefficient solving subunit is configured to minimize the value of the KL distance expression as an optimization target, and solve the value of the newly added ranking factor weight coefficient.
  • the weight coefficient solving subunit is specifically configured to solve the value of the newly added ranking factor weight coefficient by using a stochastic gradient descent algorithm or a logistic regression optimization algorithm.
  • the application also provides a commodity recommendation system, including:
  • a commodity recommendation server configured to receive a commodity query request from a client and push to the client a plurality of commodities matching the keywords in the query request, where the pushed commodities are the top-ranked commodities obtained by sorting the recommendable candidate commodities by ranking scores pre-calculated with the method for calculating a ranking score of an object to be evaluated according to claim 1.
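  • The server-side ordering of candidate commodities by a pre-calculated ranking score can be sketched as below. The function name `recommend`, the `scores` lookup table, the `top_k` cutoff, and the default score of 0.0 for unscored items are hypothetical details added for illustration.

```python
def recommend(candidates, scores, top_k=10):
    """Return the `top_k` candidate item ids with the highest
    pre-calculated ranking scores, best first.

    `scores` maps item id -> ranking score; items without a score are
    treated as 0.0 (an assumption).
    """
    ranked = sorted(
        candidates,
        key=lambda item_id: scores.get(item_id, 0.0),
        reverse=True,
    )
    return ranked[:top_k]
```

  • Because the scores are computed ahead of time, serving a query only needs keyword matching followed by this sort.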
  • the method for calculating a ranking score of an object to be evaluated provided by the present application obtains the original score data of the object to be evaluated, the value of the newly added sorting factor, and the actual behavior data of users toward the object to be evaluated in the actual interactive behavior system; solves the weight coefficient of the newly added ranking factor in the ranking score calculation model, taking minimization of the difference between the actual ranking distribution and the predicted ranking distribution as the optimization target; and uses the ranking score calculation model to calculate the ranking score of the object to be evaluated from the solution. A new sorting factor can therefore be introduced quickly and conveniently, and because the weight coefficient of the newly added ranking factor in the ranking score calculation model is optimized, the calculated ranking score predicts the ordering of the objects to be evaluated relatively objectively and accurately, bringing it closer to the actual ranking result.
  • the method for establishing a ranking score calculation model provided by the present application takes minimization of the difference between the actual ranking distribution and the predicted ranking distribution as the optimization target, solves the weight coefficient of the newly added ranking factor in the ranking score calculation model, and repeats this iterative optimization, thereby establishing the ranking score calculation model.
  • FIG. 1 is a flowchart of an embodiment of a method for calculating a ranking score of an object to be evaluated according to the present application;
  • FIG. 2 is a flowchart of a process, provided by the present application, for solving the weight coefficient of a newly added sorting factor with minimization of the KL distance between the actual ranking distribution and the predicted ranking distribution as the optimization target;
  • FIG. 3 is a schematic diagram of an embodiment of an apparatus for calculating a ranking score of an object to be evaluated according to the present application;
  • FIG. 4 is a flowchart of an embodiment of a method for establishing a ranking score calculation model of the present application;
  • FIG. 5 is a schematic diagram of an embodiment of an apparatus for establishing a ranking score calculation model of the present application.
  • FIG. 1 is a flowchart of an embodiment of the method for calculating a ranking score of an object to be evaluated. The method includes the following steps:
  • Step 101: Obtain the original score data of the object to be evaluated, the value of the newly added sorting factor, and the actual behavior data corresponding to the specific sorting target, extracted from the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system.
  • based on the original score data, the method for calculating a ranking score of an object to be evaluated provided by the present application introduces a new sorting factor and, using a preset ranking score calculation model, takes minimization of the difference between the predicted ranking distribution and the actual ranking distribution (i.e., the highest similarity between the two distributions) as the optimization target. It solves for the value of the newly added ranking factor weight coefficient and then calculates the ranking score of the object to be evaluated according to the model. In this way, not only is the newly added sorting factor introduced conveniently and naturally, but the calculated ranking score is also relatively accurate, so the ordering of the objects to be evaluated can be predicted objectively and accurately from the score.
  • the original rating data refers to an original ranking score for the object to be evaluated without introducing a new ranking factor.
  • for example, when the method provided by the present application is applied to an online transaction system and the objects to be evaluated are commodities to be sorted, the original ranking score is usually calculated from the following general attributes of the commodities: price, sales volume, number of transactions, number of buyers, number of hot-search-word hits, and the like.
  • the original ranking score may be calculated by a simple weighted summation or by a relatively complex algorithmic model. Whichever calculation method produced the original score data, in different application scenarios (including scenarios that need to introduce a new sorting factor) the data can serve as the base data: a new sorting factor is introduced on top of it, and only the weight coefficient of the new sorting factor needs to be solved, without modifying the original algorithm or model, to recalculate the ranking scores of the objects to be evaluated in those scenarios. The method provided by the present application thus makes it easy to integrate a new ranking factor with the original sorting algorithm or model.
  • the newly added sorting factor may be different elements depending on the application scenario. For example, when the online trading system organizes promotional activities, the seller transaction level and the VIP buyer transaction ratio may serve as newly added sorting factors. Different new sorting factors usually influence the ranking result to different degrees. To reflect the difference, a weight coefficient can be assigned to each new sorting factor, and the product of the factor and its coefficient is used as a component of the ranking score of the object to be evaluated.
  • however, the relationship between a new sorting factor and the ranking result is not always a simple linear correlation, so the simple product term above cannot reflect a nonlinear relationship between them, and a ranking score calculated this way is naturally inaccurate.
  • to address this, the technical solution of the present application provides a preferred embodiment: each newly added ranking factor is represented as a sum of power terms and corresponds to a sequence of weight coefficients, each of which corresponds to one power term of the newly added ranking factor.
  • when the derivatives of a function at a certain point are known, Taylor's formula uses these derivative values as coefficients to construct a polynomial that approximates the value of the function in a neighborhood of that point.
  • the expansion formula of the Taylor formula is as follows:
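  • For reference, the standard form of Taylor's formula around a point $a$ is given below. The symbols $a$, $n$, and the remainder $R_n(x)$ are notation chosen here and may differ from the symbols used in the original filing.

```latex
f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^{2} + \cdots
       + \frac{f^{(n)}(a)}{n!}(x-a)^{n} + R_n(x)
```

  • Truncating at $n = 4$ and dropping the remainder leaves a polynomial in powers of the variable up to the fourth, which matches the fourth-power term summation described for each newly added ranking factor.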
  • the final ranking score S can be regarded as a function f(X) of the new sorting factor X; by selecting appropriate weight coefficients, any relationship between X and S can be fitted well.
  • the sorting score calculation model shown below is preset:
  • where $S_0$ is the original ranking score (i.e., the original score data), $S$ is the ranking score after introducing the new sorting factors, $X$ and $Y$ are the new sorting factors, and $\alpha_i$ and $\beta_i$ are the coefficients of the power terms of each new sorting factor. The weight coefficient of a newly added sorting factor is thus determined by a sequence of coefficients, each of which corresponds to one power term of the sorting factor. For example, the weight coefficient of sorting factor $X$ is determined by $\alpha_0, \alpha_1, \alpha_2, \alpha_3, \alpha_4$, which correspond to the power terms $X^0, X^1, X^2, X^3, X^4$, respectively.
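  • The power-term representation can be sketched as follows. `power_sum` evaluates the five-coefficient sum for one factor; `ranking_score` then combines the sums with the original score. The combination rule used here, S = S0 * exp(sum of the power sums), is purely an assumption for illustration, because the model formula itself is not reproduced in this text; the function names are hypothetical as well.

```python
import math


def power_sum(x, coeffs):
    """Evaluate coeffs[0]*x^0 + coeffs[1]*x^1 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def ranking_score(s0, factor_values, factor_coeffs):
    """Combine the original score with the new factors.

    ASSUMED form: S = S0 * exp(sum_k power_sum(x_k, coeffs_k)), which keeps
    scores positive. The filing's actual formula may differ.
    """
    exponent = sum(
        power_sum(x, coeffs) for x, coeffs in zip(factor_values, factor_coeffs)
    )
    return s0 * math.exp(exponent)
```

  • With all coefficients set to zero the score reduces to the original score, so a newly introduced factor has no effect until its weights have been solved.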
  • in this step, the data needed to solve the weight coefficients in the above model and to calculate the ranking score is obtained, including: the original score data, the value of the newly added sorting factor, and the actual behavior data corresponding to the specific sorting target, extracted from the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system.
  • the original rating data refers to the original ranking score for the object to be evaluated without introducing a new ranking factor.
  • the value may be calculated with the original algorithm or calculation model used before the new ranking factor was introduced, or obtained from other modules or systems responsible for calculating the data; the specific acquisition method is not the core of this application and is not specifically limited by it.
  • the new ranking factor reflects the specific elements that need to be considered for sorting the objects to be evaluated in different application scenarios, and the values need to be obtained in advance.
  • for example, when the method is applied to an online trading system and commodities are to be recommended based on the commodity ranking result, the data can be obtained from the operations department responsible for commodity recommendation; the data reflects the basic thinking of the operations department in recommending commodities during the event.
  • the new ranking factors obtained and their values are organized in the following format:
  • here itemId is the identifier of the item to be sorted, factor1 and factor2 are new sorting factors, and value1 and value2 are the values of those factors. Any other new sorting factors only need to be appended to the end of the record in the factor:value format, joined with a specific separator (for example, a comma).
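  • A record in the factor:value layout described above could be parsed as sketched below. The exact record layout is not reproduced in this text, so the tab between the item identifier and the factor list, the function name `parse_factor_record`, and the sample factor names are all assumptions; only the `factor:value` pairs and the comma separator come from the description.

```python
def parse_factor_record(line, field_sep="\t", pair_sep=","):
    """Parse '<itemId><field_sep>factor1:value1,factor2:value2,...'.

    Returns (item_id, {factor_name: float_value}). The field separator
    is an assumed detail of the record layout.
    """
    item_id, _, rest = line.strip().partition(field_sep)
    factors = {}
    for pair in rest.split(pair_sep):
        if not pair:
            continue  # tolerate an empty factor list or trailing separator
        name, _, value = pair.partition(":")
        factors[name.strip()] = float(value)
    return item_id, factors
```

  • Appending another `name:value` pair to the record needs no parser change, which mirrors the extensibility the format is meant to provide.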
  • because the actual ranking distribution of the objects to be evaluated is needed later, this step also acquires the historical behavior data corresponding to each object to be evaluated in the actual interaction behavior system and extracts from it the actual behavior data corresponding to the specific sorting target.
  • the actual interaction behavior system refers to a system in which a user interacts with an object to be evaluated.
  • for example, when the interaction behavior system is an online transaction system (for example, the Taobao transaction platform), the specific sorting target may be the click quantity, the transaction volume, the transaction amount, and so on. With the largest transaction amount as the sorting target, this step extracts, from the log files in which the online transaction system stores users' historical behavior data, the transaction amount of each commodity to be sorted within a set time period (for example, the past 7 days) as the actual behavior data corresponding to the specific sorting target.
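  • Extracting per-item transaction amounts over a recent window can be sketched as below. The log record layout `(timestamp, item_id, amount)` and the function name `transaction_amounts` are hypothetical; the real log format of the transaction system is not described in the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def transaction_amounts(log_records, now, days=7):
    """Sum the transaction amount per item over the last `days` days.

    `log_records` is an iterable of (timestamp, item_id, amount) tuples,
    an assumed layout for the historical behavior log.
    """
    cutoff = now - timedelta(days=days)
    totals = defaultdict(float)
    for timestamp, item_id, amount in log_records:
        if cutoff <= timestamp <= now:
            totals[item_id] += amount
    return dict(totals)
```

  • The resulting totals are the actual behavior data; dividing each by their sum gives the actual ranking distribution.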
  • The foregoing acquisition of the original score data, the values of the newly added sorting factors, and the actual behavior data corresponding to the specific sorting target may be performed by a data collection module or a data collection system, which thereby prepares the data for the calculations in steps 102 and 103.
  • Step 102 Taking minimization of the KL distance between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained from the preset ranking calculation model as the optimization target, solve for the weighting coefficients of the newly added sorting factors in the ranking calculation model.
  • The method provided by the present application for calculating ranking scores of objects to be evaluated can predict the ordering of those objects from the calculation result, so it can also be regarded as a leaderboard algorithm. The goal is naturally that the predicted ranking obtained by calculating the ranking scores be as close as possible to the actual ranking.
  • For example, a commodity with a large transaction amount should be ranked ahead of one with a small transaction amount.
  • More generally, it is desirable that the predicted ranking distribution be as close as possible to the actual ranking distribution.
  • The technical solution of the present application therefore takes minimization of the difference between the actual ranking distribution and the predicted ranking distribution as the optimization target, solves for the weighting coefficient of each newly added ranking factor in the preset ranking calculation model, and uses the solved coefficients to calculate the ranking scores of the objects to be evaluated.
  • the KL distance is used to measure the proximity of the two distributions, or the similarity. In other embodiments, other indicators capable of measuring the similarity of the distribution may also be used.
  • the process of solving the newly added ranking factor weighting coefficient includes steps 102-1 to 102-8, which will be further described below with reference to FIG.
  • Step 102-1 Obtain an actual ranking distribution of the objects to be evaluated.
  • In step 101, the actual behavior data corresponding to the specific ranking target was extracted from the historical behavior data.
  • In this step, the actual ranking distribution of the objects to be evaluated is obtained by calculating, for each object, the ratio of its actual behavior data to the sum of the actual behavior data of all objects to be evaluated.
  • In this example, the transaction amount of each item to be sorted in the set time period has been obtained with the maximum transaction amount as the sorting target. In this step, the sum of the transaction amounts of all items to be sorted is calculated first, and then the ratio of each item's transaction amount to that sum is computed, which gives the actual sorting distribution of the items to be sorted.
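  • The normalization described in this step amounts to dividing each value by the total. A minimal sketch follows; the function name `to_distribution` and the sample amounts are illustrative only.

```python
def to_distribution(values):
    """Turn per-object non-negative values into shares that sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("the sum of the behavior data must be positive")
    return {key: value / total for key, value in values.items()}


amounts = {"A": 30.0, "B": 10.0}
print(to_distribution(amounts))
```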
  • Step 102-2 Calculate the predicted ranking score of the object to be evaluated.
  • The method provided by the present application for calculating ranking scores of objects to be evaluated may be executed repeatedly at a certain time interval, and the weight coefficient values of the newly added sorting factors are calculated at each execution.
  • Over repeated executions, the weight coefficients are expected to become more accurate, and the predicted sorting distribution obtained from the ranking scores to move closer to the actual sorting result; this is a process of gradual optimization.
  • In this step, the original score data of the object to be evaluated, the values of the newly added ranking factors, and the current values of their weight coefficients are taken as input, and the ranking calculation model is used to calculate the predicted ranking score of the object to be evaluated. The current value of a weight coefficient is the value obtained the last time the method was executed.
  • If the method is being executed for the first time, the current value of each weight coefficient may be set to a preset initial value in this step.
  • In this example, the initial value of each weight coefficient is set to -1.
  • Step 102-3 Obtain a predicted ranking distribution of the object to be evaluated.
  • In step 102-2, the predicted ranking score of each object to be evaluated was obtained.
  • In this step, the sum of the predicted ranking scores of all objects to be evaluated is calculated first, and then the ratio of each object's predicted ranking score to that sum is calculated, which gives the predicted ranking distribution of the objects to be evaluated.
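  • Steps 102-2 and 102-3 can be sketched as follows. The ranking calculation model (Formula 1) is not reproduced in this section, so the sketch substitutes a hypothetical model, `original_score * exp(sum of weight * factor value)`, chosen only because it keeps scores positive. The function names and sample data are likewise assumptions; only the initial weight of -1 comes from the text.

```python
import math


def predicted_score(original_score, factor_values, weights):
    """Hypothetical stand-in for Formula 1: original score scaled by exp(w . x)."""
    exponent = sum(weights[name] * value for name, value in factor_values.items())
    return original_score * math.exp(exponent)


def predicted_distribution(items, weights):
    """Normalize predicted scores so that they sum to 1 across all objects."""
    scores = {
        item_id: predicted_score(score, factors, weights)
        for item_id, (score, factors) in items.items()
    }
    total = sum(scores.values())
    return {item_id: score / total for item_id, score in scores.items()}


# item_id -> (original score, newly added factor values); sample data only
items = {"A": (1.0, {"factor1": 0.0}), "B": (1.0, {"factor1": 1.0})}
initial_weights = {"factor1": -1.0}  # initial value of -1, as in the example above
print(predicted_distribution(items, initial_weights))
```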
  • Step 102-4 Calculate a KL distance value between the actual ranking distribution and the predicted ranking distribution.
  • KL distance is short for Kullback-Leibler distance (Kullback-Leibler divergence), also called relative entropy. It measures how far apart two probability distributions over the same event space are, and is usually defined as the expectation, under P, of the logarithmic difference between the two probability distributions P and Q.
  • The formula (Formula 2) is:
  • $$D(P \parallel Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
  • where P(x) represents the true distribution of the data (i.e., the actual distribution), Q(x) represents the approximate distribution of the data (i.e., the predicted distribution), and D(P‖Q) is the KL distance described in this application.
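  • The standard definition of the KL distance for discrete distributions can be computed directly. In the sketch below, the function name `kl_distance` and the sample distributions are illustrative; terms with P(x) = 0 are skipped, following the usual convention that they contribute nothing.

```python
import math


def kl_distance(actual, predicted):
    """KL distance D(P||Q) between two discrete distributions keyed by object id."""
    total = 0.0
    for key, p in actual.items():
        if p == 0.0:
            continue  # a zero-probability term contributes nothing
        q = predicted[key]
        if q <= 0.0:
            raise ValueError("predicted probability must be positive where actual is positive")
        total += p * math.log(p / q)
    return total


actual = {"A": 0.75, "B": 0.25}
predicted = {"A": 0.5, "B": 0.5}
print(kl_distance(actual, predicted))
```

  • The value is zero when the two distributions coincide and grows as they diverge, which is why it can serve as the quantity to minimize.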
  • Step 102-5 Determine whether the proportional decrease of the KL distance value, relative to the value calculated the previous time, is less than a preset threshold; if so, go to step 103; otherwise, perform step 102-6.
  • Step 102-6 Acquire a predicted ranking distribution represented by a weighting coefficient of the newly added ranking factor.
  • In this step, the weighting coefficients of the newly added ranking factors are treated as unknowns, and the original score data of the objects to be evaluated and the values of the newly added ranking factors are substituted into the ranking calculation model (Formula 1), which gives a ranking score expression for each object to be evaluated. Dividing each expression by the sum of the predicted ranking scores of the objects to be evaluated then gives the predicted ranking distribution expressed in terms of the weighting coefficients of the newly added ranking factors.
  • Step 102-7 Acquire an expression of a KL distance between the actual ranking distribution and the predicted ranking distribution.
  • The predicted ranking distribution expressed in terms of the weighting coefficients of the newly added ranking factors, obtained in step 102-6, and the actual ranking distribution of the objects to be evaluated, obtained in step 102-1, are substituted into Formula 2 above, which gives an expression for the KL distance between the actual ranking distribution and the predicted ranking distribution.
  • In this expression, the weight coefficients of the newly added sorting factors are the variables to be solved.
  • Step 102-8 Taking minimization of the value of the KL distance expression as the optimization target, solve for the values of the weight coefficients of the newly added ranking factors.
  • With minimization of the value of the KL distance expression as the optimization target, the values of the weighting coefficients of the newly added ranking factors may be solved using the stochastic gradient descent algorithm SGD or the logistic regression optimization algorithm L-BFGS.
  • The gradient descent algorithm usually adopts an iterative strategy: starting from an initial point w1, each step moves a certain step size along the negative gradient direction of the objective function f(w) at the current point. As long as the step size is set properly, this yields a monotonically decreasing sequence {f(w1), ..., f(wt), ...}; once the sequence no longer decreases, the optimal solution w* can be obtained.
  • Stochastic gradient descent (SGD) is a simplified form of the gradient descent algorithm; it converges relatively quickly and can help avoid becoming trapped in a local optimum.
  • L-BFGS (limited-memory BFGS, where BFGS combines the initials of four people: Broyden, Fletcher, Goldfarb, and Shanno) is a quasi-Newton optimization algorithm commonly used to train logistic regression models; it can improve the convergence speed of the calculation.
  • The amount of calculation is usually large, so in a specific implementation a distributed computing platform can be used to improve computing efficiency. For example, in the specific example of this embodiment, the Spark computing platform (a memory-based distributed big-data computing platform) is adopted, so that an iterative calculation such as L-BFGS can be completed relatively quickly, which effectively improves the execution efficiency of the method.
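  • To make the optimization in steps 102-6 to 102-8 concrete, the sketch below minimizes the KL distance with plain full-batch gradient descent, a simpler technique than the SGD or L-BFGS named above, swapped in here for brevity. It again assumes the hypothetical model `score = original_score * exp(w . x)` in place of Formula 1; under that assumption the gradient of the KL distance with respect to each weight is the mean factor value under the predicted distribution minus the mean under the actual distribution. All names, the step size, and the sample data are illustrative.

```python
import math


def predicted_dist(scores, factors, w):
    """Predicted distribution under the hypothetical model score * exp(w . x)."""
    raw = [s * math.exp(sum(wj * xj for wj, xj in zip(w, x)))
           for s, x in zip(scores, factors)]
    total = sum(raw)
    return [r / total for r in raw]


def kl(p, q):
    """KL distance D(P||Q) for aligned probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fit_weights(actual, scores, factors, w0, step=0.5, iters=500):
    """Full-batch gradient descent on the KL distance (not SGD or L-BFGS)."""
    w = list(w0)
    for _ in range(iters):
        q = predicted_dist(scores, factors, w)
        # For this model: dKL/dw_j = sum_i (Q_i - P_i) * x_ij
        grad = [sum((qi - pi) * x[j] for pi, qi, x in zip(actual, q, factors))
                for j in range(len(w))]
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w


scores = [1.0, 2.0, 1.5]           # original scores (sample data)
factors = [[0.0], [1.0], [2.0]]    # one newly added factor per object
actual = predicted_dist(scores, factors, [0.5])  # synthetic "actual" distribution
w0 = [-1.0]                        # initial weight of -1
w = fit_weights(actual, scores, factors, w0)
print(kl(actual, predicted_dist(scores, factors, w0)),
      kl(actual, predicted_dist(scores, factors, w)), w)
```

  • Because the synthetic actual distribution is generated from a known weight, the fitted weight can be compared against it; with real behavior data an exact fit is generally not attainable and the KL distance only decreases toward a minimum.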
  • Steps 102-3 to 102-5 mainly determine whether the KL distance value between the predicted sorting distribution and the actual sorting distribution already meets the preset convergence requirement. In a specific implementation this judgment may be omitted, and an optimization calculation of the weight coefficients performed each time the method is executed; the technical solution of the present application can still be implemented this way.
  • Before step 102 solves for the weight coefficients of the newly added ranking factors, it is also possible to first determine whether the number of objects to be evaluated is greater than the predetermined number of objects required to perform the above solving process. If so, the predetermined number of objects are selected in descending order of their original rating data, and these serve as the objects to be evaluated used by the method to solve for the weighting coefficients.
  • In this example, there are 10,000 items to be sorted in total, whereas the solving process of step 102 generally needs the data of about 4,000 items to obtain a relatively satisfactory result. Weighing calculation accuracy against efficiency, 4,000 items are selected to participate in the calculation in descending order of the original rating data of the 10,000 items to be sorted.
  • The weight coefficient values calculated from these 4,000 items are usually representative, so they can also be used to calculate the ranking scores of the other items to be sorted.
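  • The selection just described is a top-N cut by original score. A minimal sketch, with the function name `select_for_fitting` and the sample scores chosen for illustration:

```python
def select_for_fitting(original_scores, required):
    """Pick the `required` objects with the highest original scores, or all if fewer."""
    if len(original_scores) <= required:
        return list(original_scores)
    ranked = sorted(original_scores, key=original_scores.get, reverse=True)
    return ranked[:required]


original_scores = {"A": 3.0, "B": 9.0, "C": 5.0}
print(select_for_fitting(original_scores, 2))
```

  • In the example from the text, `required` would be 4,000 and `original_scores` would hold 10,000 entries.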
  • Step 103 Taking as input the original rating data of the object to be evaluated, the values of the newly added ranking factors, and the calculated values of their weight coefficients, calculate the ranking score of the object to be evaluated using the ranking calculation model.
  • this step can calculate the ranking score of each object to be evaluated by using a preset sorting score calculation model.
  • The calculated item ranking scores may also be provided to other modules or systems responsible for selecting or recommending products, which may rely mainly on the ranking scores, or may also take other factors into account, to complete the selection or recommendation of products.
  • In a specific implementation, the method provided by the present application may be executed repeatedly in a cycle. Each execution not only calculates the ranking scores of the objects to be evaluated for reference by other modules or systems, but also continues to adjust and optimize the weight coefficients of the newly added ranking factors, with minimization of the KL distance between the actual ranking distribution and the predicted ranking distribution as the optimization goal, so that the predicted ranking distribution of the objects to be evaluated comes closer and closer to the actual ranking distribution.
  • In this example, the method is executed once a day: the values of the weight coefficients of the newly added ranking factors are optimized, and the calculated predicted ranking result is put into the corresponding business scenario of the online transaction system.
  • Specific behavior data such as users browsing, clicking, and purchasing goods are stored in the user behavior log, and the actual behavior data corresponding to the specific sorting target extracted from the log can be fed back into the next day's calculation, where it participates in a new round of calculation as the actual sorting distribution. Repeating this process every day forms a closed feedback loop in which the weighting coefficients of the newly added sorting factors are gradually optimized and the predicted ranking moves closer to the actual ranking.
  • When the weighting coefficients no longer need to be optimized, step 102 may be skipped, and each execution directly calculates the ranking score of the object to be evaluated from its original score data, the values of the newly added sorting factors, and the weight coefficient values obtained in the last optimization calculation.
  • In summary, the method provided by the present application, on the basis of the original score data of the objects to be evaluated, takes minimization of the difference between the actual sorting distribution and the predicted sorting distribution as the optimization target and solves for the weight coefficients of the newly added sorting factors. This optimization of the coefficients enables the calculated ranking scores to predict the ranking of the objects to be evaluated relatively objectively and accurately, and closer to the actual ranking result.
  • FIG. 3 is a schematic diagram of an apparatus for calculating a ranking score of an object to be evaluated according to the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment. The device embodiments described below are merely illustrative.
  • An apparatus for calculating a ranking score of an object to be evaluated includes: a data obtaining unit 301, configured to acquire original score data of the object to be evaluated, values of newly added sorting factors, and actual behavior data corresponding to a specific sorting target extracted from the historical behavior data corresponding to each object to be evaluated in an actual interactive behavior system; a weight coefficient calculating unit 302, configured to solve for the weighting coefficients of the newly added sorting factors in the sorting score calculation model, taking as the optimization target minimization of the KL distance between the actual sorting distribution obtained from the actual behavior data and the predicted sorting distribution obtained from the preset sorting score calculation model; and a sorting score calculating unit 303, configured to take as input the original score data of the object to be evaluated, the values of the newly added sorting factors, and the calculated values of their weight coefficients, and to calculate the ranking score of the object to be evaluated using the sorting score calculation model.
  • A sum-of-powers representation is adopted for each newly added ranking factor.
  • the weight coefficient calculation unit includes:
  • An actual sorting distribution obtaining sub-unit configured to obtain an actual ranking distribution of the object to be evaluated by calculating a ratio of the actual behavior data of the object to be evaluated to the sum of the actual behavior data of all objects to be evaluated;
  • a prediction sorting sub-unit for inputting the original score data of the object to be evaluated, the value of the newly added sorting factor, and the current value of the newly added ranking factor weight coefficient, and using the sorting score calculation model Calculating a predicted ranking score of the object to be evaluated;
  • the current value of the newly added ranking factor weighting coefficient refers to a value of the weighting coefficient obtained by using the method last time;
  • a predicted ranking distribution expression obtaining sub-unit, configured to treat the weighting coefficients of the newly added sorting factors as unknowns, substitute the original score data of the object to be evaluated and the values of the newly added sorting factors into the sorting score calculation model, and obtain, from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated, the predicted ranking distribution expressed in terms of the weighting coefficients of the newly added sorting factors;
  • a KL distance expression obtaining subunit, configured to obtain an expression of the KL distance between the actual sorting distribution and the predicted sorting distribution;
  • the weight coefficient solving subunit is configured to minimize the value of the KL distance expression as an optimization target, and solve the value of the weighting coefficient of the newly added sorting factor.
  • the weight coefficient solving sub-unit is specifically configured to solve a weight coefficient of the newly added sorting factor by using a stochastic gradient descent algorithm SGD or a logistic regression optimization algorithm L-BFGS.
  • the weight coefficient calculation unit further includes:
  • a predicted ranking distribution obtaining sub-unit, configured to, after the actual ranking distribution has been acquired and the predicted ranking scores have been calculated, obtain the predicted ranking distribution of the objects to be evaluated by calculating the ratio of each object's predicted ranking score to the sum of the predicted ranking scores of all objects to be evaluated;
  • a KL distance value calculation subunit, configured to calculate the KL distance value between the actual ranking distribution and the predicted ranking distribution output by the predicted ranking distribution obtaining sub-unit;
  • a KL distance value judging subunit, configured to determine whether the proportional decrease of the KL distance value, compared with the KL distance value calculated the last time the method was used, is less than a preset threshold; if so, the weighting coefficient calculating unit and its subunits are no longer triggered in subsequent use.
  • Correspondingly, the sorting score calculating unit is specifically configured to take as input the original scoring data of the object to be evaluated, the values of the newly added sorting factors, and the weight coefficient values calculated the last time.
  • When the predicted ranking score is calculated for the first time, the current value of each newly added ranking factor weight coefficient is set to a preset initial value.
  • the device further includes:
  • an object number judging subunit, configured to determine, before the weight coefficient calculating unit is triggered, whether the number of objects to be evaluated is greater than the predetermined number of objects to be evaluated required to solve for the weight coefficients of the newly added ranking factors;
  • an object selection subunit, configured to, when the output of the object number judging subunit is "Yes", select the predetermined number of objects to be evaluated in descending order of their original scoring data, as the objects to be evaluated used to solve for the weight coefficients of the newly added sorting factors.
  • FIG. 4 is a flowchart of an embodiment of a method for establishing a ranking calculation model according to the present application. Steps of this embodiment that are the same as in the first embodiment are not repeated; the following focuses on the differences.
  • a method for establishing a ranking calculation model provided by the present application includes:
  • Step 401 Obtain original score data of the object to be evaluated, a value of the newly added sorting factor, and actual behavior data corresponding to the specific sorting target extracted from the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system. .
  • The ranking calculation model used in the present application adds terms for the newly added ranking factors to the original rating data, and each newly added ranking factor has a corresponding weight coefficient (for a ranking factor in sum-of-powers representation, a corresponding sequence of weight coefficients).
  • The core of the model building process is solving for the weight coefficients of the newly added sorting factors; once the weight coefficients are determined, the model is established.
  • The method for establishing a ranking calculation model provided by this embodiment takes minimization of the difference value between the predicted ranking distribution and the actual ranking distribution as the optimization target and solves for the values of the weight coefficients of the newly added ranking factors. The weight coefficient values are optimized continuously in a loop-iteration manner, and the values at which the algorithm satisfies the convergence condition are used as the model's final weight coefficient values, which completes the model establishment process.
  • This step obtains the data required for the calculation, including: the original rating data of the objects to be evaluated, the values of the newly added ranking factors, and the actual behavior data corresponding to a specific sort target, extracted from the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system.
  • Step 402 Calculate a KL distance value between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained by using a preset ranking calculation model.
  • the KL distance value between the actual ranking distribution and the predicted ranking distribution is used as a specific value for measuring the difference between the two distributions.
  • calculating a KL distance value between the actual ranking distribution and the predicted ranking distribution includes the following processes:
  • the actual ranking distribution of the object to be evaluated is obtained by calculating a ratio of the actual behavior data of the object to be evaluated to the sum of the actual behavior data of all objects to be evaluated.
  • With the original score data of the object to be evaluated, the values of the newly added sorting factors, and the current values of their weight coefficients as input, the sorting score calculation model is used to calculate the predicted ranking score of the object to be evaluated.
  • The current value of a weight coefficient is the value obtained the last time; when the predicted ranking score is calculated for the first time, the current value is set to a preset initial value.
  • The predicted ranking distribution of the objects to be evaluated is obtained by calculating the ratio of each object's predicted ranking score to the sum of the predicted ranking scores of all objects to be evaluated.
  • Step 403 Determine whether the KL distance value satisfies a preset convergence requirement. If yes, go to step 404. Otherwise, go to step 405.
  • The preset convergence requirement means that, compared with the last calculated KL distance value, the proportional decrease of the KL distance value calculated this time is smaller than a preset threshold. If so, the KL distance value between the predicted ranking distribution and the actual ranking distribution already meets the preset convergence requirement and the weight coefficients of the newly added sorting factors no longer need to be optimized, so the process continues with step 404. If not, the difference between the predicted and actual sorting distributions still needs to be narrowed, that is, the weight coefficients of the newly added sorting factors need further optimization, so the process proceeds to step 405.
  • In other embodiments, other ways of judging convergence may be used. For example, a specific threshold may be preset for the KL distance value to determine whether the algorithm has converged; alternatively, the number of iterative calculations may be counted, and when that number is greater than or equal to a number of calculations set in advance from experience, the algorithm can be considered to have converged.
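  • The convergence test of step 403, combined with an iteration cap, can be sketched as below. The function name `has_converged`, the threshold of 0.01, and the cap of 30 iterations are assumptions for illustration; the application does not give concrete values.

```python
def has_converged(previous_kl, current_kl, iteration,
                  rel_threshold=0.01, max_iterations=30):
    """Return True if the proportional KL decrease is small or the iteration cap is hit."""
    if iteration >= max_iterations:
        return True
    if previous_kl is None:
        return False  # first run: nothing to compare against yet
    if previous_kl == 0.0:
        return True   # the distributions already coincided last time
    # Note: an increase in KL gives a negative value and also counts as converged
    # under a literal reading of the criterion.
    relative_decrease = (previous_kl - current_kl) / previous_kl
    return relative_decrease < rel_threshold


print(has_converged(None, 0.5, 1))
print(has_converged(0.50, 0.40, 2))
print(has_converged(0.400, 0.399, 3))
```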
  • Step 404 End the execution of the method, and the commodity sorting score calculation model is established.
  • Reaching this step means that the KL distance value between the predicted ranking distribution and the actual ranking distribution has met the preset convergence requirement and the weighting coefficients of the newly added ranking factors need no further optimization. The current values of the weighting coefficients are therefore used directly as the corresponding weight coefficient values of the final ranking calculation model; the model is established and execution of the method ends.
  • Step 405 Minimize the difference between the predicted ranking distribution and the actual ranking distribution as an optimization target, and solve the weighting coefficient of the newly added ranking factor in the ranking calculation model.
  • This step includes the following processes: first, the weighting coefficients of the newly added sorting factors are treated as unknowns, the original score data of the objects to be evaluated and the values of the newly added sorting factors are substituted into the sorting score calculation model, and the predicted ranking distribution expression is obtained from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated; then, the expression of the KL distance between the actual ranking distribution and the predicted ranking distribution is obtained; finally, with minimization of the value of the KL distance expression as the optimization target, the values of the weight coefficients of the newly added ranking factors are solved by the stochastic gradient descent algorithm SGD or the logistic regression optimization algorithm L-BFGS.
  • Step 406 Continue to perform the step 401 of obtaining the original rating data, the value of the newly added ranking factor, and the actual behavior data according to a preset time interval.
  • the above steps 401-405 are repeatedly performed once a day, and during the loop, the weight coefficient values of the newly added sorting factor are continuously optimized, and finally the sorting score calculation model is established.
  • In summary, the method for establishing a ranking calculation model takes minimization of the difference between the actual ranking distribution and the predicted ranking distribution as the optimization target, solves for the weighting coefficients of the newly added sorting factors in the ranking calculation model, and repeats this iterative optimization; when the convergence requirement is met, the sorting score calculation model is established.
  • FIG. 5 is a schematic diagram of an apparatus embodiment for establishing a ranking calculation model according to the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment. The device embodiments described below are merely illustrative.
  • An apparatus for establishing a ranking calculation model, comprising: a data acquisition unit 501, configured to acquire original rating data of an object to be evaluated, values of newly added ranking factors, and actual behavior data corresponding to a specific sorting target extracted from the historical behavior data corresponding to each object to be evaluated in the actual interaction behavior system; a distribution difference value calculating unit 502, configured to calculate the KL distance value between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained using the preset sorting score calculation model, where the predicted ranking distribution is calculated with the original rating data, the values of the newly added ranking factors, and the current values of their weight coefficients as input, and the current value of a weight coefficient is the value obtained the last time; a convergence determining unit 503, configured to determine whether the KL distance value satisfies a preset convergence requirement; an end execution unit 504, configured to end each unit of the device when the output of the convergence determination unit is YES
  • The preset convergence requirement used by the convergence determining unit is that, compared with the last calculated KL distance value, the proportional decrease of the KL distance value calculated this time is less than a preset threshold.
  • the distribution difference value calculation unit includes:
  • An actual sorting distribution obtaining sub-unit configured to obtain an actual ranking distribution of the object to be evaluated by calculating a ratio of the actual behavior data of the object to be evaluated to the sum of the actual behavior data of all objects to be evaluated;
  • a prediction sorting sub-unit for inputting the original score data of the object to be evaluated, the value of the newly added sorting factor, and the current value of the newly added ranking factor weight coefficient, and using the sorting score calculation model Calculating a predicted ranking score of the object to be evaluated; when the sub-unit operation is triggered for the first time, setting a current value of the newly added ranking factor weight coefficient to a preset initial value;
  • a prediction order distribution acquisition sub-unit configured to obtain a predicted ranking distribution of the object to be evaluated by calculating a ratio of a prediction ranking of the object to be evaluated to a total of a prediction ranking of all objects to be evaluated;
  • the KL distance value calculation subunit is configured to calculate a KL distance value between the actual ranking distribution and the predicted ranking distribution.
  • the weight coefficient optimization unit includes:
  • a predicted ranking distribution expression obtaining sub-unit, configured to treat the weighting coefficients of the newly added sorting factors as unknowns, substitute the original score data of the object to be evaluated and the values of the newly added sorting factors into the sorting score calculation model, and obtain the predicted ranking distribution expression from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated;
  • a KL distance expression obtaining subunit, configured to obtain an expression of the KL distance between the actual sorting distribution and the predicted sorting distribution;
  • the weight coefficient solving subunit is configured to minimize the value of the KL distance expression as an optimization target, and solve the value of the newly added ranking factor weight coefficient.
  • the weight coefficient solving subunit is specifically configured to solve the value of the newly added ranking factor weight coefficient by using a stochastic gradient descent algorithm SGD or a logistic regression optimization algorithm L-BFGS.
  • In addition, an embodiment of the present application provides a commodity recommendation system that includes a commodity recommendation server. The server communicates with a number of clients, receives a commodity query request sent by a client, obtains a set of candidate commodities available for recommendation that match the keywords in the query request, ranks that set according to commodity ranking scores precomputed with the method provided by the present application for calculating the ranking score of an object to be evaluated, and pushes the ranked commodities to the client that initiated the query request in order from the highest rank to the lowest.
  • If the number of candidate commodities available for recommendation exceeds a preset recommendation quantity, only the preset number of top-ranked commodities may be pushed to the client, in ranked order.
  • The commodity recommendation system can be applied to an online transaction platform to recommend commodities to clients accessing the platform. Because the system precomputes commodity ranking scores with the method provided by the present application for calculating the ranking score of an object to be evaluated, and recommends commodities based on those scores, the ranked commodities recommended to a client in different application scenarios (for example, a big promotion) reflect relatively accurately the actual ranking of the commodities in that scenario. This makes it easier for client users to browse and choose, which can improve their experience and can also increase the sales of the online transaction platform.
  • The commodity recommendation system provided by the present application is not limited to the online transaction platform described above; it can also be implemented in other platforms or applications. Any application that needs to recommend commodities according to ranking scores can use the commodity recommendation system provided by the present application.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • Memory is an example of a computer readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • embodiments of the present application can be provided as a method, system, or computer program product.
  • the present application can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment in combination of software and hardware.
  • Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

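The present application discloses a method and an apparatus for calculating a ranking score of an object to be evaluated, a method and an apparatus for establishing a ranking score calculation model, and a commodity recommendation system. The method for calculating the ranking score of an object to be evaluated includes: obtaining original score data of the object to be evaluated, values of newly added ranking factors, and actual behavior data extracted from the historical behavior data of an actual interactive behavior system; solving for the weight coefficients of the newly added ranking factors in a ranking score calculation model, with the optimization target of minimizing the difference between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained from the preset ranking score calculation model; and calculating the ranking score of the object to be evaluated with the ranking score calculation model. With the method provided by the present application, newly added ranking factors can be introduced quickly and conveniently, and the calculated ranking score can predict the ranking of the object to be evaluated relatively objectively and accurately.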

Description

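Method and apparatus for calculating ranking scores and establishing a model, and commodity recommendation system
Technical Field
The present application relates to ranking technology, and in particular to a method for calculating a ranking score of an object to be evaluated. The present application also provides an apparatus for calculating a ranking score of an object to be evaluated, a method and an apparatus for establishing a ranking score calculation model, and a commodity recommendation system.
Background
With the spread of the Internet and the development of website technology, more and more users browse, select, or purchase the commodities they need online. Many websites therefore use recommendation techniques of various forms to recommend commodities to users. A common practice is to select specific ranking factors, calculate the ranking score of each commodity to be recommended with a preset ranking algorithm, and then select and recommend commodities according to their ranking scores.
The ranking factors used in calculating a ranking score are the factors that affect the final commodity ranking. Attributes related to the commodity can usually be chosen as ranking factors, for example price, sales volume, number of transactions, number of buyers, and hot-search-term count. There are also many algorithms for calculating a ranking score from these factors. Depending on whether the application scenario needs new ranking factors, they can be divided into the following two implementations: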
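1) In everyday ranking scenarios (such as search or Juhuasuan), the ranking factors that affect the final ranking result are relatively stable, and new ranking factors are not usually introduced. A model is therefore usually built, and it is generally fairly complex: it takes into account the possible relationships between each feature (that is, each ranking factor) and the final target, and machine learning is used to determine the weight coefficient of each ranking factor in the model. Adding a ranking factor in such a scenario usually means modifying the established model and solving again for the weight coefficients of all ranking factors in it.
2) In scenarios with many newly added business ranking factors (for example, the leaderboard of a big promotion), many ranking factors affect the final ranking result, and new ranking factors must be introduced on top of the conventional ones according to the characteristics of the business scenario, for example the seller's transaction grade or the monthly transaction amount of the seller's VIP members. A simpler method is therefore usually used to calculate the ranking score: weight coefficients for the original and newly added ranking factors are assigned from expert experience (a weight coefficient reflects the influence of a ranking factor on the final ranking result), and the value of each ranking factor is multiplied by its weight coefficient and the products are summed to give the final ranking score. The formula is as follows, where Y is the final ranking score and w_i is the weight coefficient that expert experience assigns to ranking factor f_i.
Y = w_1 f_1 + ... + w_n f_n
As the description above shows, introducing a new ranking factor in the way described in 1) changes the model, so a large amount of training data must be collected again and a machine learning algorithm must be run again to recompute the weight coefficient of every ranking factor in the new model before commodity ranking scores can be calculated from it; the whole process is fairly complex. Introducing a new ranking factor in the way described in 2) involves considerable manual intervention: the weight coefficients depend entirely on the subjective experience of experts, so the calculated ranking scores may well be inaccurate and may not reflect the actual ranking of the commodities objectively.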
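As a point of comparison for the rest of the description, the expert-weighted baseline of approach 2) can be sketched in a few lines of Python. The factor names and weights below are invented for illustration and do not come from the application.

```python
def weighted_sum_score(factor_values, weights):
    """Y = w_1*f_1 + ... + w_n*f_n with expert-assigned weights."""
    if len(factor_values) != len(weights):
        raise ValueError("each ranking factor needs exactly one weight")
    return sum(w * f for w, f in zip(weights, factor_values))


# Hypothetical factors: normalized sales volume, seller grade, VIP turnover.
score = weighted_sum_score([0.8, 0.5, 0.2], [0.5, 0.3, 0.2])
```

Every weight here is a hand-picked constant, which is exactly the subjectivity the application sets out to remove.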
发明内容
本申请提供一种用于计算待评价客体排序分的方法和装置,以解决现有技术无法方便地引入新的排序因子、以及单纯依赖专家经验设置权重系数导致排序分计算结果不准确的问题。本申请另外提供一种用于建立排序分计算模型的方法和装置,以及一种商品推荐系统。
本申请提供一种用于计算待评价客体排序分的方法,包括:
获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分。
可选的,在所述排序分计算模型中,针对每个新增排序因子采用幂次项求和的表示方式;
相应的,所述新增排序因子的权重系数是指权重系数序列,所述序列中的每个权重系数都与所述新增排序因子的一个幂次项相对应。
可选的,所述针对每个新增排序因子采用幂次项求和的表示方式具体是指,采用四幂次项求和的表示方式。
可选的,当所述交互行为系统为在线交易系统时,所述特定排序目标为:点击数、交易量或者交易金额。
可选的,所述实际排序分布和预测排序分布之间的差异具体是指,所述两个分布之间的KL距离。
可选的,所述以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数,包括:
通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;所述新增排序因子权重系数的当前值是指,采用本方法上一次计算得到的所述权重系数的值;
以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与待评价客体的所述预测排序分的总和,获取以所述新增排序因子的权重系数表示的预测排序分布;
获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值。
可选的,所述以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值是指,采用随机梯度下降算法或者逻辑回归优化算法求解。
可选的,在所述获取所述待评价客体的实际排序分布和所述计算所述待评价客体的预测排序分的步骤后,执行下述操作:
通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
计算所述实际排序分布和所述预测排序分布之间的KL距离值;
判断所述KL距离值与上次采用本方法计算得到的KL距离值相比较,其数值减小的比例是否小于预先设定的阈值;
若是,则在后续使用本方法计算待评价客体排序分的过程中,不再执行所述求解所述排序分计算模型中的新增排序因子的权重系数的步骤;相应的,所述以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分是指,以最近一次计算得到的所述新增排序因子权重系数的值为输入进行求解。
可选的,在第一次执行所述采用所述排序分计算模型计算所述待评价客体的预测排序分的步骤时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值。
可选的,在执行所述求解所述排序分计算模型中的新增排序因子的权重系数的步骤之前,执行下述操作:
判断所述待评价客体的数目是否大于求解新增排序因子权重系数所需待评价客体的预定数量;
若是,按照所述待评价客体的原始评分数据从大到小的顺序,从中选择所述预定数量的待评价客体,作为后续使用本方法求解所述新增排序因子的权重系数所采用的待评价客体。
相应的,本申请还提供一种用于计算待评价客体排序分的装置,包括:
数据获取单元,用于获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
权重系数计算单元,用于以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
排序分计算单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分。
可选的,所述权重系数计算单元和所述排序分计算单元采用的排序分计算模型中,针对每个新增排序因子采用幂次项求和的表示方式。
可选的,所述权重系数计算单元具体用于,以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的KL距离最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数。
可选的,所述权重系数计算单元包括:
实际排序分布获取子单元,用于通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
预测排序分计算子单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;所述新增排序因子权重系数的当前值是指,采用本方法上一次计算得到的所述权重系数的值;
预测排序分布表达式获取子单元,用于以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与待评价客体的所述预测排序分的总和,获取以所述新增排序因子的权重系数表示的预测排序分布;
KL距离表达式获取子单元,用于获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
权重系数求解子单元,用于以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值。
可选的,所述权重系数求解子单元具体用于,采用随机梯度下降算法或者逻辑回归优化算法求解所述新增排序因子的权重系数。
可选的,所述权重系数计算单元还包括:
预测排序分布获取子单元,用于在获取所述待评价客体的实际排序分布和计算所述待评价客体的预测排序分之后,通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
KL距离值计算子单元,用于计算所述实际排序分布和所述预测排序分布获取子单元输出的预测排序分布之间的KL距离值;
KL距离值判断子单元,用于判断所述KL距离值与上次采用本方法计算得 到的KL距离值相比较,其数值减小的比例是否小于预先设定的阈值;若是,则在后续使用本装置计算待评价客体排序分的过程中,不再触发所述权重系数计算单元及其子单元工作,相应的,所述排序分计算单元具体用于以所述待评价客体的原始评分数据、所述新增排序因子的值以及最近一次计算得到的所述新增排序因子权重系数的值为输入进行求解。
可选的,第一次触发所述预测排序分计算子单元工作时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值。
可选的,所述装置还包括:
客体数目判断子单元,用于在触发所述权重系数计算单元工作之前,判断所述待评价客体的数目是否大于求解新增排序因子权重系数所需待评价客体的预定数量;
客体选择子单元,用于当所述客体数目判断子单元的输出为“是”时,按照所述待评价客体的原始评分数据从大到小的顺序,从中选择所述预定数量的待评价客体,作为后续使用本方法求解所述新增排序因子的权重系数所采用的待评价客体。
此外,本申请还提供一种用于建立排序分计算模型的方法,包括:
获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的差异值;所述预测排序分布是以所述原始评分数据、所述新增排序因子的值以及所述新增排序因子权重系数的当前值为输入得到的,所述新增排序因子权重系数的当前值是指,上一次计算得到的权重系数值;
判断所述差异值是否满足预先设定的收敛要求;
若是,结束本方法的执行,所述排序分计算模型建立完毕;
若否,以预测排序分布和所述实际排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
按照预先设定的时间间隔,转到获取所述原始评分数据、所述新增排序因子的值以及所述实际行为数据的步骤继续执行。
可选的,所述实际排序分布和预测排序分布之间的差异具体是指,上述两个分布之间的KL距离;相应的,上述两个分布之间的差异值具体是指,所述KL距离的值。
可选的,所述预先设定的收敛要求是指,本次计算的KL距离值与上次计算得到的KL距离值相比较,其数值减小的比例小于预先设定的阈值。
可选的,所述计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的差异值,包括:
通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;第一次执行本步骤时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值;
通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
计算所述实际排序分布和所述预测排序分布之间的KL距离值。
可选的,所述以预测排序分布和所述实际排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数,包括:
以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与所述待评价客体的所述预测排序分总和,获取所述预测排序分布表达式;
获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子权重系数的值。
可选的,所述以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子权重系数的值是指,采用随机梯度下降算法或者逻辑回归优化算法求解。
相应的,本申请还提供一种用于建立排序分计算模型的装置,包括:
数据获取单元,用于获取待评价客体的原始评分数据、新增排序因子的值、 以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
分布差异值计算单元,用于计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的差异值;所述预测排序分布是以所述原始评分数据、所述新增排序因子的值以及所述新增排序因子权重系数的当前值为输入得到的,所述新增排序因子权重系数的当前值是指,上一次计算得到的权重系数值;
收敛判断单元,用于判断所述差异值是否满足预先设定的收敛要求;
结束执行单元,用于当所述收敛判断单元的输出为“是”,结束本装置各个单元的工作,所述排序分计算模型建立完毕;
权重系数优化单元,用于当所述收敛判断单元的输出为“否”时,以预测排序分布和所述实际排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
循环控制单元,用于按照预先设定的时间间隔,触发上述各个单元工作。
可选的,所述权重系数优化单元进行求解所依据的所述预测排序分布和所述实际排序分布之间的差异是指,上述两个分布之间的KL距离;所述分布差异值计算单元计算的差异值是指,上述两个分布之间的KL距离值。
可选的,所述收敛判断单元进行判断所采用的所述预先设定的收敛要求是指,本次计算的KL距离值与上次计算得到的KL距离值相比较,其数值减小的比例小于预先设定的阈值。
可选的,所述分布差异值计算单元包括:
实际排序分布获取子单元,用于通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
预测排序分计算子单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;第一次触发本子单元工作时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值;
预测排序分布获取子单元,用于通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分 布;
KL距离值计算子单元,用于计算所述实际排序分布和所述预测排序分布之间的KL距离值。
可选的,所述权重系数优化单元包括:
预测排序分布表达式获取子单元,用于以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与所述待评价客体的所述预测排序分总和,获取所述预测排序分布表达式;
KL距离表达式获取子单元,用于获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
权重系数求解子单元,用于以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子权重系数的值。
可选的,所述权重系数求解子单元具体用于,采用随机梯度下降算法或者逻辑回归优化算法求解所述新增排序因子权重系数的值。
此外,本申请还提供一种商品推荐系统,包括:
商品推荐服务器,用于接收客户端的商品查询请求,并向所述客户端推送多个与所述查询请求中的关键词相匹配的商品,所述推送的多个商品是按照权利要求1所述用于计算待评价客体排序分的方法,以预先计算的排序分对可推荐的候选商品进行排序后,推荐的序位处于高位的商品。
与现有技术相比,本申请具有以下优点:
本申请提供的用于计算待评价客体排序分的方法,通过获取待评价客体的原始评分数据、新增排序因子的值、以及用户在实际交互行为系统中对所述待评价客体的实际行为数据,以实际排序分布和预测排序分布之间的差异最小化为优化目标,求解排序分计算模型中的新增排序因子的权重系数,并根据求解结果采用所述排序分计算模型计算所述待评价客体的排序分,从而在快速、方便地引入新增排序因子的同时,通过对所述排序分计算模型中新增排序因子权重系数的优化计算,使得计算得到的排序分能够相对客观、准确地预测待评价客体的排序状况,更加接近实际的排序结果。
本申请提供的用于建立排序分计算模型的方法,以实际排序分布和预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序 因子的权重系数,并重复上述步骤进行迭代优化,当所述差异值满足预先设定的收敛要求时,所述排序分计算模型建立完毕。采用上述方法,不仅能够方便地引入新的排序因子,而且可以比较准确地计算出新增排序因子的权重系数,并建立起所述排序分计算模型,为在新增排序因子的场景下计算待评价客体的排序分提供依据。
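Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of a method for calculating a ranking score of an object to be evaluated according to the present application;
FIG. 2 is a flowchart of the processing provided by the present application for solving for the weight coefficients of newly added ranking factors, with minimizing the KL distance between the actual ranking distribution and the predicted ranking distribution as the optimization target;
FIG. 3 is a schematic diagram of an embodiment of an apparatus for calculating a ranking score of an object to be evaluated according to the present application;
FIG. 4 is a flowchart of an embodiment of a method for establishing a ranking score calculation model according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for establishing a ranking score calculation model according to the present application.
Detailed Description
Many specific details are set out in the following description so that the present application can be fully understood. The present application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from its substance; the present application is therefore not limited by the specific implementations disclosed below.
The present application provides a method and an apparatus for calculating a ranking score of an object to be evaluated, and a method and an apparatus for establishing a ranking score calculation model. Each is described in detail in the embodiments below.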
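Please refer to FIG. 1, which is a flowchart of an embodiment of a method for calculating a ranking score of an object to be evaluated according to the present application. The method includes the following steps:
Step 101: Obtain the original score data of the objects to be evaluated, the values of the newly added ranking factors, and the actual behavior data corresponding to a specific ranking target, extracted from the historical behavior data recorded for each object to be evaluated in an actual interactive behavior system.
The method provided by the present application introduces newly added ranking factors on top of the original score data. Using a preset ranking score calculation model, it solves for the values of the weight coefficients of the newly added ranking factors, with the optimization target of minimizing the difference between the predicted ranking distribution and the actual ranking distribution (that is, maximizing the similarity of the two distributions), and then calculates the ranking score of each object to be evaluated according to the model. With this method, newly added ranking factors are introduced conveniently and naturally, and the calculated ranking scores are relatively accurate, so the ranking of the objects to be evaluated can be predicted fairly objectively and accurately from them.
The original score data is the original ranking score assigned to an object to be evaluated before any newly added ranking factor is introduced. For example, when the method is applied to an online transaction system and the objects to be evaluated are commodities to be ranked, the original ranking score is usually calculated from conventional attributes of those commodities, such as price, sales volume, number of transactions, number of buyers, and hot-search-term count. The original ranking score may be computed by a simple weighted sum or by a relatively complex algorithmic model. Whichever way it is obtained, it can serve as base data in different application scenarios (including general scenarios that need new ranking factors): new ranking factors are introduced on top of it, and only the weight coefficients of the newly added ranking factors need to be solved. The ranking scores of the objects to be evaluated in different application scenarios can then be recalculated without any change to the original algorithm or model. The method provided by the present application therefore makes it easy to combine newly added ranking factors with the original ranking algorithm or model.
The newly added ranking factors may be different elements depending on the application scenario. For example, when an online transaction system organizes certain promotions, the seller's transaction grade, the number of VIP-buyer transactions, and the like may be used as newly added ranking factors. Different newly added ranking factors usually influence the ranking result to different degrees. To reflect this difference, a weight coefficient can be assigned to each newly added ranking factor, and the product of the two is used as one component of the ranking score of the object to be evaluated.
In practice, however, the relationship between a newly added ranking factor and the ranking result is not always simply linear. The simple product term above cannot reflect a non-linear relationship between them, and a ranking score calculated that way is correspondingly inaccurate. To address this, the technical solution of the present application provides a preferred implementation: each newly added ranking factor is represented as a sum of power terms, and each newly added ranking factor corresponds to a sequence of weight coefficients, each of which corresponds to one power term of that factor.
The sum-of-power-terms representation is based on Taylor's theorem: when the derivatives of each order of a function at a point are known, Taylor's formula uses those derivative values as coefficients to construct a polynomial that approximates the function in a neighborhood of that point. The Taylor expansion is: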
$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n(x)$$
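The main advantage of Taylor's formula is that it can fit functions with many kinds of non-linear relationships. Applying this to the technical solution of the present application, the final ranking score S can be regarded as a function f(X) of the newly added ranking factor X; by choosing suitable weight coefficients, a wide range of relationships between X and S can be fitted well.
In theory, Taylor's formula can be expanded to arbitrarily high powers. The higher the power, the more accurate the fit, but the amount of computation also grows sharply. Balancing fitting accuracy against computational cost, a specific implementation can take a = 0 and fit each newly added ranking factor with terms up to the fourth power. This can model fairly complex non-linear relationships while keeping the computation within an acceptable range.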
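To give a feel for how much a fourth-power expansion around a = 0 can capture, the following sketch approximates the exponential function with its degree-4 Taylor polynomial. The choice of exp as the target function is only an illustration and is not taken from the application.

```python
import math


def maclaurin_exp(x, order=4):
    """Taylor polynomial of exp around a = 0, truncated after x**order."""
    return sum(x ** n / math.factorial(n) for n in range(order + 1))


approx = maclaurin_exp(0.5)   # 1 + x + x^2/2! + x^3/3! + x^4/4! at x = 0.5
exact = math.exp(0.5)
error = abs(approx - exact)   # small near a = 0, grows as |x| increases
```

The approximation is good close to the expansion point and degrades further away, which is one reason it can help to keep factor values in a bounded range before fitting.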
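In a specific example of this embodiment, the following ranking score calculation model is preset:
Figure PCTCN2015091216-appb-000002
(Formula 1)
Here S0 is the original ranking score (that is, the original score data), S is the ranking score after the new ranking factors are introduced, X and Y denote newly added ranking factors, and αi and βi denote the coefficients of the sub-terms of each newly added ranking factor. The weight coefficient of a newly added ranking factor is determined by a sequence of coefficients, and the coefficient of each sub-term corresponds to the matching power term of that factor. For example, the coefficient of ranking factor X is determined by α0, α1, α2, α3, and α4, which correspond to the sub-terms X^0, X^1, X^2, X^3, and X^4 respectively.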
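The image of Formula 1 is not available here, so the exact way the power terms combine with S0 is not known. The sketch below assumes an additive form, S = S0 + Σ αi·X^i + Σ βi·Y^i, which matches the later statement that the model adds terms for the new factors on top of the original score; treat it as one plausible reading rather than the application's formula. The function names and sample values are hypothetical.

```python
def power_terms(value, coeffs):
    """Return sum(coeffs[i] * value**i) over the coefficient sequence."""
    return sum(c * value ** i for i, c in enumerate(coeffs))


def ranking_score(s0, factors, coeffs):
    """ASSUMED additive form: S = S0 + sum_i alpha_i*X^i + sum_i beta_i*Y^i.

    factors maps factor name -> value; coeffs maps factor name -> [c0..c4].
    """
    return s0 + sum(power_terms(factors[name], coeffs[name]) for name in coeffs)


# Illustrative coefficients only (alpha for X, beta for Y).
coeffs = {"X": [0.0, 1.0, 0.5, 0.0, 0.0], "Y": [0.0, 2.0, 0.0, 0.0, 0.0]}
score = ranking_score(10.0, {"X": 2.0, "Y": 3.0}, coeffs)
```

Because each factor carries its own coefficient list, adding a third factor only means adding one more dictionary entry; the original score S0 is passed through unchanged.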
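In this step, the data needed to solve for the weight coefficients in the above model and to calculate the ranking scores is obtained, including: the original score data, the values of the newly added ranking factors, and the actual behavior data corresponding to the specific ranking target, extracted from the historical behavior data recorded for each object to be evaluated in the actual interactive behavior system.
As explained above, the original score data is the original ranking score assigned to an object to be evaluated before any newly added ranking factor is introduced. In a specific implementation, this value may be calculated with the original algorithm or calculation model used before the newly added ranking factors were introduced, or obtained from another module or system responsible for calculating it. The specific way of obtaining it is not the core of the present application and is not limited here.
The newly added ranking factors reflect the specific elements that need to be considered when ranking the objects to be evaluated in different application scenarios, and their values also need to be obtained in advance. In the specific example of this embodiment, the method is applied to an online transaction system and commodities are recommended according to the commodity ranking result. In this case the data can be obtained from the operations department responsible for commodity recommendation; it reflects that department's basic approach to commodity recommendation in the current campaign. The obtained newly added ranking factors and their values are organized in the following format: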
(itemId,factor1:value1,factor2:value2,……)
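Here itemId is the identifier of the commodity to be ranked, factor1 and factor2 are newly added ranking factors, and value1 and value2 are their values. Any further newly added ranking factor is simply appended to the end of the record in factor:value format, joined with a specific separator (for example, a half-width comma). The data format given above is illustrative; other implementations may organize the data differently, and the present application does not limit this.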
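A record in this format can be read with a few lines of Python. The sketch assumes that factor values are numeric and that neither names nor values contain commas or colons; the sample record and factor names are invented.

```python
def parse_factor_record(record):
    """Parse '(itemId,factor1:value1,factor2:value2,...)' into (item_id, dict)."""
    body = record.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    fields = [field.strip() for field in body.split(",") if field.strip()]
    item_id, pairs = fields[0], fields[1:]
    factors = {}
    for pair in pairs:
        name, _, value = pair.partition(":")   # split on the first colon only
        factors[name.strip()] = float(value)   # assumption: values are numeric
    return item_id, factors


item_id, factors = parse_factor_record("(item42,seller_grade:3,vip_share:0.25)")
```

Appending another `factor:value` pair to the record needs no change to the parser, which mirrors the extensibility the format is meant to provide.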
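Because the process of solving for the weight coefficients of the newly added ranking factors in step 102 uses the actual ranking distribution of the objects to be evaluated, this step also obtains the historical behavior data corresponding to each object to be evaluated in the actual interactive behavior system and extracts from it the actual behavior data corresponding to the specific ranking target. The actual interactive behavior system is the system in which users interact with the objects to be evaluated. When it is an online transaction system (for example, the Taobao trading platform), the specific ranking target includes the number of clicks, the transaction volume, the transaction amount, and the like.
In the specific example of this embodiment, maximizing the transaction amount is the ranking target. In this step, the transaction amount of each commodity to be ranked within a set period (for example, the past 7 days) is extracted from the log files in which the online transaction system stores users' historical behavior data; this is the actual behavior data corresponding to the specific ranking target.
In a specific implementation, obtaining the original score data, the values of the newly added ranking factors, and the actual data corresponding to the specific ranking target can be handled by a data collection module or data collection system, which prepares the data for the calculations in the subsequent steps 102 and 103.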
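Step 102: Solve for the weight coefficients of the newly added ranking factors in the ranking score calculation model, with the optimization target of minimizing the KL distance between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained from the preset ranking score calculation model.
The results of the method provided by the present application can be used to predict the ranking of the objects to be evaluated, so the method can also be regarded as a leaderboard algorithm. Its ultimate goal is for the predicted ranking obtained from the calculated ranking scores to match the actual ranking as closely as possible; in the example of this embodiment, commodities with larger transaction amounts should rank ahead of those with smaller transaction amounts. To achieve this, the predicted ranking distribution should be as close as possible to the actual ranking distribution.
Following this principle, the technical solution of the present application takes minimizing the difference between the actual ranking distribution and the predicted ranking distribution as the optimization target, solves for the weight coefficients of each newly added ranking factor in the preset ranking score calculation model, and uses the solved weight coefficients to calculate the ranking scores of the objects to be evaluated. This embodiment uses the KL distance to measure how close, or how similar, the two distributions are; other implementations may use other indicators that measure the similarity of distributions.
Specifically, solving for the weight coefficients of the newly added ranking factors includes steps 102-1 to 102-8, which are further described below with reference to FIG. 2.
Step 102-1: Obtain the actual ranking distribution of the objects to be evaluated.
Step 101 has already extracted the actual behavior data corresponding to the specific ranking target from the historical behavior data. In this step, the actual ranking distribution is obtained by calculating, for each object to be evaluated, the ratio of its actual behavior data to the sum of the actual behavior data of all objects to be evaluated.
In the specific example of this embodiment, maximizing the transaction amount is the ranking target and the transaction amount of each commodity to be ranked within the set period has been obtained. This step first calculates the total transaction amount of the commodities to be ranked, then calculates the ratio of each commodity's transaction amount to that total, which yields the actual ranking distribution of the commodities to be ranked.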
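The normalization in step 102-1 amounts to dividing each item's value by the total. A minimal sketch, with invented transaction amounts and a hypothetical function name:

```python
def to_distribution(values_by_item):
    """Normalize non-negative per-item values so that they sum to 1."""
    total = sum(values_by_item.values())
    if total <= 0:
        raise ValueError("the total must be positive to form a distribution")
    return {item: value / total for item, value in values_by_item.items()}


# Hypothetical 7-day transaction amounts per commodity.
amounts = {"item_a": 600.0, "item_b": 300.0, "item_c": 100.0}
actual = to_distribution(amounts)
```

The same helper applies unchanged to predicted ranking scores in step 102-3, provided every predicted score is positive.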
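Step 102-2: Calculate the predicted ranking scores of the objects to be evaluated.
In practice, the method provided by the present application can be executed repeatedly at a certain time interval, and each execution calculates values for the weight coefficients of the newly added ranking factors. Through this cyclic calculation, the solved weight coefficients become increasingly accurate and the predicted ranking distribution obtained from the calculated ranking scores moves closer and closer to the actual ranking result; it is a process of gradual optimization.
In this step, the predicted ranking score of each object to be evaluated is calculated with the ranking score calculation model, taking as input the object's original score data, the values of the newly added ranking factors, and the current values of their weight coefficients. The current value of a weight coefficient is the value calculated the last time this method was executed.
The first time this method is used to calculate the ranking scores of the objects to be evaluated, the weight coefficients of the newly added ranking factors have not yet been solved, so in this step their current values can be set to preset initial values. In the specific example of this embodiment, the initial value of each weight coefficient is set to -1.
Step 102-3: Obtain the predicted ranking distribution of the objects to be evaluated.
Step 102-2 has already produced the predicted ranking score of each object to be evaluated. This step first calculates the sum of the predicted ranking scores of all objects to be evaluated, then obtains the predicted ranking distribution by calculating the ratio of each object's predicted ranking score to that sum.
Step 102-4: Calculate the value of the KL distance between the actual ranking distribution and the predicted ranking distribution.
The KL distance is short for the Kullback-Leibler distance (Kullback-Leibler divergence), also called relative entropy. It measures the difference between two probability distributions over the same event space, and is usually defined as the expectation, taken under P, of the logarithmic difference between the two probability distributions P and Q. It is calculated as: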
$$D(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
(Formula 2)
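Here P(x) denotes the true distribution of the data (the actual distribution), Q(x) denotes the approximate distribution of the data (the predicted distribution), and D(P||Q) is the KL distance referred to in the present application. Its value reflects how much the two probability distributions differ: the smaller the KL distance, the closer the predicted ranking distribution is to the actual ranking distribution, that is, the more accurate the prediction. A KL distance of 0 means that the two probability distributions are identical, that is, P(x) = Q(x).
In this technical solution, the calculations in steps 102-1 to 102-3 have already produced the actual and predicted ranking distributions of the objects to be evaluated, so this step obtains the KL distance between the two distributions directly from Formula 2.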
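Step 102-4 can be sketched as follows, using the standard definition D(P||Q) = Σ P(x)·log(P(x)/Q(x)). The natural logarithm is an assumption, since the application does not state the base; terms with P(x) = 0 are skipped by the usual convention, and a zero in Q where P is positive gives an infinite distance. The sample distributions are invented.

```python
import math


def kl_distance(p, q):
    """D(P||Q) = sum_x P(x) * log(P(x) / Q(x)); terms with P(x) = 0 add 0."""
    total = 0.0
    for item, p_x in p.items():
        if p_x == 0.0:
            continue
        q_x = q[item]
        if q_x <= 0.0:
            return math.inf      # Q gives no mass where P does
        total += p_x * math.log(p_x / q_x)
    return total


actual = {"a": 0.6, "b": 0.3, "c": 0.1}
predicted = {"a": 0.5, "b": 0.3, "c": 0.2}
distance = kl_distance(actual, predicted)
```

Note that the measure is not symmetric: `kl_distance(actual, predicted)` and `kl_distance(predicted, actual)` generally differ, so the actual distribution should consistently be passed as P.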
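Step 102-5: Determine whether the proportion by which the KL distance value has decreased is smaller than a preset threshold; if so, go to step 103; otherwise, perform step 102-6.
That is, compare the KL distance value with the one calculated the last time this method was executed, and determine whether the proportion of the decrease is smaller than the preset threshold.
If so, the KL distance between the predicted and actual ranking distributions already meets the preset convergence requirement, and the difference between the two distributions has essentially stabilized. In this case the weight coefficients of the newly added ranking factors no longer need to be optimized: in subsequent uses of this method, the ranking scores can be calculated directly with the current values of the weight coefficients. The same applies to the current calculation, so execution goes directly to step 103.
If not, the KL distance between the predicted and actual ranking distributions has not yet met the preset convergence requirement, and the difference between the two distributions still needs to be narrowed; that is, the weight coefficients of the newly added ranking factors need further optimization, so execution continues with step 102-6.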
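The check in step 102-5 compares the relative decrease of the KL value against a threshold. In the sketch below the threshold of 0.01 and the handling of the first run (no previous value) are assumptions, not values given in the application.

```python
def has_converged(previous_kl, current_kl, threshold=0.01):
    """True when the relative decrease of the KL value falls below threshold."""
    if previous_kl is None:      # first run: nothing to compare against
        return False
    if previous_kl <= 0.0:       # the distributions already matched
        return True
    decrease_ratio = (previous_kl - current_kl) / previous_kl
    # An increase yields a negative ratio and therefore also counts as
    # "no longer decreasing" under this rule.
    return decrease_ratio < threshold
```

For example, going from 0.5 to 0.4 is a 20% decrease and keeps the optimization running, while going from 0.5 to 0.499 is a 0.2% decrease and stops it.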
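Step 102-6: Obtain the predicted ranking distribution expressed in terms of the weight coefficients of the newly added ranking factors.
Specifically, for each object to be evaluated, the weight coefficients of the newly added ranking factors are treated as unknowns, and the object's original score data and the values of the newly added ranking factors are substituted into the ranking score calculation model (Formula 1), which gives a ranking score expression for each object. Dividing each expression by the sum of the predicted ranking scores of the objects to be evaluated then gives the predicted ranking distribution expressed in terms of the weight coefficients of the newly added ranking factors.
Step 102-7: Obtain the expression of the KL distance between the actual ranking distribution and the predicted ranking distribution.
Substituting the predicted ranking distribution from step 102-6, expressed in terms of the weight coefficients of the newly added ranking factors, and the actual ranking distribution from step 102-1 into Formula 2 gives the expression of the KL distance between the actual and predicted ranking distributions. In this expression, the weight coefficients of the newly added ranking factors are the variables to be solved.
Step 102-8: Solve for the values of the weight coefficients of the newly added ranking factors, with minimizing the value of the KL distance expression as the optimization target.
In this step, with minimizing the value of the KL distance expression as the optimization target, the values of the weight coefficients of the newly added ranking factors can be solved with the stochastic gradient descent algorithm (SGD) or the L-BFGS algorithm.
Gradient descent usually works iteratively: starting from an initial point w1, each iteration moves by a certain step length along the negative gradient of the objective function f(w) at the current point. As long as the step length is set reasonably, this yields a monotonically decreasing sequence {f(w1), ..., f(wt), ...} until the value no longer decreases, at which point the optimal solution w* is obtained. Stochastic gradient descent (SGD) is a simplified form of gradient descent; it usually converges faster and can help avoid getting trapped in a local optimum. L-BFGS (limited-memory BFGS, where BFGS combines the initials of four people's names) is a quasi-Newton optimization algorithm commonly used to train logistic regression models, and it can speed up convergence.
SGD and L-BFGS are mature algorithms in this field, so the specific solution process is not described in detail here.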
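Steps 102-6 to 102-8 can be illustrated with a small pure-Python sketch. It makes several assumptions that are not from the application: a single new factor X, an additive model S_i = S0_i + Σ_k α_k·x_i^k, plain batch gradient descent with step halving in place of SGD or L-BFGS, and zero initial coefficients instead of the -1 used in the embodiment (with this assumed model, -1 would make some toy scores negative). All data values are invented.

```python
import math


def model_scores(s0, xs, alpha):
    """ASSUMED additive model: S_i = S0_i + sum_k alpha[k] * x_i**k."""
    return [s + sum(a * x ** k for k, a in enumerate(alpha))
            for s, x in zip(s0, xs)]


def kl_from_scores(p, s):
    """KL distance between p and the distribution s_i / sum(s)."""
    if min(s) <= 0.0:
        return math.inf          # scores must be positive to normalize
    total = sum(s)
    return sum(pi * math.log(pi * total / si)
               for pi, si in zip(p, s) if pi > 0.0)


def kl_gradient(p, s0, xs, alpha):
    """d KL / d alpha[k] = sum_j x_j**k / sum(S) - sum_i p_i * x_i**k / S_i."""
    s = model_scores(s0, xs, alpha)
    total = sum(s)
    grad = []
    for k in range(len(alpha)):
        powers = [x ** k for x in xs]
        grad.append(sum(powers) / total
                    - sum(pi * xk / si for pi, xk, si in zip(p, powers, s)))
    return grad


def fit_alpha(p, s0, xs, alpha, step=0.5, iterations=200):
    """Gradient descent; the step is halved whenever it fails to lower KL."""
    best = kl_from_scores(p, model_scores(s0, xs, alpha))
    for _ in range(iterations):
        grad = kl_gradient(p, s0, xs, alpha)
        trial = [a - step * g for a, g in zip(alpha, grad)]
        value = kl_from_scores(p, model_scores(s0, xs, trial))
        if value < best:
            alpha, best = trial, value
        else:
            step /= 2.0
    return alpha, best


p = [0.2, 0.4, 0.1, 0.3]        # actual ranking distribution (sums to 1)
s0 = [5.0, 4.0, 3.0, 2.0]       # original scores
xs = [0.1, 0.9, 0.2, 0.8]       # values of the new factor X
before = kl_from_scores(p, s0)  # KL with all coefficients at zero
alpha, after = fit_alpha(p, s0, xs, [0.0] * 5)
```

The gradient formula relies on p summing to 1. Accepting a step only when it lowers the KL value keeps the sequence monotonically decreasing, as the description of gradient descent above requires, and rejects any step that would drive a score to zero or below.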
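Solving for the weight coefficients of the newly added ranking factors in this step is what is commonly called training the model (that is, the ranking score calculation model), and it is usually computationally expensive. A distributed computing platform can be chosen in a specific implementation to improve efficiency. In the specific example of this embodiment, the Spark computing platform (an in-memory distributed computing platform for big data) is used, so iterative algorithms such as L-BFGS can be completed relatively quickly, which improves the execution efficiency of the method.
By performing steps 102-1 to 102-8, the values of the weight coefficients of the newly added ranking factors have now been calculated. Steps 102-3 to 102-5 mainly serve to determine whether the KL distance between the predicted and actual ranking distributions already meets the preset convergence requirement. In a specific implementation this check may be omitted and the weight coefficients optimized every time the method is executed; the technical solution of the present application can be implemented this way as well.
In addition, before step 102 solves for the weight coefficients of the newly added ranking factors, it is possible to first determine whether the number of objects to be evaluated is greater than the predetermined number required for the solution process. If so, the predetermined number of objects can be selected in descending order of their original score data and used as the objects to be evaluated when this method subsequently solves for the weight coefficients.
In the specific example of this embodiment, there are 10,000 commodities to be ranked in total, while the solution process of step 102 usually needs the data of only 4,000 commodities to reach a reasonably satisfactory result. Weighing accuracy against efficiency, 4,000 commodities are selected from the 10,000 in descending order of original score data before this step is executed. The weight coefficient values calculated from these 4,000 commodities are usually representative and can therefore also be used to calculate the ranking scores of the other commodities to be ranked.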
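The pre-selection described above reduces to a sort and a slice. The function name and the toy scores below are hypothetical; in the embodiment the counts would be 10,000 candidates and 4,000 selected.

```python
def select_training_items(original_scores, required_count):
    """Keep the required_count items with the highest original scores."""
    if len(original_scores) <= required_count:
        return list(original_scores)
    ranked = sorted(original_scores, key=original_scores.get, reverse=True)
    return ranked[:required_count]


original_scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
chosen = select_training_items(original_scores, 2)
```

Only the solving step uses the reduced set; the resulting coefficients are then applied to every commodity when the final ranking scores are calculated.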
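Step 103: Calculate the ranking scores of the objects to be evaluated with the ranking score calculation model, taking as input the original score data of the objects to be evaluated, the values of the newly added ranking factors, and the calculated values of the weight coefficients of the newly added ranking factors.
Because the values of the weight coefficients of the newly added ranking factors have been calculated, this step can use the preset ranking score calculation model to calculate the ranking score of each object to be evaluated. In the specific example of this embodiment, the calculated commodity ranking scores can also be provided to other modules or systems responsible for selecting or recommending commodities. These may rely mainly on the commodity ranking scores, or may also take other factors into account, to complete the selection or recommendation.
It should be noted that, in a specific implementation, the method provided by the present application can be executed repeatedly in a loop. Each execution not only calculates the ranking scores of the objects to be evaluated for other modules or systems to use, but also continues to adjust and optimize the weight coefficients of the newly added ranking factors, with minimizing the KL distance between the actual and predicted ranking distributions as the optimization target, so that the predicted ranking distribution reflected by the ranking scores moves closer and closer to the actual ranking distribution.
In the specific example of this embodiment, the method is executed once a day. It optimizes the values of the weight coefficients of the newly added ranking factors and delivers the calculated predicted ranking result to the corresponding business scenario of the online transaction system. There, user behavior such as browsing, clicking, and purchasing is stored in the user behavior log; the actual behavior data corresponding to the specific ranking target, extracted from that log, is fed back into the next day's calculation as the actual ranking distribution for a new round. Repeating this every day forms a closed feedback loop in which the weight coefficients of the newly added ranking factors are gradually optimized and the predicted ranking scores move closer and closer to the actual ranking result.
Once the KL distance between the predicted and actual ranking distributions has met the preset convergence requirement, later uses of this method can skip the optimization in step 102 and simply calculate the ranking scores of the objects to be evaluated from the original score data, the values of the newly added ranking factors, and the weight coefficient values from the last optimization.
The method provided by the present application for calculating the ranking score of an object to be evaluated starts from the original score data of the objects to be evaluated, solves for the weight coefficients of the newly added ranking factors with minimizing the difference between the actual and predicted ranking distributions as the optimization target, and uses the result to calculate the ranking scores with the ranking score calculation model. Newly added ranking factors are thus introduced quickly and conveniently, and because the weight coefficients in the ranking score calculation model are optimized, the calculated ranking scores can predict the ranking of the objects to be evaluated relatively objectively and accurately and come closer to the actual ranking result.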
在上述的实施例中,提供了一种用于计算待评价客体排序分的方法,与之相对应的,本申请还提供一种用于计算待评价客体排序分的装置。请参看图3,其为本申请的一种用于计算待评价客体排序分的装置实施例的示意图。由于装置实施例基本相似于方法实施例,所以描述得比较简单,相关之处参见方法实施例的部分说明即可。下述描述的装置实施例仅仅是示意性的。
本实施例的一种用于计算待评价客体排序分的装置,包括:数据获取单元301,用于获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特 定排序目标的实际行为数据;权重系数计算单元302,用于以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的KL距离最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;排序分计算单元303,用于以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分。
可选的,所述权重系数计算单元和所述排序分计算单元采用的排序分计算模型中,针对每个新增排序因子采用幂次项求和的表示方式。
可选的,所述权重系数计算单元包括:
实际排序分布获取子单元,用于通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
预测排序分计算子单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;所述新增排序因子权重系数的当前值是指,采用本方法上一次计算得到的所述权重系数的值;
预测排序分布表达式获取子单元,用于以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与待评价客体的所述预测排序分的总和,获取以所述新增排序因子的权重系数表示的预测排序分布;
KL距离表达式获取子单元,用于获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
权重系数求解子单元,用于以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值。
可选的,所述权重系数求解子单元具体用于,采用随机梯度下降算法SGD或者逻辑回归优化算法L-BFGS求解所述新增排序因子的权重系数。
可选的,所述权重系数计算单元还包括:
预测排序分布获取子单元,用于在获取所述待评价客体的实际排序分布和计算所述待评价客体的预测排序分之后,通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排 序分布;
KL距离值计算子单元,用于计算所述实际排序分布和所述预测排序分布获取子单元输出的预测排序分布之间的KL距离值;
KL距离值判断子单元,用于判断所述KL距离值与上次采用本方法计算得到的KL距离值相比较,其数值减小的比例是否小于预先设定的阈值;若是,则在后续使用本装置计算待评价客体排序分的过程中,不再触发所述权重系数计算单元及其子单元工作,相应的,所述排序分计算单元具体用于以所述待评价客体的原始评分数据、所述新增排序因子的值以及最近一次计算得到的所述新增排序因子权重系数的值为输入进行求解。
可选的,第一次触发所述预测排序分计算子单元工作时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值。
可选的,所述装置还包括:
客体数目判断子单元,用于在触发所述权重系数计算单元工作之前,判断所述待评价客体的数目是否大于求解新增排序因子权重系数所需待评价客体的预定数量;
客体选择子单元,用于当所述客体数目判断子单元的输出为“是”时,按照所述待评价客体的原始评分数据从大到小的顺序,从中选择所述预定数量的待评价客体,作为后续使用本方法求解所述新增排序因子的权重系数所采用的待评价客体。
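Corresponding to the above method for calculating the ranking score of an object to be evaluated, the present application also provides a method for establishing a ranking score calculation model. Please refer to FIG. 4, which is a flowchart of an embodiment of this method. Parts of this embodiment that are the same as the steps of the first embodiment are not repeated; the description below focuses on the differences.
The method provided by the present application for establishing a ranking score calculation model includes:
Step 401: Obtain the original score data of the objects to be evaluated, the values of the newly added ranking factors, and the actual behavior data corresponding to a specific ranking target, extracted from the historical behavior data recorded for each object to be evaluated in an actual interactive behavior system.
To make it easy to introduce newly added ranking factors, the ranking score calculation model used in the present application adds terms for the newly added ranking factors on top of the original score data, and each newly added ranking factor has a corresponding weight coefficient (a sequence of weight coefficients for factors represented by power terms). The core of establishing the model is solving for the weight coefficients of the newly added ranking factors: once they are determined, the model is established. The method provided by this embodiment solves for those values with the optimization target of minimizing the difference between the predicted and actual ranking distributions, optimizes them continuously through loop iteration, and takes the values at the point where the algorithm meets the convergence condition as the final weight coefficients of the model, which completes the establishment of the model.
This step obtains the data needed for the calculation, including: the original score data of the objects to be evaluated, the values of the newly added ranking factors, and the actual behavior data corresponding to the specific ranking target, extracted from the historical behavior data recorded for each object to be evaluated in the actual interactive behavior system.
Step 402: Calculate the value of the KL distance between the actual ranking distribution obtained from the actual behavior data and the predicted ranking distribution obtained with the preset ranking score calculation model.
In this embodiment, the value of the KL distance between the actual ranking distribution and the predicted ranking distribution is used as the specific measure of the difference between the two distributions.
Specifically, calculating the value of the KL distance between the actual and predicted ranking distributions includes the following:
First, the actual ranking distribution is obtained by calculating, for each object to be evaluated, the ratio of its actual behavior data to the sum of the actual behavior data of all objects to be evaluated.
Second, the predicted ranking score of each object to be evaluated is calculated with the ranking score calculation model, taking as input the object's original score data, the values of the newly added ranking factors, and the current values of their weight coefficients. The current value of a weight coefficient is the value calculated last time; the first time the predicted ranking scores are calculated, the current values are set to preset initial values.
Then, the predicted ranking distribution is obtained by calculating the ratio of each object's predicted ranking score to the sum of the predicted ranking scores of all objects to be evaluated.
Finally, the value of the KL distance between the actual ranking distribution and the predicted ranking distribution is calculated.
Step 403: Determine whether the KL distance value meets the preset convergence requirement; if so, perform step 404; otherwise, perform step 405.
The preset convergence requirement is that, compared with the KL distance value calculated last time, the proportion by which the current KL distance value has decreased is smaller than a preset threshold. If so, the KL distance between the predicted and actual ranking distributions already meets the preset convergence requirement and the weight coefficients of the newly added ranking factors no longer need to be optimized, so execution continues with step 404. If not, the difference between the predicted and actual ranking distributions still needs to be narrowed; that is, the weight coefficients need further optimization, so execution goes to step 405.
Other criteria may also be used in a specific implementation. For example, a specific threshold can be preset to decide whether the algorithm has converged: when the KL distance value calculated in step 402 is greater than the threshold, the preset convergence requirement is not met; otherwise the algorithm is considered to have converged. Alternatively, instead of calculating a specific KL distance value, the number of iterations can be counted, and the algorithm can be considered to have converged when that number is greater than or equal to a count preset from experience. These criteria are all variations of the specific implementation and do not depart from the core of the present application, so they all fall within its scope of protection.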
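Step 404: End the execution of this method; the ranking score calculation model is established.
Reaching this step means that the KL distance between the predicted and actual ranking distributions already meets the preset convergence requirement and the weight coefficients of the newly added ranking factors need no further optimization. The current values of the weight coefficients are therefore used directly as the final values of the corresponding weight coefficients of the ranking score calculation model; the model is established and the method ends.
Step 405: Solve for the weight coefficients of the newly added ranking factors in the ranking score calculation model, with the optimization target of minimizing the difference between the predicted ranking distribution and the actual ranking distribution.
In a specific implementation, this step includes the following. First, the weight coefficients of the newly added ranking factors are treated as unknowns, the original score data of the objects to be evaluated and the values of the newly added ranking factors are substituted into the ranking score calculation model, and the predicted ranking distribution expression is obtained from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated. Then, the expression of the KL distance between the actual and predicted ranking distributions is obtained. Finally, with minimizing the value of the KL distance expression as the optimization target, the values of the weight coefficients are solved with the stochastic gradient descent algorithm (SGD) or the L-BFGS algorithm.
Step 406: At a preset time interval, return to step 401, which obtains the original score data, the values of the newly added ranking factors, and the actual behavior data, and continue execution.
In a specific example of this embodiment, steps 401 to 405 are repeated once a day. During this loop, the weight coefficient values of the newly added ranking factors are continuously optimized, and the ranking score calculation model is eventually established.
In summary, the method provided by the present application for establishing a ranking score calculation model solves for the weight coefficients of the newly added ranking factors in the model, with minimizing the difference between the actual and predicted ranking distributions as the optimization target, and repeats these steps for iterative optimization. When the difference value meets the preset convergence requirement, the ranking score calculation model is established. This method not only makes it easy to introduce new ranking factors, but also calculates their weight coefficients fairly accurately and establishes the ranking score calculation model, which provides the basis for calculating the ranking scores of objects to be evaluated in scenarios with newly added ranking factors.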
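The control flow of steps 401 to 406 can be sketched as a small driver loop. The three callbacks, the threshold, and the round cap are hypothetical placeholders; the toy callbacks at the bottom only exercise the loop and are not a ranking model. Waiting for the preset time interval between rounds is left out.

```python
def build_model(fetch_data, kl_value, optimize, initial_weights,
                threshold=0.01, max_rounds=30):
    """Repeat steps 401 to 406 until the KL value stops improving."""
    weights, previous = initial_weights, None
    for round_no in range(1, max_rounds + 1):
        data = fetch_data()                          # step 401
        current = kl_value(data, weights)            # step 402
        if previous is not None and (                # step 403
                previous <= 0.0
                or (previous - current) / previous < threshold):
            return weights, round_no                 # step 404
        weights = optimize(data, weights)            # step 405
        previous = current                           # step 406 (no waiting here)
    return weights, max_rounds


# Toy stand-ins: one scalar weight whose best value is 3.0.
weights, rounds = build_model(
    fetch_data=lambda: None,
    kl_value=lambda data, w: (w - 3.0) ** 2 + 1.0,
    optimize=lambda data, w: w + (3.0 - w) / 2.0,
    initial_weights=-1.0,
)
```

The `max_rounds` cap corresponds to the alternative criterion mentioned under step 403, where the loop is treated as converged after a preset number of iterations.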
在上述的实施例中,提供了一种用于建立排序分计算模型的方法,与之相对应的,本申请还提供一种用于建立排序分计算模型的装置。请参看图5,其为本申请的一种用于建立排序分计算模型的装置实施例的示意图。由于装置实施例基本相似于方法实施例,所以描述得比较简单,相关之处参见方法实施例的部分说明即可。下述描述的装置实施例仅仅是示意性的。
本实施例的一种用于建立排序分计算模型的装置,包括:数据获取单元501,用于获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;分布差异值计算单元502,用于计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的KL距离值;所述预测排序分布是以所述原始评分数据、所述新增排序因子的值以及所述新增排序因子权重系数的当前值为输入得到的,所述新增排序因子权重系数的当前值是指,上一次计算得到的权重系数值;收敛判断单元503,用于判断所述KL距离值是否满足预先设定的收敛要求;结束执行单元504,用于当所述收敛判断单元的输出为“是”,结束本装置各个单元的工作,所述排序分计算模型建立完毕;权重系数优化单元505,用于当所述收敛判断单元的输出为“否”时,以预测排序分布和所述实际排序分布之间的KL距离值最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;循环控制单元506,用于按照预先设定的时间间隔,触发上述各个单元工作。
可选的,所述收敛判断单元进行判断所采用的所述预先设定的收敛要求是指,本次计算的KL距离值与上次计算得到的KL距离值相比较,其数值减小的比例小于预先设定的阈值。
可选的,所述分布差异值计算单元包括:
实际排序分布获取子单元,用于通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
预测排序分计算子单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;第一次触发本子单元工作时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值;
预测排序分布获取子单元,用于通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
KL距离值计算子单元,用于计算所述实际排序分布和所述预测排序分布之间的KL距离值。
可选的,所述权重系数优化单元包括:
预测排序分布表达式获取子单元,用于以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与所述待评价客体的所述预测排序分总和,获取所述预测排序分布表达式;
KL距离表达式获取子单元,用于获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
权重系数求解子单元,用于以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子权重系数的值。
可选的,所述权重系数求解子单元具体用于,采用随机梯度下降算法SGD或者逻辑回归优化算法L-BFGS求解所述新增排序因子权重系数的值。
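In addition, an embodiment of the present application provides a commodity recommendation system that includes a commodity recommendation server. The server communicates with a number of clients, receives a commodity query request sent by a client, obtains a set of candidate commodities available for recommendation that match the keywords in the query request, ranks that set according to commodity ranking scores precomputed with the method provided by the present application for calculating the ranking score of an object to be evaluated, and pushes the ranked commodities to the client that initiated the query request in order from the highest rank to the lowest.
If the number of candidate commodities available for recommendation exceeds the preset recommendation quantity, only the preset number of top-ranked commodities may be pushed to the client, in ranked order.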
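The server-side ordering described here can be sketched as a sort over precomputed scores followed by an optional cut-off. The function name, the sample identifiers, and the default score of 0.0 for candidates without a precomputed score are assumptions made for illustration.

```python
def recommend(candidates, ranking_scores, limit=None):
    """Order candidate items by precomputed ranking score, highest first."""
    ordered = sorted(candidates,
                     key=lambda item: ranking_scores.get(item, 0.0),
                     reverse=True)
    return ordered if limit is None else ordered[:limit]


# Hypothetical precomputed scores for candidates matching a query keyword.
ranking_scores = {"sku1": 12.5, "sku2": 30.1, "sku3": 7.0}
top_two = recommend(["sku1", "sku2", "sku3"], ranking_scores, limit=2)
```

Because the scores are computed ahead of time, handling a query only involves a lookup and a sort over the matched candidates.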
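The commodity recommendation system can be applied to an online transaction platform to recommend commodities to clients accessing the platform. Because the system precomputes commodity ranking scores with the method provided by the present application for calculating the ranking score of an object to be evaluated, and recommends commodities based on those scores, the ranked commodities recommended to a client in different application scenarios (for example, a big promotion) reflect relatively accurately the actual ranking of the commodities in that scenario. This makes it easier for client users to browse and choose, which can improve their experience and can also increase the sales of the online transaction platform.
Of course, the commodity recommendation system provided by the present application is not limited to the online transaction platform described above; it can also be implemented in other platforms or applications. Any application that needs to recommend commodities according to ranking scores can use the commodity recommendation system provided by the present application.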
本申请虽然以较佳实施例公开如上,但其并不是用来限定本申请,任何本领域技术人员在不脱离本申请的精神和范围内,都可以做出可能的变动和修改,因此本申请的保护范围应当以本申请权利要求所界定的范围为准。
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。
1、计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括非暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
2、本领域技术人员应明白,本申请的实施例可提供为方法、系统或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、 CD-ROM、光学存储器等)上实施的计算机程序产品的形式。

Claims (31)

  1. 一种用于计算待评价客体排序分的方法,其特征在于,包括:
    获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
    以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
    以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分。
  2. 根据权利要求1所述的用于计算待评价客体排序分的方法,其特征在于,在所述排序分计算模型中,针对每个新增排序因子采用幂次项求和的表示方式;
    相应的,所述新增排序因子的权重系数是指权重系数序列,所述序列中的每个权重系数都与所述新增排序因子的一个幂次项相对应。
  3. 根据权利要求2所述的用于计算待评价客体排序分的方法,其特征在于,所述针对每个新增排序因子采用幂次项求和的表示方式具体是指,采用四幂次项求和的表示方式。
  4. 根据权利要求1所述的用于计算待评价客体排序分的方法,其特征在于,当所述交互行为系统为在线交易系统时,所述特定排序目标为:点击数、交易量或者交易金额。
  5. 根据权利要求1-4任一所述的用于计算待评价客体排序分的方法,其特征在于,所述实际排序分布和预测排序分布之间的差异具体是指,所述两个分布之间的KL距离。
  6. 根据权利要求5所述的用于计算待评价客体排序分的方法,其特征在于,所述以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数,包括:
    通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行 为数据总和的比值,获取所述待评价客体的实际排序分布;
    以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;所述新增排序因子权重系数的当前值是指,采用本方法上一次计算得到的所述权重系数的值;
    以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与待评价客体的所述预测排序分的总和,获取以所述新增排序因子的权重系数表示的预测排序分布;
    获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
    以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值。
  7. 根据权利要求6所述的用于计算待评价客体排序分的方法,其特征在于,所述以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值是指,采用随机梯度下降算法或者逻辑回归优化算法求解。
  8. 根据权利要求6所述的用于计算待评价客体排序分的方法,其特征在于,在所述获取所述待评价客体的实际排序分布和所述计算所述待评价客体的预测排序分的步骤后,执行下述操作:
    通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
    计算所述实际排序分布和所述预测排序分布之间的KL距离值;
    判断所述KL距离值与上次采用本方法计算得到的KL距离值相比较,其数值减小的比例是否小于预先设定的阈值;
    若是,则在后续使用本方法计算待评价客体排序分的过程中,不再执行所述求解所述排序分计算模型中的新增排序因子的权重系数的步骤;相应的,所述以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分是指,以最近一次计算得到的所述新增排序因子权重系数的值为输入进行求解。
  9. 根据权利要求6所述的用于计算待评价客体的方法,其特征在于,在第 一次执行所述采用所述排序分计算模型计算所述待评价客体的预测排序分的步骤时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值。
  10. 根据权利要求1所述的用于计算待评价客体排序分的方法,其特征在于,在执行所述求解所述排序分计算模型中的新增排序因子的权重系数的步骤之前,执行下述操作:
    判断所述待评价客体的数目是否大于求解新增排序因子权重系数所需待评价客体的预定数量;
    若是,按照所述待评价客体的原始评分数据从大到小的顺序,从中选择所述预定数量的待评价客体,作为后续使用本方法求解所述新增排序因子的权重系数所采用的待评价客体。
  11. 一种用于计算待评价客体排序分的装置,其特征在于,包括:
    数据获取单元,用于获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
    权重系数计算单元,用于以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
    排序分计算单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值以及计算得到的所述新增排序因子权重系数的值为输入,采用所述排序分计算模型计算所述待评价客体的排序分。
  12. 根据权利要求11所述的用于计算待评价客体排序分的装置,其特征在于,所述权重系数计算单元和所述排序分计算单元采用的排序分计算模型中,针对每个新增排序因子采用幂次项求和的表示方式。
  13. 根据权利要求11-12任一所述的用于计算待评价客体排序分的装置,其特征在于,所述权重系数计算单元具体用于,以根据所述实际行为数据得到的实际排序分布、和根据预先设定的排序分计算模型得到的预测排序分布之间的KL距离最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数。
  14. 根据权利要求13所述的用于计算待评价客体排序分的装置,其特征在于,所述权重系数计算单元包括:
    实际排序分布获取子单元,用于通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
    预测排序分计算子单元,用于以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;所述新增排序因子权重系数的当前值是指,采用本方法上一次计算得到的所述权重系数的值;
    预测排序分布表达式获取子单元,用于以所述新增排序因子的权重系数为未知数,将所述待评价客体的原始评分数据、所述新增排序因子的值代入所述排序分计算模型,并根据得到的表达式与待评价客体的所述预测排序分的总和,获取以所述新增排序因子的权重系数表示的预测排序分布;
    KL距离表达式获取子单元,用于获取所述实际排序分布和所述预测排序分布之间的KL距离的表达式;
    权重系数求解子单元,用于以所述KL距离表达式的值最小化为优化目标,求解所述新增排序因子的权重系数的值。
  15. 根据权利要求14所述的用于计算待评价客体排序分的装置,其特征在于,所述权重系数求解子单元具体用于,采用随机梯度下降算法或者逻辑回归优化算法求解所述新增排序因子的权重系数。
  16. 根据权利要求14所述的用于计算待评价客体排序分的装置,其特征在于,所述权重系数计算单元还包括:
    预测排序分布获取子单元,用于在获取所述待评价客体的实际排序分布和计算所述待评价客体的预测排序分之后,通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
    KL距离值计算子单元,用于计算所述实际排序分布和所述预测排序分布获取子单元输出的预测排序分布之间的KL距离值;
    KL距离值判断子单元,用于判断所述KL距离值与上次采用本方法计算得到的KL距离值相比较,其数值减小的比例是否小于预先设定的阈值;若是,则在后续使用本装置计算待评价客体排序分的过程中,不再触发所述权重系数计算单元及其子单元工作,相应的,所述排序分计算单元具体用于以所述待评价 客体的原始评分数据、所述新增排序因子的值以及最近一次计算得到的所述新增排序因子权重系数的值为输入进行求解。
  17. 根据权利要求14所述的用于计算待评价客体排序分的装置,其特征在于,第一次触发所述预测排序分计算子单元工作时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值。
  18. 根据权利要求11所述的用于计算待评价客体排序分的装置,其特征在于,所述装置还包括:
    客体数目判断子单元,用于在触发所述权重系数计算单元工作之前,判断所述待评价客体的数目是否大于求解新增排序因子权重系数所需待评价客体的预定数量;
    客体选择子单元,用于当所述客体数目判断子单元的输出为“是”时,按照所述待评价客体的原始评分数据从大到小的顺序,从中选择所述预定数量的待评价客体,作为后续使用本方法求解所述新增排序因子的权重系数所采用的待评价客体。
  19. 一种用于建立排序分计算模型的方法,其特征在于,包括:
    获取待评价客体的原始评分数据、新增排序因子的值、以及以实际交互行为系统中对应每个待评价客体的历史行为数据为根据,从中提取的对应特定排序目标的实际行为数据;
    计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的差异值;所述预测排序分布是以所述原始评分数据、所述新增排序因子的值以及所述新增排序因子权重系数的当前值为输入得到的,所述新增排序因子权重系数的当前值是指,上一次计算得到的权重系数值;
    判断所述差异值是否满足预先设定的收敛要求;
    若是,结束本方法的执行,所述排序分计算模型建立完毕;
    若否,以预测排序分布和所述实际排序分布之间的差异最小化为优化目标,求解所述排序分计算模型中的新增排序因子的权重系数;
    按照预先设定的时间间隔,转到获取所述原始评分数据、所述新增排序因子的值以及所述实际行为数据的步骤继续执行。
  20. 根据权利要求19所述的用于建立排序分计算模型的方法,其特征在于, 所述实际排序分布和预测排序分布之间的差异具体是指,上述两个分布之间的KL距离;相应的,上述两个分布之间的差异值具体是指,所述KL距离的值。
  21. 根据权利要求20所述的用于建立排序分计算模型的方法,其特征在于,所述预先设定的收敛要求是指,本次计算的KL距离值与上次计算得到的KL距离值相比较,其数值减小的比例小于预先设定的阈值。
  22. 根据权利要求20-21任一所述的用于建立排序分计算模型的方法,其特征在于,所述计算根据所述实际行为数据得到的实际排序分布、和采用预先设定的排序分计算模型得到的预测排序分布之间的差异值,包括:
    通过计算待评价客体的所述实际行为数据与全部待评价客体的所述实际行为数据总和的比值,获取所述待评价客体的实际排序分布;
    以所述待评价客体的原始评分数据、所述新增排序因子的值和所述新增排序因子权重系数的当前值为输入,采用所述排序分计算模型计算所述待评价客体的预测排序分;第一次执行本步骤时,将所述新增排序因子权重系数的当前值设置为预先设定的初始值;
    通过计算所述待评价客体的预测排序分与全部待评价客体的预测排序分总和的比值,获取所述待评价客体的预测排序分布;
    计算所述实际排序分布和所述预测排序分布之间的KL距离值。
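  23. The method for establishing a ranking score calculation model according to claim 22, wherein solving for the weight coefficient of the newly added sorting factor in the ranking score calculation model, with minimizing the difference between the predicted ranking distribution and the actual ranking distribution as the optimization objective, comprises:
    taking the weight coefficient of the newly added sorting factor as the unknown, substituting the original score data of the objects to be evaluated and the values of the newly added sorting factor into the ranking score calculation model, and acquiring an expression of the predicted ranking distribution from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated;
    acquiring an expression of the KL distance between the actual ranking distribution and the predicted ranking distribution;
    solving for the value of the weight coefficient of the newly added sorting factor, with minimizing the value of the KL distance expression as the optimization objective.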
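  24. The method for establishing a ranking score calculation model according to claim 23, wherein solving for the value of the weight coefficient of the newly added sorting factor, with minimizing the value of the KL distance expression as the optimization objective, means solving by using a stochastic gradient descent algorithm or a logistic regression optimization algorithm.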
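To make the optimization in claims 23 and 24 concrete, here is a minimal sketch under strong assumptions. The claims in this section do not give the form of the ranking score calculation model, so the sketch assumes a hypothetical model `score = original_score * exp(weight * factor_value)`. It also uses plain full-batch gradient descent in place of the stochastic variant named in claim 24, purely for brevity. For this assumed model, the derivative of the KL distance with respect to the weight is the mean of the factor under the predicted distribution minus its mean under the actual distribution.

```python
import math


def predicted_distribution(original_scores, factor_values, weight):
    """Assumed model: score_i = s_i * exp(weight * x_i), then normalize."""
    scores = [s * math.exp(weight * x) for s, x in zip(original_scores, factor_values)]
    total = sum(scores)
    return [v / total for v in scores]


def kl_distance(actual, predicted):
    """KL distance of actual relative to predicted (direction is an assumption)."""
    return sum(p * math.log(p / q) for p, q in zip(actual, predicted) if p > 0.0)


def solve_weight(original_scores, factor_values, actual,
                 initial_weight=0.0, learning_rate=0.5, steps=500):
    """Full-batch gradient descent on the KL distance for the assumed model."""
    weight = initial_weight
    mean_actual = sum(p * x for p, x in zip(actual, factor_values))
    for _ in range(steps):
        predicted = predicted_distribution(original_scores, factor_values, weight)
        mean_predicted = sum(q * x for q, x in zip(predicted, factor_values))
        gradient = mean_predicted - mean_actual   # d(KL)/d(weight) for this model
        weight -= learning_rate * gradient
    return weight


if __name__ == "__main__":
    original = [1.0, 2.0, 3.0]
    factor = [0.0, 1.0, 2.0]
    # Synthetic "actual" distribution generated from a known weight.
    actual = predicted_distribution(original, factor, 0.7)
    print(solve_weight(original, factor, actual))
```

The step size and iteration count are illustrative values chosen for this toy data, not recommendations; a different model form would require a different gradient.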
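  25. An apparatus for establishing a ranking score calculation model, comprising:
    a data acquisition unit, configured to acquire original score data of objects to be evaluated, values of a newly added sorting factor, and actual behavior data corresponding to a specific ranking target, the actual behavior data being extracted from historical behavior data that corresponds to each object to be evaluated in an actual interactive behavior system;
    a distribution difference value calculation unit, configured to calculate a difference value between an actual ranking distribution obtained from the actual behavior data and a predicted ranking distribution obtained by using a preset ranking score calculation model; wherein the predicted ranking distribution is obtained with the original score data, the values of the newly added sorting factor, and the current value of the weight coefficient of the newly added sorting factor as inputs, and the current value of the weight coefficient of the newly added sorting factor refers to the weight coefficient value obtained in the previous calculation;
    a convergence judgment unit, configured to judge whether the difference value satisfies a preset convergence requirement;
    an execution ending unit, configured to end the work of each unit of this apparatus when the output of the convergence judgment unit is "yes", the ranking score calculation model being established;
    a weight coefficient optimization unit, configured to, when the output of the convergence judgment unit is "no", solve for the weight coefficient of the newly added sorting factor in the ranking score calculation model, with minimizing the difference between the predicted ranking distribution and the actual ranking distribution as the optimization objective;
    a loop control unit, configured to trigger the above units at a preset time interval.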
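  26. The apparatus for establishing a ranking score calculation model according to claim 25, wherein the difference between the predicted ranking distribution and the actual ranking distribution on which the weight coefficient optimization unit bases its solving refers to the KL distance between the two distributions; and the difference value calculated by the distribution difference value calculation unit refers to the KL distance value between the two distributions.
  27. The apparatus for establishing a ranking score calculation model according to claim 26, wherein the preset convergence requirement used by the convergence judgment unit is that the proportion by which the KL distance value calculated this time has decreased, compared with the KL distance value calculated the previous time, is less than a preset threshold.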
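  28. The apparatus for establishing a ranking score calculation model according to any one of claims 26 to 27, wherein the distribution difference value calculation unit comprises:
    an actual ranking distribution acquisition subunit, configured to acquire the actual ranking distribution of the objects to be evaluated by calculating the ratio of the actual behavior data of each object to be evaluated to the sum of the actual behavior data of all objects to be evaluated;
    a predicted ranking score calculation subunit, configured to calculate the predicted ranking score of each object to be evaluated by using the ranking score calculation model, with the original score data of the object to be evaluated, the value of the newly added sorting factor, and the current value of the weight coefficient of the newly added sorting factor as inputs; wherein, when this subunit is triggered for the first time, the current value of the weight coefficient of the newly added sorting factor is set to a preset initial value;
    a predicted ranking distribution acquisition subunit, configured to acquire the predicted ranking distribution of the objects to be evaluated by calculating the ratio of the predicted ranking score of each object to be evaluated to the sum of the predicted ranking scores of all objects to be evaluated;
    a KL distance value calculation subunit, configured to calculate the KL distance value between the actual ranking distribution and the predicted ranking distribution.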
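  29. The apparatus for establishing a ranking score calculation model according to claim 28, wherein the weight coefficient optimization unit comprises:
    a predicted ranking distribution expression acquisition subunit, configured to take the weight coefficient of the newly added sorting factor as the unknown, substitute the original score data of the objects to be evaluated and the values of the newly added sorting factor into the ranking score calculation model, and acquire an expression of the predicted ranking distribution from the resulting expression and the sum of the predicted ranking scores of the objects to be evaluated;
    a KL distance expression acquisition subunit, configured to acquire an expression of the KL distance between the actual ranking distribution and the predicted ranking distribution;
    a weight coefficient solving subunit, configured to solve for the value of the weight coefficient of the newly added sorting factor, with minimizing the value of the KL distance expression as the optimization objective.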
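  30. The apparatus for establishing a ranking score calculation model according to claim 29, wherein the weight coefficient solving subunit is specifically configured to solve for the value of the weight coefficient of the newly added sorting factor by using a stochastic gradient descent algorithm or a logistic regression optimization algorithm.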
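  31. A commodity recommendation system, comprising:
    a commodity recommendation server, configured to receive a commodity query request from a client and push to the client a plurality of commodities matching the keywords in the query request, wherein the pushed commodities are the high-ranked commodities recommended after the candidate commodities available for recommendation have been sorted by pre-calculated ranking scores according to the method for calculating a ranking score of an object to be evaluated of claim 1.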
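The server-side behavior in claim 31 can be sketched as a filter followed by a sort on pre-calculated ranking scores. This is only an illustration: the function `recommend`, the dictionary keys `title` and `ranking_score`, and the use of case-insensitive substring matching for "matching the keywords" are all assumptions not stated in the claim.

```python
def recommend(candidates, keyword, top_k):
    """Return the top_k keyword-matching commodities, highest ranking score first.

    candidates: list of dicts with hypothetical keys 'title' and 'ranking_score',
    where 'ranking_score' has been calculated in advance.
    """
    # Assumed matching rule: case-insensitive substring match on the title.
    matched = [c for c in candidates if keyword.lower() in c["title"].lower()]
    # Sort by the pre-calculated ranking score and keep the high-ranked positions.
    matched.sort(key=lambda c: c["ranking_score"], reverse=True)
    return matched[:top_k]


if __name__ == "__main__":
    catalog = [
        {"title": "Red running shoes", "ranking_score": 0.42},
        {"title": "Blue running shoes", "ranking_score": 0.87},
        {"title": "Leather wallet", "ranking_score": 0.95},
    ]
    for item in recommend(catalog, "shoes", 2):
        print(item["title"])
```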
PCT/CN2015/091216 2014-10-15 2015-09-30 Method and apparatus for calculating ranking score and establishing model, and commodity recommendation system WO2016058485A2 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410544767.0 2014-10-15
CN201410544767.0A CN105574025B (zh) 2014-10-15 2014-10-15 Method and apparatus for calculating ranking score and establishing model, and commodity recommendation system

Publications (2)

Publication Number Publication Date
WO2016058485A2 (zh) 2016-04-21
WO2016058485A3 (zh) 2016-06-09

Family

ID=55747496

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091216 WO2016058485A2 (zh) 2014-10-15 2015-09-30 Method and apparatus for calculating ranking score and establishing model, and commodity recommendation system

Country Status (2)

Country Link
CN (1) CN105574025B (zh)
WO (1) WO2016058485A2 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
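CN106127506A (zh) * 2016-06-13 2016-11-16 浙江大学 Recommendation method for solving the commodity cold-start problem based on active learning
CN106503566A (zh) * 2016-10-22 2017-03-15 肇庆市联高电子商务有限公司 Server for e-commerce
CN107330747A (zh) * 2017-05-16 2017-11-07 深圳和而泰智能家居科技有限公司 Beauty device gear recommendation method, beauty device, and storage medium
CN110069699A (zh) * 2018-07-27 2019-07-30 阿里巴巴集团控股有限公司 Ranking model training method and apparatus
CN111652738A (zh) * 2020-04-17 2020-09-11 世纪保众(北京)网络科技有限公司 Insurance product recommendation method based on user behavior weights
CN113112148A (zh) * 2021-04-09 2021-07-13 北京邮电大学 Evaluation method for recommendation system model evaluation results, and electronic device
CN113516504A (zh) * 2021-05-20 2021-10-19 深圳马六甲网络科技有限公司 Commodity recommendation method, apparatus, device, and storage medium
CN113515704A (zh) * 2021-07-22 2021-10-19 中移(杭州)信息技术有限公司 Recommendation effect evaluation method, apparatus, system, and computer program product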

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
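CN106202904A (zh) * 2016-07-05 2016-12-07 广州华多网络科技有限公司 Method and apparatus for scheduling game traffic-diversion data based on channel resource slots
CN107886345B (zh) * 2016-09-30 2021-12-07 阿里巴巴集团控股有限公司 Method and apparatus for selecting data objects
CN106569687A (zh) * 2016-10-19 2017-04-19 北京三快在线科技有限公司 Icon arrangement method, apparatus, and terminal for virtual buttons
CN107169586A (zh) * 2017-03-29 2017-09-15 北京百度网讯科技有限公司 Artificial-intelligence-based resource combination optimization method, apparatus, and storage medium
CN107194190B (zh) * 2017-06-20 2021-07-09 韩晟 Method and apparatus for identifying the influence of service objects on costs in a medical expense database
CN109118029B (zh) * 2017-06-22 2022-02-18 腾讯科技(深圳)有限公司 Object ranking processing method, apparatus, computer device, and storage medium
CN108009885A (zh) * 2017-11-30 2018-05-08 广州云移信息科技有限公司 Commodity information recommendation method and system
CN109300021A (zh) * 2018-11-29 2019-02-01 爱保科技(横琴)有限公司 Insurance recommendation method and apparatus
CN112651839A (zh) * 2021-01-07 2021-04-13 中国农业银行股份有限公司 Product optimization method and system
CN114978414B (zh) * 2021-11-08 2023-06-06 淮阴师范学院 Data transmission method and system based on big data and non-orthogonal multiple access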

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040059592A1 (en) * 2002-07-23 2004-03-25 Rani Yadav-Ranjan System and method of contractor risk assessment scoring system (CRASS) using the internet, and computer software
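CN101206752A (zh) * 2007-12-25 2008-06-25 北京科文书业信息技术有限公司 Related commodity recommendation system for e-commerce websites and method thereof
US8037043B2 (en) * 2008-09-09 2011-10-11 Microsoft Corporation Information retrieval system
CN101431485B (zh) * 2008-12-31 2012-07-04 深圳市迅雷网络技术有限公司 Method and system for automatically recommending information on the Internet
CN102411596A (zh) * 2010-09-21 2012-04-11 阿里巴巴集团控股有限公司 Information recommendation method and system
US20120303412A1 (en) * 2010-11-24 2012-11-29 Oren Etzioni Price and model prediction system and method
CN102073720B (zh) * 2011-01-10 2014-01-22 北京航空航天大学 FR method for optimizing personalized recommendation results
CN102214207A (zh) * 2011-04-27 2011-10-12 百度在线网络技术(北京)有限公司 Method and device for ranking attribute sets in an information entity
CN102629360B (zh) * 2012-03-13 2016-04-20 浙江大学 Effective dynamic commodity recommendation method and commodity recommendation system
CN103577413B (zh) * 2012-07-20 2017-11-17 阿里巴巴集团控股有限公司 Search result ranking method and system, and search result ranking optimization method and system
CN103514239B (zh) * 2012-11-26 2016-12-21 Tcl美国研究所 Recommendation method and system integrating user behavior and item content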

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
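CN106127506A (zh) * 2016-06-13 2016-11-16 浙江大学 Recommendation method for solving the commodity cold-start problem based on active learning
CN106503566A (zh) * 2016-10-22 2017-03-15 肇庆市联高电子商务有限公司 Server for e-commerce
CN107330747A (zh) * 2017-05-16 2017-11-07 深圳和而泰智能家居科技有限公司 Beauty device gear recommendation method, beauty device, and storage medium
CN110069699A (zh) * 2018-07-27 2019-07-30 阿里巴巴集团控股有限公司 Ranking model training method and apparatus
CN110069699B (zh) * 2018-07-27 2022-12-16 创新先进技术有限公司 Ranking model training method and apparatus
CN111652738A (zh) * 2020-04-17 2020-09-11 世纪保众(北京)网络科技有限公司 Insurance product recommendation method based on user behavior weights
CN113112148A (zh) * 2021-04-09 2021-07-13 北京邮电大学 Evaluation method for recommendation system model evaluation results, and electronic device
CN113112148B (zh) * 2021-04-09 2022-08-05 北京邮电大学 Evaluation method for recommendation system model evaluation results, and electronic device
CN113516504A (zh) * 2021-05-20 2021-10-19 深圳马六甲网络科技有限公司 Commodity recommendation method, apparatus, device, and storage medium
CN113515704A (zh) * 2021-07-22 2021-10-19 中移(杭州)信息技术有限公司 Recommendation effect evaluation method, apparatus, system, and computer program product
CN113515704B (zh) * 2021-07-22 2024-05-03 中移(杭州)信息技术有限公司 Recommendation effect evaluation method, apparatus, system, and computer program product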

Also Published As

Publication number Publication date
CN105574025B (zh) 2018-10-16
CN105574025A (zh) 2016-05-11
WO2016058485A3 (zh) 2016-06-09

Similar Documents

Publication Publication Date Title
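WO2016058485A2 (zh) Method and apparatus for calculating ranking score and establishing model, and commodity recommendation system
JP6152173B2 (ja) Ranking of product search results
US11301884B2 (en) Seed population diffusion method, device, information delivery system and storage medium
JP5897019B2 (ja) Method and apparatus for determining a linked list of candidate products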
US9727616B2 (en) Systems and methods for predicting sales of item listings
EP2960849A1 (en) Method and system for recommending an item to a user
US20140278778A1 (en) Method, apparatus, and computer-readable medium for predicting sales volume
US9846885B1 (en) Method and system for comparing commercial entities based on purchase patterns
US10878058B2 (en) Systems and methods for optimizing and simulating webpage ranking and traffic
US20180308152A1 (en) Data Processing Method and Apparatus
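WO2019072128A1 (zh) Object recognition method and system thereof
JP2015511039A (ja) Publication of product information
US20110191169A1 (en) Kalman filter modeling in online advertising bid optimization
CN108446297B (zh) Recommendation method and apparatus, and electronic device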
US20190303980A1 (en) Training and utilizing multi-phase learning models to provide digital content to client devices in a real-time digital bidding environment
US20220122100A1 (en) Product evaluation system and method of use
US20210004379A1 (en) Utilize high performing trained machine learning models for information retrieval in a web store
CN113268656A (zh) User recommendation method and apparatus, electronic device, and computer storage medium
US20210166179A1 (en) Item substitution techniques for assortment optimization and product fulfillment
US11494810B2 (en) Keyword bids determined from sparse data
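CN110910201B (zh) Information recommendation control method and apparatus, computer device, and storage medium
CN113744017A (zh) Recommendation method and apparatus for e-commerce search, device, and storage medium
KR102097045B1 (ko) Method and apparatus for recommending products reflecting user characteristics
US20140279251A1 (en) Search result ranking by brand
CN111178949A (zh) Method, apparatus, device, and storage medium for determining service resource matching reference data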

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 15849862

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 15849862

Country of ref document: EP

Kind code of ref document: A2