WO2015035556A1 - Recommendation method and device - Google Patents

Recommendation method and device

Info

Publication number
WO2015035556A1
WO2015035556A1 PCT/CN2013/083218 CN2013083218W WO2015035556A1 WO 2015035556 A1 WO2015035556 A1 WO 2015035556A1 CN 2013083218 W CN2013083218 W CN 2013083218W WO 2015035556 A1 WO2015035556 A1 WO 2015035556A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
data
parameters
system model
recommendation system
Prior art date
Application number
PCT/CN2013/083218
Other languages
French (fr)
Chinese (zh)
Inventor
张洪波
格卢霍夫瓦列里
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2013/083218 (WO2015035556A1)
Priority to CN201380001312.8A (CN104854580B)
Publication of WO2015035556A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • The present invention relates to the field of information processing, and in particular to a recommendation method and device.
  • Existing information recommendation methods fuse users' explicit feedback information and implicit feedback information in a data model.
  • The optimization problem of minimizing the loss function is solved with the stochastic gradient descent algorithm.
  • The parameters of the data model are solved by serial calculation, and the scores of products that the user has not yet rated are predicted from these parameters.
  • This approach achieves higher recommendation accuracy because it considers both the user's explicit feedback and the user's implicit feedback.
  • The inventors have found that, because it relies on serial computation, the existing information recommendation method is unsuitable for data processing in a massive-data environment: it is extremely inefficient there, which also degrades the quality of the recommendations made to the user.
  • Embodiments of the invention provide a recommendation method and device that improve the recommendation efficiency of the recommendation system in a massive-data environment through parallel computing, and improve its recommendation quality by taking the user's implicit feedback into account.
  • In a first aspect, an embodiment of the present invention provides a recommendation method, including: placing the score data in a score data set into at least two data layers, where each piece of score data corresponds to one user and one product, and the users and the products corresponding to any two pieces of score data in the same data layer are different; calculating, in parallel, the parameters of a recommendation system model within each data layer according to the preset recommendation system model and the score data in that data layer, and using the parameters obtained for each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between each user's predicted score for each product and the average score and the parameters of the recommendation system model; and obtaining each user's predicted score for each product according to the optimal parameters and the recommendation system model, and recommending products to the user according to the predicted scores.
  • The recommendation system model includes a recommendation system model that incorporates implicit feedback, a recommendation system model that does not incorporate implicit feedback, a recommendation system model that considers spatiotemporal characteristics, and a recommendation system model that models asymmetric latent factors.
  • The recommendation system model includes a first recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)$,
  • and a second recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$, where, in both models, $\hat{r}_{ui}$ denotes the score predicted for user u on product i;
  • $\mu$ denotes the average value of all the score data in the score data set; $b_u$ denotes the offset of user u from the average user score; $b_i$ denotes the offset of product i from the average product score; $q_i$ denotes the product factor vector;
  • T denotes the transpose operator;
  • $p_u$ denotes the user factor vector;
  • $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference;
  • $N(u)$ denotes the set of all products for which user u has provided an implicit preference;
  • $y_j$ denotes a factor vector associated with product j that is used to characterize implicit feedback information.
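  • The two score-prediction models defined above can be illustrated with a minimal Python sketch. The function names, the plain-list vector representation, and the `1/sqrt(|N(u)|)` normalisation of the implicit-feedback sum are assumptions made for illustration; the source text only states that the size of N(u) enters the first model.

```python
from math import sqrt


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def predict_second_model(mu, b_u, b_i, q_i, p_u):
    """Model without implicit feedback: mu + b_u + b_i + q_i^T p_u."""
    return mu + b_u + b_i + dot(q_i, p_u)


def predict_first_model(mu, b_u, b_i, q_i, p_u, y_implicit):
    """Model with implicit feedback.

    y_implicit holds the factor vectors y_j for every product j in N(u).
    The 1/sqrt(|N(u)|) scaling is an assumed normalisation.
    """
    z = list(p_u)
    n = len(y_implicit)
    if n > 0:
        scale = 1.0 / sqrt(n)
        for y_j in y_implicit:
            z = [zk + scale * yk for zk, yk in zip(z, y_j)]
    return mu + b_u + b_i + dot(q_i, z)
```

  • With an empty implicit-feedback set the first model reduces to the second, which is a quick sanity check on any implementation.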
  • The method further includes: obtaining a cost function of the recommendation system model according to the mean square error between the predicted scores and the score data and according to the parameters of the recommendation system model, where the cost function includes a first cost function, corresponding to the first recommendation system model, and a second cost function, corresponding to the second recommendation system model.
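  • Calculating the parameters in parallel according to the preset recommendation system model and the score data in the data layers, and using the parameters of each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, includes:
  • A. calculating the average score of all the score data in the score data set;
  • B. calculating the parameters of each data layer in turn by parallel computation, and using the parameters calculated for each data layer as the initial parameter values of the next data layer, where the initial parameter values of the first data layer are set by the system;
  • C. determining, according to the parameters calculated for the last data layer, whether the recommendation system model has converged; if it has, the calculation ends and the optimal parameters are obtained; if not, the parameters calculated for the last data layer are used as the initial parameter values of the first data layer and step B is repeated.
  • Step B includes:
  • B1. obtaining an initial estimate of the score data of the current data layer according to the initial parameter values of that data layer and the recommendation system model, and then obtaining the scoring error of that data layer according to its score data and the initial estimate;
  • B2. obtaining, according to the scoring error, the parameters calculated for that data layer.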
  • Determining, according to the parameters calculated for the last data layer, whether the recommendation system model has converged includes: substituting both the last-layer parameters obtained in the current calculation and the last-layer parameters obtained in the previous calculation into the cost function; if the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current calculation have converged; otherwise, they have not converged.
  • Obtaining the parameters calculated for the data layer according to the scoring error further includes: treating the expression $p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j$ in the first recommendation model as an equivalent parameter and representing it with an auxiliary variable; updating the auxiliary variable according to its gradient, thereby obtaining the equivalent parameter; and obtaining the parameter $q_i$ according to the auxiliary variable. In the update formulas, the update symbol means that the value of the variable on its left is replaced by the value calculated on its right: parameters appearing on the right of the update symbol take their initial values, and parameters appearing on the left take their updated values.
  • In a second aspect, a recommendation device is provided, including: a data placement unit, configured to place the score data in the score data set into at least two data layers, where each piece of score data corresponds to one user and one product, and the users and the products corresponding to any two pieces of score data in the same data layer are different; a parallel computing unit, configured to calculate, in parallel, the parameters of the recommendation system model within each data layer according to the preset recommendation system model and the score data in that data layer, and to use the parameters obtained for each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between each user's predicted score for each product and the average score and the parameters of the recommendation system model; and a prediction and recommendation unit, configured to obtain each user's predicted score for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted scores.
  • The recommendation system model includes a recommendation system model that incorporates implicit feedback, a recommendation system model that does not incorporate implicit feedback, a recommendation system model that considers spatiotemporal characteristics, and a recommendation system model that models asymmetric latent factors.
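  • The recommendation system model includes a first recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)$,
  • and a second recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$, where, in both models, $\hat{r}_{ui}$ denotes the score predicted for user u on product i, $\mu$ denotes the average value of all the score data in the score data set, $b_u$ denotes the offset of user u from the average user score, $b_i$ denotes the offset of product i from the average product score, $q_i$ denotes the product factor vector, T denotes the transpose operator, $p_u$ denotes the user factor vector, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes a factor vector associated with product j that is used to characterize implicit feedback information.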
  • The device further includes a cost function generating unit, configured to obtain a cost function of the recommendation system model according to the mean square error between the predicted scores and the score data and according to the parameters of the recommendation system model, where the cost function includes a first cost function, corresponding to the first recommendation system model, and a second cost function, corresponding to the second recommendation system model.
  • The parallel computing unit includes:
  • an average score calculation subunit, configured to calculate the average score of all the score data in the score data set; a hierarchical calculation subunit, configured to calculate the parameters of each data layer in turn by parallel computation, using the parameters calculated for each data layer as the initial parameter values of the next data layer, where the initial parameter values of the first data layer are set by the system; and a convergence determining subunit, configured to determine, according to the parameters calculated for the last data layer, whether the recommendation system model has converged: if it has, the calculation ends and the optimal parameters are obtained; if not, the parameters calculated for the last data layer are used as the initial parameter values of the first data layer and are passed to the hierarchical calculation subunit for another round of hierarchical calculation.
  • The hierarchical calculation subunit further includes: a scoring error generating module, configured to obtain an initial estimate of the score data of the current data layer according to the initial parameter values of that data layer and the recommendation system model, and to obtain the scoring error of that data layer according to its score data and the initial estimate;
  • a parameter calculation module, configured to obtain, according to the scoring error, the parameters calculated for that data layer; and a calculation control module, configured to use the parameters calculated for that data layer as the initial parameter values of the next data layer and to obtain, through the scoring error generating module and the parameter calculation module, the parameters calculated for the next data layer, until the parameters calculated for the last data layer are obtained.
  • The convergence determining subunit is further configured to substitute both the last-layer parameters obtained in the current calculation and the last-layer parameters obtained in the previous calculation into the cost function; if the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current calculation have converged; otherwise, they have not converged.
  • The parameter calculation module is further configured to treat the expression $p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j$ in the first recommendation model as an equivalent parameter and to represent it with an auxiliary variable; to update the auxiliary variable according to its gradient, thereby obtaining the equivalent parameter; and to obtain the parameter $q_i$ according to the auxiliary variable. In the update formulas, the update symbol means that the value of the variable on its left is replaced by the value calculated on its right: parameters appearing on the right of the update symbol take their initial values, and parameters appearing on the left take their updated values.
  • In a third aspect, a recommendation device is provided, including a processor and a memory, where the processor is configured to: place the score data in the score data set into at least two data layers, where each piece of score data corresponds to one user and one product, and the users and the products corresponding to any two pieces of score data in the same data layer are different; calculate, in parallel, the parameters of the recommendation system model within each data layer according to the preset recommendation system model and the score data in that data layer, and use the parameters obtained for each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between each user's predicted score for each product and the average score and the parameters of the recommendation system model; and obtain each user's predicted score for each product according to the optimal parameters and the recommendation system model, and recommend products to the user according to the predicted scores.
  • The recommendation system model includes a recommendation system model that incorporates implicit feedback, a recommendation system model that does not incorporate implicit feedback, a recommendation system model that considers spatiotemporal characteristics, and a recommendation system model that models asymmetric latent factors.
  • The recommendation system model includes a first recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)$,
  • and a second recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$, where, in both models, $\hat{r}_{ui}$ denotes the score predicted for user u on product i;
  • $\mu$ denotes the average value of all the score data in the score data set; $b_u$ denotes the offset of user u from the average user score; $b_i$ denotes the offset of product i from the average product score; $q_i$ denotes the product factor vector;
  • T denotes the transpose operator;
  • $p_u$ denotes the user factor vector;
  • $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference;
  • $N(u)$ denotes the set of all products for which user u has provided an implicit preference.
  • The processor is further configured to obtain a cost function of the recommendation system model according to the mean square error between the predicted scores and the score data and according to the parameters of the recommendation system model, where the cost function includes a first cost function, corresponding to the first recommendation system model, and a second cost function, corresponding to the second recommendation system model.
  • The processor is configured to:
  • A. calculate the average score of all the score data in the score data set.
  • The processor is further configured to:
  • B1. obtain an initial estimate of the score data of the current data layer according to the initial parameter values of that data layer and the recommendation system model, and then obtain the scoring error of that data layer according to its score data and the initial estimate;
  • B2. obtain, according to the scoring error, the parameters calculated for that data layer.
  • The processor is configured to substitute both the last-layer parameters obtained in the current calculation and the last-layer parameters obtained in the previous calculation into the cost function; if the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current calculation have converged.
  • The processor being configured to obtain, according to the scoring error, the parameters calculated for the data layer further includes the processor treating the expression $p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j$ in the first recommendation model as an equivalent parameter and representing it with an auxiliary variable.
  • In the update formulas, the update symbol means that the value of the variable on its left is replaced by the value calculated on its right.
  • Parameters appearing on the right of the update symbol take their initial values, and parameters appearing on the left take their updated values.
  • FIG. 1 is a schematic flowchart of a recommendation method according to an embodiment of the present invention
  • FIG. 2 is a detailed flowchart of a recommendation method according to an embodiment of the present invention
  • FIG. 3A is a flow chart of a method for placing rating data according to an embodiment of the present invention
  • FIG. 3B is a flow chart of another method for placing rating data according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a solution for an optimal parameter according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of a method for updating a parameter according to an embodiment of the present invention
  • FIG. 5B is another schematic diagram of an embodiment of the present invention.
  • FIG. 6 is a structural diagram of a recommendation device according to an embodiment of the present invention
  • FIG. 8 is a structural diagram of another recommendation device according to an embodiment of the present invention
  • A hardware device diagram of a recommendation device.
  • FIG. 1 is a schematic flowchart of a recommendation method provided by an embodiment of the present invention. The method includes:
  • S101. Place the score data in the score data set into at least two data layers, where each piece of score data corresponds to one user and one product, and the users and the products corresponding to any two pieces of score data in the same data layer are different;
  • S102. Calculate, in parallel, the parameters of the recommendation system model within each data layer according to the preset recommendation system model and the score data in that data layer, and use the parameters obtained for each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between each user's predicted score for each product and the average score and the parameters of the recommendation system model. For example, the recommendation system model may include a model that incorporates implicit feedback, a model that does not incorporate implicit feedback, a model that considers spatiotemporal characteristics, and a model that models asymmetric latent factors. For example, the recommendation system model may also include a first recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)$, and a second recommendation system model, $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$, where:
  • $\hat{r}_{ui}$ denotes the score predicted for user u on product i;
  • $\mu$ denotes the average value of all the score data in the score data set;
  • $b_u$ denotes the offset of user u from the average user score, $b_i$ denotes the offset of product i from the average product score, and $q_i$ denotes the product factor vector;
  • T denotes the transpose operator, and $p_u$ denotes the user factor vector;
  • $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference;
  • $N(u)$ denotes the set of all products for which user u has provided an implicit preference.
  • The cost function of the recommendation system model is obtained from the mean square error between the predicted scores given by the recommendation system model and the score data, together with the parameters of the recommendation system model, where the cost function includes a first cost function built on the squared error $\sum_{u,i}\big[r_{ui} - \mu - b_u - b_i - q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)\big]^2$.
  • Calculating, in parallel, the parameters of the recommendation system model within each data layer, and using the parameters of each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, includes:
  • A. Calculate the average score of all the score data in the score data set;
  • B. Calculate the parameters of each data layer in turn by parallel computation, and use the parameters calculated for each data layer as the initial parameter values of the next data layer, where the initial parameter values of the first data layer are set by the system. Further, step B includes:
  • B1. Obtain an initial estimate of the score data of the current data layer according to the initial parameter values of that data layer and the recommendation system model, and then obtain the scoring error of that data layer according to its score data and the initial estimate;
  • B2. Obtain, according to the scoring error, the parameters calculated for that data layer;
  • B3. Use the parameters calculated for that data layer as the initial parameter values of the next data layer, and obtain the parameters calculated for the next data layer according to steps B1 and B2, until the parameters calculated for the last data layer are obtained.
  • Determining whether the recommendation system model has converged according to the parameters calculated for the last data layer includes: substituting both the last-layer parameters obtained in the current calculation and the last-layer parameters obtained in the previous calculation into the cost function, and computing the difference between the two results. If the difference is not greater than the preset threshold, the last-layer parameters obtained in the current calculation have converged; otherwise, they have not converged.
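  • The convergence test described above compares the cost of two successive parameter sets against a threshold. A minimal sketch, assuming the cost function is supplied as a callable and the parameters are passed as opaque objects (both are illustrative choices, not details from the source):

```python
def has_converged(cost_fn, current_params, previous_params, threshold):
    """Return True when the cost changes by no more than `threshold`
    between the previous and the current last-layer parameters."""
    difference = abs(cost_fn(current_params) - cost_fn(previous_params))
    return difference <= threshold
```

  • If the check fails, the current last-layer parameters become the initial values of the first layer and the layered sweep is repeated.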
  • S103. Obtain each user's predicted score for each product according to the optimal parameters and the recommendation system model, and recommend products to the user according to the predicted scores.
  • This embodiment provides a recommendation method that improves the recommendation efficiency of the recommendation system in a massive-data environment through parallel computing, and improves its recommendation quality by taking user implicit feedback into account.
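  • Step S103 can be sketched as ranking the products a user has not yet rated by their predicted score. The function name, the `top_n` cut-off, and the `predict(user, item)` callable are hypothetical; the source does not specify how many products are recommended or how ties are handled.

```python
def recommend(user, all_items, rated, predict, top_n=3):
    """Return up to `top_n` unrated items for `user`, best prediction first.

    rated   -- set of (user, item) pairs that already have a score
    predict -- callable giving the predicted score for (user, item)
    """
    candidates = [item for item in all_items if (user, item) not in rated]
    candidates.sort(key=lambda item: predict(user, item), reverse=True)
    return candidates[:top_n]
```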
  • FIG. 2 is a detailed flowchart of a recommendation method provided in this embodiment. The method includes:
  • S201. Place the score data in the score data set into at least two data layers. For example, the score data set can be obtained from the users' score data for products, or from the users' browsing and purchase records.
  • In this way, not only the user's explicit feedback on a product but also implicit feedback about the user's preferences can be obtained.
  • The embodiment of the present invention does not impose any limitation on this.
  • In this embodiment, the score data set is obtained from the users' score data for products.
  • A user's score data for a product not only expresses the user's evaluation of the product explicitly, but also, through the very act of rating the product, implicitly feeds back the user's preference for it.
  • The score data set may be represented in matrix form, where different rows of the matrix represent different users and different columns represent different products; each piece of score data therefore corresponds to one user and one product, as those skilled in the art will understand.
  • The data layers into which the score data is placed can likewise be represented in matrix form,
  • so the score data in the score data set can be placed into the corresponding positions of the data layer matrices.
  • All data layer matrices have the same number of rows and the same number of columns as the matrix of the score data set. When the users and the products corresponding to any two pieces of score data in a data layer are different, that is, when no two pieces of score data in a data layer matrix share a row or a column, the score data within the same data layer have no dependency on one another.
  • The score data in the same data layer can therefore be processed in parallel.
  • The specific placement steps are not limited in the embodiments of the present invention: any placement method that ensures no two pieces of score data in a data layer matrix share a row or a column falls within the protection scope of the embodiments of the present invention.
  • A preferred placement method in this embodiment includes:
  • 3A01. Select one piece of score data from the score data set and place it in the first data layer; the manner of selection is not limited in this embodiment, and later selections from the score data set are made in the same manner and are not described again;
  • 3A02. Select the next piece of score data and, starting from the first data layer matrix and proceeding to the data layer matrix of the last layer $l_{max}$, check in turn whether the row and the column of the position corresponding to this score data contain no other score data. If a layer satisfies this condition, place the score data in the first such data layer matrix. If no layer up to the last layer $l_{max}$ satisfies it, update the last layer number to $l_{max} \leftarrow l_{max} + 1$, where the symbol indicates that the quantity on its left is updated to the value on its right (the same below), and place the score data in the new last data layer;
  • 3A03. Repeat step 3A02 for the remaining score data in the score data set until all the score data in the score data set has been placed.
  • An exemplary operation process is as follows. Select a piece of score data $r_{u_1 i_1}$ from the score data set and put it into the first layer; then select a second piece of score data $r_{u_2 i_2}$ and determine whether $u_1$ and $u_2$ are the same user and whether $i_1$ and $i_2$ are the same product.
  • If both the users and the products are different, the second piece of score data is placed in the first layer; otherwise it is placed in the second layer.
  • Next, take out a third piece of score data $r_{u_3 i_3}$. If its user $u_3$ and product $i_3$ are different from all the users and products already placed in the first layer, put it into the first layer; otherwise, compare it with the users and products corresponding to the score data in the second layer.
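  • The first-fit placement walked through above can be sketched in a few lines of Python. The `(user, item, score)` tuple layout and the per-layer sets used to detect row and column conflicts are implementation assumptions; the text itself describes the layers as matrices.

```python
def place_into_layers(ratings):
    """Place each (user, item, score) into the first layer in which
    neither its user (row) nor its item (column) is already present."""
    layers = []       # layers[k] is a list of (user, item, score)
    used_users = []   # used_users[k] is the set of users in layer k
    used_items = []   # used_items[k] is the set of items in layer k
    for user, item, score in ratings:
        for k in range(len(layers)):
            if user not in used_users[k] and item not in used_items[k]:
                break  # first layer without a row/column conflict
        else:
            # No existing layer fits: open a new last layer.
            layers.append([])
            used_users.append(set())
            used_items.append(set())
            k = len(layers) - 1
        layers[k].append((user, item, score))
        used_users[k].add(user)
        used_items[k].add(item)
    return layers
```

  • Because no two entries in a layer share a user or a product, the per-entry parameter updates within one layer touch disjoint parameters, which is what makes them safe to run in parallel.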
  • The resulting data layers may be represented in matrix form.
  • Another preferred placement method in this embodiment includes:
  • 3B02. The score data is compared with the data layers in turn, from the last data layer back to the first data layer, to find the last data layer in that sequence
  • for which the row and the column of the corresponding position in the data layer matrix contain no other score data, and the score data is placed at the corresponding position in that data layer; if no data layer from the last layer to the first layer satisfies this condition, the last layer number is updated to $l_{max} \leftarrow l_{max} + 1$ and the score data is placed in the data layer corresponding to the updated last layer number;
  • 3B03. Repeat step 3B02 for the remaining score data in the score data set until all the score data in the score data set has been placed.
  • An exemplary operation process is as follows. Select a piece of score data $r_{u_1 i_1}$ from the score data set and put it into the first layer; then select a second piece of score data $r_{u_2 i_2}$ and determine whether $u_1$ and $u_2$ are the same user and whether $i_1$ and $i_2$ are the same product.
  • If both the users and the products are different, the score data $r_{u_2 i_2}$ is placed in the first layer; otherwise it is placed in the second layer.
  • The recommendation system model may be a latent factor model that considers the user's implicit feedback, or a latent factor model based only on explicit feedback; it may also be a recommendation system model that considers spatiotemporal characteristics, an asymmetric latent factor model, and so on.
  • In this embodiment, an improved first recommendation system model that considers user implicit feedback, equation (1), and a second recommendation system model that does not consider user implicit feedback, equation (2), are constructed: $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\big)$ (1) and $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$ (2), where:
  • $\hat{r}_{ui}$ denotes the score predicted for user u on product i;
  • $\mu$ denotes the average of all the score data in the score data set;
  • $b_u$ denotes the offset of user u from the average user score, and $b_i$ denotes the offset of product i from the average product score;
  • $q_i$ denotes the product factor vector;
  • T denotes the transpose operator;
  • $p_u$ denotes the user factor vector;
  • $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference;
  • $N(u)$ denotes the set of all products for which user u has provided an implicit preference.
  • The cost function of the recommendation system model is obtained from the mean square error between the predicted scores given by the recommendation system model and the score data, together with the parameters of the recommendation system model. Using the score data in the layered data matrices obtained in step S201, the unknown parameters are obtained by solving the cost function optimization problem associated with the above model: the parameters of the recommendation system model are calculated in parallel within each data layer, and the parameters of each data layer are used as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the optimal parameters are the optimal values of the unknown parameters of the above model.
  • the cost functions related to the first recommendation system model and the second recommendation system model can be expressed as the first cost function of equation (3) and the second cost function of equation (4), respectively; the squared-error term has the form $\sum [\, r_{ui} - \mu - b_u - b_i - q_i^{T}(p_u + \sum_{j \in N(u)} \ldots)\,]^2$
  • the initial values of the unknown parameters of the first data layer (such as $b_u$, $b_i$, $q_i$ and $p_u$) and of the regularization coefficients may be set randomly.
  • alternatively, the initial values of the scalar parameters can be set to 0 and those of the vector parameters to the zero vector $\vec{0}$, where the arrow above the number 0 is the vector symbol. The regularization coefficients, which indicate the degree of regularization, can be set to arbitrary, relatively small positive values; this is not limited by the embodiment of the present invention.
  • starting from the first data layer, the values of the parameters (such as $b_u$, $b_i$, $q_i$ and $p_u$) of each data layer are calculated in parallel in turn, and the calculated parameter values are used as the initial parameter values of the next data layer.
  • the update calculations of the unknown parameters corresponding to different score data in the same layer can be performed in parallel. The updated parameters are used as the initial parameter values of the next data layer, and the parameters corresponding to the score data in that layer are updated in the same way, until the parameters corresponding to the score data in the last data layer have been updated and the parameters calculated by the last data layer are obtained.
  • the update calculation can be performed by the gradient descent method.
  • the symbol $\leftarrow$ is the update symbol: the value of the variable on its left is replaced by the value calculated on its right.
  • the parameters appearing on the right side of the update symbol in formulas (5)-(9) are the initial values of the corresponding parameters, and the parameters appearing on the left side are their updated values; $\gamma_1$ and $\gamma_2$ are the iteration step sizes.
  • an iteration step size can be a constant, or it can depend on the number of iterations, that is, the step size is gradually reduced as the number of iterations increases.
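The layer-by-layer update described above can be sketched as follows. This is only an illustration under stated assumptions: formulas (5)-(9) are not legible in this text, so the sketch uses the model without implicit feedback and the usual regularized gradient step. The names `update_layer`, `sweep`, `gamma1`, `gamma2`, `lam1`, `lam2` and the list-based parameter layout are hypothetical.

```python
from typing import List, Tuple

Rating = Tuple[int, int, float]  # (user index, product index, score); assumed layout


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict(mu, b_u, b_i, p, q, u, i) -> float:
    # Model without implicit feedback: mu + b_u + b_i + q_i^T p_u
    return mu + b_u[u] + b_i[i] + dot(q[i], p[u])


def update_layer(layer: List[Rating], mu, b_u, b_i, p, q,
                 gamma1=0.05, gamma2=0.05, lam1=0.02, lam2=0.02) -> None:
    """Update the parameters touched by one data layer.

    No two ratings in a layer share a user or a product, so each loop
    iteration reads and writes disjoint parameters; the iterations could
    therefore run in parallel. They run sequentially here for clarity.
    """
    for u, i, r in layer:
        e = r - predict(mu, b_u, b_i, p, q, u, i)  # score error e_ui
        b_u[u] += gamma1 * (e - lam1 * b_u[u])
        b_i[i] += gamma1 * (e - lam1 * b_i[i])
        q_old = list(q[i])  # right-hand sides use the values from before the update
        q[i] = [qk + gamma2 * (e * pk - lam2 * qk) for qk, pk in zip(q_old, p[u])]
        p[u] = [pk + gamma2 * (e * qk - lam2 * pk) for pk, qk in zip(p[u], q_old)]


def sweep(layers: List[List[Rating]], mu, b_u, b_i, p, q) -> None:
    # The parameters left by one layer are the initial values for the next layer.
    for layer in layers:
        update_layer(layer, mu, b_u, b_i, p, q)
```

Because the parameter lists are updated in place, the values produced by one layer are automatically the starting point of the next, which mirrors the hand-over of initial values between data layers.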
  • the embodiment of the present invention provides a calculation method for formula (9), as shown in the schematic diagram of FIG. 5A:
  • the embodiment of the present invention provides another equivalent calculation method for the equations (8) and (9), as shown in the schematic diagram of FIG. 5B.
  • the expression consisting of $p_u$ plus the implicit-feedback sum over $N(u)$ is used as an equivalent parameter, and an auxiliary variable $z_u$ is used to represent it.
  • in this case the first recommendation model is equivalent to $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} z_u$, and its parameters are $\mu$, $b_u$, $b_i$, $q_i$, $z_u$;
  • the update process of the parameters that make up this expression can then be replaced by the update process of the auxiliary variable $z_u$ alone.
  • the update of the parameter $q_i$ in formula (8) changes accordingly: the update formula becomes $q_i \leftarrow q_i + \gamma_2(\ldots)$, where the values on the right side are the initial values at the time of calculation in this layer. Introducing the auxiliary variable simplifies the calculation and, because the calculation of the sum is eliminated, removes one inner loop from the solution of the model parameters, which greatly improves the speed of the operation while ensuring the same accuracy.
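The substitution above can be written out as a short worked form. This is a sketch under assumptions, not a reproduction of formulas (8) and (9), which are not legible here: the update rules are the usual regularized gradient steps, and the regularization coefficient $\lambda_2$ is a hypothetical name.

```latex
\begin{aligned}
z_u &:= p_u + \bigl(\text{implicit-feedback term summed over } N(u)\bigr), \\
\hat{r}_{ui} &= \mu + b_u + b_i + q_i^{T} z_u, \qquad e_{ui} = r_{ui} - \hat{r}_{ui}, \\
q_i &\leftarrow q_i + \gamma_2\,(e_{ui}\, z_u - \lambda_2\, q_i), \\
z_u &\leftarrow z_u + \gamma_2\,(e_{ui}\, q_i - \lambda_2\, z_u).
\end{aligned}
```

On the right-hand sides, $q_i$ and $z_u$ are the values held at the start of the calculation for the current layer. Only $z_u$ is stored and updated, so no per-rating loop over $N(u)$ is needed.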
  • the parameters calculated for the current data layer are used as the initial values for the next data layer, and the parameters of the next data layer are calculated according to steps 4021 and 4022, until the parameters calculated by the last data layer are obtained.
  • if the parameters calculated by the last data layer in this calculation have not converged, they are used as the initial parameter values of the first data layer for the next calculation, and the calculation of the parameters of the recommendation system model continues.
  • according to the optimal parameters and the recommendation system model, each user's predicted score for each product is obtained, and products are recommended to the user according to the predicted scores.
  • substituting the obtained optimal parameters into formula (1) or formula (2) gives the predicted score of each user for each product; for a given user, the predicted scores for all products are ranked, and a preset number of products with the highest predicted scores are selected for recommendation to that user.
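The ranking step can be illustrated with a few lines of Python. This is a minimal sketch: `recommend_top_n` and the `predict` callback are hypothetical names, and the callback stands in for formula (1) or (2) evaluated with the optimal parameters.

```python
from typing import Callable, Hashable, Iterable, List


def recommend_top_n(user: Hashable,
                    products: Iterable[Hashable],
                    predict: Callable[[Hashable, Hashable], float],
                    n: int) -> List[Hashable]:
    """Rank all products by the user's predicted score and keep the n highest."""
    ranked = sorted(products, key=lambda product: predict(user, product), reverse=True)
    return ranked[:n]
```

For example, with predicted scores of 3.2, 1.5, 4.8 and 2.0 for products 0 to 3, asking for two recommendations returns products 2 and 0.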
  • This embodiment provides a recommendation method that improves the recommendation efficiency of the recommendation system in a massive data environment through parallel computing, and improves the recommendation effect of the recommendation system by considering the user's implicit feedback.
  • the embodiment of the present invention provides a recommendation device 60.
  • the device includes: a data placement unit 601, configured to place the score data in the score data set into at least two data layers, where each score datum corresponds to one user and one product, and the users and products corresponding to any two score data in each data layer are different;
  • a parallel computing unit 602, configured to calculate the parameters of the recommendation system model in each data layer in parallel according to the preset recommendation system model and the score data in the data layer, and to use the parameters of each data layer as the initial values of the corresponding next data layer;
  • a prediction recommending unit 603, configured to obtain each user's predicted score for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted scores.
  • the score data set can be obtained from users' score data for products, or from users' browsing and purchase records, so that not only the users' explicit feedback on products but also implicit feedback on the users' preferences is obtained.
  • in this embodiment, the score data set can be obtained from the users' score data for products.
  • a user's score for a product not only explicitly expresses the user's evaluation of the product; the act of rating the product also implicitly indicates the user's preference for it.
  • the score data set can be represented as a matrix in which different rows represent different users and different columns represent different products. Because each score datum corresponds one-to-one to a user and a product, those skilled in the art will understand that the data layers in which the score data are placed can also be represented as matrices.
  • the score data in the score data set can therefore be placed into at least two data layer matrices, each with the same number of rows and the same number of columns as the matrix of the score data set. When the users and products of any two score data in a data layer are different, that is, when no two score data in a data layer matrix lie in the same row or in the same column, the score data within that layer do not depend on one another, so they can be processed in parallel.
  • the specific placement steps are not limited by the embodiment of the present invention: any placement method that leaves no two score data in a data layer matrix in the same row or the same column falls within the protection scope of the embodiment of the present invention.
  • the data placement unit 601 selects the next score datum and, going from the first data layer matrix to the data layer matrix of the last layer, checks in turn whether the row and the column of the score datum's position in that data layer matrix contain no other score data. If so, the score datum is placed in the first data layer matrix that satisfies this condition. If no layer up to the last layer $l_{max}$ satisfies it, the number of the last layer is updated as $l_{max} \leftarrow l_{max} + 1$, where the symbol $\leftarrow$ means that the value on its left is updated to the value on its right (the same below), and the score datum is placed in the new last data layer. The data placement unit 601 repeats this placement process for the remaining score data in the score data set until all score data in the score data set have been placed.
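The placement procedure above amounts to a greedy first-fit assignment, sketched below. The function names and the `(user, product, score)` tuple layout are assumptions made for the example; per-layer sets of users and products replace the row and column scan of the data layer matrices.

```python
from typing import List, Set, Tuple

Rating = Tuple[int, int, float]  # (user index, product index, score); assumed layout


def place_in_layers(ratings: List[Rating]) -> List[List[Rating]]:
    """Put each rating into the first layer that contains neither its user nor its product."""
    layers: List[List[Rating]] = []
    users_in_layer: List[Set[int]] = []
    products_in_layer: List[Set[int]] = []
    for user, product, score in ratings:
        for k in range(len(layers)):
            if user not in users_in_layer[k] and product not in products_in_layer[k]:
                break  # first layer whose row and column are both free
        else:
            # No existing layer fits: l_max <- l_max + 1
            layers.append([])
            users_in_layer.append(set())
            products_in_layer.append(set())
            k = len(layers) - 1
        layers[k].append((user, product, score))
        users_in_layer[k].add(user)
        products_in_layer[k].add(product)
    return layers


def is_conflict_free(layer: List[Rating]) -> bool:
    """True if no two ratings in the layer share a user or a product."""
    users = [u for u, _, _ in layer]
    products = [i for _, i, _ in layer]
    return len(set(users)) == len(users) and len(set(products)) == len(products)
```

Every layer returned by `place_in_layers` passes `is_conflict_free`, which is the property that allows the ratings in one layer to be processed in parallel.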
  • scoring data is shown in matrix A: 3 * * 2 * *
  • the data placement unit 601 may, for example, use the placement method shown in FIG. 3A.
  • the specific operation process is as follows: select a score datum $r_{ui}$ from the score data set and put it into the first layer; then select a second score datum $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product. When $u' \neq u$ and $i' \neq i$, the score datum $r_{u'i'}$ is placed in the first layer; otherwise it is placed in the second layer. Then take the third score datum $r_{u''i''}$: if its user and product both differ from the users and products already placed in the first layer, put it into the first layer; otherwise compare it with the users and products of the score data in the second layer.
  • if its user differs from every user $u_2$ and its product differs from every product $i_2$ in the second layer, it is placed in the second layer; otherwise it is placed in the third layer, where $u_2$ denotes a user of the second layer and $i_2$ denotes a product of the second layer. This continues until all the score data of the set have been placed. Five data layers are obtained in turn, from the first data layer to the last, the fifth data layer, as follows:
  • the data placement unit 601 starts from the second score datum selected from the score data set and compares each score datum against the data layers in order, from the data layer of the last layer back to the first data layer. If a data layer has no other score data in the row and the column of the score datum's position, the score datum is placed at the corresponding position of the last data layer found to satisfy this condition. If no layer from the last layer back to the first satisfies it, the number of the last layer is updated as $l_{max} \leftarrow l_{max} + 1$, and the score datum is placed in the data layer corresponding to the updated last layer. The data placement unit 601 repeats this placement process for the remaining score data in the score data set until all score data in the score data set have been placed.
  • the exemplary operation process is as follows: select a score datum $r_{ui}$ from the score data set and put it into the first layer; then select the second score datum $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product.
  • when both differ, the score datum $r_{u'i'}$ is placed in the first layer; otherwise it is placed in the second layer.
  • the seventh layer of data layers are:
  • the preset recommendation system model may be a latent factor model that considers user implicit feedback, or a latent factor model based only on explicit feedback; it may also be a recommendation system model that considers spatiotemporal characteristics or an asymmetric latent factor model. The embodiment of the present invention does not limit this.
  • an improved first recommendation system model that considers user implicit feedback, as in equation (1), and a second recommendation system model that does not consider user implicit feedback, as in equation (2), are constructed;
  • in equations (1) and (2), $\hat{r}_{ui}$ is the predicted score of user u for product i, $\mu$ is the average of all the score data in the score data set, and $b_u$ indicates the deviation of user u from the average user score.
  • the recommendation device 60 may further include a cost function generating unit 604, configured to obtain the cost function of the recommendation system model from the relationship between the mean square error of the predicted values against the score data and the parameters of the recommendation system model.
  • using this cost function and the score data of the layered data matrices obtained from the data placement unit 601, the unknown parameters of the model are obtained by solving the cost function optimization problem: the parameters of the recommendation system model are calculated in parallel within each data layer, and the parameters of each data layer are used as the initial values of the next data layer until the optimal parameters are obtained, where the optimal parameters are the optimal values of the unknown parameters of the model.
  • specifically, in this embodiment, the cost functions related to the first recommendation system model and the second recommendation system model can be expressed as equation (3) and equation (4), respectively.
  • the parallel computing unit 602 solves the optimization problems of equations (3) and (4) according to the above models; the specific solution processes are similar and are not described again.
  • the parallel computing unit 602 may include:
  • an average score calculation sub-unit 6021, configured to calculate the average of all the score data in the score data set; and a hierarchical calculation sub-unit 6022, configured to calculate the parameters of each data layer in parallel and to use the parameters calculated for each data layer as the initial parameter values of the next data layer.
  • for example, in this embodiment, the initial values of the unknown parameters of the first data layer (such as $b_u$, $b_i$, $q_i$ and $p_u$) and of the regularization coefficients may be set randomly before the first calculation.
  • alternatively, the initial values of the scalar parameters can be set to 0 and those of the vector parameters to the zero vector $\vec{0}$, where the arrow above the number 0 is the vector symbol. The regularization coefficients, which indicate the degree of regularization, can be set to arbitrary, relatively small positive values; this is not limited by the embodiment of the present invention.
  • the hierarchical calculation sub-unit 6022 then calculates the values of the parameters of each data layer.
  • the hierarchical calculation sub-unit 6022 updates the unknown parameters (such as $b_u$, $b_i$, $q_i$ and $p_u$) corresponding to the score data in the first data layer, starting from the initial parameter values of the first data layer. Because the score data of the same layer matrix do not depend on one another, the update calculations of the unknown parameters corresponding to different score data of the same layer matrix can be performed in parallel. The updated parameters are used as the initial parameter values of the next data layer, whose parameters are updated in the same way, until the parameters corresponding to the score data in the last data layer have been updated and the parameters calculated by the last data layer are obtained.
  • the update calculation can be performed by the gradient descent method. Because the parameters corresponding to all score data are updated in the same way, those skilled in the art, once the calculation steps for the parameters corresponding to one score datum are understood, can apply the same steps to the other score data without creative effort.
  • the calculation steps of the hierarchical calculation sub-unit 6022 for formula (1) can be as follows. First, the initial estimate $\hat{r}_{ui}$ of the score datum $r_{ui}$ is obtained from the initial parameter values of the data layer and the recommendation system model of formula (1), and the score error $e_{ui}$ of the layer is obtained from the score data of the layer and the initial estimate $\hat{r}_{ui}$. Next, the updated parameter values in the negative gradient direction are obtained from the score error $e_{ui}$ and formulas (5)-(9).
  • as those skilled in the art will understand, the parameters of formulas (5)-(8) can be updated according to their corresponding calculation formulas; the calculation process and the meanings of the symbols and parameters are not repeated here. Correspondingly, solving the optimization problem of equation (4) only requires the update calculations of formulas (5)-(8), which yield the parameters $b_u$, $b_i$, $q_i$ and $p_u$;
  • the embodiment of the present invention provides another calculation method for equations (8) and (9).
  • the specific calculation method is as follows: the expression consisting of $p_u$ plus the implicit-feedback sum over $N(u)$ is represented by an auxiliary variable $z_u$, so that the first recommendation model is equivalent to $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} z_u$, whose parameters are $\mu$, $b_u$, $b_i$, $q_i$, $z_u$;
  • the update process of the parameters that make up this expression can then be replaced by the update process of the auxiliary variable $z_u$ alone. Furthermore, because of the auxiliary variable, the update of the parameter $q_i$ in equation (8) changes accordingly: the update formula becomes $q_i \leftarrow q_i + \gamma_2(\ldots)$, where the values on the right side are the initial values at the time of calculation in this layer.
  • a convergence determining sub-unit 6023, configured to determine whether the recommendation system model has converged according to the parameters calculated by the last data layer.
  • if it has converged, the calculation ends and the optimal parameters are obtained; if not, the parameters calculated by the last data layer are used as the initial parameter values of the first data layer, and the next hierarchical calculation continues through the hierarchical calculation sub-unit 6022.
  • the convergence determining sub-unit 6023 can substitute the parameters calculated by the last data layer in the current calculation and the parameters calculated by the last data layer in the previous calculation into the cost function of formula (3) or formula (4). If the difference between the two results is not greater than a preset threshold, the parameters calculated by the last data layer in the current calculation can be considered the optimal parameters.
  • if the difference is greater than the preset threshold, the parameters calculated by the last data layer in the current calculation have not converged; they are used as the initial parameter values of the first data layer for the next hierarchical calculation by the hierarchical calculation sub-unit 6022, and the calculation of the parameters of the recommendation system model continues.
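The convergence test described above can be sketched as a small driver loop. The names `iterate_until_converged`, `run_sweep`, `cost`, `threshold` and `max_sweeps` are hypothetical; `cost` stands in for the cost function of formula (3) or (4). The sweep cap is an addition for safety, and before the first sweep the cost of the initial parameters is used as the previous value.

```python
from typing import Callable


def iterate_until_converged(run_sweep: Callable[[], None],
                            cost: Callable[[], float],
                            threshold: float = 1e-4,
                            max_sweeps: int = 1000) -> int:
    """Repeat full passes over all data layers until the cost stops changing.

    run_sweep performs one pass from the first data layer to the last;
    cost evaluates the cost function at the current parameters.
    Returns the number of passes performed.
    """
    previous = cost()
    for sweep_count in range(1, max_sweeps + 1):
        run_sweep()
        current = cost()
        if abs(previous - current) <= threshold:
            return sweep_count  # change is within the preset threshold: converged
        previous = current  # not converged: the next pass starts from these parameters
    return max_sweeps
```

The parameters are carried implicitly by `run_sweep`, so a pass that has not converged simply hands its last-layer parameters to the first layer of the next pass.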
  • the prediction recommending unit 603 substitutes the obtained optimal parameters into formula (1) or formula (2) to obtain the predicted score of each user for each product; for a given user, the predicted scores for all products are ranked, and a preset number of products with the highest predicted scores are selected for recommendation to that user.
  • the recommendation device 60 may further include a feedback unit 605, configured to input, after products have been recommended, the new score data given by users for the products into the recommendation system, so that the recommendation system can update the parameters of the recommendation system model in real time and maintain high-precision, high-efficiency recommendation.
  • the present embodiment provides a recommendation device 60, which improves the recommendation efficiency of the recommendation system in a massive data environment through parallel computing, and improves the recommendation effect of the recommendation system by considering user implicit feedback.
  • the present embodiment provides a recommendation device 60, as shown in FIG. 8, comprising: at least one processor 801, a memory 802, and at least one communication bus 803 for connecting these components and enabling communication among them, wherein
  • the communication bus 803 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus 803 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 8, but it does not mean that there is only one bus or one type of bus.
  • the memory 802 is used to store executable program code and processing results of the processor 801, the program code including computer operating instructions.
  • the memory 802 may include high-speed RAM and may also include non-volatile memory, for example at least one disk storage.
  • the processor 801 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the processor 801 is configured to execute the executable program code stored in the memory 802, such as a computer program, so as to run the program corresponding to the executable code.
  • the processor 801 is configured to: place the score data in the score data set into at least two data layers, where each score datum corresponds to one user and one product, and the users and products corresponding to any two score data in each data layer are different;
  • the recommendation system model is a correspondence between each user's predicted score for each product, the average score, and the parameters of the recommendation system model; and, according to the optimal parameters and the recommendation system model, each user's predicted score for each product is obtained, and products are recommended to the user according to the predicted scores.
  • the score data set can be obtained from users' score data for products, or from users' browsing and purchase records, so that not only the users' explicit feedback on products but also implicit feedback on the users' preferences is obtained.
  • in this embodiment, the score data set can be obtained from the users' score data for products.
  • a user's score for a product not only explicitly expresses the user's evaluation of the product; the act of rating the product also implicitly indicates the user's preference for it.
  • the score data set can be represented as a matrix in which different rows represent different users and different columns represent different products. Because each score datum corresponds one-to-one to a user and a product, those skilled in the art will understand that the data layers in which the score data are placed can also be represented as matrices. The processor 801 can then place the score data in the score data set into at least two data layer matrices, each with the same number of rows and the same number of columns as the matrix of the score data set.
  • when the users and products corresponding to any two score data in a data layer are different, that is, when no two score data in a data layer matrix lie in the same row or in the same column,
  • the score data within the same data layer do not depend on one another, so the score data in the same data layer can be processed in parallel. The specific placement steps are not limited by the embodiment of the present invention:
  • any placement method that leaves no two score data in a data layer matrix in the same row or the same column falls within the protection scope of the embodiment of the present invention;
  • the processor 801 can be used to implement the placement method as shown in FIG.
  • the selection method is not limited in this embodiment; each subsequent score datum is selected from the score data set in the same way as the first, which is not described again here.
  • the processor 801 selects the next score datum and, going from the first data layer matrix to the data layer matrix of the last layer, checks in turn whether the row and the column of the score datum's position in that data layer matrix contain no other score data. If so, the score datum is placed in the first data layer matrix that satisfies this condition. If no layer up to the last layer $l_{max}$ satisfies it, the number of the last layer is updated as $l_{max} \leftarrow l_{max} + 1$, where the symbol $\leftarrow$ means that the value on its left is updated to the value on its right (the same below), and the score datum is placed in the new last data layer. The processor 801 repeats this placement process for the remaining score data in the score data set until all score data in the score data set have been placed. For example, the score data are as shown in matrix A: 3 * * 2 * * *
  • the exemplary operation process is as follows: select a score datum $r_{ui}$ from the score data set and put it into the first layer; then select the second score datum $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product.
  • when both differ, the score datum is placed in the first layer; otherwise it is placed in the second layer.
  • then take the third score datum $r_{u''i''}$: if its user $u''$ and product $i''$ both differ from the users and products already placed in the first layer, put it into the first layer; otherwise compare it with the users and products of the score data in the second layer.
  • the processor 801 starts from the second score datum selected from the score data set and compares each score datum against the data layers in order, from the data layer of the last layer back to the first data layer. If a data layer has no other score data in the row and the column of the score datum's position, the score datum is placed at the corresponding position of the last data layer found to satisfy this condition. If no layer from the last layer back to the first satisfies it, the number of the last layer is updated as $l_{max} \leftarrow l_{max} + 1$, and the score datum is placed in the data layer corresponding to the updated last layer. The processor 801 repeats this placement process for the remaining score data in the score data set until all score data in the score data set have been placed.
  • the exemplary operation process is as follows: select a score datum $r_{ui}$ from the score data set and put it into the first layer; then select the second score datum $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product.
  • when both differ, the score datum $r_{u'i'}$ is placed in the first layer; otherwise it is placed in the second layer.
  • the seventh layer of data layers are:
  • the preset recommendation system model may be a latent factor model that considers user implicit feedback, or a latent factor model based only on explicit feedback; it may also be a recommendation system model that considers spatiotemporal characteristics or an asymmetric latent factor model.
  • the processor 801 constructs an improved first recommendation system model that considers user implicit feedback, as in equation (1), and a second recommendation system model that does not consider user implicit feedback, as in equation (2),
  • $\hat{r}_{ui}$ represents the predicted score of user u for product i
  • $\mu$ represents the average of all the score data in the score data set
  • $b_u$ represents the offset of user u from the average user score, and $b_i$ represents the corresponding offset of product i
  • $q_i$ represents the product factor vector
  • $T$ represents the transpose operator
  • $p_u$ represents the user factor vector
  • $|N(u)|$ represents the size of the set of all products for which user u provides an implicit preference
  • $N(u)$ represents the set of all products for which user u provides an implicit preference
  • the cost function of the recommendation system model is obtained from the relationship between the mean square error of the predicted values against the score data and the parameters of the recommendation system model. Using this cost function and the score data of the layered data matrices, the unknown parameters of the model are obtained by solving the cost function optimization problem:
  • the parameters of the recommendation system model are calculated in parallel within each data layer, and the parameters of each data layer are used as the initial values of the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained, where
  • the optimal parameters are the optimal values of the unknown parameters of the above model.
  • the cost functions related to the first recommendation system model and the second recommendation system model may be expressed as equations (3) and (4), respectively.
  • the processor 801 can obtain the unknown parameters of the above model by solving the cost function optimization problem related to the model, that is, by calculating the parameters of the recommendation system model in parallel within each data layer
  • and taking the parameters of each data layer as the initial values of the corresponding next data layer until the optimal parameters of the recommendation system model are obtained. Those skilled in the art will understand that the optimization problems of formula (3) and formula (4) are solved in similar ways, so the specific solution process is not described for both.
  • taking equation (3) as an example, as shown in FIG. 7, the processor 801 is further configured to:
  • set the initial values of the unknown parameters of the first data layer (such as $b_u$, $b_i$, $q_i$ and $p_u$) and of the regularization coefficients, which may be set randomly.
  • alternatively, the initial values of the scalar parameters can be set to 0 and those of the vector parameters to the zero vector $\vec{0}$, where the arrow above the number 0 is the vector symbol. The regularization coefficients, which indicate the degree of regularization, can be set to arbitrary, relatively small positive values; this is not limited by the embodiment of the present invention.
  • the processor 801 starts from the first data layer and calculates the parameter values of each data layer in parallel in turn, taking the calculated parameter values as the initial parameter values of the next data layer.
  • the processor 801 updates the unknown parameters (such as $b_u$, $b_i$, $q_i$ and $p_u$) corresponding to the score data in the first data layer, starting from the initial parameter values of the first data layer. Because the score data of the same layer matrix do not depend on one another,
  • the unknown parameters corresponding to different score data of the same layer matrix can be updated in parallel. The updated parameters are used as the initial parameter values of the next data layer,
  • whose parameters corresponding to its score data are updated in the same way, until the parameters corresponding to the score data in the last data layer have been updated and the parameters calculated by the last data layer are obtained;
  • the processor 801 can perform the update calculation by the gradient descent method. Since the update methods of the parameters corresponding to all the scoring data are the same, those skilled in the art can understand that the parameters b M , b corresponding to one scoring data are understood.
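The parallel update relies on the layer constraint stated in the first aspect: no two items of scoring data in one data layer share a user or a product. A minimal sketch of one way to build such layers is given below. The greedy first-fit strategy and the names `Rating` and `place_into_layers` are assumptions made for illustration; the source does not say how the layers are constructed.

```python
from typing import List, Set, Tuple

# Hypothetical record type: (user id, product id, score).
Rating = Tuple[str, str, float]


def place_into_layers(ratings: List[Rating]) -> List[List[Rating]]:
    """Greedy first-fit placement (an assumed strategy, not the patent's).

    Each rating goes into the first layer that does not yet contain its
    user or its product, so ratings inside one layer touch disjoint
    parameters and can be updated in parallel.
    """
    layers: List[List[Rating]] = []
    users_seen: List[Set[str]] = []      # users already present in each layer
    products_seen: List[Set[str]] = []   # products already present in each layer

    for user, product, score in ratings:
        for k in range(len(layers)):
            if user not in users_seen[k] and product not in products_seen[k]:
                break  # layer k can take this rating
        else:
            # No existing layer fits (or there are none yet): open a new one.
            layers.append([])
            users_seen.append(set())
            products_seen.append(set())
            k = len(layers) - 1
        layers[k].append((user, product, score))
        users_seen[k].add(user)
        products_seen[k].add(product)
    return layers
```

With this placement, every rating in a layer reads and writes its own $b_u$, $b_i$, $q_i$ and $p_u$, which is the property the parallel update depends on.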
  • taking formula (1) as an example, the processor 801 may perform the calculation as follows: first, the processor 801 obtains an initial estimate of the scoring data $r_{ui}$ from the initial parameter values of the data layer and the recommendation system model represented by formula (1), and then obtains the scoring error $e_{ui}$ of that layer from the layer's scoring data and the initial estimate;
  • the processor 801 then obtains updated parameter values along the negative gradient direction from the scoring error $e_{ui}$ and equations (5)-(9). Those skilled in the art will understand how equations (5)-(8) are computed from the corresponding parameters, so the calculation process and the meaning of the symbols and parameters in those equations are not repeated. Correspondingly, solving the optimization problem of equation (4) requires only the update calculations of equations (5)-(8), which yield the parameters $b_u$, $b_i$, $q_i$ and $p_u$;
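Equations (5)-(8) are not reproduced in this part of the text, so the following is only a sketch of a conventional gradient-descent step for one item of scoring data under the second model (the one without implicit feedback). The function name `sgd_update`, the learning rates `gamma1` and `gamma2`, the regularization factors `lam1` and `lam2`, and their default values are assumptions, and any constant factor of the gradient is absorbed into the learning rates.

```python
from typing import List, Tuple


def sgd_update(
    r_ui: float,
    mu: float,
    b_u: float,
    b_i: float,
    q_i: List[float],
    p_u: List[float],
    gamma1: float = 0.01,  # assumed learning rate for the bias terms
    gamma2: float = 0.01,  # assumed learning rate for the factor vectors
    lam1: float = 0.05,    # assumed regularization factor for the biases
    lam2: float = 0.05,    # assumed regularization factor for the vectors
) -> Tuple[float, float, float, List[float], List[float]]:
    """One negative-gradient step for a single rating (illustrative only)."""
    # Initial estimate from the current parameters, then the scoring error.
    estimate = mu + b_u + b_i + sum(q * p for q, p in zip(q_i, p_u))
    e_ui = r_ui - estimate

    # All right-hand sides use the initial values, as the text requires.
    new_b_u = b_u + gamma1 * (e_ui - lam1 * b_u)
    new_b_i = b_i + gamma1 * (e_ui - lam1 * b_i)
    new_q_i = [q + gamma2 * (e_ui * p - lam2 * q) for q, p in zip(q_i, p_u)]
    new_p_u = [p + gamma2 * (e_ui * q - lam2 * p) for q, p in zip(q_i, p_u)]
    return e_ui, new_b_u, new_b_i, new_q_i, new_p_u
```

Because the items of scoring data in one data layer share no user and no product, calls like this one for different items in the same layer never write to the same parameter.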
  • computing formula (9) directly would still require the processor 801 to process the scoring data in the scoring data set serially, which reduces computational efficiency. The embodiment of the present invention therefore provides a separate calculation method for formula (9).
  • the processor 801 then aggregates to obtain the value of the layer ⁇ , ie +y 2 (4) + Ay + A)), where is the initial value.
  • the embodiment of the present invention provides another method for equations (8) and (9).
  • the specific calculation method is as follows:
  • with the auxiliary variable $z_u = p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$, the first recommendation system model is equivalent to
  • $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T z_u$, whose parameters are $\mu$, $b_u$, $b_i$, $q_i$ and $z_u$;
  • the update of the parameters $p_u$ and $y_j$ can therefore be replaced by the update of the auxiliary variable $z_u$ alone. Furthermore, because the auxiliary variable is introduced, the update of the parameter $q_i$ in formula (8) changes accordingly: the update formula of $q_i$ becomes $q_i \leftarrow q_i^{0} + \gamma_2 \, \Delta q_i$, in which the gradient $\Delta q_i$ is computed from $z_u$ and $q_i^{0}$ is the initial value at the time of calculation in this layer.
  • the processor 801 uses the parameters computed for one data layer as the initial parameter values of the next data layer. That is, the processor 801 uses the updated values from the first data layer as the initial parameter values of the second data layer, computes the updated parameter values of the second data layer by the method above, and so on, until it obtains the updated parameter values of the last data layer.
  • the processor 801 is further configured to determine, according to the parameters computed for the last data layer, whether the recommendation system model has converged. If it has converged, the calculation ends and the optimal parameters are obtained; if not, the parameters computed for the last data layer are used as the initial parameter values of the first data layer and the calculation of the recommendation system model continues.
  • the processor 801 can substitute the parameters computed for the last data layer in the current round and the parameters computed for the last data layer in the previous round into the cost function of equation (3) or (4).
  • if the difference between the two results is not greater than a preset threshold, the parameters computed for the last data layer in the current round are taken as the optimal parameters. If the difference is greater than the preset threshold, the parameters from the current round have not converged; they are used as the initial parameter values of the first data layer in the next round, and the calculation of the parameters of the recommendation system model continues.
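The round structure just described (sweep all data layers, evaluate the cost function, compare with the previous round) can be sketched as a small driver loop. The names `fit_until_converged`, `sweep_all_layers` and `cost` are hypothetical, and the `max_rounds` cap is an added safety assumption that the source does not mention. In this sketch the first comparison is made against the cost of the initial parameters.

```python
from typing import Callable, TypeVar

P = TypeVar("P")  # whatever container holds the model parameters


def fit_until_converged(
    params: P,
    sweep_all_layers: Callable[[P], P],  # one pass from the first to the last layer
    cost: Callable[[P], float],          # cost function, e.g. equation (3) or (4)
    threshold: float = 1e-6,             # the preset threshold
    max_rounds: int = 1000,              # assumed cap, not part of the source
) -> P:
    """Repeat full layer sweeps until the cost changes by at most `threshold`."""
    prev_cost = cost(params)
    for _ in range(max_rounds):
        # The last layer's output becomes the first layer's input next round.
        params = sweep_all_layers(params)
        cur_cost = cost(params)
        if abs(prev_cost - cur_cost) <= threshold:
            break  # converged: these are taken as the optimal parameters
        prev_cost = cur_cost
    return params
```

Passing the sweep and the cost as callables keeps the loop independent of which of the two models is being fitted.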
  • the processor 801 substitutes the obtained optimal parameters into formula (1) or formula (2) to obtain the predicted score of each user for each product. It can then sort one user's predicted scores over all products and select a predetermined number of products with the highest predicted scores to recommend to that user.
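As an illustration of this last step, the sketch below scores every product for one user with the model without implicit feedback and returns the highest-scoring ones. The function names and the dictionary layout of the per-product parameters are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence


def predict_score(
    mu: float, b_u: float, b_i: float,
    q_i: Sequence[float], p_u: Sequence[float],
) -> float:
    """Predicted score: average + user offset + product offset + q_i^T p_u."""
    return mu + b_u + b_i + sum(q * p for q, p in zip(q_i, p_u))


def recommend_top_n(
    mu: float,
    b_u: float,
    p_u: Sequence[float],
    product_bias: Dict[str, float],               # b_i for each product id
    product_factors: Dict[str, Sequence[float]],  # q_i for each product id
    n: int,
) -> List[str]:
    """Return the n product ids with the highest predicted score for one user."""
    scores = {
        product: predict_score(mu, b_u, product_bias[product], q_i, p_u)
        for product, q_i in product_factors.items()
    }
    ranked = sorted(scores, key=lambda product: scores[product], reverse=True)
    return ranked[:n]
```

In practice one would typically exclude products the user has already scored before ranking; that filter is left out here because the source does not describe it.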
  • the processor 801 can also be configured to feed new scoring data that users give to products after a recommendation back into the recommendation system, so that the recommendation system updates the parameters of the recommendation system model in real time and keeps real-time recommendation both accurate and efficient.
  • the present embodiment provides a recommendation device 60, which improves the recommendation efficiency of the recommendation system in a massive data environment by parallel computing, and improves the recommendation effect of the recommendation system by considering user implicit feedback.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a division by logical function.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may be physically included separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional unit described above is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform portions of the steps of various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Abstract

Embodiments of the present invention provide a recommendation method and a device. Recommendation efficiency of a recommendation system in a massive data environment is improved through parallel calculation, and a recommendation effect of the recommendation system is improved through considering implicit feedbacks of users, wherein the recommendation method comprises: placing scoring data in a scoring data set to at least one data layer respectively; based on a preset recommendation system model and the scoring data in the data layers, computing parameters of the recommendation system model in the data layers in parallel and using parameters of each layer of the data layers as initial values of a next layer of the data layers until optimal parameters of the recommendation system model are obtained; based on the optimal parameters and the recommendation system model, obtaining a predicted scoring value of each user on each product; and recommending products to the users based on the predicted scoring value.

Description

A recommendation method and device
Technical field
The present invention relates to the field of information processing, and in particular to a recommendation method and device.
Background
With the continuous development of networks, the amount of network information has grown explosively. Information recommendation methods let users find the information they are interested in within massive amounts of network information. The existing information recommendation method fuses users' explicit feedback information and implicit feedback information in a data model, solves the optimization problem of minimizing a loss function with a stochastic gradient descent algorithm, obtains the parameters of the data model by serial computation, and uses those parameters to predict scores for products the user has not yet scored and to recommend the products the user is predicted to be interested in. Because both explicit and implicit feedback are taken into account, this method achieves relatively high recommendation accuracy. The inventors have found that, because the existing method relies on serial computation, it is not suited to data processing in a massive data environment: its efficiency there is extremely low, which in turn degrades the quality of the recommendations made to users.
Summary of the invention
Embodiments of the present invention provide a recommendation method and device that improve the recommendation efficiency of a recommendation system in a massive data environment through parallel computing, and improve the recommendation effect of the recommendation system by taking users' implicit feedback into account.
To achieve the above objective, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides a recommendation method, including:
placing the scoring data in a scoring data set into at least two data layers, where each item of scoring data corresponds to one user and one product, and any two items of scoring data in the same data layer correspond to different users and different products;
computing, in parallel, the parameters of the recommendation system model in the data layers according to a preset recommendation system model and the scoring data in the data layers, and using the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between the predicted score of each user for each product, the average score, and the parameters of the recommendation system model; and
obtaining the predicted score of each user for each product according to the optimal parameters and the recommendation system model, and recommending products to the user according to the predicted scores.
According to a first possible implementation, in combination with the first aspect, the recommendation system model includes a recommendation system model with implicit feedback, a recommendation system model without implicit feedback, a recommendation system model that takes temporal and spatial characteristics into account, and a recommendation system model with asymmetric latent factors.
According to a second possible implementation, in combination with the first aspect or the first possible implementation, the recommendation system model includes a first recommendation system model
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)$
or a second recommendation system model
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$
In the first and second recommendation system models, $\hat{r}_{ui}$ denotes the predicted score of user u for product i, $\mu$ denotes the average of all scoring data in the scoring data set, $b_u$ denotes the offset of user u relative to the average user score, $b_i$ denotes the offset of product i relative to the average product score, $q_i$ denotes the product factor vector, $T$ denotes the transpose operator, and $p_u$ denotes the user factor vector.
Further, in the first recommendation system model, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information.
According to a third possible implementation, in combination with the second possible implementation, the method further includes: obtaining a cost function of the recommendation system model from the relationship between the mean square error of the predicted scores against the scoring data and the parameters of the recommendation system model, where the cost function includes a first cost function
Figure imgf000005_0001
or a second cost function
$\sum_{(u,i)} \left[ r_{ui} - \mu - b_u - b_i - q_i^T p_u \right]^2 + \lambda_1 \left( b_u^2 + b_i^2 \right) + \lambda_2 \left( \|q_i\| + \|p_u\| \right)$
where $\|*\|$ denotes the sum of the squares of all elements of the vector $*$, and $\lambda_1$ and $\lambda_2$ are regularization factors.
According to a fourth possible implementation, in combination with the first aspect or any one of the first to third possible implementations, computing in parallel the parameters of the recommendation system model in the data layers according to the preset recommendation system model and the scoring data in the data layers, and using the parameters of each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, includes:
A: computing the average of all scoring data in the scoring data set;
B: computing the parameters of each data layer in turn by parallel computation, and using the parameters computed for each data layer as the initial parameter values of the next data layer, where the initial parameter values of the first data layer are set by the system;
C: determining, according to the parameters computed for the last data layer, whether the recommendation system model has converged; if it has converged, the calculation ends and the optimal parameters are obtained; if it has not converged, using the parameters computed for the last data layer as the initial parameter values of the first data layer and repeating steps B and C.
According to a fifth possible implementation, in combination with the fourth possible implementation, step B includes:
B1: obtaining an initial estimate of the scoring data of a data layer from the initial parameter values of that data layer and the recommendation system model, and then obtaining the scoring error of that data layer from its scoring data and the initial estimate;
B2: obtaining the parameters computed for that data layer according to the scoring error;
B3: using the parameters computed for that data layer as the initial parameter values of the next data layer, and obtaining the parameters computed for the next data layer according to steps B1 and B2, until the parameters computed for the last data layer are obtained.
According to a sixth possible implementation, in combination with the fourth or fifth possible implementation, determining whether the recommendation system model has converged according to the parameters computed for the last data layer includes: substituting the parameters computed for the last data layer in the current round and the parameters computed for the last data layer in the previous round into the cost function; if the difference between the two results is not greater than a preset threshold, the parameters computed for the last data layer in the current round have converged; otherwise, they have not converged.
According to a seventh possible implementation, in combination with the fifth possible implementation, obtaining the parameters computed for the data layer according to the scoring error further includes: treating the expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$ in the first recommendation system model as one equivalent parameter represented by an auxiliary variable $z_u$, that is, $z_u = p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$; then obtaining the auxiliary variable $z_u$, and hence the equivalent parameter, according to the gradient of the auxiliary variable, $\Delta z_u = 2 e_{ui} \cdot q_i$; and obtaining the parameter $q_i$ according to the auxiliary variable $z_u$, that is, $q_i \leftarrow (q_i + \gamma_2 \Delta q_i)$, where $\leftarrow$ is the update symbol: the value computed on its right replaces the variable on its left, the parameters appearing on the right are the initial values of the corresponding parameters, and the parameters appearing on the left are the updated values.
In a second aspect, a recommendation device is provided, including:
a data placement unit, configured to place the scoring data in a scoring data set into at least two data layers, where each item of scoring data corresponds to one user and one product, and any two items of scoring data in the same data layer correspond to different users and different products;
a parallel computing unit, configured to compute, in parallel, the parameters of the recommendation system model in the data layers according to a preset recommendation system model and the scoring data in the data layers, and to use the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between the predicted score of each user for each product, the average score, and the parameters of the recommendation system model; and
a prediction and recommendation unit, configured to obtain the predicted score of each user for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted scores.
According to a first possible implementation, in combination with the second aspect, the recommendation system model includes a recommendation system model with implicit feedback, a recommendation system model without implicit feedback, a recommendation system model that takes temporal and spatial characteristics into account, and a recommendation system model with asymmetric latent factors.
According to a second possible implementation, in combination with the second aspect or the first possible implementation, the recommendation system model includes a first recommendation system model
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)$
or a second recommendation system model
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$
In the first and second recommendation system models, $\hat{r}_{ui}$ denotes the predicted score of user u for product i, $\mu$ denotes the average of all scoring data in the scoring data set, $b_u$ denotes the offset of user u relative to the average user score, $b_i$ denotes the offset of product i relative to the average product score, $q_i$ denotes the product factor vector, $T$ denotes the transpose operator, and $p_u$ denotes the user factor vector.
Further, in the first recommendation system model, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information.
According to a third possible implementation, in combination with the second possible implementation, the device further includes: a cost function generation unit, configured to obtain a cost function of the recommendation system model from the relationship between the mean square error of the predicted scores against the scoring data and the parameters of the recommendation system model, where the cost function includes a first cost function
Figure imgf000008_0001
or a second cost function
$\sum_{(u,i)} \left[ r_{ui} - \mu - b_u - b_i - q_i^T p_u \right]^2 + \lambda_1 \left( b_u^2 + b_i^2 \right) + \lambda_2 \left( \|q_i\| + \|p_u\| \right)$
where $\|*\|$ denotes the sum of the squares of all elements of the vector $*$, and $\lambda_1$ and $\lambda_2$ are regularization factors.
According to a fourth possible implementation, in combination with the second aspect or any one of the first to third possible implementations, the parallel computing unit includes:
an average score calculation subunit, configured to compute the average of all scoring data in the scoring data set;
a layered calculation subunit, configured to compute the parameters of each data layer in turn by parallel computation, and to use the parameters computed for each data layer as the initial parameter values of the next data layer, where the initial parameter values of the first data layer are set by the system; and
a convergence determination subunit, configured to determine, according to the parameters computed for the last data layer, whether the recommendation system model has converged; if it has converged, the calculation ends and the optimal parameters are obtained; if it has not converged, the parameters computed for the last data layer are used as the initial parameter values of the first data layer and are passed to the layered calculation subunit, which repeats the layered calculation.
According to a fifth possible implementation, in combination with the fourth possible implementation, the layered calculation subunit further includes:
a scoring error generation module, configured to obtain an initial estimate of the scoring data of a data layer from the initial parameter values of that data layer and the recommendation system model, and then to obtain the scoring error of that data layer from its scoring data and the initial estimate;
a parameter calculation module, configured to obtain the parameters computed for that data layer according to the scoring error; and
a calculation control module, configured to use the parameters computed for that data layer as the initial parameter values of the next data layer, and to obtain the parameters computed for the next data layer through the scoring error generation module and the parameter calculation module, until the parameters computed for the last data layer are obtained.
According to a sixth possible implementation, in combination with the fourth or fifth possible implementation, the convergence determination subunit is further configured to substitute the parameters computed for the last data layer in the current round and the parameters computed for the last data layer in the previous round into the cost function; if the difference between the two results is not greater than a preset threshold, the parameters computed for the last data layer in the current round have converged; otherwise, they have not converged.
According to a seventh possible implementation, in combination with the fifth possible implementation, the parameter calculation module is further configured to: treat the expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$ in the first recommendation system model as one equivalent parameter represented by an auxiliary variable $z_u$, that is, $z_u = p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$; then obtain the auxiliary variable $z_u$, and hence the equivalent parameter, according to the gradient of the auxiliary variable, $\Delta z_u = 2 e_{ui} \cdot q_i$; and obtain the parameter $q_i$ according to the auxiliary variable $z_u$, that is, $q_i \leftarrow (q_i + \gamma_2 \Delta q_i)$, where $\leftarrow$ is the update symbol: the value computed on its right replaces the variable on its left, the parameters appearing on the right are the initial values of the corresponding parameters, and the parameters appearing on the left are the updated values.
In a third aspect, a recommendation device is provided, including a processor and a memory, where
the processor is configured to place the scoring data in a scoring data set into at least two data layers, where each item of scoring data corresponds to one user and one product, and any two items of scoring data in the same data layer correspond to different users and different products;
to compute, in parallel, the parameters of the recommendation system model in the data layers according to a preset recommendation system model and the scoring data in the data layers, and to use the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between the predicted score of each user for each product, the average score, and the parameters of the recommendation system model;
and to obtain the predicted score of each user for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted scores; and
the memory is configured to store the scoring data set, the programs executed by the processor, and the results of execution.
According to a first possible implementation, in combination with the third aspect, the recommendation system model includes a recommendation system model with implicit feedback, a recommendation system model without implicit feedback, a recommendation system model that takes temporal and spatial characteristics into account, and a recommendation system model with asymmetric latent factors.
According to a second possible implementation, in combination with the third aspect or the first possible implementation, the recommendation system model includes a first recommendation system model

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\Big(p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j\Big)$$

or a second recommendation system model

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$$

In the first and second recommendation models, $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all the rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, $T$ denotes the transpose operator, and $p_u$ denotes the user factor vector.
Further, in the first recommendation model, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information.
According to a third possible implementation, in combination with the second possible implementation, the processor is further configured to obtain a cost function of the recommendation system model from the mean square error between the predicted ratings and the rating data and from the parameters of the recommendation system model, where the cost function includes a first cost function

$$\sum_{(u,i)} \Big[r_{ui} - \mu - b_u - b_i - q_i^T\Big(p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j\Big)\Big]^2 + \lambda_1\big(b_u^2 + b_i^2\big) + \lambda_2\Big(\|q_i\|^2 + \|p_u\|^2 + \sum_{j \in N(u)}\|y_j\|^2\Big)$$

or a second cost function

$$\sum_{(u,i)} \big[r_{ui} - \mu - b_u - b_i - q_i^T p_u\big]^2 + \lambda_1\big(b_u^2 + b_i^2\big) + \lambda_2\big(\|q_i\|^2 + \|p_u\|^2\big)$$

where $\|*\|^2$ denotes the sum of the squares of all the elements of the vector $*$, and $\lambda_1$ and $\lambda_2$ are regularization factors.
According to a fourth possible implementation, in combination with the third aspect or any one of the first to third possible implementations, the processor is configured to:
A: compute the average of all the rating data in the rating data set;
B: compute the parameters of each data layer in turn by parallel computation, using the parameters computed for each data layer as the initial parameter values for the next data layer, where the initial parameter values of the first data layer are set by the system;
C: determine, from the parameters computed for the last data layer, whether the recommendation system model has converged; if it has, the computation ends and the optimal parameters are obtained; if it has not, use the parameters computed for the last data layer as the initial parameter values of the first data layer and repeat steps B and C.
According to a fifth possible implementation, in combination with the fourth possible implementation, the processor is further configured to:
B1: obtain initial estimates of the rating data of each data layer from the initial parameter values of that data layer and the recommendation system model, and then obtain the rating errors of that data layer from its rating data and the initial estimates;
B2: obtain the parameters computed for that data layer from the rating errors;
B3: use the parameters computed for that data layer as the initial parameter values for the next data layer, and obtain the parameters computed for the next data layer according to steps B1 and B2, until the parameters computed for the last data layer are obtained.
According to a sixth possible implementation, in combination with the fourth or fifth possible implementation, the processor is configured to substitute both the last-layer parameters obtained in the current pass and the last-layer parameters obtained in the previous pass into the cost function. If the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current pass have converged; otherwise, they have not converged.
According to a seventh possible implementation, in combination with the fifth possible implementation, the processor obtaining the parameters computed for that data layer from the rating errors further includes: the processor treats the expression $p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j$ in the first recommendation model as an equivalent parameter and represents it with an auxiliary variable $z_u$, i.e.

$$z_u = p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j$$

The processor then obtains the auxiliary variable $z_u$, and thus the equivalent parameter, from the gradient of the auxiliary variable, $\Delta z_u = 2 e_{ui} \cdot q_i$, and obtains the parameter $q_i$ from the auxiliary variable $z_u$ [update formula illegible in the source text], where $\leftarrow$ denotes the update symbol: the value of the variable on the left of the update symbol is replaced by the value computed on the right. Parameters appearing on the right of the update symbol are the initial values of the corresponding parameters, and parameters appearing on the left are the updated values.
The recommendation method and device provided by the embodiments of the present invention improve the recommendation efficiency of a recommendation system in a massive-data environment through parallel computation, and improve the quality of its recommendations by taking users' implicit feedback into account.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a recommendation method according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of a recommendation method according to an embodiment of the present invention;
FIG. 3A is a flowchart of a method for placing rating data according to an embodiment of the present invention;
FIG. 3B is a flowchart of another method for placing rating data according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of solving for the optimal parameters according to an embodiment of the present invention;
FIG. 5A is a schematic diagram of a parameter update method according to an embodiment of the present invention;
FIG. 5B is a schematic diagram of another parameter update method according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a recommendation device according to an embodiment of the present invention;
FIG. 7 is a structural diagram of another recommendation device according to an embodiment of the present invention;
FIG. 8 is a hardware diagram of a recommendation device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic flowchart of a recommendation method according to an embodiment of the present invention. The method includes:
S101: Place the rating data of a rating data set into at least two data layers, where each item of rating data corresponds to one user and one product, and any two items of rating data in the same data layer correspond to different users and to different products;
S102: According to a preset recommendation system model and the rating data in the data layers, compute in parallel the parameters of the recommendation system model for each data layer, using the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between the predicted rating of each user for each product, the average rating, and the parameters of the recommendation system model.
For example, the recommendation system model may include a recommendation system model that provides implicit feedback, a recommendation system model that does not provide implicit feedback, a recommendation system model that considers spatiotemporal characteristics, and an asymmetric latent factor recommendation system model.
For example, the recommendation system model may also include a first recommendation system model

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\Big(p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j\Big)$$

or a second recommendation system model

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$$

In the first and second recommendation models, $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all the rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, $T$ denotes the transpose operator, and $p_u$ denotes the user factor vector.
Further, in the first recommendation model, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information.
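As a minimal sketch of the two models above, assuming plain Python lists for the factor vectors, the predicted rating could be computed as follows. The function names `predict_basic` and `predict_implicit` and the argument `y_list` are illustrative and do not come from the source; treating an empty N(u) as contributing nothing is likewise an assumption.

```python
import math


def predict_basic(mu, b_u, b_i, q_i, p_u):
    """Second model: mu + b_u + b_i + q_i^T p_u."""
    return mu + b_u + b_i + sum(q * p for q, p in zip(q_i, p_u))


def predict_implicit(mu, b_u, b_i, q_i, p_u, y_list):
    """First model: p_u is shifted by |N(u)|^(-1/2) * sum of y_j before the dot product.

    y_list holds one factor vector y_j per product j in N(u) (hypothetical layout).
    """
    if y_list:
        scale = 1.0 / math.sqrt(len(y_list))
        shifted = [p + scale * sum(y[k] for y in y_list) for k, p in enumerate(p_u)]
    else:
        shifted = list(p_u)  # assumption: an empty N(u) adds nothing
    return mu + b_u + b_i + sum(q * z for q, z in zip(q_i, shifted))
```

With an empty `y_list`, `predict_implicit` reduces to `predict_basic`, which mirrors how the second model drops the implicit-feedback term of the first.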
Correspondingly, a cost function of the recommendation system model is obtained from the mean square error between the predicted ratings given by the recommendation system model and the rating data, and from the parameters of the recommendation system model, where the cost function includes a first cost function

$$\sum_{(u,i)} \Big[r_{ui} - \mu - b_u - b_i - q_i^T\Big(p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j\Big)\Big]^2 + \lambda_1\big(b_u^2 + b_i^2\big) + \lambda_2\Big(\|q_i\|^2 + \|p_u\|^2 + \sum_{j \in N(u)}\|y_j\|^2\Big)$$

or a second cost function

$$\sum_{(u,i)} \big[r_{ui} - \mu - b_u - b_i - q_i^T p_u\big]^2 + \lambda_1\big(b_u^2 + b_i^2\big) + \lambda_2\big(\|q_i\|^2 + \|p_u\|^2\big)$$

where $\|*\|^2$ denotes the sum of the squares of all the elements of the vector $*$, and $\lambda_1$ and $\lambda_2$ are regularization factors.
For example, computing in parallel the parameters of the recommendation system model for each data layer according to the preset recommendation system model and the rating data in the data layers, and using the parameters of each data layer as the initial values for the next data layer until the optimal parameters of the recommendation system model are obtained, includes:
A: Compute the average of all the rating data in the rating data set;
B: Compute the parameters of each data layer in turn by parallel computation, using the parameters computed for each data layer as the initial parameter values for the next data layer, where the initial parameter values of the first data layer are set by the system.
Further, step B includes:
B1: Obtain initial estimates of the rating data of each data layer from the initial parameter values of that data layer and the recommendation system model, and then obtain the rating errors of that data layer from its rating data and the initial estimates;
B2: Obtain the parameters computed for that data layer from the rating errors;
B3: Use the parameters computed for that data layer as the initial parameter values for the next data layer, and obtain the parameters computed for the next data layer according to steps B1 and B2, until the parameters computed for the last data layer are obtained.
C: Determine, from the parameters computed for the last data layer, whether the recommendation system model has converged; if it has, the computation ends and the optimal parameters are obtained; if it has not, use the parameters computed for the last data layer as the initial parameter values of the first data layer and repeat steps B and C.
For example, determining whether the recommendation system model has converged from the parameters computed for the last data layer includes: substituting both the last-layer parameters obtained in the current pass and the last-layer parameters obtained in the previous pass into the cost function. If the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current pass have converged; otherwise, they have not converged.
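Steps A to C can be sketched as follows for the second recommendation system model (the one without implicit feedback). This is an illustration under stated assumptions, not the patented implementation: the single learning rate `gamma`, the regularization values, the initial parameter values, the `max_passes` cap and all function names are hypothetical, the regularization terms are accumulated once per rating, and each layer is walked with an ordinary loop where a real system would spread the ratings of a layer across workers.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def predict(mu, bu, bi, p, q, u, i):
    return mu + bu[u] + bi[i] + dot(q[i], p[u])


def cost(ratings, mu, bu, bi, p, q, lam1, lam2):
    """Squared error plus regularization, accumulated per rating (assumption)."""
    total = 0.0
    for u, i, r in ratings:
        err = r - predict(mu, bu, bi, p, q, u, i)
        total += err * err
        total += lam1 * (bu[u] ** 2 + bi[i] ** 2)
        total += lam2 * (dot(q[i], q[i]) + dot(p[u], p[u]))
    return total


def update_layer(layer, mu, bu, bi, p, q, gamma, lam1, lam2):
    """Steps B1 and B2 for one layer.

    Ratings in a layer share no user and no product, so each iteration touches
    disjoint parameters and the loop body could run in parallel.
    """
    for u, i, r in layer:
        err = r - predict(mu, bu, bi, p, q, u, i)        # B1: rating error
        old_p, old_q = p[u], q[i]
        bu[u] += gamma * (err - lam1 * bu[u])            # B2: gradient steps
        bi[i] += gamma * (err - lam1 * bi[i])
        p[u] = [a + gamma * (err * b - lam2 * a) for a, b in zip(old_p, old_q)]
        q[i] = [b + gamma * (err * a - lam2 * b) for a, b in zip(old_p, old_q)]


def train(layers, n_factors=2, gamma=0.01, lam1=0.02, lam2=0.02,
          threshold=1e-4, max_passes=1000):
    ratings = [t for layer in layers for t in layer]
    mu = sum(r for _, _, r in ratings) / len(ratings)    # step A
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    bu = {u: 0.0 for u in users}                         # initial values "set by
    bi = {i: 0.0 for i in items}                         # the system" (assumed)
    p = {u: [0.1] * n_factors for u in users}
    q = {i: [0.1] * n_factors for i in items}
    prev = cost(ratings, mu, bu, bi, p, q, lam1, lam2)
    for _ in range(max_passes):
        for layer in layers:                             # steps B and B3: each layer
            update_layer(layer, mu, bu, bi, p, q,        # starts from the previous
                         gamma, lam1, lam2)              # layer's parameters
        cur = cost(ratings, mu, bu, bi, p, q, lam1, lam2)
        if abs(prev - cur) <= threshold:                 # step C: convergence test
            break
        prev = cur                                       # otherwise start a new pass
    return mu, bu, bi, p, q
```

Because the parameters are updated in place, the values left by the last layer of one pass are automatically the initial values of the first layer in the next pass, which is the hand-off that step C describes.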
S103: Obtain the predicted rating of each user for each product from the optimal parameters and the recommendation system model, and recommend products to the user according to the predicted ratings.
This embodiment provides a recommendation method that improves the recommendation efficiency of a recommendation system in a massive-data environment through parallel computation, and improves the quality of its recommendations by taking users' implicit feedback into account.
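Step S103 could look like the following sketch. The source only says that products are recommended according to the predicted ratings; ranking by predicted rating, returning the top `top_n` products and skipping products the user has already rated are assumptions made for illustration, as are the function and parameter names.

```python
def recommend(user, products, rated, predict, top_n=3):
    """Return up to top_n products for `user`, highest predicted rating first.

    rated   -- set of (user, product) pairs that already have a rating
    predict -- callable (user, product) -> predicted rating
    """
    candidates = [i for i in products if (user, i) not in rated]
    candidates.sort(key=lambda i: predict(user, i), reverse=True)
    return candidates[:top_n]
```

Here `predict` would wrap the trained model, for example a closure over the optimal parameters.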
FIG. 2 is a detailed flowchart of a recommendation method according to this embodiment. The method includes:
S201: Place the rating data of the rating data set into at least two data layers.
For example, the rating data set may be obtained from users' ratings of products, or from users' browsing and purchase records. Both explicit feedback from users on products and implicit feedback on user preferences can be obtained in this way, and the embodiments of the present invention place no limitation on this. Preferably, the rating data set is obtained from users' ratings of products. A person skilled in the art will understand that a user's rating of a product not only explicitly feeds back the user's evaluation of the product, but the act of rating also implicitly feeds back the user's preference for the product. Preferably, in this embodiment, the rating data set is represented as a matrix in which different rows represent different users and different columns represent different products.
Further, each item of rating data corresponds to one user and one product. A person skilled in the art will understand that a data layer holding rating data can be represented as a matrix, just as the rating data set can, with rows representing users and columns representing products. The rating data of the rating data set can therefore be placed into at least two data-layer matrices, each with the same number of rows and the same number of columns as the matrix of the rating data set. When any two items of rating data in a data layer correspond to different users and to different products, that is, when no two items of rating data in a data-layer matrix lie in the same row or in the same column, the items of rating data in that layer do not depend on one another, so the rating data of one layer can be processed in parallel. The embodiments of the present invention do not limit the specific placement steps: any placement method under which no two items of rating data in a data-layer matrix lie in the same row or in the same column falls within the protection scope of the embodiments of the present invention.
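The layer property described above can be checked with a few lines of code. This is a small sketch that assumes a layer is stored as a list of `(user, product, rating)` triples; that storage format and the name `is_valid_layer` are illustrative choices, not something the source specifies.

```python
def is_valid_layer(layer):
    """Return True if no two (user, product, rating) triples share a user or a product."""
    users, products = set(), set()
    for user, product, _rating in layer:
        if user in users or product in products:
            return False  # same row or same column already occupied
        users.add(user)
        products.add(product)
    return True
```

A layer that passes this check has at most one rating per matrix row and per matrix column, which is the condition that makes its ratings independent of one another.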
Optionally, as shown in FIG. 3A, a preferred placement method of this embodiment includes:
3A01: Select one item of rating data from the rating data set and place it in the first data-layer matrix at the position corresponding to its user and product. At this point the number of the last layer is set to $l_{max} = 1$. This embodiment does not limit how the item is selected; later selections from the rating data set are made in the same way as the first and are not described again;
3A02: Select the next item of rating data and, starting from the first data-layer matrix and proceeding to the data-layer matrix of the last layer, check in turn whether both the row and the column of the item's position in the data-layer matrix are free of rating data. If so, place the item in the first data-layer matrix that satisfies this condition. If no data-layer matrix up to the last layer $l_{max}$ satisfies it, update the number of the last layer to $l_{max} \leftarrow l_{max} + 1$, where $\leftarrow$ means that the value on the left of the symbol is updated to the value on the right (the same below), and place the item in the new last data layer;
3A03: Repeat step 3A02 for the remaining rating data in the rating data set until all the rating data in the rating data set have been placed.
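Steps 3A01 to 3A03 amount to a first-fit assignment, which can be sketched as follows. The triple format, the per-layer sets used to test for an occupied row or column, and the name `place_first_fit` are assumptions made for illustration; the sample data in the usage note is hypothetical and is not the matrix A of the example.

```python
def place_first_fit(ratings):
    """Assign each (user, product, rating) triple to the first layer that has
    neither the same user nor the same product, opening a new layer if none fits."""
    layers = []  # layers[k] is a list of triples
    used = []    # used[k] is (set of users, set of products) already in layer k
    for user, product, rating in ratings:
        for k, (users, products) in enumerate(used):
            if user not in users and product not in products:
                break  # first layer without a conflict (step 3A02)
        else:
            # no existing layer fits: l_max <- l_max + 1
            layers.append([])
            used.append((set(), set()))
            k = len(layers) - 1
        layers[k].append((user, product, rating))
        used[k][0].add(user)
        used[k][1].add(product)
    return layers
```

For the five hypothetical ratings `(u1,i1,3), (u1,i2,2), (u2,i1,4), (u2,i2,5), (u3,i1,1)`, taken in that order, this yields three layers: the first holds `(u1,i1,3)` and `(u2,i2,5)`, the second holds `(u1,i2,2)` and `(u2,i1,4)`, and the third holds `(u3,i1,1)`.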
For example, consider the rating matrix A:

[Matrix A: seven rows, one per user, with * marking a position that has no rating. The column layout was lost in extraction. The surviving row values, in order, are: 3 2; 2 2 1; 4 5 5 2; 2 3 *; 1 5; 1 2 3; 5 2.]

With the placement method shown in FIG. 3A, the procedure is, for example, as follows. Select one item of rating data $r_{ui}$ from the rating data set and put it in the first layer. Then select a second item $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product: when $u' \neq u$ and $i' \neq i$, put $r_{u'i'}$ in the first layer; otherwise put it in the second layer. Then take a third item $r_{u''i''}$. If its user $u''$ and product $i''$ differ from every user and product already placed in the first layer, put it in the first layer. Otherwise compare it with the users and products of the rating data in the second layer: when $u'' \neq u_{l_2}$ and $i'' \neq i_{l_2}$, put it in the second layer; otherwise put it in the third layer. Here $u_{l_2}$ denotes a user in the second layer and $i_{l_2}$ denotes a product in the second layer. Continue in this way until all the rating data of the set have been placed. Five data layers are obtained in turn, from the first data layer to the last (fifth) data layer:

[Layers $l_1$ (Layer 1) to $l_5$ (Layer 5), each listed as (user, product, rating) triples. The triples are only partly legible in the source; for example, $l_5$ contains $(u_7, i_5, 2)$.]

In matrix form:

[Figure: the five data-layer matrices $l_1$ to $l_5$; image not available.]

where * in a matrix indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5$.
Optionally, as shown in FIG. 3B, another preferred placement method of this embodiment includes:
3B01: Select one item of rating data from the rating data set and place it in the first data layer at the position corresponding to its user and product. At this point the number of the last layer is $l_{max} = 1$;
3B02: Starting with the second item of rating data selected from the rating data set, compare the item with the data layers in turn, from the last data layer toward the first data layer, to find the last data layer in that order for which both the row and the column of the item's position in the data-layer matrix are free of other rating data, and place the item at the corresponding position in that data layer. If no data-layer matrix from the last layer to the first satisfies the condition, update the number of the last layer to $l_{max} \leftarrow l_{max} + 1$ and place the item in the data layer with the updated last-layer number;
3B03: Repeat step 3B02 for the remaining rating data in the rating data set until all the rating data in the rating data set have been placed.
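Steps 3B01 to 3B03 can be sketched as follows. The sketch reads step 3B02 the way the worked example for FIG. 3B describes it: the scan runs from the last layer toward the first and stops at the first layer that already holds the same user or product, so the item lands just above that layer. That reading, the triple format and the name `place_scan_from_last` are assumptions made for illustration; the sample data in the usage note is hypothetical.

```python
def place_scan_from_last(ratings):
    """Scan layers from last to first; stop at the first conflict and place the
    triple in the lowest layer reached without a conflict (or in a new layer)."""
    layers = []  # layers[k] is a list of (user, product, rating) triples
    used = []    # used[k] is (set of users, set of products) already in layer k
    for user, product, rating in ratings:
        target = len(layers)  # default: a new layer above the current last one
        for k in range(len(layers) - 1, -1, -1):
            users, products = used[k]
            if user in users or product in products:
                break         # conflict: the triple must stay above layer k
            target = k        # layer k is free, keep scanning downward
        if target == len(layers):
            layers.append([])  # l_max <- l_max + 1
            used.append((set(), set()))
        layers[target].append((user, product, rating))
        used[target][0].add(user)
        used[target][1].add(product)
    return layers
```

For the five hypothetical ratings `(u1,i1,3), (u1,i2,2), (u2,i1,4), (u2,i2,5), (u3,i1,1)`, taken in that order, the layers are `[(u1,i1,3)]`, `[(u1,i2,2), (u2,i1,4)]` and `[(u2,i2,5), (u3,i1,1)]`. Unlike a first-fit scan, `(u2,i2,5)` is not moved back into the first layer, because the scan stops at the conflict in the second layer; an item therefore never ends up below an earlier item with the same user or product.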
For example, still taking matrix A, the procedure with the placement method shown in FIG. 3B is as follows. Select one item of rating data $r_{ui}$ from the rating data set and put it in the first layer. Then select a second item $r_{u'i'}$ and determine whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product: when $u' \neq u$ and $i' \neq i$, put $r_{u'i'}$ in the first layer; otherwise put it in the second layer. Then take a third item $r_{u''i''}$ and compare it with the rating data already placed in the second layer, if there is a second layer. If its user $u''$ and product $i''$ both differ from the user $u'$ and product $i'$ placed in the second layer, go on to compare it with the rating data already placed in the first layer: when it also satisfies $u'' \neq u$ and $i'' \neq i$ with respect to the first layer, put it in the first layer; otherwise put it in the second layer. When, on the other hand, its user and product satisfy either $u'' = u'$ or $i'' = i'$ with respect to the second layer, put it directly in the third layer. Continue in this way until all the rating data in the rating data set have been placed. Seven data layers are obtained in turn, from the first data layer to the last (seventh) data layer:

[Layers $l_1$ (Layer 1) to $l_7$ (Layer 7), each listed as (user, product, rating) triples. The triples are only partly legible in the source; for example, $l_7$ contains $(u_7, i_5, 2)$.]

In matrix form:

[Figure: the seven data-layer matrices $l_1$ to $l_7$; image not available.]

where * in a matrix indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5 + l_6 + l_7$.
S202: According to a preset recommendation system model and the rating data in the data layers, compute in parallel the parameters of the recommendation system model for each data layer, using the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained.
For example, the preset system model may be a latent factor model that considers users' implicit feedback, or a latent factor model based only on explicit feedback, and may also include a recommendation system model that considers spatiotemporal characteristics, an asymmetric latent factor model, and so on. The embodiments of the present invention place no limitation on this.
Further, this embodiment constructs an improved first recommendation system model that considers users' implicit feedback, given by equation (1), and a second recommendation system model that does not consider users' implicit feedback, given by equation (2):

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\Big(p_u + |N(u)|^{-\frac{1}{2}}\sum_{j \in N(u)} y_j\Big) \qquad (1)$$

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u \qquad (2)$$
In equations (1) and (2), $\hat{r}_{ui}$ denotes the predicted rating of product i by user u; $\mu$ denotes the average of all rating data in the rating data set; $b_u$ denotes the offset of user u relative to the average user rating; $b_i$ denotes the offset of product i relative to the average product rating; $q_i$ denotes the product factor vector; T denotes the transpose operator; and $p_u$ denotes the user factor vector.

Further, in equation (1), $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information. $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ are the unknown parameters of the recommendation system model.

For example, the cost function of the recommendation system model is obtained from the squared error between the predicted values given by the model and the rating data, together with the parameters of the model. Using the rating data of the layered data matrices obtained in step S201, the unknown parameters of the model are found by solving the optimization problem for this cost function; that is, the parameters of the recommendation system model are computed in parallel within each data layer, and the parameters of each data layer are used as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained. The optimal parameters are the optimal values of the model's unknown parameters. Specifically, in this embodiment, the cost functions for the first and second recommendation system models can be expressed as the first cost function of equation (3) and the second cost function of equation (4), respectively:

$$\sum_{(u,i)} \Big\{ \Big[ r_{ui} - \mu - b_u - b_i - q_i^{T}\Big(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\Big) \Big]^2 + \lambda_1 \big(b_u^2 + b_i^2\big) + \lambda_2 \Big(\|q_i\|^2 + \|p_u\|^2 + \sum_{j \in N(u)} \|y_j\|^2\Big) \Big\} \qquad (3)$$

$$\sum_{(u,i)} \Big\{ \big[ r_{ui} - \mu - b_u - b_i - q_i^{T} p_u \big]^2 + \lambda_1 \big(b_u^2 + b_i^2\big) + \lambda_2 \big(\|q_i\|^2 + \|p_u\|^2\big) \Big\} \qquad (4)$$

where $\|\cdot\|^2$ denotes the sum of the squares of all elements of a vector, and $\lambda_1$ and $\lambda_2$ are regularization factors. Those skilled in the art will understand that the optimization problems of equations (3) and (4) are solved in similar ways, so only one is described. As shown in FIG. 4, solving the optimization problem for equation (3) may include:
401. Compute the average $\mu$ of all rating data in the rating data set;
402. Compute the parameters of each data layer in turn, using parallel computation within each layer, and use the parameters computed for each data layer as the initial parameter values of the next data layer;
For example, in this embodiment, before the first computation, the initial values of the first data layer's parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$, and of $\lambda_1$ and $\lambda_2$, may be set randomly. For simplicity, the initial values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ may be set to 0 for scalar parameters and to the zero vector $\vec{0}$ for vector parameters (the arrow above the 0 is the vector symbol), while $\lambda_1$ and $\lambda_2$, which denote the regularization factors, may each be set to any relatively small positive value; embodiments of the present invention place no limitation on this.

For example, starting from the first data layer, the values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ are computed in parallel for each data layer in turn, and the computed values are used as the initial values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ for the next data layer, until the parameters of the last data layer have been computed. The specific computation is as follows:

The unknown parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ corresponding to the rating data in the first data layer are updated starting from the first layer's initial parameter values. Because the rating data in the same layer matrix do not depend on one another, the updates of the unknown parameters for different rating data in the same layer can run in parallel. The updated parameters then serve as the initial parameter values of the next data layer, whose parameters are updated in the same way, and so on until the parameters for the rating data in the last data layer have been updated, giving the parameters computed for the last data layer.

Preferably, in this embodiment, the updates may be performed by gradient descent. Because the update method is the same for the parameters of every rating datum, those skilled in the art will understand that once the update steps for the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ of one rating datum $r_{ui}$ have been described, the steps can be applied to the other rating data without creative effort. As shown in FIG. 4, the specific steps are as follows:
4021. From the initial parameter values of each data layer and the recommendation system model of equation (1), obtain the initial estimate $\hat{r}_{ui}$ of the rating datum $r_{ui}$, and then obtain the layer's rating error $e_{ui}$ from the layer's rating data and the initial estimate $\hat{r}_{ui}$;
4022. From the rating error $e_{ui}$, obtain the updated values of the layer's parameters in the negative gradient direction, as shown schematically in FIG. 5A, including:
[Figure imgf000023_0001: update equations (5) to (9)]
Here the symbol ← denotes the update operator: the variable on its left is replaced by the value computed on its right. In this embodiment, the parameters on the right of the update operator in equations (5) to (9) are the initial values of the corresponding parameters, and the parameters on the left are their updated values. $\gamma_1$ and $\gamma_2$ are the iteration step sizes. Preferably, in this embodiment, each may be a suitably chosen constant, or a quantity tied to the number of iterations so that the step size gradually decreases as the iterations accumulate, for example $\gamma_1 \cdot 0.96^{NumOfIter}$ and $\gamma_2 \cdot 0.96^{NumOfIter}$, where NumOfIter denotes the number of iterations. Embodiments of the present invention place no limitation on this.

Those skilled in the art will understand that equations (5) to (8) each update their parameter according to the corresponding formula, so the computation is not described further. Correspondingly, solving the optimization problem of equation (4) requires only the updates of equations (5) to (8), which yield the parameters $b_u$, $b_i$, $q_i$ and $p_u$.
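As an illustration of steps 4021 and 4022 for the second model, the following minimal sketch updates the parameters of a single rating with a decayed step size. The update rules inside `update_one_rating` are an assumption: they are the usual negative-gradient form for a regularized squared error and are not copied from equations (5) to (8). The function names and the sample values are hypothetical.

```python
import numpy as np


def decayed_step(gamma0, num_iter, decay=0.96):
    """Step size gamma0 * decay**num_iter, following the 0.96^NumOfIter example."""
    return gamma0 * decay ** num_iter


def update_one_rating(r_ui, mu, b_u, b_i, q_i, p_u, gamma1, gamma2, lam1, lam2):
    """One negative-gradient update for a single rating under model (2).

    Assumed rule: parameter += step * (error * partner - lambda * parameter).
    Every right-hand side uses the initial (pre-update) values.
    """
    # Step 4021: initial estimate and rating error (assumed e_ui = r_ui - estimate).
    e_ui = r_ui - (mu + b_u + b_i + q_i @ p_u)
    # Step 4022: move each parameter along the negative gradient.
    new_b_u = b_u + gamma1 * (e_ui - lam1 * b_u)
    new_b_i = b_i + gamma1 * (e_ui - lam1 * b_i)
    new_q_i = q_i + gamma2 * (e_ui * p_u - lam2 * q_i)
    new_p_u = p_u + gamma2 * (e_ui * q_i - lam2 * p_u)
    return new_b_u, new_b_i, new_q_i, new_p_u, e_ui


# Hypothetical values for one rating r_ui = 4.0 with global average mu = 3.0.
q0 = np.array([0.1, -0.2])
p0 = np.array([0.3, 0.1])
step = decayed_step(0.05, num_iter=3)
b_u1, b_i1, q1, p1, err_before = update_one_rating(
    4.0, 3.0, 0.0, 0.0, q0, p0, step, step, 0.02, 0.02
)
err_after = 4.0 - (3.0 + b_u1 + b_i1 + q1 @ p1)
print(abs(err_after) < abs(err_before))
```

Within one data layer, no two ratings share a user or a product, so calls like this for different ratings of the same layer touch disjoint parameters and could be dispatched in parallel.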
However, computing equation (9) would require updating $y_j$ serially, one rating at a time over the rating data in the rating data set, which lowers computational efficiency. Embodiments of the present invention therefore provide a computation method for equation (9), shown schematically in FIG. 5A, as follows:
When several $y_j$ must be computed within the same layer matrix, the gradients $\Delta y_j$ are computed in parallel over the users associated with the product j that $y_j$ represents in the rating data set, and all gradients $\Delta y_j$ for product j are then aggregated to obtain the layer's value of $y_j$. For example, if user 1, user 2 and user 6 have rated product 4 (j = 4) in the rating data set, the gradients $\Delta y_4^{(1)}$, $\Delta y_4^{(2)}$ and $\Delta y_4^{(6)}$ of product j for users 1, 2 and 6 can be computed in parallel, where

$$\Delta y_4^{(1)} = e_{11} \cdot 2^{-1/2} \cdot q_1 - \lambda_2 y_4, \quad \Delta y_4^{(2)} = e_{22} \cdot 3^{-1/2} \cdot q_2 - \lambda_2 y_4, \quad \Delta y_4^{(6)} = e_{64} \cdot |N(u_6)|^{-1/2} \cdot q_4 - \lambda_2 y_4;$$

the layer's updated value of $y_4$ is then obtained by $y_4 \leftarrow y_4^0 + \gamma_2\big(\Delta y_4^{(1)} + \Delta y_4^{(2)} + \Delta y_4^{(6)}\big)$, where $y_4^0$ is the initial value of $y_4$ for this layer's computation.

In addition, embodiments of the present invention provide another, equivalent computation method for equations (8) and (9), shown schematically in FIG. 5B, as follows: the expression $p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j$ in the first recommendation model is treated as one equivalent parameter, represented by an auxiliary variable $z_u$, that is,

$$z_u = p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j.$$

The first recommendation model is then equivalent to $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} \cdot z_u$, with parameters $b_u$, $b_i$, $q_i$ and $z_u$;
Then, from the gradient of the auxiliary variable, $\Delta z_u = 2e_{ui}\, q_i$, the update formula for $z_u$ is obtained as $z_u \leftarrow z_u^0 + \gamma_2 \cdot (2e_{ui}\, q_i - \lambda_2 z_u^0)$, where $z_u^0$ is the initial value of $z_u$ for this layer's computation.

With this method, the update of the auxiliary variable $z_u$ alone replaces the updates of the parameters $p_u$ and $y_j$. In addition, because $z_u$ has been introduced, the update of $q_i$ in equation (8) changes accordingly, with $z_u$ taking the place of $p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j$ and $q_i^0$ denoting the initial value of $q_i$ for this layer's computation. It follows that introducing the auxiliary variable $z_u$ both simplifies the computation and, by eliminating the computation of $y_j$, removes one inner loop from the parameter solution of the recommendation model, which greatly increases computation speed while maintaining the same accuracy.
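The equivalence behind the auxiliary variable can be checked numerically. The sketch below evaluates the first model once with $p_u$ and the $y_j$ vectors and once with $z_u$; the function names, the random test vectors and the example set N(u) are hypothetical.

```python
import numpy as np


def predict_with_y(mu, b_u, b_i, q_i, p_u, y, n_u):
    """Model (1): y maps product id -> factor vector, n_u is the non-empty set N(u)."""
    implicit = sum(y[j] for j in n_u) / np.sqrt(len(n_u))
    return mu + b_u + b_i + q_i @ (p_u + implicit)


def predict_with_z(mu, b_u, b_i, q_i, z_u):
    """Equivalent form of model (1) using the auxiliary variable z_u."""
    return mu + b_u + b_i + q_i @ z_u


rng = np.random.default_rng(0)
q_i = rng.normal(size=3)
p_u = rng.normal(size=3)
y = {1: rng.normal(size=3), 4: rng.normal(size=3)}
n_u = [1, 4]  # products for which this user gave an implicit preference

# z_u collects p_u and the scaled sum of y_j into a single vector.
z_u = p_u + sum(y[j] for j in n_u) / np.sqrt(len(n_u))

full = predict_with_y(3.0, 0.1, -0.2, q_i, p_u, y, n_u)
short = predict_with_z(3.0, 0.1, -0.2, q_i, z_u)
print(np.isclose(full, short))
```

Training on $z_u$ directly means the inner loop over N(u) is no longer needed when predicting or updating.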
4023. Use the parameters computed for this data layer as the initial parameter values of the next data layer, and obtain the parameters computed for the next data layer according to steps 4021 and 4022, until the parameters computed for the last data layer are obtained.
403: Determine from the parameters computed for the last data layer whether the recommendation system model has converged. If it has converged, the computation ends and the optimal parameters are obtained; if it has not, use the parameters computed for the last data layer as the initial parameter values of the first data layer and repeat steps 402 and 403.

For example, the last-layer parameters obtained in the current pass and those obtained in the previous pass are each substituted into the cost function of equation (3) or equation (4). If the difference between the two results is not greater than a preset threshold, the last-layer parameters obtained in the current pass can be taken as the optimal parameters. If the difference is greater than the preset threshold, the parameters obtained in the current pass are considered not to have converged, and they are used as the initial parameter values of the first data layer in the next pass, so that the computation of the model parameters continues.
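The convergence test of step 403 can be sketched as follows for the second model. The cost is written in the form of a regularized squared error as in equation (4), with the regularization terms assumed to be added once per rating. The names `cost_4` and `has_converged`, the dictionary layout of the parameters and the sample numbers are hypothetical.

```python
import numpy as np


def cost_4(ratings, mu, b_user, b_item, q, p, lam1, lam2):
    """Regularized squared error in the form of cost function (4).

    ratings: iterable of (u, i, r_ui); b_user, b_item: dicts of floats;
    q, p: dicts mapping product or user ids to NumPy vectors.
    """
    total = 0.0
    for u, i, r_ui in ratings:
        err = r_ui - (mu + b_user[u] + b_item[i] + q[i] @ p[u])
        total += err ** 2
        total += lam1 * (b_user[u] ** 2 + b_item[i] ** 2)
        total += lam2 * (q[i] @ q[i] + p[u] @ p[u])
    return float(total)


def has_converged(cost_previous, cost_current, threshold):
    """Step 403: converged when two successive costs differ by at most the threshold."""
    return abs(cost_previous - cost_current) <= threshold


# Hypothetical single-rating example.
ratings = [(1, 1, 4.0)]
value = cost_4(
    ratings,
    mu=3.0,
    b_user={1: 1.0},
    b_item={1: 0.0},
    q={1: np.zeros(2)},
    p={1: np.zeros(2)},
    lam1=0.5,
    lam2=0.5,
)
print(value)
print(has_converged(10.0, 9.9995, threshold=1e-3))
```

An outer loop would run all layers once (step 402), evaluate the cost, and stop as soon as `has_converged` returns true for two successive passes.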
S203: Obtain each user's predicted rating for each product from the optimal parameters and the recommendation system model, and recommend products to users according to the predicted ratings.

For example, in this embodiment, substituting the optimal parameters into equation (1) or equation (2) gives each user's predicted rating for each product. The predicted ratings of one user for all products can then be ranked, and a preset number of products with the highest predicted ratings recommended to that user.

S204: After products have been recommended, the new rating data that users give to the products are fed into the recommendation system, so that the recommendation system can update the parameters of the recommendation system model in real time and thereby keep its high-accuracy, high-efficiency recommendations up to date.

This embodiment provides a recommendation method that improves the recommendation efficiency of a recommendation system in massive-data environments through parallel computation, and improves the quality of its recommendations by taking users' implicit feedback into account.
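The ranking in step S203 can be sketched in a few lines. The function name, the product identifiers and the predicted values below are hypothetical.

```python
def recommend_top_n(predicted, n):
    """Return the n products with the highest predicted rating for one user.

    predicted: dict mapping product id -> predicted rating.
    """
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return [product for product, _ in ranked[:n]]


# Hypothetical predicted ratings of one user for three products.
predictions = {"a": 3.2, "b": 4.8, "c": 4.1}
print(recommend_top_n(predictions, 2))  # ['b', 'c']
```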
An embodiment of the present invention provides a recommendation device 60, as shown in FIG. 6, including:

a data placement unit 601, configured to place the rating data in the rating data set into at least two data layers, where each rating datum corresponds to one user and one product, and any two rating data in the same data layer correspond to different users and to different products;

a parallel computing unit 602, configured to compute the parameters of the recommendation system model in each data layer in parallel, according to the preset recommendation system model and the rating data in the data layers, and to use the parameters of each data layer as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained; where the recommendation system model is the correspondence between each user's predicted rating for each product and the average rating and the parameters of the recommendation system model;

a prediction and recommendation unit 603, configured to obtain each user's predicted rating for each product from the optimal parameters and the recommendation system model, and to recommend products to users according to the predicted ratings.
For example, the rating data set may be obtained from users' ratings of products, or from users' browsing and purchase records. Both explicit feedback from users on products and implicit feedback about user preferences can be obtained in this way; embodiments of the present invention place no limitation on this. Preferably, the rating data set is obtained from users' ratings of products. Those skilled in the art will understand that a user's rating of a product not only gives explicit feedback on the user's evaluation of the product, but also, through the act of rating, gives implicit feedback on the user's preference for the product. Preferably, in this embodiment, the rating data set can be represented as a matrix in which different rows represent different users and different columns represent different products.

Further, each rating datum corresponds to one user and one product. Those skilled in the art will understand that both the data layers holding the rating data and the rating data set itself can be represented as matrices, in which different rows represent different users and different columns represent different products. The rating data in the rating data set can therefore be placed into at least two data layer matrices, each with the same number of rows and the same number of columns as the matrix of the rating data set. When any two rating data in the same data layer correspond to different users and different products, that is, when no two rating data in a data layer matrix share a row or a column, the rating data in that layer do not depend on one another, so the rating data in the same data layer can be processed in parallel. Embodiments of the present invention do not limit the specific placement steps; any placement method that ensures that no two rating data in a data layer matrix share a row or a column falls within the protection scope of the embodiments of the present invention;
Further, in this embodiment, the data placement unit 601 may be used to implement the placement method shown in FIG. 3, including:

The data placement unit 601 selects one rating datum from the rating data set and places it in the first data layer matrix at the position of that rating's user and product; at this point the number of the last layer is $l_{max} = 1$. This embodiment does not restrict how the rating is selected; subsequent selections from the rating data set are made in the same way as the first and are not described again.

The data placement unit 601 selects the next rating datum and, going from the first data layer matrix to the matrix of the last layer, checks in turn against all rating data in each matrix whether the row and the column of the rating's position in that layer matrix both contain no rating data. If so, the rating is placed in the first data layer matrix that satisfies the condition. If no matrix up to the last layer $l_{max}$ satisfies it, the last layer number is updated as $l_{max} \leftarrow l_{max} + 1$, where ← means that the datum on the left is updated to the datum on the right (likewise below), and the rating is placed in the updated last data layer.

The data placement unit 601 repeats this placement process for the remaining rating data in the rating data set in turn, until all rating data in the rating data set have been placed.
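The placement procedure above can be sketched as a first-fit pass over the ratings. The function name `place_first_fit`, the tuple layout `(user, product, rating)` and the four sample ratings are hypothetical; the order in which ratings are visited is left open by the embodiment and is simply the list order here.

```python
def place_first_fit(ratings):
    """Place each rating in the first layer with no rating in its row or column.

    ratings: list of (user, product, rating) tuples.
    Returns a list of layers, each a list of tuples in placement order.
    """
    layers = []  # each layer tracks the users and products it already holds
    for user, product, value in ratings:
        for layer in layers:
            if user not in layer["users"] and product not in layer["products"]:
                break  # first layer whose row and column are both free
        else:
            # No existing layer fits: open a new last layer (l_max <- l_max + 1).
            layer = {"users": set(), "products": set(), "items": []}
            layers.append(layer)
        layer["users"].add(user)
        layer["products"].add(product)
        layer["items"].append((user, product, value))
    return [layer["items"] for layer in layers]


sample = [(1, 1, 3), (1, 4, 2), (2, 2, 2), (2, 4, 2)]
for number, items in enumerate(place_first_fit(sample), start=1):
    print(number, items)
```

In the sample, the second rating shares user 1 with the first and opens layer 2, the third fits into layer 1, and the fourth conflicts with both existing layers and opens layer 3.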
For example, let the rating data be given by matrix A:

$$A = \begin{pmatrix} 3 & * & * & 2 & * & * \\ * & 2 & * & 2 & 1 & * \\ 4 & * & 5 & * & 5 & 2 \\ * & * & 2 & * & 3 & * \\ * & 1 & * & * & * & 5 \\ 1 & 2 & * & 3 & * & * \\ * & * & 3 & 5 & 2 & * \end{pmatrix}$$

The data placement unit 601 can apply the placement method shown in FIG. 3A. For example, the specific operation is as follows: select one rating datum $r_{ui}$ from the rating data set and put it in the first layer. Then select a second rating datum $r_{u'i'}$ and check whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product; when $u' \neq u$ and $i' \neq i$, the rating $r_{u'i'}$ is put in the first layer, otherwise it is put in the second layer. Then take a third rating datum $r_{u''i''}$. If its user $u''$ and product $i''$ both differ from the users and products already placed in the first layer, put it in the first layer; otherwise compare it with the users and products of the rating data in the second layer: when $u'' \neq u_{l_2}$ and $i'' \neq i_{l_2}$, put it in the second layer, otherwise put it in the third layer, where $u_{l_2}$ denotes a user in the second layer and $i_{l_2}$ a product in the second layer. Continue in this way until all rating data in the set have been placed. Five data layers are obtained in turn; from the first data layer to the last, which is the fifth data layer, they are:

$l_1$ (Layer 1): $(u_1, i_1, 3)$, $(u_2, i_2, 2)$, $(u_3, i_3, 5)$, $(u_4, i_5, 3)$, $(u_5, i_6, 5)$, $(u_6, i_4, 3)$
$l_2$ (Layer 2): $(u_1, i_4, 2)$, $(u_2, i_5, 1)$, $(u_3, i_1, 4)$, $(u_4, i_3, 2)$, $(u_5, i_2, 1)$
$l_3$ (Layer 3): $(u_2, i_4, 2)$, $(u_3, i_5, 5)$, $(u_6, i_1, 1)$, $(u_7, i_3, 3)$
$l_4$ (Layer 4): $(u_3, i_6, 2)$, $(u_6, i_2, 2)$, $(u_7, i_4, 5)$
$l_5$ (Layer 5): $(u_7, i_5, 2)$

In matrix form, the layers can be represented as:

[Figures imgf000028_0001 and imgf000029_0001: the layer matrices $l_1$ to $l_5$]
where a * in a matrix indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5$.

Further, in this embodiment, the data placement unit 601 may also be used to implement the placement method shown in FIG. 3B, including:

The data placement unit 601 selects one rating datum from the rating data set and places it in the first data layer at the position of that rating's user and product; at this point the number of the last layer is $l_{max} = 1$.

Starting from the second rating datum selected from the rating data set, the data placement unit 601 compares the rating with each data layer in turn, from the data layer with the last layer number down to the first data layer. The comparison continues as long as the row and the column of the rating's position in the layer contain no other rating data, and stops at the first layer where they do; the rating is placed at the corresponding position of the last layer that satisfied the condition. If no layer satisfies it, because the last layer itself already holds a rating in that row or column, the last layer number is updated as $l_{max} \leftarrow l_{max} + 1$ and the rating is placed in the data layer with the updated last layer number.

The data placement unit 601 repeats the placement process described for the second rating datum for the remaining rating data in the rating data set in turn, until all rating data in the rating data set have been placed.
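A minimal sketch of this backward-scanning placement follows, under the reading that the scan stops at the first conflicting layer and the rating goes into the layer just above it. The function name `place_backward_scan`, the tuple layout and the sample ratings are hypothetical.

```python
def place_backward_scan(ratings):
    """Scan layers from the last to the first; stop at the first conflict.

    ratings: list of (user, product, rating) tuples.
    Returns a list of layers, each a list of tuples in placement order.
    """
    layers = []
    for user, product, value in ratings:
        target = 0  # no conflict anywhere: the rating sinks to the first layer
        for index in range(len(layers) - 1, -1, -1):
            layer = layers[index]
            if user in layer["users"] or product in layer["products"]:
                target = index + 1  # place just above the conflicting layer
                break
        if target == len(layers):
            # The last layer conflicts (or no layer exists yet): open a new one.
            layers.append({"users": set(), "products": set(), "items": []})
        layers[target]["users"].add(user)
        layers[target]["products"].add(product)
        layers[target]["items"].append((user, product, value))
    return [layer["items"] for layer in layers]


sample = [(1, 1, 3), (1, 4, 2), (2, 2, 2), (2, 4, 2), (2, 5, 1)]
for number, items in enumerate(place_backward_scan(sample), start=1):
    print(number, items)
```

Unlike the first-fit pass, a rating here is never placed below a layer that already holds a rating of the same user or product, so this variant tends to produce more layers; the last sample rating, for instance, opens layer 4 although layer 2 would have room for it.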
For example, still taking matrix A, the placement method shown in FIG. 3B proceeds as follows: select one rating datum $r_{ui}$ from the rating data set and put it in the first layer. Then select a second rating datum $r_{u'i'}$ and check whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product; when $u' \neq u$ and $i' \neq i$, the rating $r_{u'i'}$ is put in the first layer, otherwise it is put in the second layer. Then take a third rating datum $r_{u''i''}$ and compare it with the rating data already placed in the second layer (if there is a second layer). If its user $u''$ and product $i''$ both differ from the user $u'$ and product $i'$ placed in the second layer, continue by comparing it with the rating data already placed in the first layer: when it also satisfies $u'' \neq u$ and $i'' \neq i$ with respect to the first layer, put it in the first layer, otherwise put it in the second layer. When instead the user $u''$ and product $i''$ satisfy either $u'' = u'$ or $i'' = i'$ with respect to the second layer, put the rating directly in the third layer. Continue in this way until all rating data in the rating data set have been placed. Seven data layers are obtained in turn; from the first data layer to the last, which is the seventh data layer, they are:

$l_1$ (Layer 1): $(u_1, i_1, 3)$, $(u_2, i_2, 2)$
$l_2$ (Layer 2): $(u_1, i_4, 2)$, $(u_3, i_1, 4)$, $(u_5, i_2, 1)$
$l_3$ (Layer 3): $(u_2, i_4, 2)$, $(u_3, i_3, 5)$, $(u_6, i_1, 1)$
$l_4$ (Layer 4): $(u_2, i_5, 1)$, $(u_4, i_3, 2)$, $(u_6, i_2, 2)$
$l_5$ (Layer 5): $(u_3, i_5, 5)$, $(u_6, i_4, 3)$, $(u_7, i_3, 3)$
$l_6$ (Layer 6): $(u_3, i_6, 2)$, $(u_4, i_5, 3)$, $(u_7, i_4, 5)$
$l_7$ (Layer 7): $(u_5, i_6, 5)$, $(u_7, i_5, 2)$

In matrix form, the layers are represented as:
[Figure imgf000030_0001: the layer matrices $l_1$ to $l_7$]

where * indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5 + l_6 + l_7$.

For example, the preset system model may be a latent factor model that considers users' implicit feedback, or a latent factor model based only on explicit feedback, and may also include a recommendation system model that considers spatio-temporal characteristics, an asymmetric latent factor model, and so on; embodiments of the present invention place no limitation on this.
Further, in this embodiment, an improved first recommendation system model that considers users' implicit feedback is constructed, as in equation (1), together with a second recommendation system model that does not consider users' implicit feedback, as in equation (2).

In equations (1) and (2), $\hat{r}_{ui}$ denotes the predicted rating of product i by user u; $\mu$ denotes the average of all rating data in the rating data set; $b_u$ denotes the offset of user u relative to the average user rating; $b_i$ denotes the offset of product i relative to the average product rating; $q_i$ denotes the product factor vector; T denotes the transpose operator; and $p_u$ denotes the user factor vector.

Further, in equation (1), $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information. $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ are the unknown parameters of the recommendation system model.

For example, the recommendation device 60 may further include a cost function generation unit 604, configured to obtain the cost function of the recommendation system model from the squared error between the predicted values given by the model and the rating data, together with the parameters of the model. Using the rating data of the layered data matrices obtained by the data placement unit 601, the unknown parameters of the model are found by solving the optimization problem for this cost function; that is, the parameters of the recommendation system model are computed in parallel within each data layer, and the parameters of each data layer are used as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained. The optimal parameters are the optimal values of the model's unknown parameters. Specifically, in this embodiment, the cost functions for the first and second recommendation system models can be expressed as equation (3) and equation (4), respectively, where $\|\cdot\|^2$ denotes the sum of the squares of all elements of a vector, and $\lambda_1$ and $\lambda_2$ are regularization factors.

Those skilled in the art will understand that the parallel computing unit 602 solves the optimization problems of equations (3) and (4) according to the above models, and that the two solution processes are similar, so only one is described. Specifically, taking equation (3) as an example, as shown in FIG. 7, the parallel computing unit 602 may include:
an average score calculation sub-unit 6021, configured to calculate the average of all rating data in the rating data set;
a layered calculation sub-unit 6022, configured to compute the parameters of each data layer in turn by parallel computation, and to use the parameters computed for each data layer as the initial parameter values for the next data layer.
Exemplarily, in this embodiment, before the first calculation the initial values of the first-layer parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ and of the two regularization factors may be set randomly. For simplicity, the initial values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ may be set to 0 for scalar parameters and to the zero vector $\vec{0}$ for vector parameters (the arrow above the digit 0 marks a vector), and the two regularization factors may each be set to an arbitrary small positive value; the embodiments of the present invention place no limitation on this.
Exemplarily, starting from the first data layer, the values of the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ of each data layer are computed in parallel, layer by layer, and the computed values are used as the initial values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ for the next data layer, until the parameters of the last data layer have been computed. The specific calculation process is as follows.
Starting from the initial parameter values of the first data layer, the layered calculation sub-unit 6022 updates the unknown parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ corresponding to the rating data contained in the first data layer. Because the rating data within one layer matrix do not depend on one another, the updates for different rating data of the same layer can be carried out in parallel. The updated parameters are then used as the initial parameter values for the next data layer, the parameters corresponding to its rating data are updated, and so on, until the parameters corresponding to the rating data of the last data layer have been updated. This yields the parameters computed for the last data layer.
Preferably, in this embodiment, the update may be performed by gradient descent. Because the update method is the same for all rating data, those skilled in the art will understand that, once the update steps for the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ corresponding to one rating $r_{ui}$ have been described, they can be applied to the other rating data without creative effort. The calculation steps of the layered calculation sub-unit 6022 for equation (1) may be as follows.
First, an initial estimate $\hat{r}_{ui}$ of the rating $r_{ui}$ is obtained from the initial parameter values of the data layer and the recommendation system model of equation (1), and the rating error $e_{ui}$ of that data layer is then obtained from the rating data and the initial estimate $\hat{r}_{ui}$.
Next, the updated parameter values along the negative gradient direction are obtained from the rating error $e_{ui}$ and equations (5) to (9). Those skilled in the art will understand that each of equations (5) to (8) updates its parameter directly from the corresponding formula; the calculation and the meaning of the symbols in those formulas are not repeated here. Correspondingly, solving the optimization problem of equation (4) only requires the updates of equations (5) to (8), which yield the parameters $b_u$, $b_i$, $q_i$ and $p_u$.
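The per-rating update and the sweep over one data layer can be sketched as follows. This is a minimal sketch, not the patented formulas: it assumes the second model in the form $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$ and the conventional regularized gradient step, because the exact expressions of equations (5) to (8) are not reproduced in this text. The names `predict`, `update_rating`, `sweep_layer`, the learning rate `gamma` and the regularization factor `lam` are hypothetical, and any constant factor of the gradient is assumed to be absorbed into `gamma`.

```python
def predict(mu, b_u, b_i, q_i, p_u):
    """Predicted rating mu + b_u + b_i + q_i . p_u (assumed form of the second model)."""
    return mu + b_u + b_i + sum(q * p for q, p in zip(q_i, p_u))


def update_rating(r_ui, mu, b_u, b_i, q_i, p_u, gamma=0.05, lam=0.02):
    """One gradient step for a single rating, starting from the given initial values."""
    e_ui = r_ui - predict(mu, b_u, b_i, q_i, p_u)  # rating error
    new_b_u = b_u + gamma * (e_ui - lam * b_u)
    new_b_i = b_i + gamma * (e_ui - lam * b_i)
    # Both factor vectors are updated from their old (initial) values.
    new_q_i = [q + gamma * (e_ui * p - lam * q) for q, p in zip(q_i, p_u)]
    new_p_u = [p + gamma * (e_ui * q - lam * p) for q, p in zip(q_i, p_u)]
    return new_b_u, new_b_i, new_q_i, new_p_u


def sweep_layer(layer, mu, b_user, b_item, q, p, gamma=0.05, lam=0.02):
    """Update the parameters for every rating of one data layer, in place.

    `layer` is a list of (user, product, rating). No two entries share a user
    or a product, so the iterations touch disjoint parameters and could be
    dispatched to parallel workers; they are run sequentially here.
    """
    for u, i, r in layer:
        b_user[u], b_item[i], q[i], p[u] = update_rating(
            r, mu, b_user[u], b_item[i], q[i], p[u], gamma, lam)
```

Calling `sweep_layer` on the first layer, then on the second with the updated dictionaries, and so on, mirrors the layer-by-layer hand-over of initial values described above.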
However, the calculation of equation (9) would otherwise require $y_j$ to be computed serially over the rating data in the rating data set, which reduces computational efficiency. The embodiments of the present invention therefore provide a calculation method for equation (9), implemented as follows. When several $y_j$ need to be computed within the same layer matrix, the gradients $\Delta y_j$ are computed in parallel over the users in the rating data set that correspond to the product j represented by $y_j$. For example, if user 1, user 2 and user 6 have rated product 4 (j = 4) in the rating data set, the gradients $\Delta y_4^{(1)}$, $\Delta y_4^{(2)}$ and $\Delta y_4^{(6)}$ of product j for user 1, user 2 and user 6 can be computed in parallel, each from the rating errors of the corresponding user. The gradients are then aggregated to obtain the value of $y_j$ for that layer, i.e.,
Figure imgf000033_0001
where $y_j^0$ denotes the initial value of $y_j$.
In addition, the embodiments of the present invention provide another, equivalent calculation method for equations (8) and (9), implemented as follows. The expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$ in the first recommendation model is treated as one equivalent parameter, and an auxiliary variable $z_u$ is used to denote it, i.e.,
Figure imgf000034_0001
where the auxiliary variable $z_u$ is equivalent to the expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$. The first recommendation model is then equivalent to
$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \cdot z_u$, whose parameters are $b_u$, $b_i$, $q_i$ and $z_u$;
Then, from the gradient of the auxiliary variable $z_u$, $\Delta z_u = 2e_{ui} \ldots$, the update formula of the auxiliary variable is obtained as $z_u \leftarrow z_u^0 + \gamma_2 \cdot (2e_{ui} \ldots - \ldots)$, where $z_u^0$ denotes the initial value of $z_u$ in that layer's calculation.
With this method, the update of the parameters $p_u$ and $y_j$ can be replaced by the update of the auxiliary variable $z_u$ alone. Moreover, because the auxiliary variable $z_u$ has been introduced, the update of the parameter $q_i$ in equation (8) changes accordingly: the update formula of $q_i$ becomes $q_i \leftarrow q_i^0 + \gamma_2 \cdot (\ldots - \ldots)$, where $q_i^0$ denotes the initial value of $q_i$ in that layer's calculation.
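The role of the auxiliary variable can be made concrete with a short derivation. This is a sketch under stated assumptions: the first model is taken in the form $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T z_u$, the superscript 0 marks values at the start of a layer's calculation, and the regularization term with coefficient $\lambda$ is an assumption, since the corresponding terms of the original update formulas are not legible in this text.

```latex
\begin{aligned}
z_u &= p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j, \qquad
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} z_u, \qquad
e_{ui} = r_{ui} - \hat{r}_{ui}, \\
-\frac{\partial e_{ui}^{2}}{\partial z_u} &= 2\, e_{ui}\, q_i, \qquad
-\frac{\partial e_{ui}^{2}}{\partial q_i} = 2\, e_{ui}\, z_u, \\
z_u &\leftarrow z_u^{0} + \gamma_2 \left( 2\, e_{ui}\, q_i^{0} - \lambda\, z_u^{0} \right), \qquad
q_i \leftarrow q_i^{0} + \gamma_2 \left( 2\, e_{ui}\, z_u^{0} - \lambda\, q_i^{0} \right).
\end{aligned}
```

Under these assumptions a single step on $z_u$ stands in for the separate steps on $p_u$ and on every $y_j$ with $j \in N(u)$, which is where the loop over $N(u)$ disappears.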
It follows that introducing the auxiliary variable $z_u$ simplifies the calculation and, because the calculation of $y_j$ is eliminated, removes one inner loop from the parameter solution of the recommendation model, which greatly increases the computation speed while preserving the same accuracy.
The updated parameter values of the first data layer are thus obtained. They are then used as the initial parameter values of the second data layer, whose updated values are computed by the same method, and so on, until the updated parameter values of the last data layer are obtained.
a convergence determination sub-unit 6023, configured to determine, from the parameters computed for the last data layer, whether the recommendation system model has converged. If it has converged, the calculation ends and the optimal parameters are obtained. If it has not, the parameters computed for the last data layer are used as the initial parameter values of the first data layer, and the layered calculation sub-unit 6022 performs the next layered calculation.
Preferably, the convergence determination sub-unit 6023 may substitute both the parameters computed for the last data layer in the current pass and those computed for the last data layer in the previous pass into the cost function of equation (3) or equation (4). If the difference between the two results is not greater than a preset threshold, the parameters from the current pass are taken as the optimal parameters. If the difference is greater than the preset threshold, the parameters from the current pass are regarded as not converged and are used as the initial parameter values of the first data layer for the next layered calculation by the layered calculation sub-unit 6022, which continues computing the parameters of the recommendation system model.
Exemplarily, the prediction recommendation unit 603 substitutes the obtained optimal parameters into equation (1) or equation (2) to obtain the predicted rating of each user for each product. The predicted ratings of one user for all products can then be sorted, and a preset number of products with the highest predicted ratings recommended to that user.
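The convergence test and the final ranking step can be sketched as follows. This is a minimal sketch: the function names, the default threshold and the use of the absolute difference between the two cost values are assumptions made for illustration, and the cost values themselves are presumed to come from equation (3) or equation (4).

```python
def has_converged(cost_now, cost_prev, threshold=1e-4):
    """True when two consecutive cost values differ by no more than the threshold."""
    return abs(cost_now - cost_prev) <= threshold


def recommend_top_n(predicted, n):
    """Return the n products with the highest predicted rating for one user.

    `predicted` maps a product identifier to that user's predicted rating.
    """
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return [product for product, _ in ranked[:n]]
```

For example, `recommend_top_n({"i1": 3.2, "i2": 4.7, "i3": 1.1, "i4": 4.1}, 2)` selects `"i2"` and `"i4"`.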
Exemplarily, as shown in FIG. 7, the recommendation device 60 may further include: a feedback unit 605, configured to feed the new rating data that users give to products after a recommendation back into the recommendation system, so that the system can update the parameters of the recommendation system model in real time and keep its recommendations both accurate and efficient.
This embodiment provides a recommendation device 60 that improves the recommendation efficiency of a recommendation system in massive-data environments through parallel computing, and improves the quality of its recommendations by taking users' implicit feedback into account.
This embodiment provides a recommendation device 60 which, as shown in FIG. 8, includes at least one processor 801, a memory 802, and at least one communication bus 803 for connecting these components and enabling communication between them, wherein:
The communication bus 803 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 803 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration it is drawn as a single thick line in FIG. 8, but this does not mean that there is only one bus or one type of bus.
The memory 802 is configured to store executable program code and the processing results of the processor 801; the program code includes computer operating instructions. The memory 802 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The processor 801 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The processor 801 is configured to execute the executable program code stored in the memory 802, for example a computer program, so as to run the program corresponding to the executable code. The processor 801 is configured to:
place the rating data in the rating data set into at least one data layer, where each rating datum corresponds one-to-one to a user and a product, and any two rating data in the same data layer correspond to different users and different products;
compute the parameters of the recommendation system model in the data layers in parallel, according to the preset recommendation system model and the rating data in the data layers, using the parameters of each data layer as the initial values for the next data layer, until the optimal parameters of the recommendation system model are obtained, where the recommendation system model is the correspondence between each user's predicted rating for each product and the average score and the parameters of the recommendation system model;
and obtain each user's predicted rating for each product from the optimal parameters and the recommendation system model, and recommend products to users according to the predicted ratings.
Exemplarily, the rating data set may be obtained from users' ratings of products, or from users' browsing and purchase records. Both explicit feedback on products and implicit feedback about user preferences can be obtained, and the embodiments of the present invention place no limitation on this. Preferably, the rating data set is obtained from users' ratings of products: as those skilled in the art will understand, a user's rating not only gives explicit feedback on the product but, through the act of rating, also gives implicit feedback about the user's preference. Preferably, in this embodiment, the rating data set is represented as a matrix whose rows represent different users and whose columns represent different products.
Further, each rating datum corresponds one-to-one to a user and a product. Those skilled in the art will understand that the data layers in which the rating data are placed can be represented as matrices, just like the rating data set, with rows representing users and columns representing products. The processor 801 may therefore place the rating data in the rating data set into at least two data layer matrices, each with the same number of rows and columns as the matrix of the rating data set. When any two rating data in a data layer correspond to different users and different products, that is, when no two rating data in a data layer matrix share a row or a column, the rating data within that layer do not depend on one another and can therefore be processed in parallel. The embodiments of the present invention place no limitation on the specific placement steps; any placement method under which no two rating data in a data layer matrix share a row or a column falls within the scope of protection of the embodiments.
Optionally, in this embodiment, the processor 801 may implement the placement method shown in FIG. 3A, which includes the following.
The processor 801 selects one rating datum from the rating data set and places it in the first data layer matrix at the position of its user and product; the last-layer number is then set to $l_{max} = 1$. This embodiment does not limit the manner of selection, and subsequent data are selected from the rating data set in the same manner as the first.
The processor 801 selects the next rating datum and compares it with the data layer matrices in order, from the first layer to the last, checking whether both the row and the column of its position in that matrix contain no rating data. The datum is placed in the first matrix that satisfies this condition. If no matrix up to layer $l_{max}$ satisfies it, the last-layer number is updated to $l_{max} \leftarrow l_{max} + 1$, where the symbol $\leftarrow$ means that the value on its left is updated to the value on its right (likewise below), and the datum is placed in the new last data layer.
The processor 801 repeats this placement process for the remaining rating data in the rating data set until all rating data have been placed.
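The first-fit placement described for FIG. 3A can be sketched as follows. This is a minimal sketch: the function name `place_first_fit` and the tuple layout `(user, product, rating)` are hypothetical, and the layers are kept as lists of entries rather than as full matrices.

```python
def place_first_fit(ratings):
    """Place each rating in the first layer that has no entry for its user or product.

    `ratings` is an iterable of (user, product, rating) tuples.
    Returns a list of layers; each layer is a list of such tuples in which
    no two entries share a user or a product.
    """
    layers = []  # layers[k] holds the entries of layer k + 1
    used = []    # used[k] = (users, products) already present in layers[k]
    for u, i, r in ratings:
        for k, (users, products) in enumerate(used):
            if u not in users and i not in products:
                break  # first layer whose row and column are both free
        else:
            # No existing layer fits: open a new last layer.
            layers.append([])
            used.append((set(), set()))
            k = len(layers) - 1
        layers[k].append((u, i, r))
        used[k][0].add(u)
        used[k][1].add(i)
    return layers
```

With the small illustrative input `[("u1", "i1", 3), ("u1", "i4", 2), ("u2", "i2", 2), ("u2", "i4", 2), ("u2", "i5", 1)]`, the function yields three layers: the first holds the ratings of (u1, i1) and (u2, i2), the second those of (u1, i4) and (u2, i5), and the third that of (u2, i4).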
For example, the rating data are as shown in matrix A:
3 * * 2 * *
2 2 1  2 2 1
4 5 5 2  4 5 5 2
A 2 3  A 2 3
1 5  1 5
1 2 3  1 2 3
3 5 2
Exemplarily, with the placement method shown in FIG. 3A, the procedure is as follows. One rating datum $r_{ui}$ is selected from the rating data set and placed in the first layer. A second rating datum $r_{u'i'}$ is then selected, and it is checked whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product: if $u' \neq u$ and $i' \neq i$, the datum is placed in the first layer; otherwise it is placed in the second layer. A third rating datum $r_{u''i''}$ is then taken. If its user $u''$ and product $i''$ both differ from the users and products already placed in the first layer, it is placed in the first layer. Otherwise it is compared with the users and products of the rating data in the second layer: if $u'' \neq u_2$ and $i'' \neq i_2$, it is placed in the second layer, and otherwise in the third layer, where $u_2$ denotes a user in the second layer and $i_2$ a product in the second layer. This continues until all rating data in the set have been placed. Five data layers are obtained in turn; from the first data layer to the last, that is, the fifth data layer, they are:
$l_1$ (Layer 1): $(u_1, i_1, 3)$, $(u_2, i_2, 2)$, $(u_3, i_3, 5)$, $(u_4, i_5, 3)$, $(u_5, i_6, 5)$, $(u_6, i_4, 3)$
$l_2$ (Layer 2): $(u_1, i_4, 2)$, $(u_2, i_5, 1)$, $(u_3, i_1, 4)$, $(u_4, i_3, 2)$, $(u_5, i_2, 1)$
$l_3$ (Layer 3): $(u_2, i_4, 2)$, $(u_3, i_5, 5)$, $(u_6, i_1, 1)$, $(u_7, \ldots)$
$l_4$ (Layer 4): …
$l_5$ (Layer 5): $(u_7, i_5, 2)$
In matrix form, this can be represented as:
Figure imgf000038_0001
Figure imgf000039_0001
Here, * in the matrices indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5$.
Further, in this embodiment, the processor 801 may also implement the placement method shown in FIG. 3B, which includes the following.
The processor 801 selects one rating datum from the rating data set and places it in the first data layer at the position of its user and product; the last-layer number is then $l_{max} = 1$;
Starting from the second rating datum selected from the rating data set, the processor 801 compares each rating datum with the data layers in order, from the layer numbered $l_{max}$ down to the first data layer, and finds the last layer in that downward scan for which the row and the column of the datum's position contain no other rating data; the datum is placed at the corresponding position in that layer. If no layer from the last down to the first satisfies the condition, the last-layer number is updated to $l_{max} \leftarrow l_{max} + 1$ and the datum is placed in the data layer with the updated number.
The processor 801 repeats this placement process for the remaining data in the rating data set until all rating data have been placed.
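One reading of the FIG. 3B placement can be sketched as follows. This is a sketch under an assumption: the downward scan is taken to stop at the first layer that already contains the datum's user or product, so that a rating is never placed below an earlier rating it conflicts with. The function name `place_top_down` and the tuple layout `(user, product, rating)` are hypothetical.

```python
def place_top_down(ratings):
    """Scan layers from last to first and place each rating in the lowest
    layer reached before a conflict; open a new layer if the last one conflicts.

    `ratings` is an iterable of (user, product, rating) tuples.
    """
    layers = []  # layers[k] holds the entries of layer k + 1
    used = []    # used[k] = (users, products) already present in layers[k]
    for u, i, r in ratings:
        target = None
        for k in range(len(layers) - 1, -1, -1):
            users, products = used[k]
            if u in users or i in products:
                break       # conflict: the rating must stay above this layer
            target = k      # this layer is free; keep scanning downward
        if target is None:
            layers.append([])
            used.append((set(), set()))
            target = len(layers) - 1
        layers[target].append((u, i, r))
        used[target][0].add(u)
        used[target][1].add(i)
    return layers
```

With the small illustrative input `[("u1", "i1", 3), ("u1", "i4", 2), ("u2", "i2", 2), ("u2", "i4", 2), ("u2", "i5", 1)]` this yields four layers, one more than a first-fit placement of the same data, because the last rating conflicts with the layer directly below the top and therefore opens a new layer.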
For example, again taking matrix A, with the placement method shown in FIG. 3B the procedure is as follows. One rating datum $r_{ui}$ is selected from the rating data set and placed in the first layer. A second rating datum $r_{u'i'}$ is then selected, and it is checked whether $u'$ and $u$ are the same user and whether $i'$ and $i$ are the same product: if $u' \neq u$ and $i' \neq i$, the datum $r_{u'i'}$ is placed in the first layer; otherwise it is placed in the second layer. A third rating datum $r_{u''i''}$ is then taken and first compared with the rating data already placed in the second layer, if a second layer exists. If its user $u''$ and product $i''$ both differ from the user $u'$ and product $i'$ placed in the second layer, it is next compared with the rating data already placed in the first layer; if it also satisfies $u'' \neq u$ and $i'' \neq i$ with respect to the first layer, it is placed in the first layer, and otherwise in the second layer. If instead $u'' = u'$ or $i'' = i'$ holds for the user and product placed in the second layer, the datum is placed directly in the third layer. This continues until all rating data in the rating data set have been placed. Seven data layers are obtained in turn; from the first data layer to the last, that is, the seventh data layer, they are:
$l_1$ (Layer 1): $(u_1, i_1, 3)$
$l_2$ (Layer 2): $(u_1, i_4, 2)$, $(u_5, i_2, 1)$
$l_3$ (Layer 3): $(u_2, i_4, 2)$, $(u_3, i_3, 5)$, $(\ldots)$
$l_4$ (Layer 4): $(u_2, i_5, 1)$, $(u_4, i_3, 2)$, $(u_6, i_2, 2)$
$l_5$ (Layer 5): $(u_3, i_5, 5)$, $(u_6, i_4, 3)$, $(\ldots)$
$l_6$ (Layer 6): $(u_3, i_6, 2)$, $(u_4, i_5, 3)$, $(u_7, i_4, 5)$
$l_7$ (Layer 7): $(u_7, i_5, 2)$
In matrix form, this is expressed as:
Figure imgf000040_0001
Here, * in the matrices indicates that the user at that position has not rated the product at that position, and $A = l_1 + l_2 + l_3 + l_4 + l_5 + l_6 + l_7$.
Exemplarily, the preset recommendation system model may be a latent factor model that takes users' implicit feedback into account, a latent factor model based only on explicit feedback, a recommendation system model that takes spatio-temporal characteristics into account, an asymmetric latent factor model, and so on; the embodiments of the present invention place no limitation on this.
Further, in this embodiment, the processor 801 constructs an improved first recommendation system model that takes the user's implicit feedback into account, as in equation (1), and a second recommendation system model that does not take implicit feedback into account, as in equation (2).
In equations (1) and (2), $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, T denotes the transpose operator, and $p_u$ denotes the user factor vector. Further, in equation (1), $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which is used to characterize the implicit feedback information. $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ are the unknown parameters of the recommendation system model.
Exemplarily, the processor 801 may also obtain the cost function of the recommendation system model from the relationship between the mean square error of the model's predicted values against the rating data and the parameters of the model. Using the rating data of the layered data matrices, the unknown parameters of the model are obtained by solving the optimization problem of this cost function. That is, the parameters of the recommendation system model are computed in parallel within each data layer, and the parameters of each data layer are used as the initial values for the next data layer, until the optimal parameters of the model are obtained. The optimal parameters are the optimal values of the unknown parameters of the model. Specifically, in this embodiment, the cost functions of the first and second recommendation system models may be expressed as equation (3) and equation (4) respectively, where $\|*\|^2$ denotes the sum of squares of all elements of the vector *, and the two coefficients of the penalty terms are regularization factors.
Exemplarily, the processor 801 obtains the unknown parameters of the model from the above model and the rating data of the layered data matrices in this way. Those skilled in the art will understand that the solution procedures for the optimization problems of equations (3) and (4) are similar, so only one is described. Taking equation (3) as an example, as shown in FIG. 7, the processor 801 is further configured to:
compute the average of all the rating data in the rating data set; and
compute the parameters of each data layer in turn by parallel computation, using the parameters computed for each data layer as the initial parameter values of the next data layer.
Exemplarily, in this embodiment, before the first computation, the initial values of the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ of the first data layer, and of $\lambda_1$ and $\lambda_2$, may be set randomly. For simplicity, the initial values of $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ may be set to 0 for the scalar parameters and to the zero vector $\vec{0}$ for the vector parameters (the arrow above the 0 is the vector symbol), while $\lambda_1$ and $\lambda_2$, which denote the regularization factors, may each be set to an arbitrary, relatively small positive value. The embodiments of the present invention place no limitation on this.
Exemplarily, starting from the first data layer, the processor 801 computes in parallel, layer by layer, the values of the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ of each data layer and uses the computed values as the initial values of the same parameters for the next data layer, until the parameters of the last data layer have been computed. The specific computation is as follows.
The processor 801 uses the initial parameter values of the first data layer to update the unknown parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ corresponding to the rating data contained in the first data layer. Because the rating data within one layer matrix do not depend on one another, the updates of the unknown parameters corresponding to different rating data of the same layer can be performed in parallel. The updated parameters then serve as the initial parameter values of the next data layer, whose parameters are updated in the same way, and so on until the parameters corresponding to the rating data of the last data layer have been updated, which yields the parameters computed for the last data layer.
Preferably, in this embodiment, the processor 801 may perform the update by gradient descent. Because the parameters corresponding to every rating datum are updated in the same way, those skilled in the art will understand that, once the update steps for the parameters $b_u$, $b_i$, $q_i$, $p_u$ and $y_j$ corresponding to one rating datum $r_{ui}$ have been described, the same steps can be applied to the other rating data without creative effort. For equation (1), the computation performed by the processor 801 may be as follows.
First, the processor 801 obtains an initial estimate $\hat{r}_{ui}$ of the rating datum $r_{ui}$ from the initial parameter values of the data layer and the recommendation system model of equation (1), and then obtains the rating error $e_{ui}$ of that data layer from the layer's rating data and the initial estimate;
Next, the processor 801 obtains the updated values of the parameters in the negative gradient direction from the rating error $e_{ui}$ and equations (5) to (9). Those skilled in the art will understand that each of equations (5) to (8) updates its parameter directly according to the corresponding formula; the computation and the meaning of the symbols and parameters appearing in the equations are not repeated here. Correspondingly, solving the optimization problem of equation (4) only requires the update computations of equations (5) to (8), which yield the parameters $b_u$, $b_i$, $q_i$ and $p_u$.
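The layer-wise update for the second recommendation system model (the one without implicit feedback) can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: equations (5) to (8) are not reproduced in this text, so the step used here is simply the negative gradient of the per-rating term of cost function (4); the greedy `assign_layers` helper is a hypothetical way to build layers in which no user and no product occurs twice; and the names `gamma1`, `gamma2` (learning rates), `lam1` and `lam2` (regularization factors) are illustrative. NumPy array operations stand in for parallel execution: because the indices within a layer are unique, all ratings of a layer can be updated at once without write conflicts.

```python
import numpy as np


def assign_layers(ratings):
    """Greedily split (user, product, rating) triples into layers in which
    no user and no product occurs twice (hypothetical helper)."""
    layers = []  # each entry: (users seen, products seen, triples)
    for u, i, r in ratings:
        for users, products, triples in layers:
            if u not in users and i not in products:
                users.add(u)
                products.add(i)
                triples.append((u, i, r))
                break
        else:  # no existing layer can take this rating
            layers.append(({u}, {i}, [(u, i, r)]))
    return [triples for _, _, triples in layers]


def update_layer(layer, mu, b_u, b_i, P, Q, gamma1, gamma2, lam1, lam2):
    """Update, in place, the parameters touched by one layer; return the errors."""
    u = np.array([t[0] for t in layer])
    i = np.array([t[1] for t in layer])
    r = np.array([t[2] for t in layer], dtype=float)
    # Initial estimates from the layer's starting parameters, then the rating errors.
    e = r - (mu + b_u[u] + b_i[i] + np.sum(Q[i] * P[u], axis=1))
    q_old = Q[i].copy()  # right-hand sides use the layer's initial values
    # Indices are unique within a layer, so these writes never collide.
    b_u[u] += gamma1 * 2.0 * (e - lam1 * b_u[u])
    b_i[i] += gamma1 * 2.0 * (e - lam1 * b_i[i])
    Q[i] += gamma2 * 2.0 * (e[:, None] * P[u] - lam2 * Q[i])
    P[u] += gamma2 * 2.0 * (e[:, None] * q_old - lam2 * P[u])
    return e


def sweep(layers, mu, b_u, b_i, P, Q, **hyper):
    """Process the layers in order; each layer starts from the previous result."""
    for layer in layers:
        update_layer(layer, mu, b_u, b_i, P, Q, **hyper)
```

In a real deployment the per-layer update would be distributed over workers; the vectorized form above only demonstrates why the layer property makes that safe.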
For the computation of equation (9), however, the processor 801 would also have to go through the rating data in the rating data set and compute $y_j$ serially, one after another, which lowers computational efficiency. The embodiments of the present invention therefore provide a computation method for equation (9), implemented as follows. When several $y_j$ of the same layer matrix need to be computed, the processor 801 uses the users in the rating data set that correspond to the product j represented by $y_j$ to compute the corresponding gradients $\Delta y_j$ in parallel. For example, suppose that in the rating data set user 1, user 2 and user 6 have rated product 4 (j = 4):
Figure imgf000043_0001
The processor 801 then aggregates these gradients to obtain the value of $y_4$ for the layer, i.e. $y_4 \leftarrow y_4^{0} + \gamma_2\,(\Delta y_4^{(1)} + \Delta y_4^{(2)} + \Delta y_4^{(6)})$, where $y_4^{0}$ is the initial value of $y_4$ and the superscripts index the contributing users.
In addition, the embodiments of the present invention provide another, equivalent computation method for equation (8) and equation (9), implemented as follows. The expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$ in the first recommendation model is treated as one equivalent parameter and represented by an auxiliary variable $z_u$, i.e. $z_u = p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$. The first recommendation model is then equivalent to
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} \cdot z_u,$$
whose parameters are $b_u$, $b_i$, $q_i$ and $z_u$.
The gradient of the auxiliary variable is $\Delta z_u = 2e_{ui} \cdot q_i - […]$, which gives the update formula of the auxiliary variable, $z_u \leftarrow z_u^{0} + \gamma_2 \cdot \Delta z_u$, where $z_u^{0}$ is the initial value of $z_u$ when the layer is computed.
With this method, updating the auxiliary variable $z_u$ alone replaces updating the parameters $p_u$ and $y_j$. Moreover, because the auxiliary variable $z_u$ has been introduced, the update of the parameter $q_i$ in equation (8) changes accordingly: the update formula of $q_i$ becomes $q_i \leftarrow q_i^{0} + \gamma_2 \cdot \Delta q_i$, where $q_i^{0}$ is the initial value of $q_i$ when the layer is computed.
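The effect of the auxiliary variable can be illustrated with a single-rating update. The sketch below is an illustration under explicit assumptions rather than the embodiment's code: only the leading term $2e_{ui} \cdot q_i$ of the gradient survives in this text, so a plain regularization term on $z_u$ (and the symmetric form for $q_i$) is assumed, the bias steps use an assumed negative-gradient form, and all function and parameter names are hypothetical.

```python
import numpy as np


def fold_implicit(p_u, Y, implicit_items):
    """z_u = p_u + |N(u)|^(-1/2) * sum of y_j over the implicit set N(u)."""
    if len(implicit_items) == 0:
        return p_u.copy()
    return p_u + Y[implicit_items].sum(axis=0) / np.sqrt(len(implicit_items))


def update_with_auxiliary(r_ui, mu, b_u, b_i, q_i, z_u,
                          gamma1=0.01, gamma2=0.01, lam1=0.02, lam2=0.02):
    """One update for a single rating using z_u in place of p_u and the y_j.

    Returns the updated (b_u, b_i, q_i, z_u) and the rating error e_ui.
    Every right-hand side uses the values the parameters had on entry.
    """
    e_ui = r_ui - (mu + b_u + b_i + q_i @ z_u)
    new_b_u = b_u + gamma1 * 2.0 * (e_ui - lam1 * b_u)
    new_b_i = b_i + gamma1 * 2.0 * (e_ui - lam1 * b_i)
    new_z_u = z_u + gamma2 * 2.0 * (e_ui * q_i - lam2 * z_u)  # assumed form
    new_q_i = q_i + gamma2 * 2.0 * (e_ui * z_u - lam2 * q_i)  # assumed form
    return new_b_u, new_b_i, new_q_i, new_z_u, e_ui
```

Note that the update touches one vector per user instead of one vector per implicitly rated product, which is where the saved inner loop comes from.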
It follows that introducing the auxiliary variable $z_u$ simplifies the computation and, because the computation of $y_j$ is eliminated, removes one inner loop from the parameter solving of the recommendation model, which greatly increases the computation speed while preserving the same accuracy.
The processor 801 may then use the parameters computed for this data layer as the initial parameter values of the next data layer: it takes the updated values as the initial values of the parameters of the second data layer, computes the updated values of the second data layer's parameters by the method above, and so on, until it obtains the updated values of the parameters of the last data layer.
Exemplarily, the processor 801 is further configured to determine, from the parameters computed for the last data layer, whether the recommendation system model has converged. If it has converged, the computation ends and the optimal parameters are obtained; if it has not converged, the parameters computed for the last data layer are used as the initial parameter values of the first data layer, and the computation of the parameters of the recommendation system model continues.
Preferably, the processor 801 may substitute both the parameters computed for the last data layer in the current round and the parameters computed for the last data layer in the previous round into the cost function of equation (3) or equation (4). If the difference between the two results is not greater than a preset threshold, the parameters computed for the last data layer in the current round are taken as the optimal parameters. If the difference is greater than the preset threshold, the parameters computed for the last data layer in the current round have not converged; they are used as the initial parameter values of the first data layer in the next round, and the computation of the parameters of the recommendation system model continues.
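The convergence test described above can be sketched as an outer loop around the layer-by-layer pass. The function below is a hypothetical helper: the callables `sweep_all_layers` and `cost` are supplied by the caller, and the `max_rounds` cap is an added safeguard that the text does not mention.

```python
def train_until_converged(params, sweep_all_layers, cost,
                          threshold=1e-6, max_rounds=1000):
    """Repeat full passes over the layers until the cost changes by at most `threshold`.

    `sweep_all_layers(params)` returns the parameters after the last layer;
    `cost(params)` evaluates the cost function for those parameters.
    Returns (params, rounds used, whether the threshold was met).
    """
    previous_cost = None
    for round_number in range(1, max_rounds + 1):
        params = sweep_all_layers(params)  # last layer's result of this round
        current_cost = cost(params)
        if previous_cost is not None and abs(current_cost - previous_cost) <= threshold:
            return params, round_number, True
        previous_cost = current_cost  # compared against the next round's cost
    return params, max_rounds, False
```

The first round has no previous result to compare against, so the check starts with the second round; each non-converged result becomes the starting point of the next round, as the text requires.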
Exemplarily, the processor 801 substitutes the obtained optimal parameters into equation (1) or equation (2) to obtain the predicted rating of each user for each product. The predicted ratings of one user for all products can then be ranked, and a preset number of products with the highest predicted ratings are recommended to that user.
Exemplarily, after products have been recommended, the processor 801 may further input the new rating data that users give to products into the recommendation system, so that the recommendation system can update the parameters of the recommendation system model in real time and thus keep its recommendations accurate, efficient and timely.
This embodiment provides a recommendation device 60 that improves the recommendation efficiency of a recommendation system in a massive-data environment through parallel computation and improves the recommendation quality by taking users' implicit feedback into account.
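The ranking step can be sketched as follows, assuming the second recommendation system model in the form $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} p_u$ (equation (2) is not reproduced in this text, so this form is inferred from the symbol definitions). The function name `recommend_top_n` and the array layout, with one row of `P` per user and one row of `Q` per product, are illustrative choices.

```python
import numpy as np


def recommend_top_n(mu, b_u, b_i, P, Q, n):
    """Predict every user's rating for every product and pick the n best per user.

    Returns (scores, top) where scores[u, i] is the predicted rating and
    top[u] lists the indices of user u's n highest-scoring products.
    """
    scores = mu + b_u[:, None] + b_i[None, :] + P @ Q.T
    top = np.argsort(-scores, axis=1)[:, :n]  # descending order of predicted rating
    return scores, top


# Small usage example with two users and three products.
mu = 3.0
b_u = np.array([0.0, 0.0])
b_i = np.array([0.5, -0.4, 0.0])
P = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
scores, top = recommend_top_n(mu, b_u, b_i, P, Q, n=2)
```

The sketch ranks all products, as the text describes; whether products a user has already rated should be skipped is not stated in the source and is left to the caller.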
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, several units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over several network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or in hardware combined with software functional units.
An integrated unit implemented as a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A recommendation method, comprising:
placing the rating data of a rating data set into at least two data layers, wherein the rating data are in one-to-one correspondence with users and with products respectively, and any two rating data within each data layer correspond to different users and to different products;
computing in parallel, according to a preset recommendation system model and the rating data in the data layers, the parameters of the recommendation system model in the data layers, and using the parameters of each data layer as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained, wherein the recommendation system model is the correspondence between the predicted rating of each user for each product and the average score and the parameters of the recommendation system model;
obtaining the predicted rating of each user for each product according to the optimal parameters and the recommendation system model, and recommending products to the user according to the predicted ratings.
2. The method according to claim 1, wherein the recommendation system model comprises a recommendation system model that provides implicit feedback, a recommendation system model that does not provide implicit feedback, a recommendation system model that takes spatiotemporal characteristics into account, and an asymmetric latent factor recommendation system model.
3. The method according to claim 1 or 2, wherein the recommendation system model comprises:
a first recommendation system model
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T}\Bigl(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Bigr)$$
or a second recommendation system model
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} p_u$$
wherein, in the first recommendation model and the second recommendation model, $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, T denotes the transpose operator, and $p_u$ denotes the user factor vector;
further, in the first recommendation model, $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, and $y_j$ denotes the factor vector associated with product j, which is used to characterize the implicit feedback information.
4. The method according to claim 3, further comprising: obtaining the cost function of the recommendation system model from the relationship between the mean square error of the predicted ratings relative to the rating data and the parameters of the recommendation system model, wherein the cost function comprises:
a first cost function
$$\sum_{(u,i)} \Bigl[r_{ui} - \mu - b_u - b_i - q_i^{T}\Bigl(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Bigr)\Bigr]^2 + \lambda_1\bigl(b_u^2 + b_i^2\bigr) + \lambda_2\Bigl(\|q_i\|^2 + \|p_u\|^2 + \sum_{j \in N(u)} \|y_j\|^2\Bigr)$$
or a second cost function
$$\sum_{(u,i)} \bigl[r_{ui} - \mu - b_u - b_i - q_i^{T} p_u\bigr]^2 + \lambda_1\bigl(b_u^2 + b_i^2\bigr) + \lambda_2\bigl(\|q_i\|^2 + \|p_u\|^2\bigr)$$
wherein $\|*\|^2$ denotes the sum of squares of all elements of the vector *, and $\lambda_1$ and $\lambda_2$ are regularization factors.
5. The method according to any one of claims 1 to 4, wherein computing in parallel, according to the preset recommendation system model and the rating data in the data layers, the parameters of the recommendation system model in the data layers, and using the parameters of each data layer as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained, comprises:
A: computing the average of all the rating data in the rating data set;
B: computing the parameters of each data layer in turn by parallel computation, and using the parameters computed for each data layer as the initial parameter values of the next data layer, wherein the initial parameter values of the first data layer are set by the system;
C: determining, from the parameters computed for the last data layer, whether the recommendation system model has converged; if it has converged, ending the computation and obtaining the optimal parameters; if it has not converged, using the parameters computed for the last data layer as the initial parameter values of the first data layer and repeating steps B and C.
6. The method according to claim 5, wherein step B comprises:
B1: obtaining an initial estimate of the rating data of a data layer from the initial parameter values of that data layer and the recommendation system model, and then obtaining the rating error of that data layer from its rating data and the initial estimate;
B2: obtaining the parameters computed for that data layer according to the rating error;
B3: using the parameters computed for that data layer as the initial parameter values of the next data layer, and obtaining the parameters computed for the next data layer according to steps B1 and B2, until the parameters computed for the last data layer are obtained.
7. The method according to claim 4 or 6, wherein determining, from the parameters computed for the last data layer, whether the recommendation system model has converged comprises:
substituting both the parameters computed for the last data layer in the current computation and the parameters computed for the last data layer in the previous computation into the cost function; if the difference between the two results is not greater than a preset threshold, the parameters computed for the last data layer in the current computation have converged; otherwise, the parameters computed for the last data layer in the current computation have not converged.
8. The method according to claim 6, wherein obtaining the parameters computed for that data layer according to the rating error further comprises: treating the expression $p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$ in the first recommendation model as an equivalent parameter and representing the equivalent parameter by an auxiliary variable $z_u$, i.e. $z_u = p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j$; then obtaining the auxiliary variable $z_u$, i.e. the equivalent parameter, according to the gradient of the auxiliary variable, $\Delta z_u = 2e_{ui} \cdot q_i - […]$
Figure imgf000049_0001
and obtaining the parameter $q_i$ according to the auxiliary variable $z_u$, i.e. $q_i \leftarrow q_i + \gamma_2 \cdot \Delta q_i$, wherein the symbol $\leftarrow$ is the update symbol, meaning that the computed value on the right of the update symbol replaces the value of the variable on its left; the parameters appearing on the right of the update symbol are the initial values of the corresponding parameters, and the parameters appearing on the left of the update symbol are the updated values of the parameters.
9. A recommendation device, comprising:
a data placement unit, configured to place the rating data of a rating data set into at least two data layers, wherein the rating data are in one-to-one correspondence with users and with products respectively, and any two rating data within each data layer correspond to different users and to different products;
a parallel computing unit, configured to compute in parallel, according to a preset recommendation system model and the rating data in the data layers, the parameters of the recommendation system model in the data layers, and to use the parameters of each data layer as the initial values for the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained, wherein the recommendation system model is the correspondence between the predicted rating of each user for each product and the average score and the parameters of the recommendation system model;
a prediction and recommendation unit, configured to obtain the predicted rating of each user for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted ratings.
10. The device according to claim 9, wherein the recommendation system model comprises a recommendation system model that provides implicit feedback, a recommendation system model that does not provide implicit feedback, a recommendation system model that takes spatiotemporal characteristics into account, and an asymmetric latent factor recommendation system model.
11. The device according to claim 9 or 10, wherein the recommendation system model comprises:
a first recommendation system model
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T}\Bigl(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Bigr)$$
or a second recommendation system model
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} p_u$$
wherein, in the first recommendation model and the second recommendation model, $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, T denotes the transpose operator, and $p_u$ denotes the user factor vector;
further, in the first recommendation model, $|N(u)|$ denotes the size of the set of all products for which user u has provided an implicit preference, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, and $y_j$ denotes the factor vector associated with product j, which is used to characterize the implicit feedback information.
12. The device according to claim 11, further comprising: a cost function generation unit, configured to obtain the cost function of the recommendation system model from the relationship between the mean square error of the predicted ratings relative to the rating data and the parameters of the recommendation system model, wherein the cost function comprises:
a first cost function
$$\sum_{(u,i)} \Bigl[r_{ui} - \mu - b_u - b_i - q_i^{T}\Bigl(p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j\Bigr)\Bigr]^2 + \lambda_1\bigl(b_u^2 + b_i^2\bigr) + \lambda_2\Bigl(\|q_i\|^2 + \|p_u\|^2 + \sum_{j \in N(u)} \|y_j\|^2\Bigr)$$
or a second cost function
$$\sum_{(u,i)} \bigl[r_{ui} - \mu - b_u - b_i - q_i^{T} p_u\bigr]^2 + \lambda_1\bigl(b_u^2 + b_i^2\bigr) + \lambda_2\bigl(\|q_i\|^2 + \|p_u\|^2\bigr)$$
wherein $\|*\|^2$ denotes the sum of squares of all elements of the vector *, and $\lambda_1$ and $\lambda_2$ are regularization factors.
13. The device according to any one of claims 9 to 12, wherein the parallel computing unit comprises:
an average score computation subunit, configured to compute the average of all the rating data in the rating data set;
a layered computation subunit, configured to compute the parameters of each data layer in turn by parallel computation, and to use the parameters computed for each data layer as the initial parameter values of the next data layer, wherein the initial parameter values of the first data layer are set by the system;
a convergence determination subunit, configured to determine, from the parameters computed for the last data layer, whether the recommendation system model has converged; if it has converged, the computation ends and the optimal parameters are obtained; if it has not converged, the parameters computed for the last data layer are used as the initial parameter values of the first data layer, and these initial parameter values are passed to the layered computation subunit, which repeats the layered computation.
14. The device according to claim 13, wherein the layered computation subunit further comprises:
a rating error generation module, configured to obtain an initial estimate of the rating data of a data layer from the initial parameter values of that data layer and the recommendation system model, and then to obtain the rating error of that data layer from its rating data and the initial estimate;
a parameter computation module, configured to obtain the parameters computed for that data layer according to the rating error;
a computation control module, configured to use the parameters computed for that data layer as the initial parameter values of the next data layer, and to obtain the parameters computed for the next data layer through the rating error generation module and the parameter computation module, until the parameters computed for the last data layer are obtained.
15. The device according to claim 13 or 14, wherein the convergence judgment subunit is further configured to:
substitute into the cost function both the parameters calculated for the last data layer in the current round of calculation and the parameters calculated for the last data layer in the previous round of calculation; if the difference between the two resulting cost values is not greater than a preset threshold, the parameters calculated for the last data layer in the current round have converged; otherwise, the parameters calculated for the last data layer in the current round have not converged.
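The convergence test of claim 15 can be sketched as follows. This is only an illustrative sketch: the function name `has_converged` and the toy cost are hypothetical, and taking the absolute value of the difference is an assumption, since the claim only speaks of "the difference" between the two cost values.

```python
def has_converged(cost_fn, params_now, params_prev, threshold):
    """Return True when the two cost values differ by at most `threshold`.

    cost_fn     -- the cost function of the recommendation system model
    params_now  -- parameters of the last data layer, current round
    params_prev -- parameters of the last data layer, previous round
    """
    # Assumption: the "difference" is compared in absolute value.
    return abs(cost_fn(params_now) - cost_fn(params_prev)) <= threshold


# Toy usage with a hypothetical quadratic cost on a single scalar parameter.
toy_cost = lambda p: (p - 3.0) ** 2
converged = has_converged(toy_cost, 2.999, 2.99, threshold=1e-3)
```

The threshold trades run time against accuracy: a smaller value demands more rounds of layered calculation before the loop stops.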
16. The device according to claim 14, wherein the parameter calculation module is further configured to: take the expression $p_u + |N(u)|^{-1/2}\sum_{j\in N(u)} y_j$ in the first recommendation model as an equivalent parameter and represent the equivalent parameter by an auxiliary variable $z_u$, that is, $z_u = p_u + |N(u)|^{-1/2}\sum_{j\in N(u)} y_j$; then obtain the auxiliary variable $z_u$, and thereby the equivalent parameter, according to the gradient of the auxiliary variable, $\Delta z_u = 2e_{ui}\cdot q_i - {}$ [Figure imgf000052_0001]; and obtain the parameter $q_i$ according to the auxiliary variable $z_u$, that is, $q_i \leftarrow q_i + \gamma_2(\dots)$, where $\leftarrow$ is the update symbol: the value computed on its right-hand side replaces the value of the variable on its left-hand side; parameters appearing on the right-hand side take their initial values, and parameters appearing on the left-hand side are the updated values.
17. A recommendation device, comprising a processor and a memory, wherein:
the processor is configured to place the rating data in a rating data set into at least two data layers, wherein each item of rating data corresponds one-to-one to a user and a product, and any two items of rating data in the same data layer correspond to different users and to different products;
to calculate in parallel, according to a preset recommendation system model and the rating data in each data layer, the parameters of the recommendation system model in that data layer, and to use the parameters of each data layer as the initial values of the corresponding next data layer, until the optimal parameters of the recommendation system model are obtained; wherein the recommendation system model is the correspondence between the predicted rating of each user for each product, on the one hand, and the average score and the parameters of the recommendation system model, on the other;
and to obtain the predicted rating of each user for each product according to the optimal parameters and the recommendation system model, and to recommend products to the user according to the predicted ratings;
and the memory is configured to store the rating data set, the program executed by the processor, and the results of the execution.
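The layering condition in claim 17 (no two ratings in one data layer share a user or a product) can be met in many ways. The following is a minimal sketch using a greedy first-fit strategy; the strategy, the function name `split_into_layers` and the triple layout `(user, product, rating)` are assumptions made for illustration and are not taken from the claims.

```python
def split_into_layers(ratings):
    """Greedily place (user, product, rating) triples into conflict-free layers.

    Within each returned layer, every user and every product occurs at most once,
    so the per-rating parameter updates of one layer touch disjoint parameters.
    """
    layers = []  # each layer is a list of triples
    used = []    # per layer: (set of users, set of products) already present
    for user, product, rating in ratings:
        for layer, (users, products) in zip(layers, used):
            if user not in users and product not in products:
                layer.append((user, product, rating))
                users.add(user)
                products.add(product)
                break
        else:
            # No existing layer can take this rating: open a new one.
            layers.append([(user, product, rating)])
            used.append(({user}, {product}))
    return layers


# Toy usage: two users rating the same two products need two layers.
example = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4), ("u2", "i2", 1)]
example_layers = split_into_layers(example)
```

Because the ratings inside a layer never touch the same user or product parameters, they can be processed concurrently without locking. Note that a data set with no conflicts at all would yield a single layer under this sketch, whereas the claim requires at least two.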
18. The device according to claim 17, wherein the recommendation system model includes a recommendation system model with implicit feedback, a recommendation system model without implicit feedback, a recommendation system model that takes spatiotemporal characteristics into account, and an asymmetric latent-factor recommendation system model.
19. The device according to claim 17 or 18, wherein the recommendation system model includes:
a first recommendation system model
[Figure imgf000053_0001]
or a second recommendation system model $\hat{r}_{ui} = \mu + b_u + b_i + q_i^{T} p_u$;
in the first recommendation model and the second recommendation model, $\hat{r}_{ui}$ denotes the predicted rating of user u for product i, $\mu$ denotes the average of all rating data in the rating data set, $b_u$ denotes the offset of user u relative to the average user rating, $b_i$ denotes the offset of product i relative to the average product rating, $q_i$ denotes the product factor vector, $T$ denotes the transpose operator, and $p_u$ denotes the user factor vector;
further, in the first recommendation model, $N(u)$ denotes the set of all products for which user u has provided an implicit preference, $|N(u)|$ denotes the size of that set, and $y_j$ denotes the factor vector associated with product j, which characterizes the implicit feedback information.
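The two models of claim 19 can be illustrated with a short sketch. The second model follows the squared-error term of the second cost function in claim 20. The formula of the first model survives here only as a figure reference, so the form used below, $\mu + b_u + b_i + q_i^{T}(p_u + |N(u)|^{-1/2}\sum_{j\in N(u)} y_j)$, is an assumption based on the equivalent-parameter expression in claim 16 and on the symbol definitions above; all function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equally long lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def predict_second_model(mu, b_u, b_i, q_i, p_u):
    """Second model: average + user offset + product offset + q_i^T p_u."""
    return mu + b_u + b_i + dot(q_i, p_u)


def predict_first_model(mu, b_u, b_i, q_i, p_u, y_implicit):
    """First model (assumed form): the user vector is augmented by implicit feedback.

    y_implicit -- list of factor vectors y_j, one per product j in N(u)
    """
    if y_implicit:
        scale = 1.0 / math.sqrt(len(y_implicit))  # assumed |N(u)|^(-1/2) normalisation
        z_u = [p + scale * sum(y[k] for y in y_implicit) for k, p in enumerate(p_u)]
    else:
        z_u = list(p_u)  # no implicit feedback: falls back to the second model
    return mu + b_u + b_i + dot(q_i, z_u)
```

With an empty implicit-feedback set the two functions return the same value, which mirrors the relationship between the models with and without implicit feedback.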
20. The device according to claim 19, wherein the processor is further configured to obtain the cost function of the recommendation system model according to the relationship between the mean square error between the predicted ratings and the rating data, on the one hand, and the parameters of the recommendation system model, on the other, wherein the cost function includes:
a first cost function
[Figure imgf000053_0002]
or a second cost function $\sum\,[r_{ui}-\mu-b_u-b_i-q_i^{T}p_u]^2+\lambda_1(b_u^2+b_i^2)+\lambda_2(\|q_i\|^2+\|p_u\|^2)$, where $\|\ast\|^2$ denotes the sum of the squares of all elements of the vector $\ast$, and $\lambda_1$ and $\lambda_2$ are regularization factors.
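As an illustration of the second cost function in claim 20, the following sketch evaluates it over a list of ratings. It is written under two assumptions that the text does not settle: the regularization terms are accumulated once per rating (that is, they sit inside the sum), and the two regularization factors are called `lam1` and `lam2`. The dictionary-based parameter layout is likewise hypothetical.

```python
def second_cost(ratings, mu, b_user, b_item, p, q, lam1, lam2):
    """Squared rating error plus regularization, summed over all ratings.

    ratings -- iterable of (user, product, rating) triples
    b_user, b_item -- dicts of scalar offsets per user / per product
    p, q -- dicts of factor vectors (lists of floats) per user / per product
    """
    total = 0.0
    for u, i, r in ratings:
        predicted = mu + b_user[u] + b_item[i] + sum(a * b for a, b in zip(q[i], p[u]))
        total += (r - predicted) ** 2
        # Assumption: regularization is counted once per observed rating.
        total += lam1 * (b_user[u] ** 2 + b_item[i] ** 2)
        total += lam2 * (sum(x * x for x in q[i]) + sum(x * x for x in p[u]))
    return total
```

A function of this shape is what the convergence test substitutes the current and previous parameter sets into.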
21. The device according to any one of claims 18-20, wherein the processor is configured to:
A: calculate the average score of all rating data in the rating data set;
B: calculate the parameters of each data layer in turn by parallel computation, and use the parameters calculated for each data layer as the initial parameter values of the next data layer, wherein the initial parameter values of the first data layer are set by the system;
C: judge, according to the parameters calculated for the last data layer, whether the recommendation system model has converged; if it has converged, the calculation ends and the optimal parameters are obtained; if it has not converged, use the parameters calculated for the last data layer as the initial parameter values of the first data layer and repeat steps B and C.
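Steps B and C of claim 21 form an outer loop around the per-layer calculation. The sketch below shows that control flow only; the per-layer update and the cost function are passed in as callables, and the names `train_until_converged`, `update_layer`, `cost` and `max_rounds` are hypothetical. Comparing the first round against the cost of the system-set initial values, and capping the number of rounds, are assumptions added so that the loop is well defined.

```python
def train_until_converged(layers, params, update_layer, cost, threshold, max_rounds=1000):
    """Repeat steps B and C until the cost changes by at most `threshold`."""
    prev_cost = cost(params)  # assumption: round 1 is compared with the initial values
    for _ in range(max_rounds):
        # Step B: sweep the layers in order; each layer starts from the previous result.
        for layer in layers:
            params = update_layer(layer, params)
        # Step C: convergence test on the parameters of the last layer.
        current_cost = cost(params)
        if abs(prev_cost - current_cost) <= threshold:
            break
        prev_cost = current_cost
    return params


# Toy usage: one scalar parameter pulled halfway toward each layer's target value.
result = train_until_converged(
    layers=[3.0, 3.0],
    params=0.0,
    update_layer=lambda target, p: p + 0.5 * (target - p),
    cost=lambda p: (p - 3.0) ** 2,
    threshold=1e-6,
)
```

In the toy run the parameter approaches 3.0 and the loop stops once two consecutive rounds give nearly the same cost. Step A (computing the average score) would run once before this loop.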
22. The device according to claim 21, wherein the processor is configured to:
B1: obtain an initial estimate of the rating data of a data layer according to the initial parameter values of that data layer and the recommendation system model, and then obtain the rating error of that data layer according to the rating data of that data layer and the initial estimate;
B2: obtain the parameters calculated for that data layer according to the rating error;
B3: use the parameters calculated for that data layer as the initial parameter values of the next data layer, and obtain the parameters calculated for the next data layer according to steps B1 and B2, until the parameters calculated for the last data layer are obtained.
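Steps B1 to B3 of claim 22 can be sketched for the second recommendation model as follows. The update rule used in step B2 is a conventional stochastic gradient step with learning rate `lr` and regularization factors `lam1` and `lam2`; it is an assumption chosen for illustration, not a formula taken from the claims. The loop over a layer is written sequentially, but because no two ratings in a layer share a user or a product, its iterations touch disjoint parameters and could be run in parallel.

```python
def estimate(u, i, mu, b_user, b_item, p, q):
    """Second-model estimate: mu + b_u + b_i + q_i^T p_u."""
    return mu + b_user[u] + b_item[i] + sum(a * b for a, b in zip(q[i], p[u]))


def layer_pass(layer, mu, b_user, b_item, p, q, lr=0.01, lam1=0.02, lam2=0.02):
    """Process one data layer in place (steps B1 and B2)."""
    for u, i, r in layer:
        # B1: initial estimate from the layer's starting parameters, then the rating error.
        err = r - estimate(u, i, mu, b_user, b_item, p, q)
        # B2: assumed gradient-style update of the parameters touched by this rating.
        b_user[u] += lr * (err - lam1 * b_user[u])
        b_item[i] += lr * (err - lam1 * b_item[i])
        p_old = list(p[u])
        p[u] = [pk + lr * (err * qk - lam2 * pk) for pk, qk in zip(p_old, q[i])]
        q[i] = [qk + lr * (err * pk - lam2 * qk) for pk, qk in zip(p_old, q[i])]


def run_layers(layers, mu, b_user, b_item, p, q, **kwargs):
    """B3: the parameters left by one layer are the initial values of the next."""
    for layer in layers:
        layer_pass(layer, mu, b_user, b_item, p, q, **kwargs)
```

Passing the same parameter dictionaries from layer to layer is what implements the hand-over of initial values described in step B3.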
23. The device according to claim 21 or 22, wherein the processor is configured to:
substitute into the cost function both the parameters calculated for the last data layer in the current round of calculation and the parameters calculated for the last data layer in the previous round of calculation; if the difference between the two resulting cost values is not greater than a preset threshold, the parameters calculated for the last data layer in the current round have converged; otherwise, the parameters calculated for the last data layer in the current round have not converged.
24. The device according to claim 22, wherein the processor being configured to obtain the parameters calculated for the data layer according to the rating error further comprises:
the processor takes the expression $p_u + |N(u)|^{-1/2}\sum_{j\in N(u)} y_j$ in the first recommendation model as an equivalent parameter and represents the equivalent parameter by an auxiliary variable $z_u$, that is, $z_u = p_u + |N(u)|^{-1/2}\sum_{j\in N(u)} y_j$ [Figure imgf000055_0001]; then obtains the auxiliary variable $z_u$, and thereby the equivalent parameter, according to the gradient of the auxiliary variable, $\Delta z_u = 2e_{ui}\cdot q_i$ [Figure imgf000055_0002]; and obtains the parameter $q_i$ according to the auxiliary variable $z_u$, that is, $q_i \leftarrow q_i + \gamma_2(\dots)$, where $\leftarrow$ is the update symbol: the value computed on its right-hand side replaces the value of the variable on its left-hand side; parameters appearing on the right-hand side take their initial values, and parameters appearing on the left-hand side are the updated values.
PCT/CN2013/083218 2013-09-10 2013-09-10 Recommendation method and device WO2015035556A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2013/083218 WO2015035556A1 (en) 2013-09-10 2013-09-10 Recommendation method and device
CN201380001312.8A CN104854580B (en) 2013-09-10 2013-09-10 A kind of recommendation method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/083218 WO2015035556A1 (en) 2013-09-10 2013-09-10 Recommendation method and device

Publications (1)

Publication Number Publication Date
WO2015035556A1 true WO2015035556A1 (en) 2015-03-19

Family

ID=52664913

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/083218 WO2015035556A1 (en) 2013-09-10 2013-09-10 Recommendation method and device

Country Status (2)

Country Link
CN (1) CN104854580B (en)
WO (1) WO2015035556A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017136279A1 (en) 2016-02-01 2017-08-10 3M Innovative Properties Company Conformable, peelable adhesive articles
CN107526753A (en) * 2016-07-29 2017-12-29 腾讯科技(深圳)有限公司 The recommendation method and apparatus of application program
WO2018106489A1 (en) 2016-12-07 2018-06-14 3M Innovative Properties Company Methods of passivating adhesives
WO2019226809A1 (en) 2018-05-23 2019-11-28 3M Innovative Properties Company Wall anchors and assemblies for heavyweight objects
CN111241408A (en) * 2020-01-21 2020-06-05 武汉轻工大学 Recommendation model construction system and method
US10927277B2 (en) 2017-08-25 2021-02-23 3M Innovative Properties Company Adhesive articles permitting damage free removal
WO2021064696A1 (en) 2019-10-04 2021-04-08 3M Innovative Properties Company A film backing for releasable securement
US11078383B2 (en) 2017-08-25 2021-08-03 3M Innovative Properties Company Adhesive articles permitting damage free removal
US11472156B2 (en) 2017-03-28 2022-10-18 3M Innovative Properties Company Conformable adhesive articles
US11503929B2 (en) 2018-12-19 2022-11-22 3M Innovative Properties Company Flexible hardgoods with enhanced peel removability
WO2022263954A1 (en) 2021-06-15 2022-12-22 3M Innovative Properties Company Stretch removable pressure sensitive adhesive articles
US11593689B2 (en) * 2018-09-14 2023-02-28 Kabushiki Kaisha Toshiba Calculating device, calculation program, recording medium, and calculation method
WO2023111747A1 (en) 2021-12-17 2023-06-22 3M Innovative Properties Company Catheter stabilization device
USD996195S1 (en) 2022-02-28 2023-08-22 3M Innovative Properties Company Mounting hook
US11950372B2 (en) 2018-06-28 2024-04-02 3M Innovation Properties Methods of making metal patterns on flexible substrate

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344328B (en) * 2018-09-21 2021-01-05 百度在线网络技术(北京)有限公司 Method and device for obtaining optimal parameter combination of recommendation system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020065797A1 (en) * 2000-11-30 2002-05-30 Wizsoft Ltd. System, method and computer program for automated collaborative filtering of user data
CN101437220A (en) * 2008-09-18 2009-05-20 广州五度信息技术有限公司 System and method for implementing mutual comment and color bell recommendation between users
US20100185579A1 (en) * 2009-01-22 2010-07-22 Kwang Seok Hong User-based collaborative filtering recommendation system and method for amending similarity using information entropy
CN102262764A (en) * 2010-05-28 2011-11-30 王希 Electronic commerce recommending method based on regression model

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050096950A1 (en) * 2003-10-29 2005-05-05 Caplan Scott M. Method and apparatus for creating and evaluating strategies
US8103675B2 (en) * 2008-10-20 2012-01-24 Hewlett-Packard Development Company, L.P. Predicting user-item ratings
US9336315B2 (en) * 2010-01-19 2016-05-10 Ebay Inc. Personalized recommendation of a volatile item
CN102541920A (en) * 2010-12-24 2012-07-04 华东师范大学 Method and device for improving accuracy degree by collaborative filtering jointly based on user and item
CN102780920A (en) * 2011-07-05 2012-11-14 上海奂讯通信安装工程有限公司 Television program recommending method and system
CN103106535B (en) * 2013-02-21 2015-05-13 电子科技大学 Method for solving collaborative filtering recommendation data sparsity based on neural network

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017136279A1 (en) 2016-02-01 2017-08-10 3M Innovative Properties Company Conformable, peelable adhesive articles
CN107526753A (en) * 2016-07-29 2017-12-29 腾讯科技(深圳)有限公司 The recommendation method and apparatus of application program
CN107526753B (en) * 2016-07-29 2020-06-16 腾讯科技(深圳)有限公司 Recommendation method and device for application program
WO2018106489A1 (en) 2016-12-07 2018-06-14 3M Innovative Properties Company Methods of passivating adhesives
US11472156B2 (en) 2017-03-28 2022-10-18 3M Innovative Properties Company Conformable adhesive articles
US10927277B2 (en) 2017-08-25 2021-02-23 3M Innovative Properties Company Adhesive articles permitting damage free removal
US11898069B2 (en) 2017-08-25 2024-02-13 3M Innovative Properties Company Adhesive articles permitting damage free removal
US11078383B2 (en) 2017-08-25 2021-08-03 3M Innovative Properties Company Adhesive articles permitting damage free removal
WO2019226809A1 (en) 2018-05-23 2019-11-28 3M Innovative Properties Company Wall anchors and assemblies for heavyweight objects
US11950372B2 (en) 2018-06-28 2024-04-02 3M Innovation Properties Methods of making metal patterns on flexible substrate
US11593689B2 (en) * 2018-09-14 2023-02-28 Kabushiki Kaisha Toshiba Calculating device, calculation program, recording medium, and calculation method
US11503929B2 (en) 2018-12-19 2022-11-22 3M Innovative Properties Company Flexible hardgoods with enhanced peel removability
US11819146B2 (en) 2018-12-19 2023-11-21 3M Innovative Properties Company Flexible hardgoods with enhanced peel removability
WO2021064696A1 (en) 2019-10-04 2021-04-08 3M Innovative Properties Company A film backing for releasable securement
CN111241408A (en) * 2020-01-21 2020-06-05 武汉轻工大学 Recommendation model construction system and method
WO2022263954A1 (en) 2021-06-15 2022-12-22 3M Innovative Properties Company Stretch removable pressure sensitive adhesive articles
WO2023111747A1 (en) 2021-12-17 2023-06-22 3M Innovative Properties Company Catheter stabilization device
USD996195S1 (en) 2022-02-28 2023-08-22 3M Innovative Properties Company Mounting hook

Also Published As

Publication number Publication date
CN104854580A (en) 2015-08-19
CN104854580B (en) 2018-09-28

Similar Documents

Publication Publication Date Title
WO2015035556A1 (en) Recommendation method and device
CN106023015B (en) Course learning path recommendation method and device
CN107358293B (en) Neural network training method and device
US10180968B2 (en) Gaussian ranking using matrix factorization
US20170206551A1 (en) Personalized Recommendation Computation in Real Time using Incremental Matrix Factorization and User Factor Clustering
TWI658420B (en) Method, device, server and computer readable storage medium for integrate collaborative filtering with time factor
JP7287397B2 (en) Information processing method, information processing apparatus, and information processing program
WO2012149705A1 (en) Long-term prediction method and apparatus of network traffic
CN115087970A (en) Recommendation system using bayesian graph convolution network
CN107545444B (en) Business advertisement data recommendation method and device
CN109298930B (en) Cloud workflow scheduling method and device based on multi-objective optimization
CN109074349A (en) Data are handled to characterize the thermal behavior of battery
EP3509366A1 (en) Method and device for predicting network distance
TW202314558A (en) Improved recommender system and method using shared neural item representations for cold-start recommendations
CN111260449A (en) Model training method, commodity recommendation device and storage medium
WO2023168856A1 (en) Associated scene recommendation method and device, storage medium, and electronic device
JP2018190409A (en) Recommendation device, recommendation method, and program
US9906577B2 (en) Method and server for searching for data stream dividing point based on server
JP2020027644A (en) Motor excitation signal search method and electronic apparatus
CN111402003B (en) System and method for realizing user-related recommendation
CN109905880B (en) Network partitioning method, system, electronic device and storage medium
CN104933312A (en) Node similarity calculation method based on SimRank
WO2022183889A1 (en) Generation method and apparatus for bayesian network structure, and electronic device and storage medium
WO2020147971A1 (en) Training in communication systems
JP2019200510A (en) Forecasting system and forecasting method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13893571

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13893571

Country of ref document: EP

Kind code of ref document: A1