WO2018205853A1 - Distributed computing system and method and storage medium - Google Patents

Distributed computing system and method and storage medium

Info

Publication number
WO2018205853A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
item
user
sub
node
Prior art date
Application number
PCT/CN2018/084870
Other languages
French (fr)
Chinese (zh)
Inventor
Tan Yunkun (谭蕴琨)
Yu Lele (余乐乐)
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2018205853A1 publication Critical patent/WO2018205853A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • the present invention relates to computer technology, and in particular, to a distributed computing system, method, and storage medium.
  • a machine learning method is used to train a model for predicting users' ratings of different commodities, so that a user's ratings of different products can be calculated and the high-scoring products can be selected and recommended to the user; this helps users quickly locate products of interest and achieves accurate and efficient product marketing.
  • Embodiments of the present invention are directed to providing a distributed computing system, method, and storage medium capable of performing computing tasks in a resource-efficient manner.
  • an embodiment of the present invention provides a distributed computing system, including:
  • at least two computing nodes and at least two parameter service nodes; wherein
  • the computing node is configured to initialize, according to the users included in its subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix formed by the initialized vectors;
  • the computing node is configured to iteratively calculate the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and to transmit the item sub-matrix obtained after each iteration to the corresponding parameter service node;
  • the parameter service node is configured to initialize the vectors corresponding to partial items, to obtain an item sub-matrix composed of the initialized vectors, where the partial items are a part of the items included in the training data;
  • the parameter service node is configured to update an item sub-matrix stored by the parameter service node according to an item sub-matrix transmitted by the computing node;
  • the user sub-matrices stored by the computing nodes are used to combine to obtain a user matrix;
  • the item sub-matrices stored by the parameter service nodes are used to combine to obtain an item matrix;
  • a vector corresponding to a target user in the user matrix and a vector corresponding to a target item in the item matrix are used to obtain a score of the target user for the target item.
  • an embodiment of the present invention provides a distributed computing method, which is applied to a distributed computing system including at least two computing nodes and at least two parameter service nodes;
  • the computing node initializes, according to the users included in its subset of the training data, the vectors corresponding to those users in the user matrix, and obtains a user sub-matrix composed of the initialized vectors;
  • the computing node iteratively calculates the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmits the item sub-matrix obtained after each iteration calculation to the corresponding parameter service node;
  • the parameter service node initializes the vectors corresponding to partial items, and obtains an item sub-matrix composed of the initialized vectors, where the partial items are a part of the items included in the training data;
  • the parameter service node updates the item sub-matrix it stores according to the item sub-matrix transmitted by the computing node;
  • the user sub-matrices stored by the computing nodes are used to combine to obtain a user matrix;
  • the item sub-matrices stored by the parameter service nodes are used to combine to obtain an item matrix;
  • a vector corresponding to a target user in the user matrix and a vector corresponding to a target item in the item matrix are used to obtain a score of the target user for the target item.
  • an embodiment of the present invention provides a storage medium storing an executable program, and when the executable program is executed by a processor, the following operations are implemented:
  • the vector corresponding to the user in the user matrix is initialized, and a user sub-matrix composed of the initialized vector is obtained;
  • the vectors corresponding to partial items are initialized, and an item sub-matrix composed of the initialized vectors is obtained, the partial items being a part of the items included in the training data;
  • the item sub-matrix stored by the parameter service node is updated according to the item sub-matrix transmitted by the computing node.
  • In the embodiments of the present invention, a plurality of computing nodes perform iterative calculation on the stored user sub-matrices and item sub-matrices based on subsets of the training data; on the one hand, the computational complexity of a single node is reduced, thereby reducing its computing-resource overhead; on the other hand, parallelizing the computing nodes effectively improves computational efficiency.
  • FIG. 1 is an optional schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention
  • FIG. 2 is an optional structural diagram of a big data platform according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a distributed computing system 200 according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a distributed computing system 200 according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an optional process when the distributed computing system 200 shown in FIG. 5 is used for model training according to an embodiment of the present invention
  • FIG. 7 is a schematic diagram of an optional process when the distributed computing system 200 shown in FIG. 5 is used for model training according to an embodiment of the present invention
  • FIG. 8-1 is an optional schematic diagram of transmission of the parameters of the item matrix between a parameter service node and a computing node according to an embodiment of the present invention;
  • FIG. 8-2 is an optional schematic diagram of transmission of the parameters of the item matrix between a parameter service node and a computing node according to an embodiment of the present invention;
  • FIG. 9 is a schematic diagram of batched transmission of the item matrix between a computing node and a parameter service node according to an embodiment of the present invention;
  • FIG. 10 is a schematic flowchart of a distributed computing method according to an embodiment of the present invention;
  • FIG. 11 is an optional schematic flowchart of a model for training a predictive score according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of an optional application scenario of the big data platform 100 shown in FIG. 2 according to an embodiment of the present invention.
  • Behavior data: data including the user (e.g., identification information in the form of a serial number), the items on which the user generated scoring behavior (e.g., goods, articles, applications, which can likewise be described by serial numbers), and the user's interest in the items (also referred to herein as scores); the behavior data of multiple users constitutes a behavior data set (also referred to herein as training data). For online products, for example, scoring behavior includes browsing, collecting, purchasing, and commenting on commodities.
  • Matrix decomposition model: also known as the Latent Factor Model (LFM).
  • The training data is represented by a scoring matrix Y; assuming the scoring data relates to the scores of M users for N different items, each row vector of the scoring matrix Y corresponds to one user's scores for the different items, and each column vector of Y corresponds to the scores that one item obtains from the different users. The matrix decomposition model is used to initialize the scoring matrix, that is, features of K (a preset value) dimensions are introduced, so that the scoring matrix Y is initialized according to the matrix decomposition model as the product of a user-feature matrix (referred to as the user matrix) U and a feature-item matrix V (referred to as the item matrix).
  • the training data is the user's behavior data
  • the missing values in the scoring matrix are predicted, that is, the user's score on the ungraded items is predicted
  • The matrix decomposition model transforms the problem of predicting the missing values into the problem of solving the parameters of the user matrix and the parameters of the item matrix, that is, solving the K-dimensional parameter vectors of the user matrix and the K-dimensional parameter vectors of the item matrix.
  • FIG. 1 is an optional schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention. For given training data (including all users, all items, and the score of each scoring behavior), the behavior data is modeled using the latent factor model to obtain the model shown in FIG. 1 (assuming there are 3 users and 4 items in the behavior data); the scoring matrix is decomposed into a user matrix (representing the interest of the three users in features of three dimensions) and an item matrix (representing the weights of the four items in the three feature dimensions).
  • For example, the score y_11 of user 1 for item 1 can be expressed as the product of the row vector (u_11, u_12, u_13) corresponding to user 1 in the user matrix and the column vector (q_11, q_21, q_31) corresponding to item 1 in the item matrix.
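For illustration, a minimal Python sketch of this dot-product prediction, using invented values for the 3-user, 4-item example above (the names U, V, and predict_score are not from the patent):

```python
import numpy as np

# Invented example values: 3 users x K=3 features, K=3 features x 4 items.
U = np.array([[0.9, 0.1, 0.3],   # row (u_11, u_12, u_13) for user 1
              [0.2, 0.8, 0.5],
              [0.4, 0.6, 0.7]])
V = np.array([[0.7, 0.1, 0.2, 0.9],   # column j holds (q_1j, q_2j, q_3j) for item j
              [0.3, 0.5, 0.4, 0.1],
              [0.2, 0.6, 0.8, 0.3]])

def predict_score(i: int, j: int) -> float:
    """Predicted score of user i for item j: dot product of row u_i and column v_j."""
    return float(U[i] @ V[:, j])

# y_11: product of user 1's row vector and item 1's column vector.
print(predict_score(0, 0))  # 0.9*0.7 + 0.1*0.3 + 0.3*0.2 = 0.72
```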
  • Training, i.e., model training: iteratively calculate the parameters of the model using the training data, i.e., iteratively calculate the parameters u_ik of the user matrix U and v_kj of the item matrix V until an iteration stop condition is met, e.g., the iterative calculation reaches a predetermined number of times or the parameters converge.
  • the training data is decomposed into multiple subsets and distributed to multiple computing nodes in the distributed computing system.
  • The computing nodes calculate the parameters of the model in parallel based on their assigned subsets of the training data; since the computing task is assigned to multiple computing nodes to complete, distributed computing can expand the scale of computing and improve the efficiency of training.
  • Parameter service node architecture: a distributed computing architecture that implements machine learning in a distributed manner. It consists of parameter service nodes (PS, Parameter Server) and computing nodes (Worker), with at least two of each kind of node.
  • Parameter service node: at least two parameter service nodes are included in the distributed computing system; each parameter service node may be implemented by one or more servers, and a server implementing one may itself be referred to as a parameter service node.
  • The parameter service nodes are responsible for storing and updating the parameters of the sub-matrices of the item matrix (hereinafter referred to as item sub-matrices), and provide the computing nodes with services for reading and updating the parameters of the item matrix.
  • Computing node: each computing node can be implemented by one or more servers, and the parameter service node architecture includes multiple computing nodes.
  • Each computing node is assigned a subset of the training data, the subset including the behavior data of a part of the users. The computing node obtains the parameters of the item matrix from the parameter service nodes (which always store the latest parameters of the item matrix), uses the training data to calculate the updated values of the parameters corresponding to the above part of the users in the user matrix and the updated values of the parameters of the partial items of the item matrix (that is, the items on which that part of the users generated scoring behavior), and then transmits the updated values of the parameters of the item matrix to the parameter service nodes; each parameter service node, combining the updated values of the parameters transmitted by the computing nodes, updates the item matrix it stores locally.
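A minimal sketch of this read/update exchange, using an in-memory stand-in for a parameter service node (the class and method names are assumptions made for illustration, not the patent's implementation):

```python
import numpy as np

class ParameterServiceNode:
    """Stores an item sub-matrix: one K-dimensional vector per item ID."""
    def __init__(self, item_ids, k):
        self.vectors = {j: np.random.rand(k) * 0.1 for j in item_ids}

    def get(self, item_ids):
        """Serve the current parameters of the requested item vectors."""
        return {j: self.vectors[j].copy() for j in item_ids}

    def update(self, deltas):
        """Apply the updated values transmitted back by a computing node."""
        for j, delta in deltas.items():
            self.vectors[j] += delta

# One node holding the vectors of items 0..4 with K=3 features.
ps = ParameterServiceNode(range(5), k=3)
params = ps.get([0, 2])              # a computing node reads two item vectors
ps.update({0: np.full(3, 0.01)})     # and pushes back an updated value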
  • Spark: a distributed computing architecture in which model training is implemented by Map-Reduce nodes, involving mapping (Map) nodes and reducing (Reduce) nodes; the mapping nodes are responsible for filtering and distributing data, and the reducing nodes are responsible for calculating and merging data.
  • The big data platform is widely used to process users' behavior data collected in various industries, with data cleaning and screening where necessary; a matrix decomposition model is then built on the behavior data to predict users' scores for different items. The score reflects the user's degree of interest in an item; in an item-recommendation business scenario, items are recommended to the user according to their ranking from high to low score, which can also support targeted production/marketing activities to achieve efficient and cost-saving production/marketing.
  • FIG. 2 is an optional structural diagram of the big data platform provided by the embodiment of the present invention, and relates to the distributed computing system 200.
  • the data acquisition system 300, the real-time computing system 400, the offline computing system 500, and the resource scheduling 600 are described below.
  • The data collection system 300 is configured to collect the training data for training the model (for example, for item recommendation, the training data may include all users, all items, and lists of items on which users performed behaviors such as browsing, purchasing, following, and adding to a shopping cart), and to process it appropriately. It can be understood that, for the training data, appropriate processing may include: data cleaning and screening to filter out noise data (such as apparently unreal data outside a predetermined interval) and data beyond the validity period (such as data collected six months ago), and processing to make the training data conform to a desired distribution and the like.
  • a mechanism for user authorization and application authorization is provided to protect privacy in the context of employing various behavioral data of the user.
  • the distributed computing system 200 is configured to train the model in a manner that iteratively calculates the parameters of the model based on the training data until the iterative abort condition is met.
  • The real-time computing system 400 is configured to enable the distributed computing system 200 to train the machine learning model in a real-time manner (also referred to as online mode): whenever one record or a batch of records of the training data is received (each record corresponds to one user and includes the user's scores for different items), the distributed computing system 200 loads the received record or batch of records in memory in real time, performs training, and calculates the updated parameters of the model in real time based on the training results (e.g., the degree of difference between the true and predicted values of the scores).
  • The offline computing system 500 is configured to enable the distributed computing system 200 to train the model in an offline mode: the distributed computing system 200 loads the newly received training data together with the received historical training data in memory to iteratively calculate the updated parameters of the model.
  • the resource scheduling 600 is configured to allocate computing resources such as a central processing unit (CPU) and a graphics processing unit (GPU) to each of the foregoing systems, and allocate bandwidth resources for communication and the like.
  • The matrix decomposition model is initialized, expressed as the product of the user-feature matrix (referred to as the user matrix) and the feature-item matrix (referred to as the item matrix, which represents the weights of the different items in the different features).
  • FIG. 3 is a schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention, assuming the scoring data relates to M users' ratings of N items.
  • The scoring matrix Y is used to represent the scoring data; the dimension of Y is M × N. The scoring matrix is initialized using the matrix decomposition model, that is, features of K dimensions are introduced into the scoring matrix, thereby decomposing the scoring matrix Y_{M×N} into the product of a user-feature matrix and a feature-item matrix.
  • In Y, y_ij represents the score of the i-th user for the j-th item, and y_ij is expressed as: y_ij = Σ_{k=1..K} u_ik · v_kj (2)
  • where u_ik represents the score of user i for feature k, v_kj represents the weight of item j in feature k, k takes values 1 ≤ k ≤ K, and i and j are positive integers with 1 ≤ i ≤ M and 1 ≤ j ≤ N.
  • That is, the scoring matrix Y is initialized as the product of the user matrix U and the item matrix V. The dimension of the user matrix U is M × K; each row vector u_i is a K-dimensional vector representing user i's scores for the features of the K dimensions. The dimension of the item matrix V is K × N; each column corresponds to a K-dimensional column vector v_j, which represents the weights of item j in the K dimensions. K is the dimension of the features specified in the matrix decomposition, and the score y_ij of user i for item j is the product of u_i and v_j.
  • In practice, the scoring matrix is sparse, that is, the values of some elements in the scoring matrix are missing (represented by 0). According to the above formula (2), the missing values in the scoring matrix can be predicted, thereby converting the prediction of the missing values into the problem of solving the parameters u_ik of the user matrix U and the parameters v_kj of the item matrix V, that is, the problem of solving the K-dimensional parameter vectors u_i of the user matrix U and the K-dimensional parameter vectors v_j of the item matrix V.
  • The product of the user vector u_i and the item vector v_j is used as the predicted value of the score of user i for item j, recorded as ŷ_ij; the real value of the score of user i for item j is y_ij, and the difference between the predicted value and the true value is recorded as e_ij, that is: e_ij = y_ij − ŷ_ij (3)
  • In this way, the problem of solving the model parameters is transformed into the problem of minimizing e_ij.
  • The objective function is used to represent the difference between the predicted values and the true values of the model for the scores. The objective function is as shown in formula (4): min Σ_{(i,j) scored} e_ij² = min Σ_{(i,j) scored} (y_ij − Σ_{k=1..K} u_ik · v_kj)² (4)
  • The process of iteratively training the model is thereby converted into the process of solving the values (i.e., parameters) of u_ik and v_kj at which the above objective function converges, for example by applying the gradient descent method to the objective function, that is, solving for u_ik and v_kj along the direction of the negative gradient of the objective function. The update formulas for u_ik and v_kj are: u_ik ← u_ik + 2α·e_ij·v_kj (7.1) and v_kj ← v_kj + 2α·e_ij·u_ik (7.2), where α is the step size, indicating the learning rate.
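A single-machine Python sketch of this training process, applying (7.1) and (7.2) to the scored elements until the squared error falls below a threshold or a maximum number of iterations is reached (only the update formulas come from the text; the loop structure and data layout are assumptions):

```python
import numpy as np

def train(ratings, M, N, K=3, alpha=0.01, max_iters=200, tol=1e-4):
    """Iterate updates (7.1) and (7.2) over the observed scores.

    ratings: list of (i, j, y_ij) triples for the scored elements only.
    """
    rng = np.random.default_rng(0)
    U = rng.random((M, K)) * 0.1   # user matrix, M x K
    V = rng.random((K, N)) * 0.1   # item matrix, K x N
    for _ in range(max_iters):
        sq_err = 0.0
        for i, j, y in ratings:
            e_ij = y - U[i] @ V[:, j]              # prediction error e_ij
            U[i] += 2 * alpha * e_ij * V[:, j]     # (7.1)
            V[:, j] += 2 * alpha * e_ij * U[i]     # (7.2), using the updated u_i
            sq_err += e_ij ** 2
        if sq_err < tol:   # objective below a predetermined value: stop
            break
    return U, V

U, V = train([(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0)], M=2, N=2)
```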
  • When the number of training iterations reaches a predetermined number, or the value of the objective function falls below a predetermined value (i.e., the objective function converges), the iteration stop condition is met and the parameters of the trained model are output; according to the parameters, combined with formula (2), the user's scores for different items can be calculated, and a certain number of items with the highest scores can be selected for recommendation.
  • FIG. 4 is a schematic structural diagram of a distributed computing system 200 according to an embodiment of the present invention.
  • In FIG. 4, distributed matrix decomposition and training are implemented using the Spark Map-Reduce distributed architecture, and the model is stored in a driver node 210. The driver node 210 can be implemented by one server (or multiple servers), and each executor node 220 can be implemented by one server (or multiple servers). The driver node 210 transmits the item matrix and the user matrix to the executor nodes 220; each executor node 220 performs training according to the received user matrix and item matrix, calculates updated values of the parameters of the model, and transmits them to the driver node 210; the driver node 210 aggregates the updated values of the parameters transmitted by all executor nodes 220, updates the parameters of the locally stored model, and then broadcasts them to all executor nodes 220.
  • The Spark distributed computing architecture maintains all the parameters of the model on a single driver node; the physical limitation of the driver node's memory makes it impossible to train complex models. Moreover, each executor node transmits the parameters of the model to the driver node, which aggregates them and broadcasts them to all executor nodes, resulting in a large communication overhead between the driver node and the executor nodes; the driver node communicating with multiple executor nodes encounters bandwidth bottlenecks, and transmitting the updated values of the model parameters this way leads to low communication efficiency.
  • a distributed computing architecture based on a parameter service node is provided.
  • The training data is decomposed by user to obtain subsets of the training data; the plurality of computing nodes train the model in parallel based on the subsets of the training data, and the parameters of the model calculated by each computing node are then combined by the parameter service nodes.
  • FIG. 5 is an optional structural diagram of a distributed computing system 200 according to an embodiment of the present invention.
  • The parameter service nodes 230, the control node 240, the computing nodes 250, the scheduling layer 260, and the storage layer 270 are involved.
  • The control node 240 is configured to control the overall operation of the parameter service nodes 230 and the computing nodes 250 to ensure orderly operation, including: dividing the training data into subsets with the user as the dimension, each subset including a part of the users (i.e., a portion of all users involved in the training data); assigning a subset of the training data to each computing node 250; and controlling the orderly execution of the operations of each computing node 250 and parameter service node 230.
  • The control node 240 may be omitted from the distributed computing system 200 illustrated in FIG. 5 by coupling the functionality of the control node 240 into the parameter service nodes 230.
  • Each parameter service node 230 is configured to store a sub-matrix of the item matrix V (hereinafter referred to as an item sub-matrix); each computing node 250 is configured to store a sub-matrix of the user matrix U (hereinafter referred to as a user sub-matrix) and, based on the item sub-matrices obtained from the parameter service nodes 230, to iteratively calculate the parameters of the stored user sub-matrix according to its assigned subset of the training data.
  • The scheduling layer 260 is an abstract representation of the scheduling functions of the distributed computing system 200, involving the allocation of computing resources (such as CPU and GPU) of the control node 240, the parameter service nodes 230, and the computing nodes 250, as well as the allocation of bandwidth for communication among these nodes.
  • the storage layer 270 is an abstract representation of the storage resources of the distributed computing system 200, and relates to memory resources of the above-described nodes and non-volatile storage resources.
  • the distributed computing system 200 shown in FIG. 5 can be implemented by a cluster of servers.
  • the servers in the server cluster can be separated in physical locations, or can be deployed in the same physical location, through various communications such as optical cables and cables. Way to connect.
  • As for each node shown in FIG. 5, it may have a one-to-one correspondence with the servers in the cluster; alternatively, multiple nodes may be deployed in one server according to the actual processing capability of the server. In particular, a virtual machine environment can be set up in the cluster and the nodes shown in FIG. 5 deployed in it, which facilitates rapid deployment and migration of the nodes.
  • FIG. 6 is an optional schematic diagram of processing when the distributed computing system 200 shown in FIG. 5 is used for model training (with part of the structure in FIG. 5 omitted), showing a distributed computing architecture based on parameter service nodes, in which multiple parameter service nodes 230 and multiple computing nodes 250 are involved; they are explained separately below.
  • The parameter service nodes 230 are configured to store the item matrix V: each parameter service node 230 stores an item sub-matrix composed of the vectors of corresponding partial items in the item matrix V, denoted V-part; the items corresponding to the item sub-matrices stored by different parameter service nodes 230 are different, and the union of the items corresponding to the item sub-matrices stored by all parameter service nodes 230 is all the items involved in the training data.
  • Since the sub-matrix stored by each parameter service node 230 corresponds to only a part of the items, the technical effect of adaptively adjusting the scale of the items in the model can be realized by adjusting the number of parameter service nodes 230, which is advantageous for adjusting the size of the distributed computing system 200 according to service requirements.
  • For example, when new items need to be supported, the number of parameter service nodes 230 may be increased in the distributed computing system 200, with the newly added parameter service nodes 230 responsible for storing the vectors corresponding to the new items in the item matrix V; for the same reason, when some items no longer need to be supported, this can be implemented by revoking the parameter service nodes 230 storing the corresponding sub-matrices.
  • The computing nodes 250 are configured to utilize their assigned subsets of the training data, each subset including the behavior data of a part of the users (i.e., some of the users involved in the training data). During each iterative calculation, the computing node 250 sequentially acquires the parameters of the item sub-matrices from each parameter service node 230; combining the parameters of the item sub-matrix acquired from any parameter service node 230 with the assigned subset, it calculates, according to the above update formula (7.1), the parameters of the user sub-matrix U-part (that is, the sub-matrix of the user matrix U composed of the vectors of the partial users) and updates the user sub-matrix U-part locally; it then calculates, according to formula (7.2), the updated values of the parameters of the item sub-matrix V-part and transmits them to the parameter service node 230 storing the corresponding item sub-matrix for updating.
  • Since each computing node 250 processes only the training data of some users, the technical effect of adaptively adjusting the user scale can be achieved by adjusting the number of computing nodes 250. For example, when new users need to be supported, the number of computing nodes 250 may be increased in the distributed computing system 200, with the newly added computing nodes 250 responsible for storing and calculating the sub-matrices corresponding to the new users in the user matrix U; for the same reason, when it is no longer necessary to predict some users' scores for items, this can be realized by revoking the computing nodes 250 storing the sub-matrices of the corresponding users.
  • The size of the matrix decomposition model is (number of users + number of items) × K, and in actual applications the scale of the model can rise to hundreds of millions, or even billions or tens of billions. In the embodiments of the present invention, the distributed computing architecture using parameter service nodes reduces the dimensions of the model stored and calculated by each computing node, thereby reducing the network communication overhead caused by transmitting model parameters between the computing nodes and the parameter service nodes and improving network transmission efficiency; it also supports adjusting the numbers of parameter service nodes and computing nodes, i.e., linear expansion of the model scale. This mainly involves the following aspects.
  • The training data is processed into the format "user ID, item ID:rating, ..., item ID:rating", that is, all the scores of one user are stored in one record; the training data is divided by user (for example, evenly) into a plurality of subsets, each subset comprising a plurality of user records, and the subsets are assigned to the computing nodes 250 (a parsing sketch follows below). For example, the subsets of the training data are evenly distributed to the computing nodes when the computing power of the computing nodes 250 is balanced; or, when the computing power of the computing nodes 250 is disparate (the computing-power ratio exceeds a ratio threshold), subsets of the training data are allocated in proportion to computing power.
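A sketch of parsing this record format and splitting the records evenly by user (the helper names are illustrative; only the format itself follows the description above):

```python
def parse_record(line):
    """Parse one record of the form 'user ID, item ID:rating, ..., item ID:rating'."""
    parts = [p.strip() for p in line.split(",")]
    user_id = parts[0]
    scores = {}
    for item_rating in parts[1:]:
        item_id, rating = item_rating.split(":")
        scores[item_id.strip()] = float(rating)
    return user_id, scores

def split_by_user(records, num_workers):
    """Divide the training data by user into one subset per computing node."""
    subsets = [[] for _ in range(num_workers)]
    for idx, record in enumerate(records):
        subsets[idx % num_workers].append(parse_record(record))
    return subsets

subsets = split_by_user(["u1, i1:5, i3:2", "u2, i2:4", "u3, i1:1"], num_workers=2)
```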
  • The updates of the item sub-matrix and the user sub-matrix depend on each other: in each iteration it is first necessary to calculate the updated values of the parameters of the user sub-matrix using the parameters of the item sub-matrix (it can be understood that, since each iterative calculation applies an update value to the original value of a parameter, this document does not distinguish between calculating the update value of a parameter and calculating the updated parameter), and then to calculate the updated values of the parameters of the item sub-matrix using the updated user sub-matrix. Therefore, before an iteration begins, the computing node needs to obtain the parameters of the item sub-matrix from the parameter service node over the network, and after the iteration ends, the computing node needs to transmit the updated values of the parameters of the item sub-matrix to the parameter service node over the network.
  • The item sub-matrices are stored by the parameter service nodes 230 and the user sub-matrices are stored by the computing nodes 250, so that in each iterative calculation, when the computing node 250 calculates the updated values of the parameters of its user sub-matrix, it only needs to obtain the parameters of the item sub-matrices from each parameter service node 230; after the iterative calculation ends, it returns the updated parameters of each item sub-matrix to the parameter service node 230 storing the corresponding item sub-matrix, which then updates that item sub-matrix.
  • The update formula (7.1) for the component u_ik, in dimension k, of the vector u_i in the user matrix shows that the calculation of the parameter is only related to that user's scores, and the vectors corresponding to different users in the user matrix are independent of each other. Therefore, the user matrix U is divided into a plurality of sub-matrices along the user dimension and correspondingly stored in the plurality of computing nodes 250, and each computing node 250 calculates the updated values of the parameters of its stored user sub-matrix from its assigned training data; the dimension of the user sub-matrix is: (number of users involved in the training data assigned to the computing node 250) × K.
  • The control node 240 divides the training data, assigns a subset of the training data to each computing node 250, and initializes the user matrix U and the item matrix V; training then proceeds through multiple iterations. In each training iteration, each computing node 250 performs the following operations in parallel:
  • As shown in FIG. 7, an optional processing diagram of the distributed computing system 200 shown in FIG. 5 configured for model training: the computing node obtains from each parameter service node 230 the parameters of the item sub-matrix that the parameter service node 230 stores; according to formula (7.1), the computing node 250 calculates the updated parameters of the locally stored user sub-matrix U-part; it then calculates, according to formula (7.2), the updated values of the parameters of the item sub-matrix and transmits them to the parameter service node 230 storing the corresponding item sub-matrix, and that parameter service node 230 updates its locally stored item sub-matrix.
  • When the computing node 250 calculates the updated values of the vectors of the corresponding items in an item sub-matrix, the calculation result is only related to the users' scores for those items, and the subset of the training data assigned to the computing node 250 may include scores for only part of the items in the item sub-matrix; therefore only the vectors corresponding to the scored items in the item sub-matrix descend along the maximum gradient, while the gradient calculated for an item without scores is 0, which is equivalent to no update.
  • Therefore, when the computing node 250 obtains an item sub-matrix from a parameter service node 230, it may acquire only the vectors corresponding to the scored items in the item sub-matrix stored by that parameter service node 230, denoted V-sub; according to formula (7.1), combining the assigned subset of the training data and the vectors corresponding to the scored items, it calculates the updated values of the vectors corresponding to partial users in the locally stored user sub-matrix, the partial users being the users who generated scoring behavior for the scored items in the item sub-matrix; according to formula (7.2), it calculates the updated values of the vectors corresponding to the scored items in the item sub-matrix and returns them to the parameter service node 230 (i.e., the parameter service node 230 storing the corresponding item sub-matrix). Since the vectors corresponding to unscored items no longer need to be transmitted, the communication overhead caused by transmitting the vectors of unrated items is saved (see the sketch below).
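Continuing the sketches above, the scored-item filter might look as follows (illustrative only; integer item IDs and the ParameterServiceNode stand-in from the earlier sketch are assumed):

```python
# A node's subset: (user ID, {item ID: score}) pairs with integer item IDs.
subset = [("u1", {0: 5.0, 2: 2.0}), ("u2", {2: 4.0})]

def scored_item_ids(subset):
    """Union of the item IDs actually scored by the users in this subset."""
    ids = set()
    for _user, scores in subset:
        ids.update(scores.keys())
    return sorted(ids)

# Request only these vectors (the V-sub of the text) rather than the whole
# item sub-matrix; unscored items would only contribute zero gradients anyway.
v_sub = ps.get(scored_item_ids(subset))   # here: the vectors of items 0 and 2
```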
  • FIG. 8-1 is an optional schematic diagram of transmission of the parameters of the item matrix between parameter service node 1 and the computing nodes according to an embodiment of the present invention, where the distributed computing system is provided with 4 computing nodes. Computing node 1 to computing node 4 are correspondingly assigned different subsets of the training data, and the user sub-matrices they store are, respectively, U_part1, U_part2, U_part3, and U_part4; when acquiring the parameters of the item sub-matrix V_part1 from parameter service node 1, computing nodes 1 to 4 each acquire from parameter service node 1 the vectors corresponding to the scored items in the item sub-matrix V_part1.
  • Taking computing node 1 as an example, it determines, according to its assigned subset of the training data, the scored items in the subset, and obtains from parameter service node 1 the vectors corresponding to those scored items in the item sub-matrix V_part1, denoted V_part1-sub1. According to formula (7.1), it combines the assigned subset of the training data with V_part1-sub1 to calculate the updated values of the parameters of U_part1, for example the updated values of the vectors corresponding to the partial users in U_part1, the partial users being the users who generated scoring behavior for the scored items. According to formula (7.2), it combines the updated values of the vectors of those partial users in U_part1 to calculate the updated value of V_part1-sub1, denoted ΔV_part1-sub1, and transmits ΔV_part1-sub1 to parameter service node 1; parameter service node 1 updates its locally stored item sub-matrix V_part1 based on the updated values returned by each computing node (including ΔV_part1-sub1 to ΔV_part1-sub4 returned by computing nodes 1 to 4).
  • Only parameter service node 1 is shown in FIG. 8-1; at least two parameter service nodes are disposed in the distributed computing system. Taking parameter service node 2, which stores the item sub-matrix V_part2, as an example: as shown in FIG. 8-2, computing nodes 1 to 4 also obtain from parameter service node 2 the vectors corresponding to the scored items in the item sub-matrix V_part2, recorded as V_part2-sub1, V_part2-sub2, V_part2-sub3, and V_part2-sub4, and perform iterative calculation.
  • Parameter service node 2 updates its locally stored item sub-matrix V_part2 according to the updated values of the vectors returned by each computing node (including ΔV_part2-sub1 returned by computing node 1, ΔV_part2-sub2 returned by computing node 2, ΔV_part2-sub3 returned by computing node 3, and ΔV_part2-sub4 returned by computing node 4).
  • When the memory of the computing node 250 is insufficient, a scheme of transmitting and updating the V-sub matrix in batches can be adopted, so that the parameters transmitted in each batch are smaller than the memory of the computing node 250, guaranteeing that the computing node 250 has sufficient memory to calculate the updated values of the parameters.
  • That is, the computing node 250 retrieves the parameters of the V-sub matrix from the parameter service node 230 in batches, obtaining from the parameter service node 230, batch by batch and based on the scored items in its assigned subset of the training data, the vectors corresponding to a part of the scored items in V-sub; according to formula (7.1), combining the vectors of the scored items acquired in each batch with the assigned subset of the training data, it calculates the updated values of the parameters of the stored user sub-matrix; according to formula (7.2), combining the updated values of the parameters of the user sub-matrix, it calculates the updated values of the vectors of the scored items and transmits them to the corresponding parameter service node 230, for the parameter service node 230 to update the vectors of the scored items in its locally stored item sub-matrix (a batched-loop sketch follows below).
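A batched version of the same exchange, continuing the sketches above (the batch-splitting rule here is an assumption; the text only requires that each batch fit in the computing node's memory):

```python
import numpy as np

def iterate_in_batches(subset, ps_node, u_part, batch_num, alpha=0.01):
    """One pass in which scored-item vectors are pulled and pushed batch by batch."""
    ids = scored_item_ids(subset)                            # from the sketch above
    batches = [ids[b::batch_num] for b in range(batch_num)]  # IDset[1..BatchNum]
    for batch in batches:
        v_sub = ps_node.get(batch)        # pull one batch of scored-item vectors
        deltas = {j: np.zeros_like(v) for j, v in v_sub.items()}
        for user, scores in subset:
            for j in batch:
                if j in scores:
                    e = scores[j] - u_part[user] @ v_sub[j]    # prediction error
                    u_part[user] += 2 * alpha * e * v_sub[j]   # (7.1)
                    deltas[j] += 2 * alpha * e * u_part[user]  # (7.2)
        ps_node.update(deltas)            # push this batch's updated values

u_part = {"u1": np.full(3, 0.1), "u2": np.full(3, 0.1)}
iterate_in_batches(subset, ps, u_part, batch_num=2)
```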
  • FIG. 9 is a schematic diagram of batched transmission of the item matrix between a computing node and a parameter service node according to an embodiment of the present invention.
  • the training data relates to M users' ratings of N items.
  • The training data is divided into subsets and equally distributed to 4 computing nodes, and the 4 computing nodes correspondingly store sub-matrices of the initialized user matrix, recorded as U_part1, U_part2, U_part3, and U_part4.
  • Each computing node performs the following operations in parallel: it divides the scored items in its assigned subset into two batches, and in each iterative calculation process obtains from the item sub-matrix stored in the parameter service node the vectors corresponding to one batch of scored items, recorded as V-sub; according to formula (7.1), combining V-sub with the assigned subset of the training data, it calculates the updated values of the vectors corresponding to partial users in the user sub-matrix (i.e., the users who generated scoring behavior for the scored items); then, according to formula (7.2), combining the updated values of the vectors of those partial users in the user sub-matrix, it calculates the updated values of the vectors corresponding to the scored items in the item sub-matrix and transmits them to the parameter service node, and the parameter service node updates the item sub-matrix stored locally.
  • Transmitting the parameters of the item sub-matrix between the computing node and the parameter service node in batches avoids the memory-resource limitation of the computing node that transmitting all the parameters of an item sub-matrix at once would cause, effectively avoiding a large memory-resource overhead on a single computing node when training a large-scale model.
  • FIG. 10 is a schematic flowchart of a distributed computing method according to an embodiment of the present invention, which is applied to a distributed computing system including at least two computing nodes and at least two parameter service nodes.
  • Step 101 The computing node initializes a vector of a corresponding user in the user matrix according to a user included in the subset of the training data, and obtains a user sub-matrix composed of the initialized vector.
  • The distributed computing system may further include a control node, which divides the training data by user, dividing the score data of multiple users for different items included in the training data into a plurality of subsets, and assigns the plurality of subsets to the computing nodes; for example, an even allocation, or a proportional allocation according to the computing power of the computing nodes, may be employed.
  • Step 102: The parameter service node initializes the vectors corresponding to partial items, and obtains an item sub-matrix composed of the initialized vectors, the partial items being a part of the items included in the training data.
  • Step 103 The computing node iteratively calculates the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmits the item sub-matrix calculated by each iteration to the corresponding parameter service node.
  • In each iterative calculation, the updated value of the item sub-matrix may be calculated and transmitted to the corresponding parameter service node (i.e., the parameter service node that stored the item sub-matrix before the iterative calculation); the parameter service node calculates the new parameters of the item sub-matrix according to the updated value transmitted by the computing node, and updates the item sub-matrix it stores locally.
  • In an embodiment, the computing node determines, according to its assigned subset, the scored items included in the subset, and obtains from the item sub-matrices stored by the parameter service nodes the vectors corresponding to the scored items; the computing node iteratively calculates the user sub-matrix and the item sub-matrix in the following manner: it iteratively calculates the vectors corresponding to partial users in the user sub-matrix and the vectors corresponding to the scored items in the item sub-matrix, the partial users being the users in the subset who scored the scored items; the vectors corresponding to the scored items obtained by iterative calculation are transmitted to the corresponding parameter service node for the parameter service node to update the stored item sub-matrix.
  • When obtaining the vectors corresponding to the scored items from the item sub-matrices stored by the parameter service nodes, the computing node may obtain the vectors corresponding to the scored items in batches; it calculates the vectors corresponding to the users of the corresponding batch in the user sub-matrix and the vectors corresponding to the scored items of the corresponding batch, the users of the corresponding batch being, among the partial users, those who scored the batch's scored items; the vectors corresponding to the scored items of the corresponding batch obtained after each iterative calculation are transmitted to the corresponding parameter service node for the parameter service node to update the locally stored item sub-matrix. The number of batches is determined according to the memory space of the computing node, with the storage space occupied by the vectors corresponding to each batch's scored items being smaller than the memory space of the computing node, so that the calculation has sufficient resources to complete.
  • The computing node iteratively calculates the user sub-matrix and the item sub-matrix with the goal of descending along the maximum gradient of the loss function; for example, the computing node compares the score prediction value with the actual score value included in the subset of the training data to obtain a prediction difference; the product of the prediction difference and the item sub-matrix is superimposed onto the locally stored user sub-matrix to obtain the updated user sub-matrix; the product of the prediction difference and the updated user sub-matrix is superimposed onto the item sub-matrix to obtain the updated item sub-matrix; when the iteration stop condition is satisfied, the control node is responsible for outputting the complete model.
  • The user sub-matrices stored by the computing nodes are combined to obtain the user matrix, and the item sub-matrices stored by the parameter service nodes are combined to obtain the item matrix; when a target user's score for a target item needs to be predicted, the score is obtained from the product of the vector corresponding to the target user in the user matrix and the vector corresponding to the target item in the item matrix.
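An illustrative sketch of this combination step, assuming four user sub-matrices stacked by rows and two item sub-matrices stacked by columns (all dimensions invented):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3
U_parts = [rng.random((2, K)) for _ in range(4)]  # 4 computing nodes, 2 users each
V_parts = [rng.random((K, 3)) for _ in range(2)]  # 2 parameter service nodes, 3 items each

U = np.vstack(U_parts)   # combined 8 x K user matrix
V = np.hstack(V_parts)   # combined K x 6 item matrix

target_user, target_item = 5, 4
score = U[target_user] @ V[:, target_item]   # the target user's predicted score
```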
  • FIG. 11 is an optional schematic flowchart of a model for training a predictive score according to an embodiment of the present invention, which is described in conjunction with the distributed computing system shown in FIG.
  • k: the dimension of the feature vectors of users and items.
  • The sample data in the training data includes the user ID and the user's ratings of items.
  • BatchNum: the number of batches; during each training iteration, the computing node 250 acquires the item matrix from the parameter service nodes 230 in batches and performs iterative calculation according to the item sub-matrix acquired in each batch.
  • Step 201: The control node 240 evenly distributes subsets of the training data to the computing nodes 250.
  • Step 202: Each computing node 250 performs the following processing in parallel:
  • Step 2021: Create and initialize a user sub-matrix according to the subset of the assigned training data; each computing node stores a sub-matrix of the user matrix.
  • Each row vector of the user sub-matrix corresponds to one user, the row number corresponds to the user's ID, and the row vector represents the user's scores for different features; the user sub-matrix includes the vectors corresponding to partial users, namely the users included in the subset assigned to the computing node 250.
  • Step 2022: Divide the scored items into a plurality of batches.
  • Collect the set of IDs of the scored items in the assigned subset of the training data, recorded as IDset; divide IDset into BatchNum subsets, recorded as IDset[1], ..., IDset[BatchNum].
  • Step 203: The parameter service nodes 230 create and initialize sub-matrices of the N × k-dimensional item matrix, each parameter service node storing one item sub-matrix; N is the number of items.
  • Each column vector of the item matrix corresponds to one item; the column number corresponds to the item's ID, and the column vector indicates the item's weights for the different features.
  • There is no limitation on the execution order among step 201, step 202, and step 203.
  • Step 204: The computing node 250 obtains the vectors corresponding to the scored items from the item sub-matrices stored by the parameter service nodes 230 in batches.
  • In each iterative calculation, the vectors corresponding to IDset[m] are obtained from the parameter service node 230, where the value of m satisfies 1 ≤ m ≤ BatchNum; according to each computing node 250's request for the vectors corresponding to IDset[m] in the item matrix, the parameter service node 230 returns to the computing node 250 the vectors corresponding to IDset[m] in the item matrix.
  • Step 205: Update the vectors of the users in the user sub-matrix who scored the scored items, and calculate the updated values of the vectors corresponding to the scored items in the item sub-matrix.
  • Step 206: The parameter service node 230 updates the locally stored item sub-matrix according to the updated values of the vectors corresponding to the scored items in the item sub-matrix returned by each computing node.
  • The vector v_j corresponding to each item in IDset[m] is updated as follows: v_j ← v_j + (1/Num) · (Δv_j(1) + ... + Δv_j(Num)), where Δv_j(n) is the updated value of v_j returned by the n-th computing node.
  • Num is the number of compute nodes 250 in the distributed computing system 200.
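A sketch of this combination step, assuming the updated values are averaged over the Num computing nodes (the averaging form is an assumption consistent with the definition of Num):

```python
import numpy as np

def merge_updates(v_j, deltas_from_nodes):
    """Combine the updated values of vector v_j returned by Num computing nodes."""
    num = len(deltas_from_nodes)                  # Num
    return v_j + sum(deltas_from_nodes) / num     # v_j <- v_j + (1/Num) * sum

v_j = np.zeros(3)
v_j = merge_updates(v_j, [np.full(3, 0.02), np.full(3, 0.04)])  # Num = 2
```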
  • Step 207: The control node 240 acquires the parameters of the user sub-matrices from each computing node 250 and combines them to form the user matrix, and acquires the parameters of the item sub-matrices from each parameter service node 230 and combines them to form the item matrix.
  • In this way, the matrix decomposition model based on the scores of the different users in the training data is obtained; according to the model, the different users' scores for different items can be calculated, and the items with the highest scores can be selected and recommended to the users.
  • Embodiments of the present invention provide a storage medium, including any type of volatile or non-volatile storage device, or a combination thereof.
  • The non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), etc.; an executable program is stored in the storage medium, and when the executable program is executed by the processor, the following operations are performed:
  • the vector corresponding to the user in the user matrix is initialized, and a user sub-matrix composed of the initialized vector is obtained;
  • the user sub-matrix and the item sub-matrix are iteratively calculated according to a subset of the training data and the item sub-matrix obtained from the parameter service node, and the item sub-matrix obtained after each iterative calculation is transmitted to the corresponding parameter service node;
  • the vectors corresponding to partial items are initialized, and an item sub-matrix composed of the initialized vectors is obtained, the partial items being a part of the items included in the training data;
  • the item sub-matrix stored by the parameter service node is updated according to the item sub-matrix transmitted by the computing node.
  • the scores for the plurality of items included in the training data are divided by user to obtain a plurality of subsets of the training data, and the plurality of subsets are allocated to the at least two computing nodes.
  • the user sub-matrix stored by each computing node is combined to obtain a user matrix
  • the item sub-matrix stored by each parameter service node is combined to obtain an item matrix
  • the target user's score for the target item is obtained according to the product of the corresponding target user's vector in the user matrix and the vector of the corresponding target item in the item matrix.
  • the vector corresponding to the scored item obtained after each iteration calculation is transmitted to the corresponding parameter service node.
  • the vector corresponding to the scored item is obtained in batches from the item sub-matrix stored by the parameter service node;
  • the vector corresponding to the scored item of the corresponding batch obtained after each iteration calculation is transmitted to the corresponding parameter service node.
  • the number of batches is determined according to the memory space of the computing node, wherein the storage space occupied by the vectors corresponding to each batch's scored items is smaller than the memory space of the computing node (see the sketch after this list).
  • the item sub-matrix stored by the parameter service node is updated according to the vector corresponding to the scored item transmitted by the calculation node.
  • the score prediction value is compared with the score actual value included in the subset of the training data to obtain a predicted difference value
  • the product of the prediction difference and the item sub-matrix is superimposed with the user sub-matrix to obtain an updated user sub-matrix
  • the product of the predicted difference and the updated user submatrix is superimposed with the item submatrix to obtain an updated item submatrix.
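An illustrative heuristic, referenced above, for choosing the number of batches from a memory budget (the text states only the constraint, not a formula; everything below is an assumption):

```python
import math

def choose_batch_num(num_scored_items, k, mem_budget_bytes, bytes_per_float=8):
    """Pick BatchNum so that each batch of K-dimensional item vectors fits in memory."""
    total_bytes = num_scored_items * k * bytes_per_float
    return max(1, math.ceil(total_bytes / mem_budget_bytes))

# e.g. 1,000,000 scored items, K=100 features, 256 MB budget for item vectors
print(choose_batch_num(1_000_000, 100, 256 * 2**20))  # -> 3
```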
  • Based on the training data, the distributed computing system can decompose the scoring matrix into the product of the user matrix and the item matrix as shown in FIG. 1, and can calculate, according to the model shown in FIG. 1, a user's scores for different items, which indicate the user's degree of interest in the items; by sorting the scores in descending order, the items of interest to the user can be accurately selected and recommended to the user.
  • FIG. 12 is a schematic diagram of an optional application scenario of the big data platform 100 shown in FIG. 2 according to an embodiment of the present invention. The distributed computing system 200 deployed by the big data platform 100 may employ the architecture of the distributed computing system 200 as shown in FIG.
  • The online shopping system 700 provides page-based access to support user access through a browser or a shopping APP.
  • With the behavior data collection function enabled, behavior data in the following form is collected: user ID, access time, browsed products, purchased products, returned products, and product ratings.
  • The online shopping system 700 opens access to the behavior data to the data collection system 300 of the big data platform 100.
  • The data collection system 300 periodically or irregularly obtains the behavior data of the users accessing the online shopping system 700 and cleans the behavior data, for example by removing malicious scoring data and high scores produced by cheating, and constructs the training data from the scoring data in the user dimension; each record of the training data includes a user ID, product IDs, and product scores.
  • The training data is submitted to the distributed computing system 200 of the big data platform 100 for iterative calculation, and the users' scores for unrated products are predicted, forming a matrix decomposition model as shown in FIG.
  • A user's rating of each product is represented by the product of the vector corresponding to the user in the user matrix and the vector corresponding to the product in the item matrix, and the parameters of the user model and the product model are returned to the online shopping system 700.
  • The online shopping system 700 can then calculate users' ratings of different products according to the matrix decomposition model. For example, when the online shopping system 700 needs to promote a product online, in order to accurately locate the potential consumers of that product, it calculates, according to the matrix decomposition model, a predetermined number of users whose predicted ratings for the product are highest, and pushes the promotion information of the product to those users to achieve precision marketing (a sketch of this top-N selection follows).
  • The above shopping system 700 may also be replaced with an online APP store to accurately recommend APPs of interest to users: the APP store calculates users' ratings (degrees of interest) for different APPs according to the matrix decomposition model and, according to the calculated scores, pushes specific APPs to users. The shopping system 700 may also be a social platform system that recommends contacts of interest to users. Next, recommending contacts to users in a social platform system is described as an example.
  • The social platform system provides page-based access to support user access through a browser or a social platform APP.
  • With the data collection function enabled, the social platform system collects behavior data in the following form: user ID and various behavior data reflecting the similarity between users in the social network (such as publishing original content, comments, and attention information); or collects user data in the following form: gender, age, occupation, location, and the like.
  • The social platform system opens data access to the data collection system of the big data platform; the data collection system periodically or irregularly obtains the behavior data and/or user data of the users accessing the social platform system and cleans the data, for example by removing malicious comments, and constructs training data from the contact rating data in the user dimension; each record of the training data includes a first user ID, a second user ID, and a rating of the second user.
  • The training data is submitted to the distributed computing system 200 of the big data platform 100 for iterative calculation, and, based on the ratings of the rated second users, the first users' ratings of the unrated second users are predicted, forming a matrix decomposition model as shown in FIG.
  • A first user's rating of each second user is represented by the product of the vector corresponding to the first user in the first user matrix and the vector corresponding to the second user in the second user matrix, and the parameters of the first user model and the second user model are returned to the social platform system.
  • The social platform system can then calculate the first users' ratings of different second users according to the matrix decomposition model. For example, when the social platform system needs to recommend friends to a first user, in order to accurately locate the second users to recommend, it calculates, according to the matrix decomposition model, a predetermined number of second users with the highest ratings, and pushes the related information of those second users to the first user to achieve accurate friend recommendation.
  • The user matrix is stored distributed as user sub-matrices and the item matrix is stored distributed as item sub-matrices, which reduces the occupation of a single node's memory space and overcomes the limitation in the related art that a single machine's memory must be able to store the complete user matrix and item matrix, enabling large-scale calculations in distributed computing systems with limited memory.
  • The computing nodes calculate the stored user sub-matrices and the item sub-matrices obtained from the parameter service nodes based on subsets of the training data; on the one hand this reduces the computational complexity of a single node, and on the other hand the parallel computation across computing nodes effectively improves computational efficiency.
  • Because the item matrix and the user matrix are stored distributed as sub-matrices, the volume of item sub-matrix data transmitted between the computing nodes and the parameter service nodes is effectively reduced; the communication overhead of a single node is effectively reduced, eliminating communication bottlenecks, and the transmission efficiency is high, which avoids computing nodes sitting idle waiting for data and improves calculation efficiency.
  • The item matrix is decomposed into multiple item sub-matrices stored distributed across the parameter service nodes, and the item vectors are obtained in batches in each iteration, which solves the computational problem of large-scale matrix decomposition models; the system can be scaled linearly by increasing the numbers of parameter service nodes and computing nodes to support very large-scale calculations.
  • In summary, the distributed computing system in the embodiments of the present invention includes at least two computing nodes and at least two parameter service nodes, wherein: the computing node is configured to initialize, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix composed of the initialized vectors; the computing node is configured to iteratively calculate the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and to transmit the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node; the parameter service node is configured to initialize the vectors corresponding to a part of the items, to obtain an item sub-matrix composed of the initialized vectors, the part of the items being some of the items included in the training data; and the parameter service node is configured to update the item sub-matrix it stores according to the item sub-matrix transmitted by the computing node.
  • In this way, the item matrix and the user matrix are stored distributed as sub-matrices, which reduces the occupation of a single node's memory space and overcomes the limitation in the related art that a single node's memory must be able to store the complete user matrix and item matrix.

Abstract

Disclosed in the present invention are a distributed computing system and method, and a storage medium. The distributed computing system comprises at least two computing nodes and at least two parameter service nodes. The computing nodes initialize, according to the users comprised in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix composed of the initialized vectors; the computing nodes iteratively compute the user sub-matrix and an item sub-matrix according to the subset of the training data and the item sub-matrix acquired from the parameter service nodes, and transmit the item sub-matrix obtained after each iterative computation to the corresponding parameter service nodes; the parameter service nodes initialize the vectors corresponding to a part of the items, to obtain an item sub-matrix composed of the initialized vectors; and the parameter service nodes update, according to the item sub-matrix transmitted by the computing nodes, the item sub-matrix stored by the parameter service nodes.

Description

Distributed computing system, method and storage medium

Cross-reference to related applications

This application is based on, and claims priority to, Chinese patent application No. 201710327494.8, filed on May 10, 2017, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to computer technology, and in particular, to a distributed computing system, method, and storage medium.
Background

Artificial intelligence has developed rapidly and is widely applied in various industries. Taking the application scenario of product recommendation as an example: according to users' behavior data, a machine learning method is used to train a model that predicts users' ratings of different products, so that the ranking of a user's ratings of different products can be calculated and the products with high ratings can be selected and recommended to the user, helping users quickly locate products of interest and achieving accurate and efficient product marketing.

For example, current product recommendation relies on big data processing technology: the massive collected behavior data needs to be analyzed and processed to train a model with rating prediction capability, which places high demands on the resources (including memory resources, communication resources, and the like) of the computing system that undertakes the training task.

However, in the computing systems provided by the related art, the resources of a single node are limited, and upgrades of a computing system tend to lag behind demand. The contradiction between the limited resources of a single node and the high resource overhead required by model training has become a difficult technical problem to solve.
Summary of the invention

Embodiments of the present invention are directed to providing a distributed computing system, method, and storage medium capable of completing computing tasks in a resource-efficient manner.

The technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a distributed computing system, including:

at least two computing nodes and at least two parameter service nodes; wherein,

the computing node is configured to initialize, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix composed of the initialized vectors;

the computing node is configured to iteratively calculate the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and to transmit the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node;

the parameter service node is configured to initialize the vectors corresponding to a part of the items, to obtain an item sub-matrix composed of the initialized vectors, the part of the items being some of the items included in the training data;

the parameter service node is configured to update the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node;

wherein the user sub-matrices stored by the computing nodes are used to be combined into the user matrix, and the item sub-matrices stored by the parameter service nodes are used to be combined into the item matrix; and

the vector corresponding to a target user in the user matrix and the vector corresponding to a target item in the item matrix are used to obtain the target user's score for the target item.
In a second aspect, an embodiment of the present invention provides a distributed computing method, applied to a distributed computing system including at least two computing nodes and at least two parameter service nodes, the method including:

the computing node initializes, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix composed of the initialized vectors;

the computing node iteratively calculates the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmits the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node;

the parameter service node initializes the vectors corresponding to a part of the items, to obtain an item sub-matrix composed of the initialized vectors, the part of the items being some of the items included in the training data;

the parameter service node updates the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node;

wherein the user sub-matrices stored by the computing nodes are used to be combined into the user matrix, and the item sub-matrices stored by the parameter service nodes are used to be combined into the item matrix; and

the vector corresponding to a target user in the user matrix and the vector corresponding to a target item in the item matrix are used to obtain the target user's score for the target item.
In a third aspect, an embodiment of the present invention provides a storage medium storing an executable program which, when executed by a processor, implements the following operations:

when in the computing node mode, initializing, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix composed of the initialized vectors;

when in the computing node mode, iteratively calculating the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmitting the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node;

when in the parameter service node mode, initializing the vectors corresponding to a part of the items, to obtain an item sub-matrix composed of the initialized vectors, the part of the items being some of the items included in the training data;

when in the parameter service node mode, updating the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node.
The embodiments of the present invention have the following beneficial effects:

1) The item matrix and the user matrix are stored distributed as sub-matrices, which reduces the occupation of a single node's memory space, overcomes the limitation in the related art that a single node's memory must be able to store the complete user matrix and item matrix, and enables large-scale calculations in distributed computing systems with limited memory resources.

2) The communication overhead of a single node is effectively reduced, eliminating network bandwidth bottlenecks caused by communication overhead, which facilitates balancing the network communication load, avoids computing nodes sitting idle waiting for data, and improves calculation efficiency.

3) Multiple computing nodes iteratively calculate the stored user sub-matrices and the item sub-matrices based on subsets of the training data; on the one hand, the reduced computational complexity lowers the computing-resource overhead of a single node, and on the other hand, the parallel computation across computing nodes effectively improves computational efficiency.
Brief description of the drawings

FIG. 1 is an optional schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention;

FIG. 2 is an optional structural diagram of a big data platform according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention;

FIG. 4 is an optional architecture diagram of a distributed computing system 200 according to an embodiment of the present invention;

FIG. 5 is an optional structural diagram of a distributed computing system 200 according to an embodiment of the present invention;

FIG. 6 is an optional processing diagram of the distributed computing system 200 shown in FIG. 5 when used for model training according to an embodiment of the present invention;

FIG. 7 is an optional processing diagram of the distributed computing system 200 shown in FIG. 5 when used for model training according to an embodiment of the present invention;

FIG. 8-1 is an optional schematic diagram of transferring the parameters of the item matrix between a parameter service node and a computing node according to an embodiment of the present invention;

FIG. 8-2 is an optional schematic diagram of transferring the parameters of the item matrix between a parameter service node and a computing node according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a computing node transferring the item matrix with parameter service nodes in batches according to an embodiment of the present invention;

FIG. 10 is a schematic flowchart of a distributed computing method according to an embodiment of the present invention;

FIG. 11 is an optional schematic flowchart of training a model for predicting scores according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of an optional application scenario of the big data platform 100 shown in FIG. 2 according to an embodiment of the present invention.
Detailed description

The present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Before the present invention is described in further detail, the nouns and terms involved in the embodiments of the present invention are explained; they are subject to the following interpretations.
1) Behavior data: includes users (described by identification information such as serial numbers), the items on which users produce scoring behavior (such as products, articles, and applications, which may likewise be described by serial numbers), and the users' degrees of interest in the items (also referred to herein as scores); the behavior data of multiple users constitutes a behavior data set (also referred to herein as training data). Taking online products as an example, the scoring behavior includes browsing a product, favoriting a product, purchasing a product, and commenting on a product.

2) Model, i.e., the matrix decomposition model, also known as the Latent Factor Model (LFM): used to initialize the scoring matrix, decomposing the scoring matrix that represents the training data into a model of the product of the user matrix and the item matrix.

3) Matrix Factorization (MF): the training data is represented by a scoring matrix Y. Assuming the scoring data involves M users' scores of N different items, each row vector of the scoring matrix Y corresponds to one user's scores of different items, and each column vector of the scoring matrix Y corresponds to the scores of one item given by different users. The matrix decomposition model is used to initialize the scoring matrix, that is, features of K (a preset value) dimensions are introduced into the scoring matrix, so that the scoring matrix Y is initialized according to the matrix decomposition model as the product of a user-feature matrix (user matrix for short) U and a feature-item matrix (item matrix for short) V.

Since the training data is users' behavior data, in practice it is impossible to collect a user's scores of all items; the missing values in the scoring matrix are therefore predicted, that is, users' scores of unrated items are predicted. Through the matrix decomposition model, the problem of predicting missing values is converted into the problem of solving for the parameters of the user matrix and the parameters of the item matrix, that is, solving for the parameter vectors of the user matrix in K dimensions and the parameter vectors of the item matrix in K dimensions.
For example, referring to FIG. 1, FIG. 1 is an optional schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention. For given training data (including all users, all items, and the score of each item on which a user produced scoring behavior), the behavior data is modeled using the latent factor model to obtain the model shown in FIG. 1 (assuming the behavior data contains 3 users' scores of 4 items, decomposed into a user matrix, representing the 3 users' degrees of interest in features of 3 dimensions, and an item matrix, representing the weights of the 4 items in the features of 3 dimensions).

Taking user 1's score $y_{11}$ for item 1 as an example, it can be expressed as the product of the row vector $(u_{11}, u_{12}, u_{13})$ corresponding to user 1 in the user matrix and the column vector $(q_{11}, q_{21}, q_{31})$ corresponding to item 1 in the item matrix.
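As a numeric illustration of this decomposition (the values below are invented for the example, not taken from FIG. 1):

```python
import numpy as np

# 3 users x 3 features, and 3 features x 4 items, so U @ V is a 3 x 4 score matrix
U = np.array([[1.0, 0.5, 2.0],
              [0.0, 1.5, 1.0],
              [2.0, 1.0, 0.0]])
V = np.array([[1.0, 0.0, 2.0, 1.0],
              [2.0, 1.0, 0.0, 0.5],
              [0.5, 2.0, 1.0, 1.0]])

Y = U @ V
# y_11 = (u_11, u_12, u_13) . (q_11, q_21, q_31) = 1*1 + 0.5*2 + 2*0.5 = 3.0
assert Y[0, 0] == 1.0 * 1.0 + 0.5 * 2.0 + 2.0 * 0.5
```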
4) Training, i.e., model training: the parameters of the model are iteratively calculated using the training data, that is, the parameters $u_{ik}$ of the user matrix U and the parameters $v_{kj}$ of the item matrix V are iteratively calculated until an iteration stop condition is met, for example, the iterative calculation reaches a predetermined number of times or the parameters converge.

5) Distributed computing: the training data is decomposed into multiple subsets and allocated to multiple computing nodes in a distributed computing system; the computing nodes calculate the parameters of the model in parallel based on the allocated subsets of the training data. Since the computing task is distributed across multiple computing nodes, distributed computing can expand the computing scale and improve training efficiency.

6) Parameter service node architecture: a distributed computing system implementing a machine-learning architecture through distributed computing, mainly composed of parameter service nodes (PS, Parameter Server) and computing nodes (Workers), with at least two nodes of each kind.

7) Parameter service node: the distributed computing system includes at least two parameter service nodes, and each parameter service node may be implemented by one or more servers. A parameter service node is responsible for storing and updating the parameters of a sub-matrix of the item matrix (hereinafter referred to as an item sub-matrix), and provides the computing nodes with services for reading and updating the parameters of the item matrix.

8) Computing node: each computing node may be implemented by one or more servers, and the parameter service node architecture includes multiple computing nodes. Each computing node is allocated a subset of the training data (the subset includes the behavior data of some of the users) and obtains the parameters of the item matrix from the parameter service nodes (the parameter service nodes always store the latest parameters of the item matrix). Using the training data, the computing node updates the parameters in the user matrix corresponding to those users, calculates the update values of the parameters of the part of the item matrix involved (that is, the items on which those users produced scoring behavior), and then transmits the update values of the parameters of the item matrix to the parameter service nodes; a parameter service node combines the update values of the parameters transmitted by the computing nodes to update the item sub-matrix it stores locally. A sketch of this workflow follows.
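The per-iteration workflow of a computing node can be sketched as follows; this is a simplified single-process sketch in which `ps_pull` and `ps_push` are hypothetical stand-ins for the parameter service nodes' read and update services:

```python
import numpy as np

def worker_iteration(U_part, ratings, ps_pull, ps_push, alpha=0.01):
    """One iteration on one computing node.

    U_part  : local user sub-matrix, dict of user_id -> K-dim vector
    ratings : allocated training-data subset, list of (user_id, item_id, score)
    ps_pull : callable(item_ids) -> dict of item_id -> K-dim vector
    ps_push : callable(dict of item_id -> accumulated update vector)
    """
    item_ids = {j for _, j, _ in ratings}
    V_part = ps_pull(item_ids)                 # latest item vectors from the PS nodes
    V_updates = {j: np.zeros_like(v) for j, v in V_part.items()}
    for i, j, y in ratings:
        e = y - U_part[i] @ V_part[j]          # prediction difference
        U_part[i] = U_part[i] + 2 * alpha * e * V_part[j]  # update user vector locally
        V_updates[j] += 2 * alpha * e * U_part[i]          # accumulate item update
    ps_push(V_updates)                         # PS nodes merge the item updates
```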
9) Spark: a distributed computing architecture for model training implemented based on Map-Reduce nodes, involving map nodes and reduce nodes; the map nodes are responsible for filtering and distributing data, and the reduce nodes are responsible for calculating and merging data.
Big data platforms are widely used to process users' behavior data collected across industries, performing data cleaning and filtering when necessary and then building a matrix decomposition model from the behavior data to predict users' scores of different items; the scores reflect the users' degrees of interest in the items. In a business scenario of item recommendation, recommending items to users in descending order of score can support targeted production/marketing activities, achieving high efficiency and cost savings in production/marketing.

The training that produces the above model is now described. As an example of training a model based on training data, refer to FIG. 2, which is an optional structural diagram of the big data platform provided by an embodiment of the present invention, involving a distributed computing system 200, a data collection system 300, a real-time computing system 400, an offline computing system 500, and resource scheduling 600, which are described separately below.

The data collection system 300 is configured to collect the training data for training the model (for example, for item recommendation, the training data may include all users, all items, and lists of the items on which users produced various online behaviors such as browsing, purchasing, following, and adding to a shopping cart) and to perform appropriate processing. It can be understood that, for the training data, appropriate processing may include data cleaning and filtering to remove noise data (such as apparently untrue data whose values fall outside a predetermined interval) and data beyond its validity period (such as data collected six months ago), and to make the training data conform to a desired distribution.

In an optional embodiment of the present invention, for the use of users' various behavior data, mechanisms for user authorization and application authorization are provided to protect privacy.
The distributed computing system 200 is configured to train the model by iteratively calculating the parameters of the model according to the training data until an iteration stop condition is met.

The real-time computing system 400 is configured to enable the distributed computing system 200 to train the machine learning model in a real-time manner (also referred to as an online manner): when one record or a batch of records of the training data (each record corresponds to one user and includes the user's scores of different objects) is submitted to the distributed computing system 200, the distributed computing system 200 loads the received record(s) in memory in real time, performs training, and calculates the updated parameters of the model in real time according to the training result (for example, the degree of difference between the real values and the predicted values of the scores).

The offline computing system 500 is configured to enable the distributed computing system 200 to train the model in an offline manner: the distributed computing system 200 loads newly received training data together with all received historical training data in memory to iteratively calculate the updated parameters of the model.

The resource scheduling 600 is configured to allocate computing resources, such as Central Processing Units (CPUs) and Graphics Processing Units (GPUs), to the above systems, and to allocate bandwidth resources for communication and the like.

As far as training a model with the distributed computing system 200 is concerned, taking the aforementioned model for scoring as an example, users' scores of different items need to be collected to form scoring data of users for different items. An example of the scoring data is shown in Table 1 below:
          Item 1   Item 2   Item 3   Item 4   ...
User 1    4        5        1                 ...
User 2                      3        4        ...
User 3             2                          ...
User 4    5                 1        1        ...
User 5                                        ...
User 6    3        4                          ...
...       ...      ...      ...      ...      ...

Table 1  User-item scoring data
For the scoring data shown in Table 1, a scoring matrix composed of all users, all items, and the users' scores of different items can be established based on the scoring data; of course, missing values inevitably exist in the scoring matrix. The scoring matrix is initialized according to the matrix decomposition model, that is, expressed as the product of a user-feature matrix and a feature-item matrix (item matrix for short). A construction sketch follows.
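A minimal sketch of building such a scoring matrix from records like those in Table 1 (the record values mirror the table above; missing scores stay 0):

```python
import numpy as np

M, N = 6, 4                      # users x items, as in Table 1
Y = np.zeros((M, N))             # 0 marks a missing value

# (user, item, score) records collected from the behavior data
records = [(0, 0, 4), (0, 1, 5), (0, 2, 1),
           (1, 2, 3), (1, 3, 4),
           (2, 1, 2),
           (3, 0, 5), (3, 2, 1), (3, 3, 1),
           (5, 0, 3), (5, 1, 4)]
for u, i, s in records:
    Y[u, i] = s
```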
As an example of scoring matrix decomposition, referring to FIG. 3, FIG. 3 is a schematic diagram of decomposing a scoring matrix into a user matrix and an item matrix according to a matrix decomposition model according to an embodiment of the present invention. Assuming the scoring data involves M users' scores of N items, when the scoring matrix Y is used to represent the scoring data, the dimension of Y is M×N. The matrix decomposition model is used to initialize the scoring matrix, that is, features of K dimensions are introduced into the scoring matrix, thereby decomposing the scoring matrix Y into the form of the product of a user-feature matrix (user matrix for short) U and a feature-item matrix (item matrix for short) V, namely:
$Y_{M \times N} \approx U_{M \times K} \times V_{K \times N}$    (1)
The dimension of Y is M×N, and $y_{ij}$ denotes the i-th user's score of the j-th item, expressed as:

$y_{ij} = \sum_{k=1}^{K} u_{ik} v_{kj}$    (2)
where $u_{ik}$ denotes user i's score of feature k, $v_{kj}$ denotes the weight of item j in feature k, k takes values $1 \leq k \leq K$, and i and j are positive integers with $1 \leq i \leq M$ and $1 \leq j \leq N$.

According to the matrix decomposition model, the scoring matrix Y is initialized as the product of the user matrix U and the item matrix V. The dimension of the user matrix U is M×K, and its row vector $u_i$ is a K-dimensional vector corresponding to user i's scores of the features of the K dimensions. The dimension of the item matrix V is K×N, and each column corresponds to a K-dimensional column vector $v_j$ representing the weights of item j in the K dimensions. K is the dimension of the features specified during matrix decomposition, and user i's score $y_{ij}$ of item j is the product of $u_i$ and $v_j$.

The scoring data actually collected from users involves a large number of items, and each user tends to score only some of the items, so the scoring matrix is sparse, that is, the values of some elements of the scoring matrix are missing (represented by 0); these are called missing values. According to the above formula (2), the missing values in the scoring matrix can be predicted, thereby converting the prediction of missing values into the problem of solving for the parameters $u_{ik}$ of the user matrix U and the parameters $v_{kj}$ of the item matrix V, that is, solving for the parameter vectors $u_i$ of the user matrix U in K dimensions and the parameter vectors $v_j$ of the item matrix V in K dimensions.
For example, the product of the user vector $u_i$ and the item vector $v_j$ is used as the predicted value of user i's score of item j, denoted as:

$\hat{y}_{ij} = u_i \cdot v_j$

The real value of user i's score of item j is $y_{ij}$, and the difference between the predicted value and the real value is denoted $e_{ij}$, that is:

$e_{ij} = y_{ij} - u_i \cdot v_j$    (3)

Then, the problem of solving for the model parameters is converted into the problem of minimizing $e_{ij}$. Based on this, an objective function is used to represent the gap between the model's predicted values of the scores and the real values; the objective function is shown in formula (4):

$\min \sum_{(i,j)} e_{ij}^{2} = \sum_{(i,j)} \left( y_{ij} - u_i \cdot v_j \right)^{2}$    (4)

To prevent the model from overfitting the training data, a regularization term is introduced into the objective function, as shown in formula (5):

$\min \sum_{(i,j)} \left( y_{ij} - u_i \cdot v_j \right)^{2} + \frac{\beta}{2} \left( \| U \|^{2} + \| V \|^{2} \right)$    (5)

where $\beta/2$ is the weight of the regularization term. Since user i's score $y_{ij}$ of item j is the product of $u_i$ and $v_j$ decomposed into K dimensions, the objective function of the matrix decomposition algorithm can be expressed as:

$\min \sum_{(i,j)} \left( y_{ij} - \sum_{k=1}^{K} u_{ik} v_{kj} \right)^{2} + \frac{\beta}{2} \sum_{(i,j)} \left( \sum_{k=1}^{K} u_{ik}^{2} + \sum_{k=1}^{K} v_{kj}^{2} \right)$    (6)
The process of iteratively training the model is converted into the process of solving for the values of $u_{ik}$ and $v_{kj}$ (that is, the parameters) that make the above objective function converge. For example, the gradient descent method is applied to the above objective function, that is, $u_{ik}$ and $v_{kj}$ are solved by converging along the direction of the negative gradient of the objective function, giving the update formulas for $u_{ik}$ and $v_{kj}$:

$u_{ik} \leftarrow u_{ik} + 2\alpha e_{ij} v_{kj}$    (7.1)

$v_{kj} \leftarrow v_{kj} + 2\alpha e_{ij} u_{ik}$    (7.2)
where α is the step size, representing the learning rate. In practical applications, reaching a predetermined number of training iterations, or the value of the objective function falling below a predetermined value (that is, the objective function converging), is used as the stop condition of the iterative training, after which the parameters of the trained model are output; according to the parameters, combined with formula (2), users' scores of different items can be calculated and a certain number of the highest-scoring items selected for recommendation. An end-to-end sketch follows.
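The iterative training described by formulas (7.1) and (7.2) can be sketched as follows; this is a single-machine, illustrative sketch with assumed hyperparameters (the distributed variant, which partitions U and V across nodes, is described below):

```python
import numpy as np

def train_mf(records, M, N, K=10, alpha=0.01, max_iters=100, tol=1e-4):
    """records: list of (i, j, y_ij) observed scores."""
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(M, K))
    V = rng.normal(scale=0.1, size=(K, N))
    prev_loss = np.inf
    for _ in range(max_iters):
        for i, j, y in records:
            e = y - U[i] @ V[:, j]
            U[i]    = U[i]    + 2 * alpha * e * V[:, j]   # formula (7.1)
            V[:, j] = V[:, j] + 2 * alpha * e * U[i]      # formula (7.2)
        loss = sum((y - U[i] @ V[:, j]) ** 2 for i, j, y in records)
        if prev_loss - loss < tol:      # objective has converged
            break
        prev_loss = loss
    return U, V
```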
Referring to FIG. 4, FIG. 4 is an optional architecture diagram of the distributed computing system 200 according to an embodiment of the present invention, which implements distributed matrix decomposition and training using the Map-Reduce distributed architecture. The model is stored in a driver node 210, which may be implemented by one server (or multiple servers), and each executor node may likewise be implemented by one server (or multiple servers). After the driver node 210 transmits the item matrix and the user matrix to the executor nodes 220, each executor node 220 performs training according to the received user matrix and item matrix, calculates the update values of the parameters of the model, and then transmits them to the driver node 210; the driver node 210 combines the update values of the parameters transmitted by all executor nodes 220, updates the parameters of the locally stored model, and then broadcasts all the parameters of the model to all executor nodes 220.

It can be seen that the following problems exist:

1) The matrix decomposition model easily reaches a very large scale. Taking the training data provided by the Netflix site as an example, which involves 17,771 items and 480,000 users, when K = 1000 the dimension of the model is as high as 5×10⁸. The Spark distributed computing architecture maintains all the parameters of the model on a single driver node, and the physical limit of the driver node's memory makes it impossible to train complex models.
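The quoted figure follows directly from the model-scale formula stated later in this document, as a worked check:

$\text{model size} = (\text{users} + \text{items}) \times K = (480000 + 17771) \times 1000 = 497{,}771{,}000 \approx 5 \times 10^{8}$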
2) In the map/reduce process for training the model, each executor node transmits the parameters of the model to the driver node, and the driver node aggregates them and broadcasts them to all executor nodes, resulting in a large communication overhead between the driver node and the executor nodes; the driver node communicating with multiple executor nodes encounters bandwidth bottlenecks, and the time consumed transmitting the update values of the model parameters leads to low communication efficiency.
In view of the above problems, an optional embodiment of the present invention provides a distributed computing architecture based on parameter service nodes: the training data is decomposed in the user dimension into subsets, the model is trained in parallel on multiple computing nodes based on the subsets of the training data, and the parameters of the model calculated by the computing nodes are then combined by the parameter service nodes.

For example, referring to FIG. 5, FIG. 5 is an optional structural diagram of the distributed computing system 200 according to an embodiment of the present invention, involving parameter service nodes 230, a control node 240, computing nodes 250, a scheduling layer 260, and a storage layer 270.
The control node 240 is configured to control the overall operation of the parameter service nodes 230 and the computing nodes 250 to ensure orderly operation, including: dividing the training data into subsets in the user dimension, each subset including some of the users (that is, a part of all the users involved in the training data); allocating the subsets of the training data to the computing nodes 250; and controlling the orderly operation of the computing nodes and the parameter service nodes 230. It can be understood that, in an optional embodiment, the control node 240 may be omitted from the distributed computing system 200 shown in FIG. 5, with the functions of the control node 240 coupled into the parameter service nodes 230.

There are multiple parameter service nodes 230 and multiple computing nodes 250. Each parameter service node 230 is configured to store a sub-matrix of the item matrix V (hereinafter referred to as an item sub-matrix); each computing node 250 is configured to store a sub-matrix of the user matrix U (hereinafter referred to as a user sub-matrix) and, according to the item sub-matrix obtained from the parameter service nodes 230 combined with the allocated subset of the training data, to iteratively calculate the update values of the parameters of the stored user sub-matrix and of the obtained item sub-matrix, returning the update values of the parameters of the item sub-matrix (or, of course, the updated parameters directly) to the corresponding parameter service node 230 after each iteration is completed.
The scheduling layer 260 is an abstract representation of the scheduling functions of the distributed computing system 200, involving the allocation of computing resources (such as CPUs and GPUs) of the control node 240, the parameter service nodes 230, and the computing nodes 250, and the allocation of communication resources for communication among the control node 240, the parameter service nodes 230, and the computing nodes 250.

The storage layer 270 is an abstract representation of the storage resources of the distributed computing system 200, involving the memory resources and non-volatile storage resources of the above nodes.

It can be understood that the distributed computing system 200 shown in FIG. 5 may be implemented by a cluster of servers; the servers in the cluster may be separated in physical location or deployed in the same physical location, and are connected by various communication means such as optical cables and electrical cables.

The nodes shown in FIG. 5 may be in one-to-one correspondence with the servers in the cluster; of course, multiple nodes may also be deployed on one server according to the server's actual processing capability. In particular, given differences in hardware and software among the servers in the cluster, in an optional embodiment of the present invention a virtual machine environment may be set up in the cluster and the nodes shown in FIG. 5 deployed in that environment, which facilitates rapid deployment and migration of nodes.
The training of the model for scoring by the distributed computing system 200 shown in FIG. 5 is now described. Referring to FIG. 6, FIG. 6 is an optional processing diagram of the distributed computing system 200 shown in FIG. 5 configured for model training according to an embodiment of the present invention (part of the structure in FIG. 5 is omitted), showing a distributed computing architecture based on parameter service nodes, involving multiple parameter service nodes 230 and multiple computing nodes 250, which are described separately.

The parameter service nodes 230 are configured to store the item matrix V: each parameter service node 230 stores an item sub-matrix composed of the vectors of a corresponding part of the items in the item matrix V, denoted V-part. The items corresponding to the item sub-matrices stored by different parameter service nodes 230 are different, and the union of the items corresponding to the item sub-matrices stored by all parameter service nodes 230 is all the items involved in the training data.

Since the sub-matrix stored by each parameter service node 230 corresponds to only a part of the items, the technical effect of adaptively adjusting the scale of the items in the model can be achieved by adjusting the number of parameter service nodes 230, which facilitates adjusting the scale of the parameter service nodes 230 in the distributed computing system 200 according to business needs.

For example, when the scale of the items needs to be expanded, the number of parameter service nodes 230 in the distributed computing system 200 can be increased, with the newly added parameter service nodes 230 responsible for storing the vectors of the newly added items in the item matrix V; similarly, when it is no longer necessary to predict the scores of certain items, this can be achieved by decommissioning the parameter service nodes 230 that store the corresponding sub-matrices.
The computing nodes 250 are configured to use the allocated subsets of the training data, each subset including the behavior data of some of the users (that is, a part of all the users involved in the training data). During each iterative calculation, a computing node 250 sequentially obtains the parameters of the item sub-matrices from the parameter service nodes 230; for the parameters of the item sub-matrix obtained from any parameter service node 230, combined with the allocated subset, it calculates the updated parameters of the user sub-matrix U-part (that is, the matrix composed of the vectors in the user matrix U corresponding to those users) according to the above update formula (7.1) and updates the user sub-matrix U-part locally; it then calculates the update values of the parameters of the item sub-matrix V-part according to formula (7.2) and transmits them to the parameter service node 230 storing the corresponding item sub-matrix for updating.

It can be understood that, since each computing node 250 processes only the training data of some of the users, the technical effect of adaptively adjusting the user scale can be achieved by adjusting the number of computing nodes 250. For example, when the user scale needs to be expanded, the number of computing nodes 250 in the distributed computing system 200 can be increased, with the newly added computing nodes 250 responsible for storing and calculating the sub-matrices of the user matrix U corresponding to the newly added users; similarly, when it is no longer necessary to predict certain users' scores of items, this can be achieved by decommissioning the computing nodes 250 that store the sub-matrices of the corresponding users.
The implementation of training the model is described below.

The scale of the matrix decomposition model = (number of users + number of items) × K; in practical applications the scale of the model rises to hundreds of millions, or even billions or tens of billions. In the embodiments of the present invention, the distributed computing architecture using parameter service nodes reduces the dimensions of the model stored and calculated by a single computing node, thereby reducing the network communication overhead caused by transmitting model parameters between the computing nodes and the parameter service nodes, improving network transmission efficiency, and supporting linear scaling of the supported model scale by adjusting the numbers of parameter service nodes and computing nodes. This mainly involves the following aspects.
1) Training data division
The training data is processed into the format "user ID, item ID:score, …, item ID:score", that is, all of one user's scores are stored in one record. The training data is then divided (for example, evenly) into multiple subsets in the user dimension, each subset including the records of multiple users, and the subsets are allocated to the computing nodes 250. For example, when the computing power of the computing nodes 250 is balanced, the subsets of the training data are evenly allocated to the computing nodes; or, when the computing power of the computing nodes 250 differs greatly (the ratio of computing power exceeds a ratio threshold), subsets of the training data in corresponding proportions are allocated according to the ratio of computing power. A sketch of this division follows.
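A minimal sketch of this user-dimension division (illustrative only; a real system might hash or range-partition user IDs, or weight the split by node computing power as described above):

```python
def split_by_user(records, num_workers):
    """records: list of strings 'user_id,item:score,...,item:score'
    Returns one list of records per computing node, divided evenly by user."""
    subsets = [[] for _ in range(num_workers)]
    for record in records:
        user_id = int(record.split(",", 1)[0])
        subsets[user_id % num_workers].append(record)  # even, user-disjoint split
    return subsets
```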
2) Model storage
According to the foregoing formulas (7.1) and (7.2), the updates of the item sub-matrix and the user sub-matrix depend on each other. In each iterative calculation, the parameters of the item sub-matrix are first used to calculate the update values of the parameters of the user sub-matrix (it can be understood that, since each iteration applies an update value on top of the current value of a parameter, this description does not distinguish between calculating a parameter's update value and calculating the updated parameter), and the update values of the parameters of the user sub-matrix are then used to calculate the update values of the parameters of the item sub-matrix. Therefore, before an iteration begins, the computing node needs to obtain the parameters of the item sub-matrix from the parameter service node over the network, and after the iteration ends, the computing node needs to transmit the update values of the parameters of the item sub-matrix to the parameter service node over the network.
In most application scenarios, the number of users involved in the training data far exceeds the number of items; taking the Netflix training data as an example, the number of users involved is 27 times the number of items. Therefore, in order to reduce the communication overhead caused by transmitting parameters between the computing nodes 250 and the parameter service nodes 230, the item sub-matrices are stored by the parameter service nodes 230 while the user sub-matrices are stored and computed by the computing nodes 250. In this way, in each iterative calculation, when a computing node 250 calculates the update values of the parameters of its user sub-matrix, it only needs to obtain the parameters of the item sub-matrices from the parameter service nodes 230; after the iterative calculation ends, it returns the updated parameters of each item sub-matrix to the parameter service node 230 that stores that item sub-matrix, and the parameter service node 230 updates the item sub-matrix.
It can be seen that only the parameters of the item matrix need to be transmitted between the parameter service nodes 230 and the computing nodes 250; the user matrix U does not need to be transmitted. Since V is orders of magnitude smaller than U, this significantly reduces the communication overhead between the parameter service nodes 230 and the computing nodes 250.
3) Model calculation
From the update formula (7.1) for the component u_ik in dimension k of the feature vector u_i in the user matrix, the calculation of a parameter depends only on the corresponding user's ratings, and the vectors corresponding to different users in the user matrix are independent of one another. Therefore, the user matrix U is partitioned by user into multiple sub-matrices that are stored on the corresponding computing nodes 250, and each computing node 250 uses its assigned training data to calculate the update values of the parameters of the user sub-matrix it stores. The dimensions of a user sub-matrix are: (number of users involved in the training data assigned to the computing node 250) × K.
Taking the gradient descent method of solving for the parameters as an example: first, the control node 240 partitions the training data and assigns a subset of the training data to each computing node 250; the user matrix U and the item matrix V are initialized; training then proceeds over multiple iterations, and in each iteration each computing node 250 performs the following operations in parallel.
Referring to FIG. 7, which is an optional processing diagram of the distributed computing system 200 shown in FIG. 5 when configured for model training: the computing node 250 obtains from each parameter service node 230 the parameters of the item sub-matrix stored by that parameter service node 230, calculates the updated parameters of the locally stored user sub-matrix U-part according to the foregoing formula (7.1), then calculates the update values of the parameters of the item sub-matrix according to formula (7.2) and transmits them to the parameter service node 230 that stores the corresponding item sub-matrix, which updates its locally stored item sub-matrix.
When a computing node 250 calculates the update values of the vectors of items in the item sub-matrix, the result depends only on the users' ratings of those items, and the subset of the training data assigned to the computing node 250 may include ratings for only some of the items in the item sub-matrix. Consequently, only the update values that move the vectors of the rated items in the item sub-matrix along the maximum gradient can be calculated; the gradient calculated for unrated items is 0, which is equivalent to no update.
In view of this, in an optional embodiment of the present invention, when the computing node 250 obtains an item sub-matrix from a parameter service node 230, it may obtain only the vectors in the stored item sub-matrix that correspond to rated items, denoted V-sub. According to formula (7.1), combining the assigned subset of the training data with the vectors corresponding to the rated items in the item sub-matrix, the computing node calculates the update values of the vectors corresponding to some of the users in the locally stored user sub-matrix, where these users are the ones who produced ratings for the rated items in the item sub-matrix;
According to formula (7.2), combining the update values of those users' vectors in the user sub-matrix, the computing node calculates the update values of the vectors corresponding to the rated items in the item sub-matrix, and returns the update values of the rated items' vectors to the parameter service node 230 (that is, the parameter service node 230 that stores the corresponding item sub-matrix). Since the vectors corresponding to unrated items no longer need to be transmitted, the communication overhead of transmitting the vectors of unrated items is saved.
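A brief sketch of this filtered exchange, assuming hypothetical `pull_rows`/`push_rows` calls that accept an explicit list of item IDs (the embodiment does not prescribe this interface), and an `update_fn` standing in for the formula (7.1)/(7.2) computation:

```python
def rated_item_ids(ratings, server_item_ids):
    """IDs of items that are both rated in this node's subset and
    stored on the given parameter service node."""
    rated = {j for (_, j) in ratings}
    return sorted(rated & set(server_item_ids))

def exchange_rated_only(server, ratings, update_fn):
    ids = rated_item_ids(ratings, server.item_ids)
    V_sub = server.pull_rows(ids)          # only vectors of rated items travel
    delta_V_sub = update_fn(V_sub, ids)    # formulas (7.1)/(7.2) against V-sub
    server.push_rows(ids, delta_V_sub)     # unrated vectors are never transmitted
```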
For example, referring to FIG. 8-1, which is an optional schematic diagram of the transmission of item matrix parameters between parameter service node 1 and the computing nodes according to an embodiment of the present invention: suppose the distributed computing system has 4 computing nodes, computing node 1 to computing node 4 are assigned different subsets of the training data, and the user sub-matrices they store are U_part1, U_part2, U_part3 and U_part4, respectively. When computing nodes 1 to 4 obtain the parameters of the item sub-matrix V_part1 from parameter service node 1, each obtains from parameter service node 1 the vectors in V_part1 corresponding to the items rated in its own subset.
Taking computing node 1 as an example: it determines the rated items in its assigned subset of the training data and obtains from the parameter service node the vectors corresponding to those rated items. Taking parameter service node 1 as an example, the obtained vectors of the rated items in the item sub-matrix V_part1 are denoted V_part1-sub1. According to formula (7.1), combining the assigned subset of the training data with V_part1-sub1, computing node 1 calculates the update values of the parameters of U_part1; for example, when calculating the update values of the vectors corresponding to some of the users in U_part1, those users are the ones who produced ratings for the rated items. According to formula (7.2), combining the update values of those users' vectors in U_part1, it calculates the update value of V_part1-sub1, denoted ΔV_part1-sub1, and transmits ΔV_part1-sub1 to parameter service node 1. Parameter service node 1 updates its locally stored item sub-matrix according to the update values returned by the computing nodes (including ΔV_part1-sub1 returned by computing node 1, ΔV_part1-sub2 returned by computing node 2, ΔV_part1-sub3 returned by computing node 3, and ΔV_part1-sub4 returned by computing node 4).
Only parameter service node 1 is shown in FIG. 8-1; the distributed computing system has at least 2 parameter service nodes. Taking the case where it further includes a parameter service node 2 storing the item sub-matrix V_part2 as an example, then, referring to FIG. 8-2, computing nodes 1 to 4 correspondingly obtain from parameter service node 2 the vectors of the rated items in V_part2, denoted V_part2-sub1, V_part2-sub2, V_part2-sub3 and V_part2-sub4, and perform the iterative calculation. Likewise, parameter service node 2 updates its locally stored item sub-matrix V_part2 according to the update values returned by the computing nodes (including ΔV_part2-sub1 returned by computing node 1, ΔV_part2-sub2 returned by computing node 2, ΔV_part2-sub3 returned by computing node 3, and ΔV_part2-sub4 returned by computing node 4).
For the distributed computing system 200 shown in FIG. 7, when the number of items involved in the training data assigned to a computing node 250 and the value of K are large enough that the model exceeds a predetermined scale (for example, the model size reaches the billions), the storage space required for the V-sub matrix may still exceed the memory of a single computing node 250.
In this case, since the vectors of the items in the item matrix are independent of one another, a scheme of updating the V-sub matrix in batches can be adopted, so that the parameters transmitted in each batch occupy less than the memory of the computing node 250, ensuring that the computing node 250 has sufficient memory to calculate the update values of the parameters.
In an optional embodiment of the present invention, the computing node 250 obtains the parameters of the V-sub matrix from the parameter service node 230 in batches: according to the rated items in its assigned subset of the training data, it obtains from the parameter service node 230, batch by batch, the vectors corresponding to a portion of the rated items in V-sub; according to formula (7.1), combining the vectors of the rated items obtained in each batch with the assigned subset of the training data, it calculates the update values of the parameters of the stored user sub-matrix; according to formula (7.2), combining the update values of the parameters of the user sub-matrix, it calculates the update values of the vectors corresponding to the rated items and transmits them to the corresponding parameter service node 230, which updates the vectors of the rated items in its locally stored item sub-matrix.
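A sketch of this batched variant follows, again with the hypothetical `pull_rows`/`push_rows` calls; the batch size is derived from a memory budget on the assumption of k-dimensional float64 vectors (8 bytes per entry).

```python
import math

def iterate_in_batches(server, rated_ids, mem_budget_bytes, k, update_fn):
    """Fetch and update V-sub in batches small enough to fit the
    computing node's memory, per the scheme described above."""
    batch_size = max(1, mem_budget_bytes // (8 * k))   # vectors per batch
    num_batches = math.ceil(len(rated_ids) / batch_size)
    for b in range(num_batches):
        ids = rated_ids[b * batch_size:(b + 1) * batch_size]
        V_sub = server.pull_rows(ids)                  # one batch of rated items
        server.push_rows(ids, update_fn(V_sub, ids))   # return this batch's updates
```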
For example, referring to FIG. 9, which is a schematic diagram of a computing node exchanging the item matrix with a parameter service node in batches according to an embodiment of the present invention: in FIG. 9, the training data involves the ratings of N items by M users, the training data is divided into subsets that are evenly assigned to 4 computing nodes, and the 4 computing nodes correspondingly store sub-matrices of the initialized user matrix, denoted U_part1, U_part2, U_part3 and U_part4;
Each computing node performs the following operations in parallel: it divides the rated items in its assigned subset into 2 batches; in each iterative calculation, it obtains from the item sub-matrix stored on the parameter service node the vectors corresponding to one batch of rated items, denoted V-sub; according to formula (7.1), combining V-sub with the assigned subset of the training data, it calculates the update values of the vectors corresponding to some of the users in the user sub-matrix (that is, the users who produced ratings for the rated items); then, according to formula (7.2), combining the update values of those users' vectors in the user sub-matrix, it calculates the update values of the vectors corresponding to the rated items in the item sub-matrix and transmits them to the parameter service node, which updates its locally stored item matrix.
Transmitting the parameters of the item sub-matrix between the computing node and the parameter service node in batches avoids the situation where transmitting all the parameters of the item sub-matrix at once would exhaust the memory resources of the computing node, effectively avoiding the heavy memory overhead on a single computing node when training a large-scale model.
The computational implementation of model training in the distributed computing system provided by the foregoing embodiments of the present invention is described below. Referring to FIG. 10, FIG. 10 shows a distributed computing method according to an embodiment of the present invention, applied to a distributed computing system including at least two computing nodes and at least two parameter service nodes; the method includes:
Step 101: the computing node initializes, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, obtaining a user sub-matrix formed by the initialized vectors.
In an optional embodiment of the present invention, the distributed computing system may further include a control node. The control node partitions the training data by user, dividing the rating data for the different items included in the training data into multiple subsets, and assigns the multiple subsets to the computing nodes; for example, the subsets may be distributed evenly, or in proportion to the computing power of the computing nodes.
Step 102: the parameter service node initializes the vectors corresponding to some of the items, obtaining an item sub-matrix formed by the initialized vectors, where those items are a portion of the items included in the training data.
Step 103: the computing node iteratively calculates the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmits the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node.
In an optional embodiment of the present invention, when iteratively calculating the item sub-matrix, the computing node may calculate the update values of the item sub-matrix and transmit them to the corresponding parameter service node (that is, the parameter service node that stores the item sub-matrix as it was before the iterative calculation); the parameter service node calculates the new parameters of the item sub-matrix according to the update values transmitted by the computing node and updates the item sub-matrix it stores locally.
In an optional embodiment of the present invention, the computing node obtains the item sub-matrix in the following manner: according to its assigned subset, it determines the rated items included in the subset, and obtains the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node;
Correspondingly, the computing node iteratively calculates the user sub-matrix and the item sub-matrix as follows: it iteratively calculates the vectors corresponding to some of the users in the user sub-matrix and the vectors corresponding to the rated items in the item sub-matrix, where those users are the users in the subset who have rated the rated items;
After each iterative calculation by the computing node ends, the vectors corresponding to the rated items obtained from the iterative calculation are transmitted to the corresponding parameter service node, which uses them to update the item sub-matrix it stores.
To further reduce the communication overhead of transmitting the item sub-matrix between the computing node and the parameter service node, when the computing node obtains the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node, it may obtain them in batches, and iteratively calculate the vectors corresponding to the users of each batch in the user sub-matrix and the vectors corresponding to the rated items of that batch, where the users of a batch are the users, among the aforementioned users, who have rated the rated items of that batch;
After each iterative calculation ends, the vectors corresponding to the rated items of the respective batch obtained from the calculation are transmitted to the corresponding parameter service node, which uses them to update its locally stored item sub-matrix.
As for how the batches are determined, the computing node determines them according to its memory space, such that the storage space occupied by the vectors corresponding to the rated items of each batch is smaller than the memory space of the computing node, ensuring that the computation has sufficient resources to complete.
It can be seen that, since the vectors corresponding to unrated items in the item sub-matrix need not be transmitted between the computing node and the parameter service node, the communication consumption between them is minimized without affecting the iterative calculation; for the computing node, the time spent waiting on transmissions is further reduced, which in turn improves the efficiency of the iterative calculation.
In an optional embodiment of the present invention, when iteratively calculating the user sub-matrix and the item sub-matrix, the computing node calculates them with the goal of making the loss function descend along the maximum gradient; for example:
In each iterative calculation, the computing node takes the difference between the predicted ratings and the actual ratings included in its subset of the training data to obtain the prediction difference; it superimposes the product of the prediction difference and the item sub-matrix onto the locally stored user sub-matrix to obtain the updated user sub-matrix; it superimposes the product of the prediction difference and the updated user sub-matrix onto the item sub-matrix to obtain the updated item sub-matrix; and when the iteration stop condition is satisfied, the control node is responsible for outputting the complete model.
As for the control node outputting the model: the user sub-matrices stored by the computing nodes are combined to obtain the user matrix, and the item sub-matrices stored by the parameter service nodes are combined to obtain the item matrix; when the rating of a target item by a target user needs to be predicted, the rating is obtained from the product of the vector corresponding to the target user in the user matrix and the vector corresponding to the target item in the item matrix.
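With the conventions used in this sketch (users as rows of U, items as columns of V), the control node's assembly and the final prediction reduce to a concatenation and a dot product; this is a minimal illustration, not the embodiment's output code.

```python
import numpy as np

def assemble_user_matrix(user_parts):
    """Stack the user sub-matrices collected from the computing nodes."""
    return np.concatenate(user_parts, axis=0)     # (M x K)

def assemble_item_matrix(item_parts):
    """Stack the item sub-matrices collected from the parameter service nodes."""
    return np.concatenate(item_parts, axis=1)     # (K x N)

def predict_rating(U, V, user_index, item_index):
    """Target user's rating of the target item: u_i . v_j."""
    return float(U[user_index] @ V[:, item_index])
```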
Referring to FIG. 11, which is an optional flowchart of training a model for predicting ratings according to an embodiment of the present invention, described in conjunction with the distributed computing system shown in FIG. 7.
First, the parameters of the model involved are explained:

N: the number of items.

M: the number of users.

k: the dimension of the users' feature vectors and the items' feature vectors.

Item: a sample record in the training data; each record includes a user ID and the user's ratings of items.

IterNum: the number of training iterations.

BatchNum: the number of batches; in each training iteration, the computing node 250 obtains the item matrix from the parameter service node 230 in batches and performs the iterative calculation on the item sub-matrix obtained in each batch.
First, initialization:
Step 201: the control node 240 evenly assigns subsets of the training data to the computing nodes 250.
Step 202: the computing nodes 250 perform the following processing in parallel:
Step 2021: create and initialize a user sub-matrix according to the assigned subset of the training data; each computing node stores one sub-matrix of the user matrix.
Each row vector of the user sub-matrix corresponds to one user, the row number corresponds to the user's ID, and the row vector represents the user's scores on the different features. The user sub-matrix includes the vectors corresponding to some of the users, namely the users included in the subset assigned to the computing node 250.
Step 2022: divide the rated items into multiple batches.
Collect the set of IDs of the rated items in the assigned subset of the training data, denoted IDset; divide IDset evenly into BatchNum subsets, denoted IDset[1], ..., IDset[BatchNum].
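A one-line realization of this split, assuming a toy ratings dictionary standing in for this node's subset:

```python
import numpy as np

# Toy subset: (user, item) -> rating pairs seen by this computing node.
ratings = {(0, "i1"): 4.0, (0, "i3"): 5.0, (1, "i2"): 3.0, (1, "i3"): 1.0}
id_set = sorted({item for (_, item) in ratings})   # IDset of rated items
batch_num = 2                                      # BatchNum
id_batches = np.array_split(id_set, batch_num)     # IDset[1], ..., IDset[BatchNum]
print([list(b) for b in id_batches])               # [['i1', 'i2'], ['i3']]
```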
Step 203: the parameter service nodes 230 create and initialize sub-matrices of the N×k-dimensional item matrix; each parameter service node stores one item sub-matrix.
N is the number of items; each column vector of the item matrix corresponds to one item, the column number corresponds to the item's ID, and the column vector represents the item's weights on the different features.
It should be noted that there is no restriction on the execution order among step 201, step 202 and step 203.
Second, the iterative calculation process:
Perform IterNum iterations of the calculation; in each iteration, the following steps are performed for each parameter service node 230:
Step 204: the computing node 250 obtains, in batches, the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node 230.
In each batch, the vectors corresponding to IDset[m] are obtained from the parameter service node 230, where m satisfies 1 ≤ m ≤ BatchNum; in response to each computing node 250's request for the vectors corresponding to IDset[m] in the item matrix, the parameter service node 230 returns those vectors to the computing node 250.
Step 205: update, in the user sub-matrix, the vectors of the users who have rated the rated items, and calculate the update values of the vectors corresponding to the rated items in the item sub-matrix.
Update, in the user sub-matrix stored by the computing node 250, the vectors of the users who have ratings for the items in IDset[m]: u_ik ← u_ik + 2α·e_ij·v_kj; calculate the update values of the vectors corresponding to IDset[m]: Δv_kj = 2α·e_ij·u_ik; then transmit the update values Δv_kj of the vectors corresponding to IDset[m] to the parameter service node 230.
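A direct transcription of these two update rules, assuming the error term e_ij = r_ij - u_i·v_j from the usual squared-error loss and that Δv is computed from the already-updated user vector (the description leaves this ordering implicit):

```python
import numpy as np

def step_205(U, V, i, j, r_ij, alpha=0.01):
    """Apply the step-205 update for one observed rating r_ij and return
    the item update to be pushed to the parameter service node 230."""
    e_ij = r_ij - U[i] @ V[:, j]           # prediction error (assumed definition)
    U[i] += 2 * alpha * e_ij * V[:, j]     # u_ik <- u_ik + 2*alpha*e_ij*v_kj
    delta_v = 2 * alpha * e_ij * U[i]      # delta_v_kj = 2*alpha*e_ij*u_ik
    return delta_v                         # one column of Delta V to transmit
```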
Step 206: the parameter service node 230 updates its locally stored item sub-matrix according to the update values, returned by the computing nodes, of the vectors corresponding to the rated items in the item sub-matrix.
Upon receiving the update values of the vectors corresponding to IDset[m] transmitted by the computing nodes 250, the vectors corresponding to IDset[m] are updated as follows:
v j←v j+△v j/Num,Num为分布式计算系统200中的计算节点250的数量。 v j ←v j +Δv j /Num, Num is the number of compute nodes 250 in the distributed computing system 200.
Step 207: the control node 240 obtains the parameters of the user sub-matrices from the computing nodes 250 and combines them to form the user matrix, and obtains the parameters of the item sub-matrices from the parameter service nodes 230 and combines them to form the item matrix.
At this point, a matrix-factorization-based expression of the users' ratings of the different items in the training data is obtained. According to formula (2), the ratings of items by different users can be calculated; in a product recommendation business scenario, the products with the highest ratings can be selected and recommended to the user.
An embodiment of the present invention provides a storage medium, which may be implemented by any type of volatile or non-volatile storage device, or a combination thereof. The non-volatile memory may be a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), or the like. An executable program is stored in the storage medium, and when the executable program is executed by a processor, the following operations are performed:
when in computing node mode, initializing, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix formed by the initialized vectors;

when in computing node mode, iteratively calculating the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmitting the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node;

when in parameter service node mode, initializing the vectors corresponding to some of the items, to obtain an item sub-matrix formed by the initialized vectors, those items being a portion of the items included in the training data;

when in parameter service node mode, updating the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operation is also performed:

when in control node mode, partitioning, by user, the ratings of the multiple items included in the training data, to obtain multiple subsets of the training data, and assigning the multiple subsets to the at least two computing nodes.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operations are also performed:

when in control node mode, when the stop condition for the computing nodes' iterative calculation is satisfied, combining the user sub-matrices stored by the computing nodes to obtain the user matrix, and combining the item sub-matrices stored by the parameter service nodes to obtain the item matrix;

when in control node mode, obtaining the target user's rating of the target item according to the product of the vector corresponding to the target user in the user matrix and the vector corresponding to the target item in the item matrix.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operations are also performed:

when in computing node mode, determining, according to the assigned subset, the rated items included in the subset, and obtaining the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node;

when in computing node mode, iteratively calculating the vectors corresponding to some of the users in the user sub-matrix and the vectors corresponding to the rated items in the item sub-matrix, those users being the users in the subset who produced ratings for the rated items;

when in computing node mode, transmitting the vectors corresponding to the rated items obtained after each iterative calculation to the corresponding parameter service node.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operations are also performed:

when in computing node mode, obtaining, in batches, the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node;

when in computing node mode, iteratively calculating the vectors corresponding to the users of each batch in the user sub-matrix and the vectors corresponding to the rated items of that batch, the users of a batch being the users, among the aforementioned users, who produced ratings for the rated items of that batch;

when in computing node mode, transmitting the vectors corresponding to the rated items of the respective batch obtained after each iterative calculation to the corresponding parameter service node.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operation is also performed:

when in computing node mode, determining the number of batches according to the memory space of the computing node, where the storage space occupied by the vectors corresponding to the rated items of each batch is smaller than the memory space of the computing node.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operation is also performed:

when in parameter service node mode, updating the item sub-matrix stored by the parameter service node according to the vectors, transmitted by the computing node, corresponding to the rated items.
In an optional embodiment of the present invention, when the executable program is executed by the processor, the following operations are also performed:

when in computing node mode, taking the difference between the predicted ratings and the actual ratings included in the subset of the training data, to obtain the prediction difference;

when in computing node mode, superimposing the product of the prediction difference and the item sub-matrix onto the user sub-matrix, to obtain the updated user sub-matrix;

when in computing node mode, superimposing the product of the prediction difference and the updated user sub-matrix onto the item sub-matrix, to obtain the updated item sub-matrix.
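Read as dense matrix operations, these three steps correspond to the following sketch; the 2α step size is carried over from formulas (7.1) and (7.2) as an assumption, and a real implementation would restrict the prediction difference to observed ratings only.

```python
import numpy as np

def dense_update(R, U, V, alpha=0.01):
    """R: (M x N) ratings, U: (M x K) user sub-matrix, V: (K x N) item sub-matrix."""
    E = R - U @ V                     # prediction difference
    U = U + 2 * alpha * E @ V.T       # superimpose (difference x item sub-matrix) onto U
    V = V + 2 * alpha * U.T @ E       # superimpose (difference x updated U) onto V
    return U, V
```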
It can be understood that, when the above storage medium is provided in the nodes of a distributed computing system, some nodes operate in computing node mode and some nodes operate in parameter service node mode; in one example, shown in FIG. 7, the distributed computing system can perform iterative calculations based on the training data. For the rating matrix of the training data shown in FIG. 1, the rating matrix can be decomposed into the product of the user matrix and the item matrix shown in FIG. 1; according to the model shown in FIG. 1, the users' ratings of the different items can be calculated, where a rating represents the degree of a user's interest in an item, and the items of interest to a user can be accurately selected in descending order of rating and recommended to the user.
An application scenario is described below. Referring to FIG. 12, FIG. 12 is a schematic diagram of an optional application scenario of the big data platform 100 shown in FIG. 2 according to an embodiment of the present invention; illustratively, the distributed computing system 200 deployed by the big data platform 100 shown in FIG. 2 may adopt the architecture of the distributed computing system 200 shown in FIG. 7.
FIG. 12 shows an online shopping system 700. The online shopping system 700 provides page-based access, supporting user access through a browser or a shopping APP. For users who log in to the online shopping system 700, the online shopping system 700 enables a behavior data collection function and collects behavior data in the following form: user ID, access time, browsed products, purchased products, returned products, and product ratings.
The online shopping system 700 grants the data collection system 300 of the big data platform 100 access to the behavior data. The data collection system 300 obtains the behavior data of the online shopping system 700's users periodically or from time to time and cleans the behavior data, for example removing malicious rating data and inflated ratings produced by cheating, and constructs training data from the rating data with the user as the dimension; each record of the training data includes a user ID, product IDs, and product ratings.
The training data is submitted to the distributed computing system 200 of the big data platform 100 for iterative calculation. Based on the users' ratings of the products they have rated, the users' ratings of unrated products are predicted, forming the matrix factorization model shown in FIG. 1. In FIG. 1, a user's rating of each product is represented by the product of the vector corresponding to that user in the user matrix and the vector corresponding to that product in the product matrix; the parameters of the user model and the product model are returned to the online shopping system 700.
The online shopping system 700 can calculate the users' ratings of different products according to the matrix factorization model. For example, when the online shopping system 700 needs to run an online promotion for a product, in order to precisely target the product's potential consumers, it calculates, according to the matrix factorization model, a predetermined number of users with the highest ratings for the product, and pushes the product's promotion information to those users, achieving precision marketing.
It can be understood that the above shopping system 700 may also be replaced with an online APP store to accurately recommend APPs of interest to users; for example, the APP store can calculate users' ratings of (degrees of interest in) different APPs according to the matrix factorization model and push specific APPs to users according to the calculated ratings. The above shopping system 700 may also be a social platform system that recommends contacts of interest to users. The case of a social platform system recommending contacts to users is described below as an example.
The social platform system provides page-based access, supporting user access through a browser or a social platform APP. For users who log in to the social platform system, the social platform system enables a data collection function and collects behavior data in the following form: user ID, and various behavior data reflecting similarity between users in the social network (such as posting original content, comments, and followed information); or it collects user data in the following form: gender, age, occupation, region, and the like.
The social platform system grants the data collection system of the big data platform access to the data. The data collection system obtains the behavior data and/or user data of the social platform system's users periodically or from time to time and cleans the data, for example removing malicious comments, and constructs training data from the contact rating data with the user as the dimension; each record of the training data includes a first user ID, a second user ID, and a rating of the second user.
The training data is submitted to the distributed computing system 200 of the big data platform 100 for iterative calculation. Based on the ratings of the second users who have been rated, a user's ratings of unrated second users are predicted, forming the matrix factorization model shown in FIG. 1. In FIG. 1, a first user's rating of each second user is represented by the product of the vector corresponding to the first user in the first user matrix and the vector corresponding to the second user in the second user matrix; the parameters of the first user model and the second user model are returned to the social platform system.
The social platform system can calculate the first users' ratings of different second users according to the matrix factorization model. For example, when the social platform system needs to make friend recommendations to a first user, in order to precisely identify the second users to recommend, it calculates, according to the matrix factorization model, a predetermined number of second users with higher ratings from the first user, and pushes those second users' information to the first user, achieving accurate friend recommendation.
In summary, the embodiments of the present invention have the following beneficial effects:
1) The user matrix is stored in a distributed manner as user sub-matrices, and the item matrix is stored in a distributed manner as item sub-matrices, which reduces the occupation of the nodes' memory space, overcomes the related art's requirement that a single machine's memory be able to store the complete user matrix and item matrix, and enables large-scale computation in a distributed computing system with limited memory;
2) Multiple computing nodes perform calculations on the stored user sub-matrices and on the item sub-matrices obtained from the parameter service nodes based on subsets of the training data; on the one hand, this reduces the computational complexity of a single node, and on the other hand, the parallel computation of the computing nodes effectively improves computational efficiency;
3) Storing the item matrix and the user matrix in a distributed manner as sub-matrices effectively reduces the volume of item sub-matrix data transmitted between the computing nodes and the parameter service nodes; on the one hand, the communication overhead of a single node is effectively reduced, eliminating the situation where communication overhead runs into a network bandwidth bottleneck, which is conducive to balancing the network communication load; on the other hand, transmission efficiency is high, which avoids computing nodes sitting idle while waiting for data and improves computational efficiency;
4) Only the vectors corresponding to rated items, and their update values, are transmitted between the computing nodes and the parameter service nodes; since the vectors related to unrated items need not be transmitted, the communication overhead and transmission latency between the computing nodes and the parameter service nodes are reduced, which helps improve computational efficiency;
5) By dividing the user matrix into sub-matrices assigned to multiple computing nodes, decomposing the item matrix into multiple item sub-matrices stored in a distributed manner on the parameter service nodes, and obtaining the item vectors in batches in each iteration, the computational problem of large-scale matrix factorization models is solved; the model size can be scaled linearly by increasing the numbers of parameter service nodes and computing nodes, supporting ultra-large-scale computation.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, which should all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Industrial Applicability
The distributed computing system in the embodiments of the present invention includes at least two computing nodes and at least two parameter service nodes, where the computing node is configured to initialize, according to the users included in a subset of the training data, the vectors corresponding to those users in the user matrix, to obtain a user sub-matrix formed by the initialized vectors; the computing node is configured to iteratively calculate the user sub-matrix and the item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and to transmit the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node; the parameter service node is configured to initialize the vectors corresponding to some of the items, to obtain an item sub-matrix formed by the initialized vectors, those items being a portion of the items included in the training data; and the parameter service node is configured to update the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node. In this way, the item matrix and the user matrix are stored in a distributed manner as sub-matrices, which reduces the occupation of a single node's memory space, overcomes the related art's requirement that a single node's memory be able to store the complete user matrix and item matrix, and enables large-scale computation in a distributed computing system with limited memory resources; the communication overhead of a single node is effectively reduced, eliminating the situation where communication overhead runs into a network bandwidth bottleneck, which is conducive to balancing the network communication load, avoids computing nodes sitting idle while waiting for data, and improves computational efficiency; and multiple computing nodes iteratively calculate the stored user sub-matrices and the item sub-matrices based on subsets of the training data, which on the one hand reduces the computational resource overhead and complexity of a single node, and on the other hand effectively improves computational efficiency through the parallel computation of the computing nodes.

Claims (15)

1. A distributed computing system, comprising:
    at least two computing nodes and at least two parameter service nodes; wherein
    the computing node is configured to initialize, according to users included in a subset of training data, vectors corresponding to the users in a user matrix, to obtain a user sub-matrix formed by the initialized vectors;
    the computing node is configured to iteratively calculate the user sub-matrix and an item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and to transmit the item sub-matrix obtained after each iterative calculation to the corresponding parameter service node;
    the parameter service node is configured to initialize vectors corresponding to partial items, to obtain an item sub-matrix formed by the initialized vectors, the partial items being a portion of items included in the training data;
    the parameter service node is configured to update the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node;
    wherein the user sub-matrices stored by the respective computing nodes are combined to obtain a user matrix, and the item sub-matrices stored by the respective parameter service nodes are combined to obtain an item matrix; and
    a vector corresponding to a target user in the user matrix and a vector corresponding to a target item in the item matrix are used to obtain the target user's rating of the target item.
2. The distributed computing system according to claim 1, further comprising:
    a control node configured to partition, by user, ratings for a plurality of the items included in the training data, to obtain a plurality of subsets of the training data, and to assign the plurality of subsets to the at least two computing nodes.
3. The distributed computing system according to claim 1, further comprising:
    a control node configured to, when a stop condition for the iterative calculation of the computing nodes is satisfied, combine the user sub-matrices stored by the computing nodes to obtain a user matrix, and combine the item sub-matrices stored by the parameter service nodes to obtain an item matrix;
    wherein the control node is further configured to obtain a target user's rating of a target item according to a product of a vector corresponding to the target user in the user matrix and a vector corresponding to the target item in the item matrix.
4. The distributed computing system according to claim 1, wherein
    the computing node is configured to determine, according to the assigned subset, rated items included in the subset, and to obtain vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node;
    the computing node is configured to iteratively calculate vectors corresponding to partial users in the user sub-matrix and vectors corresponding to the rated items in the item sub-matrix, the partial users being the users, among the users included in the subset, who produced ratings for the rated items; and
    the computing node is configured to transmit the vectors corresponding to the rated items obtained after each iterative calculation to the corresponding parameter service node.
5. The distributed computing system according to claim 4, wherein
    the computing node is configured to obtain, in batches, the vectors corresponding to the rated items from the item sub-matrix stored by the parameter service node;
    the computing node is configured to iteratively calculate vectors corresponding to users of a respective batch in the user sub-matrix and vectors corresponding to the rated items of the respective batch, the users of the respective batch being the users, among the partial users, who produced ratings for the rated items of the batch; and
    the computing node is configured to transmit the vectors corresponding to the rated items of the respective batch obtained after each iterative calculation to the corresponding parameter service node.
  6. The distributed computing system of claim 5, wherein:
    the computing node is further configured to determine the number of batches according to the memory space of the computing node, wherein the storage space occupied by the vectors corresponding to the scored items of each batch is smaller than the memory space of the computing node.
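Claim 6 sizes the batches so that one batch of item vectors fits in the computing node's memory. A back-of-the-envelope sketch, assuming 8-byte float values and a caller-supplied memory budget (both are assumptions; the claim only fixes the fit-in-memory constraint):

```python
import math

def number_of_batches(num_rated_items, dim, available_memory_bytes,
                      bytes_per_value=8):
    """Smallest batch count whose per-batch vectors fit in memory."""
    total_bytes = num_rated_items * dim * bytes_per_value
    batches = max(1, math.ceil(total_bytes / available_memory_bytes))
    # Each batch now occupies roughly total_bytes / batches, below the budget.
    return batches

# Example: 2M rated items, 100-dim vectors, 512 MB budget -> 3 batches.
print(number_of_batches(2_000_000, 100, 512 * 1024 * 1024))
```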
  7. The distributed computing system of claim 4, wherein:
    the parameter service node is configured to update the item sub-matrix stored by the parameter service node according to the vectors corresponding to the scored items transmitted by the computing node.
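The claim fixes only that the stored item sub-matrix is updated from the transmitted vectors; it does not fix a merge rule. One simple policy, overwriting stored vectors with the pushed ones, might look like this (the {id: vector} container shape is an assumption):

```python
def apply_push(stored_item_submatrix, pushed_vectors):
    """Replace stored item vectors with the ones pushed by a compute node."""
    stored_item_submatrix.update(pushed_vectors)
    return stored_item_submatrix
```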
  8. The distributed computing system of any one of claims 1 to 7, wherein:
    the computing node is configured to take the difference between the predicted score values and the actual score values included in the subset of the training data, obtaining prediction differences;
    the computing node is configured to superimpose the product of the prediction differences and the item sub-matrix onto the user sub-matrix, obtaining an updated user sub-matrix;
    the computing node is configured to superimpose the product of the prediction differences and the updated user sub-matrix onto the item sub-matrix, obtaining an updated item sub-matrix.
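Claim 8 reads as one gradient-descent step of matrix factorization: form the prediction difference, then superimpose scaled products onto the user and item vectors, with the item update using the already-updated user vector. A per-rating sketch, assuming NumPy; the learning rate `lr` is an assumption of the sketch, since the claim states only the difference-product-superposition structure:

```python
import numpy as np

def sgd_step(p_u, q_i, actual_score, lr=0.01):
    """One update in the spirit of claim 8 for a single (user, item) rating.

    p_u, q_i: latent vectors of the user and of the scored item.
    """
    error = actual_score - np.dot(p_u, q_i)   # prediction difference
    p_u_new = p_u + lr * error * q_i          # superimpose onto user vector
    q_i_new = q_i + lr * error * p_u_new      # uses the *updated* user vector
    return p_u_new, q_i_new
```

Without some step size the superposition tends to diverge in practice, which is why the sketch carries `lr` even though the claim leaves scaling unstated.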
  9. A distributed computing method, applied to a distributed computing system comprising at least two computing nodes and at least two parameter service nodes, the method comprising:
    the computing node initializing, according to the users included in a subset of training data, the vectors corresponding to the users in a user matrix, obtaining a user sub-matrix formed by the initialized vectors;
    the computing node iteratively computing the user sub-matrix and an item sub-matrix according to the subset of the training data and the item sub-matrix obtained from the parameter service node, and transmitting the item sub-matrix obtained after each iteration to the corresponding parameter service node;
    the parameter service node initializing vectors corresponding to some of the items, obtaining an item sub-matrix formed by the initialized vectors, the some items being a part of the items included in the training data;
    the parameter service node updating the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node;
    wherein the user sub-matrices stored by the computing nodes are combined to obtain a user matrix, and the item sub-matrices stored by the parameter service nodes are combined to obtain an item matrix;
    and the vector corresponding to a target user in the user matrix and the vector corresponding to a target item in the item matrix are used to obtain the target user's score for the target item.
  10. The distributed computing method of claim 9, further comprising:
    a control node in the distributed computing system dividing, along the user dimension, the scores for the plurality of items included in the training data, obtaining a plurality of subsets of the training data, and assigning the plurality of subsets to the at least two computing nodes.
  11. The distributed computing method of claim 9, further comprising:
    when a termination condition of the iterative computation of the computing nodes is satisfied, a control node in the distributed computing system combining the user sub-matrices stored by the computing nodes to obtain the user matrix, and combining the item sub-matrices stored by the parameter service nodes to obtain the item matrix;
    and obtaining the target user's score for the target item from the product of the vector corresponding to the target user in the user matrix and the vector corresponding to the target item in the item matrix.
  12. The distributed computing method of claim 9, wherein:
    the computing node initializing, according to the users included in the subset of the training data, the vectors corresponding to the users in the user matrix comprises:
    the computing node determining, according to the subset assigned to it, the scored items included in the subset, and obtaining, from the item sub-matrix stored by the parameter service node, the vectors corresponding to the scored items;
    the computing node iteratively computing the user sub-matrix and the item sub-matrix according to the item sub-matrix obtained from the parameter service node comprises:
    the computing node iteratively computing the vectors corresponding to some of the users in the user sub-matrix and the vectors corresponding to the scored items in the item sub-matrix, the some users being the users in the subset who have produced scoring behavior for the scored items;
    the transmitting the item sub-matrix obtained after each iteration to the corresponding parameter service node comprises:
    transmitting the vectors corresponding to the scored items, obtained after each iteration, to the corresponding parameter service node.
  13. The distributed computing method of claim 12, wherein:
    the obtaining, from the item sub-matrix stored by the parameter service node, the vectors corresponding to the scored items comprises:
    the computing node obtaining, in batches, the vectors corresponding to the scored items from the item sub-matrix stored by the parameter service node;
    iteratively computing the vectors corresponding to the users of a corresponding batch in the user sub-matrix and the vectors corresponding to the scored items of the corresponding batch, the users of the corresponding batch being those of the some users who have produced scoring behavior for the scored items of the batch;
    transmitting the vectors corresponding to the scored items of the corresponding batch, obtained after each iteration, to the corresponding parameter service node.
  14. The distributed computing method of claim 13, further comprising:
    the computing node determining the number of batches according to the memory space of the computing node, wherein the storage space occupied by the vectors corresponding to the scored items of each batch is smaller than the memory space of the computing node.
  15. A storage medium storing an executable program which, when executed by a processor, implements the following operations:
    when in a computing node mode, initializing, according to the users included in a subset of training data, the vectors corresponding to the users in a user matrix, obtaining a user sub-matrix formed by the initialized vectors;
    when in the computing node mode, iteratively computing the user sub-matrix and an item sub-matrix according to the subset of the training data and the item sub-matrix obtained from a parameter service node, and transmitting the item sub-matrix obtained after each iteration to the corresponding parameter service node;
    when in a parameter service node mode, initializing vectors corresponding to some of the items, obtaining an item sub-matrix formed by the initialized vectors, the some items being a part of the items included in the training data;
    when in the parameter service node mode, updating the item sub-matrix stored by the parameter service node according to the item sub-matrix transmitted by the computing node;
    wherein the user sub-matrices stored by the computing nodes are combined to obtain a user matrix, and the item sub-matrices stored by the parameter service nodes are combined to obtain an item matrix;
    and the vector corresponding to a target user in the user matrix and the vector corresponding to a target item in the item matrix are used to obtain the target user's score for the target item.
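Claim 15 describes a single executable that acts as a computing node or a parameter service node depending on its mode. A minimal dispatch sketch; the `--mode` flag and both run functions are hypothetical placeholders, not part of the disclosure:

```python
import argparse

def run_compute_node():
    pass  # would initialize the user sub-matrix, iterate, push item vectors

def run_parameter_service_node():
    pass  # would initialize the item sub-matrix and apply pushed updates

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", choices=["compute", "parameter"],
                        required=True)
    args = parser.parse_args()
    # One binary, two roles: the mode decides which node behavior runs.
    if args.mode == "compute":
        run_compute_node()
    else:
        run_parameter_service_node()
```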
PCT/CN2018/084870 2017-05-10 2018-04-27 Distributed computing system and method and storage medium WO2018205853A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710327494.8 2017-05-10
CN201710327494.8A CN108874529B (en) 2017-05-10 2017-05-10 Distributed computing system, method, and storage medium

Publications (1)

Publication Number Publication Date
WO2018205853A1 true WO2018205853A1 (en) 2018-11-15

Family

ID=64104389

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/084870 WO2018205853A1 (en) 2017-05-10 2018-04-27 Distributed computing system and method and storage medium

Country Status (2)

Country Link
CN (1) CN108874529B (en)
WO (1) WO2018205853A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274795B (en) * 2018-12-04 2023-06-20 北京嘀嘀无限科技发展有限公司 Vector acquisition method, vector acquisition device, electronic equipment and computer readable storage medium
CN110333844B (en) * 2019-05-06 2023-08-29 北京创鑫旅程网络技术有限公司 Calculation formula processing method and device
CN110490316B (en) * 2019-08-21 2023-01-06 腾讯科技(深圳)有限公司 Training processing method and training system based on neural network model training system
CN111061963B (en) * 2019-11-28 2021-05-11 支付宝(杭州)信息技术有限公司 Machine learning model training and predicting method and device based on multi-party safety calculation
CN112905873A (en) * 2019-12-03 2021-06-04 京东数字科技控股有限公司 Data processing method, device and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750360B (en) * 2012-06-12 2014-05-28 清华大学 Mining method of computer data for recommendation systems
CN104090919B (en) * 2014-06-16 2017-04-19 华为技术有限公司 Advertisement recommending method and advertisement recommending server
US20160034968A1 (en) * 2014-07-31 2016-02-04 Huawei Technologies Co., Ltd. Method and device for determining target user, and network server
CN106354783A (en) * 2016-08-23 2017-01-25 武汉大学 Social recommendation method based on trust relationship implicit similarity

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105653657A (en) * 2015-12-25 2016-06-08 Tcl集团股份有限公司 Commodity recommendation method and device
CN106296305A (en) * 2016-08-23 2017-01-04 上海海事大学 Electric business website real-time recommendation System and method under big data environment
CN106530058A (en) * 2016-11-29 2017-03-22 广东聚联电子商务股份有限公司 Method for recommending commodities based on historical search and browse records

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115952239A (en) * 2023-03-08 2023-04-11 北京纷扬科技有限责任公司 Distributed hierarchical computing system based on expression, electronic device and storage medium

Also Published As

Publication number Publication date
CN108874529A (en) 2018-11-23
CN108874529B (en) 2022-05-13

Similar Documents

Publication Publication Date Title
WO2018205853A1 (en) Distributed computing system and method and storage medium
WO2023097929A1 (en) Knowledge graph recommendation method and system based on improved kgat model
US10152557B2 (en) Efficient similarity ranking for bipartite graphs
US20160012088A1 (en) Parallel collective matrix factorization framework for big data
George et al. A scalable collaborative filtering framework based on co-clustering
Salman et al. Particle swarm optimization for task assignment problem
JP2022524662A (en) Integration of models with their respective target classes using distillation
Chen et al. General functional matrix factorization using gradient boosting
CN108140075A (en) User behavior is classified as exception
US20140006166A1 (en) System and method for determining offers based on predictions of user interest
JP6311851B2 (en) Co-clustering system, method and program
US20200342523A1 (en) Link prediction using hebbian graph embeddings
WO2022166125A1 (en) Recommendation system with adaptive weighted baysian personalized ranking loss
JP2018142199A (en) Learning system and learning method
Zhou et al. Maintenance optimisation of a series production system with intermediate buffers using a multi-agent FMDP
Ulm et al. Functional federated learning in erlang (ffl-erl)
US8661042B2 (en) Collaborative filtering with hashing
Ben-Shimon et al. An ensemble method for top-N recommendations from the SVD
WO2021146802A1 (en) Method and system for optimizing an objective having discrete constraints
US10313457B2 (en) Collaborative filtering in directed graph
CN113129053A (en) Information recommendation model training method, information recommendation method and storage medium
Zheng et al. Mutual benefit aware task assignment in a bipartite labor market
JPWO2018088277A1 (en) Prediction model generation system, method and program
Serrano A big data intelligent search assistant based on the random neural network
US11979309B2 (en) System and method for discovering ad-hoc communities over large-scale implicit networks by wave relaxation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18797968

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18797968

Country of ref document: EP

Kind code of ref document: A1