WO2021164365A1 - Graph neural network model training method, apparatus and system - Google Patents

Graph neural network model training method, apparatus and system

Info

Publication number
WO2021164365A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
current
neural network
data
graph neural
Prior art date
Application number
PCT/CN2020/132667
Other languages
French (fr)
Chinese (zh)
Inventor
陈超超
王力
周俊
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021164365A1 publication Critical patent/WO2021164365A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the embodiments of this specification generally relate to the field of machine learning, and more particularly to methods, devices, and systems for using horizontally segmented feature data sets to collaboratively train graph neural network models through multiple data owners.
  • the graph neural network model is a machine learning model widely used in the field of machine learning.
  • in many cases, multiple model training participants (for example, e-commerce companies, express companies, and banks) each own a different part of the feature data used to train the graph neural network model.
  • the multiple model training participants usually want to jointly use each other's data to train a unified graph neural network model, but do not want to provide their own data to the other model training participants, to prevent their data from being leaked.
  • in view of this situation, a graph neural network model training method that can protect the security of private data is proposed, which can coordinate the multiple model training participants to train the graph neural network model while ensuring the security of the respective data of those participants, so that the trained graph neural network model can be used by the multiple model training participants.
  • the embodiments of this specification provide a method, device and system for collaboratively training a graph neural network model via multiple data owners, which can accomplish graph neural network model training while ensuring the security of the respective data of the multiple data owners.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training.
  • the training sample subset includes a feature data subset and a true label value
  • the method is executed by a data owner and includes executing the following loop process until a loop end condition is satisfied: providing the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtaining the current discriminant model from the server; providing the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determining the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determining the gradient information of the current discriminant model based on the current loss function and updating the model parameters of the current graph neural network sub-model; and providing the gradient information of the current discriminant model to the server, the server using the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, wherein, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process (a minimal illustrative sketch of this data-owner-side loop is given below).
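  • the following is a minimal sketch of one iteration of this data-owner-side loop, using plain numpy with a single linear layer standing in for both the graph neural network sub-model and the discriminant model; the `server` handle and its methods are hypothetical placeholders, not APIs defined by this specification.

```python
import numpy as np

def data_owner_training_step(features, true_labels, submodel_w, server, lr=0.1):
    """One illustrative loop iteration on the data-owner side.

    features:    (num_nodes, d) current feature data subset of this owner
    true_labels: (num_nodes,)   true label values of the nodes
    submodel_w:  (d, h)         parameters of the local "sub-model"
                                (a toy linear layer stands in for the real GNN)
    server:      hypothetical handle used to fetch the discriminant model
                 and to hand the gradient information back
    """
    # 1. Provide the current feature data subset to the local sub-model to
    #    obtain the feature vector representation of each node.
    node_repr = features @ submodel_w                                # (num_nodes, h)

    # 2. Obtain the current discriminant model from the server.
    disc_w = server.get_current_discriminant_model()                 # (h,)

    # 3. Obtain the current predicted label value of each node.
    pred = node_repr @ disc_w                                        # (num_nodes,)

    # 4. Determine the current loss function (squared error as an example).
    err = pred - true_labels
    loss = float(np.mean(err ** 2))

    # 5. Gradient w.r.t. the discriminant model parameters (sent to the
    #    server) and w.r.t. the local sub-model parameters (applied here).
    grad_disc = node_repr.T @ err * (2.0 / len(err))                 # (h,)
    grad_sub = features.T @ np.outer(err, disc_w) * (2.0 / len(err)) # (d, h)
    submodel_w -= lr * grad_sub                                      # local sub-model update

    # 6. Provide the discriminant-model gradient to the server
    #    (in practice via secure aggregation).
    server.send_gradients(grad_disc)
    return loss
```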
  • the gradient information obtained from each data owner may be provided to the server in a secure aggregation manner.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • in each loop iteration, the method may further include: obtaining a current training sample subset.
  • the loop end condition may include: a predetermined number of loop iterations is reached; the change in each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
  • the characteristic data may include characteristic data based on image data, voice data, or text data, or the characteristic data may include user characteristic data.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner; each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the method is executed by the server and includes executing the following loop process until a loop end condition is satisfied: providing the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model based on its current loss function, updates the model parameters of its current graph neural network sub-model, and provides the determined gradient information to the server, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not satisfied, obtaining the corresponding gradient information of the current discriminant model from each data owner and updating the current discriminant model based on the gradient information from each data owner, wherein, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model of the server are used as the current models of the next loop process (a sketch of this server-side loop is given below).
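  • the following is a minimal sketch of this server-side loop under the same toy assumptions; the `owners` handles and their `run_local_step` method are hypothetical placeholders, not APIs defined by this specification.

```python
import numpy as np

def server_training_loop(owners, disc_w, num_rounds=10, lr=0.1):
    """Illustrative server-side loop: distribute the discriminant model,
    collect the owners' gradients, aggregate, update, repeat.

    owners: hypothetical data-owner handles exposing
            run_local_step(disc_w) -> gradient w.r.t. the discriminant model
    disc_w: (h,) current discriminant model parameters
    """
    for _ in range(num_rounds):
        # Provide the current discriminant model to each data owner; each
        # owner computes node representations, predictions and its loss,
        # returns the discriminant-model gradient, and updates its own
        # graph neural network sub-model locally.
        grads = [owner.run_local_step(disc_w.copy()) for owner in owners]

        # Aggregate the gradients (plain averaging here; in the scheme
        # described above the owners would use secure aggregation so the
        # server never sees any single owner's gradient in the clear).
        agg_grad = np.mean(grads, axis=0)

        # Update the discriminant model used in the next round.
        disc_w = disc_w - lr * agg_grad
    return disc_w
```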
  • the gradient information obtained from each data owner may be provided to the server in a secure aggregation manner.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • a method for model prediction using a graph neural network model is provided, the graph neural network model including a discriminant model located on the server side and a graph neural network sub-model located at each data owner; the method is executed by a data owner and includes: providing the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; obtaining the discriminant model from the server; and providing the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • an apparatus for training a graph neural network model via multiple data owners is provided; the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the apparatus is applied to a data owner and includes: a vector representation unit, which provides the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model; a discriminant model acquisition unit, which acquires the current discriminant model from the server; a model prediction unit, which provides the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; a loss function determination unit, which determines the current loss function according to the current predicted label value of each node and the corresponding true label value; a gradient information determination unit, which, when the loop end condition is not satisfied, determines the gradient information of the current discriminant model based on the current loss function; a model update unit, which, when the loop end condition is not satisfied, updates the model parameters of the current graph neural network sub-model based on the current loss function; and a gradient information providing unit, which provides the gradient information of the current discriminant model to the server, the server using the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, wherein the above units operate in a loop until the loop end condition is satisfied.
  • the gradient information providing unit may use a secure aggregation method to provide the gradient information obtained from the data owner to the server.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • the device may further include: a training sample subset acquiring unit, which acquires a current training sample subset during each cycle operation.
  • an apparatus for training a graph neural network model via multiple data owners is provided; the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the apparatus is applied to the server and includes: a discriminant model providing unit, which provides the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model based on its current loss function and updates the model parameters of its current graph neural network sub-model, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; a gradient information acquisition unit, which, when the loop end condition is not satisfied, acquires the corresponding gradient information of the current discriminant model from each data owner; and a discriminant model update unit, which updates the current discriminant model based on the gradient information from each data owner, wherein the discriminant model providing unit, the gradient information acquisition unit, and the discriminant model update unit operate in a loop until the loop end condition is satisfied, and, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model of the server are used as the current models of the next loop process.
  • a system for training a graph neural network model via multiple data owners is provided, including: multiple data owner devices, each data owner device including the data-owner-side apparatus described above; and a server device including the server-side apparatus described above, wherein the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner, each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • an apparatus for performing model prediction using a graph neural network model is provided; the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner; the apparatus is applied to a data owner and includes: a vector representation unit, which provides the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; a discriminant model acquisition unit, which obtains the discriminant model from the server; and a model prediction unit, which provides the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the data owner side as described above.
  • a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the data owner side as described above.
  • an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the server side as described above.
  • a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the server side as described above.
  • an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model prediction method described above.
  • a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model prediction method described above.
  • the model parameters of the graph neural network model can be obtained by training without leaking the private data of the multiple training participants.
  • Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of the present specification
  • Fig. 2 shows a schematic diagram of an example of a horizontally segmented training sample set according to an embodiment of the present specification
  • FIG. 3 shows a schematic diagram showing the architecture of a system for training graph neural network models via multiple data owners according to an embodiment of the present specification
  • Fig. 4 shows a flowchart of a method for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • Fig. 6 shows a flowchart of a model prediction process based on a graph neural network model according to an embodiment of the present specification
  • Fig. 7 shows a block diagram of an apparatus for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 8 shows a block diagram of an apparatus for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model according to an embodiment of the present specification
  • FIG. 10 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 11 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • Fig. 12 shows a schematic diagram of an electronic device for model prediction based on a graph neural network model according to an embodiment of the present specification.
  • the term “including” and its variations mean open terms, meaning “including but not limited to”.
  • the term “based on” means “based at least in part on.”
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • the terms “first”, “second”, etc. may refer to different or the same objects. Other definitions can be included below, whether explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.
  • the training sample set used in the graph neural network model training scheme is a training sample set that has been horizontally segmented.
  • horizontal segmentation of the training sample set refers to dividing the training sample set into multiple training sample subsets according to modules/functions (or certain specified rules), and each training sample subset contains a part of training samples.
  • the training samples included in each training sample subset are complete training samples, that is, all field data and corresponding label values of the training sample are included.
  • for example, suppose there are three data owners Alice, Bob, and Charlie.
  • local samples are collected at each data owner to form a local sample set, and each sample contained in the local sample set is a complete sample.
  • the local sample sets obtained by the three data owners Alice, Bob, and Charlie together form the training sample set used for graph neural network model training, where each local sample set serves as a training sample subset of that training sample set and is used to train the graph neural network model.
  • in the embodiments of this specification, each data owner owns a different part of the training samples used for graph neural network model training. For example, with two data owners, suppose the training sample set includes 100 training samples and each training sample contains multiple feature values and an actual label value; then the data owned by the first data owner may be the first 30 training samples of the training sample set, and the data owned by the second data owner may be the last 70 training samples of the training sample set (see the sketch below).
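  • a small sketch of the horizontal split just described, using toy data; the 30/70 split simply mirrors the example above.

```python
import numpy as np

# Toy training sample set: 100 complete samples, 5 feature values plus a label.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))        # feature values
y = rng.integers(0, 2, size=100)     # true label values

# Horizontal (sample-wise) split: every owner keeps *complete* samples,
# i.e. all feature fields and the label, but only a subset of the rows.
owner_1_data = (X[:30], y[:30])      # first data owner: first 30 samples
owner_2_data = (X[30:], y[30:])      # second data owner: remaining 70 samples

# Both owners see the same fields, just different samples.
assert owner_1_data[0].shape[1] == owner_2_data[0].shape[1] == X.shape[1]
```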
  • the feature data used in the training of the graph neural network model may include feature data based on image data, voice data, or text data.
  • the graph neural network model can be applied to business risk identification, business classification or business decision-making based on image data, voice data or text data.
  • the feature data used in the training of the graph neural network model may include user feature data.
  • the graph neural network model can be applied to business risk identification, business classification, business recommendation or business decision based on user characteristic data.
  • the data to be predicted used by the graph neural network model may include image data, voice data, or text data.
  • the data to be predicted used by the graph neural network model may include user characteristic data.
  • the terms "graph neural network model" and "graph neural network" can be used interchangeably.
  • the terms "graph neural network sub-model" and "graph neural sub-network" can be used interchangeably.
  • the terms "data owner" and "training participant" can be used interchangeably.
  • Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of the present specification.
  • the graph neural network (GNN, Graph Neural Network) model is divided into a discriminant model 10 and multiple graph neural network sub-models 20.
  • the discriminant model 10 is set on the server 110, and each graph neural network sub-model is set at the corresponding data owner, for example on the client of the corresponding data owner, so that each data owner has one graph neural network sub-model. As shown in FIG. 1, the graph neural network sub-models GNN A, GNN B and GNN C are set at data owner A 120-1, data owner B 120-2, and data owner C 120-3, respectively.
  • the graph neural network sub-model 20 is used to perform GNN calculation on the data of the data owner to obtain the feature vector representation of each node of the graph neural network sub-model. Specifically, when performing the GNN calculation, the data of the data owner is provided to the graph neural network sub-model 20, and the feature vector representation of each node corresponding to the current data is obtained from the node features and the graph neural sub-network through propagation over K-hop (K-order) neighbors.
  • the discriminant model 10 is used to perform model calculation based on the feature vector representation of each node obtained from the data owner to obtain the model prediction value of each node.
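  • below is a minimal numpy sketch of the two computations just described: K-hop neighbor propagation in the sub-model, followed by the discriminant-model calculation on the resulting node representations. The mean-aggregation update rule and the single linear discriminant layer are illustrative assumptions, not the specific architecture prescribed by this specification.

```python
import numpy as np

def gnn_submodel_embeddings(adj, node_feats, weights):
    """K-hop neighbor propagation (illustrative mean-aggregation GNN).

    adj:        (n, n) adjacency matrix of the owner's local graph
    node_feats: (n, d) node feature matrix
    weights:    list of K weight matrices, one per propagation layer
    """
    # Row-normalized adjacency with self-loops, so each step mixes a node's
    # own representation with those of its direct neighbors.
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)

    h = node_feats
    for w in weights:                  # K layers -> information from K-hop neighbors
        h = np.tanh(a_hat @ h @ w)     # propagate, transform, activate
    return h                           # feature vector representation of each node

def discriminant_predictions(embeddings, disc_w):
    """Discriminant model: maps node representations to per-node prediction values."""
    return embeddings @ disc_w
```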
  • Fig. 2 shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification.
  • each training sample in the training sample subsets owned by the data parties Alice and Bob is complete, that is, each training sample includes complete feature data (x) and label data (y).
  • Alice has a complete training sample (x0, y0).
  • FIG. 3 shows a schematic diagram illustrating the architecture of a system for training graph neural network models via multiple data owners (hereinafter referred to as "model training system 300") according to an embodiment of the present specification.
  • the model training system 300 includes a server device 310 and at least one data owner device 320.
  • Three data owner devices 320 are shown in FIG. 3. In other embodiments of this specification, more or fewer data owner devices 320 may be included.
  • the server device 310 and the at least one data owner device 320 may communicate with each other via a network 330 such as but not limited to the Internet or a local area network.
  • in the embodiments of this specification, the graph neural network model to be trained (that is, the neural network model structure that remains after the discriminant model is removed) is divided into a first number of graph neural network sub-models.
  • the first number is equal to the number of data owner devices participating in model training.
  • the graph neural network model is decomposed into N sub-models, and each data owner device has one sub-model.
  • the feature data set used for model training is located at each data owner device 320.
  • the feature data set is horizontally divided into multiple feature data subsets in the manner described in FIG. 2, and each data owner device has one feature data subset.
  • the sub-models and corresponding feature data subsets owned by each data owner are the secrets of the data owner and cannot be learned or fully learned by other data owners.
  • multiple data owner devices 320 and server devices 310 use the training sample subsets of each data owner device 320 to collaboratively train the graph neural network model.
  • the specific training process of the model will be described in detail with reference to FIGS. 4 to 5 below.
  • the server device 310 and the data owner device 320 may be any suitable electronic devices with computing capabilities.
  • the electronic devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile electronic devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handheld devices, messaging devices, wearable electronic devices, consumer electronic devices, and so on.
  • FIG. 4 shows a flowchart of a method 400 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • the graph neural network sub-model at each data owner and the discriminant model of the server are initialized. For example, initialize the graph neural network sub-models GNN A , GNN B and GNN C at the data owners A, B, and C, and initialize the discriminant model 10 on the server side.
  • each data owner device 320 obtains its current training sample subset. For example, data owner A obtains the current training sample subset S_A, data owner B obtains the current training sample subset S_B, and data owner C obtains the current training sample subset S_C.
  • Each subset of training samples includes a subset of feature data and true label values.
  • the obtained current training sample subset is provided to the respective graph neural network sub-model for GNN calculation, to obtain the feature vector representation of each node in the graph neural network sub-model.
  • specifically, at each data owner device 320, the current training sample subset is provided to the graph neural network sub-model 20, and the feature vector representations of the nodes corresponding to the current training sample subset are obtained.
  • at 404, each data owner device 320 obtains the current discriminant model from the server 310. Subsequently, at 405, at each data owner device 320, model prediction is performed using the current discriminant model based on the feature vector representation of each node, to obtain the current predicted label value of each node.
  • the current loss function is determined according to the current predicted label value of each node and the corresponding real label value.
  • for example, the loss function can be calculated as a sum, over the nodes, of a per-node loss term determined from the true and predicted label values, where i represents the i-th node, P represents the total number of nodes in the graph neural network sub-model, t_i represents the true label value of the i-th node, and O_i represents the current predicted label value of the i-th node.
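  • purely as an illustration (assumed forms, not the specific formula of the embodiments), a mean-squared-error loss and a cross-entropy loss over the P nodes could be written as follows:

```latex
% Illustrative per-node losses summed over the P nodes (assumed forms):
L_{\mathrm{MSE}} = \frac{1}{P}\sum_{i=1}^{P} \left(t_i - O_i\right)^2,
\qquad
L_{\mathrm{CE}} = -\frac{1}{P}\sum_{i=1}^{P} \bigl[\, t_i \log O_i + (1 - t_i)\,\log(1 - O_i) \bigr].
```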
  • at 407, each data owner device 320 determines the gradient information of the current discriminant model, that is, the gradient information with respect to the model parameters of the current discriminant model, based on the current loss function, for example through backpropagation. In addition, based on the current loss function, each data owner device 320 updates the model parameters of the current graph neural network sub-model, for example through backpropagation.
  • each data owner device 320 respectively provides the gradient information of the current discriminant model determined by each to the server 310.
  • in one example, each data owner device 320 may send the gradient information of the current discriminant model that it has determined to the server 310 as-is, and the server 310 then aggregates the received gradient information.
  • in another example, each data owner may provide its gradient information to the server 310 in a secure aggregation manner.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • other suitable secure aggregation methods can also be used.
  • aggregating the received gradient information may include averaging the received gradient information.
  • the aggregated gradient information is used to update the discriminant model at the server 310 for subsequent training cycles or as a trained discriminant model.
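  • as an illustration of one of the listed options, the sketch below uses pairwise additive masking, a simple secret-sharing-style secure aggregation in which the masks cancel in the sum so the server only learns the aggregate; it is an assumed toy protocol, not the specific scheme mandated by this specification.

```python
import numpy as np

def mask_gradient(grad, owner_id, all_ids, pairwise_seed=42):
    """Pairwise additive masking (illustrative secret-sharing-style scheme)."""
    masked = grad.astype(float).copy()
    for other in all_ids:
        if other == owner_id:
            continue
        # Both owners of a pair derive the same mask from a shared seed;
        # the lower id adds it, the higher id subtracts it, so the masks
        # cancel exactly when the server sums all contributions.
        rng = np.random.default_rng(pairwise_seed
                                    + min(owner_id, other) * 1000
                                    + max(owner_id, other))
        mask = rng.normal(size=grad.shape)
        masked += mask if owner_id < other else -mask
    return masked

def server_aggregate(masked_grads):
    """The server only sees masked gradients; the pairwise masks cancel in
    the sum, so averaging recovers the aggregate without any individual one."""
    return np.mean(masked_grads, axis=0)

# Tiny usage example with three owners and a 4-dimensional gradient.
ids = [0, 1, 2]
true_grads = [np.arange(4.0) + i for i in ids]
masked = [mask_gradient(g, i, ids) for g, i in zip(true_grads, ids)]
agg = server_aggregate(masked)
assert np.allclose(agg, np.mean(true_grads, axis=0))
```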
  • next, it is determined whether the loop end condition is satisfied, that is, whether the predetermined number of loop iterations has been reached. If the predetermined number of iterations has been reached, the process ends. If it has not been reached, the flow returns to the operation of 402 to execute the next training cycle.
  • the graph neural network sub-model of each data owner updated in the current cycle process and the discriminant model of the server end are used as the current model of the next training cycle process.
  • the end condition of the training cycle process refers to reaching the predetermined number of cycles.
  • the end condition of the training loop process may also be that the variation of each model parameter of the discrimination model 10 is not greater than a predetermined threshold.
  • the judgment process as to whether the loop process is over is executed in the server 310.
  • the end condition of the training loop process may also be that the current total loss function is within a predetermined range, for example, the current total loss function is not greater than a predetermined threshold.
  • the process of determining whether the loop process is over is executed in the server 310.
  • each data owner device 320 needs to provide its own loss function to the server 310 for aggregation, so as to obtain the total loss function.
  • each data owner device 320 can provide their own loss function to the server 310 by means of secure aggregation.
  • the secure aggregation for the loss function may likewise include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
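  • a small helper illustrating how the loop end conditions named above could be checked; the parameter names and thresholds are illustrative assumptions, not values fixed by this specification.

```python
import numpy as np

def loop_should_end(round_idx, max_rounds,
                    prev_disc_w=None, curr_disc_w=None, param_tol=None,
                    total_loss=None, loss_threshold=None):
    """Any one of the configured conditions ends the training loop."""
    if round_idx >= max_rounds:                       # predetermined number of iterations
        return True
    if (param_tol is not None and prev_disc_w is not None
            and curr_disc_w is not None
            and np.max(np.abs(curr_disc_w - prev_disc_w)) <= param_tol):
        return True                                   # parameter change small enough
    if (loss_threshold is not None and total_loss is not None
            and total_loss <= loss_threshold):
        return True                                   # total loss within predetermined range
    return False
```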
  • FIG. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • Figure 5 shows three data owners A, B, and C.
  • the data owners A, B, and C obtain their respective current feature data subsets X_A, X_B, and X_C.
  • the data owners A, B, and C respectively provide the current feature data subsets X_A, X_B, and X_C to their current graph neural network sub-models G_A, G_B, and G_C, to obtain the current feature vector representation of each node in each current graph neural network sub-model.
  • next, each data owner obtains the current discriminant model H from the server 110. Then, each data owner provides the obtained current feature vector representation of each node to the current discriminant model H to obtain the current predicted label value of each node. Subsequently, at each data owner, the current loss function is determined based on the current predicted label value of each node and the corresponding true label value, and, based on the current loss function, the gradient information G_H of the model parameters of the current discriminant model is determined through backpropagation. At the same time, at each data owner, based on the current loss function, the model parameters of each layer of the current graph neural network sub-model are updated through backpropagation.
  • the respective gradient information is provided to the server by means of secure aggregation.
  • the server updates the current discriminant model based on the obtained aggregated gradient information.
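  • a minimal end-to-end toy run of this flow, wiring three stand-in data owners to a simple server update; the class, its fields, and the plain (non-secure) averaging are illustrative assumptions only.

```python
import numpy as np

class ToyOwner:
    """Toy stand-in for one data owner in the flow of FIG. 5 (illustrative)."""
    def __init__(self, features, labels, hidden=3, seed=0):
        rng = np.random.default_rng(seed)
        self.x, self.y = features, labels
        self.w = rng.normal(scale=0.1, size=(features.shape[1], hidden))  # sub-model G

    def run_local_step(self, disc_w, lr=0.1):
        h = self.x @ self.w                      # node feature vector representations
        pred = h @ disc_w                        # current discriminant model H applied
        err = (pred - self.y) * (2.0 / len(self.y))
        grad_disc = h.T @ err                    # gradient information G_H for the server
        self.w -= lr * (self.x.T @ np.outer(err, disc_w))   # local sub-model update
        return grad_disc

rng = np.random.default_rng(1)
owners = [ToyOwner(rng.normal(size=(20, 5)), rng.normal(size=20), seed=i) for i in range(3)]
disc_w = np.zeros(3)
for _ in range(50):                              # training cycles
    grads = [o.run_local_step(disc_w) for o in owners]   # would be securely aggregated
    disc_w -= 0.1 * np.mean(grads, axis=0)       # server updates discriminant model H
```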
  • model training schemes with three data owners are shown in FIG. 3 to FIG. 5; in other examples of the embodiments of this specification, more or fewer than three data owners may be included.
  • the data available to a single data owner is limited, so if the GNN model is constructed based only on the data of a single data owner, the effect of the GNN model is also limited. Using the model training solution provided by the embodiments of this specification, the GNN model can be trained jointly while protecting the data privacy of each data owner, thereby improving the effect of the GNN model.
  • if the server is to securely aggregate the model gradient information of each data owner to update a shared model, the model structure of all data owners must be consistent, so different models cannot be customized for different clients. However, the sparsity and quality of the data (features and graph relationships) of different data owners differ, so different GNN models may be needed for learning. For example, the node feature vector representation obtained when data owner A propagates over 2-hop neighbors may be optimal, while the node feature vector representation obtained when data owner B propagates over 5-hop neighbors may be optimal.
  • in view of this, in the embodiments of this specification, the GNN model used to obtain the feature vector representations of the nodes is arranged at each data owner for local self-learning, while the discriminant model is placed on the server for global learning; by having multiple data owners learn together, the effect of the discriminant model can be improved.
  • in addition, each data owner provides the gradient information of its current discriminant model to the server through secure aggregation, which prevents the gradient information of any individual data owner from being provided to the server in full, so that the server cannot use the received gradient information to infer the private data of that data owner, thereby protecting the private data of the data owners.
  • FIG. 6 shows a flowchart of a model prediction process 600 based on a graph neural network model according to an embodiment of the present specification.
  • the graph neural network model used in the model prediction process shown in FIG. 6 is a graph neural network model trained according to the process shown in FIG. 4.
  • the data to be predicted is provided to the graph neural network sub-model of the data owner to obtain the feature vector representation of each node of the graph neural network sub-model.
  • the discriminant model is obtained from the server.
  • the feature vector representation of each node is provided to the received discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
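  • a minimal sketch of this prediction flow; `submodel` is any callable implementing the owner's trained sub-model, and `server.get_discriminant_model()` is a hypothetical call, not an API defined by this specification.

```python
import numpy as np

def predict_at_data_owner(data_to_predict, submodel, server):
    """Illustrative model prediction on the data-owner side.

    data_to_predict: the owner's local input data (e.g. node features)
    submodel:        callable implementing the owner's trained graph neural
                     network sub-model, returning one feature vector per node
    server:          hypothetical handle exposing get_discriminant_model()
    """
    # 1. Provide the data to be predicted to the local graph neural network
    #    sub-model to obtain the feature vector representation of each node.
    node_repr = submodel(data_to_predict)

    # 2. Obtain the (trained) discriminant model from the server.
    disc_w = server.get_discriminant_model()

    # 3. Provide the node representations to the discriminant model to obtain
    #    the predicted label value of each node.
    return np.asarray(node_repr) @ disc_w
```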
  • FIG. 7 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 700 for training a graph neural network model via a plurality of data owners according to an embodiment of the present specification.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, where the training sample subset includes a feature data subset and true label values.
  • the model training device 700 is located on the side of the data owner.
  • the model training device 700 includes a vector representation unit 710, a discriminant model acquisition unit 720, a model prediction unit 730, a loss function determination unit 740, a gradient information determination unit 750, a model update unit 760, and a gradient information providing unit 770.
  • the vector representation unit 710, the discriminant model acquisition unit 720, the model prediction unit 730, the loss function determination unit 740, the gradient information determination unit 750, the model update unit 760, and the gradient information providing unit 770 operate in a loop until the loop end condition is satisfied.
  • the loop end condition may include, for example: a predetermined number of loop iterations is reached; the change in each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
  • the vector representation unit 710 is configured to provide the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model.
  • the operation of the vector representation unit 710 may refer to the operation of 403 described above with reference to FIG. 4.
  • the discriminant model obtaining unit 720 is configured to obtain the current discriminant model from the server.
  • the operation of the discriminant model acquisition unit 720 may refer to the operation of 404 described above with reference to FIG. 4.
  • the model prediction unit 730 is configured to provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node.
  • the operation of the model prediction unit 730 may refer to the operation of 405 described above with reference to FIG. 4.
  • the loss function determining unit 740 is configured to determine the current loss function according to the current predicted label value of each node and the corresponding real label value.
  • the operation of the loss function determining unit 740 may refer to the operation of 406 described above with reference to FIG. 4.
  • the gradient information determining unit 750 is configured to determine the gradient information of the current discriminant model based on the current loss function when the loop end condition is not satisfied.
  • the operation of the gradient information determining unit 750 may refer to the operation of 407 described above with reference to FIG. 4.
  • the model updating unit 760 is configured to update the model parameters of the current graph neural network sub-model based on the current loss function when the loop end condition is not satisfied.
  • the operation of the model update unit 760 may refer to the operation of 407 described above with reference to FIG. 4.
  • the gradient information providing unit 770 is configured to provide gradient information of the current discriminant model to the server, and the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server.
  • the operation of the gradient information providing unit 770 may refer to the operation of 408 described above with reference to FIG. 4.
  • the gradient information providing unit 770 can provide the gradient information of the current discriminant model to the server in a secure aggregation manner.
  • the model training device 700 may further include a training sample subset acquisition unit (not shown).
  • the training sample subset acquiring unit is configured to acquire the current training sample subset.
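  • the following class skeleton is an illustrative software counterpart of the units listed above, reusing the toy linear stand-ins from the earlier sketches; it only shows a possible unit-to-method mapping and is not an implementation prescribed by this specification.

```python
import numpy as np

class OwnerModelTrainingDevice:
    """Illustrative counterpart of model training device 700: each method
    corresponds to one of the units described above."""

    def __init__(self, submodel_w, server, lr=0.1):
        self.submodel_w = submodel_w   # local graph neural network sub-model (toy linear layer)
        self.server = server           # hypothetical server handle
        self.lr = lr

    def vector_representation(self, features):                 # unit 710
        return features @ self.submodel_w

    def acquire_discriminant_model(self):                      # unit 720
        return self.server.get_current_discriminant_model()

    def model_prediction(self, node_repr, disc_w):              # unit 730
        return node_repr @ disc_w

    def determine_loss(self, pred, true_labels):                # unit 740
        return float(np.mean((pred - true_labels) ** 2))

    def determine_gradient(self, node_repr, pred, true_labels):  # unit 750
        err = (pred - true_labels) * (2.0 / len(true_labels))
        return node_repr.T @ err        # gradient w.r.t. the discriminant model

    def update_submodel(self, features, pred, true_labels, disc_w):  # unit 760
        err = (pred - true_labels) * (2.0 / len(true_labels))
        self.submodel_w -= self.lr * (features.T @ np.outer(err, disc_w))

    def provide_gradient(self, grad_disc):                       # unit 770
        self.server.send_gradients(grad_disc)   # ideally via secure aggregation
```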
  • FIG. 8 shows a block diagram of an apparatus for cooperatively training a graph neural network model via a plurality of data owners (hereinafter referred to as a model training apparatus 800) according to an embodiment of the present specification.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, where the training sample subset includes a feature data subset and true label values.
  • the model training device 800 is located on the server side.
  • the model training device 800 includes a discriminant model providing unit 810, a gradient information acquiring unit 820, and a model updating unit 830.
  • the discriminant model providing unit 810, the gradient information acquiring unit 820, and the model updating unit 830 operate in a loop until the loop end condition is satisfied.
  • the loop end condition may include, for example: a predetermined number of loop iterations is reached; the change in each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
  • the discriminant model providing unit 810 is configured to provide the current discriminant model to each data owner, for each data owner to use to obtain the predicted label value of each node.
  • the operation of the discriminant model providing unit 810 may refer to the operation of 404 described above with reference to FIG. 4.
  • the gradient information acquiring unit 820 is configured to acquire the corresponding gradient information of the current discriminant model from each data owner when the loop end condition is not met.
  • the operation of the gradient information acquisition unit 820 may refer to the operation of 408 described above with reference to FIG. 4.
  • the discriminant model update unit 830 is configured to update the current discriminant model based on gradient information from each data owner.
  • the operation of the discriminant model update unit 830 can refer to the operation of 409 described above with reference to FIG. 4.
  • Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model (hereinafter referred to as a model prediction apparatus 900) according to an embodiment of the present specification.
  • the model prediction device 900 is applied to the data owner.
  • the model prediction device 900 includes a vector representation unit 910, a discriminant model acquisition unit 920, and a model prediction unit 930.
  • the vector representation unit 910 is configured to provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model.
  • the discriminant model obtaining unit 920 is configured to obtain the discriminant model from the server.
  • the model prediction unit 930 is configured to provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
  • as above, the model training and prediction methods, devices, and systems according to the embodiments of this specification have been described with reference to FIGS. 1 to 9.
  • the above model training device and model prediction device can be implemented by hardware, or by software or a combination of hardware and software.
  • FIG. 10 shows a hardware structure diagram of an electronic device 1000 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • as shown in FIG. 10, the electronic device 1000 may include at least one processor 1010, a storage 1020 (for example, a non-volatile memory), a memory 1030, and a communication interface 1040, and the at least one processor 1010, the storage 1020, the memory 1030, and the communication interface 1040 are connected together via a bus 1060.
  • At least one processor 1010 executes at least one computer-readable instruction (ie, the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory which, when executed, cause the at least one processor 1010 to execute the following loop process until the loop end condition is satisfied: provide the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtain the current discriminant model from the server; provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determine the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determine the gradient information of the current discriminant model through backpropagation based on the current loss function and update the model parameters of the current graph neural network sub-model; and provide the gradient information of the current discriminant model to the server, which uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, where, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • FIG. 11 shows a hardware structure diagram of an electronic device 1100 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • as shown in FIG. 11, the electronic device 1100 may include at least one processor 1110, a storage 1120 (for example, a non-volatile memory), a memory 1130, and a communication interface 1140, and the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via a bus 1160.
  • At least one processor 1110 executes at least one computer-readable instruction (ie, the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory which, when executed, cause the at least one processor 1110 to execute the following loop process until the loop end condition is satisfied: provide the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model through backpropagation based on its current loss function, updates the model parameters of its current graph neural network sub-model, and provides the determined gradient information to the server, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not satisfied, obtain the corresponding gradient information of the current discriminant model from each data owner, and update the current discriminant model based on the gradient information from each data owner, where, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model of the server are used as the current models of the next loop process.
  • FIG. 12 shows a hardware structure diagram of an electronic device 1200 for model prediction based on a graph neural network model according to an embodiment of the present specification.
  • as shown in FIG. 12, the electronic device 1200 may include at least one processor 1210, a storage 1220 (for example, a non-volatile memory), a memory 1230, and a communication interface 1240, and the at least one processor 1210, the storage 1220, the memory 1230, and the communication interface 1240 are connected together via a bus 1260.
  • At least one processor 1210 executes at least one computer-readable instruction (i.e., the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory which, when executed, cause the at least one processor 1210 to: provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; obtain the discriminant model from the server; and provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • a program product such as a machine-readable medium (for example, a non-transitory machine-readable medium) is provided.
  • the machine-readable medium may have instructions (ie, the above-mentioned elements implemented in the form of software), which, when executed by a machine, cause the machine to perform the various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification.
  • in addition, a system or device equipped with a readable storage medium may be provided, on which software program code realizing the functions of any one of the above embodiments is stored, and a computer or processor of the system or device reads and executes the instructions stored in the readable storage medium.
  • in this case, the program code itself read from the readable medium can realize the functions of any one of the above embodiments, so the machine-readable code and the readable storage medium storing the machine-readable code constitute a part of the present invention.
  • Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW), magnetic tape, non-volatile memory cards, and ROM.
  • the program code can be downloaded from the server computer or the cloud via the communication network.
  • the device structures described in the foregoing embodiments may be physical structures or logical structures; that is, some units may be implemented by the same physical entity, some units may be implemented separately by multiple physical entities, or some units may be implemented jointly by certain components of multiple independent devices.
  • the hardware unit or module can be implemented mechanically or electrically.
  • a hardware unit, module, or processor may include a permanent dedicated circuit or logic (such as a dedicated processor, FPGA or ASIC) to complete the corresponding operation.
  • the hardware unit or processor may also include programmable logic or circuits (such as general-purpose processors or other programmable processors), which may be temporarily set by software to complete corresponding operations.
  • whether a specific implementation uses a mechanical manner, a dedicated permanent circuit, or a temporarily configured circuit can be determined as appropriate.

Abstract

A method and apparatus for training a graph neural network model by means of multiple data owners. In the method, a graph neural network model is divided into a discrimination model and multiple graph neural network sub-models. During model training, each data owner provides its own feature data subset to its own graph neural network sub-model to obtain the feature vector representation of each node. Each data owner receives the discrimination model from a server and obtains a current predicted label value of each node using the feature vector representation of each node, whereby a current loss function is calculated at each data owner; gradient information of the discrimination model is determined on the basis of the current loss function, and each data owner updates its own graph neural network sub-model. Each data owner provides its gradient information to the server, so that the server updates the discrimination model. By using this method, the security of private data at each data owner can be ensured.

Description

Graph neural network model training method, device and system

Technical Field
The embodiments of this specification generally relate to the field of machine learning, and more particularly to methods, devices, and systems for using horizontally segmented feature data sets to collaboratively train graph neural network models via multiple data owners.
Background
The graph neural network model is a machine learning model widely used in the field of machine learning. In many cases, multiple model training participants (for example, e-commerce companies, express companies, and banks) each own different parts of the feature data used for training a graph neural network model. The multiple model training participants usually want to jointly use each other's data to train a unified graph neural network model, but do not want to provide their own data to the other model training participants, so as to prevent their data from being leaked.
In view of this situation, a graph neural network model training method that can protect the security of private data is proposed, which can coordinate the multiple model training participants to train the graph neural network model while ensuring the security of the respective data of those participants, so that the trained graph neural network model can be used by the multiple model training participants.
Summary of the Invention
In view of the above problems, the embodiments of this specification provide a method, device and system for collaboratively training a graph neural network model via multiple data owners, which can accomplish graph neural network model training while ensuring the security of the respective data of the multiple data owners.
According to one aspect of the embodiments of this specification, a method for training a graph neural network model via multiple data owners is provided. The graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner. Each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, and the training sample subset includes a feature data subset and true label values. The method is executed by a data owner and includes: executing the following loop process until a loop end condition is satisfied: providing the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtaining the current discriminant model from the server; providing the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determining the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determining the gradient information of the current discriminant model based on the current loss function and updating the model parameters of the current graph neural network sub-model; and providing the gradient information of the current discriminant model to the server, the server using the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, wherein, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
Optionally, in an example of the above aspect, the gradient information obtained at each data owner may be provided to the server in a secure aggregation manner.

Optionally, in an example of the above aspect, the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.

Optionally, in an example of the above aspect, in each loop iteration, the method may further include: obtaining a current training sample subset.

Optionally, in an example of the above aspect, the loop end condition may include: a predetermined number of loop iterations is reached; the change in each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.

Optionally, in an example of the above aspect, the feature data may include feature data based on image data, voice data, or text data, or the feature data may include user feature data.
According to another aspect of the embodiments of this specification, a method for training a graph neural network model via multiple data owners is provided. The graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values. The method is executed by the server and includes executing the following loop process until a loop end condition is met: providing the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain a predicted label value of each node, determines its own current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not met, determines gradient information of the discriminant model based on its own current loss function, updates the model parameters of its current graph neural network sub-model, and provides the determined gradient information to the server, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not met, obtaining the corresponding gradient information of the current discriminant model from each data owner, and updating the current discriminant model based on the gradient information from the data owners, wherein, when the loop end condition is not met, the updated graph neural network sub-models of the data owners and the updated discriminant model at the server are used as the current models of the next loop process.
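Correspondingly, the server-side loop described above can be pictured with the following Python sketch, in which the data owners are simulated in-process; the fixed node representations, the squared-error loss and the plain gradient averaging are illustrative assumptions rather than requirements of the embodiments.

```python
import numpy as np

rng = np.random.default_rng(1)
H, P = 3, 5                                     # embedding dim, nodes per owner
w_server = 0.1 * rng.normal(size=H)             # discriminant model at the server

class Owner:
    """Stand-in for one data owner and its (frozen) graph neural network sub-model."""
    def __init__(self, seed):
        r = np.random.default_rng(seed)
        self.Z = r.normal(size=(P, H))          # node feature vector representations
        self.t = r.integers(0, 2, size=P).astype(float)  # true label values

    def local_round(self, w):
        O = self.Z @ w                          # predicted label value of each node
        # gradient information of the current discriminant model (the owner would
        # also update its own graph neural network sub-model in the same round)
        return 2.0 / P * self.Z.T @ (O - self.t)

owners = [Owner(seed) for seed in (10, 11, 12)]
for _ in range(30):                             # loop until the end condition is met
    grads = [o.local_round(w_server.copy()) for o in owners]  # provide current model
    w_server -= 0.1 * np.mean(grads, axis=0)    # aggregate (average) and update

print("discriminant model after training:", np.round(w_server, 3))
```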
Optionally, in an example of the above aspect, the gradient information obtained at each data owner may be provided to the server by means of secure aggregation.
Optionally, in an example of the above aspect, the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
According to another aspect of the embodiments of this specification, a method for performing model prediction using a graph neural network model is provided. The graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at the respective data owners. The method is executed by a data owner and includes: providing data to be predicted to the graph neural network sub-model at the data owner to obtain a feature vector representation of each node of the graph neural network sub-model; obtaining the discriminant model from the server; and providing the feature vector representation of each node to the discriminant model to obtain a predicted label value of each node.
According to another aspect of the embodiments of this specification, an apparatus for training a graph neural network model via multiple data owners is provided. The graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values. The apparatus is applied at a data owner and includes: a vector representation unit that provides the current feature data subset to the current graph neural network sub-model to obtain a feature vector representation of each node of the current graph neural network sub-model; a discriminant model acquisition unit that obtains the current discriminant model from the server; a model prediction unit that provides the feature vector representation of each node to the current discriminant model to obtain a current predicted label value of each node; a loss function determination unit that determines a current loss function according to the current predicted label value of each node and the corresponding true label value; a gradient information determination unit that, when the loop end condition is not met, determines gradient information of the current discriminant model based on the current loss function; a model update unit that, when the loop end condition is not met, updates the model parameters of the current graph neural network sub-model based on the current loss function; and a gradient information providing unit that provides the gradient information of the current discriminant model to the server, the server using the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, wherein the vector representation unit, the discriminant model acquisition unit, the model prediction unit, the loss function determination unit, the gradient information determination unit, the model update unit and the gradient information providing unit operate cyclically until the loop end condition is met, and when the loop end condition is not met, the updated graph neural network sub-models of the data owners and the updated discriminant model at the server are used as the current models of the next loop process.
Optionally, in an example of the above aspect, the gradient information providing unit may provide the gradient information obtained at the data owner to the server by means of secure aggregation.
Optionally, in an example of the above aspect, the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
Optionally, in an example of the above aspect, the apparatus may further include: a training sample subset acquisition unit that obtains a current training sample subset in each loop operation.
According to another aspect of the embodiments of this specification, an apparatus for training a graph neural network model via multiple data owners is provided. The graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values. The apparatus is applied at the server and includes: a discriminant model providing unit that provides the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain a predicted label value of each node, determines its own current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not met, determines gradient information of the discriminant model based on its own current loss function, updates the model parameters of its current graph neural network sub-model, and provides the determined gradient information to the server, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; a gradient information acquisition unit that, when the loop end condition is not met, obtains the corresponding gradient information of the current discriminant model from each data owner; and a discriminant model update unit that updates the current discriminant model based on the gradient information from the data owners, wherein the discriminant model providing unit, the gradient information acquisition unit and the discriminant model update unit operate cyclically until the loop end condition is met, and when the loop end condition is not met, the updated graph neural network sub-models of the data owners and the updated discriminant model at the server are used as the current models of the next loop process.
According to another aspect of the embodiments of this specification, a system for training a graph neural network model via multiple data owners is provided, including: multiple data owner devices, each data owner device including the apparatus described above; and a server device including the apparatus described above, wherein the graph neural network model includes a discriminant model located at the server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values.
According to another aspect of the embodiments of this specification, an apparatus for performing model prediction using a graph neural network model is provided. The graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at the respective data owners. The apparatus is applied at a data owner and includes: a vector representation unit that provides data to be predicted to the graph neural network sub-model at the data owner to obtain a feature vector representation of each node of the graph neural network sub-model; a discriminant model acquisition unit that obtains the discriminant model from the server; and a model prediction unit that provides the feature vector representation of each node to the discriminant model to obtain a predicted label value of each node.
According to another aspect of the embodiments of this specification, an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the data owner side as described above.
According to another aspect of the embodiments of this specification, a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the data owner side as described above.
According to another aspect of the embodiments of this specification, an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the server side as described above.
According to another aspect of the embodiments of this specification, a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the server side as described above.
According to another aspect of the embodiments of this specification, an electronic device is provided, including: at least one processor, and a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to execute the model prediction method as described above.
According to another aspect of the embodiments of this specification, a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model prediction method as described above.
With the solutions of the embodiments of this specification, the model parameters of the graph neural network model can be obtained by training without leaking the private data of the multiple training participants.
Description of the drawings
A further understanding of the nature and advantages of the contents of this specification may be achieved by referring to the following drawings. In the drawings, similar components or features may have the same reference signs.
Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of this specification;
Fig. 2 shows a schematic diagram of an example of a horizontally split training sample set according to an embodiment of this specification;
Fig. 3 shows a schematic architecture diagram of a system for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 4 shows a flowchart of a method for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 6 shows a flowchart of a model prediction process based on a graph neural network model according to an embodiment of this specification;
Fig. 7 shows a block diagram of an apparatus, applied at a data owner, for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 8 shows a block diagram of an apparatus, applied at a server, for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model according to an embodiment of this specification;
Fig. 10 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of this specification;
Fig. 11 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of this specification; and
Fig. 12 shows a schematic diagram of an electronic device for model prediction based on a graph neural network model according to an embodiment of this specification.
Detailed description
现在将参考示例实施方式讨论本文描述的主题。应该理解,讨论这些实施方式只是为了使得本领域技术人员能够更好地理解从而实现本文描述的主题,并非是对权利要求书中所阐述的保护范围、适用性或者示例的限制。可以在不脱离本说明书内容的保护 范围的情况下,对所讨论的元素的功能和排列进行改变。各个示例可以根据需要,省略、替代或者添加各种过程或组件。例如,所描述的方法可以按照与所描述的顺序不同的顺序来执行,以及各个步骤可以被添加、省略或者组合。另外,相对一些示例所描述的特征在其它例子中也可以进行组合。The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that the discussion of these embodiments is only to enable those skilled in the art to better understand and realize the subject described herein, and is not to limit the scope of protection, applicability, or examples set forth in the claims. The function and arrangement of the discussed elements can be changed without departing from the scope of protection of the contents of this specification. Various examples can omit, substitute, or add various procedures or components as needed. For example, the described method may be executed in a different order from the described order, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples can also be combined in other examples.
如本文中使用的,术语“包括”及其变型表示开放的术语,含义是“包括但不限于”。术语“基于”表示“至少部分地基于”。术语“一个实施例”和“一实施例”表示“至少一个实施例”。术语“另一个实施例”表示“至少一个其他实施例”。术语“第一”、“第二”等可以指代不同的或相同的对象。下面可以包括其他的定义,无论是明确的还是隐含的。除非上下文中明确地指明,否则一个术语的定义在整个说明书中是一致的。As used herein, the term "including" and its variations mean open terms, meaning "including but not limited to". The term "based on" means "based at least in part on." The terms "one embodiment" and "an embodiment" mean "at least one embodiment." The term "another embodiment" means "at least one other embodiment." The terms "first", "second", etc. may refer to different or the same objects. Other definitions can be included below, whether explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.
In this specification, the training sample set used in the graph neural network model training scheme is a horizontally split training sample set. The term "horizontally splitting the training sample set" means dividing the training sample set into multiple training sample subsets according to modules/functions (or some specified rule), where each training sample subset contains a portion of the training samples, and each training sample included in a training sample subset is a complete training sample, that is, it includes all the field data and the corresponding label value of that training sample. In this disclosure, assuming there are three data owners Alice, Bob and Charlie, local samples are collected at each data owner to form a local sample set, and every sample contained in a local sample set is a complete sample; the local sample sets collected by the three data owners Alice, Bob and Charlie then constitute the training sample set used for graph neural network model training, where each local sample set serves as a training sample subset of that training sample set for training the graph neural network model.
In this specification, the data owners each own a different portion of the training samples used for graph neural network model training. For example, taking two data owners as an example, assume that the training sample set includes 100 training samples, each of which contains multiple feature values and an actual label value; then the data owned by the first data owner may be the first 30 training samples in the training sample set, and the data owned by the second data owner may be the last 70 training samples in the training sample set.
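The following short Python sketch illustrates this kind of horizontal split: every owner receives complete samples (all feature fields together with the label value), and only the number of samples per owner differs. The 30/70 split sizes follow the example above; the feature dimension and the data themselves are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 8))        # 100 training samples, 8 feature fields
labels = rng.integers(0, 2, size=100)       # one label value per sample

split_sizes = {"owner_A": 30, "owner_B": 70}
subsets, start = {}, 0
for owner, n in split_sizes.items():
    subsets[owner] = (features[start:start + n], labels[start:start + n])
    start += n

for owner, (x, y) in subsets.items():
    # each row of x together with its y is a complete training sample
    print(owner, "holds", x.shape[0], "complete samples")
```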
在本说明书的实施例中,图神经网络模型训练时所使用的特征数据可以包括基于图像数据、语音数据或文本数据的特征数据。相应地,图神经网络模型可以应用于基于图像数据、语音数据或者文本数据的业务风险识别、业务分类或者业务决策等等。或者,图神经网络模型训练时所使用的特征数据可以包括用户特征数据。相应地,图神经网络模型可以应用于基于用户特征数据的业务风险识别、业务分类、业务推荐或者业务决策等等。In the embodiment of this specification, the feature data used in the training of the graph neural network model may include feature data based on image data, voice data, or text data. Correspondingly, the graph neural network model can be applied to business risk identification, business classification or business decision-making based on image data, voice data or text data. Alternatively, the feature data used in the training of the graph neural network model may include user feature data. Correspondingly, the graph neural network model can be applied to business risk identification, business classification, business recommendation or business decision based on user characteristic data.
此外,在本说明书的实施例中,图神经网络模型所使用的待预测数据可以包括图像数据、语音数据或文本数据。或者,图神经网络模型所使用的待预测数据可以包括用户特征数据。In addition, in the embodiments of this specification, the data to be predicted used by the graph neural network model may include image data, voice data, or text data. Alternatively, the data to be predicted used by the graph neural network model may include user characteristic data.
在本说明书中,术语“图神经网络模型”和“图神经网络”可以互换使用。术语“图神经网络子模型”和“图神经子网络”可以互换使用。此外,术语“数据拥有方”和“训练参与方”可以互换使用。In this specification, the terms "graph neural network model" and "graph neural network" can be used interchangeably. The terms "graph neural network sub-model" and "graph neural sub-network" can be used interchangeably. In addition, the terms "data owner" and "training participant" can be used interchangeably.
下面将结合附图来详细描述根据本说明书实施例的用于经由多个数据拥有方来协同训练图神经网络模型的方法、装置以及系统。The method, device, and system for collaborative training of graph neural network models through multiple data owners according to embodiments of the present specification will be described in detail below with reference to the accompanying drawings.
图1示出了根据本说明书的实施例的图神经网络模型的示例的示意图。Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of the present specification.
As shown in Fig. 1, the graph neural network (GNN) model is divided into a discriminant model 10 and multiple graph neural network sub-models 20, for example the graph neural network sub-models GNN_A, GNN_B and GNN_C in Fig. 1. The discriminant model 10 is deployed at the server 110, and each graph neural network sub-model is deployed at the corresponding data owner, for example on a client at the corresponding data owner, with each data owner having one graph neural network sub-model. As shown in Fig. 1, GNN_A is deployed at data owner A 120-1, GNN_B is deployed at data owner B 120-2, and GNN_C is deployed at data owner C 120-3.
The graph neural network sub-model 20 is used to perform GNN computation on the data owner's data to obtain the feature vector representation of each node of that graph neural network sub-model. Specifically, when performing the GNN computation, the data owner's data is provided to the graph neural network sub-model 20, and, according to the node features and the graph neural sub-network, the feature vector representation of each node corresponding to the current data is obtained through propagation over K-hop neighbors.
判别模型10被使用来基于数据拥有方处得到的每个节点的特征向量表示进行模型计算,以得到每个节点的模型预测值。The discriminant model 10 is used to perform model calculation based on the feature vector representation of each node obtained from the data owner to obtain the model prediction value of each node.
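The division of labor between the two parts can be sketched as follows in Python. The mean-aggregation propagation rule, the tanh non-linearity and the logistic discriminant head are assumptions chosen for brevity; the embodiments do not prescribe a particular GNN variant or discriminant model structure.

```python
import numpy as np

def gnn_sub_model(X, A, W_list):
    """K-hop propagation: one weight matrix per propagation round."""
    A_hat = A / A.sum(axis=1, keepdims=True)         # row-normalized adjacency
    H = X
    for W in W_list:                                  # K = len(W_list) rounds
        H = np.tanh(A_hat @ H @ W)                    # aggregate neighbors, transform
    return H                                          # per-node feature vectors

def discriminant_model(H, w, b):
    """Maps each node's feature vector to a predicted label value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(H @ w + b)))

rng = np.random.default_rng(0)
P, D = 6, 4
A = np.maximum((rng.random((P, P)) < 0.4).astype(float), np.eye(P))
X = rng.normal(size=(P, D))
W_list = [rng.normal(scale=0.3, size=(D, D)) for _ in range(2)]   # K = 2
H = gnn_sub_model(X, A, W_list)
preds = discriminant_model(H, rng.normal(size=D), 0.0)
print("per-node predicted label values:", np.round(preds, 3))
```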
在本说明书中,各个数据拥有方处所具有的数据是经过水平切分的数据。图2示出了根据本说明书的实施例的经过水平切分的训练样本数据的示例的示意图。在图2中,示出了2个数据方Alice和Bob,多个数据方也类似。每个数据方Alice和Bob拥有的训练样本子集中的每条训练样本是完整的,即,每条训练样本包括完整的特征数据(x)和标记数据(y)。比如,Alice拥有完整的训练样本(x0,y0)。In this specification, the data possessed by each data owner is horizontally segmented data. Fig. 2 shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification. In Figure 2, two data parties Alice and Bob are shown, and multiple data parties are similar. Each training sample in the training sample subset owned by each data party Alice and Bob is complete, that is, each training sample includes complete feature data (x) and labeled data (y). For example, Alice has a complete training sample (x0, y0).
图3示出了示出了根据本说明书的实施例的用于经由多个数据拥有方训练图神经网络模型的系统(在下文中称为“模型训练系统300”)的架构示意图。FIG. 3 shows a schematic diagram illustrating the architecture of a system for training graph neural network models via multiple data owners (hereinafter referred to as "model training system 300") according to an embodiment of the present specification.
如图3所示,模型训练系统300包括服务端设备310以及至少一个数据拥有方设备320。在图3中示出了3个数据拥有方设备320。在本说明书的其它实施例中,可以包括更多或者更少的数据拥有方设备320。服务端设备310以及至少一个数据拥有方设备320可以通过例如但不局限于互联网或局域网等的网络330相互通信。As shown in FIG. 3, the model training system 300 includes a server device 310 and at least one data owner device 320. Three data owner devices 320 are shown in FIG. 3. In other embodiments of this specification, more or fewer data owner devices 320 may be included. The server device 310 and the at least one data owner device 320 may communicate with each other via a network 330 such as but not limited to the Internet or a local area network.
In this specification, the graph neural network model to be trained (i.e., the neural network model structure excluding the discriminant model) is divided into a first number of graph neural network sub-models. Here, the first number is equal to the number of data owner devices participating in model training; assume the number of data owner devices is N. Accordingly, the graph neural network model is decomposed into N sub-models, and each data owner device has one sub-model. The feature data used for model training is located at the respective data owner devices 320; the feature data set is horizontally split into multiple feature data subsets in the manner described in Fig. 2, and each data owner device holds one feature data subset. Here, the sub-model and the corresponding feature data subset owned by each data owner are that data owner's secret and cannot be learned, or at least not completely learned, by other data owners.
在本说明书中,多个数据拥有方设备320和服务端设备310一起使用各个数据拥有方设备320的训练样本子集来协同训练图神经网络模型。关于模型的具体训练过程将在下面参照图4到图5进行详细描述。In this specification, multiple data owner devices 320 and server devices 310 use the training sample subsets of each data owner device 320 to collaboratively train the graph neural network model. The specific training process of the model will be described in detail with reference to FIGS. 4 to 5 below.
In this specification, the server device 310 and the data owner devices 320 may be any suitable electronic devices with computing capabilities, including but not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile electronic devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handheld devices, messaging devices, wearable electronic devices, consumer electronic devices, and so on.
图4示出了根据本说明书的实施例的用于经由多个数据拥有方训练图神经网络模型的方法400的流程图。FIG. 4 shows a flowchart of a method 400 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
如图4所示,在401,初始化各个数据拥有方处的图神经网络子模型以及服务端的判别模型。例如,初始化数据拥有方A、B和C处的图神经网络子模型GNN A、GNN B和GNN C,以及初始化服务端的判别模型10。 As shown in Fig. 4, at 401, the graph neural network sub-model at each data owner and the discriminant model of the server are initialized. For example, initialize the graph neural network sub-models GNN A , GNN B and GNN C at the data owners A, B, and C, and initialize the discriminant model 10 on the server side.
接着,循环执行402到410的操作,直到满足循环结束条件。Then, the operations from 402 to 410 are executed in a loop until the loop end condition is satisfied.
Specifically, at 402, each data owner device 320 obtains its own current training sample subset. For example, data owner A obtains a current training sample subset S_A, data owner B obtains a current training sample subset S_B, and data owner C obtains a current training sample subset S_C. Each training sample subset includes a feature data subset and true label values.
At 403, at each data owner device 320, the obtained current training sample subset is provided to the data owner's own graph neural network sub-model for GNN computation, so as to obtain the feature vector representation of each node in that graph neural network sub-model. Specifically, when performing the GNN computation, the current training sample subset is provided to the graph neural network sub-model 20, and, according to the node features and the graph neural sub-network, the feature vector representation of each node corresponding to the current training sample subset is obtained through propagation over K-hop neighbors.
在404,各个数据拥有方设备320从服务端310获取当前判别模型。随后,在405,在各个数据拥有方设备320处,使用当前判别模型来基于各个节点的特征向量表示来进行模型预测,以得到各个节点的当前预测标签值。At 404, each data owner device 320 obtains the current discrimination model from the server 310. Subsequently, at 405, at each data owner device 320, the current discriminant model is used to perform model prediction based on the feature vector representation of each node to obtain the current predicted label value of each node.
Then, at 406, at each data owner device 320, the current loss function is determined according to the current predicted label value of each node and the corresponding true label value. For example, in one example, the current loss function may be computed by accumulating, over all nodes, a per-node loss term determined from t_i and O_i, where i denotes the i-th node, P denotes the total number of nodes in the graph neural network sub-model, t_i denotes the true label value of the i-th node, and O_i denotes the current predicted label value of the i-th node.
At 407, at each data owner device 320, the gradient information of the received current discriminant model, i.e., the gradient information of the model parameters of the current discriminant model, is determined based on the current loss function, for example through back propagation. In addition, at each data owner device 320, the model parameters of the current graph neural network sub-model are updated based on the current loss function, for example through back propagation.
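As an illustration of how a single back-propagation pass can yield both pieces of information at a data owner, the following sketch uses PyTorch autograd; the framework choice, the squared-error loss and the single propagation step are assumptions made only for this example.

```python
import torch

torch.manual_seed(0)
P, D, H = 5, 4, 3
A_hat = torch.rand(P, P)
A_hat = A_hat / A_hat.sum(dim=1, keepdim=True)          # normalized adjacency
X = torch.randn(P, D)                                   # current feature subset
t = torch.randint(0, 2, (P,)).float()                   # true label values

W_local = (0.1 * torch.randn(D, H)).requires_grad_()    # GNN sub-model parameters
w_disc = (0.1 * torch.randn(H)).requires_grad_()        # current discriminant model

Z = torch.tanh(A_hat @ X @ W_local)                     # node representations (403)
O = Z @ w_disc                                          # predicted label values (405)
loss = torch.mean((O - t) ** 2)                         # current loss function (406)
loss.backward()                                         # back propagation (407)

grad_for_server = w_disc.grad.clone()   # gradient information of the discriminant model
with torch.no_grad():                   # local update of the GNN sub-model parameters
    W_local -= 0.1 * W_local.grad
print("gradient information sent to the server:", grad_for_server)
```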
At 408, each data owner device 320 provides its determined gradient information of the current discriminant model to the server 310. In one example, each data owner device 320 may send its determined gradient information of the current discriminant model (for example, as it is) to the server 310, and the server 310 then aggregates the received gradient information. In another example, each data owner may provide the gradient information to the server 310 by means of secure aggregation. In this specification, the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment. In addition, other suitable secure aggregation methods may also be used in this specification.
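One common way to realize secret-sharing-based secure aggregation is pairwise additive masking, sketched below in Python; the masks cancel when the server sums the contributions, so only the aggregate gradient is revealed. This toy omits modular arithmetic, key agreement and dropout handling, and is not intended to describe the exact protocol used in any particular deployment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_owners, dim = 3, 4
local_grads = [rng.normal(size=dim) for _ in range(n_owners)]   # private gradients

# pairwise masks: for each pair (i, j) with i < j, owner i adds masks[i][j]
# and owner j subtracts the same mask, so the masks vanish in the sum
masks = [[rng.normal(size=dim) for _ in range(n_owners)] for _ in range(n_owners)]
masked = []
for i in range(n_owners):
    g = local_grads[i].copy()
    for j in range(n_owners):
        if i < j:
            g += masks[i][j]
        elif i > j:
            g -= masks[j][i]
    masked.append(g)                       # this is all the server ever sees

aggregate = np.sum(masked, axis=0)         # masks cancel in the sum
assert np.allclose(aggregate, np.sum(local_grads, axis=0))
print("aggregated gradient:", aggregate / n_owners)
```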
此外,要说明的是,在本说明书中,对所接收的各个梯度信息进行聚合可以包括对所接收的各个梯度信息进行求平均。In addition, it should be noted that in this specification, aggregating the received gradient information may include averaging the received gradient information.
在409,在服务端310处,使用经过聚合后的梯度信息来更新服务端310处的判别 模型,以用于后续训练循环过程,或者作为训练好的判别模型。In 409, at the server 310, the aggregated gradient information is used to update the discriminant model at the server 310 for subsequent training cycles or as a trained discriminant model.
在410,判断是否满足循环结束条件,即,是否达到预定循环次数。如果达到预定循环次数,则流程结束。如果未达到预定循环次数,则返回到402的操作,执行下一训练循环过程。这里,当前循环过程中更新的各个数据拥有方处的图神经网络子模型以及服务端的判别模型用作下一训练循环过程的当前模型。At 410, it is determined whether the loop end condition is satisfied, that is, whether the predetermined number of loops is reached. If the predetermined number of cycles is reached, the process ends. If the predetermined number of cycles has not been reached, return to the operation of 402 and execute the next training cycle process. Here, the graph neural network sub-model of each data owner updated in the current cycle process and the discriminant model of the server end are used as the current model of the next training cycle process.
这里要说明的是,在上述的示例中,训练循环过程的结束条件是指达到预定循环次数。在本说明书的另一示例中,训练循环过程的结束条件也可以是判别模型10的各个模型参数的变化量不大于预定阈值。在这种情况下,关于循环过程是否结束的判断过程在服务端310中执行。此外,在本说明书的另一示例中,训练循环过程的结束条件也可以是当前总损失函数位于预定范围内,例如,当前总损失函数不大于预定阈值。同样,关于循环过程是否结束的判断过程在服务端310中执行。此外,在这种情况下,各个数据拥有方设备320需要将各自的损失函数提供给服务端310来进行聚合,以得到总损失函数。另外,为了保证各个数据拥有方设备320的损失函数的隐私安全,在本说明书的另一示例中,各个数据拥有方设备320可以通过安全聚合的方式将各自的损失函数提供给服务端310来得到总损失函数。同样,针对损失函数的安全聚合也可以包括:基于秘密共享的安全聚合;基于同态加密的安全聚合;或者基于可信执行环境的安全聚合。It should be explained here that in the above example, the end condition of the training cycle process refers to reaching the predetermined number of cycles. In another example of this specification, the end condition of the training loop process may also be that the variation of each model parameter of the discrimination model 10 is not greater than a predetermined threshold. In this case, the judgment process as to whether the loop process is over is executed in the server 310. In addition, in another example of this specification, the end condition of the training loop process may also be that the current total loss function is within a predetermined range, for example, the current total loss function is not greater than a predetermined threshold. Similarly, the process of determining whether the loop process is over is executed in the server 310. In addition, in this case, each data owner device 320 needs to provide its own loss function to the server 310 for aggregation, so as to obtain the total loss function. In addition, in order to ensure the privacy and security of the loss function of each data owner device 320, in another example of this specification, each data owner device 320 can provide their own loss function to the server 310 by means of secure aggregation. Total loss function. Similarly, the security aggregation for the loss function may also include: the security aggregation based on secret sharing; the security aggregation based on homomorphic encryption; or the security aggregation based on the trusted execution environment.
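The three alternative loop end conditions mentioned above can be checked, for example, as in the following sketch; the thresholds and the choice of the maximum-absolute-change criterion are illustrative assumptions.

```python
import numpy as np

def loop_should_end(round_idx, max_rounds, prev_params, curr_params, total_loss,
                    param_tol=1e-4, loss_bound=1e-2):
    if round_idx >= max_rounds:                                  # predetermined number of loops
        return True
    if np.max(np.abs(curr_params - prev_params)) <= param_tol:   # parameter change small enough
        return True
    if total_loss <= loss_bound:                                 # total loss within predetermined range
        return True
    return False

print(loop_should_end(3, 10, np.array([0.5, 0.2]), np.array([0.5001, 0.2]), 0.3))
```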
图5示出了根据本说明书的实施例的用于经由多个数据拥有方训练图神经网络模型的一个示例过程的示意图。FIG. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
图5中示出了三个数据拥有方A、B和C。在进行模型训练时,在每轮循环过程中,数据拥有方A、B和C分别获取各自的当前特征数据子集X A、X B和X C。数据拥有方A、B和C分别将当前特征数据子集X A、X B和X C提供给各自的当前图神经网络子模型G A、G B和G C,以得到各个当前图神经网络子模型中的各个节点的当前特征向量表示。 Figure 5 shows three data owners A, B, and C. During model training, in each round of the cycle, the data owners A, B, and C obtain their respective current feature data subsets X A , X B, and X C. The data owners A, B, and C respectively provide the current feature data subsets X A , X B, and X C to their current graph neural network sub-models G A , G B, and G C to obtain each current graph neural network sub-model. The current feature vector representation of each node in the model.
Subsequently, each data owner obtains the current discriminant model H from the server 110. Each data owner then provides the obtained current feature vector representation of each node to the current discriminant model H to obtain the current predicted label value of each node. Then, at each data owner, the current loss function is determined based on the current predicted label value of each node and the corresponding true label value, and, based on the current loss function, the gradient information GH of the model parameters of the current discriminant model is determined through back propagation. Meanwhile, at each data owner, the model parameters of each network layer of the current graph neural network sub-model are also updated through back propagation based on the current loss function.
After each data owner obtains the gradient information GH of the model parameters of the current discriminant model, it provides its gradient information to the server by means of secure aggregation. The server updates the current discriminant model based on the resulting aggregated gradient information.
按照上述方式循环操作,直到满足循环结束条件,由此完成图神经网络模型训练过程。Loop operations in the above manner until the loop end condition is met, thereby completing the graph neural network model training process.
In addition, it should be noted that Figs. 3 to 5 show a model training scheme with three data owners; in other examples of the embodiments of this specification, more or fewer than three data owners may also be included.
在传统的GNN模型中,由于多个数据拥有方的数据不能彼此分享,所以都是只基于单个数据拥有方的数据来构建GNN模型。此外,由于单个数据拥有方的数据有限,所以GNN模型的效果也有限。利用本说明书的实施例提供模型训练方案,可以在保护各个数据拥有方的数据隐私的基础上,共同训练GNN模型,由此提升GNN模型效果。In the traditional GNN model, since the data of multiple data owners cannot be shared with each other, the GNN model is constructed only based on the data of a single data owner. In addition, due to the limited data of a single data owner, the effect of the GNN model is also limited. Using the embodiment of this specification to provide a model training solution can jointly train the GNN model on the basis of protecting the data privacy of each data owner, thereby improving the effect of the GNN model.
在现有的联邦学习方案中,GNN模型的所有模型部分都布置在服务端,各个数据拥有方(客户端)通过使用各自的隐私数据来学习模型梯度信息,然后将所得到的模型梯度信息提供给服务端进行安全聚合,然后进行全局模型更新。按照这种方式,所有数据拥有方的模型结构都必须一致,这样服务端才能对各个数据拥有方的模型梯度信息进行安全聚合来更新模型,从而不能针对不同的客户端来定制化不同的模型。然而,不同的数据拥有方的数据(特征及图关系)的稀疏质量不同,所以可以需要不同的GNN模型来学习。比如,数据拥有方A传播2度邻居时得到的节点特征向量表示是最优的,而数据拥有方B传播5度邻居时得到的节点特征向量表示才是最优的。In the existing federated learning scheme, all the model parts of the GNN model are arranged on the server, and each data owner (client) learns the model gradient information by using their own private data, and then provides the obtained model gradient information Perform security aggregation on the server, and then update the global model. In this way, the model structure of all data owners must be consistent, so that the server can safely aggregate the model gradient information of each data owner to update the model, so that different models cannot be customized for different clients. However, the sparse quality of the data (features and graph relationships) of different data owners is different, so different GNN models may be needed for learning. For example, the node feature vector representation obtained when the data owner A propagates a 2-degree neighbor is optimal, while the node feature vector representation obtained when the data owner B propagates a 5-degree neighbor is the optimal.
利用本说明书的实施例提供的模型训练方法,通过将用于得到节点特征向量表示的GNN模型部分布置在各个数据拥有方处自己学习(局部),而将判别模型放在服务端(全局)来经由多个数据拥有方来共同学习,从而可以提高判别模型的效果。Using the model training method provided by the embodiments of this specification, the GNN model used to obtain the feature vector representation of the node is arranged at each data owner for self-learning (local), and the discriminant model is placed on the server (global). Through multiple data owners to learn together, which can improve the effect of the discriminant model.
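The following sketch illustrates why this arrangement allows each data owner to pick its own propagation depth: as long as every sub-model outputs node representations of the same dimension, one shared discriminant model can serve all owners. The parameter-free propagation rule and the depths of 2 and 5 (echoing the example above) are purely illustrative.

```python
import numpy as np

def propagate(X, A, K):
    A_hat = A / A.sum(axis=1, keepdims=True)
    H = X
    for _ in range(K):
        H = np.tanh(A_hat @ H)          # parameter-free propagation, for brevity
    return H                            # shape (num_nodes, dim) regardless of K

rng = np.random.default_rng(0)
dim = 4
w_disc = rng.normal(size=dim)           # single discriminant model shared via the server

for name, num_nodes, K in [("owner_A", 5, 2), ("owner_B", 7, 5)]:
    A = np.maximum((rng.random((num_nodes, num_nodes)) < 0.4).astype(float),
                   np.eye(num_nodes))
    X = rng.normal(size=(num_nodes, dim))
    Z = propagate(X, A, K)              # locally chosen propagation depth
    preds = Z @ w_disc                  # the same discriminant works for both owners
    print(name, "uses K =", K, "-> predictions shape", preds.shape)
```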
此外,利用图3-图5中公开的图神经网络模型训练方法,各个数据拥有方通过安全聚合的方式来将各自的当前判别模型的梯度信息提供给服务端,由此可以防止各个数据拥有方的梯度信息被完整地提供给服务端,从而避免服务端能够使用所接收的梯度信息来推导出数据拥有方的隐私数据,进而实现针对数据拥有方的隐私数据保护。In addition, using the graph neural network model training method disclosed in Figures 3 to 5, each data owner provides the gradient information of their current discriminant model to the server through secure aggregation, thereby preventing each data owner The gradient information of the data is completely provided to the server, so as to prevent the server from using the received gradient information to derive the privacy data of the data owner, thereby realizing the privacy data protection for the data owner.
图6示出了根据本说明书的实施例的基于图神经网络模型的模型预测过程600的流程图。图6中示出的模型预测过程中使用的图神经网络模型是按照图4所示的过程训练的图神经网络模型。FIG. 6 shows a flowchart of a model prediction process 600 based on a graph neural network model according to an embodiment of the present specification. The graph neural network model used in the model prediction process shown in FIG. 6 is a graph neural network model trained according to the process shown in FIG. 4.
在进行模型预测时,在610,将待预测数据提供给数据拥有方处的图神经网络子模 型,以得到图神经网络子模型的各个节点的特征向量表示。接着,在620,从服务端获取判别模型。然后,在630,将各个节点的特征向量表示提供给所接收的判别模型,以得到各个节点的预测标签值,由此完成模型预测过程。When performing model prediction, at 610, the data to be predicted is provided to the graph neural network sub-model of the data owner to obtain the feature vector representation of each node of the graph neural network sub-model. Next, at 620, the discriminant model is obtained from the server. Then, at 630, the feature vector representation of each node is provided to the received discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
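A minimal sketch of this prediction flow, with the server access reduced to a callback, might look as follows in Python; all names, shapes and the logistic discriminant are assumptions for illustration.

```python
import numpy as np

def predict_at_owner(X_new, A_new, W_local, fetch_discriminant):
    A_hat = A_new / A_new.sum(axis=1, keepdims=True)
    Z = np.tanh(A_hat @ X_new @ W_local)        # 610: node feature vector representations
    w, b = fetch_discriminant()                 # 620: discriminant model from the server
    return 1.0 / (1.0 + np.exp(-(Z @ w + b)))   # 630: predicted label value per node

rng = np.random.default_rng(0)
P, D, H = 4, 3, 2
A_new = np.maximum((rng.random((P, P)) < 0.5).astype(float), np.eye(P))
X_new = rng.normal(size=(P, D))                 # data to be predicted
W_local = rng.normal(scale=0.3, size=(D, H))    # trained GNN sub-model at the owner
preds = predict_at_owner(X_new, A_new, W_local,
                         lambda: (rng.normal(size=H), 0.0))
print("predicted label values:", np.round(preds, 3))
```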
Fig. 7 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 700 for training a graph neural network model via multiple data owners according to an embodiment of this specification. In this embodiment, the graph neural network model includes a discriminant model located at the server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values. The model training apparatus 700 is located on the data owner side.
如图7所示,模型训练装置700包括向量表示单元710、判别模型获取单元720、模型预测单元730、损失函数确定单元740、梯度信息确定单元750、模型更新单元760和梯度信息提供单元770。As shown in FIG. 7, the model training device 700 includes a vector representation unit 710, a discriminant model acquisition unit 720, a model prediction unit 730, a loss function determination unit 740, a gradient information determination unit 750, a model update unit 760, and a gradient information providing unit 770.
During model training, the vector representation unit 710, the discriminant model acquisition unit 720, the model prediction unit 730, the loss function determination unit 740, the gradient information determination unit 750, the model update unit 760 and the gradient information providing unit 770 operate cyclically until the loop end condition is met. The loop end condition may include, for example: a predetermined number of loops is reached; the variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range. When the loop process has not ended, the updated graph neural network sub-models of the data owners and the updated discriminant model at the server are used as the current models of the next loop process.
具体地,向量表示单元710被配置为将当前特征数据子集提供给当前图神经网络子模型,以得到当前图神经网络子模型的各个节点的特征向量表示。向量表示单元710的操作可以参考上面参照图4描述的403的操作。Specifically, the vector representation unit 710 is configured to provide the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model. The operation of the vector representation unit 710 may refer to the operation of 403 described above with reference to FIG. 4.
判别模型获取单元720被配置为从服务端获取当前判别模型。判别模型获取单元720的操作可以参考上面参照图4描述的404的操作。The discriminant model obtaining unit 720 is configured to obtain the current discriminant model from the server. The operation of the discriminant model acquisition unit 720 may refer to the operation of 404 described above with reference to FIG. 4.
模型预测单元730被配置为将各个节点的特征向量表示提供给当前判别模型,以得到各个节点的当前预测标签值。模型预测单元730的操作可以参考上面参照图4描述的405的操作。The model prediction unit 730 is configured to provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node. The operation of the model prediction unit 730 may refer to the operation of 405 described above with reference to FIG. 4.
损失函数确定单元740被配置为根据各个节点的当前预测标签值以及对应的真实标签值,确定当前损失函数。损失函数确定单元740的操作可以参考上面参照图4描述的406的操作。The loss function determining unit 740 is configured to determine the current loss function according to the current predicted label value of each node and the corresponding real label value. The operation of the loss function determining unit 740 may refer to the operation of 406 described above with reference to FIG. 4.
梯度信息确定单元750被配置为在不满足循环结束条件时,基于当前损失函数,确定当前判别模型的梯度信息。梯度信息确定单元750的操作可以参考上面参照图4描述的407的操作。The gradient information determining unit 750 is configured to determine the gradient information of the current discriminant model based on the current loss function when the loop end condition is not satisfied. The operation of the gradient information determining unit 750 may refer to the operation of 407 described above with reference to FIG. 4.
模型更新单元760被配置为在不满足循环结束条件时,基于当前损失函数,更新当前图神经网络子模型的模型参数。模型更新单元760的操作可以参考上面参照图4描述的407的操作。The model updating unit 760 is configured to update the model parameters of the neural network sub-model of the current graph based on the current loss function when the loop end condition is not satisfied. The operation of the model update unit 760 may refer to the operation of 407 described above with reference to FIG. 4.
梯度信息提供单元770被配置为将当前判别模型的梯度信息提供给服务端,所述服务端使用来自各个数据拥有方的所述当前判别模型的梯度信息来更新服务端处的判别模型。梯度信息提供单元770的操作可以参考上参照图4描述的408的操作。The gradient information providing unit 770 is configured to provide gradient information of the current discriminant model to the server, and the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server. The operation of the gradient information providing unit 770 may refer to the operation of 408 described above with reference to FIG. 4.
在本说明书的一个示例中,梯度信息提供单元770可以采用安全聚合的方式来将当前判别模型的梯度信息提供给服务端。In an example of this specification, the gradient information providing unit 770 can provide the gradient information of the current discriminant model to the server in a safe aggregation manner.
此外,可选地,模型训练装置700还可以包括训练样本子集获取单元(未示出)。在每次循环操作时,训练样本子集获取单元被配置为获取当前训练样本子集。In addition, optionally, the model training device 700 may further include a training sample subset acquisition unit (not shown). In each cycle operation, the training sample subset acquiring unit is configured to acquire the current training sample subset.
Fig. 8 shows a block diagram of an apparatus (hereinafter referred to as a model training apparatus 800) for collaboratively training a graph neural network model via multiple data owners according to an embodiment of this specification. In this embodiment, the graph neural network model includes a discriminant model located at the server and graph neural network sub-models located at the respective data owners, and each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, the training sample subset including a feature data subset and true label values. The model training apparatus 800 is located on the server side.
如图8所示,模型训练装置800包括判别模型提供单元810、梯度信息获取单元820和模型更新单元830。As shown in FIG. 8, the model training device 800 includes a discriminant model providing unit 810, a gradient information acquiring unit 820, and a model updating unit 830.
在进行模型训练时,判别模型提供单元810、梯度信息获取单元820和模型更新单元830循环操作,直到满足循环结束条件。所述循环结束条件例如可以包括:达到预定循环次数,所述判别模型的各个模型参数的变化量不大于预定阈值;或者当前总损失函数位于预定范围内。在循环过程未结束时,更新后的各个数据拥有方的图神经网络子模型以及服务端的判别模型用作下一循环过程的当前模型。During model training, the discriminant model providing unit 810, the gradient information acquiring unit 820, and the model updating unit 830 operate in a loop until the loop end condition is satisfied. The loop ending condition may include, for example, that a predetermined number of loops is reached, and the variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range. When the cycle process is not over, the updated graph neural network sub-model of each data owner and the discriminant model of the server are used as the current model of the next cycle process.
具体地,判别模型提供单元810被配置为将当前判别模型提供给各个数据拥有方,以供各个数据拥有方使用来预测各个节点的预测标签值。判别模型提供单元810的操作可以参考上面参照图4描述的404的操作。Specifically, the discriminant model providing unit 810 is configured to provide the current discriminant model to each data owner for use by each data owner to predict the predicted label value of each node. The operation of the discriminant model providing unit 810 may refer to the operation of 404 described above with reference to FIG. 4.
梯度信息获取单元820被配置为在未满足循环结束条件时,从各个数据拥有方获取当前判别模型的对应梯度信息。梯度信息获取单元820的操作可以参考上面参照图4描述的408的操作。The gradient information acquiring unit 820 is configured to acquire the corresponding gradient information of the current discriminant model from each data owner when the loop end condition is not met. The operation of the gradient information acquisition unit 820 may refer to the operation of 408 described above with reference to FIG. 4.
判别模型更新单元830被配置基于来自各个数据拥有方的梯度信息来更新当前判别模型。判别模型更新单元830的操作可以参考上面参照图4描述的409的操作。The discriminant model update unit 830 is configured to update the current discriminant model based on gradient information from each data owner. The operation of the discriminant model update unit 830 can refer to the operation of 409 described above with reference to FIG. 4.
图9示出了根据本说明书的实施例的用于基于图神经网络模型来进行模型预测的装置(下文中简称为模型预测装置900)的方框图。模型预测装置900应用于数据拥有方。Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model (hereinafter referred to as a model prediction apparatus 900) according to an embodiment of the present specification. The model prediction device 900 is applied to the data owner.
如图9所述,模型预测装置900包括向量表示单元910、判别模型获取单元920和模型预测单元930。As shown in FIG. 9, the model prediction device 900 includes a vector representation unit 910, a discriminant model acquisition unit 920, and a model prediction unit 930.
向量表示单元910被配置为将待预测数据提供给数据拥有方处的图神经网络子模型,以得到图神经网络子模型的各个节点的特征向量表示。判别模型获取单元920被配置为从服务端获取判别模型。模型预测单元930被配置为将各个节点的特征向量表示提供给判别模型,以得到各个节点的预测标签值,由此完成模型预测过程。The vector representation unit 910 is configured to provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model. The discriminant model obtaining unit 920 is configured to obtain the discriminant model from the server. The model prediction unit 930 is configured to provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
如上参照图1到图9，对根据本说明书实施例的模型训练和预测方法、装置及系统进行了描述。上面的模型训练装置、模型预测装置可以采用硬件实现，也可以采用软件或者硬件和软件的组合来实现。The model training and prediction methods, devices, and systems according to the embodiments of this specification have been described above with reference to FIGS. 1 to 9. The above model training devices and model prediction devices may be implemented by hardware, by software, or by a combination of hardware and software.
图10示出了根据本说明书实施例的用于实现经由多个数据拥有方训练图神经网络模型的电子设备1000的硬件结构图。如图10所示，电子设备1000可以包括至少一个处理器1010、存储器（例如，非易失性存储器）1020、内存1030和通信接口1040，并且至少一个处理器1010、存储器1020、内存1030和通信接口1040经由总线1060连接在一起。至少一个处理器1010执行在存储器中存储或编码的至少一个计算机可读指令（即，上述以软件形式实现的元素）。FIG. 10 shows a hardware structure diagram of an electronic device 1000 for training a graph neural network model via multiple data owners according to an embodiment of this specification. As shown in FIG. 10, the electronic device 1000 may include at least one processor 1010, a memory (for example, a non-volatile memory) 1020, an internal memory 1030, and a communication interface 1040, and the at least one processor 1010, the memory 1020, the internal memory 1030, and the communication interface 1040 are connected together via a bus 1060. The at least one processor 1010 executes at least one computer-readable instruction (that is, the above-mentioned elements implemented in the form of software) stored or encoded in the memory.
在一个实施例中，在存储器中存储计算机可执行指令，其当执行时使得至少一个处理器1010：执行下述循环过程，直到满足循环结束条件：将当前特征数据子集提供给数据拥有方处的当前图神经网络子模型，以得到当前图神经网络子模型的各个节点的特征向量表示；从服务端获取当前判别模型；将各个节点的特征向量表示提供给当前判别模型，以得到各个节点的当前预测标签值；根据各个节点的当前预测标签值以及对应的真实标签值，确定当前损失函数；在不满足循环结束条件时，基于当前损失函数，通过反向传播来确定当前判别模型的梯度信息和更新当前图神经网络子模型的模型参数；以及将当前判别模型的梯度信息提供给服务端，所述服务端使用来自于各个数据拥有方的当前判别模型的梯度信息来更新服务端处的判别模型，其中，在未满足循环结束条件时，更新后的各个数据拥有方的图神经网络子模型和服务端处的判别模型用作下一循环过程的当前模型。In one embodiment, computer-executable instructions are stored in the memory, and when executed, the instructions cause the at least one processor 1010 to: execute the following loop process until a loop end condition is satisfied: provide the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtain the current discriminant model from the server; provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determine the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determine the gradient information of the current discriminant model and update the model parameters of the current graph neural network sub-model through back propagation based on the current loss function; and provide the gradient information of the current discriminant model to the server, where the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, and when the loop end condition is not satisfied, the updated graph neural network sub-models of the respective data owners and the updated discriminant model at the server are used as the current models of the next loop process.
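A rough, self-contained sketch of the loop body these instructions describe is given below; it assumes, purely for illustration, a single tanh GCN-style layer as the graph neural network sub-model, a linear discriminant model, and a mean-squared-error loss, so that the back-propagation steps can be written out explicitly.

```python
import numpy as np

def local_round(A_hat, X, y, W, w, lr=0.1):
    """One data owner's pass through the loop body described above.

    A_hat: normalized adjacency matrix of the local graph; X: local feature data
    subset; y: true label values; W: local sub-model parameters (a NumPy array,
    updated in place); w: current discriminant model received from the server.
    Returns the gradient of the discriminant model and the current loss.
    """
    H = np.tanh(A_hat @ X @ W)              # feature vector representation of each node
    y_hat = H @ w                           # current predicted label value of each node
    err = y_hat - y
    loss = 0.5 * np.mean(err ** 2)          # current loss function
    grad_w = H.T @ err / len(y)             # gradient of the discriminant model (for the server)
    dH = np.outer(err, w) / len(y)          # back propagation into the local sub-model
    grad_W = (A_hat @ X).T @ (dH * (1.0 - H ** 2))
    W -= lr * grad_W                        # update local sub-model parameters in place
    return grad_w, loss
```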
应该理解,在存储器中存储的计算机可执行指令当执行时使得至少一个处理器1010进行本说明书的各个实施例中以上结合图1-9描述的各种操作和功能。It should be understood that the computer-executable instructions stored in the memory, when executed, cause at least one processor 1010 to perform the various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification.
图11示出了根据本说明书实施例的用于实现经由多个数据拥有方来训练图神经网络模型的电子设备1100的硬件结构图。如图11所示，电子设备1100可以包括至少一个处理器1110、存储器（例如，非易失性存储器）1120、内存1130和通信接口1140，并且至少一个处理器1110、存储器1120、内存1130和通信接口1140经由总线1160连接在一起。至少一个处理器1110执行在存储器中存储或编码的至少一个计算机可读指令（即，上述以软件形式实现的元素）。FIG. 11 shows a hardware structure diagram of an electronic device 1100 for training a graph neural network model via multiple data owners according to an embodiment of this specification. As shown in FIG. 11, the electronic device 1100 may include at least one processor 1110, a memory (for example, a non-volatile memory) 1120, an internal memory 1130, and a communication interface 1140, and the at least one processor 1110, the memory 1120, the internal memory 1130, and the communication interface 1140 are connected together via a bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (that is, the above-mentioned elements implemented in the form of software) stored or encoded in the memory.
在一个实施例中，在存储器中存储计算机可执行指令，其当执行时使得至少一个处理器1110：执行下述循环过程，直到满足循环结束条件：将当前判别模型提供给各个数据拥有方，各个数据拥有方将当前图神经网络子模型的各个节点的特征向量表示提供给当前判别模型以得到各个节点的预测标签值，基于各个节点的预测标签值以及对应的真实标签值确定各自的当前损失函数，以及在不满足循环结束条件时，基于各自的当前损失函数，通过反向传播确定判别模型的梯度信息以及更新当前图神经网络子模型的模型参数，并且将所确定的梯度信息提供给服务端，各个节点的特征向量表示通过将当前特征数据子集提供给当前图神经网络子模型而得到；在未满足循环结束条件时，从各个数据拥有方获取当前判别模型的对应梯度信息，并且基于来自各个数据拥有方的梯度信息更新当前判别模型，其中，在未满足所述循环结束条件时，更新后的各个数据拥有方的图神经网络子模型和服务端的判别模型用作下一循环过程的当前模型。In one embodiment, computer-executable instructions are stored in the memory, and when executed, the instructions cause the at least one processor 1110 to: execute the following loop process until a loop end condition is satisfied: provide the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of the current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its own current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model and updates the model parameters of the current graph neural network sub-model through back propagation based on its own current loss function and provides the determined gradient information to the server, and the feature vector representation of each node is obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not satisfied, acquire the corresponding gradient information of the current discriminant model from each data owner, and update the current discriminant model based on the gradient information from each data owner, where, when the loop end condition is not satisfied, the updated graph neural network sub-models of the respective data owners and the updated discriminant model at the server are used as the current models of the next loop process.
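Combining the sketches above, the overall loop driven by these server-side instructions might look as follows; in this toy version the owners' data sits in one process and only the discriminant-model gradients are passed to the server, whereas a real deployment would keep each owner's data local and could additionally protect the gradients with secure aggregation.

```python
def federated_training(server, owners, rounds=50):
    """Drive the loop from the server side, reusing ServerTrainer and local_round above.

    `owners` is assumed to be a list of (A_hat, X, y, W) tuples, one per data owner,
    where W is that owner's locally initialized sub-model parameter array.
    """
    for _ in range(rounds):
        w = server.provide_discriminant_model()                       # hand out current model
        gradients = [local_round(A_hat, X, y, W, w)[0]                # one gradient per owner
                     for (A_hat, X, y, W) in owners]
        server.update_discriminant_model(gradients)                   # aggregate and update
    return server
```

In this sketch the stopping rule is a fixed round count; the `should_stop` check shown earlier could be substituted for it.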
应该理解,在存储器中存储的计算机可执行指令当执行时使得至少一个处理器1110进行本说明书的各个实施例中以上结合图1-9描述的各种操作和功能。It should be understood that the computer-executable instructions stored in the memory, when executed, cause at least one processor 1110 to perform various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification.
图12示出了根据本说明书实施例的用于基于图神经网络模型进行模型预测的电子设备1200的硬件结构图。如图12所示，电子设备1200可以包括至少一个处理器1210、存储器（例如，非易失性存储器）1220、内存1230和通信接口1240，并且至少一个处理器1210、存储器1220、内存1230和通信接口1240经由总线1260连接在一起。至少一个处理器1210执行在存储器中存储或编码的至少一个计算机可读指令（即，上述以软件形式实现的元素）。FIG. 12 shows a hardware structure diagram of an electronic device 1200 for performing model prediction based on a graph neural network model according to an embodiment of this specification. As shown in FIG. 12, the electronic device 1200 may include at least one processor 1210, a memory (for example, a non-volatile memory) 1220, an internal memory 1230, and a communication interface 1240, and the at least one processor 1210, the memory 1220, the internal memory 1230, and the communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (that is, the above-mentioned elements implemented in the form of software) stored or encoded in the memory.
在一个实施例中，在存储器中存储计算机可执行指令，其当执行时使得至少一个处理器1210：将待预测数据提供给数据拥有方处的图神经网络子模型，以得到所述图神经网络子模型的各个节点的特征向量表示；从服务端获取判别模型；以及将各个节点的特征向量表示提供给判别模型，以得到各个节点的预测标签值。In one embodiment, computer-executable instructions are stored in the memory, and when executed, the instructions cause the at least one processor 1210 to: provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; obtain the discriminant model from the server; and provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
应该理解,在存储器中存储的计算机可执行指令当执行时使得至少一个处理器1210进行本说明书的各个实施例中以上结合图1-9描述的各种操作和功能。It should be understood that the computer-executable instructions stored in the memory, when executed, cause at least one processor 1210 to perform the various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification.
根据一个实施例，提供了一种比如机器可读介质（例如，非暂时性机器可读介质）的程序产品。机器可读介质可以具有指令（即，上述以软件形式实现的元素），该指令当被机器执行时，使得机器执行本说明书的各个实施例中以上结合图1-9描述的各种操作和功能。具体地，可以提供配有可读存储介质的系统或者装置，在该可读存储介质上存储着实现上述实施例中任一实施例的功能的软件程序代码，且使该系统或者装置的计算机或处理器读出并执行存储在该可读存储介质中的指令。According to one embodiment, a program product such as a machine-readable medium (for example, a non-transitory machine-readable medium) is provided. The machine-readable medium may have instructions (that is, the above-mentioned elements implemented in the form of software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification. Specifically, a system or device equipped with a readable storage medium may be provided, where software program code implementing the functions of any one of the above embodiments is stored on the readable storage medium, and a computer or a processor of the system or device reads out and executes the instructions stored in the readable storage medium.
在这种情况下，从可读介质读取的程序代码本身可实现上述实施例中任何一项实施例的功能，因此机器可读代码和存储机器可读代码的可读存储介质构成了本发明的一部分。In this case, the program code itself read from the readable medium can realize the functions of any one of the above embodiments, and therefore the machine-readable code and the readable storage medium storing the machine-readable code constitute a part of the present invention.
可读存储介质的实施例包括软盘、硬盘、磁光盘、光盘（如CD-ROM、CD-R、CD-RW、DVD-ROM、DVD-RAM、DVD-RW、DVD-RW）、磁带、非易失性存储卡和ROM。可选择地，可以由通信网络从服务器计算机上或云上下载程序代码。Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD-RW), magnetic tapes, non-volatile memory cards, and ROM. Alternatively, the program code can be downloaded from a server computer or the cloud via a communication network.
本领域技术人员应当理解，上面公开的各个实施例可以在不偏离发明实质的情况下做出各种变形和修改。因此，本发明的保护范围应当由所附的权利要求书来限定。Those skilled in the art should understand that various changes and modifications can be made to the embodiments disclosed above without departing from the essence of the invention. Therefore, the protection scope of the present invention should be defined by the appended claims.
需要说明的是，上述各流程和各系统结构图中不是所有的步骤和单元都是必须的，可以根据实际的需要忽略某些步骤或单元。各步骤的执行顺序不是固定的，可以根据需要进行确定。上述各实施例中描述的装置结构可以是物理结构，也可以是逻辑结构，即，有些单元可能由同一物理实体实现，或者，有些单元可能分别由多个物理实体实现，或者，可以由多个独立设备中的某些部件共同实现。It should be noted that not all steps and units in the above processes and system structure diagrams are necessary, and some steps or units can be omitted according to actual needs. The order in which the steps are executed is not fixed and can be determined as needed. The device structures described in the foregoing embodiments may be physical structures or logical structures; that is, some units may be implemented by the same physical entity, some units may be implemented separately by multiple physical entities, or some units may be implemented jointly by certain components in multiple independent devices.
以上各实施例中，硬件单元或模块可以通过机械方式或电气方式实现。例如，一个硬件单元、模块或处理器可以包括永久性专用的电路或逻辑（如专门的处理器，FPGA或ASIC）来完成相应操作。硬件单元或处理器还可以包括可编程逻辑或电路（如通用处理器或其它可编程处理器），可以由软件进行临时的设置以完成相应操作。具体的实现方式（机械方式、或专用的永久性电路、或者临时设置的电路）可以基于成本和时间上的考虑来确定。In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module, or processor may include permanently dedicated circuits or logic (such as a dedicated processor, an FPGA, or an ASIC) to complete the corresponding operations. A hardware unit or processor may also include programmable logic or circuits (such as a general-purpose processor or another programmable processor), which may be temporarily configured by software to complete the corresponding operations. The specific implementation (mechanical, dedicated permanent circuits, or temporarily configured circuits) can be determined based on cost and time considerations.
上面结合附图阐述的具体实施方式描述了示例性实施例，但并不表示可以实现的或者落入权利要求书的保护范围的所有实施例。在整个本说明书中使用的术语“示例性”意味着“用作示例、实例或例示”，并不意味着比其它实施例“优选”或“具有优势”。出于提供对所描述技术的理解的目的，具体实施方式包括具体细节。然而，可以在没有这些具体细节的情况下实施这些技术。在一些实例中，为了避免对所描述的实施例的概念造成难以理解，公知的结构和装置以框图形式示出。The specific implementations set forth above in conjunction with the drawings describe exemplary embodiments, but do not represent all embodiments that can be implemented or that fall within the protection scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration", and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, these techniques can be implemented without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
本公开内容的上述描述被提供来使得本领域任何普通技术人员能够实现或者使用本公开内容。对于本领域普通技术人员来说，对本公开内容进行的各种修改是显而易见的，并且，也可以在不脱离本公开内容的保护范围的情况下，将本文所定义的一般性原理应用于其它变型。因此，本公开内容并不限于本文所描述的示例和设计，而是与符合本文公开的原理和新颖性特征的最广范围相一致。The foregoing description of the present disclosure is provided to enable any person of ordinary skill in the art to implement or use the present disclosure. Various modifications to the present disclosure will be obvious to those of ordinary skill in the art, and the general principles defined herein may also be applied to other variations without departing from the protection scope of the present disclosure. Therefore, the present disclosure is not limited to the examples and designs described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (23)

  1. 一种用于经由多个数据拥有方来训练图神经网络模型的方法，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，每个数据拥有方具有通过对用于模型训练的训练样本集进行水平切分而获得的训练样本子集，所述训练样本子集包括特征数据子集以及真实标签值，所述方法由数据拥有方执行，所述方法包括：A method for training a graph neural network model via multiple data owners, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset includes a feature data subset and true label values, and the method is executed by a data owner, the method comprising:
    执行下述循环过程,直到满足循环结束条件:Perform the following loop process until the loop end condition is met:
    将当前特征数据子集提供给所述数据拥有方处的当前图神经网络子模型,以得到所述当前图神经网络子模型的各个节点的特征向量表示;Providing the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model;
    从服务端获取当前判别模型;Obtain the current discriminant model from the server;
    将各个节点的特征向量表示提供给所述当前判别模型,以得到各个节点的当前预测标签值;Providing the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node;
    根据各个节点的当前预测标签值以及对应的真实标签值,确定当前损失函数;Determine the current loss function according to the current predicted label value of each node and the corresponding real label value;
    在不满足循环结束条件时,When the loop end condition is not met,
    基于当前损失函数,确定所述当前判别模型的梯度信息并且更新当前图神经网络子模型的模型参数;以及Based on the current loss function, determine the gradient information of the current discriminant model and update the model parameters of the neural network sub-model of the current graph; and
    将所述当前判别模型的梯度信息提供给所述服务端，所述服务端使用来自于各个数据拥有方的所述当前判别模型的梯度信息来更新所述服务端处的判别模型，其中，在未满足所述循环结束条件时，所述更新后的各个数据拥有方的图神经网络子模型和所述服务端处的判别模型用作下一循环过程的当前模型。provide the gradient information of the current discriminant model to the server, where the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, and wherein, when the loop end condition is not met, the updated graph neural network sub-models of the respective data owners and the discriminant model at the server are used as the current models of the next loop process.
  2. 如权利要求1所述的方法,其中,各个数据拥有方处得到的梯度信息通过安全聚合的方式提供给所述服务端。The method of claim 1, wherein the gradient information obtained from each data owner is provided to the server in a secure aggregation manner.
  3. 如权利要求2所述的方法,其中,所述安全聚合包括:The method of claim 2, wherein the secure aggregation includes:
    基于秘密共享的安全聚合;Secure aggregation based on secret sharing;
    基于同态加密的安全聚合;或者Secure aggregation based on homomorphic encryption; or
    基于可信执行环境的安全聚合。Secure aggregation based on trusted execution environment.
  4. 如权利要求1所述的方法,其中,在每次循环过程中,所述方法还包括:The method of claim 1, wherein, during each cycle, the method further comprises:
    获取当前训练样本子集。Get the current training sample subset.
  5. 如权利要求1到4中任一所述的方法,其中,所述循环结束条件包括:The method according to any one of claims 1 to 4, wherein the loop end condition comprises:
    预定循环次数;Predetermined number of cycles;
    所述判别模型的各个模型参数的变化量不大于预定阈值;或者The variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or
    当前总损失函数位于预定范围内。The current total loss function is within a predetermined range.
  6. 如权利要求1到4中任一所述的方法，其中，所述特征数据包括基于图像数据、语音数据或文本数据的特征数据，或者所述特征数据包括用户特征数据。The method according to any one of claims 1 to 4, wherein the feature data includes feature data based on image data, voice data, or text data, or the feature data includes user feature data.
  7. 一种用于经由多个数据拥有方来训练图神经网络模型的方法，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，每个数据拥有方具有通过对用于模型训练的训练样本集进行水平切分而获得的训练样本子集，所述训练样本子集包括特征数据子集以及真实标签值，所述方法由服务端执行，所述方法包括：A method for training a graph neural network model via multiple data owners, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset includes a feature data subset and true label values, and the method is executed by the server, the method comprising:
    执行下述循环过程,直到满足循环结束条件:Perform the following loop process until the loop end condition is met:
    将当前判别模型提供给各个数据拥有方，各个数据拥有方将当前子图神经网络模型的各个节点的特征向量表示提供给所述当前判别模型以得到各个节点的预测标签值，基于各个节点的预测标签值以及对应的真实标签值确定各自的当前损失函数，以及在不满足循环结束条件时，各个数据拥有方基于各自的当前损失函数，确定判别模型的梯度信息以及更新当前图神经网络子模型的模型参数，并且将所确定的梯度信息提供给所述服务端，所述各个节点的特征向量表示通过将当前特征数据子集提供给所述当前图神经网络子模型而得到；providing the current discriminant model to each data owner, wherein each data owner provides the feature vector representation of each node of the current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its own current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not met, determines the gradient information of the discriminant model and updates the model parameters of the current graph neural network sub-model based on its own current loss function and provides the determined gradient information to the server, and wherein the feature vector representation of each node is obtained by providing the current feature data subset to the current graph neural network sub-model;
    在未满足所述循环结束条件时,从各个数据拥有方获取所述当前判别模型的对应梯度信息,并且基于来自各个数据拥有方的梯度信息更新所述当前判别模型,When the loop end condition is not met, obtain the corresponding gradient information of the current discriminant model from each data owner, and update the current discriminant model based on the gradient information from each data owner,
    其中,在未满足所述循环结束条件时,所述更新后的各个数据拥有方的图神经网络子模型和所述服务端的判别模型用作下一循环过程的当前模型。Wherein, when the loop ending condition is not met, the updated graph neural network sub-model of each data owner and the discriminant model of the server are used as the current model of the next loop process.
  8. 如权利要求7所述的方法,其中,各个数据拥有方处得到的梯度信息通过安全聚合的方式提供给所述服务端。The method according to claim 7, wherein the gradient information obtained from each data owner is provided to the server in a secure aggregation manner.
  9. 如权利要求8所述的方法,其中,所述安全聚合包括:The method of claim 8, wherein the secure aggregation comprises:
    基于秘密共享的安全聚合;Secure aggregation based on secret sharing;
    基于同态加密的安全聚合;或者Secure aggregation based on homomorphic encryption; or
    基于可信执行环境的安全聚合。Secure aggregation based on trusted execution environment.
  10. 一种用于使用图神经网络模型来进行模型预测的方法，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，所述方法由数据拥有方执行，所述方法包括：A method for performing model prediction using a graph neural network model, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, and the method is executed by a data owner, the method comprising:
    将待预测特征数据提供给所述数据拥有方处的图神经网络子模型,以得到所述图神经网络子模型的各个节点的特征向量表示;Providing the feature data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model;
    从服务端获取判别模型;以及Obtain the discriminant model from the server; and
    将各个节点的特征向量表示提供给所述判别模型,以得到各个节点的预测标签值。The feature vector representation of each node is provided to the discriminant model to obtain the predicted label value of each node.
  11. 一种用于经由多个数据拥有方来训练图神经网络模型的装置，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，每个数据拥有方具有通过对用于模型训练的训练样本集进行水平切分而获得的训练样本子集，所述训练样本子集包括特征数据子集以及真实标签值，所述装置应用于数据拥有方，所述装置包括：A device for training a graph neural network model via multiple data owners, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset includes a feature data subset and true label values, and the device is applied to a data owner, the device comprising:
    向量表示单元,将当前特征数据子集提供给当前图神经网络子模型,以得到所述当前图神经网络子模型的各个节点的特征向量表示;The vector representation unit provides the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model;
    判别模型获取单元,从服务端获取当前判别模型;The discriminant model acquisition unit, which obtains the current discriminant model from the server;
    模型预测单元,将各个节点的特征向量表示提供给所述当前判别模型,以得到各个节点的当前预测标签值;The model prediction unit provides the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node;
    损失函数确定单元,根据各个节点的当前预测标签值以及对应的真实标签值,确定当前损失函数;The loss function determining unit determines the current loss function according to the current predicted label value of each node and the corresponding real label value;
    梯度信息确定单元,在不满足循环结束条件时,基于当前损失函数,确定所述当前判别模型的梯度信息;A gradient information determining unit, when the loop end condition is not met, determine the gradient information of the current discriminant model based on the current loss function;
    模型更新单元,在不满足循环结束条件时,基于当前损失函数,更新当前图神经网络子模型的模型参数;以及The model update unit updates the model parameters of the neural network sub-model of the current graph based on the current loss function when the loop end condition is not met; and
    梯度信息提供单元，将所述当前判别模型的梯度信息提供给所述服务端，所述服务端使用来自各个数据拥有方的所述当前判别模型的梯度信息来更新所述服务端处的判别模型，The gradient information providing unit provides the gradient information of the current discriminant model to the server, and the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server,
    其中，所述向量表示单元、所述判别模型获取单元、所述模型预测单元、所述损失函数确定单元、所述梯度信息确定单元、所述模型更新单元和所述梯度信息提供单元循环操作，直到满足所述循环结束条件，在未满足所述循环结束条件时，所述更新后的各个数据拥有方的图神经网络子模型和所述服务端的判别模型用作下一循环过程的当前模型。wherein the vector representation unit, the discriminant model acquisition unit, the model prediction unit, the loss function determination unit, the gradient information determination unit, the model update unit, and the gradient information providing unit operate in a loop until the loop end condition is met, and when the loop end condition is not met, the updated graph neural network sub-models of the respective data owners and the discriminant model of the server are used as the current models of the next loop process.
  12. 如权利要求11所述的装置,其中,所述梯度信息提供单元使用安全聚合的方式来将所述数据拥有方处得到的梯度信息提供给所述服务端。The apparatus of claim 11, wherein the gradient information providing unit uses a secure aggregation method to provide the gradient information obtained from the data owner to the server.
  13. 如权利要求12所述的装置,其中,所述安全聚合包括:The apparatus of claim 12, wherein the secure aggregation comprises:
    基于秘密共享的安全聚合;Secure aggregation based on secret sharing;
    基于同态加密的安全聚合;或者Secure aggregation based on homomorphic encryption; or
    基于可信执行环境的安全聚合。Secure aggregation based on trusted execution environment.
  14. 如权利要求11所述的装置,还包括:The apparatus of claim 11, further comprising:
    训练样本子集获取单元,在每次循环操作时,获取当前训练样本子集。The training sample subset acquisition unit acquires the current training sample subset during each cycle operation.
  15. 一种用于经由多个数据拥有方来训练图神经网络模型的装置，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，每个数据拥有方具有通过对用于模型训练的训练样本集进行水平切分而获得的训练样本子集，所述训练样本子集包括特征数据子集以及真实标签值，所述装置应用于服务端，所述装置包括：A device for training a graph neural network model via multiple data owners, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset includes a feature data subset and true label values, and the device is applied to the server, the device comprising:
    判别模型提供单元，将当前判别模型提供给各个数据拥有方，各个数据拥有方将当前图神经网络子模型的各个节点的特征向量表示提供给所述当前判别模型来得到各个节点的预测标签值，基于各个节点的预测标签值以及对应的真实标签值来确定出各自的当前损失函数，以及在不满足循环结束条件时，各个数据拥有方基于各自的当前损失函数，确定判别模型的梯度信息以及更新当前图神经网络子模型的模型参数，并且将所确定的梯度信息提供给所述服务端，所述各个节点的特征向量表示通过将当前特征数据子集提供给所述当前图神经网络子模型而得到；The discriminant model providing unit provides the current discriminant model to each data owner, wherein each data owner provides the feature vector representation of each node of the current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its own current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not met, determines the gradient information of the discriminant model and updates the model parameters of the current graph neural network sub-model based on its own current loss function and provides the determined gradient information to the server, and wherein the feature vector representation of each node is obtained by providing the current feature data subset to the current graph neural network sub-model;
    梯度信息获取单元,在未满足循环结束条件时,从各个数据拥有方获取所述当前判别模型的对应梯度信息;以及The gradient information acquiring unit, when the loop end condition is not met, acquires the corresponding gradient information of the current discriminant model from each data owner; and
    判别模型更新单元,基于来自各个数据拥有方的梯度信息来更新所述当前判别模型,The discriminant model update unit updates the current discriminant model based on gradient information from each data owner,
    其中，所述判别模型提供单元、所述梯度信息获取单元和所述判别模型更新单元循环操作，直到满足所述循环结束条件，在未满足所述循环结束条件时，所述更新后的各个数据拥有方的图神经网络子模型和所述服务端的判别模型用作下一循环过程的当前模型。wherein the discriminant model providing unit, the gradient information acquiring unit, and the discriminant model updating unit operate in a loop until the loop end condition is met, and when the loop end condition is not met, the updated graph neural network sub-models of the respective data owners and the discriminant model of the server are used as the current models of the next loop process.
  16. 一种用于经由多个数据拥有方来训练图神经网络模型的系统,包括:A system for training graph neural network models through multiple data owners, including:
    多个数据拥有方设备,每个数据拥有方设备包括如权利要求11到14中任一所述的装置;以及A plurality of data owner devices, each data owner device comprising the device according to any one of claims 11 to 14; and
    服务端设备,包括如权利要求15所述的装置,Server equipment, including the device as claimed in claim 15,
    其中，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，每个数据拥有方具有通过对用于模型训练的训练样本集进行水平切分而获得的训练样本子集，所述训练样本子集包括特征数据子集以及真实标签值。wherein the graph neural network model includes a discriminant model located at the server and graph neural network sub-models located at respective data owners, and each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset including a feature data subset and true label values.
  17. 一种用于使用图神经网络模型来进行模型预测的装置，所述图神经网络模型包括位于服务端的判别模型以及位于各个数据拥有方处的图神经网络子模型，所述装置应用于数据拥有方，所述装置包括：A device for performing model prediction using a graph neural network model, wherein the graph neural network model includes a discriminant model located at a server and graph neural network sub-models located at respective data owners, and the device is applied to a data owner, the device comprising:
    向量表示单元,将待预测数据提供给所述数据拥有方处的图神经网络子模型,以得到所述图神经网络子模型的各个节点的特征向量表示;The vector representation unit provides the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model;
    判别模型获取单元,从服务端获取判别模型;以及The discriminant model acquisition unit obtains the discriminant model from the server; and
    模型预测单元,将各个节点的特征向量表示提供给所述判别模型,以得到各个节点的预测标签值。The model prediction unit provides the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  18. 一种电子设备,包括:An electronic device including:
    至少一个处理器,以及At least one processor, and
    与所述至少一个处理器耦合的存储器，所述存储器存储指令，当所述指令被所述至少一个处理器执行时，使得所述至少一个处理器执行如权利要求1到6中任一所述的方法。a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 1 to 6.
  19. 一种机器可读存储介质,其存储有可执行指令,所述指令当被执行时使得所述机器执行如权利要求1到6中任一所述的方法。A machine-readable storage medium storing executable instructions, which when executed, cause the machine to execute the method according to any one of claims 1 to 6.
  20. 一种电子设备,包括:An electronic device including:
    至少一个处理器,以及At least one processor, and
    与所述至少一个处理器耦合的存储器，所述存储器存储指令，当所述指令被所述至少一个处理器执行时，使得所述至少一个处理器执行如权利要求7到9中任一所述的方法。a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 7 to 9.
  21. 一种机器可读存储介质,其存储有可执行指令,所述指令当被执行时使得所述机器执行如权利要求7到9中任一所述的方法。A machine-readable storage medium storing executable instructions, which when executed, cause the machine to execute the method according to any one of claims 7 to 9.
  22. 一种电子设备,包括:An electronic device including:
    至少一个处理器,以及At least one processor, and
    与所述至少一个处理器耦合的存储器，所述存储器存储指令，当所述指令被所述至少一个处理器执行时，使得所述至少一个处理器执行如权利要求10所述的方法。a memory coupled to the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method according to claim 10.
  23. 一种机器可读存储介质,其存储有可执行指令,所述指令当被执行时使得所述机器执行如权利要求10所述的方法。A machine-readable storage medium storing executable instructions, which when executed, cause the machine to execute the method according to claim 10.
PCT/CN2020/132667 2020-02-17 2020-11-30 Graph neural network model training method, apparatus and system WO2021164365A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010096248.8 2020-02-17
CN202010096248.8A CN110929870B (en) 2020-02-17 2020-02-17 Method, device and system for training neural network model

Publications (1)

Publication Number Publication Date
WO2021164365A1 true WO2021164365A1 (en) 2021-08-26

Family

ID=69854815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132667 WO2021164365A1 (en) 2020-02-17 2020-11-30 Graph neural network model training method, apparatus and system

Country Status (2)

Country Link
CN (1) CN110929870B (en)
WO (1) WO2021164365A1 (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929870B (en) * 2020-02-17 2020-06-12 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model
CN111581648B (en) * 2020-04-06 2022-06-03 电子科技大学 Method of federal learning to preserve privacy in irregular users
CN111612126A (en) * 2020-04-18 2020-09-01 华为技术有限公司 Method and device for reinforcement learning
CN111665861A (en) * 2020-05-19 2020-09-15 中国农业大学 Trajectory tracking control method, apparatus, device and storage medium
CN111553470B (en) * 2020-07-10 2020-10-27 成都数联铭品科技有限公司 Information interaction system and method suitable for federal learning
CN111738438B (en) * 2020-07-17 2021-04-30 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model
CN111737474B (en) * 2020-07-17 2021-01-12 支付宝(杭州)信息技术有限公司 Method and device for training business model and determining text classification category
CN111783143B (en) * 2020-07-24 2023-05-09 支付宝(杭州)信息技术有限公司 Method, device and system for determining service model use of user data
CN112052942B (en) * 2020-09-18 2022-04-12 支付宝(杭州)信息技术有限公司 Neural network model training method, device and system
CN112131303A (en) * 2020-09-18 2020-12-25 天津大学 Large-scale data lineage method based on neural network model
CN112364819A (en) * 2020-11-27 2021-02-12 支付宝(杭州)信息技术有限公司 Method and device for joint training and recognition of model
CN112766500B (en) * 2021-02-07 2022-05-17 支付宝(杭州)信息技术有限公司 Method and device for training graph neural network
CN113052333A (en) * 2021-04-02 2021-06-29 中国科学院计算技术研究所 Method and system for data analysis based on federal learning
CN113254996B (en) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113221153B (en) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113222143B (en) * 2021-05-31 2023-08-01 平安科技(深圳)有限公司 Method, system, computer equipment and storage medium for training graphic neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684855A (en) * 2018-12-17 2019-04-26 电子科技大学 A kind of combined depth learning training method based on secret protection technology
US20190286972A1 (en) * 2018-03-14 2019-09-19 Microsoft Technology Licensing, Llc Hardware accelerated neural network subgraphs
CN110782044A (en) * 2019-10-29 2020-02-11 支付宝(杭州)信息技术有限公司 Method and device for multi-party joint training of neural network of graph
CN110929870A (en) * 2020-02-17 2020-03-27 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4202782A1 (en) * 2015-11-09 2023-06-28 Google LLC Training neural networks represented as computational graphs
CN110751275B (en) * 2019-08-03 2022-09-02 北京达佳互联信息技术有限公司 Graph training system, data access method and device, electronic device and storage medium
CN110751269B (en) * 2019-10-18 2022-08-05 网易(杭州)网络有限公司 Graph neural network training method, client device and system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113849665A (en) * 2021-09-02 2021-12-28 中科创达软件股份有限公司 Multimedia data identification method, device, equipment and storage medium
CN113571133A (en) * 2021-09-14 2021-10-29 内蒙古农业大学 Lactic acid bacteria antibacterial peptide prediction method based on graph neural network
CN113571133B (en) * 2021-09-14 2022-06-17 内蒙古农业大学 Lactic acid bacteria antibacterial peptide prediction method based on graph neural network
CN113771289A (en) * 2021-09-16 2021-12-10 健大电业制品(昆山)有限公司 Method and system for optimizing injection molding process parameters
CN113771289B (en) * 2021-09-16 2022-06-24 健大电业制品(昆山)有限公司 Method and system for optimizing injection molding process parameters
CN114117926A (en) * 2021-12-01 2022-03-01 南京富尔登科技发展有限公司 Robot cooperative control algorithm based on federal learning
CN114117926B (en) * 2021-12-01 2024-05-14 南京富尔登科技发展有限公司 Robot cooperative control algorithm based on federal learning
CN114819139A (en) * 2022-03-28 2022-07-29 支付宝(杭州)信息技术有限公司 Pre-training method and device for graph neural network
CN114819182A (en) * 2022-04-15 2022-07-29 支付宝(杭州)信息技术有限公司 Method, apparatus and system for training a model via multiple data owners

Also Published As

Publication number Publication date
CN110929870B (en) 2020-06-12
CN110929870A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
WO2021164365A1 (en) Graph neural network model training method, apparatus and system
WO2021103901A1 (en) Multi-party security calculation-based neural network model training and prediction methods and device
WO2020156004A1 (en) Model training method, apparatus and system
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN110782044A (en) Method and device for multi-party joint training of neural network of graph
US11715044B2 (en) Methods and systems for horizontal federated learning using non-IID data
US11341411B2 (en) Method, apparatus, and system for training neural network model
CN111738438B (en) Method, device and system for training neural network model
CN111523556B (en) Model training method, device and system
CN111368983A (en) Business model training method and device and business model training system
CN111523134B (en) Homomorphic encryption-based model training method, device and system
CN110929887B (en) Logistic regression model training method, device and system
CN111523674B (en) Model training method, device and system
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
Lei et al. Federated learning over coupled graphs
CN110175283B (en) Recommendation model generation method and device
CN111523675B (en) Model training method, device and system
CN112183759B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting
Xu et al. FedG2L: a privacy-preserving federated learning scheme base on “G2L” against poisoning attack
CN112183566B (en) Model training method, device and system
CN112183564B (en) Model training method, device and system
US20230084507A1 (en) Servers, methods and systems for fair and secure vertical federated learning
Razeghi et al. Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition
Azogagh et al. Crypto'Graph: Leveraging Privacy-Preserving Distributed Link Prediction for Robust Graph Learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920246

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920246

Country of ref document: EP

Kind code of ref document: A1