WO2021164365A1 - Graph neural network model training method, apparatus, and system - Google Patents

Graph neural network model training method, apparatus, and system - Download PDF

Info

Publication number
WO2021164365A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
current
neural network
data
graph neural
Prior art date
Application number
PCT/CN2020/132667
Other languages
English (en)
Chinese (zh)
Inventor
陈超超
王力
周俊
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021164365A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the embodiments of this specification relate generally to the field of machine learning, and more particularly to methods, devices, and systems for collaboratively training graph neural network models via multiple data owners using horizontally segmented feature data sets.
  • the graph neural network model is a widely used machine learning model.
  • In many scenarios, multiple model training participants (for example, e-commerce companies, express delivery companies, and banks) each own part of the data used for graph neural network model training. The multiple model training participants usually want to use each other's data together to train the graph neural network model, but they do not want to provide their own data to the other model training participants, so as to prevent their own data from being leaked.
  • In view of this, a graph neural network model training method that can protect the security of private data is proposed, which enables the multiple model training participants to collaboratively train a graph neural network model while ensuring the security of the respective data of the multiple model training participants, and the trained graph neural network model can then be used by each of those participants.
  • the embodiments of this specification provide a method, device, and system for collaboratively training a graph neural network model via multiple data owners, which can accomplish graph neural network model training while ensuring the security of the respective data of the multiple data owners.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training.
  • the training sample subset includes a feature data subset and a true label value
  • the method is executed by the data owner, and the method includes: executing the following loop process until a loop end condition is satisfied: providing the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtaining the current discriminant model from the server; providing the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determining the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determining the gradient information of the current discriminant model and updating the model parameters of the current graph neural network sub-model based on the current loss function; and providing the gradient information of the current discriminant model to the server, where the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, and where, when the loop end condition is not satisfied, the updated graph neural network sub-model of the data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • the gradient information obtained from each data owner may be provided to the server in a secure aggregation manner.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • the method may further include: obtaining a current training sample subset.
  • the loop end condition may include: a predetermined number of loops is reached; the variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
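  • As a minimal illustration of these alternative loop end conditions, the sketch below shows how a training driver might test them; the parameter names (max_loops, param_delta_threshold, loss_range) are hypothetical and not part of the specification.

```python
def loop_should_end(loop_index, max_loops=None,
                    param_deltas=None, param_delta_threshold=None,
                    total_loss=None, loss_range=None):
    """Return True when any configured loop end condition is met.

    Only one of the three conditions described above needs to be
    configured; the others may be left as None.
    """
    # Condition 1: a predetermined number of loops has been reached.
    if max_loops is not None and loop_index + 1 >= max_loops:
        return True
    # Condition 2: no parameter of the discriminant model changed by
    # more than a predetermined threshold in the last update.
    if param_deltas is not None and param_delta_threshold is not None:
        if all(abs(d) <= param_delta_threshold for d in param_deltas):
            return True
    # Condition 3: the current total loss is within a predetermined range.
    if total_loss is not None and loss_range is not None:
        low, high = loss_range
        if low <= total_loss <= high:
            return True
    return False
```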
  • the characteristic data may include characteristic data based on image data, voice data, or text data, or the characteristic data may include user characteristic data.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner. Each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the method is executed by the server, and the method includes: executing the following loop process until a loop end condition is satisfied: providing the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model based on its current loss function and updates the model parameters of its current graph neural network sub-model, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not satisfied, obtaining the corresponding gradient information of the current discriminant model from each data owner, and updating the current discriminant model based on the gradient information from each data owner, where, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • the gradient information obtained from each data owner may be provided to the server in a secure aggregation manner.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • a method for model prediction using a graph neural network model is provided, the graph neural network model including a discriminant model located on the server side and a graph neural network sub-model located at each data owner. The method is executed by the data owner, and the method includes: providing the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; obtaining the discriminant model from the server; and providing the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • an apparatus for training a graph neural network model via a plurality of data owners is provided, the graph neural network model including a discriminant model located on the server side and a graph neural network sub-model located at each data owner. Each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the apparatus is applied to the data owner, and the apparatus includes: a vector representation unit, which provides the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model; a discriminant model acquisition unit, which acquires the current discriminant model from the server; a model prediction unit, which provides the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; a loss function determination unit, which determines the current loss function according to the current predicted label value of each node and the corresponding true label value; a gradient information determination unit, which determines the gradient information of the current discriminant model based on the current loss function; a model update unit, which updates the model parameters of the current graph neural network sub-model based on the current loss function; and a gradient information providing unit, which provides the gradient information of the current discriminant model to the server.
  • the gradient information providing unit may use a secure aggregation method to provide the gradient information obtained from the data owner to the server.
  • the secure aggregation may include: secure aggregation based on secret sharing; secure aggregation based on homomorphic encryption; or secure aggregation based on a trusted execution environment.
  • the device may further include: a training sample subset acquiring unit, which acquires a current training sample subset during each cycle operation.
  • an apparatus for training a graph neural network model via a plurality of data owners is provided, the graph neural network model including a discriminant model located on the server side and a graph neural network sub-model located at each data owner. Each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, and the training sample subset includes a feature data subset and true label values.
  • the apparatus is applied to the server, and the apparatus includes: a discriminant model providing unit, which provides the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model based on its current loss function and updates the model parameters of its current graph neural network sub-model, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; a gradient information acquisition unit, which, when the loop end condition is not satisfied, acquires the corresponding gradient information of the current discriminant model from each data owner; and a discriminant model update unit, which updates the current discriminant model based on the gradient information from each data owner, where the discriminant model providing unit, the gradient information acquisition unit, and the discriminant model update unit operate in a loop until the loop end condition is satisfied, and when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • a system for training a graph neural network model via a plurality of data owners is provided, including: a plurality of data owner devices, each data owner device including the apparatus on the data owner side as described above; and a server device, including the apparatus on the server side as described above, where the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner, and each data owner has a training sample subset obtained by horizontally splitting a training sample set used for model training, the training sample subset including a feature data subset and true label values.
  • an apparatus for performing model prediction using a graph neural network model is provided, the graph neural network model including a discriminant model located on the server side and a graph neural network sub-model located at each data owner. The apparatus is applied to the data owner, and the apparatus includes: a vector representation unit, which provides the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model; a discriminant model acquisition unit, which obtains the discriminant model from the server; and a model prediction unit, which provides the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • an electronic device is provided, including: at least one processor, and a memory coupled with the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the data owner side as described above.
  • a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the data owner side as described above.
  • an electronic device is provided, including: at least one processor, and a memory coupled with the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model training method executed on the server side as described above.
  • a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause at least one processor to execute the model training method executed on the server side as described above.
  • an electronic device is provided, including: at least one processor, and a memory coupled with the at least one processor, where the memory stores instructions that, when executed by the at least one processor, cause the at least one processor to execute the model prediction method described above.
  • a machine-readable storage medium which stores executable instructions that, when executed, cause the at least one processor to execute the above-mentioned model prediction method.
  • the model parameters of the graph neural network model can be obtained by training without leaking the private data of the multiple training participants.
  • Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of the present specification
  • Fig. 2 shows a schematic diagram of an example of a horizontally segmented training sample set according to an embodiment of the present specification
  • FIG. 3 shows a schematic diagram showing the architecture of a system for training graph neural network models via multiple data owners according to an embodiment of the present specification
  • Fig. 4 shows a flowchart of a method for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • Fig. 6 shows a flowchart of a model prediction process based on a graph neural network model according to an embodiment of the present specification
  • Fig. 7 shows a block diagram of an apparatus for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 8 shows a block diagram of an apparatus for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model according to an embodiment of the present specification
  • FIG. 10 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of the present specification
  • FIG. 11 shows a schematic diagram of an electronic device for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • Fig. 12 shows a schematic diagram of an electronic device for model prediction based on a graph neural network model according to an embodiment of the present specification.
  • the term “including” and its variations mean open terms, meaning “including but not limited to”.
  • the term “based on” means “based at least in part on.”
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • the terms “first”, “second”, etc. may refer to different or the same objects. Other definitions can be included below, whether explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.
  • the training sample set used in the graph neural network model training scheme is a training sample set that has been horizontally segmented.
  • horizontal segmentation of the training sample set refers to dividing the training sample set into multiple training sample subsets according to modules/functions (or certain specified rules), and each training sample subset contains a part of training samples.
  • the training samples included in each training sample subset are complete training samples, that is, all field data and corresponding label values of the training sample are included.
  • For example, suppose there are three data owners Alice, Bob, and Charlie. Local samples are collected at each data owner to form a local sample set, and each sample contained in a local sample set is a complete sample. The local sample sets obtained by the three data owners Alice, Bob, and Charlie together form the training sample set for graph neural network model training, where each local sample set is used as one training sample subset of the training sample set and is used to train the graph neural network model.
  • each data owner owns a different portion of the training samples used in the training of the graph neural network model. For example, taking two data owners as an example, suppose the training sample set includes 100 training samples, each of which contains multiple feature values and an actual label value; then the data owned by the first data owner may be the first 30 training samples in the training sample set, and the data owned by the second data owner may be the last 70 training samples in the training sample set (a minimal sketch of such a split is given below).
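  • The sketch below illustrates this horizontal split with hypothetical toy data (the array shapes, owner names, and 30/70 split follow the example above and are not part of the specification); note that each owner ends up with complete samples, i.e. all feature fields plus the label.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy training sample set: 100 complete samples, each with 5 feature
# values and one true label value.
features = rng.normal(size=(100, 5))
labels = rng.integers(0, 2, size=100)

# Horizontal split: owner 1 holds the first 30 complete samples,
# owner 2 holds the remaining 70 complete samples.
owner1 = {"X": features[:30], "y": labels[:30]}
owner2 = {"X": features[30:], "y": labels[30:]}

assert owner1["X"].shape[1] == owner2["X"].shape[1]  # same full feature fields
print(len(owner1["y"]), len(owner2["y"]))            # 30 70
```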
  • the feature data used in the training of the graph neural network model may include feature data based on image data, voice data, or text data.
  • the graph neural network model can be applied to business risk identification, business classification or business decision-making based on image data, voice data or text data.
  • the feature data used in the training of the graph neural network model may include user feature data.
  • the graph neural network model can be applied to business risk identification, business classification, business recommendation or business decision based on user characteristic data.
  • the data to be predicted used by the graph neural network model may include image data, voice data, or text data.
  • the data to be predicted used by the graph neural network model may include user characteristic data.
  • the terms “graph neural network model” and “graph neural network” can be used interchangeably.
  • the terms “graph neural network sub-model” and “graph neural sub-network” can be used interchangeably.
  • the terms “data owner” and “training participant” can be used interchangeably.
  • Fig. 1 shows a schematic diagram of an example of a graph neural network model according to an embodiment of the present specification.
  • the graph neural network (GNN, Graph Neural Network) model is divided into a discriminant model 10 and multiple graph neural network sub-models 20.
  • the discriminant model 10 is set on the server 110, and each graph neural network sub-model is set at the corresponding data owner, for example, on the client of the corresponding data owner, so that each data owner has one graph neural network sub-model. As shown in FIG. 1, GNN A is set at the data owner A 120-1, GNN B is set at the data owner B 120-2, and GNN C is set at the data owner C 120-3.
  • the graph neural network sub-model 20 is used to perform GNN calculation on the data of the data owner to obtain the feature vector representation of each node of the graph neural network sub-model. Specifically, when performing GNN calculation, the data of the data owner is provided to the graph neural network sub-model 20, and, according to the node features and the graph neural sub-network, the feature vector representation of each node corresponding to the current data is obtained through propagation over K-order neighbors (a minimal sketch of this propagation is given below).
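  • A minimal sketch of this K-order neighbor propagation is shown below, assuming a simple mean-aggregation GNN layer; the adjacency matrix, weight shapes, and two-layer depth are illustrative assumptions, not the specific GNN architecture of the embodiment.

```python
import numpy as np

def gnn_forward(adj, node_feats, weights):
    """Propagate node features over K-order neighbors.

    adj:        (N, N) adjacency matrix of the data owner's local graph
    node_feats: (N, F) initial node feature matrix
    weights:    list of K weight matrices, one per propagation layer
    Returns the (N, D) feature vector representation of each node.
    """
    # Add self-loops and row-normalize so each layer averages over neighbors.
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)

    h = node_feats
    for w in weights:                 # one iteration per neighbor order (hop)
        h = np.tanh(a_hat @ h @ w)    # aggregate neighbors, transform, nonlinearity
    return h

# Toy local graph with 4 nodes, 3 features, K = 2 propagation layers.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 3))
weights = [rng.normal(size=(3, 8)), rng.normal(size=(8, 8))]

node_repr = gnn_forward(adj, x, weights)   # feature vector representation per node
print(node_repr.shape)                     # (4, 8)
```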
  • the discriminant model 10 is used to perform model calculation based on the feature vector representation of each node obtained from the data owner to obtain the model prediction value of each node.
  • Fig. 2 shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification.
  • a training sample in the training sample subset owned by each data party Alice and Bob is complete, that is, each training sample includes complete feature data (x) and labeled data (y).
  • Alice has a complete training sample (x0, y0).
  • FIG. 3 shows a schematic diagram illustrating the architecture of a system for training graph neural network models via multiple data owners (hereinafter referred to as "model training system 300") according to an embodiment of the present specification.
  • the model training system 300 includes a server device 310 and at least one data owner device 320.
  • Three data owner devices 320 are shown in FIG. 3. In other embodiments of this specification, more or fewer data owner devices 320 may be included.
  • the server device 310 and the at least one data owner device 320 may communicate with each other via a network 330 such as but not limited to the Internet or a local area network.
  • the graph neural network model to be trained (the neural network model structure with the discriminant model removed) is divided into a first number of graph neural network sub-models.
  • the first number is equal to the number of data owner devices participating in model training.
  • the graph neural network model is decomposed into N sub-models, and each data owner device has one sub-model.
  • the feature data set used for model training is located at each data owner device 320.
  • the feature data set is horizontally divided into multiple feature data subsets in the manner described in FIG. 2, and each data owner device has one Feature data subset.
  • the sub-models and corresponding feature data subsets owned by each data owner are the secrets of the data owner and cannot be learned or fully learned by other data owners.
  • multiple data owner devices 320 and server devices 310 use the training sample subsets of each data owner device 320 to collaboratively train the graph neural network model.
  • the specific training process of the model will be described in detail with reference to FIGS. 4 to 5 below.
  • the server device 310 and the data owner device 320 may be any suitable electronic devices with computing capabilities.
  • the electronic devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile electronic devices, smart phones, tablet computers, cellular phones, personal digital assistants (PDAs), handheld devices, messaging devices, wearable electronic devices, consumer electronic devices, and so on.
  • FIG. 4 shows a flowchart of a method 400 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • the graph neural network sub-model at each data owner and the discriminant model of the server are initialized. For example, initialize the graph neural network sub-models GNN A , GNN B and GNN C at the data owners A, B, and C, and initialize the discriminant model 10 on the server side.
  • each data owner device 320 obtains its current training sample subset. For example, data owner A obtains the current training sample subset S A , data owner B obtains the current training sample subset S B , and data owner C obtains the current training sample subset S C .
  • Each subset of training samples includes a subset of feature data and true label values.
  • the obtained current training sample subset is provided to the respective graph neural network sub-model for GNN calculation to obtain the feature vector representation of each node in the graph neural network sub-model.
  • For example, at each data owner, the current training sample subset is provided to the graph neural network sub-model 20, thereby obtaining the feature vector representation of each node corresponding to the current training sample subset.
  • each data owner device 320 obtains the current discriminant model from the server 310. Subsequently, at 405, at each data owner device 320, the current discriminant model is used to perform model prediction based on the feature vector representation of each node to obtain the current predicted label value of each node.
  • the current loss function is determined according to the current predicted label value of each node and the corresponding real label value.
  • For example, the loss function can be calculated by accumulating a per-node loss over all nodes, e.g. L = Σ_{i=1..P} loss(t_i, O_i), where i represents the i-th node, P represents the total number of nodes in the graph neural network sub-model, t_i represents the true label value of the i-th node, and O_i represents the current predicted label value of the i-th node.
  • each data owner device 320 determines the gradient information of the received current discriminant model, that is, the gradient information with respect to the model parameters of the current discriminant model, for example, through back propagation based on the current loss function.
  • In addition, based on the current loss function, the model parameters of the current graph neural network sub-model are updated, for example, through back propagation.
  • each data owner device 320 respectively provides the gradient information of the current discriminant model determined by each to the server 310.
  • each data owner device 320 may respectively send the gradient information (for example, as it is) of the current discriminant model determined by each to the server 310, and then the server 310 aggregates the received gradient information.
  • Alternatively, each data owner may provide its gradient information to the server 310 in a secure aggregation manner.
  • the security aggregation may include: security aggregation based on secret sharing; security aggregation based on homomorphic encryption; or security aggregation based on a trusted execution environment.
  • other suitable secure aggregation methods can also be used.
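  • As one example of the secret-sharing flavor of secure aggregation mentioned above, the sketch below uses pairwise random masks that cancel out when the server sums the submissions, so the server only learns the aggregate gradient. This is a simplified illustration under stated assumptions, not the full protocol of the embodiment; key agreement, dropout handling, and modular arithmetic are omitted.

```python
import numpy as np

def mask_gradients(grads_per_owner, seed=0):
    """Add pairwise cancelling masks to each owner's gradient vector.

    grads_per_owner: list of equally shaped numpy arrays, one per data owner.
    Returns the masked vectors that would actually be sent to the server.
    """
    rng = np.random.default_rng(seed)
    n = len(grads_per_owner)
    masked = [g.astype(float).copy() for g in grads_per_owner]
    for i in range(n):
        for j in range(i + 1, n):
            # Owners i and j agree on a shared random mask; i adds it, j subtracts it.
            mask = rng.normal(size=grads_per_owner[0].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

grads = [np.array([1.0, 2.0]), np.array([0.5, -1.0]), np.array([2.5, 0.0])]
masked = mask_gradients(grads)

# Each masked vector looks random to the server, but the masks sum to zero,
# so the aggregate equals the true sum of the owners' gradients.
print(np.allclose(sum(masked), sum(grads)))   # True
```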
  • aggregating the received gradient information may include averaging the received gradient information.
  • the aggregated gradient information is used to update the discriminant model at the server 310 for subsequent training cycles or as a trained discriminant model.
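  • A minimal sketch of this server-side step is given below, assuming the discriminant model is a flat parameter vector, the aggregation is a simple average of the received gradients, and a plain gradient-descent update with a hypothetical learning rate; all of these are illustrative assumptions.

```python
import numpy as np

def server_update(discriminant_params, owner_gradients, learning_rate=0.1):
    """Aggregate the gradients received from the data owners and update
    the discriminant model kept at the server."""
    aggregated = np.mean(owner_gradients, axis=0)          # average the received gradients
    return discriminant_params - learning_rate * aggregated

params = np.zeros(4)
owner_gradients = [np.array([0.2, -0.1, 0.0, 0.3]),
                   np.array([0.4,  0.1, 0.2, 0.1]),
                   np.array([0.0, -0.3, 0.1, 0.2])]
params = server_update(params, owner_gradients)
print(params)   # updated discriminant model parameters
```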
  • Next, it is determined whether the loop end condition is satisfied, that is, whether the predetermined number of loops is reached. If the predetermined number of loops is reached, the process ends. If the predetermined number of loops has not been reached, the flow returns to the operation of 402 to execute the next training loop process.
  • the graph neural network sub-model of each data owner updated in the current cycle process and the discriminant model of the server end are used as the current model of the next training cycle process.
  • the end condition of the training cycle process refers to reaching the predetermined number of cycles.
  • the end condition of the training loop process may also be that the variation of each model parameter of the discrimination model 10 is not greater than a predetermined threshold.
  • the judgment process as to whether the loop process is over is executed in the server 310.
  • the end condition of the training loop process may also be that the current total loss function is within a predetermined range, for example, the current total loss function is not greater than a predetermined threshold.
  • the process of determining whether the loop process is over is executed in the server 310.
  • each data owner device 320 needs to provide its own loss function to the server 310 for aggregation, so as to obtain the total loss function.
  • each data owner device 320 can provide their own loss function to the server 310 by means of secure aggregation.
  • the security aggregation for the loss function may also include: the security aggregation based on secret sharing; the security aggregation based on homomorphic encryption; or the security aggregation based on the trusted execution environment.
  • FIG. 5 shows a schematic diagram of an example process for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • Figure 5 shows three data owners A, B, and C.
  • the data owners A, B, and C obtain their respective current feature data subsets X A , X B, and X C.
  • the data owners A, B, and C respectively provide the current feature data subsets X A , X B , and X C to their current graph neural network sub-models G A , G B , and G C to obtain the current feature vector representation of each node in each current graph neural network sub-model.
  • each data owner obtains the current discriminant model H from the server 110. Then, each data owner provides the obtained current feature vector representation of each node to the current discriminant model H to obtain the current predicted label value of each node. Subsequently, at each data owner, the current loss function is determined based on the current predicted label value of each node and the corresponding true label value, and, based on the current loss function, the gradient information G H of the model parameters of the current discriminant model is determined through back propagation. At the same time, at each data owner, based on the current loss function, the model parameters of each layer of the current graph neural network sub-model are updated through back propagation.
  • the respective gradient information is provided to the server by means of secure aggregation.
  • the server updates the current discriminant model based on the obtained aggregated gradient information.
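  • To make the per-iteration flow at one data owner concrete, here is a minimal numpy sketch under simplifying assumptions: a one-layer linear GNN sub-model, a linear discriminant model, a squared-error loss, and a hypothetical learning rate. The embodiment is not restricted to these choices; the sketch only mirrors the order of the steps described above.

```python
import numpy as np

def owner_training_step(adj, feats, labels, gnn_weights, discriminant, lr=0.05):
    """One training iteration at a single data owner (cf. the Fig. 5 flow).

    Returns (gradient of the discriminant model to send to the server,
             locally updated GNN sub-model weights, current loss).
    """
    # 1. Forward pass through the local GNN sub-model: node representations Z.
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    msg = a_hat @ feats                      # neighbor aggregation
    z = msg @ gnn_weights                    # feature vector representation per node

    # 2. Apply the current discriminant model obtained from the server.
    preds = z @ discriminant                 # current predicted label value per node

    # 3. Current loss from predicted and true label values.
    err = preds - labels
    loss = float(np.sum(err ** 2))

    # 4. Back propagation: gradient w.r.t. the discriminant model (sent to the
    #    server) and gradient w.r.t. the local GNN sub-model (applied locally).
    grad_discriminant = z.T @ (2 * err)
    grad_gnn = msg.T @ np.outer(2 * err, discriminant)
    gnn_weights = gnn_weights - lr * grad_gnn    # local update of the sub-model

    return grad_discriminant, gnn_weights, loss

# Toy data for one owner: 4 nodes, 3 features, 8-dimensional representations.
rng = np.random.default_rng(1)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 3))
labels = rng.normal(size=4)
gnn_w = rng.normal(size=(3, 8)) * 0.1
disc = rng.normal(size=8) * 0.1

g_h, gnn_w, loss = owner_training_step(adj, feats, labels, gnn_w, disc)
print(round(loss, 4), g_h.shape)    # per-iteration loss and gradient to aggregate
```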
  • Model training schemes with three data owners are shown in Fig. 3 to Fig. 5. In other examples of the embodiments of this specification, more or fewer than three data owner parties may also be included.
  • If the GNN model is constructed based only on the data of a single data owner, the effect of the GNN model is limited.
  • By using the model training solution provided by the embodiments of this specification, the GNN model can be trained jointly while protecting the data privacy of each data owner, thereby improving the effect of the GNN model.
  • In a scheme where the model structure of all data owners must be consistent so that the server can securely aggregate the model gradient information of each data owner to update the model, different models cannot be customized for different clients.
  • However, the sparsity of the data (features and graph relationships) of different data owners differs, so different GNN models may be needed for learning. For example, the node feature vector representation obtained when data owner A propagates over 2-order neighbors may be optimal, while the node feature vector representation obtained when data owner B propagates over 5-order neighbors may be optimal.
  • In the solution of the embodiments of this specification, the GNN model used to obtain the feature vector representation of each node is arranged at each data owner for local self-learning, and the discriminant model is placed on the server as a global model that is learned jointly through multiple data owners, which can improve the effect of the discriminant model.
  • In addition, each data owner provides the gradient information of its current discriminant model to the server through secure aggregation, which prevents the gradient information of any single data owner from being provided to the server in full, so that the server cannot use the received gradient information to derive the private data of that data owner, thereby realizing private data protection for the data owners.
  • FIG. 6 shows a flowchart of a model prediction process 600 based on a graph neural network model according to an embodiment of the present specification.
  • the graph neural network model used in the model prediction process shown in FIG. 6 is a graph neural network model trained according to the process shown in FIG. 4.
  • the data to be predicted is provided to the graph neural network sub-model of the data owner to obtain the feature vector representation of each node of the graph neural network sub-model.
  • the discriminant model is obtained from the server.
  • the feature vector representation of each node is provided to the received discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
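  • A corresponding prediction-time sketch, under the same simplifying assumptions as the training sketch above (linear sub-model and discriminant model, hypothetical shapes), looks like this:

```python
import numpy as np

def owner_predict(adj, feats, gnn_weights, discriminant):
    """Model prediction at a data owner using the trained graph neural network
    sub-model and the discriminant model obtained from the server."""
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    z = (a_hat @ feats) @ gnn_weights     # feature vector representation of each node
    return z @ discriminant               # predicted label value of each node

rng = np.random.default_rng(2)
adj = np.eye(3)                           # trivial 3-node graph for illustration
feats = rng.normal(size=(3, 3))
gnn_weights = rng.normal(size=(3, 8))
discriminant = rng.normal(size=8)
print(owner_predict(adj, feats, gnn_weights, discriminant))  # one prediction per node
```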
  • FIG. 7 shows a schematic diagram of an apparatus (hereinafter referred to as a model training apparatus) 700 for training a graph neural network model via a plurality of data owners according to an embodiment of the present specification.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • Each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, where the training sample subset includes a feature data subset and true label values.
  • the model training device 700 is located on the side of the data owner.
  • the model training device 700 includes a vector representation unit 710, a discriminant model acquisition unit 720, a model prediction unit 730, a loss function determination unit 740, a gradient information determination unit 750, a model update unit 760, and a gradient information providing unit 770.
  • the vector representation unit 710, the discriminant model acquisition unit 720, the model prediction unit 730, the loss function determination unit 740, the gradient information determination unit 750, the model update unit 760, and the gradient information providing unit 770 operate cyclically until the cycle is satisfied. End condition.
  • the loop end condition may include, for example: a predetermined number of loops is reached; the variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
  • the vector representation unit 710 is configured to provide the current feature data subset to the current graph neural network sub-model to obtain the feature vector representation of each node of the current graph neural network sub-model.
  • the operation of the vector representation unit 710 may refer to the operation of 403 described above with reference to FIG. 4.
  • the discriminant model obtaining unit 720 is configured to obtain the current discriminant model from the server.
  • the operation of the discriminant model acquisition unit 720 may refer to the operation of 404 described above with reference to FIG. 4.
  • the model prediction unit 730 is configured to provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node.
  • the operation of the model prediction unit 730 may refer to the operation of 405 described above with reference to FIG. 4.
  • the loss function determining unit 740 is configured to determine the current loss function according to the current predicted label value of each node and the corresponding real label value.
  • the operation of the loss function determining unit 740 may refer to the operation of 406 described above with reference to FIG. 4.
  • the gradient information determining unit 750 is configured to determine the gradient information of the current discriminant model based on the current loss function when the loop end condition is not satisfied.
  • the operation of the gradient information determining unit 750 may refer to the operation of 407 described above with reference to FIG. 4.
  • the model updating unit 760 is configured to update the model parameters of the current graph neural network sub-model based on the current loss function when the loop end condition is not satisfied.
  • the operation of the model update unit 760 may refer to the operation of 407 described above with reference to FIG. 4.
  • the gradient information providing unit 770 is configured to provide gradient information of the current discriminant model to the server, and the server uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server.
  • the operation of the gradient information providing unit 770 may refer to the operation of 408 described above with reference to FIG. 4.
  • the gradient information providing unit 770 can provide the gradient information of the current discriminant model to the server in a safe aggregation manner.
  • the model training device 700 may further include a training sample subset acquisition unit (not shown).
  • the training sample subset acquiring unit is configured to acquire the current training sample subset.
  • FIG. 8 shows a block diagram of an apparatus for cooperatively training a graph neural network model via a plurality of data owners (hereinafter referred to as a model training apparatus 800) according to an embodiment of the present specification.
  • the graph neural network model includes a discriminant model located on the server side and a graph neural network sub-model located at each data owner.
  • Each data owner has a training sample subset obtained by horizontally splitting the training sample set used for model training, where the training sample subset includes a feature data subset and true label values.
  • the model training device 800 is located on the server side.
  • the model training device 800 includes a discriminant model providing unit 810, a gradient information acquiring unit 820, and a discriminant model updating unit 830.
  • the discriminant model providing unit 810, the gradient information acquiring unit 820, and the discriminant model updating unit 830 operate in a loop until the loop end condition is satisfied.
  • the loop end condition may include, for example: a predetermined number of loops is reached; the variation of each model parameter of the discriminant model is not greater than a predetermined threshold; or the current total loss function is within a predetermined range.
  • the discriminant model providing unit 810 is configured to provide the current discriminant model to each data owner for use by each data owner to predict the predicted label value of each node.
  • the operation of the discriminant model providing unit 810 may refer to the operation of 404 described above with reference to FIG. 4.
  • the gradient information acquiring unit 820 is configured to acquire the corresponding gradient information of the current discriminant model from each data owner when the loop end condition is not met.
  • the operation of the gradient information acquisition unit 820 may refer to the operation of 408 described above with reference to FIG. 4.
  • the discriminant model update unit 830 is configured to update the current discriminant model based on gradient information from each data owner.
  • the operation of the discriminant model update unit 830 can refer to the operation of 409 described above with reference to FIG. 4.
  • Fig. 9 shows a block diagram of an apparatus for model prediction based on a graph neural network model (hereinafter referred to as a model prediction apparatus 900) according to an embodiment of the present specification.
  • the model prediction device 900 is applied to the data owner.
  • the model prediction device 900 includes a vector representation unit 910, a discriminant model acquisition unit 920, and a model prediction unit 930.
  • the vector representation unit 910 is configured to provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the graph neural network sub-model.
  • the discriminant model obtaining unit 920 is configured to obtain the discriminant model from the server.
  • the model prediction unit 930 is configured to provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node, thereby completing the model prediction process.
  • As above, referring to FIGS. 1 to 9, the model training and prediction methods, devices, and systems according to the embodiments of this specification have been described.
  • the above model training device and model prediction device can be implemented by hardware, or by software or a combination of hardware and software.
  • FIG. 10 shows a hardware structure diagram of an electronic device 1000 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • the electronic device 1000 may include at least one processor 1010, a storage (for example, a non-volatile memory) 1020, a memory 1030, and a communication interface 1040, and the at least one processor 1010, the storage 1020, the memory 1030, and the communication interface 1040 are connected together via a bus 1060.
  • At least one processor 1010 executes at least one computer-readable instruction (ie, the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory, which, when executed, cause the at least one processor 1010 to: execute the following loop process until a loop end condition is satisfied: provide the current feature data subset to the current graph neural network sub-model at the data owner to obtain the feature vector representation of each node of the current graph neural network sub-model; obtain the current discriminant model from the server; provide the feature vector representation of each node to the current discriminant model to obtain the current predicted label value of each node; determine the current loss function according to the current predicted label value of each node and the corresponding true label value; when the loop end condition is not satisfied, determine the gradient information of the current discriminant model through back propagation based on the current loss function and update the model parameters of the current graph neural network sub-model; and provide the gradient information of the current discriminant model to the server, which uses the gradient information of the current discriminant model from each data owner to update the discriminant model at the server, where, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • FIG. 11 shows a hardware structure diagram of an electronic device 1100 for training a graph neural network model via multiple data owners according to an embodiment of the present specification.
  • the electronic device 1100 may include at least one processor 1110, a storage (for example, a non-volatile memory) 1120, a memory 1130, and a communication interface 1140, and the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via a bus 1160.
  • At least one processor 1110 executes at least one computer-readable instruction (ie, the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory, which, when executed, cause the at least one processor 1110 to execute the following loop process until a loop end condition is satisfied: provide the current discriminant model to each data owner, where each data owner provides the feature vector representation of each node of its current graph neural network sub-model to the current discriminant model to obtain the predicted label value of each node, determines its current loss function based on the predicted label value of each node and the corresponding true label value, and, when the loop end condition is not satisfied, determines the gradient information of the discriminant model through back propagation based on its current loss function, updates the model parameters of its current graph neural network sub-model, and provides the determined gradient information to the server, the feature vector representation of each node being obtained by providing the current feature data subset to the current graph neural network sub-model; when the loop end condition is not satisfied, obtain the corresponding gradient information of the current discriminant model from each data owner, and update the current discriminant model based on the gradient information from each data owner, where, when the loop end condition is not satisfied, the updated graph neural network sub-model of each data owner and the updated discriminant model at the server are used as the current models of the next loop process.
  • FIG. 12 shows a hardware structure diagram of an electronic device 1200 for model prediction based on a graph neural network model according to an embodiment of the present specification.
  • the electronic device 1200 may include at least one processor 1210, a storage (for example, a non-volatile memory) 1220, a memory 1230, and a communication interface 1240, and the at least one processor 1210, the storage 1220, the memory 1230, and the communication interface 1240 are connected together via a bus 1260.
  • At least one processor 1210 executes at least one computer-readable instruction (i.e., the above-mentioned element implemented in the form of software) stored or encoded in the memory.
  • computer-executable instructions are stored in the memory, which, when executed, cause at least one processor 1210 to: provide the data to be predicted to the graph neural network sub-model at the data owner to obtain the graph neural network The feature vector representation of each node of the sub-model; obtain the discriminant model from the server; and provide the feature vector representation of each node to the discriminant model to obtain the predicted label value of each node.
  • a program product such as a machine-readable medium (for example, a non-transitory machine-readable medium) is provided.
  • the machine-readable medium may have instructions (ie, the above-mentioned elements implemented in the form of software), which, when executed by a machine, cause the machine to perform the various operations and functions described above in conjunction with FIGS. 1-9 in the various embodiments of this specification.
  • a system or device equipped with a readable storage medium may be provided, the software program code for realizing the function of any one of the above-mentioned embodiments is stored on the readable storage medium, and the computer or processor of the system or device reads and executes the instructions stored in the readable storage medium.
  • the program code itself read from the readable medium can realize the function of any one of the above embodiments, so the machine-readable code and the readable storage medium storing the machine-readable code constitute a part of the present invention.
  • Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tapes, non-volatile memory cards, and ROM.
  • the program code can be downloaded from the server computer or the cloud via the communication network.
  • the device structure described in the foregoing embodiments may be a physical structure or a logical structure, that is, some units may be implemented by the same physical entity, some units may be implemented separately by multiple physical entities, or some units may be implemented jointly by certain components in multiple independent devices.
  • the hardware unit or module can be implemented mechanically or electrically.
  • a hardware unit, module, or processor may include a permanent dedicated circuit or logic (such as a dedicated processor, FPGA or ASIC) to complete the corresponding operation.
  • the hardware unit or processor may also include programmable logic or circuits (such as general-purpose processors or other programmable processors), which may be temporarily set by software to complete corresponding operations.
  • the specific implementation may be a mechanical manner, a dedicated permanent circuit, or a temporarily configured circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed are a method and an apparatus for training a graph neural network model via multiple data owners. In the method, the graph neural network model is divided into a discriminant model and multiple graph neural network sub-models. During model training, each data owner provides its own feature data subset to its own graph neural network sub-model to obtain the feature vector representation of each node. Each data owner receives the discriminant model from a server and obtains the current predicted label value of each node using the feature vector representation of each node, so that a current loss function is computed at each data owner, gradient information of the discriminant model is determined based on the current loss function, and the data owner's own graph neural network sub-model is updated. Each data owner provides its own gradient information to the server, so that the server updates the discriminant model. Using the method, the security of the private data at each data owner can be ensured.
PCT/CN2020/132667 2020-02-17 2020-11-30 Graph neural network model training method, apparatus, and system WO2021164365A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010096248.8 2020-02-17
CN202010096248.8A CN110929870B (zh) 2020-02-17 2020-02-17 图神经网络模型训练方法、装置及系统

Publications (1)

Publication Number Publication Date
WO2021164365A1 true WO2021164365A1 (fr) 2021-08-26

Family

ID=69854815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132667 WO2021164365A1 (fr) 2020-02-17 2020-11-30 Graph neural network model training method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN110929870B (fr)
WO (1) WO2021164365A1 (fr)


Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929870B (zh) * 2020-02-17 2020-06-12 支付宝(杭州)信息技术有限公司 图神经网络模型训练方法、装置及系统
CN111581648B (zh) * 2020-04-06 2022-06-03 电子科技大学 在不规则用户中保留隐私的联邦学习的方法
CN111612126A (zh) * 2020-04-18 2020-09-01 华为技术有限公司 强化学习的方法和装置
CN111665861A (zh) * 2020-05-19 2020-09-15 中国农业大学 一种轨迹跟踪控制方法、装置、设备和存储介质
CN111553470B (zh) * 2020-07-10 2020-10-27 成都数联铭品科技有限公司 适用于联邦学习的信息交互系统及方法
CN111737474B (zh) * 2020-07-17 2021-01-12 支付宝(杭州)信息技术有限公司 业务模型的训练和确定文本分类类别的方法及装置
CN111738438B (zh) * 2020-07-17 2021-04-30 支付宝(杭州)信息技术有限公司 图神经网络模型训练方法、装置及系统
CN111783143B (zh) * 2020-07-24 2023-05-09 支付宝(杭州)信息技术有限公司 用户数据的业务模型使用确定方法、装置及系统
CN112052942B (zh) * 2020-09-18 2022-04-12 支付宝(杭州)信息技术有限公司 神经网络模型训练方法、装置及系统
CN112131303A (zh) * 2020-09-18 2020-12-25 天津大学 基于神经网络模型的大规模数据沿袭方法
CN112364819A (zh) * 2020-11-27 2021-02-12 支付宝(杭州)信息技术有限公司 一种联合训练识别模型的方法和装置
CN112766500B (zh) * 2021-02-07 2022-05-17 支付宝(杭州)信息技术有限公司 图神经网络的训练方法及装置
CN113052333A (zh) * 2021-04-02 2021-06-29 中国科学院计算技术研究所 基于联邦学习进行数据分析的方法及系统
CN113254996B (zh) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 图神经网络训练方法、装置、计算设备及存储介质
CN113222143B (zh) * 2021-05-31 2023-08-01 平安科技(深圳)有限公司 图神经网络训练方法、系统、计算机设备及存储介质
CN113221153B (zh) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 图神经网络训练方法、装置、计算设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684855A (zh) * 2018-12-17 2019-04-26 电子科技大学 一种基于隐私保护技术的联合深度学习训练方法
US20190286972A1 (en) * 2018-03-14 2019-09-19 Microsoft Technology Licensing, Llc Hardware accelerated neural network subgraphs
CN110782044A (zh) * 2019-10-29 2020-02-11 支付宝(杭州)信息技术有限公司 多方联合训练图神经网络的方法及装置
CN110929870A (zh) * 2020-02-17 2020-03-27 支付宝(杭州)信息技术有限公司 图神经网络模型训练方法、装置及系统

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4202782A1 (fr) * 2015-11-09 2023-06-28 Google LLC Formation de réseaux neuronaux représentés sous forme de graphes de calcul
CN110807125B (zh) * 2019-08-03 2020-12-22 北京达佳互联信息技术有限公司 推荐系统、数据访问方法及装置、服务器、存储介质
CN110751269B (zh) * 2019-10-18 2022-08-05 网易(杭州)网络有限公司 图神经网络训练方法、客户端设备及系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190286972A1 (en) * 2018-03-14 2019-09-19 Microsoft Technology Licensing, Llc Hardware accelerated neural network subgraphs
CN109684855A (zh) * 2018-12-17 2019-04-26 电子科技大学 一种基于隐私保护技术的联合深度学习训练方法
CN110782044A (zh) * 2019-10-29 2020-02-11 支付宝(杭州)信息技术有限公司 多方联合训练图神经网络的方法及装置
CN110929870A (zh) * 2020-02-17 2020-03-27 支付宝(杭州)信息技术有限公司 图神经网络模型训练方法、装置及系统

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113849665A (zh) * 2021-09-02 2021-12-28 中科创达软件股份有限公司 多媒体数据识别方法、装置、设备及存储介质
CN113571133A (zh) * 2021-09-14 2021-10-29 内蒙古农业大学 一种基于图神经网络的乳酸菌抗菌肽预测方法
CN113571133B (zh) * 2021-09-14 2022-06-17 内蒙古农业大学 一种基于图神经网络的乳酸菌抗菌肽预测方法
CN113771289A (zh) * 2021-09-16 2021-12-10 健大电业制品(昆山)有限公司 一种注塑成型工艺参数优化的方法和系统
CN113771289B (zh) * 2021-09-16 2022-06-24 健大电业制品(昆山)有限公司 一种注塑成型工艺参数优化的方法和系统
CN114117926A (zh) * 2021-12-01 2022-03-01 南京富尔登科技发展有限公司 一种基于联邦学习的机器人协同控制算法
CN114117926B (zh) * 2021-12-01 2024-05-14 南京富尔登科技发展有限公司 一种基于联邦学习的机器人协同控制算法
CN114819139A (zh) * 2022-03-28 2022-07-29 支付宝(杭州)信息技术有限公司 一种图神经网络的预训练方法及装置
CN114819182A (zh) * 2022-04-15 2022-07-29 支付宝(杭州)信息技术有限公司 用于经由多个数据拥有方训练模型的方法、装置及系统
CN114819182B (zh) * 2022-04-15 2024-05-31 支付宝(杭州)信息技术有限公司 用于经由多个数据拥有方训练模型的方法、装置及系统

Also Published As

Publication number Publication date
CN110929870A (zh) 2020-03-27
CN110929870B (zh) 2020-06-12

Similar Documents

Publication Publication Date Title
WO2021164365A1 (fr) Procédé, appareil et système d'apprentissage de modèle de réseau neuronal graphique
WO2021103901A1 (fr) Procédés et dispositif d'apprentissage et de prédiction de modèle de réseau neuronal basé sur un calcul de sécurité multi-parties
WO2020156004A1 (fr) Procédé, appareil, et système d'apprentissage de modèle
CN111061963B (zh) 基于多方安全计算的机器学习模型训练及预测方法、装置
CN110782044A (zh) 多方联合训练图神经网络的方法及装置
US11715044B2 (en) Methods and systems for horizontal federated learning using non-IID data
US11341411B2 (en) Method, apparatus, and system for training neural network model
CN111738438B (zh) 图神经网络模型训练方法、装置及系统
CN111523556B (zh) 模型训练方法、装置及系统
CN111368983A (zh) 业务模型训练方法、装置及业务模型训练系统
CN111523134B (zh) 基于同态加密的模型训练方法、装置及系统
CN110929887B (zh) 逻辑回归模型训练方法、装置及系统
CN111523674B (zh) 模型训练方法、装置及系统
Lei et al. Federated learning over coupled graphs
CN111737756B (zh) 经由两个数据拥有方进行的xgb模型预测方法、装置及系统
CN110175283B (zh) 一种推荐模型的生成方法及装置
CN111523675B (zh) 模型训练方法、装置及系统
CN111738453B (zh) 基于样本加权的业务模型训练方法、装置及系统
CN112288088B (zh) 业务模型训练方法、装置及系统
Razeghi et al. Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition
CN112183566B (zh) 模型训练方法、装置及系统
CN112183564B (zh) 模型训练方法、装置及系统
US20230084507A1 (en) Servers, methods and systems for fair and secure vertical federated learning
Azogagh et al. Crypto'Graph: Leveraging Privacy-Preserving Distributed Link Prediction for Robust Graph Learning
Arpitha et al. My privacy my decision: control communication media on online social networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920246

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920246

Country of ref document: EP

Kind code of ref document: A1