WO2021082681A1 - Method and device for multi-party joint training of graph neural network - Google Patents

Method and device for multi-party joint training of graph neural network

Info

Publication number
WO2021082681A1
Authority
WO
WIPO (PCT)
Prior art keywords
embedding
sample
network
vector
node
Prior art date
Application number
PCT/CN2020/111501
Other languages
French (fr)
Chinese (zh)
Inventor
陈超超
郑龙飞
王力
周俊
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021082681A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning

Definitions

  • One or more embodiments of this specification relate to the fields of data security and machine learning, and in particular, to methods and devices for multi-party joint training of graph neural networks.
  • The data needed for machine learning often involves multiple fields. For example, in a machine-learning-based user classification analysis scenario, the electronic payment platform owns the user's transaction flow data, the social platform owns the user's friend contact data, and the banking institution owns the user's loan data.
  • Data often exists in the form of islands. Due to industry competition, data security, user privacy, and other concerns, data integration faces great resistance, and it is difficult to bring together data scattered across platforms to train machine learning models. Jointly training machine learning models on multi-party data, under the premise that no data is leaked, has therefore become a major challenge.
  • The graph neural network is a widely used machine learning model. Compared with traditional neural networks, a graph neural network can capture not only the features of nodes but also the features of the relationships between nodes, and has therefore achieved excellent results in a number of machine learning tasks. However, this also gives graph neural networks a certain complexity. In particular, when faced with data islands, how to integrate multi-party data and safely conduct multi-party joint modeling is a problem to be solved.
  • One or more embodiments of this specification describe methods and devices for multi-party joint training of graph neural networks, which can safely and efficiently jointly train graph neural networks among multiple parties as a prediction model.
  • the graph neural network includes a graph embedding sub-network and a classification sub-network.
  • the multiple parties include a server and N data holders.
  • The server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network. Any first holder of the N data holders stores the first characteristic part of each sample in the sample set, and a first graph structure containing the respective samples as corresponding nodes; the first holder maintains a first network part of the graph embedding sub-network, and the first network part includes an embedding layer and an aggregation layer.
  • The method is executed by the first holder and includes: in the embedding layer, based at least on the first characteristic part of each sample, jointly calculating the primary embedding vector of each sample with the other N-1 data holders using a multi-party secure computing scheme; in the aggregation layer, based on the first graph structure and the primary embedding vector of each sample, performing multi-level aggregation on each sample to determine the high-order embedding vector of each sample, where each level of aggregation includes, for the node corresponding to each sample in the first graph structure, determining the current-level embedding vector of the node based at least on the upper-level embedding vectors of the node's neighbor nodes; sending the high-order embedding vectors of the respective samples to the server, so that the server uses the classification sub-network to classify and predict each sample based on the high-order embedding vectors sent by the N data holders; receiving a loss gradient from the server, the loss gradient being determined based at least on the classification prediction result of each sample and the sample label; and updating the first network part according to the loss gradient.
  • In one embodiment, the multi-party secure computing scheme includes a secret sharing scheme. Accordingly, the primary embedding vector of each sample can be obtained in the following manner: performing sharing processing on the first characteristic part of each sample to obtain a first shared characteristic part; sending the first shared characteristic part to the other N-1 data holders, and receiving N-1 shared characteristic parts from the other N-1 data holders; integrating the first characteristic part and the N-1 shared characteristic parts to obtain a first comprehensive characteristic; sending the first comprehensive characteristic to the other N-1 data holders, and receiving N-1 comprehensive characteristics from them; and determining the primary embedding vector of each sample according to the first comprehensive characteristic and the N-1 comprehensive characteristics.
  • In one embodiment, the embedding layer has embedding parameters, and obtaining the primary embedding vector of each sample at the embedding layer includes: based on the first feature part of each sample and the embedding parameters in the embedding layer, jointly calculating the primary embedding vector of each sample with the other N-1 data holders using a multi-party secure computing scheme.
  • In this case, updating the first network part includes updating the embedding parameters.
  • In a further embodiment, the embedding layer adopts a secret sharing scheme to jointly calculate the primary embedding vector of each sample with the other N-1 data holders, which specifically includes: performing sharing processing on the first characteristic part of each sample to obtain a first shared characteristic part, and performing sharing processing on the embedding parameters to obtain a first shared parameter part; sending the first shared characteristic part and the first shared parameter part to the other N-1 data holders, and receiving N-1 shared characteristic parts and N-1 shared parameter parts from them; using a first comprehensive parameter composed of the embedding parameters and the N-1 shared parameter parts to process a first comprehensive feature composed of the first characteristic part and the N-1 shared characteristic parts, obtaining a first integrated embedding result; sending the first integrated embedding result to the other N-1 data holders, and receiving the corresponding N-1 integrated embedding results from them; and determining the primary embedding vector of each sample according to the first integrated embedding result and the N-1 integrated embedding results.
  • In one embodiment, each level of aggregation in the aggregation layer includes, for the first node corresponding in the first graph structure to any first sample among the samples: determining a neighbor aggregation vector at least according to the upper-level embedding vectors of the neighbor nodes of the first node; and determining the current-level embedding vector of the first node according to the neighbor aggregation vector and the upper-level embedding vector of the first node.
  • Further, in one example, a pooling operation is performed on the upper-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.
  • In another example, the upper-level embedding vectors of the neighbor nodes of the first node are weighted and summed to obtain the neighbor aggregation vector, where the weight corresponding to each neighbor node is determined according to the characteristics of the connecting edge between that neighbor node and the first node.
  • the neighbor aggregation vector is determined based on the upper-level embedding vector of each neighbor node and the edge embedding vector of each connection edge between each neighbor node and the first node.
  • the process of updating the first network part includes: according to the loss gradient, using a backpropagation algorithm to reversely update the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer layer by layer.
  • the graph neural network includes a graph embedding sub-network and a classification sub-network.
  • the multiple parties include a server and N data holders.
  • The server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network; each of the N data holders stores a part of the characteristics of each sample in the sample set, and a graph structure containing each sample as a corresponding node.
  • The method is executed by the server and includes: for any target sample in the sample set, respectively receiving from the N data holders N high-order embedding vectors for the target sample, where the i-th high-order embedding vector is obtained by the i-th holder of the N data holders by inputting the graph structure stored therein and the characteristic part of the target sample into the graph embedding sub-network part maintained therein; in the classification sub-network, synthesizing the N high-order embedding vectors to obtain a comprehensive embedding vector of the target sample, and determining the classification prediction result of the target sample according to the comprehensive embedding vector; determining the prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label; updating the classification sub-network according to the prediction loss, and determining the loss gradient corresponding to the input layer of the classification sub-network; and sending the loss gradient to the N data holders, so that each holder updates the graph embedding sub-network part maintained therein.
  • In different embodiments, the N high-order embedding vectors can be synthesized to obtain the comprehensive embedding vector of the target sample in the following manner: splicing the N high-order embedding vectors to obtain the comprehensive embedding vector; or averaging the N high-order embedding vectors to obtain the comprehensive embedding vector.
  • synthesizing the N high-order embedding vectors to obtain the comprehensive embedding vector of the target sample includes: using N weight vectors to perform bitwise multiplication with the N high-order embedding vectors to obtain N weighted processing vectors; sum the N weighted processing vectors to obtain the integrated embedding vector; wherein, updating the classification sub-network includes updating the N weight vectors.
  • In one embodiment, before determining the prediction loss, the method further includes: receiving the sample label from a second holder of the N data holders.
  • the graph neural network includes a graph embedding sub-network and a classification sub-network.
  • the multiple parties include a server and N data holders.
  • The server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network; any first holder of the N data holders stores the first characteristic part of each sample in the sample set, and a first graph structure containing the respective samples as corresponding nodes; the first holder maintains a first network part of the graph embedding sub-network, and the first network part includes an embedding layer and an aggregation layer.
  • The device is deployed in the first holder and includes: a primary embedding unit, configured to jointly calculate, at the embedding layer and using a multi-party secure computing scheme, the primary embedding vector of each sample with the other N-1 data holders, based at least on the first characteristic part of each sample; an aggregation unit, configured to perform, at the aggregation layer, multi-level aggregation on each sample based on the first graph structure and the primary embedding vectors, to determine the high-order embedding vector of each sample; a sending unit, configured to send the high-order embedding vectors to the server; a receiving unit, configured to receive a loss gradient from the server; and an updating unit, configured to update the first network part according to the loss gradient.
  • the graph neural network includes a graph embedding sub-network and a classification sub-network.
  • the multiple parties include a server and N data holders.
  • The server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network; each of the N data holders stores a part of the characteristics of each sample in the sample set, and a graph structure containing each sample as a corresponding node.
  • The device is deployed in the server and includes: a vector receiving unit, configured to receive, for any target sample in the sample set, N high-order embedding vectors for the target sample respectively from the N data holders, where the i-th high-order embedding vector is obtained by the i-th holder of the N data holders by inputting the graph structure stored therein and the characteristic part of the target sample into the graph embedding sub-network part maintained therein; a classification prediction unit, configured to synthesize the N high-order embedding vectors in the classification sub-network to obtain a comprehensive embedding vector of the target sample, and determine the classification prediction result of the target sample according to the comprehensive embedding vector; a loss determining unit, configured to determine the prediction loss based at least on the classification prediction result and the corresponding sample label; an updating unit, configured to update the classification sub-network according to the prediction loss and determine the loss gradient corresponding to the input layer of the classification sub-network; and a sending unit, configured to send the loss gradient to the N data holders, so that each holder updates the graph embedding sub-network part maintained therein.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect or the second aspect.
  • According to another aspect, a computing device is provided, including a memory and a processor, where the memory stores executable code, and when the processor executes the executable code, the method of the first aspect or the second aspect is implemented.
  • multiple data holders and servers jointly train a graph neural network, wherein each data holder stores part of the characteristics of the sample and the graph structure with the sample as the node.
  • the graph neural network is divided into a graph embedding sub-network and a classification sub-network.
  • Each data holder maintains a part of the graph embedding sub-network, and the server maintains the classification sub-network.
  • In each data holder, the primary embedding vector of the sample is calculated jointly with the other holders through a multi-party secure computing scheme. On this basis, multi-level neighbor aggregation is performed on the node according to the local graph structure to obtain the high-order embedding vector of the node, which is then sent to the server.
  • the server uses the classification sub-network to synthesize the high-level embedding vectors of the samples from each data holder, and then classifies and predicts the samples accordingly to determine the loss.
  • the loss gradient is passed from the classification sub-network in the server back to the graph embedding sub-network in the data holder to realize the joint training of the entire graph neural network. In the whole process, the privacy and security of the sample feature data and graph structure data are guaranteed, and the calculation and training efficiency of the entire network is also improved.
  • Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification
  • Fig. 2 shows a process of a multi-party joint training graph neural network method according to an embodiment
  • Figure 3 shows the flow of the method for the first holder to determine the primary embedding vector of the sample in a secret sharing manner
  • Fig. 4 shows a schematic block diagram of a training device deployed in a first holder according to an embodiment
  • Fig. 5 shows a schematic block diagram of a training device deployed in a server according to an embodiment.
  • Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification.
  • two data holders are shown, namely holder A and holder B.
  • Holders A and B each store part of the characteristics of the sample, and a graph structure that records the relationship between the samples.
  • the sample may be a user.
  • the holder A may be, for example, an electronic payment platform (such as Alipay), in which a part of the user's characteristics (such as payment-related characteristics) are stored.
  • This part of the features is shown in FIG. 1 as features f1 to f4.
  • the holder A also stores a graph structure A constructed through a payment relationship, for example.
  • the holder B may be, for example, a social platform (such as Dingding), in which another part of the user's characteristics (such as social-related characteristics) are stored. These features are shown in FIG. 1 as features f5, f6, and f7.
  • the holder B also stores a graph structure B constructed through social relationships, for example. More specifically, in the social platform, users who have a friend relationship or have a history of communication can be connected by connecting edges, thereby forming a graph structure B.
  • It can be seen that holder A and holder B can store different characteristic parts of the same samples, and because holder A and holder B construct their graph structures based on different association relationships (for example, holder A based on the payment relationship and holder B based on the social friend relationship), holders A and B each store different graph structures.
  • a neutral server is introduced in addition to each data holder, and each data holder and server jointly train the graph neural network.
  • the graph neural network to be trained is divided into two parts: the graph embedding sub-network and the classification sub-network.
  • the graph embedding sub-network is used to generate the high-order embedding vectors of the nodes corresponding to each sample according to the sample characteristics and graph structure.
  • the calculation of the graph embedding sub-network involves original sample features and graph structure data, which is related to privacy data calculations, so it can be performed locally on the data holder.
  • each data holder may maintain a part of the graph embedding sub-network, and use the locally maintained graph embedding sub-network part based on locally stored feature data and graph structure data to calculate the high-order embedding vector of the node.
  • the graph embedding sub-network in each data holder can be divided into an embedding layer and an aggregation layer.
  • In the embedding layer, each data holder uses a multi-party secure computing scheme to synthesize the sample feature parts stored by the respective holders, obtaining the primary embedding vector of the node corresponding to each sample.
  • In the aggregation layer, the data holder performs multi-level neighbor aggregation on the node based on the node's primary embedding vector and the locally stored graph structure, thereby obtaining the node's high-order embedding vector.
  • the classification sub-network is used to synthesize the high-order embedding vectors of the nodes obtained by embedding the graph into the sub-network, and perform classification prediction on the nodes according to the comprehensive results.
  • the calculation of the classification sub-network does not involve the original sample features and graph structure data, and is a non-privacy related calculation. Therefore, it can be performed in the server to improve the efficiency of calculation and training.
  • On this basis, the server can determine the prediction loss according to the classification prediction results of the classification sub-network and the sample labels (shown as y1, y2 and y3 in Figure 1), and update the classification sub-network through back propagation until the loss gradient at the input layer of the classification sub-network is determined. The server then sends this loss gradient to each data holder, so that each data holder can continue to update its graph embedding sub-network according to the loss gradient. As a result, the update and training of the entire graph neural network is realized.
  • More generally, there may be N data holders, where N is usually greater than or equal to 2.
  • the N data holders each store a part of the characteristics of each sample in the sample set, and a graph structure containing each sample as a node. In such a scenario, it is hoped that N data holders and servers will jointly train a graph neural network model.
  • the graph neural network is divided into graph embedding sub-networks and classification sub-networks.
  • the N data holders each maintain a part of the graph embedding sub-network; the server maintains the classification sub-network.
  • the following describes the execution steps of the joint training in conjunction with any one of the N data holders, called the first holder.
  • FIG. 2 shows a process of a method for multi-party joint training of a graph neural network according to an embodiment. Figure 2 shows the respective processing procedures of any first holder and the server in the joint training, as well as the interaction between the two. The first holder and the server can each be implemented as any apparatus, device, platform, or device cluster with computing and processing capabilities.
  • the first holder stores part of the characteristics of each sample in the sample set, which is referred to herein as the first characteristic part.
  • the graph structure stored in the first holder is referred to as the first graph structure, and the graph embedded sub-network part maintained therein is referred to as the first network part.
  • The first network part includes an embedding layer and an aggregation layer.
  • the process of joint training includes the following steps.
  • First, in step 201, the first holder, using the embedding layer in the first network part maintained by it, jointly calculates the primary embedding vector of each sample with the other N-1 data holders, based at least on the first feature part of each sample.
  • the joint calculation in this step involves the original characteristics of the samples stored in each data holder and belongs to private data. Therefore, for data security considerations, the aforementioned joint calculation needs to adopt a multi-party secure computing (MPC) solution.
  • In step 201, depending on the specific algorithm of the embedding layer, various applicable MPC schemes can be combined to jointly calculate the primary embedding vector of the sample.
  • the processing of the sample feature by the embedding layer mainly involves encoding and characterizing the original feature data (for example, encoding as a vector), and does not involve parameter calculation processing on the feature data.
  • various MPC schemes can be used to synthesize the characteristic parts of the sample encoded by the N holders to obtain the primary embedding vector of the sample.
  • In another embodiment, after the embedding layer encodes the feature data of the sample, it further performs calculation involving parameters, such as a linear transformation of the features using a parameter vector, or further applying a non-linear function (such as the sigmoid function), and so on.
  • the embedding layer of the holder i includes the embedding parameter ⁇ i required for feature calculation.
  • In one embodiment, a homomorphic encryption method can be used to separately integrate the sample feature parts encoded by the N holders and the embedding parameter parts they maintain, so as to obtain the primary embedding vector of the sample. Specifically, homomorphic encryption can be used to synthesize the sample features, and also to synthesize the embedding parameters maintained in each holder; the integrated embedding parameters are then used to process the synthesized features to obtain the primary embedding vector of the sample.
  • a secret sharing method is adopted to obtain the primary embedding vector of the sample based on each feature part and embedding parameters.
  • Figure 3 shows the flow of the method for the first holder to determine the primary embedding vector of the sample in a secret sharing manner.
  • First, in step 301, the first holder i performs sharing processing on the first characteristic part x_i of the sample to obtain the first shared characteristic part x′_i.
  • The above-mentioned sharing processing can be realized by an algorithm in secret sharing, by adding a random number generated in a certain manner to the original data. For example, the first shared characteristic part can be obtained as x′_i = x_i + r_i, where r_i is a random number used for the sharing processing of the sample features.
  • In addition, the first holder i also performs sharing processing on the embedding parameters θ_i maintained therein to obtain the first shared parameter part θ′_i. For example, the sharing processing can be performed as θ′_i = θ_i + s_i, where s_i is a random number used for sharing the embedding parameters.
  • Next, in step 302, the first holder i sends the first shared characteristic part x′_i and the first shared parameter part θ′_i to the other N-1 data holders.
  • It can be understood that the other N-1 data holders respectively calculate their corresponding shared characteristic parts x′_j (j≠i) and shared parameter parts θ′_j in a similar manner, and send them out.
  • Correspondingly, the first holder i receives the N-1 shared characteristic parts x′_j and the N-1 shared parameter parts θ′_j from the other N-1 data holders.
  • Then, the first holder i integrates its own first characteristic part x_i with the received N-1 shared characteristic parts x′_j to obtain a first comprehensive feature X_i, for example X_i = x_i + Σ_{j≠i} x′_j.
  • Similarly, the first holder i obtains a first comprehensive parameter W_i based on its own embedding parameters θ_i and the N-1 shared parameter parts θ′_j, for example W_i = θ_i + Σ_{j≠i} θ′_j. The first comprehensive parameter W_i is then used to process the first comprehensive feature X_i, obtaining a first integrated embedding result H_i.
  • Next, in step 304, the first holder i sends the first integrated embedding result H_i to the other N-1 data holders.
  • The other holders similarly obtain their integrated embedding results H_j, so the first holder i correspondingly receives N-1 integrated embedding results H_j from the other N-1 data holders.
  • Finally, the first holder i determines the primary embedding vector H of each sample according to the first integrated embedding result H_i and the N-1 integrated embedding results H_j, for example by summing them: H = H_i + Σ_{j≠i} H_j.
  • In this way, the first holder i and the other data holders jointly calculate the same primary embedding vector H for each sample.
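  • As a concrete illustration, the following minimal Python (numpy) sketch simulates, in a single process, one standard additive secret-sharing realization of such a joint computation: each holder splits its encoded feature contribution into N shares, the shares are exchanged, and only partial sums are revealed, yet every party recovers the same combined result. The share construction and combination formulas actually used in this embodiment may differ, and all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_shares(value, n):
    # Split a vector into n additive shares that sum back to the original value.
    shares = [rng.normal(size=value.shape) for _ in range(n - 1)]
    shares.append(value - sum(shares))
    return shares

def joint_secret_shared_sum(local_vectors):
    # Simulates N holders jointly computing the sum of their private vectors:
    # holder i sends share j of its vector to holder j, each holder publishes
    # only the sum of the shares it received, and the published partial sums
    # add up to the true total without any single vector being revealed.
    n = len(local_vectors)
    all_shares = [make_shares(v, n) for v in local_vectors]
    partial_sums = [sum(all_shares[i][j] for i in range(n)) for j in range(n)]
    return sum(partial_sums)

# Encoded feature parts of one sample held by N = 3 data holders (toy values).
encoded_features = [rng.normal(size=4) for _ in range(3)]
primary_embedding = joint_secret_shared_sum(encoded_features)
assert np.allclose(primary_embedding, sum(encoded_features))
# The embedding parameters of each holder could be combined in the same way
# before being applied to the combined features.
```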
  • On the basis of obtaining the primary embedding vector of each sample at the embedding layer of the first holder i using the MPC scheme, then, in step 202, at the aggregation layer, multi-level neighbor aggregation is performed on each sample based on the first graph structure stored therein and the primary embedding vector of each sample, so as to determine the high-order embedding vector of each sample.
  • Specifically, each sample corresponds to a node in the first graph structure. Based on the connection information between the nodes in the first graph structure, multi-level neighbor aggregation is performed on each node, where each level of aggregation includes, for each node, determining the current-level embedding vector of this node based at least on the upper-level embedding vectors of the node's neighbor nodes.
  • More specifically, for a first node v corresponding to any first sample, the k-th level aggregation may include the following. An aggregation function AGG_k is adopted to determine a neighbor aggregation vector n_v^(k), at least according to the upper-level (that is, (k-1)-level) embedding vectors h_u^(k-1) of the neighbor nodes u of the first node v:
    n_v^(k) = AGG_k({h_u^(k-1) : u ∈ N(v)})    (6)
  where N(v) represents the set of neighbor nodes of node v. The current-level embedding vector of node v is then determined as:
    h_v^(k) = W_k · f(n_v^(k), h_v^(k-1))    (7)
  where f represents a synthesis function applied to the neighbor aggregation vector n_v^(k) and the upper-level vector h_v^(k-1) of node v, and W_k is the parameter of the k-th level of aggregation. The synthesis operation in the function f can include splicing n_v^(k) with h_v^(k-1), or summing, or averaging them, and so on.
  • the above aggregation function AGG k can take different forms and algorithms.
  • the aforementioned aggregation function AGG k includes a pooling operation.
  • the aforementioned pooling operation may include maximum pooling, average pooling, and so on.
  • In another embodiment, the above-mentioned aggregation function AGG_k can be expressed as inputting the upper-level embedding vectors h_u^(k-1) of the neighbor nodes u into an LSTM neural network in turn, and using the hidden vector thus obtained as the neighbor aggregation vector n_v^(k).
  • In yet another embodiment, the aforementioned aggregation function AGG_k includes a weighted summation operation. In such an embodiment, formula (6) is embodied as:
    n_v^(k) = Σ_{u ∈ N(v)} α_uv · h_u^(k-1)    (8)
  • In one example, the above weight factor α_uv is determined according to the characteristics of the connecting edge e_uv between the neighbor node u and the first node v.
  • the characteristics of the connecting edge e uv between the two nodes u and v may include the total transfer amount of the two users corresponding to the two nodes.
  • the characteristics of the connecting edge e uv between the two nodes u and v may include the interaction frequency of the two users corresponding to the two nodes.
  • Thus, the weight factor of the neighbor node u can be determined based on the characteristics of the connecting edge e_uv, and the neighbor aggregation vector n_v^(k) can then be obtained through the aggregation function of formula (8).
  • In still another embodiment, an edge embedding vector is determined for each connecting edge according to the edge features of the connecting edges between the nodes.
  • In such an embodiment, aggregation of the edge embedding vectors is also introduced. Specifically, the neighbor aggregation vector n_v^(k) is determined based on the upper-level embedding vectors h_u^(k-1) of the neighbor nodes u and the edge embedding vectors q_uv of the connecting edges e_uv between each neighbor node u and the first node v, where q_uv is the edge embedding vector of the connecting edge e_uv between the first node v and its neighbor node u. More specifically, in one example, formula (6) with the aggregation function AGG_k can be embodied as an aggregation over both h_u^(k-1) and q_uv.
  • After the neighbor aggregation vector n_v^(k) is determined based on the upper-level embedding vectors of the neighbor nodes in any of the above ways, the current-level embedding vector h_v^(k) of the first node v is obtained according to formula (7).
  • It can be understood that the primary embedding vector of the sample determined in step 201 can be used as the level-0 embedding vector h_v^(0). By letting k go from 1 to a preset aggregation level K and performing the aggregation level by level, the high-order embedding vector h_v^(K) of the preset level K can be obtained for node v. The aggregation level K is a preset hyperparameter, which corresponds to the order of the neighbor nodes considered in the aggregation.
  • In this way, the first holder i obtains, at the aggregation layer, the high-order embedding vector of each sample based on the first graph structure stored therein and the primary embedding vector of each sample.
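  • As an illustration of the above aggregation, the following minimal Python (numpy) sketch performs K levels of neighbor aggregation in the spirit of formulas (6) and (7), using mean pooling as AGG_k and concatenation as the synthesis function f. The shapes, the absence of a non-linear activation, and all variable names are assumptions for illustration only, not details of the original disclosure.

```python
import numpy as np

def multi_level_aggregation(h0, neighbors, level_params, K):
    """h0: (num_nodes, d) primary (level-0) embedding vectors.
    neighbors: dict {node: list of neighbor node ids} from the local graph structure.
    level_params: list of K matrices W_k of shape (d, 2 * d) (illustrative shapes).
    Returns the level-K (high-order) embedding vectors of all nodes."""
    h = h0
    for k in range(K):
        new_h = np.zeros_like(h)
        for v, nbrs in neighbors.items():
            # Formula (6): neighbor aggregation vector, here via mean pooling.
            n_v = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
            # Formula (7): combine with v's own upper-level vector (concatenation)
            # and transform with the level-k aggregation parameters W_k.
            new_h[v] = level_params[k] @ np.concatenate([h[v], n_v])
        h = new_h
    return h

rng = np.random.default_rng(0)
h0 = rng.normal(size=(4, 4))                        # 4 nodes, embedding size 4
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}  # toy local graph structure
level_params = [rng.normal(size=(4, 8)) for _ in range(2)]
high_order = multi_level_aggregation(h0, neighbors, level_params, K=2)
```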
  • Next, in step 203, the first holder i sends the high-order embedding vector of each sample to the server.
  • It can be understood that the first holder i is any one of the N data holders, and each other data holder j performs operations similar to those of the first holder i, correspondingly obtains the high-order embedding vector of each sample, and sends it to the server.
  • Thus, the server can receive the high-order embedding vectors of the samples processed by each of the N data holders. In other words, for any target sample v, the server respectively receives N high-order embedding vectors z_v^1, ..., z_v^N from the N data holders, where z_v^i represents the high-order embedding vector obtained by the i-th holder for the sample v.
  • Then, in step 204, the server uses the classification sub-network maintained by it to synthesize the N high-order embedding vectors of the target sample v into a comprehensive embedding vector of the target sample, and determines the classification prediction result of the target sample according to the comprehensive embedding vector.
  • the classification sub-network may include a synthesis layer for synthesizing N high-order embedding vectors of the target sample v.
  • the synthesis layer can adopt many different synthesis methods.
  • In one embodiment, the N high-order embedding vectors z_v^1, ..., z_v^N of sample v are spliced (concatenated) to obtain the comprehensive embedding vector Z_v.
  • In another embodiment, the above N high-order embedding vectors are averaged to obtain the comprehensive embedding vector Z_v.
  • In yet another embodiment, the above N high-order embedding vectors are weighted and summed to obtain the comprehensive embedding vector, that is:
    Z_v = Σ_i β_i · z_v^i
  where β_i is the weighting factor corresponding to the i-th data holder. The weight factor β_i can be a preset hyperparameter, or it can be determined through training.
  • In still another embodiment, the comprehensive embedding vector is obtained in the following way:
    Z_v = Σ_i β_i ⊙ z_v^i    (11)
  where β_i is here the weight vector corresponding to the i-th data holder, which has the same dimension as the high-order embedding vector z_v^i, and ⊙ denotes bitwise (element-wise) multiplication. That is to say, in formula (11), N weight vectors are used to perform bitwise multiplication with the N high-order embedding vectors respectively, obtaining N weighted processing vectors, and the N weighted processing vectors are summed to obtain the comprehensive embedding vector Z_v. It should be understood that these N weight vectors are determined through network training.
  • Next, the classification sub-network can determine the classification prediction result of the target sample based on the comprehensive embedding vector Z_v. For example, in the classification sub-network, the comprehensive embedding vector Z_v can be further processed and then input into a classification layer for classification; or the comprehensive embedding vector Z_v can be input directly into the classification layer. Through the classification layer, the classification prediction result of the target sample is obtained.
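  • The following minimal Python (numpy) sketch shows the server-side synthesis of formula (11) together with a toy single-layer classification layer; splicing or averaging could be substituted in synthesize(). The classifier structure and all names are illustrative assumptions only.

```python
import numpy as np

def synthesize(high_order_vectors, weight_vectors):
    # Formula (11): bitwise multiplication with trainable weight vectors, then sum.
    return sum(beta * z for beta, z in zip(weight_vectors, high_order_vectors))

def classify(combined, w_out, b_out):
    # Toy classification layer on top of the comprehensive embedding vector.
    logits = w_out @ combined + b_out
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                 # class probabilities

rng = np.random.default_rng(0)
N, d, num_classes = 3, 4, 2
z = [rng.normal(size=d) for _ in range(N)]     # high-order vectors from the N holders
beta = [rng.normal(size=d) for _ in range(N)]  # trainable weight vectors of formula (11)
w_out, b_out = rng.normal(size=(num_classes, d)), np.zeros(num_classes)

probabilities = classify(synthesize(z, beta), w_out, b_out)
```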
  • Next, in step 205, the server determines the prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label.
  • the sample label of each sample comes from the data holder.
  • In one embodiment, one of the N data holders, for example called the second holder, owns the sample labels of all the training samples. In such a case, the server receives the sample label of each sample from the second holder in advance.
  • the sample labels of each sample are distributed among different data holders.
  • the server collects sample labels of each sample from each data holder in advance.
  • the server may determine the prediction loss according to the definition of various loss functions, at least based on the comparison between the classification prediction result of the target sample and the label value of the sample label.
  • Then, in step 206, the server updates the classification sub-network according to the prediction loss obtained above, and determines the loss gradient corresponding to the input layer of the classification sub-network.
  • Specifically, the loss back-propagation method can be used: starting from the output layer of the classification sub-network, the loss gradient is determined layer by layer, the network parameters of each layer are adjusted based on the loss gradient, and the loss gradient is passed on to the previous layer, until the loss gradient corresponding to the input layer is determined.
  • Then, in step 207, the server sends the aforementioned loss gradient to the N data holders, and accordingly, the first holder i of the N data holders receives the aforementioned loss gradient.
  • Then, the first holder i updates the graph embedding sub-network part maintained therein, that is, the aforementioned first network part, according to the received loss gradient.
  • the first holder i continues to perform the back propagation of the loss according to the aforementioned loss gradient to update the network parameters therein.
  • Backpropagation is first performed in the aggregation layer, so the aggregation parameters in the aggregation layer can be updated layer by layer in the reverse direction.
  • If the embedding layer involves embedding parameters that need to be trained (for example, the embedding parameters θ_i used in obtaining the integrated embedding result), the back propagation continues and the embedding parameters in the embedding layer are further updated. In this way, the graph embedding sub-network part of the first holder i is updated.
  • It can be understood that each of the N data holders can similarly perform the above operations, thereby updating the graph embedding sub-network part maintained therein. As a result, the entire graph embedding sub-network is updated. Combined with the classification sub-network in the server, the entire graph neural network is thus trained and updated.
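  • A minimal PyTorch sketch of this hand-over of gradients, simulated in one process: the server back-propagates the prediction loss through its classification sub-network down to its input layer, and a holder then continues back-propagation through its own sub-network part with the received gradient. The tiny linear modules stand in for the real sub-networks, and all shapes and names are illustrative assumptions.

```python
import torch

# Holder side: a stand-in for the local graph embedding sub-network part.
holder_net = torch.nn.Linear(8, 4)
holder_opt = torch.optim.SGD(holder_net.parameters(), lr=0.1)
local_input = torch.randn(5, 8)               # 5 samples, locally derived inputs
high_order = holder_net(local_input)          # high-order embeddings sent to the server

# Server side: classification sub-network on top of the received embeddings.
server_input = high_order.detach().requires_grad_(True)
classifier = torch.nn.Linear(4, 2)
server_opt = torch.optim.SGD(classifier.parameters(), lr=0.1)
labels = torch.randint(0, 2, (5,))
loss = torch.nn.functional.cross_entropy(classifier(server_input), labels)
loss.backward()                               # gradients for the classifier and its input
server_opt.step()                             # update the classification sub-network
input_gradient = server_input.grad            # loss gradient at the input layer

# Back at the holder: continue back-propagation with the received gradient and
# update the local aggregation / embedding parameters.
high_order.backward(input_gradient)
holder_opt.step()
```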
  • Reviewing the above process, the forward processing for obtaining the sample prediction results can be divided into three stages, which use three different processing methods and executing subjects.
  • In the first stage, the process of determining the primary embedding vector of the sample is jointly executed by the N data holders using an MPC scheme.
  • the security of the feature data is ensured through the MPC scheme, and the feature data of each holder is thus comprehensively integrated to obtain the primary embedding vector.
  • In the second stage, the process of determining the high-order embedding vector of the sample is performed separately by each data holder. On the one hand, this ensures the security of the graph structure data; on the other hand, it allows each data holder to perform multi-level aggregation based on the different graph structures they respectively maintain.
  • In the third stage, the process of determining the prediction result and prediction loss of the sample is executed in the server. This is because the processing of the high-order embedding vectors does not involve private data, while the multiple layers of processing in the neural network involve non-linear transformations and require relatively high computational performance. Having a neutral server maintain the classification sub-network therefore improves training and calculation efficiency.
  • According to an embodiment of another aspect, a device for multi-party joint training of a graph neural network is provided. The device is deployed in any first holder of the aforementioned N data holders, and the first holder can be implemented as any device, platform or device cluster with computing and processing capabilities.
  • the graph neural network includes a graph embedding sub-network and a classification sub-network.
  • the server maintains the classification sub-network, and the N data holders each maintain a part of the graph embedding sub-network.
  • The first holder stores the first characteristic part of each sample, and a first graph structure containing each sample as a corresponding node; in addition, the first holder maintains the first network part of the graph embedding sub-network, and the first network part includes an embedding layer and an aggregation layer.
  • Fig. 4 shows a schematic block diagram of a training device deployed in a first holder according to an embodiment.
  • the training device 400 includes the following units.
  • The primary embedding unit 41 is configured to jointly calculate, at the embedding layer, the primary embedding vector of each sample with the other N-1 data holders, using a multi-party secure computing scheme and based at least on the first feature part of each sample.
  • the aggregation unit 42 is configured to perform multi-level aggregation on each sample based on the first graph structure and the primary embedding vector of each sample at the aggregation layer to determine the high-order embedding vector of each sample;
  • Each level of aggregation includes, for each sample corresponding to the node in the first graph structure, determining the current level of embedding vector of the node based at least on the previous level of embedding vector of the neighboring node of the node.
  • the sending unit 43 is configured to send the high-order embedding vectors of the respective samples to the server, so that the server uses the classification sub-network based on the high-order embedding vector pairs sent by the N data holders Each sample is classified and predicted, and the classification prediction result is obtained.
  • the receiving unit 44 is configured to receive a loss gradient from the server, the loss gradient being determined based on at least the classification prediction result of each sample and the sample label.
  • the update unit 45 is configured to update the first network part according to the loss gradient.
  • In one embodiment, the primary embedding unit 41 is configured to: based on the first characteristic part of each sample and the embedding parameters in the embedding layer, jointly calculate the primary embedding vector of each sample with the other N-1 data holders using a multi-party secure computing scheme; accordingly, the update unit 45 is configured to update the embedding parameters.
  • Further, in an embodiment in which the multi-party secure computing scheme adopts a secret sharing scheme, the primary embedding unit 41 is specifically configured to: perform sharing processing on the first feature part of each sample to obtain a first shared feature part, and perform sharing processing on the embedding parameters to obtain a first shared parameter part; send the first shared feature part and the first shared parameter part to the other N-1 data holders, and receive N-1 shared feature parts and N-1 shared parameter parts from the other N-1 data holders; use the first comprehensive parameter composed of the embedding parameters and the N-1 shared parameter parts to process the first comprehensive feature composed of the first feature part and the N-1 shared feature parts, obtaining a first integrated embedding result; send the first integrated embedding result to the other N-1 data holders, and receive the corresponding N-1 integrated embedding results from them; and determine the primary embedding vector of each sample according to the first integrated embedding result and the N-1 integrated embedding results.
  • In one embodiment, the aggregation unit 42 is configured to, for the first node corresponding in the first graph structure to any first sample: determine a neighbor aggregation vector at least according to the upper-level embedding vectors of the neighbor nodes of the first node; and determine the current-level embedding vector of the first node according to the neighbor aggregation vector and the upper-level embedding vector of the first node.
  • Further, in one example, the aggregation unit 42 determining the neighbor aggregation vector specifically includes: performing a pooling operation on the upper-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.
  • the aggregating unit 42 determining the neighbor aggregation vector specifically includes: a weighted summation of the upper-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector, and the weight corresponding to each neighbor node It is determined according to the characteristics of the connecting edge between the neighbor node and the first node.
  • the determination of the neighbor aggregation vector by the aggregation unit 42 specifically includes: based on the upper-level embedding vector of each neighbor node, and the edge embedding vector of each connection edge between each neighbor node and the first node, Determine the neighbor aggregation vector.
  • the update unit 45 is specifically configured to: according to the loss gradient, use a backpropagation algorithm to reversely update the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer layer by layer.
  • According to an embodiment of another aspect, an apparatus for multi-party joint training of a graph neural network is provided, wherein the graph neural network includes a graph embedding sub-network and a classification sub-network, and the multiple parties include a server and N data holders.
  • the server maintains the classification sub-network, each of the N data holders maintains a part of the graph embedding sub-network; each of the N data holders stores partial characteristics of each sample in the sample set , And a graph structure containing the respective samples as corresponding nodes; the device is deployed in the server, and the server can be implemented as any device, platform or device cluster with computing and processing capabilities.
  • Fig. 5 shows a schematic block diagram of a training device deployed in a server according to an embodiment.
  • the training device 500 includes the following units.
  • The vector receiving unit 51 is configured to, for any target sample, respectively receive N high-order embedding vectors for the target sample from the N data holders, where the i-th high-order embedding vector is obtained by the i-th holder of the N data holders by inputting the graph structure stored therein and the characteristic part of the target sample into the graph embedding sub-network part maintained therein.
  • the classification prediction unit 52 is configured to integrate the N high-order embedding vectors in the classification sub-network to obtain the integrated embedding vector of the target sample, and determine the value of the target sample according to the integrated embedding vector Classification prediction results.
  • the loss determining unit 53 is configured to determine the prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label.
  • the updating unit 54 is configured to update the classification sub-network according to the predicted loss, and determine the loss gradient corresponding to the input layer of the classification sub-network.
  • the sending unit 55 is configured to send the loss gradient to the N data holders, so that each holder updates the graph embedded sub-network part therein.
  • the classification prediction unit 52 is specifically configured to: splice the N high-order embedding vectors to obtain the integrated embedding vector; or, averaging the N high-order embedding vectors to obtain the Comprehensive embedding vector.
  • the classification prediction unit 52 is specifically configured to: use N weight vectors to perform bitwise multiplication with the N high-order embedding vectors to obtain N weighted processing vectors; The processing vectors are summed to obtain the integrated embedding vector; accordingly, the update unit 54 is configured to update the N weight vectors.
  • In one embodiment, the device 500 further includes a label receiving unit (not shown), configured to receive the sample labels from a second holder of the N data holders.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method described in conjunction with FIG. 2.
  • According to an embodiment of yet another aspect, there is also provided a computing device including a memory and a processor, where the memory stores executable code, and when the processor executes the executable code, the method described in conjunction with FIG. 2 is implemented.

Abstract

A method and a device for multi-party joint training of a graph neural network. A plurality of parties comprises a plurality of data holders and a server; a graph neural network includes a graph embedding sub-network and a classification sub-network. Each data holder respectively maintains a part of the graph embedding sub-network, and the server maintains the classification sub-network. Any of the data holders, in the graph embedding sub-network maintained thereby, computes jointly with other holders, by means of secure multi-party computation (MPC), a primary embedding vector of a sample, performs multi-level neighbor aggregation on a node according to a local graph structure to obtain a high-order embedding vector of the node, and sends same to the server. The server combines the high-order embedding vectors from the data holders by using the classification sub-network, and performs classification prediction accordingly, to determine loss. A loss gradient is passed back from the classification sub-network in the server to the graph embedding sub-network in the data holders, realizing the joint training of the whole graph neural network. The present invention protects the data privacy of the parties.

Description

Method and device for multi-party joint training of a graph neural network
Technical field
One or more embodiments of this specification relate to the fields of data security and machine learning, and in particular, to methods and devices for multi-party joint training of graph neural networks.
Background technique
The data needed for machine learning often involves multiple fields. For example, in a machine-learning-based user classification analysis scenario, the electronic payment platform owns the user's transaction flow data, the social platform owns the user's friend contact data, and the banking institution owns the user's loan data. Data often exists in the form of islands. Due to industry competition, data security, user privacy, and other concerns, data integration faces great resistance, and it is difficult to bring together data scattered across platforms to train machine learning models. Jointly training machine learning models on multi-party data, under the premise that no data is leaked, has therefore become a major challenge.
The graph neural network is a widely used machine learning model. Compared with traditional neural networks, a graph neural network can capture not only the features of nodes but also the features of the relationships between nodes, and has therefore achieved excellent results in a number of machine learning tasks. However, this also gives graph neural networks a certain complexity. In particular, when faced with data islands, how to integrate multi-party data and safely conduct multi-party joint modeling is a problem to be solved.
Therefore, an improved scheme that can safely and effectively train a graph neural network jointly among multiple parties is desired.
Summary of the invention
One or more embodiments of this specification describe methods and devices for multi-party joint training of graph neural networks, which can safely and efficiently jointly train a graph neural network among multiple parties as a prediction model.
根据第一方面,提供了一种多方联合训练图神经网络的方法,所述图神经网络包括图嵌入子网络和分类子网络,所述多方包括服务器和N个数据持有方,所述服务器维护所述分类子网络,所述N个数据持有方各自维护所述图嵌入子网络的一部分;所述N个数据持有方中任意的第一持有方存储有样本集中各个样本的第一特征部分,以及包含所述各个样本作为对应节点的第一图结构;所述第一持有方维护所述图嵌入子网络的第一网络部分,所述第一网络部分包括嵌入层和聚合层;所述方法通过该第一持有方执行, 包括:在所述嵌入层,至少基于所述各个样本的第一特征部分,利用多方安全计算方案,与其他N-1个数据持有方联合计算得到各个样本的初级嵌入向量;在所述聚合层,基于所述第一图结构,以及所述各个样本的初级嵌入向量,对所述各个样本执行多级聚合,以确定各个样本的高阶嵌入向量;其中每级聚合包括,对于各个样本在所述第一图结构中对应的节点,至少基于该节点的邻居节点的上一级嵌入向量,确定该节点的本级嵌入向量;将所述各个样本的高阶嵌入向量发送至所述服务器,以使得所述服务器利用所述分类子网络,基于所述N个数据持有方发送的高阶嵌入向量对各个样本进行分类预测,得到分类预测结果;从所述服务器接收损失梯度,所述损失梯度至少基于所述各个样本的分类预测结果与样本标签而确定;根据所述损失梯度,更新所述第一网络部分。According to a first aspect, there is provided a method for jointly training a graph neural network by multiple parties. The graph neural network includes a graph embedding sub-network and a classification sub-network. The multiple parties include a server and N data holders. The server maintains In the classification sub-network, each of the N data holders maintains a part of the graph embedding sub-network; any first holder of the N data holders stores the first of each sample in the sample set Characteristic part, and a first graph structure containing the respective samples as corresponding nodes; the first holder maintains the first network part of the graph embedding sub-network, and the first network part includes an embedding layer and an aggregation layer The method is executed by the first holder, and includes: in the embedding layer, based at least on the first characteristic part of each sample, using a multi-party secure computing scheme to combine with other N-1 data holders Calculate the primary embedding vector of each sample; in the aggregation layer, based on the first graph structure and the primary embedding vector of each sample, perform multi-level aggregation on each sample to determine the high-level of each sample Embedding vector; where each level of aggregation includes, for each sample corresponding to the node in the first graph structure, at least based on the previous level of the node’s neighboring node’s embedding vector, determining the node’s current level of embedding vector; The high-order embedding vector of each sample is sent to the server, so that the server uses the classification sub-network to classify and predict each sample based on the high-order embedding vector sent by the N data holders to obtain a classification prediction Result; receiving a loss gradient from the server, the loss gradient being determined based on at least the classification prediction result of each sample and the sample label; updating the first network part according to the loss gradient.
根据一种实施方式,所述多方安全计算方案包括秘密分享方案;相应地,可以通过以下方式得到各个样本的初级嵌入向量:对各个样本的第一特征部分进行分享处理,得到第一分享特征部分;将所述第一分享特征部分发送给其他N-1个数据持有方,并从所述其他N-1个数据持有方分别接收N-1个分享特征部分;对所述第一特征部分和所述N-1个分享特征部分进行综合,得到第一综合特征;将所述第一综合特征发送给其他N-1个数据持有方,并从其他N-1个数据持有方分别接收N-1个综合特征;根据所述第一综合特征,以及所述N-1个综合特征,确定所述各个样本的初级嵌入向量。According to an embodiment, the multi-party secure computing scheme includes a secret sharing scheme; accordingly, the primary embedding vector of each sample can be obtained in the following manner: the first characteristic part of each sample is shared to obtain the first shared characteristic part ; Send the first shared characteristic part to other N-1 data holders, and receive N-1 shared characteristic parts from the other N-1 data holders; for the first characteristic Part and the N-1 shared characteristic parts are integrated to obtain the first comprehensive characteristic; the first comprehensive characteristic is sent to other N-1 data holders, and from the other N-1 data holders N-1 comprehensive features are received respectively; and the primary embedding vector of each sample is determined according to the first comprehensive feature and the N-1 comprehensive features.
在一个实施例中,嵌入层具有嵌入参数,在嵌入层得到各个样本的初级嵌入向量包括,基于各个样本的第一特征部分,以及所述嵌入层中的嵌入参数,利用多方安全计算方案,与其他N-1个数据持有方联合计算得到各个样本的初级嵌入向量。In one embodiment, the embedding layer has embedding parameters, and obtaining the primary embedding vector of each sample at the embedding layer includes, based on the first feature part of each sample, and the embedding parameters in the embedding layer, using a multi-party secure computing scheme, and The other N-1 data holders jointly calculate the primary embedding vector of each sample.
在这样的情况下,更新第一网络部分包括,更新所述嵌入参数。In this case, updating the first network part includes updating the embedded parameters.
在一个进一步的实施例中,嵌入层采用秘密分享方案,与其他N-1个数据持有方联合计算得到各个样本的初级嵌入向量,这具体包括:对各个样本的第一特征部分进行分享处理,得到第一分享特征部分;并对所述嵌入参数进行分享处理,得到第一分享参数部分;将所述第一分享特征部分和第一分享参数部分发送给其他N-1个数据持有方,并从其他N-1个数据持有方分别接收N-1个分享特征部分以及N-1个分享参数部分;利用所述嵌入参数和所述N-1个分享参数部分构成的第一综合参数,处理由所述第一特征部分和所述N-1个分享特征部分构成的第一综合特征,得到第一综合嵌入结果;将所述第一综合嵌入结果发送给所述其他N-1个数据持有方,并从所述其他N-1个数据持有方接收对应的N-1个综合嵌入结果;根据所述第一综合嵌入结果和所述N-1个综合嵌入结果,确定所述各个样本的初级嵌入向量。In a further embodiment, the embedding layer adopts a secret sharing scheme to jointly calculate the primary embedding vector of each sample with other N-1 data holders, which specifically includes: sharing the first feature part of each sample , Obtain the first shared characteristic part; perform sharing processing on the embedded parameter to obtain the first shared parameter part; send the first shared characteristic part and the first shared parameter part to other N-1 data holders , And respectively receive N-1 shared characteristic parts and N-1 shared parameter parts from other N-1 data holders; the first synthesis composed of the embedded parameters and the N-1 shared parameter parts Parameters, processing the first integrated feature composed of the first feature part and the N-1 shared feature parts to obtain a first integrated embedding result; sending the first integrated embedding result to the other N-1 Data holders, and receive corresponding N-1 integrated embedding results from the other N-1 data holders; according to the first integrated embedding result and the N-1 integrated embedding results, determine The primary embedding vector of each sample.
In one embodiment, each level of aggregation in the aggregation layer includes, for the first node corresponding in the first graph structure to an arbitrary first sample among the samples: determining a neighbor aggregation vector at least according to the upper-level embedding vectors of the neighbor nodes of the first node; and determining the current-level embedding vector of the first node according to the neighbor aggregation vector and the upper-level embedding vector of the first node.

Further, in one example, a pooling operation is performed on the upper-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.

In another example, the upper-level embedding vectors of the neighbor nodes of the first node are weighted and summed to obtain the neighbor aggregation vector, where the weight corresponding to each neighbor node is determined according to the features of the connecting edge between that neighbor node and the first node.

In yet another example, the neighbor aggregation vector is determined based on the upper-level embedding vectors of the neighbor nodes and the edge embedding vectors of the connecting edges between the neighbor nodes and the first node.

According to one embodiment, the process of updating the first network part includes: according to the loss gradient, using a back-propagation algorithm to update, layer by layer in the reverse direction, the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer.
According to a second aspect, a method for jointly training a graph neural network by multiple parties is provided. The graph neural network includes a graph embedding sub-network and a classification sub-network; the multiple parties include a server and N data holders; the server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network. Each of the N data holders stores part of the features of each sample in a sample set, and a graph structure containing the samples as corresponding nodes. The method is executed by the server and includes: for an arbitrary target sample in the sample set, receiving from the N data holders N high-order embedding vectors for the target sample, where the i-th high-order embedding vector is obtained by the i-th holder among the N data holders by inputting the graph structure stored therein and its feature part of the target sample into the graph embedding sub-network part maintained therein; in the classification sub-network, combining the N high-order embedding vectors to obtain a combined embedding vector of the target sample, and determining a classification prediction result of the target sample according to the combined embedding vector; determining a prediction loss at least based on the classification prediction result of the target sample and the corresponding sample label; updating the classification sub-network according to the prediction loss, and determining the loss gradient corresponding to the input layer of the classification sub-network; and sending the loss gradient to the N data holders, so that each holder updates its graph embedding sub-network part.

In different embodiments, the N high-order embedding vectors can be combined to obtain the combined embedding vector of the target sample in the following manners: concatenating the N high-order embedding vectors to obtain the combined embedding vector; or averaging the N high-order embedding vectors to obtain the combined embedding vector.

In one embodiment, combining the N high-order embedding vectors to obtain the combined embedding vector of the target sample includes: multiplying the N high-order embedding vectors element-wise by N weight vectors respectively to obtain N weighted vectors, and summing the N weighted vectors to obtain the combined embedding vector; where updating the classification sub-network includes updating the N weight vectors.

According to one embodiment, before determining the prediction loss, the method further includes: receiving the sample label from a second holder among the N data holders.
According to a third aspect, an apparatus for jointly training a graph neural network by multiple parties is provided. The graph neural network includes a graph embedding sub-network and a classification sub-network; the multiple parties include a server and N data holders; the server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network. An arbitrary first holder among the N data holders stores a first feature part of each sample in a sample set, and a first graph structure containing the samples as corresponding nodes; the first holder maintains a first network part of the graph embedding sub-network, the first network part including an embedding layer and an aggregation layer. The apparatus is deployed in the first holder and includes: a primary embedding unit configured to, at the embedding layer, jointly compute the primary embedding vector of each sample with the other N-1 data holders by means of a multi-party secure computation scheme, at least based on the first feature part of each sample; an aggregation unit configured to, at the aggregation layer, perform multi-level aggregation on the samples based on the first graph structure and the primary embedding vectors of the samples to determine a high-order embedding vector of each sample, where each level of aggregation includes, for the node corresponding to each sample in the first graph structure, determining the current-level embedding vector of the node at least based on the upper-level embedding vectors of its neighbor nodes; a sending unit configured to send the high-order embedding vectors of the samples to the server, so that the server performs classification prediction on the samples using the classification sub-network based on the high-order embedding vectors sent by the N data holders, obtaining classification prediction results; a receiving unit configured to receive a loss gradient from the server, the loss gradient being determined at least based on the classification prediction results of the samples and the sample labels; and an update unit configured to update the first network part according to the loss gradient.

According to a fourth aspect, an apparatus for jointly training a graph neural network by multiple parties is provided. The graph neural network includes a graph embedding sub-network and a classification sub-network; the multiple parties include a server and N data holders; the server maintains the classification sub-network, and each of the N data holders maintains a part of the graph embedding sub-network. Each of the N data holders stores part of the features of each sample in a sample set, and a graph structure containing the samples as corresponding nodes. The apparatus is deployed in the server and includes: a vector receiving unit configured to, for an arbitrary target sample in the sample set, receive from the N data holders N high-order embedding vectors for the target sample, where the i-th high-order embedding vector is obtained by the i-th holder among the N data holders by inputting the graph structure stored therein and its feature part of the target sample into the graph embedding sub-network part maintained therein; a classification prediction unit configured to, in the classification sub-network, combine the N high-order embedding vectors to obtain a combined embedding vector of the target sample, and determine a classification prediction result of the target sample according to the combined embedding vector; a loss determination unit configured to determine a prediction loss at least based on the classification prediction result of the target sample and the corresponding sample label; an update unit configured to update the classification sub-network according to the prediction loss and determine the loss gradient corresponding to the input layer of the classification sub-network; and a sending unit configured to send the loss gradient to the N data holders, so that each holder updates its graph embedding sub-network part.
According to a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method of the first aspect or the second aspect.

According to a sixth aspect, a computing device is provided, including a memory and a processor; the memory stores executable code, and when the processor executes the executable code, the method of the first aspect or the second aspect is implemented.

According to the method and apparatus provided by the embodiments of this specification, multiple data holders and a server jointly train a graph neural network, where each data holder stores a part of the features of the samples and a graph structure with the samples as nodes. The graph neural network is divided into a graph embedding sub-network and a classification sub-network; each data holder maintains a part of the graph embedding sub-network, and the server maintains the classification sub-network. In the graph embedding sub-network part it maintains, any data holder jointly computes the primary embedding vectors of the samples with the other holders through a multi-party secure computation scheme; on this basis, it performs multi-level neighbor aggregation on the nodes according to its local graph structure, obtains the high-order embedding vectors of the nodes, and sends them to the server. The server uses the classification sub-network to combine the high-order sample embedding vectors from the data holders, performs classification prediction on the samples accordingly, and determines the loss. Finally, the loss gradient is passed from the classification sub-network in the server back to the graph embedding sub-networks in the data holders, realizing joint training of the entire graph neural network. Throughout the process, the privacy and security of the sample feature data and the graph structure data are guaranteed, and the computation and training efficiency of the entire network is also improved.
Description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification;

Figure 2 shows the process of a method for jointly training a graph neural network by multiple parties according to an embodiment;

Figure 3 shows the flow of a method in which the first holder determines the primary embedding vectors of the samples by secret sharing;

Figure 4 shows a schematic block diagram of a training apparatus deployed in the first holder according to an embodiment;

Figure 5 shows a schematic block diagram of a training apparatus deployed in the server according to an embodiment.
Detailed description

The solutions provided in this specification are described below with reference to the accompanying drawings.
Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. In Figure 1, for clarity and simplicity, two data holders are shown, namely holder A and holder B. Holders A and B each store a part of the features of the samples, and a graph structure recording the association relationships between the samples. In a specific example, the samples may be users. Correspondingly, holder A may be, for example, an electronic payment platform (such as Alipay), which stores a part of the users' features (for example, payment-related features). This part of the features is shown in Figure 1 as features f1 to f4. In addition, holder A also stores a graph structure A constructed, for example, from payment relationships. More specifically, users who have a payment or transfer relationship on the electronic payment platform can be connected by connecting edges, thereby forming graph structure A. On the other hand, holder B may be, for example, a social platform (such as DingTalk), which stores another part of the users' features (for example, social-related features). This part of the features is shown in Figure 1 as features f5, f6 and f7. In addition, holder B also stores a graph structure B constructed, for example, from social relationships. More specifically, on the social platform, users who are friends or have communication records can be connected by connecting edges, thereby forming graph structure B.

As can be seen from Figure 1, holder A and holder B may store different feature parts of the same samples; moreover, because holder A and holder B build their graph structures from different association relationships (for example, holder A from payment relationships and holder B from social friend relationships), holders A and B each store a different graph structure.

To improve prediction accuracy, it is desirable to perform machine learning based on richer sample information. However, data holders A and B each keep only part of the features of the samples and a graph structure built from a particular association relationship. It is therefore desirable to combine the feature data and graph data stored by data holders A and B and jointly train a graph neural network as a prediction model. At the same time, it is desirable that during joint training the original data of data holders A and B not be leaked, so as to guarantee privacy and security.
To this end, in one embodiment of this specification, a neutral server is introduced in addition to the data holders, and the data holders and the server jointly train the graph neural network. To balance data privacy and computational efficiency, the graph neural network to be trained is divided into two parts: a graph embedding sub-network and a classification sub-network.

The graph embedding sub-network is used to generate, from the sample features and the graph structure, the high-order embedding vector of the node corresponding to each sample. The computation of the graph embedding sub-network involves the original sample features and graph structure data, i.e., computation related to private data, and can therefore be performed locally at the data holders. Specifically, each data holder may maintain a part of the graph embedding sub-network and, based on its locally stored feature data and graph structure data, use the locally maintained graph embedding sub-network part to compute the high-order embedding vectors of the nodes.

More specifically, the graph embedding sub-network in each data holder can be divided into an embedding layer and an aggregation layer. At the embedding layer, the data holders use a multi-party secure computation scheme to combine the sample feature parts they store, obtaining the primary embedding vector of the node corresponding to each sample. At the aggregation layer, a data holder performs multi-level neighbor aggregation on the nodes based on the primary embedding vectors of the nodes and its locally stored graph structure, thereby obtaining the high-order embedding vectors of the nodes.

The classification sub-network is used to combine the high-order embedding vectors of the nodes produced by the graph embedding sub-network and to perform classification prediction on the nodes according to the combined result. The computation of the classification sub-network does not involve the original sample features or the graph structure data, i.e., it is not privacy-related, and can therefore be performed in the server to improve computation and training efficiency.

During training, the server can determine the prediction loss according to the classification prediction results of the classification sub-network and the sample labels (shown as y1, y2 and y3 in Figure 1), and update the classification sub-network through back propagation until the loss gradient at the input layer of the classification sub-network is determined. The server then sends the loss gradient to the data holders, so that each data holder can continue to update its graph embedding sub-network according to this loss gradient. In this way, the update and training of the entire graph neural network is realized.
The specific process of jointly training the graph neural network by multiple parties is described in detail below.

It should be understood that, although only two data providers are shown in Figure 1, the above idea and architecture can be applied to scenarios with more data providers. Without loss of generality, the following description assumes that there are N data holders, where N is generally greater than or equal to 2. The N data holders each store a part of the features of each sample in a sample set, and a graph structure containing the samples as nodes. In such a scenario, it is desired that the N data holders and the server jointly train a graph neural network model.

To balance the security and efficiency of joint training, as described above, the graph neural network is divided into a graph embedding sub-network and a classification sub-network. In this case, the N data holders each maintain a part of the graph embedding sub-network, and the server maintains the classification sub-network. For simplicity and clarity of description, the execution steps of the joint training are described below with respect to an arbitrary one of the N data holders, referred to as the first holder.
Figure 2 shows the process of a method for jointly training a graph neural network by multiple parties according to an embodiment. As can be seen, Figure 2 shows the respective processing of an arbitrary first holder and of the server during the joint training, as well as the interaction between the two. Both the first holder and the server can be implemented by any apparatus, device, platform or device cluster with computing and processing capabilities.

It can be understood that, as an arbitrary one of the N data holders, the first holder stores part of the features of each sample in the sample set, referred to here as the first feature part. In addition, the graph structure stored in the first holder is referred to as the first graph structure, and the graph embedding sub-network part maintained therein is referred to as the first network part. Further, the first network part includes an embedding layer and an aggregation layer.

Based on the above scenario, the joint training process includes the following steps.
First, in step 201, the first holder (which may be denoted holder i) uses the embedding layer in the first network part it maintains to jointly compute, with the other N-1 data holders, the primary embedding vector of each sample, at least based on the first feature part of each sample. The joint computation in this step involves the original sample features stored by the data holders, which are private data; for data security, the joint computation therefore needs to be carried out with a multi-party secure computation (MPC) scheme.

There are a variety of existing MPC schemes, including, for example, homomorphic encryption, garbled circuits, secret sharing, and so on. In step 201, various applicable MPC schemes can be adopted, in combination with the specific algorithm of the embedding layer, to jointly compute the primary embedding vectors of the samples.

In one embodiment, the processing of the sample features by the embedding layer mainly involves encoding the original feature data into a representation (for example, encoding it as a vector), and does not involve parameterized computation on the feature data. In this case, various MPC schemes can be used to combine the sample feature parts encoded by the N holders to obtain the primary embedding vector of the sample.

In another embodiment, after encoding the feature data of the sample into a representation, the embedding layer further performs computation involving parameters, for example, applying a linear transformation to the features with a parameter vector, or further applying a non-linear function (such as a sigmoid function), and so on. In other words, the embedding layer of holder i contains the embedding parameters θ_i required for the feature computation.

In this case, in one example, homomorphic encryption can be used to combine the sample feature parts encoded by the N holders and the embedding parameter parts they maintain, to obtain the primary embedding vector of the sample. Specifically, homomorphic encryption can be used both to combine the sample features and to combine the embedding parameters maintained by the holders; the combined embedding parameters are then used to process the combined features, thereby obtaining the primary embedding vector of the sample.

In another example, secret sharing is used to obtain the primary embedding vector of the sample based on the feature parts and the embedding parameters. Figure 3 shows the flow of a method in which the first holder determines the primary embedding vectors of the samples by secret sharing.
Specifically, for a given target sample, in step 301 the first holder i performs sharing processing on the first feature part x_i of the sample to obtain a first shared feature part x′_i. The sharing processing can be implemented with an algorithm from secret sharing, by adding a random number generated in a certain manner to the original data. For example, the first shared feature part can be obtained as:

x′_i = x_i + r_i      (1)

where r_i is the sharing random number for the sample features.

Similarly, the first holder i also performs sharing processing on its embedding parameters θ_i to obtain a first shared parameter part θ′_i. Specifically, the sharing processing can be performed according to:

θ′_i = θ_i + s_i      (2)

where s_i is the sharing random number for the embedding parameters.
Then, in step 302, the first holder i sends the first shared feature part x′_i and the first shared parameter part θ′_i to the other N-1 data holders. Similarly, each of the other N-1 data holders computes its corresponding shared feature part x′_j (j ≠ i) and shared parameter part θ′_j and sends them out. Thus, the first holder i receives N-1 shared feature parts x′_j and N-1 shared parameter parts θ′_j from the other N-1 data holders.
Next, in step 303, the first holder i obtains a first combined feature X_i based on its first feature part x_i and the N-1 shared feature parts x′_j. Specifically, the first combined feature can be obtained according to:

X_i = x_i + ∑_j x′_j      (3)

In addition, the first holder i similarly obtains a first combined parameter W_i based on its own embedding parameters θ_i and the N-1 shared parameter parts θ′_j:

W_i = θ_i + ∑_j θ′_j      (4)

Then, the first combined feature X_i is processed with the first combined parameter W_i to obtain a first combined embedding result H_i.

In step 304, the first holder i sends the first combined embedding result H_i to the other N-1 data holders. The other holders similarly obtain their combined embedding results H_j. Thus, the first holder i receives the corresponding N-1 combined embedding results H_j from the other N-1 data holders.
Finally, in step 305, the first holder i determines the primary embedding vector H of each sample according to the first combined embedding result H_i and the N-1 combined embedding results H_j, for example:
H = ∑_{j=1}^{N} H_j      (5)
In this way, by means of secret sharing, the first holder i and the other data holders jointly compute the same primary embedding vector H for the target sample.
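As a concrete illustration of steps 301 to 305, the following Python sketch simulates all N holders in a single process. The function and variable names are hypothetical, and the additive masking follows formulas (1) to (5) only in the simplified form described above; it is not a production secret-sharing protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_share(value):
    """Sharing processing: mask the local value with a random number (formulas (1)/(2))."""
    return value + rng.normal(size=value.shape)

# Hypothetical local data of N = 3 holders: feature parts x_i and embedding parameters theta_i.
N, feat_dim, emb_dim = 3, 4, 8
x = [rng.normal(size=feat_dim) for _ in range(N)]
theta = [rng.normal(size=(emb_dim, feat_dim)) for _ in range(N)]

# Steps 301-302: every holder shares its masked feature part and parameter part with the others.
x_shared = [make_share(x_i) for x_i in x]
theta_shared = [make_share(t_i) for t_i in theta]

# Step 303: holder i combines its own plaintext part with the shares received from the others
# (formulas (3) and (4)), then computes its combined embedding result H_i = W_i @ X_i.
H = []
for i in range(N):
    X_i = x[i] + sum(x_shared[j] for j in range(N) if j != i)
    W_i = theta[i] + sum(theta_shared[j] for j in range(N) if j != i)
    H.append(W_i @ X_i)

# Steps 304-305: the H_i are exchanged and every holder derives the same primary embedding vector.
H_primary = sum(H)          # one possible combination, in the spirit of formula (5)
print(H_primary.shape)      # (emb_dim,)
```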
Returning to Figure 2: on the basis that the embedding layer of the first holder i has obtained the primary embedding vector of each sample using the MPC scheme, next, in step 202, at the aggregation layer, multi-level neighbor aggregation is performed on the samples based on the first graph structure stored therein and the primary embedding vectors of the samples, so as to determine the high-order embedding vector of each sample. Specifically, each sample is mapped to its node in the first graph structure, and multi-level neighbor aggregation is performed on the nodes based on the connection information between nodes in the first graph structure, where each level of aggregation includes, for each node, determining the current-level embedding vector of the node at least based on the upper-level embedding vectors of its neighbor nodes.

Specifically, for the first node v corresponding in the first graph structure to an arbitrary first sample among the samples, the k-th level of aggregation for this first node may include the following.
Using an aggregation function AGG_k, a neighbor aggregation vector h_{N(v)}^k is determined at least according to the upper-level (i.e., level k-1) embedding vectors h_u^{k-1} of the neighbor nodes u of the first node v, where N(v) denotes the set of neighbor nodes of node v, namely:

h_{N(v)}^k = AGG_k( { h_u^{k-1}, u ∈ N(v) } )      (6)
Then, the current-level (level k) embedding vector h_v^k of the first node v is determined according to the neighbor aggregation vector h_{N(v)}^k and the upper-level (level k-1) embedding vector h_v^{k-1} of the first node v, namely:

h_v^k = W_k · f( h_{N(v)}^k, h_v^{k-1} )      (7)
where f denotes a combining function applied to the neighbor aggregation vector h_{N(v)}^k and the upper-level vector h_v^{k-1} of node v, and W_k is the parameter of the k-th level of aggregation. In different embodiments, the combining operation in the function f may include concatenating h_{N(v)}^k with h_v^{k-1}, or summing them, or averaging them, and so on.
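As an illustration of formulas (6) and (7), the sketch below implements one level of aggregation for a single node, assuming mean aggregation for AGG_k and concatenation for f. The function names and the choice of aggregator are illustrative assumptions, not the only form covered by this embodiment.

```python
import numpy as np

def aggregate_one_level(h_prev, neighbors, v, W_k):
    """One level of aggregation for node v, following formulas (6) and (7).

    h_prev    : dict mapping node id -> (k-1)-level embedding vector
    neighbors : dict mapping node id -> list of neighbor node ids (the graph structure)
    W_k       : parameter matrix of the k-th aggregation level
    """
    # Formula (6): neighbor aggregation vector, here with a mean aggregator as AGG_k.
    neigh_vecs = np.stack([h_prev[u] for u in neighbors[v]])
    h_neigh = neigh_vecs.mean(axis=0)

    # Formula (7): combine with the node's own upper-level vector; here f is concatenation.
    combined = np.concatenate([h_neigh, h_prev[v]])
    return W_k @ combined

# Hypothetical toy graph and level-(k-1) embeddings.
rng = np.random.default_rng(1)
h_prev = {n: rng.normal(size=8) for n in range(4)}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
W_k = rng.normal(size=(8, 16))            # maps the 16-dim concatenation back to 8 dims
h_v_k = aggregate_one_level(h_prev, neighbors, 0, W_k)
print(h_v_k.shape)                         # (8,)
```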
In different embodiments, the above aggregation function AGG_k can take different forms and use different algorithms.
In one embodiment, the aggregation function AGG_k includes a pooling operation. Correspondingly, determining the neighbor aggregation vector h_{N(v)}^k from the upper-level embedding vectors h_u^{k-1} of the neighbor nodes u in formula (6) means performing a pooling operation on the upper-level embedding vectors h_u^{k-1} of the neighbor nodes u of the first node v to obtain the neighbor aggregation vector h_{N(v)}^k. More specifically, the pooling operation may include max pooling, average pooling, and so on.
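For this pooling embodiment, a minimal sketch of AGG_k as element-wise max pooling over the neighbors' upper-level vectors might look as follows; it is an assumed illustration that reuses the toy data structures of the previous sketch.

```python
import numpy as np

def agg_max_pool(h_prev, neighbor_ids):
    """AGG_k as element-wise max pooling over the neighbors' (k-1)-level embeddings."""
    return np.stack([h_prev[u] for u in neighbor_ids]).max(axis=0)
```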
In another embodiment, the aggregation function AGG_k may consist in feeding the upper-level embedding vectors h_u^{k-1} of the neighbor nodes u into an LSTM neural network in sequence, and taking the hidden vector thus obtained as the neighbor aggregation vector h_{N(v)}^k.
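For the LSTM-based embodiment, a minimal sketch might feed the neighbors' upper-level vectors through torch.nn.LSTM and use the final hidden state as the neighbor aggregation vector; treating the neighbor order as given is an assumption of this illustration.

```python
import torch
import torch.nn as nn

class LSTMAggregator(nn.Module):
    """AGG_k that runs the neighbors' (k-1)-level vectors through an LSTM."""
    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(input_size=dim, hidden_size=dim, batch_first=True)

    def forward(self, neighbor_vecs):                        # (num_neighbors, dim)
        _, (h_n, _) = self.lstm(neighbor_vecs.unsqueeze(0))  # neighbors treated as one sequence
        return h_n.squeeze(0).squeeze(0)                     # final hidden state as h_{N(v)}^k
```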
In yet another embodiment, the aggregation function AGG_k includes a weighted summation operation. Correspondingly, formula (6) becomes:
h_{N(v)}^k = ∑_{u ∈ N(v)} α_uv · h_u^{k-1}      (8)
That is, the upper-level embedding vectors h_u^{k-1} of the neighbor nodes u of the first node v are weighted and summed to obtain the neighbor aggregation vector h_{N(v)}^k, where α_uv is a weighting factor.
In one example, the weighting factor α_uv is determined according to the features of the connecting edge e_uv between the neighbor node u and the first node v. For example, when the first graph structure is built from transfer relationships, the features of the connecting edge e_uv between two nodes u and v may include the total transfer amount between the two users corresponding to the two nodes. When the first graph structure is built from social relationships, the features of the connecting edge e_uv between two nodes u and v may include the interaction frequency of the two users corresponding to the two nodes. In this way, the weighting factor of the neighbor node u can be determined based on the features of the connecting edge e_uv, and the neighbor aggregation vector h_{N(v)}^k can be obtained through the aggregation function of formula (8).
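A possible sketch of this edge-weighted variant is given below, assuming that the weighting factor α_uv is obtained by normalizing a scalar edge feature (for example, a transfer amount) over the neighborhood; the normalization choice is an assumption made for illustration only.

```python
import numpy as np

def agg_edge_weighted(h_prev, neighbor_ids, edge_feat):
    """AGG_k as a weighted sum (formula (8)), with alpha_uv derived from edge features.

    edge_feat : dict mapping neighbor id u -> scalar feature of edge e_uv (e.g., transfer amount)
    """
    feats = np.array([edge_feat[u] for u in neighbor_ids], dtype=float)
    alpha = feats / feats.sum()                          # normalize edge features into weights
    neigh = np.stack([h_prev[u] for u in neighbor_ids])
    return (alpha[:, None] * neigh).sum(axis=0)          # sum_u alpha_uv * h_u^{k-1}
```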
In yet another embodiment, for the first graph structure, an edge embedding vector is determined for each connecting edge according to the edge features of the connecting edges between nodes. Correspondingly, aggregation over the edge embedding vectors is also introduced into the aggregation function AGG_k. Specifically, the neighbor aggregation vector h_{N(v)}^k is determined based on the upper-level embedding vectors h_u^{k-1} of the neighbor nodes u and the edge embedding vectors of the connecting edges e_uv between the neighbor nodes u and the first node v. More specifically, in one example, formula (6) with such an aggregation function AGG_k can be embodied as:

h_{N(v)}^k = ∑_{u ∈ N(v)} q_uv ⊙ h_u^{k-1}      (9)
where q_uv is the edge embedding vector of the connecting edge e_uv between the first node v and its neighbor node u.
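One possible concrete form of this edge-embedding variant is sketched below, assuming that each edge embedding q_uv has the same dimension as the node vectors and modulates the neighbor's vector element-wise before summation; this specific choice is an assumption made for illustration.

```python
import numpy as np

def agg_with_edge_embeddings(h_prev, neighbor_ids, edge_emb):
    """Neighbor aggregation that also uses edge embedding vectors q_uv (cf. formula (9)).

    edge_emb : dict mapping neighbor id u -> edge embedding vector q_uv (same dimension as h_u)
    """
    return sum(edge_emb[u] * h_prev[u] for u in neighbor_ids)  # element-wise product, summed over neighbors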
Above, through aggregation functions AGG_k of various forms and algorithms, the neighbor aggregation vector h_{N(v)}^k is determined based on the upper-level embedding vectors of the neighbor nodes. Then, according to formula (7), the current-level embedding vector h_v^k of the first node v is obtained.
It can be understood that the primary embedding vector of a sample determined in step 201 can serve as its level-0 embedding vector. On this basis, performing aggregation level by level for k from 1 up to a preset number of aggregation levels K yields the level-K high-order embedding vector h_v^K of node v, where the number of aggregation levels K is a preset hyperparameter corresponding to the order of the neighbor nodes considered in the aggregation.
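Putting the pieces together, the following sketch runs K levels of aggregation over a toy graph, starting from the primary embeddings as level-0 vectors; the parameter shapes and the choice of mean aggregation and concatenation follow the illustrative assumptions of the earlier sketches.

```python
import numpy as np

def graph_embedding(h0, neighbors, W, K):
    """Multi-level aggregation (step 202): h0 are the primary embeddings, W[k] the level-k parameters."""
    h = dict(h0)                                   # level-0 embedding vectors
    for k in range(1, K + 1):
        new_h = {}
        for v in h:
            neigh = np.stack([h[u] for u in neighbors[v]]).mean(axis=0)  # formula (6)
            new_h[v] = W[k] @ np.concatenate([neigh, h[v]])              # formula (7)
        h = new_h
    return h                                       # level-K high-order embedding vectors

rng = np.random.default_rng(2)
d, K = 8, 2
h0 = {n: rng.normal(size=d) for n in range(4)}
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
W = {k: rng.normal(size=(d, 2 * d)) for k in range(1, K + 1)}
h_high = graph_embedding(h0, neighbors, W, K)
print(h_high[0].shape)                             # (8,)
```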
Thus, according to step 202, the first holder i obtains, at the aggregation layer, the high-order embedding vector of each sample based on the first graph structure stored therein and the primary embedding vectors of the samples.

Next, in step 203, the first holder i sends the high-order embedding vectors of the samples to the server.

It can be understood that the first holder i is an arbitrary one of the N data holders. Each other data holder j performs operations similar to those of the first holder i: based on the j-th graph structure stored therein and the primary embedding vectors of the samples, it correspondingly obtains high-order embedding vectors of the samples and sends them to the server.
The server can thus receive, from each of the N data holders, the high-order embedding vectors of the samples processed by that holder. For clarity, the following description is given for an arbitrary target sample v. For this target sample v, the server receives from the N data holders N high-order embedding vectors z_v^1, …, z_v^N, where z_v^i denotes the high-order embedding vector obtained by the i-th holder for sample v.
Then, in step 204, the server uses the classification sub-network it maintains to combine the N high-order embedding vectors of the target sample v into a combined embedding vector of the target sample, and determines the classification prediction result of the target sample according to the combined embedding vector.

Specifically, the classification sub-network may include a combining layer for combining the N high-order embedding vectors of the target sample v. The combining layer can adopt a variety of different combining methods.
In one embodiment, in the combining layer, the N high-order embedding vectors z_v^1, …, z_v^N of sample v are concatenated to obtain the combined embedding vector z_v.
In another embodiment, in the combining layer, the N high-order embedding vectors are averaged to obtain the combined embedding vector z_v.
In yet another embodiment, in the combining layer, the N high-order embedding vectors are weighted and summed to obtain the combined embedding vector z_v, namely:
z_v = ∑_{i=1}^{N} β_i · z_v^i      (10)
where β_i is the weighting factor corresponding to the i-th data holder. The weighting factor β_i can be a preset hyperparameter, or it can be determined through training.
In a further embodiment, in the combining layer, the combined embedding vector z_v is obtained in the following manner:
z_v = ∑_{i=1}^{N} ω_i ⊙ z_v^i      (11)
where ω_i is the weight vector corresponding to the i-th data holder, having the same dimension as the high-order embedding vectors, and ⊙ denotes element-wise multiplication. That is, in formula (11), N weight vectors are multiplied element-wise with the N high-order embedding vectors respectively to obtain N weighted vectors, and the N weighted vectors are summed to obtain the combined embedding vector z_v. It should be understood that these N weight vectors are determined through network training.
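A minimal PyTorch-style sketch of this combining layer, assuming formula (11) with trainable weight vectors ω_i; the module name and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CombiningLayer(nn.Module):
    """Combines N high-order embedding vectors per formula (11): z_v = sum_i omega_i ⊙ z_v^i."""
    def __init__(self, n_holders, dim):
        super().__init__()
        # One trainable weight vector per data holder, updated with the classification sub-network.
        self.omega = nn.Parameter(torch.ones(n_holders, dim))

    def forward(self, z):                      # z: tensor of shape (n_holders, dim)
        return (self.omega * z).sum(dim=0)     # element-wise product, then sum over holders

layer = CombiningLayer(n_holders=3, dim=8)
z = torch.randn(3, 8)                          # the N = 3 high-order vectors received for one sample
z_combined = layer(z)
print(z_combined.shape)                        # torch.Size([8])
```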
After the combined embedding vector z_v is obtained, the classification sub-network can determine the classification prediction result of the target sample based on z_v. For example, in the classification sub-network, the combined embedding vector z_v may be processed further and then input into a classification layer for classification; alternatively, the combined embedding vector z_v may be input into the classification layer directly. Through the classification layer, the classification prediction result of the target sample is obtained.
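A self-contained sketch of how the whole classification sub-network might be assembled, folding the combining step of formula (11) and a small prediction head into one module; the two-layer head and the dimensions are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ClassificationSubNetwork(nn.Module):
    """Server-side sub-network: combine the N high-order vectors, then predict a class."""
    def __init__(self, n_holders, dim, n_classes):
        super().__init__()
        self.omega = nn.Parameter(torch.ones(n_holders, dim))   # trainable weight vectors (formula (11))
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, n_classes))

    def forward(self, z):                          # z: (n_holders, dim) for one target sample
        z_combined = (self.omega * z).sum(dim=0)   # combining layer
        return self.head(z_combined)               # unnormalized class scores (logits)
```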
Next, in step 205, the server determines a prediction loss at least based on the classification prediction result of the target sample and the corresponding sample label.

Generally, the sample labels (for example, when users are being classified, the identifiers of the user groups into which they fall) come from the data holders. In one example, one of the N data holders, referred to for example as the second holder, owns the sample labels of all training samples. In this case, before step 205, the server receives the sample labels from this second holder in advance. In another example, the sample labels are distributed among different data holders. In this case, before step 205, the server collects the sample labels from the data holders in advance.

Having obtained the sample labels, in step 205 the server can determine the prediction loss according to the definition of any of various loss functions, at least based on a comparison between the classification prediction result of the target sample and the label value of its sample label.
Then, in step 206, the server updates its classification sub-network according to the prediction loss obtained above, and determines the loss gradient corresponding to the input layer of the classification sub-network. Specifically, loss back-propagation can be used: starting from the output layer of the classification sub-network, the loss gradient is determined layer by layer, the network parameters of each layer are adjusted based on the loss gradient, and the loss gradient is passed to the layer above, until the loss gradient corresponding to the input layer is determined.
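For steps 205 and 206, the following PyTorch-style sketch shows how a server might compute the loss, update a stand-in classification sub-network, and read off the gradient at its input layer (the gradient with respect to the received high-order embedding vectors). The optimizer choice, cross-entropy loss, and the simple linear model are assumptions for illustration.

```python
import torch
import torch.nn as nn

# A stand-in classification sub-network: flatten the 3 received vectors and map to 2 classes.
model = nn.Sequential(nn.Flatten(start_dim=0), nn.Linear(3 * 8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

z = torch.randn(3, 8, requires_grad=True)     # high-order vectors received from the 3 holders
label = torch.tensor([1])                     # sample label collected from a data holder

logits = model(z).unsqueeze(0)                # shape (1, 2)
loss = nn.functional.cross_entropy(logits, label)   # step 205: prediction loss

optimizer.zero_grad()
loss.backward()                               # step 206: back-propagation through the classification sub-network
optimizer.step()                              # update the classification sub-network parameters

input_grad = z.grad                           # loss gradient at the input layer, shape (3, 8)
# Step 207: input_grad[i] would be sent back to the i-th data holder.
```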
Then, in step 207, the server sends this loss gradient to the N data holders. Correspondingly, the first holder i among the N data holders receives the loss gradient.
Next, in step 208, the first holder i updates its graph embedding sub-network part, that is, the aforementioned first network part, according to the received loss gradient.

Specifically, the first holder i continues the back-propagation of the loss according to this loss gradient, so as to update its network parameters. Back-propagation proceeds first through the aggregation layer, so the aggregation parameters of the aggregation layer are updated layer by layer in the reverse direction. Where the embedding layer involves embedding parameters that need to be trained (for example, as in the aforementioned formula (4)), back-propagation continues and the embedding parameters in the embedding layer are further updated. In this way, the graph embedding sub-network part of the first holder i is updated.
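On the holder side, if the first network part is also expressed as a differentiable module, the received gradient can be injected with backward(gradient=...); the sketch below is an assumed illustration of step 208, not the only way to propagate the gradient.

```python
import torch

# Suppose high_order is the holder's locally computed high-order embedding tensor, produced by
# its embedding layer and aggregation layer (both torch modules), and received_grad is the loss
# gradient for those vectors received from the server in step 207.
def update_first_network_part(high_order, received_grad, local_optimizer):
    local_optimizer.zero_grad()
    # Inject the server-provided gradient and continue back-propagation through the
    # aggregation parameters and, if present, the embedding parameters.
    high_order.backward(gradient=received_grad)
    local_optimizer.step()
```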
It can be understood that each of the N data holders can perform the above operations in a similar way, thereby updating the graph embedding sub-network part it maintains. The entire graph embedding sub-network is thus updated and, together with the classification sub-network in the server, the entire graph neural network is trained and updated.
Looking back at the joint training process of the graph neural network shown in Figure 2, it can be seen that the forward processing that produces the sample prediction results can be divided into three stages, which use three different processing modes and executing entities.

In the first stage, the process of determining the primary embedding vectors of the samples is executed jointly by the N data holders using an MPC scheme. In this process, the MPC scheme ensures the security of the feature data, and the feature data of the holders is thereby fully combined to obtain the primary embedding vectors.

In the second stage, the process of determining the high-order embedding vectors of the samples is executed by each data holder individually. This ensures the security of the graph structure data on the one hand, and on the other hand allows each data holder to perform multi-level aggregation based on the different graph structure it maintains.

In the third stage, the process of determining the prediction results and the prediction loss of the samples is executed in the server. This takes into account that processing the high-order embedding vectors does not involve private data, and that several operations in the neural network involve non-linear transformations with relatively high demands on computing performance. Having a neutral server maintain the classification sub-network therefore improves training and computation efficiency.

Thus, through the solution of the above embodiments, multi-party joint training of the graph neural network is realized efficiently and safely.
According to an embodiment of another aspect, an apparatus for jointly training a graph neural network by multiple parties is provided. The apparatus is deployed in an arbitrary first holder among the aforementioned N data holders, and the first holder can be implemented as any device, platform or device cluster with computing and processing capabilities. As described above, the graph neural network includes a graph embedding sub-network and a classification sub-network; the server maintains the classification sub-network, and the N data holders each maintain a part of the graph embedding sub-network. The first holder stores the first feature part of each sample and a first graph structure containing the samples as corresponding nodes, and maintains a first network part of the graph embedding sub-network, the first network part including an embedding layer and an aggregation layer.

Figure 4 shows a schematic block diagram of a training apparatus deployed in the first holder according to an embodiment. As shown in Figure 4, the training apparatus 400 includes the following units.
The primary embedding unit 41 is configured to, at the embedding layer, jointly compute the primary embedding vector of each sample with the other N-1 data holders by means of a multi-party secure computation scheme, at least based on the first feature part of each sample.

The aggregation unit 42 is configured to, at the aggregation layer, perform multi-level aggregation on the samples based on the first graph structure and the primary embedding vectors of the samples, so as to determine the high-order embedding vector of each sample, where each level of aggregation includes, for the node corresponding to each sample in the first graph structure, determining the current-level embedding vector of the node at least based on the upper-level embedding vectors of its neighbor nodes.

The sending unit 43 is configured to send the high-order embedding vectors of the samples to the server, so that the server performs classification prediction on the samples using the classification sub-network based on the high-order embedding vectors sent by the N data holders, obtaining classification prediction results.

The receiving unit 44 is configured to receive a loss gradient from the server, the loss gradient being determined at least based on the classification prediction results of the samples and the sample labels.

The update unit 45 is configured to update the first network part according to the loss gradient.
According to one implementation, the primary embedding unit 41 is configured to: based on the first feature part of each sample and the embedding parameters in the embedding layer, jointly compute the primary embedding vector of each sample with the other N-1 data holders by means of a multi-party secure computation scheme; correspondingly, the update unit 45 is configured to update the embedding parameters.

In one embodiment of this implementation, the multi-party secure computation scheme adopts a secret sharing scheme, and the primary embedding unit 41 is specifically configured to: perform sharing processing on the first feature part of each sample to obtain a first shared feature part, and perform sharing processing on the embedding parameters to obtain a first shared parameter part; send the first shared feature part and the first shared parameter part to the other N-1 data holders, and receive N-1 shared feature parts and N-1 shared parameter parts from the other N-1 data holders respectively; process a first combined feature formed from the first feature part and the N-1 shared feature parts with a first combined parameter formed from the embedding parameters and the N-1 shared parameter parts, to obtain a first combined embedding result; send the first combined embedding result to the other N-1 data holders, and receive the corresponding N-1 combined embedding results from the other N-1 data holders; and determine the primary embedding vector of each sample according to the first combined embedding result and the N-1 combined embedding results.
According to one embodiment, the aggregation unit 42 is configured to, for the first node corresponding in the first graph structure to any first sample among the respective samples: determine a neighbor aggregation vector based at least on the previous-level embedding vectors of the neighbor nodes of the first node; and determine the current-level embedding vector of the first node according to the neighbor aggregation vector and the previous-level embedding vector of the first node.
Further, in one example, the aggregation unit 42 determines the neighbor aggregation vector by performing a pooling operation on the previous-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.
In another example, the aggregation unit 42 determines the neighbor aggregation vector by computing a weighted sum of the previous-level embedding vectors of the neighbor nodes of the first node, where the weight corresponding to each neighbor node is determined according to the features of the connecting edge between that neighbor node and the first node.
In yet another example, the aggregation unit 42 determines the neighbor aggregation vector based on the previous-level embedding vectors of the respective neighbor nodes and the edge embedding vectors of the respective connecting edges between the neighbor nodes and the first node.
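For illustration only, the following sketch contrasts the three aggregation variants described above. The function neighbor_aggregate, its arguments, and the specific pooling and edge-fusion choices (max pooling, normalized weights, element-wise fusion) are assumptions; other concrete operations are equally possible embodiments.

```python
# Sketch of the three neighbour-aggregation variants; all helper names are illustrative.
import numpy as np

def neighbor_aggregate(prev_embeds, neighbor_ids, mode="pool",
                       edge_weights=None, edge_embeds=None):
    """Aggregate the previous-level embeddings of one node's neighbours."""
    H = prev_embeds[neighbor_ids]                      # (num_neighbors, dim)
    if mode == "pool":                                 # first example: pooling operation
        return H.max(axis=0)
    if mode == "edge_weight":                          # second example: weights from edge features
        w = edge_weights / edge_weights.sum()
        return (w[:, None] * H).sum(axis=0)
    if mode == "edge_embed":                           # third example: fuse with edge embedding vectors
        return (H * edge_embeds).mean(axis=0)
    raise ValueError(f"unknown mode: {mode}")

# The node's current-level embedding can then combine this aggregation vector with the
# node's own previous-level embedding, e.g. tanh(W @ concat(own_prev, aggregated)).
```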
According to one embodiment, the updating unit 45 is specifically configured to: according to the loss gradient, use a back-propagation algorithm to update, backward and layer by layer, the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer.
According to an embodiment of yet another aspect, an apparatus for multi-party joint training of a graph neural network is provided, where the graph neural network includes a graph embedding sub-network and a classification sub-network, the multiple parties include a server and N data holders, the server maintains the classification sub-network, and the N data holders each maintain a part of the graph embedding sub-network; each of the N data holders stores partial features of the respective samples in a sample set, as well as a graph structure containing the respective samples as corresponding nodes; the apparatus is deployed in the server, which can be implemented as any device, platform, or device cluster with computing and processing capabilities.
Fig. 5 shows a schematic block diagram of a training apparatus deployed in a server according to one embodiment. As shown in Fig. 5, the training apparatus 500 includes the following units.
The vector receiving unit 51 is configured to, for any target sample, receive from the N data holders N high-order embedding vectors for the target sample, respectively, where the i-th high-order embedding vector is obtained by the i-th holder among the N data holders by inputting the graph structure stored therein and the feature part of the target sample into the graph embedding sub-network part maintained therein.
The classification prediction unit 52 is configured to combine the N high-order embedding vectors in the classification sub-network to obtain a comprehensive embedding vector of the target sample, and determine the classification prediction result of the target sample according to the comprehensive embedding vector.
The loss determining unit 53 is configured to determine a prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label.
The updating unit 54 is configured to update the classification sub-network according to the prediction loss, and determine the loss gradient corresponding to the input layer of the classification sub-network.
The sending unit 55 is configured to send the loss gradient to the N data holders, so that each holder updates the graph embedding sub-network part maintained therein.
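For illustration only, the following NumPy sketch shows one possible server-side step covering units 51-55. The softmax classifier, the concatenation-based combination, and all names (server_train_step, channel.send_gradient) are assumptions made for the sketch, not the only embodiment.

```python
# Sketch of one server-side training step (units 51-55); names and model form are illustrative.
import numpy as np

def server_train_step(holder_embeds, label_onehot, W_cls, lr, channel):
    # Units 51/52: combine the N high-order embedding vectors, here simply by concatenation.
    z = np.concatenate(holder_embeds)                  # comprehensive embedding vector
    logits = W_cls @ z
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                               # classification prediction result
    # Unit 53: prediction loss (cross entropy against the sample label).
    loss = -np.sum(label_onehot * np.log(probs + 1e-12))
    # Unit 54: gradient at the input layer, then update of the classification sub-network.
    grad_logits = probs - label_onehot
    grad_z = W_cls.T @ grad_logits                     # loss gradient at the input layer
    W_cls -= lr * np.outer(grad_logits, z)
    # Unit 55: send each holder the slice of the gradient that matches its own embedding.
    sizes = np.cumsum([len(h) for h in holder_embeds])[:-1]
    for i, g in enumerate(np.split(grad_z, sizes)):
        channel.send_gradient(holder=i, grad=g)
    return loss
```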
In one embodiment, the classification prediction unit 52 is specifically configured to: concatenate the N high-order embedding vectors to obtain the comprehensive embedding vector; or average the N high-order embedding vectors to obtain the comprehensive embedding vector.
In another embodiment, the classification prediction unit 52 is specifically configured to: multiply N weight vectors bit-wise with the N high-order embedding vectors, respectively, to obtain N weighted processing vectors; and sum the N weighted processing vectors to obtain the comprehensive embedding vector; correspondingly, the updating unit 54 is configured to update the N weight vectors.
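For illustration only, the weighted combination variant can be sketched as follows; the function name weighted_combine, the list-based shapes, and the gradient update noted in the trailing comment are assumptions made for this sketch.

```python
# Sketch of the weighted combination of N high-order embedding vectors; names are illustrative.
import numpy as np

def weighted_combine(holder_embeds, weight_vectors):
    """holder_embeds and weight_vectors: lists of N vectors of the same dimension."""
    weighted = [w * h for w, h in zip(weight_vectors, holder_embeds)]   # bit-wise products
    return np.sum(weighted, axis=0)                                     # comprehensive embedding vector

# During training, the server updates weight_vectors together with the rest of the
# classification sub-network, e.g. w_i -= lr * grad_z * h_i for each holder i.
```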
According to one implementation, the apparatus 500 further includes a label receiving unit (not shown), configured to receive the sample labels from a second holder among the N data holders.
According to an embodiment of another aspect, a computer-readable storage medium is further provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method described in conjunction with Fig. 2.
According to an embodiment of yet another aspect, a computing device is further provided, including a memory and a processor, where executable code is stored in the memory; when the processor executes the executable code, the method described in conjunction with Fig. 2 is implemented.
Those skilled in the art should be aware that, in one or more of the above examples, the functions described in the present invention may be implemented by hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, and the like made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.

Claims (26)

  1. A method for multi-party joint training of a graph neural network, the graph neural network comprising a graph embedding sub-network and a classification sub-network, the multiple parties comprising a server and N data holders, wherein the server maintains the classification sub-network and the N data holders each maintain a part of the graph embedding sub-network; any first holder among the N data holders stores first feature parts of respective samples in a sample set, and a first graph structure containing the respective samples as corresponding nodes; the first holder maintains a first network part of the graph embedding sub-network, the first network part comprising an embedding layer and an aggregation layer; the method is performed by the first holder and comprises:
    in the embedding layer, based at least on the first feature parts of the respective samples, jointly computing with the other N-1 data holders, using a multi-party secure computation (MPC) scheme, primary embedding vectors of the respective samples;
    in the aggregation layer, based on the first graph structure and the primary embedding vectors of the respective samples, performing multi-level aggregation on the respective samples to determine high-order embedding vectors of the respective samples, wherein each level of aggregation comprises: for the node corresponding to each sample in the first graph structure, determining a current-level embedding vector of the node based at least on the previous-level embedding vectors of the neighbor nodes of the node;
    sending the high-order embedding vectors of the respective samples to the server, so that the server uses the classification sub-network to perform classification prediction on each sample based on the high-order embedding vectors sent by the N data holders, obtaining classification prediction results;
    receiving a loss gradient from the server, the loss gradient being determined based at least on the classification prediction results of the respective samples and sample labels;
    updating the first network part according to the loss gradient.
  2. The method according to claim 1, wherein jointly computing with the other N-1 data holders, using a multi-party secure computation MPC scheme, the primary embedding vectors of the respective samples based at least on the first feature parts of the respective samples comprises:
    based on the first feature parts of the respective samples and embedding parameters in the embedding layer, jointly computing with the other N-1 data holders, using the multi-party secure computation MPC scheme, the primary embedding vectors of the respective samples;
    wherein updating the first network part comprises updating the embedding parameters.
  3. The method according to claim 2, wherein the multi-party secure computation MPC scheme comprises a secret sharing scheme, and jointly computing the primary embedding vectors of the respective samples with the other N-1 data holders comprises:
    performing sharing processing on the first feature parts of the respective samples to obtain first shared feature parts, and performing sharing processing on the embedding parameters to obtain first shared parameter parts;
    sending the first shared feature parts and the first shared parameter parts to the other N-1 data holders, and receiving N-1 shared feature parts and N-1 shared parameter parts from the other N-1 data holders, respectively;
    processing a first comprehensive feature formed by the first feature parts and the N-1 shared feature parts with a first comprehensive parameter formed by the embedding parameters and the N-1 shared parameter parts, to obtain a first comprehensive embedding result;
    sending the first comprehensive embedding result to the other N-1 data holders, and receiving the corresponding N-1 comprehensive embedding results from the other N-1 data holders;
    determining the primary embedding vectors of the respective samples according to the first comprehensive embedding result and the N-1 comprehensive embedding results.
  4. The method according to claim 1, wherein each level of aggregation comprises, for a first node corresponding in the first graph structure to any first sample among the respective samples:
    determining a neighbor aggregation vector based at least on the previous-level embedding vectors of the neighbor nodes of the first node;
    determining the current-level embedding vector of the first node according to the neighbor aggregation vector and the previous-level embedding vector of the first node.
  5. The method according to claim 4, wherein determining the neighbor aggregation vector comprises:
    performing a pooling operation on the previous-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.
  6. The method according to claim 4, wherein determining the neighbor aggregation vector comprises:
    computing a weighted sum of the previous-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector, wherein the weight corresponding to each neighbor node is determined according to the features of the connecting edge between that neighbor node and the first node.
  7. The method according to claim 4, wherein determining the neighbor aggregation vector comprises:
    determining the neighbor aggregation vector based on the previous-level embedding vectors of the respective neighbor nodes and the edge embedding vectors of the respective connecting edges between the neighbor nodes and the first node.
  8. The method according to claim 1, wherein updating the first network part according to the loss gradient comprises:
    according to the loss gradient, using a back-propagation algorithm to update, backward and layer by layer, the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer.
  9. A method for multi-party joint training of a graph neural network, the graph neural network comprising a graph embedding sub-network and a classification sub-network, the multiple parties comprising a server and N data holders, wherein the server maintains the classification sub-network and the N data holders each maintain a part of the graph embedding sub-network; each of the N data holders stores partial features of respective samples in a sample set, and a graph structure containing the respective samples as corresponding nodes; the method is performed by the server and comprises:
    for any target sample in the sample set, receiving from the N data holders N high-order embedding vectors for the target sample, respectively, wherein the i-th high-order embedding vector is obtained by the i-th holder among the N data holders by inputting the graph structure stored therein and the feature part of the target sample into the graph embedding sub-network part maintained therein;
    in the classification sub-network, combining the N high-order embedding vectors to obtain a comprehensive embedding vector of the target sample, and determining a classification prediction result of the target sample according to the comprehensive embedding vector;
    determining a prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label;
    updating the classification sub-network according to the prediction loss, and determining the loss gradient corresponding to the input layer of the classification sub-network;
    sending the loss gradient to the N data holders, so that each holder updates the graph embedding sub-network part maintained therein.
  10. The method according to claim 9, wherein combining the N high-order embedding vectors to obtain the comprehensive embedding vector of the target sample comprises:
    concatenating the N high-order embedding vectors to obtain the comprehensive embedding vector; or
    averaging the N high-order embedding vectors to obtain the comprehensive embedding vector.
  11. The method according to claim 9, wherein combining the N high-order embedding vectors to obtain the comprehensive embedding vector of the target sample comprises:
    multiplying N weight vectors bit-wise with the N high-order embedding vectors, respectively, to obtain N weighted processing vectors;
    summing the N weighted processing vectors to obtain the comprehensive embedding vector;
    wherein updating the classification sub-network comprises updating the N weight vectors.
  12. The method according to claim 9, further comprising, before determining the prediction loss:
    receiving the sample label from a second holder among the N data holders.
  13. An apparatus for multi-party joint training of a graph neural network, the graph neural network comprising a graph embedding sub-network and a classification sub-network, the multiple parties comprising a server and N data holders, wherein the server maintains the classification sub-network and the N data holders each maintain a part of the graph embedding sub-network; any first holder among the N data holders stores first feature parts of respective samples in a sample set, and a first graph structure containing the respective samples as corresponding nodes; the first holder maintains a first network part of the graph embedding sub-network, the first network part comprising an embedding layer and an aggregation layer; the apparatus is deployed in the first holder and comprises:
    a primary embedding unit, configured to, in the embedding layer, based at least on the first feature parts of the respective samples, jointly compute with the other N-1 data holders, using a multi-party secure computation MPC scheme, primary embedding vectors of the respective samples;
    an aggregation unit, configured to, in the aggregation layer, based on the first graph structure and the primary embedding vectors of the respective samples, perform multi-level aggregation on the respective samples to determine high-order embedding vectors of the respective samples, wherein each level of aggregation comprises: for the node corresponding to each sample in the first graph structure, determining a current-level embedding vector of the node based at least on the previous-level embedding vectors of the neighbor nodes of the node;
    a sending unit, configured to send the high-order embedding vectors of the respective samples to the server, so that the server uses the classification sub-network to perform classification prediction on each sample based on the high-order embedding vectors sent by the N data holders, obtaining classification prediction results;
    a receiving unit, configured to receive a loss gradient from the server, the loss gradient being determined based at least on the classification prediction results of the respective samples and sample labels;
    an updating unit, configured to update the first network part according to the loss gradient.
  14. The apparatus according to claim 13, wherein the primary embedding unit is configured to:
    based on the first feature parts of the respective samples and embedding parameters in the embedding layer, jointly compute with the other N-1 data holders, using the multi-party secure computation MPC scheme, the primary embedding vectors of the respective samples;
    and the updating unit is configured to update the embedding parameters.
  15. The apparatus according to claim 14, wherein the multi-party secure computation scheme comprises a secret sharing scheme, and the primary embedding unit is configured to:
    perform sharing processing on the first feature parts of the respective samples to obtain first shared feature parts, and perform sharing processing on the embedding parameters to obtain first shared parameter parts;
    send the first shared feature parts and the first shared parameter parts to the other N-1 data holders, and receive N-1 shared feature parts and N-1 shared parameter parts from the other N-1 data holders, respectively;
    process a first comprehensive feature formed by the first feature parts and the N-1 shared feature parts with a first comprehensive parameter formed by the embedding parameters and the N-1 shared parameter parts, to obtain a first comprehensive embedding result;
    send the first comprehensive embedding result to the other N-1 data holders, and receive the corresponding N-1 comprehensive embedding results from the other N-1 data holders;
    determine the primary embedding vectors of the respective samples according to the first comprehensive embedding result and the N-1 comprehensive embedding results.
  16. The apparatus according to claim 13, wherein the aggregation unit is configured to, for a first node corresponding in the first graph structure to any first sample among the respective samples:
    determine a neighbor aggregation vector based at least on the previous-level embedding vectors of the neighbor nodes of the first node;
    determine the current-level embedding vector of the first node according to the neighbor aggregation vector and the previous-level embedding vector of the first node.
  17. The apparatus according to claim 16, wherein determining the neighbor aggregation vector comprises:
    performing a pooling operation on the previous-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector.
  18. The apparatus according to claim 16, wherein determining the neighbor aggregation vector comprises:
    computing a weighted sum of the previous-level embedding vectors of the neighbor nodes of the first node to obtain the neighbor aggregation vector, wherein the weight corresponding to each neighbor node is determined according to the features of the connecting edge between that neighbor node and the first node.
  19. The apparatus according to claim 16, wherein determining the neighbor aggregation vector comprises:
    determining the neighbor aggregation vector based on the previous-level embedding vectors of the respective neighbor nodes and the edge embedding vectors of the respective connecting edges between the neighbor nodes and the first node.
  20. The apparatus according to claim 13, wherein the updating unit is configured to:
    according to the loss gradient, use a back-propagation algorithm to update, backward and layer by layer, the aggregation parameters in the aggregation layer and the embedding parameters in the embedding layer.
  21. An apparatus for multi-party joint training of a graph neural network, the graph neural network comprising a graph embedding sub-network and a classification sub-network, the multiple parties comprising a server and N data holders, wherein the server maintains the classification sub-network and the N data holders each maintain a part of the graph embedding sub-network; each of the N data holders stores partial features of respective samples in a sample set, and a graph structure containing the respective samples as corresponding nodes; the apparatus is deployed in the server and comprises:
    a vector receiving unit, configured to, for any target sample in the sample set, receive from the N data holders N high-order embedding vectors for the target sample, respectively, wherein the i-th high-order embedding vector is obtained by the i-th holder among the N data holders by inputting the graph structure stored therein and the feature part of the target sample into the graph embedding sub-network part maintained therein;
    a classification prediction unit, configured to combine the N high-order embedding vectors in the classification sub-network to obtain a comprehensive embedding vector of the target sample, and determine a classification prediction result of the target sample according to the comprehensive embedding vector;
    a loss determining unit, configured to determine a prediction loss based at least on the classification prediction result of the target sample and the corresponding sample label;
    an updating unit, configured to update the classification sub-network according to the prediction loss, and determine the loss gradient corresponding to the input layer of the classification sub-network;
    a sending unit, configured to send the loss gradient to the N data holders, so that each holder updates the graph embedding sub-network part maintained therein.
  22. The apparatus according to claim 21, wherein the classification prediction unit is configured to:
    concatenate the N high-order embedding vectors to obtain the comprehensive embedding vector; or
    average the N high-order embedding vectors to obtain the comprehensive embedding vector.
  23. The apparatus according to claim 21, wherein the classification prediction unit is configured to:
    multiply N weight vectors bit-wise with the N high-order embedding vectors, respectively, to obtain N weighted processing vectors;
    sum the N weighted processing vectors to obtain the comprehensive embedding vector;
    wherein the updating unit is configured to update the N weight vectors.
  24. The apparatus according to claim 21, further comprising:
    a label receiving unit, configured to receive the sample label from a second holder among the N data holders.
  25. A computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed in a computer, the computer is caused to perform the method according to any one of claims 1-12.
  26. A computing device, comprising a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the method according to any one of claims 1-12 is implemented.
PCT/CN2020/111501 2019-10-29 2020-08-26 Method and device for multi-party joint training of graph neural network WO2021082681A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911040222.5A CN110782044A (en) 2019-10-29 2019-10-29 Method and device for multi-party joint training of neural network of graph
CN201911040222.5 2019-10-29

Publications (1)

Publication Number Publication Date
WO2021082681A1 true WO2021082681A1 (en) 2021-05-06

Family

ID=69387467

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111501 WO2021082681A1 (en) 2019-10-29 2020-08-26 Method and device for multi-party joint training of graph neural network

Country Status (2)

Country Link
CN (1) CN110782044A (en)
WO (1) WO2021082681A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782044A (en) * 2019-10-29 2020-02-11 支付宝(杭州)信息技术有限公司 Method and device for multi-party joint training of neural network of graph
CN110929870B (en) * 2020-02-17 2020-06-12 支付宝(杭州)信息技术有限公司 Method, device and system for training neural network model
GB2592076B (en) * 2020-02-17 2022-09-07 Huawei Tech Co Ltd Method of training an image classification model
CN110929887B (en) * 2020-02-17 2020-07-03 支付宝(杭州)信息技术有限公司 Logistic regression model training method, device and system
CN111275176B (en) * 2020-02-27 2023-09-26 支付宝(杭州)信息技术有限公司 Distributed computing method and distributed computing system
CN113469206A (en) * 2020-03-31 2021-10-01 华为技术有限公司 Method, device, equipment and storage medium for obtaining artificial intelligence model
CN111461215B (en) * 2020-03-31 2021-06-29 支付宝(杭州)信息技术有限公司 Multi-party combined training method, device, system and equipment of business model
CN113657617A (en) * 2020-04-23 2021-11-16 支付宝(杭州)信息技术有限公司 Method and system for model joint training
CN111538827B (en) * 2020-04-28 2023-09-05 清华大学 Case recommendation method, device and storage medium based on content and graph neural network
CN111553470B (en) * 2020-07-10 2020-10-27 成都数联铭品科技有限公司 Information interaction system and method suitable for federal learning
CN111737755B (en) * 2020-07-31 2020-11-13 支付宝(杭州)信息技术有限公司 Joint training method and device for business model
CN112104446A (en) * 2020-09-03 2020-12-18 哈尔滨工业大学 Multi-party combined machine learning method and system based on homomorphic encryption
CN112085172B (en) * 2020-09-16 2022-09-16 支付宝(杭州)信息技术有限公司 Method and device for training graph neural network
CN112101531B (en) * 2020-11-16 2021-02-09 支付宝(杭州)信息技术有限公司 Neural network model training method, device and system based on privacy protection
CN112200321B (en) * 2020-12-04 2021-04-06 同盾控股有限公司 Inference method, system, device and medium based on knowledge federation and graph network
WO2022133725A1 (en) * 2020-12-22 2022-06-30 Orange Improved distributed training of graph-embedding neural networks
CN113240505A (en) * 2021-05-10 2021-08-10 深圳前海微众银行股份有限公司 Graph data processing method, device, equipment, storage medium and program product
CN113221153B (en) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113254996B (en) * 2021-05-31 2022-12-27 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113626650A (en) * 2021-08-04 2021-11-09 支付宝(杭州)信息技术有限公司 Service processing method and device and electronic equipment
CN114121206B (en) * 2022-01-26 2022-05-20 中电云数智科技有限公司 Case portrait method and device based on multi-party combined K mean modeling
CN114462600B (en) * 2022-04-11 2022-07-05 支付宝(杭州)信息技术有限公司 Training method and device for graph neural network corresponding to directed graph
CN114971742A (en) * 2022-06-29 2022-08-30 支付宝(杭州)信息技术有限公司 Method and device for training user classification model and user classification processing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190156214A1 (en) * 2017-11-18 2019-05-23 Neuralmagic Inc. Systems and methods for exchange of data in distributed training of machine learning algorithms
CN110245787A (en) * 2019-05-24 2019-09-17 阿里巴巴集团控股有限公司 A kind of target group's prediction technique, device and equipment
CN110348573A (en) * 2019-07-16 2019-10-18 腾讯科技(深圳)有限公司 The method of training figure neural network, figure neural network unit, medium
CN110782044A (en) * 2019-10-29 2020-02-11 支付宝(杭州)信息技术有限公司 Method and device for multi-party joint training of neural network of graph

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104796475B (en) * 2015-04-24 2018-10-26 苏州大学 A kind of socialization recommendation method based on homomorphic cryptography
CN109934706B (en) * 2017-12-15 2021-10-29 创新先进技术有限公司 Transaction risk control method, device and equipment based on graph structure model
CN109102393B (en) * 2018-08-15 2021-06-29 创新先进技术有限公司 Method and device for training and using relational network embedded model
CN109918454B (en) * 2019-02-22 2024-02-06 创新先进技术有限公司 Method and device for embedding nodes into relational network graph

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210192296A1 (en) * 2019-12-23 2021-06-24 Electronics And Telecommunications Research Institute Data de-identification method and apparatus
CN113254580A (en) * 2021-05-24 2021-08-13 厦门大学 Special group searching method and system
CN113254580B (en) * 2021-05-24 2023-10-03 厦门大学 Special group searching method and system
CN113222143B (en) * 2021-05-31 2023-08-01 平安科技(深圳)有限公司 Method, system, computer equipment and storage medium for training graphic neural network
CN113222143A (en) * 2021-05-31 2021-08-06 平安科技(深圳)有限公司 Graph neural network training method, system, computer device and storage medium
CN113657577B (en) * 2021-07-21 2023-08-18 阿里巴巴达摩院(杭州)科技有限公司 Model training method and computing system
CN113657577A (en) * 2021-07-21 2021-11-16 阿里巴巴达摩院(杭州)科技有限公司 Model training method and computing system
CN116527824A (en) * 2023-07-03 2023-08-01 北京数牍科技有限公司 Method, device and equipment for training graph convolution neural network
CN116527824B (en) * 2023-07-03 2023-08-25 北京数牍科技有限公司 Method, device and equipment for training graph convolution neural network
CN117218459A (en) * 2023-11-08 2023-12-12 支付宝(杭州)信息技术有限公司 Distributed node classification method and device
CN117218459B (en) * 2023-11-08 2024-01-26 支付宝(杭州)信息技术有限公司 Distributed node classification method and device
CN117273086A (en) * 2023-11-17 2023-12-22 支付宝(杭州)信息技术有限公司 Method and device for multi-party joint training of graph neural network
CN117273086B (en) * 2023-11-17 2024-03-08 支付宝(杭州)信息技术有限公司 Method and device for multi-party joint training of graph neural network

Also Published As

Publication number Publication date
CN110782044A (en) 2020-02-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20882384; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20882384; Country of ref document: EP; Kind code of ref document: A1)