CN114091651B - Method, device and system for multi-party combined training of graph neural network - Google Patents


Info

Publication number
CN114091651B
Authority
CN
China
Prior art keywords
party
noise
data
encryption
ciphertext
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111297665.XA
Other languages
Chinese (zh)
Other versions
CN114091651A (en)
Inventor
倪翔
吕灵娟
许小龙
孟昌华
王维强
吕乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202111297665.XA
Publication of CN114091651A
Application granted
Publication of CN114091651B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods


Abstract

Embodiments of this specification provide a method, device and system for multi-party joint training of a graph neural network that protects private data, wherein the method comprises the following steps: the first party processes a first feature part of a sample object using a first parameter part of the graph neural network to obtain a first processing result; homomorphically encrypts the first processing result with the controller's target public key to obtain a first encryption result; receives a second encryption result from the second party; obtains a first gradient ciphertext through homomorphic operations based on the first encryption result, the second encryption result and a preset loss function; adds a first noise ciphertext, obtained by encrypting a first noise, to the first gradient ciphertext to obtain first encrypted noisy data, and sends it to the controller; receives from the controller the first noisy data obtained by decrypting the first encrypted noisy data, and removes the first noise from it to obtain a first gradient plaintext; and updates the first parameter part according to the first gradient plaintext.

Description

Method, device and system for multi-party combined training of graph neural network
Technical Field
The present disclosure relates to the field of data security technologies, and in particular, to a method, an apparatus, and a system for protecting private data in a multi-party joint training graph neural network.
Background
The data required for machine learning often spans multiple fields. For example, in a machine-learning-based user classification scenario, an electronic payment platform holds a user's transaction flow data, a social platform holds the user's friend-contact data, and a banking institution holds the user's lending data. Such data typically exists as isolated islands. Because of industry competition, data security, user privacy and similar concerns, data integration faces great resistance, and it is difficult to consolidate the data scattered across platforms in order to train a machine learning model. Jointly training a machine learning model on multi-party data, on the premise that no data is leaked, has therefore become a pressing challenge.
The graph neural network is a widely used machine learning model. Compared with a traditional neural network, a graph neural network can capture not only the features of individual nodes but also the association relationships between nodes, and therefore achieves excellent results on many machine learning tasks. Facing the data-island phenomenon, however, how to combine multi-party data and safely perform multi-party joint training of a graph neural network remains a problem to be solved.
Disclosure of Invention
One or more embodiments of the present disclosure provide a method, apparatus, and system for protecting private data in a multi-party joint training graph neural network, so as to achieve safe and effective joint training of the graph neural network between multiple parties.
According to a first aspect, there is provided a method of protecting private data in a multi-party joint training graph neural network, the multi-party comprising a first party, a second party and a controller, the method being performed by the first party and comprising:
Processing a first characteristic part of a sample object by using a first parameter part of the graph neural network to obtain a first processing result;
Homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
Receiving a second encryption result from a second party, wherein the second encryption result is obtained by homomorphic encryption of a second processing result by the second party by using the target public key, and the second processing result is obtained by the second party by using a second parameter part of the graph neural network to process a second characteristic part of the sample object;
based on the first encryption result, the second encryption result and a preset loss function, obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operations, wherein the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
adding, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noisy data; transmitting the first encrypted noisy data to the controller;
receiving, from the controller, first noisy data obtained by decrypting the first encrypted noisy data, and removing the first noise from the first noisy data to obtain a first gradient plaintext;
and updating the first parameter part according to the first gradient plaintext.
In one implementation manner, the sample object is a node, or an edge, in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, and items; the edges represent the association between the business objects.
In one embodiment, the obtaining the first processing result includes:
and multiplying the first parameter part and the first characteristic part to obtain the first processing result.
In one embodiment, before the obtaining the first processing result, the method further includes:
aggregating the first original feature part of the sample object with the first original feature parts of the sample object's neighbor objects, to obtain the first feature part.
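The patent does not fix a particular aggregator; as an illustrative sketch, a GraphSAGE-style mean aggregator (an assumption, not the patent's prescribed construction) would combine the object's own raw features with those of its neighbors like this:

```python
# Hypothetical mean aggregator: average the sample object's own raw feature
# vector with those of its neighbors (the choice of mean is an assumption).

def aggregate_feature(own, neighbors):
    vectors = [own] + neighbors
    dim = len(own)
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

own_feature = [1.0, 2.0]                      # first original feature part of the object
neighbor_features = [[3.0, 4.0], [5.0, 6.0]]  # first original feature parts of neighbors
xa = aggregate_feature(own_feature, neighbor_features)  # first feature part XA
```

Any permutation-invariant aggregator (sum, max, attention-weighted mean) could be substituted without changing the rest of the protocol.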
In an implementation manner, the obtaining, by homomorphic operation, the first gradient ciphertext corresponding to the first parameter portion based on the first encryption result, the second encryption result, and a preset loss function includes:
Determining a loss value ciphertext through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function;
the first gradient ciphertext is determined based on the loss value ciphertext and the first parameter portion.
In one embodiment, the method further comprises:
sending the loss value ciphertext to the controller;
The receiving, from the controller, of the first noisy data obtained by decrypting the first encrypted noisy data includes:
receiving the first noisy data sent by the controller in the case that the loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
In one embodiment, the orthogonal polynomial is a second-order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial with the sum of the first and second processing results as a variable.
In one embodiment, the method further comprises:
The target public key is received from the controller and stored.
According to a second aspect, there is provided a method of protecting private data in a multi-party joint training graph neural network, the method comprising:
The first party processes a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
The second party processes a second characteristic part of the sample object by using a second parameter part of the graph neural network to obtain a second processing result; homomorphic encryption is carried out on the second processing result by utilizing the target public key, so that a second encryption result is obtained and sent to the first party;
The first party obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operations based on the first encryption result, the second encryption result and a preset loss function; adds, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noisy data; and transmits the first encrypted noisy data to the controller; wherein the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
After receiving the first encrypted noisy data, the controller decrypts it with the target private key corresponding to the target public key to obtain first noisy data, and transmits the first noisy data to the first party;
The first party receives the first noisy data, removes the first noise from it to obtain a first gradient plaintext, and updates the first parameter part with the first gradient plaintext.
According to a third aspect, there is provided an apparatus for protecting private data in a multi-party joint training graph neural network, the multi-party comprising a first party, a second party and a controller, the apparatus being deployed at the first party, the apparatus comprising:
The first processing module is configured to process a first characteristic part of the sample object by utilizing the first parameter part of the graph neural network to obtain a first processing result;
the homomorphic encryption module is configured to homomorphic encrypt the first processing result by utilizing the target public key of the controller to obtain a first encryption result;
the first receiving module is configured to receive a second encryption result from a second party, wherein the second encryption result is obtained by homomorphic encryption of a second processing result by the second party by using the target public key, and the second processing result is obtained by the second party by processing a second characteristic part of the sample object by using a second parameter part of the graph neural network;
the gradient ciphertext determining module is configured to obtain a first gradient ciphertext corresponding to the first parameter part through homomorphic operations based on the first encryption result, the second encryption result and a preset loss function, wherein the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
The noise-adding and transmitting module is configured to add, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noisy data, and to transmit the first encrypted noisy data to the controller;
The second receiving module is configured to receive, from the controller, the first noisy data obtained by decrypting the first encrypted noisy data, and to remove the first noise from the first noisy data to obtain a first gradient plaintext;
And the updating module is configured to update the first parameter part according to the first gradient plaintext.
According to a fourth aspect, there is provided a system for protecting private data in a multi-party joint training graph neural network, the system comprising a first party, a second party and a controller, wherein,
The first party is used for processing a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
The second party is used for processing a second characteristic part of the sample object by utilizing a second parameter part of the graph neural network to obtain a second processing result; homomorphic encryption is carried out on the second processing result by utilizing the target public key, so that a second encryption result is obtained and sent to the first party;
the first party is further used for obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operations based on the first encryption result, the second encryption result and a preset loss function; adding, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noisy data; and transmitting the first encrypted noisy data to the controller; the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
the controller is used for decrypting the first encrypted and noisy data by utilizing a target private key corresponding to the target public key after receiving the first encrypted and noisy data to obtain first noisy data; transmitting the first noisy data to the first party;
the first party is also used for receiving the first noise adding data, removing the first noise from the first noise adding data to obtain a first gradient plaintext, and updating the first parameter part by using the first gradient plaintext.
According to a fifth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first or second aspect.
According to a sixth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has executable code stored therein, the processor implementing the method of the first or second aspect when executing the executable code.
According to the method, device and system provided by the embodiments of this specification, data are exchanged among the multiple parties in homomorphically encrypted form, which prevents each party's data from leaking. Moreover, when the first party computes the first gradient ciphertext corresponding to the first parameter part from the parties' encryption results, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network: the nonlinear activation is replaced by a polynomial expressible in additions and multiplications, which homomorphic operations support, while the precision of the gradient ciphertext is preserved to a certain extent.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is evident that the drawings in the following description are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of an implementation framework of one embodiment of the disclosure;
FIG. 2 is a flowchart of a method for protecting private data in a multi-party joint training graph neural network according to an embodiment;
FIG. 3 is a flowchart of another method for protecting private data in a multi-party joint training graph neural network according to an embodiment;
FIG. 4 is a schematic block diagram of an apparatus for multi-party joint training of a graph neural network to protect private data provided by an embodiment;
Fig. 5 is a schematic block diagram of a system for multi-party joint training graph neural networks for protecting private data according to an embodiment.
Detailed Description
The technical solutions of the embodiments of the present specification will be described in detail below with reference to the accompanying drawings.
The embodiments of this specification disclose a method, device and system for multi-party joint training of a graph neural network that protects private data. The application scenarios and inventive concept of the method are introduced first, as follows:
In view of this, the embodiments of this specification provide a method for multi-party joint training of a graph neural network that protects private data, in which the parties use homomorphic encryption to realize safe collaborative training. Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. As shown in Fig. 1, the scenario of multi-party joint training of a graph neural network protecting private data involves the following participants: a first party A, a second party B and a controller C. Each participant may be implemented by any device, platform, server, or cluster of devices having computing and processing capabilities. The three parties jointly train a graph neural network while protecting data privacy. In one implementation, the graph neural network may be a graph convolutional network based on the GraphSAGE algorithm.
The first party A stores a first parameter part WA of the graph neural network to be trained, as well as the relationship network graph corresponding to the graph neural network. The second party B stores the second parameter part WB of the graph neural network to be trained, and the same relationship network graph. In addition, the first party A stores part of the features of the n sample objects in the training data set, referred to as the first feature part XA, and the second party B stores the second feature part XB of the n sample objects. A sample object is a node or an edge in the relationship network graph; a node may represent one of the following business objects: users, merchants, and items; an edge represents an association between business objects in the relationship network graph. Assume further that the second party stores the label values of the n sample objects, the n label values constituting a label vector Y. It will be appreciated that, across different instances of multi-party joint training of a graph neural network, the business objects represented by the nodes may differ, and may further include residences, companies, and the like.
Here, the first party A and the second party B store a relationship network graph of identical structure, constructed jointly in advance based on the association relationships among the n sample objects stored by each of them.
For example, in one exemplary scenario, the first party A and the second party B are an electronic payment platform and a banking institution, and the two parties need to jointly train a graph neural network for evaluating the credit rating of users. Here the sample object is a user: the electronic payment platform stores part of the user's features (e.g., payment-related features), and the banking institution stores another part (e.g., features related to the user's credit record), from which the second feature part XB corresponding to the sample object can be determined. Both parties also store the relationship network graph corresponding to the graph neural network, constructed jointly based on the associations between users and/or between users and other objects (such as merchants and items) stored by the electronic payment platform and the banking institution respectively. The electronic payment platform determines the first feature part XA corresponding to a user from its stored partial features and the relationship network graph; the banking institution likewise determines the second feature part XB corresponding to the user from its stored partial features and the relationship network graph.
In another exemplary scenario, the first party A and the second party B are an e-commerce platform and an electronic payment platform, and the two parties need to jointly train a graph neural network for assessing the fraud risk of merchants. Here the sample object is a merchant: the e-commerce platform stores the merchant's sales data as one partial feature, and the electronic payment platform stores the merchant's transaction flow data as another. Both parties also store the relationship network graph corresponding to the graph neural network, constructed jointly based on the associations between merchants and other objects (items and users) stored by the e-commerce platform and the electronic payment platform respectively. The e-commerce platform determines the first feature part XA corresponding to a merchant from its stored partial features and the relationship network graph; the electronic payment platform likewise determines the second feature part XB corresponding to the merchant from its stored partial features and the relationship network graph.
In other scenario examples, the business object may also be other objects to be evaluated, such as merchandise, interaction events (e.g., transaction events, login events, click events, purchase events), and so forth. Accordingly, the participants may be different business parties that maintain different characteristic portions of the business objects described above. The graph neural network may be a network that performs classification prediction or regression prediction for the corresponding business object.
It is to be understood that the business object features maintained by each participant belong to private data, and in the process of joint training, plaintext exchange cannot be performed so as to protect the security of the private data. And, eventually, the first party a wishes to train to obtain a model parameter part for processing the first feature part XA, i.e. the first parameter part WA; the second party wishes to train a second parameter part WB for processing the second feature part XB, which together constitute the graph neural network.
In order to jointly train the model without revealing private data, according to an embodiment of the present disclosure, as shown in Fig. 1, in the graph neural network training process the first party A processes the first feature part XA with the first parameter part WA to obtain a first processing result MA, and homomorphically encrypts MA with the controller C's target public key PK to obtain a first encryption result [MA]_PK; likewise, the second party B processes the second feature part XB with the second parameter part WB to obtain a second processing result MB, and homomorphically encrypts MB with the target public key PK to obtain a second encryption result [MB]_PK. The first party A and the second party B then send their respective encryption results [M]_PK to each other ([MA]_PK from the first party A, [MB]_PK from the second party B). After each of them receives the other party's encryption result, it uses the two encryption results and a preset loss function to obtain its gradient ciphertext [G]_PK through homomorphic operations (the first party A obtains the first gradient ciphertext [GA]_PK, the second party B obtains the second gradient ciphertext [GB]_PK). Then the first party A and the second party B each add a noise ciphertext [ε]_PK to their gradient ciphertext [G]_PK and send the result to the controller C. Here [·]_PK denotes encryption, with the subscript indicating the key used.
The noise ciphertext [ε1]_PK added by the first party A to the first gradient ciphertext [GA]_PK, and the noise ciphertext [ε2]_PK added by the second party B to the second gradient ciphertext [GB]_PK, are obtained by homomorphically encrypting randomly generated noise. The two noises may be the same or different.
The activation function in a graph neural network is generally nonlinear. So that the training process can be carried out with homomorphic operations, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network; using an orthogonal polynomial for this approximation also preserves the accuracy of the gradient ciphertext (and thus of the gradient) to a certain extent.
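The patent does not disclose the specific polynomial construction; purely as an illustration, the idea can be sketched as a least-squares (i.e., orthogonal-projection) fit of a second-order polynomial to a sigmoid activation on an assumed interval [-4, 4] (the activation choice and interval are assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_quadratic(xs, ys):
    """Least-squares fit of a0 + a1*x + a2*x^2 via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    A = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    b = t[:]
    for col in range(3):  # Gaussian elimination with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    a = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):   # back substitution
        a[r] = (b[r] - sum(A[r][c] * a[c] for c in range(r + 1, 3))) / A[r][r]
    return a

# Sample the sigmoid on [-4, 4] and project it onto polynomials of degree <= 2.
xs = [-4 + 8 * i / 200 for i in range(201)]
a0, a1, a2 = fit_quadratic(xs, [sigmoid(x) for x in xs])
err = max(abs(a0 + a1 * x + a2 * x * x - sigmoid(x)) for x in xs)
# By symmetry the fit reduces to roughly a0 = 0.5, a2 = 0; the residual stays modest.
```

The resulting polynomial involves only additions and multiplications, which is exactly what makes the loss computable under homomorphic encryption.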
After receiving the noise-added gradient ciphertexts [G]_PK + [ε]_PK sent by the first party A and the second party B, the controller C decrypts them with the target private key SK corresponding to the target public key PK, obtaining the noise-added gradients G + ε of the two parties (GA + ε1 for the first party A and GB + ε2 for the second party B), and feeds GA + ε1 back to the first party A and GB + ε2 back to the second party B.
After the first party A obtains the noise-added gradient GA + ε1, it removes the noise ε1 to obtain the corresponding gradient plaintext GA; after the second party B obtains the noise-added gradient GB + ε2, it removes the noise ε2 to obtain the corresponding gradient plaintext GB. The first party A and the second party B then update their respective parameter parts of the graph neural network model according to the obtained gradient plaintexts G, thereby realizing multi-party joint training of the graph neural network.
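The masking round above can be sketched with a toy Paillier cryptosystem, which is additively homomorphic. This is an illustrative assumption: the patent does not mandate Paillier specifically, and the key size, integer gradient value and noise range below are made up for the demo.

```python
import math
import random

# Toy additively homomorphic Paillier scheme, for illustration only.

def _is_prime(n, rounds=20):
    """Miller-Rabin primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def _gen_prime(bits):
    while True:
        p = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if _is_prime(p):
            return p

def keygen(bits=64):
    """Return (public key n, secret key (n, lambda, mu)); retries until gcd(n, lambda) == 1."""
    while True:
        p, q = _gen_prime(bits // 2), _gen_prime(bits // 2)
        n, lam = p * q, (p - 1) * (q - 1)
        if p != q and math.gcd(n, lam) == 1:
            return n, (n, lam, pow(lam, -1, n))

def encrypt(n, m):
    r = random.randrange(1, n)
    return (pow(n + 1, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    n, lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def e_add(n, c1, c2):
    """Homomorphic addition: decrypting e_add(..) yields m1 + m2 (mod n)."""
    return (c1 * c2) % (n * n)

# One masking round: A adds encrypted noise, C decrypts, A removes the noise.
pk, sk = keygen()                 # key pair generated by the controller C
grad = 123456                     # toy first gradient plaintext GA
eps = random.randrange(1, 10**6)  # first noise eps1, known only to A
masked_ct = e_add(pk, encrypt(pk, grad), encrypt(pk, eps))  # [GA]_PK "+" [eps1]_PK
noisy = decrypt(sk, masked_ct)    # controller C only ever sees GA + eps1
recovered = (noisy - eps) % pk    # A subtracts eps1 and recovers GA
```

The design point the sketch shows is that the controller, although it holds the decryption key, only ever decrypts noise-masked values, so neither party's gradient plaintext is exposed to it.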
Throughout the training process, no party exchanges data in plaintext: all communicated data is either encrypted or noise-added, which ensures that private data is not revealed during joint training and strengthens data security. To support homomorphic operations, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network, so that the nonlinear computation is approximated by one expressible in homomorphic additions and multiplications, while the orthogonal-polynomial approximation preserves the accuracy of the gradient ciphertext to a certain extent. A specific implementation of the above scheme is described below.
FIG. 2 illustrates a flow chart of a method for multi-party joint training of a graph neural network that protects private data, in one embodiment of this specification. The parties include a first party A, a second party B, and a controller C. Training consists of an initialization phase followed by iterative training of the model (i.e., iterative training of the graph neural network). In the initialization phase, the controller C generates an asymmetric key pair for homomorphic encryption, i.e., a target public key PK and a target private key SK, and then transmits the target public key PK to the first party A and the second party B, each of which stores it. The controller C keeps the target private key SK secret.
In addition, the first party A and the second party B initialize the parameter parts of the graph neural network that they store. Specifically, the first party A initializes a first parameter part WA for processing a first feature part of the sample object, and the second party B initializes a second parameter part WB for processing a second feature part of the sample object. In one implementation, the first parameter part WA and the second parameter part WB may be initialized by random generation.
Then, the model iterative training process shown in fig. 2 is entered. The following describes a method flow of the multiparty joint training graph neural network for protecting privacy data from the perspective of the first party a. The first party a may be implemented by any means, device, platform, cluster of devices, etc. having computing, processing capabilities. It will be appreciated that the second party B and the controller C may also be implemented by any means, device, platform, cluster of devices, etc. having computing, processing capabilities.
Accordingly, the method is performed by a first party a, comprising the steps S210-S270 of:
S210: the first characteristic part XA of the sample object is processed with the first parameter part WA of the graph neural network, resulting in a first processing result MA. The first party A stores a relationship network diagram corresponding to the graph neural network besides the first parameter part of the graph neural network, and the relationship network diagram can comprise a plurality of nodes and edges. Wherein a node may represent one of the following business objects: users, merchants, and items; edges may represent associations between business objects. The first party a also stores a portion of the features of the n sample objects in the training data set of the training pattern neural network, referred to as the first feature portion XA. The first party a processes the first characteristic part XA of the sample object by using the first parameter part WA of the graph neural network to obtain a first processing result MA. In one implementation, step S210 may specifically include: the first parameter part WA and the first feature part XA are multiplied to obtain a first processing result MA.
In this case, the relationship network diagram stored by the first party a is identical in structure to the relationship network diagram stored by the second party B, and the relationship network diagram is commonly constructed in advance based on the association relationship between the sample objects stored by each of the first party a and the second party B.
Considering that both the parameter part and the feature part are private data, the processing result determined from the two is private data as well. To avoid leakage of private data during multi-party joint training while ensuring that training proceeds effectively, the first party A needs to homomorphically encrypt the first processing result and send the encrypted result to the second party B for data interaction. Accordingly, after obtaining the first processing result, the first party A performs step S220: homomorphically encrypt the first processing result MA with the target public key PK of the controller C to obtain a first encryption result [MA]PK.
The first party A then sends the first encryption result [MA]PK to the second party B. In its own model iterative training process, the second party B processes a second feature part XB of the sample object with a second parameter part WB of the graph neural network to obtain a second processing result MB. Then, to avoid disclosure of this private data (the second processing result MB), the second party B homomorphically encrypts MB with the target public key PK of the controller C to obtain a second encryption result [MB]PK, and sends the second encryption result [MB]PK to the first party A. After that, the first party A performs step S230: receiving the second encryption result [MB]PK from the second party B.
On the other hand, after obtaining the first encryption result [MA]PK and the second encryption result [MB]PK, the second party B obtains, through homomorphic operation based on the two encryption results and a preset loss function, a second gradient ciphertext [GB]PK corresponding to the second parameter part; adds a second noise ciphertext [epsilon 2]PK, obtained by encrypting the second noise epsilon 2, to the second gradient ciphertext [GB]PK to obtain second encrypted noise-added data [GB]PK+[epsilon 2]PK; and sends the second encrypted noise-added data [GB]PK+[epsilon 2]PK to the controller C. The second party B may obtain the second processing result MB by multiplying the second parameter part WB and the second feature part XB. The second party B may obtain the second gradient ciphertext [GB]PK as follows: perform homomorphic operation on the first encryption result [MA]PK and the second encryption result [MB]PK to obtain a loss value ciphertext, and determine the second gradient ciphertext [GB]PK using the loss value ciphertext and the second parameter part WB. For the specific procedure, refer to the following procedure in which the first party A determines the first gradient ciphertext [GA]PK.
After the first party A receives the second encryption result [MB]PK, step S240 is performed: based on the first encryption result [MA]PK, the second encryption result [MB]PK and a preset loss function, a first gradient ciphertext [GA]PK corresponding to the first parameter part WA is obtained through homomorphic operation. It will be appreciated that the activation function in a neural network is typically a nonlinear function, while homomorphic operation is typically a linear operation. In view of this, in order to support homomorphic operation while ensuring the accuracy of the computed gradient, the activation function in the graph neural network is approximated by an orthogonal polynomial; correspondingly, the loss function takes the form of an orthogonal polynomial approximating the activation function in the graph neural network. In one case, the activation functions approximated by the orthogonal polynomial may include the hidden-layer activation function (the ReLU function) and the output-layer activation function (the softmax function) of the graph neural network.
In one embodiment, step S240 may include steps 11-12 as follows:
Step 11: the loss value ciphertext [ L ] PK is determined through homomorphic operation based on the first encryption result [ MA ] PK and the second encryption result [ MB ] PK and a preset loss function.
In one implementation, the orthogonal polynomial is a second-order orthogonal polynomial, which can be represented by the following equation (1):
Where a is a range value determined in advance based on training data (including the first characteristic portion and the second characteristic portion of the sample object) in the training data set, and x represents an argument.
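Formula (1) defines the second-order polynomial in terms of the range value a determined from the training data. As an illustrative stand-in (the patent's concrete coefficients are not reproduced above), the sketch below computes a discrete least-squares quadratic approximation of the ReLU activation on [-a, a]; the function names and the grid-based fit are assumptions of this sketch, not the patent's exact construction.

```python
def fit_quadratic(f, a, num=2001):
    """Discrete least-squares fit of f on [-a, a] by c0 + c1*x + c2*x^2."""
    xs = [-a + 2 * a * i / (num - 1) for i in range(num)]
    S = [sum(x ** k for x in xs) for k in range(5)]        # moment sums of the grid
    T = [sum(f(x) * x ** k for x in xs) for k in range(3)]
    # Normal equations (B^T B) c = B^T y for the basis (1, x, x^2)
    aug = [[S[i], S[i + 1], S[i + 2], T[i]] for i in range(3)]
    for i in range(3):                                     # Gaussian elimination
        piv = max(range(i, 3), key=lambda r: abs(aug[r][i]))
        aug[i], aug[piv] = aug[piv], aug[i]
        for r in range(i + 1, 3):
            fac = aug[r][i] / aug[i][i]
            for col in range(i, 4):
                aug[r][col] -= fac * aug[i][col]
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                                    # back substitution
        c[i] = (aug[i][3] - sum(aug[i][j] * c[j] for j in range(i + 1, 3))) / aug[i][i]
    return c  # [c0, c1, c2]

# Approximate ReLU on [-1, 1]; the best quadratic is roughly 3/32 + x/2 + (15/32)x^2
c0, c1, c2 = fit_quadratic(lambda x: max(x, 0.0), a=1.0)
```

The resulting quadratic can then serve as the homomorphism-friendly substitute for the activation inside the loss computation.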
In one implementation, the loss function may be defined as an orthogonal polynomial that is variable as the sum of the first processing result and the second processing result. Accordingly, the loss function can be expressed by the following formula (2):
L(WA,WB)=p(WA*XA+WB*XB); (2)
Where WA XA represents the first processing result, WB XB represents the second processing result.
Based on the above formula (1) and formula (2), the loss value plaintext L corresponding to the loss value ciphertext [ L ] PK can be represented by the following formula (3):
After applying the homomorphic encryption operator, that is, based on the first encryption result, the second encryption result and the preset loss function, the obtained loss value ciphertext [L]PK may be represented by the following formula (4):
[L(WA,WB)]PK=[p(WA*XA+WB*XB)]PK; (4)
accordingly, based on the above formula (3), the formula (4) can be modified into the formula (5):
further, the above formula (5) may be modified into formula (6):
It will be appreciated that the first encryption result [ MA ] PK (i.e., [ WA x XA ] PK) and the second encryption result [ WB x XB ] PK are known, and in view of this, the above formula (6) can be modified into the following formula (7):
[L(WA,WB)]PK=[LA]PK+[LB]PK+[LAB]PK; (7)
where [LA]PK, [LB]PK and [LAB]PK are obtained from the first encryption result [MA]PK and the second encryption result [MB]PK through homomorphic operations.
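Whatever the concrete coefficients of the second-order polynomial, the split in formula (7) rests on simple algebra: p(MA+MB) decomposes into a term in MA alone, a term in MB alone, and a cross term. A plaintext sketch with illustrative coefficients (the patent derives the real ones from the range value a) checks this decomposition:

```python
# Generic second-order polynomial p(x) = c2*x^2 + c1*x + c0; the coefficient
# values here are illustrative stand-ins, not the patent's formula (1).
c2, c1, c0 = 0.46875, 0.5, 0.09375

def p(x):
    return c2 * x ** 2 + c1 * x + c0

MA, MB = 1.7, -0.4   # plaintext stand-ins for the first and second processing results

# Mirror of [L]PK = [LA]PK + [LB]PK + [LAB]PK in formula (7):
LA = c2 * MA ** 2 + c1 * MA          # depends only on MA
LB = c2 * MB ** 2 + c1 * MB          # depends only on MB
LAB = 2 * c2 * MA * MB + c0          # cross term plus constant

assert abs(p(MA + MB) - (LA + LB + LAB)) < 1e-9
```

In the protocol each addend is evaluated on ciphertexts, so the loss ciphertext can be assembled from the known encryption results without decrypting either processing result.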
Here, the homomorphism of the homomorphic encryption algorithm is utilized: computing on plaintexts and then encrypting is equivalent to encrypting first and then performing the corresponding computation on the ciphertexts. For example, encrypting v1 and v2 with the same public key pk yields E_pk(v1) and E_pk(v2); if:
E_pk(v1) ⊕ E_pk(v2) = E_pk(v1 + v2),
then the homomorphic encryption algorithm is considered to satisfy additive homomorphism, where ⊕ is the corresponding homomorphic operation. In practice, the ⊕ operation may correspond to conventional addition, multiplication, etc.
As another example, encrypting v1 and v2 with the same public key pk yields E_pk(v1) and E_pk(v2); if:
E_pk(v1) ⊗ E_pk(v2) = E_pk(v1 · v2),
then the homomorphic encryption algorithm is considered to satisfy multiplicative homomorphism, where ⊗ is the corresponding homomorphic multiplication operation.
By utilizing the homomorphism, homomorphism addition operation and homomorphism multiplication operation are carried out in the formula (7) above, and a loss value ciphertext [ L ] PK is obtained.
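The additive homomorphism above can be made concrete with a toy Paillier cryptosystem, a standard additively homomorphic scheme. The patent does not name a specific algorithm, so this is only one possible instantiation; the tiny primes are for illustration and are completely insecure.

```python
import random
from math import gcd

# Textbook Paillier with toy parameters: the product of two ciphertexts
# decrypts to the sum of the two plaintexts (additive homomorphism).

def keygen(p, q):
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1                                       # standard simple generator
    l_val = (pow(g, lam, n * n) - 1) // n           # L(u) = (u - 1) / n
    mu = pow(l_val, -1, n)                          # modular inverse (Python 3.8+)
    return (n, g), (lam, mu, n)

def encrypt(pk, m):
    n, g = pk
    n2 = n * n
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    lam, mu, n = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def he_add(pk, c1, c2):
    n, _ = pk
    return (c1 * c2) % (n * n)   # ciphertext "addition": [v1]PK ⊕ [v2]PK

pk, sk = keygen(10007, 10009)    # toy primes, illustration only
c_sum = he_add(pk, encrypt(pk, 123), encrypt(pk, 456))
assert decrypt(sk, c_sum) == 579
```

This mirrors the protocol's pattern: both parties encrypt under the controller's public key PK, combine ciphertexts homomorphically, and only the controller, holding SK, can decrypt.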
Step 12: based on the loss value ciphertext [ L ] PK and the first parameter portion WA, a first gradient ciphertext [ GA ] PK is determined. In the case of using the maximum likelihood probability and the random gradient descent method, the specific method for determining the first gradient ciphertext [ GA ] PK may be: the partial derivative of the loss value ciphertext [ L ] PK is determined using the first parameter part WA, and the resulting partial derivative is determined as the first gradient ciphertext [ GA ] PK.
It will be appreciated that the resulting gradient (in plain text) on the first party a side can be represented by the following equation (8):
The resulting gradient (plaintext) on the second party B side can be represented by the following formula (9):
after application of the homomorphic encryption operator, i.e., the first gradient ciphertext [ GA ] PK, may be represented by the following equation (10):
The second gradient ciphertext [ GB ] PK may be represented by the following equation (11):
By utilizing the homomorphism, homomorphism addition operation and homomorphism multiplication operation are carried out in the formulas (10) and (11) above, and a first gradient ciphertext [ GA ] PK and a second gradient ciphertext [ GB ] PK are obtained.
S250: adding a first noise ciphertext [ epsilon 1] PK for encrypting the first noise epsilon 1 to the first gradient ciphertext [ GA PK to obtain first encrypted noise-added data [ GA ] PK+[ε1]PK; the first encrypted noise plus data [ GA ] PK+[ε1]PK is sent to the controller C.
In this step, in order to avoid leakage of the first gradient on the controller C side, the first party A adds noise to the first gradient ciphertext [GA]PK before sending it to the controller C, so that the controller C cannot learn the real gradient after decryption, thereby protecting the gradient of the first party A. In one implementation, after obtaining the first gradient ciphertext [GA]PK, the first party A generates the first noise epsilon 1, homomorphically encrypts it with the target public key to obtain a first noise ciphertext [epsilon 1]PK, and adds the first noise ciphertext to the first gradient ciphertext; that is, it obtains the first encrypted noise-added data [GA]PK+[epsilon 1]PK through homomorphic addition of [epsilon 1]PK and [GA]PK, and then sends the first encrypted noise-added data [GA]PK+[epsilon 1]PK to the controller C.
After receiving the first encrypted noise-added data [GA]PK+[epsilon 1]PK, the controller C decrypts it with the target private key to obtain first noise-added data GA+epsilon 1, and sends the first noise-added data GA+epsilon 1 to the first party A. Accordingly, the first party A performs the subsequent step S260.
S260: and receiving first noise adding data GA+epsilon 1 after decrypting the first encryption noise adding data [ GA ] PK+[ε1]PK from the controller C, and removing the first noise epsilon 1 from the first noise adding data GA+epsilon 1 to obtain a first gradient plaintext GA.
S270: the first parameter part WA is updated according to the first gradient plaintext GA.
After removing the first noise epsilon 1 from the first noise added data GA+epsilon 1 to obtain a first gradient plaintext GA, the first party A determines an updated value of the first parameter part WA by using the first gradient plaintext GA and the current value of the first parameter part WA, and updates the current value of the first parameter part WA into the updated value to realize the update of the first parameter part WA. The above procedure of determining the updated value of the first parameter portion WA aims at minimizing the loss value plaintext.
Similarly, after receiving the second encrypted and noisy data [ GB ] PK+[ε2]PK sent by the second party B, the controller C decrypts the second encrypted and noisy data [ GB ] PK+[ε2]PK by using the target private key to obtain second noisy data GB+ε2, and sends the second noisy data GB+ε2 to the second party B. The second party B receives second noise adding data GB+epsilon 2, removes second noise epsilon 2 from the second noise adding data GB+epsilon 2, obtains a second gradient plaintext GB, and updates a second parameter part WB according to the second gradient plaintext GB.
It will be appreciated that the first noise and the second noise are both randomly generated noise, which may be the same or different.
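The masking round trip of steps S250-S270 can be sketched in plaintext arithmetic; the homomorphic encryption layer is elided so that only the noise algebra is visible, and the learning rate in the final update is an assumption of this sketch (the patent only says the update minimizes the loss).

```python
import random

def add_noise(gradient, noise):
    # party side: corresponds to [GA]PK + [epsilon 1]PK before sending to C
    return [g + e for g, e in zip(gradient, noise)]

def remove_noise(noisy, noise):
    # party side: (GA + epsilon 1) - epsilon 1 after the controller decrypts
    return [g - e for g, e in zip(noisy, noise)]

GA = [0.5, -1.2, 0.3]                           # first gradient, secret of party A
eps1 = [random.uniform(-100, 100) for _ in GA]  # freshly sampled first noise

seen_by_controller = add_noise(GA, eps1)        # controller only ever sees GA + eps1
recovered = remove_noise(seen_by_controller, eps1)
assert all(abs(g - r) < 1e-9 for g, r in zip(GA, recovered))

# S270: update the first parameter part with the recovered gradient plaintext
lr = 0.1                                        # assumed learning rate
WA = [0.2, 0.8, -0.5]
WA = [w - lr * g for w, g in zip(WA, recovered)]
```

Since the noise is sampled fresh each round, the controller's view GA+epsilon 1 reveals nothing about GA itself.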
Steps S210 to S270 constitute one iteration of model training. The above procedure may be performed over multiple iterations to train a better graph neural network; that is, after the first parameter part WA is updated in step S270, the process returns to step S210 for the next iteration.
The stopping condition of the iterative training process may include: the number of training iterations reaches a preset threshold, the training time reaches a preset duration, or the loss value is smaller than a set loss threshold, etc.
In this embodiment, throughout the training of the graph neural network, the parties never exchange data in plaintext: all communicated data are either encrypted or noise-added, which ensures that private data are not revealed during joint training and enhances data security. To support homomorphic operation, the loss function takes the form of an orthogonal polynomial that approximates the activation function in the graph neural network, turning the nonlinear computation into a linear one; approximating the activation function with an orthogonal polynomial also preserves, to a certain extent, the accuracy of the gradient ciphertext.
In this embodiment, the graph neural network is jointly trained using feature parts of different dimensions of the sample objects stored by the first party and the second party, thereby implementing vertical federated training of the graph neural network. In the case where the graph neural network is a graph convolution network based on the GraphSAGE algorithm, the algorithm on which the method provided by this embodiment relies may be referred to as the FedVGraphSAGE algorithm.
Looking back at the execution of steps S210 to S270, the above embodiment is described by taking a single sample object as an example. In another embodiment, steps S210 to S260 may be performed on a batch of samples, that is, a plurality of sample objects, to obtain a first gradient plaintext corresponding to each sample object; an average gradient plaintext is then determined from the first gradient plaintexts of the plurality of sample objects, and the first parameter part is adjusted based on this average gradient plaintext. This reduces the number of adjustments to the first parameter part and makes the training process easier to carry out.
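The mini-batch variant can be sketched as plain element-wise averaging of the per-sample gradient plaintexts, followed by a single parameter adjustment (names are illustrative):

```python
def average_gradient(per_sample_grads):
    """Average the per-sample first gradient plaintexts of one mini-batch."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[j] for g in per_sample_grads) / n for j in range(dim)]

# One first gradient plaintext per sample object in the batch (illustrative values)
batch_grads = [[0.2, -0.4], [0.6, 0.0], [0.4, -0.2]]
avg = average_gradient(batch_grads)   # a single update uses this average
```

One update per batch replaces one update per sample, which is what reduces the number of adjustments to the first parameter part.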
In one implementation, the first party A initially stores part of the original features of each sample object, called the first original feature part, which includes but is not limited to part of the attribute information of the sample object and information on its associations with adjacent sample objects; the other sample objects adjacent to a sample object may be called its neighbor objects. Other sample objects connected with a sample object through one or more edges in the relational network graph are its neighbor objects: another sample object connected to the sample object through one edge is a one-hop neighbor object; another sample object connected through two edges (with one other sample object in between) is a two-hop neighbor object, and so on.
The first feature portion XA is a feature vector embedding of the sample object. The first feature portion XA may be obtained by aggregation of the original feature portion of the sample object itself and the original feature portions of its neighboring objects. Specifically, the first party a may first determine, for each sample object, based on the relational network graph, a neighbor object that participates in calculation and corresponds to the sample object, and then aggregate the first original feature portion of the neighbor object of the sample object and the first original feature portion of the sample object to obtain a first feature portion XA of the sample object. The process of determining the second feature portion XB of each sample object by the second party B using the second original feature of each sample object stored in the second party B may refer to the process of determining the first feature portion XA of each sample object by the first party a, which will not be described herein.
It will be appreciated that the first party a and the second party B store original features of different dimensions of the sample object. The first party a and the second party B may each determine a respective feature portion based on the original features of their stored sample objects. For example, the first party a stores a first original feature portion of the sample object S comprising: the original feature S1, the original feature S2, and the original feature S3, and the first original feature portion of the neighbor object Si corresponding to the sample object S includes: original feature Si1, original feature Si2, and original feature Si3. The second party B stores a second original characteristic portion of the sample object S comprising: the original feature S4 and the original feature S5, and the second original feature portion of the neighbor object Si corresponding to the sample object S includes: original feature Si4 and original feature Si5. Where Si represents the ith neighbor object of the sample object S.
Accordingly, the first party a obtains the first feature part XA of the sample object S based on the original feature S1, the original feature S2, and the original feature S3 of the sample object S, and the original feature Si1, the original feature Si2, and the original feature Si3 of the neighbor object Si. The second party B obtains a second characteristic part XB of the sample object S based on the original characteristics S4 and the original characteristics S5 of the sample object S and the original characteristics Si4 and the original characteristics Si5 of the neighbor object Si by aggregation.
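The aggregation of a sample object's original features with those of its neighbors can be sketched as follows. Mean aggregation is assumed here; GraphSAGE supports several aggregators and the patent does not fix one, so this is only one possible choice.

```python
def aggregate_features(self_feat, neighbor_feats):
    """Mean-aggregate a node's own original feature part with its neighbors'.

    Mean aggregation is an assumption of this sketch; other aggregators
    (sum, max-pooling, ...) would fit the same interface."""
    rows = [self_feat] + neighbor_feats
    dim = len(self_feat)
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

# Party A's side: original features (s1, s2, s3) of sample S and of one
# neighbor Si produce the first feature part XA of S
XA_S = aggregate_features([1.0, 2.0, 3.0], [[3.0, 4.0, 5.0]])
```

Party B performs the same aggregation independently over its own feature dimensions (s4, s5), yielding the second feature part XB.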
In one embodiment, the method further comprises the step 21 of: the loss value ciphertext [ L ] PK is sent to the controller C;
Step S260 then includes: receiving the first noise-added data GA+epsilon 1 sent by the controller C in the case where the loss value plaintext L corresponding to the loss value ciphertext [L]PK is not lower than a preset loss threshold.
In the present embodiment, after determining the loss value ciphertext [ L ] PK, the first party a transmits the loss value ciphertext [ L ] PK to the controller C. The controller C decrypts the loss value ciphertext [ L ] PK by using the target private key SK to obtain a loss value plaintext L. Furthermore, the controller C determines whether the loss value plaintext L is not lower than a preset loss threshold, and if it is determined that the loss value plaintext L is not lower than the preset loss threshold, it may determine that the neural network does not reach a convergence state, and correspondingly, sends the first noise-added data ga+epsilon1 obtained by decryption to the first party a. The first party A receives the first noise adding data GA+epsilon 1, and further removes the first noise epsilon 1 from the first noise adding data GA+epsilon 1 to obtain a first gradient plaintext GA; and updates the first parameter part WA based on the first gradient plaintext GA.
And under the condition that the controller C judges that the loss value plaintext L is not lower than a preset loss threshold value, second noise-added data obtained by decrypting the second encryption noise-added data is also sent to a second party B. The second party B receives the second noise adding data, and further removes second noise from the second noise adding data to obtain a second gradient plaintext; and updates the second parameter portion WB based on the second gradient plaintext.
In another implementation, the controller C may determine that the graph neural network reaches a convergence state when the loss value plaintext L is determined to be lower than the preset loss threshold, so that it may determine that the graph neural network has been trained, and correspondingly, the controller C may send information indicating that the model training is completed to the first party a and the second party B, so that the first party a and the second party B determine that the graph neural network joint training is completed.
Corresponding to the above method embodiment, the embodiment of the present disclosure further provides a method of multi-party joint training of a graph neural network that protects private data, which, as shown in fig. 3, may include:
The first party 310 processes the first characteristic part of the sample object by using the first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
A second party 320 processes a second characteristic part of the sample object by using a second parameter part of the graph neural network to obtain a second processing result; homomorphic encryption is performed on the second processing result by using the target public key, so as to obtain a second encryption result, and the second encryption result is sent to the first party 310;
the first party 310 obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adds, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting the first noise, to obtain first encrypted noise-added data; and sends the first encrypted noise-added data to the controller 330; wherein the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
After receiving the first encrypted and noisy data, the controller 330 decrypts the first encrypted and noisy data by using a target private key corresponding to the target public key to obtain first noisy data; transmitting the first noisy data to the first party 310;
the first party 310 receives the first noisy data, removes the first noise from the first noisy data, obtains a first gradient plaintext, and updates the first parameter portion with the first gradient plaintext.
In one implementation manner, the sample object is a node, or an edge, in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, and items; the edges represent the association between the business objects.
In an embodiment, the first party 310 is specifically configured to multiply the first parameter portion and the first feature portion in a process of obtaining the first processing result, so as to obtain the first processing result.
In one implementation manner, before the first processing result is obtained, the first party 310 is further configured to aggregate to obtain a first feature portion by using the first original feature portion of the neighbor object of the sample object and the first original feature portion of the sample object;
in one implementation manner, in the process that the first party 310 obtains the first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result and the second encryption result and a preset loss function, the first party is specifically configured to determine a loss value ciphertext through homomorphic operation based on the first encryption result and the second encryption result and a preset loss function;
the first gradient ciphertext is determined based on the loss value ciphertext and the first parameter portion.
In one embodiment, the first party 310 is further configured to send the loss value ciphertext to the controller;
correspondingly, the controller 330 is further configured to decrypt the loss value ciphertext by using the target private key to obtain a loss value plaintext; judge whether the loss value plaintext is not lower than a preset loss threshold; and send the first noise-added data to the first party 310 in the case that the loss value plaintext is not lower than the preset loss threshold.
In one embodiment, the orthogonal polynomial is a second-order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial with the sum of the first and second processing results as a variable.
In one embodiment, the first party 310 is further configured to receive the target public key from the controller and store the target public key.
In this embodiment, throughout the training of the graph neural network, the parties never exchange data in plaintext: all communicated data are either encrypted or noise-added, which ensures that private data are not revealed during joint training and enhances data security. To support homomorphic operation, the loss function takes the form of an orthogonal polynomial that approximates the activation function in the graph neural network, turning the nonlinear computation into a linear one; approximating the activation function with an orthogonal polynomial also preserves, to a certain extent, the accuracy of the gradient ciphertext.
The foregoing describes certain embodiments of the present disclosure, other embodiments being within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. Furthermore, the processes depicted in the accompanying figures are not necessarily required to achieve the desired result in the particular order shown, or in a sequential order. In some embodiments, multitasking and parallel processing are also possible, or may be advantageous.
Corresponding to the above method embodiments, the present disclosure provides an apparatus 400 for protecting private data in a multi-party joint training graph neural network, a schematic block diagram of which is shown in fig. 4, where the multi-party includes a first party, a second party and a controller, and the apparatus is applied to the first party, and the apparatus includes:
a first processing module 410 configured to process a first feature portion of a sample object using a first parameter portion of the graph neural network to obtain a first processing result;
The homomorphic encryption module 420 is configured to homomorphic encrypt the first processing result by using the target public key of the controller, so as to obtain a first encrypted result;
A first receiving module 430 configured to receive a second encryption result from a second party, where the second encryption result is obtained by the second party homomorphic encrypting a second processing result by using the target public key, and the second processing result is obtained by the second party processing a second characteristic portion of the sample object by using a second parameter portion of the graph neural network;
The gradient ciphertext determining module 440 is configured to obtain a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, wherein the loss function adopts a form of an orthogonal polynomial approaching an activation function in the graph neural network;
an adding and transmitting module 450, configured to add, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noise-added data; and to transmit the first encrypted noise-added data to the controller;
a second receiving module 460 configured to receive, from the controller, first noise-added data after the first encrypted noise-added data is decrypted, and remove the first noise from the first noise-added data, to obtain a first gradient plaintext;
an updating module 470 is configured to update the first parameter portion according to the first gradient plaintext.
In one implementation manner, the sample object is a node, or an edge, in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, and items; the edges represent the association between the business objects.
In one embodiment, the first processing module is specifically configured to multiply the first parameter portion and the first feature portion to obtain the first processing result.
In one embodiment, the method further comprises:
An aggregation module (not shown in the figure) configured to aggregate, before the first processing result is obtained, a first feature portion by using a first original feature portion of a neighbor object of the sample object and the first original feature portion of the sample object;
In one embodiment, the gradient ciphertext determination module 440 is specifically configured to determine a loss value ciphertext by homomorphic operation based on the first encryption result and the second encryption result, and a preset loss function;
the first gradient ciphertext is determined based on the loss value ciphertext and the first parameter portion.
In one embodiment, the apparatus further comprises:
a transmission module (not shown) configured to transmit the loss value ciphertext to the controller;
The second receiving module 460 is specifically configured to receive the first noise-added data sent by the controller when it is determined that a loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
In one embodiment, the orthogonal polynomial is a second-order orthogonal polynomial.
In one embodiment, the loss function is defined as an orthogonal polynomial whose variable is the sum of the first processing result and the second processing result.
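A concrete instance of such a polynomial substitution (an illustrative assumption; the embodiment does not name the logistic loss specifically) is the second-order expansion of the logistic loss log(1 + e^(-z)) around z = 0, which replaces the non-polynomial term with log 2 - z/2 + z²/8:

```python
import numpy as np

def logistic_loss(z):
    # exact loss log(1 + exp(-z)), evaluated in a numerically stable form
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(-z, 0.0)

def quadratic_loss(z):
    # second-order polynomial stand-in: log 2 - z/2 + z^2 / 8
    return np.log(2.0) - z / 2.0 + z**2 / 8.0

# the approximation stays close while |z| is moderate
z = np.linspace(-2.0, 2.0, 401)
max_err = float(np.max(np.abs(logistic_loss(z) - quadratic_loss(z))))
```

The payoff is that the derivative of the quadratic stand-in, -1/2 + z/4, is linear in z; when z is the sum of the two parties' processing results, the gradient ciphertext can be evaluated with only the additions and scalar multiplications that an additively homomorphic scheme supports.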
In one embodiment, the apparatus further comprises:
a receiving and storing module (not shown in the figure) configured to receive the target public key from the controller and store it.
Corresponding to the above method embodiments, the present specification provides a system 500 for protecting private data in multi-party joint training of a graph neural network. The system comprises a first party 510, a second party 520 and a controller 530; a schematic block diagram of the system is shown in fig. 5.
The first party 510 is configured to process a first feature portion of the sample object by using a first parameter portion of the graph neural network to obtain a first processing result, and to homomorphically encrypt the first processing result by using a target public key of the controller to obtain a first encryption result;
The second party 520 is configured to process a second feature portion of the sample object by using a second parameter portion of the graph neural network to obtain a second processing result, to homomorphically encrypt the second processing result by using the target public key to obtain a second encryption result, and to send the second encryption result to the first party 510;
The first party 510 is further configured to obtain, through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, a first gradient ciphertext corresponding to the first parameter portion; to add, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, so as to obtain first encrypted noisy data; and to transmit the first encrypted noisy data to the controller 530; the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;
The controller 530 is configured to, after receiving the first encrypted noisy data, decrypt the first encrypted noisy data by using a target private key corresponding to the target public key to obtain first noisy data, and to transmit the first noisy data to the first party 510;
the first party 510 is further configured to receive the first noisy data, remove the first noise from the first noisy data, obtain a first gradient plaintext, and update the first parameter portion with the first gradient plaintext.
In one embodiment, the controller 530 is further configured to obtain second encrypted noisy data sent by the second party 520, decrypt it by using the target private key to obtain second noisy data, and transmit the second noisy data to the second party 520. The second encrypted noisy data is obtained by the second party 520 adding, to a second gradient ciphertext, a second noise ciphertext obtained by encrypting a second noise; the second gradient ciphertext is obtained by the second party 520 through homomorphic operation based on the first encryption result, the second encryption result, a preset loss function and the second parameter portion;
The second party 520 is further configured to receive the second noisy data, remove the second noise from the second noisy data to obtain a second gradient plaintext, and update the second parameter portion according to the second gradient plaintext.
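The noise-masking exchange above relies only on additive homomorphism: E(g)·E(n) = E(g+n), so the controller decrypts only the masked value g+n and never sees the gradient g, while each party removes its own noise locally. The sketch below shows the round trip with a toy Paillier cryptosystem (illustrative only: the primes are tiny teaching values rather than a secure key size, the gradient is a toy integer, and the patent does not commit to Paillier specifically).

```python
import random
from math import gcd

class Paillier:
    """Toy additively homomorphic Paillier scheme (insecure parameters)."""

    def __init__(self, p=293, q=433):
        self.n = p * q
        self.n2 = self.n * self.n
        self.g = self.n + 1
        self.lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
        # mu = (L(g^lam mod n^2))^-1 mod n, with L(x) = (x - 1) // n
        self.mu = pow(self._L(pow(self.g, self.lam, self.n2)), -1, self.n)

    def _L(self, x):
        return (x - 1) // self.n

    def encrypt(self, m):
        r = random.randrange(2, self.n)
        while gcd(r, self.n) != 1:
            r = random.randrange(2, self.n)
        return (pow(self.g, m % self.n, self.n2) * pow(r, self.n, self.n2)) % self.n2

    def decrypt(self, c):
        return (self._L(pow(c, self.lam, self.n2)) * self.mu) % self.n

    def add(self, c1, c2):
        # E(a) * E(b) mod n^2  ->  E(a + b)
        return (c1 * c2) % self.n2

    def scalar_mul(self, c, k):
        # E(a)^k mod n^2 -> E(k * a): how a plaintext feature value can
        # scale an encrypted loss derivative into a gradient ciphertext
        return pow(c, k, self.n2)

ctrl = Paillier()                 # controller holds the decryption values
grad, noise = 42, 17              # party's gradient value and private noise

c_grad = ctrl.encrypt(grad)                        # gradient ciphertext
c_masked = ctrl.add(c_grad, ctrl.encrypt(noise))   # encrypted noise-added data
noisy = ctrl.decrypt(c_masked)                     # controller sees only grad + noise
recovered = (noisy - noise) % ctrl.n               # party strips its own noise
```

In the protocol the encryptions are performed by the parties with the controller's public values `n` and `g`; only `decrypt` requires the controller's private `lam` and `mu`.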
The foregoing apparatus and system embodiments correspond to the method embodiments; for specific details, refer to the description of the method embodiments, which is not repeated here. Being derived from the corresponding method embodiments, the apparatus and system embodiments achieve the same technical effects as those method embodiments.
The embodiments of the present specification also provide a computer-readable storage medium having a computer program stored thereon which, when executed in a computer, causes the computer to perform the method, provided in the present specification, for protecting private data in multi-party joint training of a graph neural network.
The embodiments of the present specification also provide a computing device comprising a memory and a processor, wherein executable code is stored in the memory, and the processor, when executing the executable code, implements the method, provided in the present specification, for protecting private data in multi-party joint training of a graph neural network.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to mutually, and each embodiment focuses on its differences from the others. In particular, the storage medium and computing device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the embodiments of the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The foregoing detailed description further elaborates the objects, technical solutions and advantageous effects of the embodiments of the present invention. It should be understood that the foregoing is merely a description of specific embodiments and is not intended to limit the scope of the present invention; any modification, equivalent substitution or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (21)

1. A method of protecting private data in a multi-party joint training graph neural network, the multi-party comprising a first party, a second party, and a controller, the method performed by the first party comprising:
Processing a first characteristic part of a sample object by using a first parameter part of the graph neural network to obtain a first processing result;
Homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
Receiving a second encryption result from a second party, wherein the second encryption result is obtained by homomorphic encryption of a second processing result by the second party by using the target public key, and the second processing result is obtained by the second party by using a second parameter part of the graph neural network to process a second characteristic part of the sample object;
obtaining, through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, a first gradient ciphertext corresponding to the first parameter part, wherein the loss function takes the form of an orthogonal polynomial approximating an activation function in the graph neural network;

adding, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noise-added data; transmitting the first encrypted noise-added data to the controller;

receiving, from the controller, first noise-added data obtained after the first encrypted noise-added data is decrypted, and removing the first noise from the first noise-added data to obtain a first gradient plaintext;
and updating the first parameter part according to the first gradient plaintext.
2. The method of claim 1, wherein the sample object is a node, or an edge, in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, and items; the edges represent the association between the business objects.
3. The method of claim 1, wherein the obtaining the first processing result comprises:
and multiplying the first parameter part and the first characteristic part to obtain the first processing result.
4. The method of claim 1, further comprising, prior to said obtaining the first processing result:
obtaining the first characteristic part by aggregating a first original characteristic part of a neighbor object of the sample object with a first original characteristic part of the sample object.
5. The method of claim 1, wherein the obtaining, by homomorphic operation, the first gradient ciphertext corresponding to the first parameter portion based on the first encryption result and the second encryption result, and a preset loss function, includes:
Determining a loss value ciphertext through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function;
the first gradient ciphertext is determined based on the loss value ciphertext and the first parameter portion.
6. The method of claim 5, further comprising:
sending the loss value ciphertext to the controller;
the receiving, from the controller, of first noise-added data obtained after the first encrypted noise-added data is decrypted comprises:

receiving the first noise-added data sent by the controller when it is determined that a loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
7. The method of claim 1, wherein the orthogonal polynomial is a second-order orthogonal polynomial.
8. The method of claim 1, wherein the loss function is defined as an orthogonal polynomial whose variable is the sum of the first processing result and the second processing result.
9. The method of claim 1, further comprising:
The target public key is received from the controller and stored.
10. A method of protecting private data in a multi-party joint training graph neural network, wherein the method comprises:
The first party processes a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
The second party processes a second characteristic part of the sample object by using a second parameter part of the graph neural network to obtain a second processing result; homomorphic encryption is carried out on the second processing result by utilizing the target public key, so that a second encryption result is obtained and sent to the first party;
The first party obtains a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext for encrypting the first noise on the first gradient ciphertext to obtain first encrypted and noisy data; transmitting the first encrypted noise-added data to the controller; wherein the loss function is in the form of an orthogonal polynomial approximating an activation function in the graph neural network;
After the controller receives the first encrypted and noisy data, decrypting the first encrypted and noisy data by using a target private key corresponding to the target public key to obtain first noisy data; transmitting the first noisy data to the first party;
The first party receives the first noise adding data, removes the first noise from the first noise adding data, obtains a first gradient plaintext, and updates the first parameter portion by using the first gradient plaintext.
11. An apparatus for protecting private data in a multi-party joint training graph neural network, the multi-party including a first party, a second party, and a controller, the apparatus deployed at the first party, the apparatus comprising:
The first processing module is configured to process a first characteristic part of the sample object by utilizing the first parameter part of the graph neural network to obtain a first processing result;
the homomorphic encryption module is configured to homomorphic encrypt the first processing result by utilizing the target public key of the controller to obtain a first encryption result;
the first receiving module is configured to receive a second encryption result from a second party, wherein the second encryption result is obtained by homomorphic encryption of a second processing result by the second party by using the target public key, and the second processing result is obtained by the second party by processing a second characteristic part of the sample object by using a second parameter part of the graph neural network;
the gradient ciphertext determining module is configured to obtain a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function, wherein the loss function is in a form of an orthogonal polynomial approaching an activation function in the graph neural network;
The adding and transmitting module is configured to add, to the first gradient ciphertext, a first noise ciphertext obtained by encrypting a first noise, to obtain first encrypted noise-added data, and to transmit the first encrypted noise-added data to the controller;
The second receiving module is configured to receive, from the controller, first noise-added data obtained after the first encrypted noise-added data is decrypted, and to remove the first noise from the first noise-added data to obtain a first gradient plaintext;
And the updating module is configured to update the first parameter part according to the first gradient plaintext.
12. The apparatus of claim 11, wherein the sample object is a node, or an edge, in a relational network graph corresponding to the graph neural network; the node represents one of the following business objects: users, merchants, and items; the edges represent the association between the business objects.
13. The apparatus of claim 11, wherein the first processing module is specifically configured to multiply the first parameter portion and the first feature portion to obtain the first processing result.
14. The apparatus of claim 11, further comprising:
and the aggregation module is configured to obtain, before the first processing result is obtained, the first characteristic part by aggregating a first original characteristic part of a neighbor object of the sample object with a first original characteristic part of the sample object.
15. The apparatus of claim 11, wherein the gradient ciphertext determination module is specifically configured to determine a loss value ciphertext via homomorphic operation based on the first encryption result and the second encryption result, and a preset loss function;
the first gradient ciphertext is determined based on the loss value ciphertext and the first parameter portion.
16. The apparatus of claim 15, further comprising:
the sending module is configured to send the loss value ciphertext to the controller;
The second receiving module is specifically configured to receive the first noise-added data sent by the controller when it is determined that a loss value plaintext corresponding to the loss value ciphertext is not lower than a preset loss threshold.
17. The apparatus of claim 11, wherein the orthogonal polynomial is a second-order orthogonal polynomial.
18. The apparatus of claim 11, wherein the loss function is defined as an orthogonal polynomial whose variable is the sum of the first processing result and the second processing result.
19. The apparatus of claim 11, further comprising:
And a receiving and storing module configured to receive the target public key from the controller and store the target public key.
20. A system for protecting private data in a multi-party joint training graph neural network, the system comprising a first party, a second party, and a controller, wherein,
The first party is used for processing a first characteristic part of the sample object by utilizing a first parameter part of the graph neural network to obtain a first processing result; homomorphic encryption is carried out on the first processing result by utilizing a target public key of the controller, so that a first encryption result is obtained;
The second party is used for processing a second characteristic part of the sample object by utilizing a second parameter part of the graph neural network to obtain a second processing result; homomorphic encryption is carried out on the second processing result by utilizing the target public key, so that a second encryption result is obtained and sent to the first party;
the first party is further used for obtaining a first gradient ciphertext corresponding to the first parameter part through homomorphic operation based on the first encryption result, the second encryption result and a preset loss function; adding a first noise ciphertext for encrypting the first noise on the first gradient ciphertext to obtain first encrypted and noisy data; transmitting the first encrypted noise-added data to the controller; the loss function adopts a form of an orthogonal polynomial which approximates an activation function in the graph neural network;
the controller is used for decrypting the first encrypted and noisy data by utilizing a target private key corresponding to the target public key after receiving the first encrypted and noisy data to obtain first noisy data; transmitting the first noisy data to the first party;
the first party is also used for receiving the first noise adding data, removing the first noise from the first noise adding data to obtain a first gradient plaintext, and updating the first parameter part by using the first gradient plaintext.
21. A computing device comprising a memory and a processor, wherein the memory has executable code stored therein, which when executed by the processor, implements the method of any of claims 1-9.
CN202111297665.XA 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network Active CN114091651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111297665.XA CN114091651B (en) 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111297665.XA CN114091651B (en) 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network

Publications (2)

Publication Number Publication Date
CN114091651A CN114091651A (en) 2022-02-25
CN114091651B true CN114091651B (en) 2024-05-24

Family

ID=80298829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111297665.XA Active CN114091651B (en) 2021-11-03 2021-11-03 Method, device and system for multi-party combined training of graph neural network

Country Status (1)

Country Link
CN (1) CN114091651B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160573A (en) * 2020-04-01 2020-05-15 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning
CN112966298A (en) * 2021-03-01 2021-06-15 广州大学 Composite privacy protection method, system, computer equipment and storage medium
CN113221153A (en) * 2021-05-31 2021-08-06 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium
CN113505882A (en) * 2021-05-14 2021-10-15 深圳市腾讯计算机系统有限公司 Data processing method based on federal neural network model, related equipment and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11558176B2 (en) * 2017-02-15 2023-01-17 Lg Electronics Inc. Apparatus and method for generating ciphertext data with maintained structure for analytics capability
CN107704877B (en) * 2017-10-09 2020-05-29 哈尔滨工业大学深圳研究生院 Image privacy perception method based on deep learning
EP3767511B1 (en) * 2019-07-19 2021-08-25 Siemens Healthcare GmbH Securely performing parameter data updates

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160573A (en) * 2020-04-01 2020-05-15 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning
CN112966298A (en) * 2021-03-01 2021-06-15 广州大学 Composite privacy protection method, system, computer equipment and storage medium
CN113505882A (en) * 2021-05-14 2021-10-15 深圳市腾讯计算机系统有限公司 Data processing method based on federal neural network model, related equipment and medium
CN113221153A (en) * 2021-05-31 2021-08-06 平安科技(深圳)有限公司 Graph neural network training method and device, computing equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Privacy-Preserving Data Cleaning and Joint Learning over Multiple Data Sources; Yang Ye; China Master's Theses Full-text Database, Information Science and Technology Series; 2020-02-15 (No. 02); full text *

Also Published As

Publication number Publication date
CN114091651A (en) 2022-02-25

Similar Documents

Publication Publication Date Title
WO2021197037A1 (en) Method and apparatus for jointly performing data processing by two parties
CN111160573B (en) Method and device for protecting business prediction model of data privacy joint training by two parties
Li et al. Privacy-preserving-outsourced association rule mining on vertically partitioned databases
CN112989368B (en) Method and device for processing private data by combining multiple parties
Malavolta et al. Silentwhispers: Enforcing security and privacy in decentralized credit networks
Atallah et al. Secure multi-party computational geometry
CN112199702A (en) Privacy protection method, storage medium and system based on federal learning
CN110912713B (en) Method and device for processing model data by multi-party combination
US8130947B2 (en) Privacy preserving social network analysis
EP3709563A1 (en) Secure key agreement with non-trusted devices
CN112541593B (en) Method and device for jointly training business model based on privacy protection
EP2228942A1 (en) Securing communications sent by a first user to a second user
Liu et al. Image encryption technique based on new two-dimensional fractional-order discrete chaotic map and Menezes–Vanstone elliptic curve cryptosystem
CN111371545B (en) Encryption method and system based on privacy protection
CN114866225B (en) Super-threshold multi-party privacy set intersection method based on careless pseudorandom secret sharing
US20180006803A1 (en) Multivariate Signature Method for Resisting Key Recovery Attack
CN112597542B (en) Aggregation method and device of target asset data, storage medium and electronic device
Medwed et al. Unknown-input attacks in the parallel setting: Improving the security of the CHES 2012 leakage-resilient PRF
Battarbee et al. Cryptanalysis of semidirect product key exchange using matrices over non-commutative rings
CN116561787A (en) Training method and device for visual image classification model and electronic equipment
CN112929151B (en) Entity alignment method based on privacy protection and computer storage medium
Kumari et al. Signature based Merkle Hash Multiplication algorithm to secure the communication in IoT devices
CN114091651B (en) Method, device and system for multi-party combined training of graph neural network
Far et al. A privacy-preserving framework for blockchain-based multi-level marketing
CN115580443A (en) Graph data processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant