CN112464292A - Method and device for training neural network based on privacy protection - Google Patents

Method and device for training neural network based on privacy protection

Info

Publication number
CN112464292A
CN112464292A (Application CN202110109491.3A)
Authority
CN
China
Prior art keywords
node
sampling
graph
neighbor
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110109491.3A
Other languages
Chinese (zh)
Other versions
CN112464292B (en)
Inventor
熊涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110109491.3A priority Critical patent/CN112464292B/en
Priority to CN202110957071.0A priority patent/CN113536383B/en
Publication of CN112464292A publication Critical patent/CN112464292A/en
Application granted granted Critical
Publication of CN112464292B publication Critical patent/CN112464292B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 21/6245: Protecting personal data, e.g. for financial or medical purposes (under G06F 21/62, protecting access to data via a platform, e.g. using keys or access control rules)
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus false rejection rate (under G06F 18/24, classification techniques)
    • G06N 3/045: Combinations of networks (under G06N 3/04, neural network architecture, e.g. interconnection topology)
    • G06N 3/084: Backpropagation, e.g. using gradient descent (under G06N 3/08, learning methods)

Abstract

The embodiments of this specification provide a method and a device for training a neural network based on privacy protection. First, an original relationship network graph is obtained, in which any first node has a corresponding neighbor node set. For any second node in that neighbor node set, the node information of the second node, the node information of the first node, and the connection information of the two nodes are input into a multilayer neural network to obtain the matching degree between the second node and the first node. The neighbor node set is then sampled according to the matching degree of each neighbor node, yielding a sampled neighbor node set for the first node. A sparse relationship network graph is formed from the sampled neighbor node sets of the nodes in the original graph, and a graph neural network is trained based on this sparse relationship network graph.

Description

Method and device for training neural network based on privacy protection
Technical Field
One or more embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for training a neural network based on privacy protection.
Background
Relational network graphs have recently become a core area of machine learning. Data mining and machine learning based on relational network graphs are increasingly valuable in many fields. For example, the structure of a social network can be understood by predicting potential connections, fraud detection can be performed based on graph structure, and the consumption behavior of e-commerce users can be understood in order to make real-time recommendations, and so forth.
Meanwhile, people pay increasing attention to privacy. A large amount of information is hidden in a relational network graph, and the various artificial intelligence and machine learning (AI/ML) models that use graph information carry a substantial risk of leaking private data if not properly protected. For example, with the advent of the IoT era, many AI/ML models are developed in the cloud with large-scale graph data and then deployed on devices (mobile phones and other IoT devices) to make real-time decisions. The advantage of this arrangement is that data transmission from the device to the cloud is reduced, which both protects user privacy and lowers transmission cost. However, if the model is stolen, the large-scale graph data used to train the model is also at risk of being leaked.
Therefore, it is desirable to have an improved scheme for more safely and efficiently training a reliable neural network model.
Disclosure of Invention
One or more embodiments of the present specification describe a method and an apparatus for training a graph neural network based on privacy protection, so that the trained graph neural network can better protect the privacy security of graph data information.
According to a first aspect, there is provided a method for training a neural network based on privacy protection, comprising:
acquiring an original relation network graph, wherein the original relation network graph comprises a plurality of nodes, and any first node in the plurality of nodes is provided with a corresponding first neighbor node set;
for any second node in the first neighbor node set, inputting the node information of the second node, the node information of the first node, and the connection information of the second node and the first node into a multilayer neural network to obtain the matching degree of the second node and the first node;
sampling the first neighbor node set according to the matching degree corresponding to each neighbor node in the first neighbor node set to obtain a sampled neighbor node set of the first node;
forming a sparse relation network graph based on the sampling neighbor node sets corresponding to the plurality of nodes respectively;
and training a neural network of the graph based on the sparse relationship network graph.
In one embodiment, sampling the first neighbor node set according to the matching degree specifically includes: normalizing the matching degrees respectively corresponding to the neighbor nodes to obtain corresponding matching probabilities; and sampling each neighbor node according to the matching probability.
In another embodiment, sampling the first neighbor node set according to the matching degree specifically includes: determining a first sampling probability of the second node being sampled according to a first privacy budget and a matching degree of the second node and the first node based on an exponential mechanism of differential privacy; and sampling each neighbor node according to the first sampling probability corresponding to each neighbor node in the first neighbor node set.
Further, in one example, a predetermined number k of samplings may be performed based on the first sampling probability to sample k neighbor nodes from the first set of neighbor nodes as the set of sampled neighbor nodes.
In yet another embodiment, the first sampling probabilities respectively corresponding to the neighbor nodes may be input into a Gumbel-softmax function to obtain second sampling probabilities respectively corresponding to the neighbor nodes; each neighbor node is then sampled according to its second sampling probability.
Further, in one example, a predetermined number k of samplings may be performed according to the second sampling probability, and k neighbor nodes may be sampled from the first neighbor node set as the sampling neighbor node set.
According to one embodiment, a sparse relationship network graph includes labeled nodes having labels; training a neural network based on the sparse relationship network graph, comprising: carrying out graph embedding on the sparse relationship network graph by utilizing the graph neural network to obtain a node embedding vector of the labeling node; determining a corresponding first gradient of the graph neural network according to the node embedding vector and the label; updating the graph neural network according to the first gradient.
In one embodiment, updating the graph neural network according to the first gradient specifically includes: adding noise to the first gradient using a noise mechanism of differential privacy to obtain a first noise gradient; and updating the parameters of the graph neural network according to the first noise gradient.
Further, in one example, the first noise gradient is obtained by: clipping the first gradient based on a preset clipping threshold to obtain a clipped gradient; determining Gaussian noise for realizing differential privacy using a Gaussian distribution determined based on the clipping threshold, where the variance of the Gaussian distribution is positively correlated with the square of the clipping threshold; and superposing the Gaussian noise on the clipped gradient to obtain the first noise gradient.
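As a concrete illustration of the clipping-and-noise step above, here is a minimal pure-Python sketch (the function name and parameterization are illustrative assumptions, not taken from the patent): the gradient is scaled so its L2 norm does not exceed the clipping threshold, and Gaussian noise whose standard deviation is proportional to that threshold is superposed, so the noise variance is positively correlated with the square of the threshold.

```python
import math
import random

def dp_noisy_gradient(grad, clip_threshold, noise_scale, rng=None):
    """Clip a gradient vector by its L2 norm, then superpose Gaussian noise
    whose standard deviation grows with the clipping threshold (so the noise
    variance is positively correlated with the threshold squared)."""
    rng = rng or random.Random(0)
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_threshold / max(norm, 1e-12))
    clipped = [g * scale for g in grad]       # ||clipped|| <= clip_threshold
    sigma = noise_scale * clip_threshold      # std proportional to threshold
    return [c + rng.gauss(0.0, sigma) for c in clipped]
```

With `noise_scale` set to zero the function reduces to pure norm clipping, which makes the clipping behavior easy to check in isolation.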
In one embodiment, performing graph embedding on the sparse relationship network graph to obtain the node embedding vector of the labeled node specifically includes: acquiring the neighbor nodes of the labeled node in the sparse relationship network graph as target neighbor nodes; determining an aggregation weight for each target neighbor node, the aggregation weight being determined based on the matching degree between the labeled node and that target neighbor node; and aggregating the node information of the target neighbor nodes according to the aggregation weights to obtain the node embedding vector of the labeled node.
Further, in an example, determining the aggregation weight of each target neighbor node specifically includes: acquiring sampling probability of each target neighbor node when being sampled, wherein the sampling probability is determined based on the matching degree of each target neighbor node and the label node by using a Gumbel-softmax function; and determining the aggregation weight of each target neighbor node according to the sampling probability.
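The weighted aggregation described above can be sketched as follows. This is an illustrative example under the assumption that the aggregation weights are simply the neighbors' sampling probabilities normalized to sum to one; the helper name `embed_node` is hypothetical.

```python
def embed_node(neighbor_feats, sampling_probs):
    """Aggregate neighbor feature vectors into an embedding for the labeled
    node, weighting each neighbor by its normalized sampling probability."""
    total = sum(sampling_probs)
    weights = [p / total for p in sampling_probs]   # aggregation weights
    dim = len(neighbor_feats[0])
    return [sum(w * feats[d] for w, feats in zip(weights, neighbor_feats))
            for d in range(dim)]
```

With equal sampling probabilities this reduces to the plain mean of the neighbor features, which is the familiar unweighted aggregation baseline.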
According to one embodiment, training a neural network based on the sparse relationship network graph further comprises: determining a second gradient corresponding to the multilayer neural network according to the node embedding vector and the label; updating the multi-layer neural network according to the second gradient.
Further, a first noise may be added to the first gradient in a differential privacy manner, so as to obtain a first noise gradient; updating parameters of the graph neural network according to the first noise gradient; adding a second noise to the second gradient by using a differential privacy mode to obtain a second noise gradient; and updating the parameters of the multilayer neural network according to the second noise gradient.
In various embodiments, the plurality of nodes in the relationship network graph may include at least one of: user nodes, merchant nodes, and item nodes.
According to a second aspect, there is provided an apparatus for training a graph neural network based on privacy protection, comprising:
an original graph obtaining unit, configured to obtain an original relationship network graph, where the original relationship network graph includes a plurality of nodes, and any first node in the plurality of nodes has a corresponding first neighbor node set;
a matching degree obtaining unit configured to input node information of a second node, node information of the first node, and connection information of the second node and the first node to a multilayer neural network for any second node in the first neighbor node set, so as to obtain a matching degree between the second node and the first node;
the sampling unit is configured to sample the first neighbor node set according to the matching degree corresponding to each neighbor node in the first neighbor node set, so as to obtain a sampling neighbor node set of the first node;
a sparse graph forming unit configured to form a sparse relationship network graph based on a sampling neighbor node set corresponding to each of the plurality of nodes;
and the training unit is configured to train the neural network of the graph based on the sparse relationship network graph.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
According to a fourth aspect, there is provided a computing device comprising a memory and a processor, wherein the memory has stored therein executable code, and wherein the processor, when executing the executable code, implements the method of the first aspect.
With the method and device provided by the embodiments of this specification, the neighbor nodes in the original relationship network graph are sampled using the inter-node matching degrees determined by the multilayer neural network, yielding a sparse relationship network graph, and the graph neural network is trained on this sampled sparse graph. Because the sparse graph contains only part of the connecting edges sampled from the original graph, it is difficult to reverse-engineer accurate graph structure information of the original graph from a graph neural network trained in this way, which protects the data privacy of the original relationship network graph. Optionally, a differential privacy mechanism can be introduced in the sampling stage and/or the gradient propagation stage, adding a degree of randomness to the training of the graph neural network. This further strengthens the security of the private data in the original relationship network graph while preserving the basic performance of the graph neural network.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 illustrates an architectural diagram of a training graph neural network in accordance with the concepts of the present technology;
FIG. 2 illustrates a flowchart of a method for training a graph neural network based on privacy protection, according to one embodiment;
FIG. 3 illustrates an exemplary presentation of a relational network diagram;
FIG. 4 illustrates a flow of steps to train a neural network based on a sparse relationship network graph in one embodiment;
FIG. 5 shows a schematic block diagram of a training apparatus of the graph neural network according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
As previously mentioned, graph neural networks are typically trained based on relational network graphs. A relational network graph is a description of relationships between entities in the real world and may be generally represented as a set of nodes representing entities in the real world and a set of edges representing associations between entities in the real world. For example, in a social network, people are entities and relationships or connections between people are edges. The relational network graph is generally formed by collecting and organizing a large amount of user-related data, and therefore, often contains private data of users. For example, a social network graph may include records of social interactions between users, and the like.
After the graph neural network is trained on a particular relationship network graph, it can be used for prediction tasks related to the nodes and/or edges in the graph, e.g., predicting the relationship between nodes or the classification of a node. Because the training is based on the relational network graph, the graph neural network carries information about that graph. Therefore, if the model parameters of the graph neural network are not properly protected and are leaked or stolen, the relational network graph data used for training is correspondingly at risk of leakage.
Based on the above considerations, the inventor proposes, in several embodiments of this specification, to sample and sparsify the original relationship network graph and to train the graph neural network on the sparsified relationship network graph, thereby protecting the privacy of the original graph.
Fig. 1 shows an architectural diagram of a training graph neural network according to the technical concept of the present specification. As shown in fig. 1, an original relationship network diagram 100 is first obtained. The original relationship network graph 100 may be a network graph reflecting various associations. In general, the original relational network graph is a dense graph in which most nodes have a large number of connecting edges, for example, several tens of connecting edges. Nodes connected by connecting edges may be referred to as neighbor nodes. Each node then has a corresponding set of neighbor nodes.
For each node in the original relationship network graph 100, the node and its neighbor nodes are input into the multilayer neural network 10, which predicts the degree of match between the two input nodes. The connecting edges can then be sampled according to the matching degree between their endpoint nodes; equivalently, the neighbor nodes in each neighbor node set are sampled. Through sampling, only part of the connecting edges are retained, so the original relational network graph is sparsified into the sparse relationship network graph 200.
The sparse relationship network graph 200 may then be input into the graph neural network 20 for training the graph neural network 20. Since the sparse relationship network graph 200 only contains partial information in the original relationship network graph, it is difficult to reversely deduce accurate graph structure information in the original relationship network graph through the graph neural network trained in this way, thereby protecting the data privacy of the original relationship network graph. Moreover, through the thinning, the training speed of the graph neural network can be increased, and the obtained graph neural network has stronger robustness.
Further, a differential privacy mechanism can be introduced into the training process. For example, in the sampling stage, the sampling probability of a connecting edge can be determined through the exponential mechanism of differential privacy, introducing a degree of randomness into the sampling process. In addition, while training the graph neural network on the sparse graph, noise can be introduced into the back-propagated gradients through a noise mechanism of differential privacy, introducing randomness into the determination of the model parameters. With a differential privacy mechanism in place, it is difficult to infer the graph information of the original relationship network graph from the graph neural network while its basic performance is preserved, further protecting the security of the private data.
The following describes a specific implementation of the above concept.
FIG. 2 illustrates a flowchart of a method for training a graph neural network based on privacy protection, according to one embodiment. It is to be appreciated that the method can be performed by any apparatus, device, platform, or device cluster having computing and processing capabilities. The training process of the privacy-protection-based graph neural network is described below with reference to the implementation architecture shown in FIG. 1 and the method flow shown in FIG. 2.
As shown in fig. 2, first, in step 21, an original relationship network diagram is obtained.
In different embodiments, the original relationship network graph may be a network graph reflecting various kinds of associations. For example, in one example, the original relationship network graph is a social relationship graph containing a number of nodes, each node representing a user; connecting edges between nodes represent social contact between the two corresponding users, such as calls, short messages, and other social interactions. In another example, the original relationship network graph is a heterogeneous graph reflecting user behavior habits. Such a heterogeneous graph may contain several different kinds of nodes, for example merchant nodes and item nodes in addition to user nodes. When a user visits or purchases an item (e.g., watches a movie, reads a book), or transacts at a merchant, a connecting edge may be established between the corresponding nodes. Specific examples of relationship network graphs are not exhaustively enumerated here.
In many scenarios, the number of nodes in the original relationship network graph is huge, for example, the number of nodes in the social relationship graph can reach the order of thousands or even hundreds of millions; the connection relationship between nodes is also complex, and most nodes have a large number of connection edges, for example, dozens or even hundreds of connection edges. Thus, the original relational network graph tends to be a denser graph.
FIG. 3 shows an exemplary presentation of a relational network graph, with the left-hand (a) portion showing an example of an original relational network graph. It can be seen that the number of connecting edges between nodes is huge, and the connection relationship is complex and dense.
For this reason, according to the embodiments of this specification, the dense original relationship network graph described above is next sparsified. For simplicity and clarity of description, the following is presented in terms of an arbitrary node u in the original relational network graph, hereinafter referred to as the first node. Correspondingly, the nodes connected to the first node by connecting edges are its neighbor nodes; they form the neighbor node set of the first node u, called the first neighbor node set and denoted N_u. When the relational network graph is a directed graph, the first neighbor node set may be defined, as needed, as the set of nodes pointing to the first node, the set of nodes the first node points to, or both.
Next, in step 22, for any second node v in the first neighbor node set N_u, the node information n(v) of the second node, the node information n(u) of the first node, and the connection information A(u, v) of the two nodes are input into the multilayer neural network 10 to obtain the matching degree z_{u,v} between the second node and the first node. The multilayer neural network 10 may be implemented, for example, as a multilayer perceptron (MLP), a deep feedforward neural network (DNN), or a multilayer convolutional neural network (CNN). When implemented as a multilayer perceptron, the matching degree can be expressed as:

    z_{u,v} = MLP( n(u), n(v), A(u, v) )        (1)
Note that the node information n(u) and n(v) is determined from the attributes of the objects the nodes represent. For example, where a node represents a user, n(u) may contain basic attribute information of user u, such as gender, age, and registration duration. A(u, v) can be determined from the connecting edge between the first node u and the second node v in the original relationship network graph. For example, in a social relationship graph, a connecting edge corresponds to social interaction; A(u, v) then contains information on the frequency and/or manner of social interaction between user u and user v. In one embodiment, the node information n(u) and n(v) and the connection information A(u, v) are encoded as vectors and input into the multilayer neural network, whose operations yield the matching degree z_{u,v} of the first node u and the second node v.
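A minimal sketch of this matching-degree computation, assuming a two-layer perceptron with ReLU hidden units and a tanh output (so the score lies in (-1, 1), consistent with the sensitivity discussion later in the text); the weight layout and function name are illustrative assumptions, not the patent's implementation:

```python
import math

def matching_degree(n_u, n_v, a_uv, w_hidden, w_out):
    """Score the match between node u and a neighbor v from their node
    features and the features of edge (u, v), via a small 2-layer MLP.
    Weights are plain lists of lists; the bias is folded in via a constant 1."""
    x = list(n_u) + list(n_v) + list(a_uv) + [1.0]      # joint input + bias
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)))
              for row in w_hidden]                      # ReLU hidden layer
    score = sum(wo * h for wo, h in zip(w_out, hidden))
    return math.tanh(score)                             # matching degree in (-1, 1)
```

In practice the weights would be learned jointly with the graph neural network; here they are passed in explicitly to keep the sketch self-contained.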
Then, in step 23, the first neighbor node set is sampled according to the matching degree corresponding to each neighbor node in N_u, to obtain the sampled neighbor node set of the first node.
In one embodiment, the matching degree z_{u,v} output by the multilayer neural network is itself a value in the range (0, 1). In this case, the matching degree of each neighbor node can be used directly as its matching probability P_{u,v}, and sampling is performed according to this probability.

In another embodiment, the matching degree output by the multilayer neural network is a score whose value is not limited to (0, 1). In that case, the matching degrees of the neighbor nodes can be normalized to obtain corresponding matching probabilities P_{u,v}; each neighbor node is then sampled according to its matching probability. For example, k rounds of sampling may be performed according to the node matching probabilities to obtain k neighbor nodes, which serve as the sampled neighbor node set of the first node.
According to one embodiment, when the neighbor nodes are sampled, a privacy protection mechanism of differential privacy is introduced, so that certain randomness is introduced in the sampling process, and the privacy protection effect is enhanced.
Specifically, for the first node u and any of its neighbor nodes, such as the second node v, the sampling probability p_{u,v} that the second node v is sampled, called the first sampling probability, can be determined based on the exponential mechanism of differential privacy, from a privacy budget ε and the previously obtained matching degree z_{u,v} between the second node v and the first node u.

In one example, the first sampling probability p_{u,v} can be determined by:

    p_{u,v} = exp( ε·z_{u,v} / (2Δ) ) / Σ_{w∈N_u} exp( ε·z_{u,w} / (2Δ) )        (2)

In equation (2), ε is the privacy budget, N_u is the neighbor node set of the first node u, and Δ is the sensitivity, i.e., the maximum difference in the function value (here, the matching degree) when the function is evaluated on adjacent data sets. When the matching degree output by the multilayer neural network lies in (-1, 1), Δ takes the value 2. Equation (2) shows that, when sampling according to the first sampling probability, the probability that the second node is sampled is positively correlated with both the matching degree z_{u,v} between the second node and the first node and the privacy budget ε.

By performing the above operation for each neighbor node in the first neighbor node set N_u, the first sampling probability corresponding to each neighbor node is obtained. Each neighbor node of N_u can then be sampled according to its first sampling probability.
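A hedged sketch of these exponential-mechanism probabilities, assuming the standard form in which p_v is proportional to exp(ε·z_v / (2Δ)); the function name and the default sensitivity of 2 (matching degrees in (-1, 1)) are assumptions of the sketch:

```python
import math

def exp_mechanism_probs(match_scores, epsilon, sensitivity=2.0):
    """First sampling probabilities over one node's neighbors via the
    exponential mechanism: p_v proportional to exp(eps * z_v / (2 * Delta))."""
    logits = [epsilon * z / (2.0 * sensitivity) for z in match_scores]
    m = max(logits)                          # subtract max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

A larger privacy budget ε sharpens the distribution toward high-matching neighbors; as ε approaches 0, the probabilities approach uniform, i.e., the sampling becomes fully random.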
In one specific example, a maximum number k of neighbor nodes per node in the sparse graph may be set in advance. Accordingly, when sampling the original graph, k rounds of sampling may be performed for any first node u according to the first sampling probability p_{u,v}, each round sampling one neighbor node from the first neighbor node set N_u, so that k neighbor nodes are sampled from the first neighbor node set as the sampled neighbor node set. Of course, in the special case where the first neighbor node set originally contains no more than k nodes, the original first neighbor node set may be used directly as the sampled neighbor node set. The advantage of this scheme is that, no matter how dense the original relational network graph is, the number of neighbor nodes of each node in the resulting sparse relational graph is guaranteed not to exceed k.
In another specific example, a sampling ratio r of the sparse graph relative to each node of the original graph, for example 20%, may be set in advance. Accordingly, when sampling the original graph, for any first node u, the number k of nodes to sample is first determined from the number of nodes in the first neighbor node set N_u and the sampling ratio r. Then k rounds of sampling are performed according to the first sampling probability p_{u,v}, each round sampling one neighbor node from N_u, so that k neighbor nodes are sampled from the first neighbor node set as the sampled neighbor node set. In this way, sampling and compression are performed at a preset ratio regardless of how many neighbor nodes each node has in the original relational network graph, and each node's sampled neighbor node set differs from its original neighbor node set.
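Both variants reduce to drawing k distinct neighbors according to their sampling probabilities. A simple sequential sketch follows; sampling without replacement by renormalizing over the remaining candidates is an assumption of this sketch, not a procedure specified by the patent:

```python
import random

def sample_neighbors(neighbors, probs, k, rng=None):
    """Sample up to k distinct neighbors, drawing one per round according to
    the (renormalized) sampling probabilities of the remaining candidates."""
    rng = rng or random.Random(0)
    if len(neighbors) <= k:
        return list(neighbors)               # small neighbor sets are kept as-is
    pool = list(zip(neighbors, probs))
    chosen = []
    for _ in range(k):
        total = sum(p for _, p in pool)
        r, acc = rng.random() * total, 0.0
        for i, (node, p) in enumerate(pool): # walk the CDF of remaining nodes
            acc += p
            if r <= acc:
                chosen.append(node)
                pool.pop(i)
                break
    return chosen
```

For the ratio-based variant, k would simply be computed as `max(1, int(r * len(neighbors)))` before calling the function.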
Further, according to one embodiment, to facilitate effective backward propagation of gradients when the multilayer neural network and the graph neural network are jointly trained, the sampling probability may be determined in a form more amenable to gradient derivation. Specifically, on the basis of the first sampling probability obtained by the above formula (2), the first sampling probabilities respectively corresponding to the neighbor nodes may be input into a Gumbel-softmax function to obtain second sampling probabilities respectively corresponding to the neighbor nodes; each neighbor node is then sampled according to its second sampling probability.
Specifically, in one example, the second sampling probability p_i^(2) may be determined from the first sampling probability p_i^(1) by the following equation (3):

p_i^(2) = exp((log p_i^(1) + g_i) / τ) / Σ_j exp((log p_j^(1) + g_j) / τ), with g_i = −log(−log s_i)    (3)

In the above equation (3), s_i is sampled uniformly at random from (0, 1), and τ is a temperature hyperparameter of the Gumbel-softmax function.
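A minimal pure-Python sketch of the Gumbel-softmax transformation of equation (3). The exact form shown here (temperature τ, Gumbel noise g = −log(−log s)) is the standard Gumbel-softmax construction and is an assumption, since the patent text gives only its general shape; s is drawn uniformly from (0, 1) as the text states.

```python
import math
import random

def gumbel_softmax(first_probs, tau=1.0):
    """Second sampling probabilities: perturb log-probabilities with Gumbel
    noise g = -log(-log s), s ~ U(0, 1), then apply a softmax at temperature tau."""
    logits = []
    for p in first_probs:
        s = random.random() or 1e-12   # guard against the (measure-zero) s == 0
        g = -math.log(-math.log(s))    # Gumbel(0, 1) noise
        logits.append((math.log(p) + g) / tau)
    m = max(logits)                    # subtract the max to stabilize the softmax
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(1)
p2 = gumbel_softmax([0.5, 0.3, 0.2], tau=0.5)
```

Because every step is differentiable in the first probabilities, gradients can flow through this sampling when the two networks are trained jointly, which is exactly the motivation given above.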
When neighbor sampling is performed based on the second sampling probability, similarly, a predetermined number k of samplings may be performed, so that k neighbor nodes are sampled from the first neighbor node set as the sampled neighbor node set; alternatively, sampling may be performed based on a predetermined sampling ratio r. This is not described in detail again here.
By executing the above steps 22 and 23 on each node in the original relationship network graph, a sampling neighbor node set corresponding to each node can be obtained. Then, in step 24, a sparse relationship network graph may be formed based on the sampled neighboring node sets corresponding to the respective nodes.
The right part (b) of fig. 3 shows a sparse relational network diagram obtained by sampling the original relational network diagram on the left side. Compared with the original relational network graph, the number of the connecting edges in the sparse relational network graph is greatly reduced, and the connecting relation between the nodes is greatly simplified.
Next, in step 25, a graph neural network is trained based on the sparse relationship network graph obtained above.
The specific process of training the graph neural network can be implemented in various ways. From the perspective of randomness, the embodiments include a strict training mode based on the raw gradients, and an approximate training mode that introduces noise into the gradients based on a differential privacy mechanism; from the perspective of joint training, the embodiments can be divided into a mode in which the graph neural network is trained alone and a mode in which it is trained jointly with the multilayer neural network. The embodiments from these different perspectives, as well as their combinations, are described below in connection with a typical process of training based on labeled nodes.
In a typical graph neural network training, some of the nodes may be labeled with labels corresponding to the predicted tasks. For example, when the prediction task is to predict the transaction risk of the user based on the user social relationship graph, a label may be given to a part of the users in the social relationship graph whose risk status is known, the label showing their real risk status. In different examples, the tags may be category tags (e.g., high risk, medium risk, low risk categories) or numerical tags (e.g., specific risk scores). The labels are associated with subsequent predictive tasks performed based on the graph neural network. The embedded vectors of the labeled nodes are characterized and learned through a graph neural network, and the unlabeled nodes with unknown states can be predicted.
Fig. 4 shows a flow of steps for training a graph neural network based on a sparse relationship network graph in one embodiment, namely the sub-steps of step 25 described above. The training mode of fig. 4 is performed based on labeled nodes; that is, the sparse relationship network graph includes a labeled node x carrying a label y.
As shown in fig. 4, in step 251, the obtained sparse relationship network graph is graph-embedded by using a graph neural network, so as to obtain a node embedding vector Ex of the labeled node x. Different graph neural networks adopt different algorithms to carry out graph embedding to obtain node embedding vectors of all nodes, and some graph neural networks can also obtain edge embedding vectors of edges in the graph. Although the specific algorithm is different, generally, when the graph neural network performs graph embedding, for a target node to be analyzed, information of neighbor nodes of the target node is obtained, and the information of the neighbor nodes is aggregated, so as to determine an embedded vector of the target node. When the label node x is used as a target node, a node embedding vector Ex corresponding to the label node x can be obtained.
Then, at step 252, a corresponding first gradient of the graph neural network is determined based on the node embedding vector Ex and the label y. In general, a prediction result y' can be obtained by performing, based on the node embedding vector Ex, the prediction task corresponding to the label. Then, based on the prediction result y' and the label y, the prediction loss L is obtained according to a preset loss function; the prediction loss L is then propagated backward through the graph neural network, that is, partial derivatives of the prediction loss with respect to the network parameters of each network layer are computed layer by layer from the output side of the graph neural network backward, yielding the first gradient corresponding to the graph neural network.
Next, in step 253, the neural network of the map is updated, i.e. the network parameters therein are updated according to the first gradient.
The basic steps of updating the neural network of the graph based on the labeled nodes in the sparse relationship network graph are described above. Various embodiments based on the above basic steps are described below.
As previously described, from the perspective of randomness, the embodiments include a strictly trained embodiment A and an approximately trained embodiment B.
In embodiment A, at step 253, the network parameters of the graph neural network are updated based on the original value of the first gradient. Suppose that in the t-th iteration, the first gradient obtained is g_t; then the update of the current network parameters θ_t of the t-th round may be expressed as:

θ_{t+1} = θ_t − η · g_t    (4)

where η denotes the learning step length (learning rate), a preset hyperparameter, and θ_{t+1} denotes the updated network parameters obtained through the t-th round of training.
In embodiment B, in step 253, noise is added to the first gradient by using a noise mechanism of differential privacy, so as to obtain a first noise gradient; parameters of the graph neural network are then updated based on the first noise gradient.
The noise mechanism of differential privacy is mainly realized by adding noise to the query result. The noise may be Laplacian noise, Gaussian noise, or the like. According to one embodiment, in this step 253, differential privacy is achieved by adding Gaussian noise to the gradient. More specifically, the first gradient may be clipped based on a preset clipping threshold to obtain a clipping gradient; then, Gaussian noise for realizing differential privacy is determined using a Gaussian distribution determined based on the clipping threshold, the variance of the Gaussian distribution being positively correlated with the square of the clipping threshold; the Gaussian noise and the clipping gradient may then be superimposed to obtain the first noise gradient.
More specifically, as an example, assume that in the t-th iteration the first gradient obtained is g_t. To add Gaussian noise to it, gradient clipping may first be performed on the original gradient based on a preset clipping threshold to obtain a clipping gradient; Gaussian noise for implementing differential privacy is determined based on the clipping threshold and a predetermined noise scaling coefficient (a preset hyperparameter); the clipping gradient is then fused (e.g., summed) with the Gaussian noise to obtain a gradient containing noise. It can be understood that this way, on the one hand, clips the original gradient and, on the other hand, superimposes noise on the clipped gradient, thereby performing differential privacy processing on the gradient with Gaussian noise.

For example, the original gradient g_t may be clipped as follows:

ḡ_t = g_t / max(1, ‖g_t‖₂ / C)    (5)

where ḡ_t denotes the clipped gradient, C denotes the clipping threshold, and ‖g_t‖₂ denotes the second-order norm of g_t. That is, when the gradient norm is less than or equal to the clipping threshold C, the original gradient is retained; when the gradient norm is greater than the clipping threshold C, the original gradient is scaled down to the corresponding size.
Gaussian noise is then added to the clipped gradient to obtain a gradient containing noise, for example:

g̃_t = ḡ_t + 𝟙 · N(0, σ²C²I)    (6)

where g̃_t denotes the gradient containing noise; N(0, σ²C²I) denotes Gaussian noise whose probability density satisfies a Gaussian distribution with mean 0 and variance σ²C²; σ is the noise scaling coefficient, a preset hyperparameter that can be set as required; C is the clipping threshold; and 𝟙 denotes an indicator function that may take the value 0 or 1; for example, it may be set to take 1 in even rounds and 0 in odd rounds of multi-round training.
Then, the gradient after Gaussian noise addition, i.e., the first noise gradient, can be used to adjust the network parameters of the graph neural network with the goal of minimizing the prediction loss L:

θ_{t+1} = θ_t − η · g̃_t    (7)

Under the condition that the Gaussian noise added to the gradient satisfies differential privacy, the adjustment of the network parameters also satisfies differential privacy. In this way, the noise mechanism of differential privacy introduces a certain randomness into the updating of the graph neural network, and a good balance is struck between privacy protection and model performance.
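Equations (5) through (7) together form one instance of the well-known clip-then-noise gradient recipe. The following is a minimal pure-Python sketch under stated assumptions: per-coordinate Gaussian noise with standard deviation σC (so the variance is σ²C², positively correlated with the square of the clipping threshold), the indicator of equation (6) treated as a boolean flag, and illustrative parameter values throughout.

```python
import math
import random

def dp_sgd_step(theta, grad, lr=0.1, clip_c=1.0, sigma=0.5, add_noise=True):
    """One update following equations (5)-(7): clip the gradient to norm <= C,
    optionally add N(0, sigma^2 * C^2) noise per coordinate, then descend."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = max(1.0, norm / clip_c)                 # eq. (5): gradient clipping
    clipped = [g / scale for g in grad]
    indicator = 1 if add_noise else 0               # eq. (6): indicator function
    noisy = [g + indicator * random.gauss(0.0, sigma * clip_c) for g in clipped]
    return [t - lr * g for t, g in zip(theta, noisy)]   # eq. (7): parameter update

random.seed(42)
theta = dp_sgd_step([0.0, 0.0], [3.0, 4.0], lr=0.1, clip_c=1.0, sigma=0.5)
```

With `add_noise=False` the step reduces to the strict embodiment A update of equation (4) applied to the clipped gradient, which makes the effect of the noise term easy to isolate when experimenting.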
On the other hand, from the perspective of whether or not to train jointly, the embodiments include an embodiment a in which the graph neural network is trained alone and an embodiment b in which it is trained jointly with the multilayer neural network.
In embodiment a, the multilayer neural network 10 in FIG. 1 is a pre-trained neural network with fixed network parameters φ. The training mode may be, for example, to pre-label the matching degrees between nodes as labels to train the multilayer neural network 10. Accordingly, the embedding vector Ex obtained through graph embedding in step 251 is a function only of the network parameters θ of the graph neural network 20. Subsequently, it is only necessary to determine the first gradient with respect to θ in step 252, and to update the parameters θ in step 253.
In embodiment b, the multilayer neural network 10 and the graph neural network 20 are trained jointly. The network parameters φ and θ of the two networks are associated through the graph embedding process of step 251. Specifically, for the labeled node x, in step 251, its neighbor nodes in the sparse relationship network graph are first obtained as target neighbor nodes; then the aggregation weight w of each target neighbor node is determined, where the aggregation weight is determined based on the matching degree z_{x,i} of the labeled node x with each target neighbor node i. Then, according to the aggregation weights w, the node information of the target neighbor nodes is aggregated to obtain the node embedding vector Ex of the labeled node x. Thus, the node embedding vector Ex depends not only on the network parameters θ of the graph neural network, but also, through the aggregation weights w, on the matching degrees z_{x,i}; and the matching degree z_{x,i}, being output by the multilayer neural network 10, is a function of the network parameters φ. Therefore, the node embedding vector Ex is a joint function of φ and θ. Correspondingly, the prediction loss L determined from the node embedding vector Ex and the label y is also a joint function of φ and θ.

In such a case, after the first gradient with respect to θ is determined in step 252, the prediction loss (determined from the node embedding vector Ex and the label) continues to be propagated backward to the multilayer neural network 10, and a second gradient corresponding to the multilayer neural network 10 is determined; then, according to the second gradient, the network parameters φ of the multilayer neural network 10 are updated, realizing joint training of the multilayer neural network 10 and the graph neural network 20.
The above determination of the aggregation weight w according to the matching degree z_{x,i} may be implemented in a variety of ways. In one example, the normalized matching degree may be used directly as the aggregation weight. In another example, the first sampling probability of each target neighbor node when sampled may be obtained, the first sampling probability being determined using equation (2), and the aggregation weight w may then be determined according to the first sampling probability. In yet another example, the second sampling probability of each target neighbor node when sampled may be obtained, the second sampling probability being determined based on the matching degree of each target neighbor node with the labeled node using a Gumbel-softmax function, for example using the aforementioned equation (3); the aggregation weight w of each target neighbor node is then determined according to the second sampling probability. The form of the Gumbel-softmax function facilitates gradient derivation and thus facilitates gradient propagation from the graph neural network 20 to the multilayer neural network 10.
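As a concrete illustration of the first option above (normalized matching degrees used directly as the aggregation weights w), the following sketch aggregates the node information of target neighbor nodes into the embedding vector Ex of a labeled node. The function name and the toy feature vectors are hypothetical, for illustration only.

```python
def aggregate_embedding(neighbor_infos, match_degrees):
    """Aggregate target-neighbor node information into an embedding for the
    labeled node, weighting each neighbor by its normalized matching degree."""
    total = sum(match_degrees)
    weights = [z / total for z in match_degrees]   # normalized z_{x,i} as weights w
    dim = len(neighbor_infos[0])
    ex = [0.0] * dim
    for w, info in zip(weights, neighbor_infos):   # weighted sum over neighbors
        for d in range(dim):
            ex[d] += w * info[d]
    return ex

ex = aggregate_embedding([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
# weights are 0.75 and 0.25, so ex == [0.75, 0.25]
```

Swapping the normalization for the first or second sampling probabilities of equations (2) or (3) changes only how `weights` is computed; the aggregation itself is unchanged.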
Different embodiments of the graph neural network training process are described above from two different perspectives: the randomness perspective and the joint-training perspective. Since these two perspectives are independent of each other, the above embodiments A, B and embodiments a, b can be combined in various ways to obtain more specific examples.
When embodiment B is combined with embodiment b, that is, when a differential privacy mechanism is introduced in the case of joint training, in one example noise may be added to both the first gradient for the graph neural network 20 and the second gradient for the multilayer neural network 10. Specifically, a first noise may be added to the first gradient in a differential privacy manner to obtain a first noise gradient, and the parameters of the graph neural network updated according to the first noise gradient; in addition, a second noise is added to the second gradient in a differential privacy manner to obtain a second noise gradient, and the parameters of the multilayer neural network updated according to the second noise gradient. For the noise addition process, reference may be made to the foregoing description of embodiment B, which is not repeated here. In another example, noise may be added to only one of the first and second gradients.
Reviewing the above process, the graph neural network is trained in various ways based on the sampled sparse relationship network graph. Because the sparse relationship network graph contains only part of the connecting edges sampled from the original relationship network graph, it is difficult to reversely deduce the accurate graph structure information of the original relationship network graph from a graph neural network trained in this way, and the data privacy of the original relationship network graph is thus protected. Optionally, a differential privacy mechanism can further be introduced in the sampling stage and/or the gradient propagation stage, introducing a certain randomness into the training of the graph neural network. This further enhances the security of the private data in the original relationship network graph while preserving the basic performance of the graph neural network.
According to another embodiment, an apparatus based on a privacy-preserving training graph neural network is also provided, and the apparatus may be deployed in any apparatus, device, platform, or device cluster having computing and processing capabilities. FIG. 5 shows a schematic block diagram of a training apparatus of the graph neural network according to one embodiment. As shown in fig. 5, the training apparatus 500 includes:
an original graph obtaining unit 51, configured to obtain an original relationship network graph, where the original relationship network graph includes a plurality of nodes, and any first node in the plurality of nodes has a corresponding first neighbor node set;
a matching degree obtaining unit 52, configured to input, to any second node in the first neighboring node set, node information of the second node, node information of the first node, and connection information of the second node and the first node into a multilayer neural network, so as to obtain a matching degree between the second node and the first node;
the sampling unit 53 is configured to sample the first neighbor node set according to the matching degree corresponding to each neighbor node in the first neighbor node set, so as to obtain a sampled neighbor node set of the first node;
a sparse graph forming unit 54 configured to form a sparse relationship network graph based on a sampling neighbor node set corresponding to each of the plurality of nodes;
and the training unit 55 is configured to train a neural network of the graph based on the sparse relationship network graph.
According to an embodiment, the sampling unit 53 is configured to: normalizing the matching degrees respectively corresponding to the neighbor nodes to obtain corresponding matching probabilities; and sampling each neighbor node according to the matching probability.
According to another embodiment, the sampling unit 53 comprises (not shown):
a first probability determination module configured to determine a first sampling probability that the second node is sampled according to a first privacy budget and a matching degree of the second node with the first node based on an exponential mechanism of differential privacy;
and the neighbor sampling module is configured to sample each neighbor node according to the first sampling probability corresponding to each neighbor node in the first neighbor node set.
Further, in one embodiment, the neighbor sampling module is configured to: and executing k times of sampling with a preset number according to the first sampling probability, and sampling k neighbor nodes from the first neighbor node set to serve as the sampling neighbor node set.
In another embodiment, the neighbor sampling module is configured to: inputting the first sampling probability corresponding to each neighbor node into a Gumbel-softmax function to obtain a second sampling probability corresponding to each neighbor node; and sampling each neighbor node according to the second sampling probability corresponding to each neighbor node.
Furthermore, the neighbor sampling module may perform k sampling times according to the second sampling probability, and sample k neighbor nodes from the first neighbor node set as the sampling neighbor node set.
According to one embodiment, the sparse relationship network graph includes labeled nodes with labels; the training unit 55 comprises (not shown):
the graph embedding module is configured to carry out graph embedding on the sparse relationship network graph by utilizing the graph neural network to obtain node embedding vectors of the labeled nodes;
a first gradient determination module configured to determine a corresponding first gradient of the graph neural network from the node embedding vector and the tag;
a first update module configured to update the graph neural network according to the first gradient.
In one embodiment, the first update module is configured to: adding noise on the first gradient by using a noise mechanism of differential privacy to obtain a first noise gradient; and updating the parameters of the graph neural network according to the first noise gradient.
Further, in an example, the first updating module is specifically configured to: based on a preset cutting threshold value, cutting the first gradient to obtain a cutting gradient; determining Gaussian noise for realizing differential privacy by utilizing a Gaussian distribution determined based on the clipping threshold, wherein the variance of the Gaussian distribution is positively correlated with the square of the clipping threshold; and superposing the Gaussian noise and the cutting gradient to obtain the first noise gradient.
In one embodiment, the graph embedding module is configured to: acquiring neighbor nodes of the marked nodes in the sparse relationship network graph as target neighbor nodes; determining an aggregation weight of each target neighbor node, the aggregation weight being determined based on the degree of matching of the annotation node with each target neighbor node; and according to the aggregation weight, aggregating the node information of each target neighbor node to obtain the node embedded vector of the labeled node.
Further, in one example, the graph embedding module is configured to determine the aggregate weight of each target neighbor node by: acquiring sampling probability of each target neighbor node when being sampled, wherein the sampling probability is determined based on the matching degree of each target neighbor node and the label node by using a Gumbel-softmax function; and determining the aggregation weight of each target neighbor node according to the sampling probability.
According to one embodiment, the training unit 55 further comprises:
a second gradient determining module configured to determine a second gradient corresponding to the multilayer neural network according to the node embedding vector and the label;
a second update module configured to update the multi-layer neural network according to the second gradient.
Further, in one embodiment, the first update module is configured to: adding first noise on the first gradient by using a differential privacy mode to obtain a first noise gradient; updating parameters of the graph neural network according to the first noise gradient; and the second update module is configured to: adding second noise on the second gradient by using a differential privacy mode to obtain a second noise gradient; and updating the parameters of the multilayer neural network according to the second noise gradient.
In various embodiments, the plurality of nodes in the original relationship network graph may include at least one of: user nodes, merchant nodes and article nodes.
It should be noted that the apparatus 500 shown in fig. 5 is an apparatus embodiment corresponding to the method embodiment shown in fig. 2, and the corresponding description in the method embodiment shown in fig. 2 is also applicable to the apparatus 500, and is not repeated herein.
The graph neural network obtained by the training of the device 500 can effectively protect the privacy and safety of the original graph data.
According to an embodiment of a further aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method described in connection with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in the embodiments of this specification may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above-mentioned embodiments are intended to explain the technical idea, technical solutions and advantages of the present specification in further detail, and it should be understood that the above-mentioned embodiments are merely specific embodiments of the technical idea of the present specification, and are not intended to limit the scope of the technical idea of the present specification, and any modification, equivalent replacement, improvement, etc. made on the basis of the technical solutions of the embodiments of the present specification should be included in the scope of the technical idea of the present specification.

Claims (29)

1. A method for training neural networks based on privacy protection comprises the following steps:
acquiring an original relation network graph, wherein the original relation network graph comprises a plurality of nodes, and any first node in the plurality of nodes is provided with a corresponding first neighbor node set;
for any second node in the first neighbor node set, inputting the node information of the second node, the node information of the first node, and the connection information of the second node and the first node into a multilayer neural network to obtain the matching degree of the second node and the first node;
sampling the first neighbor node set according to the matching degree corresponding to each neighbor node in the first neighbor node set to obtain a sampled neighbor node set of the first node;
forming a sparse relation network graph based on the sampling neighbor node sets corresponding to the plurality of nodes respectively;
and training a neural network of the graph based on the sparse relationship network graph.
2. The method of claim 1, wherein sampling the first set of neighbor nodes according to the matching degrees corresponding to the respective neighbor nodes in the first set of neighbor nodes comprises:
normalizing the matching degrees respectively corresponding to the neighbor nodes to obtain corresponding matching probabilities;
and sampling each neighbor node according to the matching probability.
3. The method of claim 1, wherein sampling the first set of neighbor nodes according to the matching degrees corresponding to the respective neighbor nodes in the first set of neighbor nodes comprises:
determining a first sampling probability of the second node being sampled according to a first privacy budget and a matching degree of the second node and the first node based on an exponential mechanism of differential privacy;
and sampling each neighbor node according to the first sampling probability corresponding to each neighbor node in the first neighbor node set.
4. The method of claim 3, wherein sampling each neighbor node in the first set of neighbor nodes according to a first sampling probability corresponding to the neighbor node, respectively, comprises:
and executing k times of sampling with a preset number according to the first sampling probability, and sampling k neighbor nodes from the first neighbor node set to serve as the sampling neighbor node set.
5. The method of claim 3, wherein sampling each neighbor node in the first set of neighbor nodes according to a first sampling probability corresponding to the neighbor node, respectively, comprises:
inputting the first sampling probability corresponding to each neighbor node into a Gumbel-softmax function to obtain a second sampling probability corresponding to each neighbor node;
and sampling each neighbor node according to the second sampling probability corresponding to each neighbor node.
6. The method of claim 5, wherein sampling the neighboring nodes according to the second sampling probabilities respectively corresponding to the neighboring nodes comprises:
and executing k times of sampling with preset number according to the second sampling probability, and sampling k neighbor nodes from the first neighbor node set to serve as the sampling neighbor node set.
7. The method of claim 1, wherein the sparse relationship network graph includes labeled nodes with labels;
training a neural network based on the sparse relationship network graph, comprising:
carrying out graph embedding on the sparse relationship network graph by utilizing the graph neural network to obtain a node embedding vector of the labeling node;
determining a corresponding first gradient of the graph neural network according to the node embedding vector and the label;
updating the graph neural network according to the first gradient.
8. The method of claim 7, wherein updating the graph neural network according to the first gradient comprises:
adding noise on the first gradient by using a noise mechanism of differential privacy to obtain a first noise gradient;
and updating the parameters of the graph neural network according to the first noise gradient.
9. The method of claim 8, wherein adding noise to the first gradient using a noise mechanism of differential privacy to obtain a first noise gradient comprises:
based on a preset cutting threshold value, cutting the first gradient to obtain a cutting gradient;
determining Gaussian noise for realizing differential privacy by utilizing a Gaussian distribution determined based on the clipping threshold, wherein the variance of the Gaussian distribution is positively correlated with the square of the clipping threshold;
and superposing the Gaussian noise and the cutting gradient to obtain the first noise gradient.
10. The method of claim 7, wherein graph embedding the sparse relationship network graph using the graph neural network to obtain node embedding vectors of the labeled nodes comprises:
acquiring neighbor nodes of the marked nodes in the sparse relationship network graph as target neighbor nodes;
determining an aggregation weight of each target neighbor node, the aggregation weight being determined based on the degree of matching of the annotation node with each target neighbor node;
and according to the aggregation weight, aggregating the node information of each target neighbor node to obtain the node embedded vector of the labeled node.
11. The method of claim 10, wherein determining an aggregate weight for each target neighbor node comprises:
acquiring sampling probability of each target neighbor node when being sampled, wherein the sampling probability is determined based on the matching degree of each target neighbor node and the label node by using a Gumbel-softmax function;
and determining the aggregation weight of each target neighbor node according to the sampling probability.
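Claims 10 and 11 describe aggregation weights derived from Gumbel-softmax sampling probabilities over neighbor matching degrees. An illustrative NumPy sketch, not part of the claims; the function names and the temperature value are assumptions:

```python
import numpy as np

def gumbel_softmax_weights(match_scores, temperature=0.5, rng=None):
    """Turn neighbor matching degrees into sampling probabilities via the
    Gumbel-softmax trick; the probabilities can then serve directly as
    aggregation weights."""
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise: -log(-log(U)) with U uniform on (0, 1).
    gumbel = -np.log(-np.log(rng.uniform(1e-10, 1.0, size=len(match_scores))))
    logits = (np.asarray(match_scores, dtype=float) + gumbel) / temperature
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

def aggregate_neighbors(neighbor_feats, weights):
    """Weighted aggregation of neighbor node information into a node
    embedding vector for the labeled node."""
    return np.asarray(weights) @ np.asarray(neighbor_feats)
```

The softmax output sums to one, so the aggregation is a convex combination of the target neighbor nodes' feature vectors.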
12. The method of claim 10, wherein training the graph neural network based on the sparse relationship network graph further comprises: determining a second gradient corresponding to the multilayer neural network according to the node embedding vector and the label; and updating the multilayer neural network according to the second gradient.
13. The method of claim 12, wherein,
updating the graph neural network according to the first gradient comprises: adding first noise to the first gradient in a differential privacy manner to obtain a first noise gradient; and updating parameters of the graph neural network according to the first noise gradient;
and updating the multilayer neural network according to the second gradient comprises: adding second noise to the second gradient in a differential privacy manner to obtain a second noise gradient; and updating the parameters of the multilayer neural network according to the second noise gradient.
14. The method of claim 1, wherein the plurality of nodes comprises at least one of: user nodes, merchant nodes, and item nodes.
15. An apparatus for training a graph neural network based on privacy protection, comprising:
an original graph obtaining unit, configured to obtain an original relationship network graph, where the original relationship network graph includes a plurality of nodes, and any first node in the plurality of nodes has a corresponding first neighbor node set;
a matching degree obtaining unit configured to, for any second node in the first neighbor node set, input node information of the second node, node information of the first node, and connection information between the second node and the first node into a multilayer neural network to obtain a matching degree between the second node and the first node;
a sampling unit configured to sample the first neighbor node set according to the matching degrees corresponding to the respective neighbor nodes in the first neighbor node set, to obtain a sampling neighbor node set of the first node;
a sparse graph forming unit configured to form a sparse relationship network graph based on a sampling neighbor node set corresponding to each of the plurality of nodes;
and a training unit configured to train the graph neural network based on the sparse relationship network graph.
16. The apparatus of claim 15, wherein the sampling unit is configured to:
normalizing the matching degrees respectively corresponding to the neighbor nodes to obtain corresponding matching probabilities;
and sampling each neighbor node according to the matching probability.
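The normalize-then-sample step of claim 16 can be illustrated with a softmax normalization, which is an assumption for illustration only since the claim does not fix the normalization function:

```python
import numpy as np

def sample_by_matching_degree(match_degrees, rng=None):
    """Normalize matching degrees into matching probabilities via softmax,
    then sample one neighbor index according to those probabilities."""
    rng = rng or np.random.default_rng()
    d = np.asarray(match_degrees, dtype=float)
    exp = np.exp(d - d.max())  # numerically stable softmax
    probs = exp / exp.sum()
    return int(rng.choice(len(d), p=probs)), probs
```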
17. The apparatus of claim 15, wherein the sampling unit comprises:
a first probability determination module configured to determine a first sampling probability that the second node is sampled according to a first privacy budget and a matching degree of the second node with the first node based on an exponential mechanism of differential privacy;
and the neighbor sampling module is configured to sample each neighbor node according to the first sampling probability corresponding to each neighbor node in the first neighbor node set.
18. The apparatus of claim 17, wherein the neighbor sampling module is configured to:
and performing a preset number k of sampling operations according to the first sampling probability, so as to sample k neighbor nodes from the first neighbor node set as the sampling neighbor node set.
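The k-fold sampling of claim 18 under the exponential mechanism of differential privacy (claim 17) draws each neighbor with probability proportional to exp(ε·u / 2Δu), where u is the matching degree, ε the privacy budget, and Δu the sensitivity. An illustrative sketch, not part of the claims; sampling without replacement and the `sensitivity` parameter are assumptions:

```python
import numpy as np

def exponential_mechanism_sample(match_scores, epsilon, k,
                                 sensitivity=1.0, rng=None):
    """Sample k neighbor indices, each drawn with probability proportional
    to exp(epsilon * score / (2 * sensitivity)), i.e. the exponential
    mechanism of differential privacy applied to matching degrees."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(match_scores, dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    probs = np.exp(logits - logits.max())  # numerically stable
    probs /= probs.sum()
    return rng.choice(len(scores), size=min(k, len(scores)),
                      replace=False, p=probs)
```

A larger privacy budget ε sharpens the distribution toward high-matching neighbors, while ε → 0 approaches uniform sampling.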
19. The apparatus of claim 17, wherein the neighbor sampling module is configured to:
inputting the first sampling probability corresponding to each neighbor node into a Gumbel-softmax function to obtain a second sampling probability corresponding to each neighbor node;
and sampling each neighbor node according to the second sampling probability corresponding to each neighbor node.
20. The apparatus of claim 19, wherein the neighbor sampling module is further configured to:
and performing a preset number k of sampling operations according to the second sampling probability, so as to sample k neighbor nodes from the first neighbor node set as the sampling neighbor node set.
21. The apparatus of claim 15, wherein the sparse relationship network graph comprises labeled nodes with labels;
the training unit includes:
a graph embedding module configured to perform graph embedding on the sparse relationship network graph by utilizing the graph neural network to obtain node embedding vectors of the labeled nodes;
a first gradient determination module configured to determine a corresponding first gradient of the graph neural network from the node embedding vector and the tag;
a first update module configured to update the graph neural network according to the first gradient.
22. The apparatus of claim 21, wherein the first update module is configured to:
adding noise on the first gradient by using a noise mechanism of differential privacy to obtain a first noise gradient;
and updating the parameters of the graph neural network according to the first noise gradient.
23. The apparatus of claim 22, wherein the first update module is configured to:
clipping the first gradient based on a preset clipping threshold to obtain a clipped gradient;
determining Gaussian noise for realizing differential privacy by utilizing a Gaussian distribution determined based on the clipping threshold, wherein the variance of the Gaussian distribution is positively correlated with the square of the clipping threshold;
and superposing the Gaussian noise on the clipped gradient to obtain the first noise gradient.
24. The apparatus of claim 21, wherein the graph embedding module is configured to:
acquiring neighbor nodes of the labeled nodes in the sparse relationship network graph as target neighbor nodes;
determining an aggregation weight of each target neighbor node, the aggregation weight being determined based on the degree of matching between the labeled nodes and each target neighbor node;
and aggregating the node information of each target neighbor node according to the aggregation weights to obtain the node embedding vectors of the labeled nodes.
25. The apparatus of claim 24, wherein the graph embedding module is configured to determine the aggregate weight for each target neighbor node by:
acquiring the sampling probability with which each target neighbor node was sampled, wherein the sampling probability is determined by using a Gumbel-softmax function based on the matching degree between each target neighbor node and the labeled node;
and determining the aggregation weight of each target neighbor node according to the sampling probability.
26. The apparatus of claim 24, wherein the training unit further comprises:
a second gradient determining module configured to determine a second gradient corresponding to the multilayer neural network according to the node embedding vector and the label;
a second update module configured to update the multilayer neural network according to the second gradient.
27. The apparatus of claim 26, wherein:
the first update module is configured to: add first noise to the first gradient in a differential privacy manner to obtain a first noise gradient; and update parameters of the graph neural network according to the first noise gradient;
and the second update module is configured to: add second noise to the second gradient in a differential privacy manner to obtain a second noise gradient; and update the parameters of the multilayer neural network according to the second noise gradient.
28. The apparatus of claim 15, wherein the plurality of nodes comprises at least one of: user nodes, merchant nodes, and item nodes.
29. A computing device comprising a memory and a processor, wherein the memory has stored therein executable code that, when executed by the processor, performs the method of any of claims 1-14.
CN202110109491.3A 2021-01-27 2021-01-27 Method and device for training neural network based on privacy protection Active CN112464292B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110109491.3A CN112464292B (en) 2021-01-27 2021-01-27 Method and device for training neural network based on privacy protection
CN202110957071.0A CN113536383B (en) 2021-01-27 2021-01-27 Method and device for training graph neural network based on privacy protection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110109491.3A CN112464292B (en) 2021-01-27 2021-01-27 Method and device for training neural network based on privacy protection

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202110957071.0A Division CN113536383B (en) 2021-01-27 2021-01-27 Method and device for training graph neural network based on privacy protection

Publications (2)

Publication Number Publication Date
CN112464292A true CN112464292A (en) 2021-03-09
CN112464292B CN112464292B (en) 2021-08-20

Family

ID=74802376

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202110109491.3A Active CN112464292B (en) 2021-01-27 2021-01-27 Method and device for training neural network based on privacy protection
CN202110957071.0A Active CN113536383B (en) 2021-01-27 2021-01-27 Method and device for training graph neural network based on privacy protection

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110957071.0A Active CN113536383B (en) 2021-01-27 2021-01-27 Method and device for training graph neural network based on privacy protection

Country Status (1)

Country Link
CN (2) CN112464292B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095490A (en) * 2021-06-07 2021-07-09 华中科技大学 Graph neural network construction method and system based on differential privacy aggregation
CN113190841A (en) * 2021-04-27 2021-07-30 中国科学技术大学 Method for defending graph data attack by using differential privacy technology
CN113298116A (en) * 2021-04-26 2021-08-24 上海淇玥信息技术有限公司 Attention weight-based graph embedding feature extraction method and device and electronic equipment
CN113642717A (en) * 2021-08-31 2021-11-12 西安理工大学 Convolutional neural network training method based on differential privacy
CN113837382A (en) * 2021-09-26 2021-12-24 杭州网易云音乐科技有限公司 Method and system for training graph neural network
CN115081024A (en) * 2022-08-16 2022-09-20 杭州金智塔科技有限公司 Decentralized business model training method and device based on privacy protection

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
WO2023213233A1 (en) * 2022-05-06 2023-11-09 墨奇科技(北京)有限公司 Task processing method, neural network training method, apparatus, device, and medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN110009093A (en) * 2018-12-07 2019-07-12 阿里巴巴集团控股有限公司 For analyzing the nerve network system and method for relational network figure
CN111091005A (en) * 2019-12-20 2020-05-01 北京邮电大学 Meta-structure-based unsupervised heterogeneous network representation learning method
US10825219B2 (en) * 2018-03-22 2020-11-03 Northeastern University Segmentation guided image generation with adversarial networks
CN112085172A (en) * 2020-09-16 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for training graph neural network

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US8000262B2 (en) * 2008-04-18 2011-08-16 Bonnie Berger Leighton Method for identifying network similarity by matching neighborhood topology
US10789526B2 (en) * 2012-03-09 2020-09-29 Nara Logics, Inc. Method, system, and non-transitory computer-readable medium for constructing and applying synaptic networks
CN103020163A (en) * 2012-11-26 2013-04-03 南京大学 Node-similarity-based network community division method in network
CN106302104B (en) * 2015-06-26 2020-01-21 阿里巴巴集团控股有限公司 User relationship identification method and device
US10977384B2 (en) * 2017-11-16 2021-04-13 Microsoft Technoogy Licensing, LLC Hardware protection for differential privacy
CN108022654B (en) * 2017-12-20 2021-11-30 深圳先进技术研究院 Association rule mining method and system based on privacy protection and electronic equipment
CN109614975A (en) * 2018-10-26 2019-04-12 桂林电子科技大学 A kind of figure embedding grammar, device and storage medium
CN110866190B (en) * 2019-11-18 2021-05-14 支付宝(杭州)信息技术有限公司 Method and device for training neural network model for representing knowledge graph

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US10825219B2 (en) * 2018-03-22 2020-11-03 Northeastern University Segmentation guided image generation with adversarial networks
CN110009093A (en) * 2018-12-07 2019-07-12 阿里巴巴集团控股有限公司 For analyzing the nerve network system and method for relational network figure
CN111091005A (en) * 2019-12-20 2020-05-01 北京邮电大学 Meta-structure-based unsupervised heterogeneous network representation learning method
CN112085172A (en) * 2020-09-16 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for training graph neural network

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN113298116A (en) * 2021-04-26 2021-08-24 上海淇玥信息技术有限公司 Attention weight-based graph embedding feature extraction method and device and electronic equipment
CN113298116B (en) * 2021-04-26 2024-04-02 上海淇玥信息技术有限公司 Attention weight-based graph embedded feature extraction method and device and electronic equipment
CN113190841A (en) * 2021-04-27 2021-07-30 中国科学技术大学 Method for defending graph data attack by using differential privacy technology
CN113095490A (en) * 2021-06-07 2021-07-09 华中科技大学 Graph neural network construction method and system based on differential privacy aggregation
CN113642717A (en) * 2021-08-31 2021-11-12 西安理工大学 Convolutional neural network training method based on differential privacy
CN113642717B (en) * 2021-08-31 2024-04-02 西安理工大学 Convolutional neural network training method based on differential privacy
CN113837382A (en) * 2021-09-26 2021-12-24 杭州网易云音乐科技有限公司 Method and system for training graph neural network
CN115081024A (en) * 2022-08-16 2022-09-20 杭州金智塔科技有限公司 Decentralized business model training method and device based on privacy protection

Also Published As

Publication number Publication date
CN112464292B (en) 2021-08-20
CN113536383B (en) 2023-10-27
CN113536383A (en) 2021-10-22

Similar Documents

Publication Publication Date Title
CN112464292B (en) Method and device for training neural network based on privacy protection
Balle et al. Reconstructing training data with informed adversaries
Yu et al. Learning deep network representations with adversarially regularized autoencoders
CN112084331A (en) Text processing method, text processing device, model training method, model training device, computer equipment and storage medium
CN110913354A (en) Short message classification method and device and electronic equipment
Ji et al. Multi-range gated graph neural network for telecommunication fraud detection
Yoon et al. Robust probabilistic time series forecasting
Ugendhar et al. A novel intelligent-based intrusion detection system approach using deep multilayer classification
Zheng et al. Jora: Weakly supervised user identity linkage via jointly learning to represent and align
CN112597399B (en) Graph data processing method and device, computer equipment and storage medium
Yin et al. An Anomaly Detection Model Based On Deep Auto-Encoder and Capsule Graph Convolution via Sparrow Search Algorithm in 6G Internet-of-Everything
CN115982570A (en) Multi-link custom optimization method, device, equipment and storage medium for federated learning modeling
CN111860655B (en) User processing method, device and equipment
CN114882557A (en) Face recognition method and device
CN114387088A (en) Loan risk identification method and device based on knowledge graph
Kim et al. Network anomaly detection based on domain adaptation for 5g network security
Kartik et al. Decoding of graphically encoded numerical digits using deep learning and edge detection techniques
Zhang et al. Construct new graphs using information bottleneck against property inference attacks
Khan et al. Synthetic Identity Detection using Inductive Graph Convolutional Networks
Long Understanding and mitigating privacy risk in machine learning systems
Chen et al. Risk probability estimating based on clustering
Qiu et al. Abnormal Traffic Detection Method of Internet of Things Based on Deep Learning in Edge Computing Environment
CN117555489A (en) Internet of things data storage transaction anomaly detection method, system, equipment and medium
Huang et al. Exploring network reliability by predicting link status based on simplex neural network
Palaniappan et al. Learning Disentangled Representations Using Dormant Variations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40047463

Country of ref document: HK