US11989634B2 - Private federated learning with protection against reconstruction - Google Patents
Private federated learning with protection against reconstruction
- Publication number
- US11989634B2 (application US16/501,132)
- Authority
- US
- United States
- Prior art keywords
- machine learning
- learning model
- update
- privatized
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G06N3/08 — Neural networks; learning methods
- G06N20/20 — Machine learning; ensemble learning
- G06N3/04 — Neural networks; architecture, e.g. interconnection topology
- G06N3/063 — Physical realisation of neural networks using electronic means
- G06N5/04 — Knowledge-based models; inference or reasoning models
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N7/01 — Probabilistic graphical models, e.g. probabilistic networks
Definitions
- Embodiments described herein relate generally to federated machine learning using distributed computing systems. More specifically, embodiments relate to a private federated learning system with protections against reconstruction attacks.
- The training of machine learning models for use in image classification, next-word prediction, and other related tasks generally requires powerful hardware and a large amount of data.
- A large amount of training data can increase the accuracy of the trained models.
- The more powerful the hardware, the faster the training operations can be performed.
- Historically, training machine learning models required dedicated, high-performance compute nodes.
- Modern mobile electronic devices are now able to perform on-device training, even for large, complex machine learning models.
- Training data can be divided and distributed to a large number of mobile electronic devices, which can each perform on-device training of the model using a subset of the training data.
- The training data used on each mobile device is generally a small fraction of the full dataset.
- Shared models can then be deployed to each device to benefit all users for a variety of tasks, which can improve the overall user experience.
- One way to compute a shared model in this distributed setting is to directly transmit data from each device to a central server where training can be done. However, the data on each device is sensitive by nature and transmitting user data to a centralized server can compromise the privacy of user data.
- One embodiment provides for a data processing system comprising a memory to store instructions and one or more processors to execute the instructions.
- The instructions cause the one or more processors to receive a machine learning model from a server at a client device, train the machine learning model using local data at the client device to generate a trained machine learning model, generate an update for the machine learning model, the update including a weight vector that represents a difference between the machine learning model and the trained machine learning model, privatize the update for the machine learning model, and transmit the privatized update for the machine learning model to the server.
- One embodiment provides for a method comprising receiving a machine learning model from a server at a client device, training the machine learning model using local data at the client device to generate a trained machine learning model, generating an update for the machine learning model, the update including a weight vector that represents a difference between the machine learning model and the trained machine learning model, privatizing the update for the machine learning model, and transmitting the privatized update for the machine learning model to the server.
- One embodiment provides for a non-transitory machine-readable medium that stores instructions to cause one or more processors of a data processing system to perform operations comprising receiving a machine learning model from a server at a client device, training the machine learning model using local data at the client device, generating an update for the machine learning model, the update including a weight vector that represents a difference between the received machine learning model and the trained machine learning model, privatizing the update for the machine learning model, and transmitting the privatized update for the machine learning model to the server.
- Privatizing the update for the machine learning model can be performed using a variety of mechanisms, including a separated differential privacy mechanism that separately privatizes a unit vector and a magnitude for each update to the machine learning model before the update is transmitted by the user device.
- Privatizing the update using separated differential privacy includes decomposing the weight vector into a unit vector and a magnitude, privatizing the unit vector, and separately privatizing the magnitude.
- In one embodiment, the magnitude is privatized with absolute error.
- In another embodiment, the magnitude is privatized with relative error.
- In one embodiment, the unit vector is privatized based on ℓ2-unit vectors on the unit sphere.
- In another embodiment, the unit vector is privatized based on ℓ∞-unit vectors on the unit cube.
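The decomposition at the heart of separated differential privacy can be sketched in a few lines. This is a minimal NumPy illustration of splitting an update vector into a direction and a magnitude; the function names are ours, not the patent's.

```python
import numpy as np

def decompose(delta):
    """Split a model-update vector into (unit vector, magnitude).

    Each piece can then be privatized separately, as in the
    separated differential privacy scheme described above.
    """
    magnitude = float(np.linalg.norm(delta))
    if magnitude == 0.0:
        # Degenerate update: pick an arbitrary fixed direction.
        unit = np.zeros_like(delta)
        unit[0] = 1.0
        return unit, 0.0
    return delta / magnitude, magnitude

def recompose(unit, magnitude):
    """Reassemble a (possibly privatized) direction and magnitude."""
    return magnitude * np.asarray(unit)
```

For example, `decompose(np.array([3.0, 4.0]))` yields the unit vector `[0.6, 0.8]` and magnitude `5.0`, and `recompose` inverts the split exactly when no noise has been added.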
- FIG. 1 illustrates a system to enable private federated learning, according to an embodiment
- FIG. 2 illustrates an additional system to enable private federated learning, according to embodiments described herein;
- FIG. 3 is a block diagram of a system for generating privatized model updates, according to an embodiment
- FIG. 4 is a flow diagram of a method of performing private federated learning using the computing components and privatization techniques described herein;
- FIGS. 5A-5B illustrate techniques for privatizing model updates, according to an embodiment
- FIGS. 6A-6C illustrate algorithms to generate a privatized unit vector and privatized magnitude, according to embodiments
- FIG. 7 illustrates compute architecture on a client device that can be used to enable on-device training using machine learning algorithms, according to embodiments described herein;
- FIG. 8 is a block diagram of a device architecture for a mobile or embedded device, according to an embodiment.
- FIG. 9 is a block diagram of a computing system, according to an embodiment.
- Federated learning, also referred to as distributed learning, focuses on learning problems in which data is distributed across many devices.
- Let θ be the model parameters for a particular model that is to be trained using user data.
- The model parameters θ are transmitted to each device, and each device then trains a local model using its local data.
- A collection of b model differences {Δ1, . . . , Δb} is then aggregated to obtain Δ on a central server, which is then used to update the central model parameters: θ ← θ + Δ.
- the update to the shared model can then be deployed to the mobile devices. This feedback loop continues as the model improves and the user data changes.
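The feedback loop above can be sketched as follows. This is a toy, non-private illustration of the aggregation rule θ ← θ + Δ; `local_train` is a placeholder for on-device training, not an API from the patent.

```python
import numpy as np

def federated_round(theta, local_datasets, local_train):
    """One round of the feedback loop: devices train locally and the
    server averages the b reported differences and applies them."""
    # Each device trains on its own data and reports only a difference.
    diffs = [local_train(theta, data) - theta for data in local_datasets]
    # The server aggregates the differences into a single update delta.
    delta = np.mean(diffs, axis=0)
    # Central model parameters are updated: theta <- theta + delta.
    return theta + delta
```

In the private variant described below, each device would privatize its difference before reporting it, but the overall aggregate-and-apply loop is the same.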
- Some approaches to federated learning, even when performing on-device training, do not provide any formal privacy guarantees to the users that participate in the federated learning system. Even though data remains on device, the transmitted model updates are computed from user data and contain information about that personal data. A curious onlooker with access to the model updates may be able to reconstruct the data of individual users.
- Differential privacy has many desirable properties, including closure under post-processing and the graceful degradation of privacy parameters when multiple differentially private algorithms are composed, which have made it the de facto privacy definition in data analytics and machine learning.
- An objection to differential privacy in the central model is that users submit their data, perhaps through an encrypted channel, which is then decrypted on the server.
- The server is trusted to use a differential privacy algorithm with the data to reveal only a privatized result.
- An adversary with access to the server may be able to see the true model updates prior to any execution of a differential privacy algorithm.
- Local privacy protections provide numerous benefits, including avoiding the risks associated with maintaining private data. Additionally, local privacy protections allow transparent protection of user privacy, as private data never leaves an individual's device in the clear. However, local differential privacy can create challenges for learning systems.
- Embodiments described herein address the above deficiencies in the art by providing a private federated learning system that privatizes model updates submitted by users via a separated differential privacy model with protections against adversaries with some prior information about user updates.
- Separated differential privacy involves decomposing the weight vector that includes updates to a learning model into a unit vector and an associated magnitude. The decomposed vectors can then be separately privatized using techniques described herein.
- Separated differential privacy enables privatized learning by implementing a privacy model that is tailored towards protecting against an attacker that may wish to decode individual user data based on model updates, rather than an attacker that wants to differentiate between two inputs. This approach allows the use of a more relaxed privacy parameter ε, which improves the effectiveness of the learning process, while still providing protection against curious onlookers that may be able to obtain access to privatized model updates.
- This model of privacy is well suited to federated learning scenarios that use distributed model training.
- Separated differential privacy enables learning models to be trained in a decentralized setting while providing local privacy guarantees for the transmitted model updates from the devices.
- A private federated learning system can be enabled that provides comparable utility to a federated learning system that does not provide privacy safeguards.
- Privacy is enabled by obfuscating the individual updates to the server.
- Although a relaxed privacy parameter ε is used, user data is still protected against reconstruction by individuals (e.g., internal employees) that may have access to privatized updates.
- In one embodiment, fully ε-differentially-private techniques are used to enable privatization of the magnitude.
- In another embodiment, relative noise mechanisms are used to privatize the magnitude.
- An additional layer of privacy is enabled by encapsulating the separated differential privacy model within a central differential privacy layer on the learning server.
- The use of central differential privacy provides additional protection for updated learning models on the server against external adversaries that may have access to the model and any other information except the user data that the adversary wishes to decode.
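As a rough illustration of such a central layer, the server could clip the aggregated update and add Gaussian noise before applying it. This is a generic Gaussian-mechanism sketch of ours, not the patent's specific construction, and the calibration of `sigma` to concrete (ε, δ) parameters is omitted.

```python
import numpy as np

def centrally_privatize(aggregate, clip_norm, sigma, rng=None):
    """Illustrative central-DP layer: clip the aggregated update to a
    maximum l2 norm, then add Gaussian noise of scale sigma. This only
    sketches the shape of the mechanism, not the patent's procedure.
    """
    rng = np.random.default_rng() if rng is None else rng
    aggregate = np.asarray(aggregate, dtype=float)
    norm = np.linalg.norm(aggregate)
    if norm > clip_norm:
        # Clipping bounds any one batch's influence on the model.
        aggregate = aggregate * (clip_norm / norm)
    return aggregate + rng.normal(0.0, sigma, size=aggregate.shape)
```

Clipping bounds the sensitivity of the aggregate, which is what makes calibrating the Gaussian noise to a privacy guarantee possible.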
- FIG. 1 illustrates a system 100 to enable private federated learning, according to an embodiment.
- The system 100 includes a server 130 that can receive data from a set of client devices 110a-110n, 111a-111n, 112a-112n over a network 120.
- The server 130 can be any kind of server, including an individual server or a cluster of servers.
- The server 130 can also be or include a cloud-based server, application server, backend server, virtual server, or a combination thereof.
- The network 120 can be any suitable type of wired or wireless network, such as a local area network (LAN), a wide area network (WAN), or a combination thereof.
- Each of the client devices can include any type of computing device such as a desktop computer, a tablet computer, a smartphone, a television set top box, a smart speaker system, a gaming system, or other computing device.
- A client device can be an iPhone®, Apple® Watch, Apple® TV, HomePod™, etc., and can be associated with a user within a large set of users to which tasks can be crowdsourced with the permission of the user.
- The server 130 stores a machine learning model 131 (e.g., model M0), which can be implemented using one or more neural networks, such as but not limited to a deep learning neural network.
- The machine learning model 131 can be implemented using, for example, a convolutional neural network (CNN) or a recurrent neural network (RNN), including a long short-term memory (LSTM) variant of an RNN.
- The machine learning model 131 can include a set of model weights that can be updated based on an aggregated model update 135, which is generated from aggregated privatized model updates sent by the set of client devices 110a-110n, 111a-111n, 112a-112n.
- The client devices can be organized into device groups (e.g., device group 110, device group 111, device group 112) that can each contain multiple client devices.
- Each device group can contain n devices, where n can be any number of devices.
- Device group 110 can contain client devices 110a-110n.
- Device group 111 can contain client devices 111a-111n.
- Device group 112 can contain client devices 112a-112n.
- Each device group can contain up to 128 devices, although the number of client devices in each device group can vary across embodiments and is not limited to any specific number of devices.
- Each of the client devices can include a local machine learning module.
- Client devices 110a-110n of device group 110 can each contain a corresponding local machine learning module 136a-136n.
- Client devices 111a-111n of device group 111 can each contain a corresponding local machine learning module 137a-137n.
- Client devices 112a-112n of device group 112 can each contain a corresponding local machine learning module 138a-138n.
- The local machine learning modules can be loaded on each client device during factory provisioning, or can be loaded or updated when a system image of the client device is updated.
- The machine learning model 131 of the server 130 can be transmitted to each local machine learning module over the network 120.
- The local machine learning models on the client devices can be individualized to each client device by training the local models using local data stored on the client device.
- Different types of data can be used to train the models, and the specifics of the models can vary based on the type of data used for training.
- In one embodiment, the machine learning model 131 and the local machine learning models are image classifier models.
- In another embodiment, the models are natural language processing models that are used to enable a predictive keyboard and/or keyboard autocorrect.
- The models can also be voice recognition or voice classification models that are used to improve voice recognition or voice classification capability for a virtual assistant.
- The local machine learning modules 136a-136n, 137a-137n, 138a-138n on each client device can generate model updates that are privatized by the client devices 110a-110n, 111a-111n, 112a-112n before transmission to the server 130.
- Client devices 110a-110n can each send privatized model updates 121.
- Client devices 111a-111n can each send privatized model updates 122.
- Client devices 112a-112n can each send privatized model updates 123.
- The privatized model updates can be sent through the network 120 to the server 130, where the updates can be processed into an aggregated model update 135. Updates are sent to the server while satisfying separated differential privacy for the local updates, and no raw user data is transmitted to the server. Separated differential privacy is used to protect the privatized model updates 121, 122, 123 from a reconstruction breach. A reconstruction breach occurs when a curious onlooker having access to the model updates is able to determine at least some detail about the user data on which the model was trained.
- FIG. 2 illustrates an additional system 200 to enable private federated learning, according to embodiments described herein.
- The system 200 includes a set of client devices 210a-210c (collectively, 210), which can be any of the client devices described above (e.g., client devices 110a-110n, 111a-111n, 112a-112n).
- The client devices 210 can generate privatized model updates 212a-212c (e.g., privatized model update 212a from client device 210a, privatized model update 212b from client device 210b, privatized model update 212c from client device 210c), which can be transmitted to the server 130 via the network 120.
- The privatized model updates 212a-212c can be stripped of their IP addresses or other information that can be used to identify the client devices 210 prior to entering an ingestor 232 on the server 130.
- The ingestor 232 collects the data from the client devices 210, removes metadata, and forwards the data to an aggregator 233.
- The aggregator 233 takes the privatized model updates and aggregates them to form a single update to the current server model, which in the initial round is machine learning model 131 (e.g., model M0).
- A model updater 234 can then apply the update to the current server model to generate an updated machine learning model 235 (e.g., model M1).
- The privatized model updates 212a-212c can be protected using separated differential privacy as described herein.
- The aggregated model updates and/or the updated machine learning model 235 can be protected using the central model of differential privacy.
- FIG. 3 is a block diagram of a system 300 for generating privatized model updates, according to an embodiment.
- The system 300 includes a client device 310, which can be any of client devices 110a-110n, 111a-111n, 112a-112n or client devices 210.
- The client device 310 includes a machine learning module 361 that includes, at least initially, a copy of machine learning model 131, which can be provided by the server 130.
- A local training module 330 can be used to train the machine learning model 131 based on local client data 332 to generate a local model update 333.
- The local model update 333 is then privatized using a privacy engine 353.
- The privacy engine 353 includes a privacy daemon 356 and a privacy framework or application programming interface (API) 355.
- The privacy engine 353 can use various tools, such as hash functions, including cryptographic hash functions, to privatize the local model update 333 to the machine learning model 131 using one or more of a variety of privatization techniques, including but not limited to separated differential privacy as described herein.
- The privatized local model update 333 can then be transmitted to the server 130 via the network 120.
- The server 130 can include a receive module 351 and an ingestor/aggregator 341.
- The receive module 351 can asynchronously receive privatized model updates from a large plurality of client devices and provide the updates to the ingestor/aggregator 341.
- The receive module 351 can remove latent identifiers such as IP addresses or other data that might identify the client device 310.
- The ingestor/aggregator 341 can include components of the ingestor 232 and aggregator 233 shown in FIG. 2 and can perform similar operations, such as removing metadata, session identifiers, and other identifying information, and aggregating the privatized information to generate an aggregated model update 331.
- The aggregated model update 331 can be used by the model updater 234 to update machine learning model 131 (e.g., model M0) into updated machine learning model 235 (e.g., model M1).
- A deployment module 352 can then be used to deploy the updated machine learning model 235 to the client devices for an additional round of training. While the updated machine learning model 235 is on the server 130, the model can be protected using central differential privacy.
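The ingest-and-aggregate path might look like the following sketch. The dictionary field names (`update`, `ip`, `session_id`) are illustrative stand-ins for the metadata the ingestor 232 and aggregator 233 handle; they are not identifiers from the patent.

```python
import numpy as np

def ingest_and_aggregate(submissions):
    """Drop identifying metadata and average the privatized updates.

    Each submission is modeled as a dict that may carry identifiers
    (e.g., an IP address or session id); only the update payload is
    retained before aggregation.
    """
    payloads = [np.asarray(s["update"], dtype=float) for s in submissions]
    # Equal weighting of every device's privatized update.
    return np.mean(payloads, axis=0)
```

The resulting vector plays the role of the aggregated model update 331 that the model updater applies to the current server model.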
- FIG. 4 is a flow diagram of a method 400 of performing private federated learning using the computing components and privatization techniques described herein. Operations of the method 400 will be described below, along with relevant mathematical descriptions of the operations to be performed.
- The method 400 includes operations (block 401) to transmit a machine learning model from a server to a set of client devices. Let θ be the model parameters for a particular model that is to be trained using user data. The model parameters θ are transmitted to each device in a set of client devices.
- The server can be, for example, server 130 as described herein.
- The client devices can be, for example, client devices 110a-110n, 111a-111n, 112a-112n, client devices 210, or the client device 310 as described herein.
- Each client device then performs operations (block 402 ) to train an individual machine learning model using local data on the client device.
- Individualized model updates can be generated based on an individualized difference between a previous model (e.g., starting model, previous model iteration, etc.) and a most recent locally trained model on the individual client devices (block 403 ).
- The individual model updates can then be privatized on the set of client devices using separated differential privacy (block 404).
- The privatized model updates are then sent from the set of client devices to a central learning server (block 405).
- A collection of b model differences {Δ1, . . . , Δb} is then aggregated to obtain an aggregate model update Δ on the central server (block 406).
- The aggregate model update can then be used to update the central model parameters: θ ← θ + Δ.
- The update to the shared model can then be deployed to the client devices.
- These operations continue in a loop as the model improves and user data changes.
- Central differential privacy techniques are used to protect the model updates on the server.
- Method 400 can additionally include an operation to privatize the aggregate model update on the learning server using central differential privacy (block 407).
- The privatized model updates can then be used to update the server machine learning model (block 408). Additional details are provided below for the operations of method 400.
- The server has some global model parameters θ ∈ ℝ^d, which can be the model weights for each layer of a neural net, or the model weights of just the last layer in the case of transfer learning.
- Each model can have a particular neural net architecture and loss function, which in one embodiment are assumed to be consistent across devices.
- Each model also has some set of hyperparameters, including parameters such as learning rate, dropout rate, mini-batch size, number of rounds for training, trainable parameters, etc. These hyperparameters are tuned on the server and sent, along with the current server model θ, to each device.
- The server then sends the current model θ to a batch of devices, where each device will train the model using local data.
- The batch of devices will be of expected size q·N, where N is the total number of users opted in for training and q is the subsampling rate, so that user i will be selected for local training with probability q.
- The selected batch can be denoted as B ⊆ [N].
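The subsampling step above can be sketched as standard Poisson sampling, where each user is included independently with probability q. This is a generic illustration, not code from the patent.

```python
import random

def select_batch(num_users, q, seed=None):
    """Select each of the N opted-in users independently with
    probability q, giving a batch of expected size q*N."""
    rng = random.Random(seed)
    return [i for i in range(num_users) if rng.random() < q]
```

Because selection is independent per user, the realized batch size fluctuates around q·N from round to round.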
- User data is leveraged to update the central model parameters θ.
- A possible update rule is gradient descent with a learning rate γ.
- Alternatively, a gradient update with a per-user step size γi can be performed.
- Stochastic proximal-point-type updates can also be applied.
- Z1 is an unbiased (private) estimate of Δi/∥Δi∥.
- Z2 is an unbiased estimate of ∥Δi∥.
- A privatized difference Δ̂i is transmitted via a separated differential privacy algorithm.
- The privatized difference Δ̂i is generated by a combination of a unit vector privatization technique, PrivUnit2, and a magnitude privatization technique, AbsMagnDP, where the pair is separated differentially private.
- Privatized unit vectors can also be generated via a mechanism PrivUnit∞.
- Magnitudes can be privatized via the mechanisms PrivMagn or RelMagnDP, which are described in further detail in FIGS. 6A-6C.
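The AbsMagnDP construction itself is given in FIGS. 6A-6C and is not reproduced here. As a generic stand-in, a bounded magnitude can be released with absolute error via the textbook Laplace mechanism; this is our simplification, not the patent's algorithm.

```python
import numpy as np

def privatize_magnitude_abs(r, r_max, epsilon, rng=None):
    """Generic epsilon-DP release of a magnitude r in [0, r_max] via
    the Laplace mechanism (absolute-error flavor). A stand-in for the
    patent's AbsMagnDP mechanism, shown only for intuition.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = min(max(float(r), 0.0), r_max)  # clamp to the known range
    scale = r_max / epsilon             # sensitivity of r is r_max
    return r + rng.laplace(0.0, scale)
```

The added noise has a fixed (absolute) scale r_max/ε regardless of how large r is, which is the defining property of the absolute-error variants mentioned above; the relative-error variants instead scale the noise with r.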
- To generate an aggregate model update on the learning server (block 406), once the server has all of the privatized updates {Δ̂i}i∈B from each device in the selected batch B, the server aggregates the privatized updates to form a single update to the server model.
- The aggregated update is then formed for the N users with each device's update weighted equally.
- Aggregate model privatization (block 407 ) is performed to prevent any one user from substantially impacting the overall server model with their local data, to prevent overfitting, and to enable privacy for server data.
- Aggregate model privatization is performed by incorporating central differential privacy into the separated differential privacy model used for model updates.
- Each update to θ is (ε, δ)-differentially private.
- The privacy model used for private federated learning as described herein considers settings in which a random variable W over a domain 𝒲 is present, with two corresponding downstream variables U from domain 𝒰 and R from domain ℛ, governed by the Markovian graphical structure U ← W → R.
- FIGS. 5A-5B illustrate techniques for privatizing model updates, according to an embodiment.
- FIG. 5 A illustrates a Markovian graphical structure 500 between data X and privatized pair (Z 1 ,Z 2 ).
- Data X is transformed into a vector W ( 502 ), which may simply be an identity transformation, but can also be a gradient of the loss on the datum X or other derived statistic.
- Unit direction U (504) and magnitude R (503) can be privatized into privatized unit vector Z₁ (506) and privatized radius or magnitude Z₂ (505).
- embodiments described herein present several mechanisms M₁: 𝒰 → 𝒵₁ and M₂: ℛ → 𝒵₂ that map the pair (U, R) into the privatized pair (Z₁, Z₂).
- the pair (Z 1 ,Z 2 ) does not give substantial information about the input, which allows separated differential privacy to protect against reconstruction breaches.
- the separated differential privacy protections are specifically tailored to protect against certain curious onlooker adversaries, which can be represented by prior distributions over the triple (U, W, R).
- FIG. 5 B illustrates a method 510 of privatizing model updates, according to an embodiment.
- Method 510 can be performed on a client device (e.g., client device 310 ) of the set of client devices selected to send a model update to a model update server (e.g., server 130 ).
- method 510 includes an operation (block 511) to obtain a weight vector that represents a difference between a previously received model and a recently trained model. This difference represents the model update that is to be transmitted to the learning server to update the current server model.
- Method 510 additionally includes an operation (block 513) to privatize the unit vector.
- the unit vector can be privatized via mapping mechanism M₁: 𝒰 → 𝒵₁ described above.
- the unit vector is privatized using a technique (PrivUnit₂) to minimize the ℓ₂-norm of the privatized vector. Other techniques can also be used.
- Method 510 additionally includes an operation (block 514 ) to separately privatize the magnitude.
- the magnitude can be privatized via mapping mechanism M₂: ℛ → 𝒵₂ described above.
- the mapping mechanism can be based on relative noise (PrivMagn), with privatization based on assumptions about the data available to the attacker.
- the mapping mechanism can also be a differentially private mechanism, which can be an absolute error-based mechanism (AbsMagnDP) or a relative error-based mechanism (RelMagnDP).
- Method 510 additionally includes an operation (block 515 ) to transmit the privatized unit vector and magnitude to the learning server as the model update.
- the model update is represented by model difference Δ̂ᵢ
- the model difference is transmitted as a differentially private pair via PrivUnit₂ and PrivMagn, where the pair is separated differentially private, such that
- the unit vector for the model difference can be transmitted using mechanism PrivUnit∞, which is based on ℓ∞-unit vectors.
- the magnitude can also be privatized using the relative error-based mechanism RelMagnDP or, under additional assumptions about the adversary, the relative noise-based mechanism PrivMagn.
- FIGS. 6A-6C illustrate algorithms to generate a privatized unit vector and privatized magnitude, according to embodiments.
- FIG. 6 A illustrates methods 601 , 602 , 603 to generate a privatized unit vector and privatized magnitude, according to embodiments.
- method 601 can be used to generate privatized unit vector PrivUnit₂. Specifically, method 601 takes as input unit vector u ∈ 𝕊^(d−1) and parameter γ ∈ [0, 1] and returns privatized vector Z, which has the property that E[Z | u] ∝ u.
- The mechanism of method 601 then draws a vector V uniformly from a cap {v ∈ 𝕊^(d−1) | ⟨v, u⟩ ≥ γ}
- the mechanism of method 601 then sets a and z values such that
- Method 601 then makes use of the incomplete beta function
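The core draw of method 601 can be illustrated with a naive rejection sampler. This sketch assumes the cap/complement split described above, takes the cap probability p_cap as an input rather than deriving it from ε, and omits the debiasing constants, so it is not the patent's efficient mechanism (method 606 addresses efficiency).

```python
import numpy as np

def sample_cap_or_complement(u, gamma, p_cap, rng=None):
    """Draw V uniformly from the cap {v : <v, u> >= gamma} with
    probability p_cap, otherwise uniformly from its complement,
    using rejection sampling on the unit sphere.

    Illustrative only: rejection sampling becomes slow as gamma
    grows, which is why an efficient sampler is used in practice.
    """
    rng = rng or np.random.default_rng()
    d = len(u)
    want_cap = rng.random() < p_cap
    while True:
        v = rng.normal(size=d)        # Gaussian -> uniform on sphere
        v /= np.linalg.norm(v)
        in_cap = float(np.dot(v, u)) >= gamma
        if in_cap == want_cap:
            return v
```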
- E[Z | u] ∝ u for all u ∈ 𝕊^(d−1), where the size of Z (its ℓ∞-norm ∥Z∥∞) is as small as possible.
- Method 602 additionally includes an operation to draw random vector V according to the following distribution,
- V ∼ (uniform on {v ∈ {−1, +1}^d | … })
- An additional operation can be performed to set
- Method 602 additionally includes an operation to set
- An efficient approach to implement the sampling of method 602 is to first sample a Bernoulli B with success probability
- the random variable B′ indicates the number of coordinates of û that the random vector V needs to match. Uniform sampling of B′ coordinates of û is performed and the corresponding coordinates of V are set to be the same. The remaining coordinates of V are then set to the flipped values of the corresponding coordinates of û.
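The coordinate-matching step just described can be sketched as follows. Sampling B and the conditional Binomial B′ is omitted (their distributions are given by the elided formulas), so B′ is taken here as an input; the function name is illustrative.

```python
import numpy as np

def match_coordinates(u_hat, b_prime, rng=None):
    """Construct V in {-1, +1}^d that agrees with u_hat on exactly
    b_prime uniformly chosen coordinates and is flipped elsewhere.

    This is the final assembly step for the efficient PrivUnit_inf
    sampler described above.
    """
    rng = rng or np.random.default_rng()
    d = len(u_hat)
    u = np.asarray(u_hat, dtype=float)
    v = -u                                        # start fully flipped
    keep = rng.choice(d, size=b_prime, replace=False)
    v[keep] = u[keep]                             # match b_prime coords
    return v
```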
- Method 603 generates a privatized magnitude (PrivMagn).
- Method 603 includes an operation to sample Y ∼ Uni[−v, v] and set
- FIG. 6 B illustrates methods 604 , 605 to generate privatized magnitudes for transmission to a server.
- Methods 604 and 605 present two mechanisms for the ε-differentially-private release of a single variable (value) r ∈ [0, r_max], where r_max is some a priori upper bound on r.
- Method 604 enables mechanism AbsMagnDP, which achieves order-optimal scaling for the mean-squared error E[(Z − r)²]
- Method 605 enables mechanism RelMagnDP, which achieves a truncated relative error guarantee, which for a fixed threshold in [e^(−ε/2), 1] is
- a value k ∈ ℕ is fixed. Then r is randomly rounded to an index value J taking values in {0, 1, 2, …, k} with the property that
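An illustrative sketch of this style of mechanism, under the assumption that the randomized-response and debiasing steps take the standard form for a (k + 1)-outcome alphabet. The patent's calibrated choice of k is omitted; k is an input here, and the function name is not from the patent.

```python
import numpy as np

def abs_magn_dp(r, r_max, eps, k, rng=None):
    """Sketch of an AbsMagnDP-style release of r in [0, r_max].

    Randomly round r to an index J in {0, ..., k} so the grid point is
    unbiased for r, release J via eps-DP randomized response over the
    k + 1 indices, then debias to get an unbiased estimator Z of r.
    """
    rng = rng or np.random.default_rng()
    grid = np.linspace(0.0, r_max, k + 1)
    # Randomized rounding between the two neighboring grid indices.
    x = r / r_max * k
    lo = int(np.floor(x))
    j = lo + (rng.random() < (x - lo))
    # eps-DP randomized response over the k + 1 outcomes.
    e = np.exp(eps)
    if rng.random() < e / (e + k):
        m = j
    else:
        others = [i for i in range(k + 1) if i != j]
        m = others[rng.integers(len(others))]
    # Debias so that E[Z] = r.
    return (grid[m] * (e + k) - grid.sum()) / (e - 1)
```

With very large eps the randomized response almost surely keeps the rounded index, so the debiased output recovers grid values exactly, which makes the estimator easy to sanity-check.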
- Method 605 provides an alternate differential privacy mechanism that enables a relative error guarantee.
- r can be assigned to an end point of the interval at random to obtain an unbiased estimator.
- a randomized response is then used to obtain Ĵ, which is then debiased to obtain Z, which is an estimator for r.
- FIG. 6C illustrates method 606, which enables an optimization that provides for the efficient sampling of unit vectors in PrivUnit₂. Specifically, method 606 provides an efficient mechanism to perform the sampling of Z in method 601.
- Method 606, given u ∈ 𝕊^(d−1) and γ ∈ [0, 1], specifies to sample
- FIG. 7 illustrates compute architecture 700 on a client device that can be used to enable on-device training using machine learning algorithms, according to embodiments described herein.
- compute architecture 700 includes a client machine learning framework 702 that can be configured to leverage a processing system 720 on a client device.
- the client machine learning framework 702 includes a vision/image framework 704 , a language processing framework 706 , and one or more other frameworks 708 , which each can reference primitives provided by a core machine learning framework 710 .
- the core machine learning framework 710 can access resources provided via a CPU acceleration layer 712 , neural network processor acceleration layer 713 and a GPU acceleration layer 714 .
- the CPU acceleration layer 712 , neural network processor acceleration layer 713 , and the GPU acceleration layer 714 each facilitate access to a processing system 720 on the various client devices described herein.
- the processing system includes an application processor 722 , a neural network processor 723 , and a graphics processor 724 , each of which can be used to accelerate operations of the core machine learning framework 710 and the various higher-level frameworks that operate via primitives provided via the core machine learning framework.
- the application processor 722 and graphics processor 724 include hardware that can be used to perform general-purpose processing and graphics specific processing for the core machine learning framework 710 .
- the neural network processor 723 includes hardware that is tuned specifically to accelerate processing operations for artificial neural networks.
- the neural network processor 723 can increase speed at which neural network operations are performed but is not required to enable the operation of the client machine learning framework 702 . For example, training can also be performed using the application processor 722 and/or the graphics processor 724 .
- the various frameworks and hardware resources of the compute architecture 700 can be used for inferencing operations as well as training operations.
- a client device can use the compute architecture 700 to perform supervised learning via a machine learning model as described herein, such as but not limited to a CNN, RNN, or LSTM model.
- the client device can then use the trained machine learning model to perform classification operations for one or a variety of predictive models including but not limited to a natural language processing model, a predictive text model, an application suggestion model, an application activity suggestion model, a voice classification model, and an image classification model.
- FIG. 8 is a block diagram of a device architecture 800 for a mobile or embedded device, according to an embodiment.
- the device architecture 800 includes a memory interface 802 , a processing system 804 including one or more data processors, image processors and/or graphics processing units, and a peripherals interface 806 .
- the various components can be coupled by one or more communication buses or signal lines.
- the various components can be separate logical components or devices or can be integrated in one or more integrated circuits, such as in a system on a chip integrated circuit.
- the memory interface 802 can be coupled to memory 850 , which can include high-speed random-access memory such as static random-access memory (SRAM) or dynamic random-access memory (DRAM) and/or non-volatile memory, such as but not limited to flash memory (e.g., NAND flash, NOR flash, etc.).
- Sensors, devices, and subsystems can be coupled to the peripherals interface 806 to facilitate multiple functionalities.
- a motion sensor 810 , a light sensor 812 , and a proximity sensor 814 can be coupled to the peripherals interface 806 to facilitate the mobile device functionality.
- One or more biometric sensor(s) 815 may also be present, such as a fingerprint scanner for fingerprint recognition or an image sensor for facial recognition.
- Other sensors 816 can also be connected to the peripherals interface 806 , such as a positioning system (e.g., GPS receiver), a temperature sensor, or other sensing device, to facilitate related functionalities.
- a camera subsystem 820 and an optical sensor 822 , e.g., a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) optical sensor, can be utilized to facilitate camera functions, such as recording photographs and video clips.
- wireless communication subsystems 824 can include radio frequency receivers and transmitters and/or optical (e.g., infrared) receivers and transmitters.
- the specific design and implementation of the wireless communication subsystems 824 can depend on the communication network(s) over which a mobile device is intended to operate.
- a mobile device including the illustrated device architecture 800 can include wireless communication subsystems 824 designed to operate over a GSM network, a CDMA network, an LTE network, a Wi-Fi network, a Bluetooth network, or any other wireless network.
- the wireless communication subsystems 824 can provide a communications mechanism over which a media playback application can retrieve resources from a remote media server or scheduled events from a remote calendar or event server.
- An audio subsystem 826 can be coupled to a speaker 828 and a microphone 830 to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and telephony functions.
- the audio subsystem 826 can be a high-quality audio system including support for virtual surround sound.
- the I/O subsystem 840 can include a touch screen controller 842 and/or other input controller(s) 845 .
- the touch screen controller 842 can be coupled to a touch sensitive display system 846 (e.g., touch-screen).
- the touch sensitive display system 846 and touch screen controller 842 can, for example, detect contact and movement and/or pressure using any of a plurality of touch and pressure sensing technologies, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with a touch sensitive display system 846 .
- Display output for the touch sensitive display system 846 can be generated by a display controller 843 .
- the display controller 843 can provide frame data to the touch sensitive display system 846 at a variable frame rate.
- a sensor controller 844 is included to monitor, control, and/or process data received from one or more of the motion sensor 810 , light sensor 812 , proximity sensor 814 , or other sensors 816 .
- the sensor controller 844 can include logic to interpret sensor data to determine the occurrence of one or more motion events or activities by analysis of the sensor data from the sensors.
- the I/O subsystem 840 includes other input controller(s) 845 that can be coupled to other input/control devices 848 , such as one or more buttons, rocker switches, thumb-wheel, infrared port, USB port, and/or a pointer device such as a stylus, or control devices such as an up/down button for volume control of the speaker 828 and/or the microphone 830 .
- the memory 850 coupled to the memory interface 802 can store instructions for an operating system 852 , such as a portable operating system interface (POSIX)-compliant or non-compliant operating system, or an embedded operating system.
- the operating system 852 may include instructions for handling basic system services and for performing hardware dependent tasks.
- the operating system 852 can be a kernel.
- the memory 850 can also store communication instructions 854 to facilitate communicating with one or more additional devices, one or more computers and/or one or more servers, for example, to retrieve web resources from remote web servers.
- the memory 850 can also include user interface instructions 856 , including graphical user interface instructions to facilitate graphic user interface processing.
- the memory 850 can store sensor processing instructions 858 to facilitate sensor-related processing and functions; telephony instructions 860 to facilitate telephone-related processes and functions; messaging instructions 862 to facilitate electronic-messaging related processes and functions; web browser instructions 864 to facilitate web browsing-related processes and functions; media processing instructions 866 to facilitate media processing-related processes and functions; location services instructions including GPS and/or navigation instructions 868 and Wi-Fi based location instructions to facilitate location based functionality; camera instructions 870 to facilitate camera-related processes and functions; and/or other software instructions 872 to facilitate other processes and functions, e.g., security processes and functions, and processes and functions related to the systems.
- the memory 850 may also store other software instructions such as web video instructions to facilitate web video-related processes and functions; and/or web shopping instructions to facilitate web shopping-related processes and functions.
- the media processing instructions 866 are divided into audio processing instructions and video processing instructions to facilitate audio processing-related processes and functions and video processing-related processes and functions, respectively.
- a mobile equipment identifier such as an International Mobile Equipment Identity (IMEI) 874 or a similar hardware identifier can also be stored in memory 850 .
- Each of the above identified instructions and applications can correspond to a set of instructions for performing one or more functions described above. These instructions need not be implemented as separate software programs, procedures, or modules.
- the memory 850 can include additional instructions or fewer instructions.
- various functions may be implemented in hardware and/or in software, including in one or more signal processing and/or application specific integrated circuits.
- FIG. 9 is a block diagram of a computing system 900 , according to an embodiment.
- the illustrated computing system 900 is intended to represent a range of computing systems (either wired or wireless) including, for example, desktop computer systems, laptop computer systems, tablet computer systems, cellular telephones, personal digital assistants (PDAs) including cellular-enabled PDAs, set top boxes, entertainment systems or other consumer electronic devices, smart appliance devices, or one or more implementations of a smart media playback device.
- Alternative computing systems may include more, fewer and/or different components.
- the computing system 900 can be used to provide the computing device and/or a server device to which the computing device may connect.
- the computing system 900 includes bus 935 or other communication device to communicate information, and processor(s) 910 coupled to bus 935 that may process information. While the computing system 900 is illustrated with a single processor, the computing system 900 may include multiple processors and/or co-processors.
- the computing system 900 further may include memory 920 , such as random-access memory (RAM) or other dynamic storage device coupled to the bus 935 .
- the memory 920 may store information and instructions that may be executed by processor(s) 910 .
- the memory 920 may also be used to store temporary variables or other intermediate information during execution of instructions by the processor(s) 910 .
- the computing system 900 may also include read only memory (ROM) 930 and/or another data storage device 940 coupled to the bus 935 that may store information and instructions for the processor(s) 910 .
- the data storage device 940 can be or include a variety of storage devices, such as a flash memory device, a magnetic disk, or an optical disc and may be coupled to computing system 900 via the bus 935 or via a remote peripheral interface.
- the computing system 900 may also be coupled, via the bus 935 , to a display device 950 to display information to a user.
- the computing system 900 can also include an alphanumeric input device 960 , including alphanumeric and other keys, which may be coupled to bus 935 to communicate information and command selections to processor(s) 910 .
- Another type of user input device includes a cursor control 970 device, such as a touchpad, a mouse, a trackball, or cursor direction keys to communicate direction information and command selections to processor(s) 910 and to control cursor movement on the display device 950 .
- the computing system 900 may also receive user input from a remote device that is communicatively coupled via one or more network interface(s) 980 .
- the computing system 900 further may include one or more network interface(s) 980 to provide access to a network, such as a local area network.
- the network interface(s) 980 may include, for example, a wireless network interface having antenna 985 , which may represent one or more antenna(e).
- the computing system 900 can include multiple wireless network interfaces such as a combination of Wi-Fi, Bluetooth®, near field communication (NFC), and/or cellular telephony interfaces.
- the network interface(s) 980 may also include, for example, a wired network interface to communicate with remote devices via network cable 987 , which may be, for example, an Ethernet cable, a coaxial cable, a fiber optic cable, a serial cable, or a parallel cable.
- the network interface(s) 980 may provide access to a local area network, for example, by conforming to IEEE 802.11 standards, and/or the wireless network interface may provide access to a personal area network, for example, by conforming to Bluetooth standards. Other wireless network interfaces and/or protocols can also be supported.
- network interface(s) 980 may provide wireless communications using, for example, Time Division, Multiple Access (TDMA) protocols, Global System for Mobile Communications (GSM) protocols, Code Division, Multiple Access (CDMA) protocols, Long Term Evolution (LTE) protocols, and/or any other type of wireless communications protocol.
- the computing system 900 can further include one or more energy sources 905 and one or more energy measurement systems 945 .
- Energy sources 905 can include an AC/DC adapter coupled to an external power source, one or more batteries, one or more charge storage devices, a USB charger, or other energy source.
- Energy measurement systems include at least one voltage or amperage measuring device that can measure energy consumed by the computing system 900 during a predetermined period of time. Additionally, one or more energy measurement systems can be included that measure, e.g., energy consumed by a display device, cooling subsystem, Wi-Fi subsystem, or other frequently used or high-energy consumption subsystem.
- the hash functions described herein can utilize specialized hardware circuitry (or firmware) of the system (client device or server).
- the function can be a hardware-accelerated function.
- the system can use a function that is part of a specialized instruction set.
- the hardware can use an instruction set which may be an extension to an instruction set architecture for a particular type of microprocessors. Accordingly, in an embodiment, the system can provide a hardware-accelerated mechanism for performing cryptographic operations to improve the speed of performing the functions described herein using these instruction sets.
- the hardware-accelerated engines/functions are contemplated to include any implementations in hardware, firmware, or combination thereof, including various configurations which can include hardware/firmware integrated into the SoC as a separate processor, or included as special purpose CPU (or core), or integrated in a coprocessor on the circuit board, or contained on a chip of an extension circuit board, etc.
- this gathered data may include personal information data that uniquely identifies or can be used to identify a specific person.
- personal information data can include demographic data, location-based data, online identifiers, telephone numbers, email addresses, social media IDs, home addresses, data or records relating to a user's health or level of fitness (e.g., vital signs measurements, medication information, exercise information), date of birth, or any other identifying or personal information.
- the present disclosure recognizes that the use of such personal information data, in the present technology, can be used to the benefit of users.
- the personal information data can be used to learn new words, improve keyboard layouts, improve autocorrect engines for keyboards, and to enable an electronic device to better anticipate the needs of a user.
- other uses for personal information data that benefit the user are also contemplated by the present disclosure.
- health and fitness data may be used, in accordance with the user's preferences, to provide insights into their general wellness, or may be used as positive feedback to individuals using technology to pursue wellness goals.
- the present disclosure contemplates that those entities responsible for the collection, analysis, disclosure, transfer, storage, or other use of such personal information data will comply with well-established privacy policies and/or privacy practices.
- such entities would be expected to implement and consistently apply privacy practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining the privacy of users.
- Such information regarding the use of personal data should be prominently and easily accessible by users and should be updated as the collection and/or use of data changes.
- personal information from users should be collected for legitimate uses only. Further, such collection/sharing should occur only after receiving the consent of the users or other legitimate basis specified in applicable law. Additionally, such entities should consider taking any needed steps for safeguarding and securing access to such personal information data and ensuring that others with access to the personal information data adhere to their privacy policies and procedures.
- policies and practices should be adapted for the particular types of personal information data being collected and/or accessed and adapted to applicable laws and standards, including jurisdiction-specific considerations which may serve to impose a higher standard. For instance, in the US, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly.
- the present disclosure also contemplates embodiments in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data.
- the present technology can be configured to allow users to select to “opt in” or “opt out” of participation in the collection of personal information data during registration for services or anytime thereafter.
- the present disclosure contemplates providing notifications relating to the access or use of personal information. For instance, a user may be notified upon downloading an app that their personal information data will be accessed and then reminded again just before personal information data is accessed by the app.
- personal information data should be managed and handled in a way to minimize risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed.
- data de-identification can be used to protect a user's privacy. De-identification may be facilitated, when appropriate, by removing identifiers, controlling the amount or specificity of data stored (e.g., collecting location data at city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or other methods such as differential privacy.
- the present disclosure broadly covers use of personal information data to implement one or more various disclosed embodiments
- the present disclosure also contemplates that the various embodiments can also be implemented without the need for accessing such personal information data. That is, the various embodiments of the present technology are not rendered inoperable due to the lack of all or a portion of such personal information data. For example, crowdsourcing of sequences can be performed over a large number of users and is based on aggregated, non-personal information data. A large number of individual users can opt out of sending data to the sequence learning server and overall trends can still be detected.
- One embodiment described herein provides for a non-transitory machine-readable medium storing instructions to cause one or more processors of a data processing system to perform operations comprising receiving a machine learning model from a server at a client device, training the machine learning model using local data at the client device, generating an update for the machine learning model, the update including a weight vector that represents a difference between the received machine learning model and the trained machine learning model, privatizing the update for the machine learning model, and transmitting the privatized update for the machine learning model to the server.
- One embodiment described herein provides for a data processing system comprising a memory to store instructions and one or more processors to execute the instructions.
- the instructions cause the one or more processors to receive a machine learning model from a server at a client device, train the machine learning model using local data at the client device to generate a trained machine learning model, generate an update for the machine learning model, the update including a weight vector that represents a difference between the machine learning model and the trained machine learning model, privatize the update for the machine learning model, and transmit the privatized update for the machine learning model to the server.
- One embodiment described herein provides for a method comprising receiving a machine learning model from a server at a client device, training the machine learning model using local data at the client device to generate a trained machine learning model, generating an update for the machine learning model, the update including a weight vector that represents a difference between the machine learning model and the trained machine learning model, privatizing the update for the machine learning model, and transmitting the privatized update for the machine learning model to the server.
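The client-side sequence recited above can be sketched end to end; train_fn and privatize_fn stand in for the local training procedure and the separated privatization mechanisms, and all names here are illustrative, not from the patent.

```python
import numpy as np

def client_round(server_weights, local_data, train_fn, privatize_fn):
    """One client step of the federated protocol described above:
    receive the model, train locally on local data, form the
    difference (weight-vector) update, privatize it, and return it
    for transmission to the server.
    """
    trained = train_fn(server_weights, local_data)
    delta = trained - server_weights          # the model update
    return privatize_fn(delta)

# Toy usage: one gradient-like step and a no-op "privatization".
train = lambda w, xs: w - 0.1 * np.mean(xs, axis=0)   # illustrative
priv = lambda d: d                                     # placeholder
update = client_round(np.zeros(2), [np.array([1.0, 2.0])], train, priv)
# update == [-0.1, -0.2]
```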
- local model updates generated by user devices are privatized using mechanisms that can be shown to be sufficient to guarantee strong reconstruction protections for high-dimensional data for a large range of (ε, δ) parameters when the adversary knows relatively little a priori about the actual input.
- the various privacy mechanisms described herein can be employed without reducing the utility of the data for learning operations.
- separated differential privacy mechanisms are employed.
- the separated differential privacy mechanisms can decompose a model update into a unit vector and magnitude, then separately privatize the unit vector and magnitude for each update to the machine learning model before the update is transmitted by the user device.
- the magnitude is privatized with absolute error.
- the magnitude is privatized with relative error.
- the unit vector is privatized based on ℓ₂-unit vectors on the unit sphere.
- the unit vector is privatized based on ℓ∞-unit vectors on the unit cube.
- the machine learning models described herein can be used in a variety of applications, including natural language processing, image classification, or voice classification. After a model is updated using aggregated model updates, the updated model, or simply the updates to the model, can be re-transmitted to the client devices for further training and updates.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Neurology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
θ_i ← Update(x_i, θ)
where Update denotes the update rule on each device. Any update procedure can be used. A possible update rule is gradient descent, where for learning rate η > 0 the device computes θ_i ← θ − η∇_θ ℓ(θ; x_i) for a local loss function ℓ.
Clip(Δ̂_i; S) = Δ̂_i · min{S/‖Δ̂_i‖₂, 1}.
described above has ℓ2-sensitivity at most S/(qN) (modifying a single update Δ_i can cause Δ̂ to change by at most S/(qN) in ℓ2-distance). Consequently, adding appropriately calibrated Gaussian noise yields (ε, δ)-approximate differential privacy. Assume a total of T global updates are computed, and let ε > 0 and δ ∈ (0,1) be the desired approximate privacy parameters. Letting
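The clipping and noising steps can be sketched as follows; the calibration of the noise multiplier to a target (ε, δ) is omitted, and `noisy_average` simply treats the number of received updates as qN:

```python
import numpy as np

def clip_update(delta, S):
    """Clip(delta; S) = delta * min(S / ||delta||_2, 1)."""
    norm = np.linalg.norm(delta)
    return delta if norm == 0 else delta * min(S / norm, 1.0)

def noisy_average(updates, S, sigma, rng):
    """Average clipped updates, then add Gaussian noise scaled by the
    l2-sensitivity S/(qN) of the average (here qN = len(updates)).
    Calibrating sigma to a target (eps, delta) is omitted."""
    qN = len(updates)
    avg = sum(clip_update(d, S) for d in updates) / qN
    sensitivity = S / qN
    return avg + rng.normal(0.0, sigma * sensitivity, size=avg.shape)
```

Because each clipped update has ℓ2-norm at most S, replacing one client's contribution moves the average by at most S/(qN), which is exactly the sensitivity the Gaussian noise is scaled against.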
Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ
-
- (i) The mechanism M₁ : 𝕊^{d−1} → 𝒵₁ is ε-differentially private, i.e., for any u, u′ ∈ 𝕊^{d−1} and any outcome set S ⊂ 𝒵₁,
-
- (ii) The mechanism M₂ : ℝ → 𝒵₂ is ρ-differentially private, i.e., for any r, r′ ∈ ℝ and any outcome set S ⊂ 𝒵₂,
or otherwise uniformly from its complement {v ∈ 𝕊^{d−1} | ⟨v, u⟩ < γ}. The mechanism of
and transmits private value Z.
and debias the vector
and transmit Z.
If B = 0, then use rejection sampling to generate a uniform random vector V ∼ Uni({−1, 1}^d), accepting only if ⟨V, Û⟩ ≤ K. Otherwise, if B = 1, then sample a conditional binomial random variable B′ with the following CDF using the inverse transform sampling technique:
The privatized magnitude r·X can then be transmitted to the server.
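The B = 0 branch above amounts to rejection sampling over the hypercube. A sketch, with a hypothetical `max_tries` cap added so the loop cannot run unbounded:

```python
import numpy as np

def sample_hypercube_below_cap(u_hat, K, rng, max_tries=100000):
    """Rejection sampling for the B = 0 branch described above: draw
    V uniformly from {-1, 1}^d until <V, u_hat> <= K."""
    d = u_hat.size
    for _ in range(max_tries):
        V = rng.choice([-1.0, 1.0], size=d)
        if V @ u_hat <= K:
            return V
    raise RuntimeError("rejection sampling did not terminate")
```

For thresholds K near zero the acceptance probability is roughly one half, so the loop terminates quickly in expectation.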
for ε≥1.
Next, randomized response is employed over k outcomes to obtain Ĵ, which is then debiased to obtain Z, an estimator for r.
E₀ = [0, vα], E_i = [vⁱα, vⁱ⁺¹α] for i = 1, …, k−1.
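Given those intervals, the randomized response over k outcomes can be sketched as follows. Only the privatized bin index Ĵ is returned; the debiasing step that produces the estimator Z is omitted, and the geometric bin edges follow the intervals as reconstructed above:

```python
import numpy as np

def privatize_magnitude_rr(r, alpha, v, k, eps, rng):
    """Randomized response over the k geometric bins
    E_0 = [0, v*alpha], E_i = [v^i*alpha, v^(i+1)*alpha]: report the
    true bin index with probability e^eps / (e^eps + k - 1), otherwise
    a uniformly random other bin (debiasing to obtain Z is omitted)."""
    edges = [0.0] + [v**i * alpha for i in range(1, k + 1)]
    # True bin index J of the (clipped) magnitude r.
    J = min(int(np.searchsorted(edges, r, side='right')) - 1, k - 1)
    p_true = np.exp(eps) / (np.exp(eps) + k - 1)
    if rng.random() < p_true:
        return J
    other = [j for j in range(k) if j != J]
    return int(rng.choice(other))
```

Because the bins grow geometrically, the bin width scales with the magnitude itself, which is what makes the relative-error guarantee on the privatized magnitude possible.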
If Y=1, then sample B′=2B−1 where
conditioned on
and set V ← Û + V. Otherwise, perform rejection sampling, i.e., Bool ← True; while Bool do: draw U ∼ Uni(𝕊^{d−1}); if ⟨U, u⟩ < γ, then V ← U and Bool ← False. The value V can then be provided for use in determining Z as in
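The rejection-sampling loop above can be sketched directly; uniform draws on 𝕊^{d−1} are obtained by normalizing Gaussian vectors, and a hypothetical `max_tries` cap bounds the loop:

```python
import numpy as np

def sample_sphere_outside_cap(u, gamma, rng, max_tries=100000):
    """Rejection sampling for the while-loop described above: draw U
    uniformly on the sphere S^{d-1} (by normalizing a standard Gaussian
    vector) until <U, u> < gamma."""
    d = u.size
    for _ in range(max_tries):
        U = rng.standard_normal(d)
        U /= np.linalg.norm(U)
        if U @ u < gamma:
            return U
    raise RuntimeError("rejection sampling did not terminate")
```

The acceptance probability is the surface measure of the complement of the spherical cap {⟨U, u⟩ ≥ γ}, which is at least one half for any γ > 0, so the expected number of iterations is small.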
Claims (30)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/501,132 US11989634B2 (en) | 2018-11-30 | 2020-01-17 | Private federated learning with protection against reconstruction |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862774126P | 2018-11-30 | 2018-11-30 | |
US201862774227P | 2018-12-01 | 2018-12-01 | |
US16/501,132 US11989634B2 (en) | 2018-11-30 | 2020-01-17 | Private federated learning with protection against reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210166157A1 US20210166157A1 (en) | 2021-06-03 |
US11989634B2 true US11989634B2 (en) | 2024-05-21 |
Family
ID=76091558
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/501,132 Active 2041-09-29 US11989634B2 (en) | 2018-11-30 | 2020-01-17 | Private federated learning with protection against reconstruction |
Country Status (1)
Country | Link |
---|---|
US (1) | US11989634B2 (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11836643B2 (en) * | 2019-03-08 | 2023-12-05 | Nec Corporation | System for secure federated learning |
US11922359B2 (en) * | 2019-04-29 | 2024-03-05 | Abb Schweiz Ag | System and method for securely training and using a model |
US11520322B2 (en) * | 2019-05-24 | 2022-12-06 | Markforged, Inc. | Manufacturing optimization using a multi-tenant machine learning platform |
US11238167B2 (en) * | 2019-06-14 | 2022-02-01 | Sap Se | Secure sublinear time differentially private median computation |
US11443240B2 (en) * | 2019-09-06 | 2022-09-13 | Oracle International Corporation | Privacy preserving collaborative learning with domain adaptation |
US11500929B2 (en) * | 2019-11-07 | 2022-11-15 | International Business Machines Corporation | Hierarchical federated learning using access permissions |
US11188791B2 (en) * | 2019-11-18 | 2021-11-30 | International Business Machines Corporation | Anonymizing data for preserving privacy during use for federated machine learning |
US20210295215A1 (en) * | 2020-03-18 | 2021-09-23 | Abb Schweiz Ag | Technologies for decentralized fleet analytics |
KR102501496B1 (en) * | 2020-06-11 | 2023-02-20 | 라인플러스 주식회사 | Method, system, and computer program for providing multiple models of federated learning using personalization |
US20220006783A1 (en) * | 2020-07-02 | 2022-01-06 | Accenture Global Solutions Limited | Privacy preserving cooperative firewall rule optimizer |
CN111985650B (en) * | 2020-07-10 | 2022-06-28 | 华中科技大学 | Activity recognition model and system considering both universality and individuation |
US20220156574A1 (en) * | 2020-11-19 | 2022-05-19 | Kabushiki Kaisha Toshiba | Methods and systems for remote training of a machine learning model |
US20220182802A1 (en) * | 2020-12-03 | 2022-06-09 | Qualcomm Incorporated | Wireless signaling in federated learning for machine learning components |
US11785024B2 (en) * | 2021-03-22 | 2023-10-10 | University Of South Florida | Deploying neural-trojan-resistant convolutional neural networks |
US11777812B2 (en) * | 2021-06-25 | 2023-10-03 | Qualcomm Technologies, Inc. | Zone-based federated learning |
US20230018116A1 (en) * | 2021-07-14 | 2023-01-19 | Accenture Global Solutions Limited | Systems and methods for synthesizing cross domain collective intelligence |
CN113538071B (en) * | 2021-09-15 | 2022-01-25 | 北京顶象技术有限公司 | Method and device for improving wind control strategy effect |
CN116187473B (en) * | 2023-01-19 | 2024-02-06 | 北京百度网讯科技有限公司 | Federal learning method, apparatus, electronic device, and computer-readable storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120023043A1 (en) | 2010-07-21 | 2012-01-26 | Ozgur Cetin | Estimating Probabilities of Events in Sponsored Search Using Adaptive Models |
US20150324686A1 (en) * | 2014-05-12 | 2015-11-12 | Qualcomm Incorporated | Distributed model learning |
US20170091652A1 (en) * | 2015-09-24 | 2017-03-30 | Linkedin Corporation | Regularized model adaptation for in-session recommendations |
US20180268283A1 (en) | 2017-03-17 | 2018-09-20 | Microsoft Technology Licensing, Llc | Predictive Modeling from Distributed Datasets |
US20180365580A1 (en) | 2017-06-15 | 2018-12-20 | Microsoft Technology Licensing, Llc | Determining a likelihood of a user interaction with a content element |
US20190227980A1 (en) * | 2018-01-22 | 2019-07-25 | Google Llc | Training User-Level Differentially Private Machine-Learned Models |
US20190279082A1 (en) * | 2018-03-07 | 2019-09-12 | Movidius Ltd. | Methods and apparatus to determine weights for use with convolutional neural networks |
US20190340534A1 (en) * | 2016-09-26 | 2019-11-07 | Google Llc | Communication Efficient Federated Learning |
US20200034566A1 (en) * | 2018-07-24 | 2020-01-30 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems, Methods, and Apparatuses for Implementing a Privacy-Preserving Social Media Data Outsourcing Model |
US20210065002A1 (en) * | 2018-05-17 | 2021-03-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Concepts for distributed learning of neural networks and/or transmission of parameterization updates therefor |
US20210097381A1 (en) * | 2019-09-27 | 2021-04-01 | Canon Medical Systems Corporation | Model training method and apparatus |
US11132602B1 (en) * | 2016-08-11 | 2021-09-28 | Twitter, Inc. | Efficient online training for machine learning |
US11341429B1 (en) * | 2017-10-11 | 2022-05-24 | Snap Inc. | Distributed machine learning for improved privacy |
Non-Patent Citations (2)
Title |
---|
Duchi, Local Privacy, Data Processing Inequalities, and Minimax Rates, 2014 (Year: 2014). * |
Geyer: Differentially Private Federated Learning; copyright Mar. 2018 (Year: 2018). * |
Also Published As
Publication number | Publication date |
---|---|
US20210166157A1 (en) | 2021-06-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11989634B2 (en) | Private federated learning with protection against reconstruction | |
US11710035B2 (en) | Distributed labeling for supervised learning | |
US11671493B2 (en) | Timeline generation | |
EP3525388B1 (en) | Privatized machine learning using generative adversarial networks | |
CN113435583B (en) | Federal learning-based countermeasure generation network model training method and related equipment thereof | |
US11106809B2 (en) | Privacy-preserving transformation of continuous data | |
US11501008B2 (en) | Differential privacy using a multibit histogram | |
US20180349636A1 (en) | Differential privacy using a count mean sketch | |
US20190370334A1 (en) | Privatized apriori algorithm for sequential data discovery | |
Chen et al. | Privacy and fairness in Federated learning: on the perspective of Tradeoff | |
US20140101752A1 (en) | Secure gesture | |
US20240064001A1 (en) | Anonymous aggregation service for sensitive data | |
CN109219003B (en) | Information encryption method and device, storage medium and electronic equipment | |
Salim et al. | Perturbation-enabled deep federated learning for preserving internet of things-based social networks | |
US12111895B2 (en) | Group-based authentication technique | |
CN109300540B (en) | Privacy protection medical service recommendation method in electronic medical system | |
US11647004B2 (en) | Learning to transform sensitive data with variable distribution preservation | |
Theodorakopoulos et al. | On-the-fly privacy for location histograms | |
CN115205089A (en) | Image encryption method, network model training method and device and electronic equipment | |
CN116758661B (en) | Intelligent unlocking method, intelligent unlocking device, electronic equipment and computer readable medium | |
CN112765898B (en) | Multi-task joint training model method, system, electronic equipment and storage medium | |
Wu et al. | A privacy-preserving student status monitoring system | |
CN112583782B (en) | System and method for filtering user request information | |
US20230267372A1 (en) | Hyper-efficient, privacy-preserving artificial intelligence system | |
CN117709444B (en) | Differential privacy model updating method and system based on decentralised federal learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
FEPP | Fee payment procedure |
Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PTGR); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
AS | Assignment |
Owner name: APPLE INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BHOWMICK, ABHISHEK;FREUDIGER, JULIEN F.;ROGERS, RYAN M.;AND OTHERS;SIGNING DATES FROM 20210611 TO 20220815;REEL/FRAME:060825/0935 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
CC | Certificate of correction |