WO2022111789A1 - Distributed training with random secure averaging - Google Patents

Distributed training with random secure averaging

Info

Publication number
WO2022111789A1
Authority
WO
WIPO (PCT)
Prior art keywords
central server
client device
client
model update
neighbor
Prior art date
Application number
PCT/EP2020/083154
Other languages
French (fr)
Inventor
Thomas VANNET
Xuebing Zhou
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2020/083154
Priority to CN202080107117.3A
Publication of WO2022111789A1


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/04 - Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0407 - Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning

Definitions

  • the present disclosure relates generally to the field of data security and machine learning; and more specifically to methods and devices for distributed machine learning (or training) with a random secure averaging protocol.
  • machine learning algorithms include two phases: a training phase and a prediction phase.
  • during the training phase, a model is developed and trained by use of sample data.
  • the sample data is used during the training phase so that the model can make a prediction (or a decision) during the prediction phase.
  • when customers’ data or personal data is used during the model’s training phase and the prediction phase, achieving privacy protection becomes a major technical issue.
  • the conventional federated learning is used as a distributed training mechanism for the model without forcing the customers (or clients) to upload their raw data during the training phase.
  • the conventional federated learning decreases the overall communication cost while partially enhancing privacy.
  • the conventional federated learning somewhat addresses the need to constantly update the model at a low cost and manifests a possibility of training the model locally by use of the customers’ local data.
  • the conventional federated learning nevertheless incurs privacy issues; for example, the personal data may be directly extracted from the model updates received by a conventional server.
  • the model (or the trained model) which is shared with other clients may also reveal personal information about the customers (or clients) who are involved in the training phase of the model.
  • the privacy issues of the conventional federated learning may be resolved to an extent by use of a conventional secure sum technique and a conventional differential privacy (DP) technique.
  • the distributed training of the model (or the machine learning model) is performed either by use of the conventional secure sum technique or the conventional differential privacy technique.
  • the conventional secure sum technique is a cryptographic technique which distributedly and privately computes a sum of updates of many customers (or clients) at once and prevents the conventional server from observing any individual update.
  • the conventional secure sum technique protects the privacy of individual clients in the distributed learning manner.
  • the conventional secure sum technique incurs a high computational cost when the number of clients increases and provides no privacy guarantees for a final trained model.
  • in the conventional differential privacy technique, the conventional server adds a noise after each global iteration of the training phase.
  • the conventional differential privacy technique provides some amount of formal privacy guarantees for the final trained model.
  • the conventional differential privacy technique does not address the privacy issues with regards to the conventional server.
  • the conventional differential privacy technique requires a large number of clients while the conventional secure sum technique manifests a scaling issue (i.e. does not scale properly and is error prone) when the number of clients increases.
  • the present disclosure seeks to provide methods and devices for distributed machine learning with improved privacy protection of users’ data or personal data as well as high performance and high utility.
  • the present disclosure seeks to provide a solution to the existing problem of inadequate privacy protection of customers’ data or personal data when a large number of clients are involved or when the number of clients (i.e. client devices) increases leading to the scaling issue.
  • an aim of the present disclosure is to provide a solution that overcomes, at least partially, the problems encountered in the prior art, and to provide methods and devices for distributed machine learning with improved privacy protection of users’ data including personal data, which at the same time solve the scaling issue and ensure high performance and high practical utility.
  • in an aspect, the present disclosure provides a method for distributed machine learning for a network, comprising receiving, by a processor of a client device, a local model update vector.
  • the method further comprises generating, by the processor, a public key and a secret key, where the public key is for broadcasting to a plurality of other client devices on the network.
  • the method further comprises receiving, by the processor, an external public key for each of the other client devices.
  • the method further comprises for each other client device: generating, based on the secret key and the external public key, a pseudorandom number; and determining whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter.
  • the method further comprises generating, by the processor, a model update output according to a secure sum protocol based on the set of neighbor devices; and outputting, by the processor, the model update output for transmission to a central server for incorporation into a global model update.
  • the method of the present disclosure employs distributed machine learning with random secure averaging (RdSA) protocol.
  • the disclosed method provides an enhanced privacy protection of client devices’ (or users’) data including personal data, improved performance, and high utility.
  • the disclosed method employs a combination of a secure sum technique (or cryptographic technique) and a differential privacy technique in a coordinated and synergistic manner to enhance the client devices’ data privacy.
  • the method by use of the secure sum protocol prevents the central server from accessing private information from any individual client.
  • the method of the present disclosure uses a neighbor selection algorithm. Based on the neighbor selection algorithm, the set of neighbors is selected in each iteration of the random secure averaging (RdSA) protocol.
  • the number of client devices that each client device interacts with is reduced in each iteration of the RdSA protocol, which further resolves the scaling issue.
  • the use of the neighbor selection algorithm also allows the use of the differential privacy technique to further enhance the privacy protection of the client devices’ data.
  • the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
  • the disclosed method provides the enhanced privacy protection of users’ data, and improved performance even in the presence of a large number of client devices.
  • the scaling problem of the conventional secure sum technique is resolved by use of the neighbor selection algorithm.
  • the client device selects its neighbors based on the neighbor probability parameter.
  • the neighbors are selected in such a way that each pair of client devices agree on whether or not they are neighbors and any other client device (e.g. the central server) learns nothing about whether or not they are neighbors.
  • the neighbor selection algorithm still provides strong privacy at a lower computational cost by selecting the set of the neighbor client devices.
  • the use of the neighbor selection algorithm enables the use of the differential privacy technique to further enhance the client’s data privacy.
  • the disclosed method can be used with an active security model as well as a passive security model.
  • the active security model provides protection against very powerful (e.g. national level or state level) adversaries or inside attackers.
  • the passive security model provides protection against data breaches and remote hackers, and supports compliance with regulations.
  • generating the model update output according to the secure sum protocol comprises generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
  • the addition of the plurality of one-time-pads to the local model update vector of the client device encrypts a secret value of the client device.
  • a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero.
  • when the one-time-pads generated by the plurality of client devices are added at the central server, they sum substantially to zero, which means that the central server obtains only the sum of the clients’ (unencrypted) secrets.
  • generating the model update output according to the secure sum protocol comprises splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
  • the splitting of the local model vector update into the plurality of shares according to the number of neighbor devices improves the secret sharing on the client side in comparison to the conventional secure sum technique where the scaling issue is prominent.
  • generating the model update output includes adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
  • the locally generated noise signal is added to the local model update vector to provide improved local privacy, which means that the private data is fully protected.
  • the central server cannot observe any private information from any individual client.
  • the noise signal is added locally (on the client’s side) rather than centrally (on the server’s side) because local privacy is more desirable than central privacy.
  • the locally generated noise signal of binomial distribution is used to provide the local privacy protection.
  • the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of locally generated noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation.
  • the method further comprises converting the local model update vector from a vector of floating point values to an integer vector.
  • the conversion of the local model update vector from the vector of floating point values to the (modular) integer vector ensures the security on the client side.
  • the method further comprises sending the public key to the central server and receiving the external public keys from the central server.
  • the public key is communicated to the central server, which further communicates the public key to the other client devices so that the client device can perform a key agreement with the other client devices and select its neighbors.
  • a computer-readable medium configured to store instructions which, when executed, cause a client device processor to perform the method.
  • the client device processor achieves all the advantages and effects of the method.
  • the present disclosure provides a client device comprising a training module configured to generate a local model update vector.
  • the client device further comprises a processor configured to generate a public key and a secret key.
  • the client device further comprises a transceiver configured to broadcast the public key to a plurality of other client devices on the network, receive an external public key for each of the other client devices, and transmit a model update output to a central server for incorporation into a global model update.
  • the processor of the client device is further configured to generate, for each other client device, a pseudorandom number based on the secret key and the external public key.
  • the processor of the client device is further configured to determine whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter.
  • the processor of the client device is further configured to generate the model update output according to a secure sum protocol based on the set of neighbor devices.
  • the client device of the present disclosure manifests an enhanced local privacy of personal data.
  • the client device uses the neighbor selection algorithm, which results in a reduced number of client devices in each iteration of the RdSA protocol. Therefore, the client device also manifests a reduced computational cost even in the presence of a large number of client devices.
  • the client device uses the secure sum technique (or cryptographic technique) and prevents the central server from accessing private information from any individual client.
  • the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
  • the client device selects its neighbors by use of the neighbor probability parameter in the neighbor selection algorithm.
  • the neighbors are selected in such a way that each pair of client devices agree on whether or not they are neighbors and any other client device (e.g. the central server) learns nothing about whether or not they are neighbors.
  • in accordance with an embodiment, for the neighbor probability parameter p and predefined value r: f(n_h, p) > 1 - r and f(n_h, p - d) ≤ 1 - r, for precision d.
  • by configuring the neighbor probability parameter p, the client device can be used in an active security model as well as in a passive security model.
  • the processor is further configured to generate the model update output according to the secure sum protocol by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
  • the addition of the plurality of one-time-pads to the local model update vector of the client device encrypts a secret value of the client device.
  • a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero.
  • when the one-time-pads generated by the plurality of client devices are added at the central server, they sum substantially to zero, which means that the central server obtains only the sum of the clients’ (unencrypted) secrets.
  • the processor is configured to generate the model update output according to the secure sum protocol by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
  • the splitting of the local model vector update into the plurality of shares according to the number of neighbor devices improves the secret sharing on the client side in comparison to the conventional secure sum technique where the scaling issue is prominent.
  • the processor is configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
  • the client device adds the locally generated noise signal to the local model update vector to provide improved local privacy, which means that the private data is fully protected.
  • the central server cannot observe any private information from any individual client.
  • the noise signal is added locally (on the client’s side) rather than centrally (on the server’s side) because local privacy is more desirable than central privacy.
  • the locally generated noise signal of binomial distribution is used to provide the local privacy protection.
  • the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of differentially private noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation.
  • the processor is further configured to convert the local model update vector from a vector of floating point values to an integer vector.
  • the client device converts the local model update vector from the vector of floating point values to the (modular) integer vector to ensure the security.
  • the transceiver is further configured to send the public key to the central server and receive the external public keys from the central server.
  • the client device communicates the public key to the central server, which further communicates the public key to the other client devices so that the client device can perform a key agreement with the other client devices and select its neighbors.
  • the present disclosure provides a method for distributed machine learning for a network comprising: receiving, by a central server, a plurality of model update outputs transmitted by a plurality of client devices.
  • the method further comprises determining, by the central server, an aggregated sum of model updates based on the plurality of model update outputs.
  • the method further comprises updating, by the central server, a global model based on the aggregated sum of model updates.
  • the method further comprises transmitting, by the central server, the global model update to each of the client devices.
  • the central server performs random secure averaging of the plurality of model update outputs transmitted by the plurality of client devices. Based on the random secure averaging, the central server determines the global model update which is further shared with the plurality of client devices.
  • the plurality of client devices manifests an improved accuracy and enhanced privacy protection of personal data because of the global model update.
  • updating the global model comprises converting an integer vector to a floating point vector.
  • updating the global model comprises the conversion from the integer vector to the floating point vector to enable ease of operation.
  • the method further comprises determining, by the central server, that a client device has dropped out.
  • the method further comprises adding, by the central server, an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server determines whether any client device has dropped out during the execution of the RdSA protocol. After determining the dropout, the central server compensates for the lost noise by adding the additional noise to the aggregated sum of model updates. In this way, the central server also performs differential noise recovery.
  • the method further comprises performing a client dropout recovery protocol including: receiving, by the central server, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device.
  • the method further comprises determining, by the central server, that a client device has dropped out.
  • the method further comprises combining, by the central server, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
  • the central server performs client dropout recovery in case of a client dropout by use of the secret sharing protocol.
  • a computer-readable medium configured to store instructions which, when executed, cause a central server processor to perform the method.
  • the central server processor achieves all the advantages and effects of the method.
  • the present disclosure provides a central server comprising a transceiver configured to receive a plurality of model update outputs transmitted by a plurality of client devices, and transmit a global model update to each of the client devices.
  • the central server further comprises a processor configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update based on the aggregated sum of model updates.
  • the central server performs random secure averaging of the plurality of model update outputs transmitted by the plurality of client devices. Based on the random secure averaging, the central server determines the global model update which is further shared with the plurality of client devices.
  • the plurality of client devices manifests an improved accuracy and enhanced privacy protection of personal data because of the global model update.
  • updating the global model comprises converting an integer vector to a floating point vector.
  • updating the global model comprises the conversion from the integer vector to the floating point vector to enable ease of operation at the central server.
  • the processor is further configured to determine that a client device has dropped out.
  • the processor is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server determines whether any client device has dropped out during the execution of the RdSA protocol. After determining the dropout, the central server compensates for the lost noise by adding the additional noise to the aggregated sum of model updates. In this way, the central server also performs differential noise recovery.
  • the central server is further configured to perform a client dropout recovery protocol including: receiving, by the transceiver, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device.
  • the central server is further configured to perform a client dropout recovery protocol including: determining, by the processor, that a client device has dropped out.
  • the central server is further configured to perform a client dropout recovery protocol including: combining, by the processor, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
  • the central server performs client dropout recovery in case of a client dropout by use of the secret sharing protocol.
  • FIG. 1 is a flowchart of a method for distributed machine learning for a network, in accordance with an example of the present disclosure
  • FIG. 2 is a flowchart of a method for distributed machine learning for a network, in accordance with another example of the present disclosure
  • FIG. 3A is a network environment diagram that depicts distributed machine learning with random secure averaging, in accordance with an example of the present disclosure
  • FIG. 3B is a block diagram that illustrates various exemplary components of the client device, in accordance with an example of the present disclosure
  • FIG. 3C is a block diagram that illustrates various exemplary components of the central server, in accordance with an example of the present disclosure
  • FIG. 4 is a network environment diagram that depicts distributed machine learning with random secure averaging, in accordance with another example of the present disclosure.
  • FIG. 5 illustrates an exemplary implementation scenario of distributed machine learning, in accordance with an example of the present disclosure.
  • an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent.
  • a non-underlined number relates to an item identified by a line linking the non-underlined number to the item.
  • the non-underlined number is used to identify a general item at which the arrow is pointing.
  • FIG. 1 is a flowchart of a method for distributed machine learning for a network, in accordance with an example of the present disclosure.
  • a method 100 for distributed machine learning for a network includes steps 102, 104, 106, 108, 110, 112, and 114.
  • the method 100 is executed by a client device, described in detail, for example, in FIGs. 3A and 3B.
  • the method 100 comprises receiving, by a processor of the client device, a local model update vector.
  • the processor of the client device updates the local model update vector by use of the local data or the raw data of the client device.
  • the method 100 further comprises generating, by the processor, a public key and a secret key, where the public key is for broadcasting to a plurality of other client devices on the network.
  • the generated public key and the secret key (or a private key) of the client device are used to perform a key agreement with another client device.
  • the method 100 further comprises receiving, by the processor, an external public key for each of the other client devices.
  • the processor of the client device performs a key agreement with the plurality of other client devices on the network and generates a key pair based on the external public key received.
  • the key agreement is performed by use of a key agreement scheme, such as Elliptic-Curve Diffie-Hellman (ECDH), which allows two client devices, each having an elliptic-curve external public-secret key pair (also known as a shared key), to establish a shared secret over an insecure channel.
  • s_{u,v} is the shared secret between two clients, such as a client u and a client v.
  • the shared secret is obtained by applying a secure key derivation function, such as hash-based key derivation function (HKDF) to the elliptic-curve external public-secret key pair (or the shared key).
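  • as an illustration only (not the claimed construction), the following Python sketch shows how such a pairwise shared secret could be derived with ECDH followed by HKDF using the `cryptography` package; the curve choice, the 32-byte output length and the info label are assumptions made for the example.

```python
# Hedged sketch: deriving a pairwise shared secret s_{u,v} via ECDH + HKDF.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def shared_secret(own_secret_key, peer_public_key) -> bytes:
    """ECDH exchange followed by HKDF; curve, length and info label are assumptions."""
    raw = own_secret_key.exchange(ec.ECDH(), peer_public_key)
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"rdsa-pairwise-secret").derive(raw)

# Each client generates its own (secret key, public key) pair and broadcasts the
# public part; both directions of the exchange yield the same 32-byte secret.
sk_u = ec.generate_private_key(ec.SECP256R1())
sk_v = ec.generate_private_key(ec.SECP256R1())
assert shared_secret(sk_u, sk_v.public_key()) == shared_secret(sk_v, sk_u.public_key())
```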
  • the method 100 further comprises for each other client device: generating, based on the secret key and the external public key, a pseudorandom number.
  • the elliptic-curve external public-secret key pair, which is based on the secret key and the external public key, is used to generate the pseudorandom number for each other client device.
  • the shared secret s_{u,v} is used as a seed for a pseudorandom number generator which is secure and deterministic. For example, (rand_i^{u,v}) is a sequence of random numbers generated by the client pair (u, v).
  • the method 100 further comprises for each other client device: determining whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter. For example, for each of the other client devices (i.e. the client v), the client device (i.e. the client u) generates the pseudorandom number, such as rand_0^{u,v}, over at least 128 bits.
  • the client device’s (i.e. the client u’s) set of neighbor devices is N_u = { v : rand_0^{u,v} < p · 2^d }, where p is the predetermined neighbor probability parameter for each other client device and d is the bit size of the generated pseudorandom number.
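  • a minimal sketch of this neighbor decision rule, assuming the shared secret s_{u,v} seeds a deterministic pseudorandom number generator (modelled here with SHA-256) and that both clients evaluate the same comparison against p · 2^d; the round index and the function names are illustrative assumptions.

```python
# Hedged sketch of the neighbor decision rule described above.
import hashlib

def pseudorandom_number(shared_secret: bytes, round_idx: int, d: int = 128) -> int:
    """Derive a deterministic d-bit pseudorandom number for this training round."""
    digest = hashlib.sha256(shared_secret + round_idx.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - d)

def is_neighbor(shared_secret: bytes, round_idx: int, p: float, d: int = 128) -> bool:
    """True when the d-bit pseudorandom number falls below p * 2**d."""
    return pseudorandom_number(shared_secret, round_idx, d) < int(p * 2**d)

# Both ends of a pair call is_neighbor with the same shared secret and parameters,
# so each pair of client devices always agrees on whether or not they are neighbors.
```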
  • the method 100 further comprises generating, by the processor, a model update output according to a secure sum protocol based on the set of neighbor devices.
  • the secure sum protocol is used to generate the model update output either by use of an additive secret sharing scheme or a one-time-pad (OTP) based scheme.
  • the method 100 further comprises outputting, by the processor, the model update output for transmission to a central server for incorporation into a global model update.
  • the processor of the client device transmits the locally trained model update output to the central server to finalize the global model update which is further shared with each of the client devices.
  • the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
  • the predefined value based on the modelled risk of the successful attack is represented as r_tot for N rounds of training.
  • the modelled risk of the successful attack is also termed a risk parameter α such that α ∈ [0, 1].
  • the risk parameter α represents a probability that the privacy of at least one client will not be protected against the central server.
  • the risk parameter α can be chosen based on business requirements. A higher privacy risk will translate into a better computational performance. Generally, the risk parameter α is chosen of the same order of magnitude as the standard δ parameter used in the conventional differential privacy technique. In accordance with an embodiment, for the neighbor probability parameter p and predefined value r: f(n_h, p) > 1 - r and f(n_h, p - d) ≤ 1 - r, for precision d.
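  • the function f is not reproduced in this text; under the assumption (made purely for illustration) that f(n_h, p) is the probability of obtaining at least n_h neighbors out of n - 1 candidate clients when each is chosen independently with probability p, a sketch of selecting the smallest p satisfying f(n_h, p) > 1 - r could look as follows; n_clients, n_h and the search step d are assumed names.

```python
# Hedged sketch: one plausible reading of the condition f(n_h, p) > 1 - r.

def prob_at_least(n_h: int, p: float, n_clients: int) -> float:
    """P[#neighbors >= n_h] under a Binomial(n_clients - 1, p) model."""
    n = n_clients - 1
    pmf = (1.0 - p) ** n                   # P[X = 0]
    below = pmf if n_h > 0 else 0.0        # accumulates P[X < n_h]
    for k in range(1, n_h):
        pmf *= (n - k + 1) / k * p / (1.0 - p)   # Binomial pmf recurrence
        below += pmf
    return 1.0 - below

def smallest_p(n_h: int, r: float, n_clients: int, d: float = 1e-3) -> float:
    """Smallest p, to precision d, with prob_at_least(n_h, p, n_clients) > 1 - r."""
    p = d
    while p < 1.0 and prob_at_least(n_h, p, n_clients) <= 1.0 - r:
        p += d
    return min(p, 1.0)

# Example: 1000 clients, at least 30 neighbors per client, risk parameter r = 1e-4.
print(round(smallest_p(n_h=30, r=1e-4, n_clients=1000), 3))
```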
  • the method 100 includes an active security model as well as a passive security model.
  • the active security model provides the privacy protection in a case of a malicious server where the malicious server tries to recover a client’s update by modifying messages sent by a number of clients.
  • the active security model provides security suitable for, e.g., highly confidential data such as medical data or financial data.
  • the active security model provides protection against very powerful (e.g. national level or state level) adversaries or inside attackers.
  • the passive security model provides the privacy protection in a case of an honest-but-curious server where an attacker tries to recover the client’s update by listening to the communication between the number of clients and the conventional server.
  • the passive security model provides security suitable for, e.g., personal data. Additionally, the passive security model provides protection against data breaches and remote hackers, and supports compliance with regulations.
  • generating the model update output according to the secure sum protocol comprises generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
  • the one-time-pad is an encryption technique; here, the pad is derived from the shared secret established using the secret key of the client device and the external public key of the neighbor device.
  • the client device (i.e. the client u) generates the one-time-pad (OTP) by use of the generated sequence of pseudorandom numbers (rand_i^{u,v}), such that the two neighbor devices use the same OTP for each other.
  • the plurality of one-time-pads (i.e. the encrypted secrets) are then added to the local model update vector.
  • the number of neighbors per client is significantly reduced by the neighbor selection algorithm described above, which reduces a cost associated with the aggregation steps on the client device.
  • a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero.
  • the generated OTPs from each pair of neighbors cancel each other; therefore, the set of all the one-time-pads generated by the plurality of client devices sums to zero.
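  • the following is a hedged sketch of this pairwise masking: each pair of neighbors expands its shared secret into the same pad and applies it with opposite signs (decided here by comparing client identifiers), so all pads cancel in the server’s sum; the modulus MOD, the id-ordering rule and the SHA-256 expansion are illustrative assumptions.

```python
# Hedged sketch of pairwise one-time-pad masking with cancelling pads.
import hashlib
import numpy as np

MOD = 2**32  # assumed modulus of the quantized update vectors

def one_time_pad(shared_secret: bytes, length: int) -> np.ndarray:
    """Expand a pairwise shared secret into a vector of uniform pad values."""
    seed = int.from_bytes(hashlib.sha256(shared_secret).digest(), "big")
    return np.random.default_rng(seed).integers(0, MOD, size=length, dtype=np.uint64)

def mask_update(update: np.ndarray, my_id: int, neighbors: dict[int, bytes]) -> np.ndarray:
    """update: quantized vector already reduced modulo MOD; neighbors: id -> shared secret."""
    masked = update.astype(np.uint64) % MOD
    for peer_id, secret in neighbors.items():
        pad = one_time_pad(secret, len(update))
        # Add the pad on one side of each neighbor pair and subtract it on the other side.
        masked = (masked + pad) % MOD if my_id < peer_id else (masked - pad) % MOD
    return masked

# Two neighbors: their pads cancel, so the server's sum equals the sum of the updates.
secret = b"\x01" * 32
u = np.array([5, 7, 9], dtype=np.uint64)
v = np.array([1, 2, 3], dtype=np.uint64)
aggregate = (mask_update(u, 0, {1: secret}) + mask_update(v, 1, {0: secret})) % MOD
assert np.array_equal(aggregate, (u + v) % MOD)
```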
  • generating the model update output according to the secure sum protocol comprises splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
  • the secure sum protocol is used to generate the model update output by use of an additive secret sharing scheme, such as Shamir Secret Sharing (SSS).
  • the local model update vector is quantized and viewed as a vector of fixed length integers.
  • the client device splits each integer into the plurality of shares, one for each neighbor device.
  • the plurality of shares are transmitted to the respective neighbor devices. Thereafter, the external shares are received from the neighbor devices and are added to form the model update output.
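  • a simplified additive-sharing sketch of this splitting step follows (the text mentions Shamir Secret Sharing; plain additive sharing over an assumed modulus MOD is used here purely for illustration): each quantized entry is split into random shares that sum to the original value modulo MOD, one share per neighbor device.

```python
# Hedged sketch: additive splitting of a quantized update into per-neighbor shares.
import secrets
import numpy as np

MOD = 2**32  # assumed modulus of the quantized (fixed-length integer) update

def split_into_shares(update: np.ndarray, n_neighbors: int) -> list[np.ndarray]:
    """Split each entry into n_neighbors random shares that sum to the entry mod MOD."""
    random_shares = [np.array([secrets.randbelow(MOD) for _ in update], dtype=np.uint64)
                     for _ in range(n_neighbors - 1)]
    partial = sum(random_shares) % MOD if random_shares else 0
    last = (update.astype(np.uint64) - partial) % MOD
    return random_shares + [last]

def sum_received_shares(received: list[np.ndarray]) -> np.ndarray:
    """Each client sums the external shares it received; the server sums those results."""
    return sum(s.astype(np.uint64) for s in received) % MOD

# Sanity check: recombining all shares of one update recovers the update mod MOD.
update = np.array([10, 2**31, 42], dtype=np.uint64)
shares = split_into_shares(update, n_neighbors=4)
assert np.array_equal(sum_received_shares(shares), update % MOD)
```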
  • generating the model update output includes adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
  • the locally generated noise signal has either the Gaussian distribution or the Binomial distribution.
  • the Binomial distribution is generally preferred, particularly for small word sizes.
  • the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of locally generated noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation.
  • the corresponding set of locally generated noise signals from the plurality of client devices, when summed, forms the global noise with the binomial distribution and the predetermined standard deviation (σ).
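  • as a hedged numerical sketch of this calibration (Gaussian noise is used here for simplicity although the text prefers a binomial distribution): if each of n_clients clients adds independent zero-mean noise with standard deviation sigma_global / sqrt(n_clients), the summed global noise has standard deviation sigma_global; the parameter names are illustrative.

```python
# Hedged sketch of per-client noise calibration so the aggregate reaches a target std.
import math
import numpy as np

def local_noise(dim: int, sigma_global: float, n_clients: int,
                rng: np.random.Generator) -> np.ndarray:
    """Per-client noise whose sum over n_clients has std sigma_global (variances add)."""
    return rng.normal(loc=0.0, scale=sigma_global / math.sqrt(n_clients), size=dim)

# Sanity check: the aggregate noise of 100 clients has roughly the target std of 8.0.
rng = np.random.default_rng(0)
total = sum(local_noise(10_000, sigma_global=8.0, n_clients=100, rng=rng) for _ in range(100))
print(round(float(total.std()), 1))  # approximately 8.0
```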
  • the method 100 further comprises converting the local model update vector from a vector of floating point values to an integer vector.
  • the local model update vector is converted from the vector of floating point values to the integer (or modular) vector through a quantization process.
  • an unbiased, space-efficient algorithm is used. For example, requiring bounds on individual model parameters allows an efficient mapping to an integer-space. These bounds may be provided directly by the central server or inferred from differential privacy (DP) specific parameters such as the model update clipping bound S.
  • the quantization must take into account the risk of overflow when summing the weighted updates from the plurality of other client devices. In particular, the plurality of other client devices must be aware of the sum of weights to be used during this iteration.
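  • a hedged sketch of such a quantization step follows: the clipping bound S is taken from the text, while the word size, the headroom rule (reserving roughly log2 of the number of clients) and the function names are assumptions made for illustration.

```python
# Hedged quantization sketch: clipped floats -> integers (and back), with headroom so
# that summing n_clients quantized updates cannot overflow the word size.
import numpy as np

def _levels(wordsize: int, n_clients: int) -> int:
    # Reserve ceil(log2(n_clients)) bits plus a sign bit so the aggregate cannot overflow.
    return 2 ** (wordsize - int(np.ceil(np.log2(n_clients))) - 1) - 1

def quantize(update: np.ndarray, S: float, wordsize: int, n_clients: int) -> np.ndarray:
    """Map values clipped to [-S, S] onto the integer range [-levels, levels]."""
    clipped = np.clip(update, -S, S)
    return np.round(clipped / S * _levels(wordsize, n_clients)).astype(np.int64)

def dequantize(summed: np.ndarray, S: float, wordsize: int, n_clients: int) -> np.ndarray:
    """Invert the mapping on the server side after aggregation."""
    return summed.astype(np.float64) * S / _levels(wordsize, n_clients)

# Round trip on a single update (aggregation would sum several quantized updates first).
u = np.array([0.25, -1.7, 0.0, 3.2])
q = quantize(u, S=2.0, wordsize=32, n_clients=1000)
print(dequantize(q, S=2.0, wordsize=32, n_clients=1000))  # close to [0.25, -1.7, 0.0, 2.0]
```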
  • the method 100 further comprises sending the public key to the central server and receiving the external public keys from the central server.
  • the public key generated by the processor of the client device is communicated to the central server.
  • the processor of the client device is configured to receive the external public keys from the central server for each of the other client devices.
  • a computer-readable medium configured to store instructions which, when executed, cause the client device processor to perform the method 100.
  • the processor of the client device is configured to execute the method 100.
  • the steps 102, 104, 106, 108, 110, 112, and 114 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
  • FIG. 2 is a flowchart of a method for distributed machine learning for a network, in accordance with another example of the present disclosure.
  • FIG. 2 is described in conjunction with elements from FIG. 1.
  • a method 200 for distributed machine learning for a network includes steps 202, 204, 206, and 208.
  • the method 200 is executed by a central server, described in detail, for example, in FIGs. 3A and 3C.
  • the method 200 comprises receiving, by the central server, a plurality of model update outputs transmitted by a plurality of client devices.
  • the central server receives the plurality of model update outputs corresponding to the plurality of client devices.
  • Each of the plurality of client devices generates one of the plurality of model update outputs by training its respective local model update vector.
  • the method 200 further comprises determining, by the central server, an aggregated sum of model updates based on the plurality of model update outputs.
  • the central server obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs.
  • the method 200 further comprises updating, by the central server, a global model based on the aggregated sum of model updates. Based on the aggregated sum of model updates, the central server determines the global model update which is shared with each of the plurality of client devices.
  • the method 200 further comprises transmitting, by the central server, the global model update to each of the client devices.
  • the central server transmits the global model update to each of the client devices for a next iteration.
  • updating the global model comprises converting an integer vector to a floating point vector.
  • the global model is updated by converting from the integer (modular) vector to the floating point vector through a reverse quantization process.
  • the conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates. If a client device has dropped out during the process, then the central server takes the lost noise into consideration while executing the reverse quantization process.
  • the method further comprises determining, by the central server, that a client device has dropped out.
  • the method further comprises adding, by the central server, an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server is configured to do noise compensation where one or more of the plurality of client devices has dropped out during the execution of the random secure averaging (RdSA) protocol.
  • the central server is configured to add the additional noise to the aggregated sum of model updates based on the predetermined variance value for the local noise added to the aggregated sum of model updates, described in detail, for example, in FIG. 3A.
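  • a minimal sketch of this compensation step, assuming each client was supposed to add independent zero-mean noise of known standard deviation sigma_local; the function name and the Gaussian top-up are illustrative assumptions.

```python
# Hedged sketch: top up the aggregate with the noise variance lost to dropped clients.
import math
import numpy as np

def compensate_dropouts(aggregate: np.ndarray, sigma_local: float, n_dropped: int,
                        rng: np.random.Generator) -> np.ndarray:
    """Add noise with variance n_dropped * sigma_local**2 (variances add across clients)."""
    if n_dropped == 0:
        return aggregate
    return aggregate + rng.normal(0.0, sigma_local * math.sqrt(n_dropped), size=aggregate.shape)
```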
  • the method further comprises performing a client dropout recovery protocol including: receiving, by the central server, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device.
  • the method further comprises determining, by the central server, that a client device has dropped out.
  • the method further comprises combining, by the central server, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
  • the central server combines the plurality of received key shares corresponding to the dropout client to derive the value of the random noise that was added to the aggregated sum of model updates. This value is used to compensate for the noise lost because of the dropout client. This is described in detail, for example, in FIG. 3A.
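  • as an illustrative sketch of one possible secret sharing protocol for this recovery (Shamir sharing over a prime field, recombined by Lagrange interpolation at x = 0); the prime, the share indices and the helper names are assumptions made for the example, not the claimed construction.

```python
# Hedged sketch of Shamir-style key-share splitting and recovery over a prime field.
import secrets

PRIME = 2**521 - 1  # a Mersenne prime, assumed large enough for 256-bit secret keys

def _eval_poly(coeffs: list[int], x: int) -> int:
    """Horner evaluation of the sharing polynomial at x, modulo PRIME."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % PRIME
    return acc

def split_secret(secret: int, n_shares: int, threshold: int) -> list[tuple[int, int]]:
    """Split `secret` into n_shares points; any `threshold` of them recover it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    return [(x, _eval_poly(coeffs, x)) for x in range(1, n_shares + 1)]

def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 from a threshold-sized subset of shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Example: a dropped client's 256-bit secret key recovered from any 3 of 5 shares.
key = int.from_bytes(secrets.token_bytes(32), "big")
shares = split_secret(key, n_shares=5, threshold=3)
assert recover_secret(shares[2:5]) == key
```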
  • a computer-readable medium configured to store instructions which, when executed, cause the central server processor to perform the method 200.
  • the processor of the central server is configured to execute the method 200.
  • the steps 202, 204, 206, and 208 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
  • FIG. 3A is a network environment diagram that depicts distributed machine learning with random secure averaging protocol, in accordance with an example of the present disclosure.
  • FIG. 3A is described in conjunction with elements from FIGs. 1 and 2.
  • a system 300A that includes a plurality of client devices 302, a central server 304 and a network 306.
  • the plurality of client devices 302 includes a client device 302A and other client devices 302B-302N.
  • the system 300A describes an exemplary sequence of operations 308A, 308B, 308C, 308D, 308E and 308F executed by the client device 302A.
  • Each of the plurality of client devices 302 includes suitable logic, circuitry, interfaces, and/or code that is configured to communicate with the central server 304, via the network 306.
  • Each of the plurality of client devices 302 is further configured to train a local model update vector and compute a model update output.
  • Examples of the plurality of client devices 302 may include, but are not limited to a user device, a laptop, a computing device, a communication apparatus including a portable or non-portable electronic device, or a supercomputer.
  • the various exemplary components of the client device 302A are described in detail, for example, in FIG. 3B.
  • the central server 304 includes suitable logic, circuitry, interfaces, or code that is configured to communicate with the plurality of client devices 302 via the network 306.
  • the central server 304 is further configured to determine a global model update which is further shared with each of the plurality of client devices 302 for the next iteration.
  • Examples of the central server 304 include, but are not limited to a storage server, a cloud server, a web server, an application server, or a combination thereof.
  • the central server 304 includes an arrangement of physical or virtual computational entities capable of enhancing information to perform various computational tasks.
  • the central server 304 may be a single hardware server.
  • the central server 304 may be a plurality of hardware servers operating in a parallel or in a distributed architecture.
  • the central server 304 may include components such as a memory, a processor, a network interface and the like, to store, process or share information with the plurality of client devices 302.
  • the central server 304 is implemented as a computer program that provides various services (such as a database service) to the plurality of client devices 302, or modules or apparatus.
  • the various exemplary components of the central server 304 are described in detail, for example, in FIG. 3C.
  • the network 306 includes a medium (e.g. a communication channel) through which the plurality of client devices 302 communicate with the central server 304 or vice-versa.
  • the network 306 may be a wired or wireless communication network. Examples of the network 306 may include, but are not limited to, a Wireless Fidelity (Wi-Fi) network, a Local Area Network (LAN), a wireless personal area network (WPAN), a Wireless Local Area Network (WLAN), a wireless wide area network (WWAN), a cloud network, a Long Term Evolution (LTE) network, a Metropolitan Area Network (MAN), or the Internet.
  • the plurality of client devices 302 and the central server 304 are potentially configured to connect to the network 306, in accordance with various wired and wireless communication protocols.
  • wired and wireless communication protocols may include, but are not limited to, Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), ZigBee, EDGE, infrared (IR), IEEE 802.11, 802.16, Long Term Evolution (LTE), Light Fidelity (Li-Fi), or other cellular communication protocols or Bluetooth (BT) communication protocols, including variants thereof.
  • the client device 302A of the plurality of client devices 302 performs local training in a series of operations.
  • the client device 302A is configured to generate a local model update vector.
  • the client device 302A updates the local model update vector by use of a local data.
  • the client device 302A is further configured to generate a public key and a secret key.
  • the client device 302A generates the public key and the secret key (or a private key) in order to perform a key agreement with the other client devices 302B-302N.
  • the client device 302A is further configured to broadcast the public key to the other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into a global model update 312.
  • the client device 302A is configured to broadcast the generated public key to the other client devices 302B-302N on the network 306.
  • the client device 302A cannot directly communicate with the other client devices 302B-302N.
  • the generated public key is therefore broadcast to the other client devices 302B-302N through the central server 304.
  • a trusted third party is used to initially establish the key agreement between each of the clients 302. Thereafter, the client device 302A is further configured to receive the external public key for each of the other client devices 302B-302N to generate a key pair based on the received external public key and the generated secret key.
  • the key agreement is performed by use of an Elliptic-Curve Diffie-Hellman (ECDH) key agreement scheme which allows two client devices, each having an elliptic-curve external public-secret key pair (also known as a shared key), to establish a shared secret over an insecure channel.
  • the client device 302A is further configured to generate, for each other client device 302B-302N, a pseudorandom number based on the secret key and the external public key.
  • the two client devices that have established the shared secret create identical pseudorandom numbers which are used to indicate that the two clients are neighbors of each other.
  • the client device 302A is further configured to determine whether each of the other client devices 302B-302N is to be allocated to a set of neighbor devices of the client device 302A based on the pseudorandom number and a predetermined neighbor probability parameter.
  • the client device 302A’s set of neighbor devices is N_u = { v : rand_0^{u,v} < p · 2^d }, where p is the predetermined neighbor probability parameter for each other client device 302B-302N and d is the bit size of the generated pseudorandom number.
  • the client device 302A is further configured to generate the model update output according to a secure sum protocol 310 based on the set of neighbor devices.
  • the client device 302A generates the model update by applying the secure sum protocol 310 to the set of neighbor devices.
  • the secure sum protocol 310 is applied either by an additive secret sharing scheme or a one-time-pad (OTP) based scheme.
  • the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
  • the number of neighbors in the set of neighbor devices is computed by use of a neighbor selection algorithm.
  • the neighbor selection algorithm includes two phases namely, a set up phase and an online phase.
  • in the setup phase, the computation of the number of neighbors in the set of neighbor devices depends on the risk parameter r_tot for N rounds of training and the neighbor probability parameter p, as described in detail, for example, in FIG. 1.
  • the client device 302A obtains the shared secret for one of the other client devices 302B-302N. From the shared secret, the client device 302A is configured to derive k shared uniform bits b_i and thereafter add the respective one of the other client devices 302B-302N to the list of neighbors if the derived value is less than p · 2^k.
  • for the neighbor probability parameter p and predefined value r: f(n_h, p) > 1 - r and f(n_h, p - d) ≤ 1 - r, for precision d, where the corresponding expressions differ between the active security model and the passive security model.
  • the client device 302A supports the active security model as well as the passive security model on a need basis.
  • the active security model and the passive security model have been described in detail, for example, in FIG. 1.
  • the client device 302A is further configured to generate the model update output according to the secure sum protocol 310 by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device.
  • the secure sum protocol 310 is applied by use of the OTP based scheme which has been described in detail, for example, in FIG. 1.
  • the client device 302A adds the plurality of one-time-pads to the local model update vector and generates the model update output.
  • a set of all the one-time-pads generated by the plurality of client devices 302 sums substantially to zero.
  • the generated OTPs for each pair of neighbors cancel each other; therefore, the set of all the one-time-pads generated by the plurality of client devices sums to zero.
  • the client device 302A is further configured to generate the model update output according to the secure sum protocol 310 by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
  • the secure sum protocol 310 is applied by an additive secret sharing scheme, such as Shamir Secret Sharing (SSS) scheme.
  • the local model update vector is quantized and viewed as a vector of fixed length modular integers.
  • the client device 302A splits each integer into the plurality of shares, one for each neighbor device.
  • the plurality of shares are transmitted to the respective neighbor devices 302B-302N. Thereafter, the external shares are received from the neighbor devices 302B-302N and are added to form the model update output.
  • the client device 302A is further configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
  • the locally generated noise signal has either the Gaussian distribution or the Binomial distribution.
  • the Binomial distribution is generally preferred, particularly for small word sizes.
  • the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server 304, wherein the noise parameter for each client device 302A-302N is such that a corresponding set of differentially private noise signals from the plurality of client devices 302 sums to a global noise having a predetermined standard deviation.
  • the central server 304 is configured to determine the noise parameter, such as a noise multiplier z and a model update clipping bound S.
  • the client device 302A’s local noise is drawn from N(0, σ), i.e. a normal distribution with mean 0 and a variance derived from the noise multiplier z and the clipping bound S, with the quantized values represented with a word size of 2^wordsize.
  • the client device 302A is further configured to convert the local model update vector from a vector of floating point values to an integer vector.
  • the local model update vector is converted from the vector of floating point values to the integer (or modular) vector through the quantization process.
  • an unbiased, space-efficient algorithm is used. For example, requiring bounds on individual model parameters allows an efficient mapping to an integer-space. These bounds may be provided directly or inferred from a differential privacy parameter, namely the model update clipping bound S.
  • the quantization process must take into account the risk of overflow when summing the weighted updates from the plurality of other client devices 302B-302N. In particular, the plurality of other client devices 302B-302N must be aware of the sum of weights to be used during this iteration.
  • the client device 302A is further configured to send the public key to the central server 304 and receive the external public keys from the central server 304.
  • the client device 302A cannot directly communicate with the plurality of other client devices 302B-302N. In such a case, the client device 302A communicates the generated public key to the central server 304, which further communicates the generated public key to the plurality of other client devices 302B-302N. After communicating the generated public key to the central server 304, the client device 302A is configured to receive the external public keys from the central server 304.
  • each of the plurality of client devices 302 trains the local model update vector and generates the model update output.
  • Each of the plurality of client devices 302 scales the local model update vector, quantizes the local model update vector and thereafter adds a calibrated noise having the binomial distribution to the local model update vector.
  • the calibrated noise is based on the privacy requirements and type of the security model (i.e. the active security model or the passive security model) used.
  • each of the plurality of client devices 302 is given the risk parameter and the selected security model; each of the plurality of client devices 302 then derives the neighbor probability parameter p by use of the neighbor selection algorithm.
  • the neighbor selection algorithm is based on the notion of connectivity of a random graph. After neighbor selection, each pair of client devices share a secret.
  • the shared secret is used to derive a shared randomness.
  • based on the shared randomness, each pair of client devices decides whether or not they are neighbors.
  • Each pair of neighbors generates an identical pseudorandom number and adds it to their respective secret values (with opposite signs) to generate the model update output, as illustrated in the sketch below.
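  • The following sketch shows how two client devices holding the same shared secret can reach an identical neighbor decision without further interaction, assuming the decision is derived by hashing the shared secret together with the iteration index; the hash-based construction and all names here are illustrative assumptions rather than the exact derivation of the disclosure.

```python
import hashlib

def is_neighbor(shared_secret: bytes, iteration: int,
                neighbor_probability_p: float) -> bool:
    """Pairwise neighbor decision derived only from the shared secret.

    Both endpoints of a pair evaluate this with the same inputs, so they
    always agree, while a party without the shared secret learns nothing.
    """
    digest = hashlib.sha256(shared_secret + iteration.to_bytes(4, "big")).digest()
    # Map the first 8 bytes of the digest to a uniform value in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2 ** 64
    return u < neighbor_probability_p
```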
  • each of the plurality of client devices 302 sends the generated model update output to the central server 304.
  • the central server 304 is configured to receive a plurality of model update outputs transmitted by the plurality of client devices 302, and transmit the global model update 312 to each of the client devices 302A-302N.
  • the central server 304 determines the global model update 312 by use of the plurality of model update outputs transmitted by the plurality of client devices 302.
  • the central server 304 is further configured to transmit the global model update 312 to each of the client devices 302A-302N for a next iteration.
  • the central server 304 is further configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update the global model to generate the global model update 312 based on the aggregated sum of model updates.
  • the central server 304 obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs. Based on the aggregated sum of model updates, the central server 304 determines the global model update 312 which is shared with each of the plurality of client devices 302.
  • updating the global model comprises converting an integer vector to a floating point vector.
  • the global model is updated by converting from the integer (modular) vector to the floating point vector through a reverse quantization process.
  • the conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates. If a client device has dropped out during the process, the central server 304 takes this into consideration while executing the reverse quantization process, as sketched below.
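  • A minimal server-side dequantization sketch, assuming the same hypothetical fixed-point scale used in the client-side quantization sketch above; values in the upper half of the modulus are interpreted as negative before rescaling.

```python
import numpy as np

def dequantize_sum(int_sum, scale, word_size=32):
    """Map the modular integer sum back to a floating point vector (server side)."""
    modulus = 1 << word_size
    int_sum = np.asarray(int_sum, dtype=np.int64) % modulus
    # Values in the upper half of the ring encode negative numbers.
    centered = np.where(int_sum >= modulus // 2, int_sum - modulus, int_sum)
    return centered / scale
```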
  • the central server 304 is further configured to determine that a client device has dropped out.
  • the central server 304 is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server 304 is configured to perform noise compensation where one or more of the plurality of client devices 302 have dropped out during the execution of the random secure averaging (RdSA) protocol. In such a case of client dropout, the total noise in the aggregated sum of model updates is less than the required value. Therefore, the central server 304 is configured to add the additional noise to the aggregated sum of model updates based on the predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the predetermined variance depends on the variance of the local noise and on A, the set of clients whose model update outputs are included in the aggregated sum of model updates; the fewer clients in A, the more noise is missing from the aggregate.
  • the additional noise with this compensating variance is added to the aggregated sum of model updates.
  • if the Gaussian distribution is used, then a Gaussian noise with the compensating variance is added; an analogous compensation applies when the Binomial distribution is used. A compensation sketch follows.
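  • The sketch below illustrates one plausible form of this compensation, assuming each client was expected to contribute Gaussian noise of standard deviation sigma_local; the function and parameter names are hypothetical and the exact compensating variance is the predetermined value described above.

```python
import numpy as np

def compensate_dropout_noise(aggregated_sum, sigma_local, num_expected,
                             num_included, rng=None):
    """Top up the aggregated noise when some clients dropped out.

    With |A| = num_included contributors instead of num_expected, the
    aggregated noise variance is num_included * sigma_local**2; the server
    adds Gaussian noise carrying the missing variance.
    """
    rng = rng if rng is not None else np.random.default_rng()
    missing_var = max(num_expected - num_included, 0) * sigma_local ** 2
    if missing_var > 0:
        aggregated_sum = aggregated_sum + rng.normal(
            0.0, np.sqrt(missing_var), size=np.shape(aggregated_sum))
    return aggregated_sum
```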
  • the central server 304 is further configured to perform a client dropout recovery protocol including: receiving a plurality of key shares from the plurality of client devices 302, representing a set of secret keys for each client device 302A-302N which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server 304 by each client device 302A-302N.
  • the central server 304 is further configured to determine that a client device has dropped out.
  • the central server 304 is further configured to combine a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
  • after determining that a client device has dropped out, the central server 304 is configured to combine a plurality of received key shares corresponding to the dropout client to recover the secret key corresponding to the dropout client. From the recovered secret key, the central server 304 derives what value of random noise was added to the aggregated sum of model updates on behalf of the dropout client, and that value is removed from the aggregated sum of model updates. In another implementation, the central server 304 is configured to combine the plurality of received key shares corresponding to a non-dropout client to derive what value of local random noise was added by that client, and that value of the local random noise is removed from the aggregated sum of model updates. In this way, the central server 304 obtains the aggregated sum of the non-dropout clients' model updates. An illustrative secret-sharing sketch follows.
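  • The disclosure only specifies that a secret sharing protocol is used; the sketch below shows the simplest additive n-of-n variant for concreteness (a threshold scheme such as Shamir's would be used in practice when some shares may themselves be missing). All names and the modulus are illustrative.

```python
import secrets

MODULUS = 2 ** 256  # illustrative modulus for additive secret sharing

def split_secret(secret: int, num_shares: int) -> list:
    """Additive n-of-n sharing: each share is uniform and all shares sum to the secret."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_shares - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def recover_secret(shares: list) -> int:
    """Combine all received key shares to recover a dropped-out client's secret key."""
    return sum(shares) % MODULUS
```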
  • the central server 304 determines the global model update 312 by use of the model update outputs received from the plurality of client devices 302. After that, the central server 304 shares the global model update 312 with the plurality of client devices 302 in order to prepare for the next iteration.
  • the system 300A is based on a graph theory and connectivity properties of random graphs.
  • the central server 304 learns a sum of at least n/3 private model updates, except with probability equal to the risk parameter a.
  • the system 300A provides an enhanced privacy protection of client devices’ data including personal data, improved performance and high utility.
  • the system 300A employs a combination of the secure sum technique (or cryptographic technique) and the differential privacy technique.
  • the secure sum technique prevents the central server 304 from accessing private information from any individual client.
  • a noise is added locally (on the client’s side) rather than centrally (on the central server’ side) which provides a strong privacy protection.
  • FIG. 3B is a block diagram that illustrates various exemplary components of the client device, in accordance with an example of the present disclosure.
  • FIG. 3B is described in conjunction with elements from FIGs. 1, 2, and 3A.
  • with reference to FIG. 3B, there is shown a block diagram 300B of the client device 302A (of FIG. 3A) that includes a processor 314, a network interface 316, a memory 318 and an input/output (I/O) component 320.
  • the memory 318 further includes a training module 318A.
  • the processor 314 includes suitable logic, circuitry, or interfaces that is configured to generate a public key and a secret key.
  • the processor 314 is configured to execute instructions stored in the memory 318.
  • the processor 314 may be a general-purpose processor.
  • Other examples of the processor 314 may include, but is not limited to a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application- specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry.
  • the processor 314 may refer to one or more individual processors, processing devices, a processing unit that is part of a machine, such as the client device 302A.
  • the network interface 316 includes suitable logic, circuitry, or interfaces that is configured to broadcast the public key to a plurality of other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into the global model update 312.
  • Examples of the network interface 316 may include, but are not limited to, an antenna, a radio frequency (RF) transceiver, one or more amplifiers, a digital signal processor, or a subscriber identity module (SIM) card.
  • the memory 318 includes suitable logic, circuitry, or interfaces that is configured to store the instructions executable by the processor 314. Examples of implementation of the memory 318 may include, but are not limited to, Electrically Erasable Programmable Read- Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), or CPU cache memory.
  • the memory 318 may store an operating system or other program products (including one or more operation algorithms) to operate the client device 302A.
  • the training module 318A is configured to generate a local model update vector.
  • the training module 318A corresponds to a machine learning model which is trained by use of the local model update vector.
  • the training module 318A is locally trained on the client device 302A.
  • the client device 302A uses the secure sum protocol 310, which prevents the central server 304 from observing any private information from the training module 318A.
  • the client device 302A adds a local differential privacy (DP) noise to the training module 318A to further enhance the privacy protection.
  • the client device 302A shares the locally trained model with the central server 304.
  • the central server 304 receives the locally trained model with respect to each of the plurality of other client devices 302B-302N.
  • after receiving the locally trained model from each client device, the central server 304 computes a global model update which is shared with each of the plurality of client devices 302.
  • the global model update manifests improved performance in terms of more accurate prediction or decision, privacy protection of personal data and less computation cost, even in the presence of a large number of client devices.
  • the training module 318A (which may include one or more software modules) is potentially implemented as a separate circuitry in the client device 302A.
  • the training module 318A is implemented as a part of another circuitry to execute various operations.
  • the input/output (I/O) component 320 refers to input and output components (or devices) that can receive input from a user (e.g. the client device 302A) and provide output to the user (i.e. the client device 302A).
  • the I/O component 320 may be communicatively coupled to the processor 314.
  • input components may include, but are not limited to, a touch screen, such as a touch screen of a display device, a microphone, a motion sensor, a light sensor, a dedicated hardware input unit (such as a push button), and a docking station.
  • Examples of output components include a display device and a speaker.
  • the training module 318A is configured to generate a local model update vector.
  • the processor 314 is configured to generate a public key and a secret key.
  • the transceiver 316 (or the network interface) is configured to broadcast the public key to a plurality of other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into a global model update 312.
  • the processor 314 is further configured to generate, for each other client device 302B-302N, the pseudorandom number based on the secret key and the external public key.
  • the processor 314 is further configured to determine whether each of the other client devices 302B-302N is to be allocated to the set of neighbor devices of the client device 302A based on the pseudorandom number and the predetermined neighbor probability parameter.
  • the processor 314 is further configured to generate the model update output according to the secure sum protocol based on the set of neighbor devices.
  • the processor 314 is further configured to generate the model update output according to the secure sum protocol by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device.
  • the set of all the one-time-pads generated by the plurality of client devices 302 sums substantially to zero.
  • the processor 314 is further configured to generate the model update output according to the secure sum protocol by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
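  • A minimal sketch of this share-splitting variant is given below, assuming the updates have already been quantized to integers modulo 2^wordsize; the function names and the n-of-n additive sharing are illustrative simplifications of the secure sum protocol, not the exact construction of the disclosure.

```python
import secrets

MODULUS = 1 << 32  # modulus matching the quantization word size

def split_update(update_ints, num_neighbors):
    """Split a quantized update into additive shares, one per neighbor device."""
    shares = [[secrets.randbelow(MODULUS) for _ in update_ints]
              for _ in range(num_neighbors - 1)]
    last_share = [
        (value - sum(share[i] for share in shares)) % MODULUS
        for i, value in enumerate(update_ints)
    ]
    return shares + [last_share]

def combine_external_shares(external_shares):
    """Sum the shares received from neighbors to form the model update output."""
    length = len(external_shares[0])
    return [sum(share[i] for share in external_shares) % MODULUS
            for i in range(length)]
```

  • Because every individual share is uniformly random on its own, a neighbor (or the central server 304) observing a single share learns nothing about the underlying update; only the sum over all contributions is meaningful.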
  • the processor 314 is further configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
  • the processor 314 is further configured to convert the local model update vector from a vector of floating point values to an integer vector.
  • FIG. 3C is a block diagram that illustrates various exemplary components of the central server, in accordance with an example of the present disclosure.
  • FIG. 3C is described in conjunction with elements from FIGs. 1, 2, 3A, and 3B.
  • with reference to FIG. 3C, there is shown a block diagram 300C of the central server 304 (of FIG. 3A) that includes a processor 322, a network interface 324, and a memory 326.
  • the processor 322 includes suitable logic, circuitry, or interfaces that is configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update based on the aggregated sum of model updates.
  • the processor 322 is configured to execute instructions stored in the memory 326.
  • the processor 322 may be a general-purpose processor.
  • other examples of the processor 322 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry.
  • the processor 322 may refer to one or more individual processors, processing devices, or a processing unit that is part of a machine, such as the central server 304.
  • the network interface 324 includes suitable logic, circuitry, or interfaces that is configured to receive a plurality of model update outputs transmitted by the plurality of client devices 302, and transmit the global model update 312 to each of the client devices 302A-302N.
  • Examples of the network interface 324 may include, but are not limited to, an antenna, a radio frequency (RF) transceiver, one or more amplifiers, a digital signal processor, or a subscriber identity module (SIM) card.
  • the memory 326 includes suitable logic, circuitry, or interfaces that is configured to store the instructions executable by the processor 322. Examples of implementation of the memory 326 may include, but are not limited to, Electrically Erasable Programmable Read- Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), or CPU cache memory.
  • the memory 326 may store an operating system or other program products (including one or more operation algorithms) to operate the central server 304.
  • the transceiver 324 (or the network interface) is configured to receive a plurality of model update outputs transmitted by a plurality of client devices 302, and transmit a global model update 312 to each of the client devices 302A-302N.
  • the processor 322 is configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update 312 based on the aggregated sum of model updates.
  • the processor 322 is further configured to determine that a client device has dropped out and add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server 304 is further configured to perform a client dropout recovery protocol including: receiving, by the transceiver 324, a plurality of key shares from the plurality of client devices 302, representing a set of secret keys for each client device 302 which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server 304 by each client device 302.
  • the central server 304 is further configured to perform a client dropout recovery protocol including: determining, by the processor 322, that a client device has dropped out and combining, by the processor 322, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
  • a system 400 that depicts an implementation of distributed machine learning with random secure averaging (RdSA) in more detail.
  • the system 400 describes an exemplary sequence of operations 402, 402A, 402B, 402C, 402D, 402E, 402F, 402G, 402H, and 402I executed by the client device 302A.
  • the system 400 includes the plurality of client devices 302 and the central server 304 of FIG. 3A.
  • the client side distributed learning or the client side random secure averaging is explained with respect to the client device 302A and the plurality of other client devices 302B-302N.
  • the central server side distributed learning or the central server side random secure averaging is explained with respect to the central server 304.
  • the client device 302A of the plurality of client devices 302 performs a local training of a global model in a series of operations.
  • the global model is shared by the central server 304 with each of the plurality of client devices 302.
  • the central server 304 chooses values for standard distributed learning parameters, such as number of local epochs, learning rate or total number of iterations N.
  • the central server 304 shares the chosen parameters of standard distributed learning with the plurality of client devices 302.
  • the client device 302A starts the local training of a local model update vector by use of local data or raw data.
  • the client device 302A applies the parameter choices which are shared by the central server 304.
  • the chosen parameters are used in the local training of the local model update vector.
  • the client device 302A is further configured to generate a public key and a secret key.
  • the client device 302A generates the public key and the secret key (or a private key) in order to perform a key agreement with the other client devices 302B-302N.
  • the client device 302A cannot directly communicate with the plurality of other client devices 302B-302N.
  • the client device 302A communicates the generated public key to the central server 304.
  • the central server 304 shares the generated public key with each of the plurality of client devices 302. Therefore, in return, each client device 302 receives an external public key from the central server 304.
  • the client device 302A is further configured to generate a key pair based on the generated secret key and the received external public key.
  • the generated key pair is used to perform a key agreement with the plurality of other client devices 302B-302N.
  • the key agreement is performed by use of an Elliptic-Curve Diffie-Hellman (ECDH) key agreement scheme, which allows two client devices, each having an elliptic-curve public-secret key pair, to establish a shared secret over an insecure channel, as sketched below.
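  • A minimal sketch of such a key agreement using the Python cryptography package is shown below. The disclosure only requires an ECDH-style agreement; the choice of the X25519 curve, the HKDF step and the info label are illustrative assumptions, and in the disclosed setting the public keys would be relayed through the central server 304 rather than exchanged directly.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each client generates a key pair; public keys are relayed by the central server.
client_a_secret = X25519PrivateKey.generate()
client_b_secret = X25519PrivateKey.generate()

# Each side combines its own secret key with the other's public key.
shared_a = client_a_secret.exchange(client_b_secret.public_key())
shared_b = client_b_secret.exchange(client_a_secret.public_key())
assert shared_a == shared_b  # both sides now hold the same shared secret

# Derive usable shared randomness (e.g. a neighbor-selection seed) from it.
seed = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
            info=b"rdsa-neighbor-seed").derive(shared_a)
```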
  • the client device 302A is further configured to select its neighbors by use of a neighbor selection algorithm, which have been described in detail, for example, in FIGs. 1 and 3A.
  • the two client devices which have established the shared secret create identical pseudorandom numbers which are used to decide whether the two clients are neighbors of each other.
  • secret sharing is performed in order to split the private secrets of the client device 302A.
  • Two private secrets of the client device 302A are split at operation 402E: the secret key of the client device 302A, from which a shared secret is derived for generating a one-time-pad for each neighbor device, and a personal seed of the client device 302A, which is used to add a random noise in a further operation.
  • the client device 302A is further configured to add a locally generated differential privacy (DP) noise signal to the local model update vector and wherein a distribution of the locally generated DP noise signal is gaussian or binomial.
  • the locally generated noise signal has either the Gaussian distribution or the Binomial distribution.
  • the noise parameters are selected and shared by the central server 304 with each of the plurality of client devices 302 for generating the DP noise signal.
  • the client device 302A is further configured to perform a quantization process.
  • the local model update vector is converted from the vector of floating point values to the integer (or modular) vector.
  • the client device 302A is further configured to apply the secure sum protocol 310 by generating a one-time-pad for each neighbor device and adding the generated one-time-pads to the quantized model update vector.
  • the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device.
  • the client device 302A is further configured to compute a local randomized noise based on its personal seed and to add the computed local randomized noise to the quantized model update vector, as illustrated in the masking sketch below.
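  • The sketch below combines the two preceding steps: one one-time-pad per neighbor, whose sign is chosen by client-identifier order so that the pads of each pair cancel at the central server 304, plus a noise stream seeded from the personal seed, all reduced modulo 2^wordsize. The integer seeds, identifiers and parameter names are illustrative assumptions, not the exact derivations of the disclosure.

```python
import numpy as np

def mask_update(quantized_update, my_id, neighbor_ids, neighbor_seeds,
                personal_seed, noise_std_ints, word_size=32):
    """Add pairwise one-time-pads and seeded local noise to a quantized update."""
    modulus = 1 << word_size
    masked = np.asarray(quantized_update, dtype=np.int64) % modulus
    for neighbor_id, seed in zip(neighbor_ids, neighbor_seeds):
        # Both neighbors derive the same pad from their shared seed; the
        # lower-id client adds it and the higher-id client subtracts it,
        # so the pads cancel in the aggregated sum at the server.
        pad = np.random.default_rng(seed).integers(0, modulus, size=masked.shape)
        sign = 1 if my_id < neighbor_id else -1
        masked = (masked + sign * pad) % modulus
    # Local randomized noise derived from the personal seed (removable later
    # by the server via the corresponding key shares if this client survives).
    noise_rng = np.random.default_rng(personal_seed)
    noise = np.rint(noise_rng.normal(0.0, noise_std_ints, size=masked.shape))
    return (masked + noise.astype(np.int64)) % modulus
```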
  • the client device 302A is configured to communicate to the central server 304 one share of each other client device's secrets.
  • the client device 302A is configured to communicate a share of the other client device’s secret key to the central server 304 in order to derive shared OTPs of the other client device.
  • the client device 302A is configured to communicate a share of the other client device’s personal seed to the central server 304 to derive the local randomized noise of the plurality of other client devices 302B- 302N.
  • the operations 402A to 402G are required for performing the client side random secure averaging.
  • the operations 402A to 402I are executed in the same order for each of the plurality of client devices 302.
  • each of the plurality of client devices 302 is configured to determine a plurality of model update outputs after training of the local model update vectors, respectively.
  • Each of the plurality of client devices 302 shares the plurality of model update outputs with the central server 304.
  • the central server 304 is configured to perform a series of operations to determine a global model update 406.
  • the central server 304 is configured to broadcast the generated public key of the client device 302A to the plurality of other client devices 302B-302N.
  • the central server 304 is configured to distribute the plurality of shares of the client device 302A to the plurality of other client devices 302B-302N.
  • the central server 304 is configured to receive the plurality of model update outputs transmitted by the plurality of client devices 302.
  • the central server 304 is further configured to determine an aggregated sum of model updates based on the plurality of model update outputs.
  • the central server 304 obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs.
  • the central server 304 determines the global model update 406 which is shared with each of the plurality of client devices 302.
  • the central server 304 is configured to perform a dropout recovery.
  • the two ways of drop out recovery have been described in detail, for example, in FIG. 3A.
  • the central server 304 is configured to perform a reverse quantization process for converting an integer vector to a floating point vector.
  • the conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates.
  • the central server 304 is configured to compensate the differential privacy noise in case one or more client devices drop out during the execution of the RdSA protocol.
  • the central server 304 is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
  • the central server 304 determines the global model update 406.
  • the global model update 406 is communicated to each of the plurality of client devices 302 in order to prepare for the next iteration.
  • the sequence of operations 404A, 404B, 404C, 404D, 404E and 404F indicates the central server side random secure averaging (RdSA).
  • FIG. 5 illustrates an exemplary implementation scenario of distributed machine learning, in accordance with an example of the present disclosure.
  • FIG. 5 is described in conjunction with elements from FIGs. 1, 2, 3A, 3B, 3C, and 4.
  • the plurality of client devices 502 includes a client device 502A and other client devices 502B-502N.
  • the client device 502A uses a video recommendation tool 508A (e.g. a machine learning model).
  • the other client devices 502B-502N use video recommendation tools 508B-508N.
  • the plurality of client devices 502, the central server 504 and the network 506 correspond to the plurality of client devices 302, the central server 304 and the network 306 of FIG. 3A, respectively.
  • the client device 502A uses the video recommendation tool 508A, which uses a predictive model.
  • the video recommendation tool 508A is locally trained on the client device 502A as the system 500 uses distributed machine learning with random secure averaging (RdSA) protocol.
  • the other client devices 502B-502N locally train their respective video recommendation tools 508B-508N.
  • Each of the plurality of client devices 502 shares their respective locally trained video recommendation tools with the central server 504.
  • the RdSA protocol is used to compute a private global model update at the central server 504 for N client devices, where N may range from hundreds to thousands.
  • the private global model update computed at the central server 504 is shared with each of the plurality of client devices 502, which now benefit from an increased accuracy of the video recommendation tool along with privacy protection.
  • the plurality of client devices 502 may correspond to a plurality of computing devices used at hospitals or laboratories.
  • the client device 502A corresponds to a computing device used at a hospital.
  • the client device 502B corresponds to another computing device used at a laboratory or an organization and so on.
  • Each of the plurality of client devices 502 uses a model trained from medical images along with annotations manually provided by medical practitioners, which is used to detect illness from the medical imagery. In many jurisdictions, such medical images cannot be shared between different hospitals or laboratories. Therefore, local training of such a model is performed at the respective computing devices used at the hospital, the laboratory or the organization.
  • each of the plurality of computing devices used at the hospital or the laboratory or the organization share their locally trained model with the central server 504.
  • the central server 504 may compute a global model update by use of the RdSA protocol and share the computed global model update with each of the plurality of computing devices used at the hospital, the laboratory or the organization, which now detect the illness accurately while maintaining improved privacy and protection of personal data (which may provide a formal guarantee of strong data privacy protection), as well as avoiding the scaling issue.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method for distributed machine learning for a network includes receiving a local model update vector. The method further includes generating a public key and a secret key. The method further includes receiving an external public key for each of the other client devices. The method further includes generating, based on the secret key and the external public key, a pseudorandom number. The method further includes determining whether each of the other client devices is to be allocated to a set of neighbor devices of the client device. The method further includes generating a model update output according to a secure sum protocol based on the set of neighbor devices and outputting the model update output for transmission to a central server for incorporation into a global model update. The method provides an enhanced privacy protection of users' data and simultaneously solves the scaling issue and ensures high practical utility.

Description

DISTRIBUTED TRAINING WITH RANDOM SECURE AVERAGING
TECHNICAL FIELD
The present disclosure relates generally to the field of data security and machine learning; and more specifically to methods and devices for distributed machine learning (or training) with a random secure averaging protocol.
BACKGROUND
Nowadays, a wide array of services use machine learning (ML) to improve their usefulness in day to day life. Generally, machine learning algorithms include two phases: a training phase and a prediction phase. By use of machine learning algorithms, a model is developed which is trained by use of sample data. The sample data is used during the training phase in order to make a prediction (or a decision) during the prediction phase of the model. In general, when either customers’ data or personal data is used during the model’s training phase and the prediction phase, how to achieve privacy protection then becomes a major technical issue.
Currently, certain approaches have been proposed to enhance the privacy such as a conventional federated learning (FL). The conventional federated learning is used as a distributed training mechanism for the model without forcing the customers (or clients) to upload their raw data during the training phase. Thus, the conventional federated learning decreases an overall communication cost while enhancing the privacy partially. The conventional federated learning somewhat addresses the need to constantly update the model at a low cost and manifests a possibility of training the model locally by use of the customers’
(or clients’) local data. However, the conventional federated learning incurs privacy issues such as the personal data may be directly extracted from the model updates received by a conventional server. Moreover, the model (or the trained model) which is shared with other clients may also reveal personal information about the customers (or clients) who are involved in the training phase of the model. The privacy issues of the conventional federated learning may be resolved to an extent by use of a conventional secure sum technique and a conventional differential privacy (DP) technique. Alternatively stated, the distributed training of the model (or the machine learning model) is performed either by use of the conventional secure sum technique or the conventional differential privacy technique. The conventional secure sum technique is a cryptographic technique which distributedly and privately computes a sum of updates of many customers (or clients) at once and prevents the conventional server from observing any individual update. Thereby, the conventional secure sum technique protects the privacy of individual clients in the distributed learning manner. However, the conventional secure sum technique incurs a high computational cost when the number of clients increases and provides no privacy guarantees for a final trained model. In the conventional differential privacy technique, the conventional server adds a noise after each global iteration of the training phase. Thus, the conventional differential privacy technique provides some amount of formal privacy guarantees for the final trained model. However, the conventional differential privacy technique does not address the privacy issues with regards to the conventional server. Moreover, the conventional differential privacy technique requires a large number of clients while the conventional secure sum technique manifests a scaling issue (i.e. does not scale properly and is error prone) when the number of clients increases. Thus, there exists a technical problem of inadequate privacy protection of customers data or personal data when a large number of clients are involved or when the number of clients (i.e. client devices) increases, the scaling issue is manifested.
Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with the conventional techniques used for distributed machine learning.
SUMMARY
The present disclosure seeks to provide methods and devices for distributed machine learning with improved privacy protection of users’ data or personal data as well as high performance and high utility. The present disclosure seeks to provide a solution to the existing problem of inadequate privacy protection of customers’ data or personal data when a large number of clients are involved or when the number of clients (i.e. client devices) increases leading to the scaling issue. An aim of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in prior art, and provide methods and devices for distributed machine learning with improved privacy protection of users’ data including personal data and at the same time solves the scaling issue and ensures high performance and high practical utility.
The object of the present disclosure is achieved by the solutions provided in the enclosed independent claims. Advantageous implementations of the present disclosure are further defined in the dependent claims.
In one aspect, the present disclosure provides a method for distributed machine learning for a network comprises receiving, by a processor of a client device, a local model update vector. The method further comprises generating, by the processor, a public key and a secret key, where the public key is for broadcasting to a plurality of other client devices on the network. The method further comprises receiving, by the processor, an external public key for each of the other client devices. The method further comprises for each other client device: generating, based on the secret key and the external public key, a pseudorandom number; and determining whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter. The method further comprises generating, by the processor, a model update output according to a secure sum protocol based on the set of neighbor devices; and outputting, by the processor, the model update output for transmission to a central server for incorporation into a global model update.
The method of the present disclosure employs distributed machine learning with random secure averaging (RdSA) protocol. The disclosed method provides an enhanced privacy protection of client devices’ (or users’) data including personal data, improved performance, and high utility. The disclosed method employs a combination of a secure sum technique (or cryptographic technique) and a differential privacy technique in a coordinated and synergistic manner to enhance the client devices’ data privacy. The method by use of the secure sum protocol prevents the central server from accessing private information from any individual client. Beneficially, the method of the present disclosure uses a neighbor selection algorithm. Based on the neighbor selection algorithm, the set of neighbors is selected in each iteration of the random secure averaging (RdSA) protocol. By virtue of the use of the neighbor selection algorithm, the number of client devices reduces in each iteration of the RdSA protocol, which further resolves the scaling issue. The use of the neighbor selection algorithm also allows the use of the differential privacy technique to further enhance the privacy protection of the client devices’ data.
In an implementation form, the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
The disclosed method provides the enhanced privacy protection of users’ data, and improved performance even in the presence of a large number of client devices. The scaling problem of the conventional secure sum technique is resolved by use of the neighbor selection algorithm. By use of the neighbor selection algorithm, the client device selects its neighbors based on the neighbor probability parameter. The neighbors are selected in such a way that each pair of client devices agree on whether or not they are neighbors and any other party (e.g. the central server) learns nothing about whether or not they are neighbors. When the number of client devices is large, the neighbor selection algorithm still provides strong privacy at less computational cost by selecting the set of the neighbor client devices. Moreover, the use of the neighbor selection algorithm enables the use of the differential privacy technique to further enhance the client’s data privacy.
In a further implementation form, the neighbor probability parameter p and the predefined value r satisfy:

f(n_h, p) > 1 - r and f(n_h, p - δ) < 1 - r, for precision δ,

where f is the function shown in Figure imgf000005_0001, and where the constants appearing in f take different values for the active security model and the passive security model, as shown in Figure imgf000005_0002.
By configuring the neighbor probability parameter p, the disclosed method can be used with an active security model as well as a passive security model. The active security model provides protection against very powerful (e.g. national level or state level) adversaries or inside attackers. The passive security model provides protection against data breaches, remote hackers, and compliance with regulations.
In a further implementation form, generating the model update output according to the secure sum protocol comprises generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
The addition of the plurality of one-time-pads to the local model update vector of the client device encrypts a secret value of the client device.
In a further implementation form, a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero.
The set of all the one-time-pads generated by the plurality of client devices are added at the central server which comes substantially to zero, that means the central server obtains a sum of the client’s (unencrypted) secrets.
In a further implementation form, generating the model update output according to the secure sum protocol comprises splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
The splitting of the local model vector update into the plurality of shares according to the number of neighbor devices improves the secret sharing on the client side in comparison to the conventional secure sum technique where the scaling issue is prominent.
In a further implementation form, generating the model update output includes adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial. In the differential privacy technique, the locally generated noise signal is added to the local model update vector to provide an improved local privacy, which means that the private data is fully protected. The central server cannot observe any private information from any individual client. The noise signal is added locally (on the client’s side) rather than centrally (on the server’s side) because local privacy is more desirable than central privacy. The locally generated noise signal of binomial distribution is used to provide the local privacy protection.
In a further implementation form, the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of locally generated noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation.
By virtue of generating the locally generated noise signal with the standard deviation selected by the central server, the global noise has the binomial distribution.
In a further implementation form, the method further comprises converting the local model update vector from a vector of floating point values to an integer vector.
The conversion of the local model update vector from the vector of floating point values to the (modular) integer vector ensures the security on the client side.
In a further implementation form, the method further comprises sending the public key to the central server and receiving the external public keys from the central server.
The public key is communicated to the central server which further communicates the public key to other client devices so that the client device can perform a key agreement with other client devices and selects its neighbors.
In a further implementation form, a computer-readable medium configured to store instructions which, when executed, cause a client device processor to perform the method.
The client device processor achieves all the advantages and effects of the method. In another aspect, the present disclosure provides a client device comprising a training module configured to generate a local model update vector. The client device further comprises a processor configured to generate a public key and a secret key. The client device further comprises a transceiver configured to broadcast the public key to a plurality of other client devices on the network, receive an external public key for each of the other client devices, and transmit a model update output to a central server for incorporation into a global model update. The processor of the client device is further configured to generate, for each other client device, a pseudorandom number based on the secret key and the external public key. The processor of the client device is further configured to determine whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter. The processor of the client device is further configured to generate the model update output according to a secure sum protocol based on the set of neighbor devices.
The client device of the present disclosure manifests an enhanced local privacy of personal data. The client device uses the neighbor selection algorithm, which results in a reduced number of client devices in each iteration of the RdSA protocol. Therefore, the client device also manifests a reduced computational cost even in the presence of a large number of client devices. The client device uses the secure sum technique (or cryptographic technique) and prevents the central server from accessing private information from any individual client.
In an implementation form, the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
The client device selects its neighbors by use of the neighbor probability parameter in the neighbor selection algorithm. The neighbors are selected in such a way that each pair of client devices agree on whether or not they are neighbors and any other client device (e.g. the central server) learns nothing about whether or not they are neighbors.
In a further implementation form, the neighbor probability parameter p and the predefined value r satisfy:

f(n_h, p) > 1 - r and f(n_h, p - δ) < 1 - r, for precision δ,

where f is the function shown in Figure imgf000009_0001, and where the constants appearing in f take different values for the active security model and the passive security model, as shown in Figure imgf000009_0002.
By configuring the neighbor probability parameter p, the client device can be used in an active security model as well as for a passive security model.
In a further implementation form, the processor is further configured to generate the model update output according to the secure sum protocol by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
The addition of the plurality of one-time-pads to the local model update vector of the client device encrypts a secret value of the client device. In a further implementation form, a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero.
The set of all the one-time-pads generated by the plurality of client devices are added at the central server which comes substantially to zero, that means the central server obtains a sum of the client’s (unencrypted) secrets. In a further implementation form, the processor is configured to generate the model update output according to the secure sum protocol by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output. The splitting of the local model vector update into the plurality of shares according to the number of neighbor devices improves the secret sharing on the client side in comparison to the conventional secure sum technique where the scaling issue is prominent.
In a further implementation form, the processor is configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
In the differential privacy technique, the client device adds the locally generated noise signal to the local model update vector to provide an improved local privacy, which means that the private data is fully protected. The central server cannot observe any private information from any individual client. The noise signal is added locally (on the client’s side) rather than centrally (on the server’s side) because local privacy is more desirable than central privacy. The locally generated noise signal of binomial distribution is used to provide the local privacy protection.
In a further implementation form, the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of differentially private noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation.
By virtue of generating the locally generated noise signal with the standard deviation selected by the central server, the global noise has the binomial distribution.
In a further implementation form, the processor is further configured to convert the local model update vector from a vector of floating point values to an integer vector.
The client device converts the local model update vector from the vector of floating point values to the (modular) integer vector to ensure the security.
In a further implementation form, the transceiver is further configured to send the public key to the central server and receive the external public keys from the central server. The client device communicates the public key to the central server which further communicates the public key to other client devices so that the client device can perform a key agreement with other client devices and selects its neighbors.
In another aspect, the present disclosure provides a method for distributed machine learning for a network comprising: receiving, by a central server, a plurality of model update outputs transmitted by a plurality of client devices. The method further comprises determining, by the central server, an aggregated sum of model updates based on the plurality of model update outputs. The method further comprises updating, by the central server, a global model based on the aggregated sum of model updates. The method further comprises transmitting, by the central server, the global model update to each of the client devices.
The central server performs random secure averaging of the plurality of model update outputs transmitted by the plurality of client devices. Based on the random secure averaging, the central server determines the global model update which is further shared with the plurality of client devices. The plurality of client devices manifests an improved accuracy and enhanced privacy protection of personal data because of the global model update.
In an implementation form, updating the global model comprises converting an integer vector to a floating point vector.
Updating the global model comprises conversion from the integer vector to the floating point vector to enable ease of operation.
In a further implementation form, the method further comprises determining, by the central server, that a client device has dropped out. The method further comprises adding, by the central server, an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
The central server determines if any client device has dropped out during the execution of the RdSA protocol. After determining the drop out, the central server compensates for the lost noise by adding the additional noise to the aggregated sum of model updates. In this way, the central server performs the differential noise recovery also. In a further implementation form, the method further comprises performing a client dropout recovery protocol including: receiving, by the central server, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device. The method further comprises determining, by the central server, that a client device has dropped out. The method further comprises combining, by the central server, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
The central server performs client drop out recovery in case of the client drop out by use of the secret sharing protocol.
In a further implementation form, a computer-readable medium configured to store instructions which, when executed, cause a central server processor to perform the method.
The central server processor achieves all the advantages and effects of the method.
In another aspect, the present disclosure provides a central server comprising a transceiver configured to receive a plurality of model update outputs transmitted by a plurality of client devices, and transmit a global model update to each of the client devices. The central server further comprises a processor configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update based on the aggregated sum of model updates.
The central server performs random secure averaging of the plurality of model update outputs transmitted by the plurality of client devices. Based on the random secure averaging, the central server determines the global model update which is further shared with the plurality of client devices. The plurality of client devices manifests an improved accuracy and enhanced privacy protection of personal data because of the global model update.
In an implementation form, updating the global model comprises converting an integer vector to a floating point vector. This conversion from the integer vector to the floating point vector enables ease of operation at the central server.
In a further implementation form, the processor is further configured to determine that a client device has dropped out. The processor is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
The central server determines if any client device has dropped out during the execution of the RdSA protocol. After determining the drop out, the central server compensates for the lost noise by adding the additional noise to the aggregated sum of model updates. In this way, the central server performs the differential noise recovery also.
In a further implementation form, the central server is further configured to perform a client dropout recovery protocol including: receiving, by the transceiver, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device. The central server is further configured to perform a client dropout recovery protocol including: determining, by the processor, that a client device has dropped out. The central server is further configured to perform a client dropout recovery protocol including: combining, by the processor, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
The central server performs client dropout recovery, in the case of a client dropout, by use of the secret sharing protocol.
It has to be noted that all devices, elements, circuitry, units and means described in the present application could be implemented in software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application, as well as the functionalities described to be performed by the various entities, are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof. It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the present disclosure as defined by the appended claims.
Additional aspects, advantages, features and objects of the present disclosure would be made apparent from the drawings and the detailed description of the illustrative implementations construed in conjunction with the appended claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to specific methods and instrumentalities disclosed herein. Moreover, those in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers.
Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:
FIG. 1 is a flowchart of a method for distributed machine learning for a network, in accordance with an example of the present disclosure;
FIG. 2 is a flowchart of a method for distributed machine learning for a network, in accordance with another example of the present disclosure;
FIG. 3A is a network environment diagram that depicts distributed machine learning with random secure averaging, in accordance with an example of the present disclosure;
FIG. 3B is a block diagram that illustrates various exemplary components of the client device, in accordance with an example of the present disclosure;
FIG. 3C is a block diagram that illustrates various exemplary components of the central server, in accordance with an example of the present disclosure;
FIG. 4 is a network environment diagram that depicts distributed machine learning with random secure averaging, in accordance with another example of the present disclosure; and
FIG. 5 illustrates an exemplary implementation scenario of distributed machine learning, in accordance with an example of the present disclosure.
In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the non- underlined number to the item. When a number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.
DETAILED DESCRIPTION OF EMBODIMENTS
The following detailed description illustrates embodiments of the present disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practicing the present disclosure are also possible.
FIG. 1 is a flowchart of a method for distributed machine learning for a network, in accordance with an example of the present disclosure. With reference to FIG. 1, there is shown a method 100 for distributed machine learning for a network. The method 100 includes steps 102, 104, 106, 108, 110, 112, and 114. In an implementation, the method 100 is executed by a client device, described in detail, for example, in FIGs. 3A and 3B.
At step 102, the method 100 comprises receiving, by a processor of the client device, a local model update vector. The processor of the client device updates the local model update vector by use of local data or raw data of the client device. At step 104, the method 100 further comprises generating, by the processor, a public key and a secret key, where the public key is for broadcasting to a plurality of other client devices on the network. The generated public key and the secret key (or private key) of the client device are used to perform a key agreement with another client device.
At step 106, the method 100 further comprises receiving, by the processor, an external public key for each of the other client devices. The processor of the client device performs a key agreement with the plurality of other client devices on the network and generates a key pair based on the external public key received. The key agreement is performed by use of a key agreement scheme, such as Elliptic-Curve Diffie-Hellman (ECDH), which allows two client devices, each having an elliptic-curve external public-secret key pair (also known as a shared key), to establish a shared secret over an insecure channel. For example, s_{u,v} is the shared secret between two clients, such as a client u and a client v. The shared secret is obtained by applying a secure key derivation function, such as a hash-based key derivation function (HKDF), to the elliptic-curve external public-secret key pair (or the shared key).
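For illustration, the following is a minimal sketch of such a key agreement, assuming X25519 as the elliptic curve, SHA-256 inside the HKDF, and the Python cryptography library; the curve, hash, and context label are illustrative assumptions rather than requirements of the method.

```python
# Minimal sketch of the ECDH key agreement and HKDF-based derivation of the
# shared secret s_{u,v}. X25519, SHA-256 and the info label are assumptions.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each client generates its own key pair (step 104).
sk_u = X25519PrivateKey.generate()
pk_u = sk_u.public_key()
sk_v = X25519PrivateKey.generate()
pk_v = sk_v.public_key()

def shared_secret(own_secret_key, peer_public_key) -> bytes:
    """Derive the 32-byte shared secret s_{u,v} from an ECDH exchange."""
    raw = own_secret_key.exchange(peer_public_key)
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=b"rdsa-shared-secret",  # hypothetical context label
    ).derive(raw)

# Both derivations yield the same value, usable as a PRNG seed (step 108).
assert shared_secret(sk_u, pk_v) == shared_secret(sk_v, pk_u)
```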
At step 108, the method 100 further comprises, for each other client device: generating, based on the secret key and the external public key, a pseudorandom number. The elliptic-curve external public-secret key pair, which is based on the secret key and the external public key, is used to generate the pseudorandom number for each other client device. The shared secret s_{u,v} is used as a seed for a pseudorandom number generator which is secure and deterministic. For example, (rand_i^{u,v}) is the sequence of random numbers generated by the client pair (u, v).
At step 110, the method 100 further comprises, for each other client device: determining whether each of the other client devices is to be allocated to a set of neighbor devices of the client device based on the pseudorandom number and a predetermined neighbor probability parameter. For example, for each of the other client devices (i.e. the client v), the client device (i.e. the client u) generates the pseudorandom number rand^{u,v} over at least 128 bits. The client device's (i.e. the client u's) set of neighbor devices is

N_u = { v : rand^{u,v} < p · 2^d },

where p is the predetermined neighbor probability parameter for each other client device and d is the bit size of the generated pseudorandom number.
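As a sketch of this membership test, the snippet below derives the d-bit pseudorandom number from the shared secret with SHA-256 and compares it against p · 2^d; the hash-based derivation and the domain-separation label are illustrative assumptions.

```python
# Sketch of the neighbor test: client u treats client v as a neighbor when the
# shared pseudorandom number rand^{u,v} (d bits) lies below p * 2^d. Deriving
# the number from SHA-256 over the shared secret is an illustrative assumption.
import hashlib

def is_neighbor(shared_secret: bytes, p: float, d: int = 128) -> bool:
    digest = hashlib.sha256(shared_secret + b"neighbor-selection").digest()
    rand_uv = int.from_bytes(digest, "big") % (1 << d)  # d-bit pseudorandom number
    return rand_uv < p * (1 << d)

# Both clients of a pair hold the same shared secret, so they reach the same
# decision without any further interaction.
```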
At step 112, the method 100 further comprises generating, by the processor, a model update output according to a secure sum protocol based on the set of neighbor devices. The secure sum protocol is used to generate the model update output either by use of an additive secret sharing scheme or a one-time-pad (OTP) based scheme.
At step 114, the method 100 further comprises outputting, by the processor, the model update output for transmission to a central server for incorporation into a global model update. The processor of the client device transmits the locally trained model update output to the central server to finalize the global model update which is further shared with each of the client devices.
In accordance with an embodiment, the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack. The predefined value based on the modelled risk of the successful attack is represented as r_tot for N rounds of training. The predefined value computed for each round of training is r = 1 − (1 − r_tot)^(1/N), so that 1 − (1 − r)^N = r_tot. The modelled risk of the successful attack is also termed a risk parameter α, such that α ∈ [0, 1]. The risk parameter α represents the probability that the privacy of at least one client will not be protected against the central server. The risk parameter α can be computed either for each round of training or for N rounds of training (i.e. the entire training). In an example, if α = 0.01 per round, then for each round of training there is a 1% chance that at least one client's privacy will not be protected with respect to the central server. In another example, if α = 0.1 for 100 (i.e. N = 100) rounds of training, there is a 10% chance that at least one client's privacy will not be protected at some point during the training. The risk parameter α can be chosen based on business requirements. A higher accepted privacy risk translates into better computational performance. Generally, the risk parameter α is chosen to be of the same order of magnitude as the standard δ parameter used in the conventional differential privacy technique. In accordance with an embodiment, for the neighbor probability parameter p and predefined value r:
f(n_h, p) > 1 − r and f(n_h, p − δ) < 1 − r, for precision δ,

where n_h is a number of vertices defined according to the security model in use (active security or passive security).
f(n_h, p) is the probability that a random graph over n_h vertices is connected. The method 100 includes an active security model as well as a passive security model. The active security model provides privacy protection in the case of a malicious server, where the malicious server tries to recover a client's update by modifying the messages sent by a number of clients. The active security model provides security suitable for, e.g., highly confidential data such as medical data or financial data. Moreover, the active security model provides protection against very powerful (e.g. national level or state level) adversaries or inside attackers. The passive security model provides privacy protection in the case of an honest-but-curious server, where an attacker tries to recover the client's update by listening to the conversation between the number of clients and the conventional server. The passive security model provides security suitable for, e.g., personal data. Additionally, the passive security model provides protection against data breaches and remote hackers, and supports compliance with regulations.
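For illustration, the sketch below computes the connectivity probability f(n_h, p) of a random graph with the standard recursion for Erdős–Rényi graphs and bisects for the smallest p satisfying f(n_h, p) ≥ 1 − r; the recursion is the textbook connectivity formula rather than the specific expression of the disclosure, and bisecting to a precision δ is an assumption.

```python
# Sketch of choosing the neighbor probability p so that a random graph over n_h
# vertices is connected with probability at least 1 - r. The recursion is the
# standard connectivity formula for G(n, p); bisection to precision delta is an
# illustrative choice.
from functools import lru_cache
from math import comb

def connected_probability(n: int, p: float) -> float:
    @lru_cache(maxsize=None)
    def f(k: int) -> float:
        if k == 1:
            return 1.0
        # Subtract the probability that the component containing a fixed vertex
        # has exactly j < k vertices and no edge to the remaining k - j vertices.
        return 1.0 - sum(
            comb(k - 1, j - 1) * f(j) * (1.0 - p) ** (j * (k - j))
            for j in range(1, k)
        )
    return f(n)

def choose_p(n_h: int, r: float, delta: float = 1e-4) -> float:
    lo, hi = 0.0, 1.0
    while hi - lo > delta:
        mid = (lo + hi) / 2.0
        if connected_probability(n_h, mid) >= 1.0 - r:
            hi = mid
        else:
            lo = mid
    return hi

# Example: neighbor probability for n_h = 100 and a per-round risk r = 0.01.
print(choose_p(100, 0.01))
```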
In accordance with an embodiment, generating the model update output according to the secure sum protocol comprises generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device. Generally, in cryptography, the one-time pad is an encryption technique which uses the secret key of the client device and the external public key of the neighbor device to encrypt the shared secret. For example, for each of the other client devices (i.e. the client v), the client device (i.e. the client u) generates the one-time-pad (OTP) by use of the generated sequence of pseudorandom numbers (rand_i^{u,v}), such that the two neighbor devices use the same OTP for each other. Thereafter, the plurality of one-time-pads (i.e. encrypted secrets) generated for the plurality of other client devices are added to the local model update vector. Hence, a sum of the (unencrypted) secrets of the client devices is obtained. In accordance with an embodiment, the number of neighbors per client is significantly reduced by the neighbor selection algorithm described above, which reduces a cost associated with the aggregation steps on the client device.
In accordance with an embodiment, a set of all the one-time-pads generated by the plurality of client devices sums substantially to zero. The OTPs generated by each pair of neighbors cancel each other; therefore, the set of all the one-time-pads generated by the plurality of client devices sums to zero.
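The sketch below illustrates this cancellation on quantized updates, assuming a 32-bit word size, a NumPy generator seeded from the shared secret, and the convention that the neighbor with the lower identifier adds the pad while the other subtracts it; all of these choices are illustrative assumptions.

```python
# Sketch of the one-time-pad based secure sum: each pair of neighbors expands
# the shared secret into an identical pad; one neighbor adds it, the other
# subtracts it, so all pads cancel in the server-side sum.
import numpy as np

MOD = 2 ** 32  # assumed word size of the quantized update space

def pad(shared_secret: bytes, length: int) -> np.ndarray:
    seed = int.from_bytes(shared_secret[:8], "big")
    rng = np.random.default_rng(seed)  # deterministic, identical for both peers
    return rng.integers(0, MOD, size=length, dtype=np.uint64)

def masked_update(update: np.ndarray, my_id: int, neighbors: dict) -> np.ndarray:
    """neighbors maps a neighbor identifier to the shared secret with that peer."""
    out = update.astype(np.uint64) % MOD
    for nid, secret in neighbors.items():
        otp = pad(secret, len(update))
        out = (out + otp) % MOD if my_id < nid else (out - otp) % MOD
    return out

# Two clients with one shared secret: the pads cancel and only the sum remains.
secret = b"\x01" * 32
m0 = masked_update(np.array([5, 7]), 0, {1: secret})
m1 = masked_update(np.array([1, 2]), 1, {0: secret})
print((m0 + m1) % MOD)  # -> [6 9]
```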
Alternatively, in accordance with an embodiment, generating the model update output according to the secure sum protocol comprises splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output. The secure sum protocol is used to generate the model update output by use of an additive secret sharing scheme, such as Shamir Secret Sharing (SSS). In the additive secret sharing scheme, the local model update vector is quantized and viewed as a vector of fixed length integers. The client device splits each integer into the plurality of shares, one for each neighbor device. The plurality of shares are transmitted to the respective neighbor devices. Thereafter, the external shares are received from the neighbor devices and are added to form the model update output.
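As a sketch of the share-splitting step under this alternative, the snippet below performs plain additive splitting of a quantized update modulo an assumed 32-bit word size; a threshold scheme such as Shamir Secret Sharing can be substituted for the simple additive split shown here.

```python
# Sketch of splitting a quantized local update into one share per neighbor:
# all shares but the last are uniformly random, and the last is chosen so the
# shares sum to the original vector modulo the word size.
import numpy as np

MOD = 2 ** 32  # assumed word size of the quantized update space

def split_into_shares(update: np.ndarray, n_shares: int) -> list:
    rng = np.random.default_rng()
    shares = [rng.integers(0, MOD, size=update.shape, dtype=np.uint64)
              for _ in range(n_shares - 1)]
    shares.append((update.astype(np.uint64) - sum(shares)) % MOD)
    return shares

update = np.array([10, 20, 30])
shares = split_into_shares(update, n_shares=4)
print(sum(shares) % MOD)  # -> [10 20 30], the shares reconstruct the update
```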
In accordance with an embodiment, generating the model update output includes adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial. The locally generated noise signal has either the Gaussian distribution or the Binomial distribution. The Binomial distribution is generally preferred, particularly for small word sizes.
In accordance with an embodiment, the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server, wherein the noise parameter for each client device is such that a corresponding set of locally generated noise signals from the plurality of client devices sums to a global noise having a predetermined standard deviation. The locally generated noise signals from the plurality of client devices, when summed, yield the global noise with binomial distribution and the predetermined standard deviation (σ).
In accordance with an embodiment, the method 100 further comprises converting the local model update vector from a vector of floating point values to an integer vector. The local model update vector is converted from the vector of floating point values to the integer (or modular) vector through a quantization process. In the quantization process, an unbiased, space-efficient algorithm is used. For example, requiring bounds on individual model parameters allows an efficient mapping to an integer-space. These bounds may be provided directly by the central server or inferred from differential privacy (DP) specific parameters such as the model update clipping bound S. The quantization must take into account the risk of overflow when summing the weighted updates from the plurality of other client devices. In particular, the plurality of other client devices must be aware of the sum of weights to be used during this iteration.
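The following sketch shows one way the quantization and the matching dequantization of the aggregated sum could look, assuming a 32-bit word size, a clipping bound S, and unbiased stochastic rounding; the exact scaling used by the method may differ, so these choices are assumptions.

```python
# Sketch of unbiased quantization of a clipped float update into integers, with
# headroom so that the sum of n_clients quantized updates cannot overflow, and
# the matching dequantization of the aggregated sum on the server side.
import numpy as np

WORD = 2 ** 32  # assumed word size

def quantize(update: np.ndarray, S: float, n_clients: int) -> np.ndarray:
    scale = (WORD // n_clients - 1) / (2.0 * S)
    shifted = (np.clip(update, -S, S) + S) * scale
    low = np.floor(shifted)
    # Stochastic rounding keeps the quantization unbiased in expectation.
    rounded = low + (np.random.random(update.shape) < (shifted - low))
    return rounded.astype(np.uint64)

def dequantize_sum(total: np.ndarray, S: float, n_clients: int) -> np.ndarray:
    scale = (WORD // n_clients - 1) / (2.0 * S)
    return total.astype(np.float64) / scale - n_clients * S
```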
In accordance with an embodiment, the method 100 further comprises sending the public key to the central server and receiving the external public keys from the central server. The public key generated by the processor of the client device is communicated to the central server. The processor of the client device is configured to receive the external public keys from the central server for each of the other client devices.
In accordance with an embodiment, a computer-readable medium configured to store instructions which, when executed, cause the client device processor to perform the method 100. The processor of the client device is configured to execute the method 100. The steps 102, 104, 106, 108, 110, 112, and 114 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
FIG. 2 is a flowchart of a method for distributed machine learning for a network, in accordance with another example of the present disclosure. FIG. 2 is described in conjunction with elements from FIG. 1. With reference to FIG. 2, there is shown a method 200 for distributed machine learning for a network. The method 200 includes steps 202, 204, 206, and 208. In an implementation, the method 200 is executed by a central server, described in detail, for example, in FIGs. 3A and 3C.
At step 202, the method 200 comprises receiving, by the central server, a plurality of model update outputs transmitted by a plurality of client devices. The central server receives the plurality of model update outputs corresponding to the plurality of client devices. Each of the plurality of client devices generates one of the plurality of model update outputs by training its respective local model update vector.
At step 204, the method 200 further comprises determining, by the central server, an aggregated sum of model updates based on the plurality of model update outputs. The central server obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs.
At step 206, the method 200 further comprises updating, by the central server, a global model based on the aggregated sum of model updates. Based on the aggregated sum of model updates, the central server determines the global model update which is shared with each of the plurality of client devices.
At step 208, the method 200 further comprises transmitting, by the central server, the global model update to each of the client devices. The central server transmits the global model update to each of the client devices for a next iteration.
In accordance with an embodiment, updating the global model comprises converting an integer vector to a floating point vector. The global model is updated by converting from the integer (modular) vector to the floating point vector through a reverse quantization process.
The conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates. If a client device has dropped out during the process, then the central server takes the lost noise into consideration while executing the reverse quantization process.
In accordance with an embodiment, the method further comprises determining, by the central server, that a client device has dropped out. The method further comprises adding, by the central server, an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates. In a case where one or more of the plurality of client devices has dropped out during the execution of the random secure averaging (RdSA) protocol, the central server is configured to perform noise compensation. The central server is configured to add the additional noise to the aggregated sum of model updates based on the predetermined variance value for the local noise added to the aggregated sum of model updates, described in detail, for example, in FIG. 3A.
In accordance with an embodiment, the method further comprises performing a client dropout recovery protocol including: receiving, by the central server, a plurality of key shares from the plurality of client devices, representing a set of secret keys for each client device which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server by each client device. The method further comprises determining, by the central server, that a client device has dropped out. The method further comprises combining, by the central server, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client. The central server combines the plurality of received key shares corresponding to the dropout client to derive the value of the random noise that was added to the aggregated sum of model updates on behalf of the dropout client. This value of the random noise, added because of the dropout client, is removed from the aggregated sum of model updates. This is described in detail, for example, in FIG. 3A.
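As a sketch of the secret sharing and recombination step, the snippet below splits a secret into Shamir shares over a prime field and recovers it by Lagrange interpolation at zero; the field prime and the (t, n) threshold values are illustrative assumptions.

```python
# Sketch of (t, n) Shamir secret sharing used for client dropout recovery:
# split_secret produces the shares distributed among clients, and the server
# recombines at least t of them with Lagrange interpolation at x = 0.
import random

PRIME = 2 ** 127 - 1  # a Mersenne prime large enough to hold a 128-bit secret

def split_secret(secret: int, t: int, n: int) -> list:
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def recover_secret(shares: list) -> int:
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                num = num * (-x_j) % PRIME
                den = den * (x_i - x_j) % PRIME
        secret = (secret + y_i * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split_secret(123456789, t=3, n=5)
print(recover_secret(shares[:3]))  # -> 123456789
```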
In accordance with an embodiment, a computer-readable medium configured to store instructions which, when executed, cause the central server processor to perform the method 200. The processor of the central server is configured to execute the method 200. The steps 202, 204, 206, and 208 are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein.
FIG. 3A is a network environment diagram that depicts distributed machine learning with random secure averaging protocol, in accordance with an example of the present disclosure. FIG. 3A is described in conjunction with elements from FIGs. 1 and 2. With reference to FIG. 3A, there is shown a system 300A that includes a plurality of client devices 302, a central server 304 and a network 306. The plurality of client devices 302 includes a client device 302A and other client devices 302B-302N. The system 300A describes an exemplary sequence of operations 308A, 308B, 308C, 308D, 308E and 308F executed by the client device 302A.
Each of the plurality of client devices 302 includes suitable logic, circuitry, interfaces, and/or code that is configured to communicate with the central server 304, via the network 306. Each of the plurality of client devices 302 is further configured to train a local model update vector and compute a model update output. Examples of the plurality of client devices 302 may include, but are not limited to a user device, a laptop, a computing device, a communication apparatus including a portable or non-portable electronic device, or a supercomputer. The various exemplary components of the client device 302A are described in detail, for example, in FIG. 3B.
The central server 304 includes suitable logic, circuitry, interfaces, or code that is configured to communicate with the plurality of client devices 302 via the network 306. The central server 304 is further configured to determine a global model update which is further shared with each of the plurality of client devices 302 for the next iteration. Examples of the central server 304 include, but are not limited to a storage server, a cloud server, a web server, an application server, or a combination thereof. According to an embodiment, the central server 304 includes an arrangement of physical or virtual computational entities capable of enhancing information to perform various computational tasks. In an example, the central server 304 may be a single hardware server. In another example, the central server 304 may be a plurality of hardware servers operating in a parallel or in a distributed architecture. In an implementation, the central server 304 may include components such as a memory, a processor, a network interface and the like, to store, process or share information with the plurality of client devices 302. In another implementation, the central server 304 is implemented as a computer program that provides various services (such as a database service) to the plurality of client devices 302, or modules or apparatus. The various exemplary components of the central server 304 are described in detail, for example, in FIG. 3C.
The network 306 includes a medium (e.g. a communication channel) through which the plurality of client devices 302 communicate with the central server 304 or vice-versa. The network 306 may be a wired or wireless communication network. Examples of the network 306 may include, but are not limited to, a Wireless Fidelity (Wi-Fi) network, a Local Area Network (LAN), a wireless personal area network (WPAN), a Wireless Local Area Network (WLAN), a wireless wide area network (WWAN), a cloud network, a Long Term Evolution (LTE) network, a Metropolitan Area Network (MAN), or the Internet. The plurality of client devices 302 and the central server 304 are potentially configured to connect to the network 306, in accordance with various wired and wireless communication protocols. Examples of such wired and wireless communication protocols may include, but are not limited to, Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), ZigBee, EDGE, infrared (IR), IEEE 802.11, 802.16, Long Term Evolution (LTE), Light Fidelity (Li-Fi), or other cellular communication protocols or Bluetooth (BT) communication protocols, including variants thereof.
In operation, the client device 302A of the plurality of client devices 302 performs local training in a series of operations. At operation 308A, the client device 302A is configured to generate a local model update vector. The client device 302A updates the local model update vector by use of local data.
At operation 308B, the client device 302A is further configured to generate a public key and a secret key. The client device 302A generates the public key and the secret key (or a private key) in order to perform a key agreement with the other client devices 302B-302N. At operation 308C, the client device 302A is further configured to broadcast the public key to the other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into a global model update 312. The client device 302A is configured to broadcast the generated public key to the other client devices 302B-302N on the network 306. In a case where the client device 302A cannot directly communicate with the other client devices 302B-302N, the generated public key is broadcast to the other client devices 302B-302N through the central server 304. In another case, namely the active security model (or malicious server), a trusted third party is used to initially establish the key agreement between each of the clients 302. Thereafter, the client device 302A is further configured to receive the external public key for each of the other client devices 302B-302N to generate a key pair based on the received external public key and the generated secret key. The key agreement is performed by use of an Elliptic-Curve Diffie-Hellman (ECDH) key agreement scheme which allows two client devices, each having an elliptic-curve external public-secret key pair (also known as a shared key), to establish a shared secret over an insecure channel.
At operation 308D, the client device 302A is further configured to generate, for each other client device 302B-302N, a pseudorandom number based on the secret key and the external public key. The two client devices that have established the shared secret create identical pseudorandom numbers, which are used to indicate whether the two clients are neighbors of each other.
At operation 308E, the client device 302A is further configured to determine whether each of the other client devices 302B-302N is to be allocated to a set of neighbor devices of the client device 302A based on the pseudorandom number and a predetermined neighbor probability parameter. The set of neighbor devices of the client device 302A is

N_u = { v : rand^{u,v} < p · 2^d },

where p is the predetermined neighbor probability parameter for each other client device 302B-302N and d is the bit size of the generated pseudorandom number.
At operation 308F, the client device 302A is further configured to generate the model update output according to a secure sum protocol 310 based on the set of neighbor devices. The client device 302A generates the model update by applying the secure sum protocol 310 to the set of neighbor devices. The secure sum protocol 310 is applied either by an additive secret sharing scheme or a one-time-pad (OTP) based scheme.
In accordance with an embodiment, the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack. The number of neighbors in the set of neighbor devices is computed by use of a neighbor selection algorithm. The neighbor selection algorithm includes two phases, namely a setup phase and an online phase. In the setup phase, the computation of the number of neighbors in the set of neighbor devices depends on the risk parameter r_tot for N rounds of training and the neighbor probability parameter p, which have been described in detail, for example, in FIG. 1. In the online phase, the client device 302A obtains the shared secret for one of the other client devices 302B-302N. From the shared secret, the client device 302A is configured to derive k shared uniform bits b_i and thereafter add the respective one of the other client devices 302B-302N to the list of neighbors if the k-bit number formed by these bits satisfies Σ_i b_i · 2^i < p · 2^k.
In accordance with an embodiment, for the neighbor probability parameter p and predefined value r: f(n_h, p) > 1 − r and f(n_h, p − δ) < 1 − r for precision δ, where n_h is a number of vertices defined according to the security model in use (active security or passive security).
The client device 302A supports the active security model as well as the passive security model, depending on need. The active security model and the passive security model have been described in detail, for example, in FIG. 1. In accordance with an embodiment, the client device 302A is further configured to generate the model update output according to the secure sum protocol 310 by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device. After neighbor selection, the secure sum protocol 310 is applied by use of the OTP based scheme, which has been described in detail, for example, in FIG. 1. After generating the OTP for each of the neighbor devices, the client device 302A adds the plurality of one-time-pads to the local model update vector and generates the model update output.
In accordance with an embodiment, a set of all the one-time-pads generated by the plurality of client devices 302 sums substantially to zero. The OTPs generated for each pair of neighbors cancel each other; therefore, the set of all the one-time-pads generated by the plurality of client devices sums to zero.
Alternatively, in accordance with an embodiment, the client device 302A is further configured to generate the model update output according to the secure sum protocol 310 by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output. After neighbor selection, the secure sum protocol 310 is applied by an additive secret sharing scheme, such as Shamir Secret Sharing (SSS) scheme. In the additive secret sharing scheme, the local model update vector is quantized and viewed as a vector of fixed length modular integers. The client device 302A splits each integer into the plurality of shares, one for each neighbor device. The plurality of shares are transmitted to the respective neighbor devices 302B-302N. Thereafter, the external shares are received from the neighbor devices 302B-302N and are added to form the model update output.
In accordance with an embodiment, the client device 302A is further configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial. The locally generated noise signal has either the Gaussian distribution or the Binomial distribution. The Binomial distribution is generally preferred, particularly for small word sizes.
In accordance with an embodiment, the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server 304, wherein the noise parameter for each client device 302A-302N is such that a corresponding set of differentially private noise signals from the plurality of client devices 302 sums to a global noise having a predetermined standard deviation. The central server 304 is configured to determine the noise parameters, such as a noise multiplier z and a model update clipping bound S. The locally generated noise signal is derived from the noise multiplier in the following steps. In a first step, the standard deviation of the noise to be added to the sum is computed as σ = zS. In a second step, a noise splitting strategy is selected to split the locally generated noise signal between the plurality of clients 302. The noise splitting strategy is provided with the standard deviation σ and a set of weights {w_u}_u of the plurality of clients 302, and returns a set of local noise parameters {σ_u}_u such that Σ_u σ_u² = σ². Three different noise splitting strategies can be used; for example, one strategy splits the noise equally among the n clients by setting σ_u = σ/√n, while the other strategies split the noise according to the client weights {w_u}_u. For generating the Gaussian distributed locally generated noise signal, the local noise of the client device 302A is N(0, σ_u²), which has a normal distribution with mean 0 and variance σ_u². For generating the Binomial distributed locally generated noise signal, four steps are followed: (a) setting N = 2^wordsize; (b) setting a quantization parameter k based on the standard deviation σ and N; (c) setting the local parameters N_u such that Σ_u N_u = N; and (d) computing the local noise following the binomial mechanism, as a centered and scaled binomial sample with parameters N_u and 1/2. In this way, the corresponding set of locally generated noise signals from the plurality of client devices 302, when summed, yields the global noise with binomial distribution and the predetermined standard deviation (σ).
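For illustration, the sketch below implements the equal noise split σ_u = σ/√n together with Gaussian and binomial local noise generation; the centering and scaling of the binomial sample follow the general binomial mechanism and are assumptions where the original expressions are not fully legible.

```python
# Sketch of local differential-privacy noise generation under an equal split:
# the local standard deviations sigma_u satisfy sum(sigma_u**2) == sigma**2.
import numpy as np

def split_noise_equally(sigma: float, n_clients: int) -> list:
    return [sigma / np.sqrt(n_clients)] * n_clients

def gaussian_local_noise(sigma_u: float, dim: int) -> np.ndarray:
    # Local noise drawn from N(0, sigma_u**2).
    return np.random.normal(0.0, sigma_u, size=dim)

def binomial_local_noise(n_u: int, k: float, dim: int) -> np.ndarray:
    # Centered, scaled binomial noise: k * (Bin(n_u, 1/2) - n_u / 2) has mean 0
    # and variance k**2 * n_u / 4.
    draws = np.random.binomial(n_u, 0.5, size=dim)
    return k * (draws - n_u / 2.0)

# Example: 10 clients jointly reaching a global standard deviation sigma = z * S.
sigma = 0.5 * 1.0  # z = 0.5, S = 1.0 (illustrative values)
local_sigmas = split_noise_equally(sigma, 10)
print(np.sqrt(sum(s ** 2 for s in local_sigmas)))  # -> 0.5, i.e. sigma
```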
In accordance with an embodiment, the client device 302A is further configured to convert the local model update vector from a vector of floating point values to an integer vector. The local model update vector is converted from the vector of floating point values to the integer (or modular) vector through the quantization process. In the quantization process, an unbiased, space-efficient algorithm is used. For example, requiring bounds on individual model parameters allows an efficient mapping to an integer space. These bounds may be provided directly or inferred from a differential privacy parameter such as the model update clipping bound S. The quantization process must take into account the risk of overflow when summing the weighted updates from the plurality of other client devices 302B-302N. In particular, the plurality of other client devices 302B-302N must be aware of the sum of weights to be used during this iteration.
In accordance with an embodiment, the client device 302A is further configured to send the public key to the central server 304 and receive the external public keys from the central server 304. For example, in a case where the client device 302A cannot directly communicate with the plurality of other client devices 302B-302N, the client device 302A communicates the generated public key to the central server 304, which further communicates the generated public key to the plurality of other client devices 302B-302N. After communicating the generated public key to the central server 304, the client device 302A is configured to receive the external public keys from the central server 304.
Thus, each of the plurality of client devices 302 trains the local model update vector and generates the model update output. Each of the plurality of client devices 302 scales the local model update vector, quantizes the local model update vector and thereafter adds a calibrated noise having the binomial distribution to the local model update vector. The calibrated noise is based on the privacy requirements and the type of the security model (i.e. the active security model or the passive security model) used. After that, given the risk parameter and the selected security model, each of the plurality of client devices 302 derives the neighbor probability parameter p by use of the neighbor selection algorithm. The neighbor selection algorithm is based on the notion of connectivity of a random graph. After neighbor selection, each pair of client devices shares a secret. The shared secret is used to derive a shared randomness. With the neighbor probability parameter p (over the shared randomness), each pair of client devices chooses whether to be neighbors. Each pair of neighbors generates the identical pseudorandom number and adds it to their respective secret values (with opposite signs) to generate the model update output. After that, each of the plurality of client devices 302 sends the generated model update output to the central server 304. The central server 304 is configured to receive the plurality of model update outputs transmitted by the plurality of client devices 302, and transmit the global model update 312 to each of the client devices 302A-302N. The central server 304 determines the global model update 312 by use of the plurality of model update outputs transmitted by the plurality of client devices 302. The central server 304 is further configured to transmit the global model update 312 to each of the client devices 302A-302N for a next iteration.
The central server 304 is further configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update the global model to generate the global model update 312 based on the aggregated sum of model updates. The central server 304 obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs. Based on the aggregated sum of model updates, the central server 304 determines the global model update 312 which is shared with each of the plurality of client devices 302.
In accordance with an embodiment, updating the global model comprises converting an integer vector to a floating point vector. The global model is updated by converting from the integer (modular) vector to the floating point vector through a reverse quantization process. The conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates. If a client device has dropped out during the process, then the central server 304 takes the lost noise into consideration while executing the reverse quantization process.
In accordance with an embodiment, the central server 304 is further configured to determine that a client device has dropped out. The central server 304 is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates. In a case where one or more of the plurality of client devices 302 has dropped out during the execution of the random secure averaging (RdSA) protocol, the central server 304 is configured to perform noise compensation. In such a case of client dropout, the total noise in the aggregated sum of model updates is less than the required value. Therefore, the central server 304 is configured to add the additional noise to the aggregated sum of model updates based on the predetermined variance value for the local noise added to the aggregated sum of model updates. In an example, if the Gaussian distributed local noise is added to the aggregated sum of model updates, then the predetermined variance is Σ_{u∈A} σ_u², where A is the set of the clients whose model update outputs are included in the aggregated sum of model updates. In another example, if the binomial distributed local noise is added to the aggregated sum of model updates, then the predetermined variance is the corresponding sum of the binomial noise variances over the set A of the clients whose model update outputs are included in the aggregated sum of model updates. Depending on the distribution of the local noise, the additional noise is added to the aggregated sum of model updates. In an implementation, if the Gaussian distribution is used, then a Gaussian noise (i.e. the additional noise) N(0, σ_d²) with mean 0 and standard deviation σ_d = √(σ² − Σ_{u∈A} σ_u²) is added to the aggregated sum of model updates. In another implementation, if the Binomial distribution is used, then a binomial noise generated with the binomial mechanism, calibrated so that the total noise variance again reaches the predetermined value, is added to the aggregated sum of model updates.
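As a sketch of the Gaussian case of this compensation, the snippet below tops up the aggregated sum so that the total noise variance again equals σ²; the variable names and the clamping at zero are illustrative assumptions.

```python
# Sketch of server-side noise compensation after client dropouts (Gaussian
# case): the server adds N(0, sigma_d**2) with
# sigma_d = sqrt(sigma**2 - sum of the included clients' local variances).
import numpy as np

def compensate_dropout_noise(aggregate: np.ndarray,
                             sigma: float,
                             included_sigmas: list) -> np.ndarray:
    present_var = sum(s ** 2 for s in included_sigmas)
    missing_var = max(sigma ** 2 - present_var, 0.0)
    return aggregate + np.random.normal(0.0, np.sqrt(missing_var),
                                        size=aggregate.shape)
```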
In accordance with an embodiment, the central server 304 is further configured to perform a client dropout recovery protocol including: receiving a plurality of key shares from the plurality of client devices 302, representing a set of secret keys for each client device 302A-302N which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server 304 by each client device 302A-302N. The central server 304 is further configured to determine that a client device has dropped out. The central server 304 is further configured to combine a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client. In an implementation, after determining that a client device has dropped out, the central server 304 is configured to combine the plurality of received key shares corresponding to the dropout client to recover the secret key corresponding to the dropout client, and then to derive the value of the random noise that was added to the aggregated sum of model updates on behalf of the dropout client. This value of the random noise, added because of the dropout client, is removed from the aggregated sum of model updates. In another implementation, the central server 304 is configured to combine the plurality of received key shares corresponding to a non-dropout client to derive the value of the local random noise added to the aggregated sum of model updates. This value of the local random noise, added because of the non-dropout client, is removed from the aggregated sum of model updates. In this way, the central server 304 obtains the aggregated sum of the non-dropout clients' model updates.
Thus, the central server 304 determines the global model update 312 by use of the model update outputs received from the plurality of client devices 302. After that, the central server 304 shares the global model update 312 with the plurality of client devices 302 in order to prepare for the next iteration.
The system 300A is based on graph theory and the connectivity properties of random graphs. In the system 300A, the central server 304 learns a sum of at least n/3 private model updates, except with probability equal to the risk parameter α. The system 300A provides enhanced privacy protection of client devices' data, including personal data, improved performance and high utility. The system 300A employs a combination of the secure sum technique (or cryptographic technique) and the differential privacy technique. The secure sum technique (or cryptographic technique) prevents the central server 304 from accessing private information from any individual client. In the differential privacy technique, a noise is added locally (on the client's side) rather than centrally (on the central server's side), which provides a strong privacy protection.
FIG. 3B is a block diagram that illustrates various exemplary components of the client device, in accordance with an example of the present disclosure. FIG. 3B is described in conjunction with elements from FIGs. 1, 2, and 3A. With reference to FIG. 3B, there is shown a block diagram 300B of the client device 302A (of FIG. 3A) that includes a processor 314, a network interface 316, a memory 318 and an input/output (I/O) component 320. The memory 318 further includes a training module 318A.
The processor 314 includes suitable logic, circuitry, or interfaces that is configured to generate a public key and a secret key. In an implementation, the processor 314 is configured to execute instructions stored in the memory 318. In an example, the processor 314 may be a general-purpose processor. Other examples of the processor 314 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. Moreover, the processor 314 may refer to one or more individual processors, processing devices, or a processing unit that is part of a machine, such as the client device 302A.
The network interface 316 includes suitable logic, circuitry, or interfaces that is configured to broadcast the public key to a plurality of other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into the global model update 312. Examples of the network interface 316 may include, but are not limited to, an antenna, a radio frequency (RF) transceiver, one or more amplifiers, a digital signal processor, or a subscriber identity module (SIM) card.
The memory 318 includes suitable logic, circuitry, or interfaces that is configured to store the instructions executable by the processor 314. Examples of implementation of the memory 318 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), or CPU cache memory. The memory 318 may store an operating system or other program products (including one or more operation algorithms) to operate the client device 302A.
In an exemplary implementation, the training module 318A is configured to generate a local model update vector. The training module 318A corresponds to a machine learning model which is trained by use of the local model update vector. The training module 318A is locally trained on the client device 302A. During training, the client device 302A uses the secure sum protocol 310, which prevents the central server 304 from observing any private information from the training module 318A. Additionally, the client device 302A adds a local differential privacy (DP) noise to the training module 318A to further enhance the privacy protection. After training, the client device 302A shares the locally trained model with the central server 304. Similarly, the central server 304 receives the locally trained model with respect to each of the plurality of other client devices 302B-302N. After receiving the locally trained models, the central server 304 computes a global model update which is shared with each of the plurality of client devices 302. The global model update manifests improved performance in terms of more accurate prediction or decision, privacy protection of personal data and lower computation cost, even in the presence of a large number of client devices. In an implementation, the training module 318A (which may include one or more software modules) is potentially implemented as a separate circuitry in the client device 302A. Alternatively, in another implementation, the training module 318A is implemented as a part of another circuitry to execute various operations.
The input/output (I/O) component 320 refers to input and output components (or devices) that can receive input from a user (e.g. the client device 302A) and provide output to the user (i.e. the client device 302A). The I/O component 320 may be communicatively coupled to the processor 314. Examples of input components may include, but are not limited to, a touch screen, such as a touch screen of a display device, a microphone, a motion sensor, a light sensor, a dedicated hardware input unit (such as a push button), and a docking station. Examples of output components include a display device and a speaker.
In operation, the training module 318A is configured to generate a local model update vector. The processor 314 is configured to generate a public key and a secret key. The transceiver 316 (or the network interface) is configured to broadcast the public key to a plurality of other client devices 302B-302N on the network 306, receive an external public key for each of the other client devices 302B-302N, and transmit a model update output to the central server 304 for incorporation into a global model update 312. The processor 314 is further configured to generate, for each other client device 302B-302N, the pseudorandom number based on the secret key and the external public key. The processor 314 is further configured to determine whether each of the other client devices 302B-302N is to be allocated to the set of neighbor devices of the client device 302A based on the pseudorandom number and the predetermined neighbor probability parameter. The processor 314 is further configured to generate the model update output according to the secure sum protocol based on the set of neighbor devices. In an implementation, the processor 314 is further configured to generate the model update output according to the secure sum protocol by generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device. The set of all the one-time-pads generated by the plurality of client devices 302 sums substantially to zero. In another implementation, the processor 314 is further configured to generate the model update output according to the secure sum protocol by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output. In accordance with an embodiment, the processor 314 is further configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial. The processor 314 is further configured to convert the local model update vector from a vector of floating point values to an integer vector. The various embodiments, operations, and variants disclosed in the method 100 of FIG. 1 apply mutatis mutandis to the client device 302A and the processor 314.
FIG. 3C is a block diagram that illustrates various exemplary components of the central server, in accordance with an example of the present disclosure. FIG. 3C is described in conjunction with elements from FIGs. 1, 2, 3A, and 3B. With reference to FIG. 3C, there is shown a block diagram 300C of the central server 304 (of FIG. 3A) that includes a processor 322, a network interface 324, and a memory 326.
The processor 322 includes suitable logic, circuitry, or interfaces that is configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update based on the aggregated sum of model updates. In an implementation, the processor 322 is configured to execute instructions stored in the memory 326. In an example, the processor 322 may be a general-purpose processor. Other examples of the processor 322 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. Moreover, the processor 322 may refer to one or more individual processors, processing devices, or a processing unit that is part of a machine, such as the central server 304.
The network interface 324 includes suitable logic, circuitry, or interfaces that is configured to receive a plurality of model update outputs transmitted by the plurality of client devices 302, and transmit the global model update 312 to each of the client devices 302A-302N. Examples of the network interface 324 may include, but are not limited to, an antenna, a radio frequency (RF) transceiver, one or more amplifiers, a digital signal processor, or a subscriber identity module (SIM) card.
The memory 326 includes suitable logic, circuitry, or interfaces that is configured to store the instructions executable by the processor 322. Examples of implementation of the memory 326 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), or CPU cache memory. The memory 326 may store an operating system or other program products (including one or more operation algorithms) to operate the central server 304.
In operation, the transceiver 324 (or the network interface) is configured to receive a plurality of model update outputs transmitted by a plurality of client devices 302, and transmit a global model update 312 to each of the client devices 302A-302N. The processor 322 is configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update 312 based on the aggregated sum of model updates.
In accordance with an embodiment, the processor 322 is further configured to determine that a client device has dropped out and add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates. The central server 304 is further configured to perform a client dropout recovery protocol including: receiving, by the transceiver 324, a plurality of key shares from the plurality of client devices 302, representing a set of secret keys for each client device 302 which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server 304 by each client device 302. The central server 304 is further configured to perform a client dropout recovery protocol including: determining, by the processor 322, that a client device has dropped out and combining, by the processor 322, a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client. The various embodiments, operations, and variants disclosed in the method 200 of FIG. 2 apply mutatis mutandis to the central server 304 and the processor 322.

FIG. 4 is a network environment diagram that depicts distributed machine learning with random secure averaging, in accordance with another example of the present disclosure. FIG. 4 is described in conjunction with elements from FIGs. 1, 2, 3A, 3B, and 3C. With reference to FIG. 4, there is shown a system 400 that depicts an implementation of distributed machine learning with random secure averaging (RdSA) in more detail. The system 400 describes an exemplary sequence of operations 402, 402A, 402B, 402C, 402D, 402E, 402F, 402G, 402H, and 402I executed by the client device 302A. There is further shown an exemplary sequence of operations 404A, 404B, 404C, 404D, 404E, 404F, and 406 which are executed by the central server 304.
The system 400 includes the plurality of client devices 302 and the central server 304 of FIG. 3A. The client side distributed learning or the client side random secure averaging is explained with respect to the client device 302A and the plurality of other client devices 302B-302N. Similarly, the central server side distributed learning or the central server side random secure averaging is explained with respect to the central server 304.
In operation, the client device 302A of the plurality of client devices 302 performs a local training of a global model in a series of operations. The global model is shared by the central server 304 with each of the plurality of client devices 302. Moreover, the central server 304 chooses values for standard distributed learning parameters, such as the number of local epochs, the learning rate, or the total number of iterations N. The central server 304 shares the chosen parameters of standard distributed learning with the plurality of client devices 302.
At operation 402, the client device 302A starts the local training of a local model update vector by use of local data or raw data.
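By way of a non-limiting illustration, the following Python sketch outlines how a client could derive the local model update vector as the difference between locally trained weights and the current global weights; the helper `train_one_round` is hypothetical and stands in for any framework-specific training loop.

```python
import numpy as np

def local_model_update(global_weights: np.ndarray, train_one_round) -> np.ndarray:
    """Train locally on raw data and return the weight delta (the local model update vector)."""
    # `train_one_round` is a hypothetical callable running, e.g., a few local SGD epochs.
    local_weights = train_one_round(global_weights.copy())
    return local_weights - global_weights
```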
At operation 402A, the client device 302A applies the parameter choices shared by the central server 304. The chosen parameters are used in the local training of the local model update vector.
At operation 402B, the client device 302A is further configured to generate a public key and a secret key. The client device 302A generates the public key and the secret key (or private key) in order to perform a key agreement with the other client devices 302B-302N. In a case where the client device 302A cannot directly communicate with the plurality of other client devices 302B-302N, the client device 302A communicates the generated public key to the central server 304. The central server 304 shares the generated public key with each of the plurality of client devices 302. In return, each client device 302 receives an external public key from the central server 304.
At operation 402C, the client device 302A is further configured to generate a key pair based on the generated secret key and the received external public key. The generated key pair is used to perform a key agreement with the plurality of other client devices 302B-302N. The key agreement is performed by use of an Elliptic-Curve Diffie-Hellman (ECDH) key agreement scheme, which allows two client devices, each holding an elliptic-curve public-secret key pair, to establish a shared secret (also known as a shared key) over an insecure channel.
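By way of a non-limiting illustration, the following Python sketch shows operations 402B and 402C using the `cryptography` package with Curve25519 (X25519); the specific curve and library are assumptions made for illustration and are not mandated by the present disclosure.

```python
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)

# Operation 402B: the client generates its own key pair; the public key is
# uploaded to the central server for distribution to the other clients.
secret_key = X25519PrivateKey.generate()
public_key = secret_key.public_key()

# Operation 402C: for every external public key relayed by the central server,
# derive the pairwise shared secret used for neighbor selection and one-time-pads.
def derive_shared_secret(own_secret: X25519PrivateKey,
                         external_public: X25519PublicKey) -> bytes:
    return own_secret.exchange(external_public)  # 32-byte shared secret
```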
At operation 402D, the client device 302A is further configured to select its neighbors by use of a neighbor selection algorithm, which has been described in detail, for example, in FIGs. 1 and 3A. The two client devices that have established the shared secret create identical pseudorandom numbers, which are used to determine whether the two client devices are neighbors of each other.
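A minimal sketch of such a symmetric neighbor decision is given below; hashing the shared secret together with a round index into a value in [0, 1) and comparing it with the neighbor probability parameter p is one possible construction, assumed here for illustration only.

```python
import hashlib

def is_neighbor(shared_secret: bytes, round_index: int, p: float) -> bool:
    # Both clients of a pair hold the same shared secret, so both compute the
    # same pseudorandom value and reach the same neighbor decision.
    digest = hashlib.sha256(shared_secret + round_index.to_bytes(4, "big")).digest()
    r = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return r < p
```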
At operation 402E, secret sharing is performed in order to split the private secrets of the client device 302A. Two private secrets of the client device 302A are split at operation 402E: the secret key of the client device 302A, from which a shared secret is derived for generating a one-time-pad for each neighbor device, and a personal seed of the client device 302A, which is used to add a random noise in a further operation.
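The splitting can be illustrated with a Shamir-style sketch as follows; the prime field, threshold, and share encoding are assumptions chosen for illustration and are not fixed by the present disclosure.

```python
import secrets

PRIME = 2**127 - 1  # illustrative prime field for the shares

def split_secret(secret: int, n_shares: int, threshold: int) -> list:
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares
```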
At operation 402F, the client device 302A is further configured to add a locally generated differential privacy (DP) noise signal to the local model update vector, where the distribution of the locally generated DP noise signal is Gaussian or binomial. The noise parameters used for generating the DP noise signal are selected by the central server 304 and shared with each of the plurality of client devices 302.
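A minimal sketch of the Gaussian variant is shown below; the per-client standard deviation `sigma_local` is assumed to be the noise parameter distributed by the central server 304.

```python
import numpy as np

def add_dp_noise(update: np.ndarray, sigma_local: float,
                 rng: np.random.Generator) -> np.ndarray:
    # Add locally generated Gaussian DP noise to the local model update vector.
    return update + rng.normal(0.0, sigma_local, size=update.shape)
```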
At operation 402G, after adding the locally generated differential privacy (DP) noise signal to the local model update vector, the client device 302A is further configured to perform a quantization process. In the quantization process, the local model update vector is converted from a vector of floating point values to an integer (modular) vector.
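The conversion can be illustrated as follows; the clipping range, scaling factor, and modulus are illustrative assumptions rather than values specified by the present disclosure.

```python
import numpy as np

MODULUS = 2**32   # illustrative modulus of the integer (modular) vector
SCALE = 2**16     # illustrative fixed-point scaling factor
CLIP = 10.0       # illustrative clipping range for the floating point values

def quantize(update: np.ndarray) -> np.ndarray:
    clipped = np.clip(update, -CLIP, CLIP)
    return np.rint(clipped * SCALE).astype(np.int64) % MODULUS
```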
At operation 402H, the client device 302A is further configured to apply the secure sum protocol 310 by generating a one-time-pad for each neighbor device and adding the generated one-time-pads to the quantized model update vector. The one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device 302A and the external public key of the neighbor device. Moreover, the client device 302A is further configured to compute a local randomized noise based on its personal seed and to add the computed local randomized noise to the quantized model update vector.
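A sketch of this masking step is shown below; expanding each shared secret and the personal seed into a pad with a seeded NumPy generator, and signing the pairwise pads by client identifier so that they cancel in the aggregate, are assumptions made for illustration.

```python
import numpy as np

MODULUS = 2**32

def pad_from_seed(seed: bytes, length: int) -> np.ndarray:
    # Deterministically expand a seed into a pad vector over the modulus.
    rng = np.random.default_rng(int.from_bytes(seed[:8], "big"))
    return rng.integers(0, MODULUS, size=length, dtype=np.int64)

def mask_update(quantized: np.ndarray, own_id: int, personal_seed: bytes,
                neighbor_secrets: dict) -> np.ndarray:
    masked = quantized.copy()
    for neighbor_id, shared_secret in neighbor_secrets.items():
        pad = pad_from_seed(shared_secret, quantized.size)
        sign = 1 if own_id < neighbor_id else -1  # pads of a neighbor pair cancel in the sum
        masked = (masked + sign * pad) % MODULUS
    # Self-mask derived from the personal seed (recoverable from its secret shares).
    return (masked + pad_from_seed(personal_seed, quantized.size)) % MODULUS
```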
At operation 402I, the client device 302A is configured to communicate one share of each other client's secrets to the central server 304. In an example, if another client device drops out during the execution of the RdSA protocol, then the client device 302A is configured to communicate a share of the other client device's secret key to the central server 304 in order to derive the shared OTPs of the other client device. In another example, if the other client device drops out during the execution of the RdSA protocol, then the client device 302A is configured to communicate a share of the other client device's personal seed to the central server 304 to derive the local randomized noise of the plurality of other client devices 302B-302N.
The operations 402A to 402G are required for performing the client side random secure averaging. The operations 402A to 402I are executed in the same order for each of the plurality of client devices 302. By executing the operations 402A to 402G, each of the plurality of client devices 302 determines a respective model update output after training of its local model update vector. Each of the plurality of client devices 302 shares its model update output with the central server 304.
Similarly, the central server 304 is configured to perform a series of operations to determine a global model update 406. At operation 404A, the central server 304 is configured to broadcast the generated public key of the client device 302A to the plurality of other client devices 302B-302N. At operation 404B, the central server 304 is configured to distribute the plurality of shares of the client device 302A to the plurality of other client devices 302B-302N.
At operation 404C, the central server 304 is configured to receive the plurality of model update outputs transmitted by the plurality of client devices 302. The central server 304 is further configured to determine an aggregated sum of model updates based on the plurality of model update outputs. The central server 304 obtains a vector which represents the aggregated sum of model updates based on the plurality of model update outputs. Based on the aggregated sum of model updates, the central server 304 determines the global model update 406 which is shared with each of the plurality of client devices 302.
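A minimal sketch of this aggregation over the assumed modulus is given below; once all contributions are summed, the pairwise one-time-pads cancel and only the model updates and the local noise remain.

```python
import numpy as np

MODULUS = 2**32  # must match the modulus assumed on the client side

def aggregate(masked_updates: list) -> np.ndarray:
    total = np.zeros_like(masked_updates[0])
    for masked in masked_updates:
        total = (total + masked) % MODULUS
    return total
```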
At operation 404D, the central server 304 is configured to perform a dropout recovery. The two ways of dropout recovery have been described in detail, for example, in FIG. 3A.
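For the share-based recovery, the server can reconstruct a dropped-out client's secret by Lagrange interpolation over the received key shares, as in the sketch below; the prime field matches the illustrative splitting sketch above and is an assumption.

```python
PRIME = 2**127 - 1  # same illustrative prime field as the splitting sketch

def recover_secret(shares: list) -> int:
    # Lagrange interpolation at x = 0 over the shares (x_i, y_i).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```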
At operation 404E, the central server 304 is configured to perform a reverse quantization process for converting an integer vector to a floating point vector. The conversion from the integer (modular) vector to the floating point vector results in a sum of noisy model updates.
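A minimal sketch of the reverse quantization, using the same illustrative scale and modulus assumed for the client-side quantization, is given below.

```python
import numpy as np

MODULUS = 2**32
SCALE = 2**16

def dequantize(aggregate_mod: np.ndarray) -> np.ndarray:
    # Map modular integers back to signed integers, then to floating point values.
    signed = np.where(aggregate_mod >= MODULUS // 2, aggregate_mod - MODULUS, aggregate_mod)
    return signed.astype(np.float64) / SCALE
```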
At operation 404F, the central server 304 is configured to compensate for the differential privacy noise in case one or more client devices drop out during the execution of the RdSA protocol. The central server 304 is further configured to add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
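One way such compensation could be performed is sketched below; treating the missing noise variance as the number of dropped clients times the per-client variance is an assumption about the bookkeeping, made only for illustration.

```python
import numpy as np

def compensate_noise(aggregate: np.ndarray, n_dropped: int, sigma_local: float,
                     rng: np.random.Generator) -> np.ndarray:
    if n_dropped == 0:
        return aggregate
    # Top up the variance lost with the dropped clients: n_dropped * sigma_local**2.
    sigma_extra = sigma_local * np.sqrt(n_dropped)
    return aggregate + rng.normal(0.0, sigma_extra, size=aggregate.shape)
```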
After executing the operations 404A, 404B, 404C, 404D, 404E, and 404F, the central server 304 determines the global model update 406. The global model update 406 is communicated to each of the plurality of client devices 302 in order to prepare for the next iteration. The sequence of operations 404A, 404B, 404C, 404D, 404E, and 404F indicates the central server side random secure averaging (RdSA).
FIG. 5 illustrates an exemplary implementation scenario of distributed machine learning, in accordance with an example of the present disclosure. FIG. 5 is described in conjunction with elements from FIGs. 1, 2, 3A, 3B, 3C, and 4. With reference to FIG. 5, there is shown a system 500 that includes a plurality of client devices 502, a central server 504 and a network 506. The plurality of client devices 502 includes a client device 502A and other client devices 502B-502N. The client device 502A uses a video recommendation tool 508A (e.g. a machine learning model). Similarly, the other client devices 502B-502N use video recommendation tools 508B-508N.
The plurality of client devices 502, the central server 504, and the network 506 correspond to the plurality of client devices 302, the central server 304, and the network 306 of FIG. 3A, respectively.
The client device 502A uses the video recommendation tool 508A, which uses a predictive model. The video recommendation tool 508A is locally trained on the client device 502A as the system 500 uses distributed machine learning with the random secure averaging (RdSA) protocol. Similarly, the other client devices 502B-502N locally train their respective video recommendation tools 508B-508N. Each of the plurality of client devices 502 shares its respective locally trained video recommendation tool with the central server 504. Thereafter, the RdSA protocol is used to compute a private global model update at the central server 504 for N client devices (hundreds to thousands). The private global model update computed at the central server 504 is shared with each of the plurality of client devices 502, which now benefit from an increased accuracy of the video recommendation tool along with a privacy protection.
In another implementation scenario, the plurality of client devices 502 may correspond to a plurality of computing devices used at hospitals or laboratories. For example, the client device 502A corresponds to a computing device used at a hospital, the client device 502B corresponds to another computing device used at a laboratory or an organization, and so on. Each of the plurality of client devices 502 uses a model trained from medical images, along with annotations manually provided by medical practitioners, which is used to detect illness from the medical imagery. In many jurisdictions, such a model of medical images cannot be shared between different hospitals or laboratories. Therefore, local training of such a model is performed at the respective computing devices used at the hospital, the laboratory, or the organization. After local training, each of the plurality of computing devices used at the hospital, the laboratory, or the organization shares its locally trained model with the central server 504. In this exemplary scenario, the central server 504 may compute a global model update by use of the RdSA protocol and share the computed global model update with each of the plurality of computing devices used at the hospital, the laboratory, or the organization, which now detect the illness with accuracy while maintaining improved privacy and protection of personal data (which may provide a formal guarantee of strong data privacy protection) as well as avoiding the scaling issue.
Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as "including", "comprising", "incorporating", "have", "is" used to describe and claim the present disclosure are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural. The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments. The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". It is appreciated that certain features of the present disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the present disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable combination or as suitable in any other described embodiment of the disclosure.

Claims

1. A method (100) for distributed machine learning for a network (306), comprising: receiving, by a processor (314) of a client device (302A), a local model update vector; generating, by the processor (314), a public key and a secret key, where the public key is for broadcasting to a plurality of other client devices (302B-302N) on the network (306); receiving, by the processor (314), an external public key for each of the other client devices (302B-302N); for each other client device (302B-302N): generating, based on the secret key and the external public key, a pseudorandom number; determining whether each of the other client devices (302B-302N) is to be allocated to a set of neighbor devices of the client device (302A) based on the pseudorandom number and a predetermined neighbor probability parameter; generating, by the processor (314), a model update output according to a secure sum protocol (310) based on the set of neighbor devices; and outputting, by the processor (314), the model update output for transmission to a central server (304) for incorporation into a global model update (312).
2. The method (100) of claim 1, wherein the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
3. The method (100) of claim 2, wherein for the neighbor probability parameter p and predefined value r: f(nh, p) > 1 - r and f(nh, p - d) < 1 - r for precision d, where the function f(nh, p) is defined by the equation reproduced in the original filing as Figure imgf000044_0001, and where the corresponding values for active security and passive security are given in Figure imgf000044_0002.
4. The method (100) of any preceding claim, wherein generating the model update output according to the secure sum protocol (310) comprises generating a one-time-pad for each neighbor device and adding the plurality of one-time-pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
5. The method (100) of claim 4, wherein a set of all the one-time-pads generated by the plurality of client devices (302) sums substantially to zero.
6. The method (100) of any one of claims 1 to 3, wherein generating the model update output according to the secure sum protocol (310) comprises splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the plurality of shares to the respective neighbor devices, receiving a plurality of external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
7. The method (100) of any preceding claim, wherein generating the model update output includes adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
8. The method (100) of claim 7, wherein the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server (304), wherein the noise parameter for each client device (302A-302N) is such that a corresponding set of locally generated noise signals from the plurality of client devices (302) sums to a global noise having a predetermined standard deviation.
9. The method (100) of any preceding claim, further comprising converting the local model update vector from a vector of floating point values to an integer vector.
10. The method (100) of any preceding claim, further comprising sending the public key to the central server (304) and receiving the external public keys from the central server (304).
11. A computer-readable medium configured to store instructions which, when executed, cause a client device processor (314) to perform the method (100) of any preceding claims 1 to 10.
12. A client device (302A) comprising: a training module (318 A) configured to generate a local model update vector; a processor (314) configured to generate a public key and a secret key; a transceiver (316) configured to broadcast the public key to a plurality of other client devices (302B-302N) on the network (306), receive an external public key for each of the other client devices (302B-302N), and transmit a model update output to a central server (304) for incorporation into a global model update (312); wherein the processor (314) is further configured to: generate, for each other client device (302B-302N), a pseudorandom number based on the secret key and the external public key; determine whether each of the other client devices (302B-302N) is to be allocated to a set of neighbor devices of the client device (302A) based on the pseudorandom number and a predetermined neighbor probability parameter; and generate the model update output according to a secure sum protocol (310) based on the set of neighbor devices.
13. The client device (302A) of claim 12, wherein the neighbor probability parameter is configured to define a number of neighbors in the set of neighbor devices based on a predefined value, where the predefined value is defined based on a modelled risk of a successful attack.
14. The client device (302A) of claim 13, wherein for the neighbor probability parameter p and predefined value r: f(nh, p) > 1 - r and f(nh, p - d) < 1 - r for precision d, where the function f(nh, p) is defined by the equation reproduced in the original filing as Figure imgf000046_0001, and where the corresponding values for active security and passive security are given in Figure imgf000046_0002.
15. The client device (302A) of any one of claims 12 to 14, wherein the processor (314) is configured to generate the model update output according to the secure sum protocol (310) by generating a one-time-pad for each neighbor device and adding the plurality of one-time- pads to the local model update vector, wherein the one-time-pad for each neighbor device is generated based on a shared secret derived from the secret key of the client device and the external public key of the neighbor device.
16. The client device (302A) of claim 15, wherein a set of all the one-time-pads generated by the plurality of client devices (302) sums substantially to zero.
17. The client device (302A) of any one of claims 12 to 14, wherein the processor (314) is configured to generate the model update output according to the secure sum protocol (310) by splitting the local model vector update into a plurality of shares according to the number of neighbor devices, transmitting the shares to the respective neighbor devices, receiving external shares from the neighbor devices and summing the plurality of external shares to form the model update output.
18. The client device (302A) of any one of claims 12 to 17, wherein the processor (314) is configured to generate the model update output by adding a locally generated noise signal to the local model update vector and wherein a distribution of the locally generated noise signal is gaussian or binomial.
19. The client device (302A) of claim 18, wherein the locally generated noise signal is generated with a standard deviation which is defined by a noise parameter received from the central server (304), wherein the noise parameter for each client device (302A-302N) is such that a corresponding set of differentially private noise signals from the plurality of client devices (302) sums to a global noise having a predetermined standard deviation.
20. The client device (302A) of any one of claims 12 to 19, wherein the processor (314) is further configured to convert the local model update vector from a vector of floating point values to an integer vector.
21. The client device (302A) of any one of claims 12 to 20, wherein the transceiver (316) is further configured to send the public key to the central server (304) and receive the external public keys from the central server (304).
22. A method (200) for distributed machine learning for a network (306), comprising: receiving, by a central server (304), a plurality of model update outputs transmitted by a plurality of client devices (302); determining, by the central server (304), an aggregated sum of model updates based on the plurality of model update outputs; updating, by the central server (304), a global model based on the aggregated sum of model updates; and transmitting, by the central server (304), the global model update (312) to each of the client devices (302A-302N).
23. The method (200) of claim 22, wherein updating the global model comprises converting an integer vector to a floating point vector.
24. The method (200) of claim 22 or claim 23, further comprising: determining, by the central server (304), that a client device has dropped out; and adding, by the central server (304), an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
25. The method (200) of any one of claims 22 to 24, further comprising performing a client dropout recovery protocol including: receiving, by the central server (304), a plurality of key shares from the plurality of client devices (302), representing a set of secret keys for each client device (302) which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server (304) by each client device; determining, by the central server (304), that a client device has dropped out; and combining, by the central server (304), a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
26. A computer-readable medium configured to store instructions which, when executed, cause a central server processor (322) to perform the method (200) of any one of claims 22 to 25.
27. A central server (304) comprising: a transceiver (324) configured to receive a plurality of model update outputs transmitted by a plurality of client devices (302), and transmit a global model update (312) to each of the client devices (302A-302N); and a processor (322) configured to determine an aggregated sum of model updates based on the plurality of model update outputs, and update a global model to generate the global model update (312) based on the aggregated sum of model updates.
28. The central server (304) of claim 27, wherein updating the global model comprises converting an integer vector to a floating point vector.
29. The central server (304) of claim 27 or claim 28, wherein the processor (322) is further configured to: determine that a client device has dropped out; and add an additional noise to the aggregated sum of model updates based on a predetermined variance value for the local noise added to the aggregated sum of model updates.
30. The central server (304) of any one of claims 27 to 28, further configured to perform a client dropout recovery protocol including: receiving, by the transceiver (324), a plurality of key shares from the plurality of client devices (302), representing a set of secret keys for each client device (302) which are split into a plurality of key shares according to a secret sharing protocol, distributed among the client devices and sent to the central server (304) by each client device; determining, by the processor (322), that a client device has dropped out; and combining, by the processor (322), a plurality of received key shares corresponding to a dropout client to recover the secret key corresponding to the dropout client.
PCT/EP2020/083154 2020-11-24 2020-11-24 Distributed training with random secure averaging WO2022111789A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2020/083154 WO2022111789A1 (en) 2020-11-24 2020-11-24 Distributed training with random secure averaging
CN202080107117.3A CN116438554A (en) 2020-11-24 2020-11-24 Distributed training with random security averaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/083154 WO2022111789A1 (en) 2020-11-24 2020-11-24 Distributed training with random secure averaging

Publications (1)

Publication Number Publication Date
WO2022111789A1 true WO2022111789A1 (en) 2022-06-02

Family

ID=73554447

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/083154 WO2022111789A1 (en) 2020-11-24 2020-11-24 Distributed training with random secure averaging

Country Status (2)

Country Link
CN (1) CN116438554A (en)
WO (1) WO2022111789A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018590A1 (en) * 2016-07-18 2018-01-18 NantOmics, Inc. Distributed Machine Learning Systems, Apparatus, and Methods
WO2019231481A1 (en) * 2018-05-29 2019-12-05 Visa International Service Association Privacy-preserving machine learning in the three-server model
CN111460478A (en) * 2020-03-30 2020-07-28 西安电子科技大学 Privacy protection method for collaborative deep learning model training

Also Published As

Publication number Publication date
CN116438554A (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN110870250B (en) Key agreement device and method
Chen et al. A secure authenticated and key exchange scheme for fog computing
Seyhan et al. Bi-GISIS KE: Modified key exchange protocol with reusable keys for IoT security
US20200162235A1 (en) Terminal device performing homomorphic encryption, server device processing ciphertext and methods thereof
US11882218B2 (en) Matching system, method, apparatus, and program
CN114586313B (en) System and method for signing information
US9736128B2 (en) System and method for a practical, secure and verifiable cloud computing for mobile systems
US9077539B2 (en) Server-aided multi-party protocols
US20160191233A1 (en) Managed secure computations on encrypted data
US9319877B2 (en) Secret key generation
WO2020072882A1 (en) Leveraging multiple devices to enhance security of biometric authentication
US10367640B2 (en) Shared secret data production system
EP3987711B1 (en) Authenticated lattice-based key agreement or key encapsulation
US9686075B2 (en) Key sharing network device and configuration thereof
US20170063541A1 (en) Multivariate cryptography based on clipped hopfield neural network
CN111581648B (en) Method of federal learning to preserve privacy in irregular users
CN108292347A (en) A kind of user property matching process and terminal
Kumari et al. Post‐quantum cryptography techniques for secure communication in resource‐constrained Internet of Things devices: A comprehensive survey
Dong et al. Fog computing: Comprehensive approach for security data theft attack using elliptic curve cryptography and decoy technology
Guskind et al. Mediated semi-quantum key distribution with improved efficiency
WO2018213875A1 (en) Asymmetric cryptography and authentication
Cheng et al. Manto: A practical and secure inference service of convolutional neural networks for iot
Li et al. GPU accelerated full homomorphic encryption cryptosystem, library and applications for iot systems
WO2022111789A1 (en) Distributed training with random secure averaging
CN116170142A (en) Distributed collaborative decryption method, device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20812020

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20812020

Country of ref document: EP

Kind code of ref document: A1