WO2022142366A1 - Method and apparatus for updating a machine learning model - Google Patents

Method and apparatus for updating a machine learning model (机器学习模型更新的方法和装置)

Info

Publication number
WO2022142366A1
Authority
WO
WIPO (PCT)
Prior art keywords
gradient
encrypted
model
intermediate result
noise
Prior art date
Application number
PCT/CN2021/112644
Other languages
English (en)
French (fr)
Inventor
邵云峰
李秉帅
吴骏
田海博
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP21913114.1A (EP4270266A1)
Publication of WO2022142366A1
Priority to US18/344,188 (US20230342669A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/60 Software deployment
    • G06F 8/65 Updates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L 63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L 63/0442 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L 9/088 Usage controlling of secret information, e.g. techniques for restricting cryptographic keys to pre-authorized uses, different access levels, validity of crypto-period, different key- or password length, or different strong and weak cryptographic algorithms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L 9/0891 Revocation or update of secret information, e.g. encryption key update or rekeying
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/30 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy

Definitions

  • The present application relates to the technical field of machine learning, and in particular, to a method and apparatus for updating a machine learning model.
  • Federated learning is a distributed machine learning technique.
  • Each federated learning client, such as federated learning devices 1, 2, 3, ..., k, uses local computing resources and local network business data for model training, and sends the model update information Δω generated during the local training process, such as Δω_1, Δω_2, Δω_3, ..., Δω_k, to the federated learning server (FLS).
  • The federated learning server uses an aggregation algorithm to aggregate the models based on the model update information, obtaining an aggregated machine learning model.
  • The aggregated machine learning model is used as the initial model for the next round of model training performed by the federated learning devices.
  • The federated learning devices and the federated learning server repeat the above training process until the obtained machine learning model converges and satisfies preset conditions.
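  • For illustration, a minimal sketch of this aggregation step, assuming a FedAvg-style weighted average (the passage does not fix the aggregation algorithm; all names and values below are hypothetical):

```python
import numpy as np

# Hypothetical model update information from k = 3 clients, with the number of
# local training samples each client used.
deltas = [np.array([0.10, -0.20]),
          np.array([0.30, 0.00]),
          np.array([-0.10, 0.40])]
counts = np.array([100, 50, 150])

# FedAvg-style aggregation: weight each client's update by its share of data.
weights = counts / counts.sum()
aggregated = sum(w * d for w, d in zip(weights, deltas))

# The server applies the aggregated update to the global model; the result is
# the initial model for the next round of local training.
```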
  • In federated learning, data with different characteristics may be located in different entities and need to be aggregated to train the model, so as to enhance the learning ability of the model.
  • The approach of training a model after aggregating data with different characteristics held by different entities is called vertical federated learning.
  • In the existing vertical federated learning shown in FIG. 1, client A and client B receive the public key of the key pair generated by the server; the server updates the models according to the gradients of the models sent by client A and client B, and sends the updated models to client A and client B respectively.
  • Existing vertical federated learning therefore relies on the server. Since the public key and private key are generated by the server, whether the server can be trusted becomes an important issue: if the server is an untrusted entity, it poses a greater threat to data security. How to improve the security of vertical federated learning has thus become a problem to be solved.
  • the present application provides a method, device and system for updating a machine learning model to improve the security of vertical federated learning.
  • the embodiments of the present application provide a method for updating a machine learning model.
  • the method includes generating a first intermediate result by a first device based on a first subset of data.
  • the first device receives the encrypted second intermediate result sent by the second device, where the second intermediate result is generated according to the second data subset corresponding to the second device.
  • the first device obtains a first gradient of a first model, where the first gradient of the first model is generated according to the first intermediate result and the encrypted second intermediate result.
  • The first device updates the first model according to the first gradient decrypted using the second private key, where the second private key is a decryption key for homomorphic encryption generated by the second device.
  • In this way, the second intermediate result sent to the first device is homomorphically encrypted by the second device, which owns the second data subset, using a key (e.g., a public key) generated by the second device, and the gradient is decrypted using the private key generated by the second device. Therefore, in the vertical federated learning scenario, when the first device uses the data of the second device to update the model, the data security of the second device can be protected; for example, user data such as age, job, and sex in Table 2 will not be obtained, thereby protecting user privacy.
  • the first gradient may be determined by the first apparatus, or may be determined by other apparatuses according to the first intermediate result and the encrypted second intermediate result.
  • the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second device.
  • the first device generates a first public key and a first private key for homomorphic encryption.
  • the first device encrypts the first intermediate result using the first public key.
  • the first device and the second device respectively generate and encrypt or decrypt the data of the respective data subsets, thereby ensuring the data security of the respective data subsets.
  • The first device sends the encrypted first intermediate result to the second device, so that the second device can use the data of the first device for model training while the security of the data of the first device is still ensured.
  • The first gradient of the first model is determined according to the first intermediate result and the encrypted second intermediate result; specifically, it may be determined according to the encrypted first intermediate result and the encrypted second intermediate result. The first device decrypts the first gradient of the first model using the first private key. Through this method, the first device ensures the security of the training data in the case where all data used for training is encrypted.
  • The first device generates a first noise for the first gradient of the first model; the first device sends the first gradient including the first noise to the second device; the first device receives the first gradient decrypted using the second private key, the decrypted gradient still including the first noise. With this method, noise is added to the first gradient, so even when the first gradient is sent to the second device for decryption, the data security of the first data subset of the first device can still be guaranteed.
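  • To make this flow concrete, the following is a minimal two-party sketch using the additively homomorphic Paillier scheme via the Python `phe` library. Paillier stands in here for whatever homomorphic scheme an implementation would choose, the model is a plain linear model with squared loss, and all variable names and values are illustrative assumptions rather than identifiers from the filing:

```python
import numpy as np
from phe import paillier  # additively homomorphic Paillier scheme

# Second device (B): generates the second public/private key pair.
pk_b, sk_b = paillier.generate_paillier_keypair(n_length=2048)

# B computes its intermediate result U_B = D_B @ W_B - Y and encrypts it.
D_B = np.array([[0.2, 1.1], [0.5, 0.3], [0.9, 0.7]])
W_B = np.array([0.1, -0.4])
Y = np.array([1.0, 0.0, 1.0])
u_b = D_B @ W_B - Y
enc_u_b = [pk_b.encrypt(float(v)) for v in u_b]      # sent to device A

# First device (A): computes its own intermediate result U_A = D_A @ W_A.
D_A = np.array([[1.0, 0.5, 0.2], [0.3, 0.8, 0.1], [0.6, 0.4, 0.9]])
W_A = np.array([0.2, 0.1, -0.3])
u_a = D_A @ W_A

# A computes the encrypted first gradient without ever seeing U_B in plaintext:
# Paillier supports ciphertext + plaintext and ciphertext * plaintext scalar.
P, N = D_A.shape
enc_grad = [sum((enc_u_b[p] + float(u_a[p])) * float(D_A[p, n])
                for p in range(P)) * (1.0 / P)
            for n in range(N)]

# A masks the encrypted gradient with random first noise R_A before sending
# it to B for decryption.
noise_a = np.random.uniform(-1.0, 1.0, size=N)
enc_grad_masked = [g + float(r) for g, r in zip(enc_grad, noise_a)]

# B decrypts with its second private key; it learns only gradient + noise.
masked_plain = np.array([sk_b.decrypt(g) for g in enc_grad_masked])

# A removes its own noise and updates the first model.
grad_a = masked_plain - noise_a
W_A = W_A - 0.1 * grad_a   # 0.1 is an assumed learning rate
```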
  • the first device receives the second parameter of the second model sent by the second device.
  • the first device determines a second gradient of the second model according to the encrypted first intermediate result and the encrypted second intermediate result, and a second parameter set of the second model.
  • the first device sends the second gradient of the second model to the second device.
  • the second gradient of the second model is determined by the first device based on the second data subset of the second device and the first data subset of the first device, due to the use of the encrypted intermediate results of the second data subset, Data security of the second data subset is guaranteed.
  • the first device determines a second noise of the second gradient.
  • the second gradient sent to the second device includes the second noise.
  • The first device receives the updated second parameter including the second noise, where the updated second parameter is a parameter obtained by updating the second model with the second gradient; the first device then removes the second noise contained in the updated second parameter.
  • In this way, the first device de-noises the second parameter, so that the security of the first data subset can still be guaranteed when the first device participates in updating the second model.
  • the first device receives at least two of the second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second devices.
  • The first device generates a common public key for homomorphic encryption according to the received at least two second public keys and the first public key; the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
  • Decrypting the first gradient of the first model using the second private key includes: the first device sends the first gradient of the first model to the at least two second devices in sequence, and receives the first gradient decrypted by the at least two second devices using their corresponding second private keys respectively.
  • In this way, the security of the first gradient of the model is ensured.
  • the first device decrypts the first gradient of the first model using the first private key.
  • the embodiments of the present application provide a method for updating a machine learning model.
  • The method includes: the first device sends the encrypted first data subset and the encrypted first parameter of the first model, where the encrypted first data subset and the encrypted first parameter are used to determine the encrypted first intermediate result.
  • The first device receives the encrypted first gradient of the first model, where the first gradient of the first model is determined according to the encrypted first intermediate result, the encrypted first parameter, and the encrypted second intermediate result.
  • the first device decrypts the encrypted first gradient using a first private key, and the decrypted first gradient of the first model is used to update the first model.
  • In this way, the first device offloads the calculation of the first gradient used to update the first model to another device, and since the first device sends the first data subset only in encrypted form, the data security of the first data subset can be ensured.
  • The first device receives the encrypted second gradient of the second model, where the encrypted second gradient is determined according to the encrypted first intermediate result and the encrypted second intermediate result; the second intermediate result is determined according to the second data subset of the second device and the parameters of the second model of the second device, and the encrypted second intermediate result is obtained by the second device homomorphically encrypting the second intermediate result.
  • the first device decrypts the second gradient using the first private key.
  • the first device sends a second gradient decrypted by the first private key to the second device, and the decrypted second gradient is used to update the second model.
  • the first device decrypts the gradient of the model of the second device, so as to ensure the data security of the first data subset of the first device.
  • The first gradient received by the first device includes the first noise, the decrypted first gradient includes the first noise, and the updated parameters of the first model include the first noise.
  • Including noise in the gradient can further ensure data security.
  • the first device updates the first model according to the decrypted first gradient. Or the first device sends the decrypted first gradient.
  • the first device receives at least two of the second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second devices.
  • The first device generates a common public key for homomorphic encryption according to the received at least two second public keys and the first public key; the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
  • an embodiment of the present application provides a method for updating a machine learning model.
  • the method includes receiving an encrypted first intermediate result and an encrypted second intermediate result.
  • the first gradient of the first model is determined according to the encrypted first intermediate result, the encrypted second intermediate result, and the parameters of the first model.
  • the first gradient is decrypted.
  • the first model is updated according to the decrypted first gradient.
  • The encrypted first intermediate result is obtained by homomorphically encrypting the first intermediate result using the first public key; the encrypted second intermediate result is obtained by homomorphically encrypting the second intermediate result using the first public key.
  • Decrypting the first gradient includes: decrypting the first gradient using the first private key.
  • the first gradient is sent to the first device.
  • the first public key is obtained from the first device, and the first public key is sent to the second device.
  • the present application provides an apparatus.
  • the apparatus is used to perform any one of the methods provided in the first aspect to the third aspect.
  • the present application may divide the functional modules of the machine learning model management apparatus according to any of the methods provided in the first aspect.
  • each function module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the present application may divide the machine learning model management apparatus into a receiving module, a processing module, and a sending module according to functions.
  • For descriptions of the possible technical solutions and beneficial effects performed by the above-mentioned divided functional modules (for example, the receiving module), reference may be made to the technical solutions provided by the first aspect or its corresponding possible designs, the technical solutions provided by the second aspect or its corresponding possible designs, or the technical solutions provided by the third aspect or its corresponding possible designs; details are not repeated here.
  • the machine learning model management device includes: a memory and a processor, and the memory and the processor are coupled.
  • the memory is used to store computer instructions
  • The processor is used to invoke the computer instructions to perform the method provided by the first aspect or its corresponding possible designs, the method provided by the second aspect or its corresponding possible designs, or the method provided by the third aspect or its corresponding possible designs.
  • The present application provides a computer-readable storage medium, such as a non-transitory computer-readable storage medium.
  • A computer program (or instructions) is stored thereon, and when the computer program (or instructions) is run on a computer device, the computer device is caused to perform the method provided by the first aspect or its corresponding possible designs, the method provided by the second aspect or its corresponding possible designs, or the method provided by the third aspect or its corresponding possible designs.
  • The present application provides a computer program product that, when run on a computer device, causes the method provided by the first aspect or its corresponding possible designs, the method provided by the second aspect or its corresponding possible designs, or the method provided by the third aspect or its corresponding possible designs to be performed.
  • The present application provides a chip system, comprising a processor, where the processor is configured to invoke and run, from a memory, a computer program stored in the memory, and to execute the method provided by the first aspect or its corresponding possible designs, the method provided by the second aspect or its corresponding possible designs, or the method provided by the third aspect or its corresponding possible designs.
  • the sending action in the third aspect may be specifically replaced by sending under the control of the processor; the receiving action in the above-mentioned second aspect or the first aspect may be specifically replaced by receiving under the control of the processor.
  • Any of the systems, apparatuses, computer storage media, computer program products, or chip systems provided above can be applied to the corresponding methods provided in the first aspect, the second aspect, or the third aspect. Therefore, for the beneficial effects that can be achieved, reference may be made to the beneficial effects of the corresponding method; details are not repeated here.
  • FIG. 1 is a schematic structural diagram of an existing vertical federated learning system;
  • FIG. 2A is a schematic structural diagram of a vertical federated learning system provided by an embodiment of the present application.
  • FIG. 2B is a schematic structural diagram of a vertical federated learning system provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a method suitable for vertical federated learning provided by an embodiment of the present application
  • FIG. 4 is a flowchart of a method suitable for vertical federated learning provided by an embodiment of the present application
  • FIGS. 5A-5B are flowcharts of another method suitable for vertical federated learning provided by an embodiment of the present application;
  • FIGS. 6A-6B are flowcharts of another method suitable for vertical federated learning provided by the embodiments of the present application;
  • FIG. 7 is a flowchart of another method suitable for vertical federated learning provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an apparatus for updating a machine learning model provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present application.
  • Machine learning is the use of algorithms to parse data, learn from it, and then make decisions and predictions about real-world events. Machine learning uses a large amount of data to "train" a model, learning from the data through various algorithms how to complete a certain model business.
  • a machine learning model is a file containing algorithm implementation code and parameters used to complete a model business.
  • the algorithm implementation code is used to describe the model structure of the machine learning model
  • the parameters are used to describe the attributes of each component of the machine learning model.
  • the file is hereinafter referred to as a machine learning model file.
  • sending a machine learning model hereinafter specifically refers to sending a machine learning model file.
  • the machine learning model is a logical functional module that completes the business of a certain model. For example, the values of input parameters are input into a machine learning model, and the values of output parameters of the machine learning model are obtained.
  • Machine learning models include artificial intelligence (AI) models such as neural network models.
  • Vertical federated learning (also known as heterogeneous federated learning) refers to federated learning under the setting in which each party owns a different feature space. Vertical federated learning can train on data of the same users that has different characteristics and is located on different devices. In this way, vertical federated learning can aggregate data with different characteristics or attributes located in different entities to enhance the capability of the model learned by federated learning. Characteristics of data may also be called attributes of data.
  • the model gradient is the variation of the model parameters during the training process of the machine learning model.
  • Homomorphic encryption is a form of encryption that allows certain forms of algebraic operations to be performed on ciphertext, producing a result that is still encrypted. Decrypting, with the decryption key of the homomorphic key pair, the result of an operation performed on homomorphically encrypted data yields the same result as performing the same operation on the plaintext.
  • The public key of the homomorphic key pair is the key used for encryption when performing homomorphic encryption.
  • The private key of the homomorphic key pair is the key used for decryption.
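  • A minimal sketch of this homomorphic property, using the Paillier scheme from the Python `phe` library as a stand-in (the embodiments do not prescribe a particular scheme):

```python
from phe import paillier

pk, sk = paillier.generate_paillier_keypair()

a, b = 3.5, 2.25
enc_a, enc_b = pk.encrypt(a), pk.encrypt(b)

# Operating on ciphertexts yields another ciphertext...
enc_sum = enc_a + enc_b      # ciphertext + ciphertext
enc_scaled = enc_a * 4       # ciphertext * plaintext scalar

# ...whose decryption equals the same operation applied to the plaintexts.
assert sk.decrypt(enc_sum) == a + b      # 5.75
assert sk.decrypt(enc_scaled) == a * 4   # 14.0
```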
  • words such as “exemplary” or “for example” are used to represent examples, illustrations or illustrations. Any embodiments or designs described in the embodiments of the present application as “exemplary” or “such as” should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as “exemplary” or “such as” is intended to present the related concepts in a specific manner.
  • In the embodiments of the present application, the size of the sequence number of each process does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and/or other information.
  • References throughout the specification to "one embodiment," "an embodiment," and "one possible implementation" mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application.
  • Appearances of "in one embodiment," "in an embodiment," or "one possible implementation" in various places throughout this specification are not necessarily referring to the same embodiment.
  • the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
  • The connection mentioned in the embodiments of the present application may be a direct connection, an indirect connection, a wired connection, or a wireless connection; that is, the embodiments of the present application do not limit the connection method.
  • FIG. 2A is a schematic structural diagram of a system applied to an application scenario of vertical federated learning according to an embodiment of the present application.
  • the system 200 shown in FIG. 2A may include a network data analysis functional entity 201 , a base station 202 , a core network network element 203 , and an application functional entity 204 .
  • Each network entity in FIG. 2A may be device A or device B in this embodiment of the present application.
  • Network Data Analytics Function (NWDAF) entity 201 can obtain data from various network entities, such as base station 202, core network element 203 and/or application function entity 204, and perform data analysis. Data analysis refers to the training of a model based on the obtained data as the input for model training. In addition, the network data analysis functional entity 201 can also perform inference based on the model to determine the data analysis result. Then, provide data analysis results to other network entities, third-party service servers, terminal devices or network management systems. This application mainly involves the data collection function and model training function of the NWDAF entity 201 .
  • The application function (AF) entity 204 is used to provide services or perform routing of application-related data, for example providing data to the NWDAF entity 201 for model training. Further, the application function entity 204 can also use the private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
  • the base station 202 provides access services for the terminal, and further completes the forwarding of control signals and user data between the terminal and the core network.
  • the base station 202 may also send data to the NWDAF entity 201 for the NWDAF entity 201 to perform model training.
  • the base station 202 can also use private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
  • the core network element 203 provides the terminal with related services of the core network.
  • The core network element 203 may be a user plane function entity applied to the 5G architecture, such as a UPF entity, a session management function (SMF) entity, or a policy control function entity such as a PCF entity. It should be understood that the core network element can also be applied to other future network architectures, such as a 6G architecture.
  • any core network element may also send data to the NWDAF entity 201 for the NWDAF entity 201 to perform model training.
  • the core network element 203 can also use the private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
  • The architecture of the embodiment of the present application may further include other network elements, which is not limited in the embodiment of the present application.
  • Any network entity can send data that does not involve privacy to the NWDAF entity 201; the NWDAF entity 201 composes data subsets from the data sent by one or more devices, and performs vertical federated learning with other network entities using the private data those entities have not sent to the NWDAF.
  • the NWDAF entity 201 can perform vertical federated learning together with one type of network entities, or can perform vertical federated learning together with multiple types of network entities. For example, the NWDAF entity 201 may jointly perform vertical federated learning with one or more base stations 202 according to network data sent by multiple base stations 202 .
  • the NWDAF entity 201 may also perform vertical federated learning together with the base station 202 and the AF entity 204 according to the data sent by the base station 202 and the AF entity 204 .
  • Table 1 is an example of a dataset for vertical federated learning using the network architecture of Figure 2A:
  • The 3rd row, the 7th row, and the 12th row respectively represent the private data that is not sent to the NWDAF entity 201 and is stored in the base station 202, the core network element 203, or the AF entity 204; this private data can be used as the data subsets in FIG. 3 to FIG. 7.
  • The data column in Table 1 indicates the features of the data, which correspond to the parameters of the vertical federated learning model. For example, the content of rows 1-2, rows 4-6, and rows 8-10 corresponds to the parameters of the model trained or used by the NWDAF entity 201.
  • the source column in Table 1 indicates the source of the data for each feature, respectively.
  • the data corresponding to lines 1-2 is sent by the AF entity 204 to the NWDAF entity 201, and the data corresponding to the lines 4-6 is sent by the UPF entity to the NWDAF entity 201.
  • The data in the first row (i.e., service experience) is used as the label data for model training; that is, the user's service experience serves as the label data.
  • the data in rows 1-12 are the data of the same user in multiple entities.
  • the NWDAF entity 201 acts as the device B in FIG. 4 to FIG. 7 , and the corresponding data subset includes tags.
  • FIG. 2B is a schematic structural diagram of a system applied to an application scenario of vertical federated learning according to an embodiment of the present application.
  • the system shown in FIG. 2B may include a business system server A 251 and a business system server B 252.
  • the business system servers A and B may be servers applied to different business systems, such as a server of a banking business system and a server of a call business system.
  • The service system server A 251 in FIG. 2B may also be the base station in FIG. 2A, the core network element 203, the application function entity 204, or the network data analytics function entity 201.
  • The service system server B 252 in FIG. 2B may also be the base station in FIG. 2A, the core network element 203, the application function entity 204, or the network data analytics function entity 201.
  • Table 2 is a schematic diagram of data features, taking as an example the service system server A being the server of the calling service system and the service system server B being the server of the banking service system.
  • the data with row number 1 (that is, status) is used as the label data for model training.
  • The data corresponding to row numbers 1-9 is the data obtained by the banking system server, which can be used as the data subset B corresponding to device B; the data corresponding to row numbers 10-14 is the data obtained by the operator's business system, which can be used as the data subset A corresponding to device A.
  • the data of row numbers 1-14 are the data of the same user in different systems.
  • Device A has a subset of data A (D A ) and device B has a subset of data B (D B ).
  • Data subset A and data subset B respectively contain P pieces of data (for example, data of P users).
  • Data subset A contains N features, denoted f_1 to f_N, where f_N is the N-th feature; data subset B contains M features, denoted f_{N+1} to f_{N+M}, where f_{N+M} is the (N+M)-th feature.
  • Data subset A (D_A) with feature set F_A and data subset B (D_B) with feature set F_B are merged into data set D for vertical federated learning.
  • d_p represents the p-th piece of data (d_p is any piece of data in D, and p is any positive integer less than or equal to P). d_p has N+M features, expressed as d_p = (x_p^1, ..., x_p^N, x_p^{N+1}, ..., x_p^{N+M}), where x_p^N is the N-th feature value of the p-th piece of data and x_p^{N+M} is the (N+M)-th feature value of the p-th piece of data.
  • Each piece of data can be divided into two parts according to feature set F_A and feature set F_B: d_p = (d_p^A, d_p^B), where d_p^A = (x_p^1, ..., x_p^N) is the feature values corresponding to the feature set F_A of the p-th piece of data, and d_p^B = (x_p^{N+1}, ..., x_p^{N+M}) is the feature values corresponding to the feature set F_B.
  • Data set D can be divided into two data subsets according to feature set F_A and feature set F_B, namely data subset D_A and data subset D_B, that is, D = (D_A, D_B).
  • The data subset D_A is the P pieces of user data with the feature set F_A owned by device A; the data subset D_B is the P pieces of user data with the feature set F_B owned by device B.
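  • As a small numerical illustration of this feature-wise partition (hypothetical values; numpy used only for the bookkeeping):

```python
import numpy as np

# Hypothetical joint dataset D: P = 4 users, N = 3 features held by device A,
# M = 2 features held by device B.
D = np.arange(20, dtype=float).reshape(4, 5)

N = 3
D_A = D[:, :N]   # features f_1 .. f_N, stored only on device A
D_B = D[:, N:]   # features f_{N+1} .. f_{N+M}, stored only on device B

# Rows (users) stay aligned across the two devices; neither device ever
# holds a complete row of D.
assert np.array_equal(np.hstack([D_A, D_B]), D)
```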
  • Device A initializes the parameters of model A (W_A) as W_A = (w_1, ..., w_N); device B initializes the parameters of model B (W_B) as W_B = (w_{N+1}, ..., w_{N+M}).
  • Described from the model dimension, device A and device B each correspond to a model with its own parameters.
  • the parameters of the model correspond one-to-one with the features of the subset of data. For example, if the data subset D A of device A has N features, the model of device A has N parameters.
  • the model in this embodiment of the present application refers to a model that can be solved iteratively by using gradient information.
  • the gradient information is the updated value of the model.
  • The model in this embodiment of the present application may be, for example, a linear model or a neural network model.
  • Taking a linear model as an example, the model can be expressed as y = w_1*x_1 + w_2*x_2 + ... + w_n*x_n, where y is the output parameter of the model (also called the model output), w_1 to w_n are the n parameters of the model, and x_1 to x_n are the 1st to n-th features of a piece of data.
  • In vertical federated learning, different features (feature values) of the same user are respectively located on two or more devices (two are assumed in the embodiments of this application).
  • The parameters of the model are accordingly divided into two parts: the parameters of model W_A and the parameters of model W_B. In this embodiment of the present application, it is assumed that each parameter of the model corresponds to one feature in the data subset.
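  • Written out for the vertically partitioned linear model (a reconstruction; the grouping of terms is implied by the text above rather than quoted from it):

```latex
y \;=\; \underbrace{w_1 x_1 + \cdots + w_N x_N}_{\text{device A's part: } d_p^{A} W_A}
\;+\; \underbrace{w_{N+1} x_{N+1} + \cdots + w_{N+M} x_{N+M}}_{\text{device B's part: } d_p^{B} W_B}
```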
  • FIG. 3 shows a method for updating a machine learning model in a vertical federated learning scenario provided by an embodiment of the present application, which is applicable to the two application scenarios of FIG. 2A and FIG. 2B. The method includes the following steps:
  • Step 302 The first device generates a first intermediate result according to the first data subset.
  • the first intermediate result is generated from the model of the first device (ie, the first model) and the first subset of data.
  • The first intermediate result is used, together with intermediate results generated by other devices participating in the vertical federated learning (e.g., a second intermediate result generated by the second device according to the second model and the second data subset), to generate the gradient of the first model.
  • the gradient of the first model may be referred to as the first gradient.
  • the first device may be the device A in FIG. 4 to FIG. 7 , or the device B in FIG. 4 to FIG. 7 , which is not limited in the embodiment of the present application.
  • Step 304 The first device receives the encrypted second intermediate result sent by the second device.
  • the second intermediate result is generated by the second apparatus based on the second model and the second subset of data.
  • The second device homomorphically encrypts the second intermediate result using the second device's public key, or using a common public key generated from the second device's public key and the public keys of the other devices.
  • the second device may be one device, or may be multiple devices, which is not limited in the embodiment of the present application.
  • Step 306 The first device acquires the first gradient of the first model.
  • the first device may generate the first gradient according to the first intermediate result and the second intermediate result.
  • the first device may also acquire the first gradient generated according to the first intermediate result and the second intermediate result from another device such as the second device.
  • the second intermediate result used to generate the first gradient is the encrypted intermediate result.
  • both the second intermediate result and the first intermediate result used to generate the first gradient are encrypted intermediate results.
  • the gradient is the update vector of the model parameters.
  • Step 308 The first device updates the first model according to the first gradient.
  • Since the second intermediate result used to generate the first gradient is an encrypted intermediate result, the first device cannot obtain the second intermediate result and cannot deduce the original data of the second data subset from which the second intermediate result was generated. Therefore, data security in the vertical federated learning scenario can be guaranteed.
  • Steps 400-401: Device A generates a homomorphically encrypted public key A (pk_A) and a private key A (sk_A), and sends the public key A to device B.
  • Step 402 The device A groups the data subset A (D A ) to obtain the grouped data subset A (DD A ).
  • D_A is the data subset A owned by device A, which can be regarded as an original two-dimensional matrix in which each row of data corresponds to a user and each column corresponds to a feature; specifically, row i, column j holds the j-th feature of the i-th piece of data.
  • For example, the data subset A may be the private data of the base station, the core network element, or the AF entity that is not sent to the NWDAF entity.
  • Alternatively, the data subset A may be the data of the operator's service system, with arrivals, CALL NUMS, communication flows, etc. as the features A of the data subset A.
  • DD_A represents the result of grouping (packing) the data subset A. After grouping, all data values in the two-dimensional matrix of data subset A are divided into multiple blocks, and each block holds the values of the same feature for multiple pieces of data (that is, multiple rows of D_A, such as L pieces of data); in other words, a block is a column vector with L rows and 1 column.
  • For example, block DD_A^{1,1} is the first feature of the 1st to L-th pieces of data of device A, and block DD_A^{q,n} is the n-th feature of the ((q-1)*L+1)-th to (q*L)-th pieces of data of device A.
  • The last block may contain fewer than L values; since each block must be filled to L values, the missing entries are padded with 0.
  • L is the polynomial order; its value can be set as required, which is not limited in this embodiment of the present application.
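  • A minimal packing sketch (numpy; the exact block layout is an assumption consistent with the description above):

```python
import numpy as np

def pack(data: np.ndarray, L: int) -> np.ndarray:
    """Group a (P, N) data matrix into Q blocks per feature, where each block
    is a length-L slice of one feature's column, zero-padding the last block
    when P is not a multiple of L."""
    P, N = data.shape
    Q = -(-P // L)                        # ceil(P / L)
    padded = np.zeros((Q * L, N))
    padded[:P] = data
    # blocks[q, n] holds the n-th feature of pieces (q-1)*L+1 .. q*L (1-based).
    return padded.reshape(Q, L, N).transpose(0, 2, 1)

D_A = np.arange(14.0).reshape(7, 2)    # hypothetical: P = 7 pieces, N = 2 features
DD_A = pack(D_A, L=4)                  # Q = 2 blocks; the last is zero-padded
assert DD_A.shape == (2, 2, 4)         # (Q, N, L)
```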
  • In steps 400'-401', the device B generates a homomorphically encrypted public key B (pk_B) and a private key B (sk_B), and sends the public key pk_B to the device A.
  • the public key is used for encryption and the private key is used for decryption.
  • Step 402' the device B groups the data subset B to obtain the grouped data subset B (DD B ).
  • Data set D includes data subset A (D A ) and data subset B (D B ).
  • Data subset A and data subset B correspond to the same user, and data subset A and data subset B have different characteristics.
  • For example, block DD_B^{q,m} is the m-th feature of the ((q-1)*L+1)-th to (q*L)-th pieces of data of device B, corresponding to the (N+m)-th feature of data set D.
  • the data subset B is the data of the NWDAF entity.
  • service experience, Buffer size, etc. are used as features corresponding to data subset B.
  • Alternatively, the data subset B may be the data of the banking business system.
  • status, age, job, etc. are used as the feature B of the data subset B.
  • Grouping refers to dividing all data according to the feature dimension, and dividing each feature into Q groups according to the polynomial order L. Grouping enables the data of one group (packet) to be encrypted at the same time (multiple inputs, multiple outputs) during subsequent encryption, which speeds up encryption.
  • In step 403, the device A uses the model A (W_A) and the data subset A to determine (or generate) an intermediate result A (U_A) of the data subset A.
  • U_A = D_A * W_A, indicating that each piece of data of the data subset A owned by the device A is multiplied by the parameters W_A of the model A; for example, u_1^A = d_1^A * W_A is the result obtained by multiplying the first piece of data in the data subset D_A by the parameters of the model A.
  • Step 404: The device A groups the intermediate result A to obtain the grouped intermediate result A (DU_A), where DU_A = (DU_A^1, ..., DU_A^Q) indicates that the intermediate result A is divided into Q groups, and the Q-th group may contain zero-padded data.
  • L is the polynomial order; its value can be configured as required, which is not limited in this embodiment of the present application.
  • Step 405: Device A encrypts the grouped intermediate result DU_A using the public key A (pk_A) to obtain the encrypted intermediate result [[DU_A]], and sends the encrypted intermediate result A to device B.
  • The encrypted intermediate result A includes the encrypted intermediate result of each group; for example, the encrypted first group of intermediate results corresponds to the 1st to L-th pieces of data of device A after encryption.
  • U_A is an intermediate result in the process of training model A using data subset A. If it were transmitted in plaintext to device B, device B might deduce the original data D_A, that is, the data subset A; therefore, the intermediate result A needs to be encrypted before transmission. Since device B receives encrypted data, it can use the plaintext data of data subset B for calculation, or it can calculate the gradient B of model B after encrypting the data subset B with the public key A.
  • Steps 403'-405' the device B uses the model B (W B ) and the data subset B (D B ) to determine (or generate) an intermediate result B (U B ) of the data subset B, and then group the intermediate results B , to obtain the intermediate result B (DU B ) after grouping.
  • Device B encrypts the grouped intermediate result DU_B using the public key pk_B to obtain the encrypted intermediate result [[DU_B]], and sends the encrypted intermediate result to device A.
  • U_B = D_B * W_B - Y_B, indicating the result of multiplying each piece of data of the data subset B owned by device B by the parameters of model B and subtracting the label Y_B.
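  • A plaintext sketch of the two intermediate results and of why their sum is the quantity both gradients need (numpy; hypothetical values; in the actual protocol U_B only ever leaves device B under homomorphic encryption). For a linear model with squared loss, U_A + U_B = D_A*W_A + D_B*W_B - Y is exactly the per-sample residual:

```python
import numpy as np

P, N, M = 4, 3, 2
rng = np.random.default_rng(0)
D_A, W_A = rng.normal(size=(P, N)), rng.normal(size=N)   # device A's share
D_B, W_B = rng.normal(size=(P, M)), rng.normal(size=M)   # device B's share
Y = rng.integers(0, 2, size=P).astype(float)             # labels, held by B

U_A = D_A @ W_A          # intermediate result A
U_B = D_B @ W_B - Y      # intermediate result B (labels already folded in)

residual = U_A + U_B     # = D @ W - Y for the joint model
grad_A = D_A.T @ residual / P    # gradient of model A's parameters
grad_B = D_B.T @ residual / P    # gradient of model B's parameters
```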
  • Step 406 Device A combines the encrypted intermediate result B with the intermediate result A to obtain the combined first intermediate result
  • the device A may also combine the encrypted intermediate result B with the encrypted intermediate result A.
  • the device A uses the public key B to encrypt the intermediate result A.
  • The combined first intermediate result contains the combined intermediate result of each group.
  • The combined intermediate result of each group includes the encrypted intermediate result B of that group and the unencrypted intermediate result A of the corresponding group; for example, the combined first intermediate result of group q includes the encrypted q-th group of intermediate results B together with the unencrypted q-th group of intermediate results A.
  • the combined first intermediate result may further include an encrypted intermediate result B and an encrypted intermediate result A.
  • the intermediate result A and the intermediate result B both use the public key B to perform homomorphic encryption.
  • Step 407: the device A determines (or generates) the encrypted gradient A of the model A, that is, [[G_A]]. The gradient A includes the updated values of the parameters of the model A.
  • Strictly speaking, device A does not encrypt the gradient A itself; the gradient is "encrypted" because the combined first intermediate result from which it is determined includes encrypted data, such as the encrypted data subset A and/or the encrypted data subset B, and homomorphic operations on ciphertext yield ciphertext.
  • The gradient A of the model A includes the gradient A corresponding to each parameter of the model A; for example, [[G_A^n]] is the gradient corresponding to the n-th parameter of model A.
  • The gradient corresponding to each parameter is determined (or generated) according to the encrypted intermediate result A and the encrypted intermediate result B (or the unencrypted intermediate result A and the encrypted intermediate result B) and the feature values of each group for the corresponding feature. For example, [[G_A^n]] is obtained as follows: add the intermediate result B of the q-th group to the intermediate result A of the q-th group, multiply the sum by the n-th feature values of the q-th group, and average; the gradient of the n-th feature of model A is then obtained by summing these contributions over the 1st to Q-th groups. Here, the n-th feature values of the q-th group are those of the ((q-1)*L+1)-th to (q*L)-th pieces of data.
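  • In symbols (a reconstruction from the description above; the notation, in particular the double brackets for ciphertext, is assumed rather than quoted):

```latex
\llbracket G_A^{\,n} \rrbracket \;=\;
\sum_{q=1}^{Q} \frac{1}{P} \sum_{l=1}^{L}
\Bigl( U_A^{\,q,l} + \llbracket U_B^{\,q,l} \rrbracket \Bigr)\, x^{\,q,n,l},
\qquad n = 1,\dots,N
```
  • where x^{q,n,l} denotes the n-th feature value of the ((q-1)*L+l)-th piece of data.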
  • Step 408: The device A determines (or generates) the noise A (R_A) of the gradient A. The noise A set of the gradient A includes the noise A of each parameter of the model A (corresponding to each feature of the data subset A), and can be expressed as R_A = (r_A^1, ..., r_A^N).
  • The noise is a random number generated per feature: one random number can be generated for each feature, or device A can generate a random number jointly covering the features. For example, r_A^2 is the random number corresponding to the second feature (that is, the noise A of the second feature), and r_A^n is the random number corresponding to the n-th feature.
  • The random number corresponding to any feature may contain the noise of that feature for each piece of user data in any one group, or the noise of that feature for each piece of data in multiple groups; for example, r_A^{n,2} is the noise of the n-th feature corresponding to the 2nd piece of user data in the group, and r_A^{n,L} is the noise of the n-th feature corresponding to the L-th piece of user data in the group.
  • Step 409: Device A obtains the encrypted gradient containing noise A by adding, for each parameter, the noise A of that parameter to the encrypted gradient of the corresponding parameter, and then sends the encrypted gradient containing noise A, [[G_A]] + R_A, to device B.
  • The encrypted gradient A set containing noise A contains the noisy encrypted gradient A of each parameter; for example, the noise A of the first parameter is added to the encrypted gradient A of the first parameter.
  • The noise may be encrypted noise or unencrypted noise.
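  • The masking round trip of steps 408 to 410', reconstructed in symbols (notation assumed):

```latex
\begin{aligned}
\text{A sends to B:}\quad & \llbracket G_A \rrbracket + R_A \\
\text{B decrypts and returns:}\quad & \mathrm{Dec}_{sk_B}\!\bigl(\llbracket G_A \rrbracket + R_A\bigr) \;=\; G_A + R_A \\
\text{A removes its noise:}\quad & (G_A + R_A) - R_A \;=\; G_A .
\end{aligned}
```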
  • In step 406', device B, according to the grouped intermediate result B (DU_B) and the encrypted intermediate result [[DU_A]], obtains the combined second intermediate result.
  • the combined second intermediate result is an intermediate result used to generate the gradient B of the model B.
  • the combined second intermediate result contains the unencrypted intermediate result B and the encrypted intermediate result A, or the combined second intermediate result contains the encrypted intermediate result A and the encrypted intermediate result B.
  • The encrypted intermediate results included in the combined second intermediate result (the intermediate result A, and optionally the intermediate result B) are encrypted using the public key A generated by the device A.
  • The combined first intermediate result is the intermediate result used to generate the gradient A of model A.
  • The combined first intermediate result includes the unencrypted intermediate result A and the encrypted intermediate result B, or the combined first intermediate result includes the encrypted intermediate result A and the encrypted intermediate result B.
  • The encrypted intermediate results included in the combined first intermediate result (the intermediate result B, and optionally the intermediate result A) are encrypted using the public key B generated by the device B.
  • The combined second intermediate result contains the merged intermediate result of each group.
  • The merged intermediate result of each group includes the encrypted intermediate result A of the corresponding group and the unencrypted intermediate result B of the corresponding group.
  • The merged second intermediate result of the q-th group can be expressed as [[DU_A^q]] + DU_B^q, where [[DU_A^q]] is the encrypted intermediate result A of the q-th group and DU_B^q is the unencrypted intermediate result B of the q-th group.
  • In step 407', the device B determines (or generates) the gradient of the model B, [[G_B]].
  • The gradient B includes the updated values of the parameters of the model B.
  • The gradient B of the model B includes the gradient B corresponding to each parameter of the model B (that is, the gradient B corresponding to each feature of the model B); for example, [[G_B^m]] is the gradient B corresponding to the m-th parameter of model B.
  • In step 408', the device B generates the noise B (R_B) of the gradient B; the noise B of the gradient B includes the noise B of the gradient corresponding to each parameter of the model B, and can be expressed as R_B = (r_B^1, ..., r_B^M).
  • Step 410: Device A uses the private key A (sk_A) to decrypt the encrypted gradient containing noise B sent by device B, obtaining the decrypted grouped gradient B containing noise B (DG_B R).
  • Steps 411-412: Device A restores, from the decrypted grouped gradient B containing noise B (DG_B R), the gradient B containing noise B in the pre-grouping layout (G_B R), and sends the pre-grouping gradient B containing noise B (G_B R) to device B.
  • The pre-grouping gradient B containing noise B includes, for each parameter, the pre-grouping noisy gradient B of that parameter; for example, g_B^1 + r_B^1 is the pre-grouping gradient B containing noise B of the first parameter of model B.
  • The first parameter of model B corresponds to the (N+1)-th feature in the data set.
  • In step 410', the device B uses the private key B (sk_B) to decrypt the gradient containing the noise A sent by the device A, obtaining the decrypted gradient A containing noise A (DG_A R).
  • That is, the device B uses the private key B (sk_B) to decrypt the gradient of each parameter contained in the gradient A.
  • The decrypted gradient A containing noise A contains, for each parameter of model A, the gradient containing noise A; for example, g_A^1 + r_A^1 represents the noisy gradient of the first parameter of model A.
  • Steps 411'-412': Device B obtains the gradient A set containing noise A in the pre-grouping layout (G_A R) according to the decrypted grouped gradient A set, and sends the noise-containing gradient A set (G_A R) corresponding to each feature before grouping to device A.
  • The pre-grouping gradient A set containing noise A (G_A R) includes the noisy gradient A corresponding to each feature before grouping.
  • Step 413: Device A obtains the decrypted gradient A set G_A with noise A removed, according to the pre-grouping decrypted gradient A set (G_A R) containing noise A corresponding to each feature.
  • The gradient A set G_A contains the gradient of each feature (parameter) of model A; for example, g_A^2 is the gradient of the second feature.
  • Step 414 the device A updates the model A (W A ) according to the gradient A from which the noise A is removed.
  • Step 413': the device B obtains the gradient B (G_B) with noise B removed according to the pre-grouping gradient B (G_B R) containing noise B corresponding to each parameter.
  • Step 414 ′ the device B updates the model B(W B ) according to the gradient B(G B ).
  • Steps 407 to 414' are repeatedly executed until the changes in the model parameters are less than preset values.
  • In the above flow, device A and device B exchange the encrypted intermediate result B and intermediate result A, use the encrypted intermediate results to generate the gradients, and send the gradients to each other in encrypted form. Therefore, encrypted transmission is adopted throughout the data exchange between device A and device B, which ensures the security of data transmission.
  • FIGS. 5A-5B are flowcharts of another method for model updating provided by an embodiment of the present invention, including the following steps:
  • Steps 500-501: Device A generates a homomorphically encrypted public key A (pk_A) and a private key A (sk_A), and sends the public key A to device B.
  • Step 502: Device A groups the data subset A (D_A) to obtain the grouped data subset A (DD_A). For the specific method of this step, reference may be made to the description of step 402; details are not repeated in this embodiment of the present application.
  • Step 503: Device A uses the public key A to encrypt the grouped data subset A to obtain the encrypted data subset [[DD_A]].
  • The encrypted data subset A includes the encrypted data corresponding to each feature of each group.
  • Step 504: Device A forms, for each parameter A of the model A (W_A), a parameter group corresponding to that parameter.
  • The parameters A of the model A are also called the features of the model A, and the parameters of the model A are in one-to-one correspondence with the features of the data subset A.
  • The parameters A of model A are expressed as W_A = (w_1, ..., w_N), where w_1 is the first parameter (or the first feature) of model A; model A has a total of N parameters.
  • Forming a parameter group corresponding to each parameter A includes duplicating each parameter A into L copies to form the group corresponding to that parameter, where L is the polynomial order in FIG. 4. For example, the n-th parameter group is the group corresponding to feature n and contains L copies of the n-th parameter.
  • Each parameter A of the model A is copied L times because each parameter A needs to be multiplied by the grouped data subset A (DD_A): the parameter A is a vector, and replicating it L times turns it into matrix form, which is convenient for matrix multiplication with DD_A.
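  • A small sketch of this replication step (numpy; names illustrative):

```python
import numpy as np

W_A = np.array([0.2, -0.1, 0.4])   # N = 3 parameters of model A
L = 4                              # polynomial order / block length

# Duplicate each parameter L times: row n holds L copies of w_n, so each
# parameter group lines up elementwise with a packed length-L data block.
W_groups = np.repeat(W_A[:, None], L, axis=1)   # shape (N, L)

assert np.array_equal(W_groups[1], np.full(L, -0.1))
```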
  • Step 505: the device A uses the public key A to perform homomorphic encryption on each group of parameters A to obtain the encrypted parameters [[W_A]].
  • The encrypted parameters A include the parameter group corresponding to each encrypted parameter A.
  • Step 506: Device A sends the encrypted parameters A and the encrypted data subset A to device B.
  • Optionally, step 502 may be executed together with step 505.
  • Step 502': Device B groups the data subset B (D_B) to obtain the grouped data subset B (DD_B). For the specific method of this step, reference may be made to the description of step 402'; details are not repeated in this embodiment of the present application.
  • Step 503': Device B groups the labels Y_B of each piece of data in data subset B, to obtain a grouped label set.
  • Each group of the grouped label set corresponds to L labels. For the method of grouping Y_B, reference may be made to the method of grouping the data subset B (D_B); details are not repeated in this embodiment of the present application.
  • Step 504': Device B uses model B (W_B) and data subset B (D_B) to calculate the intermediate result B (U_B) of data subset B, and then groups the intermediate result B to obtain the grouped intermediate result B (DU_B). For details, reference may be made to the description of steps 403'-404'.
  • Step 507: Device B obtains the encrypted intermediate result A according to the encrypted parameters A and the encrypted data subset A.
  • For example, the matrix of the encrypted parameters A may be multiplied by the matrix of the encrypted data subset A to obtain the encrypted intermediate result A.
  • The encrypted intermediate result A includes the intermediate result A of each group, where each group's intermediate result A is the sum of the intermediate results A of the individual parameters A.
  • Step 508: Device B obtains the combined first intermediate result according to the grouped intermediate result B (DU_B) and the encrypted intermediate result A.
  • For a detailed description of step 508, refer to step 407'; details are not repeated in this embodiment of the present application.
  • Optionally, device B may instead perform homomorphic encryption on the grouped intermediate result B using public key A, and then combine it with the encrypted intermediate result A to obtain the combined intermediate result.
  • The combined intermediate result generated using the encrypted intermediate result A and the encrypted intermediate result B can be used to determine (or generate) the encrypted gradient A of model A, and the encrypted gradient B of model B.
  • Step 509: Device B determines (or generates) the encrypted gradient A of model A.
  • The gradient A includes the updated value of each parameter A of model A.
  • For a detailed description of how device B obtains the encrypted gradient A, reference may be made to the description of step 407; details are not repeated in this embodiment of the present application.
  • Step 510: Device B determines (or generates) the noise A (R_A) of model A, where the noise A of model A is the noise A of each parameter A (i.e., each feature) of model A.
  • For a detailed description of how device B determines (or generates) the noise A (R_A) of model A, reference may be made to the description of step 408; details are not repeated in this embodiment of the present application.
  • Step 511: Device B obtains the encrypted noise-containing gradient A according to the noise A corresponding to each gradient and the gradient A of the corresponding parameter.
  • For how the encrypted noise-containing gradient A is obtained, reference may be made to the description of step 409; details are not repeated in this embodiment of the present application.
  • Step 512: Device B determines (or generates) the gradient B of model B.
  • The gradient B includes the updated values of the parameters of model B.
  • For how device B obtains the gradient B of model B, reference may be made to the description of step 407'; details are not repeated in this embodiment of the present application.
  • Step 513: Device B generates the noise B (R_B) of model B.
  • For a detailed description of obtaining the noise B (R_B), reference may be made to the description of step 408'; details are not repeated in this embodiment of the present application.
  • Step 514: Device B obtains the encrypted noise-containing gradient B according to the noise B corresponding to each parameter and the gradient B of the corresponding parameter.
  • The encrypted noise-containing gradient B contains the noise-containing gradient B of each parameter B of model B; for example, the encrypted gradient of the (N+m)-th feature of data set D plus the noise of that feature (equivalently, the encrypted gradient of the m-th parameter of model B plus the noise of the m-th parameter).
  • Device B then sends the encrypted noise-containing gradient B, together with the encrypted noise-containing gradient A, to device A.
  • Step 515: After receiving the noise-containing gradient A and the noise-containing gradient B sent by device B, device A uses private key A (sk_A) to decrypt the encrypted noise-containing gradient A, to obtain the decrypted noise-containing gradient A (DG_AR).
  • The decrypted noise-containing gradient A (DG_AR) contains the noise-containing gradient A corresponding to each parameter A of model A, for example the gradient A of the first parameter plus the noise A of the first parameter.
  • For this step, reference may be made to the description of step 410'.
  • Step 516: Device A obtains the pre-grouping noise-containing gradient A (G_AR) according to the decrypted noise-containing gradient A (DG_AR).
  • The pre-grouping noise-containing gradient A includes the noise-containing gradient A corresponding to each parameter A before grouping; the pre-grouping noise-containing gradient A of the n-th feature is determined from the noise-containing gradients A of that feature over each piece of data in a group.
  • In other words, each decrypted value is the result of grouping values of the same feature, so the multiple values of the same feature (or parameter) within the same group must be averaged to obtain the gradient of the corresponding parameter, as sketched below.
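  • The following minimal sketch (assuming NumPy; the array shapes, values and names are illustrative, not from the source) shows this un-grouping step: the decrypted per-group values of one feature are averaged within each group, and the per-group results are then accumulated into a single per-parameter gradient entry:

    import numpy as np

    L, Q = 4, 3                            # block size and number of groups (assumed)
    # Decrypted noisy gradient values of one feature: Q groups of L values each.
    dg_feature = np.random.rand(Q, L)

    # Average the L values inside each group, then combine the Q group
    # results (summed here, matching the per-feature accumulation over groups).
    g_feature = dg_feature.mean(axis=1).sum()
    print(g_feature)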
  • Step 517: Device A updates model A according to the pre-grouping noise-containing gradient A corresponding to each parameter, i.e., WR_A = W_A - η*G_AR. The update to model A in this step carries noise A: since the value of noise A is not available on the device A side, the model A update obtained in this step is produced by the noisy gradient A, and the parameters of the updated model A are therefore not yet those of the target model.
  • Step 518: Device A obtains the noise-A-containing parameters A of the updated model A according to the updated model A (WR_A).
  • Step 519: Device A uses public key A to perform homomorphic encryption on the updated, noise-A-containing parameters A of model A, to obtain the encrypted noise-A-containing parameters A.
  • Step 520: Device A uses private key A (sk_A) to decrypt the encrypted noise-containing gradient B sent by device B, to obtain the decrypted noise-containing gradient B (DG_BR).
  • For this step, reference may be made to the detailed description of step 410; details are not repeated in this embodiment of the present application.
  • Step 521: Device A obtains the pre-grouping noise-containing gradient B (G_BR) according to the decrypted noise-containing gradient B.
  • The pre-grouping noise-containing gradient B includes the noise-containing gradient B corresponding to each parameter before grouping, for example the pre-grouping noise-containing gradient corresponding to the (N+1)-th feature.
  • Step 522: Device A sends the pre-grouping noise-containing gradient B (G_BR) and the encrypted updated noise-A-containing parameters to device B.
  • Device A may send G_BR and the encrypted parameters separately, or send them together.
  • There is no fixed time sequence between steps 520-521 and steps 515-516 performed by device A.
  • Step 523: Device B removes, according to the stored noise A of each gradient A, the noise A contained in the encrypted updated noise-A-containing parameters, to obtain the encrypted updated parameters A.
  • The encrypted updated parameters A include each encrypted updated parameter A.
  • Step 524: Device B sends the encrypted updated parameters A to device A.
  • Step 525: Device A uses private key A to decrypt the encrypted updated parameters A, to obtain the updated parameters of model A.
  • Device B further removes the noise B in the pre-grouping noise-containing gradient B (G_BR) according to the stored noise B, to obtain the gradient B set (G_B).
  • Device B then updates model B (W_B) according to the gradient B (G_B).
  • The execution order of these steps is not limited in the embodiments of the present application.
  • Steps 504' to 525 are repeated until the change to the model parameters is less than the preset value.
  • In this embodiment: device A encrypts the data set in blocks; device B calculates gradient B and gradient A on ciphertext; device A decrypts gradient B and gradient A; device B removes the noise from the decrypted gradient B and gradient A; and model B and model A are updated according to the denoised gradient B and gradient A, respectively.
  • The transmitted gradients are not only encrypted but also contain noise, which makes it more difficult to recover the peer's original data from the gradients, thereby improving the data security of both parties.
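  • As a concrete, simplified illustration of this encrypt-then-mask pattern (not the patent's exact scheme), the following sketch assumes the python-paillier package (phe), which provides additively homomorphic encryption; all variable names and values are hypothetical:

    import random
    from phe import paillier

    # Device A: key generation (cf. steps 500-501).
    pk_A, sk_A = paillier.generate_paillier_keypair(n_length=2048)

    # Stand-in for the gradient that device B derived under A's public key;
    # here a plaintext gradient is simply encrypted for illustration.
    grad_B = [0.12, -0.07, 0.31]
    enc_grad_B = [pk_A.encrypt(g) for g in grad_B]

    # Device B: add random noise under encryption before asking A to decrypt
    # (cf. steps 512-514); ciphertext + plaintext addition is homomorphic.
    noise_B = [random.uniform(-1, 1) for _ in grad_B]
    enc_noisy = [c + r for c, r in zip(enc_grad_B, noise_B)]

    # Device A: decrypt, seeing only the noisy gradient (cf. steps 520-521).
    noisy_plain = [sk_A.decrypt(c) for c in enc_noisy]

    # Device B: remove its own noise to recover the true gradient.
    grad_recovered = [v - r for v, r in zip(noisy_plain, noise_B)]
    print([round(g, 6) for g in grad_recovered])   # ~[0.12, -0.07, 0.31]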
  • FIG. 6A-6B show the method flow of another embodiment of updating model parameters provided by an embodiment of the present application, in which the computation on the encrypted data is performed by a third party (device C).
  • This method embodiment includes the following steps:
  • Step 601: Device A generates the homomorphic-encryption public key A (pk_A) and private key A (sk_A).
  • Steps 602-603: Device A groups the data subset A (D_A) to obtain the grouped data subset A (DD_A), and uses public key A to encrypt the grouped data subset A to obtain the encrypted data subset A.
  • For a detailed description of steps 602-603, reference may be made to the description of steps 502 and 503; details are not repeated in this embodiment of the present application.
  • Step 604: Device A sends the public key A (pk_A) and the encrypted data subset A to device C.
  • Step 605: Device A forms the parameter group A corresponding to each parameter A according to the parameters A of model A (W_A), and then uses public key A to perform homomorphic encryption on each parameter group A to obtain the encrypted parameter groups.
  • For a detailed description of step 605, reference may be made to the detailed description of steps 504 to 505; details are not repeated in this embodiment of the present application.
  • Step 606: Device A sends the encrypted parameters A to device C.
  • Optionally, device A may not form parameter groups, and may instead encrypt each parameter A of model A and send it to device C.
  • Step 604 and step 606 may be executed together.
  • Step 601': Device B groups the data subset B (D_B) to obtain the grouped data subset B (DD_B). For the specific method of this step, reference may be made to the description of step 402'; details are not repeated in this embodiment of the present application.
  • Step 602': Device B groups the labels Y_B of each piece of data in data subset B, to obtain the grouped labels.
  • Each group of the grouped labels corresponds to L labels.
  • For the method of grouping Y_B, reference may be made to the method of grouping D_B; details are not repeated in this embodiment of the present application.
  • Step 607: Device C obtains the encrypted intermediate result A according to the encrypted parameters A and the encrypted data subset A.
  • For a detailed description of step 607, reference may be made to the detailed description of step 507; details are not repeated in this embodiment of the present application.
  • Step 608: Device C sends the encrypted intermediate result A to device B.
  • Step 609: Device B determines (or generates) the intermediate result B (U_B) of data subset B by using model B (W_B), data subset B (D_B) and the grouped labels, and then groups the intermediate result B to obtain the grouped intermediate result B (DU_B).
  • Step 610: Device B obtains the combined first intermediate result according to the grouped intermediate result B (DU_B) and the encrypted intermediate result A.
  • For the detailed description of step 610, reference may be made to the detailed description of step 406'; details are not repeated in this embodiment of the present application.
  • Optionally, device B may use public key A to perform homomorphic encryption on the grouped intermediate result B to obtain the encrypted intermediate result B, and combine it with the encrypted intermediate result A to obtain the combined first intermediate result. If device B needs to use public key A to encrypt the grouped intermediate result B, device B first needs to obtain public key A.
  • Step 611: Device B calculates the gradient B of model B and generates the noise B (R_B) of model B, where the noise B of model B includes the noise B of each parameter of model B. Device B then obtains the encrypted noise-containing gradient B according to the noise B corresponding to each parameter B and the gradient B of the corresponding feature, and sends the encrypted noise-containing gradient B to device A.
  • For the detailed description of step 611, reference may be made to the detailed description of steps 407'-409'; details are not repeated in this embodiment of the present application.
  • Step 612: Device B sends the combined first intermediate result to device C.
  • In this embodiment, the core computation is placed on the B side and the C side, but that computation is performed after encryption: the B side and the C side carry out ciphertext calculations, so the gradient information they obtain is ciphertext. Since updating a model requires plaintext model parameters, the calculated ciphertext must be sent to the A side for decryption. At the same time, to prevent the A side from obtaining the plaintext gradient, a random number must be added to the calculated gradient, so that even after the A side decrypts, the real gradient cannot be obtained.
  • Step 613: Device B sends the encrypted noise-containing gradient B to device A.
  • Step 614: Device C determines (or generates) the gradient A of model A according to the combined first intermediate result and the encrypted parameters A of model A.
  • Step 614 is the same as step 509; details are not repeated in this embodiment of the present application.
  • Step 615: Device C determines (or generates) the noise A (R_A) of gradient A.
  • The noise A of gradient A includes the noise A corresponding to each parameter (i.e., each feature) of model A.
  • For a detailed description of how device C determines (or generates) the noise A (R_A) of model A, reference may be made to the description of step 408; details are not repeated in this embodiment of the present application.
  • Step 616: Device C performs homomorphic encryption using public key A, according to the noise A corresponding to each parameter A and the gradient A of the corresponding parameter, to obtain the encrypted noise-containing gradient A.
  • For how the encrypted noise-containing gradient A is obtained, reference may be made to the description of step 409; details are not repeated in this embodiment of the present application.
  • Step 617: Device C sends the encrypted noise-containing gradient A to device A.
  • Step 618: After receiving the encrypted noise-containing gradient A sent by device C, device A uses private key A (sk_A) to decrypt it, to obtain the decrypted noise-containing gradient A (DG_AR).
  • The decrypted noise-containing gradient A (DG_AR) contains the noise-containing gradient A corresponding to each parameter A of model A, for example the gradient A of the first parameter A plus the noise A of the first gradient A.
  • For this step, reference may be made to the description of step 410'.
  • Step 619: Device A obtains the pre-grouping noise-containing gradient A (G_AR) according to the decrypted noise-containing gradient A (DG_AR).
  • Step 620: Device A uses private key A (sk_A) to decrypt the encrypted noise-containing gradient B, to obtain the decrypted noise-containing gradient B (DG_BR).
  • Step 621: Device A obtains the pre-grouping noise-containing gradient B (G_BR) according to the decrypted noise-containing gradient B (DG_BR).
  • Step 622: Device A sends the pre-grouping noise-containing gradient A to device C.
  • Step 623: Device A sends the pre-grouping noise-containing gradient B to device B.
  • Step 624: Device C removes the noise A in the noise-containing gradient A according to the stored noise A of each gradient, to obtain the gradient A (G_A).
  • Step 626: Device B removes the noise B in the noise-containing gradient B according to the stored noise B corresponding to each parameter B of model B, to obtain the gradient B (G_B).
  • Steps 610 to 627 are repeated until the change to the model parameters is less than the preset value.
  • In this embodiment, part of the calculation steps are placed in device C, so the computation load of device B can be reduced.
  • Because the interactions among device A, device C and device B involve only grouped and encrypted data, or the gradients of a noise-added model, data security can likewise be guaranteed.
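  • For intuition about the kind of packed ciphertext computation device C performs, the sketch below assumes the TenSEAL library's CKKS scheme, whose slot packing (bounded by the polynomial modulus degree) is analogous to the grouping by polynomial order L described above; this is an illustrative stand-in, not the patent's exact construction, and all names and values are hypothetical:

    import tenseal as ts

    # Context playing the role of device A's homomorphic key material.
    ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
    ctx.global_scale = 2 ** 40
    ctx.generate_galois_keys()

    # Device A encrypts one packed block of a feature, and the matching
    # L copies of the corresponding model parameter (cf. steps 603, 605).
    feature_block = [0.5, 1.0, -0.2, 0.8]      # L = 4 records, one feature
    param_block = [0.3] * 4                    # parameter tiled L times
    enc_data = ts.ckks_vector(ctx, feature_block)
    enc_param = ts.ckks_vector(ctx, param_block)

    # Device C multiplies ciphertext by ciphertext and sums the slots to get
    # this block's contribution to the encrypted intermediate result (step 607).
    enc_intermediate = (enc_param * enc_data).sum()

    # Only the private-key holder (device A) can decrypt.
    print(enc_intermediate.decrypt())          # ~[0.63]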
  • FIG. 7 is a flowchart of another model updating method provided by an embodiment of the present invention.
  • In this scenario, different features (values) of the same user's data are located in multiple devices (assumed to be three in this embodiment of the present application), but the data in only one device includes labels.
  • The model of this vertical federation scenario includes two or more models A (W_A1 and W_A2; if there are H devices A, there are H models W_AH) and a model B (W_B).
  • The parameters of each model A and of model B correspond to the features held by the respective device, and different models A have different parameters.
  • As before, it is assumed that one parameter of a model corresponds to one feature in the corresponding data subset.
  • In this embodiment, each device uses the public keys generated by all the devices (including the public key A1 generated by device A1, the public key A2 generated by device A2, and the public key B generated by device B) to generate a common public key, and each device uses the common public key to encrypt its respective data subset.
  • The encrypted data subset, the encrypted intermediate result and the encrypted gradient generated from the encrypted data subset, and/or the noise contained in the encrypted gradient, etc., are sent to the other apparatuses.
  • For example, device A1 sends its encrypted data subset, intermediate results, gradients and/or noise to device B or device A2; that is, device A1 sends the encrypted data subset D_A1, the intermediate result DU_A1 and/or the noise A1 to device B and device A2, respectively.
  • Each device participates in the training of the vertical federation model, but only the data contained in one device is labeled data (in this embodiment of the present application, the data of device B is the labeled data); the data contained in the other devices is unlabeled data. Assuming that the data of H devices in total is unlabeled data, the devices containing the unlabeled data can be represented as A1 to AH, each of which behaves like device A above.
  • As before, the data subset with labels is referred to as data subset B,
  • the device storing data subset B is referred to as device B,
  • and each of the other devices storing unlabeled data is called a device A.
  • In the embodiments of the present application there are two or more devices A.
  • This embodiment of the present application includes the following steps:
  • Step 701: Each device generates a homomorphic-encryption public key and private key, and transmits the public key it generated to the other devices. Each device then generates a common public key according to the public key it generated and the received public keys generated by the other devices.
  • For example, device A1 generates the homomorphic-encryption public key A1 (pk_A1) and private key A1 (sk_A1), receives the public key B (pk_B) sent by device B and the public key A2 (pk_A2) sent by device A2, and sends public key A1 to device B and device A2, respectively.
  • Device A1 generates the common public key pkAll according to public key A1, public key A2 and public key B.
  • Device B and device A2 also perform the same steps performed by device A1; details are not repeated in the examples of the present application.
  • Step 702: Each device determines (or generates) an intermediate result for its data subset using its own data subset and its own model.
  • For step 702, reference may be made to the detailed description of step 403; details are not repeated in this embodiment of the present application.
  • Step 703: Each device encrypts its intermediate result using the common public key, and sends the encrypted intermediate result to the other devices.
  • For example, device A1 encrypts the intermediate result A1 using the common public key, and sends the encrypted intermediate result A1 to device B and device A2.
  • Device B encrypts the intermediate result B using the common public key, and sends the encrypted intermediate result B to device A1 and device A2.
  • Device A2 encrypts the intermediate result A2 (U_A2) using the common public key, and sends the encrypted intermediate result A2 to device A1 and device B.
  • The intermediate results used in each round of model training are generated from each device's data subset and each device's model:
  • the intermediate result A1 is determined (or generated) from model A1 and data subset A1;
  • the intermediate result A2 is determined (or generated) from model A2 and data subset A2;
  • and the intermediate result B is determined (or generated) from model B and data subset B.
  • Because this embodiment of the present application uses the common public key to encrypt the intermediate results before sending them to other devices, an untrusted third party is prevented from recovering data from the intermediate results, thereby ensuring the security of the data.
  • Step 704: Each device generates a combined intermediate result according to the encrypted intermediate result it determined (or generated) and the received encrypted intermediate results sent by the other devices, i.e., the combination of the encrypted intermediate results A1, A2 and B.
  • Step 705: Each device calculates the gradient of its own model according to the combined intermediate result.
  • For example, the gradient of model A1 includes the gradient corresponding to each parameter of model A1, where the gradient corresponding to the n-th feature of model A1 is determined from the combined intermediate result and the data corresponding to the n-th feature in data subset A1.
  • Step 706: Each device sends its gradient to the other devices, receives the decryption results of the gradient from the other devices, and then updates its own model using the decrypted gradient.
  • Optionally, step 706 can use a sequential decryption method, as follows:
  • device A1 sends its gradient to device B or device A2 in turn; after receiving the gradient decrypted by device B or device A2, it sends that partially decrypted gradient on to device A2 or device B, until all devices have performed their decryption of the gradient.
  • Device B and device A2 each decrypt the gradient with their respective private keys.
  • Optionally, step 706 can instead use a separate decryption method, as follows:
  • device A1 sends its gradient to device B and device A2 respectively; after receiving the decrypted gradients from device B and device A2, it synthesizes the gradients decrypted by device B and device A2 to obtain the final decryption result.
  • Device B and device A2 decrypt the gradient with their respective private keys, as illustrated below.
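  • The difference between the two options can be sketched with a toy model. The source does not specify the multi-key scheme, so the following code models each party's "private key" as an additive mask, purely for illustration; every class, name and value here is hypothetical:

    import random

    class Device:
        # Toy stand-in for a party in a multi-key scheme: its "private key"
        # is an additive mask folded in at encryption time (illustrative only).
        def __init__(self):
            self.secret = random.randint(1, 1000)
        def partial_decrypt(self, value):
            return value - self.secret

    plaintext_gradient = 42
    device_B, device_A2 = Device(), Device()

    # "Encryption" under the common key: both parties' secrets are folded in.
    ciphertext = plaintext_gradient + device_B.secret + device_A2.secret

    # Sequential decryption: each device removes its own layer in turn.
    seq = device_A2.partial_decrypt(device_B.partial_decrypt(ciphertext))
    assert seq == plaintext_gradient

    # Separate decryption: each device returns a share; the requester combines.
    shares = [device_B.partial_decrypt(ciphertext),
              device_A2.partial_decrypt(ciphertext)]
    combined = sum(shares) - ciphertext        # synthesize shares into plaintext
    assert combined == plaintext_gradient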
  • After the decrypted gradient is obtained, each model is updated; for the updating of model A1, reference may be made to step 414.
  • Optionally, each device may directly use the encrypted gradient to update its model, without sending the gradient to other devices for decryption.
  • In that case, a process of decrypting the encrypted parameters can optionally be performed to calibrate the model parameters in the ciphertext state.
  • The calibration of any party's model parameters requires an agent to be responsible for calibrating that party's encrypted parameters.
  • Optional implementations of this operation include the following two.
  • In the first implementation, the participating parties decrypt separately:
  • the encrypted model parameters of the calibrating party A1 are sent to the agent B after noise is added, and the agent sends the noise-added encrypted model parameters to the other parties independently.
  • The calibrating party receives the decryption results returned by all parties.
  • The agent also decrypts the noise-added encrypted model parameters, synthesizes the multi-party decryption results to obtain the plaintext noisy model parameters, encrypts them using the synthesized public key, and feeds them back to device A1; device A1 performs ciphertext denoising on the returned encrypted model parameters to obtain the calibrated encrypted model parameters.
  • In the second implementation, the participating parties decrypt in sequence.
  • The encrypted model parameters of the calibrating party A1, with noise R1 added (in the ciphertext state), are sent to agent B; the agent adds noise RB (in the ciphertext state) and sends them to the other participants in turn, each of which adds its own noise (in the ciphertext state) in turn, finally returning them to agent B.
  • The agent sends the encrypted model parameters carrying every party's noise to each party (including A1 and B itself), and each party decrypts them and returns the result to agent B.
  • Party B thereby obtains the plaintext model parameters carrying each party's noise.
  • Party B encrypts them with the synthetic key, calls on all parties except A1 to denoise in turn (in the ciphertext state), and finally returns them to A1.
  • A1 removes noise R1 in the ciphertext state, and obtains the calibrated encrypted model parameters.
  • In the embodiment corresponding to FIG. 7, each device encrypts its intermediate result with the common public key, and each device uses its own private key to decrypt
  • the gradients generated by the other devices.
  • Data security is thus guaranteed, and each party's encryption operation is performed only once, which reduces the number of interactions and saves network resources.
  • FIG. 8 provides an apparatus according to an embodiment of the present application, including a receiving module 801 , a processing module 802 and a sending module 803 .
  • the processing module 802 is configured to generate a first intermediate result according to the first data subset.
  • the receiving module 801 is configured to receive an encrypted second intermediate result sent by a second device, where the second intermediate result is generated according to a second data subset corresponding to the second device.
  • the processing module 802 is further configured to: obtain a first gradient of the first model, where the first gradient of the first model is generated according to the first intermediate result and the encrypted second intermediate result; after the first gradient of the first model is decrypted using a second private key, the decrypted first gradient is used to update the first model, the second private key being a decryption key for homomorphic encryption generated by the second device.
  • In a possible design, the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second device, and the processing module 802 is further configured to generate a first public key and a first private key for homomorphic encryption, and to encrypt the first intermediate result using the first public key.
  • The sending module 803 is configured to send the encrypted first intermediate result.
  • In another design, the sending module 803 is configured to send the encrypted first data subset and the encrypted first parameters of the first model, where the encrypted first data subset and the encrypted first parameters are used to determine (or generate) the encrypted first intermediate result.
  • The receiving module 801 is configured to receive the encrypted first gradient of the first model, where the first gradient of the first model is determined (or generated) according to the encrypted first intermediate result, the encrypted first parameters and the encrypted second intermediate result.
  • The processing module 802 is configured to decrypt the encrypted first gradient using the first private key, and the decrypted first gradient of the first model is used to update the first model.
  • In yet another design, the receiving module 801 is configured to receive the encrypted first intermediate result and the encrypted second intermediate result, and to receive the parameters of the first model.
  • The processing module 802 is configured to determine (or generate) the first gradient of the first model according to the encrypted first intermediate result and the encrypted second intermediate result, as well as the parameters of the first model,
  • to decrypt the first gradient, and to update the first model according to the decrypted first gradient.
  • The modules in the apparatus in FIG. 8 can also be used to execute any step performed by any apparatus in the method flows of FIG. 3 to FIG. 7.
  • Details are not repeated in the embodiments of the present application.
  • FIG. 9 is a schematic diagram of the hardware structure of the apparatus 90 provided by an embodiment of the present application.
  • The apparatus may be any entity or network element in FIG. 2A, or any apparatus in FIG. 2B,
  • or any of the devices in FIG. 3 to FIG. 7.
  • The apparatus shown in FIG. 9 may include: a processor 901, a memory 902, a communication interface 904, an output device 905, an input device 906 and a bus 903.
  • The processor 901, the memory 902, the communication interface 904, the output device 905 and the input device 906 can be connected through the bus 903.
  • The processor 901 is the control center of the computer device, and can be a general-purpose central processing unit (CPU) or another general-purpose processor, where the general-purpose processor may be a microprocessor, any conventional processor, or the like.
  • The processor 901 may include one or more CPUs.
  • The memory 902 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • The memory 902 may exist independently of the processor 901, in which case
  • the memory 902 may be connected to the processor 901 through the bus 903 and used to store data, instructions or program code.
  • When the processor 901 calls and executes the instructions or program code stored in the memory 902,
  • the machine learning model updating method provided by the embodiments of the present application can be implemented, for example, the machine learning model updating method shown in any of FIG. 3 to FIG. 7.
  • Alternatively, the memory 902 may be integrated with the processor 901.
  • The communication interface 904 is used to connect the apparatus to other devices through a communication network, where the communication network can be Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like.
  • The communication interface 904 may include a receiving unit for receiving data and a sending unit for sending data.
  • The bus 903 can be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of presentation, only one thick line is used in FIG. 9, but this does not mean that there is only one bus or only one type of bus.
  • The structure shown in FIG. 9 does not constitute a limitation on the computer device 90:
  • the computer device may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
  • When a machine learning model management device (such as a machine learning model management center or a federated learning server) is implemented,
  • each functional module may be divided per function, or two or more functions may be integrated in one processing module.
  • The above integrated modules can be implemented in the form of hardware, or in the form of software functional modules. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical function division; there may be other division manners in actual implementation.
  • The disclosed methods may be implemented as computer program instructions encoded in a machine-readable format on a computer-readable storage medium or on other non-transitory media or articles of manufacture.
  • The methods may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented by a software program, they can be realized in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • When the computer instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (e.g.,
  • coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
  • The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives (SSDs)), and the like.

Abstract

A machine learning model updating method, applied to the artificial intelligence field. The method includes: a first device generates a first intermediate result according to a first data subset. The first device receives an encrypted second intermediate result sent by a second device, where the second intermediate result is generated according to a second data subset corresponding to the second device. The first device obtains a first gradient of a first model, where the first gradient of the first model is generated according to the first intermediate result and the encrypted second intermediate result. After being decrypted using a second private key, the first gradient of the first model is used to update the first model, where the second private key is a decryption key for homomorphic encryption generated by the second device. With this method, in a vertical federated learning scenario, when the first device uses the second device's data to update its model, the data security of the second device can be protected, thereby protecting user privacy.

Description

Method and Apparatus for Updating a Machine Learning Model
This application claims priority to Chinese Patent Application No. 202011635759.9, entitled "Method and apparatus for updating a machine learning model", filed with the China National Intellectual Property Administration on December 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of machine learning technologies, and in particular, to a model updating method and apparatus.
Background
Federated learning is a distributed machine learning technique. Each federated learning client (FLC), such as federated learning apparatus 1, 2, 3, ..., k, performs model training using local computing resources and local network service data, and sends the model parameter update information Δω produced during local training, such as Δω_1, Δω_2, Δω_3, ..., Δω_k, to a federated learning server (FLS). The federated learning server aggregates the models based on the model update parameters using an aggregation algorithm to obtain an aggregated machine learning model. The aggregated machine learning model serves as the initial model for the next round of model training by the federated learning apparatuses. The federated learning apparatuses and the federated learning server repeat the above training process until the obtained aggregated machine learning model satisfies a preset condition, at which point training stops.
In federated learning, data with different features located in different entities may need to be aggregated before model training, to enhance the model's learning capability. The method of training a model on aggregated data that has different features and comes from different entities is called vertical federated learning.
Existing vertical federated learning is shown in FIG. 1: apparatus B and apparatus A receive the public and private key pair from a server, update the models according to the model gradients sent by client A and client B, and send the updated models to client A and client B, respectively.
Existing vertical federated learning therefore depends on a server; but since the public key and private key are generated by the server, whether the server is trustworthy is an important issue. If the server is an untrusted entity, data security faces a considerable threat. How to improve the security of vertical federated learning has become a problem to be solved.
Summary
This application provides a machine learning model updating method, apparatus and system, which improve the security of vertical federated learning.
To achieve the above objective, the embodiments of this application provide the following technical solutions:
According to a first aspect, an embodiment of this application provides a machine learning model updating method. The method includes: a first device generates a first intermediate result according to a first data subset. The first device receives an encrypted second intermediate result sent by a second device, where the second intermediate result is generated according to a second data subset corresponding to the second device. The first device obtains a first gradient of a first model, where the first gradient of the first model is generated according to the first intermediate result and the encrypted second intermediate result. After being decrypted using a second private key, the first gradient of the first model is used to update the first model, where the second private key is a decryption key for homomorphic encryption generated by the second device. With this method, the second device that owns the second data subset uses a key it generated (such as a public key) to homomorphically encrypt the second intermediate result sent to the first device, and the second device uses the private key it generated to decrypt the gradient. Thus, in a vertical federated learning scenario, when the first device uses the second device's data to update its model, the data security of the second device can be protected; for example, user data such as age, job and sex in Table 2 cannot be obtained, thereby protecting user privacy.
Exemplarily, the first gradient may be determined by the first device, or may be determined by another apparatus according to the first intermediate result and the encrypted second intermediate result.
In a possible design, the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second device. The first device generates a first public key and a first private key for homomorphic encryption, and encrypts the first intermediate result using the first public key. With this method, the first device and the second device each generate keys for encrypting or decrypting the data of their respective data subsets, which guarantees the data security of each data subset.
In a possible design, the first device sends the encrypted first intermediate result to the second device, so that the second device can use the first device's data for model training while the security of the first device's data is guaranteed.
In a possible design, that the first gradient of the first model is determined according to the first intermediate result and the encrypted second intermediate result is specifically: the first gradient of the first model is determined according to the encrypted first intermediate result and the encrypted second intermediate result. The first device decrypts the first gradient of the first model using the first private key. With this method, when the first device needs to train on fully encrypted data, the security of the training data is guaranteed.
In a possible design, the first device generates first noise for the first gradient of the first model; the first device sends the first gradient containing the first noise to the second device; the first device receives the first gradient decrypted using the second private key, where the decrypted gradient contains the first noise. With this method, noise is added to the first gradient, so that even when the first gradient is sent to the second device for decryption, the data security of the first device's first data subset is still guaranteed.
In a possible design, the first device receives second parameters of a second model sent by the second device. The first device determines a second gradient of the second model according to the encrypted first intermediate result, the encrypted second intermediate result and the second parameter set of the second model. The first device sends the second gradient of the second model to the second device. With this method, the first device determines the second gradient of the second model according to the second device's second data subset and the first device's first data subset; since the encrypted intermediate result of the second data subset is used, the data security of the second data subset is guaranteed.
In a possible design, the first device determines second noise for the second gradient, and the second gradient sent to the second device contains the second noise. With this method, in the scenario where the first device updates the second device's second model, the first device adds the second noise to the second gradient, which can guarantee the security of the first device's first data subset.
In a possible design, the first device receives updated second parameters containing the second noise, where the second parameter set is the parameter set of the second model updated with the second gradient; the first device removes the second noise contained in the updated second parameters. With this method, in the scenario where the first device updates the second device's second model, the first device removes the noise from the second parameters, so that the security of the first data subset is guaranteed even when the first device updates the second model.
In a possible design, the first device receives at least two second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second devices. The first device generates a common public key for homomorphic encryption according to the received at least two second public keys and the first public key; the common public key is used to encrypt the second intermediate result and/or the first intermediate result. With this method, when data from multiple apparatuses participates in updating the machine learning model, the data security of all parties can be guaranteed.
In a possible design, decrypting the first gradient of the first model using the second private key includes:
the first device sends the first gradient of the first model to the at least two second devices in turn, and receives the first gradient of the first model decrypted by the at least two second devices using their respective second private keys. With this method, when data from multiple apparatuses participates in updating the machine learning model, the data security of all parties can be guaranteed.
In a possible design, the first device decrypts the first gradient of the first model using the first private key.
According to a second aspect, an embodiment of this application provides a machine learning model updating method. The method includes: a first device sends an encrypted first data subset and encrypted first parameters of a first model, where the encrypted first data subset and the encrypted first parameters are used to determine an encrypted first intermediate result. The first device receives an encrypted first gradient of the first model, where the first gradient of the first model is determined according to the encrypted first intermediate result, the encrypted first parameters and an encrypted second intermediate result. The first device decrypts the encrypted first gradient using a first private key, and the decrypted first gradient of the first model is used to update the first model. With this method, the first device places the computation of the first gradient used to update the first model in another apparatus, and the first device sends the first data subset after encryption, which guarantees the data security of the first data subset.
In a possible design, the first device receives an encrypted second gradient of a second model, where the encrypted second gradient is determined according to the encrypted first intermediate result and the encrypted second intermediate result, the second intermediate result is determined according to the second device's second data subset and the parameters of the second device's second model, and the encrypted second intermediate result is obtained by the second device homomorphically encrypting the second intermediate result. The first device decrypts the second gradient using the first private key. The first device sends the second gradient decrypted with the first private key to the second device, where the decrypted second gradient is used to update the second model. With this method, the first device decrypts the gradient of the second device's model, guaranteeing the data security of the first device's first data subset.
In a possible design, the first gradient received by the first device contains first noise, the decrypted first gradient contains the first noise, and the parameters of the updated first model contain the first noise. Including noise in the gradient can further guarantee data security.
In a possible design, the first device updates the first model according to the decrypted first gradient, or the first device sends the decrypted first gradient.
In a possible design, the first device receives at least two second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second devices. The first device generates a common public key for homomorphic encryption according to the received at least two second public keys and the first public key; the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
According to a third aspect, an embodiment of this application provides a machine learning model updating method. The method includes: receiving an encrypted first intermediate result and an encrypted second intermediate result;
receiving parameters of a first model; determining a first gradient of the first model according to the encrypted first intermediate result, the encrypted second intermediate result and the parameters of the first model; decrypting the first gradient; and updating the first model according to the decrypted first gradient. With this method, both the first intermediate result and the second intermediate result are encrypted, guaranteeing the data security of each data subset.
In a possible design, the encrypted first intermediate result is obtained by homomorphically encrypting the first intermediate result using a first public key; the encrypted second intermediate result is obtained by homomorphically encrypting the second intermediate result using the first public key.
In a possible design, decrypting the first gradient includes: decrypting the first gradient using a first private key.
In a possible design, the first gradient is sent to a first device.
In a possible design, the first public key is obtained from a first device and sent to a second device.
According to a fourth aspect, this application provides an apparatus. The apparatus is configured to execute any of the methods provided in the first to third aspects above.
In a possible design, this application may divide the machine learning model management apparatus into functional modules according to any of the methods provided in the first aspect. For example, each functional module may be divided per function, or two or more functions may be integrated in one processing module.
Exemplarily, this application may divide the machine learning model management apparatus by function into a receiving module, a processing module, a sending module, and so on. For descriptions of the possible technical solutions executed by the divided functional modules and of their beneficial effects, reference may be made to the technical solutions provided by the first aspect or its corresponding possible designs, the second aspect or its corresponding possible designs, or the third aspect or its corresponding possible designs; details are not repeated here.
In another possible design, the machine learning model management apparatus includes: a memory and a processor coupled to each other, where the memory is configured to store computer instructions, and the processor is configured to call the computer instructions to execute the method provided by the first aspect or its corresponding possible designs, the second aspect or its corresponding possible designs, or the third aspect or its corresponding possible designs.
According to a fifth aspect, this application provides a computer-readable storage medium, such as a non-transitory computer-readable storage medium, on which a computer program (or instructions) is stored. When the computer program (or instructions) runs on a computer device, the computer device is caused to execute the method provided by the first aspect or its corresponding possible designs, the second aspect or its corresponding possible designs, or the third aspect or its corresponding possible designs.
According to a sixth aspect, this application provides a computer program product which, when run on a computer device, causes the method provided by the first, second or third aspect or their corresponding possible designs to be executed.
According to a seventh aspect, this application provides a chip system, including a processor, where the processor is configured to call and run a computer program stored in a memory, to execute the method provided by the first, second or third aspect or their corresponding possible designs.
It can be understood that, in any of the technical solutions provided in another possible design of the first aspect, another possible design of the second aspect, or the second to seventh aspects above, the sending actions in the first, second or third aspect may specifically be replaced by sending under the control of a processor, and the receiving actions in the second or first aspect may specifically be replaced by receiving under the control of a processor.
It can be understood that any of the systems, apparatuses, computer storage media, computer program products or chip systems provided above can be applied to the corresponding methods provided by the first, second or third aspect; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods, which are not repeated here.
In this application, the name of any of the above apparatuses does not limit the devices or functional modules themselves; in actual implementation, these devices or functional modules may appear under other names. As long as the functions of each device or functional module are similar to those of this application, they fall within the scope of the claims of this application and their equivalent technologies.
These and other aspects of this application will be more concise and understandable in the following description.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an existing vertical federated learning system;
FIG. 2A is a schematic structural diagram of a vertical federated learning system provided by an embodiment of this application;
FIG. 2B is a schematic structural diagram of a vertical federated learning system provided by an embodiment of this application;
FIG. 3 is a flowchart of a method applicable to vertical federated learning provided by an embodiment of this application;
FIG. 4 is a flowchart of a method applicable to vertical federated learning provided by an embodiment of this application;
FIG. 5A-5B are flowcharts of another method applicable to vertical federated learning provided by an embodiment of this application;
FIG. 6A-6B are flowcharts of another method applicable to vertical federated learning provided by an embodiment of this application;
FIG. 7 is a flowchart of another method applicable to vertical federated learning provided by an embodiment of this application;
FIG. 8 is a schematic structural diagram of a machine learning model updating apparatus provided by an embodiment of this application;
FIG. 9 is a schematic diagram of the hardware structure of a computer device provided by an embodiment of this application.
Detailed Description
The following explains some terms and technologies involved in the embodiments of this application:
1) Machine learning, machine learning models, machine learning model files
Machine learning means using algorithms to parse data, learn from it, and then make decisions and predictions about real-world events. Machine learning "trains" with large amounts of data, learning from the data through various algorithms how to accomplish a model task.
In some examples, a machine learning model is a file containing the algorithm implementation code and the parameters used to accomplish a model task, where the algorithm implementation code describes the model structure of the machine learning model and the parameters describe the attributes of the components of the machine learning model. For convenience of description, this file is referred to below as a machine learning model file; for example, sending a machine learning model below specifically means sending a machine learning model file.
In other examples, a machine learning model is a logical functional module that accomplishes a model task; for example, the values of input parameters are fed into the machine learning model to obtain the values of its output parameters.
Machine learning models include artificial intelligence (AI) models, such as neural network models.
2) Vertical federated learning
Vertical federated learning (also called heterogeneous federated learning) refers to federated learning performed in a setting where the parties own different feature spaces. Vertical federated learning can train on data of the same users that has different user features and is located in different entity apparatuses. Vertical federated learning aggregates data with different features or attributes located in different entities, to enhance model capability. A feature of data may also be an attribute of the data.
3) Model gradient
A model gradient is the amount of change of the model parameters during training of a machine learning model.
4) Homomorphic encryption
Homomorphic encryption is a form of encryption that allows specific forms of algebraic operations to be performed on ciphertext such that the result is still encrypted. Decrypting, with the key of a homomorphic key pair, the result of operating on homomorphically encrypted data yields the same result as performing the same operations on the plaintext.
5) Public key: the key used for encryption when performing homomorphic encryption.
6) Private key: the key used for decryption when performing homomorphic encryption.
Other terms
In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as more preferred or advantageous than other embodiments or designs. Rather, the words "exemplary" or "for example" are intended to present related concepts in a concrete manner.
In the embodiments of this application, the terms "second" and "first" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined with "second" or "first" may explicitly or implicitly include one or more of that feature. In the description of this application, unless otherwise stated, "multiple" means two or more.
In this application, the term "at least one" means one or more, and the term "multiple" means two or more; for example, multiple first messages means two or more first messages.
It should be understood that the terms used in the description of the various examples herein are for describing particular examples only and are not intended to be limiting. As used in the description of the various examples and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" used herein refers to and covers any and all possible combinations of one or more of the associated listed items. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, and B alone. In addition, the character "/" in this application generally indicates an "or" relationship between the associated objects.
It should also be understood that, in the embodiments of this application, the sequence numbers of the processes do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
It should be understood that determining B according to A does not mean determining B only according to A; B may also be determined according to A and/or other information.
It should also be understood that the term "include" (also "includes", "including", "comprises" and/or "comprising"), when used in this specification, specifies the presence of stated features, integers, steps, operations, elements and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be understood that the term "if" may be interpreted to mean "when" or "upon" or "in response to determining" or "in response to detecting".
It should be understood that "one embodiment", "an embodiment" or "a possible implementation" mentioned throughout the specification means that particular features, structures or characteristics related to the embodiment or implementation are included in at least one embodiment of this application. Therefore, "in one embodiment", "in an embodiment" or "a possible implementation" appearing throughout the specification does not necessarily refer to the same embodiment. In addition, these particular features, structures or characteristics may be combined in one or more embodiments in any suitable manner.
It should also be understood that the "connection" mentioned in the embodiments of this application may be a direct connection or an indirect connection, and may be a wired connection or a wireless connection; that is, the embodiments of this application do not limit the connection manner between devices.
The technical solutions provided by the embodiments of this application are described below with reference to the accompanying drawings.
以下,结合附图对本申请实施例提供的技术方案进行说明。
FIG. 2A is a schematic structural diagram of a system for an application scenario of vertical federated learning provided by an embodiment of this application. The system 200 shown in FIG. 2A may include a network data analytics function entity 201, a base station 202, a core network element 203 and an application function entity 204. Each network entity in FIG. 2A may be device A or device B of the embodiments of this application.
Network Data Analytics Function (NWDAF) entity 201: can acquire data from the network entities, such as the base station 202, the core network element 203 and/or the application function entity 204, and perform data analysis. Data analysis means training a model with the acquired data as input. In addition, the network data analytics function entity 201 can also perform inference based on the model to determine data analysis results, and then provide the data analysis results to other network entities, third-party service servers, terminal devices or network management systems. This application mainly involves the data collection function and model training function of the NWDAF entity 201.
Application Function (AF) entity 204: used to provide services, or to route application-related data, for example providing data to the NWDAF entity 201 for model training. Further, the application function entity 204 can also use private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
Base station 202: provides access services for terminals, and forwards control signals and user data between terminals and the core network. In the embodiments of this application, the base station 202 can also send data to the NWDAF entity 201 for model training by the NWDAF entity 201. The base station 202 can also use private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
Core network element 203: provides core-network-related services for terminals. The core network element 203 may be a user plane function entity such as a UPF entity, a session management function entity such as an SMF entity, or a policy control function entity such as a PCF entity in a 5G architecture. It should be understood that the core network element can also be applied to other future network architectures, such as a 6G architecture. In the embodiments of this application, any core network element can also send data to the NWDAF entity 201 for model training by the NWDAF entity 201. The core network element 203 can also use private data not sent to the NWDAF entity 201 to perform vertical federated learning with other network entities.
In addition, the architecture of the embodiments of this application may also include other network elements, which are not limited here.
When the network architecture of FIG. 2A is used, any network entity can send non-privacy-related data to the NWDAF entity 201; the NWDAF entity 201 composes a data subset from the data sent by one or more devices, and performs vertical federated learning by combining it with the private data that other network entities have not sent to the NWDAF. The NWDAF entity 201 can perform vertical federated learning together with one type of network entity, or with multiple types of network entities. For example, the NWDAF entity 201 can perform vertical federated learning with one or more base stations 202 according to network data sent by multiple base stations 202, or perform vertical federated learning with the base station 202 and the AF entity 204 according to data sent by the base station 202 and the AF entity 204.
Table 1 gives an example of a data set for vertical federated learning using the network architecture of FIG. 2A.
In Table 1, rows 3, 7 and 12 respectively represent private data that is not sent to the NWDAF entity 201 and is kept in the base station 202, the core network element 203 or the AF entity 204; it can serve as the data subsets in FIG. 3 to FIG. 7. The data column of Table 1 indicates the features of the data, corresponding to the parameters of the vertical federated learning model. For example, the contents of rows 1-2, 4-6 and 8-10 correspond to the parameters of the model trained or used by the NWDAF entity 201. The source column of Table 1 indicates the source of the data of each feature: for example, the data corresponding to rows 1-2 is sent by the AF entity 204 to the NWDAF entity 201, and the data corresponding to rows 4-6 is sent by the UPF entity to the NWDAF entity 201. The data in row 1 (service experience) serves as the label data for model training, i.e., the user's service experience is the label data. The data in rows 1-12 is data of the same user at multiple entities.
Therefore, in the scenario corresponding to FIG. 2A, the NWDAF entity 201 acts as device B of FIG. 4 to FIG. 7, and the corresponding data subset contains the labels.
FIG. 2B is a schematic structural diagram of a system for an application scenario of vertical federated learning provided by an embodiment of this application. The system shown in FIG. 2B may include a service system server A 251 and a service system server B 252. Service system servers A and B may be servers applied to different service systems, such as a server of a banking service system and a server of a call service system. The service system server A 251 in FIG. 2B may also be the base station, the core network element 203, the application function entity 204 or the network data analytics function entity 201 in FIG. 2A. The servers of the service systems shown in FIG. 2B are used to store user data, and to perform vertical federated learning together with other service systems using their stored user data and the user data stored by the other service system servers. The service system server B 252 in FIG. 2B may also be the base station, the core network element 203, the application function entity 204 or the network data analytics function entity 201 in FIG. 2A.
Table 2 illustrates the data features, taking service system server A as the server of a call service system and service system server B as the server of a banking service system as an example.
In Table 2, the data in the row numbered 1 (status) serves as the label data for model training. The data corresponding to rows 1-9 is obtained by the banking service system server and can serve as the data subset B corresponding to device B; the data corresponding to rows 10-14 is obtained by the operator service system and can serve as the data subset A corresponding to device A. The data in rows 1-14 is data of the same user in different systems.
The following applies to both application scenarios of FIG. 2A and FIG. 2B. Device A has a data subset A (D_A) and device B has a data subset B (D_B). Data subset A and data subset B each contain P pieces of data (for example, data of P users). Data subset A contains N features and data subset B contains M features. Therefore, the features F_A owned by device A are F_A = {f_1, f_2, ..., f_N}, and device B owns the features F_B = {f_{N+1}, f_{N+2}, ..., f_{N+M}}, where f_N denotes the N-th feature and f_{N+M} denotes the (N+M)-th feature.
Data subset A (D_A) with features F_A and data subset B (D_B) with features F_B are merged into a data set D for vertical federated learning. Data set D contains P pieces of data, expressed as D = [d_1, d_2, d_3, ..., d_P]^T. d_p denotes the p-th piece of data (d_p is any piece of data in D, and p is any positive integer less than or equal to P); d_p has N+M features and can be written as d_p = [x_p^1, x_p^2, ..., x_p^{N+M}], where x_p^N is the N-th feature of the p-th piece of data and x_p^{N+M} is the (N+M)-th feature of the p-th piece of data. According to features F_A and F_B, each piece of data can be divided into two parts, d_p = [d_p^A, d_p^B], where d_p^A = [x_p^1, ..., x_p^N] is the feature values corresponding to features F_A of the p-th piece of data, and d_p^B = [x_p^{N+1}, ..., x_p^{N+M}] is the feature values corresponding to features F_B.
According to features F_A and F_B, data set D can be divided into two data subsets, data subset D_A and data subset D_B, i.e., D = [D_A, D_B], where data subset D_A is the P pieces of user data with features F_A owned by device A, D_A = [d_1^A, d_2^A, ..., d_P^A]^T, and data subset D_B is the P pieces of user data with features F_B owned by device B, D_B = [d_1^B, d_2^B, ..., d_P^B]^T.
Device A initializes the parameters of model A (W_A), expressed as W_A = [w_1, w_2, ..., w_N]; device B initializes the parameters of model B (W_B) as W_B = [w_{N+1}, ..., w_{N+M}].
Described from the model dimension, device B and device A correspond to models with different parameters, and the parameters of a model correspond one-to-one with the features of the data subset. For example, if data subset D_A owned by device A has N features, device A's model has N parameters. A model in the embodiments of this application means a model that can be solved iteratively using gradient information, where the gradient information is the update value of the model; examples are linear models and neural network models. Taking a simple linear regression model (without considering the vertical federation case) as an example, the model is f(x) = w1*x1 + w2*x2 + ... + wn*xn = y, where y is the output parameter of the model, also called the model's label, w1 to wn are the n parameters of the model, and x1 to xn are the 1st to n-th features of one piece of data. In the vertical federation scenario, different features (values) of the same user are located in two or more apparatuses (two are assumed in the embodiments of this application), so the model parameters are split into the two parts W_A and W_B. In the embodiments of this application, it is assumed that one parameter of the model corresponds to one feature in the data subset.
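To see why the output of such a split linear model decomposes into per-party intermediate results, consider the following toy plaintext example (assuming NumPy; all values are illustrative):

    import numpy as np

    x_A = np.array([1.0, 2.0])             # features held by device A
    x_B = np.array([3.0])                  # features held by device B
    w_A = np.array([0.5, -0.5])            # model A parameters
    w_B = np.array([2.0])                  # model B parameters

    u_A = x_A @ w_A                        # device A's intermediate result
    u_B = x_B @ w_B                        # device B's intermediate result

    # The full linear model over all N+M features is simply the sum.
    y_full = np.concatenate([x_A, x_B]) @ np.concatenate([w_A, w_B])
    assert abs((u_A + u_B) - y_full) < 1e-12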
FIG. 3 shows a machine learning model updating method in a vertical federated learning scenario provided by an embodiment of this application, applicable to both application scenarios of FIG. 2A and FIG. 2B, and including the following steps:
Step 302: The first device generates a first intermediate result according to the first data subset.
The first intermediate result is generated according to the first device's model (i.e., the first model) and the first data subset. The first intermediate result is used, together with the intermediate results generated by the other devices participating in vertical federated learning (for example, the second intermediate result generated by the second device according to the second model and the second data subset), to generate the gradient of the first model. The gradient of the first model may be called the first gradient.
In the embodiment corresponding to FIG. 3, the first device may be device A in FIG. 4 to FIG. 7, or device B in FIG. 4 to FIG. 7; this is not limited in the embodiments of this application.
Step 304: The first device receives the encrypted second intermediate result sent by the second device.
The second intermediate result is generated by the second device according to the second model and the second data subset. The second device homomorphically encrypts the second intermediate result using the second device's public key, or using a common public key generated from the second device's public key and other devices' public keys.
In the embodiments of this application, the second device may be one device or multiple devices; this is not limited.
Step 306: The first device obtains the first gradient of the first model.
Optionally, the first device may generate the first gradient according to the first intermediate result and the second intermediate result. The first device may also obtain, from another apparatus such as the second device, the first gradient generated according to the first intermediate result and the second intermediate result. The second intermediate result used to generate the first gradient is an encrypted intermediate result. Optionally, both the second intermediate result and the first intermediate result used to generate the first gradient are encrypted intermediate results. The gradient is the update vector of the model parameters.
Step 308: The first device updates the first model according to the first gradient.
In the embodiments of this application, since the second intermediate result used to generate the first gradient is an encrypted intermediate result, the first device cannot deduce from the second intermediate result the original data of the second data subset from which it was generated. Data security in the vertical federated learning scenario can therefore be guaranteed.
Steps 400-401: Device A generates the homomorphic-encryption public key A (pk_A) and private key A (sk_A), and sends public key A to device B.
Step 402: Device A groups the data subset A (D_A) to obtain the grouped data subset A (DD_A).
D_A, the data subset A owned by device A, can be viewed as an original two-dimensional matrix in which each row of data corresponds to one user and each column corresponds to one feature; specifically, row i, column j is the j-th feature of the i-th piece of data. Taking the data in Table 1 as an example, data subset A is the private data that the base station, core network element or AF has not sent to the NWDAF entity. Taking the data in Table 2 as an example, data subset A can be the data of the operator service system, with arrears, CALL NUMS, Communication flows, etc., as the features A of data subset A.
DD_A denotes the result of grouping (packing) data subset A. After grouping, all data values in the two-dimensional matrix of data subset A are divided into multiple blocks, where each block represents the values of the same feature of multiple pieces of data (i.e., multiple rows in D_A, for example L pieces of data); that is, a block is a column vector with L rows and 1 column. For example, the first block is the 1st feature of the 1st to L-th pieces of data of device A, and, in general, a block is the n-th feature of the (q-1)*L+1-th to q*L-th pieces of data of device A.
Because the amount of data is P and each block has size L, P is not necessarily divisible by L (i.e., the P pieces of data cannot always be divided exactly into Q blocks of L); the last block may then have fewer than L values. Since the last block must still be filled to L values, the missing entries are zero-padded.
Q is the number of groups, Q = ceil(P/L), where L is the polynomial order. The value of L can be set as needed and is not limited in the embodiments of this application.
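A minimal sketch of this grouping with zero padding (assuming NumPy; the block size and data are illustrative assumptions) follows:

    import numpy as np

    P, N, L = 10, 3, 4                      # records, features, block size (assumed)
    D_A = np.arange(P * N, dtype=float).reshape(P, N)

    Q = -(-P // L)                          # number of groups: ceil(P / L)
    padded = np.zeros((Q * L, N))
    padded[:P] = D_A                        # zero-pad the final, short block

    # DD_A[q, n] is the length-L column of feature n over group q's records.
    DD_A = padded.reshape(Q, L, N).transpose(0, 2, 1)
    print(DD_A.shape)                       # (Q, N, L)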
Steps 400'-401': Device B generates the homomorphic-encryption public key B (pk_B) and private key B (sk_B), and sends public key pk_B to device A. In a homomorphic encryption algorithm, the public key is used for encryption and the private key for decryption.
Step 402': Device B groups the data subset B to obtain the grouped data subset B (DD_B), where the first block is the 1st feature of the 1st to L-th pieces of data in data subset B, corresponding to the (N+1)-th feature of data set D. Data set D includes data subset A (D_A) and data subset B (D_B); data subset A and data subset B correspond to the same users, and data subset A and data subset B have different features. In general, a block of DD_B is the m-th feature of the (q-1)*L+1-th to q*L-th pieces of data of device B, corresponding to the (N+m)-th feature of data set D.
Taking the data in Table 1 as an example, data subset B is the data of the NWDAF entity, with service experience, Buffer size, etc., as the features corresponding to data subset B. Taking the data in Table 2 as an example, data subset B can be the data of the banking service system, with status, age, job, etc., as the features B of data subset B.
It is worth noting that device A and device B use the same polynomial order L for grouping.
Grouping (i.e., packing) means splitting all the data along the feature dimension and dividing each feature into Q groups according to the polynomial order L. Grouping allows the data of one group (packet) to be encrypted simultaneously during subsequent encryption (multiple input, multiple output), which accelerates encryption.
Step 403: Device A determines (or generates) the intermediate result A (U_A) of data subset A using model A (W_A) and data subset A.
For example, U_A = D_A * W_A, meaning that each piece of data of the data subset A owned by device A is multiplied by the parameters W_A of model A. Expressed another way, U_A = [u_A^1, ..., u_A^P]^T, where u_A^1 is the data obtained by multiplying the first piece of data in data subset D_A by the parameters A of model A, and u_A^P is the data obtained by multiplying the P-th piece of data in data subset D_A by the parameters A of model A.
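As a plaintext illustration of this step (assuming NumPy; the shapes are toy values):

    import numpy as np

    P, N = 6, 3
    D_A = np.random.rand(P, N)             # data subset A: P records, N features
    W_A = np.random.rand(N)                # model A parameters, one per feature

    U_A = D_A @ W_A                        # intermediate result A: one value per record
    print(U_A.shape)                       # (6,)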
Step 404: Device A groups the intermediate result A to obtain the grouped intermediate result A (DU_A).
The grouped intermediate result A is divided into Q groups, where the Q-th group may contain zero-padded data. The first group of DU_A is the first group of data of intermediate result A, corresponding to the 1st to L-th entries of intermediate result A; the q-th group corresponds to the L*(q-1)+1-th to L*q-th entries of intermediate result A.
For the last group, if the P entries cannot be divided exactly into Q groups of L, the missing entries of the Q-th group (obtained from dividing P by L) are zero-padded. L is the polynomial order; its value can be set as needed and is not limited in the embodiments of this application.
Step 405: Device A encrypts the grouped intermediate result DU_A using public key A (pk_A) to obtain the encrypted intermediate result A, and sends the encrypted intermediate result A to device B.
The encrypted intermediate result A includes the encrypted intermediate result of each group; for example, the encrypted first group of intermediate results corresponds to the encrypted 1st to L-th entries of device A's intermediate result.
In this embodiment, U_A is an intermediate result of training model A with data subset A. If it were transmitted to device B in plaintext, device B could possibly infer the original data D_A, i.e., data subset A. The intermediate result A therefore needs to be encrypted before transmission. Because device B receives encrypted data, it can compute using the plaintext data of data subset B, or it can encrypt data subset B with public key A before computing the gradient B of model B.
Steps 403'-405': Device B determines (or generates) the intermediate result B (U_B) of data subset B using model B (W_B) and data subset B (D_B), then groups the intermediate result B to obtain the grouped intermediate result B (DU_B). Device B encrypts the grouped intermediate result DU_B using public key pk_B to obtain the encrypted intermediate result B, and sends the encrypted intermediate result B to device A.
U_B = D_B * W_B - Y_B, meaning that each piece of data of the data subset B owned by device B is multiplied by the parameters of model B, and the label Y_B is subtracted from the result. Expressed another way, U_B = [u_B^1, ..., u_B^P]^T, where u_B^2 is the intermediate data obtained by multiplying the second piece of data in data subset D_B by the model B parameters and then subtracting the corresponding label, and u_B^p is the data obtained by multiplying the p-th piece of data in data subset D_B by the model B parameters. Y_B is the labels corresponding to the pieces of data in data subset B; each piece of data in data subset B corresponds to one label, Y_B = [y_1, ..., y_P]^T.
The grouped intermediate result DU_B includes the intermediate result B of each group: the first group of intermediate results B corresponds to the 1st to L-th pieces of data, and the q-th group of intermediate results B corresponds to the (q-1)*L+1-th to q*L-th pieces of data.
Step 406: Device A merges the encrypted intermediate result B with the intermediate result A, to obtain the combined first intermediate result.
Optionally, device A may instead merge the encrypted intermediate result B with an encrypted intermediate result A, where device A encrypts the intermediate result A using public key B.
The combined first intermediate result contains the merged intermediate result of each group. Each group's merged intermediate result includes the encrypted intermediate result B of that group and the unencrypted intermediate result A of the corresponding group; for example, the combined first intermediate result of the q-th group includes the encrypted q-th group of intermediate result B and the unencrypted q-th group of intermediate result A.
Optionally, the combined first intermediate result may instead contain the encrypted intermediate result B and the encrypted intermediate result A, where both intermediate result A and intermediate result B are homomorphically encrypted using public key B.
Step 407: Device A determines (or generates) the encrypted gradient A of model A.
The gradient A includes the update values of the parameters of model A.
In the embodiments of this application, the "encrypted gradient A" need not result from encrypting gradient A itself; rather, the combined first intermediate result from which gradient A is determined contains encrypted data, for example the encrypted data subset A and/or the encrypted data subset B.
The gradient A of model A includes the gradient A corresponding to each parameter of model A, for example the gradient corresponding to the n-th parameter of model A. The gradient corresponding to each parameter is determined (or generated) according to the encrypted intermediate result A and the encrypted intermediate result B (or the unencrypted intermediate result A and the encrypted intermediate result B), together with the feature values of each group for the corresponding feature. For example, the gradient of the n-th feature of the q-th group is obtained by adding the q-th group of intermediate result B to the q-th group of intermediate result A, multiplying by the q-th group's n-th feature values, and averaging; the gradients of the n-th feature of the 1st to Q-th groups are then added to obtain the gradient of the n-th feature of model A. The q-th group's n-th feature values are the n-th feature of the (q-1)*L+1-th to q*L-th pieces of data.
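In plaintext form, the per-feature gradient computation described above can be sketched as follows (assuming NumPy; the shapes and names are illustrative):

    import numpy as np

    Q, L, N = 3, 4, 2
    DU_A = np.random.rand(Q, L)            # grouped intermediate result A
    DU_B = np.random.rand(Q, L)            # grouped intermediate result B
    DD_A = np.random.rand(Q, N, L)         # grouped feature values of data subset A

    # Per group q and feature n: mean over the L records of (U_A + U_B) * x,
    # then sum the per-group contributions over the Q groups.
    residual = DU_A + DU_B                 # shape (Q, L)
    G_A = ((residual[:, None, :] * DD_A).mean(axis=2)).sum(axis=0)
    print(G_A.shape)                       # (N,) one gradient entry per parameter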
Step 408: Device A determines (or generates) the noise A (R_A) of gradient A. The noise A set of gradient A includes the noise A of each parameter of model A (corresponding to each feature of data subset A).
The noise is a random number generated for a feature (one random number may be generated per feature, or device A may generate one random number shared by the features; the embodiments of this application take one random number per feature as an example). For example, the noise A of the second feature is the random number corresponding to the second feature, and the noise A of the n-th feature is the random number corresponding to the n-th feature. The random number corresponding to any feature contains the noise of that feature corresponding to each piece of user data in any one group, or to each piece of data in multiple groups; for example, the noise of the n-th feature corresponding to the second piece of user data in a group, through the noise of the n-th feature corresponding to the L-th piece of user data in the group.
Step 409: Device A obtains the encrypted noise-containing gradient A according to the noise A corresponding to each parameter's gradient and the gradient of the corresponding parameter, and then sends the encrypted noise-containing gradient A to device B.
The encrypted noise-containing gradient A set contains the encrypted gradient A of each parameter; for example, the encrypted gradient A of the first parameter plus the noise A of the first parameter. The noise may be encrypted noise or unencrypted noise.
Step 406': Device B obtains the combined second intermediate result according to the grouped intermediate result B (DU_B) and the encrypted intermediate result A.
In the embodiments of this application, the combined second intermediate result is the intermediate result used to generate the gradient B of model B. The combined second intermediate result contains the unencrypted intermediate result B and the encrypted intermediate result A, or the combined second intermediate result contains the encrypted intermediate result A and the encrypted intermediate result B; the intermediate results contained in the combined second intermediate result (intermediate result B and intermediate result A) are encrypted using the public key A generated by device A.
The combined first intermediate result is the intermediate result used to generate the gradient A of model A. The combined first intermediate result contains the unencrypted intermediate result A and the encrypted intermediate result B, or the combined first intermediate result contains the encrypted intermediate result A and the encrypted intermediate result B; the intermediate results contained in the combined first intermediate result (intermediate result B and/or intermediate result A) are encrypted using the public key B generated by device B.
The combined second intermediate result contains the merged intermediate result of each group. Each group's merged intermediate result contains the encrypted intermediate result A of that group and the unencrypted intermediate result B of that group; for example, the q-th group's combined second intermediate result includes the encrypted q-th group of intermediate result A and the unencrypted q-th group of intermediate result B.
Step 407': Device B determines (or generates) the gradient B of model B.
The gradient B includes the update values of the parameters of model B. The gradient B of model B includes the gradient corresponding to each parameter of model B (i.e., the gradient B corresponding to each feature of model B), for example the gradient B corresponding to the m-th parameter of model B.
Step 408': Device B generates the noise B (R_B) of gradient B. The noise B of gradient B includes the noise of the gradient corresponding to each parameter of model B, for example the noise of the gradient corresponding to the m-th parameter of model B, i.e., the noise of the (N+m)-th feature corresponding to the first piece of user data in a group, and so on.
Step 409': Device B obtains the encrypted noise-containing gradient B according to the noise B of each parameter's gradient and the gradient B of the corresponding parameter, and then sends the encrypted noise-containing gradient B to device A.
Step 410: Device A uses private key A (sk_A) to decrypt the encrypted noise-containing gradient B sent by device B, to obtain the decrypted noise-containing gradient B (DG_BR).
Specifically, device A uses private key A to decrypt the gradient B corresponding to each parameter in the encrypted noise-containing gradient B. The decrypted noise-containing gradient B (DG_BR) contains the noise-containing gradient B corresponding to each parameter of model B; for example, the gradient B of the first parameter of model B plus the noise B corresponding to the first parameter of model B, where the first parameter of model B corresponds to the (N+1)-th feature of the data set.
Steps 411-412: Device A obtains the pre-grouping noise-containing gradient B (G_BR) according to the decrypted noise-containing gradient B (DG_BR), and sends the pre-grouping noise-containing gradient B (G_BR) to device B.
The pre-grouping noise-containing gradient B (G_BR) contains the noise-containing gradient B corresponding to each parameter before grouping; for example, the pre-grouping noise-containing gradient B of the first parameter of model B, where the first parameter of model B corresponds to the (N+1)-th feature of the data set.
Step 410': Device B uses private key B (sk_B) to decrypt the noise-containing gradient A sent by device A, to obtain the decrypted noise-containing gradient A (DG_AR).
Specifically, device B uses private key B (sk_B) to decrypt the gradient A of each parameter in gradient A. The decrypted noise-containing gradient A (DG_AR) contains the noise-containing gradient A corresponding to the parameters of model A; for example, the noise-containing gradient A of the first parameter of model A, i.e., the gradient A of the first parameter plus the noise of the first parameter, where the first parameter of model A corresponds to the first feature of the data set.
Steps 411'-412': Device B obtains the pre-grouping noise-containing gradient A set (G_AR) according to the decrypted gradient A set, and sends the pre-grouping noise-containing gradient A corresponding to each feature (G_AR) to device A. The pre-grouping noise-containing gradient A set G_AR contains the noise-containing gradient A corresponding to each feature before grouping.
Step 413: Device A obtains the denoised decrypted gradient A set G_A according to the decrypted pre-grouping noise-containing gradient A set G_AR corresponding to each feature.
The gradient A set G_A contains the gradient of each feature of the model A parameters, and can be expressed as G_A = [g_A^1, g_A^2, ..., g_A^N], where g_A^2 is the gradient of the second feature.
Step 414: Device A updates model A (W_A) according to the denoised gradient A.
The update of model A can be expressed as W_A = W_A - η*G_A, where η is a preset learning rate; this is not limited in the embodiments of this application.
Step 413': Device B obtains the gradient B (G_B) according to the pre-grouping noise-containing gradient B (G_BR) corresponding to each parameter.
Step 414': Device B updates model B (W_B) according to the gradient B (G_B).
Steps 407 to 414' are repeated until the change to the model parameters is less than a preset value.
In the embodiment corresponding to FIG. 4, device A and device B exchange the encrypted intermediate result B and intermediate result A, use the encrypted intermediate results to generate gradients, and send the gradients to each other after encryption. Encrypted transmission is therefore used throughout the data exchange between device A and device B, ensuring the security of data transmission.
FIG. 5A and FIG. 5B are a flowchart of another model update method according to an embodiment of the present invention, including the following steps:
Steps 500-501: Apparatus A generates a homomorphic-encryption public key A (pk_A) and private key A (sk_A), and sends public key A to apparatus B.
Step 502: Apparatus A groups data subset A (D_A) to obtain the grouped data subset A (DD_A).
For the specific method of this step, refer to the description of step 402; details are not described again in this embodiment of this application.
Step 503: Apparatus A encrypts the grouped data subset A using public key A, to obtain the encrypted data subset ⟦DD_A⟧_A.
The encrypted data subset A includes the data corresponding to each feature of each group, and can be denoted ⟦DD_A⟧_A = (⟦d_{q,n}⟧), where ⟦d_{q,n}⟧ denotes the encrypted data corresponding to the n-th feature of the q-th group.
Step 504: Apparatus A forms, from each parameter A of model A (W_A), a parameter group corresponding to that parameter A.
A parameter A of model A is also called a feature of model A; the parameters of model A correspond one-to-one to the features of data subset A. The parameters A of model A are denoted W_A = (w_1, w_2, ..., w_N), where w_1 is the first parameter (or first feature) of model A; model A has N parameters in total. Forming the parameter group corresponding to each parameter A includes: copying each parameter A L times to form the group corresponding to that parameter A, where L is the polynomial order in FIG. 4. For example, the n-th parameter group is the group corresponding to feature n and includes L copies of the n-th parameter.
Each parameter A of model A is copied L times because each parameter A needs to be multiplied with the grouped data subset A (DD_A). Parameter A is a vector; only after being copied L times can it be turned into matrix form, which facilitates matrix multiplication with DD_A.
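The replication in step 504 can be written in one line with numpy; the parameter values and L are assumptions for the example.

    import numpy as np

    W_A = np.array([0.5, -1.0, 0.25])   # parameter vector of model A (N = 3)
    L = 4                               # copies per parameter (group size)

    # Copy each parameter L times: the resulting (N, L) matrix lines up with
    # the grouped data subset DD_A, enabling a plain matrix product.
    W_groups = np.tile(W_A.reshape(-1, 1), (1, L))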
Step 505: Apparatus A homomorphically encrypts each parameter group A using public key A, to obtain the encrypted parameters ⟦W_A⟧_A.
The encrypted parameters A contain the encrypted parameter group corresponding to each parameter A.
Step 506: Apparatus A sends the encrypted parameters A and the encrypted data subset A to apparatus B.
It should be noted that step 502 may be performed together with step 505.
Step 502': Apparatus B groups data subset B (D_B) to obtain the grouped data subset B (DD_B).
For the specific method of this step, refer to the description of step 402'; details are not described again in this embodiment of this application.
Step 503': Apparatus B groups the labels Y_B of each piece of data in data subset B, to obtain a grouped label set.
Each group of the grouped label set corresponds to L labels. For the method of grouping Y_B, refer to the method of grouping data subset B (D_B); details are not described again in this embodiment of this application.
Step 504': Apparatus B computes the intermediate result B (U_B) of data subset B using model B (W_B) and data subset B (D_B), and then groups intermediate result B to obtain the grouped intermediate result B (DU_B).
For a specific description of how apparatus B obtains the grouped intermediate result B, refer to the descriptions of steps 403'-404'; details are not described again in this embodiment of this application.
Step 507: Apparatus B obtains the encrypted intermediate result ⟦U_A⟧_A from the encrypted parameters ⟦W_A⟧_A and the encrypted data subset ⟦DD_A⟧_A.
For example, the matrix of the encrypted parameters A can be multiplied with the matrix of the encrypted data subset A to obtain the encrypted intermediate result A. The encrypted intermediate result A includes the intermediate result A of each group; each group's intermediate result A is the sum of the intermediate results A of the individual parameters A, and can be denoted ⟦U_A⟧_A = (⟦u^A_1⟧, ..., ⟦u^A_Q⟧).
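Multiplying an encrypted parameter matrix by an encrypted data matrix, as in step 507, requires a scheme that supports ciphertext-ciphertext products (a purely additive scheme such as Paillier does not). A minimal sketch with the TenSEAL CKKS wrapper follows; the context parameters and vector values are illustrative assumptions, not taken from the patent.

    import tenseal as ts

    # CKKS context; these parameters are illustrative only.
    ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
    ctx.global_scale = 2 ** 40
    ctx.generate_galois_keys()

    enc_params = ts.ckks_vector(ctx, [0.5, -1.0, 0.25])  # encrypted parameters A
    enc_data = ts.ckks_vector(ctx, [1.0, 2.0, 3.0])      # encrypted group data

    # Ciphertext-ciphertext elementwise product, then a homomorphic sum over
    # the features: one group's encrypted intermediate result A.
    enc_inter = (enc_params * enc_data).sum()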
Step 508: Apparatus B obtains the merged first intermediate result ⟦DU⟧_A from the grouped intermediate result B (DU_B) and the encrypted intermediate result ⟦U_A⟧_A.
For a detailed description of step 508, refer to step 407'; details are not described again in this embodiment of this application.
Optionally, apparatus B may also homomorphically encrypt the grouped intermediate result B with public key A, and merge it with the encrypted intermediate result A to obtain the merged intermediate result. In this embodiment, the merged intermediate result generated from the encrypted intermediate result A and the encrypted intermediate result B can be used to determine (or generate) the encrypted gradient A of model A and the encrypted gradient B of model B.
Step 509: Apparatus B determines (or generates) the encrypted gradient of model A, ⟦G_A⟧_A. Gradient A includes the update values of each parameter A of model A.
For a detailed description of how apparatus B obtains the encrypted gradient ⟦G_A⟧_A, refer to the description of step 407; details are not described again in this embodiment of this application.
Step 510: Apparatus B determines (or generates) noise A (R_A) of model A. The noise A of model A is the noise A of each parameter A (that is, each feature) of model A, and can be denoted R_A = (r_1, r_2, ..., r_N).
For a detailed description of how apparatus B determines (or generates) noise A (R_A) of model A, refer to the description of step 408; details are not described again in this embodiment of this application.
Step 511: Apparatus B combines the noise A corresponding to each gradient with the gradient A of the corresponding parameter, to obtain the encrypted noise-A-bearing gradient ⟦G_AR⟧_A.
For a detailed description of how apparatus B obtains the encrypted noise-A-bearing gradient ⟦G_AR⟧_A, refer to the description of step 409; details are not described again in this embodiment of this application.
Step 512: Apparatus B determines (or generates) the gradient of model B, ⟦G_B⟧_A. Gradient B includes the update values of each parameter of model B.
For a detailed description of how apparatus B obtains the gradient ⟦G_B⟧_A of model B, refer to the description of step 407'; details are not described again in this embodiment of this application.
Step 513: Apparatus B generates noise B (R_B) of model B. For a detailed description of obtaining noise B (R_B), refer to the description of step 408'; details are not described again in this embodiment of this application.
Step 514: Apparatus B combines the noise B corresponding to each parameter with the gradient B of the corresponding parameter, to obtain the encrypted noise-B-bearing gradient ⟦G_BR⟧_A.
The encrypted noise-B-bearing gradient ⟦G_BR⟧_A contains, for each parameter B of model B, the gradient B with noise B, and can be denoted ⟦G_BR⟧_A = (⟦g_{N+1} + r_{N+1}⟧, ..., ⟦g_{N+M} + r_{N+M}⟧), where ⟦g_{N+m} + r_{N+m}⟧ is the encrypted gradient of the (N+m)-th feature of data set D plus the noise of that feature, or equivalently the encrypted gradient of the m-th parameter of model B plus the noise of the m-th parameter.
Apparatus B then sends the encrypted noise-B-bearing gradient ⟦G_BR⟧_A and the encrypted noise-A-bearing gradient ⟦G_AR⟧_A to apparatus A.
Step 515: After receiving the noise-A-bearing gradient ⟦G_AR⟧_A and the noise-B-bearing gradient ⟦G_BR⟧_A sent by apparatus B, apparatus A uses private key A (sk_A) to decrypt the encrypted noise-A-bearing gradient ⟦G_AR⟧_A, to obtain the decrypted noise-A-bearing gradient A (DG_AR). The decrypted noise-A-bearing gradient A (DG_AR) contains, for each parameter A of model A, the gradient A with noise A. For example, g_1 denotes the gradient A of the first parameter, and r_1 denotes the noise A of the first parameter. For this step, refer to the description of step 410'.
Step 516: Apparatus A obtains the pre-grouping noise-A-bearing gradient A (G_AR) from the decrypted noise-A-bearing gradient A (DG_AR).
The pre-grouping noise-A-bearing gradient A (G_AR) contains, for each parameter A before grouping, the gradient A with noise A, and can be denoted G_AR = (g^{AR}_1, ..., g^{AR}_N), where g^{AR}_n is the pre-grouping noise-A-bearing gradient A of the n-th feature. g^{AR}_n is determined from the noise-A-bearing gradients A of that feature for each piece of data in a group, and can be denoted g^{AR}_n = (1/L)·Σ_{l=1..L} g^{AR}_{n,l}.
That is, the decrypted values are the result of grouping the values of a single feature; the multiple values of the same feature (or parameter) within the same group must be averaged to obtain the gradient of the corresponding parameter.
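A one-line numpy illustration of the averaging just described, with assumed values: the L decrypted copies of one feature's noisy gradient inside a group are averaged into a single entry.

    import numpy as np

    L = 4
    # Decrypted, still-grouped noisy gradient values of one feature (assumed):
    g_grouped = np.array([0.41, 0.39, 0.44, 0.40])

    # Averaging the L copies of the same feature within a group yields the
    # single pre-grouping gradient entry for that parameter.
    g_n = g_grouped.mean()   # == (1/L) * sum over the group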
Step 517: Apparatus A updates model A based on the pre-grouping noise-A-bearing gradient A corresponding to each parameter: WR_A = W_A - η·G_AR.
In this step, the update of model A carries noise A. Because the value of noise A is not available on apparatus A's side, the update of model A obtained in this step is produced by the noise-bearing gradient A, so the parameters of the updated model A are not yet the target model.
Step 518: Apparatus A obtains, from the updated model A (WR_A), the noise-A-bearing parameters A of the updated model A.
Step 519: Apparatus A homomorphically encrypts the noise-A-bearing parameters A of the updated model A using public key A, to obtain the encrypted noise-A-bearing parameters ⟦WR_A⟧_A.
Step 520: Apparatus A uses private key A (sk_A) to decrypt the noise-B-bearing gradient ⟦G_BR⟧_A sent by apparatus B, to obtain the decrypted noise-B-bearing gradient B (DG_BR). For step 520, refer to the detailed description of step 410; details are not described again in this embodiment of this application.
Step 521: Apparatus A obtains the pre-grouping noise-B-bearing gradient B (G_BR) from the decrypted noise-B-bearing gradient B.
The pre-grouping noise-B-bearing gradient B (G_BR) contains, for each parameter before grouping, the gradient B with noise B, and can be denoted G_BR = (g^{BR}_{N+1}, ..., g^{BR}_{N+M}), where g^{BR}_{N+1} is the pre-grouping noise-bearing gradient corresponding to the (N+1)-th feature.
Step 522: Apparatus A sends the pre-grouping noise-B-bearing gradient B set G_BR and the encrypted noise-A-bearing updated parameters ⟦WR_A⟧_A to apparatus B.
It should be noted that apparatus A may send G_BR and ⟦WR_A⟧_A to apparatus B separately, or send them together.
In addition, there is no temporal order between apparatus A performing steps 520-521 and steps 515-516.
Step 523: Apparatus B removes, based on the stored noise A of each gradient A, the noise A in the encrypted noise-A-bearing updated parameters ⟦WR_A⟧_A, to obtain the encrypted updated parameters A. The encrypted updated parameters A contain each encrypted updated parameter A, and can be denoted ⟦W_A⟧_A = (⟦w_1⟧, ..., ⟦w_N⟧).
Step 524: Apparatus B sends the encrypted updated parameters ⟦W_A⟧_A to apparatus A.
Step 525: Apparatus A decrypts the encrypted updated parameters ⟦W_A⟧_A using private key A, to obtain the updated parameters W_A of model A.
Step 526: Apparatus B removes, based on the stored noise B, the noise B in the noise-B-bearing gradient B (G_BR), to obtain the gradient B set. The gradient B set can be denoted G_B = (g_{N+1}, ..., g_{N+M}).
Step 527: Apparatus B updates model B (W_B) based on gradient B (G_B). The update of model B can be denoted W_B = W_B - η·G_B, where η is a preset learning rate. This is not limited in this embodiment of this application.
Steps 504' to 527 are repeated until the change to the model parameters is less than a preset value.
In this embodiment of this application, apparatus A groups and encrypts its data subset; apparatus B computes gradient B and gradient A; apparatus A decrypts gradient B and gradient A; and finally apparatus B removes the noise from the decrypted gradient B and gradient A, and then updates model B and model A respectively based on the denoised gradients. In this embodiment, the transmitted gradients are not only encrypted but also carry noise, which makes it harder to recover the peer's original data from the gradients, thereby improving data security for both parties.
FIG. 6A and FIG. 6B show the method flow of still another embodiment of updating model parameters according to an embodiment of this application. In this embodiment, a third party performs the computation on the encrypted data. This method embodiment includes the following steps:
Step 601: Apparatus A generates a homomorphic-encryption public key A (pk_A) and private key A (sk_A).
Steps 602-603: Apparatus A groups data subset A (D_A) to obtain the grouped data subset A (DD_A), and encrypts the grouped data subset A using public key A, to obtain the encrypted data subset ⟦DD_A⟧_A.
For detailed descriptions of steps 602-603, refer to the descriptions of steps 502 and 503; details are not described again in this embodiment of this application.
Step 604: Apparatus A sends public key A (pk_A) and the encrypted data subset ⟦DD_A⟧_A to apparatus C.
Step 605: Apparatus A forms, from each parameter A of model A (W_A), the parameter group A corresponding to that parameter A, and then homomorphically encrypts each parameter group A using public key A, to obtain the encrypted parameter groups ⟦W_A⟧_A.
For a detailed description of step 605, refer to the detailed descriptions of steps 504 to 505; details are not described again in this embodiment of this application.
Step 606: Apparatus A sends the encrypted parameter set A ⟦W_A⟧_A to apparatus C.
Optionally, apparatus A may also skip forming parameter groups, and instead encrypt each parameter A of model A and send it to apparatus C.
It should be noted that step 604 and step 606 may be performed together.
Step 601': Apparatus B groups data subset B (D_B) to obtain the grouped data subset B (DD_B).
For the specific method of this step, refer to the description of step 402'; details are not described again in this embodiment of this application.
Step 602': Apparatus B groups the labels Y_B of each piece of data in data subset B, to obtain grouped labels.
Each group of the grouped labels corresponds to L labels. For the method of grouping Y_B, refer to the method of grouping D_B; details are not described again in this embodiment of this application.
Step 607: Apparatus C obtains the encrypted intermediate result ⟦U_A⟧_A from the encrypted parameters ⟦W_A⟧_A and the encrypted data subset ⟦DD_A⟧_A.
For a detailed description of step 607, refer to the detailed description of step 507; details are not described again in this embodiment of this application.
Step 608: Apparatus C sends the encrypted intermediate result ⟦U_A⟧_A to apparatus B.
Step 609: Apparatus B determines (or generates) the intermediate result B (U_B) of data subset B using model B (W_B), data subset B (D_B), and the grouped labels, and then groups intermediate result B to obtain the grouped intermediate result B (DU_B).
For a specific description of how apparatus B obtains the grouped intermediate result B, refer to the descriptions of steps 403'-404'; details are not described again in this embodiment of this application.
Step 610: Apparatus B obtains the merged first intermediate result ⟦DU⟧_A from the grouped intermediate result B (DU_B) and the encrypted intermediate result ⟦U_A⟧_A.
For a detailed description of step 610, refer to the detailed description of step 406'; details are not described again in this embodiment of this application.
Optionally, apparatus B may also homomorphically encrypt the grouped intermediate result B using public key A to obtain the encrypted intermediate result B, and merge the encrypted result B with the encrypted intermediate result A to obtain the merged first intermediate result. If apparatus B needs to encrypt the grouped intermediate result B with public key A, apparatus B must first obtain public key A.
Step 611: Apparatus B computes the gradient ⟦G_B⟧_A of model B and generates noise B (R_B) of model B; the noise B of model B includes the noise B of each parameter of model B. Apparatus B then combines the noise B corresponding to each parameter B with the gradient B of the corresponding feature, to obtain the encrypted noise-B-bearing gradient ⟦G_BR⟧_A, and sends the encrypted noise-B-bearing gradient ⟦G_BR⟧_A to apparatus A.
For a detailed description of step 611, refer to the detailed descriptions of steps 407'-409'; details are not described again in this embodiment of this application.
Step 612: Apparatus B sends the merged first intermediate result ⟦DU⟧_A to apparatus C.
In this embodiment of this application, the core computation is placed on the B side and the C side, except that the computation on the B side and the C side is performed over encrypted data: B and C perform ciphertext computation, so the gradient information they compute is ciphertext. Because updating a model requires plaintext model parameters, the computed ciphertext must be sent to the A side for decryption. At the same time, to prevent the A side from obtaining the plaintext gradients, random numbers must be added to the computed gradients, ensuring that even if party A decrypts them, it cannot obtain the true gradients.
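The mask-then-decrypt round trip described here can be sketched end to end with python-paillier; the gradient value and mask are illustrative, and encryption is done directly rather than via ciphertext computation for brevity.

    from phe import paillier

    # Apparatus A's key pair; B and C only produce ciphertexts under pk_A.
    pk_A, sk_A = paillier.generate_paillier_keypair()

    true_grad = 0.27                 # illustrative gradient value
    enc_grad = pk_A.encrypt(true_grad)

    r = 3.14                         # random mask kept secret on the B/C side
    enc_masked = enc_grad + r        # Enc(g + r): masking happens in ciphertext

    # A side: decryption reveals only the masked value g + r, never g itself.
    masked_plain = sk_A.decrypt(enc_masked)

    # Back on the B/C side: subtracting the stored mask recovers the gradient.
    recovered = masked_plain - r
    assert abs(recovered - true_grad) < 1e-9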
Step 613: Apparatus B sends the encrypted noise-B-bearing gradient ⟦G_BR⟧_A to apparatus A.
Step 614: Apparatus C determines (or generates) gradient A of model A based on the merged first intermediate result and the encrypted parameters A of model A.
Step 614 is the same as step 509; details are not described again in this embodiment of this application.
Step 615: Apparatus C determines (or generates) noise A (R_A) of gradient A. The noise A of gradient A includes the noise A corresponding to each parameter (that is, each feature) of model A, and can be denoted R_A = (r_1, ..., r_N).
For a detailed description of how apparatus C determines (or generates) noise A (R_A) of model A, refer to the description of step 408; details are not described again in this embodiment of this application.
Step 616: Apparatus C combines the noise A corresponding to each parameter A with the gradient A of the corresponding parameter, performing homomorphic encryption with public key A, to obtain the encrypted noise-A-bearing gradient ⟦G_AR⟧_A.
For a detailed description of how apparatus C obtains the encrypted noise-A-bearing gradient ⟦G_AR⟧_A, refer to the description of step 409; details are not described again in this embodiment of this application.
Step 617: Apparatus C sends the encrypted noise-A-bearing gradient ⟦G_AR⟧_A to apparatus A.
Step 618: After receiving the encrypted noise-A-bearing gradient ⟦G_AR⟧_A sent by apparatus C, apparatus A uses private key A (sk_A) to decrypt it, to obtain the decrypted noise-A-bearing gradient A (DG_AR). The decrypted noise-A-bearing gradient A (DG_AR) contains, for each parameter A of model A, the gradient A with noise A. For example, g_1 denotes the gradient A of the first parameter A, and r_1 denotes the noise A of the first gradient A. For this step, refer to the description of step 410'.
Step 619: Apparatus A obtains the pre-grouping noise-A-bearing gradient A (G_AR) from the decrypted noise-A-bearing gradient A (DG_AR). For a detailed description of this step, refer to the description of step 516; details are not described again in this embodiment of this application.
Step 620: Apparatus A uses private key A (sk_A) to decrypt the encrypted noise-B-bearing gradient ⟦G_BR⟧_A, to obtain the decrypted noise-B-bearing gradient B (DG_BR).
Step 621: Apparatus A obtains the pre-grouping noise-B-bearing gradient B (G_BR) from the decrypted noise-B-bearing gradient B (DG_BR). For a detailed description of this step, refer to the description of step 521; details are not described again in this embodiment of this application.
Step 622: Apparatus A sends the pre-grouping noise-A-bearing gradient A to apparatus C.
Step 623: Apparatus A sends the pre-grouping noise-B-bearing gradient B to apparatus B.
Step 624: Apparatus C removes, based on the stored noise A of each gradient, the noise A in the noise-A-bearing gradient A, to obtain gradient A (G_A).
Step 625: Apparatus C updates model A based on the gradient A corresponding to each pre-grouping parameter: W_A = W_A - η·G_A.
Step 626: Apparatus B removes, based on the stored noise B corresponding to each parameter B of model B, the noise B in the noise-B-bearing gradient B, to obtain gradient B (G_B).
Step 627: Apparatus B updates model B based on the gradient B corresponding to each pre-grouping parameter: W_B = W_B - η·G_B.
Steps 610 to 627 are repeated until the change to the model parameters is less than a preset value.
Through this embodiment of this application, placing some computation steps on apparatus C can reduce the computation on apparatus B. In addition, because the interactions among apparatus A, apparatus C, and apparatus B carry grouped and encrypted data, or noise-bearing model gradients, data security can also be ensured.
FIG. 7 is a flowchart of another model update method according to an embodiment of the present invention. In this embodiment, different features (values) of the same user's data are located on multiple apparatuses (three are assumed in this embodiment of this application), but only one apparatus's data contains labels. The model of the vertical federated scenario then includes two or more models A (W_A1 and W_A2; if there are H apparatuses A, there are H models W_AH) and a model B (W_B). The parameters of a model A (W_A) can be denoted W_A = (w^A_1, ..., w^A_N), and the parameters of model B (W_B) can be denoted W_B = (w^B_1, ..., w^B_M).
Different models A have different parameters. In this embodiment of this application, it is assumed that one parameter of a model corresponds to one feature of the data subset.
Different from the embodiment corresponding to FIG. 4, in the encryption phase each apparatus uses the public keys generated by all apparatuses (including public key A1 generated by apparatus A-1, public key A2 generated by apparatus A-2, and public key B generated by apparatus B) to generate a common public key, and each apparatus encrypts its own data subset with the common public key. Each apparatus then sends to the other apparatuses its encrypted data subset, the encrypted intermediate result and encrypted gradient generated from the encrypted data subset, and/or the noise contained in the encrypted gradient. For example, apparatus A1 may broadcast the encrypted data subset, intermediate result, gradient, and/or noise to apparatus B or A2; or apparatus A1 may separately send the encrypted data subset D_A1, the intermediate result DU_A1, and/or noise A1 to apparatus B or apparatus A2.
For simplicity of description, all apparatuses participate in the training of the vertical federated model, but only one apparatus is allowed to hold labeled data (in this embodiment of this application, apparatus B's data is the labeled data), while the data of the other apparatuses is unlabeled. Assume that H apparatuses in total hold unlabeled data; the apparatuses holding unlabeled data can be denoted A1-AH and are collectively referred to as apparatus A.
In this embodiment of this application, the data subset with labels is called data subset B, and the apparatus storing data subset B is called apparatus B. The other apparatuses storing unlabeled data are called apparatus A. This embodiment of this application has two or more apparatuses A.
As shown in FIG. 7, this embodiment of this application includes the following steps:
Step 701: Each apparatus generates a homomorphic-encryption public key and private key, and sends the public key it generated to the other apparatuses. Each apparatus then generates a common public key from the public key it generated and the public keys received from the other apparatuses.
Taking apparatus A1 as an example, apparatus A1 generates a homomorphic-encryption public key A1 (pk_A1) and private key A1 (sk_A1), receives public key B (pk_B) sent by apparatus B and public key A2 (pk_A2) sent by apparatus A2, and sends public key A1 to apparatus B and apparatus A2 respectively.
Apparatus A1 generates the common public key pk_All from public key A1, public key A2, and public key B.
Apparatus B and apparatus A2 also perform the same steps as apparatus A1; details are not described again in this embodiment of this application.
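The patent does not fix how the common public key is derived from pk_A1, pk_A2, and pk_B. As one possibility, in ElGamal-style schemes the common key is simply the product of the parties' public keys, so that the matching secret is the sum of their secrets. A toy sketch under that assumption (with insecure toy parameters):

    import random

    # Toy ElGamal-style group; NOT secure parameter choices.
    p = 2 ** 127 - 1      # a Mersenne prime
    g = 3

    secrets = [random.randrange(2, p - 1) for _ in range(3)]  # sk_A1, sk_A2, sk_B
    pubkeys = [pow(g, s, p) for s in secrets]                 # pk_A1, pk_A2, pk_B

    # Common public key: the product of the individual public keys, whose
    # matching secret is the (never materialised) sum of all parties' secrets.
    pk_all = 1
    for pk in pubkeys:
        pk_all = (pk_all * pk) % p

    assert pk_all == pow(g, sum(secrets) % (p - 1), p)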
Step 702: Each apparatus determines (or generates) the intermediate result of its data subset using its own data subset and its own model.
It should be noted that for a detailed description of step 702, refer to the detailed description of step 403; details are not described again in this embodiment of this application.
Step 703: Each apparatus encrypts its own intermediate result using the common public key, and sends the encrypted intermediate result to the other apparatuses.
Taking apparatus A1 as an example, apparatus A1 encrypts intermediate result A1 using the common public key, and sends the encrypted intermediate result ⟦U_A1⟧_All to apparatus B and apparatus A2.
Taking apparatus B as an example, apparatus B encrypts intermediate result B using the common public key, and sends the encrypted intermediate result ⟦U_B⟧_All to apparatus A1 and apparatus A2.
Taking apparatus A2 as an example, apparatus A2 encrypts intermediate result A2 (U_A2) using the common public key, and sends the encrypted intermediate result ⟦U_A2⟧_All to apparatus A1 and apparatus B.
In this embodiment, the intermediate results used during the training of each model are produced from each apparatus's data subset and each apparatus's model. For example, intermediate result A1 is determined (or generated) from model A1 and data subset A1; intermediate result A2 is determined (or generated) from model A2 and data subset A2; intermediate result B is determined (or generated) from model B and data subset B. In this embodiment of this application, the intermediate results are encrypted with the common public key before being sent to the other apparatuses, which prevents an untrusted third party from deriving the data from the intermediate results, thereby ensuring data security.
Step 704: Each apparatus generates the merged intermediate result from the encrypted intermediate result it determined (or generated) and the encrypted intermediate results received from the other apparatuses.
As an example, the merged intermediate result is: ⟦U⟧_All = ⟦U_A1⟧_All + ⟦U_A2⟧_All + ⟦U_B⟧_All.
Step 705: Each apparatus computes the gradient of its own model from the merged intermediate result.
Taking apparatus A1 as an example, the gradient of model A1 includes the gradient corresponding to each parameter of model A1, and can be denoted ⟦G_A1⟧_All = (⟦g_1⟧, ..., ⟦g_N⟧), where ⟦g_n⟧ is the gradient corresponding to the n-th feature of model A1, computed from the merged intermediate result and x_n, the data corresponding to the n-th feature in data subset A1.
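Because x_n is plaintext on apparatus A1's side while the merged intermediate result is a ciphertext, computing ⟦g_n⟧ only needs ciphertext-times-plaintext multiplication, which additively homomorphic schemes support. A sketch with python-paillier, using a single illustrative U value (in the protocol the intermediate result differs per data row):

    from phe import paillier

    pk_all, sk_all = paillier.generate_paillier_keypair()

    # Merged intermediate result, available to A1 only as a ciphertext.
    enc_U = pk_all.encrypt(0.8)      # single illustrative value

    # Plaintext data of the n-th feature in data subset A1.
    x_n = [1.0, -2.0, 0.5]

    # Ciphertext * plaintext scalar yields Enc(U * x), so A1 computes each
    # encrypted gradient entry without decrypting U.
    enc_g_n = [enc_U * x for x in x_n]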
Step 706: Each apparatus sends the corresponding gradient to the other apparatuses, and receives the other apparatuses' decryption results for the gradient. Each apparatus then updates its own model using the decrypted gradient.
As an option, step 706 may use sequential decryption, as follows:
Taking apparatus A1 as an example, apparatus A1 sends the gradient ⟦G_A1⟧_All to apparatus B or A2; after receiving the gradient decrypted by apparatus B or apparatus A2, it sends the gradient decrypted by apparatus B or apparatus A2 on to apparatus A2 or apparatus B, until all apparatuses have decrypted the gradient.
Apparatus B or apparatus A2 decrypts the gradient ⟦G_A1⟧_All using its own private key.
As another option, step 706 may use separate decryption, as follows:
Taking apparatus A1 as an example, apparatus A1 sends the gradient ⟦G_A1⟧_All to apparatus B and apparatus A2 respectively; after receiving the gradients decrypted by apparatus B and apparatus A2, it combines the gradients decrypted by apparatus B and apparatus A2 to obtain the final decryption result.
Apparatus B and apparatus A2 decrypt the gradient ⟦G_A1⟧_All using their own private keys.
Taking apparatus A1 as an example, for updating model A1, refer to the description of step 414.
As an option, in step 706 each apparatus may also skip sending its gradient to the other apparatuses for decryption, and instead directly update its model using the encrypted gradient.
When the model gradients are updated in the ciphertext state, after several rounds of model updates, a decryption pass over the encrypted parameters may optionally be performed to calibrate the model parameters in the ciphertext state. Calibrating either party's model parameters requires a proxy party responsible for the calibration of that party's encrypted parameters. This operation can optionally be implemented in the following two ways:
In the first implementation, the parameter parties decrypt separately. Taking apparatus A1 as an example, the party to be calibrated, A1, adds noise to its encrypted model parameters and sends them to the proxy party B. The proxy party sends the noise-bearing encrypted model parameters independently to each of the other participants and receives the decryption results returned by each party; at the same time, the proxy party likewise decrypts the noise-bearing encrypted model parameters. The proxy party combines the multi-party decryption results into the plaintext noise-bearing model parameters, encrypts them with the combined public key, and feeds them back to apparatus A1. Apparatus A1 removes the noise from the returned encrypted model parameters in ciphertext, obtaining the calibrated encrypted model parameters.
In the second implementation, the parameter parties decrypt in turn. Taking apparatus A1 as an example, the party to be calibrated, A1, adds noise R1 (in ciphertext) to its encrypted model parameters and sends them to the proxy party B. After the proxy party adds noise RB (in ciphertext), it sends them on in turn to the other participants; each party in this loop adds its noise in turn (in ciphertext), and the result finally returns to the proxy party B. The proxy party sends the encrypted model parameters carrying every party's noise to each party (including A1 and B itself); each party decrypts separately and returns the result to the proxy party B, so that party B obtains the plaintext model parameters carrying every party's noise. Party B then encrypts them with the combined key and calls each party except A1 in turn to remove its noise (in the ciphertext state), finally returning the result to A1. A1 removes noise R1 in the ciphertext state, obtaining the calibrated encrypted model parameters.
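The chained decryption above can be pictured with a toy model in which the common secret is additively shared and each party strips only its own share as the value passes through. This stand-in uses plain additive masks rather than a real threshold scheme, which the patent does not specify.

    # Toy stand-in for chained decryption: the common secret is additively
    # shared, and the value is only fully opened after every party has
    # removed its own share.
    shares = [11, 23, 42]                  # per-party secret shares (A1, A2, B)
    plaintext = 7
    ciphertext = plaintext + sum(shares)   # stand-in for encryption under pk_all

    for s in shares:                       # ciphertext forwarded party to party
        ciphertext -= s                    # each party strips only its share

    assert ciphertext == plaintext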
Compared with the embodiment corresponding to FIG. 4, in the embodiment corresponding to FIG. 7 three or more apparatuses participate in vertical federated learning; each apparatus encrypts its intermediate result with the common public key, and the apparatuses use their own private keys to decrypt the gradients generated by the other apparatuses. In the vertical federated learning scenario this both ensures data security and, because each party's encryption operation is performed only once, reduces the number of interactions and saves network resources.
FIG. 8 shows an apparatus according to an embodiment of this application, including a receiving module 801, a processing module 802, and a sending module 803.
The processing module 802 is configured to generate a first intermediate result based on a first data subset. The receiving module 801 is configured to receive an encrypted second intermediate result sent by a second apparatus, where the second intermediate result is generated based on a second data subset corresponding to the second apparatus. The processing module 802 is further configured to obtain a first gradient of a first model, where the first gradient of the first model is generated based on the first intermediate result and the encrypted second intermediate result; after being decrypted using a second private key, the first gradient of the first model is used to update the first model, and the second private key is a decryption key for homomorphic encryption generated by the second apparatus.
Optionally, the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second apparatus, and the processing module 802 is further configured to generate a first public key and a first private key for homomorphic encryption, and to encrypt the first intermediate result using the first public key.
Optionally, the sending module 803 is configured to send the encrypted first intermediate result.
As another option, the sending module 803 is configured to send an encrypted first data subset and encrypted first parameters of the first model, where the encrypted first data subset and the encrypted first parameters are used to determine (or generate) an encrypted first intermediate result. The receiving module 801 is configured to receive an encrypted first gradient of the first model, where the first gradient of the first model is determined (or generated) based on the encrypted first intermediate result, the encrypted first parameters, and an encrypted second intermediate result. The processing module 802 is configured to decrypt the encrypted first gradient using a first private key, where the decrypted first gradient of the first model is used to update the first model.
As another option, the receiving module 801 is configured to receive an encrypted first intermediate result and an encrypted second intermediate result, and to receive parameters of a first model. The processing module 802 is configured to determine (or generate) a first gradient of the first model based on the encrypted first intermediate result, the encrypted second intermediate result, and the parameters of the first model, decrypt the first gradient, and update the first model based on the decrypted first gradient.
As another option, the modules of the apparatus in FIG. 8 may also be used to perform any step of any apparatus in the method flows of FIG. 3 to FIG. 7. Details are not described again in this embodiment of this application.
As an option, the apparatus may be a chip.
FIG. 9 is a schematic diagram of the hardware structure of an apparatus 90 according to an embodiment of this application. The apparatus may be any of the entities or network elements in FIG. 2A, or any of the apparatuses in FIG. 2B. The apparatus may be any of the apparatuses in FIG. 3 to FIG. 7.
The apparatus shown in FIG. 9 may include: a processor 901, a memory 902, a communication interface 904, an output device 905, an input device 906, and a bus 903. The processor 901, the memory 902, the communication interface 904, the output device 905, and the input device 906 may be connected through the bus 903.
The processor 901 is the control center of the computer device, and may be a general-purpose central processing unit (central processing unit, CPU) or another general-purpose processor, where the general-purpose processor may be a microprocessor or any conventional processor.
As an example, the processor 901 may include one or more CPUs.
The memory 902 may be a read-only memory (read-only memory, ROM) or another type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a magnetic disk storage medium or another magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
In one possible implementation, the memory 902 may exist independently of the processor 901. The memory 902 may be connected to the processor 901 through the bus 903 and is configured to store data, instructions, or program code. When the processor 901 invokes and executes the instructions or program code stored in the memory 902, the machine learning model update method provided in the embodiments of this application can be implemented, for example, the machine learning model update method shown in any one of FIG. 3 to FIG. 7.
In another possible implementation, the memory 902 may also be integrated with the processor 901.
The communication interface 904 is configured to connect the apparatus to other devices through a communication network, where the communication network may be the Ethernet, a radio access network (radio access network, RAN), a wireless local area network (wireless local area networks, WLAN), or the like. The communication interface 904 may include a receiving unit for receiving data and a sending unit for sending data.
The bus 903 may be an industry standard architecture (industry standard architecture, ISA) bus, a peripheral component interconnect (peripheral component interconnect, PCI) bus, an extended industry standard architecture (extended industry standard architecture, EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 9, but this does not mean that there is only one bus or only one type of bus.
It should be noted that the structure shown in FIG. 9 does not constitute a limitation on the computer device 90. In addition to the components shown in FIG. 9, the computer device 90 may include more or fewer components than shown, or combine certain components, or use a different component arrangement.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the methods. To implement the foregoing functions, corresponding hardware structures and/or software modules for performing the functions are included. Those skilled in the art should easily realize that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. Professionals may use different methods for each particular application to implement the described functions, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the machine learning model management apparatus (such as a machine learning model management center or a federated learning server) may be divided into functional modules according to the foregoing method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or may be implemented in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is merely a logical functional division; there may be other divisions in actual implementation.
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded in a machine-readable format on a computer-readable storage medium or encoded on other non-transitory media or articles.
It should be understood that the arrangements described here are for example purposes only. Therefore, those skilled in the art will understand that other arrangements and other elements (for example, machines, interfaces, functions, orders, and groups of functions) can be used instead, and some elements may be omitted altogether depending on the desired results.
In addition, many of the described elements are functional entities that can be implemented as discrete or distributed components, or implemented in conjunction with other components in any suitable combination and position.
In the foregoing embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, it may be wholly or partially implemented in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer-executable instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are wholly or partially generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (digital subscriber line, DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (solid state disk, SSD)), or the like.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (36)

  1. A machine learning model update method, comprising:
    a first apparatus generating a first intermediate result based on a first data subset and a first model;
    the first apparatus receiving an encrypted second intermediate result sent by a second apparatus, wherein the second intermediate result is generated based on a second data subset and a second model corresponding to the second apparatus;
    the first apparatus obtaining a first gradient of the first model, wherein the first gradient is generated based on the first intermediate result and the encrypted second intermediate result; and
    the first gradient, after being decrypted using a second private key, being used to update the first model, wherein the second private key is a decryption key for homomorphic encryption generated by the second apparatus.
  2. The method according to claim 1, wherein the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second apparatus, and the method further comprises:
    the first apparatus generating a first public key and a first private key for homomorphic encryption; and
    the first apparatus encrypting the first intermediate result using the first public key.
  3. The method according to claim 2, wherein the first apparatus sends the encrypted first intermediate result to the second apparatus.
  4. The method according to claim 2, wherein that the first gradient of the first model is determined based on the first intermediate result and the encrypted second intermediate result is specifically: the first gradient of the first model is determined based on the encrypted first intermediate result and the encrypted second intermediate result, and the method further comprises:
    the first apparatus decrypting the first gradient of the first model using the first private key.
  5. The method according to any one of claims 1-4, further comprising:
    the first apparatus generating first noise of the first gradient of the first model;
    the first apparatus sending the first gradient containing the first noise to the second apparatus; and
    the first apparatus receiving the first gradient decrypted using the second private key, wherein the decrypted gradient contains the first noise.
  6. The method according to any one of claims 1-5, further comprising:
    the first apparatus receiving second parameters of the second model sent by the second apparatus;
    the first apparatus determining a second gradient of the second model based on the encrypted first intermediate result, the encrypted second intermediate result, and the second parameter set of the second model; and
    the first apparatus sending the second gradient of the second model to the second apparatus.
  7. The method according to claim 6, further comprising:
    the first apparatus determining second noise of the second gradient;
    wherein the second gradient sent to the second apparatus contains the second noise.
  8. The method according to claim 6 or 7, further comprising:
    the first apparatus receiving updated second parameters containing the second noise, wherein the second parameter set is the parameter set of the second model updated using the second gradient; and
    the first apparatus removing the second noise contained in the updated second parameters.
  9. The method according to any one of claims 1-8, wherein:
    the first apparatus receives at least two second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second apparatuses; and
    the first apparatus generates a common public key for homomorphic encryption based on the received at least two second public keys and the first public key, wherein the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
  10. The method according to claim 9, wherein decrypting the first gradient of the first model using the second private key comprises:
    the first apparatus sending the first gradient of the first model to the at least two second apparatuses in turn, and receiving the first gradient of the first model decrypted by the at least two second apparatuses using their respective second private keys.
  11. The method according to claim 9 or 10, further comprising: the first apparatus decrypting the first gradient of the first model using the first private key.
  12. A machine learning model update method, comprising:
    a first apparatus sending an encrypted first data subset and encrypted first parameters of a first model, wherein the encrypted first data subset and the encrypted first parameters are used to determine an encrypted first intermediate result;
    the first apparatus receiving an encrypted first gradient of the first model, wherein the first gradient of the first model is determined based on the encrypted first intermediate result, the encrypted first parameters, and an encrypted second intermediate result; and
    the first apparatus decrypting the encrypted first gradient using a first private key, wherein the decrypted first gradient of the first model is used to update the first model.
  13. The method according to claim 12, further comprising:
    the first apparatus receiving an encrypted second gradient of a second model, wherein the encrypted second gradient is determined based on the encrypted first intermediate result and the encrypted second intermediate result, the second intermediate result is determined based on a second data subset of a second apparatus and parameters of the second model of the second apparatus, and the encrypted second intermediate result is obtained by the second apparatus homomorphically encrypting the second intermediate result;
    the first apparatus decrypting the second gradient using the first private key; and
    the first apparatus sending the second gradient decrypted with the first private key to the second apparatus, wherein the decrypted second gradient is used to update the second model.
  14. The method according to claim 12 or 13, wherein the first gradient received by the first apparatus contains first noise, the decrypted first gradient contains the first noise, and the parameters of the updated first model contain the first noise.
  15. The method according to any one of claims 12-14, further comprising:
    the first apparatus updating the first model based on the decrypted first gradient; or
    the first apparatus sending the decrypted first gradient.
  16. The method according to any one of claims 12-15, further comprising:
    the first apparatus receiving at least two second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second apparatuses; and
    the first apparatus generating a common public key for homomorphic encryption based on the received at least two second public keys and the first public key, wherein the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
  17. A machine learning model update method, comprising:
    receiving an encrypted first intermediate result and an encrypted second intermediate result, wherein the encrypted first intermediate result is generated based on an encrypted first data subset of a first apparatus and a first model, and the encrypted second intermediate result is generated based on an encrypted second data subset of a second apparatus and a second model;
    receiving parameters of the first model;
    determining a first gradient of the first model based on the encrypted first intermediate result, the encrypted second intermediate result, and the parameters of the first model;
    decrypting the first gradient; and
    updating the first model based on the decrypted first gradient.
  18. The method according to claim 17, wherein the encrypted first intermediate result is obtained by homomorphically encrypting the first intermediate result using a first public key, and the encrypted second intermediate result is obtained by homomorphically encrypting the second intermediate result using the first public key.
  19. The method according to claim 17 or 18, wherein decrypting the first gradient comprises:
    decrypting the first gradient using a first private key.
  20. The method according to claim 19, further comprising:
    sending the first gradient to the first apparatus.
  21. The method according to claim 20, further comprising: obtaining the first public key from the first apparatus, and sending the first public key to the second apparatus.
  22. An apparatus for machine learning model update, comprising a receiving module and a processing module, wherein
    the processing module is configured to generate a first intermediate result based on a first data subset and a first model;
    the receiving module is configured to receive an encrypted second intermediate result sent by a second apparatus, wherein the second intermediate result is generated based on a second data subset and a second model corresponding to the second apparatus; and
    the processing module is further configured to obtain a first gradient of the first model, wherein the first gradient is generated based on the first intermediate result and the encrypted second intermediate result; after being decrypted using a second private key, the first gradient of the first model is used to update the first model, wherein the second private key is a decryption key for homomorphic encryption generated by the second apparatus.
  23. The apparatus according to claim 22, wherein the second intermediate result is encrypted using a second public key for homomorphic encryption generated by the second apparatus, and
    the processing module is further configured to generate a first public key and a first private key for homomorphic encryption, and to encrypt the first intermediate result using the first public key.
  24. The apparatus according to claim 23, further comprising a sending module configured to send the encrypted first intermediate result.
  25. The apparatus according to any one of claims 22-24, wherein
    the processing module is further configured to generate first noise of the first gradient of the first model;
    the sending module is further configured to send the first gradient containing the first noise to the second apparatus; and
    the receiving module is further configured to receive the first gradient decrypted using the second private key, wherein the decrypted gradient contains the first noise.
  26. An apparatus for machine learning model update, comprising a receiving module, a processing module, and a sending module, wherein
    the sending module is configured to send an encrypted first data subset and encrypted first parameters of a first model, wherein the encrypted first data subset and the encrypted first parameters are used to determine an encrypted first intermediate result;
    the receiving module is configured to receive an encrypted first gradient of the first model, wherein the first gradient of the first model is determined based on the encrypted first intermediate result, the encrypted first parameters, and an encrypted second intermediate result; and
    the processing module is configured to decrypt the encrypted first gradient using a first private key, wherein the decrypted first gradient of the first model is used to update the first model.
  27. The apparatus according to claim 26, wherein
    the receiving module is further configured to receive an encrypted second gradient of a second model, wherein the encrypted second gradient is determined based on the encrypted first intermediate result and the encrypted second intermediate result, the second intermediate result is determined based on a second data subset of a second apparatus and parameters of the second model of the second apparatus, and the encrypted second intermediate result is obtained by the second apparatus homomorphically encrypting the second intermediate result;
    the processing module is further configured to decrypt the second gradient using the first private key; and
    the sending module is further configured to send the second gradient decrypted with the first private key, wherein the decrypted second gradient is used to update the second model.
  28. The apparatus according to claim 26 or 27, wherein the first gradient received by the receiving module contains first noise, the decrypted first gradient contains the first noise, and the parameters of the updated first model contain the first noise.
  29. The apparatus according to any one of claims 26-28, wherein
    the processing module is further configured to update the first model based on the decrypted first gradient.
  30. The apparatus according to any one of claims 26-29, wherein
    the receiving module is further configured to receive at least two second public keys for homomorphic encryption, the at least two second public keys being generated by at least two second apparatuses; and
    the processing module is further configured to generate a common public key for homomorphic encryption based on the received at least two second public keys and the first public key, wherein the common public key is used to encrypt the second intermediate result and/or the first intermediate result.
  31. An apparatus for machine learning model update, comprising a receiving module and a processing module, wherein
    the receiving module is configured to receive an encrypted first intermediate result and an encrypted second intermediate result, and to receive parameters of a first model, wherein the encrypted first intermediate result is generated based on an encrypted first data subset of a first apparatus and the first model, and the encrypted second intermediate result is generated based on an encrypted second data subset of a second apparatus and a second model; and
    the processing module is configured to determine a first gradient of the first model based on the encrypted first intermediate result, the encrypted second intermediate result, and the parameters of the first model, decrypt the first gradient, and update the first model based on the decrypted first gradient.
  32. The apparatus according to claim 31, wherein the encrypted first intermediate result is obtained by homomorphically encrypting the first intermediate result using a first public key, and the encrypted second intermediate result is obtained by homomorphically encrypting the second intermediate result using the first public key.
  33. A communication apparatus, comprising:
    a memory storing executable program instructions; and
    a processor coupled to the memory and configured to read and execute the instructions in the memory, so that the communication apparatus implements the method according to any one of claims 1-21.
  34. The apparatus according to claim 33, wherein the apparatus is a terminal, a chip in a terminal, or a network device.
  35. A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1-21.
  36. A computer program product comprising instructions that, when run on a computer, causes the computer to perform the method according to any one of claims 1-21.
PCT/CN2021/112644 2020-12-31 2021-08-14 Machine learning model update method and apparatus WO2022142366A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21913114.1A EP4270266A1 (en) 2020-12-31 2021-08-14 Method and apparatus for updating machine learning model
US18/344,188 US20230342669A1 (en) 2020-12-31 2023-06-29 Machine learning model update method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011635759.9A CN114691167A (zh) 2020-12-31 2020-12-31 机器学习模型更新的方法和装置
CN202011635759.9 2020-12-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/344,188 Continuation US20230342669A1 (en) 2020-12-31 2023-06-29 Machine learning model update method and apparatus

Publications (1)

Publication Number Publication Date
WO2022142366A1 (zh)

Family

ID=82135257

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/112644 WO2022142366A1 (zh) 2020-12-31 2021-08-14 机器学习模型更新的方法和装置

Country Status (4)

Country Link
US (1) US20230342669A1 (zh)
EP (1) EP4270266A1 (zh)
CN (1) CN114691167A (zh)
WO (1) WO2022142366A1 (zh)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220210140A1 (en) * 2020-12-30 2022-06-30 Atb Financial Systems and methods for federated learning on blockchain
CN113033828B (zh) * 2021-04-29 2022-03-22 江苏超流信息技术有限公司 模型训练方法、使用方法、系统、可信节点及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180359084A1 (en) * 2017-06-12 2018-12-13 Microsoft Technology Licensing, Llc Homomorphic factorization encryption
CN109886417A (zh) * 2019-03-01 2019-06-14 深圳前海微众银行股份有限公司 基于联邦学习的模型参数训练方法、装置、设备及介质
CN110190946A (zh) * 2019-07-12 2019-08-30 之江实验室 一种基于同态加密的隐私保护多机构数据分类方法
CN111178538A (zh) * 2019-12-17 2020-05-19 杭州睿信数据科技有限公司 垂直数据的联邦学习方法及装置
CN111340247A (zh) * 2020-02-12 2020-06-26 深圳前海微众银行股份有限公司 纵向联邦学习系统优化方法、设备及可读存储介质


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024067351A1 (zh) * 2022-09-27 2024-04-04 华为技术有限公司 传输数据的方法和相关装置
CN115796305A (zh) * 2023-02-03 2023-03-14 富算科技(上海)有限公司 一种纵向联邦学习的树模型训练方法及装置
CN115796305B (zh) * 2023-02-03 2023-07-07 富算科技(上海)有限公司 一种纵向联邦学习的树模型训练方法及装置

Also Published As

Publication number Publication date
EP4270266A1 (en) 2023-11-01
CN114691167A (zh) 2022-07-01
US20230342669A1 (en) 2023-10-26

Similar Documents

Publication Publication Date Title
WO2022142366A1 (zh) Machine learning model update method and apparatus
Giacomelli et al. Privacy-preserving ridge regression with only linearly-homomorphic encryption
WO2021197037A1 (zh) Method and apparatus for joint data processing by two parties
US11196541B2 (en) Secure machine learning analytics using homomorphic encryption
WO2021068444A1 (zh) Data processing method and apparatus, computer device, and storage medium
CN113239404B (zh) Federated learning method based on differential privacy and chaotic encryption
WO2022247576A1 (zh) Data processing method, apparatus, and device, and computer-readable storage medium
AU2018222992B2 (en) System and method for secure two-party evaluation of utility of sharing data
WO2021092977A1 (zh) Vertical federated learning optimization method, apparatus, device, and storage medium
WO2021159798A1 (zh) Optimization method and device for vertical federated learning system, and readable storage medium
JP2016512611A (ja) Privacy-preserving ridge regression
AU2019448601B2 (en) Privacy preserving oracle
JP5762232B2 (ja) Method and system for selecting an order of encrypted elements while preserving privacy
CN114696990B (zh) Multi-party computation method and system based on fully homomorphic encryption, and related devices
JP7388445B2 (ja) Neural network update method, terminal apparatus, computing apparatus, and program
CN111428887A (zh) Model training control method, apparatus, and system based on multiple computing nodes
WO2021218618A1 (zh) Data processing method, apparatus, system, device, and medium
WO2022213965A1 (zh) Multi-party joint data processing method and apparatus for bandwidth control
CN111555880B (zh) Data collision method and apparatus, storage medium, and electronic device
CN114003950A (zh) Federated machine learning method, apparatus, device, and medium based on secure computation
He et al. Privacy-preserving and low-latency federated learning in edge computing
WO2022143987A1 (zh) Tree model training method, apparatus, and system
JP2023114996A (ja) Correlation coefficient acquisition method, apparatus, electronic device, and storage medium
CN114492850A (zh) Model training method, device, medium, and program product based on federated learning
CN116170142B (zh) Distributed collaborative decryption method, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913114

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021913114

Country of ref document: EP

Effective date: 20230724

NENP Non-entry into the national phase

Ref country code: DE