US20220129580A1 - Methods, apparatuses, and systems for updating service model based on privacy protection - Google Patents


Info

Publication number
US20220129580A1
Authority
US
United States
Prior art keywords
model
service
gradient data
basis
model basis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/511,517
Other languages
English (en)
Inventor
Yilun Lin
Hongjun Yin
Jinming Cui
Chaochao Chen
Li Wang
Jun Zhou
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to US17/512,539 (US11455425B2)
Publication of US20220129580A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 — Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/23 — Updating
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 — Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 — Protecting data
    • G06F 21/62 — Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 — Protecting access to data via a platform to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 — Protecting personal data, e.g. for financial or medical purposes
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06K 9/6256
    • G06N 3/0454
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/044 — Recurrent networks, e.g. Hopfield networks
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks

Definitions

  • One or more embodiments of the specification relate to the field of computer technology, and in particular, to methods, apparatuses, and systems for updating a service model based on privacy protection.
  • Federated learning is a method of joint modeling under the condition of protecting private data.
  • Cooperative security modeling is needed between enterprises, and federated learning can be carried out to train a data processing model cooperatively using the data of all parties while fully protecting the privacy of enterprise data, so that service data can be processed more accurately and effectively.
  • In a federated learning scenario, for example, after all parties agree on a model architecture (or an agreed model), the parties use private data to perform training locally, aggregate model parameters by using a safe and credible method, and finally improve the local model according to the aggregated model parameters.
  • Federated learning on the basis of privacy protection can effectively break through data islands and realize multi-party joint modeling.
  • Graph data is a type of data that describes an association relationship between various entities.
  • each service party may usually hold graph data with different architectures.
  • For example, a first-party bank holds graph data whose nodes correspond to users, lending services, and income, together with the association relationships thereof, and
  • a second-party local life service platform holds graph data whose nodes correspond to users, lending services, and goods or services, together with the association relationships thereof. Because local private data cannot be leaked to each other, it is difficult to train a graph neural network.
  • One or more embodiments of the specification describe methods and apparatuses for updating a service model based on privacy protection, to solve one or more of the problems mentioned in the background.
  • a first aspect provides a method for updating a service model based on privacy protection.
  • the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by a plurality of service parties assisted by a server.
  • the method includes the following:
  • the plurality of service parties determines, through negotiation, a plurality of model bases, where an individual model basis is a parameter cell including a plurality of reference parameters.
  • Each service party constructs a local service model based on a combination of the model bases in a predetermined mode.
  • Each service party processes local training samples by using the local service model, to determine respective gradient data corresponding to each model basis, and sends the respective gradient data to the server.
  • The server fuses, in response to the individual model basis satisfying a gradient update condition, the respective gradient data of the individual model basis to obtain global gradient data corresponding to the individual model basis, and feeds back the global gradient data to each service party.
  • Each service party updates the reference parameters in the corresponding local model basis according to the fused global gradient data, to iteratively train the local service model.
  • the gradient update condition includes at least one of: a quantity of received gradient data of the model basis reaches a predetermined quantity; or an update period has arrived.
  • the server fuses the respective received gradient data of the individual model basis by one of: averaging, weighted averaging, or processing the respective gradient data arranged in a chronological order by using a pre-trained long short-term memory model.
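The averaging and weighted-averaging fusion options above can be sketched as follows. This is an illustrative sketch only: the function name `fuse_gradients` and the example values are assumptions, not from the specification, and each gradient vector stands for one party's gradient data for a single model basis.

```python
import numpy as np

def fuse_gradients(gradients, weights=None):
    """Fuse per-party gradient vectors for one model basis.

    gradients: list of 1-D numpy arrays, one per service party.
    weights:   optional per-party weights; plain averaging if omitted.
    """
    stacked = np.stack(gradients)          # shape: (num_parties, num_params)
    if weights is None:
        return stacked.mean(axis=0)        # plain averaging
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalize the weights to sum to 1
    return w @ stacked                     # weighted averaging

# Two parties report gradients for a model basis with 3 reference parameters.
g1 = np.array([1.0, 2.0, 3.0])
g2 = np.array([3.0, 4.0, 5.0])
print(fuse_gradients([g1, g2]))            # plain average -> [2. 3. 4.]
```

The long short-term-memory option mentioned above would replace the averaging step with a learned sequence model over chronologically ordered gradients.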
  • the service model is a graph neural network
  • each service party holds respective graph data constructed depending on local data
  • the respective graph data are heterogeneous graphs.
  • an individual gradient data of the individual model basis includes gradients respectively corresponding to the reference parameters in the individual model basis.
  • In an embodiment, the predetermined mode includes at least one of: a linear combination mode; or a network architecture search (NAS) mode.
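The linear-combination mode of constructing a local service model from shared model bases can be sketched as follows. The basis shapes, values, and coefficients here are illustrative assumptions, not from the specification; the point is only that every party builds its own layer parameters from the same agreed parameter cells.

```python
import numpy as np

# Three shared model bases (parameter cells) agreed on by all parties.
# Each basis here is a 2x2 matrix of reference parameters.
V1 = np.array([[1.0, 0.0], [0.0, 1.0]])
V2 = np.array([[0.0, 1.0], [1.0, 0.0]])
V3 = np.array([[1.0, 1.0], [1.0, 1.0]])
bases = [V1, V2, V3]

def layer_params(coefficients, bases):
    """Build one layer's parameter matrix as a linear combination of bases."""
    return sum(a * V for a, V in zip(coefficients, bases))

# Each party keeps its own (possibly trainable) coefficients per layer.
W = layer_params([3.0, 4.0, 7.0], bases)
print(W)   # [[10. 11.], [11. 10.]]
```

Because only the bases (and their gradients) are shared, parties with different architectures can still exchange updates at the granularity of a model basis.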
  • a second aspect provides a method for updating a service model based on privacy protection.
  • the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by a plurality of service parties assisted by a server.
  • the method is performed by a first party of the plurality of service parties and includes the following: Local training samples are processed by using a local service model constructed based on a combination of a plurality of model bases in a predetermined mode, to determine respective gradient data corresponding to each model basis, and the respective gradient data are sent to the server for the server to update global gradient data of the individual model basis by using the received plurality of gradient data according to a gradient update condition corresponding to the individual model basis, where the plurality of model bases are determined by the plurality of service parties through negotiation, and the individual model basis is a parameter cell including a plurality of reference parameters.
  • local gradient data of the first model basis are updated according to the global gradient data of the first model basis.
  • the updating local gradient data of the first model basis according to the global gradient data includes replacing the local gradient data of the first model basis with the global gradient data.
  • the updating local gradient data of the first model basis according to the global gradient data includes the following: The global gradient data and the local gradient data of the first model basis are averaged with weighting to obtain weighted gradient data. Local gradient data of a plurality of reference parameters corresponding to the first model basis are updated according to the weighted gradient data.
  • In an embodiment, the predetermined mode includes at least one of: a linear combination mode; or a network architecture search (NAS) mode.
  • the service model is a multi-layer neural network
  • model parameters of a single-layer neural network are a linear combination of the plurality of model bases.
  • Further, the processing local training samples by using a local service model of which the model parameters are determined based on the linear combination of the plurality of model bases, to determine respective gradient data corresponding to each model basis, includes the following: A loss of the service model is determined by comparing a sample label with an output result of the service model for a current training sample. Aiming at minimizing the loss, the gradient data corresponding to each model basis of each single-layer neural network are determined layer by layer, starting from the last layer of the neural network.
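Because each layer's parameters are a linear combination of the bases, the chain rule gives a per-basis gradient as the coefficient-weighted sum of the per-layer gradients. A minimal sketch, with illustrative layer names, coefficients, and gradient values (none of which come from the specification):

```python
import numpy as np

# Per-layer parameter gradients dL/dW_l, obtained by ordinary backprop
# starting from the last layer. Values are illustrative.
layer_grads = {
    "layer2": np.array([[0.2, -0.1], [0.0, 0.4]]),
    "layer1": np.array([[0.1, 0.3], [-0.2, 0.0]]),
}
# Linear coefficients a[layer][basis] used when building W_l = sum_b a_lb * V_b.
coeffs = {
    "layer2": [1.0, 2.0],
    "layer1": [0.5, 0.0],
}

def basis_gradients(layer_grads, coeffs, num_bases):
    """Chain rule: since W_l = sum_b a_lb * V_b, dL/dV_b = sum_l a_lb * dL/dW_l."""
    shape = next(iter(layer_grads.values())).shape
    grads = [np.zeros(shape) for _ in range(num_bases)]
    # Iterate layers from the last to the first, mirroring backprop order.
    for name in ["layer2", "layer1"]:
        for b in range(num_bases):
            grads[b] += coeffs[name][b] * layer_grads[name]
    return grads

gV1, gV2 = basis_gradients(layer_grads, coeffs, num_bases=2)
```

The resulting `gV1` and `gV2` are exactly the "gradient data of a model basis" that a party would send to the server.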
  • the service model is a graph neural network
  • each service party holds respective graph data constructed by local data
  • the respective graph data are heterogeneous graphs.
  • an individual gradient data of the individual model basis includes gradients respectively corresponding to the reference parameters in the individual model basis.
  • A third aspect provides a method for updating a service model based on privacy protection.
  • the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by a plurality of service parties assisted by a server.
  • the method is performed by the server and includes the following: respective gradient data of a first model basis are obtained from each service party, where the first model basis is a parameter cell which is determined by the plurality of service parties through negotiation and includes a plurality of reference parameters, and each service party processes local training samples by using a local service model determined by combining a plurality of model bases including the first model basis in a predetermined mode, to determine the respective gradient data corresponding to the first model basis.
  • the respective received gradient data of the first model basis are fused, to obtain global gradient data of the first model basis.
  • the global gradient data of the first model basis is sent to each service party for each service party to update the reference parameters of the local first model basis based on the global gradient data of the first model basis.
  • the gradient update condition includes at least one of: a quantity of received gradient data of the model basis reaches a predetermined quantity; or an update period has arrived.
  • In an embodiment, the global gradient of the first model basis is obtained by fusing the respective received gradient data of the first model basis by one of: averaging, weighted averaging, or processing the respective gradient data arranged in a chronological order by using a pre-trained long short-term memory model.
  • an individual gradient data of the individual model basis includes gradients respectively corresponding to the reference parameters in the individual model basis.
  • In an embodiment, the predetermined mode includes at least one of: a linear combination mode; or a network architecture search (NAS) mode.
  • a fourth aspect provides a system for updating a service model based on privacy protection.
  • the system includes a plurality of service parties and a server.
  • the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by the plurality of service parties assisted by the server.
  • An individual service party is configured to: negotiate with another service party to determine a plurality of model bases, construct a local service model based on a combination of the model bases in a predetermined mode, and process local training samples by using the local service model, to determine respective gradient data corresponding to each model basis and send the respective gradient data to the server, where the individual model basis is a parameter cell including a plurality of reference parameters.
  • The server is configured to: fuse, in response to the individual model basis satisfying a gradient update condition, the respective gradient data of the individual model basis to obtain global gradient data corresponding to the individual model basis, and feed back the global gradient data to each service party.
  • the individual service party is further configured to: update the reference parameters in the corresponding local model basis according to the fused global gradients, to iteratively train the local service model.
  • a fifth aspect further provides an apparatus for updating a service model based on privacy protection, where the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by a plurality of service parties assisted by a server.
  • The apparatus is disposed on a first party of the plurality of service parties and includes: a gradient determining unit, configured to: process local training samples by using a local service model constructed based on a combination of a plurality of model bases in a predetermined mode, to determine respective gradient data corresponding to each model basis and send the respective gradient data to the server for the server to update global gradient data of the individual model basis by using the received plurality of gradient data according to a gradient update condition corresponding to the individual model basis, where the plurality of model bases are determined by the plurality of service parties through negotiation, and the individual model basis is a parameter cell including a plurality of reference parameters; and a gradient update unit, configured to update, in response to receiving the global gradient data of a first model basis from the server, local gradient data of the first model basis according to the global gradient data of the first model basis.
  • a sixth aspect provides an apparatus for updating a service model based on privacy protection.
  • the service model is used for determining a corresponding service processing result based on processing of related service data and is trained jointly by a plurality of service parties assisted by a server.
  • the apparatus is disposed on the server and includes: a communication unit, configured to obtain respective gradient data of a first model basis from each service party, where the first model basis is a parameter cell which is determined by the plurality of service parties through negotiation and includes a plurality of reference parameters, and each service party processes local training samples by using a local service model determined by combining a plurality of model bases including the first model basis in a predetermined mode, to determine the respective gradient data corresponding to the first model basis; a fusion unit, configured to fuse, in response to that the first model basis satisfies a gradient update condition, the respective received gradient data of the first model basis, to obtain global gradient data of the first model basis.
  • The communication unit is further configured to send the global gradient data of the first model basis to each service party, for each service party to update the reference parameters of the local first model basis based on the global gradient data.
  • a seventh aspect provides a computer-readable storage medium, which stores a computer program, and the computer program enables a computer to perform the methods according to the first aspect, the second aspect, or the third aspect when the computer program is executed in the computer.
  • An eighth aspect provides a computing device, including a memory and a processor.
  • the memory stores executable code
  • the processor when executing the executable code, implements the methods according to the first aspect, the second aspect, or the third aspect.
  • Model bases are set, one model basis can include a plurality of reference parameters, and each service party constructs a local model parameter vector or matrix by using a combination of the model bases in a predetermined mode, so as to construct a local service model.
  • the service parties upload local gradient data at the granularity of a model basis
  • the server updates the global gradient data at the granularity of a model basis
  • the service parties update local model parameters at the granularity of a model basis.
  • The technical solution of locally updating model parameters can effectively break down the barriers to federated learning under heterogeneous graphs. It provides a brand-new machine learning idea, can be extended to various service models, and is not limited to machine learning models that process graph data.
  • FIG. 1 is a schematic diagram of a specific implementation architecture of federated learning.
  • FIG. 2 is a schematic diagram of an implementation framework of a technical concept of the specification.
  • FIG. 3 is an operational sequencing diagram for updating a service model based on privacy protection according to an embodiment.
  • FIG. 4 is a flowchart of a method for processing service data based on privacy protection according to an embodiment.
  • FIG. 5 is a flowchart of a method of processing service data based on privacy protection according to another embodiment.
  • FIG. 6 is a schematic block diagram of a system for updating a service model based on privacy protection according to an embodiment.
  • FIG. 1 shows a specific implementation architecture of federated learning.
  • a service model can be jointly trained by two or more service parties. Each service party can use the trained service model to process a local service. Generally, each service party has data correlation.
  • a service party 1 is a bank providing users with services such as savings and loans, and can hold data such as users' age, gender, income and expenditure flow, credit line, and deposit limit
  • a service party 2 is a P2P platform and can hold data such as users' loan records, investment records, and repayment time limit
  • a service party 3 is a shopping website and holds data such as users' shopping habits, payment habits, and payment accounts.
  • each service party can respectively hold local sample data.
  • the sample data are regarded as private data and are not expected to be known by other parties.
  • each service party trains the service model by using the local sample data to obtain a local gradient of model parameters of the service model.
  • the server can provide assistance for federated learning of the service parties, for example, assisting in non-linear calculation, global model parameter gradient calculation, or the like.
  • the server shown in FIG. 1 is in a form of another party independent of the service parties such as a trusted third party.
  • the server can alternatively be distributed in or includes service parties.
  • the service parties can adopt a secure computing protocol (such as secret sharing) to complete joint aided computing. The specification is not limited to this.
  • the service parties can further obtain gradient data of global model parameters from the server, to update local model parameters.
  • The service model can be a machine learning model such as a graph neural network, RDF2Vec, or the Weisfeiler-Lehman kernels (WL) algorithm.
  • each service party combines pre-constructed graph data, processes relevant training samples by using a corresponding service model, and obtains an output result.
  • a graph neural network which is a generalized neural network based on a graph architecture (that is, graph data).
  • the graph neural network can use underlying graph data as a calculation graph, and determine neural network neurons to generate a node embedding vector by transferring, transforming, and aggregating node feature information on the whole graph.
  • the generated node embedding vector can be used as an input of any differentiable prediction layer, and can be used for node classification or prediction of connection between nodes.
  • a complete graph neural network can be trained in an end-to-end manner.
  • an update method of a layer of graph neural network can be as follows:
  • h_i^{(l+1)} = \sigma\left( \sum_{r \in R} \sum_{j \in N_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)} \right)
  • h represents a node
  • i represents an i th node in graph data
  • l and l+1 represent corresponding layers of the graph neural network
  • W represents a model parameter vector or matrix
  • c is a weight or a normalization factor
  • N represents a set of neighboring nodes
  • R is a set of graph data nodes
  • r represents a service party r
  • σ is an activation function.
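A single propagation step of this kind of layer update can be sketched as follows. The function name, the use of ReLU as σ, and the toy graph are illustrative assumptions; the normalization factor c is taken as the neighbor count, one common choice.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def graph_layer(h, neighbors, W_r, W_0):
    """One propagation step in the style of the layer update above.

    h:         (num_nodes, d) node vectors h_i^(l).
    neighbors: dict mapping relation r -> dict of node i -> neighbor indices.
    W_r:       dict mapping relation r -> (d, d) parameter matrix W_r^(l).
    W_0:       (d, d) self-connection parameters W_0^(l).
    """
    out = h @ W_0.T                              # self term W_0 h_i
    for r, adj in neighbors.items():
        for i, nbrs in adj.items():
            if nbrs:
                c = len(nbrs)                    # normalization factor c_{i,r}
                out[i] += (h[nbrs] @ W_r[r].T).sum(axis=0) / c
    return relu(out)                             # sigma taken as ReLU here

# Tiny toy graph: 3 nodes, one relation, identity parameter matrices.
h = np.eye(3)
neighbors = {"rel": {0: [1, 2], 1: [0], 2: []}}
W = {"rel": np.eye(3)}
h_next = graph_layer(h, neighbors, W, np.eye(3))
```

Each node's new vector thus aggregates transformed neighbor vectors plus its own self-connection term, matching the formula term by term.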
  • the graph data is usually constructed by nodes corresponding to entities one by one and connecting edges between nodes (corresponding to the association relationship between the entities), and the architecture may be complicated.
  • the architecture of the graph data can be quite different.
  • a service party is a payment platform, and in the graph data constructed from the service data the service party holds, the entities corresponding to the nodes can be users, and the association relationship between the nodes can be address book friends, transfer frequencies, or the like.
  • Another service party is a shopping platform
  • the entities corresponding to the nodes can include two types of users, i.e., consumers and merchants, and the association relationship between the nodes can be sharing product links among consumers or among merchants and consumers, consumers buying goods from merchants, or the like.
  • More service parties can further include graph data of more architectures.
  • each service party can construct one or more graph data according to the service data thereof, and these graph data constitute “heterogeneous graphs”. In this case, it is difficult to process the local graph data of each service party through a unified graph neural network.
  • the model basis can be regarded as a basic parameter cell that constitutes a service model (such as a graph neural network).
  • One model basis can include a plurality of reference model parameters.
  • the model basis is shared between each service party, and the service party constructs its own graph neural network locally based on the model basis.
  • a basic architecture of the model basis can be jointly agreed by each service party, for example, a quantity of model bases, dimensions of each model basis, and meanings of each dimension, etc.
  • An activation function, optimization method, and a task type that needs to be predicted in the service model can also be agreed by each service party.
  • The model parameters thereof can be defined by a linear combination of model bases, or can alternatively be determined by combining the model bases in another mode (such as the neural architecture search (NAS) mode).
  • the parameters of each layer of the neural network can be a linear combination of each model basis.
  • the model basis can be in a form of vector or matrix that includes reference parameters. Description is provided below by using a linear combination of model bases as an example.
  • For example, a model parameter W_r^{(l)} can be expressed as W_r^{(l)} = \sum_b a_{rb}^{(l)} V_b, where V_b represents a model basis b, and
  • a_{rb}^{(l)} represents a linear coefficient of the model basis b corresponding to the model parameter of the l-th layer of the graph neural network of the service party r.
  • the coefficient can be predetermined by the service party r, and can alternatively be adjusted as a parameter of the service model in the process of model training, which is not limited herein.
  • model parameters of each layer of the graph neural network of each service party can be represented by a linear combination of a plurality of model bases.
  • each service party trains the graph neural network according to the sample data, and determines a model loss by comparing an output result and a sample label, so as to determine a gradient of parameters involved in each model basis in each layer of the neural network, and sends the gradient to a server.
  • A gradient update condition for a model basis can be predetermined in the server. For example, the gradient data fed back for an individual model basis reaches a predetermined quantity (such as 100), or a predetermined period (such as 10 minutes) has passed.
  • the server can fuse the received gradients of the corresponding model bases to obtain a global gradient. Fusion methods include, but are not limited to, summing, averaging, maximizing, minimizing, long short-term memory neural network unit (LSTM), and the like.
  • The server feeds back the global gradient of the updated model basis to each service party, and each service party updates the gradient of the model parameters of the local service model according to the global gradient, for example, by replacing the gradients of the local model parameters or by weighted summing. Therefore, each service party can update the corresponding model parameter gradient. Upon gradient convergence of the model parameters or another termination condition (such as the quantity of update rounds reaching a predetermined quantity), the process of federated learning ends.
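The two local-update options above, replacing the local gradient or weighted-combining it with the global one, can be sketched as follows. The function name, the `mode` flag, and the `alpha` weight are illustrative assumptions.

```python
import numpy as np

def update_local_gradient(local, global_, mode="replace", alpha=0.5):
    """Update a party's local gradient data for one model basis.

    mode="replace":  adopt the server's global gradient outright.
    mode="weighted": weighted average, with alpha weighting the global gradient.
    """
    if mode == "replace":
        return global_.copy()
    return alpha * global_ + (1.0 - alpha) * local

local = np.array([1.0, 1.0])
global_ = np.array([3.0, 5.0])
print(update_local_gradient(local, global_))                      # [3. 5.]
print(update_local_gradient(local, global_, "weighted", 0.25))    # [1.5 2. ]
```

The weighted mode lets a party retain some of its locally computed gradient, which may matter when its data distribution differs from the federation average.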
  • the idea of localizing the model parameters in the specification breaks through the boundary of heterogeneous graphs among service parties, so that more service data can be used to train the graph neural network of each service party through federated learning on the premise of protecting privacy, without requiring the model architectures of each service party to be consistent.
  • FIG. 3 is a collaborative sequencing diagram of all parties for updating a service model based on privacy protection according to an embodiment.
  • the service parties participating in calculation include a first party and a second party. In practice, there can be more service parties.
  • the first party and the second party can determine a plurality of model bases through negotiation, which are assumed as V 1 , V 2 , and V 3 .
  • Content negotiated by the first party and the second party may, for example, include dimensions of the model bases.
  • each service party can also negotiate an intermediate architecture of the service model, such as activation function setting, optimization method, a task type that needs to be predicted, and the like.
  • the model bases V 1 , V 2 , and V 3 each can include a plurality of model parameters. Generally, different model bases include different reference parameters.
  • the model basis V 1 includes w 1 and w 2
  • the model basis V 2 includes w 3 , w 4 , and w 5
  • the model basis V 3 includes w 6 , w 7 , w 8 , and w 9 , and so on.
  • different model bases can include the same model parameters.
  • V 1 includes w 1 and w 2
  • V 2 includes w 2 , w 3 , and w 4
  • V 3 includes w 1 , w 4 , w 5 , and w 6 , and so on.
  • the first party can construct a local first service model by using V 1 , V 2 , and V 3
  • the second party can construct local second service model by using V 1 , V 2 , and V 3
  • the model parameters of both the first service model and the second service model can be determined by a linear combination of V 1 , V 2 , and V 3 .
  • the linear coefficient can be adjusted in the process of model training, or can be a set value.
  • For example, the model parameter W_1 = a_1 V_1 + a_2 V_2 + a_3 V_3, where a_1, a_2, and a_3 can be set to fixed values, such as 3, 4, and 7, or can be parameters to be adjusted.
  • the model parameters of each layer of the neural network can be determined by the linear combination of each model basis.
  • the model parameters of the first service model and the second service model can be determined by a network architecture search (neural architecture search, NAS) mode.
  • the service data of each service party can be determined according to specific service scenarios.
  • the service data held by each service party can constitute training samples.
  • the service model is a graph neural network based on graph data
  • the service data provided by each service party can further include graph data and training samples.
  • the graph data can include various entities corresponding to service parties in related service, such as questions, tags, documents (or answers) in customer service Q&A service scenarios, and users and financial products in financial risk scenarios.
  • Each node can correspond to a node expression vector.
  • the graph neural network updates the node expression vector of the current node by fusing node expression vectors of neighboring nodes, and the finally determined node expression vector is used as an input vector of a subsequent neural network (such as an activation layer) to obtain a final output result.
  • For example, the expression vector of the node corresponding to the user "Zhang San" in the graph data is processed by the graph neural network, and is then further processed to obtain an output result of, for example, a risk probability of 0.8.
  • After processing its training samples using the local service model, each service party can determine a model loss by comparing the sample labels with the output results of the service model, and determine a gradient of the model parameters in each model basis by using the model loss.
  • the gradient of the reference parameters in the model basis can be denoted as gradient data of the corresponding model basis.
  • the first party determines gradients of reference parameters in V1, V2, and V3, respectively.
  • In the process of a round of iterative training, the first party first calculates the gradient by using the last layer of the neural network, determining the gradients of the reference parameters at the granularity of individual model bases such as V2 or V3, and then determines gradient data of the model bases V1 and V3 by using a previous layer of the neural network. As shown in FIG. 3, after each service party determines the gradient data of a plurality of model bases in one gradient calculation (such as the gradient calculation for one layer of the neural network), the corresponding gradient data can be fed back to the server.
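When a layer's weights are a linear combination W = c1·V1 + c2·V2 + c3·V3, the gradient at the granularity of a model basis follows from dL/dW by the chain rule: dL/dVi = ci·dL/dW. A minimal sketch of this step, with hypothetical coefficients and a random stand-in for dL/dW (real backpropagation is elided):

```python
import numpy as np

rng = np.random.default_rng(1)
coeffs = {"V1": 0.5, "V2": 1.2, "V3": -0.3}   # this party's local coefficients

# Stand-in for the gradient w.r.t. the combined layer weights W, which
# ordinary backpropagation through the layers above would produce.
dL_dW = rng.standard_normal((3, 3))

# W = sum_i c_i * V_i is linear in every basis, so dL/dV_i = c_i * dL/dW.
basis_grads = {name: c * dL_dW for name, c in coeffs.items()}

# Each entry of basis_grads is what the party sends to the server as the
# gradient data of the corresponding model basis.
```

Repeating this per layer, from the last layer backward, yields the per-basis gradient data that the text describes being fed back to the server.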
  • the server can receive the gradient data of the model basis sent by each service party at any time.
  • a gradient update condition of a model basis can further be pre-stored in the server, for example, updating periodically (for example, every minute), updating when an individual model basis has received a predetermined quantity (for example, 100) of new gradient data, or the like.
  • the server can update the global gradient data of the corresponding model basis by using the received gradient data.
  • the global gradient data herein can include the global gradient of each reference parameter in the corresponding model basis.
  • the server fuses the 100 pieces of gradient data and updates the global gradient data of V2 by using the fused result.
  • the server can update each model basis according to the gradient update condition thereof.
  • the gradient update conditions of each model basis can be consistent, for example, updating when a predetermined quantity of gradient data has been received, or updating when a predetermined period (for example, 10 minutes) has passed.
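One way to realize a count-based gradient update condition is to buffer incoming gradients per model basis and fuse them once a threshold is reached. The class and method names below are our own, and arithmetic averaging stands in for the fusion step:

```python
from collections import defaultdict

class BasisGradientServer:
    """Buffers per-basis gradient data and fuses it when the gradient
    update condition (here: a received-count threshold) is satisfied."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.buffers = defaultdict(list)   # model basis name -> gradients
        self.global_grads = {}             # model basis name -> fused gradient

    def receive(self, basis, grad):
        """Accept one gradient; return the fused global gradient data when
        the update condition fires, otherwise None."""
        self.buffers[basis].append(grad)
        if len(self.buffers[basis]) < self.threshold:
            return None
        grads = self.buffers.pop(basis)
        self.global_grads[basis] = sum(grads) / len(grads)  # arithmetic mean
        return self.global_grads[basis]    # would be broadcast to all parties

server = BasisGradientServer(threshold=3)
early = server.receive("V2", 1.0)          # condition not yet satisfied
server.receive("V2", 2.0)
fused = server.receive("V2", 3.0)          # third gradient triggers fusion
```

Because each basis has its own buffer, the update condition fires independently per model basis, matching the text's point that different bases can be updated at different times.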
  • the gradient data can be fused by the server by averaging or weighted summation of the gradients provided by the plurality of service parties.
  • because each piece of gradient data of a model basis received by the server has temporal sequence characteristics, temporal sequence neural networks such as LSTM can be used to process the received gradient data.
  • Each service party can repeat the process of model training, calculating the gradient data of each model basis, and sending the gradient data in units of model basis. Upon receiving updated global gradient data for a model basis from the server, the service party can adjust the local model parameters (the corresponding model bases and the reference parameters they include) according to the updated global gradient data.
  • the first party can update V2 when receiving the global gradient data dV′2 of V2 sent by the server, and continue to repeat the process of model training, calculating gradient data in units of model basis, and sending the gradient data by using the updated model parameters.
  • the first party can update V3 when further receiving the global gradient data dV′3 of V3 sent by the server, and continue to repeat the process of model training, calculating gradient data in units of model basis, and sending the gradient data by using the updated model parameters, until the model training is completed.
  • the local model basis gradient data can be updated by replacing the gradient data of the local model basis with the global gradient, and the reference parameter included in the corresponding model basis can be adjusted according to the updated model basis gradient data.
  • the local model basis gradient data can be updated by weighted averaging the global gradient data and the gradient data of the local model basis, and the corresponding reference parameter can be adjusted according to the updated model basis gradient data.
  • the service party can further use the global gradient data of the model basis to update the gradient data of the local model basis in more ways, and further adjust the corresponding reference parameters, which will not be repeated herein.
  • the service party can update the related model parameters according to a predetermined model parameter update method.
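The two update variants described above — replacing the local gradient with the global one, or weighted-averaging the two — can both be sketched with a single blending parameter. The function name, alpha, and learning rate are illustrative assumptions, not values from the source:

```python
import numpy as np

def update_basis(params, local_grad, global_grad, alpha=0.5, lr=0.1):
    """Blend global and local gradient data for one model basis, then take
    a plain gradient step on its reference parameters.

    alpha=1.0 reproduces the replace-local-with-global variant; values in
    (0, 1) give the weighted-averaging variant described in the text.
    (alpha and lr are illustrative hyperparameters.)
    """
    blended = alpha * global_grad + (1.0 - alpha) * local_grad
    return params - lr * blended, blended

V2 = np.ones((2, 2))               # current reference parameters of basis V2
local_dV2 = np.full((2, 2), 0.2)   # gradient computed locally in the last round
global_dV2 = np.full((2, 2), 0.6)  # global gradient data dV'2 from the server

V2_new, dV2_new = update_basis(V2, local_dV2, global_dV2)
```

A plain gradient step is used here for simplicity; as the text notes, the actual parameter update method can be chosen in advance by the service party.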
  • an effective solution is given to solve the training bottleneck in federated learning based on privacy protection in the case that each service party provides graph data with different architectures (heterogeneous graphs).
  • To localize the model parameters of the service model, the solution proposed above puts forward the concept of a model basis.
  • the model basis is used to represent a parameter cell including a plurality of reference parameters.
  • Each service party constructs a local service model by combining each model basis in a predetermined mode.
  • the gradient data is calculated at the granularity of the model basis, and the server updates the corresponding global gradient data at the granularity of the model basis, so that the service party can update some model parameters (reference parameters in one or more model bases) at a time.
  • This allows each service party, in the process of federated learning, to train a service model that makes full use of global data by using its own graph neural network based on the heterogeneous graphs formed by its own graph data.
  • the solution can be extended to a variety of suitable service models, providing a new idea for federated learning.
  • a process 400 performed by the first party can include the following steps.
  • step 401 local training samples are processed by using a local service model determined based on a combination of a plurality of model bases in a predetermined mode, to determine respective gradient data corresponding to each model basis, and the respective gradient data are sent to a server.
  • Each model basis can be determined by a plurality of service parties through negotiation, and an individual model basis is a parameter cell including a plurality of reference parameters.
  • the server can receive gradient data of an individual model basis sent by each service party and update global gradient data of the individual model basis according to a gradient update condition.
  • the predetermined mode can be, for example, a linear combination mode, a network search mode, or the like.
  • the model parameters of a single-layer neural network can be a combination of a plurality of model bases in a predetermined mode.
  • a loss of the service model is determined by comparing a sample label and an output result of the service model for a current training sample. Further, aiming to minimize the loss, the respective gradients corresponding to each model basis of a single-layer neural network are determined layer by layer, starting from the last layer of the neural network, and sent to the server.
  • each service party holds respective graph data constructed by local data
  • the service model can further be a graph neural network.
  • the graph data held by each service party can be heterogeneous graphs.
  • step 402 in response to receiving the global gradient data of a first model basis from the server, local gradient data of the first model basis is updated according to the global gradient data.
  • the first model basis can be any one of the model bases.
  • the server can independently determine the corresponding global gradient data for each model basis according to the gradient update conditions, so the global gradient data issued by the server at a time may involve all or only some of the model bases.
  • the global gradient data of the model basis currently updated by the server (for example, including the first model basis herein) can be received and the local gradient data of the corresponding model basis can be updated according to the global gradient data thereof.
  • the first party can perform weighted averaging on the global gradient data and the local gradient data of the first model basis to obtain a weighted gradient, and then update the local gradient corresponding to each reference parameter in the first model basis according to the weighted gradient.
  • the first party can alternatively replace the local gradient data of the first model basis with the global gradient data of the first model basis, to update the local gradient data of the first model basis.
  • the first party can alternatively update the local gradient data of the first model basis in another method, which is not limited herein.
  • step 403 reference parameters included in the first model basis are updated according to the updated local gradient data.
  • the method for updating the model parameters can be determined by the first party in advance or negotiated with other service parties in advance.
  • a server is used as an example to describe the process of updating a service model based on privacy protection performed by the server in some embodiments.
  • a process 500 performed by the server can include the following steps.
  • step 501 respective gradient data of the first model basis are obtained from each service party.
  • a plurality of service parties can determine a plurality of parameter cells including a plurality of reference parameters, that is, model bases, through negotiation, and the first model basis can be any one of the plurality of model bases.
  • Each service party can process local training samples by using a local service model determined by combining a plurality of model bases including the first model basis in a predetermined mode, to determine a gradient corresponding to the first model basis, and send the gradient to the server.
  • the predetermined mode can be, for example, a linear combination mode, a network search mode, or the like.
  • step 502 in response to the first model basis satisfying a gradient update condition, the respective received gradient data of the first model basis are fused, to obtain global gradient data of the first model basis.
  • the gradient update condition is a trigger condition for the server to update the global gradient data of an individual model basis, for example, the quantity of received gradient data of the model basis reaches a predetermined quantity, an update period has elapsed, or the like.
  • the server can store a unified gradient update condition for each model basis, and can alternatively set gradient update conditions for each model basis respectively, which is not limited herein.
  • the received gradient data (for example, 100 pieces) can be fused to determine the global gradient data thereof.
  • the fused gradient data can be the gradient data received after the previous global gradient fusion, for example, the 100 newly received pieces of gradient data.
  • the server can fuse the corresponding gradient data to determine the global gradient data thereof based on an arithmetic average of the gradient data for the first model basis.
  • the server can fuse corresponding gradient data to determine the global gradient thereof based on a weighted average of gradient data for the first model basis.
  • the weights can be set in advance or determined according to the receiving time; for example, a weight can be negatively correlated with the distance between the receiving time and the current time.
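A weight that is negatively correlated with the distance between the receiving time and the current time can be realized, for example, with exponential decay. The decay constant, timestamps, and gradient values below are our own illustrative choices:

```python
import math

def fuse_time_weighted(grads, recv_times, now, tau=60.0):
    """Weighted average in which a gradient's weight decays exponentially
    with the distance between its receiving time and the current time."""
    weights = [math.exp(-(now - t) / tau) for t in recv_times]
    return sum(w * g for w, g in zip(weights, grads)) / sum(weights)

# Two gradients for the first model basis: one received a minute ago (1.0)
# and one received just now (3.0); the fresh one dominates the average.
fused = fuse_time_weighted([1.0, 3.0], recv_times=[0.0, 60.0], now=60.0)
```

Scalar gradients are used for brevity; the same weighting applies elementwise to gradient matrices of the reference parameters.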
  • the server can alternatively process the gradient data of the first model basis, arranged in chronological order, by using a pre-trained long short-term memory (LSTM) model. In this way, the temporal correlation of the gradient data can be fully exploited.
  • the server can alternatively fuse the gradient data of the first model basis in other modes, such as taking the last one, taking the maximum value, or the like.
  • step 503 the global gradient data of the first model basis is sent to each service party for each service party to update the reference parameters of the local first model basis based on the global gradient data of the first model basis.
  • FIG. 4 and FIG. 5 respectively correspond to the first party (or the second party) and the server shown in FIG. 3 , and the related description of FIG. 3 is also applicable to the embodiments of FIG. 4 and FIG. 5 , which is not repeated herein.
  • Embodiments of another aspect further provide a system for updating a service model based on privacy protection.
  • the system for updating a service model includes a server and a plurality of service parties.
  • the server can assist the plurality of service parties to jointly train the service model.
  • the server can fuse gradients of model parameters determined by each service party, to determine a global gradient, and distribute the global gradient to each service party.
  • a system 600 includes a plurality of service parties and a server 62 . Any one of the service parties can be denoted as a service party 61 .
  • the service party 61 is configured to: negotiate with another service party to determine a plurality of model bases, construct a local service model based on a combination of model bases in a predetermined mode, and process local training samples by using the local service model, to determine respective gradient data corresponding to each model basis and send the respective gradient data to the server, where the model basis is a parameter cell including a plurality of reference parameters.
  • the predetermined mode can be, for example, a linear combination mode; or a network search mode.
  • the server 62 is configured to: fuse, in response to the individual model basis satisfying a gradient update condition, the respective gradient data of the individual model basis to obtain global gradient data of the individual model basis, and feed back the global gradient data to each service party.
  • the service party 61 is further configured to: update the reference parameters in the corresponding local model basis according to the fused global gradient data, to iteratively train the local service model.
  • the service party 61 can be provided with an apparatus 610 for updating a service model, including the following:
  • a gradient determining unit 611 configured to: process local training samples by using a local service model determined based on a combination of a plurality of model bases in a predetermined mode, to determine respective gradient data corresponding to each model basis and send the respective gradient data to the server for the server to update global gradient data of the individual model basis according to a gradient update condition, where the plurality of model bases are determined by the plurality of service parties through negotiation, and the individual model basis is a parameter cell including a plurality of reference parameters.
  • the predetermined mode can be, for example, a linear combination mode; or a network search mode.
  • a gradient update unit 612 configured to update, in response to receiving the global gradient data of a first model basis from the server, local gradient data of the first model basis according to the global gradient data.
  • a parameter update unit 613 configured to update, according to the updated local gradient data, the reference parameters included in the first model basis.
  • the server 62 can be provided with an apparatus 620 for updating a service model, including the following:
  • a communication unit 621 configured to obtain respective gradient data of a first model basis from each service party, where the first model basis is a parameter cell which is determined by the plurality of service parties through negotiation and includes a plurality of reference parameters, and each service party processes local training samples by using a local service model determined by combining a plurality of model bases including the first model basis in a predetermined mode, to determine the respective gradient data corresponding to the first model basis.
  • the predetermined mode can be, for example, a linear combination mode; a network search mode, or the like.
  • a fusion unit 622 is configured to fuse, in response to that the first model basis satisfies a gradient update condition, the respective gradient data of the first model basis, to obtain global gradient data of the first model basis.
  • the communication unit 621 is further configured to send the global gradient data of the first model basis to each service party, for each service party to update the reference parameters of the local first model basis based on the global gradient data of the first model basis.
  • the system 600, the apparatus 610, and the apparatus 620 shown in FIG. 6 are product embodiments respectively corresponding to the method embodiments shown in FIG. 3, FIG. 4, and FIG. 5, and the corresponding descriptions in the method embodiments shown in FIG. 3, FIG. 4, and FIG. 5 are applicable to the system 600, the apparatus 610, and the apparatus 620, respectively, and will not be repeated herein.
  • An embodiment of another aspect further provides a computer-readable storage medium storing a computer program that, when executed in a computer, enables the computer to perform the method described in combination with FIG. 4 or FIG. 5.
  • An embodiment of another aspect further provides a computing device, including a memory and a processor.
  • the memory stores executable code
  • the processor, when executing the executable code, implements the method described in combination with FIG. 4 or FIG. 5.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioethics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
US17/511,517 2020-10-27 2021-10-26 Methods, apparatuses, and systems for updating service model based on privacy protection Pending US20220129580A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/512,539 US11455425B2 (en) 2020-10-27 2021-10-27 Methods, apparatuses, and systems for updating service model based on privacy protection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011159885.1A 2020-10-27 2020-10-27 Method, apparatus, and system for updating service model based on privacy protection
CN202011159885.1 2020-10-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/512,539 Continuation US11455425B2 (en) 2020-10-27 2021-10-27 Methods, apparatuses, and systems for updating service model based on privacy protection

Publications (1)

Publication Number Publication Date
US20220129580A1 true US20220129580A1 (en) 2022-04-28

Family

ID=73528358

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/511,517 Pending US20220129580A1 (en) 2020-10-27 2021-10-26 Methods, apparatuses, and systems for updating service model based on privacy protection
US17/512,539 Active US11455425B2 (en) 2020-10-27 2021-10-27 Methods, apparatuses, and systems for updating service model based on privacy protection

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/512,539 Active US11455425B2 (en) 2020-10-27 2021-10-27 Methods, apparatuses, and systems for updating service model based on privacy protection

Country Status (2)

Country Link
US (2) US20220129580A1 (zh)
CN (1) CN112015749B (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115134077A (zh) * 2022-06-30 2022-09-30 Information Center of Yunnan Power Grid Co., Ltd. Enterprise power load joint prediction method and system based on horizontal LSTM federated learning

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231742B (zh) * 2020-12-14 2021-06-18 Alipay (Hangzhou) Information Technology Co., Ltd. Model joint training method and apparatus based on privacy protection
CN112396477B (zh) * 2020-12-29 2021-04-06 Alipay (Hangzhou) Information Technology Co., Ltd. Method and apparatus for constructing service prediction model
CN112819177B (zh) * 2021-01-26 2022-07-12 Alipay (Hangzhou) Information Technology Co., Ltd. Personalized privacy protection learning method, apparatus, and device
CN113052333A (zh) * 2021-04-02 2021-06-29 Institute of Computing Technology, Chinese Academy of Sciences Method and system for data analysis based on federated learning
CN113240127A (zh) * 2021-04-07 2021-08-10 Ruifengqun (Beijing) Technology Co., Ltd. Training method, apparatus, electronic device, and storage medium based on federated learning
CN112799708B (zh) * 2021-04-07 2021-07-13 Alipay (Hangzhou) Information Technology Co., Ltd. Method and system for jointly updating service model
CN113052329B (zh) * 2021-04-12 2022-05-27 Alipay (Hangzhou) Information Technology Co., Ltd. Method and apparatus for jointly updating service model
CN113177674A (zh) * 2021-05-28 2021-07-27 Hengan Jiaxin (Beijing) Technology Co., Ltd. Early warning method, apparatus, device, and medium for network fraud
CN114818973A (zh) * 2021-07-15 2022-07-29 Alipay (Hangzhou) Information Technology Co., Ltd. Graph model training method, apparatus, and device based on privacy protection
CN113297396B (zh) * 2021-07-21 2022-05-20 Alipay (Hangzhou) Information Technology Co., Ltd. Model parameter update method, apparatus, and device based on federated learning
WO2023092439A1 (zh) * 2021-11-26 2023-06-01 Huawei Technologies Co., Ltd. Model training method and apparatus
US20230262483A1 (en) * 2022-01-25 2023-08-17 Qualcomm Incorporated Protocol stack for analog communication in split architecture network for machine learning (ml) functions
US20230306926A1 (en) * 2022-03-23 2023-09-28 Samsung Electronics Co., Ltd. Personalized color temperature adaptation for consumer display devices
CN115310121B (zh) * 2022-07-12 2023-04-07 Huazhong Agricultural University Real-time reinforced federated learning data privacy security method based on MePC-F model in Internet of Vehicles
CN115081642B (zh) * 2022-07-19 2022-11-15 Zhejiang University Method and system for multi-party collaborative updating of service prediction model

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10474950B2 (en) * 2015-06-29 2019-11-12 Microsoft Technology Licensing, Llc Training and operation of computational models
US11087880B1 (en) * 2016-04-26 2021-08-10 Express Scripts Strategic Development, Inc. Machine model generation systems and methods
JP6603182B2 (ja) * 2016-07-22 2019-11-06 Fanuc Corporation Machine learning model construction device, numerical control device, machine learning model construction method, machine learning model construction program, and recording medium
US20180157940A1 (en) * 2016-10-10 2018-06-07 Gyrfalcon Technology Inc. Convolution Layers Used Directly For Feature Extraction With A CNN Based Integrated Circuit
CN108122027B (zh) 2016-11-29 2021-01-12 Huawei Technologies Co., Ltd. Neural network model training method, apparatus, and chip
CN108122032B (zh) 2016-11-29 2020-02-14 Huawei Technologies Co., Ltd. Neural network model training method, apparatus, chip, and system
FR3067496B1 (fr) * 2017-06-12 2021-04-30 Inst Mines Telecom Method for learning descriptors for the detection and localization of objects in a video
CN107609461A (zh) * 2017-07-19 2018-01-19 Alibaba Group Holding Limited Model training method, data similarity determining method, apparatus, and device
US11995518B2 (en) * 2017-12-20 2024-05-28 AT&T Intellectual Property I, L.P. Machine learning model understanding as-a-service
US20190213475A1 (en) * 2018-01-10 2019-07-11 Red Hat, Inc. Reducing machine-learning model complexity while maintaining accuracy to improve processing speed
EP4290412A3 (en) * 2018-09-05 2024-01-03 Sartorius Stedim Data Analytics AB Computer-implemented method, computer program product and system for data analysis
WO2020069051A1 (en) * 2018-09-25 2020-04-02 Coalesce, Inc. Model aggregation using model encapsulation of user-directed iterative machine learning
JP6892424B2 (ja) * 2018-10-09 2021-06-23 Preferred Networks, Inc. Hyperparameter tuning method, apparatus, and program
US11263480B2 (en) * 2018-10-25 2022-03-01 The Boeing Company Machine learning model development with interactive model evaluation
WO2020102888A1 (en) * 2018-11-19 2020-05-28 Tandemlaunch Inc. System and method for automated precision configuration for deep neural networks
US20200184272A1 (en) * 2018-12-07 2020-06-11 Astound Ai, Inc. Framework for building and sharing machine learning components
US20200218940A1 (en) * 2019-01-08 2020-07-09 International Business Machines Corporation Creating and managing machine learning models in a shared network environment
US11544535B2 (en) * 2019-03-08 2023-01-03 Adobe Inc. Graph convolutional networks with motif-based attention
US11568645B2 (en) * 2019-03-21 2023-01-31 Samsung Electronics Co., Ltd. Electronic device and controlling method thereof
WO2020197601A1 (en) * 2019-03-26 2020-10-01 Hrl Laboratories, Llc Systems and methods for forecast alerts with programmable human-machine hybrid ensemble learning
US11010938B2 (en) * 2019-04-03 2021-05-18 Uih America, Inc. Systems and methods for positron emission tomography image reconstruction
CN110263265B (zh) * 2019-04-10 2024-05-07 Tencent Technology (Shenzhen) Co., Ltd. User tag generation method, apparatus, storage medium, and computer device
US11704566B2 (en) * 2019-06-20 2023-07-18 Microsoft Technology Licensing, Llc Data sampling for model exploration utilizing a plurality of machine learning models
US20210042590A1 (en) * 2019-08-07 2021-02-11 Xochitz Watts Machine learning system using a stochastic process and method
CN112819019B (zh) * 2019-11-15 2023-06-20 Institute for Information Industry Classification model generation device and classification model generation method thereof
US11599800B2 (en) * 2020-01-28 2023-03-07 Color Genomics, Inc. Systems and methods for enhanced user specific predictions using machine learning techniques
US11379720B2 (en) * 2020-03-20 2022-07-05 Avid Technology, Inc. Adaptive deep learning for efficient media content creation and manipulation
WO2020191282A2 (en) 2020-03-20 2020-09-24 Futurewei Technologies, Inc. System and method for multi-task lifelong learning on personal device with improved user experience
CN113657617A (zh) 2020-04-23 2021-11-16 Alipay (Hangzhou) Information Technology Co., Ltd. Method and system for model joint training
CN111553484B (zh) 2020-04-30 2023-09-08 Tongdun Holdings Co., Ltd. Federated learning method, apparatus, and system
US20210398210A1 (en) * 2020-06-17 2021-12-23 Notto Intellectual Property Holdings Systems and methods of transaction tracking and analysis for near real-time individualized credit scoring
CA3188642A1 (en) * 2020-07-01 2022-01-06 Giant Oak, Inc. Orchestration techniques for adaptive transaction processing
US20220121929A1 (en) * 2020-10-20 2022-04-21 Wipro Limited Optimization of artificial neural network (ann) classification model and training data for appropriate model behavior


Also Published As

Publication number Publication date
US11455425B2 (en) 2022-09-27
CN112015749B (zh) 2021-02-19
CN112015749A (zh) 2020-12-01
US20220129700A1 (en) 2022-04-28

Similar Documents

Publication Publication Date Title
US11455425B2 (en) Methods, apparatuses, and systems for updating service model based on privacy protection
CN110110229B (zh) Information recommendation method and apparatus
US20230078061A1 (en) Model training method and apparatus for federated learning, device, and storage medium
Perifanis et al. Federated neural collaborative filtering
CN112085159B (zh) User tag data prediction system, method, apparatus, and electronic device
US11599840B2 (en) System for discovering hidden correlation relationships for risk analysis using graph-based machine learning
US11379715B2 (en) Deep learning based distribution of content items describing events to users of an online system
CN111737546B (zh) Method and apparatus for determining entity service attributes
US11809577B2 (en) Application of trained artificial intelligence processes to encrypted data within a distributed computing environment
Imteaj et al. Leveraging asynchronous federated learning to predict customers financial distress
US20240037252A1 (en) Methods and apparatuses for jointly updating service model
US20210082025A1 (en) Decentralized Recommendations Using Distributed Average Consensus
CN112039702B (zh) Model parameter training method and apparatus based on federated learning and mutual learning
CN115270001B (zh) Privacy-preserving recommendation method and system based on cloud collaborative learning
CN113361962A (zh) Method and apparatus for identifying enterprise risk based on blockchain network
CN112068866B (zh) Method and apparatus for updating service model
CN109544128A (zh) Method and server for donation information management
US11855970B2 (en) Systems and methods for blind multimodal learning
Huang et al. A reliable and fair federated learning mechanism for mobile edge computing
CN112101609B (zh) Prediction system, method, apparatus, and electronic device for user repayment timeliness
US20230419182A1 (en) Methods and systems for improving a product conversion rate based on federated learning and blockchain
CN113761350A (zh) Data recommendation method, related apparatus, and data recommendation system
US20220318573A1 (en) Predicting targeted, agency-specific recovery events using trained artificial intelligence processes
US20220366233A1 (en) Systems and methods for generating dynamic conversational responses using deep conditional learning
CN114723012A (zh) Computing method and apparatus based on distributed training system

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION