WO2021103901A1 - Multi-party security calculation-based neural network model training and prediction methods and device - Google Patents


Info

Publication number
WO2021103901A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
client
neural network
server
layer
Prior art date
Application number
PCT/CN2020/124137
Other languages
French (fr)
Chinese (zh)
Inventor
陈超超 (Chen Chaochao)
郑龙飞 (Zheng Longfei)
王力 (Wang Li)
周俊 (Zhou Jun)
Original Assignee
支付宝(杭州)信息技术有限公司 (Alipay (Hangzhou) Information Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司 (Alipay (Hangzhou) Information Technology Co., Ltd.)
Publication of WO2021103901A1 publication Critical patent/WO2021103901A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/602 - Providing cryptographic facilities or services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. a local or distributed file system or database
    • G06F21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure computing or processing of information

Definitions

  • the embodiments of this specification generally relate to the computer field, and more specifically, to a neural network model training method, model prediction method, and device based on multi-party secure computing.
  • data is a very important asset, such as user data and business data.
  • User data may include user identity data and the like, for example.
  • the business data may include, for example, business data that occurs on business applications provided by the company, such as commodity transaction data on Taobao. Protecting data security is a technical issue of widespread concern for companies or enterprises.
  • the neural network model is a machine learning model widely used in the field of machine learning.
  • in many cases, training a neural network model requires multiple model training participants to coordinate. The multiple model training participants (for example, an e-commerce company, an express company, and a bank) each have part of the data used to train the neural network model. The participants hope to jointly use each other's data to train the neural network model, but they do not want to provide their private data to the other participants, so as to prevent their own private data from leaking.
  • a machine learning model training method that can protect the security of private data is proposed. It enables the multiple model training participants to cooperatively train the neural network model while ensuring the security of their respective private data, and the trained neural network model is then used by the multiple model training participants.
  • the embodiments of this specification provide a neural network model training method, model prediction method, and device based on multi-party secure computing, which can improve model training efficiency while ensuring the security of the respective private data of the multiple training participants.
  • a neural network model training method based on multi-party secure computing, wherein the neural network model is trained cooperatively by a first number of training participants. The neural network model includes multiple hidden layers and is divided into at least one client model and at least one server model, with client models and server models arranged alternately.
  • Each client model is decomposed into the first number of client sub-models, each client sub-model having the same sub-model structure. The at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding training participant. The method includes executing the following loop process until a loop end condition is satisfied: providing training sample data to the current neural network model to obtain the current prediction value of the current neural network model through the cooperation of each current client model and each current server model, where in each current client model, the training participants use their respective current client sub-models and the training sample data (or the calculation result of the preceding current server model) to perform multi-party secure computation layer by layer to obtain the calculation result of that current client model, and in each current server model, the calculation result of the preceding current client model is used to perform non-multi-party-secure computation layer by layer to obtain the calculation result of that current server model; determining the current prediction difference based on the current prediction value and the sample label value; and, when the loop end condition is not satisfied, adjusting the model parameters of each layer of each current server model and each current client sub-model according to the current prediction difference, the adjusted models serving as the current models of the next loop process.
  • the model calculations of the layers of the neural network model placed in the server model are unrelated to data privacy protection.
  • the total number of hidden layers included in the client models may be determined based on the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level.
  • the neural network model includes N hidden layers
  • the neural network model is divided into a first client model and a single server model
  • the first client model includes the input layer and the first hidden layer to the Kth hidden layer
  • the server model includes the output layer and the K+1th hidden layer to the Nth hidden layer.
  • the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model.
  • the first client model includes an input layer and the first hidden layer to the Kth hidden layer
  • the server model includes the K+1th hidden layer to the Lth hidden layer
  • the second client model includes the output layer and the L+1th hidden layer to the Nth hidden layer.
  • the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model.
  • the first client model includes an input layer and a first hidden layer to a Kth hidden layer
  • the server model includes a K+1th hidden layer to an Nth hidden layer
  • the second client model includes an output layer.
  • the process of determining the current prediction difference may be performed on the server or on the client of the training participant that has the sample label value.
  • the loop ending condition may include: the number of loops reaches a predetermined number; or the current prediction difference is within a predetermined difference range.
  • the multi-party secure computation may include one of secret sharing, garbled circuits, and homomorphic encryption.
  • the model calculation at the server can be implemented using TensorFlow or PyTorch.
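  As context for the secret-sharing option mentioned above, the following is a minimal sketch of additive secret sharing over a fixed modulus. It is illustrative only: the function names `share`/`reconstruct` and the modulus are assumptions, not the patent's protocol.

  ```python
  import random

  MODULUS = 2**32

  def share(value, n_parties):
      """Split an integer into n additive shares that sum to value mod MODULUS."""
      shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
      shares.append((value - sum(shares)) % MODULUS)
      return shares

  def reconstruct(shares):
      """Recombine all shares; no proper subset reveals anything about the value."""
      return sum(shares) % MODULUS

  secret = 123456
  parts = share(secret, 3)
  assert reconstruct(parts) == secret
  ```

  Each participant would hold one element of `parts`; only the sum of all shares recovers the secret, which is what allows private inputs and sub-model parameters to be distributed across parties.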
  • the training sample data may include training sample data based on image data, voice data, or text data, or the training sample data may include user characteristic data.
  • a model prediction method based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is divided into at least one client model and at least one server model, with client models and server models arranged alternately. Each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding model owner among the first number of model owners. The model prediction method includes: receiving data to be predicted; and providing the data to be predicted to the neural network model to obtain the predicted value of the neural network model through the coordinated calculation of each client model and each server model, where in each client model, the model owners use their respective client sub-models and the data to be predicted (or the calculation result of the preceding server model) to perform multi-party secure computation layer by layer to obtain the calculation result of the client model, and in each server model, the calculation result of the preceding client model is used to perform non-multi-party-secure computation layer by layer to obtain the calculation result of the server model.
  • the data to be predicted may include image data, voice data, or text data.
  • the data to be predicted may include user characteristic data.
  • a neural network model training device based on multi-party secure computing, wherein the neural network model is trained cooperatively by a first number of training participants. The neural network model includes multiple hidden layers and is divided into at least one client model and at least one server model, with client models and server models arranged alternately. Each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding training participant. The neural network model training device includes: a model prediction unit, which provides the training sample data to the current neural network model to obtain the current prediction value of the current neural network model through the cooperation of each current client model and each current server model, where in each current client model, the training participants use their respective current client sub-models and the training sample data (or the calculation result of the preceding current server model) to perform multi-party secure computation layer by layer to obtain the calculation result of the current client model, and in each current server model, the calculation result of the preceding current client model is used to perform non-multi-party-secure computation layer by layer to obtain the calculation result of the current server model; a prediction difference determination unit, which determines the current prediction difference based on the current prediction value and the sample label value; and a model adjustment unit, which, when the loop end condition is not satisfied, adjusts the model parameters of each layer according to the current prediction difference, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop process. The model prediction unit, the prediction difference determination unit, and the model adjustment unit operate cyclically until the loop end condition is satisfied.
  • the model calculations of the layers of the neural network model placed in the server model are unrelated to data privacy protection.
  • the total number of hidden layers included in the client models may be determined based on the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level.
  • the neural network model includes N hidden layers
  • the neural network model is divided into a first client model and a single server model
  • the first client model includes the input layer and the first hidden layer to the Kth hidden layer
  • the server model includes the output layer and the K+1th hidden layer to the Nth hidden layer.
  • the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model.
  • the first client model includes an input layer and the first hidden layer to the Kth hidden layer
  • the server model includes the K+1th hidden layer to the Lth hidden layer
  • the second client model includes the output layer and the L+1th hidden layer to the Nth hidden layer.
  • the prediction difference determining unit may be provided at the server or the client.
  • a model prediction device based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is divided into client models and server models arranged alternately. The first client model includes at least the input layer. Each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at the server, and each client sub-model is deployed at the client of the corresponding model owner among the first number of model owners. The model prediction device includes: a data receiving unit, which receives the data to be predicted; and a model prediction unit, which provides the data to be predicted to the neural network model to obtain the predicted value of the neural network model through the coordinated calculation of each client model and each server model, where in each client model, the model owners use their respective client sub-models and the data to be predicted (or the calculation result of the preceding server model) to perform multi-party secure computation layer by layer to obtain the calculation result of the client model, and in each server model, the calculation result of the preceding client model is used to perform non-multi-party-secure computation layer by layer to obtain the calculation result of the server model.
  • an electronic device including: one or more processors, and a memory coupled with the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to execute the neural network model training method described above.
  • a machine-readable storage medium which stores executable instructions that, when executed, cause the machine to execute the neural network model training method described above.
  • an electronic device including: one or more processors, and a memory coupled with the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to execute the model prediction method described above.
  • a machine-readable storage medium which stores executable instructions that, when executed, cause the machine to execute the model prediction method as described above.
  • FIG. 1 shows a schematic diagram of an example of a neural network model
  • FIG. 2 shows a schematic diagram of an example of a neural network model training method based on multi-party secure computing
  • FIG. 3 shows a schematic diagram of a segmentation example of a neural network model according to an embodiment of the present specification
  • FIGS. 4A-4D show exemplary schematic diagrams of client sub-models and a server model after segmentation according to an embodiment of the present specification
  • FIG. 5 shows a flowchart of an example of a neural network model training method based on multi-party secure computing according to an embodiment of the present specification
  • FIG. 6A shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification
  • FIG. 6B shows a schematic diagram of an example of vertically segmented training sample data according to an embodiment of the present specification
  • FIG. 7A shows a schematic diagram of another segmentation example of a neural network model according to an embodiment of the present specification
  • FIG. 7B shows a schematic diagram of another segmentation example of a neural network model according to an embodiment of the present specification
  • FIG. 8 shows a flowchart of a model prediction method based on a neural network model according to an embodiment of the present specification
  • FIG. 9 shows a block diagram of a model training device according to an embodiment of the present specification
  • FIG. 10 shows a block diagram of a model prediction device according to an embodiment of the present specification
  • FIG. 11 shows a block diagram of an electronic device for implementing neural network model training based on multi-party secure computing according to an embodiment of the present specification
  • FIG. 12 shows a block diagram of an electronic device for implementing model prediction based on a neural network model according to an embodiment of the present specification
  • the term “including” and its variations mean open terms, meaning “including but not limited to”.
  • the term “based on” means “based at least in part on.”
  • the terms “one embodiment” and “an embodiment” mean “at least one embodiment.”
  • the term “another embodiment” means “at least one other embodiment.”
  • the terms “first”, “second”, etc. may refer to different or the same objects. Other definitions can be included below, either explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.
  • FIG. 1 shows a schematic diagram of an example of a neural network model 100.
  • the neural network model 100 includes an input layer 110, a first hidden layer 120, a second hidden layer 130, a third hidden layer 140 and an output layer 150.
  • the input layer 110 includes three input nodes N1, N2, and N3 and a bias term b1.
  • the three input nodes N1, N2, and N3 respectively receive data from three different data owners.
  • the term “data owner” and the terms “model owner” and “training participant” can be used interchangeably.
  • the first hidden layer 120 includes two hidden layer nodes N4 and N5 and a bias term b2.
  • the hidden layer nodes N4 and N5 are respectively fully connected with the three input nodes N1, N2, and N3 of the input layer 110 and the bias term b1.
  • the weights between the input node N1 and the hidden layer nodes N4 and N5 are W1,4 and W1,5, respectively.
  • the weights between the input node N2 and the hidden layer nodes N4 and N5 are W2,4 and W2,5, respectively.
  • the weights between the input node N3 and the hidden layer nodes N4 and N5 are W3,4 and W3,5, respectively.
  • the second hidden layer 130 includes two hidden layer nodes N6 and N7 and a bias term b3.
  • the hidden layer nodes N6 and N7 are respectively fully connected with the two hidden layer nodes N4 and N5 of the first hidden layer 120 and the bias term b2.
  • the weights between the hidden layer node N4 and the hidden layer nodes N6 and N7 are W4,6 and W4,7, respectively.
  • the weights between the hidden layer node N5 and the hidden layer nodes N6 and N7 are W5,6 and W5,7, respectively.
  • the third hidden layer 140 includes two hidden layer nodes N8 and N9 and a bias term b4.
  • the hidden layer nodes N8 and N9 are respectively fully connected with the two hidden layer nodes N6 and N7 of the second hidden layer 130 and the bias term b3.
  • the weights between the hidden layer node N6 and the hidden layer nodes N8 and N9 are W6,8 and W6,9, respectively.
  • the weights between the hidden layer node N7 and the hidden layer nodes N8 and N9 are W7,8 and W7,9, respectively.
  • the output layer 150 includes an output node N10.
  • the output node N10 is fully connected with the two hidden layer nodes N8 and N9 of the third hidden layer 140 and the bias term b4.
  • the weight between the hidden layer node N8 and the output node N10 is W8,10.
  • the weight between the hidden layer node N9 and the output node N10 is W9,10.
  • the weights W1,4, W1,5, W2,4, W2,5, W3,4, W3,5, W4,6, W4,7, W5,6, W5,7, W6,8, W6,9, W7,8, W7,9, W8,10, and W9,10 are the model parameters of each layer of the neural network model.
  • the weighted sums Z1 and Z2 at the hidden layer nodes N4 and N5 are each passed through the activation function to obtain the outputs a1 and a2 of the hidden layer nodes N4 and N5.
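  The forward computation just described can be sketched as follows. The concrete input values, weights, and the sigmoid activation are illustrative assumptions (the specification does not fix an activation function); Z1 and Z2 are the weighted sums at nodes N4 and N5.

  ```python
  import numpy as np

  x = np.array([0.5, -1.0, 2.0])    # outputs of input nodes N1, N2, N3
  W = np.array([[0.1, 0.2],         # W1,4  W1,5
                [0.3, 0.4],         # W2,4  W2,5
                [0.5, 0.6]])        # W3,4  W3,5
  b1 = np.array([0.05, -0.05])      # bias term b1 feeding N4 and N5

  Z = x @ W + b1                    # Z1, Z2: weighted sums at N4 and N5
  a = 1.0 / (1.0 + np.exp(-Z))      # a1, a2: activation outputs (sigmoid assumed)
  ```

  Each subsequent layer repeats the same pattern, taking the previous layer's activations as its input.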
  • FIG. 2 shows a schematic diagram of an example of a neural network model training method 200 based on multi-party secure computing.
  • three training participants Alice, Bob, and Charlie (corresponding to the input nodes N1, N2, and N3 in FIG. 1) are taken as an example for illustration.
  • a training participant Alice is the training initiator, that is, the training sample data at Alice is used for training.
  • each training participant Alice, Bob, and Charlie has the model structure of all the layers of the neural network model, but the model parameters held by each training participant for each layer are only part of the model parameters of the corresponding layer of the neural network model, and the sum of the model parameters of each layer across all training participants equals the model parameters of the corresponding layer of the neural network model.
  • the first training participant Alice and the second training participants Bob and Charlie initialize the sub-model parameters of their neural network sub-models to obtain the initial values of their sub-model parameters, and
  • the number of executed training cycles t is initialized to zero.
  • the loop ending condition is to perform a predetermined number of training loops, for example, T training loops are executed.
  • multi-party security calculations are performed based on the current sub-models of each training participant to obtain the current prediction value of the neural network model to be trained for the training sample data
  • the prediction difference e between the current predicted value and the corresponding label value Y is determined
  • e is a column vector, e = Y - Ŷ
  • Y is a column vector representing the label values of the training samples X
  • Ŷ is a column vector representing the current predicted values of the training samples X. If X contains only a single training sample, then e, Y, and Ŷ are all column vectors with a single element.
  • if X contains multiple training samples, each element of Ŷ is the current predicted value of the corresponding training sample
  • each element of Y is the label value of the corresponding training sample
  • each element of e is the difference between the label value and the current predicted value of the corresponding training sample.
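  For a batch of training samples, the difference computation above amounts to an element-wise vector subtraction (values below are illustrative):

  ```python
  import numpy as np

  Y = np.array([1.0, 0.0, 1.0])       # label values of the training samples
  Y_hat = np.array([0.8, 0.3, 0.6])   # current predicted values of the model
  e = Y - Y_hat                       # prediction difference, one element per sample
  ```

  The resulting vector `e` is what gets propagated backward to adjust the model parameters of each layer.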
  • the prediction differences are sent to the second training participants Bob and Charlie, respectively.
  • the model parameters of each layer of the neural network model at each training participant are adjusted layer by layer through back propagation.
  • at block 260, it is determined whether the predetermined number of cycles has been reached. If not, the flow returns to block 220 to execute the next training cycle, where the updated current sub-models obtained by each training participant in the current cycle are used as the current sub-models of the next training cycle.
  • if the predetermined number of cycles has been reached, each training participant stores the current updated values of its sub-model parameters as the final values of the sub-model parameters to obtain its trained sub-model, and the process then ends.
  • the end condition of the training loop process can also be that the determined prediction difference is within a predetermined range, for example, that the sum of the elements e_i of the prediction difference e is less than a predetermined threshold, or that the average of the elements e_i of the prediction difference e is less than a predetermined threshold.
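  The two kinds of end condition described in this section can be combined into a single check. The use of absolute values below is an assumption made for robustness (raw differences of opposite sign would otherwise cancel); the function name and default thresholds are illustrative.

  ```python
  def loop_should_end(e, t, max_rounds=100, threshold=0.05):
      """End training when t rounds have run or mean |e_i| drops below threshold."""
      mean_abs = sum(abs(x) for x in e) / len(e)
      return t >= max_rounds or mean_abs < threshold
  ```

  A caller would evaluate this once per cycle, e.g. `loop_should_end(e, t)` after computing the prediction difference `e` in round `t`.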
  • the operation of block 260 is performed after block 230. If the loop end condition is met, the process ends. Otherwise, perform the operations of blocks 240 and 250, and then return to block 220 to execute the next cycle.
  • the embodiment of this specification proposes a neural network model training method.
  • the neural network model is divided into multiple model parts, with client model parts and server model parts arranged alternately.
  • Some model parts are deployed on the server (hereinafter referred to as the "server model part"), and the other model parts are deployed on the clients (hereinafter referred to as the "client model part"), where the server model part may include at least one server model and the client model part may include at least one client model.
  • each client-side model corresponds to one or more hierarchical structures of the neural network model.
  • each client model is decomposed into multiple client sub-models.
  • for each client model, a client sub-model is deployed at each training participant (client). The model structure of each client sub-model is the same, and the model parameters of each layer of each client sub-model are obtained by splitting the model parameters of the corresponding layer of the neural network model; that is, the sum of the model parameters of the same-layer nodes across all client sub-models equals the model parameters of the corresponding layer of the neural network model.
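  The parameter decomposition described above can be sketched as additive splitting of each layer's weight matrix into per-participant shares that sum to the original. `split_layer` is an illustrative name, not from the patent, and the random share distribution is an assumption.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  def split_layer(weights, n_parties):
      """Split a weight matrix into n_parties random matrices summing to it."""
      shares = [rng.standard_normal(weights.shape) for _ in range(n_parties - 1)]
      shares.append(weights - sum(shares))
      return shares

  W = rng.standard_normal((3, 2))    # e.g. input-layer-to-first-hidden-layer weights
  parts = split_layer(W, 3)          # one client sub-model share per participant
  assert np.allclose(sum(parts), W)  # same-layer shares sum to the original
  ```

  Each participant's client sub-model then holds one element of `parts` for every layer of the client model, so no single party ever sees the full layer parameters.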
  • each client model uses the MPC method to perform model calculations through the collaboration of the training participants, and each server model uses a non-MPC method to perform model calculations; for example, TensorFlow or PyTorch can be used to perform the model calculations.
  • the layers of the neural network model involved in model calculations that have nothing to do with data privacy protection can be partitioned into the server-side models, thereby protecting data privacy.
  • the training sample data used by the neural network model may include training sample data based on image data, voice data, or text data.
  • the neural network model can be applied to business risk identification, business classification or business decision-making based on image data, voice data or text data.
  • the training sample data used by the neural network model may include user characteristic data.
  • the neural network model can be applied to business risk identification, business classification, business recommendation or business decision based on user characteristic data.
  • the data to be predicted used by the neural network model may include image data, voice data, or text data.
  • the data to be predicted used by the neural network model may include user characteristic data.
  • the neural network model may include N hidden layers.
  • the neural network model may be divided into a first client model and a single server model.
  • the first client model includes the input layer and the first hidden layer to the Kth hidden layer, and the server model includes the output layer and the K+1th hidden layer to the Nth hidden layer.
  • Fig. 3 shows a schematic diagram of a segmentation example of the neural network model 100 according to an embodiment of the present specification.
  • the neural network model 100 is divided at the third hidden layer 140 into a client model and a server model.
  • the client model includes an input layer 110, a first hidden layer 120, and a second hidden layer 130.
  • the server model includes a third hidden layer 140 and an output layer 150.
  • the client model is decomposed into three client sub-models, one client sub-model is deployed at each training participant, and the server model is deployed on the server.
  • Each client sub-model includes the same model structure, and the sum of the model parameters of the same layer nodes of all client sub-models is equal to the model parameters of the corresponding hierarchical nodes in the neural network model.
  • the model parameters between the input layer 110 and the first hidden layer 120 and the model parameters between the first hidden layer 120 and the second hidden layer 130 are each divided into three parts, one part owned by each client sub-model.
  • FIGS. 4A-4D show exemplary schematic diagrams of a client sub-model and a server-side model after segmentation according to an embodiment of the present specification.
  • the relationship between the sub-model parameters shown in FIGS. 4A-4C and the model parameters of the neural network model in FIG. 3 is as follows.
  • w1,4 = w1,4(1) + w1,4(2) + w1,4(3)
  • w1,5 = w1,5(1) + w1,5(2) + w1,5(3)
  • w2,4 = w2,4(1) + w2,4(2) + w2,4(3)
  • w2,5 = w2,5(1) + w2,5(2) + w2,5(3)
  • w3,4 = w3,4(1) + w3,4(2) + w3,4(3)
  • w3,5 = w3,5(1) + w3,5(2) + w3,5(3)
  • w4,6 = w4,6(1) + w4,6(2) + w4,6(3)
  • w4,7 = w4,7(1) + w4,7(2) + w4,7(3)
  • w5,6 = w5,6(1) + w5,6(2) + w5,6(3)
  • w5,7 = w5,7(1) + w5,7(2) + w5,7(3)
  • the model parameters of each layer of the server model are exactly the same as the model parameters of the corresponding layer of the neural network model.
  • the neural network model segmentation in Figures 4A-4D corresponds to the data horizontal segmentation situation.
  • each data owner has only one node.
  • one node of each data owner can be transformed into three nodes by vertical-horizontal switching, so as to perform segmentation according to the neural network model segmentation method shown in FIGS. 4A-4D.
  • FIG. 5 shows a flowchart of an example of a neural network model training method 500 based on multi-party secure computing according to an embodiment of the present specification.
  • in the neural network model training method 500 shown in FIG. 5, it is assumed that there are M (i.e., the first number) training participants.
  • the neural network model segmentation method shown in FIG. 5 is the segmentation method in FIG. 3.
  • the M training participants may be M data owners who have data required for neural network model training, that is, each data owner has part of the data required for neural network model training.
  • part of the data owned by the M data owners may be training data that has been split horizontally, or may be training data that has been split vertically.
  • FIG. 6A shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification.
  • FIG. 6A shows two data parties, Alice and Bob; the case of more data parties is similar.
  • Each training sample in the training sample subset owned by each data party Alice and Bob is complete, that is, each training sample includes complete feature data (x) and labeled data (y).
  • Alice has a complete training sample (x0, y0).
  • FIG. 6B shows a schematic diagram of an example of vertically segmented training sample data according to an embodiment of the present specification.
  • Figure 6B shows two data parties, Alice and Bob; the case with more data parties is similar.
  • Each of the data parties Alice and Bob owns a partial training sub-sample of every training sample in the training sample set.
  • The partial training sub-samples owned by the data parties Alice and Bob, combined together, form the complete content of a training sample. For example, suppose the content of a training sample includes the label y0 and a set of attribute features; after vertical segmentation, the training participant Alice has y0 together with part of the attribute features, and the training participant Bob has the remaining part of the attribute features.
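The two segmentation regimes of FIGS. 6A and 6B can be illustrated with a toy dataset (all values hypothetical): a horizontal split gives each party complete samples, while a vertical split gives each party a slice of every sample's features, with the label held by one party only:

```python
# Toy training set: each sample has 4 features and a label.
samples = [
    {"x": [1.0, 2.0, 3.0, 4.0], "y": 0},
    {"x": [5.0, 6.0, 7.0, 8.0], "y": 1},
]

# Horizontal segmentation (FIG. 6A): Alice and Bob each hold complete samples.
alice_h, bob_h = samples[:1], samples[1:]

# Vertical segmentation (FIG. 6B): Alice holds the first two features and the
# label of every sample; Bob holds the remaining features of every sample.
alice_v = [{"x": s["x"][:2], "y": s["y"]} for s in samples]
bob_v = [{"x": s["x"][2:]} for s in samples]

# Joining the vertical slices recovers the complete sample content.
assert alice_v[0]["x"] + bob_v[0]["x"] == samples[0]["x"]
```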
  • the client sub-models at the clients of the M training participants and the server models at the server are initialized.
  • multi-party secure computing can refer to any suitable multi-party secure computing implementation solution in this field.
  • multi-party secure computing may include one of Secret Sharing (SS), Garbled Circuit (GC), and Homomorphic Encryption (HE).
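Under the secret-sharing variant, a linear layer can be evaluated jointly because a matrix-vector product is linear in the weights: each participant multiplies its own weight share by the activations and the partial results are summed. A simplified single-machine sketch (hypothetical values; a real protocol also needs communication rounds and masking of the activations):

```python
def matvec(W, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# The full layer weight matrix is W1 + W2 + W3, one additive share per party.
W1 = [[0.10, 0.20], [0.30, 0.40]]
W2 = [[0.05, -0.10], [0.00, 0.20]]
W3 = [[-0.15, 0.40], [0.10, -0.10]]
x = [1.0, 2.0]

# Each party computes on its own share only; summing the partial results
# equals evaluating the full layer, with no party seeing the full weights.
partials = [matvec(Wk, x) for Wk in (W1, W2, W3)]
combined = [sum(col) for col in zip(*partials)]

W_full = [[a + b + c for a, b, c in zip(r1, r2, r3)]
          for r1, r2, r3 in zip(W1, W2, W3)]
assert all(abs(u - v) < 1e-9 for u, v in zip(combined, matvec(W_full, x)))
```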
  • the calculation result of each current client sub-model is provided to the current server model of the server to calculate layer by layer to obtain the current prediction value of the neural network model.
  • the current prediction difference is determined based on the current prediction value and the sample label value.
  • the process of determining the current prediction difference can be performed on the server side.
  • the sample label value owned by the training participant needs to be transmitted to the server.
  • the process of determining the current prediction difference may be performed on the client terminal of the training participant who has the sample label value.
  • the current prediction value determined by the server is fed back to the training participant with the sample mark value, and then the current prediction difference is determined at the training participant. In this way, there is no need to send the sample label value to the server, so that the privacy of the sample label value at the training participant can be further protected.
  • In block 550, it is determined whether the current prediction difference is within a predetermined difference range, for example, whether the current prediction difference is less than a predetermined threshold. If the current prediction difference is not within the predetermined difference range (for example, not less than the predetermined threshold), then in block 560 the server model and each client sub-model are adjusted layer by layer through backpropagation according to the current prediction difference. The flow then returns to block 520 to execute the next cycle, in which the adjusted server model and client sub-models serve as the current server model and current client sub-models.
  • the training process ends.
  • the ending condition of the training cycle process may also be to reach a predetermined number of cycles.
  • the operation of block 550 may be performed after the operation of block 560, that is, after the current prediction difference is determined in block 540, the operation of block 560 is performed, and then it is determined whether the predetermined number of cycles is reached. If the predetermined number of cycles is reached, the training process ends. If the predetermined number of cycles has not been reached, return to block 520 to execute the next cycle process.
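The control flow of blocks 520-560 — forward pass, difference check, backpropagation, repeat — can be sketched generically; here a hypothetical one-parameter model stands in for the jointly computed neural network:

```python
def train(forward, backward, error, max_rounds=100, tol=1e-3):
    """Loop of blocks 520-560: stop when the prediction difference is within
    the predetermined range (tol) or a predetermined number of rounds passes."""
    for round_no in range(max_rounds):
        y_pred = forward()        # blocks 520/530: cooperative forward pass
        diff = error(y_pred)      # block 540: current prediction difference
        if abs(diff) < tol:       # block 550: within predetermined range?
            break
        backward(diff)            # block 560: layer-by-layer adjustment
    return round_no

# Hypothetical stand-in model: a single parameter fitted to the target 1.0.
state = {"w": 0.0}
rounds = train(forward=lambda: state["w"],
               backward=lambda d: state.update(w=state["w"] - 0.5 * d),
               error=lambda y: y - 1.0)
assert abs(state["w"] - 1.0) < 1e-3
```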
  • the client model includes 2 hidden layers.
  • the client model may include more or fewer hidden layers, for example, may include one hidden layer or more than two hidden layers.
  • the total number of hidden layers included in the client model can be determined according to the computing power used for model training, the training timeliness required by the application scenario, and/or the training security level.
  • the neural network model training method described in FIG. 5 is a neural network model training method for the neural network model segmentation scheme shown in FIG. 3.
  • the neural network model can be segmented according to other segmentation schemes, as shown in FIGS. 7A and 7B.
  • Fig. 7A shows another example schematic diagram of a neural network segmentation scheme.
  • the neural network model is divided into a first client model, a server model, and a second client model.
  • the first client model includes an input layer 110 and a first hidden layer 120.
  • the server model includes a second hidden layer 130.
  • the second client model includes a third hidden layer 140 and an output layer 150.
  • each of the first client model and the second client model can be segmented into 3 client sub-models in a similar manner as in FIGS. 4A-4C.
  • the server-side model is the same as the corresponding layered model of the neural network model. It should be noted here that the client sub-models of the first client model and the second client model are set at the client of each training participant.
  • The training sample data is provided to each current first client sub-model in the first client model, and multi-party security calculations are performed layer by layer to obtain the calculation results of each current first client sub-model.
  • the calculation results of each current first client sub-model are provided to the server model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model.
  • the calculation result of the server model is provided to each current second client sub-model in the second client model, and multi-party security calculations are performed layer by layer to obtain the current prediction result of the neural network model.
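Numerically, the three-stage pipeline of FIG. 7A can be sketched as follows (all weights and shapes hypothetical): the first and second client models are evaluated over per-party weight shares, while the server's hidden layer runs in plaintext. Note that in a real protocol the nonlinear activation must itself be computed securely; here it is applied to the combined value only for illustration:

```python
def client_forward(shares, x):
    """MPC-style layer: each party applies its weight share; the partial
    results are summed, then a ReLU is applied to the combined value."""
    partials = [[sum(w * xi for w, xi in zip(row, x)) for row in Wk]
                for Wk in shares]
    return [max(0.0, sum(col)) for col in zip(*partials)]

def server_forward(W, h):
    """Plaintext (non-MPC) hidden layer at the server."""
    return [max(0.0, sum(w * hi for w, hi in zip(row, h))) for row in W]

x = [1.0, -1.0]
h1 = client_forward([[[0.5, 0.1]], [[0.2, 0.0]], [[0.3, -0.1]]], x)  # 1st client model
h2 = server_forward([[1.0]], h1)                                     # server model
y = client_forward([[[0.4]], [[0.3]], [[0.3]]], h2)                  # 2nd client model
assert abs(y[0] - 1.0) < 1e-6
```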
  • Fig. 7B shows another example schematic diagram of a neural network segmentation scheme.
  • The neural network model is divided into at least one client model and at least one server model, for example, as shown in FIG. 7B, a first client model, a first server model, a second client model, a second server model, and a third client model.
  • the first client model includes an input layer 110 and a first hidden layer 120.
  • the first server model includes a second hidden layer 130.
  • the second client model includes a third hidden layer 140.
  • the second server model includes a fourth hidden layer 150.
  • the third client model includes a fifth hidden layer 160 and an output layer 170.
  • Each of the first client model, the second client model, and the third client model can be divided into 3 client sub-models in a manner similar to FIGS. 4A-4C.
  • the first and second server models are the same as the corresponding hierarchical models of the neural network model. It should be explained here that the corresponding client sub-models of the first client model, the second client model, and the third client model are set at the clients of each training participant.
  • When the neural network model is trained, in each cycle, each current client model performs multi-party security calculation layer by layer via each training participant, using the respective current client sub-models and either the training sample data or the calculation result of the previous current server model, to obtain the calculation result of that current client model.
  • In each current server model, the calculation result of the previous client model is used to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model.
  • At least one client model and at least one server model cooperate with calculations to obtain the current prediction value of the neural network model.
  • the training sample data is provided to each current first client sub-model in the current first client model, and multi-party security calculations are performed layer by layer to obtain the calculation results of each current first client sub-model.
  • the calculation result of each current first client sub-model is provided to the current first server model to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current first server model.
  • the calculation result of the first server model is provided to each current second client sub-model in the current second client model, and multi-party security calculations are performed layer by layer to obtain the calculation result of the current second client model.
  • the calculation results of each current second client sub-model are provided to the current second server model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the current second server model.
  • the calculation result of the current second server model is provided to each current third client sub-model in the current third client model, and multi-party security calculations are performed layer by layer to obtain the current prediction result of the neural network model.
  • the first client model may include part of the hidden layer.
  • each server model may include at least part of the hidden layer.
  • the neural network model segmentation can be performed based on whether the model calculation of each layered model of the neural network model is related to data privacy protection.
  • The layered models related to data privacy protection are divided into the client models, and the layered models that have nothing to do with data privacy protection are divided into the server models.
  • the client model can also include a layered model that has nothing to do with data privacy protection.
  • A model calculation related to data privacy is one that directly uses the respective input Xi or output Y, for example, the model calculations corresponding to the input layer and the output layer.
  • A model calculation unrelated to data privacy is one that does not directly use the respective input Xi or output Y, for example, the calculations of the intermediate hidden layers of the neural network model.
  • According to the embodiments of the present specification, a neural network model training scheme can be provided.
  • The neural network model is divided into at least one client model and at least one server model, arranged so that client models and server models alternate.
  • Each client model is decomposed into the first number of client sub-models, each client sub-model having the same sub-model structure; each client sub-model is deployed on the client of a training participant, and each server model is deployed on the server.
  • the training sample data is provided to the neural network model, so that the current prediction value of the neural network model can be obtained through cooperative calculation through the client model at the client of each training participant and each server model of the server.
  • model calculation in the client model is implemented in MPC mode
  • Model calculation in the server model is implemented in non-MPC mode, which reduces the number of model layers that perform multi-party security calculations, thereby increasing the speed of model training and improving model training efficiency.
  • the neural network model training method of the embodiment of the present specification only the layered structure of the neural network model that is not related to data privacy protection is segmented into the server model, so that the data privacy security of each data owner can be ensured.
  • The total number of hidden layers included in the client model can be determined and adjusted according to the computing power used for model training, the training timeliness required by the application scenario, and/or the training security level, so that the environmental conditions of model training, the data security requirements, and the model training efficiency can be traded off when performing neural network model segmentation.
  • the current prediction difference determination process can be performed on the client terminal of the training participant who has the sample label value. In this way, there is no need to send the sample label value to the server, so that the privacy of the sample label value at the training participant can be further protected.
  • FIG. 8 shows a flowchart of a model prediction method 800 based on a neural network model according to an embodiment of the present specification.
  • the neural network model includes N hidden layers and is divided into a single server model and a first number of client sub-models.
  • Each client sub-model includes the input layer and the first through Kth hidden layers.
  • The server model includes the (K+1)th through Nth hidden layers and the output layer.
  • the server model is deployed at the server, and each client sub-model is deployed at the client of a model owner.
  • the client sub-models have the same sub-model structure, and the first number of client sub-models together constitute the corresponding model structure of the neural network model.
  • data to be predicted is received.
  • the data to be predicted may be received from any model owner.
  • the received data to be predicted is provided to each current client sub-model at the client of the first number of model owners to perform multi-party security calculation layer by layer to obtain the calculation result of each current client sub-model .
  • the calculation results of each current client sub-model are provided to the server model of the server to perform non-multi-party secure calculation layer by layer to obtain the model prediction result of the neural network model.
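The prediction flow of method 800 can be sketched end to end (hypothetical helper, plain Python): the first K layers are evaluated over the per-owner weight shares, after which the server runs layers K+1 through N and the output layer in plaintext:

```python
def predict(x, client_layer_shares, server_layers):
    """Method-800 sketch: shared client layers first (MPC side),
    then plaintext server layers producing the final prediction."""
    h = x
    for layer in client_layer_shares:     # layers 1..K, one share per owner
        partials = [[sum(w * a for w, a in zip(row, h)) for row in Wk]
                    for Wk in layer]
        h = [max(0.0, sum(col)) for col in zip(*partials)]
    for W in server_layers:               # layers K+1..N plus the output layer
        h = [sum(w * a for w, a in zip(row, h)) for row in W]
    return h

# One shared client layer (three owners) and one plaintext server layer.
client_layer_shares = [[[[0.5]], [[0.3]], [[0.2]]]]
server_layers = [[[0.5]]]
y_hat = predict([2.0], client_layer_shares, server_layers)
assert abs(y_hat[0] - 1.0) < 1e-6
```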
  • the model prediction process shown in FIG. 8 can be adaptively modified according to the corresponding model calculation scheme.
  • FIG. 9 shows a block diagram of a model training device 900 according to an embodiment of the present specification.
  • the model training device 900 includes a model prediction unit 910, a prediction difference determination unit 920, and a model adjustment unit 930.
  • the model prediction unit 910, the prediction difference determination unit 920, and the model adjustment unit 930 perform operations in a loop until the loop end condition is satisfied.
  • the loop ending condition may include: the number of loops reaches a predetermined number of times; or the current prediction difference is within a predetermined difference range.
  • the model prediction unit 910 is configured to provide training sample data to the current neural network model, so as to obtain the current prediction value of the current neural network model through the cooperative calculation of each current client model and each current server model.
  • Each current client model uses the respective current client sub-models and either the training sample data or the calculation results of the previous current server model to perform multi-party security calculations layer by layer, so as to obtain the calculation result of the current client model.
  • the model prediction unit 910 may include a multi-party security calculation module and a server-side calculation module.
  • the multi-party security computing module is configured for each current client model, through the first number of training participants, using the training sample data or the calculation results of the previous current server model and the respective current client sub-model to perform multi-party layer by layer Secure calculation to obtain the calculation result of the current client model.
  • the multi-party security computing module is set at the client.
  • the server-side calculation module is configured to perform non-multi-party security calculations layer by layer for each server-side model using the calculation result of the previous current client-side model to obtain the calculation result of the server-side model.
  • the server-side computing module is set at the server-side.
  • the multi-party security calculation module and the server calculation module cooperate to calculate to obtain the predicted value of the neural network model.
  • the prediction difference determination unit 920 is configured to determine the current prediction difference based on the current prediction value and the sample label value.
  • the prediction difference determining unit 920 may be provided at the server or at the client.
  • the model adjustment unit 930 is configured to adjust each layer model parameter of each current server model and each current client sub-model through backpropagation according to the current prediction difference when the loop end condition is not satisfied.
  • The adjusted server models and client sub-models serve as the current server models and current client sub-models of the next cycle process. Some components of the model adjustment unit 930 are set on the client side, and the other components are set on the server side.
  • Fig. 10 shows a block diagram of a model prediction apparatus 1000 according to an embodiment of the present specification.
  • the model prediction device 1000 includes a data receiving unit 1010 and a model prediction unit 1020.
  • the data receiving unit 1010 is configured to receive data to be predicted.
  • the data to be predicted may be received from any model owner.
  • the data receiving unit 1010 is provided in the client of each model owner.
  • the model prediction unit 1020 is configured to provide the data to be predicted to the neural network model to obtain the prediction value of the neural network model through the cooperation of each client model and each server model.
  • In each client model, the first number of model owners use their respective client sub-models and either the data to be predicted or the calculation results of the previous server model to perform multi-party security calculations layer by layer to obtain the calculation result of the client model; in each server model, the calculation results of the previous client model are used to perform non-multi-party security calculations layer by layer to obtain the calculation result of the server model.
  • the model prediction unit 1020 may include a multi-party security calculation module and a server-side calculation module.
  • the multi-party security calculation module is configured to perform multi-party security calculation layer by layer for each client model, via the first number of model owners, using the to-be-predicted data or the calculation results of the previous server model and the respective client sub-models, In order to obtain the calculation result of the client model.
  • the server-side calculation module is configured to use the calculation results of the previous client-side model to perform non-multi-party security calculations layer by layer for each server-side model, so as to obtain the calculation result of the server-side model.
  • the multi-party security calculation module and the server calculation module cooperate to calculate to obtain the predicted value of the neural network model.
  • the multi-party secure computing module is installed on the client of each model owner, and the server computing module is installed on the server.
  • model training device can be implemented by hardware, or by software or a combination of hardware and software.
  • FIG. 11 shows a structural block diagram of an electronic device 1100 for implementing neural network model training based on multi-party security computing according to an embodiment of the present specification.
  • The electronic device 1100 may include at least one processor 1110, a storage (for example, a non-volatile memory) 1120, a memory 1130, a communication interface 1140, and a bus 1160; the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via the bus 1160.
  • the at least one processor 1110 executes at least one computer-readable instruction (that is, the above-mentioned element implemented in the form of software) stored or encoded in a computer-readable storage medium.
  • Computer-executable instructions are stored in the memory which, when executed, cause the at least one processor 1110 to execute the following loop process until the loop end condition is met: provide training sample data to the current neural network model, so that the current prediction value of the current neural network model is obtained through the cooperative calculation of each client model and each server model, where each current client model performs multi-party security calculations layer by layer via the first number of training participants, using the respective current client sub-models and either the training sample data or the calculation results of the previous current server model, to obtain the calculation result of the current client model, and each current server model uses the calculation result of the previous current client model to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model; determine the current prediction difference based on the current prediction value and the sample label value; and, when the loop end condition is not met, adjust each layer's model parameters of each current server model and each current client sub-model layer by layer through backpropagation according to the current prediction difference, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next cycle process.
  • FIG. 12 shows a structural block diagram of an electronic device 1200 for implementing model prediction based on a neural network model according to an embodiment of the present specification.
  • The electronic device 1200 may include at least one processor 1210, a storage (for example, a non-volatile memory) 1220, a memory 1230, a communication interface 1240, and a bus 1260; the at least one processor 1210, the storage 1220, the memory 1230, and the communication interface 1240 are connected together via the bus 1260.
  • the at least one processor 1210 executes at least one computer-readable instruction (that is, the above-mentioned element implemented in the form of software) stored or encoded in a computer-readable storage medium.
  • Computer-executable instructions are stored in the memory which, when executed, cause the at least one processor 1210 to: receive the data to be predicted; and provide the data to be predicted to the neural network model, so that the prediction value of the neural network model is obtained through the cooperative calculation of each client model and each server model, where in each client model the first number of model owners use their respective client sub-models and either the data to be predicted or the calculation results of the previous server model to perform multi-party security calculations layer by layer to obtain the calculation result of the client model, and in each server model the calculation results of the previous client model are used to perform non-multi-party security calculations layer by layer to obtain the calculation result of the server model.
  • the electronic device 1100/1200 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular Telephones, personal digital assistants (PDAs), handheld devices, wearable computing devices, consumer electronic devices, etc.
  • According to an embodiment of the present specification, a program product such as a non-transitory machine-readable medium may be provided.
  • The non-transitory machine-readable medium may have instructions (i.e., the above-mentioned elements implemented in the form of software) which, when executed by a machine, cause the machine to perform the operations and functions described above in conjunction with FIGS. 1-10 in the various embodiments of this specification.
  • A system or device equipped with a readable storage medium may be provided, on which software program code realizing the function of any one of the above embodiments is stored, so that the computer or processor of the system or device reads and executes the instructions stored in the readable storage medium.
  • Since the program code read from the readable medium can itself implement the function of any one of the above embodiments, the machine-readable code and the readable storage medium storing it constitute a part of the present invention.
  • Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW), magnetic tape, non-volatile memory cards, and ROM.
  • the program code can be downloaded from the server computer or the cloud via the communication network.
  • The device structure described in the foregoing embodiments may be a physical structure or a logical structure; that is, some units may be implemented by the same physical entity, some units may be implemented separately by multiple physical entities, and some units may be implemented jointly by certain components of multiple independent devices.
  • the hardware unit or module can be implemented mechanically or electrically.
  • a hardware unit, module, or processor may include a permanent dedicated circuit or logic (such as a dedicated processor, FPGA or ASIC) to complete the corresponding operation.
  • the hardware unit or processor may also include programmable logic or circuits (such as general-purpose processors or other programmable processors), which may be temporarily set by software to complete corresponding operations.
  • The specific implementation may be a mechanical method, a dedicated permanent circuit, or a temporarily configured circuit.


Abstract

A multi-party security calculation-based neural network model training method, and a model prediction method and device. In the method, a neural network model is divided into at least one client model and at least one server model, the server model being deployed in a server, and the client model being deployed in a client corresponding to a training participant. At each cycle, training sample data are provided to the neural network model to obtain a current prediction value and a current prediction difference. In each client model, a multi-party security calculation is performed layer by layer by means of each training participant using a respective client sub-model and received data. In each server model, a non-multi-party security calculation is carried out layer by layer by using the calculation result of the previous client model. When the cycle is not finished, according to the current prediction difference, a model parameter of each layer of the server model and client sub-model is adjusted by means of backpropagation. By using the method, model training efficiency can be improved while ensuring the security of private data.

Description

Neural network model training and prediction method and device based on multi-party secure computing

Technical Field
The embodiments of this specification generally relate to the computer field, and more specifically, to a neural network model training method, model prediction method, and device based on multi-party secure computing.
Background
For a company or enterprise, data is a very important asset, such as user data and business data. User data may include, for example, user identity data and the like. Business data may include, for example, business data that occurs on business applications provided by the company, such as commodity transaction data on Taobao. Protecting data security is a technical issue of widespread concern for companies and enterprises.
When a company or enterprise conducts business operations, it usually needs to use machine learning models to make model predictions in order to determine business operation risks or make business operation decisions. The neural network model is a machine learning model widely used in the field of machine learning. In many cases, a neural network model requires multiple model training participants to train it cooperatively; the multiple model training participants (for example, an e-commerce company, an express company, and a bank) each own part of the training data used to train the neural network model. The multiple model training participants hope to jointly use each other's data to train the neural network model in a unified way, but none of them wants to provide its private data to the other model training participants, in order to prevent leakage of its own private data.
In view of this situation, a machine learning model training method capable of protecting the security of private data has been proposed, which can coordinate multiple model training participants to train a neural network model for their joint use while ensuring the security of each participant's private data.
Summary of the invention
In view of the foregoing problems, the embodiments of this specification provide a neural network model training method, model prediction method, and device based on multi-party secure computing, which can improve model training efficiency while ensuring the security of the respective private data of multiple training participants.
According to one aspect of the embodiments of this specification, a neural network model training method based on multi-party secure computation is provided. The neural network model is trained collaboratively by a first number of training participants, includes multiple hidden layers, and is partitioned, with client models and server models alternating, into at least one client model and at least one server model. Each client model is decomposed into the first number of client sub-models, all sharing the same sub-model structure; the at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding training participant. The method comprises executing the following loop until a loop-end condition is satisfied: providing training sample data to the current neural network model to obtain its current predicted value through coordinated computation of the current client models and current server models, where at each current client model the training participants use their respective current client sub-models, together with the training sample data or the computation result of the preceding current server model, to perform multi-party secure computation layer by layer to obtain that client model's computation result, and at each current server model the computation result of the preceding current client model is used to perform non-multi-party-secure computation layer by layer to obtain that server model's computation result; determining a current prediction difference based on the current predicted value and a sample label value; and, when the loop-end condition is not satisfied, adjusting the layer-wise model parameters of each current server model and each current client sub-model through back propagation according to the current prediction difference, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop iteration.
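As a rough illustration of this loop (not the patent's actual protocol), the following hypothetical sketch uses two participants holding additive shares of a single client-layer weight, and a server layer held in the clear; all names, values, and the learning-rate choice are ours:

```python
# Hypothetical toy: 2 participants, scalar layers; illustrative only.
lr = 0.001
w1_a, w1_b = 0.3, 0.2      # additive shares of the client-layer weight w1
w2 = 1.0                   # server-layer weight, held in the clear
samples = [(1.0, 6.0), (2.0, 12.0), (3.0, 18.0)]  # data following y = 6 * x

def loss():
    return sum((w2 * (w1_a + w1_b) * x - y) ** 2 for x, y in samples)

initial = loss()
for _ in range(300):                       # loop until the end condition
    for x, y in samples:
        h_a, h_b = w1_a * x, w1_b * x      # each party computes on its share
        h = h_a + h_b                      # client-model result passed to server
        pred = w2 * h                      # server computes without MPC
        diff = pred - y                    # current prediction difference
        g_w2 = diff * h                    # back propagation, layer by layer
        g_h = diff * w2
        w2 -= lr * g_w2
        w1_a -= lr * (g_h * x) / 2         # split the update across the shares
        w1_b -= lr * (g_h * x) / 2
assert loss() < initial                    # the loop reduces the prediction error
```

The client-layer step is trivially secure here only because the layer is linear and its input public; a real implementation would run a full MPC protocol for the client layers, and could run the server-side computation in a framework such as TensorFlow or PyTorch.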
Optionally, in one example of the above aspect, the model computation of the neural network layers in the server model is unrelated to data privacy protection.
Optionally, in one example of the above aspect, the total number of hidden layers included in the client models may be determined according to the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level.
Optionally, in one example of the above aspect, the neural network model includes N hidden layers and is partitioned into a first client model and a single server model, the first client model including the input layer and the first through K-th hidden layers, and the server model including the output layer and the (K+1)-th through N-th hidden layers.
Optionally, in one example of the above aspect, the neural network model includes N hidden layers and is partitioned into a first client model, a single server model, and a second client model, the first client model including the input layer and the first through K-th hidden layers, the server model including the (K+1)-th through L-th hidden layers, and the second client model including the output layer and the (L+1)-th through N-th hidden layers.
Optionally, in one example of the above aspect, the neural network model includes N hidden layers and is partitioned into a first client model, a single server model, and a second client model, the first client model including the input layer and the first through K-th hidden layers, the server model including the (K+1)-th through N-th hidden layers, and the second client model including the output layer.
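The three partition variants above can be summarized with a small helper. The function name and the "in"/"h1"…"hN"/"out" layer labels are illustrative only, not from the patent:

```python
def partition(n, k, l=None, variant=1):
    """Return the claimed layer partitions for an N-hidden-layer network.

    Layer names are illustrative: 'in'/'out' plus hidden layers h1..hN.
    """
    hidden = [f"h{i}" for i in range(1, n + 1)]
    if variant == 1:   # client(input, h1..hK) | server(hK+1..hN, output)
        return {"client1": ["in"] + hidden[:k],
                "server": hidden[k:] + ["out"]}
    if variant == 2:   # client | server(hK+1..hL) | client(hL+1..hN, output)
        return {"client1": ["in"] + hidden[:k],
                "server": hidden[k:l],
                "client2": hidden[l:] + ["out"]}
    # variant 3: client | server(hK+1..hN) | client(output layer only)
    return {"client1": ["in"] + hidden[:k],
            "server": hidden[k:],
            "client2": ["out"]}

p = partition(5, 2, 4, variant=2)
assert p["server"] == ["h3", "h4"]
```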
Optionally, in one example of the above aspect, the determination of the current prediction difference may be performed on the server, or on the client of the training participant that holds the sample label values.
Optionally, in one example of the above aspect, the loop-end condition may include: the number of loop iterations reaching a predetermined number; or the current prediction difference falling within a predetermined difference range.
Optionally, in one example of the above aspect, the multi-party secure computation may include one of secret sharing, garbled circuits, and homomorphic encryption.
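Of the three listed techniques, secret sharing is the simplest to sketch. Below is a minimal additive-sharing example over a fixed ring; the modulus, the function names, and the choice of additive sharing as the concrete scheme are assumptions on our part, not details from the patent:

```python
import random

MOD = 2 ** 32  # working ring; the modulus is our assumption

def share(x, n):
    """Split integer x into n additive shares that sum to x modulo MOD."""
    parts = [random.randrange(MOD) for _ in range(n - 1)]
    parts.append((x - sum(parts)) % MOD)
    return parts

def reconstruct(parts):
    return sum(parts) % MOD

a, b = share(15, 3), share(27, 3)
# Addition of shared values needs no interaction: parties add their shares locally.
c = [(sa + sb) % MOD for sa, sb in zip(a, b)]
assert reconstruct(c) == 42
```

No single share reveals anything about the secret; only the sum of all shares does, which is what lets each participant compute on its own share.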
Optionally, in one example of the above aspect, the model computation on the server may be implemented using TensorFlow or PyTorch.
Optionally, in one example of the above aspect, the training sample data may include training sample data based on image data, speech data, or text data; alternatively, the training sample data may include user feature data.
According to another aspect of the embodiments of this specification, a model prediction method based on a neural network model is provided. The neural network model includes multiple hidden layers and is partitioned, with client models and server models alternating, into at least one client model and at least one server model. Each client model is decomposed into a first number of client sub-models, all sharing the same sub-model structure; the at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding model owner among the first number of model owners. The model prediction method comprises: receiving data to be predicted; and providing the data to be predicted to the neural network model to obtain its predicted value through coordinated computation of the client models and server models, where at each client model the model owners use their respective client sub-models, together with the data to be predicted or the computation result of the preceding server model, to perform multi-party secure computation layer by layer to obtain that client model's computation result, and at each server model the computation result of the preceding client model is used to perform non-multi-party-secure computation layer by layer to obtain that server model's computation result.
Optionally, in one example of the above aspect, the data to be predicted may include image data, speech data, or text data; alternatively, the data to be predicted may include user feature data.
According to another aspect of the embodiments of this specification, a neural network model training apparatus based on multi-party secure computation is provided. The neural network model is trained collaboratively by a first number of training participants, includes multiple hidden layers, and is partitioned, with client models and server models alternating, into at least one client model and at least one server model. Each client model is decomposed into the first number of client sub-models, all sharing the same sub-model structure; the at least one server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding training participant. The neural network model training apparatus includes: a model prediction unit that provides training sample data to the current neural network model to obtain its current predicted value through coordinated computation of the current client models and current server models, where at each current client model the training participants use their respective current client sub-models, together with the training sample data or the computation result of the preceding current server model, to perform multi-party secure computation layer by layer to obtain that client model's computation result, and at each current server model the computation result of the preceding current client model is used to perform non-multi-party-secure computation layer by layer to obtain that server model's computation result; a prediction difference determination unit that determines a current prediction difference based on the current predicted value and a sample label value; and a model adjustment unit that, when a loop-end condition is not satisfied, adjusts the layer-wise model parameters of each current server model and each current client sub-model through back propagation according to the current prediction difference, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop iteration. The model prediction unit, the prediction difference determination unit, and the model adjustment unit operate cyclically until the loop-end condition is satisfied.
Optionally, in one example of the above aspect, the model computation of the neural network layers in the server model is unrelated to data privacy protection.
Optionally, in one example of the above aspect, the total number of hidden layers included in the client models may be determined according to the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level.
Optionally, in one example of the above aspect, the neural network model includes N hidden layers and is partitioned into a first client model and a single server model, the first client model including the input layer and the first through K-th hidden layers, and the server model including the output layer and the (K+1)-th through N-th hidden layers.
Optionally, in one example of the above aspect, the neural network model includes N hidden layers and is partitioned into a first client model, a single server model, and a second client model, the first client model including the input layer and the first through K-th hidden layers, the server model including the (K+1)-th through L-th hidden layers, and the second client model including the output layer and the (L+1)-th through N-th hidden layers.
Optionally, in one example of the above aspect, the prediction difference determination unit may be provided at the server or at a client.
According to another aspect of the embodiments of this specification, a model prediction apparatus based on a neural network model is provided. The neural network model includes multiple hidden layers and is partitioned, with client models and server models alternating, into at least one client model and at least one server model; the first client model among the client models includes at least the input layer. Each client model is decomposed into a first number of client sub-models, all sharing the same sub-model structure; the at least one server model is deployed at the server, and each client sub-model is deployed at the client of the corresponding model owner among the first number of model owners. The model prediction apparatus includes: a data receiving unit that receives data to be predicted; and a model prediction unit that provides the data to be predicted to the neural network model to obtain its predicted value through coordinated computation of the client models and server models, where at each client model the model owners use their respective client sub-models, together with the data to be predicted or the computation result of the preceding server model, to perform multi-party secure computation layer by layer to obtain that client model's computation result, and at each server model the computation result of the preceding client model is used to perform non-multi-party-secure computation layer by layer to obtain that server model's computation result.
According to another aspect of the embodiments of this specification, an electronic device is provided, including one or more processors and a memory coupled to the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to execute the neural network model training method described above.
According to another aspect of the embodiments of this specification, a machine-readable storage medium is provided that stores executable instructions which, when executed, cause a machine to execute the neural network model training method described above.
According to another aspect of the embodiments of this specification, an electronic device is provided, including one or more processors and a memory coupled to the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to execute the model prediction method described above.
According to another aspect of the embodiments of this specification, a machine-readable storage medium is provided that stores executable instructions which, when executed, cause a machine to execute the model prediction method described above.
Brief Description of the Drawings
The nature and advantages of the embodiments of this specification can be further understood by referring to the following drawings, in which similar components or features may share the same reference signs.
Fig. 1 is a schematic diagram of an example of a neural network model;
Fig. 2 is a schematic diagram of an example of a neural network model training method based on multi-party secure computation;
Fig. 3 is a schematic diagram of one partition example of a neural network model according to an embodiment of this specification;
Figs. 4A-4D are schematic diagrams of examples of partitioned client sub-models and server models according to an embodiment of this specification;
Fig. 5 is a flowchart of an example of a neural network model training method based on multi-party secure computation according to an embodiment of this specification;
Fig. 6A is a schematic diagram of an example of horizontally partitioned training sample data according to an embodiment of the present disclosure;
Fig. 6B is a schematic diagram of an example of vertically partitioned training sample data according to an embodiment of the present disclosure;
Fig. 7A is a schematic diagram of another partition example of a neural network model according to an embodiment of this specification;
Fig. 7B is a schematic diagram of another partition example of a neural network model according to an embodiment of this specification;
Fig. 8 is a flowchart of a model prediction method based on a neural network model according to an embodiment of this specification;
Fig. 9 is a block diagram of a model training apparatus according to an embodiment of this specification;
Fig. 10 is a block diagram of a model prediction apparatus according to an embodiment of this specification;
Fig. 11 is a block diagram of an electronic device for implementing neural network model training based on multi-party secure computation according to an embodiment of this specification;
Fig. 12 is a block diagram of an electronic device for implementing model prediction based on a neural network model according to an embodiment of this specification.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and implement the subject matter described herein, not to limit the scope of protection, applicability, or examples set forth in the claims. The functions and arrangement of the elements discussed may be changed without departing from the scope of protection of the embodiments of this specification. Various examples may omit, substitute, or add procedures or components as needed; for example, the described methods may be executed in an order different from that described, and steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "including" and its variants denote open-ended terms meaning "including but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first", "second", and so on may refer to different or identical objects. Other definitions, whether explicit or implicit, may be included below. Unless the context clearly indicates otherwise, the definition of a term is consistent throughout this specification.
Fig. 1 is a schematic diagram of an example of a neural network model 100.
As shown in Fig. 1, the neural network model 100 includes an input layer 110, a first hidden layer 120, a second hidden layer 130, a third hidden layer 140, and an output layer 150.
The input layer 110 includes three input nodes N1, N2, and N3 and a bias term b1. The three input nodes N1, N2, and N3 receive data from three different data owners, respectively. In this specification, the term "data owner" is used interchangeably with the terms "model owner" and "training participant". The first hidden layer 120 includes two hidden-layer nodes N4 and N5 and a bias term b2. The hidden-layer nodes N4 and N5 are each fully connected to the three input nodes N1, N2, and N3 of the input layer 110 and the bias term b1. The weights between input node N1 and hidden-layer nodes N4 and N5 are W1,4 and W1,5, respectively; the weights between input node N2 and hidden-layer nodes N4 and N5 are W2,4 and W2,5, respectively; and the weights between input node N3 and hidden-layer nodes N4 and N5 are W3,4 and W3,5, respectively.
The second hidden layer 130 includes two hidden-layer nodes N6 and N7 and a bias term b3. The hidden-layer nodes N6 and N7 are each fully connected to the two hidden-layer nodes N4 and N5 of the first hidden layer 120 and the bias term b2. The weights between hidden-layer node N4 and hidden-layer nodes N6 and N7 are W4,6 and W4,7, respectively, and the weights between hidden-layer node N5 and hidden-layer nodes N6 and N7 are W5,6 and W5,7, respectively.
The third hidden layer 140 includes two hidden-layer nodes N8 and N9 and a bias term b4. The hidden-layer nodes N8 and N9 are each fully connected to the two hidden-layer nodes N6 and N7 of the second hidden layer 130 and the bias term b3. The weights between hidden-layer node N6 and hidden-layer nodes N8 and N9 are W6,8 and W6,9, respectively, and the weights between hidden-layer node N7 and hidden-layer nodes N8 and N9 are W7,8 and W7,9, respectively.
The output layer 150 includes an output node N10, which is fully connected to the two hidden-layer nodes N8 and N9 of the third hidden layer 140 and the bias term b4. The weight between hidden-layer node N8 and output node N10 is W8,10, and the weight between hidden-layer node N9 and output node N10 is W9,10.
In the neural network model shown in Fig. 1, the weights W1,4, W1,5, W2,4, W2,5, W3,4, W3,5, W4,6, W4,7, W5,6, W5,7, W6,8, W6,9, W7,8, W7,9, W8,10, and W9,10 are the layer-wise model parameters of the neural network model. During feedforward computation, the input nodes N1, N2, and N3 of the input layer 110 produce the inputs Z1 and Z2 of the hidden-layer nodes N4 and N5 of the first hidden layer 120, where Z1 = W1,4*X1 + W2,4*X2 + W3,4*X3 + b1 and Z2 = W1,5*X1 + W2,5*X2 + W3,5*X3 + b1. An activation function is then applied to Z1 and Z2 to obtain the outputs a1 and a2 of hidden-layer nodes N4 and N5. Feedforward computation proceeds layer by layer in this manner, as shown in Fig. 1, finally yielding the output a7 of the neural network model.
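The feedforward formulas for Z1 and Z2 can be checked numerically. The weight values, inputs, and the choice of sigmoid as the activation function below are all illustrative assumptions; the patent does not fix any of them:

```python
import math

# Illustrative values only; the patent does not specify weights or inputs.
X1, X2, X3, b1 = 1.0, 2.0, 3.0, 0.5
W14, W24, W34 = 0.1, 0.2, 0.3     # weights into hidden node N4
W15, W25, W35 = 0.4, 0.5, 0.6     # weights into hidden node N5

Z1 = W14 * X1 + W24 * X2 + W34 * X3 + b1   # input to N4
Z2 = W15 * X1 + W25 * X2 + W35 * X3 + b1   # input to N5

def sigmoid(z):                    # one common activation choice (assumed)
    return 1 / (1 + math.exp(-z))

a1, a2 = sigmoid(Z1), sigmoid(Z2)  # outputs of N4 and N5

assert abs(Z1 - 1.9) < 1e-9 and abs(Z2 - 3.7) < 1e-9
```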
Fig. 2 is a schematic diagram of an example of a neural network model training method 200 based on multi-party secure computation. In the method 200 shown in Fig. 2, three training participants Alice, Bob, and Charlie (i.e., the input nodes N1, N2, and N3 in Fig. 1) are taken as an example, where the first training participant Alice is the training initiator, that is, the training sample data at Alice is used for training. In the method shown in Fig. 2, each training participant Alice, Bob, and Charlie holds the model structure of all the layers of the neural network model, but each participant's layer-wise model parameters are only a portion of the corresponding layer parameters of the neural network model, and the sum of the participants' parameters for each layer equals the corresponding layer parameters of the neural network model.
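The key property behind this parameter split is that a layer's pre-activation is linear in its weights, so each participant can apply its own weight shares to a common input and the partial results sum to the undivided layer's result. The numbers below are made up for illustration:

```python
# Hypothetical numbers. Full first-layer weights into node N4:
W14, W24, W34, b1 = 0.9, -0.4, 0.7, 0.2
X = (1.0, 2.0, 3.0)

# Split each weight (and the bias) into three additive shares, one per party.
shares = [
    (0.5, -0.1, 0.2, 0.1),   # Alice's share of (W14, W24, W34, b1)
    (0.3, -0.2, 0.4, 0.0),   # Bob's share
    (0.1, -0.1, 0.1, 0.1),   # Charlie's share
]
assert all(abs(sum(s[i] for s in shares) - w) < 1e-9
           for i, w in enumerate((W14, W24, W34, b1)))

# Each party computes Z1 on its own shares; the partial results sum to the
# same Z1 the undivided model would compute.
partials = [w1 * X[0] + w2 * X[1] + w3 * X[2] + b for w1, w2, w3, b in shares]
full_Z1 = W14 * X[0] + W24 * X[1] + W34 * X[2] + b1
assert abs(sum(partials) - full_Z1) < 1e-9
```

This local-computation trick covers only the linear part; evaluating nonlinear activations over shares is where the heavier MPC machinery (secret-sharing protocols, garbled circuits, or homomorphic encryption) comes in.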
As shown in Fig. 2, first, at block 210, the first training participant Alice and the second training participants Bob and Charlie initialize the sub-model parameters of their respective neural network sub-models to obtain initial values, and initialize the number of executed training loops t to zero. Here, it is assumed that the loop-end condition is the execution of a predetermined number of training loops, for example, T training loops.
After the above initialization, the operations of blocks 220 to 260 are executed in a loop until the loop-end condition is satisfied.
Specifically, at block 220, multi-party secure computation is performed based on the current sub-models of the training participants to obtain the current predicted value Ŷ of the neural network model under training for the training sample data.

After the current predicted value is obtained, at block 230, the prediction difference e = Y − Ŷ between the current predicted value Ŷ and the corresponding label value Y is determined at the first training participant Alice. Here, e is a column vector, Y is a column vector of the label values of the training samples X, and Ŷ is a column vector of the current predicted values of the training samples X. If X contains only a single training sample, then e, Y, and Ŷ are all column vectors with a single element. If X contains multiple training samples, then e, Y, and Ŷ are all column vectors with multiple elements, where each element of Ŷ is the current predicted value of the corresponding training sample, each element of Y is the label value of the corresponding training sample, and each element of e is the difference between the label value and the current predicted value of the corresponding training sample.
Then, at block 240, the prediction difference is sent to the second training participants Bob and Charlie, respectively.
At block 250, at each training participant, the layer-wise model parameters of the neural network model at that participant are adjusted layer by layer through back propagation based on the determined prediction difference.
Next, at block 260, it is determined whether the predetermined number of loops has been reached. If not, the flow returns to block 220 to execute the next training loop, where the updated current sub-models obtained by the training participants in the current loop are used as the current sub-models of the next training loop.
If the predetermined number of loops has been reached, each training participant stores the current updated values of its sub-model parameters as the final values of those parameters, thereby obtaining its trained sub-model, and the flow ends.
It should be noted that, optionally, the end condition of the training loop may instead be that the determined prediction difference falls within a predetermined range, for example, that the sum of the elements e_i of the prediction difference e is less than a predetermined threshold, or that the mean of the elements e_i of e is less than a predetermined threshold. In that case, the operation of block 260 is executed after block 230. If the loop-end condition is satisfied, the flow ends; otherwise, the operations of blocks 240 and 250 are executed, and the flow returns to block 220 for the next loop.
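A minimal sketch of the two alternative loop-end checks. The threshold and difference vector are made up, and the use of absolute values is our assumption: the text does not say whether a signed sum is intended, and a signed sum of positive and negative elements could cancel misleadingly.

```python
# Hypothetical threshold and prediction-difference vector.
e = [0.03, -0.05, 0.02, 0.01]
threshold = 0.05

sum_ok = sum(abs(ei) for ei in e) < threshold            # sum-based criterion
mean_ok = sum(abs(ei) for ei in e) / len(e) < threshold  # mean-based criterion

# With these numbers, the mean criterion is met but the sum criterion is not.
assert not sum_ok and mean_ok
```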
In the neural network model training method 200 shown in Fig. 2, the model structure of every layer of the neural network model is implemented at each training participant, and every layer's computation is performed via multi-party secure computation (MPC). Because MPC is computationally complex and not very efficient, performing every layer's computation with MPC makes this training approach inefficient.
In view of the above, the embodiments of this specification propose a neural network model training method in which the neural network model is split into multiple model parts that alternate between client and server: some model parts are deployed on the server (hereinafter, the "server model parts") and others are deployed on the clients (hereinafter, the "client model parts"). The server model parts may include at least one server model, and the client model parts may include at least one client model. Each client model corresponds to one or more layers of the neural network model. Furthermore, each client model is decomposed into multiple client sub-models. For each client model, one client sub-model is deployed at each training participant (client); every client sub-model has the same model structure, and the layer parameters of each client sub-model are obtained by splitting the parameters of the corresponding layers of the neural network model, i.e., the sum of the parameters of the same layer nodes over all client sub-models equals the parameters of the corresponding layers of the neural network model. During model training, each client model performs its model computation with MPC through the cooperation of the training participants, while each server model performs its model computation in a non-MPC manner, for example using TensorFlow or PyTorch. In this way, only part of the neural network model is computed with MPC, and the remaining part can be computed with faster non-MPC methods, which improves model training efficiency. Moreover, when splitting the neural network model, the parts whose computation is unrelated to data privacy protection can be assigned to the server model, thereby protecting data privacy.
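The parameter splitting described above amounts to additive secret sharing of each weight tensor: every participant holds a random-looking share, and only the element-wise sum of all shares recovers the original parameters. A minimal sketch of this splitting, with hypothetical helper names and NumPy standing in for the actual framework:

```python
import numpy as np

def split_params(w: np.ndarray, num_parties: int, rng=None) -> list:
    """Split a weight tensor into additive shares, one per participant.

    The first num_parties - 1 shares are drawn at random; the last share is
    chosen so that all shares sum exactly to the original tensor.
    """
    rng = rng or np.random.default_rng(0)
    shares = [rng.standard_normal(w.shape) for _ in range(num_parties - 1)]
    shares.append(w - sum(shares))
    return shares

# Weights between the input layer (3 nodes) and the first hidden layer
# (2 nodes), split among 3 training participants as in FIGS. 4A-4C.
w = np.arange(6.0).reshape(3, 2)
shares = split_params(w, num_parties=3)

# Every share has the same structure as the original layer ...
assert all(s.shape == w.shape for s in shares)
# ... and the shares sum to the original parameters: w = w^(1) + w^(2) + w^(3).
assert np.allclose(shares[0] + shares[1] + shares[2], w)
```

No individual share reveals the original parameters; any proper subset of the shares is statistically uninformative about w.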
In the embodiments of this specification, the training sample data used by the neural network model may include training samples based on image data, speech data, or text data; correspondingly, the neural network model may be applied to business risk identification, business classification, or business decision-making based on image, speech, or text data. Alternatively, the training sample data may include user feature data; correspondingly, the neural network model may be applied to business risk identification, business classification, business recommendation, or business decision-making based on user feature data.
In the embodiments of this specification, the data to be predicted that is used by the neural network model may include image data, speech data, or text data, or alternatively user feature data.
In one example of this specification, the neural network model may include N hidden layers and may be split into a first client model and a single server model, where the first client model includes the input layer and the first through K-th hidden layers, and the server model includes the output layer and the (K+1)-th through N-th hidden layers.
FIG. 3 shows a schematic diagram of an example split of the neural network model 100 according to an embodiment of the present specification.
As shown in FIG. 3, the neural network model 100 is split at the third hidden layer 140 into a client model and a server model. The client model includes the input layer 110, the first hidden layer 120, and the second hidden layer 130. The server model includes the third hidden layer 140 and the output layer 150.
Because FIG. 3 shows three training participants, the client model is decomposed into three client sub-models, one deployed at each training participant, and the server model is deployed on the server. Every client sub-model has the same model structure, and the sum of the parameters of the same layer nodes over all client sub-models equals the parameters of the corresponding layer nodes of the neural network model. Specifically, the parameters between the input layer and the first hidden layer 120, and the parameters between the first hidden layer 120 and the second hidden layer 130, are each split into three parts, and each client sub-model owns one of those parts.
FIGS. 4A-4D show example schematic diagrams of the split client sub-models and the server model according to an embodiment of the present specification. The relationship between the sub-model parameters shown in FIGS. 4A-4C and the model parameters of the neural network model in FIG. 3 is as follows:
w_{1,4} = w_{1,4}^{(1)} + w_{1,4}^{(2)} + w_{1,4}^{(3)},  w_{1,5} = w_{1,5}^{(1)} + w_{1,5}^{(2)} + w_{1,5}^{(3)},
w_{2,4} = w_{2,4}^{(1)} + w_{2,4}^{(2)} + w_{2,4}^{(3)},  w_{2,5} = w_{2,5}^{(1)} + w_{2,5}^{(2)} + w_{2,5}^{(3)},
w_{3,4} = w_{3,4}^{(1)} + w_{3,4}^{(2)} + w_{3,4}^{(3)},  w_{3,5} = w_{3,5}^{(1)} + w_{3,5}^{(2)} + w_{3,5}^{(3)},
w_{4,6} = w_{4,6}^{(1)} + w_{4,6}^{(2)} + w_{4,6}^{(3)},  w_{4,7} = w_{4,7}^{(1)} + w_{4,7}^{(2)} + w_{4,7}^{(3)},
w_{5,6} = w_{5,6}^{(1)} + w_{5,6}^{(2)} + w_{5,6}^{(3)},  w_{5,7} = w_{5,7}^{(1)} + w_{5,7}^{(2)} + w_{5,7}^{(3)}.
In addition, as shown in FIG. 4D, the layer parameters of the server model are identical to the parameters of the corresponding layers of the neural network model. It should be noted that the model split in FIGS. 4A-4D corresponds to the horizontally partitioned data case. With vertically partitioned data, each data owner has only one node at the input layer; in that case, the single node of each data owner can be transformed into three nodes through vertical-to-horizontal switching, so that the split can proceed as shown in FIGS. 4A-4D.
FIG. 5 shows a flowchart of an example of a neural network model training method 500 based on multi-party secure computation according to an embodiment of the present specification. In the method 500, it is assumed that there are M (i.e., a first number of) training participants, and the neural network model is split as in FIG. 3. Here, the M training participants may be M data owners that hold the data required for training the neural network model, i.e., each data owner holds part of that data. In the embodiments of this specification, the partial data held by the M data owners may be horizontally partitioned training data or vertically partitioned training data.
FIG. 6A shows a schematic diagram of an example of horizontally partitioned training sample data according to an embodiment of the present specification. FIG. 6A shows two data parties, Alice and Bob; the case of more data parties is similar. Every training sample in the subset held by each of the data parties Alice and Bob is complete, i.e., each sample includes complete feature data (x) and label data (y). For example, Alice holds the complete training sample (x0, y0).
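A horizontally partitioned dataset can be sketched as follows: each party holds complete rows (features plus label), and stacking the parties' rows recovers the full dataset. The data values here are purely illustrative:

```python
import numpy as np

# Full (conceptual) dataset: 4 samples, 3 features each, plus a label column.
X = np.arange(12.0).reshape(4, 3)
y = np.array([0, 1, 1, 0])

# Horizontal partition: each party holds complete samples (rows).
alice_X, alice_y = X[:2], y[:2]   # Alice holds samples 0 and 1, e.g. (x0, y0)
bob_X, bob_y = X[2:], y[2:]       # Bob holds samples 2 and 3

# Each party's samples are complete: full feature vector plus label.
assert alice_X.shape[1] == X.shape[1]
# Stacking the parties' rows recovers the full dataset.
assert np.array_equal(np.vstack([alice_X, bob_X]), X)
assert np.array_equal(np.concatenate([alice_y, bob_y]), y)
```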
FIG. 6B shows a schematic diagram of an example of vertically partitioned training sample data according to an embodiment of the present specification. FIG. 6B shows two data parties, Alice and Bob; the case of more data parties is similar. Each of the data parties Alice and Bob holds a partial training sub-sample of every training sample in the training sample set, and for each training sample, the partial sub-samples held by Alice and Bob combine to form the complete content of that sample. For example, suppose the content of a certain training sample includes a label y_0 and a set of attribute features; after vertical partitioning, the training participant Alice holds y_0 and one subset of the attribute features of that sample, and the training participant Bob holds the remaining attribute features of that sample.
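By contrast, a vertically partitioned dataset splits each sample by columns: Alice holds the label and one block of attribute features, Bob holds the remaining features of the same sample, and concatenating the blocks recovers the complete sample. The data values here are purely illustrative:

```python
import numpy as np

# Full (conceptual) training sample: label y0 plus 5 attribute features.
y0 = 1
features = np.array([0.5, 1.2, -0.3, 2.0, 0.7])

# Vertical partition: Alice holds the label and the first block of features,
# Bob holds the remaining block of features of the same sample.
alice_part = {"label": y0, "features": features[:2]}
bob_part = {"features": features[2:]}

# Only the combination of both parts yields the complete sample content.
recovered = np.concatenate([alice_part["features"], bob_part["features"]])
assert np.array_equal(recovered, features)
assert alice_part["label"] == y0
```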
Returning to FIG. 5, first, at block 510, the client sub-models at the clients of the M training participants and the server model at the server are initialized.
Then, the operations of blocks 520 through 560 are executed in a loop until the loop end condition is satisfied.
Specifically, at block 520, the training sample data is provided to the current client sub-models at the clients of the M training participants, and multi-party secure computation is performed layer by layer to obtain the computation result of each current client sub-model. For the concrete realization of the multi-party secure computation, any suitable multi-party secure computation scheme in the art may be used. In this specification, the multi-party secure computation may include one of secret sharing (SS), garbled circuits (GC), and homomorphic encryption (HE).
At block 530, the computation results of the current client sub-models are provided to the current server model at the server, which computes layer by layer to obtain the current predicted value of the neural network model. The model computation performed at the server can be implemented in a non-MPC manner, for example with TensorFlow or PyTorch. It should be noted that the computation results of the current client sub-models can be merged before being provided to the current server model, i.e., a_3 = a_3^{(1)} + a_3^{(2)} + a_3^{(3)} and a_4 = a_4^{(1)} + a_4^{(2)} + a_4^{(3)}.
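The merge step relies on the additivity of the shares: for the linear part of a layer, each participant can multiply with its own weight share locally, and the element-wise sum of the resulting activation shares equals the activation computed with the full weights. The sketch below demonstrates only this linear reconstruction on a public input; in the actual scheme the inputs are also private and the nonlinear activations require a full MPC protocol (SS, GC, or HE), which this sketch omits:

```python
import numpy as np

rng = np.random.default_rng(42)

# Full weights between two layers (3 input nodes -> 2 hidden nodes).
w = rng.standard_normal((3, 2))

# Additive shares of w held by 3 training participants: w = w1 + w2 + w3.
w1, w2 = rng.standard_normal(w.shape), rng.standard_normal(w.shape)
w3 = w - w1 - w2

x = rng.standard_normal((1, 3))  # one input sample (public, for illustration)

# Each participant computes its activation share locally from its weight share.
a_shares = [x @ wk for wk in (w1, w2, w3)]

# The server merges the shares: a = a^(1) + a^(2) + a^(3),
# which equals the activation computed with the unsplit weights.
merged = a_shares[0] + a_shares[1] + a_shares[2]
assert np.allclose(merged, x @ w)
```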
After the current predicted value of the neural network model is obtained as above, at block 540, the current prediction difference is determined based on the current predicted value and the sample label value.
It should be noted that, in one example, the determination of the current prediction difference can be performed at the server. In this case, the sample label values held by the training participants need to be transmitted to the server.
Optionally, in another example, the determination of the current prediction difference can be performed at the client of the training participant that holds the sample label values. In this case, the current predicted value determined by the server is fed back to that training participant, and the current prediction difference is determined there. In this way, the sample label values need not be sent to the server, which further protects the privacy of the sample label values at the training participant.
Next, at block 550, it is determined whether the current prediction difference is within a predetermined difference range, for example, whether it is less than a predetermined threshold. If the current prediction difference is not within the predetermined range, for example, not less than the predetermined threshold, then at block 560 the layer parameters of the server model and of each client sub-model are adjusted layer by layer through backpropagation according to the current prediction difference, and the flow returns to block 520 to execute the next cycle, with the adjusted server model and client sub-models serving as the current server model and current client sub-models of the next cycle.
If the current prediction difference is within the predetermined difference range, for example, less than the predetermined threshold, the training process ends.
In addition, optionally, the end condition of the training loop may instead be reaching a predetermined number of cycles. In this case, the operation of block 550 is performed after the operation of block 560, i.e., after the current prediction difference is determined at block 540 and the operation of block 560 is performed, it is then determined whether the predetermined number of cycles has been reached. If so, the training process ends; if not, the flow returns to block 520 to execute the next cycle.
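The control flow of blocks 520-560 with either stopping criterion can be sketched as follows. The `forward` and `backward` callables are hypothetical stand-ins for the MPC client computation plus non-MPC server computation and for the backpropagation adjustment, respectively:

```python
def train(forward, backward, max_cycles=100, diff_threshold=1e-3):
    """Training loop of method 500.

    forward():       blocks 520-540 -- client MPC forward, server forward,
                     returning the current prediction difference (scalar here).
    backward(diff):  block 560 -- backpropagate and adjust all sub-models.
    Stops when the difference falls below diff_threshold (block 550) or
    when max_cycles cycles have been executed (the optional end condition).
    """
    for cycle in range(1, max_cycles + 1):
        diff = forward()                 # blocks 520-540
        if abs(diff) < diff_threshold:   # block 550
            return cycle
        backward(diff)                   # block 560
    return max_cycles

# Toy stand-in: the "prediction difference" halves on every adjustment.
state = {"diff": 1.0}
cycles_used = train(lambda: state["diff"],
                    lambda d: state.update(diff=d / 2.0))
assert cycles_used < 100 and state["diff"] < 1e-3
```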
The neural network model training method according to the embodiments of this specification has been described above with reference to FIG. 3 and FIGS. 6A-6B. In the example shown in FIG. 3, the client model includes two hidden layers. In other embodiments of this specification, the client model may include more or fewer hidden layers, for example one hidden layer or more than two hidden layers. In the embodiments of this specification, the total number of hidden layers included in the client model can be determined according to the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level.
It should be noted that the neural network model training method described in FIG. 5 is directed to the model split scheme shown in FIG. 3. In other embodiments of this specification, the neural network model can be split according to other schemes, as shown in FIGS. 7A and 7B.
FIG. 7A shows a schematic diagram of another example of a neural network split scheme. As shown in FIG. 7A, the neural network model is split into a first client model, a server model, and a second client model. The first client model includes the input layer 110 and the first hidden layer 120; the server model includes the second hidden layer 130; and the second client model includes the third hidden layer 140 and the output layer 150. In the split scheme shown in FIG. 7A, each of the first client model and the second client model can be decomposed into three client sub-models in a manner similar to FIGS. 4A-4C, while the server model is identical to the corresponding layers of the neural network model. It should be noted that the corresponding client sub-models of both the first client model and the second client model are deployed at the client of each training participant.
For the split scheme shown in FIG. 7A, during each cycle of model training, the training sample data is first provided to the current first client sub-models of the first client model, and multi-party secure computation is performed layer by layer to obtain the computation result of each current first client sub-model. Then the computation results of the current first client sub-models are provided to the server model, which performs non-MPC computation layer by layer to obtain its computation result. Finally, the computation result of the server model is provided to the current second client sub-models of the second client model, and multi-party secure computation is performed layer by layer to obtain the current prediction result of the neural network model.
FIG. 7B shows a schematic diagram of another example of a neural network split scheme. As shown in FIG. 7B, the neural network model is split into at least one client model and at least one server model, for example a first client model, a first server model, a second client model, a second server model, and a third client model. The first client model includes the input layer 110 and the first hidden layer 120; the first server model includes the second hidden layer 130; the second client model includes the third hidden layer 140; the second server model includes the fourth hidden layer 150; and the third client model includes the fifth hidden layer 160 and the output layer 170. In the split scheme shown in FIG. 7B, each of the first, second, and third client models can be decomposed into three client sub-models in a manner similar to FIGS. 4A-4C, while the first and second server models are identical to the corresponding layers of the neural network model. It should be noted that the corresponding client sub-models of the first, second, and third client models are deployed at the client of each training participant.
For the split scheme shown in FIG. 7B, during each cycle of model training, each current client model performs multi-party secure computation layer by layer through the training participants, using its current client sub-models together with either the training sample data or the computation result of the preceding current server model, to obtain that client model's result. Each current server model performs non-MPC computation layer by layer using the computation result of the preceding client model to obtain its result. The at least one client model and the at least one server model cooperate in this way to produce the current predicted value of the neural network model.
Specifically, the training sample data is provided to the current first client sub-models of the current first client model, and multi-party secure computation is performed layer by layer to obtain their computation results. These results are provided to the current first server model, which performs non-MPC computation layer by layer to obtain its result. The result of the current first server model is then provided to the current second client sub-models of the current second client model for layer-by-layer multi-party secure computation, whose results are in turn provided to the current second server model for layer-by-layer non-MPC computation. Finally, the result of the current second server model is provided to the current third client sub-models of the current third client model, and multi-party secure computation is performed layer by layer to obtain the current prediction result of the neural network model.
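The alternating client/server pipeline of FIG. 7B can be sketched as a composition of stages, where each client stage operates on additive shares held by the participants and each server stage operates on the merged value. For illustration all stages are linear (so computing on shares and merging afterward equals computing on the plaintext) and each participant applies the unsplit stage weights; real client stages would operate on weight shares and use SS, GC, or HE protocols for the nonlinear parts. Helper names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

def share(x, n=3):
    """Split x into n additive shares, one per training participant."""
    parts = [rng.standard_normal(x.shape) for _ in range(n - 1)]
    parts.append(x - sum(parts))
    return parts

def client_stage(w, x_shares):
    """MPC client model (linear sketch): each participant acts on its share."""
    return [xs @ w for xs in x_shares]

def server_stage(w, x_shares):
    """Non-MPC server model: merge the shares, then compute in the clear."""
    return sum(x_shares) @ w

# Five alternating stages as in FIG. 7B: client, server, client, server, client.
dims = [4, 5, 5, 5, 5, 3]
weights = [rng.standard_normal((dims[i], dims[i + 1])) for i in range(5)]

x = rng.standard_normal((1, 4))
h = share(x)                             # participants secret-share the input
h = client_stage(weights[0], h)          # first client model (on shares)
h = share(server_stage(weights[1], h))   # first server model, re-share output
h = client_stage(weights[2], h)          # second client model (on shares)
h = share(server_stage(weights[3], h))   # second server model, re-share output
h = client_stage(weights[4], h)          # third client model (on shares)
prediction = sum(h)                      # merge the final shares

# For linear stages this equals the plaintext forward pass.
expected = x @ weights[0] @ weights[1] @ weights[2] @ weights[3] @ weights[4]
assert np.allclose(prediction, expected)
```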
In addition, it should be noted that, in one example of this specification, the first client model may include part of the hidden layers, and in another example each server model may include at least part of the hidden layers.
It should also be noted that, in the embodiments of this specification, the neural network model can be split according to whether the model computation of each of its layers is related to data privacy protection: the layers whose computation is related to data privacy protection are assigned to the client models, and the layers whose computation is unrelated to data privacy protection are assigned to the server models. In addition, a client model may also include layers that are unrelated to data privacy protection.
In this specification, the model computations related to data privacy protection may be those that directly use the individual inputs X_i or the output Y, for example the computations corresponding to the input layer or to the output layer. The model computations unrelated to data privacy protection may be those that do not need to use the individual inputs X_i or the output Y, for example the intermediate hidden layers of the neural network model.
With the embodiments of this specification, a neural network model can be provided that is split, in an alternating client/server fashion, into at least one client model and at least one server model, where each client model is decomposed into a first number of client sub-models with identical sub-model structures, each client sub-model is deployed at the client of one training participant, and each server model is deployed at the server. During model training, the training sample data is provided to the neural network model, and the current predicted value is obtained through the cooperative computation of the client models at the training participants' clients and the server models at the server. The model computation of the client models is implemented with MPC, while the model computation of the server models is implemented without MPC, which reduces the number of model layers on which multi-party secure computation must be performed, thereby increasing training speed and improving training efficiency.
In addition, with the neural network model training method according to the embodiments of this specification, only the layers of the neural network model that are unrelated to data privacy protection are assigned to the server models, which ensures the data privacy of each data owner.
Furthermore, with the neural network model training scheme according to the embodiments of this specification, the total number of hidden layers included in the client models can be determined and adjusted according to the computing power available for model training, the training timeliness required by the application scenario, and/or the required training security level, so that the environmental conditions of model training, the data security requirements, and the training efficiency can be traded off when splitting the neural network model.
Moreover, with the neural network model training scheme according to the embodiments of this specification, the determination of the current prediction difference can be performed at the client of the training participant that holds the sample label values. In this way, the sample label values need not be sent to the server, which further protects the privacy of the sample label values at the training participant.
FIG. 8 shows a flowchart of a model prediction method 800 based on a neural network model according to an embodiment of the present specification. In the embodiment shown in FIG. 8, the neural network model includes N hidden layers and is split into a single server model and a first number of client sub-models. Each client sub-model includes the input layer and the first through K-th hidden layers; the server model includes the output layer and the (K+1)-th through N-th hidden layers. The server model is deployed at the server, and each client sub-model is deployed at the client of one model owner. The client sub-models have identical sub-model structures, and the first number of client sub-models together constitute the corresponding model structure of the neural network model.
As shown in FIG. 8, at block 810, the data to be predicted is received. The data to be predicted may be received from any model owner.
Next, at block 820, the received data to be predicted is provided to the current client sub-models at the clients of the first number of model owners, and multi-party secure computation is performed layer by layer to obtain the computation result of each current client sub-model.
Then, at block 830, the computation results of the current client sub-models are provided to the server model at the server, which performs non-MPC computation layer by layer to obtain the model prediction result of the neural network model.
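Blocks 810-830 can be sketched end to end: the model owners jointly compute shares of the hidden activation from their weight shares, and the server merges those shares and finishes the forward pass in the clear. As before, the client stage is shown only for its linear part, with the full MPC protocol omitted, and all names and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

w_client = rng.standard_normal((4, 3))   # layers up to the K-th hidden layer
w_server = rng.standard_normal((3, 2))   # (K+1)-th hidden layer to output

# Additive shares of the client-side weights held by 3 model owners.
shares = [rng.standard_normal(w_client.shape) for _ in range(2)]
shares.append(w_client - sum(shares))

x = rng.standard_normal((1, 4))          # block 810: data to be predicted

# Block 820: each model owner computes its share of the client result.
client_results = [x @ wk for wk in shares]

# Block 830: the server merges the shares and computes the rest in the clear.
hidden = sum(client_results)
prediction = hidden @ w_server
assert np.allclose(prediction, x @ w_client @ w_server)
```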
Similarly, when the neural network model is split according to another split scheme, the model prediction process shown in FIG. 8 can be adapted according to the corresponding model computation scheme.
图9示出了根据本说明书的实施例的模型训练装置900的方框图。如图9所示,模型训练装置900包括模型预测单元910、预测差值确定单元920和模型调整单元930。FIG. 9 shows a block diagram of a model training device 900 according to an embodiment of the present specification. As shown in FIG. 9, the model training device 900 includes a model prediction unit 910, a prediction difference determination unit 920, and a model adjustment unit 930.
模型预测单元910、预测差值确定单元920和模型调整单元930循环执行操作,直到满足循环结束条件。所述循环结束条件可以包括:循环次数达到预定次数;或者当前预测差值在预定差值范围内。The model prediction unit 910, the prediction difference determination unit 920, and the model adjustment unit 930 perform operations in a loop until the loop end condition is satisfied. The loop ending condition may include: the number of loops reaches a predetermined number of times; or the current prediction difference is within a predetermined difference range.
Specifically, the model prediction unit 910 is configured to provide training sample data to the current neural network model so as to obtain the current prediction value of the current neural network model through cooperative computation of the current client models and the current server models. At each current client model, the first number of training participants use their respective current client sub-models and the training sample data (or the computation result of the preceding current server model) to perform multi-party secure computation layer by layer, obtaining the computation result of that current client model. At each current server model, the computation result of the preceding current client model is used to perform non-multi-party secure computation layer by layer, obtaining the computation result of that current server model.
In one example of this specification, the model prediction unit 910 may include a multi-party secure computation module and a server computation module. The multi-party secure computation module is configured, for each current client model, to perform multi-party secure computation layer by layer via the first number of training participants, using the training sample data (or the computation result of the preceding current server model) together with their respective current client sub-models, to obtain the computation result of that current client model. The multi-party secure computation module is deployed at the clients. The server computation module is configured, for each server model, to perform non-multi-party secure computation layer by layer using the computation result of the preceding current client model, to obtain the computation result of that server model. The server computation module is deployed at the server. The multi-party secure computation module and the server computation module compute cooperatively to obtain the prediction value of the neural network model.
The prediction difference determination unit 920 is configured to determine the current prediction difference based on the current prediction value and the sample label value. Optionally, the prediction difference determination unit 920 may be deployed at the server or at a client.
The model adjustment unit 930 is configured, when the loop-end condition is not satisfied, to adjust the layer-wise model parameters of each current server model and each current client sub-model through back-propagation according to the current prediction difference; the adjusted server models and client sub-models serve as the current server models and current client sub-models of the next loop iteration. Some components of the model adjustment unit 930 are deployed at the clients, and the remaining components are deployed at the server.
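The division of back-propagation implied here can be sketched for a two-layer linear toy model (a plaintext sketch under squared loss; in the actual scheme the client-side steps would themselves run under multi-party secure computation): the server adjusts its own layer and hands the gradient at the split point back to the client, which finishes back-propagation through its sub-model.

```python
import numpy as np

# Plaintext sketch of split back-propagation: the server back-propagates
# through its layer and returns the gradient at the split point; the client
# continues back-propagation through its sub-model layer locally.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4))    # training sample
Wc = rng.standard_normal((4, 6))   # client sub-model layer
Ws = rng.standard_normal((6, 1))   # server model layer
y_true = np.array([[1.0]])         # sample label value

h = x @ Wc                         # client-side forward pass
y = h @ Ws                         # server-side forward pass
diff = y - y_true                  # current prediction difference

g_Ws = h.T @ diff                  # gradient for the server layer (dL/dWs)
g_h = diff @ Ws.T                  # gradient at the split point, sent to client
g_Wc = x.T @ g_h                   # gradient for the client layer (dL/dWc)

lr = 0.1                           # learning rate (arbitrary for the sketch)
Ws -= lr * g_Ws                    # server adjusts its layer parameters
Wc -= lr * g_Wc                    # client adjusts its sub-model parameters
```

The only value crossing the boundary during back-propagation is the gradient at the split point (`g_h`), which is why part of the adjustment logic sits at the server and the rest at the clients.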
FIG. 10 shows a block diagram of a model prediction apparatus 1000 according to an embodiment of the present specification. As shown in FIG. 10, the model prediction apparatus 1000 includes a data receiving unit 1010 and a model prediction unit 1020.
The data receiving unit 1010 is configured to receive data to be predicted. The data to be predicted may be received from any model owner. A data receiving unit 1010 is deployed at the client of each model owner.
The model prediction unit 1020 is configured to provide the data to be predicted to the neural network model so as to obtain the prediction value of the neural network model through cooperative computation of the client models and the server models. At each client model, the first number of model owners use their respective client sub-models and the data to be predicted (or the computation result of the preceding server model) to perform multi-party secure computation layer by layer, obtaining the computation result of that client model. At each server model, the computation result of the preceding client model is used to perform non-multi-party secure computation layer by layer, obtaining the computation result of that server model.
In one example of this specification, the model prediction unit 1020 may include a multi-party secure computation module and a server computation module. The multi-party secure computation module is configured, for each client model, to perform multi-party secure computation layer by layer via the first number of model owners, using the data to be predicted (or the computation result of the preceding server model) together with their respective client sub-models, to obtain the computation result of that client model. The server computation module is configured, for each server model, to perform non-multi-party secure computation layer by layer using the computation result of the preceding client model, to obtain the computation result of that server model. The multi-party secure computation module and the server computation module compute cooperatively to obtain the prediction value of the neural network model. The multi-party secure computation module is deployed at the client of each model owner, and the server computation module is deployed at the server.
Embodiments of the neural network model training method and apparatus and of the model prediction method and apparatus according to this specification have been described above with reference to FIGS. 1 to 10. The model training apparatus and the model prediction apparatus may be implemented in hardware, in software, or in a combination of hardware and software.
FIG. 11 shows a structural block diagram of an electronic device 1100 for implementing neural network model training based on multi-party secure computation according to an embodiment of the present specification.
As shown in FIG. 11, the electronic device 1100 may include at least one processor 1110, a storage (e.g., a non-volatile storage) 1120, a memory 1130, a communication interface 1140, and a bus 1160, where the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via the bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (i.e., the above-described elements implemented in software) stored or encoded in a computer-readable storage medium.
In one embodiment, computer-executable instructions are stored in the storage which, when executed, cause the at least one processor 1110 to: execute the following loop process until a loop-end condition is satisfied: providing training sample data to the current neural network model so as to obtain the current prediction value of the current neural network model through cooperative computation of each current client model and each current server model, wherein, at each current client model, the first number of training participants use their respective current client sub-models and the training sample data (or the computation result of the preceding current server model) to perform multi-party secure computation layer by layer to obtain the computation result of that current client model, and, at each current server model, the computation result of the preceding current client model is used to perform non-multi-party secure computation layer by layer to obtain the computation result of that current server model; determining the current prediction difference based on the current prediction value and the sample label value; and, when the loop-end condition is not satisfied, adjusting, layer by layer through back-propagation according to the current prediction difference, the layer-wise model parameters of each current server model and each current client sub-model, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop iteration.
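The loop process above can be reduced to a small skeleton (illustrative; in the actual scheme `predict()` is the cooperative client-MPC / server-plaintext forward pass and `adjust()` is the split back-propagation step, and both function names are hypothetical):

```python
# Skeleton of the training loop: forward prediction -> prediction difference
# -> back-propagation, repeated until the iteration count reaches a
# predetermined number or the difference falls within a predetermined range.
def train_loop(predict, adjust, label, max_loops, tol):
    for step in range(1, max_loops + 1):
        pred = predict()
        diff = pred - label          # current prediction difference
        if abs(diff) <= tol:         # loop-end condition: difference in range
            break
        adjust(diff)                 # back-propagate and adjust parameters
    return step, diff                # also ends when step reaches max_loops

# Toy one-parameter "model" y = 2 * w, to exercise the loop.
state = {"w": 0.0}
step, diff = train_loop(
    predict=lambda: 2.0 * state["w"],
    adjust=lambda d: state.__setitem__("w", state["w"] - 0.1 * 2.0 * d),
    label=1.0, max_loops=100, tol=1e-3)
print(step, abs(diff) <= 1e-3)
```

Either termination clause matches the loop-end condition recited in claim 8: a predetermined number of iterations, or a prediction difference within a predetermined range.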
It should be understood that the computer-executable instructions stored in the storage, when executed, cause the at least one processor 1110 to perform the various operations and functions described above in conjunction with FIGS. 1-10 in the embodiments of this specification.
FIG. 12 shows a structural block diagram of an electronic device 1200 for implementing model prediction based on a neural network model according to an embodiment of the present specification.
As shown in FIG. 12, the electronic device 1200 may include at least one processor 1210, a storage (e.g., a non-volatile storage) 1220, a memory 1230, a communication interface 1240, and a bus 1260, where the at least one processor 1210, the storage 1220, the memory 1230, and the communication interface 1240 are connected together via the bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the above-described elements implemented in software) stored or encoded in a computer-readable storage medium.
In one embodiment, computer-executable instructions are stored in the storage which, when executed, cause the at least one processor 1210 to: receive data to be predicted; and provide the data to be predicted to the neural network model so as to obtain the prediction value of the neural network model through cooperative computation of each client model and each server model, wherein, at each client model, the first number of model owners use their respective client sub-models and the data to be predicted (or the computation result of the preceding server model) to perform multi-party secure computation layer by layer to obtain the computation result of that client model, and, at each server model, the computation result of the preceding client model is used to perform non-multi-party secure computation layer by layer to obtain the computation result of that server model.
It should be understood that the computer-executable instructions stored in the storage, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in conjunction with FIGS. 1-10 in the embodiments of this specification.
In embodiments of this specification, the electronic device 1100/1200 may include, but is not limited to: a personal computer, a server computer, a workstation, a desktop computer, a laptop computer, a notebook computer, a mobile computing device, a smart phone, a tablet computer, a cellular phone, a personal digital assistant (PDA), a handheld device, a wearable computing device, a consumer electronic device, and so on.
According to one embodiment, a program product such as a non-transitory machine-readable medium is provided. The non-transitory machine-readable medium may carry instructions (i.e., the above-described elements implemented in software) which, when executed by a machine, cause the machine to perform the various operations and functions described above in conjunction with FIGS. 1-10 in the embodiments of this specification.
Specifically, a system or apparatus equipped with a readable storage medium may be provided, on which software program code implementing the functions of any of the above embodiments is stored, and the computer or processor of the system or apparatus reads and executes the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can implement the functions of any of the above embodiments; hence, the machine-readable code and the readable storage medium storing the machine-readable code constitute a part of the present invention.
Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, and DVD-RW), magnetic tapes, non-volatile memory cards, and ROM. Alternatively, the program code may be downloaded from a server computer or a cloud via a communication network.
Those skilled in the art should understand that various variations and modifications may be made to the embodiments disclosed above without departing from the essence of the invention. Therefore, the protection scope of the present invention should be defined by the appended claims.
It should be noted that not all of the steps and units in the above flows and system structure diagrams are necessary; certain steps or units may be omitted according to actual needs. The execution order of the steps is not fixed and may be determined as needed. The apparatus structures described in the above embodiments may be physical structures or logical structures; that is, some units may be implemented by the same physical entity, some units may be implemented separately by multiple physical entities, and some units may be implemented jointly by certain components of multiple independent devices.
In the above embodiments, a hardware unit or module may be implemented mechanically or electrically. For example, a hardware unit, module, or processor may include permanently dedicated circuitry or logic (such as a dedicated processor, an FPGA, or an ASIC) to complete the corresponding operations. A hardware unit or processor may also include programmable logic or circuitry (such as a general-purpose processor or another programmable processor), which may be temporarily configured by software to complete the corresponding operations. The specific implementation (mechanical means, a dedicated permanent circuit, or a temporarily configured circuit) may be determined based on cost and time considerations.
The detailed description set forth above in conjunction with the drawings describes exemplary embodiments, but does not represent all embodiments that may be implemented or that fall within the protection scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, these techniques may be implemented without these specific details. In some instances, well-known structures and apparatuses are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The foregoing description of the present disclosure is provided to enable any person of ordinary skill in the art to implement or use the present disclosure. Various modifications to the present disclosure will be obvious to those of ordinary skill in the art, and the general principles defined herein may also be applied to other variations without departing from the protection scope of the present disclosure. Therefore, the present disclosure is not limited to the examples and designs described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (24)

  1. A neural network model training method based on multi-party secure computation, wherein the neural network model is trained cooperatively by a first number of training participants, the neural network model includes a plurality of hidden layers and is partitioned, in a manner in which client models and server models alternate, into at least one client model and at least one server model, each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at a server, and each client sub-model is deployed at the client of a corresponding training participant, the method comprising:
    executing the following loop process until a loop-end condition is satisfied:
    providing training sample data to the current neural network model so as to obtain a current prediction value of the current neural network model through cooperative computation of each current client model and each current server model, wherein, at each current client model, the training participants use their respective current client sub-models and the training sample data or the computation result of the preceding current server model to perform multi-party secure computation layer by layer to obtain the computation result of that current client model, and, at each current server model, the computation result of the preceding current client model is used to perform non-multi-party secure computation layer by layer to obtain the computation result of that current server model;
    determining a current prediction difference based on the current prediction value and a sample label value; and
    when the loop-end condition is not satisfied, adjusting, layer by layer through back-propagation according to the current prediction difference, the layer-wise model parameters of each current server model and each current client sub-model, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop process.
  2. The neural network model training method according to claim 1, wherein the model computation of the neural network hierarchical structure in the server model is unrelated to data privacy protection.
  3. The neural network model training method according to claim 1, wherein the total number of hidden layers included in the client model is determined according to the computing power available for model training, the training timeliness required by the application scenario, and/or the training security level.
  4. The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers and is partitioned into a first client model and a single server model, the first client model includes the input layer and the first through Kth hidden layers, and the server model includes the output layer and the (K+1)th through Nth hidden layers.
  5. The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers and is partitioned into a first client model, a single server model, and a second client model, the first client model includes the input layer and the first through Kth hidden layers, the server model includes the (K+1)th through Lth hidden layers, and the second client model includes the output layer and the (L+1)th through Nth hidden layers.
  6. The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers and is partitioned into a first client model, a single server model, and a second client model, the first client model includes the input layer and the first through Kth hidden layers, the server model includes the (K+1)th through Nth hidden layers, and the second client model includes the output layer.
  7. The neural network model training method according to claim 1, wherein the process of determining the current prediction difference is executed at the server or at the client of the training participant that owns the sample label value.
  8. The neural network model training method according to claim 1, wherein the loop-end condition includes:
    the number of loop iterations reaching a predetermined number; or
    the current prediction difference being within a predetermined difference range.
  9. The neural network model training method according to claim 1, wherein the multi-party secure computation includes one of secret sharing, garbled circuits, and homomorphic encryption.
  10. The neural network model training method according to claim 1, wherein the model computation at the server is implemented using TensorFlow or PyTorch technology.
  11. The neural network model training method according to any one of claims 1 to 10, wherein the training sample data includes training sample data based on image data, speech data, or text data, or the training sample data includes user feature data.
  12. A model prediction method based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is partitioned, in a manner in which client models and server models alternate, into at least one client model and at least one server model, each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at a server, and each client sub-model is deployed at the client of a corresponding model owner among the first number of model owners, the model prediction method comprising:
    receiving data to be predicted; and
    providing the data to be predicted to the neural network model so as to obtain a prediction value of the neural network model through cooperative computation of each client model and each server model,
    wherein, at each client model, the model owners use their respective client sub-models and the data to be predicted or the computation result of the preceding server model to perform multi-party secure computation layer by layer to obtain the computation result of that client model, and, at each server model, the computation result of the preceding client model is used to perform non-multi-party secure computation layer by layer to obtain the computation result of that server model.
  13. The model prediction method according to claim 12, wherein the data to be predicted includes image data, speech data, or text data, or the data to be predicted includes user feature data.
  14. A neural network model training apparatus based on multi-party secure computation, wherein the neural network model is trained cooperatively by a first number of training participants, the neural network model includes a plurality of hidden layers and is partitioned, in a manner in which client models and server models alternate, into at least one client model and at least one server model, each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at a server, and each client sub-model is deployed at the client of a corresponding training participant, the neural network model training apparatus comprising:
    a model prediction unit that provides training sample data to the current neural network model so as to obtain a current prediction value of the current neural network model through cooperative computation of each current client model and each current server model, wherein, at each current client model, the training participants use their respective current client sub-models and the training sample data or the computation result of the preceding current server model to perform multi-party secure computation layer by layer to obtain the computation result of that current client model, and, at each current server model, the computation result of the preceding current client model is used to perform non-multi-party secure computation layer by layer to obtain the computation result of that current server model;
    a prediction difference determination unit that determines a current prediction difference based on the current prediction value and a sample label value; and
    a model adjustment unit that, when a loop-end condition is not satisfied, adjusts, layer by layer through back-propagation according to the current prediction difference, the layer-wise model parameters of each current server model and each current client sub-model, the adjusted server models and client sub-models serving as the current server models and current client sub-models of the next loop process,
    wherein the model prediction unit, the prediction difference determination unit, and the model adjustment unit operate in a loop until the loop-end condition is satisfied.
  15. 如权利要求14所述的神经网络模型训练装置,其中,所述服务端模型中的神经网络模型分层结构的模型计算与数据隐私保护无关。The neural network model training device according to claim 14, wherein the model calculation of the neural network model hierarchical structure in the server model has nothing to do with data privacy protection.
  16. 如权利要求14所述的神经网络模型训练装置,其中,所述客户端模型所包括的隐层的总层数根据用于模型训练的算力、应用场景所要求的训练时效性和/或训练安全等级确定。The neural network model training device according to claim 14, wherein the total number of hidden layers included in the client model is based on computing power used for model training, training timeliness required by application scenarios, and/or training The security level is determined.
  17. 如权利要求14所述的神经网络模型训练装置,其中,所述神经网络模型包括N个隐层,所述神经网络模型被分割为第一客户端模型和单个服务端模型,所述第一客户端模型包括输入层以及第一隐层到第K隐层,以及所述服务端模型包括输出层以及第K+1隐层到第N隐层。The neural network model training device according to claim 14, wherein the neural network model includes N hidden layers, the neural network model is divided into a first client model and a single server model, and the first client The end model includes an input layer and the first hidden layer to the Kth hidden layer, and the server model includes an output layer and the K+1th hidden layer to the Nth hidden layer.
  18. 如权利要求14所述的神经网络模型训练装置,其中,所述神经网络模型包括N个隐层,所述神经网络模型被分割为第一客户端模型、单个服务端模型和第二客户端 模型,所述第一客户端模型包括输入层以及第一隐层到第K隐层,所述服务端模型包括第K+1隐层到第L隐层,以及所述第二客户端模型包括输出层以及第L+1隐层到第N隐层。The neural network model training device according to claim 14, wherein the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model , The first client model includes an input layer and a first hidden layer to a Kth hidden layer, the server model includes a K+1th hidden layer to an Lth hidden layer, and the second client model includes an output Layer and the L+1th hidden layer to the Nth hidden layer.
  19. The neural network model training apparatus according to claim 14, wherein the prediction difference determination unit is disposed at the server or at a client.
  20. A model prediction apparatus based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is split, with client models and server models alternating, into at least one client model and at least one server model; each client model is decomposed into a first number of client submodels, each client submodel having the same submodel structure; the at least one server model is deployed at a server; and each client submodel is deployed at the client of the corresponding model owner among the first number of model owners; the model prediction apparatus comprising:
    a data receiving unit that receives data to be predicted; and
    a model prediction unit that provides the data to be predicted to the neural network model so as to obtain the prediction value of the neural network model through cooperative computation by the client models and the server models,
    wherein, at each client model, the respective model owners use their own client submodels together with the data to be predicted, or with the computation result of the preceding server model, to perform multi-party secure computation layer by layer, thereby obtaining the computation result of that client model; and, at each server model, the computation result of the preceding client model is used to perform non-multi-party secure computation layer by layer, thereby obtaining the computation result of that server model.
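One way the model owners' client submodels can jointly compute a layer without any single owner holding the full weights is additive secret sharing. The sketch below is an assumption for illustration, not the patent's exact protocol: it shares one linear layer's weight matrix among three owners and exploits the linearity of the layer so that the owners' local partial outputs sum to the plaintext result.

```python
import numpy as np

rng = np.random.default_rng(0)
num_owners = 3                        # the "first number" of model owners (assumed value)
x = rng.standard_normal((1, 4))       # activation entering the client model
W = rng.standard_normal((4, 2))       # full submodel weight, never held by any single owner

# Split W into additive shares: W = W_0 + W_1 + W_2.
shares = [rng.standard_normal(W.shape) for _ in range(num_owners - 1)]
shares.append(W - sum(shares))

# Each owner computes its share of the layer output locally; because
# x @ (W_0 + W_1 + W_2) == x @ W_0 + x @ W_1 + x @ W_2, summing the
# partial outputs reconstructs the layer output.
partial_outputs = [x @ w_share for w_share in shares]
reconstructed = sum(partial_outputs)
```

Note that each individual share is statistically independent of W, which is what keeps the submodel weights private; a nonlinear activation applied after the sum would require a genuine MPC protocol rather than this purely local trick.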
  21. An electronic device, comprising:
    one or more processors, and
    a memory coupled to the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any one of claims 1 to 11.
  22. A machine-readable storage medium storing executable instructions that, when executed, cause a machine to perform the method of any one of claims 1 to 11.
  23. An electronic device, comprising:
    one or more processors, and
    a memory coupled to the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of claim 12 or 13.
  24. A machine-readable storage medium storing executable instructions that, when executed, cause a machine to perform the method of claim 12 or 13.
PCT/CN2020/124137 2019-11-28 2020-10-27 Multi-party security calculation-based neural network model training and prediction methods and device WO2021103901A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911195445.9 2019-11-28
CN201911195445.9A CN110942147B (en) 2019-11-28 2019-11-28 Neural network model training and predicting method and device based on multi-party safety calculation

Publications (1)

Publication Number Publication Date
WO2021103901A1 (en)

Family

ID=69908295

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124137 WO2021103901A1 (en) 2019-11-28 2020-10-27 Multi-party security calculation-based neural network model training and prediction methods and device

Country Status (2)

Country Link
CN (1) CN110942147B (en)
WO (1) WO2021103901A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110942147B (en) * 2019-11-28 2021-04-20 支付宝(杭州)信息技术有限公司 Neural network model training and predicting method and device based on multi-party safety calculation
CN111160573B (en) * 2020-04-01 2020-06-30 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN111461309B (en) * 2020-04-17 2022-05-17 支付宝(杭州)信息技术有限公司 Method and device for updating reinforcement learning system for realizing privacy protection
CN111241570B (en) * 2020-04-24 2020-07-17 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN111368983A (en) * 2020-05-15 2020-07-03 支付宝(杭州)信息技术有限公司 Business model training method and device and business model training system
WO2021240636A1 (en) * 2020-05-26 2021-12-02 日本電信電話株式会社 Distributed deep learning system
CN112132270B (en) * 2020-11-24 2021-03-23 支付宝(杭州)信息技术有限公司 Neural network model training method, device and system based on privacy protection
CN112507388B (en) * 2021-02-05 2021-05-25 支付宝(杭州)信息技术有限公司 Word2vec model training method, device and system based on privacy protection
CN112561085B (en) * 2021-02-20 2021-05-18 支付宝(杭州)信息技术有限公司 Multi-classification model training method and system based on multi-party safety calculation
CN113377625B (en) * 2021-07-22 2022-05-17 支付宝(杭州)信息技术有限公司 Method and device for data monitoring aiming at multi-party combined service prediction
CN117574381A (en) * 2021-08-05 2024-02-20 好心情健康产业集团有限公司 Physical examination user privacy protection method, device and system
CN113780527A (en) * 2021-09-01 2021-12-10 浙江数秦科技有限公司 Privacy calculation method
CN113760551A (en) * 2021-09-07 2021-12-07 百度在线网络技术(北京)有限公司 Model deployment method, data processing method, device, electronic equipment and medium
CN113792338A (en) * 2021-09-09 2021-12-14 浙江数秦科技有限公司 Safe multi-party computing method based on neural network model
CN117313869B (en) * 2023-10-30 2024-04-05 浙江大学 Large model privacy protection reasoning method based on model segmentation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109698822A (en) * 2018-11-28 2019-04-30 众安信息技术服务有限公司 Combination learning method and system based on publicly-owned block chain and encryption neural network
US20190228218A1 (en) * 2018-01-25 2019-07-25 X Development Llc Fish biomass, shape, and size determination
WO2019173075A1 (en) * 2018-03-06 2019-09-12 DinoplusAI Holdings Limited Mission-critical ai processor with multi-layer fault tolerance support
CN110942147A (en) * 2019-11-28 2020-03-31 支付宝(杭州)信息技术有限公司 Neural network model training and predicting method and device based on multi-party safety calculation

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108292374B (en) * 2015-11-09 2022-04-15 谷歌有限责任公司 Training neural networks represented as computational graphs
CN108122027B (en) * 2016-11-29 2021-01-12 华为技术有限公司 Training method, device and chip of neural network model
CN109308418B (en) * 2017-07-28 2021-09-24 创新先进技术有限公司 Model training method and device based on shared data
US10210860B1 (en) * 2018-07-27 2019-02-19 Deepgram, Inc. Augmented generalized deep learning with special vocabulary
CN109284626A (en) * 2018-09-07 2019-01-29 中南大学 Random forests algorithm towards difference secret protection
CN109784561A (en) * 2019-01-15 2019-05-21 北京科技大学 A kind of thickener underflow concentration prediction method based on integrated study

Also Published As

Publication number Publication date
CN110942147A (en) 2020-03-31
CN110942147B (en) 2021-04-20


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20891841

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20891841

Country of ref document: EP

Kind code of ref document: A1