WO2021103901A1

WO2021103901A1 - Multi-party security calculation-based neural network model training and prediction methods and device

Info

Publication number: WO2021103901A1
Application number: PCT/CN2020/124137
Authority: WO
Inventors: 陈超超; 郑龙飞; 王力; 周俊
Original assignee: 支付宝(杭州)信息技术有限公司
Priority date: 2019-11-28
Filing date: 2020-10-27
Publication date: 2021-06-03
Also published as: CN110942147A; CN110942147B

Abstract

A multi-party security calculation-based neural network model training method, and a model prediction method and device. In the method, a neural network model is divided into at least one client model and at least one server model, the server model being deployed in a server, and the client model being deployed in a client corresponding to a training participant. At each cycle, training sample data are provided to the neural network model to obtain a current prediction value and a current prediction difference. In each client model, a multi-party security calculation is performed layer by layer by means of each training participant using a respective client sub-model and received data. In each server model, a non-multi-party security calculation is carried out layer by layer by using the calculation result of the previous client model. When the cycle is not finished, according to the current prediction difference, a model parameter of each layer of the server model and client sub-model is adjusted by means of backpropagation. By using the method, model training efficiency can be improved while ensuring the security of private data.

Description

Neural network model training and prediction method and device based on multi-party safe computing

Technical field

The embodiments of this specification generally relate to the computer field, and more specifically, to a neural network model training method, model prediction method, and device based on multi-party secure computing.

Background technique

For a company or enterprise, data is a very important asset, such as user data and business data. User data may include user identity data and the like, for example. The business data may include, for example, business data that occurs on business applications provided by the company, such as commodity transaction data on Taobao. Protecting data security is a technical issue of widespread concern for companies or enterprises.

When a company or enterprise conducts business operations, it usually needs to use machine learning models to make model predictions to determine business operation risks or make business operation decisions. The neural network model is a machine learning model widely used in the field of machine learning. In many cases, neural network models require multiple model training participants to coordinate model training. Multiple model training participants (for example, e-commerce companies, express companies, and banks) each have the training data used to train the neural network model Part of the data in. The multiple model training participants hope to jointly use each other's data to uniformly train the neural network model, but they do not want to provide their private data to other model training participants to prevent their own private data from leaking.

In the face of this situation, a machine learning model training method that can protect the security of private data is proposed. It can coordinate with the multiple model training participants to train the nerves while ensuring the security of their respective private data of the multiple model training participants. The network model is used by the participants in the training of the multiple models.

Summary of the invention

In view of the foregoing problems, the embodiments of this specification provide a neural network model training method, model prediction method, and device based on multi-party secure computing, which can improve model training while ensuring the security of the respective private data of multiple training participants. effectiveness.

According to one aspect of the embodiments of the present specification, a neural network model training method based on multi-party secure computing is provided, wherein the neural network model uses a first number of training participants to coordinate training, and the neural network model includes multiple The hidden layer is divided into at least one client model and at least one server model according to the interval between the client model and the server model. Each client model is decomposed into the first number of client sub-models, each client sub-model Having the same sub-model structure, the at least one server-side model is deployed on the server-side, and each client-side sub-model is deployed on the client of the corresponding training participant, the method includes: executing the following loop process until the end of the loop is satisfied Condition: Provide training sample data to the current neural network model to obtain the current prediction value of the current neural network model through the cooperation of each current client model and each current server model, where, in each current client model, Through each training participant, use the respective current client sub-models and the training sample data or the calculation results of the previous current server model to perform multi-party security calculations layer by layer to obtain the calculation results of the current client model, and In each current server model, use the calculation results of the previous current client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the current server model; determine based on the current predicted value and sample label value Current prediction difference; and when the loop end condition is not met, according to the current prediction difference, through backpropagation to adjust the model parameters of each current server model and each current client sub-model layer by layer, so Each server model and each client sub-model after the adjustments serve as each current server model and each current client sub-model of the next cycle process.

Optionally, in an example of the foregoing aspect, the model calculation of the neural network model hierarchical structure in the server model has nothing to do with data privacy protection.

Optionally, in an example of the above aspect, the total number of hidden layers included in the client model may be based on the computing power used for model training, the training timeliness required by the application scenario, and/or the training security level. determine.

Optionally, in an example of the above aspect, the neural network model includes N hidden layers, the neural network model is divided into a first client model and a single server model, and the first client model includes The input layer and the first hidden layer to the Kth hidden layer, and the server model includes the output layer and the K+1th hidden layer to the Nth hidden layer.

Optionally, in an example of the above aspect, the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model. The first client model includes an input layer and the first hidden layer to the Kth hidden layer, the server model includes the K+1th hidden layer to the Lth hidden layer, and the second client model includes an output layer and the first hidden layer. L+1 hidden layer to Nth hidden layer.

Optionally, in an example of the above aspect, the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model. The first client model includes an input layer and a first hidden layer to a Kth hidden layer, the server model includes a K+1th hidden layer to an Nth hidden layer, and the second client model includes an output layer.

Optionally, in an example of the foregoing aspect, the process of determining the current prediction difference may be performed on the server or on the client of the training participant that has the sample label value.

Optionally, in an example of the foregoing aspect, the loop ending condition may include: the number of loops reaches a predetermined number; or the current prediction difference is within a predetermined difference range.

Optionally, in an example of the above aspect, the multi-party secure calculation may include one of secret sharing, obfuscation circuits, and homomorphic encryption.

Optionally, in an example of the above aspect, the model calculation at the server can be implemented using TensorFlow or Pytorch technology.

Optionally, in an example of the above aspect, the training sample data may include training sample data based on image data, voice data, or text data, or the training sample data may include user characteristic data.

According to another aspect of the embodiments of the present specification, a model prediction method based on a neural network model is provided. The neural network model includes a plurality of hidden layers and is divided into at least one according to the interval between the client model and the server model. A client model and at least one server model, each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, and the at least one server model is deployed on the server, Each client sub-model is deployed on the client of the corresponding model owner among the first number of model owners, and the model prediction method includes: receiving data to be predicted; and providing the data to be predicted to the neural network model to The predicted value of the neural network model is obtained through the coordinated calculation of each client model and each server model, where each client model, through each model owner, uses the respective client sub-model and the data to be predicted or Perform multi-party security calculations layer by layer based on the calculation results of the front server model to obtain the calculation results of the client model, and in each server model, use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer , In order to get the calculation result of the server model.

Optionally, in an example of the above aspect, the data to be predicted may include image data, voice data, or text data. Alternatively, the data to be predicted may include user characteristic data.

According to another aspect of the embodiments of the present specification, there is provided a neural network model training device based on multi-party secure computing, wherein the neural network model uses a first number of training participants for collaborative training, wherein the neural network model It includes multiple hidden layers and is divided into at least one client model and at least one server model according to the interval between the client model and the server model. Each client model is decomposed into a first number of client sub-models, each The client sub-models have the same sub-model structure, the at least one server-side model is deployed on the server, and each client-side sub-model is deployed on the client of a corresponding training participant, and the neural network model training device includes: a model prediction unit , The training sample data is provided to the current neural network model to obtain the current prediction value of the current neural network model through the cooperation of each current client model and each current server model, wherein, in each current client model, through Each training participant uses their current client sub-model and the training sample data or the calculation result of the previous current server model to perform multi-party security calculation layer by layer to obtain the calculation result of the current client model, and Each current server model uses the calculation result of the previous current client model to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model; the prediction difference determination unit is based on the current prediction value and The sample label value determines the current prediction difference; and the model adjustment unit, when the loop end condition is not met, adjusts each current server model and each current client sub-model layer by layer through backpropagation according to the current prediction difference. Each layer of model parameters, the adjusted server models and the client sub-models serve as the current server models and the current client sub-models of the next cycle process, wherein the model prediction unit and the prediction difference The value determination unit and the model adjustment unit cyclically perform operations until the loop end condition is satisfied.

Optionally, in an example of the above aspect, the prediction difference determining unit may be provided at the server or the client.

According to another aspect of the embodiments of the present specification, there is provided a model prediction device based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is divided into a client model and a server model in a way that is separated from each other. At least one client model and at least one server model, the first client model in the client model includes at least an input layer, each client model is decomposed into a first number of client sub-models, each client sub-model Having the same sub-model structure, the at least one server-side model is deployed at the server, and each client-side sub-model is deployed at the client of the corresponding model owner among the first number of model owners, the model predicting device Including: a data receiving unit, which receives the data to be predicted; a model prediction unit, which provides the data to be predicted to the neural network model, so as to obtain the predicted value of the neural network model through the coordinated calculation of each client model and each server model , Wherein, in each client model, through each model owner, use the respective client sub-models and the calculation results of the data to be predicted or the previous server model to perform multi-party security calculations layer by layer to obtain the client model The calculation results of and in each server model, use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model.

According to another aspect of the embodiments of the present specification, there is provided an electronic device, including: one or more processors, and a memory coupled with the one or more processors, the memory storing instructions, when the instructions When executed by the one or more processors, the one or more processors are caused to execute the neural network model training method described above.

According to another aspect of the embodiments of the present specification, a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause the machine to execute the neural network model training method described above.

According to another aspect of the embodiments of the present specification, there is provided an electronic device, including: one or more processors, and a memory coupled with the one or more processors, the memory storing instructions, when the instructions When executed by the one or more processors, the one or more processors are caused to execute the model prediction method described above.

According to another aspect of the embodiments of the present specification, a machine-readable storage medium is provided, which stores executable instructions that, when executed, cause the machine to execute the model prediction method as described above.

Description of the drawings

By referring to the following drawings, a further understanding of the nature and advantages of the contents of the embodiments of this specification can be achieved. In the drawings, similar components or features may have the same reference signs.

Figure 1 shows a schematic diagram of an example of a neural network model;

Figure 2 shows a schematic diagram of an example of a neural network model training method based on multi-party secure computing;

Fig. 3 shows a schematic diagram of a segmentation example of a neural network model according to an embodiment of the present specification;

4A-4D show exemplary schematic diagrams of a client sub-model and a server-side model after segmentation according to an embodiment of the present specification;

FIG. 5 shows a flowchart of an example of a neural network model training method based on multi-party secure computing according to an embodiment of the present specification;

6A shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present disclosure;

FIG. 6B shows a schematic diagram of an example of vertically segmented training sample data according to an embodiment of the present disclosure;

FIG. 7A shows a schematic diagram of another segmentation example of a neural network model according to an embodiment of the present specification;

Fig. 7B shows a schematic diagram of another segmentation example of the neural network model according to the embodiment of the present specification;

Fig. 8 shows a flowchart of a model prediction method based on a neural network model according to an embodiment of the present specification;

Fig. 9 shows a block diagram of a model training device according to an embodiment of the present specification;

Figure 10 shows a block diagram of a model prediction device according to an embodiment of the present specification;

FIG. 11 shows a block diagram of an electronic device for implementing neural network model training based on multi-party secure computing according to an embodiment of the present specification;

Fig. 12 shows a block diagram of an electronic device for implementing model prediction based on a neural network model according to an embodiment of the present specification.

Detailed ways

The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that the discussion of these embodiments is only to enable those skilled in the art to better understand and realize the subject described herein, and is not to limit the scope of protection, applicability, or examples set forth in the claims. The function and arrangement of the discussed elements can be changed without departing from the scope of protection of the content of the embodiments of this specification. Various examples can omit, substitute, or add various procedures or components as needed. For example, the described method may be executed in a different order from the described order, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples can also be combined in other examples.

As used herein, the term "including" and its variations mean open terms, meaning "including but not limited to". The term "based on" means "based at least in part on." The terms "one embodiment" and "an embodiment" mean "at least one embodiment." The term "another embodiment" means "at least one other embodiment." The terms "first", "second", etc. may refer to different or the same objects. Other definitions can be included below, either explicit or implicit. Unless clearly indicated in the context, the definition of a term is consistent throughout the specification.

FIG. 1 shows a schematic diagram of an example of a neural network model 100.

As shown in FIG. 1, the neural network model 100 includes an input layer 110, a first hidden layer 120, a second hidden layer 130, a third hidden layer 140 and an output layer 150.

The input layer 110 includes three input nodes N1, N2, and N3 and a bias term b1. The three input nodes N1, N2, and N3 respectively receive data from three different data owners. In this specification, the term "data owner" and the terms "model owner" and "training participant" can be used interchangeably. The first hidden layer 120 includes two hidden layer nodes N4 and N5 and a bias term b2. The hidden layer nodes N4 and N5 are respectively fully connected with the three input nodes N1, N2, and N3 of the input layer 110 and the bias term b1. The weights between the input node N1 and the hidden layer nodes N4 and N5 are W1,4 and W1,5, respectively. The weights between the input node N2 and the hidden layer nodes N4 and N5 are W2,4 and W2,5, respectively. The weights between the input node N3 and the hidden layer nodes N4 and N5 are W3,4 and W3,5, respectively.

The second hidden layer 130 includes two hidden layer nodes N6 and N7 and a bias term b3. The hidden layer nodes N6 and N7 are respectively fully connected with the two hidden layer nodes N4 and N5 of the first hidden layer 120 and the bias term b2. The weights between the hidden layer node N4 and the hidden layer nodes N6 and N7 are W4,6 and W4,7, respectively. The weights between the hidden layer node N5 and the hidden layer nodes N6 and N7 are W5,6 and W5,7, respectively.

The third hidden layer 140 includes two hidden layer nodes N8 and N9 and a bias term b4. The hidden layer nodes N8 and N9 are respectively fully connected with the two hidden layer nodes N6 and N7 of the second hidden layer 130 and the bias term b3. The weights between the hidden layer node N6 and the hidden layer nodes N8 and N9 are W6,8 and W6,9, respectively. The weights between the hidden layer node N7 and the hidden layer nodes N8 and N9 are W7,8 and W7,9, respectively.

The output layer 150 includes an output node N10. The output node N10 is fully connected with the two hidden layer nodes N8 and N9 of the third hidden layer 140 and the bias term b4. The weight between the hidden layer node N8 and the output node N10 is W8,10. The weight between the hidden layer node N9 and the output node N10 is W9,10.

In the neural network model shown in Figure 1, the weights W1,4, W1,5, W2,4, W2,5, W3,4, W3,5, W4,6, W4,7, W5,6, W5 ,7, W6,8, W6,9, W7,8, W7,9, W8,10 and W9,10 are the model parameters of each layer of the neural network model. In the feedforward calculation, the input nodes N1, N2, and N3 of the input layer 110 are calculated to obtain the inputs Z1 and Z2 of the hidden layer nodes N4 and N5 of the first hidden layer 120, where Z1=W1,4*X1 +W2,4*X2+W3,4*X3+b1, and Z2=W1,5*X1+W2,5*X2+W3,5*X3+b1. Then, the activation function processing is performed on Z1 and Z2 respectively, and the outputs a1 and a2 of the hidden layer nodes N4 and N5 are obtained. Perform feedforward calculation layer by layer in the above manner, as shown in Figure 1, and finally get the output a7 of the neural network model.

FIG. 2 shows a schematic diagram of an example of a neural network model training method 200 based on multi-party secure computing. In the neural network model training method 200 shown in FIG. 2, three training participants Alice, Bob, and Charlie (that is, the input nodes N1, N2, and N3 in FIG. 1) are taken as an example for illustration. A training participant Alice is the training initiator, that is, the training sample data at Alice is used for training. In the method shown in Figure 2, each training participant Alice, Bob, and Charlie has the model structure of all the layered models of the neural network model, but the model parameters of each layer of each training participant are only the neural network model Part of the model parameters of the corresponding layer, and the sum of the model parameters of each layer of each training participant is equal to the model parameters of the corresponding layer of the neural network model.

As shown in Figure 2, first, at block 210, the first training participant Alice, the second training participant Bob and Charlie initialize the sub-model parameters of their neural network sub-models to obtain the initial values of their sub-model parameters, and The number of executed training cycles t is initialized to zero. Here, it is assumed that the loop ending condition is to perform a predetermined number of training loops, for example, T training loops are executed.

After the above initialization, the operations from block 220 to block 260 are executed in a loop until the loop end condition is satisfied.

Specifically, in block 220, multi-party security calculations are performed based on the current sub-models of each training participant to obtain the current prediction value of the neural network model to be trained for the training sample data

After the current predicted value is obtained, at block 230, at the first training participant Alice, the current predicted value is determined

The predicted difference with the corresponding labeled value Y

Here, e is a column vector, Y is a column vector representing the label value of the training sample X, and,

Is a column vector representing the current predicted value of the training sample X. If the training sample X contains only a single training sample, then e, Y and

Both are column vectors with a single element. If the training sample X contains multiple training samples, then e, Y and

Are all column vectors with multiple elements, where,

Each element in is the current predicted value of the corresponding training sample in the multiple training samples, each element in Y is the label value of the corresponding training sample in the multiple training samples, and each element in e is The difference between the label value of the corresponding training sample of the multiple training samples and the current predicted value.

Then, at block 240, the prediction differences are sent to the second training participants Bob and Charlie, respectively.

In block 250, at each training participant, based on the determined prediction difference, the model parameters of each layer of the neural network model at each training participant are adjusted layer by layer through back propagation.

Next, in block 260, it is determined whether the predetermined number of cycles has been reached. If the predetermined number of cycles has not been reached, return to block 220 to execute the next training cycle process, where the updated current sub-models obtained by each training participant in the current cycle process are used as the next training cycle process. The current submodel.

If the predetermined number of cycles is reached, each training participant stores the current updated value of the respective sub-model parameter as the final value of the sub-model parameter to obtain the respective trained sub-model, and then the process ends.

It should be explained here that, optionally, the end condition of the training loop process can also be that the determined prediction difference is within a predetermined range, for example, the _{sum of each element e i} in the prediction difference e is less than a predetermined threshold, or _{The average value of each element e i} in the prediction difference e is less than a predetermined threshold. In this case, the operation of block 260 is performed after block 230. If the loop end condition is met, the process ends. Otherwise, perform the operations of blocks 240 and 250, and then return to block 220 to execute the next cycle.

In the neural network model training method 200 shown in FIG. 2, all the layered model structures of the neural network model are implemented at each training participant, and each layer calculation is implemented in a multi-party safe calculation manner. In this case, since the calculation of each layer of the neural network model is performed by MPC, the calculation method of MPC is complicated and the calculation efficiency is not high, which leads to the low efficiency of this neural network model training method.

In view of the above, the embodiment of this specification proposes a neural network model training method. In the neural network model training method, the neural network model is divided into multiple model parts according to the interval between the client model and the server model. Some models Part of it is deployed on the server (hereinafter referred to as the "server model part"), and some other models are partially deployed on the client (hereinafter referred to as the "client model part"), where the server model part may include at least one The server model and the client model part may include at least one client model. Moreover, each client-side model corresponds to one or more hierarchical structures of the neural network model. In addition, each client model is decomposed into multiple client sub-models. For each client model, a client sub-model is deployed at each training participant (client), the model structure included in each client sub-model is the same, and the model parameters of each layer of each client sub-model It is obtained by segmenting the model parameters of the corresponding hierarchical structure in the neural network model, that is, the sum of the model parameters of the same layer nodes of all client sub-models is equal to the model parameters of the corresponding hierarchical structure in the neural network model . When performing model training, each client model uses the MPC method to perform model calculations through the collaboration of each training participant, and each server model uses a non-MPC method to perform model calculations. For example, TensorFlow or Pytorch technology can be used To perform model calculations. In this way, since only part of the neural network model uses the MPC method for model calculation, the remaining part can use other faster non-MPC methods for model calculation, thereby improving the efficiency of model training. In addition, when performing neural network model segmentation, the neural network model involved in model calculations that have nothing to do with data privacy protection can be segmented into server-side models, thereby protecting data privacy security.

In the embodiments of this specification, the training sample data used by the neural network model may include training sample data based on image data, voice data, or text data. Correspondingly, the neural network model can be applied to business risk identification, business classification or business decision-making based on image data, voice data or text data. Alternatively, the training sample data used by the neural network model may include user characteristic data. Correspondingly, the neural network model can be applied to business risk identification, business classification, business recommendation or business decision based on user characteristic data.

In the embodiments of this specification, the data to be predicted used by the neural network model may include image data, voice data, or text data. Alternatively, the data to be predicted used by the neural network model may include user characteristic data.

In an example of this specification, the neural network model may include N hidden layers. The neural network model may be divided into a first client model and a single server model. The first client model includes the input layer and the first hidden layer to The Kth hidden layer and the server model include the output layer and the K+1th hidden layer to the Nth hidden layer.

Fig. 3 shows a schematic diagram of a segmentation example of the neural network model 100 according to an embodiment of the present specification.

As shown in FIG. 3, the neural network model 100 is divided from the second hidden layer 140 into a client model and a server model. The client model includes an input layer 110, a first hidden layer 120, and a second hidden layer 130. The server model includes a third hidden layer 140 and an output layer 150.

Since 3 training participants are shown in FIG. 3, the client model is decomposed into 3 client sub-models, and each training participant is deployed with a client sub-model, and the server model is deployed on the server. Each client sub-model includes the same model structure, and the sum of the model parameters of the same layer nodes of all client sub-models is equal to the model parameters of the corresponding hierarchical nodes in the neural network model. Specifically, the model parameters between the input layer and the first hidden layer 120 and the model parameters between the first hidden layer 120 and the second hidden layer 130 are divided into three parts, each of which is owned by each client sub-model.

4A-4D show exemplary schematic diagrams of a client sub-model and a server-side model after segmentation according to an embodiment of the present specification. Among them, the relationship between the sub-model parameters shown in FIGS. 4A-4C and the model parameters of the neural network model in FIG. 3 is as follows.

w _1,4 = w _1,4 ⁽¹⁾ + w _1,4 ⁽²⁾ + w _1,4 ⁽³⁾ , w _1,5 = w _1,5 ⁽¹⁾ + w _1,5 ⁽²⁾ + w _1,5 ⁽³⁾ ,

w _2,4 = w _2,4 ⁽¹⁾ + w _2,4 ⁽²⁾ + w _2,4 ⁽³⁾ , w _2,5 = w _2,5 ⁽¹⁾ + w _2,5 ⁽²⁾ + w _2,5 ⁽³⁾ ,

w _3,4 =w _3,4 ⁽¹⁾ +w _3,4 ⁽²⁾ +w _3,4 ⁽³⁾ ，w _3,5 =w _3,5 ⁽¹⁾ +w _3,5 ⁽²⁾ + w _3,5 ⁽³⁾ ,

w _4,6 =w _4,6 ⁽¹⁾ +w _4,6 ⁽²⁾ +w _4,6 ⁽³⁾ ，w _4,7 =w _4,7 ⁽¹⁾ +w _4,7 ⁽²⁾ + w _4,7 ⁽³⁾ ,

w _5,6 =w _5,6 ⁽¹⁾ +w _5,6 ⁽²⁾ +w _5,6 ⁽³⁾ ，w _5,7 =w _5,7 ⁽¹⁾ +w _5,7 ⁽²⁾ + w _5,7 ⁽³⁾ .

In addition, as shown in FIG. 4D, the model parameters of each layer of the server model are exactly the same as the model parameters of the corresponding layer of the neural network model. It should be explained here that the neural network model segmentation in Figures 4A-4D corresponds to the data horizontal segmentation situation. In the case of vertical data segmentation, for the input layer, each data owner has only one node. In this case, one node of each data owner can be transformed into three nodes by vertical-horizontal switching, so as to perform segmentation according to the neural network model segmentation method shown in FIGS. 4A-4D.

FIG. 5 shows a flowchart of an example of a neural network model training method 500 based on multi-party secure computing according to an embodiment of the present specification. In the neural network model training method 500 shown in FIG. 5, it is assumed that there are M (ie, the first number) training participants. In addition, the neural network model segmentation method shown in FIG. 5 is the segmentation method in FIG. 3. Here, the M training participants may be M data owners who have data required for neural network model training, that is, each data owner has part of the data required for neural network model training. In the embodiment of this specification, part of the data owned by the M data owners may be training data that has been split horizontally, or may be training data that has been split vertically.

FIG. 6A shows a schematic diagram of an example of horizontally segmented training sample data according to an embodiment of the present specification. Figure 6A shows two data parties Alice and Bob, and multiple data parties are similar. Each training sample in the training sample subset owned by each data party Alice and Bob is complete, that is, each training sample includes complete feature data (x) and labeled data (y). For example, Alice has a complete training sample (x0, y0).

FIG. 6B shows a schematic diagram of an example of vertically segmented training sample data according to an embodiment of the present specification. Figure 6B shows two data parties Alice and Bob, and multiple data parties are similar. Each data party Alice and Bob has a partial training sub-sample of each training sample in all the training samples in the training sample set. For each training sample, the data party Alice and Bob have part of the training sub-samples combined together to form The complete content of the training sample. For example, suppose that the content of a training sample includes label y ₀ and attribute features

After vertical segmentation, the training participant Alice has the y ₀ and

And the training participant Bob has the training sample

Returning to FIG. 5, first, at block 510, the client sub-models at the clients of the M training participants and the server models at the server are initialized.

Then, the operations from block 520 to block 560 are executed in a loop until the loop end condition is satisfied.

Specifically, at block 520, the training sample data is provided to each current client sub-model at the clients of the M training participants, and multi-party security calculations are performed layer by layer to obtain the calculation results of each current client sub-model. The specific implementation process of multi-party secure computing can refer to any suitable multi-party secure computing implementation solution in this field. In this specification, multi-party secure computing may include one of Secret Sharing (SS), Garbled Circuit (GC), and Homomorphic Encryption (HE).

In block 530, the calculation result of each current client sub-model is provided to the current server model of the server to calculate layer by layer to obtain the current prediction value of the neural network model. The model calculation performed at the server can be implemented in a non-MPC manner, for example, it can be implemented using TensorFlow or Pytorch technology. It should be explained here that the calculation results of each current client sub-model can be combined and provided to the current service terminal model of the server. That is, a ₃ =a ₃ ⁽¹⁾ +a ₃ ⁽²⁾ +a ₃ ⁽³⁾ and a ₄ =a ₄ ⁽¹⁾ +a ₄ ⁽²⁾ +a ₄ ⁽³⁾ .

After the current prediction value of the neural network model is obtained as described above, in block 540, the current prediction difference is determined based on the current prediction value and the sample label value.

It should be explained here that, in an example, the process of determining the current prediction difference can be performed on the server side. In this case, the sample label value owned by the training participant needs to be transmitted to the server.

Optionally, in another example, the process of determining the current prediction difference may be performed on the client terminal of the training participant who has the sample label value. In this case, the current prediction value determined by the server is fed back to the training participant with the sample mark value, and then the current prediction difference is determined at the training participant. In this way, there is no need to send the sample label value to the server, so that the privacy of the sample label value at the training participant can be further protected.

Next, in block 550, it is determined whether the current prediction difference is within a predetermined difference range, for example, it is determined whether the current prediction difference is less than a predetermined threshold. If the current prediction difference is not within the predetermined difference range, for example, the current prediction difference is not less than the predetermined threshold, then in block 560, according to the current prediction difference, the server model and each client sub-model are adjusted layer by layer through backpropagation. Then, return to block 520 to execute the next cycle process, where the adjusted server model and each client sub-model serve as the current server model and each current client sub-model of the next cycle process .

If the current prediction difference is within the predetermined difference range, for example, the current prediction difference is less than the predetermined threshold, the training process ends.

In addition, optionally, the ending condition of the training cycle process may also be to reach a predetermined number of cycles. In this case, the operation of block 550 may be performed after the operation of block 560, that is, after the current prediction difference is determined in block 540, the operation of block 560 is performed, and then it is determined whether the predetermined number of cycles is reached. If the predetermined number of cycles is reached, the training process ends. If the predetermined number of cycles has not been reached, return to block 520 to execute the next cycle process.

The neural network model training method according to the embodiment of the present specification is described above with reference to FIGS. 3 to 6A-6B. In the example shown in Figure 3, the client model includes 2 hidden layers. In other embodiments of this specification, the client model may include more or fewer hidden layers, for example, may include one hidden layer or more than two hidden layers. In the embodiments of this specification, the total number of hidden layers included in the client model can be determined according to the computing power used for model training, the training timeliness required by the application scenario, and/or the training security level.

It should be explained here that the neural network model training method described in FIG. 5 is a neural network model training method for the neural network model segmentation scheme shown in FIG. 3. In other embodiments of this specification, the neural network model can be segmented according to other segmentation schemes, as shown in FIGS. 7A and 7B.

Fig. 7A shows another example schematic diagram of a neural network segmentation scheme. As shown in FIG. 7A, the neural network model is divided into a first client model, a server model, and a second client model. The first client model includes an input layer 110 and a first hidden layer 120. The server model includes a second hidden layer 130. The second client model includes a third hidden layer 140 and an output layer 150. In the neural network segmentation scheme shown in FIG. 7A, each of the first client model and the second client model can be segmented into 3 client sub-models in a similar manner as in FIGS. 4A-4C. The server-side model is the same as the corresponding layered model of the neural network model. It should be noted here that the client sub-models of the first client model and the second client model are set at the client of each training participant.

For the neural network segmentation scheme shown in FIG. 7A, when the neural network model is trained, in each cycle, first, the training sample data is provided to each current first client sub-model in the first client model, Multi-party security calculations are performed layer by layer to obtain the calculation results of each current first client sub-model. Then, the calculation results of each current first client sub-model are provided to the server model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model. Subsequently, the calculation result of the server model is provided to each current second client sub-model in the second client model, and multi-party security calculations are performed layer by layer to obtain the current prediction result of the neural network model.

Fig. 7B shows another example schematic diagram of a neural network segmentation scheme. As shown in FIG. 7B, the neural network model is divided into at least one client model and at least one server model, for example, as shown in FIG. 7B, the first client model, the first server model, the second client model, The second service model and the third client model. The first client model includes an input layer 110 and a first hidden layer 120. The first server model includes a second hidden layer 130. The second client model includes a third hidden layer 140. The second server model includes a fourth hidden layer 150. The third client model includes a fifth hidden layer 160 and an output layer 170. In the neural network segmentation scheme shown in FIG. 7B, each of the first client model, the second client model, and the third client model can be divided into 3 clients in a similar manner as in FIGS. 4A-4C Terminal model. The first and second server models are the same as the corresponding hierarchical models of the neural network model. It should be explained here that the corresponding client sub-models of the first client model, the second client model, and the third client model are set at the clients of each training participant.

For the neural network segmentation scheme shown in Figure 7B, when the neural network model is trained, in each cycle, in each current client model, through each training participant, use the respective current client sub-models and training samples Data or the calculation result of the previous current server model is used to perform multi-party security calculation layer by layer to obtain the calculation result of the current client model. In each current server model, the calculation result of the previous client model is used to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model. At least one client model and at least one server model cooperate with calculations to obtain the current prediction value of the neural network model.

Specifically, the training sample data is provided to each current first client sub-model in the current first client model, and multi-party security calculations are performed layer by layer to obtain the calculation results of each current first client sub-model. Then, the calculation result of each current first client sub-model is provided to the current first server model to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current first server model. Subsequently, the calculation result of the first server model is provided to each current second client sub-model in the current second client model, and multi-party security calculations are performed layer by layer to obtain the calculation result of the current second client model. Then, the calculation results of each current second client sub-model are provided to the current second server model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the current second server model. Subsequently, the calculation result of the current second server model is provided to each current third client sub-model in the current third client model, and multi-party security calculations are performed layer by layer to obtain the current prediction result of the neural network model.

In addition, it should be noted that in an example of this specification, the first client model may include part of the hidden layer. In another example, each server model may include at least part of the hidden layer.

Here, it should be noted that in the embodiments of this specification, the neural network model segmentation can be performed based on whether the model calculation of each layered model of the neural network model is related to data privacy protection. Among them, the division related to data privacy protection The layer model is divided into the client model, and the layer model that has nothing to do with data privacy protection is divided into the server model. In addition, it should be noted that the client model can also include a layered model that has nothing to do with data privacy protection.

In the present specification, the data relating to privacy model can be calculated directly using the respective input or output of the model calculation X _i Y, e.g., with a model corresponding to the input layer and the output layer is calculated or calculated corresponding to the model. Model calculation and data privacy may be irrelevant to use without calculating the respective input X _i and the model output Y, e.g., neural network model in an intermediate hidden layer.

Using the embodiments of this specification, a neural network model can be provided. The neural network model is divided into at least one client model and at least one server model according to the interval between the client model and the server model. Each client model is divided into at least one client model and at least one server model. It is decomposed into the first number of client sub-models, each client sub-model has the same sub-model structure, each client sub-model is deployed on the client of a training participant, and the server model is deployed on the server. During model training, the training sample data is provided to the neural network model, so that the current prediction value of the neural network model can be obtained through cooperative calculation through the client model at the client of each training participant and each server model of the server. Among them, the model calculation in the client model is implemented in MPC mode, and the model calculation in the server model is implemented in non-MPC mode, which can reduce the number of model layers that perform multi-party security calculations, thereby increasing the speed of model training and improving the model Training efficiency.

In addition, according to the neural network model training method of the embodiment of the present specification, only the layered structure of the neural network model that is not related to data privacy protection is segmented into the server model, so that the data privacy security of each data owner can be ensured.

In addition, according to the neural network model training scheme of the embodiment of this specification, the total number of hidden layers included in the client model can be based on the computing power used for model training, the training timeliness required by the application scenario, and/or training safety. The level is determined and adjusted, so that the environmental conditions of model training, data security requirements and model training efficiency can be compromised when performing neural network model segmentation.

In addition, according to the neural network model training scheme of the embodiment of the present specification, the current prediction difference determination process can be performed on the client terminal of the training participant who has the sample label value. In this way, there is no need to send the sample label value to the server, so that the privacy of the sample label value at the training participant can be further protected.

FIG. 8 shows a flowchart of a model prediction method 800 based on a neural network model according to an embodiment of the present specification. In the embodiment shown in FIG. 8, the neural network model includes N hidden layers and is divided into a single server model and a first number of client sub-models. Each client sub-model includes an input layer and a first hidden layer to The Kth hidden layer, the server model includes the output layer and the K+1 hidden layer to the Nth hidden layer. The server model is deployed at the server, and each client sub-model is deployed at the client of a model owner. The client sub-models have the same sub-model structure, and the first number of client sub-models together constitute the corresponding model structure of the neural network model.

As shown in FIG. 8, at block 810, data to be predicted is received. The data to be predicted may be received from any model owner.

Next, in block 820, the received data to be predicted is provided to each current client sub-model at the client of the first number of model owners to perform multi-party security calculation layer by layer to obtain the calculation result of each current client sub-model .

Then, in block 830, the calculation results of each current client sub-model are provided to the server model of the server to perform non-multi-party secure calculation layer by layer to obtain the model prediction result of the neural network model.

Similarly, when the neural network model performs model segmentation according to other segmentation methods, the model prediction process shown in FIG. 8 can be adaptively modified according to the corresponding model calculation scheme.

FIG. 9 shows a block diagram of a model training device 900 according to an embodiment of the present specification. As shown in FIG. 9, the model training device 900 includes a model prediction unit 910, a prediction difference determination unit 920, and a model adjustment unit 930.

The model prediction unit 910, the prediction difference determination unit 920, and the model adjustment unit 930 perform operations in a loop until the loop end condition is satisfied. The loop ending condition may include: the number of loops reaches a predetermined number of times; or the current prediction difference is within a predetermined difference range.

Specifically, the model prediction unit 910 is configured to provide training sample data to the current neural network model, so as to obtain the current prediction value of the current neural network model through the cooperative calculation of each current client model and each current server model. Each current client model, through the first number of training participants, uses the respective current client sub-models and training sample data or the calculation results of the previous current server model to perform multi-party security calculations layer by layer to obtain the current client The calculation result of the client model, and in each current server model, use the calculation result of the previous current client model to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model.

In an example of this specification, the model prediction unit 910 may include a multi-party security calculation module and a server-side calculation module. The multi-party security computing module is configured for each current client model, through the first number of training participants, using the training sample data or the calculation results of the previous current server model and the respective current client sub-model to perform multi-party layer by layer Secure calculation to obtain the calculation result of the current client model. The multi-party security computing module is set at the client. The server-side calculation module is configured to perform non-multi-party security calculations layer by layer for each server-side model using the calculation result of the previous current client-side model to obtain the calculation result of the server-side model. The server-side computing module is set at the server-side. The multi-party security calculation module and the server calculation module cooperate to calculate to obtain the predicted value of the neural network model.

The prediction difference determination unit 920 is configured to determine the current prediction difference based on the current prediction value and the sample label value. Optionally, the prediction difference determining unit 920 may be provided at the server or at the client.

The model adjustment unit 930 is configured to adjust each layer model parameter of each current server model and each current client sub-model through backpropagation according to the current prediction difference when the loop end condition is not satisfied. The server model and each client sub-model serve as each current server model and each current client sub-model of the next cycle process. Part of the components in the model adjustment unit 930 are set on the client side, and other components are set on the server side.

Fig. 10 shows a block diagram of a model prediction apparatus 1000 according to an embodiment of the present specification. As shown in FIG. 10, the model prediction device 1000 includes a data receiving unit 1010 and a model prediction unit 1020.

The data receiving unit 1010 is configured to receive data to be predicted. The data to be predicted may be received from any model owner. The data receiving unit 1010 is provided in the client of each model owner.

The model prediction unit 1020 is configured to provide the data to be predicted to the neural network model to obtain the prediction value of the neural network model through the cooperation of each client model and each server model. A number of model owners use their respective client sub-models and the data to be predicted or the calculation results of the previous server model to perform multi-party security calculations layer by layer to obtain the calculation results of the client model, and each server model, Use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model.

In an example of this specification, the model prediction unit 1020 may include a multi-party security calculation module and a server-side calculation module. The multi-party security calculation module is configured to perform multi-party security calculation layer by layer for each client model, via the first number of model owners, using the to-be-predicted data or the calculation results of the previous server model and the respective client sub-models, In order to obtain the calculation result of the client model. The server-side calculation module is configured to use the calculation results of the previous client-side model to perform non-multi-party security calculations layer by layer for each server-side model, so as to obtain the calculation result of the server-side model. The multi-party security calculation module and the server calculation module cooperate to calculate to obtain the predicted value of the neural network model. The multi-party secure computing module is installed on the client of each model owner, and the server computing module is installed on the server.

As above, referring to FIGS. 1 to 10, the neural network model training method, model training device, model prediction method, and model prediction device according to the embodiments of this specification are described. The above model training device and model prediction device can be implemented by hardware, or by software or a combination of hardware and software.

FIG. 11 shows a structural block diagram of an electronic device 1100 for implementing neural network model training based on multi-party security computing according to an embodiment of the present specification.

As shown in FIG. 11, the electronic device 1100 may include at least one processor 1110, a memory (for example, a non-volatile memory) 1120, a memory 1130, a communication interface 1140, and a bus 1160, and at least one processor 1110, a memory 1120, a memory 1130 and the communication interface 1140 are connected together via a bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (that is, the above-mentioned element implemented in the form of software) stored or encoded in a computer-readable storage medium.

In one embodiment, computer-executable instructions are stored in the memory, which, when executed, cause at least one processor 1110 to: execute the following loop process until the loop end condition is met: provide training sample data to the current neural network model, The current prediction value of the current neural network model is obtained through the cooperation calculation of each client model and each server model. Among them, in each current client model, through the first number of training participants, use the respective current client sub-models and The training sample data or the calculation results of the previous current server model are used to perform multi-party security calculations layer by layer to obtain the calculation results of the current client model, and in each current server model, use the previous current client model The calculation result is used to perform non-multi-party security calculation layer by layer to obtain the calculation result of the current server model; determine the current prediction difference based on the current prediction value and the sample tag value; and when the loop end condition is not met, according to the current prediction difference Value, through backpropagation to adjust each layer model parameter of each current server model and each current client sub-model layer by layer, and the adjusted each server model and each client sub-model will serve as each current service in the next cycle process Client model and each current client sub-model.

It should be understood that the computer-executable instructions stored in the memory, when executed, cause at least one processor 1110 to perform various operations and functions described above in conjunction with FIGS. 1-10 in the various embodiments of this specification.

FIG. 12 shows a structural block diagram of an electronic device 1200 for implementing model prediction based on a neural network model according to an embodiment of the present specification.

As shown in FIG. 12, the electronic device 1200 may include at least one processor 1210, memory (for example, non-volatile memory) 1220, memory 1230, communication interface 1240, and bus 1260, and at least one processor 1210, memory 1220, memory The 1230 and the communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (that is, the above-mentioned element implemented in the form of software) stored or encoded in a computer-readable storage medium.

In one embodiment, computer-executable instructions are stored in the memory, which, when executed, cause at least one processor 1210 to: receive the data to be predicted; Each server model cooperates with calculations to obtain the predicted value of the neural network model. Among them, in each client model, through the first number of model owners, use the respective client sub-models and the data to be predicted or the calculation of the previous server model The results are used to perform multi-party security calculations layer by layer to obtain the calculation results of the client model, and in each server model, use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer to obtain the server model The result of the calculation.

It should be understood that the computer-executable instructions stored in the memory, when executed, cause at least one processor 1210 to perform various operations and functions described above in conjunction with FIGS. 1-10 in the various embodiments of this specification.

In the embodiments of this specification, the electronic device 1100/1200 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular Telephones, personal digital assistants (PDAs), handheld devices, wearable computing devices, consumer electronic devices, etc.

According to one embodiment, a program product such as a non-transitory machine-readable medium is provided. The non-transitory machine-readable medium may have instructions (ie, the above-mentioned elements implemented in the form of software), which when executed by a machine, cause the machine to execute the various embodiments described above in conjunction with FIGS. 1-10 in the various embodiments of this specification. Operation and function.

Specifically, a system or device equipped with a readable storage medium may be provided, and the software program code for realizing the function of any one of the above-mentioned embodiments is stored on the readable storage medium, and the computer or device of the system or device The processor reads and executes the instructions stored in the readable storage medium.

In this case, the program code itself read from the readable medium can implement the function of any one of the above embodiments, so the machine readable code and the readable storage medium storing the machine readable code constitute the present invention a part of.

Examples of readable storage media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD-RW), magnetic tape, Volatile memory card and ROM. Alternatively, the program code can be downloaded from the server computer or the cloud via the communication network.

Those skilled in the art should understand that the various embodiments disclosed above can be modified and modified without departing from the essence of the invention. Therefore, the protection scope of the present invention should be defined by the appended claims.

It should be noted that not all steps and units in the above processes and system structure diagrams are necessary, and some steps or units can be omitted according to actual needs. The order of execution of each step is not fixed and can be determined as needed. The device structure described in the foregoing embodiments may be a physical structure or a logical structure. That is, some units may be implemented by the same physical entity, or some units may be implemented by multiple physical entities, or may be implemented by multiple physical entities. Some components in independent devices are implemented together.

In the above embodiments, the hardware unit or module can be implemented mechanically or electrically. For example, a hardware unit, module, or processor may include a permanent dedicated circuit or logic (such as a dedicated processor, FPGA or ASIC) to complete the corresponding operation. The hardware unit or processor may also include programmable logic or circuits (such as general-purpose processors or other programmable processors), which may be temporarily set by software to complete corresponding operations. The specific implementation method (mechanical method, or dedicated permanent circuit, or temporarily set circuit) can be determined based on cost and time considerations.

The specific implementations set forth above in conjunction with the drawings describe exemplary embodiments, but do not represent all embodiments that can be implemented or fall within the protection scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration", and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, these techniques can be implemented without these specific details. In some instances, in order to avoid incomprehensibility to the concepts of the described embodiments, well-known structures and devices are shown in the form of block diagrams.

The foregoing description of the present disclosure is provided to enable any person of ordinary skill in the art to implement or use the present disclosure. For those of ordinary skill in the art, various modifications to the present disclosure are obvious, and the general principles defined herein can also be applied to other modifications without departing from the scope of protection of the present disclosure. . Therefore, the present disclosure is not limited to the examples and designs described herein, but is consistent with the widest scope that conforms to the principles and novel features disclosed herein.

Claims

A neural network model training method based on multi-party secure computing, wherein the neural network model uses a first number of training participants to train cooperatively, and the neural network model includes multiple hidden layers and is based on the client model and the server The model interval is divided into at least one client model and at least one server model. Each client model is decomposed into a first number of client sub-models. Each client sub-model has the same sub-model structure. The server model is deployed on the server, and each client sub-model is deployed on the client of the corresponding training participant. The method includes:

Perform the following loop process until the loop end condition is met:

The training sample data is provided to the current neural network model to obtain the current prediction value of the current neural network model through each current client model and each current server model. The training participants use their current client sub-models and the training sample data or the calculation results of the previous current server model to perform multi-party security calculations layer by layer to obtain the calculation results of the current client model. The current server model uses the calculation results of the previous current client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the current server model;

Determine the current prediction difference based on the current prediction value and the sample label value; and

When the loop end condition is not met, according to the current prediction difference, through backpropagation to adjust each current server model and each current client sub-model layer by layer model parameters, each of the adjusted services The end model and each client sub-model serve as each current server model and each current client sub-model of the next cycle process.
The neural network model training method according to claim 1, wherein the model calculation of the neural network model hierarchical structure in the server model has nothing to do with data privacy protection.
The neural network model training method according to claim 1, wherein the total number of hidden layers included in the client model is based on computing power used for model training, training timeliness required by application scenarios, and/or training The security level is determined.
The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers, the neural network model is divided into a first client model and a single server model, and the first client The end model includes an input layer and the first hidden layer to the Kth hidden layer, and the server model includes an output layer and the K+1th hidden layer to the Nth hidden layer.
The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model , The first client model includes an input layer and a first hidden layer to a Kth hidden layer, the server model includes a K+1th hidden layer to an Lth hidden layer, and the second client model includes an output Layer and the L+1th hidden layer to the Nth hidden layer.
The neural network model training method according to claim 1, wherein the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model , The first client model includes an input layer and a first hidden layer to a Kth hidden layer, the server model includes a K+1 hidden layer to an Nth hidden layer, and the second client model includes an output Floor.
The neural network model training method according to claim 1, wherein the process of determining the current prediction difference is executed on the server or on the client of the training participant who owns the sample label value.
3. The neural network model training method according to claim 1, wherein the loop end condition comprises:

The number of cycles reaches the predetermined number; or

The current forecast difference is within the predetermined difference range.
The neural network model training method according to claim 1, wherein the multi-party secure calculation includes one of secret sharing, obfuscation circuit, and homomorphic encryption.
The neural network model training method of claim 1, wherein the model calculation at the server end is implemented using TensorFlow or Pytorch technology.
The neural network model training method according to any one of claims 1 to 10, wherein the training sample data includes training sample data based on image data, voice data, or text data, or the training sample data includes user characteristic data .
A model prediction method based on a neural network model. The neural network model includes a plurality of hidden layers and is divided into at least one client model and at least one server model according to the interval between the client model and the server model, each The client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at the server, and each client sub-model is deployed at the first At the client terminal of the corresponding model owner among the number of model owners, the model prediction method includes:

Receive data to be predicted; and

The data to be predicted is provided to the neural network model to obtain the predicted value of the neural network model through coordinated calculation of each client model and each server model,

Among them, in each client model, through each model owner, use the respective client sub-models and the calculation results of the data to be predicted or the previous server model to perform multi-party security calculations layer by layer to obtain the client model's The calculation results, and in each server model, use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model.
The model prediction method according to claim 12, wherein the data to be predicted includes image data, voice data, or text data, or the data to be predicted includes user characteristic data.
A neural network model training device based on multi-party secure computing, wherein the neural network model utilizes a first number of training participants for collaborative training, wherein the neural network model includes multiple hidden layers and is combined with the client model according to the client model. The server model interval is divided into at least one client model and at least one server model, each client model is decomposed into a first number of client sub-models, and each client sub-model has the same sub-model structure. At least one server model is deployed on the server, and each client sub-model is deployed on the client of a corresponding training participant. The neural network model training device includes:

The model prediction unit provides training sample data to the current neural network model to obtain the current prediction value of the current neural network model through the cooperation of each current client model and each current server model. The model, through each training participant, uses the respective current client sub-models and the training sample data or the calculation results of the previous current server model to perform multi-party security calculations layer by layer to obtain the calculation results of the current client model , And in each current server model, use the calculation results of the previous current client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the current server model;

A prediction difference determination unit, based on the current prediction value and the sample label value, determines the current prediction difference; and

The model adjustment unit adjusts the model parameters of each current server model and each current client sub-model layer by layer according to the current prediction difference according to the current prediction difference when the loop end condition is not satisfied. Each server model and each client sub-model serves as each current server model and each current client sub-model of the next cycle process,

Wherein, the model prediction unit, the prediction difference determination unit, and the model adjustment unit perform operations in a loop until the loop end condition is satisfied.
The neural network model training device according to claim 14, wherein the model calculation of the neural network model hierarchical structure in the server model has nothing to do with data privacy protection.
The neural network model training device according to claim 14, wherein the total number of hidden layers included in the client model is based on computing power used for model training, training timeliness required by application scenarios, and/or training The security level is determined.
The neural network model training device according to claim 14, wherein the neural network model includes N hidden layers, the neural network model is divided into a first client model and a single server model, and the first client The end model includes an input layer and the first hidden layer to the Kth hidden layer, and the server model includes an output layer and the K+1th hidden layer to the Nth hidden layer.
The neural network model training device according to claim 14, wherein the neural network model includes N hidden layers, and the neural network model is divided into a first client model, a single server model, and a second client model , The first client model includes an input layer and a first hidden layer to a Kth hidden layer, the server model includes a K+1th hidden layer to an Lth hidden layer, and the second client model includes an output Layer and the L+1th hidden layer to the Nth hidden layer.
The neural network model training device according to claim 14, wherein the prediction difference determination unit is provided at the server or the client.
A model prediction device based on a neural network model, wherein the neural network model includes a plurality of hidden layers and is divided into at least one client model and at least one server model according to the interval between the client model and the server model, Each client model is decomposed into a first number of client sub-models, each client sub-model has the same sub-model structure, the at least one server model is deployed at the server, and each client sub-model is deployed at the At the client terminal of the corresponding model owner of the first number of model owners, the model prediction device includes:

The data receiving unit receives the data to be predicted;

The model prediction unit provides the data to be predicted to the neural network model, so as to obtain the prediction value of the neural network model through coordinated calculation of each client model and each server model,

Among them, in each client model, through each model owner, use the respective client sub-models and the calculation results of the data to be predicted or the previous server model to perform multi-party security calculations layer by layer to obtain the client model's The calculation results, and in each server model, use the calculation results of the previous client model to perform non-multi-party security calculations layer by layer to obtain the calculation results of the server model.
An electronic device including:

One or more processors, and

A memory coupled with the one or more processors, the memory stores instructions, and when the instructions are executed by the one or more processors, the one or more processors execute claims 1 to The method of any one of 11.
A machine-readable storage medium storing executable instructions, which when executed, cause the machine to execute the method according to any one of claims 1 to 11.
An electronic device including:

One or more processors, and

A memory coupled with the one or more processors, and the memory stores instructions, and when the instructions are executed by the one or more processors, the one or more processors are executed as claimed in claim 12 or 13. The method described.
A machine-readable storage medium that stores executable instructions that, when executed, cause the machine to perform the method according to claim 12 or 13.