CN112511361A - Model training method and device and computing equipment

Model training method and device and computing equipment

Info

Publication number
CN112511361A
Authority
CN
China
Prior art keywords
random number
random
party
number set
seed
Prior art date
Legal status
Granted
Application number
CN202110158418.5A
Other languages
Chinese (zh)
Other versions
CN112511361B (en)
Inventor
周亚顺 (Zhou Yashun)
赵原 (Zhao Yuan)
李漓春 (Li Lichun)
Current Assignee
Ant Blockchain Technology Shanghai Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202110158418.5A
Publication of CN112511361A
Application granted
Publication of CN112511361B


Classifications

    • H: ELECTRICITY
      • H04: ELECTRIC COMMUNICATION TECHNIQUE
        • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L 41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
          • H04L 9/0838: Key agreement, i.e. key establishment technique in which a shared key is derived by parties as a function of information contributed by, or associated with, each of these
          • H04L 9/0869: Generation of secret information, including derivation or calculation of cryptographic keys or passwords, involving random numbers or seeds
          • H04L 63/0414: Network architectures or protocols for network security providing a confidential data exchange among entities, wherein a party's identity is hidden during transmission, i.e. protected against eavesdropping, e.g. by using temporary identifiers, but is known to the other party or parties involved in the communication


Abstract

Embodiments of this specification disclose a model training method, apparatus, and computing device. The method includes the following steps: a first party generates a first random number set and a second random number set from a first random seed, where the first random seed is negotiated between the first party and a random number server; the random number server likewise generates the first random number set and the second random number set from the first random seed, generates a third random number set from a second random seed negotiated between a second party and the random number server, and generates a fourth random number set from the first, second, and third random number sets; the random number server sends the fourth random number set to the second party; the second party generates the third random number set from the second random seed and receives the fourth random number set. The first party can then jointly train a model using the first and second random number sets, while the second party does so using the third and fourth random number sets. These embodiments reduce the amount of data that must be transmitted.

Description

Model training method and device and computing equipment
Technical Field
Embodiments of this specification relate to the field of computer technology, and in particular to a model training method, apparatus, and computing device.
Background
Multi-party secure computation is a cryptographic technique: by executing a multi-party secure computation algorithm, the participating parties can jointly perform a secure computation over the input data each of them holds and obtain a result, without revealing that input data to the other parties.
Multi-party secure computation is widely used in business practice. For example, in a joint modeling scenario, the data used to train a mathematical model is scattered across different data parties. Using multi-party secure computation, multiple data parties can jointly train the mathematical model without leaking the data they hold, thereby protecting privacy.
Executing a multi-party secure computation requires each participating data party to consume a large quantity of random numbers. In the related art, these random numbers are generally generated by a trusted random number server and distributed to the parties participating in the computation. Because the random number server must send a large number of random numbers, it must transmit a large amount of data, occupying substantial network bandwidth.
Disclosure of Invention
Embodiments of this specification provide a model training method, apparatus, and computing device to reduce the amount of data the random number server must transmit, and thereby reduce network bandwidth usage. The technical solutions of the embodiments of this specification are as follows.
In a first aspect of embodiments of the present specification, there is provided a model training method, including:
the first party generates a first random number set and a second random number set according to a first random seed, wherein the first random seed is a random seed negotiated between the first party and a random number server;
the random number server generates a first random number set and a second random number set according to the first random seed, generates a third random number set according to the second random seed, and generates a fourth random number set according to the first random number set, the second random number set and the third random number set; the second random seed is a random seed negotiated between the second party and the random number server, and random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions;
the random number server sends a fourth random number set to the second party;
the second party generates a third random number set according to the second random seed and receives a fourth random number set; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set.
In a second aspect of the embodiments of the present specification, there is provided a model training method, applied to a first party, including:
generating a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between the first party and the random number server, the third random number set is generated by the second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
In a third aspect of the embodiments of the present specification, there is provided a model training method applied to a second party, including:
generating a third random number set according to a second random seed, wherein the second random seed is a random seed negotiated between a second party and a random number server;
receiving a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by the first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and the random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
In a fourth aspect of the embodiments of the present specification, there is provided a model training method applied to a random number server, including:
generating a first random number set and a second random number set according to a first random seed, wherein the first random seed is a random seed negotiated between a first party and a random number server;
generating a third random number set according to a second random seed, wherein the second random seed is a random seed negotiated between a second party and a random number server;
generating a fourth random number set according to the first random number set, the second random number set and the third random number set, wherein random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions;
sending a fourth set of random numbers to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
In a fifth aspect of the embodiments of the present specification, there is provided a model training apparatus, provided on a first party, the apparatus including:
a generating unit, configured to generate a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between the first party and the random number server, the third random number set is generated by the second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
In a sixth aspect of embodiments of the present specification, there is provided a model training apparatus, disposed on a second party, the apparatus including:
the generating unit is used for generating a third random number set according to a second random seed, wherein the second random seed is a random seed negotiated between a second party and a random number server;
a receiving unit, configured to receive a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by the first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and the random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
A seventh aspect of the embodiments of the present specification provides a model training apparatus, which is disposed in a random number server, and includes:
the first generation unit is used for generating a first random number set and a second random number set according to a first random seed, wherein the first random seed is a random seed negotiated between a first party and a random number server;
a second generating unit, configured to generate a third random number set according to a second random seed, where the second random seed is a random seed negotiated between a second party and a random number server;
a third generating unit, configured to generate a fourth random number set according to the first random number set, the second random number set, and the third random number set, where random numbers in the first random number set, the second random number set, the third random number set, and the fourth random number set satisfy a preset condition;
a sending unit, configured to send the fourth random number set to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
In an eighth aspect of embodiments of the present specification, there is provided a computing device comprising:
at least one processor;
a memory storing program instructions adapted to be executed by the at least one processor, the program instructions including instructions for performing the method of the second, third, or fourth aspect.
According to the technical solutions provided by the embodiments of this specification, when a model needs to be jointly trained, the random number server need only transmit the fourth random number set for the first party and the second party to jointly train the model. The embodiments of this specification therefore reduce the amount of data transmitted between the random number server and the first and second parties, and thereby reduce network bandwidth usage.
Drawings
In order to more clearly illustrate the embodiments of this specification or the technical solutions in the prior art, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of this specification; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a secret sharing process in an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of jointly training a model using the gradient descent method in an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of jointly training a model using Newton's method in an embodiment of the present disclosure;
FIG. 5 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 6 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 7 is a schematic flow chart of a model training method in an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of an electronic device in an embodiment of this specification.
Detailed Description
The technical solutions in the embodiments of this specification will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person skilled in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
Multi-Party Secure Computation (MPC) is an algorithm for protecting data privacy and security. It allows multiple data parties to perform cooperative computation without leaking their own data.
Secret Sharing (SS) is an algorithm for protecting data privacy and security. Multiple data parties can use a secret sharing algorithm to perform cooperative computation and share secret information without leaking their own data; each data party obtains one share of the secret. Please refer to FIG. 1. For example, suppose there are a data party P_1, a data party P_2, and a Trusted Third Party (TTP). P_1 holds business data x_1 and P_2 holds business data x_2. Using a secret sharing algorithm, P_1 and P_2 can cooperatively compute and share the secret information y: P_1 obtains the share y_1 of y and P_2 obtains the share y_2 of y, with y = y_1 + y_2 = x_1 · x_2. Specifically, the trusted third party may generate random numbers U, Z_1, V, Z_2 satisfying the relation Z_1 + Z_2 = U · V; it may send U and Z_1 to P_1, and V and Z_2 to P_2. P_1 may receive U and Z_1, compute E = x_1 - U, and send E to P_2. P_2 may receive V and Z_2, compute F = x_2 - V, and send F to P_1. P_1 may then receive F and compute its share y_1 = U · F + Z_1 of the secret information y; P_2 may receive E and compute its share y_2 = E · x_2 + Z_2.
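For illustration only, a minimal Python sketch of the multiplication protocol above follows (a hedged reading of this description, not an authoritative implementation; the helper names and the use of plain integers instead of finite-field elements are assumptions):

import random

def trusted_triple():
    # TTP generates U, V, Z1, Z2 satisfying Z1 + Z2 == U * V
    U, V, Z1 = (random.randrange(1 << 32) for _ in range(3))
    Z2 = U * V - Z1
    return U, V, Z1, Z2

def share_product(x1, x2):
    U, V, Z1, Z2 = trusted_triple()
    E = x1 - U         # P_1 masks its input and sends E to P_2
    F = x2 - V         # P_2 masks its input and sends F to P_1
    y1 = U * F + Z1    # P_1's share of y = x1 * x2
    y2 = E * x2 + Z2   # P_2's share
    return y1, y2

y1, y2 = share_product(6, 7)
assert y1 + y2 == 6 * 7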
An excitation function (also known as an activation function) may be used to construct a mathematical model. The excitation function defines the output for a given input; it is a non-linear function, and through it non-linear factors can be added to the mathematical model, improving its expressive power. Excitation functions include the Sigmoid, Tanh, and ReLU functions. A loss function (Loss Function) may be used to measure the degree of inconsistency between the predicted values and the true values of the mathematical model; the smaller the value of the loss function, the better the robustness of the model. Loss functions include, but are not limited to, the logarithmic loss function (Logarithmic Loss) and the square loss function (Square Loss). The mathematical model may include a logistic regression model, a neural network model, and the like.
The random seed may be a random number used to generate other random numbers. In practical applications, one or more random numbers may be generated from the random seed using a random number generation algorithm, such as the middle-square method or the linear congruential method.
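To make the seed mechanism concrete, here is a minimal sketch (an illustrative assumption; this text does not prescribe a specific generator) showing that two parties running the same generator on the same negotiated seed derive identical random numbers, which therefore never need to be transmitted:

import random

def derive_numbers(seed, n):
    rng = random.Random(seed)   # deterministic pseudo-random generator
    return [rng.randrange(1 << 32) for _ in range(n)]

shared_seed = 2021
assert derive_numbers(shared_seed, 5) == derive_numbers(shared_seed, 5)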
In the related art, training samples are distributed across a first party and a second party. Specifically, the first party may hold the label data of the training samples and the second party may hold the feature data. The first party jointly trains a mathematical model based on the label data, and the second party based on the feature data; random numbers are consumed during the joint training. In general, a random number server generates a first, a second, a third, and a fourth random number set; it sends the first and second sets to the first party and the third and fourth sets to the second party. The first party receives the first and second random number sets; the second party receives the third and fourth random number sets. The first party can then jointly train the mathematical model using the label data and the first and second random number sets, while the second party uses the feature data and the third and fourth random number sets. However, because the random number server must send the first and second sets to the first party and the third and fourth sets to the second party, it must transmit a large amount of data and occupies a large amount of network bandwidth. In particular, when the network quality of the first party and/or the second party is poor, the efficiency of model training suffers.
The embodiment of the specification provides a system. A mathematical model may be trained using the system.
The system may include a first party, a second party, and a nonce server. The first party, the second party, and the random number server may be a single server, a server cluster composed of a plurality of servers, or a server deployed in the cloud.
The first party and the second party may be the two parties to a joint modeling, with the training samples distributed between them. Specifically, the first party may hold the label data of the training samples and the second party may hold the feature data. For example, the first party may be a credit bureau holding label data indicating users' credit status, and the second party may be a big data company holding feature data such as a user's loan amount, social security contribution base, marital status, and home ownership. In practical applications there may be multiple training samples; the first party may hold the label data and identifiers of the training samples, and the second party the feature data and identifiers. Using the identifiers, the first party can select the label data of one or more training samples and the second party can select the feature data of the same training samples, so as to jointly train the mathematical model. Of course, as required, the first party may instead hold the feature data of the training samples and the second party the label data.
The random number server is used for providing random numbers required in the joint modeling process to the first party and/or the second party.
Referring to fig. 2, the model training method based on the system may include the following steps.
Step S11: the first party generates a first set of random numbers and a second set of random numbers from the first random seed.
In some embodiments, the first random seed may be a random seed negotiated between the first party and the random number server, and the first party may obtain it in advance. The first random seed may be generated by the first party: the first party may generate a random seed as the first random seed and send it to the random number server, which receives it. Alternatively, the first random seed may be generated by the random number server: the server may generate a random seed as the first random seed and send it to the first party, which receives it.
In some embodiments, the first and second sets of random numbers may each include one or more random numbers. The first party may generate one or more random numbers from a first random seed as random numbers in the first set of random numbers and one or more random numbers as random numbers in the second set of random numbers using a random number generation algorithm. For example, the first party may first generate N random numbers as the random numbers in the first random number set from a first random seed using a random number generation algorithm, and further generate N random numbers as the random numbers in the second random number set.
Step S13: the random number server generates a first random number set and a second random number set according to the first random seed; generating a third random number set according to the second random seed; and generating a fourth random number set according to the first random number set, the second random number set and the third random number set.
In some embodiments, the random number server may obtain the first random seed in advance, and details are not repeated.
Different executing entities running the same random number generation algorithm on the same random seed obtain the same random numbers. The random number server can therefore generate the same first and second random number sets from the first random seed as the first party does; its process may be understood with reference to the first party's process described above.
In some embodiments, the second random seed may be a random seed negotiated between the second party and the random number server, and the random number server may obtain it in advance. The second random seed may be generated by the second party: the second party may generate a random seed as the second random seed and send it to the random number server, which receives it. Alternatively, the second random seed may be generated by the random number server: the server may generate a random seed as the second random seed and send it to the second party, which receives it.
In some embodiments, the third set of random numbers may include one or more random numbers. The random number server may generate one or more random numbers as random numbers in the third random number set from a second random seed using a random number generation algorithm. For example, the random number server may generate N random numbers as random numbers in the third random number set according to a second random seed by using a random number generation algorithm, where N is a natural number.
In some embodiments, the random number server may generate a fourth random number set from the first, second, and third random number sets. The fourth set may include one or more random numbers. The random numbers in the first, second, third, and fourth random number sets satisfy a preset condition. For example, the preset condition may be: Z_1i + Z_2i = U_i · V_i, where V_i denotes the i-th random number in the first random number set, Z_1i the i-th random number in the second random number set, U_i the i-th random number in the third random number set, and Z_2i the i-th random number in the fourth random number set; i is a natural number greater than 0. In some usage scenarios, the four random number sets may contain the same number of random numbers.
In some usage scenarios, the random number server may calculate the random numbers in the fourth random number set from the first, second, and third random number sets and the preset condition. Continuing the previous example, from the random number V_i in the first set, Z_1i in the second set, and U_i in the third set, the random number server may compute the random numbers of the fourth set by the formula Z_2i = U_i · V_i - Z_1i.
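A one-line sketch of this derivation (the names are illustrative assumptions), given the first set V, second set Z1, and third set U:

def fourth_set(V, Z1, U):
    # Z2_i = U_i * V_i - Z1_i, so that Z1_i + Z2_i == U_i * V_i
    return [u * v - z1 for u, v, z1 in zip(U, V, Z1)]

assert fourth_set([3], [5], [4]) == [7]   # 5 + 7 == 4 * 3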
Step S15: the random number server sends a fourth set of random numbers to the second party.
Step S17: the second party generates a third random number set according to the second random seed and receives a fourth random number set.
In some embodiments, the second party may obtain the second random seed in advance, and the detailed process is not repeated.
Different executing entities running the same random number generation algorithm on the same random seed obtain the same random numbers. The second party can therefore generate the third random number set from the second random seed; its process may be understood with reference to the random number server's process described above.
Through steps S11-S17, the first party may obtain a first set of random numbers and a second set of random numbers, and the second party may obtain a third set of random numbers and a fourth set of random numbers. Thus, the first party may jointly train the model based on the first set of random numbers and the second set of random numbers, and the second party may jointly train the model based on the third set of random numbers and the fourth set of random numbers.
In some usage scenarios, the first party may generate the first random seed in advance and send it to the random number server, which receives it in advance. The random number server may also generate the second random seed in advance and send it to the second party, which receives it in advance. Thus, the first party and the random number server obtain the first random seed in advance, and the random number server and the second party obtain the second random seed in advance. When the model needs to be jointly trained, the first party may generate the first and second random number sets from the first random seed. The random number server may generate the first and second random number sets from the first random seed, generate the third random number set from the second random seed, generate the fourth random number set from the first, second, and third random number sets, and send the fourth random number set to the second party. The second party may receive the fourth random number set and generate the third random number set from the second random seed. In this way, when joint training is needed, no data needs to be transmitted between the random number server and the first party, and only the fourth random number set needs to be transmitted between the random number server and the second party. Since the random number server need only send the fourth random number set to the second party, the amount of data transmitted during joint training is reduced and training efficiency is improved.
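An end-to-end sketch of this flow under the same illustrative assumptions (Python's seeded generator standing in for the negotiated random number generation algorithm); note that the only values that cross the network are those of the fourth set:

import random

def one_set(rng, n):
    return [rng.randrange(1 << 32) for _ in range(n)]

seed_1, seed_2, n = 11, 22, 4
# First party, locally: first and second sets from seed_1.
rng = random.Random(seed_1)
V_p1, Z1_p1 = one_set(rng, n), one_set(rng, n)
# Random number server, locally: the same two sets, plus the third set.
rng = random.Random(seed_1)
V, Z1 = one_set(rng, n), one_set(rng, n)
U = one_set(random.Random(seed_2), n)
Z2 = [u * v - z1 for u, v, z1 in zip(U, V, Z1)]   # sent to the second party
# Second party, locally: third set from seed_2; fourth set received.
U_p2 = one_set(random.Random(seed_2), n)
assert (V_p1, Z1_p1) == (V, Z1) and U_p2 == U
assert all(z1 + z2 == u * v for z1, z2, u, v in zip(Z1, Z2, U, V))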
Referring to fig. 3, a scenario example of a first party and a second party training model in combination according to an embodiment of the present disclosure is described below.
In this scenario example, the first party may hold the label data of the training samples and the first share of the model parameters, and the second party may hold the feature data of the training samples and the second share of the model parameters. Based on a secret sharing algorithm, the first party can jointly train the model using the gradient descent method according to the label data, the first share of the model parameters, the first random number set, and the second random number set, while the second party does so according to the feature data, the second share of the model parameters, the third random number set, and the fourth random number set. The model may include a logistic regression model, a neural network model, and the like.
It should be noted that this scenario example describes the joint training process with the first party holding the label data and the second party holding the feature data, but practical applications are not limited to this arrangement. For example, the first party may instead hold the feature data of the training samples and the second party the label data.
Specifically, the joint training model may include a plurality of iterative processes, and each iterative process may include the following steps.
Step S21: the first party secretly shares the product according to the first random number set, the second random number set, and the first share of the model parameters; the second party according to the third random number set, the fourth random number set, the feature data, and the second share of the model parameters. The first party obtains the first share of the product and the second party obtains the second share of the product.
The sum of the first share of the model parameters and the second share of the model parameters equals the model parameters. For example, if the model parameters are denoted W, the first party may hold <W>_0 and the second party may hold <W>_1, with <W>_0 + <W>_1 = W.
If the current round is the first iteration, the model parameters may be the initial model parameters of the mathematical model, which may be empirical values or random values. In practical applications, the random number server or another trusted computing device can split the initial model parameters into a first share and a second share, send the first share to the first party, and send the second share to the second party; the first party receives the first share of the initial model parameters and the second party receives the second share. If the current round is not the first iteration, the first party obtains the first share of the model parameters, and the second party the second share, from the previous round of iteration.
The product may comprise the product between the feature data and the model parameters. For example, if the feature data are denoted X and the model parameters W, the product may be expressed as WX = X·W. The sum of the first share of the product and the second share of the product equals the product: if the first share is denoted <WX>_0 and the second share <WX>_1, then <WX>_0 + <WX>_1 = WX.
The process of the first party and the second party sharing the product secretly may be as follows.
In one aspect, the first party may determine a first intermediate result from the first random number set and the first share of the model parameters, and send it to the second party; the second party may receive the first intermediate result and determine the second share of the product from it, the third random number set, and the fourth random number set. In the other direction, the second party may determine a second intermediate result from the third random number set and the feature data, and send it to the first party; the first party may receive the second intermediate result and determine the first share of the product from it, the second random number set, and the first share of the model parameters. For example: the first party may use the random number V_1 in the first random number set and the first share <W>_0 of the model parameters to compute the first intermediate result F_1 = <W>_0 - V_1, and send F_1 to the second party. The second party may receive F_1 and, using the random number U_1 in the third random number set and the random number Z_21 in the fourth random number set, compute <[X<W>_0]>_1 = U_1·F_1 + Z_21; it may then compute X·<W>_1 + <[X<W>_0]>_1 as the second share <WX>_1 of the product WX. Meanwhile, the second party may use the random number U_1 in the third random number set and the feature data X to compute the second intermediate result E_1 = X - U_1, and send E_1 to the first party. The first party may receive E_1 and, using the random number Z_11 in the second random number set and the first share <W>_0 of the model parameters, compute <[X<W>_0]>_0 = E_1·<W>_0 + Z_11 as the first share <WX>_0 of the product WX. Here Z_11 + Z_21 = U_1·V_1.
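A matrix-level sketch of this exchange (an assumption for illustration, using small integer matrices rather than the fixed-point encodings a real deployment would need):

import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(4, 3))    # feature data, held by the second party
W = rng.integers(0, 10, size=(3, 1))    # model parameters
W0 = rng.integers(0, 10, size=(3, 1))   # first party's share <W>_0
W1 = W - W0                             # second party's share <W>_1

V1 = rng.integers(0, 10, size=(3, 1))   # from the first random number set
U1 = rng.integers(0, 10, size=(4, 3))   # from the third random number set
Z11 = rng.integers(0, 10, size=(4, 1))  # from the second random number set
Z21 = U1 @ V1 - Z11                     # from the fourth random number set

F1 = W0 - V1                      # first party -> second party
E1 = X - U1                       # second party -> first party
WX0 = E1 @ W0 + Z11               # first party's share <WX>_0
WX1 = X @ W1 + (U1 @ F1 + Z21)    # second party's share <WX>_1
assert np.array_equal(WX0 + WX1, X @ W)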
Step S23: the first party secretly shares the value of the excitation function according to the first random number set, the second random number set, and the first share of the product; the second party does so according to the third random number set, the fourth random number set, and the second share of the product. The first party obtains the first share of the excitation function value and the second party obtains the second share.
The sum of the first share of the excitation function value and the second share equals the value of the excitation function.
Obtaining the value of the excitation function usually involves non-linear operations (such as logarithms, exponentials, or trigonometric functions), which are difficult to evaluate directly with secret sharing. A polynomial may therefore be used to fit the excitation function, and the value of the polynomial, determined by secret sharing, is taken as the value of the excitation function. Specifically, the first party may obtain the value of the secret-shared polynomial according to the first random number set, the second random number set, and the first share of the product, while the second party does so according to the third random number set, the fourth random number set, and the second share of the product. The first party obtains the first share of the polynomial value as the first share of the excitation function value; the second party obtains the second share of the polynomial value as the second share of the excitation function value.
The process by which the first party and the second party secretly share the value of the excitation function is as follows.
In one aspect, the first party may determine a third intermediate result from the first random number set and the first share of the product, and send it to the second party; the second party may receive the third intermediate result and determine the second share of the polynomial value, as the second share of the excitation function value, from it, the third random number set, and the fourth random number set. In the other direction, the second party may determine a fourth intermediate result from the third random number set and the second share of the product, and send it to the first party; the first party may receive the fourth intermediate result and determine the first share of the polynomial value, as the first share of the excitation function value, from it, the second random number set, and the first share of the product.
For example, the excitation function may be the Sigmoid function, and the polynomial may be expressed as

a = a_0 + a_1(WX) + a_2(WX)^3
  = a_0 + a_1(<WX>_1 + <WX>_0) + a_2(<WX>_1^3 + 3<WX>_1^2·<WX>_0 + 3<WX>_1·<WX>_0^2 + <WX>_0^3).

Then, in one aspect, the first party may use the random numbers V_2 and V_3 in the first random number set and the first share <WX>_0 of the product to compute the third intermediate results F_2 = <WX>_0 - V_2 and F_3 = <WX>_0^2 - V_3, and send them to the second party. The second party may receive F_2 and F_3; using F_2, the random number U_2 in the third random number set, and the random number Z_22 in the fourth random number set, it may compute <[3a_2<WX>_1^2·<WX>_0]>_1 = U_2·F_2 + Z_22; using F_3, the random number U_3 in the third random number set, and the random number Z_23 in the fourth random number set, it may compute <[3a_2<WX>_1·<WX>_0^2]>_1 = U_3·F_3 + Z_23; it may then compute a_0 + a_1·<WX>_1 + <[3a_2<WX>_1^2·<WX>_0]>_1 + <[3a_2<WX>_1·<WX>_0^2]>_1 + a_2<WX>_1^3 as the second share <a>_1 of the polynomial value a. In the other direction, the second party may use U_2, U_3, and the second share <WX>_1 of the product to compute the fourth intermediate results E_2 = 3a_2<WX>_1^2 - U_2 and E_3 = 3a_2<WX>_1 - U_3, and send them to the first party. The first party may receive E_2 and E_3; using E_2, the random number Z_12 in the second random number set, and <WX>_0, it may compute <[3a_2<WX>_1^2·<WX>_0]>_0 = E_2·<WX>_0 + Z_12; using E_3, the random number Z_13 in the second random number set, and <WX>_0^2, it may compute <[3a_2<WX>_1·<WX>_0^2]>_0 = E_3·<WX>_0^2 + Z_13; it may then compute a_1·<WX>_0 + <[3a_2<WX>_1^2·<WX>_0]>_0 + <[3a_2<WX>_1·<WX>_0^2]>_0 + a_2<WX>_0^3 as the first share <a>_0 of the polynomial value a. Here <a>_0 + <a>_1 = a, Z_12 + Z_22 = U_2·V_2, and Z_13 + Z_23 = U_3·V_3.
step S25: the first party takes a value according to the first random number set, the second random number set, the label data and the first fragment of the excitation function, and the second party takes a value according to the third random number set, the fourth random number set, the characteristic data and the second fragment of the excitation function, so that the gradient of the loss function is shared secretly. The first party obtains a first patch of loss function gradients and the second party obtains a second patch of loss function gradients.
The sum of the first patch of the loss function gradient and the second patch of the loss function gradient is equal to the gradient of the loss function.
The process by which the first party and the second party secretly share the gradient of the loss function is as follows.
In one aspect, the first party may determine a fifth intermediate result from the first random number set, the first share of the excitation function value, and the label data, and send it to the second party; the second party may receive the fifth intermediate result and determine the second share of the loss function gradient from it, the third random number set, and the fourth random number set. In the other direction, the second party may determine a sixth intermediate result from the third random number set, the second share of the excitation function value, and the feature data, and send it to the first party; the first party may receive the sixth intermediate result and determine the first share of the loss function gradient from it, the second random number set, the first share of the excitation function value, and the label data.
For example, in one aspect, the first party may use the random numbers V_4 and V_5 in the first random number set, the first share <a>_0 of the excitation function value, and the label data y to compute the fifth intermediate results F_4 = <a>_0 - V_4 and F_5 = y - V_5, and send them to the second party. The second party may receive F_4 and F_5; using F_4, the random number U_4 in the third random number set, and the random number Z_24 in the fourth random number set, it may compute <[X^T·<a>_0]>_1 = U_4·F_4 + Z_24; using F_5, the random number U_5 in the third random number set, and the random number Z_25 in the fourth random number set, it may compute <[X^T·y]>_1 = U_5·F_5 + Z_25; it may then compute <[X^T·<a>_0]>_1 + X^T·<a>_1 - <[X^T·y]>_1 as the second share <dW>_1 of the loss function gradient dW = X^T·(a - y). In the other direction, the second party may use U_4, U_5, and the feature data X to compute the sixth intermediate results E_4 = X^T - U_4 and E_5 = X^T - U_5, and send them to the first party. The first party may receive E_4 and E_5; using E_4, the random number Z_14 in the second random number set, and <a>_0, it may compute <[X^T·<a>_0]>_0 = E_4·<a>_0 + Z_14; using E_5, the random number Z_15 in the second random number set, and the label data y, it may compute <[X^T·y]>_0 = E_5·y + Z_15; it may then compute <[X^T·<a>_0]>_0 - <[X^T·y]>_0 as the first share <dW>_0 of the loss function gradient dW = X^T·(a - y). Here <dW>_0 + <dW>_1 = dW, Z_14 + Z_24 = U_4·V_4, and Z_15 + Z_25 = U_5·V_5.
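A small consistency sketch (an assumption, with the masked exchanges replaced by direct additive sharing) checking that these shares recombine to dW = X^T·(a - y):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(5, 3))             # feature data (second party)
y = rng.integers(0, 2, size=(5, 1))     # label data (first party)
a = rng.random((5, 1))                  # excitation function value
a0 = rng.random((5, 1)); a1 = a - a0    # shares of a

s = X.T @ a0; s0 = rng.normal(size=(3, 1)); s1 = s - s0   # shares of X^T <a>_0
t = X.T @ y;  t0 = rng.normal(size=(3, 1)); t1 = t - t0   # shares of X^T y
dW0 = s0 - t0                   # first party's share <dW>_0
dW1 = s1 + X.T @ a1 - t1        # second party's share <dW>_1
assert np.allclose(dW0 + dW1, X.T @ (a - y))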
step S27: the first party determines a first fragment of a new model parameter according to the first fragment of the model parameter, the first fragment of the loss function gradient and a preset step length.
Step S29: and the second party determines a new second fragment of the model parameter according to the second fragment of the model parameter, the second fragment of the loss function gradient and the preset step length.
The preset step size is used to control the iteration speed of the gradient descent method and may be any suitable positive real number. If it is too large, iteration proceeds too fast and the optimal model parameters may not be reached; if it is too small, iteration is slow and time-consuming. The preset step size may be an empirical value, may be obtained by machine learning, or may be obtained in other ways. Both the first party and the second party hold the preset step size.
The first party may multiply the first share of the loss function gradient by the preset step size and subtract the result from the first share of the model parameters to obtain the first share of the new model parameters. The second party may multiply the second share of the loss function gradient by the preset step size and subtract the result from the second share of the model parameters to obtain the second share of the new model parameters. The sum of the first share and the second share of the new model parameters equals the new model parameters.
For example, the first party may compute <W'>_0 = <W>_0 - L·<dW>_0 as the first share of the new model parameters, and the second party may compute <W'>_1 = <W>_1 - L·<dW>_1 as the second share of the new model parameters, where L is the preset step size, <W>_0 + <W>_1 = W, and <W'>_0 + <W'>_1 = W'.
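A minimal sketch (an assumption) checking that updating the shares separately is equivalent to the plaintext step W' = W - L·dW:

import numpy as np

rng = np.random.default_rng(1)
W, dW = rng.normal(size=(3, 1)), rng.normal(size=(3, 1))
W0 = rng.normal(size=(3, 1)); W1 = W - W0        # shares of W
dW0 = rng.normal(size=(3, 1)); dW1 = dW - dW0    # shares of dW
L = 0.1
assert np.allclose((W0 - L * dW0) + (W1 - L * dW1), W - L * dW)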
Referring to fig. 4, another exemplary scenario for jointly training a model by a first party and a second party according to an embodiment of the present disclosure is described below.
In this scenario example, the first party may hold the label data of the training samples and the first share of the model parameters, and the second party may hold the feature data of the training samples and the second share of the model parameters. Based on a secret sharing algorithm, the first party can jointly train the model using Newton's method according to the label data, the first share of the model parameters, the first random number set, and the second random number set, while the second party does so according to the feature data, the second share of the model parameters, the third random number set, and the fourth random number set. The model may include a logistic regression model, a neural network model, and the like.
It should be noted that this scenario example describes the joint training process with the first party holding the label data and the second party holding the feature data, but practical applications are not limited to this arrangement. For example, the first party may instead hold the feature data of the training samples and the second party the label data.
Specifically, the joint training model may include a plurality of iterative processes, and each iterative process may include the following steps.
Step S301: the first party secretly shares a first product according to the first random number set, the second random number set, and the first share of the model parameters; the second party according to the third random number set, the fourth random number set, the feature data, and the second share of the model parameters. The first party obtains the first share of the first product and the second party obtains the second share of the first product.
The first product may comprise a product between the feature data and the model parameters.
Step S303: the first party secretly shares the value of the excitation function according to the first random number set, the second random number set, and the first share of the first product; the second party according to the third random number set, the fourth random number set, and the second share of the first product. The first party obtains the first share of the excitation function value and the second party obtains the second share.
Step S305: the first party, from the first random number set, the second random number set, the label data, and the first share of the excitation function value, and the second party, from the third random number set, the fourth random number set, the feature data, and the second share of the excitation function value, secretly share the gradient of the loss function. The first party obtains the first share of the loss function gradient and the second party obtains the second share.
Step S301 may be understood with reference to step S21, step S303 with reference to step S23, and step S305 with reference to step S25.
Step S307: the first party, according to the first slice of the value of the excitation function, and the second party, according to the feature data and the second slice of the value of the excitation function, secretly share the Hessian matrix. The first party obtains a first slice of the Hessian matrix, and the second party obtains a second slice of the Hessian matrix.
A Hessian matrix is a square matrix formed by the second-order partial derivatives of the loss function and represents the local curvature of the loss function. The sum of the first slice of the Hessian matrix and the second slice of the Hessian matrix is equal to the Hessian matrix.
In practical applications, the first party may, according to the first slice of the value of the excitation function, and the second party, according to the second slice of the value of the excitation function, secretly share a diagonal matrix; the first party may obtain a first slice of the diagonal matrix, the second party may obtain a second slice of the diagonal matrix, and the sum of the two slices is equal to the diagonal matrix. The first party may then, according to the first slice of the diagonal matrix, and the second party, according to the feature data and the second slice of the diagonal matrix, secretly share the Hessian matrix; the first party may obtain a first slice of the Hessian matrix, and the second party may obtain a second slice of the Hessian matrix.
For example, the first party may, according to the first slice <a>0 of the value of the excitation function, and the second party, according to the second slice <a>1, secretly share the diagonal matrix RNN = diag(r) = diag(a·(1-a)). The first party may obtain a first slice RNN0 of the diagonal matrix, and the second party may obtain a second slice RNN1 of the diagonal matrix. The detailed procedure by which the first party and the second party secretly share the diagonal matrix RNN is described below.
The first party may, according to <a>0, and the second party, according to <a>1, secretly share <a>0·<a>1. The first party may obtain <[<a>0·<a>1]>0 and the second party may obtain <[<a>0·<a>1]>1, where <[<a>0·<a>1]>0 + <[<a>0·<a>1]>1 = <a>0·<a>1. Here, · denotes element-wise multiplication. For example, for vectors m = (m1, m2, m3) and n = (n1, n2, n3), m·n = (m1·n1, m2·n2, m3·n3).
Further, the first party may calculate <r>0 = <a>0 - 2<[<a>0·<a>1]>0 - <a>0·<a>0, and the second party may calculate <r>1 = <a>1 - 2<[<a>0·<a>1]>1 - <a>1·<a>1. (Each party subtracts twice its slice of the cross product, since the cross term appears twice when a·a is expanded over the two slices.) Then:

<r>0 + <r>1
= <a>0 - 2<[<a>0·<a>1]>0 - <a>0·<a>0 + <a>1 - 2<[<a>0·<a>1]>1 - <a>1·<a>1
= <a>0 + <a>1 - 2<a>0·<a>1 - <a>0·<a>0 - <a>1·<a>1
= (<a>0 + <a>1)·(1 - <a>0 - <a>1) = a·(1 - a) = r

<r>0, <r>1 and r are all vectors.
Further, the first party may, according to <r>0, generate the first slice RNN0 = diag(<r>0) of the diagonal matrix RNN = diag(r), and the second party may, according to <r>1, generate the second slice RNN1 = diag(<r>1), so that RNN0 + RNN1 = RNN. The slices RNN0 and RNN1 of the diagonal matrix RNN are themselves diagonal matrices. In practical applications, the first party may use the data elements of <r>0 as the data elements on the main diagonal of RNN0, thereby generating RNN0 from <r>0, and the second party may use the data elements of <r>1 as the data elements on the main diagonal of RNN1, thereby generating RNN1 from <r>1.
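The slice algebra above can be checked numerically. The sketch below is an illustration rather than the protocol itself: the secure sharing of the cross product <a>0·<a>1 is simulated with a random mask, whereas in the embodiment it would be carried out with the pre-distributed random number sets.

```python
# Numeric check that RNN0 + RNN1 = diag(a*(1-a)); the secure product of
# <a>0 and <a>1 is simulated by splitting it with a random mask.
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(0, 1, size=4)            # excitation function values
a0 = rng.normal(size=4); a1 = a - a0     # slices <a>0, <a>1

c0 = rng.normal(size=4)                  # <[<a>0*<a>1]>0
c1 = a0 * a1 - c0                        # <[<a>0*<a>1]>1

r0 = a0 - 2 * c0 - a0 * a0               # slice <r>0 (first party)
r1 = a1 - 2 * c1 - a1 * a1               # slice <r>1 (second party)

RNN0, RNN1 = np.diag(r0), np.diag(r1)
assert np.allclose(RNN0 + RNN1, np.diag(a * (1 - a)))
```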
The first party may, according to RNN0, and the second party, according to X and RNN1, secretly share the Hessian matrix H. The first party may obtain a first slice <H>0 of the Hessian matrix, and the second party may obtain a second slice <H>1 of the Hessian matrix. The detailed process by which the first party and the second party secretly share the Hessian matrix H is described below.
The first party may, according to RNN0, and the second party, according to X, secretly share X^T RNN0. The first party may obtain <X^T RNN0>0 and the second party may obtain <X^T RNN0>1, where <X^T RNN0>0 + <X^T RNN0>1 = X^T RNN0.
Further, the first party may, according to <X^T RNN0>0, and the second party, according to X, secretly share <X^T RNN0>0 X. The first party may obtain <[<X^T RNN0>0 X]>0 and the second party may obtain <[<X^T RNN0>0 X]>1, where <[<X^T RNN0>0 X]>0 + <[<X^T RNN0>0 X]>1 = <X^T RNN0>0 X.
Further, the first party may take <[<X^T RNN0>0 X]>0 as the first slice <H>0 of the Hessian matrix H, and the second party may calculate <[<X^T RNN0>0 X]>1 + <X^T RNN0>1 X + X^T RNN1 X as the second slice <H>1 of the Hessian matrix H. Then:

<H>0 + <H>1
= <[<X^T RNN0>0 X]>0 + <[<X^T RNN0>0 X]>1 + <X^T RNN0>1 X + X^T RNN1 X
= <X^T RNN0>0 X + <X^T RNN0>1 X + X^T RNN1 X
= X^T RNN0 X + X^T RNN1 X = X^T RNN X = H
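A minimal numeric sketch of this assembly of <H>0 and <H>1 follows; the two secure matrix products are again simulated with random masks, and the array shapes are assumptions chosen for illustration.

```python
# Numeric check that <H>0 + <H>1 = X^T RNN X; secure products are simulated.
import numpy as np

rng = np.random.default_rng(2)
n, d = 5, 3
X = rng.normal(size=(n, d))                    # feature data (second party)
r = rng.uniform(0, 1, size=n)
RNN0 = np.diag(rng.normal(size=n))             # first party's slice of RNN
RNN1 = np.diag(r) - RNN0                       # second party's slice of RNN

s = X.T @ RNN0                                 # secure product X^T RNN0
s0 = rng.normal(size=s.shape); s1 = s - s0     # its two slices

t = s0 @ X                                     # secure product <X^T RNN0>0 X
t0 = rng.normal(size=t.shape); t1 = t - t0     # its two slices

H0 = t0                                        # <H>0
H1 = t1 + s1 @ X + X.T @ RNN1 @ X              # <H>1

assert np.allclose(H0 + H1, X.T @ np.diag(r) @ X)
```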
Step S309: the first party secretly shares the first inverse matrix according to the first sharding of the hessian matrix and the second party secretly shares the first inverse matrix according to the second sharding of the hessian matrix. The first party obtains a first tile of the first inverse matrix and the second party obtains a second tile of the first inverse matrix. The first inverse matrix is an inverse of the hessian matrix.
Since the hessian matrix is a square matrix, the hessian matrix may be subjected to inversion processing, and an inverse matrix of the hessian matrix may be used as the first inverse matrix.
The random number server may generate a random number matrix. In some embodiments, the first party and the random number server each hold a first random seed, so that the first party may generate a first slice of the random number matrix from the first random seed, and the random number server may generate the same first slice from the first random seed. The random number server may subtract the random numbers in the first slice of the random number matrix from the corresponding random numbers in the random number matrix to obtain a second slice of the random number matrix, and may send the second slice of the random number matrix to the second party, which receives it. In other embodiments, the second party and the random number server each hold a second random seed, so that the second party may generate a second slice of the random number matrix from the second random seed, and the random number server may generate the same second slice from the second random seed. The random number server may subtract the random numbers in the second slice of the random number matrix from the corresponding random numbers in the random number matrix to obtain a first slice of the random number matrix, and may send the first slice of the random number matrix to the first party, which receives it. For example, the random number matrix may be represented as R, a random number in it as Ri, the first slice of the random number matrix as <R>0 with random numbers <Ri>0, and the second slice as <R>1 with random numbers <Ri>1, where <R>0 + <R>1 = R and <Ri>0 + <Ri>1 = Ri. In this way, the first random seed or the second random seed further reduces the amount of data transmitted between the random number server and the parties when the first party and the second party jointly train the model using Newton's method.
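The seed trick can be sketched as follows; the seed values, the expansion function and the matrix shape are illustrative assumptions, and a real implementation would use a cryptographic pseudorandom generator rather than numpy's.

```python
# Sketch: the first party and the server expand the same seed into <R>0,
# so the server only needs to transmit <R>1 to the second party.
import numpy as np

def expand(seed, shape):
    """Deterministically expand a seed into a pseudorandom matrix."""
    return np.random.default_rng(seed).normal(size=shape)

seed1 = 42                                        # first random seed (negotiated)

# Random number server side:
R = np.random.default_rng(7).normal(size=(3, 3))  # random number matrix R
R0_server = expand(seed1, R.shape)                # <R>0 derived from the seed
R1 = R - R0_server                                # <R>1, sent to the second party

# First party side: regenerates <R>0 locally, nothing is transmitted to it.
R0_party = expand(seed1, R.shape)

assert np.allclose(R0_party + R1, R)
```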
The first party may secretly share the second product according to the first slice of the random number matrix and the first slice of the Hessian matrix, and the second party may secretly share the second product according to the second slice of the random number matrix and the second slice of the Hessian matrix. The first party may obtain a first slice of the second product, the second party may obtain a second slice of the second product, and the sum of the two slices is equal to the second product. The second product may comprise the product of the Hessian matrix and the random number matrix. For example, the second product may be represented as HR, where H represents the Hessian matrix and R represents the random number matrix.
In some embodiments of the present scenario example, the second product may be inverted by the second party. Specifically, the first party may send its first slice of the second product to the second party. The second party may receive the first slice of the second product and add it to its own second slice to obtain the second product. The second product being a square matrix, the second party may invert it, take the resulting inverse matrix as a second inverse matrix, and send the second inverse matrix to the first party, which receives it. Alternatively, in other embodiments of the present scenario example, the second product may instead be inverted by the first party. Specifically, the second party may send its second slice of the second product to the first party. The first party may receive the second slice of the second product and add it to its own first slice to obtain the second product; the first party may then invert the second product, take the resulting inverse matrix as the second inverse matrix, and send the second inverse matrix to the second party, which receives it.
The first party may multiply the first slice of the random number matrix by the second inverse matrix to obtain the first slice of the first inverse matrix, and the second party may multiply the second slice of the random number matrix by the second inverse matrix to obtain the second slice of the first inverse matrix. The sum of the first slice and the second slice of the first inverse matrix is equal to the first inverse matrix.
For example, the first slice of the random number matrix may be represented as <R>0 and the second slice as <R>1, where <R>0 + <R>1 = R. The first party may, according to <R>0 and <H>0, and the second party, according to <R>1 and <H>1, secretly share the second product HR. The first party may obtain a first slice <HR>0 of the second product, and the second party may obtain a second slice <HR>1 of the second product. The detailed procedure by which the first party and the second party secretly share the second product HR is described below.
The first party may, according to <H>0, and the second party, according to <R>1, secretly share <H>0<R>1. The first party may obtain <[<H>0<R>1]>0 and the second party may obtain <[<H>0<R>1]>1, where <[<H>0<R>1]>0 + <[<H>0<R>1]>1 = <H>0<R>1.
The first party may also, according to <R>0, and the second party, according to <H>1, secretly share <H>1<R>0. The first party may obtain <[<H>1<R>0]>0 and the second party may obtain <[<H>1<R>0]>1, where <[<H>1<R>0]>0 + <[<H>1<R>0]>1 = <H>1<R>0.
Further, the first party may calculate <H>0<R>0 + <[<H>0<R>1]>0 + <[<H>1<R>0]>0 as the first slice <HR>0 of the second product, and the second party may calculate <H>1<R>1 + <[<H>0<R>1]>1 + <[<H>1<R>0]>1 as the second slice <HR>1 of the second product. Then:

<HR>0 + <HR>1
= <H>0<R>0 + <[<H>0<R>1]>0 + <[<H>1<R>0]>0 + <H>1<R>1 + <[<H>0<R>1]>1 + <[<H>1<R>0]>1
= <H>0<R>0 + <H>0<R>1 + <H>1<R>0 + <H>1<R>1
= (<H>0 + <H>1)(<R>0 + <R>1)
= HR
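This cross-term decomposition can also be checked numerically; in the sketch below the secure sharings of <H>0<R>1 and <H>1<R>0 are simulated with random masks rather than computed with the protocol's random number sets.

```python
# Numeric check that <HR>0 + <HR>1 = (H0+H1)(R0+R1) = HR.
import numpy as np

rng = np.random.default_rng(3)
d = 3
H = rng.normal(size=(d, d)); R = rng.normal(size=(d, d))
H0 = rng.normal(size=(d, d)); H1 = H - H0      # slices of H
R0 = rng.normal(size=(d, d)); R1 = R - R0      # slices of R

def share(m):
    """Simulate secret sharing a matrix into two additive slices."""
    p0 = rng.normal(size=m.shape)
    return p0, m - p0

c0a, c1a = share(H0 @ R1)                      # slices of <H>0<R>1
c0b, c1b = share(H1 @ R0)                      # slices of <H>1<R>0

HR0 = H0 @ R0 + c0a + c0b                      # <HR>0
HR1 = H1 @ R1 + c1a + c1b                      # <HR>1

assert np.allclose(HR0 + HR1, H @ R)
```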
Here, the second product HR is inverted by the second party. Specifically, the first party may send its first slice <HR>0 of the second product to the second party. The second party may receive <HR>0, add it to its own second slice <HR>1 to obtain the second product HR, invert HR to obtain the second inverse matrix (HR)^-1, and send (HR)^-1 to the first party. The first party may receive the second inverse matrix (HR)^-1.

The first party may multiply the first slice <R>0 of the random number matrix by the second inverse matrix (HR)^-1 to obtain the first slice <H^-1>0 of the first inverse matrix H^-1, and the second party may multiply the second slice <R>1 of the random number matrix by (HR)^-1 to obtain the second slice <H^-1>1 of the first inverse matrix H^-1. Then:

H^-1 = <H^-1>0 + <H^-1>1 = <R>0(HR)^-1 + <R>1(HR)^-1 = R(HR)^-1
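The masked inversion can likewise be sketched; here H is built as a symmetric positive definite matrix so that it is invertible, which is an assumption of this illustration rather than part of the embodiment.

```python
# Numeric check of the masked inversion: <R>0(HR)^-1 + <R>1(HR)^-1 = H^-1.
import numpy as np

rng = np.random.default_rng(4)
d = 3
A = rng.normal(size=(d, d))
H = A @ A.T + d * np.eye(d)              # invertible Hessian-like matrix
R = rng.normal(size=(d, d))              # random mask (invertible w.h.p.)
R0 = rng.normal(size=(d, d)); R1 = R - R0

HR_inv = np.linalg.inv(H @ R)            # (HR)^-1, computed after HR is revealed

Hinv0 = R0 @ HR_inv                      # <H^-1>0, first party
Hinv1 = R1 @ HR_inv                      # <H^-1>1, second party

assert np.allclose(Hinv0 + Hinv1, np.linalg.inv(H))
```

Revealing HR to one party discloses neither H nor R individually, which is the point of masking the Hessian matrix with the random number matrix before inversion.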
Step S311: the first party, according to the first slice of the original model parameters, the first slice of the first inverse matrix and the first slice of the gradient of the loss function, and the second party, according to the second slice of the original model parameters, the second slice of the first inverse matrix and the second slice of the gradient of the loss function, secretly share the new model parameters. The first party obtains a first slice of the new model parameters, and the second party obtains a second slice of the new model parameters.
The first party may, according to the first slice of the first inverse matrix and the first slice of the gradient of the loss function, and the second party, according to the second slice of the first inverse matrix and the second slice of the gradient of the loss function, secretly share a third product. The first party may obtain a first slice of the third product, the second party may obtain a second slice of the third product, and the sum of the two slices is equal to the third product. The third product may comprise the product of the first inverse matrix and the gradient of the loss function; for example, it may be expressed as H^-1×dW, where H^-1 represents the first inverse matrix and dW represents the gradient of the loss function. Furthermore, the first party may subtract the first slice of the third product from the first slice of the model parameters to obtain the first slice of the new model parameters, and the second party may subtract the second slice of the third product from the second slice of the model parameters to obtain the second slice of the new model parameters.
For example, the first party may, according to <H^-1>0 and <dW>0, and the second party, according to <H^-1>1 and <dW>1, secretly share the third product H^-1×dW. The first party may obtain a first slice <H^-1×dW>0 of the third product, and the second party may obtain a second slice <H^-1×dW>1 of the third product.
The detailed process by which the first party and the second party secretly share the third product H^-1×dW is described below.
The first party may, according to <H^-1>0, and the second party, according to <dW>1, secretly share <H^-1>0<dW>1. The first party may obtain <[<H^-1>0<dW>1]>0 and the second party may obtain <[<H^-1>0<dW>1]>1, where <[<H^-1>0<dW>1]>0 + <[<H^-1>0<dW>1]>1 = <H^-1>0<dW>1.
The first party may also, according to <dW>0, and the second party, according to <H^-1>1, secretly share <H^-1>1<dW>0. The first party may obtain <[<H^-1>1<dW>0]>0 and the second party may obtain <[<H^-1>1<dW>0]>1, where <[<H^-1>1<dW>0]>0 + <[<H^-1>1<dW>0]>1 = <H^-1>1<dW>0.
Further, the first party may calculate <H^-1>0<dW>0 + <[<H^-1>0<dW>1]>0 + <[<H^-1>1<dW>0]>0 as the first slice <H^-1×dW>0 of the third product, and the second party may calculate <H^-1>1<dW>1 + <[<H^-1>0<dW>1]>1 + <[<H^-1>1<dW>0]>1 as the second slice <H^-1×dW>1 of the third product. Then:

<H^-1×dW>0 + <H^-1×dW>1
= <H^-1>0<dW>0 + <[<H^-1>0<dW>1]>0 + <[<H^-1>1<dW>0]>0 + <H^-1>1<dW>1 + <[<H^-1>0<dW>1]>1 + <[<H^-1>1<dW>0]>1
= <H^-1>0<dW>0 + <H^-1>0<dW>1 + <H^-1>1<dW>0 + <H^-1>1<dW>1
= (<H^-1>0 + <H^-1>1)(<dW>0 + <dW>1)
= H^-1×dW
The first party may calculate <W'>0 = <W>0 - <H^-1×dW>0, and the second party may calculate <W'>1 = <W>1 - <H^-1×dW>1, where <W'>0 represents the first slice of the new model parameters, <W'>1 represents the second slice of the new model parameters, and W' represents the new model parameters. Then:

W' = <W'>0 + <W'>1 = <W>0 - <H^-1×dW>0 + <W>1 - <H^-1×dW>1 = W - H^-1×dW
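Step S311 as a whole can be illustrated end to end; in the sketch below all slices are simulated, the sharings of the two cross terms are again mimicked with random masks, and the matrix named Hinv merely stands in for the first inverse matrix.

```python
# Numeric check of the Newton update: <W'>0 + <W'>1 = W - H^-1 x dW.
import numpy as np

rng = np.random.default_rng(5)
d = 3
Hinv = rng.normal(size=(d, d))                 # stands in for H^-1
dW = rng.normal(size=d); W = rng.normal(size=d)

Hinv0 = rng.normal(size=(d, d)); Hinv1 = Hinv - Hinv0
dW0 = rng.normal(size=d); dW1 = dW - dW0
W0 = rng.normal(size=d); W1 = W - W0

def share(v):
    """Simulate secret sharing a value into two additive slices."""
    p0 = rng.normal(size=v.shape)
    return p0, v - p0

c0a, c1a = share(Hinv0 @ dW1)                  # slices of <H^-1>0<dW>1
c0b, c1b = share(Hinv1 @ dW0)                  # slices of <H^-1>1<dW>0

p0 = Hinv0 @ dW0 + c0a + c0b                   # <H^-1 x dW>0
p1 = Hinv1 @ dW1 + c1a + c1b                   # <H^-1 x dW>1

W0_new, W1_new = W0 - p0, W1 - p1              # new parameter slices
assert np.allclose(W0_new + W1_new, W - Hinv @ dW)
```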
In the model training method in the embodiment of the present specification, when a joint training model is required, the random number server may transmit only the fourth random number set, so that the model may be jointly trained between the first party and the second party. Therefore, the embodiment of the specification can reduce the data transmission amount between the random number server and the first party and the second party, thereby reducing the occupation of network bandwidth.
Based on the same inventive concept, the present specification also provides another embodiment of the model training method. Please refer to fig. 5. The model training method may be applied to a first party, and may specifically include the following steps.
Step S41: generating a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between the first party and the random number server, the third random number set is generated by the second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
In the model training method in the embodiment of the present specification, when a joint training model is required, the random number server may transmit only the fourth random number set, so that the model may be jointly trained between the first party and the second party. Therefore, the embodiment of the specification can reduce the data transmission amount between the random number server and the first party and the second party, thereby reducing the occupation of network bandwidth.
Based on the same inventive concept, the present specification also provides another embodiment of the model training method. Please refer to fig. 6. The model training method can be applied to the second party, and specifically can comprise the following steps.
Step S51: generating a third random number set according to the second random seed, wherein the second random seed is a random seed negotiated between the second party and the random number server.
Step S53: receiving a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by the first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and the random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
In the model training method in the embodiment of the present specification, when a joint training model is required, the random number server may transmit only the fourth random number set, so that the model may be jointly trained between the first party and the second party. Therefore, the embodiment of the specification can reduce the data transmission amount between the random number server and the first party and the second party, thereby reducing the occupation of network bandwidth.
Based on the same inventive concept, the present specification also provides another embodiment of the model training method. Please refer to fig. 7. The model training method can be applied to a random number server, and specifically comprises the following steps.
Step S61: generating a first random number set and a second random number set according to the first random seed, wherein the first random seed is a random seed negotiated between the first party and the random number server.

Step S63: generating a third random number set according to the second random seed, wherein the second random seed is a random seed negotiated between the second party and the random number server.
Step S65: generating a fourth random number set according to the first random number set, the second random number set and the third random number set; and random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
Step S67: sending a fourth set of random numbers to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
In the model training method in the embodiment of the present specification, when a joint training model is required, the random number server may transmit only the fourth random number set, so that the model may be jointly trained between the first party and the second party. Therefore, the embodiment of the specification can reduce the data transmission amount between the random number server and the first party and the second party, thereby reducing the occupation of network bandwidth.
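The server-side flow of steps S61 through S67 can be sketched as follows, assuming the preset condition takes the multiplication-triple form Z1i + Z2i = Ui·Vi stated in the claims; the seed values, the set size and the derivation of two streams from one seed are simplifications introduced for illustration.

```python
# Sketch of steps S61-S67: only the fourth set Z2 has to be transmitted.
import numpy as np

def expand(seed, n):
    """Deterministically expand a seed into a pseudorandom number set."""
    return np.random.default_rng(seed).normal(size=n)

seed1, seed2, n = 11, 22, 5      # negotiated seeds and set size (assumed)

V = expand(seed1, n)             # first random number set (from seed1)
Z1 = expand(seed1 + 1, n)        # second set (from seed1; stream split simplified)
U = expand(seed2, n)             # third set (from seed2)
Z2 = U * V - Z1                  # fourth set, chosen to satisfy the condition

# The server sends only Z2 to the second party; both parties regenerate the
# remaining sets locally from their seeds.
assert np.allclose(Z1 + Z2, U * V)
```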
Please refer to fig. 8. The present specification also provides one embodiment of a model training apparatus. The model training apparatus may be applied to a first party, and may specifically include the following elements.
A generating unit 71, configured to generate a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between the first party and the random number server, the third random number set is generated by the second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
Please refer to fig. 9. The present specification also provides one embodiment of a model training apparatus. The model training apparatus may be applied to a second party, and may specifically include the following elements.
A generating unit 81, configured to generate a third random number set according to a second random seed, where the second random seed is a random seed negotiated between the second party and the random number server;
a receiving unit 83, configured to receive a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by the first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and the random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
Please refer to fig. 10. The present specification also provides one embodiment of a model training apparatus. The model training device can be applied to a random number server, and specifically can comprise the following units.
A first generating unit 91, configured to generate a first random number set and a second random number set according to a first random seed, where the first random seed is a random seed negotiated between a first party and a random number server;
a second generating unit 93, configured to generate a third random number set according to a second random seed, where the second random seed is a random seed negotiated between the second party and the random number server;
a third generating unit 95, configured to generate a fourth random number set according to the first random number set, the second random number set, and the third random number set; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions;
a sending unit 97, configured to send the fourth random number set to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
Please refer to fig. 11. The embodiment of the specification also provides a computing device.
The computing device may include a memory and a processor.
In the present embodiment, the Memory includes, but is not limited to, a Dynamic Random Access Memory (DRAM), a Static Random Access Memory (SRAM), and the like. The memory may be used to store computer instructions.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The processor may be configured to execute the computer instructions to implement the embodiments corresponding to fig. 5, fig. 6, or fig. 7.
It should be noted that, in the present specification, each embodiment is described in a progressive manner, and the same or similar parts in each embodiment may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the method embodiment, the apparatus embodiment, and the computing apparatus embodiment which are implemented on one side, since they are substantially similar to the model training method embodiment, the description is relatively simple, and the relevant points can be referred to the partial description of the model training method embodiment.
In addition, it is understood that one skilled in the art, after reading this specification document, may conceive of any combination of some or all of the embodiments listed in this specification without the need for inventive faculty, which combinations are also within the scope of the disclosure and protection of this specification.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually making integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, and the source code to be compiled is written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It should also be clear to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by slightly logically programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the present specification may be essentially or partially implemented in the form of software products, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
While the specification has been described by way of embodiments, those skilled in the art will appreciate that many variations and modifications may be made without departing from the spirit of the specification, and it is intended that the appended claims cover such variations and modifications.

Claims (22)

1. A method of model training, the method comprising:
the first party generates a first random number set and a second random number set according to the first random seed; the first random seed is a random seed negotiated between a first party and a random number server;
the random number server generates a first random number set and a second random number set according to the first random seed; generating a third random number set according to the second random seed; generating a fourth random number set according to the first random number set, the second random number set and the third random number set; the second random seed is a random seed negotiated between the second party and the random number server, and random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions;
the random number server sends a fourth random number set to the second party;
the second party generates a third random number set according to the second random seed; receiving a fourth set of random numbers; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set.
2. A model training method is applied to a first party and comprises the following steps:
generating a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between a first party and a random number server, the third random number set is generated by a second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
3. The method of claim 2, the first random seed obtained by a first party according to:
generating a random seed as the first random seed; or,
receiving a random seed sent by the random number server as the first random seed.
4. The method of claim 2, the preset conditions comprising: Z1i + Z2i = Ui·Vi; wherein Vi represents the i-th random number in the first random number set, Z1i represents the i-th random number in the second random number set, Ui represents the i-th random number in the third random number set, Z2i represents the i-th random number in the fourth random number set, and i is a natural number greater than 0.
5. The method of claim 2, wherein the first party holds label data of a sample and a first slice of model parameters, and the second party holds feature data of the sample and a second slice of the model parameters; the joint training of the model comprises:
the first party jointly training the model according to the label data, the first slice of the model parameters, the first random number set and the second random number set, and the second party jointly training the model according to the feature data, the second slice of the model parameters, the third random number set and the fourth random number set.
6. The method of claim 2, wherein the first party holds feature data of a sample and a first slice of model parameters, and the second party holds label data of the sample and a second slice of the model parameters; the joint training of the model comprises:
the first party jointly training the model according to the feature data, the first slice of the model parameters, the first random number set and the second random number set, and the second party jointly training the model according to the label data, the second slice of the model parameters, the third random number set and the fourth random number set.
7. The method of claim 2, 5 or 6, the joint training of the model comprising:
jointly training the model by using a gradient descent method or Newton's method.
8. A model training method applied to a second party comprises the following steps:
generating a third random number set according to a second random seed, wherein the second random seed is a random seed negotiated between a second party and a random number server;
receiving a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by a first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and a random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
9. The method of claim 8, the second random seed obtained by the second party according to:
generating a random seed as the second random seed; or,
receiving a random seed sent by the random number server as the second random seed.
10. The method of claim 8, the preset conditions comprising: Z1i + Z2i = Ui·Vi; wherein Vi represents the i-th random number in the first random number set, Z1i represents the i-th random number in the second random number set, Ui represents the i-th random number in the third random number set, Z2i represents the i-th random number in the fourth random number set, and i is a natural number greater than 0.
11. The method of claim 8, wherein the first party holds label data of a sample and a first slice of model parameters, and the second party holds feature data of the sample and a second slice of the model parameters; the joint training of the model comprises:
the first party jointly training the model according to the label data, the first slice of the model parameters, the first random number set and the second random number set, and the second party jointly training the model according to the feature data, the second slice of the model parameters, the third random number set and the fourth random number set.
12. The method of claim 8, wherein the first party holds feature data of a sample and a first slice of model parameters, and the second party holds label data of the sample and a second slice of the model parameters; the joint training of the model comprises:
the first party jointly training the model according to the feature data, the first slice of the model parameters, the first random number set and the second random number set, and the second party jointly training the model according to the label data, the second slice of the model parameters, the third random number set and the fourth random number set.
13. The method of claim 8, 11 or 12, the joint training of the model comprising:
jointly training the model by using a gradient descent method or Newton's method.
14. A model training method is applied to a random number server and comprises the following steps:
generating a first random number set and a second random number set according to a first random seed, wherein the first random seed is a random seed negotiated between a first party and a random number server;
generating a third random number set according to a second random seed, wherein the second random seed is a random seed negotiated between a second party and a random number server;
generating a fourth random number set according to the first random number set, the second random number set and the third random number set, wherein random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions;
sending a fourth set of random numbers to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
15. The method of claim 14, the first random seed obtained by a random number server according to:
generating a random seed as the first random seed; or,
receiving a random seed sent by the first party as the first random seed.
16. The method of claim 14, the second random seed being obtained by a random number server according to:
generating a random seed as the second random seed; or,
receiving a random seed sent by the second party as the second random seed.
17. The method of claim 14, the preset conditions comprising: Z1i + Z2i = Ui·Vi; wherein Vi represents the i-th random number in the first random number set, Z1i represents the i-th random number in the second random number set, Ui represents the i-th random number in the third random number set, Z2i represents the i-th random number in the fourth random number set, and i is a natural number greater than 0.
18. The method of claim 14, wherein the generating a fourth random number set comprises:
calculating the random numbers in the fourth random number set according to the first random number set, the second random number set, the third random number set and the preset conditions.
19. A model training apparatus, disposed on a first party, the apparatus comprising:
a generating unit, configured to generate a first random number set and a second random number set according to the first random seed; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random seed is a random seed negotiated between a first party and a random number server, the third random number set is generated by a second party according to the second random seed, the second random seed is a random seed negotiated between the second party and the random number server, and the fourth random number set is generated by the random number server and sent to the second party; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
20. A model training apparatus, disposed at a second party, the apparatus comprising:
a generating unit, configured to generate a third random number set according to a second random seed, where the second random seed is a random seed negotiated between a second party and a random number server;
a receiving unit, configured to receive a fourth random number set sent by the random number server; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; the first random number set and the second random number set are generated by a first party according to a first random seed, and the first random seed is a random seed negotiated between the first party and a random number server; the random numbers in the first random number set, the second random number set, the third random number set and the fourth random number set meet preset conditions.
21. A model training apparatus provided in a random number server, the apparatus comprising:
a first generating unit, configured to generate a first random number set and a second random number set according to a first random seed, where the first random seed is a random seed negotiated between a first party and a random number server;
a second generating unit, configured to generate a third random number set according to a second random seed, where the second random seed is a random seed negotiated between a second party and a random number server;
a third generating unit, configured to generate a fourth random number set according to the first random number set, the second random number set, and the third random number set, where random numbers in the first random number set, the second random number set, the third random number set, and the fourth random number set satisfy a preset condition;
a sending unit, configured to send the fourth random number set to the second party; so that the first party jointly trains the model according to the first random number set and the second random number set, and the second party jointly trains the model according to the third random number set and the fourth random number set; wherein the first random number set and the second random number set at the first party are generated by the first party according to the first random seed; a third set of random numbers at the second party is generated by the second party from the second random seed.
22. A computing device, comprising:
at least one processor;
a memory storing program instructions configured for execution by the at least one processor, the program instructions comprising instructions for performing the method of any of claims 2-18.
CN202110158418.5A 2021-02-05 2021-02-05 Model training method and device and computing equipment Active CN112511361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110158418.5A CN112511361B (en) 2021-02-05 2021-02-05 Model training method and device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110158418.5A CN112511361B (en) 2021-02-05 2021-02-05 Model training method and device and computing equipment

Publications (2)

Publication Number Publication Date
CN112511361A true CN112511361A (en) 2021-03-16
CN112511361B CN112511361B (en) 2021-06-04

Family

ID=74952579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110158418.5A Active CN112511361B (en) 2021-02-05 2021-02-05 Model training method and device and computing equipment

Country Status (1)

Country Link
CN (1) CN112511361B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094745A (en) * 2021-03-31 2021-07-09 支付宝(杭州)信息技术有限公司 Data transformation method and device based on privacy protection and server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10230524B2 (en) * 2017-01-26 2019-03-12 Wickr Inc. Securely transferring user information between applications
CN110460435A (en) * 2019-07-01 2019-11-15 阿里巴巴集团控股有限公司 Data interactive method, device, server and electronic equipment
CN111737755A (en) * 2020-07-31 2020-10-02 支付宝(杭州)信息技术有限公司 Joint training method and device for business model
CN111738361A (en) * 2020-07-31 2020-10-02 支付宝(杭州)信息技术有限公司 Joint training method and device for business model

Also Published As

Publication number Publication date
CN112511361B (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN110569227B (en) Model parameter determination method and device and electronic equipment
CN110555315B (en) Model parameter updating method and device based on secret sharing algorithm and electronic equipment
KR102208188B1 (en) A distributed multi-party security model training framework for privacy protection
CN110555525B (en) Model parameter determination method and device and electronic equipment
CN111967035B (en) Model training method and device and electronic equipment
CN110569228B (en) Model parameter determination method and device and electronic equipment
CN113221183B (en) Method, device and system for realizing privacy protection of multi-party collaborative update model
CN110580410A (en) Model parameter determination method and device and electronic equipment
CN110457936B (en) Data interaction method and device and electronic equipment
CN110580409A (en) model parameter determination method and device and electronic equipment
CN110472439A (en) Model parameter determines method, apparatus and electronic equipment
CN111428887B (en) Model training control method, device and system based on multiple computing nodes
CN110460435B (en) Data interaction method and device, server and electronic equipment
WO2022174787A1 (en) Model training
US20200184081A1 (en) Generation of a model parameter
CN113111569A (en) Disorder processing method, model training method, device and computing equipment
CN114186256B (en) Training method, device, equipment and storage medium of neural network model
CN113722755A (en) Data processing system, method, device and equipment for realizing privacy protection
CN112507323A (en) Model training method and device based on unidirectional network and computing equipment
CN112511361B (en) Model training method and device and computing equipment
CN112149834A (en) Model training method, device, equipment and medium
CN113011459B (en) Model training method, device and computing equipment
CN112598127A (en) Federal learning model training method and device, electronic equipment, medium and product
CN113111254B (en) Training method, fitting method and device of recommendation model and electronic equipment
WO2021027598A1 (en) Method and apparatus for determining model parameter, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220117

Address after: Room 803, floor 8, No. 618 Wai Road, Huangpu District, Shanghai 200010

Patentee after: Ant blockchain Technology (Shanghai) Co.,Ltd.

Address before: 310000 801-11 section B, 8th floor, 556 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province

Patentee before: Alipay (Hangzhou) Information Technology Co.,Ltd.