WO2021027598A1 - Method and apparatus for determining model parameter, and electronic device - Google Patents

Method and apparatus for determining model parameter, and electronic device

Info

Publication number
WO2021027598A1
Authority
WO
WIPO (PCT)
Prior art keywords: share, product, new, matrix, partner
Application number
PCT/CN2020/106254
Other languages: French (fr), Chinese (zh)
Inventors: 周亚顺, 李漓春, 殷山, 王华忠
Original Assignee: 创新先进技术有限公司
Priority claimed from CN201910735442.3A external-priority patent/CN110580410B/en
Priority claimed from CN201910735421.1A external-priority patent/CN110472439B/en
Priority claimed from CN201910735439.1A external-priority patent/CN110555525B/en
Priority claimed from CN201910734784.3A external-priority patent/CN110580409B/en
Application filed by 创新先进技术有限公司
Publication of WO2021027598A1 publication Critical patent/WO2021027598A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks

Definitions

  • The embodiments of this specification relate to the field of computer technology, and in particular to a method, an apparatus, and an electronic device for determining model parameters.
  • In cooperative modeling, a model parameter optimization method can be used to optimize and adjust the model parameters of the data processing model multiple times. Since the data used to train the data processing model is scattered among the parties involved in the cooperative modeling, how to collaboratively determine the model parameters of the data processing model while protecting data privacy is a technical problem that urgently needs to be solved.
  • The purpose of the embodiments of this specification is to provide a method, an apparatus, and an electronic device for determining model parameters, so that multiple parties can determine the model parameters of a data processing model under the premise of protecting data privacy.
  • A method for determining model parameters is provided, applied to a first data party, including: communicating with a partner according to a share of a first product and the garbled circuit corresponding to an activation function to obtain a share of the value of the activation function, the first product being the product of feature data and first model parameters; secretly sharing the gradient of the loss function with the partner according to the feature data and the share of the value of the activation function to obtain a share of the loss function gradient; secretly sharing the Hessian matrix with the partner according to the feature data and the share of the value of the activation function to obtain a share of the Hessian matrix; secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and secretly sharing new first model parameters with the partner according to the share of the first model parameters, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameters.
  • A method for determining model parameters is provided, applied to a second data party, including: communicating with a partner according to a share of a first product and the garbled circuit corresponding to an activation function to obtain a share of the value of the activation function, the first product being the product of feature data and first model parameters; secretly sharing the gradient of the loss function with the partner according to the label and the share of the value of the activation function to obtain a share of the loss function gradient; secretly sharing the Hessian matrix with the partner according to the share of the value of the activation function to obtain a share of the Hessian matrix; secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and secretly sharing new first model parameters with the partner according to the share of the first model parameters, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameters.
  • A method for determining model parameters is provided, applied to a first data party, including: secretly sharing a first product with a partner according to feature data and a share of first model parameters to obtain a share of the first product, the first product being the product of the feature data and the first model parameters; secretly sharing the value of an activation function with the partner according to the share of the first product to obtain a share of the value of the activation function; secretly sharing the gradient of the loss function and the Hessian matrix with the partner according to the feature data and the share of the value of the activation function to obtain a share of the loss function gradient and a share of the Hessian matrix; secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product between the random orthogonal matrix and the Hessian matrix; when the condition number of the second product meets a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; secretly sharing new first model parameters with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameters to obtain a share of the new first model parameters; iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function to obtain a share of a new value of the activation function; iteratively performing the steps of secretly sharing the loss function gradient and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and, when the condition number of the new second product does not meet the preset condition, calculating a share of second model parameters according to the share of the new first model parameters, the share of the new loss function gradient, and a preset step size.
  • A method for determining model parameters is provided, applied to a second data party, including: secretly sharing a first product with a partner according to a share of first model parameters to obtain a share of the first product, the first product being the product of feature data and the first model parameters; secretly sharing the value of an activation function with the partner according to the share of the first product to obtain a share of the value of the activation function; secretly sharing the gradient of the loss function with the partner according to the label and the share of the value of the activation function to obtain a share of the loss function gradient; secretly sharing the Hessian matrix with the partner according to the share of the value of the activation function to obtain a share of the Hessian matrix; secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product between the random orthogonal matrix and the Hessian matrix; when the condition number of the second product meets a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; secretly sharing new first model parameters with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameters to obtain a share of the new first model parameters; iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function to obtain a share of a new value of the activation function; iteratively performing the step of secretly sharing the loss function gradient to obtain a share of a new loss function gradient; iteratively performing the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and, when the condition number of the new second product does not meet the preset condition, calculating a share of second model parameters according to the share of the new first model parameters, the share of the new loss function gradient, and the preset step size.
  • A model parameter determination apparatus is provided, applied to a first data party, including: an activation function value share acquisition unit, configured to communicate with a partner according to a share of a first product and the garbled circuit corresponding to an activation function to obtain a share of the value of the activation function, the first product being the product of feature data and first model parameters; a loss function gradient share acquisition unit, configured to secretly share the gradient of the loss function with the partner according to the feature data and the share of the value of the activation function to obtain a share of the loss function gradient; a Hessian matrix share acquisition unit, configured to secretly share the Hessian matrix with the partner according to the feature data and the share of the value of the activation function to obtain a share of the Hessian matrix; a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and a model parameter share acquisition unit, configured to secretly share new first model parameters with the partner according to the share of the first model parameters, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameters.
  • A model parameter determination apparatus is provided, applied to a second data party, including: an activation function value share acquisition unit, configured to communicate with a partner according to a share of a first product and the garbled circuit corresponding to an activation function to obtain a share of the value of the activation function, the first product being the product of feature data and first model parameters; a loss function gradient share acquisition unit, configured to secretly share the gradient of the loss function with the partner according to the label and the share of the value of the activation function to obtain a share of the loss function gradient; a Hessian matrix share acquisition unit, configured to secretly share the Hessian matrix with the partner according to the share of the value of the activation function to obtain a share of the Hessian matrix; a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and a model parameter share acquisition unit, configured to secretly share new first model parameters with the partner according to the share of the first model parameters, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameters.
  • A model parameter determination apparatus is provided, applied to a first data party, including: a first secret sharing unit, configured to secretly share a first product with a partner according to feature data and a share of first model parameters to obtain a share of the first product, the first product being the product of the feature data and the first model parameters; a second secret sharing unit, configured to secretly share the value of an activation function with the partner according to the share of the first product to obtain a share of the value of the activation function; a third secret sharing unit, configured to secretly share the gradient of the loss function and the Hessian matrix with the partner according to the feature data and the share of the value of the activation function to obtain, respectively, a share of the loss function gradient and a share of the Hessian matrix; a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product between the random orthogonal matrix and the Hessian matrix; a fifth secret sharing unit, configured to, when the condition number of the second product meets a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; a sixth secret sharing unit, configured to secretly share new first model parameters with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameters to obtain a share of the new first model parameters; and an iteration unit, configured to iteratively perform the step of secretly sharing the first product to obtain a share of a new first product, communicate with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function to obtain a share of a new value of the activation function, iteratively perform the steps of secretly sharing the loss function gradient and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix, and iteratively perform the step of secretly sharing the second product to obtain a share of a new second product.
  • A model parameter determination apparatus is provided, applied to a second data party, including: a first secret sharing unit, configured to secretly share a first product with a partner according to a share of first model parameters to obtain a share of the first product, the first product being the product of feature data and the first model parameters; a second secret sharing unit, configured to secretly share the value of an activation function with the partner according to the share of the first product to obtain a share of the value of the activation function; a third secret sharing unit, configured to secretly share the gradient of the loss function with the partner according to the label and the share of the value of the activation function to obtain a share of the loss function gradient, and to secretly share the Hessian matrix with the partner according to the share of the value of the activation function to obtain a share of the Hessian matrix; and a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product between the random orthogonal matrix and the Hessian matrix.
  • An electronic device is provided, including: a memory, configured to store computer instructions; and a processor, configured to execute the computer instructions to implement the method steps of the first aspect, the second aspect, the third aspect, or the fourth aspect.
  • As can be seen from the technical solutions provided by the embodiments of this specification, the first data party and the second data party can cooperatively determine the model parameters of the data processing model without leaking the data they each hold.
  • FIG. 1 is a schematic diagram of a logic circuit according to an embodiment of this specification;
  • FIG. 2 is a schematic diagram of a model parameter determination system according to an embodiment of this specification;
  • FIG. 3 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 5 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 6 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 7 is a schematic diagram of the functional structure of a model parameter determination apparatus according to an embodiment of this specification;
  • FIG. 8 is a schematic diagram of the functional structure of a model parameter determination apparatus according to an embodiment of this specification;
  • FIG. 9 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 10 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 11 is a flowchart of a method for determining model parameters according to an embodiment of this specification;
  • FIG. 12 is a schematic diagram of the functional structure of a model parameter determination apparatus according to an embodiment of this specification;
  • FIG. 13 is a schematic diagram of the functional structure of a model parameter determination apparatus according to an embodiment of this specification;
  • FIG. 14 is a schematic diagram of the functional structure of an electronic device according to an embodiment of this specification.
  • Multi-party secure computation (Secure Multi-Party Computation, MPC) is an algorithm that protects data privacy and security. Multi-party secure computation allows multiple data parties involved in a calculation to perform collaborative computing without exposing their own data.
  • Secret Sharing is an algorithm that protects data privacy and security, and can be used to implement multi-party secure computing.
  • multiple data parties can use secret sharing algorithms to perform collaborative calculations to obtain secret information without leaking their own data.
  • Each data party can obtain a share of the secret information.
  • a single data party cannot recover the secret information. Only multiple data parties can work together to recover the secret information.
  • For example, the data party P1 holds the data x1, and the data party P2 holds the data x2. Using a secret sharing algorithm, the data party P1 can obtain the share y1 of the secret information y after the calculation, and the data party P2 can obtain the share y2 of the secret information y after the calculation.
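  • To make the share arithmetic concrete, the following is a minimal Python sketch of additive secret sharing (the modulus P and the names share/reconstruct are illustrative and are not part of this specification):

      import secrets

      P = 2**61 - 1  # a prime modulus for the share arithmetic (illustrative choice)

      def share(x: int) -> tuple[int, int]:
          """Split secret x into two additive shares with x = x0 + x1 (mod P)."""
          x0 = secrets.randbelow(P)
          x1 = (x - x0) % P
          return x0, x1

      def reconstruct(x0: int, x1: int) -> int:
          """Recover the secret; neither share alone reveals anything about it."""
          return (x0 + x1) % P

      y0, y1 = share(42)   # data party P1 keeps y0, data party P2 keeps y1
      assert reconstruct(y0, y1) == 42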
  • Garbled Circuit is a secure computing protocol that protects data privacy and can be used to implement secure multi-party computing.
  • For a given calculation task (for example, a function), a corresponding logic circuit can be constructed.
  • the logic circuit may be composed of at least one arithmetic gate, and the arithmetic gate may include an AND gate, an OR gate, an exclusive OR gate, and so on.
  • The logic circuit may include at least two input lines and at least one output line, and a garbled circuit can be obtained by encrypting the input lines and/or output lines of the logic circuit. Multiple data parties can use the garbled circuit to perform collaborative calculations without leaking their own data and obtain the execution result of the calculation task.
  • Oblivious Transfer (OT), also known as oblivious transmission, is a two-party communication protocol that protects privacy, enabling the two communicating parties to transfer data in a manner that conceals the receiver's choices. The sender can have multiple pieces of data, and the receiver can obtain one or more of them via oblivious transfer. In this process, the sender does not know which data the receiver receives, and the receiver cannot obtain any data other than the data it receives. The oblivious transfer protocol is a basic building block of garbled circuits: in the process of using garbled circuits for cooperative calculation, oblivious transfer is usually used.
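  • As an illustration of the data flow only, the following is a toy 1-out-of-2 oblivious transfer in the style of Even, Goldreich and Lempel, with a deliberately tiny RSA modulus; it is a sketch under assumptions, not a secure implementation and not the protocol fixed by this specification:

      import secrets

      p, q, e = 1009, 1013, 65537            # toy RSA parameters (insecure)
      N, phi = p * q, (p - 1) * (q - 1)
      d = pow(e, -1, phi)                    # sender's private exponent

      m0, m1 = 42, 99                        # the sender's two messages
      x0, x1 = secrets.randbelow(N), secrets.randbelow(N)   # sender -> receiver

      b = 1                                  # the receiver's secret choice bit
      k = secrets.randbelow(N)
      v = ((x1 if b else x0) + pow(k, e, N)) % N            # receiver -> sender

      k0 = pow((v - x0) % N, d, N)           # exactly one of k0, k1 equals k,
      k1 = pow((v - x1) % N, d, N)           # but the sender cannot tell which
      c0, c1 = (m0 + k0) % N, (m1 + k1) % N                 # sender -> receiver

      assert (c1 - k) % N == m1              # receiver recovers m_b and nothing else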
  • For example, the data party P1 holds the data x1 and the data x3, and the data party P2 holds the data x2. The logic circuit is composed of AND gate 1 and AND gate 2 and may include an input line a, an input line b, an input line d, an output line c, and an output line s. The truth table corresponding to AND gate 1 can be as shown in Table 1.
  • Data party P1 can generate two random numbers k_a^0 and k_a^1, corresponding to the two input values 0 and 1 of the input line a; two random numbers k_b^0 and k_b^1, corresponding to the two input values 0 and 1 of the input line b; and two random numbers k_c^0 and k_c^1, corresponding to the two output values 0 and 1 of the output line c. Substituting these random numbers into Table 1, the randomized truth table shown in Table 2 can be obtained.
  • Data party P1 can use k_a^0 and k_b^0 as keys to encrypt the random number k_c^0, obtaining a random number ciphertext; use k_a^0 and k_b^1 as keys to encrypt the random number k_c^0, obtaining a random number ciphertext; use k_a^1 and k_b^0 as keys to encrypt the random number k_c^0, obtaining a random number ciphertext; and use k_a^1 and k_b^1 as keys to encrypt the random number k_c^1, obtaining a random number ciphertext. From this, the encrypted randomized truth table shown in Table 3 can be obtained.
  • The data party P1 can disrupt the arrangement order of the rows in Table 3 to obtain the garbled truth table shown in Table 4.
  • The data party P1 can also generate the garbled truth table of AND gate 2; the specific process is similar to the process of generating the garbled truth table of AND gate 1 and is not described in detail here.
  • The data party P1 can send the garbled truth table of AND gate 1 and the garbled truth table of AND gate 2 to the data party P2, and the data party P2 can receive them.
  • The data party P1 can send to the data party P2 the random number corresponding to each bit of the data x1 on the input line a, and can send to the data party P2 the random number corresponding to each bit of the data x3 on the input line d. The data party P2 can receive the random numbers corresponding to each bit of the data x1 and of the data x3.
  • For example, the data x1 can be expressed in binary as x1 = b_0·2^0 + b_1·2^1 + ... + b_i·2^i + .... When the value of the bit b_i is 0, the data party P1 can send the random number k_a^0 corresponding to that bit on the input line a to the data party P2; when the value of b_i is 1, the data party P1 can send the random number k_a^1 to the data party P2.
  • The data party P1 can take the two random numbers k_b^0 and k_b^1 of the input line b as input, and the data party P2 can take each bit of the data x2 as input, and the two can perform oblivious transfer; the data party P2 can thereby obtain the random number corresponding to each bit of the data x2. Specifically, for each bit of the data x2, the data party P1 can use the random numbers k_b^0 and k_b^1 as the secret information input in the oblivious transfer process, and the data party P2 can use that bit as the selection information input in the oblivious transfer process.
  • Through the oblivious transfer, the data party P2 obtains the random number corresponding to that bit on the input line b: when the value of the bit is 0, the data party P2 obtains the random number k_b^0; when the value of the bit is 1, the data party P2 obtains the random number k_b^1. According to the characteristics of oblivious transfer, the data party P1 does not know which random number the data party P2 selected, and the data party P2 cannot learn any random number other than the one it selected.
  • In this way, the data party P2 obtains the random numbers corresponding to each bit of the data x1, the data x2, and the data x3.
  • The data party P2 can use the random number corresponding to each bit of the data x1 on the input line a and the random number corresponding to the corresponding bit of the data x2 on the input line b to try to decrypt the four random number ciphertexts in the garbled truth table of AND gate 1; the data party P2 can only successfully decrypt one of the random number ciphertexts, thereby obtaining one random number of the output line c.
  • The data party P2 can then use the random number corresponding to the corresponding bit of the data x3 on the input line d and the decrypted random number of the output line c to try to decrypt the four random number ciphertexts in the garbled truth table of AND gate 2; the data party P2 can only successfully decrypt one of the random number ciphertexts, obtaining one random number of the output line s.
  • The data party P2 can send the decrypted random number of the output line s to the data party P1.
  • The data party P1 can receive the random number of the output line s, and can obtain the output value of the output line s according to the correspondence between random numbers and output values.
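  • The sketch below mirrors the walkthrough above for a single AND gate: the generating party encrypts each output-wire random number under the pair of input-wire random numbers (Tables 2 and 3), shuffles the rows (Table 4), and the evaluating party can successfully decrypt exactly one row. The hash-based encryption and the zero-byte validity tag are illustrative choices, not taken from this specification:

      import os, hashlib, random

      def H(ka: bytes, kb: bytes) -> bytes:
          return hashlib.sha256(ka + kb).digest()

      def garble_and_gate():
          # one random 16-byte key per possible value of each wire a, b, c
          keys = {w: (os.urandom(16), os.urandom(16)) for w in "abc"}
          table = []
          for va in (0, 1):
              for vb in (0, 1):
                  vc = va & vb
                  pad = H(keys["a"][va], keys["b"][vb])
                  # output key plus a 16-byte zero tag, so the evaluator can
                  # recognize the single row it is able to decrypt
                  plain = keys["c"][vc] + bytes(16)
                  table.append(bytes(x ^ y for x, y in zip(plain, pad)))
          random.shuffle(table)               # the garbled truth table (Table 4)
          return keys, table

      def evaluate(table, ka, kb):
          pad = H(ka, kb)
          for ct in table:
              plain = bytes(x ^ y for x, y in zip(ct, pad))
              if plain[16:] == bytes(16):     # tag checks out: the correct row
                  return plain[:16]           # key encoding the output value
          raise ValueError("no row decrypted")

      keys, table = garble_and_gate()
      kc = evaluate(table, keys["a"][1], keys["b"][1])   # evaluate with a=1, b=1
      assert kc == keys["c"][1]                          # AND(1, 1) = 1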
  • Loss function can be used to measure the degree of inconsistency between the predicted value of the data processing model and the true value. The smaller the value of the loss function, the better the robustness of the data processing model.
  • the loss function includes but is not limited to a logarithmic loss function (Logarithmic Loss Function), a square loss function (Square Loss), and the like.
  • The Hessian matrix (also written as the Hesse matrix) is a square matrix formed by the second-order partial derivatives of the loss function, and is used to express the local curvature of the loss function.
  • The activation function, also known as the excitation function, can be used to construct data processing models. The activation function defines the output for a given input. The activation function is usually a nonlinear function; through the activation function, nonlinear factors can be added to the data processing model, improving its expressive ability. The activation function may include the Sigmoid function, the Tanh function, the ReLU function, and so on.
  • The data processing model includes but is not limited to a logistic regression model and a neural network model.
  • the model parameter optimization method can be used to optimize and adjust the model parameters of the data processing model.
  • Model parameter optimization methods may include gradient descent method, Newton method, and so on.
  • The Newton's method may include the original Newton's method and various variants based on it (such as the damped Newton's method and the regularized Newton's method; the regularized Newton's method refers to Newton's method with a regularization term, and regularization can reduce the complexity and instability of the model, thereby reducing the risk of overfitting).
  • The method for optimizing model parameters can be implemented in a secret sharing manner alone, or in a manner that combines secret sharing with a garbled circuit.
  • This specification provides an embodiment of the model parameter determination system.
  • the model parameter determination system may include a first data party, a second data party, and a trusted third party (TTP, Trusted Third Party).
  • the third party may be one server; or, it may also be a server cluster including multiple servers.
  • the third party is used to provide random numbers to the first data party and the second data party.
  • The third party may generate a random orthogonal matrix and split each random number in the random orthogonal matrix into two shares, one used as the first share and the other used as the second share. The third party may use the matrix formed by the first shares of the random numbers in the random orthogonal matrix as the first share of the random orthogonal matrix, and the matrix formed by the second shares as the second share of the random orthogonal matrix; it may send the first share of the random orthogonal matrix to the first data party and the second share of the random orthogonal matrix to the second data party. The sum of the first share of the random orthogonal matrix and the second share of the random orthogonal matrix is equal to the random orthogonal matrix.
  • On the one hand, the random orthogonal matrix may be a random number matrix composed of random numbers; on the other hand, it may also be an orthogonal matrix. After a square matrix is multiplied by an orthogonal matrix, a new matrix can be obtained that has the same condition number as the square matrix.
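  • This property is easy to check numerically. In the sketch below (numpy; all names illustrative), a random orthogonal matrix is drawn via a QR decomposition, and the condition number of a square matrix is unchanged by the multiplication:

      import numpy as np

      rng = np.random.default_rng(0)
      H = rng.standard_normal((5, 5))                    # any square matrix
      Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal matrix

      print(np.linalg.cond(H))      # condition number of the square matrix
      print(np.linalg.cond(H @ Q))  # the same value, up to floating-point error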
  • The third party can also generate a first OT random number and a second OT random number; it may send the first OT random number to the first data party and the second OT random number to the second data party. An OT random number is a random number used during oblivious transfer.
  • The first data party and the second data party are the two parties in cooperative secure modeling. The first data party may be the data party holding the feature data, and the second data party may be the data party holding the labels.
  • the first data party may hold complete feature data
  • the second data party may hold a label of the feature data.
  • the first data party may hold a part of the feature data
  • the second data party may hold another part of the feature data and a label of the feature data.
  • For example, the feature data may include a user's savings amount and loan amount. The first data party may hold the user's savings amount, and the second data party may hold the user's loan amount together with the label of the feature data. The label can be used to distinguish different types of feature data, and its specific value may be taken from, for example, 0 and 1.
  • The data party here can be an electronic device. The electronic device may include a personal computer, a server, a handheld device, a portable device, a tablet device, or a multi-processor device; alternatively, it may also include a cluster formed by any of the above devices.
  • the feature data and its corresponding labels together constitute sample data, and the sample data can be used to train the data processing model.
  • the first data party and the second data party may each obtain a share of the first model parameter.
  • the share obtained by the first data party may be used as the first share of the first model parameter
  • the share obtained by the second data party may be used as the second share of the first model parameter.
  • the sum of the first share of the first model parameter and the second share of the first model parameter is equal to the first model parameter.
  • the first data party may receive the first share of the random orthogonal matrix and the first OT random number.
  • the second data party may receive the second share of the random orthogonal matrix and the second OT random number.
  • The first data party, based on the first share of the first model parameters, the feature data, the first share of the random orthogonal matrix, and the first OT random number, and the second data party, based on the second share of the first model parameters, the labels, the second share of the random orthogonal matrix, and the second OT random number, may cooperate to determine new first model parameters. The first data party and the second data party may each obtain a share of the new first model parameters.
  • For the specific process, refer to the following embodiments of the model parameter determination method.
  • Step S11: The first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, communicate based on the garbled circuit corresponding to the activation function. The first data party obtains the first share of the value of the activation function, and the second data party obtains the second share of the value of the activation function.
  • Step S13: The first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the label and the second share of the value of the activation function, secretly share the gradient of the loss function. The first data party obtains the first share of the loss function gradient, and the second data party obtains the second share of the loss function gradient.
  • Step S15: The first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the second share of the value of the activation function, secretly share the Hessian matrix. The first data party obtains the first share of the Hessian matrix, and the second data party obtains the second share of the Hessian matrix.
  • Step S17: The first data party, according to the first share of the Hessian matrix, and the second data party, according to the second share of the Hessian matrix, secretly share the first inverse matrix. The first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share of the first inverse matrix. The first inverse matrix is the inverse matrix of the Hessian matrix.
  • Step S19: The first data party, according to the first share of the first model parameters, the first share of the first inverse matrix, and the first share of the loss function gradient, and the second data party, according to the second share of the first model parameters, the second share of the first inverse matrix, and the second share of the loss function gradient, secretly share new first model parameters. The first data party obtains the first share of the new first model parameters, and the second data party obtains the second share of the new first model parameters.
  • In some embodiments, the first product may be the product between the first model parameters and the feature data.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
  • the third product may be the product between the inverse matrix of the Hessian matrix and the gradient of the loss function.
  • the fourth product may be the product of the first share of the gradient of the loss function and the preset step size.
  • the fifth product may be the product of the second share of the gradient of the loss function and the preset step size.
  • For example, the first product may be expressed as XW, where W represents the first model parameters (specifically, a vector formed by the first model parameters) and X represents the feature data (specifically, a matrix formed by the feature data).
  • The second product may be expressed as HR, where H represents the Hessian matrix and R represents the random orthogonal matrix.
  • The third product may be expressed as H^-1·dW, where H^-1 represents the inverse matrix of the Hessian matrix and dW represents the gradient of the loss function (dW is a vector).
  • The fourth product may be expressed as G·<dW>_0, and the fifth product may be expressed as G·<dW>_1, where G represents the preset step size, <dW>_0 represents the first share of the loss function gradient, <dW>_1 represents the second share of the loss function gradient, and <dW>_0 + <dW>_1 = dW.
  • Regarding the first inverse matrix and the second inverse matrix: since the Hessian matrix is a square matrix, the Hessian matrix can be inverted, and the inverse matrix of the Hessian matrix can be used as the first inverse matrix. The second product is also a square matrix, so the second product can likewise be inverted, and the inverse matrix of the second product can be used as the second inverse matrix. The first inverse matrix may be expressed as H^-1, and the second inverse matrix may be expressed as (HR)^-1.
  • In some embodiments, the first data party, based on the feature data it holds and the first share of the first model parameters, and the second data party, based on the second share of the first model parameters it holds, may secretly share the first product. The first data party and the second data party may each obtain a share of the first product: the share obtained by the first data party may be used as the first share of the first product, and the share obtained by the second data party may be used as the second share of the first product. The sum of the first share of the first product and the second share of the first product is equal to the first product.
  • For example, the first share of the first model parameters can be expressed as <W>_0 and the second share as <W>_1. The first data party may secretly share the first product XW according to X and <W>_0, and the second data party may secretly share the first product XW according to <W>_1. The first data party can obtain the first share of the first product, <XW>_0, and the second data party can obtain the second share of the first product, <XW>_1, where <XW>_0 + <XW>_1 = XW.
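  • Below is a single-process sketch of this sharing step. The local term X·<W>_0 needs no interaction; the cross term X·<W>_1 is obtained here with a Beaver-style multiplication triple (A, B, C = A·B) handed out by a dealer, which is one standard way to realize such a secret-shared product (this specification does not fix the sub-protocol). Plain floats stand in for the fixed-point field arithmetic of a real deployment:

      import numpy as np

      rng = np.random.default_rng(1)
      n, d = 4, 3
      X = rng.standard_normal((n, d))      # the first data party's feature data
      W = rng.standard_normal((d, 1))      # the first model parameters
      W0 = rng.standard_normal((d, 1))     # first share <W>_0, held by P1
      W1 = W - W0                          # second share <W>_1, held by P2

      # trusted dealer: a matrix multiplication triple shaped like (X, <W>_1)
      A, B = rng.standard_normal((n, d)), rng.standard_normal((d, 1))
      C = A @ B
      C0 = rng.standard_normal((n, 1)); C1 = C - C0

      E = X - A       # published by the first data party
      F = W1 - B      # published by the second data party

      Z0 = C0 + A @ F + E @ F              # P1's share of X @ <W>_1
      Z1 = C1 + E @ B                      # P2's share of X @ <W>_1

      XW0 = X @ W0 + Z0                    # first share <XW>_0 of the first product
      XW1 = Z1                             # second share <XW>_1 of the first product
      assert np.allclose(XW0 + XW1, X @ W)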
  • In some embodiments, a corresponding logic circuit can be constructed according to the activation function. The logic circuit can be constructed by the first data party, by the second data party, or by another device (for example, a trusted third party). The logic circuit may be composed of at least one arithmetic gate, which may include AND gates, OR gates, exclusive-OR gates, and so on. The logic circuit may include at least two input lines and at least one output line, and a garbled circuit can be obtained by encrypting the input lines and/or output lines of the logic circuit. The garbled circuit may include a garbled truth table for each arithmetic gate in the logic circuit. The logic circuit can be constructed directly according to the activation function; alternatively, the activation function can first be transformed in various appropriate ways and the logic circuit constructed according to the transformed activation function; or another function can be generated based on the activation function and the logic circuit constructed according to that other function. Accordingly, the garbled circuit corresponding to the activation function can be understood as being generated from the logic circuit of the activation function itself, of the transformed activation function, or of the other function.
  • Both the first data party and the second data party may possess the garbled circuit corresponding to the activation function. The garbled circuit may be generated by the first data party, which may send the generated garbled circuit to the second data party for the latter to receive; alternatively, the garbled circuit may be generated by the second data party, which may send the generated garbled circuit to the first data party for the latter to receive.
  • The first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, can communicate based on the garbled circuit corresponding to the activation function. The first data party and the second data party may each obtain a share of the value of the activation function: the share obtained by the first data party may be used as the first share of the value of the activation function, and the share obtained by the second data party may be used as the second share. The sum of the first share of the value of the activation function and the second share of the value of the activation function is equal to the value of the activation function.
  • For example, x1 may be used to represent the first share of the first product, x2 the second share of the first product, and x3 a share of the value of the activation function (hereinafter referred to as the second share of the value of the activation function); f1(x1, x2, x3) is used to represent the other share of the value of the activation function (hereinafter referred to as the first share of the value of the activation function). The second data party may generate the share x3 of the value of the activation function as the second share. The first data party can use the first share of the first product as its input to the garbled circuit, and the second data party can use the second share of the first product and the second share of the value of the activation function as its inputs to the garbled circuit. The first data party may then calculate the other share of the value of the activation function, f1(x1, x2, x3), based on the garbled circuit, as the first share. For the specific calculation process, refer to the earlier example introducing the garbled circuit, which is not detailed here.
  • In some embodiments, a piecewise linear function may also be used to fit the activation function. A corresponding logic circuit can be constructed according to the piecewise linear function, and the garbled circuit can be obtained by encrypting the input lines and/or output lines of that logic circuit. Both the first data party and the second data party may possess the garbled circuit. For example, the activation function may be a Sigmoid function, and the piecewise linear function may be a piecewise linear fit of the Sigmoid function, where k represents the coefficients of the piecewise linear function.
  • The first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, can communicate based on the garbled circuit. The first data party and the second data party may respectively obtain a share of the value of the piecewise linear function: the share obtained by the first data party may be used as the first share, and the share obtained by the second data party as the second share; the sum of the first share and the second share of the value of the piecewise linear function is equal to the value of the piecewise linear function. The first data party may use the first share of the value of the piecewise linear function as the first share of the value of the activation function, and the second data party may use the second share of the value of the piecewise linear function as the second share of the value of the activation function.
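  • Since this specification does not give the concrete piecewise linear function, the sketch below uses one simple possibility: clamping k·x + 0.5 to the interval [0, 1]. The fit is exact at x = 0 and bounded everywhere, so the garbled circuit never has to evaluate an exponential:

      import numpy as np

      def sigmoid(x):
          return 1.0 / (1.0 + np.exp(-x))

      def piecewise_sigmoid(x, k=0.25):
          # a three-segment piecewise linear fit: 0, then k*x + 0.5, then 1
          return np.clip(k * x + 0.5, 0.0, 1.0)

      x = np.linspace(-6, 6, 13)
      print(np.max(np.abs(sigmoid(x) - piecewise_sigmoid(x))))  # rough fit error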
  • In some embodiments, the first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the label and the second share of the value of the activation function, may secretly share the gradient of the loss function. The first data party and the second data party may each obtain a share of the gradient of the loss function: the share obtained by the first data party may be used as the first share of the loss function gradient, and the share obtained by the second data party as the second share. The sum of the first share of the loss function gradient and the second share of the loss function gradient is equal to the gradient of the loss function.
  • For example, the first share of the value of the activation function can be expressed as <a>_0 and the second share as <a>_1. The first data party can secretly share the gradient dW of the loss function (specifically, a vector) based on X and <a>_0, and the second data party can secretly share the gradient dW based on the label Y and <a>_1. The first data party can obtain the first share of the loss function gradient, <dW>_0, and the second data party can obtain the second share, <dW>_1.
  • Specifically, the first data party, according to X, and the second data party, according to <a>_1, may secretly share X^T·<a>_1. The first data party can obtain <[X^T·<a>_1]>_0, and the second data party can obtain <[X^T·<a>_1]>_1, where <[X^T·<a>_1]>_0 + <[X^T·<a>_1]>_1 = X^T·<a>_1.
  • The first data party, according to X, and the second data party, according to the label Y (specifically, a vector formed by the labels), may also secretly share X^T·Y. The first data party can obtain <X^T·Y>_0, and the second data party can obtain <X^T·Y>_1, where <X^T·Y>_0 + <X^T·Y>_1 = X^T·Y.
  • The first data party can calculate X^T·<a>_0 + <[X^T·<a>_1]>_0 - <X^T·Y>_0 as the first share <dW>_0 of the loss function gradient dW, and the second data party can calculate <[X^T·<a>_1]>_1 - <X^T·Y>_1 as the second share <dW>_1 of the loss function gradient dW.
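  • The sketch below plays both parties and the triple dealer in one process to check the arithmetic of this sharing, which computes dW = X^T·(a - Y); shared_matmul and all variable names are illustrative, and floats stand in for the field arithmetic of a real deployment:

      import numpy as np

      rng = np.random.default_rng(2)

      def shared_matmul(M, v):
          """Additive shares of M @ v (P1 holds M, P2 holds v), via a dealer triple."""
          A, B = rng.standard_normal(M.shape), rng.standard_normal(v.shape)
          C = A @ B
          C0 = rng.standard_normal(C.shape); C1 = C - C0
          E, F = M - A, v - B                    # the two exchanged masked values
          return C0 + A @ F + E @ F, C1 + E @ B  # (<Mv>_0, <Mv>_1)

      n, d = 4, 3
      X = rng.standard_normal((n, d))                # held by the first data party
      Y = rng.integers(0, 2, (n, 1)).astype(float)   # labels, held by P2
      a = rng.uniform(0, 1, (n, 1))                  # value of the activation function
      a0 = rng.standard_normal(a.shape); a1 = a - a0 # shares <a>_0, <a>_1

      t0, t1 = shared_matmul(X.T, a1)            # shares of X^T <a>_1
      u0, u1 = shared_matmul(X.T, Y)             # shares of X^T Y

      dW0 = X.T @ a0 + t0 - u0                   # first data party: <dW>_0
      dW1 = t1 - u1                              # second data party: <dW>_1
      assert np.allclose(dW0 + dW1, X.T @ (a - Y))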
  • In some embodiments, the first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the second share of the value of the activation function, may secretly share the Hessian matrix. The first data party and the second data party may each obtain a share of the Hessian matrix: the share obtained by the first data party may be used as the first share of the Hessian matrix, and the share obtained by the second data party as the second share. The sum of the first share of the Hessian matrix and the second share of the Hessian matrix is equal to the Hessian matrix.
  • Specifically, the first data party, according to the first share of the value of the activation function, and the second data party, according to the second share of the value of the activation function, may secretly share a diagonal matrix. The first data party and the second data party may each obtain a share of the diagonal matrix: the share obtained by the first data party may be used as the first share of the diagonal matrix, and the share obtained by the second data party as the second share; the sum of the first share of the diagonal matrix and the second share of the diagonal matrix is equal to the diagonal matrix. The first data party, according to the feature data and the first share of the diagonal matrix, and the second data party, according to the second share of the diagonal matrix, can then secretly share the Hessian matrix; the first data party can obtain the first share of the Hessian matrix, and the second data party can obtain the second share of the Hessian matrix.
  • For example, the first data party can secretly share the diagonal matrix RNN according to <a>_0, and the second data party can secretly share the diagonal matrix RNN according to <a>_1. The first data party can obtain the first share RNN_0 of the diagonal matrix, and the second data party can obtain the second share RNN_1 of the diagonal matrix.
  • Specifically, the first data party, according to <a>_0, and the second data party, according to <a>_1, may secretly share <a>_0 ⊙ <a>_1. The first data party can obtain <[<a>_0 ⊙ <a>_1]>_0, and the second data party can obtain <[<a>_0 ⊙ <a>_1]>_1, where <[<a>_0 ⊙ <a>_1]>_0 + <[<a>_0 ⊙ <a>_1]>_1 = <a>_0 ⊙ <a>_1, and ⊙ represents the element-wise (bitwise) multiplication operation.
  • From these results, the first data party can compute a share <r>_0, and the second data party a share <r>_1, of the vector r formed by the data elements on the main diagonal of RNN. The first data party may use the data elements of <r>_0 as the data elements on the main diagonal of RNN_0, thereby generating RNN_0 according to <r>_0; the second data party may use the data elements of <r>_1 as the data elements on the main diagonal of RNN_1, thereby generating RNN_1 according to <r>_1.
  • The first data party can secretly share the Hessian matrix H according to X and RNN_0, and the second data party can secretly share the Hessian matrix H according to RNN_1. The first data party can obtain the first share <H>_0 of the Hessian matrix, and the second data party can obtain the second share <H>_1 of the Hessian matrix.
  • Specifically, the first data party, according to X, and the second data party, according to RNN_1, may secretly share X^T·RNN_1. The first data party can obtain <X^T·RNN_1>_0, and the second data party can obtain <X^T·RNN_1>_1, where <X^T·RNN_1>_0 + <X^T·RNN_1>_1 = X^T·RNN_1.
  • The first data party, according to X, and the second data party, according to <X^T·RNN_1>_1, may also secretly share <X^T·RNN_1>_1·X. The first data party can obtain <[<X^T·RNN_1>_1·X]>_0, and the second data party can obtain <[<X^T·RNN_1>_1·X]>_1.
  • The first data party can calculate X^T·RNN_0·X + <X^T·RNN_1>_0·X + <[<X^T·RNN_1>_1·X]>_0 as the first share <H>_0 of the Hessian matrix H, and the second data party can use <[<X^T·RNN_1>_1·X]>_1 as the second share <H>_1 of the Hessian matrix H.
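  • The sketch below checks this sharing end to end. One detail left unstated above is how the parties turn <a>_0, <a>_1, and the shared product <a>_0 ⊙ <a>_1 into the shares <r>_0 and <r>_1; the sketch assumes the logistic-regression diagonal r = a ⊙ (1 - a), for which <r>_0 = <a>_0 - <a>_0 ⊙ <a>_0 - 2·<[<a>_0 ⊙ <a>_1]>_0 and symmetrically for <r>_1. All helper names are illustrative, with floats standing in for field arithmetic:

      import numpy as np

      rng = np.random.default_rng(3)

      def shared_op(x, y, op):
          """Additive shares of op(x, y), where P1 holds x and P2 holds y."""
          A, B = rng.standard_normal(x.shape), rng.standard_normal(y.shape)
          C = op(A, B)
          C0 = rng.standard_normal(C.shape); C1 = C - C0
          E, F = x - A, y - B
          return C0 + op(A, F) + op(E, F), C1 + op(E, B)

      mul = lambda u, v: u * v                   # element-wise product
      mm = lambda u, v: u @ v                    # matrix product

      n, d = 4, 3
      X = rng.standard_normal((n, d))
      a = rng.uniform(0, 1, (n, 1))              # value of the activation function
      a0 = rng.standard_normal(a.shape); a1 = a - a0

      p0, p1 = shared_op(a0, a1, mul)            # shares of <a>_0 ⊙ <a>_1
      r0 = a0 - a0 * a0 - 2 * p0                 # <r>_0 (assumes r = a(1 - a))
      r1 = a1 - a1 * a1 - 2 * p1                 # <r>_1
      RNN0, RNN1 = np.diagflat(r0), np.diagflat(r1)

      S0, S1 = shared_op(X.T, RNN1, mm)          # shares of X^T RNN_1
      T0, T1 = shared_op(S1, X, mm)              # shares of <X^T RNN_1>_1 X

      H0 = X.T @ RNN0 @ X + S0 @ X + T0          # first share of the Hessian matrix
      H1 = T1                                    # second share of the Hessian matrix
      assert np.allclose(H0 + H1, X.T @ np.diagflat(a * (1 - a)) @ X)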
  • the third party may issue the first share of the random orthogonal matrix to the first data party; may issue the second share of the random orthogonal matrix to the second data party.
  • the sum of the first share of the random orthogonal matrix and the second share of the random orthogonal matrix is equal to the random orthogonal matrix.
  • the first data party may receive a first share of a random orthogonal matrix
  • the second data party may receive a second share of a random orthogonal matrix.
  • The first data party, based on the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, based on the second share of the random orthogonal matrix and the second share of the Hessian matrix, may secretly share the second product. The first data party and the second data party may each obtain a share of the second product: the share obtained by the first data party may be used as the first share of the second product, and the share obtained by the second data party as the second share. The sum of the first share of the second product and the second share of the second product is equal to the second product.
  • In some embodiments, the second data party may perform the inversion processing of the second product. The first data party may send the first share of the second product to the second data party; the second data party may receive it and add it to the second share of the second product it holds to obtain the second product. Since the second product is a square matrix, the second data party can perform inversion processing on the second product to obtain the inverse matrix of the second product as the second inverse matrix, and may send the second inverse matrix to the first data party; the first data party may receive the second inverse matrix.
  • Alternatively, the first data party may perform the inversion processing of the second product. The second data party may send the second share of the second product to the first data party; the first data party may receive it and add it to the first share of the second product it holds to obtain the second product. Since the second product is a square matrix, the first data party can perform inversion processing on the second product to obtain the second inverse matrix, and may send the second inverse matrix to the second data party; the second data party may receive the second inverse matrix.
  • the first data party may multiply the first share of the random orthogonal matrix by the second inverse matrix to obtain the first share of the first inverse matrix.
  • the second data party may multiply the second share of the random orthogonal matrix by the second inverse matrix to obtain the second share of the first inverse matrix.
  • the sum of the first share of the first inverse matrix and the second share of the first inverse matrix is equal to the first inverse matrix.
  • For example, the first share of the random orthogonal matrix can be expressed as <R>_0 and the second share as <R>_1. The first data party may secretly share the second product HR according to <R>_0 and <H>_0, and the second data party may secretly share the second product HR according to <R>_1 and <H>_1. The first data party can obtain the first share of the second product, <HR>_0, and the second data party can obtain the second share of the second product, <HR>_1.
  • the detailed process of the first data party and the second data party secretly sharing the second product HR is described below.
  • The first data party, according to <H>_0, and the second data party, according to <R>_1, may secretly share <H>_0·<R>_1. The first data party can obtain <[<H>_0·<R>_1]>_0, and the second data party can obtain <[<H>_0·<R>_1]>_1, where <[<H>_0·<R>_1]>_0 + <[<H>_0·<R>_1]>_1 = <H>_0·<R>_1.
  • The first data party, according to <R>_0, and the second data party, according to <H>_1, may also secretly share <H>_1·<R>_0. The first data party can obtain <[<H>_1·<R>_0]>_0, and the second data party can obtain <[<H>_1·<R>_0]>_1, where <[<H>_1·<R>_0]>_0 + <[<H>_1·<R>_0]>_1 = <H>_1·<R>_0.
  • The first data party can calculate <H>_0·<R>_0 + <[<H>_0·<R>_1]>_0 + <[<H>_1·<R>_0]>_0 as the first share <HR>_0 of the second product, and the second data party can calculate <H>_1·<R>_1 + <[<H>_0·<R>_1]>_1 + <[<H>_1·<R>_0]>_1 as the second share <HR>_1 of the second product.
  • In this example, the second data party performs the inversion processing of the second product HR. The first data party may send the first share <HR>_0 of the second product to the second data party. The second data party may receive <HR>_0 and add it to the second share <HR>_1 it holds to obtain the second product HR; it may perform inversion processing on HR to obtain the second inverse matrix (HR)^-1, and may send (HR)^-1 to the first data party. The first data party may receive the second inverse matrix (HR)^-1.
  • The first data party may multiply the first share of the random orthogonal matrix <R>_0 by the second inverse matrix (HR)^-1 to obtain the first share of the first inverse matrix H^-1, namely <H^-1>_0 = <R>_0·(HR)^-1; the second data party may multiply <R>_1 by (HR)^-1 to obtain the second share <H^-1>_1 = <R>_1·(HR)^-1.
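  • The essence of this step is the identity H^-1 = R·(HR)^-1: only the masked product HR is ever revealed, and each party finishes locally with its share of R. A numerical sketch follows (illustrative names; the secret sharing of HR itself is elided because it follows the same pattern as above):

      import numpy as np

      rng = np.random.default_rng(4)
      d = 3
      M = rng.standard_normal((d, d))
      H = M @ M.T + d * np.eye(d)        # an invertible stand-in for the Hessian
      R, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal matrix

      R0 = rng.standard_normal((d, d)); R1 = R - R0      # shares from the third party
      HR_inv = np.linalg.inv(H @ R)      # the second inverse matrix (HR)^-1, computed
                                         # in the clear after HR is reconstructed

      Hinv0 = R0 @ HR_inv                # first data party:  <H^-1>_0 = <R>_0 (HR)^-1
      Hinv1 = R1 @ HR_inv                # second data party: <H^-1>_1 = <R>_1 (HR)^-1
      assert np.allclose(Hinv0 + Hinv1, np.linalg.inv(H))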
  • In some embodiments, the first data party, based on the first share of the first inverse matrix and the first share of the loss function gradient, and the second data party, based on the second share of the first inverse matrix and the second share of the loss function gradient, may secretly share the third product. The first data party and the second data party may each obtain a share of the third product: the share obtained by the first data party may be used as the first share of the third product, and the share obtained by the second data party as the second share. The sum of the first share of the third product and the second share of the third product is equal to the third product.
• the first data party may subtract the first share of the third product from the first share of the first model parameter to obtain the first share of the new first model parameter.
• the second data party may subtract the second share of the third product from the second share of the first model parameter to obtain the second share of the new first model parameter.
• the first data party can secretly share the third product according to <H^-1>0 and <dW>0
• the second data party can secretly share the third product according to <H^-1>1 and <dW>1
• the first data party can obtain the first share of the third product <H^-1·dW>0
• the second data party can obtain the second share of the third product <H^-1·dW>1.
• the following describes the detailed process of the first data party and the second data party secretly sharing the third product H^-1·dW.
• the first data party, according to <H^-1>0, and the second data party, according to <dW>1, may secretly share <H^-1>0·<dW>1.
• the first data party can obtain <[<H^-1>0·<dW>1]>0
• the second data party can obtain <[<H^-1>0·<dW>1]>1.
• the first data party, according to <dW>0, and the second data party, according to <H^-1>1, may also secretly share <H^-1>1·<dW>0.
• the first data party can obtain <[<H^-1>1·<dW>0]>0
• the second data party can obtain <[<H^-1>1·<dW>0]>1.
• the first data party can calculate <H^-1>0·<dW>0 + <[<H^-1>0·<dW>1]>0 + <[<H^-1>1·<dW>0]>0 as the first share of the third product <H^-1·dW>0.
• the second data party can calculate <H^-1>1·<dW>1 + <[<H^-1>0·<dW>1]>1 + <[<H^-1>1·<dW>0]>1 as the second share of the third product <H^-1·dW>1.
• H^-1·dW = <H^-1·dW>0 + <H^-1·dW>1.
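Put together, the third-product shares give the Newton update entirely on shares. A minimal sketch, again with dealer-simulated cross terms and hypothetical names:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def share(m):
    s0 = rng.standard_normal(m.shape)
    return s0, m - s0

Hinv = np.linalg.inv(rng.standard_normal((n, n)) + n * np.eye(n))  # H^-1
dW = rng.standard_normal(n)                                        # gradient
Hinv0, Hinv1 = share(Hinv)
dW0, dW1 = share(dW)

# cross terms <H^-1>0·<dW>1 and <H^-1>1·<dW>0, secret-shared as above
c01_0, c01_1 = share(Hinv0 @ dW1)
c10_0, c10_1 = share(Hinv1 @ dW0)

t0 = Hinv0 @ dW0 + c01_0 + c10_0  # <H^-1·dW>0
t1 = Hinv1 @ dW1 + c01_1 + c10_1  # <H^-1·dW>1
assert np.allclose(t0 + t1, Hinv @ dW)

# Newton update: each party subtracts its share of the third product
# from its share of the first model parameter
W = rng.standard_normal(n)
W0, W1 = share(W)
assert np.allclose((W0 - t0) + (W1 - t1), W - Hinv @ dW)
```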
• Newton's method has a faster convergence rate than the gradient descent method.
• because the method for determining model parameters in steps S11 to S19 uses Newton's method, it can not only protect the data privacy of the parties in the cooperative modeling (the first data party and the second data party), but also reduce the number of model parameter optimization adjustments, thereby improving the training efficiency of the data processing model.
  • the excitation function in the data processing model is usually a non-linear function
  • the operations involved are non-linear operations, so its value cannot be directly calculated using the secret sharing algorithm. Therefore, if only secret sharing is used to determine the model parameters of the data processing model cooperatively using the Newton method, a polynomial needs to be used to fit the excitation function.
  • the use of polynomials to fit the excitation function has the problem of out of bounds (when the input of the polynomial exceeds a certain range, its output will become very large or very small), which may cause the data processing model to fail to complete the training.
  • the complexity of the confusion circuit is relatively high.
• by combining secret sharing with the obfuscation circuit, steps S11 to S19 can not only avoid the out-of-bounds problem, but also reduce the complexity of the data processing model training process.
• before step S17, the first data party, based on the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, based on the second share of the random orthogonal matrix and the second share of the Hessian matrix, may secretly share the second product.
  • the first data party can obtain the first share of the second product, and the second data party can obtain the second share of the second product.
  • the second product is a product between a random orthogonal matrix and a Hessian matrix.
• in step S17, when the condition number of the second product satisfies the preset condition, the first data party, according to the first share of the Hessian matrix, and the second data party, according to the second share of the Hessian matrix, may secretly share the first inverse matrix.
• the first data party can obtain the first share of the first inverse matrix, and the second data party can obtain the second share of the first inverse matrix.
• the preset condition may include: the condition number is less than or equal to a preset threshold.
  • the preset threshold may be an empirical value, or it may also be obtained by other means (for example, a machine learning method).
  • Both the first data party and the second data party may hold the preset conditions. Furthermore, the first data party and the second data party may respectively determine whether the condition number of the second product satisfies the preset condition.
• the condition number of the second product may be calculated by the first data party. Specifically, the second data party may send the second share of the second product to the first data party. The first data party may receive the second share of the second product; may add it to the first share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may determine whether the condition number meets the preset condition; and may send the condition number of the second product to the second data party.
• the second data party can receive the condition number of the second product, and can determine whether it meets the preset condition. In some other embodiments, the condition number of the second product may instead be calculated by the second data party. Specifically, the first data party may send the first share of the second product to the second data party. The second data party may receive the first share of the second product; may add it to the second share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may determine whether the condition number meets the preset condition; and may send the condition number of the second product to the first data party. The first data party can receive the condition number of the second product, and can determine whether it meets the preset condition.
  • the second data party may send the second share of the second product to the first data party.
• the first data party may receive the second share of the second product; may add it to the first share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may determine whether the condition number satisfies the preset condition; and may send the judgment result information to the second data party.
  • the second data party may receive the judgment result information.
  • only the second data party may hold the preset condition, and then only the second data party may determine whether the condition number of the second product meets the preset condition.
  • the first data party may send the first share of the second product to the second data party.
• the second data party may receive the first share of the second product; may add it to the second share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may determine whether the condition number satisfies the preset condition; and may send the judgment result information to the first data party.
  • the first data party may receive the judgment result information.
  • a square matrix can be multiplied by an orthogonal matrix to obtain a new matrix, which has the same condition number as the square matrix. Since the Hessian matrix is a square matrix, the condition number of the second product is equal to the condition number of the Hessian matrix. In this way, the first data party and the second data party can collaboratively calculate the condition number of the Hessian matrix without leaking their share of the Hessian matrix.
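This property is easy to confirm numerically. The sketch below assumes the 2-norm condition number (NumPy's default) and shows cond(HR) = cond(H) for an orthogonal R:

```python
import numpy as np

rng = np.random.default_rng(3)
H = rng.standard_normal((5, 5))                   # stand-in Hessian matrix
R, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # orthogonal: R.T @ R = I

# right-multiplying by an orthogonal matrix preserves the singular values,
# so the condition number of HR can stand in for that of H without
# revealing H itself
assert np.isclose(np.linalg.cond(H), np.linalg.cond(H @ R))
```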
• if the condition number of the second product satisfies the preset condition, the second product is less ill-conditioned, which indicates that the Hessian matrix is less ill-conditioned, so the Newton method can be used to determine the model parameters.
• if the condition number of the second product does not satisfy the preset condition, the second product is ill-conditioned, which indicates that the Hessian matrix is ill-conditioned; the Newton method cannot be used to determine the model parameters, so the gradient descent method can be used instead of the Newton method to determine the model parameters.
  • the first data party may calculate the first share of the new first model parameter according to the first share of the first model parameter, the first share of the loss function gradient, and the preset step size.
  • the second data party may calculate the second share of the new first model parameter according to the second share of the first model parameter, the second share of the loss function gradient, and the preset step size.
• the preset step size can be used to control the iteration speed of the gradient descent method.
• the preset step size can be any suitable positive real number: when it is too large, the iteration speed will be too fast, so that the optimal model parameters may not be obtained; when it is too small, the iteration speed will be too slow, so that training takes a long time.
• the preset step size may be an empirical value, or it may be obtained by means of machine learning or in other ways. Both the first data party and the second data party may hold the preset step size.
• the first data party may multiply the first share of the loss function gradient by the preset step size to obtain the fourth product; may subtract the fourth product from the first share of the first model parameter to obtain the first share of the new first model parameter.
• the second data party may multiply the second share of the loss function gradient by the preset step size to obtain the fifth product; may subtract the fifth product from the second share of the first model parameter to obtain the second share of the new first model parameter.
  • the sum of the first share of the new first model parameter and the second share of the new first model parameter is equal to the new first model parameter.
• the second data party may multiply the second share of the loss function gradient <dW>1 (specifically, a vector) by the preset step size G (scalar multiplication of the vector) to obtain the fifth product G·<dW>1; the first data party may likewise obtain the fourth product G·<dW>0. Each party subtracts its product from its share of the first model parameter, giving <W'>0 = <W>0 − G·<dW>0 and <W'>1 = <W>1 − G·<dW>1.
• <W'>0 + <W'>1 = W'.
• W' represents the new first model parameter.
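Because the gradient-descent update is linear, it needs no interaction at all: each party updates its own shares locally. A minimal sketch (the step size value is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(4)
G = 0.1                      # preset step size (assumed value)
W = rng.standard_normal(5)   # first model parameter
dW = rng.standard_normal(5)  # loss function gradient

W0 = rng.standard_normal(5); W1 = W - W0      # <W>0, <W>1
dW0 = rng.standard_normal(5); dW1 = dW - dW0  # <dW>0, <dW>1

Wp0 = W0 - G * dW0  # first data party:  <W'>0 = <W>0 - G·<dW>0
Wp1 = W1 - G * dW1  # second data party: <W'>1 = <W>1 - G·<dW>1

assert np.allclose(Wp0 + Wp1, W - G * dW)  # <W'>0 + <W'>1 = W' = W - G·dW
```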
  • this embodiment can avoid the problem of non-convergence caused by the ill-conditioned matrix in the process of using the Newton method to determine the model parameters.
  • it may further include a process of iterative optimization and adjustment of the model parameters of the data processing model.
  • the step of secretly sharing the first product can be repeated, the first data party can obtain the first share of the new first product, and the second data party can obtain the second share of the new first product.
  • the new first product is the product of the feature data and the new first model parameter.
  • Step S11 may be repeatedly executed, the first data party can obtain the first share of the new excitation function value, and the second data party can obtain the second share of the new excitation function value.
  • Step S13 can be repeated, the first data party can obtain the first share of the new loss function gradient, and the second data party can obtain the second share of the new loss function gradient.
  • Step S15 may be repeated, the first data party can obtain the first share of the new Hessian matrix, and the second data party can obtain the second share of the new Hessian matrix.
  • the step of secretly sharing the second product may be repeated, the first data party can obtain the first share of the new second product, and the second data party can obtain the second share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • Step S17 can be repeated, the first data party can obtain the first share of the new first inverse matrix, and the second data party can obtain the second share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
• the first data party, based on the first share of the new first model parameter, the first share of the new first inverse matrix, and the first share of the new loss function gradient, and the second data party, based on the second share of the new first model parameter, the second share of the new first inverse matrix, and the second share of the new loss function gradient, may secretly share the second model parameter.
  • the first data party can obtain a first share of the second model parameter
  • the second data party can obtain a second share of the second model parameter.
  • the sum of the first share of the second model parameter and the second share of the second model parameter is equal to the second model parameter.
  • the first data party may calculate the first share of the second model parameter according to the first share of the new first model parameter, the first share of the new loss function gradient, and the preset step size.
• the specific calculation process is similar to the process of calculating the first share of the new first model parameter according to the first share of the first model parameter, the first share of the loss function gradient, and the preset step size.
  • the second data party may calculate the second share of the second model parameter according to the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size.
• the specific calculation process is similar to the process of calculating the second share of the new first model parameter according to the second share of the first model parameter, the second share of the loss function gradient, and the preset step size.
  • the iterative process can be realized through a combination of secret sharing and obfuscation circuits.
• the combination of secret sharing and obfuscation circuits can not only avoid the out-of-bounds problem, but also reduce the complexity of the data processing model training process.
• in addition, in the iterative process, the gradient descent method is used.
  • it may also include a process of iterative optimization and adjustment of the model parameters of the data processing model.
  • the step of secretly sharing the first product can be repeated, the first data party can obtain the first share of the new first product, and the second data party can obtain the second share of the new first product.
  • the new first product is the product of the feature data and the new first model parameter.
  • the first data party may secretly share the value of the new incentive function based on the first share of the new first product, and the second data party may secretly share the value of the new incentive function based on the second share of the new first product.
  • the first data party can obtain the first share of the value of the new excitation function, and the second data party can obtain the second share of the value of the new excitation function.
  • Step S13 can be repeated, the first data party can obtain the first share of the new loss function gradient, and the second data party can obtain the second share of the new loss function gradient.
  • Step S15 may be repeated, the first data party can obtain the first share of the new Hessian matrix, and the second data party can obtain the second share of the new Hessian matrix.
  • the step of secretly sharing the second product may be repeated, the first data party can obtain the first share of the new second product, and the second data party can obtain the second share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • Step S17 can be repeated, the first data party can obtain the first share of the new first inverse matrix, and the second data party can obtain the second share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
• the first data party, based on the first share of the new first model parameter, the first share of the new first inverse matrix, and the first share of the new loss function gradient, and the second data party, based on the second share of the new first model parameter, the second share of the new first inverse matrix, and the second share of the new loss function gradient, may secretly share the second model parameter.
  • the first data party can obtain a first share of the second model parameter
  • the second data party can obtain a second share of the second model parameter.
  • the sum of the first share of the second model parameter and the second share of the second model parameter is equal to the second model parameter.
  • the iterative process can be realized through secret sharing.
• because Newton's method is used, not only can the data privacy of the parties in the cooperative modeling (the first data party and the second data party) be protected, but the number of model parameter optimization adjustments can also be reduced, improving the training efficiency of the data processing model.
  • the first data party may calculate the first share of the second model parameter according to the first share of the new first model parameter, the first share of the new loss function gradient, and the preset step size.
• the specific calculation process is similar to the process of calculating the first share of the new first model parameter according to the first share of the first model parameter, the first share of the loss function gradient, and the preset step size.
  • the second data party may calculate the second share of the second model parameter according to the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size.
• the specific calculation process is similar to the process of calculating the second share of the new first model parameter according to the second share of the first model parameter, the second share of the loss function gradient, and the preset step size. In this way, the iterative process can be realized through secret sharing. In addition, in the iterative process, the gradient descent method is used.
  • the following describes the process in which the first data party secretly shares the value of the new incentive function based on the first share of the new first product, and the second data party secretly shares the value of the new incentive function based on the second share of the new first product.
  • the first data party may secretly share the value of the polynomial according to the first share of the new first product
  • the second data party may secretly share the value of the polynomial according to the second share of the new first product.
  • the first data party and the second data party may respectively obtain a share of the value of the polynomial.
  • the polynomial can be used to fit the activation function of the data processing model. In this way, the share obtained by the first data party may be used as the first share of the value of the new incentive function, and the share obtained by the second data party may be used as the second share of the value of the new incentive function.
  • the sum of the first share of the value of the new excitation function and the second share of the value of the new excitation function is equal to the value of the new excitation function.
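The sketch below illustrates such a polynomial fit and, incidentally, the out-of-bounds problem mentioned earlier; the sigmoid target, the fitting interval, and the polynomial degree are illustrative assumptions, since none of them are fixed here.

```python
import numpy as np

# fit a low-degree polynomial to the sigmoid over a bounded interval
x = np.linspace(-6, 6, 400)
coeffs = np.polyfit(x, 1 / (1 + np.exp(-x)), deg=7)

z = np.array([-2.0, 0.0, 2.0])
print(np.polyval(coeffs, z))     # close to sigmoid(z) inside the interval
print(np.polyval(coeffs, 30.0))  # out of bounds: the output blows up
```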
  • the first data party and the second data party can cooperate to determine the model parameters of the data processing model without leaking the data held by themselves.
• an embodiment of the method for determining model parameters in this specification is described in detail above in conjunction with FIG. 3.
  • the above method steps performed by the first data party can be separately implemented as a model parameter determination method on the first data party's side; the method steps performed by the second data party can be separately implemented as a model parameter determination method on the second data party's side.
  • the following will describe in detail the method for determining model parameters on the side of the first data party and the method for determining model parameters on the side of the second data party in the embodiments of the present specification in conjunction with FIG. 5 and FIG. 6.
  • This specification also provides another embodiment of the method for determining model parameters.
  • the first data party is the execution subject, and the first data party may hold the share of the feature data and the first model parameter. Please refer to Figure 5.
  • This embodiment may include the following steps.
  • Step S21 Communicate with the partner according to the share of the first product and the confusion circuit corresponding to the excitation function to obtain the share of the value of the excitation function.
  • the first product is the product of the feature data and the first model parameter.
  • the cooperating party may be understood as a data party that performs cooperative security modeling with the first data party, and specifically may be the previous second data party.
• the first data party may secretly share the first product with the partner according to the feature data and the share of the first model parameter to obtain the share of the first product; and may communicate with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the incentive function.
• for the specific process, please refer to the related description of step S11 above, which will not be repeated here.
  • Step S23 secretly share the gradient of the loss function with the partner according to the feature data and the value of the incentive function to obtain the share of the gradient of the loss function.
  • the first data party may secretly share the gradient of the loss function with the partner according to the share of the characteristic data and the value of the incentive function to obtain the share of the gradient of the loss function.
• for the specific process, please refer to the related description of step S13 above, which will not be repeated here.
  • Step S25 secretly share the Hessian matrix with the partner according to the share of the characteristic data and the value of the incentive function to obtain the share of the Hessian matrix.
• the first data party may secretly share the diagonal matrix with the partner according to the share of the value of the incentive function to obtain the share of the diagonal matrix; and may secretly share the Hessian matrix with the partner according to the feature data and the share of the diagonal matrix to obtain the share of the Hessian matrix.
• for the specific process, please refer to the related description of step S15 above, which will not be repeated here.
  • Step S27 secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix.
  • the first data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
• the first data party may send the share of the second product to the partner; may receive the second inverse matrix fed back by the partner, the second inverse matrix being the inverse matrix of the second product; and may multiply the share of the random orthogonal matrix by the second inverse matrix to obtain the share of the first inverse matrix.
  • the first data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the first share of the second product.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
• the first data party may receive the second share of the second product sent by the partner; may determine the second inverse matrix according to the first share of the second product and the second share of the second product, where the second inverse matrix is the inverse matrix of the second product; and may multiply the share of the random orthogonal matrix by the second inverse matrix to obtain the share of the first inverse matrix.
• for the specific process, please refer to the related description of step S17 above, which will not be repeated here.
  • Step S29 secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain the share of the new first model parameter.
  • the first data party may secretly share the third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient to obtain the share of the third product.
  • the third product may be a product between the first inverse matrix and the gradient of the loss function.
• the first data party may subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
• the first data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product, where the second product is the product between the random orthogonal matrix and the Hessian matrix.
  • the first data party may secretly share the first inverse matrix with the partner according to the share of the Hessian matrix when the condition number of the second product meets the preset condition to obtain the share of the first inverse matrix.
  • the first inverse matrix is the inverse matrix of the Hessian matrix.
  • the condition number of the second product can be calculated by the first data party and/or the partner.
  • the condition number of the second product is equal to the condition number of the Hessian matrix.
• if the condition number of the second product satisfies the preset condition, the second product is less ill-conditioned, and the Newton method can be used to determine the model parameters.
• if the condition number of the second product does not satisfy the preset condition, the second product is ill-conditioned, and the model parameters cannot be determined using the Newton method; therefore, the gradient descent method can be used instead of the Newton method to determine the model parameters.
• the first data party may calculate the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and the preset step size. Specifically, the first data party may multiply the share of the loss function gradient by the preset step size to obtain the fourth product; and may subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
• the first data party may repeat the step of secretly sharing the first product to obtain the share of the new first product; may repeat step S21 to obtain the share of the value of the new incentive function; may repeat step S23 to obtain the share of the new loss function gradient; may repeat step S25 to obtain the share of the new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the first data party may repeat step S27 to obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
• the first data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter to obtain the share of the second model parameter.
• the first data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • the first data party may repeat the step of secretly sharing the first product to obtain a new share of the first product.
• the first data party may secretly share the value of the new incentive function with the partner according to the share of the new first product to obtain the share of the value of the new incentive function.
• the first data party may repeat step S23 to obtain the share of the new loss function gradient; may repeat step S25 to obtain the share of the new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the first data party may repeat step S27, and the first data party may obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
  • the first data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter to obtain the second model parameter Share.
• the first data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • the first data party can cooperate with the partner to determine the model parameters of the data processing model under the premise of not leaking the data it owns, and obtain the share of the new first model parameters.
  • This specification also provides another embodiment of the method for determining model parameters.
  • the second data party is the execution subject, and the second data party may hold the share of the tag and the first model parameter. Please refer to Figure 6.
  • This embodiment may include the following steps.
  • Step S31 Communicate with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the incentive function.
  • the first product is the product of the feature data and the first model parameter.
  • the cooperating party may be understood as a data party that performs cooperative security modeling with the second data party, and specifically may be the previous first data party.
• the second data party may secretly share the first product with the partner according to the share of the first model parameter to obtain the share of the first product; and may communicate with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the incentive function.
• for the specific process, please refer to the related description of step S11 above, which will not be repeated here.
  • Step S33 secretly share the gradient of the loss function with the partner according to the share of the label and the value of the incentive function to obtain the share of the gradient of the loss function.
  • the second data party may secretly share the gradient of the loss function with the partner according to the share of the tag and the value of the incentive function to obtain the share of the gradient of the loss function.
• for the specific process, please refer to the related description of step S13 above, which will not be repeated here.
• Step S35 secretly share the Hessian matrix with the partner according to the value of the incentive function to obtain the share of the Hessian matrix.
• the second data party may secretly share the diagonal matrix with the partner according to the share of the value of the incentive function to obtain the share of the diagonal matrix; and may secretly share the Hessian matrix with the partner according to the share of the diagonal matrix to obtain the share of the Hessian matrix.
  • Step S37 secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix.
  • the second data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
• the second data party may send the share of the second product to the partner; may receive the second inverse matrix fed back by the partner, the second inverse matrix being the inverse matrix of the second product; and may multiply the share of the random orthogonal matrix by the second inverse matrix to obtain the share of the first inverse matrix.
  • the second data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the first share of the second product.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
• the second data party may receive the second share of the second product sent by the partner; may determine the second inverse matrix according to the first share of the second product and the second share of the second product, where the second inverse matrix is the inverse matrix of the second product; and may multiply the share of the random orthogonal matrix by the second inverse matrix to obtain the share of the first inverse matrix.
• for the specific process, please refer to the related description of step S17 above, which will not be repeated here.
  • Step S39 secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain the share of the new first model parameter.
  • the second data party may secretly share the third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient to obtain the share of the third product.
  • the third product may be a product between the first inverse matrix and the gradient of the loss function.
• the second data party may subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
• the second data party may secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product, where the second product is the product between the random orthogonal matrix and the Hessian matrix.
  • the second data party may secretly share the first inverse matrix with the partner according to the share of the Hessian matrix when the condition number of the second product satisfies the preset condition to obtain the share of the first inverse matrix.
  • the first inverse matrix is the inverse matrix of the Hessian matrix.
  • the condition number of the second product can be calculated by the second data party and/or the partner.
  • the condition number of the second product is equal to the condition number of the Hessian matrix.
• if the condition number of the second product satisfies the preset condition, the second product is less ill-conditioned, and the Newton method can be used to determine the model parameters.
• if the condition number of the second product does not satisfy the preset condition, the second product is ill-conditioned, and the model parameters cannot be determined using the Newton method; therefore, the gradient descent method can be used instead of the Newton method to determine the model parameters.
• the second data party may calculate the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and the preset step size.
• specifically, the second data party may multiply the share of the loss function gradient by the preset step size to obtain the fourth product; and may subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
• the second data party may repeat the step of secretly sharing the first product to obtain the share of the new first product; may repeat step S31 to obtain the share of the value of the new incentive function; may repeat step S33 to obtain the share of the new loss function gradient; may repeat step S35 to obtain the share of the new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the second data party may repeat step S37 to obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
• the second data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter to obtain the share of the second model parameter.
• the second data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • the second data party may repeat the step of secretly sharing the first product to obtain a new share of the first product.
• the second data party may secretly share the value of the new incentive function with the partner according to the share of the new first product to obtain the share of the value of the new incentive function.
• the second data party may repeat step S33 to obtain the share of the new loss function gradient; may repeat step S35 to obtain the share of the new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the second data party may repeat step S37 to obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
• the second data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter to obtain the share of the second model parameter.
• the second data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • the second data party can cooperate with the partner to determine the model parameters of the data processing model under the premise of not leaking the data owned by itself, and obtain the share of the new first model parameters.
• the model parameter determination device in the embodiments of this specification will be described in detail below in conjunction with FIG. 7 and FIG. 8.
  • This specification also provides an embodiment of the device for determining model parameters.
  • This embodiment can be applied to the first data party and can include the following units.
• the incentive function value share acquisition unit 41 is configured to communicate with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the incentive function, where the first product is the product of the characteristic data and the first model parameter;
  • the loss function gradient share obtaining unit 43 is configured to secretly share the gradient of the loss function with the partner according to the share of the characteristic data and the value of the incentive function, to obtain the share of the loss function gradient;
• the Hessian matrix share obtaining unit 45 is configured to secretly share the Hessian matrix with the partner according to the share of the characteristic data and the value of the incentive function to obtain the share of the Hessian matrix;
  • the first inverse matrix share obtaining unit 47 is configured to secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;
  • the model parameter share obtaining unit 49 is configured to secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain the share of the new first model parameter .
  • This specification also provides another embodiment of the device for determining model parameters.
  • This embodiment can be applied to the second data party and can include the following units.
• the incentive function value share obtaining unit 51 is configured to communicate with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the incentive function, where the first product is the product of the characteristic data and the first model parameter;
• the loss function gradient share acquisition unit 53 is configured to secretly share the gradient of the loss function with the partner according to the share of the label and the value of the incentive function to obtain the share of the loss function gradient;
• the Hessian matrix share obtaining unit 55 is configured to secretly share the Hessian matrix with the partner according to the share of the value of the incentive function to obtain the share of the Hessian matrix;
  • the first inverse matrix share obtaining unit 57 is configured to secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;
  • the model parameter share obtaining unit 59 is configured to secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain the share of the new first model parameter .
  • model parameter determination system may include a first data party, a second data party, and a trusted third party.
  • the third party may be one server; or, it may also be a server cluster including multiple servers.
  • the third party is used to provide random numbers to the first data party and the second data party.
• the third party may generate a random orthogonal matrix, and each random number in the random orthogonal matrix may be split into two shares, one of which may be used as the first share and the other as the second share.
• the third party may use the matrix formed by the first shares of the random numbers in the random orthogonal matrix as the first share of the random orthogonal matrix, and use the matrix formed by the second shares of the random numbers in the random orthogonal matrix as the second share of the random orthogonal matrix; may send the first share of the random orthogonal matrix to the first data party; and may send the second share of the random orthogonal matrix to the second data party.
  • the sum of the first share of the random orthogonal matrix and the second share of the random orthogonal matrix is equal to the random orthogonal matrix.
• on the one hand, the random orthogonal matrix is a matrix composed of random numbers; on the other hand, it is an orthogonal matrix. After a square matrix is multiplied by an orthogonal matrix, a new matrix is obtained, and the new matrix has the same condition number as the square matrix.
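A minimal sketch of the third party's role, assuming (as one common construction, not mandated here) that the orthogonal matrix is obtained from the QR decomposition of a Gaussian random matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

# sample a random orthogonal matrix
R, _ = np.linalg.qr(rng.standard_normal((n, n)))

# split every entry into two additive shares and group them into matrices
R0 = rng.standard_normal((n, n))  # first share, sent to the first data party
R1 = R - R0                       # second share, sent to the second data party

assert np.allclose(R0 + R1, R)          # the shares sum to the matrix
assert np.allclose(R.T @ R, np.eye(n))  # and the matrix is orthogonal
```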
• the third party can also generate a first OT random number and a second OT random number;
• may send the first OT random number to the first data party;
• may send the second OT random number to the second data party.
• the OT random numbers are random numbers used in the process of oblivious transfer (OT).
  • the first data party and the second data party are respectively two parties of cooperative security modeling.
  • the first data party may be a data party holding characteristic data
  • the second data party may be a data party holding a tag.
  • the first data party may hold complete feature data
  • the second data party may hold a label of the feature data.
  • the first data party may hold a part of the characteristic data
  • the second data party may hold another part of the characteristic data and a label of the characteristic data.
  • the characteristic data may include the user's savings amount and loan amount.
  • the first data party may hold the user's savings amount
  • the second data party may hold the user's loan amount and the tag of the characteristic data.
  • the tag can be used to distinguish different types of characteristic data, and the specific value can be taken from 0 and 1, for example.
  • the data party here can be an electronic device.
• the electronic equipment may include a personal computer, a server, a handheld device, a portable device, a tablet device, or a multi-processor device; or, it may also include a cluster formed by any of the above devices.
  • the feature data and its corresponding labels together constitute sample data, and the sample data can be used to train the data processing model.
  • the first data party and the second data party may each obtain a share of the first model parameter.
  • the share obtained by the first data party may be used as the first share of the first model parameter
  • the share obtained by the second data party may be used as the second share of the first model parameter.
  • the sum of the first share of the first model parameter and the second share of the first model parameter is equal to the first model parameter.
  • the first data party may receive the first share of the random orthogonal matrix and the first OT random number.
  • the second data party may receive the second share of the random orthogonal matrix and the second OT random number.
• the first data party, based on the first share of the first model parameter, the characteristic data, the first share of the random orthogonal matrix, and the first OT random number, and the second data party, based on the second share of the first model parameter, the label, the second share of the random orthogonal matrix, and the second OT random number, may cooperate to determine new model parameters.
• specifically, the first data party and the second data party may adopt a secret sharing method and use Newton's method to collaboratively determine the new first model parameters; further, a combination of secret sharing and obfuscation circuits may be adopted and the gradient descent method used to cooperatively determine the second model parameters.
  • Step S601 The first data party secretly shares the first product according to the first share of the characteristic data and the first model parameter, and the second data party secretly shares the first product according to the second share of the first model parameter.
  • the first data party gets the first share of the first product
  • the second data party gets the second share of the first product.
  • the first product is the product of the feature data and the first model parameter.
  • Step S603 The first data party secretly shares the value of the incentive function according to the first share of the first product, and the second data party secretly shares the value of the incentive function according to the second share of the first product.
  • the first data party obtains the first share of the value of the excitation function, and the second data party obtains the second share of the value of the excitation function.
• Step S605 The first data party, based on the characteristic data and the first share of the value of the incentive function, and the second data party, based on the label and the second share of the value of the incentive function, secretly share the gradient of the loss function.
  • the first data party obtains the first share of the loss function gradient, and the second data party obtains the second share of the loss function gradient.
• Step S607 The first data party, based on the characteristic data and the first share of the value of the incentive function, and the second data party, based on the second share of the value of the incentive function, secretly share the Hessian matrix.
  • the first data party gets the first share of the Hessian matrix
  • the second data party gets the second share of the Hessian matrix.
• Step S609 The first data party, according to the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, according to the second share of the random orthogonal matrix and the second share of the Hessian matrix, secretly share the second product.
• The first data party gets the first share of the second product, and the second data party gets the second share of the second product.
  • the second product is a product between a random orthogonal matrix and a Hessian matrix.
  • Step S611 When the condition number of the second product meets the preset condition, the first data party secretly shares the first inverse matrix according to the first share of the Hessian matrix, and the second data party secretly shares the first inverse matrix according to the second share of the Hessian matrix.
  • the first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share of the first inverse matrix.
  • the first inverse matrix is the inverse matrix of the Hessian matrix.
• Step S613 The first data party, using the first share of the first model parameter, the first share of the first inverse matrix, and the first share of the loss function gradient, and the second data party, using the second share of the first model parameter, the second share of the first inverse matrix, and the second share of the loss function gradient, secretly share the new first model parameters.
  • the first data party obtains the first share of the new first model parameter, and the second data party obtains the second share of the new first model parameter.
• the first product, the second product, the third product, the fourth product, and the fifth product are introduced below. The first product may be the product between the feature data and the first model parameter.
  • the second product may be a product between a random orthogonal matrix and a Hessian matrix.
  • the third product may be the product between the inverse matrix of the Hessian matrix and the gradient of the loss function.
  • the fourth product may be the product of the first share of the gradient of the loss function and the preset step size.
  • the fifth product may be the product of the second share of the gradient of the loss function and the preset step size.
  • the first product may be expressed as XW; where W represents a first model parameter, specifically a vector formed by the first model parameter; X represents feature data, specifically a matrix formed by feature data.
  • the second product may be expressed as HR, where H represents a Hessian matrix, and R represents a random orthogonal matrix.
• the third product may be expressed as H^-1·dW, where H^-1 represents the inverse matrix of the Hessian matrix, and dW represents the gradient of the loss function, which is specifically a vector.
• the fourth product may be expressed as G·<dW>0
• the fifth product may be expressed as G·<dW>1.
• G represents the preset step size
• <dW>0 represents the first share of the loss function gradient
• <dW>1 represents the second share of the loss function gradient
• <dW>0 + <dW>1 = dW.
• the first inverse matrix and the second inverse matrix are introduced below. Since the Hessian matrix is a square matrix, the Hessian matrix can be inverted, and the inverse matrix of the Hessian matrix can be used as the first inverse matrix.
  • the second product may be a square matrix, and thus the second product may be inverted, and the inverse matrix of the second product may be used as the second inverse matrix.
• the first inverse matrix may be expressed as H^-1
• the second inverse matrix may be expressed as (HR)^-1.
• the first data party, according to the feature data and the first share of the first model parameter, and the second data party, according to the second share of the first model parameter, may secretly share the first product.
  • the first data party and the second data party may each obtain a share of the first product.
  • the share obtained by the first data party may be used as the first share of the first product
  • the share obtained by the second data party may be used as the second share of the first product.
  • the sum of the first share of the first product and the second share of the first product is equal to the first product.
• the first share of the first model parameter can be expressed as <W>0, and the second share as <W>1.
• the first data party may secretly share the first product XW according to X and <W>0
• the second data party may secretly share the first product XW according to <W>1.
• the first data party can obtain the first share of the first product <XW>0
• the second data party can obtain the second share of the first product <XW>1.
• <XW>0 + <XW>1 = XW.
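Since X is held entirely by the first data party, only the term X·<W>1 needs a secure multiplication; X·<W>0 is local. A minimal sketch, with the cross term dealer-simulated as before:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((8, 3))  # feature data, held by the first data party
W = rng.standard_normal(3)       # first model parameter
W0 = rng.standard_normal(3)      # <W>0, first data party
W1 = W - W0                      # <W>1, second data party

# cross term X·<W>1: simulated output shares of a secure multiplication
c = X @ W1
c0 = rng.standard_normal(c.shape); c1 = c - c0

XW0 = X @ W0 + c0  # <XW>0
XW1 = c1           # <XW>1
assert np.allclose(XW0 + XW1, X @ W)
```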
  • the first data party may secretly share the value of the polynomial according to the first share of the first product
  • the second data party may secretly share the value of the polynomial according to the second share of the first product.
  • the first data party and the second data party may respectively obtain a share of the value of the polynomial.
  • the polynomial can be used to fit the activation function of the data processing model. In this way, the share obtained by the first data party may be used as the first share of the value of the incentive function, and the share obtained by the second data party may be used as the second share of the value of the incentive function.
  • the sum of the first share of the value of the excitation function and the second share of the value of the excitation function is equal to the value of the excitation function.
  • the excitation function may be a Sigmoid function.
  • the first share of the value of the excitation function may be expressed as <a>_0, and the second share as <a>_1, where <a>_0 + <a>_1 = a; <a>_0, <a>_1, and a are each vectors (see the fitting sketch below).
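For intuition on the polynomial fitting mentioned above, the sketch below fits a low-degree polynomial to the Sigmoid function; the degree (3) and the fitting interval ([-4, 4]) are illustrative assumptions, since the text does not fix them. A share-based protocol would then evaluate this public polynomial on the shared first product using secret multiplications.

```python
import numpy as np

z = np.linspace(-4, 4, 401)
sigmoid = 1 / (1 + np.exp(-z))
coeffs = np.polyfit(z, sigmoid, deg=3)   # public polynomial coefficients
approx = np.polyval(coeffs, z)
print("max abs fitting error:", np.abs(approx - sigmoid).max())
```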
  • the first data party, according to the feature data and the first share of the value of the excitation function, and the second data party, according to the label and the second share of the value of the excitation function, may secretly share the gradient of the loss function.
  • the first data party and the second data party may obtain a share of the gradient of the loss function respectively.
  • the share obtained by the first data party may be used as the first share of the loss function gradient
  • the share obtained by the second data party may be used as the second share of the loss function gradient.
  • the sum of the first share of the gradient of the loss function and the second share of the gradient of the loss function is equal to the gradient of the loss function.
  • the first data party can secretly share the gradient dW (specifically a vector) of the loss function based on X and <a>_0, and the second data party based on the label Y and <a>_1.
  • the first data party can obtain the first share of the loss function gradient <dW>_0, and the second data party can obtain the second share <dW>_1.
  • the first data party, according to X, and the second data party, according to <a>_1, may secretly share X^T·<a>_1.
  • the first data party can obtain <[X^T·<a>_1]>_0, and the second data party can obtain <[X^T·<a>_1]>_1, where <[X^T·<a>_1]>_0 + <[X^T·<a>_1]>_1 = X^T·<a>_1.
  • the first data party, according to X, and the second data party, according to the label Y (specifically, a vector formed by the labels), may also secretly share X^T·Y.
  • the first data party can obtain <X^T·Y>_0, and the second data party can obtain <X^T·Y>_1, where <X^T·Y>_0 + <X^T·Y>_1 = X^T·Y.
  • the first data party can calculate X^T·<a>_0 + <[X^T·<a>_1]>_0 - <X^T·Y>_0 as the first share <dW>_0 of the loss function gradient dW.
  • the second data party can calculate <[X^T·<a>_1]>_1 - <X^T·Y>_1 as the second share <dW>_1 of the loss function gradient dW (a local simulation follows).
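These share computations can be simulated locally as follows; the cross terms that the protocol obtains by secret sharing are split directly here, and the data shapes and values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

def share(v):
    s0 = rng.standard_normal(v.shape)
    return s0, v - s0

X = rng.standard_normal((5, 2))                    # feature data held by the first data party
Y = rng.integers(0, 2, size=(5, 1)).astype(float)  # labels held by the second data party
a = rng.uniform(0, 1, size=(5, 1))                 # value of the excitation function
a0, a1 = share(a)                                  # <a>_0 and <a>_1

# Shares of the cross terms X^T·<a>_1 and X^T·Y (obtained by secret sharing in the protocol).
s_Xa1_0, s_Xa1_1 = share(X.T @ a1)
s_XY_0, s_XY_1 = share(X.T @ Y)

dW0 = X.T @ a0 + s_Xa1_0 - s_XY_0   # first share of the gradient, computed by party 0
dW1 = s_Xa1_1 - s_XY_1              # second share of the gradient, computed by party 1
assert np.allclose(dW0 + dW1, X.T @ (a - Y))   # dW = X^T(a - Y)
```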
  • the first data party, according to the feature data and the first share of the value of the incentive function, and the second data party, according to the second share of the value of the incentive function, may secretly share the Hessian matrix.
  • the first data party and the second data party may each obtain a share of the Hessian matrix.
  • the share obtained by the first data party may be used as the first share of the Hessian matrix
  • the share obtained by the second data party may be used as the second share of the Hessian matrix.
  • the sum of the first share of the Hessian matrix and the second share of the Hessian matrix is equal to the Hessian matrix.
  • the first data party may secretly share the diagonal matrix according to the first share of the value of the incentive function
  • the second data party may secretly share the diagonal matrix according to the second share of the value of the incentive function.
  • the first data party and the second data party may obtain a share of the diagonal matrix respectively.
  • the share obtained by the first data party may be used as the first share of the diagonal matrix
  • the share obtained by the second data party may be used as the second share of the diagonal matrix.
  • the sum of the first share of the diagonal matrix and the second share of the diagonal matrix is equal to the diagonal matrix.
  • the first data party can secretly share the Hessian matrix according to the feature data and the first share of the diagonal matrix
  • the second data party can secretly share the Hessian matrix according to the second share of the diagonal matrix.
  • the first data party can obtain the first share of the Hessian matrix
  • the second data party can obtain the second share of the Hessian matrix.
  • the first data party can secretly share the diagonal matrix RNN according to <a>_0, and the second data party can secretly share the diagonal matrix RNN according to <a>_1.
  • the first data party can obtain the first share RNN_0 of the diagonal matrix, and the second data party can obtain the second share RNN_1 of the diagonal matrix.
  • the first data party, according to <a>_0, and the second data party, according to <a>_1, may secretly share <a>_0 ∘ <a>_1, where ∘ represents the element-wise multiplication operation.
  • the first data party can obtain <[<a>_0 ∘ <a>_1]>_0, and the second data party can obtain <[<a>_0 ∘ <a>_1]>_1, where <[<a>_0 ∘ <a>_1]>_0 + <[<a>_0 ∘ <a>_1]>_1 = <a>_0 ∘ <a>_1.
  • <r>_0 and <r>_1 denote the two parties' shares of the vector r of data elements on the main diagonal of RNN, with <r>_0 + <r>_1 = r.
  • the first data party may use the data elements of <r>_0 as the data elements on the main diagonal of RNN_0, thereby generating RNN_0 from <r>_0; the second data party may use the data elements of <r>_1 as the data elements on the main diagonal of RNN_1, thereby generating RNN_1 from <r>_1.
  • the first data party can secretly share the Hessian matrix H according to X and RNN_0, and the second data party can secretly share the Hessian matrix H according to RNN_1.
  • the first data party can obtain the first share <H>_0 of the Hessian matrix, and the second data party can obtain the second share <H>_1 of the Hessian matrix.
  • the first data party, according to X, and the second data party, according to RNN_1, may secretly share X^T·RNN_1.
  • the first data party can obtain <X^T·RNN_1>_0, and the second data party can obtain <X^T·RNN_1>_1, where <X^T·RNN_1>_0 + <X^T·RNN_1>_1 = X^T·RNN_1.
  • the first data party, according to X, and the second data party, according to <X^T·RNN_1>_1, may also secretly share <X^T·RNN_1>_1·X.
  • the first data party can obtain <[<X^T·RNN_1>_1·X]>_0, and the second data party can obtain <[<X^T·RNN_1>_1·X]>_1.
  • the first data party can calculate X^T·RNN_0·X + <X^T·RNN_1>_0·X + <[<X^T·RNN_1>_1·X]>_0 as the first share <H>_0 of the Hessian matrix H.
  • the second data party may use <[<X^T·RNN_1>_1·X]>_1 as the second share <H>_1 of the Hessian matrix H (see the simulation below).
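The Hessian-share computation can be checked locally in the same style. Here the diagonal vector r is taken to be a ∘ (1 - a), the usual logistic-regression weighting; this is an assumption for illustration, since the passage above does not state r explicitly.

```python
import numpy as np

rng = np.random.default_rng(2)

def share(v):
    s0 = rng.standard_normal(v.shape)
    return s0, v - s0

X = rng.standard_normal((5, 2))
a = rng.uniform(0, 1, size=5)        # value of the excitation function
r = a * (1 - a)                      # assumed diagonal of RNN (logistic-loss weighting)
r0, r1 = share(r)
RNN0, RNN1 = np.diag(r0), np.diag(r1)

# Shares of the cross terms, split directly instead of via a secret-sharing protocol.
s0_XtR1, s1_XtR1 = share(X.T @ RNN1)        # shares of X^T·RNN_1
s0_cross, s1_cross = share(s1_XtR1 @ X)     # shares of <X^T·RNN_1>_1·X

H0 = X.T @ RNN0 @ X + s0_XtR1 @ X + s0_cross   # <H>_0, computed by party 0
H1 = s1_cross                                   # <H>_1, computed by party 1
assert np.allclose(H0 + H1, X.T @ np.diag(r) @ X)
```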
  • the third party may issue the first share of the random orthogonal matrix to the first data party; may issue the second share of the random orthogonal matrix to the second data party.
  • the sum of the first share of the random orthogonal matrix and the second share of the random orthogonal matrix is equal to the random orthogonal matrix.
  • the first data party may receive the first share of the random orthogonal matrix
  • the second data party may receive the second share of the random orthogonal matrix.
  • the first data party, based on the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, based on the second share of the random orthogonal matrix and the second share of the Hessian matrix, may secretly share the second product.
  • the first data party and the second data party may each obtain a share of the second product.
  • the share obtained by the first data party may be used as the first share of the second product
  • the share obtained by the second data party may be used as the second share of the second product.
  • the sum of the first share of the second product and the second share of the second product is equal to the second product.
  • the first share of the random orthogonal matrix can be expressed as <R>_0, and the second share as <R>_1.
  • the first data party may secretly share the second product HR according to <R>_0 and <H>_0, and the second data party according to <R>_1 and <H>_1.
  • the first data party can obtain the first share of the second product <HR>_0, and the second data party can obtain the second share <HR>_1.
  • the first data party, according to <H>_0, and the second data party, according to <R>_1, may secretly share <H>_0·<R>_1.
  • the first data party can obtain <[<H>_0·<R>_1]>_0, and the second data party can obtain <[<H>_0·<R>_1]>_1, where <[<H>_0·<R>_1]>_0 + <[<H>_0·<R>_1]>_1 = <H>_0·<R>_1.
  • the first data party, according to <R>_0, and the second data party, according to <H>_1, may also secretly share <H>_1·<R>_0.
  • the first data party can obtain <[<H>_1·<R>_0]>_0, and the second data party can obtain <[<H>_1·<R>_0]>_1, where <[<H>_1·<R>_0]>_0 + <[<H>_1·<R>_0]>_1 = <H>_1·<R>_0.
  • the first data party can calculate <H>_0·<R>_0 + <[<H>_0·<R>_1]>_0 + <[<H>_1·<R>_0]>_0 as the first share of the second product <HR>_0.
  • the second data party can calculate <H>_1·<R>_1 + <[<H>_0·<R>_1]>_1 + <[<H>_1·<R>_0]>_1 as the second share of the second product <HR>_1.
  • the preset condition may include: the number of conditions is less than or equal to a preset threshold.
  • the preset threshold may be an empirical value, or it may also be obtained in other ways (for example, machine learning).
  • Both the first data party and the second data party may hold the preset conditions. Furthermore, the first data party and the second data party may respectively determine whether the condition number of the second product satisfies the preset condition.
  • the condition number of the second product may be calculated by the first data party. Specifically, the second data party may send the second share of the second product to the first data party. The first data party may receive the second share of the second product; may add it to the first share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may determine whether the condition number meets the preset condition; and may send the condition number of the second product to the second data party.
  • the second data party can receive the condition number of the second product and can determine whether it meets the preset condition. In some other embodiments, the second data party may instead calculate the condition number of the second product. Specifically, the first data party may send the first share of the second product to the second data party. The second data party may receive the first share of the second product; may add it to the second share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may judge whether the condition number meets the preset condition; and may send the condition number of the second product to the first data party. The first data party can receive the condition number of the second product and can judge whether it meets the preset condition.
  • the second data party may send the second share of the second product to the first data party.
  • the first data party may receive the second share of the second product; may add it to the first share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may judge whether the condition number satisfies the preset condition; and may send the judgment result information to the second data party.
  • the second data party may receive the judgment result information.
  • only the second data party may hold the preset condition, and then only the second data party may determine whether the condition number of the second product meets the preset condition.
  • the first data party may send the first share of the second product to the second data party.
  • the second data party may receive the first share of the second product; may add it to the second share of the second product held by itself to obtain the second product; may calculate the condition number of the second product; may judge whether the condition number satisfies the preset condition; and may send the judgment result information to the first data party.
  • the first data party may receive the judgment result information.
  • a square matrix can be multiplied by an orthogonal matrix to obtain a new matrix, which has the same condition number as the square matrix. Since the Hessian matrix is a square matrix, the condition number of the second product is equal to the condition number of the Hessian matrix. In this way, the first data party and the second data party can collaboratively calculate the condition number of the Hessian matrix without leaking their share of the Hessian matrix.
  • when the condition number of the second product satisfies the preset condition, the second product is less ill-conditioned, which indicates that the Hessian matrix is less ill-conditioned, so the Newton method can be used to determine the model parameters (see the numeric check below).
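The fact relied on here, that multiplying by an orthogonal matrix preserves the (2-norm) condition number, can be checked numerically; the matrix sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
H = rng.standard_normal((4, 4))
H = H.T @ H + 0.1 * np.eye(4)                     # a symmetric positive definite "Hessian"
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random orthogonal matrix
print(np.linalg.cond(H), np.linalg.cond(H @ Q))   # equal up to floating-point rounding
```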
  • the first data party may secretly share the first inverse matrix according to the first share of the Hessian matrix
  • the second data party may secretly share the first inverse matrix according to the second share of the Hessian matrix.
  • the first data party can obtain a first share of the first inverse matrix
  • the second data party can obtain a second share of the first inverse matrix.
  • the second product may be inverted by the second data party.
  • the first data party may send the first share of the second product to the second data party.
  • the second data party may receive the first share of the second product; may add it to its own second share to obtain the second product. Since the second product is a square matrix, the second data party can invert the second product to obtain the inverse matrix of the second product as the second inverse matrix, and may send the second inverse matrix to the first data party.
  • the first data party may receive the second inverse matrix.
  • the first data party may also invert the second product. Specifically, the second data party may send the second share of the second product to the first data party.
  • the first data party may receive the second share of the second product; may add it to its own first share to obtain the second product. Since the second product is a square matrix, the first data party can invert the second product to obtain the inverse matrix of the second product as the second inverse matrix, and may send the second inverse matrix to the second data party. The second data party may receive the second inverse matrix.
  • the first data party may multiply the first share of the random orthogonal matrix by the second inverse matrix to obtain the first share of the first inverse matrix.
  • the second data party may multiply the second share of the random orthogonal matrix by the second inverse matrix to obtain the second share of the first inverse matrix.
  • the sum of the first share of the first inverse matrix and the second share of the first inverse matrix is equal to the first inverse matrix.
  • the second data party inverts the second product HR.
  • the first data party may send the first share of the second product <HR>_0 to the second data party.
  • the second data party may receive the first share of the second product <HR>_0; may add it to its second share of the second product <HR>_1 to obtain the second product HR; may invert HR to obtain the second inverse matrix (HR)^(-1); and may send the second inverse matrix (HR)^(-1) to the first data party.
  • the first data party may receive the second inverse matrix (HR)^(-1).
  • the first data party may multiply the first share of the random orthogonal matrix <R>_0 by the second inverse matrix (HR)^(-1) to obtain the first share <H^(-1)>_0 of the first inverse matrix H^(-1).
  • the second data party may multiply the second share of the random orthogonal matrix <R>_1 by the second inverse matrix (HR)^(-1) to obtain the second share <H^(-1)>_1 of the first inverse matrix H^(-1) (see the numeric check below).
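A local numeric check of this masked inversion, <H^(-1)>_i = <R>_i·(HR)^(-1): since R is orthogonal (hence invertible), R·(HR)^(-1) = R·R^(-1)·H^(-1) = H^(-1). Shapes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
H = rng.standard_normal((4, 4))
H = H.T @ H + 0.1 * np.eye(4)                     # invertible "Hessian"
R, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
R0 = rng.standard_normal((4, 4))
R1 = R - R0                                       # additive shares of R

HR_inv = np.linalg.inv(H @ R)                     # second inverse matrix (HR)^(-1)
Hinv0 = R0 @ HR_inv                               # <H^(-1)>_0, computed by party 0
Hinv1 = R1 @ HR_inv                               # <H^(-1)>_1, computed by party 1
assert np.allclose(Hinv0 + Hinv1, np.linalg.inv(H))
```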
  • the first data party, based on the first share of the first inverse matrix and the first share of the loss function gradient, and the second data party, based on the second share of the first inverse matrix and the second share of the loss function gradient, may secretly share the third product.
  • the first data party and the second data party may each obtain a share of the third product.
  • the share obtained by the first data party may be used as the first share of the third product
  • the share obtained by the second data party may be used as the second share of the third product.
  • the sum of the first share of the third product and the second share of the third product is equal to the third product.
  • the first data party may subtract the first share of the third product from the first share of the first model parameter to obtain the first share of the new first model parameter.
  • the second data party may subtract the second share of the third product from the second share of the first model parameter to obtain the second share of the new first model parameter.
  • the first data party can secretly share the third product according to <H^(-1)>_0 and <dW>_0, and the second data party according to <H^(-1)>_1 and <dW>_1.
  • the first data party can obtain the first share of the third product <H^(-1)·dW>_0, and the second data party can obtain the second share <H^(-1)·dW>_1.
  • the first data party, according to <H^(-1)>_0, and the second data party, according to <dW>_1, may secretly share <H^(-1)>_0·<dW>_1.
  • the first data party can obtain <[<H^(-1)>_0·<dW>_1]>_0, and the second data party can obtain <[<H^(-1)>_0·<dW>_1]>_1.
  • the first data party, according to <dW>_0, and the second data party, according to <H^(-1)>_1, can also secretly share <H^(-1)>_1·<dW>_0.
  • the first data party can obtain <[<H^(-1)>_1·<dW>_0]>_0, and the second data party can obtain <[<H^(-1)>_1·<dW>_0]>_1.
  • the first data party can calculate <H^(-1)>_0·<dW>_0 + <[<H^(-1)>_0·<dW>_1]>_0 + <[<H^(-1)>_1·<dW>_0]>_0 as the first share of the third product <H^(-1)·dW>_0.
  • the second data party can calculate <H^(-1)>_1·<dW>_1 + <[<H^(-1)>_0·<dW>_1]>_1 + <[<H^(-1)>_1·<dW>_0]>_1 as the second share of the third product <H^(-1)·dW>_1.
  • H^(-1)·dW = <H^(-1)·dW>_0 + <H^(-1)·dW>_1 (a local simulation follows).
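The third product and the Newton update can likewise be simulated locally; the cross terms are split directly rather than produced by a cryptographic protocol, and the shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def share(v):
    s0 = rng.standard_normal(v.shape)
    return s0, v - s0

Hinv = rng.standard_normal((3, 3))   # stands in for H^(-1)
dW = rng.standard_normal((3, 1))     # loss function gradient
W = rng.standard_normal((3, 1))      # first model parameter
Hinv0, Hinv1 = share(Hinv); dW0, dW1 = share(dW); W0, W1 = share(W)

# Shares of the cross terms <H^(-1)>_0·<dW>_1 and <H^(-1)>_1·<dW>_0.
c0_a, c1_a = share(Hinv0 @ dW1)
c0_b, c1_b = share(Hinv1 @ dW0)

t0 = Hinv0 @ dW0 + c0_a + c0_b       # <H^(-1)·dW>_0
t1 = Hinv1 @ dW1 + c1_a + c1_b       # <H^(-1)·dW>_1
Wn0, Wn1 = W0 - t0, W1 - t1          # shares of the new first model parameter
assert np.allclose(Wn0 + Wn1, W - Hinv @ dW)
```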
  • the condition number of the second product does not satisfy the preset condition, which indicates that the second product is ill-conditioned, which means that the Hessian matrix is ill-conditioned, and the Newton method cannot be used to determine the model parameters.
  • the gradient descent method can be used instead of Newton's method to determine the model parameters.
  • the first data party may calculate the first share of the new first model parameter according to the first share of the first model parameter, the first share of the loss function gradient, and the preset step size.
  • the second data party may calculate the second share of the new first model parameter according to the second share of the first model parameter, the second share of the loss function gradient, and the preset step size.
  • the preset step length can be used to control the iteration speed of the gradient descent method.
  • the preset step size can be any suitable positive real number. If the preset step size is too large, the iteration speed will be too fast and the optimal model parameters may not be obtained; if it is too small, the iteration speed will be too slow and training will take longer.
  • the preset step length may specifically be an empirical value; or, it may also be obtained by means of machine learning. Of course, the preset step length can also be obtained in other ways. Both the first data party and the second data party may hold the preset step size.
  • the first data party may multiply the first share of the loss function gradient by the preset step size to obtain the fourth product, and may subtract the fourth product from the first share of the first model parameter to obtain the first share of the new first model parameter.
  • the second data party may multiply the second share of the loss function gradient by the preset step size to obtain the fifth product, and may subtract the fifth product from the second share of the first model parameter to obtain the second share of the new first model parameter.
  • the sum of the first share of the new first model parameter and the second share of the new first model parameter is equal to the new first model parameter.
  • the second data party may multiply the second share of the loss function gradient <dW>_1 (specifically a vector) by the preset step size G (a scalar multiplication of the vector) to obtain the fifth product G·<dW>_1.
  • <W'>_0 + <W'>_1 = W', where W' represents the new first model parameter (see the sketch below).
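The gradient descent fallback needs no further interaction once the gradient shares exist; each party updates its share locally. A small sketch follows (the step size value is an arbitrary illustrative choice).

```python
import numpy as np

rng = np.random.default_rng(6)
W0, W1 = rng.standard_normal((3, 1)), rng.standard_normal((3, 1))    # shares of W
dW0, dW1 = rng.standard_normal((3, 1)), rng.standard_normal((3, 1))  # shares of dW
G = 0.05                                    # preset step size (hypothetical value)
Wn0 = W0 - G * dW0                          # party 0 subtracts the fourth product
Wn1 = W1 - G * dW1                          # party 1 subtracts the fifth product
assert np.allclose(Wn0 + Wn1, (W0 + W1) - G * (dW0 + dW1))   # <W'>_0 + <W'>_1 = W'
```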
  • this embodiment can avoid the problem of non-convergence caused by the ill-conditioned matrix in the process of using the Newton method to determine the model parameters.
  • the method for determining model parameters from step S601 to step S613 is realized by way of secret sharing.
  • Newton's method has a faster convergence rate.
  • the method for determining model parameters from step S601 to step S613, due to the use of Newton's method, can not only protect the data privacy of the parties in the cooperative modeling (the first data party and the second data party), but also reduce the number of optimization adjustments of the model parameters, improving the training efficiency of the data processing model.
  • it may also include a process of iterative optimization and adjustment of the model parameters of the data processing model.
  • Step S601 may be repeatedly executed, the first data party can obtain the first share of the new first product, and the second data party can obtain the second share of the new first product.
  • the new first product is the product of the feature data and the new first model parameter.
  • the first data party, according to the first share of the new first product, and the second data party, according to the second share of the new first product, can communicate based on the confusion circuit corresponding to the excitation function.
  • the first data party can obtain the first share of the value of the new excitation function, and the second data party can obtain the second share of the value of the new excitation function.
  • Step S605 may be repeatedly executed, the first data party can obtain the first share of the new loss function gradient, and the second data party can obtain the second share of the new loss function gradient.
  • Step S607 can be repeated, the first data party can obtain the first share of the new Hessian matrix, and the second data party can obtain the second share of the new Hessian matrix.
  • Step S609 may be repeated, and the first data party can obtain the first share of the new second product, and the second data party can obtain the second share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • Step S611 can be repeated.
  • the first data party can obtain the first share of the new first inverse matrix
  • the second data party can obtain the second share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
  • the first data party, based on the first share of the new first model parameter, the first share of the new first inverse matrix, and the first share of the new loss function gradient, and the second data party, based on the second share of the new first model parameter, the second share of the new first inverse matrix, and the second share of the new loss function gradient, may secretly share the second model parameter.
  • the first data party can obtain a first share of the second model parameter
  • the second data party can obtain a second share of the second model parameter.
  • the sum of the first share of the second model parameter and the second share of the second model parameter is equal to the second model parameter.
  • the first data party may calculate the first share of the second model parameter according to the first share of the new first model parameter, the first share of the new loss function gradient, and the preset step size.
  • the second data party may calculate the second share of the second model parameter according to the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size. The sum of the first share of the second model parameter and the second share of the second model parameter is equal to the second model parameter.
  • the iterative process can be realized through a combination of secret sharing and obfuscation circuits.
  • the combination of secret sharing and obfuscation circuits can not only avoid the out-of-bounds problem, but also reduce the complexity of the data processing model training process.
  • the gradient descent method is used.
  • the following describes a process in which the first data party performs communication based on the first share of the new first product, and the second data party performs communication based on the confusion circuit corresponding to the excitation function according to the second share of the new first product.
  • the corresponding logic circuit can be constructed according to the excitation function.
  • the logic circuit can be constructed by the first data party; alternatively, it can also be constructed by the second data party; or alternatively, it can also be constructed by other devices (for example, a trusted third party).
  • the logic circuit may be composed of at least one arithmetic gate, and the arithmetic gate may include an AND gate, an OR gate, an exclusive OR gate, and so on.
  • the logic circuit may include at least two input lines and at least one output line, and an obfuscated circuit can be obtained by encrypting the input lines and/or output lines of the logic circuit.
  • the confusion circuit may include a confusion truth table of each arithmetic gate in the logic circuit.
  • the logic circuit can be constructed directly according to the excitation function; alternatively, various appropriate modifications can be made to the excitation function, and the logic circuit can be constructed according to the deformed excitation function; alternatively, other functions can be generated on the basis of the excitation function, and the logic circuit can be constructed according to those other functions.
  • the correspondence between the activation function and the confusion circuit can thus be understood as follows: the confusion circuit is generated based on the logic circuit of the activation function, or based on the logic circuit of the deformed activation function, or based on the logic circuit of another function derived from the activation function.
  • Both the first data party and the second data party may have a confusion circuit corresponding to an excitation function.
  • the obfuscation circuit may be generated by the first data party.
  • the first data party may send the generated obfuscation circuit to the second data party.
  • the second data party may receive the obfuscation circuit.
  • the obfuscation circuit may also be generated by the second data party.
  • the second data party may send the generated obfuscation circuit to the first data party.
  • the first data party may receive the obfuscation circuit.
  • the first data party can communicate based on the first share of the new first product
  • the second data party can communicate based on the confusion circuit corresponding to the excitation function according to the second share of the new first product.
  • the first data party and the second data party may respectively obtain a share of the value of the new incentive function.
  • the share obtained by the first data party may be used as the first share of the value of the new incentive function
  • the share obtained by the second data party may be used as the second share of the value of the new incentive function.
  • the sum of the first share of the value of the new excitation function and the second share of the value of the new excitation function is equal to the value of the new excitation function.
  • x_1 is used to represent the first share of the new first product
  • x_2 is used to represent the second share of the new first product
  • x_3 is used to represent one share of the value of the new incentive function (hereinafter referred to as the second share of the value of the new incentive function)
  • the value of f_1(x_1, x_2, x_3) is used to represent the other share of the value of the new incentive function (hereinafter referred to as the first share of the value of the new incentive function).
  • the second data party may generate a share of the value of the new incentive function as the second share.
  • the first data party can use the first share of the new first product as its input to the confusion circuit, and the second data party can use the second share of the new first product and the second share of the value of the new excitation function as its inputs to the confusion circuit, for communication.
  • the first data party may calculate another share of the value of the new excitation function based on the confusion circuit as the first share. For the specific calculation process, please refer to the previous example of the scene introducing the confusion circuit, which will not be detailed here.
  • a piecewise linear function may also be used to fit the excitation function.
  • a corresponding logic circuit can be constructed according to the piecewise linear function, and the confusion circuit can be obtained by encrypting the input line and/or output line of the logic circuit.
  • Both the first data party and the second data party may possess the obfuscated circuit.
  • the activation function may be a Sigmoid function
  • the piecewise linear function is defined by its coefficients; k represents a coefficient of the piecewise linear function.
  • the first data party may perform communication based on the confusion circuit based on the first share of the new first product
  • the second data party may perform communication based on the confusion circuit based on the second share of the new first product.
  • the first data party and the second data party respectively obtain a share of the value of the piecewise linear function.
  • the share obtained by the first data party may be used as the first share of the value of the piecewise linear function
  • the share obtained by the second data party may be used as the second share of the value of the piecewise linear function.
  • the sum of the first share of the value of the piecewise linear function and the second share of the value of the piecewise linear function is equal to the value of the piecewise linear function.
  • the first data party may use the first share of the value of the piecewise linear function as the first share of the value of the new excitation function.
  • the second data party may use the second share of the value of the piecewise linear function as the second share of the value of the new excitation function.
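One concrete piecewise linear fit of the Sigmoid function, in the "hard sigmoid" style, is sketched below. The slope k, the single linear segment, and the clamping to [0, 1] are illustrative assumptions; the text above does not fix the segments or the coefficient.

```python
import numpy as np

def piecewise_sigmoid(x, k=0.25):
    # Linear with slope k near zero, clamped to 0 and 1 outside the central segment.
    return np.clip(0.5 + k * x, 0.0, 1.0)

z = np.linspace(-6, 6, 241)
err = np.abs(piecewise_sigmoid(z) - 1 / (1 + np.exp(-z))).max()
print("max abs fitting error:", err)
```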
  • the first data party and the second data party can cooperate to determine the model parameters of the data processing model without leaking the data held by themselves.
  • This specification also provides another embodiment of the method for determining model parameters.
  • the first data party is the execution subject, and the first data party may hold the share of the feature data and the first model parameter.
  • This embodiment may include the following steps.
  • Step S701: secretly share the first product with the partner according to the feature data and the share of the first model parameter, to obtain the share of the first product; the first product is the product of the feature data and the first model parameter.
  • the cooperating party may be understood as a data party that performs cooperative security modeling with the first data party, and specifically may be the previous second data party.
  • Step S703: secretly share the value of the incentive function with the partner according to the share of the first product, to obtain the share of the value of the incentive function.
  • the first data party may secretly share the value of the polynomial with the partner according to the share of the first product, and obtain the share of the value of the polynomial as the share of the value of the incentive function; the polynomial is used to fit the activation function.
  • Step S705: secretly share the gradient of the loss function and the Hessian matrix with the partner according to the feature data and the share of the value of the incentive function, and obtain the share of the gradient of the loss function and the share of the Hessian matrix respectively.
  • Step S707: secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, to obtain the share of the second product; the second product is the product of the random orthogonal matrix and the Hessian matrix.
  • Step S709: when the condition number of the second product satisfies the preset condition, secretly share the first inverse matrix with the partner according to the share of the Hessian matrix, to obtain the share of the first inverse matrix; the first inverse matrix is the inverse matrix of the Hessian matrix.
  • the preset condition may include: the number of conditions is less than or equal to a preset threshold.
  • the condition number of the second product can be calculated by the first data party and/or the partner.
  • the condition number of the second product is equal to the condition number of the Hessian matrix.
  • when the condition number of the second product satisfies the preset condition, the degree of ill-conditioning of the second product is small, and Newton's method can be used to determine the model parameters.
  • the first data party can secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix.
  • Step S711: secretly share the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain the share of the new first model parameter.
  • the first data party may secretly share the third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient to obtain the share of the third product.
  • the third product may be a product between the first inverse matrix and the gradient of the loss function.
  • the first data party may subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  • if the condition number of the second product does not satisfy the preset condition, the second product is ill-conditioned and the model parameters cannot be determined using Newton's method; therefore, the gradient descent method can be used instead of Newton's method to determine the model parameters.
  • the first data party may calculate the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and the preset step length. Specifically, the first data party may multiply the share of the loss function gradient by the preset step length to obtain the fourth product, and may subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
  • it may also include a process of iterative optimization and adjustment of the model parameters of the data processing model.
  • the first data party may repeat step S701 to obtain a new share of the first product.
  • the first data party may communicate with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the new incentive function.
  • the first data party may repeat step S705 to obtain the share of the new loss function gradient and the share of the new Hessian matrix; may repeat step S707 to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the first data party may repeat step S709 to obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
  • the first data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, to obtain the share of the second model parameter.
  • the first data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step length.
  • the first data party can cooperate with the partner to determine the model parameters of the data processing model without revealing the data held by itself.
  • This specification also provides an embodiment of another method for determining model parameters.
  • the second data party is the execution subject, and the second data party may hold the share of the tag and the first model parameter.
  • This embodiment may include the following steps.
  • Step S801: secretly share the first product with the partner according to the share of the first model parameter, to obtain the share of the first product; the first product is the product of the feature data and the first model parameter.
  • the cooperating party may be understood as a data party that performs cooperative security modeling with the second data party, and specifically may be the previous first data party.
  • Step S803: secretly share the value of the incentive function with the partner according to the share of the first product, to obtain the share of the value of the incentive function.
  • the second data party may secretly share the value of the polynomial with the partner according to the share of the first product, and obtain the share of the value of the polynomial as the share of the value of the incentive function; the polynomial is used to fit the activation function.
  • Step S805: secretly share the gradient of the loss function with the partner according to the label and the share of the value of the incentive function, to obtain the share of the gradient of the loss function; secretly share the Hessian matrix with the partner according to the share of the value of the incentive function, to obtain the share of the Hessian matrix.
  • Step S807: secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, to obtain the share of the second product; the second product is the product between the random orthogonal matrix and the Hessian matrix.
  • Step S809: when the condition number of the second product satisfies the preset condition, secretly share the first inverse matrix with the partner according to the share of the Hessian matrix, to obtain the share of the first inverse matrix; the first inverse matrix is the inverse matrix of the Hessian matrix.
  • the preset condition may include: the number of conditions is less than or equal to a preset threshold.
  • the condition number of the second product can be calculated by the second data party and/or the partner.
  • the condition number of the second product is equal to the condition number of the Hessian matrix.
  • when the condition number of the second product satisfies the preset condition, the degree of ill-conditioning of the second product is small, and Newton's method can be used to determine the model parameters.
  • the second data party can secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix.
  • Step S811: secretly share the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain the share of the new first model parameter.
  • the second data party may secretly share the third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient to obtain the share of the third product.
  • the third product may be a product between the first inverse matrix and the gradient of the loss function.
  • the second data party may subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  • if the condition number of the second product does not satisfy the preset condition, the second product is ill-conditioned and the model parameters cannot be determined using Newton's method; therefore, the gradient descent method can be used instead of Newton's method to determine the model parameters.
  • the second data party may calculate the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and the preset step length. Specifically, the second data party may multiply the share of the loss function gradient by the preset step length to obtain the fourth product, and may subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
  • it may also include a process of iterative optimization and adjustment of the model parameters of the data processing model.
  • the second data party may repeat step S801 to obtain a new share of the first product.
  • the second data party may communicate with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function to obtain the share of the value of the new incentive function.
  • the second data party may repeat step S805 to obtain the share of the new loss function gradient and the share of the new Hessian matrix; step S807 may be repeated to obtain the share of the new second product.
  • the new second product is the product of the random orthogonal matrix and the new Hessian matrix.
  • the second data party may repeat step S809 to obtain the share of the new first inverse matrix.
  • the new first inverse matrix is the inverse of the new Hessian matrix.
  • the second data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, to obtain the share of the second model parameter.
  • the second data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step length.
  • the second data party can cooperate with the partner to determine the model parameters of the data processing model without leaking the data held by itself.
  • The model parameter determination device of the embodiments of this specification will be described in detail below in conjunction with FIG. 12 and FIG. 13.
  • This specification also provides an embodiment of the device for determining model parameters.
  • This embodiment can be applied to the first data party and can include the following units.
  • the first secret sharing unit 901 is configured to secretly share the first product with the partner according to the share of the feature data and the first model parameter to obtain the share of the first product, and the first product is the product of the feature data and the first model parameter ;
  • the second secret sharing unit 903 is configured to secretly share the value of the incentive function with the partner according to the share of the first product to obtain the share of the value of the incentive function;
  • the third secret sharing unit 905 is used to secretly share the gradient of the loss function and the Hessian matrix with the partner according to the share of the feature data and the value of the incentive function, and obtain the share of the gradient of the loss function and the Hessian matrix respectively;
  • the fourth secret sharing unit 907 is configured to secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix;
  • the fifth secret sharing unit 909 is configured to, when the condition number of the second product meets the preset condition, secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
  • the sixth secret sharing unit 911 is used to secretly share the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter to obtain the share of the new first model parameter ;
  • the obfuscation circuit unit 913 is configured to repeatedly execute the step of secretly sharing the first product; to communicate with the partner according to the share of the new first product and the obfuscation circuit corresponding to the incentive function to obtain the share of the value of the new incentive function; and to repeat the step of secretly sharing the gradient of the loss function and the Hessian matrix and the step of secretly sharing the second product;
  • the calculation unit 915 is configured to, when the condition number of the new second product does not meet the preset condition, calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • This specification also provides another embodiment of the device for determining model parameters. Refer to Figure 13. This embodiment can be applied to the second data party and can include the following units.
  • the first secret sharing unit 1001 is configured to secretly share the first product with the partner according to the share of the first model parameter to obtain the share of the first product, where the first product is the product of the feature data and the first model parameter;
  • the second secret sharing unit 1003 is configured to secretly share the value of the incentive function with the partner according to the share of the first product to obtain the share of the value of the incentive function;
  • the third secret sharing unit 1005 is used to secretly share the gradient of the loss function and the Hessian matrix with the partner according to the value of the incentive function to obtain the share of the gradient of the loss function and the Hessian matrix respectively;
  • the fourth secret sharing unit 1007 is used to secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix to obtain the share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix;
  • the fifth secret sharing unit 1009 is configured to, when the condition number of the second product meets the preset condition, secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain the share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
  • the sixth secret sharing unit 1011 is used to secretly share the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter to obtain the share of the new first model parameter ;
  • the obfuscation circuit unit 1013 is configured to repeatedly execute the step of secretly sharing the first product; to communicate with the partner according to the share of the new first product and the obfuscation circuit corresponding to the incentive function to obtain the share of the value of the new incentive function; and to repeat the step of secretly sharing the gradient of the loss function, the step of secretly sharing the Hessian matrix, and the step of secretly sharing the second product;
  • the calculation unit 1015 is configured to, when the condition number of the new second product does not meet the preset condition, calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
  • FIG. 14 is a schematic diagram of the hardware structure of an electronic device in this embodiment.
  • the electronic device may include one or more (only one is shown in the figure) processor, memory, and transmission module.
  • the processor, the memory, and the transmission module may each be of any suitable type.
  • the hardware structure shown in FIG. 14 is only for illustration, and it does not limit the hardware structure of the above electronic device.
  • the electronic device may also include more or less component units than shown in FIG. 14; or, have a different configuration from that shown in FIG. 14.
  • the memory may include a high-speed random access memory; or, it may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the storage may also include a remotely set network storage.
  • the remotely set network storage can be connected to the electronic device through a network such as the Internet, an intranet, a local area network, a mobile communication network, and the like.
  • the memory may be used to store program instructions or modules of application software, such as the program instructions or modules of the embodiments corresponding to FIG. 5, FIG. 6, FIG. 10, and/or FIG. 11 of this specification.
  • the processor can be implemented in any suitable way.
  • the processor may take the form of, for example, a microprocessor, or a processor together with a computer-readable medium storing computer-readable program code (for example, software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on.
  • the processor can read and execute program instructions or modules in the memory.
  • the transmission module can be used for data transmission via a network, for example, data transmission via a network such as the Internet, an intranet, a local area network, a mobile communication network, and the like.
  • a Programmable Logic Device (PLD)
  • a Field Programmable Gate Array (FPGA)
  • a Hardware Description Language (HDL)
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cell phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or Any combination of these devices.
  • This specification can be used in many general-purpose or special-purpose computer system environments or configurations.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • This specification can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network.
  • program modules can be located in local and remote computer storage media including storage devices.


Abstract

A method and apparatus for determining a model parameter, and an electronic device. The method comprises: communicating with a cooperation party according to a share of a first product and a garbled circuit corresponding to an activation function to obtain a share of a value of the activation function; secretly sharing a gradient of a loss function with the cooperation party according to feature data and the share of the value of the activation function to obtain a share of the gradient of the loss function; secretly sharing a Hessian matrix with the cooperation party according to the feature data and the share of the value of the activation function to obtain a share of the Hessian matrix; secretly sharing a first inverse matrix with the cooperation party according to the share of the Hessian matrix to obtain a share of the first inverse matrix; and secretly sharing a new first model parameter with the cooperation party according to a share of a first model parameter, the share of the first inverse matrix, and the share of the gradient of the loss function to obtain a share of the new first model parameter. A model parameter of a data processing model can be determined by the collaboration of multiple parties on the premise of protecting data privacy.

Description

Model parameter determination method, device and electronic equipment
This application claims priority to Chinese patent application No. 201910735439.1, filed on August 9, 2019 and entitled "Model parameter determination method, device and electronic equipment"; Chinese patent application No. 201910734784.3, filed on August 9, 2019 and entitled "Model parameter determination method, device and electronic equipment"; Chinese patent application No. 201910735442.3, filed on August 9, 2019 and entitled "Model parameter determination method, device and electronic equipment"; and Chinese patent application No. 201910735421.1, filed on August 9, 2019 and entitled "Model parameter determination method, device and electronic equipment", the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this specification relate to the field of computer technology, and in particular to a model parameter determination method and device, and an electronic device.
Background
In the era of big data, there are a great many data silos. Data is usually scattered across different enterprises, and because of competition and privacy concerns, enterprises do not fully trust one another. In some cases, enterprises need to carry out cooperative security modeling, so that a data processing model can be trained collaboratively on the parties' data while fully protecting the privacy of enterprise data.
In the process of collaboratively training a data processing model, a model parameter optimization method can be used to repeatedly optimize and adjust the model parameters of the data processing model. Since the data used to train the data processing model is scattered among the parties to the cooperative modeling, how to collaboratively determine the model parameters of the data processing model while protecting data privacy is a technical problem that urgently needs to be solved.
Summary of the invention
The purpose of the embodiments of this specification is to provide a model parameter determination method and device, and an electronic device, so that multiple parties can collaboratively determine the model parameters of a data processing model while protecting data privacy.
To achieve the above purpose, the technical solutions provided by one or more embodiments of this specification are as follows.
According to a first aspect of one or more embodiments of this specification, a model parameter determination method is provided, applied to a first data party, including: communicating with a partner according to a share of a first product and a garbled circuit corresponding to an activation function to obtain a share of a value of the activation function, the first product being the product of feature data and a first model parameter; secretly sharing a gradient of a loss function with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the gradient of the loss function; secretly sharing a Hessian matrix with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the Hessian matrix; secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and secretly sharing a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the gradient of the loss function, to obtain a share of the new first model parameter.
According to a second aspect of one or more embodiments of this specification, a model parameter determination method is provided, applied to a second data party, including: communicating with a partner according to a share of a first product and a garbled circuit corresponding to an activation function to obtain a share of a value of the activation function, the first product being the product of feature data and a first model parameter; secretly sharing a gradient of a loss function with the partner according to labels and the share of the value of the activation function, to obtain a share of the gradient of the loss function; secretly sharing a Hessian matrix with the partner according to the share of the value of the activation function, to obtain a share of the Hessian matrix; secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and secretly sharing a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the gradient of the loss function, to obtain a share of the new first model parameter.
According to a third aspect of one or more embodiments of this specification, a model parameter determination method is provided, applied to a first data party, including: secretly sharing a first product with a partner according to feature data and a share of a first model parameter, to obtain a share of the first product, the first product being the product of the feature data and the first model parameter; secretly sharing a value of an activation function with the partner according to the share of the first product, to obtain a share of the value of the activation function; secretly sharing a gradient of a loss function and a Hessian matrix with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the gradient of the loss function and a share of the Hessian matrix; secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix; when the condition number of the second product satisfies a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; secretly sharing a new first model parameter with the partner according to the share of the first inverse matrix, the share of the gradient of the loss function, and the share of the first model parameter, to obtain a share of the new first model parameter; iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function, to obtain a share of a new value of the activation function; iteratively performing the steps of secretly sharing the gradient of the loss function and the Hessian matrix, to obtain a share of a new gradient of the loss function and a share of a new Hessian matrix; iteratively performing the step of secretly sharing the second product, to obtain a share of a new second product; and, when the condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new gradient of the loss function, and a preset step size.
According to a fourth aspect of one or more embodiments of this specification, a model parameter determination method is provided, applied to a second data party, including: secretly sharing a first product with a partner according to a share of a first model parameter, to obtain a share of the first product, the first product being the product of feature data and the first model parameter; secretly sharing a value of an activation function with the partner according to the share of the first product, to obtain a share of the value of the activation function; secretly sharing a gradient of a loss function with the partner according to labels and the share of the value of the activation function, to obtain a share of the gradient of the loss function; secretly sharing a Hessian matrix with the partner according to the share of the value of the activation function, to obtain a share of the Hessian matrix; secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix; when the condition number of the second product satisfies a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; secretly sharing a new first model parameter with the partner according to the share of the first inverse matrix, the share of the gradient of the loss function, and the share of the first model parameter, to obtain a share of the new first model parameter; iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function, to obtain a share of a new value of the activation function; iteratively performing the step of secretly sharing the gradient of the loss function, to obtain a share of a new gradient of the loss function; iteratively performing the step of secretly sharing the Hessian matrix, to obtain a share of a new Hessian matrix; iteratively performing the step of secretly sharing the second product, to obtain a share of a new second product; and, when the condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new gradient of the loss function, and a preset step size.
According to a fifth aspect of one or more embodiments of this specification, a model parameter determination device is provided, applied to a first data party, including: an activation function value share acquisition unit, configured to communicate with a partner according to a share of a first product and a garbled circuit corresponding to an activation function, to obtain a share of a value of the activation function, the first product being the product of feature data and a first model parameter; a loss function gradient share acquisition unit, configured to secretly share a gradient of a loss function with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the gradient of the loss function; a Hessian matrix share acquisition unit, configured to secretly share a Hessian matrix with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the Hessian matrix; a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and a model parameter share acquisition unit, configured to secretly share a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the gradient of the loss function, to obtain a share of the new first model parameter.
According to a sixth aspect of one or more embodiments of this specification, a model parameter determination device is provided, applied to a second data party, including: an activation function value share acquisition unit, configured to communicate with a partner according to a share of a first product and a garbled circuit corresponding to an activation function, to obtain a share of a value of the activation function, the first product being the product of feature data and a first model parameter; a loss function gradient share acquisition unit, configured to secretly share a gradient of a loss function with the partner according to labels and the share of the value of the activation function, to obtain a share of the gradient of the loss function; a Hessian matrix share acquisition unit, configured to secretly share a Hessian matrix with the partner according to the share of the value of the activation function, to obtain a share of the Hessian matrix; a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; and a model parameter share acquisition unit, configured to secretly share a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the gradient of the loss function, to obtain a share of the new first model parameter.
According to a seventh aspect of one or more embodiments of this specification, a model parameter determination device is provided, applied to a first data party, including: a first secret sharing unit, configured to secretly share a first product with a partner according to feature data and a share of a first model parameter, to obtain a share of the first product, the first product being the product of the feature data and the first model parameter; a second secret sharing unit, configured to secretly share a value of an activation function with the partner according to the share of the first product, to obtain a share of the value of the activation function; a third secret sharing unit, configured to secretly share a gradient of a loss function and a Hessian matrix with the partner according to the feature data and the share of the value of the activation function, to obtain a share of the gradient of the loss function and a share of the Hessian matrix; a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix; a fifth secret sharing unit, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; a sixth secret sharing unit, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the gradient of the loss function, and the share of the first model parameter, to obtain a share of the new first model parameter; and an iteration unit, configured to iteratively perform the step of secretly sharing the first product to obtain a share of a new first product, communicate with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function to obtain a share of a new value of the activation function, iteratively perform the steps of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new gradient of the loss function and a share of a new Hessian matrix, iteratively perform the step of secretly sharing the second product to obtain a share of a new second product, and, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new gradient of the loss function, and a preset step size.
According to an eighth aspect of one or more embodiments of this specification, a model parameter determination device is provided, applied to a second data party, including: a first secret sharing unit, configured to secretly share a first product with a partner according to a share of a first model parameter, to obtain a share of the first product, the first product being the product of feature data and the first model parameter; a second secret sharing unit, configured to secretly share a value of an activation function with the partner according to the share of the first product, to obtain a share of the value of the activation function; a third secret sharing unit, configured to secretly share a gradient of a loss function with the partner according to labels and the share of the value of the activation function, to obtain a share of the gradient of the loss function, and to secretly share a Hessian matrix with the partner according to the share of the value of the activation function, to obtain a share of the Hessian matrix; a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix; a fifth secret sharing unit, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix; a sixth secret sharing unit, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the gradient of the loss function, and the share of the first model parameter, to obtain a share of the new first model parameter; and an iteration unit, configured to iteratively perform the step of secretly sharing the first product to obtain a share of a new first product, communicate with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function to obtain a share of a new value of the activation function, iteratively perform the step of secretly sharing the gradient of the loss function to obtain a share of a new gradient of the loss function, iteratively perform the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix, iteratively perform the step of secretly sharing the second product to obtain a share of a new second product, and, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new gradient of the loss function, and a preset step size.
According to a ninth aspect of one or more embodiments of this specification, an electronic device is provided, including: a memory configured to store computer instructions; and a processor configured to execute the computer instructions to implement the method steps of the first aspect, the second aspect, the third aspect or the fourth aspect.
It can be seen from the technical solutions provided by the above embodiments that, in the embodiments of this specification, the first data party and the second data party can collaboratively determine the model parameters of a data processing model without leaking the data they each hold.
Description of the drawings
In order to describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments recorded in this specification; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a logic circuit according to an embodiment of this specification;
Fig. 2 is a schematic diagram of a model parameter determination system according to an embodiment of this specification;
Fig. 3 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 4 is a schematic diagram of computation based on a garbled circuit according to an embodiment of this specification;
Fig. 5 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 6 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 7 is a schematic diagram of the functional structure of a model parameter determination device according to an embodiment of this specification;
Fig. 8 is a schematic diagram of the functional structure of a model parameter determination device according to an embodiment of this specification;
Fig. 9 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 10 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 11 is a flowchart of a model parameter determination method according to an embodiment of this specification;
Fig. 12 is a schematic diagram of the functional structure of a model parameter determination device according to an embodiment of this specification;
Fig. 13 is a schematic diagram of the functional structure of a model parameter determination device according to an embodiment of this specification;
Fig. 14 is a schematic diagram of the functional structure of an electronic device according to an embodiment of this specification.
Detailed description
The technical solutions in the embodiments of this specification will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of this specification, not all of them. Based on the embodiments in this specification, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this specification.
Secure multi-party computation (MPC) is a class of algorithms for protecting data privacy and security. Secure multi-party computation allows the multiple data parties participating in a computation to compute collaboratively without exposing their own data.
Secret sharing (SS) is an algorithm for protecting data privacy and security, and can be used to implement secure multi-party computation. Specifically, multiple data parties can use a secret sharing algorithm to compute collaboratively and obtain secret information without leaking their own data. Each data party obtains one share of the secret information; a single data party cannot recover the secret information, and the secret information can only be recovered when multiple data parties work together. For example, data party $P_1$ holds data $x_1$, and data party $P_2$ holds data $x_2$. Using a secret sharing algorithm, $P_1$ and $P_2$ can compute collaboratively to obtain the secret information $y = y_1 + y_2 = x_1 x_2$. After the computation, $P_1$ obtains the share $y_1$ of the secret information $y$, and $P_2$ obtains the share $y_2$.
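To make the sharing pattern concrete, the following Python sketch realizes the example above with additive secret sharing over a prime field and a Beaver multiplication triple, which is one standard way to share a product. It is only a minimal local simulation of the two parties: the modulus, the dealer that prepares the triple, and all function names are illustrative assumptions rather than the scheme prescribed by this specification.

```python
import secrets

P = 2**61 - 1  # a prime modulus; additive shares live in Z_P (illustrative choice)

def share(x):
    """Split x into two additive shares: x = x0 + x1 (mod P)."""
    x0 = secrets.randbelow(P)
    return x0, (x - x0) % P

# A trusted dealer (here, plain local code) prepares a Beaver triple c = a * b.
a0, a1 = share(secrets.randbelow(P))
b0, b1 = share(secrets.randbelow(P))
c0, c1 = share(((a0 + a1) % P) * ((b0 + b1) % P) % P)

def beaver_mul(x_shares, y_shares):
    """Each party masks its input with the triple, the masked values are
    opened, and each party locally derives one share of x * y."""
    (x0, x1), (y0, y1) = x_shares, y_shares
    # The parties open e = x - a and f = y - b (uniformly masked, so these
    # openings reveal nothing about x and y).
    e = (x0 - a0 + x1 - a1) % P
    f = (y0 - b0 + y1 - b1) % P
    z0 = (c0 + e * b0 + f * a0 + e * f) % P  # party P1's share of x * y
    z1 = (c1 + e * b1 + f * a1) % P          # party P2's share of x * y
    return z0, z1

# P1 holds x1, P2 holds x2; each first shares its own input with the other.
y0, y1 = beaver_mul(share(123456), share(789))
assert (y0 + y1) % P == (123456 * 789) % P  # y = y1 + y2 = x1 * x2
```

The identity behind the last two local computations is $xy = c + eb + fa + ef$ with $e = x - a$ and $f = y - b$, so the two shares always sum to the product while each opened value stays uniformly random.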
A garbled circuit is a secure computation protocol for protecting data privacy, and can be used to implement secure multi-party computation. Specifically, a given computation task (for example, a function) can be converted into a logic circuit composed of at least one operation gate, such as an AND gate, an OR gate or an XOR gate. The logic circuit includes at least two input wires and at least one output wire, and a garbled circuit is obtained by encrypting the input wires and/or output wires of the logic circuit. Multiple data parties can use the garbled circuit to compute collaboratively without leaking their own data, and obtain the execution result of the computation task.
Oblivious transfer (OT) is a privacy-preserving two-party communication protocol that lets the two parties transfer data in a choice-obscuring way. The sender may hold multiple pieces of data; via oblivious transfer, the receiver obtains one or more of them. In this process, the sender does not learn which data the receiver received, and the receiver cannot obtain any data other than what it received. The oblivious transfer protocol is a basic building block of garbled circuits, and is typically used in collaborative computation based on garbled circuits.
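The following Python sketch illustrates a 1-out-of-2 oblivious transfer in the style of the Chou-Orlandi construction, which is one standard way to build OT; it is not the protocol of this specification. The toy group parameters and the hash-based encryption are illustrative assumptions, far too weak for real use, chosen only to make the arithmetic runnable and checkable.

```python
import hashlib, secrets

# Toy group parameters (illustrative; correctness does not depend on them).
p = 2**127 - 1
g = 5

def H(x: int) -> bytes:
    return hashlib.sha256(x.to_bytes(16, "big")).digest()

def xor(k: bytes, m: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(k, m))

# --- Sender holds two 32-byte messages m0, m1 ---
m0 = b"secret message 0................"
m1 = b"secret message 1................"
a = secrets.randbelow(p - 2) + 1
A = pow(g, a, p)                      # sender -> receiver

# --- Receiver holds a choice bit c and will learn only m_c ---
c = 1
b = secrets.randbelow(p - 2) + 1
B = pow(g, b, p) if c == 0 else (A * pow(g, b, p)) % p   # receiver -> sender
k_c = H(pow(A, b, p))                 # receiver's key for its chosen message

# --- Sender derives one key per message; only one equals k_c ---
k0 = H(pow(B, a, p))
k1 = H(pow((B * pow(A, -1, p)) % p, a, p))
e0, e1 = xor(k0, m0), xor(k1, m1)     # sender -> receiver

# --- Receiver can decrypt only the chosen ciphertext ---
out = xor(k_c, e1 if c == 1 else e0)
assert out == (m1 if c == 1 else m0)
```

The sender never sees the choice bit (B is uniformly distributed either way), and the receiver ends up with exactly one of the two keys, which is the defining behavior of oblivious transfer.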
The following describes an example application scenario of garbled circuits.
Data party $P_1$ holds data $x_1$ and data $x_3$, and data party $P_2$ holds data $x_2$. The function $y = f(x_1, x_2, x_3) = x_1 x_2 x_3$ can be expressed as the logic circuit shown in Fig. 1. The logic circuit is composed of AND gate 1 and AND gate 2, and includes input wire a, input wire b, input wire d, output wire c and output wire s.
The following describes the process by which data party $P_1$ generates the garbled truth table of AND gate 1.
The truth table corresponding to AND gate 1 can be as shown in Table 1.
Table 1

  a  b  c
  0  0  0
  0  1  0
  1  0  0
  1  1  1
Data party $P_1$ can generate two random numbers $k_a^0$ and $k_a^1$, corresponding respectively to the two input values 0 and 1 of input wire a; two random numbers $k_b^0$ and $k_b^1$, corresponding respectively to the two input values 0 and 1 of input wire b; and two random numbers $k_c^0$ and $k_c^1$, corresponding respectively to the two output values 0 and 1 of output wire c. This yields the randomized truth table shown in Table 2.

Table 2

  $k_a^0$   $k_b^0$   $k_c^0$
  $k_a^0$   $k_b^1$   $k_c^0$
  $k_a^1$   $k_b^0$   $k_c^0$
  $k_a^1$   $k_b^1$   $k_c^1$

Data party $P_1$ can use the random numbers $k_a^0$ and $k_b^0$ as keys to encrypt the random number $k_c^0$, obtaining the ciphertext $E_{k_a^0,k_b^0}(k_c^0)$; use the random numbers $k_a^0$ and $k_b^1$ as keys to encrypt the random number $k_c^0$, obtaining the ciphertext $E_{k_a^0,k_b^1}(k_c^0)$; use the random numbers $k_a^1$ and $k_b^0$ as keys to encrypt the random number $k_c^0$, obtaining the ciphertext $E_{k_a^1,k_b^0}(k_c^0)$; and use the random numbers $k_a^1$ and $k_b^1$ as keys to encrypt the random number $k_c^1$, obtaining the ciphertext $E_{k_a^1,k_b^1}(k_c^1)$. This yields the encrypted randomized truth table shown in Table 3.
Table 3

  $E_{k_a^0,k_b^0}(k_c^0)$
  $E_{k_a^0,k_b^1}(k_c^0)$
  $E_{k_a^1,k_b^0}(k_c^0)$
  $E_{k_a^1,k_b^1}(k_c^1)$

Data party $P_1$ can shuffle the order of the rows in Table 3 to obtain the garbled truth table shown in Table 4.

Table 4

  (the four ciphertexts of Table 3, arranged in a randomly permuted order)
Data party $P_1$ can also generate the garbled truth table of AND gate 2; the specific process is similar to that of generating the garbled truth table of AND gate 1 and is not detailed here.
Data party $P_1$ can send the garbled truth tables of AND gate 1 and AND gate 2 to data party $P_2$, and data party $P_2$ can receive them.
Data party $P_1$ can send to data party $P_2$ the random number corresponding to each bit of data $x_1$ on input wire a, and the random number corresponding to each bit of data $x_3$ on input wire d. Data party $P_2$ can receive the random numbers corresponding to the bits of $x_1$ and $x_3$. For example, let $x_1 = b_0 \times 2^0 + b_1 \times 2^1 + \dots + b_i \times 2^i + \dots$. For the $i$-th bit $b_i$ of $x_1$: when the value of $b_i$ is 0, data party $P_1$ can send the random number $k_a^0$ corresponding to $b_i$ on input wire a to data party $P_2$; when the value of $b_i$ is 1, data party $P_1$ can send the random number $k_a^1$ corresponding to $b_i$ on input wire a to data party $P_2$.

Data party $P_1$ can take the random numbers $k_b^0$ and $k_b^1$, which correspond respectively to the two input values 0 and 1 of input wire b, as its input, and data party $P_2$ can take each bit of data $x_2$ as its input; the two parties then perform oblivious transfer, through which data party $P_2$ obtains the random number corresponding to each bit of $x_2$. Specifically, for each bit of $x_2$, data party $P_1$ can use the random numbers $k_b^0$ and $k_b^1$ as the secret information input into the oblivious transfer, and data party $P_2$ can use that bit as the choice information input into the oblivious transfer. Through the oblivious transfer, data party $P_2$ obtains the random number corresponding to that bit on input wire b: when the value of the bit is 0, $P_2$ obtains the random number $k_b^0$; when the value of the bit is 1, $P_2$ obtains the random number $k_b^1$. By the properties of oblivious transfer, data party $P_1$ does not learn which random number data party $P_2$ selected, and data party $P_2$ cannot learn any random number other than the one it selected.
Through the above process, data party $P_2$ has obtained the random numbers corresponding to each bit of data $x_1$, data $x_2$ and data $x_3$. Data party $P_2$ can then use the random number corresponding to each bit of $x_1$ on input wire a, together with the random number corresponding to the matching bit of $x_2$ on input wire b, to try to decrypt the four ciphertexts in the garbled truth table of AND gate 1; $P_2$ can successfully decrypt exactly one of them, thereby obtaining a random number on output wire c. Next, data party $P_2$ can use the random number corresponding to the matching bit of $x_3$ on input wire d, together with the decrypted random number on output wire c, to try to decrypt the four ciphertexts in the garbled truth table of AND gate 2; again $P_2$ can successfully decrypt exactly one of them, obtaining a random number on output wire s. Data party $P_2$ can send the decrypted random number on output wire s to data party $P_1$. Data party $P_1$ can receive this random number and, from the correspondence between random numbers and output values, obtain the output value of output wire s.
Each output value of output wire s can be regarded as one bit of the value of the function $y = f(x_1, x_2, x_3) = x_1 x_2 x_3$. Data party $P_1$ can thus determine the value of the function from the multiple output values of output wire s.
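The garble-and-evaluate flow described above can be summarized in the following Python sketch for a single bit of each input. The XOR-with-hash encryption, the zero tag used to recognize the row that decrypts, and the label length are illustrative assumptions; the sketch mirrors the structure of the example (two AND gates and wires a, b, c, d, s), not the exact scheme of this specification.

```python
import hashlib, secrets

KEYLEN = 16
TAG = b"\x00" * KEYLEN   # recognizable tag so the evaluator can tell which row decrypts

def prf(k1: bytes, k2: bytes) -> bytes:
    # Hash two wire labels into a 32-byte pad (an illustrative stand-in for
    # the encryption E used in the tables above).
    return hashlib.sha256(k1 + k2).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def wire():
    # Two random labels per wire, for bit values 0 and 1.
    return [secrets.token_bytes(KEYLEN), secrets.token_bytes(KEYLEN)]

def garble_and(in1, in2, out):
    # Encrypt the output label of an AND gate under each pair of input
    # labels, then shuffle the four rows: the garbled truth table.
    rows = [xor(prf(in1[u], in2[v]), out[u & v] + TAG)
            for u in (0, 1) for v in (0, 1)]
    secrets.SystemRandom().shuffle(rows)
    return rows

def evaluate(rows, k1, k2):
    # Try all four rows; exactly one decrypts to label || TAG.
    pad = prf(k1, k2)
    for row in rows:
        plain = xor(pad, row)
        if plain[KEYLEN:] == TAG:
            return plain[:KEYLEN]
    raise ValueError("no row decrypted")

# Wires a (x1), b (x2), d (x3), internal wire c and output wire s.
wa, wb, wd, wc, ws = wire(), wire(), wire(), wire(), wire()
gate1 = garble_and(wa, wb, wc)   # c = a AND b
gate2 = garble_and(wc, wd, ws)   # s = c AND d

x1, x2, x3 = 1, 1, 0             # one bit of each party's input
kc = evaluate(gate1, wa[x1], wb[x2])
ks = evaluate(gate2, kc, wd[x3])
assert ks == ws[x1 & x2 & x3]    # P1 maps the label back to an output bit
```

In the actual protocol, $P_2$ would receive the labels for $x_1$ and $x_3$ from $P_1$ directly, obtain the label for $x_2$ via oblivious transfer, and only $P_1$ could map the final label of wire s back to an output bit.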
A loss function can be used to measure the degree of inconsistency between the predicted value of a data processing model and the true value. The smaller the value of the loss function, the more robust the data processing model. Loss functions include, but are not limited to, the logarithmic loss function and the square loss function.
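For example, for a label $y \in \{0, 1\}$ and a predicted value $\hat{y}$, the logarithmic loss commonly used with logistic regression is

$$ L(y, \hat{y}) = -\big[\, y \log \hat{y} + (1-y) \log(1-\hat{y}) \,\big]. $$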
The Hessian matrix is the square matrix formed by the second-order partial derivatives of the loss function, and is used to express the local curvature of the loss function.
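In symbols, for model parameters $w_1, \dots, w_n$ the Hessian matrix has entries

$$ H_{ij} = \frac{\partial^2 L}{\partial w_i \, \partial w_j}. $$

For logistic regression, for instance, it takes the form $H = X^{T}\,\mathrm{diag}\big(a \odot (1-a)\big)\,X$, where $a$ is the vector of activation (Sigmoid) values on $XW$; this is consistent with the steps below, in which the shares of the feature data and of the activation values suffice to secretly share the Hessian matrix.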
An activation function can be used to construct a data processing model. An activation function defines the output for a given input, and is usually a nonlinear function. Through the activation function, nonlinear factors can be added to the data processing model, improving the model's expressive power. Activation functions may include the Sigmoid function, the Tanh function, the ReLU function and so on. Data processing models may include logistic regression models, neural network models and so on.
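For reference, the three activation functions named above are

$$ \mathrm{Sigmoid}(z) = \frac{1}{1+e^{-z}}, \qquad \mathrm{Tanh}(z) = \frac{e^{z}-e^{-z}}{e^{z}+e^{-z}}, \qquad \mathrm{ReLU}(z) = \max(0, z). $$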
In the cooperative security modeling scenario, out of consideration for protecting data privacy, multiple data parties can collaboratively train a data processing model based on the data they each hold, without leaking that data. The data processing model includes, but is not limited to, logistic regression models and neural network models. In the process of training the data processing model, a model parameter optimization method can be used to optimize and adjust the model parameters. Model parameter optimization methods may include the gradient descent method, Newton's method, and so on. Newton's method here may include the original Newton's method as well as its various variants (such as the damped Newton method and the regularized Newton method; the regularized Newton method refers to Newton's method with a regularization term, and regularization can reduce the complexity and instability of the model, thereby reducing the risk of overfitting). The model parameter optimization method may be implemented using secret sharing alone, or using a combination of secret sharing and garbled circuits.
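In plaintext form, the two optimization methods mentioned above update a model parameter $W$ from the gradient $dW$ of the loss function and, for Newton's method, the Hessian matrix $H$, with $G$ a preset step size; the embodiments below compute exactly these updates, but on secret shares:

$$ \text{gradient descent:}\quad W' = W - G\,dW, \qquad \text{Newton's method:}\quad W' = W - H^{-1}\,dW. $$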
This specification provides an embodiment of a model parameter determination system.
Please refer to Fig. 2. In this embodiment, the model parameter determination system may include a first data party, a second data party, and a trusted third party (TTP).
The third party may be a single server, or a server cluster including multiple servers. The third party is used to provide random numbers to the first data party and the second data party. Specifically, the third party can generate a random orthogonal matrix, split each random number in the random orthogonal matrix into two shares, and take one of them as the first share and the other as the second share. The third party can take the matrix formed by the first shares of the random numbers in the random orthogonal matrix as the first share of the random orthogonal matrix, and the matrix formed by the second shares as the second share of the random orthogonal matrix; it can send the first share of the random orthogonal matrix to the first data party, and the second share to the second data party. The sum of the first share and the second share of the random orthogonal matrix equals the random orthogonal matrix. The random orthogonal matrix is, on the one hand, a random number matrix composed of random numbers, and on the other hand an orthogonal matrix. Multiplying a square matrix by an orthogonal matrix yields a new matrix with the same condition number as the original square matrix. This allows the first data party and the second data party to collaboratively compute the condition number of the Hessian matrix without leaking the shares of the Hessian matrix they each hold, and then to use the condition number to measure how ill-conditioned the Hessian matrix is. The specific process is detailed in the following embodiments.
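The following numpy sketch illustrates the two properties of the third party's matrix that this paragraph relies on: the matrix splits into two additive shares, and multiplication by it preserves the condition number. Generating $R$ via a QR decomposition of a Gaussian matrix is an illustrative choice, not the generation method of this specification.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Third party: sample a random orthogonal matrix R (QR of a Gaussian matrix
# is one standard way to do this).
R, _ = np.linalg.qr(rng.normal(size=(n, n)))

# Split R into two additive shares, one per data party: R = R0 + R1.
R0 = rng.normal(size=(n, n))
R1 = R - R0
assert np.allclose(R0 + R1, R)

# Property used later: multiplying a square matrix by an orthogonal matrix
# preserves the condition number, so cond(H @ R) can stand in for cond(H)
# without revealing H itself.
H = rng.normal(size=(n, n))
assert np.allclose(np.linalg.cond(H @ R), np.linalg.cond(H))
```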
Since the first data party and the second data party use oblivious transfer in the course of computing based on the garbled circuit, the third party can also generate a first OT random number and a second OT random number, send the first OT random number to the first data party, and send the second OT random number to the second data party. An OT random number is a random number used in the oblivious transfer process.
The first data party and the second data party are the two parties of the cooperative security modeling. The first data party may be the data party holding feature data, and the second data party may be the data party holding the labels. For example, the first data party may hold the complete feature data and the second data party may hold the labels of the feature data; or the first data party may hold one part of the data items of the feature data, and the second data party may hold another part of the data items together with the labels of the feature data. As a concrete example, the feature data may include a user's savings amount and loan amount: the first data party may hold the user's savings amount, while the second data party holds the user's loan amount and the label of the feature data. The labels can be used to distinguish different types of feature data; their values may, for example, be taken from 0 and 1. It is worth noting that a data party here may be an electronic device, which may include a personal computer, a server, a handheld device, a portable device, a tablet device, a multi-processor apparatus, or a cluster formed by any number of the above devices. In addition, the feature data and its corresponding labels together constitute the sample data, which can be used to train the data processing model.
In the cooperative security modeling scenario, the first data party and the second data party can each obtain one share of the first model parameter. The share obtained by the first data party is taken as the first share of the first model parameter, and the share obtained by the second data party is taken as the second share. The sum of the first share and the second share of the first model parameter equals the first model parameter.
The first data party can receive the first share of the random orthogonal matrix and the first OT random number, and the second data party can receive the second share of the random orthogonal matrix and the second OT random number. In this way, the first data party, based on the first share of the first model parameter, the feature data, the first share of the random orthogonal matrix and the first OT random number, and the second data party, based on the second share of the first model parameter, the label values, the second share of the random orthogonal matrix and the second OT random number, can collaboratively determine a new first model parameter, with each party obtaining one share of it. For the specific process, see the following model parameter determination method embodiments.
Based on the above system embodiment, an embodiment of the model parameter determination method of this specification will be described in detail below with reference to Fig. 3. This embodiment may include the following steps.
Step S11: the first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, communicate based on the garbled circuit corresponding to the activation function. The first data party obtains the first share of the value of the activation function, and the second data party obtains the second share of the value of the activation function.
Step S13: the first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the labels and the second share of the value of the activation function, secretly share the gradient of the loss function. The first data party obtains the first share of the gradient of the loss function, and the second data party obtains the second share of the gradient of the loss function.
Step S15: the first data party, according to the feature data and the first share of the value of the activation function, and the second data party, according to the second share of the value of the activation function, secretly share the Hessian matrix. The first data party obtains the first share of the Hessian matrix, and the second data party obtains the second share of the Hessian matrix.
Step S17: the first data party, according to the first share of the Hessian matrix, and the second data party, according to the second share of the Hessian matrix, secretly share the first inverse matrix, which is the inverse matrix of the Hessian matrix. The first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share of the first inverse matrix.
Step S19: the first data party, according to the first share of the first model parameter, the first share of the first inverse matrix and the first share of the gradient of the loss function, and the second data party, according to the second share of the first model parameter, the second share of the first inverse matrix and the second share of the gradient of the loss function, secretly share a new first model parameter. The first data party obtains the first share of the new first model parameter, and the second data party obtains the second share of the new first model parameter.
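Putting steps S11 through S19 together: because every quantity is carried as a pair of additive shares, the two shares produced in step S19 would, if combined (which the protocol never actually does), reconstruct the plaintext Newton update of the first model parameter:

$$ \langle W' \rangle_0 + \langle W' \rangle_1 = W' = W - H^{-1}\,dW. $$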
Some terms involved in the embodiments are introduced below.
(1) The first product, the second product, the third product, the fourth product and the fifth product. The first product may be the product of the first model parameter and the feature data. The second product may be the product of the random orthogonal matrix and the Hessian matrix. The third product may be the product of the inverse matrix of the Hessian matrix and the gradient of the loss function. The fourth product may be the product of the first share of the gradient of the loss function and a preset step size. The fifth product may be the product of the second share of the gradient of the loss function and the preset step size.
In some scenario examples, the first product may be expressed as $XW$, where $W$ denotes the first model parameter, specifically a vector formed by the first model parameters, and $X$ denotes the feature data, specifically a matrix formed by the feature data.
The second product may be expressed as $HR$, where $H$ denotes the Hessian matrix and $R$ denotes the random orthogonal matrix.
The third product may be expressed as $H^{-1}dW$, where $H^{-1}$ denotes the inverse matrix of the Hessian matrix and $dW$ denotes the gradient of the loss function, a vector.
The fourth product may be expressed as $G\langle dW\rangle_0$ and the fifth product as $G\langle dW\rangle_1$, where $G$ denotes the preset step size, $\langle dW\rangle_0$ denotes the first share of the gradient of the loss function, and $\langle dW\rangle_1$ denotes the second share, with $\langle dW\rangle_0 + \langle dW\rangle_1 = dW$.
(2) The first inverse matrix and the second inverse matrix. Since the Hessian matrix is a square matrix, it can be inverted, and its inverse matrix can be taken as the first inverse matrix. The second product is also a square matrix, so it can likewise be inverted, and its inverse matrix can be taken as the second inverse matrix.
Continuing the previous scenario example, the first inverse matrix may be expressed as $H^{-1}$, and the second inverse matrix as $(HR)^{-1}$.
In some embodiments, before step S11, the first data party, using the feature data it holds and the first share of the first model parameter, and the second data party, using the second share of the first model parameter it holds, may secretly share the first product. The first data party and the second data party each obtain a share of the first product. For ease of description, the share obtained by the first data party is called the first share of the first product, and the share obtained by the second data party is called the second share of the first product. The sum of the first share and the second share of the first product equals the first product.

Continuing the previous scenario example, the first share of the first model parameter may be expressed as <W>_0 and the second share as <W>_1, with <W>_0 + <W>_1 = W. The first data party, using X and <W>_0, and the second data party, using <W>_1, secretly share the first product XW. The first data party obtains the first share of the first product <XW>_0, and the second data party obtains the second share <XW>_1, with <XW>_0 + <XW>_1 = XW.
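By way of illustration, the following minimal NumPy sketch checks the share algebra described above. It is not the claimed protocol: the helper simulate_secret_matmul is a hypothetical stand-in for the secure two-party matrix multiplication subprotocol (e.g., one based on Beaver triples), and is shown here only so that the recombination <XW>_0 + <XW>_1 = XW can be verified.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_secret_matmul(A, B):
    """Hypothetical stand-in for a secure two-party matrix product:
    returns additive shares <AB>_0, <AB>_1 with <AB>_0 + <AB>_1 = A @ B."""
    product = A @ B
    share0 = rng.normal(size=product.shape)
    return share0, product - share0

X = rng.normal(size=(5, 3))        # feature matrix, held by the first data party
W = rng.normal(size=(3, 1))        # first model parameter (a vector)

W0 = rng.normal(size=W.shape)      # <W>_0, held by the first data party
W1 = W - W0                        # <W>_1, held by the second data party

# The first data party contributes X @ <W>_0 locally; X @ <W>_1 is shared securely.
XW1_0, XW1_1 = simulate_secret_matmul(X, W1)
XW_0 = X @ W0 + XW1_0              # <XW>_0, the first data party's share
XW_1 = XW1_1                       # <XW>_1, the second data party's share

assert np.allclose(XW_0 + XW_1, X @ W)   # <XW>_0 + <XW>_1 = XW
```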
In some embodiments, a corresponding logic circuit can be constructed according to the incentive function. The logic circuit may be constructed by the first data party, by the second data party, or by another device (for example, a trusted third party). The logic circuit may consist of at least one gate, such as an AND gate, an OR gate or an XOR gate, and may include at least two input wires and at least one output wire. A confusion circuit (garbled circuit) is obtained by encrypting the input wires and/or output wires of the logic circuit; the confusion circuit may include a garbled truth table for each gate of the logic circuit. It is worth noting that the logic circuit may be constructed directly from the incentive function; alternatively, the incentive function may first be transformed in any suitable way and the logic circuit constructed from the transformed function; or another function may be generated on the basis of the incentive function and the logic circuit constructed from that function. Accordingly, saying that the confusion circuit corresponds to the incentive function means that the confusion circuit is generated from the logic circuit of the incentive function, of the transformed incentive function, or of the function derived from the incentive function.

Both the first data party and the second data party may hold the confusion circuit corresponding to the incentive function. In some implementations, the confusion circuit is generated by the first data party, which sends it to the second data party; the second data party receives it. In other implementations, the confusion circuit is generated by the second data party, which sends it to the first data party; the first data party receives it.

In step S11, the first data party, using the first share of the first product, and the second data party, using the second share of the first product, communicate based on the confusion circuit corresponding to the incentive function. The first data party and the second data party each obtain a share of the value of the incentive function. For ease of description, the share obtained by the first data party is called the first share of the value of the incentive function, and the share obtained by the second data party is called the second share. The sum of the two shares equals the value of the incentive function.

Please refer to Figure 4. The following describes a scenario example in which the first data party and the second data party perform the computation based on the confusion circuit.

A function y = f_1(x_1, x_2, x_3) = f(x_1, x_2) - x_3 can be constructed from the incentive function f(x_1, x_2), where x_1 represents the first share of the first product, x_2 represents the second share of the first product, x_3 represents one share of the value of the incentive function (hereinafter the second share of the value of the incentive function), and the value of f_1(x_1, x_2, x_3) represents the other share (hereinafter the first share of the value of the incentive function).

A logic circuit corresponding to f_1(x_1, x_2, x_3) = f(x_1, x_2) - x_3 can be constructed, and the confusion circuit is obtained by encrypting the input wires and/or output wires of this logic circuit. Both the first data party and the second data party may hold this confusion circuit. It is worth noting that the function y = f_1(x_1, x_2, x_3) = f(x_1, x_2) - x_3 and its corresponding logic circuit may be constructed by the first data party, by the second data party, or by another device (for example, a trusted third party).

The second data party may generate one share of the value of the incentive function as the second share. The first data party then uses the first share of the first product as its input to the confusion circuit, the second data party uses the second share of the first product and the second share of the value of the incentive function as its inputs, and the two parties communicate. The first data party evaluates the confusion circuit to obtain the other share of the value of the incentive function as the first share. The detailed computation follows the earlier scenario example introducing confusion circuits and is not repeated here.
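The following sketch simulates only the input/output relation of this share-splitting trick, not the garbling or the oblivious transfers themselves: simulate_circuit_output is a hypothetical stand-in that reveals to the evaluator exactly what the confusion circuit would reveal, namely f(x_1 + x_2) - x_3, and nothing else. The Sigmoid function is assumed as the incentive function f.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate_circuit_output(x1, x2, x3):
    """Hypothetical stand-in for evaluating the confusion circuit:
    it reveals only f(x1 + x2) - x3 to the evaluator (the first data party)."""
    return sigmoid(x1 + x2) - x3

rng = np.random.default_rng(1)
xw = 0.7                      # one entry of the first product XW, for illustration
x1 = 0.3                      # <XW>_0, the first data party's input wire
x2 = xw - x1                  # <XW>_1, the second data party's input wire
x3 = rng.normal()             # the second data party draws its share of the value

a0 = simulate_circuit_output(x1, x2, x3)   # first share of a = f(XW)
a1 = x3                                    # second share of a = f(XW)

assert np.isclose(a0 + a1, sigmoid(xw))    # the shares recombine to f(XW)
```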
In some implementations, to reduce the complexity of the confusion circuit, a piecewise linear function may be used to fit the incentive function. A corresponding logic circuit is then constructed from the piecewise linear function, and the confusion circuit is obtained by encrypting the input wires and/or output wires of that logic circuit. Both the first data party and the second data party may hold this confusion circuit. For example, the incentive function may be the Sigmoid function; the piecewise linear function used to fit it is given in the original as a formula image (PCTCN2020106254-appb-000037), where k represents the coefficient of the piecewise linear function.

The first data party, using the first share of the first product, and the second data party, using the second share of the first product, communicate based on this confusion circuit. Each party obtains a share of the value of the piecewise linear function; for ease of description, the share obtained by the first data party is called the first share of the value of the piecewise linear function and that obtained by the second data party the second share, and their sum equals the value of the piecewise linear function. The first data party may then use the first share of the value of the piecewise linear function as the first share of the value of the incentive function, and the second data party may use the second share of the value of the piecewise linear function as the second share of the value of the incentive function. A sketch of such a fit is given below.
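Since the patent's exact piecewise form is available only as a formula image, the sketch below assumes a common three-segment fit with slope coefficient k: 0 below -1/(2k), k·z + 1/2 in the middle, and 1 above 1/(2k). It is an illustrative approximation of the Sigmoid incentive function, not necessarily the segmentation claimed in the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def piecewise_sigmoid(z, k=0.25):
    """Assumed three-segment linear fit of Sigmoid with coefficient k:
    0 for z < -1/(2k); k*z + 1/2 for |z| <= 1/(2k); 1 for z > 1/(2k).
    np.clip realizes all three segments at once."""
    return np.clip(k * z + 0.5, 0.0, 1.0)

z = np.linspace(-6, 6, 7)
print(np.round(sigmoid(z), 3))     # true Sigmoid values
print(piecewise_sigmoid(z))        # piecewise linear approximation
```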
In some embodiments, in step S13, the first data party, using the feature data and the first share of the value of the incentive function, and the second data party, using the label and the second share of the value of the incentive function, secretly share the gradient of the loss function. Each party obtains a share of the loss-function gradient; for ease of description, the share obtained by the first data party is called the first share and that obtained by the second data party the second share, and their sum equals the gradient of the loss function.

Continuing the previous scenario example, the first data party, using X and <a>_0, and the second data party, using the label Y and <a>_1, secretly share the gradient dW of the loss function (specifically a vector). The first data party obtains the first share <dW>_0 and the second data party the second share <dW>_1. The detailed process by which the two parties secretly share dW is as follows.

The first data party, using X, and the second data party, using <a>_1, secretly share X^T<a>_1. The first data party obtains <[X^T<a>_1]>_0 and the second data party obtains <[X^T<a>_1]>_1, with <[X^T<a>_1]>_0 + <[X^T<a>_1]>_1 = X^T<a>_1.

The first data party, using X, and the second data party, using the label Y (specifically a vector formed by the labels), also secretly share X^T Y. The first data party obtains <X^T Y>_0 and the second data party obtains <X^T Y>_1, with <X^T Y>_0 + <X^T Y>_1 = X^T Y.

The first data party computes X^T<a>_0 and takes X^T<a>_0 + <[X^T<a>_1]>_0 - <X^T Y>_0 as the first share <dW>_0 of the loss-function gradient dW. The second data party takes <[X^T<a>_1]>_1 - <X^T Y>_1 as the second share <dW>_1. Indeed,

dW = <dW>_0 + <dW>_1 = X^T<a>_0 + X^T<a>_1 - X^T Y = X^T(a - Y).
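The following NumPy sketch checks this recombination. The helper share is a hypothetical stand-in for the secret-sharing subprotocol; the point is only that the two locally computed gradient shares sum to X^T(a - Y).

```python
import numpy as np

rng = np.random.default_rng(2)

def share(v):
    """Split v into two additive shares (stand-in for secret sharing)."""
    s0 = rng.normal(size=v.shape)
    return s0, v - s0

X = rng.normal(size=(5, 3))                              # features, first data party
Y = rng.integers(0, 2, size=(5, 1)).astype(float)        # labels, second data party
a = 1.0 / (1.0 + np.exp(-X @ rng.normal(size=(3, 1))))   # incentive-function values
a0, a1 = share(a)                                        # <a>_0, <a>_1

XTa1_0, XTa1_1 = share(X.T @ a1)      # shares of X^T <a>_1
XTY_0, XTY_1 = share(X.T @ Y)         # shares of X^T Y

dW_0 = X.T @ a0 + XTa1_0 - XTY_0      # first data party's gradient share
dW_1 = XTa1_1 - XTY_1                 # second data party's gradient share

assert np.allclose(dW_0 + dW_1, X.T @ (a - Y))   # dW = X^T (a - Y)
```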
Figure PCTCN2020106254-appb-000038
In some embodiments, in step S15, the first data party, using the feature data and the first share of the value of the incentive function, and the second data party, using the second share of the value of the incentive function, secretly share the Hessian matrix. Each party obtains a share of the Hessian matrix; for ease of description, the share obtained by the first data party is called the first share and that obtained by the second data party the second share, and their sum equals the Hessian matrix.

Specifically, the first data party, using the first share of the value of the incentive function, and the second data party, using the second share of the value of the incentive function, first secretly share a diagonal matrix. Each party obtains a share of the diagonal matrix: the share obtained by the first data party is called the first share and that obtained by the second data party the second share, and their sum equals the diagonal matrix. The first data party, using the feature data and the first share of the diagonal matrix, and the second data party, using the second share of the diagonal matrix, then secretly share the Hessian matrix; the first data party obtains the first share of the Hessian matrix and the second data party the second share.

Continuing the previous scenario example, the first data party, using <a>_0, and the second data party, using <a>_1, secretly share the diagonal matrix RNN. The first data party obtains the first share RNN_0 of the diagonal matrix, and the second data party obtains the second share RNN_1. The detailed process by which the two parties secretly share RNN is as follows.

The first data party, using <a>_0, and the second data party, using <a>_1, secretly share <a>_0·<a>_1. The first data party obtains <[<a>_0·<a>_1]>_0 and the second data party obtains <[<a>_0·<a>_1]>_1, with <[<a>_0·<a>_1]>_0 + <[<a>_0·<a>_1]>_1 = <a>_0·<a>_1. Here, · denotes element-wise multiplication. For example, for vectors m = (m_1, m_2, m_3) and n = (n_1, n_2, n_3), m·n = (m_1·n_1, m_2·n_2, m_3·n_3).

The first data party computes <r>_0 = <a>_0 - 2<[<a>_0·<a>_1]>_0 - <a>_0·<a>_0, and the second data party computes <r>_1 = <a>_1 - 2<[<a>_0·<a>_1]>_1 - <a>_1·<a>_1. Then

r = <r>_0 + <r>_1
= <a>_0 + <a>_1 - 2<a>_0·<a>_1 - <a>_0·<a>_0 - <a>_1·<a>_1
= (<a>_0 + <a>_1)·(1 - <a>_0 - <a>_1)
= a·(1 - a).

<r>_0, <r>_1 and r are vectors. The first data party can therefore generate the first share RNN_0 = diag(<r>_0) of the diagonal matrix RNN = diag(r) from <r>_0, and the second data party can generate the second share RNN_1 = diag(<r>_1) from <r>_1, with RNN_0 + RNN_1 = RNN. Both RNN_0 and RNN_1 are themselves diagonal matrices: in practice, the first data party places the elements of <r>_0 on the main diagonal of RNN_0, thereby generating RNN_0 from <r>_0, and the second data party places the elements of <r>_1 on the main diagonal of RNN_1, thereby generating RNN_1 from <r>_1.
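The following sketch checks that the two local formulas above recombine to r = a·(1 - a); the factor 2 on the shared cross term is what the recombination identity requires. As before, share is a hypothetical stand-in for the secret-sharing subprotocol.

```python
import numpy as np

rng = np.random.default_rng(3)

def share(v):
    s0 = rng.normal(size=v.shape)
    return s0, v - s0

a = rng.uniform(0.1, 0.9, size=4)       # incentive-function values
a0, a1 = share(a)                        # <a>_0, <a>_1

p0, p1 = share(a0 * a1)                  # shares of <a>_0 · <a>_1 (element-wise)

r0 = a0 - 2 * p0 - a0 * a0               # <r>_0, computed by the first data party
r1 = a1 - 2 * p1 - a1 * a1               # <r>_1, computed by the second data party

assert np.allclose(r0 + r1, a * (1 - a))            # r = a·(1 - a)

RNN_0, RNN_1 = np.diag(r0), np.diag(r1)              # diagonal-matrix shares
assert np.allclose(RNN_0 + RNN_1, np.diag(a * (1 - a)))
```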
The first data party, using X and RNN_0, and the second data party, using RNN_1, secretly share the Hessian matrix H. The first data party obtains the first share <H>_0 of the Hessian matrix, and the second data party obtains the second share <H>_1. The detailed process by which the two parties secretly share H is as follows.

The first data party, using X, and the second data party, using RNN_1, secretly share X^T RNN_1. The first data party obtains <X^T RNN_1>_0 and the second data party obtains <X^T RNN_1>_1, with <X^T RNN_1>_0 + <X^T RNN_1>_1 = X^T RNN_1.

The first data party, using X, and the second data party, using <X^T RNN_1>_1, also secretly share <X^T RNN_1>_1 X. The first data party obtains <[<X^T RNN_1>_1 X]>_0 and the second data party obtains <[<X^T RNN_1>_1 X]>_1, with <[<X^T RNN_1>_1 X]>_0 + <[<X^T RNN_1>_1 X]>_1 = <X^T RNN_1>_1 X.

The first data party computes X^T RNN_0 X + <X^T RNN_1>_0 X + <[<X^T RNN_1>_1 X]>_0 as the first share <H>_0 of the Hessian matrix H. The second data party takes <[<X^T RNN_1>_1 X]>_1 as the second share <H>_1. Indeed,

H = <H>_0 + <H>_1 = X^T RNN_0 X + X^T RNN_1 X = X^T·RNN·X.
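The following sketch checks this Hessian-share recombination; share is again a hypothetical stand-in for the secret-sharing subprotocol.

```python
import numpy as np

rng = np.random.default_rng(4)

def share(M):
    s0 = rng.normal(size=M.shape)
    return s0, M - s0

X = rng.normal(size=(5, 3))
r = rng.uniform(0.1, 0.25, size=5)           # r = a·(1 - a)
RNN = np.diag(r)
RNN_0, RNN_1 = share(RNN)                    # diagonal-matrix shares

XtR1_0, XtR1_1 = share(X.T @ RNN_1)          # shares of X^T RNN_1
S_0, S_1 = share(XtR1_1 @ X)                 # shares of <X^T RNN_1>_1 X

H_0 = X.T @ RNN_0 @ X + XtR1_0 @ X + S_0     # first data party's Hessian share
H_1 = S_1                                    # second data party's Hessian share

assert np.allclose(H_0 + H_1, X.T @ RNN @ X)   # H = X^T RNN X
```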
Figure PCTCN2020106254-appb-000039
In some embodiments, a third party may issue the first share of a random orthogonal matrix to the first data party and the second share to the second data party; the sum of the two shares equals the random orthogonal matrix. The first data party receives the first share of the random orthogonal matrix, and the second data party receives the second share. Then, in step S17, the first data party, using the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, using the second share of the random orthogonal matrix and the second share of the Hessian matrix, secretly share the second product. Each party obtains a share of the second product; for ease of description, the share obtained by the first data party is called the first share and that obtained by the second data party the second share, and their sum equals the second product.

In some implementations of this embodiment, the second data party performs the inversion of the second product. Specifically, the first data party sends the first share of the second product to the second data party. The second data party receives it and adds it to the second share of the second product it holds, obtaining the second product. Since the second product is a square matrix, the second data party can invert it, take the inverse of the second product as the second inverse matrix, and send the second inverse matrix to the first data party, which receives it. Alternatively, in other implementations of this embodiment, the first data party performs the inversion: the second data party sends its second share of the second product to the first data party, which receives it, adds it to the first share it holds to obtain the second product, inverts the second product (a square matrix) to obtain the second inverse matrix, and sends the second inverse matrix to the second data party, which receives it.

The first data party multiplies the first share of the random orthogonal matrix by the second inverse matrix to obtain the first share of the first inverse matrix. The second data party multiplies the second share of the random orthogonal matrix by the second inverse matrix to obtain the second share of the first inverse matrix. The sum of the first share and the second share of the first inverse matrix equals the first inverse matrix.
Continuing the previous scenario example, the first share of the random orthogonal matrix may be expressed as <R>_0 and the second share as <R>_1, with <R>_0 + <R>_1 = R. The first data party, using <R>_0 and <H>_0, and the second data party, using <R>_1 and <H>_1, secretly share the second product HR. The first data party obtains the first share <HR>_0 and the second data party the second share <HR>_1. The detailed process by which the two parties secretly share HR is as follows.

The first data party, using <H>_0, and the second data party, using <R>_1, secretly share <H>_0<R>_1. The first data party obtains <[<H>_0<R>_1]>_0 and the second data party obtains <[<H>_0<R>_1]>_1, with <[<H>_0<R>_1]>_0 + <[<H>_0<R>_1]>_1 = <H>_0<R>_1.

The first data party, using <R>_0, and the second data party, using <H>_1, also secretly share <H>_1<R>_0. The first data party obtains <[<H>_1<R>_0]>_0 and the second data party obtains <[<H>_1<R>_0]>_1, with <[<H>_1<R>_0]>_0 + <[<H>_1<R>_0]>_1 = <H>_1<R>_0.

The first data party computes <H>_0<R>_0 + <[<H>_0<R>_1]>_0 + <[<H>_1<R>_0]>_0 as the first share <HR>_0 of the second product. The second data party computes <H>_1<R>_1 + <[<H>_0<R>_1]>_1 + <[<H>_1<R>_0]>_1 as the second share <HR>_1. Indeed,

HR = <HR>_0 + <HR>_1 = (<H>_0 + <H>_1)(<R>_0 + <R>_1).
Figure PCTCN2020106254-appb-000040
Here the second data party performs the inversion of the second product HR. Specifically, the first data party sends the first share <HR>_0 to the second data party. The second data party receives <HR>_0, adds it to the second share <HR>_1 it holds to obtain the second product HR, inverts HR to obtain the second inverse matrix (HR)^{-1}, and sends (HR)^{-1} to the first data party, which receives it.

The first data party multiplies the second inverse matrix (HR)^{-1} by the first share <R>_0 of the random orthogonal matrix, obtaining the first share of the first inverse matrix <H^{-1}>_0 = <R>_0(HR)^{-1}. The second data party multiplies (HR)^{-1} by the second share <R>_1 of the random orthogonal matrix, obtaining the second share of the first inverse matrix <H^{-1}>_1 = <R>_1(HR)^{-1}. Then H^{-1} = <H^{-1}>_0 + <H^{-1}>_1 = <R>_0(HR)^{-1} + <R>_1(HR)^{-1} = R×(HR)^{-1}.
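The sketch below checks the masked-inversion identity H^{-1} = R(HR)^{-1}: the party that inverts only ever sees the masked matrix HR, never H itself, yet the two local products <R>_i(HR)^{-1} recombine to H^{-1}. The well-conditioned symmetric H is synthetic test data.

```python
import numpy as np

rng = np.random.default_rng(5)

n = 4
H = rng.normal(size=(n, n))
H = H @ H.T + n * np.eye(n)                    # a well-conditioned symmetric "Hessian"

R, _ = np.linalg.qr(rng.normal(size=(n, n)))   # random orthogonal matrix
R0 = rng.normal(size=(n, n))                   # <R>_0
R1 = R - R0                                    # <R>_1

HR_inv = np.linalg.inv(H @ R)    # (HR)^{-1}: computed from the masked matrix only

Hinv_0 = R0 @ HR_inv             # <H^{-1}>_0 = <R>_0 (HR)^{-1}
Hinv_1 = R1 @ HR_inv             # <H^{-1}>_1 = <R>_1 (HR)^{-1}

assert np.allclose(Hinv_0 + Hinv_1, np.linalg.inv(H))   # H^{-1} = R (HR)^{-1}
```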
In some embodiments, in step S19, the first data party, using the first share of the first inverse matrix and the first share of the loss-function gradient, and the second data party, using the second share of the first inverse matrix and the second share of the loss-function gradient, secretly share the third product. Each party obtains a share of the third product; for ease of description, the share obtained by the first data party is called the first share and that obtained by the second data party the second share, and their sum equals the third product.

The first data party subtracts the first share of the third product from the first share of the first model parameter to obtain the first share of the new first model parameter. The second data party subtracts the second share of the third product from the second share of the first model parameter to obtain the second share of the new first model parameter.

Continuing the previous scenario example, the first data party, using <H^{-1}>_0 and <dW>_0, and the second data party, using <H^{-1}>_1 and <dW>_1, secretly share the third product H^{-1}×dW. The first data party obtains the first share <H^{-1}×dW>_0 and the second data party the second share <H^{-1}×dW>_1. The detailed process by which the two parties secretly share H^{-1}×dW is as follows.

The first data party, using <H^{-1}>_0, and the second data party, using <dW>_1, secretly share <H^{-1}>_0<dW>_1. The first data party obtains <[<H^{-1}>_0<dW>_1]>_0 and the second data party obtains <[<H^{-1}>_0<dW>_1]>_1, with <[<H^{-1}>_0<dW>_1]>_0 + <[<H^{-1}>_0<dW>_1]>_1 = <H^{-1}>_0<dW>_1.

The first data party, using <dW>_0, and the second data party, using <H^{-1}>_1, also secretly share <H^{-1}>_1<dW>_0. The first data party obtains <[<H^{-1}>_1<dW>_0]>_0 and the second data party obtains <[<H^{-1}>_1<dW>_0]>_1, with <[<H^{-1}>_1<dW>_0]>_0 + <[<H^{-1}>_1<dW>_0]>_1 = <H^{-1}>_1<dW>_0.

The first data party computes <H^{-1}>_0<dW>_0 + <[<H^{-1}>_0<dW>_1]>_0 + <[<H^{-1}>_1<dW>_0]>_0 as the first share <H^{-1}×dW>_0 of the third product. The second data party computes <H^{-1}>_1<dW>_1 + <[<H^{-1}>_0<dW>_1]>_1 + <[<H^{-1}>_1<dW>_0]>_1 as the second share <H^{-1}×dW>_1. Indeed,

H^{-1}×dW = <H^{-1}×dW>_0 + <H^{-1}×dW>_1
= <H^{-1}>_0<dW>_0 + <H^{-1}>_0<dW>_1 + <H^{-1}>_1<dW>_0 + <H^{-1}>_1<dW>_1
= (<H^{-1}>_0 + <H^{-1}>_1)(<dW>_0 + <dW>_1).

The first data party computes <W'>_0 = <W>_0 - <H^{-1}×dW>_0, and the second data party computes <W'>_1 = <W>_1 - <H^{-1}×dW>_1, where <W'>_0 represents the first share of the new first model parameter, <W'>_1 the second share, and W' the new first model parameter. Then

W' = <W'>_0 + <W'>_1 = <W>_0 - <H^{-1}×dW>_0 + <W>_1 - <H^{-1}×dW>_1 = W - H^{-1}×dW.
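The following sketch checks the full shared Newton update W' = W - H^{-1}×dW end to end; share is once more a hypothetical stand-in for the secret-sharing subprotocol, and H, dW, W are synthetic test values.

```python
import numpy as np

rng = np.random.default_rng(6)

def share(M):
    s0 = rng.normal(size=M.shape)
    return s0, M - s0

n = 3
H = rng.normal(size=(n, n)); H = H @ H.T + n * np.eye(n)  # invertible "Hessian"
dW = rng.normal(size=(n, 1))                              # loss-function gradient
W = rng.normal(size=(n, 1))                               # first model parameter

Hinv_0, Hinv_1 = share(np.linalg.inv(H))    # shares of H^{-1}
dW_0, dW_1 = share(dW)                      # shares of the gradient
W_0, W_1 = share(W)                         # shares of the model parameter

c01_0, c01_1 = share(Hinv_0 @ dW_1)         # shares of <H^{-1}>_0 <dW>_1
c10_0, c10_1 = share(Hinv_1 @ dW_0)         # shares of <H^{-1}>_1 <dW>_0

step_0 = Hinv_0 @ dW_0 + c01_0 + c10_0      # <H^{-1}×dW>_0
step_1 = Hinv_1 @ dW_1 + c01_1 + c10_1      # <H^{-1}×dW>_1

W_new_0, W_new_1 = W_0 - step_0, W_1 - step_1
assert np.allclose(W_new_0 + W_new_1, W - np.linalg.inv(H) @ dW)  # W' = W - H^{-1}×dW
```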
Newton's method has a fast convergence rate. Because it uses Newton's method, the model-parameter determination method of steps S11-S19 not only protects the data privacy of the parties in the cooperative modeling (the first data party and the second data party), but also reduces the number of optimization adjustments of the model parameters and improves the training efficiency of the data processing model.

In addition, since the incentive function in a data processing model is usually a non-linear function, the operations involved are non-linear, so its value cannot be computed directly with a secret sharing algorithm. If the model parameters of the data processing model were determined cooperatively with Newton's method through secret sharing alone, a polynomial would have to be used to fit the incentive function. Fitting the incentive function with a polynomial suffers from an out-of-bounds problem: when the input of the polynomial exceeds a certain range, its output becomes very large or very small, which may prevent the data processing model from completing training. A confusion circuit, on the other hand, has high complexity, so determining the model parameters with Newton's method through confusion circuits alone would make the training process of the data processing model complicated. By combining secret sharing with a confusion circuit, the model-parameter determination method of steps S11-S19 both avoids the out-of-bounds problem and reduces the complexity of the training process of the data processing model.
In some embodiments, before step S17, the first data party, using the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, using the second share of the random orthogonal matrix and the second share of the Hessian matrix, may secretly share the second product. The first data party obtains the first share of the second product and the second data party the second share. The second product is the product of the random orthogonal matrix and the Hessian matrix; for the detailed process of secretly sharing it, see the description above.

Then, in step S17, when the condition number of the second product satisfies a preset condition, the first data party, using the first share of the Hessian matrix, and the second data party, using the second share of the Hessian matrix, secretly share the first inverse matrix. The first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share.

The preset condition may be that the condition number is less than or equal to a preset threshold. The preset threshold may be an empirical value or may be obtained in other ways (for example, by machine learning).

Both the first data party and the second data party may hold the preset condition, in which case each of them determines whether the condition number of the second product satisfies it. In some implementations, the condition number of the second product is calculated by the first data party. Specifically, the second data party sends its second share of the second product to the first data party. The first data party receives the second share, adds it to the first share it holds to obtain the second product, calculates the condition number of the second product, determines whether it satisfies the preset condition, and sends the condition number to the second data party. The second data party receives the condition number and determines whether it satisfies the preset condition. In other implementations, the condition number is calculated by the second data party. Specifically, the first data party sends its first share of the second product to the second data party. The second data party receives the first share, adds it to the second share it holds to obtain the second product, calculates the condition number, determines whether it satisfies the preset condition, and sends the condition number to the first data party. The first data party receives the condition number and determines whether it satisfies the preset condition.

Alternatively, only the first data party may hold the preset condition, so that only the first data party makes the determination. Specifically, the second data party sends its second share of the second product to the first data party. The first data party receives the second share, adds it to the first share it holds to obtain the second product, calculates the condition number, determines whether it satisfies the preset condition, and sends the determination result to the second data party, which receives it.

Or, only the second data party may hold the preset condition, so that only the second data party makes the determination. Specifically, the first data party sends its first share of the second product to the second data party. The second data party receives the first share, adds it to the second share it holds to obtain the second product, calculates the condition number, determines whether it satisfies the preset condition, and sends the determination result to the first data party, which receives it.

As described earlier, multiplying a square matrix by an orthogonal matrix yields a new matrix with the same condition number as the original square matrix. Since the Hessian matrix is a square matrix, the condition number of the second product equals the condition number of the Hessian matrix. In this way, the first data party and the second data party can cooperatively compute the condition number of the Hessian matrix without leaking the shares of the Hessian matrix they hold.
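The sketch below illustrates this invariance: the 2-norm condition number is determined by the singular values, which are unchanged when a matrix is multiplied by an orthogonal matrix, so revealing HR exposes cond(H) but not H itself. H and R here are synthetic test values.

```python
import numpy as np

rng = np.random.default_rng(7)

n = 4
H = rng.normal(size=(n, n))
H = H @ H.T + np.eye(n)                        # a square "Hessian"
R, _ = np.linalg.qr(rng.normal(size=(n, n)))   # random orthogonal matrix

# Right-multiplication by an orthogonal matrix preserves the singular values,
# hence the (2-norm) condition number.
assert np.isclose(np.linalg.cond(H @ R), np.linalg.cond(H))
```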
When the condition number of the second product satisfies the preset condition, the second product is only mildly ill-conditioned, which means the Hessian matrix is only mildly ill-conditioned, so Newton's method can be used to determine the model parameters.

When the condition number of the second product does not satisfy the preset condition, the second product, and hence the Hessian matrix, is severely ill-conditioned; Newton's method cannot be used to determine the model parameters, so the gradient descent method can be used instead. Specifically, the first data party calculates the first share of the new first model parameter from the first share of the first model parameter, the first share of the loss-function gradient and the preset step size. The second data party calculates the second share of the new first model parameter from the second share of the first model parameter, the second share of the loss-function gradient and the preset step size.

The preset step size is used to control the iteration speed of the gradient descent method and may be any suitable positive real number. If it is too large, the iteration proceeds too fast and the optimal model parameters may not be reached; if it is too small, the iteration proceeds too slowly and takes a long time. The preset step size may be an empirical value, may be obtained by machine learning, or may be obtained in other ways. Both the first data party and the second data party may hold the preset step size.

The first data party multiplies the first share of the loss-function gradient by the preset step size to obtain the fourth product, and subtracts the fourth product from the first share of the first model parameter to obtain the first share of the new first model parameter. The second data party multiplies the second share of the loss-function gradient by the preset step size to obtain the fifth product, and subtracts the fifth product from the second share of the first model parameter to obtain the second share of the new first model parameter. The sum of the first share and the second share of the new first model parameter equals the new first model parameter.

Continuing the previous scenario example, the first data party multiplies the first share of the loss-function gradient <dW>_0 (specifically a vector) by the preset step size G (a scalar multiplication of the vector) to obtain the fourth product G<dW>_0, and subtracts it from the first share of the first model parameter to obtain the first share of the new first model parameter <W'>_0 = <W>_0 - G<dW>_0.

The second data party multiplies the second share of the loss-function gradient <dW>_1 (specifically a vector) by the preset step size G to obtain the fifth product G<dW>_1, and subtracts it from the second share of the first model parameter to obtain the second share of the new first model parameter <W'>_1 = <W>_1 - G<dW>_1, where <W'>_0 + <W'>_1 = W' and W' represents the new first model parameter.
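A notable property of this fallback, illustrated by the sketch below, is that the update is purely local: because the step size G is held by both parties, each party updates its own shares without any communication, and the shares still recombine to W' = W - G·dW. The values of W and dW are synthetic.

```python
import numpy as np

rng = np.random.default_rng(8)

G = 0.1                                  # preset step size, held by both parties
W = rng.normal(size=(3, 1))              # first model parameter
dW = rng.normal(size=(3, 1))             # loss-function gradient

W_0 = rng.normal(size=W.shape); W_1 = W - W_0        # parameter shares
dW_0 = rng.normal(size=dW.shape); dW_1 = dW - dW_0   # gradient shares

W_new_0 = W_0 - G * dW_0     # first data party: local update, no communication
W_new_1 = W_1 - G * dW_1     # second data party: local update, no communication

assert np.allclose(W_new_0 + W_new_1, W - G * dW)    # W' = W - G·dW
```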
In this way, through the condition number and the preset condition, this embodiment avoids the non-convergence that an ill-conditioned matrix would otherwise cause in the process of determining the model parameters with Newton's method.
在本实施例的一些实施方式中,还可以包括对数据处理模型的模型参数进行迭代优化调整的过程。具体地,可以重复执行秘密分享第一乘积的步骤,所述第一数据方可以获得新的第一乘积的第一份额,所述第二数据方可以获得新的第一乘积的第二份额。新的第一乘积为特征数据和新的第一模型参数的乘积。可以重复执行步骤S11,所述第一数据方可以获得新的激励函数取值的第一份额,所述第二数据方可以获得新的激励函数取值的第二份额。可以重复执行步骤S13,所述第一数据方可以获得新的损失函数梯度的第一份额,所述第二数据方可以获得新的损失函数梯度的第二份额。可以重复执行步骤S15,所述第一数据方可以获得新的海森矩阵的第一份额,所述第二数据方可以获得新的海森矩阵的第二份额。可以重复执行秘密分享第二乘积的步骤,所述第一数据方可以获得新的第二乘积的第一份额,所述第二数据方可以获得新的第二乘积的第二份额。新的第二乘积为随机正交矩阵和新的海森矩阵之间的乘积。In some implementations of this embodiment, it may further include a process of iterative optimization and adjustment of the model parameters of the data processing model. Specifically, the step of secretly sharing the first product can be repeated, the first data party can obtain the first share of the new first product, and the second data party can obtain the second share of the new first product. The new first product is the product of the feature data and the new first model parameter. Step S11 may be repeatedly executed, the first data party can obtain the first share of the new excitation function value, and the second data party can obtain the second share of the new excitation function value. Step S13 can be repeated, the first data party can obtain the first share of the new loss function gradient, and the second data party can obtain the second share of the new loss function gradient. Step S15 may be repeated, the first data party can obtain the first share of the new Hessian matrix, and the second data party can obtain the second share of the new Hessian matrix. The step of secretly sharing the second product may be repeated, the first data party can obtain the first share of the new second product, and the second data party can obtain the second share of the new second product. The new second product is the product of the random orthogonal matrix and the new Hessian matrix.
在新的第二乘积的条件数满足预设条件时,表明在该轮迭代过程中可以继续使用牛顿法确定模型参数。可以重复执行步骤S17,所述第一数据方可以获得新的第一逆矩阵的第一份额,所述第二数据方可以获得新的第一逆矩阵的第二份额。新的第一逆矩阵为新的海森矩阵的逆矩阵。所述第一数据方可以根据新的第一模型参数的第一份额、新的第一逆矩阵的第一份额和新的损失函数梯度的第一份额,所述第二数据方可以根据新的第一模型参数的第二份额、新的第一逆矩阵的第二份额和新的损失函数梯度的第二份额,秘密分享第二模型参数。所述第一数据方可以获得第二模型参数的第一份额,所述第二数据方可以获得第二模型参数的第二份额。第二模型参数的第一份额和第二模型参数的第二份额的和等于第二模型参数。这样可以通过秘密分享和混淆电路相结合的方式实现迭代过程。通过秘密分享和混淆电路相结合的方式,不仅可以避免越界的问题,还可以降低数据处理模型训练过程的复杂程度。另外在迭代的过程中,由于使用了牛顿法,不仅可以保护合作建模各方(第一数据方和第二数据方)的数据隐私,还可以减少模型参数优化调整的次数,提高数据处理模型的训练效率。When the condition number of the new second product satisfies the preset condition, it indicates that the Newton method can be used to determine the model parameters during this round of iteration. Step S17 can be repeated, the first data party can obtain the first share of the new first inverse matrix, and the second data party can obtain the second share of the new first inverse matrix. The new first inverse matrix is the inverse of the new Hessian matrix. The first data party may be based on the first share of the new first model parameter, the first share of the new first inverse matrix, and the first share of the new loss function gradient, and the second data party may be based on the new The second share of the first model parameter, the second share of the new first inverse matrix, and the second share of the new loss function gradient secretly share the second model parameters. The first data party can obtain a first share of the second model parameter, and the second data party can obtain a second share of the second model parameter. The sum of the first share of the second model parameter and the second share of the second model parameter is equal to the second model parameter. In this way, the iterative process can be realized through a combination of secret sharing and obfuscation circuits. The combination of secret sharing and obfuscation circuits can not only avoid the problem of cross-border, but also reduce the complexity of the data processing model training process. In addition, in the iterative process, due to the use of Newton's method, not only can the data privacy of the parties in the cooperative modeling (the first data party and the second data party) be protected, but also the number of optimization adjustments of model parameters can be reduced, and the data processing model can be improved Training efficiency.
在新的第二乘积的条件数不满足预设条件时,表明在该轮迭代过程中无法继续使用牛顿法确定模型参数,因而可以使用梯度下降法代替牛顿法确定模型参数。所述第一数据方可以根据新的第一模型参数的第一份额、新的损失函数梯度的第一份额和预设步长,计算第二模型参数的第一份额。具体计算过程与根据第一模型参数的第一份额、损失函数梯度的第一份额和预设步长,计算新的第一模型参数的第一份额的过程相类似。所述第二数据方可以根据新的第一模型参数的第二份额、新的损失函数梯度的第二份额和预设步长,计算第二模型参数的第二份额。具体计算过程与根据第一模型参数的第二份额、损失函数梯度的第二份额和预设步长,计算新的第一模型参数的第二份额的过程相类似。这样可以通过秘密分享和混淆电路相结合的方式实现迭代过程。通过秘密分享和混淆电路相结合的方式,不仅可以避免越界的问题,还可以降低数据处理模型训练过程的复杂程度。另外在迭代的过程中,使用了梯度下降法。When the condition number of the new second product does not meet the preset condition, it indicates that the Newton method cannot be used to determine the model parameters during this round of iteration, so the gradient descent method can be used instead of the Newton method to determine the model parameters. The first data party may calculate the first share of the second model parameter according to the first share of the new first model parameter, the first share of the new loss function gradient, and the preset step size. The specific calculation process is similar to the process of calculating the first share of the new first model parameter according to the first share of the first model parameter, the first share of the loss function gradient, and the preset step length. The second data party may calculate the second share of the second model parameter according to the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size. The specific calculation process is similar to the process of calculating the second share of the new first model parameter according to the second share of the first model parameter, the second share of the loss function gradient, and the preset step length. In this way, the iterative process can be realized through a combination of secret sharing and obfuscation circuits. The combination of secret sharing and obfuscation circuits can not only avoid the problem of cross-border, but also reduce the complexity of the data processing model training process. In addition, in the iterative process, the gradient descent method is used.
In some other implementations of this embodiment, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model. Specifically, the step of secretly sharing the first product may be repeated: the first data party obtains a first share of a new first product, and the second data party obtains a second share of the new first product, where the new first product is the product of the feature data and the new first model parameter. The first data party, based on the first share of the new first product, and the second data party, based on the second share of the new first product, may secretly share the value of a new activation function; the first data party obtains a first share of the new activation function value, and the second data party obtains a second share of the new activation function value. Step S13 may be repeated, so that the first data party obtains a first share of a new loss function gradient and the second data party obtains a second share of the new loss function gradient. Step S15 may be repeated, so that the first data party obtains a first share of a new Hessian matrix and the second data party obtains a second share of the new Hessian matrix. The step of secretly sharing the second product may be repeated: the first data party obtains a first share of a new second product, and the second data party obtains a second share of the new second product, where the new second product is the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters in this round of iteration. Step S17 may be repeated: the first data party obtains a first share of a new first inverse matrix, and the second data party obtains a second share of the new first inverse matrix, where the new first inverse matrix is the inverse of the new Hessian matrix. The first data party, based on the first share of the new first model parameter, the first share of the new first inverse matrix, and the first share of the new loss function gradient, and the second data party, based on the second share of the new first model parameter, the second share of the new first inverse matrix, and the second share of the new loss function gradient, may secretly share a second model parameter. The first data party obtains a first share of the second model parameter, and the second data party obtains a second share of the second model parameter; the sum of the two shares equals the second model parameter. In this way the iterative process is realized through secret sharing. Moreover, because Newton's method is used during iteration, the data privacy of both parties to the cooperative modeling (the first data party and the second data party) is protected, the number of optimization adjustments of the model parameters is reduced, and the training efficiency of the data processing model is improved.

When the condition number of the new second product does not satisfy the preset condition, Newton's method cannot continue to be used to determine the model parameters in this round of iteration, so the gradient descent method can be used instead. The first data party may calculate a first share of the second model parameter based on the first share of the new first model parameter, the first share of the new loss function gradient, and a preset step size; the calculation is similar to calculating the first share of the new first model parameter from the first share of the first model parameter, the first share of the loss function gradient, and the preset step size. The second data party may calculate a second share of the second model parameter based on the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size, in a similar manner. In this way the iterative process is realized through secret sharing, with the gradient descent method used during this iteration.
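The gradient descent fallback requires no interaction at all: because the update W_new = W − G·dW is linear in the shares, each party can subtract its scaled gradient share from its parameter share locally. The sketch below simulates this with NumPy; the variable names (w0, w1, g0, g1, step) are illustrative and not taken from the specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Additive shares of the model parameter W and of the gradient dW.
w = rng.normal(size=3)                   # true parameter vector W
dw = rng.normal(size=3)                  # true loss function gradient dW
w0 = rng.normal(size=3); w1 = w - w0     # <W>_0, <W>_1
g0 = rng.normal(size=3); g1 = dw - g0    # <dW>_0, <dW>_1
step = 0.1                               # preset step size G

# Each party updates its own share locally (no communication needed).
new_w0 = w0 - step * g0                  # first data party
new_w1 = w1 - step * g1                  # second data party

# The shares still reconstruct the gradient descent update.
assert np.allclose(new_w0 + new_w1, w - step * dw)
```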
The following describes the process in which the first data party, based on the first share of the new first product, and the second data party, based on the second share of the new first product, secretly share the value of the new activation function.

The first data party, based on the first share of the new first product, and the second data party, based on the second share of the new first product, may secretly share the value of a polynomial, with each party obtaining a share of the polynomial's value. The polynomial may be used to fit the activation function of the data processing model. The share obtained by the first data party can then be taken as the first share of the new activation function value, and the share obtained by the second data party as the second share of the new activation function value; the sum of the two shares equals the value of the new activation function.
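A minimal sketch of the polynomial-fitting idea, under the assumptions that the activation function is the sigmoid and that shares of the required powers of the shared input have already been produced by secret-shared multiplications (the specification does not fix the degree, the fitting interval, or the multiplication sub-protocol; all of those choices and names below are illustrative). Once the power shares exist, evaluating the polynomial is linear, so each party simply combines its own power shares with the public coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fit a degree-5 polynomial to sigmoid on [-6, 6] by least squares.
xs = np.linspace(-6.0, 6.0, 2001)
ys = 1.0 / (1.0 + np.exp(-xs))
coeffs = np.polynomial.polynomial.polyfit(xs, ys, deg=5)  # c0..c5

z = 1.3                                  # one entry of the first product XW
powers = z ** np.arange(6)               # [1, z, z^2, ..., z^5]
# Assume shares of each power were produced by secret-shared multiplication.
p0 = rng.normal(size=6); p1 = powers - p0

# Polynomial evaluation is linear in the power shares, hence local.
a0 = coeffs @ p0                         # first party's share <a>_0
a1 = coeffs @ p1                         # second party's share <a>_1

assert np.isclose(a0 + a1, coeffs @ powers)        # shares reconstruct poly(z)
print(abs((a0 + a1) - 1.0 / (1.0 + np.exp(-z))))   # approximation error of the fit
```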
With the model parameter determination method of this embodiment, the first data party and the second data party can cooperatively determine the model parameters of the data processing model without leaking the data they each hold.

An embodiment of the model parameter determination method of this specification is described in detail above in conjunction with FIG. 3. The method steps performed by the first data party can be separately implemented as a model parameter determination method on the first data party side, and the method steps performed by the second data party can be separately implemented as a model parameter determination method on the second data party side. The model parameter determination method on the first data party side and the model parameter determination method on the second data party side in the embodiments of this specification are described in detail below in conjunction with FIG. 5 and FIG. 6.

This specification also provides another embodiment of the model parameter determination method. In this embodiment the first data party is the execution subject; the first data party may hold the feature data and a share of the first model parameter. Please refer to FIG. 5. This embodiment may include the following steps.

Step S21: communicate with a partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value, where the first product is the product of the feature data and the first model parameter.

In some embodiments, the partner may be understood as a data party that performs cooperative security modeling with the first data party, specifically the aforementioned second data party. Specifically, the first data party may secretly share the first product with the partner according to the feature data and its share of the first model parameter to obtain a share of the first product, and may then communicate with the partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value. For the specific process, refer to the related description of step S11 above, which is not repeated here.

Step S23: secretly share the gradient of the loss function with the partner according to the feature data and the share of the activation function value to obtain a share of the loss function gradient.

In some embodiments, the first data party may secretly share the gradient of the loss function with the partner according to the feature data and the share of the activation function value to obtain a share of the loss function gradient. For the specific process, refer to the related description of step S13 above, which is not repeated here.

Step S25: secretly share the Hessian matrix with the partner according to the feature data and the share of the activation function value to obtain a share of the Hessian matrix.

In some embodiments, the first data party may secretly share a diagonal matrix with the partner according to the share of the activation function value to obtain a share of the diagonal matrix, and may then secretly share the Hessian matrix with the partner according to the feature data and the share of the diagonal matrix to obtain a share of the Hessian matrix. For the specific process, refer to the related description of step S15 above, which is not repeated here.

Step S27: secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix.

In some embodiments, the first data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix. The first data party may send its share of the second product to the partner; may receive the second inverse matrix fed back by the partner, the second inverse matrix being the inverse of the second product; and may multiply the second inverse matrix with its share of the random orthogonal matrix to obtain its share of the first inverse matrix. For the specific process, refer to the related description of step S17 above, which is not repeated here.

Alternatively, in other embodiments, the first data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a first share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix. The first data party may receive a second share of the second product sent by the partner; may determine the second inverse matrix from the first share and the second share of the second product, the second inverse matrix being the inverse of the second product; and may multiply the second inverse matrix with its share of the random orthogonal matrix to obtain its share of the first inverse matrix. For the specific process, refer to the related description of step S17 above, which is not repeated here.
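A minimal sketch of the masking trick behind step S27 (and its counterpart, step S37, below), assuming the second product takes the form HR, which matches the scenario example later in this specification: opening HR hides H because R stays secret, the second inverse matrix (HR)^{-1} = R^{-1}H^{-1} can then be computed in the clear, and each party recovers an additive share of H^{-1} by multiplying its share of R with (HR)^{-1}, since <R>_0(HR)^{-1} + <R>_1(HR)^{-1} = R R^{-1} H^{-1} = H^{-1}. The simulation below reconstructs HR in one place only to stand in for the secret-sharing steps described above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# A symmetric positive-definite stand-in for the Hessian matrix H.
A = rng.normal(size=(n, n))
H = A @ A.T + n * np.eye(n)

# Random orthogonal matrix R (QR of a Gaussian matrix), additively shared.
R, _ = np.linalg.qr(rng.normal(size=(n, n)))
R0 = rng.normal(size=(n, n)); R1 = R - R0     # <R>_0, <R>_1

# The parties jointly compute and open the second product HR
# (simulated here in the clear); H itself stays hidden behind R.
P = H @ R
P_inv = np.linalg.inv(P)                      # second inverse matrix (HR)^{-1}

# Each party multiplies its share of R with (HR)^{-1} locally.
Hinv0 = R0 @ P_inv                            # first party's share of H^{-1}
Hinv1 = R1 @ P_inv                            # second party's share of H^{-1}

assert np.allclose(Hinv0 + Hinv1, np.linalg.inv(H))
```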
Step S29: secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.

In some embodiments, the first data party may secretly share the third product with the partner according to its share of the first inverse matrix and its share of the loss function gradient to obtain a share of the third product, where the third product is the product of the first inverse matrix and the loss function gradient. The first data party may then subtract its share of the third product from its share of the first model parameter to obtain its share of the new first model parameter. For the specific process, refer to the related description of step S19 above, which is not repeated here.
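A minimal sketch of this Newton update on shares, treating the secret-shared multiplication that produces the third product H^{-1}dW as a black box (simulated here by splitting the true product; in practice it would again use a multiplication sub-protocol). The final subtraction is purely local to each party.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 3
W = rng.normal(size=(d, 1))                                    # first model parameter
Hinv = np.linalg.inv(rng.normal(size=(d, d)) + 4 * np.eye(d))  # stand-in H^{-1}
dW = rng.normal(size=(d, 1))                                   # loss function gradient

W0 = rng.normal(size=(d, 1)); W1 = W - W0    # <W>_0, <W>_1

# Third product H^{-1} dW, assumed already secret-shared between the parties.
T = Hinv @ dW
T0 = rng.normal(size=(d, 1)); T1 = T - T0

# Each party subtracts its third-product share from its parameter share.
newW0 = W0 - T0                              # first data party
newW1 = W1 - T1                              # second data party

assert np.allclose(newW0 + newW1, W - Hinv @ dW)   # Newton step W - H^{-1} dW
```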
In some embodiments, the first data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix.

Thus in step S27, when the condition number of the second product satisfies the preset condition, the first data party may secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse of the Hessian matrix. The condition number of the second product may be calculated by the first data party and/or the partner, and it is equal to the condition number of the Hessian matrix.

The condition number of the second product satisfying the preset condition indicates that the second product is only mildly ill-conditioned, so Newton's method can be used to determine the model parameters.

The condition number of the second product failing to satisfy the preset condition indicates that the second product is severely ill-conditioned and the model parameters cannot be determined with Newton's method, so the gradient descent method can be used instead. In that case the first data party may calculate a share of the new first model parameter according to its share of the first model parameter, its share of the loss function gradient, and the preset step size. Specifically, the first data party may multiply its share of the loss function gradient by the preset step size to obtain the fourth product, and subtract the fourth product from its share of the first model parameter to obtain its share of the new first model parameter.
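A short sketch of why opening the second product suffices for this check, again assuming the HR form used in the scenario example: the 2-norm condition number is determined by singular values, and multiplying by an orthogonal matrix leaves singular values unchanged, so cond(HR) = cond(H). The parties can therefore evaluate the condition number on the opened second product without ever seeing H. The threshold below is an illustrative choice, not a value fixed by the specification.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.normal(size=(n, n))
H = A @ A.T + np.eye(n)                          # stand-in Hessian matrix
R, _ = np.linalg.qr(rng.normal(size=(n, n)))     # random orthogonal matrix

# Orthogonal masking preserves the 2-norm condition number.
assert np.isclose(np.linalg.cond(H @ R), np.linalg.cond(H))

# Decision rule: Newton's method if well-conditioned, else gradient descent.
THRESHOLD = 1e6                                  # illustrative preset condition
use_newton = np.linalg.cond(H @ R) < THRESHOLD
print("Newton's method" if use_newton else "gradient descent")
```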
In some implementations of this embodiment, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model. Specifically, the first data party may repeat the step of secretly sharing the first product to obtain a share of a new first product; may repeat step S21 to obtain a share of a new activation function value; may repeat step S23 to obtain a share of a new loss function gradient; may repeat step S25 to obtain a share of a new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain a share of a new second product, where the new second product is the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The first data party may repeat step S27 to obtain a share of a new first inverse matrix, the new first inverse matrix being the inverse of the new Hessian matrix. The first data party may then secretly share the second model parameter with the partner according to its share of the new first inverse matrix, its share of the new loss function gradient, and its share of the new first model parameter, to obtain a share of the second model parameter.

When the condition number of the new second product does not satisfy the preset condition, the gradient descent method needs to be used instead of Newton's method to determine the model parameters. The first data party may calculate a share of the second model parameter according to its share of the new first model parameter, its share of the new loss function gradient, and the preset step size.

In some other implementations of this embodiment, the method may likewise include iterative optimization and adjustment of the model parameters of the data processing model. Specifically, the first data party may repeat the step of secretly sharing the first product to obtain a share of a new first product. The first data party may secretly share the value of the activation function with the partner according to the share of the new first product to obtain a share of the new activation function value. The first data party may repeat step S23 to obtain a share of a new loss function gradient; may repeat step S25 to obtain a share of a new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain a share of a new second product, where the new second product is the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The first data party may repeat step S27 to obtain a share of a new first inverse matrix, the new first inverse matrix being the inverse of the new Hessian matrix. The first data party may then secretly share the second model parameter with the partner according to its share of the new first inverse matrix, its share of the new loss function gradient, and its share of the new first model parameter, to obtain a share of the second model parameter.

When the condition number of the new second product does not satisfy the preset condition, the gradient descent method needs to be used instead of Newton's method. The first data party may calculate a share of the second model parameter according to its share of the new first model parameter, its share of the new loss function gradient, and the preset step size.

With the model parameter determination method of this embodiment, the first data party can cooperate with the partner to determine the model parameters of the data processing model without leaking the data it holds, and obtain a share of the new first model parameter.
This specification also provides another embodiment of the model parameter determination method. In this embodiment the second data party is the execution subject; the second data party may hold the label and a share of the first model parameter. Please refer to FIG. 6. This embodiment may include the following steps.

Step S31: communicate with a partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value, where the first product is the product of the feature data and the first model parameter.

In some embodiments, the partner may be understood as a data party that performs cooperative security modeling with the second data party, specifically the aforementioned first data party. Specifically, the second data party may secretly share the first product with the partner according to its share of the first model parameter to obtain a share of the first product, and may then communicate with the partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value. For the specific process, refer to the related description of step S11 above, which is not repeated here.

Step S33: secretly share the gradient of the loss function with the partner according to the label and the share of the activation function value to obtain a share of the loss function gradient.

In some embodiments, the second data party may secretly share the gradient of the loss function with the partner according to the label and the share of the activation function value to obtain a share of the loss function gradient. For the specific process, refer to the related description of step S13 above, which is not repeated here.

Step S35: secretly share the Hessian matrix with the partner according to the share of the activation function value to obtain a share of the Hessian matrix.

In some embodiments, the second data party may secretly share a diagonal matrix with the partner according to the share of the activation function value to obtain a share of the diagonal matrix, and may then secretly share the Hessian matrix with the partner according to the share of the diagonal matrix to obtain a share of the Hessian matrix. For the specific process, refer to the related description of step S15 above, which is not repeated here.

Step S37: secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix.
In some embodiments, the second data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix. The second data party may send its share of the second product to the partner; may receive the second inverse matrix fed back by the partner, the second inverse matrix being the inverse of the second product; and may multiply the second inverse matrix with its share of the random orthogonal matrix to obtain its share of the first inverse matrix. For the specific process, refer to the related description of step S17 above, which is not repeated here.

Alternatively, in other embodiments, the second data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a first share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix. The second data party may receive a second share of the second product sent by the partner; may determine the second inverse matrix from the first share and the second share of the second product, the second inverse matrix being the inverse of the second product; and may multiply the second inverse matrix with its share of the random orthogonal matrix to obtain its share of the first inverse matrix. For the specific process, refer to the related description of step S17 above, which is not repeated here.

Step S39: secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.

In some embodiments, the second data party may secretly share the third product with the partner according to its share of the first inverse matrix and its share of the loss function gradient to obtain a share of the third product, where the third product is the product of the first inverse matrix and the loss function gradient. The second data party may then subtract its share of the third product from its share of the first model parameter to obtain its share of the new first model parameter. For the specific process, refer to the related description of step S19 above, which is not repeated here.

In some embodiments, the second data party may secretly share the second product with the partner according to its share of the random orthogonal matrix and its share of the Hessian matrix to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix.

Thus in step S37, when the condition number of the second product satisfies the preset condition, the second data party may secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse of the Hessian matrix. The condition number of the second product may be calculated by the second data party and/or the partner, and it is equal to the condition number of the Hessian matrix.

The condition number of the second product satisfying the preset condition indicates that the second product is only mildly ill-conditioned, so Newton's method can be used to determine the model parameters.

The condition number of the second product failing to satisfy the preset condition indicates that the second product is severely ill-conditioned and the model parameters cannot be determined with Newton's method, so the gradient descent method can be used instead. In that case the second data party may calculate a share of the new first model parameter according to its share of the first model parameter, its share of the loss function gradient, and the preset step size. Specifically, the second data party may multiply its share of the loss function gradient by the preset step size to obtain the fifth product, and subtract the fifth product from its share of the first model parameter to obtain its share of the new first model parameter.
In some implementations of this embodiment, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model. Specifically, the second data party may repeat the step of secretly sharing the first product to obtain a share of a new first product; may repeat step S31 to obtain a share of a new activation function value; may repeat step S33 to obtain a share of a new loss function gradient; may repeat step S35 to obtain a share of a new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain a share of a new second product, where the new second product is the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The second data party may repeat step S37 to obtain a share of a new first inverse matrix, the new first inverse matrix being the inverse of the new Hessian matrix. The second data party may then secretly share the second model parameter with the partner according to its share of the new first inverse matrix, its share of the new loss function gradient, and its share of the new first model parameter, to obtain a share of the second model parameter.

When the condition number of the new second product does not satisfy the preset condition, the gradient descent method needs to be used instead of Newton's method to determine the model parameters. The second data party may calculate a share of the second model parameter according to its share of the new first model parameter, its share of the new loss function gradient, and the preset step size.

In some other implementations of this embodiment, the method may likewise include iterative optimization and adjustment of the model parameters of the data processing model. Specifically, the second data party may repeat the step of secretly sharing the first product to obtain a share of a new first product. The second data party may secretly share the value of the activation function with the partner according to the share of the new first product to obtain a share of the new activation function value. The second data party may repeat step S33 to obtain a share of a new loss function gradient; may repeat step S35 to obtain a share of a new Hessian matrix; and may repeat the step of secretly sharing the second product to obtain a share of a new second product, where the new second product is the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The second data party may repeat step S37 to obtain a share of a new first inverse matrix, the new first inverse matrix being the inverse of the new Hessian matrix. The second data party may then secretly share the second model parameter with the partner according to its share of the new first inverse matrix, its share of the new loss function gradient, and its share of the new first model parameter, to obtain a share of the second model parameter.

When the condition number of the new second product does not satisfy the preset condition, the gradient descent method needs to be used instead of Newton's method. The second data party may calculate a share of the second model parameter according to its share of the new first model parameter, its share of the new loss function gradient, and the preset step size.

With the model parameter determination method of this embodiment, the second data party can cooperate with the partner to determine the model parameters of the data processing model without leaking the data it holds, and obtain a share of the new first model parameter.
The model parameter determination apparatus in the embodiments of this specification is described in detail below in conjunction with FIG. 7 and FIG. 8.

This specification also provides an embodiment of a model parameter determination apparatus. Please refer to FIG. 7. This embodiment can be applied to the first data party and may include the following units:

an activation function value share acquisition unit 41, configured to communicate with a partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value, where the first product is the product of the feature data and the first model parameter;

a loss function gradient share acquisition unit 43, configured to secretly share the gradient of the loss function with the partner according to the feature data and the share of the activation function value to obtain a share of the loss function gradient;

a Hessian matrix share acquisition unit 45, configured to secretly share the Hessian matrix with the partner according to the feature data and the share of the activation function value to obtain a share of the Hessian matrix;

a first inverse matrix share acquisition unit 47, configured to secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;

a model parameter share acquisition unit 49, configured to secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.

This specification also provides another embodiment of a model parameter determination apparatus. Please refer to FIG. 8. This embodiment can be applied to the second data party and may include the following units:

an activation function value share acquisition unit 51, configured to communicate with a partner according to the share of the first product and the garbled circuit corresponding to the activation function to obtain a share of the activation function value, where the first product is the product of the feature data and the first model parameter;

a loss function gradient share acquisition unit 53, configured to secretly share the gradient of the loss function with the partner according to the label and the share of the activation function value to obtain a share of the loss function gradient;

a Hessian matrix share acquisition unit 55, configured to secretly share the Hessian matrix with the partner according to the share of the activation function value to obtain a share of the Hessian matrix;

a first inverse matrix share acquisition unit 57, configured to secretly share the first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;

a model parameter share acquisition unit 59, configured to secretly share the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.
Please refer to FIG. 2. This specification also provides another embodiment of a model parameter determination system. In this embodiment, the model parameter determination system may include a first data party, a second data party, and a trusted third party.

The third party may be a single server, or a server cluster including multiple servers. The third party is used to provide random numbers to the first data party and the second data party. Specifically, the third party may generate a random orthogonal matrix and split each element of the random orthogonal matrix into two shares, taking one as the first share and the other as the second share. The third party may take the matrix formed by the first shares of the elements as the first share of the random orthogonal matrix, and the matrix formed by the second shares of the elements as the second share of the random orthogonal matrix; it may send the first share of the random orthogonal matrix to the first data party and the second share to the second data party. The sum of the first share and the second share of the random orthogonal matrix equals the random orthogonal matrix. The random orthogonal matrix is, on the one hand, a random matrix whose elements are random numbers and, on the other hand, an orthogonal matrix. Multiplying a square matrix by an orthogonal matrix yields a new matrix with the same condition number as the original square matrix. This allows the first data party and the second data party to cooperatively compute the condition number of the Hessian matrix without leaking their respective shares of the Hessian matrix, and thus to measure the degree of ill-conditioning of the Hessian matrix by its condition number. The specific process is detailed in the following embodiment.
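A minimal sketch of the third party's dealer role, under the assumption that the random orthogonal matrix is drawn via the QR decomposition of a Gaussian matrix (one standard construction; the specification does not fix a sampling method) and then split element-wise into two additive shares:

```python
import numpy as np

def deal_orthogonal_shares(n, rng):
    """Generate a random n x n orthogonal matrix R and split it into
    two additive shares, one per data party."""
    R, _ = np.linalg.qr(rng.normal(size=(n, n)))  # R is orthogonal
    R0 = rng.normal(size=(n, n))                  # first share (random mask)
    R1 = R - R0                                   # second share
    return R0, R1

rng = np.random.default_rng(4)
R0, R1 = deal_orthogonal_shares(4, rng)
R = R0 + R1
assert np.allclose(R @ R.T, np.eye(4))            # shares reconstruct an orthogonal R
```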
Given that the first data party and the second data party use oblivious transfer in the course of computing with the garbled circuit, the third party may also generate a first OT random number and a second OT random number, send the first OT random number to the first data party, and send the second OT random number to the second data party. An OT random number is a random number used in the oblivious transfer process.

The first data party and the second data party are the two parties to cooperative security modeling. The first data party may be the data party holding the feature data, and the second data party may be the data party holding the label. For example, the first data party may hold the complete feature data while the second data party holds the label of the feature data; or the first data party may hold some data items of the feature data while the second data party holds the other data items of the feature data together with the label. Specifically, for example, the feature data may include a user's savings amount and loan amount; the first data party may hold the user's savings amount, and the second data party may hold the user's loan amount and the label of the feature data. The label can be used to distinguish different types of feature data; its value may be taken from 0 and 1, for example. It is worth noting that a data party here may be an electronic device. The electronic device may include a personal computer, a server, a handheld device, a portable device, a tablet device, or a multiprocessor apparatus, or a cluster formed by any number of the above devices or apparatuses. In addition, the feature data and its corresponding label together constitute sample data, and the sample data can be used to train the data processing model.

In the cooperative security modeling scenario, the first data party and the second data party may each obtain a share of the first model parameter. The share obtained by the first data party is taken as the first share of the first model parameter, and the share obtained by the second data party as the second share of the first model parameter. The sum of the first share and the second share of the first model parameter equals the first model parameter.

The first data party may receive the first share of the random orthogonal matrix and the first OT random number, and the second data party may receive the second share of the random orthogonal matrix and the second OT random number. The first data party, based on the first share of the first model parameter, the feature data, the first share of the random orthogonal matrix, and the first OT random number, and the second data party, based on the second share of the first model parameter, the label, the second share of the random orthogonal matrix, and the second OT random number, can then cooperatively determine the second model parameter. For example, the first data party and the second data party may use secret sharing to cooperatively determine the new first model parameter with Newton's method, and may then use a combination of secret sharing and garbled circuits to cooperatively determine the second model parameter with the gradient descent method.
Based on the above system embodiment, another embodiment of the model parameter determination method of this specification is described in detail below in conjunction with FIG. 9. This embodiment may include the following steps.

Step S601: the first data party, according to the feature data and the first share of the first model parameter, and the second data party, according to the second share of the first model parameter, secretly share the first product. The first data party obtains the first share of the first product, and the second data party obtains the second share of the first product. The first product is the product of the feature data and the first model parameter.

Step S603: the first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, secretly share the value of the activation function. The first data party obtains the first share of the activation function value, and the second data party obtains the second share of the activation function value.

Step S605: the first data party, according to the feature data and the first share of the activation function value, and the second data party, according to the label and the second share of the activation function value, secretly share the gradient of the loss function. The first data party obtains the first share of the loss function gradient, and the second data party obtains the second share of the loss function gradient.

Step S607: the first data party, according to the feature data and the first share of the activation function value, and the second data party, according to the second share of the activation function value, secretly share the Hessian matrix. The first data party obtains the first share of the Hessian matrix, and the second data party obtains the second share of the Hessian matrix.

Step S609: the first data party, according to the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party, according to the second share of the random orthogonal matrix and the second share of the Hessian matrix, secretly share the second product. The first data party obtains the first share of the second product, and the second data party obtains the second share of the second product. The second product is the product of the random orthogonal matrix and the Hessian matrix.

Step S611: when the condition number of the second product satisfies the preset condition, the first data party, according to the first share of the Hessian matrix, and the second data party, according to the second share of the Hessian matrix, secretly share the first inverse matrix. The first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share of the first inverse matrix. The first inverse matrix is the inverse of the Hessian matrix.

Step S613: the first data party, according to the first share of the first model parameter, the first share of the first inverse matrix, and the first share of the loss function gradient, and the second data party, according to the second share of the first model parameter, the second share of the first inverse matrix, and the second share of the loss function gradient, secretly share the new first model parameter. The first data party obtains the first share of the new first model parameter, and the second data party obtains the second share of the new first model parameter.

Some terms involved in the embodiments are introduced below.
(1) First product, second product, third product, fourth product, and fifth product. The first product may be the product of the first model parameter and the feature data. The second product may be the product of the random orthogonal matrix and the Hessian matrix. The third product may be the product of the inverse of the Hessian matrix and the gradient of the loss function. The fourth product may be the product of the first share of the loss function gradient and the preset step size. The fifth product may be the product of the second share of the loss function gradient and the preset step size.

In some scenario examples, the first product may be expressed as XW, where W denotes the first model parameter, specifically a vector formed by the first model parameters, and X denotes the feature data, specifically a matrix formed by the feature data.

The second product may be expressed as HR, where H denotes the Hessian matrix and R denotes the random orthogonal matrix.

The third product may be expressed as H^{-1}·dW, where H^{-1} denotes the inverse of the Hessian matrix and dW denotes the gradient of the loss function, dW being a vector.

The fourth product may be expressed as G·<dW>_0 and the fifth product as G·<dW>_1, where G denotes the preset step size, <dW>_0 denotes the first share of the loss function gradient, <dW>_1 denotes the second share of the loss function gradient, and <dW>_0 + <dW>_1 = dW.

(2) First inverse matrix and second inverse matrix. Since the Hessian matrix is a square matrix, it can be inverted, and its inverse can be taken as the first inverse matrix. The second product is also a square matrix, so it can likewise be inverted, and its inverse can be taken as the second inverse matrix.

Continuing the previous scenario example, the first inverse matrix may be expressed as H^{-1}, and the second inverse matrix may be expressed as (HR)^{-1}.
In some embodiments, in step S601, the first data party, according to the feature data and the first share of the first model parameter, and the second data party, according to the second share of the first model parameter, may secretly share the first product. The first data party and the second data party each obtain a share of the first product. For ease of description, the share obtained by the first data party is taken as the first share of the first product, and the share obtained by the second data party as the second share of the first product. The sum of the first share and the second share of the first product equals the first product.

Continuing the previous scenario example, the first share of the first model parameter may be expressed as <W>_0 and the second share as <W>_1, with <W>_0 + <W>_1 = W. The first data party, according to X and <W>_0, and the second data party, according to <W>_1, secretly share the first product XW. The first data party obtains the first share <XW>_0 of the first product, and the second data party obtains the second share <XW>_1, with <XW>_0 + <XW>_1 = XW.
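A minimal simulation of one way to realize this step, assuming a Beaver-style matrix multiplication triple dealt by the trusted third party (the specification does not spell out the multiplication sub-protocol, so the triple construction and all names here are illustrative). Since X is held entirely by the first data party, it is modeled as the trivial sharing <X>_0 = X, <X>_1 = 0.

```python
import numpy as np

rng = np.random.default_rng(5)
m, d = 5, 3                                 # samples x features

X = rng.normal(size=(m, d))                 # feature data, held by party 0
W = rng.normal(size=(d, 1))                 # first model parameter
W0 = rng.normal(size=(d, 1)); W1 = W - W0   # <W>_0, <W>_1

# Party 0 treats X as shared with <X>_0 = X, <X>_1 = 0.
X0, X1 = X, np.zeros_like(X)

# Dealer (trusted third party): random matrix triple C = A @ B, shared.
A = rng.normal(size=(m, d)); B = rng.normal(size=(d, 1)); C = A @ B
A0 = rng.normal(size=(m, d)); A1 = A - A0
B0 = rng.normal(size=(d, 1)); B1 = B - B0
C0 = rng.normal(size=(m, 1)); C1 = C - C0

# Both parties open the masked differences E = X - A and F = W - B.
E = (X0 - A0) + (X1 - A1)
F = (W0 - B0) + (W1 - B1)

# Local share computation; only party 0 adds the public E @ F term.
P0 = C0 + E @ B0 + A0 @ F + E @ F           # <XW>_0
P1 = C1 + E @ B1 + A1 @ F                   # <XW>_1

assert np.allclose(P0 + P1, X @ W)          # shares reconstruct the first product
```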
In some embodiments, in step S603, the first data party, according to the first share of the first product, and the second data party, according to the second share of the first product, may secretly share the value of a polynomial, each party obtaining a share of the polynomial's value. The polynomial may be used to fit the activation function of the data processing model. The share obtained by the first data party can then be taken as the first share of the activation function value, and the share obtained by the second data party as the second share of the activation function value. The sum of the first share and the second share of the activation function value equals the value of the activation function.

Continuing the previous scenario example, the activation function may be the sigmoid function, whose value may be expressed as a = sigmoid(XW). The first share of the activation function value may be expressed as <a>_0 and the second share as <a>_1, where <a>_0 + <a>_1 = a, and <a>_0, <a>_1, and a are all vectors.

In some embodiments, in step S605, the first data party, according to the feature data and the first share of the activation function value, and the second data party, according to the label and the second share of the activation function value, may secretly share the gradient of the loss function. The first data party and the second data party each obtain a share of the loss function gradient. For ease of description, the share obtained by the first data party is taken as the first share of the loss function gradient, and the share obtained by the second data party as the second share of the loss function gradient. The sum of the first share and the second share of the loss function gradient equals the gradient of the loss function.

Continuing the previous scenario example, the first data party, according to X and <a>_0, and the second data party, according to the label Y and <a>_1, secretly share the gradient dW (specifically a vector) of the loss function. The first data party obtains the first share <dW>_0 of the loss function gradient, and the second data party obtains the second share <dW>_1 of the loss function gradient.
The detailed process by which the first data party and the second data party secretly share the loss function gradient dW is described below.

The first data party, according to X, and the second data party, according to <a>_1, secretly share X^T·<a>_1. The first data party obtains <[X^T<a>_1]>_0 and the second data party obtains <[X^T<a>_1]>_1, where <[X^T<a>_1]>_0 + <[X^T<a>_1]>_1 = X^T<a>_1.

The first data party, according to X, and the second data party, according to the label Y (specifically a vector formed by the labels), may also secretly share X^T·Y. The first data party obtains <X^T Y>_0 and the second data party obtains <X^T Y>_1, where <X^T Y>_0 + <X^T Y>_1 = X^T Y.

The first data party may compute X^T<a>_0, and may then compute X^T<a>_0 + <[X^T<a>_1]>_0 − <X^T Y>_0 as its first share <dW>_0 of the loss function gradient dW. The second data party may compute <[X^T<a>_1]>_1 − <X^T Y>_1 as its second share <dW>_1 of the loss function gradient dW.
dW = <dW>_0 + <dW>_1
= X^T <a>_0 + <[X^T <a>_1]>_0 − <X^T Y>_0 + <[X^T <a>_1]>_1 − <X^T Y>_1
= X^T <a>_0 + X^T <a>_1 − X^T Y
= X^T (a − Y).
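As a concreteness check, the gradient-share computation above can be simulated in a few lines. This is a sketch under the same illustrative assumptions as before (numpy, floating point); smul is a stand-in helper for the secure two-party product of secret-shared values, not part of the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

def smul(x, y):
    # Simulated secure product: returns additive shares of x @ y.
    s = rng.standard_normal((x.shape[0], y.shape[1]))
    return s, x @ y - s

X = rng.standard_normal((5, 3))                 # features, held by party 0
Y = rng.integers(0, 2, (5, 1)).astype(float)    # labels, held by party 1
a = 1.0 / (1.0 + np.exp(-X @ rng.standard_normal((3, 1))))
a0 = rng.standard_normal(a.shape); a1 = a - a0  # shares of a

p0, p1 = smul(X.T, a1)      # shares of X^T <a>_1
q0, q1 = smul(X.T, Y)       # shares of X^T Y
dW0 = X.T @ a0 + p0 - q0    # party 0's share of dW
dW1 = p1 - q1               # party 1's share of dW
assert np.allclose(dW0 + dW1, X.T @ (a - Y))
```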
In some embodiments, in step S607, the first data party may use the feature data and the first share of the activation function's value, and the second data party may use the second share of the activation function's value, to secretly share the Hessian matrix. Each party obtains a share of the Hessian matrix. For ease of description, the share obtained by the first data party is taken as the first share of the Hessian matrix, and the share obtained by the second data party as the second share. The sum of the two shares equals the Hessian matrix.

Specifically, the first data party may use the first share of the activation function's value, and the second data party may use the second share, to secretly share a diagonal matrix. Each party obtains a share of the diagonal matrix, referred to respectively as the first and second shares of the diagonal matrix, whose sum equals the diagonal matrix. The first data party can then use the feature data and the first share of the diagonal matrix, and the second data party the second share of the diagonal matrix, to secretly share the Hessian matrix; the first data party obtains the first share of the Hessian matrix, and the second data party obtains the second share.
Continuing the earlier scenario example, the first data party may use <a>_0, and the second data party may use <a>_1, to secretly share the diagonal matrix RNN. The first data party obtains the first share RNN_0 of the diagonal matrix, and the second data party obtains the second share RNN_1.

The detailed process by which the first data party and the second data party secretly share the diagonal matrix RNN is described below.

The first data party may use <a>_0, and the second data party may use <a>_1, to secretly share <a>_0 · <a>_1. The first data party obtains <[<a>_0 · <a>_1]>_0 and the second data party obtains <[<a>_0 · <a>_1]>_1, where <[<a>_0 · <a>_1]>_0 + <[<a>_0 · <a>_1]>_1 = <a>_0 · <a>_1. Here · denotes element-wise multiplication. For example, for vectors m = (m_1, m_2, m_3) and n = (n_1, n_2, n_3), m · n = (m_1 n_1, m_2 n_2, m_3 n_3).
The first data party may compute <r>_0 = <a>_0 − 2<[<a>_0 · <a>_1]>_0 − <a>_0 · <a>_0, and the second data party may compute <r>_1 = <a>_1 − 2<[<a>_0 · <a>_1]>_1 − <a>_1 · <a>_1.
r = <r>_0 + <r>_1
= <a>_0 − 2<[<a>_0 · <a>_1]>_0 − <a>_0 · <a>_0 + <a>_1 − 2<[<a>_0 · <a>_1]>_1 − <a>_1 · <a>_1
= <a>_0 + <a>_1 − 2<a>_0 · <a>_1 − <a>_0 · <a>_0 − <a>_1 · <a>_1
= (<a>_0 + <a>_1) · (1 − <a>_0 − <a>_1)
= a · (1 − a).
<r>_0, <r>_1 and r are all vectors. The first data party can therefore generate the first share RNN_0 = diag(<r>_0) of the diagonal matrix RNN = diag(r) from <r>_0, and the second data party can generate the second share RNN_1 = diag(<r>_1) from <r>_1, so that RNN_0 + RNN_1 = RNN. Both shares RNN_0 and RNN_1 are themselves diagonal matrices. In practice, the first data party may place the elements of <r>_0 on the main diagonal of RNN_0, thereby generating RNN_0 from <r>_0; likewise, the second data party may place the elements of <r>_1 on the main diagonal of RNN_1, thereby generating RNN_1 from <r>_1.
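The computation of r and of the diagonal-matrix shares can be sketched as follows, again as a plain numpy simulation in which smul_elem stands in for the secure element-wise product; the factor of 2 on the cross-term shares reflects that a(1 − a) = (<a>_0 + <a>_1)(1 − <a>_0 − <a>_1) contains the cross term 2<a>_0 · <a>_1.

```python
import numpy as np

rng = np.random.default_rng(2)

def smul_elem(x, y):
    # Simulated secure element-wise product: shares of x * y.
    s = rng.standard_normal(x.shape)
    return s, x * y - s

a = 1.0 / (1.0 + np.exp(-rng.standard_normal(5)))  # activation values
a0 = rng.standard_normal(5); a1 = a - a0           # shares of a

c0, c1 = smul_elem(a0, a1)          # shares of <a>_0 * <a>_1
r0 = a0 - 2 * c0 - a0 * a0          # party 0's share of r
r1 = a1 - 2 * c1 - a1 * a1          # party 1's share of r
assert np.allclose(r0 + r1, a * (1 - a))

RNN0, RNN1 = np.diag(r0), np.diag(r1)   # shares of RNN = diag(r)
assert np.allclose(RNN0 + RNN1, np.diag(a * (1 - a)))
```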
The first data party may use X and RNN_0, and the second data party may use RNN_1, to secretly share the Hessian matrix H. The first data party obtains the first share <H>_0 of the Hessian matrix, and the second data party obtains the second share <H>_1.

The detailed process by which the first data party and the second data party secretly share the Hessian matrix H is described below.

The first data party may use X, and the second data party may use RNN_1, to secretly share X^T RNN_1. The first data party obtains <X^T RNN_1>_0 and the second data party obtains <X^T RNN_1>_1, where <X^T RNN_1>_0 + <X^T RNN_1>_1 = X^T RNN_1.

The first data party may also use X, and the second data party may use <X^T RNN_1>_1, to secretly share <X^T RNN_1>_1 X. The first data party obtains <[<X^T RNN_1>_1 X]>_0 and the second data party obtains <[<X^T RNN_1>_1 X]>_1.
<[<X^T RNN_1>_1 X]>_0 + <[<X^T RNN_1>_1 X]>_1 = <X^T RNN_1>_1 X.
The first data party may compute X^T RNN_0 X + <X^T RNN_1>_0 X + <[<X^T RNN_1>_1 X]>_0 as the first share <H>_0 of the Hessian matrix H. The second data party may take <[<X^T RNN_1>_1 X]>_1 as the second share <H>_1 of the Hessian matrix H.
H = <H>_0 + <H>_1
= X^T RNN_0 X + <X^T RNN_1>_0 X + <[<X^T RNN_1>_1 X]>_0 + <[<X^T RNN_1>_1 X]>_1
= X^T RNN_0 X + <X^T RNN_1>_0 X + <X^T RNN_1>_1 X
= X^T RNN_0 X + X^T RNN_1 X
= X^T (RNN_0 + RNN_1) X
= X^T · RNN · X.
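A minimal simulation of this Hessian-sharing step, under the same illustrative assumptions (numpy, a non-cryptographic smul helper standing in for the secure matrix product), might look as follows.

```python
import numpy as np

rng = np.random.default_rng(3)

def smul(x, y):
    # Simulated secure matrix product: shares of x @ y.
    s = rng.standard_normal((x.shape[0], y.shape[1]))
    return s, x @ y - s

X = rng.standard_normal((6, 3))                  # party 0's features
r = rng.random(6)
r0 = rng.standard_normal(6); r1 = r - r0
RNN0, RNN1 = np.diag(r0), np.diag(r1)            # shares of RNN = diag(r)

u0, u1 = smul(X.T, RNN1)                         # shares of X^T RNN_1
v0, v1 = smul(u1, X)                             # shares of <X^T RNN_1>_1 X
H0 = X.T @ RNN0 @ X + u0 @ X + v0                # party 0's share of H
H1 = v1                                          # party 1's share of H
assert np.allclose(H0 + H1, X.T @ np.diag(r) @ X)
```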
In some embodiments, a third party may issue the first share of a random orthogonal matrix to the first data party and the second share to the second data party, the sum of the two shares being equal to the random orthogonal matrix. The first data party receives the first share of the random orthogonal matrix, and the second data party receives the second share. Then, in step S609, the first data party may use the first share of the random orthogonal matrix and the first share of the Hessian matrix, and the second data party may use the second share of the random orthogonal matrix and the second share of the Hessian matrix, to secretly share the second product. Each party obtains a share of the second product; for ease of description, the share obtained by the first data party is taken as the first share of the second product and the share obtained by the second data party as the second share, their sum being equal to the second product.

Continuing the earlier scenario example, the first share of the random orthogonal matrix may be denoted <R>_0 and the second share <R>_1, where <R>_0 + <R>_1 = R. The first data party may use <R>_0 and <H>_0, and the second data party may use <R>_1 and <H>_1, to secretly share the second product HR. The first data party obtains the first share <HR>_0 of the second product, and the second data party obtains the second share <HR>_1.

The detailed process by which the first data party and the second data party secretly share the second product HR is described below.
The first data party may use <H>_0, and the second data party may use <R>_1, to secretly share <H>_0 <R>_1. The first data party obtains <[<H>_0 <R>_1]>_0 and the second data party obtains <[<H>_0 <R>_1]>_1, where <[<H>_0 <R>_1]>_0 + <[<H>_0 <R>_1]>_1 = <H>_0 <R>_1.

The first data party may also use <R>_0, and the second data party may use <H>_1, to secretly share <H>_1 <R>_0. The first data party obtains <[<H>_1 <R>_0]>_0 and the second data party obtains <[<H>_1 <R>_0]>_1, where <[<H>_1 <R>_0]>_0 + <[<H>_1 <R>_0]>_1 = <H>_1 <R>_0.

The first data party may compute <H>_0 <R>_0 + <[<H>_0 <R>_1]>_0 + <[<H>_1 <R>_0]>_0 as the first share <HR>_0 of the second product. The second data party may compute <H>_1 <R>_1 + <[<H>_0 <R>_1]>_1 + <[<H>_1 <R>_0]>_1 as the second share <HR>_1 of the second product.
HR = <HR>_0 + <HR>_1
= <H>_0 <R>_0 + <[<H>_0 <R>_1]>_0 + <[<H>_1 <R>_0]>_0 + <H>_1 <R>_1 + <[<H>_0 <R>_1]>_1 + <[<H>_1 <R>_0]>_1
= <H>_0 <R>_0 + <H>_0 <R>_1 + <H>_1 <R>_0 + <H>_1 <R>_1
= (<H>_0 + <H>_1)(<R>_0 + <R>_1)
= H × R.
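The second-product sharing can be simulated in the same style; the orthogonal matrix is generated here by a QR decomposition purely for illustration, and smul again stands in for the secure product.

```python
import numpy as np

rng = np.random.default_rng(4)

def smul(x, y):
    # Simulated secure matrix product: shares of x @ y.
    s = rng.standard_normal((x.shape[0], y.shape[1]))
    return s, x @ y - s

n = 3
H = rng.standard_normal((n, n)); H = H + H.T     # a symmetric "Hessian"
H0 = rng.standard_normal((n, n)); H1 = H - H0    # shares of H
R, _ = np.linalg.qr(rng.standard_normal((n, n))) # random orthogonal matrix
R0 = rng.standard_normal((n, n)); R1 = R - R0    # dealer-issued shares of R

p0, p1 = smul(H0, R1)                            # shares of <H>_0 <R>_1
q0, q1 = smul(H1, R0)                            # shares of <H>_1 <R>_0
HR0 = H0 @ R0 + p0 + q0                          # party 0's share of HR
HR1 = H1 @ R1 + p1 + q1                          # party 1's share of HR
assert np.allclose(HR0 + HR1, H @ R)
```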
In some embodiments, the preset condition may be that the condition number is less than or equal to a preset threshold. The preset threshold may be an empirical value, or may be obtained in other ways (for example, by machine learning).

Both the first data party and the second data party may hold the preset condition, in which case each party can determine whether the condition number of the second product satisfies it. In some implementations, the first data party computes the condition number of the second product: the second data party sends its second share of the second product to the first data party, which adds it to its own first share to obtain the second product, computes the condition number, determines whether it satisfies the preset condition, and sends the condition number to the second data party; the second data party receives the condition number and likewise determines whether it satisfies the preset condition. In other implementations, the second data party computes the condition number: the first data party sends its first share of the second product to the second data party, which adds it to its own second share to obtain the second product, computes the condition number, determines whether it satisfies the preset condition, and sends the condition number to the first data party; the first data party receives the condition number and likewise determines whether it satisfies the preset condition.

Alternatively, only the first data party may hold the preset condition, so that only the first data party determines whether the condition number of the second product satisfies it. Specifically, the second data party sends its second share of the second product to the first data party, which adds it to its own first share to obtain the second product, computes the condition number, determines whether it satisfies the preset condition, and sends the result of that determination to the second data party, which receives it.

Or, only the second data party may hold the preset condition, so that only the second data party determines whether the condition number of the second product satisfies it. Specifically, the first data party sends its first share of the second product to the second data party, which adds it to its own second share to obtain the second product, computes the condition number, determines whether it satisfies the preset condition, and sends the result of that determination to the first data party, which receives it.
As described earlier, multiplying a square matrix by an orthogonal matrix yields a new matrix with the same condition number as the original square matrix. Since the Hessian matrix is square, the condition number of the second product equals the condition number of the Hessian matrix. In this way, the first data party and the second data party can cooperatively compute the condition number of the Hessian matrix without leaking their own shares of the Hessian matrix.
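This invariance is easy to check numerically. The snippet below is an illustrative check, not part of the protocol: right-multiplying by an orthogonal matrix preserves the singular values and hence the 2-norm condition number.

```python
import numpy as np

rng = np.random.default_rng(5)

H = rng.standard_normal((4, 4))
R, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix

# Orthogonal R preserves singular values, hence the condition number.
assert np.allclose(np.linalg.cond(H @ R), np.linalg.cond(H))
```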
In some embodiments, when the condition number of the second product satisfies the preset condition, the second product is only mildly ill-conditioned, which means the Hessian matrix is only mildly ill-conditioned, so Newton's method can be used to determine the model parameters. Then, in step S611, the first data party may use the first share of the Hessian matrix, and the second data party may use the second share of the Hessian matrix, to secretly share the first inverse matrix. The first data party obtains the first share of the first inverse matrix, and the second data party obtains the second share.

In some implementations, the second data party inverts the second product. Specifically, the first data party sends its first share of the second product to the second data party, which adds it to its own second share to obtain the second product. Since the second product is a square matrix, the second data party can invert it to obtain the inverse of the second product as the second inverse matrix, and sends the second inverse matrix to the first data party, which receives it. Alternatively, in other implementations, the first data party inverts the second product: the second data party sends its second share of the second product to the first data party, which adds it to its own first share to obtain the second product, inverts the second product to obtain the second inverse matrix, and sends the second inverse matrix to the second data party, which receives it.

The first data party may multiply its first share of the random orthogonal matrix by the second inverse matrix to obtain the first share of the first inverse matrix. The second data party may multiply its second share of the random orthogonal matrix by the second inverse matrix to obtain the second share of the first inverse matrix. The sum of the first and second shares of the first inverse matrix equals the first inverse matrix.

Continuing the earlier scenario example, here the second data party inverts the second product HR. Specifically, the first data party sends the first share <HR>_0 of the second product to the second data party. The second data party receives <HR>_0, adds it to its own second share <HR>_1 to obtain the second product HR, inverts HR to obtain the second inverse matrix (HR)^{-1}, and sends (HR)^{-1} to the first data party, which receives it.
The first data party may multiply the first share <R>_0 of the random orthogonal matrix by the second inverse matrix (HR)^{-1} to obtain the first share <H^{-1}>_0 of the first inverse matrix H^{-1}. The second data party may multiply the second share <R>_1 of the random orthogonal matrix by (HR)^{-1} to obtain the second share <H^{-1}>_1 of the first inverse matrix H^{-1}.
H^{-1} = <H^{-1}>_0 + <H^{-1}>_1 = <R>_0 (HR)^{-1} + <R>_1 (HR)^{-1} = R × (HR)^{-1}.
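The masked-inversion trick can be simulated directly: revealing HR hides H because R is a secret random orthogonal mask, and each party turns the public (HR)^{-1} into a share of H^{-1} using its own share of R. The sketch below uses the same illustrative numpy setup.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)                      # well-conditioned "Hessian"
R, _ = np.linalg.qr(rng.standard_normal((n, n))) # secret orthogonal mask
R0 = rng.standard_normal((n, n)); R1 = R - R0    # dealer-issued shares of R

HR = H @ R                   # reconstructed by one party from the shares
HR_inv = np.linalg.inv(HR)   # second inverse matrix, sent to the other party

Hinv0 = R0 @ HR_inv          # party 0's share of H^{-1}
Hinv1 = R1 @ HR_inv          # party 1's share of H^{-1}
assert np.allclose(Hinv0 + Hinv1, np.linalg.inv(H))
```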
In some embodiments, in step S613, the first data party may use the first share of the first inverse matrix and the first share of the loss function gradient, and the second data party may use the second share of the first inverse matrix and the second share of the loss function gradient, to secretly share the third product. Each party obtains a share of the third product; for ease of description, the share obtained by the first data party is taken as the first share of the third product and the share obtained by the second data party as the second share, their sum being equal to the third product.

The first data party may subtract the first share of the third product from the first share of the first model parameter to obtain the first share of the new first model parameter. The second data party may subtract the second share of the third product from the second share of the first model parameter to obtain the second share of the new first model parameter.

Continuing the earlier scenario example, the first data party may use <H^{-1}>_0 and <dW>_0, and the second data party may use <H^{-1}>_1 and <dW>_1, to secretly share the third product H^{-1} × dW. The first data party obtains the first share <H^{-1} × dW>_0 of the third product, and the second data party obtains the second share <H^{-1} × dW>_1.

The detailed process by which the first data party and the second data party secretly share the third product H^{-1} × dW is described below.

The first data party may use <H^{-1}>_0, and the second data party may use <dW>_1, to secretly share <H^{-1}>_0 <dW>_1. The first data party obtains <[<H^{-1}>_0 <dW>_1]>_0 and the second data party obtains <[<H^{-1}>_0 <dW>_1]>_1.
<[<H^{-1}>_0 <dW>_1]>_0 + <[<H^{-1}>_0 <dW>_1]>_1 = <H^{-1}>_0 <dW>_1.
The first data party may also use <dW>_0, and the second data party may use <H^{-1}>_1, to secretly share <H^{-1}>_1 <dW>_0. The first data party obtains <[<H^{-1}>_1 <dW>_0]>_0 and the second data party obtains <[<H^{-1}>_1 <dW>_0]>_1.
<[<H^{-1}>_1 <dW>_0]>_0 + <[<H^{-1}>_1 <dW>_0]>_1 = <H^{-1}>_1 <dW>_0.
The first data party may compute <H^{-1}>_0 <dW>_0 + <[<H^{-1}>_0 <dW>_1]>_0 + <[<H^{-1}>_1 <dW>_0]>_0 as the first share <H^{-1} × dW>_0 of the third product. The second data party may compute <H^{-1}>_1 <dW>_1 + <[<H^{-1}>_0 <dW>_1]>_1 + <[<H^{-1}>_1 <dW>_0]>_1 as the second share <H^{-1} × dW>_1 of the third product.
H^{-1} × dW = <H^{-1} × dW>_0 + <H^{-1} × dW>_1
= <H^{-1}>_0 <dW>_0 + <[<H^{-1}>_0 <dW>_1]>_0 + <[<H^{-1}>_1 <dW>_0]>_0 + <H^{-1}>_1 <dW>_1 + <[<H^{-1}>_0 <dW>_1]>_1 + <[<H^{-1}>_1 <dW>_0]>_1
= <H^{-1}>_0 <dW>_0 + <H^{-1}>_0 <dW>_1 + <H^{-1}>_1 <dW>_0 + <H^{-1}>_1 <dW>_1
= (<H^{-1}>_0 + <H^{-1}>_1)(<dW>_0 + <dW>_1).
The first data party may compute <W'>_0 = <W>_0 − <H^{-1} × dW>_0, and the second data party may compute <W'>_1 = <W>_1 − <H^{-1} × dW>_1, where <W'>_0 denotes the first share of the new first model parameter, <W'>_1 denotes the second share of the new first model parameter, and W' denotes the new first model parameter.
W' = <W'>_0 + <W'>_1 = <W>_0 − <H^{-1} × dW>_0 + <W>_1 − <H^{-1} × dW>_1 = W − H^{-1} × dW.
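Putting the last two steps together, a simulated Newton update on shares might look as follows, under the same illustrative numpy assumptions, with smul again the stand-in for the secure product.

```python
import numpy as np

rng = np.random.default_rng(7)

def smul(x, y):
    # Simulated secure matrix product: shares of x @ y.
    s = rng.standard_normal((x.shape[0], y.shape[1]))
    return s, x @ y - s

n = 3
A = rng.standard_normal((n, n))
Hinv = np.linalg.inv(A @ A.T + n * np.eye(n))    # H^{-1}
dW = rng.standard_normal((n, 1)); W = rng.standard_normal((n, 1))

Hinv0 = rng.standard_normal((n, n)); Hinv1 = Hinv - Hinv0  # shares of H^{-1}
dW0 = rng.standard_normal((n, 1)); dW1 = dW - dW0          # shares of dW
W0 = rng.standard_normal((n, 1)); W1 = W - W0              # shares of W

p0, p1 = smul(Hinv0, dW1)            # shares of <H^{-1}>_0 <dW>_1
q0, q1 = smul(Hinv1, dW0)            # shares of <H^{-1}>_1 <dW>_0
t0 = Hinv0 @ dW0 + p0 + q0           # party 0's share of H^{-1} dW
t1 = Hinv1 @ dW1 + p1 + q1           # party 1's share of H^{-1} dW

W_new0, W_new1 = W0 - t0, W1 - t1    # shares of the Newton update
assert np.allclose(W_new0 + W_new1, W - Hinv @ dW)
```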
In some embodiments, when the condition number of the second product does not satisfy the preset condition, the second product is severely ill-conditioned, which means the Hessian matrix is severely ill-conditioned, and Newton's method cannot be used to determine the model parameters; the gradient descent method can be used instead. Specifically, the first data party may compute the first share of the new first model parameter from the first share of the first model parameter, the first share of the loss function gradient, and a preset step size. The second data party may compute the second share of the new first model parameter from the second share of the first model parameter, the second share of the loss function gradient, and the preset step size.

The preset step size is used to control the iteration speed of the gradient descent method, and may be any suitable positive real number. If the preset step size is too large, the iteration steps are too aggressive and the optimal model parameters may never be reached; if it is too small, the iteration is too slow and training takes a long time. The preset step size may be an empirical value, may be obtained by machine learning, or may be obtained in other ways. Both the first data party and the second data party may hold the preset step size.

The first data party may multiply the first share of the loss function gradient by the preset step size to obtain the fourth product, and subtract the fourth product from the first share of the first model parameter to obtain the first share of the new first model parameter. The second data party may multiply the second share of the loss function gradient by the preset step size to obtain the fifth product, and subtract the fifth product from the second share of the first model parameter to obtain the second share of the new first model parameter. The sum of the first and second shares of the new first model parameter equals the new first model parameter.
Continuing the earlier scenario example, the first data party may multiply the first share <dW>_0 of the loss function gradient (specifically a vector) by the preset step size G (a scalar multiplication of the vector) to obtain the fourth product G<dW>_0, and subtract it from the first share <W>_0 of the first model parameter to obtain the first share of the new first model parameter, <W'>_0 = <W>_0 − G<dW>_0.

The second data party may multiply the second share <dW>_1 of the loss function gradient (specifically a vector) by the preset step size G to obtain the fifth product G<dW>_1, and subtract it from the second share <W>_1 of the first model parameter to obtain the second share of the new first model parameter, <W'>_1 = <W>_1 − G<dW>_1. Here <W'>_0 + <W'>_1 = W', where W' denotes the new first model parameter.
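Because this update is linear in the shares, the gradient-descent fallback needs no interaction at all: each party updates its own shares locally. A minimal sketch under the same illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(8)

G = 0.1                                           # preset step size
W = rng.standard_normal(3); dW = rng.standard_normal(3)
W0 = rng.standard_normal(3); W1 = W - W0          # shares of W
dW0 = rng.standard_normal(3); dW1 = dW - dW0      # shares of dW

W_new0 = W0 - G * dW0   # party 0's local update
W_new1 = W1 - G * dW1   # party 1's local update
assert np.allclose(W_new0 + W_new1, W - G * dW)
```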
In this way, by means of the condition number and the preset condition, this embodiment avoids the non-convergence problem caused by an ill-conditioned matrix when using Newton's method to determine model parameters.

The model parameter determination method of steps S601 to S613 is realized by secret sharing. Moreover, Newton's method converges quickly. Because Newton's method is used, the method of steps S601 to S613 not only protects the data privacy of the parties in the cooperative modeling (the first data party and the second data party), but also reduces the number of optimization adjustments of the model parameters and improves the training efficiency of the data processing model.
In some embodiments, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model.

Step S601 may be executed again: the first data party obtains the first share of a new first product and the second data party obtains the second share, the new first product being the product of the feature data and the new first model parameter. The first data party, using the first share of the new first product, and the second data party, using the second share, may then communicate based on the garbled circuit corresponding to the activation function; the first data party obtains the first share of the new activation function value and the second data party obtains the second share. Step S605 may be executed again, giving the first data party the first share of the new loss function gradient and the second data party the second share. Step S607 may be executed again, giving the first data party the first share of the new Hessian matrix and the second data party the second share. Step S609 may be executed again, giving the first data party the first share of the new second product and the second data party the second share, the new second product being the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can still be used to determine the model parameters in this round of iteration. Step S611 may be executed again: the first data party obtains the first share of the new first inverse matrix and the second data party obtains the second share, the new first inverse matrix being the inverse of the new Hessian matrix. The first data party, using the first shares of the new first model parameter, the new first inverse matrix and the new loss function gradient, and the second data party, using the corresponding second shares, may secretly share the second model parameter; the first data party obtains the first share of the second model parameter and the second data party obtains the second share, their sum being equal to the second model parameter. The iterative process is thus realized by combining secret sharing with garbled circuits, which not only avoids the problem of values going out of bounds but also reduces the complexity of the training process of the data processing model. Moreover, because Newton's method is used in the iteration, the data privacy of the parties in the cooperative modeling (the first data party and the second data party) is protected, the number of optimization adjustments of the model parameters is reduced, and the training efficiency of the data processing model is improved.

When the condition number of the new second product does not satisfy the preset condition, Newton's method cannot be used to determine the model parameters in this round of iteration, so the gradient descent method is used instead. The first data party may compute the first share of the second model parameter from the first share of the new first model parameter, the first share of the new loss function gradient, and the preset step size. The second data party may compute the second share of the second model parameter from the second share of the new first model parameter, the second share of the new loss function gradient, and the preset step size. The sum of the two shares equals the second model parameter. The iterative process is again realized by combining secret sharing with garbled circuits, avoiding the out-of-bounds problem and reducing the complexity of the training process; in this round of iteration, the gradient descent method is used.
The following describes the process in which the first data party, using the first share of the new first product, and the second data party, using the second share of the new first product, communicate based on the garbled circuit corresponding to the activation function.

A corresponding logic circuit can be constructed from the activation function. The logic circuit may be constructed by the first data party, by the second data party, or by another device (for example, a trusted third party). The logic circuit may consist of at least one gate, such as an AND gate, an OR gate, or an XOR gate, and may include at least two input wires and at least one output wire; the garbled circuit is obtained by encrypting the input wires and/or output wires of the logic circuit. The garbled circuit may include a garbled truth table for each gate of the logic circuit. It is worth noting that the logic circuit may be constructed directly from the activation function; alternatively, the activation function may first be transformed in any suitable way and the logic circuit constructed from the transformed activation function; or another function may be generated on the basis of the activation function and the logic circuit constructed from that function. Accordingly, the correspondence between the activation function and the garbled circuit can be understood as follows: the garbled circuit is generated from the logic circuit of the activation function, or from the logic circuit of the transformed activation function, or from the logic circuit of the other function.

Both the first data party and the second data party may hold the garbled circuit corresponding to the activation function. In some implementations, the garbled circuit is generated by the first data party, which sends it to the second data party, which receives it. In other implementations, the garbled circuit is generated by the second data party, which sends it to the first data party, which receives it.

In this way, the first data party, using the first share of the new first product, and the second data party, using the second share, can communicate based on the garbled circuit corresponding to the activation function. Each party obtains a share of the new activation function value; for ease of description, the share obtained by the first data party is taken as the first share of the new activation function value and the share obtained by the second data party as the second share, their sum being equal to the value of the new activation function.

Please refer to FIG. 4. The following describes a scenario example in which the first data party and the second data party perform a computation based on a garbled circuit.
A function y = f_1(x_1, x_2, x_3) = f(x_1, x_2) − x_3 can be constructed from the activation function f(x_1, x_2). Here x_1 denotes the first share of the new first product, x_2 denotes the second share of the new first product, x_3 denotes one share of the new activation function value (hereinafter the second share of the new activation function value), and the value of f_1(x_1, x_2, x_3) denotes the other share of the new activation function value (hereinafter the first share).

A logic circuit corresponding to the function f_1(x_1, x_2, x_3) = f(x_1, x_2) − x_3 can be constructed, and the garbled circuit is obtained by encrypting the input wires and/or output wires of that logic circuit. Both the first data party and the second data party may hold this garbled circuit. It is worth noting that the function y = f_1(x_1, x_2, x_3) = f(x_1, x_2) − x_3 and its corresponding logic circuit may be constructed by the first data party, by the second data party, or by another device (for example, a trusted third party).

The second data party may generate one share of the new activation function value as the second share. The first data party may then use the first share of the new first product, and the second data party may use the second share of the new first product together with the second share of the new activation function value, as inputs to the garbled circuit for communication. The first data party computes, based on the garbled circuit, the other share of the new activation function value as the first share. For the specific computation process, reference may be made to the earlier scenario example introducing garbled circuits, which is not repeated here.
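The effect of evaluating f_1 inside the circuit is that the two parties end up with additive shares of f(x_1, x_2) without either side learning the activation value itself. The sketch below simulates only this functionality in plain Python; the actual garbling and oblivious transfer are omitted, and f, x1, x2 and the random mask x3 are illustrative values.

```python
import math
import random

def f(x1, x2):
    # Activation function evaluated on the reconstructed input x1 + x2.
    return 1.0 / (1.0 + math.exp(-(x1 + x2)))

# Party 0 holds x1; party 1 holds x2 and picks a random output mask x3.
x1, x2 = 0.7, -0.2
x3 = random.uniform(-10, 10)    # party 1's share of the output

# The garbled circuit would reveal only f_1 = f(x1, x2) - x3 to party 0.
y = f(x1, x2) - x3              # party 0's share of the output

assert abs((y + x3) - f(x1, x2)) < 1e-12
```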
In some implementations, in order to reduce the complexity of the garbled circuit, a piecewise linear function may be used to fit the activation function. A corresponding logic circuit can then be constructed from the piecewise linear function, and the garbled circuit obtained by encrypting the input wires and/or output wires of that logic circuit. Both the first data party and the second data party may hold this garbled circuit. For example, the activation function may be the Sigmoid function, fitted by a piecewise linear function with coefficient k (the piecewise linear function itself is given in the original only as an image, PCTCN2020106254-appb-000044).
The first data party, using the first share of the new first product, and the second data party, using the second share, may communicate based on this garbled circuit. Each party obtains a share of the piecewise linear function's value; for ease of description, the share obtained by the first data party is taken as the first share of the piecewise linear function's value and the share obtained by the second data party as the second share, their sum being equal to the value of the piecewise linear function. The first data party may use the first share of the piecewise linear function's value as the first share of the new activation function value, and the second data party may use the second share of the piecewise linear function's value as the second share of the new activation function value.
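Since the patent's piecewise linear function survives only as an image, the sketch below uses a commonly cited three-segment approximation of the Sigmoid (clamping k·x + 1/2 to [0, 1]) purely for illustration; the slope k and the segment boundaries are assumptions, not the patent's definition.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def piecewise_sigmoid(z, k=0.25):
    # Three segments: 0 on the left, k*z + 1/2 in the middle, 1 on the right.
    return np.clip(k * z + 0.5, 0.0, 1.0)

z = np.linspace(-6, 6, 13)
err = np.max(np.abs(piecewise_sigmoid(z) - sigmoid(z)))
print(f"max fitting error on [-6, 6]: {err:.3f}")
```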
With the model parameter determination method of this embodiment, the first data party and the second data party can cooperatively determine the model parameters of the data processing model without leaking the data they each hold.

Above, in conjunction with FIG. 9, another embodiment of the model parameter determination method of this specification has been described in detail. The method steps performed by the first data party can be implemented separately as a model parameter determination method on the first data party's side, and the method steps performed by the second data party can be implemented separately as a model parameter determination method on the second data party's side. The model parameter determination method on the first data party's side and the model parameter determination method on the second data party's side in the embodiments of this specification are described in detail below in conjunction with FIG. 10 and FIG. 11.
This specification also provides another embodiment of the model parameter determination method. In this embodiment the first data party is the execution subject, and the first data party holds the feature data and a share of the first model parameter. Please refer to FIG. 10. This embodiment may include the following steps.

Step S701: secretly share the first product with the partner according to the feature data and the share of the first model parameter, obtaining a share of the first product, the first product being the product of the feature data and the first model parameter.

In some embodiments, the partner can be understood as the data party performing cooperative security modeling with the first data party, specifically the second data party described earlier.

Step S703: secretly share the value of the activation function with the partner according to the share of the first product, obtaining a share of the activation function's value.

In some embodiments, the first data party may secretly share the value of a polynomial with the partner according to the share of the first product, obtaining a share of the polynomial's value as the share of the activation function's value, the polynomial being used to fit the activation function.
Step S705: secretly share the gradient of the loss function and the Hessian matrix with the partner according to the feature data and the share of the activation function's value, obtaining a share of the loss function's gradient and a share of the Hessian matrix respectively.

Step S707: secretly share the second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, obtaining a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix.

Step S709: when the condition number of the second product satisfies the preset condition, secretly share the first inverse matrix with the partner according to the share of the Hessian matrix, obtaining a share of the first inverse matrix, the first inverse matrix being the inverse of the Hessian matrix.

In some embodiments, the preset condition may be that the condition number is less than or equal to a preset threshold. The condition number of the second product may be computed by the first data party and/or the partner, and equals the condition number of the Hessian matrix.

In some embodiments, when the condition number of the second product satisfies the preset condition, the second product is only mildly ill-conditioned and Newton's method can be used to determine the model parameters. The first data party can then secretly share the first inverse matrix with the partner according to the share of the Hessian matrix, obtaining a share of the first inverse matrix.
Step S711: secretly share the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function's gradient, and the share of the first model parameter, obtaining a share of the new first model parameter.

In some embodiments, the first data party may secretly share the third product with the partner according to the share of the first inverse matrix and the share of the loss function's gradient, obtaining a share of the third product, the third product being the product of the first inverse matrix and the loss function's gradient. The first data party may subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.

In some embodiments, when the condition number of the second product does not satisfy the preset condition, the second product is severely ill-conditioned, Newton's method cannot be used to determine the model parameters, and the gradient descent method can be used instead. The first data party may compute the share of the new first model parameter from the share of the first model parameter, the share of the loss function's gradient, and the preset step size. Specifically, the first data party may multiply the share of the loss function's gradient by the preset step size to obtain the fourth product, and subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
In some embodiments, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model.

The first data party may execute step S701 again, obtaining a share of a new first product. The first data party may communicate with the partner according to the share of the new first product and the garbled circuit corresponding to the activation function, obtaining a share of the new activation function value. The first data party may execute step S705 again, obtaining a share of the new loss function gradient and a share of the new Hessian matrix, and execute step S707 again, obtaining a share of the new second product, the new second product being the product of the random orthogonal matrix and the new Hessian matrix.

When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The first data party may execute step S709 again, obtaining a share of the new first inverse matrix, the new first inverse matrix being the inverse of the new Hessian matrix. The first data party may then secretly share the second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, obtaining a share of the second model parameter.

When the condition number of the new second product does not satisfy the preset condition, the gradient descent method is used instead of Newton's method to determine the model parameters. The first data party may compute the share of the second model parameter from the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.

With the model parameter determination method of this embodiment, the first data party can cooperate with the partner to determine the model parameters of the data processing model without leaking the data it holds.
This specification further provides an embodiment of another model parameter determination method. In this embodiment, the second data party is the execution subject; the second data party may hold the label and a share of the first model parameter. Referring to FIG. 11, this embodiment may include the following steps.
Step S801: secretly share a first product with the partner according to the share of the first model parameter, to obtain a share of the first product, where the first product is the product of the feature data and the first model parameter.
In some embodiments, the partner may be understood as a data party that performs cooperative security modeling with the second data party; specifically, it may be the aforementioned first data party.
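Each of the "secretly share" steps in this embodiment rests on additive sharing: a value is split into random shares that sum to the value, each party holds one share, and a single share reveals nothing about the value. A minimal sketch of the splitting and reconstruction primitive (real-valued NumPy shares are an illustrative simplification; practical protocols typically work over finite rings):

    import numpy as np

    rng = np.random.default_rng(0)

    def split(value):
        # value = share0 + share1, with share0 drawn at random.
        share0 = rng.standard_normal(value.shape)
        share1 = value - share0
        return share0, share1

    def reconstruct(share0, share1):
        # Only performed when a value is allowed to be revealed.
        return share0 + share1

    w = np.array([[0.5], [-1.2]])
    w0, w1 = split(w)  # e.g., held by the first and second data parties
    assert np.allclose(reconstruct(w0, w1), w)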
Step S803: secretly share the value of the incentive function with the partner according to the share of the first product, to obtain a share of the value of the incentive function.
In some embodiments, the second data party may secretly share the value of a polynomial with the partner according to the share of the first product, and take the resulting share of the polynomial value as the share of the value of the incentive function, where the polynomial is used to fit the incentive function.
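A polynomial fit makes the incentive function compatible with secret sharing, because evaluating a polynomial requires only additions and multiplications. A hedged sketch, assuming the incentive function is the sigmoid and using an ordinary least-squares fit; the degree, fitting interval, and resulting coefficients are illustrative assumptions, not values from the embodiment:

    import numpy as np

    def fit_sigmoid_polynomial(degree=5, lo=-6.0, hi=6.0):
        # Least-squares polynomial approximation of sigmoid on [lo, hi].
        x = np.linspace(lo, hi, 2001)
        y = 1.0 / (1.0 + np.exp(-x))
        return np.polynomial.polynomial.polyfit(x, y, degree)

    coeffs = fit_sigmoid_polynomial()
    z = np.array([-2.0, 0.0, 2.0])  # stand-in for values of the first product
    approx = np.polynomial.polynomial.polyval(z, coeffs)
    exact = 1.0 / (1.0 + np.exp(-z))
    assert np.max(np.abs(approx - exact)) < 0.1  # close on the fitted interval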
Step S805: secretly share the gradient of the loss function with the partner according to the label and the share of the value of the incentive function, to obtain a share of the loss function gradient; secretly share the Hessian matrix with the partner according to the share of the value of the incentive function, to obtain a share of the Hessian matrix.
Step S807: secretly share a second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix.
Step S809: when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix.
In some embodiments, the preset condition may include: the condition number is less than or equal to a preset threshold. The condition number of the second product may be calculated by the second data party and/or the partner. The condition number of the second product is equal to the condition number of the Hessian matrix.
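Revealing the second product Q·H instead of the Hessian matrix itself allows either party to test the preset condition without learning the Hessian matrix, because multiplying by an orthogonal matrix preserves singular values, so cond(Q·H) = cond(H). A small NumPy sketch of this property (matrix sizes and the threshold are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    THRESHOLD = 1e6  # illustrative preset threshold

    H = rng.standard_normal((4, 4))
    H = H @ H.T + 0.1 * np.eye(4)  # symmetric positive definite stand-in Hessian
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal mask

    second_product = Q @ H  # what may be revealed for the check
    assert np.isclose(np.linalg.cond(second_product), np.linalg.cond(H))
    newton_ok = np.linalg.cond(second_product) <= THRESHOLD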
In some embodiments, the condition number of the second product satisfying the preset condition indicates that the second product is only mildly ill-conditioned, so Newton's method can be used to determine the model parameters. In this case, the second data party may secretly share the first inverse matrix with the partner according to the share of the Hessian matrix, to obtain the share of the first inverse matrix.
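One way to realize the secret-shared inversion, matching the construction recited in the claims below: the parties reveal the masked second product Q·H, its inverse is computed in the clear, and each party multiplies that inverse by its own share of Q. Since (Q·H)^{-1}·Q = H^{-1}, the two results are additive shares of the first inverse matrix. A NumPy sketch of this identity (all matrices are illustrative):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 4
    H = rng.standard_normal((n, n))
    H = H @ H.T + 0.1 * np.eye(n)  # stand-in Hessian matrix
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal matrix
    Q0 = rng.standard_normal((n, n))
    Q1 = Q - Q0  # additive shares of Q held by the two parties

    second_inverse = np.linalg.inv(Q @ H)  # inverse of the revealed second product
    share0 = second_inverse @ Q0  # first party's share of the first inverse matrix
    share1 = second_inverse @ Q1  # second party's share
    assert np.allclose(share0 + share1, np.linalg.inv(H))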
Step S811: secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain a share of the new first model parameter.
In some embodiments, the second data party may secretly share a third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient, to obtain a share of the third product, where the third product is the product of the first inverse matrix and the loss function gradient. The second data party may then subtract the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
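Once shares of the third product H^{-1}·g are available, the Newton update itself is local. A sketch in the same style, where secure_matmul_share is a hypothetical helper standing in for the interactive secret-shared matrix multiplication (for example, one built on multiplication triples):

    import numpy as np

    def newton_update_share(w_share, inv_share, grad_share, secure_matmul_share):
        # secure_matmul_share(A_share, B_share) is assumed to return this
        # party's additive share of A @ B for secret-shared A and B.
        third_product_share = secure_matmul_share(inv_share, grad_share)
        # Share of the new first model parameter: w_new = w - H^{-1} @ g.
        return w_share - third_product_share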
In some embodiments, when the condition number of the second product does not satisfy the preset condition, the second product is severely ill-conditioned and Newton's method cannot be used to determine the model parameters; the gradient descent method can therefore be used instead of Newton's method. The second data party may calculate the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and a preset step size. Specifically, the second data party may multiply the share of the loss function gradient by the preset step size to obtain a fourth product, and may subtract the fourth product from the share of the first model parameter to obtain the share of the new first model parameter.
In some embodiments, the method may further include a process of iteratively optimizing and adjusting the model parameters of the data processing model.
The second data party may repeat step S801 to obtain a share of a new first product. The second data party may communicate with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function, to obtain a share of the value of a new incentive function. The second data party may repeat step S805 to obtain a share of a new loss function gradient and a share of a new Hessian matrix, and may repeat step S807 to obtain a share of a new second product. The new second product is the product of the random orthogonal matrix and the new Hessian matrix.
When the condition number of the new second product satisfies the preset condition, Newton's method can continue to be used to determine the model parameters. The second data party may repeat step S809 to obtain a share of a new first inverse matrix, where the new first inverse matrix is the inverse of the new Hessian matrix. The second data party may then secretly share a second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, to obtain a share of the second model parameter.
When the condition number of the new second product does not satisfy the preset condition, the gradient descent method needs to be used instead of Newton's method to determine the model parameters. The second data party may calculate the share of the second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
With the model parameter determination method of this embodiment, the second data party can cooperate with the partner to determine the model parameters of the data processing model without revealing the data it holds.
The model parameter determination apparatus of the embodiments of this specification is described in detail below with reference to FIG. 12 and FIG. 13.
This specification further provides an embodiment of a model parameter determination apparatus. Referring to FIG. 12, this embodiment may be applied to the first data party and may include the following units:
a first secret sharing unit 901, configured to secretly share a first product with the partner according to the feature data and the share of the first model parameter, to obtain a share of the first product, where the first product is the product of the feature data and the first model parameter;
a second secret sharing unit 903, configured to secretly share the value of the incentive function with the partner according to the share of the first product, to obtain a share of the value of the incentive function;
a third secret sharing unit 905, configured to secretly share the gradient of the loss function and the Hessian matrix with the partner according to the feature data and the share of the value of the incentive function, to obtain a share of the loss function gradient and a share of the Hessian matrix, respectively;
a fourth secret sharing unit 907, configured to secretly share a second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix;
a fifth secret sharing unit 909, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;
a sixth secret sharing unit 911, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain a share of the new first model parameter;
a confusion circuit unit 913, configured to repeat the step of secretly sharing the first product; communicate with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function, to obtain a share of the value of a new incentive function; and repeat the step of secretly sharing the gradient of the loss function and the Hessian matrix and the step of secretly sharing the second product; and
a calculation unit 915, configured to, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
This specification further provides another embodiment of a model parameter determination apparatus. Referring to FIG. 13, this embodiment may be applied to the second data party and may include the following units:
a first secret sharing unit 1001, configured to secretly share a first product with the partner according to the share of the first model parameter, to obtain a share of the first product, where the first product is the product of the feature data and the first model parameter;
a second secret sharing unit 1003, configured to secretly share the value of the incentive function with the partner according to the share of the first product, to obtain a share of the value of the incentive function;
a third secret sharing unit 1005, configured to secretly share the gradient of the loss function and the Hessian matrix with the partner according to the share of the value of the incentive function, to obtain a share of the loss function gradient and a share of the Hessian matrix, respectively;
a fourth secret sharing unit 1007, configured to secretly share a second product with the partner according to the share of the random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, where the second product is the product of the random orthogonal matrix and the Hessian matrix;
a fifth secret sharing unit 1009, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, where the first inverse matrix is the inverse of the Hessian matrix;
a sixth secret sharing unit 1011, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain a share of the new first model parameter;
a confusion circuit unit 1013, configured to repeat the step of secretly sharing the first product; communicate with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function, to obtain a share of the value of a new incentive function; and repeat the step of secretly sharing the gradient of the loss function, the step of secretly sharing the Hessian matrix, and the step of secretly sharing the second product; and
a calculation unit 1015, configured to, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and the preset step size.
An embodiment of the electronic device of this specification is described below. FIG. 14 is a schematic diagram of the hardware structure of an electronic device in this embodiment. As shown in FIG. 14, the electronic device may include one or more processors (only one is shown in the figure), a memory, and a transmission module. Of course, those of ordinary skill in the art will appreciate that the hardware structure shown in FIG. 14 is merely illustrative and does not limit the hardware structure of the electronic device. In practice, the electronic device may include more or fewer components than shown in FIG. 14, or may have a configuration different from that shown in FIG. 14.
The memory may include high-speed random access memory; alternatively, it may further include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. Of course, the memory may also include a remotely located network memory. The remotely located network memory may be connected to the electronic device through a network such as the Internet, an intranet, a local area network, or a mobile communication network. The memory may be used to store program instructions or modules of application software, for example the program instructions or modules of the embodiment corresponding to FIG. 5 of this specification; and/or the program instructions or modules of the embodiment corresponding to FIG. 6 of this specification; and/or the program instructions or modules of the embodiment corresponding to FIG. 10 of this specification; and/or the program instructions or modules of the embodiment corresponding to FIG. 11 of this specification.
The processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (for example, software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and the like. The processor may read and execute the program instructions or modules in the memory.
The transmission module may be used for data transmission via a network, for example via the Internet, an intranet, a local area network, or a mobile communication network.
It should be noted that the embodiments in this specification are described in a progressive manner; for identical or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiments and the electronic device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, reference may be made to the description of the method embodiments. In addition, it will be appreciated that, after reading this specification, a person skilled in the art can conceive of combining some or all of the embodiments listed herein in any manner without creative effort, and such combinations also fall within the scope of the disclosure and protection of this specification.
In the 1990s, an improvement of a technology could be clearly distinguished as a hardware improvement (for example, an improvement of circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement of a method flow). However, with the development of technology, improvements of many of today's method flows can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented with hardware entity modules. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. Designers program a digital system to be "integrated" on a PLD by themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, nowadays, instead of manually making integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code before compilation must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); at present, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used. Those skilled in the art will also understand that a hardware circuit implementing a logic method flow can be easily obtained merely by slightly logically programming the method flow in the above hardware description languages and programming it into an integrated circuit.
The system, apparatus, modules, or units illustrated in the above embodiments may be specifically implemented by a computer chip or an entity, or by a product having certain functions. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
From the description of the above implementations, those skilled in the art can clearly understand that this specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of this specification, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of this specification or in certain parts of the embodiments.
This specification is applicable to numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.
This specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. This specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote computer storage media, including storage devices.
Although this specification has been described through embodiments, those of ordinary skill in the art will appreciate that there are many variations and changes to this specification without departing from its spirit, and it is intended that the appended claims cover such variations and changes without departing from the spirit of this specification.

Claims (45)

  1. A model parameter determination method, applied to a first data party, comprising:
    communicating with a partner according to a share of a first product and a confusion circuit corresponding to an incentive function, to obtain a share of a value of the incentive function, wherein the first product is a product of feature data and a first model parameter;
    secretly sharing a gradient of a loss function and a Hessian matrix with the partner according to the feature data and the share of the value of the incentive function, to obtain a share of the loss function gradient and a share of the Hessian matrix;
    secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, wherein the first inverse matrix is an inverse of the Hessian matrix; and
    secretly sharing a new first model parameter with the partner according to a share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.
  2. The method according to claim 1, wherein the communicating with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function comprises:
    communicating with the partner according to the share of the first product and a confusion circuit corresponding to a piecewise linear function, to obtain a share of a value of the piecewise linear function as the share of the value of the incentive function, wherein the piecewise linear function is used to fit the incentive function.
  3. The method according to claim 1, wherein the secretly sharing the Hessian matrix with the partner according to the feature data and the share of the value of the incentive function comprises:
    secretly sharing a diagonal matrix with the partner according to the share of the value of the incentive function, to obtain a share of the diagonal matrix; and
    secretly sharing the Hessian matrix with the partner according to the feature data and the share of the diagonal matrix, to obtain the share of the Hessian matrix.
  4. The method according to claim 1, wherein the secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product; sending the share of the second product to the partner; receiving a second inverse matrix fed back by the partner; and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix, and the second inverse matrix is an inverse of the second product; or
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product; receiving a share of the second product sent by the partner; adding the share of the second product obtained through the secret sharing to the share of the second product obtained through the receiving, to obtain the second product; taking an inverse of the second product as a second inverse matrix; and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix.
  5. The method according to claim 1, wherein the secretly sharing the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient comprises:
    secretly sharing a third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient, to obtain a share of the third product, wherein the third product is a product of the first inverse matrix and the loss function gradient; and
    subtracting the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  6. The method according to claim 1, further comprising:
    secretly sharing the first product with the partner according to the feature data and the share of the first model parameter, to obtain the share of the first product.
  7. The method according to claim 6, further comprising:
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix;
    wherein, correspondingly, the secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    when a condition number of the second product satisfies a preset condition, secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix, to obtain the share of the first inverse matrix.
  8. The method according to claim 7, further comprising:
    when the condition number of the second product does not satisfy the preset condition, calculating the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and a preset step size.
  9. The method according to claim 7 or 8, further comprising:
    receiving a share of the second product sent by the partner, adding the party's own share of the second product to the received share of the second product to obtain the second product, and calculating the condition number of the second product; or
    sending the share of the second product to the partner, so that the partner calculates the condition number of the second product.
  10. The method according to claim 7, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; iteratively performing the step of communicating with the partner to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product satisfies the preset condition, iteratively performing the step of secretly sharing the first inverse matrix to obtain a share of a new first inverse matrix, and secretly sharing a second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, to obtain a share of the second model parameter.
  11. The method according to claim 7, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; iteratively performing the step of communicating with the partner to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  12. The method according to claim 7, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; secretly sharing a value of the incentive function with the partner according to the share of the new first product, to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  13. The method according to claim 12, wherein the secretly sharing the value of the incentive function with the partner according to the share of the new first product comprises:
    secretly sharing a value of a polynomial with the partner according to the share of the new first product, to obtain a share of the polynomial value as the share of the value of the new incentive function, wherein the polynomial is used to fit the incentive function.
  14. A model parameter determination method, applied to a second data party, comprising:
    communicating with a partner according to a share of a first product and a confusion circuit corresponding to an incentive function, to obtain a share of a value of the incentive function, wherein the first product is a product of feature data and a first model parameter;
    secretly sharing a gradient of a loss function with the partner according to a label and the share of the value of the incentive function, to obtain a share of the loss function gradient;
    secretly sharing a Hessian matrix with the partner according to the share of the value of the incentive function, to obtain a share of the Hessian matrix;
    secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, wherein the first inverse matrix is an inverse of the Hessian matrix; and
    secretly sharing a new first model parameter with the partner according to a share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient, to obtain a share of the new first model parameter.
  15. The method according to claim 14, wherein the communicating with the partner according to the share of the first product and the confusion circuit corresponding to the incentive function comprises:
    communicating with the partner according to the share of the first product and a confusion circuit corresponding to a piecewise linear function, to obtain a share of a value of the piecewise linear function as the share of the value of the incentive function, wherein the piecewise linear function is used to fit the incentive function.
  16. The method according to claim 14, wherein the secretly sharing the Hessian matrix with the partner according to the share of the value of the incentive function comprises:
    secretly sharing a diagonal matrix with the partner according to the share of the value of the incentive function, to obtain a share of the diagonal matrix; and
    secretly sharing the Hessian matrix with the partner according to the share of the diagonal matrix, to obtain the share of the Hessian matrix.
  17. The method according to claim 14, wherein the secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product; sending the share of the second product to the partner; receiving a second inverse matrix fed back by the partner; and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix, and the second inverse matrix is an inverse of the second product; or
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product; receiving a share of the second product sent by the partner; adding the share of the second product obtained through the secret sharing to the share of the second product obtained through the receiving, to obtain the second product; taking an inverse of the second product as a second inverse matrix; and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix.
  18. The method according to claim 14, wherein the secretly sharing the new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient comprises:
    secretly sharing a third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient, to obtain a share of the third product, wherein the third product is a product of the first inverse matrix and the loss function gradient; and
    subtracting the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  19. The method according to claim 14, further comprising:
    secretly sharing the first product with the partner according to the share of the first model parameter, to obtain the share of the first product.
  20. The method according to claim 19, further comprising:
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix;
    wherein, correspondingly, the secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    when a condition number of the second product satisfies a preset condition, secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix, to obtain the share of the first inverse matrix.
  21. The method according to claim 20, further comprising:
    when the condition number of the second product does not satisfy the preset condition, calculating the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and a preset step size.
  22. The method according to claim 20 or 21, further comprising:
    receiving a share of the second product sent by the partner, adding the party's own share of the second product to the received share of the second product to obtain the second product, and calculating the condition number of the second product; or
    sending the share of the second product to the partner, so that the partner calculates the condition number of the second product.
  23. The method according to claim 20, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; iteratively performing the step of communicating with the partner to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function to obtain a share of a new loss function gradient; iteratively performing the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product satisfies the preset condition, iteratively performing the step of secretly sharing the first inverse matrix to obtain a share of a new first inverse matrix, and secretly sharing a second model parameter with the partner according to the share of the new first inverse matrix, the share of the new loss function gradient, and the share of the new first model parameter, to obtain a share of the second model parameter.
  24. The method according to claim 20, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; iteratively performing the step of communicating with the partner to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function to obtain a share of a new loss function gradient; iteratively performing the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  25. The method according to claim 20, further comprising:
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; secretly sharing a value of the incentive function with the partner according to the share of the new first product, to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function to obtain a share of a new loss function gradient; iteratively performing the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  26. The method according to claim 25, wherein the secretly sharing the value of the incentive function with the partner according to the share of the new first product comprises:
    secretly sharing a value of a polynomial with the partner according to the share of the new first product, to obtain a share of the polynomial value as the share of the value of the new incentive function, wherein the polynomial is used to fit the incentive function.
  27. A model parameter determination method, applied to a first data party, comprising:
    secretly sharing a first product with a partner according to feature data and a share of a first model parameter, to obtain a share of the first product, wherein the first product is a product of the feature data and the first model parameter;
    secretly sharing a value of an incentive function with the partner according to the share of the first product, to obtain a share of the value of the incentive function;
    secretly sharing a gradient of a loss function and a Hessian matrix with the partner according to the feature data and the share of the value of the incentive function, to obtain a share of the loss function gradient and a share of the Hessian matrix;
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix, to obtain a share of the second product, wherein the second product is a product of the random orthogonal matrix and the Hessian matrix;
    when a condition number of the second product satisfies a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix, to obtain a share of the first inverse matrix, wherein the first inverse matrix is an inverse of the Hessian matrix;
    secretly sharing a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter, to obtain a share of the new first model parameter;
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and a confusion circuit corresponding to the incentive function, to obtain a share of a value of a new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product; and
    when a condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  28. The method according to claim 27, wherein the secretly sharing the value of the incentive function with the partner according to the share of the first product comprises:
    secretly sharing a value of a polynomial with the partner according to the share of the first product, to obtain a share of the polynomial value as the share of the value of the incentive function, wherein the polynomial is used to fit the incentive function.
  29. The method according to claim 27, wherein the secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    sending the share of the second product to the partner, receiving an inverse of the second product sent by the partner as a second inverse matrix, and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix; or
    receiving a share of the second product sent by the partner, adding the party's own share of the second product to the received share of the second product to obtain the second product, taking an inverse of the second product as a second inverse matrix, and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix.
  30. The method according to claim 27, wherein the secretly sharing the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter comprises:
    secretly sharing a third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient, to obtain a share of the third product, wherein the third product is a product of the first inverse matrix and the loss function gradient; and
    subtracting the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  31. The method according to claim 27, wherein the communicating with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function comprises:
    communicating with the partner according to the share of the new first product and a confusion circuit corresponding to a piecewise linear function, to obtain a share of a value of the piecewise linear function as the share of the value of the new incentive function, wherein the piecewise linear function is used to fit the incentive function.
  32. The method according to claim 27, further comprising:
    receiving a share of the second product sent by the partner, adding the party's own share of the second product to the received share of the second product to obtain the second product, and calculating the condition number of the second product; or
    sending the share of the second product to the partner, so that the partner calculates the condition number of the second product.
  33. The method according to claim 27, further comprising:
    when the condition number of the second product does not satisfy the preset condition, calculating the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and the preset step size.
  34. A method for determining model parameters, applied to a second data party, comprising:
    secretly sharing a first product with a partner according to a share of a first model parameter to obtain a share of the first product, the first product being the product of feature data and the first model parameter;
    secretly sharing the value of an incentive function with the partner according to the share of the first product to obtain a share of the value of the incentive function;
    secretly sharing the gradient of a loss function with the partner according to a label and the share of the value of the incentive function to obtain a share of the loss function gradient; secretly sharing a Hessian matrix with the partner according to the share of the value of the incentive function to obtain a share of the Hessian matrix;
    secretly sharing a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix;
    when the condition number of the second product satisfies a preset condition, secretly sharing a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
    secretly sharing a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter to obtain a share of the new first model parameter;
    iteratively performing the step of secretly sharing the first product to obtain a share of a new first product; communicating with the partner according to the share of the new first product and a confusion circuit corresponding to the incentive function to obtain a share of the value of the new incentive function; iteratively performing the step of secretly sharing the gradient of the loss function to obtain a share of a new loss function gradient; iteratively performing the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; and iteratively performing the step of secretly sharing the second product to obtain a share of a new second product;
    when the condition number of the new second product does not satisfy the preset condition, calculating a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
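Each "secretly share a product" step in claim 34 (the first, second, and third products) reduces to multiplying two additively shared matrices. One standard realization is a Beaver-triple protocol; the claims do not commit to a specific subprotocol, so the following single-process simulation of the dealer and both parties is only an assumed instance:

```python
# Assumed Beaver-triple protocol for multiplying additively shared matrices.
import numpy as np

rng = np.random.default_rng(2)

def share(m):
    m0 = rng.normal(size=m.shape)
    return m0, m - m0

def beaver_matmul(x0, x1, y0, y1):
    """Return additive shares of X @ Y given shares of X and Y."""
    n, k = x0.shape
    k2, m = y0.shape
    # Dealer: random triple A, B, C = A @ B, handed out as shares.
    A = rng.normal(size=(n, k)); B = rng.normal(size=(k2, m))
    a0, a1 = share(A); b0, b1 = share(B); c0, c1 = share(A @ B)
    # Parties open the masked differences E = X - A and F = Y - B.
    E = (x0 - a0) + (x1 - a1)
    F = (y0 - b0) + (y1 - b1)
    # Local share computation; only party 0 adds the public E @ F term.
    z0 = c0 + E @ b0 + a0 @ F + E @ F
    z1 = c1 + E @ b1 + a1 @ F
    return z0, z1

X = rng.normal(size=(2, 3)); Y = rng.normal(size=(3, 2))
x0, x1 = share(X); y0, y1 = share(Y)
z0, z1 = beaver_matmul(x0, x1, y0, y1)
assert np.allclose(z0 + z1, X @ Y)
```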
  35. The method according to claim 34, wherein secretly sharing the value of the incentive function with the partner according to the share of the first product comprises:
    secretly sharing the value of a polynomial with the partner according to the share of the first product to obtain a share of the value of the polynomial as the share of the value of the incentive function, the polynomial being used to fit the incentive function.
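Claim 35 swaps the incentive function for a polynomial because additions and multiplications are exactly the operations secret-sharing arithmetic supports natively. A sketch that fits a cubic to the sigmoid over an assumed interval [-4, 4] and reports the fit error; the degree and interval are illustrative choices, not fixed by the claim:

```python
# Fitting an assumed cubic polynomial to the sigmoid incentive function.
import numpy as np

xs = np.linspace(-4, 4, 200)
sig = 1.0 / (1.0 + np.exp(-xs))

coeffs = np.polyfit(xs, sig, deg=3)                   # highest degree first
print(np.round(coeffs, 4))                            # small x^3 term, ~0.5 offset
print(np.max(np.abs(np.polyval(coeffs, xs) - sig)))   # max fit error on [-4, 4]
```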
  36. The method according to claim 34, wherein secretly sharing the first inverse matrix with the partner according to the share of the Hessian matrix comprises:
    sending the share of the second product to the partner, receiving the inverse matrix of the second product returned by the partner as a second inverse matrix, and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix; or,
    receiving the share of the second product sent by the partner, adding its own share of the second product to the received share of the second product to obtain the second product, taking the inverse matrix of the second product as a second inverse matrix, and multiplying the second inverse matrix by the share of the random orthogonal matrix to obtain the share of the first inverse matrix.
  37. The method according to claim 34, wherein secretly sharing the new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter comprises:
    secretly sharing a third product with the partner according to the share of the first inverse matrix and the share of the loss function gradient to obtain a share of the third product, the third product being the product of the first inverse matrix and the loss function gradient;
    subtracting the share of the third product from the share of the first model parameter to obtain the share of the new first model parameter.
  38. The method according to claim 34, wherein communicating with the partner according to the share of the new first product and the confusion circuit corresponding to the incentive function comprises:
    communicating with the partner according to the share of the new first product and a confusion circuit corresponding to a piecewise linear function to obtain a share of the value of the piecewise linear function as the share of the value of the new incentive function, the piecewise linear function being used to fit the incentive function.
  39. The method according to claim 34, further comprising:
    receiving the share of the second product sent by the partner, adding its own share of the second product to the received share of the second product to obtain the second product, and calculating the condition number of the second product; or,
    sending the share of the second product to the partner, so that the partner calculates the condition number of the second product.
  40. The method according to claim 34, further comprising:
    when the condition number of the second product does not satisfy the preset condition, calculating the share of the new first model parameter according to the share of the first model parameter, the share of the loss function gradient, and a preset step size.
  41. An apparatus for determining model parameters, applied to a first data party, comprising:
    an incentive function value share acquisition unit, configured to communicate with a partner according to a share of a first product and a confusion circuit corresponding to an incentive function to obtain a share of the value of the incentive function, the first product being the product of feature data and a first model parameter;
    a loss function gradient share acquisition unit, configured to secretly share the gradient of a loss function with the partner according to the feature data and the share of the value of the incentive function to obtain a share of the loss function gradient;
    a Hessian matrix share acquisition unit, configured to secretly share a Hessian matrix with the partner according to the feature data and the share of the value of the incentive function to obtain a share of the Hessian matrix;
    a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
    a model parameter share acquisition unit, configured to secretly share a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameter.
  42. An apparatus for determining model parameters, applied to a second data party, comprising:
    an incentive function value share acquisition unit, configured to communicate with a partner according to a share of a first product and a confusion circuit corresponding to an incentive function to obtain a share of the value of the incentive function, the first product being the product of feature data and a first model parameter;
    a loss function gradient share acquisition unit, configured to secretly share the gradient of a loss function with the partner according to a label and the share of the value of the incentive function to obtain a share of the loss function gradient;
    a Hessian matrix share acquisition unit, configured to secretly share a Hessian matrix with the partner according to the share of the value of the incentive function to obtain a share of the Hessian matrix;
    a first inverse matrix share acquisition unit, configured to secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
    a model parameter share acquisition unit, configured to secretly share a new first model parameter with the partner according to the share of the first model parameter, the share of the first inverse matrix, and the share of the loss function gradient to obtain a share of the new first model parameter.
  43. An apparatus for determining model parameters, applied to a first data party, comprising:
    a first secret sharing unit, configured to secretly share a first product with a partner according to feature data and a share of a first model parameter to obtain a share of the first product, the first product being the product of the feature data and the first model parameter;
    a second secret sharing unit, configured to secretly share the value of an incentive function with the partner according to the share of the first product to obtain a share of the value of the incentive function;
    a third secret sharing unit, configured to secretly share the gradient of a loss function and a Hessian matrix with the partner according to the feature data and the share of the value of the incentive function to obtain a share of the loss function gradient and a share of the Hessian matrix, respectively;
    a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix;
    a fifth secret sharing unit, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
    a sixth secret sharing unit, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter to obtain a share of the new first model parameter;
    an iteration unit, configured to iteratively perform the step of secretly sharing the first product to obtain a share of a new first product; communicate with the partner according to the share of the new first product and a confusion circuit corresponding to the incentive function to obtain a share of the value of the new incentive function; iteratively perform the step of secretly sharing the gradient of the loss function and the Hessian matrix to obtain a share of a new loss function gradient and a share of a new Hessian matrix; iteratively perform the step of secretly sharing the second product to obtain a share of a new second product; and, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  44. An apparatus for determining model parameters, applied to a second data party, comprising:
    a first secret sharing unit, configured to secretly share a first product with a partner according to a share of a first model parameter to obtain a share of the first product, the first product being the product of feature data and the first model parameter;
    a second secret sharing unit, configured to secretly share the value of an incentive function with the partner according to the share of the first product to obtain a share of the value of the incentive function;
    a third secret sharing unit, configured to secretly share the gradient of a loss function with the partner according to a label and the share of the value of the incentive function to obtain a share of the loss function gradient, and to secretly share a Hessian matrix with the partner according to the share of the value of the incentive function to obtain a share of the Hessian matrix;
    a fourth secret sharing unit, configured to secretly share a second product with the partner according to a share of a random orthogonal matrix and the share of the Hessian matrix to obtain a share of the second product, the second product being the product of the random orthogonal matrix and the Hessian matrix;
    a fifth secret sharing unit, configured to, when the condition number of the second product satisfies a preset condition, secretly share a first inverse matrix with the partner according to the share of the Hessian matrix to obtain a share of the first inverse matrix, the first inverse matrix being the inverse matrix of the Hessian matrix;
    a sixth secret sharing unit, configured to secretly share a new first model parameter with the partner according to the share of the first inverse matrix, the share of the loss function gradient, and the share of the first model parameter to obtain a share of the new first model parameter;
    an iteration unit, configured to iteratively perform the step of secretly sharing the first product to obtain a share of a new first product; communicate with the partner according to the share of the new first product and a confusion circuit corresponding to the incentive function to obtain a share of the value of the new incentive function; iteratively perform the step of secretly sharing the gradient of the loss function to obtain a share of a new loss function gradient; iteratively perform the step of secretly sharing the Hessian matrix to obtain a share of a new Hessian matrix; iteratively perform the step of secretly sharing the second product to obtain a share of a new second product; and, when the condition number of the new second product does not satisfy the preset condition, calculate a share of a second model parameter according to the share of the new first model parameter, the share of the new loss function gradient, and a preset step size.
  45. An electronic device, comprising:
    a memory, configured to store computer instructions;
    a processor, configured to execute the computer instructions to implement the method steps according to any one of claims 1 to 40.
PCT/CN2020/106254 2019-08-09 2020-07-31 Method and apparatus for determining model parameter, and electronic device WO2021027598A1 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
CN201910735442.3 2019-08-09
CN201910735442.3A CN110580410B (en) 2019-08-09 2019-08-09 Model parameter determining method and device and electronic equipment
CN201910734784.3 2019-08-09
CN201910735439.1 2019-08-09
CN201910735421.1 2019-08-09
CN201910735421.1A CN110472439B (en) 2019-08-09 2019-08-09 Model parameter determining method and device and electronic equipment
CN201910735439.1A CN110555525B (en) 2019-08-09 2019-08-09 Model parameter determination method and device and electronic equipment
CN201910734784.3A CN110580409B (en) 2019-08-09 2019-08-09 Model parameter determining method and device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2021027598A1

Family

ID=74570512

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/106254 WO2021027598A1 (en) 2019-08-09 2020-07-31 Method and apparatus for determining model parameter, and electronic device

Country Status (1)

Country Link
WO (1) WO2021027598A1 (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9449177B1 (en) * 2013-03-13 2016-09-20 Hrl Laboratories, Llc General protocol for proactively secure computation
US20190149333A1 (en) * 2017-11-15 2019-05-16 International Business Machines Corporation Methods and systems for supporting fairness in secure computations
CN109274492A (en) * 2018-09-30 2019-01-25 中国科学技术大学 From the close coupling privacy sharing method of safety
CN110472439A (en) * 2019-08-09 2019-11-19 阿里巴巴集团控股有限公司 Model parameter determines method, apparatus and electronic equipment
CN110555525A (en) * 2019-08-09 2019-12-10 阿里巴巴集团控股有限公司 Model parameter determination method and device and electronic equipment
CN110555315A (en) * 2019-08-09 2019-12-10 阿里巴巴集团控股有限公司 model parameter determination method and device and electronic equipment
CN110569228A (en) * 2019-08-09 2019-12-13 阿里巴巴集团控股有限公司 model parameter determination method and device and electronic equipment
CN110580410A (en) * 2019-08-09 2019-12-17 阿里巴巴集团控股有限公司 Model parameter determination method and device and electronic equipment
CN110580409A (en) * 2019-08-09 2019-12-17 阿里巴巴集团控股有限公司 model parameter determination method and device and electronic equipment

Similar Documents

Publication Publication Date Title
WO2021027258A1 (en) Model parameter determination method and apparatus, and electronic device
CN110555525B (en) Model parameter determination method and device and electronic equipment
WO2021027254A1 (en) Model parameter determination method and apparatus, and electronic device
CN110580409B (en) Model parameter determining method and device and electronic equipment
WO2021027259A1 (en) Method and apparatus for determining model parameters, and electronic device
TWI730622B (en) Data processing method, device and electronic equipment
US20200177364A1 (en) Determining data processing model parameters through multiparty cooperation
CN110472439B (en) Model parameter determining method and device and electronic equipment
TWI728639B (en) Data processing method, device and electronic equipment
CN110580410A (en) Model parameter determination method and device and electronic equipment
CN112070222B (en) Processing device, accelerator and method for federal learning
WO2021000572A1 (en) Data processing method and apparatus, and electronic device
US10936960B1 (en) Determining model parameters using secret sharing
WO2021000575A1 (en) Data interaction method and apparatus, and electronic device
CN111967035B (en) Model training method and device and electronic equipment
US11496295B2 (en) Non-transitory computer-readable medium storing program code, decryption device, and communication system including encryption device and decryption device
TWI710981B (en) Method, device and electronic equipment for determining value of loss function
WO2021000574A1 (en) Data interaction method and apparatus, server, and electronic device
WO2019242562A1 (en) Elliptic curve point multiplication operation method and apparatus
CN112507323A (en) Model training method and device based on unidirectional network and computing equipment
WO2021027598A1 (en) Method and apparatus for determining model parameter, and electronic device
CN114338024B (en) Image decryption method, system, device and computer readable storage medium
CN112511361B (en) Model training method and device and computing equipment
TW202103151A (en) Data processing method and device, and electronic device
CN112085206A (en) Joint logistic regression modeling method and device and terminal

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20852488; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20852488; Country of ref document: EP; Kind code of ref document: A1)