CN112183757A - Model training method, device and system

Model training method, device and system

Info

Publication number
CN112183757A
CN112183757A
Authority
CN
China
Prior art keywords
training
feature
training participant
model
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910599381.2A
Other languages
Chinese (zh)
Other versions
CN112183757B (en)
Inventor
陈超超
李梁
王力
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201910599381.2A
Publication of CN112183757A
Application granted
Publication of CN112183757B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides methods and apparatus for training a linear/logistic regression model. A feature sample set is subjected to vertical-to-horizontal segmentation conversion to obtain a conversion feature sample subset for each training participant. A current predicted value is obtained based on the current conversion sub-models and conversion feature sample subsets of the training participants. At a first training participant, a prediction difference and a first model update quantity are determined, the first model update quantity is decomposed, and one first partial model update quantity is sent to a second training participant. At the second training participant, a second model update quantity is obtained based on the prediction difference and the corresponding conversion feature sample subset, the second model update quantity is decomposed, and one second partial model update quantity is sent to the first training participant. At each training participant, the respective conversion sub-model is updated based on the respective partial model update quantities. When the loop end condition is met, the respective sub-models are determined based on the conversion sub-models of the training participants.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to methods, apparatuses, and systems for collaborative training of linear/logistic regression models via multiple training participants using a vertically-segmented training set.
Background
Linear regression models and logistic regression models are widely used regression/classification models in the field of machine learning. In many cases, multiple model training participants (e.g., e-commerce companies, courier companies, and banks) each possess a different portion of the feature data used to train a linear/logistic regression model. These participants generally want to use each other's data jointly to train a common linear/logistic regression model, but are unwilling to provide their own data to the other participants, in order to prevent their data from being leaked.
In view of this situation, machine learning methods capable of protecting data security have been proposed, which allow a linear/logistic regression model to be trained cooperatively by multiple model training participants for their joint use while keeping each participant's data secure. However, the model training efficiency of existing privacy-preserving machine learning methods is low.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and a system for collaborative training of a linear/logistic regression model via a plurality of training participants, which can improve the efficiency of model training while ensuring the security of respective data of the plurality of training participants.
According to an aspect of the present disclosure, there is provided a method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the method being performed by the first training participant, the method comprising: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining current predicted values for the feature sample set using secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants; determining a prediction difference value between the current prediction value and the corresponding mark value; determining a first model update quantity using the prediction difference and a subset of transformed feature samples at the first training participant; decomposing the first model update quantity into two first part model update quantities, and sending one first part model update quantity to the second training participant; and receiving a second partial model update quantity from the second training participant, the second partial model update quantity being obtained by decomposing a second model update quantity at the second training participant, the second model update quantity being obtained by performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant; updating the current transition submodel at the first training participant based on the remaining first partial model update quantity and the received second partial model update quantity, wherein the updated transition submodel of each training participant is used as the current transition submodel for the next cycle process when the cycle process is not ended; determining a sub-model of the first training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the method being performed by the second training participant, the method comprising: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining current predicted values for the feature sample set using secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants; receiving a first partial model update quantity from the first training participant, the first partial model update quantity being a result of a decomposition of a first model update quantity at the first training participant, the first model update quantity being determined at the first training participant using a prediction difference and a subset of transformed feature samples at the first training participant, wherein the prediction difference is a difference between the current prediction value and a corresponding marker value; performing secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update; decomposing the second model update quantity into two second partial model update quantities, and sending one second partial model update quantity to the first training participant; and updating the current transition submodel of the second training participant based on the remaining second partial model update amount and the received first partial model update amount, wherein, when the cycle process is not finished, the updated transition submodel of each training participant is used as the current transition submodel of the next cycle process; determining a sub-model of the second training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the apparatus being located on the first training participant side, the apparatus comprising: the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants; the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; a prediction value obtaining unit configured to obtain a current prediction value for the feature sample set using secret sharing matrix multiplication based on a current conversion submodel and a conversion feature sample subset of each training participant; a prediction difference determination unit configured to determine a prediction difference between the current prediction value and a corresponding marker value; a model update amount determination unit configured to determine a first model update amount using the prediction difference value and the first converted feature sample subset; a model update amount decomposition unit configured to decompose the first model update amount into two first partial model update amounts; a model update amount transmitting/receiving unit configured to transmit a first partial model update amount to the second training participant and receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing a second model update amount at the second training participant, the second model update amount being obtained by performing secret sharing matrix multiplication on the prediction difference value and the second conversion feature sample subset; a model updating unit configured to update a current converter model at the first training participant based on the remaining first partial model update amount and the received second partial model update amount; and a model determination unit configured to determine a sub-model of the first training participant based on conversion sub-models of the first training participant and the second training participant when the cycle end condition is satisfied, wherein the sample conversion unit, the predicted value acquisition unit, the prediction difference value determination unit, the model update amount decomposition unit, the model update amount transmission/reception unit, and the model update unit cyclically perform operations until the cycle end condition is satisfied, wherein the updated conversion sub-models of the respective training participants are used as a current conversion sub-model of a next cycle process when a cycle process is not ended.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the apparatus being located on the side of the second training participant, the apparatus comprising: the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants; the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; a prediction value obtaining unit configured to obtain a current prediction value for the feature sample set using secret sharing matrix multiplication based on a current conversion submodel and a conversion feature sample subset of each training participant; a model update amount receiving unit configured to receive a first partial model update amount from the first training participant, the first partial model update amount being obtained by decomposing a first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and a conversion feature sample subset at the first training participant, wherein the prediction difference value is a difference value between the current prediction value and a corresponding label value; a second model update amount determination unit configured to perform secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update amount; a model update amount decomposition unit configured to decompose the second model update amount into two second partial model update amounts; a model update amount sending unit configured to send a second partial model update amount to the first training participant; a model updating unit configured to update a current sub-model of the second training participant based on the remaining second partial model update amount and the received first partial model update amount; and a model determination unit configured to determine a sub-model of the second training participant based on conversion sub-models of the first training participant and the second training participant when the cycle end condition is satisfied, wherein the sample conversion unit, the predicted value acquisition unit, the model update amount reception unit, the model update amount determination unit, the model update amount decomposition unit, the model update amount transmission unit, and the model update unit cyclically perform operations until the cycle end condition is satisfied, wherein the updated conversion sub-models of the respective training participants are used as a current conversion sub-model of a next cycle process when a cycle process is not ended.
According to another aspect of the present disclosure, there is provided a system for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the system comprising: a first training participant device comprising means for co-training a linear/logistic regression model via first and second training participants as described above; and a second training participant device comprising means for co-training the linear/logistic regression model via the first and second training participants as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method performed on the first training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause the at least one processor to perform a training method as described above performed on a first training participant side.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a second training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause the at least one processor to perform a training method as described above performed on a second training participant side.
With the scheme of the embodiments of the present disclosure, the model parameters of the linear/logistic regression model can be trained without leaking the secret data of the training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. Compared with the prior art, the scheme of the embodiments of the present disclosure therefore improves the efficiency of model training while ensuring the security of each training participant's data.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 shows a schematic diagram of an example of vertically sliced data according to an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for collaborative training of a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure;
FIG. 3 shows a flow diagram of a method for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of one example of a model transformation process, according to an embodiment of the present disclosure;
FIG. 5 shows a flow diagram of one example of a feature sample set transformation process, in accordance with an embodiment of the present disclosure;
FIG. 6 shows a flow diagram of a predictive value acquisition process according to an embodiment of the disclosure;
FIG. 7 shows a flowchart of one example of a secret-shared-matrix multiplication with a trusted initializer according to an embodiment of the disclosure;
FIG. 8 shows a flowchart of one example of untrusted initializer secret sharing matrix multiplication according to an embodiment of the present disclosure;
FIG. 9 illustrates a block diagram of an apparatus for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure;
FIG. 10 shows a block diagram of one example of a prediction value acquisition unit according to an embodiment of the present disclosure;
FIG. 11 illustrates a block diagram of an apparatus for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure;
FIG. 12 shows a schematic diagram of a computing device for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure;
FIG. 13 shows a schematic diagram of a computing device for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants mean open-ended terms in the sense of "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
The secret sharing method is a cryptographic technology for decomposing and storing a secret, and divides the secret into a plurality of secret shares in a proper manner, each secret share is owned and managed by one of a plurality of participants, a single participant cannot recover the complete secret, and only a plurality of participants cooperate together can the complete secret be recovered. The secret sharing method aims to prevent the secret from being too concentrated so as to achieve the purposes of dispersing risks and tolerating intrusion.
Secret sharing methods can be roughly divided into two categories: secret sharing with a trusted initializer and secret sharing without a trusted initializer. In the secret sharing method with a trusted initializer, the trusted initializer is required to perform parameter initialization (often generating random numbers meeting certain conditions) for each participant in the multi-party secure computation. After initialization is completed, the trusted initializer destroys the data and withdraws; the data is not needed in the subsequent multi-party secure computation process.
Secret sharing matrix multiplication with a trusted initializer is applicable to the following situation: the complete secret data is the product of a first set of secret shares and a second set of secret shares, and each participant has one of the first secret shares and one of the second secret shares. Through secret sharing matrix multiplication with a trusted initializer, each participant obtains a part of the complete secret data, the sum of the parts obtained by all participants is the complete secret data, and each participant discloses its obtained part to the remaining participants, so that every participant can obtain the complete secret data without disclosing the secret shares it owns, thereby ensuring the data security of every participant.
Secret sharing matrix multiplication without a trusted initializer is another of the secret sharing methods. It is applicable to the case where the complete secret is the product of a first secret share and a second secret share, and the two participants own the first and second secret shares, respectively. Through secret sharing matrix multiplication without a trusted initializer, each of the two parties generates and discloses data that differs from the secret share it owns, but the sum of the data disclosed by the two parties equals the product of the secret shares they own (i.e., the complete secret). The two parties can therefore recover the complete secret by cooperating in the secret sharing matrix multiplication without disclosing their own secret shares, which guarantees the data security of both parties.
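As an informal illustration of the additive secret sharing used throughout this disclosure, the following Python sketch (numpy assumed; names are illustrative) splits a matrix-valued secret into two random shares whose sum recovers the secret:

import numpy as np

def split_into_shares(secret, rng):
    """Additively decompose a secret matrix into two random shares."""
    share_1 = rng.normal(size=secret.shape)   # random mask held by one participant
    share_2 = secret - share_1                # remainder held by the other participant
    return share_1, share_2

rng = np.random.default_rng(0)
secret = np.arange(6.0).reshape(2, 3)         # the complete secret
s1, s2 = split_into_shares(secret, rng)

# A single share reveals nothing useful on its own; only the sum recovers the secret.
assert np.allclose(s1 + s2, secret)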
In the present disclosure, the training sample set used in the linear/logistic regression model training scheme is a vertically sliced training sample set. The term "vertically slicing the training sample set" refers to dividing the training sample set into a plurality of training sample subsets according to module/function (or some specified rule), where each training sample subset contains a part of the training sub-samples of every training sample in the training sample set, and all of the training sub-samples of a training sample taken together constitute that training sample. In one example, assume that a training sample includes the label y0 and a number of attributes; after vertical slicing, the training participant Alice owns y0 and a first part of the attributes of that training sample, and the training participant Bob owns the remaining attributes. In another example, the attributes may be divided between the training participants Alice and Bob in a different manner. Besides these two examples, other partitioning scenarios are possible and are not listed here.
Suppose a sample x described by d attributes (also called features) is given, with x^T = (x1; x2; ...; xd), where xi is the value of x on the i-th attribute and T denotes transposition. The linear regression model is Y = Wx and the logistic regression model is Y = 1/(1 + e^(-Wx)), where Y is the predicted value and W is the model parameter of the linear/logistic regression model (i.e., the model described in this disclosure), which is composed of the sub-models of the training participants; W_P refers to the sub-model at a training participant P in the present disclosure. In this disclosure, attribute value samples are also referred to as feature data samples.
In the present disclosure, each training participant has a different portion of the data of the training samples used to train the linear/logistic regression model. For example, taking two training participants, assume that the training sample set includes 100 training samples, each of which contains a plurality of feature values and a labeled actual value. The data owned by the first participant may then be some of the feature values and the labeled actual value of each of the 100 training samples, and the data owned by the second participant may be the remaining feature values of each of the 100 training samples.
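A minimal illustrative sketch of such a vertical split (Python/numpy assumed; the sample count, column split, and names are illustrative): each party holds all rows but only some columns, and the label column stays with the first participant.

import numpy as np

rng = np.random.default_rng(1)
n_samples, n_features = 100, 5
X = rng.normal(size=(n_samples, n_features))   # full feature sample set
y = rng.integers(0, 2, size=n_samples)         # label values

# Vertical segmentation: split by feature columns, not by samples.
X_A = X[:, :3]   # first participant (Alice): first 3 features of every sample, plus the labels y
X_B = X[:, 3:]   # second participant (Bob): remaining 2 features of every sample

# Every training sample is jointly described by the two parties' column blocks.
assert np.allclose(np.hstack([X_A, X_B]), X)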
For any matrix multiplication described in this disclosure, whether one or more of the matrices participating in the multiplication needs to be transposed is determined as the case may be, so that the matrix multiplication rule is satisfied and the matrix multiplication computation can be completed.
Embodiments of a method, apparatus, and system for collaborative training of a linear/logistic regression model via two training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a vertically sliced training sample set according to an embodiment of the present disclosure. In Fig. 1, 2 data parties Alice and Bob are shown; the case of more data parties is similar. Each of the data parties Alice and Bob owns a part of the training sub-samples of every training sample in the training sample set, and for each training sample, the parts owned by Alice and Bob combined together constitute the complete content of that training sample. For example, assume that the content of a certain training sample includes a label (hereinafter referred to as "label value") y0 and attribute features (hereinafter referred to as "feature samples"); after vertical slicing, the training participant Alice owns y0 and a part of the feature samples of that training sample, and the training participant Bob owns the remaining feature samples.
Fig. 2 shows an architectural schematic diagram illustrating a system 1 for collaborative training of a linear/logistic regression model via two training participants (hereinafter referred to as model training system 1) according to an embodiment of the present disclosure.
As shown in fig. 2, the model training system 1 comprises a first training participant device 10 and a second training participant device 20. The first training participant device 10 and the second training participant device 20 may communicate with each other over a network 30 such as, but not limited to, the internet or a local area network. In the present disclosure, the first training participant device 10 and the second training participant device 20 are collectively referred to as training participant devices. Wherein the first training participant device 10 owns the tag value and the second training participant device 20 does not own the tag value.
In the present disclosure, the linear/logistic regression model to be trained is decomposed into 2 sub-models, one for each training participant device. The training sample set used for model training is located at the first training participant device 10 and the second training participant device 20; the training sample set is vertically sliced as described above and comprises a feature data set and corresponding label values, i.e., the X0 and y0 shown in Fig. 1. The sub-model and the corresponding training samples owned by each training participant are a secret of that training participant and cannot be learned, or cannot be completely learned, by the other training participant.
In the present disclosure, the linear/logistic regression model and the sub-model of each training participant are represented using a weight vector W and a weight sub-vector Wi, respectively, where i denotes the serial number or identifier of the training participant (e.g., A and B). The feature data set is represented using a feature matrix X, and the predicted values and the label values are represented using a predicted value vector Ŷ and a label value vector Y, respectively.
In performing model training, the first training participant device 10 and the second training participant device 20 together perform a secret shared matrix multiplication using the respective subset of training samples and the respective submodels to obtain predicted values for the set of training samples to cooperatively train the linear/logistic regression model. The specific training process for the model will be described in detail below with reference to fig. 3 to 8.
In the present disclosure, the first training participant device 10 and the second training participant device 20 may be any suitable computing device with computing capabilities. The computing devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
FIG. 3 shows a flow diagram of a method for collaborative training of a linear/logistic regression model via two training participants, in accordance with an embodiment of the present disclosure. In the training method shown in FIG. 3, a first training participant Alice has a sub-model W_A of the linear/logistic regression model, and a second training participant Bob has a sub-model W_B of the linear/logistic regression model. The first training participant Alice has a first feature sample subset X_A and the label values Y, and the second training participant Bob has a second feature sample subset X_B. The first feature sample subset X_A and the second feature sample subset X_B are obtained by vertically slicing the feature sample set X used for model training.
As shown in FIG. 3, first, at block 301, the first training participant Alice and the second training participant Bob initialize their sub-model parameters, i.e., the weight sub-vectors W_A and W_B, to obtain initial values of their sub-model parameters, and initialize the number of executed training cycles t to zero. Here, it is assumed that the end condition of the loop process is that a predetermined number of training cycles has been performed, for example, T training cycles.
After initialization as above, at block 302, model transformation processes are performed on the respective initial sub-models at Alice and Bob, respectively, to obtain transformation sub-models.
FIG. 4 shows a flow diagram of one example of a model transformation process, according to an embodiment of the present disclosure.
As shown in FIG. 4, at block 410, at Alice, the sub-model W_A owned by Alice is decomposed into W_A1 and W_A2. Here, in the decomposition process of the sub-model W_A, the attribute value of each element of W_A is decomposed into 2 partial attribute values, and 2 new elements are obtained from the decomposed partial attribute values. The resulting 2 new elements are then assigned to W_A1 and W_A2, respectively, thereby obtaining W_A1 and W_A2.
Next, at block 420, at Bob, the sub-model W_B owned by Bob is decomposed into W_B1 and W_B2.
Then, at block 430, Alice sends W_A2 to Bob, and at block 440, Bob sends W_B1 to Alice.
Next, at block 450, at Alice, W_A1 and W_B1 are concatenated to obtain the conversion sub-model W_A'. The dimension of the resulting conversion sub-model W_A' is equal to the dimension of the feature sample set used for model training. At block 460, at Bob, W_A2 and W_B2 are concatenated to obtain the conversion sub-model W_B'. Likewise, the dimension of the resulting conversion sub-model W_B' is equal to the dimension of the feature sample set used for model training.
Returning to FIG. 3, after the model conversion is completed as above, at block 303, the first training participant Alice and the second training participant Bob cooperate to perform vertical-to-horizontal slicing conversion on the first feature sample subset X_A and the second feature sample subset X_B to obtain a first conversion feature sample subset X_A' and a second conversion feature sample subset X_B'. Each feature sample in the resulting first conversion feature sample subset X_A' and second conversion feature sample subset X_B' has the complete feature content of a training sample, i.e., the subsets are similar to feature sample subsets obtained by horizontally slicing the feature sample set.
Fig. 5 shows a flow diagram of a feature sample set transformation process according to an embodiment of the present disclosure.
As shown in FIG. 5, at block 510, at Alice, the first feature sample subset X_A is decomposed into X_A1 and X_A2. At block 520, at Bob, the second feature sample subset X_B is decomposed into X_B1 and X_B2. The decomposition process for the feature sample subsets X_A and X_B is exactly the same as the decomposition process for the sub-model W_A. Then, at block 530, Alice sends X_A2 to Bob, and at block 540, Bob sends X_B1 to Alice.
Next, at block 550, at Alice, X_A1 and X_B1 are concatenated to obtain the first conversion feature sample subset X_A'. The dimension of the resulting first conversion feature sample subset X_A' is equal to the dimension of the feature sample set X used for model training. At block 560, at Bob, X_A2 and X_B2 are concatenated to obtain the second conversion feature sample subset X_B'. The dimension of the conversion feature sample subset X_B' is also the same as the dimension of the feature sample set X.
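The "decompose, exchange, concatenate" pattern shared by the model conversion of FIG. 4 and the feature sample conversion of FIG. 5 can be illustrated with the following Python sketch (numpy assumed; shapes and names are illustrative). After the conversion, the two parties hold full-width additive shares of the complete feature sample set:

import numpy as np

rng = np.random.default_rng(2)
n = 4
X_A = rng.normal(size=(n, 3))   # Alice's vertical slice (she also holds the labels)
X_B = rng.normal(size=(n, 2))   # Bob's vertical slice

# Each party additively decomposes its slice into two shares.
X_A1 = rng.normal(size=X_A.shape); X_A2 = X_A - X_A1   # Alice keeps X_A1, sends X_A2 to Bob
X_B1 = rng.normal(size=X_B.shape); X_B2 = X_B - X_B1   # Bob keeps X_B2, sends X_B1 to Alice

# Each party concatenates its local share with the received share.
X_A_conv = np.hstack([X_A1, X_B1])   # conversion feature sample subset held by Alice
X_B_conv = np.hstack([X_A2, X_B2])   # conversion feature sample subset held by Bob

# The two conversion subsets are full-width additive shares of the whole set X.
assert np.allclose(X_A_conv + X_B_conv, np.hstack([X_A, X_B]))

# The sub-models W_A and W_B are converted in the same way, yielding W_A' and W_B'
# whose dimension equals that of the full feature sample set (illustrative names).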
After the vertical-to-horizontal slicing conversion is performed on the first feature sample subset X_A and the second feature sample subset X_B as above, the operations of blocks 304 through 314 are performed in a loop until a loop end condition is satisfied.
Specifically, at block 304, based on the current conversion sub-models W_A' and W_B' of the training participants and the respective conversion feature sample subsets X_A' and X_B' of the training participants, the current predicted value Ŷ of the linear/logistic regression model to be trained for the feature sample set X is obtained using secret sharing matrix multiplication. How the current predicted value Ŷ is obtained using secret sharing matrix multiplication will be described below with reference to FIGS. 6 to 8.
After the current predicted value Ŷ is obtained, at block 305, at the first training participant Alice, the prediction difference E between the current predicted value Ŷ and the corresponding label value Y is determined. Here, E is a column vector, Y is a column vector representing the label values of the training samples X, and Ŷ is a column vector representing the current predicted values for the training samples X. If the training sample set X contains only a single training sample, then E, Y and Ŷ are column vectors having only a single element. If the training sample set X contains multiple training samples, then E, Y and Ŷ are column vectors having multiple elements, where each element of Ŷ is the current predicted value of the corresponding training sample, each element of Y is the label value of the corresponding training sample, and each element of E is the difference between the label value of the corresponding training sample and its current predicted value.
Then, at block 306, at Alice, a first model update quantity TMP1 = X_A' * E is determined using the prediction difference E and the first conversion feature sample subset X_A'. Then, at block 307, at Alice, the first model update quantity TMP1 is decomposed into TMP1 = TMP1_A + TMP1_B. Here, the decomposition process for TMP1 is the same as the decomposition process described above and is not repeated. Subsequently, at block 308, Alice sends TMP1_B to Bob.
Then, at block 309, Alice and Bob perform secret sharing matrix multiplication on the prediction difference E and the second conversion feature sample subset X_B' to calculate a second model update quantity TMP2 = X_B' * E. Then, at block 310, at Bob, the second model update quantity TMP2 is decomposed into TMP2 = TMP2_A + TMP2_B. Subsequently, at block 311, Bob sends TMP2_A to Alice.
Next, at block 312, at Alice, the current conversion sub-model W_A' at Alice is updated based on TMP1_A and TMP2_A. Specifically, TMP_A = TMP1_A + TMP2_A is calculated first, and TMP_A is then used to update the current conversion sub-model W_A', for example using the following equation (1):
W_A'(n+1) = W_A'(n) - (α/S) * TMP_A (1)
where W_A'(n) is the current conversion sub-model at Alice, W_A'(n+1) is the updated conversion sub-model at Alice, α is the learning rate, and S is the number of training samples used in this round of the model training process, i.e., the batch size of this round of the model training process.
At block 313, at Bob, the current conversion sub-model W_B' at Bob is updated based on TMP1_B and TMP2_B. Specifically, TMP_B = TMP1_B + TMP2_B is calculated first, and TMP_B is then used to update the current conversion sub-model W_B', for example using the following equation (2):
W_B'(n+1) = W_B'(n) - (α/S) * TMP_B (2)
where W_B'(n) is the current conversion sub-model at Bob, W_B'(n+1) is the updated conversion sub-model at Bob, α is the learning rate, and S is the number of training samples used in this round of the model training process, i.e., the batch size of this round of the model training process.
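The per-round arithmetic of blocks 305 to 313 can be sanity-checked with the plaintext Python sketch below (numpy assumed; all names, the sign convention E = Ŷ - Y, and the learning rate are illustrative, and the secret sharing matrix multiplications are replaced by direct products, so this shows only the arithmetic, not the privacy protection):

import numpy as np

rng = np.random.default_rng(3)
S, d = 8, 5                                  # batch size and total feature dimension
X_conv_A = rng.normal(size=(S, d))           # Alice's conversion feature subset X_A'
X_conv_B = rng.normal(size=(S, d))           # Bob's conversion feature subset X_B'
W_conv_A = rng.normal(size=d)                # Alice's conversion sub-model W_A'
W_conv_B = rng.normal(size=d)                # Bob's conversion sub-model W_B'
Y = rng.normal(size=S)                       # label values (held by Alice)
alpha = 0.1                                  # learning rate (illustrative value)

X = X_conv_A + X_conv_B                      # full feature set (never materialized in the protocol)
Y_hat = X @ (W_conv_A + W_conv_B)            # current predicted values (block 304, via secret sharing)
E = Y_hat - Y                                # prediction difference (block 305, at Alice)

TMP1 = X_conv_A.T @ E                        # first model update quantity (block 306, at Alice)
TMP2 = X_conv_B.T @ E                        # second model update quantity (block 309, via secret sharing)

TMP1_A = rng.normal(size=d); TMP1_B = TMP1 - TMP1_A   # block 307: Alice keeps TMP1_A, sends TMP1_B
TMP2_A = rng.normal(size=d); TMP2_B = TMP2 - TMP2_A   # block 310: Bob keeps TMP2_B, sends TMP2_A

W_conv_A -= (alpha / S) * (TMP1_A + TMP2_A)  # block 312: Alice updates W_A'
W_conv_B -= (alpha / S) * (TMP1_B + TMP2_B)  # block 313: Bob updates W_B'

# Summed over the two participants, this is one ordinary gradient step on W = W_A' + W_B'.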
Then, at block 314, it is determined whether a predetermined number of cycles has been reached, i.e., whether a cycle end condition has been reached. If a predetermined number of cycles (e.g., T) is reached, block 315 is entered. If the predetermined number of cycles has not been reached, flow returns to the operation of block 302 to perform a next training cycle in which the updated submodel obtained by the respective training participant in the current cycle is used as the current submodel for the next training cycle.
At block 315, sub-models (i.e., trained sub-models) at Alice and Bob are determined based on the updated transition sub-models of Alice and Bob, respectively.
Specifically, after W_A' and W_B' have been trained as described above, Alice sends W_A'[|A|:] to Bob, and Bob sends W_B'[0:|A|] to Alice. Here, W_A'[|A|:] denotes the vector components of W_A' from dimension |A| onwards, and W_B'[0:|A|] denotes the vector components of W_B' before dimension |A|, i.e., the components from 0 to |A|. For example, if W = [0, 1, 2, 3, 4] and |A| = 2, then W[0:|A|] = [0, 1] and W[|A|:] = [2, 3, 4]. Then, at Alice, W_A = W_A'[0:|A|] + W_B'[0:|A|] is calculated, and at Bob, W_B = W_A'[|A|:] + W_B'[|A|:] is calculated, thereby obtaining the trained sub-models W_A and W_B at Alice and Bob.
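A small illustrative sketch (Python/numpy assumed; dimensions and names are illustrative) of the reconstruction at block 315, where each party recovers its own sub-model by slicing at position |A| and summing:

import numpy as np

rng = np.random.default_rng(4)
dim_A, dim_B = 2, 3                          # |A| features at Alice, the rest at Bob
W_conv_A = rng.normal(size=dim_A + dim_B)    # trained conversion sub-model W_A' at Alice
W_conv_B = rng.normal(size=dim_A + dim_B)    # trained conversion sub-model W_B' at Bob

# Alice sends W_A'[|A|:] to Bob; Bob sends W_B'[0:|A|] to Alice.
W_A = W_conv_A[:dim_A] + W_conv_B[:dim_A]    # Alice's final sub-model over her own features
W_B = W_conv_A[dim_A:] + W_conv_B[dim_A:]    # Bob's final sub-model over his own features

# Together they equal the full model W = W_A' + W_B'.
assert np.allclose(np.concatenate([W_A, W_B]), W_conv_A + W_conv_B)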
It is to be noted here that, in the above example, the end condition of the training loop process is that the predetermined number of loops is reached. In other examples of the disclosure, the end condition of the training loop process may also be that the determined prediction difference is within a predetermined range, i.e., each element E_i of the prediction difference E is within a predetermined range, for example, each element E_i of the prediction difference E is less than a predetermined threshold, or the mean of the prediction difference E is less than a predetermined threshold. Accordingly, the operations of block 314 in FIG. 3 may be performed after the operations of block 305.
Here, it is to be noted that when X_i is a single feature sample, X_i is a feature vector (column vector or row vector) consisting of multiple attributes, and E is a single prediction difference. When X_i consists of multiple feature samples, X_i is a feature matrix in which the attributes of each feature sample form one column (or row) of X_i, and E is a prediction difference vector. When X_i * E is calculated, what is multiplied by each element of E is, within the matrix X_i, the set of feature values of the respective samples for a certain feature. For example, assuming E is a column vector, each element of E is multiplied by one row of the matrix X_i, and the elements of that row represent the feature values of a certain feature for the respective samples.
Fig. 6 shows a flowchart of a predictive value acquisition process according to an embodiment of the present disclosure.
As shown in FIG. 6, first, at block 601, at Alice, Z_A1 = X_A' * W_A' is calculated using the first conversion feature sample subset X_A' and the current conversion sub-model W_A'. At block 602, at Bob, Z_B1 = X_B' * W_B' is calculated using the second conversion feature sample subset X_B' and the current conversion sub-model W_B'.
Then, at block 603, Alice and Bob calculate Z_2 = X_A' * W_B' and Z_3 = X_B' * W_A' using secret sharing matrix multiplication. Here, the secret sharing matrix multiplication may be either secret sharing matrix multiplication with a trusted initializer or secret sharing matrix multiplication without a trusted initializer. The two variants are described below with reference to FIGS. 7 and 8, respectively.
Next, at block 604, at Alice, Z_2 is decomposed into Z_A2 and Z_B2. At block 605, at Bob, Z_3 is decomposed into Z_A3 and Z_B3. Here, the decomposition process for Z_2 and Z_3 is the same as the decomposition process described above for the feature sample subsets and is not repeated here.
Then, at block 606, Alice sends Z_B2 to Bob, and at block 607, Bob sends Z_A3 to Alice.
Next, at block 608, at Alice, Z_A = Z_A1 + Z_A2 + Z_A3 is calculated. At block 609, at Bob, Z_B = Z_B1 + Z_B2 + Z_B3 is calculated. Then, at block 610, Bob sends Z_B to Alice, and at block 611, Alice sends Z_A to Bob.
After Z_B and Z_A are respectively received, at block 612, at Alice and Bob, the predicted value Ŷ is obtained from Z_A + Z_B.
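The decomposition of the predicted value in FIG. 6 can be checked with the following plaintext Python sketch (numpy assumed; names illustrative; the two cross terms are computed directly here, whereas in the protocol they are obtained via secret sharing matrix multiplication):

import numpy as np

rng = np.random.default_rng(5)
S, d = 6, 4
X_conv_A = rng.normal(size=(S, d)); X_conv_B = rng.normal(size=(S, d))   # X_A', X_B'
W_conv_A = rng.normal(size=d);      W_conv_B = rng.normal(size=d)        # W_A', W_B'

Z_A1 = X_conv_A @ W_conv_A           # block 601, computed locally at Alice
Z_B1 = X_conv_B @ W_conv_B           # block 602, computed locally at Bob
Z_2  = X_conv_A @ W_conv_B           # block 603, via secret sharing matrix multiplication
Z_3  = X_conv_B @ W_conv_A           # block 603, via secret sharing matrix multiplication

Z = Z_A1 + Z_B1 + Z_2 + Z_3          # blocks 604-612 split, exchange and re-sum these terms

# Equals the linear-regression predicted value on the full data.
assert np.allclose(Z, (X_conv_A + X_conv_B) @ (W_conv_A + W_conv_B))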
FIG. 7 illustrates a flowchart of one example of secret sharing matrix multiplication with a trusted initializer according to an embodiment of the disclosure. The computation of Z_2 = X_A' * W_B' shown in FIG. 7 is taken as an example, where X_A' is the conversion sample subset at Alice (hereinafter referred to as the feature matrix) and W_B' is the conversion sub-model at Bob (hereinafter referred to as the weight vector).
As shown in FIG. 7, first, at the trusted initializer 30, 2 random weight vectors W_R,1 and W_R,2, 2 random feature matrices X_R,1 and X_R,2, and 2 random label value vectors Y_R,1 and Y_R,2 are generated, where Y_R,1 + Y_R,2 = (X_R,1 + X_R,2) * (W_R,1 + W_R,2). Here, the dimension of the random weight vectors is the same as that of the conversion sub-models (weight vectors) of the training participants, the dimension of the random feature matrices is the same as that of the conversion sample subsets (feature matrices), and the dimension of the random label value vectors is the same as that of the label value vector.
The trusted initializer 30 then, at block 701, sends the generated W_R,1, X_R,1 and Y_R,1 to Alice, and at block 702, sends the generated W_R,2, X_R,2 and Y_R,2 to Bob.
Next, at block 703, at Alice, the feature matrix X_A' is decomposed into 2 feature sub-matrices, i.e., feature sub-matrices X_A1' and X_A2'.
For example, assume the feature matrix X_A' includes two feature samples S1 and S2, each of which includes 3 attribute values, where S1 = [a1^1, a2^1, a3^1] and S2 = [a1^2, a2^2, a3^2]. After the feature matrix X_A' is decomposed into 2 feature sub-matrices X_A1' and X_A2', the first feature sub-matrix X_A1' includes the feature sub-samples [a11^1, a21^1, a31^1] and [a11^2, a21^2, a31^2], and the second feature sub-matrix X_A2' includes the feature sub-samples [a12^1, a22^1, a32^1] and [a12^2, a22^2, a32^2], where a11^1 + a12^1 = a1^1, a21^1 + a22^1 = a2^1, a31^1 + a32^1 = a3^1, a11^2 + a12^2 = a1^2, a21^2 + a22^2 = a2^2, and a31^2 + a32^2 = a3^2.
Then, at block 704, Alice sends the decomposed feature sub-matrix X_A2' to Bob.
At block 705, at Bob, the weight vector W_B' is decomposed into 2 weight sub-vectors W_B1' and W_B2'. The decomposition process for the weight vector is the same as the decomposition process described above. At block 706, Bob sends the weight sub-vector W_B1' to Alice.
Then, at each training participant, a weight sub-vector difference E and a feature sub-matrix difference D at that training participant are determined based on the training participant's weight sub-vector, corresponding feature sub-matrix, and the received random weight vector and random feature matrix. For example, at block 707, at Alice, its weight sub-vector difference E1 = W_B1' - W_R,1 and feature sub-matrix difference D1 = X_A1' - X_R,1 are determined. At block 708, at Bob, its weight sub-vector difference E2 = W_B2' - W_R,2 and feature sub-matrix difference D2 = X_A2' - X_R,2 are determined.
After each training participant determines its weight sub-vector difference Ei and feature sub-matrix difference Di, at block 709, Alice sends D1 and E1 to the training cooperator Bob. At block 710, the training cooperator Bob sends D2 and E2 to Alice.
Then, at block 711, at each training participant, the weight sub-vector differences and the feature sub-matrix differences of the training participants are summed to obtain a total weight sub-vector difference E and a total feature sub-matrix difference D, respectively. For example, as shown in FIG. 7, D = D1 + D2 and E = E1 + E2.
Then, at each training participant, the predicted value vector Zi is calculated based on the received random weight vector W_R,i, random feature matrix X_R,i, random label value vector Y_R,i, the total weight sub-vector difference E, and the total feature sub-matrix difference D.
In one example of the present disclosure, at each training participant, the training participant's random label value vector, the product of the total weight sub-vector difference and the training participant's random feature matrix, and the product of the total feature sub-matrix difference and the training participant's random weight vector may be summed to obtain the corresponding predicted value vector (the first calculation). Alternatively, the training participant's random label value vector, the product of the total weight sub-vector difference and the training participant's random feature matrix, the product of the total feature sub-matrix difference and the training participant's random weight vector, and the product of the total weight sub-vector difference and the total feature sub-matrix difference may be summed to obtain the corresponding predicted value vector (the second calculation).
It should be noted here that only one of the predicted value vectors calculated at the training participants includes the product of the total weight sub-vector difference and the total feature sub-matrix difference. In other words, only one training participant computes its predicted value vector using the second calculation, while the remaining training participants compute their corresponding predicted value vectors using the first calculation.
For example, at block 712, at Alice, the corresponding predicted value vector Z1 = Y_R,1 + E * X_R,1 + D * W_R,1 + D * E is calculated. At block 713, at Bob, the corresponding predicted value vector Z2 = Y_R,2 + E * X_R,2 + D * W_R,2 is calculated.
Note that, in FIG. 7, the Z1 calculated at Alice includes D * E. In other examples of the disclosure, D * E may instead be included in the Z2 calculated by Bob, in which case D * E is not included in the Z1 calculated at Alice. In other words, only one of the Zi calculated at the training participants contains D * E.
Alice then sends Z1 to Bob at block 714. At block 715, Bob sends Z2 to Alice.
Then, at blocks 716 and 717, the training participants compute the sum Z = Z1 + Z2 to obtain the secret sharing matrix multiplication result.
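The arithmetic of the FIG. 7 protocol can be checked with the following plaintext Python sketch (numpy assumed; the randomness of the trusted initializer and of both participants is simulated in one process, and names and shapes are illustrative, so this demonstrates correctness of the reconstruction Z = Z1 + Z2 only, not the security properties):

import numpy as np

rng = np.random.default_rng(6)
S, d = 5, 3
X = rng.normal(size=(S, d))          # X_A', the feature matrix held by Alice
W = rng.normal(size=d)               # W_B', the weight vector held by Bob

# Trusted initializer: random data with Y_R1 + Y_R2 = (X_R1 + X_R2) @ (W_R1 + W_R2)
X_R1 = rng.normal(size=(S, d)); X_R2 = rng.normal(size=(S, d))
W_R1 = rng.normal(size=d);      W_R2 = rng.normal(size=d)
Y_R = (X_R1 + X_R2) @ (W_R1 + W_R2)
Y_R1 = rng.normal(size=S);      Y_R2 = Y_R - Y_R1

# Blocks 703-706: additive splits, one piece exchanged.
X1 = rng.normal(size=(S, d)); X2 = X - X1        # X_A1' kept by Alice, X_A2' sent to Bob
W1 = rng.normal(size=d);      W2 = W - W1        # W_B1' sent to Alice, W_B2' kept by Bob

# Blocks 707-711: differences against the initializer's randomness, then totals.
D = (X1 - X_R1) + (X2 - X_R2)    # D = D1 + D2
E = (W1 - W_R1) + (W2 - W_R2)    # E = E1 + E2

# Blocks 712-713: only Alice's share carries the D @ E term.
Z1 = Y_R1 + X_R1 @ E + D @ W_R1 + D @ E
Z2 = Y_R2 + X_R2 @ E + D @ W_R2

# Blocks 714-717: the shares sum to the desired product X_A' @ W_B'.
assert np.allclose(Z1 + Z2, X @ W)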
FIG. 8 illustrates a flowchart of one example of secret sharing matrix multiplication without a trusted initializer according to an embodiment of the present disclosure. In FIG. 8, the calculation process of X_A' * W_B' between the training participants Alice and Bob is explained as an example.
As shown in FIG. 8, first, at block 801, if the number of rows of X_A' at Alice (hereinafter referred to as the first feature matrix) is not even, and/or the number of columns of the current sub-model parameter W_B' at Bob (hereinafter referred to as the first weight sub-matrix) is not even, dimension-completion processing is performed on the first feature matrix X_A' and/or the first weight sub-matrix W_B' so that the number of rows of X_A' is even and/or the number of columns of W_B' is even. For example, the dimension-completion processing may be performed by adding a row of 0 values at the end of X_A' and/or adding a column of 0 values at the end of W_B'. In the following description, it is assumed that the first weight sub-matrix W_B' has dimension I*J and the first feature matrix X_A' has dimension J*K, where J is an even number.
The operations of blocks 802 to 804 are then performed at Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 802, a random feature matrix X1 is generated. Here, the dimension of the random feature matrix X1 is the same as that of the first feature matrix X_A', i.e., X1 has dimension J*K. At block 803, the random feature matrix X1 is subtracted from the first feature matrix X_A' to obtain a second feature matrix X2. The dimension of the second feature matrix X2 is J*K. At block 804, the even-row sub-matrix X1_e of the random feature matrix X1 is subtracted from the odd-row sub-matrix X1_o of the random feature matrix X1 to obtain a third feature matrix X3. The dimension of the third feature matrix X3 is j*K, where j = J/2.
Further, the operations of blocks 805 to 807 are performed at Bob to obtain a random weight sub-matrix W_B1 and second and third weight sub-matrices W_B2 and W_B3. Specifically, at block 805, a random weight sub-matrix W_B1 is generated. Here, the dimension of the random weight sub-matrix W_B1 is the same as that of the first weight sub-matrix W_B', i.e., W_B1 has dimension I*J. At block 806, the first weight sub-matrix W_B' and the random weight sub-matrix W_B1 are summed to obtain a second weight sub-matrix W_B2. The dimension of the second weight sub-matrix W_B2 is I*J. At block 807, the odd-column sub-matrix W_B1_o of the random weight sub-matrix W_B1 is added to the even-column sub-matrix W_B1_e of the random weight sub-matrix W_B1 to obtain a third weight sub-matrix W_B3. The dimension of the third weight sub-matrix W_B3 is I*j, where j = J/2.
Then, at block 808, Alice sends the generated second feature matrix X2 and third feature matrix X3 to Bob, and at block 809, Bob sends the second weight submatrix W_B2 and the third weight submatrix W_B3 to Alice.
Next, at block 810, at Alice, a matrix calculation is performed based on the equation Y1 = W_B2*(2*X_A' - X1) - W_B3*(X3 + X1_e) to obtain the first matrix product Y1, and at block 812, the first matrix product Y1 is sent to Bob.
At block 811, at Bob, the second matrix product Y2 is computed based on the equation Y2 = (W_B' + 2*W_B1)*X2 + (W_B3 + W_B1_o)*X3, and at block 813, the second matrix product Y2 is sent to Alice.
Then, at blocks 814 and 815, the first matrix product Y1 and the second matrix product Y2 are summed at Alice and Bob, respectively, to obtain X_A'*W_B' = Y_B = Y1 + Y2.
Here, FIGS. 6 to 8 show the calculation process of the current predicted value Y = W*X in the case of a linear regression model. In the case of a logistic regression model, W*X may first be determined according to the procedure shown in FIGS. 6 to 8 and then substituted into the logistic regression model Y = 1/(1 + e^(-W*X)), thereby calculating the current predicted value.
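As a minimal illustration of this last step, the sketch below applies the logistic function to a reconstructed product W*X to obtain the current predicted value; for linear regression the product is used directly. The function name is illustrative only.

```python
import numpy as np

def current_prediction(wx, model="logistic"):
    """wx: the reconstructed W*X obtained via secret sharing matrix multiplication."""
    if model == "linear":
        return wx                          # linear regression: Y = W*X
    return 1.0 / (1.0 + np.exp(-wx))       # logistic regression: Y = 1 / (1 + e^(-W*X))
```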
By using the linear/logistic regression model training method disclosed in FIGS. 3 to 8, the model parameters of the linear/logistic regression model can be trained without leaking the secret data of any of the training participants. Moreover, the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training, so the efficiency of model training can be improved while the security of each training participant's data is guaranteed.
Fig. 9 shows a schematic diagram of an apparatus for collaboratively training a linear/logistic regression model via two training participants (hereinafter referred to as model training apparatus) 900 according to an embodiment of the present disclosure. Each training participant has one sub-model of the linear/logistic regression model; the first training participant (Alice) has a first feature sample subset and the label values; the second training participant (Bob) has a second feature sample subset; the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training; and the model training apparatus 900 is located on the first training participant side.
As shown in fig. 9, the model training apparatus 900 includes a model conversion unit 910, a sample conversion unit 920, a prediction value acquisition unit 930, a prediction difference determination unit 940, a model update amount determination unit 950, a model update amount decomposition unit 960, a model update amount transmission/reception unit 970, a model update unit 980, and a model determination unit 990.
The model transformation unit 910 is configured to perform a model transformation process on the sub-models of the respective training participants to obtain transformation sub-models of the respective training participants. The operation of the model conversion unit 910 may refer to the operation of block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
At the time of model training, the sample conversion unit 920, the predicted value acquisition unit 930, the predicted difference value determination unit 940, the model update amount determination unit 950, the model update amount decomposition unit 960, the model update amount transmission/reception unit 970, and the model update unit 980 are configured to cyclically perform operations until a cycle end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
Specifically, in each iteration process, the sample conversion unit 920 is configured to perform vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 920 may refer to the process described above with reference to fig. 5.
The prediction value obtaining unit 930 is configured to obtain a current prediction value for the feature sample set using secret sharing matrix multiplication based on the current conversion submodel and the conversion feature sample subset of the respective training participants. The operation of the prediction value acquisition unit 930 may refer to the operation of the block 304 described above with reference to fig. 3 and the operation described with reference to fig. 6 to 8.
The prediction difference determination unit 940 is configured to determine a prediction difference between the current prediction value and the corresponding flag value. The operation of the prediction difference determination unit 940 may refer to the operation of the block 305 described above with reference to fig. 3.
The model update amount determination unit 950 is configured to determine a first model update amount using the prediction difference and the subset of transformed feature samples at the first training participant. The operation of the model update amount determination unit 950 may refer to the operation of the block 306 described above with reference to fig. 3.
The model update amount decomposition unit 960 is configured to decompose the first model update amount into two first partial model update amounts. The operation of the model update amount decomposition unit 960 may refer to the operation of block 307 described above with reference to fig. 3.
The model update amount transmitting/receiving unit 970 is configured to transmit a first partial model update amount to the second training participant and receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing a second model update amount at the second training participant, the second model update amount being obtained by performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant. The operation of the model update amount transmitting/receiving unit 970 may refer to the operation of block 308/311 described above with reference to fig. 3.
The model update unit 980 is configured to update the current conversion submodel at the first training participant based on the remaining first partial model update amount and the received second partial model update amount. The operation of the model update unit 980 may refer to the operation of block 312 described above with reference to fig. 3.
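The cooperation of units 950 to 980 can be sketched as follows in Python. The gradient-style form of the update amount, the learning rate and the plain additive split are assumptions made only for illustration; the exact formulas are those of blocks 306 to 312 described earlier in the disclosure.

```python
import numpy as np

rng = np.random.default_rng(1)

def split_update(u):
    """Decompose a model update amount into two additive partial update amounts."""
    part = rng.standard_normal(u.shape)
    return part, u - part

# At Alice (first training participant): prediction difference e and converted
# feature sample subset X_alice are assumed to be already available.
e = rng.standard_normal((1, 5))            # prediction difference (placeholder values)
X_alice = rng.standard_normal((4, 5))      # converted feature sample subset (placeholder)
G1 = e @ X_alice.T                         # first model update amount (assumed gradient form)

G1_keep, G1_send = split_update(G1)        # unit 960: decompose into two partial updates
# Unit 970: G1_send goes to Bob; G2_recv is the partial update received from Bob,
# produced there by decomposing Bob's own secret-shared second model update amount.
G2_recv = rng.standard_normal(G1.shape)    # placeholder for the received share

alpha = 0.01                               # learning rate (assumed)
W_alice = rng.standard_normal((1, 4))      # current conversion submodel at Alice
W_alice = W_alice - alpha * (G1_keep + G2_recv)   # unit 980: update the conversion submodel
```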
The model determination unit 990 is configured to determine the sub-model of the first training participant based on the conversion sub-models of the first training participant and the second training participant when the loop end condition is fulfilled. The operation of the model determination unit 990 may refer to the operation of block 315 described above with reference to fig. 3.
In one example of the present disclosure, the sample conversion unit 920 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample stitching module (not shown). The sample decomposition module is configured to decompose the first subset of feature samples into two first partial subsets of feature samples. The sample sending/receiving module is configured to send a first partial feature sample subset to the second training participant and receive a second partial feature sample subset from the second training participant, the second partial feature sample subset being obtained by decomposing the feature sample subset at the second training participant. The sample stitching module is configured to stitch the remaining first subset of partial feature samples and the received second subset of partial feature samples to obtain a first subset of transformed feature samples.
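Under the assumption that the decomposition is a plain additive random split, the decompose/send/splice steps above make the two converted feature sample subsets additive shares of the full vertically partitioned feature matrix, as the following sketch shows.

```python
import numpy as np

rng = np.random.default_rng(2)

def additive_split(m):
    """Decompose a feature sample subset into two partial feature sample subsets."""
    part = rng.standard_normal(m.shape)
    return part, m - part

# Vertically partitioned data: Alice holds features 0..1, Bob holds features 2..3.
X_A = rng.standard_normal((5, 2))   # first feature sample subset (at Alice)
X_B = rng.standard_normal((5, 2))   # second feature sample subset (at Bob)

X_A_keep, X_A_send = additive_split(X_A)   # Alice decomposes and sends one part to Bob
X_B_keep, X_B_send = additive_split(X_B)   # Bob decomposes and sends one part to Alice

# Splicing (horizontal concatenation) yields the converted feature sample subsets.
X_alice_conv = np.hstack([X_A_keep, X_B_send])   # at Alice
X_bob_conv = np.hstack([X_A_send, X_B_keep])     # at Bob

# Together they form additive shares of the full feature matrix.
assert np.allclose(X_alice_conv + X_bob_conv, np.hstack([X_A, X_B]))
```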
Fig. 10 shows a block diagram of one example of a prediction value acquisition unit (hereinafter referred to as "prediction value acquisition unit 1000") according to an embodiment of the present disclosure. As shown in fig. 10, the prediction value acquisition unit 1000 may include a first calculation module 1010, a second calculation module 1020, a matrix product decomposition module 1030, a matrix product transmission/reception module 1040, a first summation module 1050, a sum value transmission/reception module 1060, and a second summation module 1070.
The first calculation module 1010 is configured to calculate a first matrix product of the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_A') of the first training participant. The operation of the first calculation module 1010 may refer to the operation of block 601 described above with reference to fig. 6.
The second calculation module 1020 is configured to calculate, using secret sharing matrix multiplication, a second matrix product between the conversion submodel (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant and between the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the second calculation module 1020 may refer to the operation of block 603 described above with reference to fig. 6 and the operations described with reference to figs. 7-8.
The matrix product decomposition module 1030 is configured to decompose the calculated second matrix product to obtain 2 second partial matrix products. The operation of the matrix product decomposition module 1030 may refer to the operation of block 604 described above with reference to fig. 6.
The matrix product transmit/receive module 1040 is configured to transmit a second partial matrix product to the second training participant and receive a third partial matrix product from the second training participant. The third partial matrix product is obtained by decomposing the third matrix product at the second training participant. The third matrix product is the matrix product, calculated by the second training participant, of the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operation of the matrix product transmit/receive module 1040 may refer to the operations of blocks 606 and 607 described above with reference to fig. 6.
The first summation module 1050 is configured to sum the first matrix product, the second partial matrix product, and the third partial matrix product to obtain a first matrix product-sum value at the first training participant. The operation of the first summing module 1050 may refer to the operation of block 608 described above with reference to fig. 6.
The sum value transmission/reception module 1060 is configured to receive a second matrix product sum value (Z_B) obtained at the second training participant and to send the first matrix product sum value (Z_A) obtained at the first training participant to the second training participant. The operation of the sum value transmission/reception module 1060 may refer to the operation of block 610/611 described above with reference to fig. 6.
The second summation module 1070 is configured to sum the resulting first and second matrix product-sum values to obtain the current predicted values of the linear/logistic regression model for the feature sample set. The operation of the second summing module 1070 may refer to the operation of block 612 described above with reference to fig. 6.
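Putting the modules of fig. 10 together: each participant computes one local product, the two cross products are obtained as secret shares, and the exchanged sum values reconstruct the prediction. The sketch below assumes the conversion submodels and conversion feature sample subsets are additive shares of the converted model and converted samples, and it stands in for the secret-shared cross multiplications of figs. 7 to 8 with a simple additive split so that the example stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(3)

def additive_shares(m):
    """Split a matrix into two additive shares."""
    s = rng.standard_normal(m.shape)
    return s, m - s

W = rng.standard_normal((1, 4)); W_a, W_b = additive_shares(W)   # conversion submodels
X = rng.standard_normal((4, 5)); X_a, X_b = additive_shares(X)   # conversion feature subsets

# First matrix products, computed locally (block 601 and its counterpart).
P_aa = W_a @ X_a            # at Alice
P_bb = W_b @ X_b            # at Bob

# Cross products W_b@X_a and W_a@X_b are obtained as secret shares via figs. 7-8;
# here they are simply split additively to keep the sketch self-contained.
P_ba_a, P_ba_b = additive_shares(W_b @ X_a)   # second matrix product, shared
P_ab_a, P_ab_b = additive_shares(W_a @ X_b)   # third matrix product, shared

# Matrix product sum values (block 608 and its counterpart), then exchange and sum.
Z_A = P_aa + P_ba_a + P_ab_a
Z_B = P_bb + P_ba_b + P_ab_b
Y = Z_A + Z_B               # current predicted value (linear case)

assert np.allclose(Y, W @ X)
```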
In one example of the present disclosure, the second calculation module 1020 may be configured to: calculate, using trusted initializer secret sharing matrix multiplication, the second matrix product between the conversion submodel (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant and between the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operations of the second calculation module 1020 may refer to the operations performed at the first training participant described above with reference to fig. 7.
In another example of the present disclosure, the second calculation module 1020 may be configured to: calculate, using untrusted initializer secret sharing matrix multiplication, the second matrix product between the conversion submodel (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant and between the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant. The operations of the second calculation module 1020 may refer to the operations performed at the first training participant described above with reference to fig. 8.
Fig. 11 shows a schematic diagram of an apparatus (hereinafter referred to as model training apparatus) 1100 for collaboratively training a linear/logistic regression model via two training participants, according to an embodiment of the present disclosure. Each training participant has one sub-model of the linear/logistic regression model; the first training participant (Alice) has a first feature sample subset and the label values; the second training participant (Bob) has a second feature sample subset; the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training; and the model training apparatus 1100 is located on the second training participant side.
As shown in fig. 11, the model training apparatus 1100 includes a model conversion unit 1110, a sample conversion unit 1120, a prediction value acquisition unit 1130, a model update amount receiving unit 1140, a model update amount determining unit 1150, a model update amount decomposition unit 1160, a model update amount sending unit 1170, a model update unit 1180, and a model determining unit 1190.
The model transformation unit 1110 is configured to perform a model transformation process on the sub-models of the respective training participants to obtain transformed sub-models of the respective training participants. The operation of the model conversion unit 1110 may refer to the operation of block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
In performing model training, the sample conversion unit 1120, the predicted value acquisition unit 1130, the model update amount reception unit 1140, the model update amount determination unit 1150, the model update amount decomposition unit 1160, the model update amount transmission unit 1170, and the model update unit 1180 are configured to perform operations in a loop until a loop end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
In particular, during each iteration, the sample conversion unit 1120 is configured to perform a vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 1120 may refer to the process described above with reference to fig. 5. Also, the sample conversion unit 1120 may have the same structure as the sample conversion unit 920.
The predictor obtaining unit 1130 is configured to obtain a current predicted value for the feature sample set using secret sharing matrix multiplication based on the current conversion submodel and the conversion feature sample subset of the respective training participants. Here, the predictor acquisition unit 1130 may be configured to obtain the current predicted value for the feature sample set using trusted initializer secret sharing matrix multiplication or untrusted initializer secret sharing matrix multiplication. The operation of the predictor acquisition unit 1130 may refer to the operation of block 304 described above with reference to fig. 3. The predictor acquisition unit 1130 may employ the same structure as the predicted value acquisition unit 930 (i.e., the structure shown in fig. 10). Accordingly, the second calculation module in the predictor acquisition unit 1130 is configured to calculate the second matrix product between the conversion submodel (W_B') of the second training participant and the conversion feature sample subset (X_A') of the first training participant and between the conversion submodel (W_A') of the first training participant and the conversion feature sample subset (X_B') of the second training participant.
The model update amount receiving unit 1140 is configured to receive a first partial model update amount from a first training participant, the first partial model update amount being obtained by decomposing the first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and a conversion feature sample subset at the first training participant, wherein the prediction difference value is a difference value between a current prediction value and a corresponding label value. The operation of the model update amount receiving unit 1140 may refer to the operation of block 308 described above with reference to fig. 3.
The second model update amount determination unit 1150 is configured to perform a secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update amount. The operation of the second model update amount determination unit 1150 may refer to the operation of block 309 described above with reference to fig. 3. Here, the second model update amount determination unit 1150 may be implemented using the second calculation module 1020 described in fig. 10. That is, the second model update amount determination unit 1150 may be configured to perform a trusted initializer secret sharing matrix multiplication or a non-trusted initializer secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain the second model update amount.
The model update amount decomposition unit 1160 is configured to decompose the second model update amount into two second partial model update amounts. The operation of the model update amount decomposition unit 1160 may refer to the operation of block 310 described above with reference to fig. 3.
The model update amount transmitting unit 1170 is configured to transmit a second partial model update amount to the first training participant. The operation of the model update amount transmitting unit 1170 may refer to the operation of the block 311 described above with reference to fig. 3.
The model update unit 1180 is configured to update the current conversion submodel of the second training participant based on the remaining second partial model update quantity and the received first partial model update quantity. The operation of the model update unit 1180 may refer to the operation of block 313 described above with reference to fig. 3.
The model determination unit 1190 is configured to determine a sub-model of the second training participant based on the transformed sub-models of the first and second training participants when the loop end condition is fulfilled. The operation of model determination unit 1190 may refer to the operation of block 315 described above with reference to fig. 3.
Embodiments of a model training method, apparatus and system according to the present disclosure are described above with reference to fig. 1 through 11. The above model training device can be implemented by hardware, or can be implemented by software, or a combination of hardware and software.
FIG. 12 illustrates a hardware block diagram of a computing device 1200 for implementing collaborative training of a linear/logistic regression model via two training participants, according to an embodiment of the disclosure. As shown in fig. 12, computing device 1200 may include at least one processor 1210, storage (e.g., non-volatile storage) 1220, memory 1230, and a communication interface 1240, and the at least one processor 1210, storage 1220, memory 1230, and communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1210 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining current predicted values for the feature sample set using secret shared matrix multiplication based on the current transformation submodel and the transformation feature sample subset for each training participant; determining a prediction difference value between the current prediction value and the corresponding mark value; determining a first model update quantity using the prediction difference and the subset of transformed feature samples at the first training participant; decomposing the first model updating quantity into two first part model updating quantities, and sending one first part model updating quantity to a second training participant; and receiving a second partial model update quantity from the second training participant, the second partial model update quantity being obtained by decomposing the second model update quantity at the second training participant, the second model update quantity being obtained by performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant; updating a current transition submodel at the first training participant based on the remaining first partial model update quantity and the received second partial model update quantity, wherein the updated transition submodel of each training participant is used as the current transition submodel for the next cycle process when the cycle process is not ended; when the loop ending condition is met, determining a sub-model of the first training participant based on the conversion sub-models of the first training participant and the second training participant.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-11 in the various embodiments of the present disclosure.
FIG. 13 illustrates a hardware block diagram of a computing device 1300 for implementing collaborative training of a linear/logistic regression model via two training participants, according to an embodiment of the disclosure. As shown in fig. 13, computing device 1300 may include at least one processor 1310, storage (e.g., non-volatile storage) 1320, memory 1330, and communication interface 1340, and the at least one processor 1310, storage 1320, memory 1330, and communication interface 1340 are connected together via a bus 1360. The at least one processor 1310 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1310 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining current predicted values for the feature sample set using secret shared matrix multiplication based on the current transformation submodel and the transformation feature sample subset for each training participant; receiving a first partial model update quantity from a first training participant, the first partial model update quantity being obtained by decomposing the first model update quantity at the first training participant, the first model update quantity being determined at the first training participant using a prediction difference value and a conversion feature sample subset at the first training participant, wherein the prediction difference value is a difference value between a current prediction value and a corresponding label value; performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant to obtain a second model update quantity; decomposing the second model updating quantity into two second part model updating quantities, and sending one second part model updating quantity to the first training participant; and updating the current transition submodel of the second training participant based on the remaining second partial model update quantity and the received first partial model update quantity, wherein, when the cycle process is not finished, the updated transition submodel of each training participant is used as the current transition submodel of the next cycle process; determining a sub-model of a second training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1310 to perform the various operations and functions described above in connection with fig. 1-11 in the various embodiments of the present disclosure.
According to one embodiment, a program product, such as a machine-readable medium (e.g., a non-transitory machine-readable medium), is provided. The machine-readable medium may have instructions (i.e., the elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with figs. 1-11 in the various embodiments of the present disclosure. Specifically, a system or apparatus equipped with a readable storage medium may be provided, where software program code implementing the functions of any of the above embodiments is stored on the readable storage medium, and a computer or processor of the system or apparatus is caused to read out and execute the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (21)

1. A method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing the set of feature samples, the method being performed by the first training participant, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining current predicted values for the feature sample set using secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants;
determining a prediction difference value between the current prediction value and the corresponding mark value;
determining a first model update quantity using the prediction difference and a subset of transformed feature samples at the first training participant;
decomposing the first model update quantity into two first part model update quantities, and sending one first part model update quantity to the second training participant; and
receiving a second partial model update quantity from the second training participant, the second partial model update quantity being obtained by decomposing a second model update quantity at the second training participant, the second model update quantity being obtained by performing secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant;
updating the current transition submodel at the first training participant based on the remaining first partial model update quantity and the received second partial model update quantity, wherein the updated transition submodel of each training participant is used as the current transition submodel for the next cycle process when the cycle process is not ended;
determining a sub-model of the first training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied.
2. The method of claim 1, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the first subset of feature samples into two first partial subsets of feature samples;
sending one first partial feature sample subset to the second training participant;
receiving a second partial feature sample subset from the second training participant, the second partial feature sample subset being derived by decomposing the feature sample subset at the second training participant; and
and splicing the remaining first part of feature sample subset and the received second part of feature sample subset to obtain a conversion feature sample subset at the first training participant.
3. The method of claim 1 or 2, wherein using secret sharing matrix multiplication to obtain current predictors for the set of feature samples based on current transformation submodels and transformation feature sample subsets of respective training participants comprises:
obtaining current predictors for the feature sample set using a trusted initializer secret sharing matrix multiplication based on the current transformation submodel and the transformation feature sample subset for each training participant.
4. The method of claim 1 or 2, wherein using secret sharing matrix multiplication to obtain current predictors for the set of feature samples based on current transformation submodels and transformation feature sample subsets of respective training participants comprises:
obtaining current predictors for the feature sample set using untrusted initializer secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants.
5. The method of any of claims 1 to 4, wherein the loop end condition comprises:
reaching a predetermined number of cycles; or
the prediction difference being within a predetermined range.
6. A method for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having one sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing the set of feature samples, the method being performed by the second training participant, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining current predicted values for the feature sample set using secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants;
receiving a first partial model update quantity from the first training participant, the first partial model update quantity being a result of a decomposition of a first model update quantity at the first training participant, the first model update quantity being determined at the first training participant using a prediction difference and a subset of transformed feature samples at the first training participant, wherein the prediction difference is a difference between the current prediction value and a corresponding marker value;
performing secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update;
decomposing the second model update quantity into two second partial model update quantities, and sending one second partial model update quantity to the first training participant; and
updating the current transition submodel of the second training participant based on the remaining second partial model update quantity and the received first partial model update quantity, wherein the updated transition submodel of each training participant is used as the current transition submodel of the next cycle process when the cycle process is not finished;
determining a sub-model of the second training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied.
7. The method of claim 6, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the second subset of feature samples into two second partial subsets of feature samples;
sending one second partial feature sample subset to the first training participant;
receiving a first partial feature sample subset from the first training participant, the first partial feature sample subset being derived by decomposing a feature sample subset at the first training participant; and
and splicing the remaining second part of the feature sample subset and the received first part of the feature sample subset to obtain a conversion feature sample subset at the second training participant.
8. The method of claim 6 or 7, wherein using secret sharing matrix multiplication to obtain current predictors for the set of feature samples based on current transformation submodels and transformation feature sample subsets of respective training participants comprises:
obtaining current predicted values for the feature sample set using a trusted initializer secret sharing matrix multiplication based on a current transformation submodel and a transformation feature sample subset of each training participant; or
Obtaining current predictors for the feature sample set using untrusted initializer secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants.
9. The method of claim 6 or 7, wherein performing a secret sharing matrix multiplication on the predicted difference and the transformed feature sample subset at the second training participant to obtain a second model update quantity comprises:
performing trusted initializer secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant to obtain a second model updating amount; or
Performing untrusted initializer secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update.
10. An apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the apparatus being located on the first training participant side, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a prediction value obtaining unit configured to obtain a current prediction value for the feature sample set using secret sharing matrix multiplication based on a current conversion submodel and a conversion feature sample subset of each training participant;
a prediction difference determination unit configured to determine a prediction difference between the current prediction value and a corresponding marker value;
a model update amount determination unit configured to determine a first model update amount using the prediction difference and a subset of transformed feature samples at a first training participant;
a model update amount decomposition unit configured to decompose the first model update amount into two first partial model update amounts;
a model update amount transmitting/receiving unit configured to transmit a first partial model update amount to the second training participant and receive a second partial model update amount from the second training participant, the second partial model update amount being obtained by decomposing a second model update amount at the second training participant, the second model update amount being obtained by performing secret sharing matrix multiplication on the prediction difference value and a conversion feature sample subset of the second training participant;
a model updating unit configured to update a current conversion submodel at the first training participant based on the remaining first partial model update amount and the received second partial model update amount; and
a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the predicted value acquisition unit, the predicted difference value determination unit, the model update amount decomposition unit, the model update amount transmission/reception unit, and the model update unit cyclically execute operations until a cycle end condition is satisfied,
and when the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
11. The apparatus of claim 10, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the first subset of feature samples into two first partial subsets of feature samples;
a sample sending/receiving module configured to send a first partial feature sample subset to the second training participant and receive a second partial feature sample subset from the second training participant, the second partial feature sample subset being obtained by decomposing the feature sample subset at the second training participant; and
a sample stitching module configured to stitch the remaining first partial subset of feature samples and the received second partial subset of feature samples to obtain a transformed subset of feature samples at the first training participant.
12. The apparatus according to claim 10 or 11, wherein the prediction value acquisition unit is configured to:
obtaining current predicted values for the feature sample set using a trusted initializer secret sharing matrix multiplication based on a current transformation submodel and a transformation feature sample subset of each training participant; or
Obtaining current predictors for the feature sample set using untrusted initializer secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants.
13. An apparatus for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertical segmentation of a set of feature samples, the apparatus being located on the side of the second training participant, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a prediction value obtaining unit configured to obtain a current prediction value for the feature sample set using secret sharing matrix multiplication based on a current conversion submodel and a conversion feature sample subset of each training participant;
a model update amount receiving unit configured to receive a first partial model update amount from the first training participant, the first partial model update amount being obtained by decomposing a first model update amount at the first training participant, the first model update amount being determined at the first training participant using a prediction difference value and a conversion feature sample subset at the first training participant, wherein the prediction difference value is a difference value between the current prediction value and a corresponding label value;
a second model update amount determination unit configured to perform secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update amount;
a model update amount decomposition unit configured to decompose the second model update amount into two second partial model update amounts;
a model update amount sending unit configured to send a second partial model update amount to the first training participant;
a model updating unit configured to update a current conversion submodel of the second training participant based on the remaining second partial model update amount and the received first partial model update amount; and
a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-models of the first and second training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the predicted value acquisition unit, the model update amount reception unit, the model update amount determination unit, the model update amount decomposition unit, the model update amount transmission unit, and the model update unit cyclically execute operations until a cycle end condition is satisfied,
and when the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
14. The apparatus of claim 13, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the second subset of feature samples into two second partial subsets of feature samples;
a sample sending/receiving module configured to send a second partial feature sample subset to the first training participant and receive a first partial feature sample subset from the first training participant, the first partial feature sample subset being obtained by decomposing the feature sample subset at the first training participant; and
a sample stitching module configured to stitch the remaining second subset of partial feature samples and the received first subset of partial feature samples to obtain a transformed subset of feature samples at the second training participant.
15. The apparatus according to claim 13 or 14, wherein the prediction value acquisition unit is configured to:
obtaining current predicted values for the feature sample set using a trusted initializer secret sharing matrix multiplication based on a current transformation submodel and a transformation feature sample subset of each training participant; or
Obtaining current predictors for the feature sample set using untrusted initializer secret sharing matrix multiplication based on current transformation submodels and transformation feature sample subsets of respective training participants.
16. The apparatus of claim 13 or 14, wherein the model update amount determination unit is configured to:
performing trusted initializer secret sharing matrix multiplication on the prediction difference and the conversion feature sample subset at the second training participant to obtain a second model updating amount; or
Performing untrusted initializer secret sharing matrix multiplication on the prediction difference and the transformed feature sample subset at the second training participant to obtain a second model update.
17. A system for collaborative training of a linear/logistic regression model via first and second training participants, each training participant having a sub-model of the linear/logistic regression model, the first training participant having a first subset of feature samples and labeled values, the second training participant having a second subset of feature samples, the first and second subsets of feature samples obtained by vertically slicing the set of feature samples, the system comprising:
a first training participant device comprising the apparatus of any one of claims 10 to 12; and
a second training participant device comprising an apparatus as claimed in any one of claims 13 to 16.
18. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-5.
19. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 5.
20. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 6 to 9.
21. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 6 to 9.
CN201910599381.2A 2019-07-04 2019-07-04 Model training method, device and system Active CN112183757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910599381.2A CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910599381.2A CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Publications (2)

Publication Number Publication Date
CN112183757A true CN112183757A (en) 2021-01-05
CN112183757B CN112183757B (en) 2023-10-27

Family

ID=73915881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910599381.2A Active CN112183757B (en) 2019-07-04 2019-07-04 Model training method, device and system

Country Status (1)

Country Link
CN (1) CN112183757B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114662156A (en) * 2022-05-25 2022-06-24 蓝象智联(杭州)科技有限公司 Longitudinal logistic regression modeling method based on anonymized data
WO2024093573A1 (en) * 2022-10-30 2024-05-10 抖音视界有限公司 Method and apparatus for training machine learning model, device, and medium

Patent Citations (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101815081A (en) * 2008-11-27 2010-08-25 Peking University Distributed calculation logic comparison method
CN102135989A (en) * 2011-03-09 2011-07-27 Beihang University Normalized matrix-factorization-based incremental collaborative filtering recommendation method
KR20130067345A (en) * 2011-12-13 2013-06-24 Industry-University Cooperation Foundation Hanyang University Method for learning task skills and robot using the same
US10152676B1 (en) * 2013-11-22 2018-12-11 Amazon Technologies, Inc. Distributed training of models using stochastic gradient descent
US20150193694A1 (en) * 2014-01-06 2015-07-09 Cisco Technology, Inc. Distributed learning in a computer network
US20150193695A1 (en) * 2014-01-06 2015-07-09 Cisco Technology, Inc. Distributed model training
CN105450394A (en) * 2015-12-30 2016-03-30 China Agricultural University Share updating method and device based on threshold secret sharing
CN107025205A (en) * 2016-01-30 2017-08-08 Huawei Technologies Co., Ltd. Method and apparatus for training a model in a distributed system
US20170270671A1 (en) * 2016-03-16 2017-09-21 International Business Machines Corporation Joint segmentation and characteristics estimation in medical images
US20170372201A1 (en) * 2016-06-22 2017-12-28 Massachusetts Institute Of Technology Secure Training of Multi-Party Deep Neural Network
US20180316502A1 (en) * 2017-04-27 2018-11-01 Factom Data Reproducibility Using Blockchains
US20180373834A1 (en) * 2017-06-27 2018-12-27 Hyunghoon Cho Secure genome crowdsourcing for large-scale association studies
CN109214404A (en) * 2017-07-07 2019-01-15 Alibaba Group Holding Ltd. Training sample generation method and device based on privacy protection
US20190042763A1 (en) * 2017-08-02 2019-02-07 Alibaba Group Holding Limited Model training method and apparatus based on data sharing
US20190114530A1 (en) * 2017-10-13 2019-04-18 Panasonic Intellectual Property Corporation Of America Prediction model sharing method and prediction model sharing system
CN109754060A (en) * 2017-11-06 2019-05-14 Alibaba Group Holding Ltd. Training method and device for a neural network machine learning model
WO2019100724A1 (en) * 2017-11-24 2019-05-31 Huawei Technologies Co., Ltd. Method and device for training multi-label classification model
KR20190072770A (en) * 2017-12-18 2019-06-26 Kyung Hee University Industry-Academic Cooperation Foundation Method of performing encryption and decryption based on reinforcement learning, and client and server system performing the same
CN109165515A (en) * 2018-08-10 2019-01-08 Shenzhen Qianhai WeBank Co., Ltd. Federated learning-based model parameter acquisition method, system, and readable storage medium
CN109255247A (en) * 2018-08-14 2019-01-22 Alibaba Group Holding Ltd. Secure computation method and device, and electronic equipment
CN109214436A (en) * 2018-08-22 2019-01-15 Alibaba Group Holding Ltd. Prediction model training method and device for a target scenario
CN109583468A (en) * 2018-10-12 2019-04-05 Alibaba Group Holding Ltd. Training sample acquisition method, sample prediction method, and corresponding apparatus
CN109409125A (en) * 2018-10-12 2019-03-01 Nanjing University of Posts and Telecommunications Privacy-preserving data collection and regression analysis method
CN109299161A (en) * 2018-10-31 2019-02-01 Alibaba Group Holding Ltd. Data selection method and device
CN109635462A (en) * 2018-12-17 2019-04-16 Shenzhen Qianhai WeBank Co., Ltd. Federated learning-based model parameter training method, apparatus, device, and medium
CN109640095A (en) * 2018-12-28 2019-04-16 University of Science and Technology of China Video encryption system combining quantum key distribution
CN109840588A (en) * 2019-01-04 2019-06-04 Ping An Technology (Shenzhen) Co., Ltd. Neural network model training method and device, computer equipment, and storage medium
WO2019072316A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A distributed multi-party security model training framework for privacy protection
WO2019072315A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A logistic regression modeling scheme using secret sharing
CN109871702A (en) * 2019-02-18 2019-06-11 Shenzhen Qianhai WeBank Co., Ltd. Federated model training method, system, device, and computer-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhou Hongwei; Xu Songlin; Yuan Jinhui: "Multi-secret sharing for general access structures implemented with a neural network", Computer Engineering and Design, no. 20 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114662156A (en) * 2022-05-25 2022-06-24 Lanxiang Zhilian (Hangzhou) Technology Co., Ltd. Longitudinal logistic regression modeling method based on anonymized data
WO2024093573A1 (en) * 2022-10-30 2024-05-10 Douyin Vision Co., Ltd. Method and apparatus for training machine learning model, device, and medium

Also Published As

Publication number Publication date
CN112183757B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN111523673B (en) Model training method, device and system
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
WO2021103901A1 (en) Multi-party security calculation-based neural network model training and prediction methods and device
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN112052942B (en) Neural network model training method, device and system
CN111523556B (en) Model training method, device and system
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN111738438B (en) Method, device and system for training neural network model
CN110929887B (en) Logistic regression model training method, device and system
CN111523134B (en) Homomorphic encryption-based model training method, device and system
CN111523674B (en) Model training method, device and system
CN112101531A (en) Neural network model training method, device and system based on privacy protection
Zhang et al. PPNNP: A privacy-preserving neural network prediction with separated data providers using multi-client inner-product encryption
CN112183757B (en) Model training method, device and system
CN114186256A (en) Neural network model training method, device, equipment and storage medium
CN112183759B (en) Model training method, device and system
CN110874481B (en) GBDT model-based prediction method and GBDT model-based prediction device
CN111737756B (en) XGB model prediction method, device and system performed through two data owners
CN111523675B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting
CN115564447A (en) Credit card transaction risk detection method and device
CN112183565B (en) Model training method, device and system
CN112966809B (en) Privacy protection-based two-party model prediction method, device and system
CN112183566B (en) Model training method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40044587)
GR01 Patent grant