CN112183565A - Model training method, device and system - Google Patents

Model training method, device and system

Info

Publication number
CN112183565A
CN112183565A
Authority
CN
China
Prior art keywords
training
training participant
model
participant
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910600908.9A
Other languages
Chinese (zh)
Other versions
CN112183565B (en)
Inventor
陈超超
李梁
王力
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910600908.9A priority Critical patent/CN112183565B/en
Publication of CN112183565A publication Critical patent/CN112183565A/en
Application granted granted Critical
Publication of CN112183565B publication Critical patent/CN112183565B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/085Secret sharing or secret splitting, e.g. threshold schemes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides methods and apparatus for training a logistic regression model. In the method, model conversion processing is performed on the sub-models of all training participants to obtain corresponding conversion sub-models. The following loop process is then executed until a loop end condition is satisfied: vertical-to-horizontal slicing conversion is performed on the feature sample set to obtain a converted feature sample subset at each training participant; the matrix product of the converted logistic regression model and each converted feature sample subset is calculated, and the current predicted value of each training participant is obtained based on the matrix product. The method further includes decomposing the label value at the first training participant into a first number of partial label values and sending one partial label value to each second training participant. Each training participant then determines its prediction difference and model update amount and updates its conversion sub-model. When the loop end condition is satisfied, the sub-model of each training participant is determined based on the conversion sub-models of the training participants.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to a method, apparatus, and system for collaborative training of logistic regression models via multiple training participants using a vertically partitioned training set.
Background
Logistic regression models are widely used regression/classification models in the field of machine learning. In many cases, multiple model training participants (e.g., an e-commerce company, a courier company, and a bank) each possess a different portion of the data of the feature samples used to train a logistic regression model. The multiple model training participants generally want to use each other's data together to jointly train a logistic regression model, but do not want to provide their respective data to the other model training participants, to prevent their own data from being leaked.
In view of such a situation, machine learning methods capable of protecting data security have been proposed. These methods can train a logistic regression model, to be used by the multiple model training participants, in cooperation among those participants while ensuring the data security of each of them. However, the model training efficiency of existing privacy-preserving machine learning methods is low.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and a system for collaborative training of a logistic regression model via a plurality of training participants, which can improve the efficiency of model training while ensuring the security of respective data of the plurality of training participants.
According to an aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the method being performed by the first training participant, the method comprising: performing model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; executing the following loop process until a loop end condition is satisfied: performing vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication; decomposing the label value into the first number of partial label values and sending each of a second number of the partial label values to the corresponding second training participant; determining a current predicted value at the first training participant based on the matrix product at the first training participant; determining a prediction difference between the current predicted value of the first training participant and the corresponding partial label value; determining a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; and updating the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount, wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process; and when the loop end condition is satisfied, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the method being performed by a second training participant, the method comprising: performing model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; executing the following loop process until a loop end condition is satisfied: performing vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret-shared matrix multiplication; receiving a corresponding partial label value from the first training participant, the partial label value being one of the first number of partial label values resulting from decomposing the label value at the first training participant; determining a current predicted value at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial label value; obtaining a model update amount at the second training participant using secret-shared matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant; and updating the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount, wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process; and when the loop end condition is satisfied, determining the sub-model of the second training participant based on the conversion sub-models of the training participants.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the apparatus being located on the first training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication; a label value decomposition unit configured to decompose the label value into the first number of partial label values; a label value sending unit configured to send each of a second number of the partial label values to the corresponding second training participant; a predicted value determination unit configured to determine a current predicted value at the first training participant based on the matrix product at the first training participant; a prediction difference determination unit configured to determine a prediction difference between the current predicted value of the first training participant and the corresponding partial label value; a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; a model updating unit configured to update the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the first training participant based on the conversion sub-models of the training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the label value decomposition unit, the label value sending unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model updating unit are configured to perform operations cyclically until the loop end condition is satisfied, and wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the apparatus being located on the second training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret-shared matrix multiplication; a label value receiving unit configured to receive a corresponding partial label value from the first training participant, the partial label value being one of the first number of partial label values resulting from decomposing the label value at the first training participant; a predicted value determination unit configured to determine a current predicted value at the second training participant based on the matrix product at the second training participant; a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial label value; a model update amount determination unit configured to obtain a model update amount of the second training participant using secret-shared matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant; a model updating unit configured to update the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the second training participant based on the conversion sub-models of the training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the label value receiving unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model updating unit are configured to perform operations cyclically until the loop end condition is satisfied, and wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process.
According to another aspect of the present disclosure, there is provided a system for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the system comprising: a first training participant device comprising an apparatus as described above on the first training participant side; and a second number of second training participant devices, each second training participant device comprising means on the side of a second training participant as described above, wherein each training participant has one sub-model, the first training participant has a first subset of feature samples and a label value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a first training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the first training participant side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a second training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the second training participant side as described above.
By using the solution of the embodiments of the present disclosure, the model parameters of the logistic regression model can be trained without leaking the secret data of the training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. Therefore, compared with the prior art, the solution of the embodiments of the present disclosure can improve the efficiency of model training while ensuring the security of the respective data of the training participants.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 shows a schematic diagram of an example of vertically sliced data according to an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of a model transformation process according to an embodiment of the present disclosure;
FIG. 5 shows a flow diagram of a feature sample set transformation process in accordance with an embodiment of the present disclosure;
FIG. 6 illustrates a flow diagram of a process of performing trusted-initializer secret-shared matrix multiplication on the current sub-models of the training participants and the converted feature sample subsets of the training participants, in accordance with an embodiment of the disclosure;
FIG. 7 shows a flowchart of a process of performing untrusted-initializer secret-shared matrix multiplication on the current sub-models of the training participants and the converted feature sample set of the training initiator, according to an embodiment of the disclosure;
FIG. 8 shows a flowchart of one example of untrusted-initializer secret-shared matrix multiplication according to an embodiment of the present disclosure;
FIG. 9 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 10 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 11 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 12 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants are open-ended terms in the sense of "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
The secret sharing method is a cryptographic technique for decomposing and storing a secret. It divides the secret into a plurality of secret shares in an appropriate manner, each secret share being owned and managed by one of a plurality of participants; a single participant cannot recover the complete secret, and the complete secret can be recovered only when the participants cooperate. The secret sharing method aims to prevent the secret from being too concentrated, so as to disperse risk and tolerate intrusion.
Secret sharing methods can be roughly divided into two categories: trusted-initializer secret sharing methods and untrusted-initializer secret sharing methods. In a trusted-initializer secret sharing method, a trusted initializer is required to perform parameter initialization (often generating random numbers that satisfy certain conditions) for each participant in the multi-party secure computation. After the initialization is completed, the trusted initializer destroys the data and withdraws; the data are not needed in the subsequent multi-party secure computation process.
Trusted-initializer secret-shared matrix multiplication is applicable to the following situation: the complete secret data is the product of the sum of a first set of secret shares and the sum of a second set of secret shares, and each participant has one of the first set of secret shares and one of the second set of secret shares. Through trusted-initializer secret-shared matrix multiplication, each of the multiple participants obtains a portion of the complete secret data, the sum of the portions obtained by all participants equals the complete secret data, and each participant discloses its portion to the remaining participants, so that every participant can obtain the complete secret data without disclosing the secret shares it owns, thereby ensuring the data security of each of the multiple participants.
Untrusted-initializer secret-shared matrix multiplication is another of the secret sharing methods. It is applicable to the case where the complete secret is the product of a first secret share and a second secret share owned by two parties, respectively. Through untrusted-initializer secret-shared matrix multiplication, each of the two parties that owns a secret share generates and discloses data different from the share it owns, but the sum of the data disclosed by the two parties equals the product of the shares they own (i.e., the complete secret). Therefore, the parties can cooperatively recover the complete secret via untrusted-initializer secret-shared matrix multiplication without disclosing their own secret shares, and the data security of both parties is guaranteed.
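As a rough illustration of the additive-sharing idea underlying both categories, the following Python sketch (not part of the patent text; plain floating-point arithmetic and the function name split_into_shares are assumptions made purely for readability, whereas practical schemes operate over finite rings or fields) splits a secret vector into shares whose sum recovers the original:

```python
# A minimal sketch of additive secret sharing over floating-point numbers
# (illustrative only; a real implementation works in a finite ring/field).
import numpy as np

def split_into_shares(secret: np.ndarray, n_parties: int, rng=None):
    """Decompose `secret` into n additive shares that sum back to it."""
    rng = rng or np.random.default_rng()
    shares = [rng.standard_normal(secret.shape) for _ in range(n_parties - 1)]
    shares.append(secret - sum(shares))  # last share makes the sum exact
    return shares

secret = np.array([1.0, 2.0, 3.0])
shares = split_into_shares(secret, 3)
# No single share reveals the secret; only the sum of all shares does.
assert np.allclose(sum(shares), secret)
```

No individual share reveals anything useful on its own; recovery requires all shares, which is exactly the property the training scheme below relies on.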
In the present disclosure, the training sample set used in the logistic regression model training scheme is a vertically sliced training sample set. The term "vertically slicing the training sample set" refers to dividing the training sample set into a plurality of training sample subsets according to module/function (or some specified rule), each training sample subset including a part of the training subsamples of every training sample in the training sample set, where the parts of the training subsamples contained in all the subsets together constitute the complete training sample. In one example, assume that a training sample includes a label y_0 and attributes x_1, x_2, ..., x_d; then after vertical slicing, the training participant Alice owns y_0 and a first part of the attributes of the training sample, and the training participant Bob owns the remaining attributes. In another example, the attributes may be split between Alice and Bob in a different (for example, non-contiguous) manner, with Alice again owning y_0 together with part of the attributes and Bob owning the rest. In addition to these two examples, there are other possible scenarios, which are not listed here.
Suppose a sample described by the values of d attributes (also called features) is given, x^T = (x_1; x_2; ...; x_d), where x_i is the value of x on the i-th attribute and T denotes transposition; then the logistic regression model is Y = 1/(1 + e^(-Wx)), where Y is the predicted value and W is the model parameter of the logistic regression model (i.e., the model described in this disclosure), with W composed of the sub-models W_P, where W_P refers to the sub-model at each training participant P in the present disclosure. In this disclosure, attribute value samples are also referred to as feature data samples.
In the present disclosure, each training participant has a different portion of the data of the training samples used to train the logistic regression model. For example, taking two training participants as an example, assuming that the training sample set includes 100 training samples, each of which contains a plurality of feature values and labeled actual values, the data owned by the first participant may be the partial feature values and labeled actual values of each of the 100 training samples, and the data owned by the second participant may be the partial feature values (e.g., remaining feature values) of each of the 100 training samples.
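A minimal sketch of this two-party vertical split, assuming a 100-sample, 10-feature data set and an illustrative 6/4 column division (all names and shapes are hypothetical), might look as follows:

```python
# Hypothetical vertical slicing of a 100-sample training set between two
# participants, mirroring the example above.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))     # 100 samples, 10 features in total
Y = rng.integers(0, 2, size=(100, 1))  # label values

X_alice, Y_alice = X[:, :6], Y         # Alice: first 6 feature columns + labels
X_bob = X[:, 6:]                       # Bob: remaining 4 feature columns

# Every sample is present at both parties, but each party only sees its
# own columns; stacking the columns restores the full feature sample.
assert np.allclose(np.hstack([X_alice, X_bob]), X)
```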
For any matrix multiplication computation described in this disclosure, it must be determined, as the case may be, whether one or more of the two or more matrices participating in the multiplication needs to be transposed so that the matrix multiplication rule is satisfied, thereby completing the matrix multiplication computation.
Embodiments of a method, apparatus, and system for collaborative training of a logistic regression model via multiple training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a vertically sliced training sample set according to an embodiment of the present disclosure. In Fig. 1, 2 data parties Alice and Bob are shown; the case of more data parties is similar. Each data party Alice and Bob owns a part of the training subsamples of every training sample in the training sample set, and, for each training sample, the parts of the training subsamples owned by data parties Alice and Bob combine to form the complete content of the training sample. For example, assume that the content of a certain training sample includes a label (hereinafter referred to as "label value") y_0 and attribute features (hereinafter referred to as "feature sample") x_0; then after vertical slicing, the training participant Alice owns y_0 and a first part of x_0, and the training participant Bob owns the remaining part of x_0.
Fig. 2 shows an architectural diagram illustrating a system 1 for collaborative training of a logistic regression model via multiple training participants (hereinafter referred to as model training system 1) according to an embodiment of the present disclosure.
As shown in fig. 2, the model training system 1 includes a training initiator device 10 and at least one training cooperator device 20. In fig. 2, 2 training cooperator apparatuses 20 are shown. In other embodiments of the present disclosure, one training cooperator apparatus 20 may be included or more than 2 training cooperator apparatuses 20 may be included. The training initiator device 10 and the at least one training cooperator device 20 may communicate with each other via a network 30, such as, but not limited to, the internet or a local area network or the like. In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 are collectively referred to as training participant devices.
In the present disclosure, the logistic regression model to be trained is decomposed into a first number of sub-models. Here, the first number is equal to the number of training participant devices participating in model training. It is assumed here that the number of training participant devices is N; accordingly, the logistic regression model is decomposed into N sub-models, one for each training participant device. The training sample set used for model training is a vertically partitioned training sample set as described above, comprising a feature data set and corresponding label values, i.e., the x0 and y0 shown in fig. 1, with the label values located at the training initiator device 10. The sub-model and corresponding training samples owned by each training participant are a secret of that training participant and cannot be learned in full by the other training participants.
In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 together use a set of training samples at the training initiator device 10 and respective sub-models to cooperatively train a logistic regression model. The specific training process for the model will be described in detail below with reference to fig. 3 to 8.
In the present disclosure, the training initiator device 10 and the training cooperator device 20 may be any suitable computing device having computing capabilities. The computing devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure. In FIG. 3, a first training participant Alice and 2 second training participants Bob and Charlie are illustrated as an example. The first training participant Alice has a sub-model W_A of the logistic regression model, the second training participant Bob has a sub-model W_B of the logistic regression model, and the second training participant Charlie has a sub-model W_C of the logistic regression model. The first training participant Alice has a first feature sample subset X_A and a label value Y, the second training participant Bob has a second feature sample subset X_B, and the second training participant Charlie has a third feature sample subset X_C. The first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C are obtained by vertically slicing the feature sample set X used for model training. The sub-models W_A, W_B and W_C together form the logistic regression model W.
As shown in FIG. 3, first, at block 301, the first training participant Alice and the second training participants Bob and Charlie initialize the parameters of their sub-models, i.e., the weight sub-vectors W_A, W_B and W_C, to obtain initial values of the sub-model parameters, and initialize the number of executed training cycles t to zero. Here, it is assumed that the end condition of the loop process is that a predetermined number of training cycles is performed, for example, T training cycles.
After the above initialization, at block 302, at Alice, Bob and Charlie, model conversion processing is performed on the respective initial sub-models W_A, W_B and W_C to obtain the conversion sub-models W_A', W_B' and W_C'.
FIG. 4 shows a flow diagram of one example of a model transformation process, according to an embodiment of the present disclosure.
As shown in FIG. 4, at block 410, at Alice, the sub-model W_A that Alice has is decomposed into W_A1, W_A2 and W_A3. Here, in the decomposition process of the sub-model W_A, the attribute value of each element of the sub-model W_A is decomposed into 3 partial attribute values, and 3 new elements are obtained from the decomposed partial attribute values. The resulting 3 new elements are then assigned to W_A1, W_A2 and W_A3, respectively, thereby obtaining W_A1, W_A2 and W_A3. At Bob, the sub-model W_B that Bob has is decomposed into W_B1, W_B2 and W_B3. At Charlie, the sub-model W_C that Charlie has is decomposed into W_C1, W_C2 and W_C3.
Then, at block 420, Alice sends W_A2 to Bob and W_A3 to Charlie. At block 430, Bob sends W_B1 to Alice and W_B3 to Charlie. At block 440, Charlie sends W_C1 to Alice and W_C2 to Bob.
Next, at block 450, at Alice, W_A1, W_B1 and W_C1 are spliced to obtain the conversion sub-model W_A'. The dimension of the resulting conversion sub-model W_A' is equal to the dimension of the feature sample set used for model training. At Bob, W_A2, W_B2 and W_C2 are spliced to obtain the conversion sub-model W_B'. At Charlie, W_A3, W_B3 and W_C3 are spliced to obtain the conversion sub-model W_C'. Likewise, the dimensions of the conversion sub-models W_B' and W_C' are equal to the dimension of the feature sample set used for model training.
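A minimal sketch of this share-and-splice conversion, under the assumption of additive decomposition over floating-point numbers and illustrative sub-model dimensions (6, 3 and 2 features for Alice, Bob and Charlie), is given below; the same pattern is reused for the feature sample subsets in FIG. 5:

```python
# Sketch of the share-and-splice model conversion of FIG. 4 for three
# parties (function names and dimensions are illustrative assumptions).
import numpy as np

def decompose(v: np.ndarray, n: int, rng) -> list:
    """Split each element of v into n additive partial values."""
    parts = [rng.standard_normal(v.shape) for _ in range(n - 1)]
    parts.append(v - sum(parts))
    return parts

rng = np.random.default_rng(1)
W_A = rng.standard_normal(6)   # Alice's sub-model (6 features)
W_B = rng.standard_normal(3)   # Bob's sub-model (3 features)
W_C = rng.standard_normal(2)   # Charlie's sub-model (2 features)

W_A1, W_A2, W_A3 = decompose(W_A, 3, rng)   # block 410: local decomposition
W_B1, W_B2, W_B3 = decompose(W_B, 3, rng)
W_C1, W_C2, W_C3 = decompose(W_C, 3, rng)

# Blocks 420-450: each party keeps one share of every sub-model and splices.
W_A_conv = np.concatenate([W_A1, W_B1, W_C1])   # held by Alice
W_B_conv = np.concatenate([W_A2, W_B2, W_C2])   # held by Bob
W_C_conv = np.concatenate([W_A3, W_B3, W_C3])   # held by Charlie

# The conversion sub-models are full-dimensional and sum to the full model W.
assert np.allclose(W_A_conv + W_B_conv + W_C_conv,
                   np.concatenate([W_A, W_B, W_C]))
```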
Returning to fig. 3, after the model conversion is completed as above, the operations of blocks 303 to 311 are cyclically performed until a cycle end condition is satisfied.
Specifically, at block 303, the first training participant Alice and the second training participants Bob and Charlie cooperate to convert the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C (i.e., the feature sample set) from vertical slicing to horizontal slicing, obtaining a first converted feature sample subset X_A', a second converted feature sample subset X_B' and a third converted feature sample subset X_C'. Each feature sample in the resulting converted feature sample subsets X_A', X_B' and X_C' has the complete feature content of a training sample, i.e., the converted subsets are similar to the feature sample subsets that would be obtained by horizontally slicing the feature sample set.
Fig. 5 shows a flow diagram of a feature sample set transformation process according to an embodiment of the present disclosure.
As shown in FIG. 5, at block 510, at Alice, the first feature sample subset X_A is decomposed into X_A1, X_A2 and X_A3. At Bob, the second feature sample subset X_B is decomposed into X_B1, X_B2 and X_B3. At Charlie, the third feature sample subset X_C is decomposed into X_C1, X_C2 and X_C3. The decomposition process for the feature sample subsets X_A, X_B and X_C is exactly the same as the decomposition process for the sub-model W_A. Then, at block 520, Alice sends X_A2 to Bob and X_A3 to Charlie. At block 530, Bob sends X_B1 to Alice and X_B3 to Charlie. At block 540, Charlie sends X_C1 to Alice and X_C2 to Bob.
Next, at block 550, at Alice, X_A1, X_B1 and X_C1 are spliced to obtain the first converted feature sample subset X_A'. At Bob, X_A2, X_B2 and X_C2 are spliced to obtain the second converted feature sample subset X_B'. At Charlie, X_A3, X_B3 and X_C3 are spliced to obtain the third converted feature sample subset X_C'. The dimensions of the resulting converted feature sample subsets X_A', X_B' and X_C' are the same as the dimension of the training sample set.
After the vertical-to-horizontal slicing conversion is performed on the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C as above, at block 304, secret-shared matrix multiplication is performed separately on the model-converted logistic regression model (i.e., W = W_A' + W_B' + W_C') and the converted feature sample subsets X_A', X_B' and X_C' of the training participants to obtain the corresponding matrix products, i.e., the matrix product W*X_A' at Alice, the matrix product W*X_B' at Bob and the matrix product W*X_C' at Charlie.
In one example of the disclosure, using secret-shared matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the converted feature sample subsets X_A', X_B' and X_C' of the training participants may include: obtaining these matrix products using trusted-initializer secret-shared matrix multiplication. How to obtain a matrix product using trusted-initializer secret-shared matrix multiplication will be explained below with reference to FIG. 6.
In another example of the present disclosure, using secret-shared matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the converted feature sample subsets X_A', X_B' and X_C' may include: obtaining these matrix products using untrusted-initializer secret-shared matrix multiplication. How to obtain a matrix product using untrusted-initializer secret-shared matrix multiplication will be explained below with reference to FIGS. 7-8.
After the matrix product of each training participant is obtained as described above, at block 305, the label value Y is decomposed at the first training participant Alice to obtain 3 partial label values Y_A, Y_B and Y_C. The decomposition process for the label value Y is the same as the decomposition process for the feature sample set X described above and will not be repeated here.
Next, at block 306, the first training participant Alice sends the partial label value Y_B to the second training participant Bob and the partial label value Y_C to the second training participant Charlie, while Alice retains the partial label value Y_A as its own partial label value.
Then, at block 307, at each training participant, the current predicted value at that training participant is determined based on the matrix product of that training participant. For example, the current predicted value at each training participant may be obtained using the formula Ŷ_i = 1/(1 + e^(-W·X_i)), where Ŷ_i is the predicted value at training participant i, W = W_A' + W_B' + W_C' is the logistic regression model, and X_i is the converted feature sample subset at that training participant.
In addition, the formula Ŷ_i = 1/(1 + e^(-W·X_i)) can be expanded by the Taylor formula, that is, 1/(1 + e^(-W·X_i)) ≈ 1/2 + (W·X_i)/4 - (W·X_i)^3/48 + ..., so that the current predicted value of each training participant can be calculated from the matrix product W·X_i of that training participant based on the Taylor expansion formula. The order to which the Taylor expansion is carried can be determined based on the accuracy required by the application scenario.
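The following sketch illustrates the Taylor approximation of the sigmoid function around zero; the truncation at the third-order term is an assumption for illustration, since the disclosure leaves the expansion order to the accuracy required by the application scenario:

```python
# Sketch of approximating the sigmoid with a truncated Taylor series,
# as used for the current predicted value (truncation order assumed).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_taylor(z, order=3):
    """1/(1+e^-z) ~ 1/2 + z/4 - z**3/48 (terms up to the given order)."""
    approx = 0.5 + z / 4.0
    if order >= 3:
        approx -= z**3 / 48.0
    return approx

z = np.array([-0.5, 0.0, 0.5])   # plays the role of W.X_i
print(sigmoid(z))                # [0.3775  0.5  0.6225]
print(sigmoid_taylor(z))         # [0.3776  0.5  0.6224] - close for small |z|
```

The approximation matters because a polynomial in W·X_i can be evaluated from secret-shared matrix products, whereas the exponential in the exact sigmoid cannot.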
At block 308, at each training participant, the prediction difference at that training participant is determined based on the current predicted value and the respective partial label value of that training participant, i.e., the prediction difference e_A = Ŷ_A - Y_A at Alice, the prediction difference e_B = Ŷ_B - Y_B at Bob, and the prediction difference e_C = Ŷ_C - Y_C at Charlie, where e is a column vector, Y is a column vector representing the label values of the training samples X, and Ŷ is a column vector representing the current predicted values for the training samples X. If the training sample X contains only a single training sample, then e, Y and Ŷ are column vectors having only a single element. If the training sample X contains multiple training samples, then e, Y and Ŷ are column vectors having a plurality of elements, where each element in Ŷ is the current predicted value of the corresponding training sample, each element in Y is the label value of the corresponding training sample, and each element in e is the difference between the current predicted value and the label value of the corresponding training sample. It is to be noted that, in the above description, e_A, e_B and e_C are collectively referred to as e, and Y_A, Y_B and Y_C are collectively referred to as Y.
Then, at block 309, the model update amounts TMP_A, TMP_B and TMP_C of the training participants are determined based on the converted feature sample set X (X = X_A' + X_B' + X_C') and the prediction differences e_A, e_B and e_C of the training participants. Specifically, the model update amount at Alice is TMP_A = X * e_A, the model update amount at Bob is TMP_B = X * e_B, and the model update amount at Charlie is TMP_C = X * e_C. Here, the model update amounts TMP_B at Bob and TMP_C at Charlie are obtained using secret-shared matrix multiplication.
Next, at block 310, at each training participant, the current conversion sub-model at that training participant is updated based on its current conversion sub-model and the corresponding model update amount. For example, the training initiator Alice uses the current conversion sub-model W_A' and the corresponding model update amount TMP_A to update its conversion sub-model, the training cooperator Bob uses the current conversion sub-model W_B' and the corresponding model update amount TMP_B to update its conversion sub-model, and the training cooperator Charlie uses the current conversion sub-model W_C' and the corresponding model update amount TMP_C to update its conversion sub-model.
In one example of the present disclosure, updating the current conversion sub-model at a training participant based on its current conversion sub-model and the corresponding model update amount may be performed according to the equation W_{n+1} = W_n - α·TMP_i = W_n - α·X·e_i, where W_{n+1} represents the updated conversion sub-model at the training participant, W_n represents the current conversion sub-model at the training participant, α represents the learning rate, X represents the feature sample set, and e_i represents the prediction difference at the training participant. When the training participant is the first training participant Alice, the updated current conversion sub-model may be calculated locally at Alice. When the training participant is a second training participant, X·e_i is obtained at that second training participant using secret-shared matrix multiplication, which may be performed using a process similar to that shown in FIG. 8 or FIGS. 7-8, except that X takes the place of W and e_i takes the place of X. It is to be noted here that when X is a single feature sample, X is a feature vector (column vector or row vector) composed of a plurality of attributes and e_i is a single prediction difference; when X is a plurality of feature samples, X is a feature matrix in which the attributes of each feature sample constitute one column/row of the feature matrix X, and e_i is a prediction difference vector. In the calculation of X·e_i, each element of e_i is multiplied with the feature values of the samples corresponding to a certain feature of the matrix X. For example, assume e_i is a column vector; in each multiplication, e_i is multiplied with a row of the matrix X, the elements of that row representing the feature values of the samples for a certain feature.
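A minimal sketch of this local update rule, assuming a samples-by-features layout for X (so that the product in the text corresponds to X^T·e_i under the transposition convention noted earlier) and an illustrative learning rate, is:

```python
# Sketch of the local update rule W_{n+1} = W_n - alpha * X^T * e_i for one
# training participant (shapes and names are illustrative assumptions).
import numpy as np

def update_submodel(W_n: np.ndarray, X: np.ndarray, e: np.ndarray,
                    alpha: float = 0.01) -> np.ndarray:
    """Gradient-style update: each feature column of X is weighted by e."""
    return W_n - alpha * (X.T @ e)

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 11))   # converted feature sample set
e = rng.standard_normal((100, 1))    # prediction difference at this party
W = rng.standard_normal((11, 1))     # current conversion sub-model
W_next = update_submodel(W, X, e)    # updated conversion sub-model
```

In the protocol itself, a second training participant would obtain the product X·e_i via secret-shared matrix multiplication rather than the local matrix product shown here.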
After the conversion sub-models are updated at the respective training participants as described above, at block 311, it is determined whether the predetermined number of cycles has been reached, that is, whether the loop end condition has been reached. If the predetermined number of cycles has been reached, block 312 is entered. If the predetermined number of cycles has not been reached, the flow returns to the operation of block 303 to perform the next training cycle, in which the updated conversion sub-model obtained by each training participant in the current cycle is used as the current conversion sub-model of the next cycle.
At block 312, the sub-models (i.e., the trained sub-models) at Alice, Bob and Charlie are determined based on the updated conversion sub-models of Alice, Bob and Charlie, respectively.
Specifically, after the conversion sub-models W_A', W_B' and W_C' are trained as described above, Alice sends W_A'[|A|:|B|] to Bob and W_A'[|B|:] to Charlie; Bob sends W_B'[0:|A|] to Alice and W_B'[|B|:] to Charlie; and Charlie sends W_C'[0:|A|] to Alice and W_C'[|A|:|B|] to Bob. Here, [|B|:] refers to the vector components after the B dimension (i.e., |B|) in the matrix, [0:|A|] refers to the vector components before the A dimension (i.e., |A|) in the matrix, i.e., the components from 0 to |A|, and [|A|:|B|] refers to the vector components after the A dimension and before the B dimension in the matrix. For example, let W = [0,1,2,3,4,5,6]; if |A| is 2 and |B| is 2, then W[0:|A|] = [0,1], W[|A|:|B|] = [2,3], and W[|B|:] = [4,5,6].
Then, at Alice, W_A = W_A'[0:|A|] + W_B'[0:|A|] + W_C'[0:|A|] is calculated; at Bob, W_B = W_A'[|A|:|B|] + W_B'[|A|:|B|] + W_C'[|A|:|B|] is calculated; and at Charlie, W_C = W_A'[|B|:] + W_B'[|B|:] + W_C'[|B|:] is calculated, thereby obtaining the trained sub-models W_A, W_B and W_C at Alice, Bob and Charlie.
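A minimal sketch of this recovery step, using the slice boundaries |A| = 2 and |B| = 2 from the example above and an illustrative 7-dimensional model (for brevity the sketch slices whole vectors locally, whereas in the protocol each party only receives the slices it needs), is:

```python
# Sketch of the final sub-model recovery (block 312) via slicing and
# summing the trained conversion sub-models (dimensions are illustrative).
import numpy as np

dim_A, dim_B = 2, 2
a, b = dim_A, dim_A + dim_B          # slice boundaries |A| and |B|

rng = np.random.default_rng(3)
W_A_conv = rng.standard_normal(7)    # conversion sub-models after training
W_B_conv = rng.standard_normal(7)
W_C_conv = rng.standard_normal(7)

# Each party collects the three shares of its own coordinate range and sums.
W_A = W_A_conv[0:a] + W_B_conv[0:a] + W_C_conv[0:a]   # at Alice
W_B = W_A_conv[a:b] + W_B_conv[a:b] + W_C_conv[a:b]   # at Bob
W_C = W_A_conv[b:] + W_B_conv[b:] + W_C_conv[b:]      # at Charlie

# Concatenating the recovered sub-models equals the full summed model.
assert np.allclose(np.concatenate([W_A, W_B, W_C]),
                   W_A_conv + W_B_conv + W_C_conv)
```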
It is to be noted here that, in the above example, the end condition of the training loop process is that the predetermined number of cycles is reached. In other examples of the disclosure, the end condition of the training loop process may also be that the determined prediction differences are within a predetermined range, i.e., each element e_i of the prediction differences e_A, e_B and e_C is within a predetermined range, for example, each element e_i of the prediction difference e is less than a predetermined threshold, or the mean of the prediction difference e is less than a predetermined threshold. Accordingly, the operations of block 311 in FIG. 3 may be performed after the operations of block 308.
FIG. 6 shows a flowchart of one example of a trusted-initializer secret-shared matrix multiplication process. This example is explained by taking the calculation of X_A' * W as an example. In the case of using trusted-initializer secret-shared matrix multiplication, the model training system 1 shown in FIG. 2 further comprises a trusted initializer device 30.
As shown in FIG. 6, first, at the trusted initializer device 30, a first number of random weight vectors, a first number of random feature matrices and a first number of random label value vectors are generated, such that the product of the sum of the first number of random weight vectors and the sum of the first number of random feature matrices is equal to the sum of the first number of random label value vectors. Here, the first number is equal to the number of training participants.
For example, as shown in FIG. 6, the trusted initializer generates 3 random weight vectors W_R,1, W_R,2 and W_R,3, 3 random feature matrices X_R,1, X_R,2 and X_R,3, and 3 random label value vectors Y_R,1, Y_R,2 and Y_R,3, where (W_R,1 + W_R,2 + W_R,3) * (X_R,1 + X_R,2 + X_R,3) = Y_R,1 + Y_R,2 + Y_R,3. Here, the dimension of each random weight vector is the same as the dimension of the conversion sub-models of the model training participants, the dimension of each random feature matrix is the same as the dimension of the converted feature sample subset, and the dimension of each random label value vector is the same as the dimension of the label value vector.
Then, at block 601, the generated W_R,1, X_R,1 and Y_R,1 are sent to the first training participant Alice; at block 602, the generated W_R,2, X_R,2 and Y_R,2 are sent to the second training participant Bob; and at block 603, the generated W_R,3, X_R,3 and Y_R,3 are sent to the second training participant Charlie.
Next, at block 604, at Alice, the feature sample subset X_A' (hereinafter referred to as the feature matrix X_A) is decomposed into a first number of feature sub-matrices, e.g., the 3 feature sub-matrices X_A1', X_A2' and X_A3' shown in FIG. 6.
Then, Alice sends each of a second number of the decomposed feature sub-matrices to the corresponding second training participant, where the second number is equal to the first number minus one. For example, at blocks 605 and 606, the 2 feature sub-matrices X_A2' and X_A3' are sent to Bob and Charlie, respectively.
Then, at each training participant, a weight sub-vector difference E_i and a feature sub-matrix difference D_i at that training participant are determined based on the weight sub-vector (i.e., the conversion sub-model W_A', W_B' or W_C') of that training participant, the corresponding feature sub-matrix X_A1', X_A2' or X_A3', and the received random weight vector and random feature matrix. For example, at block 607, at Alice, the weight sub-vector difference E1 = W_A' - W_R,1 and the feature sub-matrix difference D1 = X_A1' - X_R,1 are determined. At block 608, at Bob, the weight sub-vector difference E2 = W_B' - W_R,2 and the feature sub-matrix difference D2 = X_A2' - X_R,2 are determined. At block 609, at Charlie, the weight sub-vector difference E3 = W_C' - W_R,3 and the feature sub-matrix difference D3 = X_A3' - X_R,3 are determined.
After the respective weight sub-vector difference E_i and feature sub-matrix difference D_i are determined at each training participant, each training participant sends its E_i and D_i to the remaining training participants. For example, at blocks 610 and 611, Alice sends D1 and E1 to Bob and Charlie, respectively. At blocks 612 and 613, Bob sends D2 and E2 to Alice and Charlie, respectively. At blocks 614 and 615, Charlie sends D3 and E3 to Alice and Bob, respectively.
Then, at block 616, at each training participant, the weight sub-vector differences and the feature sub-matrix differences of the training participants are summed to obtain a total weight sub-vector difference E and a total feature sub-matrix difference D, respectively. For example, as shown in FIG. 6, D = D1 + D2 + D3 and E = E1 + E2 + E3.
Then, at each training participant, the predicted value vector Z_i is calculated based on the received random weight vector W_R,i, random feature matrix X_R,i and random label value vector Y_R,i, together with the total weight sub-vector difference E and the total feature sub-matrix difference D.
In one example of the present disclosure, at each training participant, the corresponding predicted value vector may be obtained by summing that participant's random marker value vector, the product of the total weight sub-vector difference and that participant's random feature matrix, and the product of the total feature sub-matrix difference and that participant's random weight vector (hereinafter the first calculation). Alternatively, that participant's random marker value vector, the product of the total weight sub-vector difference and that participant's random feature matrix, the product of the total feature sub-matrix difference and that participant's random weight vector, and the product of the total weight sub-vector difference and the total feature sub-matrix difference may be summed to obtain the corresponding predicted value vector (hereinafter the second calculation).
It should be noted here that, among the predicted value vectors calculated at the training participants, only one includes the product of the total weight sub-vector difference and the total feature sub-matrix difference. In other words, exactly one training participant computes its predicted value vector according to the second calculation, while the remaining training participants compute their predicted value vectors according to the first calculation.
For example, at block 617, at Alice, the corresponding predicted value vector Z1 = Y_R,1 + E·X_R,1 + D·W_R,1 + D·E is calculated. At block 618, at Bob, the corresponding predicted value vector Z2 = Y_R,2 + E·X_R,2 + D·W_R,2 is calculated. At block 619, at Charlie, the corresponding predicted value vector Z3 = Y_R,3 + E·X_R,3 + D·W_R,3 is calculated.
Note that, in FIG. 6, the Z1 calculated at Alice includes the term D·E. In other examples of the present disclosure, D·E may instead be included in the Z_i calculated by Bob or Charlie, in which case D·E is not included in the Z1 calculated at Alice. In other words, exactly one of the Z_i calculated at the training participants contains D·E.
Each training participant then discloses its calculated predicted value vector to the remaining training participants. For example, at blocks 620 and 621, Alice sends the predicted value vector Z1 to Bob and Charlie, respectively. At blocks 622 and 623, Bob sends the predicted value vector Z2 to Alice and Charlie, respectively. At blocks 624 and 625, Charlie sends the predicted value vector Z3 to Alice and Bob, respectively.
Then, at blocks 626, 627 and 628, each training participant sums the predicted value vectors of all training participants, Z = Z1 + Z2 + Z3, to obtain the corresponding matrix product result.
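To check that blocks 604 to 628 reconstruct the intended product, the whole exchange can be simulated end to end. In the numpy sketch below, the matrix orientations (weight sub-vectors as rows multiplying the feature matrix from the left) and the shapes are assumptions made for concreteness, and message passing is collapsed into plain sums; it verifies that the disclosed Z_i sum to (W_A' + W_B' + W_C')·X_A':

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, parties = 4, 6, 3

# Secret inputs: each participant's weight sub-vector (conversion
# submodel) and Alice's converted feature sample subset X_A'
W_sub = [rng.standard_normal((1, d)) for _ in range(parties)]
X_A = rng.standard_normal((d, n))

# Blocks 601-603: trusted-initializer triples with
# sum(Y_R) == sum(W_R) @ sum(X_R)
W_R = [rng.standard_normal((1, d)) for _ in range(parties)]
X_R = [rng.standard_normal((d, n)) for _ in range(parties)]
Y_R = [rng.standard_normal((1, n)) for _ in range(parties - 1)]
Y_R.append(sum(W_R) @ sum(X_R) - sum(Y_R))

# Blocks 604-606: Alice additively splits X_A' into sub-matrices X_Ai'
X_A_shares = [rng.standard_normal((d, n)) for _ in range(parties - 1)]
X_A_shares.append(X_A - sum(X_A_shares))

# Blocks 607-616: local differences E_i, D_i are disclosed and summed
E = sum(W_sub[i] - W_R[i] for i in range(parties))       # total E
D = sum(X_A_shares[i] - X_R[i] for i in range(parties))  # total D

# Blocks 617-619: per-participant predicted value vectors; exactly one
# of them (here Alice's) carries the extra E @ D term
Z = [Y_R[i] + E @ X_R[i] + W_R[i] @ D for i in range(parties)]
Z[0] = Z[0] + E @ D

# Blocks 620-628: the disclosed Z_i sum to the true matrix product
assert np.allclose(sum(Z), sum(W_sub) @ X_A)
```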
Fig. 7 illustrates a flow diagram of a process for obtaining the matrix products of the individual training participants using untrusted-initializer secret sharing matrix multiplication, based on the current submodels and the converted feature sample subsets of the individual training participants, according to an embodiment of the disclosure. The following description takes the calculation of the matrix product at Alice as an example. The matrix product calculation processes at the training participants Bob and Charlie are similar to that at Alice.
As shown in FIG. 7, first, at block 710, at Alice, the matrix product Y_A1 = W_A'·X_A' of the first weight sub-matrix (i.e., the conversion submodel) W_A' at Alice and the first feature matrix (the first converted feature sample subset) X_A' is calculated.
Next, at block 720, untrusted-initializer secret sharing matrix multiplication is used to calculate the matrix product of the first weight sub-matrix of each second training participant (e.g., W_B' and W_C' for Bob and Charlie) and the first feature matrix X_A' (i.e., Y_A2 = W_B'·X_A' and Y_A3 = W_C'·X_A'). How to calculate a matrix product using untrusted-initializer secret sharing matrix multiplication is explained in detail below with reference to FIG. 8.
Then, at Alice, the resulting matrix products (e.g., Y_A1, Y_A2 and Y_A3) are summed to obtain the matrix product at Alice: Y_A = Y_A1 + Y_A2 + Y_A3.
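In code, the assembly step at Alice is simply a local product plus the jointly computed ones. A small sketch, with illustrative names and with Y_A2 and Y_A3 assumed to have been produced by the FIG. 8 protocol:

```python
import numpy as np

def matrix_product_at_alice(W_A, X_A, shared_products):
    """Sum Alice's local product (block 710) with the secret-shared
    products Y_A2 = W_B' @ X_A' and Y_A3 = W_C' @ X_A' obtained via
    the FIG. 8 protocol (passed in here as already-computed results)."""
    Y_A1 = W_A @ X_A                     # block 710: purely local
    return Y_A1 + sum(shared_products)   # Y_A = Y_A1 + Y_A2 + Y_A3

# Toy usage with random stand-ins for the jointly computed products:
rng = np.random.default_rng(0)
W_A, X_A = rng.standard_normal((1, 4)), rng.standard_normal((4, 6))
Y_A2, Y_A3 = rng.standard_normal((1, 6)), rng.standard_normal((1, 6))
Y_A = matrix_product_at_alice(W_A, X_A, [Y_A2, Y_A3])
```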
Fig. 8 shows a flow diagram of one example of an untrusted-initializer secret sharing matrix multiplication process according to an embodiment of the present disclosure. In FIG. 8, the calculation of W_B'·X_A' between the training participants Alice and Bob is taken as an example.
As shown in FIG. 8, first, at block 801, if the number of rows of X_A' at Alice (hereinafter referred to as the first feature matrix) is not even, and/or the number of columns of the current sub-model parameter W_B' at Bob (hereinafter referred to as the first weight sub-matrix) is not even, dimension completion processing is performed on the first feature matrix X_A' and/or the first weight sub-matrix W_B' so that the number of rows of X_A' is even and/or the number of columns of W_B' is even. For example, the dimension completion processing may be performed by appending a row of 0 values to the end of the first feature matrix X_A' and/or appending a column of 0 values to the end of the first weight sub-matrix W_B'. In the following description, it is assumed that the first weight sub-matrix W_B' has dimension I×J and the first feature matrix X_A' has dimension J×K, where J is an even number.
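A zero pad preserves the product, since the appended column of W_B' only ever multiplies the appended zero row of X_A'. A hypothetical helper for block 801 (the function name is assumed) might look like:

```python
import numpy as np

def pad_to_even_inner_dim(X_A, W_B):
    """Append a zero row to X_A' and a zero column to W_B' when the
    shared inner dimension J is odd; W_B' @ X_A' is unchanged."""
    J = X_A.shape[0]
    if J % 2 == 1:
        X_A = np.vstack([X_A, np.zeros((1, X_A.shape[1]))])
        W_B = np.hstack([W_B, np.zeros((W_B.shape[0], 1))])
    return X_A, W_B
```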
The operations of blocks 802 to 804 are then performed at Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 802, a random feature matrix X1 is generated; its dimension is the same as that of the first feature matrix X_A', i.e., J×K. At block 803, the random feature matrix X1 is subtracted from the first feature matrix X_A' to obtain a second feature matrix X2, of dimension J×K. At block 804, the even-row sub-matrix X1_e of the random feature matrix X1 is subtracted from the odd-row sub-matrix X1_o of the random feature matrix X1 to obtain a third feature matrix X3, of dimension j×K, where j = J/2.
Similarly, the operations of blocks 805 to 807 are performed at Bob to obtain a random weight sub-matrix W_B1 and second and third weight sub-matrices W_B2 and W_B3. Specifically, at block 805, a random weight sub-matrix W_B1 is generated; its dimension is the same as that of the first weight sub-matrix W_B', i.e., I×J. At block 806, the first weight sub-matrix W_B' and the random weight sub-matrix W_B1 are summed to obtain a second weight sub-matrix W_B2, of dimension I×J. At block 807, the odd-column sub-matrix W_B1_o of the random weight sub-matrix W_B1 is added to the even-column sub-matrix W_B1_e of the random weight sub-matrix W_B1 to obtain a third weight sub-matrix W_B3, of dimension I×j, where j = J/2.
Then, at block 808, Alice sends the generated second feature matrix X2 and third feature matrix X3 to Bob, and at block 809, Bob sends the second weight sub-matrix W_B2 and the third weight sub-matrix W_B3 to Alice.
Next, at block 810, at Alice, a matrix calculation is performed based on the equation Y1 = W_B2·(2·X_A' - X1) - W_B3·(X3 + X1_e) to obtain the first matrix product Y1, and at block 812, the first matrix product Y1 is sent to Bob.
At block 811, at Bob, the second matrix product Y2 is calculated based on the equation Y2 = (W_B' + 2·W_B1)·X2 + (W_B3 + W_B1_o)·X3, and at block 813, the second matrix product Y2 is sent to Alice.
Then, at blocks 814 and 815, the first matrix product Y1 and the second matrix product Y2 are summed at Alice and Bob, respectively, to obtain W_B'·X_A' = Y_A2 = Y1 + Y2.
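The share definitions and the two equations above cannot all hold with the signs as transcribed in the published translation. The following self-contained numpy sketch therefore adjusts them minimally so that the defining identity Y1 + Y2 = W_B'·X_A' verifiably holds: it takes X2 = X1 - X_A' at block 803 and uses W_B3 - W_B1_o in Bob's equation at block 811. It should be read as a consistent variant of the FIG. 8 protocol, not a verbatim transcription; all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 3, 4, 5                 # J even (after dimension completion)

A = rng.standard_normal((I, J))   # W_B', held by Bob
B = rng.standard_normal((J, K))   # X_A', held by Alice

# Alice, blocks 802-804
X1 = rng.standard_normal((J, K))  # random feature matrix
X2 = X1 - B                       # sign chosen so the identity holds
X3 = X1[0::2, :] - X1[1::2, :]    # odd-row minus even-row sub-matrix

# Bob, blocks 805-807
W1 = rng.standard_normal((I, J))  # random weight sub-matrix
W2 = A + W1
W3 = W1[:, 0::2] + W1[:, 1::2]    # odd-column plus even-column sub-matrix

# Blocks 808-809: Alice sends (X2, X3) to Bob; Bob sends (W2, W3) to Alice

# Block 810 (Alice): note that X3 + X1_e equals the odd-row sub-matrix X1_o
Y1 = W2 @ (2 * B - X1) - W3 @ (X3 + X1[1::2, :])

# Block 811 (Bob): W3 - W1_o equals the even-column sub-matrix W1_e
Y2 = (A + 2 * W1) @ X2 + (W3 - W1[:, 0::2]) @ X3

# Blocks 814-815: the two shares reconstruct the true product
assert np.allclose(Y1 + Y2, A @ B)
```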
Furthermore, it should be noted that figs. 3 to 8 show a model training scheme with 1 first training participant and 2 second training participants. In other examples of the present disclosure, the scheme may also include 1 second training participant or more than 2 second training participants.
By using the logistic regression model training method disclosed in figs. 3 to 8, the model parameters of the logistic regression model can be trained without leaking the secret data of any training participant, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. The efficiency of model training can therefore be improved while the security of each training participant's data is guaranteed.
FIG. 9 shows a schematic diagram of an apparatus for collaborative training of a logistic regression model via multiple training participants (hereinafter referred to as a model training apparatus) 900, according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first subset of feature samples and a tag value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one. The model training apparatus 900 is located on the first training participant side.
As shown in fig. 9, the model training apparatus 900 includes a model conversion unit 901, a sample conversion unit 902, a matrix product acquisition unit 903, a marker value decomposition unit 904, a marker value transmitting unit 905, a prediction value determination unit 906, a prediction difference determination unit 907, a model update amount determination unit 908, a model update unit 909, and a model determination unit 910.
The model transformation unit 901 is configured to perform a model transformation process on the sub-models of the respective training participants to obtain transformed sub-models of the respective training participants. The operation of the model conversion unit 901 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
In performing model training, the sample conversion unit 902, the matrix product acquisition unit 903, the marker value decomposition unit 904, the marker value transmission unit 905, the prediction value determination unit 906, the prediction difference value determination unit 907, the model update amount determination unit 908, and the model update unit 909 are configured to cyclically perform operations until a cycle end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
In particular, during each cycle, the sample conversion unit 902 is configured to perform a vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 902 may refer to the operation of block 303 described above with reference to fig. 3 and the process described with reference to fig. 5.
The matrix product acquisition unit 903 is configured to obtain a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication. The operation of the matrix product acquisition unit 903 may refer to the operation of block 304 described above with reference to fig. 3 and the operations described with reference to figs. 6-8.
The marker value decomposition unit 904 is configured to decompose the marker value into a first number of partial marker values. The operation of the marker value decomposition unit 904 may refer to the operation of block 305 described above with reference to fig. 3.
The marker value transmitting unit 905 is configured to transmit each of the second number of partial marker values to a corresponding second training participant, respectively. The operation of the flag value transmitting unit 905 may refer to the operation of the block 306 described above with reference to fig. 3.
The predictor determination unit 906 is configured to determine a current predictor at the first training participant based on the matrix product at the first training participant. The operation of the prediction value determination unit 906 may refer to the operation of the block 307 described above with reference to fig. 3.
The prediction difference determination unit 907 is configured to determine a prediction difference between the current prediction value of the first training participant and the corresponding partial marker value. The operation of the prediction difference determination unit 907 may refer to the operation of the block 308 described above with reference to fig. 3.
The model update amount determination unit 908 is configured to determine a model update amount at the first training participant based on the set of feature samples and the predicted difference at the first training participant. The operation of the model update amount determination unit 908 may refer to the operation of block 309 described above with reference to fig. 3.
The model update unit 909 is configured to update the current conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount. The operation of the model update unit 909 may refer to the operation of block 310 described above with reference to fig. 3.
The model determination unit 910 is configured to determine a sub-model of the first training participant based on the conversion sub-models of the respective training participants when the loop end condition is fulfilled.
In one example of the present disclosure, the matrix product acquisition unit 903 may be configured to: and obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset of the first training participant by using secret shared matrix multiplication of the trusted initializer. The operations of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to fig. 6.
In another example of the present disclosure, the matrix product acquisition unit 903 may be configured to: and obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the first training participant by using the secret shared matrix multiplication of the untrusted initializer. The operations of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to fig. 7-8.
In one example of the present disclosure, the sample conversion unit 902 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample stitching module (not shown). The sample decomposition module is configured to decompose a first subset of feature samples into a first number of first partial subsets of feature samples. The sample sending/receiving module is configured to send each of a second number of the first partial feature sample subsets to a corresponding second training participant, and to receive a corresponding second partial feature sample subset from the respective second training participant, the received respective second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at the respective second training participant. The sample stitching module is configured to stitch the remaining first subset of partial feature samples and the received respective second subset of partial feature samples to obtain a transformed subset of feature samples at the first training participant.
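One consistent reading of this decompose/send/stitch flow is additive secret sharing applied per vertical slice, with the kept share and the received shares stitched column-wise. The sketch below follows only that reading, with illustrative names and shapes (FIG. 5 of the patent gives the authoritative construction); it checks that the converted subsets at the three participants together encode the full feature sample set:

```python
import numpy as np

def split_into_shares(X, n_parties, rng):
    """Decompose one party's vertical slice into additive shares."""
    shares = [rng.standard_normal(X.shape) for _ in range(n_parties - 1)]
    shares.append(X - sum(shares))
    return shares

rng = np.random.default_rng(2)
n_parties = 3

# Vertically partitioned data: each party holds some feature columns
X_alice = rng.standard_normal((8, 2))    # first training participant
X_bob = rng.standard_normal((8, 3))      # second training participant
X_charlie = rng.standard_normal((8, 4))  # second training participant

shares = [split_into_shares(X, n_parties, rng)
          for X in (X_alice, X_bob, X_charlie)]

# Participant i keeps/receives the i-th share of every slice and
# stitches them column-wise into its converted feature sample subset
converted = [np.hstack([shares[p][i] for p in range(n_parties)])
             for i in range(n_parties)]

# Together the converted subsets sum to the full feature sample set
full = np.hstack((X_alice, X_bob, X_charlie))
assert np.allclose(sum(converted), full)
```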
FIG. 10 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants (hereinafter referred to as model training apparatus 1000), according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first subset of feature samples and a tag value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one. The model training apparatus 1000 is located on the second training participant side.
As shown in fig. 10, the model training apparatus 1000 includes a model conversion unit 1010, a sample conversion unit 1020, a matrix product acquisition unit 1030, a marker value receiving unit 1040, a predicted value determination unit 1050, a prediction difference value determination unit 1060, a model update amount determination unit 1070, a model update unit 1080, and a model determination unit 1090.
The model conversion unit 1010 is configured to perform a model conversion process on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 1010 may refer to the operation of block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
In performing model training, the sample conversion unit 1020, the matrix product acquisition unit 1030, the flag value receiving unit 1040, the prediction value determination unit 1050, the prediction difference determination unit 1060, the model update amount determination unit 1070, and the model update unit 1080 are configured to perform operations in a loop until a loop end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
In particular, during each cycle, the sample conversion unit 1020 is configured to perform a vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 1020 may refer to the operation of block 303 described above with reference to fig. 3 and the operation described with reference to fig. 5.
The matrix product acquisition unit 1030 is configured to obtain a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication. The operation of the matrix product acquisition unit 1030 may refer to the operation of block 304 described above with reference to fig. 3 and the processes described with reference to fig. 6-8.
The marker value receiving unit 1040 is configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of a first number of partial marker values resulting from decomposition of the marker value at the first training participant. The operation of the marker value receiving unit 1040 may refer to the operation of the block 306 described above with reference to fig. 3.
The predictor determination unit 1050 is configured to determine a current predictor at the second training participant based on the matrix product at the second training participant. The operation of the prediction value determination unit 1050 may refer to the operation of the block 307 described above with reference to fig. 3.
The prediction difference determination unit 1060 is configured to determine a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value. The operation of the prediction difference determination unit 1060 may refer to the operation of the block 308 described above with reference to fig. 3.
The model update amount determination unit 1070 is configured to obtain a model update amount of the second training participant using a secret sharing matrix multiplication based on the set of feature samples and the predicted difference value of the second training participant. The operation of the model update amount determination unit 1070 may refer to the operation of the block 309 described above with reference to fig. 3.
The model update unit 1080 is configured to update the current conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model update amount. The operation of the model update unit 1080 may refer to the operation of block 310 described above with reference to fig. 3.
The model determination unit 1090 is configured to determine a sub-model of the second training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied. The operation of the model determination unit 1090 may refer to the operation of block 312 described above with reference to fig. 3.
In one example of the present disclosure, the sample conversion unit 1020 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample stitching module (not shown). The sample decomposition module is configured to decompose the second subset of feature samples into a first number of second partial subsets of feature samples. The sample sending/receiving module is configured to send each of a second number of subsets of second partial feature samples to a first training participant and to receive a first partial subset of feature samples from the first training participant and a second partial subset of feature samples from each of the remaining second training participants, the first partial subset of feature samples being one of a first number of subsets of first partial feature samples obtained by decomposing the subset of feature samples at the first training participant, each received subset of second partial feature samples being one of a first number of subsets of second partial feature samples obtained by decomposing the respective subset of second feature samples at each remaining second training participant. The sample stitching module is configured to stitch the remaining second partial feature sample subset, the received first and second partial feature sample subsets to obtain a transformed feature sample subset at the second training participant.
In one example of the present disclosure, the matrix product acquisition unit 1030 may be configured to: and obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant by using secret shared matrix multiplication of the trusted initializer. The operations of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to fig. 6.
In another example of the present disclosure, the matrix product acquisition unit 1030 may be configured to: and obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using the secret shared matrix multiplication of the untrusted initializer. The operations of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to fig. 7-8.
In one example of the present disclosure, the model update amount determination unit 1070 may be configured to: and obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication with a trusted initializer based on the feature sample set and the prediction difference value of the second training participant.
In another example of the present disclosure, the model update amount determination unit 1070 may be configured to: and obtaining a model updating quantity of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the characteristic sample set and the prediction difference value of the second training participant.
Embodiments of a model training method, apparatus and system according to the present disclosure are described above with reference to fig. 1 through 10. The above model training device can be implemented by hardware, or can be implemented by software, or a combination of hardware and software.
FIG. 11 illustrates a hardware block diagram of a computing device 1100 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the disclosure. As shown in fig. 11, the computing device 1100 may include at least one processor 1110, a storage (e.g., a non-volatile storage) 1120, a memory 1130, and a communication interface 1140, and the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via a bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1110 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication; decomposing the labeled values into a first number of partial labeled values and sending each of a second number of partial labeled values to a corresponding second training participant, respectively; determining a current predictor at the first training participant based on the matrix product at the first training participant; determining a prediction difference value between a current prediction value of a first training participant and a corresponding partial marker value; determining a model update amount at the first training participant based on the feature sample set and the prediction difference value at the first training participant; updating the conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and when the cycle end condition is met, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1110 to perform the various operations and functions described above in connection with fig. 1-10 in the various embodiments of the present disclosure.
FIG. 12 illustrates a hardware block diagram of a computing device 1200 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the disclosure. As shown in fig. 12, computing device 1200 may include at least one processor 1210, storage (e.g., non-volatile storage) 1220, memory 1230, and a communication interface 1240, and the at least one processor 1210, storage 1220, memory 1230, and communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1210 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication; receiving a corresponding partial tag value from a first training participant, the partial tag value being one of a first number of partial tag values resulting from decomposition of the tag value at the first training participant; determining a current predictor at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value; obtaining a model update quantity at the second training participant using secret sharing matrix multiplication based on the feature sample set and the predicted difference value of the second training participant; updating the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model updating amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and when the cycle end condition is met, determining a sub-model of a second training participant based on the conversion sub-models of the training participants.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-10 in the various embodiments of the present disclosure.
According to one embodiment, a program product, such as a machine-readable medium (e.g., a non-transitory machine-readable medium), is provided. A machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-10 in various embodiments of the present disclosure. Specifically, a system or apparatus may be provided which is provided with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored, and causes a computer or processor of the system or apparatus to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (21)

1. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the method being performed by the first training participant, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication;
decomposing the marker value into the first number of partial marker values and sending each of the second number of partial marker values to a corresponding second training participant, respectively;
determining a current predictor at the first training participant based on a matrix product at the first training participant;
determining a prediction difference between a current prediction value of the first training participant and a corresponding partial marker value;
determining a model update amount at the first training participant based on the converted feature sample set and the prediction difference value at the first training participant;
updating the conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and
and when the cycle end condition is met, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
2. The method of claim 1, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the first subset of feature samples into the first number of first partial subsets of feature samples;
sending each of the second number of first partial feature sample subsets to a corresponding second training participant;
receiving a second partial feature sample subset from each second training participant, the respective received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at the respective second training participant; and
and splicing the remaining first part of feature sample subset and the received second part of feature sample subset to obtain a conversion feature sample subset at the first training participant.
3. The method of claim 1, wherein using secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the first training participant comprises:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the first training participant by using a secret shared matrix multiplication with a trusted initializer; or
Using untrusted initializer secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the first training participant.
4. The method of claim 1, wherein determining a current predictor at the first training participant based on a matrix product at the first training participant comprises:
determining the current predictor at the first training participant based on the matrix product at the first training participant according to a Taylor expansion formula.
5. The method of any of claims 1 to 4, wherein the end-of-loop condition comprises:
a predetermined number of cycles; or
The determined prediction difference is within a predetermined range.
6. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the method being performed by the second training participants, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication;
receiving a corresponding partial tag value from the first training participant, the partial tag value being one of the first number of partial tag values resulting from decomposing the tag value at the first training participant;
determining a current predictor at the second training participant based on the matrix product at the second training participant;
determining a prediction difference value at the second training participant using the current prediction value of the second training participant and the received partial marker value;
obtaining a model update quantity at the second training participant by using secret sharing matrix multiplication based on the converted feature sample set and the prediction difference value of the second training participant;
updating the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and
and when the cycle end condition is met, determining the sub-model of the second training participant based on the conversion sub-models of the training participants.
7. The method of claim 6, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the second subset of feature samples into the first number of second partial subsets of feature samples;
sending each of the second number of second partial feature sample subsets to the first training participant and to the remaining second training participants;
receiving a first portion of a subset of feature samples from the first training participant and a second portion of a subset of feature samples from each of the remaining second training participants, the first portion of the subset of feature samples being one of a first number of first portions of the subset of feature samples obtained by decomposing the subset of feature samples at the first training participant, each received second portion of the subset of feature samples being one of a first number of second portions of the subset of feature samples obtained by decomposing the respective second subset of feature samples at each of the remaining second training participants; and
and splicing the remaining second partial feature sample subset and the received first and second partial feature sample subsets to obtain a converted feature sample subset at the second training participant.
8. The method of claim 6, wherein using secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the second training participant comprises:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using a secret shared matrix multiplication with a trusted initializer; or
Obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant using untrusted initializer secret sharing matrix multiplication.
9. The method of claim 6, wherein obtaining the model update quantity for the second training participant using secret sharing matrix multiplication based on the converted feature sample set and the predicted difference value for the second training participant comprises:
obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of a trusted initiator based on the converted feature sample set and the prediction difference value of the second training participant; or
And obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the converted feature sample set and the prediction difference value of the second training participant.
10. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the apparatus being located on the first training participant side, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a matrix product obtaining unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication;
a tag value decomposition unit configured to decompose the tag value into the first number of partial tag values;
a marker value transmitting unit configured to transmit each of the second number of partial marker values to a corresponding second training participant, respectively;
a predictor determination unit configured to determine a current predictor at the first training participant based on a matrix product at the first training participant;
a prediction difference determination unit configured to determine a prediction difference between a current prediction value of the first training participant and a corresponding partial marker value;
a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and the prediction difference value at the first training participant;
a model updating unit configured to update a conversion submodel of the first training participant based on a current conversion submodel of the first training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the flag value decomposition unit, the flag value transmission unit, the prediction value determination unit, the prediction difference value determination unit, the model update amount determination unit, and the model update unit are configured to cyclically perform operations until a cycle end condition is satisfied,
wherein, when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process.
11. The apparatus of claim 10, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the first subset of feature samples into the first number of first partial subsets of feature samples;
a sample sending/receiving module configured to send each of the second number of first partial feature sample subsets to a corresponding second training participant and receive a corresponding second partial feature sample subset from each second training participant, each received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at each second training participant; and
a sample stitching module configured to stitch the remaining first partial feature sample subset and the received second partial feature sample subsets to obtain a converted feature sample subset at the first training participant.
12. The apparatus of claim 10, wherein the matrix product acquisition unit is configured to:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using a secret shared matrix multiplication with a trusted initiator; or
Using untrusted initializer secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset at the first training participant.
13. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the apparatus being located on the second training participant side, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a matrix product obtaining unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret sharing matrix multiplication;
a marker value receiving unit configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant;
a predictor determination unit configured to determine a current predictor at the second training participant based on a matrix product at the second training participant;
a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value;
a model update amount determination unit configured to obtain a model update amount of the second training participant by using secret sharing matrix multiplication based on the converted feature sample set and the prediction difference value of the second training participant;
a model updating unit configured to update the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the flag value reception unit, the prediction value determination unit, the prediction difference value determination unit, the model update amount determination unit, and the model update unit are configured to cyclically perform operations until a cycle end condition is satisfied,
wherein, when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process.
14. The apparatus of claim 13, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the second subset of feature samples into the first number of second partial subset of feature samples;
a sample sending/receiving module configured to send each of the second number of second partial feature sample subsets to a first training participant and remaining second training participants, and to receive a first partial feature sample subset from the first training participant and a second partial feature sample subset from each of the remaining second training participants, the first partial feature sample subset being one of a first number of first partial feature sample subsets obtained by decomposing the feature sample subset at the first training participant, each received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the respective second feature sample subset at each remaining second training participant; and
a sample stitching module configured to stitch the remaining second partial feature sample subset, the received first and second partial feature sample subsets to obtain a transformed feature sample subset at the second training participant.
15. The apparatus of claim 13, wherein the matrix product acquisition unit is configured to:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using a secret shared matrix multiplication with a trusted initializer; or
Obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant using untrusted initializer secret sharing matrix multiplication.
16. The apparatus of claim 13, wherein the model update amount determination unit is configured to:
obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of a trusted initiator based on the converted feature sample set and the prediction difference value of the second training participant; or
And obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the converted feature sample set and the prediction difference value of the second training participant.
17. A system for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model having a first number of submodels, the system comprising:
a first training participant device comprising the apparatus of any one of claims 10 to 12; and
a second number of second training participant devices, each second training participant device comprising the apparatus of any one of claims 13 to 16,
wherein each training participant has a submodel, the first training participant has a first feature sample subset and a label value, each second training participant has a second feature sample subset, the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training, and the second number is equal to the first number minus one.
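The vertical slicing that claim 17 presupposes gives every participant the same sample rows but disjoint feature columns, with the label values held only by the first training participant. A toy illustration, in which all sizes and names (eight samples, three parties, two columns each) are arbitrary assumptions of ours:

```python
import numpy as np

rng = np.random.default_rng(2)

# Full feature sample set: 8 samples with 6 features in total.
X = rng.standard_normal((8, 6))
col_splits = [2, 2, 2]          # feature columns owned by each participant

# Vertical slicing: every party gets all rows but only its own columns.
subsets, start = [], 0
for width in col_splits:
    subsets.append(X[:, start:start + width])
    start += width

y = rng.integers(0, 2, size=8)  # labels: held by the first participant only

# The slices reassemble the original feature sample set.
assert np.allclose(np.hstack(subsets), X)
```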
18. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 5.
19. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 5.
20. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 6 to 9.
21. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 6 to 9.
CN201910600908.9A 2019-07-04 2019-07-04 Model training method, device and system Active CN112183565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910600908.9A CN112183565B (en) 2019-07-04 2019-07-04 Model training method, device and system


Publications (2)

Publication Number Publication Date
CN112183565A (en) 2021-01-05
CN112183565B (en) 2023-07-14

Family

ID=73915148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910600908.9A Active CN112183565B (en) 2019-07-04 2019-07-04 Model training method, device and system

Country Status (1)

Country Link
CN (1) CN112183565B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105656692A (en) * 2016-03-14 2016-06-08 Nanjing University of Posts and Telecommunications Multi-instance multi-label learning based area monitoring method for wireless sensor networks
WO2018174873A1 (en) * 2017-03-22 2018-09-27 Visa International Service Association Privacy-preserving machine learning
US20180316502A1 (en) * 2017-04-27 2018-11-01 Factom Data Reproducibility Using Blockchains
CN109241749A (en) * 2017-07-04 2019-01-18 Alibaba Group Holding Limited Data encryption and machine learning model training method, apparatus and electronic device
CN109327421A (en) * 2017-08-01 2019-02-12 Alibaba Group Holding Limited Data encryption and machine learning model training method, apparatus and electronic device
WO2019046651A2 (en) * 2017-08-30 2019-03-07 Inpher, Inc. High-precision privacy-preserving real-valued function evaluation
CN108520303A (en) * 2018-03-02 2018-09-11 Alibaba Group Holding Limited Recommendation system construction method and apparatus
CN108921358A (en) * 2018-07-16 2018-11-30 Guangdong University of Technology Electric load feature prediction method, prediction system and related apparatus
CN109413087A (en) * 2018-11-16 2019-03-01 Jingdong City (Nanjing) Technology Co., Ltd. Data sharing method, apparatus, digital gateway and computer-readable storage medium
CN109446430A (en) * 2018-11-29 2019-03-08 Xidian University Product recommendation method, apparatus, computer device and readable storage medium
WO2019072316A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A distributed multi-party security model training framework for privacy protection
WO2019072315A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A logistic regression modeling scheme using secrete sharing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIANG YANG et al.: "Federated Machine Learning: Concept and Applications", arXiv:1902.04885v1, pages 1-19 *


Similar Documents

Publication Publication Date Title
CN111523673B (en) Model training method, device and system
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
CN110942147B (en) Neural network model training and predicting method and device based on multi-party safety calculation
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN112052942B (en) Neural network model training method, device and system
CN111523556B (en) Model training method, device and system
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN111738438B (en) Method, device and system for training neural network model
CN110929887B (en) Logistic regression model training method, device and system
CN111523134B (en) Homomorphic encryption-based model training method, device and system
CN111523674B (en) Model training method, device and system
Zhang et al. PPNNP: A privacy-preserving neural network prediction with separated data providers using multi-client inner-product encryption
CN112183757B (en) Model training method, device and system
CN114186256A (en) Neural network model training method, device, equipment and storage medium
CN112183759B (en) Model training method, device and system
CN110874481B (en) GBDT model-based prediction method and GBDT model-based prediction device
CN111737756B XGB model prediction method, device and system performed by two data owners
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN111523675B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting
CN112183565B (en) Model training method, device and system
CN112183566B (en) Model training method, device and system
CN111737337B (en) Multi-party data conversion method, device and system based on data privacy protection
CN112966809B (en) Privacy protection-based two-party model prediction method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40044586; Country of ref document: HK)
GR01 Patent grant