CN112183565A - Model training method, device and system - Google Patents

Model training method, device and system

Info

Publication number
CN112183565A
CN112183565A
Authority
CN
China
Prior art keywords
training
training participant
model
participant
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910600908.9A
Other languages
Chinese (zh)
Other versions
CN112183565B (en)
Inventor
陈超超
李梁
王力
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910600908.9A priority Critical patent/CN112183565B/en
Publication of CN112183565A publication Critical patent/CN112183565A/en
Application granted granted Critical
Publication of CN112183565B publication Critical patent/CN112183565B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/085Secret sharing or secret splitting, e.g. threshold schemes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides methods and apparatus for training a logistic regression model. In the method, model conversion processing is performed on the sub-models of all training participants to obtain corresponding conversion sub-models. The following loop process is then executed until a loop end condition is satisfied: vertical-to-horizontal slicing conversion is performed on the feature sample set to obtain a converted feature sample subset at each training participant; the matrix product of the converted logistic regression model and each converted feature sample subset is calculated, and the current predicted value of each training participant is obtained based on the matrix product. The method further includes decomposing the label value at the first training participant into a first number of partial label values and sending one partial label value to each second training participant. Each training participant then determines its prediction difference and model update amount and updates its conversion sub-model. When the loop end condition is satisfied, the sub-model of each training participant is determined based on the conversion sub-models of the training participants.

Description

Model training method, device and system
Technical Field
The present disclosure relates generally to the field of machine learning, and more particularly, to a method, apparatus, and system for collaborative training of logistic regression models via multiple training participants using a vertically partitioned training set.
Background
Logistic regression models are widely used regression/classification models in the field of machine learning. In many cases, multiple model training participants (e.g., an e-commerce company, a courier company, and a bank) each possess a different portion of the data of the feature samples used to train a logistic regression model. The multiple model training participants generally want to use each other's data together to jointly train a logistic regression model, but do not want to provide their respective data to the other model training participants, to prevent their own data from being leaked.
In view of such a situation, machine learning methods capable of protecting data security have been proposed. These methods can train a logistic regression model, to be used by the multiple model training participants, in cooperation among those participants while ensuring the data security of each of them. However, the model training efficiency of existing privacy-preserving machine learning methods is low.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and a system for collaborative training of a logistic regression model via a plurality of training participants, which can improve the efficiency of model training while ensuring the security of respective data of the plurality of training participants.
According to an aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the method being performed by the first training participant, the method comprising: performing model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; executing the following loop process until a loop end condition is satisfied: performing vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication; decomposing the label value into the first number of partial label values and sending each of a second number of the partial label values to the corresponding second training participant; determining a current predicted value at the first training participant based on the matrix product at the first training participant; determining a prediction difference between the current predicted value of the first training participant and the corresponding partial label value; determining a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; and updating the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount, wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process; and when the loop end condition is satisfied, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
According to another aspect of the present disclosure, there is provided a method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the method being performed by a second training participant, the method comprising: performing model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; executing the following loop process until a loop end condition is satisfied: performing vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; obtaining a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret-shared matrix multiplication; receiving a corresponding partial label value from the first training participant, the partial label value being one of the first number of partial label values resulting from decomposing the label value at the first training participant; determining a current predicted value at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial label value; obtaining a model update amount at the second training participant using secret-shared matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant; and updating the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount, wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process; and when the loop end condition is satisfied, determining the sub-model of the second training participant based on the conversion sub-models of the training participants.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the apparatus being located on the first training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication; a label value decomposition unit configured to decompose the label value into the first number of partial label values; a label value sending unit configured to send each of a second number of the partial label values to the corresponding second training participant; a predicted value determination unit configured to determine a current predicted value at the first training participant based on the matrix product at the first training participant; a prediction difference determination unit configured to determine a prediction difference between the current predicted value of the first training participant and the corresponding partial label value; a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and the prediction difference at the first training participant; a model updating unit configured to update the conversion sub-model of the first training participant based on the current conversion sub-model of the first training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the first training participant based on the conversion sub-models of the training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the label value decomposition unit, the label value sending unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model updating unit are configured to perform operations cyclically until the loop end condition is satisfied, and wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process.
According to another aspect of the present disclosure, there is provided an apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having one sub-model, the first training participant having a first feature sample subset and a label value, each second training participant having a second feature sample subset, the first and second feature sample subsets being obtained by vertically slicing a feature sample set used for model training, the second number being equal to the first number minus one, the apparatus being located on the second training participant side, the apparatus comprising: a model conversion unit configured to perform model conversion processing on the sub-models of the training participants to obtain conversion sub-models of the training participants; a sample conversion unit configured to perform vertical-to-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant; a matrix product acquisition unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret-shared matrix multiplication; a label value receiving unit configured to receive a corresponding partial label value from the first training participant, the partial label value being one of the first number of partial label values resulting from decomposing the label value at the first training participant; a predicted value determination unit configured to determine a current predicted value at the second training participant based on the matrix product at the second training participant; a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current predicted value of the second training participant and the received partial label value; a model update amount determination unit configured to obtain a model update amount of the second training participant using secret-shared matrix multiplication based on the converted feature sample set and the prediction difference of the second training participant; a model updating unit configured to update the conversion sub-model of the second training participant based on the current conversion sub-model of the second training participant and the corresponding model update amount; and a model determination unit configured to determine the sub-model of the second training participant based on the conversion sub-models of the training participants when the loop end condition is satisfied, wherein the sample conversion unit, the matrix product acquisition unit, the label value receiving unit, the predicted value determination unit, the prediction difference determination unit, the model update amount determination unit and the model updating unit are configured to perform operations cyclically until the loop end condition is satisfied, and wherein, when the loop process is not ended, the updated conversion sub-model of each training participant is used as the current conversion sub-model of the next loop process.
According to another aspect of the present disclosure, there is provided a system for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the system comprising: a first training participant device comprising an apparatus as described above on the first training participant side; and a second number of second training participant devices, each second training participant device comprising means on the side of a second training participant as described above, wherein each training participant has one sub-model, the first training participant has a first subset of feature samples and a label value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a first training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the first training participant side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a training method performed on a second training participant side as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium storing executable instructions that, when executed, cause at least one processor to perform the training method performed on the second training participant side as described above.
By using the solution of the embodiments of the present disclosure, the model parameters of the logistic regression model can be trained without leaking the secret data of the training participants, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. Therefore, compared with the prior art, the solution of the embodiments of the present disclosure can improve the efficiency of model training while ensuring the security of the respective data of the training participants.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 shows a schematic diagram of an example of vertically sliced data according to an embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram showing a system for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of a model transformation process according to an embodiment of the present disclosure;
FIG. 5 shows a flow diagram of a feature sample set transformation process in accordance with an embodiment of the present disclosure;
FIG. 6 illustrates a flow diagram of a process of performing trusted-initializer secret-shared matrix multiplication on the current sub-models of the training participants and the converted feature sample subsets of the training participants, in accordance with an embodiment of the disclosure;
FIG. 7 shows a flowchart of a process of performing untrusted-initializer secret-shared matrix multiplication on the current sub-models of the training participants and the converted feature sample set of the training initiator, according to an embodiment of the disclosure;
FIG. 8 shows a flowchart of one example of untrusted-initializer secret-shared matrix multiplication according to an embodiment of the present disclosure;
FIG. 9 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 10 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 11 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure;
FIG. 12 illustrates a schematic diagram of a computing device for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants are open-ended terms in the sense of "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
The secret sharing method is a cryptographic technique for decomposing and storing a secret. It divides the secret into a plurality of secret shares in an appropriate manner, each secret share being owned and managed by one of a plurality of participants; a single participant cannot recover the complete secret, and the complete secret can be recovered only when the participants cooperate. The secret sharing method aims to prevent the secret from being too concentrated, so as to disperse risk and tolerate intrusion.
Secret sharing methods can be roughly divided into two categories: trusted-initializer secret sharing methods and untrusted-initializer secret sharing methods. In a trusted-initializer secret sharing method, a trusted initializer is required to perform parameter initialization (often generating random numbers that satisfy certain conditions) for each participant in the multi-party secure computation. After the initialization is completed, the trusted initializer destroys the data and withdraws; the data are not needed in the subsequent multi-party secure computation process.
Trusted-initializer secret-shared matrix multiplication is applicable to the following situation: the complete secret data is the product of the sum of a first set of secret shares and the sum of a second set of secret shares, and each participant has one of the first set of secret shares and one of the second set of secret shares. Through trusted-initializer secret-shared matrix multiplication, each of the multiple participants obtains a portion of the complete secret data, the sum of the portions obtained by all participants equals the complete secret data, and each participant discloses its portion to the remaining participants, so that every participant can obtain the complete secret data without disclosing the secret shares it owns, thereby ensuring the data security of each of the multiple participants.
Untrusted-initializer secret-shared matrix multiplication is another of the secret sharing methods. It is applicable to the case where the complete secret is the product of a first secret share and a second secret share owned by two parties, respectively. Through untrusted-initializer secret-shared matrix multiplication, each of the two parties that owns a secret share generates and discloses data different from the share it owns, but the sum of the data disclosed by the two parties equals the product of the shares they own (i.e., the complete secret). Therefore, the parties can cooperatively recover the complete secret via untrusted-initializer secret-shared matrix multiplication without disclosing their own secret shares, and the data security of both parties is guaranteed.
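As a rough illustration of the additive-sharing idea underlying both categories, the following Python sketch (not part of the patent text; plain floating-point arithmetic and the function name split_into_shares are assumptions made purely for readability, whereas practical schemes operate over finite rings or fields) splits a secret vector into shares whose sum recovers the original:

```python
# A minimal sketch of additive secret sharing over floating-point numbers
# (illustrative only; a real implementation works in a finite ring/field).
import numpy as np

def split_into_shares(secret: np.ndarray, n_parties: int, rng=None):
    """Decompose `secret` into n additive shares that sum back to it."""
    rng = rng or np.random.default_rng()
    shares = [rng.standard_normal(secret.shape) for _ in range(n_parties - 1)]
    shares.append(secret - sum(shares))  # last share makes the sum exact
    return shares

secret = np.array([1.0, 2.0, 3.0])
shares = split_into_shares(secret, 3)
# No single share reveals the secret; only the sum of all shares does.
assert np.allclose(sum(shares), secret)
```

No individual share reveals anything useful on its own; recovery requires all shares, which is exactly the property the training scheme below relies on.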
In the present disclosure, the training sample set used in the logistic regression model training scheme is a vertically sliced training sample set. The term "vertically slicing the training sample set" refers to dividing the training sample set into a plurality of training sample subsets according to module/function (or some specified rule), each training sample subset including a part of the training subsamples of every training sample in the training sample set, where the parts of the training subsamples contained in all the subsets together constitute the complete training sample. In one example, assume that a training sample includes a label y_0 and attributes x_1, x_2, ..., x_d; then after vertical slicing, the training participant Alice owns y_0 and a first part of the attributes of the training sample, and the training participant Bob owns the remaining attributes. In another example, the attributes may be split between Alice and Bob in a different (for example, non-contiguous) manner, with Alice again owning y_0 together with part of the attributes and Bob owning the rest. In addition to these two examples, there are other possible scenarios, which are not listed here.
Suppose a sample described by the values of d attributes (also called features) is given, x^T = (x_1; x_2; ...; x_d), where x_i is the value of x on the i-th attribute and T denotes transposition; then the logistic regression model is Y = 1/(1 + e^(-Wx)), where Y is the predicted value and W is the model parameter of the logistic regression model (i.e., the model described in this disclosure), with W composed of the sub-models W_P, where W_P refers to the sub-model at each training participant P in the present disclosure. In this disclosure, attribute value samples are also referred to as feature data samples.
In the present disclosure, each training participant has a different portion of the data of the training samples used to train the logistic regression model. For example, taking two training participants as an example, assuming that the training sample set includes 100 training samples, each of which contains a plurality of feature values and labeled actual values, the data owned by the first participant may be the partial feature values and labeled actual values of each of the 100 training samples, and the data owned by the second participant may be the partial feature values (e.g., remaining feature values) of each of the 100 training samples.
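A minimal sketch of this two-party vertical split, assuming a 100-sample, 10-feature data set and an illustrative 6/4 column division (all names and shapes are hypothetical), might look as follows:

```python
# Hypothetical vertical slicing of a 100-sample training set between two
# participants, mirroring the example above.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))     # 100 samples, 10 features in total
Y = rng.integers(0, 2, size=(100, 1))  # label values

X_alice, Y_alice = X[:, :6], Y         # Alice: first 6 feature columns + labels
X_bob = X[:, 6:]                       # Bob: remaining 4 feature columns

# Every sample is present at both parties, but each party only sees its
# own columns; stacking the columns restores the full feature sample.
assert np.allclose(np.hstack([X_alice, X_bob]), X)
```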
For any matrix multiplication computation described in this disclosure, it must be determined, as the case may be, whether one or more of the two or more matrices participating in the multiplication needs to be transposed so that the matrix multiplication rule is satisfied, thereby completing the matrix multiplication computation.
Embodiments of a method, apparatus, and system for collaborative training of a logistic regression model via multiple training participants according to the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example of a vertically sliced training sample set according to an embodiment of the present disclosure. In Fig. 1, 2 data parties Alice and Bob are shown; the case of more data parties is similar. Each data party Alice and Bob owns a part of the training subsamples of every training sample in the training sample set, and, for each training sample, the parts of the training subsamples owned by data parties Alice and Bob combine to form the complete content of the training sample. For example, assume that the content of a certain training sample includes a label (hereinafter referred to as "label value") y_0 and attribute features (hereinafter referred to as "feature sample") x_0; then after vertical slicing, the training participant Alice owns y_0 and a first part of x_0, and the training participant Bob owns the remaining part of x_0.
Fig. 2 shows an architectural diagram illustrating a system 1 for collaborative training of a logistic regression model via multiple training participants (hereinafter referred to as model training system 1) according to an embodiment of the present disclosure.
As shown in fig. 2, the model training system 1 includes a training initiator device 10 and at least one training cooperator device 20. In fig. 2, 2 training cooperator apparatuses 20 are shown. In other embodiments of the present disclosure, one training cooperator apparatus 20 may be included or more than 2 training cooperator apparatuses 20 may be included. The training initiator device 10 and the at least one training cooperator device 20 may communicate with each other via a network 30, such as, but not limited to, the internet or a local area network or the like. In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 are collectively referred to as training participant devices.
In the present disclosure, the logistic regression model to be trained is decomposed into a first number of sub-models. Here, the first number is equal to the number of training participant devices participating in model training. It is assumed here that the number of training participant devices is N; accordingly, the logistic regression model is decomposed into N sub-models, one for each training participant device. The training sample set used for model training is a vertically partitioned training sample set as described above, comprising a feature data set and corresponding label values, i.e., the x0 and y0 shown in fig. 1, with the label values located at the training initiator device 10. The sub-model and corresponding training samples owned by each training participant are a secret of that training participant and cannot be learned in full by the other training participants.
In the present disclosure, the training initiator device 10 and the at least one training cooperator device 20 together use a set of training samples at the training initiator device 10 and respective sub-models to cooperatively train a logistic regression model. The specific training process for the model will be described in detail below with reference to fig. 3 to 8.
In the present disclosure, the training initiator device 10 and the training cooperator device 20 may be any suitable computing device having computing capabilities. The computing devices include, but are not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
FIG. 3 illustrates a flow diagram of a method for collaborative training of a logistic regression model via a plurality of training participants, in accordance with an embodiment of the present disclosure. In FIG. 3, a first training participant Alice and 2 second training participants Bob and Charlie are illustrated as an example. The first training participant Alice has a sub-model W_A of the logistic regression model, the second training participant Bob has a sub-model W_B of the logistic regression model, and the second training participant Charlie has a sub-model W_C of the logistic regression model. The first training participant Alice has a first feature sample subset X_A and a label value Y, the second training participant Bob has a second feature sample subset X_B, and the second training participant Charlie has a third feature sample subset X_C. The first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C are obtained by vertically slicing the feature sample set X used for model training. The sub-models W_A, W_B and W_C together form the logistic regression model W.
As shown in FIG. 3, first, at block 301, the first training participant Alice and the second training participants Bob and Charlie initialize the parameters of their sub-models, i.e., the weight sub-vectors W_A, W_B and W_C, to obtain initial values of the sub-model parameters, and initialize the number of executed training cycles t to zero. Here, it is assumed that the end condition of the loop process is that a predetermined number of training cycles is performed, for example, T training cycles.
After the above initialization, at block 302, at Alice, Bob and Charlie, model conversion processing is performed on the respective initial sub-models W_A, W_B and W_C to obtain the conversion sub-models W_A', W_B' and W_C'.
FIG. 4 shows a flow diagram of one example of a model transformation process, according to an embodiment of the present disclosure.
As shown in FIG. 4, at block 410, at Alice, the sub-model W_A that Alice has is decomposed into W_A1, W_A2 and W_A3. Here, in the decomposition process of the sub-model W_A, the attribute value of each element of the sub-model W_A is decomposed into 3 partial attribute values, and 3 new elements are obtained from the decomposed partial attribute values. The resulting 3 new elements are then assigned to W_A1, W_A2 and W_A3, respectively, thereby obtaining W_A1, W_A2 and W_A3. At Bob, the sub-model W_B that Bob has is decomposed into W_B1, W_B2 and W_B3. At Charlie, the sub-model W_C that Charlie has is decomposed into W_C1, W_C2 and W_C3.
Then, at block 420, Alice sends W_A2 to Bob and W_A3 to Charlie. At block 430, Bob sends W_B1 to Alice and W_B3 to Charlie. At block 440, Charlie sends W_C1 to Alice and W_C2 to Bob.
Next, at block 450, at Alice, W_A1, W_B1 and W_C1 are spliced to obtain the conversion sub-model W_A'. The dimension of the resulting conversion sub-model W_A' is equal to the dimension of the feature sample set used for model training. At Bob, W_A2, W_B2 and W_C2 are spliced to obtain the conversion sub-model W_B'. At Charlie, W_A3, W_B3 and W_C3 are spliced to obtain the conversion sub-model W_C'. Likewise, the dimensions of the conversion sub-models W_B' and W_C' are equal to the dimension of the feature sample set used for model training.
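A minimal sketch of this share-and-splice conversion, under the assumption of additive decomposition over floating-point numbers and illustrative sub-model dimensions (6, 3 and 2 features for Alice, Bob and Charlie), is given below; the same pattern is reused for the feature sample subsets in FIG. 5:

```python
# Sketch of the share-and-splice model conversion of FIG. 4 for three
# parties (function names and dimensions are illustrative assumptions).
import numpy as np

def decompose(v: np.ndarray, n: int, rng) -> list:
    """Split each element of v into n additive partial values."""
    parts = [rng.standard_normal(v.shape) for _ in range(n - 1)]
    parts.append(v - sum(parts))
    return parts

rng = np.random.default_rng(1)
W_A = rng.standard_normal(6)   # Alice's sub-model (6 features)
W_B = rng.standard_normal(3)   # Bob's sub-model (3 features)
W_C = rng.standard_normal(2)   # Charlie's sub-model (2 features)

W_A1, W_A2, W_A3 = decompose(W_A, 3, rng)   # block 410: local decomposition
W_B1, W_B2, W_B3 = decompose(W_B, 3, rng)
W_C1, W_C2, W_C3 = decompose(W_C, 3, rng)

# Blocks 420-450: each party keeps one share of every sub-model and splices.
W_A_conv = np.concatenate([W_A1, W_B1, W_C1])   # held by Alice
W_B_conv = np.concatenate([W_A2, W_B2, W_C2])   # held by Bob
W_C_conv = np.concatenate([W_A3, W_B3, W_C3])   # held by Charlie

# The conversion sub-models are full-dimensional and sum to the full model W.
assert np.allclose(W_A_conv + W_B_conv + W_C_conv,
                   np.concatenate([W_A, W_B, W_C]))
```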
Returning to fig. 3, after the model conversion is completed as above, the operations of blocks 303 to 311 are cyclically performed until a cycle end condition is satisfied.
Specifically, at block 303, the first training participant Alice and the second training participants Bob and Charlie cooperate to convert the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C (i.e., the feature sample set) from vertical slicing to horizontal slicing, obtaining a first converted feature sample subset X_A', a second converted feature sample subset X_B' and a third converted feature sample subset X_C'. Each feature sample in the resulting converted feature sample subsets X_A', X_B' and X_C' has the complete feature content of a training sample, i.e., the converted subsets are similar to the feature sample subsets that would be obtained by horizontally slicing the feature sample set.
Fig. 5 shows a flow diagram of a feature sample set transformation process according to an embodiment of the present disclosure.
As shown in FIG. 5, at block 510, at Alice, the first feature sample subset X_A is decomposed into X_A1, X_A2 and X_A3. At Bob, the second feature sample subset X_B is decomposed into X_B1, X_B2 and X_B3. At Charlie, the third feature sample subset X_C is decomposed into X_C1, X_C2 and X_C3. The decomposition process for the feature sample subsets X_A, X_B and X_C is exactly the same as the decomposition process for the sub-model W_A. Then, at block 520, Alice sends X_A2 to Bob and X_A3 to Charlie. At block 530, Bob sends X_B1 to Alice and X_B3 to Charlie. At block 540, Charlie sends X_C1 to Alice and X_C2 to Bob.
Next, at block 550, at Alice, X_A1, X_B1 and X_C1 are spliced to obtain the first converted feature sample subset X_A'. At Bob, X_A2, X_B2 and X_C2 are spliced to obtain the second converted feature sample subset X_B'. At Charlie, X_A3, X_B3 and X_C3 are spliced to obtain the third converted feature sample subset X_C'. The dimensions of the resulting converted feature sample subsets X_A', X_B' and X_C' are the same as the dimension of the training sample set.
After the vertical-to-horizontal slicing conversion is performed on the first feature sample subset X_A, the second feature sample subset X_B and the third feature sample subset X_C as above, at block 304, secret-shared matrix multiplication is performed separately on the model-converted logistic regression model (i.e., W = W_A' + W_B' + W_C') and the converted feature sample subsets X_A', X_B' and X_C' of the training participants to obtain the corresponding matrix products, i.e., the matrix product W*X_A' at Alice, the matrix product W*X_B' at Bob and the matrix product W*X_C' at Charlie.
In one example of the disclosure, using secret-shared matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the converted feature sample subsets X_A', X_B' and X_C' of the training participants may include: obtaining these matrix products using trusted-initializer secret-shared matrix multiplication. How to obtain a matrix product using trusted-initializer secret-shared matrix multiplication will be explained below with reference to FIG. 6.
In another example of the present disclosure, using secret-shared matrix multiplication to obtain the matrix products between the model-converted logistic regression model W and the converted feature sample subsets X_A', X_B' and X_C' may include: obtaining these matrix products using untrusted-initializer secret-shared matrix multiplication. How to obtain a matrix product using untrusted-initializer secret-shared matrix multiplication will be explained below with reference to FIGS. 7-8.
After the matrix product of each training participant is obtained as described above, at block 305, the label value Y is decomposed at the first training participant Alice to obtain 3 partial label values Y_A, Y_B and Y_C. The decomposition process for the label value Y is the same as the decomposition process for the feature sample set X described above and will not be repeated here.
Next, at block 306, the first training participant Alice sends the partial label value Y_B to the second training participant Bob and the partial label value Y_C to the second training participant Charlie, while Alice retains the partial label value Y_A as its own partial label value.
Then, at block 307, at each training participant, the current predicted value at that training participant is determined based on the matrix product of that training participant. For example, the current predicted value at each training participant may be obtained using the formula Ŷ_i = 1/(1 + e^(-W·X_i)), where Ŷ_i is the predicted value at training participant i, W = W_A' + W_B' + W_C' is the logistic regression model, and X_i is the converted feature sample subset at that training participant.
In addition, the formula Ŷ_i = 1/(1 + e^(-W·X_i)) can be expanded by the Taylor formula, that is, 1/(1 + e^(-W·X_i)) ≈ 1/2 + (W·X_i)/4 - (W·X_i)^3/48 + ..., so that the current predicted value of each training participant can be calculated from the matrix product W·X_i of that training participant based on the Taylor expansion formula. The order to which the Taylor expansion is carried can be determined based on the accuracy required by the application scenario.
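The following sketch illustrates the Taylor approximation of the sigmoid function around zero; the truncation at the third-order term is an assumption for illustration, since the disclosure leaves the expansion order to the accuracy required by the application scenario:

```python
# Sketch of approximating the sigmoid with a truncated Taylor series,
# as used for the current predicted value (truncation order assumed).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_taylor(z, order=3):
    """1/(1+e^-z) ~ 1/2 + z/4 - z**3/48 (terms up to the given order)."""
    approx = 0.5 + z / 4.0
    if order >= 3:
        approx -= z**3 / 48.0
    return approx

z = np.array([-0.5, 0.0, 0.5])   # plays the role of W.X_i
print(sigmoid(z))                # [0.3775  0.5  0.6225]
print(sigmoid_taylor(z))         # [0.3776  0.5  0.6224] - close for small |z|
```

The approximation matters because a polynomial in W·X_i can be evaluated from secret-shared matrix products, whereas the exponential in the exact sigmoid cannot.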
At block 308, at each training participant, the prediction difference at that training participant is determined based on the current predicted value and the respective partial label value of that training participant, i.e., the prediction difference e_A = Ŷ_A - Y_A at Alice, the prediction difference e_B = Ŷ_B - Y_B at Bob, and the prediction difference e_C = Ŷ_C - Y_C at Charlie, where e is a column vector, Y is a column vector representing the label values of the training samples X, and Ŷ is a column vector representing the current predicted values for the training samples X. If the training sample X contains only a single training sample, then e, Y and Ŷ are column vectors having only a single element. If the training sample X contains multiple training samples, then e, Y and Ŷ are column vectors having a plurality of elements, where each element in Ŷ is the current predicted value of the corresponding training sample, each element in Y is the label value of the corresponding training sample, and each element in e is the difference between the current predicted value and the label value of the corresponding training sample. It is to be noted that, in the above description, e_A, e_B and e_C are collectively referred to as e, and Y_A, Y_B and Y_C are collectively referred to as Y.
Then, at block 309, the model update amounts TMP_A, TMP_B and TMP_C of the training participants are determined based on the converted feature sample set X (X = X_A' + X_B' + X_C') and the prediction differences e_A, e_B and e_C of the training participants. Specifically, the model update amount at Alice is TMP_A = X * e_A, the model update amount at Bob is TMP_B = X * e_B, and the model update amount at Charlie is TMP_C = X * e_C. Here, the model update amounts TMP_B at Bob and TMP_C at Charlie are obtained using secret-shared matrix multiplication.
Next, at block 310, at each training participant, the current conversion sub-model at that training participant is updated based on its current conversion sub-model and the corresponding model update amount. For example, the training initiator Alice uses the current conversion sub-model W_A' and the corresponding model update amount TMP_A to update its conversion sub-model, the training cooperator Bob uses the current conversion sub-model W_B' and the corresponding model update amount TMP_B to update its conversion sub-model, and the training cooperator Charlie uses the current conversion sub-model W_C' and the corresponding model update amount TMP_C to update its conversion sub-model.
In one example of the present disclosure, updating the current conversion sub-model at a training participant based on its current conversion sub-model and the corresponding model update amount may be performed according to the equation W_{n+1} = W_n - α·TMP_i = W_n - α·X·e_i, where W_{n+1} represents the updated conversion sub-model at the training participant, W_n represents the current conversion sub-model at the training participant, α represents the learning rate, X represents the feature sample set, and e_i represents the prediction difference at the training participant. When the training participant is the first training participant Alice, the updated current conversion sub-model may be calculated locally at Alice. When the training participant is a second training participant, X·e_i is obtained at that second training participant using secret-shared matrix multiplication, which may be performed using a process similar to that shown in FIG. 8 or FIGS. 7-8, except that X takes the place of W and e_i takes the place of X. It is to be noted here that when X is a single feature sample, X is a feature vector (column vector or row vector) composed of a plurality of attributes and e_i is a single prediction difference; when X is a plurality of feature samples, X is a feature matrix in which the attributes of each feature sample constitute one column/row of the feature matrix X, and e_i is a prediction difference vector. In the calculation of X·e_i, each element of e_i is multiplied with the feature values of the samples corresponding to a certain feature of the matrix X. For example, assume e_i is a column vector; in each multiplication, e_i is multiplied with a row of the matrix X, the elements of that row representing the feature values of the samples for a certain feature.
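A minimal sketch of this local update rule, assuming a samples-by-features layout for X (so that the product in the text corresponds to X^T·e_i under the transposition convention noted earlier) and an illustrative learning rate, is:

```python
# Sketch of the local update rule W_{n+1} = W_n - alpha * X^T * e_i for one
# training participant (shapes and names are illustrative assumptions).
import numpy as np

def update_submodel(W_n: np.ndarray, X: np.ndarray, e: np.ndarray,
                    alpha: float = 0.01) -> np.ndarray:
    """Gradient-style update: each feature column of X is weighted by e."""
    return W_n - alpha * (X.T @ e)

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 11))   # converted feature sample set
e = rng.standard_normal((100, 1))    # prediction difference at this party
W = rng.standard_normal((11, 1))     # current conversion sub-model
W_next = update_submodel(W, X, e)    # updated conversion sub-model
```

In the protocol itself, a second training participant would obtain the product X·e_i via secret-shared matrix multiplication rather than the local matrix product shown here.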
After the conversion sub-models are updated at the respective training participants as described above, at block 311, it is determined whether the predetermined number of cycles has been reached, that is, whether the loop end condition has been reached. If the predetermined number of cycles has been reached, block 312 is entered. If the predetermined number of cycles has not been reached, the flow returns to the operation of block 303 to perform the next training cycle, in which the updated conversion sub-model obtained by each training participant in the current cycle is used as the current conversion sub-model of the next cycle.
At block 312, the sub-models (i.e., the trained sub-models) at Alice, Bob and Charlie are determined based on the updated conversion sub-models of Alice, Bob and Charlie, respectively.
Specifically, after the conversion sub-models W_A', W_B' and W_C' are trained as described above, Alice sends W_A'[|A|:|B|] to Bob and W_A'[|B|:] to Charlie; Bob sends W_B'[0:|A|] to Alice and W_B'[|B|:] to Charlie; and Charlie sends W_C'[0:|A|] to Alice and W_C'[|A|:|B|] to Bob. Here, [|B|:] refers to the vector components after the B dimension (i.e., |B|) in the matrix, [0:|A|] refers to the vector components before the A dimension (i.e., |A|) in the matrix, i.e., the components from 0 to |A|, and [|A|:|B|] refers to the vector components after the A dimension and before the B dimension in the matrix. For example, let W = [0,1,2,3,4,5,6]; if |A| is 2 and |B| is 2, then W[0:|A|] = [0,1], W[|A|:|B|] = [2,3], and W[|B|:] = [4,5,6].
Then, at Alice, W_A = W_A'[0:|A|] + W_B'[0:|A|] + W_C'[0:|A|] is calculated; at Bob, W_B = W_A'[|A|:|B|] + W_B'[|A|:|B|] + W_C'[|A|:|B|] is calculated; and at Charlie, W_C = W_A'[|B|:] + W_B'[|B|:] + W_C'[|B|:] is calculated, thereby obtaining the trained sub-models W_A, W_B and W_C at Alice, Bob and Charlie.
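A minimal sketch of this recovery step, using the slice boundaries |A| = 2 and |B| = 2 from the example above and an illustrative 7-dimensional model (for brevity the sketch slices whole vectors locally, whereas in the protocol each party only receives the slices it needs), is:

```python
# Sketch of the final sub-model recovery (block 312) via slicing and
# summing the trained conversion sub-models (dimensions are illustrative).
import numpy as np

dim_A, dim_B = 2, 2
a, b = dim_A, dim_A + dim_B          # slice boundaries |A| and |B|

rng = np.random.default_rng(3)
W_A_conv = rng.standard_normal(7)    # conversion sub-models after training
W_B_conv = rng.standard_normal(7)
W_C_conv = rng.standard_normal(7)

# Each party collects the three shares of its own coordinate range and sums.
W_A = W_A_conv[0:a] + W_B_conv[0:a] + W_C_conv[0:a]   # at Alice
W_B = W_A_conv[a:b] + W_B_conv[a:b] + W_C_conv[a:b]   # at Bob
W_C = W_A_conv[b:] + W_B_conv[b:] + W_C_conv[b:]      # at Charlie

# Concatenating the recovered sub-models equals the full summed model.
assert np.allclose(np.concatenate([W_A, W_B, W_C]),
                   W_A_conv + W_B_conv + W_C_conv)
```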
It is to be noted here that, in the above example, the end condition of the training loop process is that the predetermined number of cycles is reached. In other examples of the disclosure, the end condition of the training loop process may also be that the determined prediction differences are within a predetermined range, i.e., each element e_i of the prediction differences e_A, e_B and e_C is within a predetermined range, for example, each element e_i of the prediction difference e is less than a predetermined threshold, or the mean of the prediction difference e is less than a predetermined threshold. Accordingly, the operations of block 311 in FIG. 3 may be performed after the operations of block 308.
FIG. 6 shows a flowchart of one example of a trusted-initializer secret-shared matrix multiplication process. This example is explained by taking the calculation of X_A' * W as an example. In the case of using trusted-initializer secret-shared matrix multiplication, the model training system 1 shown in FIG. 2 further comprises a trusted initializer device 30.
As shown in FIG. 6, first, at the trusted initializer device 30, a first number of random weight vectors, a first number of random feature matrices and a first number of random label value vectors are generated, such that the product of the sum of the first number of random weight vectors and the sum of the first number of random feature matrices is equal to the sum of the first number of random label value vectors. Here, the first number is equal to the number of training participants.
For example, as shown in FIG. 6, the trusted initializer generates 3 random weight vectors W_R,1, W_R,2 and W_R,3, 3 random feature matrices X_R,1, X_R,2 and X_R,3, and 3 random label value vectors Y_R,1, Y_R,2 and Y_R,3, where (W_R,1 + W_R,2 + W_R,3) * (X_R,1 + X_R,2 + X_R,3) = Y_R,1 + Y_R,2 + Y_R,3. Here, the dimension of each random weight vector is the same as the dimension of the conversion sub-models of the model training participants, the dimension of each random feature matrix is the same as the dimension of the converted feature sample subset, and the dimension of each random label value vector is the same as the dimension of the label value vector.
Then, at block 601, the generated W_R,1, X_R,1 and Y_R,1 are sent to the first training participant Alice; at block 602, the generated W_R,2, X_R,2 and Y_R,2 are sent to the second training participant Bob; and at block 603, the generated W_R,3, X_R,3 and Y_R,3 are sent to the second training participant Charlie.
Next, at block 604, at Alice, the feature sample subset X_A' (hereinafter referred to as the feature matrix X_A) is decomposed into a first number of feature sub-matrices, e.g., the 3 feature sub-matrices X_A1', X_A2' and X_A3' shown in FIG. 6.
Then, Alice sends each of a second number of the decomposed feature sub-matrices to the corresponding second training participant, where the second number is equal to the first number minus one. For example, at blocks 605 and 606, the 2 feature sub-matrices X_A2' and X_A3' are sent to Bob and Charlie, respectively.
Then, at each training participant, a weight sub-vector difference E_i and a feature sub-matrix difference D_i at that training participant are determined based on the weight sub-vector (i.e., the conversion sub-model W_A', W_B' or W_C') of that training participant, the corresponding feature sub-matrix X_A1', X_A2' or X_A3', and the received random weight vector and random feature matrix. For example, at block 607, at Alice, the weight sub-vector difference E1 = W_A' - W_R,1 and the feature sub-matrix difference D1 = X_A1' - X_R,1 are determined. At block 608, at Bob, the weight sub-vector difference E2 = W_B' - W_R,2 and the feature sub-matrix difference D2 = X_A2' - X_R,2 are determined. At block 609, at Charlie, the weight sub-vector difference E3 = W_C' - W_R,3 and the feature sub-matrix difference D3 = X_A3' - X_R,3 are determined.
After the respective weight sub-vector difference E_i and feature sub-matrix difference D_i are determined at each training participant, each training participant sends its E_i and D_i to the remaining training participants. For example, at blocks 610 and 611, Alice sends D1 and E1 to Bob and Charlie, respectively. At blocks 612 and 613, Bob sends D2 and E2 to Alice and Charlie, respectively. At blocks 614 and 615, Charlie sends D3 and E3 to Alice and Bob, respectively.
Then, at block 616, at each training participant, the weight sub-vector differences and the feature sub-matrix differences of the training participants are summed to obtain a total weight sub-vector difference E and a total feature sub-matrix difference D, respectively. For example, as shown in FIG. 6, D = D1 + D2 + D3 and E = E1 + E2 + E3.
Then, at each training participant, the predicted value vector Z_i is calculated based on the received random weight vector W_R,i, random feature matrix X_R,i and random label value vector Y_R,i, together with the total weight sub-vector difference E and the total feature sub-matrix difference D.
In one example of the present disclosure, at each training participant, the corresponding predicted value vector may be obtained by summing that participant's random marker value vector, the product of the total weight sub-vector difference and that participant's random feature matrix, and the product of the total feature sub-matrix difference and that participant's random weight vector (hereinafter the first calculation). Alternatively, that participant's random marker value vector, the product of the total weight sub-vector difference and that participant's random feature matrix, the product of the total feature sub-matrix difference and that participant's random weight vector, and the product of the total weight sub-vector difference and the total feature sub-matrix difference may be summed to obtain the corresponding predicted value vector (hereinafter the second calculation).
It should be noted here that, among the predicted value vectors calculated at the training participants, only one includes the product of the total weight sub-vector difference and the total feature sub-matrix difference. In other words, exactly one training participant computes its predicted value vector according to the second calculation, while the remaining training participants compute their predicted value vectors according to the first calculation.
For example, at block 617, at Alice, the corresponding predicted value vector Z1 = Y_R,1 + E·X_R,1 + D·W_R,1 + D·E is calculated. At block 618, at Bob, the corresponding predicted value vector Z2 = Y_R,2 + E·X_R,2 + D·W_R,2 is calculated. At block 619, at Charlie, the corresponding predicted value vector Z3 = Y_R,3 + E·X_R,3 + D·W_R,3 is calculated.
Note that, in FIG. 6, the Z1 calculated at Alice includes the term D·E. In other examples of the present disclosure, D·E may instead be included in the Z_i calculated by Bob or Charlie, in which case D·E is not included in the Z1 calculated at Alice. In other words, exactly one of the Z_i calculated at the training participants contains D·E.
Each training participant then discloses its calculated predicted value vector to the remaining training participants. For example, at blocks 620 and 621, Alice sends the predicted value vector Z1 to Bob and Charlie, respectively. At blocks 622 and 623, Bob sends the predicted value vector Z2 to Alice and Charlie, respectively. At blocks 624 and 625, Charlie sends the predicted value vector Z3 to Alice and Bob, respectively.
Then, at blocks 626, 627 and 628, each training participant sums the predicted value vectors of all training participants, Z = Z1 + Z2 + Z3, to obtain the corresponding matrix product result.
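To check that blocks 604 to 628 reconstruct the intended product, the whole exchange can be simulated end to end. In the numpy sketch below, the matrix orientations (weight sub-vectors as rows multiplying the feature matrix from the left) and the shapes are assumptions made for concreteness, and message passing is collapsed into plain sums; it verifies that the disclosed Z_i sum to (W_A' + W_B' + W_C')·X_A':

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, parties = 4, 6, 3

# Secret inputs: each participant's weight sub-vector (conversion
# submodel) and Alice's converted feature sample subset X_A'
W_sub = [rng.standard_normal((1, d)) for _ in range(parties)]
X_A = rng.standard_normal((d, n))

# Blocks 601-603: trusted-initializer triples with
# sum(Y_R) == sum(W_R) @ sum(X_R)
W_R = [rng.standard_normal((1, d)) for _ in range(parties)]
X_R = [rng.standard_normal((d, n)) for _ in range(parties)]
Y_R = [rng.standard_normal((1, n)) for _ in range(parties - 1)]
Y_R.append(sum(W_R) @ sum(X_R) - sum(Y_R))

# Blocks 604-606: Alice additively splits X_A' into sub-matrices X_Ai'
X_A_shares = [rng.standard_normal((d, n)) for _ in range(parties - 1)]
X_A_shares.append(X_A - sum(X_A_shares))

# Blocks 607-616: local differences E_i, D_i are disclosed and summed
E = sum(W_sub[i] - W_R[i] for i in range(parties))       # total E
D = sum(X_A_shares[i] - X_R[i] for i in range(parties))  # total D

# Blocks 617-619: per-participant predicted value vectors; exactly one
# of them (here Alice's) carries the extra E @ D term
Z = [Y_R[i] + E @ X_R[i] + W_R[i] @ D for i in range(parties)]
Z[0] = Z[0] + E @ D

# Blocks 620-628: the disclosed Z_i sum to the true matrix product
assert np.allclose(sum(Z), sum(W_sub) @ X_A)
```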
Fig. 7 illustrates a flow diagram of a process for obtaining the matrix products of the individual training participants using untrusted-initializer secret sharing matrix multiplication, based on the current submodels and the converted feature sample subsets of the individual training participants, according to an embodiment of the disclosure. The following description takes the calculation of the matrix product at Alice as an example. The matrix product calculation processes at the training participants Bob and Charlie are similar to that at Alice.
As shown in FIG. 7, first, at block 710, at Alice, the matrix product Y_A1 = W_A'·X_A' of the first weight sub-matrix (i.e., the conversion submodel) W_A' at Alice and the first feature matrix (the first converted feature sample subset) X_A' is calculated.
Next, at block 720, untrusted-initializer secret sharing matrix multiplication is used to calculate the matrix product of the first weight sub-matrix of each second training participant (e.g., W_B' and W_C' for Bob and Charlie) and the first feature matrix X_A' (i.e., Y_A2 = W_B'·X_A' and Y_A3 = W_C'·X_A'). How to calculate a matrix product using untrusted-initializer secret sharing matrix multiplication is explained in detail below with reference to FIG. 8.
Then, at Alice, the resulting matrix products (e.g., Y_A1, Y_A2 and Y_A3) are summed to obtain the matrix product at Alice: Y_A = Y_A1 + Y_A2 + Y_A3.
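In code, the assembly step at Alice is simply a local product plus the jointly computed ones. A small sketch, with illustrative names and with Y_A2 and Y_A3 assumed to have been produced by the FIG. 8 protocol:

```python
import numpy as np

def matrix_product_at_alice(W_A, X_A, shared_products):
    """Sum Alice's local product (block 710) with the secret-shared
    products Y_A2 = W_B' @ X_A' and Y_A3 = W_C' @ X_A' obtained via
    the FIG. 8 protocol (passed in here as already-computed results)."""
    Y_A1 = W_A @ X_A                     # block 710: purely local
    return Y_A1 + sum(shared_products)   # Y_A = Y_A1 + Y_A2 + Y_A3

# Toy usage with random stand-ins for the jointly computed products:
rng = np.random.default_rng(0)
W_A, X_A = rng.standard_normal((1, 4)), rng.standard_normal((4, 6))
Y_A2, Y_A3 = rng.standard_normal((1, 6)), rng.standard_normal((1, 6))
Y_A = matrix_product_at_alice(W_A, X_A, [Y_A2, Y_A3])
```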
Fig. 8 shows a flow diagram of one example of an untrusted-initializer secret sharing matrix multiplication process according to an embodiment of the present disclosure. In FIG. 8, the calculation of W_B'·X_A' between the training participants Alice and Bob is taken as an example.
As shown in FIG. 8, first, at block 801, if the number of rows of X_A' at Alice (hereinafter referred to as the first feature matrix) is not even, and/or the number of columns of the current sub-model parameter W_B' at Bob (hereinafter referred to as the first weight sub-matrix) is not even, dimension completion processing is performed on the first feature matrix X_A' and/or the first weight sub-matrix W_B' so that the number of rows of X_A' is even and/or the number of columns of W_B' is even. For example, the dimension completion processing may be performed by appending a row of 0 values to the end of the first feature matrix X_A' and/or appending a column of 0 values to the end of the first weight sub-matrix W_B'. In the following description, it is assumed that the first weight sub-matrix W_B' has dimension I×J and the first feature matrix X_A' has dimension J×K, where J is an even number.
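A zero pad preserves the product, since the appended column of W_B' only ever multiplies the appended zero row of X_A'. A hypothetical helper for block 801 (the function name is assumed) might look like:

```python
import numpy as np

def pad_to_even_inner_dim(X_A, W_B):
    """Append a zero row to X_A' and a zero column to W_B' when the
    shared inner dimension J is odd; W_B' @ X_A' is unchanged."""
    J = X_A.shape[0]
    if J % 2 == 1:
        X_A = np.vstack([X_A, np.zeros((1, X_A.shape[1]))])
        W_B = np.hstack([W_B, np.zeros((W_B.shape[0], 1))])
    return X_A, W_B
```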
The operations of blocks 802 to 804 are then performed at Alice to obtain a random feature matrix X1 and second and third feature matrices X2 and X3. Specifically, at block 802, a random feature matrix X1 is generated; its dimension is the same as that of the first feature matrix X_A', i.e., J×K. At block 803, the random feature matrix X1 is subtracted from the first feature matrix X_A' to obtain a second feature matrix X2, of dimension J×K. At block 804, the even-row sub-matrix X1_e of the random feature matrix X1 is subtracted from the odd-row sub-matrix X1_o of the random feature matrix X1 to obtain a third feature matrix X3, of dimension j×K, where j = J/2.
Similarly, the operations of blocks 805 to 807 are performed at Bob to obtain a random weight sub-matrix W_B1 and second and third weight sub-matrices W_B2 and W_B3. Specifically, at block 805, a random weight sub-matrix W_B1 is generated; its dimension is the same as that of the first weight sub-matrix W_B', i.e., I×J. At block 806, the first weight sub-matrix W_B' and the random weight sub-matrix W_B1 are summed to obtain a second weight sub-matrix W_B2, of dimension I×J. At block 807, the odd-column sub-matrix W_B1_o of the random weight sub-matrix W_B1 is added to the even-column sub-matrix W_B1_e of the random weight sub-matrix W_B1 to obtain a third weight sub-matrix W_B3, of dimension I×j, where j = J/2.
Then, at block 808, Alice sends the generated second feature matrix X2 and third feature matrix X3 to Bob, and at block 809, Bob sends the second weight sub-matrix W_B2 and the third weight sub-matrix W_B3 to Alice.
Next, at block 810, at Alice, a matrix calculation is performed based on the equation Y1 = W_B2·(2·X_A' - X1) - W_B3·(X3 + X1_e) to obtain the first matrix product Y1, and at block 812, the first matrix product Y1 is sent to Bob.
At block 811, at Bob, the second matrix product Y2 is calculated based on the equation Y2 = (W_B' + 2·W_B1)·X2 + (W_B3 + W_B1_o)·X3, and at block 813, the second matrix product Y2 is sent to Alice.
Then, at blocks 814 and 815, the first matrix product Y1 and the second matrix product Y2 are summed at Alice and Bob, respectively, to obtain W_B'·X_A' = Y_A2 = Y1 + Y2.
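The share definitions and the two equations above cannot all hold with the signs as transcribed in the published translation. The following self-contained numpy sketch therefore adjusts them minimally so that the defining identity Y1 + Y2 = W_B'·X_A' verifiably holds: it takes X2 = X1 - X_A' at block 803 and uses W_B3 - W_B1_o in Bob's equation at block 811. It should be read as a consistent variant of the FIG. 8 protocol, not a verbatim transcription; all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 3, 4, 5                 # J even (after dimension completion)

A = rng.standard_normal((I, J))   # W_B', held by Bob
B = rng.standard_normal((J, K))   # X_A', held by Alice

# Alice, blocks 802-804
X1 = rng.standard_normal((J, K))  # random feature matrix
X2 = X1 - B                       # sign chosen so the identity holds
X3 = X1[0::2, :] - X1[1::2, :]    # odd-row minus even-row sub-matrix

# Bob, blocks 805-807
W1 = rng.standard_normal((I, J))  # random weight sub-matrix
W2 = A + W1
W3 = W1[:, 0::2] + W1[:, 1::2]    # odd-column plus even-column sub-matrix

# Blocks 808-809: Alice sends (X2, X3) to Bob; Bob sends (W2, W3) to Alice

# Block 810 (Alice): note that X3 + X1_e equals the odd-row sub-matrix X1_o
Y1 = W2 @ (2 * B - X1) - W3 @ (X3 + X1[1::2, :])

# Block 811 (Bob): W3 - W1_o equals the even-column sub-matrix W1_e
Y2 = (A + 2 * W1) @ X2 + (W3 - W1[:, 0::2]) @ X3

# Blocks 814-815: the two shares reconstruct the true product
assert np.allclose(Y1 + Y2, A @ B)
```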
Furthermore, it should be noted that figs. 3 to 8 show a model training scheme with 1 first training participant and 2 second training participants. In other examples of the present disclosure, the scheme may also include 1 second training participant or more than 2 second training participants.
By using the logistic regression model training method disclosed in figs. 3 to 8, the model parameters of the logistic regression model can be trained without leaking the secret data of any training participant, and the workload of model training grows only linearly, rather than exponentially, with the number of feature samples used for training. The efficiency of model training can therefore be improved while the security of each training participant's data is guaranteed.
FIG. 9 shows a schematic diagram of an apparatus for collaborative training of a logistic regression model via multiple training participants (hereinafter referred to as a model training apparatus) 900, according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first subset of feature samples and a tag value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one. The model training apparatus 900 is located on the first training participant side.
As shown in fig. 9, the model training apparatus 900 includes a model conversion unit 901, a sample conversion unit 902, a matrix product acquisition unit 903, a marker value decomposition unit 904, a marker value transmitting unit 905, a prediction value determination unit 906, a prediction difference determination unit 907, a model update amount determination unit 908, a model update unit 909, and a model determination unit 910.
The model transformation unit 901 is configured to perform a model transformation process on the sub-models of the respective training participants to obtain transformed sub-models of the respective training participants. The operation of the model conversion unit 901 may refer to the operation of the block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
In performing model training, the sample conversion unit 902, the matrix product acquisition unit 903, the marker value decomposition unit 904, the marker value transmission unit 905, the prediction value determination unit 906, the prediction difference value determination unit 907, the model update amount determination unit 908, and the model update unit 909 are configured to cyclically perform operations until a cycle end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
In particular, during each cycle, the sample conversion unit 902 is configured to perform a vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 902 may refer to the operation of block 303 described above with reference to fig. 3 and the process described with reference to fig. 5.
The matrix product acquisition unit 903 is configured to obtain a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication. The operation of the matrix product acquisition unit 903 may refer to the operation of block 304 described above with reference to fig. 3 and the operations described with reference to figs. 6-8.
The marker value decomposition unit 904 is configured to decompose the marker value into a first number of partial marker values. The operation of the marker value decomposition unit 904 may refer to the operation of block 305 described above with reference to fig. 3.
The marker value transmitting unit 905 is configured to transmit each of the second number of partial marker values to a corresponding second training participant, respectively. The operation of the flag value transmitting unit 905 may refer to the operation of the block 306 described above with reference to fig. 3.
The predictor determination unit 906 is configured to determine a current predictor at the first training participant based on the matrix product at the first training participant. The operation of the prediction value determination unit 906 may refer to the operation of the block 307 described above with reference to fig. 3.
The prediction difference determination unit 907 is configured to determine a prediction difference between the current prediction value of the first training participant and the corresponding partial marker value. The operation of the prediction difference determination unit 907 may refer to the operation of the block 308 described above with reference to fig. 3.
The model update amount determination unit 908 is configured to determine a model update amount at the first training participant based on the set of feature samples and the predicted difference at the first training participant. The operation of the model update amount determination unit 908 may refer to the operation of block 309 described above with reference to fig. 3.
The model update unit 909 is configured to update the current conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount. The operation of the model update unit 909 may refer to the operation of block 310 described above with reference to fig. 3.
The model determination unit 910 is configured to determine a sub-model of the first training participant based on the conversion sub-models of the respective training participants when the loop end condition is fulfilled.
In one example of the present disclosure, the matrix product acquisition unit 903 may be configured to: and obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset of the first training participant by using secret shared matrix multiplication of the trusted initializer. The operations of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to fig. 6.
In another example of the present disclosure, the matrix product acquisition unit 903 may be configured to: and obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the first training participant by using the secret shared matrix multiplication of the untrusted initializer. The operations of the matrix product acquisition unit 903 may refer to the operations performed at Alice described above with reference to fig. 7-8.
In one example of the present disclosure, the sample conversion unit 902 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample stitching module (not shown). The sample decomposition module is configured to decompose a first subset of feature samples into a first number of first partial subsets of feature samples. The sample sending/receiving module is configured to send each of a second number of the first partial feature sample subsets to a corresponding second training participant, and to receive a corresponding second partial feature sample subset from the respective second training participant, the received respective second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at the respective second training participant. The sample stitching module is configured to stitch the remaining first subset of partial feature samples and the received respective second subset of partial feature samples to obtain a transformed subset of feature samples at the first training participant.
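One consistent reading of this decompose/send/stitch flow is additive secret sharing applied per vertical slice, with the kept share and the received shares stitched column-wise. The sketch below follows only that reading, with illustrative names and shapes (FIG. 5 of the patent gives the authoritative construction); it checks that the converted subsets at the three participants together encode the full feature sample set:

```python
import numpy as np

def split_into_shares(X, n_parties, rng):
    """Decompose one party's vertical slice into additive shares."""
    shares = [rng.standard_normal(X.shape) for _ in range(n_parties - 1)]
    shares.append(X - sum(shares))
    return shares

rng = np.random.default_rng(2)
n_parties = 3

# Vertically partitioned data: each party holds some feature columns
X_alice = rng.standard_normal((8, 2))    # first training participant
X_bob = rng.standard_normal((8, 3))      # second training participant
X_charlie = rng.standard_normal((8, 4))  # second training participant

shares = [split_into_shares(X, n_parties, rng)
          for X in (X_alice, X_bob, X_charlie)]

# Participant i keeps/receives the i-th share of every slice and
# stitches them column-wise into its converted feature sample subset
converted = [np.hstack([shares[p][i] for p in range(n_parties)])
             for i in range(n_parties)]

# Together the converted subsets sum to the full feature sample set
full = np.hstack((X_alice, X_bob, X_charlie))
assert np.allclose(sum(converted), full)
```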
FIG. 10 illustrates a block diagram of an apparatus for collaborative training of a logistic regression model via a plurality of training participants (hereinafter referred to as model training apparatus 1000), according to an embodiment of the present disclosure. In this embodiment, the logistic regression model includes a first number of sub-models, the first number being equal to the number of training participants, and each training participant has one sub-model. The training participants include a first training participant and a second number of second training participants. The first training participant has a first subset of feature samples and a tag value, each second training participant has a second subset of feature samples, the first and second subsets of feature samples are obtained by vertically slicing a set of feature samples used for model training, the second number is equal to the first number minus one. The model training apparatus 1000 is located on the second training participant side.
As shown in fig. 10, the model training apparatus 1000 includes a model conversion unit 1010, a sample conversion unit 1020, a matrix product acquisition unit 1030, a marker value receiving unit 1040, a predicted value determination unit 1050, a prediction difference value determination unit 1060, a model update amount determination unit 1070, a model update unit 1080, and a model determination unit 1090.
The model conversion unit 1010 is configured to perform a model conversion process on the sub-models of the respective training participants to obtain conversion sub-models of the respective training participants. The operation of the model conversion unit 1010 may refer to the operation of block 302 described above with reference to fig. 3 and the operation described with reference to fig. 4.
In performing model training, the sample conversion unit 1020, the matrix product acquisition unit 1030, the flag value receiving unit 1040, the prediction value determination unit 1050, the prediction difference determination unit 1060, the model update amount determination unit 1070, and the model update unit 1080 are configured to perform operations in a loop until a loop end condition is satisfied. The loop-ending condition may include: reaching a predetermined cycle number; or the determined prediction difference is within a predetermined range. When the loop process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next loop process.
In particular, during each cycle, the sample conversion unit 1020 is configured to perform a vertical-horizontal slicing conversion on the feature sample set to obtain a converted feature sample subset at each training participant. The operation of the sample conversion unit 1020 may refer to the operation of block 303 described above with reference to fig. 3 and the operation described with reference to fig. 5.
The matrix product acquisition unit 1030 is configured to obtain a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication. The operation of the matrix product acquisition unit 1030 may refer to the operation of block 304 described above with reference to fig. 3 and the processes described with reference to fig. 6-8.
The marker value receiving unit 1040 is configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of a first number of partial marker values resulting from decomposition of the marker value at the first training participant. The operation of the marker value receiving unit 1040 may refer to the operation of the block 306 described above with reference to fig. 3.
The predictor determination unit 1050 is configured to determine a current predictor at the second training participant based on the matrix product at the second training participant. The operation of the prediction value determination unit 1050 may refer to the operation of the block 307 described above with reference to fig. 3.
The prediction difference determination unit 1060 is configured to determine a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value. The operation of the prediction difference determination unit 1060 may refer to the operation of the block 308 described above with reference to fig. 3.
The model update amount determination unit 1070 is configured to obtain a model update amount of the second training participant using a secret sharing matrix multiplication based on the set of feature samples and the predicted difference value of the second training participant. The operation of the model update amount determination unit 1070 may refer to the operation of the block 309 described above with reference to fig. 3.
The model update unit 1080 is configured to update the current conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model update amount. The operation of the model update unit 1080 may refer to the operation of block 310 described above with reference to fig. 3.
The model determination unit 1090 is configured to determine a sub-model of the second training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied. The operation of the model determination unit 1090 may refer to the operation of block 312 described above with reference to fig. 3.
In one example of the present disclosure, the sample conversion unit 1020 may include a sample decomposition module (not shown), a sample transmission/reception module (not shown), and a sample stitching module (not shown). The sample decomposition module is configured to decompose the second subset of feature samples into a first number of second partial subsets of feature samples. The sample sending/receiving module is configured to send each of a second number of subsets of second partial feature samples to a first training participant and to receive a first partial subset of feature samples from the first training participant and a second partial subset of feature samples from each of the remaining second training participants, the first partial subset of feature samples being one of a first number of subsets of first partial feature samples obtained by decomposing the subset of feature samples at the first training participant, each received subset of second partial feature samples being one of a first number of subsets of second partial feature samples obtained by decomposing the respective subset of second feature samples at each remaining second training participant. The sample stitching module is configured to stitch the remaining second partial feature sample subset, the received first and second partial feature sample subsets to obtain a transformed feature sample subset at the second training participant.
In one example of the present disclosure, the matrix product acquisition unit 1030 may be configured to: and obtaining a matrix product between the model-converted logistic regression model and the conversion feature sample subset of the second training participant by using secret shared matrix multiplication of the trusted initializer. The operations of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to fig. 6.
In another example of the present disclosure, the matrix product acquisition unit 1030 may be configured to: and obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using the secret shared matrix multiplication of the untrusted initializer. The operations of the matrix product acquisition unit 1030 may refer to the operations performed at the second training participant described above with reference to fig. 7-8.
In one example of the present disclosure, the model update amount determination unit 1070 may be configured to: and obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication with a trusted initializer based on the feature sample set and the prediction difference value of the second training participant.
In another example of the present disclosure, the model update amount determination unit 1070 may be configured to: and obtaining a model updating quantity of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the characteristic sample set and the prediction difference value of the second training participant.
Embodiments of a model training method, apparatus and system according to the present disclosure are described above with reference to fig. 1 through 10. The above model training device can be implemented by hardware, or can be implemented by software, or a combination of hardware and software.
FIG. 11 illustrates a hardware block diagram of a computing device 1100 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the disclosure. As shown in fig. 11, the computing device 1100 may include at least one processor 1110, a storage (e.g., a non-volatile storage) 1120, a memory 1130, and a communication interface 1140, and the at least one processor 1110, the storage 1120, the memory 1130, and the communication interface 1140 are connected together via a bus 1160. The at least one processor 1110 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1110 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication; decomposing the labeled values into a first number of partial labeled values and sending each of a second number of partial labeled values to a corresponding second training participant, respectively; determining a current predictor at the first training participant based on the matrix product at the first training participant; determining a prediction difference value between a current prediction value of a first training participant and a corresponding partial marker value; determining a model update amount at the first training participant based on the feature sample set and the prediction difference value at the first training participant; updating the conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and when the cycle end condition is met, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1110 to perform the various operations and functions described above in connection with fig. 1-10 in the various embodiments of the present disclosure.
FIG. 12 illustrates a hardware block diagram of a computing device 1200 for implementing collaborative training of a logistic regression model via multiple training participants, according to an embodiment of the disclosure. As shown in fig. 12, computing device 1200 may include at least one processor 1210, storage (e.g., non-volatile storage) 1220, memory 1230, and a communication interface 1240, and the at least one processor 1210, storage 1220, memory 1230, and communication interface 1240 are connected together via a bus 1260. The at least one processor 1210 executes at least one computer-readable instruction (i.e., the elements described above as being implemented in software) stored or encoded in memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 1210 to: carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants; the following loop process is executed until a loop end condition is satisfied: performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant; obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication; receiving a corresponding partial tag value from a first training participant, the partial tag value being one of a first number of partial tag values resulting from decomposition of the tag value at the first training participant; determining a current predictor at the second training participant based on the matrix product at the second training participant; determining a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value; obtaining a model update quantity at the second training participant using secret sharing matrix multiplication based on the feature sample set and the predicted difference value of the second training participant; updating the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model updating amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and when the cycle end condition is met, determining a sub-model of a second training participant based on the conversion sub-models of the training participants.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 1210 to perform the various operations and functions described above in connection with fig. 1-10 in the various embodiments of the present disclosure.
According to one embodiment, a program product, such as a machine-readable medium (e.g., a non-transitory machine-readable medium), is provided. A machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-10 in various embodiments of the present disclosure. Specifically, a system or apparatus may be provided which is provided with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored, and causes a computer or processor of the system or apparatus to read out and execute instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (21)

1. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the method being performed by the first training participant, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using secret-shared matrix multiplication;
decomposing the marker value into the first number of partial marker values and sending each of the second number of partial marker values to a corresponding second training participant, respectively;
determining a current predictor at the first training participant based on a matrix product at the first training participant;
determining a prediction difference between a current prediction value of the first training participant and a corresponding partial marker value;
determining a model update amount at the first training participant based on the converted feature sample set and the prediction difference value at the first training participant;
updating the conversion submodel of the first training participant based on the current conversion submodel of the first training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and
and when the cycle end condition is met, determining the sub-model of the first training participant based on the conversion sub-models of the training participants.
2. The method of claim 1, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the first subset of feature samples into the first number of first partial subsets of feature samples;
sending each of the second number of first partial feature sample subsets to a corresponding second training participant;
receiving a second partial feature sample subset from each second training participant, the respective received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at the respective second training participant; and
and splicing the remaining first part of feature sample subset and the received second part of feature sample subset to obtain a conversion feature sample subset at the first training participant.
3. The method of claim 1, wherein using secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the first training participant comprises:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the first training participant by using a secret shared matrix multiplication with a trusted initializer; or
Using untrusted initializer secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the first training participant.
4. The method of claim 1, wherein determining a current predictor at the first training participant based on a matrix product at the first training participant comprises:
determining the current predictor at the first training participant based on the matrix product at the first training participant according to a Taylor expansion formula.
5. The method of any of claims 1 to 4, wherein the end-of-loop condition comprises:
a predetermined number of cycles; or
The determined prediction difference is within a predetermined range.
6. A method for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the method being performed by the second training participants, the method comprising:
carrying out model conversion processing on the submodels of all the training participants to obtain conversion submodels of all the training participants;
the following loop process is executed until a loop end condition is satisfied:
performing vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the second training participant using secret-shared matrix multiplication;
receiving a corresponding partial tag value from the first training participant, the partial tag value being one of the first number of partial tag values resulting from decomposing the tag value at the first training participant;
determining a current predictor at the second training participant based on the matrix product at the second training participant;
determining a prediction difference value at the second training participant using the current prediction value of the second training participant and the received partial marker value;
obtaining a model update quantity at the second training participant by using secret sharing matrix multiplication based on the converted feature sample set and the prediction difference value of the second training participant;
updating the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and the corresponding model update amount, wherein when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process; and
and when the cycle end condition is met, determining the sub-model of the second training participant based on the conversion sub-models of the training participants.
7. The method of claim 6, wherein performing a vertical-to-horizontal slicing transform on the feature sample set to obtain transformed feature sample subsets at each training participant comprises:
decomposing the second subset of feature samples into the first number of second partial subsets of feature samples;
sending each of the second number of second partial feature sample subsets to the first training participant and to the remaining second training participants;
receiving a first portion of a subset of feature samples from the first training participant and a second portion of a subset of feature samples from each of the remaining second training participants, the first portion of the subset of feature samples being one of a first number of first portions of the subset of feature samples obtained by decomposing the subset of feature samples at the first training participant, each received second portion of the subset of feature samples being one of a first number of second portions of the subset of feature samples obtained by decomposing the respective second subset of feature samples at each of the remaining second training participants; and
and splicing the remaining second partial feature sample subset and the received first and second partial feature sample subsets to obtain a converted feature sample subset at the second training participant.
8. The method of claim 6, wherein using secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset of the second training participant comprises:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using a secret shared matrix multiplication with a trusted initializer; or
Obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant using untrusted initializer secret sharing matrix multiplication.
9. The method of claim 6, wherein obtaining the model update quantity for the second training participant using secret sharing matrix multiplication based on the converted feature sample set and the predicted difference value for the second training participant comprises:
obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of a trusted initiator based on the converted feature sample set and the prediction difference value of the second training participant; or
And obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the converted feature sample set and the prediction difference value of the second training participant.
10. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the apparatus being located on the first training participant side, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a matrix product obtaining unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the first training participant using secret-shared matrix multiplication;
a tag value decomposition unit configured to decompose the tag value into the first number of partial tag values;
a marker value transmitting unit configured to transmit each of the second number of partial marker values to a corresponding second training participant, respectively;
a predictor determination unit configured to determine a current predictor at the first training participant based on a matrix product at the first training participant;
a prediction difference determination unit configured to determine a prediction difference between a current prediction value of the first training participant and a corresponding partial marker value;
a model update amount determination unit configured to determine a model update amount at the first training participant based on the converted feature sample set and the prediction difference value at the first training participant;
a model updating unit configured to update a conversion submodel of the first training participant based on a current conversion submodel of the first training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the first training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the flag value decomposition unit, the flag value transmission unit, the prediction value determination unit, the prediction difference value determination unit, the model update amount determination unit, and the model update unit are configured to cyclically perform operations until a cycle end condition is satisfied,
wherein, when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process.
11. The apparatus of claim 10, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the first subset of feature samples into the first number of first partial subsets of feature samples;
a sample sending/receiving module configured to send each of the second number of first partial feature sample subsets to a corresponding second training participant and receive a corresponding second partial feature sample subset from each second training participant, each received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the second feature sample subset at each second training participant; and
a sample stitching module configured to stitch the remaining first partial feature sample subset and the received second partial feature sample subsets to obtain a converted feature sample subset at the first training participant.
12. The apparatus of claim 10, wherein the matrix product acquisition unit is configured to:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset at the first training participant using a secret shared matrix multiplication with a trusted initiator; or
Using untrusted initializer secret sharing matrix multiplication to obtain a matrix product between the model transformed logistic regression model and the transformed feature sample subset at the first training participant.
13. An apparatus for collaborative training of a logistic regression model via a plurality of training participants, the logistic regression model having a first number of sub-models, the plurality of training participants including a first training participant and a second number of second training participants, each training participant having a sub-model, the first training participant having a first subset of feature samples and labeled values, each second training participant having a second subset of feature samples, the first and second subsets of feature samples being obtained by vertically slicing a set of feature samples used for model training, the second number being equal to the first number minus one, the apparatus being located on the second training participant side, the apparatus comprising:
the model conversion unit is configured to perform model conversion processing on the submodels of the training participants to obtain conversion submodels of the training participants;
the sample conversion unit is configured to perform vertical-horizontal segmentation conversion on the feature sample set to obtain a conversion feature sample subset at each training participant;
a matrix product obtaining unit configured to obtain a matrix product between the model-converted logistic regression model and the converted feature sample subset at the second training participant using secret sharing matrix multiplication;
a marker value receiving unit configured to receive a corresponding partial marker value from the first training participant, the partial marker value being one of the first number of partial marker values resulting from decomposition of the marker value at the first training participant;
a predictor determination unit configured to determine a current predictor at the second training participant based on a matrix product at the second training participant;
a prediction difference determination unit configured to determine a prediction difference at the second training participant using the current prediction value of the second training participant and the received partial marker value;
a model update amount determination unit configured to obtain a model update amount of the second training participant by using secret sharing matrix multiplication based on the converted feature sample set and the prediction difference value of the second training participant;
a model updating unit configured to update the conversion submodel of the second training participant based on the current conversion submodel of the second training participant and a corresponding model update amount; and
a model determination unit configured to determine a sub-model of the second training participant based on the conversion sub-models of the respective training participants when the loop end condition is satisfied,
wherein the sample conversion unit, the matrix product acquisition unit, the flag value reception unit, the prediction value determination unit, the prediction difference value determination unit, the model update amount determination unit, and the model update unit are configured to cyclically perform operations until a cycle end condition is satisfied,
wherein, when the cycle process is not finished, the updated conversion submodel of each training participant is used as the current conversion submodel of the next cycle process.
14. The apparatus of claim 13, wherein the sample conversion unit comprises:
a sample decomposition module configured to decompose the second subset of feature samples into the first number of second partial subset of feature samples;
a sample sending/receiving module configured to send each of the second number of second partial feature sample subsets to a first training participant and remaining second training participants, and to receive a first partial feature sample subset from the first training participant and a second partial feature sample subset from each of the remaining second training participants, the first partial feature sample subset being one of a first number of first partial feature sample subsets obtained by decomposing the feature sample subset at the first training participant, each received second partial feature sample subset being one of a first number of second partial feature sample subsets obtained by decomposing the respective second feature sample subset at each remaining second training participant; and
a sample stitching module configured to stitch the remaining second partial feature sample subset, the received first and second partial feature sample subsets to obtain a transformed feature sample subset at the second training participant.
15. The apparatus of claim 13, wherein the matrix product acquisition unit is configured to:
obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant by using a secret shared matrix multiplication with a trusted initializer; or
Obtaining a matrix product between the model-transformed logistic regression model and the transformed feature sample subset of the second training participant using untrusted initializer secret sharing matrix multiplication.
16. The apparatus of claim 13, wherein the model update amount determination unit is configured to:
obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of a trusted initiator based on the converted feature sample set and the prediction difference value of the second training participant; or
And obtaining a model updating amount of the second training participant by using secret sharing matrix multiplication of the untrusted initializer based on the converted feature sample set and the prediction difference value of the second training participant.
17. A system for collaboratively training a logistic regression model via a plurality of training participants, the logistic regression model having a first number of submodels, the system comprising:
a first training participant device comprising the apparatus of any one of claims 10 to 12; and
a second number of second training participant devices, each second training participant device comprising the apparatus of any one of claims 13 to 16,
wherein each training participant has a submodel, the first training participant has a first feature sample subset and a label value, each second training participant has a second feature sample subset, the first and second feature sample subsets are obtained by vertically slicing the feature sample set used for model training, and the second number is equal to the first number minus one.
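The vertical slicing that claim 17 presupposes gives every participant the same sample rows but disjoint feature columns, with the label values held only by the first training participant. A toy illustration, in which all sizes and names (eight samples, three parties, two columns each) are arbitrary assumptions of ours:

```python
import numpy as np

rng = np.random.default_rng(2)

# Full feature sample set: 8 samples with 6 features in total.
X = rng.standard_normal((8, 6))
col_splits = [2, 2, 2]          # feature columns owned by each participant

# Vertical slicing: every party gets all rows but only its own columns.
subsets, start = [], 0
for width in col_splits:
    subsets.append(X[:, start:start + width])
    start += width

y = rng.integers(0, 2, size=8)  # labels: held by the first participant only

# The slices reassemble the original feature sample set.
assert np.allclose(np.hstack(subsets), X)
```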
18. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 5.
19. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 5.
20. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 6 to 9.
21. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 6 to 9.
CN201910600908.9A 2019-07-04 2019-07-04 Model training method, device and system Active CN112183565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910600908.9A CN112183565B (en) 2019-07-04 2019-07-04 Model training method, device and system


Publications (2)

Publication Number Publication Date
CN112183565A (en) 2021-01-05
CN112183565B (en) 2023-07-14

Family

ID=73915148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910600908.9A Active CN112183565B (en) 2019-07-04 2019-07-04 Model training method, device and system

Country Status (1)

Country Link
CN (1) CN112183565B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105656692A (en) * 2016-03-14 2016-06-08 Nanjing University of Posts and Telecommunications Multi-instance multi-label learning based area monitoring method for wireless sensor networks
WO2018174873A1 (en) * 2017-03-22 2018-09-27 Visa International Service Association Privacy-preserving machine learning
US20180316502A1 (en) * 2017-04-27 2018-11-01 Factom Data Reproducibility Using Blockchains
CN109241749A (en) * 2017-07-04 2019-01-18 Alibaba Group Holding Limited Data encryption and machine learning model training method, apparatus and electronic device
CN109327421A (en) * 2017-08-01 2019-02-12 Alibaba Group Holding Limited Data encryption and machine learning model training method, apparatus and electronic device
WO2019046651A2 (en) * 2017-08-30 2019-03-07 Inpher, Inc. High-precision privacy-preserving real-valued function evaluation
CN108520303A (en) * 2018-03-02 2018-09-11 Alibaba Group Holding Limited Recommendation system construction method and apparatus
CN108921358A (en) * 2018-07-16 2018-11-30 Guangdong University of Technology Electric load feature prediction method, prediction system and related apparatus
CN109413087A (en) * 2018-11-16 2019-03-01 Jingdong City (Nanjing) Technology Co., Ltd. Data sharing method, apparatus, digital gateway and computer-readable storage medium
CN109446430A (en) * 2018-11-29 2019-03-08 Xidian University Product recommendation method, apparatus, computer device and readable storage medium
WO2019072316A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A distributed multi-party security model training framework for privacy protection
WO2019072315A2 (en) * 2019-01-11 2019-04-18 Alibaba Group Holding Limited A logistic regression modeling scheme using secrete sharing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIANG YANG et al.: "Federated Machine Learning: Concept and Applications", arXiv:1902.04885v1, pages 1-19 *


Similar Documents

Publication Publication Date Title
CN111523673B (en) Model training method, device and system
CN111062487B (en) Machine learning model feature screening method and device based on data privacy protection
CN110942147B (en) Neural network model training and predicting method and device based on multi-party safety calculation
CN111061963B (en) Machine learning model training and predicting method and device based on multi-party safety calculation
CN111079939B (en) Machine learning model feature screening method and device based on data privacy protection
CN112052942B (en) Neural network model training method, device and system
CN111523556B (en) Model training method, device and system
CN112132270B (en) Neural network model training method, device and system based on privacy protection
CN111738438B (en) Method, device and system for training neural network model
CN110929887B (en) Logistic regression model training method, device and system
CN111523134B (en) Homomorphic encryption-based model training method, device and system
CN111523674B (en) Model training method, device and system
Zhang et al. PPNNP: A privacy-preserving neural network prediction with separated data providers using multi-client inner-product encryption
CN112183757B (en) Model training method, device and system
CN114186256A (en) Neural network model training method, device, equipment and storage medium
CN112183759B (en) Model training method, device and system
CN110874481B (en) GBDT model-based prediction method and GBDT model-based prediction device
CN111737756B XGB model prediction method, device and system performed by two data owners
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN111523675B (en) Model training method, device and system
CN111738453B (en) Business model training method, device and system based on sample weighting
CN112183565B (en) Model training method, device and system
CN112183566B (en) Model training method, device and system
CN111737337B (en) Multi-party data conversion method, device and system based on data privacy protection
CN112966809B (en) Privacy protection-based two-party model prediction method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40044586; Country of ref document: HK)
GR01 Patent grant