CN110032893B - Security model prediction method and device based on secret sharing - Google Patents


Info

Publication number
CN110032893B
Authority
CN
China
Prior art keywords
data
model
vector
prediction
random numbers
Prior art date
Legal status
Active
Application number
CN201910185759.4A
Other languages
Chinese (zh)
Other versions
CN110032893A (en)
Inventor
林文珍
殷山
Current Assignee
Ant Blockchain Technology Shanghai Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910185759.4A
Publication of CN110032893A
Priority to TW108133838A
Priority to PCT/CN2020/073818 (WO2020181933A1)
Application granted
Publication of CN110032893B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/602 Providing cryptographic facilities or services
    • G06F 21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G06F 21/645 Protecting data integrity using a third party
    • G06F 7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/58 Random or pseudo-random number generators
    • G06F 7/588 Random number generators, i.e. based on natural stochastic processes
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Abstract

The invention provides a secret sharing based security model prediction method, comprising the following steps: receiving a first set of random numbers from a third party; generating a shared computation prediction result using the first set of random numbers, a model coefficient vector, and a vector from a data provider; and performing model prediction using the shared computation prediction result. The invention protects each party's private data from leakage while ensuring the accuracy of the calculation.

Description

Security model prediction method and device based on secret sharing
Technical Field
The present invention relates generally to multi-party data collaboration, and more particularly to data security and model security in multi-party data collaboration.
Background
In fields such as data analysis, data mining, and economic forecasting, models can be used to analyze and discover potential data value. However, the data owned by the model party alone is often insufficient, making it difficult to characterize the prediction target accurately. To obtain better model prediction results, a model party and a data party typically cooperate, combining different data or feature labels to complete the model computation.
The multi-party data cooperation process raises issues of data security and model security. On one hand, the data party does not want to hand its valuable data to the model party and leak private data; on the other hand, information such as feature labels (also referred to as model coefficients) contained in the model is likewise the model party's private data and carries important commercial value, so model security must also be ensured during data collaboration.
In the prior art, there are three technical solutions for multi-party data collaboration. In the first, both the data party and the model party place their data and model with a trusted third party, which performs the model prediction. The disadvantage is that a fully trusted third party is difficult to realize, and transmitting the data and the model carries security risks. In the second, the model party homomorphically encrypts the model coefficients, the encrypted model is deployed to the data party, the data party performs model prediction with its private data, and the calculation result is returned to the model party. However, the computational limitations of homomorphic encryption restrict the types of computation supported, and homomorphic encryption is complex and time-consuming. The third solution uses SGX (Software Guard Extensions) hardware together with machine learning and cryptography, obfuscating the coefficients of the trained model with differential privacy techniques. However, the degree of obfuscation introduced by differential privacy is difficult to control, and for models that require exact results, the accuracy of the results suffers.
Therefore, in multi-party data cooperation, a secret sharing scheme is desired that protects the security of both data and models while yielding accurate calculation results.
Disclosure of Invention
To solve the above technical problem, the invention provides a secret sharing based security model prediction method, comprising the following steps:
receiving a first set of random numbers from a third party;
generating a shared computation predictor using the first set of random numbers, a model coefficient vector, and a vector from a data provider; and
performing model prediction using the shared computation prediction result.
Optionally, the generating the shared computation prediction result comprises:
generating an intermediate model vector using the model coefficient vector and the first set of random numbers;
sending the intermediate model vector to the data provider and receiving an intermediate data vector from the data provider;
generating an intermediate data value using the intermediate data vector and the first set of random numbers from the data provider;
receiving an intermediate model value from the data provider; and
generating the shared computation prediction result using the intermediate model value and the intermediate data value.
Optionally, the shared computation prediction result is the sum of the intermediate model value and the intermediate data value.
Optionally, the method further comprises:
generating a second shared computation predictor using the model coefficient vector and a locally stored additional data vector; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
Optionally, the method further comprises:
generating a second shared computational predictor using the first set of random numbers, the model coefficient vector, and a vector from a second data provider; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
Optionally, the model prediction uses a logistic regression model and/or a linear regression model.
The embodiment of the application further provides a security model prediction method based on secret sharing, which comprises the following steps:
receiving a second set of random numbers from a third party;
generating an intermediate data vector using the second set of random numbers and a data vector;
sending the intermediate data vector to a data demander and receiving an intermediate model vector from the data demander;
generating an intermediate data value using the intermediate model vector and the second set of random numbers; and
providing the intermediate data value to the data demander for model prediction.
Embodiments of the present application further provide an apparatus for secret sharing based security model prediction, comprising:
a receiving module configured to receive a first set of random numbers from a third party;
a prediction vector generation module configured to generate a shared computation prediction result using the first set of random numbers, a model coefficient vector, and a vector from a data provider; and
a model prediction module configured to perform model prediction using the shared computation prediction result.
Optionally, the receiving module is further configured to receive an intermediate data vector and an intermediate model value from the data provider;
the prediction vector generation module is further configured to:
generating an intermediate model vector using the model coefficient vector and the first set of random numbers;
generating an intermediate data value using an intermediate data vector and the first set of random numbers; and
generating the shared computation prediction result using the intermediate model value and the intermediate data value;
the apparatus further includes a transmission module configured to transmit the intermediate model vector to the data provider.
Optionally, the shared computation prediction result is the sum of the intermediate model value and the intermediate data value.
Optionally, the prediction vector generation module is further configured to:
generating a second shared computation predictor using the model coefficient vector and a locally stored additional data vector; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
Optionally, the prediction vector generation module is further configured to:
generating a second shared computational predictor using the first set of random numbers, the model coefficient vector, and a vector from a second data provider; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
Optionally, the model prediction uses a logistic regression model and/or a linear regression model.
Embodiments of the present application further provide an apparatus for secret sharing-based security model prediction, including:
a receiving module configured to receive a second set of random numbers from a third party and to receive an intermediate model vector from a data demander;
a prediction vector generation module configured to generate an intermediate data vector using the second set of random numbers and a data vector, and to generate an intermediate data value using the intermediate model vector and the second set of random numbers; and
a transfer module configured to send the intermediate data vector to the data demander and to provide the intermediate data value to the data demander for model prediction.
An embodiment of the present application further provides a security model prediction apparatus based on secret sharing, including:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
receiving a first set of random numbers from a third party;
generating a shared computation predictor using the first set of random numbers, a model coefficient vector, and a vector from a data provider; and
performing model prediction using the shared computation prediction result.
An embodiment of the present application further provides a security model prediction apparatus based on secret sharing, including:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
receiving a second set of random numbers from a third party;
generating an intermediate data vector using the second set of random numbers and a data vector;
sending the intermediate data vector to a data demander and receiving an intermediate model vector from the data demander;
generating an intermediate data value using the intermediate model vector and the second set of random numbers; and providing the intermediate data value to the data demander for model prediction.
The invention provides a secure, decentralized model prediction method with the following technical advantages:
1. Each party's data never leaves its own boundary: no trusted third party is needed to fuse the data, and no party's data needs to be deployed to or introduced into any other party in order to complete model prediction.
2. Combined with secret sharing, the data privacy of all cooperating parties is protected. Each party computes on split data: a partner never exposes its plaintext data to the other side, and each side computes only on the unrecognizable split values from its partner, yet the final calculation result is exactly accurate.
Drawings
FIG. 1 is an architecture diagram of a multi-party data collaboration system based on secret sharing, in accordance with aspects of the present invention.
FIG. 2 illustrates an example of data collaboration by a data consumer with a data provider in accordance with aspects of the invention.
FIG. 3 illustrates an example of data collaboration by one data consumer with two data providers, in accordance with aspects of the invention.
FIG. 4 illustrates a secret sharing based data collaboration method performed by a data consumer, in accordance with aspects of the present invention.
FIG. 5 illustrates a secret sharing based data collaboration method performed by a data consumer, in accordance with aspects of the present invention.
Fig. 6 illustrates an example method of secret sharing based data collaboration performed by a data provider, in accordance with aspects of the present invention.
FIG. 7 is a block diagram of a data requestor according to aspects of the present invention.
Fig. 8 is a block diagram of a data provider in accordance with aspects of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described herein, and thus the present invention is not limited to the specific embodiments disclosed below.
FIG. 1 is an architecture diagram of a multi-party data collaboration system based on secret sharing, in accordance with aspects of the present invention.
As shown in fig. 1, the multi-party data collaboration system based on secret sharing of the present invention includes a data demand party (also referred to as a model party), a data supply party (also referred to as a data party), and a third party (a fair third party, for example, a fair judicial agency or government agency, etc.).
The data demand side possesses a model with a coefficient vector W = {ω1, ω2, ……, ωn}, and the data provider side possesses a data vector X = {x1, x2, ……, xn}. The third party generates a series of random numbers and distributes them to the data provider and the data demander. The data demander computes using the model coefficients and the random numbers distributed to it; the data provider computes using its own data and the random numbers distributed to it; the two sides exchange their calculation results for further processing; and the results are then aggregated to obtain the model prediction result.
The technical solution of the present invention is illustrated below by several specific embodiments.
Example one
Referring to FIG. 2, one embodiment of a data demander in data collaboration with a data provider is illustrated, in accordance with aspects of the present invention.
In step 201, a third party generates random number sets R1 and R2.
For example, R1 = {a, c0} and R2 = {b, c1}, where a and b are random number vectors, c0 and c1 are random numbers, c = a × b, and c = c0 + c1. Here a × b is a vector multiplication.
At step 202, the third party sends the random number sets R1 and R2 to the data demander and the data provider, respectively.
In step 203, the data consumer computes an intermediate model vector e using the random number set R1 and the model coefficient vector W = {ω1, ω2, ……, ωn}. For example, e = W - a.
At step 204, the data provider computes an intermediate data vector f using the random number set R2 and the data vector X = {x1, x2, ……, xn}. For example, f = X - b.
In steps 205 and 206, the data demander and the data provider exchange the results calculated in steps 203 and 204.
Specifically, the data consumer may send the calculation result e to the data provider at step 205, and the data provider sends the calculation result f to the data consumer at step 206.
Note that although step 205 precedes step 206 in fig. 2, their order may be swapped, or they may be performed simultaneously.
In step 207, the data consumer computes an intermediate data value z0 using the random number set R1 and the intermediate data vector f provided by the data provider in step 206. For example, z0 = a × f + c0, where a × f is a vector multiplication.
At step 208, the data provider computes an intermediate model value z1 using the random number set R2 and the intermediate model vector e provided by the data consumer in step 205. For example, z1 = e × X + c1, where e × X is a vector multiplication.
At step 209, the data provider sends z1 to the data demander.
At step 210, the data demander aggregates z0 and z1 to obtain the product of the model coefficients and the data, W × X, which is also referred to herein as the shared computation prediction result.
For example, z = z0 + z1 = a × f + c0 + e × X + c1
= a × (X - b) + (W - a) × X + c
= a × X - a × b + W × X - a × X + a × b
= W × X
In step 211, model prediction is performed using the shared computational prediction results obtained in step 210.
For example, for a Logistic Regression model, the following is computed:
y = 1 / (1 + e^-(ω×x+λ))
where ω and λ are model coefficients provided by the model party, and x is the input required for the calculation, i.e., private data belonging to the data provider.
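As a minimal illustration, the whole of Example 1 can be simulated in a single process. This is a sketch, not the patent's implementation: the function name, the toy random-number ranges, and the use of integer vectors are our assumptions; the step comments map back to fig. 2.

```python
import numpy as np

def secure_dot(W, X, seed=None):
    """Single-process sketch of the Example 1 protocol: the consumer's
    coefficient vector W and the provider's data vector X are never
    revealed to the other side; only masked values are exchanged."""
    rng = np.random.default_rng(seed)
    n = len(W)

    # Step 201: the third party generates the random number sets.
    a = rng.integers(-100, 100, n)   # random vector for the consumer (R1)
    b = rng.integers(-100, 100, n)   # random vector for the provider (R2)
    c = int(a @ b)                   # c = a x b (vector multiplication)
    c0 = int(rng.integers(-100, 100))
    c1 = c - c0                      # split c = c0 + c1
    # Step 202: R1 = {a, c0} goes to the consumer, R2 = {b, c1} to the provider.

    # Steps 203-204: each side masks its secret with its random vector.
    e = W - a                        # consumer: intermediate model vector
    f = X - b                        # provider: intermediate data vector
    # Steps 205-206: e and f are exchanged between the two sides.

    # Steps 207-208: intermediate values on each side.
    z0 = int(a @ f) + c0             # consumer: intermediate data value
    z1 = int(e @ X) + c1             # provider: intermediate model value

    # Steps 209-210: the provider sends z1; the consumer aggregates.
    return z0 + z1                   # equals W x X exactly
```

Because the masks cancel algebraically (as shown in the derivation above), the result is exact for any choice of random numbers, with no loss of accuracy.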
Example two
In the embodiment illustrated in FIG. 2, the data consumer provides only model information. In some cases, the data consumer possesses both model information W and data information X'.
In this case, steps 201-209 are the same as in the embodiment illustrated in fig. 2 and are not repeated here. Only the differences from the process of fig. 2 are described below.
At step 210, the data consumer calculates an additional intermediate data value z0'. For example, z0' = W × X'.
In step 211, the data demander aggregates z0, z1, and z0' to obtain a shared computation prediction result:
z = z0 + z1 + z0' = W × X + W × X'.
At step 212, model prediction is performed using W × X + W × X'.
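The Example 2 aggregation can be sketched as follows (the function name is illustrative): the extra term W × X' involves only the consumer's own values, so it needs no interaction and no random numbers.

```python
import numpy as np

def aggregate_with_local_data(z0, z1, W, X_prime):
    """Example 2: add the consumer's purely local contribution W x X'
    to the two shared values obtained from the Example 1 protocol."""
    z0_prime = int(np.dot(W, X_prime))   # W x X': computed entirely locally
    return z0 + z1 + z0_prime            # = W x X + W x X'
```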
EXAMPLE III
The above illustrates an embodiment in which a data consumer collaborates with a data provider. In some cases, a data consumer may need data from multiple data providers in model prediction, whereby the data consumer needs to collaborate on data with multiple data providers. Fig. 3 illustrates an example of data collaboration by one data consumer with two data providers (data provider 1 and data provider 2).
In this embodiment, the data consumer has models WA = {ωA1, ωA2, ……, ωAn} and WB = {ωB1, ωB2, ……, ωBn}; data provider 1 has data XA = {xA1, xA2, ……, xAn}; and data provider 2 has data XB = {xB1, xB2, ……, xBn}. The shared computation prediction results in model prediction are WA × XA and WB × XB.
In step 301, a third party generates a first set of random numbers {R1, R2} and a second set of random numbers {R1', R2'}, where the first set is used for data collaboration between the data consumer and data provider 1, and the second set is used for data collaboration between the data consumer and data provider 2.
In particular, R1 = {a, c0} and R2 = {b, c1}, where c = a × b and c = c0 + c1; R1' = {a', c0'} and R2' = {b', c1'}, where a, b and a', b' are random number vectors, c0, c1 and c0', c1' are random numbers, c' = a' × b', and c' = c0' + c1'. Note that a × b and a' × b' are vector multiplications.
In step 302, the third party provides the random number sets R1 and R1' to the data consumer, R2 to data provider 1, and R2' to data provider 2.
In step 303, the data demander calculates e and e'.
Specifically, e = WA - a and e' = WB - a'.
In steps 304 and 305, data provider 1 and data provider 2 compute f = XA - b and f' = XB - b', respectively.
In steps 306-309, the data consumer exchanges calculation results with data provider 1 and data provider 2.
Specifically, the data consumer transmits the calculation result e to the data provider 1 at step 306, and transmits the calculation result e' to the data provider 2 at step 307.
The data provider 1 transmits the calculation result f to the data demander in step 308, and transmits the calculation result f' to the data demander in step 309.
Note that steps 306-309 are shown in FIG. 3 in a specific order, but the order of these steps may be switched, or they may be performed simultaneously.
At step 310, the data consumer computes the first intermediate data value z0 using the random number set R1 and the calculation result f provided by data provider 1 in step 308. For example, z0 = a × f + c0.
The data demander also computes the second intermediate data value z0' using the random number set R1' and the calculation result f' provided by data provider 2 in step 309. For example, z0' = a' × f' + c0'.
In step 311, data provider 1 computes a first intermediate model value z1 using the random number set R2 and the calculation result e provided by the data consumer in step 306. For example, z1 = e × XA + c1.
At step 312, data provider 2 computes a second intermediate model value z1' using the random number set R2' and the calculation result e' provided by the data consumer in step 307. For example, z1' = e' × XB + c1'.
At steps 313 and 314, data provider 1 sends z1 to the data demander and data provider 2 sends z1' to the data demander.
In step 315, the data demander aggregates z0 and z1 to obtain the product of model coefficients and data, WA × XA, and aggregates z0' and z1' to obtain the product of model coefficients and data, WB × XB.
For example, z = z0 + z1 = a × f + c0 + e × XA + c1
= a × (XA - b) + (WA - a) × XA + c
= a × XA - a × b + WA × XA - a × XA + a × b
= WA × XA
z' = z0' + z1' = a' × f' + c0' + e' × XB + c1'
= a' × (XB - b') + (WB - a') × XB + c'
= a' × XB - a' × b' + WB × XB - a' × XB + a' × b'
= WB × XB
At step 316, model prediction is made using the results obtained in step 315 (also referred to as shared computation prediction results).
In one embodiment, the models WA and WB may be the same; in other words, the data consumer uses one model W = WA = WB together with data from two data providers for model prediction.
Note that the process of data collaboration between one data requestor and two data providers is depicted in fig. 3 in a particular order, but other orders of steps are possible. The steps of data collaboration between the data demander and the data provider 1 and the steps of data collaboration between the data demander and the data provider 2 are independent and can be done at different times, respectively. For example, the step of data collaboration between the data consumer and the data provider 1 may be done before or after the data collaboration between the data consumer and the data provider 2, or some steps in the two processes may be temporally interleaved. And some steps may be split, e.g. the calculations e and e' in step 303 may be performed separately.
While data collaboration between one data consumer and two data providers is illustrated above, the process is also applicable to data collaboration between one data consumer and more than two data providers, which operates similarly to the process illustrated in fig. 3.
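The multi-provider flow of Example 3 is simply the two-party protocol repeated with independent random numbers for each provider. A compact sketch (function name and random-number ranges are ours; the per-pair logic follows the steps of fig. 3):

```python
import numpy as np

def multi_provider_predict(models, datasets, seed=None):
    """Example 3 sketch: run the secret-shared multiplication once per
    (model, provider) pair with fresh random numbers, then sum W_i x X_i."""
    rng = np.random.default_rng(seed)
    total = 0
    for W, X in zip(models, datasets):
        n = len(W)
        a = rng.integers(-100, 100, n)    # fresh random vectors per provider
        b = rng.integers(-100, 100, n)
        c0 = int(rng.integers(-100, 100))
        c1 = int(a @ b) - c0              # c = a x b = c0 + c1
        e, f = W - a, X - b               # exchanged masked vectors (steps 303-309)
        z0 = int(a @ f) + c0              # consumer-side intermediate value
        z1 = int(e @ X) + c1              # provider-side intermediate value
        total += z0 + z1                  # = W x X for this provider
    return total
```

Using independent random numbers per provider is what keeps the two collaborations independent, so their steps may be interleaved or run at different times, as the text notes.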
It should be noted that although the present invention is illustrated with a logistic regression model, other models can also be applied, such as a linear regression model y = ω × x + e, and so on. Further, two specific random number generation methods are described above, but other random number generation methods also fall within the scope of the present invention, and those skilled in the art can devise suitable methods according to actual needs.
FIG. 4 illustrates one example of a secret sharing based data collaboration method performed by a data consumer in accordance with aspects of the present invention.
Referring to fig. 4, in step 401, a first set of random numbers from a third party is received.
This step may correspond to steps 201, 202 described above with reference to fig. 2, and/or steps 301, 302 described with reference to fig. 3.
At step 402, a shared computation predictor is generated using the first set of random numbers, a model coefficient vector, and a vector from a data provider.
This step may correspond to steps 203-210 described above with reference to fig. 2.
At step 403, model prediction is performed using the shared computational prediction results.
This step may correspond to step 211 described above with reference to fig. 2, and/or step 303-316 described with reference to fig. 3.
FIG. 5 illustrates one example of a secret sharing based data collaboration method performed by a data consumer in accordance with aspects of the present invention.
Referring to FIG. 5, at step 501, a first set of random numbers R1 is received from a third party.
In particular, the third party may generate a set of random numbers R = {a, b, c0, c1}, where c = a × b and c = c0 + c1. The first set of random numbers R1 = {a, c0} is received by the data consumer, and R2 = {b, c1} is provided to the data provider.
In another example, the third party may generate a set of random numbers R = {a, b, c0, c1} with a = a0 + a1 and b = b0 + b1, where the first set of random numbers R1 = {a, c0} may be received by the data consumer and R2 = {b, c1} may be provided to the data provider.
In step 502, an intermediate model vector e is generated using the model coefficient vector W and the first set of random numbers R1. For example, e = W - a.
At step 503, the intermediate model vector e is sent to the data provider and the intermediate data vector f is received from the data provider.
In step 504, an intermediate data value z0 is generated using the intermediate data vector f and the first set of random numbers R1.
At step 505, an intermediate model value z1 is received from a data provider.
At step 506, the shared computation prediction result is generated using the intermediate model value z1 and the intermediate data value z0.
In step 507, model prediction is performed using the shared computational prediction results.
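The consumer-side steps of FIG. 5 can be factored into two hypothetical helper functions, one per communication round (names are illustrative, not from the patent):

```python
import numpy as np

def consumer_round(W, a):
    """Step 502: mask the model coefficient vector with the random vector a."""
    return W - a                      # intermediate model vector e, sent out

def consumer_finalize(a, f, c0, z1):
    """Steps 504-506: form z0 from the provider's f, then aggregate with z1."""
    z0 = int(np.dot(a, f)) + c0       # intermediate data value (step 504)
    return z0 + z1                    # shared computation prediction result
```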
Fig. 6 illustrates an example method of secret sharing based data collaboration performed by a data provider, in accordance with aspects of the present invention.
In step 601, a second set of random numbers R2 is received from a third party.
At step 602, an intermediate data vector f is generated using the second set of random numbers R2 and the data vector X.
At step 603, the intermediate data vector f is sent to the data consumer and the intermediate model vector e is received from the data consumer.
In step 604, an intermediate data value z1 is generated using the intermediate model vector e and the second set of random numbers R2.
At step 605, the intermediate data value z1 is provided to the data consumer for model prediction.
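The provider-side steps of FIG. 6 can be sketched with two hypothetical helpers (names are ours, not the patent's):

```python
import numpy as np

def provider_mask(X, b):
    """Step 602: mask the private data vector with the random vector b."""
    return X - b                      # intermediate data vector f, sent out

def provider_respond(e, X, c1):
    """Step 604: combine the consumer's e with the local data X and the
    random share c1; the result z1 is sent back (step 605)."""
    return int(np.dot(e, X)) + c1     # the value the consumer aggregates
```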
FIG. 7 illustrates a block diagram of a data requestor in accordance with aspects of the invention.
Specifically, the data demander (the model side) may include a receiving module 701, a prediction vector generation module 702, a model prediction module 703, a transmission module 704, and a memory 705. Wherein the memory 705 stores model coefficients.
The receiving module 701 may be configured to receive a first set of random numbers from a third party, and to receive an intermediate data vector and/or an intermediate model value from the data provider.
The prediction vector generation module 702 may be configured to generate a shared computation prediction result using the first set of random numbers, the model coefficient vector, and a vector from a data provider.
In particular, the prediction vector generation module 702 may be configured to generate an intermediate model vector using the model coefficient vector and a first set of random numbers; generating an intermediate data value using the intermediate data vector and the first set of random numbers; and generating a shared computation prediction result using the intermediate model value and the intermediate data value.
The prediction vector generation module 702 may be further configured to generate an intermediate model vector using the model coefficient vector and the first set of random numbers, and to generate the shared computation prediction result using the intermediate data vector from the data provider and the intermediate model vector.
The model prediction module 703 may be configured to perform model prediction using shared computational predictions.
The transmission module 704 may be configured to transmit the intermediate model vector to the data provider.
Fig. 8 illustrates a block diagram of a data provider in accordance with aspects of the invention.
Specifically, the data provider may include: a receiving module 801, a prediction vector generation module 802, a transmission module 803, and a memory 804. The memory 804 may store private data.
The receiving module 801 may be configured to receive a second set of random numbers from a third party and to receive an intermediate model vector from a data demander.
The prediction vector generation module 802 may be configured to generate an intermediate data vector using the second set of random numbers and a data vector, and to generate an intermediate model value using the intermediate model vector and the second set of random numbers.
The transmitting module 803 may be configured to send the intermediate data vector to the data demander and to provide the intermediate model value to the data demander for model prediction.
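To make the interaction among the modules of Figs. 7 and 8 concrete, the following Python sketch shows one possible instantiation of the protocol using Beaver-triple-style correlated randomness. All names (`secure_dot`, `third_party_setup`), the modulus, and the choice of a sum as the final aggregation step are illustrative assumptions, not the patent's definitive construction:

```python
import secrets

MOD = 2**61 - 1  # illustrative prime modulus; not specified by the patent


def rand_vec(n):
    """Uniform random vector over Z_MOD."""
    return [secrets.randbelow(MOD) for _ in range(n)]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v)) % MOD


def third_party_setup(n):
    """The neutral third party distributes correlated randomness:
    (a, c) is the 'first set of random numbers' for the data demander,
    b is the 'second set' for the data provider, with c = a.b."""
    a = rand_vec(n)
    b = rand_vec(n)
    c = dot(a, b)
    return (a, c), b


def secure_dot(w, x):
    """Demander holds model coefficients w; provider holds data x.
    Returns w.x (mod MOD) with neither plaintext vector exchanged."""
    (a, c), b = third_party_setup(len(w))
    # Demander -> provider: intermediate model vector (w masked by a)
    e = [(wi - ai) % MOD for wi, ai in zip(w, a)]
    # Provider -> demander: intermediate data vector (x masked by b)
    f = [(xi - bi) % MOD for xi, bi in zip(x, b)]
    # Provider -> demander: intermediate model value
    model_value = dot(e, b)
    # Demander, locally: intermediate data value
    data_value = (dot(e, f) + dot(a, f) + c) % MOD
    # Shared computation prediction result:
    # e.b + e.f + a.f + a.b = (e + a).(f + b) = w.x (mod MOD)
    return (model_value + data_value) % MOD
```

In this sketch the masked vectors e and f look uniformly random to their recipients, which is what keeps w and x private; only the data demander learns the final prediction score.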
Compared with the prior art, the invention has the following advantages:
1) The private data of each party is protected from leakage. Data held by a party never leaves that party's own computation boundary; the parties complete the computation by exchanging only locally masked (encrypted) intermediate values. Although a neutral third party participates, it merely distributes random numbers and takes no part in the computation itself.
2) Low integration cost. As a pure software scheme, it requires no extra hardware beyond basic servers, introduces no additional hardware security vulnerabilities, and the computation can be completed online.
3) The computation is completely lossless; the accuracy of the result is unaffected.
4) The algorithm itself is unrestricted. Results are returned in real time; the four arithmetic operations (addition, subtraction, multiplication, and division) are all supported, as are mixed computations combining them.
5) The secret-sharing secure multi-party computation algorithm obtains the final result through intermediate splitting, conversion, and result aggregation, without retaining keys or other secret material. Provided the third party distributing the random numbers is honest, the intermediate values produced during the computation cannot be used to deduce the original plaintext.
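Advantages 1) and 5) rest on one-time-pad-style masking: subtracting a uniformly random field element yields a value that is itself uniformly distributed, so a transmitted intermediate vector reveals nothing about the plaintext. A minimal sketch of this property (the modulus and function names are illustrative assumptions, not taken from the patent):

```python
import secrets

MOD = 2**61 - 1  # illustrative prime modulus


def mask(vec, rand):
    # The masked vector is uniform regardless of vec, so the
    # receiving party learns nothing about the plaintext.
    return [(v - r) % MOD for v, r in zip(vec, rand)]


def unmask(masked, rand):
    # Only a party holding the same random set can invert the mask.
    return [(m + r) % MOD for m, r in zip(masked, rand)]


data = [42, 7, 13]                              # a party's private vector
rand = [secrets.randbelow(MOD) for _ in data]   # from the third party
assert unmask(mask(data, rand), rand) == data   # masking is lossless
```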
The illustrations set forth herein in connection with the figures describe example configurations and are not intended to represent all examples that may be implemented or fall within the scope of the claims. The term "exemplary" as used herein means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous over other examples. The detailed description includes specific details to provide an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described examples.
In the drawings, similar components or features may have the same reference numerals. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.
The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope of the disclosure and the following claims. For example, due to the nature of software, the functions described above may be implemented using software executed by a processor, hardware, firmware, hard-wired, or any combination thereof. Features that implement functions may also be physically located at various locations, including being distributed such that portions of functions are implemented at different physical locations. In addition, as used herein, including in the claims, "or" as used in a list of items (e.g., a list of items accompanied by a phrase such as "at least one of" or "one or more of") indicates an inclusive list, such that, for example, a list of at least one of A, B or C means a or B or C or AB or AC or BC or ABC (i.e., a and B and C). Also, as used herein, the phrase "based on" should not be read as referring to a closed condition set. For example, an exemplary step described as "based on condition a" may be based on both condition a and condition B without departing from the scope of the present disclosure. In other words, the phrase "based on," as used herein, should be interpreted in the same manner as the phrase "based, at least in part, on.
Computer-readable media includes both non-transitory computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. Non-transitory storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, non-transitory computer-readable media can comprise RAM, ROM, electrically erasable programmable read-only memory (EEPROM), Compact Disk (CD) ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other non-transitory medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk (disk) and disc (disc), as used herein, includes CD, laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.
The description herein is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A secret sharing-based security model prediction method comprises the following steps:
receiving a first set of random numbers from a third party;
generating an intermediate model vector using the first set of random numbers and a model coefficient vector;
generating an intermediate data value based on an intermediate data vector from a data provider and the first set of random numbers; wherein the intermediate data vector is generated by the data provider based on a data vector and a second set of random numbers from the third party;
generating a shared computation prediction result based on the intermediate data value and an intermediate model value from the data provider; wherein the intermediate model value is generated by the data provider based on the intermediate model vector and the second set of random numbers; and
performing model prediction using the shared computation prediction result.
2. The method of claim 1, wherein the generating intermediate data values based on the intermediate data vector from the data provider and the first set of random numbers comprises:
receiving an intermediate data vector from a data provider;
generating an intermediate data value using the intermediate data vector and the first set of random numbers;
the generating a shared computation prediction result based on the intermediate data value and an intermediate model value from the data provider, comprising:
sending the intermediate model vector to the data provider;
receiving an intermediate model value from the data provider generated based on the intermediate model vector and the second set of random numbers; and
generating the shared computation prediction result using the intermediate model value and the intermediate data value, wherein the shared computation prediction result is a product of the intermediate model value and the intermediate data value.
3. The method of claim 2, further comprising:
generating a second shared computation prediction result using the model coefficient vector and a locally stored additional data vector; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
4. The method of claim 2, further comprising:
generating a second shared computation prediction result using the first set of random numbers, the model coefficient vector, and a vector from a second data provider; and
performing model prediction using the shared computation prediction result and the second shared computation prediction result.
5. The method of claim 1, wherein the model prediction uses a logistic regression model and/or a linear regression model.
6. A secret sharing-based security model prediction method comprises the following steps:
receiving a second set of random numbers from a third party;
generating an intermediate data vector using the second set of random numbers and a data vector;
sending the intermediate data vector to a data demander and receiving an intermediate model vector from the data demander; wherein the intermediate model vector is generated by the data demander using its model coefficient vector and a first set of random numbers from the third party;
generating an intermediate model value using the intermediate model vector and the second set of random numbers; and
providing the intermediate model value to the data demander for model prediction, wherein the data demander generates an intermediate data value based on the intermediate data vector and the first set of random numbers and performs model prediction based on a summation of the intermediate data value and the intermediate model value.
7. An apparatus for secret sharing based security model prediction, comprising:
a receiving module configured to receive a first set of random numbers from a third party;
a prediction vector generation module configured to generate an intermediate model vector using the first set of random numbers and a model coefficient vector; generate an intermediate data value based on an intermediate data vector from a data provider and the first set of random numbers, wherein the intermediate data vector is generated by the data provider based on a data vector and a second set of random numbers from the third party; and generate a shared computation prediction result based on the intermediate data value and an intermediate model value from the data provider, wherein the intermediate model value is generated by the data provider based on the intermediate model vector and the second set of random numbers;
a model prediction module configured to perform model prediction using the shared computation prediction result.
8. The apparatus of claim 7, wherein the receiving module is further configured to receive an intermediate data vector from a data provider, and to receive an intermediate model value from the data provider generated based on the intermediate model vector and the second set of random numbers;
the prediction vector generation module is further configured to:
generating an intermediate data value using the intermediate data vector and the first set of random numbers; and
generating the shared computation prediction result using the intermediate model value and the intermediate data value, wherein the shared computation prediction result is a product of the intermediate model value and the intermediate data value;
the apparatus further includes a transmission module configured to transmit the intermediate model vector to the data provider.
9. The apparatus of claim 7, wherein the prediction vector generation module is further configured to:
generate a second shared computation prediction result using the model coefficient vector and a locally stored additional data vector; and
perform model prediction using the shared computation prediction result and the second shared computation prediction result.
10. The apparatus of claim 7, wherein the prediction vector generation module is further configured to:
generate a second shared computation prediction result using the first set of random numbers, the model coefficient vector, and a vector from a second data provider; and
perform model prediction using the shared computation prediction result and the second shared computation prediction result.
11. The apparatus of claim 7, wherein the model prediction uses a logistic regression model and/or a linear regression model.
12. An apparatus for secret sharing based security model prediction, comprising:
a receiving module configured to receive a second set of random numbers from a third party and to receive an intermediate model vector from a data demander; wherein the intermediate model vector is generated by the data demander using its model coefficient vector and a first set of random numbers from the third party;
a prediction vector generation module configured to generate an intermediate data vector using the second set of random numbers and a data vector, and to generate an intermediate model value using the intermediate model vector and the second set of random numbers; and
a transmitting module configured to transmit the intermediate data vector to the data demander and provide the intermediate model value to the data demander for model prediction, wherein the data demander generates an intermediate data value based on the intermediate data vector and the first set of random numbers and performs model prediction based on a summation of the intermediate data value and the intermediate model value.
13. A secret sharing based security model prediction apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform operations comprising:
receiving a first set of random numbers from a third party;
generating an intermediate model vector using the first set of random numbers and a model coefficient vector;
generating an intermediate data value based on an intermediate data vector from a data provider and the first set of random numbers; wherein the intermediate data vector is generated by the data provider based on a data vector and a second set of random numbers from the third party;
generating a shared computation prediction result based on the intermediate data value and an intermediate model value from the data provider; wherein the intermediate model value is generated by the data provider based on the intermediate model vector and the second set of random numbers; and
performing model prediction using the shared computation prediction result.
14. A secret sharing based security model prediction apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform operations comprising:
receiving a second set of random numbers from a third party;
generating an intermediate data vector using the second set of random numbers and a data vector;
sending the intermediate data vector to a data demander and receiving an intermediate model vector from the data demander; wherein the intermediate model vector is generated by the data demander using its model coefficient vector and a first set of random numbers from the third party;
generating an intermediate model value using the intermediate model vector and the second set of random numbers; and
providing the intermediate model value to the data demander for model prediction, wherein the data demander generates an intermediate data value based on the intermediate data vector and the first set of random numbers and performs model prediction based on a summation of the intermediate data value and the intermediate model value.
CN201910185759.4A 2019-03-12 2019-03-12 Security model prediction method and device based on secret sharing Active CN110032893B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201910185759.4A CN110032893B (en) 2019-03-12 2019-03-12 Security model prediction method and device based on secret sharing
TW108133838A TWI720622B (en) 2019-03-12 2019-09-19 Security model prediction method and device based on secret sharing
PCT/CN2020/073818 WO2020181933A1 (en) 2019-03-12 2020-01-22 Secure model prediction method and device employing secret sharing technique

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910185759.4A CN110032893B (en) 2019-03-12 2019-03-12 Security model prediction method and device based on secret sharing

Publications (2)

Publication Number Publication Date
CN110032893A CN110032893A (en) 2019-07-19
CN110032893B true CN110032893B (en) 2021-09-28

Family

ID=67235931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910185759.4A Active CN110032893B (en) 2019-03-12 2019-03-12 Security model prediction method and device based on secret sharing

Country Status (3)

Country Link
CN (1) CN110032893B (en)
TW (1) TWI720622B (en)
WO (1) WO2020181933A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032893B (en) * 2019-03-12 2021-09-28 创新先进技术有限公司 Security model prediction method and device based on secret sharing
CN110569227B (en) * 2019-08-09 2020-08-14 阿里巴巴集团控股有限公司 Model parameter determination method and device and electronic equipment
CN110580410B (en) * 2019-08-09 2023-07-28 创新先进技术有限公司 Model parameter determining method and device and electronic equipment
CN110955907B (en) * 2019-12-13 2022-03-25 支付宝(杭州)信息技术有限公司 Model training method based on federal learning
CN111030811B (en) * 2019-12-13 2022-04-22 支付宝(杭州)信息技术有限公司 Data processing method
CN112507323A (en) * 2021-02-01 2021-03-16 支付宝(杭州)信息技术有限公司 Model training method and device based on unidirectional network and computing equipment
TWI824927B (en) * 2023-01-17 2023-12-01 中華電信股份有限公司 Data synthesis system with differential privacy protection, method and computer readable medium thereof

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9998434B2 (en) * 2015-01-26 2018-06-12 Listat Ltd. Secure dynamic communication network and protocol
US10491373B2 (en) * 2017-06-12 2019-11-26 Microsoft Technology Licensing, Llc Homomorphic data analysis
CN107623729B (en) * 2017-09-08 2021-01-15 华为技术有限公司 Caching method, caching equipment and caching service system
CN108400981B (en) * 2018-02-08 2021-02-12 江苏谷德运维信息技术有限公司 Public cloud auditing system and method for lightweight and privacy protection in smart city
CN108683669B (en) * 2018-05-19 2021-09-17 深圳市图灵奇点智能科技有限公司 Data verification method and secure multi-party computing system
CN109033854B (en) * 2018-07-17 2020-06-09 阿里巴巴集团控股有限公司 Model-based prediction method and device
CN109409125B (en) * 2018-10-12 2022-05-31 南京邮电大学 Data acquisition and regression analysis method for providing privacy protection
CN110032893B (en) * 2019-03-12 2021-09-28 创新先进技术有限公司 Security model prediction method and device based on secret sharing

Also Published As

Publication number Publication date
CN110032893A (en) 2019-07-19
TW202044082A (en) 2020-12-01
TWI720622B (en) 2021-03-01
WO2020181933A1 (en) 2020-09-17

Similar Documents

Publication Publication Date Title
CN110032893B (en) Security model prediction method and device based on secret sharing
Pibernik et al. Secure collaborative supply chain planning and inverse optimization–The JELS model
WO2020015478A1 (en) Model-based prediction method and device
US20230087864A1 (en) Secure multi-party computation method and apparatus, device, and storage medium
CN112989368B (en) Method and device for processing private data by combining multiple parties
CN110944011B (en) Joint prediction method and system based on tree model
US8533487B2 (en) Secure logical vector clocks
CN112199709A (en) Multi-party based privacy data joint training model method and device
US20200372394A1 (en) Machine learning with differently masked data in secure multi-party computing
CN113239391B (en) Third-party-free logistic regression federal learning model training system and method
CN112737772B (en) Security statistical method, terminal device and system for private set intersection data
CN116204909B (en) Vector element mapping method, electronic device and computer readable storage medium
CN113609781A (en) Automobile production mold optimization method, system, equipment and medium based on federal learning
CN112507372B (en) Method and device for realizing privacy protection of multi-party collaborative update model
Moon et al. An Efficient Encrypted Floating‐Point Representation Using HEAAN and TFHE
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN117521102A (en) Model training method and device based on federal learning
Almutairi et al. Secure Third Party Data Clustering Using Data: Multi-User Order Preserving Encryption and Super Secure Chain Distance Matrices (Best Technical Paper)
CN113051586A (en) Federal modeling system and method, and federal model prediction method, medium, and device
CN114462626B (en) Federal model training method and device, terminal equipment and storage medium
CN114880693B (en) Method and device for generating activation function, electronic equipment and readable medium
CN112183759A (en) Model training method, device and system
EP3364397B1 (en) Secret authentication code adding device, secret authentification code adding method, and program
CN115130568A (en) Longitudinal federated Softmax regression method and system supporting multiple parties
Sun et al. Outsourced privacy preserving SVM with multiple keys

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200922

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200922

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220124

Address after: Room 803, floor 8, No. 618 Wai Road, Huangpu District, Shanghai 200010

Patentee after: Ant blockchain Technology (Shanghai) Co.,Ltd.

Address before: KY1-9008 Business Centre, 27 Hospital Road, George Town, Grand Cayman, UK

Patentee before: Innovative advanced technology Co.,Ltd.