CN115842627A - Decision tree evaluation method, device, equipment and medium based on secure multi-party computation - Google Patents

Decision tree evaluation method, device, equipment and medium based on secure multi-party computation

Info

Publication number
CN115842627A
CN115842627A
Authority
CN
China
Prior art keywords
decision, node, secret, participant, decision tree
Prior art date
Legal status
Pending
Application number
CN202211533464.XA
Other languages
Chinese (zh)
Inventor
张翰林
张志祥
Current Assignee
Qingdao University
Original Assignee
Qingdao University
Application filed by Qingdao University
Priority to CN202211533464.XA
Publication of CN115842627A

Landscapes

  • Storage Device Security (AREA)

Abstract

The application discloses a decision tree evaluation method, apparatus, device and medium based on secure multi-party computation, and relates to the technical field of machine learning. The method comprises the following steps: obtaining secret values shared by a client and a model provider, and dividing the secret values into a preset number of shares by a secret sharing technique so as to determine the secret share corresponding to each participant; determining the feature attribute corresponding to each decision node of a participant according to a mapping matrix and a feature vector, so as to obtain the comparison result of each decision node from the feature attribute; and applying a linear transformation to the comparison results and computing the dot product of the transformed comparison results with a traversal matrix, so as to determine the evaluation result from the dot-product result and the label vector carried by the leaf nodes. This technical scheme reduces communication cost while improving the security of the private data.

Description

Decision tree evaluation method, device, equipment and medium based on secure multi-party computation
Technical Field
The invention relates to the technical field of machine learning, and in particular to a decision tree evaluation method, apparatus, device and medium based on secure multi-party computation.
Background
Machine learning algorithms are widely used to solve various classification and prediction problems, and participants need to exchange private data while the algorithms execute. Once leaked, such private data may not only harm the interests of the data holder but may also violate relevant laws, such as the Personal Information Protection Law. It is therefore important to execute machine learning algorithms securely while guaranteeing the confidentiality of private data.
Secure multi-party computation has been extensively studied in recent years. This technique allows multiple participants to jointly complete a computation without revealing the participants' inputs; after the computation is completed, the result may be disclosed to all participants or only to designated ones. Secure multi-party computation is recognized as one of the most important technical routes to privacy-preserving machine learning. Existing research falls into two classes: one studies general protocols that bring privacy protection to many machine learning algorithms; the other studies dedicated protocols for a particular machine learning algorithm. Among current dedicated protocols for the decision tree evaluation algorithm, most existing schemes hide the structure information of the decision tree by adding dummy nodes that convert it into a full binary tree. However, because dummy nodes require the same computation and communication cost as real nodes and cannot be distinguished by the participants, the time and communication complexity of these methods are independent of the number of real nodes and instead grow exponentially with the depth of the decision tree, which is highly impractical for deep, sparse decision tree models.
In summary, for dedicated protocols for the decision tree evaluation algorithm, how to reduce communication cost while improving the practicability and security of the protocol is a problem that currently needs to be solved.
Disclosure of Invention
In view of the above, the present invention provides a decision tree evaluation method, apparatus, device and medium based on secure multi-party computation, which reduce communication cost while improving the practicability and security of the dedicated protocol for the decision tree evaluation algorithm. The specific scheme is as follows:
in a first aspect, the present application discloses a decision tree evaluation method based on secure multiparty computation, comprising:
obtaining secret values shared by a client and a model provider, and dividing the secret values into a preset number of shares by a secret sharing technology so as to determine secret shares corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client side provides a feature vector, and the model provider side provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix;
determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share, so as to obtain the comparison result of the decision node through the characteristic attribute;
and performing linear transformation on the comparison result, and performing dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant in the secret share so as to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
Optionally, the obtaining a comparison result of the decision node through the feature attribute includes:
traversing the decision tree model and comparing the feature attribute with the threshold vector provided by the model provider;
if the feature attribute at the current decision node is smaller than the threshold vector, setting the comparison result of the decision node to 1 and selecting the right child node of the current decision node to continue the traversal, stopping when the current node is a leaf node so as to obtain the leaf node's label;
and if the feature attribute at the current decision node is not smaller than the threshold vector, setting the comparison result of the decision node to 0 and selecting the left child node of the current decision node to continue the traversal, stopping when the current node is a leaf node so as to obtain the leaf node's label.
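The traversal rule above can be sketched in plaintext (an illustrative sketch with assumed names; in the actual scheme this walk is replaced by secure computation over shares, so no party sees the path):

```python
# Plaintext sketch of the traversal rule: cmp = 1 when the attribute is below
# the threshold, in which case the RIGHT child is taken; otherwise the LEFT.

class Node:
    def __init__(self, threshold=None, attr=None, left=None, right=None, label=None):
        self.threshold, self.attr = threshold, attr
        self.left, self.right, self.label = left, right, label

def evaluate(node, x):
    """Walk the tree until a leaf is reached and return its label."""
    while node.label is None:
        cmp = 1 if x[node.attr] < node.threshold else 0
        node = node.right if cmp == 1 else node.left
    return node.label

# Tiny tree: the root compares x[0] against 5.
tree = Node(threshold=5, attr=0,
            left=Node(label="L"),   # taken when x[0] >= 5 (cmp = 0)
            right=Node(label="R"))  # taken when x[0] < 5  (cmp = 1)
print(evaluate(tree, [3]))  # x[0] < 5  -> right child -> "R"
print(evaluate(tree, [7]))  # x[0] >= 5 -> left child  -> "L"
```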
Optionally, the performing linear transformation on the comparison result includes:
multiplying the comparison result by 2 and subtracting 1, so that a comparison result directed to the left child node becomes -1 while a comparison result directed to the right child node remains unchanged.
Optionally, the obtaining secret values shared by the client and the model provider, and dividing the secret values into a preset number of shares by the replicated secret sharing technique to determine the secret share corresponding to each participant, includes:
obtaining the secret values shared by the client and the model provider respectively, and dividing each secret value into as many shares as there are participants to obtain a first share, a second share and a third share;
setting the first share to 0 and generating the second share with a pseudo-random number generator;
determining the third share from the secret value and the second share, and then distributing the first, second and third shares so as to determine the secret share corresponding to each participant.
Optionally, the determining, based on the secret share, a feature attribute corresponding to each participant on each decision node by using a mapping matrix corresponding to each participant and a feature vector corresponding to each participant, so as to obtain a comparison result of the decision node through the feature attribute, includes:
determining, based on the secret shares, the feature attribute corresponding to each participant at each decision node using each participant's mapping matrix and feature vector, and performing bit decomposition on the difference between the feature attribute and the threshold vector with an adder circuit; if the most significant bit of the decomposed difference is 1, determining that the feature attribute is smaller than the threshold vector; and if the most significant bit of the decomposed difference is 0, determining that the feature attribute is not smaller than the threshold vector.
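The sign test described here can be sketched in plaintext (an illustration of the most-significant-bit rule only; the protocol itself performs the bit decomposition securely over shares with an adder circuit, and the ring width l is an assumed parameter):

```python
# Sketch of the MSB sign test: in the ring Z_{2^l}, the most significant bit
# of (x - y) mod 2^l acts as the sign bit of the difference, so MSB = 1 means
# x < y (for inputs well inside the ring's positive range).
L_BITS = 32  # assumed ring width l

def msb_compare(x, y, l=L_BITS):
    diff = (x - y) % (2 ** l)       # difference computed in the ring
    msb = (diff >> (l - 1)) & 1     # extract the most significant bit
    return msb                      # 1 -> x < y, 0 -> x not smaller than y

print(msb_compare(3, 10))  # 1: 3 < 10
print(msb_compare(10, 3))  # 0
print(msb_compare(5, 5))   # 0: equal counts as "not smaller"
```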
Optionally, performing a dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant so as to determine an evaluation result based on the result of the dot product operation and the tag vector carried by the leaf node, includes:
performing a dot-product operation on the linearly transformed comparison results and the traversal matrix corresponding to each participant, and determining the difference between the dot-product result and the order of the target subset of the decision-node set, so as to judge from this difference whether the dot-product result equals the order of the target subset; wherein the target subset is the set of decision nodes included on the path from the current leaf node to the root node in the decision tree model;
performing bit decomposition on the difference value by using an adder circuit, and performing logical OR operation on all bit positions obtained after the bit decomposition to obtain a result vector;
and determining the evaluation result by using the result vector and the label vector carried by the leaf node.
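The equality test underlying this step can be sketched in plaintext (an illustration of the bit-OR rule only; the protocol performs the bit decomposition and OR securely, and the ring width is an assumed parameter):

```python
# Sketch of the equality test: two ring elements are equal exactly when every
# bit of their difference (mod 2^l) is 0, i.e. the OR over all bits is 0.
L_BITS = 32  # assumed ring width l

def equal_by_bit_or(a, b, l=L_BITS):
    diff = (a - b) % (2 ** l)
    bit_or = 0
    for i in range(l):              # OR together all bits of the difference
        bit_or |= (diff >> i) & 1
    return 1 - bit_or               # 1 -> equal, 0 -> not equal

print(equal_by_bit_or(42, 42))  # 1
print(equal_by_bit_or(42, 43))  # 0
```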
Optionally, the decision tree evaluation method based on secure multi-party computation further includes:
when the number of the decision nodes is larger than a preset threshold value, compressing the traversal matrix by using a divide-and-conquer method to obtain a target number of sub-traversal matrices;
and dividing the decision nodes into the sub traversal matrixes according to a preset node division rule.
In a second aspect, the present application discloses a decision tree evaluation device based on secure multi-party computation, comprising:
the system comprises a copy secret sharing module, a model providing module and a client side, wherein the copy secret sharing module is used for acquiring secret values shared by the client side and the model providing side respectively, and dividing the secret values into a preset number of shares by using a copy secret sharing technology so as to determine secret shares corresponding to each participant; wherein the participants include the client, the model provider, and a computing service; the client side provides a feature vector, and the model provider side provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix;
the decision module is used for determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share so as to obtain the comparison result of the decision node through the characteristic attribute;
and the evaluation module is used for carrying out linear transformation on the comparison result and carrying out dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant so as to determine the evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
In a third aspect, the present application discloses an electronic device comprising a processor and a memory; wherein the memory is used for storing a computer program which is loaded and executed by the processor to implement the secure multi-party computation based decision tree evaluation method as described above.
In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements a secure multi-party computation based decision tree evaluation method as described above.
The application provides a decision tree evaluation method based on secure multi-party computation, comprising the following steps: obtaining secret values shared by a client and a model provider, and dividing the secret values into a preset number of shares by a secret sharing technique so as to determine the secret share corresponding to each participant, wherein the participants include the client, the model provider and a computing server, the client provides a feature vector, the model provider provides a pre-trained decision tree model, and the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix; determining, based on the secret shares, the feature attribute of each participant at each decision node using each participant's mapping matrix and feature vector, so as to obtain the comparison result of each decision node from the feature attribute; and applying a linear transformation to the comparison results and computing the dot product of the transformed comparison results with each participant's traversal matrix, so as to determine the evaluation result from the dot-product result and the label vector carried by the leaf nodes.
Therefore, the client inputs the feature vector it does not wish to disclose into the trained decision tree model provided by the model provider for classification or prediction. Through the replicated secret sharing technique, each participant holds its own secret shares for the computation, no participant can recover another participant's private data from the intermediate data, the private data of the decision tree model's provider and of the client are not leaked, and only the client obtains the final evaluation result, so the method has high security. By converting the traversal of the decision tree into a dot-product computation, the structure of the decision tree is hidden, which avoids adding dummy nodes to the decision tree, reduces communication cost and significantly improves efficiency. In addition, because the parties in the replicated secret sharing technique can be different entities, the decision tree evaluation method based on secure multi-party computation can be extended to a secure outsourcing scenario, so that the model provider and the client do not need to participate in the computation online and only need to securely share their own inputs; this further improves the practicability of the protocol and broadens its scope of application.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
FIG. 1 is a flow chart of a decision tree evaluation method based on secure multi-party computation disclosed in the present application;
FIG. 2 is a schematic diagram of a decision tree according to the present disclosure;
FIG. 3 is a schematic diagram of a decision tree evaluation system based on secure multi-party computing according to the present disclosure;
FIG. 4 is a schematic diagram of a secure outsourcing process disclosed herein;
FIG. 5 is a flowchart of a specific decision tree evaluation method based on secure multi-party computation disclosed in the present application;
FIG. 6 is a schematic diagram of a decision tree evaluation apparatus based on secure multi-party computation disclosed in the present application;
fig. 7 is a block diagram of an electronic device disclosed in the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
Currently, among dedicated protocols for the decision tree evaluation algorithm, most existing schemes hide the structure information of the decision tree by adding dummy nodes that convert it into a full binary tree. However, because dummy nodes require the same computation and communication cost as real nodes and cannot be distinguished by the participants, the time and communication complexity of these methods are independent of the number of real nodes and instead grow exponentially with the depth of the decision tree, which is highly impractical for deep, sparse decision tree models.
Therefore, the present application provides a decision tree evaluation scheme based on secure multi-party computation, which reduces communication cost while improving the practicability and security of the dedicated protocol for the decision tree evaluation algorithm.
The embodiment of the invention discloses a decision tree evaluation method based on secure multi-party computation; as shown in fig. 1, the method comprises the following steps:
step S11: obtaining secret values shared by a client and a model provider, and dividing the secret values into a preset number of shares by a secret sharing technology so as to determine secret shares corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client side provides a feature vector, and the model provider side provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix.
Decision trees are among the most important machine learning models and are widely applied in fields such as face recognition, disease diagnosis and business decision-making. Typically a decision tree is represented as a binary tree, and its structure information is also considered part of the private data; fig. 2 shows a schematic diagram of a decision tree. The decision tree model consists of m decision nodes and m+1 leaf nodes.
In the embodiment of the present application, the confidentiality of private data is protected using the replicated secret sharing technique (Replicated Secret Sharing). When protecting the data, multiple participants take part in the computation of the dedicated protocol for the decision tree evaluation algorithm; fig. 3 is a schematic diagram of the system structure of the decision tree evaluation method. The client needs a trained decision tree model to help it solve some classification or prediction problem; its input is a feature vector x. The model provider may be a consulting company or research institute that holds a trained decision tree model, whereby the user solves the problem through the provider's model.
It will be appreciated that, first, the client's feature vector x typically involves sensitive information, such as a patient's health status or personal income, which should not be revealed to the provider. Second, the decision tree model is trained with a large investment of the provider's resources; it is the provider's trade secret and cannot be disclosed, and the training data of some models can even be recovered through model inversion attacks. Therefore, to protect the confidentiality of the private data of both the client and the provider, the user and the provider need to jointly execute a decision tree evaluation protocol based on secure multi-party computation. Since the replicated secret sharing technique requires three parties, an independent third party is added as a computing server to assist in completing the protocol.
In the embodiment of the application, secret sharing is first performed on the secret values provided by the client and the model provider: the client shares the feature vector x, and the model provider shares the pre-trained decision tree model, which mainly comprises a threshold vector y, a label vector v, an order vector c, a mapping matrix M and a traversal matrix T. Each secret value is split into as many shares as there are participants, giving a first, a second and a third share; for example, the secret value a is denoted [a] = (a_0, a_1, a_2). It should be noted that all computations on secret values are performed in the ring Z_{2^l}, where a = (a_0 + a_1 + a_2) mod 2^l.
Further, the first share is set to 0, the second share is generated with a pseudo-random number generator, and the third share is determined from the secret value and the second share; the three shares are then distributed so that each participant holds its secret share. For example, let P_0, P_1, P_2 denote the three participants, each holding a respective secret share: each party P_α (α ∈ {0, 1, 2}) holds a_{α-1} and a_{α+1} (indices mod 3), denoted [a]_α = (a_{α-1}, a_{α+1}); that is, P_0 holds (a_1, a_2), P_1 holds (a_0, a_2) and P_2 holds (a_0, a_1). When sharing a secret value, P_α and P_{α+1} pre-agree on a pseudo-random number generator (PRG) and a key; a_α is set to 0 and a_{α-1} is generated by the PRG, so that a_{α+1} = a - a_{α-1} is computed by P_α and sent to P_{α-1}.
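This share generation and reconstruction can be sketched in plaintext (a simplified illustration: the PRG key shared between two parties is replaced here by a plain random value, so the communication-saving aspect of the PRG is not modeled):

```python
# Sketch of 2-out-of-3 replicated sharing in Z_{2^l}: one share is fixed to 0,
# one is (pseudo-)random, and the last is the secret minus the random share.
import random

L_BITS = 32
RING = 2 ** L_BITS

def share(a):
    """Split a into (a0, a1, a2) and give each party every share but its own."""
    a0 = 0                        # the dealer's own share index is fixed to 0
    a1 = random.randrange(RING)   # stands in for the PRG output
    a2 = (a - a0 - a1) % RING
    # Party P_alpha holds every share except a_alpha:
    return [(a1, a2), (a0, a2), (a0, a1)]

def reconstruct(s0, s1, s2):
    return (s0 + s1 + s2) % RING

parts = share(12345)
# Any two parties together hold all three shares, e.g. P0 and P1:
(a1, a2), (a0, _) = parts[0], parts[1]
print(reconstruct(a0, a1, a2))  # 12345
```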
Step S12: and determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share, so as to obtain the comparison result of the decision node through the characteristic attribute.
In the embodiment of the application, the model provider provides the pre-trained decision tree model and generates the mapping matrix M and the traversal matrix T in advance. In fig. 2, the decision tree model has m decision nodes and m+1 leaf nodes. Denote the j-th decision node by D_j and the k-th leaf node by L_k. Each decision node performs one comparison: cmp^(j) = 1{x_{σ(j)} < y^(j)}, where x_{σ(j)} denotes an attribute of the feature vector x, y^(j) is the threshold specified by the model provider, and σ denotes the mapping j ∈ {1, 2, ..., m} → i ∈ {1, 2, ..., n}, which represents the process by which each decision node selects an attribute value (i being the index into the feature vector); cmp^(j) is the result of the comparison.
The mapping matrix M serves to select attribute values of the feature vector for the decision nodes of the decision tree model. The row vector m_j of the mapping matrix corresponds to the decision node D_j, and the number of elements of m_j equals the number of elements of the feature vector x. Note that in m_j the σ(j)-th element is set to 1 and all remaining elements are set to 0; thus the dot product of m_j with x extracts the attribute x_{σ(j)} required by D_j.
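The attribute selection by M can be sketched as follows (a plaintext illustration with an assumed mapping σ; in the protocol the dot product is computed over secret shares):

```python
# Sketch of attribute selection: row m_j of M is all zeros except a 1 at
# position sigma(j), so the dot product m_j . x extracts x_{sigma(j)}.
def select_attributes(M, x):
    return [sum(mij * xi for mij, xi in zip(row, x)) for row in M]

x = [10, 20, 30]   # feature vector (n = 3)
sigma = [2, 0]     # assumed: decision node j uses attribute sigma[j] (0-based)
M = [[1 if i == s else 0 for i in range(len(x))] for s in sigma]
print(select_attributes(M, x))  # [30, 10]
```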
In the embodiment of the application, each participant holds its respective secret shares after the input-sharing stage. The participants then jointly compute x_σ = M · x, i.e. the feature attribute of each decision node is determined, based on the secret shares, using each participant's mapping matrix and feature vector.
Further, the comparison result of each decision node is obtained from the feature attribute: the participants perform a secure comparison to jointly compute cmp^(j) = 1{x_{σ(j)} < y^(j)}. In the secure comparison, the participants first compute the difference a of the two values with a subtraction module, then perform bit decomposition with an adder circuit and extract the most significant bit of a: if the most significant bit is 0, the difference is non-negative; if it is 1, the difference is negative, completing the comparison. In the embodiment of the application, the decision tree is traversed starting from the root node: if at the current node x_{σ(j)} is smaller than y^(j), then cmp^(j) equals 1 and the right child node is selected to continue the traversal; otherwise cmp^(j) equals 0 and the left child node is selected. This iterates until the current node is a leaf node. Each leaf node stores a label v^(k), and the traversal finally yields the label of the leaf node selected according to the decision nodes' comparison results.
Specifically, obtaining the comparison result of a decision node from the feature attribute includes: traversing the decision tree model and comparing the feature attribute with the threshold vector provided by the model provider; if the feature attribute at the current decision node is smaller than the threshold vector, setting the comparison result of the decision node to 1 and selecting the right child node of the current decision node to continue the traversal, stopping when the current node is a leaf node so as to obtain the leaf node's label; and if the feature attribute at the current decision node is not smaller than the threshold vector, setting the comparison result of the decision node to 0 and selecting the left child node of the current decision node to continue the traversal, stopping when the current node is a leaf node so as to obtain the leaf node's label.
Step S13: and performing linear transformation on the comparison result, and performing dot product operation on the comparison result subjected to the linear transformation and the traversal matrix corresponding to each participant so as to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
In the embodiment of the present application, the traversal matrix T is used to simulate the traversal of the decision tree. All decision nodes are regarded as a set S; the decision nodes on the path from leaf node L_k to the root node are placed in a subset S_k of S, whose order is denoted c_k. The row vector t_k of the traversal matrix corresponds to leaf node L_k, and the number of elements of t_k equals the number of decision nodes. Note that for each decision node D_j: if D_j is not in S_k, then t_{(k,j)} (the element in row k, column j of T) is set to 0; if D_j is in S_k and L_k lies in the left subtree of D_j, then t_{(k,j)} is set to -1; and if D_j is in S_k and L_k lies in the right subtree of D_j, then t_{(k,j)} is set to 1.
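One way this construction could be realized for a small example tree is sketched below (an illustrative construction under an assumed tuple encoding of the tree; the patent does not prescribe this particular code):

```python
# Sketch of building the traversal matrix T: t[k][j] is 0 if decision node j
# is not on leaf k's root path, -1 if leaf k lies in j's left subtree, and
# +1 if it lies in j's right subtree.
# Assumed encoding: ("D", id, left, right) for decision nodes, ("L", id) leaves.
tree = ("D", 0, ("L", 0), ("D", 1, ("L", 1), ("L", 2)))

def build_T(node, m, path=(), rows=None):
    if rows is None:
        rows = {}
    if node[0] == "L":
        row = [0] * m
        for j, sign in path:     # signs collected along the root path
            row[j] = sign
        rows[node[1]] = row
    else:
        _, j, left, right = node
        build_T(left, m, path + ((j, -1),), rows)
        build_T(right, m, path + ((j, +1),), rows)
    return rows

rows = build_T(tree, m=2)
T = [rows[k] for k in sorted(rows)]
print(T)  # [[-1, 0], [1, -1], [1, 1]]
```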
In order to reduce the communication cost and avoid adding dummy nodes to the decision tree, the traversal of the decision tree is converted into a dot-product operation. In this process, the comparison results obtained in step S12 are linearly transformed; the transformed value is denoted cmp'^(j) and is determined by cmp'^(j) = 2·cmp^(j) - 1. It can be understood that, since the comparison result is 0 when the left child node is traversed and 1 when the right child node is traversed, the linear transformation changes a comparison result directed to the left child node to -1 while leaving a comparison result directed to the right child node unchanged. If the label v^(k) in L_k is the classification result, then there is one and only one row t_k whose dot product with the transformed comparison vector equals c_k.
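The linear transformation and the path-selection property can be illustrated on a small traversal matrix (a plaintext sketch with assumed values; only the row of the leaf actually on the evaluation path has a dot product equal to its order c_k):

```python
# Sketch of path selection: transform cmp -> 2*cmp - 1, then check which row
# t_k of the traversal matrix satisfies t_k . cmp' == c_k (the number of
# decision nodes on that leaf's root path).
T = [[-1, 0], [1, -1], [1, 1]]   # assumed traversal matrix (2 nodes, 3 leaves)
c = [1, 2, 2]                    # |S_k| for each leaf

cmp = [1, 0]                     # node 0: go right; node 1: go left -> leaf 1
cmp_t = [2 * b - 1 for b in cmp]                       # [1, -1]
ctr = [sum(t * ch for t, ch in zip(row, cmp_t)) for row in T]
selected = [int(ctr_k == c_k) for ctr_k, c_k in zip(ctr, c)]
print(ctr)       # [-1, 2, 0]
print(selected)  # [0, 1, 0] -> only leaf 1 is on the evaluation path
```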
In the embodiment of the present application, a dot product is computed between the linearly transformed comparison results and the traversal matrix, and the result for row k is denoted ctr^(k). To determine the evaluation result, after computing ctr^(k), a secure equality test is performed to determine each participant's result vector p, and the evaluation result is finally determined via v* = p · v; the client then performs secret reconstruction on v* to recover the final evaluation result. Note that p^(k) = 1{ctr^(k) = c^(k)}, i.e. p^(k) is determined by judging, index by index, whether the corresponding elements of ctr and c are equal. In this judgment, the participants first compute the difference a of the two values with a subtraction module, then perform bit decomposition with an adder circuit and OR together all bits of a; the result indicates whether the two secret values are equal.
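The final label selection v* = p · v can be sketched in plaintext (an illustration with assumed label values; in the protocol p and v are secret-shared and only v* is reconstructed by the client):

```python
# Sketch of label selection: p is the indicator vector from the equality test
# (exactly one 1), so the dot product p . v yields only the selected leaf's
# label while the other labels cancel to zero.
p = [0, 1, 0]      # equality-test output: leaf 1 matched
v = [7, 13, 42]    # assumed labels stored in the three leaves
v_star = sum(pk * vk for pk, vk in zip(p, v))
print(v_star)  # 13
```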
It is understood that multiplication and vector dot products can be carried out following the existing computation rules for replicated secret sharing. For a product c = a·b, with P_α holding (a_{α-1}, a_{α+1}), each party locally computes c_α = a_{α-1}·b_{α-1} + a_{α-1}·b_{α+1} + a_{α+1}·b_{α-1} + μ_α, where μ_0 + μ_1 + μ_2 = 0 is a sharing of zero used to re-randomize the shares, after which the parties re-share the results; a vector dot product is computed in the same way, summing the per-coordinate products before re-sharing. In addition, addition and multiplication by a constant, e.g. a + b = (a_0 + a_1 + a_2) + (b_0 + b_1 + b_2) = (a_0 + b_0) + (a_1 + b_1) + (a_2 + b_2) and δa = δa_0 + δa_1 + δa_2, also follow the existing computation rules and are not described in detail here. Most existing schemes instead adopt homomorphic encryption, whose computational complexity is higher; the embodiment of the present application requires only simple arithmetic operations and therefore has very good computational performance.
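Under the share layout described earlier (P_α holding (a_{α-1}, a_{α+1})), the multiplication rule can be sketched and checked as follows (an illustrative plaintext sketch: the per-party communication of the re-sharing step and the PRG-based zero sharing are simplified):

```python
# Sketch of replicated-sharing multiplication: summing each party's local
# cross terms a_{α-1}b_{α-1} + a_{α-1}b_{α+1} + a_{α+1}b_{α-1} over all three
# parties covers all nine products a_i*b_j, and the masks mu sum to zero.
import random

RING = 2 ** 32

def share(a):
    a0, a1 = random.randrange(RING), random.randrange(RING)
    a2 = (a - a0 - a1) % RING
    return [a0, a1, a2]

def zero_shares():
    m0, m1 = random.randrange(RING), random.randrange(RING)
    return [m0, m1, (-m0 - m1) % RING]   # mu_0 + mu_1 + mu_2 = 0

def multiply(a, b):
    mu = zero_shares()
    c = []
    for alpha in range(3):               # each party's local computation
        am1, ap1 = a[(alpha - 1) % 3], a[(alpha + 1) % 3]
        bm1, bp1 = b[(alpha - 1) % 3], b[(alpha + 1) % 3]
        c.append((am1 * bm1 + am1 * bp1 + ap1 * bm1 + mu[alpha]) % RING)
    return c

c = multiply(share(6), share(7))
print(sum(c) % RING)  # 42
```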
The application provides a decision tree evaluation method based on secure multi-party computation, which comprises the following steps: obtaining secret values shared by a client and a model provider, and dividing the secret values into a preset number of shares by a secret sharing technology so as to determine the secret share corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client provides a feature vector, and the model provider provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix; determining the feature attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the feature vector corresponding to each participant based on the secret share, so as to obtain the comparison result of the decision node through the feature attribute; and performing linear transformation on the comparison result, and performing a dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant, so as to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
Therefore, the client inputs the feature vector that it does not wish to reveal into the trained decision tree model provided by the model provider for classification or prediction. Through the copy secret sharing technology, each participant holds its own secret share to participate in the calculation, and no participant can recover the private data of other participants from the intermediate data, so that the private data of both the decision tree model provider and the client cannot be revealed and only the client obtains the final evaluation result, giving the scheme high security. By converting the traversal process of the decision tree into a dot product calculation and hiding the structure of the decision tree, there is no need to add dummy nodes to the decision tree, which reduces the communication cost and obviously improves the efficiency. In addition, because the parties of the copy secret sharing technology can be different entities, the decision tree evaluation method based on secure multi-party computation can be extended to a secure outsourcing scenario, so that the model provider and the client do not need to participate in the calculation online and only need to securely share their own inputs, which further improves the practicability of the protocol and widens the application range.
The technical scheme in the application can be extended to a secure outsourcing scenario; fig. 4 shows the working process applied to the outsourcing extension. The model provider can deploy the evaluation service to a cloud service provider, which acts as the computing server. When the client needs the evaluation service, it submits its input to the cloud service providers through the secret sharing technology, the cloud service providers complete the calculation together, and finally the client recovers the evaluation result through the secret sharing technology. In this process, except for the input and output, all other calculations are performed jointly by the three parties; the computing server has no input or output of its own, and all the data it touches are random numbers, so there is no risk of privacy disclosure. After the protocol is finished, the client combines the data of the other two parties to recover the final evaluation result. The provider and the client do not need to stay online or participate in the calculation online; they only need to securely share their own inputs, and no requirement is placed on their computing capacity, which further improves the practicability of the protocol and widens the application range.
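The outsourcing flow can be sketched as follows, under simplifying assumptions (a toy modulus, a single constant-scaling step standing in for the full joint computation, and illustrative names): the client splits its input into random shares, each cloud server works only on its own share, and the client reconstructs the result from the returned shares.

```python
import random

MOD = 2 ** 16  # toy modulus for illustration only

def share(x):
    """Client-side: split x into three random additive shares."""
    x0 = random.randrange(MOD)
    x1 = random.randrange(MOD)
    return x0, x1, (x - x0 - x1) % MOD

# 1. The client shares its input; each cloud server receives one share,
#    which on its own is a uniformly random value revealing nothing about x.
feature = 1234
s0, s1, s2 = share(feature)

# 2. Each server computes on its share only (here a public-constant scaling,
#    standing in for the joint decision tree evaluation).
scaled = [(3 * s) % MOD for s in (s0, s1, s2)]

# 3. The client collects the three result shares and reconstructs the output.
result = sum(scaled) % MOD
```

The servers never exchange anything derived from the plaintext input in this sketch; in the real protocol they additionally interact to multiply secret values, but each message remains uniformly random from any single server's point of view.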
In order to further reduce the communication cost, the embodiment of the invention discloses a specific decision tree evaluation method based on secure multi-party computation, which is shown in fig. 5 and comprises the following steps:
step S21: and when the number of the decision nodes is larger than a preset threshold value, compressing the traversal matrix by using a divide-and-conquer method to obtain a target number of sub-traversal matrices.
Step S22: and dividing the decision nodes into the sub traversal matrixes according to a preset node division rule.
In the embodiment of the application, if the number of decision nodes of the current decision tree model is too large, the traversal matrix can be compressed to reduce the communication cost. For example, the model provider may compress the traversal matrix T of a decision tree with more than 100 decision nodes. The provider divides the 1st to the ⌈(m+1)/2⌉-th leaf nodes into one group, recorded as G'_1, and the (⌈(m+1)/2⌉+1)-th to the (m+1)-th leaf nodes into another group, recorded as G'_2. All decision nodes are regarded as a set S, and the subset S_k of S consists of the decision nodes on the path from leaf node L_k to the root node. When compressing the traversal matrix T, the provider sets S_1* = ∪_{L_k∈G'_1} S_k, S_2* = ∪_{L_k∈G'_2} S_k and S_0* = S_1* ∩ S_2*, and according to these three sets divides the traversal matrix T into three matrices T_0*, T_1* and T_2*. In the embodiment of the present application, the division rule is: the column vectors corresponding to the decision nodes in S_0* are divided into T_0*, the column vectors corresponding to the decision nodes in S_1* \ S_0* are divided into T_1*, and the column vectors corresponding to the decision nodes in S_2* \ S_0* are divided into T_2*. Here T_0* keeps all of its rows unchanged, while the (⌈(m+1)/2⌉+1)-th to the (m+1)-th rows of T_1* and the 1st to the ⌈(m+1)/2⌉-th rows of T_2* are removed, and these removed elements are regarded as a public 0 at the time of computation.
In the embodiment of the application, the model provider can continue to compress T_1* and T_2* by the divide-and-conquer method until the number of compressions reaches the upper limit s. In this process, the constraint (m+1)/2^s ≥ β may be set, where β defaults to 100. When computing ctr, the participants can recover T* from the series of small matrices obtained by compression; compared with the original traversal matrix T, only the order of the column vectors is changed, and the participants compute ctr through T* instead.
Correspondingly, the embodiment of the present application further discloses a decision tree evaluation device based on secure multi-party computation, as shown in fig. 6, the decision tree evaluation device includes:
a copy secret sharing module 11, configured to acquire secret values shared by the client and the model provider respectively, and to divide the secret values into a preset number of shares by using the copy secret sharing technology so as to determine the secret share corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client provides a feature vector, and the model provider provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix;
a decision module 12, configured to determine, based on the secret share, a feature attribute corresponding to each participant on each decision node by using a mapping matrix corresponding to each participant and a feature vector corresponding to each participant, so as to obtain a comparison result of the decision node through the feature attribute;
and the evaluation module 13 is configured to perform linear transformation on the comparison result, and perform dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant, so as to determine an evaluation result based on the result of the dot product operation and the tag vector carried by the leaf node.
For more specific working processes of the modules, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Therefore, according to the scheme of the embodiment, secret values shared by a client and a model provider are obtained, and are divided into shares of a preset number by a secret sharing technology, so that secret shares corresponding to each participant are determined; wherein the participants include the client, the model provider, and a computing server; the client side provides a feature vector, and the model provider side provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix; determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share, so as to obtain the comparison result of the decision node through the characteristic attribute; and performing linear transformation on the comparison result, and performing dot product operation on the comparison result subjected to the linear transformation and the traversal matrix corresponding to each participant so as to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node. 
Therefore, the client inputs the feature vector that it does not wish to reveal into the trained decision tree model provided by the model provider for classification or prediction. Through the copy secret sharing technology, each participant holds its own secret share to participate in the calculation, and no participant can recover the private data of other participants from the intermediate data, so that the private data of both the decision tree model provider and the client cannot be revealed and only the client obtains the final evaluation result, giving the scheme high security. By converting the traversal process of the decision tree into a dot product calculation, the structure of the decision tree is hidden, so that there is no need to add dummy nodes to the decision tree, which reduces the communication cost and obviously improves the efficiency. In addition, because the parties of the copy secret sharing technology can be different entities, the decision tree evaluation method based on secure multi-party computation can be extended to a secure outsourcing scenario, so that the model provider and the client do not need to participate in the calculation online and only need to securely share their own inputs, which further improves the practicability of the protocol and widens the application range.
Further, an electronic device is disclosed in the embodiments of the present application, and fig. 7 is a block diagram of an electronic device 20 according to an exemplary embodiment, which should not be construed as limiting the scope of the application.
Fig. 7 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present disclosure. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is used for storing a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the secure multi-party computation based decision tree evaluation method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be a computer.
In this embodiment, the power supply 23 is configured to provide a working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.
In addition, the memory 22 is used as a carrier for storing resources, and may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like, the resources stored thereon may include an operating system 221, a computer program 222, data 223, and the like, and the data 223 may include various data. The storage means may be transient storage or permanent storage.
The operating system 221 is used for managing and controlling each hardware device on the electronic device 20 and the computer program 222, and may be Windows Server, Netware, Unix, Linux, or the like. In addition to the computer program that performs the secure multi-party computation based decision tree evaluation method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs used to perform other specific tasks.
Further, embodiments of the present application disclose a computer-readable storage medium for storing a computer program, where the computer-readable storage medium may be a random access memory (RAM), a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a magnetic disk, an optical disk, or any other form of storage medium known in the art. The computer program, when executed by a processor, implements the aforementioned secure multi-party computation based decision tree evaluation method. For the specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, which are not described herein again.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The steps of the secure multi-party computation based decision tree evaluation method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The decision tree evaluation method, device, equipment and medium based on secure multi-party computation provided by the invention are introduced in detail, and a specific example is applied in the text to explain the principle and the implementation of the invention, and the description of the above embodiment is only used to help understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A decision tree evaluation method based on secure multi-party computation is characterized by comprising the following steps:
obtaining secret values shared by a client and a model provider respectively, and dividing the secret values into a preset number of shares through a copy secret sharing technology so as to determine the secret share corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client provides a feature vector, and the model provider provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix;
determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share, so as to obtain the comparison result of the decision node through the characteristic attribute;
and performing linear transformation on the comparison result, and performing dot product operation on the comparison result subjected to the linear transformation and the traversal matrix corresponding to each participant so as to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
2. The secure multi-party computing based decision tree evaluation method according to claim 1, wherein the obtaining of the comparison result of the decision node by the feature attributes comprises:
traversing the decision tree model and judging the size relation between the characteristic attribute and a threshold vector provided by the model provider;
if the characteristic attribute of the current decision node is smaller than the threshold vector, the comparison result of the decision node takes a value of 1, and the right child node of the current decision node is selected to continue traversing until the current node is the leaf node, and the traversing is stopped to obtain the label of the leaf node;
and if the characteristic attribute of the current decision node is not smaller than the threshold vector, taking the comparison result of the decision node as 0, selecting a left child node of the current decision node to continue traversing, and stopping traversing until the current node is the leaf node to obtain the label of the leaf node.
3. The secure multi-party computation based decision tree evaluation method according to claim 2, wherein said linearly transforming the comparison result comprises:
multiplying the comparison result by 2 and subtracting 1 from the comparison result to change the comparison result directed to the left child node to-1 and to leave the comparison result directed to the right child node unchanged.
4. The secure multiparty computation based decision tree evaluation method according to claim 1, wherein the obtaining of the secret values shared by the client and the model provider and the dividing of the secret values into a preset number of shares by a copy secret sharing technology to determine the secret shares corresponding to each participant comprises:
obtaining secret values shared by a client and a model provider respectively, and dividing the secret values into shares the same as the number of participants to obtain a first share, a second share and a third share;
setting the first share to 0 and generating a second share value using a pseudo random number generator;
determining the third share value based on the secret value and the second share value, and then distributing the first share value, the second share value, and the third share value to determine the respective secret share of each participant.
5. The secure multiparty computation based decision tree evaluation method according to claim 2, wherein the determining, based on the secret share, the feature attributes of the participants respectively corresponding to each decision node by using the mapping matrix corresponding to each participant and the feature vector corresponding to each participant, so as to obtain the comparison result of the decision node by using the feature attributes, includes:
determining, based on the secret share, the feature attribute corresponding to each participant on each decision node by using the mapping matrix corresponding to each participant and the feature vector corresponding to each participant, and performing bit decomposition on a difference value between the feature attribute and a threshold vector by using an adder circuit; if the highest bit of the decomposed difference value is 1, judging that the feature attribute is smaller than the threshold vector; and if the highest bit of the decomposed difference value is 0, judging that the feature attribute is not smaller than the threshold vector.
6. The method as claimed in claim 1, wherein the performing a dot product operation on the linearly transformed comparison result and the traversal matrix corresponding to each participant to determine an evaluation result based on the result of the dot product operation and the label vector carried by the leaf node comprises:
performing dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant, and determining a difference value between the result of the dot product operation and the order of the target subset of the decision node set so as to judge whether the result of the dot product operation is equal to the order of the target subset of the decision node set or not by using the difference value; wherein the target subset is a set of decision nodes included under a path from a current leaf node to a root node in the decision tree model;
performing bit decomposition on the difference value by using an adder circuit, and performing logical OR operation on all bit positions obtained after the bit decomposition to obtain a result vector;
and determining the evaluation result by using the result vector and the label vector carried by the leaf node.
7. The secure multi-party computation based decision tree evaluation method according to any of claims 1 to 6, further comprising:
when the number of the decision nodes is larger than a preset threshold value, compressing the traversal matrix by using a divide-and-conquer method to obtain a target number of sub-traversal matrices;
and dividing the decision nodes into the sub traversal matrixes according to a preset node division rule.
8. A secure multi-party computation based decision tree evaluation apparatus, comprising:
a copy secret sharing module, configured to acquire secret values shared by the client and the model provider respectively, and to divide the secret values into a preset number of shares by using a copy secret sharing technology so as to determine the secret share corresponding to each participant; wherein the participants include the client, the model provider, and a computing server; the client provides a feature vector, and the model provider provides a pre-trained decision tree model; the decision tree model comprises decision nodes, leaf nodes, a mapping matrix and a traversal matrix;
the decision module is used for determining the characteristic attribute of each participant on each decision node by using the mapping matrix corresponding to each participant and the characteristic vector corresponding to each participant based on the secret share so as to obtain the comparison result of the decision node through the characteristic attribute;
and the evaluation module is used for performing linear transformation on the comparison result, and performing a dot product operation on the comparison result after the linear transformation and the traversal matrix corresponding to each participant, so as to determine the evaluation result based on the result of the dot product operation and the label vector carried by the leaf node.
9. An electronic device, comprising a processor and a memory; wherein the memory is for storing a computer program that is loaded and executed by the processor to implement the secure multi-party computation based decision tree evaluation method of any of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements a secure multiparty computation based decision tree evaluation method according to any of the claims 1 to 7.
CN202211533464.XA 2022-12-01 2022-12-01 Decision tree evaluation method, device, equipment and medium based on secure multi-party computation Pending CN115842627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211533464.XA CN115842627A (en) 2022-12-01 2022-12-01 Decision tree evaluation method, device, equipment and medium based on secure multi-party computation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211533464.XA CN115842627A (en) 2022-12-01 2022-12-01 Decision tree evaluation method, device, equipment and medium based on secure multi-party computation

Publications (1)

Publication Number Publication Date
CN115842627A true CN115842627A (en) 2023-03-24

Family

ID=85577844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211533464.XA Pending CN115842627A (en) 2022-12-01 2022-12-01 Decision tree evaluation method, device, equipment and medium based on secure multi-party computation

Country Status (1)

Country Link
CN (1) CN115842627A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117118602A (en) * 2023-06-29 2023-11-24 济南大学 Method and system for realizing secure comparison protocol based on copy secret sharing
CN117118602B (en) * 2023-06-29 2024-02-23 济南大学 Method and system for realizing secure comparison protocol based on copy secret sharing
CN116938455A (en) * 2023-09-15 2023-10-24 山东师范大学 Data processing method and system based on secret sharing size comparison
CN116938455B (en) * 2023-09-15 2023-12-12 山东师范大学 Data processing method and system based on secret sharing size comparison
CN117857039A (en) * 2024-03-04 2024-04-09 浪潮(北京)电子信息产业有限公司 Multiparty security computing method, device, equipment and medium
CN117857039B (en) * 2024-03-04 2024-05-28 浪潮(北京)电子信息产业有限公司 Multiparty security computing method, device, equipment and medium

Similar Documents

Publication Publication Date Title
Mandal et al. PrivFL: Practical privacy-preserving federated regressions on high-dimensional data over mobile networks
Jha et al. Towards practical privacy for genomic computation
Chen et al. Privacy-preserving backpropagation neural network learning
CN115842627A (en) Decision tree evaluation method, device, equipment and medium based on secure multi-party computation
Erkin et al. Privacy-preserving distributed clustering
Patel et al. An efficient approach for privacy preserving distributed k-means clustering based on shamir’s secret sharing scheme
Baryalai et al. Towards privacy-preserving classification in neural networks
CN113761563B (en) Data intersection calculation method and device and electronic equipment
CN115510502A (en) PCA method and system for privacy protection
CN116595589B (en) Secret sharing mechanism-based distributed support vector machine training method and system
Zheng et al. SecDR: Enabling secure, efficient, and accurate data recovery for mobile crowdsensing
Ugwuoke et al. Secure fixed-point division for homomorphically encrypted operands
CN115865323A (en) Pearson correlation coefficient calculation method based on secret sharing and OT protocol
CN114358323A (en) Third-party-based efficient Pearson coefficient calculation method in federated learning environment
Duan et al. ACCO: Algebraic computation with comparison
CN114547684A (en) Method and device for protecting multi-party joint training tree model of private data
Shi et al. Privacy preserving growing neural gas over arbitrarily partitioned data
Guo et al. Efficient multiparty fully homomorphic encryption with computation fairness and error detection in privacy preserving multisource data mining
Shuguo et al. Multi-party privacy-preserving decision trees for arbitrarily partitioned data
Liu An application of secure data aggregation for privacy-preserving machine learning on mobile devices
CN114969783B (en) Method and system for recovering crowd sensing data with privacy protection
Case et al. The privacy-preserving padding problem: non-negative mechanisms for conservative answers with differential privacy
Tezuka et al. A fast privacy-preserving multi-layer perceptron using ring-lwe-based homomorphic encryption
Zhang et al. Secure Outsourcing Evaluation for Sparse Decision Trees
Shamreen et al. Privacy Preservation of Business Forecasting Using Homomorphic Encryption

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination