CN107145792B - Multi-user privacy protection data clustering method and system based on ciphertext data - Google Patents


Info

Publication number
CN107145792B
Authority
CN
China
Prior art keywords
data
server
user
clustering
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710225047.1A
Other languages
Chinese (zh)
Other versions
CN107145792A (en
Inventor
王轩
蒋琳
李晔
姚霖
刘泽超
刘猛
漆舒汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201710225047.1A
Publication of CN107145792A
Application granted
Publication of CN107145792B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Abstract

The invention provides a multi-user privacy-preserving data clustering method and system based on ciphertext data, and belongs to the technical field of data mining. The method comprises the following steps: two or more users each encrypt their data and send the encrypted data, the cluster center points, and the trapdoor information to a server; the server computes the distance between each ciphertext data point and each cluster center point and partitions the data into clusters; within each cluster, the server sums the data points of each user separately and sends the sum and the count of the data to the respective user; each user re-encrypts the received sum and count and sends them to the server; the server computes the new cluster center points and sends them to each user; and all users jointly compute the average value of the data points in each cluster with respect to the cluster center point through an outsourced privacy-preserving average calculation protocol, then send the average value to the server for the next iteration. The invention greatly improves clustering efficiency, achieves secure computation under the semi-honest model, and can resist collusion attacks to a certain degree.

Description

Multi-user privacy protection data clustering method and system based on ciphertext data
Technical Field
The invention relates to the technical field of data mining, in particular to a multi-user privacy protection data clustering method based on ciphertext data, and further provides a system for realizing the multi-user privacy protection data clustering method based on the ciphertext data.
Background
Privacy-Preserving Data Mining (PPDM) addresses data mining that involves two or more parties who do not want their private data to be revealed during the computation. It allows mining to be carried out on the joint data of two or more parties while ensuring that no party's private data is exposed to the others.
Privacy-preserving data mining techniques fall mainly into data-perturbation-based methods and cryptography-based methods. Data-perturbation-based methods protect the source data by adding noise to it, at the cost of some loss of precision. Cryptography-based methods rely mainly on homomorphic encryption and secure multi-party computation; compared with data perturbation they alter the data less and give higher precision, but their time complexity and computational cost are usually higher.
Early cryptography-based methods were distributed computing methods without cloud participation. They mainly used Yao's secure circuit evaluation protocol or partially homomorphic encryption to protect data privacy, but they suffered from low efficiency, a heavy computational load on each participant, and limited practicality. Later, in 2012, Peter et al. proposed outsourced secure multi-party computation based on the BCP encryption scheme, making it possible to use the cloud to reduce the computational load of the participants. In the same year, Asharov proposed a threshold homomorphic encryption method for multi-party computation, which further improved the efficiency of cloud computing; however, that method cannot protect user privacy, and a user's content is easily stolen by other users.
As for clustering, the classical method is the traditional K-means algorithm. In the first iteration, K points are randomly selected from the data as cluster center points; the Euclidean distance from every other point to each cluster center point is then computed, and by comparison each point is assigned to the cluster center at the shortest distance. After the partition is complete, the average of each component over the points in each cluster is computed to obtain the new cluster centers, which ends the first iteration, and the next iteration begins. The iterations repeat until the cluster centers no longer change, at which point iteration stops and clustering is complete.
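The plaintext procedure described above can be sketched as follows. This is a minimal illustration of ordinary, unencrypted K-means, not the ciphertext method of the invention; the function name `kmeans`, the `max_iter` and `seed` parameters, and the handling of empty clusters are assumptions added for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain (unencrypted) K-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # first iteration: k random data points
    assignment = []
    for _ in range(max_iter):
        # Assign every point to the center with the smallest Euclidean distance.
        assignment = [
            min(range(k), key=lambda t: math.dist(p, centers[t])) for p in points
        ]
        # Recompute each center as the component-wise mean of its cluster.
        new_centers = []
        for t in range(k):
            members = [p for p, a in zip(points, assignment) if a == t]
            if not members:  # assumption: an empty cluster keeps its old center
                new_centers.append(centers[t])
                continue
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return centers, assignment
```

Calling `kmeans(points, 2)` on four points that form two well-separated pairs returns the midpoint of each pair as the two centers.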
K-means is one of the simpler clustering algorithms: it groups the samples into K clusters according to a given rule. However, the traditional algorithm provides no user privacy protection; data participants can easily obtain the data of other users, so the method is deficient in security.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a multi-user privacy protection data clustering method based on ciphertext data and a system for realizing the method.
The invention relates to a multi-user privacy protection data clustering method based on ciphertext data, which comprises the following steps:
S1: two or more users each encrypt their data and send the encrypted data, the cluster center points, and the trapdoor information to a server;
S2: the server computes the distance between each ciphertext data point and each cluster center point, and partitions the clusters according to the distances and the trapdoor information;
S3: within each cluster, the server sums the data points of each user separately, and sends the sum and the count of the data to the respective user;
S4: each user re-encrypts the received sum and count with the BCP encryption method and sends the result to the server;
S5: the server computes the new cluster center points and sends them to each user;
S6: all users jointly compute the average value of the data points in each cluster with respect to the cluster center point and send the average value to the server; the method returns to step S1 until the average value is smaller than the threshold, at which point classification is finished and the server sends each user its classification result according to the data source.
In a further improvement of the present invention, in step S1 the server is an outsourced server, and each user encrypts the data twice, once with homomorphic encryption and once with BCP encryption. The data set $D = \{d_1, d_2, \ldots, d_n\}$ contains $n$ data points, and each data point $d_i = (x_{i,1}, \ldots, x_{i,m})$ is an $m$-dimensional vector. Each component $x_{i,j}$ of $d_i$ is encrypted twice and uploaded to the outsourced server as $\mathrm{Enc}(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)})$, where $1 \le i \le n$, $1 \le j \le m$, $c_{e(i,j)}$ denotes the ciphertext produced by Liu's homomorphic encryption scheme, and $c_{p(i,j)}$ denotes the ciphertext produced by the BCP encryption scheme.
In a further improvement of the present invention, the processing method in step S2 includes:
S21: based on the ciphertexts $c_{e(i,j)}$, the server computes the distance $ED^2(d_i, E_t)$ between the ciphertext data point $d_i$ and the $t$-th cluster center point $E_t$, where $k$ is the number of cluster center points and $1 \le t \le k$;
S22: using the Trapdoor function in the trapdoor information provided by the user, the outsourced server computes $ED^2(d_i, E_t) + \mathrm{Trapdoor}$, compares the distances from each data point to every cluster center, selects the closest center, and assigns the point to the corresponding cluster.
In a further improvement of the present invention, in step S21 each data point $d_i = (x_{i,1}, \ldots, x_{i,m})$ and each cluster center point $E_t = (e_{t,1}, \ldots, e_{t,m})$ are $m$-dimensional vectors, the encrypted data of each data point is $(c_{e(i,1)}, \ldots, c_{e(i,m)})$, and the distance $ED^2(d_i, E_t)$ is calculated as:
Figure BDA0001264952770000021
In a further improvement of the present invention, in step S22 the Trapdoor function is used to generate an order-preserving encryption index with which the sizes of two data items can be compared.
In step S3, the server uses the ciphertexts $c_{p(i,j)}$ to calculate, for each cluster, the number of data points and the sum of their corresponding components, and sends the results to each user $P_i$ according to the data distribution.
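As a plaintext stand-in for step S3, the sketch below groups points by cluster and by owning user and returns each group's component-wise sum and count. In the method itself the server performs this over the BCP ciphertexts $c_{p(i,j)}$ rather than over plaintext; the function name and the `owners` and `assignment` inputs are hypothetical names chosen for the illustration.

```python
from collections import defaultdict


def per_user_cluster_sums(points, owners, assignment):
    """Return {(cluster, owner): (component_sums, count)} for plaintext points."""
    sums = {}
    counts = defaultdict(int)
    for point, owner, cluster in zip(points, owners, assignment):
        key = (cluster, owner)
        if key in sums:
            # Add the new point component by component.
            sums[key] = tuple(s + x for s, x in zip(sums[key], point))
        else:
            sums[key] = tuple(point)
        counts[key] += 1
    return {key: (sums[key], counts[key]) for key in sums}
```

Each user would then receive only the entries whose owner field matches that user.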
In a further improvement of the present invention, in steps S4-S6 each cluster center is recomputed by summing the corresponding components of the discrete points assigned to that center and taking the average. Assume there are $n$ points and $t$ users; for each user $P_i$,
Figure BDA0001264952770000031
is the value of Pi, and is,
Figure BDA0001264952770000032
each user has an encrypted value of
Figure BDA0001264952770000033
Each $c_{p_i}$ is an $m$-dimensional vector. For each cluster center, the cloud sums the corresponding components of the discrete points $c_{p_i}$ belonging to each user $P_i$ and counts them. Then
Figure BDA0001264952770000034
The addition result is $X_i = (x_{i1}, x_{i2}, \ldots, x_{im})$, and the number of points in the cluster belonging to $P_i$ is $a_i$. The server sends the computed $X_i$ and $a_i$ to the corresponding user $P_i$; each user $P_i$ encrypts them with the BCP encryption scheme and, together with the server, uses the OPPWAP protocol to calculate
Figure BDA0001264952770000035
The final result is the calculated average.
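The formula computed by the protocol appears only as an image in the source. Assuming, from the surrounding text, that the result is the component-wise sum of all $X_i$ divided by the sum of all $a_i$, the plaintext equivalent can be sketched as follows; the function name is hypothetical and no encryption is involved.

```python
def combine_cluster_center(partial_sums, counts):
    """Combine per-user (X_i, a_i) pairs into one m-dimensional center.

    Assumption: the new center is (sum of all X_i) / (sum of all a_i).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("cluster has no members")
    # zip(*partial_sums) walks the m components across all users.
    return tuple(sum(column) / total for column in zip(*partial_sums))
```

For two users holding sums (4, 6) over two points and (5, 6) over one point, the combined center is (3.0, 4.0).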
In a further improvement of the present invention, the processing of the server and each user under the OPPWAP protocol comprises the following steps:
A1: the outsourced server S runs Setup to initialize and generate the public parameters $PP = (N, K, g)$, and sends $PP$ to each user $P_i$;
A2: after obtaining the public parameters, each user $P_i$ generates its own public and private key pair $(pk_i, sk_i)$ with the key generator and sends the public key $pk_i$ to the server;
A3: the server combines all the public keys to compute a unified public key and sends the unified public key Prod.pk to each user $P_i$;
A4: user $P_i$ encrypts its data to obtain the results $(A_i, B_i)$ and $(A'_i, B'_i)$;
A5: user $P_i$ generates two random numbers $\rho_i$ and $\rho'_i$ and recomputes the encrypted data to obtain:
Figure BDA0001264952770000036
Figure BDA0001264952770000037
and sends the results to the server;
A6: after obtaining the data, the server calculates, according to the formula,
Figure BDA0001264952770000038
And will be
Figure BDA0001264952770000039
And
Figure BDA00012649527700000310
returned to each user $P_i$;
A7: user $P_i$ calculates
Figure BDA00012649527700000311
And
Figure BDA00012649527700000312
and sends them to the server;
A8: after the server obtains $X_i$ and $X'_i$, it calculates the new data
Figure BDA00012649527700000313
And
Figure BDA00012649527700000314
then generates a random number $\tau$ and sends $K$, $K'$,
Figure BDA0001264952770000315
to each user $P_i$;
A9: after each user $P_i$ obtains the data, it finally calculates
Figure BDA0001264952770000041
thereby obtaining the average value of the distances between the data points in each cluster and the cluster center point.
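Step A4 produces ciphertext pairs of the form $(A_i, B_i)$, which matches the shape of textbook BCP (Bresson-Catalano-Pointcheval) ciphertexts. The sketch below is a toy version of textbook BCP for a single key, with an additive homomorphism; it is not the OPPWAP protocol, whose formulas are images not reproduced here. The tiny primes, the choice $g = 2$, and all function names are assumptions for illustration and offer no security; the public parameter $K$ is omitted because the sketch does not use master-key decryption.

```python
import secrets

# Toy public parameters; a real deployment uses a large safe-prime modulus.
P, Q = 1019, 1187
N = P * Q
N2 = N * N
G = 2  # assumption: stand-in for g; any unit modulo N^2 keeps decryption correct


def keygen():
    """Return (pk, sk) with pk = g^sk mod N^2."""
    sk = secrets.randbelow(N2 // 2 - 1) + 1
    return pow(G, sk, N2), sk


def encrypt(pk, m):
    """Encrypt 0 <= m < N as (A, B) = (g^r, pk^r * (1 + m*N)) mod N^2."""
    r = secrets.randbelow(N2 // 4 - 1) + 1
    a = pow(G, r, N2)
    b = (pow(pk, r, N2) * (1 + m * N)) % N2
    return a, b


def decrypt(sk, ciphertext):
    """Recover m from (A, B) using B / A^sk = 1 + m*N (mod N^2)."""
    a, b = ciphertext
    masked = (b * pow(pow(a, sk, N2), -1, N2)) % N2
    return (masked - 1) // N


def add(ct1, ct2):
    """Component-wise product of two ciphertexts encrypts the plaintext sum."""
    return (ct1[0] * ct2[0]) % N2, (ct1[1] * ct2[1]) % N2
```

The additive property is what lets a server sum encrypted components without seeing them, provided the plaintext sum stays below $N$.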
The invention also provides a system for realizing the method, comprising a server and two or more users. The users send the encrypted data, the cluster center points, and the trapdoor information to the server; re-encrypt the received sum and count of the data with the BCP encryption method and send the result to the server; and calculate the average value of the distances between the data points in each cluster and the cluster center point and send it to the server. The server calculates the distance between the ciphertext data points and the cluster center points, partitions the clusters according to the distances and the trapdoor information, sums the data points of each user separately within each cluster, sends the sum and the count of the data to the respective user, calculates the new cluster center points and sends them to each user, and, after classification is finished, sends the classification result to each user according to the data source.
The invention is further improved, and the server is an outsourced cloud server.
Compared with the prior art, the invention has the following beneficial effects: it adopts cryptographic techniques and improves efficiency by choosing a relatively efficient encryption algorithm and outsourcing the data; it combines an improved trapdoor encryption algorithm with privacy-preserving data mining, which further improves efficiency; and it achieves secure computation under the semi-honest model and can resist collusion attacks to a certain degree.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a one-time iteration structure and data processing flow according to the present invention;
FIG. 3 is a partial clustering of data;
FIG. 4 is a clustering result of the data cipher text of FIG. 3;
FIG. 5 shows the result of plaintext clustering in the data shown in FIG. 3;
FIG. 6 is a comparison of the time spent by the server and the user in one iteration;
FIG. 7 is a comparison histogram of data plaintext and ciphertext at the time of the last data and one iteration;
FIG. 8 is a time-comparison histogram of data plaintext and ciphertext over one iteration.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
The method addresses privacy-preserving data clustering across multiple users or multiple data sources. To ensure that the private data of the data owners is not leaked during the computation, a scheme is needed to protect it. At the same time, privacy protection imposes a large computational load on the data owners, so the computation needs to be outsourced to a server. The invention combines these two requirements: it combines a K-means clustering algorithm with multi-party data privacy protection and outsourced computation, achieves privacy protection through encryption, and achieves computation over ciphertext through secure multi-party computation. The data owners encrypt their data and upload it to the outsourced server; the server computes over the ciphertext and returns the clustering result to the data owners. Most of the computation is handed to the outsourced server while the data owners perform only a small amount, so clustering is achieved without revealing the owners' private data. The invention must overcome two main difficulties: one is outsourcing the computation of a privacy-preserving K-means clustering algorithm; the other is the computational difficulty caused by the diversity of data distributions in multi-party data sets. The invention mainly addresses these two difficulties.
As shown in FIG. 1, the multi-user privacy-preserving data clustering method based on ciphertext data of the present invention includes the following steps:
S1: two or more users each encrypt their data and send the encrypted data, the cluster center points, and the trapdoor information to a server;
S2: the server computes the distance between each ciphertext data point and each cluster center point, and partitions the clusters according to the distances and the trapdoor information;
S3: within each cluster, the server sums the data points of each user separately, and sends the sum and the count of the data to the respective user;
S4: each user re-encrypts the received sum and count with the BCP encryption method and sends the result to the server;
S5: the server computes the new cluster center points and sends them to each user;
S6: all users jointly compute the average value of the data points in each cluster with respect to the cluster center point and send the average value to the server; the method returns to step S1 until the average value no longer changes, at which point classification is finished and the server sends each user its classification result according to the data source.
In step S1, the present invention encrypts the data with two encryption schemes, Liu's homomorphic encryption and BCP encryption. The data set $D = \{d_1, d_2, \ldots, d_n\}$ contains $n$ data points, and each data point $d_i = (x_{i,1}, \ldots, x_{i,m})$ is an $m$-dimensional vector. Each component $x_{i,j}$ of $d_i$ is encrypted twice and uploaded to the outsourced server as $\mathrm{Enc}(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)})$, where $1 \le i \le n$, $1 \le j \le m$, $c_{e(i,j)}$ denotes the ciphertext produced by Liu's homomorphic encryption scheme, and $c_{p(i,j)}$ denotes the ciphertext produced by the BCP encryption scheme.
The server in this example is an outsourced cloud server; performing most of the computation on the outsourced cloud server effectively improves clustering efficiency.
As shown in FIG. 2, one complete iteration of the present invention mainly includes the following steps. Each user $P_i$ encrypts its data and uploads the encrypted data to the outsourced server; the combined data set $D$ is equivalent to a two-dimensional table, and the encryption process encrypts each entry of $D$ twice before uploading it to the cloud server. The outsourced cloud server computes the distance from each data point to each cluster center, receives the trapdoor (Trapdoor) information from the data owners, selects the cluster center with the shortest distance by comparison, and partitions the clusters. Then, based on the partition, it sums each component of the data points in each cluster and, according to the data distribution, sends to P1 the sum and count of the data belonging to P1, to P2 those belonging to P2, and so on up to Pn. Each user $P_i$ ($1 \le i \le n$) re-encrypts its information and sends it to the cloud; the cloud computes the new cluster center points and sends the result to each user; the users decrypt it and send the new cluster centers to the cloud to begin the next iteration.
From the user's perspective, in one iteration each user $P_i$ provides its own trapdoor information (according to the data distribution) and waits for the server to send the sums and counts of its data in each cluster. After receiving them, $P_i$ encrypts the data with the BCP encryption scheme and, together with the cloud, completes the recomputation of the cluster centers using the OPPWAP protocol. Because the data of the two or more users is horizontally distributed, each data point in the data set belongs entirely to one user. In this example, when the parties recompute the cluster centers, they negotiate a common set of values $r_{1v}, r_{2v}, \ldots, r_{mv}$ for encrypting the cluster centers; the cluster centers are re-encrypted and returned to the outsourced cloud server, which keeps the database computation consistent. Note that when a cluster center is re-encrypted, the BCP encryption scheme is not applied to the new cluster center; the cluster center point only needs to be encrypted once, with Liu's encryption scheme.
Specifically, the processing method of step S2 includes:
S21: based on the ciphertexts $c_{e(i,j)}$, the server computes the distance $ED^2(d_i, E_t)$ between the ciphertext data point $d_i$ and the $t$-th cluster center point $E_t$, where $k$ is the number of cluster center points and $1 \le t \le k$;
Assume the data set contains $n$ data points, $D = \{d_1, d_2, \ldots, d_n\}$, and that $k$ cluster centers are set in advance; $d_i$ ($1 \le i \le n$) denotes the $i$-th data point and $E_t$ ($1 \le t \le k$) denotes the $t$-th cluster center. Each data point $d_i = (x_{i,1}, \ldots, x_{i,m})$ and each cluster center point $E_t = (e_{t,1}, \ldots, e_{t,m})$ are $m$-dimensional vectors. The Euclidean distance is calculated according to the formula
$D(d_i, E_t) = \sqrt{\sum_{j=1}^{m} (x_{i,j} - e_{t,j})^2}$
where $j$ indexes the $j$-th component of each vector.
However, since this example uses two encryption schemes, each $x_{i,j}$ is encrypted twice and uploaded as $\mathrm{Enc}(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)})$. $c_{e(i,j)}$ is used to compute the distance from each discrete point to the center points and to partition the clusters; $c_{p(i,j)}$ is used to recompute the cluster centers. For convenience, in the distance computation and cluster partitioning this example writes $c_{i,j}$ in place of $c_{e(i,j)}$. Liu's homomorphic encryption scheme is used both for computing and for comparing the distances from data points to cluster centers. The encryption key in this scheme is a list $K(v)$; because only one $t_i$ in the key list is nonzero, the data owner only needs to upload the $c_i$ corresponding to that $t_i \neq 0$ to the outsourced server. The invention assumes that $c_1$ is uploaded.
In the partitioning process, the distance from each point to each center point must be computed. This example takes the distance from $d_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,m})$ to $E_t = (e_{t,1}, e_{t,2}, \ldots, e_{t,m})$. Since only $c_1$ is uploaded, $c_{i,j}$ denotes the uploaded component of $\mathrm{Enc}(K(v), x_{i,j}) = (c_{1(i,j)}, \ldots, c_{v(i,j)})$, with $c_{i,j} = k_1 t_1 x_{i,j} + s_1 r_{v(i,j)} + k_1 (r_1 - r_{v-1})$. Similarly, $c'_{t,j}$ denotes the encrypted $e_{t,j}$, with $c'_{t,j} = k_1 t_1 e_{t,j} + s_1 r'_{v(t,j)} + k_1 (r_1 - r_{v-1})$. $ED^2(d_i, E_t)$ then denotes the distance from the encrypted data point $d_i$ to the cluster center $E_t$. The following formula gives the computation of the distance from a data point to a cluster center under ciphertext:
Figure BDA0001264952770000071
However, the distance computed under ciphertext cannot be used directly for comparison, because it consists of the original distance $D^2(d_i, E_t)$ plus a suffix related to $r_v$. Therefore, to compare sizes under ciphertext, the data owner (i.e., the user) must provide trapdoor information that cancels the part affecting the distance comparison.
S22: using the Trapdoor function in the trapdoor information provided by the user, the outsourced server computes $ED^2(d_i, E_t) + \mathrm{Trapdoor}$, compares the distances from each data point to every cluster center, selects the closest center, and assigns the point to the corresponding cluster.
The trapdoor information in this example is an order-preserving index (OPI), introduced by Liu in 2014 for outsourced encrypted computation. Given a key $k$ and a plaintext $x$, $OPI(k, x)$ yields an index for $x$. For two plaintexts $x_1$ and $x_2$ with $x_1 > x_2$, the order-preserving index guarantees $OPI(k, x_1) > OPI(k, x_2)$. The index does not allow $x_1$ and $x_2$ to be recovered, but their sizes can be compared.
The scale of a plaintext is the number of digits after its decimal point. For example, a plaintext in the format XXX.XX has scale 2 and sensitivity $10^{-2}$; in general, a plaintext with scale $s$ has sensitivity $10^{-s}$. The key $k$ of the index used in this example is a pair of numbers $(a, b)$ with $a > 0$. With $Sens$ denoting the sensitivity of the plaintext, $OPI(k, x) = a \cdot x + r$, where $r$ is uniformly distributed in the interval $[0, a \cdot Sens)$, so the magnitude of $r$ does not affect the comparison of the values of $x$. If $x_1 > x_2$, then $OPI(k, x_1) - OPI(k, x_2) = a (x_1 - x_2) + r_1 - r_2$. Since $a (x_1 - x_2) \ge a \cdot Sens > r_2 - r_1$, it follows that $OPI(k, x_1) > OPI(k, x_2)$.
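The order-preserving property can be checked with a small sketch. It implements only the formula $OPI(k, x) = a \cdot x + r$ given above, using exact fractions so the comparison is not disturbed by floating-point error; the function name, the noise granularity of $10^{-9}$, and the explicit `rng` argument are assumptions, and the second key component $b$ is left out because the formula in the text does not use it.

```python
import random
from fractions import Fraction


def opi(a, x, sens, rng):
    """Order-preserving index a*x + r with r drawn from [0, a*sens)."""
    # The fraction is in [0, 1), so the noise is strictly below a*sens.
    noise = Fraction(rng.randrange(10**9), 10**9) * a * sens
    return a * x + noise


rng = random.Random(1)
sens = Fraction(1, 100)                 # plaintexts with scale 2
values = [Fraction(n, 100) for n in range(-50, 50)]
indexes = [opi(7, x, sens, rng) for x in values]
# Plaintexts one sensitivity step apart still map to increasing indexes.
is_sorted = all(lo < hi for lo, hi in zip(indexes, indexes[1:]))
```

Because any two distinct plaintexts differ by at least `sens`, the gap `a * sens` between their scaled values always exceeds the largest possible noise difference.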
Therefore, to compare ciphertexts in this example, the ciphertext to be turned into an order-preserving index must have the form $a \cdot f(X) + g(X, R)$, where $a$ denotes the encryption key, $X$ is the plaintext data set to be compared, $R$ is the set of random numbers, and $f$ and $g$ are two functions defined in terms of addition and multiplication. The sensitivity of $f(X)$ is defined as the minimum gap between $f(x_1)$ and $f(x_2)$. If the scale of $f(x_1)$ is $s_1$ and the scale of $f(x_2)$ is $s_2$, then the scale of $f(x_1) + f(x_2)$ is the larger of $s_1$ and $s_2$, and the scale of $f(x_1) \cdot f(x_2)$ is $s_1 + s_2$.
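The two scale rules can be illustrated with Python's `decimal` module, whose exact decimal arithmetic follows the same behavior: a sum keeps the larger number of fractional digits and a product adds them. The helper name `scale` is an assumption for the example.

```python
from decimal import Decimal


def scale(value):
    """Number of digits after the decimal point of a finite Decimal."""
    return max(0, -value.as_tuple().exponent)


x1 = Decimal("12.25")  # scale 2
x2 = Decimal("3.5")    # scale 1
sum_scale = scale(x1 + x2)       # larger of the two scales
product_scale = scale(x1 * x2)   # sum of the two scales
```

This is why squaring a distance with scale $s$ yields scale $2s$ and sensitivity $10^{-2s}$.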
Assume that the sensitivity of $f(X)$ is $Senf$ and that the ciphertext has the form $a \cdot f(X) + g(X, R)$, which is to be converted into the form $OPI(k, X) = a \cdot f(X) + r$. For the outsourced server to convert a ciphertext of the form $a \cdot f(X) + g(X, R)$ into an order-preserving index of the form $a \cdot f(X) + r$, the data owner must construct the trapdoor information $-g(X, R) + r$. First, $ED^2(d_i, E_t)$ must be written in the form $a \cdot f(X) + g(X, R)$; the specific formula is as follows:
Figure BDA0001264952770000081
where $a$, $f(X)$ and $g(X, R)$ are respectively as follows:
$a = (k_1 t_1)^2$
Figure BDA0001264952770000082
Figure BDA0001264952770000083
Here, assume that the scale of $D(d_i, E_t)$ is $s$; then the scale of $D^2(d_i, E_t)$ is $s + s = 2s$, so the sensitivity of $D^2(d_i, E_t)$ is $10^{-2s}$. Because the data owner must provide the trapdoor information (Trapdoor), different data distributions may lead to different forms of trapdoor information.
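The exact expressions for $ED^2$, $f(X)$ and $g(X, R)$ are images not reproduced in this text. The sketch below therefore rests on stated assumptions: that $ED^2$ is the sum of squared differences of the uploaded ciphertexts $c_{i,j}$ and $c'_{t,j}$, that the term $k_1 (r_1 - r_{v-1})$ is identical on both sides, and that plaintexts are integers so that $Sens = 1$. Under those assumptions, $g$ collects every term containing a random mask, and adding the trapdoor $-g + r$ leaves $a \cdot D^2 + r$. All constants and function names are hypothetical.

```python
import random

K1, T1, S1 = 5, 3, 11      # hypothetical key parts k1, t1, s1
A = (K1 * T1) ** 2         # a = (k1*t1)^2
SHARED = K1 * (17 - 4)     # k1*(r1 - r_{v-1}), assumed equal on both sides


def encrypt(x, r_v):
    """c = k1*t1*x + s1*r_v + k1*(r1 - r_{v-1}), as in the text."""
    return K1 * T1 * x + S1 * r_v + SHARED


def ed2(cipher_point, cipher_center):
    """Assumed ciphertext distance: sum of squared ciphertext differences."""
    return sum((c - e) ** 2 for c, e in zip(cipher_point, cipher_center))


def trapdoor(point, center, r_point, r_center, rng):
    """-g(X, R) + r, where g gathers all terms involving the random masks."""
    g = 0
    for x, e, rp, rc in zip(point, center, r_point, r_center):
        delta = rp - rc
        g += 2 * K1 * T1 * S1 * (x - e) * delta + (S1 * delta) ** 2
    return -g + rng.randrange(A)  # r in [0, a*Sens) with Sens = 1


def order_index(point, center, rng):
    """What the server ends up comparing: ED^2 + Trapdoor = a*D^2 + r."""
    r_point = [rng.randrange(1, 1000) for _ in point]
    r_center = [rng.randrange(1, 1000) for _ in center]
    cipher_point = [encrypt(x, r) for x, r in zip(point, r_point)]
    cipher_center = [encrypt(e, r) for e, r in zip(center, r_center)]
    return ed2(cipher_point, cipher_center) + trapdoor(
        point, center, r_point, r_center, rng
    )
```

Comparing `order_index` values for one point against several centers then selects the same nearest center as comparing plaintext squared distances.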
When calculating $D^2(d_i, E_t)$, the outsourced server does not need to perform different computations for data from different data owners, but it must keep track of which user each record comes from. The user's trapdoor information (Trapdoor) is required to compute distances and partition the clusters; because the outsourced server knows whether each $d_i$ comes from P1, P2, and so on, the trapdoor information for a data point is provided by the data owner to whom that point belongs.
In this example, with a horizontal data distribution, assume $d_i$ comes from P1; then the trapdoor information is provided by P1. The format of the trapdoor information is $-g(X, R) + r$. The trapdoor function designed in this example consists of two parts:
$Trap_{it}(d_i, E_t) + Trap_t(E_t)$
The first part, $Trap_{it}(d_i, E_t)$, is the trapdoor function of the distance from each data point to the cluster center point and can be computed in advance by the data owner; the second part, $Trap_t(E_t)$, is the trapdoor function of each cluster center point and changes with the cluster centers in every iteration. They are calculated as follows:
Figure BDA0001264952770000084
Figure BDA0001264952770000091
where NB_tj is a random number in the range [0, (k_1*t_1)^2*sens], corresponding to part of the random number R in -g(X, R) + R. The result of adding the two parts is shown in the following formula.
Figure BDA0001264952770000092
where -g(X, R) corresponds to
Figure BDA0001264952770000093
and the random number R corresponds to
Figure BDA0001264952770000094
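The order-preserving idea behind the trapdoor can be illustrated with a small plaintext sketch. It assumes an index of the form a*x + R with a = (k_1*t_1)^2 and a fresh random R drawn from [0, a*sens), where sens is the smallest gap between two distinct values. The key values and the function name are hypothetical, no encryption is involved, and the half-open interval is an assumption made here so that the comparison stays strict.

```python
import secrets


def order_preserving_index(x, k1, t1, sens=1):
    """Map an integer x to a*x + R, with a = (k1*t1)**2 and 0 <= R < a*sens."""
    a = (k1 * t1) ** 2
    r = secrets.randbelow(a * sens)  # fresh noise, strictly below a*sens
    return a * x + r


# Hypothetical key material and squared distances (integers, so sens = 1).
k1, t1 = 3, 5
distances = [4, 9, 10, 25]
indexes = [order_preserving_index(d, k1, t1) for d in distances]

# Values that are at least sens apart keep their order despite the noise.
assert indexes == sorted(indexes)
```

Because the noise is always smaller than a*sens, two values that differ by at least sens can never swap places, which is what lets a server compare distances without seeing them.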
When recalculating the cluster centers, this example performs the calculation on the ciphertext c_p(i,j) produced by the second encryption scheme, BCP. The server side uses the ciphertext c_p(i,j) to count how many data points are in each cluster and to sum the corresponding components of those data points, and sends the results to each user Pi according to the data distribution. To calculate the average value of the data points in each cluster among all parties, this example designs the OPPWAP protocol (outsourcing privacy protection average number calculation protocol), which reduces the task to the problem of calculating
Figure BDA0001264952770000095
under ciphertext.
Specifically, in steps S4-S6, each cluster center is recalculated by summing the corresponding components of the discrete points assigned to that center in each cluster and then averaging. Assume that there are n points and t users, each user Pi,
Figure BDA0001264952770000096
is the value of Pi, and is,
Figure BDA0001264952770000097
each user has an encrypted value of
Figure BDA0001264952770000098
Each c_pi is an m-dimensional vector. For each cluster center, the cloud server sums the corresponding components of the discrete points c_pi belonging to user Pi and counts those points. Then
Figure BDA0001264952770000099
The summation result is X_i = (x_i1, x_i2, ..., x_im), and the number of points in the cluster belonging to Pi is a_i. The server sends the calculated X_i and a_i to the corresponding user Pi; each user Pi encrypts them with the BCP encryption scheme and, together with the server, uses the OPPWAP protocol to calculate
Figure BDA00012649527700000910
The final result is the calculated average.
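For orientation, the quantity that the parties compute jointly can be illustrated in plaintext. The sketch below assumes that the target is the component-wise sum of all X_i divided by the sum of all a_i; the function name and the sample values are hypothetical, and no encryption is involved.

```python
from fractions import Fraction


def cluster_mean(partial_sums, counts):
    """Combine per-user partial sums X_i and counts a_i into one cluster mean."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cluster is empty")
    m = len(partial_sums[0])
    # Component j of the mean is (sum over users of x_ij) / (sum of a_i).
    return [Fraction(sum(x[j] for x in partial_sums), total) for j in range(m)]


# Two hypothetical users contribute points to the same cluster.
X = [[30, 12], [10, 8]]  # X_1, X_2: component-wise sums of each user's points
a = [3, 1]               # a_1, a_2: number of points from each user
mean = cluster_mean(X, a)
```

Here the combined cluster holds four points, so the mean is (40/4, 20/4). In the protocol each user only learns this combined result, not the other users' X_i or a_i.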
The specific implementation method comprises the following steps:
A1: the outsourcing server S initializes through Setup, generates the public parameters PP = (N, K, g), and sends PP to each user Pi;
A2: after each user Pi obtains the public parameters, it generates its public and private key pair (pk_i, sk_i) with the key generator and sends the public key pk_i to the server;
A3: the server combines all the public keys to calculate a unified public key Prod.pk and sends it to each user Pi;
A4: user Pi encrypts its data to obtain the results (A_i, B_i) and (A_i', B_i');
A5: user Pi generates two random numbers ρ_i and ρ_i' and recalculates the encrypted data to obtain:
Figure BDA0001264952770000101
Figure BDA0001264952770000102
and sends the data to the server;
A6: after the server obtains the data, it calculates, according to the formula
Figure BDA0001264952770000103
and returns
Figure BDA0001264952770000104
and
Figure BDA0001264952770000105
to each user Pi;
A7: user Pi calculates
Figure BDA0001264952770000106
and
Figure BDA0001264952770000107
and sends them to the server;
A8: after the server obtains the data X_i and X'_i, it calculates the new data
Figure BDA0001264952770000108
and
Figure BDA0001264952770000109
then generates a random number τ and sends K, K',
Figure BDA00012649527700001010
to each user Pi;
A9: after each user Pi obtains the data, it finally calculates
Figure BDA00012649527700001011
thereby obtaining the average value of the distance between the data points in each cluster and the cluster center point.
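The steps above rely on BCP ciphertexts of the form (A, B) that can be combined without decryption. The following is a toy sketch of the textbook BCP (Bresson-Catalano-Pointcheval) construction, assuming that this is the scheme the text refers to. The primes are tiny illustrative values, the function names are hypothetical, the choice of g is simplified (a real deployment selects it more carefully), and the unified key Prod.pk and the OPPWAP message flow are not modeled.

```python
import math
import secrets


def setup(p=1019, q=1187):
    """Return toy public parameters (N, g); p and q are illustrative small primes."""
    n = p * q
    n2 = n * n
    while True:
        g = secrets.randbelow(n2 - 2) + 2
        if math.gcd(g, n) == 1:  # g must be invertible modulo N^2
            return n, g


def keygen(n, g):
    """Private key sk is a random exponent; public key is g^sk mod N^2."""
    n2 = n * n
    sk = secrets.randbelow(n2 // 2 - 1) + 1
    return pow(g, sk, n2), sk


def encrypt(n, g, pk, m):
    """Encrypt 0 <= m < N as (A, B) = (g^r, pk^r * (1 + m*N)) mod N^2."""
    n2 = n * n
    r = secrets.randbelow(n // 4 - 1) + 1
    return pow(g, r, n2), (pow(pk, r, n2) * (1 + m * n)) % n2


def decrypt(n, sk, c):
    """Recover m from B / A^sk = 1 + m*N (mod N^2)."""
    a, b = c
    n2 = n * n
    u = (b * pow(pow(a, sk, n2), -1, n2)) % n2  # modular inverse needs Python 3.8+
    return (u - 1) // n


def add(n, c1, c2):
    """Multiply ciphertexts component-wise to add the underlying plaintexts."""
    n2 = n * n
    return (c1[0] * c2[0]) % n2, (c1[1] * c2[1]) % n2


n, g = setup()
pk, sk = keygen(n, g)
total = add(n, encrypt(n, g, pk, 5), encrypt(n, g, pk, 7))
assert decrypt(n, sk, total) == 12
```

The additive property shown by `add` is what allows a server to sum encrypted components of data points without learning them.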
The invention adopts the typical K-means clustering algorithm. Where the data come from two or more sources, it protects the privacy of each data owner's personal data using cryptographic techniques and performs the computation securely using secure multi-party computation. In addition, each iteration of the K-means algorithm must calculate the distance from each data point to each center point, and this loop is time-consuming; outsourcing the calculation to the server improves efficiency.
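The distance loop that is outsourced can be pictured with a plaintext sketch of the K-means assignment step. The function names and sample points below are hypothetical; in the scheme itself the server works on ciphertexts and trapdoor information rather than on the values shown here.

```python
def squared_distance(d, e):
    """Squared Euclidean distance between a data point d and a center e."""
    return sum((x - y) ** 2 for x, y in zip(d, e))


def assign_clusters(points, centers):
    """For every point, return the index of the nearest cluster center."""
    labels = []
    for d in points:
        distances = [squared_distance(d, e) for e in centers]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical two-dimensional data with two cluster centers.
points = [(0, 0), (1, 0), (10, 10), (11, 10)]
centers = [(0, 0), (10, 10)]
labels = assign_clusters(points, centers)
assert labels == [0, 0, 1, 1]
```

The loop performs one distance calculation per point and per center, which is why its cost grows with both the number of data points and the number of clusters.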
The effects of the present invention are further illustrated below in conjunction with experimental data:
The experiment of the invention was carried out on a single machine, and the system development environment was as follows:
(1) the operating system is Windows 7, the processor is an Intel(R) Core(TM) i5-4570 CPU running at 3.2 GHz, and the system memory is 8 GB;
(2) the encryption key for BCP encryption is 512 bits; because the key is large, the computation stage is slow;
(3) the programming language is Java, the development environment is Eclipse, and the system database is MySQL.
The experimental data were downloaded from the public UCI data set repository. The original data are decimals; because BCP encryption is a group-based encryption scheme and does not support decimal operations, the data were subsequently converted to integers. The processed data consist of 10000 records with 7 attributes. Part of the data is shown in figure 3.
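One common way to turn decimals into integers is fixed-point scaling, sketched below. This is an assumption for illustration only: the text does not state how the conversion was done, and the function name, the scale s, and the sample row are hypothetical.

```python
from decimal import Decimal


def to_fixed_point(value, s):
    """Scale a decimal string with at most s fractional digits to an integer."""
    scaled = Decimal(value) * (10 ** s)
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more than s fractional digits")
    return int(scaled)


# A hypothetical record with one fractional digit per attribute (s = 1).
row = ["5.1", "3.5", "1.4"]
encoded = [to_fixed_point(v, 1) for v in row]
assert encoded == [51, 35, 14]
```

With scale s, a squared difference of encoded values carries scale 2s: (51 - 35)^2 = 256 corresponds to (5.1 - 3.5)^2 = 2.56.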
As shown in FIG. 4 and FIG. 5, to verify the correctness of the ciphertext computation of the invention, the experiment also performs K-means clustering on the plaintext. The comparison shows that, for the same data, the clustering results under ciphertext and plaintext are completely consistent, which verifies the correctness of the theory presented here.
Figs. 4 and 5 show excerpts of the clustering results. The initial cluster centers were set to three points, {15,14,2,6,4,4,6}, {1,1,1,1,1,1}, and {15,13,2,6,4,4,6}, where the first record is test data. It can be seen that, for identical data, the clustering results are completely consistent. The data encryption time is shown in Table 1; the encryption process applies the two homomorphic encryption schemes at the same time.
From Table 1, it can be seen that the time spent on encryption is within an acceptable range, and the total time spent computing the Trapdoor is slightly less than the encryption time. This is because the Trapdoor calculation must accumulate every component of each d_i: it involves fewer encryption operations, but its formula is more complicated than encryption, so it takes slightly less time than encrypting d_i, although the difference is not large.
TABLE 1 encryption time consumption
Figure BDA0001264952770000111
The time spent by the data owner in one iteration mainly includes the calculation of Trapdoor information (Trapdoor), and the time spent by the OPPWAP protocol calculation. The results of comparison in the case where the number of data points was different are shown in Table 2.
TABLE 2 one iteration elapsed time
Figure BDA0001264952770000112
Fig. 6 is a line graph comparing the time consumed in one iteration by the server and by one user. The abscissa represents the number of cluster center points, and the ordinate represents the clustering time in milliseconds (ms). It can be seen that the data owner consumes much less time than the server. Table 3 shows the time consumed by the users Alice and Bob in one iteration, including the time consumed by OPPWAP and the Trapdoor.
TABLE 3 time consumed by data owner in one iteration
Figure BDA0001264952770000121
Tables 2 and 3 show the time cost of one iteration. As the number of data points increases, the time cost at the server side increases, and the increase is not linear because it is related to the number of iterations. The time cost for Alice and Bob, however, is not related to the number of data points. Table 3 shows that the time consumed by the Trapdoor is relatively small and that most of the time is spent on the OPPWAP calculation. The number of OPPWAP calculations depends only on the number of clusters k and the dimension m of the data points, so within one iteration the OPPWAP time for Alice and Bob is essentially stable, because k and m do not change during the calculation. As Table 2 shows, when the number of data points increases, the server side bears more of the computation cost, while the data owner's cost remains essentially stable, and the time cost of the Trapdoor calculation is far less than the server's.
The number of iterations of K-means clustering cannot be controlled: it depends on the number of data points, the number of clusters, and the initial points selected each time. In practice, K (the number of clusters) is usually fixed; in this experiment K is set to 3, and the approximate relationship between the number of data points and the number of iterations is shown in Table 4.
TABLE 4 number of data points and number of iterations (for reference)
Figure BDA0001264952770000122
The communication cost of the whole clustering process mainly comprises the uploading of ciphertext data, the OPPWAP protocol, and the uploading of the new cluster centers and the Trapdoor function. The cluster centers and the Trapdoor are uploaded together, so their communication cost is calculated together. The overall communication cost is shown in Table 5. Note that the figures for the OPPWAP protocol and for the cluster centers and Trapdoor function are given for one iteration: because the number of K-means iterations cannot be controlled, the cost of one iteration is more meaningful than the total communication cost over several iterations.
TABLE 5 communication consumption
Figure BDA0001264952770000131
From Table 5, the upload time of the data is similar to the upload time of the cluster center points; the upload times are essentially the same because three-dimensional arrays are uploaded in the experiment. The overall OPPWAP time is the sum of the times that Alice and Bob each spend computing OPPWAP in one iteration in Table 2, because computation and communication are carried out essentially at the same time. Because the state of the computer differs on each run, the reported times are averages over several runs of the program.
Finally, this example experimentally compares the time efficiency of computation under ciphertext according to the present scheme with that of computation under plaintext. The comparison is divided into three parts: the time to upload plaintext data versus ciphertext data, the time consumed by one iteration under plaintext versus ciphertext, and the time consumed by the whole clustering process. To make the comparison clear, the experiment used 2000 data points with 7 attribute values.
TABLE 6 comparison of computation times for plaintext and ciphertext
Figure BDA0001264952770000132
Table 6 compares plaintext and ciphertext computation in terms of data upload time, the time of one iteration, and the time of the whole clustering process. Because the data are uploaded as a 7-dimensional array in the experiment, the upload time is essentially fixed at 79 ms. In one iteration, the ciphertext computation takes approximately twice as long as the plaintext computation because of the additional OPPWAP computation and communication. For the whole clustering computation, the experiment reads the data from a document, so data reading is fast, and the ciphertext computation takes longer than the plaintext computation mainly because of the added encryption time. Reading from a database would take 2-3 s longer than reading from a document; the document-based approach was adopted so that the timing would be accurate. Figs. 7 and 8 present the difference between plaintext and ciphertext as histograms. The ordinate represents time, in milliseconds (ms) in Fig. 7 and in seconds (s) in Fig. 8.
Compared with privacy-preserving K-means outsourced computation, the outsourced computation experiment of the invention adds only the computation time of the OPPWAP protocol. Table 5 shows that OPPWAP takes about 412 ms in one iteration; since the number of iterations cannot be controlled, the experiment reports the time of a single iteration. The time needed to transfer the Trapdoor function is essentially the same whether there are multiple parties or a single party.
In conclusion, the invention combines privacy-preserving data mining with outsourced computation and analyzes the result experimentally. The main achievements of the invention are as follows:
(1) The advantages and disadvantages of different technologies for privacy-preserving data mining are analyzed and summarized. Data perturbation techniques tend to damage the data and are a compromise between privacy protection and data mining accuracy. Cryptographic techniques do not affect the data mining result, but data encryption incurs a larger time cost. The invention adopts cryptographic techniques and improves efficiency by selecting a relatively efficient encryption algorithm and outsourcing the data;
(2) The traditional approach performs secure computation using the secure circuit evaluation method proposed by Andrew Yao (Yao Qizhi), which encrypts 0/1 strings bit by bit and therefore has a very high time cost; and privacy-preserving weighted averaging alone does not adequately handle the comparison of data point distances. The invention addresses these problems by combining an improved trapdoor encryption algorithm with privacy-preserving data mining, which also improves efficiency;
(3) An outsourced computation protocol for a privacy-preserving K-means clustering algorithm is designed. The repeated calculation of the distance between two points in the K-means algorithm is outsourced to a server and is realized through secure multi-party computation built on two encryption techniques. The improved Liu encryption scheme is used to compare the distances from data points to cluster centers and to divide the clusters; BCP encryption is used to recalculate the cluster centers;
(4) The time complexity, space complexity, and security of the invention are analyzed, and experimental verification is carried out. The invention realizes secure computation under the semi-honest model and can resist collusion attacks to a certain degree.
The above-described embodiments are intended to be illustrative, and not restrictive, of the invention, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (9)

1. The multi-user privacy protection data clustering method based on the ciphertext data is characterized by comprising the following steps of:
S1: two or more users each send their encrypted data, the cluster center points, and the trapdoor information to a server;
S2: the server calculates the distance between each ciphertext data point and each cluster center point, and divides the clusters according to the distances and the trapdoor information;
S3: the server separately sums the data points of the different users in each cluster, and sends the sum and the number of the data points to the respective users;
S4: each user re-encrypts the data, according to the received data sum and number, by a BCP encryption method and sends it to the server;
S5: the server calculates new cluster center points and sends them to each user;
S6: all users jointly calculate the average value of the data points in each cluster from the cluster center point through an outsourcing privacy protection average number calculation protocol and send the average value to the server, and the method returns to step S1 until the average value is smaller than a threshold value, whereupon the classification is finished and the server sends the classification result to each user according to the data source;
in step S1, the server is an outsourcing server, and each user encrypts the data twice, through homomorphic encryption and BCP encryption respectively; the data set
Figure DEST_PATH_IMAGE002
contains n data points, each data point
Figure DEST_PATH_IMAGE004
where m denotes that each data point is an m-dimensional vector, and for each data point
Figure DEST_PATH_IMAGE006
the component
Figure DEST_PATH_IMAGE008
is encrypted twice and uploaded to the outsourcing server as
Figure DEST_PATH_IMAGE010
wherein
Figure DEST_PATH_IMAGE012
Figure DEST_PATH_IMAGE014
Figure DEST_PATH_IMAGE016
representing the ciphertext encrypted using a homomorphic encryption scheme,
Figure DEST_PATH_IMAGE018
representing the ciphertext encrypted using the BCP encryption scheme.
2. The multi-user privacy preserving data clustering method according to claim 1, characterized in that: the processing method of step S2 includes:
S21: the server, according to the ciphertext
Figure DEST_PATH_IMAGE020
calculates, for each ciphertext data point
Figure DEST_PATH_IMAGE006A
and the t-th cluster center point
Figure DEST_PATH_IMAGE022
the distance
Figure DEST_PATH_IMAGE024
where k is the number of cluster center points,
Figure DEST_PATH_IMAGE026
S22: according to the Trapdoor function in the trapdoor information provided by the user, the outsourcing server calculates
Figure DEST_PATH_IMAGE028
compares the distances from each data point to each cluster center, selects the closest center, and assigns the point to the corresponding cluster.
3. The multi-user privacy preserving data clustering method according to claim 2, characterized in that: in step S21, each data point
Figure DEST_PATH_IMAGE030
and each cluster center point
Figure DEST_PATH_IMAGE032
are m-dimensional vectors, the encrypted data of each data point is
Figure DEST_PATH_IMAGE034
and said distance
Figure DEST_PATH_IMAGE036
is calculated as follows:
Figure DEST_PATH_IMAGE038
4. The multi-user privacy preserving data clustering method according to claim 3, characterized in that: in step S22, the Trapdoor function is used to generate an order-preserving encryption index with which the sizes of two data values can be compared.
5. The multi-user privacy preserving data clustering method according to claim 2, characterized in that: in step S3, the server side uses the ciphertext
Figure DEST_PATH_IMAGE040
to count how many data points are in each cluster and to sum the corresponding components of the data points, and sends the results to each user Pi according to the data distribution.
6. The multi-user privacy preserving data clustering method of claim 5, wherein: in steps S4-S6, each cluster center is recalculated by summing the corresponding components of the discrete points assigned to that center in each cluster and then averaging; assuming that there are n points and t users, each user Pi,
Figure DEST_PATH_IMAGE042
is the value of Pi, and is,
Figure DEST_PATH_IMAGE044
each user having an encrypted value of
Figure DEST_PATH_IMAGE046
each
Figure DEST_PATH_IMAGE048
is an m-dimensional vector; for each cluster center, the cloud sums the corresponding components of the discrete points
Figure DEST_PATH_IMAGE050
belonging to user Pi and counts those points, then
Figure DEST_PATH_IMAGE052
the result of the addition is
Figure DEST_PATH_IMAGE054
and the number of points in the cluster belonging to Pi is a_i; the server sends the calculated X_i and a_i to the corresponding user Pi, and each user Pi encrypts them with the BCP encryption scheme and, together with the server, uses the OPPWAP protocol to calculate
Figure DEST_PATH_IMAGE056
The final result is the calculated average.
7. The multi-user privacy preserving data clustering method of claim 6, wherein: the process carried out by the server and the respective users Pi based on the OPPWAP protocol comprises the following steps:
A1: the outsourcing server S initializes through Setup, generates the public parameters PP = (N, K, g), and sends PP to each user Pi;
A2: after each user Pi obtains the public parameters, it generates its public key and private key through the key generator
Figure DEST_PATH_IMAGE058
and sends the public key
Figure DEST_PATH_IMAGE060
to the server;
A3: the server combines all the public keys to calculate a unified public key Prod.pk and sends it to each user Pi;
A4: user Pi encrypts its data to obtain the results
Figure DEST_PATH_IMAGE062
and
Figure DEST_PATH_IMAGE064
A5: user Pi generates two random numbers
Figure DEST_PATH_IMAGE066
and
Figure DEST_PATH_IMAGE068
and recalculates the encrypted data to obtain:
Figure DEST_PATH_IMAGE070
Figure DEST_PATH_IMAGE072
Figure DEST_PATH_IMAGE074
Figure DEST_PATH_IMAGE076
,
and sends the data to the server;
A6: after the server obtains the data, it calculates, according to the formula
Figure DEST_PATH_IMAGE078
,
Figure DEST_PATH_IMAGE080
,
Figure DEST_PATH_IMAGE082
,
Figure DEST_PATH_IMAGE084
and returns
Figure DEST_PATH_IMAGE086
and
Figure DEST_PATH_IMAGE088
to each user Pi;
A7: user Pi calculates
Figure DEST_PATH_IMAGE090
and
Figure DEST_PATH_IMAGE092
and sends them to the server;
A8: after the server obtains the data
Figure DEST_PATH_IMAGE094
and
Figure DEST_PATH_IMAGE096
it calculates the new data
Figure DEST_PATH_IMAGE098
and
Figure DEST_PATH_IMAGE100
then generates a random number
Figure DEST_PATH_IMAGE102
and sends
Figure DEST_PATH_IMAGE104
to each user Pi;
A9: after each user Pi obtains the data, it finally calculates
Figure DEST_PATH_IMAGE106
so as to obtain the average value of the data points in each cluster from the cluster center point.
8. A system for implementing the multi-user privacy preserving data clustering method according to any one of claims 1 to 7, characterized in that: the system comprises a server and two or more users, wherein the users are used for sending encrypted data, cluster center points, and trapdoor information to the server, re-encrypting the data by a BCP encryption method according to the received data sum and number and sending it to the server, and calculating the average value of the data points in each cluster from the cluster center point and sending it to the server; the server is used for calculating the distance between the ciphertext data points and the cluster center points, dividing the clusters according to the distances and the trapdoor information, separately summing the data points of the different users in each cluster, sending the sum and the number of the data points to the respective users, calculating new cluster center points and sending them to each user, and, after the classification is finished, sending the classification result to each user according to the data source.
9. The system of claim 8, wherein: the server is an outsourced cloud server.
CN201710225047.1A 2017-04-07 2017-04-07 Multi-user privacy protection data clustering method and system based on ciphertext data Active CN107145792B (en)


Publications (2)

Publication Number Publication Date
CN107145792A CN107145792A (en) 2017-09-08
CN107145792B true CN107145792B (en) 2020-09-15

Family

ID=59775113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710225047.1A Active CN107145792B (en) 2017-04-07 2017-04-07 Multi-user privacy protection data clustering method and system based on ciphertext data

Country Status (1)

Country Link
CN (1) CN107145792B (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104601596A (en) * 2015-02-05 2015-05-06 南京邮电大学 Data privacy protection method in classification data mining system
CN105760780A (en) * 2016-02-29 2016-07-13 福建师范大学 Trajectory data privacy protection method based on road network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《Outsourcing Two-party Privacy Preserving K-means Clustering Protocol in Wireless Sensor Networks》;Liu Xiaoyan et al.;《IEEE computer society》;20151231;第124-133页 *
《隐私保护的快速聚类算法》;薛安荣 等;《系统工程与电子技术》;20091030;第2521-2526页 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant