CN107145792A - Multi-user privacy-preserving data clustering method and system based on ciphertext data

Publication number
CN107145792A
Authority
CN
China
Prior art keywords: data, user, server, point, cluster
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710225047.1A
Other languages
Chinese (zh)
Other versions
CN107145792B (en)
Inventor
王轩
蒋琳
李晔
姚霖
刘泽超
刘猛
漆舒汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201710225047.1A
Publication of CN107145792A
Application granted
Publication of CN107145792B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a multi-user privacy-preserving data clustering method and system based on ciphertext data, and belongs to the technical field of data mining. The method comprises the following steps: two or more users each send their encrypted data, cluster center points and trapdoor information to a server; the server computes the distance between each ciphertext data point and each cluster center point and partitions the data into clusters; the server separately adds up the data points of the different users within each cluster and sends each user the sum and the count of its data; each user re-encrypts the received sum and count and sends them to the server; the server computes the new cluster center points and sends them to every user; the users jointly compute, through the outsourced privacy-preserving average calculation protocol, the average distance between the data points of each cluster and the cluster center point, and send it to the server, after which the next iteration begins. The invention greatly improves clustering efficiency, achieves secure computation under the semi-honest model, and can resist collusion attacks to a certain extent.

Description

Multi-user privacy-preserving data clustering method and system based on ciphertext data
Technical field
The present invention relates to the technical field of data mining, and in particular to a multi-user privacy-preserving data clustering method based on ciphertext data. It further provides a system that implements this method.
Background art
Privacy-preserving data mining (Privacy Preserving Data Mining, PPDM) mainly addresses data mining in which two or more partners take part but none of them wants its private data leaked during the computation. It ensures that data mining can be carried out on the joint data of two or more parties while guaranteeing that the private data are not stolen by others.
Privacy-preserving data mining techniques fall mainly into methods based on data perturbation and methods based on cryptography. Perturbation-based techniques protect the privacy of the source data by adding noise to it, at the cost of some loss of accuracy. Cryptography-based techniques rely mainly on homomorphic encryption and secure multi-party computation. Compared with perturbation they distort the data little and give high accuracy, but their time complexity is often higher and their computational cost larger.
Early cryptography-based methods were mainly distributed computing methods without cloud participation. They protect data privacy mainly through Yao's secure circuit evaluation protocol or through semi-homomorphic encryption, but they are inefficient and place a heavy computational burden on every participant, which makes them hard to use in practice. Later, after 2012, Peter et al. proposed outsourced secure multi-party computation on the cloud based on the BCP encryption method, making it possible to use the cloud to reduce the participants' workload. In the same year, Asharov proposed a threshold homomorphic encryption method for multi-party computation that further improved the efficiency of cloud computation, but this method cannot protect user privacy: a user's content is easily stolen by other users.
As for clustering, the classic method is the traditional K-means algorithm. In the first iteration, K points are chosen at random from the data as cluster center points. The Euclidean distance from every other point to each cluster center is then computed, and each point is assigned to the cluster whose center is nearest. Once the assignment is complete, the mean of each component over the points in each cluster is computed to obtain the new cluster centers, which ends the first iteration, and the next iteration begins. The loop continues until the cluster centers no longer change, at which point iteration stops and the clustering is complete.
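The traditional procedure described above can be sketched in plaintext as follows. This is a minimal illustration only, with no privacy protection; the function names `squared_distance` and `kmeans`, the `max_iter` cap and the fixed `seed` are assumptions introduced for the example and do not come from the patent.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Classic plaintext K-means; returns (centers, labels)."""
    rng = random.Random(seed)
    # First iteration: choose k data points at random as the initial centers.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [min(range(k), key=lambda t: squared_distance(p, centers[t]))
                  for p in points]
        # Update step: each center becomes the component-wise mean of its points.
        new_centers = []
        for t in range(k):
            members = [p for p, lab in zip(points, labels) if lab == t]
            if not members:  # an empty cluster keeps its previous center
                new_centers.append(centers[t])
                continue
            dim = len(members[0])
            new_centers.append(tuple(sum(p[j] for p in members) / len(members)
                                     for j in range(dim)))
        if new_centers == centers:  # centers unchanged: clustering is complete
            break
        centers = new_centers
    return centers, labels
```

The assignment step is the part that the invention later moves to the outsourced server, since it touches every point and every center in each iteration.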
K-means is one of the simpler clustering algorithms: it partitions the samples into K clusters according to a given rule. However, traditional clustering algorithms cannot protect user privacy. A participant can easily obtain other users' data, so these algorithms fall short in terms of security.
Summary of the invention
To solve the problems in the prior art, the present invention provides a multi-user privacy-preserving data clustering method based on ciphertext data, and also provides a system for implementing the method.
The multi-user privacy-preserving data clustering method based on ciphertext data of the present invention comprises the following steps:
S1: Two or more users each send their encrypted data, cluster center points and trapdoor information to the server;
S2: The server computes the distance between each ciphertext data point and each cluster center point, and partitions the data into clusters according to the distances and the trapdoor information;
S3: The server separately adds up the data points of the different users within each cluster, and sends the sum and the count of the data to the respective user;
S4: Each user re-encrypts the received sum and count with the BCP encryption method and sends them to the server;
S5: The server computes the new cluster center points and sends them to every user;
S6: The users jointly compute the average distance between the data points in each cluster and the cluster center point and send it to the server. Execution then returns to step S1 until this average is less than a threshold, at which point classification ends and the server sends the classification results to each user according to the source of the data.
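The stopping rule in step S6 can be illustrated in plaintext as follows. This is a sketch under assumptions: it reads the criterion as "every cluster's average point-to-center distance is below the threshold", and the names `cluster_mean_distances` and `converged` are hypothetical. In the method itself these averages are computed jointly on encrypted data, which this example does not model.

```python
import math
from collections import defaultdict

def cluster_mean_distances(points, labels, centers):
    """Per-cluster average Euclidean distance from member points to the center."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p, lab in zip(points, labels):
        totals[lab] += math.dist(p, centers[lab])
        counts[lab] += 1
    return {lab: totals[lab] / counts[lab] for lab in totals}

def converged(points, labels, centers, threshold):
    """Assumed reading of S6: stop once every cluster average is below threshold."""
    averages = cluster_mean_distances(points, labels, centers)
    return all(d < threshold for d in averages.values())
```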
In a further improvement of the present invention, in step S1 the server is an outsourced server, and each user encrypts its data twice, once with homomorphic encryption and once with BCP encryption. The data set D = {d_1, d_2, ..., d_n} contains n data points. Each data point d_i = (x_{i,1}, ..., x_{i,m}) is an m-dimensional vector, and every component x_{i,j} of each data point d_i is encrypted twice and uploaded to the outsourced server as Enc(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)}), where 1 ≤ i ≤ n, 1 ≤ j ≤ m, c_{e(i,j)} denotes the ciphertext produced by Liu's homomorphic encryption scheme and c_{p(i,j)} denotes the ciphertext produced by the BCP encryption scheme.
In a further improvement of the present invention, the processing in step S2 comprises:
S21: From the ciphertexts c_{e(i,j)}, the server computes the distance ED^2(d_i, E_t) between the ciphertext data point d_i and the t-th cluster center point E_t, where k is the number of cluster center points and 1 ≤ t ≤ k;
S22: Using the Trapdoor function in the trapdoor information provided by the user, the outsourced server computes ED^2(d_i, E_t) + Trapdoor, compares the distances from each data point to every cluster center, selects the nearest one, and assigns the point to the corresponding cluster.
In a further improvement of the present invention, in step S21 every data point d_i = (x_{i,1}, ..., x_{i,m}) and every cluster center point E_t = (e_{t,1}, ..., e_{t,m}) is an m-dimensional vector, the encrypted data of each data point is c_{e(i,j)} = (c_{e(i,1)}, ..., c_{e(i,m)}), and the distance ED^2(d_i, E_t) is computed from these ciphertexts according to the distance formula.
In a further improvement of the present invention, in step S22 the Trapdoor function is used to generate an order-preserving encrypted index with which the sizes of two data items can be compared.
In a further improvement of the present invention, in step S3 the server uses the ciphertexts c_{p(i,j)} to compute, for each cluster, how many data points it contains and the component-wise sums of those data points, and sends the sums to each user Pi according to the data distribution.
In a further improvement of the present invention, in steps S4 to S6 each cluster center is recomputed by adding up, component by component, the points assigned to that center in the current partition and taking the average. Suppose a cluster contains n points belonging to t users Pi. The values of each user are encrypted, and each ciphertext c_{pi} is an m-dimensional vector. For each cluster center, the cloud adds up the corresponding components of the ciphertexts c_{pi} of the points belonging to Pi and counts those points. The resulting sum is X_i = (x_{i1}, x_{i2}, ..., x_{im}) and the number of points in the cluster belonging to Pi is a_i. The server sends the computed X_i and a_i to the respective user Pi. Each user Pi encrypts them with the BCP encryption scheme and then computes the required value jointly with the server through the OPPWAP protocol. The final result is the computed average.
The present invention is further improved, and server and processing procedure of each user based on OPPWAP agreements include following step Suddenly:A1:Outsourcing service device S initialized by Setup and generate common parameter PP=(N, K, g), and by common parameter PP
It is sent to each user Pi;
A2:Each user Pi is obtained after common parameter, and generating machine by key generates the public key and private key (pk of oneselfi, ski), and by public key pkiIt is sent to server;
A3:All are combined the unified public key of calculating by server, and unified public key Prod.pk is sent into each use Family Pi;
A4:The data of oneself are obtained result (A with being encrypted by user Pii,Bi) and (Ai′,Bi′);
A5:User Pi generates two random number ρiAnd ρi', re-computation is carried out to encryption data and obtained:
And send these data to server;
A6:Server is obtained after data, is calculated according to formulaAnd WillWithReturn to each user Pi;
A7:User Pi is calculated and obtainedWithAnd it is sent to server;
A8:Server obtains data XiWith X 'iAfterwards, then new data are calculatedWithThen generate Random number τ, then by K, K ',It is sent to each user Pi;
A9:Each user Pi is obtained after data, is finally calculatedIt is every so as to obtain Average value of the data point apart from cluster centre point in individual cluster.
The present invention also provides a system for implementing the above method, comprising a server and two or more users. The users send their encrypted data, cluster center points and trapdoor information to the server; re-encrypt the received sum and count with the BCP encryption method and send them to the server; and compute the average distance between the data points in each cluster and the cluster center point and send it to the server. The server computes the distance between each ciphertext data point and each cluster center point and partitions the data into clusters according to the distances and the trapdoor information; separately adds up the data points of the different users within each cluster and sends the sum and the count to the respective user; computes the new cluster center points and sends them to every user; and, after classification ends, sends the classification results to each user according to the source of the data.
In a further improvement of the present invention, the server is an outsourced cloud server.
Compared with the prior art, the beneficial effects of the present invention are as follows. Cryptographic techniques are used, and efficiency is improved by choosing a comparatively efficient encryption algorithm and by outsourcing the data. An improved trapdoor encryption algorithm is combined with privacy-preserving data mining, which further raises efficiency. Secure computation under the semi-honest model is achieved, and collusion attacks can be resisted to a certain extent.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a schematic diagram of the structure and data-processing flow of one iteration of the present invention;
Fig. 3 shows part of the clustering data;
Fig. 4 shows the clustering result on the ciphertext of the Fig. 3 data;
Fig. 5 shows the clustering result on the plaintext of the Fig. 3 data;
Fig. 6 compares the time consumed by the server and by the users during one iteration;
Fig. 7 is a histogram comparing plaintext and ciphertext data in terms of the time taken on the last data and in one iteration;
Fig. 8 is a histogram comparing plaintext and ciphertext data in terms of the time taken over the whole iteration process.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
The present invention is mainly aimed at privacy-preserving data clustering over multiple users or multiple data sources. To ensure that the private data of the data owners are not leaked during the computation, a scheme for protecting those data is required. At the same time, privacy protection imposes a large amount of computation on the data owners, so the computation must be outsourced to a server to reduce their workload. The present invention combines these two requirements: it joins the K-means clustering algorithm with multi-party data privacy protection and outsourced computation, achieving privacy protection through encryption and computing on ciphertext through secure multi-party computation. The data owners encrypt their data and upload them to the outsourced server; the server computes on the ciphertext and returns the clustering result to the data owners. Most of the computation is handed to the outsourced server and the data owners perform only a small amount, so that clustering is achieved while the owners' private data are not leaked during the clustering process. Two main difficulties must be overcome: one is outsourcing the computation of a privacy-preserving K-means clustering algorithm; the other is the computational problems caused by the varied data distribution of multi-party data sets. The present invention focuses on these two difficulties.
As shown in Fig. 1, the multi-user privacy-preserving data clustering method based on ciphertext data of the present invention comprises the following steps:
S1: Two or more users each send their encrypted data, cluster center points and trapdoor information to the server;
S2: The server computes the distance between each ciphertext data point and each cluster center point, and partitions the data into clusters according to the distances and the trapdoor information;
S3: The server separately adds up the data points of the different users within each cluster, and sends the sum and the count of the data to the respective user;
S4: Each user re-encrypts the received sum and count with the BCP encryption method and sends them to the server;
S5: The server computes the new cluster center points and sends them to every user;
S6: The users jointly compute the average distance between the data points in each cluster and the cluster center point and send it to the server. Execution then returns to step S1 until this average no longer changes, at which point classification ends and the server sends the classification results to each user according to the source of the data.
In step S1, the present invention encrypts the data with two encryption schemes: Liu's homomorphic encryption and BCP encryption. The data set D = {d_1, d_2, ..., d_n} contains n data points. Each data point d_i = (x_{i,1}, ..., x_{i,m}) is an m-dimensional vector, and every component x_{i,j} of each data point d_i is encrypted twice and uploaded to the outsourced server as Enc(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)}), where 1 ≤ i ≤ n, 1 ≤ j ≤ m, c_{e(i,j)} denotes the ciphertext produced by Liu's homomorphic encryption scheme and c_{p(i,j)} denotes the ciphertext produced by the BCP encryption scheme.
The server in this example is an outsourced cloud server. Most of the computation is carried out by it, which effectively improves clustering efficiency.
As shown in Fig. 2, one complete iteration of the present invention proceeds as follows. Each user Pi encrypts its own data and uploads them to the outsourced server. The combined data set D is equivalent to a two-dimensional table; in the encryption process every item of the combined data set D is encrypted twice and uploaded to the cloud server. The outsourced cloud server mainly computes the distance from each data point to each cluster center, receives the Trapdoor information from the data owners, compares the distances, selects the cluster center at the shortest distance, and partitions the clusters. Then, according to the resulting partition, it adds up each component of the data points in each cluster. Depending on how the data are distributed, the sums and counts of the data belonging to P1 are sent to P1, those belonging to P2 to P2, and those belonging to Pn to Pn. Each user Pi (1 <= i <= n) re-encrypts its own information and sends it back to the cloud, and the cloud computes the new cluster center points. When the cloud has finished, it sends the result to each user; the users decrypt it and send the new cluster centers to the cloud, and the next iteration begins.
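The per-owner bookkeeping in this iteration (sums and counts grouped by cluster and by data owner) can be sketched in plaintext as follows. This is only an illustration of the grouping logic: in the method the server performs these additions on BCP ciphertexts, and the function name `sums_and_counts` and the owner labels such as "P1" are assumptions made for the example.

```python
from collections import defaultdict

def sums_and_counts(points, owners, labels):
    """Group points by (cluster, owner); return component-wise sums and counts."""
    sums = {}
    counts = defaultdict(int)
    for p, owner, lab in zip(points, owners, labels):
        key = (lab, owner)
        if key in sums:
            # Add this point to the running component-wise sum for its group.
            sums[key] = tuple(s + x for s, x in zip(sums[key], p))
        else:
            sums[key] = tuple(p)
        counts[key] += 1
    return sums, dict(counts)
```

Each `(cluster, owner)` entry corresponds to what is sent back to that owner: the server needs to know only which owner a record came from, not its plaintext value.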
From the user's point of view, in one iteration each user Pi provides its own Trapdoor information (which depends on the data distribution) and waits for the server to send, for each cluster center, the sums and counts of the data belonging to it. After receiving them, Pi encrypts the data with the BCP encryption scheme and then, together with the cloud, completes the re-computation of the cluster centers using the OPPWAP protocol. Because data held by two or more users are horizontally distributed, each data point in the data set belongs to one of the users. When two or more parties recompute the cluster centers in this example, they negotiate a common set r_{1v}, r_{2v}, ..., r_{mv} with which to encrypt the values of the cluster centers, and the re-encrypted cluster centers are then returned to the outsourced cloud server, which keeps the database computation consistent. It should be emphasized that the new cluster centers are not encrypted with the BCP encryption scheme during re-encryption; that is, a cluster center point only needs to be encrypted once, with Liu's encryption scheme.
Specifically, the processing in step S2 comprises:
S21: From the ciphertexts c_{e(i,j)}, the server computes the distance ED^2(d_i, E_t) between the ciphertext data point d_i and the t-th cluster center point E_t, where k is the number of cluster center points and 1 ≤ t ≤ k;
Suppose the data set D = {d_1, d_2, ..., d_n} contains n data points and k cluster centers are set in advance; d_i (1 ≤ i ≤ n) denotes the i-th data point and E_t (1 ≤ t ≤ k) the t-th cluster center. Every data point d_i = (x_{i,1}, ..., x_{i,m}) and every cluster center point E_t = (e_{t,1}, ..., e_{t,m}) is an m-dimensional vector. In plaintext, the Euclidean distance is D(d_i, E_t) = \sqrt{\sum_{j=1}^{m} (x_{i,j} - e_{t,j})^2}, where j denotes the j-th component.
However, because this example uses two encryption schemes, each x_{i,j} is encrypted twice and uploaded as Enc(x_{i,j}) = (c_{e(i,j)}, c_{p(i,j)}). c_{e(i,j)} is used to compute the distance from a point to a center point and to partition the clusters; c_{p(i,j)} is used to recompute the cluster centers. For convenience, in the distance computation and cluster partitioning this example writes c_{i,j} in place of c_{e(i,j)}. Liu's homomorphic encryption scheme is used for computing and comparing the distances from data points to cluster centers. The encryption key in this algorithm is a list K(v). Because only one t_i ≠ 0 in the key list, the data owner only needs to upload the c_i for which t_i ≠ 0 to the outsourced server; the present invention assumes that c_1 is uploaded.
When partitioning the clusters, the distance from every point to every center point must be computed. Take d_i = (x_{i,1}, x_{i,2}, ..., x_{i,m}) and E_t = (e_{t,1}, e_{t,2}, ..., e_{t,m}) as an example. Because only c_1 is uploaded, it is written c_{i,j}: Enc(K(v), x_{i,j}) = (c_{1(i,j)}, ..., c_{v(i,j)}) and c_{i,j} = k_1*t_1*x_{i,j} + s_1*r_{v(i,j)} + k_1*(r_1 - r_{v-1}). Likewise, writing c'_{t,j} for the encrypted e_{t,j}, c'_{t,j} = k_1*t_1*e_{t,j} + s_1*r'_{v(t,j)} + k_1*(r_1 - r_{v-1}). ED^2(d_i, E_t) then denotes the distance from the encrypted data point d_i to the cluster center E_t, computed under ciphertext from c_{i,j} and c'_{t,j}.
However, the distance computed under ciphertext cannot be used directly for comparison, because a suffix related to r_v has been added to the original distance D^2(d_i, E_t). To compare sizes under ciphertext, the data owner in the trapdoor encryption, that is, the user, must also provide Trapdoor information that cancels the part affecting the distance comparison.
S22: Using the Trapdoor function in the trapdoor information provided by the user, the outsourced server computes ED^2(d_i, E_t) + Trapdoor, compares the distances from each data point to every cluster center, selects the nearest one, and assigns the point to the corresponding cluster.
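Steps S21 and S22 can be simulated with small integers as follows. This is a simplified sketch, not the scheme itself: it assumes ED^2 is the sum of squared ciphertext differences, uses the ciphertext form c = k1*t1*x + s1*r + k1*(r1 - r_{v-1}) given in the text with a shared offset term, and lets a single party compute the whole trapdoor. The constants `K1`, `T1`, `S1`, `OFFSET` and all function names are hypothetical.

```python
import random

K1, T1, S1 = 7, 3, 5        # hypothetical key components k1, t1, s1
OFFSET = K1 * (11 - 4)      # k1*(r1 - r_{v-1}), identical for every ciphertext here
A = (K1 * T1) ** 2          # multiplier a = (k1*t1)^2
SENS = 1                    # integer plaintexts: scale 0, sensitivity 10^0

def encrypt(vec, rng):
    """Return (ciphertexts, per-component randomness) for one vector."""
    rs = [rng.randrange(1, 100) for _ in vec]
    cs = [K1 * T1 * x + S1 * r + OFFSET for x, r in zip(vec, rs)]
    return cs, rs

def cipher_distance(cs_point, cs_center):
    """ED^2 as assumed here: squared distance computed on ciphertexts only."""
    return sum((c - e) ** 2 for c, e in zip(cs_point, cs_center))

def trapdoor(point, rs_point, center, rs_center, rng):
    """User side: -g(X, R) + r, cancelling the randomness-dependent suffix."""
    g = sum(2 * K1 * T1 * S1 * (x - e) * (r - q) + (S1 * (r - q)) ** 2
            for x, r, e, q in zip(point, rs_point, center, rs_center))
    return -g + rng.randrange(0, A * SENS)

def assign(index_values):
    """Server side: choose the cluster with the smallest order-preserving index."""
    return min(range(len(index_values)), key=index_values.__getitem__)
```

With these definitions, `cipher_distance(...) + trapdoor(...)` equals `A * D^2 + r` with `0 <= r < A * SENS`, so comparing the indexes gives the same ordering as comparing the plaintext squared distances.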
The trapdoor information in this example is an order-preserving encrypted index. Order-preserving indexing (Order-Preserving Indexing, OPI) was introduced in Liu's outsourced computation scheme in 2014. Given a key k and a plaintext x, the expression OPI(k, x) produces an index for x. For two plaintext values x_1 and x_2 with x_1 > x_2, the order-preserving index guarantees OPI(k, x_1) > OPI(k, x_2). The scheme does not allow x_1 and x_2 to be recovered, but it does allow their sizes to be compared.
For example, the scale of a plaintext is the number of digits after its decimal point. If a plaintext number has the form XXX.XX, its scale is 2 and its sensitivity is 10^{-2}; in general, if the scale of a set of plaintexts is s, its sensitivity is 10^{-s}. The key k of the index used in this example is a pair of numbers (a, b) with a > 0. This example denotes the plaintext sensitivity by Sens and defines OPI(k, x) = a*x + r, where r is uniformly distributed over the interval [0, a*Sens). The size of r therefore does not affect the comparison of x values: if x_1 > x_2, then OPI(k, x_1) - OPI(k, x_2) = a*(x_1 - x_2) + r_1 - r_2, and since a*(x_1 - x_2) ≥ a*Sens > |r_1 - r_2|, it follows that OPI(k, x_1) > OPI(k, x_2).
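The order-preserving property can be checked numerically with a short sketch. The function name `opi` and the use of exact rationals are choices made for this illustration; the text gives the key as a pair (a, b) but only a appears in the formula OPI(k, x) = a*x + r, so b is omitted here.

```python
import random
from fractions import Fraction

def opi(a, x, sens, rng):
    """Order-preserving index OPI(k, x) = a*x + r with r uniform in [0, a*sens)."""
    # rng.random() lies in [0, 1), so r is an exact rational in [0, a*sens).
    r = Fraction(rng.random()) * a * sens
    return a * Fraction(x) + r
```

For plaintexts with two decimal places, one would call `opi(1000, "1.23", Fraction(1, 100), rng)`. Any two distinct plaintexts differ by at least the sensitivity, so the noise can never reverse their order.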
Therefore, to compare ciphertexts in this example, an order-preserving index must be built for the form a*f(X) + g(X, R). Here a is the encryption key, X is the set of plaintext data to be compared, R is a set of random numbers, and f and g are two functions defined in terms of addition and multiplication. The sensitivity of f(X) is defined as the smallest difference between f(x_1) and f(x_2). If the scale of f(x_1) is s_1 and the scale of f(x_2) is s_2, then the scale of f(x_1) + f(x_2) is the larger of s_1 and s_2, and the scale of f(x_1)*f(x_2) is s_1 + s_2.
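The two scale rules can be checked with Python's `decimal` module, which tracks the number of decimal places through addition and multiplication. The helper name `scale` is an assumption for this illustration.

```python
from decimal import Decimal

def scale(value):
    """Number of digits after the decimal point of a finite Decimal."""
    return max(0, -value.as_tuple().exponent)

x1 = Decimal("1.25")   # scale 2
x2 = Decimal("2.5")    # scale 1
sum_scale = scale(x1 + x2)       # larger of the two scales
product_scale = scale(x1 * x2)   # sum of the two scales
```

Here the sum 3.75 has scale 2 and the product 3.125 has scale 3, matching max(s_1, s_2) and s_1 + s_2 respectively.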
Suppose the sensitivity of f(X) is Senf and the ciphertext has the form a*f(X) + g(X, R), which is to be converted to the form OPI(k, x) = a*x + r. For the outsourced server to convert a ciphertext of the form a*f(X) + g(X, R) into an order-preserving index of the form a*f(X) + r, the data owner must construct the trapdoor information -g(X, R) + r. First, ED^2(d_i, E_t) must be written in the form a*f(X) + g(X, R).
In this decomposition, f(X) collects the plaintext distance terms, g(X, R) collects the terms involving random numbers, and a is:
a = (k_1*t_1)^2
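One way to arrive at this decomposition is sketched below. It is a reconstruction under an assumption, namely that ED^2(d_i, E_t) is the sum of squared differences of the ciphertexts c_{i,j} and c'_{t,j} defined earlier; the common term k_1*(r_1 - r_{v-1}) then cancels in each difference. The result is consistent with a = (k_1*t_1)^2, but the grouping into f and g shown here is an interpretation rather than a quotation of the patent's formulas.

```latex
\begin{aligned}
ED^2(d_i,E_t)
  &= \sum_{j=1}^{m}\bigl(c_{i,j}-c'_{t,j}\bigr)^2 \\
  &= \sum_{j=1}^{m}\Bigl(k_1 t_1\,(x_{i,j}-e_{t,j}) + s_1\,\bigl(r_{v(i,j)}-r'_{v(t,j)}\bigr)\Bigr)^2 \\
  &= \underbrace{(k_1 t_1)^2}_{a}\;
     \underbrace{\sum_{j=1}^{m}(x_{i,j}-e_{t,j})^2}_{f(X)\,=\,D^2(d_i,E_t)} \\
  &\quad + \underbrace{\sum_{j=1}^{m}\Bigl[\,2\,k_1 t_1 s_1\,(x_{i,j}-e_{t,j})\bigl(r_{v(i,j)}-r'_{v(t,j)}\bigr)
     + s_1^2\bigl(r_{v(i,j)}-r'_{v(t,j)}\bigr)^2\Bigr]}_{g(X,R)}
\end{aligned}
```

Under this reading, adding the trapdoor -g(X, R) + r to ED^2(d_i, E_t) leaves a*D^2(d_i, E_t) + r, which has the order-preserving index form a*x + r.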
Assume here that the scale of D(d_i, E_t) is s; then the scale of D^2(d_i, E_t) is s + s = 2*s, so the sensitivity of D^2(d_i, E_t) is 10^{-2*s}. Because the data owner must provide the trapdoor information (Trapdoor), different data distributions lead to different forms of trapdoor information.
When computing D^2(d_i, E_t), the outsourced server does not need to compute differently for data from different data owners, but it does need to record which user each record comes from. Users must provide trapdoor information (Trapdoor) when distances are computed and clusters partitioned. Because the outsourced server knows whether each d_i comes from P1, P2 and so on, the trapdoor information for a data point is provided by the data owner to whom that point belongs.
In this example the data are horizontally distributed. If d_i comes from P1, the trapdoor information is provided by P1. The trapdoor information has the form -g(X, R) + r. The trapdoor function designed in this example consists of two parts:
Trap_{it}(d_i, E_t) + Trap_t(E_t)
The first part, Trap_{it}(d_i, E_t), is the trapdoor function for the distance from each data point to a cluster center point, and the data owner can compute it in advance. The second part, Trap_t(E_t), is the trapdoor function of each cluster center point; it changes in every iteration as the cluster centers change.
Here NB_{tj} ranges over [0, (k_1*t_1)^2*Sens] and serves as part of the random number r in -g(X, R) + r. Adding the two parts yields a trapdoor of the form -g(X, R) + r,
where one group of terms corresponds to -g(X, R) and the remaining terms to the random number r.
When recomputing the cluster centers, this example uses the second ciphertext, c_{p(i,j)}, produced by the BCP encryption scheme. The server uses c_{p(i,j)} to compute, for each cluster, how many data points it contains and the component-wise sums of those points, and sends the sums to each user Pi according to the data distribution. To compute the average of the data points of each cluster center among multiple parties, this example designs the OPPWAP protocol (outsourced privacy-preserving average calculation protocol), which reduces the required computation to evaluating an average under ciphertext.
Specifically, in steps S4 to S6 each cluster center is recomputed by adding up, component by component, the points assigned to that center in the current partition and taking the average. Suppose a cluster contains n points belonging to t users Pi. The values of each user are encrypted, and each ciphertext c_{pi} is an m-dimensional vector. For each cluster center, the cloud adds up the corresponding components of the ciphertexts c_{pi} of the points belonging to Pi and counts those points. The resulting sum is X_i = (x_{i1}, x_{i2}, ..., x_{im}) and the number of points in the cluster belonging to Pi is a_i. The server sends the computed X_i and a_i to the respective user Pi. Each user Pi encrypts them with the BCP encryption scheme and then computes the required value jointly with the server through the OPPWAP protocol. The final result is the computed average.
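In plaintext, the quantity that this re-computation yields can be illustrated as a weighted average of the per-user partial sums. The sketch below assumes the new center is the total of the X_i divided by the total of the a_i, which is the ordinary cluster mean; the exact expression evaluated by OPPWAP is not reproduced in this text, and the function name `new_center` is hypothetical.

```python
from fractions import Fraction

def new_center(partial_sums, counts):
    """Combine per-user (X_i, a_i) into the cluster mean sum(X_i) / sum(a_i)."""
    total = sum(counts)
    dim = len(partial_sums[0])
    # Exact rational arithmetic avoids rounding differences between parties.
    return tuple(Fraction(sum(X[j] for X in partial_sums)) / total
                 for j in range(dim))
```

For instance, with partial sums (6, 8) from one user holding 2 points and (3, 4) from another holding 1 point, the combined center is (3, 4), the same value that a single party holding all three points would compute.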
The concrete implementation comprises the following steps:
A1: The outsourcing server S initializes via Setup, generates the public parameters PP = (N, K, g), and sends PP to each user Pi;
A2: After obtaining the public parameters, each user Pi uses the key generator to generate its own public and private keys (pk_i, sk_i) and sends the public key pk_i to the server;
A3: The server combines all the public keys to compute a unified public key Prod.pk and sends it to each user Pi;
A4: User Pi encrypts its own data, obtaining the results (A_i, B_i) and (A_i', B_i');
A5: User Pi generates two random numbers ρ_i and ρ_i' and recomputes the encrypted data to obtain:
$$\bar{A}_i = A_i \cdot g^{\rho_i} \bmod N^2, \qquad \bar{B}_i = B_i \cdot \mathrm{Prod.pk}^{\rho_i} \bmod N^2$$
$$\bar{A}'_i = A'_i \cdot g^{\rho'_i} \bmod N^2, \qquad \bar{B}'_i = B'_i \cdot \mathrm{Prod.pk}^{\rho'_i} \bmod N^2$$
and sends these data to the server;
A6: After obtaining the data, the server computes [formula not reproduced] according to the formula and returns the results to each user Pi;
A7: User Pi computes X_i and X_i' and sends them to the server;
A8: After obtaining the data X_i and X_i', the server computes the new data [formula not reproduced], then generates a random number τ, and sends K, K' and [formula not reproduced] to each user Pi;
A9: After obtaining the data, each user Pi finally computes [formula not reproduced], thereby obtaining the average distance of the data points in each cluster from the cluster centre point.
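Step A5 can be sketched on the same kind of toy BCP-style scheme. The sketch assumes that the unified key Prod.pk of step A3 is the product of the users' public keys, and it uses the sum of the secret keys only to check the result (in the protocol no single party is meant to hold that sum). All names and constants are hypothetical and the parameters are insecure.

```python
import secrets

# Toy parameters, assumed for illustration only.
N = 1019 * 1031
N2 = N * N
G = 2


def keygen():
    """Return (pk, sk) with pk = g^sk mod N^2."""
    sk = secrets.randbelow(N) + 1
    return pow(G, sk, N2), sk


def combine_public_keys(pks):
    """Assumed form of Prod.pk: the product of all users' public keys."""
    prod = 1
    for pk in pks:
        prod = prod * pk % N2
    return prod


def encrypt(pk, m):
    """Ciphertext (A, B) = (g^r, pk^r * (1 + m*N)) mod N^2."""
    r = secrets.randbelow(N) + 1
    return pow(G, r, N2), pow(pk, r, N2) * (1 + m * N) % N2


def rerandomize(pk, c):
    """Step A5: A_bar = A * g^rho, B_bar = B * pk^rho (mod N^2)."""
    rho = secrets.randbelow(N) + 1
    a, b = c
    return a * pow(G, rho, N2) % N2, b * pow(pk, rho, N2) % N2


def decrypt(sk, c):
    """Recover m from (A, B) with the exponent matching the public key."""
    a, b = c
    u = b * pow(pow(a, sk, N2), -1, N2) % N2
    return (u - 1) // N
```

Multiplying A by g^ρ and B by Prod.pk^ρ changes both ciphertext components while leaving the plaintext untouched, because the extra factors cancel during decryption.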
The present invention uses the typical K-means algorithm among clustering methods. Where the data come from two or more parties, it uses cryptographic techniques to protect the privacy of each data owner's personal data and uses secure multi-party computation to carry out the calculation securely. In addition, every iteration of the K-means algorithm must compute the distance from each data point to each centre point, and the time cost of this repeated computation is large; the invention outsources this part of the computation to the server, which improves efficiency.
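For orientation, the distance step that is outsourced can be written in plaintext as follows. This is only a reference sketch: in the scheme itself the server works on ciphertexts and trapdoor values, not on the raw numbers. The three centres are the initial centre points reported in the experimental section; the function names are hypothetical.

```python
# Initial cluster centre points reported in the experiments (7 attributes each).
INITIAL_CENTRES = [
    [15, 14, 2, 6, 4, 4, 6],
    [1, 1, 1, 1, 1, 1, 1],
    [15, 13, 2, 6, 4, 4, 6],
]


def squared_distance(point, centre):
    """Squared Euclidean distance between two m-dimensional integer vectors."""
    return sum((x - e) ** 2 for x, e in zip(point, centre))


def assign(point, centres=INITIAL_CENTRES):
    """Return the index of the nearest centre (ties go to the lowest index)."""
    distances = [squared_distance(point, c) for c in centres]
    return distances.index(min(distances))
```

With n data points, k centres and m attributes, this step costs on the order of n*k*m operations per iteration, which is the work the server takes over from the data owners.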
The effect of the present invention is further illustrated below with experimental data.
The experiments were carried out on a single machine, with the following system development environment:
(1) The operating system is Windows 7, the processor is an Intel(R) Core(TM) i5-4570 CPU at 3.2 GHz, and the system memory is 8 GB;
(2) The BCP encryption key length is 512 bits; because the key is relatively large, execution is relatively slow at run time;
(3) The programming language is Java, the development environment is Eclipse, and the system database is MySQL.
The experimental data are data-mining data downloaded from the public UCI data sets. The original data are decimals; because BCP encryption is a group-based encryption scheme that does not support fractional arithmetic, the data were converted to integers in later processing. The processed data have 7 attributes and 10,000 records. Part of the data is shown in Figure 3.
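The text does not say how the decimals were turned into integers. One common option, shown here purely as an assumption, is fixed-point scaling: multiply every value by a power of ten and round. The function names and the default of two decimal places are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_scaled_int(value, places=2):
    """Convert a decimal value to an integer by scaling with 10**places."""
    scaled = Decimal(str(value)) * (Decimal(10) ** places)
    return int(scaled.to_integral_value(rounding=ROUND_HALF_UP))


def scale_record(record, places=2):
    """Apply the same scaling to every attribute of one data record."""
    return [to_scaled_int(v, places) for v in record]
```

Using the same scale for every attribute keeps the ordering of squared distances unchanged, so nearest-centre decisions on the scaled integers match those on the original values up to rounding.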
As shown in Figure 4 and Figure 5, to verify the correctness of the computation under ciphertext, K-means clustering was also performed on the plaintext in the experiment. The comparison shows that, for identical data, the clustering results under ciphertext and under plaintext are completely consistent; the experimental results thus verify the theoretical correctness of the scheme.
Fig. 4 and Fig. 5 show excerpts of the clustering results. Three initial cluster centre points were set: {15,14,2,6,4,4,6}, {1,1,1,1,1,1,1} and {15,13,2,6,4,4,6}; the first data record is test data. It can be seen that, for identical data, the clustering results are completely consistent. The data encryption time is shown in Table 1; the encryption process performs the two homomorphic encryptions together.
Table 1 shows that the time spent on encryption is within an acceptable range. The time spent on Trapdoor is close to the encryption time: the Trapdoor computation must accumulate every component of each d_i, so although it involves fewer encryption operations, its formula is more complex than encryption. Compared with the time spent encrypting d_i it is slightly worse, but the gap is small.
Table 1. Encryption time consumption
During one iteration, the time a data owner needs to spend consists mainly of computing the trapdoor information (Trapdoor) and the time spent on the OPPWAP protocol. Comparisons were made for different numbers of data points; the results are shown in Table 2.
Table 2. Time consumed by one iteration
Figure 6 is a line chart comparing the time consumed by the server and by one user in one iteration. The abscissa represents the number of cluster centre points, and the ordinate represents the clustering time in milliseconds (ms). It can be seen that the data owner's time consumption is much smaller than the server's. Table 3 shows the time consumed by users Alice and Bob during one iteration, which consists mainly of the time consumed by OPPWAP and Trapdoor.
Table 3. Time a data owner needs to spend in one iteration
Tables 2 and 3 both show the time cost of one iteration. They show that as the number of data points increases, the server-side time cost increases; because the increase is also related to the number of iterations, the growth is not linear. The time cost incurred by Alice and Bob, however, is not correlated with the number of data points. Table 3 shows that the time cost of Trapdoor is relatively small and that most of the time cost is taken up by the OPPWAP computation. The number of OPPWAP computations depends on the number of clusters k and the dimension m of the data points, so within one iteration the OPPWAP time consumed by Alice and Bob is basically stable, because k and m do not change during the computation. Table 2 shows that when the number of data points increases, the server bears more of the computation cost while the data owners' computation cost remains basically stable, and the time cost of the Trapdoor computation is far smaller than the server's.
The number of iterations of K-means clustering is hard to control: it depends on the number of data points, the number of clusters and the initial points selected each time. When the K-means algorithm is used, K (the number of clusters) is usually a fixed number. With the number of clusters K fixed, set to 3 in this experiment, the approximate correlation between the number of data points and the number of iterations is shown in Table 4.
Table 4. Number of data points versus number of iterations (for reference)
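The dependence of the iteration count on the initial points can be seen in a small plaintext sketch. It is not the patent's protocol: it runs ordinary K-means on plain numbers and stops when the centres no longer change, which is an assumed stopping rule (the claims use a threshold on an average instead). The function name is hypothetical.

```python
def kmeans_iterations(points, centres, max_iter=100):
    """Run plaintext K-means; return (iterations used, final centres)."""
    centres = [[float(v) for v in c] for c in centres]
    for iteration in range(1, max_iter + 1):
        # Assignment step: put each point into the cluster of its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            dists = [sum((x - y) ** 2 for x, y in zip(p, c)) for c in centres]
            clusters[dists.index(min(dists))].append(p)
        # Update step: average each cluster; keep the old centre if it is empty.
        new_centres = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centres)
        ]
        if new_centres == centres:
            return iteration, centres
        centres = new_centres
    return max_iter, centres
```

On the four one-dimensional points 1, 2, 9 and 10, starting from the centres 1 and 2 takes three passes before the centres stop moving, whereas starting from 1.5 and 9.5 stops after the first pass.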
The communication consumption of the whole clustering consists mainly of uploading the ciphertext data, running the OPPWAP protocol, and uploading the new cluster centres and Trapdoor functions. The cluster centres and Trapdoor are uploaded together here, so their communication cost is counted together. The overall communication cost is shown in Table 5. Note that the figures for the OPPWAP protocol and for the cluster centres and Trapdoor functions are for one iteration: because the number of K-means iterations cannot be controlled, the total communication consumption after many iterations would be less meaningful than the figure for a single iteration.
Table 5. Communication consumption
Table 5 shows that the time to upload the data is close to the time to upload the cluster centre points; because a three-dimensional array is uploaded in the experiment, the upload times are essentially the same. Since computation and communication in OPPWAP proceed essentially simultaneously, the overall OPPWAP time is the sum of the times that Alice and Bob each spend computing OPPWAP in one iteration in Table 2. Because the number of processes running on the computer varies from run to run, the times given are averages over multiple runs of the program.
Finally, this example compares by experiment the time efficiency of computation under ciphertext according to this scheme with that of computation under plaintext. The comparison has three parts: the difference between uploading plaintext data and ciphertext data, the time of one iteration under plaintext and under ciphertext, and the time of the whole clustering process. To make the contrast in the experimental data clear, the experiment uses 2000 data points with 7 attribute values.
Table 6. Computation time comparison between plaintext and ciphertext
Table 6 compares plaintext computation with ciphertext computation overall, in terms of the data upload time, the time of one iteration and the time of the whole clustering. Because the data are uploaded in the experiment as 7-dimensional arrays, the array upload time is essentially fixed at 79 ms. In one iteration, ciphertext computation requires the OPPWAP computation and its communication, so it takes about twice as long as plaintext computation. In the whole clustering computation, the experiment reads the data from a file, so reading is relatively fast; ciphertext computation additionally incurs the encryption time compared with plaintext computation. Reading from the database would take 2-3 s longer than reading from a file, so to time the computation accurately this example reads from a file when measuring efficiency. Figures 7 and 8 show the difference between plaintext and ciphertext as histograms. The ordinate of each histogram represents time; the unit in Fig. 7 is milliseconds (ms) and the unit in Fig. 8 is seconds (s).
Compared with privacy-preserving K-means outsourced computation, the outsourced computation experiment carried out for the present invention only adds the time for the OPPWAP protocol computation. Table 5 shows that OPPWAP takes about 412 ms in one iteration; because the number of iterations cannot be controlled, this experiment reports times on a per-iteration basis. Whether there are multiple parties or a single party, the time consumed in transmitting the Trapdoor is essentially the same.
In summary, the present invention combines privacy-preserving data mining with outsourced computation and analyses the combination experimentally. The main achievements of the present invention are as follows:
(1) The advantages and disadvantages of different techniques for privacy-preserving data mining are analysed and summarized. Data perturbation techniques often damage the data and represent a compromise between privacy protection and data mining accuracy. Cryptographic techniques do not affect the data mining results, but data encryption carries a relatively large time cost. The present invention chooses cryptographic techniques and improves efficiency by selecting relatively efficient encryption algorithms and by outsourcing data;
(2) Conventional methods apply the secure circuit evaluation method that Andrew Chi-Chih Yao proposed for secure computation; this method is implemented by bitwise encryption of 0/1 strings and has a very large time cost. Moreover, methods that rely only on the privacy-preserving weighted average problem do not handle the comparison of data point distances well enough. By combining an improved trapdoor algorithm with encrypted privacy-preserving data mining, the present invention solves these problems relatively well and also improves efficiency;
(3) An outsourced computation protocol for privacy-preserving K-means clustering is designed. The computation of the distance between two points, which K-means repeats in a loop, is outsourced to the server and realized by secure multi-party computation designed with two encryption techniques. The improved Liu encryption scheme is used to compare the distances from data points to cluster centres and to partition the clusters; BCP encryption is used for recomputing the cluster centres;
(4) Time complexity, space complexity and security analyses have been carried out for the present invention, followed by experimental verification. The present invention achieves secure computation under the semi-honest model and can resist collusion attacks to a certain extent.
The embodiments described above are preferred embodiments of the present invention and do not limit its specific scope of implementation. The scope of the present invention includes but is not limited to these embodiments, and all equivalent changes made in accordance with the present invention fall within its scope of protection.

Claims (10)

1. A multi-user privacy-preserving data clustering method based on ciphertext data, characterized by comprising the following steps:
S1: Two or more users each send their encrypted data, cluster centre points and trapdoor information to a server;
S2: The server computes the distances between the ciphertext data points and the cluster centre points, and partitions the clusters according to the distances and the trapdoor information;
S3: The server adds up the data points of the different users in each cluster separately, and sends each sum and count of the data to the respective user;
S4: Each user re-encrypts the received data sum and count with the BCP encryption method and sends them to the server;
S5: The server computes new cluster centre points and sends them to each user;
S6: The users jointly compute, through the outsourced privacy-preserving average computation protocol, the average distance of the data points in each cluster from the cluster centre point and send it to the server; execution returns to step S1 until the average is less than a threshold, at which point classification ends and the server sends the classification results to each user according to the data source.
2. The multi-user privacy-preserving data clustering method according to claim 1, characterized in that: in step S1, the server is an outsourcing server, and each user encrypts its data twice, with homomorphic encryption and with BCP encryption respectively; the data set D = {d_1, d_2, ..., d_n} contains n data points, each data point d_i = (x_{i,1}, ..., x_{i,m}), where m indicates that each data point is an m-dimensional vector; each component x_{i,j} of each data point d_i is encrypted twice and uploaded to the outsourcing server as Enc(x_{i,j}) = (c_e(i,j), c_p(i,j)), where 1 ≤ i ≤ n, 1 ≤ j ≤ m, c_e(i,j) denotes the ciphertext under Liu's homomorphic encryption scheme, and c_p(i,j) denotes the ciphertext under the BCP encryption scheme.
3. The multi-user privacy-preserving data clustering method according to claim 2, characterized in that the processing of step S2 comprises:
S21: The server computes, from the ciphertexts c_e(i,j), the distance ED^2(d_i, E_t) between ciphertext data point d_i and the t-th cluster centre point E_t, where k is the number of cluster centre points and 1 ≤ t ≤ k;
S22: Using the Trapdoor function in the trapdoor information provided by the users, the outsourcing server computes ED^2(d_i, E_t) + Trapdoor to compare the distances from each data point to each cluster centre, selects the nearest centre, and assigns the point to the corresponding cluster.
4. The multi-user privacy-preserving data clustering method according to claim 3, characterized in that: in step S21, each data point d_i = (x_{i,1}, ..., x_{i,m}) and each cluster centre point E_t = (e_{t,1}, ..., e_{t,m}) are m-dimensional vectors, the encrypted data of each data point is c_e(i,j) = (c_e(i,1), ..., c_e(i,m)), and the distance ED^2(d_i, E_t) is calculated by the formula:
5. The multi-user privacy-preserving data clustering method according to claim 4, characterized in that: in step S22, the Trapdoor function is used to generate an order-preserving encrypted index with which the magnitudes of two data values can be compared.
6. The multi-user privacy-preserving data clustering method according to claim 3, characterized in that: in step S3, the server uses the ciphertexts c_p(i,j) to determine how many data points each cluster contains and to compute the sum of the corresponding components of those data points, and sends each sum to the respective user Pi according to the data distribution.
7. The multi-user privacy-preserving data clustering method according to claim 6, characterized in that: in steps S4 to S6, each cluster centre is recomputed by adding up the corresponding components of the discrete points that belong to that centre within its cluster and taking the average; suppose a cluster contains n points from t users; for each user Pi, [formula not reproduced] is Pi's value, and the value after each user's encryption is [formula not reproduced]; each c_pi is an m-dimensional vector; for each cluster centre, the cloud adds up the corresponding components of the discrete points c_pi belonging to Pi and counts them; the addition result is X_i = (x_i1, x_i2, ..., x_im), and the number of points in the cluster that belong to Pi is a_i; the server sends the computed X_i and a_i to the respective user Pi; after each user Pi encrypts them with the BCP encryption scheme, the users and the server jointly run the OPPWAP protocol to compute the value of [formula not reproduced], and the final result is the required average.
8. The multi-user privacy-preserving data clustering method according to claim 7, characterized in that the processing carried out by the server and each user Pi under the OPPWAP protocol comprises the following steps:
A1: The outsourcing server S initializes via Setup, generates the public parameters PP = (N, K, g), and sends PP to each user Pi;
A2: After obtaining the public parameters, each user Pi uses the key generator to generate its own public and private keys (pk_i, sk_i) and sends the public key pk_i to the server;
A3: The server combines all the public keys to compute a unified public key Prod.pk and sends it to each user Pi;
A4: User Pi encrypts its own data, obtaining the results (A_i, B_i) and (A_i', B_i');
A5: User Pi generates two random numbers ρ_i and ρ'_i and recomputes the encrypted data to obtain:
$$\bar{A}_i = A_i \cdot g^{\rho_i} \bmod N^2, \qquad \bar{B}_i = B_i \cdot \mathrm{Prod.pk}^{\rho_i} \bmod N^2$$
$$\bar{A}'_i = A'_i \cdot g^{\rho'_i} \bmod N^2, \qquad \bar{B}'_i = B'_i \cdot \mathrm{Prod.pk}^{\rho'_i} \bmod N^2,$$
and sends these data to the server;
A6: After obtaining the data, the server computes [formula not reproduced] according to the formula and returns the results to each user Pi;
A7: User Pi computes X_i and X_i' and sends them to the server;
A8: After obtaining the data X_i and X_i', the server computes the new data [formula not reproduced], then generates a random number τ, and sends K, K' and [formula not reproduced] to each user Pi;
A9: After obtaining the data, each user Pi finally computes [formula not reproduced], thereby obtaining the average distance of the data points in each cluster from the cluster centre point.
9. A system implementing the multi-user privacy-preserving data clustering method according to any one of claims 1-8, characterized by comprising a server and two or more users; the users are configured to send the encrypted data, cluster centre points and trapdoor information to the server, to re-encrypt the received data sum and count with the BCP encryption method and send them to the server, and to compute the average distance of the data points in each cluster from the cluster centre point and send it to the server; the server is configured to compute the distances between the ciphertext data points and the cluster centre points, to partition the clusters according to the distances and the trapdoor information, to add up the data points of the different users in each cluster separately and send each sum and count of the data to the respective user, to compute new cluster centre points and send them to each user, and, after classification ends, to send the classification results to each user according to the data source.
10. The system according to claim 9, characterized in that the server is an outsourced cloud server.
CN201710225047.1A 2017-04-07 2017-04-07 Multi-user privacy protection data clustering method and system based on ciphertext data Active CN107145792B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710225047.1A CN107145792B (en) 2017-04-07 2017-04-07 Multi-user privacy protection data clustering method and system based on ciphertext data

Publications (2)

Publication Number Publication Date
CN107145792A true CN107145792A (en) 2017-09-08
CN107145792B CN107145792B (en) 2020-09-15

Family

ID=59775113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710225047.1A Active CN107145792B (en) 2017-04-07 2017-04-07 Multi-user privacy protection data clustering method and system based on ciphertext data

Country Status (1)

Country Link
CN (1) CN107145792B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104601596A (en) * 2015-02-05 2015-05-06 南京邮电大学 Data privacy protection method in classification data mining system
CN105760780A (en) * 2016-02-29 2016-07-13 福建师范大学 Trajectory data privacy protection method based on road network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
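LIU XIAOYAN et al.: "《Outsourcing Two-party Privacy Preserving K-means Clustering Protocol in Wireless Sensor Networks》", 《IEEE COMPUTER SOCIETY》 *
XUE Anrong (薛安荣) et al.: "《Fast Clustering Algorithm with Privacy Protection (隐私保护的快速聚类算法)》", 《Systems Engineering and Electronics (系统工程与电子技术)》 *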

Also Published As

Publication number Publication date
CN107145792B (en) 2020-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant