CN112307499B - Mining method for encrypted data frequent item set in cloud computing - Google Patents


Info

Publication number
CN112307499B
Authority
CN
China
Prior art keywords
data
mining
ciphertext
encrypted
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011193510.7A
Other languages
Chinese (zh)
Other versions
CN112307499A (en
Inventor
程梓岩
郑培嘉
陈梓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202011193510.7A priority Critical patent/CN112307499B/en
Publication of CN112307499A publication Critical patent/CN112307499A/en
Application granted granted Critical
Publication of CN112307499B publication Critical patent/CN112307499B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for mining frequent item sets from encrypted data in cloud computing, addressing the problem that existing methods cannot simultaneously guarantee the correctness of the mining result, privacy security, and efficiency. A user generates a fully homomorphic encryption private key and a bootstrap key, retains the private key while also transmitting it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server. The data are encrypted with the private key and uploaded to the cloud server. The data mining party submits a query requirement to the cloud server; on receiving it, the cloud server mines the data using homomorphic operations and returns the encrypted mining result to the data mining party. The data mining party decrypts the result with the fully homomorphic encryption private key and confirms the frequent item set, so that both the correctness of the mining result and privacy security are preserved.

Description

Mining method for encrypted data frequent item set in cloud computing
Technical Field
The invention relates to the technical field of encryption domain data processing, in particular to a mining method for encrypted data frequent item sets in cloud computing.
Background
With the development of cloud computing, many cloud service providers may offer easily accessible cloud storage and computing resources. Depending on the needs of the user, cloud service providers may offer specified services to the user, including a range of data mining algorithms, which are widely used in practice. However, the data submitted by the user may contain extremely sensitive information (e.g., personal location, medical information, or business data, etc.) that the user does not want to reveal. Therefore, mining these private data inevitably brings about an important privacy disclosure problem. In this case, privacy-preserving data mining has attracted considerable attention in recent years with the aim of mining databases without accessing the original content of the data.
In practical application, association rule mining is a data mining method for finding potential relationships between variables in a large-scale dataset, frequent item set mining is a sub-process of association rule mining, and before an association rule is generated, all frequent item sets must be found through some frequent item set mining algorithm.
On August 31, 2018, a Chinese patent application (publication number CN 108475292A) disclosed a frequent item set mining method for large-scale data sets. The method first estimates a sample capacity, collects a sample data set of that capacity from the large-scale data set, and mines the closed frequent item sets in the sample data set. It then calculates the maximum length constraint corresponding to the large-scale data set to generate a reduced data set, constructs noise based on the reduced data set, selects a candidate set using the noise and a noise threshold, applies the noise to the candidate set for privacy protection, and finally selects a preset number of frequent item sets from the candidate set.
Disclosure of Invention
To solve the problem that conventional methods for mining frequent item sets from encrypted data cannot simultaneously guarantee the correctness of the mining result, privacy security, and efficiency, the invention provides a mining method for frequent item sets of encrypted data in cloud computing. The method completes frequent item set mining on encrypted data in the cloud and returns a correct encrypted mining result without revealing private data.
To achieve the above technical effects, the technical scheme of the invention is as follows:
A mining method for frequent item sets of encrypted data in cloud computing comprises at least the following steps:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
S3. According to its service requirement, the data mining party submits a query requirement to the cloud server; after receiving the query requirement, the cloud server mines the data using homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the mining result with the fully homomorphic encryption private key and confirms the frequent item set from the decrypted result.
In this technical scheme, the whole mining method is built on a three-party model consisting of the user, the cloud server, and the data mining party. The user and the data mining party hold the encryption private key, and the cloud server holds the bootstrap key needed during mining. The user encrypts the data with the private key and uploads them to the cloud server, which performs the mining computation after receiving the query requirement submitted by the data mining party. Because no interactive computation with other servers is needed and all mining is completed on the cloud server alone, efficiency is improved. Because the cloud server performs only homomorphic operations on encrypted data and never decrypts, privacy security is improved. In addition, the scheme does not rely on third-party noise or similar means to protect privacy, so the correctness of the mining result is ensured.
Preferably, the process of generating the fully homomorphic encryption private key and the bootstrap key in step S1 is as follows:
given a security parameter $\lambda$, the encryption private key SK and the bootstrap key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
where TFHE.KeyGen denotes the procedure that generates the encryption private key SK and the bootstrap key BK.
The private key SK is sent to all users and data mining parties via a secure channel and is used for encrypting data; the bootstrap key BK is sent to the cloud server and is used for homomorphic evaluation of ciphertexts in the encrypted domain.
Preferably, the data in step S2 include transaction data and mining parameter data. The transaction data are encoded into a Boolean matrix and then encrypted with the fully homomorphic encryption private key, as follows:
each user holds transaction data; let the set of transaction data held by the i-th user be $T_i = \{T_{i1}, \ldots, T_{i m_i}\}$,
where $m_i$ is the number of transactions; the i-th user encrypts the transaction data set bit by bit, i.e.
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
where $T_{ij}(k)$ denotes the k-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes that element encrypted with the encryption private key SK, and j is a positive integer not exceeding $m_i$. The i-th user sends its encrypted transaction set to the cloud server, and the encrypted transactions of all users are denoted $C = \{C_1, \ldots, C_m\}$;
The mining parameter data is the minimum support threshold, an unsigned integer. It is encoded into a binary vector, encrypted into minsuppCtxt with the fully homomorphic encryption private key, and transmitted to the cloud server. It is transmitted only once, before the data mining party submits a query requirement. Because all data are encrypted before being outsourced to the cloud server, privacy security in the data mining process is ensured.
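As an illustration of the encoding step only, the following Python sketch turns item sets into rows of a Boolean matrix and an unsigned threshold into a binary vector. The item names, the helper names `encode_transaction` and `encode_uint`, and the most-significant-bit-first ordering are assumptions made for this sketch; the bit-by-bit encryption with TFHE.Enc is not shown.

```python
from typing import List, Sequence, Set


def encode_transaction(transaction: Set[str], items: Sequence[str]) -> List[int]:
    """Return a 0/1 row with a 1 wherever the transaction contains the item."""
    return [1 if item in transaction else 0 for item in items]


def encode_uint(value: int, width: int) -> List[int]:
    """Return the binary vector of an unsigned integer, most significant bit first."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the given width")
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


# Hypothetical item universe and transactions, used only for illustration.
items = ["bread", "milk", "eggs", "tea"]
transactions = [{"bread", "milk"}, {"milk", "eggs", "tea"}, {"bread", "milk", "eggs"}]

matrix = [encode_transaction(t, items) for t in transactions]  # Boolean matrix
minsupp_bits = encode_uint(2, 4)  # minimum support threshold 2 on 4 bits
```

Each entry of `matrix` and `minsupp_bits` is what a user or data mining party would then encrypt individually before upload.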
Preferably, after step S2 and before step S3, the method further includes encrypting the query requirement of the data mining party bit by bit, as follows:
the query requirement is recorded as a Boolean vector q of length n, and the encryption formula is:
$\mathrm{queryCtxt} \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \ldots, \mathrm{TFHE.Enc}(SK, q_n))$
where $q_i$ denotes the i-th bit of the Boolean vector q; the encrypted query requirement ciphertext vector queryCtxt is transmitted to the cloud server.
Preferably, the process by which the cloud server mines the data in step S3 after receiving the query requirement includes:
S31. Initialization: the cloud server initializes an encrypted counter accCtxt $= (\mathrm{accCtxt}_1, \ldots, \mathrm{accCtxt}_k)$, a Boolean vector whose k elements are all ciphertexts of 0;
S32. The encrypted transaction set $C = \{C_1, \ldots, C_m\}$, the query requirement ciphertext vector queryCtxt, and the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold are taken as the input of the cloud server; for i from 1 to m, the following calculations are performed;
S33. The secure subset decision algorithm SecSubDet is applied to queryCtxt and $C_i$ to obtain a ciphertext indicating whether transaction $C_i$ contains the encrypted query requirement ciphertext vector queryCtxt;
S34. The secure counting algorithm SecAccum is used to accumulate this ciphertext onto the counter accCtxt;
S35. Judge whether the number of updates has reached m; if not, return to step S33; if so, use the secure comparison algorithm SecCmp to compare the counter accCtxt with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold;
the output of SecCmp is an encrypted Boolean value, the ciphertext of the comparison result, which is used to judge whether the query requirement ciphertext vector queryCtxt is a frequent item set and constitutes the mining result.
During mining, the cloud server runs the secure subset decision, secure counting, and secure comparison algorithms in sequence on the encrypted data (the encrypted query requirement vector and the transaction data) in the encrypted domain. These algorithms rely on fully homomorphic encryption and on the hardness of an extension of the learning with errors problem, which provides a very high level of security.
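The following plaintext sketch states what steps S31 to S35 are meant to compute, with no encryption involved. It assumes the usual definition that an item set is frequent when its support count is at least the minimum support threshold; how the encrypted comparison bit maps to that decision is not fully recoverable from this text. The function name `is_frequent_plain` and the sample data are hypothetical.

```python
from typing import List, Sequence


def is_frequent_plain(query: Sequence[int],
                      transactions: Sequence[Sequence[int]],
                      minsupp: int) -> bool:
    """Plaintext reference for the pipeline: subset test, count, compare."""
    count = 0  # plays the role of the counter accCtxt
    for row in transactions:
        # Subset decision: every item requested by the query is in the row.
        contained = all(t_bit == 1 for q_bit, t_bit in zip(query, row) if q_bit == 1)
        count += int(contained)  # accumulation step
    # Assumed decision rule: frequent when the count reaches the threshold.
    return count >= minsupp


# Hypothetical Boolean matrix with three transactions over four items.
rows: List[List[int]] = [[1, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 0]]
first_two_items = is_frequent_plain([1, 1, 0, 0], rows, 2)
last_two_items = is_frequent_plain([0, 0, 1, 1], rows, 2)
```

In the encrypted setting the cloud server evaluates the same three stages gate by gate on ciphertexts and never sees `count` or the final Boolean in the clear.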
Preferably, the process by which step S33 obtains the ciphertext using the secure subset decision algorithm is as follows:
S331. Take the ciphertext query requirement vector queryCtxt $= (c_{x,1}, \ldots, c_{x,n})$ and the encrypted transaction $C_i = (c_{y,1}, \ldots, c_{y,n})$, whose corresponding plaintexts are the query requirement vector $q = (q_1, \ldots, q_n)$ and the transaction $T_i = (T_1, \ldots, T_n)$;
S332. Initialize the result ciphertext as an encryption of 1 under the fully homomorphic scheme based on the torus extension of learning with errors; for i from 1 to n, perform the following calculation;
S333. Compute $c = \mathrm{HomORNY}(c_{x,i}, c_{y,i})$, then update the result ciphertext by applying the HomAND operation to it and c;
S334. Judge whether the number of updates has reached n; if so, output the updated result ciphertext; otherwise, return to step S333;
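A plaintext sketch of the subset decision loop may help in reading steps S331 to S334. Ordinary bit operations stand in for the homomorphic gates, and HomORNY(a, b) is assumed to compute (NOT a) OR b, so a query bit of 1 forces the transaction bit to be 1. The function names are hypothetical.

```python
from typing import Sequence


def orny(a: int, b: int) -> int:
    """Plaintext stand-in for HomORNY, assumed to mean (NOT a) OR b."""
    return (1 - a) | b


def hom_and(a: int, b: int) -> int:
    """Plaintext stand-in for HomAND."""
    return a & b


def sec_sub_det_plain(query_bits: Sequence[int], transaction_bits: Sequence[int]) -> int:
    """Return 1 if every item selected by the query appears in the transaction."""
    if len(query_bits) != len(transaction_bits):
        raise ValueError("query and transaction must have the same length n")
    result = 1  # corresponds to initializing the result as an encryption of 1
    for q_bit, t_bit in zip(query_bits, transaction_bits):
        c = orny(q_bit, t_bit)        # 0 only when the query needs a missing item
        result = hom_and(result, c)   # fold the per-item check into the result
    return result
```

The loop performs exactly n ORNY gates and n AND gates regardless of the data, which is what allows it to run on ciphertexts without branching on secret values.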
Preferably, the process by which the secure counting algorithm SecAccum in step S34 accumulates the ciphertext onto the counter is as follows:
S341. Take the ciphertext vector accCtxt $= (\mathrm{accCtxt}_1, \ldots, \mathrm{accCtxt}_k)$ and the ciphertext produced by the secure subset decision algorithm; these represent, respectively, the binary representation of the integer accumulator and the result of the secure subset decision algorithm;
S342. Initialize the ciphertext carry under the fully homomorphic scheme based on the torus extension of learning with errors; for i decreasing from k to 1, perform the following calculation;
S342. Perform the calculation for $\mathrm{accCtxt}_i$, then update the carry using the HomAND operation, with the update formula:
$\mathrm{carry} = \mathrm{HomAND}(\mathrm{carry}, \mathrm{accCtxt}_i)$;
S343. Judge whether the number of updates has reached k; if so, output the updated ciphertext vector; otherwise, return to step S342;
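The accumulation step can be pictured as a ripple-carry increment. The sketch below is a plaintext illustration under assumptions that are not confirmed by the text: the carry starts as the subset decision bit, each counter bit is replaced by the XOR of its old value and the carry, and the carry is then ANDed with the old value of that bit. Index 1 is taken as the most significant bit, so the loop runs from the last position to the first.

```python
from typing import List, Sequence


def sec_accum_plain(acc_bits: Sequence[int], bit: int) -> List[int]:
    """Add a single bit to a counter stored most significant bit first."""
    acc = list(acc_bits)
    carry = bit  # assumed initial carry: the subset decision result
    for i in range(len(acc) - 1, -1, -1):  # i decreasing from k to 1
        old = acc[i]
        acc[i] = old ^ carry   # assumed bit update (XOR gate)
        carry = carry & old    # carry survives only if the old bit was 1
    return acc  # an overflow carry is dropped, so the counter wraps around
```

Adding 1 to `[0, 1, 1]` (three) gives `[1, 0, 0]` (four), and adding 0 leaves the counter unchanged, so the same gate sequence works whether or not the transaction matched.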
Preferably, the process in step S35 of comparing the counter accCtxt with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold using the secure comparison algorithm SecCmp is as follows:
S351. Take the counter accCtxt $= (c_{x,1}, \ldots, c_{x,k})$ and the ciphertext vector minsuppCtxt $= (c_{y,1}, \ldots, c_{y,k})$ of the unsigned-integer minimum support threshold, whose plaintext vectors are $x = (x_1, \ldots, x_k)$ and $y = (y_1, \ldots, y_k)$;
S352. Initialize the result ciphertext as $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, perform the following calculation;
S353. First initialize the intermediate ciphertext for the current i; then, for j from 1 to $i-1$, perform the following calculation;
S354. Compute $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$, then update the value of the intermediate ciphertext;
S355. Judge whether the current value of j has reached $i-1$; if not, return to step S354; if so, update the result ciphertext;
S356. Judge whether the current value of i has reached k; if not, return to step S353; if so, output the result ciphertext, which encrypts the comparison result b: if b = 1, then $x < y$; if b = 0, then $x \geq y$.
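The comparison can be read as the standard rule that x < y exactly when, at some position i, x has a 0 where y has a 1 and all more significant positions agree. The plaintext sketch below follows that reading. The initial value of the intermediate term (assumed to be the ANDNY of the bits at position i), its update by AND, and the merge into the result by OR are assumptions, since those formulas are not legible in this text; HomANDNY(a, b) is assumed to mean (NOT a) AND b.

```python
from typing import Sequence


def andny(a: int, b: int) -> int:
    """Plaintext stand-in for HomANDNY, assumed to mean (NOT a) AND b."""
    return (1 - a) & b


def xnor(a: int, b: int) -> int:
    """Plaintext stand-in for HomXNOR: 1 when the two bits are equal."""
    return 1 - (a ^ b)


def sec_cmp_plain(x_bits: Sequence[int], y_bits: Sequence[int]) -> int:
    """Return 1 if x < y, else 0; both inputs are most significant bit first."""
    if len(x_bits) != len(y_bits) or not x_bits:
        raise ValueError("inputs must be non-empty and of equal length k")
    result = andny(x_bits[0], y_bits[0])       # x < y decided at the top bit
    for i in range(1, len(x_bits)):
        term = andny(x_bits[i], y_bits[i])     # x_i = 0 and y_i = 1
        for j in range(i):
            c = xnor(x_bits[j], y_bits[j])     # higher bit j must be equal
            term = term & c
        result = result | term                 # any deciding position suffices
    return result
```

The nested loop uses on the order of k squared gates, which stays small because k is only the bit width of the counter.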
Preferably, the process in step S4 by which the data mining party decrypts the mining result with the fully homomorphic encryption private key and confirms the frequent item set is as follows:
the data mining party decrypts the mining result using the encryption private key SK;
the decryption result is a Boolean value: if it is 1, the query requirement submitted by the data mining party is a frequent item set; otherwise, it is not a frequent item set.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The invention provides a mining method for frequent item sets of encrypted data in cloud computing. The user and the data mining party hold the encryption private key, and the cloud server holds the bootstrap key needed during mining. The user encrypts the data with the private key and uploads them to the cloud server, which performs the mining computation after receiving the query requirement submitted by the data mining party. All mining is completed on the cloud server alone, which improves efficiency, and the cloud server performs only homomorphic operations on encrypted data without any decryption, which improves privacy security.
Drawings
Fig. 1 shows a flow diagram of a method for mining a data frequent item set in cloud computing according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of the running time in the embodiment of the invention when the number n of items is fixed and the number m of transactions varies.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for better illustration of the present embodiment, some parts of the drawings may be omitted, enlarged or reduced, and do not represent actual dimensions;
it will be appreciated by those skilled in the art that some well known descriptions in the figures may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
Fig. 1 shows a flowchart of the method for mining frequent item sets of encrypted data in cloud computing. The steps include:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
S3. According to its service requirement, the data mining party submits a query requirement to the cloud server; after receiving the query requirement, the cloud server mines the data using homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the mining result with the fully homomorphic encryption private key and confirms the frequent item set from the decrypted result.
In this embodiment, the process of generating the fully homomorphic encryption private key and the bootstrap key in step S1 is as follows:
given a security parameter $\lambda$, the encryption private key SK and the bootstrap key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
where TFHE.KeyGen denotes the procedure that generates the encryption private key SK and the bootstrap key BK. The private key SK is sent to all users and data mining parties through a secure channel and is used for encrypting data; the bootstrap key BK is sent to the cloud server and is used for homomorphic evaluation of ciphertexts in the encrypted domain.
The cryptosystem used is a fully homomorphic encryption scheme based on an extension of the learning with errors (LWE) hardness problem. The LWE-based details of the above process are:
(1) Key generation: the private key $s \in \mathbb{B}^n$ of the LWE encryption scheme is a uniformly random binary vector.
(2) Encryption: the plaintext to be encrypted is $\mu \in \mathbb{T}$, and the ciphertext is $c = (a, b)$, where $a \in \mathbb{T}^n$ is a uniformly randomly sampled vector and $b = a \cdot s + e + \mu$, with e the noise.
(3) Decryption: given the ciphertext $c = (a, b)$ and the key s, compute $\mu + e = b - a \cdot s$. If the growth of the noise e is kept within a certain range, the plaintext $\mu$ can be decrypted correctly.
(4) Bootstrapping: unlike a partially homomorphic encryption scheme, the TFHE fully homomorphic encryption scheme provides a Bootstrap algorithm. For a given LWE ciphertext $c = (a, b)$, it constructs a ciphertext of the same plaintext under the same key s but with a fixed amount of noise; in other words, bootstrapping refreshes the noise in the ciphertext. The bootstrapping of ciphertext c is denoted Bootstrap(c).
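Items (1) to (3) can be tried out with a toy LWE sketch in Python. Everything concrete here is an assumption for illustration: the torus is discretized as integers modulo $2^{32}$, a bit is encoded as 0 or one half of the torus, and the key length and noise bound are arbitrary and offer no security. Bootstrapping is not shown.

```python
import random
from typing import List, Tuple

Q = 1 << 32  # toy discretization of the torus as integers modulo Q

Ciphertext = Tuple[List[int], int]


def keygen(n: int, rng: random.Random) -> List[int]:
    """(1) Key generation: a uniformly random binary vector s of length n."""
    return [rng.randrange(2) for _ in range(n)]


def encrypt(s: List[int], bit: int, rng: random.Random,
            noise_bound: int = 1 << 20) -> Ciphertext:
    """(2) Encryption: c = (a, b) with b = a.s + e + mu."""
    a = [rng.randrange(Q) for _ in s]
    e = rng.randint(-noise_bound, noise_bound)  # small noise
    mu = bit * (Q // 2)                         # assumed message encoding
    b = (sum(ai * si for ai, si in zip(a, s)) + e + mu) % Q
    return a, b


def decrypt(s: List[int], ct: Ciphertext) -> int:
    """(3) Decryption: compute mu + e = b - a.s and round away the noise."""
    a, b = ct
    phase = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    return 1 if Q // 4 <= phase < 3 * Q // 4 else 0


rng = random.Random(2020)
secret = keygen(16, rng)
roundtrip = [decrypt(secret, encrypt(secret, bit, rng)) for bit in (0, 1, 1, 0)]
```

Decryption succeeds here because the noise bound is far below a quarter of Q; after many homomorphic operations the noise would grow past that margin, which is the situation bootstrapping is meant to prevent.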
In this embodiment, the data in step S2 include transaction data and mining parameter data. The transaction data are encoded into a Boolean matrix and then encrypted with the fully homomorphic encryption private key, as follows:
each user holds transaction data; let the set of transaction data held by the i-th user be $T_i = \{T_{i1}, \ldots, T_{i m_i}\}$,
where $m_i$ is the number of transactions; the i-th user encrypts the transaction data set bit by bit, i.e.
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
where $T_{ij}(k)$ denotes the k-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes that element encrypted with the encryption private key SK, and j is a positive integer not exceeding $m_i$. The i-th user sends its encrypted transaction set to the cloud server, and the encrypted transactions of all users are denoted $C = \{C_1, \ldots, C_m\}$;
The mining parameter data is the minimum support threshold, an unsigned integer. It is encoded into a binary vector, encrypted into minsuppCtxt with the fully homomorphic encryption private key, and transmitted to the cloud server. It is transmitted only once, before the data mining party submits a query requirement. Because all data are encrypted before being outsourced to the cloud server, privacy security in the data mining process is ensured.
In this embodiment, after step S2 and before step S3, the query requirement of the data mining party is also encrypted bit by bit, as follows:
the query requirement is recorded as a Boolean vector q of length n, and the encryption formula is:
$\mathrm{queryCtxt} \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \ldots, \mathrm{TFHE.Enc}(SK, q_n))$
where $q_i$ denotes the i-th bit of the Boolean vector q; the encrypted query requirement ciphertext vector queryCtxt is transmitted to the cloud server.
In this embodiment, the process by which the cloud server mines the data in step S3 after receiving the query requirement includes:
S31. Initialization: the cloud server initializes an encrypted counter accCtxt $= (\mathrm{accCtxt}_1, \ldots, \mathrm{accCtxt}_k)$, a Boolean vector whose k elements are all ciphertexts of 0;
S32. The encrypted transaction set $C = \{C_1, \ldots, C_m\}$, the query requirement ciphertext vector queryCtxt, and the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold are taken as the input of the cloud server; for i from 1 to m, the following calculations are performed;
S33. The secure subset decision algorithm SecSubDet is applied to queryCtxt and $C_i$ to obtain a ciphertext indicating whether transaction $C_i$ contains the encrypted query requirement ciphertext vector queryCtxt;
S34. The secure counting algorithm SecAccum is used to accumulate this ciphertext onto the counter accCtxt;
S35. Judge whether the number of updates has reached m; if not, return to step S33; if so, use the secure comparison algorithm SecCmp to compare the counter accCtxt with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold;
the output of SecCmp is used to judge whether the query requirement ciphertext vector queryCtxt is a frequent item set and constitutes the mining result;
The process by which step S33 obtains the ciphertext using the secure subset decision algorithm is as follows:
S331. Take the ciphertext query requirement vector queryCtxt $= (c_{x,1}, \ldots, c_{x,n})$ and the encrypted transaction $C_i = (c_{y,1}, \ldots, c_{y,n})$, whose corresponding plaintexts are the query requirement vector $q = (q_1, \ldots, q_n)$ and the transaction $T_i = (T_1, \ldots, T_n)$;
S332. Initialize the result ciphertext as an encryption of 1 under the fully homomorphic scheme based on the torus extension of learning with errors; for i from 1 to n, perform the following calculation;
S333. Compute $c = \mathrm{HomORNY}(c_{x,i}, c_{y,i})$, then update the result ciphertext by applying the HomAND operation to it and c;
S334. Judge whether the number of updates has reached n; if so, output the updated result ciphertext; otherwise, return to step S333;
The process by which the secure counting algorithm SecAccum in step S34 accumulates the ciphertext onto the counter is as follows:
S341. Take the ciphertext vector accCtxt $= (\mathrm{accCtxt}_1, \ldots, \mathrm{accCtxt}_k)$ and the ciphertext produced by the secure subset decision algorithm; these represent, respectively, the binary representation of the integer accumulator and the result of the secure subset decision algorithm;
S342. Initialize the ciphertext carry under the fully homomorphic scheme based on the torus extension of learning with errors; for i decreasing from k to 1, perform the following calculation;
S342. Perform the calculation for $\mathrm{accCtxt}_i$, then update the carry using the HomAND operation, with the update formula:
$\mathrm{carry} = \mathrm{HomAND}(\mathrm{carry}, \mathrm{accCtxt}_i)$;
S343. Judge whether the number of updates has reached k; if so, output the updated ciphertext vector; otherwise, return to step S342;
The process in step S35 of comparing the counter accCtxt with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold using the secure comparison algorithm SecCmp is as follows:
S351. Take the counter accCtxt $= (c_{x,1}, \ldots, c_{x,k})$ and the ciphertext vector minsuppCtxt $= (c_{y,1}, \ldots, c_{y,k})$ of the unsigned-integer minimum support threshold, whose plaintext vectors are $x = (x_1, \ldots, x_k)$ and $y = (y_1, \ldots, y_k)$;
S352. Initialize the result ciphertext as $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, perform the following calculation;
S353. First initialize the intermediate ciphertext for the current i; then, for j from 1 to $i-1$, perform the following calculation;
S354. Compute $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$, then update the value of the intermediate ciphertext;
S355. Judge whether the current value of j has reached $i-1$; if not, return to step S354; if so, update the result ciphertext;
S356. Judge whether the current value of i has reached k; if not, return to step S353; if so, output the result ciphertext, which encrypts the comparison result b: if b = 1, then $x < y$; if b = 0, then $x \geq y$.
The mining process relies on the homomorphic gate property of fully homomorphic encryption; the concrete implementation of the homomorphic gates is as follows:
In this embodiment, the process in step S4 by which the data mining party decrypts the mining result with the fully homomorphic encryption private key and confirms the frequent item set is as follows:
the data mining party decrypts the mining result using the encryption private key SK;
the decryption result is a Boolean value: if it is 1, the query requirement submitted by the data mining party is a frequent item set; otherwise, it is not a frequent item set.
The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent. It is to be understood that the above examples are provided only to illustrate the invention clearly and are not intended to limit its embodiments. Other variations or modifications based on the above description will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims.

Claims (4)

1. A mining method for frequent item sets of encrypted data in cloud computing, characterized by comprising at least the following steps:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
the process of generating the fully homomorphic encryption private key and the bootstrap key in step S1 is:
given a security parameter $\lambda$, the encryption private key SK and the bootstrap key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
wherein TFHE.KeyGen denotes the process of generating the encryption private key SK and the bootstrap key BK;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
the data in step S2 comprise transaction data and mining parameter data, wherein the transaction data are encoded into a Boolean matrix and then encrypted with the fully homomorphic encryption private key, the process being as follows:
each user holds transaction data; let the set of transaction data held by the i-th user be $T_i = \{T_{i1}, \ldots, T_{i m_i}\}$,
wherein $m_i$ is the number of transactions; the i-th user encrypts the transaction data set bit by bit, i.e.:
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
wherein $T_{ij}(k)$ denotes the k-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes that element encrypted with the encryption private key SK, and j is a positive integer not exceeding $m_i$; the i-th user sends its encrypted transaction set to the cloud server, and the encrypted transactions of all users are denoted $C = \{C_1, \ldots, C_m\}$;
the mining parameter data is an unsigned-integer minimum support threshold, which is encoded into a binary vector, encrypted into the ciphertext vector minsuppCtxt with the fully homomorphic encryption private key, and transmitted to the cloud server, wherein minsuppCtxt is transmitted only once, before the data mining party submits a query requirement;
after step S2 and before step S3, the query requirement of the data mining party is encrypted bit by bit, the process being as follows:
the query requirement is recorded as a Boolean vector q of length n, and the encryption formula is:
$\mathrm{queryCtxt} \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \ldots, \mathrm{TFHE.Enc}(SK, q_n))$
wherein $q_i$ denotes the i-th bit of the Boolean vector q, $i = 1, 2, \ldots, n$, and the encrypted query requirement ciphertext vector queryCtxt is transmitted to the cloud server;
S3. According to its service requirement, the data mining party submits a query requirement to the cloud server; after receiving the query requirement, the cloud server mines the data using homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the mining result with the fully homomorphic encryption private key and confirms the frequent item set from the decrypted result.
2. The method for mining the encrypted data frequent item set in cloud computing according to claim 1, wherein the process of computing and mining the data after the cloud server receives the query requirement in step S3 includes:
S31. initialization: the cloud server initializes an encrypted counter, a Boolean vector accCtxt whose elements are all ciphertexts of 0 and whose length is
S32. the encrypted transaction set $C = \{C_1, \ldots, C_m\}$, the query ciphertext vector queryCtxt, and the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold are taken as input to the cloud server; for i from 1 to m, the following calculations are performed:
S33. a ciphertext is obtained using the secure subset determination algorithm, denoted SecSubDet, for determining whether transaction $C_i$ contains the encrypted query ciphertext vector queryCtxt;
S34. the ciphertext is accumulated onto the counter using the secure counting algorithm SecAccum;
S35. it is judged whether the number of updates has reached m; if not, the process returns to step S33; if so, the counter accCtxt is compared with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold using the secure comparison algorithm SecCmp;
wherein the resulting encrypted Boolean value is the ciphertext of the comparison result, which is used to judge whether the query ciphertext vector queryCtxt is a frequent item set, and which constitutes the mining result.
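The flow of steps S31 to S35 can be followed on plaintext data with the sketch below. It is an illustration only: the function names `contains` and `is_frequent` are hypothetical, no encryption is performed, and the final test uses the standard definition of a frequent item set (support count greater than or equal to the threshold), since the text does not make clear how the SecCmp output bit is mapped to the final result.

```python
from typing import List

Bits = List[int]


def contains(query: Bits, transaction: Bits) -> int:
    """Plaintext counterpart of SecSubDet: 1 if every queried item is in the transaction."""
    return int(all(t == 1 for q, t in zip(query, transaction) if q == 1))


def is_frequent(transactions: List[Bits], query: Bits, minsupp: int) -> int:
    """Plaintext counterpart of steps S31 to S35."""
    acc = 0                                  # S31: counter starts at 0
    for transaction in transactions:         # S32: loop over the m transactions
        hit = contains(query, transaction)   # S33: subset test
        acc += hit                           # S34: accumulate onto the counter
    return int(acc >= minsupp)               # S35: compare with the threshold (assumed direction)


transactions = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
query = [1, 0, 0, 1]  # items 0 and 3
```

Here three of the four transactions contain both queried items, so `is_frequent(transactions, query, 3)` evaluates to 1 and `is_frequent(transactions, query, 4)` evaluates to 0.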
3. The method for mining the encrypted data frequent item set in cloud computing as claimed in claim 2, wherein in step S33 the process of obtaining the ciphertext using the secure subset determination algorithm SecSubDet is as follows:
S331. the ciphertext query vector queryCtxt $= (c_{x,1}, \ldots, c_{x,n})$ and the encrypted transaction $C_i = (c_{y,1}, \ldots, c_{y,n})$ are determined, corresponding to the plaintext query vector $q = \{q_1, \ldots, q_n\}$ and the transaction $T_i = (T_1, \ldots, T_n)$;
S332. the result ciphertext is initialized to an encryption of 1 under the TFHE fully homomorphic encryption scheme, a learning-with-errors construction over the torus; for i from 1 to n, the following calculations are performed:
S333. $c = \mathrm{HomANDNY}(c_{x,i}, c_{y,i})$ is calculated, and the result ciphertext is updated using c and the HomAND operation;
S334. it is judged whether the number of updates has reached n; if so, the updated ciphertext is output; otherwise, the process returns to step S333;
the process by which the secure counting algorithm SecAccum of step S34 accumulates the ciphertext onto the counter is as follows:
S341. the ciphertext vector accCtxt $= (\mathrm{accCtxt}_1, \ldots, \mathrm{accCtxt}_k)$ and the input ciphertext are determined, representing respectively the binary representation of the integer accumulator and the result of the secure subset determination algorithm;
S342. the ciphertext carry is initialized under the TFHE fully homomorphic encryption scheme, a learning-with-errors construction over the torus; for i decreasing from k to 1, the following calculations are performed:
S342. the calculation for index i is performed, after which carry is updated using the HomAND operation, with the update formula:
carry = HomAND(carry, $\mathrm{accCtxt}_i$);
S343. it is judged whether the number of updates has reached k; if so, the updated ciphertext vector is output; otherwise, the process returns to step S342;
the process in step S35 of comparing the counter accCtxt with the ciphertext vector minsuppCtxt of the unsigned-integer minimum support threshold using the secure comparison algorithm SecCmp is as follows:
S351. the counter accCtxt $= (c_{x,1}, \ldots, c_{x,k})$ and the ciphertext vector minsuppCtxt $= (c_{y,1}, \ldots, c_{y,k})$ of the unsigned-integer minimum support threshold are determined, corresponding to the plaintext vectors $x = (x_1, \ldots, x_k)$ and $y = (y_1, \ldots, y_k)$;
S352. the result ciphertext is initialized to $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, the following calculations are performed:
S353. an intermediate ciphertext for the current i is first initialized; then, for j from 1 to i-1, the following calculations are performed:
S354. $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$ is calculated, and the value of the intermediate ciphertext is then updated;
S355. it is judged whether the current value of j has reached i-1; if not, the process returns to step S354; if so, the result ciphertext is updated;
S356. it is judged whether the current value of i has reached k; if not, the process returns to step S353; if so, the ciphertext representing the comparison result b is output, where b = 1 indicates $x < y$ and b = 0 indicates $x \geq y$.
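The three gate-level procedures of claim 3 can be traced with the plaintext sketch below, in which ordinary bits stand in for TFHE ciphertexts and the `hom_*` helpers stand in for homomorphic gates. All function names are hypothetical. Several details are assumptions because the corresponding formulas are missing from the text: that HomANDNY(a, b) means (NOT a) AND b, the exact update in `sec_sub_det`, the initial carry and the XOR used for each counter bit in `sec_accum`, and the per-index initialization and the OR update in `sec_cmp`. Index 1 of the claims (index 0 here) is treated as the most significant bit.

```python
from typing import List

Bits = List[int]  # plaintext bits standing in for TFHE ciphertexts


# Plaintext stand-ins for homomorphic gates (hypothetical helpers).
def hom_and(a: int, b: int) -> int:
    return a & b


def hom_or(a: int, b: int) -> int:
    return a | b


def hom_xor(a: int, b: int) -> int:
    return a ^ b


def hom_xnor(a: int, b: int) -> int:
    return 1 - (a ^ b)


def hom_andny(a: int, b: int) -> int:
    return (1 - a) & b  # assumed semantics: (NOT a) AND b


def sec_sub_det(query: Bits, transaction: Bits) -> int:
    """Return 1 if every queried item appears in the transaction."""
    result = 1
    for q, t in zip(query, transaction):
        missing = hom_andny(t, q)            # item queried but absent (assumed argument order)
        result = hom_andny(missing, result)  # assumed update: clear result on any miss
    return result


def sec_accum(acc: Bits, bit: int) -> Bits:
    """Add a single bit to a counter stored most significant bit first."""
    out = list(acc)
    carry = bit                              # assumed initial carry
    for i in range(len(acc) - 1, -1, -1):    # least significant bit first
        out[i] = hom_xor(acc[i], carry)      # assumed sum-bit formula
        carry = hom_and(carry, acc[i])       # carry update as given in the claim
    return out


def sec_cmp(x: Bits, y: Bits) -> int:
    """Return 1 if x < y, else 0, for equal-length MSB-first bit vectors."""
    result = hom_andny(x[0], y[0])
    for i in range(1, len(x)):
        term = hom_andny(x[i], y[i])         # x_i < y_i at this position
        for j in range(i):
            term = hom_and(term, hom_xnor(x[j], y[j]))  # all higher bits equal
        result = hom_or(result, term)
    return result
```

For example, accumulating the subset results of the query `[1, 1, 0]` over the transactions `[1, 1, 0]`, `[1, 0, 1]`, and `[1, 1, 1]` into the counter `[0, 0, 0]` yields `[0, 1, 0]` (a count of 2), and `sec_cmp([0, 1, 0], [0, 1, 1])` returns 1 because 2 < 3.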
4. The method for mining the encrypted data frequent item set in cloud computing according to claim 2 or 3, wherein in step S4 the process by which the data mining party decrypts with the homomorphic encryption private key and confirms the frequent item set after obtaining the mining result is as follows:
the data mining party decrypts the mining result using the encryption private key SK;
wherein the decryption result is a Boolean value; if it is 1, the query requirement submitted by the data mining party is a frequent item set; otherwise, it is not a frequent item set.
CN202011193510.7A 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing Active CN112307499B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011193510.7A CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011193510.7A CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Publications (2)

Publication Number Publication Date
CN112307499A (en) 2021-02-02
CN112307499B (en) 2024-04-12

Family

ID=74332977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011193510.7A Active CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Country Status (1)

Country Link
CN (1) CN112307499B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002044715A1 (en) * 2000-11-28 2002-06-06 Surromed, Inc. Methods for efficiently minig broad data sets for biological markers
CN103401871A (en) * 2013-08-05 2013-11-20 苏州大学 Method and system for sequencing ciphertexts orienting to homomorphic encryption
CN108183791A (en) * 2017-12-11 2018-06-19 北京航空航天大学 Applied to the Intelligent terminal data safe processing method and system under cloud environment
CN110120873A (en) * 2019-05-08 2019-08-13 西安电子科技大学 Mining Frequent Itemsets based on cloud outsourcing transaction data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
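Research on a frequent item set mining algorithm for data streams based on periodic sampling; Hou Wei; Yang Bingru; Wu Chensheng; Zhou Zhun; High Technology Letters (高技术通讯), No. 08; full text *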

Also Published As

Publication number Publication date
CN112307499A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
CN112989368B (en) Method and device for processing private data by combining multiple parties
CN108989026B (en) Method for revoking user attribute in publishing/subscribing environment
Liu et al. An efficient privacy-preserving outsourced calculation toolkit with multiple keys
JP5300983B2 (en) Data processing device
US20190295073A1 (en) Secure data processing transactions
CN108768951B (en) Data encryption and retrieval method for protecting file privacy in cloud environment
US8898478B2 (en) Method for querying data in privacy preserving manner using attributes
CN113569271B (en) Threshold proxy re-encryption method based on attribute condition
JP2008500598A (en) Method and apparatus for confidential information retrieval and lost communication with good communication efficiency
CN110120873B (en) Frequent item set mining method based on cloud outsourcing transaction data
CN110611662B (en) Attribute-based encryption-based fog collaborative cloud data sharing method
JP2014504741A (en) Method and server for evaluating the probability of observation sequence stored at client for Hidden Markov Model (HMM) stored at server
Anikin et al. Privacy preserving DBSCAN clustering algorithm for vertically partitioned data in distributed systems
WO2014132552A1 (en) Order-preserving encryption system, device, method, and program
Acar et al. Achieving secure and differentially private computations in multiparty settings
Zou et al. Highly secure privacy‐preserving outsourced k‐means clustering under multiple keys in cloud computing
CN116170142B (en) Distributed collaborative decryption method, device and storage medium
CN111859440A (en) Sample classification method of distributed privacy protection logistic regression model based on mixed protocol
CN112307499B (en) Mining method for encrypted data frequent item set in cloud computing
Li et al. Securely outsourcing ID3 decision tree in cloud computing
CN115150055A (en) Privacy protection ridge regression method based on homomorphic encryption
CN117795901A (en) Generating digital signature shares
CN115062331A (en) Privacy protection deep learning method based on additive homomorphic encryption
Ma et al. Controllable forward secure identity-based encryption with equality test in privacy-preserving text similarity analysis
Hariss et al. Cloud assisted privacy preserving using homomorphic encryption

Legal Events

Date Code Title Description
PB01 Publication Publication
SE01 Entry into force of request for substantive examination Entry into force of request for substantive examination
GR01 Patent grant Patent grant