CN107203725A - A privacy-preserving association rule mining method for vertically distributed data - Google Patents

Info

Publication number
CN107203725A
Authority
CN
China
Prior art keywords
website
data
party
association rule
sent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710366773.5A
Other languages
Chinese (zh)
Inventor
凌捷
张燕平
谢锐
柳毅
杨育斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201710366773.5A
Publication of CN107203725A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107 File encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Existing privacy-preserving data mining algorithms, both in China and abroad, have shortcomings. For example, privacy-preserving association rule mining algorithms for distributed environments mostly rely on homomorphic encryption. When that technique is applied to distributed association rule mining, the private-key holder can easily combine the information computed from pairs of sites, obtain enough equations, and solve for the support of itemsets in each site's data set, which leaks private information. To address these shortcomings of the prior art, the present invention proposes a method for protecting private information in association rule mining over vertically distributed data. The method perturbs and hides each site's original data using the partially hidden randomized response technique and introduces a semi-trusted third party. Each site computes the transaction vectors of the itemsets it holds locally, the global transaction vector of an itemset is accumulated using the Paillier encryption algorithm, and the third party decrypts it to obtain the global support of the itemset. The method improves both the efficiency and the security of computing the support count.

Description

A privacy-preserving association rule mining method for vertically distributed data
Technical field
The present invention relates to the field of data mining, and in particular to a method for protecting private information in association rule mining over vertically distributed data.
Background
As data mining is applied widely in many fields, the threat it poses to user privacy and data security has drawn increasing attention. Everyday data mining applications involve sensitive enterprise data (for example, a hospital's medical business or financial situation contained in electronic medical records) or personal private information (for example, patients' confidential conditions contained in electronic medical records), so how to improve data security has attracted wide attention from researchers.
Current privacy-preserving data mining algorithms, in China and abroad, are mainly based on data perturbation or on query restriction. In distributed environments they are mainly based on query restriction, or on a combination of data perturbation and query restriction. Data perturbation first distorts the original data through operations such as discretization, random transformation, and noise addition, and then mines the distorted data, which reduces privacy leakage during mining. Query restriction hides, samples, partitions, or encrypts the data and then obtains the mining result through probabilistic statistics or distributed computation, thereby protecting the data. At present, privacy-preserving association rule mining algorithms for distributed environments mostly use homomorphic encryption. Its key feature is that processing homomorphically encrypted data yields an output which, once decrypted, equals the result of applying the same processing to the unencrypted original data. When this technique is applied to distributed association rule mining, however, the private-key holder can easily combine the information computed from pairs of sites, obtain enough equations, and solve for the support of itemsets in each site's data set, which leaks private information.
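The homomorphic property described in the background can be written compactly. As an illustration only, for the Paillier scheme that the method adopts, with the generator taken as n + 1 (the symbols E, D, n, m_1, m_2 and r are notation introduced here and do not appear in the patent text), multiplying two ciphertexts corresponds to adding the underlying plaintexts:

```latex
E(m) = (n+1)^{m}\, r^{n} \bmod n^{2}, \qquad
D\bigl(E(m_1)\cdot E(m_2) \bmod n^{2}\bigr) = (m_1 + m_2) \bmod n
```

Here n is the product of two primes and r is a random value coprime to n, chosen freshly for each encryption.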
Summary of the invention
To address the shortcomings of the prior art, the present invention proposes a method for protecting private information in association rule mining over vertically distributed data. It strengthens the security of each site's support information and individual records during association rule mining in a vertically distributed environment. Fig. 2 shows the distributed data mining framework.
The main ideas are as follows:
(1) Before a site's data is mined, the original data set is first hidden and perturbed using the partially hidden randomized response method, and mining is performed on the hidden and perturbed data. The support of the items contained in the data is then recovered by a reconstruction method. This protects the site's original data, while reconstructing the itemset support preserves the accuracy of the result.
(2) To compute the support of items whose data is distributed across sites, each site first encrypts its data with the Paillier algorithm. Because Paillier encryption is additively homomorphic, the sites can add their encrypted item data together and obtain the global support of an itemset efficiently and securely.
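Idea (1) can be sketched for a single item column. The patent does not spell out the perturbation rule, so the sketch below assumes one common form of partially hidden randomized response: each bit is kept with probability p1, replaced by 1 with probability p2, and replaced by 0 otherwise. The function names, parameter values and data are hypothetical. Under that assumption the expected observed frequency is p1 × (true frequency) + p2, which can be inverted to estimate the original support.

```python
import random

def perturb(bits, p1, p2, rng):
    """Assumed perturbation rule: keep the true bit with probability p1,
    report 1 with probability p2, report 0 with probability 1 - p1 - p2."""
    out = []
    for b in bits:
        u = rng.random()
        if u < p1:
            out.append(b)
        elif u < p1 + p2:
            out.append(1)
        else:
            out.append(0)
    return out

def reconstruct_support(perturbed, p1, p2):
    """Estimate the original fraction of 1s.
    E[observed] = p1 * true + p2, so true is about (observed - p2) / p1."""
    observed = sum(perturbed) / len(perturbed)
    return (observed - p2) / p1

rng = random.Random(0)                      # fixed seed for repeatability
original = [1] * 3000 + [0] * 7000          # hypothetical item column, true support 0.30
noisy = perturb(original, p1=0.6, p2=0.2, rng=rng)
estimate = reconstruct_support(noisy, p1=0.6, p2=0.2)
print(round(estimate, 3))
```

The estimate approaches the true support as the number of transactions grows. Reconstructing the support of a multi-item itemset requires a joint inversion over all items and is not covered by this single-column sketch.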
Brief description of the drawings
Fig. 1 is a flow chart of the method;
Fig. 2 shows the distributed data mining framework.
Detailed description of the embodiments
A method for protecting private information in association rule mining over vertically distributed data, as shown in Fig. 1, comprises the following steps:
If k = 1, that is, when deciding whether a 1-itemset is frequent, the support count of the 1-itemset is calculated by counting the transactions in which it occurs [formula omitted in source], and the 1-itemset is judged to be frequent or not according to the given minimum support.
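The k = 1 case can be illustrated with a short sketch. It assumes a 1-itemset is stored as a 0/1 transaction vector and that the minimum support is given as a fraction of all transactions; the function names and the sample data are hypothetical.

```python
def support_count(transaction_vector):
    """Number of transactions in which the 1-itemset occurs (entries equal to 1)."""
    return sum(transaction_vector)

def is_frequent(transaction_vector, min_support):
    """Compare the relative support with a minimum support given as a fraction."""
    return support_count(transaction_vector) / len(transaction_vector) >= min_support

# Hypothetical item column: the item occurs in 5 of 8 transactions.
item = [1, 0, 1, 1, 0, 1, 0, 1]
print(support_count(item), is_frequent(item, 0.5))
```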
If k ≥ 2, the following procedure is used:
(1) Each site S_i (1 ≤ i ≤ n) generates a Paillier key pair (e_i, d_i) and sends the public key e_i to the semi-trusted third party DSC. When DSC sends data to a site, it first encrypts the data with that site's public key e_i, which secures the data in transit;
(2) DSC generates a Paillier key pair (pk, sk) and the random perturbation parameters p_1 and p_2, encrypts the public key pk and the parameters p_1, p_2 with e_i, and sends them to each site. When a site sends data to another site or to DSC, it first encrypts the data with the public key pk, which secures the data in transit;
(3) Each site S_i, working in parallel, hides and perturbs its original data with the partially hidden randomized response method, using the random perturbation parameters p_1 and p_2 received from DSC;
(4) When S_i holds several frequent itemsets, it sums their transaction vectors (a transaction vector records whether an item is present in each transaction) to obtain a new transaction vector;
(5) S_i encrypts this vector with the public key pk using Paillier homomorphic encryption and sends it to the next site S_{i+1}. Site S_{i+1} likewise encrypts its own data with pk, adds the result to the data it received, and sends the sum to the next site. After the last site has performed its computation, it sends the final result to DSC;
(6) DSC decrypts the result it receives, which gives the global transaction vector of the itemset after hiding and perturbation. The global transaction vector of the k-itemset in the original data is recovered by the reconstruction method. Finally, the number of times the value "k" occurs in the transaction vector of the k-itemset is counted, and this is the global support count of the k-itemset.
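Steps (4) to (6) can be sketched end to end. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a toy Paillier scheme with small hard-coded primes (insecure), omits the per-site key pairs (e_i, d_i) and the perturbation and reconstruction steps, and all function names and data are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)        # valid because the generator is n + 1
    return n, (lam, mu)         # public modulus, secret key

def encrypt(n, m, rng):
    """E(m) = (n+1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)

def local_vector(columns):
    """Step (4): a site sums the transaction vectors of the items it holds."""
    return [sum(entries) for entries in zip(*columns)]

def ring_aggregate(n, site_vectors, rng):
    """Step (5): each site encrypts its vector and adds it to the running sum."""
    running = None
    for vec in site_vectors:
        enc = [encrypt(n, v, rng) for v in vec]
        if running is None:
            running = enc
        else:
            running = [add_encrypted(n, a, b) for a, b in zip(running, enc)]
    return running

def global_support(n, sk, ciphertexts, k):
    """Step (6): the third party decrypts and counts entries equal to k."""
    totals = [decrypt(n, sk, c) for c in ciphertexts]
    return sum(1 for t in totals if t == k)

rng = random.Random(42)
n, sk = keygen(293, 433)                    # third party's (pk, sk)
# Hypothetical 3-itemset {A, B, C}: site 1 holds A and B, site 2 holds C.
site1 = local_vector([[1, 1, 0, 1, 1, 0], [1, 0, 0, 1, 1, 1]])
site2 = local_vector([[1, 1, 1, 0, 1, 0]])
support = global_support(n, sk, ring_aggregate(n, [site1, site2], rng), k=3)
print(support)
```

An entry of the summed vector equals k only when all k items of the itemset are present in that transaction, which is why counting the value k yields the support count.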
A specific embodiment of the present invention has been described above. It should be understood that the invention is not limited to this particular embodiment; those skilled in the art may make various variations or modifications within the scope of the claims without affecting the substance of the invention.

Claims (8)

1. A method for protecting private information in association rule mining over vertically distributed data, characterized in that it comprises the following steps:
1) each site generates a Paillier public/private key pair (e_i, d_i) and sends the public key e_i to an introduced semi-trusted third party;
2) the third party generates a Paillier public/private key pair (pk, sk) and the parameters of the partially hidden randomized response method, and sends the public key pk and the parameters to each site;
3) each site uses the parameters received from the third party to hide and perturb the data set it holds;
4) a site holding several itemsets sums the transaction-presence vectors of those itemsets;
5) each site encrypts the data it holds with the public key pk and sends the encrypted data to the next site;
6) a site that receives data encrypts the data it holds, adds it to the received data, and sends the result to the next site; the last site sends the data to the third party;
7) the third party decrypts the received data and calculates the support count of the itemset.
2. The method for protecting private information in association rule mining over vertically distributed data according to claim 1, characterized in that: each site generates a Paillier public/private key pair and sends the public key to the third party; when the third party sends information to a site, it first encrypts the information with that site's public key and then transmits it.
3. The method for protecting private information in association rule mining over vertically distributed data according to claim 2, characterized in that: the third party generates a Paillier public/private key pair and the parameters of the partially hidden randomized response method, and sends the public key and the parameters to each site; each site first hides and perturbs its data using the received parameters, and encrypts information with that public key before sending it out.
4. The method for protecting private information in association rule mining over vertically distributed data according to claim 3, characterized in that: each site first hides and perturbs the original data it holds using the parameters received from the third party, obtaining a new data set, and uses the hidden and perturbed data in subsequent mining.
5. The method for protecting private information in association rule mining over vertically distributed data according to claim 4, characterized in that: a site holding j itemsets first sums the transaction vectors of the itemsets it holds, forming a new transaction vector; a transaction vector records the presence of an itemset in each transaction, 1 if present and 0 if absent, so when j transaction vectors are added, each element of the resulting vector is less than or equal to j.
6. The method for protecting private information in association rule mining over vertically distributed data according to claim 5, characterized in that: each site encrypts its final transaction vector with the public key pk and sends the encrypted data to the next corresponding site.
7. The method for protecting private information in association rule mining over vertically distributed data according to claim 6, characterized in that: a site that receives data adds its own encrypted transaction vector to the received data and sends the result to the next corresponding site.
8. The method for protecting private information in association rule mining over vertically distributed data according to claim 7, characterized in that: the third party decrypts the received data, reconstructs the support count of the itemset, obtains the original support count of the k-itemset by counting the occurrences of "k" in the transaction vector of the k-itemset, compares this support count with the given minimum support count, and judges whether the k-itemset is a frequent itemset.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710366773.5A CN107203725A (en) 2017-05-23 2017-05-23 A privacy-preserving association rule mining method for vertically distributed data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710366773.5A CN107203725A (en) 2017-05-23 2017-05-23 A privacy-preserving association rule mining method for vertically distributed data

Publications (1)

Publication Number Publication Date
CN107203725A true CN107203725A (en) 2017-09-26

Family

ID=59905779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710366773.5A Pending CN107203725A (en) 2017-05-23 2017-05-23 A kind of vertical distribution formula association rule mining method for protecting privacy

Country Status (1)

Country Link
CN (1) CN107203725A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103150515A (en) * 2012-12-29 2013-06-12 江苏大学 Association rule mining method for privacy protection under distributed environment
CN103605749A (en) * 2013-11-20 2014-02-26 同济大学 Privacy protection associated rule data digging method based on multi-parameter interference
CN106503575A (en) * 2016-09-22 2017-03-15 广东工业大学 A kind of Mining Association Rules in Distributed Environments method for protecting privacy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
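陈玉婵 (Chen Yuchan): "面向关联规则挖掘的分布式隐私保护算法研究" [Research on distributed privacy-preserving algorithms for association rule mining], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] *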

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108022654A (en) * 2017-12-20 2018-05-11 深圳先进技术研究院 A kind of association rule mining method based on secret protection, system and electronic equipment
CN108920714A (en) * 2018-07-26 2018-11-30 上海交通大学 The association rule mining method and system of secret protection under a kind of distributed environment
CN108920714B (en) * 2018-07-26 2021-10-01 上海交通大学 Association rule mining method and system for privacy protection in distributed environment
CN109743299A (en) * 2018-12-19 2019-05-10 西安电子科技大学 A kind of high security Mining Frequent Itemsets towards megastore's transaction record
CN109743299B (en) * 2018-12-19 2021-01-12 西安电子科技大学 High-security frequent item set mining method oriented to superstore transaction records
CN110120873A (en) * 2019-05-08 2019-08-13 西安电子科技大学 Mining Frequent Itemsets based on cloud outsourcing transaction data
CN110120873B (en) * 2019-05-08 2021-04-27 西安电子科技大学 Frequent item set mining method based on cloud outsourcing transaction data
CN112948864A (en) * 2021-03-19 2021-06-11 西安电子科技大学 Verifiable PPFIM method based on vertical partition database
CN112966283A (en) * 2021-03-19 2021-06-15 西安电子科技大学 PPARM (vertical partition data parallel processor) method for solving intersection based on multi-party set
CN112948864B (en) * 2021-03-19 2022-12-06 西安电子科技大学 Verifiable PPFIM method based on vertical partition database
CN112966283B (en) * 2021-03-19 2023-04-18 西安电子科技大学 PPARM (vertical partition data parallel processor) method for solving intersection based on multi-party set

Similar Documents

Publication Publication Date Title
CN107203725A (en) A privacy-preserving association rule mining method for vertically distributed data
Li et al. Differentially private Naive Bayes learning over multiple data sources
Elhoseny et al. Secure medical data transmission model for IoT-based healthcare systems
CN106503575B (en) A kind of Mining Association Rules in Distributed Environments method for protecting privacy
KR102224998B1 (en) Computer-implemented system and method for protecting sensitive data via data re-encryption
WO2020006302A1 (en) Method and apparatus for obtaining input of secure multiparty computation protocol
CN103023633B (en) Digital image hiding method based on chaotic random phase and coherence stack principle
Xu et al. A visually secure asymmetric image encryption scheme based on RSA algorithm and hyperchaotic map
CN102622545A (en) Picture file tracking method
Wang et al. A privacy-preserving and traitor tracking content-based image retrieval scheme in cloud computing
Koppu et al. A fast enhanced secure image chaotic cryptosystem based on hybrid chaotic magic transform
CN108881230B (en) Secure transmission method and device for government affair big data
Angelou et al. Asymmetric private set intersection with applications to contact tracing and private vertical federated machine learning
EP3966988B1 (en) Generating sequences of network data while preventing acquisition or manipulation of time data
Anusudha et al. Secured medical image watermarking with DNA codec
CN105553980A (en) Safety fingerprint identification system and method based on cloud computing
CN105721148A (en) Data file encryption method and system based on double random numbers
CN104537604B (en) A kind of image determinacy encryption double blinding secrecy matching process
Kim et al. A method for decrypting data infected with hive ransomware
Liu et al. Image encryption via complementary embedding algorithm and new spatiotemporal chaotic system
CN103593590A (en) Mixing additivity multi-time watermark embedding method and decoding method based on cloud environment
Wang et al. An encryption algorithm for vector maps based on the Gaussian random and Haar transform
CN104574380A (en) Random encryption and double-blind confidential matching method for images
Benkhaddra et al. Secure transmission of secret data using optimization based embedding techniques in Blockchain
Rachmawanto et al. Block-based arnold chaotic map for image encryption

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170926