CN106021541A - Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes - Google Patents

Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes

Publication number
CN106021541A
Authority
CN
China
Prior art keywords
anonymity
data set
generalization
algorithm
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610361877.2A
Other languages
Chinese (zh)
Other versions
CN106021541B (en)
Inventor
吴响
王换换
臧昊
俞啸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Medical University
Original Assignee
Xuzhou Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuzhou Medical University filed Critical Xuzhou Medical University
Priority to CN201610361877.2A priority Critical patent/CN106021541B/en
Publication of CN106021541A publication Critical patent/CN106021541A/en
Application granted granted Critical
Publication of CN106021541B publication Critical patent/CN106021541B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2246Trees, e.g. B+trees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Abstract

The invention discloses a secondary k-anonymity privacy protection algorithm that differentiates quasi-identifier attributes, pertaining to the technical field of privacy protection. The algorithm comprises the following steps: the Incognito function forms the generalization lattices of the single attributes and determines whether each generalization satisfies k-anonymity; nodes that do not satisfy k-anonymity are deleted; the nodes that do satisfy it are iterated to form a candidate node set, the candidates are checked against k-anonymity again, and those that fail are deleted; these steps are repeated until all categorical attributes have been iterated, and the root nodes that satisfy k-anonymity are output. The data table T is generalized by each root node in turn, and the MDAV algorithm then performs a secondary generalization of the generalized table T'. The tuples of each input equivalence class are partitioned into groups of between k and 2k-1 tuples. When the partition is finished, the information loss of each result is computed, and the data table with the smallest loss is selected by comparison.

Description

Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes
Technical field
The present invention relates to the technical field of data privacy protection, and in particular to a secondary k-anonymity privacy protection algorithm that differentiates quasi-identifier attributes.
Background
With the rapid development of information technology, more and more data are being shared. How to keep the private information in published data from being maliciously obtained by attackers, while still letting data recipients make full use of the data for exploration and scientific research, has become an important information security issue. k-anonymity is an effective method for protecting private data and has received wide attention. The k-anonymity technique was proposed by Samarati and Sweeney in 1998; it requires that the published data contain a certain number (k) of indistinguishable individuals, so that an attacker cannot determine which individual a piece of private information belongs to.
Numerous studies show that the Incognito algorithm can k-anonymize large-scale data efficiently, but a k-anonymization algorithm based on global recoding tends to over-generalize numeric variables and so loses more semantics. MDAV is a classical partition-based anonymous clustering algorithm that can efficiently handle the clustering of large-scale numeric data sets.
Research on k-anonymity has concentrated on retaining as much data utility as possible while protecting private information. At present, most data anonymization methods share common defects: 1) they are better suited to categorical data (nominal and ordinal), and generalizing numeric data often loses much of the numeric semantics; 2) when the number of quasi-identifier attributes increases sharply, the so-called "curse of dimensionality" (dimension trap) appears. The dimension trap causes very large information loss, which degrades the usability of the published table.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a secondary k-anonymity privacy protection algorithm that differentiates quasi-identifier attributes, which greatly reduces the information loss caused by using the Incognito algorithm alone.
The present invention is realized by the following technical scheme: a secondary k-anonymity privacy protection algorithm that differentiates quasi-identifier attributes, comprising:
1) Determine the attribute types in the quasi-identifier set;
2) S_n = Incognito(T, CQI, k), where S_n is the result of generalizing the categorical attributes, T is the data set to be generalized, CQI is the categorical quasi-identifier set, and k is the anonymity constraint;
3) Create an empty queue result and an empty node node;
4) Traverse S_n and enter the following loop:
    Initialize the data set D_j, which stores the data table after secondary generalization;
    Read a node from S_n into node;
    Generalize the data table T according to node to obtain T';
    Traverse T' and enter the following loop:
        Store the i-th equivalence class of T' in T'_i;
        MDAV(T'_i, NQI, k), where T'_i is the data set to be clustered, NQI is the set of numeric attributes to cluster on, and k is the anonymity constraint;
        D_j = D_j ∪ T'_i;
    Compute the information loss and insert it into result;
5) Compare the information losses in result and take the D_j with the smallest information loss;
6) T'' = D_j; return T''.
Preferably, the generalization of categorical attributes by Incognito(T, CQI, k) comprises the following steps:
1) Form the candidate node table C_1 and the edge table E_1 of the single-attribute generalizations;
2) Use an empty queue queue to take out all root nodes of C_1, and compute the equivalence classes for every node in queue;
3) Check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all of its child nodes; otherwise delete the node from C_1 and insert its child nodes into queue;
4) Repeat step 3) until every node in C_1 that does not satisfy k-anonymity has been deleted, then form the new tables C_2 and E_2 from the pruned C_1 and E_1;
5) Repeat steps 2), 3) and 4) until the pruned C_n is obtained;
6) S_n = {all nodes of C_n};
7) Return S_n.
Preferably, the generalization of numeric attributes by MDAV(T', NQI, k) comprises the following steps:
1) Check whether the number of tuples in the data set exceeds 2k-1; if so, continue with step 2); otherwise return the data set T' and compute its centroid;
2) Find the two tuples r and s in T' that are farthest apart with respect to NQI;
3) With r as the center, find the k-1 tuples nearest to r to form an equivalence class C, update the centroid, delete these k tuples from T', and put them into the cluster set {Q};
4) Repeat step 3) with s as the center;
5) Check whether the number of tuples remaining in T' exceeds 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T' and compute its centroid;
6) Replace the quasi-identifier values of the tuples in each equivalence class with the quasi-identifier values of its centroid;
7) Return T'.
The beneficial effects of the invention are as follows: the method first obtains the categorical-attribute frequent itemsets that satisfy k-anonymity and then microaggregates the numeric attributes, which avoids the over-generalization of numeric attributes that global generalization can cause. The source table is partitioned into equivalence classes of between k and 2k-1 tuples, which greatly reduces the information loss caused by using the Incognito algorithm alone.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is the structure diagram formed by the three attributes sex, race and work class;
Fig. 3 shows the relationship between the information loss IL and k when |QI| = 6+1;
Fig. 4 shows the relationship between the information loss IL and k when |QI| = 6+2;
Fig. 5 shows the relationship between the running time T and k when |QI| = 6+1;
Fig. 6 shows the relationship between the running time T and k when |QI| = 6+2;
Fig. 7 shows the relationship between the time difference and k.
Detailed description
The definitions used by the NQLG algorithm to realize k-anonymity are introduced below, taking Table 1 as an example. Suppose the data table held by the data publisher is T(A_1, A_2, ..., A_n), in which each tuple describes a specific entity through attributes such as Age, Workclass, Race, Sex, Hours-per-week and Salary; see Table 1.
Table 1
Definition 1 (quasi-identifier): Given a population U, a specific data table T(A_1, A_2, ..., A_n), and mappings f_c: U → T and f_g: T → U', where U ⊆ U', a quasi-identifier of T, written QI_T, is a set of attributes {A_i, ..., A_j} ⊆ {A_1, ..., A_n} for which there exists p_i ∈ U such that f_g(f_c(p_i)[QI_T]) = p_i. Any of the attributes in Table 1 can serve as a quasi-identifier; the choice of quasi-identifiers depends on actual needs.
Definition 2 (generalization rule): Given an attribute Q, let f: Q → Q' be a generalization function acting on Q. A chain of such functions applied in sequence means that the quasi-identifier is generalized step by step, and {f_1, f_2, ..., f_m} denotes the generalization rule. Fig. 2 shows the structure diagram formed by the three attributes sex, race and work class.
Definition 3 (k-anonymity): Given a data table T(A_1, ..., A_n) and its associated quasi-identifier QI_T = (A_i, ..., A_j), the table T satisfies k-anonymity if and only if each tuple in T[QI_T] occurs at least k times in T[QI_T].
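The condition in the k-anonymity definition can be checked by counting how often each combination of quasi-identifier values occurs. The following is a minimal sketch; the function name `is_k_anonymous`, the dictionary row format and the sample rows are illustrative assumptions, not part of the patent.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier value combination occurs at least k times."""
    # Project each row onto the quasi-identifier attributes, i.e. build T[QI_T].
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(c >= k for c in counts.values())

# Hypothetical, already generalized rows used only for illustration.
rows = [
    {"Age": "[21,30]", "Sex": "Female"},
    {"Age": "[21,30]", "Sex": "Female"},
    {"Age": "[31,40]", "Sex": "Male"},
    {"Age": "[31,40]", "Sex": "Male"},
]
print(is_k_anonymous(rows, ["Age", "Sex"], 2))
print(is_k_anonymous(rows, ["Age", "Sex"], 3))
```

Each group of rows that share the same projected values corresponds to one equivalence class, so the check amounts to requiring that the smallest equivalence class has at least k members.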
As shown in Table 1, the table contains 6 tuples, each corresponding to one person's information. The first column is the sequence number, which gives the relative storage position of each record in the table; the second column is the age attribute; the third is the work class attribute; the fourth is the race attribute; the fifth is the sex attribute; the sixth is the working hours attribute; and the last column holds the information to be protected and serves as the sensitive attribute of the table. The quasi-identifier of T in Table 1 is therefore QI_T = {Age, Workclass, Race, Sex, Hours_per_week}. Table 2 is the published result of Table 1 after 2-anonymization. By the definition of an equivalence class, Table 2 has 3 equivalence classes: {R_1, R_2}, {R_3, R_4} and {R_5, R_6}. The tuples in these equivalence classes satisfy:
R_1[QI_T] = R_2[QI_T] = {[21,30], Self-emp-not-inc, Amer-Indian-Eskimo, Female, [21-30]},
R_3[QI_T] = R_4[QI_T] = {[31,40], Private, Amer-Indian-Eskimo, Male, [31-40]},
R_5[QI_T] = R_6[QI_T] = {[41,50], Private, Amer-Indian-Eskimo, Male, [41-50]}.
The probability that an attacker obtains the sensitive information through a linking attack is therefore only 1/k = 1/2. The table obtained from Table 1 by k-anonymization (Table 2) effectively prevents linking attacks. Table 2 shows the data of Table 1 after 2-anonymization:
Table 2
Definition 4 (categorical attribute generalization): The data set is partitioned and the categorical data are generalized. Let {R_1, ..., R_i} be the categorical attributes, with R_1, ..., R_i ∈ T. If T(R_1, ..., R_j) satisfies k-anonymity, that is, if and only if each tuple in T(R_1, ..., R_j) occurs at least k times in T(R_1, ..., R_j), the generalization of the categorical attributes is complete, and the frequent itemset is written T'(R_1, ..., R_j, ..., S_1, ..., S_n).
Definition 5 (numeric attribute generalization): Given the frequent itemset T'(R_1, ..., R_i, ..., S_1, ..., S_n) obtained by generalizing the categorical data, the table T'(S_1, ..., S_n) contains the numeric attributes. The generalization of the numeric attributes on T' is written K_exp G(T''), where K is the name of the secondary anonymization function, exp is a numeric expression, G is the generalization rule, and δ_G completes the generalization of the numeric tuple data.
Definition 6 (distance between numeric tuples): For a given tuple set T = (t_1, t_2, ..., t_n) and two tuples t_i, t_j ∈ T, the distance between the tuples is their actual distance over all numeric quasi-identifiers:
$$ d_n(t_i, t_j) = \lVert t_i - t_j \rVert_2 = \left[ \sum_{k=1}^{p} \omega_k \, \lvert t_{ik} - t_{jk} \rvert^2 \right]^{1/2} \qquad (1) $$
where t_i and t_j denote different numeric tuples and d_n denotes the actual distance between two numeric tuples.
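Equation (1) is a weighted Euclidean distance over the numeric quasi-identifiers. A minimal sketch follows; the function name `tuple_distance` and the default of unit weights are assumptions made for illustration, since the text does not say how the weights ω_k are chosen.

```python
import math

def tuple_distance(t_i, t_j, weights=None):
    """Weighted Euclidean distance between two numeric tuples, as in equation (1)."""
    if len(t_i) != len(t_j):
        raise ValueError("tuples must have the same number of numeric attributes")
    if weights is None:
        # Assumption: every numeric attribute is weighted equally.
        weights = [1.0] * len(t_i)
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, t_i, t_j)))

# Two hypothetical (Age, Hours_per_week) tuples.
print(tuple_distance((25, 40), (28, 44)))
```

In practice the numeric attributes are often rescaled or weighted so that an attribute with a wide range, such as age, does not dominate one with a narrow range.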
As shown in Fig. 1, the present invention proposes an efficient k-anonymity algorithm, the NQLG algorithm, which combines the Incognito algorithm and the MDAV algorithm. The Incognito algorithm is first applied to the categorical quasi-identifiers to obtain the nodes that satisfy k-anonymity, from which all root nodes are determined. The data table is generalized according to each root node in turn, and the MDAV algorithm then clusters the numeric attributes, so that the resulting equivalence classes form an optimal k-partition in which each class contains between k and 2k-1 tuples. The generalization results obtained from the root nodes are compared, and the generalized table with the smallest information loss is selected. The algorithm is described as follows:
Categorical attribute generalization
Function: Incognito(T, CQI, k), where T is the data set to be generalized, CQI is the categorical quasi-identifier set, and k is the anonymity constraint;
1) Form the candidate node table C_1 and the edge table E_1 of the single-attribute generalizations;
2) Use an empty queue queue to take out all root nodes of C_1, and compute the equivalence classes for every node in queue;
3) Check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all of its child nodes; otherwise delete the node from C_1 and insert its child nodes into queue;
4) Repeat step 3) until every node in C_1 that does not satisfy k-anonymity has been deleted, then form the new tables C_2 and E_2 from the pruned C_1 and E_1;
5) Repeat steps 2), 3) and 4) until the pruned C_n is obtained;
6) S_n = {all nodes of C_n};
7) Return S_n.
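The queue-based pruning in steps 2) and 3) can be sketched as a breadth-first search over the generalization lattice. This is a simplified illustration, not the full Incognito procedure: it searches the complete lattice directly and omits the iteration over attribute subsets (C_1, E_1 to C_n). The representation of a node as a tuple of generalization levels, the hierarchies given as lists of callables, and all function names are assumptions.

```python
from collections import Counter, deque
from itertools import product

def generalize(row, hierarchies, node):
    """Apply generalization level node[i] of hierarchy i to the i-th value of row."""
    return tuple(h[level](v) for h, level, v in zip(hierarchies, node, row))

def k_anonymous_nodes(rows, hierarchies, k):
    """Return all lattice nodes (tuples of levels) whose generalization is k-anonymous."""
    max_levels = [len(h) - 1 for h in hierarchies]
    queue = deque([tuple(0 for _ in hierarchies)])  # start at the least generalized node
    satisfied, visited = set(), set()
    while queue:
        node = queue.popleft()
        if node in visited or node in satisfied:
            continue
        visited.add(node)
        counts = Counter(generalize(r, hierarchies, node) for r in rows)
        if all(c >= k for c in counts.values()):
            # Mark the node and every further generalization of it without re-checking.
            ranges = [range(lo, hi + 1) for lo, hi in zip(node, max_levels)]
            satisfied.update(product(*ranges))
        else:
            # Drop the node and enqueue its direct generalizations.
            for i, (lo, hi) in enumerate(zip(node, max_levels)):
                if lo < hi:
                    queue.append(node[:i] + (lo + 1,) + node[i + 1:])
    return satisfied

identity = lambda v: v
suppress = lambda v: "*"
hierarchies = [[identity, suppress], [identity, suppress]]  # hypothetical Sex and Race hierarchies
rows = [("Male", "White"), ("Male", "Black"), ("Female", "White"), ("Female", "Black")]
print(sorted(k_anonymous_nodes(rows, hierarchies, 2)))
```

The marking step relies on the generalization property: once a node satisfies k-anonymity, every more general node does as well, so those nodes need no equivalence class computation.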
Numeric attribute generalization
Function: MDAV(T', NQI, k), where T' is the data set to be clustered, NQI is the set of numeric attributes to cluster on, and k is the anonymity constraint;
1) Check whether the number of tuples in the data set exceeds 2k-1; if so, continue with step 2); otherwise return the data set T' and compute its centroid;
2) Find the two tuples r and s in T' that are farthest apart with respect to NQI;
3) With r as the center, find the k-1 tuples nearest to r to form an equivalence class C, update the centroid, delete these k tuples from T', and put them into the cluster set {Q};
4) Repeat step 3) with s as the center;
5) Check whether the number of tuples remaining in T' exceeds 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T' and compute its centroid;
6) Replace the quasi-identifier values of the tuples in each equivalence class with the quasi-identifier values of its centroid;
7) Return T'.
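The microaggregation steps can be sketched as follows. The function names are hypothetical, the distance is unweighted Euclidean, and the farthest pair is found by brute force. One deliberate deviation is labeled in the comments: the sketch uses the stopping thresholds of the usual MDAV formulation (3k and 2k) rather than the single 2k-1 check, so that the leftover group never falls below k tuples, assuming the input has at least k tuples.

```python
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def _farthest_pair(points):
    # Brute-force O(n^2) search, acceptable for a sketch.
    pairs = ((a, b) for i, a in enumerate(points) for b in points[i + 1:])
    return max(pairs, key=lambda ab: _dist(ab[0], ab[1]))

def _take_cluster(center, remaining, k):
    """Remove the k tuples nearest to center (center included) and return them."""
    remaining.sort(key=lambda p: _dist(p, center))
    cluster = remaining[:k]
    del remaining[:k]
    return cluster

def mdav(points, k):
    """Partition numeric tuples into groups of k to 2k-1; return (centroid, members) pairs."""
    remaining = list(points)
    clusters = []
    # Assumption: standard MDAV thresholds instead of the single 2k-1 test in the text.
    while len(remaining) >= 3 * k:
        r, s = _farthest_pair(remaining)
        clusters.append(_take_cluster(r, remaining, k))
        clusters.append(_take_cluster(s, remaining, k))
    if len(remaining) >= 2 * k:
        r, _ = _farthest_pair(remaining)
        clusters.append(_take_cluster(r, remaining, k))
    if remaining:
        clusters.append(remaining)  # final group of k to 2k-1 tuples
    return [(_centroid(c), c) for c in clusters]

ages = [(1,), (2,), (3,), (10,), (11,), (12,), (20,)]  # hypothetical one-attribute tuples
for centroid, members in mdav(ages, 3):
    print(centroid, members)
```

Publishing each member with the values of its group centroid corresponds to step 6) of the description.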
NQLG algorithm implementation
1) Determine the attribute types in the quasi-identifier set;
2) S_n = Incognito(T, CQI, k);
S_n is the result of generalizing the categorical attributes;
3) Create an empty queue result and an empty node node;
4) Traverse S_n and enter the following loop:
    Initialize the data set D_j, which stores the data table after secondary generalization;
    Read a node from S_n into node;
    Generalize the data table T according to node to obtain T';
    Traverse T' and enter the following loop:
        Store the i-th equivalence class of T' in T'_i;
        MDAV(T'_i, NQI, k);
        D_j = D_j ∪ T'_i;
    Compute the information loss and insert it into result;
5) Compare the information losses in result and take the D_j with the smallest information loss;
6) T'' = D_j; return T''.
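The control flow of the procedure above can be sketched as a higher-order function. Everything here is an illustrative assumption: rows are `(categorical value, numeric value)` pairs, `generalize`, `microaggregate` and `info_loss` are caller-supplied stand-ins for the node-based generalization, MDAV and the information loss measure, and the demo uses a single-group stand-in instead of MDAV and a toy loss function.

```python
from collections import defaultdict

def nqlg(table, nodes, generalize, microaggregate, info_loss):
    """Return the candidate published table with the smallest information loss."""
    result = []  # (loss, D_j) pairs, one per node in S_n
    for node in nodes:
        # Generalize T by node and group T' into equivalence classes.
        classes = defaultdict(list)
        for cat, num in table:
            classes[generalize(cat, node)].append(num)
        # Secondary generalization of every equivalence class.
        d_j = []  # (generalized key, centroid, original numeric value)
        for key, nums in classes.items():
            for centroid, members in microaggregate(nums):
                d_j.extend((key, centroid, m) for m in members)
        result.append((info_loss(d_j), [(key, c) for key, c, _ in d_j]))
    return min(result, key=lambda pair: pair[0])[1]

# Toy stand-ins, for illustration only.
table = [("Male", 25), ("Male", 27), ("Female", 41), ("Female", 45)]
levels = [0, 1]                                       # 0 = keep value, 1 = suppress
gen = lambda cat, level: cat if level == 0 else "*"
one_group = lambda nums: [(sum(nums) / len(nums), nums)]
loss = lambda d: sum(abs(c - m) for _, c, m in d) + sum(1 for key, _, _ in d if key == "*")

print(nqlg(table, levels, gen, one_group, loss))
```

In the method described here, `nodes` would be the k-anonymous nodes returned by Incognito and `microaggregate` would be MDAV with the same k, so that every published group holds between k and 2k-1 tuples.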
As the above steps show, the NQLG algorithm uses the Incognito function to form the generalization lattices of all single attributes and to check whether each generalization satisfies k-anonymity. Nodes that do not satisfy k-anonymity are deleted, and the nodes that do are iterated to form a candidate node set; the candidates are checked against k-anonymity again and those that fail are deleted. These steps are repeated until all categorical attributes have been iterated, and all root nodes that satisfy k-anonymity are output. The data table T is generalized by each root node in turn, and the MDAV algorithm performs a secondary generalization of the generalized table T', partitioning the tuples of each input equivalence class into groups of between k and 2k-1. After all partitions are complete, the information loss of each result is computed, and the table with the smallest loss is obtained by comparison.
Rationality analysis of the NQLG algorithm: step 2) yields the categorical-attribute frequent itemsets that satisfy k-anonymity, after which the numeric attributes are microaggregated, avoiding the over-generalization of numeric attributes that global generalization can cause. After step 4), the source table is partitioned into equivalence classes of between k and 2k-1 tuples, which greatly reduces the information loss caused by using the Incognito algorithm alone.
Complexity analysis of the NQLG algorithm: Suppose the data set contains n tuples and m categorical quasi-identifiers. The time cost of the algorithm is analyzed as follows. Step 1 costs O(1). Step 2 uses the Incognito algorithm to find the generalizations of the categorical attributes that satisfy k-anonymity, at a cost of O(∑ C_i), where C_i is the number of nodes in the i-th iteration. Step 3 costs O(1). The cost of step 4 depends on l, the number of root nodes after the first generalization, and on the time complexity of the MDAV algorithm, which in turn depends on j, the number of large equivalence classes obtained in the previous step. Step 5 costs O(l). The overall cost of the algorithm is obtained by combining the costs of these steps.
Experimental verification and result analysis of the NQLG algorithm:
Experimental environment: The experiments were run on hardware with 4 GB of memory under the Windows 7 operating system; the algorithm was implemented in Java with SQL Server 2008. The Adult data set from the UCI Machine Learning Repository was used as the experimental data set. The Adult data set consists of U.S. census data; the training set was used, which contains 30162 records after records with missing values are removed. Eight attributes were selected: Sex, Race, Hours_per_week, Marital_status, Education, Workclass, Native_country and Age. Of these, Age and Hours_per_week are continuous quasi-identifiers, and Sex, Race, Marital_status, Education, Workclass and Native_country are categorical quasi-identifiers.
Result analysis: The experiment uses the Incognito algorithm as the baseline and applies the MDAV algorithm for secondary anonymization to the k-anonymized data set, evaluating the proposed algorithm in terms of information loss and execution time. The NQLG algorithm was run with different numbers of quasi-identifiers and different values of k to observe the changes in information loss and execution time. The information loss is computed with the method from the literature:
Information loss of an equivalence class:
Information loss of the table:
$$ IL(T) = \frac{1}{n} \sum IL(e_i) \qquad (3) $$
where |e_i| is the number of tuples in cluster e_i, 1 ≤ l ≤ m, N_i is the range of the i-th numeric attribute, MAX_{N_i} and MIN_{N_i} are the maximum and minimum values in cluster e_i, H(T_{C_i}) is the height of the taxonomy tree, and H(Λ(∪C_j)) is the height of the taxonomy subtree rooted at the lowest common ancestor.
Information loss analysis: Figs. 3 and 4 show that, for a fixed quasi-identifier set |QI|, the information loss IL of the proposed algorithm tends to decrease as k increases, and that when k reaches 50 the information loss of both algorithms tends to rise. The experimental data show that the information loss of the proposed algorithm is significantly lower than that of the Incognito algorithm. In terms of information loss, the proposed algorithm therefore has a great advantage in avoiding excessive generalization.
Running time analysis: Figs. 5 and 6 show that, for a fixed quasi-identifier set, the running times of both the Incognito algorithm and the proposed algorithm decrease as k increases. Comparing the plots for the different quasi-identifier sets, when |QI| = 6+1 (6 categorical attributes + 1 numeric attribute) the Incognito algorithm has the better running time, whereas when |QI| = 6+2 (6 categorical attributes + 2 numeric attributes) the proposed algorithm has the better running time as k increases. The experimental data show that the advantage of the proposed algorithm becomes more pronounced as the number of numeric quasi-identifiers grows.
Fig. 7 shows that, as k decreases, the time difference Δt between the two quasi-identifier sets (|QI| = 6+2 and |QI| = 6+1) increases for both the Incognito algorithm and the proposed algorithm, and the increase for the Incognito algorithm is much larger than that for the proposed algorithm. In terms of efficiency, the advantage of the proposed algorithm therefore grows markedly as the proportion of numeric quasi-identifiers in the quasi-identifier set |QI| changes.
The NQLG algorithm is proposed mainly to address the over-generalization of numeric attributes caused by the Incognito algorithm and the semantic containment problem in cluster analysis. Experiments show that, compared with traditional privacy protection algorithms, the NQLG algorithm has clear advantages in handling semantic loss and semantic containment. Future research can proceed in the following directions: since data may be republished, the NQLG algorithm can be extended to dynamic data sets; and as data volumes grow sharply, distributed and cloud computing techniques can be introduced into anonymization research to further improve the efficiency of processing massive data.

Claims (3)

1. A secondary k-anonymity privacy protection method for differentiating quasi-identifier attributes, characterized in that:
1) S_n = Incognito(T, CQI, k), where S_n is the result of generalizing the categorical attributes, T is the data set to be generalized, CQI is the categorical quasi-identifier set, and k is the anonymity constraint;
2) Create an empty queue result and an empty node node;
3) Traverse S_n and enter the following loop:
    Initialize the data set D_j, which stores the data table after secondary generalization;
    Read a node from S_n into node;
    Generalize the data table T according to node to obtain T';
    Traverse T' and enter the following loop:
        Store the i-th equivalence class of T' in T'_i;
        MDAV(T'_i, NQI, k), where T'_i is the data set to be clustered, NQI is the set of numeric attributes to cluster on, and k is the anonymity constraint;
        D_j = D_j ∪ T'_i;
    Compute the information loss and insert it into result;
4) Compare the information losses in result and take the D_j with the smallest information loss;
5) T'' = D_j; return T''.
2. The secondary k-anonymity privacy protection method for differentiating quasi-identifier attributes according to claim 1, characterized in that the generalization of categorical attributes by Incognito(T, CQI, k) comprises the following steps:
1) Form the candidate node table C_1 and the edge table E_1 of the single-attribute generalizations;
2) Use an empty queue queue to take out all root nodes of C_1, and compute the equivalence classes for every node in queue;
3) Check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all of its child nodes; otherwise delete the node from C_1 and insert its child nodes into queue;
4) Repeat step 3) until every node in C_1 that does not satisfy k-anonymity has been deleted, then form the new tables C_2 and E_2 from the pruned C_1 and E_1;
5) Repeat steps 2), 3) and 4) until the pruned C_n is obtained;
6) S_n = {all nodes of C_n};
7) Return S_n.
3. The secondary k-anonymity privacy protection method for differentiating quasi-identifier attributes according to claim 1, characterized in that the generalization of numeric attributes by MDAV(T', NQI, k) comprises the following steps:
1) Check whether the number of tuples in the data set exceeds 2k-1; if so, continue with step 2); otherwise return the data set T' and compute its centroid;
2) Find the two tuples r and s in T' that are farthest apart with respect to NQI;
3) With r as the center, find the k-1 tuples nearest to r to form an equivalence class C, update the centroid, delete these k tuples from T', and put them into the cluster set {Q};
4) Repeat step 3) with s as the center;
5) Check whether the number of tuples remaining in T' exceeds 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T' and compute its centroid;
6) Replace the quasi-identifier values of the tuples in each equivalence class with the quasi-identifier values of its centroid;
7) Return T'.
CN201610361877.2A 2016-05-26 2016-05-26 Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes Active CN106021541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610361877.2A CN106021541B (en) 2016-05-26 2016-05-26 Distinguish the anonymous Privacy preserving algorithms of secondary k of standard identifier attribute

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610361877.2A CN106021541B (en) 2016-05-26 2016-05-26 Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes

Publications (2)

Publication Number Publication Date
CN106021541A true CN106021541A (en) 2016-10-12
CN106021541B CN106021541B (en) 2017-08-04

Family

ID=57093604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610361877.2A Active CN106021541B (en) 2016-05-26 2016-05-26 Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes

Country Status (1)

Country Link
CN (1) CN106021541B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709372A (en) * 2017-01-09 2017-05-24 广西师范大学 Enhanced identity-preservation privacy protection method
CN107273757A (en) * 2017-04-23 2017-10-20 西安电子科技大学 A kind of method of the processing big data based on l diversity rules and MDAV algorithms
CN107305614A (en) * 2017-08-12 2017-10-31 西安电子科技大学 A kind of method based on the MLDM algorithm process big datas for meeting Second Aggregation
CN107358113A (en) * 2017-06-01 2017-11-17 徐州医科大学 Based on the anonymous difference method for secret protection of micro- aggregation
CN107688751A (en) * 2017-08-17 2018-02-13 复旦大学 A kind of adaptive method for secret protection of social media user behavior temporal mode
CN108133146A (en) * 2017-06-01 2018-06-08 徐州医科大学 Sensitive Attributes l-diversity method for secret protection based on secondary division
CN109388972A (en) * 2018-10-29 2019-02-26 山东科技大学 Medical data Singular variance difference method for secret protection based on OPTICS cluster
CN109684862A (en) * 2017-10-18 2019-04-26 财团法人工业技术研究院 Data remove identificationization method, apparatus and computer readable storage media
CN109726589A (en) * 2018-12-22 2019-05-07 北京工业大学 A kind of private data access method towards many intelligence cloud environments
CN110008742A (en) * 2019-03-21 2019-07-12 九江学院 It is a kind of to regularly publish the anonymous guard method of the leakage of the efficient Q value zero in private data for SRS
CN110378148A (en) * 2019-07-25 2019-10-25 哈尔滨工业大学 A kind of multiple domain data-privacy guard method of facing cloud platform
CN110659513A (en) * 2019-09-29 2020-01-07 哈尔滨工程大学 Anonymous privacy protection method for multi-sensitive attribute data release
EP3591561A1 (en) 2018-07-06 2020-01-08 Synergic Partners S.L.U. An anonymized data processing method and computer programs thereof
CN110807208A (en) * 2019-10-31 2020-02-18 北京工业大学 K anonymous privacy protection method capable of meeting personalized requirements of users
CN111201532A (en) * 2017-10-11 2020-05-26 日本电信电话株式会社 k-anonymization apparatus, method, and program
CN113051619A (en) * 2021-04-30 2021-06-29 河南科技大学 K-anonymity-based traditional Chinese medicine prescription data privacy protection method
CN113742781A (en) * 2021-09-24 2021-12-03 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101834872A (en) * 2010-05-19 2010-09-15 天津大学 Data processing method of K-Anonymity anonymity algorithm based on degree priority
CN102156755A (en) * 2011-05-06 2011-08-17 天津大学 K-cryptonym improving method
JP2014164477A (en) * 2013-02-25 2014-09-08 Hitachi Systems Ltd K-anonymity database control device and control method
WO2014176024A1 (en) * 2013-04-25 2014-10-30 International Business Machines Corporation Guaranteeing anonymity of linked data graphs

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709372B (en) * 2017-01-09 2019-04-30 广西师范大学 Enhanced identity-preservation privacy protection method
CN106709372A (en) * 2017-01-09 2017-05-24 广西师范大学 Enhanced identity-preservation privacy protection method
CN107273757A (en) * 2017-04-23 2017-10-20 西安电子科技大学 Method for processing big data based on l-diversity rule and MDAV algorithm
CN107273757B (en) * 2017-04-23 2020-08-18 西安电子科技大学 Method for processing big data based on l-diversity rule and MDAV algorithm
CN107358113A (en) * 2017-06-01 2017-11-17 徐州医科大学 Differential privacy protection method based on micro-aggregation anonymity
CN108133146A (en) * 2017-06-01 2018-06-08 徐州医科大学 Sensitive attribute l-diversity privacy protection method based on secondary division
CN107305614A (en) * 2017-08-12 2017-10-31 西安电子科技大学 Method for processing big data based on MLDM algorithm meeting secondary aggregation
CN107305614B (en) * 2017-08-12 2020-05-26 西安电子科技大学 Method for processing big data based on MLDM algorithm meeting secondary aggregation
CN107688751A (en) * 2017-08-17 2018-02-13 复旦大学 Self-adaptive privacy protection method for social media user behavior time mode
CN107688751B (en) * 2017-08-17 2021-02-26 复旦大学 Self-adaptive privacy protection method for social media user behavior time mode
CN111201532B (en) * 2017-10-11 2023-08-15 日本电信电话株式会社 k-anonymizing device, method and recording medium
CN111201532A (en) * 2017-10-11 2020-05-26 日本电信电话株式会社 k-anonymization apparatus, method, and program
CN109684862A (en) * 2017-10-18 2019-04-26 财团法人工业技术研究院 Data de-identification method, apparatus, and computer-readable storage medium
CN109684862B (en) * 2017-10-18 2021-07-20 财团法人工业技术研究院 Data de-identification method and device and computer readable storage medium
EP3591561A1 (en) 2018-07-06 2020-01-08 Synergic Partners S.L.U. An anonymized data processing method and computer programs thereof
CN109388972A (en) * 2018-10-29 2019-02-26 山东科技大学 Medical data singular-variance differential privacy protection method based on OPTICS clustering
CN109726589B (en) * 2018-12-22 2021-11-12 北京工业大学 Crowd-sourcing cloud environment-oriented private data access method
CN109726589A (en) * 2018-12-22 2019-05-07 北京工业大学 Crowd-sourcing cloud environment-oriented private data access method
CN110008742A (en) * 2019-03-21 2019-07-12 九江学院 Efficient Q-value zero-leakage anonymous protection method for SRS periodic release of private data
CN110378148A (en) * 2019-07-25 2019-10-25 哈尔滨工业大学 Multi-domain data privacy protection method facing cloud platform
CN110378148B (en) * 2019-07-25 2023-02-03 哈尔滨工业大学 Multi-domain data privacy protection method facing cloud platform
CN110659513A (en) * 2019-09-29 2020-01-07 哈尔滨工程大学 Anonymous privacy protection method for multi-sensitive attribute data release
CN110659513B (en) * 2019-09-29 2022-12-06 哈尔滨工程大学 Anonymous privacy protection method for multi-sensitive attribute data release
CN110807208A (en) * 2019-10-31 2020-02-18 北京工业大学 K anonymous privacy protection method capable of meeting personalized requirements of users
CN110807208B (en) * 2019-10-31 2022-02-18 北京工业大学 K anonymous privacy protection method capable of meeting personalized requirements of users
CN113051619A (en) * 2021-04-30 2021-06-29 河南科技大学 K-anonymity-based traditional Chinese medicine prescription data privacy protection method
CN113051619B (en) * 2021-04-30 2023-03-03 河南科技大学 K-anonymity-based traditional Chinese medicine prescription data privacy protection method
CN113742781A (en) * 2021-09-24 2021-12-03 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal
CN113742781B (en) * 2021-09-24 2024-04-05 湖北工业大学 K anonymous clustering privacy protection method, system, computer equipment and terminal

Also Published As

Publication number Publication date
CN106021541B (en) 2017-08-04

Similar Documents

Publication Publication Date Title
CN106021541A (en) Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes
CN106294762B (en) Entity identification method based on learning
Bagui et al. Positive and negative association rule mining in Hadoop’s MapReduce environment
Gkountouna et al. Anonymizing collections of tree-structured data
Dharavath et al. Entity resolution-based jaccard similarity coefficient for heterogeneous distributed databases
Gao et al. Real-time social media retrieval with spatial, temporal and social constraints
Zhang et al. Exploring time factors in measuring the scientific impact of scholars
Shaham et al. Machine learning aided anonymization of spatiotemporal trajectory datasets
Mei et al. Proximity-based k-partitions clustering with ranking for document categorization and analysis
Huang et al. Design a batched information retrieval system based on a concept-lattice-like structure
Li et al. Optimal k-anonymity with flexible generalization schemes through bottom-up searching
Yang et al. Top k probabilistic skyline queries on uncertain data
Christen et al. Advanced record linkage methods and privacy aspects for population reconstruction—a survey and case studies
Xie et al. Efficient storage management for social network events based on clustering and hot/cold data classification
Priya et al. Entity resolution for high velocity streams using semantic measures
Rosidin et al. Improvement with Chi Square Selection Feature using Supervised Machine Learning Approach on Covid-19 Data
Fischer et al. Timely semantics: a study of a stream-based ranking system for entity relationships
Podlesny et al. Towards identifying de-anonymisation risks in distributed health data silos
Ni et al. Differential private preservation multi-core DBScan clustering for network user data
Mohammed et al. Complementing privacy and utility trade-off with self-organising maps
Azman Efficient identity matching using static pruning q-gram indexing approach
Abdullah et al. FCA-ARMM: a model for mining association rules from formal concept analysis
Adusumalli et al. An efficient and dynamic concept hierarchy generation for data anonymization
Zhang et al. Discovering top-k patterns with differential privacy-an accurate approach
Chen et al. Personalized trajectory privacy-preserving method based on sensitive attribute generalization and location perturbation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant