CN106021541B - Secondary k-anonymity privacy protection algorithm distinguishing quasi-identifier attribute types - Google Patents

Info

Publication number
CN106021541B
CN106021541B CN201610361877.2A CN201610361877A CN106021541B CN 106021541 B CN106021541 B CN 106021541B CN 201610361877 A CN201610361877 A CN 201610361877A CN 106021541 B CN106021541 B CN 106021541B
Authority
CN
China
Prior art keywords
generalization
node
data
anonymous
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610361877.2A
Other languages
Chinese (zh)
Other versions
CN106021541A (en)
Inventor
吴响
王换换
臧昊
俞啸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Medical University
Original Assignee
Xuzhou Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuzhou Medical University filed Critical Xuzhou Medical University
Priority to CN201610361877.2A priority Critical patent/CN106021541B/en
Publication of CN106021541A publication Critical patent/CN106021541A/en
Application granted granted Critical
Publication of CN106021541B publication Critical patent/CN106021541B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a secondary k-anonymity privacy protection method that distinguishes quasi-identifier attribute types, and relates to the field of data privacy protection. Using the Incognito function, the method builds the generalization lattice of every single attribute and checks whether each generalization satisfies k-anonymity. Nodes that do not satisfy k-anonymity are deleted, and the nodes that do are iterated to form a candidate node set. The candidate nodes are then checked for k-anonymity and the unqualified nodes are deleted. These steps are repeated until all categorical attributes have been processed, and all root nodes that satisfy k-anonymity are output. The data table T is generalized by each root node in turn, and the MDAV algorithm performs a secondary generalization on the generalized table T', partitioning the input into equivalence classes of between k and 2k-1 tuples. After all partitions are complete, the information loss is computed and compared, and the data table with the smallest loss is selected.

Description

Secondary k-anonymity privacy protection algorithm distinguishing quasi-identifier attribute types
Technical field
The present invention relates to the field of data privacy protection, and specifically to a secondary k-anonymity privacy protection algorithm that distinguishes between types of quasi-identifier attributes.
Background
Information technology is developing rapidly and more and more data are shared. How to keep the private information in published data from being maliciously obtained by attackers, while still letting data recipients make full use of the data for effective exploration and scientific research, has become an important information security problem. k-anonymity is an effective method of protecting private data and has received wide attention in recent years. The k-anonymity technique was proposed by Samarati and Sweeney in 1998. It requires that the published data contain a certain number (k) of indistinguishable individuals, so that an attacker cannot determine which individual a piece of private information belongs to.
Many studies show that the Incognito algorithm can k-anonymize large-scale data efficiently, but k-anonymization algorithms based on global recoding over-generalize numeric variables and lose considerable semantics. MDAV is a classic partition-based anonymous clustering algorithm that can efficiently cluster large-scale numeric data sets.
Research on k-anonymity has concentrated on preserving as much data utility as possible while protecting private information. At present, most data anonymization methods share two defects: 1) they are better suited to categorical data (nominal and ordinal), and generalizing numeric data often loses much of the numeric semantics; 2) when the number of quasi-identifier attributes grows sharply, the so-called "curse of dimensionality" (dimension trap) appears. The dimension trap causes very large information loss and degrades the utility of the published data table.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a secondary k-anonymity privacy protection algorithm that distinguishes quasi-identifier attribute types, which greatly reduces the information loss caused by using the Incognito algorithm alone.
The present invention is realized by the following technical scheme: a secondary k-anonymity privacy protection algorithm distinguishing quasi-identifier attribute types, comprising:
1) determine the attribute types in the quasi-identifier set;
2) Sn = Incognito(T, CQI, k), where Sn denotes the data set in which the categorical attributes have been generalized, T denotes the data set to be generalized, CQI denotes the set of categorical quasi-identifiers, and k denotes the anonymity constraint;
3) create an empty queue result and an empty node node;
4) traverse Sn, entering the following loop:
  initialize the data set Dj, which stores the fully generalized data table;
  read a node from Sn into node;
  generalize the data table T according to node to obtain T′;
  traverse T′, entering the following loop:
    store the i-th equivalence class of T′ in T′i;
    MDAV(T′i, NQI, k), where T′i denotes the data set to be clustered, NQI denotes the numeric attributes to be clustered, and k denotes the anonymity constraint;
    Dj = Dj ∪ T′i;
  compute the information loss and insert it into result;
5) compare the information losses in result and obtain the Dj with the smallest information loss;
6) T″ = Dj; return T″.
Preferably, the specific steps of the categorical attribute generalization Incognito(T, CQI, k) are as follows:
1) build the single-attribute generalization candidate node table C1 and edge table E1;
2) use an empty queue queue to take out all root nodes of C1, and compute the equivalence classes for every node in queue;
3) check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all its child nodes; otherwise delete the node from C1 and insert its child nodes into queue;
4) repeat step 3) until all unsatisfying nodes have been deleted from C1, then form the new tables C2 and E2 from the pruned C1 and E1;
5) repeat steps 2), 3) and 4) to obtain the pruned Cn;
6) Sn = {all nodes of Cn};
7) return Sn.
Preferably, the specific steps of the numeric attribute generalization MDAV(T′i, NQI, k) are as follows:
1) check whether the number of tuples in the data set is greater than 2k-1; if so, continue with step 2); otherwise return the data set T′i and find its centroid;
2) in the data set T′i, find the two tuples r and s that are farthest apart with respect to NQI;
3) taking r as the centroid, find the k-1 tuples nearest to r to form equivalence class C, update the centroid, delete these k tuples from T′i and put them into the cluster set {Q};
4) taking s as the centroid, repeat step 3);
5) check whether the number of tuples remaining in T′i is greater than 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T′i and find its centroid;
6) replace the quasi-identifier attribute values of the tuples in each equivalence class with the quasi-identifier attribute values of its centroid;
7) return T′i.
The beneficial effects of the invention are as follows: the method obtains the frequent item sets of categorical attributes that satisfy k-anonymity and then micro-aggregates the numeric attributes, which avoids the over-generalization of numeric attributes that global generalization can cause. The source data table can be partitioned optimally into equivalence classes of between k and 2k-1 tuples, which greatly reduces the information loss caused by using the Incognito algorithm alone.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the present invention;
Fig. 2 is the structure chart formed by the three attributes sex, race and job category;
Fig. 3 shows the relationship between information loss IL and k when |QI| = 6+1;
Fig. 4 shows the relationship between information loss IL and k when |QI| = 6+2;
Fig. 5 shows the relationship between time T and k when |QI| = 6+1;
Fig. 6 shows the relationship between time T and k when |QI| = 6+2;
Fig. 7 shows the relationship between the time difference and k.
Detailed description of the embodiments
To realize k-anonymity, the definitions related to the NQLG algorithm are given using Table 1 as an example. Suppose the data publisher holds the data table T(A1, A2, ..., An), in which each tuple carries the information of one specific entity, such as Age, Workclass, Race, Sex, Hours-per-week and Salary; see Table 1.
Table 1
Definition 1 (quasi-identifier): Given a population of entities U, a specific data table T(A1, A2, ..., An), fc: U → T and fg: T → U′, where U ⊆ U′. A quasi-identifier of T, written QI_T, is a set of attributes {Ai, ..., Aj} ⊆ {A1, ..., An} for which there exists pi ∈ U such that fg(fc(pi)[QI_T]) = pi. Any attribute in Table 1 can serve as a quasi-identifier; the quasi-identifiers are chosen according to actual needs.
Definition 2 (generalization rule): Given an attribute Q and f: Q → Q′, where f belongs to the set of generalization functions acting on attribute Q, the successive application of f1, f2, ..., fm represents the process of generalizing the quasi-identifier in order, and {f1, f2, ..., fm} represents the generalization rule. Fig. 2 shows the structure chart formed by the three attributes sex, race and job category.
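The generalization rule of Definition 2 can be illustrated with a small sketch. The hierarchy below is hypothetical (the labels "Non-government", "Self-employed" and "*" and the helper name `generalize` are assumptions made for illustration, not values taken from Fig. 2); each list holds the value of one attribute at generalization levels 0, 1 and 2.

```python
# Hypothetical generalization hierarchy for the Workclass attribute.
# Index 0 is the raw value; each later index is one generalization step (f1, f2, ...).
WORKCLASS_HIERARCHY = {
    "Private": ["Private", "Non-government", "*"],
    "Self-emp-not-inc": ["Self-emp-not-inc", "Self-employed", "*"],
    "Self-emp-inc": ["Self-emp-inc", "Self-employed", "*"],
}


def generalize(value, hierarchy, level):
    """Apply `level` generalization steps to `value`, stopping at the top of the chain."""
    chain = hierarchy[value]
    return chain[min(level, len(chain) - 1)]
```

Applying one step maps both self-employed values to the same label, which is what makes tuples indistinguishable on that attribute.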
Definition 3 (k-anonymity): Given a data table T(A1, ..., An) and its associated quasi-identifier QI_T, table T satisfies k-anonymity if and only if each tuple of T[QI_T] occurs at least k times in T[QI_T].
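Definition 3 translates directly into a frequency count over the quasi-identifier projection. The following is a minimal sketch; the function name `is_k_anonymous` and the row-as-dictionary representation are assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    # Project each row onto the quasi-identifier attributes and count duplicates.
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())
```

Each distinct key of the counter corresponds to one equivalence class, and its count is the size of that class.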
As shown in Table 1, the table contains 6 tuples, each corresponding to the information of one specific person. The first column is the sequence number, which gives the relative storage position of each record in the data table; the second column is the age attribute; the third column is the work attribute; the fourth column is the race attribute; the fifth column is the sex attribute; the sixth column is the working-hours attribute; and the last column, being the information to be protected, serves as the sensitive attribute of the table. The quasi-identifier of T in Table 1 is therefore QI_T = {Age, Workclass, Race, Sex, Hours-per-week}. Table 2 is the published result of Table 1 after 2-anonymization. By the definition of an equivalence class, Table 2 contains 3 equivalence classes: {R1, R2}, {R3, R4} and {R5, R6}. The tuples in these equivalence classes satisfy:
R1[QI_T] = R2[QI_T] = {[21,30], Self-emp-not-inc, Amer-Indian-Eskimo, Female, [21-30]},
R3[QI_T] = R4[QI_T] = {[31,40], Private, Amer-Indian-Eskimo, Male, [31-40]},
R5[QI_T] = R6[QI_T] = {[41,50], Private, Amer-Indian-Eskimo, Male, [41-50]}. An attacker using a linking attack therefore obtains the sensitive information with a probability of only 1/k = 1/2. The k-anonymized data table (Table 2) can thus effectively prevent linking attacks. Table 2 is the data of Table 1 after 2-anonymization:
Table 2
Definition 4 (categorical attribute generalization): The data set is partitioned and the categorical data are generalized. Let {R1, ..., Ri} be the categorical attributes, with R1, ..., Ri ∈ T. If T(R1, ..., Rj) satisfies k-anonymity, that is, if and only if each tuple of T(R1, ..., Rj) occurs at least k times in T(R1, ..., Rj), the categorical attribute generalization is complete, and the frequent item set can be written T′(R1, ..., Rj, ..., S1, ..., Sn).
Definition 5 (numeric attribute generalization): Given the frequent item set T′(R1, ..., Ri, ..., S1, ..., Sn) obtained by generalizing the categorical data, where T′(S1, ..., Sn) are the numeric attributes, the numeric attribute generalization on T is written K_expG(T″), where K is the name of the secondary anonymization function, exp is a numeric expression and G is the generalization rule; δG completes the generalization of the numeric tuple data.
Definition 6 (distance between numeric tuples): For a given tuple set T(t1, t2, ..., tn) and two tuples ti, tj ∈ T, the distance between the tuples, dn(ti, tj), is their actual distance over all numeric quasi-identifiers,
where ti and tj denote distinct numeric tuples and dn denotes the actual distance between two numeric tuples.
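As an illustration of Definition 6, the sketch below assumes that the "actual distance" is the Euclidean distance over the numeric quasi-identifiers. That choice, the function name `tuple_distance` and the attribute names are assumptions; another metric (or normalized attributes) could equally be intended.

```python
import math


def tuple_distance(t1, t2, numeric_qi):
    """Euclidean distance between two tuples over the numeric quasi-identifiers (assumed metric)."""
    return math.sqrt(sum((t1[a] - t2[a]) ** 2 for a in numeric_qi))
```

For example, two tuples that differ by 3 in Age and by 4 in Hours_per_week are at distance 5 under this assumption.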
As shown in Fig. 1, building on the Incognito and MDAV algorithms, an efficient k-anonymity algorithm is proposed here: the NQLG algorithm. It combines the two algorithms. First, the Incognito algorithm is applied to the categorical quasi-identifiers to obtain the nodes that satisfy k-anonymity, and all root nodes are determined. The data table is then generalized according to each root node in turn, and the MDAV algorithm clusters the numeric attributes so that the resulting equivalence classes form an optimal k-partition, with between k and 2k-1 tuples in each equivalence class. Finally, the generalization results obtained from the root nodes are compared and the generalized data table with the smallest information loss is selected. The algorithm is described as follows:
Categorical attribute generalization
Function: Incognito(T, CQI, k), where T denotes the data set to be generalized, CQI denotes the set of categorical quasi-identifiers, and k denotes the anonymity constraint;
1) build the single-attribute generalization candidate node table C1 and edge table E1;
2) use an empty queue queue to take out all root nodes of C1, and compute the equivalence classes for every node in queue;
3) check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all its child nodes; otherwise delete the node from C1 and insert its child nodes into queue;
4) repeat step 3) until all unsatisfying nodes have been deleted from C1, then form the new tables C2 and E2 from the pruned C1 and E1;
5) repeat steps 2), 3) and 4) to obtain the pruned Cn;
6) Sn = {all nodes of Cn};
7) return Sn.
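The pruning idea in steps 2) and 3) can be sketched as a bottom-up breadth-first search of the generalization lattice. This is a simplified illustration, not the full procedure: it searches the lattice over all categorical attributes directly and omits the iteration over attribute subsets (C1, E1, ..., Cn). The function names, the hierarchy format (value mapped to a list of labels per level, all chains of one attribute having the same length) and the node encoding (a tuple of levels) are assumptions. The "child nodes" of the steps above correspond to the direct generalizations of a node here.

```python
from collections import Counter, deque


def satisfies_k(rows, attrs, hierarchies, node, k):
    """Check k-anonymity of rows generalized to the levels given by node."""
    counts = Counter(
        tuple(hierarchies[a][row[a]][lvl] for a, lvl in zip(attrs, node))
        for row in rows
    )
    return all(c >= k for c in counts.values())


def incognito_sketch(rows, attrs, hierarchies, k):
    """Return (roots, satisfying): minimal k-anonymous nodes and all k-anonymous nodes."""
    heights = [len(next(iter(hierarchies[a].values()))) - 1 for a in attrs]
    start = (0,) * len(attrs)
    queue, seen = deque([start]), {start}
    marked, roots, satisfying = set(), [], []
    while queue:
        node = queue.popleft()
        # Direct generalizations: raise exactly one attribute by one level.
        generalizations = [
            node[:i] + (node[i] + 1,) + node[i + 1:]
            for i in range(len(attrs)) if node[i] < heights[i]
        ]
        if node in marked:
            ok = True  # implied by a less general node that already satisfied k
        else:
            ok = satisfies_k(rows, attrs, hierarchies, node, k)
            if ok:
                roots.append(node)
        if ok:
            satisfying.append(node)
            marked.update(generalizations)
        for g in generalizations:
            if g not in seen:
                seen.add(g)
                queue.append(g)
    return roots, satisfying
```

Marking the generalizations of a satisfying node avoids recomputing equivalence classes for them, since any generalization of a k-anonymous table is also k-anonymous.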
Numeric attribute generalization
Function: MDAV(T′, NQI, k), where T′ denotes the data set to be clustered, NQI denotes the numeric attributes to be clustered, and k denotes the anonymity constraint;
1) check whether the number of tuples in the data set is greater than 2k-1; if so, continue with step 2); otherwise return the data set T′ and find its centroid;
2) in the data set T′, find the two tuples r and s that are farthest apart with respect to NQI;
3) taking r as the centroid, find the k-1 tuples nearest to r to form equivalence class C, update the centroid, delete these k tuples from T′ and put them into the cluster set {Q};
4) taking s as the centroid, repeat step 3);
5) check whether the number of tuples remaining in T′ is greater than 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T′ and find its centroid;
6) replace the quasi-identifier attribute values of the tuples in each equivalence class with the quasi-identifier attribute values of its centroid;
7) return T′.
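The micro-aggregation steps above can be sketched as follows. Two points are assumptions rather than part of the description: the distance is taken to be Euclidean, and the loop thresholds follow the commonly published MDAV variant (two clusters while at least 3k tuples remain, one more cluster if at least 2k remain) instead of the 2k-1 test of steps 1) and 5), because that variant guarantees that every cluster ends up with between k and 2k-1 tuples. All function names are hypothetical, and the input is assumed to hold at least k tuples.

```python
import math


def dist(a, b, nqi):
    """Euclidean distance over the numeric quasi-identifiers (assumed metric)."""
    return math.sqrt(sum((a[q] - b[q]) ** 2 for q in nqi))


def take_cluster(seed, pool, nqi, k):
    """Remove the k tuples closest to seed from pool and return them."""
    pool.sort(key=lambda t: dist(seed, t, nqi))
    cluster, pool[:] = pool[:k], pool[k:]
    return cluster


def farthest_pair(pool, nqi):
    """Return the two tuples of pool that are farthest apart."""
    pairs = ((a, b) for i, a in enumerate(pool) for b in pool[i + 1:])
    return max(pairs, key=lambda p: dist(p[0], p[1], nqi))


def mdav_sketch(rows, nqi, k):
    """Partition rows into clusters of k..2k-1 tuples and replace values by centroids."""
    pool = [dict(r) for r in rows]  # work on copies, leave the input untouched
    clusters = []
    while len(pool) >= 3 * k:
        r, s = farthest_pair(pool, nqi)
        clusters.append(take_cluster(r, pool, nqi, k))
        clusters.append(take_cluster(s, pool, nqi, k))
    if len(pool) >= 2 * k:
        r, _ = farthest_pair(pool, nqi)
        clusters.append(take_cluster(r, pool, nqi, k))
    if pool:
        clusters.append(pool)  # the remaining k..2k-1 tuples form the last cluster
    for cluster in clusters:
        for q in nqi:
            centroid = sum(t[q] for t in cluster) / len(cluster)
            for t in cluster:
                t[q] = centroid
    return clusters
```

The farthest-pair search is quadratic in the number of tuples, which is acceptable here because MDAV is applied to one equivalence class at a time rather than to the whole table.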
Implementation of the NQLG algorithm
1) determine the attribute types in the quasi-identifier set;
2) Sn = Incognito(T, CQI, k);
  Sn is the data set in which the categorical attributes have been generalized;
3) create an empty queue result and an empty node node;
4) traverse Sn, entering the following loop:
  initialize the data set Dj, which stores the fully generalized data table;
  read a node from Sn into node;
  generalize the data table T according to node to obtain T′;
  traverse T′, entering the following loop:
    store the i-th equivalence class of T′ in T′i;
    MDAV(T′i, NQI, k);
    Dj = Dj ∪ T′i;
  compute the information loss and insert it into result;
5) compare the information losses in result and obtain the Dj with the smallest information loss;
6) T″ = Dj; return T″.
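The control flow of steps 4) to 6) can be sketched as a small driver. To keep the example self-contained, the generalization, micro-aggregation and information-loss routines are passed in as functions; the driver name `nqlg_sketch` and all parameter names are hypothetical, and the list `nodes` is assumed to be non-empty.

```python
from collections import defaultdict


def nqlg_sketch(table, nodes, cqi, generalize_table, microaggregate, info_loss):
    """For each candidate node, generalize, micro-aggregate per equivalence class,
    and return the resulting table with the smallest information loss."""
    results = []
    for node in nodes:
        t_prime = generalize_table(table, node)  # T' for this node
        # Group T' into equivalence classes on the categorical quasi-identifiers.
        classes = defaultdict(list)
        for row in t_prime:
            classes[tuple(row[a] for a in cqi)].append(row)
        d_j = []  # D_j collects the fully generalized tuples
        for eq_class in classes.values():
            d_j.extend(microaggregate(eq_class))
        results.append((info_loss(table, d_j), d_j))
    # Steps 5) and 6): keep the D_j with the smallest loss.
    return min(results, key=lambda r: r[0])[1]
```

Passing the three routines as arguments keeps the selection logic independent of how generalization and loss are actually computed.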
As the above steps show, the NQLG algorithm uses the Incognito function to build the generalization lattice of every single attribute and checks whether each generalization satisfies k-anonymity. Nodes that do not satisfy k-anonymity are deleted, and the nodes that do are iterated to form a candidate node set. The candidate nodes are then checked for k-anonymity and the unqualified nodes are deleted. These steps are repeated until all categorical attributes have been processed, and all root nodes that satisfy k-anonymity are output. The data table T is generalized by each root node in turn, and the MDAV algorithm performs a secondary generalization on the generalized table T′, partitioning the input into equivalence classes of between k and 2k-1 tuples. After all partitions are complete, the information loss is computed and compared, and the data table with the smallest loss is selected.
Rationality analysis of the NQLG algorithm: Step 2) yields the frequent item sets of categorical attributes that satisfy k-anonymity. The numeric attributes are then micro-aggregated, which avoids the over-generalization of numeric attributes that global generalization can cause. After step 4), the source data table is partitioned optimally into equivalence classes of between k and 2k-1 tuples, which greatly reduces the information loss caused by using the Incognito algorithm alone.
Complexity analysis of the NQLG algorithm: Suppose the data set contains n tuples and there are m categorical quasi-identifiers. The time cost of the algorithm is analyzed as follows. Step 1 costs O(1). Step 2 uses the Incognito algorithm to find the k-anonymous solutions for the categorical attributes, and its time cost is O(∑Ci), where Ci is the number of nodes in the i-th iteration. Step 3 costs O(1). The time cost of step 4 depends on l, the number of root nodes after the first generalization, and on the time complexity of the MDAV algorithm over the j large equivalence classes obtained in the previous step. Step 5 costs O(l). The overall time cost of the algorithm follows from these terms.
Experimental verification of the NQLG algorithm and analysis of results:
Experimental environment: The hardware environment used in the experiments has 4 GB of memory and runs the Windows 7 operating system; the algorithm is implemented with Java and SQL Server 2008. The Adult data set from the UCI Machine Learning Repository is used as the experimental data set. The Adult data set consists of US census data. Using the training set and removing the records with missing values leaves 30162 records. Eight attributes are chosen: Sex, Race, Hours_per_week, Marital_status, Education, Workclass, Native_country and Age. Age and Hours_per_week are continuous quasi-identifiers; Sex, Race, Marital_status, Education, Workclass and Native_country are categorical quasi-identifiers.
Analysis of experimental results: The Incognito algorithm is chosen as the comparison algorithm. The k-anonymized data set is anonymized a second time with the MDAV algorithm, and the proposed algorithm is evaluated in terms of information loss and execution time. The NQLG algorithm is run with different numbers of quasi-identifiers and different values of k to observe how information loss and execution time change. The information loss is computed with the method from the literature:
Information loss of an equivalence class:
Information loss of the table:
|e| is the number of tuples in cluster e, 1 ≤ l ≤ m, Ni is the range of the i-th numeric attribute, MAXNi and MINNi are the maximum and minimum values in cluster e, H(TCi) is the height of the taxonomy tree, and H(∧(∪Cj)) is the height of the subtree rooted at the lowest common ancestor.
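As an illustration only, the sketch below assumes a loss of the form |e| · (Σ (MAXNi − MINNi)/|Ni| + Σ H(∧(∪Cj))/H(TCj)) for one equivalence class, summed over all classes for the table. This is one common clustering-based measure that is consistent with the symbols defined above; it is an assumption, not necessarily the exact formula used in the experiments. The function names and the hierarchy format (value mapped to a list of labels per level, ending in a shared root label) are likewise hypothetical.

```python
def class_info_loss(cluster, numeric_ranges, hierarchies):
    """Assumed loss of one equivalence class: size times the sum of per-attribute losses."""
    loss = 0.0
    for attr, full_range in numeric_ranges.items():
        values = [t[attr] for t in cluster]
        # Spread of the cluster relative to the full range of the attribute.
        loss += (max(values) - min(values)) / full_range
    for attr, chains in hierarchies.items():
        height = len(next(iter(chains.values()))) - 1  # height of the taxonomy tree
        # Lowest level at which all values of the cluster share one label.
        level = next(
            lvl for lvl in range(height + 1)
            if len({chains[t[attr]][lvl] for t in cluster}) == 1
        )
        loss += level / height
    return len(cluster) * loss


def table_info_loss(clusters, numeric_ranges, hierarchies):
    """Assumed loss of the table: sum of the losses of its equivalence classes."""
    return sum(class_info_loss(c, numeric_ranges, hierarchies) for c in clusters)
```

Under this assumption a class whose tuples already agree on every attribute contributes zero loss, and a class that spans the full range of every attribute contributes its size times the number of attributes.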
Information loss analysis: As Fig. 3 and Fig. 4 show, for a fixed quasi-identifier set |QI|, the information loss IL of the proposed algorithm tends to decrease as k increases, and when k reaches 50 the information loss of both algorithms tends to rise. The experimental data show that the information loss of the proposed algorithm is significantly lower than that of the Incognito algorithm. From the viewpoint of information loss, the proposed algorithm therefore has a great advantage in avoiding over-generalization.
Run time analysis: As Fig. 5 and Fig. 6 show, for a fixed quasi-identifier set, the run times of the Incognito algorithm and of the proposed algorithm both decrease as k increases. Comparing the plots for the different quasi-identifier sets QI: when |QI| = 6+1 (6 categorical attributes + 1 numeric attribute), the Incognito algorithm has the better run time, whereas when |QI| = 6+2 (6 categorical attributes + 2 numeric attributes), the proposed algorithm has the better run time as k increases. The experimental data show that the advantage of the proposed algorithm becomes more pronounced as the number of numeric quasi-identifiers increases.
As Fig. 7 shows, as k decreases, the time difference Δt between the two quasi-identifier sets (|QI| = 6+2 and |QI| = 6+1) increases for both the Incognito algorithm and the proposed algorithm, but the increase for the Incognito algorithm is far larger than that for the proposed algorithm. From the viewpoint of efficiency, the advantage of the proposed algorithm therefore grows markedly as the proportion of numeric quasi-identifiers in the quasi-identifier set |QI| changes.
This work addresses the over-generalization of numeric attributes caused by the Incognito algorithm and the semantic containment problem in cluster partitioning, and proposes the NQLG algorithm. Experiments show that, compared with traditional privacy protection algorithms, the NQLG algorithm has clear advantages in handling semantic loss and semantic containment. Future research can proceed in the following directions: since data may be published more than once, the NQLG algorithm can be extended to dynamic data sets; and as data volumes grow sharply, distributed and cloud computing technology can be introduced into anonymization research to further improve the efficiency of processing massive data.

Claims (3)

1. A secondary k-anonymity privacy protection method distinguishing quasi-identifier attribute types, characterized in that:
1) Sn = Incognito(T, CQI, k), where Sn denotes the data set in which the categorical attributes have been generalized, T denotes the data set to be generalized, CQI denotes the set of categorical quasi-identifiers, and k denotes the anonymity constraint;
2) create an empty queue result and an empty node node;
3) traverse Sn, entering the following loop:
  initialize the data set Dj, which stores the fully generalized data table;
  read a node from Sn into node;
  generalize the data set T according to node to obtain T′;
  traverse T′, entering the following loop:
    store the i-th equivalence class of T′ in T′i;
    MDAV(T′i, NQI, k), where T′i denotes the data set to be clustered, NQI denotes the numeric attributes to be clustered, and k denotes the anonymity constraint;
    Dj = Dj ∪ T′i;
  compute the information loss and insert it into result;
4) compare the information losses in result and obtain the Dj with the smallest information loss;
5) T″ = Dj; return T″.
2. The secondary k-anonymity privacy protection method distinguishing quasi-identifier attribute types according to claim 1, characterized in that the specific steps of the categorical attribute generalization Incognito(T, CQI, k) are as follows:
1) build the single-attribute generalization candidate node table C1 and edge table E1;
2) use an empty queue queue to take out all root nodes of C1, and compute the equivalence classes for every node in queue;
3) check whether k-anonymity is satisfied: if a node satisfies it, mark the node and all its child nodes; otherwise delete the node from C1 and insert its child nodes into queue;
4) repeat step 3) until all unsatisfying nodes have been deleted from C1, then form the new tables C2 and E2 from the pruned C1 and E1;
5) repeat steps 2), 3) and 4) to obtain the pruned Cn;
6) Sn = {all nodes of Cn};
7) return Sn.
3. The secondary k-anonymity privacy protection method distinguishing quasi-identifier attribute types according to claim 1, characterized in that the specific steps of the numeric attribute generalization MDAV(T′i, NQI, k) are as follows:
1) check whether the number of tuples in the data set is greater than 2k-1; if so, continue with step 2); otherwise return the data set T′i and find its centroid;
2) in the data set T′i, find the two tuples r and s that are farthest apart with respect to NQI;
3) taking r as the centroid, find the k-1 tuples nearest to r to form equivalence class C, update the centroid, delete these k tuples from T′i and put them into the cluster set {Q};
4) taking s as the centroid, repeat step 3);
5) check whether the number of tuples remaining in T′i is greater than 2k-1; if so, repeat steps 2), 3) and 4); otherwise return the data set T′i and find its centroid;
6) replace the quasi-identifier attribute values of the tuples in each equivalence class with the quasi-identifier attribute values of its centroid;
7) return T′i.
CN201610361877.2A 2016-05-26 2016-05-26 Distinguish the anonymous Privacy preserving algorithms of secondary k of standard identifier attribute Active CN106021541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610361877.2A CN106021541B (en) 2016-05-26 2016-05-26 Distinguish the anonymous Privacy preserving algorithms of secondary k of standard identifier attribute

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610361877.2A CN106021541B (en) 2016-05-26 2016-05-26 Distinguish the anonymous Privacy preserving algorithms of secondary k of standard identifier attribute

Publications (2)

Publication Number Publication Date
CN106021541A CN106021541A (en) 2016-10-12
CN106021541B true CN106021541B (en) 2017-08-04

Family

ID=57093604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610361877.2A Active CN106021541B (en) 2016-05-26 2016-05-26 Distinguish the anonymous Privacy preserving algorithms of secondary k of standard identifier attribute

Country Status (1)

Country Link
CN (1) CN106021541B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101834872A (en) * 2010-05-19 2010-09-15 天津大学 Data processing method of K-Anonymity anonymity algorithm based on degree priority
CN102156755A (en) * 2011-05-06 2011-08-17 天津大学 K-anonymity improving method
JP2014164477A (en) * 2013-02-25 2014-09-08 Hitachi Systems Ltd K-anonymity database control device and control method
WO2014176024A1 (en) * 2013-04-25 2014-10-30 International Business Machines Corporation Guaranteeing anonymity of linked data graphs

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107305614A (en) * 2017-08-12 2017-10-31 西安电子科技大学 Method for processing big data based on MLDM algorithm meeting secondary aggregation
CN107305614B (en) * 2017-08-12 2020-05-26 西安电子科技大学 Method for processing big data based on MLDM algorithm meeting secondary aggregation
CN107688751A (en) * 2017-08-17 2018-02-13 复旦大学 Self-adaptive privacy protection method for social media user behavior time mode
CN107688751B (en) * 2017-08-17 2021-02-26 复旦大学 Self-adaptive privacy protection method for social media user behavior time mode

Also Published As

Publication number Publication date
CN106021541A (en) 2016-10-12

Similar Documents

Publication Publication Date Title
CN106021541B (en) Secondary k-anonymity privacy protection algorithm distinguishing quasi-identifier attributes
US10579661B2 (en) System and method for machine learning and classifying data
Xu et al. MIAEC: Missing data imputation based on the evidence chain
CN107358113A (en) Differential privacy protection method based on micro-aggregation anonymity
Liu Mining frequent patterns from univariate uncertain data
Khabsa et al. Online person name disambiguation with constraints
Qu et al. Efficient online summarization of large-scale dynamic networks
CN110378148B (en) Multi-domain data privacy protection method facing cloud platform
Giannella et al. Breaching Euclidean distance-preserving data perturbation using few known inputs
Qian et al. Multi-granularity locality-sensitive bloom filter
Gkountouna et al. Anonymizing collections of tree-structured data
US10311093B2 (en) Entity resolution from documents
Liu et al. Strong social graph based trust-oriented graph pattern matching with multiple constraints
CN109614521B (en) Efficient privacy protection sub-graph query processing method
Shaham et al. Machine learning aided anonymization of spatiotemporal trajectory datasets
Chen et al. Metric all-k-nearest-neighbor search
Sarah et al. A novel (K, X)-isomorphism method for protecting privacy in weighted social network
CN111967045A (en) Big data-based data publishing privacy protection algorithm and system
Wu et al. Efficient evaluation of object-centric exploration queries for visualization
Xie et al. Efficient storage management for social network events based on clustering and hot/cold data classification
CN115186188A (en) Product recommendation method, device and equipment based on behavior analysis and storage medium
Dharavath et al. Entity resolution based EM for integrating heterogeneous distributed probabilistic data
Priya et al. Entity resolution for high velocity streams using semantic measures
CN110175220B (en) Document similarity measurement method and system based on keyword position structure distribution
Pola et al. Similarity sets: A new concept of sets to seamlessly handle similarity in database management systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant