CN107358115B - A utility-aware privacy-removal method for multi-attribute data - Google Patents
A utility-aware privacy-removal method for multi-attribute data
- Publication number
- CN107358115B (application CN201710496086.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- privacy
- attributes
- practicability
- multiattribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a utility-aware privacy-removal method for multi-attribute data, comprising the following steps. Step 1: import preprocessed multi-attribute data. Step 2: define the required attributes and the sensitive attributes according to the attribute descriptions, set pre-grouping rules for the required attributes, and define the order of the required attributes according to their characteristics. Step 3: build a privacy exposure tree, using the attribute order and the pre-grouping rules from step 2 as the basis for the tree's level order and for generating the branches at each level, respectively. Step 4: encode the risk measurement results on the nodes and edges of the privacy exposure tree according to the sensitive attributes. The invention allows a suitable method to be chosen flexibly from several common syntactic anonymization models and the differential privacy model to resolve privacy problems, so that a variety of privacy requirements on different data can be met. The privacy exposure tree exploits the inherent advantages of the tree structure to give a space-compact design for multidimensional sets.
Description
Technical field
The present invention relates to the field of information hiding technology, and in particular to a utility-aware privacy-removal method for multi-attribute data.
Background art
Before displaying data or releasing it for analysis, a data owner needs to consider whether the data involve sensitive information about individuals. If they do, the data must first undergo privacy processing.
Related methods in the prior art fall mainly into three areas. The first area is privacy protection models. Many automated methods have been proposed in the field of privacy protection, among which semantic anonymity models and differential privacy models are the two most common classes. k-anonymity (L. Sweeney. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(05):557–570, 2002.), l-diversity (A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data, 1(1):3, 2007.) and t-closeness (N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of the IEEE 23rd International Conference on Data Engineering, pp. 106–115. IEEE, 2007.) are the three most classic semantic anonymity models. They judge whether the current data set meets the privacy protection standard from three respective angles: the number of data items in an equivalence class, the number of distinct values of the sensitive attribute in an equivalence class, and the difference between the distribution of the sensitive attribute in an equivalence class and its distribution in the whole data set. They provide privacy protection by merging sets so that the equivalence classes are blurred. Differential privacy models, by contrast, protect the sensitive information in the data by adding suitable noise to the relevant attribute values. Unfortunately, all of these automated methods have shortcomings: for example, semantic anonymity models have difficulty handling high-dimensional data, and differential privacy methods can lose the correlations in the data.
The second area is the trade-off between privacy and utility. Some semantic anonymity methods use the extent of the merged sets to reflect the degree of information loss. On this basis, Loukides (J. Xu, W. Wang, J. Pei, X. Wang, B. Shi, and A. W.-C. Fu. Utility-based anonymization using local recoding. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–790, 2006.) and Shao (Data utility and privacy protection trade-off in k-anonymisation. In Proceedings of the International Workshop on Privacy and Anonymity in Information Society, pp. 36–45. ACM, 2008.) resolve the trade-off by partitioning and determining the optimal clustering: the feasible parameter settings and the optimality requirements are listed first, and a suitable combination is then selected using the privacy-utility curve. In addition, Rastogi et al. proposed the αβ anonymity algorithm (V. Rastogi, D. Suciu, and S. Hong. The boundary between privacy and utility in data publishing. In Proceedings of the 33rd International Conference on Very Large Data Bases, pp. 531–542. VLDB Endowment, 2007.), which treats privacy and utility in terms of a bounded adversary. For differential privacy models, an information-theoretic framework of information flow (M. S. Alvim, M. E. Andrés, K. Chatzikokolakis, P. Degano, and C. Palamidessi. Differential privacy: on the trade-off between utility and information leakage. In Proceedings of the International Workshop on Formal Aspects in Security and Trust, pp. 39–54. Springer, 2011.) can quantify information leakage and utility. Ghosh et al. (A. Ghosh, T. Roughgarden, and M. Sundararajan. Universally utility-maximizing privacy mechanisms. SIAM Journal on Computing, 41(6):1673–1693, 2012.) construct a geometric mechanism that satisfies the differential privacy constraint by adding random Laplacian noise while minimizing information loss as far as possible.
However, the utility-preserving methods proposed above are all designed around the models themselves; none of them comprehensively considers the characteristics of the data from the perspective of analysis.
The third area is visualization research on privacy, which divides mainly into theoretical research and related systems. On the theoretical side, Van Wijk proposed a model of how data mapping influences user perception (J. J. Van Wijk. The value of visualization. In Proceedings of the Visualization. IEEE, pp. 79–86, 2005.). Dasgupta and Kosara (A. Dasgupta and R. Kosara. Adaptive privacy-preserving visualization using parallel coordinates. IEEE Transactions on Visualization and Computer Graphics, 17(12):2241–2248, 2011.) hold that only visual clustering and data clustering can provide privacy guarantees by increasing uncertainty, and that data clustering methods achieve privacy protection at the cost of reduced utility. The related systems are mainly the privacy-removal data processing systems for trajectory data and graph data proposed by Chou et al. (J.-K. Chou, C. Bryan, and K.-L. Ma. Privacy preserving visualization for social network data with ontology information. 2017.) (J.-K. Chou, Y. Wang, and K.-L. Ma. Privacy preserving event sequence data visualization using a sankey diagram-like representation. In Proceedings of the SIGGRAPH ASIA Symposium on Visualization, 2016.).
However, the existing privacy protection methods above do not give users sufficient feedback on utility.
Summary of the invention
The present invention provides a utility-aware privacy-removal method for multi-attribute data that helps users measure utility loss in real time and can be configured flexibly to discover the privacy problems involved in the data.
A utility-aware privacy-removal method for multi-attribute data comprises the following steps:
Step 1: import preprocessed multi-attribute data;
Step 2: define the required attributes and the sensitive attributes according to the attribute descriptions, set pre-grouping rules for the required attributes, and define the order of the required attributes according to their characteristics. Required attributes are the attributes of the data that need to be processed and displayed; sensitive attributes are the attributes that involve privacy. Both the pre-grouping rules and the order of the required attributes can be defined manually. To better implement the invention, sensitive attributes are preferably placed in the later layers and attributes with fewer groups in the earlier layers. Doing so allows more information to be observed during processing and makes the processing more flexible. In general, the pre-grouping rules of the required attributes are set with reference to common sense.
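The preferred ordering in step 2 can be sketched as a simple sort. This is a minimal illustration rather than the patented implementation: the function name `order_attributes`, the attribute names, and the group counts are hypothetical, and the sketch assumes that the number of groups produced by each pre-grouping rule is already known.

```python
def order_attributes(group_counts, sensitive):
    """Order attributes for the tree layers (illustrative assumption).

    Non-sensitive attributes come first, fewest groups first;
    sensitive attributes are pushed to the last layers.
    """
    return sorted(group_counts, key=lambda a: (a in sensitive, group_counts[a]))


# Hypothetical attributes with the number of groups each pre-grouping rule yields.
group_counts = {"insurance": 4, "income": 4, "children": 2, "elderly": 3}
layer_order = order_attributes(group_counts, sensitive={"income"})
print(layer_order)
```

Sorting on the pair (is sensitive, number of groups) keeps both preferences in one key, so changing the pre-grouping rules only requires recomputing the counts.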
Step 3: build a privacy exposure tree, using the attribute order and the pre-grouping rules from step 2 as the basis for the tree's level order and for generating the branches at each level, respectively. The tree is built as follows. Each node of the tree represents a set; the set represented by a node in layer i is constrained only by the attribute values corresponding to layers 1, 2, ..., i. To simplify the structure, each layer also merges child nodes into cluster nodes according to the value of the attribute corresponding to that layer. Both kinds of nodes can be displayed in the view at the same time.
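The construction in step 3 can be sketched as a layer-by-layer grouping. The sketch below is an illustration under stated assumptions, not the implementation described by the patent: records are plain dictionaries, `build_layers` and the rule functions are hypothetical names, and a child node is keyed by the group labels of layers 1..i while a cluster node is keyed by the label of layer i alone.

```python
from collections import defaultdict


def build_layers(records, ordered_attrs, group_rules):
    """Return (child_layers, cluster_layers), one dict per tree layer."""
    child_layers, cluster_layers = [], []
    for depth in range(1, len(ordered_attrs) + 1):
        children = defaultdict(list)  # set limited by the attributes of layers 1..depth
        clusters = defaultdict(list)  # set limited by the attribute of this layer only
        for rec in records:
            path = tuple(group_rules[a](rec[a]) for a in ordered_attrs[:depth])
            children[path].append(rec)
            clusters[path[-1]].append(rec)
        child_layers.append(dict(children))
        cluster_layers.append(dict(clusters))
    return child_layers, cluster_layers


# Made-up household records and grouping rules, for illustration only.
records = [
    {"elderly": 0, "children": 0},
    {"elderly": 0, "children": 2},
    {"elderly": 1, "children": 0},
    {"elderly": 2, "children": 0},
]
rules = {
    "elderly": lambda v: "0" if v == 0 else "1+",
    "children": lambda v: "0" if v == 0 else "1+",
}
child_layers, cluster_layers = build_layers(records, ["elderly", "children"], rules)
print(sorted(child_layers[1]))
```

In the first layer the two kinds of node coincide; from the second layer on, one cluster node gathers several child nodes that share the same value of the current attribute.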
Step 4: encode the risk measurement results on the nodes and edges of the privacy exposure tree according to the sensitive attributes.
The risk measurement results comprise the risk increments encoded on edges and the risk values encoded on nodes. The privacy exposure risk of each set is measured following the ideas of the three classic semantic anonymity models (k-anonymity, l-diversity, t-closeness). Specifically, k corresponds to the number of data items in the set; l corresponds to the number of distinct sensitive-attribute values among the data items in the set; and t corresponds to the difference between the distribution of the sensitive attribute in the set and its distribution over all the data. Given sensitive attributes A1, A2, …, An, 1 + 2n risk measurement indices are generated in turn, ordered as k, l(A1), t(A1), l(A2), t(A2), …, l(An), t(An); they are encoded on the nodes as ribbons, using the transparency of three different colors. Because child nodes are numerous and have little space, only the maximum of all the risk measurement values is encoded on each of them, using the transparency of gray. In addition, the risk increment between a parent node and a child node is encoded on the edge, likewise using the transparency of gray.
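The three measures can be sketched for a single set as follows. This is a hedged illustration: the text does not say which distance is used for t, so the total variation distance below is an assumption, and `risk_measures`, `distribution`, and the sample records are hypothetical. The sketch only computes the 1 + 2n raw measures in the stated order; how they are turned into transparency values is left open.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each value."""
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}


def risk_measures(subset, whole, sensitive_attrs):
    """Return k, l(A1), t(A1), ..., l(An), t(An) for one set, in that order."""
    measures = {"k": len(subset)}  # number of data items in the set
    for attr in sensitive_attrs:
        sub_vals = [r[attr] for r in subset]
        p = distribution(sub_vals)
        q = distribution([r[attr] for r in whole])
        # Assumption: total variation distance between the set and the whole data.
        tv = 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in set(p) | set(q))
        measures[f"l({attr})"] = len(set(sub_vals))  # distinct sensitive values
        measures[f"t({attr})"] = tv
    return measures


# Made-up records with a pre-grouped sensitive attribute.
whole = [{"income": "low"}, {"income": "low"}, {"income": "high"}, {"income": "high"}]
subset = whole[:2]
m = risk_measures(subset, whole, ["income"])
print(m)
```

A small k or l, or a large t, indicates a set whose members are easier to single out.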
Preferably, the method further comprises step 5: merge attribute values on the privacy exposure tree, based on semantic anonymity methods. When one node is dragged onto another to merge them, the (exact) values of the corresponding attribute for all data in the two sets represented by the two nodes are replaced by one and the same generalized range, which covers all the original attribute values in the two sets. The merged data thus have their exact attribute values anonymized, which protects privacy.
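The merge in step 5 can be sketched as replacing exact values by a shared range. This is a minimal sketch assuming exact numeric values that have not been generalized before; the function name `merge_attribute_values` and the records are hypothetical.

```python
def merge_attribute_values(set_a, set_b, attr):
    """Replace the exact values of `attr` in both sets by one covering range."""
    combined = set_a + set_b
    values = [rec[attr] for rec in combined]
    merged_range = (min(values), max(values))  # covers every original value
    # Build new records so that the original data stay untouched.
    return [{**rec, attr: merged_range} for rec in combined]


# Made-up sets represented by two nodes of the tree.
one_elderly = [{"elderly": 1, "income": 41000}]
more_elderly = [{"elderly": 2, "income": 18000}, {"elderly": 3, "income": 25000}]
merged = merge_attribute_values(one_elderly, more_elderly, "elderly")
print(merged[0]["elderly"])
```

After the merge, every record of the two sets carries the same range, so the records can no longer be told apart by this attribute.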
Preferably, the method further comprises step 6: add noise of different magnitudes to a specific attribute of a specific set on the privacy exposure tree, based on differential privacy. Noise based on differential privacy is added to the corresponding attribute values of the data in the selected set. The noise increases the uncertainty of the data, making it difficult for anyone observing the data to determine the actual attribute values.
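Step 6 can be illustrated with the Laplace mechanism. The text does not name the noise distribution used in this step, so the choice of Laplace noise here is an assumption, as are the parameter names `sensitivity` and `epsilon`, the function names, and the sample values. The sketch is meant to show how the noise magnitude can be tuned per set, not to serve as a vetted privacy implementation.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def add_noise(records, attr, sensitivity, epsilon, seed=None):
    """Return copies of the records with Laplace noise added to `attr`."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # smaller epsilon gives larger noise
    return [{**rec, attr: rec[attr] + laplace_noise(scale, rng)} for rec in records]


# Made-up set selected on the tree; only its "income" values are perturbed.
selected_set = [{"income": 41000.0, "elderly": 1}, {"income": 18000.0, "elderly": 2}]
noisy = add_noise(selected_set, "income", sensitivity=1000.0, epsilon=0.5, seed=7)
print([round(rec["income"]) for rec in noisy])
```

Choosing a different `epsilon` for each selected set is one way to realize "noise of different magnitudes" for different parts of the tree.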
Preferably, the method further comprises the following step:
Step 7: lay out a two-dimensional matrix according to the required attributes. In the matrix, each cell in the upper-right part shows the corresponding joint distribution of the original data, each diagonal cell shows a statistical chart of the distribution of the corresponding attribute, and each cell in the lower-left part shows the corresponding joint distribution of the processed data. The statistical chart can be a histogram, a line chart, or the like. The two-dimensional matrix is linked with the privacy exposure tree: while processing data on the basis of the privacy exposure tree, the user can follow the current change in utility comprehensively and in real time through the distribution charts and the changes in the metric values shown in the matrix.
Preferably, in step 7, the types of joint-distribution chart for the original data include: the categorical-categorical original chart, in which the radius of a point encodes the number of data items of the same kind; the categorical-numerical original chart, in which each point is deformed into a rectangle whose length along the categorical axis encodes the number of data items of the same kind; and the numerical-numerical original chart, which is a scatter plot.
Preferably, in step 7, the types of joint-distribution chart for the processed data include: for results processed with semantic anonymity methods, a matrix chart; for results processed with differential privacy methods, a scatter plot; and for combined results, a hybrid chart of matrix chart and scatter plot.
Preferably, the method further comprises the following step:
Step 8: compute a utility index for each required attribute in real time, and display and update it in the two-dimensional matrix.
The specific method is as follows. For each data item x, let the attribute value before and after processing be $f_D(x)$ and $f_{D'}(x)$, respectively. If the attribute is numerical, sort the values of this attribute in ascending order in the original data set and in the processed data set separately. Suppose there are m data items, and $f_D(x)$ ranks i-th in the original data set while $f_{D'}(x)$ ranks j-th in the processed data; then compute:
$$u(f_D(x), f_{D'}(x)) = 1 - \frac{|i - j|}{m - 1}$$
For categorical data without hierarchy information, $u(f_D(x), f_{D'}(x))$ is 1 if $f_D(x) = f_{D'}(x)$, and 0 otherwise.
For categorical data with hierarchy information, compute:
$$u(f_D(x), f_{D'}(x)) = \frac{\mathrm{level}(f_D(x), f_{D'}(x))}{H}$$
where $\mathrm{level}(f_D(x), f_{D'}(x))$ is the level of the common ancestor of $f_D(x)$ and $f_{D'}(x)$, and H is the number of levels of the whole tree.
Finally, the utility index of each attribute is computed from $u(f_D(x), f_{D'}(x))$, with the sum taken over the data items:
$$U(f_D(x), f_{D'}(x)) = \frac{\sum u(f_D(x), f_{D'}(x))}{n}$$
The utility matrix is updated after every data-processing operation, and shows the current utility index of each attribute together with the change in the index caused by the most recent operation.
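The rank-based and flat categorical cases of the utility index can be sketched as below. The function names and sample values are hypothetical, ties in rank are broken by input order (an assumption the text does not address), the average is taken over all data items, and the hierarchical case is omitted because the level convention is not spelled out.

```python
def ranks(values):
    """Position of each item after ascending sort (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    rank = [0] * len(values)
    for position, idx in enumerate(order):
        rank[idx] = position
    return rank


def numeric_utility(original, processed):
    """Average of 1 - |i - j| / (m - 1) over all data items."""
    m = len(original)
    if m < 2:
        return 1.0  # a single item cannot change rank
    pairs = zip(ranks(original), ranks(processed))
    return sum(1 - abs(i - j) / (m - 1) for i, j in pairs) / m


def categorical_utility(original, processed):
    """Share of items whose category is unchanged (no hierarchy information)."""
    return sum(1.0 for a, b in zip(original, processed) if a == b) / len(original)


# Made-up values of one attribute before and after processing.
before = [10, 20, 30, 40]
after = [12, 19, 45, 33]
print(numeric_utility(before, after))
print(categorical_utility(["a", "b", "c", "a"], ["a", "b", "b", "a"]))
```

In this example only the last two items swap ranks, so each of them contributes 1 - 1/3 and the index stays close to 1.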
To provide a more intuitive way of comparing distributions, the method preferably further comprises the following step:
Step 9: unify the joint distributions of the original data and of the processed data from step 7 to the same granularity. If the granularity of the data differs from the original granularity, the data are restored to their original granularity under a uniform distribution, and the data items are then counted at the newly defined granularity.
The two charts at the unified granularity are then subtracted: in each cell, the number of data items of the original data is subtracted from the number of data items of the processed data to obtain a difference, and the value, from positive through zero to negative, is mapped onto a color gradient between a pair of contrasting colors centered on white. The depth and distribution of the colors clearly convey to the user exactly how the two-dimensional joint distribution of the data has changed.
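The granularity unification and difference map of step 9 can be sketched in one dimension for brevity; a two-dimensional grid would apply the same overlap rule along each axis. Everything named here is an assumption for illustration: the functions, the bin edges, and the choice of red and blue as the pair of contrasting colors.

```python
def spread_uniformly(counts, edges, fine_edges):
    """Redistribute coarse bin counts over finer bins, assuming uniformity."""
    fine = [0.0] * (len(fine_edges) - 1)
    for count, lo, hi in zip(counts, edges, edges[1:]):
        for n, (f_lo, f_hi) in enumerate(zip(fine_edges, fine_edges[1:])):
            overlap = max(0.0, min(hi, f_hi) - max(lo, f_lo))
            fine[n] += count * overlap / (hi - lo)
    return fine


def diverging_color(diff, max_abs):
    """Map a difference to RGB: white at zero, red if positive, blue if negative."""
    strength = min(abs(diff) / max_abs, 1.0) if max_abs else 0.0
    fade = round(255 * (1 - strength))
    if diff > 0:
        return (255, fade, fade)
    if diff < 0:
        return (fade, fade, 255)
    return (255, 255, 255)


fine_edges = [0, 10, 20, 30, 40]
original_fine = [4, 0, 2, 2]  # original data counted at fine granularity
processed_fine = spread_uniformly([4, 4], [0, 20, 40], fine_edges)  # coarser data
diff = [p - o for o, p in zip(original_fine, processed_fine)]  # processed minus original
colors = [diverging_color(d, max(abs(v) for v in diff)) for d in diff]
print(diff)
```

Cells that stay white are unchanged by the processing, while strongly colored cells mark where items were moved in or out.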
Beneficial effects of the present invention:
With the utility-aware privacy-removal method for multi-attribute data of the present invention, a suitable method can be chosen flexibly from several common syntactic anonymization models and the differential privacy model to resolve privacy problems, so that a variety of privacy requirements on different data can be met. The privacy exposure tree exploits the inherent advantages of the tree structure to give a space-compact design for multidimensional sets. At the same time, the design of collapsible branches and aggregation helps users browse multidimensional data efficiently, and the increment encoding on the edges of the tree helps locate the source of a privacy problem quickly. The utility matrix provides visual feedback that earlier automatic algorithms could not provide, which makes preserving utility meaningful.
Brief description of the drawings
Fig. 1 is a schematic diagram of the process by which the method of the present invention builds the privacy exposure tree.
Fig. 2 is a schematic diagram of the privacy exposure tree built by the method of the present invention.
Fig. 3 is a schematic diagram of the utility matrix built by the method of the present invention.
Fig. 4 shows the result of comparing the processed data with the original data after the method of the present invention processes the data with a semantic anonymity method.
Fig. 5 shows the result of comparing the processed data with the original data after the method of the present invention processes the data with a differential privacy method.
Fig. 6 is a schematic diagram of the method of the present invention with the branches that have the main problems collapsed, so that the privacy exposure of the other parts can be observed more intuitively.
Fig. 7 shows the tree after two branches are merged in the method of the present invention; comparison with Fig. 2 shows that most of the privacy exposure risks have been resolved.
Detailed description of the embodiments
In this embodiment, the data set used is part of the Public Use Microdata Sample (PUMS) collected in the 2015 census of the U.S. state of Wyoming. Much of the information in this data set is organized by household.
The utility-aware privacy-removal method for multi-attribute data of this embodiment comprises the following steps:
Step 1: import preprocessed multi-attribute data. The following four attributes are extracted from the data set: "insurance expenditure" (annual), "household income" (over the past year), "children" (number of household members under 18), and "elderly" (number of household members over 65). Household income is treated as the sensitive attribute to be protected.
Step 2: define the required attributes and the sensitive attributes according to the attribute descriptions, set pre-grouping rules for the required attributes, and define the order of the required attributes according to their characteristics. The pre-grouping rules are set with reference to common sense: for categorical data, the attribute values themselves serve directly as the grouping basis; for numerical attributes, the specific grouping rules are defined according to the meaning of the attribute and the quantiles of the data. In the ordering, sensitive attributes are placed in the later layers and attributes with fewer groups in the earlier layers.
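The pre-grouping rules of this step can be sketched with quantile cut points. This is an assumption-laden illustration: `make_group_rule` is a hypothetical helper, quartiles are an arbitrary choice, the values are made up, and the sketch covers only the quantile part of the rule, not adjustments based on the meaning of the attribute.

```python
import bisect
import statistics


def make_group_rule(values, is_numeric, n_groups=4):
    """Return a function that maps an attribute value to its group label."""
    if not is_numeric:
        # Categorical data: the attribute value itself is the group.
        return lambda v: v
    # Numerical data: cut points at the quantiles of the observed values.
    cuts = statistics.quantiles(values, n=n_groups)
    return lambda v: bisect.bisect_right(cuts, v)


insurance = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up expenditure values
rule = make_group_rule(insurance, is_numeric=True)
groups = [rule(v) for v in insurance]
print(groups)
```

Quantile cut points give groups of roughly equal size, which keeps the branches of a layer comparable.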
Step 3: build a privacy exposure tree, using the attribute order and the pre-grouping rules from step 2 as the basis for the tree's level order and for generating the branches at each level, respectively. As shown in Figs. 1 and 2, the privacy exposure tree is built as follows: an edge represents the containment relation between a parent node and a child node, and its shade encodes the risk increment after classification; a child node is the subset formed by combining all the attributes of the current layer and the layers above it; a cluster node is the subset classified by the current attribute alone. Therefore, from the second layer on, a cluster node usually contains multiple child nodes.
Observing the privacy exposure tree shown in Fig. 2 quickly reveals that households with elderly members are more prone to privacy exposure.
Step 4: encode the risk measurement results on the nodes and edges of the privacy exposure tree according to the sensitive attributes. As shown in Fig. 2, the privacy exposure risk of each set is measured following the ideas of the three classic semantic anonymity models (k-anonymity, l-diversity, t-closeness). The three measures are:
k corresponds to the number of data items in the set;
l corresponds to the number of distinct sensitive-attribute values among the data items in the set;
t corresponds to the difference between the distribution of the sensitive attribute in the set and its distribution over all the data.
With the single sensitive attribute household income, 3 risk measurement indices are generated, ordered as k, l(household income), t(household income); they are encoded on the nodes as ribbons, using the transparency of three different colors. Because child nodes are numerous and have little space, only the maximum of all the risk measurement values is encoded on each of them, using the transparency of gray. In addition, the risk increment between a parent node and a child node is encoded on the edge, likewise using the transparency of gray.
Step 5: merge attribute values on the privacy exposure tree, based on semantic anonymity methods. When the two most problematic branches, the branch with one elderly member and the branch with two or more elderly members, are collapsed, as shown in Fig. 6, the color of the whole tree fades; that is, most of the problems are related to these two attribute values. To solve this, the branches with privacy problems are merged; Fig. 7 shows the privacy exposure tree after merging. It then turns out that two nodes in the third layer still retain some risk, and in the first attempt the merge operation is applied again. However, when the original correlations are inspected through the utility matrix, the earlier chart information turns out to be largely affected, as shown in Fig. 7.
Step 6: add noise of different magnitudes to a specific attribute of a specific set on the privacy exposure tree, based on differential privacy.
Step 7: lay out a two-dimensional matrix according to the required attributes. In the matrix, each cell in the upper-right part shows the corresponding joint distribution of the original data, each diagonal cell shows a statistical chart of the distribution of the corresponding attribute, and each cell in the lower-left part shows the corresponding joint distribution of the processed data. As shown in Fig. 3, the two-dimensional matrix is the utility matrix. The diagonal holds the one-dimensional distribution chart of each attribute. Above the diagonal are the original charts: children-elderly is a categorical-categorical original chart; insurance expenditure-household income is a numerical-numerical original chart; the rest are all numerical-categorical original charts. Below the diagonal are the result charts of processing with semantic anonymity methods; the chart for the two attributes elderly and children still shows the original chart because these attributes have not been processed. The pre-clustering on each dimension in this embodiment is relatively coarse, but it is still quick to see that household income and insurance expenditure are positively correlated. Compared with households without children (most of which spend 500-900 per year), households with children (most of which spend 900-1300 per year) tend to be willing to buy more insurance.
Step 8: compute a utility index for each required attribute in real time, and display and update it in the two-dimensional matrix, as shown by the data at the top of Fig. 3.
Step 9: unify the joint distributions of the original data and of the processed data from step 7 to the same granularity. Because the granularity of the data differs from the original granularity (the processed data are coarser), the data are restored to their original granularity under a uniform distribution, and the data items are then counted at the newly defined granularity.
The two charts at the unified granularity are then subtracted: in each cell, the number of data items of the original data is subtracted from the number of data items of the processed data to obtain a difference, and the value, from positive through zero to negative, is mapped onto a color gradient between a pair of contrasting colors centered on white.
Fig. 4 shows the result observed with the above scheme: the part outlined in black is the distribution feature we hope to retain, yet the figure shows that this part is very deeply colored, meaning that it has changed a great deal. We therefore step back and use the differential privacy technique instead. In Fig. 5, the outlined part is lighter in color, so the utility loss is kept at a more acceptable level.
Claims (8)
1. a kind of multiattribute data for considering practicability goes privacy methods, which comprises the following steps:
Step 1: importing pretreated multiattribute data;
Step 2: according to attribute description defining required attributes and Sensitive Attributes, the pre- rule of classification of indispensable attributes is set, according to category
The sequence of property characterizing definition indispensable attributes;
Step 3: building privacy exposure tree respectively exposes the attribute sequence in step 2 with pre- rule of classification as privacy
The foundation that the hierarchic sequence of risk tree and every level branch generate;
Step 4: risk measurement result information is encoded on the node of privacy exposure tree and side according to Sensitive Attributes.
2. considering that the multiattribute data of practicability goes privacy methods as described in claim 1, which is characterized in that further include step
5: carrying out the attribute value based on semantic anonymous methods on privacy exposure tree and merge.
3. considering that the multiattribute data of practicability goes privacy methods as described in claim 1, which is characterized in that further include step
6: carrying out the particular community to specific collection based on difference privacy on privacy exposure tree and add different size of noise.
4. considering that the multiattribute data of practicability goes privacy methods as described in claim 1, which is characterized in that further include following
Step:
Step 7: two-dimensional matrix is unfolded according to indispensable attributes, each grid in the upper right corner shows the corresponding of initial data in two-dimensional matrix
Joint Distribution, diagonal line display indicate the statistical chart of respective attributes distribution, the phase of data after each grid display processing in the lower left corner
Answer Joint Distribution.
5. considering that the multiattribute data of practicability goes privacy methods as claimed in claim 4, which is characterized in that former in step 7
The type of the Joint Distribution of beginning data includes: classification type-classification type original graph: passing through the radius code homogeneous data item number of point
Amount;Classification type-numeric type original graph: being rectangle by point deformation, by rectangle along the length coding homogeneous data of classification axis direction
Item quantity;Numeric type-numeric type original graph: scatter plot.
6. considering that the multiattribute data of practicability goes privacy methods as claimed in claim 4, which is characterized in that in step 7, place
The type of the Joint Distribution of data includes: based on semantic anonymous methods processing result after reason: matrix diagram;Based on difference privacy methods
Processing result: scatter plot;Integrated treatment result: the mixing chart of matrix diagram and scatter plot.
7. considering that the multiattribute data of practicability goes privacy methods as claimed in claim 4, which is characterized in that further include following
Step:
Step 8: practicability index being calculated to each indispensable attributes in real time, and is shown and is updated in two-dimensional matrix.
8. The multi-attribute data privacy-removal method considering practicability according to claim 7, characterized in that it further comprises the following
step:
Step 9: the joint distributions of the original data and of the processed data from step 7 are each unified to the same granularity; if the
original granularities of the data are inconsistent, the data are restored to a uniform distribution within their original granularity, and the data item counts
are then tallied at the newly defined granularity;
The two charts at the same granularity are then differenced: in each cell, the data item count of the original data is subtracted from the data item count of
the processed data to obtain a difference; the value, ranging from positive to negative, is mapped onto a color gradient between two contrasting colors,
with white as the center of the gradient, corresponding to zero.
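Step 9 of claim 8 can be illustrated with a minimal Python sketch. It is not the patented implementation: it works on one-dimensional bins rather than two-dimensional joint distributions, and the function names (`rebin_uniform`, `difference_cells`, `diverging_color`) and the blue and red endpoint colors are assumptions chosen for illustration only. The sketch re-counts coarsely binned data at a finer common granularity by assuming a uniform spread inside each original bin, subtracts the original counts from the processed counts, and maps each signed difference onto a gradient that passes through white at zero.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def rebin_uniform(counts: Sequence[float],
                  src_edges: Sequence[float],
                  dst_edges: Sequence[float]) -> List[float]:
    """Re-count binned data on new bin edges, assuming the items of each
    source bin are spread uniformly over that bin (hypothetical helper)."""
    out = [0.0] * (len(dst_edges) - 1)
    for i, count in enumerate(counts):
        lo, hi = src_edges[i], src_edges[i + 1]
        width = hi - lo
        if width <= 0:
            continue  # skip degenerate source bins
        for j in range(len(out)):
            # Length of the intersection between source bin i and target bin j.
            overlap = min(hi, dst_edges[j + 1]) - max(lo, dst_edges[j])
            if overlap > 0:
                out[j] += count * overlap / width
    return out


def difference_cells(original: Sequence[float],
                     processed: Sequence[float]) -> List[float]:
    """Per-cell difference: processed count minus original count."""
    return [p - o for o, p in zip(original, processed)]


def diverging_color(diff: float, max_abs: float,
                    neg: RGB = (33, 102, 172),
                    pos: RGB = (178, 24, 43)) -> RGB:
    """Map a signed difference onto a gradient between two contrasting
    colors, with white at zero. The endpoint colors are assumptions."""
    if max_abs <= 0:
        return (255, 255, 255)
    t = max(-1.0, min(1.0, diff / max_abs))  # clamp to [-1, 1]
    end = pos if t > 0 else neg
    weight = abs(t)
    # Linear interpolation from white (255) toward the chosen end color.
    return tuple(round(255 + (e - 255) * weight) for e in end)


# Illustrative data: the original is binned in steps of 10, while the
# processed data was generalized to steps of 20.
common_edges = [0, 10, 20, 30, 40]
original_counts = [4, 8, 6, 2]
processed_counts = rebin_uniform([12, 8], [0, 20, 40], common_edges)
diffs = difference_cells(original_counts, processed_counts)
scale = max(abs(d) for d in diffs)
colors = [diverging_color(d, scale) for d in diffs]
```

In this example, cells where processing added items take the positive end color, cells where items were lost take the negative end color, and an unchanged cell would be white. The same two steps would apply cell by cell to the two-dimensional joint distributions shown in the matrix.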
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710496086.5A CN107358115B (en) | 2017-06-26 | 2017-06-26 | It is a kind of consider practicability multiattribute data go privacy methods |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107358115A CN107358115A (en) | 2017-11-17 |
CN107358115B true CN107358115B (en) | 2019-09-20 |
Family
ID=60273122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710496086.5A Active CN107358115B (en) | 2017-06-26 | 2017-06-26 | It is a kind of consider practicability multiattribute data go privacy methods |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107358115B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11514189B2 (en) * | 2018-03-01 | 2022-11-29 | Etron Technology, Inc. | Data collection and analysis method and related device thereof |
CN109492430A (en) * | 2018-10-30 | 2019-03-19 | 江苏东智数据技术股份有限公司 | A kind of internet Keywork method for secret protection and device based on obfuscated manner |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104156668A (en) * | 2014-08-04 | 2014-11-19 | 江苏大学 | Privacy protection reissuing method for multiple sensitive attribute data |
CN105608389A (en) * | 2015-10-22 | 2016-05-25 | 广西师范大学 | Differential privacy protection method of medical data dissemination |
CN106096403A (en) * | 2016-06-23 | 2016-11-09 | 国家计算机网络与信息安全管理中心 | A kind of analysis method and device of software privacy leakage behavior |
CN106156040A (en) * | 2015-03-26 | 2016-11-23 | 阿里巴巴集团控股有限公司 | multi-dimensional data management method and device |
Non-Patent Citations (3)
Title |
---|
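The Research on Algorithm of multi-sensitive based on personal anonymity; Peng GuoXing et al.; 2013 Fourth International Conference on Digital Manufacturing & Automation; 20131231; full text * |
A multi-sensitive-attribute protection method based on a largest-leaf-subtree-first strategy; Qi Ruili et al.; Journal of Yanshan University; 20091231; full text * |
A privacy protection method for correlated multiple sensitive attributes; Li Li et al.; Journal of Shandong University (Natural Science); 20110531; full text * |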
Also Published As
Publication number | Publication date |
---|---|
CN107358115A (en) | 2017-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | A utility-aware visual approach for anonymizing multi-attribute tabular data | |
Dhurandhar et al. | Explanations based on the missing: Towards contrastive explanations with pertinent negatives | |
CN106294853B (en) | For handling the method and data processing system of associated data set | |
JP2020537251A (en) | Using an object model of heterogeneous data to help build data visualizations | |
JP6846356B2 (en) | Systems and methods for automatically inferring the cube schema used in a multidimensional database environment from tabular data | |
Xie et al. | A visual analytics approach for exploratory causal analysis: Exploration, validation, and applications | |
CN104573560A (en) | Differential private data publishing method based on wavelet transformation | |
US20220067202A1 (en) | Method for creating avatars for protecting sensitive data | |
Carrizosa et al. | The tree based linear regression model for hierarchical categorical variables | |
CN107358115B (en) | It is a kind of consider practicability multiattribute data go privacy methods | |
Yesin et al. | Ensuring data integrity in databases with the universal basis of relations | |
Khan et al. | Graph-based management and mining of blockchain data | |
Xie et al. | Auditing the sensitivity of graph-based ranking with visual analytics | |
Zhang et al. | Differential privacy medical data publishing method based on attribute correlation | |
Knorr et al. | Analyzing separation of duties in petri net workflows | |
Erkens et al. | Strategy and control: Findings from a set-theoretical analysis of high-performance manufacturing firms | |
Liu et al. | Differential privacy performance evaluation under the condition of non-uniform noise distribution | |
Bensalloua et al. | Spatial OLAP and multicriteria integrated approach for decision support system: Application in agroforestry management | |
Rodriguez et al. | MobilityMirror: Bias-adjusted transportation datasets | |
Hait et al. | Improved Bonferroni mean operator to apprehend graph based data interconnections with application to the Hacker Attack system | |
Tyrychtr et al. | Multidimensional modelling from open data for precision agriculture | |
Indumathi et al. | A novel framework for optimised privacy preserving data mining using the innovative desultory technique | |
Arenas et al. | LC3: A spatiotemporal data model to study qualified land cover changes | |
Salem et al. | A Cloud-Based Data Integration Framework for E-Government Service. | |
Vatresia et al. | Automated Data Integration of Biodiversity with OLAP and OLTP |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||