CN111241587B - Data desensitization method and device - Google Patents

Info

Publication number
CN111241587B
Authority
CN
China
Prior art keywords
data
dimension
value
calculating
column
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010071239.3A
Other languages
Chinese (zh)
Other versions
CN111241587A (en)
Inventor
张美跃
周业
陈佳伟
周定云
俞宏青
俞基锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hengruitong Fujian Information Technology Co ltd
Original Assignee
Hengruitong Fujian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hengruitong Fujian Information Technology Co ltd
Priority to CN202010071239.3A
Publication of CN111241587A
Application granted
Publication of CN111241587B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

In the data desensitization method and device, original data is acquired and kernelized to obtain new data; the new data is reduced in dimension to obtain dimension-reduced data, which removes redundant information, simplifies computation, and reduces unnecessary overhead; and the dimension-reduced data is centered to obtain desensitized data, so that private data is protected while the usability of the data is preserved.

Description

Data desensitization method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a data desensitizing method and device.
Background
In recent years, with the development of information technology, the volume of personal data has grown exponentially, and large amounts of personal information are stored and released by government departments, commercial organizations, and the like. Data release is a means of information sharing; while it facilitates data exchange, it also increases the risk that private personal data is disclosed. "Private data" is sensitive information that the data owner does not want others to know, such as a home address, identity card number, phone number, disease information, or location information. For example, to study the consumption of each type of drug and the illnesses of patients, the relevant departments may need to provide drug purchase table data, which contains a great deal of private data. If the drug purchase table data is released directly, the private information of patients may be revealed. The simplest way to process the table data so that patients' disease information is not revealed is to remove the patient name attribute, but an attacker can still infer personal identity from the sensitive attributes by means of background knowledge, linkage attacks, and the like. Conversely, if all sensitive attributes were removed, research on the data would become meaningless.
At present, existing research on privacy disclosure in data release mainly covers methods such as restricted release, data perturbation, and k-anonymity. Although these methods protect privacy to a certain extent, they have security and usability defects. Restricted release mainly cuts off the associations between data items, but it reduces the usability of the data, and the amount of data released is hard to control. Data perturbation changes the data by adding suitable noise, which helps preserve data characteristics, but the perturbed data clusters poorly and the computational cost is high. k-anonymity requires that the published data contain k indistinguishable records, so that an attacker cannot determine which specific individual a piece of private information belongs to; it protects personal privacy to a certain extent but reduces the clustering usability of the data.
Therefore, existing privacy protection mechanisms for data release have two main problems: on the one hand, the computation is complex and the overhead is high; on the other hand, it is difficult to balance data usability against privacy.
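As an illustration of the k-anonymity property described in the background above, the following minimal sketch checks whether every combination of quasi-identifier values occurs at least k times. The function name `is_k_anonymous`, the attribute names, and the sample records are hypothetical and do not come from the patent.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Hypothetical drug purchase records with generalized age and postcode.
records = [
    {"age": "30-39", "zip": "3500**", "drug": "A"},
    {"age": "30-39", "zip": "3500**", "drug": "B"},
    {"age": "40-49", "zip": "3501**", "drug": "A"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], 2)
```

In this sample the third record is alone in its group, so the table is not 2-anonymous until that record is generalized further or suppressed.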
Disclosure of Invention
(I) Technical problem to be solved
To solve the above problems in the prior art, the invention provides a data desensitization method and device that reduce computational overhead and protect private data while preserving the usability of the data.
(II) Technical scheme
To achieve the above purpose, the main technical scheme adopted by the invention comprises:
a data desensitization method, comprising the following steps:
S1, acquiring original data and performing kernelization to obtain new data;
S2, performing dimension reduction on the new data to obtain dimension-reduced data;
S3, centering the dimension-reduced data to obtain desensitized data.
To achieve the above purpose, another main technical scheme adopted by the invention comprises:
a data desensitization device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
S1, acquiring original data and performing kernelization to obtain new data;
S2, performing dimension reduction on the new data to obtain dimension-reduced data;
S3, centering the dimension-reduced data to obtain desensitized data.
(III) Beneficial effects
The beneficial effects of the invention are as follows: original data is acquired and kernelized to obtain new data; the new data is reduced in dimension to obtain dimension-reduced data, which removes redundant information, simplifies computation, and reduces unnecessary overhead; and the dimension-reduced data is centered to obtain desensitized data, so that private data is protected while the usability of the data is preserved.
Drawings
FIG. 1 is a flow chart of a method of data desensitization according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a data desensitization device according to an embodiment of the present invention.
[Description of reference numerals]
1: data desensitization device;
2: memory;
3: processor.
Detailed Description
The invention will be better explained by the following detailed description of the embodiments with reference to the drawings.
Example 1
Referring to FIG. 1, a data desensitization method includes the following steps:
S1, acquiring original data and performing kernelization to obtain new data.
Step S1 specifically comprises:
acquiring the original data, and kernelizing the nonlinear data in the original data to convert it into linear data, so as to obtain the new data.
S2, performing dimension reduction on the new data to obtain dimension-reduced data.
Step S2 specifically comprises:
performing dimension reduction on the new data through principal component analysis to obtain the dimension-reduced data.
S3, centering the dimension-reduced data to obtain desensitized data.
The centering includes:
constructing the equivalence sets corresponding to the dimension-reduced data according to the distance minimization principle.
The centering further includes:
calculating the mean of each column of the dimension-reduced data, and replacing the specific values of each column with that column's mean.
Example 2
This example differs from Example 1 in that it further explains, with reference to a specific application scenario, how the data desensitization method of the invention is implemented:
the invention mainly comprises two stages: data dimension reduction and centering.
1. Data dimension reduction stage
Acquire the original table data $S_{n\times h}$ to be published, where $n$ is the number of records and $h$ is the dimension of the table data. First, the numerical nonlinear data in $S_{n\times h}$ is converted into numerical linear data through kernelization, giving new table data $S'_{n\times h}$. Then principal component analysis is applied to $S'_{n\times h}$ to reduce its dimension, giving the dimension-reduced table data $S''$.
Each record includes $m$ public attributes and $t$ sensitive attributes, where $m + t = w$. Let $U = (u_1, u_2, \dots, u_m)$ be the public attributes in the table data, where $u_i$ ($i = 1, 2, \dots, m$) is the $i$-th public attribute; let $V = (v_1, v_2, \dots, v_t)$ be the sensitive attributes in the table data, where $v_j$ ($j = 1, 2, \dots, t$) is the $j$-th sensitive attribute. The numerical nonlinear table data $T_{n\times l}$ is extracted from the original table data $S_{n\times h}$; it has $n$ sample records and dimension $l$.
The specific implementation steps are as follows:
First, the numerical nonlinear data in the table data is kernelized, converting the table data $S_{n\times h}$ into the table data $S'_{n\times h}$.
S111: Extract the nonlinear data in the table data and represent it by a matrix $A = (A_1, A_2, \dots, A_n)^T$, where $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ is the $f$-th row of $A$;
S112: Project each row $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ of $A$ in turn onto the hyperplane $Z_f = (z_{f1}, z_{f2}, \dots, z_{fd})$ to obtain the projected data, where $f = 1, 2, \dots, n$;
S113: Acquire the $f$-th row of the projected data: each $z_{fj}$ satisfies a projection condition (the formula is not reproduced in this text), where $w_{fi}$ is the image of the data $a_{fi}$ in the hyperplane, i.e. $w_{fi} = \varphi(a_{fi})$;
S114: Calculate $z_{fj}$ (the formula is not reproduced in this text) from the $j$-th component of $a_{fi}$ and from $\lambda_f$, the eigenvalue of $A_f$;
S115: Introduce a kernel function: $k_f(a_{fi}, a_{fj}) = \varphi(a_{fi})^T \varphi(a_{fj})$;
S116: Calculate $K_f a_j = \lambda_j a_j$, where $K_f$ is the kernelized vector of the $f$-th row of data;
S117: Finally obtain the kernel matrix: $K = (K_1, K_2, \dots, K_n)^T$.
By kernelizing the numerical nonlinear data in the original table data, the table data is converted into $S'_{n\times h} = (S_{n\times(h-l)}, K_{n\times l})$.
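The kernel function introduced in step S115 evaluates an inner product in the feature space without forming the images $\varphi(\cdot)$ explicitly. The sketch below checks that identity for one concrete case. The source does not name a kernel, so the degree-2 polynomial kernel, the explicit feature map `phi`, and all function names here are illustrative assumptions, and the Gram matrix shown is the generic pairwise construction rather than the patent's row-wise $K_{n\times l}$.

```python
import math


def dot(x, y):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def phi(x):
    """Assumed explicit feature map for 2-D inputs under the kernel (x . y)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """Assumed degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def gram_matrix(rows, kernel=poly_kernel):
    """Pairwise kernel values between all rows."""
    return [[kernel(x, y) for y in rows] for x in rows]


x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly_kernel(x, y)       # computed in the input space
via_features = dot(phi(x), phi(y))   # computed in the feature space
gram = gram_matrix([x, y])
```

Both routes give the same value up to rounding, which is why the feature map never has to be materialized when only inner products are needed.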
Next, the numerical linear data in $S'_{n\times h}$ is reduced in dimension by principal component analysis to obtain the dimension-reduced table data $S''$. The specific implementation steps are as follows:
S121: Calculate the mean of each column of data: $E_j = \frac{1}{n}\sum_{i=1}^{n} s_{ij}$, where $j = 1, \dots, h$;
S122: Center each item of the linear data, i.e. subtract the mean of the corresponding column from it: $s_{ij} \leftarrow s_{ij} - E_j$, where $i = 1, \dots, n$;
S123: Calculate the covariance matrix $F$ (the formula is not reproduced in this text);
S124: Perform eigenvalue decomposition on $F$ to obtain the eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mu_i$: let $|\lambda E - F| = 0$, where $E$ is the identity matrix, and solve for $\lambda$ to obtain the eigenvalues; substitute each value of $\lambda$ into $(\lambda E - F)\mu = 0$, and the linearly independent solutions are the eigenvectors;
S125: Sort the eigenvalues $\lambda_i$: $\lambda_1 > \lambda_2 > \dots > \lambda_h$, with corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_h$;
S126: Select the number of principal components: given a usability threshold $\alpha$ and a parameter $b$ for the number of retained principal components, test whether $1 - p \le \alpha$; if the inequality holds, output $b$, otherwise let $b = b + 1$. Here $p = \sum_{i=1}^{b}\lambda_i \big/ \sum_{i=1}^{h}\lambda_i$, where $\lambda_i$ are the eigenvalues;
S127: Output the set of eigenvectors corresponding to the first $b$ eigenvalues: $V_b = \{\mu_1, \mu_2, \dots, \mu_b\}$;
S128: Unitize the eigenvectors in $V_b$ to obtain the feature matrix $A$: first calculate the modulus of each eigenvector in $V_b$: $|\mu_i| = \sqrt{\sum_{j=1}^{h} \mu_{ij}^2}$, then unitize to obtain the matrix of unit vectors: $A = \left(\frac{\mu_1}{|\mu_1|}, \frac{\mu_2}{|\mu_2|}, \dots, \frac{\mu_b}{|\mu_b|}\right)$;
S129: Calculate the projection: $S''_{n\times b} = S'_{n\times h} A$.
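Several of the steps above can be sketched in a few lines. The following is a minimal illustration of S121 to S123, S126, and S128 only; the eigenvalue decomposition of S124 is left to a numerical library and is not shown. The function names are hypothetical, the $1/n$ normalization of the covariance is an assumption (the source formula is not available), and $p$ is assumed to be the cumulative share of the largest $b$ eigenvalues.

```python
import math


def center_columns(table):
    """S121-S122: subtract each column's mean from its entries."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    centered = [[v - m for v, m in zip(row, means)] for row in table]
    return centered, means


def covariance(centered):
    """S123: covariance of centered data (1/n normalization is an assumption)."""
    n = len(centered)
    h = len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / n for j in range(h)]
            for i in range(h)]


def choose_components(eigenvalues, alpha):
    """S126: smallest b with 1 - p <= alpha, p assumed to be the cumulative share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept = 0.0
    for b, lam in enumerate(ordered, start=1):
        kept += lam
        if 1 - kept / total <= alpha:
            return b
    return len(ordered)


def unitize(vector):
    """S128: scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


centered, means = center_columns([[1.0, 2.0], [3.0, 6.0]])
cov = covariance(centered)
b = choose_components([6.0, 3.0, 1.0], alpha=0.15)
unit = unitize([3.0, 4.0])
```

With eigenvalues 6, 3, and 1 and a threshold of 0.15, one component leaves 40% of the variance unexplained and two components leave 10%, so two components are retained under these assumptions.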
2. Centering stage
S21: Create a data set $S^*$ corresponding to the dimension-reduced table data $S''$ and initialize it (the initialization formula is not reproduced in this text). Set the number $r$ of equivalence sets to obtain $r$ equivalence sets $D_1, \dots, D_r$. A further initialization follows (its formula is likewise not reproduced), and let $j = 1$;
S22: Arbitrarily select one record $s_i$ from $S''$ as the initial element of the equivalence set $D_j$, i.e. $D_j = \{s_i\}$ and $S'' = S'' - \{s_i\}$;
S23: Find the record $s_i$ in $S''$ that is closest to the equivalence set $D_j$; $D_j \leftarrow D_j + \{s_i\}$, $S'' = S'' - \{s_i\}$; repeat this step until the number of records in $D_j$ is greater than or equal to $k$;
S24: Center the elements of the equivalence set $D_j$: calculate the mean of each column attribute in $D_j$ and replace the specific values of that column attribute with the mean, obtaining a new equivalence set $D'_j$;
S25: $S^* = S^* + \{D'_j\}$; if $j < 5$, let $j = j + 1$ and repeat from step S22; otherwise, end.
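Steps S22 to S25 can be sketched as a greedy grouping followed by mean replacement. This is a minimal illustration under stated assumptions, not the patented implementation: the names `distance_to_set` and `microaggregate` are hypothetical, the distance from a record to an equivalence set is assumed to be the squared Euclidean distance to the set's centroid, the seed record is simply the first remaining one, the loop runs until the records are exhausted rather than for a fixed number of sets, and leftover records (fewer than k) are merged into the last set.

```python
def distance_to_set(record, group):
    """Assumed distance: squared Euclidean distance to the group's centroid."""
    centroid = [sum(col) / len(group) for col in zip(*group)]
    return sum((a - c) ** 2 for a, c in zip(record, centroid))


def microaggregate(table, k):
    """Group records into sets of at least k and replace values by column means."""
    remaining = [list(row) for row in table]
    published = []
    while remaining:
        # S22: seed a new equivalence set (the source allows any record).
        group = [remaining.pop(0)]
        # S23: repeatedly add the nearest remaining record until the set has k records.
        while len(group) < k and remaining:
            nearest = min(range(len(remaining)),
                          key=lambda i: distance_to_set(remaining[i], group))
            group.append(remaining.pop(nearest))
        # Assumption: too few leftovers to form another set, so merge them here.
        if len(remaining) < k:
            group.extend(remaining)
            remaining = []
        # S24: replace every record in the set by the column means.
        means = [sum(col) / len(group) for col in zip(*group)]
        # S25: append the centered set to the published data.
        published.extend([means[:] for _ in group])
    return published


published = microaggregate([[1.0], [2.0], [10.0], [11.0]], k=2)
```

The output is ordered by equivalence set rather than by original row position, so a caller that needs to keep row alignment would have to track indices separately.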
Specifically: (1) Existing privacy protection methods for data release have difficulty preserving the clustering usability of the original numerical data. To address this, the invention applies a clustering approach: it constructs equivalence sets over the n records according to the distance minimization principle and replaces the attribute values within each equivalence set with their mean, which achieves centered anonymization, ensures the privacy and security of the data, and keeps the information loss small; the effectiveness and security of the algorithm are also analyzed theoretically. (2) Existing data release protection mechanisms suffer from large data overhead and high computational complexity. To address this, the invention applies privacy protection after reducing the dimension of the data: the numerical nonlinear data is converted into linear data through kernelization, and the linear data is then reduced in dimension by principal component analysis. This removes redundant information, simplifies computation, and reduces unnecessary overhead. (3) By defining reasonable distances between two records and between a record and an equivalence set, the invention lets these distances positively reflect the information loss caused by centering.
Example 3
Referring to FIG. 2, a data desensitization device 1 includes a memory 2, a processor 3, and a computer program stored in the memory 2 and executable on the processor 3, wherein the processor 3 implements the steps of Example 1 when executing the program.
The foregoing describes only embodiments of the present invention and is not intended to limit its scope; all equivalent changes made on the basis of the specification and drawings of the present invention, or direct or indirect applications in related technical fields, are likewise included in the scope of the present invention.

Claims (2)

1. A data desensitization method, comprising the following steps:
S1, acquiring original data and performing kernelization to obtain new data;
S2, performing dimension reduction on the new data to obtain dimension-reduced data;
S3, centering the dimension-reduced data to obtain desensitized data;
wherein step S1 specifically comprises:
acquiring the original data, and kernelizing the nonlinear data in the original data to convert it into linear data, so as to obtain the new data;
S111: extracting the nonlinear data in the table data and representing it by a matrix $A = (A_1, A_2, \dots, A_n)^T$, where $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ is the $f$-th row of $A$;
S112: projecting each row $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ of $A$ in turn onto the hyperplane $Z_f = (z_{f1}, z_{f2}, \dots, z_{fl})$ to obtain the projected data, where $f = 1, 2, \dots, n$;
S113: acquiring the $f$-th row of the projected data: each $z_{fj}$ satisfies a projection condition (the formula is not reproduced in this text), where $w_{fi}$ is the image of the data $a_{fi}$ in the hyperplane, i.e. $w_{fi} = \varphi(a_{fi})$, and $w_{fi}^{T}$ denotes the transpose of $w_{fi}$;
S114: calculating $z_{fj}$ (the formula is not reproduced in this text) from the $j$-th component of $a_{fi}$ and from $\lambda_f$, the eigenvalue of $A_f$;
S115: introducing a kernel function: $k_f(a_{fi}, a_{fj}) = \varphi(a_{fi})^T \varphi(a_{fj})$;
S116: calculating $K_f a_{fi}^{j} = \lambda_f a_{fi}^{j}$, where $K_f$ is the kernelized vector of the $f$-th row of data;
S117: finally obtaining the kernel matrix: $K = (K_1, K_2, \dots, K_n)^T$;
step S2 specifically comprises:
performing dimension reduction on the new data through principal component analysis to obtain the dimension-reduced data;
S121: calculating the mean of each column of data: $E_j = \frac{1}{n}\sum_{i=1}^{n} s'_{ij}$, where $j = 1, \dots, h$;
S122: centering each item of the linear data, i.e. subtracting the mean of the corresponding column from it: $s'_{ij} \leftarrow s'_{ij} - E_j$, where $i = 1, \dots, n$;
S123: calculating the covariance matrix $F$ (the formula is not reproduced in this text);
S124: performing eigenvalue decomposition on $F$ to obtain the eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mu_i$: letting $|\lambda E - F| = 0$ and solving for $\lambda$ to obtain the eigenvalues; substituting each value of $\lambda$ into $(\lambda E - F)\mu = 0$, the linearly independent solutions being the eigenvectors;
S125: sorting the eigenvalues $\lambda_i$: $\lambda_1 > \lambda_2 > \dots > \lambda_n$, with corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_n$;
S126: selecting the number of principal components: given a usability threshold $\alpha$ and a parameter $b$ for the number of retained principal components, testing whether $1 - p \le \alpha$; if the inequality holds, outputting $b$, otherwise letting $b = b + 1$, wherein $p = \sum_{i=1}^{b}\lambda_i \big/ \sum_{i=1}^{n}\lambda_i$;
S127: outputting the set of eigenvectors corresponding to the first $b$ eigenvalues: $V_b = \{\mu_1, \mu_2, \dots, \mu_b\}$;
S128: unitizing the eigenvectors in $V_b$ to obtain the feature matrix $A'$: first calculating the modulus of each eigenvector in $V_b$: $|\mu_i| = \sqrt{\sum_{j} \mu_{ij}^2}$, then unitizing to obtain the matrix of unit vectors: $A' = \left(\frac{\mu_1}{|\mu_1|}, \frac{\mu_2}{|\mu_2|}, \dots, \frac{\mu_b}{|\mu_b|}\right)$;
S129: calculating the projection: $S''_{n\times b} = S'_{n\times h} A'$;
the centering includes:
constructing the equivalence sets corresponding to the dimension-reduced data according to the distance minimization principle;
S21: creating a data set $S^*$ corresponding to the dimension-reduced table data $S''$ and initializing it (the initialization formula is not reproduced in this text); setting the number $r$ of equivalence sets to obtain $r$ equivalence sets $D_1, \dots, D_r$; performing a further initialization (its formula is likewise not reproduced), and letting $j = 1$;
S22: arbitrarily selecting one record $s''_i$ from $S''$ as the initial element of the equivalence set $D_j$, i.e. $D_j = \{s''_i\}$ and $S'' = S'' - \{s''_i\}$;
S23: finding the record $s''_i$ in $S''$ that is closest to the equivalence set $D_j$; $D_j \leftarrow D_j + \{s''_i\}$, $S'' = S'' - \{s''_i\}$; repeating this step until the number of records in $D_j$ is greater than or equal to $k$;
the centering further includes:
calculating the mean of each column of the dimension-reduced data, and replacing the specific values of each column with that column's mean;
S24: centering the elements of the equivalence set $D_j$: calculating the mean of each column attribute in $D_j$ and replacing the specific values of that column attribute with the mean, obtaining a new equivalence set $D'_j$;
S25: $S^* = S^* + \{D'_j\}$; if $j < 5$, letting $j = j + 1$ and repeating from step S22; otherwise, ending;
wherein, by kernelizing the numerical nonlinear data in the original table data, the table data is converted into $S'_{n\times h} = (S_{n\times(h-l)}, K_{n\times l})$; next, the numerical linear data in $S'_{n\times h}$ is reduced in dimension by principal component analysis to obtain the dimension-reduced table data $S''$.
2. A data desensitization device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
S1, acquiring original data and performing kernelization to obtain new data;
S2, performing dimension reduction on the new data to obtain dimension-reduced data;
S3, centering the dimension-reduced data to obtain desensitized data;
wherein step S1 specifically comprises:
acquiring the original data, and kernelizing the nonlinear data in the original data to convert it into linear data, so as to obtain the new data;
S111: extracting the nonlinear data in the table data and representing it by a matrix $A = (A_1, A_2, \dots, A_n)^T$, where $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ is the $f$-th row of $A$;
S112: projecting each row $A_f = (a_{f1}, a_{f2}, \dots, a_{fl})$ of $A$ in turn onto the hyperplane $Z_f = (z_{f1}, z_{f2}, \dots, z_{fl})$ to obtain the projected data, where $f = 1, 2, \dots, n$;
S113: acquiring the $f$-th row of the projected data: each $z_{fj}$ satisfies a projection condition (the formula is not reproduced in this text), where $w_{fi}$ is the image of the data $a_{fi}$ in the hyperplane, i.e. $w_{fi} = \varphi(a_{fi})$, and $w_{fi}^{T}$ denotes the transpose of $w_{fi}$;
S114: calculating $z_{fj}$ (the formula is not reproduced in this text) from the $j$-th component of $a_{fi}$ and from $\lambda_f$, the eigenvalue of $A_f$;
S115: introducing a kernel function: $k_f(a_{fi}, a_{fj}) = \varphi(a_{fi})^T \varphi(a_{fj})$;
S116: calculating $K_f a_{fi}^{j} = \lambda_f a_{fi}^{j}$, where $K_f$ is the kernelized vector of the $f$-th row of data;
S117: finally obtaining the kernel matrix: $K = (K_1, K_2, \dots, K_n)^T$;
step S2 specifically comprises:
performing dimension reduction on the new data through principal component analysis to obtain the dimension-reduced data;
S121: calculating the mean of each column of data: $E_j = \frac{1}{n}\sum_{i=1}^{n} s'_{ij}$, where $j = 1, \dots, h$;
S122: centering each item of the linear data, i.e. subtracting the mean of the corresponding column from it: $s'_{ij} \leftarrow s'_{ij} - E_j$, where $i = 1, \dots, n$;
S123: calculating the covariance matrix $F$ (the formula is not reproduced in this text);
S124: performing eigenvalue decomposition on $F$ to obtain the eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mu_i$: letting $|\lambda E - F| = 0$ and solving for $\lambda$ to obtain the eigenvalues; substituting each value of $\lambda$ into $(\lambda E - F)\mu = 0$, the linearly independent solutions being the eigenvectors;
S125: sorting the eigenvalues $\lambda_i$: $\lambda_1 > \lambda_2 > \dots > \lambda_n$, with corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_n$;
S126: selecting the number of principal components: given a usability threshold $\alpha$ and a parameter $b$ for the number of retained principal components, testing whether $1 - p \le \alpha$; if the inequality holds, outputting $b$, otherwise letting $b = b + 1$, wherein $p = \sum_{i=1}^{b}\lambda_i \big/ \sum_{i=1}^{n}\lambda_i$;
S127: outputting the set of eigenvectors corresponding to the first $b$ eigenvalues: $V_b = \{\mu_1, \mu_2, \dots, \mu_b\}$;
S128: unitizing the eigenvectors in $V_b$ to obtain the feature matrix $A'$: first calculating the modulus of each eigenvector in $V_b$: $|\mu_i| = \sqrt{\sum_{j} \mu_{ij}^2}$, then unitizing to obtain the matrix of unit vectors: $A' = \left(\frac{\mu_1}{|\mu_1|}, \frac{\mu_2}{|\mu_2|}, \dots, \frac{\mu_b}{|\mu_b|}\right)$;
S129: calculating the projection: $S''_{n\times b} = S'_{n\times h} A'$;
the centering includes:
constructing the equivalence sets corresponding to the dimension-reduced data according to the distance minimization principle;
S21: creating a data set $S^*$ corresponding to the dimension-reduced table data $S''$ and initializing it (the initialization formula is not reproduced in this text); setting the number $r$ of equivalence sets to obtain $r$ equivalence sets $D_1, \dots, D_r$; performing a further initialization (its formula is likewise not reproduced), and letting $j = 1$;
S22: arbitrarily selecting one record $s''_i$ from $S''$ as the initial element of the equivalence set $D_j$, i.e. $D_j = \{s''_i\}$ and $S'' = S'' - \{s''_i\}$;
S23: finding the record $s''_i$ in $S''$ that is closest to the equivalence set $D_j$; $D_j \leftarrow D_j + \{s''_i\}$, $S'' = S'' - \{s''_i\}$; repeating this step until the number of records in $D_j$ is greater than or equal to $k$;
the centering further includes:
calculating the mean of each column of the dimension-reduced data, and replacing the specific values of each column with that column's mean;
S24: centering the elements of the equivalence set $D_j$: calculating the mean of each column attribute in $D_j$ and replacing the specific values of that column attribute with the mean, obtaining a new equivalence set $D'_j$;
S25: $S^* = S^* + \{D'_j\}$; if $j < 5$, letting $j = j + 1$ and repeating from step S22; otherwise, ending;
wherein, by kernelizing the numerical nonlinear data in the original table data, the table data is converted into $S'_{n\times h} = (S_{n\times(h-l)}, K_{n\times l})$; next, the numerical linear data in $S'_{n\times h}$ is reduced in dimension by principal component analysis to obtain the dimension-reduced table data $S''$.
CN202010071239.3A 2020-01-21 2020-01-21 Data desensitization method and device Active CN111241587B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010071239.3A CN111241587B (en) 2020-01-21 2020-01-21 Data desensitization method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010071239.3A CN111241587B (en) 2020-01-21 2020-01-21 Data desensitization method and device

Publications (2)

Publication Number Publication Date
CN111241587A (en) 2020-06-05
CN111241587B (en) 2023-09-29

Family

ID=70874874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010071239.3A Active CN111241587B (en) 2020-01-21 2020-01-21 Data desensitization method and device

Country Status (1)

Country Link
CN (1) CN111241587B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797919A (en) * 2020-06-30 2020-10-20 三峡大学 Dynamic security assessment method based on principal component analysis and convolutional neural network
CN112379712A (en) * 2020-11-26 2021-02-19 恒瑞通(福建)信息技术有限公司 Pig farm monitoring method and terminal serving rural area happy field

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013072930A2 (en) * 2011-09-28 2013-05-23 Tata Consultancy Services Limited System and method for database privacy protection
CN107273757A (en) * 2017-04-23 2017-10-20 西安电子科技大学 A kind of method of the processing big data based on l diversity rules and MDAV algorithms
CN108052832A (en) * 2017-11-28 2018-05-18 河海大学 A kind of micro- aggregation de-identification method based on sequence
CN108629371A (en) * 2018-05-02 2018-10-09 电子科技大学 A kind of Method of Data with Adding Windows to two-dimentional time-frequency data
CN108921230A (en) * 2018-07-25 2018-11-30 浙江浙能嘉华发电有限公司 Method for diagnosing faults based on class mean value core pivot element analysis and BP neural network
CN110069943A (en) * 2019-03-29 2019-07-30 中国电力科学研究院有限公司 A kind of data processing method and system based on cluster anonymization and difference secret protection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201426578A (en) * 2012-12-27 2014-07-01 Ind Tech Res Inst Generation method and device and risk assessment method and device for anonymous dataset

Also Published As

Publication number Publication date
CN111241587A (en) 2020-06-05

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant