CN112231745A - Big data security and privacy protection method based on geometric deformation and storage medium - Google Patents
- Publication number
- CN112231745A (application number CN202010914945.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- transformation
- geometric deformation
- privacy protection
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
Abstract
The invention relates to the field of big data processing and discloses a big data security and privacy protection method based on geometric deformation, and a storage medium. The method comprises the following steps. Establishing attribute sensitivity sets: the attributes of the data are divided into four sets representing four degrees of sensitivity. Cleaning data: incomplete data entries are deleted and discrete data are serialized to obtain the data to be divided. Dividing data: the data corresponding to all attributes in the same sensitivity set are selected and placed in different columns of one matrix, forming a same-sensitivity data set. Geometric deformation: the corresponding translation, scaling, rotation or similarity transformation is applied to each same-sensitivity data set, and the transformation parameters are recorded for use in the subsequent inverse transformation. Obtaining the final data set: the four processed data sets are combined into one data set. The invention provides a simple, efficient and hierarchical data privacy protection method for the release and transmission of massive data, and the data can be restored through the inverse geometric transformation.
Description
Technical Field
The invention relates to the technical field of big data processing, in particular to a big data security and privacy protection method and a storage medium based on geometric deformation.
Background
When data is shared, exchanged, released and used in a big data environment, a method that protects data privacy in a graded manner can resist attacks based on data analysis techniques such as cluster analysis, and can address the privacy and security problems of big data at the data source. Currently, the main methods for privacy protection in big data environments are techniques based on data distortion, techniques based on data encryption, and techniques based on restricted release (data anonymity). Their features and disadvantages are as follows.
(1) Method one: techniques based on data distortion
This method distorts sensitive data by adding noise, introducing random factors, applying linear transformations to private vectors, and similar means, so that the original data is disguised. The processing is fast, but its security is low, and because it trades data accuracy for privacy it affects the data analysis results; in general, only approximate calculation results can be obtained.
(2) Method two: techniques based on data encryption
This method encrypts sensitive data so that it remains hidden during data mining; specific approaches include secure multi-party computation (SMC) and distributed anonymization. It resists distributed data mining well and offers high security. However, because every sensitive attribute is encrypted, the data is difficult to recover and use, and the privacy protection processing before release incurs a large computational overhead. In addition, both the choice of cryptographic algorithm and the protection of the keys must be considered separately.
(3) Method three: techniques based on data anonymity
This method releases the original data selectively, deleting or modifying explicit identifiers of personal information and highly sensitive data so that specific individuals cannot be identified, thereby achieving privacy protection. The techniques include generalization, clustering and suppression. The aim of anonymity-based techniques is to keep the risk of disclosing sensitive data and private information within a tolerable range rather than to guarantee complete security, so they are all vulnerable to targeted attacks that result in privacy disclosure.
Disclosure of Invention
To solve the above problems, the invention provides a big data security and privacy protection method based on geometric deformation, and a storage medium. The invention addresses four problems: traditional methods cannot ensure both the efficiency of processing sensitive data with a privacy protection technique and the security of the data; data that has undergone privacy protection cannot be recovered and reused simply and efficiently when needed; privacy protection processing cannot be carried out efficiently and uniformly across the three stages of data source, data frame and data analysis, which the invention resolves by providing privacy protection from the local level to the whole; and traditional anonymity-based privacy protection techniques only keep privacy disclosure within a tolerance and cannot achieve complete privacy protection.
The invention relates to a big data security and privacy protection method based on geometric deformation, which comprises the following steps:
establishing attribute sensitivity sets: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: selecting the data corresponding to all attributes in the same sensitivity set and placing them in different columns of one matrix to form a same-sensitivity data set;
geometric deformation: applying the corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in the subsequent inverse transformation;
obtaining the final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection; if the original data is later needed for big data analysis, it is obtained by applying the inverse transformation.
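The dividing and merging steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the attribute names, the sensitivity level assigned to each attribute, and the helper names `divide` and `merge` are all hypothetical.

```python
# Hypothetical sensitivity assignment; names and levels are illustrative only.
SENSITIVITY = {
    "height": "insensitive",
    "weight": "insensitive",
    "age": "low",
    "occupation": "medium",
    "income": "high",
}
LEVELS = ("insensitive", "low", "medium", "high")


def divide(records):
    """Group attribute columns by sensitivity level.

    Returns {level: (attribute_names, rows)}; each row holds the values of
    that level's attributes for one record (one matrix per level).
    """
    groups = {}
    for level in LEVELS:
        names = [a for a, lv in SENSITIVITY.items() if lv == level]
        rows = [[rec[a] for a in names] for rec in records]
        groups[level] = (names, rows)
    return groups


def merge(groups, n_records):
    """Recombine the per-level matrices into one list of records."""
    merged = [{} for _ in range(n_records)]
    for names, rows in groups.values():
        for rec, row in zip(merged, rows):
            rec.update(zip(names, row))
    return merged


records = [
    {"height": 170.0, "weight": 65.0, "age": 30.0, "occupation": 1.0, "income": 5000.0},
    {"height": 182.0, "weight": 80.0, "age": 45.0, "occupation": 2.0, "income": 9000.0},
]
groups = divide(records)
restored = merge(groups, len(records))
```

In a full pipeline, each matrix in `groups` would be perturbed by the transformation chosen for its level before `merge` is called.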
Further, in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $t_x$ is the translation in the horizontal direction, and $t_y$ is the translation in the vertical direction.
Further, the translation transformation comprises the following sub-steps:
S1, inputting a privacy attribute set $V$ and a noise set $T_{add}$, selecting two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a preset value, and selecting an additive noise term $e_j$ from the noise set $T_{add}$;
S2, forming a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
S3, performing the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
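Sub-steps S1 to S3 might look like the sketch below. The source does not fix a data layout, so the representation of V as a list of attribute columns, the representation of each noise term as a two-component tuple, and the function signature are assumptions made for illustration.

```python
def transform(V, T_add, j=0, k=1):
    """Translate the attribute pair (A_j, A_{j+k}) of V by the noise term e_j.

    V is a list of attribute columns and T_add a list of (t_x, t_y) terms;
    both layouts are assumptions of this sketch.
    """
    e_j = T_add[j]                          # S1: pick the additive noise term
    pairs = list(zip(V[j], V[j + k]))       # S2: n x 2 matrix of (A_j, A_{j+k})
    moved = [(x + e_j[0], y + e_j[1]) for x, y in pairs]  # S3: X_t = X_{t-1} + T
    out = [list(col) for col in V]          # copy so the input stays untouched
    out[j] = [p[0] for p in moved]
    out[j + k] = [p[1] for p in moved]
    return out


V = [[25.0, 40.0], [3000.0, 8000.0]]        # hypothetical columns A_0 and A_1
T_add = [(5.0, -200.0)]                     # hypothetical noise set
V_new = transform(V, T_add)
```

Returning a copy rather than mutating `V` keeps the original columns available for checking the inverse transformation later.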
Further, in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and the scalar $s$ is the uniform scaling factor.
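A small sketch of the scaling perturbation and its inverse follows. The function names and sample values are hypothetical, and the requirement that s be nonzero is an assumption needed for the inverse to exist rather than something stated in the source.

```python
def scale(points, s):
    """Apply X_t = s * X_{t-1} to every (x, y) pair."""
    return [(s * x, s * y) for x, y in points]


def unscale(points, s):
    """Invert the scaling; only defined for a nonzero factor."""
    if s == 0:
        raise ValueError("s must be nonzero for the scaling to be invertible")
    return [(x / s, y / s) for x, y in points]


data = [(170.0, 65.0), (182.0, 80.0)]   # hypothetical (height, weight) pairs
scaled = scale(data, 2.0)
recovered = unscale(scaled, 2.0)
```

With a factor such as 2.0 the round trip is exact in floating point; for arbitrary factors the recovered values may differ from the originals by rounding error, so comparisons would normally use a tolerance.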
Further, in the rotation transformation, the rotation perturbation formula is:
$$X_t = R X_{t-1}, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and $\theta$ is the rotation angle.
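The rotation perturbation can be illustrated with the standard counter-clockwise two-dimensional rotation matrix. That sign convention, the function name and the sample points are assumptions of this sketch.

```python
import math


def rotate(points, theta):
    """Rotate each (x, y) pair counter-clockwise by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


pts = [(1.0, 0.0), (0.0, 2.0)]              # hypothetical attribute pairs
rotated = rotate(pts, math.pi / 2)          # approximately (0, 1) and (-2, 0)
restored = rotate(rotated, -math.pi / 2)    # inverse: rotate by -theta
```

Rotation preserves each point's distance from the origin, and the inverse is simply a rotation by the negated angle, so only theta needs to be recorded.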
Further, in the similarity transformation, the similarity perturbation formula is:
$$X_t = s R X_{t-1} + T, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $\theta$ is the angle parameter of the rotation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of the rotation, translation and scaling perturbations.
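As an illustration of the combined perturbation, the sketch below applies rotation, then uniform scaling, then translation, and inverts the result from the recorded parameters. The order of composition, the function names and the sample values are assumptions; the source only states that the similarity perturbation combines the three.

```python
import math


def similarity(points, s, theta, tx, ty):
    """Apply X_t = s * R(theta) * X_{t-1} + T to every (x, y) pair."""
    c, sn = math.cos(theta), math.sin(theta)
    return [(s * (c * x - sn * y) + tx, s * (sn * x + c * y) + ty)
            for x, y in points]


def inverse_similarity(points, s, theta, tx, ty):
    """Recover X_{t-1} = (1/s) * R(theta)^T * (X_t - T); requires s != 0."""
    c, sn = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        u, v = x - tx, y - ty                       # undo the translation
        out.append(((c * u + sn * v) / s,           # undo rotation and scaling
                    (-sn * u + c * v) / s))
    return out


data = [(30.0, 5000.0), (45.0, 9000.0)]             # hypothetical (age, income)
params = dict(s=1.5, theta=math.pi / 6, tx=-3.0, ty=1000.0)
perturbed = similarity(data, **params)
recovered = inverse_similarity(perturbed, **params)
```

The four recorded parameters are everything the inverse needs, which is why the method stores them at perturbation time.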
The invention relates to a storage medium, which stores a computer program, wherein the computer program realizes the steps of the big data security and privacy protection method based on geometric deformation when being executed by a processor.
The invention has the beneficial effects that:
(1) The invention protects privacy by perturbing data with the methods used in computer vision to compute geometric transformations between images. The data can be perturbed at the data source, the lowest layer of the system. In the analysis stage, the perturbed data causes cluster analysis to fail or to yield wrong results, and the overall data security and privacy of the big data system is effectively protected, achieving security and privacy protection from the local level to the whole.
(2) The invention establishes same-sensitivity sets in a user-defined manner and, for each sensitivity requirement, processes the data with a corresponding geometric transformation such as translation, scaling or rotation. Processing by sensitivity class reduces the computational overhead while ensuring targeted privacy protection.
(3) Because the invention uses geometric transformations, the corresponding inverse transformation can be applied when needed to recover the data completely and without distortion, which ensures the authenticity of subsequent big data analysis.
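The recoverability described in effect (3) follows from each transformation having a closed-form inverse. As a sketch, assuming the rotation uses the standard matrix $R(\theta)$ and the similarity transformation takes the form $X_t = sR(\theta)X_{t-1} + T$ (both are assumptions about the exact form), the inverses are:

```latex
\begin{aligned}
\text{translation:}\quad & X_{t-1} = X_t - T \\
\text{scaling:}\quad     & X_{t-1} = \tfrac{1}{s}\, X_t, \qquad s \neq 0 \\
\text{rotation:}\quad    & X_{t-1} = R(\theta)^{\top} X_t = R(-\theta)\, X_t \\
\text{similarity:}\quad  & X_{t-1} = \tfrac{1}{s}\, R(\theta)^{\top} \left( X_t - T \right), \qquad s \neq 0
\end{aligned}
```

Each inverse depends only on the recorded parameters $T$, $s$ and $\theta$. In floating-point arithmetic the recovered values may differ from the originals by rounding error.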
In summary, the invention provides a simple, efficient and hierarchical data privacy protection method for the release and transmission of massive data. It distinguishes degrees of attribute sensitivity with user-defined attribute sensitivity sets and efficiently performs targeted, class-by-class privacy protection using geometric deformation techniques from the field of digital images. The data can finally be restored through the inverse geometric transformation, which ensures the authenticity of big data analysis. The invention also relies on techniques and algorithms that are easy to implement and relatively mature, and it is applicable to every stage of the big data life cycle that requires data privacy protection.
Drawings
FIG. 1 is a flow chart of a big data privacy protection method based on geometric deformation of the invention;
FIG. 2 is a data state before perturbation;
FIG. 3 is a data state after perturbation.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, specific embodiments of the present invention will now be described. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the present invention provides a big data security and privacy protection method based on geometric deformation, which includes the following steps:
establishing attribute sensitivity sets: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: selecting the data corresponding to all attributes in the same sensitivity set and placing them in different columns of one matrix to form a same-sensitivity data set;
geometric deformation: applying the corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in the subsequent inverse transformation;
obtaining the final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection; if the original data is later needed for big data analysis, it is obtained by applying the inverse transformation.
In an embodiment of the present invention, in the step of establishing the attribute-sensitive set, the insensitive attribute may specifically include "height", "weight", and the like, and the highly sensitive attribute may specifically include "income", "identification number", and the like.
In one embodiment of the present invention, in the data cleaning step, incomplete data entries are deleted, and discrete data may be serialized as follows: a categorical attribute such as occupation, with values {student, teacher, doctor, nurse}, is rewritten as numeric codes, yielding the data to be divided.
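The cleaning step might be sketched as below. The integer codes assigned to each occupation, the use of `None` to mark a missing value, the choice to drop records with an unlisted occupation, and the helper name `clean` are all assumptions made for illustration.

```python
OCCUPATIONS = ["student", "teacher", "doctor", "nurse"]
# Hypothetical encoding: 1.0 for student through 4.0 for nurse.
CODE = {name: float(i + 1) for i, name in enumerate(OCCUPATIONS)}


def clean(records, required):
    """Drop incomplete entries and replace the occupation with a numeric code."""
    cleaned = []
    for rec in records:
        if any(rec.get(attr) is None for attr in required):
            continue                        # incomplete entry: delete it
        code = CODE.get(rec["occupation"])
        if code is None:
            continue                        # unlisted occupation: treated as incomplete
        row = dict(rec)                     # copy so the input is not modified
        row["occupation"] = code
        cleaned.append(row)
    return cleaned


raw = [
    {"age": 30.0, "occupation": "teacher", "income": 5000.0},
    {"age": 22.0, "occupation": "student", "income": None},
    {"age": 45.0, "occupation": "doctor", "income": 9000.0},
]
to_divide = clean(raw, required=("age", "occupation", "income"))
```

After this step every remaining attribute is numeric, so each attribute pair can be treated as a point in the plane.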
In one embodiment of the present invention, in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $t_x$ is the translation in the horizontal direction, and $t_y$ is the translation in the vertical direction.
Specifically, when age and income are perturbed with $T = [-3, 1000]^{\top}$, the data states before and after the translation perturbation are as shown in FIG. 2 and FIG. 3.
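The example with T = [-3, 1000] can be worked through in a few lines. The (age, income) records are made up for illustration, and the helper names are hypothetical.

```python
T = (-3.0, 1000.0)                      # translation from the embodiment


def translate(points, t):
    """Apply X_t = X_{t-1} + T to every (age, income) pair."""
    return [(x + t[0], y + t[1]) for x, y in points]


def untranslate(points, t):
    """Inverse transformation: subtract the recorded translation."""
    return [(x - t[0], y - t[1]) for x, y in points]


data = [(25.0, 3000.0), (40.0, 8000.0)]     # hypothetical (age, income) records
perturbed = translate(data, T)              # ages drop by 3, incomes rise by 1000
recovered = untranslate(perturbed, T)
```

Every record is shifted by the same vector, so the relative positions of the points are unchanged while the published values differ from the true ones.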
More specifically, the translation transformation comprises the following sub-steps:
S1, inputting a privacy attribute set $V$ and a noise set $T_{add}$, selecting two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a preset value, and selecting an additive noise term $e_j$ from the noise set $T_{add}$;
S2, forming a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
S3, performing the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
In one embodiment of the present invention, in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and the scalar $s$ is the uniform scaling factor.
In one embodiment of the present invention, in the rotation transformation, the rotation perturbation formula is:
$$X_t = R X_{t-1}, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and $\theta$ is the rotation angle.
In one embodiment of the present invention, in the similarity transformation, the similarity perturbation formula is:
$$X_t = s R X_{t-1} + T, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $\theta$ is the angle parameter of the rotation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of the rotation, translation and scaling perturbations.
The invention also provides a storage medium storing a computer program, which when executed by a processor implements the steps of the above-mentioned big data security and privacy protection method based on geometric deformation.
The foregoing describes preferred embodiments of the invention. It is to be understood that the invention is not limited to the precise forms disclosed herein; various other combinations, modifications and environments may be used within the scope of the concept disclosed herein, whether as described above or as apparent to those skilled in the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention fall within the protection scope of the appended claims.
Claims (7)
1. A big data security and privacy protection method based on geometric deformation is characterized by comprising the following steps:
establishing attribute sensitivity sets: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: selecting the data corresponding to all attributes in the same sensitivity set and placing them in different columns of one matrix to form a same-sensitivity data set;
geometric deformation: applying the corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in the subsequent inverse transformation;
obtaining the final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection; if the original data is later needed for big data analysis, it is obtained by applying the inverse transformation.
2. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $t_x$ is the translation in the horizontal direction, and $t_y$ is the translation in the vertical direction.
3. The big data security and privacy protection method based on geometric deformation according to claim 2, wherein the translation transformation comprises the following sub-steps:
S1, inputting a privacy attribute set $V$ and a noise set $T_{add}$, selecting two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a preset value, and selecting an additive noise term $e_j$ from the noise set $T_{add}$;
S2, forming a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
S3, performing the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
4. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and the scalar $s$ is the uniform scaling factor.
5. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the rotation transformation, the rotation perturbation formula is:
$$X_t = R X_{t-1}, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, and $\theta$ is the rotation angle.
6. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the similarity transformation, the similarity perturbation formula is:
$$X_t = s R X_{t-1} + T, \quad R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \quad T = [t_x, t_y]^{\top}$$
where $X_t = [x_t, y_t]^{\top}$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before the transformation, $\theta$ is the angle parameter of the rotation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of the rotation, translation and scaling perturbations.
7. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the big data security and privacy protection method based on geometric deformation according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010914945.XA CN112231745A (en) | 2020-09-03 | 2020-09-03 | Big data security and privacy protection method based on geometric deformation and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010914945.XA CN112231745A (en) | 2020-09-03 | 2020-09-03 | Big data security and privacy protection method based on geometric deformation and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112231745A true CN112231745A (en) | 2021-01-15 |
Family
ID=74117016
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010914945.XA (Pending), published as CN112231745A (en) | 2020-09-03 | 2020-09-03 | Big data security and privacy protection method based on geometric deformation and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112231745A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112883389A (en) * | 2021-02-09 | 2021-06-01 | 上海凯馨信息科技有限公司 | Reversible desensitization algorithm supporting feature preservation |
CN113160348A (en) * | 2021-05-20 | 2021-07-23 | 深圳文达智通技术有限公司 | Recoverable face image privacy protection method, device, equipment and storage medium |
CN114491650A (en) * | 2022-04-13 | 2022-05-13 | 武汉光谷信息技术股份有限公司 | Geographical space information desensitization encryption method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160085982A1 (en) * | 2010-07-29 | 2016-03-24 | Oracle International Corporation | System and method for real-time transactional data obfuscation |
CN110020546A (en) * | 2019-01-07 | 2019-07-16 | 南京邮电大学 | A kind of private data cascade protection method |
CN110134719A (en) * | 2019-05-17 | 2019-08-16 | 贵州大学 | A kind of identification of structural data Sensitive Attributes and stage division of classifying |
CN111008368A (en) * | 2019-11-26 | 2020-04-14 | 武汉大学 | Grid-based collaborative design safety product data exchange method and system |
CN111199048A (en) * | 2020-01-02 | 2020-05-26 | 航天信息股份有限公司 | Big data grading desensitization method and system based on container with life cycle |
-
2020
- 2020-09-03 CN CN202010914945.XA patent/CN112231745A/en active Pending
Non-Patent Citations (1)
Title |
---|
Xu Jie et al.: "Big data security and privacy protection method based on geometric deformation", Communications Technology (通信技术) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112883389A (en) * | 2021-02-09 | 2021-06-01 | 上海凯馨信息科技有限公司 | Reversible desensitization algorithm supporting feature preservation |
CN113160348A (en) * | 2021-05-20 | 2021-07-23 | 深圳文达智通技术有限公司 | Recoverable face image privacy protection method, device, equipment and storage medium |
CN114491650A (en) * | 2022-04-13 | 2022-05-13 | 武汉光谷信息技术股份有限公司 | Geographical space information desensitization encryption method and system |
CN114491650B (en) * | 2022-04-13 | 2022-07-01 | 武汉光谷信息技术股份有限公司 | Method and system for desensitizing encryption of geographic spatial information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112231745A (en) | Big data security and privacy protection method based on geometric deformation and storage medium | |
US8769293B2 (en) | Systems and methods for rights protection of datasets with dataset structure preservation | |
Fadl et al. | Robust copy–move forgery revealing in digital images using polar coordinate system | |
Fang et al. | Robust zero-watermarking algorithm for medical images based on SIFT and Bandelet-DCT | |
WO2020173252A1 (en) | Method, system, and terminal for protecting deep neural network by means of self-locking mechanism | |
Aparna et al. | A blind medical image watermarking for secure E-healthcare application using crypto-watermarking system | |
Gong et al. | Robust and secure zero-watermarking algorithm for medical images based on Harris-SURF-DCT and chaotic map | |
Kaur et al. | A secure data classification model in cloud computing using machine learning approach | |
Sheng et al. | Zero watermarking algorithm for medical image based on Resnet50-DCT | |
Ju | An overview of face manipulation detection | |
CN103853946B (en) | A kind of GIS vector data copyright authentication method based on FCM cluster feature | |
Zhang et al. | Robust multi-watermarking algorithm for medical images based on GoogLeNet and Henon map | |
Lucchese et al. | Rights protection of trajectory datasets with nearest-neighbor preservation | |
Davidson et al. | Locating secret messages in images | |
Daoui et al. | New method for bio-signals zero-watermarking using quaternion shmaliy moments and short-time fourier transform | |
Al-Ardhi et al. | Copyright protection and content authentication based on linear cellular automata watermarking for 2D vector maps | |
Doegar et al. | A review of passive image cloning detection approaches | |
Yuan et al. | Robust zero‐watermarking algorithm based on discrete wavelet transform and daisy descriptors for encrypted medical image | |
Kumar et al. | A review on digital watermarking-based image forensic technique | |
Xiao et al. | A zero-watermarking algorithm for medical images based on Gabor-DCT | |
Zhang | Reversible Data Hiding of Digital Image Based on Pixel Combination Algorithm | |
Moataz | Watermarking medical scans: Saliency filter pixels | |
Geetha et al. | Blind image steganalysis based on content independent statistical measures maximizing the specificity and sensitivity of the system | |
Shopon et al. | I Got Your Emotion: Emotion Preserving Face De-identification Using Injection-Based Generative Adversarial Networks | |
CN111968023B (en) | Dual-image reversible data hiding method based on EMD matrix |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210115 |