CN112231745A - Big data security and privacy protection method based on geometric deformation and storage medium - Google Patents

Publication number
CN112231745A
Authority
CN
China
Prior art keywords
data
transformation
geometric deformation
privacy protection
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010914945.XA
Other languages
Chinese (zh)
Inventor
许杰
石凯
张锋军
李庆华
牛作元
朱王小江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 30 Research Institute
Original Assignee
CETC 30 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 30 Research Institute
Priority to CN202010914945.XA
Publication of CN112231745A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Abstract

The invention relates to the field of big data processing and discloses a big data security and privacy protection method based on geometric deformation, and a storage medium. The method comprises the following steps. Establishing an attribute sensitivity set: the attributes of the data are divided into four sets representing four degrees of sensitivity. Cleaning data: incomplete data entries are deleted and discrete data are serialized to obtain the data to be divided. Dividing data: the data corresponding to all attributes in the same sensitivity set are screened out and placed in different columns of the same matrix to form a same-sensitivity data set. Geometric deformation: a corresponding translation, scaling, rotation or similarity transformation is applied to each same-sensitivity data set, and the transformation parameters are recorded for use in subsequent inverse transformation. Obtaining the final data set: the four processed data sets are combined into one data set. The invention provides a simple, efficient and hierarchical data privacy protection method for the release and transmission of mass data, and the data can be restored through the inverse geometric deformation transformation.

Description

Big data security and privacy protection method based on geometric deformation and storage medium
Technical Field
The invention relates to the technical field of big data processing, in particular to a big data security and privacy protection method and a storage medium based on geometric deformation.
Background
When data is shared, exchanged, released and used in a big data environment, a method that protects data privacy in a graded manner can resist attacks by data analysis means such as cluster analysis and can address the privacy and security problems of big data at the data source. Currently, the main methods for providing privacy protection in big data environments are techniques based on data distortion, techniques based on data encryption, and techniques based on restricted release (data anonymity), which have the following features and disadvantages.
(1) Method one: techniques based on data distortion
This method distorts the sensitive data by adding noise, introducing random factors, applying linear transformations to private vectors, and similar means, so as to disguise the original data. The processing is fast, but security is low, and because accuracy is sacrificed, the data analysis results are affected; in general only approximate calculation results can be obtained.
(2) Method two: techniques based on data encryption
This method encrypts sensitive data so that it is hidden during the data mining process; specific approaches include secure multi-party computation (SMC), distributed anonymization and the like. It resists distributed data mining well and offers high security. However, because every sensitive attribute is encrypted, the data is difficult to recover and use, and the privacy protection processing before release incurs a large computational overhead. In addition, a cryptographic algorithm must be selected and the protection of the key must be considered separately.
(3) Method three: techniques based on data anonymity
This method releases the original data selectively, deleting or modifying explicit identifiers of highly personal information and sensitive data so that specific individuals cannot be determined, thereby achieving privacy protection. The techniques include generalization, clustering, suppression and the like. The aim of data-anonymity-based techniques is to keep the risk of disclosing sensitive data and privacy within a tolerable range rather than to guarantee complete security, so they are all vulnerable to targeted attacks that result in privacy disclosure.
Disclosure of Invention
To solve the above problems, the invention provides a big data security and privacy protection method based on geometric deformation, and a storage medium. The invention addresses the following problems: traditional methods cannot ensure both the efficiency of processing sensitive data with a privacy protection technique and the security of the data; privacy-protected data cannot be recovered and reused simply and efficiently on demand; privacy protection cannot be carried out efficiently and uniformly across the three stages of data source, data frame and data analysis, so that protection from local to whole is not achieved; and traditional anonymity-based privacy protection techniques only keep privacy disclosure within a tolerable level and cannot achieve complete privacy protection.
The invention provides a big data security and privacy protection method based on geometric deformation, which comprises the following steps:
establishing an attribute sensitivity set: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: screening out the data corresponding to all attributes in the same sensitivity set and placing them in different columns of the same matrix to form a same-sensitivity data set;
geometric deformation: applying a corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in subsequent inverse transformation;
obtaining a final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection processing; if the original data is subsequently needed for big data analysis, it is obtained by inverse transformation.
Further, in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $t_x$ is the translation amount in the horizontal direction, and $t_y$ is the translation amount in the vertical direction.
Further, the translation transformation comprises the following sub-steps:
Step S1: input a privacy attribute set $V$ and a noise set $T_{add}$; select two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a predetermined value; and select an additive noise term $e_j$ from the noise set $T_{add}$;
Step S2: form a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
Step S3: perform the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
Further, in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and the scalar $s$ is the uniform scaling amount.
Further, in the rotation transformation, the rotation perturbation formula is:
Figure BDA0002664693290000041
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and $\theta$ is the angle of the rotation transformation.
Further, in the similarity transformation, the similarity perturbation formula is:
Figure BDA0002664693290000042
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $\theta$ is the angle parameter of the rotation transformation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of rotation, translation and scaling perturbations.
The invention relates to a storage medium, which stores a computer program, wherein the computer program realizes the steps of the big data security and privacy protection method based on geometric deformation when being executed by a processor.
The invention has the beneficial effects that:
(1) The invention protects privacy by perturbing data with the methods used in computer vision to calculate geometric transformation relations between images. The data can be perturbed at the data source, the lowest layer of the system. In the analysis stage, the perturbed data causes cluster analysis to fail or to yield erroneous results, so the overall data security and privacy of the big data system are effectively protected, achieving security and privacy protection from local to whole.
(2) The invention establishes the same-sensitivity sets in a user-defined manner and applies a corresponding geometric transformation, such as translation, scaling or rotation, according to the sensitivity requirement. Processing by sensitivity class reduces the computational overhead while ensuring targeted privacy protection.
(3) Because geometric transformations are used, the corresponding inverse transformation can be applied as required to recover the data completely and without distortion, ensuring the authenticity of subsequent big data analysis.
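The recoverability described in (3) follows from each transformation being invertible once its recorded parameters are known. A brief sketch of the inverses is given here; the rotation matrix $R(\theta)$ is an assumption (the standard counterclockwise 2-D form), since the formula image is not reproduced in this text, and $s \neq 0$ is required for the scaling to be invertible.

```latex
\begin{aligned}
R(\theta) &= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
  && \text{(assumed form)} \\
\text{translation:}\quad X_{t-1} &= X_t - T \\
\text{scaling:}\quad X_{t-1} &= \tfrac{1}{s}\, X_t, \qquad s \neq 0 \\
\text{rotation:}\quad X_{t-1} &= R(\theta)^{-1} X_t = R(-\theta)\, X_t
\end{aligned}
```

Recovery is exact in exact arithmetic; with floating-point values, rounding error may appear in the last digits.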
In summary, the invention provides a simple, efficient and hierarchical data privacy protection method for the release and transmission of mass data. It distinguishes degrees of attribute sensitivity with a user-defined attribute sensitivity set and achieves targeted privacy protection efficiently by applying, per class, geometric deformation techniques from the field of digital images. Finally, the data can be restored through the inverse geometric deformation transformation, ensuring the authenticity of big data analysis. The invention also relies on techniques and algorithms that are relatively mature and easy to implement, and it is applicable to every stage of the big data life cycle that requires data privacy protection.
Drawings
FIG. 1 is a flow chart of a big data privacy protection method based on geometric deformation of the invention;
FIG. 2 is a data state before perturbation;
FIG. 3 is a data state after perturbation.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, specific embodiments of the present invention will now be described. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the present invention provides a big data security and privacy protection method based on geometric deformation, which includes the following steps:
establishing an attribute sensitivity set: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: screening out the data corresponding to all attributes in the same sensitivity set and placing them in different columns of the same matrix to form a same-sensitivity data set;
geometric deformation: applying a corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in subsequent inverse transformation;
obtaining a final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection processing; if the original data is subsequently needed for big data analysis, it is obtained by inverse transformation.
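The geometric deformation and merging steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `similarity`, `protect` and `PARAMS`, all parameter values, and the assignment of translation, scaling, rotation and similarity to the insensitive, low, medium and high sets in that order are hypothetical, as is the composition X_t = s R(θ) X_{t-1} + T with the standard rotation matrix. Each same-sensitivity set is assumed to hold two attributes per record.

```python
import math

def similarity(points, s=1.0, theta=0.0, t=(0.0, 0.0)):
    """Apply X_t = s * R(theta) * X_{t-1} + T to each 2-D point (assumed form)."""
    c, n = math.cos(theta), math.sin(theta)
    return [(s * (c * x - n * y) + t[0], s * (n * x + c * y) + t[1])
            for x, y in points]

# Hypothetical parameters per sensitivity level. Leaving a parameter at its
# default reduces the similarity transform to a pure translation, scaling
# or rotation.
PARAMS = {
    "insensitive": {"t": (5.0, -2.0)},                              # translation
    "low": {"s": 2.0},                                              # scaling
    "medium": {"theta": math.pi / 6},                               # rotation
    "high": {"s": 1.5, "theta": math.pi / 4, "t": (100.0, 7.0)},    # similarity
}

def protect(groups, params=PARAMS):
    """Transform each same-sensitivity group, then merge the rows into one data set.

    Returns the merged rows and the parameters, which must be recorded so the
    inverse transformation can be applied later.
    """
    transformed = {lvl: similarity(pts, **params[lvl]) for lvl, pts in groups.items()}
    n_rows = len(next(iter(transformed.values())))
    merged = [tuple(v for lvl in groups for v in transformed[lvl][i])
              for i in range(n_rows)]
    return merged, params

groups = {
    "insensitive": [(170.0, 65.0)],
    "low": [(30.0, 1.0)],
    "medium": [(1.0, 0.0)],
    "high": [(0.0, 0.0)],
}
merged, recorded = protect(groups)
```

Routing every level through one parameterised function keeps the recorded parameters in a single uniform shape, which is convenient when the inverse has to be applied later.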
In an embodiment of the present invention, in the step of establishing the attribute-sensitive set, the insensitive attribute may specifically include "height", "weight", and the like, and the highly sensitive attribute may specifically include "income", "identification number", and the like.
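The attribute sensitivity set and the data division step can be sketched as a lookup table plus a column selection. Only "height" and "weight" (insensitive) and "income" and "identification number" (highly sensitive) come from the text; the low and medium assignments, the key names, and the function `split_by_sensitivity` are hypothetical.

```python
# Attribute sensitivity set. The "low" and "medium" entries are assumptions
# made for illustration only.
SENSITIVITY = {
    "insensitive": ["height", "weight"],
    "low": ["age"],
    "medium": ["occupation_code"],
    "high": ["income", "id_number"],
}

def split_by_sensitivity(records, sensitivity=SENSITIVITY):
    """Return {level: matrix}; each matrix row is one record and each column
    is one attribute belonging to that sensitivity level."""
    return {
        level: [[rec[attr] for attr in attrs] for rec in records]
        for level, attrs in sensitivity.items()
    }

records = [
    {"height": 170, "weight": 65, "age": 30, "occupation_code": 1,
     "income": 5000, "id_number": 1001},
    {"height": 160, "weight": 50, "age": 25, "occupation_code": 3,
     "income": 7000, "id_number": 1002},
]
groups = split_by_sensitivity(records)
```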
In one embodiment of the present invention, in the data cleaning step, incomplete data entries are deleted and discrete data are serialized; for example, the "occupation" attribute is rewritten as the enumerated set occupation {student, teacher, doctor, nurse}, thereby obtaining the data to be divided.
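A minimal sketch of the cleaning step is shown here. It assumes that serialization means replacing each occupation by its position in the enumerated set, which the text does not state explicitly; the function name `clean` and the sample records are hypothetical.

```python
OCCUPATIONS = ["student", "teacher", "doctor", "nurse"]  # position defines the code

def clean(records, required, categories=OCCUPATIONS):
    """Drop incomplete records and replace 'occupation' by its index in categories."""
    cleaned = []
    for rec in records:
        # An entry is incomplete if any required attribute is missing or None.
        if any(rec.get(name) is None for name in required):
            continue
        row = dict(rec)  # copy so the input records are not modified
        row["occupation"] = categories.index(rec["occupation"])
        cleaned.append(row)
    return cleaned

raw = [
    {"age": 30, "occupation": "teacher", "income": 5000},
    {"age": 25, "occupation": "nurse", "income": None},   # incomplete, dropped
    {"age": 41, "occupation": "doctor", "income": 9000},
]
data = clean(raw, required=["age", "occupation", "income"])
```

An occupation outside the enumerated set raises `ValueError` in this sketch, which surfaces unexpected categories instead of silently encoding them.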
In one embodiment of the present invention, in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $t_x$ is the translation amount in the horizontal direction, and $t_y$ is the translation amount in the vertical direction.
Specifically, when age and income are perturbed with $T = [-3, 1000]$, the data states before and after the translation perturbation are as shown in FIG. 2 and FIG. 3, respectively.
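The age and income example can be worked through in a few lines. The sample (age, income) values and the function names are hypothetical; only T = [-3, 1000] comes from the text.

```python
def translate(points, t):
    """X_t = X_{t-1} + T for each (x, y) point."""
    tx, ty = t
    return [(x + tx, y + ty) for x, y in points]

def inverse_translate(points, t):
    """X_{t-1} = X_t - T, using the recorded translation T."""
    tx, ty = t
    return [(x - tx, y - ty) for x, y in points]

T = (-3, 1000)                      # from the text: age - 3, income + 1000
original = [(25, 4000), (40, 9000)]  # hypothetical (age, income) records
perturbed = translate(original, T)
restored = inverse_translate(perturbed, T)
```

With these inputs, the first record (25, 4000) becomes (22, 5000), and applying the inverse with the recorded T returns the original values.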
More specifically, the translation transformation comprises the following sub-steps:
Step S1: input a privacy attribute set $V$ and a noise set $T_{add}$; select two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a predetermined value; and select an additive noise term $e_j$ from the noise set $T_{add}$;
Step S2: form a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
Step S3: perform the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
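Sub-steps S1 to S3 can be read as the following sketch. The text leaves several details open, so the interpretation here is an assumption: V is stored as a list of attribute columns, each noise term e_j in T_add is a (t_x, t_y) pair indexed by j, and `transform` takes j and k as explicit arguments.

```python
def transform(V, T_add, j, k):
    """Translate the attribute pair (A_j, A_{j+k}) of V by the noise term e_j.

    V     : list of columns, one list of values per privacy attribute
    T_add : list of (tx, ty) noise terms (assumed layout)
    """
    # S1: select the attribute pair and the additive noise term.
    a_j, a_jk = V[j], V[j + k]
    tx, ty = T_add[j]
    # S2: form an n x 2 matrix whose rows are (A_j value, A_{j+k} value).
    matrix = list(zip(a_j, a_jk))
    # S3: apply the translation perturbation X_t = X_{t-1} + T row by row.
    moved = [(x + tx, y + ty) for x, y in matrix]
    out = [list(col) for col in V]        # copy so the input V is untouched
    out[j] = [row[0] for row in moved]
    out[j + k] = [row[1] for row in moved]
    return out

V = [[25, 40], [170, 160], [4000, 9000]]  # hypothetical age, height, income columns
T_add = [(-3, 1000)]
V_new = transform(V, T_add, j=0, k=2)
```

Here the pair (age, income) is perturbed while the height column in between is left unchanged.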
In one embodiment of the present invention, in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and the scalar $s$ is the uniform scaling amount.
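A short sketch of the scaling perturbation and its inverse is given here. The function names and sample values are hypothetical, and the check that s is non-zero is an added safeguard rather than something the text specifies.

```python
def scale(points, s):
    """X_t = s * X_{t-1} for each (x, y) point."""
    return [(s * x, s * y) for x, y in points]

def inverse_scale(points, s):
    """X_{t-1} = X_t / s; only defined for a non-zero scaling amount."""
    if s == 0:
        raise ValueError("s must be non-zero for the scaling to be invertible")
    return [(x / s, y / s) for x, y in points]

original = [(30.0, 5000.0), (41.0, 9000.0)]  # hypothetical (age, income) records
s = 0.5
perturbed = scale(original, s)
restored = inverse_scale(perturbed, s)
```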
In one embodiment of the present invention, in the rotation transformation, the rotation perturbation formula is:
Figure BDA0002664693290000071
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and $\theta$ is the angle of the rotation transformation.
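Because the rotation formula survives only as an image reference, the sketch below assumes the standard counterclockwise 2-D rotation matrix, with rows (cos θ, -sin θ) and (sin θ, cos θ); the patent's own sign convention may differ. Function names and sample values are hypothetical.

```python
import math

def rotate(points, theta):
    """Rotate each (x, y) point counterclockwise by theta radians (assumed form)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def inverse_rotate(points, theta):
    """Undo the rotation by rotating through -theta."""
    return rotate(points, -theta)

original = [(1.0, 0.0), (30.0, 5000.0)]
theta = math.pi / 2
perturbed = rotate(original, theta)
restored = inverse_rotate(perturbed, theta)
```

With floating-point arithmetic the restored values match the originals up to rounding error, so comparisons should use a tolerance rather than exact equality.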
In one embodiment of the present invention, in the similarity transformation, the similarity perturbation formula is:
Figure BDA0002664693290000081
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $\theta$ is the angle parameter of the rotation transformation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of rotation, translation and scaling perturbations.
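The similarity formula is likewise available only as an image reference. The sketch below assumes the common composition X_t = s R(θ) X_{t-1} + T, with R(θ) the standard counterclockwise rotation matrix; the order of composition, the function names and the sample parameters are assumptions.

```python
import math

def similarity(points, s, theta, t):
    """Apply X_t = s * R(theta) * X_{t-1} + T to each (x, y) point (assumed form)."""
    c, n = math.cos(theta), math.sin(theta)
    tx, ty = t
    return [(s * (c * x - n * y) + tx, s * (n * x + c * y) + ty)
            for x, y in points]

def inverse_similarity(points, s, theta, t):
    """Recover X_{t-1} = R(-theta) * (X_t - T) / s from the recorded parameters."""
    if s == 0:
        raise ValueError("s must be non-zero for the transform to be invertible")
    c, n = math.cos(theta), math.sin(theta)
    tx, ty = t
    restored = []
    for x, y in points:
        u, v = (x - tx) / s, (y - ty) / s               # undo translation, then scaling
        restored.append((c * u + n * v, -n * u + c * v))  # rotate by -theta
    return restored

params = {"s": 2.0, "theta": math.pi / 3, "t": (-3.0, 1000.0)}  # recorded parameters
original = [(30.0, 5000.0), (41.0, 9000.0)]
perturbed = similarity(original, **params)
restored = inverse_similarity(perturbed, **params)
```

Storing the parameters as one record mirrors the requirement to record transformation parameters for the later inverse transformation.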
The invention also provides a storage medium storing a computer program, which when executed by a processor implements the steps of the above-mentioned big data security and privacy protection method based on geometric deformation.
The foregoing describes preferred embodiments of the invention. It is to be understood that the invention is not limited to the precise forms disclosed herein, and that it may be used in various other combinations, modifications and environments within the scope of the concept described herein, whether as taught above or as apparent to those skilled in the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention fall within the protection scope of the appended claims.

Claims (7)

1. A big data security and privacy protection method based on geometric deformation, characterized by comprising the following steps:
establishing an attribute sensitivity set: dividing the attributes of the data into four sets representing four degrees of sensitivity, namely insensitive, low-sensitivity, medium-sensitivity and high-sensitivity attributes;
cleaning data: deleting incomplete data entries and serializing discrete data to obtain the data to be divided;
dividing data: screening out the data corresponding to all attributes in the same sensitivity set and placing them in different columns of the same matrix to form a same-sensitivity data set;
geometric deformation: applying a corresponding translation, scaling, rotation or similarity transformation to the insensitive, low-sensitivity, medium-sensitivity and high-sensitivity same-sensitivity data sets respectively, and recording the transformation parameters for use in subsequent inverse transformation;
obtaining a final data set: combining the four processed data sets into one data set, which is the data protected by the geometric-deformation-based big data privacy protection processing; if the original data is subsequently needed for big data analysis, it is obtained by inverse transformation.
2. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the translation transformation, the translation perturbation formula is:
$$X_t = X_{t-1} + T, \quad T = [t_x, t_y]$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $t_x$ is the translation amount in the horizontal direction, and $t_y$ is the translation amount in the vertical direction.
3. The big data security and privacy protection method based on geometric deformation according to claim 2, wherein the translation transformation comprises the following sub-steps:
Step S1: input a privacy attribute set $V$ and a noise set $T_{add}$; select two privacy attributes $A_j$ and $A_{j+k}$ from the privacy attribute set $V$, where $k$ is a predetermined value; and select an additive noise term $e_j$ from the noise set $T_{add}$;
Step S2: form a matrix from the selected privacy attribute pair $A_j$ and $A_{j+k}$ and the additive noise term $e_j$;
Step S3: perform the geometric deformation calculation according to the translation perturbation formula: $V \leftarrow \mathrm{transform}(V, T_{add})$.
4. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the scaling transformation, the scaling perturbation formula is:
$$X_t = s X_{t-1}$$
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and the scalar $s$ is the uniform scaling amount.
5. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the rotation transformation, the rotation perturbation formula is:
Figure FDA0002664693280000021
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, and $\theta$ is the angle of the rotation transformation.
6. The big data security and privacy protection method based on geometric deformation according to claim 1, wherein in the similarity transformation, the similarity perturbation formula is:
Figure FDA0002664693280000022
where $X_t = [x_t, y_t]^T$ denotes the coordinates of the data corresponding to each attribute, $X_{t-1}$ denotes the coordinates of the data before transformation, $\theta$ is the angle parameter of the rotation transformation, $t_x$ is the translation parameter in the horizontal direction, $t_y$ is the translation parameter in the vertical direction, and the scalar $s$ is the uniform scaling parameter; the similarity perturbation is a combination of rotation, translation and scaling perturbations.
7. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of a geometric-deformation-based big data security and privacy protection method according to any one of claims 1 to 6.
CN202010914945.XA 2020-09-03 2020-09-03 Big data security and privacy protection method based on geometric deformation and storage medium Pending CN112231745A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010914945.XA CN112231745A (en) 2020-09-03 2020-09-03 Big data security and privacy protection method based on geometric deformation and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010914945.XA CN112231745A (en) 2020-09-03 2020-09-03 Big data security and privacy protection method based on geometric deformation and storage medium

Publications (1)

Publication Number Publication Date
CN112231745A true CN112231745A (en) 2021-01-15

Family

ID=74117016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010914945.XA Pending CN112231745A (en) 2020-09-03 2020-09-03 Big data security and privacy protection method based on geometric deformation and storage medium

Country Status (1)

Country Link
CN (1) CN112231745A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883389A (en) * 2021-02-09 2021-06-01 上海凯馨信息科技有限公司 Reversible desensitization algorithm supporting feature preservation
CN113160348A (en) * 2021-05-20 2021-07-23 深圳文达智通技术有限公司 Recoverable face image privacy protection method, device, equipment and storage medium
CN114491650A (en) * 2022-04-13 2022-05-13 武汉光谷信息技术股份有限公司 Geographical space information desensitization encryption method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160085982A1 (en) * 2010-07-29 2016-03-24 Oracle International Corporation System and method for real-time transactional data obfuscation
CN110020546A (en) * 2019-01-07 2019-07-16 南京邮电大学 A kind of private data cascade protection method
CN110134719A (en) * 2019-05-17 2019-08-16 贵州大学 A kind of identification of structural data Sensitive Attributes and stage division of classifying
CN111008368A (en) * 2019-11-26 2020-04-14 武汉大学 Grid-based collaborative design safety product data exchange method and system
CN111199048A (en) * 2020-01-02 2020-05-26 航天信息股份有限公司 Big data grading desensitization method and system based on container with life cycle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
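许杰 et al.: "基于几何变形的大数据安全隐私保护方法" [Big data security and privacy protection method based on geometric deformation], 《通信技术》 [Communications Technology] *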

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883389A (en) * 2021-02-09 2021-06-01 上海凯馨信息科技有限公司 Reversible desensitization algorithm supporting feature preservation
CN113160348A (en) * 2021-05-20 2021-07-23 深圳文达智通技术有限公司 Recoverable face image privacy protection method, device, equipment and storage medium
CN114491650A (en) * 2022-04-13 2022-05-13 武汉光谷信息技术股份有限公司 Geographical space information desensitization encryption method and system
CN114491650B (en) * 2022-04-13 2022-07-01 武汉光谷信息技术股份有限公司 Method and system for desensitizing encryption of geographic spatial information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210115