CN117313135B - Efficient reconfiguration personal privacy protection method based on attribute division - Google Patents

Efficient reconfiguration personal privacy protection method based on attribute division

Info

Publication number
CN117313135B
Authority
CN
China
Prior art keywords
attribute
inference
constraint
servers
privacy protection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310965523.9A
Other languages
Chinese (zh)
Other versions
CN117313135A (en)
Inventor
罗凯伦
李睿
阮华锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Original Assignee
Dongguan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan University of Technology filed Critical Dongguan University of Technology
Priority to CN202310965523.9A priority Critical patent/CN117313135B/en
Publication of CN117313135A publication Critical patent/CN117313135A/en
Application granted granted Critical
Publication of CN117313135B publication Critical patent/CN117313135B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/041 Abduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/04 Constraint-based CAD

Abstract

The invention relates to an efficiently reconstructable personal privacy protection method based on attribute division, comprising an inference set computation step and an attribute segmentation step. The inference set computation step comprises: computing inference sets for an input data set with a machine learning method, recursively feeding the inference set obtained in each round back into the machine learning algorithm as a new sensitive attribute set, until the generated inference set tree converges or reaches a set number of layers. The attribute segmentation step comprises: determining the number of servers that are allowed to leak and the total number of servers, converting the privacy settings into corresponding constraint expressions, and inputting the constraint expressions into a SAT solver, which yields the final attribute segmentation result. Through the inference set computation algorithm and the attribute segmentation algorithm, the invention achieves privacy protection and segmentation of sensitive attributes and improves the security of data privacy.

Description

Efficient reconfiguration personal privacy protection method based on attribute division
Technical Field
The invention relates to the technical field of network security, in particular to an efficiently reconstructable personal privacy protection method based on attribute division.
Background
Current research in vertical data partitioning is mainly focused on partitioning a data set into specific subsets according to given privacy constraints and meeting proposed privacy settings; these studies propose privacy settings based on a hypothetical threat model and satisfy given privacy settings by means of attribute segmentation computation methods, where privacy constraints are typically expressed in terms of sets of attributes.
Accordingly, in solving the privacy protection problem of vertical data partitioning, the prior art has the following problems:
1. Difficulty handling multiple servers: the prior art has studied the case involving only two servers, whereas practical applications may need to handle more. With multiple servers, data segmentation and privacy protection become more complex, and collaboration and communication between servers must be considered; the prior art gives no explicit solution for this.
2. Need for an additional trusted server: some prior art solutions rely on additional trusted servers to perform storage and query tasks in order to protect data privacy. However, introducing trusted servers increases the latency and cost of network queries, which may be impractical for large-scale data storage and query scenarios.
3. Dependence on data encryption or on the computing power of a trusted server: some prior art relies on data encryption or on the computing power of trusted servers for data storage and querying. This may limit data reconstruction and query flexibility, especially in scenarios that require frequent queries or data operations.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, provides a personal privacy protection method based on attribute division and capable of being efficiently reconstructed, and solves the problems of the prior method.
The aim of the invention is achieved by the following technical scheme: an efficiently reconstructable personal privacy protection method based on attribute division, comprising an inference set computation step and an attribute segmentation step;
the inference set computation step comprises: computing inference sets for an input data set with a machine learning method, recursively feeding the inference set obtained in each round back into the machine learning algorithm as a new sensitive attribute set, until the generated inference set tree converges or reaches a set number of layers;
the attribute segmentation step comprises: determining the number of servers that are allowed to leak and the total number of servers, converting the privacy settings into corresponding constraint expressions, inputting the constraint expressions into a SAT solver, and obtaining the final attribute segmentation result from the SAT solver.
The step of computing the inference set specifically comprises the following steps:
initializing an inference set tree T, traversing all subsets X of the attribute set, and computing the inference set Y = L(X) of each attribute subset X through a machine learning algorithm L;
taking the inference set Y as the new sensitive attribute set S, namely S = Y, and judging whether the ending condition is met;
ending if the generated inference set Y is empty or the depth of the inference set tree T reaches the set number of layers h; otherwise adding the inference set Y to the inference set tree T as a child node;
repeating the above steps with the next subset of the attribute set until all subsets of the attribute set have been traversed.
Before the inference set tree is initialized, a data set and a sensitive attribute set S = {S1, S2, ..., Sm} must also be input, where each Si is a sensitive attribute and the data set contains the attribute set D = {A1, A2, ..., An}.
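The recursive construction of the inference set tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the learner L is replaced by a hypothetical oracle `infers(subset, target)` backed by a hand-written `RULES` table, only minimal inferring subsets are kept, and the attribute names are invented for the example.

```python
from itertools import combinations

# Hypothetical stand-in for the learner L: target set -> subsets assumed to infer it.
RULES = {
    frozenset({"name"}): [frozenset({"zip", "birthdate"})],
    frozenset({"zip", "birthdate"}): [frozenset({"employer", "age"})],
}

def infers(subset, target):
    """Return True if the (assumed) learner can infer `target` from `subset`."""
    return any(source <= subset for source in RULES.get(target, []))

def build_inference_tree(attributes, sensitive, infers, max_depth):
    """Build the inference set tree rooted at the sensitive attribute set.

    Each child is an attribute subset that infers its parent; it then becomes
    the new sensitive set for the next layer, up to `max_depth` layers (h).
    """
    node = {"target": frozenset(sensitive), "children": []}
    if max_depth == 0:
        return node
    candidates = [a for a in attributes if a not in node["target"]]
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            subset = frozenset(combo)
            # Keep only minimal inference sets: skip supersets of one already found.
            if any(child["target"] <= subset for child in node["children"]):
                continue
            if infers(subset, node["target"]):
                node["children"].append(
                    build_inference_tree(attributes, subset, infers, max_depth - 1)
                )
    return node

ATTRS = ["name", "zip", "birthdate", "employer", "age"]
tree = build_inference_tree(ATTRS, {"name"}, infers, max_depth=2)
```

Enumerating every subset is exponential in the number of attributes, so the layer limit h and the minimality pruning matter in practice; a real learner would replace the rule table.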
The attribute segmentation step specifically comprises the following steps:
determining the parameters in the privacy settings: determining the maximum number k of servers that are allowed to leak and the total number n of servers;
constructing constraint expressions: converting the privacy settings into constraints of a SAT problem;
inputting the constructed constraint expressions into a SAT solver, which solves them to obtain an attribute segmentation scheme that satisfies the constraints;
and outputting the attribute segmentation scheme obtained by the SAT solver, and distributing each attribute to its corresponding server to realize personal privacy protection.
The constraints include:
constraint one: each attribute must be assigned to a certain server;
constraint two: negating every case that the privacy settings do not allow to occur;
constraint three: additional constraints set by the user.
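To make the first two constraints concrete, the following sketch checks a candidate assignment of attributes to servers. It assumes one reading of constraint two that the text does not spell out: a sensitive set or inference set (called a "forbidden set" here) is exposed when all of its attributes sit on at most k servers, because an attacker who compromises those k servers can recombine them. The function name and sample data are hypothetical.

```python
def partition_is_safe(assignment, attributes, forbidden_sets, k):
    """Check a candidate partition against constraints one and two.

    assignment:     dict mapping attribute -> server index
    forbidden_sets: attribute sets that must never be recoverable from k servers
    k:              number of servers that are allowed to leak
    """
    # Constraint one: every attribute is stored on some server.
    if any(a not in assignment for a in attributes):
        return False
    # Constraint two (assumed reading): each forbidden set must be spread
    # over more than k distinct servers.
    for forbidden in forbidden_sets:
        servers_needed = {assignment[a] for a in forbidden}
        if len(servers_needed) <= k:
            return False
    return True

attributes = ["name", "zip", "birthdate", "disease"]
forbidden_sets = [{"name", "disease"}, {"zip", "birthdate", "disease"}]

good = {"name": 0, "zip": 0, "birthdate": 1, "disease": 1}
bad = {"name": 0, "zip": 1, "birthdate": 1, "disease": 0}
```

With k = 1, `good` passes because each forbidden set spans two servers, while `bad` fails because `name` and `disease` share server 0.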
The personal privacy protection method further includes: setting a sensitive attribute set, an inference set and a security model before performing the step of computing the inference set;
setting the sensitive attribute set includes: let D be the set of attributes in the data; the sensitive attribute set is S ⊆ D;
setting the inference set includes: let D be the set of attributes in the data, U ⊆ D an attribute subset, and L a machine learning algorithm; if P(L(U) = S) > c, then U is an inference set of S in D, where P(L(U) = S) is the inference rate from U to S and c is the privacy protection strength;
setting the security model includes: let D be the set of attributes in the data, with sensitive attribute set S; D is distributed to servers {X_i | i ∈ {1, …, n}}, and k servers are selected and their attributes combined […], with k < n.
The invention has the following advantages: through the inference set computation algorithm and the attribute segmentation algorithm, the efficiently reconstructable personal privacy protection method based on attribute division achieves privacy protection and segmentation of sensitive attributes and improves the security of data privacy. By using constraint expressions and a SAT solver, it converts the attribute segmentation problem into the widely studied SAT problem, which provides an effective solution approach and yields a better attribute segmentation scheme. Owing to efficient reconstruction, the average time consumed by query operations is reduced by 20% compared with the conventional method.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Accordingly, the following detailed description of the embodiments of the present application, provided in connection with the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application. The invention is further described below with reference to the accompanying drawings.
The invention relates to an efficiently reconstructable personal privacy protection method based on attribute division, which comprises an inference set computation algorithm and an attribute segmentation algorithm. The inference set computation algorithm uses a machine learning method to compute inference sets from an input data set; the attribute segmentation algorithm converts the attribute segmentation problem into a Boolean satisfiability (SAT) problem. First, the number of servers that are allowed to leak and the total number of servers are determined, and the privacy definitions are converted into corresponding constraint expressions. These constraint expressions are then input into a SAT solver, which yields the final attribute segmentation scheme.
As shown in fig. 1, the following are specifically included:
the following settings are first made:
setting the sensitive attribute set includes: let D be the set of attributes in the data; the sensitive attribute set is S ⊆ D;
setting the inference set includes: let D be the set of attributes in the data, U ⊆ D an attribute subset, and L a machine learning algorithm; if P(L(U) = S) > c, then U is an inference set of S in D, where P(L(U) = S) is the inference rate from U to S and c is the privacy protection strength;
setting the security model includes: let D be the set of attributes in the data, with sensitive attribute set S; D is distributed to servers {X_i | i ∈ {1, …, n}}, and k servers are selected and their attributes combined […], with k < n.
Then the data set and the sensitive attribute set are input: the input data set contains the attribute set D = {A1, A2, ..., An}, and the input sensitive attribute set is S = {S1, S2, ..., Sm}, where each Si is a sensitive attribute.
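The inference rate P(L(U) = S) in the settings above can be illustrated with a small sketch. The patent does not name a specific learner, so this example assumes a majority-vote lookup table as L and measures accuracy on the same toy rows it was fitted on; the rows, attribute names and threshold value are all hypothetical.

```python
from collections import Counter, defaultdict

def inference_rate(rows, subset, target):
    """Estimate P(L(U) = S) using a majority-vote lookup table as the learner L."""
    subset, target = sorted(subset), sorted(target)
    votes = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in subset)
        votes[key][tuple(row[a] for a in target)] += 1
    # For each observed value of U, the majority label is predicted, so the
    # number of correct predictions is the count of the most common label.
    hits = sum(counter.most_common(1)[0][1] for counter in votes.values())
    return hits / len(rows)

def is_inference_set(rows, subset, target, c):
    """U is treated as an inference set of S when the rate exceeds the threshold."""
    return inference_rate(rows, subset, target) > c

rows = [
    {"zip": "523000", "age": 30, "disease": "flu"},
    {"zip": "523000", "age": 41, "disease": "flu"},
    {"zip": "523808", "age": 30, "disease": "asthma"},
    {"zip": "523808", "age": 52, "disease": "asthma"},
]
```

On these rows `zip` predicts `disease` perfectly, while `age` reaches 0.75, so with a threshold of 0.9 only `{"zip"}` counts as an inference set. Scoring on the fitting data overestimates the rate; a held-out split would be the more careful choice.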
S101: adopting a machine learning method, and carrying out inference set calculation through an input data set;
the method specifically comprises the following steps:
initializing an inference set tree T, traversing all subsets X of the attribute set, and computing the inference set Y = L(X) of each attribute subset X through a machine learning algorithm L, wherein X is a subset of D;
taking the inference set Y as the new sensitive attribute set S, namely S = Y, and judging whether the ending condition is met;
ending if the generated inference set Y is empty or the depth of the inference set tree T reaches the set number of layers h; otherwise adding the inference set Y to the inference set tree T as a child node;
repeating the above steps with the next subset of the attribute set until all subsets of the attribute set have been traversed.
S102: based on the inference set, converting the privacy security definition into constraint expressions, inputting the constraint expressions into a SAT solver to compute an attribute division scheme, and returning the result to the user;
the method specifically comprises the following steps:
determining the maximum number k of servers that are allowed to leak and the total number n of servers;
constructing constraint expressions that convert the privacy settings into constraints of a SAT problem, wherein the specific constraints comprise the following:
constraint one: each attribute must be assigned to a certain server, expressed as a formula over the following symbols. V_pi is a propositional atom stating that attribute p is stored on server i. The symbol ¬ is logical NOT; ¬V_pi negates V_pi, i.e. attribute p is not stored on server i. The symbol ∨ is logical OR; V_pi ∨ V_pj means that V_pi or V_pj holds. The symbol ∧ is logical AND; V_pi ∧ V_pj means that V_pi and V_pj hold simultaneously. a is the number of attributes, so p ranges from 1 to a.
constraint two: negating every case that the privacy definitions do not allow to occur;
constraint three: additional user-defined constraints.
Inputting the constructed constraint expression into a SAT solver, and obtaining an attribute segmentation scheme meeting the constraint by the SAT solver through solving the constraint expression;
and outputting an attribute segmentation scheme obtained by the SAT solver, and distributing each attribute to a corresponding server to realize personal privacy protection.
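The segmentation step can be sketched end to end as below. Several things are assumptions rather than the patent's own formulas: constraint one is encoded as "exactly one server per attribute", constraint two as "for every choice of k leaked servers, some attribute of each forbidden set lies outside them", and constraint three is represented by a hypothetical `pinned` mapping. A small DPLL routine stands in for a production SAT solver.

```python
from itertools import combinations

def dpll(clauses, assignment=None):
    """Tiny DPLL SAT solver; literals are signed ints. Returns a model dict or None."""
    assignment = dict(assignment or {})
    while True:  # unit propagation
        simplified, unit = [], None
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            rest = [l for l in clause if abs(l) not in assignment]
            if not rest:
                return None  # every literal is false: conflict
            if len(rest) == 1 and unit is None:
                unit = rest[0]
            simplified.append(rest)
        clauses = simplified
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
    if not clauses:
        return assignment
    var = abs(clauses[0][0])
    for value in (True, False):  # branch on an unassigned variable
        model = dpll(clauses, {**assignment, var: value})
        if model is not None:
            return model
    return None

def solve_partition(attributes, forbidden_sets, n, k, pinned=None):
    """Encode the constraints as CNF and return {attribute: server} or None."""
    var = {(p, i): idx * n + i + 1
           for idx, p in enumerate(attributes) for i in range(n)}
    clauses = []
    for p in attributes:
        # Constraint one: at least one server, and (assumption) at most one.
        clauses.append([var[p, i] for i in range(n)])
        clauses += [[-var[p, i], -var[p, j]] for i, j in combinations(range(n), 2)]
    for forbidden in forbidden_sets:
        # Constraint two: no k leaked servers may hold a whole forbidden set.
        for leaked in combinations(range(n), k):
            outside = [i for i in range(n) if i not in leaked]
            clauses.append([var[p, i] for p in sorted(forbidden) for i in outside])
    # Constraint three (hypothetical example): pin attributes to chosen servers.
    clauses += [[var[p, i]] for p, i in (pinned or {}).items()]
    model = dpll(clauses)
    if model is None:
        return None
    return {p: next(i for i in range(n) if model.get(var[p, i])) for p in attributes}

attributes = ["name", "zip", "birthdate", "disease"]
forbidden_sets = [{"name", "disease"}, {"zip", "birthdate", "disease"}]
partition = solve_partition(attributes, forbidden_sets, n=3, k=1)
```

The returned `partition` maps each attribute to one server index. A single-attribute forbidden set such as `{"name"}` makes the instance unsatisfiable, and `solve_partition` then returns `None`.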
The foregoing is merely a preferred embodiment of the invention. It is to be understood that the invention is not limited to the form disclosed herein, which is not to be construed as excluding other embodiments; the invention may be used in various other combinations, modifications and environments, and may be modified within the scope of the inventive concept described herein, through the above teachings or through the skill or knowledge of the relevant art. Modifications and variations that do not depart from the spirit and scope of the invention are intended to fall within the protection scope of the appended claims.

Claims (4)

1. An efficient reconfigurable personal privacy protection method based on attribute division, characterized in that: the personal privacy protection method comprises an inference set computation step and an attribute segmentation step;
the inference set computation step comprises: computing inference sets for an input data set with a machine learning method, recursively feeding the inference set obtained in each round back into the machine learning algorithm as a new sensitive attribute set, until the generated inference set tree converges or reaches a set number of layers;
the attribute segmentation step comprises: determining the number of servers that are allowed to leak and the total number of servers, converting the privacy settings into corresponding constraint expressions, inputting the constraint expressions into a SAT solver, and obtaining the final attribute segmentation result from the SAT solver;
the step of computing the inference set specifically comprises the following steps:
initializing an inference set tree T, traversing all subsets X of the attribute set, and computing the inference set Y = L(X) of each attribute subset X through a machine learning algorithm L;
taking the inference set Y as the new sensitive attribute set S, namely S = Y, and judging whether the ending condition is met;
ending if the generated inference set Y is empty or the depth of the inference set tree T reaches the set number of layers h; otherwise adding the inference set Y to the inference set tree T as a child node;
repeating the above steps with the next subset of the attribute set until all subsets of the attribute set have been traversed;
the attribute segmentation step specifically comprises the following steps:
determining the parameters in the privacy settings: determining the maximum number k of servers that are allowed to leak and the total number n of servers;
constructing constraint expressions: converting the privacy settings into constraints of a SAT problem;
inputting the constructed constraint expressions into a SAT solver, which solves them to obtain an attribute segmentation scheme that satisfies the constraints;
and outputting the attribute segmentation scheme obtained by the SAT solver, and distributing each attribute to its corresponding server to realize personal privacy protection.
2. The efficient reconfigurable personal privacy protection method based on attribute partitioning of claim 1, wherein: before the inference set tree is initialized, a data set and a sensitive attribute set S = {S1, S2, ..., Sm} must also be input, where each Si is a sensitive attribute and the data set contains the attribute set D = {A1, A2, ..., An}.
3. The efficient reconfigurable personal privacy protection method based on attribute partitioning of claim 1, wherein: the constraints include:
constraint one: each attribute must be assigned to a certain server;
constraint two: negating every case that the privacy settings do not allow to occur;
constraint three: additional constraints set by the user.
4. The efficient reconfigurable personal privacy protection method based on attribute partitioning of any one of claims 1-3, wherein: the personal privacy protection method further includes: setting a sensitive attribute set, an inference set and a security model before performing the step of computing the inference set;
setting the sensitive attribute set includes: let D be the set of attributes in the data; the sensitive attribute set is S ⊆ D;
setting the inference set includes: let D be the set of attributes in the data, U ⊆ D an attribute subset, and L a machine learning algorithm; if P(L(U) = S) > c, then U is an inference set of S in D, where P(L(U) = S) is the inference rate from U to S and c is the privacy protection strength;
setting the security model includes: let D be the set of attributes in the data, with sensitive attribute set S; D is distributed to servers {X_i | i ∈ {1, …, n}}, and k servers are selected and their attributes combined […], with k < n.
CN202310965523.9A 2023-08-02 2023-08-02 Efficient reconfiguration personal privacy protection method based on attribute division Active CN117313135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310965523.9A CN117313135B (en) 2023-08-02 2023-08-02 Efficient reconfiguration personal privacy protection method based on attribute division

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310965523.9A CN117313135B (en) 2023-08-02 2023-08-02 Efficient reconfiguration personal privacy protection method based on attribute division

Publications (2)

Publication Number Publication Date
CN117313135A CN117313135A (en) 2023-12-29
CN117313135B true CN117313135B (en) 2024-04-16

Family

ID=89245240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310965523.9A Active CN117313135B (en) 2023-08-02 2023-08-02 Efficient reconfiguration personal privacy protection method based on attribute division

Country Status (1)

Country Link
CN (1) CN117313135B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106940777A (en) * 2017-02-16 2017-07-11 湖南宸瀚信息科技有限责任公司 A kind of identity information method for secret protection measured based on sensitive information
CN110866277A (en) * 2019-11-13 2020-03-06 电子科技大学广东电子信息工程研究院 Privacy protection method for data integration of DaaS application
CN112668044A (en) * 2020-12-21 2021-04-16 中国科学院信息工程研究所 Privacy protection method and device for federal learning
CN112836009A (en) * 2021-02-19 2021-05-25 东莞理工学院 Thesis duplicate checking method and system supporting privacy protection
CN114218602A (en) * 2021-12-10 2022-03-22 南京航空航天大学 Differential privacy heterogeneous multi-attribute data publishing method based on vertical segmentation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361846A (en) * 2021-11-12 2023-06-30 香港科技大学 Method and server for defending a service against personal privacy reasoning attacks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
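Privacy protection for information publishing based on KD-tree; 林国滨; 姚志强; 熊金波; 林铭炜; Computer Systems & Applications (计算机系统应用); 2017-08-15 (No. 08); full text *
Research on a privacy protection mechanism for data integration in DaaS applications; 周志刚; 张宏莉; 余翔湛; 李攀攀; Journal on Communications (通信学报); 2016-04-25 (No. 04); full text *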

Also Published As

Publication number Publication date
CN117313135A (en) 2023-12-29

Similar Documents

Publication Publication Date Title
Langari et al. Combined fuzzy clustering and firefly algorithm for privacy preserving in social networks
Morone et al. Influence maximization in complex networks through optimal percolation
Zhang et al. A MapReduce based approach of scalable multidimensional anonymization for big data privacy preservation on cloud
CN106611037A (en) Method and device for distributed diagram calculation
Weiss et al. A primal/dual representation for discrete Morse complexes on tetrahedral meshes
Ma et al. RuleSN: Research and application of social network access control model
Wang et al. On the fractality of complex networks: Covering problem, algorithms and ahlfors regularity
Malik et al. Concurrence percolation threshold of large-scale quantum networks
Park et al. On the power of gradual network alignment using dual-perception similarities
Chen et al. Privacy-preserving hierarchical federated recommendation systems
CN117313135B (en) Efficient reconfiguration personal privacy protection method based on attribute division
Bi et al. Outsourced and privacy-preserving collaborative K-prototype clustering for mixed data via additive secret sharing
Elmisery et al. Multi-agent based middleware for protecting privacy in IPTV content recommender services
Lee et al. SearchaStore: Fast and secure searchable cloud services
Li et al. A method for improving the accuracy of link prediction algorithms
Anuradha et al. Mining generalized positive and negative inter-cross fuzzy multiple-level coherent rules
Xu et al. Graph encryption for all‐path queries
He et al. An efficient multi-keyword search scheme over encrypted data in multi-cloud environment
Huang et al. Research on optimization of real-time efficient storage algorithm in data information serialization
CN108804788B (en) Web service evolution method based on data cell model
Zhao et al. Solving Boolean polynomial systems by parallelizing characteristic set method for cyber‐physical systems
Dong et al. FLEXBNN: fast private binary neural network inference with flexible bit-width
Ji et al. An improved random walk based community detection algorithm
Fu et al. Privacy preserving social network against dopv attacks
Retima et al. A quality-aware context information selection based fuzzy logic in IoT environment.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant