CN113177166B - Personalized position semantic publishing method and system based on differential privacy - Google Patents
Personalized position semantic publishing method and system based on differential privacy
- Publication number
- CN113177166B CN202110449465.5A
- Authority
- CN
- China
- Prior art keywords
- semantic
- noise
- sensitivity
- sem
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2111—Location-sensitive, e.g. geographical location, GPS
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Remote Sensing (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a personalized position semantic publishing method and system based on differential privacy, belonging to the field of data mining and privacy protection. First, a semantic privacy protection level with parameter l is set according to the user's privacy protection requirement, and the l-1 position semantics closest to the semantic to be protected are calculated. Then, the semantic sensitivity of each position is calculated from the number of times the user has accessed each semantic, and the publication probabilities of the l semantics are obtained from these sensitivities. Finally, a Laplace variable satisfying a specific probability is generated from a Gaussian variable and an exponentially distributed variable with specific parameters; this Laplace variable gives the published user position satisfying the specific semantic sensitivity. The invention solves the problem that existing differential privacy position publishing methods do not protect position semantics.
Description
Technical Field
The invention belongs to the field of data mining and privacy protection, and relates to a personalized position semantic publishing method and system based on differential privacy.
Background
With the wide use of mobile terminal devices such as mobile phones and the rapid development of wireless communication technologies, Location-Based Services (LBS) are used more and more frequently. Through positioning technology, an LBS can provide users with services such as location check-in, nearby store search, and information push. The location service process generates a large amount of spatial location data, and in order to push relevant content according to user preferences, LBS providers upload, publish, and share the collected user location data. However, the shared location data may involve sensitive information about the user, and data owners may not want to share their own location data directly.
Existing location privacy protection methods fall into three main types: methods based on spatial anonymity, on encryption, and on perturbation. Spatial anonymity hides the user's location: an anonymity parameter level is set, and the user's original value is confused with anonymized values to protect location privacy. However, the anonymity parameter level is difficult to set, and the availability of the anonymized data is low. Encryption-based methods generally use symmetric and asymmetric encryption algorithms to encrypt location data and hide its true value, but they are often complex and consume large amounts of communication resources. Among perturbation-based methods, differential privacy is the representative approach; because it rests on a rigorous mathematical model and places no limit on the attacker's background knowledge, it has become the most important method for protecting location privacy.
At present, location differential privacy protection generally uses the Laplace noise mechanism to perturb the longitude and latitude of the original location within a small range, which protects the exact coordinates while keeping data availability high. However, location semantics, an important component of location information, often contain sensitive information about the user (e.g., a home address or a check-in place). Existing location differential privacy methods protect only the longitude and latitude of a location and not its semantics, so an attacker can obtain the user's location semantic information through semantic inference. How to protect both the spatial location data and the location semantics of the user when publishing the user's location is an urgent problem to be solved.
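For reference, the conventional coordinate-only perturbation described above can be sketched as follows. This is a minimal illustration and not part of the invention: the helper names `sample_laplace` and `perturb_location` and the parameters `sensitivity` and `epsilon` are assumptions introduced here, using the usual Laplace-mechanism scale of sensitivity divided by epsilon.

```python
import math
import random

def sample_laplace(scale, rng):
    """Draw one Laplace(0, scale) sample by inverting the CDF."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # Clamp the log argument away from 0 for the edge case u == -0.5.
    tail = max(1.0 - 2.0 * abs(u), 1e-300)
    return -scale * math.copysign(1.0, u) * math.log(tail)

def perturb_location(lng, lat, sensitivity, epsilon, rng=None):
    """Add independent Laplace noise to longitude and latitude (hypothetical helper)."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # assumed Laplace-mechanism scale
    return lng + sample_laplace(scale, rng), lat + sample_laplace(scale, rng)
```

A perturbation of this kind changes only the coordinates; the semantic label attached to the location is untouched, which is the gap the invention addresses.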
Disclosure of Invention
In view of the above, the present invention aims to provide a personalized position semantic publishing method and system based on differential privacy. First, a semantic privacy protection level with parameter l is set according to the user's privacy protection requirement, and the l-1 position semantics closest to the semantic to be protected are calculated. Then, the semantic sensitivity of each position is calculated from the number of times the user has accessed each semantic, and the publication probabilities of the l semantics are obtained from these sensitivities. Finally, a Laplace variable satisfying a specific probability is generated from a Gaussian variable and an exponentially distributed variable with specific parameters; this Laplace variable gives the published user position satisfying the specific semantic sensitivity.
In order to achieve the purpose, the invention provides the following technical scheme:
In one aspect, the invention provides a personalized position semantic publishing method based on differential privacy, comprising the following steps:
S1: Data preprocessing: the originally acquired position data is cleaned and reduced to obtain the position-sensitive data to be protected, $X = \{x_1, \dots, x_i, \dots, x_n\}$, a total of n positions, where $x_i = \{x_i^{lng}, x_i^{lat}, x_i^{sem}\}$ denotes the i-th position and its components denote the longitude, latitude, and semantics of the i-th position, respectively;
S2: Setting a corresponding semantic privacy protection level l according to the semantic privacy protection requirement;
S3: Calculating the semantic sensitivity;
S4: Noise generation: generating Laplace noise that conforms to the specific semantic sensitivity according to the generation principle of Laplace noise;
S5: Adding the obtained Laplace noise to the position data to obtain new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$;
S6: Iterative processing: each position is processed in turn by repeating steps S2 to S5 until all positions have been processed;
S7: Data publishing: each processed position has corresponding new perturbed position data associated with at least l different semantics; one of these position semantics is selected and published as the position of the user. The published new positions are $X' = \{x'_1, x'_2, \dots, x'_i, \dots, x'_n\}$, where $x'_i = \{{x'}_i^{lng}, {x'}_i^{lat}, {x'}_i^{sem}\}$ denotes the i-th position after noise addition and its components denote the longitude, latitude, and semantics of the i-th position after noise addition, respectively.
Further, step S3 specifically includes the following steps:
S31: According to the Euclidean distance, calculating the l-1 semantics nearest to the i-th position $x_i$, and taking the position's own semantic together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \dots, sem_i, \dots, sem_l)$, where each $sem_i$ is described by its longitude and latitude range;
S32: Position semantic sensitivity calculation: calculating the sensitivities $P_{sem} = (p_{sem_1}, p_{sem_2}, \dots, p_{sem_l})$ of the l semantics obtained in step S31. The sensitivity is calculated from $H(sem_i)$, the total number of times the semantic $sem_i$ is accessed, and L, the sum of the number of times all semantics are accessed.
Further, step S4 specifically includes the following steps:
S41: Calculating the standard deviations $\sigma_1, \sigma_2$ with the inverse cumulative distribution function of the Gaussian distribution according to the semantic sensitivity and the position semantic range, where the mean $\mu$ is 0 and $\sigma_1, \sigma_2$ are the resulting Gaussian standard deviation parameters;
S42: Calculating the parameters $\lambda_1, \lambda_2$ needed to generate the exponential distribution with the inverse cumulative distribution function of the exponential distribution according to the semantic sensitivity and the position semantic range;
S43: Generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S41;
S44: Generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S42;
S45: Calculating the generalized Laplace variables $Y_{lng}, Y_{lat}$, which are the generated longitude and latitude noise conforming to the specific semantic sensitivity.
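One standard way to obtain a Laplace variable from a Gaussian variable and an exponential variable, as steps S43 to S45 describe, is the Gaussian scale-mixture identity below. It is offered as an illustrative derivation only: the product form $Y = \sqrt{W}\,Z$ and the reading of $\lambda$ as a rate parameter are assumptions made here, not a statement of the formula used by the invention.

```latex
% Assumption: Z ~ N(0, \sigma^2) and W ~ Exp(\lambda) (rate \lambda), independent.
\begin{aligned}
Y &= \sqrt{W}\, Z, \\
\mathbb{E}\!\left[e^{itY}\right]
  &= \mathbb{E}\!\left[\exp\!\left(-\tfrac{1}{2}\sigma^{2} t^{2} W\right)\right]
   = \frac{\lambda}{\lambda + \tfrac{1}{2}\sigma^{2} t^{2}}
   = \frac{1}{1 + b^{2} t^{2}},
\qquad b = \frac{\sigma}{\sqrt{2\lambda}} .
\end{aligned}
```

The last expression is the characteristic function of a zero-mean Laplace distribution with scale $b$, so under these assumptions the pair $(\sigma, \lambda)$ fixes the spread of the generated noise.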
In another aspect, the present invention provides a personalized position semantic publishing system based on differential privacy, comprising:
A data preprocessing module: used for cleaning and reducing the originally acquired position data to obtain the position data to be protected, $X = \{x_1, x_2, \dots, x_i, \dots, x_n\}$, where $x_i = \{x_i^{lng}, x_i^{lat}, x_i^{sem}\}$ denotes the i-th position and its components denote the longitude, latitude, and semantics of the i-th position, respectively;
A parameter setting module: used for setting the semantic position privacy protection level parameter l;
A semantic sensitivity calculation module: used for calculating the sensitivities $P_{sem} = (p_{sem_1}, p_{sem_2}, \dots, p_{sem_l})$ of the l semantics;
A noise generation module: used for generating Laplace noise that conforms to the specific semantic sensitivity;
A noise adding module: used for adding the generalized Laplace noise obtained by the fifth unit of the noise generation module to the position data to obtain new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$;
An iterative processing module: used for iteratively processing each position until all position data are updated;
A data publishing module: each processed position has corresponding new perturbed position data associated with at least l different semantics; one of these position semantics is selected and published as the position of the user. The published new positions are $X' = \{x'_1, x'_2, \dots, x'_i, \dots, x'_n\}$, where $x'_i = \{{x'}_i^{lng}, {x'}_i^{lat}, {x'}_i^{sem}\}$ denotes the i-th position after noise addition and its components denote the longitude, latitude, and semantics of the i-th position after noise addition, respectively.
Further, the semantic sensitivity calculation module includes the following sub-units:
A first semantic sensitivity calculation unit: calculating the l-1 semantics nearest to the i-th position according to the Euclidean distance, and taking the position's own semantic together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \dots, sem_l)$, where each $sem_i$ is described by its longitude and latitude range;
A second semantic sensitivity calculation unit: position semantic sensitivity calculation, calculating the sensitivities $P_{sem}$ of the l semantics. The sensitivity is calculated from $H(sem_i)$, the total number of times the semantic $sem_i$ is accessed, and L, the sum of the number of times all semantics are accessed.
Further, the noise generation module comprises the following sub-units:
A first noise generation unit: used for calculating the standard deviations $\sigma_1, \sigma_2$ with the inverse cumulative distribution function of the Gaussian distribution according to the semantic sensitivity and the position semantic range, where the mean $\mu$ is 0 and $\sigma_1, \sigma_2$ are the required parameters;
A second noise generation unit: used for calculating the parameters $\lambda_1, \lambda_2$ needed to generate the exponential distribution with the inverse cumulative distribution function of the exponential distribution according to the semantic sensitivity and the position semantic range;
A third noise generation unit: used for generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S41;
A fourth noise generation unit: used for generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S42;
A fifth noise generation unit: used for calculating the generalized Laplace variables $Y_{lng}, Y_{lat}$, which are the generated longitude and latitude noise conforming to the specific semantic sensitivity.
The invention has the beneficial effects that: according to the method, the Laplace noise with a specific probability can be generated, so that the latitude and longitude data privacy of the position can be protected, and the position semantic privacy of a user can be protected; noise meeting the sensitivity of specific semantics can be generated according to the privacy protection requirements of users on different semantics sensitivities, and personalized protection on the position semantics of the users is realized; the implementation process and steps are simple and easy to implement, the usability of the published data is improved, the consumption of communication resources is reduced, and the method has important market value.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For a better understanding of the objects, aspects and advantages of the present invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flowchart illustrating steps of a method for distributing personalized location semantics based on differential privacy according to an embodiment of the present invention;
FIG. 2 is a general flow chart provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a personalized position semantic publishing system based on differential privacy according to an embodiment of the present invention.
Detailed Description
The following embodiments of the present invention are provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for illustrative purposes only; they are schematic representations rather than drawings of a physical product and are not to be construed as limiting the invention. To better explain the embodiments of the present invention, some parts of the drawings may be omitted, enlarged, or reduced, and they do not represent the size of an actual product. Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
Referring to FIG. 1 to FIG. 3, FIG. 1 and FIG. 2 are general flow charts of the method of the present invention. The specific steps of the method of this embodiment, which generates Laplace noise with a specific semantic sensitivity, are as follows:
Step S1, data preprocessing: the originally acquired position data is cleaned and reduced to obtain the position-sensitive data to be protected, $X = \{x_1, \dots, x_i, \dots, x_n\}$, a total of n positions, where $x_i = \{x_i^{lng}, x_i^{lat}, x_i^{sem}\}$ denotes the i-th position and its components denote the longitude, latitude, and semantics of the i-th position, respectively.
In the embodiment, the check-in data is cleaned and reduced to obtain the check-in data to be protected, $X = \{x_1, x_2, \dots, x_{1303}\}$.
Step S2, setting the semantic privacy protection level parameter: a corresponding privacy protection level l is set according to the semantic privacy protection requirement of the user.
In the embodiment, the semantic privacy protection level parameter is set to l = 4; in a specific implementation, a person skilled in the art may choose the semantic privacy protection level parameter as needed.
Step S3, semantic sensitivity calculation, which comprises the following steps:
Step S3-1, according to the Euclidean distance, calculating the l-1 semantics nearest to the i-th position $x_i$, and taking the position's own semantic together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \dots, sem_l)$, where each $sem_i$ is described by its longitude and latitude range.
In the embodiment, the first position point $x_1$ = {106.61704, 29.541919, <Xinhua bookshop>} is taken, and the 3 semantics nearest to it are calculated according to the Euclidean distance: <New century supermarket>, <China Mobile>, and <KFC>. The ranges of the four semantics are: Xinhua bookshop {[106.61647, 106.61760], [29.541427, 29.542410]}; New century supermarket {[106.61495, 106.616128], [29.541529, 29.54255]}; China Mobile {[106.613461, 106.61486], [29.54121, 29.5424362]}; KFC {[106.617366, 106.61809], [29.54063, 29.54126]};
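The neighbor search of step S3-1 can be sketched as follows, using the four ranges quoted in the embodiment. The text does not say how the Euclidean distance to a semantic range is measured, so this sketch assumes the distance to the center of each range; the names `SEMANTIC_RANGES`, `center`, and `semantic_set` are hypothetical, and the fourth semantic, which appears under varying spellings in this text, is labelled `KFC` here as an assumption.

```python
import math

# Ranges copied from the embodiment: (lng_min, lng_max, lat_min, lat_max).
SEMANTIC_RANGES = {
    "Xinhua bookshop": (106.61647, 106.61760, 29.541427, 29.542410),
    "New century supermarket": (106.61495, 106.616128, 29.541529, 29.54255),
    "China Mobile": (106.613461, 106.61486, 29.54121, 29.5424362),
    "KFC": (106.617366, 106.61809, 29.54063, 29.54126),
}

def center(box):
    """Center point of a longitude/latitude range (assumed reference point)."""
    lng_min, lng_max, lat_min, lat_max = box
    return ((lng_min + lng_max) / 2, (lat_min + lat_max) / 2)

def semantic_set(lng, lat, own, ranges, l):
    """Return the position's own semantic plus its l-1 nearest other semantics."""
    others = [name for name in ranges if name != own]
    others.sort(key=lambda name: math.dist((lng, lat), center(ranges[name])))
    return [own] + others[: l - 1]

# With l = 4 and only four known semantics, the set contains all of them.
sem = semantic_set(106.61704, 29.541919, "Xinhua bookshop", SEMANTIC_RANGES, 4)
```

With a larger catalogue of semantics, the same call would keep only the l-1 closest candidates, so the choice of reference point (center, nearest edge, or otherwise) directly affects which semantics enter the set.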
Step S3-2, position semantic sensitivity calculation: calculating the sensitivities $P_{sem}$ of the l semantics obtained in step S3-1. The sensitivity is calculated from $H(sem_i)$, the total number of times the semantic $sem_i$ is accessed, and L, the sum of the number of times all semantics are accessed.
In the embodiment, the Xinhua bookshop semantic is accessed 30 times in total, the New century supermarket semantic 200 times, the China Mobile semantic 15 times, and the KFC semantic 10 times; the resulting position semantic sensitivities are 0.023, 0.0075, 0.21, and 0.007, respectively;
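The sensitivity formula itself is not reproduced in this text, so the following is only a sketch under a stated assumption: it treats the sensitivity as the plain relative frequency $H(sem_i)/L$, with L taken over the listed semantics only. The function name `semantic_sensitivity` is hypothetical, and because the actual formula may differ, the output should not be expected to match the sensitivity values quoted in the embodiment.

```python
def semantic_sensitivity(access_counts):
    """Assumed form: each semantic's share of all recorded accesses, H(sem_i) / L."""
    total = sum(access_counts.values())  # L: total accesses over the listed semantics
    if total == 0:
        raise ValueError("no accesses recorded")
    return {name: count / total for name, count in access_counts.items()}

# Access counts quoted in the embodiment.
counts = {
    "Xinhua bookshop": 30,
    "New century supermarket": 200,
    "China Mobile": 15,
    "KFC": 10,
}
shares = semantic_sensitivity(counts)
```

Whatever the exact formula, the inputs are the same two quantities: a per-semantic access count and a total, which is why the sketch takes a single dictionary of counts.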
Step S4, noise generation: generating Laplace noise that conforms to the specific semantic sensitivity according to the generation principle of Laplace noise, comprising the following steps:
Step S4-1, calculating the standard deviations $\sigma_1, \sigma_2$ with the inverse cumulative distribution function of the Gaussian distribution according to the semantic sensitivity and the position semantic range, where the mean $\mu$ is 0 and $\sigma_1, \sigma_2$ are the resulting Gaussian standard deviation parameters;
In the embodiment, the parameters of the 4 semantics are obtained from the semantic ranges and the semantic sensitivities calculated in step S3: $\sigma_1 = \{0.00346, 0.0000213, 0.000317, 0.0000278\}$, $\sigma_2 = \{0.00023001, 0.000321, 0.0000378, 0.00001265\}$;
Step S4-2, calculating the parameters $\lambda_1, \lambda_2$ needed to generate the exponential distribution with the inverse cumulative distribution function of the exponential distribution according to the semantic sensitivity and the position semantic range;
In the embodiment, the parameters of the 4 semantics are obtained from the semantic ranges and the semantic sensitivities calculated in step S3: $\lambda_1 = \{0.000075, 0.00067, 0.0000379, 0.0000543\}$, $\lambda_2 = \{0.0003954, 0.0001534, 0.000069023, 0.00001357\}$;
Step S4-3, generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S4-1;
In the embodiment, one value is randomly selected from each of the sets $\sigma_1, \sigma_2$ obtained in step S4-1 as the Gaussian distribution parameter: 0.000317 is selected from $\sigma_1$ to generate $Z_{lng} = 0.0000131$, and 0.000321 is selected from $\sigma_2$ to generate $Z_{lat} = 0.000678$;
Step S4-4, generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S4-2;
In the embodiment, one value is randomly selected from each of the sets $\lambda_1, \lambda_2$ obtained in step S4-2 as the exponential distribution parameter: 0.00067 is selected from $\lambda_1$ to generate $W_{lng} = 0.000362$, and 0.000354 is selected from $\lambda_2$ to generate $W_{lat} = 0.000714$;
Step S4-5, calculating the generalized Laplace variables $Y_{lng}, Y_{lat}$, which are the generated longitude and latitude noise conforming to the specific semantic sensitivity.
In the embodiment, $Y_{lng} = 0.0002492444$ and $Y_{lat} = 0.000181166$ are obtained from the $Z_{lng}, Z_{lat}, W_{lng}, W_{lat}$ determined in steps S4-3 and S4-4;
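Steps S4-3 to S4-5 can be sketched with the Python standard library as follows. The sketch assumes the common Gaussian-exponential mixture construction, in which the noise is the Gaussian draw multiplied by the square root of the exponential draw, and it treats the exponential parameter as a rate; both are assumptions rather than the formula of the invention, and the names `mixture_laplace` and `laplace_scale` are hypothetical.

```python
import math
import random

def mixture_laplace(sigma, lam, rng):
    """Draw Y = sqrt(W) * Z with Z ~ N(0, sigma^2) and W ~ Exp(rate=lam)."""
    z = rng.gauss(0.0, sigma)   # Gaussian noise (step S4-3)
    w = rng.expovariate(lam)    # exponential noise (step S4-4); lam is a rate, E[W] = 1 / lam
    return math.sqrt(w) * z     # assumed combination rule (step S4-5)

def laplace_scale(sigma, lam):
    """Scale b of the Laplace law that Y follows under this construction."""
    return sigma / math.sqrt(2.0 * lam)

rng = random.Random(1)
y_lng = mixture_laplace(1.0, 0.5, rng)  # illustrative parameters, not the embodiment's
```

Under this construction a single pair of parameters determines the Laplace scale, so noise can be drawn independently for longitude and latitude by calling the sampler once per coordinate with that coordinate's parameters.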
Step S5, noise addition: adding the generalized Laplace noise obtained in step S4-5 to the position data to obtain new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$.
In the embodiment, the generalized Laplace noise generated in step S4-5 is added to the first position data $x_1$ = {106.61704, 29.541919, <Xinhua bookshop>} to obtain the new position data $x'_1$ = {106.6172892444, 29.54210016}.
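The addition in step S5 can be checked with exact decimal arithmetic. The sketch below uses the coordinates and noise values quoted in the embodiment; the helper name `add_noise` is hypothetical.

```python
from decimal import Decimal

def add_noise(lng, lat, y_lng, y_lat):
    """Return the perturbed coordinates X' = X + Y, computed exactly in decimal."""
    return Decimal(lng) + Decimal(y_lng), Decimal(lat) + Decimal(y_lat)

# Values quoted in the embodiment, passed as strings to avoid binary rounding.
new_lng, new_lat = add_noise("106.61704", "29.541919", "0.0002492444", "0.000181166")
```

The longitude sum is 106.6172892444, as quoted; the latitude sum is 29.542100166, which the embodiment quotes to eight decimal places.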
Step S6, iterative processing: each position is processed in turn by repeating steps S2 to S5 until all position data are processed.
In the embodiment, each position data is traversed, and all 1303 positions are processed through steps S2 to S5 until all position data are processed;
Step S7, data publishing: each processed position has corresponding new perturbed position data associated with at least l different semantics; one of these position semantics is selected and published as the position of the user. The published new positions are $X' = \{x'_1, x'_2, \dots, x'_i, \dots, x'_n\}$, where $x'_i = \{{x'}_i^{lng}, {x'}_i^{lat}, {x'}_i^{sem}\}$ denotes the i-th position after noise addition and its components denote the longitude, latitude, and semantics of the i-th position after noise addition, respectively.
In the embodiment, the published data for $x_1$ consists of the noise-added longitude and latitude 106.6172892444 and 29.54210016; the published semantic is one selected from Xinhua bookshop, New century supermarket, China Mobile, and KFC, with the semantic sensitivity values used as the selection probability weights.
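The semantic selection in step S7 can be sketched as a weighted random draw. The sketch assumes that "using the sensitivity as a probability measure" means selecting each semantic with probability proportional to its sensitivity value; the function name `choose_semantic` is hypothetical, and the weights are the sensitivity values quoted in the embodiment in the order the semantics are listed there.

```python
import random

def choose_semantic(sensitivities, rng=None):
    """Pick one semantic with probability proportional to its sensitivity value."""
    rng = rng or random.Random()
    names = list(sensitivities)
    weights = [sensitivities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Sensitivity values quoted in the embodiment (assumed to follow the listed order).
sensitivities = {
    "Xinhua bookshop": 0.023,
    "New century supermarket": 0.0075,
    "China Mobile": 0.21,
    "KFC": 0.007,
}
published_semantic = choose_semantic(sensitivities, random.Random(7))
```

The weights need not sum to 1, since `random.choices` normalizes them internally; only their relative sizes matter.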
In a specific implementation, the method provided by the invention can be run as an automated process using software technology, or implemented as a corresponding modular system comprising the following modules.
A data preprocessing module, used for cleaning and reducing the originally acquired position data to obtain the position data to be protected, $X = \{x_1, x_2, \dots, x_i, \dots, x_n\}$, where $x_i = \{x_i^{lng}, x_i^{lat}, x_i^{sem}\}$ denotes the i-th position and its components denote the longitude, latitude, and semantics of the i-th position, respectively.
A parameter setting module, used for setting the semantic position privacy protection level parameter l.
A semantic sensitivity calculation module, used for calculating the sensitivities $P_{sem} = (p_{sem_1}, p_{sem_2}, \dots, p_{sem_l})$ of the l semantics, which comprises the following sub-units:
A first unit, used for calculating the l-1 semantics nearest to the i-th position according to the Euclidean distance, and taking the position's own semantic together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \dots, sem_i, \dots, sem_l)$, where each $sem_i$ is described by its longitude and latitude range.
A second unit, used for calculating the sensitivities $P_{sem}$ of the l position semantics. The sensitivity is calculated from $H(sem_i)$, the total number of times the semantic $sem_i$ is accessed, and L, the sum of the number of times all semantics are accessed.
A noise generation module, used for generating Laplace noise that conforms to the specific semantic sensitivity, which comprises the following sub-units:
A first unit, used for calculating the standard deviations $\sigma_1, \sigma_2$ with the inverse cumulative distribution function of the Gaussian distribution according to the semantic sensitivity and the position semantic range, where the mean $\mu$ is 0 and $\sigma_1, \sigma_2$ are the required parameters;
A second unit, used for calculating the parameters $\lambda_1, \lambda_2$ needed to generate the exponential distribution with the inverse cumulative distribution function of the exponential distribution according to the semantic sensitivity and the position semantic range;
A third unit, used for generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S4-1;
A fourth unit, used for generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S4-2;
A fifth unit, used for calculating the generalized Laplace variables $Y_{lng}, Y_{lat}$, which are the generated longitude and latitude noise conforming to the specific semantic sensitivity.
A noise adding module, used for adding the generalized Laplace noise obtained by the fifth unit of the noise generation module to the position data to obtain new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$.
An iterative processing module, used for iteratively processing each position by repeating steps S2 to S5 until all position data are updated.
A data publishing module: each processed position has corresponding new perturbed position data associated with at least l different semantics; one of these position semantics is selected and published as the position of the user. The published new positions are $X' = \{x'_1, x'_2, \dots, x'_i, \dots, x'_n\}$, where $x'_i = \{{x'}_i^{lng}, {x'}_i^{lat}, {x'}_i^{sem}\}$ denotes the i-th position after noise addition and its components denote the longitude, latitude, and semantics of the i-th position after noise addition, respectively.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.
Claims (2)
1. A personalized position semantic publishing method based on differential privacy, characterized in that the method comprises the following steps:
S1: data preprocessing: cleaning and reducing the originally acquired position data to obtain the position-sensitive data to be protected, $X = \{x_1, \ldots, x_i, \ldots, x_n\}$, n positions in total, where $x_i$ denotes the i-th position, comprising the longitude, latitude and semantics of the i-th position;
S2: setting the semantic privacy protection level l according to the semantic privacy protection requirement;
S3: calculating semantic sensitivity; step S3 specifically comprises the following steps:
S31: finding, by Euclidean distance, the l-1 semantics nearest to the i-th position $x_i$, and taking the semantic to which the position itself belongs together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \ldots, sem_i, \ldots, sem_l)$, where each semantic $sem_i$ has an associated latitude and longitude range;
S32: position semantic sensitivity calculation: calculating the sensitivity of each semantic obtained in step S31 according to the following formula,
where $H(sem_i)$ denotes the total number of visits to semantic $sem_i$, and L denotes the sum of the visit counts of all semantics;
S4: noise generation: generating Laplace noise that matches the specific semantic sensitivity, following the generation principle of Laplace noise; step S4 specifically comprises the following steps:
S41: calculating the standard deviations $\sigma_1, \sigma_2$ for the inverse cumulative distribution function of the Gaussian distribution from the semantic sensitivity and the position semantic range:
where $\mu$ is 0 and $\sigma_1, \sigma_2$ are the resulting Gaussian standard deviation parameters;
S42: calculating the parameters $\lambda_1, \lambda_2$ that the inverse cumulative distribution function of the exponential distribution needs in order to generate exponentially distributed values, from the semantic sensitivity and the position semantic range:
S43: generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S41;
S44: generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S42;
S45: calculating the generalized Laplace variables, where $Y_{lng}, Y_{lat}$ are the generated longitude and latitude noise matching the specific semantic sensitivity;
S5: adding the obtained Laplace noise to the position data to obtain the new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$;
S6: iterative processing: processing each position in turn, repeating steps S2-S5 until all positions have been processed;
S7: data publishing: each processed position corresponds to new perturbed position data lying among at least l different semantics; one location semantic is selected from these and published as the user's position, and the published new positions are $X' = \{x'_1, x'_2, \ldots, x'_i, \ldots, x'_n\}$, where $x'_i$ denotes the i-th position after noise addition, comprising the longitude, latitude and semantics of the i-th position after noise addition.
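Step S31 can be illustrated with a short Python sketch. It assumes that each semantic is described by a latitude/longitude bounding box and that the distance from a position to a semantic is measured to the centre of that box. The claim only specifies Euclidean distance, so the use of the box centre, the `SemanticRegion` type, the function name `build_semantic_set` and the sample regions are all assumptions made for illustration, not the patented procedure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticRegion:
    """A semantic label with its latitude/longitude range (hypothetical type)."""
    name: str
    lng_min: float
    lng_max: float
    lat_min: float
    lat_max: float

    def centre(self):
        """Midpoint of the bounding box as (lng, lat)."""
        return ((self.lng_min + self.lng_max) / 2.0,
                (self.lat_min + self.lat_max) / 2.0)


def build_semantic_set(lng, lat, own, regions, level):
    """Return the position's own semantic plus its (level - 1) nearest others.

    Distance is the Euclidean distance to each region's centre. That choice
    is an assumption: the claim does not say which point of the range is used.
    """
    others = [r for r in regions if r.name != own.name]
    if level < 1 or level - 1 > len(others):
        raise ValueError("level l must satisfy 1 <= l <= number of semantics")

    def distance(region):
        c_lng, c_lat = region.centre()
        return math.hypot(lng - c_lng, lat - c_lat)

    nearest = sorted(others, key=distance)[: level - 1]
    return [own] + nearest


# Made-up regions on an abstract grid, for illustration only.
regions = [
    SemanticRegion("hospital", 0.0, 1.0, 0.0, 1.0),
    SemanticRegion("school", 2.0, 3.0, 0.0, 1.0),
    SemanticRegion("park", 0.0, 1.0, 3.0, 4.0),
    SemanticRegion("mall", 10.0, 11.0, 10.0, 11.0),
]
semantic_set = build_semantic_set(0.6, 0.4, regions[0], regions, level=3)
```

With level l = 3, the set holds the position's own semantic followed by its two nearest neighbours. A larger l widens the set of candidate semantics among which the published position can fall.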
2. A personalized position semantic publishing system based on differential privacy, characterized by comprising:
a data preprocessing module: used for cleaning and reducing the originally acquired position data to obtain the position data to be protected, $X = \{x_1, x_2, \ldots, x_i, \ldots, x_n\}$, where $x_i$ denotes the i-th position, comprising the longitude, latitude and semantics of the i-th position;
a parameter setting module: used for setting the semantic location privacy protection level parameter l;
a noise generation module: used for generating Laplace noise that matches a specific semantic sensitivity;
a noise adding module: used for adding the generalized Laplace noise obtained by the fifth unit of the noise generation module to the position data to obtain the new position data $X'_{lng} = X_{lng} + Y_{lng}$, $X'_{lat} = X_{lat} + Y_{lat}$;
an iteration processing module: used for iteratively processing each position until all position data have been updated;
a data publishing module: each processed position corresponds to new perturbed position data lying among at least l different semantics; one location semantic is selected from these l different semantics and published as the user's position, and the published new positions are $X' = \{x'_1, x'_2, \ldots, x'_i, \ldots, x'_n\}$, where $x'_i$ denotes the i-th position after noise addition, comprising the longitude, latitude and semantics of the i-th position after noise addition;
the semantic sensitivity calculation module comprises the following subunits:
a first semantic sensitivity calculation unit: finding, by Euclidean distance, the l-1 semantics nearest to the i-th position, and taking the semantic to which the position belongs together with these l-1 semantics as the semantic set $Sem = (sem_1, sem_2, \ldots, sem_i, \ldots, sem_l)$, where each semantic $sem_i$ has an associated latitude and longitude range;
a second semantic sensitivity calculation unit: position semantic sensitivity calculation, calculating the sensitivity of each semantic according to the following formula:
where $H(sem_i)$ denotes the total number of visits to semantic $sem_i$, and L denotes the sum of the visit counts of all semantics;
the noise generation module comprises the following subunits:
a first noise generation unit: used for calculating the standard deviations $\sigma_1, \sigma_2$ for the inverse cumulative distribution function of the Gaussian distribution from the semantic sensitivity and the position semantic range,
where $\mu$ is 0 and $\sigma_1, \sigma_2$ are the resulting parameters;
a second noise generation unit: used for calculating the parameters $\lambda_1, \lambda_2$ that the inverse cumulative distribution function of the exponential distribution needs in order to generate exponentially distributed values, from the semantic sensitivity and the position semantic range;
a third noise generation unit: used for generating Gaussian noise $Z_{lng}, Z_{lat}$ from the Gaussian distribution parameters obtained in step S41;
a fourth noise generation unit: used for generating exponential noise $W_{lng}, W_{lat}$ from the exponential distribution parameters obtained in step S42;
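The noise generation and noise adding units can be sketched in Python under stated assumptions. The formulas for $\sigma_1, \sigma_2$ and $\lambda_1, \lambda_2$ are not reproduced in this text, so the sketch takes one hypothetical `scale` per coordinate and relies on a standard identity instead: if $Z \sim N(0, \text{scale}^2)$ and $W \sim \mathrm{Exp}(1)$ are independent, then $\sqrt{2W}\,Z$ follows a Laplace distribution with mean 0 and scale parameter `scale`. Both draws use inverse cumulative distribution functions, as the claim describes. The function names, the rate of 1 and the sample coordinates are illustrative choices, not the patented parameterisation.

```python
import math
import random
from statistics import NormalDist


def uniform_open(rng):
    """Uniform draw strictly inside (0, 1), as the inverse CDFs require."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def laplace_noise(scale, rng):
    """One Laplace(0, scale) draw built as sqrt(2 * W) * Z."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Z ~ N(0, scale^2), drawn through the Gaussian inverse CDF.
    z = NormalDist(mu=0.0, sigma=scale).inv_cdf(uniform_open(rng))
    # W ~ Exp(rate=1), drawn through the exponential inverse CDF -ln(1 - u).
    w = -math.log(1.0 - uniform_open(rng))
    return math.sqrt(2.0 * w) * z


def perturb_position(lng, lat, scale_lng, scale_lat, rng):
    """Add independent noise to each coordinate: X' = X + Y."""
    return (lng + laplace_noise(scale_lng, rng),
            lat + laplace_noise(scale_lat, rng))


# Example: perturb one coordinate pair with hypothetical scales (in degrees).
rng = random.Random(0)
new_lng, new_lat = perturb_position(106.55, 29.56, 0.01, 0.01, rng)
```

In a conventional Laplace mechanism the scale would be the sensitivity divided by the privacy budget; how the patent maps semantic sensitivity and semantic range onto the Gaussian and exponential parameters is left to the formulas in the original specification.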
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110449465.5A CN113177166B (en) | 2021-04-25 | 2021-04-25 | Personalized position semantic publishing method and system based on differential privacy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113177166A (en) | 2021-07-27 |
CN113177166B (en) | 2022-10-21 |
Family ID: 76926190
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110449465.5A Active CN113177166B (en) | 2021-04-25 | 2021-04-25 | Personalized position semantic publishing method and system based on differential privacy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113177166B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595976A (en) * | 2018-03-27 | 2018-09-28 | Xidian University | Android terminal sensor information guard method based on difference privacy |
CN109726594A (en) * | 2019-01-09 | 2019-05-07 | Nanjing University of Aeronautics and Astronautics | A kind of novel track data dissemination method based on difference privacy |
CN110300029A (en) * | 2019-07-06 | 2019-10-01 | Guilin University of Electronic Technology | A kind of location privacy protection method of anti-side right attack and position semantic attacks |
CN111447181A (en) * | 2020-03-04 | 2020-07-24 | Chongqing University of Posts and Telecommunications | Location privacy protection method based on differential privacy |
CN111931235A (en) * | 2020-08-18 | 2020-11-13 | Chongqing University of Posts and Telecommunications | Differential privacy protection method and system under error constraint condition |
CN111950028A (en) * | 2020-08-24 | 2020-11-17 | Chongqing University of Posts and Telecommunications | Differential privacy protection method and system for track time mode |
CN112035880A (en) * | 2020-09-10 | 2020-12-04 | Liaoning University of Technology | Track privacy protection service recommendation method based on preference perception |
CN112364379A (en) * | 2020-11-18 | 2021-02-12 | Zhejiang University of Technology | Location privacy protection method for guaranteeing service quality based on differential privacy |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10638305B1 (en) * | 2018-10-11 | 2020-04-28 | Citrix Systems, Inc. | Policy based location protection service |
Non-Patent Citations (4)
Title |
---|
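CLM: differential privacy protection method for trajectory publishing; Wang Hao et al.; Journal on Communications; 2017-06-25; Vol. 38 (No. 6); 85-96 * |
Differential Privacy-Based Location Protection in Spatial Crowdsourcing; Jianhao Wei et al.; IEEE Transactions on Services Computing; 2019-06-06; Vol. 15 (No. 1); 45-58 * |
PrivSem: Protecting location privacy using semantic and differential privacy; Yanhui Li et al.; World Wide Web; 2019-04-27; Vol. 22; 2407-2436 * |
Research on trajectory privacy protection for mobile location services; Ju Xiaokang; China Master's Theses Full-text Database, Information Science and Technology; 2021-02-15 (No. 2); I138-186 * |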
Also Published As
Publication number | Publication date |
---|---|
CN113177166A (en) | 2021-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jiang et al. | A utility-aware general framework with quantifiable privacy preservation for destination prediction in LBSs | |
Zhang et al. | On reliable task assignment for spatial crowdsourcing | |
TWI505122B (en) | Method, system, and computer program product for automatically managing security and/or privacy settings | |
Menaga et al. | Least lion optimisation algorithm (LLOA) based secret key generation for privacy preserving association rule hiding | |
CN101493874B (en) | Personal context information privacy protection policy automatic generating method | |
WO2017040852A1 (en) | Modeling of geospatial location over time | |
CN110636065A (en) | Location point privacy protection method based on location service | |
CN111093191B (en) | Crowd sensing position data issuing method based on differential privacy | |
US20210089887A1 (en) | Variance-Based Learning Rate Control For Training Machine-Learning Models | |
Mugan et al. | Understandable learning of privacy preferences through default personas and suggestions | |
Primault et al. | Adaptive location privacy with ALP | |
CN110727957A (en) | Differential privacy protection method and system based on sampling | |
Gupta | Some issues for location dependent information system query in mobile environment | |
CN110704754B (en) | Push model optimization method and device executed by user terminal | |
CN114117536B (en) | Location privacy protection method in three-dimensional space LBS (location based service) based on deep reinforcement learning | |
CN111797433A (en) | LBS service privacy protection method based on differential privacy | |
EP3192061A1 (en) | Measuring and diagnosing noise in urban environment | |
CN113177166B (en) | Personalized position semantic publishing method and system based on differential privacy | |
Yan et al. | Perturb and optimize users’ location privacy using geo-indistinguishability and location semantics | |
CN114564747A (en) | Track difference privacy protection method and system based on semantics and prediction | |
Eltarjaman et al. | Private retrieval of POI details in top-K queries | |
CN111931235B (en) | Differential privacy protection method and system under error constraint condition | |
KR101685847B1 (en) | Method and System for Mediating Proximal Group Users Smart Device Usage via Location-based Virtual Usage Limiting Spaces | |
CN114219581A (en) | Personalized interest point recommendation method and system based on heteromorphic graph | |
CN109472160B (en) | Position privacy protection method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |