CN111797433A - LBS service privacy protection method based on differential privacy - Google Patents

LBS service privacy protection method based on differential privacy

Info

Publication number
CN111797433A
CN111797433A (application number CN202010690224.5A)
Authority
CN
China
Prior art keywords
user
query
privacy
cluster
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010690224.5A
Other languages
Chinese (zh)
Other versions
CN111797433B (en)
Inventor
史伟
张青云
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning University of Technology
Original Assignee
Liaoning University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning University of Technology filed Critical Liaoning University of Technology
Priority to CN202010690224.5A priority Critical patent/CN111797433B/en
Publication of CN111797433A publication Critical patent/CN111797433A/en
Application granted granted Critical
Publication of CN111797433B publication Critical patent/CN111797433B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses an LBS service privacy protection method based on differential privacy. Using the background knowledge of the cluster and the area where the user is located, obtained in the DP-(k1, l)-means algorithm, a query k-anonymity set satisfying differential privacy is constructed and sent to the LSP in place of the user's real query request, thereby protecting the user's query privacy and resisting an attacker's inference attacks and spatio-temporal correlation attacks. In the DP-k2-anonymity algorithm, query requests sent within the same time period t are selected, in combination with temporal features, to construct the query k-anonymity set, and the data set is processed with an exponential mechanism, which ensures that the query requests are temporally plausible and keeps the user's privacy from being disclosed.

Description

LBS service privacy protection method based on differential privacy
Technical Field
The invention relates to a privacy protection method, in particular to an LBS service privacy protection method based on differential privacy.
Background
With the popularization of mobile networks and the development of positioning technology, location-based services (LBS) have been embraced by a large number of users. These service systems greatly facilitate people's lives, but they often collect the request data users send without the users' knowledge, and the analysis and processing of that data lead to leakage of users' private information. An LBS service request contains a large amount of private information, such as the user's identity information, location information, and point-of-interest information, which may all be exposed when the query request is issued. How to protect users' private information without affecting their query results is therefore an important current research and development direction.
LBS requests in a social network place strict demands on background knowledge; in existing LBS privacy protection schemes, background knowledge refers to the historical query probability of a point of interest (POI) on the map. If an attacker possesses certain background knowledge, the accuracy of an attack can be greatly improved, so avoiding background knowledge attacks is extremely important when designing LBS privacy protection schemes. Differential privacy is a privacy protection mechanism that is unaffected by an attacker's background knowledge and by changes to specific data; combining this mechanism with existing privacy protection schemes better solves the problem that traditional privacy protection methods cannot resist background knowledge attacks.
The Location-based service request sent by the user to the LBS Service Platform (LSP) includes sensitive data such as the user's identity information, the current Location information, the time when the query request is sent, and the queried information of the point of interest. The location information and other information deduced according to the location information belong to location privacy, the information related to the request content belongs to query privacy, and both the location privacy and the query privacy belong to the category of LBS privacy protection.
To better protect the privacy information in LBS requests, Gruteser et al first applied k-anonymization techniques to location privacy protection, resulting in a location k-anonymization model: when the position of a moving object at a certain moment cannot be distinguished from the positions of other k-1 users, the position is said to satisfy the position k-anonymity. Niu et al consider road network environment based on a randomization method, select candidate regions by means of circular region segmentation and mesh expansion, and generate pseudo positions according to position semantic information. Although the above algorithms are continuously improved, the common disadvantages are that the background knowledge that an attacker may possess is not considered, and the position information sent by a user to an LSP does not consider the position semantics, so that the generated position anonymity set contains many false position points which can be directly eliminated, the inference attack of the attacker cannot be resisted, the communication overhead is increased, and the service quality is influenced.
For query privacy, the more authentic and diverse the generated false queries are, the higher the protection level of the user's query privacy. The Dummy-Q model proposed by Pingley et al. constructs dummy queries from the query context, the user's motion model, query semantics, and other conditions, so that an attacker cannot distinguish the user's real query from the dummy ones. However, these query privacy protection algorithms do not take the user's real query request into account: if the user's real query is not among the POIs with high query probability, the returned query results are useless to the user.
To address these problems, the DPLQ privacy protection scheme is proposed. The scheme combines a differential privacy mechanism, the k-means algorithm, an l-diversity algorithm, and a k-anonymity algorithm to effectively protect the user's location privacy and query privacy without being affected by the background knowledge possessed by an attacker.
Disclosure of Invention
The invention designs and develops an LBS service privacy protection method based on differential privacy, which effectively protects the user's location privacy and query privacy without being affected by the background knowledge possessed by an attacker.
An LBS service privacy protection method based on differential privacy comprises the following steps:
step one, constructing a Voronoi diagram from the map information so that each Voronoi polygon contains only one POI;
step two, counting, with the location data set X, the number of users contained in each Voronoi polygon, and arranging the Voronoi polygons in descending order of user count;
step three, selecting the k1 Voronoi polygons with the largest numbers of users and taking their centroids Oj as the initial cluster centers of the k-means algorithm;
step four, calculating the Euclidean distance dij between each user location in the location data set X and each initial cluster center Oj;
step five, assigning each user to the cluster at the minimum Euclidean distance;
step six, after all users have been assigned, recalculating the centroids of the k clusters;
step seven, if the distance between the two centroids is smaller than a set threshold, using the original cluster centers and ending the loop; if the distance between the two centroids is larger than the threshold, using the updated centroids as the cluster centers and jumping back to step four;
step eight, returning the cluster center Oj where the querying user is located and the l-1 centroids closest to it, forming a location data set with l false locations; adding Laplace noise to the l cluster centers, constructing from the noised location points a location anonymity set containing l false locations, and sending it to the LSP in place of the user's real location;
step nine, determining the number of users in cluster Cj that send requests during time period t;
step ten, calculating the position similarity Sl between each location point in cluster Cj and the user's location, and arranging in ascending order;
step eleven, obtaining the corresponding query requests q (q1, q2, …, qn) in ascending order of Sl;
step twelve, putting the user's real query request qx into the query k2-anonymity set QA;
step thirteen, judging whether qi is already in QA; if not, adding qi to QA; if so, comparing the next query request qi+1 in the query request order of step eleven, until QA contains k2 elements;
step fourteen, if all qi in cluster Cj have been added to QA and |QA| is still smaller than k2, arranging the historical query request probabilities Pr(qi) within the same time period t in descending order to obtain (q1, q2, …, qm);
step fifteen, jumping back to step thirteen and continuing to judge whether qi is in QA, stopping the loop when |QA| = k2;
step sixteen, performing privacy protection on QA with an exponential mechanism satisfying differential privacy, strictly controlling the output probability of each candidate in QA.
The threshold is preferably taken as 5 meters. i identifies the query requests sent by different users and is an integer; j identifies the clusters and cluster centers and is an integer; n represents the number of query requests sent in the user's cluster Cj during time period t and is an integer; m represents the number of query requests sent historically during time period t and is an integer.
As a further preference, neighboring users in the same cluster may each use the anonymous set in place of their true location information.
The invention has the following beneficial effects:
aiming at the position privacy protection of a user, the application provides DP- (k)1L) -means algorithm. The algorithm combines a differential privacy mechanism with a k-means algorithm, performs Voronoi graph pre-division on a road network, constructs k cluster clusters according to divided Voronoi polygons and user position points contained in each polygon, selects l cluster centers for noise processing by using an l-diversity idea, and sends noisy position points to an LBS server instead of the real position of a user, so that the problem that a malicious attacker intercepts user privacy information and the LBS server is not completely credible in the process of sending the LBS service by the user is avoided, and the purpose of protecting the user position privacy is achieved.
For the query privacy protection of the user, this application proposes the DP-k2-anonymity algorithm. A query k-anonymity set is constructed from the query information of neighboring users in the same cluster during the same time period and from the historical query probabilities of the POIs in the area, and noise is added to the query k-anonymity set with an exponential mechanism, thereby protecting the users' query privacy.
Unlike existing privacy protection schemes that can protect only the user's location privacy, this personalized privacy protection scheme protects the user's LBS privacy information while taking background knowledge attacks into account, allowing users to control the degree of privacy protection themselves and meeting the different privacy protection requirements of different users.
Drawings
FIG. 1 is a schematic diagram of the centralized architecture used by the present invention.
FIG. 2 compares algorithms on the degree of deviation of the false locations from the user's real location.
FIG. 3 compares algorithms on the average road network distance from the false locations to the real location.
FIG. 4 compares algorithms on the correlation between the false queries and the real query.
FIG. 5 compares the average running time of the algorithms with k2 fixed.
FIG. 6 compares the average running time of the algorithms with the time period fixed.
FIG. 7 shows the deviation of the false locations constructed by the DPLQ algorithm from the user's real location.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
A POI is an attribute that distinguishes a location beyond its bare geographic coordinates. For example, when user A sends the service request "query the nearest movie theater" to the LSP from a shopping mall, the shopping mall is the POI of the user's location, the mall's geographic position is the user's real location, the movie theater is the POI of the user's query request, and the movie theater's position is the geographic location of the query request. In short, a POI represents semantic information such as a basic building or other landmark facility at a geographic location, and can serve as a keyword of a user's service request.
The background knowledge includes information such as the POI at a given location point, the service requests issued during a given time period, and the query probability of each service request; any client with computing and storage capability can obtain the background knowledge of a location point. The background knowledge in this application is obtained through the Overpass API of OpenStreetMap.
An LBS-based query request requires the user to obtain service based on the current location. The user first sends a query service request Q = {ID, Loc, POI, t, QPOI, QLoc} to a trusted third-party server (TTP), and the TTP applies privacy protection to the location information and the query information separately. In the user's query request Q, ID is the user's identifier, which can directly determine a unique individual; Loc, POI, QLoc and QPOI are the user's quasi-identifiers, from which a minimal identifying attribute set of the user can be obtained by joining with external tables.
Loc represents the user's position when the query request is sent, POI the point of interest of the current position, QPOI the point of interest the user wants to query, and QLoc the position of the queried point of interest. t represents the time at which the user issues the query request.
Existing LBS privacy protection system architectures fall roughly into three categories: centralized, distributed, and hybrid. This application adopts the centralized architecture, currently the most widely used; it consists of the client, the TTP (trusted third-party server), and the LSP, and its schematic diagram is shown in FIG. 1.
When a user sends a service request to the LSP, the data first passes through the TTP server for privacy protection. A pseudonym module generates a pseudonym for the user, and a new pseudonym is generated for each query to prevent an attacker's inference attacks; a location generalization module generalizes the user's location data and sends the k-means cluster centers to the LSP in place of the user's real location; a query anonymization module applies k-anonymity to the user's query information and sends k query items to the LSP simultaneously. The LSP processes the received anonymous request and returns a service result data set to the TTP, and the TTP's query result refinement module returns to the user, based on the user's real data, the query results that meet the user's needs.
For the location privacy protection of the user, the map is partitioned with a Voronoi diagram, and k-means and l-diversity are adopted as the basic ideas of privacy protection. The (k, l)-means privacy protection model is thus defined as follows:
(k, l)-means privacy protection model. If Loc1 and Loc2 satisfy the following conditions:
(1) |Loc1| = k, where Loc1 represents the generalized set of user locations;
(2) Loc1 ∈ DESC(O1, O2, …, Ok), where DESC(O1, O2, …, Ok) represents the descending ordering of all cluster centers;
(3) Loc2 ⊆ Loc1 and |Loc2| = l, where Loc2 denotes the set of locations sent to the LSP;
then the location data in Loc2 satisfy the (k, l)-means privacy protection model.
Condition 1 states that the generalized location set Loc1 contains k data records, obtained by clustering the users in the area during the same time period into k clusters; condition 2 states that the data in Loc1 are arranged in descending order by cluster center Oi; condition 3 states that l location points are selected from Loc1 to form the set Loc2, which is sent to the LSP as the user's false locations.
For the query privacy protection of the user, a query k-anonymity privacy protection model is proposed based on the idea of location k-anonymity. The model uses a local k-anonymity algorithm to generalize the user's real query request into an anonymous query set, and generalizes the attribute values of different users into different, relatively independent generalization data sets, which prevents over-generalization of user data from degrading the service.
Query k-anonymity privacy protection model. The user's real query information is generalized into an anonymous query data set so that the user's query cannot be distinguished from the other k-1 records, and the generated anonymity set is processed with an exponential mechanism so that the probability of successfully identifying the user's query request is below 1/k.
The DP-(k1, l)-means algorithm combines differential privacy with the k-means algorithm and protects the user's location privacy from known conditions such as the map information M, the user's real location information x, and the location data set X.
First, the map is pre-partitioned with a Voronoi diagram and k1 clusters are constructed from the Voronoi polygons; l cluster centers are then selected following the l-diversity idea and Laplace noise is added; the noised location points form a location anonymity set containing l false locations, which is sent to the LSP instead of the user's real location. All neighboring users in the same cluster can use this anonymity set in place of their real location information, which makes the algorithm reciprocal, saves service response time and computation cost, and effectively resists an attacker's homogeneity attacks and background knowledge attacks.
DP-(k1, l)-means algorithm
Input: map information M, the user's real location information x, location data set X
Output: l false locations
[The pseudocode appears only as an image in the original; the procedure is spelled out in steps one to eight below.]
Here Oj is an initial cluster center of the k-means algorithm, xi represents a sample point in the location data set X, dij represents the Euclidean distance from sample point xi to Oj, Cj represents the cluster corresponding to cluster center Oj, Oj' denotes the updated centroid of cluster Cj, and Lap(λ) represents Laplace noise satisfying ε-differential privacy.
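As an illustration of the noise step, the following is a minimal Python sketch of sampling Lap(λ) noise and perturbing a cluster-center coordinate pair. It is not the patent's implementation; the patent does not state the sensitivity Δf, so a unit sensitivity (λ = Δf/ε with Δf = 1) is assumed here.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from Lap(0, scale) via the inverse-CDF transform."""
    u = random.random() - 0.5   # uniform on [-0.5, 0.5); u == -0.5 has negligible probability
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_center(center, epsilon, sensitivity=1.0):
    """Add Lap(sensitivity/epsilon) noise to each coordinate of a cluster
    center; the noised point is what replaces the user's real location.
    sensitivity=1.0 is an assumption, not a value given by the patent."""
    scale = sensitivity / epsilon
    x, y = center
    return x + laplace_noise(scale), y + laplace_noise(scale)
```

A smaller ε means a larger noise scale λ and therefore stronger privacy at the cost of less accurate false locations.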
The data processing procedure of the DP-(k1, l)-means algorithm is described in detail below.
Step one, constructing a Voronoi diagram according to map information, so that each Voronoi polygon only comprises one POI;
step two, calculating the number of users contained in each Voronoi polygon by combining the position data set X, and performing descending arrangement on the Voronoi polygons according to the number of the users;
step three, selecting front k with large number of users1A Voronoi polygon having its centroid OjAs the initial clustering center of the k-means algorithm;
step four, calculating the position of each user in the position data set X and the initial clustering center OjEuclidean distance of dij
Step five, dividing the users into cluster clusters with the minimum Euclidean distance;
sixthly, after all the users are divided, recalculating the centroids of the k clusters;
step seven, if the distance between the two centroids is smaller than a set threshold value, using the original cluster center, and ending the cycle; and if the distance between the two centroids is greater than the threshold value, using the updated centroid as the cluster center, and skipping to the fourth step.
Step eight, returning the cluster center O where the query user is located by the algorithmjAnd l-1 centroids closer thereto, constituting a location data set with l ghost locations.
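Steps four to eight above can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: locations are treated as planar coordinates, and `threshold` plays the role of the distance threshold (the patent suggests 5 meters).

```python
import math

def kmeans_with_threshold(points, centers, threshold=5.0):
    """Steps four to seven: assign each point to its nearest center,
    recompute centroids, and stop once every centroid has moved less
    than the distance threshold."""
    while True:
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[j].append(p)
        new_centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if all(math.dist(a, b) < threshold for a, b in zip(centers, new_centers)):
            return new_centers, clusters
        centers = new_centers

def false_location_candidates(centers, user_location, l):
    """Step eight: the querying user's own cluster center plus the
    l-1 centroids closest to it."""
    return sorted(centers, key=lambda c: math.dist(c, user_location))[:l]
```

In the scheme itself, the initial `centers` would be the centroids of the k1 most populated Voronoi polygons, and the returned l candidates would then receive Laplace noise before being sent to the LSP.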
The DP-k2-anonymity algorithm combines differential privacy with the k-anonymity algorithm. The user customizes the value of k2 according to their privacy requirement: the larger k2 is, the better the privacy protection effect, but the lower the accuracy of the service.
The threshold value can be set according to experimental requirements, and preferably, the threshold value is 5 meters.
i identifies the query requests sent by different users and is an integer; j identifies the clusters and cluster centers and is an integer; n represents the number of query requests sent in the user's cluster Cj during time period t and is an integer; m represents the number of query requests sent historically during time period t and is an integer.
Using the background knowledge of the user's cluster and area obtained in the DP-(k1, l)-means algorithm, a query k-anonymity set conforming to differential privacy is constructed and sent to the LSP to avoid revealing the user's real query request, thereby protecting the user's query privacy and preventing an attacker's inference attacks and spatio-temporal correlation attacks.
DP-k2-anonymity algorithm
Input: the k2 value, the querying user's cluster Cj, the Voronoi diagram
Output: query k2-anonymity set QA
[The pseudocode appears only as an image in the original; the procedure is spelled out in steps one to eight below.]
Here n represents the number of users in cluster Cj that send requests during time period t; ASCE(S1, S2, …, Sn) denotes the ascending ordering of the position similarities S of the n location points; qi denotes the query request sent during period t by the location point whose position similarity is Si; qx denotes the querying user's request during period t; Pr(qi) denotes the historical query probability of qi during period t, which can be obtained from background knowledge; and DESC(Pr(q1), Pr(q2), …, Pr(qn)) denotes the descending ordering of the Pr(qi).
The position similarity mentioned in the DP-k2-anonymity algorithm can be calculated with the following formula:
[The formula appears only as an image in the original; per the surrounding description, Sl grows as the Euclidean distance shrinks, e.g. Sl = 1 / d(Ui, Uj).]
Here Ui and Uj represent two different users in a cluster, and d(Ui, Uj) represents the Euclidean distance between the two users; the larger Sl is, the more similar the two users are. Within the same time period, the query request sent by any user in the cluster can be replaced by the query k2-anonymity set.
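A minimal sketch of this position-similarity computation follows. The inverse-distance form is an assumption: the patent's formula survives only as an image, and only the qualitative rule that similarity grows as distance shrinks is stated in the text.

```python
import math

def position_similarity(u_i, u_j):
    """Similarity between two users' locations; assumed to be the inverse
    of their Euclidean distance (closer users score higher)."""
    d = math.dist(u_i, u_j)
    return math.inf if d == 0 else 1.0 / d

def neighbors_by_similarity(user, others):
    """ASCE(S1, ..., Sn): neighbour locations sorted by ascending similarity,
    i.e. farthest neighbours first."""
    return sorted(others, key=lambda o: position_similarity(user, o))
```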
The data processing procedure of the DP-k2-anonymity algorithm is described in detail below.
Step one, determining from known background knowledge the number of users in cluster Cj that send requests during time period t;
step two, calculating the position similarity Sl between each location point in cluster Cj and the user's location, and arranging in ascending order;
step three, obtaining the corresponding query requests q (q1, q2, …, qn) in ascending order of Sl;
step four, putting the user's real query request qx into the query k2-anonymity set QA;
step five, judging whether qi is already in QA; if not, adding qi to QA; if so, comparing the next query request qi+1 in the query request order of step three, until QA contains k2 elements;
step six, if all qi in cluster Cj have been added to QA and |QA| is still smaller than k2, arranging the historical query request probabilities Pr(qi) within the same time period t in descending order to obtain (q1, q2, …, qm);
step seven, jumping back to step five and continuing to judge whether qi is in QA, stopping the loop when |QA| = k2.
Step eight, performing privacy protection on QA with an exponential mechanism satisfying differential privacy, strictly controlling the output probability of each candidate in QA.
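The QA construction (steps four to seven) and the exponential mechanism of step eight can be sketched as follows. This is an illustrative sketch: the patent does not specify the exponential mechanism's scoring function or sensitivity, so both are assumptions here, and the neighbour and history query lists are assumed to arrive already sorted (ascending similarity, descending Pr respectively).

```python
import math
import random

def build_query_anonymity_set(q_x, neighbor_queries, history_queries, k2):
    """Steps four to seven: seed QA with the real query q_x, add distinct
    neighbour queries; if the cluster is exhausted before |QA| = k2,
    fall back to historical queries."""
    qa = [q_x]
    for q in list(neighbor_queries) + list(history_queries):
        if len(qa) == k2:
            break
        if q not in qa:
            qa.append(q)
    return qa

def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0):
    """Step eight: output each candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity)), so no candidate, including
    the real query, is revealed deterministically."""
    weights = [math.exp(epsilon * s / (2.0 * sensitivity)) for s in scores]
    r = random.random() * sum(weights)
    acc = 0.0
    for cand, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return cand
    return candidates[-1]
```

With a large ε the highest-scoring candidate is output almost surely; with a small ε the output distribution flattens toward uniform over QA.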
Security analysis
(1) Resisting homogeneity attacks. The basic idea of a homogeneity attack is to find multiple records in a data source that correspond to one individual and, at the same time, to a single sensitive attribute value.
The DP-(k1, l)-means algorithm combines the k-anonymity and l-diversity ideas to generate a location anonymity set that all users in the cluster can use. Even if an attacker obtains the location anonymity set constructed by the algorithm, the real information of the users in the cluster cannot be obtained, because the data in the anonymity set are all false locations surrounding the users in the cluster and contain no real user information. The scheme can therefore effectively resist homogeneity attacks.
(2) Resisting background knowledge attacks; the basic idea of the background knowledge attack is to find out a plurality of records corresponding to a certain data source from a plurality of data sources, and if the background knowledge of the data source is available, other sensitive attribute information corresponding to the data source may be found.
Differential privacy rests on a rigorous mathematical foundation: the result of processing a data set is not affected by any specific piece of data, and deleting any one record from the data set does not noticeably affect the computed result. Suppose the complete data set is D and the attacker already possesses all data except the information about the attack target, denoted as data set D'; D and D' are then neighboring data sets that differ in at most one record. The sensitivity ΔF of a query algorithm F is expressed as
ΔF=maxD,D'||F(D)-F(D')||
ΔF≤1
Colloquially, the algorithm sensitivity ΔF can be understood as the worst-case impact that adding or deleting one record has on the overall query result. The two formulas show that even after obtaining the maximum background knowledge, the attacker still cannot obtain the target's information, so differential privacy resists background knowledge attacks well. The algorithms of this application are combined with a differential privacy mechanism and can effectively resist an attacker's background knowledge attacks while guaranteeing service quality.
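For a counting query, the bound ΔF ≤ 1 can be checked directly: neighboring data sets differ in at most one record, so a count changes by at most 1. The following toy example (illustrative data, not from the patent's experiments) removes one record and observes the change.

```python
def count_query(dataset, predicate):
    """F(D): a counting query over the data set."""
    return sum(1 for record in dataset if predicate(record))

# Neighboring data sets D and D' differ in one record, so for a counting
# query Delta F = max |F(D) - F(D')| = 1.
D = ["q1", "q2", "q2", "q3"]
D_prime = D[:-1]                     # delete one record ("q3")
delta_f = abs(count_query(D, lambda r: r == "q3")
              - count_query(D_prime, lambda r: r == "q3"))
```

Adding Lap(ΔF/ε) noise to such a count then yields ε-differential privacy, which is the role Lap(λ) plays in the algorithms above.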
(3) Resisting inference attacks. The basic idea of an inference attack is that an attacker can deduce the user's possible location information and query requests from life experience, common sense, background knowledge, and other information.
For the user's location information, the location anonymity sets constructed by the algorithm all adopt false locations close to the user's location, so an attacker cannot deduce other private information of the user from the location information. For the user's query privacy, query requests sent by neighbouring users in the same cluster during the same time period are selected to construct a query k2-anonymity set, which both guarantees the authenticity of the queries and prevents an attacker from inferring the user's location or other information from the query content. Therefore, the algorithm provided in the present application can effectively avoid inference attacks.
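A minimal sketch of how such a query k2-anonymity set could be assembled (class and query names are illustrative assumptions; the fallback to historical queries mirrors the claimed construction, and this is not the exact patented implementation):

```java
import java.util.*;

public class QueryAnonymity {
    // Build a query k2-anonymity set QA: start from the user's real query,
    // then add distinct queries sent by neighbours in the same cluster during
    // the same period t; if the cluster cannot supply k2 distinct queries,
    // fall back to historical queries from period t (most frequent first).
    static List<String> buildQA(String realQuery, List<String> neighborQueries,
                                List<String> historicalQueries, int k2) {
        LinkedHashSet<String> qa = new LinkedHashSet<>();
        qa.add(realQuery);
        for (String q : neighborQueries) {
            if (qa.size() >= k2) break;
            qa.add(q);                    // set semantics skip duplicates
        }
        for (String q : historicalQueries) {
            if (qa.size() >= k2) break;
            qa.add(q);
        }
        return new ArrayList<>(qa);
    }

    public static void main(String[] args) {
        List<String> qa = buildQA("hospital",
                Arrays.asList("cafe", "hospital", "bank"),
                Arrays.asList("hotel", "cafe", "park"), 5);
        System.out.println(qa);  // [hospital, cafe, bank, hotel, park]
    }
}
```

Because every entry is a query actually issued by some nearby user in the same period, the set stays plausible to an observer while hiding which query is the real one.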
(4) Resisting spatio-temporal correlation attacks; spatio-temporal correlation attacks [22] mainly target query privacy. In the DP-k2-anonymity algorithm provided in the present application, query requests sent within the same time period t are selected, in combination with temporal features, to construct the query k2-anonymity set, and the data set is processed with the exponential mechanism, which ensures that the query requests are temporally reasonable and protects user privacy from disclosure.
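Selecting an output from the anonymity set under differential privacy is typically done with the exponential mechanism: each candidate in QA would be output with probability proportional to exp(ε·u/(2Δu)) for a utility score u with sensitivity Δu. The sketch below illustrates this (hypothetical names; a sketch under these assumptions, not the patented implementation):

```java
import java.util.Random;

public class ExponentialMechanism {
    // Exponential mechanism: candidate i is chosen with probability
    // proportional to exp(epsilon * scores[i] / (2 * deltaU)), where deltaU
    // is the sensitivity of the scoring function.
    static double[] selectionProbabilities(double[] scores, double epsilon, double deltaU) {
        double[] w = new double[scores.length];
        double sum = 0;
        for (int i = 0; i < scores.length; i++) {
            w[i] = Math.exp(epsilon * scores[i] / (2 * deltaU));
            sum += w[i];
        }
        for (int i = 0; i < w.length; i++) w[i] /= sum;  // normalize
        return w;
    }

    // Draw one candidate index according to the probabilities above.
    static int sample(double[] probs, Random rng) {
        double r = rng.nextDouble(), acc = 0;
        for (int i = 0; i < probs.length; i++) {
            acc += probs[i];
            if (r < acc) return i;
        }
        return probs.length - 1;
    }

    public static void main(String[] args) {
        double[] p = selectionProbabilities(new double[]{1.0, 1.0, 3.0}, 0.5, 1.0);
        System.out.printf("%.3f%n", p[0] + p[1] + p[2]);  // 1.000
    }
}
```

Higher-utility candidates are exponentially more likely, yet every candidate keeps nonzero probability, which is what controls the output probability of each item in QA.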
Experimental verification
In the experiments, the algorithms were written in Java and run on a 1.70 GHz Intel(R) Core(TM) i5 processor with 4 GB of memory under a 64-bit Windows 8 operating system. The data set used in the experiments comes from the road network of Oldenburg, Germany, and user information generated from the Foursquare website. The experimental data set includes 7035 roads, 6105 vertices, POI points of 4 to 9 categories, and 2.5 million query requests from different users.
The proposed method DPLQ is compared in the experiments with three algorithms: Mobimix, H-Star and T-SR. Mobimix is a mix-zone-based road network framework for protecting user location privacy; H-Star is a cloaking algorithm extending X-Star based on Hilbert rules; T-SR is a location privacy protection algorithm based on POI queries. The three algorithms are classical privacy protection algorithms based on different techniques and are well representative.
The experiments use Pseudo-Variance (PV), Average Path Distance (APD), and the Association Degree (AD) between false queries and real queries as evaluation criteria. These three criteria reflect the reasonableness and effectiveness of the false information generated by an algorithm and make it convenient to compare the experimental effects of different algorithms. PV and APD are defined as shown in equations (1) and (2).
PV = (1/l) Σ_{i=1}^{l} (P_ij − P_uj)²    (1)
APD = (1/l) Σ_{i=1}^{l} d(loc_i, loc_u)    (2)
where P_uj is the POI query frequency corresponding to the user's real location, P_ij is the POI query frequency corresponding to the i-th false location in the location anonymity set, k1 denotes the k1-means coefficient in Algorithm 1, and l denotes the l-diversity coefficient in Algorithm 1.
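The two evaluation metrics can be computed as follows under one plausible reading of their textual definitions — PV as the mean squared deviation between P_uj and each P_ij, and APD as the mean road-network distance from each false location to the real one; class and method names are illustrative:

```java
public class Metrics {
    // Pseudo-variance, read as PV = (1/l) * Σ_{i=1..l} (P_ij - P_uj)^2,
    // where pUj is the POI query frequency at the real location and
    // pIj[i] the frequency at the i-th false location.
    static double pseudoVariance(double pUj, double[] pIj) {
        double sum = 0;
        for (double p : pIj) sum += (p - pUj) * (p - pUj);
        return sum / pIj.length;
    }

    // Average path distance, read as the mean road-network distance
    // from each of the l false locations to the true location.
    static double averagePathDistance(double[] roadDistances) {
        double sum = 0;
        for (double d : roadDistances) sum += d;
        return sum / roadDistances.length;
    }

    public static void main(String[] args) {
        System.out.println(pseudoVariance(0.5, new double[]{0.4, 0.6}));   // ≈ 0.01
        System.out.println(averagePathDistance(new double[]{100, 300}));   // 200.0
    }
}
```

A lower PV means the false locations' query frequencies track the real one (more realistic decoys); a higher APD means the decoys are more spatially dispersed.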
The degree of association between two POI categories is defined as shown in equation (3).
AD(POI_i, POI_j) = Fnum(POI_i → POI_j) / Σ_{k=1}^{n} Fnum(POI_i → POI_k)    (3)
where Fnum(POI_i → POI_j) denotes the number of accesses from POI_i to POI_j during time period t, and n denotes the total number of POIs within the same grid area. The value of the function AD depends on the ratio of the access frequency from POI_i to POI_j to the total access frequency from POI_i to all other POI points.
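Reading equation (3) as the share of transitions leaving POI_i during period t that arrive at POI_j, the association degree can be computed as below (a hedged sketch with illustrative names):

```java
public class AssociationDegree {
    // AD(POI_i, POI_j) = Fnum(i -> j) / Σ_k Fnum(i -> k): the fraction of
    // transitions leaving POI_i during period t that arrive at POI_j,
    // over the n POIs in the same grid area.
    static double ad(int j, int[] fnumFromI) {
        double total = 0;
        for (int f : fnumFromI) total += f;
        return total == 0 ? 0 : fnumFromI[j] / total;
    }

    public static void main(String[] args) {
        // POI_i was followed by POI_0 twice, POI_1 six times, POI_2 twice.
        System.out.println(ad(1, new int[]{2, 6, 2}));  // 0.6
    }
}
```

An AD of 0 between a false query and the real query, as reported for DPLQ in FIG. 4, means the decoy queries share no such transition pattern with the real one.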
Pseudo-variance comparison of the algorithms; PV represents the degree of deviation of the constructed false locations from the user's real location. The smaller the PV, the greater the uncertainty in the generated location data set and the more realistic the false locations. FIG. 2 reflects the PV differences of the four algorithms for different numbers of elements l in the location data set, when k1 = 50 and the grid area where the user is located is 3 km × 3 km.
As can be seen from fig. 2, the DPLQ algorithm is always superior to the other three algorithms regardless of the value of l. The comparison therefore shows that the false locations generated by the DPLQ algorithm are more realistic and can well resist inference attacks. When l = 10, the PV values of the DPLQ, T-SR and H-Star algorithms are much smaller than that of the Mobimix algorithm, because when constructing the location data set all three algorithms take the user's real location into account, considering both the reasonableness of the false locations and the diversity of POI semantics in the location data set. As the value of l increases, the differences among the four algorithms become smaller and smaller: the privacy protection budget is fixed and the grid area in which the location data set is constructed is also fixed, so as l keeps increasing, the four algorithms select more false locations with high similarity to construct the location data set, and the PV differences among the algorithms shrink.
Average path distance comparison of the algorithms; APD represents the average road network distance from the false locations to the true location. The larger and more dispersed the false-location distribution area, the harder it is for a malicious attacker to obtain the user's real location data from the location data set. FIG. 3 reflects the influence of the change of l on the APD of the different algorithms when the grid area where the user is located is 3 km × 3 km.
As can be seen from fig. 3, the APD of the DPLQ algorithm is larger than that of the other three algorithms, which means that the false locations generated by the DPLQ algorithm are more dispersed. The APD differences among the four algorithms become smaller as l increases. This is because, in the experiment, the grid area where the user is located is unchanged; as l increases, the four algorithms select more similar false locations, so the APD differences among the algorithms decrease.
Association degree comparison between false queries and real queries. Association degree refers to the association between the generated false POI queries in the query k2-anonymity set and the user's real POI queries. We compare the association degrees between false queries and real queries for the DPLQ, T-SR, H-Star and Mobimix algorithms respectively.
As can be seen from FIG. 4, the association degrees between the false queries and the real queries of the DPLQ and T-SR algorithms are both 0; both algorithms take into account the temporal association between the query request and the user's location, and generate false queries that are not associated with the user's real queries, so they can resist the spatio-temporal correlation attacks of a malicious attacker. The experimental results of the DPLQ and T-SR algorithms are clearly superior to those of the other two algorithms.
Influence of experimental parameters on running time; the relevant parameters in the experiments are: the number of clusters k1, the number of elements l in the location data set, the number of elements k2 in the query anonymity set, and the privacy protection budget ε. Because the number of clusters k1 has no direct influence on the experimental results, k1 = 50 is kept unchanged in the experiments. l, k2 and ε compose the privacy protection triplet <l, k2, ε>. FIG. 5 shows the effect of varying ε and l on the average running time of the algorithm when k2 = 10 is fixed.
As shown in fig. 5, with ε fixed, as the value of l increases more false locations need to be generated, so the running time becomes longer; with l fixed, viewing the graph from bottom to top, it can be seen that as ε decreases, the degree of privacy protection becomes higher, the amount of added noise increases, and the algorithm running time becomes longer.
FIG. 6 shows the effect of varying ε and k2 on the average running time of the algorithm when l = 8 is fixed; the experiment assumes n = 10.
As shown in fig. 6, with k2 fixed, the relation between ε and the algorithm running time can be obtained in the same way; with ε fixed, the running time grows as k2 increases. When k2 = 12 and k2 = 14, the algorithm running time is significantly higher than that at k2 = 10. Since the experiment assumes n = 10, when k2 < n it suffices to directly screen, within the cluster, the k2 − 1 query requests sent at the same time by users with small location similarity, without considering the influence of historical query results on the construction of the anonymity set, so the running time is significantly less than that when k2 > n.
Fig. 7 shows the difference in PV values for the DPLQ algorithm when setting different privacy preserving parameters.
As can be seen from fig. 7, when ε = 0.5 and l = 12 the PV value of the algorithm is smallest, which means the deviation of the constructed false locations from the user's real location is smallest, so the LSP can provide better location services for the user without revealing the user's location privacy information.
The present application provides a differential-privacy-based LBS service privacy protection scheme, DPLQ. The scheme includes two algorithms, the DP-(k1,l)-means algorithm and the DP-k2-anonymity algorithm, which can effectively protect location privacy and query privacy in LBS service requests. The scheme considers the influence of background knowledge and spatio-temporal correlation on the privacy protection algorithm and defines two privacy protection models; the strength of privacy protection can be defined by users according to their different privacy requirements. Consequently, a malicious attacker can hardly obtain the user's private information from the constructed location data set and query k-anonymity set; the constructed false locations are more dispersed and the false queries more authentic, thereby resisting homogeneity attacks, background knowledge attacks, inference attacks and spatio-temporal correlation attacks. Experiments show that the algorithm has clear advantages in pseudo-variance, average path distance, and association degree between false queries and real queries, has good scalability, and can effectively protect users' LBS privacy information. In future work, privacy measurement and hierarchical protection will be applied to the user's location and issued query requests, so as to protect the user's LBS requests more precisely without losing quality of service.
While embodiments of the invention have been described above, the invention is not limited to the applications set forth in the description and the embodiments; it is fully applicable in the various fields of endeavor to which the invention pertains, and further modifications may readily be made by those skilled in the art. The invention is therefore not limited to the details shown and described herein, without departing from the general concept defined by the appended claims and their equivalents.

Claims (2)

1. A LBS service privacy protection method based on differential privacy is characterized by comprising the following steps:
step one, constructing a Voronoi diagram according to map information, so that each Voronoi polygon only comprises one POI;
step two, calculating the number of users contained in each Voronoi polygon by combining the position data set X, and performing descending arrangement on the Voronoi polygons according to the number of the users;
step three, selecting the top k1 Voronoi polygons with the largest numbers of users, and taking their centroids Oj as the initial cluster centers of the k-means algorithm;
step four, calculating the Euclidean distance dij between the position of each user in the location data set X and each initial cluster center Oj;
Step five, dividing the users into cluster clusters with the minimum Euclidean distance;
step six, after all users have been divided, recalculating the centroids of the k1 clusters;
step seven, if the distance between the two centroids is smaller than a set threshold value, using the original cluster center, and ending the cycle; and if the distance between the two centroids is greater than the threshold value, using the updated centroid as the cluster center, and skipping to the step four.
Step eight, returning the cluster center O where the query user is locatedjAnd l-1 centroids closer thereto, constituting a location data set with l false locations;
adding Laplace noise to the l cluster centers, constructing a position anonymity set containing l false positions by using the position points subjected to noise addition, and replacing the real positions of the users to send the position anonymity set to the LSP;
step nine, determining a cluster CjThe number of users sending requests in the middle t period;
step ten, calculating a cluster CjThe position similarity S between each position point and the user positionlAnd arranged in ascending order;
step eleven according to SlGet the corresponding query request q (q) in ascending order1,q2,…,qn);
Step twelve, the real inquiry request q of the userxPut query k2-in an anonymous set QA;
thirteen step of judging qiIf it is present in QA, if it is not present, the q is addediAdding into QA; if so, comparing the next query request q according to the query request sequence in step threei+1Until k is included in QA2An element;
fourteen steps, if cluster CjAll of q in (1)iAll added into QA, | QA | is still insufficient k2Then, the historical query request probabilities Pr (qi) in the same time period t are sorted in a descending order to obtain (q)1,q2,…,qm);
Step fifteen, skipping to step thirteen, and continuing to judge qiIf present in QA, when | QA | ═ k2Stopping circulation;
sixthly, performing privacy protection on the QA by using an index mechanism meeting the requirement of differential privacy, and strictly controlling the output probability of each candidate item in the QA.
Taking 5 meters; i, identifying query requests sent by different users as integers; j identifies different clustering clusters and clustering centers, and is an integer; n represents a cluster C where a user is located during a period tjThe number of query requests sent in (1) is an integer; m represents the number of query requests issued in the history in the period t, and is an integer.
2. The differential privacy-based LBS service privacy protection method of claim 1, wherein neighboring users in the same cluster may all use the anonymous set in place of their true location information.
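The clustering loop of claim 1, steps four to seven, can be sketched as follows (a minimal illustration assuming 2-D coordinates and hypothetical names; the Voronoi-based initialization of steps one to three is taken as given, and this is not the patented implementation):

```java
import java.util.*;

public class ClusterSketch {
    // Steps four to seven: assign each user to the nearest cluster center by
    // Euclidean distance, recompute centroids, and stop once every center
    // moves less than the threshold.
    static double[][] kMeans(double[][] users, double[][] centers, double threshold) {
        while (true) {
            // Step five: divide users into the cluster with minimum distance.
            List<List<double[]>> clusters = new ArrayList<>();
            for (int j = 0; j < centers.length; j++) clusters.add(new ArrayList<>());
            for (double[] u : users) {
                int best = 0;
                for (int j = 1; j < centers.length; j++)
                    if (dist(u, centers[j]) < dist(u, centers[best])) best = j;
                clusters.get(best).add(u);
            }
            // Step six: recompute the centroid of each non-empty cluster.
            double maxShift = 0;
            for (int j = 0; j < centers.length; j++) {
                if (clusters.get(j).isEmpty()) continue;
                double[] c = new double[]{0, 0};
                for (double[] u : clusters.get(j)) { c[0] += u[0]; c[1] += u[1]; }
                c[0] /= clusters.get(j).size();
                c[1] /= clusters.get(j).size();
                maxShift = Math.max(maxShift, dist(c, centers[j]));
                centers[j] = c;
            }
            // Step seven: end the cycle when no centroid moved more than the threshold.
            if (maxShift < threshold) return centers;
        }
    }

    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    public static void main(String[] args) {
        double[][] users = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};  // would come from Voronoi centroids
        System.out.println(Arrays.deepToString(kMeans(users, centers, 1e-6)));
        // [[0.0, 0.5], [10.0, 10.5]]
    }
}
```

The converged centroids are what steps eight and beyond would perturb with Laplace noise to form the location anonymity set.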
CN202010690224.5A 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy Active CN111797433B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010690224.5A CN111797433B (en) 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy


Publications (2)

Publication Number Publication Date
CN111797433A true CN111797433A (en) 2020-10-20
CN111797433B CN111797433B (en) 2023-08-29

Family

ID=72808687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010690224.5A Active CN111797433B (en) 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy

Country Status (1)

Country Link
CN (1) CN111797433B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035880A (en) * 2020-09-10 2020-12-04 辽宁工业大学 Track privacy protection service recommendation method based on preference perception
CN112767693A (en) * 2020-12-31 2021-05-07 北京明朝万达科技股份有限公司 Vehicle driving data processing method and device
CN113407870A (en) * 2021-06-17 2021-09-17 安徽师范大学 Semantic and space-time correlation based road network LBS interest point query privacy protection method

Citations (8)

Publication number Priority date Publication date Assignee Title
US20140090023A1 (en) * 2012-09-27 2014-03-27 Hong Kong Baptist University Method and Apparatus for Authenticating Location-based Services without Compromising Location Privacy
CN104394509A (en) * 2014-11-21 2015-03-04 西安交通大学 High-efficiency difference disturbance location privacy protection system and method
CN109379718A (en) * 2018-12-10 2019-02-22 南京理工大学 Complete anonymous method for secret protection based on continuous-query location-based service
CN109413067A (en) * 2018-10-29 2019-03-01 福建师范大学 A kind of inquiry method for protecting track privacy
CN110062324A (en) * 2019-03-28 2019-07-26 南京航空航天大学 A kind of personalized location method for secret protection based on k- anonymity
CN110300029A (en) * 2019-07-06 2019-10-01 桂林电子科技大学 A kind of location privacy protection method of anti-side right attack and position semantic attacks
CN110855375A (en) * 2019-12-02 2020-02-28 河海大学常州校区 Source node privacy protection method based on position push in underwater acoustic sensor network
CN111339091A (en) * 2020-02-23 2020-06-26 兰州理工大学 Position big data differential privacy division and release method based on non-uniform quadtree


Non-Patent Citations (3)

Title
Zhang Qingyun, "Research on Privacy Protection of Service Requests Based on LBS Systems", China Master's Theses Full-text Database, Information Science and Technology, no. 02, pages 1-58 *
Xu Qiyuan et al., "Hybrid Location Privacy Protection Based on Differential Privacy", Computer Applications and Software, vol. 36, no. 06, pages 296-301 *
Li Weihao et al., "Privacy Protection Scheme Based on Privacy Self-correlation in Location-based Services", Journal on Communications, vol. 40, no. 05, pages 57-66 *


Also Published As

Publication number Publication date
CN111797433B (en) 2023-08-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant