CN111797433A - LBS service privacy protection method based on differential privacy - Google Patents

LBS service privacy protection method based on differential privacy

Info

Publication number
CN111797433A
CN111797433A (application number CN202010690224.5A)
Authority
CN
China
Prior art keywords
user
query
privacy
cluster
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010690224.5A
Other languages
Chinese (zh)
Other versions
CN111797433B (en)
Inventor
史伟
张青云
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning University of Technology
Original Assignee
Liaoning University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning University of Technology filed Critical Liaoning University of Technology
Priority to CN202010690224.5A priority Critical patent/CN111797433B/en
Publication of CN111797433A publication Critical patent/CN111797433A/en
Application granted granted Critical
Publication of CN111797433B publication Critical patent/CN111797433B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses an LBS service privacy protection method based on differential privacy. Using the background knowledge of the cluster and the area where the user is located, obtained in the DP-(k1, l)-means algorithm, a query k-anonymity set satisfying differential privacy is constructed and sent to the LSP in place of the user's real query request, thereby protecting the user's query privacy and resisting an attacker's inference attacks and spatio-temporal correlation attacks. In the DP-k2-anonymity algorithm, query requests sent within the same time period t are selected, in combination with temporal features, to construct the query k-anonymity set, and the data set is processed with an exponential mechanism, which ensures that the query requests are temporally plausible and keeps the user's privacy from being disclosed.

Description

LBS service privacy protection method based on differential privacy
Technical Field
The invention relates to a privacy protection method, in particular to an LBS service privacy protection method based on differential privacy.
Background
With the popularization of mobile networks and the development of positioning technology, location-based services (LBS) have been embraced by a large number of users. These service systems greatly facilitate people's lives, but they often collect the request data users send without the users' knowledge, and the analysis and processing of that data lead to leakage of users' private information. An LBS service request contains a large amount of private information, such as the user's identity information, location information, and point-of-interest information, which may all be exposed when the query request is issued. How to protect users' private information without affecting their query results is therefore an important current research and development direction.
LBS requests in a social network place strict demands on background knowledge; in existing LBS privacy protection schemes, background knowledge refers to the historical query probability of a point of interest (POI) on the map. If an attacker possesses certain background knowledge, the accuracy of an attack can be greatly improved, so avoiding background knowledge attacks is extremely important when designing LBS privacy protection schemes. Differential privacy is a privacy protection mechanism that is unaffected by an attacker's background knowledge and by changes to specific data; combining this mechanism with existing privacy protection schemes better solves the problem that traditional privacy protection methods cannot resist background knowledge attacks.
The Location-based service request sent by the user to the LBS Service Platform (LSP) includes sensitive data such as the user's identity information, the current Location information, the time when the query request is sent, and the queried information of the point of interest. The location information and other information deduced according to the location information belong to location privacy, the information related to the request content belongs to query privacy, and both the location privacy and the query privacy belong to the category of LBS privacy protection.
To better protect the privacy information in LBS requests, Gruteser et al first applied k-anonymization techniques to location privacy protection, resulting in a location k-anonymization model: when the position of a moving object at a certain moment cannot be distinguished from the positions of other k-1 users, the position is said to satisfy the position k-anonymity. Niu et al consider road network environment based on a randomization method, select candidate regions by means of circular region segmentation and mesh expansion, and generate pseudo positions according to position semantic information. Although the above algorithms are continuously improved, the common disadvantages are that the background knowledge that an attacker may possess is not considered, and the position information sent by a user to an LSP does not consider the position semantics, so that the generated position anonymity set contains many false position points which can be directly eliminated, the inference attack of the attacker cannot be resisted, the communication overhead is increased, and the service quality is influenced.
For query privacy, the more authentic and diverse the generated false queries are, the higher the protection level of the user's query privacy. The Dummy-Q model proposed by Pingley et al. constructs dummy queries from the query context, the user's motion model, query semantics, and other conditions, so that an attacker cannot distinguish the user's real query from the dummy ones. However, these query privacy protection algorithms do not take the user's real query request into account: if the user's real query is not among the POIs with high query probability, the returned query results are useless to the user.
To address these problems, the DPLQ privacy protection scheme is proposed. The scheme combines a differential privacy mechanism, the k-means algorithm, an l-diversity algorithm, and a k-anonymity algorithm to effectively protect the user's location privacy and query privacy without being affected by the background knowledge possessed by an attacker.
Disclosure of Invention
The invention designs and develops an LBS service privacy protection method based on differential privacy, which effectively protects the user's location privacy and query privacy without being affected by the background knowledge possessed by an attacker.
An LBS service privacy protection method based on differential privacy comprises the following steps:
step one, constructing a Voronoi diagram from the map information so that each Voronoi polygon contains only one POI;
step two, counting, with the location data set X, the number of users contained in each Voronoi polygon, and arranging the Voronoi polygons in descending order of user count;
step three, selecting the k1 Voronoi polygons with the largest numbers of users and taking their centroids Oj as the initial cluster centers of the k-means algorithm;
step four, calculating the Euclidean distance dij between each user location in the location data set X and each initial cluster center Oj;
step five, assigning each user to the cluster at the minimum Euclidean distance;
step six, after all users have been assigned, recalculating the centroids of the k clusters;
step seven, if the distance between the two centroids is smaller than a set threshold, using the original cluster centers and ending the loop; if the distance between the two centroids is larger than the threshold, using the updated centroids as the cluster centers and jumping back to step four;
step eight, returning the cluster center Oj where the querying user is located and the l-1 centroids closest to it, forming a location data set with l false locations; adding Laplace noise to the l cluster centers, constructing from the noised location points a location anonymity set containing l false locations, and sending it to the LSP in place of the user's real location;
step nine, determining the number of users in cluster Cj that send requests during time period t;
step ten, calculating the position similarity Sl between each location point in cluster Cj and the user's location, and arranging in ascending order;
step eleven, obtaining the corresponding query requests q (q1, q2, …, qn) in ascending order of Sl;
step twelve, putting the user's real query request qx into the query k2-anonymity set QA;
step thirteen, judging whether qi is already in QA; if not, adding qi to QA; if so, comparing the next query request qi+1 in the query request order of step eleven, until QA contains k2 elements;
step fourteen, if all qi in cluster Cj have been added to QA and |QA| is still smaller than k2, arranging the historical query request probabilities Pr(qi) within the same time period t in descending order to obtain (q1, q2, …, qm);
step fifteen, jumping back to step thirteen and continuing to judge whether qi is in QA, stopping the loop when |QA| = k2;
step sixteen, performing privacy protection on QA with an exponential mechanism satisfying differential privacy, strictly controlling the output probability of each candidate in QA.
The threshold is preferably taken as 5 meters. i identifies the query requests sent by different users and is an integer; j identifies the clusters and cluster centers and is an integer; n represents the number of query requests sent in the user's cluster Cj during time period t and is an integer; m represents the number of query requests sent historically during time period t and is an integer.
As a further preference, neighboring users in the same cluster may each use the anonymous set in place of their true location information.
The invention has the following beneficial effects:
aiming at the position privacy protection of a user, the application provides DP- (k)1L) -means algorithm. The algorithm combines a differential privacy mechanism with a k-means algorithm, performs Voronoi graph pre-division on a road network, constructs k cluster clusters according to divided Voronoi polygons and user position points contained in each polygon, selects l cluster centers for noise processing by using an l-diversity idea, and sends noisy position points to an LBS server instead of the real position of a user, so that the problem that a malicious attacker intercepts user privacy information and the LBS server is not completely credible in the process of sending the LBS service by the user is avoided, and the purpose of protecting the user position privacy is achieved.
For the query privacy protection of the user, this application proposes the DP-k2-anonymity algorithm. A query k-anonymity set is constructed from the query information of neighboring users in the same cluster during the same time period and from the historical query probabilities of the POIs in the area, and noise is added to the query k-anonymity set with an exponential mechanism, thereby protecting the users' query privacy.
Unlike existing privacy protection schemes that can protect only the user's location privacy, this personalized privacy protection scheme protects the user's LBS privacy information while taking background knowledge attacks into account, allowing users to control the degree of privacy protection themselves and meeting the different privacy protection requirements of different users.
Drawings
FIG. 1 is a schematic diagram of the centralized architecture used by the present invention.
FIG. 2 compares algorithms on the degree of deviation of the false locations from the user's real location.
FIG. 3 compares algorithms on the average road network distance from the false locations to the real location.
FIG. 4 compares algorithms on the correlation between the false queries and the real query.
FIG. 5 compares the average running time of the algorithms with k2 fixed.
FIG. 6 compares the average running time of the algorithms with the time period fixed.
FIG. 7 shows the deviation of the false locations constructed by the DPLQ algorithm from the user's real location.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
A POI is an attribute that distinguishes a location beyond its bare geographic coordinates. For example, when user A sends the service request "query the nearest movie theater" to the LSP from a shopping mall, the shopping mall is the POI of the user's location, the mall's geographic position is the user's real location, the movie theater is the POI of the user's query request, and the movie theater's position is the geographic location of the query request. In short, a POI represents semantic information such as a basic building or other landmark facility at a geographic location, and can serve as a keyword of a user's service request.
The background knowledge includes information such as the POI at a given location point, the service requests issued during a given time period, and the query probability of each service request; any client with computing and storage capability can obtain the background knowledge of a location point. The background knowledge in this application is obtained through the Overpass API of OpenStreetMap.
An LBS-based query request requires the user to obtain service based on the current location. The user first sends a query service request Q = {ID, Loc, POI, t, QPOI, QLoc} to a trusted third-party server (TTP), and the TTP applies privacy protection to the location information and the query information separately. In the user's query request Q, ID is the user's identifier, which can directly determine a unique individual; Loc, POI, QLoc and QPOI are the user's quasi-identifiers, from which a minimal identifying attribute set of the user can be obtained by joining with external tables.
Loc represents the user's position when the query request is sent, POI the point of interest of the current position, QPOI the point of interest the user wants to query, and QLoc the position of the queried point of interest. t represents the time at which the user issues the query request.
Existing LBS privacy protection system architectures fall roughly into three categories: centralized, distributed, and hybrid. This application adopts the centralized architecture, currently the most widely used; it consists of the client, the TTP (trusted third-party server), and the LSP, and its schematic diagram is shown in FIG. 1.
When a user sends a service request to the LSP, the data first passes through the TTP server for privacy protection. A pseudonym module generates a pseudonym for the user, and a new pseudonym is generated for each query to prevent an attacker's inference attacks; a location generalization module generalizes the user's location data and sends the k-means cluster centers to the LSP in place of the user's real location; a query anonymization module applies k-anonymity to the user's query information and sends k query items to the LSP simultaneously. The LSP processes the received anonymous request and returns a service result data set to the TTP, and the TTP's query result refinement module returns to the user, based on the user's real data, the query results that meet the user's needs.
For the location privacy protection of the user, the map is partitioned with a Voronoi diagram, and k-means and l-diversity are adopted as the basic ideas of privacy protection. The (k, l)-means privacy protection model is thus defined as follows:
(k, l)-means privacy protection model. If Loc1 and Loc2 satisfy the following conditions:
(1) |Loc1| = k, where Loc1 represents the generalized set of user locations;
(2) Loc1 ∈ DESC(O1, O2, …, Ok), where DESC(O1, O2, …, Ok) represents the descending ordering of all cluster centers;
(3) Loc2 ⊆ Loc1 and |Loc2| = l, where Loc2 denotes the set of locations sent to the LSP;
then the location data in Loc2 satisfy the (k, l)-means privacy protection model.
Condition 1 states that the generalized location set Loc1 contains k data records, obtained by clustering the users in the area during the same time period into k clusters; condition 2 states that the data in Loc1 are arranged in descending order by cluster center Oi; condition 3 states that l location points are selected from Loc1 to form the set Loc2, which is sent to the LSP as the user's false locations.
For the query privacy protection of the user, a query k-anonymity privacy protection model is proposed based on the idea of location k-anonymity. The model uses a local k-anonymity algorithm to generalize the user's real query request into an anonymous query set, and generalizes the attribute values of different users into different, relatively independent generalization data sets, which prevents over-generalization of user data from degrading the service.
Query k-anonymity privacy protection model. The user's real query information is generalized into an anonymous query data set so that the user's query cannot be distinguished from the other k-1 records, and the generated anonymity set is processed with an exponential mechanism so that the probability of successfully identifying the user's query request is below 1/k.
The DP-(k1, l)-means algorithm combines differential privacy with the k-means algorithm and protects the user's location privacy from known conditions such as the map information M, the user's real location information x, and the location data set X.
First, the map is pre-partitioned with a Voronoi diagram and k1 clusters are constructed from the Voronoi polygons; l cluster centers are then selected following the l-diversity idea and Laplace noise is added; the noised location points form a location anonymity set containing l false locations, which is sent to the LSP instead of the user's real location. All neighboring users in the same cluster can use this anonymity set in place of their real location information, which makes the algorithm reciprocal, saves service response time and computation cost, and effectively resists an attacker's homogeneity attacks and background knowledge attacks.
DP-(k1, l)-means algorithm
Input: map information M, the user's real location information x, location data set X
Output: l false locations
[The pseudocode appears only as an image in the original; the procedure is spelled out in steps one to eight below.]
Here Oj is an initial cluster center of the k-means algorithm, xi represents a sample point in the location data set X, dij represents the Euclidean distance from sample point xi to Oj, Cj represents the cluster corresponding to cluster center Oj, Oj' denotes the updated centroid of cluster Cj, and Lap(λ) represents Laplace noise satisfying ε-differential privacy.
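As an illustration of the noise step, the following is a minimal Python sketch of sampling Lap(λ) noise and perturbing a cluster-center coordinate pair. It is not the patent's implementation; the patent does not state the sensitivity Δf, so a unit sensitivity (λ = Δf/ε with Δf = 1) is assumed here.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from Lap(0, scale) via the inverse-CDF transform."""
    u = random.random() - 0.5   # uniform on [-0.5, 0.5); u == -0.5 has negligible probability
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_center(center, epsilon, sensitivity=1.0):
    """Add Lap(sensitivity/epsilon) noise to each coordinate of a cluster
    center; the noised point is what replaces the user's real location.
    sensitivity=1.0 is an assumption, not a value given by the patent."""
    scale = sensitivity / epsilon
    x, y = center
    return x + laplace_noise(scale), y + laplace_noise(scale)
```

A smaller ε means a larger noise scale λ and therefore stronger privacy at the cost of less accurate false locations.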
The data processing procedure of the DP-(k1, l)-means algorithm is described in detail below.
Step one, constructing a Voronoi diagram according to map information, so that each Voronoi polygon only comprises one POI;
step two, calculating the number of users contained in each Voronoi polygon by combining the position data set X, and performing descending arrangement on the Voronoi polygons according to the number of the users;
step three, selecting front k with large number of users1A Voronoi polygon having its centroid OjAs the initial clustering center of the k-means algorithm;
step four, calculating the position of each user in the position data set X and the initial clustering center OjEuclidean distance of dij
Step five, dividing the users into cluster clusters with the minimum Euclidean distance;
sixthly, after all the users are divided, recalculating the centroids of the k clusters;
step seven, if the distance between the two centroids is smaller than a set threshold value, using the original cluster center, and ending the cycle; and if the distance between the two centroids is greater than the threshold value, using the updated centroid as the cluster center, and skipping to the fourth step.
Step eight, returning the cluster center O where the query user is located by the algorithmjAnd l-1 centroids closer thereto, constituting a location data set with l ghost locations.
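Steps four to eight above can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: locations are treated as planar coordinates, and `threshold` plays the role of the distance threshold (the patent suggests 5 meters).

```python
import math

def kmeans_with_threshold(points, centers, threshold=5.0):
    """Steps four to seven: assign each point to its nearest center,
    recompute centroids, and stop once every centroid has moved less
    than the distance threshold."""
    while True:
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[j].append(p)
        new_centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if all(math.dist(a, b) < threshold for a, b in zip(centers, new_centers)):
            return new_centers, clusters
        centers = new_centers

def false_location_candidates(centers, user_location, l):
    """Step eight: the querying user's own cluster center plus the
    l-1 centroids closest to it."""
    return sorted(centers, key=lambda c: math.dist(c, user_location))[:l]
```

In the scheme itself, the initial `centers` would be the centroids of the k1 most populated Voronoi polygons, and the returned l candidates would then receive Laplace noise before being sent to the LSP.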
The DP-k2-anonymity algorithm combines differential privacy with the k-anonymity algorithm. The user customizes the value of k2 according to their privacy requirement: the larger k2 is, the better the privacy protection effect, but the lower the accuracy of the service.
The threshold value can be set according to experimental requirements, and preferably, the threshold value is 5 meters.
i identifies the query requests sent by different users and is an integer; j identifies the clusters and cluster centers and is an integer; n represents the number of query requests sent in the user's cluster Cj during time period t and is an integer; m represents the number of query requests sent historically during time period t and is an integer.
Using the background knowledge of the user's cluster and area obtained in the DP-(k1, l)-means algorithm, a query k-anonymity set conforming to differential privacy is constructed and sent to the LSP to avoid revealing the user's real query request, thereby protecting the user's query privacy and preventing an attacker's inference attacks and spatio-temporal correlation attacks.
DP-k2-anonymity algorithm
Input: the k2 value, the querying user's cluster Cj, the Voronoi diagram
Output: query k2-anonymity set QA
[The pseudocode appears only as an image in the original; the procedure is spelled out in steps one to eight below.]
Here n represents the number of users in cluster Cj that send requests during time period t; ASCE(S1, S2, …, Sn) denotes the ascending ordering of the position similarities S of the n location points; qi denotes the query request sent during period t by the location point whose position similarity is Si; qx denotes the querying user's request during period t; Pr(qi) denotes the historical query probability of qi during period t, which can be obtained from background knowledge; and DESC(Pr(q1), Pr(q2), …, Pr(qn)) denotes the descending ordering of the Pr(qi).
The position similarity mentioned in the DP-k2-anonymity algorithm can be calculated with the following formula:
[The formula appears only as an image in the original; per the surrounding description, Sl grows as the Euclidean distance shrinks, e.g. Sl = 1 / d(Ui, Uj).]
Here Ui and Uj represent two different users in a cluster, and d(Ui, Uj) represents the Euclidean distance between the two users; the larger Sl is, the more similar the two users are. Within the same time period, the query request sent by any user in the cluster can be replaced by the query k2-anonymity set.
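A minimal sketch of this position-similarity computation follows. The inverse-distance form is an assumption: the patent's formula survives only as an image, and only the qualitative rule that similarity grows as distance shrinks is stated in the text.

```python
import math

def position_similarity(u_i, u_j):
    """Similarity between two users' locations; assumed to be the inverse
    of their Euclidean distance (closer users score higher)."""
    d = math.dist(u_i, u_j)
    return math.inf if d == 0 else 1.0 / d

def neighbors_by_similarity(user, others):
    """ASCE(S1, ..., Sn): neighbour locations sorted by ascending similarity,
    i.e. farthest neighbours first."""
    return sorted(others, key=lambda o: position_similarity(user, o))
```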
The data processing procedure of the DP-k2-anonymity algorithm is described in detail below.
Step one, determining from known background knowledge the number of users in cluster Cj that send requests during time period t;
step two, calculating the position similarity Sl between each location point in cluster Cj and the user's location, and arranging in ascending order;
step three, obtaining the corresponding query requests q (q1, q2, …, qn) in ascending order of Sl;
step four, putting the user's real query request qx into the query k2-anonymity set QA;
step five, judging whether qi is already in QA; if not, adding qi to QA; if so, comparing the next query request qi+1 in the query request order of step three, until QA contains k2 elements;
step six, if all qi in cluster Cj have been added to QA and |QA| is still smaller than k2, arranging the historical query request probabilities Pr(qi) within the same time period t in descending order to obtain (q1, q2, …, qm);
step seven, jumping back to step five and continuing to judge whether qi is in QA, stopping the loop when |QA| = k2.
Step eight, performing privacy protection on QA with an exponential mechanism satisfying differential privacy, strictly controlling the output probability of each candidate in QA.
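The QA construction (steps four to seven) and the exponential mechanism of step eight can be sketched as follows. This is an illustrative sketch: the patent does not specify the exponential mechanism's scoring function or sensitivity, so both are assumptions here, and the neighbour and history query lists are assumed to arrive already sorted (ascending similarity, descending Pr respectively).

```python
import math
import random

def build_query_anonymity_set(q_x, neighbor_queries, history_queries, k2):
    """Steps four to seven: seed QA with the real query q_x, add distinct
    neighbour queries; if the cluster is exhausted before |QA| = k2,
    fall back to historical queries."""
    qa = [q_x]
    for q in list(neighbor_queries) + list(history_queries):
        if len(qa) == k2:
            break
        if q not in qa:
            qa.append(q)
    return qa

def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0):
    """Step eight: output each candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity)), so no candidate, including
    the real query, is revealed deterministically."""
    weights = [math.exp(epsilon * s / (2.0 * sensitivity)) for s in scores]
    r = random.random() * sum(weights)
    acc = 0.0
    for cand, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return cand
    return candidates[-1]
```

With a large ε the highest-scoring candidate is output almost surely; with a small ε the output distribution flattens toward uniform over QA.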
Security analysis
(1) Resisting homogeneity attacks. The basic idea of a homogeneity attack is to find multiple records in a data source that correspond to one individual and, at the same time, to a single sensitive attribute value.
The DP-(k1, l)-means algorithm combines the k-anonymity and l-diversity ideas to generate a location anonymity set that all users in the cluster can use. Even if an attacker obtains the location anonymity set constructed by the algorithm, the real information of the users in the cluster cannot be obtained, because the data in the anonymity set are all false locations surrounding the users in the cluster and contain no real user information. The scheme can therefore effectively resist homogeneity attacks.
(2) Resisting background knowledge attacks; the basic idea of the background knowledge attack is to find out a plurality of records corresponding to a certain data source from a plurality of data sources, and if the background knowledge of the data source is available, other sensitive attribute information corresponding to the data source may be found.
Differential privacy rests on a rigorous mathematical foundation: the result of processing a data set is not affected by any specific piece of data, and deleting any one record from the data set does not noticeably affect the computed result. Suppose the complete data set is D and the attacker already possesses all data except the information about the attack target, denoted as data set D'; D and D' are then neighboring data sets that differ in at most one record. The sensitivity ΔF of a query algorithm F is expressed as
ΔF=maxD,D'||F(D)-F(D')||
ΔF≤1
Colloquially, the algorithm sensitivity ΔF can be understood as the worst-case impact that adding or deleting one record has on the overall query result. The two formulas show that even after obtaining the maximum background knowledge, the attacker still cannot obtain the target's information, so differential privacy resists background knowledge attacks well. The algorithms of this application are combined with a differential privacy mechanism and can effectively resist an attacker's background knowledge attacks while guaranteeing service quality.
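For a counting query, the bound ΔF ≤ 1 can be checked directly: neighboring data sets differ in at most one record, so a count changes by at most 1. The following toy example (illustrative data, not from the patent's experiments) removes one record and observes the change.

```python
def count_query(dataset, predicate):
    """F(D): a counting query over the data set."""
    return sum(1 for record in dataset if predicate(record))

# Neighboring data sets D and D' differ in one record, so for a counting
# query Delta F = max |F(D) - F(D')| = 1.
D = ["q1", "q2", "q2", "q3"]
D_prime = D[:-1]                     # delete one record ("q3")
delta_f = abs(count_query(D, lambda r: r == "q3")
              - count_query(D_prime, lambda r: r == "q3"))
```

Adding Lap(ΔF/ε) noise to such a count then yields ε-differential privacy, which is the role Lap(λ) plays in the algorithms above.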
(3) Resisting inference attacks. The basic idea of an inference attack is that an attacker can deduce the user's possible location information and query requests from life experience, common sense, background knowledge, and other information.
For the user's location information, the location anonymity sets constructed by the algorithm all adopt false locations close to the user's location, so an attacker cannot deduce other private information of the user from the location information. For the user's query privacy, query requests sent by neighbouring users in the same cluster during the same time period are selected to construct a query k2-anonymity set, which both guarantees the authenticity of the queries and prevents an attacker from inferring the user's location or other information from the query content. Therefore, the algorithm provided in the present application can effectively avoid inference attacks.
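A minimal sketch of how such a query k2-anonymity set could be assembled (class and query names are illustrative assumptions; the fallback to historical queries mirrors the claimed construction, and this is not the exact patented implementation):

```java
import java.util.*;

public class QueryAnonymity {
    // Build a query k2-anonymity set QA: start from the user's real query,
    // then add distinct queries sent by neighbours in the same cluster during
    // the same period t; if the cluster cannot supply k2 distinct queries,
    // fall back to historical queries from period t (most frequent first).
    static List<String> buildQA(String realQuery, List<String> neighborQueries,
                                List<String> historicalQueries, int k2) {
        LinkedHashSet<String> qa = new LinkedHashSet<>();
        qa.add(realQuery);
        for (String q : neighborQueries) {
            if (qa.size() >= k2) break;
            qa.add(q);                    // set semantics skip duplicates
        }
        for (String q : historicalQueries) {
            if (qa.size() >= k2) break;
            qa.add(q);
        }
        return new ArrayList<>(qa);
    }

    public static void main(String[] args) {
        List<String> qa = buildQA("hospital",
                Arrays.asList("cafe", "hospital", "bank"),
                Arrays.asList("hotel", "cafe", "park"), 5);
        System.out.println(qa);  // [hospital, cafe, bank, hotel, park]
    }
}
```

Because every entry is a query actually issued by some nearby user in the same period, the set stays plausible to an observer while hiding which query is the real one.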
(4) Resisting spatio-temporal correlation attacks; spatio-temporal correlation attacks [22] mainly target query privacy. In the DP-k2-anonymity algorithm provided in the present application, query requests sent within the same time period t are selected, in combination with temporal features, to construct the query k2-anonymity set, and the data set is processed with the exponential mechanism, which ensures that the query requests are temporally reasonable and protects user privacy from disclosure.
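Selecting an output from the anonymity set under differential privacy is typically done with the exponential mechanism: each candidate in QA would be output with probability proportional to exp(ε·u/(2Δu)) for a utility score u with sensitivity Δu. The sketch below illustrates this (hypothetical names; a sketch under these assumptions, not the patented implementation):

```java
import java.util.Random;

public class ExponentialMechanism {
    // Exponential mechanism: candidate i is chosen with probability
    // proportional to exp(epsilon * scores[i] / (2 * deltaU)), where deltaU
    // is the sensitivity of the scoring function.
    static double[] selectionProbabilities(double[] scores, double epsilon, double deltaU) {
        double[] w = new double[scores.length];
        double sum = 0;
        for (int i = 0; i < scores.length; i++) {
            w[i] = Math.exp(epsilon * scores[i] / (2 * deltaU));
            sum += w[i];
        }
        for (int i = 0; i < w.length; i++) w[i] /= sum;  // normalize
        return w;
    }

    // Draw one candidate index according to the probabilities above.
    static int sample(double[] probs, Random rng) {
        double r = rng.nextDouble(), acc = 0;
        for (int i = 0; i < probs.length; i++) {
            acc += probs[i];
            if (r < acc) return i;
        }
        return probs.length - 1;
    }

    public static void main(String[] args) {
        double[] p = selectionProbabilities(new double[]{1.0, 1.0, 3.0}, 0.5, 1.0);
        System.out.printf("%.3f%n", p[0] + p[1] + p[2]);  // 1.000
    }
}
```

Higher-utility candidates are exponentially more likely, yet every candidate keeps nonzero probability, which is what controls the output probability of each item in QA.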
Experimental verification
In the experiments, the algorithms were written in Java and run on a 1.70 GHz Intel(R) Core(TM) i5 processor with 4 GB of memory under a 64-bit Windows 8 operating system. The data set used in the experiments comes from the road network of Oldenburg, Germany, and user information generated from the Foursquare website. The experimental data set includes 7035 roads, 6105 vertices, POI points of 4 to 9 categories, and 2.5 million query requests from different users.
The proposed method DPLQ is compared in the experiments with three algorithms: Mobimix, H-Star and T-SR. Mobimix is a mix-zone-based road network framework for protecting user location privacy; H-Star is a cloaking algorithm extending X-Star based on Hilbert rules; T-SR is a location privacy protection algorithm based on POI queries. The three algorithms are classical privacy protection algorithms based on different techniques and are well representative.
The experiments use Pseudo-Variance (PV), Average Path Distance (APD), and the Association Degree (AD) between false queries and real queries as evaluation criteria. These three criteria reflect the reasonableness and effectiveness of the false information generated by an algorithm and make it convenient to compare the experimental effects of different algorithms. PV and APD are defined as shown in equations (1) and (2).
PV = (1/l) Σ_{i=1}^{l} (P_ij − P_uj)²    (1)
APD = (1/l) Σ_{i=1}^{l} d(loc_i, loc_u)    (2)
where P_uj is the POI query frequency corresponding to the user's real location, P_ij is the POI query frequency corresponding to the i-th false location in the location anonymity set, k1 denotes the k1-means coefficient in Algorithm 1, and l denotes the l-diversity coefficient in Algorithm 1.
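The two evaluation metrics can be computed as follows under one plausible reading of their textual definitions — PV as the mean squared deviation between P_uj and each P_ij, and APD as the mean road-network distance from each false location to the real one; class and method names are illustrative:

```java
public class Metrics {
    // Pseudo-variance, read as PV = (1/l) * Σ_{i=1..l} (P_ij - P_uj)^2,
    // where pUj is the POI query frequency at the real location and
    // pIj[i] the frequency at the i-th false location.
    static double pseudoVariance(double pUj, double[] pIj) {
        double sum = 0;
        for (double p : pIj) sum += (p - pUj) * (p - pUj);
        return sum / pIj.length;
    }

    // Average path distance, read as the mean road-network distance
    // from each of the l false locations to the true location.
    static double averagePathDistance(double[] roadDistances) {
        double sum = 0;
        for (double d : roadDistances) sum += d;
        return sum / roadDistances.length;
    }

    public static void main(String[] args) {
        System.out.println(pseudoVariance(0.5, new double[]{0.4, 0.6}));   // ≈ 0.01
        System.out.println(averagePathDistance(new double[]{100, 300}));   // 200.0
    }
}
```

A lower PV means the false locations' query frequencies track the real one (more realistic decoys); a higher APD means the decoys are more spatially dispersed.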
The degree of association between two POI categories is defined as shown in equation (3).
AD(POI_i, POI_j) = Fnum(POI_i → POI_j) / Σ_{k=1}^{n} Fnum(POI_i → POI_k)    (3)
where Fnum(POI_i → POI_j) denotes the number of accesses from POI_i to POI_j during time period t, and n denotes the total number of POIs within the same grid area. The value of the function AD depends on the ratio of the access frequency from POI_i to POI_j to the total access frequency from POI_i to all other POI points.
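Reading equation (3) as the share of transitions leaving POI_i during period t that arrive at POI_j, the association degree can be computed as below (a hedged sketch with illustrative names):

```java
public class AssociationDegree {
    // AD(POI_i, POI_j) = Fnum(i -> j) / Σ_k Fnum(i -> k): the fraction of
    // transitions leaving POI_i during period t that arrive at POI_j,
    // over the n POIs in the same grid area.
    static double ad(int j, int[] fnumFromI) {
        double total = 0;
        for (int f : fnumFromI) total += f;
        return total == 0 ? 0 : fnumFromI[j] / total;
    }

    public static void main(String[] args) {
        // POI_i was followed by POI_0 twice, POI_1 six times, POI_2 twice.
        System.out.println(ad(1, new int[]{2, 6, 2}));  // 0.6
    }
}
```

An AD of 0 between a false query and the real query, as reported for DPLQ in FIG. 4, means the decoy queries share no such transition pattern with the real one.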
Pseudo-variance comparison of the algorithms; PV represents the degree of deviation of the constructed false locations from the user's real location. The smaller the PV, the greater the uncertainty in the generated location data set and the more realistic the false locations. FIG. 2 reflects the PV differences of the four algorithms for different numbers of elements l in the location data set, when k1 = 50 and the grid area where the user is located is 3 km × 3 km.
As can be seen from fig. 2, the DPLQ algorithm is always superior to the other three algorithms regardless of the value of l. The comparison therefore shows that the false locations generated by the DPLQ algorithm are more realistic and can well resist inference attacks. When l = 10, the PV values of the DPLQ, T-SR and H-Star algorithms are much smaller than that of the Mobimix algorithm, because when constructing the location data set all three algorithms take the user's real location into account, considering both the reasonableness of the false locations and the diversity of POI semantics in the location data set. As the value of l increases, the differences among the four algorithms become smaller and smaller: the privacy protection budget is fixed and the grid area in which the location data set is constructed is also fixed, so as l keeps increasing, the four algorithms select more false locations with high similarity to construct the location data set, and the PV differences among the algorithms shrink.
Average path distance comparison of the algorithms; APD represents the average road network distance from the false locations to the true location. The larger and more dispersed the false-location distribution area, the harder it is for a malicious attacker to obtain the user's real location data from the location data set. FIG. 3 reflects the influence of the change of l on the APD of the different algorithms when the grid area where the user is located is 3 km × 3 km.
As can be seen from fig. 3, the APD of the DPLQ algorithm is larger than that of the other three algorithms, which means that the false locations generated by the DPLQ algorithm are more dispersed. The APD differences among the four algorithms become smaller as l increases. This is because, in the experiment, the grid area where the user is located is unchanged; as l increases, the four algorithms select more similar false locations, so the APD differences among the algorithms decrease.
Association degree comparison between false queries and real queries. Association degree refers to the association between the generated false POI queries in the query k2-anonymity set and the user's real POI queries. We compare the association degrees between false queries and real queries for the DPLQ, T-SR, H-Star and Mobimix algorithms respectively.
As can be seen from FIG. 4, the association degrees between the false queries and the real queries of the DPLQ and T-SR algorithms are both 0; both algorithms take into account the temporal association between the query request and the user's location, and generate false queries that are not associated with the user's real queries, so they can resist the spatio-temporal correlation attacks of a malicious attacker. The experimental results of the DPLQ and T-SR algorithms are clearly superior to those of the other two algorithms.
Influence of experimental parameters on running time; the relevant parameters in the experiments are: the number of clusters k1, the number of elements l in the location data set, the number of elements k2 in the query anonymity set, and the privacy protection budget ε. Because the number of clusters k1 has no direct influence on the experimental results, k1 = 50 is kept unchanged in the experiments. l, k2 and ε compose the privacy protection triplet <l, k2, ε>. FIG. 5 shows the effect of varying ε and l on the average running time of the algorithm when k2 = 10 is fixed.
As shown in fig. 5, with ε fixed, as the value of l increases more false locations need to be generated, so the running time becomes longer; with l fixed, viewing the graph from bottom to top, it can be seen that as ε decreases, the degree of privacy protection becomes higher, the amount of added noise increases, and the algorithm running time becomes longer.
FIG. 6 shows the effect of varying ε and k2 on the average running time of the algorithm when l = 8 is fixed; the experiment assumes n = 10.
As shown in fig. 6, with k2 fixed, the relation between ε and the algorithm running time can be obtained in the same way; with ε fixed, the running time grows as k2 increases. When k2 = 12 and k2 = 14, the algorithm running time is significantly higher than that at k2 = 10. Since the experiment assumes n = 10, when k2 < n it suffices to directly screen, within the cluster, the k2 − 1 query requests sent at the same time by users with small location similarity, without considering the influence of historical query results on the construction of the anonymity set, so the running time is significantly less than that when k2 > n.
Fig. 7 shows the difference in PV values for the DPLQ algorithm when setting different privacy preserving parameters.
As can be seen from fig. 7, when ε = 0.5 and l = 12 the PV value of the algorithm is smallest, which means the deviation of the constructed false locations from the user's real location is smallest, so the LSP can provide better location services for the user without revealing the user's location privacy information.
The present application provides a differential-privacy-based LBS service privacy protection scheme, DPLQ. The scheme includes two algorithms, the DP-(k1,l)-means algorithm and the DP-k2-anonymity algorithm, which can effectively protect location privacy and query privacy in LBS service requests. The scheme considers the influence of background knowledge and spatio-temporal correlation on the privacy protection algorithm and defines two privacy protection models; the strength of privacy protection can be defined by users according to their different privacy requirements. Consequently, a malicious attacker can hardly obtain the user's private information from the constructed location data set and query k-anonymity set; the constructed false locations are more dispersed and the false queries more authentic, thereby resisting homogeneity attacks, background knowledge attacks, inference attacks and spatio-temporal correlation attacks. Experiments show that the algorithm has clear advantages in pseudo-variance, average path distance, and association degree between false queries and real queries, has good scalability, and can effectively protect users' LBS privacy information. In future work, privacy measurement and hierarchical protection will be applied to the user's location and issued query requests, so as to protect the user's LBS requests more precisely without losing quality of service.
While embodiments of the invention have been described above, the invention is not limited to the applications set forth in the description and the embodiments; it is fully applicable in the various fields of endeavor to which the invention pertains, and further modifications may readily be made by those skilled in the art. The invention is therefore not limited to the details shown and described herein, without departing from the general concept defined by the appended claims and their equivalents.

Claims (2)

1. A LBS service privacy protection method based on differential privacy is characterized by comprising the following steps:
step one, constructing a Voronoi diagram according to map information, so that each Voronoi polygon only comprises one POI;
step two, calculating the number of users contained in each Voronoi polygon by combining the position data set X, and performing descending arrangement on the Voronoi polygons according to the number of the users;
step three, selecting the top k1 Voronoi polygons with the largest numbers of users, and taking their centroids Oj as the initial cluster centers of the k-means algorithm;
step four, calculating the Euclidean distance dij between the position of each user in the location data set X and each initial cluster center Oj;
Step five, dividing the users into cluster clusters with the minimum Euclidean distance;
step six, after all users have been divided, recalculating the centroids of the k1 clusters;
step seven, if the distance between the two centroids is smaller than a set threshold value, using the original cluster center, and ending the cycle; and if the distance between the two centroids is greater than the threshold value, using the updated centroid as the cluster center, and skipping to the step four.
Step eight, returning the cluster center O where the query user is locatedjAnd l-1 centroids closer thereto, constituting a location data set with l false locations;
adding Laplace noise to the l cluster centers, constructing a position anonymity set containing l false positions by using the position points subjected to noise addition, and replacing the real positions of the users to send the position anonymity set to the LSP;
step nine, determining a cluster CjThe number of users sending requests in the middle t period;
step ten, calculating a cluster CjThe position similarity S between each position point and the user positionlAnd arranged in ascending order;
step eleven according to SlGet the corresponding query request q (q) in ascending order1,q2,…,qn);
Step twelve, the real inquiry request q of the userxPut query k2-in an anonymous set QA;
thirteen step of judging qiIf it is present in QA, if it is not present, the q is addediAdding into QA; if so, comparing the next query request q according to the query request sequence in step threei+1Until k is included in QA2An element;
fourteen steps, if cluster CjAll of q in (1)iAll added into QA, | QA | is still insufficient k2Then, the historical query request probabilities Pr (qi) in the same time period t are sorted in a descending order to obtain (q)1,q2,…,qm);
Step fifteen, skipping to step thirteen, and continuing to judge qiIf present in QA, when | QA | ═ k2Stopping circulation;
sixthly, performing privacy protection on the QA by using an index mechanism meeting the requirement of differential privacy, and strictly controlling the output probability of each candidate item in the QA.
Taking 5 meters; i, identifying query requests sent by different users as integers; j identifies different clustering clusters and clustering centers, and is an integer; n represents a cluster C where a user is located during a period tjThe number of query requests sent in (1) is an integer; m represents the number of query requests issued in the history in the period t, and is an integer.
2. The differential privacy-based LBS service privacy protection method of claim 1, wherein neighboring users in the same cluster may all use the anonymous set in place of their true location information.
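The clustering loop of claim 1, steps four to seven, can be sketched as follows (a minimal illustration assuming 2-D coordinates and hypothetical names; the Voronoi-based initialization of steps one to three is taken as given, and this is not the patented implementation):

```java
import java.util.*;

public class ClusterSketch {
    // Steps four to seven: assign each user to the nearest cluster center by
    // Euclidean distance, recompute centroids, and stop once every center
    // moves less than the threshold.
    static double[][] kMeans(double[][] users, double[][] centers, double threshold) {
        while (true) {
            // Step five: divide users into the cluster with minimum distance.
            List<List<double[]>> clusters = new ArrayList<>();
            for (int j = 0; j < centers.length; j++) clusters.add(new ArrayList<>());
            for (double[] u : users) {
                int best = 0;
                for (int j = 1; j < centers.length; j++)
                    if (dist(u, centers[j]) < dist(u, centers[best])) best = j;
                clusters.get(best).add(u);
            }
            // Step six: recompute the centroid of each non-empty cluster.
            double maxShift = 0;
            for (int j = 0; j < centers.length; j++) {
                if (clusters.get(j).isEmpty()) continue;
                double[] c = new double[]{0, 0};
                for (double[] u : clusters.get(j)) { c[0] += u[0]; c[1] += u[1]; }
                c[0] /= clusters.get(j).size();
                c[1] /= clusters.get(j).size();
                maxShift = Math.max(maxShift, dist(c, centers[j]));
                centers[j] = c;
            }
            // Step seven: end the cycle when no centroid moved more than the threshold.
            if (maxShift < threshold) return centers;
        }
    }

    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    public static void main(String[] args) {
        double[][] users = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};  // would come from Voronoi centroids
        System.out.println(Arrays.deepToString(kMeans(users, centers, 1e-6)));
        // [[0.0, 0.5], [10.0, 10.5]]
    }
}
```

The converged centroids are what steps eight and beyond would perturb with Laplace noise to form the location anonymity set.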
CN202010690224.5A 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy Active CN111797433B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010690224.5A CN111797433B (en) 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy


Publications (2)

Publication Number Publication Date
CN111797433A true CN111797433A (en) 2020-10-20
CN111797433B CN111797433B (en) 2023-08-29

Family

ID=72808687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010690224.5A Active CN111797433B (en) 2020-07-17 2020-07-17 LBS service privacy protection method based on differential privacy

Country Status (1)

Country Link
CN (1) CN111797433B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035880A (en) * 2020-09-10 2020-12-04 辽宁工业大学 Track privacy protection service recommendation method based on preference perception
CN112767693A (en) * 2020-12-31 2021-05-07 北京明朝万达科技股份有限公司 Vehicle driving data processing method and device
CN113407870A (en) * 2021-06-17 2021-09-17 安徽师范大学 Semantic and space-time correlation based road network LBS interest point query privacy protection method

Citations (8)

Publication number Priority date Publication date Assignee Title
US20140090023A1 (en) * 2012-09-27 2014-03-27 Hong Kong Baptist University Method and Apparatus for Authenticating Location-based Services without Compromising Location Privacy
CN104394509A (en) * 2014-11-21 2015-03-04 西安交通大学 High-efficiency difference disturbance location privacy protection system and method
CN109379718A (en) * 2018-12-10 2019-02-22 南京理工大学 Complete anonymous method for secret protection based on continuous-query location-based service
CN109413067A (en) * 2018-10-29 2019-03-01 福建师范大学 A kind of inquiry method for protecting track privacy
CN110062324A (en) * 2019-03-28 2019-07-26 南京航空航天大学 A kind of personalized location method for secret protection based on k- anonymity
CN110300029A (en) * 2019-07-06 2019-10-01 桂林电子科技大学 A kind of location privacy protection method of anti-side right attack and position semantic attacks
CN110855375A (en) * 2019-12-02 2020-02-28 河海大学常州校区 Source node privacy protection method based on position push in underwater acoustic sensor network
CN111339091A (en) * 2020-02-23 2020-06-26 兰州理工大学 Position big data differential privacy division and release method based on non-uniform quadtree


Non-Patent Citations (3)

Title
Zhang Qingyun, "Research on Privacy Protection of Service Requests Based on LBS Systems", China Master's Theses Full-text Database, Information Science and Technology, no. 02, pages 1-58 *
Xu Qiyuan et al., "Hybrid Location Privacy Protection Based on Differential Privacy", Computer Applications and Software, vol. 36, no. 06, pages 296-301 *
Li Weihao et al., "Privacy Protection Scheme Based on Privacy Self-correlation in Location-based Services", Journal on Communications, vol. 40, no. 05, pages 57-66 *


Also Published As

Publication number Publication date
CN111797433B (en) 2023-08-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant