CN110162997B

CN110162997B - Anonymous privacy protection method based on interpolation points

Info

Publication number: CN110162997B
Application number: CN201910340914.5A
Authority: CN
Inventors: 汪小寒; 张泽培; 何增宇; 王涛春; 孙丽萍; 郑孝遥; 罗永龙
Original assignee: Anhui Normal University
Current assignee: Anhui Normal University
Priority date: 2019-04-25
Filing date: 2019-04-25
Publication date: 2021-01-01
Anticipated expiration: 2039-04-25
Also published as: CN110162997A

Abstract

The invention belongs to the technical field of privacy protection and provides an anonymous privacy protection method based on interpolation points, which specifically comprises the following steps: S1, preprocessing an original track data set Ts to form a plurality of track equivalence classes Ecs whose tracks are consistent in timestamp; S2, clustering the tracks in each track equivalence class according to the IMHDT distance measure, forming a plurality of track anonymity groups in each track equivalence class, wherein the number of tracks in each anonymity group is not less than k; S3, perturbing the tracks in each anonymity group so that they finally satisfy interpolation track (k, δ)-anonymity. The track timestamps are taken as a constraint and the interpolation points are restricted to the track segments of the corresponding timestamps, so that data distortion is reduced in the anonymization process and data availability is increased while the privacy protection requirements of data publishing are still met.

Description

Anonymous privacy protection method based on interpolation points

Technical Field

The invention belongs to the technical field of privacy protection, and provides an anonymous privacy protection method based on interpolation points.

Background

In modern society, track information can be conveniently collected and shared by GPS-equipped mobile phones, PDAs, vehicle-mounted navigators, intelligent wearable devices and the like. Users can thereby conveniently use location-based services (LBS) such as "find nearby gas stations" or "record my movement track". The collected track information can be used for business decision-making, for example opening a supermarket in an area where location information is dense, which typically has greater business value and thereby maximizes the investor's profit; it can also be used for applications such as city planning. Track information is of great value because it contains special spatio-temporal information, but this information can also be collected and analyzed by malicious parties, so that the privacy of users is revealed.

Therefore, a data set must be anonymized before it is published in order to address privacy disclosure. At the same time, the data output by the privacy protection system must not excessively change track characteristics of the corresponding user, such as the length and duration of a track. Preserving the usability of the published data while ensuring that an individual track cannot be identified by an attacker is thus the problem that track privacy protection applications currently need to address. Many methods exist for protecting privacy in track data publishing, but most of them do not consider the availability of the published data.

Disclosure of Invention

The embodiment of the invention provides an anonymous privacy protection method based on interpolation points, which takes track time stamps as constraints and limits the interpolation points on track segments of corresponding time stamps, reduces data distortion in an anonymous process and increases data availability on the premise of meeting the requirement of issuing data privacy protection.

In order to achieve the above object, the present invention provides an anonymous privacy protection method based on interpolation points, which specifically includes the following steps:

S1, preprocessing an original track data set Ts to form a plurality of track equivalence classes Ecs whose tracks are consistent in timestamp;

S2, clustering the tracks in each track equivalence class according to the IMHDT distance measure, and forming a plurality of track anonymity groups in each track equivalence class, wherein the number of tracks in each anonymity group is not less than k;

S3, perturbing the tracks in each anonymity group so that they finally satisfy interpolation track (k, δ)-anonymity.

Further, the step S1 specifically includes the following steps:

S11, defining a track preprocessing slice value P_i;

S12, acquiring the start and end timestamps [t_b, t_e] of the original track Tr;

S13, acquiring a timestamp t_i that is later than the start time t_b and satisfies t_i mod P_i = 0, and a timestamp t_j that is earlier than the end time t_e and satisfies t_j mod P_i = 0;

S14, cutting the original track to the interval [t_i, t_j] and putting it into the track equivalence class D{i, j}.

Further, the step S2 specifically includes the following steps:

S21, placing the unclustered tracks of each track equivalence class into an active set, and randomly selecting one track from the active set;

S22, calculating the IMHDT distances from the other tracks in the active set to the selected track, and taking the track with the largest IMHDT distance as the central track;

S23, calculating the IMHDT distances from the other tracks in the active set to the central track;

S24, forming an anonymous cluster from the central track and the k-1 tracks with the smallest IMHDT distance to it, and adding the anonymous cluster to the anonymity set;

S25, taking the farthest of these k-1 nearest tracks, and suppressing the anonymous cluster if the IMHDT distance between that track and the central track is larger than a threshold max_radius;

wherein the IMHDT distance is the Hausdorff distance based on interpolation points under the time constraint.

Further, the IMHDT distance between the two tracks Tr1 and Tr2 is calculated as follows:

S221, for each track sampling point Tr1_node_{t=ti} of track Tr1, calculating the shortest distance from the sampling point to one track segment of Tr2 adjacent to timestamp ti;

S222, calculating the shortest distance from the track sampling point Tr1_node_{t=ti} to the other track segment of Tr2 adjacent to timestamp ti;

S223, taking the smaller of the distance values obtained in step S221 and step S222 as the IMHDT distance of the track sampling point Tr1_node_{t=ti};

S224, taking the average of the IMHDT distances of all track sampling points of track Tr1 as the IMHDT distance between tracks Tr1 and Tr2.

Further, the method for calculating the shortest distance between the track sampling point and the track segment is specifically as follows:

judging whether there exists an interpolation point on the track segment such that the line connecting the track sampling point and the interpolation point is perpendicular to the track segment;

if such an interpolation point exists, the Euclidean distance between the track sampling point and the interpolation point is the shortest distance between the track sampling point and the track segment;

if not, the smaller of the distances from the track sampling point to the two endpoints of the track segment is the shortest distance between the track sampling point and the track segment.
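The point-to-segment rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `shortest_distance_to_segment`, the use of `(x, y)` tuples, and the handling of a zero-length segment are assumptions made here.

```python
import math

def shortest_distance_to_segment(p, a, b):
    """Shortest distance from sampling point p to the track segment a-b.

    p, a, b are (x, y) tuples (an assumed representation).
    """
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Assumption: a degenerate segment is treated as a single point.
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along a-b (0 at a, 1 at b).
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    if 0.0 <= t <= 1.0:
        # The foot lies on the segment: it is the interpolation point.
        fx, fy = ax + t * dx, ay + t * dy
        return math.hypot(px - fx, py - fy)
    # No interpolation point: fall back to the nearer endpoint.
    return min(math.hypot(px - ax, py - ay), math.hypot(px - bx, py - by))
```

For example, the point (1, 1) has an interpolation point (1, 0) on the segment from (0, 0) to (2, 0), while the point (3, 4) has none and falls back to the endpoint (2, 0).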

The anonymous privacy protection method based on the interpolation points, provided by the embodiment of the invention, has the following beneficial effects:

1. In track preprocessing, several preprocessing slice values are used to normalize the tracks, and the number of tracks remaining after normalization and the quality of the tracks to be anonymized are compared across the slice values in order to choose the slice value. This helps reduce track suppression during preprocessing and benefits the subsequent track anonymization.

2. Track uncertainty theory is introduced into the interpolation point anonymity model. A distinctive property of track data is that every sampling point can act as a quasi-identifier, and moving quasi-identifiers directly increases the anonymity cost. The inherent uncertainty region of a track is therefore used as its anonymity region, which helps reduce the anonymity cost.

3. When measuring track distance during track clustering, the Hausdorff distance based on interpolation points is used, and it is proved theoretically that the interpolation-point-based Hausdorff distance from an anonymous track to the central track is always less than or equal to the Euclidean distance between the same tracks. Clustering with this distance can therefore produce clusters with a smaller generalization area than clustering with the Euclidean distance.

4. For track anonymization, an anonymity model is proposed in which sampling points are replaced by interpolation points on the track segments adjacent to them. Using this model reduces track perturbation during anonymization, which reduces data distortion and increases data usability while the privacy protection requirements of data publishing are still met.

Drawings

Fig. 1 is a schematic diagram of an uncertain region of a trace sampling point according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of three original tracks that do not satisfy track (3, δ)-anonymity according to an embodiment of the present invention;

fig. 3 is a schematic diagram of three original tracks after anonymization according to an embodiment of the present invention;

FIG. 4 is a flowchart of a method for providing anonymous privacy protection based on interpolation points according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an interpolation trajectory similarity measurement without time constraint according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an interpolation trajectory similarity measurement under time constraints according to an embodiment of the present invention;

FIG. 7 is a diagram illustrating a comparison of Euclidean distance between tracks and a Hausdorff distance based on interpolation points according to an embodiment of the present invention;

fig. 8 is a schematic diagram of an anonymization operation based on interpolation points according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Definition of related nouns:

1) Track

A track is generally the route of a moving object over a period of time. Track information is collected by sensor devices with positioning systems, which store the coordinates of the moving object at the corresponding times and send them to the track information collector. A track can be represented in two different ways.

In the first, the track Tr consists of a sequence of timestamped triplets (t_i, x_i, y_i):

Tr＝{(t₁,x₁,y₁),(t₂,x₂,y₂),...,(t_n,x_n,y_n)}

where (x_i, y_i) are the coordinates of the track at timestamp t_i, 1 ≤ i ≤ n.

In the second, the track is represented as a series of consecutive polyline segments connecting successive sampling points, where p_j denotes a sampling point of the track Tr and lenTr denotes the length of the track Tr.

The polyline between two track points approximates the real route. As the sampling interval approaches 0, the track comes closer to the real motion route, but the higher the sampling frequency, the higher the cost of storing and analyzing the track.

2) Hausdorff distance

The Hausdorff distance is a measure of the distance between two point sets that originates in the image field. Given two point sets T_i＝{a₁,a₂,...,a_i,...,a_m} and T_j＝{b₁,b₂,...,b_j,...,b_n}, the Hausdorff distance between T_i and T_j is defined as follows:

wherein

Because the original Hausdorff distance uses maximum and minimum values to compute the distance between point sets, it is strongly influenced by outliers. To improve the robustness of the Hausdorff distance to isolated points and noise, an improved Hausdorff distance has been proposed that reduces the influence of outliers by averaging. The improved Hausdorff distance is expressed as:

the Hausdorff distance based on interpolation points is expressed as follows:
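For orientation, the classical Hausdorff distance and its averaged variant are commonly written in the textbook forms below. The symbols h and h_avg are notation chosen here for illustration and are not taken from this document; the interpolation-point variant described in this document would additionally measure the inner minimum against track segments rather than against sampling points only.

```latex
% Classical (max-min) Hausdorff distance between point sets T_i and T_j
H(T_i, T_j) = \max\bigl( h(T_i, T_j),\; h(T_j, T_i) \bigr),
\qquad
h(T_i, T_j) = \max_{a \in T_i} \, \min_{b \in T_j} \lVert a - b \rVert

% Averaged directed distance: the outer maximum is replaced by a mean
% over the m points of T_i, which damps the effect of outliers
h_{\mathrm{avg}}(T_i, T_j) = \frac{1}{m} \sum_{a \in T_i} \min_{b \in T_j} \lVert a - b \rVert
```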

3) Uncertainty of track sampling points

Existing positioning technology cannot locate a coordinate point exactly, so the actual position is generally generalized to a circular area. This property is the basis of track (k, δ)-anonymity, which uses it to anonymize tracks. The present invention differs in that it does not use the uncertainty region of the original track sampling point but uses an interpolation point in its place, thereby obtaining a smaller anonymity cost.

Because positioning technology is inaccurate in reality, let δ denote the uncertainty threshold. The circular area centered at the track sampling point with radius δ is the uncertainty region of the track sampling point (as shown in Fig. 1):

dist(p_real,p)≤δ

where p_real denotes the true position on the track and p denotes the sampling point; p_real may lie anywhere in the uncertainty region.

4) Cooperative anonymity of traces (Co-localization)

In the final anonymization, the corresponding sampling points on each track are required to be within an uncertain region of each other in pairs, so that the tracks are subjected to cooperative anonymity. Defining a center track Tr and an anonymous track Tr'

Tr＝{(t₁,x₁,y₁),(t₂,x₂,y₂),...,(t_n,x_n,y_n)}

Tr′＝{(t₁,x′₁,y′₁),(t₂,x′₂,y′₂),...,(t_n,x′_n,y′_n)}

Each track sampling point on Tr′ lies within the uncertainty region of the corresponding sampling point of the central track. Assuming that the track metric is the Euclidean distance, the tracks must satisfy dist((x_i, y_i), (x′_i, y′_i)) ≤ δ for every 1 ≤ i ≤ n.

The two tracks are then said to satisfy cooperative anonymity (co-localization), denoted Coloc(Tr, Tr′).

5) Track (k, δ)-anonymity group

If any two tracks in a track anonymity group satisfy cooperative anonymity with uncertainty threshold δ, and the number of tracks in the group is greater than or equal to k, the group is a track (k, δ)-anonymity group.

Track (k, δ)-anonymity is based on the uncertainty of track sampling points. As shown in Fig. 2, at the first sampling point the corresponding sampling points of the anonymous tracks Tr1 and Tr2 lie within the uncertainty region centered at the first sampling point of the central track with radius δ, so Tr1, Tr2 and the central track satisfy (3, δ)-anonymity at the first sampling point. The second sampling point satisfies only (2, δ)-anonymity because the corresponding sampling point on the anonymous track Tr2 is not within the uncertainty region. Similarly, the third and fourth sampling points satisfy only (2, δ)-anonymity. To convert the track group formed by the three tracks into a track (3, δ)-anonymity group, the offending track points must be moved, and this operation causes data distortion. Fig. 3 shows the (3, δ)-anonymity track group formed from the three original tracks after anonymization; the gray track sampling points in Fig. 3 result from moving the track points that originally did not meet the condition. All tracks now satisfy cooperative anonymity, so the track anonymity group is a (3, δ)-anonymity group.
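The two definitions above (cooperative anonymity of a pair of tracks, and the anonymity group built on it) can be illustrated with a small Python check. This is a sketch under stated assumptions: tracks are lists of `(t, x, y)` triplets, the uncertainty threshold is passed as the parameter `delta`, and the function names `coloc` and `is_anonymity_group` are hypothetical.

```python
import math
from itertools import combinations

def coloc(tr_a, tr_b, delta):
    """True if both tracks share their timestamps and every pair of
    corresponding sampling points is at most delta apart."""
    if len(tr_a) != len(tr_b):
        return False
    return all(
        ta == tb and math.hypot(xa - xb, ya - yb) <= delta
        for (ta, xa, ya), (tb, xb, yb) in zip(tr_a, tr_b)
    )

def is_anonymity_group(group, k, delta):
    """True if the group has at least k tracks and every pair of
    tracks in it satisfies cooperative anonymity."""
    return len(group) >= k and all(
        coloc(a, b, delta) for a, b in combinations(group, 2)
    )

tr1 = [(1, 0.0, 0.0), (2, 1.0, 0.0)]
tr2 = [(1, 0.0, 0.5), (2, 1.0, 0.5)]
tr3 = [(1, 0.0, 3.0), (2, 1.0, 0.2)]  # first point is too far from tr1
print(is_anonymity_group([tr1, tr2], 2, 1.0))       # True
print(is_anonymity_group([tr1, tr2, tr3], 3, 1.0))  # False
```

In the second call the group fails because the first sampling point of `tr3` lies outside the uncertainty region, which is the situation the anonymization step has to repair by moving points.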

The problem the invention addresses is to anonymize the original track data set so that, under the attack assumption, the risk of privacy disclosure for track owners is reduced. Any operation on the original track data set, however, distorts the published data and reduces the use value of the published data set.

Evaluation index

1) Data distortion

Clustering classifies data according to certain characteristics so that similarity between clusters is small while similarity within a cluster is large. Similarity is the key to the clustering process, and track clustering is no exception: how to express the similarity between tracks is the core of a track clustering algorithm. The classical NWA algorithm for track k-anonymity computes similarity with the Euclidean distance, whereas the present invention uses the Hausdorff distance computed from interpolation points, which is proved to be less than or equal to the widely used Euclidean distance. Using this Hausdorff distance therefore lets the clustering process form smaller clusters with smaller generalization areas, and a smaller generalization area reduces data distortion during clustering.

Data distortion

where len(Ecs) represents the number of clusters after clustering, ClusterArea(Ec_i) represents the area of cluster Ec_i, and MaxArea represents the total area of the track region.

2) Anonymity cost

Track anonymization performs data transformation within the clusters formed by track clustering, that is, it moves track sampling points so that (k, δ)-anonymity is satisfied. Since every cluster already contains at least k tracks after clustering, this step only needs to ensure that the distance from each sampling point of each track to the central track does not exceed δ.

Anonymity cost

where translationNode represents the distance moved by a track sampling point, and maxTranslation represents the moving distance of all points in the track.

Fig. 4 is a flowchart of an anonymous privacy protection method based on interpolation points according to an embodiment of the present invention, where the method specifically includes the following steps:

S1, preprocessing the original track data set Ts to form a plurality of track equivalence classes Ecs whose tracks are consistent in timestamp;

The sampling times of different tracks in real life differ greatly, which affects the measurement of track similarity. The original tracks therefore need to be divided into several equivalence classes according to their timestamps, so that the tracks of each equivalence class have consistent timestamps. However, few tracks have completely identical timestamps, so classifying directly by timestamp inevitably produces too many track equivalence classes with too few tracks in each. If an equivalence class contains fewer than k tracks, it does not meet the k-anonymity requirement and must be suppressed, which degrades the quality of the anonymized data. The invention adopts the track preprocessing of the NWA algorithm, which ensures that each track equivalence class contains more tracks by suppressing some track points. Compared with suppressing whole track equivalence classes as above, this retains a large number of tracks and greatly improves the quality of track anonymization.

Because the choice of preprocessing slice value must balance the number of tracks retained against the quality of the data to be anonymized, preprocessing is carried out separately with different slice values, forming several sets of equivalence classes. The indices are analyzed together in the experimental part and a suitable slice value is selected, so that the amount of suppressed track data is reduced while the quality of the tracks to be anonymized is maintained.

Algorithm 1 is the preprocessing of the original tracks. Its inputs are the original track set Ts and the track preprocessing slice value Pi; its output is the set of track equivalence classes produced with that slice value, where each equivalence class keeps consistent track timestamps. For each track in the data set, the start and end timestamps [t_b, t_e] are recorded first. Then t_i is taken as a timestamp greater than t_b with t_i mod Pi = 0, and t_j as a timestamp smaller than the end timestamp t_e with t_j mod Pi = 0. The track is cut to [t_i, t_j] and put into the track equivalence class with the same i, j values.

Finally, all tracks have been put into the equivalence classes of the corresponding i, j values in the equivalence class set, and the tracks in each equivalence class have the same start and end times. Note that the timestamps of the data set used in the invention are consecutive, so if the start and end timestamps of two tracks are consistent, all of their timestamps are consistent.
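The preprocessing described for Algorithm 1 might look roughly as follows in Python. Several details are assumptions made for this sketch and are not stated in the text: timestamps are integers, t_i is taken as the smallest multiple of the slice value strictly later than t_b, t_j as the largest multiple strictly earlier than t_e, and a track too short to contain such a window is suppressed. The names `preprocess` and `slice_value` are hypothetical.

```python
from collections import defaultdict

def preprocess(tracks, slice_value):
    """Group tracks into equivalence classes keyed by the trimmed
    time window (t_i, t_j). Each track is a list of (t, x, y)
    triplets sorted by integer timestamp t."""
    classes = defaultdict(list)
    for track in tracks:
        t_b, t_e = track[0][0], track[-1][0]
        # Smallest multiple of slice_value strictly later than t_b.
        t_i = (t_b // slice_value + 1) * slice_value
        # Largest multiple of slice_value strictly earlier than t_e.
        t_j = ((t_e - 1) // slice_value) * slice_value
        if t_i >= t_j:
            # Assumption: no usable window, so the track is suppressed.
            continue
        trimmed = [pt for pt in track if t_i <= pt[0] <= t_j]
        classes[(t_i, t_j)].append(trimmed)
    return dict(classes)

def make_track(start, end):
    # Toy track with one sample per time unit moving along the x axis.
    return [(t, float(t), 0.0) for t in range(start, end + 1)]

classes = preprocess(
    [make_track(3, 23), make_track(1, 22), make_track(7, 12)], 5
)
print(sorted(classes))  # [(5, 20)]
```

With a slice value of 5, the first two toy tracks are both trimmed to the window [5, 20] and land in the same equivalence class, while the third is too short and is dropped. A larger slice value merges more tracks into the same class at the price of trimming more points, which is the trade-off discussed above.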

S2, clustering the tracks in each track equivalence class according to IMHDT distance measurement, and forming a plurality of track anonymity groups in each equivalence class, wherein the number of the tracks in each anonymity group is not less than k;

During clustering, the tracks within the same timestamp equivalence class are measured with a specific similarity function; tracks with higher similarity are placed into the same group to be anonymized, and the number of tracks in each anonymity group is not lower than k. The core of this process is how to determine the similarity of two tracks, that is, the choice of the track metric function. The classical metric is the Euclidean distance: the Euclidean distance between the track sampling points at each corresponding timestamp is calculated first, and the arithmetic mean over the sampling points at all timestamps is taken as the Euclidean distance between the tracks. The invention provides a new metric, IMHDT, the Hausdorff distance based on interpolation points under the time constraint, which is calculated as follows:

where dist(p_a, p_b) denotes the distance between sampling points p_a and p_b, and the interpolation point is the point on the polyline segment p_{b-1}p_b to which the distance from p_a is measured.

where dist(Tr_a, Tr_b) denotes the IMHDT distance between tracks Tr_a and Tr_b, and t is the number of sampling points.

Omitting the time constraint from the track similarity comparison can lead to the opposite of the correct result. Fig. 5 shows the interpolation track similarity measurement without the time constraint: when interpolation points are searched without the time constraint, the calculated IMHD distance between the two tracks is very small, but the two tracks actually run in opposite directions and the distances between their sampling points differ greatly, especially at times t1 and t4 in the figure. Measuring tracks with IMHD is therefore unrealistic in this case. Fig. 6 shows the interpolation track similarity measurement under the time constraint: the search for interpolation points is limited to the adjacent track segments at the same sampling time, and the resulting IMHDT distance is relatively large, which matches the actual situation. At the same time, the search range for interpolation points is greatly reduced, since interpolation points no longer have to be searched over the whole track as in IMHD, which improves computational efficiency.

The invention adopts Hausdorff distance (IMHDT) based on interpolation points under time constraint as a track measurement function and carries out track clustering based on greedy clustering. Because the IMHDT distance is less than or equal to the Euclidean distance under the same condition (proved by the following description), the anonymous group formed by clustering has smaller generalization radius than the Euclidean distance, so that the clustered cluster has smaller generalization area, and the track data distortion caused by generalization is reduced.

The Hausdorff distance calculation value from the anonymous track to the central track based on the interpolation points is always less than or equal to the Euclidean distance calculation value between the same tracks, and the proving process is as follows:

The Hausdorff distance of a track sampling point is calculated differently in three cases:

1. As shown at t1 in Fig. 7, when no perpendicular can be dropped from the anonymous track sampling point onto either track segment of the central track adjacent to the timestamp, no interpolation point can be made. In this case the Hausdorff distance of the track sampling point is the same as the Euclidean distance.

2. As shown at t3 in Fig. 7, when a perpendicular can be dropped from the anonymous track sampling point onto only one of the track segments of the central track adjacent to the timestamp, one interpolation point is obtained. In this case the Hausdorff distance of the track sampling point is the Euclidean distance from the sampling point to the interpolation point. The three points form a right triangle, and since the hypotenuse of a right triangle is longer than either leg, the Hausdorff distance of the sampling point is less than the Euclidean distance.

3. As shown at t2 in Fig. 7, when perpendiculars can be dropped from the anonymous track sampling point onto both track segments of the central track adjacent to the timestamp, two interpolation points are obtained. In this case the Hausdorff distance of the track sampling point is the shorter of the Euclidean distances from the sampling point to the two interpolation points. Likewise, the Hausdorff distance of the sampling point is less than the Euclidean distance.

The IMHDT distance of a track is the mean of the distances of its track points, each of which is less than or equal to the Euclidean distance in all three cases. It follows that the interpolation-point-based Hausdorff distance from an anonymous track to the central track is always less than or equal to the Euclidean distance between the same tracks. The two values are equal when no perpendicular can be dropped from any sampling point onto the track segments of the central track adjacent to the timestamp.
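The right-triangle step of this argument can be written out explicitly. The symbols below are introduced here only for illustration: p is an anonymous track sampling point, q is the central track sampling point at the same timestamp (an endpoint of the adjacent segment), f is the foot of the perpendicular from p on that segment (the interpolation point), d_i is the per-point IMHDT distance, and n is the number of sampling points.

```latex
% f lies on the line through q and pf is perpendicular to that line,
% so the triangle p, f, q has its right angle at f (Pythagoras):
\lVert p - q \rVert^{2} = \lVert p - f \rVert^{2} + \lVert f - q \rVert^{2}
\;\ge\; \lVert p - f \rVert^{2}
\quad\Longrightarrow\quad
\lVert p - f \rVert \le \lVert p - q \rVert

% Averaging over the n sampling points preserves the inequality:
\mathrm{IMHDT}(Tr', Tr) = \frac{1}{n} \sum_{i=1}^{n} d_i
\;\le\; \frac{1}{n} \sum_{i=1}^{n} \lVert p_i - q_i \rVert
```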

The method examines the tracks in each group to be anonymized, selects the central track, and confines the other tracks of the group within the uncertainty region, thereby realizing cooperative anonymity of the tracks and achieving privacy protection. Moreover, unlike the track (k, δ)-anonymity model, the invention perturbs tracks using interpolation points in place of the uncertainty regions of the original sampling points, resulting in less data distortion, lower anonymity cost, and higher data availability.

Algorithm 2 is the clustering algorithm. Its inputs are the track equivalence classes Ecs and the privacy protection degree k; its output is the set of clustered track equivalence classes clusteredEcs. First, max_radius is set; if clustering yields an IMHDT distance exceeding this threshold, the cluster is suppressed. Next, the set of unclustered tracks is initialized to empty. The clustering operation is then performed on each preprocessed track equivalence class: a clustered track set is initialized, and if the number of tracks in the equivalence class is less than k, the equivalence class is suppressed. An active set is initialized and filled with all tracks of the equivalence class, representing the unclustered tracks. A central track set is then initialized and a track is randomly selected from the active set. The IMHDT distances between the unclustered tracks and this track are calculated, the track with the largest IMHDT distance is selected as the central track, and the IMHDT distances between the unclustered tracks and the central track are calculated. An anonymous cluster is initialized and formed from the central track and the k-1 tracks with the smallest IMHDT distance to it, and this anonymous cluster is added to the anonymity set. Before it is added, the farthest of the k-1 nearest tracks is determined, and if the IMHDT distance between this track and the central track is greater than the previously set threshold max_radius, these tracks are suppressed. After this step, whether or not the tracks were classified as anonymous tracks (added to the anonymity set), they are deleted from the active set. This repeats until the active set is empty, at which point clustering of the track equivalence class is finished. When all equivalence classes have been clustered, the track clustering process ends.
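The greedy clustering of one equivalence class can be sketched as follows. This is an illustrative sketch rather than the patented algorithm: the distance is passed in as a function so that any track metric can be plugged in, `mean_euclidean` is only a stand-in used for the demonstration (it is not IMHDT), tracks are lists of `(x, y)` points, and it is assumed here that tracks left over when fewer than k remain in the active set are suppressed, which the text does not specify.

```python
import math
import random

def mean_euclidean(tr_a, tr_b):
    """Stand-in metric (not IMHDT): mean point-wise Euclidean distance."""
    return sum(
        math.hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in zip(tr_a, tr_b)
    ) / len(tr_a)

def greedy_cluster(tracks, k, max_radius, distance, rng=None):
    """Cluster one equivalence class into anonymous clusters of k tracks."""
    rng = rng or random.Random()
    if len(tracks) < k:
        return []  # the whole equivalence class is suppressed
    active = list(range(len(tracks)))  # indices of unclustered tracks
    clusters = []
    while len(active) >= k:
        seed = rng.choice(active)
        others = [i for i in active if i != seed]
        # The central track is the one farthest from the random seed.
        center = max(
            others, key=lambda i: distance(tracks[i], tracks[seed])
        ) if others else seed
        # The k-1 tracks nearest to the central track join its cluster.
        nearest = sorted(
            (i for i in active if i != center),
            key=lambda i: distance(tracks[i], tracks[center]),
        )[: k - 1]
        radius = distance(tracks[nearest[-1]], tracks[center]) if nearest else 0.0
        group = [center] + nearest
        if radius <= max_radius:
            clusters.append([tracks[i] for i in group])
        # Kept or suppressed, the tracks leave the active set.
        active = [i for i in active if i not in group]
    return clusters

tracks = [[(0.0, y), (1.0, y)] for y in (0.0, 0.1, 10.0, 10.1)]
clusters = greedy_cluster(tracks, 2, 1.0, mean_euclidean, random.Random(0))
print(len(clusters))  # 2
```

Choosing the track farthest from a random seed as the center tends to start clusters at the fringe of the data, so the dense middle is not split between several centers. In the toy data the two low tracks and the two high tracks end up in separate clusters regardless of which seed is drawn.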

Algorithm 3 calculates the IMHDT distance between two tracks. Its inputs are the two tracks Tr1 and Tr2, and its output is the IMHDT distance between them. First, for each track sampling point Tr1_node_{t=ti}, the shortest distance to one track segment of Tr2 adjacent to timestamp ti is calculated, and then the shortest distance from the same sampling point to the other adjacent track segment of Tr2. Finally, the smaller of the two values is taken as the IMHDT distance of the point, and after accumulation over all sampling points the average is taken as the IMHDT distance between the tracks.
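A compact Python sketch of the IMHDT computation is given below. It rests on assumptions that the text leaves open: both tracks are lists of `(x, y)` points sampled at the same timestamps, the two segments considered for sampling point i of the first track are the segments of the second track before and after index i, and the first and last points have only one adjacent segment. The names `point_to_segment`, `imhdt` and `mean_euclidean` are hypothetical.

```python
import math

def point_to_segment(p, a, b):
    """Distance from p to segment a-b: perpendicular foot if it lies
    on the segment, otherwise the nearer endpoint."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq > 0.0:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
        if 0.0 <= t <= 1.0:
            return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))
    return min(
        math.hypot(p[0] - a[0], p[1] - a[1]),
        math.hypot(p[0] - b[0], p[1] - b[1]),
    )

def imhdt(tr1, tr2):
    """Mean, over the points of tr1, of the smaller distance to the
    (up to two) segments of tr2 adjacent to the same timestamp."""
    if len(tr1) != len(tr2) or len(tr1) < 2:
        raise ValueError("tracks must share at least two timestamps")
    n = len(tr1)
    total = 0.0
    for i, p in enumerate(tr1):
        candidates = []
        if i > 0:
            candidates.append(point_to_segment(p, tr2[i - 1], tr2[i]))
        if i < n - 1:
            candidates.append(point_to_segment(p, tr2[i], tr2[i + 1]))
        total += min(candidates)
    return total / n

def mean_euclidean(tr1, tr2):
    """Classical metric: mean distance between corresponding points."""
    return sum(
        math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(tr1, tr2)
    ) / len(tr1)

tr1 = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
tr2 = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
print(imhdt(tr1, tr2) <= mean_euclidean(tr1, tr2))  # True
```

In this toy example the first track runs parallel to the second at height 1 but more slowly, so each of its points is exactly 1 away from an adjacent segment while the point-wise Euclidean distances grow along the track.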

Algorithm 4 calculates the shortest distance between a track sampling point and a track segment. Its inputs are the track sampling point and the track segment formed by two track sampling points, and its output is the shortest distance between the track sampling point and the track segment. It first judges whether there exists an interpolation point on the track segment such that the line connecting the sampling point and the interpolation point is perpendicular to the track segment. If so, the Euclidean distance from the sampling point to the interpolation point is returned; if not, the smaller of the distances from the track sampling point to the two endpoints of the segment is returned.

S3, anonymizing each anonymity group so that it satisfies interpolation track (k, δ)-anonymity.

This is accomplished by perturbing the tracks in the anonymity group so that they satisfy the interpolation track (k, δ)-anonymity requirement, as shown in Fig. 8. Specifically, the track sampling points that do not meet the requirement are moved so that their distance to the central track is less than or equal to δ.

The anonymization operation is carried out by replacing the track sampling point with the interpolation point, so that the anonymization cost can be reduced, and the proving process is as follows:

Translation(IMHDT)＝Eurp(Trp⊙_i,Tr_i)−δ

Translation(Eurp)＝Eurp(Trp_i,Tr_i)−δ

As demonstrated above, the interpolation-point-based Hausdorff distance from the anonymous track to the central track is always less than or equal to the Euclidean distance between the same tracks:

IMHDT(Trp_i,Tr_i)＝Eurp(Trp⊙_i,Tr_i)≤Eurp(Trp_i,Tr_i)

Because the uncertainty threshold δ of the track is fixed, it follows that Translation(IMHDT)≤Translation(Eurp), so performing the anonymization operation with the interpolation point in place of the track sampling point reduces the anonymity cost of the track point.
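The cost comparison can be illustrated numerically. The sketch below assumes that a sampling point violating the requirement is moved in a straight line toward its reference point (either the central track sampling point or the interpolation point) and stops as soon as it is within the uncertainty threshold, so that the distance moved equals the original distance minus the threshold. The straight-line movement and the names `move_within` and `delta` are assumptions made for illustration.

```python
import math

def move_within(p, target, delta):
    """Move p straight toward target until it is within delta of it.

    Returns (new_point, translation), where translation is the
    distance moved (0.0 if p already satisfies the requirement).
    """
    d = math.hypot(p[0] - target[0], p[1] - target[1])
    if d <= delta:
        return p, 0.0
    shift = d - delta
    ratio = shift / d
    new_p = (
        p[0] + (target[0] - p[0]) * ratio,
        p[1] + (target[1] - p[1]) * ratio,
    )
    return new_p, shift

p = (3.0, 4.0)       # anonymous track sampling point
center = (0.0, 0.0)  # central track sampling point at the same timestamp
interp = (3.0, 0.0)  # foot of the perpendicular on the segment (0,0)-(6,0)
delta = 1.0

_, cost_euclid = move_within(p, center, delta)
moved, cost_interp = move_within(p, interp, delta)
print(cost_interp <= cost_euclid)  # True
```

Here the point is 5 away from the central sampling point but only 4 away from the interpolation point, so anonymizing against the interpolation point moves it a shorter distance.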

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. An anonymous privacy protection method based on interpolation points is characterized by comprising the following steps:

S3, perturbing the tracks in each anonymity group so that they finally satisfy interpolation track (k, δ)-anonymity;

the step S1 specifically includes the following steps:

S11, defining a track preprocessing slice value P_i;

S13, acquiring a timestamp t_i that is later than the start time t_b and satisfies t_i mod P_i = 0, and a timestamp t_j that is earlier than the end time t_e and satisfies t_j mod P_i = 0;

S14, cutting the original track to the interval [t_i, t_j] and putting it into the track equivalence class D{i, j};

the step S2 specifically includes the following steps:

the IMHDT distance is a Hausdorff distance of an interpolation point under time constraint;

the IMHDT distance between two tracks Tr1 and Tr2 is calculated by the following specific steps:

S221, for each track sampling point Tr1_node_{t=ti} of track Tr1, calculating the shortest distance from the sampling point to one track segment of Tr2 adjacent to timestamp ti;

S222, calculating the shortest distance from the track sampling point Tr1_node_{t=ti} to the other track segment of Tr2 adjacent to timestamp ti;

S224, taking the average of the IMHDT distances of all track sampling points of track Tr1 as the IMHDT distance between tracks Tr1 and Tr2.

2. The anonymous privacy protection method based on interpolation points as claimed in claim 1, wherein the shortest distance between the trace sampling point and the trace segment is calculated by the following method:

if no interpolation point exists on the track segment, the smaller of the distances from the track sampling point to the two endpoints of the track segment is the shortest distance between the track sampling point and the track segment.