CN108573165B - Data processing method and device - Google Patents



Publication number
CN108573165B
Authority
CN
China
Prior art keywords
anonymous
road section
user
road
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710137709.XA
Other languages
Chinese (zh)
Other versions
CN108573165A (en)
Inventor
聂啸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201710137709.XA priority Critical patent/CN108573165B/en
Publication of CN108573165A publication Critical patent/CN108573165A/en
Application granted granted Critical
Publication of CN108573165B publication Critical patent/CN108573165B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Abstract

The invention discloses a data processing method and device, and relates to the field of data processing. The data processing method comprises the following steps: acquiring service requests sent within a preset time period by one or more users located on a road segment cluster; acquiring the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and acquiring the users on the road segments in the anonymous road segment set to form an anonymous user set; and sending the position information of the users in the anonymous user set, or the coverage area information of the anonymous road segment set, to a server so that the server returns a service request result to the users according to the received information. Because the anonymous road segment set is formed from road segments adjacent to the road segment where the user is located, it covers road segments close to the user; the set keeps a certain degree of dispersion while its coverage stays within the bounds of the road segment cluster, so the location privacy of the user is effectively protected.

Description

Data processing method and device
Technical Field
The present invention relates to the field of data processing, and in particular, to a data processing method and apparatus.
Background
LBS (Location-Based Service) is a service built on the geographical location data of mobile users. LBS queries can generally be classified into snapshot queries and continuous queries. In a snapshot query, the user issues a query and the server returns a result once, for example "return all hospitals within 3 km of me". In a continuous query, the user issues a query and the server returns results at intervals, for example "return the gas station nearest to me every 5 minutes for the next 30 minutes".
Driven by the large economic value of data, LBS service providers may sell private information such as users' locations and query content for commercial gain, which may compromise the privacy and security of users.
In view of the above situation, the industry generally adopts a scheme of privacy protection for location and a scheme of privacy protection for query, aiming at preventing a specific location of a user from being revealed. The scheme for privacy protection of the location obscures and transmits the actual location information of the user to avoid transmitting the accurate location of the user to the LBS service provider. The scheme of privacy protection for query is to protect the query content of the user from being revealed, i.e. to hide the corresponding relationship between the user identifier and the query content.
In the prior art, k-anonymization techniques can be used: the position information of a certain user is mixed with the position information of (k-1) other users, namely, the position of the two-dimensional space area where the user is located is sent to the LBS service provider. Thus, the attacker only knows that k users are in the area, but cannot associate the users with their exact location.
However, in some application scenarios, this technique is less effective. For example, when a user is located in an area with too low population density, excessively increasing the area of the two-dimensional space area directly reduces the service quality of the user; when k users are in the same location, the location privacy of the users can still be revealed.
Disclosure of Invention
The technical problem addressed by embodiments of the invention is how to improve the effectiveness of user location privacy protection.
According to a first aspect of embodiments of the present invention, there is provided a data processing method, including: acquiring service requests sent within a preset time period by one or more users located on a road segment cluster; acquiring the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and acquiring the users on the road segments in the anonymous road segment set to form an anonymous user set; and sending the position information of the users in the anonymous user set or the coverage area information of the anonymous road segment set to a server so that the server returns a service request result to the users according to the received information.
According to a second aspect of the embodiments of the present invention, there is provided a data processing apparatus including: a request acquisition module configured to acquire service requests sent within a preset time period by one or more users located on a road segment cluster; an anonymous set forming module configured to acquire the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and to acquire the users on the road segments in the anonymous road segment set to form an anonymous user set; and an information sending module configured to send the position information of the users in the anonymous user set or the coverage area information of the anonymous road segment set to the server so that the server returns the service request result to the users according to the received information.
According to a third aspect of the embodiments of the present invention, there is provided a data processing apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform any of the foregoing data processing methods based on instructions stored in the memory.
According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements any one of the aforementioned data processing methods.
One embodiment of the above invention has the following advantages or benefits: because the anonymous road segment set is formed from the road segments in the road segment cluster that are adjacent to the road segment where the user is located, it covers road segments close to the user; the set keeps a certain degree of dispersion while its coverage stays within the bounds of the road segment cluster, so the location privacy of the user is effectively protected.
Further, when the embodiment of the invention is used for protecting the position privacy of the moving user, the anonymous road section set can be covered on the road section which the user may move to, so that the privacy protection of the moving user can be better carried out.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of an embodiment of a data processing method of the present invention.
FIG. 2 is a flow chart of another embodiment of a data processing method of the present invention.
FIG. 3 is a flow chart of a further embodiment of a data processing method of the present invention.
FIG. 4 is a flow chart of still another embodiment of a data processing method of the present invention.
FIG. 5 is a block diagram of one embodiment of a data processing apparatus of the present invention.
FIG. 6 is a block diagram of another embodiment of a data processing apparatus according to the present invention.
FIG. 7 is a block diagram of yet another embodiment of a data processing device of the present invention.
FIG. 8 is a block diagram of still another embodiment of a data processing apparatus of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
FIG. 1 is a flow chart of an embodiment of a data processing method of the present invention. As shown in fig. 1, the data processing method of this embodiment includes:
Step S102, obtaining service requests sent within a preset time period by one or more users located on a road segment cluster.
A road segment cluster is a collection comprising several road segments, the road segments in the same road segment cluster being geographically interconnected. When a user is located in a certain road segment cluster, if the user is in a mobile state, it is likely to be located on another road in the road segment cluster after a period of time. Therefore, users in the same road segment cluster have a greater likelihood of having similar trajectories.
Road segment clusters can be formed by region division; road segments that share many historical user tracks can also be grouped into one road segment cluster, so that the travel routes of users in the same cluster are more similar, which facilitates blurring by region.
In one embodiment, prior to step S102, the historical tracks in the road network and the historical flow of each road segment may be counted. The sequence of road segments that a user moves through on the map is called a track, and the number of tracks that pass through a road segment is called its flow. Then, the following process is repeated until every road segment has been added to a road segment cluster: among the road segments that do not yet belong to any road segment cluster, select the one with the maximum flow as the first element of a new road segment cluster; then, among the road segments that do not yet belong to any road segment cluster, add to this cluster those that are adjacent to an edge road segment of the cluster and belong to the same track. That is, once one road segment cluster has been constructed, the next one is constructed, until all road segments belong to some road segment cluster. An edge road segment is a road segment at the head or tail of the current road segment cluster, that is, a road segment whose adjacent road segments include both segments inside the cluster and segments outside it.
Therefore, the track characteristics and the flow characteristics of the road sections can be considered when the road section clusters are constructed, so that the road sections in the same road section cluster belong to the same one or more motion tracks as much as possible.
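The clustering procedure described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `build_clusters`, the input layout (an adjacency map plus a list of tracks) and the tie-breaking by segment id are assumptions, and every newly added segment is treated as an edge road segment for simplicity.

```python
from collections import defaultdict

def build_clusters(adjacency, tracks):
    """Greedy sketch: seed each cluster with the highest-flow free segment,
    then grow it through adjacent free segments that share a track.

    adjacency: dict segment_id -> set of adjacent segment_ids (hypothetical layout)
    tracks:    list of tracks, each a list of segment_ids
    """
    flow = defaultdict(int)        # flow[s] = number of tracks through segment s
    tracks_of = defaultdict(set)   # tracks_of[s] = ids of the tracks through s
    for tid, track in enumerate(tracks):
        for seg in set(track):
            flow[seg] += 1
            tracks_of[seg].add(tid)

    unassigned = set(adjacency)
    clusters = []
    while unassigned:
        # Assumption: ties in flow are broken by segment id for determinism.
        seed = max(sorted(unassigned), key=lambda s: flow[s])
        unassigned.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            edge = frontier.pop()
            for nb in sorted(adjacency[edge]):
                # Add a free neighbour only if it lies on a common track.
                if nb in unassigned and tracks_of[nb] & tracks_of[edge]:
                    unassigned.remove(nb)
                    cluster.append(nb)
                    frontier.append(nb)
        clusters.append(cluster)
    return clusters
```

With four segments in a line and tracks `[["a", "b"], ["a", "b"], ["c", "d"]]`, the sketch yields two clusters, one per group of segments that share tracks.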
Step S104, acquiring the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and acquiring the users on the road segments in the anonymous road segment set to form an anonymous user set.
Although the road segments in a road segment cluster are confined to one region or lie on the user's possible travel tracks, the region used for blurring needs to be narrowed further.
Therefore, in the present invention, only the road segments adjacent to the road segment where each user is currently located are considered. An adjacent road segment is not only close to the user's current position, but is also a road segment the user may be on in the next time slot if the user is moving. These road segments can therefore be chosen to form the anonymous road segment set.
When a plurality of users send service requests, the users can be sorted from high to low by privacy requirement, and the road segments adjacent to each user's road segment are selected in that order to form the anonymous road segment set. In this way, the road segments around users with high privacy requirements are added first.
In one embodiment, road segments in the road segment cluster adjacent to the road segment where the user is located in the traveling direction of the user may be obtained to form an anonymous road segment set, and the users on the road segments in the anonymous road segment set are obtained to form the anonymous user set.
A road segment may have several adjacent road segments, and those adjacent to it in the user's traveling direction are the ones the user is more likely to reach. Restricting the selection to them further reduces the coverage area of the anonymous road segment set and makes the user's query result more accurate.
The users on the road segments in the anonymous road segment set can be users who have currently sent a service request, or users whose position information is available but who have not sent a service request. The skilled person can select as desired.
Optionally, the user who sent the service request may be added to the anonymous user set, and the road segment where that user is located may be added to the anonymous road segment set. However, even if this information is not added, the technical effect of the present invention can still be achieved, because the anonymous road segment set contains road segments very close to the user who sent the service request.
Step S106, sending the position information of the users in the anonymous user set or the coverage area information of the anonymous road segment set to the server, so that the server returns the service request result to the users according to the received information.
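One possible reading of step S104 is sketched below. The function name `form_anonymous_sets` and all parameter layouts are hypothetical; the sketch takes the variant in which the requesting users and their own road segments are not added, and it includes every known user on the selected segments regardless of whether that user sent a request.

```python
def form_anonymous_sets(cluster, adjacency, user_segment, requesters, privacy):
    """Sketch of step S104 under assumed data layouts.

    cluster:      set of segment ids in the current road segment cluster
    adjacency:    dict segment_id -> set of adjacent segment ids
    user_segment: dict user -> segment the user is currently on (all known users)
    requesters:   list of users who sent a service request
    privacy:      dict user -> privacy requirement (higher means stricter)
    """
    anon_segments = []
    # Users with higher privacy requirements are served first.
    for user in sorted(requesters, key=lambda u: -privacy[u]):
        for seg in sorted(adjacency[user_segment[user]]):
            # Only neighbours inside the same cluster are eligible.
            if seg in cluster and seg not in anon_segments:
                anon_segments.append(seg)
    # Every known user on a selected segment joins the anonymous user set.
    anon_users = sorted(u for u, s in user_segment.items() if s in anon_segments)
    return anon_segments, anon_users
```

Restricting the inner loop to neighbours in the traveling direction would give the direction-aware variant; that requires direction data the sketch does not model.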
After the anonymous user set and the anonymous road segment set are obtained, two processing methods exist. The first method is that the coverage information of the anonymous road segment set is directly sent to the server, namely, a fuzzy area is provided for the server, the server performs operations such as inquiry and the like according to the position information of the area, and the result is returned to all users sending service requests. The second method is that a representative user is selected from the anonymous user set, and the representative user may be, for example, a user located in the center of the area where the anonymous road segment set is located, a randomly selected user, or a user selected by using another method, and then the position information of the user is sent to the server, so that the server returns the result of querying the position of the representative user to all users sending the service request.
Users in the anonymous user set can share one query result, so that the system overhead is saved.
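The two ways of preparing the information sent to the server can be sketched as follows. The bounding-box representation of the coverage area, the coordinate layout and the function names are assumptions made for illustration; the text does not prescribe how the coverage area is encoded.

```python
import random

def coverage_bbox(anon_segments, segment_coords):
    """First method: describe the blurred area as a bounding box
    (min_x, min_y, max_x, max_y) over all segment endpoints (assumed encoding)."""
    xs, ys = [], []
    for seg in anon_segments:
        for x, y in segment_coords[seg]:
            xs.append(x)
            ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))

def representative_user(anon_users, user_xy, bbox, strategy="center"):
    """Second method: pick one user whose position stands in for everyone.

    strategy "center": user closest to the centre of the covered area
    strategy "random": user chosen uniformly at random
    """
    if strategy == "random":
        return random.choice(anon_users)
    cx, cy = (bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2
    return min(anon_users,
               key=lambda u: (user_xy[u][0] - cx) ** 2 + (user_xy[u][1] - cy) ** 2)
```

Either the bounding box or the representative user's coordinates would then be sent in place of any requester's exact position, and the single result is shared by all requesters.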
With the method of this embodiment, the anonymous road segment set is formed from the road segments in the road segment cluster that are adjacent to the road segment where the user is located, so it covers road segments close to the user; the set keeps a certain degree of dispersion while its coverage stays within the bounds of the road segment cluster, so the location privacy of the user is effectively protected.
Further, when the embodiment of the invention is used for protecting the position privacy of the moving user, the anonymous road section set can be covered on the road section which the user may move to, so that the privacy protection of the moving user can be better carried out.
After the anonymous road segment set and the anonymous user set are obtained in step S104, it may be checked whether they satisfy the security condition. The security conditions include, for example: the number of road segments in the anonymous road segment set is larger than a preset value, so that the users covered by the anonymous road segment set reach a certain degree of dispersion; and the number of users in the anonymous user set is larger than a preset value, so that it is difficult for an attacker to locate the real user. Those skilled in the art can select some or all of the security conditions, as desired, to check whether the anonymous road segment set and the anonymous user set can be used directly.
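A minimal sketch of such a check is given below. The function name and the convention that `None` disables a condition are assumptions; the thresholds correspond to the preset values mentioned above.

```python
def meets_security_condition(anon_segments, anon_users,
                             min_segments=None, min_users=None):
    """Return True if every enabled condition holds.

    A threshold of None disables that condition, which models choosing
    only some of the security conditions (assumed convention).
    """
    if min_segments is not None and len(anon_segments) <= min_segments:
        return False  # too few road segments: users not dispersed enough
    if min_users is not None and len(anon_users) <= min_users:
        return False  # too few users: the real user is easy to locate
    return True
```

Both comparisons are strict, matching the wording "larger than a preset value".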
If the anonymous road section set and the anonymous user set do not meet the security condition, the embodiment of the invention further provides two adjustment modes.
A first adjustment is to further expand the scope of the anonymous set. A data processing method of another embodiment of the present invention is described below with reference to fig. 2.
FIG. 2 is a flow chart of another embodiment of a data processing method of the present invention. As shown in fig. 2, the data processing method of this embodiment includes:
Step S202, obtaining service requests sent within a preset time period by one or more users located on a road segment cluster.
Step S204, acquiring the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and acquiring the users on the road segments in the anonymous road segment set to form an anonymous user set.
For the specific implementation of steps S202 to S204, refer to steps S102 to S104.
Step S206, checking whether the anonymous road segment set and/or the anonymous user set meet the security condition.
Step S208, if the anonymous road segment set and/or the anonymous user set do not meet the security condition, adding, from the parent-level road segment cluster of the road segment cluster where the user is located, the road segment with the minimum distance to the road segment where the user is located to the anonymous road segment set, and adding the users on the road segments in the anonymous road segment set to the anonymous user set.
The specific implementation of step S208 is similar to step S104, except that step S208 expands the range of candidate road segments and selects a road segment from the parent-level road segment cluster to add to the anonymous road segment set. The road segment cluster is a subset of its parent-level road segment cluster.
Steps S206 to S208 may be performed a plurality of times. For example, when the anonymous road segment set and the anonymous user set generated from the parent-level road segment cluster still do not satisfy the security condition, the method may continue with the parent of the parent-level road segment cluster, and so on.
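The repeated check-and-expand loop of steps S206 to S208 might look like the sketch below. It assumes that one road segment is added per parent level and that the user's own road segment is never chosen; neither detail is fixed by the text, and all names and parameter layouts are hypothetical.

```python
def expand_with_parents(anon_segments, user_seg, ancestors,
                        seg_distance, users_on, is_safe):
    """Sketch of steps S206-S208 repeated over successive parent levels.

    ancestors:    list of parent clusters (sets of segment ids), nearest level first
    seg_distance: function (seg_a, seg_b) -> distance
    users_on:     dict segment_id -> list of users on that segment
    is_safe:      function (segments, users) -> bool, the security check
    """
    segments = list(anon_segments)
    for parent in ancestors:
        users = [u for s in segments for u in users_on.get(s, [])]
        if is_safe(segments, users):
            return segments, users
        # Assumption: add exactly one segment per level, never the user's own.
        candidates = [s for s in parent if s not in segments and s != user_seg]
        if candidates:
            nearest = min(sorted(candidates),
                          key=lambda s: seg_distance(user_seg, s))
            segments.append(nearest)
    users = [u for s in segments for u in users_on.get(s, [])]
    return segments, users
```

The caller still has to re-check the returned sets, since the loop may run out of parent levels before the security condition is met.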
Before step S202, a road segment cluster tree may be constructed in advance, where leaf nodes of the tree are the most basic road segment clusters, a root node is a cluster formed by all road segments in a road network, and each parent node in the tree is the sum of all its child nodes. Thus, the parent level road segment cluster of the road segment cluster can be viewed in the road segment cluster tree.
In one embodiment, the road segment cluster tree may be constructed as follows. First, each road segment cluster is taken as a leaf node. Then, the following process is repeated until a root node is generated: among the not-yet-fused road segment clusters at the same level, select the road segment cluster with the largest number of historical tracks and the road segment cluster closest to it, and fuse the two into a parent-level road segment cluster that serves as the parent node of both.
The distance between the road section clusters can be calculated according to the distance between the central points of the road section clusters, can also be calculated according to the distance between the road sections with the farthest distance in different road section clusters, and can also be calculated by adopting other methods, which is not repeated herein.
Therefore, the road section clusters with a large number of historical tracks can be processed preferentially, and the parent level of the road section clusters with a large number of historical tracks not only has a large number of tracks, but also has a more compact space structure.
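The tree construction can be sketched level by level as below. The handling of a leftover cluster when a level has an odd number of clusters (it is carried up unchanged) is an assumption, as are the function name and the callback signatures; the distance callback can implement either of the distance definitions mentioned above.

```python
def build_cluster_tree(leaves, track_count, distance):
    """Bottom-up sketch of the road segment cluster tree.

    leaves:      list of frozensets of segment ids (the basic clusters)
    track_count: function cluster -> number of historical tracks
    distance:    function (cluster_a, cluster_b) -> distance
    Returns (root, parent), where parent maps each fused cluster to its parent.
    """
    parent = {}
    level = list(leaves)
    while len(level) > 1:
        pending, next_level = list(level), []
        while len(pending) > 1:
            # Cluster with the most historical tracks is processed first ...
            busiest = max(pending, key=track_count)
            pending.remove(busiest)
            # ... and fused with the cluster closest to it.
            nearest = min(pending, key=lambda c: distance(busiest, c))
            pending.remove(nearest)
            merged = busiest | nearest
            parent[busiest] = parent[nearest] = merged
            next_level.append(merged)
        # Assumption: an odd cluster out is carried up to the next level as is.
        next_level.extend(pending)
        level = next_level
    return level[0], parent
```

Following `parent` upward from a leaf gives the successive parent-level clusters used when the anonymous sets need to be expanded.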
Step S210, the position information of the users in the anonymous user set or the coverage area information of the anonymous road section set is sent to a server, so that the server returns a service request result to the users according to the received information.
For the specific implementation of step S210, refer to step S106; details are not repeated here.
By adopting the method of the embodiment, the range of the anonymous road section set can be further expanded on the basis of the original anonymous road section set, so that the number of users in the anonymous user set is increased, the dispersion degree of the users in the anonymous user set on the space is also improved due to the increase of the number of the road sections, and the effectiveness of the user position privacy protection can be further improved.
When the method of the above embodiment is adopted, the coverage area of the generated anonymous road segment set may be checked. If the coverage area exceeds a preset value, that is, the current anonymous road segment set covers too large a range, a second adjustment mode can be adopted.
The second adjustment mode is to generate virtual users and add them to the anonymous user set. A data processing method of a further embodiment of the present invention is described below with reference to fig. 3.
FIG. 3 is a flow chart of a data processing method according to another embodiment of the present invention. As shown in fig. 3, the data processing method of this embodiment includes:
Step S302, obtaining service requests sent within a preset time period by one or more users located on a road segment cluster.
Step S304, acquiring the road segments in the road segment cluster that are adjacent to the road segment where the user is located to form an anonymous road segment set, and acquiring the users on the road segments in the anonymous road segment set to form an anonymous user set.
For the specific implementation of steps S302 to S304, refer to steps S102 to S104.
Step S306, checking whether the anonymous road segment set and/or the anonymous user set meet the security condition.
Step S308, if the anonymous road segment set and/or the anonymous user set do not meet the security condition, generating virtual users on the road segments in the anonymous road segment set and adding them to the anonymous user set, wherein the share of virtual users generated on each road segment in the total number of virtual users is consistent with the share of users on that road segment in the anonymous user set.
That is, the generated virtual users have the same location distribution as the real users.
Step S310, the position information of the users in the anonymous user set or the coverage area information of the anonymous road section set is sent to the server, so that the server returns the service request result to the users according to the received information.
With the method of this embodiment, generating virtual users with the same location distribution as the real users increases the number of users within the coverage of the anonymous road segment set, which makes it harder for an attacker to determine the real users' positions and improves the effectiveness of user location privacy protection.
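Distributing virtual users in proportion to the real users on each road segment requires rounding to whole users. The sketch below uses largest-remainder rounding, which is an assumption introduced here and not something the text specifies; the function name and input layout are likewise hypothetical.

```python
def allocate_virtual_users(real_counts, k):
    """Split k virtual users over segments in proportion to real user counts.

    real_counts: dict segment_id -> number of real users on that segment
    k:           total number of virtual users to generate
    Uses largest-remainder rounding (an assumed choice) so the parts sum to k.
    """
    total = sum(real_counts.values())
    if total == 0 or k <= 0:
        return {seg: 0 for seg in real_counts}
    quotas = {seg: k * n / total for seg, n in real_counts.items()}
    alloc = {seg: int(q) for seg, q in quotas.items()}
    leftover = k - sum(alloc.values())
    # Hand the remaining users to the segments with the largest fractional part.
    by_remainder = sorted(quotas, key=lambda s: (alloc[s] - quotas[s], s))
    for seg in by_remainder[:leftover]:
        alloc[seg] += 1
    return alloc
```

For example, with three real users on one segment and one on another, five virtual users are split four to one, keeping the distribution close to the real one.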
When the way of generating the virtual users is adopted, the virtual users may be generated based on the original anonymous sets, and may also be generated based on the way of the embodiment of fig. 2, that is, the virtual users are generated based on the anonymous sets generated according to the road section clustering of the parent hierarchy. The skilled person can select as desired. The embodiment of the invention provides one of the selection modes. A data processing method of still another embodiment of the present invention is described below with reference to fig. 4.
FIG. 4 is a flow chart of a data processing method according to another embodiment of the present invention. As shown in fig. 4, the data processing method of this embodiment includes:
Step S402, acquiring service requests sent within a preset time period by one or more users located on a first road segment cluster.
Step S404, acquiring the road segments in the first road segment cluster that are adjacent to the road segment where the user is located to form a first anonymous road segment set, and acquiring the users on the road segments in the first anonymous road segment set to form a first anonymous user set.
For the specific implementation of steps S402 to S404, refer to steps S102 to S104.
Step S406, checking whether the first anonymous road segment set and/or the first anonymous user set meet the security condition.
Step S408, if the first anonymous road segment set and/or the first anonymous user set do not meet the security condition, adding, from a second road segment cluster that is the parent-level road segment cluster of the first road segment cluster, the road segment with the minimum distance to the road segment where the user is located to the first anonymous road segment set to form a second anonymous road segment set, and adding the users on the road segments in the second anonymous road segment set to the first anonymous user set to form a second anonymous user set.
Step S410, checking whether the second anonymous road segment set and/or the second anonymous user set meet the security condition.
If the second anonymous road segment set and the second anonymous user set meet the security condition, the operation of step S414 can be performed directly using them; if neither the first sets nor the second sets meet the security condition, generating virtual users may be considered. Which anonymous sets the virtual users are generated from may be selected in the manner of step S412.
Step S412, if the second anonymous road section set and/or the second anonymous user set do not meet the safety condition, generating a first evaluation factor according to the first anonymous road section set and the first anonymous user set, generating a second evaluation factor according to the second anonymous road section set and the second anonymous user set, and selecting the anonymous user set and the anonymous road section set according to the first evaluation factor and the second evaluation factor.
The evaluation factor is determined according to three quantities: the ratio of the coverage area of the road segments in the anonymous road segment set to the number of those road segments, the proportion of virtual users in the anonymous user set, and the number of virtual users.
When the evaluation factor is positively correlated with all three quantities, the anonymous road segment set and corresponding anonymous user set with the smaller evaluation factor are selected; similarly, when the evaluation factor is negatively correlated with all three quantities, the sets with the larger evaluation factor are selected.
That is, the finally selected anonymous road segment set and anonymous user set should, as far as possible, contain more road segments, cover a smaller area, and need fewer virtual users. This reduces the system overhead of generating virtual users and makes the user positions more dispersed.
One calculation method of the evaluation factor may be as shown in equations (1) and (2).
Formula (1) provides a calculation method for the evaluation factor, where E represents the evaluation factor; |R(ssg)| represents the area covered by the anonymous road section set; |ssg| represents the number of road segments in the anonymous road segment set; k represents the number of generated virtual users; and ρ represents the preset weight of the virtual users in the anonymous user set. Formula (2) further explains the meaning of ρ on the basis of formula (1), where T represents the preset total number of users that the anonymous user set should reach.
[Image BDA0001241881500000101: formulas (1) and (2)]
Other ways of calculating the evaluation factor may be used by those skilled in the art as desired and will not be described herein.
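As a non-limiting illustration, the selection in step S412 can be sketched in Python. Formulas (1) and (2) are available here only as an image, so the sketch assumes one possible form that matches the symbol definitions and the positively correlated case: ρ = k / T and E = (|R(ssg)| / |ssg|) · ρ · k, with k taken as the number of users still missing to reach T. The formula, the `Candidate` type, and the function names are all hypothetical and are not taken from the original formulas.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One (anonymous road segment set, anonymous user set) pair."""
    name: str
    coverage_area: float  # |R(ssg)|: area covered by the segment set
    segment_count: int    # |ssg|: number of road segments in the set
    real_users: int       # users currently in the anonymous user set


def evaluation_factor(c: Candidate, target_users: int) -> float:
    """ASSUMED form: E = (|R(ssg)| / |ssg|) * rho * k, with rho = k / T."""
    k = max(target_users - c.real_users, 0)  # virtual users still needed (assumption)
    rho = k / target_users                   # share of virtual users in the set
    return (c.coverage_area / c.segment_count) * rho * k


def select_candidate(first: Candidate, second: Candidate, target_users: int) -> Candidate:
    """E grows with all three quantities here, so the smaller E is preferred."""
    e1 = evaluation_factor(first, target_users)
    e2 = evaluation_factor(second, target_users)
    return first if e1 <= e2 else second


# First pair: built from the road segment cluster; second pair: from its parent cluster.
first = Candidate("cluster", coverage_area=4.0, segment_count=4, real_users=6)
second = Candidate("parent", coverage_area=12.0, segment_count=6, real_users=8)
chosen = select_candidate(first, second, target_users=10)
```

In this example the parent-level pair is chosen: it covers a larger area per segment, but it needs only two virtual users instead of four, which outweighs the area term under the assumed formula.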
After the anonymous road segment set and the anonymous user set are selected, information related to the selection may be sent to a server for further processing.
Step S414, virtual users are generated on the road sections in the anonymous road section set and added to the anonymous user set, wherein the proportion of the number of virtual users generated on each road section to the total number of virtual users is consistent with the proportion of the number of current users on that road section to the anonymous user set.
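The proportional generation of step S414 can be sketched as follows. The text only requires that the per-section share of virtual users be consistent with the per-section share of current users; how fractional shares are rounded is not specified, so the largest-remainder rounding used here, as well as the name `allocate_virtual_users`, is an assumption made for illustration.

```python
def allocate_virtual_users(users_per_segment: dict[str, int], total_virtual: int) -> dict[str, int]:
    """Split total_virtual across segments in proportion to current user counts.

    Rounding uses the largest-remainder method (an assumption; the source
    does not say how fractional shares are handled).
    """
    total_real = sum(users_per_segment.values())
    if total_real == 0:
        raise ValueError("no current users to derive proportions from")

    # Integer floor of each proportional share, plus its remainder.
    alloc = {}
    remainder = {}
    for seg, n in users_per_segment.items():
        alloc[seg] = total_virtual * n // total_real
        remainder[seg] = total_virtual * n % total_real

    # Hand out the units lost to flooring, largest remainder first.
    leftover = total_virtual - sum(alloc.values())
    for seg in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        alloc[seg] += 1
    return alloc


# Six, three, and one current users on three segments; ten virtual users to place.
plan = allocate_virtual_users({"s1": 6, "s2": 3, "s3": 1}, 10)
```

With ten virtual users the shares divide evenly, so `plan` reproduces the 6 : 3 : 1 distribution of the current users.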
Step S416, the position information of the users in the anonymous user set or the coverage area information of the anonymous road section set is sent to the server, so that the server returns the service request result to the users according to the received information.
With this method, virtual users are generated on the basis of the selected anonymous road section set and anonymous user set. This increases the number of users within the coverage of the anonymous road section set and makes it harder for an attacker to determine the real user's position, while keeping the system overhead as low as possible and improving the dispersion of the anonymous users, so that user location privacy is protected more effectively.
When the user is moving and sends continuous query requests, the user is at a different position for each request. In the foregoing embodiments, the users used for obfuscation are selected from the same road segment cluster, so the anonymous user sets generated across successive queries are likely to overlap, which makes it harder for an attacker to identify the user's real position.
In addition, the embodiment of the invention can keep the positions of the generated virtual users plausible during movement, that is, the virtual users can move along with the real user, which further increases the difficulty for an attacker of identifying the user's real position.
In one embodiment, the continuous service request sent by the user includes a first service request and a second service request. When virtual users need to be generated, for the first service request, the specific implementation of step S308 may be adopted to generate virtual users on the road segments in the anonymous road segment set corresponding to the first service request, and to add the generated virtual users to the anonymous user set corresponding to the first service request. When the user moves and sends the second service request, the virtual users generated for the first service request may be added to the anonymous user set corresponding to the second service request, and their positions are updated so that they are located on road segments in the anonymous road segment set corresponding to the second service request.
That is, in a continuous query, the basic information of a virtual user, such as its user identifier, is kept unchanged and only its position is updated, so that the virtual user stays within a reasonably reachable range and is harder for an attacker to identify.
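The reuse of virtual users across a continuous query can be sketched as below. The identifier is preserved and only the position changes, as described above. How the new position is chosen is not specified in the text: the sketch keeps a virtual user where it is if that segment is still in the new anonymous road segment set and otherwise picks a random segment from that set, which is an assumption and does not model reachability. `VirtualUser` and `carry_over` are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass
class VirtualUser:
    user_id: str   # kept unchanged across the continuous query
    segment: str   # road segment the virtual user is currently placed on


def carry_over(virtual_users, new_segment_set, rng=None):
    """Reuse the virtual users of the first request for the second request."""
    if not new_segment_set:
        raise ValueError("the anonymous road segment set must not be empty")
    rng = rng or random.Random()
    segments = sorted(new_segment_set)
    moved = []
    for v in virtual_users:
        # ASSUMPTION: stay put if still covered, otherwise move to a random
        # segment of the new set; a real system would check reachability.
        target = v.segment if v.segment in new_segment_set else rng.choice(segments)
        moved.append(VirtualUser(v.user_id, target))
    return moved


first_request = [VirtualUser("v1", "s1"), VirtualUser("v2", "s2")]
second_request = carry_over(first_request, {"s2", "s3"}, random.Random(0))
```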
A data processing apparatus according to an embodiment of the present invention is described below with reference to fig. 5.
FIG. 5 is a block diagram of one embodiment of a data processing apparatus of the present invention. As shown in fig. 5, the data processing apparatus of this embodiment includes: a request obtaining module 51, configured to obtain service requests sent by one or more users located on a certain segment cluster within a preset time period; an anonymous set forming module 52 configured to obtain a road segment adjacent to the road segment where the user is located in the road segment cluster to form an anonymous road segment set, and obtain a user on the road segment in the anonymous road segment set to form an anonymous user set; and the information sending module 53 is configured to send the position information of the user in the anonymous user set or the coverage area information of the anonymous road section set to the server, so that the server returns the service request result to the user according to the received information.
Wherein, the anonymous set forming module 52 may be further configured to obtain a set of anonymous road segments formed by road segments in the road segment cluster adjacent to the road segment where the user is located in the traveling direction of the user, and obtain a set of anonymous users formed by the users on the road segments in the set of anonymous road segments.
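The behaviour of the anonymous set forming module 52 can be sketched as follows. The sketch assumes that the user's own road segment is part of the anonymous road segment set and that adjacency is supplied as a mapping from a segment to its neighbours; for the travel-direction variant the mapping would contain only the successor segments. These data structures and the name `form_anonymous_sets` are illustrative assumptions.

```python
def form_anonymous_sets(user_segment, cluster, adjacency, users_on):
    """Build the anonymous road segment set and the anonymous user set.

    cluster:   segments of the road segment cluster the user is in
    adjacency: segment -> neighbouring segments (successors only for the
               travel-direction variant)
    users_on:  segment -> users currently on that segment
    """
    # Only neighbours inside the same road segment cluster qualify.
    segments = {user_segment} | {
        s for s in adjacency.get(user_segment, ()) if s in cluster
    }
    users = {u for s in segments for u in users_on.get(s, ())}
    return segments, users


segments, users = form_anonymous_sets(
    "a",
    cluster={"a", "b", "c"},
    adjacency={"a": ["b", "x"]},          # "x" lies outside the cluster
    users_on={"a": ["u1"], "b": ["u2", "u3"], "x": ["u9"]},
)
```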
A data processing apparatus according to another embodiment of the present invention is described below with reference to fig. 6.
FIG. 6 is a block diagram of another embodiment of a data processing apparatus according to the present invention. As shown in fig. 6, the data processing apparatus of this embodiment may further include a segment cluster generating module 64 configured to count historical tracks in the road network and historical traffic of each road, and repeat the following processes until all segments are added to the segment cluster: selecting the road section with the maximum flow in the road sections which do not belong to any road section cluster as a first element of a certain road section cluster; and adding road sections which are adjacent to the edge road section of a certain road section cluster and belong to the same track into the certain road section cluster in the road sections which do not belong to any road section cluster.
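The cluster generation process of module 64 can be sketched as a greedy expansion. The sketch interprets "belong to the same track" as "appear together with the edge segment on at least one historical track", and it breaks flow ties by segment name; both are assumptions, as are the function name and the input structures.

```python
def build_segment_clusters(flow, adjacency, tracks):
    """Greedily partition road segments into clusters.

    flow:      segment -> historical traffic volume
    adjacency: segment -> neighbouring segments
    tracks:    list of historical tracks, each a list of segments
    """
    track_sets = [set(t) for t in tracks]

    def share_track(a, b):
        # ASSUMPTION: two segments "belong to the same track" if at least
        # one historical track contains both of them.
        return any(a in t and b in t for t in track_sets)

    unassigned = set(flow)
    clusters = []
    while unassigned:
        # The unassigned segment with the largest flow seeds a new cluster.
        seed = max(sorted(unassigned), key=lambda s: flow[s])
        unassigned.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            edge = frontier.pop()
            for nb in sorted(adjacency.get(edge, ())):
                if nb in unassigned and share_track(edge, nb):
                    unassigned.remove(nb)
                    cluster.append(nb)
                    frontier.append(nb)  # nb becomes a new edge segment
        clusters.append(cluster)
    return clusters


clusters = build_segment_clusters(
    flow={"a": 10, "b": 5, "c": 3, "d": 8},
    adjacency={"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]},
    tracks=[["a", "b"], ["c", "d"]],
)
```

Here segment "a" seeds the first cluster and absorbs "b"; "c" is adjacent to "b" but shares no track with it, so it ends up in the second cluster seeded by "d".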
Furthermore, the apparatus may further comprise an anonymous set extension module 65 configured to check whether the anonymous set of road segments and/or the anonymous set of users satisfies the security condition, add a road segment in the parent hierarchical cluster of road segments of the cluster of road segments in which the user is located that is the least distant from the road segment in which the user is located to the anonymous set of road segments if the anonymous set of road segments and/or the anonymous set of users does not satisfy the security condition, and add the user on the road segment in the anonymous set of road segments to the anonymous set of users.
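The extension performed by module 65 can be sketched as below. Several details are assumptions made for illustration: the distance between road segments is taken as the Euclidean distance between segment midpoints, the nearest-segment step is repeated until the condition holds or the parent cluster is exhausted, and the security condition requires both thresholds to be exceeded (the text allows either one or both).

```python
import math


def extend_anonymous_sets(user_segment, segment_set, parent_cluster,
                          users_on, centers, min_segments, min_users):
    """Add the nearest parent-cluster segments until the sets are safe."""
    segments = set(segment_set)

    def current_users():
        return {u for s in segments for u in users_on.get(s, ())}

    def is_safe():
        # ASSUMPTION: both thresholds must be exceeded.
        return len(segments) > min_segments and len(current_users()) > min_users

    # Parent-cluster segments not yet in the set, nearest to the user first.
    candidates = sorted(
        (s for s in parent_cluster if s not in segments),
        key=lambda s: math.dist(centers[s], centers[user_segment]),
    )
    for s in candidates:
        if is_safe():
            break
        segments.add(s)
    return segments, current_users()


centers = {"a": (0, 0), "b": (1, 0), "c": (5, 0), "d": (2, 0)}
segments, users = extend_anonymous_sets(
    "a", {"a", "b"}, ["a", "b", "c", "d"],
    users_on={"a": ["u1"], "b": ["u2"], "c": ["u4"], "d": ["u3"]},
    centers=centers, min_segments=2, min_users=2,
)
```

In the example only segment "d" is added, because after that step both the number of segments and the number of users exceed their thresholds.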
Furthermore, the apparatus may further comprise a first virtual user generation module 66 comprising: a security condition detecting unit 661 configured to check whether the anonymous road segment set and/or the anonymous user set satisfy a security condition; and a virtual user generating unit 662 configured to generate virtual users on the road segments in the anonymous road segment set and add the generated virtual users to the anonymous user set if the anonymous road segment set and/or the anonymous user set do not satisfy the security condition, wherein the proportion of the number of virtual users generated on each road segment to the total number of virtual users is consistent with the proportion of the number of users on each road segment to the anonymous user set.
Furthermore, the apparatus may further comprise a second virtual user generation module 67 comprising: a security condition detecting unit 661 configured to check whether an anonymous road segment set and/or an anonymous user set generated according to a parent level road segment cluster of the road segment cluster satisfies a security condition; an evaluation factor generating unit 672 configured to, if the anonymous road segment set and/or the anonymous user set generated according to the parent level road segment cluster of the road segment cluster do not satisfy the security condition, determine a first evaluation factor and a second evaluation factor according to the ratio of the coverage area of the road segments in an anonymous road segment set to the number of road segments, the proportion of virtual users in the anonymous user set, and the number of virtual users, and select the anonymous user set and the anonymous road segment set according to the sizes of the first evaluation factor and the second evaluation factor, wherein the first evaluation factor corresponds to the anonymous road segment set and the anonymous user set generated according to the road segment cluster, and the second evaluation factor corresponds to the anonymous road segment set and the anonymous user set generated according to the parent level road segment cluster of the road segment cluster; and a virtual user generating unit 662 configured to generate virtual users on the road segments in the anonymous road segment set and add the generated virtual users to the anonymous user set, wherein the proportion of the number of virtual users generated on each road segment to the total number of virtual users is consistent with the proportion of the current number of users on each road segment to the anonymous user set.
Wherein the service request may be a continuous service request including a first service request and a second service request. The virtual user generating unit 662 may be further configured to: for the first service request, generate virtual users on the road segments in the anonymous road segment set corresponding to the first service request, and add the generated virtual users to the anonymous user set corresponding to the first service request; and, for the second service request, add the virtual users generated for the first service request to the anonymous user set corresponding to the second service request, and update the positions of the virtual users so that they are located on road segments in the anonymous road segment set corresponding to the second service request.
Wherein the safety condition may comprise at least one of: the number of the sections in the anonymous section set is larger than a preset value, and the number of the users in the anonymous user set is larger than the preset value.
In addition, the apparatus may further include a road segment cluster fusion module 68 configured to take each road segment cluster as a leaf node and repeat the following process until a root node is generated: selecting, from the unfused road segment clusters at the same level, the road segment cluster with the largest number of historical tracks and the road segment cluster closest to it; and fusing the two selected road segment clusters into a parent level road segment cluster that serves as the parent node of the two selected road segment clusters.
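The bottom-up fusion performed by module 68 can be sketched as follows. The text does not say how the distance between clusters is measured, what track count a fused cluster receives, or what happens to a cluster left without a partner at a level; the sketch assumes Euclidean distance between cluster centers, summed track counts, averaged centers, and carrying an unpaired cluster up to the next level unchanged. All names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove drops the exact node
class ClusterNode:
    name: str
    track_count: int             # number of historical tracks in the cluster
    center: tuple[float, float]  # representative point (assumption)
    children: list["ClusterNode"] = field(default_factory=list)


def fuse(a: ClusterNode, b: ClusterNode) -> ClusterNode:
    """Create the parent node of two clusters (summed counts, mean center)."""
    center = ((a.center[0] + b.center[0]) / 2, (a.center[1] + b.center[1]) / 2)
    return ClusterNode(f"({a.name}+{b.name})", a.track_count + b.track_count,
                       center, [a, b])


def build_hierarchy(leaves: list[ClusterNode]) -> ClusterNode:
    """Fuse clusters level by level until a single root node remains."""
    if not leaves:
        raise ValueError("at least one road segment cluster is required")
    level = list(leaves)
    while len(level) > 1:
        unfused, next_level = list(level), []
        while len(unfused) > 1:
            # Cluster with the most historical tracks among the unfused ones.
            busiest = max(unfused, key=lambda n: n.track_count)
            unfused.remove(busiest)
            # Its nearest unfused neighbour at the same level.
            nearest = min(unfused, key=lambda n: math.dist(n.center, busiest.center))
            unfused.remove(nearest)
            next_level.append(fuse(busiest, nearest))
        next_level.extend(unfused)  # unpaired cluster moves up unchanged
        level = next_level
    return level[0]


root = build_hierarchy([
    ClusterNode("A", 5, (0, 0)), ClusterNode("B", 1, (1, 0)),
    ClusterNode("C", 3, (10, 0)), ClusterNode("D", 2, (11, 0)),
])
```

The resulting tree gives every road segment cluster a parent level cluster, which is what the anonymous set expansion relies on when the sets built from the original cluster do not meet the safety condition.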
FIG. 7 is a block diagram of yet another embodiment of a data processing device of the present invention. As shown in fig. 7, the apparatus 700 of this embodiment includes: a memory 710 and a processor 720 coupled to the memory 710, the processor 720 being configured to perform the data processing method of any of the foregoing embodiments based on instructions stored in the memory 710.
Memory 710 may include, for example, system memory, fixed non-volatile storage media, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
Fig. 8 is a block diagram of still another embodiment of a data processing apparatus of the present invention. As shown in fig. 8, the apparatus 700 of this embodiment includes the memory 710 and the processor 720, and may further include an input/output interface 830, a network interface 840, a storage interface 850, and the like. These interfaces 830, 840, 850, the memory 710, and the processor 720 may be connected, for example, by a bus 860. The input/output interface 830 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 840 provides a connection interface for various networking devices. The storage interface 850 provides a connection interface for external storage devices such as an SD card and a USB flash drive.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the program is configured to implement any one of the aforementioned data processing methods when executed by a processor.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (16)

1. A data processing method, comprising:
acquiring service requests sent by one or more users located on a certain road section cluster in a preset time period;
acquiring road sections adjacent to the road section where the user is located in the road section cluster to form an anonymous road section set, and acquiring the user on the road section in the anonymous road section set to form an anonymous user set;
checking whether the anonymous road section set and/or the anonymous user set meet safety conditions;
if the anonymous road section set and/or the anonymous user set do not meet the safety condition, adding a road section with the minimum distance to the road section where the user is located in a parent level road section cluster of the road section cluster where the user is located into the anonymous road section set, and adding the user on the road section in the anonymous road section set into the anonymous user set;
checking whether an anonymous road section set and/or an anonymous user set generated according to a parent level road section cluster of the road section cluster meet safety conditions;
if the anonymous road section set and/or the anonymous user set generated according to the parent level road section cluster of the road section cluster do not meet the safety condition, determining a first evaluation factor and a second evaluation factor according to the ratio of the coverage area of the road sections in the anonymous road section set to the number of the road sections, the proportion of the number of the virtual users in the anonymous user set and the number of the virtual users, and selecting the anonymous user set and the anonymous road section set according to the first evaluation factor and the second evaluation factor, wherein the first evaluation factor corresponds to the anonymous road section set and the anonymous user set generated according to the road section cluster, and the second evaluation factor corresponds to the anonymous road section set and the anonymous user set generated according to the parent level road section cluster of the road section cluster;
and generating virtual users on the road sections in the anonymous road section set, and adding the generated virtual users to the anonymous user set, wherein the proportion of the number of the virtual users generated on each road section in the total number of the virtual users is consistent with the proportion of the number of the users on each road section in the anonymous user set, and the position information of the users in the anonymous user set or the coverage area information of the anonymous road section set is sent to a server, so that the server returns a service request result to the users according to the received information.
2. The method of claim 1, further comprising:
counting historical tracks in a road network and historical flow of each road;
the following process is repeated until all road segments are added to the road segment cluster:
selecting the road section with the maximum flow in the road sections which do not belong to any road section cluster as a first element of a certain road section cluster;
and adding road sections which are adjacent to the edge road section of the certain road section cluster and belong to the same track in road sections which do not belong to any road section cluster into the certain road section cluster.
3. The method of claim 1, wherein obtaining segments in the cluster of segments that are adjacent to segments on which users are located forms a set of anonymous segments, and wherein obtaining users on segments in the set of anonymous segments to form a set of anonymous users comprises:
and obtaining road sections adjacent to the road section where the user is located in the road section cluster in the traveling direction of the user to form an anonymous road section set, and obtaining the user on the road section in the anonymous road section set to form an anonymous user set.
4. The method of claim 1, further comprising:
checking whether the anonymous road section set and/or the anonymous user set meet safety conditions;
and if the anonymous road section set and/or the anonymous user set do not meet the safety condition, generating virtual users on the road sections in the anonymous road section set, and adding the generated virtual users to the anonymous user set, wherein the proportion of the number of the virtual users generated on each road section in the total number of the virtual users is consistent with the proportion of the number of the users on each road section in the anonymous user set.
5. The method according to claim 1 or 4, wherein the service request is a continuous service request comprising a first service request and a second service request;
the generating virtual users on the road segments in the anonymous road segment set and adding the generated virtual users to the anonymous user set comprises:
aiming at the first service request, generating a virtual user on a road section in an anonymous road section set corresponding to the first service request, and adding the generated virtual user to the anonymous user set corresponding to the first service request;
and aiming at the second service request, adding the virtual user generated aiming at the first service request to an anonymous user set corresponding to the second service request, and updating the position of the virtual user to enable the virtual user to be positioned on a road section in the anonymous road section set corresponding to the second service request.
6. The method according to any one of claims 1-4, wherein the safety condition comprises at least one of: the number of the sections in the anonymous section set is larger than a preset value, and the number of the users in the anonymous user set is larger than the preset value.
7. The method according to any one of claims 1-4, further comprising:
taking each road section cluster as a leaf node;
the following process is repeated until a root node is generated:
selecting the road section cluster with the maximum number of historical tracks and the road section cluster with the shortest distance to the road section cluster with the maximum number of historical tracks from the non-fused road section clusters in the same level;
and fusing the two selected road section clusters into a parent level road section cluster as a parent node of the two selected road section clusters.
8. A data processing apparatus, comprising:
a request acquisition module configured to acquire service requests sent by one or more users located on a certain road section cluster in a preset time period;
the anonymous set forming module is configured to obtain road sections, adjacent to the road section where the user is located, in the road section cluster to form an anonymous road section set, and obtain the user on the road section in the anonymous road section set to form an anonymous user set;
the anonymous set expansion module is configured to check whether an anonymous road section set and/or an anonymous user set meet safety conditions, if the anonymous road section set and/or the anonymous user set do not meet the safety conditions, a road section with the minimum distance to the road section where the user is located in a parent level road section cluster of the road section cluster where the user is located is added into the anonymous road section set, and the user on the road section in the anonymous road section set is added into the anonymous user set;
a second virtual user generation module comprising:
a security condition detection unit configured to check whether an anonymous road segment set and/or an anonymous user set generated according to a parent level road segment cluster of the road segment cluster satisfies a security condition;
an evaluation factor generation unit configured to, if the anonymous road section set and/or the anonymous user set generated according to the parent level road section cluster of the road section cluster do not satisfy the safety condition, determine a first evaluation factor and a second evaluation factor according to the ratio of the coverage area of the road sections in an anonymous road section set to the number of road sections, the proportion of the number of virtual users in the anonymous user set, and the number of virtual users, and select the anonymous user set and the anonymous road section set according to the sizes of the first evaluation factor and the second evaluation factor, wherein the first evaluation factor corresponds to the anonymous road section set and the anonymous user set generated according to the road section cluster, and the second evaluation factor corresponds to the anonymous road section set and the anonymous user set generated according to the parent level road section cluster of the road section cluster; and
the virtual user generating unit is configured to generate virtual users on the road sections in the anonymous road section set and add the generated virtual users to the anonymous user set, wherein the proportion of the number of the virtual users generated on each road section in the total number of the virtual users is consistent with the proportion of the number of the users on each road section in the anonymous user set;
and the information sending module is configured to send the position information of the users in the anonymous user set or the coverage area information of the anonymous road section set to a server so that the server returns a service request result to the users according to the received information.
9. The apparatus of claim 8, further comprising a segment cluster generation module configured to count historical tracks in the road network and historical traffic size of each road, and repeat the following process until all segments are added to the segment cluster: selecting the road section with the maximum flow in the road sections which do not belong to any road section cluster as a first element of a certain road section cluster; and adding road sections which are adjacent to the edge road section of the certain road section cluster and belong to the same track in road sections which do not belong to any road section cluster into the certain road section cluster.
10. The apparatus of claim 8, wherein the anonymous set forming module is further configured to obtain segments of the cluster of segments that are adjacent to segments on which the user is located in a direction of travel of the user to form a set of anonymous segments, and obtain the set of anonymous users on the segments in the set of anonymous segments.
11. The apparatus of claim 8, further comprising a first virtual user generation module comprising:
a security condition detection unit configured to check whether the anonymous road segment set and/or the anonymous user set satisfy a security condition;
and the virtual user generating unit is configured to generate virtual users on the road sections in the anonymous road section set and add the generated virtual users to the anonymous user set if the anonymous road section set and/or the anonymous user set do not meet the safety condition, wherein the proportion of the number of the virtual users generated on each road section in the total number of the virtual users is consistent with the proportion of the number of the users on each road section in the anonymous user set.
12. The apparatus according to claim 8 or 11, wherein the service request is a continuous service request comprising a first service request and a second service request;
the virtual user generating unit is further configured to generate a virtual user on a road segment in an anonymous road segment set corresponding to the first service request aiming at the first service request, and add the generated virtual user to the anonymous user set corresponding to the first service request; and aiming at the second service request, adding the virtual user generated aiming at the first service request to an anonymous user set corresponding to the second service request, and updating the position of the virtual user to enable the virtual user to be positioned on a road section in the anonymous road section set corresponding to the second service request.
13. The apparatus of any one of claims 8-11, wherein the safety condition comprises at least one of: the number of the sections in the anonymous section set is larger than a preset value, and the number of the users in the anonymous user set is larger than the preset value.
14. The apparatus according to any one of claims 8-11, further comprising a road section cluster fusion module configured to take each road section cluster as a leaf node and repeat the following process until a root node is generated: selecting the road section cluster with the maximum number of historical tracks and the road section cluster with the shortest distance to the road section cluster with the maximum number of historical tracks from the non-fused road section clusters in the same level; and fusing the two selected road section clusters into a parent level road section cluster as a parent node of the two selected road section clusters.
15. A data processing apparatus, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the data processing method of any of claims 1-7 based on instructions stored in the memory.
16. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 7.
CN201710137709.XA 2017-03-09 2017-03-09 Data processing method and device Active CN108573165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710137709.XA CN108573165B (en) 2017-03-09 2017-03-09 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108573165A CN108573165A (en) 2018-09-25
CN108573165B true CN108573165B (en) 2020-11-24

Family

ID=63577852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710137709.XA Active CN108573165B (en) 2017-03-09 2017-03-09 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108573165B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant