CN111460513B - Similarity-binning-based space point set data privacy protection matching method - Google Patents


Publication number
CN111460513B
Authority
CN
China
Prior art keywords
point set
set data
grouping
data
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010344075.7A
Other languages
Chinese (zh)
Other versions
CN111460513A (en
Inventor
张海涛
冀康
乐洋
陈一祥
李文梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202010344075.7A
Publication of CN111460513A
Application granted
Publication of CN111460513B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Abstract

The invention provides a similarity-binning-based privacy-preserving matching method for spatial point set data, comprising the following steps. First, the union of the point set data ranges is grouped at equal intervals: the two parties negotiate the data grouping parameters, partition the space of the original point set data, and obtain the grouping number of each point by matching the point set data against the partitioned space. Second, the similarity between the point set data and a reference value is binned at equal intervals: the similarity between each attribute value and the reference value is calculated, and all similarity values are binned by equal-interval partitioning to obtain the binning combinations of all point set data. Third, matching is calculated from the grouping combination numbers and binning combination numbers of the point set data: an identification number is obtained for each point from its grouping number and binning combination, the matching point pairs are obtained from the identification numbers, and the two parties finally exchange the corresponding point set data according to the matching point pairs. The method offers strong privacy protection and adjustable precision.

Description

Similarity-binning-based space point set data privacy protection matching method
Technical Field
The invention relates to the technical field of spatial data privacy protection, and in particular to a similarity-binning-based privacy-preserving matching method for spatial point set data.
Background
In recent years, with the widespread adoption of global positioning systems, sensor networks, and mobile devices, large amounts of emerging spatial data have been generated. Emerging spatial data cover a large number of users and a large spatio-temporal scale, advantages that traditional spatial data cannot match. Analysis of emerging spatial data can reveal semantically rich patterns and can support decision-making in the construction of smart cities.
At present, many analysis applications for emerging spatial data share a serious common problem: data bias. That is, when emerging spatial data from a single source are analysed, it is difficult to obtain a complete description of user activity in an area. For example, subscriber location data generated in mobile communication networks are typically collected and stored by different carriers (in China, the three carriers China Mobile, China Unicom, and China Telecom, with dual-SIM subscribers accounting for only a small percentage), so analysis based on one carrier's subscriber location data typically cannot cover all subscribers in the area. Therefore, to ensure that analysis applications for emerging spatial data are unbiased, emerging spatial data from different sources must be analysed together.
The main means of integrated spatial data analysis are the various spatial data mining techniques (such as decision trees, association rules, and clustering), which take matched spatial data as their object and implicit knowledge and spatial relationships as their discovery targets. Spatial data matching is the basis of spatial data mining, and privacy-preserving matching is the core of secure data analysis. Existing privacy-preserving matching techniques mainly rely on a third party, which leaves them open to collusion attacks between the third party and a data owner.
Disclosure of Invention
The invention aims to provide a similarity-binning-based privacy-preserving matching method for spatial point set data that requires only direct interaction between the two data owners, without the participation of a third party, and that offers strong privacy protection and adjustable precision.
The invention provides a similarity-binning-based privacy-preserving matching method for spatial point set data, comprising the following steps:
step 1: equidistant grouping of the union of the point set data ranges: negotiating the data grouping parameters, partitioning the space of the original point set data, and obtaining the grouping of the point set data by matching the point set data against the partitioned space;
step 2: equidistant binning of the similarity between the point set data and a reference value: calculating the similarity between each attribute value and the reference value, and binning all similarity values by equal-interval partitioning to obtain the binning combinations of all point set data;
step 3: matching calculation based on the grouping combination numbers and binning combination numbers of the point set data: obtaining the unique matching number of the point set data, obtaining the matching point pairs of the point set data from the unique matching numbers, and finally exchanging the corresponding point set data between the two parties according to the matching point pairs.
As a further improvement, the equidistant grouping of the union of the point set data ranges in step 1 comprises the following specific steps:
step 1.1: exchanging the ranges of the point set data of the two parties to obtain a range union;
step 1.2: agreeing on the grouping parameters, and obtaining the grouping of the range union by the equal-interval method;
step 1.3: matching the grouping of the range union against the coordinates of the point set data to obtain the grouping of the point set data.
As a further improvement, the equidistant binning of the similarity between the point set data and the reference value in step 2 comprises the following specific steps:
step 2.1: obtaining a reference value of the point set data based on the grouping of the range union;
step 2.2: calculating the similarity value between the point set data and the reference value;
step 2.3: exchanging the similarity value ranges of the two parties to obtain the union of the similarity value ranges;
step 2.4: agreeing on the similarity value interval parameter, and obtaining the binning of the union of the similarity value ranges by the equal-interval method;
step 2.5: matching the similarity value between the point set data and the reference value against the bins of the union of the similarity value ranges to obtain the binning of the point set data.
As a further improvement, the matching calculation based on the grouping combination numbers and binning combination numbers of the point set data in step 3 comprises the following specific steps:
step 3.1: exchanging the groupings of the point set data of the two parties, obtaining the grouping combination by intersection, and from it the grouping combination numbers of the point set data;
step 3.2: agreeing on the binning combination interval parameter to obtain the binning combination, and from it the binning combination numbers of the point set data;
step 3.3: combining the grouping combination number and the binning combination numbers of the point set data to obtain the unique matching number of the point set data;
step 3.4: exchanging the unique matching numbers of the point set data of the two parties, and obtaining the matching point pairs by intersection;
step 3.5: exchanging the corresponding point set data between the two parties according to the matching point pairs.
The invention has the following beneficial effects: strong privacy protection, adjustable precision, and no need for a third party. The two data owners exchange matching information only in the form of the coarser grouping combinations and binning combinations, which gives stronger privacy protection. The privacy protection precision can be adjusted flexibly through parameters such as the grouping interval and the binning combination interval. The matching of the point set data is completed through direct interaction between the two data owners alone, so collusion attacks are effectively avoided and privacy protection is further improved.
Drawings
FIG. 1 is a graphical representation of the point set $Ps_1$ of the present invention.
FIG. 2 is a graphical representation of the point set $Ps_2$ of the present invention.
FIG. 3 is a graphical representation of the point set grouping of the point set $Ps_1$ of the present invention.
FIG. 4 is a graphical representation of the point set grouping of the point set $Ps_2$ of the present invention.
FIG. 5 is a graphical representation of the point set binning combination of the point set $Ps_1$ of the present invention.
FIG. 6 is a graphical representation of the point set binning combination of the point set $Ps_2$ of the present invention.
FIG. 7 is a graphical representation of the point set data corresponding to the matching point pairs of the present invention.
Detailed Description
For the purpose of enhancing understanding of the present invention, the present invention will be further described in detail with reference to the following examples, which are provided for illustration only and are not to be construed as limiting the scope of the present invention. First, several basic definitions are given:
Definition 1 (point set): $Ps = \{p_1, p_2, \ldots, p_n\}$, $n \ge 1$, denotes a set of points in a given space, where $p_i = (p_i\cdot x, p_i\cdot y)$, $1 \le i \le n$, is the $i$-th point of $Ps$, and $p_i\cdot x$ and $p_i\cdot y$ are the abscissa and ordinate of the point $p_i$.
Definition 2 (point set range): for a point set $Ps = \{p_1, p_2, \ldots, p_n\}$, $n \ge 1$, the point set range is defined as $PsE = [\min(p_i\cdot x, p_i\cdot y), \max(p_i\cdot x, p_i\cdot y)]$, $1 \le i \le n$.
Definition 3 (point set range union): given two point set ranges $PsE_1 = [a_1, b_1]$ and $PsE_2 = [a_2, b_2]$, their union is defined as $PsEu = [a, b]$, where $a = \min(a_1, a_2)$ and $b = \max(b_1, b_2)$.
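As an illustration of Definitions 2 and 3 only, the following Python sketch computes a point set range and the union of two ranges. The function names `point_set_range` and `range_union` are hypothetical and are introduced here purely for illustration; the two ranges in the usage line are the values reported in the worked example of this description.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Range = Tuple[float, float]


def point_set_range(ps: List[Point]) -> Range:
    """Definition 2: smallest and largest value over all x and y coordinates."""
    values = [v for p in ps for v in p]
    return (min(values), max(values))


def range_union(r1: Range, r2: Range) -> Range:
    """Definition 3: [min(a1, a2), max(b1, b2)]."""
    return (min(r1[0], r2[0]), max(r1[1], r2[1]))


# PsE1 and PsE2 as reported in the worked example.
print(range_union((0, 3369), (173, 3500)))  # (0, 3500)
```

Only the two range endpoints of each party enter `range_union`, which is consistent with step 1.1, where the parties exchange ranges rather than points.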
Definition 4 (range union grouping): given a range union $PsEu = [a, b]$, its equidistant grouping is defined as $PsEuG = \{G_1, G_2, \ldots, G_n\}$,
where $G_1 = [a, k_1)$, $G_2 = [k_1, k_2)$, ..., $G_n = [k_{n-1}, b]$, with
$[a, k_1) \cup [k_1, k_2) \cup \ldots \cup [k_{n-1}, b] = [a, b]$ and
$k_1 - a = k_2 - k_1 = \ldots = b - k_{n-1} = EI$, where $EI$ is the grouping interval.
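The equidistant grouping of Definition 4 can be sketched as follows. This is a minimal Python illustration that assumes $b - a$ is a whole multiple of $EI$, as in the worked example ($[0, 3500]$ with $EI = 500$); the function name `equidistant_groups` is hypothetical.

```python
from typing import List, Tuple


def equidistant_groups(a: float, b: float, ei: float) -> List[Tuple[float, float]]:
    """Split [a, b] into groups of width ei (Definition 4).

    Each tuple is (lower, upper). Every group is half-open [lower, upper)
    except the last one, which is closed and ends at b.
    """
    n = round((b - a) / ei)
    # Assumption of this sketch: the range is an exact multiple of ei.
    if n < 1 or abs(a + n * ei - b) > 1e-9:
        raise ValueError("b - a must be a positive multiple of ei")
    return [(a + i * ei, a + (i + 1) * ei) for i in range(n)]


groups = equidistant_groups(0, 3500, 500)
print(len(groups), groups[0], groups[-1])  # 7 (0, 500) (3000, 3500)
```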
Definition 5 (point set grouping): given a point set $Ps = \{p_1, p_2, \ldots, p_n\}$, $n \ge 1$, and the corresponding range union grouping $PsEuG = \{G_1, G_2, \ldots, G_m\}$, the grouping of the point set $Ps$ is defined as $PsG = \{(p_1\cdot G_x, p_1\cdot G_y), (p_2\cdot G_x, p_2\cdot G_y), \ldots, (p_n\cdot G_x, p_n\cdot G_y)\}$, where for any element $(p_i\cdot G_x, p_i\cdot G_y)$ of $PsG$, $1 \le i \le n$, there exist two elements of $PsEuG$, either $G_j = [k_\alpha, k_\beta)$, $G_{j'} = [k_{\alpha'}, k_{\beta'})$, $1 \le j, j' \le m-1$, satisfying $k_\alpha \le p_i\cdot x < k_\beta$, $k_{\alpha'} \le p_i\cdot y < k_{\beta'}$, or $G_j = [k_\alpha, k_\beta]$, $G_{j'} = [k_{\alpha'}, k_{\beta'}]$, $j, j' = m$, satisfying $k_\alpha \le p_i\cdot x \le k_\beta$, $k_{\alpha'} \le p_i\cdot y \le k_{\beta'}$.
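A minimal Python sketch of the group lookup in Definition 5 follows. It treats each coordinate independently, which is how the worked example assigns $p_1 = (253, 3099)$ to $(G_1, G_7)$. The names `group_number` and `point_group` are hypothetical, and groups are represented by their 1-based numbers rather than by intervals.

```python
from typing import Tuple


def group_number(value: float, a: float, b: float, ei: float) -> int:
    """1-based number of the group of [a, b] (width ei) that contains value."""
    if not a <= value <= b:
        raise ValueError("value lies outside the range union")
    n = round((b - a) / ei)
    # Groups are half-open except the last, so b itself falls into group n.
    return min(int((value - a) // ei) + 1, n)


def point_group(p: Tuple[float, float], a: float, b: float, ei: float) -> Tuple[int, int]:
    """Group numbers of the x and y coordinates of p."""
    return (group_number(p[0], a, b, ei), group_number(p[1], a, b, ei))


# p1 of Ps1 in the worked example, range union [0, 3500], EI = 500.
print(point_group((253, 3099), 0, 3500, 500))  # (1, 7)
```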
Definition 6 (point set reference value): given a point set $Ps$ and its corresponding range union grouping $PsEuG = \{G_1, G_2, \ldots, G_n\}$, the reference value of $Ps$ is defined as $EuGRv = G_i\cdot k_\beta$, where $G_i = [k_\alpha, k_\beta)$, $1 \le i \le n-1$, or $G_i = [k_\alpha, k_\beta]$, $i = n$; $G_i$ denotes the $i$-th group and $G_i\cdot k_\beta$ is the value of the right boundary of $G_i$.
Definition 7 (similarity value between a point set and the reference value): given a point set $Ps = \{p_1, p_2, \ldots, p_n\}$, the corresponding point set grouping $PsG = \{(p_1\cdot G_x, p_1\cdot G_y), (p_2\cdot G_x, p_2\cdot G_y), \ldots, (p_n\cdot G_x, p_n\cdot G_y)\}$, and the corresponding reference value $EuGRv = G_j\cdot k_\beta$, for a point $p_i$ in $Ps$, $1 \le i \le n$, the similarity value between $p_i\cdot x$ and the reference value $EuGRv$ is:
Figure GDA0002825935600000061
where $j$ is the number of the group corresponding to the reference value $EuGRv$, $k$ is the number of the group $p_i\cdot G_x$ corresponding to $p_i\cdot x$, and $EI$ is the interval of the group $p_i\cdot G_x$.
Correspondingly, the similarity value between $p_i\cdot y$ and the reference value $EuGRv$ is:
Figure GDA0002825935600000062
where $j$ is the number of the group corresponding to the reference value $EuGRv$, $k$ is the number of the group $p_i\cdot G_y$ corresponding to $p_i\cdot y$, and $EI$ is the interval of the group $p_i\cdot G_y$.
Thus, the similarity value between $p_i$ and the reference value $EuGRv$ is:
$Sim(p_i, EuGRv) = (Sim(p_i\cdot x, EuGRv), Sim(p_i\cdot y, EuGRv))$, and, further, the similarity value between $Ps$ and the reference value $EuGRv$ is:
$Sim(Ps, EuGRv) = (Sim(p_1, EuGRv), Sim(p_2, EuGRv), \ldots, Sim(p_n, EuGRv))$.
Definition 8 (similarity value range): given the similarity value between a point set $Ps = \{p_1, p_2, p_3, \ldots, p_n\}$ and the reference value $EuGRv$,
$Sim(Ps, EuGRv) = (Sim(p_1, EuGRv), Sim(p_2, EuGRv), \ldots, Sim(p_n, EuGRv))$, the corresponding similarity value range is defined as:
$SimE(Ps, EuGRv) = [\min(\min(Sim(p_i\cdot x, EuGRv)), \min(Sim(p_i\cdot y, EuGRv))), \max(\max(Sim(p_i\cdot x, EuGRv)), \max(Sim(p_i\cdot y, EuGRv)))]$, where $1 \le i \le n$.
Definition 9 (union of similarity value ranges): given two similarity value ranges $SimE(Ps_1, EuGRv) = [a_1, b_1]$ and $SimE(Ps_2, EuGRv) = [a_2, b_2]$, their union is defined as $SimEu = [a, b]$, where $a = \min(a_1, a_2)$ and $b = \max(b_1, b_2)$.
Definition 10 (binning of the union of similarity value ranges): given a union of similarity value ranges $SimEu = [a, b]$, its equidistant binning is defined as:
$SimEuB = \{B_1, B_2, \ldots, B_n\}$,
where $B_1 = [a, k_1)$, $B_2 = [k_1, k_2)$, ..., $B_n = [k_{n-1}, b]$, with
$[a, k_1) \cup [k_1, k_2) \cup \ldots \cup [k_{n-1}, b] = [a, b]$ and
$k_1 - a = k_2 - k_1 = \ldots = b - k_{n-1} = SI$, where $SI$ is the bin interval.
Definition 11 (point set binning): given a point set $Ps = \{p_1, p_2, \ldots, p_n\}$, $n \ge 1$, the reference value $EuGRv$, and the corresponding binning of the union of similarity value ranges
$SimEuB = \{B_1, B_2, \ldots, B_m\}$, the binning of the point set $Ps$ is defined as:
$PsB = \{(p_1\cdot B_x, p_1\cdot B_y), (p_2\cdot B_x, p_2\cdot B_y), \ldots, (p_n\cdot B_x, p_n\cdot B_y)\}$,
where, for any element $(p_i\cdot B_x, p_i\cdot B_y)$ of $PsB$, $1 \le i \le n$, there exist two corresponding elements of $SimEuB$, either $B_j = [k_\alpha, k_\beta)$, $B_{j'} = [k_{\alpha'}, k_{\beta'})$, $1 \le j, j' \le m-1$, satisfying $k_\alpha \le Sim(p_i\cdot x, EuGRv) < k_\beta$, $k_{\alpha'} \le Sim(p_i\cdot y, EuGRv) < k_{\beta'}$, or $B_j = [k_\alpha, k_\beta]$, $B_{j'} = [k_{\alpha'}, k_{\beta'}]$, $j, j' = m$, satisfying
$k_\alpha \le Sim(p_i\cdot x, EuGRv) \le k_\beta$, $k_{\alpha'} \le Sim(p_i\cdot y, EuGRv) \le k_{\beta'}$.
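The bin lookup of Definitions 10 and 11 can be sketched in Python as follows. The function name `bin_number` is hypothetical. The sketch uses the standard `decimal` module because binary floating point would place a value such as 0.30 just below the boundary of $[0.30, 0.40)$; that choice is an implementation detail of this illustration and is not taken from the patent.

```python
from decimal import Decimal


def bin_number(sim, a, b, si) -> int:
    """1-based number of the bin of [a, b] (width si) that contains sim."""
    # Convert via str() so that 0.3 becomes exactly Decimal('0.3').
    sim, a, b, si = (Decimal(str(v)) for v in (sim, a, b, si))
    if not a <= sim <= b:
        raise ValueError("similarity lies outside the binned range")
    n = int((b - a) / si)
    # Bins are half-open except the last, so b itself falls into bin n.
    return min(int((sim - a) // si) + 1, n)


# Sim(p1, EuGRv) = (0.51, 0.20) in the worked example, SimEu = [0, 1.0], SI = 0.1.
print(bin_number(0.51, 0, 1.0, 0.1), bin_number(0.20, 0, 1.0, 0.1))  # 6 3
```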
Definition 12 (grouping combination): given two point set groupings
$PsG = \{(p_1\cdot G_x, p_1\cdot G_y), (p_2\cdot G_x, p_2\cdot G_y), \ldots, (p_n\cdot G_x, p_n\cdot G_y)\}$,
$PsG' = \{(p'_1\cdot G_x, p'_1\cdot G_y), (p'_2\cdot G_x, p'_2\cdot G_y), \ldots, (p'_m\cdot G_x, p'_m\cdot G_y)\}$,
the grouping combination of $PsG$ and $PsG'$ is defined as the intersection of their elements, i.e.:
Figure GDA0002825935600000086
$s \le \min(m, n)$,
where, for any element of $PsGC$
Figure GDA0002825935600000087
$0 \le i \le s$, there exist corresponding elements $(p_j\cdot G_x, p_j\cdot G_y)$ and $(p'_k\cdot G_x, p'_k\cdot G_y)$ in $PsG$ and $PsG'$ respectively, $1 \le j \le n$, $1 \le k \le m$, satisfying the condition:
Figure GDA0002825935600000081
Definition 13 (point set grouping combination number): given a point set $Ps$, the corresponding point set grouping
$PsG = \{(p_1\cdot G_x, p_1\cdot G_y), (p_2\cdot G_x, p_2\cdot G_y), \ldots, (p_n\cdot G_x, p_n\cdot G_y)\}$, and the grouping combination
Figure GDA0002825935600000082
the grouping combination number of the point set $Ps$ is defined as: $PsGCNo = \{p_1\cdot GCNo, p_2\cdot GCNo, \ldots, p_n\cdot GCNo\}$,
where, for any element $p_i\cdot GCNo$ of $PsGCNo$, $1 \le i \le n$, if the condition is satisfied: $(p_i\cdot G_x, p_i\cdot G_y) \in PsGC$, and
Figure GDA0002825935600000083
$0 \le j \le s$, then $p_i\cdot GCNo = \{j\}$.
Otherwise, if the condition is satisfied:
Figure GDA0002825935600000084
then $p_i\cdot GCNo = \{null\}$.
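Definitions 12 and 13 can be illustrated with the following Python sketch. It assumes that the elements of the grouping combination are sorted and numbered from 0, which reproduces the numbers reported in the worked example but is not stated explicitly in the text; `None` stands for $null$. The function names are hypothetical, and the input data are the groupings $PsG_1$ and $PsG_2$ of the worked example, written as pairs of group numbers.

```python
from typing import List, Optional, Tuple

Group = Tuple[int, int]  # (number of the x group, number of the y group)


def grouping_combination(psg1: List[Group], psg2: List[Group]) -> List[Group]:
    """Definition 12: group pairs that occur on both sides (sorted: assumption)."""
    return sorted(set(psg1) & set(psg2))


def grouping_combination_numbers(psg: List[Group], psgc: List[Group]) -> List[Optional[int]]:
    """Definition 13: position of each point's group pair in psgc, else None."""
    number = {g: j for j, g in enumerate(psgc)}
    return [number.get(g) for g in psg]


psg1 = [(1, 7), (1, 6), (2, 3), (1, 4), (2, 6), (4, 4), (4, 7), (4, 7)]
psg2 = [(4, 7), (2, 7), (2, 6), (1, 2), (4, 6), (1, 7), (1, 4), (4, 4)]
psgc = grouping_combination(psg1, psg2)
print(psgc)  # [(1, 4), (1, 7), (2, 6), (4, 4), (4, 7)]
print(grouping_combination_numbers(psg1, psgc))  # [1, None, None, 0, 2, 3, 4, 4]
```

Points whose group pair does not occur on the other side receive `None` and are excluded from all later matching steps.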
Definition 14 (binning combination): given the binning $SimEuB = \{B_1, B_2, \ldots, B_n\}$ of a union of similarity value ranges, the corresponding binning combination is defined as:
Figure GDA0002825935600000085
where $BC_i = ((B_1, B_2), (B_3, B_4))$.
If $(i \bmod (n-1)) \ne 0$, then
Figure GDA0002825935600000091
$B_3 = B_{(i \bmod (n-1))}$, $B_4 = B_{(i \bmod (n-1)) + BI}$.
Otherwise, $B_1 = B_{(i/(n-1))}$, $B_2 = B_{(i/(n-1)) + BI}$, $B_3 = B_{n-1}$, $B_4 = B_n$.
Here,
Figure GDA0002825935600000092
denotes rounding down, and $BI$ denotes the binning combination interval.
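The enumeration in Definition 14 can be sketched in Python for the case $BI = 1$ used in the worked example. The expression for the first pair when $(i \bmod (n-1)) \ne 0$ is given only as an image in this text; the sketch assumes it is $B_{\lfloor i/(n-1) \rfloor + 1}$, which is inferred from the values $BC_1$, $BC_{38}$, $BC_{39}$, $BC_{47}$ and $BC_{48}$ quoted in the worked example and is not a quotation of the patent's formula. The function name `binning_combination` is hypothetical, and bins are represented by their 1-based numbers.

```python
from typing import Tuple

Pair = Tuple[int, int]


def binning_combination(i: int, n: int) -> Tuple[Pair, Pair]:
    """Return BC_i as ((B1, B2), (B3, B4)) for n bins and BI = 1."""
    if not 1 <= i <= (n - 1) ** 2:
        raise ValueError("i must lie in 1..(n-1)^2")
    q, r = divmod(i, n - 1)
    if r != 0:
        # ASSUMPTION: first-pair index inferred from the worked example.
        first = q + 1
        return ((first, first + 1), (r, r + 1))
    # Case (i mod (n-1)) == 0, as written in Definition 14.
    return ((q, q + 1), (n - 1, n))


print(binning_combination(1, 10))   # ((1, 2), (1, 2))
print(binning_combination(9, 10))   # ((1, 2), (9, 10))
print(binning_combination(38, 10))  # ((5, 6), (2, 3))
```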
Definition 15 (point set binning combination): given the binning of a point set $Ps$,
$PsB = \{(p_1\cdot B_x, p_1\cdot B_y), (p_2\cdot B_x, p_2\cdot B_y), \ldots, (p_n\cdot B_x, p_n\cdot B_y)\}$, and the binning combination
Figure GDA0002825935600000093
the binning combination of the point set $Ps$ is defined as: $PsBC = \{p_1\cdot BC, p_2\cdot BC, \ldots, p_n\cdot BC\}$,
where $p_i\cdot BC = \{bcn_1, bcn_2, \ldots, bcn_s\}$, $1 \le s \le (m-1)^2$, and any element $bcn_j$, $1 \le j \le s$, called a binning combination number of the point $p_i$, satisfies the condition:
$((BC_j\cdot B_1 = p_i\cdot B_x) \lor (BC_j\cdot B_2 = p_i\cdot B_x)) \land ((BC_j\cdot B_3 = p_i\cdot B_y) \lor (BC_j\cdot B_4 = p_i\cdot B_y))$.
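For $BI = 1$, the condition of Definition 15 can be evaluated without enumerating every $BC_j$. The Python sketch below assumes that $BC_i$ pairs the adjacent bins $(B_a, B_{a+1})$ and $(B_c, B_{c+1})$ with $i = (a-1)(n-1) + c$; this indexing is inferred from the combinations quoted in the worked example, not quoted from the patent, and the function name is hypothetical.

```python
from typing import List


def point_binning_combination_numbers(bx: int, by: int, n: int) -> List[int]:
    """Numbers of all binning combinations that contain bin bx in the first
    pair and bin by in the second pair (n bins, BI = 1)."""
    # A bin b belongs to the adjacent pairs starting at b-1 and at b,
    # as long as the start index stays within 1..n-1.
    firsts = [a for a in (bx - 1, bx) if 1 <= a <= n - 1]
    seconds = [c for c in (by - 1, by) if 1 <= c <= n - 1]
    return [(a - 1) * (n - 1) + c for a in firsts for c in seconds]


# p1 of Ps1 lies in bins (B6, B3) in the worked example.
print(point_binning_combination_numbers(6, 3, 10))  # [38, 39, 47, 48]
```

A point in an interior bin receives up to four numbers and a point in an edge bin fewer, which is why $p_i\cdot BC$ is a set rather than a single value.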
Definition 16 (point set unique matching number): given the grouping combination number of a point set $Ps$,
$PsGCNo = \{p_1\cdot GCNo, p_2\cdot GCNo, \ldots, p_n\cdot GCNo\}$, the binning $SimEuB = \{B_1, B_2, \ldots, B_m\}$ of the union of similarity value ranges, the binning combination interval $BI$, and the binning combination $PsBC = \{p_1\cdot BC, p_2\cdot BC, \ldots, p_n\cdot BC\}$, the unique matching number of the point set $Ps$ is defined as: $PsMNo = \{p_1\cdot mno, p_2\cdot mno, \ldots, p_n\cdot mno\}$,
where, for any element $p_i\cdot mno$, $1 \le i \le n$, if the corresponding grouping combination number $p_i\cdot GCNo = \{null\}$, then $p_i\cdot mno = \{null\}$.
Otherwise, corresponding to $p_i\cdot BC = \{bcn_1, bcn_2, \ldots, bcn_s\}$,
$p_i\cdot mno = \{mno_1, mno_2, \ldots, mno_s\}$, $1 \le s \le (m-1)^2$, where
$mno_j = p_i\cdot GCNo \times (m - BI)^2 + p_i\cdot BC\cdot bcn_j$, $1 \le j \le s$.
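The formula of Definition 16 translates directly into code. In the Python sketch below, the function name `unique_matching_numbers` is hypothetical and `None` stands for $null$; the usage line repeats the values of $p_1$ from the worked example ($GCNo = 1$, $m = 10$, $BI = 1$).

```python
from typing import List, Optional


def unique_matching_numbers(gcno: Optional[int], bc: List[int],
                            m: int, bi: int = 1) -> Optional[List[int]]:
    """Definition 16: mno_j = GCNo * (m - BI)^2 + bcn_j, or None for null."""
    if gcno is None:
        return None
    base = gcno * (m - bi) ** 2
    return [base + bcn for bcn in bc]


print(unique_matching_numbers(1, [38, 39, 47, 48], 10))  # [119, 120, 128, 129]
```

Since every binning combination number is at most $(m - BI)^2$, the term $GCNo \times (m - BI)^2$ acts as an offset that separates points with different grouping combination numbers.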
Definition 17 (matching pair): given the unique matching numbers of two point sets
$PsMNo = \{p_1\cdot mno, p_2\cdot mno, \ldots, p_n\cdot mno\}$,
$PsMNo' = \{p'_1\cdot mno, p'_2\cdot mno, \ldots, p'_m\cdot mno\}$, the matching pairs of $PsMNo$ and $PsMNo'$ are defined as the intersection of their non-null elements, that is:
$PsMp(PsMNo, PsMNo') = \{psmp_1, psmp_2, \ldots, psmp_s\}$, $s \le \min(m, n)$,
where $psmp_i = (j, k)$, $1 \le i \le s$, $1 \le j \le n$, $1 \le k \le m$, satisfies the condition:
$p_j\cdot mno \in PsMNo$, $p'_k\cdot mno \in PsMNo'$, $p_j\cdot mno = p'_k\cdot mno \ne \{null\}$.
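The intersection of Definition 17 can be sketched in Python as follows. The function name `matching_pairs` is hypothetical and `None` stands for $null$. The numbers for $Ps_1$ are those of the worked example; the numbers for $Ps_2$ appear only as an image in this text, so the values below were recomputed for this sketch from $PsG_2$ and $PsB_2$ under the indexing assumptions stated for the earlier definitions and should be read as illustrative.

```python
from typing import List, Optional, Set, Tuple

Mno = Optional[Set[int]]  # None stands for {null}


def matching_pairs(mno1: List[Mno], mno2: List[Mno]) -> List[Tuple[int, int]]:
    """Definition 17: 1-based index pairs (j, k) whose non-null numbers are equal."""
    pairs = []
    for j, a in enumerate(mno1, start=1):
        for k, b in enumerate(mno2, start=1):
            if a is not None and b is not None and a == b:
                pairs.append((j, k))
    return pairs


ps1_mno = [{119, 120, 128, 129}, None, None, {22, 23, 31, 32},
           {237, 238}, {252, 261}, {367, 368, 376, 377}, {400, 401}]
# Recomputed for this sketch (assumption), not quoted from the patent.
ps2_mno = [{386, 387, 395, 396}, None, {237, 238}, None,
           None, {119, 120, 128, 129}, {57, 58, 66, 67}, {252, 261}]
print(matching_pairs(ps1_mno, ps2_mno))  # [(1, 6), (5, 3), (6, 8)]
```

The printed pairs agree with $psmp = (1, 6)$, $(5, 3)$ and $(6, 8)$ reported in the worked example.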
The first stage: equidistant grouping based on the union of the point set data ranges
Step 1) Exchange the ranges of the point set data of the two parties to obtain the range union.
In this example, the point set data of the data owners A and B are:
Figure GDA0002825935600000101
Figure GDA0002825935600000102
$Ps_1$ and $Ps_2$ are shown graphically in FIG. 1 and FIG. 2.
According to Definition 2, the ranges of $Ps_1$ and $Ps_2$ are calculated respectively, giving $PsE_1 = [0, 3369]$ and $PsE_2 = [173, 3500]$.
According to Definition 3, the union of $PsE_1$ and $PsE_2$ is calculated, giving:
$PsEu(PsE_1, PsE_2) = [0, 3500]$.
Step 2) Agree on the grouping parameters and obtain the grouping of the range union by the equal-interval method.
In this example, the data owners A and B agree that the grouping interval $EI$ is 500 and, according to Definition 4, obtain the grouping $PsEuG = \{G_1, G_2, \ldots, G_7\}$ of $PsEu(PsE_1, PsE_2)$, where
$G_1 = [0, 500)$, $G_2 = [500, 1000)$, $G_3 = [1000, 1500)$, $G_4 = [1500, 2000)$, $G_5 = [2000, 2500)$, $G_6 = [2500, 3000)$, $G_7 = [3000, 3500]$.
Step 3) Match the grouping of the range union against the coordinates of the point set data to obtain the grouping of the point set data.
In this example, according to Definition 5, the data owners A and B respectively match the coordinates of the points in $Ps_1$ and $Ps_2$ against the ranges of the elements of the grouping $PsEuG$, obtaining the corresponding point set data groupings $PsG_1$ and $PsG_2$.
Taking $p_1 = (253, 3099)$ in $Ps_1$ as an example, the calculation proceeds as follows:
$p_1\cdot x = 253$ matches $G_1 = [0, 500)$ in $PsEuG$, i.e. $0 \le p_1\cdot x < 500$; thus $p_1\cdot G_x = G_1$.
$p_1\cdot y = 3099$ matches $G_7 = [3000, 3500]$ in $PsEuG$, i.e.
$3000 \le p_1\cdot y \le 3500$; thus $p_1\cdot G_y = G_7$.
Similarly, the groups of $p_2$ to $p_8$ in $Ps_1$ are calculated as:
$p_2\cdot G_x = G_1$, $p_2\cdot G_y = G_6$
$p_3\cdot G_x = G_2$, $p_3\cdot G_y = G_3$
$p_4\cdot G_x = G_1$, $p_4\cdot G_y = G_4$
$p_5\cdot G_x = G_2$, $p_5\cdot G_y = G_6$
$p_6\cdot G_x = G_4$, $p_6\cdot G_y = G_4$
$p_7\cdot G_x = G_4$, $p_7\cdot G_y = G_7$
$p_8\cdot G_x = G_4$, $p_8\cdot G_y = G_7$
Finally,
$PsG_1 = \{(G_1, G_7), (G_1, G_6), (G_2, G_3), (G_1, G_4), (G_2, G_6), (G_4, G_4), (G_4, G_7), (G_4, G_7)\}$.
Further, the grouping of $Ps_2$ is calculated as
$PsG_2 = \{(G_4, G_7), (G_2, G_7), (G_2, G_6), (G_1, G_2), (G_4, G_6), (G_1, G_7), (G_1, G_4), (G_4, G_4)\}$.
$PsG_1$ and $PsG_2$ are shown graphically in FIG. 3 and FIG. 4, respectively.
The second stage: equidistant binning of the similarity between the point set data and the reference value
Step 4) Obtain the reference value of the point set data based on the grouping of the range union.
In this example, the range union of $Ps_1$ and $Ps_2$ is grouped as $PsEuG = \{G_1, G_2, \ldots, G_7\}$. According to Definition 6, the value of the right boundary of the first group $G_1 = [0, 500)$ is selected as the reference value, i.e. $EuGRv = G_1\cdot k_\beta = 500$.
Step 5) Calculate the similarity value between the point set data and the reference value.
In this example, the similarity between the point sets $Ps_1$, $Ps_2$ and the reference value $EuGRv$ is calculated according to Definition 7, giving the corresponding similarity sets $Sim(Ps_1, EuGRv)$ and $Sim(Ps_2, EuGRv)$.
Taking $p_1 = (253, 3099)$ in $Ps_1$ as an example, the calculation proceeds as follows:
$p_1\cdot x = 253$, $p_1\cdot y = 3099$;
$EI = 500$; that is, the interval of the group $p_1\cdot G_x$ corresponding to $p_1\cdot x$ is 500.
$p_1\cdot G_x = G_1$, $k = 1$; that is, the group $p_1\cdot G_x$ corresponding to $p_1\cdot x$ is $G_1$, numbered 1.
$EuGRv = G_1\cdot k_\beta = 500$, $j = 1$; that is, the group corresponding to the reference value $EuGRv$ is $G_1$, numbered 1. Therefore $k \le j$, and
Figure GDA0002825935600000121
Similarly, $p_1\cdot G_y = G_7$, $k = 7$; that is, the group $p_1\cdot G_y$ corresponding to $p_1\cdot y$ is $G_7$, numbered 7.
Therefore $k > j$, and
Figure GDA0002825935600000131
Thus, $Sim(p_1, EuGRv) = (0.51, 0.20)$.
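The two formula images above are not reproduced in this text, so the exact expression of Definition 7 cannot be quoted here. As a consistency check only, the Python sketch below assumes that the similarity of a coordinate equals its offset from the lower boundary of its own group divided by the grouping interval $EI$. This assumption reproduces the values 0.51 and 0.20 reported for $p_1$, but it is not taken from the patent's formula, and the name `offset_similarity` is hypothetical.

```python
def offset_similarity(value: float, a: float, ei: float, k: int) -> float:
    """ASSUMED similarity: position of value inside group k (1-based) of a
    grouping that starts at a with width ei, as a fraction of ei."""
    lower = a + (k - 1) * ei
    return (value - lower) / ei


# p1 = (253, 3099): x lies in G1 (k = 1), y lies in G7 (k = 7), EI = 500.
print(round(offset_similarity(253, 0, 500, 1), 2))   # 0.51
print(round(offset_similarity(3099, 0, 500, 7), 2))  # 0.2
```

Under the same assumption, the points $p_5 = (989, 2650)$ and $p_6 = (1580, 1988)$ quoted at the end of this example give $(0.98, 0.30)$ and $(0.16, 0.98)$, matching the similarity values listed for them.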
Further, the similarity between the points $p_2$ to $p_8$ in $Ps_1$ and the reference value $EuGRv$ is calculated, with the following results:
$Sim(p_2, EuGRv) = (0, 0.36)$, $Sim(p_3, EuGRv) = (0.76, 0.74)$, $Sim(p_4, EuGRv) = (0.34, 0.41)$, $Sim(p_5, EuGRv) = (0.98, 0.30)$, $Sim(p_6, EuGRv) = (0.16, 0.98)$, $Sim(p_7, EuGRv) = (0.58, 0.74)$, $Sim(p_8, EuGRv) = (0.95, 0.40)$.
That is, $Sim(Ps_1, EuGRv) = ((0.51, 0.20), (0, 0.36), (0.76, 0.74), (0.34, 0.41), (0.98, 0.30), (0.16, 0.98), (0.58, 0.74), (0.95, 0.40))$.
Further, the similarity between the points $p_1$ to $p_8$ in $Ps_2$ and the reference value $EuGRv$ is calculated, giving:
$Sim(Ps_2, EuGRv) = ((0.75, 0.89), (0.51, 0.29), (0.99, 0.31), (0.35, 0.56), (0.46, 1.0), (0.50, 0.21), (0.77, 0.32), (0.17, 0.98))$.
Step 6) Exchange the similarity value ranges of the two parties to obtain the union of the similarity value ranges.
In this example, the data owners A and B first obtain, according to Definition 8, the similarity value ranges of $Sim(Ps_1, EuGRv)$ and $Sim(Ps_2, EuGRv)$: $SimE(Ps_1, EuGRv) = [0, 0.98]$ and $SimE(Ps_2, EuGRv) = [0.17, 1.0]$. They then exchange these ranges and, according to Definition 9, calculate the union of the similarity value ranges, $SimEu = [0, 1.0]$.
Step 7) Agree on the similarity value interval parameter and obtain the binning of the union of the similarity value ranges by the equal-interval method.
In this example, the data owners A and B agree that the bin interval $SI$ of the similarity values is 0.1 and, according to Definition 10, obtain the binning of $SimEu = [0, 1.0]$: $SimEuB = \{B_1, B_2, B_3, B_4, B_5, B_6, B_7, B_8, B_9, B_{10}\}$, where
$B_1 = [0, 0.10)$, $B_2 = [0.10, 0.20)$, $B_3 = [0.20, 0.30)$, $B_4 = [0.30, 0.40)$, $B_5 = [0.40, 0.50)$, $B_6 = [0.50, 0.60)$, $B_7 = [0.60, 0.70)$, $B_8 = [0.70, 0.80)$, $B_9 = [0.80, 0.90)$, $B_{10} = [0.90, 1.0]$.
Step 8) Match the similarity value between the point set data and the reference value against the bins of the union of the similarity value ranges to obtain the binning of the point set data.
In this example, according to Definition 11, the data owners A and B match the elements of $Sim(Ps_1, EuGRv)$ and $Sim(Ps_2, EuGRv)$ against the bins in $SimEuB$, obtaining the binnings $PsB_1$ and $PsB_2$ corresponding to the point set data $Ps_1$ and $Ps_2$.
Taking the similarity $Sim(p_1, EuGRv)$ between $p_1$ in $Ps_1$ and the reference value $EuGRv$ as an example, the calculation proceeds as follows:
$Sim(p_1, EuGRv) = (0.51, 0.20)$,
$Sim(p_1\cdot x, EuGRv) = 0.51$ matches $B_6 = [0.50, 0.60)$ in $SimEuB$, that is,
$0.50 \le Sim(p_1\cdot x, EuGRv) < 0.60$; thus $p_1\cdot B_x = B_6$.
$Sim(p_1\cdot y, EuGRv) = 0.20$ matches $B_3 = [0.20, 0.30)$ in $SimEuB$, that is,
$0.20 \le Sim(p_1\cdot y, EuGRv) < 0.30$; thus $p_1\cdot B_y = B_3$.
Similarly, the bins of $p_2$ to $p_8$ in $Ps_1$ are calculated as:
$p_2\cdot B_x = B_1$, $p_2\cdot B_y = B_4$
$p_3\cdot B_x = B_8$, $p_3\cdot B_y = B_8$
$p_4\cdot B_x = B_4$, $p_4\cdot B_y = B_5$
$p_5\cdot B_x = B_{10}$, $p_5\cdot B_y = B_4$
$p_6\cdot B_x = B_2$, $p_6\cdot B_y = B_{10}$
$p_7\cdot B_x = B_6$, $p_7\cdot B_y = B_8$
$p_8\cdot B_x = B_{10}$, $p_8\cdot B_y = B_5$
That is,
$PsB_1 = \{(B_6, B_3), (B_1, B_4), (B_8, B_8), (B_4, B_5), (B_{10}, B_4), (B_2, B_{10}), (B_6, B_8), (B_{10}, B_5)\}$.
Further, the binning of $Ps_2$ is calculated as
$PsB_2 = \{(B_8, B_9), (B_6, B_3), (B_{10}, B_4), (B_4, B_6), (B_5, B_{10}), (B_6, B_3), (B_8, B_4), (B_2, B_{10})\}$.
$PsB_1$ and $PsB_2$ are shown graphically in FIG. 5 and FIG. 6, respectively.
The third stage: matching calculation based on the grouping combination numbers and binning combination numbers of the point set data
Step 9) Exchange the groupings of the point set data of the two parties, obtain the grouping combination by intersection, and from it the grouping combination numbers of the point set data.
In this example, the point set data groupings of the data owners A and B are:
$PsG_1 = \{(G_1, G_7), (G_1, G_6), (G_2, G_3), (G_1, G_4), (G_2, G_6), (G_4, G_4), (G_4, G_7), (G_4, G_7)\}$,
$PsG_2 = \{(G_4, G_7), (G_2, G_7), (G_2, G_6), (G_1, G_2), (G_4, G_6), (G_1, G_7), (G_1, G_4), (G_4, G_4)\}$.
According to Definition 12, the intersection of $PsG_1$ and $PsG_2$ is computed to obtain the corresponding grouping combination
Figure GDA0002825935600000151
where
Figure GDA0002825935600000152
Figure GDA0002825935600000153
That is, $PsGC(PsG_1, PsG_2) = \{(G_1, G_4), (G_1, G_7), (G_2, G_6), (G_4, G_4), (G_4, G_7)\}$. Further, according to Definition 13, $PsG_1$ and $PsG_2$ are matched against $PsGC(PsG_1, PsG_2)$ to obtain the corresponding grouping combination numbers
Figure GDA0002825935600000161
Taking the matching of $(p_1\cdot G_x, p_1\cdot G_y)$ in $PsG_1$ against $PsGC(PsG_1, PsG_2)$ as an example, the calculation proceeds as follows:
Since $(p_1\cdot G_x, p_1\cdot G_y) = (G_1, G_7)$, and in $PsGC(PsG_1, PsG_2)$
Figure GDA0002825935600000166
thus $p_1\cdot GCNo = \{1\}$.
Similarly, the following are calculated:
$p_2\cdot GCNo = \{null\}$, $p_3\cdot GCNo = \{null\}$, $p_4\cdot GCNo = \{0\}$, $p_5\cdot GCNo = \{2\}$, $p_6\cdot GCNo = \{3\}$, $p_7\cdot GCNo = \{4\}$, $p_8\cdot GCNo = \{4\}$.
That is,
Figure GDA0002825935600000162
Further, the calculation gives
Figure GDA0002825935600000163
Step 10) Agree on the binning combination interval parameter to obtain the binning combination, and from it the binning combination numbers of the point set data.
In this example, the data owners A and B agree that the binning combination interval $BI$ is 1 and, according to Definition 14, combine the bins of the binning
$SimEuB = \{B_1, B_2, B_3, B_4, B_5, B_6, B_7, B_8, B_9, B_{10}\}$ of the union of the similarity value ranges, obtaining $SimEuBC = \{BC_1, BC_2, \ldots, BC_{81}\}$.
Taking $BC_1$ and $BC_9$ as examples, the calculation proceeds as follows:
For $BC_1$, since $(i \bmod (n-1)) = (1 \bmod (10-1)) = 1 \ne 0$, it follows that:
Figure GDA0002825935600000164
Figure GDA0002825935600000165
$B_3 = B_{(i \bmod (n-1))} = B_{(1 \bmod (10-1))} = B_1$,
$B_4 = B_{(i \bmod (n-1)) + BI} = B_{(1 \bmod (10-1)) + 1} = B_2$,
that is, $BC_1 = ((B_1, B_2), (B_1, B_2))$.
For $BC_9$, since $(i \bmod (n-1)) = (9 \bmod (10-1)) = 0$:
$B_1 = B_{(i/(n-1))} = B_{(9/(10-1))} = B_1$,
$B_2 = B_{(i/(n-1)) + BI} = B_{(9/(10-1)) + 1} = B_2$,
$B_3 = B_{n-1} = B_{10-1} = B_9$,
$B_4 = B_n = B_{10}$,
that is, $BC_9 = ((B_1, B_2), (B_9, B_{10}))$.
Similarly, $BC_2$ to $BC_8$ and $BC_{10}$ to $BC_{81}$ can be calculated; the final results are shown in Table 1.
Table 1. Binning combinations of the binning $SimEuB$ of the union of the similarity value ranges
Figure GDA0002825935600000171
Figure GDA0002825935600000181
Figure GDA0002825935600000191
Further, according to Definition 15, $PsB_1$ and $PsB_2$ are matched against $SimEuBC$ to obtain the binning combinations of $Ps_1$ and $Ps_2$
Figure GDA0002825935600000192
Taking the matching of $(p_1\cdot B_x, p_1\cdot B_y)$ in $PsB_1$ against $SimEuBC$ as an example, the calculation proceeds as follows:
Since $(p_1\cdot B_x, p_1\cdot B_y) = (B_6, B_3)$,
$BC_{38} = ((B_5, B_6), (B_2, B_3))$ satisfies the condition
$(BC_{38}\cdot B_2 = p_1\cdot B_x) \land (BC_{38}\cdot B_4 = p_1\cdot B_y)$; thus $bcn_1 = 38$.
$BC_{39} = ((B_5, B_6), (B_3, B_4))$ satisfies the condition $(BC_{39}\cdot B_2 = p_1\cdot B_x) \land (BC_{39}\cdot B_3 = p_1\cdot B_y)$;
thus $bcn_2 = 39$.
$BC_{47} = ((B_6, B_7), (B_2, B_3))$ satisfies the condition $(BC_{47}\cdot B_1 = p_1\cdot B_x) \land (BC_{47}\cdot B_4 = p_1\cdot B_y)$;
thus $bcn_3 = 47$.
$BC_{48} = ((B_6, B_7), (B_3, B_4))$ satisfies the condition $(BC_{48}\cdot B_1 = p_1\cdot B_x) \land (BC_{48}\cdot B_3 = p_1\cdot B_y)$;
thus $bcn_4 = 48$.
Thus, $p_1\cdot BC = \{bcn_1, bcn_2, bcn_3, bcn_4\} = \{38, 39, 47, 48\}$.
Similarly, the following are calculated:
$p_2\cdot BC = \{3, 4\}$, $p_3\cdot BC = \{61, 62, 70, 71\}$, $p_4\cdot BC = \{22, 23, 31, 32\}$, $p_5\cdot BC = \{75, 76\}$, $p_6\cdot BC = \{9, 18\}$, $p_7\cdot BC = \{43, 44, 52, 53\}$, $p_8\cdot BC = \{76, 77\}$.
That is,
Figure GDA0002825935600000201
Further, the calculation gives
Figure GDA0002825935600000202
And 11) combining the grouping combination number and the binning combination number of the point set data to obtain the unique matching number of the point set data.
In this example, the packet combination number is combined
Figure GDA0002825935600000203
And box combination
Figure GDA0002825935600000204
From definition 16, point set data Ps is obtained1、Ps2Unique matching number of
Figure GDA0002825935600000205
Ps1P in (1)1、p2For example, a specific calculation process is given:
for p1With a group combination number p1GCNo ═ 1 ≠ null, binning combinations
p1BC ═ 38, 39, 47, 48, and therefore,
p1·mno={mno1,mno2,mno3,mno4and (c) the step of (c) in which,
mno1=p1·GCNo×(10-1)2+p1·BC·bcn1=1×92+38=119,
mno2=p1·GCNo×(10-1)2+p1·BC·bcn2=1×92+39=120,
mno3=p1·GCNo×(10-1)2+p1·BC·bcn3=1×92+47=128,
mno4=p1·GCNo×(10-1)2+p1·BC·bcn4=1×92+48=129。
that is, p1·mno={119,120,128,129}。
For p2Its grouping combination number p1GCNo ═ null, therefore, p2·mno={null}。
Similarly, calculate Ps1P in (1)3~p2The unique matching number is:
p3·GCNo={null},p4·GCNo={22,23,31,32},p5·GCNo={237,23B},p6·GCNo={252,261},p7·GCNo={367,368,376,377}、p8·GCNo={400,401}。
that is to say that the first and second electrodes,
Figure GDA0002825935600000211
further, Ps was calculated2Unique matching number of
Figure GDA0002825935600000212
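The arithmetic of step 11 can be checked with a short Python sketch. The function name `matching_numbers` and the use of `None` and an empty list to stand for "null" are illustrative assumptions; the multiplier (10-1)² follows the worked example, and the grouping combination numbers used for p5 and p6 below are inferred from the listed results rather than stated in this section.

```python
def matching_numbers(gc_no, bin_combos, n_bins=10):
    """Combine a grouping combination number with bin combination numbers.

    mno = GCNo * (n_bins - 1)**2 + bcn, as in the worked example.
    A point whose grouping combination number is null (None here)
    gets no matching numbers; the source writes this as {null}.
    """
    if gc_no is None:
        return []
    base = gc_no * (n_bins - 1) ** 2
    return [base + bcn for bcn in bin_combos]


print(matching_numbers(1, [38, 39, 47, 48]))  # p1: [119, 120, 128, 129]
print(matching_numbers(None, [3, 4]))         # p2: []
```

Because each bin combination number lies between 1 and (10-1)² = 81, different pairs of grouping combination number and bin combination number cannot produce the same matching number.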
Step 12) Exchange the unique matching numbers of the point set data of both parties, and obtain the matching point pairs based on an intersection operation.
In this example, according to Definition 17, the unique matching numbers of point set data Ps1 and Ps2
Figure GDA0002825935600000213
are intersected to obtain the matching point pairs
Figure GDA0002825935600000214
Since
Figure GDA0002825935600000221
psmp1 = (1,6).
Figure GDA0002825935600000222
psmp2 = (5,3).
Figure GDA0002825935600000223
psmp3 = (6,8).
Therefore,
Figure GDA0002825935600000224
That is,
Figure GDA0002825935600000225
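The intersection operation of step 12 can be illustrated with a minimal Python sketch. The function name `matching_point_pairs` is an assumption. The Ps1 matching numbers are those listed in the example, with {null} written as an empty list. The Ps2 matching numbers are not recoverable from this section, so the values below are hypothetical and chosen only so that the sketch reproduces the matching point pairs (1,6), (5,3), (6,8).

```python
def matching_point_pairs(mno_a, mno_b):
    """Pair point i of party A with point j of party B whenever their
    sets of unique matching numbers share at least one value."""
    pairs = []
    for i in sorted(mno_a):
        for j in sorted(mno_b):
            if set(mno_a[i]) & set(mno_b[j]):
                pairs.append((i, j))
    return pairs


# Matching numbers of Ps1, taken from the worked example ({null} -> []).
ps1_mno = {
    1: [119, 120, 128, 129], 2: [], 3: [], 4: [22, 23, 31, 32],
    5: [237, 238], 6: [252, 261], 7: [367, 368, 376, 377], 8: [400, 401],
}
# HYPOTHETICAL matching numbers for three points of Ps2 (not from the source).
ps2_mno = {3: [237, 238], 6: [119, 128], 8: [261, 262]}

print(matching_point_pairs(ps1_mno, ps2_mno))  # [(1, 6), (5, 3), (6, 8)]
```

Only the matching numbers are compared in this step; no coordinates are involved.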
Step 13) According to the matching point pairs, the two parties exchange the corresponding point set data with each other.
In this example, according to the matching point pairs
Figure GDA0002825935600000226
data owner A sends the points p1=(253,3099), p5=(989,2650) and p6=(1580,1988) of Ps1 to data owner B, and data owner B sends the points p6=(249,3103), p3=(992,2657) and p8=(1584,1990) of Ps2 to data owner A. This completes the exchange of the matched movement track data. The matching point pairs
Figure GDA0002825935600000227
and the corresponding point set data are illustrated in Fig. 7.
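The selection of the points to exchange in step 13 can be sketched as follows. The function name `points_to_send` is an assumption. The coordinates of the matched points come from the example above; the coordinates of the unmatched points are not given in this section, so one placeholder entry is included and marked as hypothetical.

```python
def points_to_send(points, pairs, side):
    """Select the points that one party reveals to the other.

    side = 0 picks the first index of each pair (data owner A),
    side = 1 picks the second index (data owner B).
    """
    ids = sorted({pair[side] for pair in pairs})
    return {i: points[i] for i in ids}


pairs = [(1, 6), (5, 3), (6, 8)]  # matching point pairs from step 12

ps1_points = {1: (253, 3099), 5: (989, 2650), 6: (1580, 1988),
              2: (0, 0)}  # point 2: HYPOTHETICAL placeholder, unmatched
ps2_points = {6: (249, 3103), 3: (992, 2657), 8: (1584, 1990)}

print(points_to_send(ps1_points, pairs, 0))  # A -> B: points 1, 5, 6
print(points_to_send(ps2_points, pairs, 1))  # B -> A: points 3, 6, 8
```

Points that appear in no matching point pair, such as the placeholder point 2, are never sent.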

Claims (1)

1. A space point set data privacy protection matching method based on similarity binning, characterized by comprising the following steps:
step 1: exchanging the ranges of the point set data of both parties to obtain the range union;
step 2: specifying a grouping parameter, and obtaining the groups of the range union by the equal-interval method;
step 3: matching the groups of the range union against the coordinates of the point set data to obtain the groups of the point set data;
step 4: obtaining the reference values of the point set data based on the groups of the range union;
step 5: calculating the similarity values between the point set data and the reference values;
step 6: exchanging the similarity values of both parties to obtain the range of the similarity value union;
step 7: specifying a similarity value interval parameter, and obtaining the bins of the range of the similarity value union by the equal-interval method;
step 8: matching the similarity values between the point set data and the reference values against the bins of the range of the similarity value union to obtain the bins of the point set data;
step 9: exchanging the groups of the point set data of both parties, obtaining the grouping combinations based on an intersection operation, and further obtaining the grouping combination numbers of the point set data;
step 10: specifying a bin spacing parameter to obtain the bin combinations, and further obtaining the binning combination numbers of the point set data;
step 11: combining the grouping combination number and the binning combination number of the point set data to obtain the unique matching number of the point set data;
step 12: exchanging the unique matching numbers of the point set data of both parties, and obtaining the matching point pairs based on an intersection operation;
step 13: according to the matching point pairs, the two parties exchanging the corresponding point set data with each other.
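As an illustration of the equal-interval method named in steps 1 to 3 (and reused for binning in steps 6 to 8), the following Python sketch handles one coordinate axis. The function names, the 1-based group indices, the rule that the maximum value falls into the last group, and all numeric values are assumptions for illustration; the definitions that govern these steps lie outside this section.

```python
def union_range(range_a, range_b):
    """Union of two closed ranges (lo, hi) exchanged by the two parties."""
    return (min(range_a[0], range_b[0]), max(range_a[1], range_b[1]))


def equal_interval_index(value, lo, hi, k):
    """1-based index of the equal-width interval of [lo, hi] containing value.

    Assumption: the upper bound hi is assigned to the last interval k.
    """
    width = (hi - lo) / k
    idx = int((value - lo) // width) + 1
    return min(max(idx, 1), k)


# HYPOTHETICAL x ranges of the two parties and a grouping parameter k = 10.
lo, hi = union_range((0, 80), (20, 100))
print(lo, hi)                                # 0 100
print(equal_interval_index(55, lo, hi, 10))  # 6
```

Both parties apply the same range union and the same parameter, so a given coordinate value falls into the same group on either side.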

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010344075.7A CN111460513B (en) 2020-04-27 2020-04-27 Similarity-binning-based space point set data privacy protection matching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010344075.7A CN111460513B (en) 2020-04-27 2020-04-27 Similarity-binning-based space point set data privacy protection matching method

Publications (2)

Publication Number Publication Date
CN111460513A CN111460513A (en) 2020-07-28
CN111460513B true CN111460513B (en) 2021-02-02

Family

ID=71683810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010344075.7A Active CN111460513B (en) 2020-04-27 2020-04-27 Similarity-binning-based space point set data privacy protection matching method

Country Status (1)

Country Link
CN (1) CN111460513B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102572692A (en) * 2012-01-10 2012-07-11 周良文 Spatial data matched network communication system and method
CN103646109B (en) * 2013-12-25 2017-01-25 武汉大学 Spatial data matching method based on machine learning
CN108734022B (en) * 2018-04-03 2021-07-02 安徽师范大学 Privacy protection track data publishing method based on three-dimensional grid division
CN109446164A (en) * 2018-09-25 2019-03-08 广东国地规划科技股份有限公司 The large data sets of space planning are at method, system and device

Also Published As

Publication number Publication date
CN111460513A (en) 2020-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 210009 No. 66, New Model Road, Gulou District, Nanjing City, Jiangsu Province

Applicant after: Nanjing University of Posts and Telecommunications

Address before: 210023 no.30-06 GuangYue Road, Qixia street, Qixia District, Nanjing City, Jiangsu Province

Applicant before: Nanjing University of Posts and Telecommunications

GR01 Patent grant