Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the prior art by providing a secure data matching method and a system therefor.
To achieve the above purpose, the technical solution adopted by the present invention is as follows:
A secure data matching method, which takes place directly between two parties A and B that need to exchange data. A and B each first perform a hash operation on their own numbered original data to be exchanged, and each then generates a key by computation. If the values obtained after the hashed data has passed through two rounds of modular exponentiation crossed between A and B are equal, the original data corresponding to those values can be judged equal; finally, each party retrieves the matched original data by its numbers.
Further, the method comprises the following detailed steps:
Step 1, numbering the original data and performing the hash operation: A and B each number the original data they hold one by one, and then each perform a hash operation, one by one, on their own numbered original data. A and B use the same hash algorithm. The original data is the data to be matched, and the data obtained by the hash operation is the hash value;
Step 2, generating keys: A and B negotiate and select two 1024-bit primes p and q and compute N=p*q. A and B then each independently select a 1024-bit integer, d1 and d2 respectively, such that d1 and d2 are each coprime with (p-1)*(q-1), to serve as keys, and each keeps its own key;
Step 3, crossed operation: A and B each perform the first round of modular exponentiation using the hash values obtained in step 1 and their own key, and send the result to the other party. A and B then each perform the second round of modular exponentiation using the first-round result received from the other party and their own key, and apply a hash operation to the result of the second round. Each of A and B stores the result of its hash operation locally and sends it to the other party;
Step 4, comparison: A and B each compare the local data obtained by the hash operation in step 3 with the data received from the other party; if two values are equal, the original data to be matched corresponding to those values is equal;
Step 5: A and B each retrieve, from their own data sets, the equal original data identified in step 4.
Further, the hash algorithm in step 1 is sha3.
Further, p and q in step 2 are two distinct large primes.
Further, the crossed operation in step 3 is divided into four sub-steps:
Step 5.1: A and B each perform the first round of modular exponentiation: h**d ≡ c1 mod N, where ** denotes exponentiation, h denotes a hash value obtained in step 1, d denotes the key held by A or B respectively, i.e. d1 or d2 from step 2, and N is the value p*q computed in step 2;
Step 5.2: A and B each send the data c1 computed in the first round of modular exponentiation to the other party;
Step 5.3: After A and B have each received the data c1 sent by the other party, each performs the second round of modular exponentiation: c1**d ≡ c2 mod N, where d denotes the key held by A or B respectively, i.e. d1 or d2 from step 2, and N is the value p*q computed in step 2;
Step 5.4: A and B each apply a hash operation to the c2 computed in the second round of modular exponentiation, store the result locally, and send it to the other party.
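The matching in sub-steps 5.1 to 5.4 relies on the two rounds commuting: raising a hash value to d1 and then d2 modulo N gives the same result as raising it to d2 and then d1. The following is a minimal sketch of that property, not the patented implementation. The primes, the keys, the choice of SHA3-256, and the helper name `hash_to_int` are all assumptions made for illustration; the text itself calls for two distinct 1024-bit primes and 1024-bit keys.

```python
import hashlib

# Toy parameters (assumption): two known Mersenne primes stand in for the
# 1024-bit primes p and q required by the text.
p, q = 2**61 - 1, 2**89 - 1
N = p * q
d1, d2 = 19, 29  # hypothetical keys, each coprime with (p-1)*(q-1)

def hash_to_int(data: bytes) -> int:
    """SHA3-256 digest as an integer. The reduction mod N is needed only
    because this toy N is shorter than the 256-bit digest."""
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big") % N

h = hash_to_int(b"user@user.com-13988889999")  # same record held by A and B

c1_a = pow(h, d1, N)      # step 5.1 on A's side
c1_b = pow(h, d2, N)      # step 5.1 on B's side
c2_a = pow(c1_b, d1, N)   # step 5.3: A processes the value received from B
c2_b = pow(c1_a, d2, N)   # step 5.3: B processes the value received from A

same = c2_a == c2_b       # equal inputs give equal second-round results
```

Both second-round values equal h raised to d1*d2 modulo N, which is why neither party needs to learn the other's key.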
A secure data matching system, the data matching system comprising participants A and B that perform data matching, the participant A comprising:
a data input unit, for inputting the data of A to be matched, the data to be matched being database data or a CSV file;
a data numbering unit, for numbering one by one the user data of A stored in the data input unit;
a data hash unit, for performing a hash operation on the numbered user data of A to obtain the hash values of A;
a key agreement unit, for negotiating the large primes shared by A and B and independently selecting an operation key;
a modular exponentiation local unit, for performing the first round of modular exponentiation on the hash values of A and transmitting the resulting data over the network to the modular exponentiation remote unit of the other party;
a modular exponentiation remote unit, for performing the second round of modular exponentiation on the data received from the modular exponentiation local unit of the other party and then applying a hash operation to the result of the second round, A storing the result of the hash operation locally and sending it to the other party;
a result processing unit, for matching the local data against the data received from the modular exponentiation remote unit of the other party to obtain the intersection data;
a data number restoration unit, for looking up, in the party's own user data, the original data corresponding to the numbers carried by the intersection data;
a data output unit, for outputting the original data found by the data number restoration unit. The structure of participant B is identical to that of participant A.
A secure data matching system, characterized in that the data matching system comprises participants A and B that perform data matching, the participant A comprising:
a data input unit, for inputting the data of A to be matched, the data to be matched being database data or a CSV file;
a data numbering unit, for numbering one by one the user data of A stored in the data input unit;
a data hash unit, for performing a hash operation on the numbered user data of A to obtain the hash values of A;
a hardware token, for negotiating the large primes shared by A and B and independently selecting an operation key; then performing the first round of modular exponentiation on the hash values of A, transmitting the resulting data over the network to the hardware token of the other party, and receiving over the network the first-round data of the other party from the hardware token of the other party; and finally performing the second round of modular exponentiation on the data received from the hardware token of the other party and applying a hash operation to the result of the second round, A storing the result of the hash operation locally and sending it to the other party;
a result processing unit, for matching the local data against the data received from the hardware token of the other party to obtain the intersection data;
a data number restoration unit, for looking up, in the party's own user data, the original data corresponding to the numbers carried by the intersection data;
a data output unit, for outputting the original data found by the data number restoration unit. The structure of participant B is identical to that of participant A.
Beneficial effects of the invention: the data matching method of the invention does not require a third party; A and B can complete the data matching directly between themselves using keys that the two parties negotiate jointly. During data matching, the data security of both parties can be ensured. After data matching is complete, each party obtains only the successfully matched intersection data and cannot learn any data outside the intersection. The method is also strongly resistant to special-number attacks.
Embodiment 1: this embodiment is described with reference to Fig. 1. The secure data matching method of the invention takes place directly between participants A and B that need to exchange data, and comprises the following steps:
First, participant A numbers every record in the data set T1 it holds, and participant B numbers every record in the data set T2 it holds;
Let data set T1={t11, t12, t13, …}, where t11, t12, t13, … are the records in T1, and M1={m1 | m1 is the unique number of t, t ∈ T1}; each record t1 in the data set T1 held by A corresponds to a unique number m1;
Let data set T2={t21, t22, t23, …}, where t21, t22, t23, … are the records in the second data set to be matched, and M2={m2 | m2 is the unique number of t, t ∈ T2}; each record t2 in the data set T2 held by B corresponds to a unique number m2.
Then, A performs a hash operation on every numbered record in T1 to obtain data set H1, where H1={h | h=Hash(t), t ∈ T1}; B performs a hash operation on every numbered record in T2 to obtain data set H2, where H2={h | h=Hash(t), t ∈ T2}. The hash algorithm may be sha3.
A and B jointly negotiate to generate keys, as follows: A and B negotiate and determine two 1024-bit large primes p and q, and let N=p*q. A and B each select an integer, d1 and d2 respectively, that is coprime with (p-1)*(q-1); d1 and d2 are also 1024 bits long. A keeps d1 and B keeps d2. The primes p and q are two distinct large primes;
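One way to select such a key is to draw random integers of the required length until one is coprime with (p-1)*(q-1). The sketch below assumes this rejection-sampling approach, which the text does not prescribe; the function name `pick_key` and the toy primes are hypothetical.

```python
import math
import secrets

def pick_key(p: int, q: int, bits: int = 1024) -> int:
    """Draw random `bits`-bit odd integers until one is coprime with (p-1)*(q-1)."""
    r = (p - 1) * (q - 1)
    while True:
        # Force the top bit (exact bit length) and the low bit (odd value).
        d = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if math.gcd(d, r) == 1:
            return d

# Toy primes for illustration only; the embodiment uses two distinct 1024-bit primes.
p, q = 2**61 - 1, 2**89 - 1
d1 = pick_key(p, q, bits=64)  # kept by A
d2 = pick_key(p, q, bits=64)  # kept by B
```

Because each party runs the selection independently, neither learns the other's key.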
Then, two rounds of crossed modular exponentiation are carried out between A and B, as follows: using N and d1, A performs the modular exponentiation h1**d1 ≡ c1 mod N on the data in data set H1 to obtain c1, and transmits the pair (m1, c1) to B, where m1 is the number corresponding to the record in data set T1;
Using N and d2, B performs the modular exponentiation h2**d2 ≡ c2 mod N on the data in data set H2 to obtain c2, and transmits the pair (m2, c2) to A, where m2 is the number corresponding to the record in data set T2;
Step 2.3: A performs the modular exponentiation c2**d1 ≡ f1 mod N on the received pair (m2, c2) to obtain f1, which carries number m2, and applies a hash operation to f1 to obtain g1, which now carries number m2. A stores (m2, g1) and transmits it to B;
B performs the modular exponentiation c1**d2 ≡ f2 mod N on the received pair (m1, c1) to obtain f2, which carries number m1, and applies a hash operation to f2 to obtain g2, which now carries number m1. B stores (m1, g2) and transmits it to A;
At this point A holds both (m2, g1) and (m1, g2). A intersects (m2, g1) and (m1, g2), checking whether any g2 in (m1, g2) is equal to some g1; if so, the record t1 corresponding to the number m1 carried by that g2 is data matched between A and B. B likewise holds both (m2, g1) and (m1, g2). B intersects (m2, g1) and (m1, g2), checking whether any g1 in (m2, g1) is equal to some g2; if so, the record t2 corresponding to the number m2 carried by that g1 is data matched between A and B.
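The whole exchange of this embodiment can be traced on a small example. The following is an illustrative sketch only: both parties are simulated in one process, the primes and keys are toy values, and the helper names `h_int` and `g_hash`, the use of SHA3-256, and the decimal-string serialization before the final hash are assumptions not stated in the text.

```python
import hashlib

P, Q = 2**61 - 1, 2**89 - 1   # toy primes; the text uses distinct 1024-bit primes
N = P * Q
D1, D2 = 19, 29               # hypothetical keys coprime with (P-1)*(Q-1)

def h_int(record: str) -> int:
    # SHA3-256 digest as an integer, reduced mod N because this toy N is short
    return int.from_bytes(hashlib.sha3_256(record.encode()).digest(), "big") % N

def g_hash(value: int) -> str:
    # Final hash of a second-round result (serialization format is an assumption)
    return hashlib.sha3_256(str(value).encode()).hexdigest()

T1 = ["alice", "bob", "carol"]    # data set held by A
T2 = ["dave", "carol", "alice"]   # data set held by B

# Numbering: number -> record, kept privately by each party
M1 = dict(enumerate(T1, start=1))
M2 = dict(enumerate(T2, start=1))

# First round: each party sends (number, c) to the other
c1 = {m: pow(h_int(t), D1, N) for m, t in M1.items()}   # A -> B
c2 = {m: pow(h_int(t), D2, N) for m, t in M2.items()}   # B -> A

# Second round plus hash; results are stored locally and exchanged
g1 = {m2: g_hash(pow(c, D1, N)) for m2, c in c2.items()}  # computed by A, carries B's numbers
g2 = {m1: g_hash(pow(c, D2, N)) for m1, c in c1.items()}  # computed by B, carries A's numbers

# Intersection and lookup by each party's own numbers
common = set(g1.values()) & set(g2.values())
matched_a = sorted(M1[m] for m, g in g2.items() if g in common)
matched_b = sorted(M2[m] for m, g in g1.items() if g in common)
```

In this run both parties recover the records "alice" and "carol", while "bob" and "dave" never leave their owners in any form other than an exponentiated hash.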
The secure data matching system of the invention comprises participants A and B that perform data matching; the structure of participant B is identical to that of participant A. Participants A and B each include a data input unit, a data numbering unit, a data hash unit, a key agreement unit, a modular exponentiation local unit, a modular exponentiation remote unit, a result processing unit, a data number restoration unit, and a data output unit.
The data input unit is used to input the data of A and B to be matched; the data to be matched is database data or a CSV file;
Specifically, data input unit 11 and data input unit 21 read user data from the data sets of participant A and participant B respectively. The input may be a database or a CSV file, and the input user data is converted, according to a given encoding scheme, into individual records.
For example, if the data to be matched consists of a user's email address and mobile phone number, the two are joined by string concatenation: the two original fields user@user.com and 13988889999 are merged into the single record user@user.com-13988889999. Participant A and participant B use the same encoding method.
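The concatenation described above can be sketched as follows. The function name `encode_record` is hypothetical, and the trimming and lower-casing are an added assumption: the text only requires that both participants apply the same encoding, and some normalization of this kind helps identical users produce identical records.

```python
def encode_record(email: str, phone: str) -> str:
    """Join the two fields with '-' after light normalization (an assumption)."""
    return f"{email.strip().lower()}-{phone.strip()}"

record = encode_record("user@user.com", "13988889999")
```

Here `record` is the string user@user.com-13988889999 from the example in the text.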
Through this encoding, the user data of each participant becomes a series of data records t. Suppose the data set that participant A produces from data input unit 11 is T1={t11, t12, t13, …}, and the data set that participant B produces from data input unit 21 is T2={t21, t22, t23, …}.
The data numbering unit is used by A and B to number, one by one, the user data stored in the data input unit;
Specifically, a record t11 is taken from the data set T1 of participant A and assigned a number, say m11. The number m11 may be obtained by any coding method and uniquely identifies the position of t11 in T1; the number itself must not reveal the content of t11. The simplest numbering method follows the order of data input and uses an increasing integer sequence: the first record is numbered 1, the second record is numbered 2, and so on. In this way, after the records in the data input unit have been processed one by one by the data numbering unit, the data set becomes a set of (number, record) pairs. Participant A and participant B number their data separately and independently.
Suppose the data set that participant A obtains through data numbering unit 12 is M1={(m11, t11), (m12, t12), (m13, t13), …}, where m denotes a number and t denotes a record, and the data set that participant B obtains through data numbering unit 22 is M2={(m21, t21), (m22, t22), (m23, t23), …}, where m denotes a number and t denotes a record.
The data hash unit is used to perform a hash operation on the numbered user data of A to obtain the hash values of A;
Suppose the result of the hash operation sha3(t11) on record t11 of participant A's data set M1 is h11, so that h11=sha3(t11), h12=sha3(t12), and so on; the (number, record) pairs then become (m11, h11), (m12, h12), …. The data set that participant A obtains from data hash unit 13 is therefore H1={(m11, h11), (m12, h12), (m13, h13), …}, and the data set that participant B obtains from data hash unit 23 is H2={(m21, h21), (m22, h22), (m23, h23), …}.
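The numbering and hashing units can be sketched together, since the second consumes the output of the first. This is a minimal illustration under stated assumptions: the function names are hypothetical, sequential numbering from 1 is the simplest scheme the text mentions, and SHA3-256 is one member of the sha3 family chosen here because the text does not name a digest length.

```python
import hashlib

def number_records(records):
    """Pair each record with a sequential number starting at 1: (m, t)."""
    return list(enumerate(records, start=1))

def hash_numbered(numbered):
    """Replace each record by its SHA3-256 hex digest, keeping the number: (m, h)."""
    return [(m, hashlib.sha3_256(t.encode("utf-8")).hexdigest()) for m, t in numbered]

T1 = ["user@user.com-13988889999", "other@user.com-13700001111"]
M1 = number_records(T1)   # [(1, t11), (2, t12)]
H1 = hash_numbered(M1)    # [(1, h11), (2, h12)]
```

The number travels with the hash value through the later stages, while the record itself stays with its owner.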
The key agreement unit is used to negotiate the large primes shared by A and B, with each party independently deriving its own key;
Specifically, suppose participant A and participant B each choose an English phrase, say A1 and A2, and send them to each other over the network. After receiving the other party's phrase, each concatenates the two to obtain A1A2, and this concatenated phrase serves as the seed of the generation algorithm. Because participant A and participant B use the same algorithm, the same phrase yields the same primes p and q. In this embodiment p and q are 1024-bit primes. After both parties have negotiated p and q, participant A computes the large number N=p*q and the large number r=(p-1)*(q-1); these two numbers are used to produce the key used later. Participant A selects a 1024-bit integer d1 such that d1 and r are coprime.
Participant B selects d2 by the same algorithm. As a result, each party has selected an integer, d1 and d2 respectively (both 1024 bits), such that d1 and d2 are each coprime with r, and each party keeps its own key and the large integer N secret.
The modular exponentiation local unit is used to perform the first round of modular exponentiation on the hash values of A and to transmit the resulting data over the network to the modular exponentiation remote unit of the other party;
Modular exponentiation local unit 15 takes the pair (m11, h11) from data set H1={(m11, h11), (m12, h12), (m13, h13), …} and computes h11**d1 ≡ c11 mod N on h11, giving c11. The pairs in H1 are processed in turn to obtain c12, c13, …, so the data set that participant A produces from the modular exponentiation local unit is L1={(m11, c11), (m12, c12), (m13, c13), …}. The data set L1 is then sent over network path S2 to modular exponentiation remote unit 26 of participant B, to await the next operation.
By the same process, modular exponentiation local unit 25 of participant B performs modular exponentiation in turn on the data in data set H2={(m21, h21), (m22, h22), (m23, h23), …} to obtain the new data set L2={(m21, c21), (m22, c22), (m23, c23), …}. The data set L2 is then sent over network path S3 to modular exponentiation remote unit 16 of participant A, to await the next operation.
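The first-round computation performed by a modular exponentiation local unit can be sketched as a mapping from (m, h) pairs to (m, c) pairs. The function name `first_round`, the hex-string representation of hash values, and the toy modulus and key are assumptions for illustration.

```python
import hashlib

def first_round(hashed_pairs, d, n):
    """Map (m, h) pairs to (m, c) with c = h**d mod n; h is a hex digest string."""
    return [(m, pow(int(h, 16) % n, d, n)) for m, h in hashed_pairs]

# Toy modulus and key (assumptions); the embodiment uses a 2048-bit N and 1024-bit d1.
N = (2**61 - 1) * (2**89 - 1)
d1 = 19

H1 = [(m, hashlib.sha3_256(t.encode()).hexdigest())
      for m, t in enumerate(["t11", "t12", "t13"], start=1)]
L1 = first_round(H1, d1, N)   # what A would send to B over path S2
```

Only the numbers and the exponentiated values leave the participant; the hash values and the key stay local.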
The modular exponentiation remote unit is used to perform the second round of modular exponentiation on the data received from the modular exponentiation local unit of the other party and then to apply a hash operation to the result of the second round; A stores the result of the hash operation locally and sends it to the other party;
Modular exponentiation remote unit 16 takes the pair (m21, c21) from L2 and computes c21**d1 ≡ f21 mod N on c21, giving f21. A hash operation is then applied to f21; using the sha3 hash function, this gives g21=sha3(f21). The data in L2 is processed in turn, so modular exponentiation remote unit 16 of participant A obtains the new data set G2={(m21, g21), (m22, g22), (m23, g23), …}. The data set G2 is then sent over network path S4 to result processing unit 27 of participant B, and a copy is also saved over local path S6.
By the same process, modular exponentiation remote unit 26 of participant B performs modular exponentiation and the sha3 hash operation in turn on the elements of data set L1={(m11, c11), (m12, c12), (m13, c13), …} to obtain the new data set G1={(m11, g11), (m12, g12), (m13, g13), …}, which is then sent over network path S5 to result processing unit 17 of participant A, and a copy is also saved over local path S7.
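The remote unit's work can be sketched as a second mapping, from received (m, c) pairs to (m, g) pairs. The text does not say how the integer f is serialized before hashing; the fixed-width big-endian encoding below is an assumption, chosen so that both participants hash byte-identical input. The function name, modulus, and keys are likewise illustrative.

```python
import hashlib

def second_round(received_pairs, d, n):
    """Map (m, c) pairs from the other party to (m, g), g = sha3(c**d mod n)."""
    width = (n.bit_length() + 7) // 8
    out = []
    for m, c in received_pairs:
        f = pow(c, d, n)
        # Fixed-width big-endian bytes (assumption) so both sides hash the same input
        out.append((m, hashlib.sha3_256(f.to_bytes(width, "big")).hexdigest()))
    return out

N = (2**61 - 1) * (2**89 - 1)   # toy modulus (assumption)
d1, d2 = 19, 29                 # hypothetical keys of A and B
h = 123456789                   # stand-in for a hash value common to both parties

L1 = [(1, pow(h, d1, N))]       # first-round output of A, numbered by A
L2 = [(7, pow(h, d2, N))]       # first-round output of B, numbered by B

G2 = second_round(L2, d1, N)    # computed by remote unit 16 of A
G1 = second_round(L1, d2, N)    # computed by remote unit 26 of B
```

The two g values agree because the record is common, while each pair keeps the number assigned by the party that owns the record.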
The result processing unit is used to match the local data against the data received from the modular exponentiation remote unit of the other party to obtain the intersection data;
Result processing unit 17 of participant A uses the data set G1={(m11, g11), (m12, g12), (m13, g13), …} received over network path S5 and the data set G2={(m21, g21), (m22, g22), (m23, g23), …} stored over local path S6, where m denotes a number and g denotes the data. Result processing unit 27 of participant B uses the data set G2={(m21, g21), (m22, g22), (m23, g23), …} received over network path S4 and the data set G1={(m11, g11), (m12, g12), (m13, g13), …} stored over local path S7. Participant A and participant B now hold identical data sets G1 and G2.
Result processing unit 17 of participant A extracts the data components of G1 to obtain G1'={g11, g12, g13, …}, and likewise extracts the data components of G2 to obtain G2'={g21, g22, g23, …}. It then intersects G1' and G2' to obtain a new data set G={g1, g2, g3, …}; the data in this set appears in both G1' and G2', that is, it is their intersection.
By the same process, result processing unit 27 of participant B operates on G1 and G2 and obtains the same data set G={g1, g2, g3, …}.
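The extraction and intersection performed by the result processing unit map directly onto set operations. The short strings below are placeholder values standing in for sha3 digests.

```python
# Placeholder (number, g) pairs; real g values would be sha3 digests.
G1 = [(1, "aa"), (2, "bb"), (3, "cc")]   # pairs carrying A's numbers
G2 = [(1, "dd"), (2, "cc"), (3, "aa")]   # pairs carrying B's numbers

G1_values = {g for _, g in G1}   # data components of G1
G2_values = {g for _, g in G2}   # data components of G2
G = G1_values & G2_values        # values present in both sets
```

Both participants hold the same G1 and G2, so each computes the same G.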
The data number restoration unit is used to look up, in the party's own user data, the original data corresponding to the numbers carried by the intersection data;
After data number restoration unit 18 obtains the data intersection G={g1, g2, g3, …}, it can find in data set G1 the data numbers m1, m2, m3, … corresponding to g1, g2, g3, …. Because the numbers in G1 were produced by data numbering unit 12, these numbers can be used to look up, in the data set M1={(m11, t11), (m12, t12), (m13, t13), …} from the second step, the originally encoded data set T={t1, t2, t3, …}.
By the same process, after data number restoration unit 28 obtains the data intersection G, it can obtain the same originally encoded data set T={t1, t2, t3, …}.
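Number restoration amounts to two lookups: from matched g values to the numbers they carry, and from those numbers to the party's own records. The sketch below shows participant A's side with placeholder values; the function name `restore` is hypothetical.

```python
def restore(common, tagged_pairs, numbered_records):
    """Return own records whose number is attached to a g value in the intersection."""
    by_number = dict(numbered_records)          # m -> t
    return [by_number[m] for m, g in tagged_pairs if g in common]

G = {"aa", "cc"}                                # intersection from the result processing unit
G1 = [(1, "aa"), (2, "bb"), (3, "cc")]          # pairs carrying A's own numbers
M1 = [(1, "t11"), (2, "t12"), (3, "t13")]       # A's numbered records
T = restore(G, G1, M1)
```

Participant B would run the same function with G2 and M2, since G2 carries the numbers that B assigned.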
The data output unit is used to output the original data found by the data number restoration unit.
After obtaining the data intersection T={t1, t2, t3, …}, data output unit 19 and data output unit 29 reverse the encoding applied by data input units 11 and 21 to turn the elements of T back into the original matched user data for output; the final result can be exported to a database or a file.
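For the email and phone encoding used as the running example, the reverse operation is a split on the separator. The sketch below assumes the record was built as email, then "-", then phone, and that the phone number contains no "-"; it splits on the last separator because an email address may itself contain one. The function name `decode_record` is hypothetical.

```python
def decode_record(record: str):
    """Split an 'email-phone' record back into its two fields."""
    # Split on the last '-' only: the email part may contain '-', the phone part may not.
    email, phone = record.rsplit("-", 1)
    return email, phone

email, phone = decode_record("user@user.com-13988889999")
```

The recovered fields can then be written to the output database or file.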