CN112785477B

CN112785477B - Block chain-based data leakage tracing method capable of resisting multi-user collusion

Info

Publication number: CN112785477B
Application number: CN202110028463.9A
Authority: CN
Inventors: 张迎周; 邸云龙; 朱林林; 汪天琦; 李鼎文; 葛丽丽
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2021-01-11
Filing date: 2021-01-11
Publication date: 2023-02-14
Anticipated expiration: 2041-01-11
Also published as: CN112785477A

Abstract

The invention discloses a blockchain-based data leakage tracing method that resists multi-user collusion. The method comprises: initializing the blockchain; improving the classic collusion-resistant I code; embedding the allocated improved I code into the data as a watermark; extracting the residual code from leaked data; judging the type of attack the data has suffered from the common features and user features extracted from the leaked data; mapping the judgment result to a large prime number; and placing the data features and the corresponding large prime number into a mining pool so that miners trace the source on the blockchain. Building on the improved I code, the invention provides a new embedding method that embeds a code carrying the user feature, and the classic I code is improved so that it resists the "OR" collusion attack. When data is leaked, a smart contract extracts the residual features from the leaked data and tracing is carried out on the blockchain, which addresses user-server collusion.

Description

Block chain-based data leakage tracing method capable of resisting multi-user collusion

Technical Field

The invention relates to the technical field of digital watermarking algorithms, in particular to a data leakage tracing method capable of resisting multi-user collusion based on a block chain.

Background

With the rapid development of information technology, digital products are ever more widespread and information is ever easier to transmit. At the same time, distributors place increasing weight on copyright protection: once a digital product is leaked, the distributor can hardly prevent it from being copied and illegally redistributed. Because illegal copying and distribution are hard to prevent after a leak, distributors instead rely on penalties and legal action against the users who leak data. Information technology is widely used in the power industry, and the demand for data sharing between power services is large; once data is leaked, the source of the leak is difficult to trace.

Digital fingerprinting, proposed in recent years, addresses this problem well. The idea is as follows: the distributor embeds a digital fingerprint assigned to each legitimate user into the product, and once the digital product is illegally copied, the distributor can trace the legitimate user who leaked it.

Many challenges remain: the traditional client-server model cannot handle user-server collusion, multi-user collusion is difficult to track and trace, and the type of attack applied to leaked data cannot be determined. These problems need to be solved urgently.

Disclosure of Invention

Purpose of the invention: the invention provides a blockchain-based tracing method for data leaked through multi-user collusion. In a blockchain environment, it uses an improved collusion-resistant I code, a user-feature watermark embedding method, and a method that maps features to prime numbers to solve the problems of user-server collusion and user-user collusion in the field of digital watermarking. It also handles user-user collusion among an unrestricted number of colluders, determines the type of attack applied to the leaked data, and improves the tracing speed on the blockchain.

Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:

A blockchain-based data leakage tracing method resistant to multi-user collusion comprises the following steps:

S1, initializing the blockchain; dividing all blockchain nodes into super nodes and common nodes; the super nodes store the basic information of the data uploaded by users, and the common nodes store the users' transaction information; specifically, the variables are set as follows:

$\{T_1, T_2, \ldots, T_n\}$ are the super nodes that store uploaded data, $\{t_1, t_2, \ldots, t_m\}$ are the common nodes that store ordinary transactions, the data stored in the super nodes are $\{D_1, D_2, \ldots, D_n\}$, the attributes of the data to be distributed are $\{A_1, A_2, \ldots, A_k\}$, the maximum distortion allowed for the data is $\varepsilon$, the maximum number of users allowed to download is $J$, the current number of users who have downloaded is $j$, and the user set is $\{u_1, u_2, \ldots, u_j\}$;

S2, when a user downloads data, allocating an improved collusion-resistant I code to the user;

S3, embedding the allocated improved collusion-resistant I code into the data as a watermark, and distributing the data to the user;

S4, extracting the residual code from the leaked data, judging the type of attack on the data from the common features and user features extracted from the leaked data, and mapping the judgment result to a large prime number;

S5, placing the data features and the corresponding large prime number into a mining pool, where miners trace the source on the blockchain.

Further, during initialization of the blockchain nodes in step S1, selectable parameters are offered to the user according to the data the user uploads; the selectable parameters include the minimum-distortion position, the watermark robustness, and the maximum number of users allowed to download. The watermark embedding density is determined by the number of downloads the user chooses to accommodate: the more downloads accommodated, the higher the watermark density. The code [1, …, 1] is embedded into the stored data as the common-feature code bits, and the group numbers of the embedding positions are $g_i$. The super node generates a random large prime number $a$ and stores the large prime number and the basic features of the data; the basic features include the attribute columns, the attribute types, and the numerical features of the attributes.

Further, the improved collusion-resistant I code is allocated in step S2 as follows:

Step S2.1, the user selects parameters according to the recommendation; the parameter types include the minimum-distortion position, the watermark robustness, and the maximum number of users allowed to download. The maximum distortion $\varepsilon$ and the robustness are tied to the maximum number $J$ of users allowed to download: the larger the distortion and the weaker the robustness, the more users may download; the smaller the distortion and the stronger the robustness, the fewer users may download;

Step S2.2, an improved collusion-resistant I code is allocated to the current user: the number $j$ of users who have downloaded the current data is read from the super node, and the improved I code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$ is allocated; the diagonal element of the code matrix is taken as the feature symbol that represents the user, i.e. the user feature.
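The allocation in step S2.2 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `improved_i_code` and the 1-based `user_index` are assumptions, and the placement of the symbol `user_index + 1` at position `user_index` follows the embodiment's example code [1, 3, 1, 1] for the second user.

```python
def improved_i_code(user_index: int, max_users: int) -> list[int]:
    """Return a length-J code of ones with one diagonal user symbol.

    Assumption: the user with 1-based index `user_index` receives the
    symbol `user_index + 1` at position `user_index`; every other
    position carries the common symbol 1.
    """
    if not 1 <= user_index <= max_users:
        raise ValueError("user_index must lie in 1..max_users")
    code = [1] * max_users
    code[user_index - 1] = user_index + 1  # diagonal element = user feature
    return code


if __name__ == "__main__":
    # With J = 4, the second user receives [1, 3, 1, 1].
    print(improved_i_code(2, 4))
```

Because each user's symbol sits on a different diagonal position and differs from 1, a surviving symbol identifies both the position and the user.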

Further, the data watermark is embedded in step S3 as follows:

Step S3.1, hash grouping is performed on the data: the smallest integer $N$ whose range contains $J$ large prime numbers is found, and the attributes of the data are $\{A_1, A_2, \ldots, A_k\}$. $\mathrm{group}_i$ is the group number of a tuple, so that different tuples fall into the same group, and watermark embedding is performed on the groups whose group number is prime; $\mathrm{hash}_i$ is the truncated hash value of the tuple, where the number of digits kept is the number of attributes $k$;

Step S3.2, the improved collusion-resistant I code allocated to the user is embedded into the data:

In the improved collusion-resistant I code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$ allocated to the user, each code bit corresponds to a hash group $\mathrm{hash}_i$. When the code bit is 1, the operation is:

$A_k = A_k \,\|\, \mathrm{hash}_i[k]$

When the code bit is not 1, the code bit on the diagonal is embedded into the $k$ attributes as follows:

$A_k = (A_k + \mathrm{hash}_i[-k:]) \bmod 10^k$

Step S3.3, the user's transaction information is stored in a common node; for user $u_j$ with collusion-resistant code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$, the large prime number $a \cdot \mathrm{group}_i$ is stored in the transaction information.

Further, the common features and user features in the leaked data are extracted in step S4 as follows:

Step S4.1, depending on how the watermark was embedded, extraction of the watermark bits falls into three cases, based on the following formulas:

$\mathrm{group}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon]) \bmod N$

$\mathrm{hash}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon])[-k:]$

(1) When $\mathrm{hash}_i = \mathrm{LSB}_{A_1} \,\|\, \mathrm{LSB}_{A_2} \,\|\, \ldots \,\|\, \mathrm{LSB}_{A_k}$, the extracted feature is a common feature, and position $\mathrm{group}_i$ of the extracted collusion-resistant code is set to 1;

(2) When $\mathrm{hash}_i[-k:] = (\mathrm{hash}_i[-k:] - \mathrm{group}_i + 10^k) \bmod 10^k$, the extracted feature is a user feature, and position $\mathrm{group}_i$ of the extracted collusion-resistant code is set to $\mathrm{hash}_i[k]$;

(3) When the extracted watermark satisfies neither (1) nor (2), no improved collusion-resistant I code is embedded in the data;

Step S4.2, whether the data has suffered a collusion attack or a common attack is judged from the common features and user features extracted from the leaked data, as follows:

(1) When exactly one user feature appears in the extracted collusion-resistant code and the common features are complete, it is judged that neither a collusion attack nor any other common attack has occurred;

(2) When exactly one user feature appears in the extracted collusion-resistant code and a common-feature position conflicts with the user-feature position, it is judged that a common subset-addition attack has occurred;

(3) When exactly one user feature appears in the extracted collusion-resistant code and the common-feature positions are incomplete, it is judged that a common subset-deletion attack has occurred;

(4) When several user features appear in the extracted collusion-resistant code and the common features are complete, it is judged that an OR-collusion attack has occurred;

(5) When no user feature appears in the extracted collusion-resistant code and the common features are incomplete, it is judged that an AND-collusion attack has occurred;
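The five-way decision of step S4.2 can be sketched as a small classifier. This is an illustrative sketch under stated assumptions: the representation of the extracted code as one set of symbols per position, the function name `classify_attack`, and the returned labels are all hypothetical, and the labels for cases (4) and (5) follow one reading of the text (several surviving user symbols versus erased symbols).

```python
def classify_attack(extracted: list[set[int]]) -> tuple[int, str]:
    """Classify an extracted code into the cases (1)-(5) of step S4.2.

    Assumed representation: each position holds the set of symbols
    recovered there. The symbol 1 is the common feature, any symbol
    greater than 1 is a user feature, and an empty set means nothing
    was recovered at that position.
    """
    user_symbols = {s for pos in extracted for s in pos if s > 1}
    user_positions = [pos for pos in extracted if any(s > 1 for s in pos)]
    common_positions = [pos for pos in extracted if not any(s > 1 for s in pos)]
    common_complete = all(1 in pos for pos in common_positions)
    conflict = any(1 in pos for pos in user_positions)

    if len(user_symbols) == 1:
        if conflict:
            return 2, "subset addition"
        if not common_complete:
            return 3, "subset deletion"
        return 1, "no attack"
    if len(user_symbols) > 1 and common_complete:
        return 4, "collusion with several user features"
    if not user_symbols and not common_complete:
        return 5, "collusion with no user feature"
    return 0, "undetermined"


if __name__ == "__main__":
    print(classify_attack([{2}, {1}, {1}, {1}]))      # case (1)
    print(classify_attack([set(), set(), {1}, {1}]))  # case (5)
```

Inputs that match none of the five listed cases return case 0, since the text does not say how such codes should be treated.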

Step S4.3, according to the attack judged from the common features and user features extracted from the leaked data, the result is mapped to a large prime number, which is then extracted;

According to the judgment result of step S4.2: when the result is any of (1), (2) or (3), the $\mathrm{group}_i$ corresponding to the user feature is taken as the mapped large prime number; when the result is (4), the $\mathrm{group}_i$ values corresponding to the several extracted user features are each taken as a mapped large prime number; when the result is (5), the product of the $\mathrm{group}_i$ values corresponding to the common features is taken as the large prime number;

The extracted large prime number is multiplied by the random large prime number $a$ stored in the super node, and the result is placed into the mining pool for miners to trace the source.

Further, tracing by mining in step S5 proceeds as follows:

Step S5.1, the basic features of the data and the large prime number are placed into the mining pool; miners mine, and locate the super node from the features of the data;

Step S5.2, when a user feature has been extracted, the large prime number is divided directly by the large prime number stored in the transaction; if it divides exactly, tracing succeeds;

Step S5.3, when no user feature has been extracted, the large prime number is divided by the large prime number stored in the super node, and the result is then divided by the large prime number stored in the common node's transaction; if it divides exactly, tracing succeeds.
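The divisibility checks of steps S5.2 and S5.3 can be sketched as below. This is a hedged illustration, not the patented procedure: the function names, the dictionary layouts, and the use of each user's group prime in the second check are assumptions chosen to agree with the embodiment's arithmetic, where a = 53 and the groups are 3, 5, 7 and 11.

```python
def trace_with_user_feature(mined: int, transactions: dict[str, int]) -> list[str]:
    """S5.2 sketch: a user is traced when the value stored in that
    user's transaction divides the mined number exactly."""
    return [user for user, stored in transactions.items() if mined % stored == 0]


def trace_without_user_feature(
    mined: int, super_product: int, user_primes: dict[str, int]
) -> list[str]:
    """S5.3 sketch: divide the super node's product by the mined number,
    then test the quotient against each user's group prime.

    Assumption: `super_product` is a multiplied by all group primes and
    `user_primes` maps each user to the group prime of that user's feature.
    """
    if super_product % mined != 0:
        return []
    missing = super_product // mined
    return [user for user, prime in user_primes.items() if missing % prime == 0]


if __name__ == "__main__":
    a = 53
    groups = {"u1": 3, "u2": 5, "u3": 7, "u4": 11}
    transactions = {user: a * g for user, g in groups.items()}
    print(trace_with_user_feature(a * 3, transactions))
    print(trace_without_user_feature(a * 7 * 11, a * 3 * 5 * 7 * 11, groups))
```

Because the group numbers are distinct primes, an exact division can only succeed for the users whose group actually contributed to the number, which is why no search over user combinations is needed.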

Beneficial effects: the method has the following advantages:

The invention solves the user-server collusion problem of the traditional client-server model. It keeps the strengths of the collusion-resistant I code while improving it further, and the improved I code carrying a user feature solves the difficulty of tracing multi-user collusion. The method can accurately trace leaked data no matter how many of the users who downloaded the data collude. From the user features extracted from the leaked data, it can also accurately determine the type of attack the leaked data has suffered (subset addition, subset deletion, AND collusion, OR collusion). In addition, it provides an efficient tracing method that maps the features of the data watermark to prime numbers, which greatly improves the tracing speed on the blockchain.

Drawings

Fig. 1 is a flow chart of a data leakage tracing method based on block chains and capable of resisting multi-user collusion provided by the invention.

Detailed Description

The present invention will be further described with reference to the accompanying drawings.

As shown in Fig. 1, a blockchain-based data leakage tracing method resistant to multi-user collusion includes the following steps:

S1, initializing the blockchain; dividing all blockchain nodes into super nodes and common nodes. The super node stores the basic information of the data uploaded by the user (the basic features of the data, the maximum distortion $\varepsilon$ allowed for the data, the maximum number $J$ of users allowed to download, the current number $j$ of users who have downloaded, the random large prime number $a = 53$, and the large prime number, where $g_i$ is a group number in which a watermark is embedded). The common node stores the user's transaction information (the addresses of both parties to the transaction and a large prime number, where $g_i$ is the group number in which the watermark is embedded for that user). Specifically, the variables are set as follows:

$\{T_1, T_2, \ldots, T_n\}$ are the super nodes that store uploaded data, $\{t_1, t_2, \ldots, t_m\}$ are the common nodes that store ordinary transactions, the data stored in the super nodes are $\{D_1, D_2, \ldots, D_n\}$, the attributes of the data to be distributed are $\{A_1, A_2, \ldots, A_k\}$, the maximum distortion allowed for the data is $\varepsilon$, the maximum number of users allowed to download is $J$, the current number of users who have downloaded is $j$, and the user set is $\{u_1, u_2, \ldots, u_j\}$.

Selectable parameters are offered to the user according to the data the user uploads; they include the minimum-distortion position, the watermark robustness, and the maximum number of users allowed to download. The watermark embedding density is determined by the number of downloads the user chooses to accommodate: the more downloads accommodated, the higher the watermark density. The code [1, …, 1] is embedded into the stored data as the common code bits, and the group numbers of the embedding positions are $g_i$. The super node generates a random large prime number $a$ and stores the large prime number and the basic features of the data (attribute columns, attribute types, numerical features of the attributes).

S2, when the user downloads the data, an improved collusion-resistant I code is allocated to the user.

Step S2.1, the user selects parameters according to the recommendation; the parameter types include the minimum-distortion position, the watermark robustness, and the maximum number of users allowed to download. In this embodiment, the maximum number of users allowed to download is 4. The improved collusion-resistant I code is determined as follows:

The maximum distortion $\varepsilon$ and the robustness are tied to the maximum number $J$ of users allowed to download: the larger the distortion and the weaker the robustness, the more users may download; the smaller the distortion and the stronger the robustness, the fewer users may download.

Step S2.2, an improved collusion-resistant I code is allocated to the current user $u_2$. The number $j = 1$ of users who have downloaded the current data is read from the super node, and the improved collusion-resistant code $I_2 = [1, 3, 1, 1]$ is allocated. Unlike the conventional collusion-resistant coding matrix, the diagonal element 3 of the matrix serves as the user feature of $u_2$ and can represent that user.

S3, the allocated improved collusion-resistant I code is embedded into the data as a watermark, and the data is distributed to the user.

The specific algorithm for embedding the data watermark is shown in Table 1 below:

Table 1 Watermark embedding algorithm

Step S3.1, hash grouping is performed on the data: the smallest integer $N$ whose range contains $J$ large prime numbers is found, and the attributes of the data are $\{A_1, A_2, \ldots, A_k\}$. $\mathrm{group}_i$ is the group number of a tuple, so that different tuples fall into the same group, and watermark embedding is performed on the groups whose group number is prime; $\mathrm{hash}_i$ is the truncated hash value of the tuple, where the number of digits kept is the number of attributes $k$.

In this embodiment, the smallest integer $N = 12$ whose range contains $J = 4$ large prime numbers is found, and the attributes of the data are $\{A_1, A_2, \ldots, A_k\}$. Watermark embedding is performed on the tuples whose group number is in $\{3, 5, 7, 11\}$, and the number of hash digits kept is $k = 4$. For example, the first tuple is grouped as:

$\mathrm{group}_i = \mathrm{Hash}(0.45 \,\|\, 0.36 \,\|\, 0.09 \,\|\, 0.51) \bmod 12 = 3$

$\mathrm{hash}_i = \mathrm{Hash}(0.45 \,\|\, 0.36 \,\|\, 0.09 \,\|\, 0.51)[-4:] = 1893$
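The choice of N and the grouping of a tuple can be sketched as follows. This is a minimal sketch under assumptions: the patent does not name the hash function, so SHA-256 is used here only as a stand-in (the values 3 and 1893 above will not be reproduced), the prime 2 is skipped so that J = 4 yields the groups {3, 5, 7, 11} as in the embodiment, and the names `group_modulus` and `group_of` are hypothetical.

```python
import hashlib


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small group numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def group_modulus(num_users: int, min_prime: int = 3) -> tuple[int, list[int]]:
    """Return the smallest N such that 0..N-1 holds `num_users` primes
    that are at least `min_prime`, together with those primes."""
    primes: list[int] = []
    n = min_prime
    while len(primes) < num_users:
        if is_prime(n):
            primes.append(n)
        n += 1
    return primes[-1] + 1, primes


def group_of(attribute_prefixes: list[str], modulus: int) -> int:
    """Group number of a tuple: hash of the joined attribute prefixes mod N.
    SHA-256 is an assumption; any stable hash would serve the sketch."""
    digest = hashlib.sha256("||".join(attribute_prefixes).encode()).hexdigest()
    return int(digest, 16) % modulus


if __name__ == "__main__":
    N, primes = group_modulus(4)
    print(N, primes)
    print(group_of(["0.45", "0.36", "0.09", "0.51"], N))
```

Only the attribute prefixes that survive the allowed distortion are hashed, so the group number of a tuple can be recomputed from the watermarked data at extraction time.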

Step S3.2, the improved collusion-resistant I code allocated to the user is embedded into the data:

In the improved collusion-resistant I code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$ allocated to the user, each code bit corresponds to a hash group $\mathrm{hash}_i$. When the code bit is 1, the operation is:

$A_k = A_k \,\|\, \mathrm{hash}_i[k]$

When the code bit is not 1, the code bit on the diagonal is embedded into the $k$ attributes as follows:

$A_k = (A_k + \mathrm{hash}_i[-k:]) \bmod 10^k$

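In this embodiment, 1895 is embedded into all tuples of $\mathrm{group}_3$, e.g. $A_1$ (0.455) -> $A_1$ (0.4551):

| hash_i | group_i | A_1 | A_2 | A_3 | A_4 |
| --- | --- | --- | --- | --- | --- |
| 1893 | 3 | 0.4551 | 0.3658 | 0.0959 | 0.5145 |

For the other three groups, $\mathrm{group}_5$, $\mathrm{group}_7$ and $\mathrm{group}_{11}$, the corresponding hash is embedded directly according to the above method, e.g. 8727 for $\mathrm{group}_5$, which gives:

| hash_i | group_i | A_1 | A_2 | A_3 | A_4 |
| --- | --- | --- | --- | --- | --- |
| 8727 | 5 | 0.448 | 0.3657 | 0.1252 | 0.5167 |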
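The digit embedding in this example can be reproduced with a short sketch. It rests on one reading of the numbers, labeled here as an assumption: for a common code bit the k hash digits are appended unchanged, one per attribute, while for the user's group the code symbol (2 in this example) is first added to the hash, so 1893 becomes 1895. The function name `embed_digits` and the use of decimal strings are also assumptions.

```python
def embed_digits(values: list[str], hash_tail: int, code_symbol: int = 1) -> list[str]:
    """Append one hash digit to each attribute value of a tuple.

    Assumption: a common code bit (1) embeds the hash digits as they are;
    a user symbol is added to the hash before embedding, modulo 10**k.
    Values are decimal strings so that no float rounding occurs.
    """
    k = len(values)
    offset = 0 if code_symbol == 1 else code_symbol
    digits = str((hash_tail + offset) % 10**k).zfill(k)
    return [value + digit for value, digit in zip(values, digits)]


if __name__ == "__main__":
    # User-feature group: 1893 + 2 = 1895 is spread over four attributes.
    print(embed_digits(["0.455", "0.365", "0.095", "0.514"], 1893, code_symbol=2))
    # Common-feature group: 8727 is embedded unchanged.
    print(embed_digits(["0.44", "0.365", "0.125", "0.516"], 8727))
```

Appending a digit changes each attribute only in its least significant position, which keeps the distortion within the bound the uploader selected.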

In this embodiment, for user $u_2$ with collusion-resistant code $I_2 = [2, 1, 1, 1]$, a large prime number is stored in the transaction: $a \cdot \mathrm{group}_i = 53 \times 3$.

S4, the residual code is extracted from the leaked data, the type of attack on the data is judged from the common features and user features extracted from the leaked data, and the judgment result is mapped to a large prime number. The watermark extraction algorithm is shown in Table 2 below:

Table 2 Watermark extraction algorithm

Step S4.1, depending on how the watermark was embedded, extraction of the watermark bits falls into three cases, based on the following formulas:

$\mathrm{group}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon]) \bmod N$

$\mathrm{hash}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon])[-k:]$

(1) When $\mathrm{hash}_i = \mathrm{LSB}_{A_1} \,\|\, \mathrm{LSB}_{A_2} \,\|\, \ldots \,\|\, \mathrm{LSB}_{A_k}$, the extracted feature is a common feature, and position $\mathrm{group}_i$ of the extracted collusion-resistant code is set to 1;

(2) When $\mathrm{hash}_i[-k:] = (\mathrm{hash}_i[-k:] - \mathrm{group}_i + 10^k) \bmod 10^k$, the extracted feature is a user feature, and position $\mathrm{group}_i$ of the extracted collusion-resistant code is set to $\mathrm{hash}_i[k]$;

(3) When the extracted watermark satisfies neither (1) nor (2), no improved collusion-resistant I code is embedded in the data.

In this embodiment, for the tuple below, $\mathrm{LSB}_{A_1} \,\|\, \mathrm{LSB}_{A_2} \,\|\, \mathrm{LSB}_{A_3} = 872$ equals the first three digits of the hash, so an embedded feature is present. The last position is then checked with $\mathrm{hash}_i[k] = (\mathrm{LSB}_{A_4} - 5 + 10^4) \bmod 10^4$: if the result equals the lowest digit 7 of the hash, the embedded feature is the user feature; otherwise it is the common feature 1.

| hash_i | group_i | A_1 | A_2 | A_3 | A_4 |
| --- | --- | --- | --- | --- | --- |
| 8727 | 5 | 0.448 | 0.3657 | 0.1252 | 0.5167 |

(5) When no user feature appears in the extracted collusion-resistant code and the common features are incomplete, it is judged that an AND-collusion attack has occurred.

In this embodiment, the judgments are as follows:

(1) The extracted collusion-resistant code contains the feature of only one user, $I_2 = [2, 1, 1, 1]$, and the common features are complete: it is judged that neither a collusion attack nor any other common attack has occurred.

(2) The extracted collusion-resistant code contains the feature of only one user, and a common-feature position conflicts with the user-feature position, $I_2 = [2(1), 1, 1, 1]$: it is judged that a common subset-addition attack has occurred.

(3) The extracted collusion-resistant code contains the feature of only one user, and the common-feature positions are incomplete, $I_2 = [2, 1, 1, 0]$: it is judged that a common subset-deletion attack has occurred.

(4) The extracted collusion-resistant code contains the features of several users, and the common features are complete, $I_2 = [2(3), 1, 1, 1]$: it is judged that an OR-collusion attack has occurred.

(5) The extracted collusion-resistant code contains no user feature, and the common features are incomplete, $I_2 = [0, 0, 1, 1]$: it is judged that an AND-collusion attack has occurred.

According to the judgment result of step S4.2: when the result is any of (1), (2) or (3), the $\mathrm{group}_i$ corresponding to the user feature is taken as the mapped large prime number; when the result is (4), the $\mathrm{group}_i$ values corresponding to the several extracted user features are each taken as a mapped large prime number; when the result is (5), the product of the $\mathrm{group}_i$ values corresponding to the common features is taken as the large prime number;

The result is placed into the mining pool for miners to trace the source.

S5, the data features and the corresponding large prime number are placed into the mining pool, and miners trace the source on the blockchain.

Step S5.1, the basic features of the data and the large prime number are placed into the mining pool; miners mine, and locate the super node from the features of the data;

Step S5.2, when a user feature has been extracted, the large prime number is divided directly by the large prime number stored in the transaction; if it divides exactly, tracing succeeds;

In this embodiment, when no collusion attack has occurred, the watermark is extracted, e.g. $I_1$, and the product of the group values whose code bit is 1, namely 53 × 5 × 7 × 11, is handed to the miners for mining.

When an OR-collusion attack occurs, e.g. $u_1$ and $u_2$ collude, the code extracted from the data is [2(1), 3(1), 1, 1]; the product of the group values whose code bit is 1, namely 53 × 7 × 11, is handed to the miners for mining.

When an AND-collusion attack occurs, e.g. $u_1$ and $u_2$ collude, the code extracted from the data is [0, 0, 1, 1]; the product of the group values whose code bit is 1, namely 53 × 7 × 11, is handed to the miners for mining.

In the collusion case above, the extracted [0, 0, 1, 1] corresponds to the group-number product 7 × 11, so (53 × 3 × 5 × 7 × 11) / (53 × 7 × 11) = 3 × 5. This quotient is then divided by the feature prime stored for each user in turn: a user whose prime divides it exactly is a colluding user, and any other user is not. Here the feature primes of $u_1$ and $u_2$ divide the quotient exactly, so $u_1$ and $u_2$ are judged to be the colluding users.

The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims

1. A data leakage tracing method capable of resisting multi-user collusion based on a block chain is characterized by comprising the following steps:

$\{T_1, T_2, \ldots, T_n\}$ are the super nodes that store uploaded data, $\{t_1, t_2, \ldots, t_m\}$ are the common nodes that store ordinary transactions, the data stored in the super nodes are $\{D_1, D_2, \ldots, D_n\}$, the attributes of the data to be distributed are $\{A_1, A_2, \ldots, A_k\}$, the maximum distortion allowed for the data is $\varepsilon$, the maximum number of users allowed to download is $J$, the current number of users who have downloaded is $j$, and the user set is $\{u_1, u_2, \ldots, u_j\}$;

S2, when the user downloads data, allocating an improved collusion-resistant I code to the user; reading from the super node the number of users who have downloaded the current data, allocating the improved collusion-resistant I code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$, and taking the diagonal element of the matrix as the feature symbol that represents the user, i.e. the user feature;

S3, embedding the allocated improved collusion-resistant I code into the data as a watermark, and distributing the data to the user;

S5, placing the data features and the corresponding large prime number into a mining pool, where miners trace the source on the blockchain;

the specific steps of extracting the common features and the user features in the leakage data in the step S4 are as follows:

$\mathrm{group}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon]) \bmod N$

$\mathrm{hash}_i = \mathrm{Hash}(A_1[:\varepsilon], \ldots, A_k[:\varepsilon])[-k:]$

wherein $N$ is the smallest integer whose range contains $J$ large prime numbers; $k$ is the number of attributes in the data table, i.e. the attribute columns of the data are $\{A_1, A_2, \ldots, A_k\}$;

(1) When $\mathrm{hash}_i = \mathrm{LSB}_{A_1} \,\|\, \mathrm{LSB}_{A_2} \,\|\, \ldots \,\|\, \mathrm{LSB}_{A_k}$, the extracted feature is a common feature, and position $\mathrm{group}_i$ of the extracted collusion-resistant code is set to 1; wherein LSB denotes the least significant bit;

according to the judgment result of step S4.2, when the result is any of (1), (2) or (3), the $\mathrm{group}_i$ corresponding to the user feature is taken as the mapped large prime number; when the result is (4), the $\mathrm{group}_i$ values corresponding to the several extracted user features are each taken as a mapped large prime number; when the result is (5), the product of the $\mathrm{group}_i$ values corresponding to the common features is taken as the large prime number;

and the result is placed into the mining pool for miners to trace the source.

2. The blockchain-based multi-user collusion resistant data leakage tracing method according to claim 1, wherein in step S1, during initialization of the blockchain nodes, selectable parameters are offered to the user according to the data uploaded by the user, the selectable parameters including the minimum-distortion position, the watermark robustness and the maximum number of users allowed to download; the watermark embedding density is determined by the number of downloads the user chooses to accommodate, wherein the more downloads accommodated, the higher the watermark density; the code [1, …, 1] is embedded into the stored data as the common code bits, and the group numbers of the embedding positions are $g_i$; the super node generates a random large prime number $a$ and stores the large prime number and the basic features of the data to be distributed; the basic features comprise the attribute columns, the attribute types and the numerical features of the attributes.

3. The method for tracing data leakage through block chain based multi-user collusion resistant according to claim 1, wherein the step of improving allocation of collusion resistant coded I-code in step S2 is as follows:

step S2.2, an improved collusion resistant encoded I-code is allocated for the current user.

4. The method for tracing data leakage through block chaining based on multi-user collusion resistance according to claim 1, wherein the step S3 of embedding the data watermark comprises the following steps:

Step S3.1, performing hash grouping on the data: finding the smallest integer $N$ whose range contains $J$ large prime numbers, wherein the attributes of the data are $\{A_1, A_2, \ldots, A_k\}$; $\mathrm{group}_i$ is the group number of a tuple, so that different tuples fall into the same group, and watermark embedding is performed on the groups whose group number is prime; $\mathrm{hash}_i$ is the truncated hash value of the tuple, wherein the number of digits kept is the number of attributes $k$;

in the improved collusion-resistant I code $I_j = \{1, \ldots, 1, j+1, 1, \ldots, 1\}$ allocated to the user, each code bit corresponds to a hash group $\mathrm{hash}_i$; when the code bit is 1, the operation is:

$A_k = A_k \,\|\, \mathrm{hash}_i[k]$

when the code bit is not 1, the code bit on the diagonal is embedded into the $k$ attributes as follows:

$A_k = (A_k + \mathrm{hash}_i[-k:]) \bmod 10^k$

5. The blockchain-based multi-user collusion resistant data leakage tracing method according to claim 1, wherein the mining tracing in step 5 comprises the following specific steps:

step 5.3: and when the user features are not extracted, dividing the large prime number by the prime number stored in the super node, and then dividing the result by the prime number stored in the common node transaction, wherein if the result can be divided completely, the source tracing is successful.