CN109995505B - Data security duplicate removal system and method in fog computing environment and cloud storage platform - Google Patents


Publication number
CN109995505B
CN109995505B (application CN201910171496.1A)
Authority
CN
China
Prior art keywords
data
fog
ciphertext
user
key
Prior art date
Legal status
Active
Application number
CN201910171496.1A
Other languages
Chinese (zh)
Other versions
CN109995505A (en)
Inventor
齐赛宇
张夫猷
袁浩然
陈晓峰
张萌
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201910171496.1A
Publication of CN109995505A
Application granted
Publication of CN109995505B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0435Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply symmetric encryption, i.e. same key used for encryption and decryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643Hash functions, e.g. MD5, SHA, HMAC or f9 MAC

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Power Engineering (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of cloud storage and discloses a system and method for secure data deduplication in a fog computing environment, together with a cloud storage platform. An encryption scheme that supports deduplication ensures that data is stored securely and efficiently on the cloud server and the fog nodes; an attribute-based encryption scheme manages the data ownership of users and fog nodes; data tags are generated with a Merkle tree, preventing malicious users from mounting side-channel attacks on the server. The invention also supports dynamic updating of user privileges: re-encryption prevents a user whose privileges have been revoked from regaining the data, ensuring the security of the data on the server. In addition, a detailed security analysis shows that the invention achieves the expected security goals while providing efficient data storage and deduplication.

Description

Data security duplicate removal system and method in fog computing environment and cloud storage platform
Technical Field
The invention belongs to the technical field of cloud storage, and particularly relates to a data security duplicate removal system and method in a fog computing environment and a cloud storage platform.
Background
Currently, the state of the art commonly used in the industry is as follows. With the development of cloud computing, more and more users prefer to store data on a cloud server rather than locally. A report by the International Data Corporation (IDC) indicates that 4.4 ZB of new data was generated worldwide in 2013, rising to 15.2 ZB in 2017, and the annual volume was predicted to reach 40 ZB by 2020, which undoubtedly places a great burden on cloud servers. Another IDC report indicates that 75% of the data generated worldwide each year is duplicated, which seriously degrades the storage efficiency of cloud servers and inflates communication overhead; eliminating this redundant data can greatly improve the storage efficiency of cloud computing. In addition, security problems caused by data leakage arise endlessly every year. Recent research by the Cloud Security Alliance (CSA) shows that data leakage ranks first among all security threats faced by cloud computing. In 2011, Google Mail accounts were compromised: about 150,000 ordinary users were affected, the data of many users was permanently deleted, and some user accounts were reset. Yahoo user data was leaked continuously from 2013, and the problem was not disclosed until 2016; according to Yahoo's statistics, about one billion users were affected to varying degrees. In 2018, Facebook was caught in a data-leakage scandal: Cambridge Analytica obtained the personal data of fifty million Facebook users through an application, and Facebook later announced that the number of affected users was far more than fifty million. Incomplete statistics show that some large enterprises lose an average of up to 3.8 million dollars per year due to data leakage. The problem of data security therefore demands close attention.
The development of cloud computing has greatly relieved the storage overhead and computing pressure of local devices. However, as the number of users on cloud servers grows dramatically, some disadvantages of cloud-server-centric services are emerging. First, latency is high because the cloud server is far away from the user. Second, during peak usage periods, network congestion easily occurs, making the user experience extremely poor. Finally, because cloud computing processes data centrally, the failure of a cloud server is likely to bring down the entire network. These problems are bottlenecks in the development of cloud computing. In 2011, Cisco proposed a new network computing paradigm, Fog Computing, building on cloudlets and edge computing to address these problems. Fog computing mainly adopts distributed systems, virtualization, Web 2.0, and related technologies, and integrates networking, computing, storage, and application capabilities. By connecting physically dispersed nodes, data and applications are distributed across devices at the edge of the network, and services are provided to nearby users. Compared with cloud computing, fog nodes sit at the network edge, so latency is low; moreover, fog nodes are independent of one another, so the failure of one node does not affect the use of the others. Adopting fog computing greatly alleviates these problems of cloud computing.
To address data security, most cloud service providers have users encrypt data at the client before uploading it to the cloud server. However, because different users choose different keys, the same data may be encrypted into different ciphertexts, so duplicate data cannot be deleted in ciphertext form. Message-Locked Encryption (MLE) ensures that the same plaintext is encrypted into the same ciphertext. However, MLE is not dynamic: if a user's privileges are revoked but the MLE key remains on the user's local device, and the user colludes with hackers to steal the ciphertext, the plaintext can be recovered by decrypting with the previously retained MLE key, which is extremely insecure for the cloud server.
In summary, the problems of the prior art are as follows: deduplication of encrypted data is incompatible with data updating, and existing encrypted-deduplication schemes are designed only for the cloud server. As a result, existing approaches neither relieve the pressure that data growth places on the cloud server nor effectively protect users' data privacy.
The difficulty of these technical problems is as follows:
after some users' privileges are revoked, the ciphertext data must be updated to prevent those users from still being able to decrypt it. The traditional approach is re-encryption, but re-encrypting the complete data is relatively expensive.
The significance of solving these technical problems is as follows:
the invention ensures that encrypted data is stored securely on the cloud server, achieves efficient data re-encryption, and guarantees that a user whose privileges have been revoked cannot correctly decrypt the data.
Disclosure of Invention
To address the problems in the prior art, the invention provides a data security deduplication system and method in a fog computing environment, and a cloud storage platform.
The invention is realized as follows. A data security deduplication method in a fog computing environment comprises the following steps:
first, the user encrypts the data with MLE at the client, generates a tag t of the file with a Merkle tree, and uploads the encrypted file and the file tag t to a fog node;
second, the fog node receives the data tag and checks whether it is in its data index table; if so, the fog node adds the user's ownership to the data index table; if not, it proceeds to the next step;
third, the fog node randomly selects 256 bits from the MLE ciphertext for re-encryption, and distributes the selected 256-bit positions and the re-encryption key to the other fog nodes; the fog node adds the data tag and the user's ownership to the data index table, and sends the re-encrypted data and the data tag to the cloud server;
fourth, the cloud server receives data sent by different fog nodes and judges from the uploaded data tags whether duplicate data exists; if so, only one backup is kept and the remaining redundant data is deleted.
Further, the step in which the user encrypts the data with MLE at the client comprises:
(1) key generation: input a plaintext m and compute its hash value with SHA-256 to obtain the MLE key, i.e., hash(m) → k;
(2) AES-encrypt the plaintext m with the MLE key generated in step (1) to obtain the ciphertext c, i.e., Enc_k(m) → c;
(3) the user generates a hash value of the ciphertext c with a Merkle tree, recorded as the tag t;
(4) the user keeps the MLE key k and uploads the ciphertext c and the tag t to the fog node.
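The four client-side steps above can be sketched compactly in Python. The patent specifies AES for step (2); to keep this sketch dependency-free, a keyed SHA-256 keystream stands in for AES — an illustrative assumption, not the patented construction. The key property MLE needs still holds: the same plaintext always yields the same key and ciphertext.

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    """Deterministic keystream from a key (a stand-in for AES-CTR)."""
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def mle_encrypt(m: bytes) -> tuple[bytes, bytes]:
    k = hashlib.sha256(m).digest()                              # (1) hash(m) -> k
    c = bytes(a ^ b for a, b in zip(m, _keystream(k, len(m))))  # (2) Enc_k(m) -> c
    return k, c

def mle_decrypt(k: bytes, c: bytes) -> bytes:
    """XOR stream ciphers are symmetric: decryption reapplies the keystream."""
    return bytes(a ^ b for a, b in zip(c, _keystream(k, len(c))))
```

Because k is derived deterministically from m, two users holding the same file independently produce identical ciphertexts, which is what makes ciphertext-side deduplication possible.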
Further, user data ownership detection:
the fog node compares the tag t uploaded by the user against its data index table. If t is in the table, the fog node directly adds the user's privilege to the data index table and need not receive the ciphertext c uploaded by the user; if t is not in the table, the fog node must re-encrypt the data;
re-encryption of the ciphertext data:
(1) key generation: input a security parameter to obtain a random encryption key, called the FileKey: Gen(1^λ) → fk;
(2) re-encryption: select 256 bits from the MLE ciphertext c, denoted c_1, with the remainder denoted c_2, and AES-encrypt c_1 to obtain the stub, i.e., Enc_fk(c_1) → stub;
(3) data index table update: the fog node adds the user data tag t and the owners of t to the data index table;
(4) data upload: the fog node packages c_2 into a TrimmedPackage and uploads the TrimmedPackage, the stub, and the user tag t to the cloud server. The fog node keeps the random encryption key fk;
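The stub/TrimmedPackage split can be sketched as follows. Function names are illustrative, and an XOR pad derived from fk again stands in for the AES encryption of c_1; the point is that only 32 bytes (256 bits) are ever encrypted under fk, while the bulk c_2 travels as-is, and that the split is losslessly reversible given fk and the selected positions.

```python
import hashlib
import random

def reencrypt_split(c: bytes, fk: bytes):
    """Select 32 bytes (256 bits) of the MLE ciphertext at random positions,
    encrypt them under the FileKey fk to form the stub, and package the
    rest as the TrimmedPackage (XOR pad stands in for AES)."""
    positions = sorted(random.sample(range(len(c)), 32))
    selected = set(positions)
    c1 = bytes(c[i] for i in positions)                     # the 256 chosen bits
    c2 = bytes(b for i, b in enumerate(c) if i not in selected)
    pad = hashlib.sha256(fk).digest()                       # 32-byte pad from fk
    stub = bytes(a ^ b for a, b in zip(c1, pad))            # Enc_fk(c1) -> stub
    return positions, stub, c2                              # c2 = TrimmedPackage

def recover_mle_ciphertext(positions, stub, c2, fk) -> bytes:
    """Dec_fk(stub) -> c1, then splice c1 back into c2 to rebuild c."""
    pad = hashlib.sha256(fk).digest()
    c1 = bytes(a ^ b for a, b in zip(stub, pad))
    out = bytearray(c2)
    for pos, byte in zip(positions, c1):    # ascending order keeps indices valid
        out.insert(pos, byte)
    return bytes(out)
```

Without fk and the positions, the cloud server holds only c_2 and an opaque stub, so it cannot reassemble the MLE ciphertext; this is what lets the fog node revoke access by rotating fk later.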
updating of the ciphertext data:
(1) stub decryption: the fog node receives the server's update request and decrypts the stub with the previously kept fk to obtain c_1, i.e., Dec_fk(stub) → c_1;
(2) MLE ciphertext recovery: splice c_1 and the TrimmedPackage (i.e., c_2) together to recover the MLE ciphertext c;
(3) new FileKey generation: input a security parameter to obtain a new random encryption key: Gen(1^λ) → fk';
(4) ciphertext re-encryption: reselect 256 bits from the ciphertext c, denoted c'_1, with the remainder denoted c'_2; encrypt c'_1 with fk' to obtain a new stub: Enc_fk'(c'_1) → stub';
(5) data upload: package c'_2 into a new TrimmedPackage and upload the new stub and TrimmedPackage to the cloud server;
re-encryption key distribution:
after the fog node re-encrypts the ciphertext, it shares the re-encryption key and the selected positions of c'_1 with the other fog nodes. The fog node distributes the key to the other nodes using attribute-based encryption (ABE), with the following specific steps:
(1) key generation: input a security parameter 1^λ to obtain the public key PK and the master key MK, i.e., Setup(1^λ) → (PK, MK);
(2) private key generation: input the public key PK, the master key MK, and the attribute set of a fog node, and output that fog node's private key, i.e., KeyGen(PK, MK, S) → SK;
(3) encryption: the fog node takes the selected positions of c'_1 and the encryption key as the message M; it inputs the public key PK of the other fog nodes, the message M, and an access policy T, outputs the ciphertext CT, and sends it to the cloud server, which then distributes CT to the other fog nodes, i.e., Enc(PK, M, T) → CT;
(4) decryption: the other fog nodes receive the ciphertext CT from the cloud server and decrypt it with the public key and their own private keys SK. Each ciphertext corresponds to an access policy T; if a fog node's attribute set S satisfies the access policy T, the ciphertext is decrypted correctly, otherwise decryption fails, i.e., Dec(PK, SK, CT) → M iff S satisfies T.
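The pairing-based CP-ABE machinery itself is not reproduced here; the toy sketch below only illustrates the access-control semantics of step (4) — the message is released iff the attribute set S satisfies the policy T — with T modeled as a nested AND/OR tree. All names and the policy encoding are illustrative assumptions.

```python
def satisfies(attrs: set, policy) -> bool:
    """Evaluate an access-policy tree T against an attribute set S.
    A policy is either an attribute string or a tuple ("AND"|"OR", *subpolicies)."""
    if isinstance(policy, str):
        return policy in attrs
    op, *children = policy
    results = [satisfies(attrs, child) for child in children]
    return all(results) if op == "AND" else any(results)

def abe_decrypt(attrs: set, policy, wrapped_message):
    """Release the message only when S satisfies T, mirroring
    Dec(PK, SK, CT) -> M iff S satisfies T."""
    return wrapped_message if satisfies(attrs, policy) else None
```

In the real scheme this check is enforced cryptographically by the CP-ABE ciphertext rather than by a trusted evaluator, which is why a non-qualifying fog node cannot learn the re-encryption key even if it intercepts CT.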
Further, the specific method for server-side data deduplication and storage and for deleting redundant data at the server comprises:
(1) after receiving the data and the data tag t sent by a fog node, the cloud server judges whether the data is identical by checking t;
(2) if not, the cloud server stores the file tag in its data index table and stores the data; unlike the fog node's data index table, the cloud server's index stores only data tags, not the owners of the data;
(3) if so, the cloud server deletes the identical data, keeps only one backup, removes the remaining redundant data, and adds the data tag to the data index table.
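The two index tables can be sketched as follows: the fog-node table maps each tag to its set of owners (client-side deduplication), while the cloud table records one copy per tag and no owners (server-side deduplication). The data structures and function names are illustrative.

```python
fog_index: dict = {}      # fog-node table: tag -> set of owning users
cloud_store: dict = {}    # cloud table: tag -> the single stored copy, no owners

def cloud_receive(tag: str, data: bytes) -> None:
    """Server-side dedup at the cloud: keep only one backup per tag."""
    if tag not in cloud_store:
        cloud_store[tag] = data

def fog_receive(tag: str, user: str, data: bytes) -> bool:
    """Client-side dedup at the fog node; returns True when the upload
    was deduplicated (only the user's ownership is recorded)."""
    if tag in fog_index:
        fog_index[tag].add(user)
        return True
    fog_index[tag] = {user}
    cloud_receive(tag, data)
    return False
```

A second user uploading the same file triggers only an ownership update at the fog node; no ciphertext travels to the cloud, and the cloud still holds exactly one copy.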
Further, since sending only the hash value of the data suffices to learn whether the cloud server stores that data, malicious users could use this to probe which data the cloud server holds. The specific operations of the proof of ownership (PoW) of a file therefore comprise:
(1) partition the ciphertext data into blocks b_1, b_2, …, b_n;
(2) hash b_1, b_2, …, b_n in turn to obtain h_1, h_2, …, h_n;
(3) concatenate h_1 with h_2, h_3 with h_4, and so on, up to h_{n-1} with h_n;
(4) hash each concatenation to obtain hs_1, …, hs_{n/2};
(5) then concatenate hs_1 with hs_2, hs_3 with hs_4, and so on;
(6) keep hashing the concatenations, looping until a single final value remains: the data tag t.
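The six steps above compute a Merkle root over the ciphertext blocks. A compact sketch follows, assuming SHA-256 and 32-byte blocks; for an odd number of nodes at a level the last node is promoted unchanged — a common convention, since the text does not fix the odd-node rule.

```python
import hashlib

def merkle_tag(ciphertext: bytes, block_size: int = 32) -> bytes:
    """Merkle-tree tag t over ciphertext blocks b1..bn."""
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext), block_size)] or [b""]
    level = [hashlib.sha256(b).digest() for b in blocks]       # h1..hn
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):                  # cascade pairs
            nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        if len(level) % 2:                                     # promote odd node
            nxt.append(level[-1])
        level = nxt
    return level[0]                                            # root = tag t
```

Because the tag commits to every block, a user who knows only a file's plain hash cannot reproduce t, which blunts the probing attack described above.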
Another object of the invention is to provide a data security deduplication system in a fog computing environment that implements the above method. The system comprises:
the client, where the user encrypts the data with MLE;
the fog node, which mainly performs four operations: user data ownership detection, ciphertext data re-encryption, ciphertext data updating, and re-encryption key distribution;
and the cloud server, which is responsible for server-side data deduplication and storage, deleting redundant data and keeping only one copy.
A further object of the invention is to provide a cloud storage platform applying the above data security deduplication method in a fog computing environment.
In summary, the advantages and positive effects of the invention are as follows. In this storage method for secure data deduplication in a fog computing environment, the secure encryption scheme operates in two places — on the user's client and on the fog node — while deduplication is performed mainly on the fog node and the cloud server. Re-encryption updates of the ciphertext are completed by the fog node. Combining the Merkle tree with the proof-of-ownership technique prevents malicious users from mounting side-channel attacks on the server.
The invention realizes a data security deduplication system in a fog computing environment. Deploying the storage system on fog nodes relieves the pressure on the cloud server and overcomes shortcomings such as high latency and network congestion at the cloud server. Meanwhile, storing data in encrypted form effectively prevents the loss of users' private data caused by server data leakage. In addition, the invention adopts a re-encryption scheme on the fog node that supports updating, so users whose privileges have been revoked cannot obtain the plaintext data again. The invention adopts client-side deduplication at the fog node: the user first sends the file tag, and if the server already holds the data, the user need not upload it, which greatly saves communication overhead. The invention also generates data tags with a Merkle tree, preventing malicious users from mounting side-channel attacks on the server.
TABLE 1 comparison of the present invention with previous protocols
Drawings
Fig. 1 is a flowchart of a data security deduplication method in a fog computing environment according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of a data security deduplication system in a fog computing environment according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of encrypting and decrypting data according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of Message-Locked Encryption (MLE) according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of data ownership tag generation according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of updating ciphertext data on a fog node according to an embodiment of the present invention.
Fig. 7 is a schematic structural diagram of distributing a re-encryption key according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a data index table on a fog node according to an embodiment of the present invention.
Fig. 9 is a schematic structural diagram of a data index table on a cloud server according to an embodiment of the present invention.
Fig. 10 is a schematic diagram illustrating an operation of deleting duplicate data by a cloud server according to an embodiment of the present invention.
Fig. 11 shows the time required for TrimmedPackage partial encryption.
Fig. 12 shows the time required for stub partial encryption.
Fig. 13 shows the time required for updating the ciphertext data.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Data deduplication targets the large amounts of identical data stored on a server: it deletes the identical copies and keeps only one backup. By the granularity at which deduplication is performed, data security deduplication falls into two categories: file-level deduplication and block-level deduplication. File-level deduplication treats the file as the smallest deduplication unit: the server performs duplicate detection by file tag and keeps a unique copy of each file. Block-level deduplication treats the data block as the smallest unit: the server performs duplicate detection by block tag and keeps a unique copy of each block. By blocking mode, block-level deduplication further divides into deduplication based on fixed-length blocking and deduplication based on variable-length blocking. By deduplication framework, data security deduplication divides into server-side deduplication and client-side deduplication:
in server-side deduplication, the user uploads all data to the server; the server detects whether the uploaded data is duplicated and keeps only one backup after deleting the redundant data. In this process, the user does not know whether the uploaded data was deduplicated. In client-side deduplication, the user first sends the file's tag to the server; the server checks by the tag whether the data already exists. If it does, the user need not upload the data again, and the server grants the user ownership of the data. Throughout this process, the user knows whether the data was deduplicated.
In the invention, client-side deduplication is adopted at the fog nodes, and server-side deduplication is adopted at the cloud server.
Message-locked encryption: traditional encryption is unsuitable for data deduplication because different users choose different encryption keys, so the same plaintext file is encrypted into different ciphertext files. In 2002, Douceur proposed the concept of Convergent Encryption (CE), which ensures that the same plaintext generates the same key. In 2013, Bellare proposed Message-Locked Encryption (MLE) on the basis of convergent encryption: the MLE encryption key is derived from the hash value of the plaintext file, so the same plaintext generates the same ciphertext, and message-locked encryption thus allows deduplication to be performed on ciphertext.
The concept of Attribute-Based Encryption (ABE) was first proposed by Sahai and Waters. Attribute-based encryption is a public-key encryption scheme in which the public key is a user's attribute set, which greatly simplifies public-key management. It divides into two main types: Ciphertext-Policy ABE (CP-ABE) and Key-Policy ABE (KP-ABE). The two are duals of each other: in CP-ABE, the ciphertext carries the access policy and the key corresponds to the user's attribute set; in KP-ABE, the key carries the access policy and the ciphertext corresponds to the attribute set. Of the two, CP-ABE is more flexible, and the invention mainly adopts CP-ABE.
The Merkle tree is a binary hash tree composed of a root node, several intermediate nodes, and several leaf nodes, used to verify the integrity of user data and to prove user ownership. The leaf nodes are built from the data, and every non-leaf node is obtained by concatenating the values of its children and hashing the result; computing layer by layer from the bottom up yields the unique root node.
The following detailed description of the principles of the invention is provided in connection with the accompanying drawings.
As shown in fig. 1, the data security deduplication method in a fog computing environment provided by the embodiment of the invention comprises the following steps:
S101: the user encrypts the data with MLE at the client, generates a tag t of the file with a Merkle tree, and uploads the encrypted file and the file tag t to a fog node;
S102: the fog node receives the data tag and checks whether it is in its data index table; if so, the fog node adds the user's ownership to the data index table; if not, it proceeds to the next step;
S103: the fog node randomly selects 256 bits from the MLE ciphertext for re-encryption, distributes the selected 256-bit positions and the re-encryption key to the other fog nodes, adds the data tag and the user's ownership to the data index table, and sends the re-encrypted data and the data tag to the cloud server;
S104: the cloud server receives data sent by different fog nodes and judges from the uploaded data tags whether duplicate data exists; if so, only one backup is kept and the remaining redundant data is deleted.
In the data security deduplication method in a fog computing environment, the user runs the client, which performs the first encryption of the file and generates the corresponding file tag. MLE encryption guarantees that the same plaintext file yields the same ciphertext file. S102 is executed at the fog node and determines whether the file the current user is uploading has already been uploaded to the cloud server: if the fog node finds that it has, the user need not upload the file, and the user's data ownership is simply added to the data index table shown in fig. 8; if the fog node finds that it has not, the process jumps to S103. S103 is also executed on the fog node: the fog node re-encrypts the MLE ciphertext uploaded by the user, adds the user's data ownership to the data index table shown in fig. 8, uploads the re-encrypted file to the cloud server, and jumps to S104. S104 is executed on the cloud server: the cloud server detects whether the data uploaded by different fog nodes is duplicated. If so, the cloud server deletes the redundant data, keeps only one copy, and adds the data to the data index table shown in fig. 9; if there is no duplicate data, the cloud server saves the data and adds it to the data index table shown in fig. 9.
As shown in fig. 2, the data security deduplication system in a fog computing environment provided by the embodiment of the invention comprises:
the client, where the user encrypts the data with MLE, mainly through the following steps:
(1) key generation: input a plaintext m and compute its hash value with SHA-256 to obtain the MLE key, i.e., hash(m) → k;
(2) AES-encrypt the plaintext m with the MLE key to obtain the ciphertext c, i.e., Enc_k(m) → c;
(3) the user generates a hash value of the ciphertext c with a Merkle tree, recorded as the tag t;
(4) the user keeps the MLE key k and uploads the ciphertext c and the tag t to the fog node.
The fog node mainly executes four operations: detecting the ownership of user data, re-encrypting ciphertext data, updating the ciphertext data and distributing a re-encryption key; the method comprises the following specific steps:
user data ownership detection:
the fog node compares the tag t uploaded by the user with the data in the data index table established by the fog node, and if the t is in the data index table, the fog node directly adds the authority of the user to the data index table without receiving the ciphertext c uploaded by the user; if t is not in the index table, the foggy node needs to perform re-encryption operation on the data.
And (3) re-encrypting the ciphertext data:
(1) Key generation: input the security parameter to obtain a random encryption key, called the FileKey: Gen(1^λ) → fk;
(2) Re-encryption: select 256 bits from the MLE ciphertext c, denoted c1; the remainder is denoted c2; encrypt c1 with AES under fk to obtain the stub, i.e. Enc_fk(c1) → stub;
(3) Data index table update: the fog node adds the user data tag t and the owning users of t to the data index table;
(4) Data upload: the fog node packages c2 into a TrimmedPackage and uploads the TrimmedPackage, the stub, and the user tag t to the cloud server; the fog node retains the random encryption key fk.
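The split into stub and TrimmedPackage can be sketched as follows. This is a minimal illustration: the SHA-256 keystream again stands in for the patent's AES, `secrets.token_bytes` stands in for Gen(1^λ), and the position parameter is made explicit because the patent later re-selects it on update:

```python
import hashlib
import secrets

STUB_BYTES = 32  # 256 bits

def toy_encrypt(fk: bytes, data: bytes) -> bytes:
    # Stand-in for AES: XOR with a SHA-256-derived keystream (toy only).
    ks = b""
    ctr = 0
    while len(ks) < len(data):
        ks += hashlib.sha256(fk + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, ks))

def re_encrypt(c: bytes, pos: int):
    """Split the MLE ciphertext at `pos`: 256 bits become c1 (-> stub),
    the remainder becomes c2 (-> TrimmedPackage)."""
    fk = secrets.token_bytes(32)            # (1) Gen(1^lambda) -> fk
    c1 = c[pos:pos + STUB_BYTES]            # (2) the selected 256 bits
    c2 = c[:pos] + c[pos + STUB_BYTES:]     #     the remainder
    stub = toy_encrypt(fk, c1)              #     Enc_fk(c1) -> stub
    trimmed_package = c2                    # (4) packaged and uploaded
    return fk, pos, stub, trimmed_package   # fog node retains fk and pos
```

Only 32 bytes are ever re-encrypted, which is why a key change is cheap regardless of file size.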
updating of ciphertext data:
(1) Stub decryption: the fog node receives the server's update request and decrypts the stub with the previously retained fk to obtain c1, i.e. Dec_fk(stub) → c1;
(2) MLE ciphertext recovery: splice the ciphertext c1 with the TrimmedPackage (i.e. c2) to obtain the MLE ciphertext c;
(3) New FileKey generation: input the security parameter to obtain a random encryption key: Gen(1^λ) → fk′;
(4) Ciphertext re-encryption: reselect 256 bits from the ciphertext c, denoted c′1; the remainder is denoted c′2; encrypt c′1 with fk′ to obtain a new stub, Enc_fk′(c′1) → stub;
(5) Data upload: package c′2 into a new TrimmedPackage and upload the new stub and TrimmedPackage to the cloud server.
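The five update steps amount to: decrypt the old stub, reassemble the MLE ciphertext, then re-split it under a fresh key at a fresh position. A self-contained sketch under the same assumptions as before (a SHA-256 keystream as a toy stand-in for AES; all names illustrative):

```python
import hashlib
import secrets

def toy_encrypt(fk: bytes, data: bytes) -> bytes:
    # Stand-in for AES (XOR with a SHA-256 keystream; illustration only).
    ks = b""
    ctr = 0
    while len(ks) < len(data):
        ks += hashlib.sha256(fk + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, ks))

def update(stub, trimmed_package, fk, pos, new_pos):
    # (1) Stub decryption: Dec_fk(stub) -> c1
    c1 = toy_encrypt(fk, stub)
    # (2) MLE ciphertext recovery: splice c1 back into c2 at `pos`
    c = trimmed_package[:pos] + c1 + trimmed_package[pos:]
    # (3) New FileKey: Gen(1^lambda) -> fk'
    new_fk = secrets.token_bytes(32)
    # (4) Re-encryption: 256 bits at a newly selected position
    new_c1 = c[new_pos:new_pos + 32]
    new_c2 = c[:new_pos] + c[new_pos + 32:]
    new_stub = toy_encrypt(new_fk, new_c1)
    # (5) The new stub and TrimmedPackage are uploaded to the cloud server
    return new_stub, new_c2, new_fk, new_pos
```

Because the MLE ciphertext itself never changes, the update rotates only fk and the stub position, invalidating keys held by revoked users at minimal cost.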
Re-encryption key distribution:
After the fog node re-encrypts the ciphertext, it must share the re-encryption key and the position selected for c′1 with the other fog nodes so that they can decrypt. In the invention, the fog node distributes the key to the other nodes by attribute-based encryption (ABE), with the following specific steps:
(1) Key generation: input the security parameter 1^λ to obtain the public key PK and the master key MK, i.e. Setup(1^λ) → PK, MK;
(2) Private key generation: input the public key PK, the master key MK, and the attribute set of the fog node, and output the fog node's private key, i.e. KeyGen(PK, MK, S) → SK;
(3) Encryption: the fog node takes the position selected for c′1 and the encryption key as the message M; input the public key PK of the other fog nodes, the message M, and the access policy T; output the ciphertext CT and send it to the cloud server, which then distributes CT to the other fog nodes, i.e. Enc(PK, M, T) → CT;
(4) Decryption: the other fog nodes receive the ciphertext CT from the cloud server and decrypt it with the public key and their own private key SK; each ciphertext corresponds to an access policy T, and if the fog node's attribute set S satisfies the access policy T, the ciphertext can be decrypted correctly, otherwise decryption fails, i.e. Dec(PK, SK, CT) → M iff S ∈ T.
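Real CP-ABE requires pairing-based cryptography and does not fit in a few lines, so the sketch below only mimics the four-algorithm interface (Setup / KeyGen / Enc / Dec), with the access policy modeled as a set of required attributes. It shows the control flow of the distribution step only; every name is illustrative, the payload is not actually hidden, and this provides no cryptographic protection whatsoever:

```python
import hashlib
import secrets

def setup():
    # Setup(1^lambda) -> PK, MK (toy stand-ins, no real ABE here)
    mk = secrets.token_bytes(32)
    pk = hashlib.sha256(mk).digest()
    return pk, mk

def keygen(pk, mk, attributes: frozenset):
    # KeyGen(PK, MK, S) -> SK: bind the attribute set S into the "key"
    h = hashlib.sha256(mk + repr(sorted(attributes)).encode()).digest()
    return (attributes, h)

def enc(pk, message: bytes, policy: frozenset):
    # Enc(PK, M, T) -> CT: the policy T is modeled as a required attribute set
    return {"policy": policy, "payload": message}

def dec(pk, sk, ct):
    # Dec(PK, SK, CT) -> M iff S satisfies T, else decryption fails
    attributes, _ = sk
    if ct["policy"] <= attributes:
        return ct["payload"]
    raise PermissionError("attribute set does not satisfy the access policy")
```

In the patent's flow, M would carry fk′ and the new stub position, CT travels via the cloud server, and only fog nodes whose attribute set satisfies T recover M.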
The cloud server is responsible for server-side data deduplication and data storage, deleting redundant data and keeping only one copy of the data at the server side; the specific process is as follows:
(1) After receiving the data and the data tag t sent by the fog node, the cloud server checks t to judge whether identical data already exist;
(2) if not, the cloud server stores the file tag in the data index table and stores the data; unlike the data index table of the fog nodes, the cloud server's data index stores only the data tags, not the owners of the data;
(3) if so, the cloud server deletes the identical data, keeps only one backup, deletes the remaining redundant data, and adds the data tag to the data index table.
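The server-side process above reduces to a lookup keyed by tag. A sketch with an illustrative index table in the fig. 9 layout (tag → (TrimmedPackage, stub)); names are not from the patent:

```python
# Cloud-server data index table (fig. 9): tag -> (TrimmedPackage, stub).
# Unlike the fog node's table, no owner information is stored here.
cloud_index: dict[str, tuple[bytes, bytes]] = {}

def server_store(t: str, trimmed_package: bytes, stub: bytes) -> bool:
    """Deduplicate by tag: keep one backup, drop redundant copies.

    Returns True if the data was newly stored, False if it was a duplicate
    (in which case the redundant copy is simply discarded).
    """
    if t in cloud_index:
        return False                  # duplicate: delete the redundant data
    cloud_index[t] = (trimmed_package, stub)
    return True
```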
The data security deduplication system in the fog computing environment provided by the embodiment of the invention is divided into a three-layer structure: client, fog node, and cloud server. Operations such as blocking and encrypting data are all executed by users at the client. A fog node manages several clients within a certain area; because the clients managed by the same fog node are geographically close, the data they upload are highly similar, so this method achieves high deduplication efficiency. Moreover, the fog node isolates users from the server, making it difficult for malicious users to deploy their own applications directly on the server. To each user, the fog node appears as a small cloud server. The server functions like a traditional cloud server and is responsible for detecting duplicate data and saving only a single backup. In the invention, the cloud server does not communicate directly with users but connects to the fog nodes, so the server only stores data without managing users' data ownership, which greatly reduces the server's storage overhead and simplifies its storage scheme.
The application of the principles of the present invention will now be described in further detail with reference to the accompanying drawings.
As shown in fig. 3, a schematic diagram of encryption and decryption in the present invention, the MLE encryption step is performed at the client. When a user needs to upload, the data is first MLE-encrypted to ensure the plaintext is not leaked, and then uploaded to the fog node. The fog node divides the data into two parts, one of which is only 256 bits, and re-encrypts those 256 bits, which keeps the ciphertext data dynamic. The whole ciphertext is not re-encrypted because every update would then require encrypting and decrypting all of the content, which is costly. When a user needs to download a file, the user sends a request to the fog node; the fog node checks whether the user's data ownership is in the data index table, and if so, downloads the data from the server, performs the first decryption, splices the decrypted parts to recover the MLE ciphertext, and sends it to the user, who decrypts it locally to obtain the original plaintext data.
As shown in fig. 4, a schematic diagram of the principle of message-locked encryption (MLE): the encryption key K is generated from the hash value of the plaintext M, and the ciphertext file is used to generate the data tag T (the tag-generation process is shown in fig. 5). The plaintext M is AES-encrypted with the key K to obtain the ciphertext C. Encrypting in this way guarantees that identical plaintexts encrypt to identical ciphertexts, saving storage overhead while preserving data security.
As shown in fig. 5, a schematic diagram of the generation of the data tag T mainly includes the following steps:
(1) First, the user divides the encrypted data into blocks, denoted b1, b2, b3, b4;
(2) compute the hash values of b1, b2, b3, b4 in turn to obtain h1, h2, h3, h4;
(3) concatenate h1 with h2, and h3 with h4;
(4) hash each concatenation to obtain S1 and S2;
(5) finally, concatenate S1 with S2 and hash the result to obtain the final result, the data tag T.
A malicious user cannot determine which data are stored on the server simply by sending a data tag to the server: to learn anything, the user must already be the owner of the data. Even the owner of the data can only detect whether the fog node in its own area has uploaded the data; it cannot detect whether other fog nodes store the data. In this way, side-channel attacks are effectively resisted.
Fig. 6 is a schematic diagram of ciphertext data update. When some users' rights are revoked, the ciphertext data must be updated; otherwise, if these users collude with malicious adversaries, the adversaries can decrypt the data in the server with the revoked users' keys, which is dangerous for the server. The ciphertext data therefore needs to be updated regularly. The update mainly comprises the following steps:
(1) Stub decryption: the fog node receives the server's update request and decrypts the stub to obtain c1, i.e. Dec_fk(stub) → c1;
(2) MLE ciphertext recovery: splice the ciphertext c1 with the TrimmedPackage (i.e. c2) to obtain the MLE ciphertext c;
(3) New FileKey generation: input the security parameter to obtain a random encryption key: Gen(1^λ) → fk′;
(4) Ciphertext re-encryption: reselect 256 bits from the ciphertext c, denoted c′2; the remainder is denoted c′11 and c′12; encrypt c′2 with fk′ to obtain a new stub, Enc_fk′(c′2) → stub;
(5) Data upload: package c′11 and c′12 into a new TrimmedPackage and upload the new stub and TrimmedPackage to the cloud server.
As shown in fig. 7, a schematic diagram of the redistribution of updated keys: after the data are updated, the fog node sends the position selected for the new stub and the encryption key to the server in ABE-encrypted form, and the server sends the ABE ciphertext to the other fog nodes. If the data need to be updated again, one of the fog nodes is selected and the operation is repeated.
Fig. 8 is a schematic structural diagram of the data index table on a fog node. The index table has two columns: the right column records the data tags, indicating the data this fog node has uploaded to the cloud server, and the left column records the legal owners of the data. When a user sends a download request, the fog node must check whether the user is a legal owner of the data; if so, the fog node downloads the data from the cloud server and sends it to the user, otherwise the fog node directly rejects the user's request. As shown in fig. 8, the owners of the data a68a791667344340 are user A and user B; if user A or B applies to download the data, the fog node transmits it to the user, whereas if user C applies to download it, the fog node directly rejects user C's request.
Fig. 9 is a schematic structural diagram of the data index table on the cloud server. This index table also has two columns but, unlike the one on the fog node, it does not record the owners of the data: the left column records the data tags and the right column records the data corresponding to each tag, with the data content divided into two parts, TrimmedPackage and Stub. When the fog node receives a user's download request, it looks up the data corresponding to the tag on the cloud server, decrypts and splices them to obtain the MLE ciphertext, and sends it to the user. For example, user A wants to download the data a68a791667344340: the fog node verifies that user A is a legal owner and sends the data tag to the cloud server; the cloud server finds that the data corresponding to tag a68a791667344340 are TrimmedPackage 05 and Stub 05 and sends both to the fog node; the fog node decrypts Stub 05, splices it with the TrimmedPackage, and sends the resulting MLE ciphertext to the user.
Fig. 10 is a schematic diagram of detecting and deleting duplicate data at the server side. As shown in the figure, the cloud server receives data uploaded from different fog nodes — data A, data B, and data C — wherein:
ta=bcdf0a4058a8943d;
tb=bcdf0a4058a8943d;
tc=bcdf0a4058a8943d;
After detection by the server, ta, tb, and tc are identical, which proves that data A, data B, and data C are identical. The server therefore deletes data B and data C, stores only data A, and adds data A with its corresponding data tag ta to the data index table shown in fig. 9.
Fig. 11 shows the time required for TrimmedPackage encryption and fig. 12 the time required to encrypt the Stub data; the abscissa is the size of each data block after division, the ordinate is the time required to encrypt the entire file, and the entire file size is 10 MB.
Fig. 13 shows the time required for data re-encryption update, with the abscissa representing the size of each data block, the ordinate representing the time required for updating the entire file, and the entire file size being 10 MB.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (4)

1. A data security deduplication method in a fog computing environment is characterized by comprising the following steps:
firstly, a user encrypts data by adopting MLE at a client, generates a label t of a file by adopting a Merkle tree, and uploads the encrypted file and the file label t to a fog node;
secondly, the fog node receives the data label and detects whether the data label is in a data index table or not; if yes, the fog node adds ownership of the user into the data index table, and if not, the next step is carried out;
thirdly, the fog node randomly selects 256 bits from the MLE ciphertext for re-encryption and distributes the positions of the selected 256 bits and the re-encryption key to the other fog nodes; the fog node adds the data tag and the user's ownership to the data index table and sends the re-encrypted data and the data tag to the cloud server;
fourthly, the cloud server receives data sent by different fog nodes, and whether repeated data exist is judged according to data labels uploaded by the fog nodes; if yes, only one backup is reserved, and the rest redundant data are deleted;
the method for the user to encrypt the data by adopting the MLE at the client comprises the following steps:
(1) Key generation: input a plaintext m and compute its hash value with the SHA-256 hash algorithm to obtain the MLE key, i.e. hash(m) → k;
(2) AES-encrypt the plaintext m with the MLE key to obtain the ciphertext c, i.e. Enc_k(m) → c;
(3) the user generates a hash value for the ciphertext c with a Merkle tree, recorded as the tag t;
(4) the user retains the MLE key k and uploads the ciphertext c and the tag t to the fog node;
user data ownership detection:
the fog node compares the tag t uploaded by the user with the data in the data index table established by the fog node, and if the t is in the data index table, the fog node directly adds the authority of the user to the data index table without receiving the ciphertext c uploaded by the user; if t is not in the index table, the fog node needs to perform re-encryption operation on the data;
and (3) re-encrypting the ciphertext data:
(1) Key generation: input the security parameter to obtain a random encryption key, called the FileKey: Gen(1^λ) → fk;
(2) Re-encryption: select 256 bits from the MLE ciphertext c, denoted c1; the remainder is denoted c2; encrypt c1 with AES to obtain the stub, i.e. Enc_fk(c1) → stub;
(3) Data index table update: the fog node adds the user data tag t and the owning users of t to the data index table;
(4) Data upload: the fog node packages c2 into a TrimmedPackage and uploads the TrimmedPackage, the stub, and the user tag t to the cloud server; the fog node retains the random encryption key fk;
updating of ciphertext data:
(1) Stub decryption: the fog node receives the server's update request and decrypts the stub with the previously retained fk to obtain c1, i.e. Dec_fk(stub) → c1;
(2) MLE ciphertext recovery: splice the ciphertext c1 with the TrimmedPackage to obtain the MLE ciphertext c;
(3) New FileKey generation: input the security parameter to obtain a random encryption key: Gen(1^λ) → fk′;
(4) Ciphertext re-encryption: reselect 256 bits from the ciphertext c, denoted c′1; the remainder is denoted c′2; encrypt c′1 with fk′ to obtain a new stub, Enc_fk′(c′1) → stub;
(5) Data upload: package c′2 into a new TrimmedPackage and upload the new stub and TrimmedPackage to the cloud server;
re-encryption key distribution:
after the fog node re-encrypts the ciphertext, it shares the re-encryption key and the position selected for c′1 with the other fog nodes; the fog node distributes the key to the other nodes by attribute-based encryption (ABE), with the following specific steps:
(1) Key generation: input the security parameter 1^λ to obtain the public key PK and the master key MK, i.e. Setup(1^λ) → PK, MK;
(2) Private key generation: input the public key PK, the master key MK, and the attribute set of the fog node, and output the fog node's private key, i.e. KeyGen(PK, MK, S) → SK;
(3) Encryption: the fog node takes the position selected for c′1 and the encryption key as the message M; input the public key PK of the other fog nodes, the message M, and the access policy T; output the ciphertext CT and send it to the cloud server, which then distributes CT to the other fog nodes, i.e. Enc(PK, M, T) → CT;
(4) Decryption: the other fog nodes receive the ciphertext CT from the cloud server and decrypt it with the public key and their own private key SK; each ciphertext corresponds to an access policy T, and the ciphertext can be decrypted correctly only if the fog node's attribute set S satisfies the access policy T, otherwise decryption fails, i.e. Dec(PK, SK, CT) → M iff S ∈ T.
2. The data security deduplication method in a fog computing environment according to claim 1, wherein the server side is responsible for data deduplication and data storage, deleting redundant data and keeping only one copy of the data at the server side; the specific method comprises the following steps:
(1) after receiving the data and the data tag t sent by the fog node, the cloud server checks t to judge whether identical data already exist;
(2) if not, the cloud server stores the file tag in the data index table and stores the data; unlike the data index table of the fog nodes, the cloud server's data index stores only the data tags, not the owners of the data;
(3) if so, the cloud server deletes the identical data, keeps only one backup, deletes the remaining redundant data, and adds the data tag to the data index table.
3. The data security deduplication method in a fog computing environment according to claim 2, wherein a user can determine whether the cloud server stores certain data by sending the hash value of the data, and some malicious users may exploit this to probe which data are stored on the cloud server; the specific operation of the user's proof of ownership (POW) of a file comprises:
(1) partition the ciphertext data into blocks b1, b2, ..., bn;
(2) compute the hash values of b1, b2, ..., bn in turn to obtain h1, h2, ..., hn;
(3) concatenate h1 with h2, h3 with h4, and so on, concatenating h(n-1) with hn;
(4) hash each concatenation to obtain hs1, ..., hs(n/2);
(5) then concatenate hs1 with hs2, hs3 with hs4, and so on;
(6) continue hashing the concatenations, repeating the cycle until the final result, the data tag t, is obtained.
4. A data security deduplication system in a fog computing environment for implementing the data security deduplication method in the fog computing environment of claim 1, wherein the data security deduplication system in the fog computing environment comprises:
the client, where the user encrypts the data with MLE;
the fog node mainly executes four operations: detecting the ownership of user data, re-encrypting ciphertext data, updating the ciphertext data and distributing a re-encryption key;
and the cloud server is responsible for data deduplication and data storage of the server side, deletes redundant data and only keeps one part of data at the server side.
CN201910171496.1A 2019-03-07 2019-03-07 Data security duplicate removal system and method in fog computing environment and cloud storage platform Active CN109995505B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910171496.1A CN109995505B (en) 2019-03-07 2019-03-07 Data security duplicate removal system and method in fog computing environment and cloud storage platform


Publications (2)

Publication Number Publication Date
CN109995505A CN109995505A (en) 2019-07-09
CN109995505B true CN109995505B (en) 2021-08-10

Family

ID=67130493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910171496.1A Active CN109995505B (en) 2019-03-07 2019-03-07 Data security duplicate removal system and method in fog computing environment and cloud storage platform

Country Status (1)

Country Link
CN (1) CN109995505B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110311946A (en) * 2019-05-10 2019-10-08 国网浙江省电力有限公司宁波供电公司 Business datum security processing, the apparatus and system calculated based on cloud and mist
CN110618790B (en) * 2019-09-06 2023-04-28 上海电力大学 Mist storage data redundancy elimination method based on repeated data deletion
CN111211903B (en) * 2019-12-02 2021-06-11 中国矿业大学 Mobile group perception data report duplication removing method based on fog calculation and privacy protection
CN111212084B (en) * 2020-01-15 2021-04-23 广西师范大学 Attribute encryption access control method facing edge calculation
CN111902809B (en) * 2020-05-18 2024-01-09 深圳技术大学 Ciphertext searching method, device, equipment and storage medium based on CP-ABE under fog calculation
CN111865909B (en) * 2020-06-08 2021-05-28 西安电子科技大学 SGX side channel attack defense method, system, medium, program and application
CN112087422A (en) * 2020-07-28 2020-12-15 南京航空航天大学 Outsourcing access control method based on attribute encryption in edge calculation
CN112231309B (en) * 2020-10-14 2024-05-07 深圳前海微众银行股份有限公司 Method, device, terminal equipment and medium for removing duplicate of longitudinal federal data statistics
CN112671809B (en) * 2021-03-17 2021-06-15 北京红云融通技术有限公司 Data transmission method, signal source end and receiving end
CN112866299B (en) * 2021-04-12 2022-03-18 南京大学 Encrypted data deduplication and sharing device and method for mobile edge computing network
CN113806071B (en) * 2021-08-10 2022-08-19 中标慧安信息技术股份有限公司 Data synchronization method and system for edge computing application

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182367A (en) * 2017-12-15 2018-06-19 西安电子科技大学 A kind of encrypted data chunk client De-weight method for supporting data update
CN108776758A (en) * 2018-04-13 2018-11-09 西安电子科技大学 The block level data De-weight method of dynamic ownership management is supported in a kind of storage of mist
CN109379182A (en) * 2018-09-04 2019-02-22 西安电子科技大学 Support efficient data re-encryption method and system, the cloud storage system of data deduplication

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9740979B2 (en) * 2015-12-06 2017-08-22 Xeeva, Inc. Model stacks for automatically classifying data records imported from big data and/or other sources, associated systems, and/or methods


Also Published As

Publication number Publication date
CN109995505A (en) 2019-07-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant