CN114143098A - Data storage method and data storage device - Google Patents


Info

Publication number
CN114143098A
Authority
CN
China
Prior art keywords
data block
uploaded
data
client
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111470327.1A
Other languages
Chinese (zh)
Other versions
CN114143098B (en)
Inventor
付钰
徐宁
谢娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd filed Critical CCB Finetech Co Ltd
Priority to CN202111470327.1A priority Critical patent/CN114143098B/en
Publication of CN114143098A publication Critical patent/CN114143098A/en
Application granted granted Critical
Publication of CN114143098B publication Critical patent/CN114143098B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/045Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply hybrid encryption, i.e. combination of symmetric and asymmetric encryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information
    • H04L63/123Applying verification of the received information received data contents, e.g. message integrity
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/06Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Abstract

The application provides a data storage method and a data storage device, which can be used in the field of data storage and are beneficial to data security and data deduplication in data storage. The method comprises the following steps: the client acquires a plurality of data blocks, an identifier and a convergence key of each data block in the plurality of data blocks, and sends the identifier of each data block to the server; after receiving the identification of each data block, the server determines a data block list to be uploaded based on the identification of each data block and sends the data block list to be uploaded to the client; the client determines the data block to be uploaded based on the data block list to be uploaded, encrypts the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded to obtain the ciphertext of the at least one data block to be uploaded, and sends the ciphertext of the at least one data block to be uploaded and the convergence key ciphertext of the at least one data block to be uploaded to the server.

Description

Data storage method and data storage device
Technical Field
The present application relates to the field of data storage, and in particular, to a data storage method and a data storage device.
Background
In the field of data storage, deduplication technology is widely applied to network disks and Content Delivery Networks (CDNs). Without deduplication, the same data has to be stored and transmitted multiple times; with deduplication, the same data is stored and transmitted only once, which greatly reduces the data storage cost and improves the data transmission efficiency.
Current data storage methods include plaintext storage, unified key encryption, and user-defined encryption. With plaintext storage and unified key encryption, operation and maintenance personnel in the data center can view all data and can deduplicate repeated data, but the security is low. With user-defined encryption, different users encrypt data with different keys and therefore obtain different ciphertexts for the same data; this improves security but makes data deduplication more difficult. Therefore, the existing data storage methods cannot achieve both data security and data deduplication.
Disclosure of Invention
The application provides a data storage method and a data storage device, which are beneficial to data safety and data deduplication in data storage.
In a first aspect, the present application provides a data storage method, including: the method comprises the steps that a client side obtains a plurality of data blocks of a first file, an identifier of each data block in the data blocks and a convergence key of each data block; the client sends the identification of each data block to the server; the client receives a to-be-uploaded data block list sent by the server according to the identification of each data block, wherein the to-be-uploaded data block list comprises at least one identification of the to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data blocks stored in the server, and the at least one to-be-uploaded data block is all or part of the data blocks; the client encrypts at least one data block to be uploaded through a convergence key of the at least one data block to be uploaded to obtain a ciphertext of the at least one data block to be uploaded; the client sends at least one cipher text of the data block to be uploaded and at least one convergence key cipher text of the data block to be uploaded to the server, and the convergence key cipher text of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key in a first key pair.
According to this data storage method, the data blocks that are not yet stored in the server are determined from the identifiers of the data blocks, and the list of data blocks to be uploaded is determined accordingly. This achieves data deduplication, which helps improve data transmission efficiency and reduce data storage cost. In addition, the client encrypts the data blocks to be uploaded with their convergence keys and encrypts the convergence keys with the first public key, which helps improve the security of data transmission. The method can therefore achieve both data security and data deduplication, while reducing data storage cost and improving data transmission efficiency.
With reference to the first aspect, in some implementations of the first aspect, the sending, by the client, the identifier of each data block to the server includes: the client constructs a hash tree based on the identifier of each data block and sends information of the hash tree to the server, where the leaf nodes of the hash tree are hash values of the identifier of each data block, and the root node of the hash tree is a hash check value of the leaf nodes; the receiving, by the client, the list of data blocks to be uploaded that is sent by the server according to the identifier of each data block includes: the client receives the list of data blocks to be uploaded that is sent by the server according to the information of the hash tree.
According to this data storage method, the server can perform a self-check on the hash tree information sent by the client to verify the integrity of the data blocks, which helps ensure that nothing received has been omitted or tampered with. At the same time, the server can use the leaf node information of the hash tree to determine which data blocks are not yet stored in the server and thus determine the list of data blocks to be uploaded. This achieves data deduplication, improves data transmission efficiency, and reduces data storage cost.
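The hash tree construction and the server-side self-check can be illustrated with a minimal Python sketch. It is not taken from the application: the choice of SHA-256, the pairwise combination of nodes, the duplication of the last node on odd-sized levels, and the names `build_leaves`, `merkle_root`, and `server_integrity_check` are all assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumed choice of hash function."""
    return hashlib.sha256(data).digest()


def build_leaves(block_ids):
    """Each leaf is the hash of one data block identifier."""
    return [h(block_id.encode("utf-8")) for block_id in block_ids]


def merkle_root(leaves):
    """Combine leaves pairwise until one root remains (assumed layout)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def server_integrity_check(leaves, claimed_root):
    """Server side: recompute the root from the leaves and compare."""
    return merkle_root(leaves) == claimed_root


# Client side: hypothetical identifiers of the data blocks of a first file.
ids = ["id-1", "id-2", "id-3"]
leaves = build_leaves(ids)
root = merkle_root(leaves)

# Server side: the check passes for the full leaf set and fails if one is missing.
ok = server_integrity_check(leaves, root)
tampered = server_integrity_check(leaves[:-1], root)
```

In this sketch the server proceeds to the comparison against stored identifiers only when `ok` is true, mirroring the "under the condition of passing the integrity check" wording of the second aspect.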
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the client signs a first tuple through a first private key in a first key pair to obtain a signed first tuple, wherein the first tuple comprises the information of the root node, the file name of a first file and the version number of the first file; and the client sends the signed first tuple to the server.
According to the data storage method, the client signs the first tuple through the first private key, so that the follow-up server can verify the integrity of the data again according to the information of the first tuple and the information of the hash tree, and the safety of the data is further improved.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: a client sends a request for downloading a first file to a server; the client receives ciphertext of a plurality of data blocks and convergence key ciphertext of the data blocks from the server; the client decrypts the convergence key ciphertexts of the data blocks respectively through the first private key to obtain the convergence keys of the data blocks; and the client decrypts the ciphertexts of the data blocks through the convergence keys of the data blocks to obtain the data blocks.
According to the data storage method, the client downloads the first file stored by the server, and the plurality of data blocks of the first file are obtained through two decryption modes, so that the data transmission safety is improved.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: the client receives the signed first tuple from the server; and the client side carries out integrity verification on the signed first tuple by using the first public key.
According to the data storage method, integrity verification is carried out on the signed first tuple by using the first public key, so that the integrity of data is guaranteed, and the safety of data transmission is further improved.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes: a client sends a request for sharing a first file with another client to a server; the client receives a second public key in a second key pair corresponding to the other client from the server and a convergence key ciphertext of the plurality of data blocks; the client decrypts the convergence key ciphertexts of the data blocks respectively through the first private key to obtain the convergence keys of the data blocks; the client encrypts the convergence keys of the data blocks respectively through the second public key to obtain new convergence key ciphertexts of the data blocks; the client sends the new convergence key cryptographs of the data blocks to the server.
According to this data storage method, when a client shares a file with another client, the client first obtains the second public key of the other client through the server and encrypts the convergence keys of the data blocks with the second public key to obtain new convergence key ciphertexts of the data blocks. The other client obtains the new convergence key ciphertexts and the ciphertexts of the data blocks through the server, decrypts the new convergence key ciphertexts with the second private key of its own second key pair to recover the convergence keys, and then uses the convergence keys to decrypt the ciphertexts of the data blocks and obtain the data blocks. In this way, file sharing between the client and the other client is achieved without exposing the private key of the client, which improves the security of data transmission.
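The re-encryption step used for sharing can be sketched as follows. This is a toy illustration only: it uses unpadded "textbook" RSA built from well-known Mersenne primes so that it runs with the Python standard library, which is not secure and is not the algorithm specified by the application. The helper names (`toy_keypair`, `rsa_encrypt`, `rsa_decrypt`) and the derivation of the convergence key with SHA-256 are assumptions.

```python
import hashlib

E = 65537  # common public exponent


def toy_keypair(p: int, q: int):
    """Toy RSA key pair from two primes. Illustration only, NOT secure."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def rsa_encrypt(public_key, key_bytes: bytes) -> int:
    """Encrypt a short byte string (here a 32-byte convergence key)."""
    n, e = public_key
    return pow(int.from_bytes(key_bytes, "big"), e, n)


def rsa_decrypt(private_key, ciphertext: int, length: int = 32) -> bytes:
    """Decrypt back to a byte string of the given length."""
    n, d = private_key
    return pow(ciphertext, d, n).to_bytes(length, "big")


# First key pair (sharing client) and second key pair (other client),
# built from Mersenne primes purely so the sketch is self-contained.
first_pub, first_priv = toy_keypair(2**521 - 1, 2**607 - 1)
second_pub, second_priv = toy_keypair(2**1279 - 1, 2**2203 - 1)

# Convergence key of one data block (assumed here to be SHA-256 of the block).
conv_key = hashlib.sha256(b"some data block").digest()

# Stored on the server at upload time: key ciphertext under the first public key.
stored_key_ct = rsa_encrypt(first_pub, conv_key)

# Sharing: decrypt with the first private key, re-encrypt with the second public key.
recovered = rsa_decrypt(first_priv, stored_key_ct)
new_key_ct = rsa_encrypt(second_pub, recovered)

# The other client recovers the convergence key with its second private key.
peer_key = rsa_decrypt(second_priv, new_key_ct)
```

Neither private key leaves its owner in this flow; only public keys and key ciphertexts pass through the server.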
In a second aspect, the present application provides a data storage method, including: the server receives the identification of each data block in a plurality of data blocks from the client; the server compares the identifications of the data blocks with the identifications of the data blocks stored in the server respectively to determine a list of the data blocks to be uploaded, wherein the list of the data blocks to be uploaded comprises at least one identification of the data block to be uploaded, the at least one data block to be uploaded is different from the data blocks stored in the server, and the at least one data block to be uploaded is all or part of the data blocks; the server sends a list of data blocks to be uploaded to the client; the server receives at least one ciphertext of the data block to be uploaded and at least one convergence key ciphertext of the data block to be uploaded from the client, wherein the at least one ciphertext of the data block to be uploaded is obtained by encrypting the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded, and the at least one convergence key ciphertext of the data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key of a first key pair.
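The server-side comparison that produces the list of data blocks to be uploaded can be sketched in Python as below. The use of a SHA-256 digest of the block as its identifier, the in-memory set standing in for the server's index, and the function names are assumptions for illustration; the application does not fix them in this passage.

```python
import hashlib


def block_id(block: bytes) -> str:
    """Assumed identifier: hex SHA-256 digest of the data block."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_upload(client_ids, stored_ids):
    """Return identifiers not yet stored on the server, in order, without repeats."""
    seen = set()
    to_upload = []
    for identifier in client_ids:
        if identifier not in stored_ids and identifier not in seen:
            seen.add(identifier)
            to_upload.append(identifier)
    return to_upload


# The server already stores block A (hypothetical content).
server_store = {block_id(b"block-A")}

# The client's first file consists of four blocks, one of them repeated.
file_blocks = [b"block-A", b"block-B", b"block-C", b"block-B"]
ids = [block_id(b) for b in file_blocks]

# Only B and C need to be uploaded; A is deduplicated against the server,
# and the second copy of B is deduplicated within the file itself.
upload_list = blocks_to_upload(ids, server_store)
```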
With reference to the second aspect, in some implementations of the second aspect, the receiving, by the server, an identification of each of a plurality of data chunks from the client includes: the server receives information of a hash tree from the client, wherein the hash tree is constructed based on the identifiers of a plurality of data blocks of a first file, a leaf node of the hash tree is a hash value of the identifier of each data block in the plurality of data blocks, and a root node of the hash tree is a hash check value of the leaf node; after the server receives the identification of each of the plurality of data chunks from the client, the method further comprises: the server carries out integrity check by utilizing the information of the root node and the information of the leaf nodes; the server compares the identifications of the data blocks with the identifications of the data blocks already stored in the server respectively, and determines a list of the data blocks to be uploaded, and the method comprises the following steps: and under the condition of passing the integrity check, the server compares the identifications of the data blocks with the identifications of the data blocks already stored in the server respectively to determine a list of the data blocks to be uploaded.
With reference to the second aspect, in certain implementations of the second aspect, the method further includes: the server receives a signed first tuple from the client, wherein the first tuple comprises root node information, a file name of a first file and a version number of the first file, and the signed first tuple is obtained by signing the first tuple through a first private key in a first key pair; and the server carries out integrity verification on the signed first tuple again by using the first public key.
With reference to the second aspect, in certain implementations of the second aspect, the method includes: the server receives a request for downloading a first file from a client; the server sends the ciphertext of the data blocks and the convergence key ciphertext of the data blocks to the client based on the request for downloading the first file.
With reference to the second aspect, in some implementations of the second aspect, the method further includes: the server receives a request from a client to share a first file with another client; the server sends a second public key in a second key pair corresponding to the other client and a convergence key ciphertext of the plurality of data blocks to the client based on a request for sharing the first file with the other client; the server receives new convergence key ciphertexts of the data blocks from the client, wherein the new convergence key ciphertexts of the data blocks are obtained by encrypting the convergence keys of the data blocks through a second public key; the server sends the new convergence key ciphertext for the plurality of data blocks and the ciphertext for the plurality of data blocks to another client.
In a third aspect, the present application provides a data storage device comprising: a processing module and a transceiver module. The processing module is configured to: acquire a plurality of data blocks of a first file, an identifier of each data block in the plurality of data blocks, and a convergence key of each data block; and construct a hash tree based on the identifier of each data block. The transceiver module is configured to: send information of the hash tree to the server, where the leaf nodes of the hash tree are hash values of the identifier of each data block, and the root node of the hash tree is a hash check value of the leaf nodes; and receive a list of data blocks to be uploaded that is sent by the server according to the information of the hash tree, where the list of data blocks to be uploaded includes the identifier of at least one data block to be uploaded, the at least one data block to be uploaded is different from the data blocks already stored in the server, and the at least one data block to be uploaded is all or part of the plurality of data blocks. The processing module is further configured to: encrypt the at least one data block to be uploaded with the convergence key of the at least one data block to be uploaded to obtain a ciphertext of the at least one data block to be uploaded; and sign a first tuple with a first private key in a first key pair to obtain a signed first tuple, where the first tuple includes information of the root node, a file name of the first file, and a version number of the first file. The transceiver module is further configured to: send the signed first tuple, the ciphertext of the at least one data block to be uploaded, and the convergence key ciphertext of the at least one data block to be uploaded to the server, where the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded with a first public key in the first key pair.
With reference to the third aspect, in some implementations of the third aspect, the transceiver module is further configured to: sending a request for downloading a first file to a server; receiving ciphertext of a plurality of data blocks and convergence key ciphertext of the plurality of data blocks from a server; the processing module is further configured to: decrypting the convergence key ciphertexts of the data blocks respectively through the first private key to obtain convergence keys of the data blocks; and decrypting the ciphertexts of the data blocks by the convergence keys of the data blocks to obtain the data blocks.
With reference to the third aspect, in some implementations of the third aspect, the transceiver module is configured to: receiving a signed first tuple from a server; and carrying out integrity verification on the signed first tuple by utilizing the first public key.
With reference to the third aspect, in some implementations of the third aspect, the transceiver module is further configured to: sending a request for sharing a first file with another client to a server; receiving a second public key in a second key pair corresponding to another client from the server and a convergence key ciphertext of the plurality of data blocks; the processing module is used for: decrypting the convergence key ciphertexts of the data blocks respectively through the first private key to obtain convergence keys of the data blocks; encrypting the convergence keys of the data blocks respectively through the second public key to obtain new convergence key ciphertexts of the data blocks; the transceiver module is further configured to: and sending the new convergence key cryptographs of the plurality of data blocks to the server.
In a fourth aspect, the present application provides a data storage device comprising: a transceiver module and a processing module. The transceiver module is configured to receive information of a hash tree from a client, where the hash tree is constructed based on the identifiers of a plurality of data blocks of a first file, a leaf node of the hash tree is a hash value of the identifier of each data block in the plurality of data blocks, and a root node of the hash tree is a hash check value of the leaf nodes. The processing module is configured to: perform an integrity check using the information of the root node and the information of the leaf nodes; and, if the integrity check passes, compare the identifiers of the plurality of data blocks with the identifiers of the data blocks already stored in the server to determine a list of data blocks to be uploaded, where the list of data blocks to be uploaded includes the identifier of at least one data block to be uploaded, the at least one data block to be uploaded is different from the data blocks already stored in the server, and the at least one data block to be uploaded is all or part of the plurality of data blocks. The transceiver module is further configured to: send the list of data blocks to be uploaded to the client; and receive, from the client, a signed first tuple, a ciphertext of the at least one data block to be uploaded, and a convergence key ciphertext of the at least one data block to be uploaded, where the first tuple includes the root node information, a file name of the first file, and a version number of the first file, the signed first tuple is obtained by signing the first tuple with a first private key of a first key pair, the ciphertext of the at least one data block to be uploaded is obtained by encrypting the at least one data block to be uploaded with the convergence key of the at least one data block to be uploaded, and the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded with a first public key of the first key pair.
With reference to the fourth aspect, in some implementations of the fourth aspect, the transceiver module is configured to: receiving a request from a client to download a first file; and sending the ciphertext of the plurality of data blocks and the convergence key ciphertext of the plurality of data blocks to the client based on the request for downloading the first file.
With reference to the fourth aspect, in some implementations of the fourth aspect, the transceiver module is configured to: receiving a request from a client to share a first file with another client; based on a request for sharing the first file with another client, sending a second public key in a second key pair corresponding to the other client and a convergence key ciphertext of the plurality of data blocks to the client; receiving new convergence key ciphertexts of the data blocks from the client, wherein the new convergence key ciphertexts of the data blocks are obtained by encrypting the convergence keys of the data blocks through a second public key; and sending the new convergence key ciphertext of the plurality of data blocks and the ciphertext of the plurality of data blocks to another client.
In a fifth aspect, the present application provides a data storage device comprising a processor and a memory. The processor is configured to read instructions stored in the memory to perform a method according to any one of the possible implementations of any one of the above aspects.
Optionally, there are one or more processors and one or more memories.
Alternatively, the memory may be integrated with the processor, or provided separately from the processor.
In a specific implementation, the memory may be a non-transitory memory, such as a Read Only Memory (ROM), which may be integrated on the same chip as the processor or disposed separately on a different chip.
The data storage device in the above fifth aspect may be a chip, and the processor may be implemented by hardware or software, and when implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented in software, the processor may be a general-purpose processor implemented by reading software code stored in a memory, which may be integrated with the processor, located external to the processor, or stand-alone.
In a sixth aspect, the present application provides a computer-readable medium storing a computer program (which may also be referred to as code, or instructions) which, when executed on a computer, causes the computer to perform the method of any of the possible implementations of any of the above aspects.
In a seventh aspect, the present application provides a computer program product comprising: computer program (also called code, or instructions), which when executed, causes a computer to perform the method of any of the possible implementations of any of the above aspects.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a communication system to which embodiments of the present application are applicable;
FIG. 2 is a schematic flow chart of a data storage method provided by an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of another data storage method provided by an embodiment of the present application;
FIG. 4 is a schematic flow chart diagram of another data storage method provided by an embodiment of the present application;
FIG. 5 is a schematic block diagram of a data storage device provided by an embodiment of the present application;
FIG. 6 is a schematic block diagram of another data storage device provided by an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
The technical solution in the present application will be described below with reference to the accompanying drawings.
For the convenience of understanding the embodiments of the present application, the related terms in the embodiments of the present application will be described first.
1. Plaintext and ciphertext
Plaintext is the text before encryption. Ciphertext is the text after encryption.
The relationship between ciphertext and plaintext may be: the ciphertext is the message obtained by encrypting the plaintext.
2. Key
A key is a parameter that is input to an algorithm that converts plaintext into ciphertext or converts ciphertext into plaintext.
Keys can be classified as symmetric keys and asymmetric keys. With a symmetric key, the same key is used for encryption and decryption. With asymmetric keys, different keys are used for encryption and decryption.
3. Symmetric key encryption
Symmetric key encryption, also called private key encryption, means that the sender and the receiver of the data must use the same key to encrypt the plaintext and decrypt the ciphertext.
4. Asymmetric key encryption
Asymmetric key encryption, also known as public key encryption, means that the sender and the receiver of the data use different keys to encrypt the plaintext and decrypt the ciphertext.
In asymmetric key encryption, a key used for encryption may be referred to as an asymmetric encryption public key, and a key used for decryption may be referred to as an asymmetric encryption private key.
5. Convergent encryption
Convergent encryption is an encryption method in which the key is generated from the data content. A key generated from data content may be referred to as a convergence key.
In convergent encryption, the convergence key is determined by the data: the convergence keys generated from identical data are always the same.
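The property that identical data yields identical convergence keys, and therefore identical ciphertexts that a server can deduplicate, can be illustrated with a short Python sketch. Deriving the key as the SHA-256 digest of the block is an assumption, and `keystream_xor` is a toy deterministic cipher used only so that the example runs with the standard library; a real system would use a vetted symmetric cipher.

```python
import hashlib


def convergence_key(block: bytes) -> bytes:
    """Assumed derivation: the key is the SHA-256 digest of the block content."""
    return hashlib.sha256(block).digest()


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy deterministic stream cipher (stand-in only, NOT for production).

    XORs the data with a keystream of SHA-256(key || counter) blocks.
    Applying it twice with the same key restores the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Two different users hold the same data block.
block = b"the same block uploaded by two different users"
key_user1 = convergence_key(block)
key_user2 = convergence_key(block)

# Same content -> same key -> same ciphertext, so the server can deduplicate
# the ciphertexts without ever seeing the plaintext.
ct_user1 = keystream_xor(key_user1, block)
ct_user2 = keystream_xor(key_user2, block)

# Decryption with the convergence key recovers the block.
decrypted = keystream_xor(key_user1, ct_user1)
```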
6. Hash tree
A hash tree, also commonly referred to as a Merkle tree, is a tree data structure used in cryptography and computer science in which each leaf node is labeled with the hash of a data block and each non-leaf node is labeled with the cryptographic hash of the labels of its child nodes.
The hash tree can efficiently and safely verify the content of a large data structure.
7. Secure hash algorithm 256
Secure hash algorithm 256 (SHA-256) is a hash function, also known as a hash algorithm, which is a method of creating a small digital "fingerprint" from any kind of data.
A hash function compresses a message or data into a fixed-format digest, so that the amount of data is reduced, and scrambles the data to produce a fingerprint called a hash value. The hash value is typically represented as a short string of seemingly random letters and digits.
For a message of any length, SHA-256 generates a 256-bit hash value called a message digest, which is equivalent to a 32-byte array and is usually represented as a string of 64 hexadecimal characters.
8. Secure hash algorithm 1 (SHA-1)
SHA-1 is a cryptographic hash function that generates a 160-bit (20-byte) hash value called a message digest, typically in the form of 40 hexadecimal numbers.
With SHA-1, different input information is expected to produce different output message digests.
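The digest sizes stated above for SHA-256 and SHA-1 can be checked with Python's standard `hashlib` module. The message content below is an arbitrary placeholder.

```python
import hashlib

message = b"data block"  # arbitrary example input

# SHA-256: 256-bit digest = 32 bytes = 64 hexadecimal characters.
sha256_hex = hashlib.sha256(message).hexdigest()
sha256_bytes = hashlib.sha256(message).digest_size

# SHA-1: 160-bit digest = 20 bytes = 40 hexadecimal characters.
sha1_hex = hashlib.sha1(message).hexdigest()
sha1_bytes = hashlib.sha1(message).digest_size

# The digest length does not depend on the length of the input.
long_hex = hashlib.sha256(message * 10_000).hexdigest()
```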
In the field of data storage, deduplication is a very critical technology. If the deduplication technology is not used in the data storage system, the same data needs to be stored for multiple times and transmitted for multiple times. If the data storage system uses the duplicate removal technology, the same data can be stored and transmitted once, and the data storage cost and the data transmission efficiency can be greatly reduced. Therefore, the deduplication technology is widely applied to a network disk and a Content Delivery Network (CDN).
Current data storage methods may include plaintext storage, unified key encryption, and user-defined encryption. In the plaintext and unified key encryption modes, operation and maintenance personnel in the data center can view all data and can deduplicate repeated data, but security is low, which does not meet the requirements of users who are highly sensitive to data security, such as enterprise users and financial users. In the user-defined encryption mode, different users encrypt data with different keys and obtain different ciphertexts, which improves security but makes deduplication more difficult because the ciphertexts differ. Therefore, existing data storage modes cannot achieve both data security and data deduplication.
In addition, in the user-defined encryption mode, a user who shares data must give the other party his or her own key so that the other party can obtain the shared data with that key. In other words, sharing can only be achieved by exposing the user's key, so security is low.
In view of this, the data storage method and the data storage device provided in the embodiments of the present application help achieve both data security and data deduplication when data is stored.
For the convenience of understanding the embodiments of the present application, a communication system to which the embodiments of the present application are applicable will be described first.
Fig. 1 is a schematic diagram of a communication system 100 according to an embodiment of the present disclosure, and as shown in fig. 1, the communication system 100 includes a client 101, a client 102, and a server 103. The number of the clients and the servers in the communication system 100 is only an example, and the number of the clients and the servers is not limited in the embodiment of the present application.
Both client 101 and client 102 may send data to server 103, and server 103 may send data to at least one of client 101 and client 102.
The client 101 and/or the client 102 may send data to the server 103 by using the method provided in the embodiment of the present application, and after receiving the data, the server 103 may store and/or send the data by using the method provided in the embodiment of the present application, so that data security and data deduplication may be both considered during data storage.
Before describing the data storage method and the data storage device provided by the embodiments of the present application, the following description is made:
First, the terms "first" and "second" and the various numerical labels in the embodiments shown below are used merely for convenience of description, for example to distinguish between different key pairs or between different public keys, and are not intended to limit the scope of the embodiments of the present application.
Second, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and covers three relationships; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, and c may represent: a, or b, or c, or a and b, or a and c, or b and c, or a, b and c, where a, b and c can be single or multiple.
Fig. 2 is a schematic flowchart of a data storage method 200 according to an embodiment of the present application, which can be applied to the communication system 100. The method 200 may include the steps of:
S201, the client acquires a plurality of data blocks of the first file, an identifier of each data block in the data blocks and a convergence key of each data block.
The client may be the client 101 or the client 102 in the communication system 100 described above.
The client may divide the data in the first file into a plurality of data blocks in a preset fixed size. For example, the preset fixed size may be 512 Kilobytes (KB) or 2 Megabytes (MB). The embodiment of the present application does not limit the specific value of the preset fixed size.
The plurality of data blocks may include one data block, two data blocks, or more than two data blocks, and the number of the data blocks is not limited in the embodiment of the present application.
A data block may be denoted by the symbol block.
If the number of data blocks is n, the data blocks can be expressed as [block_1, block_2, …, block_i, …, block_n], where block_i denotes the i-th data block and i is an integer greater than or equal to 1 and less than or equal to n.
The identification of the data blocks is used to distinguish between different data blocks and may be indicated by the symbol Tag.
The identifiers of the multiple data blocks may be denoted as [Tag_1, Tag_2, …, Tag_i, …, Tag_n], where Tag_i denotes the identifier of the i-th data block.
Alternatively, the identification of the data block may be generated from the data block by SHA-256.
The convergence key for each data block may be generated by the client based on the data content of each data block, and may be represented by the symbol CK.
The convergence keys of the multiple data blocks may be denoted as [CK_1, CK_2, …, CK_i, …, CK_n], where CK_i denotes the convergence key of the i-th data block.
Alternatively, the convergence key for each data block may be generated from the data block by SHA-1.
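Step S201 can be sketched as follows, under the optional choices mentioned above: fixed-size blocks, SHA-256 for the identifiers Tag_i, and SHA-1 for the convergence keys CK_i. The function names `split_blocks` and `prepare_file` are hypothetical, and a small block size is used in the example only to keep it short.

```python
import hashlib
from typing import List, Tuple

BLOCK_SIZE = 512 * 1024  # 512 KB, one of the example sizes given in the text


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Divide file content into blocks of a preset fixed size."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def prepare_file(data: bytes, block_size: int = BLOCK_SIZE) -> Tuple[List[bytes], List[str], List[bytes]]:
    """Return ([block_i], [Tag_i], [CK_i]) for one file."""
    blocks = split_blocks(data, block_size)
    tags = [hashlib.sha256(block).hexdigest() for block in blocks]  # Tag_i
    keys = [hashlib.sha1(block).digest() for block in blocks]       # CK_i
    return blocks, tags, keys


# Example with a 1 KB block size: blocks 1 and 3 hold identical content.
data = b"A" * 1024 + b"B" * 1024 + b"A" * 1024 + b"tail"
blocks, tags, keys = prepare_file(data, block_size=1024)
```

Because both the identifier and the convergence key depend only on block content, the first and third blocks in the example receive the same Tag and the same CK, which is what makes deduplication possible.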
S202, the client sends the identification of each data block to the server, and correspondingly, the server receives the identification of each data block.
S203, the server compares the identifications of the data blocks with the identifications of the data blocks stored in the server respectively, and determines a list of the data blocks to be uploaded, wherein the list of the data blocks to be uploaded comprises at least one identification of the data block to be uploaded, the at least one data block to be uploaded is different from the data blocks stored in the server, and the at least one data block to be uploaded is all or part of the data blocks.
The data blocks stored in the server may be uploaded by the client or by other clients, and the source of the data blocks stored in the server is not limited in the embodiment of the present application.
Alternatively, the sizes of the data blocks already stored in the server may all be the same as the preset fixed size described above. Or the size of the data block stored in the server is partially the same as the preset fixed size, and partially different from the preset fixed size.
When the sizes of the data blocks already stored in the server are all the same as the preset fixed size, the server may compare the identifiers of the plurality of data blocks with the identifiers of the data blocks already stored in the server, and add the identifier of the data block different from the identifier of the data block already stored in the server to the list of the data block to be uploaded.
When the size of the data block stored in the server is partially the same as the preset fixed size and partially different from the preset fixed size, the server may compare the identifiers of the plurality of data blocks with the identifiers of the data blocks stored in the server and having the same size as the preset fixed size, and add the identifiers of the different data blocks to the list of the data blocks to be uploaded.
In other words, in the case where the data blocks already stored in the server adopt the same dividing rule and the same identification rule as the plurality of data blocks of the first file (for example, when the data blocks are divided, the data blocks are guaranteed to have the same size, the identifications of the data blocks corresponding to the same data are the same, and the like), the server may perform comparison by using the above identifications, thereby determining the data blocks which are not stored by the server.
In this embodiment of the present application, the number of the data blocks to be uploaded may be one, or may be two or more, and this is not limited in this embodiment of the present application.
It should be understood that, when a plurality of data blocks of the first file have been stored in the server, that is, all data blocks in the first file have been stored in the server, the number of data blocks to be uploaded is 0, and the client does not need to upload the data blocks to the server.
It should also be understood that when the number of the plurality of data blocks is the same as the number of the data blocks to be uploaded, the client needs to send all the data blocks in the first file to the server, that is, any data block in the first file is not stored in the server.
The identifier of the at least one to-be-uploaded data block may be sent in the form of the to-be-uploaded data block list, or may be sent in other forms, for example in the form of a message, which is not limited in this embodiment of the present application.
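The comparison in step S203 can be sketched as a set lookup on the server. This is an illustration only: the function name `blocks_to_upload` is hypothetical, the stored identifiers are modeled as an in-memory set, and the removal of duplicates within a single request is an added assumption that the text does not state.

```python
from typing import Iterable, List, Set


def blocks_to_upload(client_tags: Iterable[str], stored_tags: Set[str]) -> List[str]:
    """Return identifiers of blocks that the server has not stored yet."""
    to_upload: List[str] = []
    seen: Set[str] = set()
    for tag in client_tags:
        # Assumption: a tag repeated inside one file is requested only once.
        if tag not in stored_tags and tag not in seen:
            to_upload.append(tag)
            seen.add(tag)
    return to_upload


stored = {"t1", "t3"}
upload_list = blocks_to_upload(["t1", "t2", "t3", "t4", "t2"], stored)
nothing_to_upload = blocks_to_upload(["t1", "t3"], stored)   # every block already stored
everything_to_upload = blocks_to_upload(["t5", "t6"], stored)  # no block stored yet
```

The last two calls correspond to the two boundary cases described above: an empty list when every block is already stored, and the full list when none is.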
S204, the server sends a list of the data blocks to be uploaded to the client, and correspondingly, the client receives the list of the data blocks to be uploaded sent by the server according to the identification of each data block.
The client receives the list of the data blocks to be uploaded, and can determine the data blocks to be uploaded according to the identification of at least one data block to be uploaded in the list of the data blocks to be uploaded.
S205, the client encrypts the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded to obtain the ciphertext of the at least one data block to be uploaded.
The ciphertext of the block of data to be uploaded may be represented by symbol C.
The ciphertexts of multiple data blocks to be uploaded may be denoted as [C_1, C_2, …, C_i, …, C_n], where C_i denotes the ciphertext of the i-th data block to be uploaded.
The client can use the convergence key CK_i of the i-th data block to be uploaded to encrypt the i-th data block to be uploaded, block_i, and obtain the ciphertext C_i of the i-th data block to be uploaded.
It should be understood that the convergence keys of different data blocks to be uploaded are different, and the client encrypts the data block to be uploaded through the convergence key of the data block to be uploaded to obtain a ciphertext of the data block to be uploaded.
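The following sketch illustrates why convergent encryption in step S205 remains compatible with deduplication: identical blocks yield identical keys and therefore identical ciphertexts. The text does not name a symmetric cipher, so the example substitutes a toy keystream built from SHA-256 in counter mode. That cipher is an assumption made only to keep the example within the standard library and is not suitable for real use; the names `keystream_xor` and `convergent_encrypt` are hypothetical.

```python
import hashlib
from typing import Tuple


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with SHA-256(key || counter)."""
    out = bytearray()
    for counter in range((len(data) + 31) // 32):
        pad = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        chunk = data[counter * 32:(counter + 1) * 32]
        out.extend(x ^ y for x, y in zip(chunk, pad))
    return bytes(out)


def convergent_encrypt(block: bytes) -> Tuple[bytes, bytes]:
    """Return (CK_i, C_i): the key is derived from the block content itself."""
    ck = hashlib.sha1(block).digest()
    return ck, keystream_xor(ck, block)


block = b"same plaintext block" * 10
ck1, c1 = convergent_encrypt(block)
ck2, c2 = convergent_encrypt(bytes(block))       # a second user with the same data
ck3, c3 = convergent_encrypt(b"different block")
recovered = keystream_xor(ck1, c1)               # XOR with the same keystream decrypts
```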
S206, the client sends the ciphertext of the at least one data block to be uploaded and the convergence key ciphertext of the at least one data block to be uploaded to the server, and the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through the first public key of the first key pair.
The first key pair is automatically generated by the client upon user registration. The first private key may be denoted by the symbol AKprv.
Alternatively, the first key pair may be asymmetric keys. The first private key may be an asymmetric encryption private key.
When the first key pair is an asymmetric key, the first public key may be an asymmetric cryptographic public key.
The first public key may be denoted by the symbol AKpub. The convergence key ciphertext of the data block to be uploaded may be represented by the symbol ACK.
The convergence key ciphertexts of multiple data blocks to be uploaded may be denoted as [ACK_1, ACK_2, …, ACK_i, …, ACK_n], where ACK_i denotes the convergence key ciphertext of the i-th data block to be uploaded.
The client side can encrypt the convergence secret key of the at least one data block to be uploaded through the first public key to obtain the convergence secret key ciphertext of the at least one data block to be uploaded.
For example, the client may use the first public key AKpub to encrypt the convergence key CK_i of the i-th data block to be uploaded and obtain the convergence key ciphertext ACK_i of the i-th data block to be uploaded.
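The key-wrapping step can be sketched with textbook RSA, assuming that the first key pair is an RSA key pair, which the text does not specify. The parameters below (two well-known Mersenne primes and no padding) are chosen only so that the example runs with the standard library. They are not secure, and the names `wrap_key` and `unwrap_key` are hypothetical.

```python
import hashlib
from typing import Tuple

# Toy first key pair (illustration only, NOT secure): textbook RSA without padding.
P, Q = 2**127 - 1, 2**89 - 1           # two Mersenne primes
N = P * Q                              # modulus of about 216 bits, larger than a 160-bit CK
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # modular inverse, requires Python 3.8+

AKpub: Tuple[int, int] = (E, N)        # first public key
AKprv: Tuple[int, int] = (D, N)        # first private key


def wrap_key(ck: bytes, public_key: Tuple[int, int]) -> int:
    """Encrypt a convergence key CK_i into ACK_i with the public key."""
    e, n = public_key
    m = int.from_bytes(ck, "big")
    if m >= n:
        raise ValueError("key does not fit under the toy modulus")
    return pow(m, e, n)


def unwrap_key(ack: int, private_key: Tuple[int, int], length: int = 20) -> bytes:
    """Decrypt ACK_i back into the convergence key with the private key."""
    d, n = private_key
    return pow(ack, d, n).to_bytes(length, "big")


ck = hashlib.sha1(b"block-1").digest()
ack = wrap_key(ck, AKpub)
restored = unwrap_key(ack, AKprv)
```

Only the holder of AKprv can recover CK_i from ACK_i, so the server can store ACK_i next to the block ciphertext without learning the convergence key.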
According to the data storage method, the data blocks not yet stored in the server are determined from the identifiers of the data blocks, and the list of data blocks to be uploaded is determined accordingly. This achieves data deduplication, which helps improve data transmission efficiency and reduce data storage cost. In addition, the client encrypts the data blocks to be uploaded with the convergence keys and encrypts the convergence keys with the first public key, which helps improve the security of data transmission. Therefore, the method achieves both data security and data deduplication, and helps reduce data storage cost and improve data transmission efficiency.
As an optional embodiment, in step S202, the sending, by the client, of the identifier of each data block to the server includes: the client constructs a hash tree based on the identifier of each data block and sends information of the hash tree to the server, where the leaf nodes of the hash tree are hash values of the identifiers of the data blocks and the root node of the hash tree is a hash check value of the leaf nodes. Correspondingly, the receiving, by the server, of the identifier of each data block from the client includes: the server receives the information of the hash tree from the client and performs an integrity check by using the information of the root node and the information of the leaf nodes; when the integrity check passes, the server compares the identifiers of the data blocks with the identifiers of the data blocks stored in the server to determine the list of data blocks to be uploaded. In step S204, the receiving, by the client, of the list of data blocks to be uploaded that the server sends according to the identifier of each data block includes: the client receives the list of data blocks to be uploaded that the server sends according to the information of the hash tree.
The client may construct a hash tree using the hash value of the identifier of each data block as a leaf node and the hash check value of the leaf node as a root node.
The server may be the server 103 in the communication system 100 described above.
The information of the hash tree includes information of a root node and information of leaf nodes of the hash tree.
Using the information of the root node and the information of the leaf nodes, the server may rely on the self-checking property of the hash tree to determine whether the identifiers of the plurality of data blocks are complete.
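The self-check can be sketched as follows. The text does not define how the "hash check value" of the leaf nodes is computed, so the example assumes that the root is the SHA-256 hash of the concatenated leaf values; the function names and the dictionary layout of the hash tree information are likewise hypothetical.

```python
import hashlib
from typing import Dict, List, Union

TreeInfo = Dict[str, Union[str, List[str]]]


def build_hash_tree(tags: List[str]) -> TreeInfo:
    """Client side: leaves are hashes of the block identifiers, root checks the leaves."""
    leaves = [hashlib.sha256(tag.encode("utf-8")).hexdigest() for tag in tags]
    # Assumption: the root is the hash of all leaf values concatenated in order.
    root = hashlib.sha256("".join(leaves).encode("utf-8")).hexdigest()
    return {"root": root, "leaves": leaves}


def check_integrity(tree: TreeInfo) -> bool:
    """Server side: recompute the root from the received leaves and compare."""
    expected = hashlib.sha256("".join(tree["leaves"]).encode("utf-8")).hexdigest()
    return expected == tree["root"]


tree = build_hash_tree(["tag-1", "tag-2", "tag-3"])
# Simulate a transmission in which one leaf was lost.
truncated = {"root": tree["root"], "leaves": tree["leaves"][:2]}
ok = check_integrity(tree)
damaged = check_integrity(truncated)
```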
The signed first tuple may include information of the signed root node, a file name of the signed first file, and a version number of the signed first file.
The version number of the first file may be, for example, 3.8.1.6102.
The signed first tuple may be represented by the symbol Sign.
After the server receives the signed first tuple, the ciphertext of the at least one data block to be uploaded, and the convergence key ciphertext of the at least one data block to be uploaded, the server can verify the signature of the signed first tuple with the first public key to obtain the root node information of the hash tree, compare that root node information with the hash tree, and check the integrity of the data again. If the integrity check passes, the server may save the ciphertext of the at least one data block to be uploaded to the storage medium (that is, persist it to disk); in other words, the server saves the deduplicated first file. The server may also generate descriptive data describing the first file, namely the signed first tuple and the convergence key ciphertext of the data block to be uploaded.
According to the data storage method, the client signs the first tuple through the first private key, so that the follow-up server can verify the integrity of the data again according to the information of the first tuple and the information of the hash tree, and the safety of the data is further improved.
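Signing and verifying the first tuple can be sketched with a textbook RSA signature. The text does not name a signature algorithm, so the scheme, the toy key parameters, the "|" separator used to serialize the tuple, and the function names are all assumptions made for illustration; the parameters are not secure.

```python
import hashlib

# Toy first key pair (illustration only, NOT secure): textbook RSA.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent, part of AKpub
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, part of AKprv (Python 3.8+)


def encode_tuple(root: str, file_name: str, version: str) -> int:
    """Hash the first tuple <root, file name, version> into an integer below N."""
    payload = "|".join((root, file_name, version)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def sign_tuple(root: str, file_name: str, version: str) -> int:
    """Client side: sign the first tuple with the first private key."""
    return pow(encode_tuple(root, file_name, version), D, N)


def verify_tuple(signature: int, root: str, file_name: str, version: str) -> bool:
    """Server side: verify the signature with the first public key."""
    return pow(signature, E, N) == encode_tuple(root, file_name, version)


sign = sign_tuple("root-hash", "report.docx", "3.8.1.6102")
valid = verify_tuple(sign, "root-hash", "report.docx", "3.8.1.6102")
forged = verify_tuple(sign, "other-root", "report.docx", "3.8.1.6102")
```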
Optionally, the server may verify the signature of the signed first tuple with the first public key to obtain the root node information of the hash tree. The first public key can be sent to the server by the client.
The client may automatically generate the first public key and the first private key of the first key pair when the user registers. The client may send the first public key and the first private key to the server, and after receiving the first public key and the first private key, the server may store the first public key and the first private key.
When the server receives the signed first tuple, the server can verify its signature with the first public key to obtain the first tuple.
Optionally, the client may not store the first public key and the first private key, or may store the first public key and the first private key, which is not limited in this embodiment of the present application.
Under the condition that the first public key and the first private key are not saved or are lost after being saved, the client can automatically download the first public key and the first private key from the server.
Optionally, when a user registers at the client, the key can be customized at the client to prevent other users from viewing or stealing data. The key customized by the user at the client may be referred to as a user key or a user-defined key, and the name of the key is not limited in the embodiment of the present application.
Illustratively, the user key may be a symmetric encryption key, which may be denoted by the symbol PK.
When the user registers at the client, the user can directly fill in the key at the client, or the key stored at the client can be imported at the client. The embodiment of the application does not limit the source of the user key.
Optionally, the client may encrypt the first private key AKprv through the user key PK to obtain the encrypted first private key. The encrypted first private key may be represented by the symbol C(AKprv).
The client may send the user name, the encrypted first private key C(AKprv), and the first public key to the server, and correspondingly, after receiving the user name, the encrypted first private key C(AKprv), and the first public key, the server stores the user name, the encrypted first private key C(AKprv), and the first public key, that is, the user is successfully created.
If the client has not stored the first public key and the first private key, or has lost them after storing them, the client can download the encrypted first private key C(AKprv) and the first public key from the server, and decrypt the encrypted first private key with the user key PK to obtain the first private key.
According to the data storage method provided by the embodiment of the application, the client encrypts the first private key through the user key, so that the security of transmitting the first private key can be improved, and meanwhile, the first private key and the user can be bound, and the data management of the server is facilitated.
The method 200 described above describes a process in which a client sends a first file to a server, and the client can download the first file from the server after the server stores data related to the first file. Therefore, the embodiment of the present application further provides a data storage method 300, which is used for describing a process of downloading the first file from the server by the client.
Fig. 3 is a schematic flow chart of another data storage method 300 according to an embodiment of the present application, where the method 300 can be applied to the communication system 100.
The method 300 may include the steps of:
S301, the client sends a request for downloading the first file to the server, and correspondingly, the server receives the request for downloading the first file.
The client may be the client 101 or the client 102 in the communication system 100 described above. The server may be the server 103 in the communication system 100 described above.
Optionally, the request for downloading the first file may include a file name of the first file.
S302, the server sends the ciphertexts of the data blocks and the convergence key ciphertexts of the data blocks to the client based on the request for downloading the first file, and correspondingly, the client receives the ciphertexts of the data blocks and the convergence key ciphertexts of the data blocks.
The server may determine a plurality of data blocks of the first file based on the request to download the first file. The plurality of data blocks may include at least one data block to be uploaded in the method 200 and a data block that is the same as a data block already existing in the server.
For example, the server may establish a correspondence relationship between the plurality of data blocks and a file name of the first file, and when the server receives a request to download the first file, the server may determine the plurality of data blocks based on the file name of the first file in the request to download the first file.
The ciphertext of the plurality of data blocks includes at least one ciphertext of a data block to be uploaded in the method 200 and a ciphertext of a data block that is the same as a data block already existing in the server.
When the same data block as the data block already existing in the server is uploaded to the server by the client, the server stores the ciphertext of the same data block as the data block already existing in the server, and the ciphertext of a plurality of data blocks can be directly transmitted to the client.
In a case where a data block identical to a data block already existing in a server is uploaded to the server by a client other than the client, the server needs to encrypt the data block identical to the data block already existing in the server by using a convergence key of the data block identical to the data block already existing in the server to obtain a ciphertext of the data block identical to the data block already existing in the server, and then send the ciphertexts of a plurality of data blocks to the client.
The convergence key ciphertext of the multiple data blocks includes the convergence key ciphertext of at least one data block to be uploaded in the method 200 and the convergence key ciphertext of a data block that is the same as a data block already existing in the server.
When the same data block as the data block already existing in the server is uploaded to the server by the client, the server stores the convergence key ciphertext of the same data block as the data block already existing in the server, and the convergence key ciphertext of a plurality of data blocks can be directly transmitted to the client.
When a data block identical to a data block already existing in a server is uploaded to the server by a client other than the client, the server needs to encrypt a convergence key of the data block identical to the data block already existing in the server by using a first private key to obtain a convergence key ciphertext of the data block identical to the data block already existing in the server, and then send the convergence key ciphertexts of a plurality of data blocks to the client.
It should be understood that the first private key is the same as the first private key in method 200 described above.
S303, the client decrypts the convergence key ciphertexts of the data blocks through the first private key respectively to obtain the convergence keys of the data blocks.
The first private key is the same as the first private key in method 200 described above.
S304, the client decrypts the ciphertexts of the data blocks through the convergence keys of the data blocks to obtain the data blocks.
The convergence keys of the data blocks are the same as those of the data blocks in the method 200, and are not described herein again.
According to the data storage method provided by the embodiment of the application, the client downloads the first file stored by the server, and the plurality of data blocks of the first file are obtained in a decryption mode twice, so that the data transmission safety is improved.
As an alternative embodiment, the method 300 further includes: the server can also send the signed first tuple to the client, and correspondingly, the client receives the signed first tuple; and the client side carries out integrity verification on the signed first tuple by using the first public key.
The signed first tuple, the first public key, and the first tuple are respectively the same as the signed first tuple, the first public key, and the first tuple in the method 200 described above.
For example, the client uses the first public key to verify the signature of the signed first tuple. If verification succeeds, the data is intact; if verification fails, the data is missing or has been tampered with.
According to the data storage method provided by the embodiment of the application, the integrity of the signed first tuple is verified by using the first public key, so that the integrity of data is ensured, and the security of data transmission is further improved.
The embodiment of the present application further provides a data storage method 400, which is used for introducing a process of file sharing between clients.
Fig. 4 is a schematic flow chart of another data storage method 400 provided in the embodiment of the present application, where the method 400 may be applied to the communication system 100.
The method 400 may include the steps of:
S401, the client sends a request for sharing the first file with another client to the server, and correspondingly, the server receives the request for sharing the first file with another client.
The client may be the client 101 in the communication system 100 described above. Another client may be client 102 in communication system 100.
It should be understood that, for convenience of description, the present embodiment is described by taking the first file in the method 200 and the method 300 as an example, in other possible implementations, the file shared between the clients may be any other file, and the present embodiment does not limit this.
In this embodiment of the present application, the client shares the first file with another client, that is, the other client may also obtain data in the first file, that is, the plurality of data blocks in the method 200 and the method 300 described above.
The user of a client may be referred to as user a, the user of another client may be referred to as user b, and a first file is shared between user a and user b.
Illustratively, a request to share a first file with another client may be described in a < user b, filename > tuple.
S402, the server sends a second public key in a second key pair corresponding to another client and the convergence key ciphertext of the data blocks to the client based on a request of sharing the first file with the other client, and the client receives the second public key in the second key pair corresponding to the other client and the convergence key ciphertext of the data blocks.
The other client may automatically generate a second public key and a second private key, i.e. a second key pair, upon user registration. It should be understood that the second key pair is different from the first key pair described above.
Alternatively, the second key pair may be asymmetric keys. The second private key may be an asymmetric encryption private key and the second public key may be an asymmetric encryption public key.
And the convergence key ciphertext of the data blocks is the convergence key ciphertext of the data blocks in the first file.
For example, the second public key and the converged key ciphertext for the plurality of data blocks may be described by a < second public key, converged key ciphertext for the plurality of data blocks > tuple.
S403, the client decrypts the convergence key ciphertexts of the data blocks through the first private key respectively to obtain the convergence keys of the data blocks.
The first private key is the same as the first private key in methods 200 and 300 described above.
The client decrypts the convergence key ciphertexts of the data blocks with its own first private key to obtain the convergence keys of the data blocks.
S404, the client encrypts the convergence keys of the data blocks through the second public key respectively to obtain new convergence key ciphertexts of the data blocks.
And the client encrypts the convergence keys of the data blocks respectively through the second public key of the other client to obtain new convergence key ciphertexts of the data blocks.
S405, the client sends the new convergence key ciphertexts of the data blocks to the server, and correspondingly, the server receives the new convergence key ciphertexts of the data blocks.
After receiving the new convergence key ciphertexts of the data blocks, the server may establish a descriptive file of the first file, that is, the new convergence key ciphertexts of the data blocks.
S406, the server sends the new convergence key ciphertexts of the data blocks and the ciphertexts of the data blocks to another client, and correspondingly, another client may receive the new convergence key ciphertexts of the data blocks and the ciphertexts of the data blocks.
When the other client side checks the data of the first file, the new convergence key ciphertexts of the data blocks can be decrypted through the second private key to obtain the convergence keys of the data blocks, and the ciphertexts of the data blocks are decrypted through the convergence keys of the data blocks to obtain the data blocks, namely the data in the first file.
Optionally, before step S406, the method 400 further includes: the other client sends a request for downloading the first file or a request for opening the first file to the server, and correspondingly, the server receives the request for downloading the first file or the request for opening the first file from the other client. Step S406 then includes: the server sends the new convergence key ciphertexts of the plurality of data blocks and the ciphertexts of the plurality of data blocks to the other client based on the request to download the first file or the request to open the first file.
Optionally, the server may further send the signed first tuple and information of the hash tree to another client, and correspondingly, the another client may receive the signed first tuple and the information of the hash tree and perform integrity verification based on the signed first tuple and the information of the hash tree.
According to the data storage method provided by the embodiment of the application, when a client shares a file with another client, the client first obtains the second public key of the other client through the server and encrypts the convergence keys of the data blocks with the second public key to obtain new convergence key ciphertexts of the data blocks. The other client obtains the new convergence key ciphertexts of the data blocks and the ciphertexts of the data blocks through the server, decrypts the new convergence key ciphertexts with the second private key that it generated to obtain the convergence keys, and then decrypts the ciphertexts of the data blocks with the convergence keys to obtain the data blocks. In this method, file sharing between the client and the other client is achieved without exposing the client's private key, which improves the security of data transmission.
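The re-wrapping of convergence keys in steps S403 to S406 can be sketched with two toy textbook RSA key pairs. The algorithm, the parameters, and the names `make_keypair`, `wrap`, `unwrap` and `reshare` are assumptions made for illustration; the primes are well-known Mersenne primes chosen only so that the example runs with the standard library, and they are not secure.

```python
import hashlib
from typing import List, Tuple

Key = Tuple[int, int]


def make_keypair(p: int, q: int, e: int = 65537) -> Tuple[Key, Key]:
    """Build a toy textbook-RSA key pair (illustration only, NOT secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # requires Python 3.8+
    return (e, n), (d, n)


AKpub, AKprv = make_keypair(2**127 - 1, 2**89 - 1)   # first key pair (user a)
BKpub, BKprv = make_keypair(2**107 - 1, 2**61 - 1)   # second key pair (user b)


def wrap(ck: bytes, public_key: Key) -> int:
    """Encrypt a convergence key with a public key."""
    e, n = public_key
    return pow(int.from_bytes(ck, "big"), e, n)


def unwrap(ack: int, private_key: Key, length: int = 20) -> bytes:
    """Decrypt a convergence key ciphertext with a private key."""
    d, n = private_key
    return pow(ack, d, n).to_bytes(length, "big")


def reshare(acks: List[int], own_private: Key, peer_public: Key) -> List[int]:
    """S403 and S404: decrypt each ACK_i, then re-encrypt it for the other client."""
    return [wrap(unwrap(ack, own_private), peer_public) for ack in acks]


cks = [hashlib.sha1(b"block-1").digest(), hashlib.sha1(b"block-2").digest()]
acks = [wrap(ck, AKpub) for ck in cks]                  # stored on the server for user a
new_acks = reshare(acks, AKprv, BKpub)                  # produced by user a's client
received = [unwrap(ack, BKprv) for ack in new_acks]     # recovered by user b's client
```

Neither private key leaves its owner's client in this flow, and the block ciphertexts themselves never need to be re-encrypted.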
The method 300 and the method 400 are both implemented on the basis of the method 200, and are two parallel methods. The methods 200, 300, and 400 can be applied to various storage-related fields such as cloud disks, backup, content delivery networks (CDN), and object storage, and can improve economy, availability, security, and transmission efficiency.
The sequence numbers of the above processes do not imply an execution order; the execution order of each process should be determined by its function and inherent logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.
The data storage method provided by the embodiment of the present application is described in detail above with reference to fig. 1 to 4, and the data storage device provided by the embodiment of the present application is described in detail below with reference to fig. 5 and 6.
Fig. 5 illustrates a data storage device 500 according to an embodiment of the present application. The apparatus 500 comprises: a processing module 510 and a transceiver module 520.
In a possible implementation manner, the apparatus 500 is configured to execute the respective flows and steps corresponding to the client in the foregoing method embodiment.
The processing module 510 is configured to: acquiring a plurality of data blocks of a first file, an identifier of each data block in the plurality of data blocks and a convergence key of each data block; the transceiver module 520 is configured to: sending the identification of each data block to a server; receiving a to-be-uploaded data block list sent by a server according to the identifier of each data block, wherein the to-be-uploaded data block list comprises at least one identifier of the to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data block stored in the server, and the at least one to-be-uploaded data block is all or part of the data blocks; the processing module 510 is further configured to: encrypting the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded to obtain a ciphertext of the at least one data block to be uploaded; the transceiver module 520 is further configured to: and sending the ciphertext of the at least one data block to be uploaded and the convergence secret key ciphertext of the at least one data block to be uploaded to the server, wherein the convergence secret key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence secret key of the at least one data block to be uploaded through a first public key in a first secret key pair.
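The client-side flow above can be sketched as follows. This is a minimal sketch under stated assumptions: the convergence key is taken to be the SHA-256 digest of the block, the identifier is taken to be a hash of that key, and the block cipher is replaced by a toy SHA-256 keystream XOR because the Python standard library has no symmetric cipher. All function names are hypothetical, and the step that encrypts the convergence key with the first public key is omitted.

```python
import hashlib

def split_blocks(data: bytes, size: int = 4) -> list[bytes]:
    """Split file content into fixed-size data blocks (block size is an assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def convergence_key(block: bytes) -> bytes:
    """Assumption: the convergence key is derived from the block content itself."""
    return hashlib.sha256(block).digest()

def block_id(block: bytes) -> str:
    """Assumption: the identifier is a hash of the convergence key."""
    return hashlib.sha256(convergence_key(block)).hexdigest()

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR with a SHA-256 counter keystream.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def prepare_upload(data: bytes, to_upload_ids: set[str], size: int = 4) -> dict[str, bytes]:
    """Encrypt only the blocks named in the to-be-uploaded list returned by the server."""
    payload = {}
    for block in split_blocks(data, size):
        ident = block_id(block)
        if ident in to_upload_ids:
            payload[ident] = keystream_xor(convergence_key(block), block)
    return payload
```

Because the key depends only on the block content, two clients holding the same block produce the same ciphertext, which is what lets the server deduplicate encrypted blocks.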
Optionally, the processing module 510 is further configured to: constructing a hash tree based on the identification of each data block; the transceiver module 520 is further configured to: sending information of a hash tree to a server, wherein leaf nodes of the hash tree are hash values of the identification of each data block, and root nodes of the hash tree are hash check values of the leaf nodes; and receiving a list of the data blocks to be uploaded, which are sent by the server according to the information of the hash tree.
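A hash tree of this kind might be built as in the sketch below. The passage only states that the leaf nodes are hash values of the block identifiers and that the root is a check value over the leaves; the pairwise SHA-256 construction, the duplication of the last node on odd levels, and the names `leaf_hashes`, `merkle_root`, and `verify_tree` are assumptions made for illustration.

```python
import hashlib
import hmac

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hashes(block_ids: list[str]) -> list[bytes]:
    """Each leaf node is the hash value of one data block identifier."""
    return [_h(i.encode()) for i in block_ids]

def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash adjacent nodes pairwise until a single root remains.
    Assumption: an odd node at the end of a level is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_tree(leaves: list[bytes], root: bytes) -> bool:
    """Receiver side: recompute the root from the leaves and compare it."""
    return hmac.compare_digest(merkle_root(leaves), root)
```

A receiver that gets the leaves and the root can call `verify_tree` before trusting the identifiers; any changed, missing, or reordered leaf yields a different root.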
Optionally, the processing module 510 is further configured to: signing a first tuple through a first private key in a first key pair to obtain a signed first tuple, wherein the first tuple comprises information of a root node, a file name of a first file and a version number of the first file; the transceiver module 520 is further configured to: and sending the signed first tuple to the server. Optionally, the transceiver module 520 is further configured to: sending a request for downloading a first file to a server; receiving ciphertext of a plurality of data blocks and convergence key ciphertext of the plurality of data blocks from a server; the processing module 510 is further configured to: decrypting the convergence key ciphertexts of the data blocks respectively through the first private key to obtain convergence keys of the data blocks; and decrypting the ciphertexts of the data blocks by the convergence keys of the data blocks to obtain the data blocks.
Optionally, the transceiver module 520 is further configured to: receive the signed first tuple from the server; the processing module 510 is further configured to: perform integrity verification on the signed first tuple by using the first public key.
Optionally, the transceiver module 520 is further configured to: send a request for sharing the first file with another client to the server; receive, from the server, a second public key in a second key pair corresponding to the other client and the convergence key ciphertexts of the plurality of data blocks; the processing module 510 is further configured to: decrypt the convergence key ciphertexts of the data blocks respectively with the first private key to obtain the convergence keys of the data blocks; encrypt the convergence keys of the data blocks respectively with the second public key to obtain new convergence key ciphertexts of the data blocks; the transceiver module 520 is further configured to: send the new convergence key ciphertexts of the plurality of data blocks to the server.
In another possible implementation manner, the apparatus 500 is configured to execute the respective flows and steps corresponding to the server in the foregoing method embodiment.
The transceiver module 520 is configured to: receive an identifier of each of a plurality of data blocks from the client; the processing module 510 is configured to: compare the identifiers of the plurality of data blocks with the identifiers of the data blocks already stored in the server respectively, and determine a to-be-uploaded data block list, wherein the to-be-uploaded data block list comprises an identifier of at least one to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data blocks stored in the server, and the at least one to-be-uploaded data block is all or part of the plurality of data blocks; the transceiver module 520 is further configured to: send the to-be-uploaded data block list to the client; receive, from the client, a ciphertext of the at least one to-be-uploaded data block and a convergence key ciphertext of the at least one to-be-uploaded data block, wherein the ciphertext of the at least one to-be-uploaded data block is obtained by encrypting the at least one to-be-uploaded data block with the convergence key of the at least one to-be-uploaded data block, and the convergence key ciphertext of the at least one to-be-uploaded data block is obtained by encrypting the convergence key of the at least one to-be-uploaded data block with a first public key of a first key pair.
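The comparison performed on the server side can be sketched as a set-membership test. This is an illustrative sketch only; the function name `blocks_to_upload` and the use of an in-memory set for the stored identifiers are assumptions.

```python
def blocks_to_upload(client_ids: list[str], stored_ids: set[str]) -> list[str]:
    """Return the identifiers the server does not hold yet, in client order.
    Identifiers repeated within the same file are listed only once."""
    seen: set[str] = set()
    to_upload: list[str] = []
    for ident in client_ids:
        if ident not in stored_ids and ident not in seen:
            seen.add(ident)
            to_upload.append(ident)
    return to_upload

# Usage: the server already stores block "b", so only "a" and "c" are requested.
pending = blocks_to_upload(["a", "b", "a", "c"], {"b"})
```

Only the blocks named in the returned list need to be encrypted and transmitted by the client, which is where the saving in bandwidth and storage comes from.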
Optionally, the transceiver module 520 is further configured to: receiving information of a hash tree from a client, wherein the hash tree is constructed based on the identifiers of a plurality of data blocks of a first file, a leaf node of the hash tree is a hash value of the identifier of each data block in the plurality of data blocks, and a root node of the hash tree is a hash check value of the leaf node; the processing module 510 is further configured to: carrying out integrity check by using the information of the root node and the information of the leaf node; and under the condition of passing the integrity check, comparing the identifications of the plurality of data blocks with the identifications of the data blocks already stored in the device 500 respectively, and determining a list of the data blocks to be uploaded.
Optionally, the transceiver module 520 is further configured to: receiving a signed first tuple from a client, wherein the first tuple comprises root node information, a file name of a first file and a version number of the first file, and the signed first tuple is obtained by signing the first tuple through a first private key in a first key pair; the processing module 510 is further configured to: and carrying out integrity verification on the signed first tuple again by utilizing the first public key.
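Signing the first tuple on the client and verifying it on the receiving side can be sketched as follows. This is a toy illustration, not the scheme of the application: the signature algorithm is not named in this passage, so the sketch uses textbook RSA over two known Mersenne primes, which is NOT secure. The tuple encoding, the sample values, and the names `encode_tuple`, `sign`, and `verify` are assumptions.

```python
import hashlib

# Toy first key pair built from two Mersenne primes (illustration only, NOT secure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537                   # first public key (N, E)
D = pow(E, -1, (P - 1) * (Q - 1))     # first private key exponent, Python 3.8+

def encode_tuple(root_hex: str, file_name: str, version: int) -> bytes:
    """Assumption: the first tuple is serialized by joining its three fields."""
    return "\x1f".join([root_hex, file_name, str(version)]).encode()

def sign(message: bytes) -> int:
    """Client side: sign the hash of the tuple with the first private key."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return pow(digest, D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver side: check the signature with the first public key."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return pow(signature, E, N) == digest

# Usage with hypothetical values for the root, file name, and version number.
first_tuple = encode_tuple("ab" * 32, "report.docx", 3)
signature = sign(first_tuple)
```

Because the root node, the file name, and the version number are signed together, a change to any of the three causes verification to fail.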
Optionally, the transceiver module 520 is configured to: receiving a request from a client to download a first file; and sending the ciphertext of the plurality of data blocks and the convergence key ciphertext of the plurality of data blocks to the client based on the request for downloading the first file.
Optionally, the transceiver module 520 is configured to: receiving a request from a client to share a first file with another client; based on a request for sharing the first file with another client, sending a second public key in a second key pair corresponding to the other client and a convergence key ciphertext of the plurality of data blocks to the client; receiving new convergence key ciphertexts of the data blocks from the client, wherein the new convergence key ciphertexts of the data blocks are obtained by encrypting the convergence keys of the data blocks through a second public key; and sending the new convergence key ciphertext of the plurality of data blocks and the ciphertext of the plurality of data blocks to another client.
It should be appreciated that the apparatus 500 herein is embodied in the form of functional modules. The term module herein may refer to an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared, dedicated, or group processor) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality. In an optional example, as will be understood by those skilled in the art, the apparatus 500 may be specifically a client or a server in the foregoing embodiment, or functions of the client or the server in the foregoing embodiment may be integrated in the apparatus 500, and the apparatus 500 may be configured to perform each process and/or step corresponding to the client or the server in the foregoing method embodiment, and details are not described here again to avoid repetition.
The apparatus 500 has functions of implementing corresponding steps executed by a client or a server in the method 200, the method 300 or the method 400; the above functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.
Fig. 6 illustrates a data storage device 600 according to an embodiment of the present application. The apparatus 600 comprises: a processor 610, a transceiver 620, and a memory 630. Wherein the processor 610, the transceiver 620 and the memory 630 are in communication with each other through an internal connection path, the memory 630 is used for storing instructions, and the processor 610 is used for executing the instructions stored in the memory 630 to control the transceiver to transmit and/or receive signals.
It should be understood that the apparatus 600 may be used for executing various steps and/or flows corresponding to the client or the server in the above method embodiments. The memory 630 may optionally include both read-only memory and random access memory, and provides instructions and data to the processor 610. A portion of the memory 630 may also include non-volatile random access memory. For example, the memory 630 may also store device type information. The processor 610 may be configured to execute the instructions stored in the memory 630, and when the processor 610 executes the instructions stored in the memory 630, the processor 610 is configured to perform the steps and/or processes of the method embodiments corresponding to the client or server described above.
It should be understood that, in the embodiment of the present application, the processor 610 of the apparatus 600 may be a Central Processing Unit (CPU), and the processor 610 may also be other general processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software elements in a processor. The software elements may be located in a storage medium well known in the art, such as random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), or registers. The storage medium is located in a memory, and a processor executes instructions in the memory and, in combination with its hardware, performs the steps of the above-described method. To avoid repetition, details are not described here.
The present application provides a computer-readable storage medium for storing a computer program for implementing a method corresponding to the client or the server in the above embodiments.
The present application provides a computer program product comprising a computer program (also referred to as code, or instructions) which, when run on a computer, can perform the method corresponding to the client or server in the above embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A method of storing data, comprising:
a client acquires a plurality of data blocks of a first file, an identifier of each data block in the data blocks and a convergence key of each data block;
the client sends the identification of each data block to a server;
the client receives a to-be-uploaded data block list sent by the server according to the identifier of each data block, wherein the to-be-uploaded data block list comprises at least one identifier of a to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data block stored in the server, and the at least one to-be-uploaded data block is all or part of the data blocks;
the client encrypts the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded to obtain a ciphertext of the at least one data block to be uploaded;
and the client sends the ciphertext of the at least one data block to be uploaded and the convergence key ciphertext of the at least one data block to be uploaded to the server, wherein the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key of a first key pair.
2. The method of claim 1, wherein the client sending the identifier of each data block to the server comprises:
the client constructs a hash tree based on the identifier of each data block and sends information of the hash tree to the server, wherein leaf nodes of the hash tree are hash values of the identifier of each data block, and a root node of the hash tree is a hash check value of the leaf nodes;
and the client receiving the to-be-uploaded data block list sent by the server according to the identifier of each data block comprises:
the client receives the to-be-uploaded data block list sent by the server according to the information of the hash tree.
3. The method of claim 2, further comprising:
the client signs a first tuple through a first private key in the first key pair to obtain the signed first tuple, wherein the first tuple comprises the information of the root node, the file name of the first file and the version number of the first file;
and the client sends the signed first tuple to the server.
4. The method of claim 1, further comprising:
the client sends a request for downloading the first file to the server;
the client receives ciphertext of the plurality of data blocks and convergence key ciphertext of the plurality of data blocks from the server;
the client decrypts the convergence key ciphertexts of the data blocks respectively through the first private key to obtain the convergence keys of the data blocks;
and the client decrypts the ciphertexts of the data blocks through the convergence keys of the data blocks to obtain the data blocks.
5. The method of claim 4, further comprising:
the client receives the signed first tuple from the server;
and the client side carries out integrity verification on the signed first tuple by using the first public key.
6. The method of claim 1, further comprising:
the client sends a request for sharing the first file with another client to the server;
the client receives a second public key in a second key pair corresponding to the other client from the server and the convergence key ciphertext of the data blocks;
the client decrypts the convergence key ciphertexts of the data blocks respectively through the first private key to obtain the convergence keys of the data blocks;
the client encrypts the convergence keys of the data blocks respectively through the second public key to obtain new convergence key ciphertexts of the data blocks;
and the client sends the new convergence key ciphertexts of the data blocks to the server.
7. A method of storing data, comprising:
the server receives an identification of each data block in the plurality of data blocks from the client;
the server compares the identifications of the data blocks with the identifications of the data blocks stored in the server respectively to determine a data block list to be uploaded, wherein the data block list to be uploaded comprises at least one identification of a data block to be uploaded, the at least one data block to be uploaded is different from the data block stored in the server, and the at least one data block to be uploaded is all or part of the data blocks;
the server sends the list of the data blocks to be uploaded to the client;
the server receives a ciphertext of the at least one data block to be uploaded from the client and a convergence key ciphertext of the at least one data block to be uploaded, wherein the ciphertext of the at least one data block to be uploaded is obtained by encrypting the at least one data block to be uploaded through a convergence key of the at least one data block to be uploaded, and the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key of a first key pair.
8. The method of claim 7, wherein the server receiving the identifier of each of the plurality of data blocks from the client comprises:
the server receives information of a hash tree from the client, wherein the hash tree is constructed based on the identifiers of a plurality of data blocks of a first file, a leaf node of the hash tree is a hash value of the identifier of each data block in the plurality of data blocks, and a root node of the hash tree is a hash check value of the leaf nodes;
after the server receives the identifier of each of the plurality of data blocks from the client, the method further comprises:
the server performs an integrity check by using the information of the root node and the information of the leaf nodes;
and the server comparing the identifiers of the data blocks with the identifiers of the data blocks already stored in the server respectively to determine the to-be-uploaded data block list comprises:
in the case that the integrity check passes, the server compares the identifiers of the data blocks with the identifiers of the data blocks already stored in the server respectively to determine the to-be-uploaded data block list.
9. The method of claim 8, further comprising:
the server receives a signed first tuple from the client, wherein the first tuple comprises the root node information, the file name of the first file and the version number of the first file, and the signed first tuple is obtained by signing the first tuple through a first private key in the first key pair;
and the server carries out integrity verification on the signed first tuple again by using the first public key.
10. The method of claim 7, further comprising:
the server receives a request from the client for downloading the first file;
and the server sends the ciphertexts of the data blocks and the convergence key ciphertexts of the data blocks to the client based on the request for downloading the first file.
11. The method of claim 7, further comprising:
the server receiving a request from the client to share the first file with another client;
the server sends a second public key in a second key pair corresponding to another client and the convergence key ciphertext of the data blocks to the client based on the request for sharing the first file with the other client;
the server receives new convergence key ciphertexts of the data blocks from the client, wherein the new convergence key ciphertexts of the data blocks are obtained by encrypting the convergence keys of the data blocks through the second public key;
the server sends the new convergence key ciphertexts of the data blocks and the ciphertexts of the data blocks to the other client.
12. A data storage device, comprising:
the processing module is used for acquiring a plurality of data blocks of a first file, an identifier of each data block in the data blocks and a convergence key of each data block;
the receiving and sending module is used for sending the identification of each data block to a server; receiving a to-be-uploaded data block list sent by the server according to the identifier of each data block, wherein the to-be-uploaded data block list comprises at least one identifier of a to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data block stored in the server, and the at least one to-be-uploaded data block is all or part of the data blocks;
the processing module is further configured to: encrypting the at least one data block to be uploaded through the convergence key of the at least one data block to be uploaded to obtain a ciphertext of the at least one data block to be uploaded;
the transceiver module is further configured to: and sending the ciphertext of the at least one data block to be uploaded and the convergence key ciphertext of the at least one data block to be uploaded to the server, wherein the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key in the first key pair.
13. A data storage device, comprising:
a transceiver module for receiving an identification of each of the plurality of data blocks from a client;
the processing module is configured to compare identifiers of the multiple data blocks with identifiers of data blocks already stored in the server, and determine a to-be-uploaded data block list, where the to-be-uploaded data block list includes an identifier of at least one to-be-uploaded data block, the at least one to-be-uploaded data block is different from the data block already stored in the server, and the at least one to-be-uploaded data block is all or part of the multiple data blocks;
the transceiver module is further configured to: sending the data block list to be uploaded to the client; and receiving a ciphertext of the at least one data block to be uploaded from the client and a convergence key ciphertext of the at least one data block to be uploaded, wherein the ciphertext of the at least one data block to be uploaded is obtained by encrypting the at least one data block to be uploaded through a convergence key of the at least one data block to be uploaded, and the convergence key ciphertext of the at least one data block to be uploaded is obtained by encrypting the convergence key of the at least one data block to be uploaded through a first public key of a first key pair.
14. A data storage device, comprising: a processor coupled with a memory for storing a computer program that, when invoked by the processor, causes the apparatus to perform the method of any of claims 1 to 6 or to perform the method of any of claims 7 to 11.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program comprising instructions for implementing the method of any of claims 1 to 6 or the method of any of claims 7 to 11.
16. A computer program product comprising computer program code which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 6 or the method of any one of claims 7 to 11.
CN202111470327.1A 2021-12-03 2021-12-03 Data storage method and data storage device Active CN114143098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111470327.1A CN114143098B (en) 2021-12-03 2021-12-03 Data storage method and data storage device

Publications (2)

Publication Number Publication Date
CN114143098A true CN114143098A (en) 2022-03-04
CN114143098B CN114143098B (en) 2023-08-15

Family

ID=80387594

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111470327.1A Active CN114143098B (en) 2021-12-03 2021-12-03 Data storage method and data storage device

Country Status (1)

Country Link
CN (1) CN114143098B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117097528A (en) * 2023-08-22 2023-11-21 广州市番禺融合小额贷款股份有限公司 Financial data secure storage system, method and equipment based on big data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103685162A (en) * 2012-09-05 2014-03-26 中国移动通信集团公司 File storing and sharing method
CN104158880A (en) * 2014-08-19 2014-11-19 济南伟利迅半导体有限公司 User-end cloud data sharing solution
CN105915332A (en) * 2016-07-04 2016-08-31 广东工业大学 Cloud storage encryption and dereplication method and cloud storage encryption and dereplication system
CN106506474A (en) * 2016-11-01 2017-03-15 西安电子科技大学 A kind of efficient traceable data sharing method based on mobile cloud environment
CN109491591A (en) * 2018-09-17 2019-03-19 广东工业大学 A kind of information diffusion method suitable for cloudy storage system
CN112565434A (en) * 2020-12-09 2021-03-26 广东工业大学 Cloud storage safety duplicate removal method and device based on Merkle hash tree

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
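王珂 (Wang Ke): "Research on a Secure Data Deduplication Mechanism Based on Proxy Re-encryption", China Master's Theses Full-text Database, Information Science and Technology *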

Similar Documents

Publication Publication Date Title
CN109194466B (en) Block chain-based cloud data integrity detection method and system
US11323247B2 (en) Methods and systems for secure data communication
CA3073549C (en) Methods and systems for secure data communication
US9116849B2 (en) Community-based de-duplication for encrypted data
US9537657B1 (en) Multipart authenticated encryption
CN109347627B (en) Data encryption and decryption method and device, computer equipment and storage medium
CN110096901B (en) Electronic contract data encryption storage method and signing client
US20140195804A1 (en) Techniques for secure data exchange
CN112202754B (en) Data encryption method and device, electronic equipment and storage medium
CN112738051B (en) Data information encryption method, system and computer readable storage medium
US20140237252A1 (en) Techniques for validating data exchange
CN115225409B (en) Cloud data secure deduplication method based on multi-backup joint verification
CN114244508B (en) Data encryption method, device, equipment and storage medium
CN112804217B (en) Block chain technology-based evidence storing method and device
CN111970114A (en) File encryption method, system, server and storage medium
US20140237239A1 (en) Techniques for validating cryptographic applications
CN114143098B (en) Data storage method and data storage device
CN112947967B (en) Software updating method, blockchain application store and software uploading terminal
CN114338648A (en) SFTP multi-terminal file secure transmission method and system based on state cryptographic algorithm
CN114679299B (en) Communication protocol encryption method, device, computer equipment and storage medium
CN113158218A (en) Data encryption method and device and data decryption method and device
CN116866029B (en) Random number encryption data transmission method, device, computer equipment and storage medium
CN112350920A (en) Instant communication system based on block chain
CN113572599B (en) Power data transmission method, data source equipment and data access equipment
US20230027422A1 (en) Systems, apparatus, and methods for generation, packaging, and secure distribution of symmetric quantum cypher keys

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant