WO2019218717A1 - Distributed storage method, apparatus, computer device, and storage medium - Google Patents

Distributed storage method, apparatus, computer device, and storage medium

Info

Publication number
WO2019218717A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
data packet
storage
stored
fragment
Prior art date
Application number
PCT/CN2019/072337
Other languages
English (en)
French (fr)
Inventor
荆博
Original Assignee
百度在线网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百度在线网络技术(北京)有限公司
Priority to US16/766,151 (US11842072B2)
Priority to JP2020530626A (JP7044881B2)
Publication of WO2019218717A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062Securing storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062Securing storage systems
    • G06F3/0623Securing storage systems in relation to content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662Virtualisation aspects
    • G06F3/0665Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/06Network architectures or network communication protocols for network security for supporting key management in a packet data network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0869Generation of secret information including derivation or calculation of cryptographic keys or passwords involving random numbers or seeds
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees

Definitions

  • The embodiments of the present application relate to the field of data storage technologies, for example, to a distributed storage method, apparatus, computer device, and storage medium.
  • Cloud storage in the related art generally stores data on centralized servers. As the volume of stored data grows, the storage space and bandwidth of the servers are heavily consumed, and the cost of cloud storage keeps rising. Moreover, data stored in the cloud by the related technology is not encrypted, and its privacy rests only on the reputation of large cloud storage service providers.
  • Decentralized storage, in turn, decentralizes trust, which leads to unstable storage nodes and to stored data that is insecure and easy to attack.
  • The embodiments of the present application provide a distributed storage method, apparatus, computer device, and storage medium that make it convenient for users to store files in a distributed network to reduce storage costs, and that can effectively improve the privacy and security of stored files, thereby preventing an attacker from restoring the original file.
  • An embodiment of the present application provides a distributed storage method, including:
  • forming all data packets into at least three data fragments, wherein each data fragment includes only some of the data packets, and each data packet is added to at least two data fragments;
  • Another embodiment of the present application further provides a distributed storage device, including:
  • a data grouping module configured to group the file to be stored to form a plurality of data packets;
  • a data fragmentation module configured to form all data packets into at least three data fragments, wherein each data fragment includes only some of the data packets, and each data packet is added to at least two data fragments;
  • a data storage module configured to store each data fragment in a distributed manner on distributed storage nodes;
  • a relationship recording module configured to record the correspondence between the data fragments and the data packets, and the correspondence between the storage nodes and the stored data fragments;
  • a file deletion module configured to delete the local file to be stored.
  • Another embodiment of the present application further provides a computer device, where the computer device includes:
  • one or more processors;
  • a storage device configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the distributed storage method provided by any embodiment of the present application.
  • Another embodiment of the present application further provides a computer storage medium having stored thereon a computer program that, when executed by a processor, implements the distributed storage method provided by any embodiment of the present application.
  • FIG. 1 is a flowchart of a distributed storage method according to Embodiment 1 of the present application.
  • FIG. 2a is a flowchart of a distributed storage method according to Embodiment 2 of the present application.
  • FIG. 2b is a schematic structural diagram of the original Merkle tree involved in Embodiment 2 of the present application.
  • FIG. 3a is a flowchart of a distributed storage method according to Embodiment 3 of the present application.
  • FIG. 3b is a flowchart of a method for restoring a stored file in a distributed storage method according to Embodiment 3 of the present application.
  • FIG. 4 is a flowchart of a distributed storage method according to Embodiment 4 of the present application.
  • FIG. 5 is a schematic diagram of a distributed storage device according to Embodiment 5 of the present application.
  • FIG. 6 is a schematic structural diagram of a computer device according to Embodiment 6 of the present application.
  • FIG. 1 is a flowchart of a distributed storage method according to Embodiment 1 of the present application. This embodiment is applicable to storing a file in a distributed network. The method may be performed by a distributed storage apparatus, which is implemented in software and/or hardware and can generally be integrated into any computer device that can initiate data storage. As shown in FIG. 1, the method includes steps S110, S120, S130, S140, and S150.
  • In step S110, the file to be stored is grouped to form a plurality of data packets.
  • The file to be stored may be any storable file, such as text, a picture, video, audio, or another type (for example, a compressed file in zip format).
  • The embodiments of the present application do not limit the type of the file to be stored.
  • A data packet is a part of the file data of the file to be stored.
  • The file to be stored is first grouped, that is, divided into a plurality of data packets.
  • The file may be divided equally into N data packets, so that each data packet contains the same amount of file data.
  • The file may also be divided randomly, so that the data packets contain different amounts of file data.
  • Those skilled in the art may also adopt other ways of grouping the file according to actual needs, which is not limited by the embodiments of the present application.
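  • The two grouping options described for step S110 can be illustrated with a minimal Python sketch. The function names `split_equal` and `split_random` are hypothetical and do not come from the application. The equal split spreads any remainder over the first packets, which is an assumption, since the text does not say how a file whose size is not a multiple of N is handled.

```python
import random

def split_equal(data: bytes, n: int) -> list[bytes]:
    """Split data into n packets of equal size (sizes differ by at most one byte)."""
    if n <= 0 or n > len(data):
        raise ValueError("n must be between 1 and len(data)")
    size, extra = divmod(len(data), n)
    packets, start = [], 0
    for i in range(n):
        # Assumption: the first `extra` packets absorb the remainder bytes.
        end = start + size + (1 if i < extra else 0)
        packets.append(data[start:end])
        start = end
    return packets

def split_random(data: bytes, n: int, rng: random.Random) -> list[bytes]:
    """Split data into n non-empty packets at randomly chosen cut points."""
    if n <= 0 or n > len(data):
        raise ValueError("n must be between 1 and len(data)")
    cuts = sorted(rng.sample(range(1, len(data)), n - 1))
    bounds = [0, *cuts, len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]
```

  • In both cases, concatenating the packets in order reproduces the original file, which is what later restoration relies on.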
  • In step S120, all data packets are formed into at least three data fragments, wherein each data fragment includes only some of the data packets, and each data packet is added to at least two data fragments.
  • A data fragment is composed of some of the data packets; that is, no data fragment includes all data packets.
  • The number of data packets included in each data fragment may be the same or different.
  • For example, a data fragment may include 2, 5, 8, or more data packets.
  • The embodiments of the present application do not limit the number of data packets included in a data fragment.
  • All data packets can be formed into at least three data fragments, and each data packet can be added to at least two data fragments; that is, each data packet has at least two stored copies.
  • In the process of forming fragments, data packets are thus stored redundantly. With M-copy storage, one data packet appears in M data fragments. The number of copies M is greater than or equal to 2, and may be preset or dynamically adjusted according to actual conditions, such as the importance level of the stored file and the stability of the storage nodes.
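  • One possible way to satisfy the constraints of step S120 is a round-robin assignment, sketched below. The application does not prescribe an assignment rule, so the rule, the function name `form_fragments`, and the requirement that there be at least as many packets as fragments are all assumptions made for illustration.

```python
def form_fragments(num_packets: int, num_fragments: int, copies: int) -> list[list[int]]:
    """Assign packet IDs to fragments so that each packet lands in `copies` fragments.

    Round-robin rule (an assumption): packet i goes to fragments
    i, i+1, ..., i+copies-1 (modulo num_fragments).
    """
    if num_fragments < 3:
        raise ValueError("at least three fragments are required")
    if not 2 <= copies < num_fragments:
        raise ValueError("copies must be >= 2 and smaller than num_fragments")
    if num_packets < num_fragments:
        # Needed so that every fragment misses at least one packet.
        raise ValueError("need at least as many packets as fragments")
    fragments: list[list[int]] = [[] for _ in range(num_fragments)]
    for packet_id in range(num_packets):
        for k in range(copies):
            fragments[(packet_id + k) % num_fragments].append(packet_id)
    return fragments
```

  • Because `copies` is smaller than the number of fragments and every residue modulo `num_fragments` occurs among the packet IDs, each fragment is missing at least one packet, so no single fragment holds the whole file.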
  • In step S130, each data fragment is stored in a distributed manner on distributed storage nodes.
  • The storage nodes of the distributed network work independently and can be scheduled according to a distributed storage algorithm.
  • The data fragments formed from the data packets, rather than the data packets themselves, are stored on the distributed storage nodes.
  • Each distributed storage node may store only one data fragment.
  • Reed-Solomon redundancy can be used when the data fragments are distributed across the storage nodes.
  • This algorithm is an erasure code that corrects erroneous data through polynomial operations. Therefore, even if some nodes go offline or some data is corrupted, the data file can still be recovered and accessed.
  • For example, a file to be stored is divided into a plurality of data packets, which are formed into data fragments and distributed, so that M copies are stored redundantly on N distributed storage nodes (for example, 30 nodes with 3 copies), each distributed storage node storing part of the data packets. As long as N/M healthy distributed storage nodes survive (10 in this example), the original file to be stored can be restored.
  • In an embodiment, 3-copy redundant storage may be used.
  • In step S140, the correspondence between the data fragments and the data packets, and the correspondence between the storage nodes and the stored data fragments, are recorded.
  • The correspondence between the data fragments and the data packets can be used to find the data fragment that contains a given data packet, and the correspondence between the storage nodes and the stored data fragments can be used to find the storage node that holds a given data fragment, so that the data fragment can be downloaded from that storage node.
  • The two correspondences can also be used to verify the data packets.
  • In this way, privacy protection of the data packets and the data fragments can be implemented.
  • For example, the correspondence between a data fragment and its data packets may be: data fragment 1 includes the data packets numbered 1, 2, and 3.
  • The correspondence between a storage node and a stored data fragment may be: storage node 5 stores the data fragment numbered 1.
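  • The two correspondences recorded in step S140 can be pictured as two lookup tables. The sketch below uses plain dictionaries. The first entry of each table mirrors the example in the text (fragment 1 holds packets 1, 2, and 3, and storage node 5 stores fragment 1); all other entries, and the helper name `nodes_holding_packet`, are hypothetical.

```python
# Fragment ID -> IDs of the data packets it contains.
fragment_to_packets = {
    1: [1, 2, 3],   # example given in the text
    2: [2, 3, 4],   # hypothetical
    3: [4, 1],      # hypothetical
}

# Storage node ID -> ID of the data fragment it stores (one fragment per node).
node_to_fragment = {
    5: 1,    # example given in the text
    8: 2,    # hypothetical
    11: 3,   # hypothetical
}

def nodes_holding_packet(packet_id, fragment_to_packets, node_to_fragment):
    """Return the IDs of all storage nodes whose fragment contains the packet."""
    fragment_ids = {
        fid for fid, packets in fragment_to_packets.items() if packet_id in packets
    }
    return sorted(
        node for node, fid in node_to_fragment.items() if fid in fragment_ids
    )
```

  • A lookup first resolves the packet to its fragments and then the fragments to their nodes, which is the order in which the two correspondences are used when a fragment is downloaded.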
  • In step S150, the local file to be stored is deleted.
  • Deleting the local file to be stored prevents it from being obtained by a malicious attacker.
  • In this embodiment, the file to be stored is grouped into a plurality of data packets, and all data packets are formed into at least three data fragments, wherein each data fragment includes only some of the data packets and each data packet is added to at least two data fragments. Each data fragment is then stored on a distributed storage node, thereby realizing distributed storage of the data.
  • Distributed storage removes the bottleneck of centralized storage and reduces bandwidth and storage costs, and multi-copy storage of the data packets prevents the data as a whole from becoming unrecoverable when some storage nodes fail.
  • Because the data fragment stored on each storage node does not include all data packets, compromising a single storage node is not enough to recover the original stored file.
  • The foregoing technical solution addresses both the continually rising storage cost of the related cloud storage technology and the insecurity of data storage in distributed storage technology. It makes it convenient for users to store files in a distributed network to reduce storage costs, and can effectively improve the privacy and security of stored files, thereby preventing an attacker from restoring the original file.
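  • The two properties claimed for this embodiment (tolerance of node failures and no single node holding the whole file) can be checked on a concrete layout. The sketch below is illustrative only; the function names `recoverable` and `single_node_exposes_file` and the layout used in the usage note are assumptions.

```python
def recoverable(node_packets: dict, failed: set) -> bool:
    """True if every packet is still held by at least one surviving node.

    node_packets maps a node ID to the set of packet IDs in its fragment.
    """
    all_packets = set().union(*node_packets.values())
    surviving = set().union(
        *(packets for node, packets in node_packets.items() if node not in failed)
    )
    return surviving == all_packets

def single_node_exposes_file(node_packets: dict) -> bool:
    """True if some node alone holds every packet of the file."""
    all_packets = set().union(*node_packets.values())
    return any(packets == all_packets for packets in node_packets.values())
```

  • For a hypothetical layout of six packets in two copies over three nodes, `{0: {0, 2, 3, 5}, 1: {0, 1, 3, 4}, 2: {1, 2, 4, 5}}`, the file survives the loss of any one node but not the loss of two, and no node exposes the whole file.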
  • FIG. 2a is a flowchart of a distributed storage method according to Embodiment 2 of the present application.
  • This embodiment is a refinement of the foregoing embodiment.
  • In this embodiment, an implementation for encrypting the data packets is provided.
  • In addition, recording the correspondence between the data fragments and the data packets, and the correspondence between the storage nodes and the stored data fragments, is refined as follows: the hash values of the data packets included in each data fragment are recorded in the form of a Merkle tree.
  • The method of this embodiment may include steps S210, S220, S230, S240, S250, S260, and S270.
  • In step S210, the file to be stored is grouped to form a plurality of data packets.
  • In step S220, each data packet is encrypted sequentially using a key, wherein the key of each data packet other than the first is generated from the ciphertext of the previous data packet, and the encryption order of the data packets is recorded.
  • After the file to be stored is grouped, each data packet may be encrypted.
  • The data packets may be encrypted sequentially: each data packet is encrypted using a symmetric encryption algorithm with a block cipher mechanism, in which 128 bits of data may be encrypted at a time and the key may be up to 256 bits long.
  • The first data packet can be encrypted on its own to generate its ciphertext. When each later data packet is encrypted, the ciphertext of the previous data packet is used as part of the input, so that it perturbs the output for the next data packet. Specifically, the ciphertext of the previous data packet can be used to compute the key of the next data packet.
  • The key of the next data packet may consist of a fixed part and a part computed from that ciphertext. Because conventional CPUs (central processing units) provide no instruction-set optimization for the block encryption algorithm, a brute-force attack on this sequential scheme, in which each data packet depends on the previous one, incurs a very high cost. In addition, even if the key is leaked, the content of the stored file is not exposed, because an attacker can recover the content only by obtaining all data packets and knowing the encryption order. After each data packet has been encrypted sequentially, the encryption order of the data packets can also be recorded so that the original file can be restored later.
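  • The chaining described above, in which each key depends on the previous ciphertext, can be sketched in Python as follows. This is not the cipher of the application: the text describes a symmetric block cipher with 128-bit blocks and keys of up to 256 bits, whereas this sketch substitutes a toy SHA-256 keystream so that it runs with the standard library alone. The stand-in cipher, the key derivation rule `SHA-256(fixed_key || previous ciphertext)`, and all function names are assumptions, and the code must not be used to protect real data.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher: XOR data with a SHA-256 counter keystream (NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)

def derive_key(fixed_key: bytes, prev_ciphertext: bytes) -> bytes:
    """Assumed rule: key = fixed part combined with the previous ciphertext."""
    return hashlib.sha256(fixed_key + prev_ciphertext).digest()

def encrypt_chain(packets: list[bytes], fixed_key: bytes) -> list[bytes]:
    ciphertexts = []
    key = hashlib.sha256(fixed_key).digest()  # the first packet is encrypted on its own
    for packet in packets:
        ciphertext = keystream_xor(key, packet)
        ciphertexts.append(ciphertext)
        key = derive_key(fixed_key, ciphertext)  # key for the next packet
    return ciphertexts

def decrypt_chain(ciphertexts: list[bytes], fixed_key: bytes) -> list[bytes]:
    packets = []
    key = hashlib.sha256(fixed_key).digest()
    for ciphertext in ciphertexts:
        packets.append(keystream_xor(key, ciphertext))
        key = derive_key(fixed_key, ciphertext)
    return packets
```

  • Decryption only succeeds when the ciphertexts are processed in the recorded encryption order, since each key is derived from the ciphertext that precedes it. This is why the order has to be recorded.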
  • Recording the encryption order of the data packets includes calculating the hash value of each data packet in the encryption order of the data packets to form an original Merkle tree.
  • FIG. 2b is a schematic structural diagram of the original Merkle tree involved in Embodiment 2 of the present application.
  • As shown in FIG. 2b, the hash value of each data packet (DATA BLOCK) is calculated, and these hash values form the leaf nodes (Hash-LEAF) of the original Merkle tree in order from left to right. The leaf nodes are then combined in pairs and hashed to form the upper branch nodes (Hash-BRANCH), and so on until the root node (Hash-ROOT) of the original Merkle tree is obtained.
  • The original Merkle tree records not only the hash values of the data packets but also, through its tree structure, the order of the data packets.
  • In FIG. 2b, DATA BLOCK denotes a data packet (for example, DATA BLOCK2 is the second data packet), and DATA SHARD denotes a data fragment (for example, DATA SHARD3 is the third data fragment).
  • Recording the encryption order of the data packets in a Merkle tree takes advantage of the Merkle tree to improve the operational efficiency and scalability of the distributed network, and the tree can serve as a verification credential when the data is later restored.
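  • The construction of the original Merkle tree (leaf hashes in encryption order, pairwise combination up to the root) can be sketched as follows. The choice of SHA-256 and the handling of an odd number of nodes on a level (the last node is duplicated) are assumptions; the application specifies neither. The function names are hypothetical.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(packets: list[bytes]) -> list[list[bytes]]:
    """Return all levels of the tree, from the leaves (index 0) up to the root."""
    if not packets:
        raise ValueError("at least one data packet is required")
    level = [sha256(p) for p in packets]  # Hash-LEAF nodes, left to right
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last node when a level has odd length.
            level = level + [level[-1]]
        # Hash-BRANCH nodes: hash of each adjacent pair.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_root(packets: list[bytes]) -> bytes:
    """Hash-ROOT of the tree built over the packets in the given order."""
    return merkle_levels(packets)[-1][0]
```

  • Swapping two packets changes the root, which is how the tree captures the order of the packets in addition to their contents.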
  • In step S230, all data packets are formed into at least three data fragments, wherein each data fragment includes only some of the data packets, and each data packet is added to at least two data fragments.
  • In step S240, each data fragment is stored in a distributed manner on distributed storage nodes.
  • In step S250, the hash values of the data packets included in each data fragment are recorded in the form of a Merkle tree, referred to as a fragmented Merkle tree.
  • A fragmented Merkle tree is a Merkle tree built from the hash values calculated from the data packets included in a data fragment.
  • Each data fragment can thus be recorded using a Merkle tree. Because a sequence number (ID) and the hash value of the content were set for each data packet when the original Merkle tree was built, the fragmented Merkle tree of each data fragment can be calculated from the hash values of the data packets that the fragment includes. In a fragmented Merkle tree, the order of the data packets does not have to match the original encryption order, and the hash branches of the tree can be calculated by combining the data packets in any order.
  • In step S260, the correspondence between each fragmented Merkle tree and the storage node where the corresponding data fragment is located is recorded.
  • Each data fragment corresponds to one fragmented Merkle tree.
  • In this embodiment, the data packets are encrypted sequentially, the original Merkle tree is formed from the hash values of the data packets, and a fragmented Merkle tree is formed from the hash values of the data packets included in each data fragment. This supports lookup and verification of data packets and data fragments, and improves the privacy and security of the stored file, thereby effectively preventing an attacker from restoring the original file.
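  • Steps S250 and S260 can be sketched as a per-fragment index that keeps the leaf hashes, the root of the fragmented Merkle tree, and the storage node of each fragment. The record layout, the use of SHA-256, the duplication of the last node on odd levels, and the names `build_fragment_index` and `locate` are all assumptions made for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root over already-hashed leaves (odd levels duplicate the last node)."""
    if not leaves:
        raise ValueError("a fragment must contain at least one packet")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def build_fragment_index(fragments: dict, node_of_fragment: dict) -> dict:
    """Map each fragment ID to its leaf hashes, Merkle root, and storage node."""
    index = {}
    for fid, packets in fragments.items():
        leaves = [h(p) for p in packets]
        index[fid] = {
            "leaves": leaves,
            "root": merkle_root(leaves),
            "node": node_of_fragment[fid],
        }
    return index

def locate(packet_hash: bytes, index: dict) -> list:
    """Return (fragment ID, node, position) for every fragment holding the packet."""
    return [
        (fid, entry["node"], entry["leaves"].index(packet_hash))
        for fid, entry in index.items()
        if packet_hash in entry["leaves"]
    ]
```

  • Looking up a packet by its hash returns every fragment and node that holds a copy, together with the packet's position inside the fragment, so any surviving copy can be used.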
  • In step S270, the local file to be stored is deleted.
  • FIG. 3a is a flowchart of a distributed storage method according to Embodiment 3 of the present application.
  • FIG. 3b is a flowchart of a method for restoring a stored file in a distributed storage method according to Embodiment 3 of the present application.
  • This embodiment is a refinement of the foregoing embodiments.
  • In this embodiment, an implementation for restoring the stored file from the data packets is provided.
  • The method of this embodiment may include steps S310, S320, S330, S340, S350, S360, and S370.
  • In step S310, the file to be stored is grouped to form a plurality of data packets.
  • In step S320, all data packets are formed into at least three data fragments, wherein each data fragment includes only some of the data packets, and each data packet is added to at least two data fragments.
  • In step S330, each data fragment is stored in a distributed manner on distributed storage nodes.
  • In step S340, the correspondence between the data fragments and the data packets, and the correspondence between the storage nodes and the stored data fragments, are recorded.
  • In step S350, the local file to be stored is deleted.
  • In step S360, when a stored-file query request is generated, each data packet is downloaded from the storage nodes according to the locally recorded correspondence between the data fragments and the data packets, and the correspondence between the storage nodes and the stored data fragments.
  • The stored-file query request may be a request sent by the user to obtain a stored file, for example to download it or to preview it online.
  • The data packets of the stored file are obtained one after another in order. Finally, all data packets are spliced and decrypted in order to obtain the complete stored file.
  • the process of restoring the storage file may include step S361, step S362, step S363, step S364, step S365, step S366, step S367, and step S368.
  • In step S361, the first data packet is determined as the current data packet according to the locally recorded encryption order of the data packets.
  • For example, the data packet corresponding to the first hash leaf node may be found from the recorded original Merkle tree.
  • In step S362, the storage node where the current data packet is located is determined as the current packet node according to the correspondence between data fragments and data packets and the correspondence between storage nodes and the data fragments they store.
  • For example, the corresponding hash value may be looked up in the fragmented Merkle trees, and the storage node is then determined from the data fragment that was found.
  • In step S363, a data fragment is downloaded from the current packet node, and the current data packet is extracted from the data fragment.
  • Because the key of each data packet is generated, during encryption, in association with the ciphertext of the previous data packet, restoration starts by obtaining the data packet that is first in the encryption order, that is, the first data packet; the other data packets are then obtained one after another starting from it.
  • After the first data packet is determined, it is taken as the current data packet. The data fragment containing the current data packet is found according to the correspondence between data fragments and data packets; the storage node where the current data packet is located is determined as the current packet node according to the correspondence between storage nodes and the data fragments they store; and the data fragment is downloaded from the current packet node. Because the data fragment stored by the current packet node contains the current data packet, the current data packet can be extracted from the data fragment according to its hash position in the fragmented Merkle tree corresponding to the current packet node.
  • As shown in FIG. 2b, the tree formed by the hash values inside the dashed box is the original Merkle tree. The LEAF nodes of the original Merkle tree (Hash 1-LEAF, Hash 2-LEAF, Hash 3-LEAF, and Hash 4-LEAF) store the hash values of the data packets; the BRANCH nodes (Hash 5-BRANCH and Hash 6-BRANCH) store hash values calculated with the hash algorithm from the hash values of the LEAF nodes; and the ROOT node (Hash 7-ROOT) stores the hash value calculated with the hash algorithm from the hash values of Hash 5-BRANCH and Hash 6-BRANCH.
  • DATA BLOCK 1, DATA BLOCK 2, DATA BLOCK 3, and DATA BLOCK 4 are data packets formed from the file to be stored.
  • DATA SHARD 1, DATA SHARD 2, and DATA SHARD 3 are some of the data fragments formed from the data packets (that is, not all data fragments are shown in FIG. 2b), and these data fragments include the second data packet.
  • The hash position corresponding to the first data packet (DATA BLOCK 1) is the Hash 1-LEAF node, and so on: every data packet stores its hash value at its corresponding hash position.
  • The structure of the fragmented Merkle tree is not shown in FIG. 2b. A fragmented Merkle tree is formed from the hash values of the data packets included in a data fragment: its LEAF nodes hold the hash values of those data packets, and its other nodes are formed in the same way as in the original Merkle tree.
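  • The four-leaf tree described above (Hash 1-LEAF to Hash 4-LEAF, Hash 5-BRANCH, Hash 6-BRANCH, Hash 7-ROOT) might be built as in the following minimal sketch. SHA-256, the concatenation of hexadecimal child hashes, the duplication of the last hash on odd levels, and the function names are assumptions made for illustration; the application does not specify them.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hash helper; SHA-256 is an assumed choice of hash algorithm."""
    return hashlib.sha256(data).hexdigest()


def build_merkle_levels(packets: list[bytes]) -> list[list[str]]:
    """Return the tree level by level: LEAF hashes first, ROOT last."""
    if not packets:
        raise ValueError("at least one data packet is required")
    level = [sha256_hex(p) for p in packets]      # LEAF nodes: one hash per data packet
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:                   # odd count: repeat the last hash (assumption)
            level = level + [level[-1]]
        # Each parent is the hash of its two concatenated child hashes.
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("ascii"))
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


blocks = [b"DATA BLOCK 1", b"DATA BLOCK 2", b"DATA BLOCK 3", b"DATA BLOCK 4"]
levels = build_merkle_levels(blocks)
leaves, branches, root = levels[0], levels[1], levels[2][0]
```

  • A fragmented Merkle tree could be obtained from the same function by passing only the data packets contained in one data fragment.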
  • For example, if the hash position of the current data packet in the original Merkle tree is Hash 2-LEAF and the hash value stored there is H, the fragmented Merkle trees are searched for those that store the hash value H.
  • The storage node where the current data packet is located may then be determined as the current packet node according to the fragmented Merkle trees that were found and the correspondence between each fragmented Merkle tree and its storage node. More than one storage node may hold the current data packet, in which case one of them is selected as the current packet node. The data packet located by the hash value H in the fragmented Merkle tree corresponding to the current packet node is the required second data packet.
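  • The lookup just described can be illustrated with a small sketch. The record layout (a list of original leaf hashes, a dictionary of per-fragment leaf hashes, and a fragment-to-node dictionary), the identifiers such as `shard-1` and `node-A`, and the use of SHA-256 are all hypothetical; only the idea of matching the hash value H against the fragmented Merkle trees comes from the text.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical local records kept by the node that stored the file.
original_leaves = [h(b"block-1"), h(b"block-2"), h(b"block-3"), h(b"block-4")]
fragment_leaves = {          # fragment id -> leaf hashes of its fragmented Merkle tree
    "shard-1": [original_leaves[0], original_leaves[1]],
    "shard-2": [original_leaves[1], original_leaves[2], original_leaves[3]],
    "shard-3": [original_leaves[0], original_leaves[2], original_leaves[3]],
}
fragment_to_node = {"shard-1": "node-A", "shard-2": "node-B", "shard-3": "node-C"}


def candidate_nodes(packet_index: int) -> list[tuple[str, str, int]]:
    """Return (node, fragment id, position inside the fragment) for every stored copy."""
    target = original_leaves[packet_index]        # hash value H at the packet's hash position
    found = []
    for fragment_id, leaves in fragment_leaves.items():
        if target in leaves:
            found.append((fragment_to_node[fragment_id], fragment_id, leaves.index(target)))
    return found


# The second data packet (index 1) is held by two nodes; either may become the current packet node.
copies_of_second_packet = candidate_nodes(1)
```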
  • The distributed storage method of this embodiment of the present application can effectively prevent an attacker from obtaining the original file.
  • In step S364, the hash value of the extracted current data packet is calculated and matched against the locally stored hash value of the current data packet to verify the validity of the current data packet.
  • After the current data packet is extracted, its validity may be verified. Because each data packet has a different hash value, the hash value can serve as the basis for the check: the hash value calculated from the extracted current data packet is matched against the locally stored hash value of the current data packet. If they are consistent, verification passes and the current data packet is valid; otherwise, another data fragment is selected according to the correspondence between data fragments and data packets (for example, using the fragmented Merkle trees) and the current data packet is extracted again, until the extracted current data packet is determined to be valid.
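  • A minimal sketch of this check-and-retry behaviour follows. The function name and the `copies` argument (one extracted copy of the packet per data fragment that contains it) are hypothetical stand-ins for the download and extraction steps, and SHA-256 is an assumed hash algorithm.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fetch_valid_packet(expected_hash: str, copies) -> bytes:
    """Return the first extracted copy whose hash matches the locally stored hash.

    `copies` is an iterable of packet bytes, one per data fragment that
    contains the packet (a stand-in for download plus extraction).
    """
    for candidate in copies:
        if sha256_hex(candidate) == expected_hash:
            return candidate                      # verification passed: the packet is valid
    raise LookupError("no valid copy of the data packet was found")
```

  • Passing a lazy iterable (for example a generator that downloads one fragment per step) would avoid fetching further fragments once a valid copy has been found.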
  • In step S365, the current data packet is decrypted with its corresponding key, and the key of the next data packet is determined from the ciphertext of the current data packet.
  • In step S366, the next data packet becomes the current data packet.
  • After the validity of the current data packet is confirmed, it can be decrypted with its corresponding key. Because the key of the first data packet does not depend on any other data packet, the first data packet can be decrypted directly with its key. Decryption may use the 128-bit or 256-bit key obtained at encryption time.
  • The key of the next data packet is then determined from the ciphertext of the current data packet.
  • For example, the ciphertext of the first data packet may be combined with a set number of fixed characters to form the key of the next data packet. The next data packet then becomes the current data packet and is downloaded and verified in the manner described above.
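  • The key chaining can be sketched as follows. Several parts are assumptions and not taken from the application: the fixed characters in `FIXED_CHARS`, hashing the combination with SHA-256 to obtain a 256-bit key, and above all the toy keystream cipher, which only stands in for a real symmetric cipher such as AES and should not be used to protect data.

```python
import hashlib

FIXED_CHARS = b"fixed-characters"     # hypothetical "set number of fixed characters"


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 keystream XOR); a stand-in for a real cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def next_key(ciphertext: bytes) -> bytes:
    """Derive a 256-bit key from the previous packet's ciphertext plus fixed characters."""
    return hashlib.sha256(ciphertext + FIXED_CHARS).digest()


def encrypt_chain(packets: list[bytes], first_key: bytes) -> list[bytes]:
    ciphertexts, key = [], first_key
    for plain in packets:
        cipher = keystream_xor(key, plain)
        ciphertexts.append(cipher)
        key = next_key(cipher)        # key for the following data packet
    return ciphertexts


def decrypt_chain(ciphertexts: list[bytes], first_key: bytes) -> list[bytes]:
    plains, key = [], first_key
    for cipher in ciphertexts:
        plains.append(keystream_xor(key, cipher))
        key = next_key(cipher)        # only the ciphertext is needed to continue the chain
    return plains
```

  • Under these assumptions, a packet other than the first cannot be decrypted without the ciphertext of its predecessor, which is why restoration proceeds in encryption order.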
  • In step S367, it is determined whether all data packets have been downloaded; if so, step S370 is performed; otherwise, step S368 is performed.
  • The storage file is restored by splicing a plurality of data packets, so it can be obtained only after all data packets have been obtained.
  • If all data packets have been downloaded, the storage file is restored directly from the data packets; otherwise, step S368 is performed to continue obtaining the missing data packets.
  • In step S368, it is determined whether the current data packet is stored in an already downloaded data fragment; if so, step S364 is performed; otherwise, step S362 is performed.
  • Because each data fragment includes a subset of the data packets, a data fragment downloaded from the current packet node contains, besides the current data packet, other data packets that may or may not have already undergone hash verification.
  • If the current data packet is contained in a previously downloaded data fragment, the storage node where it is located no longer needs to be determined from the correspondence between data fragments and data packets and the correspondence between storage nodes and the data fragments they store.
  • The current data packet can be hash-verified directly and then decrypted with the key formed from the ciphertext of the previous data packet.
  • For example, suppose the data fragment downloaded from the current packet node includes the first, second, fourth, and fifth data packets, where the first data packet has already been hash-verified and the fourth and fifth data packets have not. When the fourth or fifth data packet is later processed as the current data packet, no data fragment needs to be downloaded again; the fourth or fifth data packet contained in the data fragment already downloaded for the second data packet is used directly as the current data packet.
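  • The reuse of already downloaded data fragments in steps S362 to S368 might look like the sketch below. The record layout, the `download` callable, and SHA-256 are hypothetical. As a simplification, every packet of a freshly downloaded fragment is hash-checked at once rather than only when it becomes the current data packet, and decryption (step S365) is left out.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def restore_packets(order, packet_hashes, packet_to_fragments, fragment_to_node, download):
    """Collect all packets in encryption order, reusing fragments already downloaded.

    order               -- packet ids in encryption order (local record)
    packet_hashes       -- packet id -> expected hash (local record)
    packet_to_fragments -- packet id -> ids of the fragments containing it
    fragment_to_node    -- fragment id -> storage node
    download            -- hypothetical callable(node, fragment_id) -> {packet id: bytes}
    """
    verified = {}                 # packets that passed the hash check (S364)
    downloaded = set()            # fragment ids fetched so far
    for packet_id in order:
        for fragment_id in packet_to_fragments[packet_id]:
            if packet_id in verified:
                break             # S368: already available from an earlier download
            if fragment_id in downloaded:
                continue          # this fragment held no valid copy; try another one
            downloaded.add(fragment_id)
            fragment = download(fragment_to_node[fragment_id], fragment_id)   # S362, S363
            for pid, data in fragment.items():
                if sha256_hex(data) == packet_hashes[pid]:
                    verified.setdefault(pid, data)
        if packet_id not in verified:
            raise LookupError(f"no valid copy of data packet {packet_id}")
    return [verified[pid] for pid in order]       # S370 would decrypt and splice these
```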
  • In step S370, the storage file is restored from the data packets.
  • In this embodiment of the present application, each data packet is downloaded from the storage nodes according to the locally recorded correspondence between data fragments and data packets and the correspondence between storage nodes and the data fragments they store, and the storage file is restored from the data packets. This can effectively improve the privacy and security of the stored file and thus prevent an attacker from restoring the original file.
  • The distributed storage method provided in the embodiments of the present application can be applied to data storage in various distributed networks.
  • In this embodiment, a blockchain network is used for the distributed storage of the storage file.
  • A blockchain system generally includes multiple nodes, each of which can work independently: on the one hand, a node can act as a lease node and perform the preparations that precede storage; on the other hand, it can act as a storage node and accept storage tasks requested by other nodes.
  • A blockchain system is a decentralized network whose nodes can work collaboratively based on protocols such as consensus mechanisms.
  • FIG. 4 is a flowchart of a distributed storage method according to Embodiment 4 of the present application.
  • The method includes step S410, step S420, step S430, step S440, step S450, and step S460.
  • In step S410, the file to be stored is grouped to form a plurality of data packets.
  • In step S420, all data packets are formed into at least three data fragments, wherein each data fragment includes some of the data packets, and each data packet is added to at least two data fragments.
  • In step S430, the data fragments are stored separately in storage nodes of the blockchain network.
  • Any node, or any electronic device, that has a storage requirement can become a lease node, that is, a node that requests to lease storage space from other nodes.
  • Before storing the file, the lease node performs preparations such as fragmenting the file.
  • The lease node also determines, in the blockchain network, the storage nodes that will serve it; each such storage node may be referred to as a lessor node (a node that rents out its storage space).
  • The storage nodes may be determined through offline negotiation, or by initiating in the blockchain network a smart contract that embodies the storage-space leasing process; the node requesting the contract is the lease node.
  • The lease node transmits the data fragments to the storage nodes for storage.
  • In step S440, the storage relationship of the data fragments in the storage nodes is provided, as a smart contract, to a block generation node in the blockchain network, so that the smart contract is added to a block for storage.
  • The smart contract determined above, which embodies the process of leasing storage space, is transmitted in the blockchain network.
  • The block generation node that currently holds the block processing authority processes the smart contracts generated in the current period and packages them into a block.
  • The block generation node may obtain the block generation authority through any of various consensus mechanisms; during the period in which it holds that authority, different lease nodes may generate different smart contracts.
  • The block generation node can process a smart contract in ways that include, but are not limited to, verifying, converting, encrypting, and storing its content. For example, leasing the storage space of other nodes may require payment of a fee; the payment amount is stated in the smart contract and confirmed by the signature of the lease node.
  • The block generation node may transfer the payment amount from the account of the lease node to the account of the lessor node (the node renting out storage space) according to the provisions of the smart contract.
  • The block generated by the block generation node is broadcast to the other nodes of the blockchain network for verification; once verified, the block takes effect and is added to the tail of the blockchain.
  • The tamper resistance of the smart contract is guaranteed by the characteristics of the blockchain.
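  • Why appending a contract to a hash-linked chain makes later modification detectable can be shown with a deliberately reduced sketch. It models only the hash link between blocks; consensus, signatures, broadcasting, and payment are omitted, and the block layout and contract fields (`lease_node`, `storage_node`, `fragment`, `fee`) are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block over its canonical JSON form (an assumed serialization)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, contracts: list) -> dict:
    """Package contracts into a block linked to the current tail of the chain."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "prev_hash": prev, "contracts": contracts}
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

  • In this sketch, changing any field of an earlier contract changes that block's hash, so the link stored in the following block no longer matches.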
  • In step S450, the correspondence between data fragments and data packets, and the correspondence between storage nodes and the data fragments they store, are recorded.
  • The lease node can record these correspondences locally to facilitate later recovery of the stored data.
  • In step S460, the local file to be stored is deleted.
  • Nodes in a distributed blockchain network can communicate with each other.
  • Any node may become a lease node or a lessor node at any time.
  • A lease node is a node that has a file storage requirement and uploads the file to be stored; a lessor node stores the data fragments corresponding to the file and may also be referred to as a storage node.
  • For example, when a node (such as a node representing Baidu Netdisk) has a file storage requirement, it starts the preparation process for the lease.
  • The file to be stored may first be grouped at the node with the lease requirement to form a plurality of data packets, and the data packets are encrypted one by one with keys, wherein the key of every data packet except the first is generated from the ciphertext of the previous data packet.
  • The hash value of each data packet may be calculated in the encryption order of the data packets to form the original Merkle tree.
  • The node with the lease requirement randomly combines the data packets in pairs to form at least three data fragments.
  • Each data fragment includes some of the data packets, and each data packet is added to at least two data fragments.
  • The node with the lease requirement records the hash values of the data packets included in each data fragment in the form of a Merkle tree, as a fragmented Merkle tree, and records the correspondence between each fragmented Merkle tree and the storage node where the data fragment is located.
  • The node with the lease requirement then transmits the data fragments to the respective storage nodes, so that the data fragments are delivered into the distributed blockchain network.
  • The storage relationship of the data fragments in the storage nodes can be provided, as a smart contract, to a block generation node in the blockchain network, so that the smart contract is added to a block for storage.
  • Applied in the field of blockchain technology, the distributed storage method provided by this embodiment of the present application makes it convenient for users to store files in a distributed blockchain network so as to reduce storage cost, and can effectively improve the privacy and security of stored files, preventing an attacker from restoring the original file.
  • FIG. 5 is a schematic diagram of a distributed storage device according to Embodiment 5 of the present application. As shown in FIG. 5, the device includes a data grouping module 510, a data fragmenting module 520, a data storage module 530, a relationship recording module 540, and a file deletion module 550, wherein:
  • the data grouping module 510 is configured to group the file to be stored to form a plurality of data packets;
  • the data fragmenting module 520 is configured to form all data packets into at least three data fragments, wherein each data fragment includes some of the data packets, and each data packet is added to at least two data fragments;
  • the data storage module 530 is configured to store each data fragment in the distributed storage nodes;
  • the relationship recording module 540 is configured to record the correspondence between data fragments and data packets, and the correspondence between storage nodes and the data fragments they store;
  • the file deletion module 550 is configured to delete the local file to be stored.
  • In this embodiment, the file to be stored is grouped into a plurality of data packets and all data packets are formed into at least three data fragments, where each data fragment includes some of the data packets and each data packet is added to at least two data fragments.
  • Each data fragment is then stored in the distributed storage nodes, which realizes distributed storage of the data.
  • Distributed storage removes the bottleneck of centralized storage and reduces bandwidth and storage cost, and storing multiple copies of each data packet prevents the failure of some storage nodes from making the data as a whole unrecoverable.
  • Because the data fragment stored in each storage node does not include all data packets, the original storage file cannot be recovered by compromising a single storage node.
  • The foregoing technical solution addresses the continually rising storage cost of the related cloud storage technology and the insecure data storage of distributed storage technology; it makes it convenient for users to store files in a distributed network so as to reduce storage cost, and can effectively improve the privacy and security of stored files.
  • The device further includes a data encryption module, configured to encrypt the data packets one by one with keys, wherein the key of every data packet except the first is generated from the ciphertext of the previous data packet, and to record the encryption order of the data packets.
  • The relationship recording module 540 is configured to record the hash values of the data packets included in each data fragment in the form of a Merkle tree, as a fragmented Merkle tree, and to record the correspondence between each fragmented Merkle tree and the storage node where the data fragment is located.
  • The data encryption module is further configured to calculate the hash value of each data packet in the encryption order of the data packets to form the original Merkle tree.
  • The device further includes a file recovery module, configured to: when a storage file query request is generated, download each data packet from the storage nodes according to the locally recorded correspondence between data fragments and data packets and the correspondence between storage nodes and the data fragments they store; and restore the storage file from the data packets.
  • The file recovery module is configured to: determine the first data packet as the current data packet according to the locally recorded encryption order of the data packets; determine the storage node where the current data packet is located as the current packet node according to the correspondence between data fragments and data packets and the correspondence between storage nodes and the data fragments they store; download a data fragment from the current packet node and extract the current data packet from it; decrypt the current data packet with its corresponding key and determine the key of the next data packet from the ciphertext of the current data packet; update the next data packet to be the current data packet; when the current data packet is stored in an already downloaded data fragment, return to the decryption operation; and when it is not, return to the operation of determining the current packet node, until all data packets have been downloaded.
  • The file recovery module is further configured to calculate the hash value of the extracted current data packet and match it against the locally stored hash value of the current data packet to verify the validity of the current data packet.
  • The data storage module 530 is configured to store the data fragments separately in storage nodes of the blockchain network, and to provide the storage relationship of the data fragments in the storage nodes, as a smart contract, to a block generation node in the blockchain network, so that the smart contract is added to a block for storage.
  • The distributed storage device can execute the distributed storage method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method.
  • For technical details not described in this embodiment, refer to the distributed storage method provided by any embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a computer device according to Embodiment 6 of the present application.
  • FIG. 6 shows a block diagram of a computer device 612 suitable for use in implementing embodiments of the present application.
  • The computer device 612 shown in FIG. 6 is merely an example and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • Computer device 612 is embodied in the form of a general-purpose computing device.
  • Components of computer device 612 may include, but are not limited to, one or more processors 616, storage device 628, and bus 618, which connects the various system components, including storage device 628 and processor 616.
  • Bus 618 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures.
  • For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 612 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 612, including volatile and nonvolatile media, removable and non-removable media.
  • Storage device 628 can include computer system readable media in the form of volatile memory, such as random access memory (RAM) 630 and/or cache memory 632.
  • Computer device 612 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • For example, storage system 634 can read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard disk drive").
  • Although not shown in FIG. 6, a disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory, CD-ROM) may be provided.
  • Storage device 628 can include at least one program product having a set (for example, at least one) of program modules configured to perform the functions of each embodiment of the present application.
  • Program 636, having a set (at least one) of program modules 626, may be stored, for example, in storage device 628. Such program modules 626 include, but are not limited to, an operating system, one or more applications, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • Program module 626 typically performs the functions and/or methods of the embodiments described herein.
  • Computer device 612 can also communicate with one or more external devices 614 (for example, a keyboard, pointing device, camera, or display 624), with one or more devices that enable a user to interact with computer device 612, and/or with any device (for example, a network card or modem) that enables computer device 612 to communicate with one or more other computing devices. This communication can take place via an input/output (I/O) interface 622.
  • Computer device 612 can also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through network adapter 620.
  • As shown, network adapter 620 communicates with the other modules of computer device 612 via bus 618. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 612, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
  • The processor 616 executes various functional applications and data processing by running programs stored in the storage device 628, for example, implementing the distributed storage method provided by the above-described embodiments of the present application.
  • That is, the method includes: grouping the file to be stored to form a plurality of data packets; forming all data packets into at least three data fragments, wherein each data fragment includes some of the data packets, and each data packet is added to at least two data fragments; storing each data fragment in the distributed storage nodes; recording the correspondence between data fragments and data packets, and the correspondence between storage nodes and the data fragments they store; and deleting the local file to be stored.
  • Distributed storage removes the bottleneck of centralized storage and reduces bandwidth and storage cost, and storing multiple copies of each data packet prevents the failure of some storage nodes from making the data as a whole unrecoverable.
  • Because the data fragment stored in each storage node does not include all data packets, the original storage file cannot be recovered by compromising a single storage node.
  • The foregoing technical solution addresses the continually rising storage cost of the related cloud storage technology and the insecure data storage of distributed storage technology; it makes it convenient for users to store files in a distributed network so as to reduce storage cost, and can effectively improve the privacy and security of stored files.
  • Embodiment 7 of the present application further provides a computer storage medium storing a computer program which, when executed by a computer processor, performs the distributed storage method according to any one of the foregoing embodiments of the present application: grouping the file to be stored to form a plurality of data packets; forming all data packets into at least three data fragments, wherein each data fragment includes some of the data packets, and each data packet is added to at least two data fragments; storing each data fragment in the distributed storage nodes; recording the correspondence between data fragments and data packets, and the correspondence between storage nodes and the data fragments they store; and deleting the local file to be stored.
  • The computer storage medium of the embodiments of the present application may employ any combination of one or more computer readable media.
  • The computer readable medium can be a computer readable signal medium or a computer readable storage medium.
  • The computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • More specific examples of a computer readable storage medium include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM) or flash memory, an optical fiber, a portable compact disk read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • A computer readable storage medium can be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus, or device.
  • A computer readable signal medium may include a data signal that is propagated in baseband or as part of a carrier wave and that carries computer readable program code. Such a propagated data signal can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • The computer readable signal medium can also be any computer readable medium other than a computer readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • In the case of a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network or a wide area network, or can be connected to an external computer (for example, through the Internet using an Internet service provider).

Abstract

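Disclosed herein are a distributed storage method and device, a computer device, and a storage medium. The method includes: grouping a file to be stored to form a plurality of data packets; forming all data packets into at least three data fragments, wherein each data fragment includes some of the data packets, and each data packet is added to at least two data fragments; storing each data fragment in distributed storage nodes; recording the correspondence between data fragments and data packets, and the correspondence between storage nodes and the data fragments they store; and deleting the local file to be stored.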

Description

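Distributed storage method and device, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 201810479464.3, filed with the Chinese Patent Office on May 18, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to the field of data storage technologies, for example, to a distributed storage method and device, a computer device, and a storage medium.
Background
Cloud storage technology in the related art generally stores data on centralized servers. As the amount of stored data grows, it occupies a large share of server storage space and bandwidth, and the cost of cloud storage keeps rising. Moreover, with the related cloud storage technology, the data kept in the cloud is not encrypted, and its privacy is backed only by the reputation of large cloud storage providers.
If distributed storage technology is used instead, decentralized storage of data leads to decentralized trust, which in turn makes data storage insecure: storage nodes are unstable, so data is easily lost and easily attacked.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide a distributed storage method and device, a computer device, and a storage medium, which make it convenient for users to store files in a distributed network so as to reduce storage cost, and which can effectively improve the privacy and security of stored files, thereby preventing an attacker from restoring the original file.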
本申请实施例提供了一种分布式存储方法,包括:
将待存储文件进行分组,形成多个数据分组;
将全部数据分组形成至少三个数据分片,其中,每个数据分片中包括部分 数据分组,且每个数据分组添加到至少两个数据分片中;
将每个数据分片在分布式存储节点中进行分布式存储;
记录数据分片与数据分组的对应关系,以及所述存储节点与所存储数据分片的对应关系;
删除本地的待存储文件。
本申请另一实施例还提供了一种分布式存储装置,包括:
数据分组模块,设置为将待存储文件进行分组,形成多个数据分组;
数据分片模块,设置为将全部数据分组形成至少三个数据分片,其中,每个数据分片中包括部分数据分组,且每个数据分组添加到至少两个数据分片中;
数据存储模块,设置为将每个数据分片在分布式存储节点中进行分布式存储;
关系记录模块,设置为记录数据分片与数据分组的对应关系,以及所述存储节点与所存储数据分片的对应关系;
文件删除模块,设置为删除本地的待存储文件。
本申请另一实施例还提供了一种计算机设备,所述计算机设备包括:
一个或多个处理器;
存储装置,设置为存储一个或多个程序;
当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现本申请任意实施例所提供的分布式存储方法。
本申请另一实施例还提供了一种计算机存储介质,其上存储有计算机程序,该程序被处理器执行时实现本申请任意实施例所提供的分布式存储方法。
在阅读并理解了附图和详细描述后,可以明白其他方面。
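Brief Description of the Drawings
FIG. 1 is a flowchart of a distributed storage method provided by Embodiment 1 of the present application;
FIG. 2a is a flowchart of a distributed storage method provided by Embodiment 2 of the present application;
FIG. 2b is a schematic structural diagram of the original Merkle tree involved in Embodiment 2 of the present application;
FIG. 3a is a flowchart of a distributed storage method provided by Embodiment 3 of the present application;
FIG. 3b is a flowchart of the method for restoring a stored file in the distributed storage method provided by Embodiment 3 of the present application;
FIG. 4 is a flowchart of a distributed storage method provided by Embodiment 4 of the present application;
FIG. 5 is a schematic diagram of a distributed storage apparatus provided by Embodiment 5 of the present application;
FIG. 6 is a schematic structural diagram of a computer device provided by Embodiment 6 of the present application.
Detailed Description
The present application is described in detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are intended only to explain the present application, not to limit it.
For ease of description, the drawings show only the parts relevant to the present application rather than the whole. Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.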
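Embodiment 1
FIG. 1 is a flowchart of a distributed storage method provided by Embodiment 1 of the present application. This embodiment is applicable to storing files in a distributed network. The method may be performed by a distributed storage apparatus, which may be implemented in software and/or hardware and can generally be integrated into any computer device capable of initiating data storage. As shown in FIG. 1, the method includes steps S110, S120, S130, S140, and S150.
In step S110, the file to be stored is divided into a plurality of data groups.
The file to be stored may be text, a picture, video, audio, or any other type of storable file (such as a compressed file in zip format); the embodiments of the present application do not limit the type of the file to be stored. A data group is a portion of the file data of the file to be stored.
In this embodiment, before the file to be stored is stored in a distributed manner, it is first divided into a plurality of data groups. The file may be divided evenly into N data groups so that each data group contains the same amount of file data, or it may be divided randomly so that the data groups contain different amounts of file data. Of course, those skilled in the art may, according to actual needs and within the technical context of this solution, devise other ways of grouping the file; the embodiments of the present application place no limitation on this.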
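The two grouping strategies described above (even division and random division) can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names `split_evenly` and `split_randomly` are hypothetical, and letting the first few groups be one byte longer when the file size is not divisible by N is an assumption of this sketch.

```python
import random


def split_evenly(data: bytes, n: int) -> list[bytes]:
    """Split data into exactly n groups whose sizes differ by at most one byte."""
    if n <= 0:
        raise ValueError("n must be positive")
    base, extra = divmod(len(data), n)
    groups, pos = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first groups
        groups.append(data[pos:pos + size])
        pos += size
    return groups


def split_randomly(data: bytes, n: int, rng: random.Random) -> list[bytes]:
    """Split data into n non-empty groups at randomly chosen cut points."""
    if not 1 <= n <= len(data):
        raise ValueError("need 1 <= n <= len(data)")
    cuts = sorted(rng.sample(range(1, len(data)), n - 1))  # n-1 distinct interior cuts
    bounds = [0, *cuts, len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]
```

In both cases concatenating the groups in order reproduces the original file, which is what the later recovery step relies on.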
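In step S120, at least three data shards are formed from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards.
In this embodiment, a data shard is made up of some of the data groups; that is, no data shard contains all of the data groups. The number of data groups in each data shard may be the same or different. For example, a data shard may contain 2, 5, 8, or more data groups; the embodiments of the present application do not limit the number of data groups a data shard contains.
To improve the security of data storage, all of the data groups may be formed into at least three data shards in such a way that every data group is added to at least two data shards, which guarantees that every data group has at least two stored replicas. When groups are assembled into shards, the data groups are stored redundantly: with M-replica storage, a data group appears in M data shards. The number of replicas M is greater than or equal to 2; it may be set in advance or adjusted dynamically according to the actual situation, for example the importance level of the stored file or the stability of the storage nodes.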
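One way to satisfy the constraints above (at least three shards, each shard holding only part of the groups, each group present in M shards) is a round-robin placement. The sketch below is an assumption made for illustration; the source does not prescribe a placement rule, and `form_shards` is a hypothetical name. It works on group indices rather than group contents.

```python
def form_shards(num_groups: int, num_shards: int, replicas: int) -> list[list[int]]:
    """Place group g into shards g, g+1, ..., g+replicas-1 (mod num_shards).

    Returns, for each shard, the list of group indices it contains.
    """
    if num_shards < 3:
        raise ValueError("at least three shards are required")
    if not 2 <= replicas < num_shards:
        raise ValueError("need 2 <= replicas < num_shards so no shard holds every group")
    if num_groups < num_shards:
        raise ValueError("this simple placement assumes num_groups >= num_shards")
    shards: list[list[int]] = [[] for _ in range(num_shards)]
    for g in range(num_groups):
        for r in range(replicas):
            shards[(g + r) % num_shards].append(g)
    return shards
```

With six groups, three shards, and two replicas, every shard ends up with four of the six groups, so compromising a single shard never yields the whole file.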
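In step S130, each data shard is stored in distributed storage nodes.
The storage nodes of the distributed network are nodes that each work independently and can be scheduled by a distributed storage algorithm.
In this embodiment, what is stored in the distributed storage nodes is the data shards formed from the data groups, rather than the data groups formed directly from the file to be stored. Optionally, each distributed storage node may store only one data shard.
Accordingly, when each data shard is stored in the distributed storage nodes, Reed-Solomon redundancy may be used. This algorithm corrects erroneous data by means of polynomial arithmetic, i.e. an erasure code, so the data file can still be recovered and accessed even if some nodes go offline or some data is corrupted. For example, a file to be stored is divided into a plurality of data groups which, after being formed into data shards, are spread over N distributed storage nodes with M-replica redundancy (for example 30 nodes with 3 replicas), each distributed storage node storing a portion of the data groups. As long as N/M healthy distributed storage nodes survive, the original file to be stored can be recovered.
In one example, 3-replica redundant storage is used. With 30 distributed storage nodes, a usable storage service can be provided as long as 10 healthy distributed storage nodes are alive at the same time, so service is lost only if 21 nodes fail. Assuming each distributed storage node has a reliability of only 50%, a simple calculation gives the service stability of the distributed network as f = 1 - (1 - 50%)^21, i.e. about 99.99995%.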
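The arithmetic in the example above can be checked with a few lines of Python. The sketch reproduces the text's simple estimate (the probability that 21 nodes do not all fail, at the stated 50% per-node reliability); it is not an exact binomial availability model, and `chance_not_all_fail` is a hypothetical helper name.

```python
def chance_not_all_fail(node_reliability: float, nodes: int) -> float:
    """Probability that a fixed set of `nodes` independent nodes does not fail all at once."""
    return 1 - (1 - node_reliability) ** nodes


# 30 nodes, 10 must stay alive, so 21 simultaneous failures are needed to lose service.
f = chance_not_all_fail(0.5, 21)
print(f"{f:.5%}")
```

Since 0.5 raised to the 21st power is 1/2097152, the result agrees with the figure of about 99.99995% quoted in the text.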
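In step S140, the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store, are recorded.
The correspondence between data shards and data groups makes it possible to find the data shards that contain a given data group, and the correspondence between storage nodes and stored data shards makes it possible to find the storage node that holds a given data shard, so that the shard can be downloaded from that node. The two correspondences together can also be used to verify data groups.
In this embodiment, the correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards allow the privacy of the data groups and data shards to be protected.
In one example, the correspondence between data shards and data groups may be: data shard 1 includes the data groups numbered 1, 2, and 3. The correspondence between storage nodes and stored data shards may be: storage node 5 stores the data shard numbered 1.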
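The two correspondences can be held in two small lookup tables. In the sketch below, only "shard 1 contains groups 1, 2, 3" and "node 5 stores shard 1" come from the text; the remaining entries, the dictionary layout, and the function name `nodes_holding_group` are assumptions made for illustration.

```python
# shard id -> ids of the data groups it contains
shard_to_groups = {1: [1, 2, 3], 2: [2, 3, 4], 3: [1, 4]}
# storage node id -> id of the shard it stores
node_to_shard = {5: 1, 8: 2, 9: 3}


def nodes_holding_group(group_id: int,
                        shard_to_groups: dict[int, list[int]],
                        node_to_shard: dict[int, int]) -> list[int]:
    """Resolve a data group to the storage nodes from which it can be downloaded."""
    shards = {s for s, groups in shard_to_groups.items() if group_id in groups}
    return sorted(node for node, shard in node_to_shard.items() if shard in shards)
```

Because both tables are kept only locally, a storage node that sees a single shard learns neither which other shards exist nor where they are stored.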
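In step S150, the local copy of the file to be stored is deleted.
Accordingly, once every data shard has been stored and every correspondence has been recorded, the local copy of the file to be stored can be deleted to prevent it from being obtained by an attacker.
In this embodiment, the file to be stored is divided into a plurality of data groups, all of the data groups are formed into at least three data shards, each data shard includes some of the data groups, each data group is added to at least two data shards, and each data shard is then stored in distributed storage nodes, thereby achieving distributed storage of the data. Distributed storage resolves the bottleneck of centralized storage and reduces bandwidth and storage costs, and storing multiple replicas of each data group prevents the data as a whole from becoming unrecoverable when some storage nodes fail. Furthermore, because the data shard stored on any one storage node does not contain all of the data groups, the original stored file cannot be recovered by compromising a single storage node. The above technical solution addresses both the continually rising storage cost of the related cloud storage technology and the insecure data storage caused by distributed storage technology; it makes it convenient for users to store files in a distributed network so as to reduce storage cost, and it effectively improves the privacy and security of stored files, thereby preventing an attacker from recovering the original file.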
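Embodiment 2
FIG. 2a is a flowchart of a distributed storage method provided by Embodiment 2 of the present application. This embodiment refines the preceding embodiment and gives an implementation in which the data groups are encrypted. In addition, recording the correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards is refined as: recording, in the form of a Merkle tree, the hash values of the data groups included in a data shard, as a shard Merkle tree; and recording the correspondence between each shard Merkle tree and the storage node on which each data shard resides. Accordingly, as shown in FIG. 2a, the method of this embodiment may include steps S210, S220, S230, S240, S250, S260, and S270.
In step S210, the file to be stored is divided into a plurality of data groups.
In step S220, each data group is encrypted sequentially with a key, where the key of every data group other than the first is generated from the ciphertext of the preceding data group; and the encryption order of the data groups is recorded.
In this embodiment, to improve the security of the data groups, each data group may be encrypted after the file to be stored has been divided. Optionally, the data groups are encrypted sequentially: each data group is encrypted with a symmetric encryption algorithm and a block cipher mechanism, where 128 bits of data can be symmetrically encrypted at a time and the encryption key can be up to 256 bits long. The first data group may be encrypted on its own to produce its ciphertext. When the other data groups are subsequently encrypted, the ciphertext of the preceding data group is used as part of the input to obfuscate the output for the next data group. The key of the next data group may be computed from the ciphertext of the preceding data group; it may consist of a fixed part and a part computed from that ciphertext. Because conventional CPUs (Central Processing Units) have not had their instruction sets optimized for the block cipher algorithm, a brute-force attack on this sequential encryption scheme, in which each group depends on the preceding one, would incur an enormous cost. Moreover, even a leaked key does not leak the content of the stored file, because an attacker can only hope to decipher the content after obtaining all of the data groups and learning their encryption order. After the data groups have been encrypted sequentially, the encryption order of the data groups may also be recorded to make it easier to restore the original file later.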
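The key chaining described above can be sketched as follows. Two things in this sketch are assumptions rather than the patent's method: the cipher is a toy SHA-256 keystream XOR standing in for a real 128-bit block cipher with a key of up to 256 bits (it is not secure and is used only to keep the example dependency-free), and the next key is derived as SHA-256(fixed part + previous ciphertext). All function names are hypothetical.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR data with a SHA-256 based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def next_key(fixed_part: bytes, prev_ciphertext: bytes) -> bytes:
    """Derive a 256-bit key from a fixed part plus the previous group's ciphertext."""
    return hashlib.sha256(fixed_part + prev_ciphertext).digest()


def encrypt_chain(groups: list[bytes], first_key: bytes, fixed_part: bytes) -> list[bytes]:
    """Encrypt groups in order; each key after the first depends on the prior ciphertext."""
    key, ciphertexts = first_key, []
    for group in groups:
        ciphertext = keystream_xor(key, group)
        ciphertexts.append(ciphertext)
        key = next_key(fixed_part, ciphertext)
    return ciphertexts


def decrypt_chain(ciphertexts: list[bytes], first_key: bytes, fixed_part: bytes) -> list[bytes]:
    """Decrypt in the recorded order, re-deriving each key from the prior ciphertext."""
    key, groups = first_key, []
    for ciphertext in ciphertexts:
        groups.append(keystream_xor(key, ciphertext))
        key = next_key(fixed_part, ciphertext)
    return groups
```

Decryption only works when the ciphertexts are processed in the original order, which is why the method records the encryption order alongside the data.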
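In an optional embodiment of the present application, recording the encryption order of the data groups includes: computing the hash value of each data group in the encryption order of the data groups to form an original Merkle tree.
The original Merkle tree is the Merkle tree built from the hash values computed over the data groups. As an example, FIG. 2b is a schematic structural diagram of the original Merkle tree involved in Embodiment 2 of the present application. As shown in FIG. 2b, there are four data groups (DATA BLOCK). The hash value of each data group is computed, and these hashes form the leaf nodes (Hash-LEAF) of the original Merkle tree from left to right. The leaf nodes are then combined in pairs and hashed again to form the upper-level branches (Hash-BRANCH), until the root node (Hash-ROOT) of the original Merkle tree is computed. The original Merkle tree records not only the hash values of the data groups but also, through its tree structure, their order. Taking the second data group (DATA BLOCK2) as an example, the tree also records that it is stored in three data shards (DATA SHARD1, DATA SHARD2, and DATA SHARD3).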
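The construction of the original Merkle tree from four data groups can be sketched as below. SHA-256 and the rule of duplicating the last node on odd-sized levels are assumptions of this sketch; the source names neither a hash function nor a padding rule, and `merkle_levels` is a hypothetical helper.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return all tree levels: leaves first, root last (levels[-1][0] is the root)."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [sha256(b) for b in blocks]  # Hash-LEAF nodes, in encryption order
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:               # assumption: duplicate the last node if odd
            level = level + [level[-1]]
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)             # Hash-BRANCH levels, ending with Hash-ROOT
    return levels
```

Because each branch hashes its children in left-to-right order, swapping two groups changes the root, so the root commits to the encryption order as well as to the contents.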
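Accordingly, the encryption order of the data groups can be recorded as a Merkle tree, which improves the operating efficiency and scalability of the distributed network and can later serve as a verification credential when data is restored.
In step S230, at least three data shards are formed from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards.
In step S240, each data shard is stored in distributed storage nodes.
In step S250, the hash values of the data groups included in a data shard are recorded in the form of a Merkle tree, as a shard Merkle tree.
A shard Merkle tree is the Merkle tree built from the hash values computed over the data groups included in a data shard.
In this embodiment, a Merkle tree may be used to record the structure of each data shard. Because a sequence number (ID) and a hash of the corresponding content were assigned to each data group when the original Merkle tree was built, a shard Merkle tree can be computed for each data shard from the hash values of the data groups it contains. In a shard Merkle tree the order of the data groups need not match the original encryption order; the data groups may be paired arbitrarily when the hash branches of the tree are computed.
In step S260, the correspondence between each shard Merkle tree and the storage node on which each data shard resides is recorded.
Accordingly, once the shard Merkle tree of each data shard has been obtained, the correspondence between each shard Merkle tree and the storage node holding the data shard can be recorded. Each data shard corresponds to one shard Merkle tree.
In this embodiment, forming the original Merkle tree from the hash values of the sequentially encrypted data groups, and forming shard Merkle trees from the hash values of the data groups included in each data shard, makes it possible to look up and verify data groups and data shards and improves the privacy and security of the stored file, thereby effectively preventing an attacker from recovering the original file.
In step S270, the local copy of the file to be stored is deleted.
With the technical solution of this embodiment, encrypting the data groups sequentially greatly increases the difficulty of recovering the original stored file by attacking storage nodes, which effectively improves the privacy of distributed storage.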
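Embodiment 3
FIG. 3a is a flowchart of a distributed storage method provided by Embodiment 3 of the present application, and FIG. 3b is a flowchart of the method for restoring a stored file in that distributed storage method. This embodiment refines the preceding embodiments and gives an implementation in which the stored file is restored from the data groups. Accordingly, as shown in FIG. 3a, the method of this embodiment may include steps S310, S320, S330, S340, S350, S360, and S370.
In step S310, the file to be stored is divided into a plurality of data groups.
In step S320, at least three data shards are formed from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards.
In step S330, each data shard is stored in distributed storage nodes.
In step S340, the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store, are recorded.
In step S350, the local copy of the file to be stored is deleted.
In step S360, when a stored-file query request is generated, each data group is downloaded from the storage nodes according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards.
A stored-file query request may be a request sent by a user to obtain the stored file, for example to download it or preview it online.
In this embodiment, during data recovery, all of the data groups of the stored file can be obtained one by one, in order, according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards. The data groups obtained are then concatenated and decrypted in order to yield the complete stored file.
Accordingly, as shown in FIG. 3b, the process of restoring the stored file may include steps S361, S362, S363, S364, S365, S366, S367, and S368.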
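In step S361, the first data group is determined as the current data group according to the locally recorded encryption order of the data groups.
This may be done by looking up, in the locally recorded original Merkle tree, the data group corresponding to the first hash leaf node.
In step S362, the storage node holding the current data group is determined as the current group node according to the correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards.
Based on the hash value of the first data group, the corresponding hash value can be looked up in the shard Merkle trees, and the corresponding storage node is then determined from the data shard found.
In step S363, a data shard is downloaded from the current group node, and the current data group is extracted from the data shard.
During encryption, the key of each data group is derived from the ciphertext of the preceding data group. During recovery, therefore, the first data group in the encryption order can be obtained directly, and the remaining data groups are then obtained one by one starting from it.
After the first data group has been determined, it is taken as the current data group. The data shard containing the current data group is found from the correspondence between data shards and data groups, the storage node holding the current data group is determined as the current group node from the correspondence between storage nodes and stored data shards, and the data shard is downloaded from the current group node. Because the data shard stored on the current group node contains the current data group, the current data group can be extracted from the data shard according to its hash position in the shard Merkle tree associated with the current group node.
As an example, referring to FIG. 2b, the tree of hash values inside the dashed box is the original Merkle tree. The LEAF nodes of the original Merkle tree, namely Hash 1-LEAF, Hash 2-LEAF, Hash 3-LEAF, and Hash 4-LEAF, are the hash values formed from the data groups in order. The BRANCH nodes, namely Hash 5-BRANCH and Hash 6-BRANCH, are hash values computed from the LEAF nodes with the hash algorithm. The ROOT node, Hash 7-ROOT, is the root of the original Merkle tree and is likewise computed with the hash algorithm from the hash values of Hash 5-BRANCH and Hash 6-BRANCH. DATA BLOCK 1, DATA BLOCK 2, DATA BLOCK3, and DATA BLOCK4 are the data groups formed from the file to be stored. DATA SHARD 1, DATA SHARD 2, and DATA SHARD 3 are some of the data shards formed from the data groups (FIG. 2b does not show all of the data shards), and each of them contains the second data group. Accordingly, the hash position of the first data group (DATA BLOCK 1) is the Hash 1-LEAF node, and so on: the hash value of every data group is stored at its corresponding hash position.
FIG. 2b does not show the structure of a shard Merkle tree. A shard Merkle tree is formed from the hash values of the data groups that a data shard contains, so its LEAF nodes are the hash values of those data groups, and its other nodes are formed in the same way as in the original Merkle tree.
Accordingly, when the second data group (DATA BLOCK 2) is the current data group, its hash position in the original Merkle tree is Hash 2-LEAF, and the hash value stored there (say H) can be used to verify the hash values stored in the shard Merkle trees. Once the current data group has been determined, the storage node holding it can be determined as the current group node from all of the shard Merkle trees and the correspondence between each shard Merkle tree and its storage node. Several storage nodes may hold the current data group; one of them can be chosen as the current group node. The data group whose hash value is H in the shard Merkle tree associated with the current group node is the required second data group.
It follows that an attacker who wants to obtain the original file would have to download all of the data shards according to the data structure of their shard Merkle trees and also know the encryption order of all of the data groups, which is very hard to achieve. The distributed storage method of this embodiment therefore effectively prevents an attacker from obtaining the original file.
In step S364, the hash value of the extracted current data group is computed and matched against the locally stored hash value of the current data group to verify the validity of the current data group.
In this embodiment, after the current data group has been obtained, its validity can be verified. Because each data group has a different hash value, the hash value of a data group can serve as the basis for verification: the hash value of the current data group is matched against the locally stored hash value of the current data group. If they agree, verification passes and the current data group is valid. Otherwise, another data shard is selected according to the correspondence between data shards and data groups (for example using the shard Merkle trees) and the current data group is extracted from it, and this is repeated until the extracted current data group is confirmed to be valid.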
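The verify-then-fall-back behaviour of step S364 can be sketched as below. SHA-256 is an assumed hash function, and `first_valid_copy` together with its `(shard_id, data)` candidate list are hypothetical names introduced for illustration; the candidates stand for copies of the current data group extracted from different shards.

```python
import hashlib


def first_valid_copy(expected_hash: str,
                     candidates: list[tuple[int, bytes]]) -> tuple[int, bytes]:
    """Return the first (shard_id, data) whose SHA-256 matches the locally stored hash.

    A mismatch means the copy is corrupt or tampered with, so the next shard is tried.
    """
    for shard_id, data in candidates:
        if hashlib.sha256(data).hexdigest() == expected_hash:
            return shard_id, data
    raise LookupError("no shard returned a valid copy of the data group")
```

The redundancy from step S120 is what makes this fallback possible: every data group exists in at least two shards, so one bad copy does not block recovery.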
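In step S365, the current data group is decrypted with its corresponding key, and the key of the next data group is determined from the ciphertext of the current data group.
In step S366, the next data group becomes the current data group.
Accordingly, once the validity of the current data group has been confirmed, it can be decrypted with its corresponding key. Because the key of the first data group does not depend on any other data group, the first data group can be decrypted directly with its key. For decryption, the 128-bit or 256-bit key obtained at encryption time may be applied to the data group obtained. After the current data group has been decrypted, the key of the next data group is determined from the ciphertext of the current data group. Optionally, the ciphertext of the first data group may be combined with a set number of fixed characters to form the key of the next data group. The next data group then becomes the current data group and is processed in the same download-and-verify manner described above.
In step S367, it is determined whether all data groups have been downloaded. If so, step S370 is performed; otherwise, step S368 is performed.
In this embodiment, the stored file is restored by concatenating a plurality of data groups, so the stored file can be obtained only after all of the data groups have been obtained. When all data groups are determined to have been downloaded, the stored file is restored directly from the data groups; otherwise, S368 is performed to continue obtaining the missing data groups.
In step S368, it is determined whether the current data group is stored in an already downloaded data shard. If so, step S364 is performed; otherwise, step S362 is performed.
Because every data shard contains some of the data groups, a downloaded data shard contains, besides the current data group, other data groups that either have or have not yet undergone hash verification. When an already downloaded data shard contains the current data group, there is no need to determine the storage node holding the current data group from the two correspondences and download the corresponding data shard again; the current data group can be hash-verified directly and decrypted with the key formed from the ciphertext of the preceding data group.
As an example, suppose the current data group is the second data group and the data shard downloaded from the current group node contains the first, second, fourth, and fifth data groups. The first data group has already been hash-verified, whereas the fourth and fifth data groups have not. When the fourth or fifth data group later becomes the current data group, there is no need to download another data shard: the fourth or fifth data group contained in the data shard that was downloaded for the second data group is used directly as the current data group.
In step S370, the stored file is restored from the data groups.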
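Steps S361 to S370 can be put together in one loop, sketched below under several assumptions: the cipher is the same toy SHA-256 keystream XOR stand-in (not a real block cipher), only the leaf hashes of each shard Merkle tree are kept, `download` is a caller-supplied function that returns the blocks stored on a node, and all names are hypothetical.

```python
import hashlib
from typing import Callable


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure) standing in for the real symmetric block cipher."""
    stream, counter = bytearray(), 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def next_key(fixed_part: bytes, prev_ciphertext: bytes) -> bytes:
    return hashlib.sha256(fixed_part + prev_ciphertext).digest()


def restore_file(ordered_hashes: list[str],
                 shard_leaves: dict[int, list[str]],
                 shard_node: dict[int, int],
                 download: Callable[[int], list[bytes]],
                 first_key: bytes, fixed_part: bytes) -> bytes:
    cache: dict[str, bytes] = {}   # verified blocks from already downloaded shards
    downloaded: set[int] = set()
    key, parts = first_key, []
    for wanted in ordered_hashes:              # S361/S366: walk the encryption order
        if wanted not in cache:                # S368: not in a downloaded shard yet
            for shard_id, leaves in shard_leaves.items():   # S362: locate a shard
                if wanted in leaves and shard_id not in downloaded:
                    downloaded.add(shard_id)
                    for block in download(shard_node[shard_id]):  # S363
                        cache.setdefault(digest(block), block)    # S364: keyed by real hash
                    if wanted in cache:
                        break
        if wanted not in cache:
            raise LookupError(f"no valid copy found for group {wanted[:8]}")
        ciphertext = cache[wanted]
        parts.append(keystream_xor(key, ciphertext))   # S365: decrypt
        key = next_key(fixed_part, ciphertext)         # S365: derive the next key
    return b"".join(parts)                             # S370: reassemble the file
```

Indexing the cache by the hash actually computed from each downloaded block means a corrupted copy never satisfies a lookup, so the loop moves on to another shard that lists the same leaf hash.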
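In this embodiment, each data group is downloaded from the storage nodes according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards, and the stored file is restored from the data groups. This effectively improves the privacy and security of the stored file, thereby preventing an attacker from recovering the original file.
Embodiment 4
The distributed storage method provided by the embodiments of the present application is applicable to data storage in various kinds of distributed networks. In this embodiment, optionally, a blockchain network is used for distributed storage of the file to be stored. A blockchain system generally includes multiple nodes that can work independently: a node can act on its own as a node with a storage need and prepare data for storage, and it can also act as a storage node that accepts storage tasks requested by other nodes. A blockchain system is a decentralized network whose nodes can cooperate on the basis of protocols such as a consensus mechanism.
FIG. 4 is a flowchart of the distributed storage method provided by Embodiment 4 of the present application. The method includes steps S410, S420, S430, S440, S450, and S460.
In step S410, the file to be stored is divided into a plurality of data groups.
In step S420, at least three data shards are formed from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards.
In step S430, the data shards are stored on respective storage nodes in the blockchain network.
A node that has a storage need, or any electronic device, can become a leasing node, i.e. a node that requests to lease storage space from other nodes. Before storing a file, the leasing node first prepares the file by sharding it.
The leasing node also determines the storage nodes in the blockchain network that will serve it; each such storage node may be called a leased node. The storage nodes may be determined through offline negotiation, or by publishing in the blockchain network a smart contract that embodies the leasing of storage space, in which case the nodes that accept the contract are the leased nodes. After the storage nodes have been determined, the leasing node transmits the data shards to them for storage.
In step S440, the storage relationship of the data shards on the storage nodes is provided, as a smart contract, to a block-generating node in the blockchain network so that the smart contract is added to a block for storage.
The smart contract determined above, which embodies the leasing of storage space, is propagated through the blockchain network. The block-generating node that currently holds the right to process blocks handles the smart contracts currently generated and packs them into a block. A block-generating node may obtain the right to generate blocks through any of several consensus mechanisms, and during its period of authority different leasing nodes may generate different smart contracts. The block-generating node may process the smart contracts in ways that include, but are not limited to, verifying, converting, encrypting, and storing their content. For example, leasing the storage space of other nodes may require paying a fee; the payment amount is reflected in the smart contract and confirmed by the leasing node's signature. The block-generating node may transfer the payment amount from the leasing node's account to the leased node's account as specified in the smart contract. The block-generating node subsequently broadcasts the generated block to the other nodes of the blockchain network for verification, after which the block takes effect and is appended to the end of the blockchain. The inherent properties of the blockchain guarantee that the smart contract cannot be tampered with.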
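The bookkeeping a block-generating node performs on storage-lease contracts can be illustrated with a deliberately simplified sketch. Everything here is hypothetical: the `StorageContract` fields, the `settle` and `make_block` helpers, and the use of SHA-256 over JSON are assumptions, and signatures, consensus, and broadcasting are omitted. It does not correspond to any real blockchain platform's API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StorageContract:
    leasing_node: str    # node that requests storage space
    leased_node: str     # node that stores the shard
    shard_root: str      # root hash of the shard Merkle tree
    fee: int             # payment owed to the leased node


def settle(contracts: list[StorageContract],
           balances: dict[str, int]) -> tuple[list[StorageContract], dict[str, int]]:
    """Verify each contract is funded and move the fee to the leased node."""
    balances = dict(balances)
    accepted = []
    for c in contracts:
        if c.fee >= 0 and balances.get(c.leasing_node, 0) >= c.fee:
            balances[c.leasing_node] -= c.fee
            balances[c.leased_node] = balances.get(c.leased_node, 0) + c.fee
            accepted.append(c)
    return accepted, balances


def make_block(prev_hash: str, contracts: list[StorageContract]) -> dict:
    """Pack contracts into a block whose hash commits to them and to the previous block."""
    body = json.dumps([asdict(c) for c in contracts], sort_keys=True)
    block_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    return {"prev_hash": prev_hash, "contracts": body, "hash": block_hash}
```

Because the block hash covers the serialized contracts and the previous block's hash, altering a recorded contract changes the hash of that block and breaks the link from every later block, which is the tamper-evidence property the text relies on.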
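In step S450, the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store, are recorded.
The leasing node may record these correspondences locally to facilitate later recovery of the stored data.
In step S460, the local copy of the file to be stored is deleted.
In this application scenario, the nodes of the distributed blockchain network can communicate with one another, and any node may become a leasing node or a leased node at any time. A leasing node can upload a file to be stored, i.e. it is a node with a file storage need; a leased node can store the data shards of a file and may also be called a storage node.
When a node (for example, a node representing Baidu Netdisk) has a leasing need, that is, when it asks other nodes on the blockchain network to store a file jointly, it begins preparing to publish the leasing request. The file to be stored is first divided at that node into a plurality of data groups, and each data group is encrypted sequentially with a key, where the key of every data group other than the first is generated from the ciphertext of the preceding data group. After the data groups have been encrypted, the hash value of each data group is computed in the encryption order to form the original Merkle tree. Next, the node randomly combines the data groups in pairs to form at least three data shards; each data shard includes some of the data groups, and each data group is added to at least two data shards. At the same time, the node records the hash values of the data groups included in each data shard in the form of a Merkle tree, as a shard Merkle tree, and records the correspondence between each shard Merkle tree and the storage node on which each data shard resides. Finally, the node transmits the data shards to the respective storage nodes, thereby delivering the data shards into the distributed blockchain network. Once every data shard has been stored, the storage relationship of the data shards on the storage nodes can be provided, as a smart contract, to a block-generating node in the blockchain network so that the smart contract is added to a block for storage.
It follows that the distributed storage method provided by the embodiments of the present application, when applied in the field of blockchain technology, makes it convenient for users to store files in a distributed blockchain network so as to reduce storage cost, and effectively improves the privacy and security of stored files, thereby preventing an attacker from recovering the original file.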
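Embodiment 5
FIG. 5 is a schematic diagram of a distributed storage apparatus provided by Embodiment 5 of the present application. As shown in FIG. 5, the apparatus includes a data grouping module 510, a data sharding module 520, a data storage module 530, a relationship recording module 540, and a file deletion module 550, where:
the data grouping module 510 is configured to divide a file to be stored into a plurality of data groups;
the data sharding module 520 is configured to form at least three data shards from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards;
the data storage module 530 is configured to store each data shard in distributed storage nodes;
the relationship recording module 540 is configured to record the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store; and
the file deletion module 550 is configured to delete the local copy of the file to be stored.
In this embodiment, the file to be stored is divided into a plurality of data groups, all of the data groups are formed into at least three data shards, each data shard includes some of the data groups, each data group is added to at least two data shards, and each data shard is then stored in distributed storage nodes, thereby achieving distributed storage of the data. Distributed storage resolves the bottleneck of centralized storage and reduces bandwidth and storage costs, and storing multiple replicas of each data group prevents the data as a whole from becoming unrecoverable when some storage nodes fail. Furthermore, because the data shard stored on any one storage node does not contain all of the data groups, the original stored file cannot be recovered by compromising a single storage node. The above technical solution addresses both the continually rising storage cost of the related cloud storage technology and the insecure data storage caused by distributed storage technology; it makes it convenient for users to store files in a distributed network so as to reduce storage cost, and it effectively improves the privacy and security of stored files, thereby preventing an attacker from recovering the original file.
Optionally, the apparatus further includes a data encryption module, configured to encrypt each data group sequentially with a key, where the key of every data group other than the first is generated from the ciphertext of the preceding data group, and to record the encryption order of the data groups.
Optionally, the relationship recording module 540 is configured to record, in the form of a Merkle tree, the hash values of the data groups included in a data shard, as a shard Merkle tree, and to record the correspondence between each shard Merkle tree and the storage node on which each data shard resides.
Optionally, the data encryption module is configured to compute the hash value of each data group in the encryption order of the data groups to form an original Merkle tree.
Optionally, the apparatus further includes a file recovery module, configured to, when a stored-file query request is generated, download each data group from the storage nodes according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards, and to restore the stored file from the data groups.
Optionally, the file recovery module is configured to: determine the first data group as the current data group according to the locally recorded encryption order of the data groups; determine the storage node holding the current data group as the current group node according to the correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards; download a data shard from the current group node and extract the current data group from the data shard; decrypt the current data group with its corresponding key and determine the key of the next data group from the ciphertext of the current data group; make the next data group the current data group; when the current data group is stored in an already downloaded data shard, return to the decryption operation; and when the current data group is not stored in an already downloaded data shard, return to the operation of determining the current group node, until all data groups have been downloaded.
Optionally, the file recovery module is configured to compute the hash value of the extracted current data group and match it against the locally stored hash value of the current data group to verify the validity of the current data group.
Optionally, the data storage module 530 is configured to store the data shards on respective storage nodes in a blockchain network, and to provide the storage relationship of the data shards on the storage nodes, as a smart contract, to a block-generating node in the blockchain network so that the smart contract is added to a block for storage.
The above distributed storage apparatus can perform the distributed storage method provided by any embodiment of the present application and has the functional modules and beneficial effects corresponding to the method. For technical details not described exhaustively in this embodiment, refer to the distributed storage method provided by any embodiment of the present application.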
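Embodiment 6
FIG. 6 is a schematic structural diagram of a computer device provided by Embodiment 6 of the present application. FIG. 6 shows a block diagram of a computer device 612 suitable for implementing the embodiments of the present application. The computer device 612 shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer device 612 takes the form of a general-purpose computing device. The components of the computer device 612 may include, but are not limited to: one or more processors 616, a storage apparatus 628, and a bus 618 connecting the different system components (including the storage apparatus 628 and the processors 616).
The bus 618 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 612 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 612, including volatile and non-volatile media and removable and non-removable media.
The storage apparatus 628 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 630 and/or cache memory 632. The computer device 612 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 634 may read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly called a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 618 through one or more data media interfaces. The storage apparatus 628 may include at least one program product having a set of (for example at least one) program modules configured to perform the functions of the embodiments of the present application.
A program 636 having a set of (at least one) program modules 626 may be stored, for example, in the storage apparatus 628. Such program modules 626 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 626 generally perform the functions and/or methods of the embodiments described in the present application.
The computer device 612 may also communicate with one or more external devices 614 (such as a keyboard, a pointing device, a camera, or a display 624), with one or more devices that enable a user to interact with the computer device 612, and/or with any device (such as a network card or a modem) that enables the computer device 612 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 622. The computer device 612 may also communicate with one or more networks (such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through a network adapter 620. As shown in the figure, the network adapter 620 communicates with the other modules of the computer device 612 through the bus 618. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 612, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
The processor 616 runs the programs stored in the storage apparatus 628 to perform various functional applications and data processing, for example to implement the distributed storage method provided by the above embodiments of the present application.
That is, when executing the program, the processor implements: dividing a file to be stored into a plurality of data groups; forming at least three data shards from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards; storing each data shard in distributed storage nodes; recording the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store; and deleting the local copy of the file to be stored.
The computer device divides the file to be stored into a plurality of data groups, forms all of the data groups into at least three data shards, with each data shard including some of the data groups and each data group added to at least two data shards, and then stores each data shard in distributed storage nodes, thereby achieving distributed storage of the data. Distributed storage resolves the bottleneck of centralized storage and reduces bandwidth and storage costs, and storing multiple replicas of each data group prevents the data as a whole from becoming unrecoverable when some storage nodes fail. Furthermore, because the data shard stored on any one storage node does not contain all of the data groups, the original stored file cannot be recovered by compromising a single storage node. The above technical solution addresses both the continually rising storage cost of the related cloud storage technology and the insecure data storage caused by distributed storage technology; it makes it convenient for users to store files in a distributed network so as to reduce storage cost, and it effectively improves the privacy and security of stored files, thereby preventing an attacker from recovering the original file.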
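Embodiment 7
Embodiment 7 of the present application further provides a computer storage medium storing a computer program that, when executed by a computer processor, performs the distributed storage method of any of the above embodiments of the present application: dividing a file to be stored into a plurality of data groups; forming at least three data shards from all of the data groups, where each data shard includes some of the data groups and each data group is added to at least two data shards; storing each data shard in distributed storage nodes; recording the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store; and deleting the local copy of the file to be stored.
The computer storage medium of the embodiments of the present application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory, Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM) or flash memory, optical fiber, portable compact disc read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, Radio Frequency (RF), and so on, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network or a wide area network, or may be connected to an external computer (for example, through the Internet using an Internet service provider).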

Claims (11)

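  1. A distributed storage method, comprising:
    dividing a file to be stored into a plurality of data groups;
    forming at least three data shards from all of the data groups, wherein each data shard includes some of the data groups and each data group is added to at least two data shards;
    storing each data shard in distributed storage nodes;
    recording the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store; and
    deleting the local copy of the file to be stored.
  2. The method according to claim 1, further comprising, before forming at least three data shards from all of the data groups:
    encrypting each data group sequentially with a key, wherein the key of every data group other than the first is generated from the ciphertext of the preceding data group; and
    recording the encryption order of the data groups.
  3. The method according to claim 2, wherein recording the correspondence between data shards and data groups, and the correspondence between storage nodes and stored data shards, comprises:
    recording, in the form of a Merkle tree, the hash values of the data groups included in a data shard, as a shard Merkle tree; and
    recording the correspondence between each shard Merkle tree and the storage node on which each data shard resides.
  4. The method according to claim 3, wherein recording the encryption order of the data groups comprises:
    computing the hash value of each data group in the encryption order of the data groups to form an original Merkle tree.
  5. The method according to claim 3, further comprising:
    when a stored-file query request is generated, downloading each data group from the storage nodes according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards; and
    restoring the stored file from the data groups.
  6. The method according to claim 5, wherein downloading each data group from the storage nodes according to the locally recorded correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards comprises:
    determining the first data group as the current data group according to the locally recorded encryption order of the data groups;
    determining the storage node holding the current data group as the current group node according to the correspondence between data shards and data groups and the correspondence between storage nodes and stored data shards;
    downloading a data shard from the current group node and extracting the current data group from the data shard;
    decrypting the current data group with its corresponding key, and determining the key of the next data group from the ciphertext of the current data group;
    making the next data group the current data group;
    when the current data group is stored in an already downloaded data shard, returning to the decryption operation; and
    when the current data group is not stored in an already downloaded data shard, returning to the operation of determining the current group node, until all data groups have been downloaded.
  7. The method according to claim 6, further comprising, before decrypting the current data group with its corresponding key:
    computing the hash value of the extracted current data group and matching it against the locally stored hash value of the current data group to verify the validity of the current data group.
  8. The method according to any one of claims 1-7, wherein storing each data shard in distributed storage nodes comprises:
    storing the data shards on respective storage nodes in a blockchain network; and
    providing the storage relationship of the data shards on the storage nodes, as a smart contract, to a block-generating node in the blockchain network so that the smart contract is added to a block for storage.
  9. A distributed storage apparatus, comprising:
    a data grouping module, configured to divide a file to be stored into a plurality of data groups;
    a data sharding module, configured to form at least three data shards from all of the data groups, wherein each data shard includes some of the data groups and each data group is added to at least two data shards;
    a data storage module, configured to store each data shard in distributed storage nodes;
    a relationship recording module, configured to record the correspondence between data shards and data groups, and the correspondence between the storage nodes and the data shards they store; and
    a file deletion module, configured to delete the local copy of the file to be stored.
  10. A computer device, the device comprising:
    one or more processors; and
    a storage apparatus, configured to store one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the distributed storage method according to any one of claims 1-8.
  11. A computer storage medium storing a computer program that, when executed by a processor, implements the distributed storage method according to any one of claims 1-8.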
PCT/CN2019/072337 2018-05-18 2019-01-18 Distributed storage method and apparatus, computer device, and storage medium WO2019218717A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/766,151 US11842072B2 (en) 2018-05-18 2019-01-18 Distributed storage method and apparatus, computer device, and storage medium
JP2020530626A JP7044881B2 (ja) 2018-05-18 2019-01-18 Distributed storage method and apparatus, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810479464.3A CN108664223B (zh) 2018-05-18 2018-05-18 Distributed storage method and apparatus, computer device, and storage medium
CN201810479464.3 2018-05-18

Publications (1)

Publication Number Publication Date
WO2019218717A1 true WO2019218717A1 (zh) 2019-11-21

Family

ID=63776722

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/072337 WO2019218717A1 (zh) 2018-05-18 2019-01-18 Distributed storage method and apparatus, computer device, and storage medium

Country Status (4)

Country Link
US (1) US11842072B2 (zh)
JP (1) JP7044881B2 (zh)
CN (1) CN108664223B (zh)
WO (1) WO2019218717A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111176567A (zh) * 2019-12-25 2020-05-19 上海沄界信息科技有限公司 分布式云存储的存储供应量验证方法及装置
JPWO2021172589A1 (zh) * 2020-02-28 2021-09-02

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664223B (zh) * 2018-05-18 2021-07-02 百度在线网络技术(北京)有限公司 一种分布式存储方法、装置、计算机设备及存储介质
US11228445B2 (en) * 2018-06-19 2022-01-18 Docusign, Inc. File validation using a blockchain
CN111079193B (zh) * 2018-10-19 2023-03-28 华为云计算技术有限公司 数据存储方法、数据查询方法、装置及设备
CN109614037A (zh) * 2018-11-16 2019-04-12 新华三技术有限公司成都分公司 数据巡检方法、装置和分布式存储系统
US10949388B2 (en) * 2018-11-16 2021-03-16 Advanced Messaging Technologies, Inc. Systems and methods for distributed data storage and delivery using blockchain
CN109558081A (zh) * 2018-11-23 2019-04-02 深圳市威赫科技有限公司 一种数据存储机制及系统
CN109634932B (zh) * 2018-11-30 2021-03-23 北京瑞卓喜投科技发展有限公司 一种智能合约存储方法及存储系统
CN111338841A (zh) * 2018-12-19 2020-06-26 北京京东尚科信息技术有限公司 数据处理方法、装置、设备和存储介质
CN113689213A (zh) * 2018-12-26 2021-11-23 创新先进技术有限公司 区块链数据处理方法、装置及系统
CN109815258A (zh) * 2018-12-29 2019-05-28 深圳云天励飞技术有限公司 一种数据处理的方法及装置
CN109800599A (zh) * 2019-01-18 2019-05-24 深圳市威赫科技有限公司 一种区块链分布式存储方法及系统
CN109902494A (zh) * 2019-01-24 2019-06-18 北京融链科技有限公司 数据加密存储方法、装置,以及文件存储系统
CN111475538A (zh) * 2019-01-24 2020-07-31 北京京东尚科信息技术有限公司 一种数据处理方法、装置及存储介质
US11222099B2 (en) * 2019-02-08 2022-01-11 Synergex Group Methods, systems, and media for authenticating users using blockchains
CN111835801B (zh) * 2019-04-18 2023-11-14 北京度友信息技术有限公司 文件下载方法、装置、服务器、边缘设备、终端及介质
US11294875B2 (en) * 2019-05-31 2022-04-05 Advanced New Technologies Co., Ltd. Data storage on tree nodes
CN110288445B (zh) * 2019-06-28 2024-03-05 杭州复杂美科技有限公司 去中心化存储方法、设备和存储介质
CN110300170A (zh) * 2019-06-28 2019-10-01 杭州复杂美科技有限公司 区块链分布式存储下载方法、设备和存储介质
CN110288346A (zh) * 2019-06-28 2019-09-27 杭州复杂美科技有限公司 区块链分布式存储下载方法、设备和存储介质
CN110442644A (zh) * 2019-07-08 2019-11-12 深圳壹账通智能科技有限公司 区块链数据归档存储方法、装置、计算机设备和存储介质
CN110597824A (zh) * 2019-09-20 2019-12-20 腾讯科技(深圳)有限公司 一种基于区块链网络的数据存储方法以及装置
KR102628057B1 (ko) * 2019-10-29 2024-01-22 삼성에스디에스 주식회사 블록체인 기반 파일 송신 방법 및 그 시스템
CN111030930B (zh) * 2019-12-02 2022-02-01 北京众享比特科技有限公司 基于去中心化网络数据分片传输方法、装置、设备及介质
US11368285B2 (en) * 2019-12-05 2022-06-21 International Business Machines Corporation Efficient threshold storage of data object
CN111193798A (zh) * 2019-12-31 2020-05-22 山东公链信息科技有限公司 一种打散后加密分散存储的图片分布式存储技术
CN111324305B (zh) * 2020-02-16 2021-02-02 西安奥卡云数据科技有限公司 一种分布式存储系统中数据写入/读取方法
CN111290883B (zh) * 2020-02-16 2021-03-26 西安奥卡云数据科技有限公司 一种基于重删的精简复制方法
CN111311283B (zh) * 2020-02-20 2021-02-05 宁波甜宝生物信息技术有限公司 基于区块链和云计算的化妆品溯源生产工艺方法
CN111475839B (zh) * 2020-04-06 2023-04-18 华中科技大学 一种用于不可信环境的冗余数据编码方法、存储介质
CN111611317B (zh) * 2020-06-08 2023-05-30 杭州复杂美科技有限公司 区块链分布式存储分组方法、设备和存储介质
CN112231398A (zh) * 2020-09-25 2021-01-15 北京金山云网络技术有限公司 数据存储方法、装置、设备及存储介质
CN112130772A (zh) * 2020-09-29 2020-12-25 合肥城市云数据中心股份有限公司 一种基于稀疏随机纠删码技术的区块链安全存储方法
AU2021254561A1 (en) * 2021-10-19 2023-05-04 Neo Nebula Pty Ltd A device, method and system for the secure storage of data in a distributed manner
CN112328688B (zh) * 2020-11-09 2023-10-13 广州虎牙科技有限公司 数据存储方法、装置、计算机设备及存储介质
CN112667568B (zh) * 2020-12-21 2022-11-22 广州携旅信息科技有限公司 一种在酒店内网环境下实现分布式存储的方法
CN114006690A (zh) * 2021-01-04 2022-02-01 北京八分量信息科技有限公司 一种区块链的数据授权方法
CN112968864A (zh) * 2021-01-26 2021-06-15 太原理工大学 一种可信的IPv6网络服务过程机制
CN112905667A (zh) * 2021-03-08 2021-06-04 黑芝麻智能科技(上海)有限公司 无人驾驶信息存储和回放方法、装置及存储介质
CN116303753A (zh) * 2021-12-09 2023-06-23 中兴通讯股份有限公司 分布式数据库的分片方法、装置、电子设备和存储介质
CN115935090B (zh) * 2023-03-10 2023-06-16 北京锐服信科技有限公司 一种基于时间分片的数据查询方法及系统
CN116860180A (zh) * 2023-08-31 2023-10-10 中航金网(北京)电子商务有限公司 一种分布式存储方法、装置、电子设备及存储介质
CN117873402B (zh) * 2024-03-07 2024-05-07 南京邮电大学 一种基于异步联邦学习和感知聚类的协作边缘缓存优化方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130111153A1 (en) * 2011-11-02 2013-05-02 Ju Pyung LEE Distributed storage system, apparatus and method for managing a distributed storage in consideration of latency elements
CN103559102A (zh) * 2013-10-22 2014-02-05 北京航空航天大学 数据冗余处理方法、装置和分布式存储系统
CN106302702A (zh) * 2016-08-10 2017-01-04 华为技术有限公司 数据的分片存储方法、装置及系统
CN107220559A (zh) * 2017-06-11 2017-09-29 南京安链数据科技有限公司 一种针对不可篡改文件的加密存储方法
CN107273759A (zh) * 2017-05-08 2017-10-20 上海点融信息科技有限责任公司 用于保护区块链数据的方法、设备以及计算机可读存储介质
CN108664223A (zh) * 2018-05-18 2018-10-16 百度在线网络技术(北京)有限公司 一种分布式存储方法、装置、计算机设备及存储介质

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000224158A (ja) 1999-02-01 2000-08-11 Toyo Commun Equip Co Ltd 暗号通信システム
US7203871B2 (en) * 2004-06-03 2007-04-10 Cisco Technology, Inc. Arrangement in a network node for secure storage and retrieval of encoded data distributed among multiple network nodes
CN102148798A (zh) * 2010-02-04 2011-08-10 上海果壳电子有限公司 大容量数据包的高效并行安全加解密方法
CN101808095B (zh) * 2010-03-22 2012-08-15 华中科技大学 一种分布式存储环境下的加密副本组织方法
JP5647058B2 (ja) 2011-04-19 2014-12-24 佐藤 美代子 情報処理システムおよびデータバックアップ方法
US9483657B2 (en) * 2013-01-14 2016-11-01 Accenture Global Services Limited Secure online distributed data storage services
CN104123300B (zh) * 2013-04-26 2017-10-13 上海云人信息科技有限公司 数据分布式存储系统及方法
RU2527210C1 (ru) * 2013-06-14 2014-08-27 Общество с ограниченной ответственностью "Новые технологии презентаций" Способ и система для передачи данных от веб-сервера клиентским терминальным устройствам посредством локальной беспроводной коммуникационной сети
US10547460B2 (en) * 2016-11-18 2020-01-28 Qualcomm Incorporated Message-based key generation using physical unclonable function (PUF)
CN110024422B (zh) * 2016-12-30 2023-07-18 英特尔公司 物联网的命名和区块链记录
CN106775494B (zh) * 2017-01-06 2023-05-12 南京普天通信股份有限公司 一种基于分布式软件定义存储的数据存储装置及存储方法
CN108628539B (zh) * 2017-03-17 2021-03-26 杭州海康威视数字技术股份有限公司 数据存储、分散、重构、回收方法、装置及数据处理系统
CN107273410B (zh) * 2017-05-03 2020-07-07 上海点融信息科技有限责任公司 基于区块链的分布式存储
US20190180272A1 (en) * 2017-12-12 2019-06-13 Janathon R. Douglas Distributed identity protection system and supporting network for providing personally identifiable financial information protection services

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130111153A1 (en) * 2011-11-02 2013-05-02 Ju Pyung LEE Distributed storage system, apparatus and method for managing a distributed storage in consideration of latency elements
CN103559102A (zh) * 2013-10-22 2014-02-05 北京航空航天大学 数据冗余处理方法、装置和分布式存储系统
CN106302702A (zh) * 2016-08-10 2017-01-04 华为技术有限公司 数据的分片存储方法、装置及系统
CN107273759A (zh) * 2017-05-08 2017-10-20 上海点融信息科技有限责任公司 用于保护区块链数据的方法、设备以及计算机可读存储介质
CN107220559A (zh) * 2017-06-11 2017-09-29 南京安链数据科技有限公司 一种针对不可篡改文件的加密存储方法
CN108664223A (zh) * 2018-05-18 2018-10-16 百度在线网络技术(北京)有限公司 一种分布式存储方法、装置、计算机设备及存储介质

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111176567A (zh) * 2019-12-25 2020-05-19 上海沄界信息科技有限公司 分布式云存储的存储供应量验证方法及装置
CN111176567B (zh) * 2019-12-25 2023-11-03 上海新沄信息科技有限公司 分布式云存储的存储供应量验证方法及装置
JPWO2021172589A1 (zh) * 2020-02-28 2021-09-02
JP7158690B2 (ja) 2020-02-28 2022-10-24 長瀬産業株式会社 情報処理システム、及びプログラム

Also Published As

Publication number Publication date
CN108664223B (zh) 2021-07-02
JP7044881B2 (ja) 2022-03-30
US20200363994A1 (en) 2020-11-19
JP2021506004A (ja) 2021-02-18
US11842072B2 (en) 2023-12-12
CN108664223A (zh) 2018-10-16

Similar Documents

Publication Publication Date Title
WO2019218717A1 (zh) 一种分布式存储方法、装置、计算机设备及存储介质
US11698840B2 (en) Transaction consensus processing method and apparatus for blockchain and electronic device
US11501533B2 (en) Media authentication using distributed ledger
US9262247B2 (en) Updating data stored in a dispersed storage network
JP6671278B2 (ja) データ転送最適化
US11657171B2 (en) Large network attached storage encryption
KR20200074911A (ko) 분산 시스템 내의 네트워크 노드를 위한 복구 프로세스의 수행
US11943350B2 (en) Systems and methods for re-using cold storage keys
CN111047324A (zh) 用于更新区块链节点处的公钥集合的方法及装置
CN109241754B (zh) 一种基于区块链的云文件重复数据删除方法
CN109347803B (zh) 一种区块链的数据处理方法、装置、设备及介质
CN115225409A (zh) 基于多备份联合验证的云数据安全去重方法
US20220216999A1 (en) Blockchain system for supporting change of plain text data included in transaction
US11893577B2 (en) Cryptographic key storage system and method
CN114615031A (zh) 文件存储方法、装置、电子设备及存储介质
CN108563396B (zh) 一种安全的云端对象存储方法
US10348705B1 (en) Autonomous communication protocol for large network attached storage
US11513913B2 (en) Method for storage management, electronic device, and computer program product
CN110958211B (zh) 一种基于区块链的数据处理系统及方法
CN110611674A (zh) 不同计算机系统之间的协议交互方法、系统及存储介质
JPWO2015162688A1 (ja) データ処理システム、データ処理方法
US11803648B2 (en) Key in lockbox encrypted data deduplication
CN114666037A (zh) 一种基于区块链的可审计数据去重方法
CN118051930A (zh) 分布式隐私文件加解密方法、装置及存储介质
CN117061126A (zh) 一种管理云盘文件加密与解密的系统和方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19804489

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020530626

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19804489

Country of ref document: EP

Kind code of ref document: A1