CN111444042B - Block chain data storage method based on erasure codes - Google Patents
- Publication number
- CN111444042B
- Authority
- CN
- China
- Prior art keywords
- data
- block
- node
- nodes
- block chain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1479—Generic software techniques for error detection or fault masking
- G06F11/1489—Generic software techniques for error detection or fault masking through recovery blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention belongs to the technical field of blockchain storage and relates in particular to a blockchain data storage method based on erasure codes. The method assigns a globally unique number to each system node; any node i encodes the stored data using the erasure code concept to obtain the i-th data block and stores it together with all check blocks. A failed node requests data from the remaining nodes, provided failures stay within the tolerated rate, and uses that data to repair itself. After joining the blockchain system, a new node requests data from the network and stores the historical blockchain. The invention reduces the global storage overhead of the blockchain system while still allowing data recovery when fewer than 50% of the nodes fail, ensuring data reliability.
Description
Technical Field
The invention belongs to the technical field of block chain storage, and particularly relates to a block chain data storage method based on erasure codes.
Background
A blockchain is a data structure that links data blocks into a chain in timestamp order and uses cryptography to provide a distributed, decentralized ledger that cannot be tampered with or forged. It securely stores verifiable data that has an ordering relationship within the system and addresses problems of distributed data storage, transmission, and distribution. However, every participating node in a blockchain system must store the entire blockchain, which imposes a significant storage burden on each node. The conventional blockchain storage mode therefore struggles to cope with the explosive growth of data volume in the big-data era.
Erasure codes are an important fault-tolerance technique that can provide the same data reliability as multi-copy techniques at lower storage overhead. Multi-copy technology backs up the data that users store in the system and places the copies on redundant disks; when a data block is lost or a disk fails, the copy can be read directly from a redundant disk to restore it. Because it is simple to operate and recovers quickly, multi-copy technology is widely used in existing storage systems. However, it needs as many extra disks as there are copies, so its storage overhead is very high, which greatly increases the cost of building and operating the storage system. As data volumes grow rapidly, this overhead makes multi-copy technology increasingly impractical for large fault-tolerant storage systems. Erasure coding significantly reduces storage overhead while achieving the same or even higher fault tolerance, so it has attracted wide attention and become a research focus in the storage field.
Erasure coding divides the original data into blocks and encodes them; when some of the data fails, the original data can be recovered by reading the blocks that have not failed. So far there has been little research on applying erasure codes to blockchains. The invention introduces erasure codes to improve the storage mode of the blockchain system, reducing the storage overhead of the whole system while allowing data to be recovered within a specific fault-tolerance rate.
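As an illustration of the recovery idea described above, the following minimal sketch uses a single XOR parity block, the simplest erasure code. It is not the coding scheme of the invention; the block contents and the helper name `xor_blocks` are hypothetical and chosen only for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three equally sized data blocks (made-up contents) and one parity block.
data_blocks = [b"tx-0001", b"tx-0002", b"tx-0003"]
parity = xor_blocks(data_blocks)

# Simulate the loss of one data block, then rebuild it from the
# surviving data blocks plus the parity block.
lost_index = 1
survivors = [b for i, b in enumerate(data_blocks) if i != lost_index]
recovered = xor_blocks(survivors + [parity])
```

A single parity block tolerates only one lost block; codes such as Reed-Solomon generalize the same principle to several check blocks.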
Disclosure of Invention
The invention aims to provide a block chain data storage method based on erasure codes, which can reduce global storage overhead of a block chain system, recover data under the condition that less than 50% of nodes fail, and ensure data reliability.
The aim of the invention is realized by the following technical scheme: the method comprises a data storage stage, a data recovery stage and a data updating stage;
the data storage stage specifically comprises the following steps:
step 1.1: according to the number of nodes in the blockchain application system, the original blockchain data is divided, following the erasure code idea, into as many data blocks as there are nodes, each of the same size and denoted $N_i d$; if the sizes differ, zeros are appended at the end to make them equal;
the blockchain application system has k nodes $\{N_1, N_2, \ldots, N_k\}$, each node generates data $N_i d$, and all nodes store [expression not reproduced]. The global data storage amount $T$ in the blockchain system is: [formula not reproduced]
step 1.2: the data blocks are encoded with an encoding strategy to generate r check blocks $\{P_1, P_2, \ldots, P_r\}$, where $k > r \geq k/2$, and the size of each check block is: [formula not reproduced]
step 1.3: in addition to the blockchain hash value, each node stores 1 data block and all r check blocks, replacing the original scheme of storing all data; that is, the storage at each node is: [formula not reproduced]
the global data storage amount is: [formula not reproduced]
where $T_1 \leq T$, and the compression ratio can reach 0.25 at most;
the data recovery stage specifically comprises the following steps:
step 2.1: node N in a blockchain application i When the data with i being more than or equal to 1 and k being less than or equal to k is invalid, broadcasting to other surviving nodes is needed, and the other nodes are requested to send data to the node;
step 2.2: node N i Receiving data blocks or check blocks of other nodes, and obtaining at least k/2+1 identical check blocks so as to recover all the check blocks;
step 2.3: recovering the received data block and the check block based on the erasure code idea, comparing the recovered data with a local hash value, and if the comparison is consistent, indicating that the recovered data is correct, thereby recovering the invalid data;
the data updating stage specifically comprises the following steps:
step 3.1: when a new node joins, it broadcasts a request to the blockchain system asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle;
step 3.2: the original nodes receive the data from the new node, update the blockchain in append-update mode, and at the same time derive the data block and check blocks that each node must store according to the data storage principle;
the append-update mode is as follows: new data is appended to the original data block in the form of a log, and then the difference between the new data and the original data is appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
The invention has the beneficial effects that:
aiming at the problem of high storage overhead of a block chain storage mode, the invention provides a block chain data storage method based on erasure codes. The invention can reduce global storage overhead of the block chain system, simultaneously realize recovery of data under the condition of failure of less than 50% of nodes, and ensure data reliability.
Drawings
FIG. 1 is a schematic diagram of a data storage system according to the present invention.
FIG. 2 is a diagram illustrating data recovery and data update according to the present invention.
Fig. 3 is a flow chart of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention solves the technical problems that: aiming at the problem of high storage overhead of a block chain storage mode, the block chain data storage method based on erasure codes is provided, the global storage overhead of a block chain system can be reduced, and meanwhile, the recovery of data under the condition that less than 50% of nodes fail is realized, and the reliability of the data is ensured.
The technical scheme of the invention is a blockchain data storage method based on erasure codes, comprising the following:
each participating node in the blockchain application system has a globally unique number;
based on the erasure code idea, node i splits the blockchain data into blocks and encodes them to obtain the i-th original data block and all check blocks;
the blockchain storage principle is redefined so that the i-th node stores the i-th data block and the check blocks in place of the data it stored originally;
for data recovery, a failed node sends a request to all nodes according to the defined storage principle, receives the data fragments and check blocks sent by the other nodes, and recovers its data;
for the addition of new nodes, an interaction mode between the node and the blockchain network is given;
according to the blockchain data storage method based on erasure codes, globally unique node numbers are assigned to the system nodes, and any node i encodes the stored data based on the erasure code concept to obtain the i-th data block and stores it together with all check blocks; a failed node requests data from the remaining nodes, provided failures stay within the tolerated rate, and repairs its data; after joining the blockchain system, a new node requests data from the network and stores the historical blockchain.
The invention provides a blockchain data storage method based on erasure codes, which comprises three parts: a data storage stage, a data recovery stage, and a data update stage. In theory, the invention is applicable to all types of erasure codes; for convenience of explanation, RS (Reed-Solomon) erasure codes are taken as an example. Assume the blockchain application system has k nodes $\{N_1, N_2, \ldots, N_k\}$, each node generates data $N_i d$, and all nodes store [expression not reproduced]. The global data storage amount $T$ in the blockchain system is: [formula not reproduced]
To uphold the blockchain principle that the minority obeys the majority, that is, to keep the number of failed nodes below 50%, reasonable data division and data storage strategies are needed. When a node fails or data is added, the corresponding data recovery and update are carried out.
FIG. 1 is a schematic diagram of data storage. The storage procedure is as follows.
Data storage stage:
Step A1: First, according to the number of nodes, the original blockchain data is divided, following the erasure code idea, into as many data blocks as there are nodes, each of the same size and denoted $N_i d$; if the sizes differ, zeros are appended at the end to make them equal.
Step A2: The data blocks are encoded with an RS encoding strategy to generate r check blocks $\{P_1, P_2, \ldots, P_r\}$. To reduce the storage overhead, r in the present invention should satisfy $k > r \geq k/2$, and the size of each check block is: [formula not reproduced]
Step A3: In addition to the blockchain hash value, each node stores 1 data block and all r check blocks, replacing the original scheme of storing all data. That is, the storage at each node is: [formula not reproduced]
The global data storage amount is: [formula not reproduced]
It can be seen that $T_1 \leq T$, and the compression ratio can reach 0.25 at most.
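The splitting and storage accounting in steps A1 to A3 can be sketched as follows. The formulas of the invention are not reproduced in this text, so the sketch rests on two assumptions that are not taken from the source: the chain is split evenly into k blocks, and every check block has the same size as a data block (as in standard Reed-Solomon coding). The function names are hypothetical, and the ratio this yields need not match the figure quoted above.

```python
def split_with_padding(chain: bytes, k: int) -> list:
    """Split the chain into k equal blocks, zero-padding the tail (step A1)."""
    block_size = -(-len(chain) // k)  # ceiling division
    padded = chain.ljust(block_size * k, b"\x00")
    return [padded[i * block_size:(i + 1) * block_size] for i in range(k)]

def storage_totals(total_bytes: int, k: int, r: int) -> tuple:
    """Return (T, T_1): global storage with full copies vs. the coded scheme.

    Assumption: each check block is as large as one data block.
    """
    if not (k > r >= k / 2):
        raise ValueError("expected k > r >= k/2")
    block_size = -(-total_bytes // k)
    full_copies = k * total_bytes        # every node stores the whole chain
    coded = k * (1 + r) * block_size     # every node stores 1 data + r check blocks
    return full_copies, coded

blocks = split_with_padding(b"abcdefghij", 4)   # four 3-byte blocks, last one padded
T, T1 = storage_totals(1000, k=10, r=5)          # (10000, 6000) under the assumptions
```

The constraint check mirrors the stated condition k > r ≥ k/2; under the equal-size assumption, each node stores 1 + r blocks instead of k.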
FIG. 2 is a schematic diagram of the data recovery stage and the data update stage. Steps 1 and 2 show a failed node requesting data for recovery; steps 3 and 4 show a new node joining the system and requesting historical data for the data update.
Data recovery stage:
Step B1: When the data of node $N_i$ ($1 \leq i \leq k$) in the blockchain application system fails, the node broadcasts to the other surviving nodes and requests that they send it data.
Step B2: Node $N_i$ receives data blocks or check blocks from the other nodes and must obtain at least k/2+1 identical copies of the check blocks in order to recover all the check blocks.
Step B3: The node then decodes the received data blocks and check blocks based on the erasure code idea and compares the recovered data with the local hash value; if they match, the recovered data is correct and the failed data has been restored.
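The two checks in steps B2 and B3 can be sketched as below. This is a simplified illustration, not the implementation of the invention: it assumes k/2+1 means integer division, assumes SHA-256 as the hash function (the text does not name one), and leaves out the erasure decoding itself. The function names are hypothetical.

```python
import hashlib
from collections import Counter

def majority_block(copies, k):
    """Step B2: accept a check block only if at least k//2 + 1 received
    copies are identical; otherwise return None."""
    if not copies:
        return None
    block, count = Counter(copies).most_common(1)[0]
    return block if count >= k // 2 + 1 else None

def verify(recovered: bytes, expected_hash: str) -> bool:
    """Step B3: compare the recovered data with the locally kept hash
    (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(recovered).hexdigest() == expected_hash

# With k = 5 nodes, three matching copies out of four replies are enough.
accepted = majority_block([b"P1", b"P1", b"bad", b"P1"], k=5)
```

Requiring a strict majority of identical copies is what lets the failed node ignore a minority of faulty or inconsistent replies.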
Data update stage:
Step C1: When a new node joins, it broadcasts a request to the blockchain system asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle.
Step C2: The original nodes receive the data from the new node, update the blockchain in append-update mode, and at the same time derive the data block and check blocks that each node must store according to the data storage principle.
Step C3: In append-update mode, new data is appended to the original data block in the form of a log, and the difference between the new data and the original data is appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (1)
1. A block chain data storage method based on erasure codes is characterized in that: the method comprises a data storage stage, a data recovery stage and a data updating stage;
the data storage stage specifically comprises the following steps:
step 1.1: according to the number of nodes in the blockchain application system, the original blockchain data is divided, following the erasure code idea, into as many data blocks as there are nodes, each of the same size and denoted $N_i d$; if the sizes differ, zeros are appended at the end to make them equal;
the blockchain application system has k nodes $\{N_1, N_2, \ldots, N_k\}$, each node generates data $N_i d$, and all nodes store [expression not reproduced]. The global data storage amount $T$ in the blockchain system is: [formula not reproduced]
step 1.2: the data blocks are encoded with an encoding strategy to generate r check blocks $\{P_1, P_2, \ldots, P_r\}$, where $k > r \geq k/2$, and the size of each check block is: [formula not reproduced]
step 1.3: in addition to the blockchain hash value, each node stores 1 data block and all r check blocks, replacing the original scheme of storing all data; that is, the storage at each node is: [formula not reproduced]
the global data storage amount is: [formula not reproduced]
where $T_1 \leq T$, and the compression ratio can reach 0.25 at most;
the data recovery stage specifically comprises the following steps:
step 2.1: node N in a blockchain application i When the data with i being more than or equal to 1 and k being less than or equal to k is invalid, broadcasting to other surviving nodes is needed, and the other nodes are requested to send data to the node;
step 2.2: node N i Receiving data blocks or check blocks of other nodes, and obtaining at least k/2+1 identical check blocks so as to recover all the check blocks;
step 2.3: recovering the received data block and the check block based on the erasure code idea, comparing the recovered data with a local hash value, and if the comparison is consistent, indicating that the recovered data is correct, thereby recovering the invalid data;
the data updating stage specifically comprises the following steps:
step 3.1: when a new node joins, it broadcasts a request to the blockchain system asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle;
step 3.2: the original nodes receive the data from the new node, update the blockchain in append-update mode, and at the same time derive the data block and check blocks that each node must store according to the data storage principle;
the append-update mode is as follows: new data is appended to the original data block in the form of a log, and then the difference between the new data and the original data is appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010214114.1A CN111444042B (en) | 2020-03-24 | 2020-03-24 | Block chain data storage method based on erasure codes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010214114.1A CN111444042B (en) | 2020-03-24 | 2020-03-24 | Block chain data storage method based on erasure codes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111444042A (en) | 2020-07-24 |
CN111444042B (en) | 2023-10-27 |
Family
ID=71629449
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010214114.1A Active CN111444042B (en) | 2020-03-24 | 2020-03-24 | Block chain data storage method based on erasure codes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111444042B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112799872B (en) * | 2021-02-19 | 2022-08-12 | 上海交通大学 | Erasure code encoding method and device based on key value pair storage system |
CN112835738B (en) * | 2021-02-20 | 2022-05-20 | 华中科技大学 | Method for constructing strip data storage structure |
CN116579025A (en) * | 2021-04-20 | 2023-08-11 | 支付宝(杭州)信息技术有限公司 | File storage method, device and equipment |
CN114244853A (en) * | 2021-11-29 | 2022-03-25 | 国网北京市电力公司 | Big data sharing method and device and big data sharing system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109491835A (en) * | 2018-10-25 | 2019-03-19 | 哈尔滨工程大学 | A kind of data fault tolerance method based on Dynamic Packet code |
CN109871366A (en) * | 2019-01-17 | 2019-06-11 | 华东师范大学 | A kind of storage of block chain fragment and querying method based on correcting and eleting codes |
CN110046894A (en) * | 2019-04-19 | 2019-07-23 | 电子科技大学 | A kind of restructural block chain method for building up of grouping based on correcting and eleting codes |
CN110611699A (en) * | 2019-08-09 | 2019-12-24 | 南京泛函智能技术研究院有限公司 | Wireless self-organizing data storage method and data storage system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106708651B (en) * | 2016-11-16 | 2020-09-11 | 北京三快在线科技有限公司 | Partial writing method and device based on erasure codes, storage medium and equipment |
Non-Patent Citations (1)
Title |
---|
分布式存储中的纠删码容错技术研究;王意洁;许方亮;裴晓强;;计算机学报(第01期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN111444042A (en) | 2020-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111444042B (en) | Block chain data storage method based on erasure codes | |
Qi et al. | BFT-Store: Storage partition for permissioned blockchain via erasure coding | |
US6970987B1 (en) | Method for storing data in a geographically-diverse data-storing system providing cross-site redundancy | |
US10019317B2 (en) | Parity protection for data chunks in an object storage system | |
CN103944981B (en) | Cloud storage system and implement method based on erasure code technological improvement | |
CN111149093B (en) | Data encoding, decoding and repairing method of distributed storage system | |
CN106776130B (en) | Log recovery method, storage device and storage node | |
US20160006461A1 (en) | Method and device for implementation data redundancy | |
CN106708653B (en) | Mixed tax big data security protection method based on erasure code and multiple copies | |
CN109814807B (en) | Data storage method and device | |
CN105393225A (en) | Erasure coding across multiple zones | |
CN114415976B (en) | Distributed data storage system and method | |
CN105956128B (en) | A kind of adaptive coding storage fault-tolerance approach based on simple regeneration code | |
CN105530294A (en) | Mass data distributed storage method | |
CN102955720A (en) | Method for improving stability of EXT (extended) file system | |
CN109491835B (en) | Data fault-tolerant method based on dynamic block code | |
CN110515541B (en) | Method for updating erasure code non-aligned data in distributed storage | |
CN106951340B (en) | A kind of RS correcting and eleting codes data layout method and system preferential based on locality | |
CN110427156B (en) | Partition-based MBR (Membrane biological reactor) parallel reading method | |
CN107689983B (en) | Cloud storage system and method based on low repair bandwidth | |
WO2023103213A1 (en) | Data storage method and device for distributed database | |
CN110895497B (en) | Method and device for reducing erasure code repair in distributed storage | |
CN108762978B (en) | Grouping construction method of local part repeated cyclic code | |
JP2021086289A (en) | Distributed storage system and parity update method of distributed storage system | |
CN114047878A (en) | Erasure code low-overhead storage system and method for block chain domain name resolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |