CN111444042B - Block chain data storage method based on erasure codes - Google Patents

Info

Publication number
CN111444042B
Authority
CN
China
Prior art keywords
data
block
node
nodes
block chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010214114.1A
Other languages
Chinese (zh)
Other versions
CN111444042A (en)
Inventor
孟宇龙
任龙
徐东
张子迎
钟俊捷
华园园
曹雨倩
蒋馨宙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202010214114.1A
Publication of CN111444042A
Application granted
Publication of CN111444042B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1479: Generic software techniques for error detection or fault masking
    • G06F11/1489: Generic software techniques for error detection or fault masking through recovery blocks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention belongs to the technical field of blockchain storage, and particularly relates to a blockchain data storage method based on erasure codes. The method assigns a globally unique node number to each system node; any node i encodes the stored data following the erasure-code idea, obtains the i-th data block, and stores it together with all check blocks. A failed node, within a specified fault-tolerance rate, requests data from the remaining nodes in the network in order to repair its data. After joining the blockchain system, a new node requests data from the whole network and stores the historical blockchain. The invention can reduce the global storage overhead of the blockchain system while still recovering data when fewer than 50% of the nodes fail, ensuring data reliability.

Description

Block chain data storage method based on erasure codes
Technical Field
The invention belongs to the technical field of block chain storage, and particularly relates to a block chain data storage method based on erasure codes.
Background
A blockchain is a data structure that links data blocks into a chain in timestamp order. It is a distributed, decentralized ledger that is cryptographically protected against tampering and forgery, securely stores verifiable data that has a sequential relationship, and can address problems of distributed data storage, transmission and distribution. This also means that each participating node in a blockchain system must store the entire blockchain, which imposes a significant storage burden as the price of maintaining the data. The conventional blockchain storage mode therefore has difficulty coping with the explosively growing data volumes of the big-data era.
Erasure codes are an important fault-tolerance technique that can provide the same data reliability as multi-copy fault tolerance at lower storage overhead. Multi-copy technology backs up the data a user stores in the system and places the copies on corresponding redundant disks; when a data block is lost or a disk fails, the copy can be read directly from the redundant disk and used for restoration. Multi-copy technology is widely used in existing storage systems because it is simple to operate and fast to recover. However, it requires as many additional disks as there are copies, so its storage overhead is extremely high and it greatly increases the cost of building and operating a storage system. As data volumes grow rapidly, this overhead makes multi-copy technology increasingly impractical for large fault-tolerant storage systems. Compared with multi-copy technology, erasure coding significantly reduces the storage overhead of data while achieving the same or even higher fault tolerance, so it has attracted wide attention and become a research hotspot in the storage field.
Erasure coding divides the original data into blocks and encodes them; when some of the data fails, the original data can be recovered by reading the blocks that have not failed. There is currently little research on applying erasure codes to blockchains. The invention introduces erasure codes to improve the storage mode of the blockchain system, which can reduce the storage overhead of the whole system and recover data within a specified fault-tolerance rate.
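The block-and-encode idea described above can be illustrated with a minimal sketch. It deliberately uses a single XOR parity block, which is the simplest erasure code and tolerates only one lost block; it is not the coding scheme of the invention. The function names `split_padded` and `parity` are hypothetical and chosen only for this illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_padded(data: bytes, k: int) -> list:
    """Split data into k equal-size blocks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def parity(blocks: list) -> bytes:
    """Single XOR check block over all given blocks (illustrative only)."""
    return reduce(xor_bytes, blocks)


original = b"blockchain ledger data"
blocks = split_padded(original, 4)
check = parity(blocks)

# Simulate the loss of block 2 and rebuild it from the survivors plus the check block.
survivors = [b for i, b in enumerate(blocks) if i != 2]
rebuilt = parity(survivors + [check])
```

Because XOR is its own inverse, combining the surviving blocks with the check block yields exactly the missing block. Codes with more check blocks, such as Reed-Solomon codes, extend the same principle to several simultaneous losses.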
Disclosure of Invention
The invention aims to provide a block chain data storage method based on erasure codes, which can reduce global storage overhead of a block chain system, recover data under the condition that less than 50% of nodes fail, and ensure data reliability.
The aim of the invention is realized by the following technical scheme: the method comprises a data storage stage, a data recovery stage and a data updating stage;
the data storage stage specifically comprises the following steps:
step 1.1: according to the number of nodes in the blockchain application system, the original blockchain data is divided, following the erasure-code idea, into as many data blocks as there are nodes; every block has the same size $N_i d$, and if the sizes differ, zeros are appended at the tail to make them equal;
the blockchain application system has $k$ nodes $\{N_1, N_2, \ldots, N_k\}$, each node $N_i$ generates data $N_i d$, and in the original scheme every node stores the data of all nodes; the global data storage amount $T$ in the blockchain system is:
step 1.2: the data blocks are encoded by an encoding strategy to generate $r$ check blocks $\{P_1, P_2, \ldots, P_r\}$, where $k > r \ge k/2$, and the size of each check block is:
step 1.3: in addition to the blockchain hash value, each node stores 1 data block and all $r$ check blocks, replacing the original scheme of storing all data; that is, the storage of each node is:
the global data storage amount is:
where $T_1 \le T$, and the compression ratio can reach up to 0.25;
the data recovery stage specifically comprises the following steps:
step 2.1: node N in a blockchain application i When the data with i being more than or equal to 1 and k being less than or equal to k is invalid, broadcasting to other surviving nodes is needed, and the other nodes are requested to send data to the node;
step 2.2: node N i Receiving data blocks or check blocks of other nodes, and obtaining at least k/2+1 identical check blocks so as to recover all the check blocks;
step 2.3: recovering the received data block and the check block based on the erasure code idea, comparing the recovered data with a local hash value, and if the comparison is consistent, indicating that the recovered data is correct, thereby recovering the invalid data;
the data updating stage specifically comprises the following steps:
step 3.1: when a new node joins, it broadcasts a request to the blockchain system, asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle;
step 3.2: the existing nodes receive the data from the new node and update the blockchain in an append-update mode, while the data block and check blocks that each node must store are obtained according to the data storage principle;
the append-update mode is specifically: new data is appended to the original data block in the form of a log, and the difference between the new data and the original data is then appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
The invention has the beneficial effects that:
To address the high storage overhead of the blockchain storage mode, the invention provides a blockchain data storage method based on erasure codes. The invention can reduce the global storage overhead of the blockchain system while still recovering data when fewer than 50% of the nodes fail, ensuring data reliability.
Drawings
FIG. 1 is a schematic diagram of a data storage system according to the present invention.
FIG. 2 is a diagram illustrating data recovery and data update according to the present invention.
Fig. 3 is a flow chart of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The technical problem solved by the invention is as follows: to address the high storage overhead of the blockchain storage mode, a blockchain data storage method based on erasure codes is provided, which can reduce the global storage overhead of the blockchain system while still recovering data when fewer than 50% of the nodes fail, ensuring data reliability.
The technical scheme of the invention is as follows: a block chain data storage method based on erasure codes. Comprising the following steps:
each participating node in the blockchain application system has a globally unique number;
following the erasure-code idea, node i divides the blockchain data into blocks and encodes them to obtain the i-th original data block and all check blocks;
the blockchain storage principle is redefined so that node i stores the i-th data block and the check blocks in place of what it originally stored;
for data recovery of a failed node, based on the defined storage principle, the failed node sends a request to all nodes in the network, receives the data fragments and check blocks sent by the other nodes, and carries out data recovery;
for the addition of new nodes, a mode of interaction between the node and the blockchain network is given;
according to this blockchain data storage method based on erasure codes, each system node is given a globally unique node number, and any node i encodes the stored data following the erasure-code idea to obtain the i-th data block, which it stores together with all check blocks; a failed node, within a specified fault-tolerance rate, requests data from the remaining nodes in the network in order to repair its data; after joining the blockchain system, a new node requests data from the whole network and stores the historical blockchain.
The invention provides a blockchain data storage method based on erasure codes, comprising three parts: a data storage stage, a data recovery stage and a data update stage. In theory the invention is applicable to all types of erasure codes; for convenience of explanation, RS (Reed-Solomon) erasure codes are taken as an example. Assume the blockchain application system has $k$ nodes $\{N_1, N_2, \ldots, N_k\}$, each node $N_i$ generates data $N_i d$, and every node stores the data of all nodes. The global data storage amount $T$ in the blockchain system is:
To uphold the blockchain principle that the minority obeys the majority, that is, to keep the number of failed nodes within 50%, reasonable data division and data storage strategies are needed. When a node fails or data is added, the corresponding data recovery or update is carried out.
FIG. 1 is a schematic diagram of data storage. The storage procedure is as follows.
Data storage stage:
Step A1: first, according to the number of nodes, the original blockchain data is divided, following the erasure-code idea, into as many data blocks as there are nodes; every block has the same size $N_i d$, and if the sizes differ, zeros are appended at the tail to make them equal.
Step A2: the data blocks are encoded with an RS encoding strategy to generate $r$ check blocks $\{P_1, P_2, \ldots, P_r\}$. To satisfy the principle of reducing storage overhead, $r$ in the invention should satisfy $k > r \ge k/2$, and the size of each check block is:
Step A3: in addition to the blockchain hash value, each node stores 1 data block and all $r$ check blocks, replacing the original scheme of storing all data. That is, the storage of each node is:
The global data storage amount is:
It can be seen that $T_1 \le T$, and the compression ratio can reach up to 0.25.
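The storage accounting of steps A1 to A3 can be sketched as follows. This is only an illustration under stated assumptions: the check-block size is taken to equal the padded data-block size, the original scheme is taken to mean that each of the k nodes stores every data block, and the blockchain hash value is ignored. All function names are hypothetical, and the sketch does not attempt to reproduce the stated compression ratio.

```python
def pad_blocks(blocks: list) -> list:
    """Step A1: zero-pad every block at the tail to the size of the largest block."""
    size = max(len(b) for b in blocks)
    return [b.ljust(size, b"\x00") for b in blocks]


def valid_check_count(k: int, r: int) -> bool:
    """Step A2 constraint on the number of check blocks: k > r >= k/2."""
    return k > r and 2 * r >= k


def full_replication_total(blocks: list) -> int:
    """Original scheme (assumed): each of the k nodes stores all k data blocks."""
    return len(blocks) * sum(len(b) for b in blocks)


def erasure_total(blocks: list, r: int, check_size: int) -> int:
    """Step A3: node i stores data block i plus all r check blocks.

    check_size is an ASSUMPTION of this sketch (padded data-block size).
    """
    return sum(len(b) + r * check_size for b in blocks)


# Four nodes with unequal amounts of data, two check blocks.
raw = [b"a" * 100, b"b" * 80, b"c" * 100, b"d" * 90]
blocks = pad_blocks(raw)
k, r = len(blocks), 2
block_size = len(blocks[0])

T = full_replication_total(blocks)
T1 = erasure_total(blocks, r, check_size=block_size)
```

Under these assumptions each node holds 1 + r blocks instead of k blocks, so the total shrinks whenever r + 1 < k and the two totals coincide when r = k - 1, which agrees with the constraint k > r.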
FIG. 2 is a schematic diagram of the data recovery stage and the data update stage. Steps 1 and 2 show a failed node requesting data recovery; steps 3 and 4 show a new node joining the system and requesting historical data for the data update.
Data recovery stage:
Step B1: when the data of node $N_i$ ($1 \le i \le k$) in the blockchain application system fails, the node broadcasts to the other surviving nodes, requesting them to send data to it.
Step B2: node $N_i$ receives the data blocks or check blocks of the other nodes and obtains at least $k/2+1$ identical check blocks, from which all check blocks are recovered.
Step B3: the data is then recovered from the received data blocks and check blocks following the erasure-code idea, and the recovered data is compared with the local hash value; if they match, the recovered data is correct and the failed data has been recovered.
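The agreement check of step B2 and the hash comparison of step B3 can be sketched as below. The sketch assumes that k/2+1 is evaluated with integer division and that the local hash is SHA-256; the text specifies neither. The function names are hypothetical, and the erasure decoding itself is omitted.

```python
import hashlib
from collections import Counter
from typing import List, Optional


def majority_check_block(copies: List[bytes], k: int) -> Optional[bytes]:
    """Step B2: accept a check block only if at least k//2 + 1 nodes sent identical copies."""
    if not copies:
        return None
    value, count = Counter(copies).most_common(1)[0]
    return value if count >= k // 2 + 1 else None


def matches_local_hash(recovered: bytes, local_hash_hex: str) -> bool:
    """Step B3: compare the recovered data with the locally kept hash (SHA-256 assumed)."""
    return hashlib.sha256(recovered).hexdigest() == local_hash_hex


# Five nodes in total; three of the four replies agree, one is corrupted.
replies = [b"P1-good", b"P1-good", b"P1-bad", b"P1-good"]
accepted = majority_check_block(replies, k=5)

# With only two matching replies out of five nodes, no value reaches the threshold.
rejected = majority_check_block([b"P1-good", b"P1-good", b"P1-bad", b"P1-bad"], k=5)
```

Since every node stores all check blocks, a failed node can compare the copies it receives and keep only a value backed by a majority of the system, which is consistent with the requirement that fewer than 50% of the nodes fail.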
Data update stage:
Step C1: when a new node joins, it broadcasts a request to the blockchain system, asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle.
Step C2: the existing nodes receive the data from the new node and update the blockchain in an append-update mode, while the data block and check blocks that each node must store are obtained according to the data storage principle.
Step C3: in the append-update mode, new data is appended to the original data block in the form of a log, and the difference between the new data and the original data is appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
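The append-update mode of step C3 can be sketched with a block that keeps a pending log and folds it in on merge. For simplicity the check block here is a single XOR parity, so the "difference" is the XOR of old and new data; with RS codes the difference would be scaled by a coding coefficient, which this sketch does not model. The class `LoggedBlock` and the helper `update` are hypothetical names.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def overwrite(old: bytes, new: bytes) -> bytes:
    """Combine rule for data blocks: the logged data replaces the old data."""
    return new


class LoggedBlock:
    """A stored block plus a log of pending (offset, payload) entries."""

    def __init__(self, content: bytes, combine):
        self.content = bytearray(content)
        self.combine = combine  # how a log entry is folded into the content
        self.log = []

    def append(self, offset: int, payload: bytes) -> None:
        self.log.append((offset, payload))

    def view(self) -> bytes:
        """Content with all pending log entries applied, without modifying the block."""
        out = bytearray(self.content)
        for offset, payload in self.log:
            end = offset + len(payload)
            out[offset:end] = self.combine(bytes(out[offset:end]), payload)
        return bytes(out)

    def merge(self) -> None:
        """Fold the log into the content, e.g. after a period of time or on access."""
        self.content = bytearray(self.view())
        self.log.clear()


def update(data: LoggedBlock, check: LoggedBlock, offset: int, new: bytes) -> None:
    """Log the new data on the data block and the old/new difference on the check block."""
    old = data.view()[offset:offset + len(new)]
    data.append(offset, new)
    check.append(offset, xor_bytes(old, new))


d0 = LoggedBlock(b"AAAA", overwrite)
d1 = b"BBBB"
p = LoggedBlock(xor_bytes(b"AAAA", d1), xor_bytes)

update(d0, p, offset=1, new=b"zz")  # only log entries are written here
d0.merge()
p.merge()
```

Writing only log entries keeps an update cheap, because the check block is not re-encoded from all data blocks; the cost of applying the changes is deferred to the merge.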
The above description covers only the preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (1)

1. A block chain data storage method based on erasure codes is characterized in that: the method comprises a data storage stage, a data recovery stage and a data updating stage;
the data storage stage specifically comprises the following steps:
step 1.1: according to the number of nodes in the blockchain application system, the original blockchain data is divided, following the erasure-code idea, into as many data blocks as there are nodes; every block has the same size $N_i d$, and if the sizes differ, zeros are appended at the tail to make them equal;
the blockchain application system has $k$ nodes $\{N_1, N_2, \ldots, N_k\}$, each node $N_i$ generates data $N_i d$, and in the original scheme every node stores the data of all nodes; the global data storage amount $T$ in the blockchain system is:
step 1.2: the data blocks are encoded by an encoding strategy to generate $r$ check blocks $\{P_1, P_2, \ldots, P_r\}$, where $k > r \ge k/2$, and the size of each check block is:
step 1.3: in addition to the blockchain hash value, each node stores 1 data block and all $r$ check blocks, replacing the original scheme of storing all data; that is, the storage of each node is:
the global data storage amount is:
where $T_1 \le T$, and the compression ratio can reach up to 0.25;
the data recovery stage specifically comprises the following steps:
step 2.1: node N in a blockchain application i When the data with i being more than or equal to 1 and k being less than or equal to k is invalid, broadcasting to other surviving nodes is needed, and the other nodes are requested to send data to the node;
step 2.2: node N i Receiving data blocks or check blocks of other nodes, and obtaining at least k/2+1 identical check blocks so as to recover all the check blocks;
step 2.3: recovering the received data block and the check block based on the erasure code idea, comparing the recovered data with a local hash value, and if the comparison is consistent, indicating that the recovered data is correct, thereby recovering the invalid data;
the data updating stage specifically comprises the following steps:
step 3.1: when a new node joins, it broadcasts a request to the blockchain system, asking the other nodes to send data, and recovers the original blockchain according to the data recovery principle;
step 3.2: the existing nodes receive the data from the new node and update the blockchain in an append-update mode, while the data block and check blocks that each node must store are obtained according to the data storage principle;
the append-update mode is specifically: new data is appended to the original data block in the form of a log, and the difference between the new data and the original data is then appended to the check block in the form of a log; the appended data is merged with the original data after a period of time, or when the new data block or check block needs to be accessed.
CN202010214114.1A 2020-03-24 2020-03-24 Block chain data storage method based on erasure codes Active CN111444042B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010214114.1A CN111444042B (en) 2020-03-24 2020-03-24 Block chain data storage method based on erasure codes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010214114.1A CN111444042B (en) 2020-03-24 2020-03-24 Block chain data storage method based on erasure codes

Publications (2)

Publication Number Publication Date
CN111444042A CN111444042A (en) 2020-07-24
CN111444042B true CN111444042B (en) 2023-10-27

Family

ID=71629449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010214114.1A Active CN111444042B (en) 2020-03-24 2020-03-24 Block chain data storage method based on erasure codes

Country Status (1)

Country Link
CN (1) CN111444042B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112799872B (en) * 2021-02-19 2022-08-12 上海交通大学 Erasure code encoding method and device based on key value pair storage system
CN112835738B (en) * 2021-02-20 2022-05-20 华中科技大学 Method for constructing strip data storage structure
CN116579025A (en) * 2021-04-20 2023-08-11 支付宝(杭州)信息技术有限公司 File storage method, device and equipment
CN114244853A (en) * 2021-11-29 2022-03-25 国网北京市电力公司 Big data sharing method and device and big data sharing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109491835A (en) * 2018-10-25 2019-03-19 哈尔滨工程大学 A kind of data fault tolerance method based on Dynamic Packet code
CN109871366A (en) * 2019-01-17 2019-06-11 华东师范大学 A kind of storage of block chain fragment and querying method based on correcting and eleting codes
CN110046894A (en) * 2019-04-19 2019-07-23 电子科技大学 A kind of restructural block chain method for building up of grouping based on correcting and eleting codes
CN110611699A (en) * 2019-08-09 2019-12-24 南京泛函智能技术研究院有限公司 Wireless self-organizing data storage method and data storage system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106708651B (en) * 2016-11-16 2020-09-11 北京三快在线科技有限公司 Partial writing method and device based on erasure codes, storage medium and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
分布式存储中的纠删码容错技术研究;王意洁;许方亮;裴晓强;;计算机学报(第01期);全文 *

Also Published As

Publication number Publication date
CN111444042A (en) 2020-07-24

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant