CN114237971A - Erasure code coding layout method and system based on distributed storage system - Google Patents

Info

Publication number
CN114237971A
Authority
CN
China
Prior art keywords
data
check
block
node
lost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111481100.7A
Other languages
Chinese (zh)
Inventor
宋�莹
穆天童
杨明杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN202111481100.7A
Publication of CN114237971A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1032 Simple parity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Error Detection And Correction (AREA)

Abstract

The invention provides an erasure code coding layout method and system for a distributed storage system. The aim is to improve the recovery efficiency and reliability of the whole system by reducing the amount of data transmitted during recovery and the recovery time when the distributed system loses data. On top of conventional RS erasure-coded storage, the invention adds a parity check calculation inside each node, using RS codes with different n and k values, and stores the result of that calculation on the current node. When a small amount of data is lost or a node fails, decoding can then use the parity check block held on the node itself, which reduces the cross-rack and cross-node network bandwidth consumed by recovery.

Description

Erasure code coding layout method and system based on distributed storage system
Technical Field
The invention relates to the field of distributed storage computing and data recovery, in particular to an erasure code coding layout method, based on a distributed storage system, for improving recovery efficiency, and belongs to the field of distributed computing.
Background
With the rapid development of Internet technology, data volume is growing exponentially, and data storage is gradually shifting from single-machine storage to distributed storage. The most popular open-source big data framework at present is Hadoop, a big data platform that can process massive data offline and in parallel. It offers high reliability, high scalability, high efficiency, low cost and open source code, and has become the preferred massive-data processing solution for many Internet companies. Hadoop mainly comprises the Hadoop Distributed File System (HDFS) and the MapReduce distributed computing framework. Although Hadoop has reached version 3.x and is mature, some aspects still have shortcomings and need improvement and optimization.
Distributed clusters (e.g., Hadoop) are typically built from many independent, unreliable commodity components, and component failures are common. To ensure high reliability and availability of data in such distributed storage systems, two common approaches to fault tolerance are multiple copies (replication) and erasure codes. Replication is easy to deploy and makes failover simple, but its storage overhead is too large for systems with very large data volumes and limited disk space. As an alternative, erasure codes provide fault tolerance close to that of replication at lower storage overhead, and have been deployed in some distributed systems. They reduce storage redundancy from the traditional 3x to 1.4x, saving space. However, recovering a failed block with erasure codes requires retrieving multiple available blocks, which makes recovery expensive. Although erasure codes improve storage efficiency, they significantly increase the disk I/O and network bandwidth used for failure recovery.
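The storage figures above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a code with 10 data blocks and 4 parity blocks, which is one configuration that yields 1.4x (the text does not name the code it has in mind), and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data with `copies` full replicas."""
    return float(copies)


def erasure_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw bytes stored per byte of user data for a stripe of data + parity blocks."""
    return (data_blocks + parity_blocks) / data_blocks


def blocks_read_for_repair(data_blocks: int) -> int:
    """A conventional RS repair reads `data_blocks` surviving blocks to rebuild one lost block."""
    return data_blocks


# Assumed configuration: 10 data blocks + 4 parity blocks.
print(replication_overhead(3), erasure_overhead(10, 4), blocks_read_for_repair(10))
```

The last value is the cost the background paragraph refers to: repairing one block under replication reads one copy, while the assumed code reads ten blocks.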
To maximize data availability in distributed storage systems that use erasure codes, the different blocks of an erasure-coded stripe are stored on nodes in different racks. This data layout lets the system tolerate a certain number of node failures and rack failures. However, placing blocks this way means that repairing any failed data block requires retrieving available blocks from other racks, which consumes a significant amount of cross-rack bandwidth. Typically, the cross-rack bandwidth available to each node is only 1/20 to 1/5 of the intra-rack bandwidth. In a distributed storage system, intra-rack bandwidth is therefore considered sufficient, while cross-rack bandwidth is generally regarded as a scarce resource; excessive cross-rack traffic inevitably delays the recovery process and reduces recovery efficiency.
Disclosure of Invention
The invention provides an erasure code coding layout method for improving data recovery efficiency, which aims to improve the recovery efficiency and reliability of the whole system by reducing the amount of data transmitted and the recovery time when a distributed system loses data. On top of conventional RS erasure-coded storage, the invention adds a parity check calculation inside each node, using RS codes with different n and k values, and stores the result of that calculation on the current node. When a small amount of data is lost or a node fails, decoding can then use the parity check block held on the node itself, which reduces the cross-rack and cross-node network bandwidth consumed by recovery.
To address the shortcomings of the prior art, the invention provides an erasure code coding layout method based on a distributed storage system, which comprises the following steps:
step 1, acquiring a distributed storage system with a plurality of storage nodes, setting transverse and longitudinal coding parameters according to the number of the storage nodes of the distributed storage system, and dividing all the storage nodes into data nodes for storing data blocks and check nodes for storing transverse check blocks according to storage contents;
step 2, according to the transverse and longitudinal coding parameters, performing longitudinal and transverse erasure coding on each original data block on each data node to obtain a longitudinal check block and a transverse check block corresponding to each original data block; storing the transverse check block on a check node, and storing the longitudinal check block on the data node that holds the corresponding original data block;
step 3, when data is lost, judging whether the lost data belongs to an original data block; if so, decoding with the longitudinal check block of the data node where the lost data is located to recover the lost data, and storing it on that data node; otherwise, judging whether the lost data belongs to a longitudinal check block; if so, performing longitudinal erasure coding again to regenerate the lost check block, and storing it on the data node where it was located; otherwise, the lost data belongs to a transverse check block, transverse erasure coding is performed again to regenerate it, and it is stored on the check node.
The erasure code coding layout method based on the distributed storage system further comprises step 4: when a data node fails, decoding with the transverse check blocks of the check nodes to first recover the stripe where the longitudinal check block is located, and then decoding to recover the original data blocks, the last remaining original data block being decoded and recovered using the recovered longitudinal check block.
The erasure code coding layout method based on the distributed storage system further comprises step 5: when a check node fails, performing transverse erasure coding on the original data blocks in the data nodes to recover the failed check node.
The erasure code coding layout method based on the distributed storage system is characterized in that the longitudinal erasure codes and the transverse erasure codes belong to parity check codes, and the number of the transverse check blocks is greater than that of the longitudinal check blocks.
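The five fault cases in steps 3 to 5 can be summarized as a lookup from fault type to recovery action. The sketch below is only an illustration of that decision structure: the enum, the table and the function name are hypothetical, and the second field simply records whether the step, as described, has to read blocks from other nodes.

```python
from enum import Enum
from typing import Tuple


class Fault(Enum):
    ORIGINAL_BLOCK = "original data block lost"
    LONGITUDINAL_CHECK = "longitudinal check block lost"
    TRANSVERSE_CHECK = "transverse check block lost"
    DATA_NODE = "data node failed"
    CHECK_NODE = "check node failed"


# fault -> (recovery action, needs blocks from other nodes)
RECOVERY_PLAN = {
    Fault.ORIGINAL_BLOCK: ("decode with the longitudinal check block on the same node", False),
    Fault.LONGITUDINAL_CHECK: ("re-encode the node's original blocks longitudinally", False),
    Fault.TRANSVERSE_CHECK: ("re-encode the stripe's blocks transversely", True),
    Fault.DATA_NODE: ("decode stripes transversely, finish the last block longitudinally", True),
    Fault.CHECK_NODE: ("re-encode every stripe transversely", True),
}


def plan_recovery(fault: Fault) -> Tuple[str, bool]:
    """Return the recovery action for a fault and whether it causes cross-node traffic."""
    return RECOVERY_PLAN[fault]


for fault in Fault:
    action, remote = plan_recovery(fault)
    print(f"{fault.value}: {action} (cross-node reads: {remote})")
```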
The invention also provides an erasure code coding layout system based on the distributed storage system, which comprises:
the initial module is used for acquiring a distributed storage system with a plurality of storage nodes, setting transverse and longitudinal coding parameters according to the number of the storage nodes of the distributed storage system, and dividing all the storage nodes into data nodes for storing data blocks and check nodes for storing transverse check blocks according to storage contents;
the encoding module is used for respectively carrying out longitudinal and transverse erasure coding on each original data block on each data node according to the transverse and longitudinal coding parameters to obtain a longitudinal check block and a transverse check block corresponding to each original data block; storing the transverse check block to a check node, and storing the longitudinal check block to the data node corresponding to the original data block;
the recovery module is used for judging, when data is lost, whether the lost data belongs to an original data block; if so, decoding with the longitudinal check block of the data node where the lost data is located to recover the lost data, and storing it on that data node; otherwise, judging whether the lost data belongs to a longitudinal check block; if so, performing longitudinal erasure coding again to regenerate the lost check block, and storing it on the data node where it was located; otherwise, the lost data belongs to a transverse check block, transverse erasure coding is performed again to regenerate it, and it is stored on the check node.
The erasure code coding layout system based on the distributed storage system is characterized in that the recovery module is further used for decoding the transverse check block of the check node when the data node fails to obtain the stripe where the longitudinal check block is located, and then decoding and recovering the original data block until the last remaining original data block is decoded and recovered by using the longitudinal check block obtained by recovery.
The erasure code coding layout system based on the distributed storage system is characterized in that the recovery module is further configured to perform transverse erasure coding on each original data block in the data node when the check node fails, so as to recover the failed check node.
The erasure code coding layout system based on the distributed storage system is characterized in that the longitudinal erasure codes and the transverse erasure codes belong to parity check codes, and the number of the transverse check blocks is greater than that of the longitudinal check blocks.
The invention also provides a storage medium for storing a program for executing any erasure code coding layout method based on the distributed storage system.
The invention also provides a client used for any erasure code coding layout system based on the distributed storage system.
According to the scheme, the invention has the advantages that:
First, the id and position of each original data block in the distributed file storage system are read; transverse and longitudinal parity check blocks are then generated and recorded in a file. When data is lost, the fault type is judged according to the file, and encoding or decoding is performed to recover the lost data block or parity check block. The layout method provided by the invention improves the reliability of the distributed cluster and the efficiency of cluster data recovery, and reduces the amount of data transmitted and the bandwidth occupied during recovery.
Drawings
FIG. 1 is a layout diagram of the present invention;
FIG. 2 is a flow chart of data recovery according to the present invention.
Detailed Description
The method provided by the invention comprises the following steps:
A. Select the values of RS (n, k) and RS (n', k') according to the number of distributed cluster nodes.
A1. Determine the values of n, k, n' and k' according to the number of nodes in the distributed cluster and the commonly used RS coding configurations.
A2. Divide all nodes in the cluster into data nodes and parity check nodes according to the above conditions, wherein the data nodes only store data blocks, and the parity check nodes only store parity check blocks.
B. Encode, mark and record all data blocks based on the determined number of nodes.
B1. Read the id and position of each original data block in the distributed file storage system.
B2. Perform longitudinal erasure coding on all data blocks on the corresponding node according to the longitudinal RS (n', k') determined in step A.
B3. Perform transverse erasure coding on all data blocks according to the transverse RS (n, k) determined in step A.
B4. Store the label of each block in a file together with the content read in step B1.
C. Judge the fault type according to the errors reported when the user reads data blocks and the file uploaded in step B.
C1. Determine whether a single data block is lost.
C2. Determine whether a single node has failed.
D. Select a data recovery mode according to the distributed cluster fault condition judged in step C.
D1. Recover the lost single data block according to the judgment result of step C.
D2. Recover the single failed node according to the judgment result of step C.
D3. Read all the recovered data, compare it with the content in the file, and check whether the recovery succeeded.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
The steps of the present invention are further described below in conjunction with FIGS. 1 and 2, comprising: A. selecting values of RS (n, k) and RS (n', k') according to the number of distributed cluster nodes; B. encoding, marking and recording all data blocks based on the determined number of nodes; C. judging the fault type according to the errors reported when the user reads data blocks and the file uploaded in step B; D. selecting a data recovery mode according to the distributed cluster fault condition judged in step C. One specific implementation is as follows:
A. Select the values of RS (n, k) and RS (n', k') according to the number of distributed cluster nodes.
A1. Determine the values of n, k, n' and k' according to the number of nodes in the distributed cluster and the commonly used RS coding configurations. An (n, k) code encodes n data blocks into k additional transverse parity blocks. The total number of nodes must be greater than or equal to the sum of n and k, and the encoding result of RS (n, k) is stored on the parity check nodes. n' and k' are less than or equal to n and k respectively; a fixed combination can be used to calculate the longitudinal parity check blocks inside each node, and the results are stored on the respective nodes.
A2. Divide all nodes in the cluster into data nodes and parity check nodes according to the above conditions. The data nodes store the original data blocks and the longitudinal parity blocks calculated with the parameters from step A1, and the parity check nodes store only the transverse parity blocks.
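The constraints in step A can be written down as a small validity check. This is a minimal sketch, not the patent's procedure: the class and function names are hypothetical, and assigning the first n nodes as data nodes and the next k as parity check nodes is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class LayoutParams:
    n: int        # data blocks per transverse stripe
    k: int        # transverse parity blocks per stripe
    n_prime: int  # data blocks per longitudinal group inside a node
    k_prime: int  # longitudinal parity blocks per group


def validate(params: LayoutParams, total_nodes: int) -> List[str]:
    """Return a list of violated constraints (empty if the parameters are usable)."""
    problems = []
    if total_nodes < params.n + params.k:
        problems.append("number of nodes must be at least n + k")
    if params.n_prime > params.n:
        problems.append("n' must not exceed n")
    if params.k_prime > params.k:
        problems.append("k' must not exceed k")
    return problems


def split_nodes(node_ids: Sequence[str], params: LayoutParams) -> Tuple[List[str], List[str]]:
    """Assumed policy: the first n nodes hold data, the next k hold transverse parity."""
    data_nodes = list(node_ids[:params.n])
    parity_nodes = list(node_ids[params.n:params.n + params.k])
    return data_nodes, parity_nodes


params = LayoutParams(n=3, k=2, n_prime=3, k_prime=1)  # the FIG. 1 style example
print(validate(params, total_nodes=5))
print(split_nodes(["n0", "n1", "n2", "n3", "n4"], params))
```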
B. All data blocks are encoded, marked and recorded based on the determined number of nodes.
B1. Read the id and position of each original data block in the distributed file storage system.
B1-1. Set the number of copies of stored data on all nodes to 1 through the cluster configuration file.
B1-2. Obtain the id and storage position of each original data block in the distributed file storage system through the API, including which blocks were split from the same file and which node holds each of them.
B1-3. Arrange the original data blocks into an abstract matrix according to the information obtained in step B1-2, to be used by the subsequent encoding that generates the parity check blocks.
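Step B1-3 can be pictured as turning a flat list of (block id, node) records into a grid with one column per data node. The sketch below assumes that shape for the abstract matrix; the text does not fix the exact layout, and the record format and function name are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Optional, Sequence, Tuple


def build_matrix(
    records: Iterable[Tuple[str, str]], data_nodes: Sequence[str]
) -> List[List[Optional[str]]]:
    """Arrange block ids into rows; matrix[r][c] is the r-th block on data_nodes[c].

    Positions with no block are filled with None.
    """
    per_node = defaultdict(list)
    for block_id, node_id in records:
        per_node[node_id].append(block_id)
    height = max((len(per_node[node]) for node in data_nodes), default=0)
    return [
        [per_node[node][r] if r < len(per_node[node]) else None for node in data_nodes]
        for r in range(height)
    ]


records = [("blk_0", "n0"), ("blk_1", "n1"), ("blk_2", "n0"), ("blk_3", "n1")]
for row in build_matrix(records, ["n0", "n1"]):
    print(row)
```

Each column is then the input to the longitudinal coding on one node, and each row is one transverse stripe.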
B2. Perform longitudinal erasure coding on all data blocks on the corresponding node according to the longitudinal RS (n', k') determined in step A.
B2-1. Perform longitudinal erasure coding on all the original data blocks on each data node according to the RS (n', k') determined in step A and the information obtained in step B1.
B2-2. Store the newly generated node-internal longitudinal parity blocks on their respective nodes, and add an identification to every longitudinal parity block.
B2-3. Record the position and identification of each longitudinal check block in the file in the form of the abstract matrix. One possible layout is shown in FIG. 1, where RS (3,1) is used for longitudinal erasure coding: 1 longitudinal parity block is generated for every 3 data blocks and stored on the current node, and this coding scheme can tolerate the failure of at most 1 data block.
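With a single parity block per group, as in the RS (3,1) example, the longitudinal code can be illustrated with plain XOR parity. The sketch below makes that substitution explicitly: XOR stands in for the single-parity code, the blocks are assumed to have equal length, and the function names are hypothetical.

```python
from typing import List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks must have the same length")
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def encode_longitudinal(node_blocks: Sequence[bytes], group_size: int = 3) -> List[bytes]:
    """Return one parity block for every `group_size` original blocks on one node."""
    return [
        xor_blocks(node_blocks[start:start + group_size])
        for start in range(0, len(node_blocks), group_size)
    ]


blocks_on_node = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = encode_longitudinal(blocks_on_node)[0]
print(parity.hex())
```

Because the parity block stays on the same node as the three blocks it protects, rebuilding any one of them needs no network transfer.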
B3. Perform transverse erasure coding on all data blocks according to the transverse RS (n, k) determined in step A.
B3-1. According to the transverse RS (n, k) determined in step A and the storage positions produced by the longitudinal erasure coding in step B2, perform transverse erasure coding on all current data blocks (including the original data blocks and the longitudinal parity blocks).
B3-2. Store the generated transverse parity blocks on the parity check nodes stripe by stripe, and add an identification to every transverse parity block.
B3-3. Record the position and identification of each transverse check block in the file in the form of the abstract matrix. One possible layout is shown in FIG. 1, where RS (3,2) is used for transverse erasure coding: 2 transverse parity blocks are generated for every 3 data blocks and stored on the parity check nodes, and this coding scheme can tolerate the failure of at most 2 data blocks.
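A two-parity transverse code such as the RS (3,2) example needs finite-field arithmetic. The sketch below is not the patent's encoder: it swaps in the well-known RAID-6 style P/Q construction over GF(2^8) (P is the XOR of the blocks, Q weights block i by 2^i), which also tolerates two lost blocks per stripe. Only the case of two lost data blocks is shown, and all names are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, exponent: int) -> int:
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Inverse of a nonzero element, using a^255 = 1."""
    return gf_pow(a, 254)


def encode_transverse(stripe: Sequence[bytes]) -> Tuple[bytes, bytes]:
    """Return the two parity blocks (P, Q) of one stripe of equally sized blocks."""
    size = len(stripe[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(stripe):
        coeff = gf_pow(2, index)
        for i, byte in enumerate(block):
            p[i] ^= byte
            q[i] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(
    stripe: Sequence[Optional[bytes]], p: bytes, q: bytes, x: int, y: int
) -> Tuple[bytes, bytes]:
    """Rebuild lost data blocks x and y (x != y) from the survivors plus P and Q."""
    size = len(p)
    p_xy, q_xy = bytearray(p), bytearray(q)
    for index, block in enumerate(stripe):
        if index in (x, y):
            continue  # lost blocks contribute nothing
        coeff = gf_pow(2, index)
        for i, byte in enumerate(block):
            p_xy[i] ^= byte
            q_xy[i] ^= gf_mul(coeff, byte)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx = bytes(gf_mul(denom_inv, q_xy[i] ^ gf_mul(gy, p_xy[i])) for i in range(size))
    dy = bytes(p_xy[i] ^ dx[i] for i in range(size))
    return dx, dy


stripe = [b"abcd", b"efgh", b"ijkl"]
p, q = encode_transverse(stripe)
print(recover_two([None, stripe[1], None], p, q, 0, 2))
```

In the layout described above, P and Q would be the two transverse parity blocks of a stripe and would be stored on the parity check nodes.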
B4. Store the label of each block in a file together with the content read in step B1.
B4-1. After the above steps are completed, there are three types of blocks: original data blocks, longitudinal parity blocks and transverse parity blocks, each with its own identification. Record the identification and the position in the abstract matrix of every block in a file.
B4-2. Upload the file to the distributed file storage system for storage, so that it can be downloaded and checked when data loss occurs.
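The file described in step B4 could take many forms; the text does not specify one. The sketch below assumes a JSON document with one entry per block, and every field name in it (`id`, `type`, `row`, `col`, `node`) is hypothetical.

```python
import json
from typing import Dict, List, Optional, Sequence


def build_metadata(
    matrix: Sequence[Sequence[Optional[str]]],
    kinds: Dict[str, str],
    node_of_column: Sequence[str],
) -> str:
    """Serialize block id, type, matrix position and node for every block in the matrix.

    `kinds` maps a block id to "original", "longitudinal" or "transverse".
    """
    entries: List[dict] = []
    for r, row in enumerate(matrix):
        for c, block_id in enumerate(row):
            if block_id is None:
                continue
            entries.append({
                "id": block_id,
                "type": kinds[block_id],
                "row": r,
                "col": c,
                "node": node_of_column[c],
            })
    return json.dumps({"blocks": entries}, indent=2, sort_keys=True)


def load_metadata(text: str) -> Dict[str, dict]:
    """Parse the file back into a dict keyed by block id."""
    return {entry["id"]: entry for entry in json.loads(text)["blocks"]}


matrix = [["d0", "d1"], ["v0", "v1"]]
kinds = {"d0": "original", "d1": "original", "v0": "longitudinal", "v1": "longitudinal"}
print(build_metadata(matrix, kinds, ["n0", "n1"]))
```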
C. Judge the fault type according to the errors reported when the user reads data blocks and the file uploaded in step B.
C1. Determine whether a single data block is lost.
C1-1. After a failure, read all data blocks, compare them with the blocks and identifications recorded in the file, and judge which type of block is lost: original data block, transverse check block or longitudinal check block.
C2. Determine whether a single node has failed.
C2-1. After a failure, read all data blocks, compare them with the blocks and identifications recorded in the file, and judge which type of node has failed: data node or parity check node.
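Step C amounts to comparing the blocks that can still be read against the recorded file. The sketch below assumes the file has been loaded into a dict that maps each block id to its type and node; that structure, the returned strings and the function name are all hypothetical.

```python
from typing import Dict, Set


def classify_fault(metadata: Dict[str, dict], surviving_ids: Set[str]) -> str:
    """Classify a fault by comparing recorded blocks with the blocks still readable."""
    missing = [block_id for block_id in metadata if block_id not in surviving_ids]
    if not missing:
        return "no fault"
    nodes = {metadata[block_id]["node"] for block_id in missing}
    if len(nodes) == 1:
        node = next(iter(nodes))
        on_node = [b for b, entry in metadata.items() if entry["node"] == node]
        # Every block of a multi-block node is gone: treat it as a node failure.
        if len(missing) == len(on_node) and len(on_node) > 1:
            only_transverse = all(metadata[b]["type"] == "transverse" for b in on_node)
            return "check node failure" if only_transverse else "data node failure"
    if len(missing) == 1:
        return metadata[missing[0]]["type"] + " block lost"
    return "multiple faults"


metadata = {
    "d0": {"type": "original", "node": "n0"},
    "d1": {"type": "original", "node": "n0"},
    "v0": {"type": "longitudinal", "node": "n0"},
    "h0": {"type": "transverse", "node": "p0"},
    "h1": {"type": "transverse", "node": "p0"},
}
print(classify_fault(metadata, {"d0", "v0", "h0", "h1"}))
print(classify_fault(metadata, {"h0", "h1"}))
```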
D. Select a data recovery mode according to the distributed cluster fault condition judged in step C.
D1. Recover the lost single data block according to the judgment result of step C.
D1-1. If an original data block is lost, it only needs to be decoded using the longitudinal parity check block on the same node, and no data needs to be transmitted.
D1-2. If a longitudinal parity block is lost, encode the original data blocks with the longitudinal RS (n', k') to regenerate the parity block, and store it on the current node; no data needs to be transmitted.
D1-3. If a transverse parity block is lost, encode the original data blocks on the same stripe as the lost parity block with the transverse RS (n, k) to regenerate the transverse parity block, and store it on the parity check node.
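Cases D1-1 and D1-2 are purely local. The sketch below illustrates them with XOR parity standing in for the single-parity longitudinal code (an assumption, matching the RS (3,1) example); function names are hypothetical. Case D1-3 is the transverse encoding step run again on one stripe and is not repeated here.

```python
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_original(group: Sequence[bytes], parity: bytes, lost_index: int) -> bytes:
    """D1-1: rebuild one lost original block from its group mates and the local parity.

    The entry at `lost_index` is ignored, so it may hold a placeholder.
    """
    survivors = [block for i, block in enumerate(group) if i != lost_index]
    return xor_blocks(survivors + [parity])


def recover_longitudinal_parity(group: Sequence[bytes]) -> bytes:
    """D1-2: regenerate a lost longitudinal parity block by re-encoding the group."""
    return xor_blocks(group)


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = recover_longitudinal_parity(group)
print(recover_original([group[0], b"", group[2]], parity, lost_index=1))
```

Both functions read only blocks that live on the same node as the lost one, which is the source of the bandwidth saving described in D1-1 and D1-2.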
D2. Recover the single failed node according to the judgment result of step C.
D2-1. If a data node has failed, first use the transverse check blocks on the parity check nodes to decode and recover the stripe where the longitudinal check block is located, then decode and recover any n-1 original data blocks, and finally decode and recover the remaining original data block using the longitudinal parity check block recovered first.
D2-2. If a parity check node has failed, the transverse parity check block of each stripe can be recovered by re-running the encoding calculation.
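The ordering in D2-1 can be illustrated with a toy layout. The sketch below makes several simplifying assumptions that are not in the text: both directions use single XOR parity, each data node holds its original blocks followed by one longitudinal parity block, and there is one transverse parity block per row. It also counts how many blocks are read from other nodes; all names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_data_node(
    columns: Sequence[Optional[Sequence[bytes]]],
    transverse_parity: Sequence[bytes],
    failed: int,
) -> Tuple[List[bytes], int]:
    """Rebuild the column of data node `failed`.

    columns[c] lists the blocks on data node c (originals first, longitudinal
    parity last); transverse_parity[r] is the XOR of row r across all data nodes.
    Returns the rebuilt column and the number of blocks read from other nodes.
    """
    height = len(transverse_parity)
    remote_reads = 0

    def decode_row(r: int) -> bytes:
        nonlocal remote_reads
        others = [col[r] for c, col in enumerate(columns) if c != failed]
        remote_reads += len(others) + 1  # surviving row blocks plus the parity block
        return xor_blocks(others + [transverse_parity[r]])

    longitudinal = decode_row(height - 1)                       # 1: parity stripe first
    originals = [decode_row(r) for r in range(height - 2)]      # 2: all but one original
    originals.append(xor_blocks(originals + [longitudinal]))    # 3: last block, locally
    return originals + [longitudinal], remote_reads


def make_column(seed: int) -> List[bytes]:
    originals = [bytes([seed + r, 2 * seed + r]) for r in range(3)]
    return originals + [xor_blocks(originals)]


cols = [make_column(1), make_column(10), make_column(40)]
parity = [xor_blocks([col[r] for col in cols]) for r in range(4)]
rebuilt, reads = recover_data_node([cols[0], None, cols[2]], parity, failed=1)
print(rebuilt == cols[1], reads)
```

In this toy layout the last original block is rebuilt from blocks already recovered, so three rows are fetched over the network instead of four.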
D3. Read all the recovered data, compare it with the content in the file, and check whether the recovery succeeded.
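The text does not say what is compared in step D3. One simple possibility, assumed here purely for illustration, is that the file records a SHA-256 digest for every block at encoding time; the function names and the digest choice are hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(block: bytes) -> str:
    """Hex SHA-256 digest of a block (assumed to be what the file records)."""
    return hashlib.sha256(block).hexdigest()


def verify_recovery(
    expected_digests: Dict[str, str], recovered_blocks: Dict[str, bytes]
) -> List[str]:
    """Return the ids of blocks that are missing or whose content differs."""
    return sorted(
        block_id
        for block_id, expected in expected_digests.items()
        if block_id not in recovered_blocks
        or digest(recovered_blocks[block_id]) != expected
    )


expected = {"d0": digest(b"alpha"), "d1": digest(b"beta")}
print(verify_recovery(expected, {"d0": b"alpha", "d1": b"beta"}))
print(verify_recovery(expected, {"d0": b"alpha", "d1": b"BETA"}))
```

An empty list means every recorded block was recovered with matching content.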
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides an erasure code coding layout system based on the distributed storage system, which comprises:
the initial module is used for acquiring a distributed storage system with a plurality of storage nodes, setting transverse and longitudinal coding parameters according to the number of the storage nodes of the distributed storage system, and dividing all the storage nodes into data nodes for storing data blocks and check nodes for storing transverse check blocks according to storage contents;
the encoding module is used for respectively carrying out longitudinal and transverse erasure coding on each original data block on each data node according to the transverse and longitudinal coding parameters to obtain a longitudinal check block and a transverse check block corresponding to each original data block; storing the transverse check block to a check node, and storing the longitudinal check block to the data node corresponding to the original data block;
the recovery module is used for judging, when data is lost, whether the lost data belongs to an original data block; if so, decoding with the longitudinal check block of the data node where the lost data is located to recover the lost data, and storing it on that data node; otherwise, judging whether the lost data belongs to a longitudinal check block; if so, performing longitudinal erasure coding again to regenerate the lost check block, and storing it on the data node where it was located; otherwise, the lost data belongs to a transverse check block, transverse erasure coding is performed again to regenerate it, and it is stored on the check node.
The erasure code coding layout system based on the distributed storage system is characterized in that the recovery module is further used for decoding the transverse check block of the check node when the data node fails to obtain the stripe where the longitudinal check block is located, and then decoding and recovering the original data block until the last remaining original data block is decoded and recovered by using the longitudinal check block obtained by recovery.
The erasure code coding layout system based on the distributed storage system is characterized in that the recovery module is further configured to perform transverse erasure coding on each original data block in the data node when the check node fails, so as to recover the failed check node.
The erasure code coding layout system based on the distributed storage system is characterized in that the longitudinal erasure codes and the transverse erasure codes belong to parity check codes, and the number of the transverse check blocks is greater than that of the longitudinal check blocks.
The invention also provides a storage medium for storing a program for executing any erasure code coding layout method based on the distributed storage system.
The invention also provides a client used for any erasure code coding layout system based on the distributed storage system.
The invention reads the id and position of each original data block in a distributed file storage system, generates transverse and longitudinal parity check blocks, marks the blocks with one of 3 types and records them in a file. When the distributed cluster has a fault, the fault type is judged according to the file, and the data is encoded or decoded to recover the lost data block or node. With the layout and recovery method provided by the invention, the amount of data transmitted and the cross-rack and cross-node bandwidth occupied during recovery are reduced. As a result, the recovery efficiency and reliability of the whole system are improved, and the method has good market prospects and application value.

Claims (10)

1. An erasure code coding layout method based on a distributed storage system is characterized by comprising the following steps:
step 1, acquiring a distributed storage system with a plurality of storage nodes, setting transverse and longitudinal coding parameters according to the number of the storage nodes of the distributed storage system, and dividing all the storage nodes into data nodes for storing data blocks and check nodes for storing transverse check blocks according to storage contents;
step 2, according to the transverse and longitudinal coding parameters, performing longitudinal and transverse erasure coding on each original data block on each data node to obtain a longitudinal check block and a transverse check block corresponding to each original data block; storing the transverse check block on a check node, and storing the longitudinal check block on the data node that holds the corresponding original data block;
step 3, when data is lost, judging whether the lost data belongs to an original data block; if so, decoding with the longitudinal check block of the data node where the lost data is located to recover the lost data, and storing it on that data node; otherwise, judging whether the lost data belongs to a longitudinal check block; if so, performing longitudinal erasure coding again to regenerate the lost check block, and storing it on the data node where it was located; otherwise, the lost data belongs to a transverse check block, transverse erasure coding is performed again to regenerate it, and it is stored on the check node.
2. The distributed storage system based erasure code coding layout method of claim 1, further comprising step 4: when a data node fails, decoding with the transverse check blocks of the check nodes to first recover the stripe where the longitudinal check block is located, and then decoding to recover the original data blocks, the last remaining original data block being decoded and recovered using the recovered longitudinal check block.
3. The distributed storage system based erasure code coding layout method of claim 1 or 2, further comprising step 5: when a check node fails, performing transverse erasure coding on the original data blocks in the data nodes to recover the failed check node.
4. The distributed storage system-based erasure code coding layout method of claim 1, wherein the vertical and horizontal erasure codes belong to parity codes, and the number of horizontal parity check blocks is greater than the number of vertical parity check blocks.
5. An erasure code coding layout system based on a distributed storage system, comprising:
the initial module is used for acquiring a distributed storage system with a plurality of storage nodes, setting transverse and longitudinal coding parameters according to the number of the storage nodes of the distributed storage system, and dividing all the storage nodes into data nodes for storing data blocks and check nodes for storing transverse check blocks according to storage contents;
the encoding module is used for respectively carrying out longitudinal and transverse erasure coding on each original data block on each data node according to the transverse and longitudinal coding parameters to obtain a longitudinal check block and a transverse check block corresponding to each original data block; storing the transverse check block to a check node, and storing the longitudinal check block to the data node corresponding to the original data block;
the recovery module is used for judging, when data is lost, whether the lost data belongs to an original data block; if so, decoding with the longitudinal check block of the data node where the lost data is located to recover the lost data, and storing the recovered data into that data node; otherwise, judging whether the lost data belongs to a longitudinal check block; if so, performing longitudinal erasure coding again to regenerate the lost data, and storing the regenerated data into the data node where the lost data is located; otherwise, the lost data belongs to a transverse check block, so transverse erasure coding is performed again to regenerate the lost data, and the regenerated data is stored into the check node.
6. The distributed storage system-based erasure code coding layout system of claim 5, wherein the recovery module is further configured to, when a data node fails, decode with the transverse check blocks of the check node to recover the stripe where the longitudinal check block is located, and then decode and recover the original data blocks, until the last remaining original data block is decoded and recovered using the recovered longitudinal check block.
7. The distributed storage system-based erasure code coding layout system of claim 5 or 6, wherein the recovery module is further configured to perform transverse erasure coding on each original data block in the data nodes when the check node fails, so as to regenerate the contents of the failed check node.
8. The distributed storage system-based erasure code coding layout system of claim 5, wherein the longitudinal and transverse erasure codes are parity codes, and the number of transverse check blocks is greater than the number of longitudinal check blocks.
9. A storage medium storing a program for executing the erasure code coding layout method based on the distributed storage system of any one of claims 1 through 4.
10. A client for use in an erasure coding layout system based on a distributed storage system as claimed in any one of claims 6 to 8.
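
The recovery cases recited in claims 1 to 3 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a single XOR parity block in each direction, which does not reproduce the block counts required by claim 4, and it assumes the transverse parity also covers the row holding the longitudinal check blocks. All function names (xor_blocks, encode, recover_local, recover_check_block, recover_data_node) are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (a single-parity code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_nodes):
    """data_nodes: one list of equal-length byte blocks per data node.

    Returns (nodes, check_node). Each node keeps its original blocks plus
    one longitudinal check block; the check node holds one transverse
    check block per row (assumption: the longitudinal row is covered too).
    """
    nodes = [blocks + [xor_blocks(blocks)] for blocks in data_nodes]
    check_node = [xor_blocks(row) for row in zip(*nodes)]
    return nodes, check_node


def recover_local(nodes, node_idx, row_idx):
    """Lost original block or lost longitudinal check block.

    With XOR parity, decoding an original block and re-encoding the check
    block are the same operation: XOR the other blocks on the same node.
    """
    others = [b for r, b in enumerate(nodes[node_idx]) if r != row_idx]
    return xor_blocks(others)


def recover_check_block(nodes, row_idx):
    """Lost transverse check block: re-encode the row across data nodes."""
    return xor_blocks([node[row_idx] for node in nodes])


def recover_data_node(nodes, check_node, failed):
    """Whole data node lost: rebuild every row from the surviving data
    nodes and the transverse check block of that row."""
    rebuilt = []
    for r, parity in enumerate(check_node):
        survivors = [n[r] for i, n in enumerate(nodes) if i != failed]
        rebuilt.append(xor_blocks(survivors + [parity]))
    return rebuilt
```

In this sketch a single lost block is repaired from blocks on the same data node only, so no cross-node traffic is needed, while the transverse check blocks are read only when an entire node is lost.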
CN202111481100.7A 2021-12-06 2021-12-06 Erasure code coding layout method and system based on distributed storage system Pending CN114237971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111481100.7A CN114237971A (en) 2021-12-06 2021-12-06 Erasure code coding layout method and system based on distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111481100.7A CN114237971A (en) 2021-12-06 2021-12-06 Erasure code coding layout method and system based on distributed storage system

Publications (1)

Publication Number Publication Date
CN114237971A true CN114237971A (en) 2022-03-25

Family

ID=80753495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111481100.7A Pending CN114237971A (en) 2021-12-06 2021-12-06 Erasure code coding layout method and system based on distributed storage system

Country Status (1)

Country Link
CN (1) CN114237971A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463812A (en) * 2020-12-21 2021-03-09 重庆邮电大学 Optimization method for updating repair data based on multi-machine frame of Ceph distributed system
CN117370067A (en) * 2023-12-07 2024-01-09 融科联创(天津)信息技术有限公司 Data layout and coding method of large-scale object storage system
CN117370067B (en) * 2023-12-07 2024-04-12 融科联创(天津)信息技术有限公司 Data layout and coding method of large-scale object storage system
CN117950916A (en) * 2024-03-26 2024-04-30 陕西中安数联信息技术有限公司 High-reliability data backup method and system

Similar Documents

Publication Publication Date Title
CN114237971A (en) Erasure code coding layout method and system based on distributed storage system
Silberstein et al. Lazy means smart: Reducing repair bandwidth costs in erasure-coded distributed storage
CN108170555B (en) Data recovery method and equipment
CN107544862B (en) Stored data reconstruction method and device based on erasure codes and storage node
US10140172B2 (en) Network-aware storage repairs
CN109643258B (en) Multi-node repair using high-rate minimal storage erase code
CN107885612B (en) Data processing method, system and device
US11531593B2 (en) Data encoding, decoding and recovering method for a distributed storage system
US11188404B2 (en) Methods of data concurrent recovery for a distributed storage system and storage medium thereof
CN109814807B (en) Data storage method and device
CN111614720B (en) Cross-cluster flow optimization method for single-point failure recovery of cluster storage system
CN109491835B (en) Data fault-tolerant method based on dynamic block code
CN107689983B (en) Cloud storage system and method based on low repair bandwidth
CN116501553B (en) Data recovery method, device, system, electronic equipment and storage medium
CN110427156A (en) A kind of parallel reading method of the MBR based on fragment
Esmaili et al. The core storage primitive: Cross-object redundancy for efficient data repair & access in erasure coded storage
CN109358980A (en) A kind of pair of data update and single disk error repairs friendly RAID6 coding method
CN106027638A (en) Hadoop data distribution method based on hybrid coding
JP2021086289A (en) Distributed storage system and parity update method of distributed storage system
CN112000278B (en) Self-adaptive local reconstruction code design method for thermal data storage and cloud storage system
CN111506450B (en) Method, apparatus and computer program product for data processing
CN113157715B (en) Erasure code data center rack collaborative updating method
CN111224747A (en) Coding method capable of reducing repair bandwidth and disk reading overhead and repair method thereof
CN106911793B (en) I/O optimized distributed storage data repair method
CN113504875B (en) Method and system for recovering erasure code system based on multistage scheduling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination