CN113708780B - Partial repetition code construction method based on shadow - Google Patents
- Publication number
- CN113708780B (application CN202110931106.3A)
- Authority
- CN
- China
- Prior art keywords
- code
- shadow
- phi
- heterogeneous
- storage capacity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/47—Error detection, forward error correction or error protection, not provided for in groups H03M13/01 - H03M13/37
Abstract
The invention discloses a shadow-based method for constructing partial repetition (fractional repetition, FR) codes, comprising the following steps. Step 1: divide an original file M into k original data blocks and perform (n, k) MDS coding on them to obtain n coded data blocks. Step 2: construct a set X and a set ψ according to the number n of coded data blocks, wherein X comprises n distinct elements, ψ comprises t subsets φ, each subset φ is a (d+1)-element subset of X, and the subsets φ have no elements in common. Step 3: obtain the shadow set of ψ, which comprises t sub-shadow groups; each group comprises (d+1) sets φ′, each set φ′ comprises d elements, and each set φ′ consists of the elements remaining after one element of the subset φ is deleted. Step 4: construct an FR code from the shadow set. The repair locality of the constructed FR code is low and does not grow as the system parameters increase, and a suitable node storage capacity and data repetition degree can be selected according to system requirements.
Description
Technical Field
The invention belongs to the field of computers, and particularly relates to a partial repetition code construction method based on shadow.
Background
As technology advances, ever more data must be stored. Traditional storage systems cannot meet the demands of mass data storage, which has given rise to distributed storage systems capable of storing large amounts of data. Data loss is common in distributed storage systems, so measures are needed to ensure data reliability; replication and erasure coding are typically employed. However, replication incurs a large storage overhead, while erasure-code repair is complex and requires downloading the whole file, which leads to a large repair bandwidth overhead. In 2010, El Rouayheb and Ramchandran therefore proposed fractional repetition (FR) codes, referred to here as partial repetition codes, for exact repair. FR codes support exact, uncoded repair of multiple failed nodes with low repair bandwidth overhead and low computational complexity, which greatly improves the repair performance of failed nodes. Many methods exist for constructing FR codes, such as those based on Steiner systems and pairwise balanced designs.
In existing FR code constructions, the system parameters often cannot be adjusted. For example, partial repetition codes constructed from regular graphs have a fixed repetition degree and can repair only single-node failures. Prajapati proposed a partial repetition code with a ring structure, but its parameters cannot be adjusted in time according to system requirements. FR codes based on group divisible designs allow a suitable storage capacity or repetition degree to be selected according to system requirements, but their repair locality increases as the parameters increase.
Disclosure of Invention
The invention aims to provide a shadow-based method for constructing partial repetition codes, in order to solve the problems in the prior art that the storage capacity and the repetition degree cannot be changed according to system requirements and that the repair locality is large.
To achieve this, the invention adopts the following technical scheme:
A shadow-based partial repetition code construction method comprises the following steps:
Step 1: divide an original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks, where k ≥ 2 and n ≥ k;
Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X comprises n distinct elements, ψ comprises t subsets φ, each subset φ is a (d+1)-element subset of X, the subsets φ have no elements in common, d is a positive integer, and (d+1) < n;
Step 3: obtain the shadow set of ψ, where the shadow set comprises t sub-shadow groups, each group comprises (d+1) sets φ′, each set φ′ comprises d elements, and each set φ′ consists of the elements remaining after one element of the subset φ is deleted;
Step 4: construct an FR code from the shadow set, in one of three cases:
Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ′ of the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ′;
Case two: if a repetition-degree heterogeneous FR code is constructed, any one set φ′ is deleted from each sub-shadow group of the shadow set, giving a pruned shadow set that comprises t sub-shadow groups, each comprising d sets φ′;
each node of the repetition-degree heterogeneous FR code corresponds to one set φ′ of the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ′;
Case three: if a storage-capacity heterogeneous FR code is constructed, the incidence matrix A of the pruned shadow set of case two is constructed, and the rows and columns of A are exchanged to obtain a matrix A′;
each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A′; the number of nodes is the number of rows of A′, the node storage capacity is d or d−1, the repetition degree is d, and the data blocks stored by each node are given by the columns that hold a 1 in the corresponding row.
Compared with the prior art, the invention has the following technical characteristics:
1. Constructing partial repetition codes from shadows is a new algorithm. Using it to construct FR codes is simpler, more intuitive and more efficient, and the repair locality of the constructed FR codes is low and does not grow as the system parameters increase.
2. With partial repetition codes based on the shadow construction, a suitable node storage capacity and data repetition degree can be selected according to system requirements.
Drawings
Fig. 1 shows the structure of a homogeneous FR code based on the shadow construction;
Fig. 2 shows a repetition-degree heterogeneous FR code based on the shadow construction;
Fig. 3 shows a storage-capacity heterogeneous FR code based on the shadow construction.
Detailed Description
The present invention is described in further detail below with reference to the drawings and examples. The following embodiments are given only to make the objects, technical solutions and advantages of the invention clearer to those skilled in the art; the invention is not limited to these embodiments.
Shadow construction: let X be a set of n elements, let $\binom{X}{k}$ denote the family of all k-element subsets of X, where 0 ≤ k ≤ n, and let $\delta \subseteq \binom{X}{k}$. The set
$$\partial\delta = \left\{ E \in \binom{X}{k-1} : E \subset F \text{ for some } F \in \delta \right\}$$
is called the shadow of δ, where $\binom{X}{k-1}$ denotes the family of all (k−1)-element subsets of X, E denotes a member of $\binom{X}{k-1}$, and F denotes a member of δ. In other words, the shadow set $\partial\delta$ consists of the sets formed by deleting one element from a member of δ.
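The shadow definition above can be illustrated with a minimal Python sketch. The function name `shadow` and the toy family `delta` are hypothetical choices made for illustration; they do not come from the patent text.

```python
from itertools import combinations


def shadow(family):
    """Return the shadow of a family of k-element sets: every (k-1)-element
    set obtained by deleting one element from some member of the family."""
    result = set()
    for member in family:
        member = frozenset(member)
        # All subsets with exactly one element removed.
        for smaller in combinations(sorted(member), len(member) - 1):
            result.add(frozenset(smaller))
    return result


# Hypothetical toy family of 3-element subsets of X = {1, 2, 3, 4}.
delta = [{1, 2, 3}, {1, 2, 4}]
shadow_of_delta = shadow(delta)
```

Because the result is a set of frozensets, a (k−1)-element set that arises from two different members of the family, such as {1, 2} here, is counted only once.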
This embodiment discloses a method for constructing partial repetition codes based on the shadow construction, which specifically comprises the following steps:
Step 1: divide the original file M into k original data blocks, and perform (n, k) MDS coding on the k original data blocks to obtain n coded data blocks $c_1, \ldots, c_{k-1}, c_k, c_{k+1}, \ldots, c_n$, which comprise k original data blocks and n−k check data blocks, where k ≥ 2 and n ≥ k;
Step 2: construct a set X and a set ψ according to the number n of coded data blocks, where X comprises n distinct elements, ψ comprises t subsets φ, each subset φ is a (d+1)-element subset of X, the subsets φ have no elements in common, d is a positive integer, and (d+1) < n;
Step 3: delete each element of each subset φ once, in turn, to obtain the shadow set of ψ, where the shadow set comprises t sub-shadow groups, each group comprises (d+1) sets φ′, and each set φ′ comprises d elements;
Step 4: construct an FR code from the shadow set, in one of three cases:
Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ′ of the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ′;
Case two: if a repetition-degree heterogeneous FR code is constructed, any one set φ′ is deleted from each sub-shadow group of the shadow set, giving a pruned shadow set that comprises t sub-shadow groups, each comprising d sets φ′;
each node of the repetition-degree heterogeneous FR code corresponds to one set φ′ of the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ′;
Case three: if a storage-capacity heterogeneous FR code is constructed, the incidence matrix A of the pruned shadow set of case two is constructed, and the rows and columns of A are exchanged to obtain a matrix A′;
each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A′; the number of nodes is the number of rows of A′, the node storage capacity is d or d−1, the repetition degree is d, and the data blocks stored by each node are given by the columns that hold a 1 in the corresponding row.
Specifically, in case one, the i-th node of the FR code corresponds to the i-th set of the shadow set, and the elements of that set identify the data blocks stored by the i-th node. In accordance with the set ψ, the shadow set is divided into t sub-shadow groups: the s-th sub-shadow group of the FR code (0 < s ≤ t) is generated from the s-th subset $\varphi_s$ of ψ and consists of the sets $\varphi_s'$ derived from it.
Specifically, in case three, each row of the matrix A′ represents a storage node: the i-th row of A′ represents the i-th storage node $N_i$ of the distributed storage system, i = 1, 2, …, n. The FR code is constructed by the following formula:
$$N_i = \{\, j : a_{ij} = 1 \,\} \qquad (2)$$
where j = 1, 2, …, n, i indexes the storage nodes, and $a_{ij}$ denotes the entry in row i and column j of the matrix. $N_i$ denotes a storage node of the FR code; the data blocks it contains are given by the columns that hold a 1 in the i-th row of A′, and extracting these columns yields the data blocks stored on that node. In this way a heterogeneous FR code in which each node has storage capacity d or d−1 and the repetition degree is ρ = d can be constructed.
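Formula (2) can be sketched in Python as follows. The function name `nodes_from_matrix` and the small 0/1 matrix are hypothetical and serve only to show how each row is read; the matrix is not the one from the embodiment.

```python
def nodes_from_matrix(a_prime):
    """Apply N_i = {j : a_ij = 1} to every row of a 0/1 matrix.

    Column indices are reported 1-based, matching the numbering of formula (2).
    """
    return [
        {j + 1 for j, entry in enumerate(row) if entry == 1}
        for row in a_prime
    ]


# Hypothetical 3 x 4 matrix used only for illustration.
a_prime = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
]
nodes = nodes_from_matrix(a_prime)
```

Each entry of `nodes` is the set of block indices stored on one node, so the size of the set is that node's storage capacity.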
Examples
This embodiment provides a method for constructing an infinitely expandable partial repetition (Fractional Repetition, FR) code and, on the basis of the above embodiment, further discloses the following technical features:
This embodiment constructs a (12, 9) MDS code. $M = (m_1, m_2, m_3, m_4, m_5, m_6, m_7, m_8, m_9)$ represents an original file stored in the distributed storage system, and $C = (m_1, m_2, m_3, m_4, m_5, m_6, m_7, m_8, m_9, p_{10}, p_{11}, p_{12})$ represents the systematic MDS codeword, where $m_1, \ldots, m_9$ are the original data blocks and $p_{10}, p_{11}, p_{12}$ are the check data blocks.
In this embodiment, the storage capacity of each partial repetition code node in the distributed storage system is set to d = 3. A set X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} containing 12 elements is therefore selected, and a set ψ of 4-element subsets satisfying the conditions is constructed as follows:
ψ = {{1,4,8,12}, {2,5,9,11}, {3,6,7,10}}    (3)
In this embodiment, ψ contains 3 subsets, so the shadow set is divided into 3 sub-shadow groups. Deleting each element of each subset in turn gives the following sub-shadow groups:
{{4,8,12}, {1,8,12}, {1,4,12}, {1,4,8}}
{{5,9,11}, {2,9,11}, {2,5,11}, {2,5,9}}
{{6,7,10}, {3,7,10}, {3,6,10}, {3,6,7}}
In this embodiment, according to case one of step 4, each set of the shadow set corresponds to one node of the distributed storage system, and a homogeneous FR code is constructed as shown in Fig. 1. Each node has storage capacity d = 3, the repetition degree is ρ = 3, and the nodes are divided into three sub-shadow groups. Sets φ′ can be deleted from the shadow set according to the storage capacity of the system in order to meet the required repetition degree.
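The parameters stated for case one can be checked against the set ψ of this embodiment with a short Python sketch. The helper name `sub_shadow_group` is hypothetical; the sketch assumes only that each node stores the blocks named by one set φ′.

```python
from collections import Counter

# The set psi of the embodiment (equation (3)).
psi = [[1, 4, 8, 12], [2, 5, 9, 11], [3, 6, 7, 10]]


def sub_shadow_group(phi):
    """Delete each element of phi once, giving the (d+1) sets phi' of one group."""
    return [[x for x in phi if x != removed] for removed in phi]


groups = [sub_shadow_group(phi) for phi in psi]

# Case one: every set phi' becomes one storage node.
nodes = [node for group in groups for node in group]

# Repetition degree of a block = number of nodes that store it.
repetition = Counter(block for node in nodes for block in node)
```

With t = 3 and d = 3 this yields t × (d+1) = 12 nodes of capacity 3, and every one of the 12 blocks is stored on 3 nodes, in line with the values stated above.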
In this embodiment, one set φ′ is deleted from each sub-shadow group of the shadow set, giving the pruned shadow set.
From the pruned shadow set, the repetition-degree heterogeneous FR code is obtained according to case two of step 4, as shown in Fig. 2. The constructed FR code has repetition degree ρ = 2 or 3 and storage capacity d = 3.
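Case two can be sketched in the same way. Which set φ′ is deleted from each group is arbitrary; the choice below (dropping the last set of each group) is an assumption made for illustration and need not match Fig. 2.

```python
from collections import Counter

# The set psi of the embodiment (equation (3)).
psi = [[1, 4, 8, 12], [2, 5, 9, 11], [3, 6, 7, 10]]

pruned_nodes = []
for phi in psi:
    # The (d+1) sets phi' of this sub-shadow group.
    group = [[x for x in phi if x != removed] for removed in phi]
    # Assumption: drop the last set of each group; any one set may be dropped.
    pruned_nodes.extend(group[:-1])

# Repetition degree of a block = number of nodes that store it.
repetition = Counter(block for node in pruned_nodes for block in node)
```

Under this choice there are t × d = 9 nodes of capacity 3. In each group, the block that was absent from the deleted set keeps repetition degree 3, and the other three blocks drop to 2, which is where the mixed repetition degree of 2 or 3 comes from.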
In this embodiment, the incidence matrix A corresponding to the pruned shadow set is formed, and exchanging its rows and columns yields the matrix A′.
From the matrix A′, the storage-capacity heterogeneous FR code is obtained according to case three of step 4, as shown in Fig. 3. The result is a heterogeneous partial repetition code with node storage capacity 2 or 3 and repetition degree ρ = 3.
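Case three can be sketched as follows, under two assumptions that the text does not spell out: that A is the 0/1 incidence matrix with one row per retained set φ′ and one column per element of X, and that "row-column exchange" means transposition. The choice of which set to delete from each group is again arbitrary.

```python
from collections import Counter

# The set psi of the embodiment (equation (3)) and the size of X.
psi = [[1, 4, 8, 12], [2, 5, 9, 11], [3, 6, 7, 10]]
n = 12

# Pruned shadow set of case two (assumption: drop the last set of each group).
pruned = []
for phi in psi:
    group = [[x for x in phi if x != removed] for removed in phi]
    pruned.extend(group[:-1])

# Assumed incidence matrix A: rows are sets phi', columns are elements 1..n.
A = [[1 if element in s else 0 for element in range(1, n + 1)] for s in pruned]

# Assumed meaning of "row-column exchange": transpose A to get A'.
A_prime = [list(column) for column in zip(*A)]

# Formula (2): node i stores the blocks indexed by the 1-entries of row i.
nodes = [{j + 1 for j, entry in enumerate(row) if entry == 1} for row in A_prime]

capacities = sorted({len(node) for node in nodes})
repetition = Counter(j for node in nodes for j in node)
```

Under these assumptions A′ has 12 rows, so there are 12 nodes with capacity 2 or 3, and each of the 9 column indices appears on exactly 3 nodes, matching the stated repetition degree ρ = 3.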
This embodiment shows that, in a homogeneous FR code, every storage node has the same storage capacity and every data block has the same repetition degree. Heterogeneous FR codes with different repetition degrees can be constructed simply by deleting sets, and heterogeneous FR codes with different node storage capacities can be constructed by exchanging the rows and columns of the incidence matrix, so a suitable shadow set can be chosen according to the node storage capacity and data repetition degree that the system requires. Such an FR code is therefore better suited to practical distributed storage systems than an ordinary FR code, and its storage cost is lower.
Claims (1)
1. A shadow-based partial repetition code construction method, comprising the following steps:
Step 1: dividing an original file M into k original data blocks, and performing (n, k) MDS coding on the k original data blocks to obtain n coded data blocks, where k ≥ 2 and n ≥ k;
Step 2: constructing a set X and a set ψ according to the number n of coded data blocks, where X comprises n distinct elements, ψ comprises t subsets φ, each subset φ is a (d+1)-element subset of X, the subsets φ have no elements in common, d is a positive integer, and (d+1) < n;
Step 3: obtaining the shadow set of ψ, where the shadow set comprises t sub-shadow groups, each group comprises (d+1) sets φ′, each set φ′ comprises d elements, and each set φ′ consists of the elements remaining after one element of the subset φ is deleted;
Step 4: constructing an FR code from the shadow set, in one of three cases:
Case one: if a homogeneous FR code is constructed, each node of the homogeneous FR code corresponds to one set φ′ of the shadow set; the number of nodes is t × (d+1), the node storage capacity is d, the repetition degree is d, and the data blocks stored by each node are the elements of the corresponding set φ′;
Case two: if a repetition-degree heterogeneous FR code is constructed, any one set φ′ is deleted from each sub-shadow group of the shadow set, giving a pruned shadow set that comprises t sub-shadow groups, each comprising d sets φ′;
each node of the repetition-degree heterogeneous FR code corresponds to one set φ′ of the pruned shadow set; the number of nodes is t × d, the node storage capacity is d, the repetition degree is d or (d−1), and the data blocks stored by each node are the elements of the corresponding set φ′;
Case three: if a storage-capacity heterogeneous FR code is constructed, the incidence matrix A of the pruned shadow set of case two is constructed, and the rows and columns of A are exchanged to obtain a matrix A′;
each node of the storage-capacity heterogeneous FR code corresponds to one row of the matrix A′; the number of nodes is the number of rows of A′, the node storage capacity is d or d−1, the repetition degree is d, and the data blocks stored by each node are given by the columns that hold a 1 in the corresponding row.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110931106.3A CN113708780B (en) | 2021-08-13 | 2021-08-13 | Partial repetition code construction method based on shadow |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110931106.3A CN113708780B (en) | 2021-08-13 | 2021-08-13 | Partial repetition code construction method based on shadow |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113708780A (en) | 2021-11-26 |
CN113708780B (en) | 2024-02-02 |
Family
ID=78652670
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110931106.3A Active CN113708780B (en) | 2021-08-13 | 2021-08-13 | Partial repetition code construction method based on shadow |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113708780B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2020268A1 (en) * | 1989-06-30 | 1990-12-31 | Scott H. Davis | Digital data management system |
US7043415B1 (en) * | 2001-01-31 | 2006-05-09 | Pharsight Corporation | Interactive graphical environment for drug model generation |
JP2011242818A (en) * | 2010-04-21 | 2011-12-01 | Allied Engineering Corp | Parallel finite element calculation system |
WO2014153716A1 (en) * | 2013-03-26 | 2014-10-02 | 北京大学深圳研究生院 | Methods for encoding minimum bandwidth regenerating code and repairing storage node |
CN110990375A (en) * | 2019-11-19 | 2020-04-10 | 长安大学 | Method for constructing heterogeneous partial repeat codes based on adjusting matrix |
CN110990188A (en) * | 2019-11-19 | 2020-04-10 | 长安大学 | Construction method of partial repetition code based on Hadamard matrix |
CN111125014A (en) * | 2019-11-19 | 2020-05-08 | 长安大学 | Construction method of flexible partial repeat code based on U-shaped design |
Non-Patent Citations (2)
Title |
---|
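Optimal Fractional Repetition Codes Based on Graphs and Designs; Natalia Silberstein et al.; IEEE Transactions on Information Theory; Vol. 61, No. 8; full text * |
Construction of heterogeneous partial repetition codes (异构部分重复码的构造); Sun Wei et al.; Computer Systems & Applications (《计算机系统应用》); Vol. 30, No. 2; full text * |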
Also Published As
Publication number | Publication date |
---|---|
CN113708780A (en) | 2021-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20120173932A1 (en) | Storage codes for data recovery | |
CN106484559B (en) | A kind of building method of check matrix and the building method of horizontal array correcting and eleting codes | |
US8928503B2 (en) | Data encoding methods, data decoding methods, data reconstruction methods, data encoding devices, data decoding devices, and data reconstruction devices | |
JP3978195B2 (en) | Method and system for minimizing the length of a defect list in a storage device | |
CN106788891A (en) | A kind of optimal partial suitable for distributed storage repairs code constructing method | |
CN110032470B (en) | Method for constructing heterogeneous partial repeat codes based on Huffman tree | |
US20170302294A1 (en) | Data processing method and system based on quasi-cyclic ldpc | |
CN113258936B (en) | Dual coding construction method based on cyclic shift | |
CN113708780B (en) | Partial repetition code construction method based on shadow | |
EP3648357B1 (en) | Encoding method and apparatus, and computer storage medium | |
CN111125014B (en) | Construction method of flexible partial repeat code based on U-shaped design | |
CN110990188B (en) | Construction method of partial repetition code based on Hadamard matrix | |
CN110990375B (en) | Method for constructing heterogeneous partial repeat codes based on adjusting matrix | |
CN106788454B (en) | Construction method of local unequal codes | |
CN114285420A (en) | Iterative matrix-based construction method of partial repetition codes and node restoration method | |
Elishco et al. | The entropy rate of some Pólya string models | |
CN109634953A (en) | A kind of weight quantization Hash search method towards higher-dimension large data sets | |
Antoniou et al. | Compressing biological sequences using self adjusting data structures | |
CN107665152A (en) | The interpretation method of a kind of correcting and eleting codes | |
CN113611354A (en) | Protein torsion angle prediction method based on lightweight deep convolutional network | |
Silberstein et al. | Optimal fractional repetition codes | |
Paunkoska et al. | Improving DSS efficiency with shortened MSR codes | |
CN110909027A (en) | Hash retrieval method | |
CN117171497B (en) | Sparse matrix storage method, device, equipment and storage medium | |
Zhang et al. | Two families of LRCs with availability based on iterative matrix |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20231226. Address after: Room A1-955, No. 58 Fumin Branch Road, Chongming District, Shanghai, 202150. Applicant after: Shanghai Yingsheng Network Technology Co.,Ltd. Address before: 710064 No. 126 central section of South Ring Road, Yanta District, Xi'an, Shaanxi. Applicant before: CHANG'AN University |
GR01 | Patent grant | ||