CN113391948A - Folding type extensible distributed storage coding and repairing and expanding method - Google Patents
- Publication number
- CN113391948A (application CN202110726617.1A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- check
- nodes
- information
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Abstract
The invention discloses a folding-type extensible distributed storage coding method together with repair and expansion methods. After the coding parameters of each stage are determined, the generator matrix sets of the other stages are constructed in turn, starting from the generator matrix of the last stage; the resulting codes are combined into a coding group, and one code is selected from the group to encode the data to be encoded. When nodes fail, as many non-failed nodes as there are information nodes are selected, and data symbols are downloaded from them to recover the data symbols of the failed nodes. Encoded data is expanded by merging the two sub-stripes within each expansion group. The invention improves the fault tolerance of the expanded system, has the MDS property, requires low expansion bandwidth and supports repeated expansion, and can be used for coding, repairing and expanding distributed storage systems whose nodes have computing capability.
Description
Technical Field
The invention belongs to the technical field of computers, and more particularly relates to a folding-type extensible distributed storage coding method and the associated repair and expansion methods in the field of distributed storage. The invention can be used for coding, repairing and expanding distributed storage systems whose nodes have computing capability.
Background
Because a distributed storage system stores a large amount of data and node failures are frequent, the system must store redundant data to improve its reliability. Erasure coding is a typical data redundancy mechanism: the original data is divided into information blocks, the information blocks are encoded to generate check blocks, and the information blocks and check blocks are stored across the nodes of the distributed storage system to achieve fault tolerance. The scale of a distributed storage system generally grows over time as more and more new nodes join the system. Ideally, the coding parameters of the system should be dynamically expandable according to application requirements. However, when an erasure-coded distributed storage system expands its coding parameters, data migration and update require transferring a large amount of data between nodes, which consumes substantial network bandwidth and degrades the performance of the system.
The patent document of Huazhong University of Science and Technology, "A storage expansion method based on network coding" (patent application No. 201810304384.4, publication No. CN 108536396B), discloses a storage expansion method based on network coding. The method divides the stripes before storage expansion into several expansion groups and further divides each expansion group into a PG and a DG; cyclically takes data blocks from the original nodes in the DG to obtain a series of data sets; encodes each data set with network coding to generate update blocks, and uses the update blocks to update the coding blocks in the PG locally or remotely; transmits coding blocks or data blocks to the newly added nodes so that data blocks and coding blocks remain evenly placed on all nodes after expansion; and finally deletes all data blocks and coding blocks that were transmitted to the new nodes, together with all coding blocks within the DG. By using the computing resources of the storage nodes to encode data blocks and locally update part of the coding blocks during storage expansion, the method reduces the expansion bandwidth. Its drawback is that the number of nodes increases after storage expansion, so more coding blocks are needed to meet the fault-tolerance requirement, yet the number of coding blocks in a single stripe remains unchanged during expansion; the fault tolerance of the expanded system therefore cannot adapt to its node scale.
The paper by Chen et al., "Random binary extension codes: a coding method suitable for distributed storage systems" (Chinese Journal of Computers, September 2017), proposes a coding method that can dynamically adjust its code rate and erasure-correction capability. The coding matrix of the method consists of an identity matrix and a random matrix, and high performance of the whole codeword is achieved through a top-down design that controls the generation of each element of the random matrix. The advantage of the method is that its parameters can be adjusted dynamically: the rows and columns of the coding matrix can be extended or shortened freely, so the storage system can adjust its code rate and erasure-correction capability as application requirements change.
The paper "Convertible Codes: New Class of Codes for Efficient Conversion of Coded Data in Distributed Storage" (11th Innovations in Theoretical Computer Science Conference, ser. Leibniz International Proceedings in Informatics, vol. 151, pp. 66:1-66:26, 2020) proposes a coding method for efficient code conversion. The subsequent paper "Bandwidth Cost of Code Conversions in Distributed Storage: Fundamental Limits and Optimal Constructions" (arXiv:2008.12707) analyzes the code conversion problem using network information flow and proposes a convertible code construction that reduces the conversion bandwidth. Convertible codes effectively reduce the resources consumed when a system converts from the initial code to the final code, but the method can perform only one expansion with low bandwidth consumption.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a folding-type extensible distributed storage coding method with associated repair and expansion methods, solving the problems that the fault tolerance of a system after expansion cannot adapt to its node scale, that the repair degree is high when failed nodes are repaired, and that only a small number of expansions are possible.
The idea for realizing the purpose of the invention is as follows. The coding method calculates the coding parameters of the other stages in turn from those of the 1st stage according to a formula; because the calculated number of check nodes grows with the stages, the fault tolerance after expansion can adapt to the node scale. A systematic MDS code is used as the base code to construct the generator matrix of the last stage; the generator matrix sets of the other stages are constructed in reverse order according to a set folding rule; the codes are combined into a coding group; and a code is selected from the group to encode the data to be encoded. The repair method selects, from the non-failed nodes, as many nodes as there are information nodes and downloads data symbols from them to recover the data symbols of the failed nodes; because the number of selected nodes equals the number of information nodes, the problem of a high repair degree is solved. The expansion method merges the two sub-stripes in each expansion group when expanding encoded data; because the coding group contains several codes, the merging process can be carried out several times, which solves the problem of few possible expansions.
To achieve the above object, the steps of a foldable scalable distributed storage coding method of the present invention include:
(1) setting the coding parameters of the 1 st stage:
the number of information nodes $k_1$, the number of check nodes $r_1$ and the number of meta-check nodes $s_1$ are set as the coding parameters of the 1st stage, where $k_1$ and $r_1$ are positive integers and $s_1$ is a non-negative integer not greater than $r_1$;
(2) Calculating the encoding parameters of the next stage:
(2a) calculating the number of information nodes and the number of check nodes in the next stage according to the following formula:
k′=2k
r′=2r-s
where k′ and r′ respectively denote the number of information nodes and the number of check nodes of the stage following the current stage, and k, r and s respectively denote the number of information nodes, the number of check nodes and the number of meta-check nodes of the current stage;
(2b) selecting from the range {s, s+1, …, 2r−s} the value equal to the maximum number of simultaneously failed information nodes that are expected to be repaired with low repair complexity, and using it as the number of meta-check nodes of the next stage;
(3) judging whether the total number of the coding parameters obtained by the current iteration is equal to m, if so, executing the step (4); otherwise, executing the step (2) after taking the determined coding parameter as the coding parameter of the current stage; m represents the total number of codes in the set code group to be constructed, and the value of m is an integer greater than or equal to 2;
(4) determining the final encoding parameters:
(4a) setting the value of the number of check nodes in the coding parameter obtained by the current iteration as the number of meta check nodes in the coding parameter obtained by the current iteration;
(4b) composing the number of information nodes, the number of check nodes and the number of determined element check nodes obtained by current iteration into a final coding parameter;
(5) constructing a generating matrix corresponding to the last stage:
a systematic MDS code is used as the base code, and the generator matrix $G_m$ corresponding to the last stage is constructed with the generator-matrix construction method of the base code:
$$G_m = \begin{bmatrix} E_{k_m \times k_m} \\ P_{r_m \times k_m} \end{bmatrix}$$
where $G_m$ denotes the generator matrix corresponding to the last stage, $E_{k_m \times k_m}$ denotes the identity matrix of order $k_m$, $k_m$ equals the number of information nodes in the final coding parameters, $P_{r_m \times k_m}$ denotes a matrix with $r_m$ rows and $k_m$ columns, and $r_m$ equals the number of check nodes in the final coding parameters;
(6) setting a folding rule of a generating matrix:
(6a) the matrix to be folded is divided by rows into five matrices A, B, X, U and V: A denotes the left information matrix formed by rows 1 to $l_1$ of the matrix to be folded, where $l_1$ is determined by the number of information nodes in the coding parameters of the stage preceding the current stage; B denotes the right information matrix formed by rows $l_2$ to $l_3$; X denotes the matrix formed by rows $l_4$ to $l_5$, whose bounds are determined by the number of meta-check nodes in the coding parameters of the preceding stage; U denotes the matrix formed by rows $l_6$ to $l_7$, whose bounds are determined by the number of check nodes in the coding parameters of the preceding stage; V denotes the right non-meta matrix formed by rows $l_8$ to the last row;
(6b) generating a matrix identical to X and setting all elements of its last μ non-zero columns to zero to obtain the left meta matrix X′; generating another matrix identical to X and setting all elements of its first μ non-zero columns to zero to obtain the right meta matrix X″; setting all elements of the last μ non-zero columns of the matrix U to zero to obtain the left non-meta matrix U′;
(6c) combining the matrices A, X′, U′ and the matrices B, X″, V respectively, according to the following formula, to construct the two mutually associated generator matrices, belonging to the generator matrix set of the stage preceding the current stage, that result from folding the matrix to be folded:
$$G' = \begin{bmatrix} A \\ X' \\ U' \end{bmatrix}, \qquad G'' = \begin{bmatrix} B \\ X'' \\ V \end{bmatrix}$$
where G′ denotes the left generator matrix and G″ denotes the right generator matrix;
(7) constructing the generator matrix set corresponding to the stage preceding the last stage:
(7a) folding the generated matrix corresponding to the last stage according to the folding rule of the generated matrix,
adding the folded generation matrix into a generation matrix set corresponding to the previous stage of the current stage;
(7b) judging whether the value of $2^{m-1}$ is equal to 2; if so, executing step (10); otherwise, taking the generator matrix set determined in this iteration as the generator matrix set corresponding to the current stage and then executing step (8);
(8) constructing a generating matrix set corresponding to the previous stage:
according to the folding rule of the generated matrix, folding each generated matrix in the generated matrix set corresponding to the current stage, and adding the generated matrix obtained by folding into the generated matrix set corresponding to the previous stage of the current stage;
(9) judging whether the number of generator matrices in the generator matrix set obtained in the current iteration is equal to $2^{m-1}$; if so, executing step (10); otherwise, taking the generator matrix set determined this time as the generator matrix set corresponding to the current stage and then executing step (8);
(10) determining codes of all stages:
taking the coding parameter corresponding to each stage as the coding parameter of the corresponding code; taking the generating matrix corresponding to the last stage as the generating matrix of the sub-strip of the code of the last stage; taking each generation matrix in the generation matrix set corresponding to each other stage except the last stage as the generation matrix of each sub-strip of the corresponding code;
(11) combining the codes of all the stages into a coding group;
(12) selecting a code from the coding group, wherein the sum of the number of the information nodes and the number of the check nodes in the coding parameters of the selected code is equal to the total number of the nodes expected to be adopted;
(13) encoding data to be encoded:
dividing the data to be encoded evenly into t information symbols, where $t = k_m$; encoding the data to be encoded with the generator matrix corresponding to each sub-stripe of the selected code to obtain the data symbols of that sub-stripe, the data symbols of all sub-stripes forming the encoded data; saving the encoded data to the corresponding nodes.
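Steps (12) and (13) can be sketched as follows. This is a minimal illustration, not the patented implementation: the prime field GF(257), the function names `select_code` and `encode`, and the two toy sub-stripe generator matrices (with $k_m = 4$) are all assumptions introduced here. The code group lists the stage parameters of the embodiment only to demonstrate the selection rule; the toy matrices are unrelated to those parameters.

```python
from typing import List, Sequence, Tuple

P_MOD = 257  # illustrative prime field size (assumption, not from the patent)

Matrix = List[List[int]]
Params = Tuple[int, int, int]  # (k, r, s) of one stage


def select_code(code_group: Sequence[Params], total_nodes: int) -> Params:
    """Step (12): pick the code whose k + r equals the desired node count."""
    for k, r, s in code_group:
        if k + r == total_nodes:
            return (k, r, s)
    raise ValueError(f"no code in the group uses {total_nodes} nodes")


def encode(data: Sequence[int], substripes: Sequence[Matrix],
           p: int = P_MOD) -> List[List[int]]:
    """Step (13): multiply each sub-stripe generator matrix by the data vector."""
    k_m = len(substripes[0][0])  # every generator matrix has k_m columns
    if len(data) != k_m:
        raise ValueError("data must be split into exactly k_m information symbols")
    return [[sum(g * d for g, d in zip(row, data)) % p for row in G]
            for G in substripes]


# Toy sub-stripes with k_m = 4: each has 2 information rows and 2 check rows.
G_LEFT = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 2, 0, 0], [5, 6, 0, 0]]
G_RIGHT = [[0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 3, 4], [9, 10, 11, 12]]

code_group = [(4, 3, 1), (8, 5, 2), (16, 8, 8)]  # stage parameters of the embodiment
chosen = select_code(code_group, 13)             # 13 nodes in total -> (8, 5, 2)
symbols = encode([10, 20, 30, 40], [G_LEFT, G_RIGHT])
```

Each inner list of `symbols` holds the data symbols of one sub-stripe, one symbol per node of that sub-stripe.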
The invention relates to a folding type expandable distributed storage coding repairing method, which comprises the following steps:
(1) abandoning the repair if, for any piece of encoded data encoded with the same code, the total number of failed nodes is greater than the number of check nodes in the coding parameters of that code; otherwise executing step (2);
(2) judging whether the total number of the failure information nodes of all the failure nodes is a non-0 value or not, if so, executing the step (3); otherwise, executing the step (6) after judging that the failure information node does not exist but the failure check node exists;
(3) judging whether $\alpha > s - \lambda$; if so, executing step (4), otherwise executing step (7); where α denotes the total number of failed information nodes among all failed nodes, s denotes the number of meta-check nodes in the coding parameters of the code with which each piece of encoded data is encoded, and λ denotes the total number of failed meta-check nodes among all failed check nodes;
(4) dividing the data symbols:
downloading, from each non-failed information node that stores the same encoded data as the failed nodes, all information symbols stored by that node; downloading, from each non-failed meta-check node that stores the same encoded data as the failed nodes, all meta-check symbols stored by that node; and randomly selecting η non-meta-check nodes from all non-failed non-meta-check nodes that store the same encoded data as the failed nodes, and downloading all non-meta-check symbols stored by them; dividing the data symbols belonging to the same sub-stripe into the same symbol group, and dividing the symbol groups belonging to the same encoded data into the same data set; where $\eta = \alpha - (s - \lambda)$, with α the number of failed information nodes, s the number of meta-check nodes of the code and λ the number of failed meta-check nodes;
(5) Processing each data set:
(5a) numbering each symbol group in each data set according to:
$$j_{h,c} = \left\lceil \frac{q_{h,c}}{k} \right\rceil$$
where $j_{h,c}$ denotes the number of the c-th symbol group in the h-th data set, $\lceil \cdot \rceil$ denotes rounding up, $q_{h,c}$ denotes the index of the non-zero column in the first row of the generator matrix of the sub-stripe corresponding to the c-th symbol group in the h-th data set, and k denotes the number of information nodes in the coding parameters of the code with which each piece of encoded data is encoded;
(5b) for each data set, sequentially carrying out decoding-eliminating operation on each symbol group in the data set from the symbol group with the number of 1 in the data set;
(5c) executing step (9) after all the data sets are processed;
(6) dividing information symbols:
downloading all information symbols stored by the information node from each information node storing the same encoded data with the failed node, and dividing the information symbols belonging to the same sub-stripe into the same symbol group; dividing symbol groups belonging to the same coded data into the same data set and then executing the step (9);
(7) and recovering the information symbols corresponding to the failure information nodes:
(7a) downloading all information symbols stored by the information node from each non-failed information node which stores the same encoded data with the failed node, and randomly selecting the number of meta-check nodes equal to the value of alpha from all non-failed meta-check nodes which store the same encoded data with the failed node to download all meta-check symbols stored by the meta-check node; dividing the data symbols belonging to the same sub-stripe into the same symbol group;
(7b) decoding the data symbols in each symbol group by using a decoding method corresponding to the basic code, recovering the information symbols corresponding to the failure information nodes in the symbol group, and adding the information symbols into the symbol group;
(7c) dividing symbol groups belonging to the same coded data into the same data set;
(8) judging whether all failure nodes have failure check nodes, if so, executing the step (9), otherwise, executing the step (10);
(9) encoding the information symbols:
coding all information symbols in all symbol groups in each data set by using a coding coefficient row matrix corresponding to the failure check node, recovering the check symbols corresponding to the failure check node, and adding the recovered check symbols into the symbol groups corresponding to the same sub-strip;
(10) and saving the recovered data symbols:
adding new information nodes equal in number to the value of α and new check nodes equal in number to the total number of failed check nodes among all failed nodes; storing the recovered information symbols that belong to the same information node in the same new information node, and storing the recovered check symbols that belong to the same check node in the same new check node;
(11) and replacing the failed node with the new node.
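The branch selection in steps (1) to (3) and the symbol-group numbering of step (5a) can be sketched as below. The formulas for the branch condition, for η and for the group number are images that did not survive in the text, so this sketch assumes the condition is α > s − λ, that η = α − (s − λ), and that the group number is ⌈q/k⌉; these are inferences from the surrounding definitions, not confirmed text. The names `RepairPlan`, `plan_repair` and `group_number` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RepairPlan:
    branch: str
    info_nodes: int            # surviving information nodes to download from
    meta_check_nodes: int      # surviving meta-check nodes to download from
    non_meta_check_nodes: int  # eta surviving non-meta-check nodes

    @property
    def nodes_contacted(self) -> int:
        return self.info_nodes + self.meta_check_nodes + self.non_meta_check_nodes


def plan_repair(k: int, r: int, s: int, failed_info: int,
                failed_meta: int, failed_non_meta: int) -> Optional[RepairPlan]:
    """Choose a repair branch; returns None when repair is abandoned (step 1)."""
    alpha, lam = failed_info, failed_meta
    if alpha + lam + failed_non_meta > r:
        return None                                   # more failures than check nodes
    if alpha == 0:
        return RepairPlan("re-encode", k, 0, 0)       # steps (6) and (9)
    if alpha > s - lam:                               # ASSUMED form of the step (3) test
        eta = alpha - (s - lam)                       # ASSUMED value of eta
        return RepairPlan("decode-eliminate", k - alpha, s - lam, eta)  # steps (4), (5)
    return RepairPlan("base-code decode", k - alpha, alpha, 0)          # step (7)


def group_number(q: int, k: int) -> int:
    """ASSUMED step (5a) numbering: ceil(q / k) with 1-based column index q."""
    return (q + k - 1) // k


plan_a = plan_repair(8, 5, 2, failed_info=1, failed_meta=0, failed_non_meta=0)
plan_b = plan_repair(8, 5, 2, failed_info=3, failed_meta=1, failed_non_meta=0)
plan_c = plan_repair(8, 5, 2, failed_info=0, failed_meta=1, failed_non_meta=1)
plan_d = plan_repair(8, 5, 2, failed_info=4, failed_meta=1, failed_non_meta=1)
```

Under these assumptions every successful plan contacts exactly k surviving nodes, which agrees with the statement that the number of selected nodes equals the number of information nodes.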
The invention relates to a folding expandable distributed storage coding expansion method, which comprises the following steps:
(1) adding a new node:
adding ρ information nodes $Y_1, Y_2, \dots, Y_\rho$ and γ check nodes $F_1, F_2, \dots, F_\gamma$ to the nodes that store the encoded data, except when the encoded data is encoded with the code of the last stage, where $\rho = k' - k$, k′ denotes the number of information nodes in the coding parameters of the code of the stage following the stage of the code currently used by the encoded data, k denotes the number of information nodes in the coding parameters of the code currently used by the encoded data, $\gamma = r' - r$, r′ denotes the number of check nodes in the coding parameters of the code of that following stage, and r denotes the number of check nodes in the coding parameters of the code currently used by the encoded data;
(2) dividing two sub-stripes which are mutually related to a generating matrix in the coded data into an expansion group;
(3) merging two sub-stripes within each extension group:
(3a) downloading and caching all information symbols of the sub-strips corresponding to the right generating matrix in each extended group from all information nodes for storing coded data, and respectively transferring the information symbols at different positions of the sub-strips in the information symbols to an information node Y1、Y2、…、YρIn different information nodes, combining the information symbols after migration and the information symbols not after migration into the information symbols of the merged sub-strips;
(3b) adding two element check symbols positioned in the same element check node in two sub-strips in each extended group in the element check node to obtain updated element check symbols;
(3c) coding the cached information symbols by using a complementary matrix of a left generator matrix corresponding to the sub-strip of each extended group to obtain correction symbols, and updating the non-meta-check symbols of the sub-strip corresponding to the left generator matrix by using the correction symbols;
(3d) migrating the non-meta-check symbols at different positions in the sub-stripe corresponding to the right generator matrix in each expansion group to different check nodes among the check nodes $F_1, F_2, \dots, F_\gamma$;
(3e) combining the updated meta-check symbol, the updated non-meta-check symbol and the migrated check symbol into a check symbol of the merged sub-stripe;
(4) and combining the data symbols of all the merged sub-stripes into expanded coded data.
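The merge in step (3) can be checked on a toy example. The sketch below is an illustration under stated assumptions rather than the patented procedure: arithmetic is done in the prime field GF(257), the "complementary matrix" of step (3c) is read as the part of U that was zeroed when the left generator matrix was formed, and the toy matrices (previous stage k = 2, r = 2, s = 1, so ρ = 2 and γ = 1) and the names `matvec`, `split_columns` and `merge_substripes` are introduced here for illustration.

```python
from typing import List, Sequence, Tuple

P_MOD = 257  # illustrative prime field (assumption)
Matrix = List[List[int]]


def matvec(M: Matrix, d: Sequence[int], p: int = P_MOD) -> List[int]:
    return [sum(a * b for a, b in zip(row, d)) % p for row in M]


def split_columns(rows: Matrix, k_prev: int) -> Tuple[Matrix, Matrix]:
    """Return (rows with the right half zeroed, rows with the left half zeroed)."""
    left = [row[:k_prev] + [0] * (len(row) - k_prev) for row in rows]
    right = [[0] * k_prev + row[k_prev:] for row in rows]
    return left, right


def merge_substripes(left: List[int], right: List[int], U_comp: Matrix,
                     k_prev: int, s_prev: int, p: int = P_MOD) -> List[int]:
    l_info, l_meta, l_non = left[:k_prev], left[k_prev:k_prev + s_prev], left[k_prev + s_prev:]
    r_info, r_meta, r_non = right[:k_prev], right[k_prev:k_prev + s_prev], right[k_prev + s_prev:]
    info = l_info + r_info                                    # (3a) right info moves to Y nodes
    meta = [(a + b) % p for a, b in zip(l_meta, r_meta)]      # (3b) add meta-check symbols
    correction = matvec(U_comp, [0] * k_prev + r_info, p)     # (3c) encode cached right info
    updated = [(a + c) % p for a, c in zip(l_non, correction)]
    return info + meta + updated + r_non                      # (3d), (3e) right non-meta moves to F nodes


# Parent generator matrix with k = 4, r = 3; previous stage has k = 2, r = 2, s = 1.
identity = [[int(i == j) for j in range(4)] for i in range(4)]
X, U, V = [[1, 2, 3, 4]], [[5, 6, 7, 8]], [[9, 10, 11, 12]]
G_parent = identity + X + U + V
X_left, X_right = split_columns(X, 2)
U_left, U_comp = split_columns(U, 2)
G_left = identity[:2] + X_left + U_left
G_right = identity[2:] + X_right + V

data = [10, 20, 30, 40]
merged = merge_substripes(matvec(G_left, data), matvec(G_right, data), U_comp, 2, 1)
direct = matvec(G_parent, data)
```

In this toy case `merged` equals `direct`, i.e. merging the two sub-stripes yields the same symbols as encoding the data with the unfolded generator matrix. Only the right sub-stripe's information symbols and non-meta-check symbols are moved to new nodes.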
Compared with the prior art, the invention has the following advantages:
First, when the coding method of the invention computes the coding parameters of the next stage, the number of check nodes of the next stage is not less than that of the current stage. This overcomes the prior-art problem that the number of coding blocks in a single stripe stays unchanged during storage expansion, so that the fault tolerance after expansion cannot adapt to the node scale of the system. With codes constructed by the coding method of the invention, the number of check symbols in a single sub-stripe can grow when the code of the next stage is constructed, so the fault tolerance after expansion adapts to the node scale of the system.
Second, when the repair method of the invention downloads data symbols from selected non-failed nodes, it only needs to select as many nodes as there are information nodes. This overcomes the prior-art problem that the number of nodes to be contacted exceeds the number of information nodes, which leads to a high repair degree. With the repair method of the invention, the number of nodes contacted when repairing failed nodes equals the number of information nodes, so the repair degree is low.
Third, the expansion method of the invention expands the encoded data by merging the two sub-stripes in each expansion group. Because the coding group contains several codes, this expansion can be performed several times, which overcomes the prior-art limitation of only one expansion with low bandwidth consumption.
Drawings
FIG. 1 is a flow chart of the foldable scalable distributed storage coding of the present invention;
FIG. 2 is a diagram illustrating encoding of data to be encoded according to an embodiment of the present invention;
FIG. 3 is a flow diagram of a folded extensible distributed storage repair of the present invention;
FIG. 4 is a flow chart of the foldable extensible distributed storage extension of the present invention;
FIG. 5 is a schematic diagram of expanding encoded data in the embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings and examples.
The implementation steps of the folding scalable distributed storage coding method of the present invention are further described with reference to fig. 1.
Step 1, setting the coding parameters of the 1st stage.
The number of information nodes $k_1$, the number of check nodes $r_1$ and the number of meta-check nodes $s_1$ are set as the coding parameters of the 1st stage, where $k_1$ and $r_1$ are positive integers and $s_1$ is a non-negative integer not greater than $r_1$.
In the embodiment of the present invention, the number of information nodes, the number of check nodes, and the number of meta check nodes of the encoding parameter at the 1 st stage are set to 4, 3, and 1, respectively.
Step 2, calculating the coding parameters of the next stage.
Calculating the number of information nodes and the number of check nodes in the next stage according to the following formula:
k′=2k
r′=2r-s
where k′ and r′ respectively denote the number of information nodes and the number of check nodes of the stage following the current stage, and k, r and s respectively denote the number of information nodes, the number of check nodes and the number of meta-check nodes of the current stage;
the value equal to the maximum number of simultaneously failed information nodes that are expected to be repaired with low repair complexity is selected from the range {s, s+1, …, 2r−s} as the number of meta-check nodes of the next stage.
Step 4, determining the final coding parameters.
Sub-step 2, composing the final coding parameters from the number of information nodes, the number of check nodes and the determined number of meta-check nodes obtained in the current iteration.
In the embodiment of the present invention, the total number of codes in the coding group is set to be 3, the number of information nodes, the number of check nodes, and the number of meta check nodes in the coding parameters of the 2 nd stage code can be obtained by calculation and selection to be 8, 5, and 2, respectively, and the number of information nodes, the number of check nodes, and the number of meta check nodes in the coding parameters of the 3 rd stage code, that is, the number of information nodes, the number of check nodes, and the number of meta check nodes in the final coding parameters, are 16, 8, and 8, respectively.
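The parameter iteration of steps 1 to 4 can be sketched as follows. The function name `stage_parameters` and the callback `choose_s` (which stands in for the design choice of step (2b)) are hypothetical; the update rules k′ = 2k, r′ = 2r − s, the allowed range for the next s, and the final assignment s = r come from the text.

```python
from typing import Callable, List, Tuple

Params = Tuple[int, int, int]  # (k, r, s)


def stage_parameters(k1: int, r1: int, s1: int, m: int,
                     choose_s: Callable[[int, int, int], int]) -> List[Params]:
    """Return the coding parameters of stages 1..m.

    choose_s(stage, low, high) must return the number of meta-check nodes
    for `stage`, taken from the inclusive range [low, high].
    """
    if k1 <= 0 or r1 <= 0 or not 0 <= s1 <= r1 or m < 2:
        raise ValueError("invalid first-stage parameters or group size")
    stages: List[Params] = [(k1, r1, s1)]
    while len(stages) < m:
        k, r, s = stages[-1]
        k_next, r_next = 2 * k, 2 * r - s            # k' = 2k, r' = 2r - s
        s_next = choose_s(len(stages) + 1, s, r_next)
        if not s <= s_next <= r_next:                # range {s, s+1, ..., 2r - s}
            raise ValueError("chosen s is outside the allowed range")
        stages.append((k_next, r_next, s_next))
    k, r, _ = stages[-1]
    stages[-1] = (k, r, r)                           # final stage: s is set equal to r
    return stages


# Embodiment: first stage (4, 3, 1), three codes, s = 2 chosen for stage 2.
embodiment = stage_parameters(4, 3, 1, 3, lambda stage, low, high: 2)
```

With the embodiment's inputs this yields (4, 3, 1), (8, 5, 2) and (16, 8, 8), matching the values stated above.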
Step 5, constructing the generator matrix corresponding to the last stage.
A systematic MDS code is used as the base code, and the generator matrix $G_m$ corresponding to the last stage is constructed with the generator-matrix construction method of the base code:
$$G_m = \begin{bmatrix} E_{k_m \times k_m} \\ P_{r_m \times k_m} \end{bmatrix}$$
where $G_m$ denotes the generator matrix corresponding to the last stage, $E_{k_m \times k_m}$ denotes the identity matrix of order $k_m$, $k_m$ equals the number of information nodes in the final coding parameters, $P_{r_m \times k_m}$ denotes a matrix with $r_m$ rows and $k_m$ columns, and $r_m$ equals the number of check nodes in the final coding parameters.
In the embodiment of the invention, a systematic (24,16) RS code is used as the base code to construct the generator matrix $G_3 = \begin{bmatrix} E_{16 \times 16} \\ P_{8 \times 16} \end{bmatrix}$.
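The shape of such a generator matrix can be illustrated in code. The embodiment uses a systematic RS code; the sketch below swaps in a different systematic MDS construction, a Cauchy parity matrix over the prime field GF(257), because it is short to write down. The field, the evaluation points and the names `cauchy_generator`, `rank_mod_p` and `is_mds` are assumptions for illustration only.

```python
from itertools import combinations
from typing import List

P_MOD = 257  # illustrative prime field (assumption; not the field of the patent)
Matrix = List[List[int]]


def cauchy_generator(k: int, r: int, p: int = P_MOD) -> Matrix:
    """(k + r) x k systematic generator: identity stacked on a Cauchy matrix."""
    if k + r > p:
        raise ValueError("need k + r distinct field elements")
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    # Points x_i = i and y_j = r + j are all distinct, so x_i - y_j != 0 mod p.
    parity = [[pow((i - (r + j)) % p, -1, p) for j in range(k)] for i in range(r)]
    return identity + parity


def rank_mod_p(rows: Matrix, p: int = P_MOD) -> int:
    """Rank over GF(p) by Gauss-Jordan elimination."""
    m = [row[:] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col] % p:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def is_mds(G: Matrix, k: int, p: int = P_MOD) -> bool:
    """MDS check: every choice of k rows (nodes) must have full rank k."""
    return all(rank_mod_p([G[i] for i in idx], p) == k
               for idx in combinations(range(len(G)), k))


G_small = cauchy_generator(4, 2)   # exhaustive MDS check is cheap at this size
G_3 = cauchy_generator(16, 8)      # same shape as the embodiment's 24 x 16 matrix
```

The exhaustive `is_mds` check is only practical for small parameters; for the 24 x 16 case the MDS property follows from the Cauchy structure rather than from enumeration.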
Step 6: Set the folding rule of the generator matrix.
Sub-step 1: Divide the matrix to be folded into five matrices A, B, X, U, and V, where A denotes the left information matrix formed by rows 1 to $l_1$ of the matrix to be folded, and $l_1$ equals the number of information nodes in the coding parameters of the stage preceding the current stage; B denotes the right information matrix formed by rows $l_2$ to $l_3$ of the matrix to be folded, with $l_2 = l_1 + 1$ and $l_3 = 2 l_1$; X denotes the matrix formed by rows $l_4$ to $l_5$ of the matrix to be folded, with $l_4 = l_3 + 1$ and $l_5$ equal to $l_3$ plus the number of meta check nodes in the coding parameters of the stage preceding the current stage; U denotes the matrix formed by rows $l_6$ to $l_7$ of the matrix to be folded, with $l_6 = l_5 + 1$ and $l_7$ equal to $l_3$ plus the number of check nodes in the coding parameters of the stage preceding the current stage; V denotes the right non-element matrix formed by rows $l_8$ to the last row of the matrix to be folded, with $l_8 = l_7 + 1$.
Sub-step 2: Generate a matrix identical to X and set all elements of its last $\mu$ non-zero columns to zero to obtain the left element matrix X', where $\mu$ equals the number of information nodes in the coding parameters of the stage preceding the current stage; generate another matrix identical to X and set all elements of its first $\mu$ non-zero columns to zero to obtain the right element matrix X''; set all elements of the last $\mu$ non-zero columns of the matrix U to zero to obtain the left non-element matrix U'.
Sub-step 3: Combine the matrices A, X', U' and the matrices B, X'', V, respectively, according to the following formula to construct the two mutually associated generator matrices that the folding contributes to the generator matrix set of the stage preceding the current stage:
$$G' = \begin{bmatrix} A \\ X' \\ U' \end{bmatrix}, \qquad G'' = \begin{bmatrix} B \\ X'' \\ V \end{bmatrix}$$
where G' denotes the left generator matrix and G'' denotes the right generator matrix.
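The folding rule can be sketched in a few lines. This is a hedged illustration under two assumptions that the extracted text does not state explicitly: the row boundaries follow directly from the previous stage's node counts, and $\mu$ equals the previous stage's number of information nodes (consistent with the value 8 used when folding the stage 3 matrix). The helper names and the all-ones parity block in the demo are hypothetical placeholders.

```python
def nonzero_columns(rows):
    """Indices of columns that hold at least one non-zero entry."""
    width = len(rows[0]) if rows else 0
    return [c for c in range(width) if any(row[c] != 0 for row in rows)]


def zero_columns(rows, cols):
    """Copy of rows with the given column indices set to zero."""
    drop = set(cols)
    return [[0 if c in drop else v for c, v in enumerate(row)] for row in rows]


def fold(G, k_prev, r_prev, s_prev):
    """Fold G into the (left, right) generator matrices of the previous stage."""
    l3 = 2 * k_prev
    A, B = G[:k_prev], G[k_prev:l3]
    X = G[l3:l3 + s_prev]
    U = G[l3 + s_prev:l3 + r_prev]
    V = G[l3 + r_prev:]
    mu = k_prev  # assumption: mu is the previous stage's information node count
    x_cols = nonzero_columns(X)
    X_left = zero_columns(X, x_cols[-mu:])               # X'
    X_right = zero_columns(X, x_cols[:mu])               # X''
    U_left = zero_columns(U, nonzero_columns(U)[-mu:])   # U'
    return A + X_left + U_left, B + X_right + V


# Placeholder stage 3 matrix: 16 x 16 identity over an all-ones 8 x 16 block.
G3 = [[1 if i == j else 0 for j in range(16)] for i in range(16)]
G3 += [[1] * 16 for _ in range(8)]
G21, G22 = fold(G3, k_prev=8, r_prev=5, s_prev=2)
G11, G12 = fold(G21, k_prev=4, r_prev=3, s_prev=1)
print(len(G21), len(G22), len(G11), len(G12))  # 13 13 7 7
```

Each folded matrix keeps all 16 columns; only the column support of its check rows shrinks.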
Step 7: Construct the generator matrix set corresponding to the stage preceding the last stage.
Sub-step 1: Fold the generator matrix corresponding to the last stage according to the folding rule of the generator matrix, and add the generator matrices obtained by folding to the generator matrix set corresponding to the stage preceding the current stage.
Sub-step 2: Judge whether the value of m-1 is equal to 2; if so, execute step 10; otherwise, take the generator matrix set determined this time as the generator matrix set corresponding to the current stage and execute step 8.
Step 8: Construct the generator matrix set corresponding to the preceding stage.
According to the folding rule of the generator matrix, fold each generator matrix in the generator matrix set corresponding to the current stage, and add the generator matrices obtained by folding to the generator matrix set corresponding to the stage preceding the current stage.
Step 9: Judge whether the total number of generator matrix sets obtained so far is equal to m-1; if so, execute step 10; otherwise, take the generator matrix set determined this time as the generator matrix set corresponding to the current stage and execute step 8.
In the embodiment of the present invention, for convenience of description, the coding coefficient row matrices corresponding to rows 1 to 8 of the matrix $P_{8 \times 16}$ are denoted $p_1, p_2, \ldots, p_8$; the symbol $p_q^{(u,v)}$ denotes the coding coefficient row matrix obtained by retaining the u-th to v-th elements of $p_q$ and setting all other elements to zero, with $1 \le q \le 8$ and $1 \le u \le v \le 16$; the symbol $E_{u' \times u'}$ denotes an identity matrix of u' rows and u' columns, with $1 \le u' \le 16$; and the symbol $O_{u'' \times v''}$ denotes an all-zero matrix of u'' rows and v'' columns, with $1 \le u'' \le 16$ and $1 \le v'' \le 16$.
First, the generator matrix $G_3$ corresponding to the last stage is divided into five matrices: the left information matrix $A_3 = [E_{8 \times 8} \mid O_{8 \times 8}]$, the right information matrix $B_3 = [O_{8 \times 8} \mid E_{8 \times 8}]$, the matrix $X_3 = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}$, the matrix $U_3 = \begin{bmatrix} p_3 \\ p_4 \\ p_5 \end{bmatrix}$, and the right non-element matrix $V_3 = \begin{bmatrix} p_6 \\ p_7 \\ p_8 \end{bmatrix}$.
Two matrices identical to $X_3$ are then generated. Setting all elements of the last 8 non-zero columns of one of them to zero gives the left element matrix $X_3' = \begin{bmatrix} p_1^{(1,8)} \\ p_2^{(1,8)} \end{bmatrix}$, and setting all elements of the first 8 non-zero columns of the other to zero gives the right element matrix $X_3'' = \begin{bmatrix} p_1^{(9,16)} \\ p_2^{(9,16)} \end{bmatrix}$. Setting all elements of the last 8 non-zero columns of $U_3$ to zero gives the left non-element matrix $U_3' = \begin{bmatrix} p_3^{(1,8)} \\ p_4^{(1,8)} \\ p_5^{(1,8)} \end{bmatrix}$.
Combining the matrices $A_3$, $X_3'$, $U_3'$ and the matrices $B_3$, $X_3''$, $V_3$, respectively, constructs the two mutually associated generator matrices that folding $G_3$ contributes to the generator matrix set of the stage preceding the current stage, namely:
$$G_{2,1} = \begin{bmatrix} A_3 \\ X_3' \\ U_3' \end{bmatrix}, \qquad G_{2,2} = \begin{bmatrix} B_3 \\ X_3'' \\ V_3 \end{bmatrix}$$
$G_{2,1}$ and $G_{2,2}$ are the 2 generator matrices in the generator matrix set of stage 2.
Folding $G_{2,1}$ and $G_{2,2}$ in the same way yields two pairs of generator matrices, $G_{1,1}$, $G_{1,2}$ and $G_{1,3}$, $G_{1,4}$.
$G_{1,1}$, $G_{1,2}$, $G_{1,3}$, and $G_{1,4}$ are the 4 generator matrices in the generator matrix set of stage 1.
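As a quick consistency check on the embodiment, the sketch below tabulates, for each stage of an m-stage coding group, how many sub-stripe generator matrices the stage has and how many rows each of them contains. It assumes that each fold doubles the number of matrices, so stage i has 2^(m-i) of them, and that the stage 1 parameters are (4, 3, 1). The function name is hypothetical.

```python
def stage_summary(params):
    """For each stage i (1-based) of an m-stage group, report the number of
    sub-stripe generator matrices, 2**(m - i), and their row count, k_i + r_i."""
    m = len(params)
    return [
        {"stage": i, "matrices": 2 ** (m - i), "rows": k + r}
        for i, (k, r, _s) in enumerate(params, start=1)
    ]


for row in stage_summary([(4, 3, 1), (8, 5, 2), (16, 8, 8)]):
    print(row)
# {'stage': 1, 'matrices': 4, 'rows': 7}
# {'stage': 2, 'matrices': 2, 'rows': 13}
# {'stage': 3, 'matrices': 1, 'rows': 24}
```

Every matrix in every stage has 16 columns, one per information symbol of the last stage.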
Step 10: Determine the codes of all stages.
The coding parameters corresponding to each stage are taken as the coding parameters of the corresponding code; the generator matrix corresponding to the last stage is taken as the generator matrix of the sub-stripe of the last-stage code; and each generator matrix in the generator matrix set corresponding to every stage other than the last is taken as the generator matrix of one sub-stripe of the corresponding code.
Step 11: Combine the codes of all stages into one coding group.
Step 12: Select a code from the coding group such that the sum of the number of information nodes and the number of check nodes in the coding parameters of the selected code is equal to the total number of nodes expected to be used.
In the embodiment of the present invention, the total number of nodes expected to be used is 7, so the stage 1 code is selected from the coding group.
Step 13: Encode the data to be encoded.
Divide the data to be encoded evenly into t information symbols, where $t = k_m$; encode the data to be encoded with the generator matrix corresponding to each sub-stripe of the selected code to obtain the data symbols of that sub-stripe, the data symbols of all sub-stripes together forming the encoded data; and save the encoded data to the corresponding nodes.
The data symbols of a sub-stripe comprise information symbols and check symbols, and the check symbols comprise meta check symbols and non-meta check symbols. Each data symbol is obtained by encoding the data to be encoded with the coding coefficient row matrix corresponding to one row of the generator matrix. An information symbol is a data symbol obtained by encoding the data to be encoded with the left information matrix in the left generator matrix or the right information matrix in the right generator matrix; a meta check symbol is a data symbol obtained by encoding the data to be encoded with the left element matrix in the left generator matrix or the right element matrix in the right generator matrix; a non-meta check symbol is a data symbol obtained by encoding the data to be encoded with the left non-element matrix in the left generator matrix or the right non-element matrix in the right generator matrix.
Saving the encoded data to the corresponding nodes means that data symbols at different positions in each sub-stripe are stored in different nodes, and data symbols at the same position in different sub-stripes are stored in the same node. The nodes fall into two categories, information nodes and check nodes, and the check nodes fall into two subclasses, meta check nodes and non-meta check nodes. An information node stores information symbols; a meta check node stores meta check symbols; a non-meta check node stores non-meta check symbols.
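Encoding one sub-stripe amounts to multiplying its generator matrix by the column of information symbols. The sketch below shows this with arithmetic modulo a prime standing in for the finite field of the basic code. The modulus, the toy matrices, and the function names are assumptions made for illustration and are not taken from the embodiment.

```python
P_MOD = 257  # assumed prime modulus standing in for the code's finite field


def encode_sub_stripe(G, M, p=P_MOD):
    """Multiply generator matrix G (rows x t) by the column vector M (length t)."""
    return [sum(g * a for g, a in zip(row, M)) % p for row in G]


def encode(generators, M, p=P_MOD):
    """Encode M once per sub-stripe; the results together form the coded data."""
    return [encode_sub_stripe(G, M, p) for G in generators]


# Toy code with t = 4 information symbols and two sub-stripes (hypothetical
# matrices, not the ones from the embodiment).
G_left = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
G_right = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 2, 3, 4]]
coded = encode([G_left, G_right], M=[5, 6, 7, 8])
print(coded)  # [[5, 6, 11], [7, 8, 70]]
```

The identity rows reproduce the information symbols unchanged, and the remaining rows produce the check symbols.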
The implementation steps for encoding the data to be encoded in the embodiment of the present invention are further described with reference to fig. 2.
In fig. 2, $D_1$, $D_2$, $D_3$, $D_4$ denote 4 information nodes, $C_1$, $C_2$, $C_3$ denote 3 check nodes, $a_1, a_2, \ldots, a_{16}$ denote 16 information symbols, and M denotes the data to be encoded, $M = [a_1, a_2, \ldots, a_{16}]^T$, where T denotes the transposition operation. The generator matrices $G_{1,1}$, $G_{1,2}$, $G_{1,3}$, $G_{1,4}$ corresponding to the 4 sub-stripes of the stage 1 code each encode the data M to be encoded, giving the data $G_{1,i} M$ of the i-th sub-stripe for $i = 1, \ldots, 4$.
The data of these 4 sub-stripes together constitute one encoded data item. The 1st to 4th information symbols of the 1st sub-stripe are stored in sequence in the information nodes $D_1$, $D_2$, $D_3$, $D_4$, and its 1st to 3rd check symbols are stored in sequence in the check nodes $C_1$, $C_2$, $C_3$; the data symbols of the other sub-stripes are stored in the same way as those of the 1st sub-stripe.
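The placement rule (position j of every sub-stripe goes to node j) can be illustrated with symbol labels only. In the sketch below the labels `a1`..`a16` follow the embodiment, while the `c<i>,<j>` labels for check symbols and the function name are hypothetical.

```python
def place_symbols(sub_stripes, node_names):
    """Store position j of every sub-stripe on node j."""
    layout = {name: [] for name in node_names}
    for stripe in sub_stripes:
        if len(stripe) != len(node_names):
            raise ValueError("one symbol per node is expected in each sub-stripe")
        for name, symbol in zip(node_names, stripe):
            layout[name].append(symbol)
    return layout


nodes = ["D1", "D2", "D3", "D4", "C1", "C2", "C3"]
# Symbol labels only: a1..a16 are the information symbols, and "c<i>,<j>" is a
# hypothetical label for the j-th check symbol of sub-stripe i.
stripes = [
    [f"a{4 * i + j + 1}" for j in range(4)] + [f"c{i + 1},{j + 1}" for j in range(3)]
    for i in range(4)
]
layout = place_symbols(stripes, nodes)
print(layout["D3"])  # ['a3', 'a7', 'a11', 'a15']
```

The contents of `D3` match the symbols that the repair embodiment later restores to the replacement node for $D_3$.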
The implementation steps of the folding scalable distributed storage coding repair method of the present invention are further described with reference to fig. 3.
In the embodiment of the present invention, it is assumed that 2 of the nodes in fig. 2 fail.
In the embodiment of the present invention, the 2 failed nodes in fig. 2 are assumed to be $D_3$ and $D_4$; that is, the total number of failed information nodes is 2 and the number of failed meta check nodes is 0.
Step 4: Divide the data symbols.
Download all information symbols stored by each non-failed information node that stores the same encoded data as the failed nodes, download all meta check symbols stored by each non-failed meta check node that stores the same encoded data as the failed nodes, and randomly select $\eta$ non-meta check nodes from all non-failed non-meta check nodes that store the same encoded data as the failed nodes and download all non-meta check symbols stored in them; divide the data symbols belonging to the same sub-stripe into the same symbol group, and divide the symbol groups belonging to the same encoded data into the same data set. Here the value of $\eta$ is equal to the total number of failed information nodes minus the number of non-failed meta check nodes.
In the embodiment of the present invention, all information symbols are downloaded from the nodes $D_1$ and $D_2$ shown in fig. 2, the meta check symbols are downloaded from the meta check node $C_1$, and one non-meta check node $C_2$ is selected and its non-meta check symbols are downloaded. The downloaded data symbols are divided into 4 symbol groups: the symbol group corresponding to the 1st sub-stripe consists of $a_1$, $a_2$ and the two check symbols of the 1st sub-stripe downloaded from $C_1$ and $C_2$; the symbol group corresponding to the 2nd sub-stripe consists of $a_5$, $a_6$ and the two downloaded check symbols of the 2nd sub-stripe; the symbol group corresponding to the 3rd sub-stripe consists of $a_9$, $a_{10}$ and the two downloaded check symbols of the 3rd sub-stripe; and the symbol group corresponding to the 4th sub-stripe consists of $a_{13}$, $a_{14}$ and the two downloaded check symbols of the 4th sub-stripe.
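The choice between the repair branches can be sketched as a small decision function. The exact test of step 3 and the formula for $\eta$ were lost in extraction, so the comparison and the value of `eta` below are assumptions, chosen so that every sub-stripe ends up with as many symbols as it has information nodes and so that the embodiment (2 failed information nodes, 1 meta check node, 1 non-meta check node selected) is reproduced. The function and key names are hypothetical.

```python
def repair_plan(alpha, failed_checks, lam, r, s):
    """Pick the repair branch for one coded data item.

    alpha: failed information nodes; failed_checks: failed check nodes;
    lam: failed meta check nodes; r, s: check / meta check node counts of the code.
    """
    if alpha + failed_checks > r:
        return {"branch": "abandon"}
    if alpha == 0:
        return {"branch": "re-encode checks (step 6)"}
    surviving_meta = s - lam
    if alpha > surviving_meta:  # assumed form of the step 3 test
        eta = alpha - surviving_meta  # assumed value of eta
        return {"branch": "decode-eliminate (steps 4-5)", "eta": eta}
    return {"branch": "meta checks only (step 7)", "meta_needed": alpha}


print(repair_plan(alpha=2, failed_checks=0, lam=0, r=3, s=1))
# {'branch': 'decode-eliminate (steps 4-5)', 'eta': 1}
```

With a single failed information node the same code would take the step 7 branch and read only the meta check node.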
Step 5: Process each data set.
Sub-step 1: Number each symbol group in each data set according to the following formula:
$$j_{h,c} = \left\lceil \frac{q_{h,c}}{\hat{k}} \right\rceil$$
where $j_{h,c}$ denotes the number of the c-th symbol group in the h-th data set, $\lceil \cdot \rceil$ denotes the rounding-up operation, $q_{h,c}$ denotes the index of the non-zero column in the first row of the generator matrix of the sub-stripe corresponding to the c-th symbol group in the h-th data set, and $\hat{k}$ denotes the number of information nodes in the coding parameters of the code with which every encoded data item is encoded.
Sub-step 2: For each data set, perform the decoding-elimination operation on each symbol group in the data set in turn, starting from the symbol group numbered 1.
The decoding-elimination operation performs the decoding operation first and the elimination operation second. The decoding operation decodes the data symbols in the current symbol group of the data set with the decoding method of the basic code, recovers the information symbols corresponding to the failed information nodes, and adds them to the current symbol group. The elimination operation uses the information symbols in the current symbol group to eliminate the check symbols in every symbol group of the data set whose number is greater than that of the current symbol group. Elimination means that, if the element corresponding to an information symbol in the coding coefficient row matrix of a check symbol is non-zero, the information symbol is multiplied by that non-zero value and the product is subtracted from the check symbol to obtain the eliminated check symbol.
In the embodiment of the present invention, the symbol groups corresponding to the 1st to 4th sub-stripes shown in fig. 2 are numbered 1, 2, 3, and 4 in sequence. RS decoding is performed on the symbol group numbered 1 to recover the information symbols $a_3$, $a_4$. The information symbols $a_1$, $a_2$, $a_3$, $a_4$ in the symbol group numbered 1 are then used to eliminate the check symbols in the symbol groups numbered 2, 3, and 4. In the symbol group numbered 2, the coding coefficient row matrix of the non-meta check symbol shows that the elements corresponding to $a_1$, $a_2$, $a_3$, $a_4$ are non-zero, so the products of these information symbols and the corresponding elements are subtracted from that check symbol to obtain the eliminated check symbol, and the symbol group numbered 2 is updated accordingly. After the check symbols in the symbol groups numbered 3 and 4 are eliminated in the same way, the symbol group numbered 3 remains unchanged and the symbol group numbered 4 is updated. Repeating this process recovers, in sequence, the information symbols $a_7$, $a_8$, then $a_{11}$, $a_{12}$, then $a_{15}$, $a_{16}$.
Sub-step 3: After all data sets have been processed, execute step 9.
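The numbering rule and the elimination operation of step 5 can be sketched as below. The sketch assumes that the group number is the rounded-up quotient of the first non-zero column index of the generator matrix's first row and the number of information nodes, which reproduces the numbering 1 to 4 of the embodiment. Arithmetic is done modulo a prime as a stand-in for the field of the basic code, and the coefficient values and function names are hypothetical.

```python
from math import ceil

P_MOD = 257  # assumed prime modulus standing in for the code's finite field


def group_number(first_row, k):
    """ceil(q / k), where q is the 1-based index of the first non-zero column."""
    q = next(i for i, v in enumerate(first_row, start=1) if v != 0)
    return ceil(q / k)


def eliminate(check_value, coeff_row, known, p=P_MOD):
    """Subtract coefficient * symbol for every known information symbol.

    known maps a 0-based information symbol index to its value.
    """
    for idx, value in known.items():
        if coeff_row[idx] != 0:
            check_value = (check_value - coeff_row[idx] * value) % p
    return check_value


# First row of a sub-stripe whose first information symbol is a5 (column 5).
print(group_number([0, 0, 0, 0, 1] + [0] * 11, k=4))  # 2

coeffs = [3, 1, 4, 1, 5, 9, 2, 6]            # hypothetical check coefficients
symbols = [10, 20, 30, 40, 50, 60, 70, 80]   # hypothetical a1..a8
check = sum(c * a for c, a in zip(coeffs, symbols)) % P_MOD
reduced = eliminate(check, coeffs, {i: symbols[i] for i in range(4)})
print(reduced)  # 125
```

After elimination the check symbol depends only on the last four information symbols, so it can be decoded within its own symbol group.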
Step 6: Divide the information symbols.
Download all information symbols stored by each information node that stores the same encoded data as the failed nodes, and divide the information symbols belonging to the same sub-stripe into the same symbol group; after dividing the symbol groups belonging to the same encoded data into the same data set, execute step 9.
Step 7: Recover the information symbols corresponding to the failed information nodes.
Sub-step 1: Download all information symbols stored by each non-failed information node that stores the same encoded data as the failed nodes, randomly select a number of meta check nodes equal to the value of $\alpha$ from all non-failed meta check nodes that store the same encoded data as the failed nodes, and download all meta check symbols stored by them; divide the data symbols belonging to the same sub-stripe into the same symbol group.
Sub-step 2: Decode the data symbols in each symbol group with the decoding method of the basic code, recover the information symbols corresponding to the failed information nodes in the symbol group, and add them to the symbol group.
Sub-step 3: Divide the symbol groups belonging to the same encoded data into the same data set.
Step 8: Judge whether any of the failed nodes is a check node; if so, execute step 9; otherwise, execute step 10.
Step 9: Encode the information symbols.
Encode all information symbols in all symbol groups of each data set with the coding coefficient row matrix corresponding to each failed check node, recover the check symbols corresponding to the failed check node, and add the recovered check symbols to the symbol groups of the corresponding sub-stripes.
Step 10: Store the recovered data symbols.
Add new information nodes equal in number to the value of $\alpha$ and new check nodes equal in number to the total number of failed check nodes among all failed nodes; store the recovered information symbols belonging to the same information node in the same new information node, and store the recovered check symbols belonging to the same check node in the same new check node.
In the embodiment of the invention, the recovered information symbols $a_3$, $a_7$, $a_{11}$, $a_{15}$ are saved to a new node $D_3'$, and the recovered information symbols $a_4$, $a_8$, $a_{12}$, $a_{16}$ are saved to another new node $D_4'$.
Step 11: Replace the failed nodes with the new nodes.
In the embodiment of the invention, the nodes $D_3'$ and $D_4'$ replace the failed nodes $D_3$ and $D_4$.
The implementation steps of the foldable scalable distributed storage coding expansion method of the present invention are further described with reference to fig. 4.
Step 1: Add new nodes.
Unless the encoded data is encoded with the code of the last stage, add $\rho$ information nodes $Y_1, Y_2, \ldots, Y_\rho$ and $\gamma$ check nodes $F_1, F_2, \ldots, F_\gamma$ to the nodes that store the encoded data, where $\rho$ equals the number of information nodes in the coding parameters of the code of the stage following that of the code currently used to encode the data minus the number of information nodes in the coding parameters of the code currently used, and $\gamma$ equals the number of check nodes in the coding parameters of the code of that following stage minus the number of check nodes in the coding parameters of the code currently used.
The implementation steps of the encoded data expansion in the embodiment of the present invention are further described with reference to fig. 5.
In the embodiment of the present invention, the encoded data shown in fig. 2 is expanded. Fig. 5(a) shows the encoded data storage after nodes are added on the basis of fig. 2, where $D_5$, $D_6$, $D_7$, $D_8$ are the 4 information nodes newly added on the basis of fig. 2 and $C_4$, $C_5$ are the 2 check nodes newly added on the basis of fig. 2. Fig. 5(b) shows the encoded data storage after the encoded data has been expanded.
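The number of nodes to add in one expansion step follows from the parameters of two adjacent stages. A minimal sketch, assuming that $\rho$ and $\gamma$ are plain differences of the node counts (which matches the 4 information nodes and 2 check nodes added in the embodiment); the function name is hypothetical.

```python
def nodes_to_add(current, nxt):
    """Return (rho, gamma): extra information and check nodes for one expansion.

    current and nxt are (information nodes, check nodes) of the code in use
    and of the next-stage code.
    """
    (k_cur, r_cur), (k_next, r_next) = current, nxt
    return k_next - k_cur, r_next - r_cur


rho, gamma = nodes_to_add(current=(4, 3), nxt=(8, 5))
print(rho, gamma)  # 4 2
```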
Step 2: Divide every two sub-stripes of the encoded data whose generator matrices were obtained by folding the same matrix into one expansion group.
Step 3: Merge the two sub-stripes in each expansion group.
Sub-step 3: Encode the cached information symbols with the complementary matrix of the left generator matrix corresponding to the sub-stripe of each expansion group to obtain correction symbols, and update the non-meta check symbols of the sub-stripe corresponding to the left generator matrix with the correction symbols.
The complementary matrix of the left generator matrix corresponding to a sub-stripe in an expansion group is defined as follows. Take the matrix whose folding produced that left generator matrix, and denote the left non-element matrix in the left generator matrix as the matrix W. The elements of rows $l_6'$ to $l_7$ of the folded matrix form a second matrix, where $l_6'$ is determined by the number of meta check nodes in the coding parameters of the code currently used to encode the data. The matrix obtained by subtracting W from this second matrix is denoted as the matrix T, and all non-zero columns of T form the complementary matrix of the left generator matrix corresponding to the sub-stripe in the expansion group.
Sub-step 5: Combine the updated meta check symbols, the updated non-meta check symbols, and the migrated check symbols into the check symbols of the merged sub-stripe.
In the embodiment of the present invention, the generator matrix $G_{1,1}$ of the 1st sub-stripe and the generator matrix $G_{1,2}$ of the 2nd sub-stripe shown in fig. 5(a) are associated with each other, so the 1st and 2nd sub-stripes are divided into one expansion group; similarly, the 3rd and 4th sub-stripes are divided into one expansion group.
For the expansion group formed by the 1st and 2nd sub-stripes, all information symbols $a_5$, $a_6$, $a_7$, $a_8$ of the 2nd sub-stripe are downloaded and cached, and these information symbols are migrated in sequence to the nodes $D_5$, $D_6$, $D_7$, $D_8$; the migrated information symbols and the non-migrated information symbols $a_1$, $a_2$, $a_3$, $a_4$ form the information symbols of the merged sub-stripe. Inside the node $C_1$, the meta check symbols of the 1st and 2nd sub-stripes are merged to obtain the updated meta check symbol. The complementary matrix corresponding to the left generator matrix $G_{1,1}$ is used to encode the cached information symbols $a_5$, $a_6$, $a_7$, $a_8$, giving two correction symbols. The two correction symbols are used to eliminate the two non-meta check symbols of the 1st sub-stripe, one correction symbol being subtracted from each, which gives two updated check symbols. The non-meta check symbols of the 2nd sub-stripe are migrated to the nodes $C_4$ and $C_5$, respectively. The updated meta check symbol, the updated non-meta check symbols, and the migrated check symbols together form all check symbols of the merged sub-stripe. The left sub-stripe in fig. 5(b) is the result of merging the 1st and 2nd sub-stripes.
The 3rd and 4th sub-stripes are merged in the same way to obtain another new sub-stripe. The right sub-stripe in fig. 5(b) is the result of merging the 3rd and 4th sub-stripes.
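The role of the complementary matrix can be illustrated as follows. The sketch assumes that T is the difference between the unfolded non-meta check rows U and their folded counterpart W, so that the non-zero columns of T are exactly the columns zeroed during folding. It uses arithmetic modulo a prime as a stand-in for the code's field: there the correction symbol is added to the old check symbol, whereas in a field of characteristic 2 addition and subtraction coincide. All coefficient values and function names are hypothetical.

```python
def complementary_matrix(U, W, p=257):
    """Non-zero columns of T = U - W (entries taken modulo an assumed prime p)."""
    T = [[(u - w) % p for u, w in zip(ur, wr)] for ur, wr in zip(U, W)]
    keep = [c for c in range(len(T[0])) if any(row[c] for row in T)]
    return [[row[c] for c in keep] for row in T], keep


def correction_symbols(comp, cached, p=257):
    """Encode the cached information symbols with the complementary matrix."""
    return [sum(c * a for c, a in zip(row, cached)) % p for row in comp]


# Hypothetical non-meta check rows before folding (U) and after folding (W):
# folding zeroed columns 4..7 (0-based), i.e. the coefficients of a5..a8.
U = [[1, 2, 3, 4, 5, 6, 7, 8] + [0] * 8,
     [2, 3, 4, 5, 6, 7, 8, 9] + [0] * 8]
W = [row[:4] + [0] * 12 for row in U]

comp, keep = complementary_matrix(U, W)
print(keep)  # [4, 5, 6, 7]
print(comp)  # [[5, 6, 7, 8], [6, 7, 8, 9]]

M = list(range(1, 17))                       # hypothetical a1..a16
corrections = correction_symbols(comp, M[4:8])
```

Combining each old check symbol (computed with W) with its correction symbol gives the check symbol that the unfolded rows U would have produced, which is what the merged sub-stripe needs.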
Step 4: Combine the data symbols of all merged sub-stripes into the expanded encoded data.
In the embodiment of the present invention, all data symbols of the two new sub-stripes shown in fig. 5(b) constitute the expanded encoded data.
Claims (7)
1. A folding expandable distributed storage coding method is characterized in that coding parameters of the next stage are calculated, a generation matrix set corresponding to the previous stage is constructed according to a folding rule of the generation matrix, codes corresponding to each stage are determined, and codes corresponding to all stages are combined into a coding group; the method comprises the following steps:
(1) setting the coding parameters of the 1 st stage:
the number of information nodes $k_1$, the number of check nodes $r_1$, and the number of meta check nodes $s_1$ are set as the coding parameters of stage 1, where $k_1$ and $r_1$ are positive integers and $s_1$ is a non-negative integer less than or equal to $r_1$;
(2) Calculating the encoding parameters of the next stage:
(2a) calculating the number of information nodes and the number of check nodes in the next stage according to the following formula:
k′=2k
r′=2r-s
wherein k 'and r' respectively represent the number of information nodes and the number of check nodes of the next stage of the current stage, and k, r and s respectively represent the number of information nodes, the number of check nodes and the number of element check nodes of the current stage;
(2b) selecting a value equal to the maximum value of the number of simultaneously failed information nodes which are expected to be repaired with low repair complexity from the value range { s, s +1, …,2r-s } as the number of meta check nodes of the next stage;
(3) judging whether the total number of the coding parameters obtained by the current iteration is equal to m, if so, executing the step (4); otherwise, executing the step (2) after taking the determined coding parameter as the coding parameter of the current stage; m represents the total number of codes in the set code group to be constructed, and the value of m is an integer greater than or equal to 2;
(4) determining the final encoding parameters:
(4a) setting the value of the number of check nodes in the coding parameter obtained by the current iteration as the number of meta check nodes in the coding parameter obtained by the current iteration;
(4b) composing the number of information nodes, the number of check nodes and the number of determined element check nodes obtained by current iteration into a final coding parameter;
(5) constructing a generating matrix corresponding to the last stage:
a systematic MDS code is used as the basic code, and the generator matrix $G_m$ corresponding to the last stage is constructed with the generator matrix construction method of the basic code:
$$G_m = \begin{bmatrix} E_{k_m \times k_m} \\ P_{r_m \times k_m} \end{bmatrix}$$
wherein $G_m$ denotes the generator matrix corresponding to the last stage, $E_{k_m \times k_m}$ denotes an identity matrix of order $k_m$, the value of $k_m$ is equal to the number of information nodes in the final coding parameters, $P_{r_m \times k_m}$ denotes a matrix of $r_m$ rows and $k_m$ columns, and the value of $r_m$ is equal to the number of check nodes in the final coding parameters;
(6) setting a folding rule of a generating matrix:
(6a) dividing the matrix to be folded into five matrices A, B, X, U, and V: wherein A denotes the left information matrix formed by rows 1 to $l_1$ of the matrix to be folded, $l_1$ being equal to the number of information nodes in the coding parameters of the stage preceding the current stage; B denotes the right information matrix formed by rows $l_2$ to $l_3$ of the matrix to be folded, with $l_2 = l_1 + 1$ and $l_3 = 2 l_1$; X denotes the matrix formed by rows $l_4$ to $l_5$ of the matrix to be folded, with $l_4 = l_3 + 1$ and $l_5$ equal to $l_3$ plus the number of meta check nodes in the coding parameters of the stage preceding the current stage; U denotes the matrix formed by rows $l_6$ to $l_7$ of the matrix to be folded, with $l_6 = l_5 + 1$ and $l_7$ equal to $l_3$ plus the number of check nodes in the coding parameters of the stage preceding the current stage; V denotes the right non-element matrix formed by rows $l_8$ to the last row of the matrix to be folded, with $l_8 = l_7 + 1$;
(6b) generating a matrix identical to the matrix X and setting all elements of its last $\mu$ non-zero columns to zero to obtain a left element matrix X', $\mu$ being equal to the number of information nodes in the coding parameters of the stage preceding the current stage; generating another matrix identical to the matrix X and setting all elements of its first $\mu$ non-zero columns to zero to obtain a right element matrix X''; setting all elements of the last $\mu$ non-zero columns of the matrix U to zero to obtain a left non-element matrix U';
(6c) combining the matrices A, X', U' and the matrices B, X'', V, respectively, according to the following formula to construct the two mutually associated generator matrices that the folding contributes to the generator matrix set of the stage preceding the current stage:
$$G' = \begin{bmatrix} A \\ X' \\ U' \end{bmatrix}, \qquad G'' = \begin{bmatrix} B \\ X'' \\ V \end{bmatrix}$$
wherein G' denotes the left generator matrix and G'' denotes the right generator matrix;
(7) constructing the generator matrix set corresponding to the stage preceding the last stage:
(7a) folding the generated matrix corresponding to the last stage according to the folding rule of the generated matrix, and adding the folded generated matrix into a generated matrix set corresponding to the previous stage of the current stage;
(7b) judging whether the value of m-1 is equal to 2, if so, executing the step (10), otherwise, executing the step (8) after taking the generated matrix set determined by the iteration as the generated matrix set corresponding to the current stage;
(8) constructing a generating matrix set corresponding to the previous stage:
according to the folding rule of the generated matrix, folding each generated matrix in the generated matrix set corresponding to the current stage, and adding the generated matrix obtained by folding into the generated matrix set corresponding to the previous stage of the current stage;
(9) judging whether the number of generator matrices in the generator matrix set obtained by the current iteration is equal to $2^{m-1}$; if so, executing step (10); otherwise, taking the generator matrix set determined this time as the generator matrix set corresponding to the current stage and executing step (8);
(10) determining codes of all stages:
taking the coding parameter corresponding to each stage as the coding parameter of the corresponding code; taking the generating matrix corresponding to the last stage as the generating matrix of the sub-strip of the code of the last stage; taking each generation matrix in the generation matrix set corresponding to each other stage except the last stage as the generation matrix of each sub-strip of the corresponding code;
(11) combining the codes of all the stages into a coding group;
(12) selecting a code from the coding group, wherein the sum of the number of the information nodes and the number of the check nodes in the coding parameters of the selected code is equal to the total number of the nodes expected to be adopted;
(13) encoding data to be encoded:
dividing the data to be encoded evenly into t information symbols, where $t = k_m$; encoding the data to be encoded with the generator matrix corresponding to each sub-stripe of the selected code to obtain the data symbols of that sub-stripe, the data symbols of all sub-stripes together forming the encoded data; and saving the encoded data to the corresponding nodes.
2. The method according to claim 1, wherein the data symbols of the sub-stripes in step (13) include information symbols and check symbols, the check symbols include meta check symbols and non-meta check symbols, and the data symbols are obtained by encoding data to be encoded through a row matrix of encoding coefficients corresponding to each row in a generator matrix; the information symbol is a data symbol obtained by coding data to be coded by a left information matrix in the left generating matrix or a right information matrix in the right generating matrix; the element check symbol is a data symbol obtained by encoding data to be encoded by a left element matrix in the left generating matrix or a right element matrix in the right generating matrix; the non-element check symbol is a data symbol obtained by encoding data to be encoded by a left non-element matrix in the left generating matrix or a right non-element matrix in the right generating matrix.
3. The method according to claim 1, wherein the step (13) of storing the encoded data in the corresponding node means that the data symbols at different positions in each sub-stripe are stored in different nodes, and the data symbols at the same positions in different sub-stripes are stored in the same node; the nodes comprise two categories of information nodes and check nodes, and the check nodes comprise two subclasses of meta check nodes and non-meta check nodes; the information node is used for storing information symbols; the meta-check node is used for storing a meta-check symbol; the non-meta check node is used for storing a non-meta check symbol.
4. A foldable scalable distributed storage coding repair method for foldable scalable distributed storage coding according to claim 1, characterized in that each data set is processed, information symbols are coded, and recovered data symbols are saved; the method comprises the following steps:
(1) abandoning the repair when the total number of failed nodes of any encoded data item encoded with the same code is greater than the number of check nodes in the coding parameters of that code, and executing step (2) in all other cases;
(2) judging whether the total number of failed information nodes among all failed nodes is non-zero; if so, executing step (3); otherwise, determining that no information node has failed but a check node has failed, and executing step (6);
(3) judging whether $\alpha > \hat{s} - \lambda$ holds; if so, executing step (4); otherwise, executing step (7); wherein $\alpha$ denotes the total number of failed information nodes among all failed nodes, $\hat{s}$ denotes the number of meta check nodes in the coding parameters of the code with which every encoded data item is encoded, and $\lambda$ denotes the total number of failed meta check nodes among all failed check nodes;
(4) dividing the data symbols:
downloading all information symbols stored by each non-failed information node that stores the same encoded data as the failed nodes, downloading all meta check symbols stored by each non-failed meta check node that stores the same encoded data as the failed nodes, and randomly selecting $\eta$ non-meta check nodes from all non-failed non-meta check nodes that store the same encoded data as the failed nodes and downloading all non-meta check symbols stored in them; dividing the data symbols belonging to the same sub-stripe into the same symbol group, and dividing the symbol groups belonging to the same encoded data into the same data set; wherein the value of $\eta$ is equal to the total number of failed information nodes minus the number of non-failed meta check nodes;
(5) Processing each data set:
(5a) numbering each symbol group in each data set according to the following formula:
$$j_{h,c} = \left\lceil \frac{q_{h,c}}{\hat{k}} \right\rceil$$
wherein $j_{h,c}$ denotes the number of the c-th symbol group in the h-th data set, $\lceil \cdot \rceil$ denotes the rounding-up operation, $q_{h,c}$ denotes the index of the non-zero column in the first row of the generator matrix of the sub-stripe corresponding to the c-th symbol group in the h-th data set, and $\hat{k}$ denotes the number of information nodes in the coding parameters of the code with which every encoded data item is encoded;
(5b) for each data set, sequentially carrying out decoding-eliminating operation on each symbol group in the data set from the symbol group with the number of 1 in the data set;
(5c) executing step (9) after all the data sets are processed;
(6) dividing information symbols:
downloading all information symbols stored by the information node from each information node storing the same encoded data with the failed node, and dividing the information symbols belonging to the same sub-stripe into the same symbol group; dividing symbol groups belonging to the same coded data into the same data set and then executing the step (9);
(7) and recovering the information symbols corresponding to the failure information nodes:
(7a) downloading all information symbols stored by the information node from each non-failed information node which stores the same encoded data with the failed node, and randomly selecting the number of meta-check nodes equal to the value of alpha from all non-failed meta-check nodes which store the same encoded data with the failed node to download all meta-check symbols stored by the meta-check node; dividing the data symbols belonging to the same sub-stripe into the same symbol group;
(7b) decoding the data symbols in each symbol group by using a decoding method corresponding to the basic code, recovering the information symbols corresponding to the failure information nodes in the symbol group, and adding the information symbols into the symbol group;
(7c) dividing symbol groups belonging to the same coded data into the same data set;
(8) judging whether all failure nodes have failure check nodes, if so, executing the step (9), otherwise, executing the step (10);
(9) encoding the information symbols:
encoding all information symbols in all symbol groups of each data set by using the coding coefficient row matrix corresponding to the failed check node, thereby recovering the check symbols corresponding to the failed check node, and adding each recovered check symbol to the symbol group of the corresponding sub-stripe;
(10) saving the recovered data symbols:
adding new information nodes whose number equals the value of α and new check nodes whose number equals the total number of failed check nodes among all failed nodes, storing the recovered information symbols that belong to the same information node in the same new information node, and storing the recovered check symbols that belong to the same check node in the same new check node;
(11) replacing the failed nodes with the new nodes.
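Step (9) can be illustrated with a minimal sketch. It assumes that symbols are bytes over GF(2^8) with the reduction polynomial 0x11D; the claims do not fix the field, and the names gf_mul and encode_check_symbol are hypothetical. Under that assumption, a failed check symbol is recomputed as the inner product of its coding coefficient row with the information symbols of the sub-stripe:

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements (assumed field, not fixed by the claims)."""
    result = 0
    while b:
        if b & 1:          # add (XOR) the current multiple of a
            result ^= a
        a <<= 1
        if a & 0x100:      # reduce modulo the field polynomial
            a ^= poly
        b >>= 1
    return result


def encode_check_symbol(coeff_row: list[int], info_symbols: list[int]) -> int:
    """Recompute one check symbol from a coefficient row and the information symbols."""
    if len(coeff_row) != len(info_symbols):
        raise ValueError("coefficient row and information symbols differ in length")
    check = 0
    for coeff, symbol in zip(coeff_row, info_symbols):
        check ^= gf_mul(coeff, symbol)  # addition in GF(2^8) is XOR
    return check
```

A production system would normally apply the same row to whole blocks of bytes with table-driven or SIMD multiplication rather than one symbol at a time.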
5. The method according to claim 4, wherein the decoding-elimination operation in step (5b) consists of a decoding operation followed by an elimination operation; the decoding operation refers to decoding the data symbols in the current symbol group of the data set by using the decoding method corresponding to the basic code, recovering the information symbols corresponding to the failed information nodes, and adding them to the current symbol group; the elimination operation refers to using the information symbols in the current symbol group to eliminate the check symbols in every symbol group of the data set whose number is larger than that of the current symbol group; and eliminating means that, if the element corresponding to an information symbol in the coding coefficient row matrix of a check symbol is non-zero, the information symbol is multiplied by that non-zero element and the product is subtracted from the check symbol to obtain the eliminated check symbol.
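The elimination described in claim 5 can be sketched as follows, again assuming byte symbols over GF(2^8) with the polynomial 0x11D (an assumption; the claim does not name the field). In a field of characteristic 2, subtraction coincides with XOR. The names gf_mul and eliminate, and the dictionary that maps a column index to a recovered information symbol, are hypothetical:

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements (assumed field, not fixed by the claims)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def eliminate(check_symbol: int, coeff_row: list[int], known_info: dict[int, int]) -> int:
    """Remove the contribution of already recovered information symbols.

    known_info maps the column index of an information symbol in coeff_row
    to the value of that symbol.
    """
    for index, symbol in known_info.items():
        coeff = coeff_row[index]
        if coeff != 0:  # only non-zero coefficients contribute
            check_symbol ^= gf_mul(coeff, symbol)  # subtraction is XOR in GF(2^8)
    return check_symbol
```

After elimination, the check symbol depends only on the information symbols that have not been recovered yet, which is what allows the symbol groups with larger numbers to be decoded in turn.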
6. An expansion method for the folded scalable distributed storage coding according to claim 1, wherein the two sub-stripes of the encoded data whose generator matrices are obtained by folding the same matrix are placed into one expansion group, the two sub-stripes in each expansion group are merged, and the data symbols of all new sub-stripes are combined into new expanded encoded data, the method comprising the following steps:
(1) adding a new node:
for the nodes storing the encoded data, except when the encoded data is encoded with the code of the last stage, adding ρ information nodes Y_1, Y_2, …, Y_ρ and γ check nodes F_1, F_2, …, F_γ, wherein ρ is determined from the number of information nodes in the coding parameters of the code of the stage following that of the code currently used for the encoded data and the number of information nodes in the coding parameters of the code currently used for the encoded data, and γ is determined from the number of check nodes in the coding parameters of the code of that following stage and the number of check nodes in the coding parameters of the code currently used for the encoded data;
(2) placing the two sub-stripes of the encoded data whose generator matrices are obtained by folding the same matrix into one expansion group;
(3) merging the two sub-stripes within each expansion group:
(3a) downloading from all information nodes storing the encoded data, and caching, all information symbols of the sub-stripe corresponding to the right generator matrix in each expansion group, and migrating the information symbols at different positions of that sub-stripe to different information nodes among Y_1, Y_2, …, Y_ρ; the migrated information symbols and the information symbols that were not migrated together form the information symbols of the merged sub-stripe;
(3b) in each meta-check node, adding the two meta-check symbols that belong to the two sub-stripes of an expansion group and are located in that meta-check node, to obtain the updated meta-check symbol;
(3c) encoding the cached information symbols by using the complementary matrix of the left generator matrix corresponding to the sub-stripe of each expansion group to obtain correction symbols, and updating the non-meta-check symbols of the sub-stripe corresponding to the left generator matrix by using the correction symbols;
(3d) migrating the non-meta-check symbols at different positions of the sub-stripe corresponding to the right generator matrix in each expansion group to different check nodes among F_1, F_2, …, F_γ;
(3e) combining the updated meta-check symbols, the updated non-meta-check symbols and the migrated check symbols into the check symbols of the merged sub-stripe;
(4) combining the data symbols of all merged sub-stripes into the expanded encoded data.
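The addition in step (3b) relies on the linearity of the check symbols, which the following minimal sketch illustrates. Everything specific in it is an assumption made for illustration: byte symbols over GF(2^8) with the polynomial 0x11D, the hypothetical names gf_mul, encode and merge_meta_check, and the supposition that the coefficient row of the merged meta-check symbol is the concatenation of the rows of the two sub-stripes. Under those assumptions, adding the two stored meta-check symbols gives the same value as re-encoding the merged information symbols, without reading any information symbol:

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements (assumed field, not fixed by the claims)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def encode(coeff_row: list[int], info_symbols: list[int]) -> int:
    """Inner product of a coefficient row with information symbols over GF(2^8)."""
    check = 0
    for coeff, symbol in zip(coeff_row, info_symbols):
        check ^= gf_mul(coeff, symbol)
    return check


def merge_meta_check(left_symbol: int, right_symbol: int) -> int:
    """Add the two meta-check symbols held by one meta-check node (XOR in GF(2^8))."""
    return left_symbol ^ right_symbol


# Hypothetical coefficient rows and information symbols of the two sub-stripes.
left_row, left_info = [1, 2], [3, 0x80]
right_row, right_info = [1, 1], [5, 6]

merged = merge_meta_check(encode(left_row, left_info), encode(right_row, right_info))
direct = encode(left_row + right_row, left_info + right_info)
```

In this sketch merged and direct hold the same value, so the update needs only local data in the meta-check node.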
7. The method according to claim 6, wherein the complementary matrix of the left generator matrix corresponding to the sub-stripe in each expansion group in step (3c) is obtained as follows: the folded matrix corresponding to the left generator matrix of the sub-stripe in the expansion group is denoted as a matrix […]; the left non-meta matrix in that left generator matrix is denoted as the matrix W; the elements of rows […] to […] of the matrix […] form a matrix […], where […] denotes the number of meta-check nodes in the coding parameters of the code currently used for the encoded data; the matrix obtained from […] is denoted as the matrix T; and all non-zero columns of the matrix T form the complementary matrix of the left generator matrix corresponding to the sub-stripe in the expansion group.
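The last operation of claim 7, keeping only the non-zero columns of the matrix T, can be sketched as below. The sketch assumes T is given as a list of equally long rows of integers; how T itself is built is not shown, and the function name nonzero_columns is hypothetical:

```python
def nonzero_columns(T: list[list[int]]) -> list[list[int]]:
    """Return the sub-matrix of T made of the columns that contain a non-zero element."""
    if not T:
        return []
    width = len(T[0])
    if any(len(row) != width for row in T):
        raise ValueError("all rows of T must have the same length")
    # A column is kept when at least one of its entries is non-zero.
    kept = [j for j in range(width) if any(row[j] != 0 for row in T)]
    return [[row[j] for j in kept] for row in T]
```

Dropping the all-zero columns means the correction symbols of step (3c) are computed only from the cached information symbols that actually affect them.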
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110726617.1A CN113391948B (en) | 2021-06-29 | 2021-06-29 | Folding type extensible distributed storage coding and repairing and expanding method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113391948A (en) | 2021-09-14 |
CN113391948B (en) | 2022-10-21 |
Family
ID=77624387
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110726617.1A Active CN113391948B (en) | 2021-06-29 | 2021-06-29 | Folding type extensible distributed storage coding and repairing and expanding method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113391948B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102571104A (en) * | 2012-01-15 | 2012-07-11 | 西安电子科技大学 | Distributed encoding and decoding method for RA (Repeat Accumulate) code |
CN102667761A (en) * | 2009-06-19 | 2012-09-12 | 布雷克公司 | Scalable cluster database |
CN103688515A (en) * | 2013-03-26 | 2014-03-26 | 北京大学深圳研究生院 | Method for encoding minimum bandwidth regeneration codes and repairing storage nodes |
CN103688514A (en) * | 2013-02-26 | 2014-03-26 | 北京大学深圳研究生院 | Coding method for minimum storage regeneration codes and method for restoring of storage nodes |
CN104503706A (en) * | 2014-12-23 | 2015-04-08 | 中国科学院计算技术研究所 | Data storing method and data reading method based on disk array |
US20170077950A1 (en) * | 2008-09-16 | 2017-03-16 | File System Labs Llc | Matrix-Based Error Correction and Erasure Code Methods and System and Applications Thereof |
CN106790408A (en) * | 2016-11-29 | 2017-05-31 | 中国空间技术研究院 | A kind of coding method repaired for distributed memory system node |
Non-Patent Citations (4)
Title |
---|
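MÁRTON SIPOS et al.: "Distributed cloud storage using network coding", 《2014 IEEE 11TH CONSUMER COMMUNICATIONS AND NETWORKING CONFERENCE (CCNC)》 * |
Liu Bingxing (刘冰星) et al.: "A data update strategy in network-coded distributed storage systems" (一种网络编码分布式存储系统中的数据更新策略), Journal of Chinese Computer Systems (《小型微型计算机系统》) * |
Wang Yijie (王意洁) et al.: "Research on erasure-code fault-tolerance techniques in distributed storage" (分布式存储中的纠删码容错技术研究), Chinese Journal of Computers (《计算机学报》) * |
Chen Liang (陈亮) et al.: "Random binary extension codes: a code suitable for distributed storage systems" (随机二元扩展码:一种适用于分布式存储系统的编码), Chinese Journal of Computers (《计算机学报》) * |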
Also Published As
Publication number | Publication date |
---|---|
CN113391948B (en) | 2022-10-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI419481B (en) | Low density parity check codec and method of the same | |
US10146618B2 (en) | Distributed data storage with reduced storage overhead using reduced-dependency erasure codes | |
CN104052576B (en) | Data recovery method based on error correcting codes in cloud storage | |
CN111697976B (en) | RS erasure correcting quick decoding method and system based on distributed storage | |
JP2004186940A (en) | Error correction code decoding device | |
US20120023362A1 (en) | System and method for exact regeneration of a failed node in a distributed storage system | |
CN108132854B (en) | Erasure code decoding method capable of simultaneously recovering data elements and redundant elements | |
WO2018072294A1 (en) | Method for constructing check matrix and method for constructing horizontal array erasure code | |
CN112000512B (en) | Data restoration method and related device | |
CN105518996B (en) | A kind of data decoding method based on binary field reed-solomon code | |
CN112332856B (en) | Layer decoding method and device of quasi-cyclic LDPC code | |
CN110764950A (en) | Hybrid coding method, data restoration method and system based on RS (Reed-Solomon) code and regeneration code | |
CN111858169A (en) | Data recovery method, system and related components | |
CN111786683B (en) | Low-complexity polar code multi-code block decoder | |
CN113626250A (en) | Strip merging method and system based on erasure codes | |
JPWO2006087792A1 (en) | Encoding apparatus and encoding method | |
CN110061746B (en) | Coupling method of space coupling LDPC code without code rate loss | |
CN113391948B (en) | Folding type extensible distributed storage coding and repairing and expanding method | |
CN110990375B (en) | Method for constructing heterogeneous partial repeat codes based on adjusting matrix | |
CN110781024B (en) | Matrix construction method of symmetrical partial repetition code and fault node repairing method | |
CN116707545A (en) | Low-consumption and high-throughput 5GLDPC decoder implementation method and device | |
CN112104412A (en) | Accelerator suitable for low-orbit satellite broadband communication | |
CN109343998A (en) | Erasure code-based full-distribution restoration method | |
US20210203364A1 (en) | Apparatuses and methods for mapping frozen sets between polar codes and product codes | |
CN108199720A (en) | A kind of node restorative procedure and system for reducing storage overhead and improving remediation efficiency |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||