WO2017041232A1 - Encoding and decoding framework for binary cyclic code - Google Patents
- Publication number
- WO2017041232A1 (application PCT/CN2015/089179)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M13/00—Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
- H03M13/03—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
- H03M13/05—Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
- H03M13/13—Linear codes
- H03M13/15—Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
Landscapes
- Physics & Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Error Detection And Correction (AREA)
Abstract
The present invention relates to the field of distributed storage systems and discloses an encoding and decoding framework for a binary cyclic code. The framework consists of a linear code and an alphabet: the linear code is a binary cyclic code, that is, a linear code over the polynomial ring $R_m$, and the binary parity-check code $C_m$ serves as the alphabet. Multiplication by the variable $z$ represents a cyclic right shift in $R_m$, and $C_m$ consists of the polynomials in $R_m$ with an even number of nonzero coefficients. Over the binary field $\mathbb{F}_2$, the parity-check code $C_m$ has dimension $m-1$ and check polynomial $h(z) := 1 + z + \dots + z^{m-1}$. Because both encoding and decoding of the binary cyclic code involve only exclusive-or operations, the computational complexity and overhead are low, which reduces computation delay, saves time and resources, lowers cost, and suits practical storage systems. The invention also gives a necessary and sufficient condition for the binary cyclic code to satisfy the MDS property, which provides a theoretical basis for designing MDS codes with low computational complexity.
Description
The present invention relates to the field of distributed storage systems, and in particular to an encoding and decoding framework for a binary cyclic code.
With the rapid development of computer network applications, the volume of network data keeps growing, and mass storage has become increasingly important. Sustained growth in storage demand drives the whole storage market, and distributed storage, with its cost-effectiveness, low initial investment, and pay-as-you-go model, has become a mainstream technology for big-data storage.
Storage node failures are the norm in distributed storage systems. When the deployed storage nodes are unreliable, redundancy must be introduced to maintain reliability under node failures. The simplest way to introduce redundancy is to replicate the original data directly. Replication is simple, but its storage efficiency and system reliability are low, whereas introducing redundancy through coding improves storage efficiency. High availability, reliability, and security are therefore key technical issues for distributed storage systems.
Current storage systems generally use MDS codes, which achieve optimal storage efficiency. An $(n, k)$ MDS erasure code divides an original file into $k$ data blocks of equal size and generates $n$ distinct coded blocks by linear coding. The $n$ blocks are stored on $n$ different nodes and satisfy the MDS property: any $k$ of the $n$ coded blocks suffice to reconstruct the original file. This coding technique plays an important role in providing efficient redundancy for network storage and is especially suitable for storing large files and for archival backup.
In a distributed storage system, data of size $B$ is stored across $n$ storage nodes, each of which stores data of size $B/k$. A data collector only needs to connect to any $k$ of the $n$ storage nodes and download their data to recover the original data of size $B$. This process is called data reconstruction or decoding. The RS code is one code that satisfies the MDS property.
The Cauchy Reed-Solomon code (CRS code), proposed in [James S. Plank, "Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications," Network Computing and Applications, 2006], is one of the most commonly used RS codes and has been widely deployed in distributed storage systems; for example, a CRS-based distributed storage scheme is provided in HDFS. In a traditional RS code, addition is simple, but multiplication and division are expensive and may even require discrete logarithms and table lookups. The CRS code overcomes the multiplication problem by using a binary matrix, consisting only of 0s and 1s, as the generator matrix, which greatly improves encoding and decoding efficiency. After continued optimization it has become an efficient and widely used storage code. However, the CRS code still has shortcomings. First, although the 0-1 generator matrix greatly reduces encoding and decoding complexity, its decoding complexity is not optimal. Second, the binary matrix used for encoding and decoding is still fairly complicated, and the irregular pattern of 0s and 1s makes further optimization difficult.
The RDP code (Row-Diagonal Parity code) is a simple erasure code (P. Corbett et al., "Row diagonal parity for double disk failure correction," 4th Usenix Conf. on File and Storage Tech., San Francisco, 2004). It requires neither a finite field nor a generator matrix: it computes XORs along rows and along pan-diagonals to generate two parity blocks, giving an erasure code with two parity blocks. Decoding simply reverses the way the parity blocks were generated and recovers all original data blocks iteratively. These simple rules make RDP optimal in encoding and decoding complexity among erasure codes with two parity blocks. However, RDP is not extensible: it has only two parity blocks and tolerates the loss of at most two blocks, just like three-way replication, so the data cannot be repaired if more than two blocks are lost.
Summary of the Invention
To solve the problems in the prior art, the present invention provides an encoding and decoding framework for a binary cyclic code, which addresses the high computational complexity and large computational overhead of repair and decoding in the prior art.
The invention provides an encoding and decoding framework for a binary cyclic code, consisting of a linear code and an alphabet. The linear code is a binary cyclic code, that is, a linear code over $R_m$, and the binary parity-check code $C_m$ serves as the alphabet. Here $R_m$ denotes the polynomial ring $R_m := \mathbb{F}_2[z]/(1+z^m)$. A vector $(a_0, a_1, \dots, a_{m-1}) \in \mathbb{F}_2^m$ corresponds to the polynomial $a_0 + a_1 z + \dots + a_{m-1} z^{m-1}$ in $R_m$, and multiplication by the variable $z$ represents a cyclic right shift in $R_m$. $C_m$ consists of the polynomials in $R_m$ with an even number of nonzero coefficients, $C_m = \{a(z)(1+z) : a(z) \in R_m\}$. Over the binary field $\mathbb{F}_2$, the parity-check code $C_m$ has dimension $m-1$, and its check polynomial is $h(z) := 1 + z + \dots + z^{m-1}$.
As a further improvement of the invention, given an odd number $m$ and positive integers $k$ and $v$, the binary cyclic code is a mapping from $\mathbb{F}_2^{(m-1)k}$ to $C_m^v$, represented by a $k \times v$ generator matrix $G$ over the ring $R_m$. The encoding process is as follows. (A) The $(m-1)k$ bits are divided equally into $k$ groups of $m-1$ bits each. A parity bit is appended to each group to produce a polynomial in $C_m$, and the resulting polynomials together form a $k$-tuple $w = (s_1(z), \dots, s_k(z)) \in C_m^k$. The codeword corresponding to the $(m-1)k$ input bits is obtained as $wG$. (B) A polynomial in $C_m$ obtained by appending a parity bit is called an original data packet, or simply a data packet, and a polynomial in $wG$ is called a coded packet. A coded packet is an $R_m$-linear combination of the $k$ data packets, with coding coefficients that are polynomials in $R_m$.
As a further improvement of the invention, the encoding and decoding algorithms of the framework involve only binary exclusive-or (XOR) operations.
As a further improvement of the invention, decoding of the binary cyclic code recovers the $k$ original data packets from $k$ coded packets. Specifically, let $s_1(z), \dots, s_k(z)$ be the $k$ data packets and $p_1(z), \dots, p_k(z)$ the $k$ coded packets indexed by $I$. They are related by $(p_1(z), \dots, p_k(z)) = (s_1(z), \dots, s_k(z)) \cdot G_I$, where $G_I$ is the $k \times k$ submatrix of $G$ formed by the columns indexed by $I$, and decoding solves this relation for the data packets.
As a further improvement of the invention, the binary cyclic code is closed under addition and multiplication.
The beneficial effects of the invention are as follows. Both encoding and decoding of the binary cyclic code involve only XOR operations, so the computational complexity and overhead are low. This substantially reduces computation delay, saves time and resources, lowers cost, and suits practical storage systems. This patent also gives a necessary and sufficient condition for the binary cyclic code to satisfy the MDS property, which is an important theoretical basis for designing MDS codes with low computational complexity.
The invention is further described below with reference to specific embodiments.
In one embodiment, a new coding framework, called the binary cyclic coding framework, is provided. Its encoding and decoding algorithms involve only binary XOR operations, which gives theoretical support for designing storage codes with low computational complexity. The prior-art RDP code can be regarded as a specific instance of the binary cyclic coding framework proposed in this patent.
Theory of binary cyclic codes. Let $m$ be a positive odd integer, and let $R_m$ denote the polynomial ring
$$R_m := \mathbb{F}_2[z]/(1+z^m). \qquad (2)$$
The elements of $R_m$ are called polynomials. A vector $(a_0, a_1, \dots, a_{m-1}) \in \mathbb{F}_2^m$ corresponds to the polynomial $a_0 + a_1 z + \dots + a_{m-1} z^{m-1}$ in $R_m$. In (2), multiplication by the variable $z$ represents a cyclic right shift in $R_m$. A binary cyclic code of length $m$ is defined as a subset of $R_m$ that is closed under addition and under multiplication by elements of $R_m$.
In this patent only the simple parity-check code $C_m$ is considered. It consists of the polynomials in $R_m$ with an even number of nonzero coefficients:
$$C_m = \{a(z)(1+z) : a(z) \in R_m\}. \qquad (3)$$
Over the binary field $\mathbb{F}_2$, the parity-check code $C_m$ has dimension $m-1$, and its check polynomial is $h(z) := 1 + z + \dots + z^{m-1}$.
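As a rough illustration of these definitions, the following Python sketch models an element of $R_m$ as an integer whose bit $j$ holds the coefficient of $z^j$. The integer representation, the choice $m = 7$, and the helper names `shift`, `mul`, and `in_Cm` are assumptions made for this sketch and do not come from the patent.

```python
M = 7                    # odd block length (illustrative choice)
MASK = (1 << M) - 1
H = MASK                 # h(z) = 1 + z + ... + z^(m-1): all m coefficients set


def shift(a: int, t: int = 1) -> int:
    """Multiply a(z) by z**t in R_m, i.e. cyclically shift the m coefficients."""
    t %= M
    return ((a << t) | (a >> (M - t))) & MASK


def mul(a: int, b: int) -> int:
    """Multiply a(z) * b(z) in R_m using only cyclic shifts and XORs."""
    out = 0
    for j in range(M):
        if (b >> j) & 1:
            out ^= shift(a, j)
    return out


def in_Cm(a: int) -> bool:
    """C_m membership: an even number of nonzero coefficients."""
    return bin(a).count("1") % 2 == 0


# The multiples of (1 + z) are exactly the even-weight polynomials,
# and h(z) annihilates every element of C_m.
multiples = {mul(a, 0b11) for a in range(1 << M)}
even_weight = {a for a in range(1 << M) if in_Cm(a)}
print(multiples == even_weight, len(even_weight) == 2 ** (M - 1))
print(all(mul(c, H) == 0 for c in even_weight))
```

Under this representation, addition in $R_m$ is a plain XOR of two integers, which is why no separate `add` helper is needed.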
Encoding framework of the binary cyclic code:
A binary cyclic code is defined as a linear code over $R_m$ that uses the binary parity-check code $C_m$ as its alphabet. Specifically, given an odd number $m$ and positive integers $k$ and $v$, binary cyclic encoding is a mapping from $\mathbb{F}_2^{(m-1)k}$ to $C_m^v$, which can be represented by a $k \times v$ generator matrix $G$ over the ring $R_m$. Encoding proceeds in two steps. First, the $(m-1)k$ bits are divided equally into $k$ groups of $m-1$ bits each, and a parity bit is appended to each group to produce a polynomial in $C_m$. The resulting polynomials together form a $k$-tuple $w = (s_1(z), \dots, s_k(z)) \in C_m^k$. Second, the codeword corresponding to the $(m-1)k$ input bits is obtained as $wG$.
Hereafter, a polynomial in $C_m$ obtained by appending a parity bit is called an original data packet, or simply a data packet, and a polynomial in $wG$ is called a coded packet. A coded packet is an $R_m$-linear combination of the $k$ data packets, with coding coefficients that are polynomials in $R_m$.
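The two encoding steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: polynomials are stored as integers (bit $j$ is the coefficient of $z^j$), the function names `packetize` and `encode` are hypothetical, and the generator matrix `G` is an illustrative choice rather than a matrix taken from the patent.

```python
def shift(a: int, t: int, m: int) -> int:
    """Multiply a(z) by z**t in R_m (cyclic shift of m coefficients)."""
    t %= m
    return ((a << t) | (a >> (m - t))) & ((1 << m) - 1)


def mul(a: int, b: int, m: int) -> int:
    """Product a(z) * b(z) in R_m, built from shifts and XORs only."""
    out = 0
    for j in range(m):
        if (b >> j) & 1:
            out ^= shift(a, j, m)
    return out


def packetize(bits: list, m: int, k: int) -> list:
    """Step 1: split (m-1)k bits into k groups and append a parity bit to each."""
    if len(bits) != (m - 1) * k:
        raise ValueError("expected (m-1)*k input bits")
    packets = []
    for i in range(k):
        group = bits[i * (m - 1):(i + 1) * (m - 1)]
        coeffs = group + [sum(group) % 2]          # parity bit makes the weight even
        packets.append(sum(c << j for j, c in enumerate(coeffs)))
    return packets


def encode(w: list, G: list, m: int) -> list:
    """Step 2: compute the row-vector product w G over R_m."""
    coded = []
    for col in range(len(G[0])):
        acc = 0
        for i, s in enumerate(w):
            acc ^= mul(s, G[i][col], m)
        coded.append(acc)
    return coded


m, k = 7, 2
Z = 0b10                                           # the polynomial z
G = [[1, 0, 1, 1],
     [0, 1, 1, Z]]                                 # hypothetical 2 x 4 generator matrix
bits = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
w = packetize(bits, m, k)
coded = encode(w, G, m)
print([format(p, "07b") for p in coded])
```

Because $C_m$ is closed under multiplication by elements of $R_m$, every coded packet produced this way again has even weight.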
The encoding process is illustrated by an example. Suppose $2(m-1)$ information bits are to be stored on 4 storage nodes, where $m$ is a positive odd integer, and each node stores $m$ bits. Nodes 1 and 2 are called information nodes; each stores $m-1$ information bits and 1 parity bit. Nodes 3 and 4 are called coding nodes; each stores $m$ coded bits.
The $2(m-1)$ information bits are divided equally into two parts. The bits of the first part are denoted $s_{1,0}, s_{1,1}, \dots, s_{1,m-2}$ and those of the second part $s_{2,0}, s_{2,1}, \dots, s_{2,m-2}$. For $i = 1, 2$, node $i$ stores the bits $s_{i,0}, s_{i,1}, \dots, s_{i,m-2}$ together with the parity bit
$$s_{i,m-1} := s_{i,0} + s_{i,1} + \dots + s_{i,m-2}.$$
Node 3 stores the bits
$$s_{3,j} := s_{1,j} + s_{2,j}, \quad j = 0, 1, \dots, m-1,$$
and node 4 stores the bits
$$s_{4,j} := s_{1,j} + s_{2,j \oplus 1}, \quad j = 0, 1, \dots, m-1,$$
where the symbol $\oplus$ denotes addition modulo $m$. Table I gives an example with $m = 7$. The coded bits in node 3 are the sums of the bits in nodes 1 and 2, while the coded bits in node 4 are the sums of the bits in node 1 and a cyclic shift of the bits in node 2.
Table I: An example of a 4-node binary cyclic code.
We now show that the data in any two nodes suffice to recover the original information bits. From nodes 1 and 2 the information bits are available directly. To decode from nodes 1 and 3, subtract $s_{1,j}$ from $s_{3,j} = s_{1,j} + s_{2,j}$ to obtain $s_{2,j}$. Similarly, the information bits can be recovered from any one information node and any one coding node. Finally, consider decoding from nodes 3 and 4. First compute
$$s_{3,j} + s_{4,j} = s_{2,j} + s_{2,j \oplus 1}$$
for $j = 1, 3, 5, \dots, m-2$. Next, $s_{2,0}$ can be computed as
$$s_{2,0} = \sum_{j = 1, 3, \dots, m-2} (s_{3,j} + s_{4,j}),$$
because the right-hand side equals $s_{2,1} + s_{2,2} + \dots + s_{2,m-1}$ and the $m$ bits of node 2 sum to zero.
Once $s_{2,0}$ is known, $s_{1,0}$ follows from $s_{3,0} + s_{2,0} = s_{1,0}$, and $s_{2,1}$ follows from $s_{4,0} + s_{1,0} = s_{2,1}$. The remaining information bits can be decoded iteratively in the same way.
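The recovery from nodes 3 and 4 can be sketched at the bit level in Python. The sketch assumes that node 4 stores $s_{1,j} + s_{2,(j+1) \bmod m}$, an index convention inferred from the decoding steps $s_{4,0} + s_{1,0} = s_{2,1}$ in the text; the function names are hypothetical.

```python
M = 7  # odd number of bits per node (illustrative choice)


def add_parity(info_bits: list) -> list:
    """Append the parity bit so that the m stored bits XOR to zero."""
    return info_bits + [sum(info_bits) % 2]


def encode_nodes(s1: list, s2: list) -> tuple:
    """Compute the coded bits of node 3 and node 4 from nodes 1 and 2."""
    n3 = [s1[j] ^ s2[j] for j in range(M)]
    n4 = [s1[j] ^ s2[(j + 1) % M] for j in range(M)]   # assumed shift direction
    return n3, n4


def decode_from_3_and_4(n3: list, n4: list) -> tuple:
    """Recover the bits of nodes 1 and 2 using XORs only."""
    s1, s2 = [0] * M, [0] * M
    # n3[j] ^ n4[j] = s2[j] ^ s2[j+1]; over j = 1, 3, ..., m-2 these pairs cover
    # s2[1..m-1], whose XOR equals s2[0] because node 2 has even parity.
    for j in range(1, M - 1, 2):
        s2[0] ^= n3[j] ^ n4[j]
    # Alternate between node 3 and node 4 to peel off one unknown at a time.
    for j in range(M):
        s1[j] = n3[j] ^ s2[j]
        if j + 1 < M:
            s2[j + 1] = n4[j] ^ s1[j]
    return s1, s2


s1 = add_parity([1, 0, 1, 1, 0, 0])
s2 = add_parity([0, 1, 1, 1, 0, 1])
n3, n4 = encode_nodes(s1, s2)
print(decode_from_3_and_4(n3, n4) == (s1, s2))
```

Each recovered bit costs one XOR in the peeling loop, which matches the claim that decoding needs nothing beyond exclusive-or operations.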
The above is an example of a binary cyclic code with parameters $m = 7$, $k = 2$, and $v = 4$. The two data packets are $s_i(z) = s_{i,0} + s_{i,1} z + \dots + s_{i,m-1} z^{m-1}$ for $i = 1, 2$, and the code has a $2 \times 4$ generator matrix $G$ over $R_m$.
Because $c(z)h(z) = 0$ for every $c(z) \in C_m$, a coded packet can be expressed as a linear combination of the data packets in more than one way. The check polynomial $h(z)$ can be added to any entry of the generator matrix $G$ without changing the coded packets, and any matrix obtained in this way can equally serve as the generator matrix in this example.
Decoding method of the binary cyclic code
A set of $k$ coded packets is said to be decodable if the $k$ original data packets can be recovered from them. This subsection gives a necessary and sufficient condition for decodability. First, some notation. A polynomial $f(z) \in R_m$ is called $C_m$-invertible if there is another polynomial $\bar{f}(z) \in R_m$ such that $f(z)\bar{f}(z)$ equals $1$ or $1 + h(z)$. For a subset $I \subseteq \{1, 2, \dots, v\}$ with $|I| = k$, let $G_I$ be the $k \times k$ submatrix of $G$ formed by the columns indexed by $I$.
We first give the sufficient condition: if $\det(G_I)$ is $C_m$-invertible, then the $k$ coded packets indexed by $I$ are decodable.
Let $s_1(z), \dots, s_k(z)$ be the $k$ data packets and $p_1(z), \dots, p_k(z)$ the $k$ coded packets indexed by $I$. The encoding is
$$(p_1(z), \dots, p_k(z)) = (s_1(z), \dots, s_k(z)) \cdot G_I.$$
Suppose the determinant of $G_I$ is $C_m$-invertible. By definition there is a polynomial $\delta(z)$ in $R_m$ such that $\delta(z)\det(G_I)$ equals $1$ or $1 + h(z)$. The original data packets can therefore be recovered from the $k$ coded packets as follows:
$$\begin{aligned}
(p_1(z), \dots, p_k(z)) \cdot \operatorname{adj}(G_I) \cdot \delta(z)
&= (s_1(z), \dots, s_k(z)) \cdot G_I \cdot \operatorname{adj}(G_I) \cdot \delta(z) \\
&= (s_1(z), \dots, s_k(z)) \cdot \det(G_I) \cdot \delta(z) \\
&= (s_1(z), \dots, s_k(z)).
\end{aligned}$$
Here $\operatorname{adj}(G_I)$ is the adjugate matrix of $G_I$. The last step uses a property of $C_m$: if $s_i(z) \in C_m$, then $s_i(z)(1 + h(z)) = s_i(z)$.
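The adjugate-based recovery can be tried out numerically for $k = 2$ and $m = 7$. The following Python sketch is only illustrative: the submatrix `G_I` is an assumed example whose determinant is $1 + z$, polynomials are stored as integers (bit $j$ is the coefficient of $z^j$), $\delta(z)$ is found by exhaustive search, and all function names are hypothetical.

```python
M = 7
H = (1 << M) - 1          # h(z) = 1 + z + ... + z^6
ONE = 1


def shift(a: int, t: int) -> int:
    t %= M
    return ((a << t) | (a >> (M - t))) & H


def mul(a: int, b: int) -> int:
    """Product in R_m, computed with shifts and XORs."""
    out = 0
    for j in range(M):
        if (b >> j) & 1:
            out ^= shift(a, j)
    return out


def det2(g):
    """Determinant of a 2 x 2 matrix over R_m (signs vanish in characteristic 2)."""
    return mul(g[0][0], g[1][1]) ^ mul(g[0][1], g[1][0])


def adj2(g):
    """Adjugate of a 2 x 2 matrix over R_m."""
    return [[g[1][1], g[0][1]], [g[1][0], g[0][0]]]


def cm_inverse(f: int):
    """Search for delta(z) with f * delta equal to 1 or 1 + h(z); None if absent."""
    for d in range(1 << M):
        if mul(f, d) in (ONE, ONE ^ H):
            return d
    return None


def vecmat(vec, mat):
    """Row vector times 2 x 2 matrix over R_m."""
    return [mul(vec[0], mat[0][c]) ^ mul(vec[1], mat[1][c]) for c in range(2)]


def decode(p, g_i):
    """Recover (s1, s2) as p * adj(G_I) * delta."""
    delta = cm_inverse(det2(g_i))
    if delta is None:
        raise ValueError("det(G_I) is not C_m-invertible")
    return [mul(x, delta) for x in vecmat(p, adj2(g_i))]


G_I = [[1, 1], [1, 0b10]]                 # assumed submatrix, det = 1 + z
s = [0b1001101, 0b1011110]                # two even-weight data packets
p = vecmat(s, G_I)
print(decode(p, G_I) == s)
```

The exhaustive search for $\delta(z)$ is affordable only for small $m$; it stands in for whatever inversion procedure a real implementation would use.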
Next we give a necessary and sufficient condition for a polynomial in $R_m$ to be $C_m$-invertible. Let $f_1(z) f_2(z) \cdots f_L(z)$ be the factorization of the check polynomial $h(z)$ into irreducible factors over the binary field $\mathbb{F}_2$. The irreducible polynomials $f_1(z), \dots, f_L(z)$ all divide $1 + z^m$ and are pairwise distinct because $m$ is odd. In a general commutative ring $R$ with identity, an element $u \in R$ is called a unit if there is an element $\bar{u} \in R$ such that $u\bar{u}$ equals the identity of $R$.
Let $f_1(z), f_2(z), \dots, f_L(z)$ be the irreducible factors of the check polynomial $h(z)$ of the parity-check code $C_m$, and let $a(z)$ be a polynomial in $R_m$. Then the following conditions are equivalent:
1) $a(z)$ is $C_m$-invertible.
2) $a(z) \bmod h(z)$ is a unit in $\mathbb{F}_2[z]/(h(z))$.
3) $a(z) \bmod f_l(z)$ is a unit in $\mathbb{F}_2[z]/(f_l(z))$ for all $l = 1, 2, \dots, L$.
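Condition 2) suggests a simple computational test: $a(z) \bmod h(z)$ is a unit exactly when $\gcd(a(z), h(z)) = 1$ over $\mathbb{F}_2$. The Python sketch below implements this test and compares it with a brute-force search based directly on the definition of $C_m$-invertibility. Polynomials are stored as integers (bit $j$ is the coefficient of $z^j$), and the function names are hypothetical.

```python
def deg(a: int) -> int:
    """Degree of a nonzero polynomial over F_2 stored as an integer."""
    return a.bit_length() - 1


def pmod(a: int, b: int) -> int:
    """Remainder of a(z) divided by b(z) over F_2 (b must be nonzero)."""
    while a and deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a


def pgcd(a: int, b: int) -> int:
    """Greatest common divisor over F_2 via the Euclidean algorithm."""
    while b:
        a, b = b, pmod(a, b)
    return a


def is_cm_invertible(a: int, m: int) -> bool:
    """Condition 2: a(z) is a unit modulo h(z) = 1 + z + ... + z^(m-1)."""
    h = (1 << m) - 1
    return pgcd(h, a) == 1


def mul_rm(a: int, b: int, m: int) -> int:
    """Product in R_m = F_2[z]/(1 + z^m) using cyclic shifts and XORs."""
    mask = (1 << m) - 1
    out = 0
    for j in range(m):
        if (b >> j) & 1:
            out ^= ((a << j) | (a >> (m - j))) & mask
    return out


def brute_force_invertible(a: int, m: int) -> bool:
    """Definition: some d(z) in R_m gives a*d equal to 1 or 1 + h(z)."""
    h = (1 << m) - 1
    return any(mul_rm(a, d, m) in (1, 1 ^ h) for d in range(1 << m))


m = 7
agree = all(is_cm_invertible(a, m) == brute_force_invertible(a, m)
            for a in range(1 << m))
print(agree, sum(is_cm_invertible(a, m) for a in range(1 << m)))
```

The gcd test runs in time polynomial in $m$, whereas the brute-force search is exponential and is included only as a cross-check for small $m$.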
Define $f_0(z)$ as the polynomial $1 + z$. By the Chinese remainder theorem, the ring $R_m$ is isomorphic to
$$R'_m := \mathbb{F}_2[z]/(1+z) \oplus \mathbb{F}_2[z]/(h(z)).$$
Indeed, define the mapping $\varphi : R_m \to R'_m$ by
$$a(z) \mapsto (a(z) \bmod 1+z, \; a(z) \bmod h(z)),$$
and define the inverse mapping $\varphi' : R'_m \to R_m$ by
$$(a_0(z), a_1(z)) \mapsto h(z)a_0(z) + (1 + h(z))a_1(z) \bmod 1 + z^m.$$
These two mappings are inverses of each other. Suppose $a(z) \bmod h(z)$ is a unit in $\mathbb{F}_2[z]/(h(z))$. Then there is a polynomial $d(z)$ such that $\varphi(a(z)d(z)) = (c, 1)$, where $c$ equals 0 or 1. Hence $a(z)d(z)$ equals $\varphi'((0,1)) = 1 + h(z)$ or $\varphi'((1,1)) = 1$, so $a(z)$ is $C_m$-invertible.
Conversely, suppose $a(z)$ is $C_m$-invertible, that is, there is a polynomial $\bar{a}(z)$ such that $a(z)\bar{a}(z)$ equals $1$ or $1 + h(z)$. Applying the mapping $\varphi$ to $a(z)\bar{a}(z)$ gives $\varphi(a(z)\bar{a}(z)) = (c, 1)$ for some $c \in \mathbb{F}_2$. Therefore $a(z)\bar{a}(z) \equiv 1 \pmod{h(z)}$, and $a(z) \bmod h(z)$ is a unit.
Since $h(z)$ factors as $f_1(z) f_2(z) \cdots f_L(z)$, the Chinese remainder theorem shows that conditions 2) and 3) of the theorem are equivalent.
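The two mappings can be checked numerically for a small case. The Python sketch below takes $m = 7$, stores polynomials as integers (bit $j$ is the coefficient of $z^j$), and uses the fact that $a(z) \bmod (1+z)$ is the parity of the number of nonzero coefficients. The names `phi` and `phi_inv` are hypothetical labels for $\varphi$ and $\varphi'$.

```python
M = 7
H = (1 << M) - 1                      # h(z) = 1 + z + ... + z^6


def pmod(a: int, b: int) -> int:
    """Remainder of a(z) divided by b(z) over F_2 (b must be nonzero)."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def mul_rm(a: int, b: int) -> int:
    """Product in R_m = F_2[z]/(1 + z^m) using cyclic shifts and XORs."""
    out = 0
    for j in range(M):
        if (b >> j) & 1:
            out ^= ((a << j) | (a >> (M - j))) & H
    return out


def phi(a: int) -> tuple:
    """a(z) -> (a(z) mod 1+z, a(z) mod h(z))."""
    return (bin(a).count("1") % 2, pmod(a, H))


def phi_inv(a0: int, a1: int) -> int:
    """(a0, a1) -> h(z) a0 + (1 + h(z)) a1 mod 1 + z^m."""
    return mul_rm(H, a0) ^ mul_rm(1 ^ H, a1)


print(all(phi_inv(*phi(a)) == a for a in range(1 << M)))
print(phi(1), phi(1 ^ H))             # images of 1 and of 1 + h(z)
```

The second line of output shows the two pairs whose second component is 1, which are exactly the two products allowed in the definition of $C_m$-invertibility.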
Let $I \subseteq \{1, 2, \dots, v\}$ be an index set of cardinality $k$. The coded packets indexed by $I$ are decodable if and only if $\det(G_I)$ is $C_m$-invertible.
The "sufficient" part has already been proved. For the "necessary" part, suppose $\det(G_I)$ is not $C_m$-invertible. Then $\det(G_I) \equiv 0 \pmod{f_{l_0}(z)}$ for some $l_0 \in \{1, 2, \dots, L\}$. If the entries of $G_I$ are reduced modulo $f_{l_0}(z)$, the resulting matrix is singular over the finite field $\mathbb{F}_2[z]/(f_{l_0}(z))$. Hence there is a nonzero vector $(b_1(z), \dots, b_k(z))$, with every entry in $\mathbb{F}_2[z]/(f_{l_0}(z))$, such that $(b_1(z), \dots, b_k(z)) \, G_I \equiv 0 \pmod{f_{l_0}(z)}$. For $j = 1, 2, \dots, k$, choose $a_j(z) \in C_m$ satisfying $a_j(z) \equiv b_j(z) \pmod{f_{l_0}(z)}$ and $a_j(z) \equiv 0 \pmod{f_l(z)}$ for all $l \neq l_0$. Then $(a_1(z), \dots, a_k(z))$ is a nonzero vector.
If the $a_j(z)$ are taken as the original data packets, the $k$-tuple $(a_1(z), a_2(z), \dots, a_k(z)) \, G_I$ is the zero $k$-tuple. The encoding map is therefore not injective, and the coded packets indexed by $I$ are not decodable.
We continue the earlier example with $m = 7$. The polynomial $1 + z^7$ factors as the product of $f_0(z) = 1 + z$, $f_1(z) = 1 + z + z^3$, and $f_2(z) = 1 + z^2 + z^3$. It can be checked that any two coded packets are decodable. For example, if the index set is $I = \{3, 4\}$, the determinant $\det(G_I) = 1 + z$ is divisible by neither $f_1(z)$ nor $f_2(z)$. Indeed, $1 + z$ is $C_m$-invertible because
$$(1+z)(z + z^3 + z^5) = z + z^2 + \dots + z^6 = 1 + h(z).$$
Therefore the two data packets can be computed from nodes 3 and 4.
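The arithmetic in this example is easy to confirm with a few lines of Python. The sketch below multiplies polynomials over $\mathbb{F}_2$ stored as integers (bit $j$ is the coefficient of $z^j$) and checks that $1 + z^7 = (1+z)(1+z+z^3)(1+z^2+z^3)$ and that $(1+z)(z+z^3+z^5) = 1 + h(z)$. The helper name `clmul` is an assumption of this sketch.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two polynomials over F_2 (no modular reduction)."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out


F0 = 0b11        # 1 + z
F1 = 0b1011      # 1 + z + z^3
F2 = 0b1101      # 1 + z^2 + z^3
H = 0b1111111    # h(z) = 1 + z + ... + z^6

print(clmul(F1, F2) == H)                          # h(z) = f1(z) f2(z)
print(clmul(F0, H) == 0b10000001)                  # (1 + z) h(z) = 1 + z^7
print(clmul(F0, 0b101010) == (1 ^ H))              # (1+z)(z+z^3+z^5) = 1 + h(z)
```

All three products have degree at most 7, so the last identity holds in $R_7$ without any reduction modulo $1 + z^7$.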
In a software implementation, cyclic shifts can be realized with pointers. The $m$ bits are stored contiguously in memory, and a pointer holds the head address of the packet. A cyclic shift then only modifies the pointer and leaves the packet itself untouched. Byte-wise cyclic shifts can also be used in place of bit-wise cyclic shifts, which is easier to handle in software.
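The pointer idea can be sketched in Python with a shared buffer and a head offset standing in for the pointer. The class name `PacketView` and its methods are hypothetical; the sketch only illustrates that a shift changes the offset and never copies or rewrites the stored bits.

```python
class PacketView:
    """Read-only view of an m-bit packet: a shared buffer plus a head offset."""

    def __init__(self, buf: list, head: int = 0):
        self.buf = buf                      # shared storage, never modified
        self.head = head % len(buf)         # plays the role of the head pointer

    def times_z(self, t: int = 1) -> "PacketView":
        """Cyclic right shift by t positions: only the offset changes."""
        return PacketView(self.buf, self.head - t)

    def __getitem__(self, j: int) -> int:
        return self.buf[(self.head + j) % len(self.buf)]

    def to_list(self) -> list:
        return [self[j] for j in range(len(self.buf))]


def xor(a: PacketView, b: PacketView) -> list:
    """Bitwise XOR of two views, materialized as a new list."""
    return [x ^ y for x, y in zip(a.to_list(), b.to_list())]


buf = [1, 0, 0, 1, 0, 1, 1]
packet = PacketView(buf)
shifted = packet.times_z()
print(shifted.to_list())                    # last bit wraps around to the front
print(shifted.buf is buf)                   # the underlying storage is shared
```

A byte-oriented variant would keep the same structure and let each buffer entry hold a whole byte, so that one XOR processes eight independent codewords at once.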
In addition, it can be verified that the prior-art RDP code can be regarded as an instance of the binary cyclic code proposed in this patent, for a particular choice of generator matrix.
Compared with earlier coding schemes such as the RDP code, the binary cyclic code is more general, and RDP is a special case of it.
The above is a further detailed description of the invention in connection with specific preferred embodiments, and the implementation of the invention is not limited to these descriptions. For a person of ordinary skill in the art to which the invention belongs, simple deductions or substitutions made without departing from the concept of the invention shall all be regarded as falling within the scope of protection of the invention.
Claims (5)
- 一种二进制循环码的编解码框架,其特征在于:由线性码和字母表组成,该线性码为二进制循环码,二进制循环码为一种基于Rm的线性码,二进制奇偶校验码Cm作为字母表;其中,Rm表示为一个多项式环,Rm:=F2[z]/(1+zm).,矢量对应为环Rm中的多项式变量z代表环Rm的循环右移操作;Cm由Rm上的偶数个非零系数的多项式组成Cm={a(z)(1+z):a(z)∈Rm},在基于二进制有限域F2中,奇偶校验码Cm的维度是m-1,Cm的校验多项式是h(z):=1+z+…+zm-1。A codec frame of a binary cyclic code, characterized in that it consists of a linear code and a alphabet, the linear code is a binary cyclic code, the binary cyclic code is a linear code based on R m , and the binary parity code C m As the alphabet; where R m is expressed as a polynomial ring, R m :=F 2 [z]/(1+z m )., vector Corresponding to a polynomial in the ring R m The variable z represents a cyclic right shift operation of the ring R m ; C m consists of an even number of non-zero coefficient polynomials over R m C m ={a(z)(1+z): a(z)∈R m }, In the binary finite field F 2 based, the dimension of the parity code C m is m-1, and the check polynomial of C m is h(z):=1+z+...+z m-1 .
- 根据权利要求1所述的二进制循环码的编解码框架,其特征在于:二进制循环码的编解码框架中,给定一个奇数m和正整数k、v,二进制循环码是一种从F2到的映射,由环Rm上的一个k×v生成矩阵来表示,编码过程具体为:(A)将(m-1)k个比特均分到k组中,每一组包含m-1个比特,对于每一个组的m-1个比特,增添一个奇偶校验比特,生成一个Cm上的多项式,将生成的多项式一起组成一个k元组二进制循环编码是对应于(m-1)k个输入比特的编码,通过wG得到;(B)通过添加一个奇偶校验比特得到的Cm中的多项式称为原始数据包或者数据包,将一个wG中的多项式称为编码包,一个编码包是k个数据包的Rm线性组合,编码系数为Rm中的多项式,其中G为对应的编码包的生成矩阵。The codec framework of the binary cyclic code according to claim 1, wherein: in the codec frame of the binary cyclic code, an odd number m and a positive integer k, v are given, and the binary cyclic code is a type from F 2 to The mapping is represented by a k×v generation matrix on the ring R m . The encoding process is specifically as follows: (A) equally divide (m-1) k bits into k groups, each group containing m-1 Bit, for each group of m-1 bits, add a parity bit, generate a polynomial on C m , and combine the generated polynomials together into a k-tuple The binary cyclic coding is an encoding corresponding to (m-1) k input bits, obtained by wG; (B) a polynomial in C m obtained by adding a parity bit is called an original data packet or a data packet, and one will be The polynomial in wG is called an encoding packet, and one encoding packet is a linear combination of R m of k data packets, and the coding coefficient is a polynomial in R m , where G is a generation matrix of the corresponding coding packet.
- The encoding and decoding framework for a binary cyclic code according to claim 1, characterized in that the encoding and decoding algorithms of the framework involve only binary exclusive-or (XOR) operations.
- The encoding and decoding framework for a binary cyclic code according to claim 1, characterized in that the decoding process recovers the $k$ original data packets from $k$ coded packets, as follows: let $s_1(z), \ldots, s_k(z)$ be the $k$ data packets and $p_1(z), \ldots, p_k(z)$ the $k$ coded packets indexed by $I$; decoding solves $(p_1(z), \ldots, p_k(z)) = (s_1(z), \ldots, s_k(z)) \cdot G_I$ for the data packets, where $G_I$ is the $k \times k$ submatrix of $G$ obtained by keeping the columns indexed by $I$.
- The encoding and decoding framework for a binary cyclic code according to claim 1, characterized in that the binary cyclic code is closed under addition and multiplication.
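The operations named in the claims can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names (`add`, `shift`, `mul`, `to_packet`, `encode`, `solve_one_plus_z`, `decode_from_last_two`), the parameters m = 5, k = 2, v = 4, and the example generator matrix `G` are hypothetical choices made for this example. Polynomials in R_m are stored as coefficient lists, so addition is XOR and multiplication by z is a cyclic right shift. Decoding is shown only for one particular pair of coded packets, where the system reduces to (1 + z)·s2 = p3 + p4; for odd m that equation has exactly one even-weight solution.

```python
from typing import List

Poly = List[int]  # coefficients [a_0, ..., a_{m-1}] of a polynomial in R_m


def add(a: Poly, b: Poly) -> Poly:
    """Addition in R_m is a coefficient-wise XOR."""
    return [x ^ y for x, y in zip(a, b)]


def shift(a: Poly, j: int) -> Poly:
    """Multiply by z**j, i.e. cyclically shift right by j positions."""
    j %= len(a)
    return a[-j:] + a[:-j]


def mul(a: Poly, b: Poly) -> Poly:
    """Product in R_m: XOR together the shifts of b selected by the bits of a."""
    out = [0] * len(b)
    for j, bit in enumerate(a):
        if bit:
            out = add(out, shift(b, j))
    return out


def to_packet(bits: List[int]) -> Poly:
    """Append one parity bit to m-1 data bits, giving a polynomial in C_m."""
    return bits + [sum(bits) % 2]


def encode(w: List[Poly], G: List[List[Poly]]) -> List[Poly]:
    """Compute wG: coded packet j is the XOR over i of G[i][j] * w[i]."""
    coded = []
    for j in range(len(G[0])):
        acc = [0] * len(w[0])
        for i, s in enumerate(w):
            acc = add(acc, mul(G[i][j], s))
        coded.append(acc)
    return coded


def solve_one_plus_z(c: Poly) -> Poly:
    """Return the even-weight x with (1 + z) x = c (m odd, c in C_m)."""
    x = [0]                      # try x_0 = 0
    for ci in c[1:]:
        x.append(x[-1] ^ ci)     # x_i = c_i + x_{i-1}
    if sum(x) % 2:               # the other solution is x + h(z)
        x = [b ^ 1 for b in x]
    return x


def decode_from_last_two(p3: Poly, p4: Poly) -> List[Poly]:
    """Recover (s1, s2) from p3 = s1 + s2 and p4 = s1 + z*s2 (example G only)."""
    s2 = solve_one_plus_z(add(p3, p4))   # p3 + p4 = (1 + z) s2
    s1 = add(p3, s2)
    return [s1, s2]


# Hypothetical example with m = 5, k = 2, v = 4.
zero, one, z = [0] * 5, [1, 0, 0, 0, 0], [0, 1, 0, 0, 0]
G = [[one, zero, one, one],
     [zero, one, one, z]]
s1 = to_packet([1, 0, 1, 1])
s2 = to_packet([0, 1, 1, 0])
coded = encode([s1, s2], G)
recovered = decode_from_last_two(coded[2], coded[3])
```

In this sketch every step is a shift or an XOR, which matches the claim that encoding and decoding involve only exclusive-or operations. A general decoder would have to handle an arbitrary index set I and submatrix G_I, which this example does not attempt.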
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2015/089179 WO2017041232A1 (en) | 2015-09-08 | 2015-09-08 | Encoding and decoding framework for binary cyclic code |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017041232A1 (en) | 2017-03-16 |
Family ID=58240444
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2015/089179 WO2017041232A1 (en) | 2015-09-08 | 2015-09-08 | Encoding and decoding framework for binary cyclic code |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017041232A1 (en) |
- 2015-09-08: WO application PCT/CN2015/089179 (WO2017041232A1), active, Application Filing
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7734991B1 (en) * | 2007-01-04 | 2010-06-08 | The United States Of America As Represented By The Director, National Security Agency | Method of encoding signals with binary codes |
CN103208996A (en) * | 2013-04-17 | 2013-07-17 | 北京航空航天大学 | Method for coding frequency domains of quasi-cyclic codes |
Non-Patent Citations (2)
Title |
---|
Shum, K. W. et al.: "Regenerating Codes Over a Binary Cyclic Code", 2014 IEEE International Symposium on Information Theory, 29 June 2014 (2014-06-29), pages 1046-1050, XP032635109 * |
Zhou, Huanyin et al.: "Research on the Coding of Cyclic Code", Modern Electronic Technique, 15 October 2006 (2006-10-15), pages 10-12 and 15 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109062725A (en) * | 2018-08-09 | 2018-12-21 | 东莞理工学院 | A kind of coding framework method of binary system MDS array code |
CN110289864A (en) * | 2019-08-01 | 2019-09-27 | 东莞理工学院 | The optimal reparation access transform method and device of binary system MDS array code |
CN114828143A (en) * | 2022-03-19 | 2022-07-29 | 西安电子科技大学 | Wireless multi-hop transmission method, system, storage medium, equipment and terminal |
CN116560915A (en) * | 2023-07-11 | 2023-08-08 | 北京谷数科技股份有限公司 | Data recovery method and device, electronic equipment and storage medium |
CN116560915B (en) * | 2023-07-11 | 2023-09-19 | 北京谷数科技股份有限公司 | Data recovery method and device, electronic equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 15903339; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 15903339; Country of ref document: EP; Kind code of ref document: A1 |