KR101923116B1

KR101923116B1 - Apparatus for Encoding and Decoding in Distributed Storage System using Locally Repairable Codes and Method thereof

Info

Publication number: KR101923116B1
Application number: KR1020170116506A
Authority: KR
Inventors: 송홍엽; 남미영
Original assignee: 연세대학교 산학협력단
Priority date: 2017-09-12
Filing date: 2017-09-12
Publication date: 2018-11-28

Abstract

The present invention relates to an apparatus and method for encoding and decoding using a repairable code. In particular, it discloses an apparatus and method for encoding using a binary locally repairable code with an improved minimum distance. The encoding apparatus comprises: a parity check matrix generation unit that generates a parity check matrix for detecting errors in a repairable code that encodes data, in consideration of the number of nodes to be accessed to repair the data; and a distributed storage unit that generates a codeword by encoding the data using the generated parity check matrix and stores the generated codeword in a distributed manner in a distributed storage system.

Description

Apparatus and method for encoding/decoding using a locally repairable code in a distributed storage system

The present invention relates to an encoding/decoding apparatus and method using a repairable code. More specifically, it relates to an encoding/decoding apparatus and method using a locally repairable code in a distributed storage system.

A distributed storage system is a system for reliably storing large volumes of data. Major companies such as Microsoft, Google, and Facebook operate distributed storage systems to store their customers' data, so that big data can be stored safely and recovered.

A distributed storage system divides a large volume of data and stores it across many nodes connected by a network. Individual nodes can frequently become unavailable because of hardware faults, network connectivity problems, and the like. To store data reliably despite such node failures and to prevent loss of the entire data, the data is not simply split and stored. Instead, it is encoded with an error-correcting code, and the encoded data is then divided and stored on individual nodes.

Because individual nodes fail frequently, the data stored on a failed node must be regenerated through a repair process in order to keep the reliability of the distributed storage system at a constant level. Since such data loss and repair occur frequently in a distributed storage system, the efficiency of the repair process determines the performance of the entire system.

A locally repairable code (LRC) is widely used as a code with an efficient repair process suited to distributed storage systems, and locality is used as the measure of the efficiency of the repair process with a locally repairable code. Locality is the minimum number of nodes that must be accessed to repair one node. For effective application to distributed storage systems, locally repairable codes over small finite fields are being studied.

In conventional locally repairable codes, the minimum distance always remains 4, so reliability degrades as the code length increases, and these codes could not replace the existing repetition code scheme.

Accordingly, there is a need to develop a repairable code that can guarantee high reliability even as the code length increases.

Korean Patent Application Publication No. 10-2015-0131541

The present invention has been devised to solve the above problems and discloses an encoding/decoding apparatus using a binary locally repairable code. In particular, it discloses an encoding/decoding apparatus using a binary locally repairable code that has a locality of 2 or more and an improved minimum distance. It also discloses an encoding method using a binary locally repairable code.

To achieve the above object, an encoding apparatus using a repairable code according to the present invention includes: a parity check matrix generation unit that generates a parity check matrix defining a repairable code for encoding data, in consideration of the number of nodes to be accessed for data repair; and a distributed storage unit that generates a codeword by encoding the data using the generated parity check matrix and stores the generated codeword in a distributed manner in a distributed storage system.

In the present invention, the repairable code is a binary locally repairable code, and the parity check matrix generation unit may generate the parity check matrix by further considering the length of the repairable code and the length of the data.

The parity check matrix generation unit includes: a first sub-matrix generation unit that generates a first sub-matrix for securing the locality; and a second sub-matrix generation unit that generates a second sub-matrix, placed adjacent to the first sub-matrix, for securing the minimum distance of the data. The parity check matrix may be generated using the first sub-matrix and the second sub-matrix.

In the present invention, the first sub-matrix generation unit may determine the length of each row of the first sub-matrix in consideration of the locality and the length of the data, and the non-zero elements in different rows of the first sub-matrix may be arranged so that they do not fall in the same column.

In the first sub-matrix, the number of non-zero elements in each row may be set in consideration of the locality.

In the present invention, the second sub-matrix generation unit may generate a binary vector space of a predetermined dimension, partition the generated binary vector space into a set of sub-vector spaces whose dimension depends on the locality, and generate the second sub-matrix using the resulting set of sub-vector spaces.

The predetermined dimension is set in consideration of the locality and the size of the set of sub-vector spaces. Each sub-vector space may include, as its elements, the basis vectors corresponding to the dimension that depends on the locality, together with the sums of those basis vectors.

In the present invention, the second sub-matrix generation unit may take the elements of a Galois field whose order is determined by the predetermined dimension, expressed as powers of a primitive element, and use a primitive polynomial having that primitive element as a root to convert them into sums of powers of the primitive element with exponents smaller than the predetermined dimension. It may then generate the second sub-matrix, expressed as binary vectors, from the coefficients of the terms of the converted Galois field elements.
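The conversion from powers of a primitive element to binary vectors can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `power_table`, the bit-mask encoding of the polynomial, and the choice of GF(2^4) with the primitive polynomial x^4 + x + 1 are all assumptions made for the example.

```python
def power_table(m, poly):
    """Return the binary vectors of alpha^0 .. alpha^(2^m - 2) in GF(2^m).

    `poly` encodes the primitive polynomial as an integer bit mask,
    e.g. 0b10011 for x^4 + x + 1 (an assumed example polynomial).
    Each vector lists the coefficients of alpha^0, alpha^1, ..., alpha^(m-1).
    """
    table = []
    value = 1                      # alpha^0
    for _ in range(2 ** m - 1):
        table.append([(value >> i) & 1 for i in range(m)])
        value <<= 1                # multiply by alpha
        if value >> m:             # degree reached m: reduce with the polynomial
            value ^= poly
    return table


vectors = power_table(4, 0b10011)
for exponent, vec in enumerate(vectors):
    print(f"alpha^{exponent:2d} -> {vec}")
```

With this polynomial, alpha^4 reduces to alpha + 1, so its vector has 1s in the first two coordinates only. A different primitive polynomial of the same degree would give a different but equally valid table.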

The size of the set of sub-vector spaces partitioning the binary vector space is set in consideration of the locality, the dimension of the sub-vector space that depends on the locality, and whether the predetermined dimension is divisible by that sub-vector space dimension. The primitive polynomial may include every form of primitive polynomial that can exist for the predetermined dimension.

In the present invention, the distributed storage unit may convert the generated parity check matrix into a systematic form consisting of a first identity matrix and a remaining sub-matrix, transpose that sub-matrix of the converted parity check matrix, generate an encoding matrix consisting of the transposed sub-matrix and a second identity matrix, and generate the codeword using the generated encoding matrix.
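A minimal sketch of this conversion over GF(2) is given below. It assumes the systematic form is H = [I | P] with the identity block on the left and the encoding matrix is G = [P^T | I]; the source does not state the block order, so that layout, the function names, and the small example matrix (a (7,4) Hamming-type code, not the patent's code) are assumptions.

```python
from itertools import product


def to_systematic(H):
    """Row-reduce H over GF(2) to [I | P].

    Assumes the leading square block is invertible; otherwise a column
    permutation would be needed, which this sketch does not attempt.
    """
    H = [row[:] for row in H]
    m = len(H)
    for col in range(m):
        pivot = next((r for r in range(col, m) if H[r][col]), None)
        if pivot is None:
            raise ValueError("leading block is singular")
        H[col], H[pivot] = H[pivot], H[col]
        for r in range(m):
            if r != col and H[r][col]:
                H[r] = [a ^ b for a, b in zip(H[r], H[col])]
    return H


def generator_from_systematic(Hs):
    """Given Hs = [I | P], return G = [P^T | I]."""
    m, n = len(Hs), len(Hs[0])
    k = n - m
    P = [row[m:] for row in Hs]
    return [[P[i][j] for i in range(m)] + [int(c == j) for c in range(k)]
            for j in range(k)]


def encode(message, G):
    """Multiply the message row vector by G over GF(2)."""
    return [sum(message[i] & G[i][j] for i in range(len(G))) % 2
            for j in range(len(G[0]))]


# Assumed example parity check matrix (columns are the 7 non-zero 3-bit vectors).
H = [[1, 1, 0, 0, 0, 1, 1],
     [0, 1, 1, 1, 0, 0, 1],
     [0, 0, 1, 0, 1, 1, 1]]
Hs = to_systematic(H)
G = generator_from_systematic(Hs)
codeword = encode([1, 0, 1, 1], G)
```

Because row operations do not change the set of vectors annihilated by H, every codeword produced by G satisfies the original parity checks as well as the systematic ones.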

To achieve the above object, a decoding apparatus using a repairable code according to the present invention includes: a parity check matrix generation unit that generates a parity check matrix defining a repairable code for encoding data, in consideration of the number of nodes to be accessed for data repair; a distributed storage unit that generates a codeword by encoding the data using the generated parity check matrix and stores the generated codeword in a distributed manner in a distributed storage system; and a decoding unit that receives the indices of code blocks stored in the distributed storage system, selects from the parity check matrix a sub-matrix consisting of the columns for the received indices, and performs decoding by applying Gaussian elimination to the selected sub-matrix.
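One way to read the decoding step is as erasure decoding: the columns of the parity check matrix at the lost positions form a sub-matrix, and Gaussian elimination over GF(2) solves for the lost symbols. The sketch below works under that assumption; the interpretation of the "received indices" as erased positions, the function name `decode_erasures`, and the example matrix and codeword are illustrative choices rather than details from the source.

```python
def decode_erasures(H, received, erased):
    """Recover erased symbols by solving H_E x = H_K c_K over GF(2).

    `received` holds the known symbols (entries at erased positions are
    ignored); `erased` lists the indices of the lost symbols.
    """
    lost = set(erased)
    # Contribution of the known symbols to each parity check.
    rhs = [sum(row[j] & received[j] for j in range(len(row)) if j not in lost) % 2
           for row in H]
    # Augmented matrix built from the columns of H at the erased positions.
    A = [[row[j] for j in erased] + [b] for row, b in zip(H, rhs)]
    for col in range(len(erased)):
        pivot = next((r for r in range(col, len(A)) if A[r][col]), None)
        if pivot is None:
            raise ValueError("erasure pattern is not uniquely decodable")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(len(A)):
            if r != col and A[r][col]:
                A[r] = [a ^ b for a, b in zip(A[r], A[col])]
    result = list(received)
    for i, j in enumerate(erased):
        result[j] = A[i][-1]
    return result


# Assumed example: a small parity check matrix and one of its codewords.
H = [[1, 1, 0, 0, 0, 1, 1],
     [0, 1, 1, 1, 0, 0, 1],
     [0, 0, 1, 0, 1, 1, 1]]
original = [1, 1, 0, 1, 0, 0, 0]
damaged = [1, None, 0, 1, None, 0, 0]      # symbols 1 and 4 are lost
restored = decode_erasures(H, damaged, erased=[1, 4])
```

The elimination succeeds exactly when the selected columns are linearly independent, which is guaranteed whenever the number of erasures is smaller than the minimum distance of the code.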

To achieve the above object, an encoding method using a repairable code according to the present invention includes: generating a parity check matrix for detecting errors in a repairable code that encodes data, in consideration of the number of nodes to be accessed for data repair; and generating a codeword by encoding the data using the generated parity check matrix and storing the generated codeword in a distributed manner in a distributed storage system.

The repairable code is a binary locally repairable code, and the step of generating the parity check matrix may generate the parity check matrix by further considering the length of the repairable code and the length of the data.

In the present invention, the step of generating the parity check matrix includes: generating a first sub-matrix for securing the locality; and generating a second sub-matrix, placed adjacent to the first sub-matrix, for securing the minimum distance of the data. The parity check matrix may be generated using the first sub-matrix and the second sub-matrix.

The step of generating the first sub-matrix may determine the number of non-zero elements in each row of the first sub-matrix in consideration of the locality, and determine the length of each row of the first sub-matrix according to the locality and the length of the data. The non-zero elements in different rows of the first sub-matrix may be arranged so that they do not fall in the same column.

In the present invention, the step of generating the second sub-matrix may generate a binary vector space of a predetermined dimension, partition the generated binary vector space into a set of sub-vector spaces whose dimension depends on the locality, and generate the second sub-matrix using the resulting set of sub-vector spaces.

The step of generating the second sub-matrix may take the elements of a Galois field whose order is determined by the predetermined dimension, expressed as powers of a primitive element, and use a primitive polynomial having that primitive element as a root to convert them into sums of powers of the primitive element with exponents smaller than the predetermined dimension. It may then generate the second sub-matrix, expressed as binary vectors, from the coefficients of the terms of the converted Galois field elements.

The size of the set of sub-vector spaces is set in consideration of the locality, the dimension of the sub-vector space that depends on the locality, and whether the predetermined dimension is divisible by that sub-vector space dimension. The primitive polynomial may include every form of primitive polynomial that can exist for the predetermined dimension.

The distributed storing step may convert the generated parity check matrix into a systematic form consisting of a first identity matrix and a remaining sub-matrix, transpose that sub-matrix of the converted parity check matrix, generate an encoding matrix consisting of the transposed sub-matrix and a second identity matrix, and generate the codeword using the generated encoding matrix.

The present invention also discloses a computer program stored on a computer-readable recording medium for causing a computer to execute the above encoding method using a repairable code.

According to the present invention, an encoding/decoding apparatus using a repairable code has the advantage that it can reliably store large volumes of data in a distributed storage system by using a repairable code with an improved minimum distance.

In particular, it has the advantage that decoding can be performed efficiently by exploiting the local repair property.

FIG. 1 is a block diagram of an encoding apparatus using a repairable code according to an embodiment of the present invention.
FIG. 2 is an enlarged block diagram of the parity check matrix generation unit in the embodiment of FIG. 1.
FIG. 3A is an example of the first sub-matrix generated by the first sub-matrix generation unit (100) in the embodiment of FIG. 1.
FIG. 3B is an example of a parity check matrix including the first sub-matrix and the second sub-matrix.
FIG. 4A is an example in which the coefficients of the primitive polynomials that exist for each degree are expressed as binary sequences.
FIG. 4B is an example in which the coefficients of the primitive polynomials that exist for each degree are expressed as binary sequences.
FIG. 4C is an example in which the coefficients of the primitive polynomials that exist for each degree are expressed as binary sequences.
FIG. 5 is an example in which the elements of a Galois field, expressed as powers of a primitive element, are rewritten as sums of powers of the primitive element with exponents of 3 or less and then expressed as binary vectors using the coefficients of the terms.
FIG. 6 is an example of a vector space and its 2-spread according to an embodiment of the present invention.
FIG. 7 is an example of a parity check matrix generated according to an embodiment of the present invention.
FIG. 8 is an example illustrating the process of converting a parity check matrix into an encoding matrix according to an embodiment.
FIG. 9 is a chart showing the reliability of the repairable code presented in the present invention.
FIG. 10 is a flowchart of an encoding method using a repairable code according to an embodiment of the present invention.
FIG. 11 is a block diagram of a decoding apparatus using a repairable code according to an embodiment of the present invention.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

In the description with reference to the accompanying drawings, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

In describing the present invention, detailed descriptions of related known configurations or functions may be omitted where they are judged likely to obscure the gist of the invention.

The terms used in this application are used only to describe specific embodiments and are not intended to be limiting. Singular expressions include plural expressions unless the context clearly indicates otherwise.

Each step described below may be provided as one or more software modules, implemented in hardware responsible for each function, or realized as a combination of software and hardware. Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Terms such as "... unit", "... device", "module", and "block" used in the specification denote a unit that processes at least one function or operation, and such a unit may be implemented in hardware, software, or a combination of hardware and software. The specific meaning and examples of each term are described below in the order of the drawings. The configuration of an encoding apparatus using a repairable code according to an embodiment of the present invention is described in detail below with reference to the relevant drawings.

FIG. 1 is a block diagram of an encoding apparatus using a repairable code according to an embodiment of the present invention.

The encoding apparatus 10 using a repairable code includes a parity check matrix generation unit 100 and a distributed storage unit 200.

The encoding apparatus 10 generates a parity check matrix, encodes data using the generated parity check matrix to produce a codeword, and divides the generated codeword for storage in a distributed storage system. More specifically, the encoding apparatus 10 may generate a parity check matrix, generate an encoding matrix from the generated parity check matrix, and encode the data using the generated encoding matrix.

For example, the encoding apparatus 10 may be used for safe storage and efficient management of data in a distributed storage system that must store and maintain large volumes of data. Data can be encoded and stored using the encoding apparatus 10, and when encoded data is lost, the lost data can be repaired by accessing the data nodes of the distributed storage system. In the present invention, the repairable code is a binary locally repairable code, a binary code with low computational complexity, and a repairable code with minimum distance 6 may be used, improving on existing locally repairable codes with minimum distance 4.

In another embodiment, the encoding apparatus 10 may use a traditional error-correcting code such as a repetition code or an erasure code. However, because of the nature of a distributed storage system, failed nodes must be repaired continually to keep the reliability of the system constant, which differs from the systems in which traditional error-correcting codes have been used.

The parity check matrix generation unit 100 includes a first sub-matrix generation unit 120 and a second sub-matrix generation unit 140.

The parity check matrix generation unit 100 generates a parity check matrix for detecting errors in a repairable code that encodes data, in consideration of the number of nodes to be accessed for data repair. The repairable code is a binary locally repairable code, and the parity check matrix generation unit 100 may generate the parity check matrix by further considering the length of the repairable code and the length of the data.

For example, the parity check matrix generation unit 100 generates a parity check matrix for checking the validity of a codeword, that is, data encoded with the repairable code. Since a repairable code is a set of codewords, the parity check matrix generation unit 100 can thus generate a parity check matrix that detects errors in the repairable code. This is described with reference to FIGS. 2, 3A, and 3B.

For example, the parity check matrix generated by the parity check matrix generation unit 100 can be converted into an encoding matrix, as described later, and data can be encoded using the encoding matrix. The parity check matrix generation unit 100 generates a parity check matrix that includes a first sub-matrix and a second sub-matrix, with the second sub-matrix adjacent to the first sub-matrix.

The first sub-matrix generation unit 120 generates a first sub-matrix for securing the locality. Locality is the minimum number of nodes that must be accessed to repair the data stored on one node.

For example, the first sub-matrix generated by the first sub-matrix generation unit 120 consists of rows of binary sequences of 0s and 1s, and the 1s (non-zero elements) in different rows of the first sub-matrix are placed so that they never share a column, which can improve the code rate. The first sub-matrix generation unit 120 determines the number of 1s in each row of the first sub-matrix in consideration of the locality r and fills the rest of the row with 0s. Specifically, when the locality is 2, three consecutive 1s (non-zero elements) are placed in a row and the rest of the row is filled with 0s. The first sub-matrix generation unit 120 may also set the first sub-matrix in consideration of the locality, the dimension of the repairable code, and the size of the set of sub-vector spaces (the size of the t-spread), as described later. A t-spread is a set of t-dimensional sub-vector spaces that partitions a vector space of a predetermined dimension m.
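The row structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `first_submatrix`, the free parameter `num_rows`, and the choice of a row width of exactly (r + 1) * num_rows are hypothetical, since the source sets the row length from other code parameters.

```python
def first_submatrix(r, num_rows):
    """Build a binary matrix whose i-th row has r + 1 consecutive 1s.

    The supports of different rows are disjoint, so no column holds two 1s.
    `num_rows` and the resulting width are assumptions of this sketch.
    """
    width = (r + 1) * num_rows
    rows = []
    for i in range(num_rows):
        row = [0] * width
        for j in range(i * (r + 1), (i + 1) * (r + 1)):
            row[j] = 1
        rows.append(row)
    return rows


H1 = first_submatrix(r=2, num_rows=3)
for row in H1:
    print("".join(map(str, row)))
```

For r = 2 and three rows this prints three rows of length 9, each with a block of three 1s shifted three positions to the right of the previous row.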

For example, the first sub-matrix generation unit 120 can secure a locality of r by giving each row r+1 ones. Each column of the parity check matrix containing the first sub-matrix corresponds to one encoded symbol, and each row corresponds to a parity check equation. That is, the parity check equation for the first row of the first sub-matrix states that the r+1 code symbols at the positions of its 1s sum to zero modulo 2. If the code symbol c1 is lost, c1 can therefore be obtained from the remaining r code symbols using this relation. This confirms that the first sub-matrix has locality r.
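The local repair described here amounts to an XOR of the surviving symbols in the same parity check. The following sketch assumes a single parity check row with r + 1 = 3 ones per repair group; the function name `repair_symbol` and the example codeword are illustrative assumptions.

```python
def repair_symbol(codeword, check_row, lost_index):
    """Recover codeword[lost_index] from the other symbols in one parity check."""
    if check_row[lost_index] != 1:
        raise ValueError("lost symbol is not covered by this parity check")
    value = 0
    for j, bit in enumerate(check_row):
        if bit == 1 and j != lost_index:
            value ^= codeword[j]      # addition over GF(2) is XOR
    return value


check_row = [1, 1, 1, 0, 0, 0]        # first parity check: c1 + c2 + c3 = 0 (mod 2)
codeword = [1, 0, 1, 1, 1, 0]         # assumed example satisfying both group checks
damaged = codeword.copy()
damaged[0] = None                     # c1 is lost
recovered = repair_symbol(damaged, check_row, lost_index=0)
```

Only the r = 2 symbols that share the parity check with the lost symbol are read; the rest of the codeword is never accessed, which is the point of locality.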

In addition, a parity check matrix containing the first sub-matrix generated by the first sub-matrix generation unit 120 as described above satisfies minimum distance 2. By a well-known theorem, if any z-1 column vectors of a parity check matrix are always linearly independent and there exist z column vectors that are not linearly independent, the minimum distance of the code is z.

Accordingly, in a parity check matrix having a first sub-matrix of the form described above, every single column vector is non-zero, so any one column vector is always linearly independent, while some pair of column vectors is not linearly independent. The code with this parity check matrix therefore satisfies minimum distance 2.
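The theorem can be checked by brute force on small matrices: over GF(2), the minimum distance is the smallest number of columns of the parity check matrix whose sum is zero. The sketch below is an illustration only; the function name `min_distance` and the example matrix are assumptions, and the exhaustive search is practical only for short codes.

```python
from itertools import combinations


def min_distance(H):
    """Smallest number of columns of H that sum to zero over GF(2)."""
    n = len(H[0])
    # Pack each column into an integer so that column addition becomes XOR.
    cols = [sum(H[i][j] << i for i in range(len(H))) for j in range(n)]
    for z in range(1, n + 1):
        for subset in combinations(cols, z):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return z
    return None  # all columns independent: the code holds only the zero word


# Assumed example: a first sub-matrix alone, with r + 1 = 3 ones per row.
H1 = [[1, 1, 1, 0, 0, 0],
      [0, 0, 0, 1, 1, 1]]
print(min_distance(H1))
```

Two columns inside the same block of 1s are identical, so their sum is zero and the search stops at z = 2, in line with the argument above.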

The second sub-matrix generation unit 140 may generate a binary vector space of a predetermined dimension, partition the generated binary vector space into a set of sub-vector spaces whose dimension depends on the locality, and generate the second sub-matrix using the resulting set of sub-vector spaces. The predetermined dimension may be set in consideration of the locality r and the size l of the t-spread.

For example, the second sub-matrix generation unit 140 may generate a second sub-matrix adjacent to and below the first sub-matrix in order to increase the minimum distance of 2 provided by the first sub-matrix. The minimum distance indicates the maximum number of erasures from which the original data can always be decoded, whichever symbols are erased: any code of length n, dimension k, and minimum distance d can recover d-1 erasures through maximum-likelihood decoding. Because system reliability cannot be maintained if the overall code length grows while the minimum distance stays the same, a code with an improved minimum distance is required.

In one embodiment, the second sub-matrix generation unit 140 may secure a minimum distance of 6 by generating the second sub-matrix in addition to the first sub-matrix, which secures a locality of 2 and a minimum distance of 2. The t-spread used to generate the second sub-matrix, consisting of a vector space and its sub-vector spaces, is constructed as follows.

Here, r is the locality and t is the dimension of each sub-vector space. The dimension of a sub-vector space equals the number of its basis vectors, where basis vectors are linearly independent vectors that generate the space. For example, when the locality r is 2, the minimum value of t satisfying Equation 1 is 2. Accordingly, when r is 2, the second sub-matrix generation unit 140 may generate a 2-spread of the m-dimensional binary vector space and use it to fill the second sub-matrix. Once the value of t determined by the locality r has been obtained, the method of constructing the t-spread, that is, the set of t-dimensional sub-vector spaces of the m-dimensional binary vector space, depends on whether m is divisible by t. Likewise, the size l of the set of sub-vector spaces may be set in consideration of the locality r, the sub-vector space dimension t determined by the locality, and whether the predetermined dimension m is divisible by t.

The second sub-matrix generation unit 140 generates a binary vector space having a predetermined dimension m and partitions it into a set of sub-vector spaces having a dimension t determined by the locality r. The dimension t of the sub-vector spaces is the smallest integer that satisfies Equation 1 for the given r.

Equation 2 gives the size of the t-spread, that is, the number of t-dimensional sub-vector spaces that partition the m-dimensional binary vector space, when t divides m. Here, m is the dimension of the binary vector space and l is the size of the t-spread.

Equation 3 defines the i-th sub-vector space of the t-spread when t divides m. It is written in terms of a primitive element of the extension field of degree m and the powers of that primitive element. Here, t is the dimension of the sub-vector spaces determined by the locality r, l is the size of the t-spread, and i is an integer from 0 to l-1. Using the given primitive element, the i-th sub-vector space of the t-spread can be generated as in Equation 3 when t divides m. The notation <a,b> denotes the vector space generated by the basis vectors a and b.

Equation 4 gives the size l of the t-spread, that is, the number of t-dimensional sub-vector spaces that partition the m-dimensional binary vector space, when t does not divide m. Here, l is the size of the t-spread, q is the size of the finite field (the symbol alphabet size of the code), m is the predetermined dimension of the vector space, t is the dimension of the sub-vector spaces that partition the m-dimensional vector space, and z is the remainder of m divided by t. With h defined by the accompanying expression, every element of the Galois field under consideration can be written in a fixed form, and the present invention uses that representation.

Here, S is the t-spread generated when t does not divide m, and w and the two further components of S each correspond to sets of basis vectors of t-dimensional sub-vector spaces. First, two primitive elements are defined, each a primitive element of its own finite field, and two further quantities are defined. The component w is the sub-vector space whose basis vectors are expressed by the primitive element of the first finite field. The second component is the sub-vector space whose basis vectors are expressed by the primitive element of the second finite field. The third component is the sub-vector space whose basis vectors are expressed by both of the above primitive elements. As described above, <a,b> denotes the vector space generated by the basis vectors a and b. Here, i is an integer from 0 to g-1, j is an integer within the range given with Equation 5, m is the predetermined dimension of the binary vector space, t is the dimension of the sub-vector spaces that partition the m-dimensional binary vector space, and h is an integer satisfying the relation given with Equation 5. Thus, when t does not divide m, a t-spread of size l can be generated as in Equation 5. This is described with reference to FIGS. 4A, 4B, and 4C.

Here, r is the locality, m is the predetermined dimension of the vector space, and l is the size of the t-spread. Equation 6 may be used together with Equations 2 and 4 to determine the dimension m of the vector space. For example, once the locality r is fixed and t is determined by Equation 1, Equation 2 or Equation 4 is selected according to whether t divides m, the value of l given by the selected equation is computed, and a value of m satisfying Equation 6 is found.

In another embodiment, consider the case where the locality r is 2. When r = 2, Equation 1 gives t = 2. The dimension m of the vector space is then varied while checking whether Equation 2 or 4 and Equation 6 are satisfied. For m = 1, l = 0; for m = 2, l = 1 by Equation 2; and for m = 3, l = 1 by Equation 4, which does not satisfy Equation 6. For m = 4, however, l = 5 by Equation 2, and this pair of m and l satisfies Equation 6. Accordingly, the second sub-matrix generation unit 140 may generate a 2-spread of the binary vector space using r = 2, t = 2, m = 4, and l = 5.
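As a rough check of the values quoted above for t = 2, the sketch below counts the subspaces of a full t-spread when t divides m. It assumes the standard counting argument: the 2^m - 1 non-zero vectors are split among subspaces that each contain 2^t - 1 non-zero vectors. Whether Equation 2 is written in exactly this form is an assumption, and the function name `spread_size` is illustrative only.

```python
def spread_size(m: int, t: int, q: int = 2) -> int:
    """Number of t-dimensional subspaces in a full t-spread of the m-dimensional space over GF(q)."""
    if m % t != 0:
        # A full spread (a partition of all non-zero vectors) needs t to divide m.
        raise ValueError("a full t-spread exists only when t divides m")
    return (q**m - 1) // (q**t - 1)


# The two divisible cases mentioned in the text for t = 2.
for m in (2, 4):
    print(f"m = {m}, t = 2 -> l = {spread_size(m, 2)}")
```

The case where t does not divide m (Equation 4) is deliberately left out, because the text does not reproduce that formula.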

The second sub-matrix generation unit 140 may generate a binary vector space having a predetermined dimension, partition it into sub-vector spaces whose dimension is determined by the locality, and generate the second sub-matrix using the partitioned sub-vector spaces. The set of sub-vector spaces obtained by this partition is called a t-spread. The basis of each t-dimensional sub-vector space belonging to the t-spread is expressed in terms of powers of a primitive element of a finite field, and the second sub-matrix generation unit 140 converts these into binary vectors to generate the second sub-matrix. A method of expressing such a basis as binary vectors is described below.

The second sub-matrix generation unit 140 may generate binary sub-vector spaces represented by different binary vectors depending on which primitive polynomial of the given degree is used for the finite field. For example, as shown in FIGS. 4A, 4B, and 4C, when the degree of the finite field is 5, the coefficients of the possible primitive polynomials, written as binary sequences, are 100101, 101001, 101111, 110111, 111011, and 111101. The sequence 100101 lists the coefficients of a primitive polynomial; written as a polynomial in the unknown x, it is $x^5 + x^2 + 1$.

In another embodiment, the coefficients of the primitive polynomial of degree 2, written as a binary vector, are 111, which corresponds to $x^2 + x + 1$. Because this primitive polynomial has the primitive element, written here as $\alpha$, as a root, it satisfies $\alpha^2 + \alpha + 1 = 0$. Rearranging with binary arithmetic gives $\alpha^2 = \alpha + 1$, so every element of the finite field can be expressed as a sum of powers of $\alpha$ with exponents of 1 or less. That is, the finite field of degree 2 can be written as $\{0, 1, \alpha, \alpha + 1\}$.

The second sub-matrix generation unit 140 may take the elements of a Galois field of predetermined degree, expressed as powers of a primitive element, and convert each of them into a sum of powers of the primitive element with exponents smaller than the degree, using a primitive polynomial that has the primitive element as a root. It may then generate the second sub-matrix, expressed as binary vectors, from the coefficients of the converted elements. The Galois field of predetermined degree may be the extension field of the binary field of that degree. This is described with reference to FIG. 5.

For example, consider the finite field of degree 4. For degree 4, two primitive polynomials exist, with coefficient sequences 10011 and 11001. If 10011 is selected, the primitive polynomial is $x^4 + x + 1$, and because the primitive element, written here as $\alpha$, is a root, it satisfies $\alpha^4 + \alpha + 1 = 0$. Rearranging with binary arithmetic gives $\alpha^4 = \alpha + 1$. Using this relation, every element of the finite field of degree 4 can be rewritten as a sum of powers of the primitive element with exponents smaller than 4, as shown in FIG. 5. Converting the coefficients of each term into a binary vector, all elements of the field are represented by the binary vectors 0000, 0001, 0010, 0100, 1000, 0011, ..., 1101, 1001.
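The reduction described above can be sketched in a few lines. The example below represents each field element as an integer whose bit i is the coefficient of the i-th power of the primitive element, multiplies repeatedly by the primitive element, and reduces with the polynomial 10011 whenever a degree-4 term appears. The function name `gf_power_table` and the bit ordering are illustrative assumptions, not part of the original description.

```python
def gf_power_table(poly: int, degree: int) -> list[int]:
    """Successive powers of the primitive element, each as an integer bit vector.

    Bit i of an entry is the coefficient of (primitive element)^i.
    `poly` holds the primitive polynomial coefficients, e.g. 0b10011 for x^4 + x + 1.
    """
    table = []
    x = 1  # exponent 0
    for _ in range(2**degree - 1):
        table.append(x)
        x <<= 1                      # multiply by the primitive element
        if (x >> degree) & 1:        # a term of exponent `degree` appeared
            x ^= poly                # reduce using the primitive polynomial
    return table


table = gf_power_table(0b10011, 4)
vectors = [format(v, "04b") for v in table]
print(vectors)
```

The first five entries and the last two agree with the vectors listed in the text (0001, 0010, 0100, 1000, 0011 and 1101, 1001); the all-zero vector 0000 is the zero element and is not a power of the primitive element.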

For example, when the locality is 2, the minimum t satisfying Equation 1 is 2. Taking m = 4, t divides m, and the value of l satisfying both Equation 2 and Equation 6 is l = 5. Therefore, when m = 4 is selected, the 2-spread S of the 4-dimensional binary vector space can be generated as follows.
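One standard way to obtain a 2-spread of the 4-dimensional binary vector space is to take the multiplicative cosets of the order-4 subfield inside the field of order 16. The sketch below assumes that construction and the primitive polynomial 10011; it is offered as an illustration and is not claimed to be identical to Equation 3. The helper names are hypothetical.

```python
from itertools import combinations


def gf16_powers(poly: int = 0b10011) -> list[int]:
    """Powers 0..14 of the primitive element as 4-bit integers (bit i = coefficient of power i)."""
    powers, x = [], 1
    for _ in range(15):
        powers.append(x)
        x <<= 1
        if x & 0b10000:
            x ^= poly
    return powers


def two_spread() -> list[set[int]]:
    """Non-zero vectors of five 2-dimensional subspaces: powers i, i+5, and i+10 for i = 0..4."""
    p = gf16_powers()
    return [{p[i], p[i + 5], p[i + 10]} for i in range(5)]


spread = two_spread()
for i, w in enumerate(spread):
    a, b, c = sorted(w)
    assert a ^ b == c  # closed under addition, so each set plus 0 is a subspace
    print(i, [format(v, "04b") for v in sorted(w)])

# The subspaces share only the zero vector and together cover all 15 non-zero vectors.
assert all(not (u & v) for u, v in combinations(spread, 2))
assert set().union(*spread) == set(range(1, 16))
```

Each subspace contributes exactly three non-zero vectors, which matches r + 1 = 3 columns per group when r = 2.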

The second sub-matrix can be generated using the basis vectors of each sub-vector space belonging to the t-spread S together with sums of those basis vectors. That is, for each of the five sub-vector spaces in the above 2-spread S of the 4-dimensional binary vector space, the sum of the basis vectors is included as an element in addition to the basis vectors themselves. In general, a t-spread S of size l of the m-dimensional binary vector space is defined as the set of sub-vector spaces Wi indexed by i from 0 to l-1, and a basis is defined for each t-dimensional sub-vector space Wi. The basis of Wi is a set of t basis vectors. From this basis a new set is defined, each element of which is derived from Equation 7 below.

Here, i is an integer from 0 to l-1, j is an integer within the range given with Equation 7, and t is the dimension of the sub-vector spaces determined by the locality r. Equation 7 defines the elements of the new set. Using the basis and this new set, a further set is generated, and the second sub-matrix can be generated from the vectors of length m contained in that set.

This set satisfies the following properties. First, the vector sum of any two column vectors in the set is not the zero vector. Second, the vector sum of any four column vectors in the set is not the zero vector.

The second sub-matrix generation unit 140 may use the generated set to fill the sub-matrix indexed by i, where i takes values from 0 to s-1. The set contains vectors of length m, and because t was selected to satisfy Equation 1, r+1 elements of the set can be placed as the r+1 column vectors of the corresponding sub-matrix to generate the parity check matrix. The code defined by the generated parity check matrix has locality r, code length n = (r+1)s, and code dimension k = rs-m, and its minimum distance is guaranteed to be at least 6.

Here, r is the locality, m is the dimension of the binary vector space, l is the size of the t-spread, and s is an integer within the range of Equation 8. When the length of one row is n, s = n/(r+1), consistent with the code length n = (r+1)s.

For example, when the locality is 2, the minimum t satisfying Equation 1 is 2, and l is 5 by Equation 2. The values of s satisfying the range of Equation 8 therefore include 3, 4, and 5. When s is 3, the second sub-matrix generation unit 140 may generate the corresponding parity check matrix. A code having this parity check matrix has length 9, dimension 2, locality 2, and minimum distance 6.

In another embodiment, when s is 4, the resulting parity check matrix corresponds to a code with length 12, dimension 4, locality 2, and minimum distance 6.

In another embodiment, when s is 5, the resulting parity check matrix corresponds to a code with length 15, dimension 6, locality 2, and minimum distance 6.
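The three parameter sets above can be reproduced by brute force under stated assumptions. The sketch below assumes that the first sub-matrix has one row per group with ones on that group's r + 1 = 3 columns, and that the second sub-matrix takes its columns from the non-zero vectors of a 2-spread of the 4-dimensional binary space. The particular spread listed in `SPREAD` is one valid choice built with the polynomial 10011 and is an assumption, as are all function names.

```python
# Non-zero vectors of five 2-dimensional subspaces that partition the 4-bit space (assumed choice).
SPREAD = [(1, 6, 7), (2, 12, 14), (4, 11, 15), (8, 5, 13), (3, 10, 9)]
M = 4  # number of rows of the second sub-matrix


def build_columns(s: int) -> list[int]:
    """Columns of the parity check matrix as integers.

    The low M bits hold the second sub-matrix column; bit M + g is the single 1
    that group g contributes in the first sub-matrix.
    """
    return [(1 << (M + g)) | v for g in range(s) for v in SPREAD[g]]


def code_parameters(cols: list[int]) -> tuple[int, int, int]:
    """Return (n, k, d) by enumerating every non-zero vector with zero syndrome."""
    n = len(cols)
    weights = []
    for mask in range(1, 1 << n):
        syndrome = 0
        for j in range(n):
            if (mask >> j) & 1:
                syndrome ^= cols[j]
        if syndrome == 0:
            weights.append(bin(mask).count("1"))
    k = (len(weights) + 1).bit_length() - 1  # number of codewords is 2^k
    return n, k, min(weights)


for s in (3, 4, 5):
    print(s, code_parameters(build_columns(s)))
```

Exhaustive enumeration is only practical for such short codes; it is used here because it makes no assumption beyond the definition of the minimum distance.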

The distributed storage unit 200 generates a codeword using the parity check matrix generated by the parity check matrix generation unit 100, and stores the generated codeword in a distributed manner across the distributed storage system.

For example, the distributed storage unit 200 may generate an encoding matrix using the generated parity check matrix and encode the data with the encoding matrix to generate a codeword. More specifically, the distributed storage unit 200 may convert the generated parity check matrix into systematic form, consisting of a first identity matrix and a remaining sub-matrix; transpose that sub-matrix; generate an encoding matrix consisting of the transposed sub-matrix and a second identity matrix; and generate the codeword using the encoding matrix. A matrix in systematic form is one that contains an identity matrix. The systematic form of the parity check matrix can be obtained by applying elementary row operations to the parity check matrix H. The method of generating the encoding matrix from the parity check matrix is described later with reference to FIG. 8.
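The conversion described above can be sketched for a length-9 example. The code below row-reduces a parity check matrix over the binary field to the form [I | A], forms the encoding matrix [A^T | I], encodes a 2-bit message, and repairs a single erased symbol from the other two symbols of its group. The matrix used is an assumed instance (local rows plus columns taken from a 2-spread), the row reduction assumes the leading square block is invertible so that no column permutation is needed, and all names are illustrative.

```python
GROUPS = [(1, 6, 7), (2, 12, 14), (4, 11, 15)]  # assumed second sub-matrix columns
COLS = [(1 << (4 + g)) | v for g, grp in enumerate(GROUPS) for v in grp]
H = [[(c >> bit) & 1 for c in COLS] for bit in range(6, -1, -1)]  # 7 x 9 matrix


def to_systematic(matrix: list[list[int]]) -> list[list[int]]:
    """Reduce to [I | A] with elementary row operations over the binary field."""
    rows = [row[:] for row in matrix]
    r = len(rows)
    for p in range(r):
        # Assumes a pivot exists in column p (leading r x r block invertible).
        pivot = next(i for i in range(p, r) if rows[i][p])
        rows[p], rows[pivot] = rows[pivot], rows[p]
        for i in range(r):
            if i != p and rows[i][p]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[p])]
    return rows


def generator_matrix(systematic: list[list[int]]) -> list[list[int]]:
    """Build [A^T | I_k] from [I | A]."""
    r, n = len(systematic), len(systematic[0])
    k = n - r
    return [[systematic[i][r + j] for i in range(r)] + [int(c == j) for c in range(k)]
            for j in range(k)]


def encode(message: list[int], g: list[list[int]]) -> list[int]:
    """Multiply the message row vector by the encoding matrix over the binary field."""
    return [sum(m & x for m, x in zip(message, col)) % 2 for col in zip(*g)]


def repair(codeword: list[int], j: int, r: int = 2) -> int:
    """Recover symbol j from the other r symbols of its group (even parity per group)."""
    start = (j // (r + 1)) * (r + 1)
    return sum(codeword[i] for i in range(start, start + r + 1) if i != j) % 2


G = generator_matrix(to_systematic(H))
codeword = encode([1, 0], G)
print(codeword, repair(codeword, 4) == codeword[4])
```

With this layout the message bits appear unchanged in the last k positions of the codeword, and a single failed node is repaired by contacting only r = 2 other nodes.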

FIG. 2 is an enlarged block diagram of the parity check matrix generation unit 100 in the embodiment of FIG. 1.

The parity check matrix generation unit 100 includes a first sub-matrix generation unit 120 and a second sub-matrix generation unit 140. The parity check matrix generation unit 100 may generate a parity check matrix for detecting errors in the repairable code that encodes the data, in consideration of the minimum number of nodes to be accessed.

In one embodiment, the parity check matrix generation unit 100 may generate the parity check matrix in consideration of the locality, which is the minimum number of nodes to be accessed for data repair, the length of the repairable code, and the length of the data. Details of the parity check matrix generation unit 100 already described above are omitted.

FIG. 3A is an exemplary diagram of the first sub-matrix generated by the first sub-matrix generation unit 120.

The first sub-matrix generation unit 120 generates a first sub-matrix for securing the locality, which is the minimum number of nodes to be accessed.

For example, when the locality is r, each row of the parity check matrix H has r+1 non-zero elements. Each row of the parity check matrix H is a parity check equation, and because the non-zero elements of different rows do not lie in the same column, the locality is secured and the code rate is improved at the same time. A parity check matrix including the first sub-matrix defines a code with locality r and minimum distance 2. Details of the first sub-matrix already described above are omitted.
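The row structure described above can be illustrated with a small sketch. It assumes that the r + 1 non-zero entries of each row occupy consecutive columns (any column permutation would serve equally well) and that the code length is (r + 1)s for s rows; the function name `first_submatrix` is hypothetical.

```python
def first_submatrix(r: int, s: int) -> list[list[int]]:
    """s rows of length (r + 1) * s; row i has ones on columns i*(r+1) .. i*(r+1)+r."""
    n = (r + 1) * s
    return [[1 if j // (r + 1) == i else 0 for j in range(n)] for i in range(s)]


HL = first_submatrix(2, 3)
for row in HL:
    print(row)

columns = list(zip(*HL))
# Every column is non-zero, but columns of the same group coincide,
# so two columns can sum to zero and the minimum distance of this matrix alone is 2.
assert all(any(col) for col in columns)
assert columns[0] == columns[1]
```

Each row is a parity check on r + 1 symbols, so any one symbol of a group equals the sum of the other r symbols of that group.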

FIG. 3B is an exemplary diagram of a parity check matrix including the first sub-matrix and the second sub-matrix. The second sub-matrix generation unit 140 generates the second sub-matrix, which is placed adjacent to the first sub-matrix. The parity check matrix generation unit 100 generates the parity check matrix from the generated first and second sub-matrices.

In the parity check matrix generated by the parity check matrix generation unit 100, the first sub-matrix may be denoted HL and the second sub-matrix HG. The first sub-matrix is placed at the top, and the second sub-matrix is placed directly below it. The method of generating the second sub-matrix is as described above and is omitted here.

FIG. 4A is an exemplary table listing, as binary sequences, the coefficients of the primitive polynomials that exist for each degree.

The second sub-matrix generation unit 140 may take the elements of a Galois field whose size is determined by a predetermined degree, expressed as powers of a primitive element, and convert them into sums of powers of the primitive element with exponents smaller than the predetermined degree, using a primitive polynomial that has the primitive element as a root. It may then generate the second sub-matrix, expressed as binary vectors, from the coefficients of each term of the converted elements.

Several primitive polynomials having the primitive element as a root may exist for a given degree, and the encoding apparatus 10 using the repairable code may use any of them to convert the elements of the Galois field expressed as powers of the primitive element. As described above, the predetermined dimension may be set in consideration of the locality and the size of the set of sub-vector spaces that partition the vector space.

FIG. 4B is an exemplary table listing, as binary sequences, the coefficients of the primitive polynomials that exist for each degree.

The primitive polynomials for each degree and their use by the second sub-matrix generation unit 140 are as described above and are omitted here.

FIG. 4C is an exemplary table listing, as binary sequences, the coefficients of the primitive polynomials that exist for each degree. The primitive polynomials for each degree and their use by the second sub-matrix generation unit 140 are as described above and are omitted here.

FIG. 5 is an exemplary diagram in which the elements of the degree-4 Galois field, expressed as powers of the primitive element, are rewritten as sums of powers of the primitive element with exponents of 3 or less.

For example, the second sub-matrix generation unit 140 may use the primitive polynomial of each degree in FIGS. 4A, 4B, and 4C to convert the elements of the finite field into sums of powers of the primitive element with exponents smaller than that degree, and may express the elements of the finite field as binary vectors using the coefficients of each term of the converted elements.

In one embodiment, the second sub-matrix generation unit 140 may convert each element of the binary extension field of degree 4 into a sum of powers of the primitive element with exponents smaller than 4, using the degree-4 primitive polynomial 10011, and may express the elements as binary vectors using the coefficients of each term of the converted elements.

FIG. 6 is an exemplary diagram of vector spaces and their 2-spreads according to an embodiment of the present invention.

The first example in FIG. 6 is the 3-dimensional binary vector space, for which W is a 2-spread of size 1. The second example is the 4-dimensional binary vector space, for which S is a 2-spread of size 5. Here, w1, w2, w3, w4, and w5 denote 2-dimensional sub-vector spaces.

FIG. 7 shows examples of parity check matrices generated according to an embodiment of the present invention.

The code C1 having the parity check matrix H1 has length 9, dimension 2, locality 2, and minimum distance 6. The code C2 having the parity check matrix H2 has length 12, dimension 4, locality 2, and minimum distance 6. The code C3 having the parity check matrix H3 has length 15, dimension 6, locality 2, and minimum distance 6.

For example, the parity check matrix generation unit 100 may generate the parity check matrices H1, H2, and H3, and the distributed storage unit 200 encodes data using the generated parity check matrix. More specifically, the encoding apparatus 10 using the repairable code may generate an encoding matrix from the generated parity check matrix and encode the data using the encoding matrix. For the codes defined by the generated parity check matrices H1, H2, and H3 to have a minimum distance of 6, the following conditions must be satisfied.

Assume that the column vectors of the second sub-matrix included in the parity check matrix generated by the parity check matrix generation unit 100 are divided into s groups, each containing r+1 column vectors. The column vectors of the second sub-matrix must then satisfy the following three conditions.

First, the vector sum of any two column vectors belonging to the same group is not the zero vector. Second, the vector sum of any four column vectors belonging to the same group is not the zero vector. Third, the vector sum of four column vectors consisting of any two column vectors selected from one group and any two column vectors selected from another group is not the zero vector.
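The three conditions can be checked mechanically for a candidate grouping. In the sketch below, each column of the second sub-matrix is an integer bit vector and vector addition is bitwise XOR. The two sample groupings are hypothetical inputs chosen for illustration: the first takes its groups from a 2-spread of the 4-bit space, and the second deliberately violates the third condition.

```python
from itertools import combinations


def xor_all(vectors: tuple[int, ...]) -> int:
    """Vector sum over the binary field."""
    acc = 0
    for v in vectors:
        acc ^= v
    return acc


def satisfies_conditions(groups: list[tuple[int, ...]]) -> bool:
    """Check the three column conditions on the groups of the second sub-matrix."""
    for g in groups:
        # Conditions 1 and 2: no two and no four columns of one group sum to zero.
        for size in (2, 4):
            if any(xor_all(c) == 0 for c in combinations(g, size)):
                return False
    # Condition 3: two columns from one group plus two from another never sum to zero.
    for g1, g2 in combinations(groups, 2):
        for p1 in combinations(g1, 2):
            for p2 in combinations(g2, 2):
                if xor_all(p1 + p2) == 0:
                    return False
    return True


good = [(1, 6, 7), (2, 12, 14), (4, 11, 15)]
bad = [(1, 2, 3), (1, 4, 5)]  # 2 + 3 equals 4 + 5, so condition 3 fails
print(satisfies_conditions(good), satisfies_conditions(bad))
```

When every group has only r + 1 = 3 columns, the second condition holds vacuously, because no four columns can be drawn from one group.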

To prove that the parity check matrix generated by the parity check matrix generation unit 100 yields a minimum distance of 6, it suffices to show that any five or fewer column vectors selected from the parity check matrix are always linearly independent. Owing to the structure of the first sub-matrix generated by the first sub-matrix generation unit 120, in which each column has exactly one non-zero entry, the vector sum of any odd number of column vectors of the generated parity check matrix is always non-zero. Therefore, it remains only to show that the vector sum of any even number of column vectors, up to five, is non-zero.

When the parity check matrix generated by the parity check matrix generation unit 100 satisfies the above three conditions, the vector sum of any even number of column vectors, up to five, is guaranteed to be non-zero. Therefore, the minimum distance of the code having the generated parity check matrix is guaranteed to be at least 6.

Equation 9 gives an upper bound on the dimension k of an (n,k,r)q locally repairable code with minimum distance d. Here, n is the length of the repairable code, k is its dimension, r is the locality, d is the minimum distance, and q is the size of the finite field, that is, the symbol alphabet size of the code. An (n,k,r)q locally repairable code is a code of length n, dimension k, and locality r defined over a finite field of size q, and the remaining term in Equation 9 is the largest dimension that a code of length n, minimum distance d, and alphabet size q can have.

Table 1 above lists, for a minimum distance of 6, the largest possible dimension $k_{\mathrm{opt}}^{(2)}(n,6)$ of a binary code as a function of the code length n. The values of $k_{\mathrm{opt}}^{(2)}(n,6)$ in Table 1 are used together with the upper bound of Equation 9 to confirm that the codes C1, C2, and C3 having the parity check matrices H1, H2, and H3 are optimal. Table 1 is taken from M. Grassl, "Bounds on the minimum distance of linear codes and quantum codes," available online at http://www.codetables.de. For example, even without considering locality, the code C1 attains the largest dimension $k_{\mathrm{opt}}^{(2)}(12,6)$ possible for any binary linear code of length n=12 and minimum distance d=6, and is therefore optimal. When the code C2 is substituted into Equation 9, the right-hand side is minimized at t=1, where its value is 4, which equals the dimension of C2; the code C2 is therefore also optimal. Likewise, when the code C3 is substituted into Equation 9, the right-hand side is minimized at t=1, and the dimension of C3 equals the resulting value of 6, so the code C3 is also optimal.

FIG. 8 is an exemplary diagram illustrating a process of converting a parity check matrix into an encoding matrix according to an embodiment.

The distributed storage unit 200 generates a codeword by encoding the data using the parity check matrix, and stores the generated codeword in the distributed storage system.

For example, the distributed storage unit 200 may convert the generated parity check matrix into a systematic form consisting of a first identity matrix and a remaining sub-matrix, transpose that sub-matrix of the converted parity check matrix, generate an encoding matrix consisting of the transposed sub-matrix and a second identity matrix, and generate the codeword using the generated encoding matrix. As described above, the distributed storage unit 200 may apply elementary row operations to the parity check matrix to convert it into systematic form, producing the matrix H'. As shown in FIG. 8, the systematic parity check matrix H' has an identity matrix on its right side. The distributed storage unit 200 transposes the sub-matrix of H' that remains after the identity matrix is excluded, attaches an identity matrix to the left of the transposed sub-matrix to obtain the systematic encoding matrix G', applies the inverse column exchange to G' to obtain the encoding matrix G'', and uses G'' to encode the data.
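The conversion from a parity check matrix to an encoding matrix can be sketched as follows. This is an illustrative sketch under stated assumptions: the 7x9 binary matrix `H` is a hypothetical example (not a matrix of the source), the function names `systematic_generator` and `encode` are invented here, and H is assumed to have full row rank. The code reduces H to H' = [P | I], forms G' = [I | P^T], and undoes the column exchanges to obtain G''.

```python
# Hypothetical 7x9 binary parity check matrix (an assumption for illustration).
LOWS = [1, 6, 7, 2, 12, 14, 4, 11, 15]           # lower 4 rows, one integer per column
H = [[1 if j // 3 == g else 0 for j in range(9)] for g in range(3)]
H += [[(v >> b) & 1 for v in LOWS] for b in range(4)]


def systematic_generator(H):
    """Return G'' with H * G''^T = 0 over GF(2); H must have full row rank."""
    m, n = len(H), len(H[0])
    k = n - m
    A = [row[:] for row in H]
    perm = list(range(n))                        # perm[j]: original index of column j
    for i in range(m):
        t = k + i                                # column that receives the i-th pivot
        free = list(range(t, n)) + list(range(k))  # columns not yet holding a pivot
        r, c = next((r, c) for c in free for r in range(i, m) if A[r][c])
        A[i], A[r] = A[r], A[i]                  # row exchange
        for row in A:                            # column exchange, remembered in perm
            row[t], row[c] = row[c], row[t]
        perm[t], perm[c] = perm[c], perm[t]
        for r2 in range(m):                      # clear column t in every other row
            if r2 != i and A[r2][t]:
                A[r2] = [x ^ y for x, y in zip(A[r2], A[i])]
    # Now A = H' = [P | I_m]; the systematic encoding matrix is G' = [I_k | P^T].
    G1 = [[1 if j == a else 0 for j in range(k)] + [A[b][a] for b in range(m)]
          for a in range(k)]
    # Undo the column exchanges so that G'' matches the column order of H.
    G2 = [[0] * n for _ in range(k)]
    for a in range(k):
        for j in range(n):
            G2[a][perm[j]] = G1[a][j]
    return G2


def encode(m_vec, G):
    """Codeword c = m G over GF(2)."""
    return [sum(m_vec[a] * G[a][j] for a in range(len(G))) % 2
            for j in range(len(G[0]))]


G = systematic_generator(H)
c = encode([1, 1], G)
```

Every codeword produced this way satisfies all parity checks of the original H, because row operations do not change the row space and the column exchanges are reversed before encoding.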

Here, c is the codeword vector, m is the message vector, and G is the encoding matrix, also called the generator matrix. Equation 10 is a relation that always holds among the parity check matrix H, the encoding matrix G, the data vector m, and the codeword vector c of a general code. The relation is still satisfied when G'' is substituted for the matrix G as the encoding matrix.

FIG. 9 is a chart showing the stability of the repairable code presented in the present invention.

The mean time to data loss (MTTDL) is used to assess the advantage of the repairable code proposed in the present invention. The MTTDL of a conventional code and the MTTDL of each code under comparison are computed, and the ratio between them is defined and used as the measure. Specifically, the ratio is obtained by taking the MTTDL of the 3-replication code conventionally used in distributed storage systems as the denominator and the MTTDL of the new code to be compared with it as the numerator. The parameters needed to compute the MTTDL and this ratio are set according to J. Hao, S.-T. Xia, and B. Chen, "Some results on optimal locally repairable codes," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), pp. 440-444, Jun. 2016.

FIG. 9 plots, relative to the 3-replication code, the MTTDL ratio of a conventional locally repairable code whose minimum distance is always 4 and that of the locally repairable code proposed in the present invention, whose minimum distance is always 6. Because the minimum distance of each code stays fixed at the constant 4 or 6 even as the length n grows, the MTTDL ratio decreases as n increases.

As a baseline, the MTTDL ratio of the 3-replication code is drawn as a thick solid line, and the MTTDL ratio of the binary locally repairable code with locality 2 and minimum distance 4 is drawn as a dotted line. The proposed repairable code is marked with circles. The binary locally repairable code with locality 2 and minimum distance 4 has a smaller MTTDL than the 3-replication code once n is 14 or greater, whereas the proposed code has a larger MTTDL than the 3-replication code for code lengths up to n = 103.

FIG. 10 is a flowchart of an encoding method (S10) using a repairable code according to an embodiment of the present invention.

In S100, the parity check matrix generator 100 generates a parity check matrix for detecting errors of the repairable code that encodes the data, in consideration of the number of nodes to be accessed for data repair. In addition to the locality, which is the minimum number of nodes that must be accessed to repair one node, the parity check matrix generator 100 may further consider the length of the repairable code and the length of the data when generating the parity check matrix.

In S200, the distributed storage unit 200 generates a codeword using the parity check matrix generated by the parity check matrix generator 100, and stores the generated codeword in a distributed manner in the distributed storage system. The distributed storage unit 200 is as described above, so its description is omitted here.

FIG. 11 is a block diagram of a decoding apparatus using a repairable code according to an embodiment of the present invention.

The decoding apparatus 20 using the repairable code includes the parity check matrix generator 100, the distributed storage unit 200, and a decoding unit 300.

The decoding apparatus 20 using the repairable code can recover the original data through decoding even when part of the encoded data stored in the distributed storage system has been lost at arbitrary positions.

For example, for a block code of length n, dimension k, and minimum distance d, the decoding apparatus 20 using the repairable code can recover up to d-1 arbitrary erasures through maximum likelihood decoding. This is equivalent to accessing any n-d+1 data nodes in the distributed storage system, downloading the blocks stored there, and applying Gaussian elimination. The decoding apparatus 20 using the repairable code can obtain the encoding matrix G from the parity check matrix H of the (n,k,d) code C, use G to encode the message vector m, and generate the codeword c=(c1,c2,…,cn).

Here, H is the parity check matrix of the (n,k,d) code C, and c is a codeword. Let K be the set of indices of the code blocks received by accessing any n-d+1 data nodes. Then $\bar{K}$ is the complement of K, where $|K| = n-d+1$ and $|\bar{K}| = d-1$. $H_{\bar{K}}$ denotes the sub-matrix of H consisting of the column vectors whose indices belong to $\bar{K}$. Equation 12 can be derived from Equation 11.

Here, $H_{K}$ denotes the sub-matrix of H consisting of the column vectors whose indices belong to K. In addition, K is the set of indices, c is the codeword, and T denotes the matrix transpose.
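The step from the parity check relation to a linear system in the erased symbols can be written out as follows. This is a sketch under the assumption that Equation 11 is the standard parity check relation $Hc^{T}=0$ and that the code is binary; the source equations are not reproduced here, and $c_{K}$ and $c_{\bar{K}}$ (the received and the erased parts of c, with $\bar{K}$ the complement of K) are notation introduced for illustration.

```latex
\begin{aligned}
H c^{T} &= H_{K}\, c_{K}^{T} + H_{\bar{K}}\, c_{\bar{K}}^{T} = 0 \\
\Longrightarrow\quad H_{\bar{K}}\, c_{\bar{K}}^{T} &= H_{K}\, c_{K}^{T}
\qquad \text{(over GF(2), where } -x = x\text{)}
\end{aligned}
```

The right-hand side is computed from the received symbols, and the left-hand side contains at most d-1 unknowns. Since any d-1 columns of H are linearly independent for a code of minimum distance d, the system has a unique solution.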

Solving Equation 12 constitutes the decoding algorithm of the code. That is, solving Equation 12 is equivalent to applying Gaussian elimination to the matrix of that linear system, and the numbers of additions and multiplications that Gaussian elimination requires are determined by the size of the matrix to which it is applied.
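Erasure decoding by Gaussian elimination can be sketched as below. The sketch assumes a binary code and a hypothetical 7x9 parity check matrix `H` of a (9,2,6) code built for illustration (not a matrix of the source); `solve_gf2` and `decode_erasures` are invented names. Erased positions are marked with `None`, the columns of H at those positions form the coefficient matrix, and the received symbols supply the right-hand side.

```python
# Hypothetical 7x9 binary parity check matrix (an assumption for illustration).
LOWS = [1, 6, 7, 2, 12, 14, 4, 11, 15]
H = [[1 if j // 3 == g else 0 for j in range(9)] for g in range(3)]
H += [[(v >> b) & 1 for v in LOWS] for b in range(4)]


def solve_gf2(A, b):
    """Solve A x = b over GF(2); return x, or None if x is not uniquely determined."""
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]
    ncols = len(A[0])
    pivot_row = 0
    for col in range(ncols):
        pr = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if pr is None:
            return None
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][-1] for i in range(ncols)]


def decode_erasures(H, received):
    """Fill in the erased (None) positions of a received word."""
    unknown = [j for j, v in enumerate(received) if v is None]     # erased indices
    known = [j for j, v in enumerate(received) if v is not None]   # received indices
    A = [[row[j] for j in unknown] for row in H]                   # columns of H at erasures
    b = [sum(row[j] * received[j] for j in known) % 2 for row in H]
    x = solve_gf2(A, b)
    out = list(received)
    for j, v in zip(unknown, x):
        out[j] = v
    return out


codeword = [1, 0, 1, 0, 1, 1, 0, 1, 1]
received = [None, 0, 1, None, None, 1, 0, None, None]   # d - 1 = 5 erasures
print(decode_erasures(H, received) == codeword)
```

With five erasures the coefficient matrix has five columns, the largest case that a minimum distance of 6 guarantees to be uniquely solvable.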

An example of the decoding algorithm with local repair performed by the decoding apparatus 20 using the repairable code according to an embodiment is described below. First, assume a code C having a matrix H as its parity check matrix. The code C having this matrix H as its parity check matrix has length 9, dimension 2, and minimum distance 6. Writing a codeword of this code as $c = (c_1, c_2, \ldots, c_9)$, Equation 11 allows the original message vector to be decoded through maximum likelihood decoding even when up to five erasures occur.

For example, suppose that a user accesses four storage nodes and downloads the code symbols stored there, and assume that the five symbols $c_1, c_2, c_3, c_4, c_5$ have been received while the four symbols $c_6, c_7, c_8, c_9$ have not been downloaded and are therefore unknown. The relation between the received symbols and the unknown symbols can then be written out explicitly.

Using the top three rows of the parity check matrix H, the local parity check equations are defined as $c_1 + c_2 + c_3 = 0$, $c_4 + c_5 + c_6 = 0$, and $c_7 + c_8 + c_9 = 0$. The top three rows of the parity check matrix give the code its local repair property. Among these three parity check equations, all three symbols of the first equation, which contains c1, c2, and c3, are known, and in the second equation only c6 is unknown, so c6 can be obtained from the second equation by a binary operation. All three symbols of the third equation, which contains c7, c8, and c9, are unknown, so no symbol value can be determined from that equation. The erased symbol c6 can therefore be found from the second parity check equation alone, before Gaussian elimination is applied.

Unlike the conventional maximum likelihood decoding algorithm based on Gaussian elimination, the decoding apparatus 20 using the repairable code can thus reduce the number of unknown symbols from four (c6, c7, c8, c9) to three (c7, c8, c9). The relation having only these three symbols as unknowns can be expressed in matrix form, and the values of the remaining three unknown symbols c7, c8, and c9 can be obtained by applying Gaussian elimination to it. Because the matrix used by the decoding apparatus 20 is smaller than the one used when the conventional decoding algorithm is applied directly, Gaussian elimination requires fewer operations and the decoding complexity is lower. Preferably, the decoding apparatus 20 using the repairable code can obtain the five symbols that were not downloaded from the local parity check equations, in which case Gaussian elimination need not be applied at all, which greatly reduces the computational complexity.

In the present invention, the decoding apparatus 20 using the repairable code can further reduce the above number of operations by exploiting the local repair property of the locally repairable code. For every symbol index i of the locally repairable code there is a repair set R(i) that contains i and has size at most r+1, and the decoding apparatus 20 can use R(i) to perform a simple local repair. For example, when the decoding apparatus 20 uses a binary locally repairable code and only the i-th symbol block among the r+1 blocks belonging to R(i) is erased, the block ci can be recovered solely by XORing the other r symbol blocks of R(i). Therefore, if the locally repairable blocks among the erased blocks are repaired first and Gaussian elimination is then applied only to the erasures that could not be repaired locally, maximum likelihood decoding can be carried out in full by applying Gaussian elimination to a smaller matrix than would be needed without local repair.
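The two-stage procedure, local repair by XOR followed by Gaussian elimination on whatever remains, can be sketched as follows. The sketch is illustrative only: the 7x9 matrix `H`, the repair sets in `LOCAL_GROUPS` (zero-based indices, so c6 is index 5), and the function names are assumptions, not the matrices or routines of the source.

```python
# Hypothetical (9,2,6) binary code with locality r = 2 (an assumption for illustration).
LOWS = [1, 6, 7, 2, 12, 14, 4, 11, 15]
H = [[1 if j // 3 == g else 0 for j in range(9)] for g in range(3)]
H += [[(v >> b) & 1 for v in LOWS] for b in range(4)]
LOCAL_GROUPS = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]   # repair sets of size r + 1 = 3


def local_repair(received, groups):
    """Recover every erased symbol that is the only erasure in its repair set."""
    out = list(received)
    for group in groups:
        missing = [j for j in group if out[j] is None]
        if len(missing) == 1:
            acc = 0
            for j in group:
                if j != missing[0]:
                    acc ^= out[j]                  # XOR of the other r symbols
            out[missing[0]] = acc
    return out


def solve_gf2(A, b):
    """Solve A x = b over GF(2), assuming the columns of A are independent."""
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(len(A[0])):
        pr = next(r for r in range(col, len(rows)) if rows[r][col])
        rows[col], rows[pr] = rows[pr], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return [rows[i][-1] for i in range(len(A[0]))]


def decode(H, received, groups):
    """Return (codeword, number of unknowns left for Gaussian elimination)."""
    out = local_repair(received, groups)
    unknown = [j for j, v in enumerate(out) if v is None]
    if not unknown:
        return out, 0                              # local repair alone was enough
    known = [j for j, v in enumerate(out) if v is not None]
    A = [[row[j] for j in unknown] for row in H]
    b = [sum(row[j] * out[j] for j in known) % 2 for row in H]
    for j, v in zip(unknown, solve_gf2(A, b)):
        out[j] = v
    return out, len(unknown)


# c1..c5 received, c6..c9 erased, as in the example above.
decoded, ge_unknowns = decode(H, [1, 0, 1, 0, 1, None, None, None, None], LOCAL_GROUPS)
print(decoded, ge_unknowns)
```

In this run the second repair set recovers c6 by a single XOR, so Gaussian elimination is applied to three unknowns rather than four.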

In a system using a (27,13,6) code with locality 2, when a user accesses any 22 nodes, retrieves the code blocks, and decodes with them, the decoding apparatus 20 using the repairable code performs 7.18 additions on average, whereas applying conventional Gaussian elimination directly requires 28 additions on average; decoding is thus carried out with fewer operations. The average number of additions performed by the decoding apparatus 20 was obtained experimentally. In other words, by exploiting the local repair property of the locally repairable code, the decoding apparatus 20 runs a decoding algorithm of lower complexity and therefore decodes efficiently.

All or part of the method according to the embodiment of the present invention described above may be implemented in the form of a computer-executable recording medium (or computer program product), such as a program module executed by a computer. Here, the computer-readable medium may include computer storage media (for example, memory, a hard disk, magnetic/optical media, or a solid-state drive (SSD)). The computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media as well as removable and non-removable media.

In addition, all or part of the method according to an embodiment of the present invention may include computer-executable instructions. The computer program includes programmable machine instructions processed by a processor and may be implemented in a high-level programming language, an object-oriented programming language, assembly language, machine language, or the like.

In this specification, a unit (means) or module may refer to hardware capable of performing the functions and operations associated with each name described herein, to computer program code capable of performing specific functions and operations, or to an electronic recording medium, for example a processor or microprocessor, loaded with computer program code capable of causing specific functions and operations to be performed.

In other words, a unit (means) or module may refer to a functional and/or structural combination of hardware for carrying out the technical idea of the present invention and/or software for driving that hardware.

Accordingly, the method according to an embodiment of the present invention may be implemented by a computing device executing the computer program described above. The computing device may include at least some of a processor, a memory, a storage device, a high-speed interface connected to the memory and a high-speed expansion port, and a low-speed interface connected to a low-speed bus and the storage device. These components are connected to one another by various buses and may be mounted on a common motherboard or in another suitable manner.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications, changes, and substitutions without departing from the essential characteristics of the present invention. Accordingly, the embodiments disclosed herein and the accompanying drawings are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments and drawings. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present invention.

Claims

A parity check matrix generator for generating a parity check matrix defining a repairable code that encodes data, in consideration of the number of nodes to be accessed for data repair; and
A distributed storage unit for generating a codeword by encoding the data using the generated parity check matrix, and for storing the generated codeword in a distributed manner in a distributed storage system,
Wherein the repairable code is a binary locally repairable code, and
Wherein the parity check matrix generator generates the parity check matrix in further consideration of the length of the repairable code and the length of the data.

delete

The apparatus of claim 1, wherein the parity check matrix generator comprises:
A first sub-matrix generator for generating a first sub-matrix that ensures the locality; and
A second sub-matrix generator for generating a second sub-matrix, arranged adjacent to the first sub-matrix, that ensures the minimum distance of the repairable code,
Wherein the parity check matrix is generated using the first sub-matrix and the second sub-matrix, and
Wherein the minimum distance d (d is a natural number) of the repairable code, which is defined such that the data can be decoded even if up to d-1 symbols are lost, denotes the smallest of the distances between arbitrarily selected codewords among the plurality of codewords included in the repairable code.

The apparatus of claim 3, wherein the first sub-matrix generator
Determines the length of each row of the first sub-matrix in consideration of the locality and the length of the data,
Wherein the non-zero elements of different rows of the first sub-matrix are not arranged in the same column.

The apparatus of claim 4, wherein the first sub-matrix generator
Sets the number of non-zero elements included in each row of the first sub-matrix in consideration of the locality.

The apparatus of claim 3, wherein the second sub-matrix generator
Generates a binary vector space having a predetermined dimension, partitions the generated binary vector space into a set of sub-vector spaces having a dimension according to the locality, and generates the second sub-matrix using the set of sub-vector spaces.

The apparatus of claim 6,
Wherein the predetermined dimension is set in consideration of the locality and the size of the set of sub-vector spaces, and
Wherein a basis of each sub-vector space, corresponding to the dimension according to the locality, and the vector sum of two vectors belonging to the basis are included in the columns of the second sub-matrix.

The apparatus of claim 6, wherein the second sub-matrix generator
Converts elements expressed as powers of a primitive element of a Galois field, whose order is determined according to the predetermined dimension, into polynomials of degree lower than the predetermined dimension using a primitive polynomial, and generates the second sub-matrix expressed as binary vectors using the coefficients of the terms of the converted elements of the Galois field.

The apparatus of claim 8,
Wherein the size of the set of sub-vector spaces is determined in consideration of the locality, the dimension of the sub-vector space according to the locality, and whether the predetermined dimension is divisible by the dimension of the sub-vector space according to the locality, and
Wherein the primitive polynomial includes all possible primitive polynomials according to the predetermined order.

The apparatus of claim 1, wherein the distributed storage unit
Converts the generated parity check matrix into a systematic form including a first identity matrix and a remaining sub-matrix, transposes the sub-matrix of the converted parity check matrix, generates an encoding matrix including the transposed sub-matrix and a second identity matrix, and generates the codeword using the generated encoding matrix.

A parity check matrix generator for generating a parity check matrix defining a repairable code that encodes data, in consideration of the number of nodes to be accessed for data repair;
A distributed storage unit for generating a codeword by encoding the data using the generated parity check matrix, and for storing the generated codeword in a distributed manner in a distributed storage system; and
A decoding unit that selects, from the parity check matrix, a sub-matrix including the columns related to the code blocks stored in the distributed storage system, and performs decoding by applying a Gaussian elimination operation to the selected sub-matrix,
Wherein the repairable code is a binary locally repairable code, and
Wherein the parity check matrix generator generates the parity check matrix in further consideration of the length of the repairable code and the length of the data.

An encoding method using a repairable code in a distributed storage system, comprising:
Generating a parity check matrix defining a repairable code that encodes data, in consideration of the number of nodes to be accessed for data repair; and
Generating a codeword by encoding the data using the generated parity check matrix, and storing the generated codeword in a distributed manner in a distributed storage system,
Wherein the repairable code is a binary locally repairable code, and
Wherein the step of generating the parity check matrix generates the parity check matrix in further consideration of the length of the repairable code and the length of the data.

delete

The method of claim 12, wherein generating the parity check matrix comprises:
Generating a first sub-matrix that ensures the locality; and
Generating a second sub-matrix, arranged adjacent to the first sub-matrix, that ensures the minimum distance of the repairable code,
Wherein the parity check matrix is generated using the first sub-matrix and the second sub-matrix, and
Wherein the minimum distance d (d is a natural number) of the repairable code, which is defined such that the data can be decoded even if up to d-1 symbols are lost, denotes the smallest of the distances between arbitrarily selected codewords among the plurality of codewords included in the repairable code.

The method of claim 14, wherein generating the first sub-matrix
Determines the number of non-zero elements included in each row of the first sub-matrix in consideration of the locality, and determines the length of each row of the first sub-matrix,
Wherein the non-zero elements included in different rows of the first sub-matrix are not arranged in the same column.

The method of claim 14, wherein generating the second sub-matrix
Generates a binary vector space having a predetermined dimension, partitions the generated binary vector space into a set of sub-vector spaces having a dimension according to the locality, and generates the second sub-matrix using the set of sub-vector spaces.

The method of claim 16,
Wherein the predetermined dimension is set in consideration of the locality and the size of the set of sub-vector spaces, and
Wherein each of the sub-vector spaces includes, as its elements, basis vectors corresponding to the dimension according to the locality and the sum of the basis vectors.

The method of claim 16, wherein generating the second sub-matrix
Converts elements expressed as powers of a primitive element of a Galois field, whose order is determined according to the predetermined dimension, into polynomials of degree lower than the predetermined dimension using a primitive polynomial, and generates the second sub-matrix expressed as binary vectors using the coefficients of the terms of the converted elements of the Galois field.

The method of claim 18,
Wherein the size of the set of sub-vector spaces is determined in consideration of the locality, the dimension of the sub-vector space according to the locality, and whether the predetermined dimension is divisible by the dimension of the sub-vector space according to the locality, and
Wherein the primitive polynomial includes all possible primitive polynomials according to the predetermined dimension.

The method of claim 12, wherein the step of storing in a distributed manner
Converts the generated parity check matrix into a systematic form including a first identity matrix and a remaining sub-matrix, transposes the sub-matrix of the converted parity check matrix, generates an encoding matrix including the transposed sub-matrix and a second identity matrix, and generates the codeword using the generated encoding matrix.

A program stored in a computer-readable recording medium that, when executed by a processor, implements the encoding method using a repairable code according to any one of claims 12 and 14 to 20.