KR20110083189A

KR20110083189A - Distributed method and apparatus for storing large data

Info

Publication number: KR20110083189A
Application number: KR1020100003283A
Authority: KR
Inventors: 송유진
Original assignee: 동국대학교 경주캠퍼스 산학협력단
Priority date: 2010-01-14
Filing date: 2010-01-14
Publication date: 2011-07-20
Also published as: KR101129330B1

Abstract

PURPOSE: A method and apparatus for distributed storage of large-scale data are provided to obtain operational benefits for a distributed DBMS. CONSTITUTION: The method for distributed storage of large-scale data comprises the steps of: partitioning a relation table containing large-scale data into ℓ fragments using a vertical partitioning method; distributing each of the ℓ fragments into n pieces of distributed information through an XOR threshold secret sharing scheme; and allocating and storing the pieces of distributed information in a balanced incomplete block design discrete structure so that the secret information can be restored from a number of pieces at or above the threshold.

Description

Distributed method and apparatus for storing large data

The present invention relates to large-scale data storage, and more particularly, to a method and apparatus for distributed storage of large-scale data.

Recently, as the ubiquitous network environment has evolved, large volumes of data are being generated through various distribution channels, unlike in the past. In addition, with the diversification and growing size of digital content and the diversification of services delivered through mobile phones, home appliances, and information devices, ordinary people connect to ubiquitous networks in their daily lives to exchange and share many kinds of information. As the network environment evolves, the use of network-attached storage increases and, at the same time, the storage capacity of terminals grows, so a safe and efficient storage environment capable of storing data at many different locations on the network has become necessary.

As various data services become possible in the ubiquitous environment, the security and safe management of large-scale data are emerging as major issues, because security vulnerabilities and privacy violations caused by malicious attackers or internal users must be addressed.

The conventional distributed information allocation scheme of Ito, Saito, and Nishizeki allocates distributed information (shares) in a threshold access structure and is defined over a set of r participants {P₁, P₂, ..., P_r}. Each participant has a set of (m + 1) pieces of distributed information. To construct the access structure of the secret sharing scheme, q groups B consisting of all possible combinations of m participants are first formed as in Equation 1 below.

$$B = \{B_1, B_2, \ldots, B_q\} \qquad (1)$$

Here, $q = \binom{r}{m}$.

Next, the secret information s is distributed using a (q, q) threshold secret sharing scheme. The resulting set of distributed information is denoted {s₁, s₂, ..., s_q}. The set of distributed information assigned to participant P_i is given by the function

$$g(i) = \{\, s_j \mid P_i \notin B_j,\ 1 \le j \le q \,\}.$$

Each participant receives $\binom{r-1}{m}$ pieces of distributed information, and each piece of distributed information is stored at (r - m) participants. For example, consider a set of four participants in which at least three participants must pool their distributed information to recover the distributed secret. With r = 4 and m = 2, the groups B of all possible combinations of participants are given by Equation 2 below.

$$B = \{B_1, \ldots, B_6\}, \quad B_1 = \{P_1, P_2\},\ B_2 = \{P_1, P_3\},\ B_3 = \{P_1, P_4\},\ B_4 = \{P_2, P_3\},\ B_5 = \{P_2, P_4\},\ B_6 = \{P_3, P_4\} \qquad (2)$$

The six pieces of distributed information needed to restore the secret information are generated and denoted {s₁, s₂, s₃, s₄, s₅, s₆}. Table 1 shows the distributed information that each participant holds under the distributed information allocation function g.

[Table 1]
Participant    Distributed information
P₁             s₄, s₅, s₆
P₂             s₂, s₃, s₆
P₃             s₁, s₃, s₅
P₄             s₁, s₂, s₄

With only two participants, just five of the six pieces of distributed information can be gathered. Because not all six pieces required for restoration are available, the secret information cannot be restored. Therefore, at least three participants are needed to collect all six pieces of distributed information required to restore the distributed secret information. As shown above, the conventional Ito, Saito, and Nishizeki scheme requires every piece of distributed information in order to restore the distributed secret information.
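The four-participant example above can be reproduced with a short sketch. It assumes the groups are numbered in lexicographic order, which matches Table 1, and it uses a plain XOR split as the (q, q) scheme; the text does not say which (q, q) scheme is used, so that choice and all function names here are illustrative only.

```python
from itertools import combinations
import secrets


def isn_assignment(r, m):
    """Build the groups B_1..B_q and the share indices assigned to each participant."""
    participants = list(range(1, r + 1))
    groups = list(combinations(participants, m))  # lexicographic order (assumption)
    # Participant P_i receives s_j exactly when P_i is NOT a member of B_j.
    assignment = {
        p: [j for j, group in enumerate(groups, start=1) if p not in group]
        for p in participants
    }
    return groups, assignment


def xor_split(secret, q):
    """(q, q) split: q - 1 random strings plus one string that XORs back to the secret."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(q - 1)]
    last = secret
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]


def xor_join(shares):
    """XOR all q shares together to recover the secret."""
    result = bytes(len(shares[0]))
    for share in shares:
        result = bytes(a ^ b for a, b in zip(result, share))
    return result


def collected(assignment, coalition):
    """Indices of the shares that a coalition of participants holds together."""
    return sorted({j for p in coalition for j in assignment[p]})


groups, assignment = isn_assignment(r=4, m=2)
shares = xor_split(b"secret", len(groups))

print(assignment)                        # share indices per participant, as in Table 1
print(collected(assignment, (1, 2)))     # two participants: one share is missing
print(collected(assignment, (1, 2, 3)))  # three participants: all six shares
```

Because every share is needed for the XOR join, a coalition can restore the secret only when `collected` returns all q indices, which is the limitation the text attributes to the conventional scheme.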

The present invention has been made to solve the conventional problem described above. Its object is to provide a method and apparatus for distributed storage of large-scale data that allocate and store the data in a balanced incomplete block design (BIBD) discrete structure based on a t-(υ, k, λ) combinatorial design, so that the secret information can be restored using a number of pieces of distributed information at or above a threshold.

To achieve the above object, the method for distributed storage of large-scale data according to the present invention comprises the steps of: (i) setting a partition structure using a vertical partitioning method and then partitioning a relation table containing large-scale data into ℓ fragments; (ii) distributing each of the ℓ fragments into n pieces of distributed information using an XOR threshold secret sharing scheme; and (iii) allocating and storing the ℓn pieces of distributed information in a balanced incomplete block design discrete structure based on a t-(υ, k, λ) combinatorial design, so that the secret information can be restored using a number of pieces of distributed information at or above a threshold, the design consisting of a set of υ points and a set of blocks that satisfy a first property that every block consists of k points and a second property that the number of blocks containing any t points is exactly λ, wherein ℓ, n, t, and λ are natural numbers and k, which serves as the threshold, is a natural number less than or equal to n.

In addition, the apparatus for distributed storage of large-scale data according to the present invention comprises: a distributed database management system that sets a partition structure using a vertical partitioning method and then partitions a relation table containing large-scale data into ℓ fragments; and a storage unit that distributes each of the ℓ fragments into n pieces of distributed information using an XOR threshold secret sharing scheme, and allocates and stores the ℓn pieces of distributed information in a balanced incomplete block design discrete structure based on a t-(υ, k, λ) combinatorial design, so that the secret information can be restored using a number of pieces of distributed information at or above a threshold, the design consisting of a set of υ points and a set of blocks that satisfy a first property that every block consists of k points and a second property that the number of blocks containing any t points is exactly λ, wherein ℓ, n, t, and λ are natural numbers and k, which serves as the threshold, is a natural number less than or equal to n.

To solve the conventional problem, the present invention newly constructs a distributed information allocation method: when large-scale data is stored and managed in a distributed manner using a secret sharing scheme, the allocation structure for storing the distributed information is derived from the matrix structure of a combinatorial design. Whereas the conventional Ito, Saito, and Nishizeki scheme requires all of the distributed information to restore the distributed secret information, the present invention can restore the secret information once a number of pieces of distributed information at or above the threshold has been collected. This makes the scheme easy to implement and yields operational benefits for a distributed DBMS that distributes and restores data.

Whereas the conventional allocation method relies on the polynomial-based Ito and Saito allocation scheme, the distributed information allocation method constructed in the present invention depends on the parameters of a combinatorial design. The selection of DB servers and distributed information therefore follows a regular pattern, and the use of a well-known combinatorial structure makes the method easy to implement.

FIG. 1 is a block diagram showing the configuration of a large-scale data distributed storage apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the fragment creation and distribution process according to an embodiment of the present invention.
FIG. 3 is a diagram showing an example of a t-(υ, k, λ) combinatorial design applicable to the present invention.

Hereinafter, a large-scale data distributed storage apparatus and method according to an embodiment of the present invention will be described in detail with reference to the accompanying exemplary drawings.

FIG. 1 is a block diagram showing the configuration of a large-scale data distributed storage apparatus according to an embodiment of the present invention. FIG. 2 is a diagram illustrating the fragment creation and distribution process according to an embodiment of the present invention.

The large-scale data distributed storage apparatus according to an embodiment of the present invention includes a distributed DBMS 110 and a storage unit 120.

The distributed DBMS 110 sets a partition structure using a vertical partitioning method and then partitions a relation table containing large-scale data into ℓ fragments. The distributed DBMS 110 provides the basic functions of a distributed database, such as query processing and data partition management.
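As a rough illustration of vertical partitioning, the sketch below splits a small relation table into ℓ = 2 fragments by attribute. The patent does not describe how fragments are rejoined; carrying the key column in every fragment is a common convention assumed here, and the table contents and function names are hypothetical.

```python
def vertical_partition(rows, key, attribute_groups):
    """Split rows (a list of dicts) into one fragment per attribute group.

    Each fragment keeps the key column so that the fragments can be joined
    back together (an assumption; the source does not specify this).
    """
    return [
        [{key: row[key], **{attr: row[attr] for attr in attrs}} for row in rows]
        for attrs in attribute_groups
    ]


def rejoin(fragments, key):
    """Rebuild the original rows by merging fragment rows that share a key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical relation table with four attributes.
patients = [
    {"id": 1, "name": "A", "diagnosis": "flu", "phone": "010-0000-0001"},
    {"id": 2, "name": "B", "diagnosis": "cold", "phone": "010-0000-0002"},
]

# l = 2 fragments: contact attributes and medical attributes.
fragments = vertical_partition(patients, "id", [["name", "phone"], ["diagnosis"]])
for fragment in fragments:
    print(fragment)
```

Each fragment would then be handed to the storage unit for secret sharing, so that no single database holds a complete row.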

The storage unit 120 distributes each of the ℓ fragments into n pieces of distributed information using an XOR (k, n) threshold secret sharing scheme, and allocates and stores the ℓn pieces of distributed information in a balanced incomplete block design discrete structure based on a t-(υ, k, λ) combinatorial design, so that the fragments can be restored from k pieces of distributed information (the threshold) and the relation table regenerated. The design consists of a set of υ points and a set of blocks that satisfy a first property that every block consists of k points and a second property that the number of blocks containing any t points is exactly λ.

Here, ℓ, n, t, and λ are natural numbers, and k is a natural number less than or equal to n.

Referring to FIG. 2, the ℓn pieces of distributed information derived from the ℓ fragments are distributed across the m DBs 210 according to a combinatorial structure, specifically the discrete structure of a BIBD. The partitioned fragments are organized by attribute. Each fragment is given an identifier ID_i (i = 1, ..., ℓ), and the distributed information derived from each fragment is stored in DB_i (i = 1, ..., m) according to the distributed information allocation table.

Referring to FIG. 2, the storage unit 120 allocates and stores the ℓn pieces of distributed information in the plurality of DBs 210 in the form of a k-(ℓn, n, 1) distributed information allocation table based on the t-(υ, k, λ) combinatorial design. The distributed information allocation table is based on a BIBD, that is, a t-(υ, k, 1) combinatorial design, and takes the form k-(ℓn, n, 1). Table 2 explains each symbol.

[Table 2]
Symbol    Meaning
m         Total number of DBs
k         Threshold
ℓ         Total number of fragments
n         Number of pieces of distributed information per fragment
ℓn        Total number of pieces of distributed information

(Example 2) Consider distributing the fragments partitioned from a relation table using the XOR-based secret sharing scheme. The distributed information allocation table is constructed by the 3-(8, 4, 1) combinatorial design construction, with m = 8 DBs, threshold k = 3, a total of ℓ = 2 fragments, and n = 4 pieces of distributed information per fragment. Table 3 shows an example of DB allocation under the 3-(8, 4, 1) combinatorial design.

[Table 3]
ID          DB
ID₁ (d₁)    DB₁, DB₂, DB₄, DB₈
ID₂ (d₂)    DB₂, DB₃, DB₅, DB₈

To restore the distributed information, the identifier of the fragment is used as an index to search the eight stores. To restore the desired fragment, as many pieces of distributed information as the threshold, 3, are collected from among the four DBs assigned to it, and the fragment is restored from them.
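The lookup step of Example 2 can be sketched as follows. The allocation table is copied from Table 3; the function name, the `unavailable` parameter, and the policy of taking the first k reachable DBs are assumptions made only for illustration, and the sketch covers only the choice of DBs, not the XOR reconstruction itself.

```python
# Allocation table from Table 3: fragment ID -> the n = 4 DBs holding its pieces.
ALLOCATION_TABLE = {
    "ID1": ["DB1", "DB2", "DB4", "DB8"],
    "ID2": ["DB2", "DB3", "DB5", "DB8"],
}
THRESHOLD_K = 3  # pieces needed to restore one fragment


def select_databases(fragment_id, unavailable=()):
    """Pick k reachable DBs from the block assigned to a fragment (hypothetical policy)."""
    candidates = [db for db in ALLOCATION_TABLE[fragment_id] if db not in unavailable]
    if len(candidates) < THRESHOLD_K:
        raise RuntimeError(f"fewer than {THRESHOLD_K} DBs reachable for {fragment_id}")
    return candidates[:THRESHOLD_K]


print(select_databases("ID1"))                       # all four DBs reachable
print(select_databases("ID2", unavailable={"DB2"}))  # one DB down, still restorable
```

With n = 4 and k = 3, a fragment stays restorable when any one of its assigned DBs is unreachable, and the function raises an error once two are down.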

In the distributed storage of large-scale data, the close relationship between the distributed information allocation structure and the combinatorial structure of a combinatorial design is first examined, and a distributed information allocation method is then constructed from the combinatorial design.

The distributed information allocation structure can be introduced to formalize the secret restoration property, that is, the property that a set of DB servers holding certain pieces of distributed information can restore the secret.

The distributed information allocation structure can thus be analyzed in terms of whether a suitable collection of balanced subsets of a design (a BIBD when λ = 1) can be chosen so that the secret restoration property of the distributed information allocation method is satisfied. The combinatorial design used in the present invention is defined as follows.

[Definition 1] A t-(υ, k, λ) combinatorial design consists of a set X of υ points and a set D of blocks satisfying the following properties.

(1) Every block consists of exactly k points.

(2) The number of blocks containing any given t points is exactly λ.

For example, a 2-(7, 3, 1) design consists of a set X of υ = 7 points (Equation 3) and a set D of blocks (Equation 4) satisfying the following properties.

(1) Every block consists of exactly k = 3 points.

(2) The number of blocks containing any t = 2 points, for example 3 and 4, is exactly λ = 1.
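The two properties of Definition 1 can be checked mechanically. The sketch below verifies them for one standard labelling of the 2-(7, 3, 1) design (the Fano plane); this particular block list is an assumption chosen for illustration and is not necessarily the one given in Equations 3 and 4.

```python
from itertools import combinations


def is_t_design(points, blocks, t, k, lam):
    """Check properties (1) and (2) of Definition 1 for the given parameters."""
    point_set = set(points)
    block_sets = [frozenset(block) for block in blocks]
    # Property (1): every block consists of exactly k points taken from X.
    if any(len(block) != k or not block <= point_set for block in block_sets):
        return False
    # Property (2): every t-subset of X lies in exactly lam blocks.
    return all(
        sum(1 for block in block_sets if set(subset) <= block) == lam
        for subset in combinations(sorted(point_set), t)
    )


# One common labelling of the Fano plane (assumed, not taken from the source).
X = range(1, 8)
D = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]

print(is_t_design(X, D, t=2, k=3, lam=1))
print([block for block in D if {3, 4} <= set(block)])  # the single block through 3 and 4
```

The same function can be pointed at any candidate allocation table to confirm that it really has the balance property the scheme relies on.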

FIG. 3 is a diagram showing an example of a t-(υ, k, λ) combinatorial design applicable to the present invention.

Matrix Representation of the Distributed Information Allocation Structure

Let s be the secret and υ₁, υ₂, ..., υ_n be the distributed information. An access set is defined as a set of distributed information from which the secret can be restored, and a non-access set is defined as a set of distributed information from which it cannot. The matrix structure of the distributed information allocation structure can be expressed through the matrix structure of the design.

To apply the matrix structure of a combinatorial design to the distributed information allocation matrix, an example is given and the construction is summarized. The matrix structure of a particular combinatorial design is converted into a distributed information allocation matrix, and the construction of a distributed information allocation method based on the t-(υ, k, 1) combinatorial design underlying that matrix is presented. A distributed information allocation method can be expressed by a matrix whose rows are the possible values of (s, υ₁, υ₂, ..., υ_n). In the present invention, such a matrix is called a distributed information allocation matrix and is denoted M. Example 1 illustrates M.

(Example 1) The distributed information allocation matrix M obtained from a 3-(10, 4, 1) design is shown in Equation 7 below.

[Theorem 1] Let M be the matrix whose rows are the blocks of a t-(υ, k, 1) design. In the distributed information sharing method that uses M as its distributed information allocation matrix, any t participants form an access set.

(Proof) By property (2) of [Definition 1], the values (x₁, x₂, ..., x_t) in any t columns determine a unique row of M. Hence the values in any t columns that do not include the s column uniquely determine the value in the s column. Therefore, any t participants form an access set.
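The uniqueness step in the proof can be illustrated on a small t-(υ, k, 1) design. The sketch below builds a 3-(8, 4, 1) design as the 4-subsets of {0, ..., 7} whose elements XOR to zero; this is a well-known construction used here as an assumption, and its point labelling does not correspond to Table 3. It treats blocks as unordered point sets and does not model the ordered columns of M.

```python
from itertools import combinations

# Hypothetical 3-(8, 4, 1) design: 4-subsets of {0..7} whose elements XOR to 0.
POINTS = range(8)
BLOCKS = [
    frozenset(c)
    for c in combinations(POINTS, 4)
    if c[0] ^ c[1] ^ c[2] ^ c[3] == 0
]


def block_through(known_points):
    """Return the single block containing the given t = 3 points (lambda = 1)."""
    matches = [block for block in BLOCKS if set(known_points) <= block]
    if len(matches) != 1:
        raise ValueError("expected exactly one block through the given points")
    return matches[0]


def missing_point(known_points):
    """Given t = 3 points of a block, recover the fourth point of that block."""
    (rest,) = block_through(known_points) - set(known_points)
    return rest


print(len(BLOCKS))               # number of blocks in the design
print(missing_point((0, 1, 2)))  # remaining point of the block through 0, 1, 2
```

If one point of each block plays the role of the s entry, knowing the other t = 3 entries fixes the block and therefore fixes s, which is the argument the proof makes for the rows of M.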

We now examine to what extent the balance condition of a t-(υ, k, 1) combinatorial design determines the secret restoration property. When secret information is split and encoded into distributed information, the secret restoration property is characterized by the combinatorial design parameters t, k, and λ. That is, if the pieces of distributed information are mapped to points and the secret sharing functions to blocks, the secret restoration property reduces to the property that, for the design parameters t, k, and λ, exactly one block is determined once t of its k points are given. Examined against the definition of a combinatorial design:

* The number of points in each block is k (the block size corresponds to the number of members who can restore the secret).

* For any t distinct points, the number of blocks containing all of them is constant (t corresponds to the size of a minimal access set that can restore the secret).

* The number of blocks containing t points is λ (λ corresponds to the number of secrets that a minimal access set can restore).

Thus, when t of the k members (a subset of all members) come together, a secret sharing scheme is formed, and the t members forming a minimal access set can restore λ = 1 secret. From this point of view, a secret sharing scheme can be characterized by the combinatorial design parameters t, k, and λ.

The advantage of the combinatorial design approach is that a secret sharing scheme can be constructed by specifying these parameters in advance. By contrast, conventional secret sharing schemes depend on the parameter q of GF(q).

By formalizing the secret restoration property from the viewpoint of combinatorial design, the present invention redefines a secret sharing scheme as follows.

[Definition 2] A secret sharing scheme is a pair (P, F) that satisfies the following secret restoration properties. Here, P is a set of υ points (pieces of distributed information), and F is a set of secret sharing functions corresponding to a collection of subsets of P of size k.

1) The secret information can be uniquely determined from t or more of the k points.

2) No information about the secret can be determined from t - 1 or fewer of the k points.

The present invention has been described above with reference to specific preferred embodiments. However, its technical idea is not limited to the embodiments or drawings described above. Those of ordinary skill in the art to which the invention pertains can make various modifications without departing from the gist of the invention as claimed, and such modifications fall within the scope of the claims.

The method and apparatus for distributed storage of large-scale data according to the present invention provide a technique for efficiently distributing and managing large-scale data such as highly confidential medical data and trade secrets that include customers' personal information, and are expected to be used in ubiquitous service environments.

110: distributed DBMS
120: SA
210: DB

Claims

1. A method for distributed storage of large-scale data, comprising the steps of:
(i) setting a partition structure using a vertical partitioning method and then partitioning a relation table containing large-scale data into ℓ fragments;
(ii) distributing each of the ℓ fragments into n pieces of distributed information using an XOR threshold secret sharing scheme; and
(iii) allocating and storing the ℓn pieces of distributed information in a balanced incomplete block design discrete structure based on a t-(υ, k, λ) combinatorial design, so that secret information can be restored using a number of pieces of distributed information at or above a threshold, the design consisting of a set of υ points and a set of blocks that satisfy a first property that every block consists of k points and a second property that the number of blocks containing any t points is exactly λ, wherein ℓ, n, t, and λ are natural numbers, and k, which serves as the threshold, is a natural number less than or equal to n.

2. The method of claim 1, wherein λ is 1.

3. The method of claim 1, wherein in step (iii), the ℓn pieces of distributed information are allocated to and stored in a plurality of databases in the form of a k-(ℓn, n, 1) distributed information allocation table based on a t-(υ, k, 1) combinatorial design.

4. An apparatus for distributed storage of large-scale data, comprising:
a distributed database management system that sets a partition structure using a vertical partitioning method and then partitions a relation table containing large-scale data into ℓ fragments; and
a storage unit that distributes each of the ℓ fragments into n pieces of distributed information using an XOR threshold secret sharing scheme, and allocates and stores the ℓn pieces of distributed information in a balanced incomplete block design discrete structure based on a t-(υ, k, λ) combinatorial design, so that secret information can be restored using a number of pieces of distributed information at or above a threshold, the design consisting of a set of υ points and a set of blocks that satisfy a first property that every block consists of k points and a second property that the number of blocks containing any t points is exactly λ, wherein ℓ, n, t, and λ are natural numbers, and k, which serves as the threshold, is a natural number less than or equal to n.

5. The apparatus of claim 4, wherein λ is 1.

6. The apparatus of claim 4, wherein the storage unit allocates and stores the ℓn pieces of distributed information in a plurality of databases in the form of a k-(ℓn, n, 1) distributed information allocation table based on the t-(υ, k, λ) combinatorial design.