JP6012078B2

JP6012078B2 - Information distribution system and information distribution storage system

Info

Publication number: JP6012078B2
Application number: JP2013126082A
Authority: JP
Inventors: 隆宏小篠; 経幸平本
Original assignee: KEIREX TECHNOLOGY INC.
Current assignee: KEIREX TECHNOLOGY INC.
Priority date: 2013-05-30
Filing date: 2013-05-30
Publication date: 2016-10-25
Anticipated expiration: 2033-05-30
Also published as: WO2014192957A1; JP2014235425A

Description

Detailed Description of the Invention

The present invention relates to an information distribution system in which the data size and security strength of the shared information can be adjusted by, for example, the amount of random numbers used, and to an information distributed storage system using that information distribution system.

In recent years, prompted by the development of data communication networks and data centers and by disasters such as major earthquakes, technology that conceals important data and stores it in a distributed manner over a wide area has attracted attention. Applications of secret sharing technology include backing up large volumes of highly confidential files and backing up small amounts of important data (for example, managing the private keys of public-key cryptosystems). With the recent spread of data centers, demand for storage services is growing.

The (k, n) threshold secret sharing method is a representative technique: the original data to be kept secret is concealed as n pieces of shared information, and the original data is restored from any k of those pieces. It is an effective means of keeping information secret while avoiding the risk of loss. For storage services operated by a third party over a network, technology that achieves high availability, high confidentiality, and low-cost operation is extremely important. Conventional secret sharing techniques have pursued availability and confidentiality, but proposals that also achieve low cost have been insufficient.

Patent Document 1 proposes a (2, 4) threshold secret sharing method, with four shares and restoration of the original data from any two of them, in order to ensure availability under a double failure and to make secret sharing more efficient by reducing the amount of computation. Even if up to two pieces of shared information become unavailable through loss, server failure, or the like, the remaining two can restore the original data, which ensures safety. In Patent Document 1, the original data is divided into two pieces of partial information, two random number elements of the same size as each piece of partial information are generated, exclusive OR operations are performed on combinations of the partial information and the random numbers, and the results are concatenated two at a time to generate four pieces of shared information. Because only exclusive OR operations are used, computation is fast. However, since each piece of shared information is the same size as the original data, the total amount of stored data is four times the original data, which raises the storage cost.

Patent Document 2, on the other hand, proposes an invention that gives a degree of freedom to the size of the shared information in a system that likewise uses the (k, n) threshold sharing method. Before that invention, when the original data was divided to generate partial information, the same number of random number elements as the number of divisions (each the same size as a piece of partial information) was generated, exclusive OR operations were performed on combinations of elements between the set of partial information (first group) and the set consisting only of random numbers (second group), and the resulting elements were concatenated to create the shared information. The size of the shared information equaled the sum of the sizes of the first-group elements and was therefore the same as the size of the original data. The invention described in Patent Document 2 increases the number of divisions of the original data by a predetermined number, moves that number (the specified number) of pieces of partial information to the second group, and substitutes them for random numbers. Since the first group and the second group contain the same number of elements, the number of random numbers decreases by the number substituted. In addition, the size of the partial information belonging to the first group and the size of each random number become smaller as the number of divisions increases. The subsequent processing is the same as before: exclusive OR operations are performed between elements of the first group (partial information only) and the second group (the remaining partial information and the random numbers), and the results are concatenated to generate the shared information. In that invention, the more the number of divisions is increased and the more partial information is moved to the second group, the smaller the shared information becomes. However, when this method is applied to (2, 4) threshold secret sharing, which this specification treats as the most realistic mode of operation, the shared information can contain fragments of the original data (secret information) or fragments that are merely the exclusive OR of two pieces of partial information (fragments not XORed with any random number), so cases arise in which the security strength is undesirable. For example, with the exclusive OR of two pieces of partial information, if one of them is all zeros the other appears as raw original data, which is a partial leak of the secret.
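The partial leak described here is easy to reproduce. The following minimal Python sketch shows that a fragment formed by XORing two pieces of partial information, with no random number involved, exposes one piece verbatim whenever the other is all zeros. The helper name xor_bytes and the sample byte strings are illustrative assumptions and do not come from Patent Document 2.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical pieces of partial information (illustrative values only).
s1 = b"PIN=4921"
s2 = bytes(len(s1))            # the second piece happens to be all zeros

# A fragment built from partial information alone, without a random element.
fragment = xor_bytes(s1, s2)

# The fragment is identical to s1, so the raw data is visible in the share.
leaked = fragment == s1
```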

Non-Patent Document 1 proposes (3, 2, 4) threshold sharing using the (k, L, n) ramp-type threshold secret sharing method. As described in many documents such as Non-Patent Document 2, the (k, L, n) ramp-type threshold secret sharing method is a technique for reducing the size of the shared information. As in the (k, n) threshold secret sharing method, the original data is concealed as n pieces of shared information, and collecting k of them restores the original data. A new parameter L is introduced: the original data cannot be restored from k - L pieces of shared information, but collecting from k - L + 1 to k - 1 pieces yields some information about the original data. In Non-Patent Document 1, the original data is distributed into four pieces, and collecting three of them restores the original data completely. Because L = 2 is specified, each piece of shared information is half the size of the original data; a single piece can never restore the original data, while two pieces reveal part of the information. Although this method reduces the size of the shared information and improves processing speed, it cannot withstand a double failure.
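The thresholds of the ramp scheme can be summarized in a few lines of Python. This is only a sketch of the counting argument in the paragraph above; the function names and the labels "none", "partial", and "full" are assumptions chosen for illustration.

```python
def ramp_disclosure(k: int, L: int, shares_held: int) -> str:
    """Classify what `shares_held` shares allow in a (k, L, n) ramp scheme."""
    if not 1 <= L <= k:
        raise ValueError("L must satisfy 1 <= L <= k")
    if shares_held >= k:
        return "full"       # k or more shares restore the original data
    if shares_held <= k - L:
        return "none"       # k - L or fewer shares cannot restore it
    return "partial"        # k - L + 1 .. k - 1 shares yield some information


def tolerated_failures(k: int, n: int) -> int:
    """Number of shares that may be lost while restoration stays possible."""
    return n - k


# The (3, 2, 4) case of Non-Patent Document 1, for 1, 2, and 3 shares held.
levels = [ramp_disclosure(3, 2, t) for t in (1, 2, 3)]
```

With k = 3 and n = 4, tolerated_failures returns 1, which is why that scheme cannot survive a double failure, whereas a (2, 4) scheme gives 2.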

JP 2011-35618 A
JP 2008-262040 A
JP 2009-37093 A
JP 2004-213650 A
松本勉, 清藤武暢, 鴨志田昭輝, 新谷敏文, 佐藤敦, "High-speed secret sharing method for secure data storage service" (セキュアデータ保管サービス向け高速秘密分散方式), SCIS 2012, 2012.
土井洋, "Secret Sharing Method and its Applications" (秘密分散法とその応用について), 情報セキュリティ総合科学 (Information Security Science), Vol. 4, 2012.

With a view to handling large volumes of data, the present invention proposes means for achieving low cost, high availability, high confidentiality, and high speed, and can solve the following problems simultaneously.

(1) Reduce the size of the shared information. When backing up a large group of files, if each piece of shared information is the same size as the original data, large-scale storage is required as the number of shares increases, and the storage cost rises.
(2) Increase the confidentiality of the shared information. Every fragment of the shared information includes an exclusive OR (XOR) with a random number so that the original data cannot be inferred. Confidentiality is particularly important for making it possible to use low-cost public lines when transferring shared information over a network.
(3) Achieve high-speed processing. When handling large-scale data, slow generation of shared information from the original data and slow restoration of the original data require a high-performance computing device, which is costly. Moreover, unless the amount of data passing over the data line is reduced, the line becomes a bottleneck and processing speed drops.
(4) Handle double failures at the logical level. Even if any of the distributed devices fails (data destruction or loss), the data must be multiplexed to raise availability so that operation can continue. Conventionally this has been handled at the storage level with RAID functions or server replication functions, but combining these with existing distribution devices raises the cost, so double failures are to be tolerated at the logical level without using RAID.

To solve the above problems simultaneously, the present invention proposes means for lowering the actual operating cost, while retaining the advantages of sharing technology, by using the (2, 4) threshold sharing method.

The first invention is a (2, 4) threshold information distribution system that generates four pieces of shared information from original data and restores the original data using two of those pieces, characterized by comprising: partial information generating means for dividing the original data into d pieces of partial information; random number generating means for generating m random numbers of the same size as the partial information; reference vector generating means for generating first and second reference vectors, each of size (d + m)/2, whose components are the d pieces of partial information and the m random numbers; shared information vector generating means for generating four shared information vectors of size (d + m)/2 from the generated first and second reference vectors and eight coefficient matrices prepared in advance, by performing matrix operations (exclusive OR, XOR) on the combination of the first reference vector with a coefficient matrix and on the combination of the second reference vector with a coefficient matrix, and then summing (exclusive OR, XOR) the resulting vectors; shared information generating means for concatenating the (d + m)/2 elements of each of the four generated shared information vectors to generate the four pieces of shared information; and data restoring means for restoring the original data from any two pieces of the shared information.
The first invention is further characterized in that m is d or less and (d + m)/2 is an integer of 2 or more.
The first invention is further characterized in that each coefficient matrix is a square matrix of dimension (d + m)/2 whose elements are only 0 or 1.
The first invention is further characterized in that the coefficient matrices are either a plurality of patterns defined in advance to match the order of the elements in the reference vectors, or are supplied arbitrarily from outside.
The first invention is further characterized in that, when m satisfies 2^m - 1 ≥ (d + m)/2, the m random numbers are combined and used as new random numbers in the matrix operations (exclusive OR, XOR).

The second invention is an information distributed storage system using the information distribution system, in which four information distributed storage devices are connected by a network, characterized by comprising: monitoring means for monitoring the operating state of the information distributed storage devices; management means for managing where the shared information is stored; means for mutually storing the shared information among the information distributed storage devices; means for switching immediately to another information distributed storage device when the monitoring means detects that one information distributed storage device has failed; and means for continuing operation with the remaining information distributed storage devices when the monitoring means detects that up to two information distributed storage devices have failed; wherein the shared information is stored in all of the information distributed storage devices and the original data can be restored at any of the information distributed storage devices.

According to the first invention, the number m of random numbers generated is variable. To make the shared information smaller, m is decreased; to raise the security strength at the cost of larger shared information, m is increased toward d. Because the amount of random numbers is variable, the balance between the confidentiality of the shared information and its size and processing speed can be changed freely. That is, when as many random numbers are generated as there is original data, the shared information becomes the same size as the original data and processing takes longer, but the security strength is maximized. Conversely, reducing the amount of random numbers shrinks the shared information, which speeds up processing.

In addition, the coefficient matrix pattern determines the pattern of exclusive OR operations between random numbers and original data when the shared information is created, so changing the pattern also changes the security strength.

Furthermore, when the number m of random numbers is chosen so that 2^m - 1 is equal to or greater than (d + m)/2, the m random numbers are combined and used as new random numbers in the matrix operations (exclusive OR, XOR). This allows every fragment of the shared information to include an exclusive OR with a different random number element. No fragment of the shared information is raw original data or an exclusive OR of original data alone, so no fragment of the original data ever exists in the shared information as raw data, and the shared information as a whole can be regarded as completely random.

In addition, by making m relatively small with respect to d while keeping the condition that 2^m - 1 is equal to or greater than (d + m)/2, the number of random numbers generated and the size of the shared information can be reduced while the security strength is maintained. When true random numbers are used, generating them is costly (including processing time), so generating fewer random numbers contributes not only to saving disk capacity through smaller shared information but also to improving the processing speed of the whole system.

According to the second invention, each information distributed storage device in the information distributed storage system holds one piece of the shared information, so only the one additional piece needed for restoration has to be obtained over the network. This shortens the network transfer time and greatly reduces the time needed to restore the original data S.
In addition, a single information distributed storage system can be shared by multiple customers, who can share its cost, so the actual operating cost can be lowered. For example, if four different customers each use a dedicated information distributed storage device as the entry point for information distribution processing and use the remaining three devices as storage locations, they share one information distributed storage system, and the cost borne by each customer is reduced.

FIG. 1 is a configuration diagram of the information distribution system in Embodiment 1.
FIG. 2 is a processing flow diagram for generating shared information from original data.
FIG. 3 is a processing flow diagram for restoring the original data.
FIG. 4 is a configuration diagram of the information distributed storage system in Embodiment 2.
FIG. 5 is a configuration diagram in which original data is input from a terminal.
FIG. 6 is a configuration diagram in which original data is input from a terminal on a network.

Hereinafter, embodiments of the present invention will be described in detail. First, the symbols used in this embodiment are defined as follows.

+: Addition operator. The + operation between partial information and random number elements is a bitwise exclusive OR.

S_i (i = 1, 2, 3, ..., d):
Bit strings representing the original data S. S_1, S_2, S_3, ..., S_d are the elements (bit strings) obtained by dividing the original data S into d parts; each is a piece of partial information.
R_i (i = 1, 2, 3, ..., m):
Bit strings representing random number elements. R_1, R_2, R_3, ..., R_m are random number elements (bit strings) of the same size as the partial information.
V_1, V_2:
Reference vectors. Each arranges (d + m)/2 of the partial information and random number elements. In this specification, the explanation places the random number elements together in vector V_1.
W_i:
Shared information vector. A vector of (d + m)/2 pieces of partial shared information; bit-concatenating its elements yields the shared information data. With i = 1, 2, 3, 4, four pieces of shared information are produced.
W_(iu):
The elements of shared information vector W_i, each of which is a piece of partial shared information (u = 1, 2, 3, ..., (d + m)/2).
I and I_i:
Identity matrix. A square matrix whose diagonal components are 1 and all other components are 0.
K_(i,j) (i = 1, 2, 3, 4; j = 1, 2):
Coefficient matrices applied to the reference vectors V_1 and V_2 when generating W_i, denoted K_(i,1) and K_(i,2) respectively.

Coefficient matrix for restoring the reference vectors V_1 and V_2 from W_i and W_j. It consists of the elements of the inverse of the matrix whose elements are K_(i,1), K_(i,2), K_(j,1), and K_(j,2).

FIG. 1 is a block diagram outlining the information distribution system according to the present invention, and FIG. 2 outlines the processing flow. The present invention is applied to a (2, 4) threshold information distribution system, which is the most suitable for actual operation in terms of cost and other factors: the original data S is distributed into four pieces of shared information, and the original data S is restored from any two of them. The algorithm is, however, also applicable to a general (k, n) threshold information distribution system. In that case, "(d + m)/2" in the following description is replaced by "(d + m)/k"; m is (d + m)/k or less; (d + m)/k is an integer of 2 or more; the number of reference vectors described below is k; and n coefficient matrices of size (d + m)/k are required for each reference vector (k × n in total).

As shown in FIGS. 1 and 2, the input parameters are specified first: the division number d, used to divide the original data S (the secret to be managed) into partial information, and the number m of random number elements to generate. The conditions are that m is d or less and that (d + m)/2 is an integer of 2 or more (S101).
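The parameter conditions of step S101 can be checked mechanically. The sketch below is one possible reading in Python; the function names are hypothetical, and the requirement m >= 1 is an added assumption (the text only states that m is d or less). The second function tests the separate condition 2^m - 1 ≥ (d + m)/2 under which the first invention combines the m random numbers into new random numbers.

```python
def check_parameters(d: int, m: int) -> int:
    """Return p = (d + m) / 2 if (d, m) satisfy the conditions of S101."""
    if d < 1 or m < 1:                      # m >= 1 is an assumption of this sketch
        raise ValueError("d and m must be positive")
    if m > d:
        raise ValueError("m must be d or less")
    if (d + m) % 2 != 0:
        raise ValueError("(d + m) / 2 must be an integer")
    p = (d + m) // 2
    if p < 2:
        raise ValueError("(d + m) / 2 must be 2 or more")
    return p


def can_combine_randoms(d: int, m: int) -> bool:
    """True when 2^m - 1 >= (d + m) / 2, the condition for combining random numbers."""
    return 2 ** m - 1 >= check_parameters(d, m)


p_example = check_parameters(6, 2)          # d = 6, m = 2 gives p = 4
```

For instance, d = 6 and m = 2 pass the basic check but fail the combining condition (3 < 4), while d = 5 and m = 3 satisfy both (7 ≥ 4).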

Next, the original data dividing means 11 divides the original data S equally into d pieces of partial information S_1, S_2, S_3, ..., S_d (S102). The random number generating means 12 generates m random number elements R_1, R_2, R_3, ..., R_m (S103). The size (bit length) of each random number element is the same as the size (bit length) of one piece of partial information.
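Steps S102 and S103 might be sketched in Python as follows. The function names are hypothetical, the data is handled at byte rather than bit granularity, and the caller is assumed to pad S to a multiple of d beforehand (the text does not describe padding). Python's secrets module stands in for whatever random source an implementation would use.

```python
import secrets


def split_equal(data: bytes, d: int) -> list[bytes]:
    """Divide data into d equal-length pieces S_1 .. S_d (S102)."""
    if d < 1:
        raise ValueError("d must be positive")
    if len(data) % d != 0:
        raise ValueError("data length must be a multiple of d; pad it first")
    size = len(data) // d
    return [data[i * size:(i + 1) * size] for i in range(d)]


def random_elements(m: int, size: int) -> list[bytes]:
    """Generate m random elements R_1 .. R_m, each `size` bytes long (S103)."""
    return [secrets.token_bytes(size) for _ in range(m)]


parts = split_equal(b"abcdefgh", 4)         # four 2-byte pieces
randoms = random_elements(2, len(parts[0])) # two random elements of the same size
```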

Next, the reference vector generating means 13 sorts the d pieces of partial information S_i (i = 1, 2, 3, ..., d) and the m random number elements R_i (i = 1, 2, 3, ..., m) into two groups of (d + m)/2 elements each and arranges each group in order to generate the two reference vectors V_1 and V_2 (S104). The elements within V_1 and V_2 can also be arranged in other ways, but the arrangement affects the coefficient matrix patterns described later.

Each element of a reference vector is a piece of partial information or a random number element, all of the same size.
Gathering the random number elements in the first reference vector V_1 makes the coefficient matrices simpler and the processing easier, but the arrangement is not necessarily limited to the pattern of Expression 1.
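A possible construction of the two reference vectors (S104) is sketched below in Python. Expression 1 is not reproduced in this text, so the exact ordering used here, random elements first in V_1 followed by the partial information in index order, is an assumption; the function name is also hypothetical. Because m is d or less, all random elements fit into V_1.

```python
def build_reference_vectors(parts: list[bytes], randoms: list[bytes]):
    """Arrange d pieces and m random elements into V_1 and V_2 of length (d + m) / 2."""
    d, m = len(parts), len(randoms)
    if m > d or (d + m) % 2 != 0:
        raise ValueError("need m <= d and an even d + m")
    p = (d + m) // 2
    # Assumed layout: R_1..R_m, then S_1..S_d, cut into two halves of length p.
    elements = list(randoms) + list(parts)
    return elements[:p], elements[p:]


# Labels are used instead of real data to make the layout visible.
V1, V2 = build_reference_vectors(
    [b"S1", b"S2", b"S3", b"S4"],
    [b"R1", b"R2"],
)
```

With d = 4 and m = 2, this yields V_1 = (R_1, R_2, S_1) and V_2 = (S_2, S_3, S_4).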

Next, the shared information generating means 14 multiplies each of the reference vectors V_1 and V_2 by a different coefficient matrix prepared in advance (or input from outside) and adds the resulting vectors to obtain a shared information vector W of size (d + m)/2 (S105).
Each shared information vector W_i (i = 1, 2, 3, 4) is a column vector of (d + m)/2 components. Each element of W_i is the same size as a piece of partial information (1/d of the original data), and the elements together form the shared information data, whose total size is (d + m)/(2d) times that of the original data S.

Here, p = (d + m)/2, and the superscript T on a vector denotes the transpose. Addition (the + operator) between elements (vector components) is to be read as an exclusive OR (XOR) operation.
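The size relation stated above can be tabulated with a short Python sketch. The function names are hypothetical; the ratio follows from each share holding (d + m)/2 elements of size 1/d of the original data.

```python
from fractions import Fraction


def share_size_ratio(d: int, m: int) -> Fraction:
    """Size of one share relative to the original data: (d + m) / (2 d)."""
    return Fraction(d + m, 2 * d)


def total_storage_ratio(d: int, m: int, shares: int = 4) -> Fraction:
    """Total size of all shares relative to the original data."""
    return shares * share_size_ratio(d, m)


full_random = share_size_ratio(4, 4)    # m = d: share is as large as the data
reduced = share_size_ratio(6, 2)        # fewer random elements: smaller share
```

With m = d the ratio is 1, matching the statement that generating as many random numbers as original data makes the shared information the same size as the original data; with d = 6 and m = 2 each share is 2/3 of the original, or 8/3 in total for four shares.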

The following expressions are used to generate the four different shared information vectors W_i (i = 1, 2, 3, 4).
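The expressions themselves are not reproduced in this text. Based only on the surrounding prose (each reference vector is multiplied by its own coefficient matrix and the results are added, with addition read as XOR), they presumably take the following form. The element-wise notation V_{1,v} and V_{2,v} for the v-th elements of V_1 and V_2 is introduced here for illustration.

```latex
\begin{align*}
W_i &= K_{(i,1)} V_1 + K_{(i,2)} V_2, \qquad i = 1, 2, 3, 4 \\
W_i &= (W_{i1}, W_{i2}, \ldots, W_{ip})^{T}, \qquad p = \frac{d + m}{2} \\
W_{iu} &= \bigoplus_{v=1}^{p} \bigl(K_{(i,1)}\bigr)_{uv} V_{1,v}
          \;\oplus\; \bigoplus_{v=1}^{p} \bigl(K_{(i,2)}\bigr)_{uv} V_{2,v},
          \qquad u = 1, \ldots, p
\end{align*}
```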

The coefficient matrices K_(i,1) and K_(i,2) held in the coefficient matrix storage unit 14a are binary square matrices of 0s and 1s with (d + m)/2 rows and columns. They define which elements of the same reference vector are added (XORed) to form each new element. Two coefficient matrices are needed for each shared information vector, so creating four shared information vectors requires eight in total (number of reference vectors × number of shares). The coefficient matrix patterns are classified by reference vector.
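How a pair of 0/1 coefficient matrices turns the reference vectors into one share can be sketched in Python as follows. The matrices K1 and K2 below are hypothetical examples chosen only to show the mechanics; they are not the patterns defined in the specification, and the function names are likewise assumptions.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def gf2_matvec(K: list[list[int]], V: list[bytes]) -> list[bytes]:
    """Row u of the result is the XOR of V[v] for every v with K[u][v] == 1."""
    size = len(V[0])
    result = []
    for row in K:
        acc = bytes(size)                    # all-zero element of the right size
        for bit, element in zip(row, V):
            if bit:
                acc = xor_bytes(acc, element)
        result.append(acc)
    return result


def share_vector(K1, K2, V1, V2) -> list[bytes]:
    """W = K1 * V1 + K2 * V2, with + read as element-wise XOR."""
    return [xor_bytes(a, b) for a, b in zip(gf2_matvec(K1, V1), gf2_matvec(K2, V2))]


# Toy case with d = 3, m = 1, p = 2: V1 = (R1, S1), V2 = (S2, S3).
V1 = [b"\x0f", b"\x01"]
V2 = [b"\x02", b"\x04"]
K1 = [[1, 1], [1, 0]]                        # hypothetical pattern
K2 = [[1, 0], [0, 1]]                        # hypothetical pattern
W = share_vector(K1, K2, V1, V2)             # (R1+S1+S2, R1+S3)
share = b"".join(W)                          # concatenate elements into one share
```

In this toy case both elements of W include the random element R1, which is the property that condition (1) on the coefficient matrices asks for.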

The definition of the coefficient matrices is central to the present invention. Coefficient matrices satisfying conditions (1) and (2) below are either prepared in advance or supplied from outside.
(1) When partial information and random number elements are mixed in one reference vector, the coefficient matrix is defined so that a random number element is added to every component of the result vector when the coefficient matrix is multiplied by the reference vector. In this product, only the reference vector elements corresponding to the entries with value 1 in each row of the coefficient matrix are added (XORed). Therefore, for each row in which partial information appears in the reference vector, the corresponding entries of the coefficient matrix row are set to 1 so that the partial information element and a random number element are added (XORed).
Note that when the reference vector contains few random number elements and the same elements are reused, XORing the results with the same random number element makes that element recur more often in the output. To prevent this, the coefficient pattern should be specified so that each element of the shared information vector is the result of an operation with several different random number elements rather than with a single one. When m random number elements are specified, the number of their combinations is 2^m − 1, so the period with which random number elements recur can be lengthened to 2^m − 1. For example, with three random number elements R_1, R_2, R_3, the combinations yield 2^3 − 1 (= 7) different random number elements, as shown in Equation 4, and each can be regarded as a distinct random number (the result of an operation between random numbers is itself a random number).

In other words, this can be regarded as equivalent to increasing the number of random number elements. When 2^m − 1 (the number of different random numbers obtained by calculation) is larger than the number of rows of the coefficient matrix, (d + m)/2, that is, when the following expression is satisfied, specifying the coefficient matrix pattern by the method described later eliminates the periodicity in which the same random number element appears in the result of the operation with the reference vector.

(2) When the reference vector contains no random number elements, the coefficient matrix is defined so that, when it is multiplied by the reference vector, the order of the elements of the result vector differs completely, for each coefficient matrix, from the order of the elements of the original reference vector.
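The count of 2^m − 1 combinations in condition (1) can be checked with a short sketch. The Python below enumerates every non-empty XOR combination of m random number elements. The function name `xor_combinations` and the sample values are assumptions for illustration; the three sample values are deliberately chosen so that all seven results differ, which real random values of sufficient length would only do with high probability.

```python
from functools import reduce
from itertools import combinations
from operator import xor


def xor_combinations(randoms):
    """Return the XOR of every non-empty subset of the given elements."""
    results = []
    for size in range(1, len(randoms) + 1):
        for subset in combinations(randoms, size):
            results.append(reduce(xor, subset))
    return results


# Hypothetical values standing in for R_1, R_2, R_3.
R = [0b001, 0b010, 0b100]
combos = xor_combinations(R)
print(len(combos))  # 7, i.e. 2**3 - 1
```

With m = 3 the list holds R_1, R_2, R_3, R_1+R_2, R_1+R_3, R_2+R_3 and R_1+R_2+R_3, matching the seven elements described for Equation 4.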

A more detailed description follows. The coefficient matrices K_{(i,j)} (i = 1, ..., 4; j = 1, 2), composed of 0s and 1s, and the identity matrix I are recorded in memory. They could of course be supplied from outside as definition patterns, but here they are assumed to be recorded in memory. The recorded coefficient matrices are defined so that the determinant shown on the left-hand side of Equation 6 is nonzero under arithmetic modulo 2. Their detailed patterns are given in Equations 12 to 15.

The four shared information vectors W_1, W_2, W_3, W_4 are generated using the following equations.

Expressed differently, this becomes the following expression, which is identical to Equation 3 and is restated here.

The detailed content (patterns) of the coefficient matrices is described below.
The coefficient matrix patterns (binary square matrices) applied to the first reference vector V_1 are given in Equations 12 and 13. Before that, the basic pattern matrix on which they are based is shown in Equation 9. This matrix Z is a square matrix of side 2^m − 1.

In Equation 9, the leftmost block column contains the matrix blocks (I_1 and A_2 to A_m) that indicate the combinations of random numbers. All of these blocks have the same number of columns (the number of random numbers, m), but except for I_1 they are not necessarily square. The other blocks, I_1 to I_m, are diagonal square matrices. I_1 is a diagonal matrix of size m, and I_2 to I_m are diagonal square matrices whose side lengths are the numbers of rows of A_2 to A_m, respectively.
A_i has m columns and C(m, i) rows (the number of ways of choosing i distinct items out of m); each row has 1 in the i selected columns and 0 elsewhere.
Here the subscript i runs over i = 2, ..., m.
Equation 10 shows examples of A_2 to A_m for m = 5 random numbers. In each row, 1 marks a random number element among R_1 to R_m that is selected (used in the XOR operation) and 0 marks one that is not selected (not used in the XOR operation).
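The structure of the leftmost block column (I_1 stacked on A_2 to A_m) can be sketched as follows. The function names `selection_rows` and `first_m_columns` are hypothetical, and the lexicographic row order within each A_i is an assumption, since Equations 9 and 10 are not reproduced here. It is at least consistent with the fifth row "1100" mentioned in the d = 26, m = 4 example.

```python
from itertools import combinations


def selection_rows(m, i):
    """Rows of A_i: one row per way of choosing i of the m columns."""
    rows = []
    for chosen in combinations(range(m), i):
        rows.append([1 if col in chosen else 0 for col in range(m)])
    return rows


def first_m_columns(m):
    """Stack I_1 (the i = 1 case) on top of A_2 ... A_m."""
    block = []
    for i in range(1, m + 1):
        block.extend(selection_rows(m, i))
    return block


print(len(selection_rows(5, 2)))   # 10 rows in A_2 for m = 5
print(len(first_m_columns(5)))     # 31 rows, i.e. 2**5 - 1
print(first_m_columns(4)[4])       # [1, 1, 0, 0]
```

The total number of rows is the sum of C(m, i) for i = 1 to m, which equals 2^m − 1, the side length stated for the matrix Z.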

The coefficient matrix pattern applied to the first reference vector V_1 is a square matrix of side (d + m)/2. When (d + m)/2 is less than or equal to 2^m − 1, it is the matrix shown in Equation 12. When (d + m)/2 is greater than 2^m − 1, it is the matrix shown in Equation 13, which reuses the matrix Z of Equation 9. That is, the first m columns repeat I_1 and A_2 to A_m of Equation 9 in the vertical direction, the matrix is extended so that all diagonal elements are 1, and a block of side (d + m)/2 is cut out starting from the upper left. The pattern of Equation 10 is particularly effective for reducing the amount of random numbers, but it is not desirable for security because the random numbers repeat.

The coefficient matrix pattern applied to the second reference vector V_2 is also a square matrix of side (d + m)/2, but it differs depending on whether this size (d + m)/2 is even or odd. Equation 14 gives the pattern for the even case and Equation 15 for the odd case.

The elements of each shared information vector W_i obtained above are concatenated (as bit strings), vector by vector, to obtain the shared information (S106). Four different pieces of shared information are generated, transmitted over the network by the shared information transmission means 15 to the storage servers (storage) 2a to 2d, and stored in the respective storage servers.

A header part is added to each piece of shared information data generated above to produce the final shared information (steps S107 and S108).
The header consists of a first header part and a second header part. The first header part stores a code identifying the formula used to compute the shared information vector (for example, the index i of the shared information vector W_i) and the sizes of the header part and the shared information data part. The second header part stores the coefficient matrices used in that formula.

To increase the confidentiality of the coefficient matrices in the second header part, shared information data is created from the second header part by the procedure of steps S101 to S106, with the second header part taking the place of the original data S. To keep this processing simple, d = m = 2 is used, and the coefficient matrices for this case can be stored as fixed values rather than being included in the shared information.

Next, the process of restoring the original data S from the shared information stored in the storage servers (storage) 2a to 2d is described with reference to FIGS. 1 and 3.

The shared information receiving means 16 of FIG. 1 collects any two pieces of shared information; each is divided into (d + m)/2 parts, which are arranged in order to obtain a shared information vector. Two shared information vectors result. Calling them W_i and W_j, two equations of the form of Equation 3 can be written, as in Equation 16. Since this gives two equations containing the reference vectors V_1 and V_2, once the inverse of the coefficient matrix is found, the original reference vectors can be obtained by solving the simultaneous equations with V_1 and V_2 as unknowns.

The original partial information is extracted from the components of the recovered reference vectors and concatenated (as bit strings) to restore the original data S.

The coefficient matrix inverse storage unit 17a shown in FIG. 1 stores precomputed inverse matrix patterns of the coefficient matrices K_{(i,j)} used to obtain V_1 and V_2.

The process of restoring the original data S is described in more detail with reference to FIG. 3. The header part and the shared information data part are separated from each piece of shared information. If the header part is itself shared information data, the following processing is applied to the header part to restore the coefficient matrices K_{(i,j)} from it.

First, each of the two shared information data parts is divided equally into (d + m)/2 pieces of partial shared information, which are arranged in order to generate the two shared information vectors W_i and W_j (S202, S203).
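The equal division in steps S202 and S203, and the concatenation that produced the data part in the first place, can be sketched as below. The function name `split_equal` is hypothetical, and the sketch assumes the data length is an exact multiple of the number of parts; how padding is handled is not stated in this section.

```python
def split_equal(data, parts):
    """Split a byte string into `parts` pieces of equal length, in order."""
    if parts <= 0 or len(data) % parts:
        raise ValueError("data length must be a positive multiple of parts")
    size = len(data) // parts
    return [data[k * size:(k + 1) * size] for k in range(parts)]


# Hypothetical share data part for d = 7, m = 3, so (d + m) // 2 == 5 elements.
d, m = 7, 3
share_data = b"abcdefghij"
W = split_equal(share_data, (d + m) // 2)
print(W)  # [b'ab', b'cd', b'ef', b'gh', b'ij']

# Concatenating the elements in order reproduces the data part.
assert b"".join(W) == share_data
```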

Next, the following equation is generated in order to restore the original data S from the two shared information vectors W_i and W_j (step S204).

Equation 17 yields the following equation. This is the same as solving simultaneous linear equations with V_1 and V_2 as unknowns (steps S205, S206).

Under modulo-2 arithmetic, and using the condition K_{(i,1)} = K_{(j,1)}, the solution is calculated as follows.

In the following expression, I is the identity matrix of dimension (d + m)/2 (only the diagonal elements are 1; all others are 0).

Rewritten, Equation 19 becomes the following expression, from which the d pieces of partial information S_1, S_2, S_3, ..., S_d can be obtained.
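One way to read the solution step is sketched below, assuming each share has the form W_i = K_{(i,1)} V_1 + K_{(i,2)} V_2 modulo 2 and that K_{(i,1)} = K_{(j,1)}, as stated above. XORing the two shares then cancels the V_1 term and leaves an equation in V_2 alone. All function names and the 2 × 2 matrices are illustrative assumptions, not the patterns of Equations 12 to 15, and this is not claimed to be the exact form of Equation 19.

```python
from functools import reduce
from operator import xor


def matvec(K, v):
    """Binary matrix times vector of bit strings (ints); addition is XOR."""
    return [reduce(xor, (x for k, x in zip(row, v) if k), 0) for row in K]


def mat_xor(A, B):
    """Entry-wise sum of two binary matrices modulo 2."""
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inverse(M):
    """Gauss-Jordan inverse modulo 2; raises ValueError if M is singular."""
    n = len(M)
    aug = [list(row) + [int(r == c) for c in range(n)]
           for r, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo 2")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def recover(K1, Ki2, Kj2, Wi, Wj):
    """Solve for (V1, V2) given two shares that use the same K1."""
    # W_i + W_j = (K_(i,2) + K_(j,2)) V_2, because the K1 V_1 terms cancel.
    diff = [a ^ b for a, b in zip(Wi, Wj)]
    V2 = matvec(inverse(mat_xor(Ki2, Kj2)), diff)
    # W_i + K_(i,2) V_2 = K1 V_1.
    rest = [a ^ b for a, b in zip(Wi, matvec(Ki2, V2))]
    V1 = matvec(inverse(K1), rest)
    return V1, V2


# Hypothetical matrices and reference vectors for a round trip.
K1 = [[1, 1], [0, 1]]
Ki2 = [[1, 0], [0, 1]]
Kj2 = [[0, 1], [1, 1]]
V1, V2 = [0x12, 0x34], [0x56, 0x78]
Wi = [a ^ b for a, b in zip(matvec(K1, V1), matvec(Ki2, V2))]
Wj = [a ^ b for a, b in zip(matvec(K1, V1), matvec(Kj2, V2))]
print(recover(K1, Ki2, Kj2, Wi, Wj) == (V1, V2))  # True
```

The sketch needs both K_{(i,1)} and K_{(i,2)} + K_{(j,2)} to be invertible modulo 2, which is in the spirit of the nonzero-determinant requirement described for Equation 6. In practice the inverses would be precomputed, as the inverse storage unit 17a does.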

The vectors V_1 and V_2 have partial information and random numbers as their elements, and the order of the elements within them is defined at the outset, so the d pieces of partial information S_1, S_2, S_3, ..., S_d can be extracted from them. The extracted partial information is concatenated to obtain the original data S (S207).

In the embodiment above, the coefficient matrix patterns are given in Equations 12 to 15. Multiplying a reference vector by such a coefficient matrix defines a pattern of XOR combinations among the components of the reference vector, and many different patterns can be defined. The specific examples below include samples of other coefficient matrix patterns. The partial information and random number elements that make up a reference vector can also be arranged in various ways, and the coefficient matrices must be defined with that arrangement in mind.

(Specific example with d = 7, m = 3)
This example covers the case in which the original data S is divided into 7 parts and there are 3 random number elements.

With these values, (d + m)/2 = 5 and 2^m − 1 = 7. In this case the size of each piece of shared information is about 0.71 times the size of the original data S ((d + m)/(2d) = 10/14).

In this case, the shared information vectors W_1, W_2, W_3, W_4 are as follows.

The restoration example here uses the shared information vectors W_1 and W_2; restoration from any other pair W_i, W_j works in the same way. Applying Equation 17 gives Equation 23.

Written in the form of Equation 18, Equation 23 uses the following matrices.

The original data can be restored from the shared information vectors W_1 and W_2 by the calculation of Equation 23. The restoration formulas for each pair chosen from the shared information vectors W_1, W_2, W_3, W_4 are as follows.

Restoration from shared information vectors W_1 and W_2.
Because the associative and commutative laws hold for the expressions of each component of the right-hand-side matrix, the amount of computation in a program can be reduced by factoring out common terms and computing them first. The same applies to the restoration formulas that follow.

Restoration from shared information vectors W_1 and W_3.

Restoration from shared information vectors W_1 and W_4.

Restoration from shared information vectors W_2 and W_3.

Restoration from shared information vectors W_2 and W_4.

Restoration from shared information vectors W_3 and W_4.

(Specific example with d = 26, m = 4)
As a specific example with few random number elements, consider d = 26, m = 4. To examine the periodicity of the random numbers, only the product of the first reference vector and its coefficient matrix is shown here. Since (d + m)/2 = 15, the reference vector is a column vector with 15 elements and the coefficient matrix is a 15 × 15 square matrix.

Because m = 4, columns 1 to 4 show, for each row, the combination pattern of the four random numbers R_1 to R_4. Rows 1 to 4 indicate that R_1 to R_4 are each used alone. The pattern "1100" in row 5 indicates that the result of R_1 + R_2 is used. In general, the first m columns give the random number combination pattern.
The coefficient matrices K_{(1,2)}, K_{(2,2)}, K_{(3,2)}, K_{(4,2)} applied to the second reference vector differ depending on whether the coefficient matrix size (d + m)/2 is even or odd.

To show the periodicity, only the first term is computed; the result is as follows.

The size of each piece of shared information is (d + m)/(2d) times the size of the original data S. Therefore, provided d and m satisfy the expression above, the smaller m is relative to d, the smaller the shared information becomes. In this example the number of divisions d is 26 and the number of random number elements m is 4, so each piece of shared information is about 0.58 times the original size ((26 + 4)/52 ≈ 0.577).
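The size ratio (d + m)/(2d) is easy to tabulate for the three parameter choices discussed in this description. The sketch below is only a calculator for that formula; the function name `share_size_ratio` is a hypothetical label.

```python
from fractions import Fraction


def share_size_ratio(d, m):
    """Size of one piece of shared information relative to the original data.

    Each piece holds (d + m) / 2 elements, and each element is 1/d of the
    original data, giving (d + m) / (2 * d).
    """
    return Fraction(d + m, 2 * d)


for d, m in [(7, 3), (26, 4), (3, 3)]:
    ratio = share_size_ratio(d, m)
    print(f"d={d}, m={m}: {ratio} = {float(ratio):.3f}")
```

The exact fractions are 5/7, 15/26 and 1 respectively, which shows how the share shrinks as m becomes small relative to d and reaches the size of the original data when m = d.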

(Specific example with m = d)
Next, consider m = d, that is, the case in which the number of divisions of the original data S equals the number of random number elements. The size of each piece of shared information is then the same as the size of the original data S. In existing algorithms (for example, Patent Document 2), the number of divisions d is restricted to a prime number of 5 or more minus 1. The present invention has no such restriction: using the coefficient matrix patterns below, the shared information vectors can be generated and the original data S can be restored.

The following example takes m = d with 3 divisions (an odd number) and 3 random number elements.
Division into an odd number of parts is impossible in Patent Document 2, but it is possible in the present invention, as shown below.

Table 2 lists the elements of the four shared information vectors W_1, W_2, W_3, W_4 in tabular form, with W_1, W_2, W_3, W_4 in rows 1 to 4 in that order.

The formulas for restoration from any two of W_1, W_2, W_3, W_4 are as follows.
Restoration of the original data from W_1 and W_2.

Restoration of the original data from W_1 and W_3.

Restoration of the original data from W_1 and W_4.

Restoration of the original data from W_2 and W_3.

Restoration of the original data from W_2 and W_4.

Restoration of the original data from W_3 and W_4.

The basic methods of distributing and restoring the original data S have been described above. The following describes how lost or destroyed shared information is regenerated from any two pieces of shared information. With this method, the lost data (shared information) can be regenerated from the remaining shared information without going through restoration of the original data S, which is highly effective for speeding up recovery from partial loss or destruction.

From shared information vectors W_i and W_j (i ≠ j), the shared information vector W_k (k ≠ i, k ≠ j) is obtained by the following equation.
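The equation itself is not reproduced in this text. As a hedged illustration of why regeneration without restoring S is possible, the derivation below assumes each share has the form W_i = K_{(i,1)} V_1 + K_{(i,2)} V_2 modulo 2, that the first coefficient matrix is the same for every share (written K_1 here), and that K_{(i,2)} + K_{(j,2)} is invertible modulo 2. Under those assumptions, which are a reading of the description rather than the patent's stated formula, W_k is a fixed linear combination of W_i and W_j:

```latex
\begin{aligned}
W_i + W_j &= \bigl(K_{(i,2)} + K_{(j,2)}\bigr) V_2
  &&\text{(the } K_1 V_1 \text{ terms cancel modulo 2)}\\
V_2 &= \bigl(K_{(i,2)} + K_{(j,2)}\bigr)^{-1} (W_i + W_j)\\
K_1 V_1 &= W_i + K_{(i,2)} V_2\\
W_k &= K_1 V_1 + K_{(k,2)} V_2
     = W_i + \bigl(K_{(i,2)} + K_{(k,2)}\bigr) V_2\\
    &= W_i + \bigl(K_{(i,2)} + K_{(k,2)}\bigr)
       \bigl(K_{(i,2)} + K_{(j,2)}\bigr)^{-1} (W_i + W_j)
\end{aligned}
```

In this form the matrix product in the last line depends only on the coefficient matrices, so it could be precomputed for each triple (i, j, k), and neither K_1^{-1} nor the partial information S_1, ..., S_d is ever needed.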

The following describes how two lost pieces of shared information are regenerated from the two pieces that remain, without restoring the original data.
Regenerating shared information vectors W_3 and W_4 directly from W_1 and W_2.

Regenerating shared information vectors W_2 and W_4 directly from W_1 and W_3.

Regenerating shared information vectors W_2 and W_3 directly from W_1 and W_4.

Regenerating shared information vectors W_1 and W_4 directly from W_2 and W_3.

Regenerating shared information vectors W_1 and W_3 directly from W_2 and W_4.

Regenerating shared information vectors W_1 and W_2 directly from W_3 and W_4.

According to the present invention, a low-cost information distributed storage system can be realized using an information distribution system based on the (2, 4) threshold sharing method. The following describes an information distributed storage system that uses the information distribution system of Embodiment 1. The storage system can, however, also be realized with information distribution systems other than that of Embodiment 1, using various methods such as conventionally known methods (algorithms) for generating shared information and for restoring the original data S. It is also applicable not only to the (2, 4) threshold sharing method but to any (k, n) threshold sharing method expressed in general form.

As shown in FIG. 4, four information distributed storage devices are connected by a network and together form one information distributed storage system. Each device contains an information distribution system unit and a storage unit, so it can both generate shared information from the original data S and store shared information. In this system, the shared information generated by one device is stored in its own storage unit and in the storage units of the other devices in the same system, so operation can continue on the remaining devices even if any two devices fail.

The configuration of the information distributed storage system shown in FIG. 4 is described in detail below.
The system consists of the four information distributed storage devices 20a to 20d of FIG. 4 connected by a network. The original data S is input from a terminal connected to the network to a designated information distributed storage device. There may be more than one terminal (22 in FIG. 4). The network to which the terminals are connected can also be a separate network, as shown in FIG. 6.

As indicated by 201 to 204 in FIG. 4, each information distributed storage device comprises an information distribution system unit, a distributed catalog information generation unit, a device monitoring unit, and a storage unit. The information distribution system unit generates shared information and restores the original data S as described in Embodiment 1, although other means and methods may be used. The distributed catalog information generation unit creates and updates a database that brings together the management information for each item of original data S and the data indicating where its shared information is stored (distributed catalog information).

The device monitoring unit periodically checks whether the device's own information distribution function and storage unit are operating, and periodically queries the other information distributed storage devices in the system to check whether they are operating normally.

The storage unit is connected to large-capacity storage and holds the shared information and the distributed catalog information (database).

The processing performed by the information distributed storage system is as follows.
The original data S input from a terminal is passed to the designated information distributed storage device.
The information distribution system unit in that device generates four pieces of shared information. One is stored in the device's own storage unit, and the remaining three are transferred to the other information distributed storage devices in the same system and stored in their storage units.

If a destination device has failed, the transfer skips it; processing can continue with up to two failed devices. If the (pre-designated) device that receives the original data S from a terminal fails, processing switches automatically so that the device with the next priority receives the original data S.
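The failure handling described above can be sketched as two small functions. The names `pick_receiver` and `place_shares`, the device labels used as dictionary keys, and the way device health is represented are all assumptions made for illustration; the threshold of 2 reflects the (2, 4) scheme, in which two pieces of shared information suffice for restoration.

```python
def pick_receiver(priority, device_ok):
    """Return the first working device in priority order."""
    for dev in priority:
        if device_ok[dev]:
            return dev
    raise RuntimeError("no working device can accept the original data")


def place_shares(shares, device_ok, threshold=2):
    """Keep only the shares whose destination device is working."""
    placed = {dev: share for dev, share in shares.items() if device_ok[dev]}
    if len(placed) < threshold:
        raise RuntimeError("too many failures: restoration would be impossible")
    return placed


# Hypothetical state: devices 20a and 20d have failed.
device_ok = {"20a": False, "20b": True, "20c": True, "20d": False}
shares = {"20a": b"W1", "20b": b"W2", "20c": b"W3", "20d": b"W4"}

receiver = pick_receiver(["20a", "20b", "20c", "20d"], device_ok)
placed = place_shares(shares, device_ok)
print(receiver)        # 20b
print(sorted(placed))  # ['20b', '20c']
```

With two failures the two stored pieces are still enough to restore S; a third failure makes `place_shares` raise, since a single piece cannot restore the data.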

There can be multiple terminals, and the load on the overall system can be spread by assigning a different receiving device to each terminal.

When shared information is stored, the management information of the original data S and the information indicating where it is stored (something like an address book for the original data S and its shared information) are generated as "catalog information". To keep the catalog information itself confidential, shared information of the catalog information (distributed catalog information) is generated by the same means used to generate shared information from the original data S. The distributed catalog information is stored as a database in the storage unit of each information distributed storage device.

Each information distributed storage device has a device monitoring unit that periodically checks whether its own information distribution system unit and storage unit are operating without failure. This is done by checking that designated data is being updated at a fixed period. The unit also checks over the network whether the other devices in the same system are operating without failure, again by checking that designated data in each of those devices has been updated. If the target cannot be accessed at the operating system (OS) level, the update cannot be confirmed, so the device is judged to have failed.
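A minimal sketch of this update-based liveness check follows. The function name `is_alive`, the use of timestamps in seconds, and the `tolerance` factor that allows for some delay are assumptions; the description only says that designated data must be seen to be updated at a fixed period.

```python
def is_alive(last_update, now, period, tolerance=1.5):
    """Judge a device by how recently its designated data was updated.

    last_update is None when the target could not be accessed at all,
    in which case the update cannot be confirmed and the device is
    treated as failed.
    """
    if last_update is None:
        return False
    return (now - last_update) <= period * tolerance


# Hypothetical readings with a 10-second update period.
print(is_alive(100.0, 105.0, 10.0))  # True: updated 5 s ago
print(is_alive(100.0, 130.0, 10.0))  # False: no update for 30 s
print(is_alive(None, 130.0, 10.0))   # False: target unreachable
```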

Each of the four devices in the system holds, in its storage unit, shared information and distributed catalog information for all of the original data S, so the original data S can be restored starting from any device.

The information distributed storage system of Embodiment 2 has the following effects.
Because each device in the system holds one piece of shared information, only the one additional piece needed for restoration has to be obtained over the network. This reduces network transfer time and shortens the time needed to restore the original data S.
In addition, one system can be shared by multiple customers and its cost divided among them, which lowers the actual operating cost. For example, if four customers each use a dedicated device as the entry point for information distribution processing and use the remaining three devices as storage locations, they can share one information distributed storage system.

The system shown in FIG. 5 is a variation in which four information distributed storage devices are connected by a network. It performs the following processing.

When the original data S is transferred from a terminal to a designated information distributed storage device on the network, the information distribution system in that device distributes the original data S to the storage units of the information distributed storage devices (30a to 30d) in the same system. If the address of the terminal (computer) holding the original data S and the file storage location are specified in advance in one of the information distributed storage devices, that device periodically checks the specified location and, if it finds original data S (a file) that has not yet been processed, distributes it in the same way. When the terminal requests restoration of the original data S, the device restores it from the one piece of shared information held in its own storage and one piece of shared information held in another information distributed storage device.
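The periodic check for unprocessed original data can be sketched as a single polling cycle. The function name `poll_once`, the `list_files` and `distribute` callables, and the in-memory `processed` set are hypothetical; the specification does not state how processed files are tracked.

```python
from typing import Callable, Iterable, List, Set


def poll_once(list_files: Callable[[], Iterable[str]],
              processed: Set[str],
              distribute: Callable[[str], None]) -> List[str]:
    """Run one polling cycle over the pre-specified location.

    list_files returns the file names currently present at the location.
    processed holds the names that have already been distributed (assumed
    bookkeeping). distribute performs the distribution processing for one
    file. Returns the names handled in this cycle.
    """
    handled = []
    for name in sorted(list_files()):
        if name in processed:
            continue  # already distributed in an earlier cycle
        distribute(name)
        processed.add(name)
        handled.append(name)
    return handled
```

A device would call `poll_once` at its chosen interval; a file is marked as processed only after `distribute` returns, so a file whose distribution raises an error is retried in the next cycle.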

In FIG. 6, four information distributed storage devices are connected via networks: three of them are connected to the same network, and the remaining one is connected to a different network. The processing is the same as in FIG. 5, but network 1 in FIG. 6 can use a public line or the like to reduce network usage costs. This arrangement is intended to ensure that the original data S can never be restored even if one piece of shared information is stolen on the network.

As described in the above embodiments, the present invention simultaneously achieves a reduction in the size of the shared information, availability under a double failure, improved confidentiality through randomization of every fragment of the shared information, and high speed based mainly on exclusive OR operations.
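The availability property, namely that any two of the four pieces of shared information suffice, can be illustrated with a small XOR-only sketch. The coefficient matrices below are hypothetical and are not the matrices of Embodiment 1; they are chosen only so that every pair of shares gives an invertible system over GF(2). The reference vectors `r1` and `r2` are taken as given, so the sketch does not show how partial information and random numbers are combined and demonstrates recoverability only, not confidentiality.

```python
from itertools import combinations

L = 2  # elements per reference vector, i.e. (d + m) / 2

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
M2 = [[0, 1], [1, 1]]  # M2 and (M2 XOR I2) are both invertible over GF(2)

# Hypothetical pairs (A_i, B_i): share_i = A_i * r1 XOR B_i * r2
COEFFS = [(I2, Z2), (Z2, I2), (I2, I2), (I2, M2)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_shares(r1, r2):
    """Return four shares, each a list of L equally sized byte blocks."""
    blocks = list(r1) + list(r2)
    size = len(blocks[0])
    shares = []
    for A, B in COEFFS:
        share = []
        for row in range(L):
            acc = bytes(size)
            # One row of the block matrix [A | B] selects the blocks to XOR.
            for coeff, block in zip(A[row] + B[row], blocks):
                if coeff:
                    acc = xor(acc, block)
            share.append(acc)
        shares.append(share)
    return shares


def restore(i, share_i, j, share_j):
    """Recover (r1, r2) from shares i and j by Gauss-Jordan elimination over GF(2)."""
    rows = []
    for idx, share in ((i, share_i), (j, share_j)):
        A, B = COEFFS[idx]
        for row in range(L):
            rows.append((A[row] + B[row], share[row]))
    n = 2 * L
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][0][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_coeffs, pivot_value = rows[col]
        for r in range(n):
            if r != col and rows[r][0][col]:
                coeffs, value = rows[r]
                rows[r] = ([x ^ y for x, y in zip(coeffs, pivot_coeffs)],
                           xor(value, pivot_value))
    solved = [value for _, value in rows]
    return solved[:L], solved[L:]


# Usage: every pair of shares reproduces both reference vectors.
r1, r2 = [b"ab", b"cd"], [b"ef", b"gh"]
shares = make_shares(r1, r2)
for i, j in combinations(range(4), 2):
    assert restore(i, shares[i], j, shares[j]) == (r1, r2)
```

Because restoration uses only XOR and row swaps, losing any two of the four storage locations still leaves a solvable system, which is the double-failure availability referred to above.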

The shared information generation processing, the original data S restoration processing, and the additional shared information generation processing described in Embodiment 1, as well as the processing of the information distributed storage system described in Embodiment 2, can be implemented by a computer program. The program can be recorded on a computer-readable recording medium, and the recording medium can be installed in a computer; alternatively, the program can be downloaded to a computer system via a communication line or installed from the recording medium. The functions are realized by running the program on the computer system. The program may, of course, implement only some of the functions described in this specification.

Specific embodiments of the present invention are not limited to the configurations described in this specification and include designs and the like that do not depart from the gist of the present invention. In particular, the systems of Embodiments 1 and 2 were described only for the (2, 4) threshold sharing method, which is the most practical in operation, but the invention is not limited to this and is also applicable to the (k, n) threshold sharing method expressed by the general formula.

1: Information distribution system
2a to 2d: Storage servers
3: Network
11: Original data dividing means
12: Random number generating means
13: Reference vector generating means
14: Shared information generating means
14a: Coefficient matrix storage unit
15: Shared information transmitting means
16: Shared information receiving means
17: Original data restoring means
17a: Storage unit for inverse matrices of the coefficient matrices
20a to 20d: Information distributed storage devices
21: Network
22: Terminal
30a to 30d: Information distributed storage devices
31: Network
32: Terminal
40a to 40d: Information distributed storage devices
41 to 42: Networks
43: Terminal

Claims

A (2, 4) threshold information distribution system that generates four pieces of shared information from original data and restores the original data using any two of them, comprising:
partial information generating means for dividing the original data to create d pieces of partial information;
random number generating means for generating m random numbers, each having the same size as the partial information;
reference vector generating means for generating first and second reference vectors, each of size (d + m)/2, having the d pieces of partial information and the m random numbers as components;
shared information vector generating means for generating four shared information vectors of size (d + m)/2 by using the generated first and second reference vectors and eight coefficient matrices prepared in advance to perform matrix operations (exclusive OR operations, XOR) on combinations of the first reference vector with a coefficient matrix and of the second reference vector with a coefficient matrix, and a sum operation (exclusive OR operation, XOR) between the resulting vectors;
shared information generating means for generating the four pieces of shared information by concatenating, for each of the four generated shared information vectors, its (d + m)/2 elements; and
data restoring means for restoring the original data based on any two pieces of the shared information.

The information distribution system according to claim 1, wherein m is less than or equal to d, and (d + m)/2 is an integer of 2 or more.

The information distribution system according to claim 1 or 2, wherein each coefficient matrix is a square matrix of dimension (d + m)/2 whose elements are only 0 or 1.

The information distribution system according to claim 1, wherein the coefficient matrices are either selected from a plurality of patterns defined in advance according to the arrangement order of the elements in the reference vectors, or given arbitrarily from outside.

The information distribution system according to claim 1, wherein, when m satisfies 2^m − 1 ≥ (d + m)/2, the m random numbers are combined and used as new random numbers when the matrix operation (exclusive OR operation, XOR) is performed.

An information distributed storage system using the information distribution system according to any one of claims 1 to 5, in which four information distributed storage devices are connected via a network, the system comprising:
monitoring means for monitoring the operating state of the information distributed storage devices;
management means for managing where the shared information is stored;
means for mutually storing the shared information in the information distributed storage devices;
means for immediately switching to another information distributed storage device when the monitoring means detects that one of the information distributed storage devices has failed; and
means for continuing operation with the remaining information distributed storage devices when the monitoring means detects that up to two of the information distributed storage devices have failed,
wherein the shared information is stored in all of the information distributed storage devices, and the original data can be restored at any of the information distributed storage devices.