JP2017091307A

JP2017091307A - Decentralized storage control method and decentralization storage system

Info

Publication number: JP2017091307A
Application number: JP2015222053A
Authority: JP
Inventors: 茂樹三宅; Shigeki Miyake
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-11-12
Filing date: 2015-11-12
Publication date: 2017-05-25
Anticipated expiration: 2035-11-12
Also published as: JP6445955B2

Abstract

PROBLEM TO BE SOLVED: To provide a decentralized storage control method and a decentralized storage system capable of constructing a linear network code for the "Information flow graph" when decentralized storage parameters (n, k, d, α, β), including an arbitrary point (α, β) on the α-β trade-off curve, are given. SOLUTION: In the decentralized storage control method and decentralized storage system according to the present invention, a counter that is incremented at every node failure is provided, and when the number of node failures reaches the maximum counter value, the entire decentralized storage system is refreshed and decentralized encoding is restarted from the beginning. In addition, the information source node converts the storage alphabet into an encoding alphabet of appropriate size and the restoration node applies the inverse conversion, so that the data amount of the encoding vectors becomes negligibly small compared with the amount of encoded content data. SELECTED DRAWING: Figure 8

Description

The present invention relates to a coding technique that achieves an arbitrary trade-off between the amount of stored data and the amount of data transferred for node regeneration in distributed storage.

1. Overview of Distributed Storage Technology
FIG. 1 shows an (n, k) distributed storage system for the case n = 4 and k = 3. Let s denote the information source, which holds the content to be stored and restored. The content is written x^B, where the superscript B indicates that the content length is B bytes. That is, x^B = {x_1, x_2, ..., x_B} with x_i ∈ {0, 1, ..., q−1}. The set {0, 1, ..., q−1} is called the "alphabet of the distributed storage system" and is taken to be the finite field F_q, where q = 2^m and 1 byte = m bits (in short, x^B ∈ F_q^B). For example, the length of a packet transmitted and received in the target system may be taken as 1 byte. The content is distributed over and stored in n storage nodes. The party that restores the data is called the "Data Collector" (hereinafter abbreviated as "DC"). The DC accesses any k storage nodes and receives α bytes of data from each. The DC restores the original content x^B by decoding the received data.

Each node in a distributed storage system is always at risk of failure. FIG. 2 shows a situation in which node 1 has failed and node 5 has been newly configured at the same site. In this specification, "site" refers to the location where a particular storage node resides. Node 5 is required to have the same function as node 1 had before the failure. That is, after the node is regenerated in FIG. 2, the DC can again access any k storage nodes and restore the original content x^B by receiving α bytes of data from each. Configuring a new node with the same function at the same site when a node fails is called "regeneration". Regeneration is performed by the new node (reference numeral 5' in FIG. 2) receiving β bytes of regeneration data from each of d nodes that have not failed.

Nodes 5 and 5' are drawn separately in FIG. 2 only for convenience in later constructing a network code (see Non-Patent Document 4) as the regenerating code; physically they are the same node. Node 5' represents the node that receives regeneration data, and node 5 represents the node that transmits restoration data to the DC. The horizontal axis of FIG. 2 represents time. The figure shows, at each point in time, which node failed, from which nodes regeneration data was transferred, from which nodes the DC obtained data for decoding, and so on. Such a graph is generally called an "Information flow graph".

The code with parameters (n, k, d, α, β) used to construct the distributed data for the regeneration described above is called a "regenerating code". A regenerating code is one kind of distributed code. Regenerating codes were proposed by Dimakis et al. (Non-Patent Document 3), who clarified their basic properties. Their characteristic feature is a trade-off between the stored data amount α and the regeneration data transfer amount β, stated as the following proposition.
[Proposition 1]
Suppose the parameters (n, k, d, α, β) of a distributed storage system are given. The necessary and sufficient condition for the DC to correctly restore content of size B bytes by accessing any k storage nodes, and for a failed node to be correctly regenerated by accessing any d storage nodes, is

$$B \le \sum_{i=0}^{k-1} \min\{\alpha, (d-i)\beta\} \tag{1}$$

(see Non-Patent Document 3).

FIG. 3 (from Non-Patent Document 3) shows an example of the curve (trade-off curve) represented by equation (1). The region above the curve in FIG. 3 is the theoretically achievable region. The problem is to construct a regenerating code that achieves a point close to any given point on the trade-off curve.
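As a rough illustration of the trade-off in Proposition 1, the following Python sketch assumes that the condition takes the cut-set form B ≤ Σ_{i=0}^{k−1} min{α, (d−i)β} given in Non-Patent Document 3, and evaluates it at the two extreme points of the curve. The function names (`max_content_size`, `is_achievable`, `msr_point`, `mbr_point`) are hypothetical and do not appear in the patent; the closed-form MSR and MBR expressions are the standard ones for that bound.

```python
from fractions import Fraction


def max_content_size(k, d, alpha, beta):
    """Right-hand side of the assumed bound: sum over i < k of min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def is_achievable(B, k, d, alpha, beta):
    """True if content of size B satisfies the assumed bound for (k, d, alpha, beta)."""
    return B <= max_content_size(k, d, alpha, beta)


def msr_point(B, k, d):
    """Minimum-storage point: alpha = B/k, beta = B / (k (d - k + 1))."""
    return Fraction(B, k), Fraction(B, k * (d - k + 1))


def mbr_point(B, k, d):
    """Minimum-bandwidth point: beta = 2B / (k (2d - k + 1)), alpha = d * beta."""
    beta = Fraction(2 * B, k * (2 * d - k + 1))
    return d * beta, beta


# Example with the parameters of FIG. 1 (k = 3) and d = 3 helper nodes, B = 6.
alpha_msr, beta_msr = msr_point(6, 3, 3)
alpha_mbr, beta_mbr = mbr_point(6, 3, 3)
```

Both extreme points meet the bound with equality in this example, which is what "lying on the trade-off curve" means; points with smaller α or β than the curve allows fail `is_achievable`.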

[Non-Patent Document 1] V. Jacobson, D. K. Smetters, J. D. Thornton, M. F. Plass, N. H. Briggs, and R. L. Braynard, "Networking named content", in Proc. 5th ACM International Conference on Emerging Networking Experiments and Technologies, pp. 1-12, 2009.
[Non-Patent Document 2] H. Kuwakado and M. Kurihara, "New Coding Methods for Distributed Storage Systems: Regenerating Codes and Pyramid Codes" (in Japanese), Journal of the IEICE, vol. 98, no. 2, pp. 130-137, 2015.
[Non-Patent Document 3] A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran, "Network coding for distributed storage systems", IEEE Trans. Inf. Theory, vol. 56, no. 9, pp. 4539-4551, Sept. 2010.
[Non-Patent Document 4] T. Ho, M. Medard, R. Koetter, D. R. Karger, M. Effros, J. Shi, and B. Leong, "A random linear network coding approach to multicast", IEEE Trans. Inf. Theory, vol. 52, no. 10, pp. 4413-4430, Oct. 2006.

2. Difficulties in the Prior Art
As shown in FIG. 2, an "Information flow graph" is in general a directed acyclic graph, so applying a network code appears to be an appropriate way to realize a regenerating code. Here, a network code is a code for realizing multicast transmission over a given network, as described in Non-Patent Document 4. Attempts have been made to apply network codes to the realization of regenerating codes on the "Information flow graph". However, applying network coding to the "Information flow graph" runs into the following difficulties.
[Difficulty 1] Each node of the distributed storage fails at random. The shape of the "Information flow graph" is therefore not unique.
[Difficulty 2] Because the "Information flow graph" evolves over time, the network can become arbitrarily large. That is, it grows without bound each time a node failure and regeneration occur.
[Difficulty 3] Because the shape of the "Information flow graph" is not fixed in advance, information on the matrix used for encoding must be transmitted together with the content data. Consequently, when m in q = 2^m of the distributed storage alphabet F_q is regarded as the block length, the information on the encoding matrix (or encoding vectors) must be of size o(m) for the content size to approach the capacity limit of the regenerating code. Here, a quantity f(m) depending on m is said to be of size o(m) if it satisfies

$$\lim_{m \to \infty} \frac{f(m)}{m} = 0. \tag{2}$$

Because of these difficulties, regenerating codes have so far not been constructed for arbitrary points on the trade-off curve; concrete codes have been constructed only for the specific points at which α or β is minimal (see, for example, Non-Patent Document 2). The code that minimizes α is called the "Minimum Storage Regeneration (MSR) code", and the code that minimizes β is called the "Minimum Bandwidth Regeneration (MBR) code".

Concrete constructions of codes at points other than the MSR and MBR points are expected to become important not only for distributed storage but also for data transfer schemes such as ICN and CCN (see, for example, Non-Patent Document 1), which extend the cache function attached to each node on the network and use it as content storage.

3. Problem
As described in the previous section, ideally, given distributed storage parameters (n, k, d, α, β) that include an arbitrary point (α, β) on the α-β trade-off curve, one would construct a regenerating code that realizes those parameters. However, the natural approach of constructing a linear network code for the "Information flow graph" gives rise to the difficulties described in Section 2.

Accordingly, an object of the present invention is to provide a distributed storage control method and a distributed storage system that can be constructed with a linear network code for the "Information flow graph" when distributed storage parameters (n, k, d, α, β) including an arbitrary point (α, β) on the α-β trade-off curve are given.

To achieve the above object, the distributed storage control method and distributed storage system according to the present invention provide a counter that is incremented at each node failure and, when the number of node failures reaches the maximum counter value, refresh the entire distributed storage system and restart distributed encoding from the beginning. In addition, to make the data amount of the encoding vectors negligibly small compared with the amount of encoded content data, the information source node converts the distributed storage alphabet into an encoding alphabet of appropriate size, and the restoration node converts the encoding alphabet back into the distributed storage alphabet.

Specifically, the distributed storage control method according to the present invention is a method for controlling a distributed storage system having a distributed storage arrangement composed of:
an information source node that holds content data;
a plurality of storage nodes that hold distributed data into which the content data is distributed; and
a restoration node that receives the distributed data output from any of the storage nodes and restores the content data,
the method comprising:
a conversion procedure of converting, at the information source node, the distributed storage alphabet in which the content data is encoded into an encoding alphabet smaller than the distributed storage alphabet;
an inverse conversion procedure of returning, at the restoration node, the encoding alphabet to the distributed storage alphabet;
a regeneration procedure of, when a storage node fails, receiving data from at least one other storage node, forming a regeneration storage node that replaces the failed storage node, and updating the distributed storage arrangement; and
a reset procedure of counting the number of storage node failures and, when the count reaches an upper limit, resetting the configured distributed storage arrangement and configuring it anew.

The distributed storage system according to the present invention is a distributed storage system having a distributed storage arrangement composed of:
an information source node that holds content data;
a plurality of storage nodes that hold distributed data into which the content data is distributed; and
a restoration node that receives the distributed data output from any of the storage nodes and restores the content data,
wherein the information source node has a conversion function of converting the distributed storage alphabet in which the content data is encoded into an encoding alphabet smaller than the distributed storage alphabet,
the restoration node has an inverse conversion function of returning the encoding alphabet to the distributed storage alphabet, and
the distributed storage system comprises a control unit having:
a regeneration function of, when a storage node fails, receiving data from at least one other storage node, forming a regeneration storage node that replaces the failed storage node, and updating the distributed storage arrangement; and
a reset function of counting the number of storage node failures and, when the count reaches an upper limit, resetting the configured distributed storage arrangement and configuring it anew.

By introducing a maximum counter value, the present invention makes the "Information flow graph" finite in both time and space, so that network coding of a regenerating code can be realized with a random network coding algorithm. Because the present invention uses a linear network code, encoding of the regenerating code can be implemented very easily as inner products of vectors, and decoding as Gaussian elimination.

Therefore, the present invention can provide a distributed storage control method and a distributed storage system that can be constructed with a linear network code for the "Information flow graph" when distributed storage parameters (n, k, d, α, β) including an arbitrary point (α, β) on the α-β trade-off curve are given.

In a preferred embodiment of the distributed storage control method according to the present invention, each storage node receives at least one item of network-coded input data, applies a random network coding algorithm as the network code construction algorithm, and computes the output data to be sent to the downstream node by multiplying the distributed data contained in the input data by an encoding matrix composed of elements of the encoding alphabet obtained in the conversion procedure.

Likewise, each storage node of the distributed storage system according to the present invention receives at least one item of network-coded input data, applies a random network coding algorithm as the network code construction algorithm, and computes the output data to be sent to the downstream node by multiplying the distributed data contained in the input data by an encoding matrix composed of elements of the encoding alphabet obtained by the information source node.

The present invention can provide a distributed storage control method and a distributed storage system that can be constructed with a linear network code for the "Information flow graph" when distributed storage parameters (n, k, d, α, β) including an arbitrary point (α, β) on the α-β trade-off curve are given.

FIG. 1 is a diagram illustrating an example of the Information flow graph of a distributed storage system.
FIG. 2 is a diagram illustrating an example of the Information flow graph of a distributed storage system.
FIG. 3 is a diagram illustrating an example of the α-β trade-off curve described in Non-Patent Document 3.
FIG. 4 is a diagram illustrating a storage node of the distributed storage system according to the present invention.
FIG. 5 is a diagram illustrating the restoration node of the distributed storage system according to the present invention.
FIG. 6 is a diagram illustrating the operation (assignment of encoding vectors) of the information source node of the distributed storage system according to the present invention.
FIG. 7 is a diagram illustrating the operation (assignment of encoding vectors) of a storage node of the distributed storage system according to the present invention.
FIG. 8 is a diagram illustrating the distributed storage system according to the present invention.

4. Detailed Description of the Developed Technology
4.1 System Overview
Embodiments of the present invention are described with reference to the accompanying drawings. The embodiments described below are examples of the present invention, and the present invention is not limited to them. In this specification and the drawings, components with the same reference numerals denote the same components.

FIG. 8 is a diagram illustrating the distributed storage system 301 of this embodiment. The distributed storage system 301 has a distributed storage arrangement composed of:
an information source node 15 that holds content data;
a plurality of storage nodes 10 that hold distributed data into which the content data is distributed; and
a restoration node 20 that receives the distributed data output from any of the storage nodes 10 and restores the content data.
The information source node 15 has an alphabet conversion processing unit 16 that converts the distributed storage alphabet in which the content data is encoded into an encoding alphabet smaller than the distributed storage alphabet.
The restoration node 20 has an alphabet conversion processing unit 24 that returns the encoding alphabet to the distributed storage alphabet.
The distributed storage system 301 further includes a control unit having:
a regeneration function of, when a storage node 10 fails, receiving data from at least one other storage node 10, forming a regeneration storage node that replaces the failed storage node 10, and updating the distributed storage arrangement; and
a reset function of counting the number of storage node failures and, when the count reaches an upper limit, resetting the configured distributed storage arrangement and configuring it anew.

The storage node 10 receives at least one item of network-coded input data, applies a random network coding algorithm as the network code construction algorithm, and computes the output data to be sent to the downstream node by multiplying the distributed data contained in the input data by an encoding matrix composed of elements of the encoding alphabet obtained by the information source node 15. This computation is the processing of equation (13) described later.

The distributed storage system 301 constructs a regenerating code that achieves an arbitrary point on the α-β trade-off curve by overcoming the difficulties described in Section 2 as follows.

Regarding Difficulty 2:
The distributed storage system 301 has a counter that is incremented at each node failure. When the number of node failures reaches the maximum counter value T, the entire distributed storage system is refreshed and distributed encoding (network coding in this embodiment) is restarted from the beginning. Setting a maximum counter value ensures that the "Information flow graph" no longer extends indefinitely (in the time direction).
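The counter-and-reset rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `FailureCounter`, its fields, and the returned action strings are hypothetical, and the actual regeneration and re-encoding work is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class FailureCounter:
    """Counts node failures and signals a full reset at the maximum counter value T."""

    max_failures: int      # maximum counter value T
    failures: int = 0      # failures seen since the last reset
    generation: int = 0    # number of full resets performed so far (hypothetical bookkeeping)

    def on_node_failure(self) -> str:
        """Record one failure and return the action the controller should take."""
        self.failures += 1
        if self.failures >= self.max_failures:
            # The information flow graph has reached its maximum depth:
            # discard it and start distributed encoding again from the source.
            self.failures = 0
            self.generation += 1
            return "reset"
        # Otherwise the failed node is rebuilt from d surviving nodes.
        return "regenerate"


counter = FailureCounter(max_failures=3)
actions = [counter.on_node_failure() for _ in range(4)]
```

Because the counter returns to zero on every reset, the graph that the code must cover never contains more than T failure events.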

Regarding Difficulty 1:
Defining the maximum counter value T bounds the number of topologies that the "Information flow graph" can take before T failures have occurred to a finite number. Therefore, if a random network coding algorithm (see Non-Patent Document 4) is used as the network code construction algorithm, it can be shown that the probability that appropriate encoding vectors are assigned to every link (a communication path between nodes, that is, an edge of the "Information flow graph") approaches 1 asymptotically in the byte length m of the distributed storage alphabet or the encoding alphabet. Here, an "appropriate assignment of encoding vectors" is an assignment under which the restoration node 20 can correctly decode any content data. "Approaches 1 asymptotically in the byte length m" means that, because the encoding vectors are constructed at random as in equations (10) and (11) described later, the probability of obtaining appropriate encoding vectors can be increased by making the byte length m sufficiently large.

Regarding Difficulty 3:
The distributed storage system 301 converts the alphabet F_q of the distributed storage system, which is given in advance, into a smaller encoding alphabet F_q̃ = {0, 1, 2, ..., q̃ − 1} (where q̃ < q) and constructs the network code over the encoding alphabet F_q̃. As a result, in the asymptotic limit where the byte length m is sufficiently long, the data amount of the encoding vectors for the network code is negligible compared with the regeneration data amount β.

4.2 Concrete Code Construction
The operation of the distributed storage system 301 consists of the following four steps:
(0) conversion from the distributed storage alphabet F_q to the encoding alphabet F_q̃;
(1) construction of the encoding vectors;
(2) encoding on each link and construction of the transmitted data;
(3) decoding at the DC (restoration node) and inverse conversion from the encoding alphabet F_q̃ to the distributed storage alphabet F_q.
FIG. 4 is a diagram illustrating the configuration and operation (steps (1) and (2)) of node ν (storage node 10). FIG. 5 is a diagram illustrating the configuration and operation (step (3)) of the restoration node 20.

Step (0) [Conversion from the distributed storage alphabet to the encoding alphabet]
The one-to-one mapping ψ from the distributed storage alphabet F_q (where q = 2^m) to the encoding alphabet F_q̃ (where q̃ = 2^ε, ε = m^δ) is defined as follows.

That is, when x ∈ F_q is written in binary, it can be represented by m bits as (b_1, ..., b_m). The mapping ψ partitions this binary representation into blocks of m^δ bits and expresses it as

The mapping is also defined for x_1, x_2 ∈ F_q as

Here, "*" denotes concatenation of strings.
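The bit-splitting idea behind ψ can be sketched in Python as below. This is only an illustration under assumptions that the text does not state: ε = m^δ is taken to be an integer that divides m, blocks are ordered most significant first, and the names `psi` and `psi_inverse` are hypothetical.

```python
from typing import List


def psi(x: int, m: int, eps: int) -> List[int]:
    """Split an m-bit symbol x into m // eps symbols of eps bits each (most significant first)."""
    if m % eps != 0:
        raise ValueError("this sketch assumes eps divides m")
    if not 0 <= x < (1 << m):
        raise ValueError("x must be an m-bit value")
    blocks = m // eps
    mask = (1 << eps) - 1
    return [(x >> (eps * (blocks - 1 - i))) & mask for i in range(blocks)]


def psi_inverse(symbols: List[int], eps: int) -> int:
    """Concatenate eps-bit symbols back into a single integer (inverse of psi)."""
    x = 0
    for s in symbols:
        x = (x << eps) | s
    return x


# Example: m = 16 and delta = 1/2 give eps = 4, so one 16-bit symbol becomes four 4-bit symbols.
small_symbols = psi(0xBEEF, m=16, eps=4)
restored = psi_inverse(small_symbols, eps=4)
```

Since the split only regroups bits, it is invertible, which is what allows the restoration node to return to the original alphabet after decoding.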

Step (1) [Construction of the encoding vectors]
Assume that a positive constant 0 < δ < 1 is given in advance. First, to fix the encoding alphabet, the information source node 15 (information source node s) converts the alphabet from F_q to F_q̃ (where q̃ = 2^ε < q = 2^m) (see equations (3) and (5)). The random network coding algorithm execution unit 12 uses the encoding vectors over the encoding alphabet F_q̃ that were input to the input unit 11 to construct the encoding vectors on the outgoing link e'. Because the "Information flow graph" in question is a directed acyclic graph, its nodes can be ordered with the information source node s first. Following this order, encoding vectors are assigned inductively to the links leaving each node.

FIG. 6 is a diagram illustrating the operation of the information source node s. As shown in FIG. 6, a virtual link of capacity B is regarded as entering the information source node s. The incoming link to s is assigned the encoding vectors

Here, [1 : B] denotes {1, 2, ..., B}. Note that one encoding vector is assigned per unit of capacity.

FIG. 7 illustrates the operation of a node ν. Assume that the encoding vectors

are assigned to an incoming link e of capacity R_e at the node ν, where

and T denotes transposition. These data are stored in the input unit 11 of FIG. 4 together with the code words. The outgoing link e′ of the node ν is then assigned the R_e′ encoding vectors

each of which is given by

where the coefficients

are drawn uniformly at random from the encoding alphabet F_q̃. Here, In(ν) denotes the set of links entering the node ν. In the "Information Flow graph", the link capacity R_e takes the value α or β.
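The assignment described above, in which each outgoing encoding vector is a random linear combination of the incoming ones, can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes GF(2^8) with the reduction polynomial 0x11B as a stand-in for the encoding alphabet, and the function names `gf_mul`, `combine`, and `outgoing_vectors` are hypothetical.

```python
import random

# Assumption: GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B)
# stands in for the encoding alphabet; the patent does not fix a concrete field.
FIELD_BITS = 8
REDUCTION_POLY = 0x11B


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8): carry-less multiply with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & (1 << FIELD_BITS):
            a ^= REDUCTION_POLY
        b >>= 1
    return result


def combine(incoming, coefficients):
    """Return sum_j coefficients[j] * incoming[j], computed componentwise."""
    out = [0] * len(incoming[0])
    for vec, coeff in zip(incoming, coefficients):
        for i, component in enumerate(vec):
            out[i] ^= gf_mul(coeff, component)  # addition in GF(2^m) is XOR
    return out


def outgoing_vectors(incoming, capacity, rng):
    """Build `capacity` outgoing vectors, each with fresh uniform coefficients."""
    q = 1 << FIELD_BITS
    return [
        combine(incoming, [rng.randrange(q) for _ in incoming])
        for _ in range(capacity)
    ]
```

Here `incoming` is the list of all encoding vectors on the links in In(ν), and `capacity` plays the role of R_e′. Passing a seeded `random.Random` instance makes the draw reproducible for testing.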

Next, to carry out the encoding, an encoding matrix is constructed from the encoding vectors given by equation (10). In what follows, I denotes the m^(1−δ) × m^(1−δ) identity matrix. The encoding matrix on a link e ∈ Out(ν) leaving the node ν is given by

where Out(ν) denotes the set of links leaving the node ν.

Step (2) [Encoding and construction of transmission data on each link]
The operation of the output unit 13 in FIG. 4 is described next, starting with the encoding on a link e. For e ∈ Out(ν), encoding is performed by

The output data that the link e outputs is then

Equation (14) appears to contain the unknown quantity x^B. However, writing the data received by the node ν (the input data received by the input unit 11) as

the output data is given by


The data sent from the node ν to the DC is the data received from the node ν′, forwarded unchanged. (The node ν′ is physically the same node as ν but is logically distinguished as the node that receives the data for regeneration.) Here, since q̃ = 2^(m^δ), if

holds, it can be shown that the probability that appropriate encoding vectors are assigned to every link is sufficiently large. Here, T is the preset maximum counter value. Furthermore, by equation (14), the amount of data needed to transmit the encoding vectors is B R_e m^δ bits, which becomes negligibly small compared with the code word data amount of R_e m bits when m is sufficiently large.
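The size comparison above reduces to simple arithmetic: the ratio of coefficient bits to code word bits on one link is B R_e m^δ / (R_e m) = B / m^(1−δ), independent of R_e. A small sketch with a hypothetical helper `overhead_ratio` and illustrative parameter values (not taken from the patent) shows the scale:

```python
def overhead_ratio(B: int, m: int, delta: float) -> float:
    """Ratio of encoding-vector bits to code word bits on one link.

    The link capacity R_e appears in both quantities and cancels out,
    leaving B * m**delta / m = B / m**(1 - delta).
    """
    coefficient_bits = B * m ** delta  # per unit of link capacity
    codeword_bits = m                  # per unit of link capacity
    return coefficient_bits / codeword_bits


# Illustrative values only: B = 4 symbols, m = 2**20 bits per symbol, delta = 0.5.
ratio = overhead_ratio(4, 2 ** 20, 0.5)
```

With these assumed values the ratio is about 0.4 percent, and it shrinks further as m grows for fixed B and δ.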

Step (3) [Decoding at the DC]
The operation of the restoration node 20 in FIG. 5 is described next.
When the DC (restoration node 20) accesses k nodes ν_i1, ..., ν_ik, the data sent from the node ν_ia is written as

These data are stored in the input unit 21. Here, Ã_ia is the matrix formed by arranging the corresponding encoding vectors side by side. Decoding is performed by solving the linear equation

The linear equation is solved by the sweep-out (Gaussian elimination) method execution unit 22. Here, A_ia is obtained by multiplying each element of the corresponding encoding vectors in Ã_ia by the m^(1−δ) × m^(1−δ) identity matrix. Next, the conversion processing unit 24 performs the inverse conversion from the encoding alphabet to the distributed storage alphabet. This is the inverse of the conversion described in step (0). In short, since

is a one-to-one mapping, the desired content data x^B can be recovered from the solution ψ(x^B) of the linear equation (19).
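The decoding step, solving a linear system over a finite field by the sweep-out method, can be illustrated with the sketch below. Two simplifications are assumptions of this example and not part of the patent: a small prime field GF(257) replaces the extension field F_q̃ so that Python's built-in modular inverse can be used, and the system is taken to be square and invertible. The name `solve_mod_p` is hypothetical.

```python
P = 257  # assumption: small prime field used only for illustration


def solve_mod_p(A, y, p=P):
    """Solve A x = y over GF(p) by Gauss-Jordan elimination.

    A must be a square matrix (list of rows); raises ValueError if it is
    singular, which corresponds to an inappropriate encoding vector assignment.
    Requires Python 3.8+ for pow(value, -1, p).
    """
    n = len(A)
    # Augmented matrix [A | y] with entries reduced modulo p.
    M = [[v % p for v in row] + [y[i] % p] for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("encoding matrix is singular; decoding fails")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot becomes 1.
        inv = pow(M[col][col], -1, p)
        M[col] = [(v * inv) % p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [(a - factor * b) % p for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]
```

In the setting of the text, the rows of `A` would be the encoding vectors collected from the k accessed nodes and `y` the corresponding received symbols. The solution is the content in the encoding alphabet, to which the inverse alphabet conversion is then applied.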

5. Effects of the invention
ICN/CCN, described in Non-Patent Document 1, is one of the leading candidates for next-generation content distribution. It aims to reduce network load and improve effective throughput for end users by redefining the caches attached to network nodes as local storage. Although the concrete architecture of ICN/CCN is still at the research stage, it is clear that realizing it will result in some form of distributed storage on the network. In distributed storage, the amount of stored data and the amount of data transferred for regeneration are in the trade-off relationship given by equation (1). Consequently, if optimal codes could be constructed only at the MSR/MBR points, the amount of data that each node on the network can store would be severely restricted, and it would be difficult for ICN/CCN to achieve its full performance. Because the present invention enables regenerating codes to be constructed at any point on the α-β trade-off curve, the cache (or local storage) function of ICN/CCN can be exploited to the fullest.

6. Key points of the invention
As described in Section 4.1, the key point of the present invention is that introducing a maximum counter value makes the "Information Flow graph" finite in both time and space. As a result, it could be shown that, when the random network encoding algorithm is used, even if the coefficients of the encoding vectors are assigned at random, the probability that they are assigned appropriately approaches 1 arbitrarily closely, provided that the byte length of the alphabet is sufficiently large. Here, an appropriate assignment of encoding vectors is one with which the DC can correctly decode any content data. In addition, by converting to a suitable encoding alphabet, the encoding alphabet can be made smaller than the distributed storage alphabet. Therefore, when data is transferred for regeneration, the amount of data needed to transfer the encoding vector coefficients is negligible compared with the content data for regeneration.
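The role of the maximum counter value can be sketched as a small controller that chooses between regeneration and a full reset on each node failure. The class name `FailureController`, the returned action strings, and the choice to reset on the failure that reaches the limit are assumptions made for illustration. The text only states that the system is reset and reconfigured when the failure count reaches its upper limit.

```python
class FailureController:
    """Counts storage node failures and decides between regeneration and reset."""

    def __init__(self, max_count: int):
        if max_count < 1:
            raise ValueError("max_count must be at least 1")
        self.max_count = max_count  # preset maximum counter value
        self.count = 0              # failures since the last reset

    def on_node_failure(self) -> str:
        """Return the action to take for this failure."""
        self.count += 1
        if self.count >= self.max_count:
            # Upper limit reached: rebuild the whole distributed storage
            # from the information source and start counting again.
            self.count = 0
            return "reset"
        # Otherwise replace only the failed node from surviving nodes.
        return "regenerate"
```

Bounding the number of regenerations between resets is what keeps the "Information Flow graph" finite, which the probability argument for random coefficient assignment relies on.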

10: Storage node
11: Input unit
12: Random network encoding algorithm execution unit
13: Output unit
15: Information source node
16: Alphabet conversion processing unit
20: Restoration node
21: Input unit
22: Sweep-out (Gaussian elimination) method execution unit
24: Alphabet conversion processing unit
30: Control unit
31: Regeneration function
32: Reset function
301: Distributed storage system

Claims

1. A distributed storage control method for controlling a distributed storage system having a distributed storage system composed of
an information source node holding content data,
a plurality of storage nodes holding distributed data into which the content data is distributed, and
a restoration node that receives the distributed data output from any of the storage nodes and restores the content data,
the distributed storage control method being characterized by performing:
a conversion procedure of converting, at the information source node, a distributed storage alphabet in which the content data is encoded into an encoding alphabet smaller than the distributed storage alphabet;
an inverse conversion procedure of returning, at the restoration node, the encoding alphabet to the distributed storage alphabet;
a regeneration procedure of, when a storage node fails, receiving data from at least one other storage node, forming a regenerated storage node that replaces the failed storage node, and updating the distributed storage system; and
counting the number of failures of the storage nodes and, when the number reaches an upper limit, resetting the configured distributed storage system and configuring the distributed storage system again.

2. The distributed storage control method according to claim 1, wherein, at the storage node, at least one piece of network-encoded input data is input, a random network encoding algorithm is applied as the network code construction algorithm, and output data to be output to a subsequent node is calculated by multiplying the distributed data included in the input data by an encoding matrix constructed over the encoding alphabet converted by the conversion procedure.

3. A distributed storage system having a distributed storage system composed of
an information source node holding content data,
a plurality of storage nodes holding distributed data into which the content data is distributed, and
a restoration node that receives the distributed data output from any of the storage nodes and restores the content data,
wherein the information source node has a conversion function of converting a distributed storage alphabet in which the content data is encoded into an encoding alphabet smaller than the distributed storage alphabet,
the restoration node has an inverse conversion function of returning the encoding alphabet to the distributed storage alphabet, and
the distributed storage system comprises a control unit comprising:
a regeneration function of, when a storage node fails, receiving data from at least one other storage node, forming a regenerated storage node that replaces the failed storage node, and updating the distributed storage system; and
a function of counting the number of failures of the storage nodes and, when the number reaches an upper limit, resetting the configured distributed storage system and configuring the distributed storage system again.

4. The distributed storage system according to claim 3, wherein the storage node receives at least one piece of network-encoded input data, applies a random network encoding algorithm as the network code construction algorithm, and calculates output data to be output to a subsequent node by multiplying the distributed data included in the input data by an encoding matrix constructed over the encoding alphabet converted by the information source node.