JP2012190377A

JP2012190377A - Content decentralization and storage system

Info

Publication number: JP2012190377A
Application number: JP2011055016A
Authority: JP
Inventors: Akihiko Nishitani; 明彦西谷; Tomohiko Ogishi; 智彦大岸
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2011-03-14
Filing date: 2011-03-14
Publication date: 2012-10-04

Abstract

PROBLEM TO BE SOLVED: To provide a system in which, in a distributed file system, a content distribution server that distributes content pieces to a user terminal can read a plurality of content pieces together before distribution without concentrating the load on a specific file server. SOLUTION: A distributed content storage system includes a distributed file system having a plurality of file servers that store respective content pieces, and a content distribution server that, in response to a distribution request from a user terminal, reads N content pieces together from the file servers and distributes them to the user terminal. The distributed file system has a distributed file system client 21 that determines, from among the plurality of file servers, N or more file servers as write destinations for the content pieces. The distributed file system client has a write destination determination unit 52 that determines the write destination file servers 22 so that the content pieces are distributed across them in time-series order.

Description

The present invention relates to a distributed content storage system for a video-on-demand (VoD) distribution server. The system uses a distributed file system that virtually integrates the storage of multiple computers scattered across a network, fragments a single content item into content pieces, and stores the resulting series of files on separate file servers.

As technology of this kind, distributed platforms that combine the disks of multiple machines so that they function as a single distributed file system have been proposed, as shown in Non-Patent Document 1 and Non-Patent Document 2.
Gfarm, described in Non-Patent Document 1, is a scalable distributed file system platform that meets the demands of large-capacity, large-scale data processing over a wide-area network, and is suited to efficient file sharing over such a network.
Hadoop, described in Non-Patent Document 2, processes large volumes of data that cannot fit on a single disk quickly and efficiently by parallelizing the work, and is suited to I/O on relatively large files that are essentially never updated.

Conventionally, a monitoring system for a distributed file system made up of multiple servers periodically collects the state of each server and analyzes it in an integrated manner. The collected information includes CPU utilization, memory usage, disk usage, CPU temperature, and network connection status. As an example of integrated analysis, counting the file servers whose CPU utilization is 80% or higher indicates how congested the file servers are. This yields indicators such as whether the number of file servers is sufficient for the current system load.
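
The congestion indicator mentioned above can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout, and the sample readings are assumptions, and only the 80% figure comes from the text.

```python
def count_busy_servers(cpu_usage_by_host, threshold=80.0):
    """Count servers whose CPU utilization (percent) is at or above the threshold."""
    return sum(1 for usage in cpu_usage_by_host.values() if usage >= threshold)


# Hypothetical readings collected by a monitoring system.
samples = {"2010srv": 91.5, "srv_A": 42.0, "srv_B": 80.0, "workhost": 12.3}
busy = count_busy_servers(samples)
print(f"{busy} of {len(samples)} file servers are at or above 80% CPU")
```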

Non-Patent Document 1: URL: http://datafarm.apgrid.org/indeX.ja.html
Non-Patent Document 2: URL: http://hadoop.apache.org/

In the distributed file system described above, the file server on which each fragmented content piece is stored is selected based on resource information such as each file server's CPU utilization and free disk space.
For example, as shown in FIG. 7, the content distribution server delivers content pieces in time-series order in response to a playback request from an end user. It does not, however, start reading a content piece at that piece's delivery start time. To keep the delivery stream (video and audio) from being interrupted, it reads the next N content pieces ahead of their delivery times and loads them into memory in advance, ready for the delivery start time.

With this scheme, however, a problem arises when N content pieces that are adjacent in time series are read together (in the example of FIG. 7, content pieces 1, 2, and 3, or content pieces 4, 5, and 6). If some of the content pieces read together are stored on the same file server (2010srv), as shown in FIG. 8, then reading content pieces 1, 2, and 3 together causes the reads of content piece 1 and content piece 3 to contend on that file server (2010srv). The data I/O load on the file server (2010srv) rises sharply, and read performance, such as response time to the content distribution server, degrades.
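
The contention described above can be detected mechanically. The following sketch checks each group of N pieces that is read together for a shared file server. Only the fact that pieces 1 and 3 share 2010srv comes from the text; the rest of the placement and all names are assumed for illustration.

```python
def colliding_groups(placement, n):
    """Return the groups of piece numbers (1-based) that are read together
    and have at least two pieces on the same file server.

    placement[i] is the host that stores piece i + 1.
    """
    collisions = []
    for start in range(0, len(placement), n):
        group = placement[start:start + n]
        if len(set(group)) < len(group):
            collisions.append(list(range(start + 1, start + 1 + len(group))))
    return collisions


# Hypothetical resource-based placement: pieces 1 and 3 both land on 2010srv.
prior_art = ["2010srv", "srv_A", "2010srv", "srv_B", "workhost", "srv_A"]
print(colliding_groups(prior_art, 3))
```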

The present invention has been proposed in view of the above circumstances. Its object is to provide a distributed content storage system in which, in a distributed file system having a plurality of file servers that store content pieces, a content distribution server that distributes content pieces to a user terminal can read a plurality of content pieces together before distribution without concentrating the load on a specific file server.

To achieve the above object, the present invention (claim 1) is a system comprising: a content generation server that fragments content to generate a plurality of content pieces; a distributed file system including a plurality of physically distributed file servers for storing the content pieces; and a content distribution server that, in response to a distribution request from a user terminal, reads N content pieces together from the file servers and distributes them to the user terminal. The system is characterized by the following configuration.
The distributed file system has a distributed file system client that determines, from among the plurality of file servers, N or more file servers as write destinations for the content pieces.
The distributed file system client includes a write destination determination unit that determines the write destination file servers so that the content pieces are distributed in time-series order.

Claim 2 is the distributed content storage system of claim 1, characterized in that the distributed file system has a metadata server for creating a list of file servers whose resource information, such as CPU utilization and free disk space, is at or above a predetermined threshold, and the write destination determination unit selects the file server to which a content piece is written from among the plurality of file servers based on the list acquired from the metadata server.

Claim 3 is the distributed content storage system of claim 2, characterized in that the list is created by arranging the file servers in order of a value that uniquely identifies each file server.

Claim 4 is the distributed content storage system of claim 3, characterized in that the order of the value that uniquely identifies each file server in the list is host name order, IP address order, or MAC address order.

Claim 5 is the distributed content storage system of claim 2, characterized in that the list is created by arranging the file servers in an arbitrary order specified by the operator.

Claim 6 is the distributed content storage system of claim 1 or claim 2, characterized in that, in an environment in which the number of installed file servers is equal to or greater than the product of the number of simultaneous reads from the content distribution server and the total number of writes on the file servers, a copy of a content piece is stored on the file server whose rank in the list is the rank of the file server storing the content piece plus the number of simultaneous reads.

According to the present invention, the write destination determination unit of the distributed file system client determines the write destination file servers so that the content pieces are distributed in time-series order. When the content distribution server reads N content pieces together from the file servers in response to a distribution request from a user terminal, the pieces being read are therefore stored across N or more file servers in time-series order. The read destinations do not overlap, and the read can be performed without concentrating the load on a specific file server.
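
The effect stated above can be checked with a small sketch. It assumes that "distributed in time-series order" means cycling through the server list and wrapping around at the end, which the text implies but does not spell out. The function names are illustrative only.

```python
def place_round_robin(num_pieces, servers):
    """Assign piece i (0-based) to servers[i mod len(servers)]."""
    return [servers[i % len(servers)] for i in range(num_pieces)]


def every_window_distinct(placement, n):
    """True if every run of n consecutive pieces uses n different servers."""
    return all(
        len(set(placement[i:i + n])) == n
        for i in range(len(placement) - n + 1)
    )


servers = ["2010srv", "srv_A", "srv_B", "workhost"]
placement = place_round_robin(10, servers)
print(every_window_distinct(placement, 3))   # four servers, N = 3
print(every_window_distinct(place_round_robin(10, servers[:2]), 3))  # too few servers
```

With at least N servers in the list, any N consecutive pieces fall on N different servers. With fewer than N, overlap is unavoidable, which is why the claims require N or more write destinations.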

In addition, the write destination determination unit determines the write destination of each content piece from among the plurality of file servers based on a list of file servers whose resource information, such as CPU utilization and free disk space, is at or above a predetermined threshold, so the content piece can be reliably stored on the selected file server.

FIG. 1 is an overall configuration model diagram showing an example embodiment of the distributed content storage system of the present invention.
FIG. 2 is a model diagram showing the relationship between the content pieces and the write destination file servers (in host name order) in the distributed file system.
FIG. 3 is a sequence diagram showing the flow from content generation to distribution among the content generation server, the distributed file system, the content distribution server, and the user terminal.
FIG. 4 is a block diagram showing the configuration of the distributed file system client in the distributed file system.
FIG. 5 is a flowchart showing the processing in the write destination determination unit of the distributed file system client.
FIG. 6 is a model diagram showing the relationship between the content pieces (with copies) and the write destination file servers (in host name order) in the distributed file system.
FIG. 7 is a sequence diagram explaining the timing of reading and delivering content pieces in the prior art.
FIG. 8 is a model diagram showing the relationship between the content pieces and the write destination file servers in a conventional distributed file system.

An example embodiment of the distributed content storage system of the present invention is described below with reference to the drawings. FIG. 1 is an overall configuration diagram of the distributed content storage system.
The distributed content storage system includes a content generation server 10, a distributed file system 20, a content distribution server 30, and a user terminal 40, which are connected to one another via a network.

The content generation server 10 is connected to an external storage 11 that supplies content. It fragments the content supplied from the external storage 11 into time units specified by the operator to generate a plurality of content pieces 15, and writes the content pieces 15 separately into the distributed file system 20.
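
As a rough illustration of fragmenting by time unit, the sketch below splits a content duration into numbered pieces. The patent does not say how pieces are delimited or whether a shorter final piece is allowed, so the tuple layout and the handling of the remainder are assumptions.

```python
def split_into_pieces(duration_sec, unit_sec):
    """Split [0, duration_sec) into consecutive pieces of unit_sec seconds.

    Returns (piece_number, start_sec, end_sec) tuples; the last piece may be shorter.
    """
    if unit_sec <= 0:
        raise ValueError("unit_sec must be positive")
    pieces = []
    start = 0
    number = 1
    while start < duration_sec:
        end = min(start + unit_sec, duration_sec)
        pieces.append((number, start, end))
        start = end
        number += 1
    return pieces


print(split_into_pieces(25, 10))
```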

The distributed file system 20 includes a distributed file system client 21 that writes and reads content pieces 15, a plurality of file servers 22 on which the content pieces are stored in time-series order, and a metadata server 23 that records storage location information (meta information) for each content piece. When writing a content piece 15, the distributed file system client 21 selects the file server 22 that will store it.
The metadata server 23 records meta information indicating on which file server 22 each content piece is stored.

The file servers 22 are physically distributed so that the content pieces 15 can be stored across them in time-series order.
In response to a distribution request from the user terminal 40, the content distribution server 30 reads the content pieces from the file servers 22 N at a time and delivers them sequentially to the user terminal 40 via the network. Because N content pieces 15 are to be stored on physically separate file servers 22 in time-series order, the system is provided with enough file servers 22 that at least N of them are selectable as write destinations.

Next, the writing and distribution of content pieces in the distributed content storage system are described with reference to FIGS. 2 and 3.
First, the writing of content pieces is described.
The content generation server 10 fragments the supplied content into time units specified by the operator to generate a plurality of content pieces 15, and writes each content piece 15 separately into the distributed file system 20 (A in FIG. 3).

Within the distributed file system 20, the distributed file system client 21 receives the write requests for the content pieces 15 from the content generation server 10 and selects the file server 22 to which each content piece 15, passed from the content generation server 10 in time-series order, is written. To select the file servers 22, the client queries the metadata server 23 for a list of write destinations, and the metadata server 23 responds with the file servers 22 that have headroom in resource information such as CPU utilization and free disk space. A list is created in which these file servers 22 are arranged, for example, in host name order (in the example shown in FIG. 2, the host names are ordered 2010srv, srv_A, srv_B, workhost). Write destinations are then determined in ascending order of the file servers 22 in the list, and the content pieces 15 are written.
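
The selection just described can be sketched as a filter, a sort, and an in-order assignment. The field names, the threshold values, and the wrap-around at the end of the list are assumptions made for illustration. Only the host names and the host name ordering come from the FIG. 2 example.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    host: str
    cpu_idle_percent: float   # assumed measure of CPU headroom
    free_disk_gb: float       # assumed measure of disk headroom


def build_write_list(statuses, min_cpu_idle=20.0, min_free_disk_gb=50.0):
    """Keep servers with enough headroom and sort them by host name."""
    eligible = [
        s.host for s in statuses
        if s.cpu_idle_percent >= min_cpu_idle and s.free_disk_gb >= min_free_disk_gb
    ]
    return sorted(eligible)


def assign_pieces(piece_ids, write_list):
    """Assign pieces to servers in list order, wrapping around at the end."""
    return {pid: write_list[i % len(write_list)] for i, pid in enumerate(piece_ids)}


statuses = [
    ServerStatus("workhost", 70.0, 400.0),
    ServerStatus("srv_B", 55.0, 120.0),
    ServerStatus("busy_srv", 5.0, 900.0),    # hypothetical overloaded server
    ServerStatus("2010srv", 80.0, 250.0),
    ServerStatus("srv_A", 35.0, 60.0),
]
write_list = build_write_list(statuses)
assignment = assign_pieces(range(1, 7), write_list)
print(write_list)
print(assignment)
```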

In the distributed file system client 21, the file servers 22 extracted into the list are those whose resource information exceeds a preset threshold. To store N content pieces 15 on physically separate file servers 22, the list must contain N or more file servers 22 (those with resource headroom).

As a result, within the series of files that make up one content item, N content pieces 15 that are close in time series are stored on physically separate file servers 22. For the content distribution server 30 that reads these content pieces 15, the files to be read are physically distributed, so it can read them in parallel while spreading the load across the file servers 22, load the data into memory quickly, and start delivery.

In the example of FIG. 2, the write destinations of the content pieces are determined in ascending host name order of the file servers 22. The write destinations may instead be determined in the order of any value that uniquely identifies each file server 22, such as descending host name order, IP address order, or MAC address order, or in an arbitrary order specified by the operator in a configuration file.
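
The alternative orderings can be sketched with ordinary sort keys. The host names are those of the FIG. 2 example, while the IP and MAC addresses are invented for illustration. Note that IP addresses are compared numerically through the standard `ipaddress` module rather than as strings, which is a design choice of this sketch and not something the patent specifies.

```python
import ipaddress

# Hypothetical server records; only the host names appear in the patent.
servers = [
    {"host": "srv_B", "ip": "10.0.0.12", "mac": "00:1a:2b:00:00:0c"},
    {"host": "2010srv", "ip": "10.0.0.9", "mac": "00:1a:2b:00:00:1f"},
    {"host": "workhost", "ip": "10.0.0.101", "mac": "00:1a:2b:00:00:02"},
    {"host": "srv_A", "ip": "10.0.0.30", "mac": "00:1a:2b:00:00:0a"},
]

by_host = [s["host"] for s in sorted(servers, key=lambda s: s["host"])]
by_host_desc = [s["host"] for s in sorted(servers, key=lambda s: s["host"], reverse=True)]
by_ip = [s["host"] for s in sorted(servers, key=lambda s: ipaddress.ip_address(s["ip"]))]
by_mac = [s["host"] for s in sorted(servers, key=lambda s: int(s["mac"].replace(":", ""), 16))]

print(by_host)
print(by_ip)
print(by_mac)
```

Any of these orders works for the placement scheme, provided the same order is used consistently for every write.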

Next, the distribution of content is described.
When the content distribution server 30 receives a distribution request for desired content from the user terminal 40, it issues to the distributed file system 20 a read command for the first of the N content pieces 15 that make up the desired content (B in FIG. 3).
On receiving the read command, the distributed file system client 21 queries the metadata server 23 for the read location and, based on the response, sends a read request to the corresponding file server 22. The file server 22 that receives the read request returns the data of the first content piece 15 to the distributed file system client 21, which in turn sends a read response to the content distribution server 30.
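
The read path above (client asks the metadata server where a piece lives, then fetches it from that file server) can be modeled in memory as follows. The class and method names are hypothetical and no networking is shown. The sketch only illustrates the order of the lookups.

```python
class MetadataServer:
    """Records which file server holds each content piece (meta information)."""

    def __init__(self):
        self._locations = {}

    def record(self, piece_id, host):
        self._locations[piece_id] = host

    def locate(self, piece_id):
        return self._locations[piece_id]


class DistributedFsClient:
    """In-memory stand-in for the distributed file system client."""

    def __init__(self, metadata, file_servers):
        self.metadata = metadata
        self.file_servers = file_servers   # host name -> {piece_id: data}

    def write(self, piece_id, data, host):
        self.file_servers[host][piece_id] = data
        self.metadata.record(piece_id, host)

    def read(self, piece_id):
        host = self.metadata.locate(piece_id)       # query the metadata server
        return self.file_servers[host][piece_id]    # read from that file server


metadata = MetadataServer()
client = DistributedFsClient(metadata, {"2010srv": {}, "srv_A": {}})
client.write(1, b"first piece", "2010srv")
client.write(2, b"second piece", "srv_A")
print(metadata.locate(2), client.read(2))
```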

The same processing is repeated, so that the distributed file system client 21 reads the second through Nth content pieces for the content distribution server 30 (N content pieces are read together from the file servers 22), and the first through Nth content pieces are delivered sequentially to the user terminal 40.
While the first through Nth content pieces are being delivered, the content distribution server 30 issues to the distributed file system 20 a read command for the (N+1)th content piece, and read responses are returned sequentially from the distributed file system 20 to the content distribution server 30 in accordance with this command.
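
One way to picture this read-ahead behavior is a generator that loads the first N pieces up front and then fetches one further piece each time a piece is handed over for delivery. The exact interleaving of reads and deliveries is an assumption of this sketch, as are the function names.

```python
from collections import deque
from itertools import islice


def stream_with_read_ahead(piece_ids, read_piece, n):
    """Yield piece data in order while keeping up to n pieces loaded in memory."""
    pending = iter(piece_ids)
    # Load the first n pieces before delivery starts.
    buffer = deque(read_piece(pid) for pid in islice(pending, n))
    done = object()
    while buffer:
        data = buffer.popleft()
        next_id = next(pending, done)
        if next_id is not done:
            buffer.append(read_piece(next_id))   # read ahead of delivery
        yield data


reads = []


def fake_read(piece_id):
    """Stand-in for a read from the distributed file system."""
    reads.append(piece_id)
    return f"piece-{piece_id}"


delivered = list(stream_with_read_ahead(range(1, 6), fake_read, 3))
print(delivered)
```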

Next, the configuration of the distributed file system client 21 is described with reference to FIG. 4. The distributed file system client 21 comprises the following modules: a user access control unit 51, a write destination determination unit 52, a write destination information collection unit 53, and a file system access unit 54.

The user access control unit 51 sequentially receives write requests for content pieces from an external application (the content generation server) and passes the requests to the write destination determination unit 52. It also returns the write results to the external application.
The write destination determination unit 52 determines the write destination from the list of file servers selectable as write targets, in the order of a value that uniquely identifies each file server 22, such as host name order, IP address order, or MAC address order. To do so, it remembers the file server 22 used for the previous write and selects the file server 22 that follows it. It passes the determined write destination file server 22 to the file system access unit 54.

The write destination information collection unit 53 acquires the file server list of writable file servers 22, which the metadata server 23 creates from resource information such as the free disk space and CPU utilization of each file server 22 that it periodically collects from the file servers 22, and passes the list to the write destination determination unit 52.
The file system access unit 54 transfers the write data to the target file server 22 according to the write destination file server information passed from the write destination determination unit 52, and receives the response.

Next, the processing procedure in the write destination determination unit 52 of the distributed file system client 21 is described with reference to the flowchart of FIG. 5.

First, the write destination determination unit 52 accepts a write request from the user access control unit 51 (step 61).
Next, it extracts the request parameters from the write request received from the user access control unit 51: the file name, the file open options (the mode in which the file is opened), and the access mode at file creation (one of write permission, read permission, or read and write permission) (step 62).
The write destination determination unit 52 then acquires the created file server list from the metadata server 23 (step 63).

If the file server list is acquired successfully (step 64), the presence of a previous write destination is checked (step 65). This check determines whether the previous write destination file server, held in the memory of the write destination determination unit 52, is present in the file server list received from the metadata server 23.
If the previous write destination file server is present (step 66), the next host name is selected as the current write destination, that is, the destination one position after the previous write destination in list order (step 67).

A write request is issued to the selected write destination file server (step 68).
A file write response is received from the write destination file server (step 69).
The write destination is recorded in memory (step 70).

If there is no previous write destination file server (step 66), it is determined whether this is the first write (step 71). If it is, the host at the top of the file server list is chosen (step 72); that is, because this is the first write, the first entry of the file server list sorted by host name is selected.

The case in which there is no previous write destination and the write is not the first (step 71) is assumed to arise when the file server that was the previous write destination has been removed or has failed. In this case, a file server close to the previous write destination is searched for (step 73).
That is, the file server whose name comes next after the previous write destination's host name in dictionary order in the file server list is chosen as the current write destination. After the write destination is determined, the file write request is issued (step 68), the file write response is received (step 69), and the write destination is recorded (step 70), in that order.
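
The decision logic of steps 65 to 73 can be sketched as a small class. Several details are assumptions: the class and method names are invented, the list wraps around to the top after the last entry, and the destination is remembered at decision time rather than after the write response of step 69.

```python
import bisect


class WriteDestinationDecider:
    """Sketch of the write destination decision in FIG. 5 (names are hypothetical)."""

    def __init__(self):
        self.previous = None   # previous write destination kept in memory

    def choose(self, server_list):
        ordered = sorted(server_list)   # host name (dictionary) order
        if not ordered:
            raise ValueError("file server list is empty")
        if self.previous is None:
            # First write: take the top of the list (step 72).
            target = ordered[0]
        elif self.previous in ordered:
            # Previous destination still listed: take the next entry (step 67).
            target = ordered[(ordered.index(self.previous) + 1) % len(ordered)]
        else:
            # Previous destination removed or failed: take the first name that
            # follows it in dictionary order (step 73).
            position = bisect.bisect_right(ordered, self.previous)
            target = ordered[position % len(ordered)]
        self.previous = target   # simplified stand-in for step 70
        return target


decider = WriteDestinationDecider()
hosts = ["srv_B", "workhost", "2010srv", "srv_A"]
picks = [decider.choose(hosts) for _ in range(5)]
print(picks)
```

If the previous destination disappears from the list, the next call falls through to the dictionary-order search, so writing continues from the neighbor of the missing server rather than restarting at the top.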

FIG. 6 shows another embodiment of the writing and distribution of content pieces in the distributed content storage system. Parts with the same configuration as in FIG. 2 are given the same reference numerals.
In the distributed content storage system of this example, as in FIG. 1, a content generation server, a content distribution server, and a distributed file system (a distributed file system client, a metadata server, and a plurality of file servers) are in operation. The number of simultaneous reads from the content distribution server is assumed to be 3.

On receiving a content piece write request from the content generation server, the distributed file system client, as in the example of FIG. 2, selects the file server to which each content piece passed from the content generation server in time-series order is written. It selects from among the file servers that have headroom in resource information such as CPU utilization and free disk space (those above a certain threshold), in the order of a value that uniquely identifies each file server.

The distributed file system of the distributed content storage system in this example has a function for creating, on another file server, a copy of the content piece (original) first written to a file server. Specifically, in an environment in which the number of installed file servers is equal to or greater than the product of the number of simultaneous reads from the content distribution server (for example, 3) and the number of copies (for example, 2, counting the original), which is 6 in this case, the copy is stored on the file server whose rank is the rank (in the sorted list) of the file server storing the content piece 15 (original) plus the number of simultaneous reads (3 in the example of FIG. 6). In the example of FIG. 6, the copy of the first content piece 15 (original) stored on the file server (2010srv) is written to the file server (srv_B) three positions later in ascending order (rank 4).

With this algorithm, when the content distribution server 30 simultaneously reads three content pieces 15 that are adjacent in time series, the read destination file servers never overlap, whether the original or the copy of each piece is read.
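
The no-overlap property can be verified exhaustively for a small case. The sketch below uses six hypothetical host names (the full host list of FIG. 6 is not given in the text), places each copy at the original's rank plus the number of simultaneous reads, wrapping around the list as an assumption, and then tries every original-or-copy combination for each run of three adjacent pieces.

```python
from itertools import product


def place_with_copy(num_pieces, servers, simultaneous_reads):
    """Return (original_host, copy_host) for each piece, in time-series order."""
    count = len(servers)
    placement = []
    for i in range(num_pieces):
        original = i % count
        copy = (original + simultaneous_reads) % count   # rank + simultaneous reads
        placement.append((servers[original], servers[copy]))
    return placement


def no_read_collision(placement, simultaneous_reads):
    """True if adjacent pieces never share a server for any original/copy choice."""
    for start in range(len(placement) - simultaneous_reads + 1):
        window = placement[start:start + simultaneous_reads]
        for choice in product(*window):
            if len(set(choice)) < simultaneous_reads:
                return False
    return True


six_servers = ["srv_1", "srv_2", "srv_3", "srv_4", "srv_5", "srv_6"]   # hypothetical
placement = place_with_copy(12, six_servers, 3)
print(placement[0])
print(no_read_collision(placement, 3))
print(no_read_collision(place_with_copy(12, six_servers[:4], 3), 3))   # too few servers
```

With six servers (3 simultaneous reads times 2 copies) every combination is collision-free. With only four servers the check fails, which illustrates why the product is given as the minimum number of file servers.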

If a failure occurs on a file server 22 to which content pieces 15 have already been written, the operator can deal with it by preparing a replacement machine and relocating the content pieces to it based on the meta information for the content pieces 15 held by the metadata server 23 in the distributed file system 20.

According to the embodiments described above, when the content distribution server 30 reads the content pieces 15 N at a time, N content pieces 15 that are close in time series are stored on physically separate file servers 22. In the process of sequentially reading the content pieces 15, this prevents a plurality of the pieces being read from residing on a single file server and allows them to be read in parallel without concentrating the load on any one file server, so seamless delivery can be performed under low load.

DESCRIPTION OF SYMBOLS 10 ... Content generation server, 11 ... External storage, 15 ... Content piece, 20 ... Distributed file system, 21 ... Distributed file system client, 22 ... File server, 23 ... Metadata server, 30 ... Content distribution server, 40 ... User terminal.

Claims

1. A distributed content storage system comprising: a content generation server that fragments content to generate a plurality of content pieces; a distributed file system including a plurality of physically distributed file servers for storing the content pieces; and a content distribution server that, in response to a distribution request from a user terminal, reads N content pieces together from the file servers and distributes them to the user terminal, wherein
the distributed file system has a distributed file system client that determines, from among the plurality of file servers, N or more file servers as write destinations for the content pieces, and
the distributed file system client includes a write destination determination unit that determines the write destination file servers so that the content pieces are distributed in time-series order.

2. The distributed content storage system according to claim 1, wherein the distributed file system has a metadata server for creating a list of file servers whose resource information, such as CPU utilization and free disk space, is at or above a predetermined threshold, and
the write destination determination unit selects the file server to which a content piece is written from among the plurality of file servers based on the list acquired from the metadata server.

3. The distributed content storage system according to claim 2, wherein the list is created by arranging the file servers in order of a value that uniquely identifies each file server.

4. The distributed content storage system according to claim 3, wherein the order of the value that uniquely identifies each file server in the list is host name order, IP address order, or MAC address order.

5. The distributed content storage system according to claim 2, wherein the list is created by arranging the file servers in an arbitrary order specified by an operator.

6. The distributed content storage system according to claim 1 or 2, wherein, in an environment in which the number of installed file servers is equal to or greater than the product of the number of simultaneous reads from the content distribution server and the total number of writes on the file servers (the number of copies including the original), a copy of a content piece is stored on the file server whose rank in the list is the rank of the file server storing the content piece plus the number of simultaneous reads.