JP5410301B2

JP5410301B2 - Distributed processing system, distributed processing method, and program

Info

Publication number: JP5410301B2
Application number: JP2010000376A
Authority: JP
Inventors: 浩樹赤間; 崇毛受; 知洋長谷川; 基弘松田; 一兵衛内藤; 雅司山室
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-01-05
Filing date: 2010-01-05
Publication date: 2014-02-05
Anticipated expiration: 2030-01-05
Also published as: JP2011141587A

Description

The present invention relates to a distributed processing system, a distributed processing method, and a program for uploaded data.

When a service is provided on the Web, a single coherent piece of data can be so large that the processing needed to return a result to the user cannot be completed in real time. Examples include Japanese-to-English translation, Japanese dependency parsing, and extraction of scenes of interest from video. Specifically, in Japanese dependency parsing, there are cases in which parsing a 16,000-line (700 KB) design document takes 100 seconds. If this processing is run as a service accessed through a Web browser, the browser times out before the 100 seconds have elapsed. Conventionally, such cases have been handled as follows.

(1) Prohibit the input of large data. Very many service systems impose such a limit, but if the limit imposed on users is too strict, the service itself ends up not being used.

(2) Provide an asynchronous interface for checking results. To cover cases in which a response cannot be returned to the user within the browser timeout, a new group of functions is prepared, such as notifying the user by e-mail after processing completes. This group includes, for example, e-mail address registration, e-mail transmission, retransmission on error, and inquiry of result progress. In this case, the processing configuration becomes complicated in parts that are not essential to the system.

(3) Generate the processing result in a storage area or the like that supports user authentication, and have the user check, at a time the user judges appropriate, whether the result has been generated. In this case, because the user does not know when the result will be generated, the user must check repeatedly, which places a heavy burden on the user.

Meanwhile, the MapReduce model exists as a technique for speeding up processing by using a large number of PCs (see Non-Patent Document 1). In the MapReduce model, for example, a group of small data items (Web pages) obtained by crawling the Web is stored on a distributed file system (GFS), and processing such as morphological analysis and word counting is executed in parallel on the machines that provide the distributed file system. This makes it possible to create, for example, an inverted file in a short time for a huge group of files with a total size on the order of 10 TB.

A distributed configuration has also been proposed as a technique for improving system throughput (see Patent Document 1). In this configuration, which has a plurality of reception queues and a plurality of processing units, a plurality of data items are accepted by the reception queues and distributed among the processing units for processing, which improves the throughput of the system as a whole. This system is called a write-once reference type DMS (Data Management System), or simply DMS.

Non-Patent Document 1: IT Glossary, "MapReduce", [online], [retrieved December 20, 2009], Internet <URL: http://e-words.jp/w/MapReduce.html>

Patent Document 1: JP 2008-83808 A

However, the MapReduce model described in Non-Patent Document 1 is subject to strong constraints: before computation, time must be spent on scheduling to decide which machines will execute the distributed processing, and the data must be registered on the distributed file system. It is therefore suited to batch processing, not real-time processing, and could not be applied to speeding up the target processing in immediate response to data uploaded by a user. Moreover, the model is basically designed for cases in which many small files exist, such as Web page data. For the input of a single large file, which is what the present invention addresses, even if the data were split by size alone, it was difficult to execute distributed processing while maintaining semantic consistency around the split points.

The write-once reference type DMS described in Patent Document 1 aims to improve the throughput of the system as a whole when a plurality of data items are accepted. As it stands, it could not speed up the response for a single large uploaded data item.

Thus, although techniques existed for shortening batch processing time and for improving processing throughput for a plurality of data items that have already been divided, it was not possible to respond immediately to a single large data item uploaded by a user, execute distributed processing while maintaining the semantic consistency of the data, and shorten the response time of the target processing.

The present invention has been made in view of this background, and its object is to provide a distributed processing system, a distributed processing method, and a program that can speed up the response for a single large data item.

To solve the above problem, the invention according to claim 1 is a distributed processing system comprising: (1) a reception response device that accepts upload data from a user terminal via a communication network and returns result data of data processing on the accepted upload data to the user terminal synchronously or asynchronously; (2) a division and integration device that divides the upload data acquired from the reception response device to generate a plurality of fragment data items, and integrates a plurality of fragment result data items, which are the results of data processing on the generated fragment data items, to generate the result data; (3) one or more queue management devices that acquire the fragment data from the division and integration device, store it as a queue, and transmit the fragment data to one of a plurality of processing devices in response to a request from that processing device; and (4) the plurality of processing devices, which acquire the fragment data from the queue management device, perform data processing on the acquired fragment data, and transmit the result to the division and integration device as the fragment result data.

The reception response device comprises: a data reception unit that accepts the upload data from the user terminal via the communication network; a file processing unit that acquires the upload data from the data reception unit, converts it into a file, stores the filed data in a data accumulation unit, and then generates a start instruction message for starting the division and integration device; and a transmission control unit that transmits the start instruction message generated by the file processing unit and the filed data to the division and integration device, and transmits the result data acquired from the division and integration device to the user terminal via the communication network.

The division and integration device comprises: a data reading unit that reads the data transmitted from the reception response device sequentially from the beginning of the data; a fragment data generation unit that determines the type of the data read by the data reading unit and, according to the determined type, generates the plurality of fragment data items by cutting the data out in units that the processing devices can process; a data management unit that assigns a sequential ID to each fragment data item in the order in which the fragment data generation unit generates it and, each time a fragment data item with a sequential ID is generated, transmits that item to one of the one or more queue management devices; a fragment result data acquisition unit that receives the fragment result data transmitted from the plurality of processing devices and determines, based on the sequential IDs assigned to the fragment result data in correspondence with the IDs assigned to the fragment data, whether all the fragment result data have been acquired; a fragment result data integration unit that, when the fragment result data acquisition unit determines that all the fragment result data have been acquired, integrates the fragment result data in ID order using the sequential IDs to generate the result data; a data storage unit that stores the result data generated by the fragment result data integration unit; and a result data output unit that transmits the result data stored in the data storage unit to the reception response device.

The queue management device comprises: a queue reception unit that accepts the fragment data transmitted from the division and integration device and stores it as a queue in a queue storage unit; a queue management unit that transmits the fragment data stored in the queue storage unit to a processing device in response to an acquisition request for the fragment data from that processing device; and the queue storage unit, which stores the fragment data accepted by the queue reception unit.

The processing device comprises: a fragment data acquisition unit that transmits the fragment data acquisition request to the queue management device and acquires the fragment data from it; a data processing unit that executes predetermined data processing on the fragment data acquired by the fragment data acquisition unit; and a fragment result data output unit that assigns, to the fragment result data resulting from the data processing, a sequential ID corresponding to the ID assigned to the fragment data, and transmits the fragment result data to the division and integration device.

The invention according to claim 6 is a distributed processing method used in a distributed processing system comprising: (1) a reception response device that accepts upload data from a user terminal via a communication network and returns result data of data processing on the accepted upload data to the user terminal synchronously or asynchronously; (2) a division and integration device that divides the upload data acquired from the reception response device to generate a plurality of fragment data items, and integrates a plurality of fragment result data items, which are the results of data processing on the generated fragment data items, to generate the result data; (3) one or more queue management devices that acquire the fragment data from the division and integration device, store it as a queue, and transmit the fragment data to one of a plurality of processing devices in response to a request from that processing device; and (4) the plurality of processing devices, which acquire the fragment data from the queue management device, perform data processing on the acquired fragment data, and transmit the result to the division and integration device as the fragment result data.

The reception response device, which includes a data accumulation unit in which the upload data is filed and stored, executes the steps of: accepting the upload data from the user terminal via the communication network; acquiring the accepted upload data, converting it into a file, storing the filed data in the data accumulation unit, and then generating a start instruction message for starting the division and integration device; and transmitting the generated start instruction message and the filed data to the division and integration device.

The division and integration device executes the steps of: reading the data transmitted from the reception response device sequentially from the beginning of the data; determining the type of the read data and, according to the determined type, generating the plurality of fragment data items by cutting the data out in units that the processing devices can process; and assigning a sequential ID to each fragment data item in the order in which it is generated and, each time a fragment data item with a sequential ID is generated, transmitting that item to one of the one or more queue management devices.

The queue management device, which includes a queue storage unit in which the fragment data transmitted from the division and integration device is stored as a queue, executes the steps of: accepting the fragment data transmitted from the division and integration device and storing it as a queue in the queue storage unit; and transmitting the fragment data stored in the queue storage unit to a processing device in response to an acquisition request for the fragment data from that processing device.

The processing device executes the steps of: transmitting the fragment data acquisition request to the queue management device and acquiring the fragment data from it; executing predetermined data processing on the acquired fragment data; and assigning, to the fragment result data resulting from the data processing, a sequential ID corresponding to the ID assigned to the fragment data, and transmitting the fragment result data to the division and integration device.

The division and integration device, which includes a data storage unit in which the result data obtained by integrating the fragment result data is stored, further executes the steps of: receiving the fragment result data transmitted from the plurality of processing devices and determining, based on the sequential IDs assigned to the fragment result data in correspondence with the IDs assigned to the fragment data, whether all the fragment result data have been acquired; when it is determined that all the fragment result data have been acquired, integrating the fragment result data in ID order using the sequential IDs to generate the result data; storing the generated result data in the data storage unit; and transmitting the result data stored in the data storage unit to the reception response device.

Finally, the reception response device executes the step of transmitting the result data acquired from the division and integration device to the user terminal via the communication network.

In this way, according to the present invention, even for a single large data item, preparing a plurality of processing devices makes it possible to speed up the response to a processing request from the user terminal. The result can therefore be returned to the user terminal within the browser timeout. It also becomes possible to omit building ancillary systems (e-mail address registration, e-mail transmission, and the like) for notifying the user of the processing result after the fact.
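The completeness check and ID-ordered integration described in claims 1 and 6 can be illustrated with a minimal Python sketch. The class name `FragmentResultCollector`, its methods, and the assumption that the total number of fragments is known once division has finished are all hypothetical; the source states only that sequential IDs are used to decide whether every fragment result has arrived and to order the merge.

```python
class FragmentResultCollector:
    """Hypothetical stand-in for the fragment result acquisition and integration units."""

    def __init__(self, expected_count: int) -> None:
        # Assumption: the number of generated fragments is known after division.
        self.expected_count = expected_count
        self._results: dict[int, str] = {}

    def add(self, fragment_id: int, result: str) -> None:
        # Results may arrive in any order because processing devices work independently.
        self._results[fragment_id] = result

    def is_complete(self) -> bool:
        # All sequential IDs 0 .. expected_count - 1 must be present.
        return all(i in self._results for i in range(self.expected_count))

    def merge(self) -> str:
        if not self.is_complete():
            raise RuntimeError("some fragment results are still missing")
        # Integrate strictly in ID order, regardless of arrival order.
        return "".join(self._results[i] for i in range(self.expected_count))


collector = FragmentResultCollector(expected_count=3)
collector.add(2, "C")
collector.add(0, "A")
still_waiting = not collector.is_complete()
collector.add(1, "B")
merged = collector.merge()
```

Keying the results by ID rather than appending them to a list is what lets the merge ignore arrival order.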

The invention according to claim 2 is the distributed processing system according to claim 1, wherein the reception response device further comprises a sequential file processing unit that: when the data reception unit starts accepting the upload data, generates the start instruction message and causes it to be transmitted to the division and integration device via the transmission control unit; divides the upload data being accepted by the data reception unit into files each time a predetermined amount is acquired; stores the filed data sequentially in the data accumulation unit; and, each time data is stored in the data accumulation unit, causes that data to be transmitted to the division and integration device via the transmission control unit. In the division and integration device, each time the data reading unit reads the data from the reception response device, it passes the data to the fragment data generation unit, and the fragment data generation unit generates the fragment data.

In this way, according to the present invention, the upload of data from the user terminal to the reception response device, the generation of fragment data by the division and integration device, and the transmission of fragment data to the queue management devices can be executed in parallel, so the response time can be shortened further.
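The pipelining effect of claim 2 can be sketched as an incremental splitter that emits a fragment as soon as enough data and a sentence boundary have arrived, without waiting for the whole upload. This is only an illustration under assumptions: the function name `stream_fragments`, the character-based `limit`, and the in-memory buffer are hypothetical and do not appear in the source.

```python
from typing import Iterable, Iterator, Tuple


def stream_fragments(
    chunks: Iterable[str], limit: int, delimiter: str = "。\n"
) -> Iterator[Tuple[int, str]]:
    """Yield (sequential_id, fragment) pairs while upload chunks are still arriving."""
    buffer = ""
    next_id = 0
    for chunk in chunks:
        buffer += chunk
        while len(buffer) >= limit:
            # Prefer the last boundary inside the size limit ...
            pos = buffer.rfind(delimiter, 0, limit)
            if pos == -1:
                # ... otherwise take the first boundary beyond it.
                pos = buffer.find(delimiter)
            if pos == -1:
                break  # no boundary received yet; wait for the next chunk
            end = pos + len(delimiter)
            yield next_id, buffer[:end]
            next_id += 1
            buffer = buffer[end:]
    if buffer:
        # Whatever remains after the upload ends becomes the final fragment.
        yield next_id, buffer


upload = ["一行目。\n二", "行目です。\n三行", "目。\n"]
streamed = list(stream_fragments(upload, limit=6))
```

Because the function is a generator, a consumer can forward fragment 0 to a queue while later chunks of the same upload are still being received.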

The invention according to claim 3 is the distributed processing system according to claim 1 or claim 2, wherein the fragment data generation unit of the division and integration device further comprises a Japanese text fragment data generation unit that, when the type of the data read by the data reading unit is Japanese text data, reads the data up to a predetermined amount, searches forward or backward from the end of the read data for a Japanese full stop character ("。") followed by a line feed code, and cuts out the data up to the detected point as fragment data that the processing device can process.

In this way, when Japanese text data is uploaded to the reception response device, the Japanese text fragment data generation unit of the division and integration device can generate fragment data, cut out in units that the processing device can process, by using the data size and searching for a full stop character followed by a line feed code.
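As a rough illustration of this boundary search, the following Python sketch reads up to a size limit, looks backward from that point for "。" followed by a line feed, and falls back to a forward search when the window contains no boundary. The function names, the character-based limit, and the choice to try backward before forward are assumptions; the source says only that the search goes "forward or backward" from the end of the read data.

```python
def cut_fragment(text: str, start: int, limit: int, delimiter: str = "。\n") -> int:
    """Return the exclusive end index of the fragment that begins at `start`."""
    tentative_end = min(start + limit, len(text))
    if tentative_end == len(text):
        return tentative_end  # the remaining data fits in one fragment
    # Backward search: last boundary fully inside the size window.
    pos = text.rfind(delimiter, start, tentative_end)
    if pos == -1:
        # Forward search: first boundary past the window.
        pos = text.find(delimiter, start)
    if pos == -1:
        return len(text)  # no boundary at all; keep the rest together
    return pos + len(delimiter)


def split_text(text: str, limit: int, delimiter: str = "。\n") -> list[str]:
    """Split `text` into fragments that always end on a sentence boundary."""
    fragments = []
    start = 0
    while start < len(text):
        end = cut_fragment(text, start, limit, delimiter)
        fragments.append(text[start:end])
        start = end
    return fragments


sample = "一行目。\n二行目です。\n三行目。\n"
by_backward_search = split_text(sample, limit=8)
by_forward_search = split_text(sample, limit=3)
```

Passing `delimiter=".\n"` gives the corresponding behaviour for English text, where the boundary is a period followed by a line feed.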

The invention according to claim 4 is the distributed processing system according to any one of claims 1 to 3, wherein the fragment data generation unit of the division and integration device further comprises an English text fragment data generation unit that, when the type of the data read by the data reading unit is English text data, reads the data up to a predetermined amount, searches forward or backward from the end of the read data for a period followed by a line feed code, and cuts out the data up to the detected point as fragment data that the processing device can process.

In this way, when English text data is uploaded to the reception response device, the English text fragment data generation unit of the division and integration device can generate fragment data, cut out in units that the processing device can process, by using the data size and searching for a period followed by a line feed code.

The invention according to claim 5 is the distributed processing system according to any one of claims 1 to 4, wherein the fragment data generation unit of the division and integration device further comprises a video fragment data generation unit that, when the type of the data read by the data reading unit is video data organized in GOP (Group Of Pictures) units, cuts out the video data every predetermined number of GOPs to generate fragment data.

In this way, when a single large video data item is input, preparing a plurality of processing devices makes it possible to shorten the response time to a processing request from the user.
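Cutting video every predetermined number of GOPs amounts to fixed-size grouping of an already segmented sequence. The sketch below assumes, hypothetically, that the video has been parsed into a list of GOP byte strings beforehand; how GOP boundaries are located in a real stream is outside what the source describes, and the name `group_gops` is illustrative.

```python
from typing import Iterator, List, Sequence


def group_gops(gops: Sequence[bytes], gops_per_fragment: int) -> Iterator[List[bytes]]:
    """Yield fragments, each holding at most `gops_per_fragment` consecutive GOPs."""
    if gops_per_fragment < 1:
        raise ValueError("gops_per_fragment must be at least 1")
    for i in range(0, len(gops), gops_per_fragment):
        # The last fragment may be shorter when the count does not divide evenly.
        yield list(gops[i:i + gops_per_fragment])


# Placeholder payloads standing in for seven encoded GOPs.
gops = [f"gop{i}".encode() for i in range(7)]
video_fragments = list(group_gops(gops, gops_per_fragment=3))
```

Splitting only at GOP boundaries keeps each fragment independently decodable, which plays the same role for video that sentence boundaries play for text.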

The invention according to claim 7 is a program for causing a computer to execute the distributed processing method according to claim 6.

With such a program, the distributed processing method according to claim 6 can be implemented on a general-purpose computer.

According to the present invention, it is possible to provide a distributed processing system, a distributed processing method, and a program that can speed up the response for a single large data item.

FIG. 1 is a functional block diagram showing a configuration example of the distributed processing system according to the first embodiment of the present invention.
FIG. 2 is a sequence diagram showing the flow of distributed processing by the distributed processing system according to the first embodiment of the present invention.
FIG. 3 is a flowchart showing the flow of the fragment data generation processing performed by the text fragment data generation unit of the division and integration device according to the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of Japanese text data input to the distributed processing system according to the first embodiment of the present invention.
FIG. 5 is a diagram showing an example of result data generated by the distributed processing system according to the first embodiment of the present invention.
FIG. 6 is a diagram showing an example of fragment data generated by the text fragment data generation unit of the division and integration device according to the first embodiment of the present invention.
FIG. 7 is a sequence diagram showing the flow of distributed processing by the distributed processing system according to the second embodiment of the present invention.
FIG. 8 is a flowchart showing the flow of the fragment data generation processing performed by the video fragment data generation unit of the division and integration device according to the third embodiment of the present invention.
FIG. 9 is a configuration diagram for explaining a modification of the distributed processing system according to the embodiments of the present invention.

Next, modes for carrying out the present invention (hereinafter, "embodiments") are described in detail with reference to the drawings as appropriate.

<< First Embodiment >>
FIG. 1 is a functional block diagram showing a configuration example of a distributed processing system 1 according to the first embodiment of the present invention.

First, an overview of distributed processing in the distributed processing system 1 according to the first embodiment of the present invention is given with reference to FIG. 1. The reception response device 10 accepts a data upload from the user terminal 2 via the network 3. The uploaded data is, for example, a single large text data item such as that shown in FIG. 4, described later. The reception response device 10 transmits the accepted single data item to the division and integration device 20. Using a predetermined data size and a pattern search, the division and integration device 20 cuts the data out in units that the processing devices can process, generates fragment data (see FIG. 6, described later), and transmits the fragment data to one or more queue management devices 30. Each of the plurality of processing devices 40 then receives fragment data from a queue management device 30, executes data processing (for example, Japanese-to-English translation), and generates fragment result data. The division and integration device 20, having acquired the fragment result data from the processing devices 40, integrates all the fragment result data to generate result data (see FIG. 5, described later) and transmits it to the user terminal 2 via the reception response device 10.

In this way, the division integration device 20 divides the single uploaded piece of data, the plurality of processing devices 40 process the fragments in a distributed manner, and the division integration device 20 then integrates the results. The distributed processing system 1 according to the first embodiment of the present invention can thereby shorten the response time for uploaded data.
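The divide, distribute, and integrate flow outlined above can be illustrated with a minimal single-process sketch. It is not the patented implementation: the thread pool stands in for the separate processing devices 40, the function names are hypothetical, and `process_fragment` is a placeholder for real processing such as translation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

DELIMITER = "。\n"  # sentence-ending period followed by a line feed


def split_into_fragments(text):
    """Cut the text after every delimiter so each fragment holds whole sentences."""
    parts = text.split(DELIMITER)
    # Every part except the last was followed by a delimiter in the source text.
    fragments = [part + DELIMITER for part in parts[:-1]]
    if parts[-1]:
        fragments.append(parts[-1])
    return fragments


def process_fragment(item):
    """Hypothetical stand-in for the real data processing (e.g. translation)."""
    fragment_id, fragment = item
    return fragment_id, "[processed]" + fragment


def run_pipeline(text, workers=3):
    """Split the text, process fragments concurrently, and merge by ID."""
    fragments = list(enumerate(split_into_fragments(text), start=1))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_fragment, item) for item in fragments]
        # Results arrive in completion order, not in ID order.
        results = [future.result() for future in as_completed(futures)]
    # Merge in ID order, regardless of which worker finished first.
    results.sort(key=lambda pair: pair[0])
    return "".join(body for _, body in results)
```

Because each fragment carries its ID through processing, the merge step does not depend on the order in which the workers finish.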

Next, the configuration of the distributed processing system 1 according to the first embodiment of the present invention will be specifically described.
The distributed processing system 1 includes: a reception response device 10 that receives data (upload data) from the user terminal 2 via the network 3; a division integration device 20 that acquires the data received by the reception response device 10, divides the data to generate fragment data, and integrates the processed fragment result data; a plurality of processing devices 40 (40a, 40b, 40c, ...) that acquire fragment data and perform data processing; and one or more queue management devices 30 (30a, 30b, ...) that acquire the fragment data output from the division integration device 20, store it as a queue, and transmit fragment data to a processing device 40 in response to a request from that processing device 40.

Note that (1) the reception response device 10 and the division integration device 20, (2) the division integration device 20 and both the queue management devices 30 (30a, 30b, ...) and the processing devices 40 (40a, 40b, 40c, ...), and (3) the queue management devices 30 (30a, 30b, ...) and the processing devices 40 (40a, 40b, 40c) are each connected by a network or dedicated line (not shown). It suffices that there be one or more queue management devices 30 (30a, 30b, ...) and two or more processing devices 40 (40a, 40b, 40c, ...).

<Reception response device>
The reception response device 10 receives a single piece of data (upload data) transmitted from the user terminal 2 via the network 3. The reception response device 10 includes a data reception unit 11, a file processing unit 12, a transmission control unit 13, and a data storage unit 14. Although not illustrated, the reception response device 10 is realized by a computer including an input unit that handles input of various data, an output unit that handles output, a CPU (Central Processing Unit), and a communication interface.

The data reception unit 11 receives a single piece of data (upload data) from the user terminal 2 via the network 3 or an input unit (not shown), and passes the received data to the file processing unit 12.

The file processing unit 12 stores the data acquired from the data reception unit 11 as a file in the data storage unit 14. When the data upload is complete, the file processing unit 12 transmits an activation instruction message to the division integration device 20 via the transmission control unit 13. The file processing unit 12 then acquires the processed and integrated result data from the division integration device 20 and transmits that result data to the user terminal 2 via the transmission control unit 13. The sequential file processing unit 121 shown in FIG. 1 within the file processing unit 12 is provided in the distributed processing system 1a according to the second embodiment described later, so its description is omitted here.

The transmission control unit 13, on instruction from the file processing unit 12, controls transmission of the data stored in the data storage unit 14 to the division integration device 20. The transmission control unit 13 also controls transmission of the result data acquired by the file processing unit 12 to the user terminal 2.

The data storage unit 14 is a storage means that stores the data received by the data reception unit 11, and is realized by a RAM (Random Access Memory), an HDD (Hard Disk Drive), or the like.

The data reception unit 11, the file processing unit 12, and the transmission control unit 13 are realized either by program execution on the CPU or by dedicated hardware. When they are realized by program execution on the CPU, the programs are stored in the data storage unit 14.

<Division integration device>
The division integration device 20 cuts the data received from the reception response device 10 into units that a processing device can process, and generates fragment data. The division integration device 20 then integrates the fragment result data produced by the processing devices 40 (40a, 40b, 40c, ...) to generate a single piece of processed result data. The division integration device 20 includes a data reading unit 21, a fragment data generation unit 22, a data management unit 23, a fragment result data acquisition unit 24, a fragment result data integration unit 25, a result data output unit 26, and a data storage unit 27. Although not illustrated, the division integration device 20 is realized by a computer including an input unit that handles input of various data, an output unit that handles output, and a CPU.

The data reading unit 21 reads the data transmitted by the reception response device 10 and passes the read data to the fragment data generation unit 22.

The fragment data generation unit 22 determines the type of the data passed to it and generates fragment data by cutting the data into units that a processing device can process. Here, cutting into processable units means, for example, that if the data is Japanese text data, it is cut at sentence breaks, delimited by a period and a line feed, at which the text can be understood as meaningful Japanese (see FIGS. 3 and 6). For video data such as mpeg2, described later, it means cutting in units of video scenes, each made up of several consecutive GOPs (Group Of Pictures), the smallest structural unit of a moving image (see FIG. 8).

The fragment data generation unit 22 includes a text fragment data generation unit 221 and a video fragment data generation unit 222. The video fragment data generation unit 222 is provided in the distributed processing system 1b according to the third embodiment described later, so its description is omitted here.

When the data type is text data, the text fragment data generation unit 221 first reads a predetermined number of bytes. Starting from the last point of the data read, it then searches forward (or backward) for a period character followed by a line feed code, and cuts out the data up to the detected point as fragment data, that is, as a portion that a processing device can process. In this embodiment, forward means proceeding toward data that will be read later than a given point, and backward means proceeding toward data that was read earlier than that point. The fragment data generation processing performed by the text fragment data generation unit 221 is described in detail with reference to FIG. 3 later.
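The difference between the forward and backward searches can be sketched as two small helpers. This is an illustration only: it assumes the text is UTF-8 encoded and already held in memory as `bytes`, and the function names are hypothetical.

```python
PATTERN = "。\n".encode("utf-8")  # period character followed by a line feed code


def find_cut_forward(data: bytes, start: int) -> int:
    """Return the offset just past the first pattern starting at or after `start`.

    Returns -1 when no such pattern exists.
    """
    pos = data.find(PATTERN, start)
    return -1 if pos == -1 else pos + len(PATTERN)


def find_cut_backward(data: bytes, start: int) -> int:
    """Return the offset just past the last pattern ending at or before `start`.

    Returns -1 when no such pattern exists.
    """
    pos = data.rfind(PATTERN, 0, start)
    return -1 if pos == -1 else pos + len(PATTERN)
```

A forward search yields fragments at least as long as the predetermined size, while a backward search keeps them at or below it. Either way the cut lands immediately after a complete sentence, even if `start` falls in the middle of a multi-byte character.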

The data management unit 23 assigns a sequential ID to each piece of fragment data in the order in which the fragment data generation unit 22 generates it, and attaches to the final piece of fragment data cut out by the fragment data generation unit 22 a flag identifying it as the last (hereinafter, the "last flag"). The data management unit 23 then transmits the ID-tagged fragment data to a queue management device 30. When there are a plurality of queue management devices 30, the fragment data may be distributed among them (30a, 30b, ...) using a distribution algorithm such as round robin, taking into account load balancing of the reception queues. This reduces the processing load on each queue management device 30 and prevents the queue of pending fragments from stalling.
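As a rough sketch of this tagging and distribution step, the following assumes the fragments are already available as a list and that each queue is a plain Python list. The `Fragment` class, the field names, and IDs starting at 1 are illustrative assumptions, not details from the specification.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Fragment:
    fragment_id: int       # sequential ID in order of generation
    body: str
    is_last: bool = False  # the "last flag"


def tag_and_distribute(bodies, queues):
    """Assign sequential IDs, flag the final fragment, and spread over queues."""
    bodies = list(bodies)
    targets = cycle(queues)  # round robin over the available queues
    for fragment_id, body in enumerate(bodies, start=1):
        fragment = Fragment(fragment_id, body, is_last=(fragment_id == len(bodies)))
        next(targets).append(fragment)
```

With two queues and three fragments, fragments 1 and 3 land in the first queue and fragment 2 in the second. In the described system the fragments are produced while the data is still being read, so the last flag would be set when the end of the data is reached rather than from a known list length.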

The fragment result data acquisition unit 24 acquires fragment result data from the processing devices 40 (40a, 40b, 40c, ...) and stores it in the data storage unit 27, using as the file name the ID that the data management unit 23 assigned to the fragment data and that was carried over to the fragment result data. The fragment result data acquisition unit 24 also uses the IDs attached to the fragment result data to determine whether the fragment result data corresponding to all of the fragment data has been acquired.

Once the fragment result data acquisition unit 24 has acquired all of the fragment result data, the fragment result data integration unit 25 merges the pieces in the order of their IDs, thereby integrating them and generating the result data. The fragment result data integration unit 25 also stores the generated result data in the data storage unit 27.
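The ID-ordered merge can be sketched as follows, assuming the stored fragment results are represented as a dictionary from file name (the ID as a decimal string) to contents. That representation and the function name are illustrative assumptions.

```python
def merge_results(stored):
    """Concatenate stored fragment results in ascending numeric ID order.

    `stored` maps a file name (the fragment ID as a string) to its contents.
    """
    ordered_names = sorted(stored, key=int)  # "10" must come after "2"
    return "".join(stored[name] for name in ordered_names)
```

Sorting by the integer value matters when IDs are used as file names, because a plain string sort would place "10" before "2".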

When the fragment result data integration unit 25 has generated the result data, the result data output unit 26 generates a data processing completion notification and transmits the result data to the reception response device 10 together with that notification.

The data storage unit 27 is a storage means that stores the data acquired from the reception response device 10, the fragment data generated by the fragment data generation unit 22, the fragment result data acquired by the fragment result data acquisition unit 24, the result data generated by the fragment result data integration unit 25, and the like, and is realized by a RAM, an HDD, or the like.

The data reading unit 21, the fragment data generation unit 22, the data management unit 23, the fragment result data acquisition unit 24, the fragment result data integration unit 25, and the result data output unit 26 are realized either by program execution on the CPU or by dedicated hardware. When they are realized by program execution on the CPU, the programs are stored in the data storage unit 27.

<Queue management device>
The queue management devices 30 (30a, 30b, ...) manage and store the fragment data transmitted from the division integration device 20 as a queue and, in response to a request from a processing device 40 (40a, 40b, 40c, ...), transmit fragment data to the requesting processing device 40. Each queue management device 30 includes a queue reception unit 31, a queue management unit 32, and a queue storage unit 33. Although not illustrated, the queue management device 30 is realized by a computer including an input unit that handles input of various data, an output unit that handles output, and a CPU.

The queue reception unit 31 acquires the ID-tagged fragment data transmitted from the division integration device 20 and stores it as a queue in the queue storage unit 33. When a processing device 40 (40a, 40b, 40c, ...) requests fragment data, the queue management unit 32 transmits fragment data stored in the queue storage unit 33 to that processing device 40.

The queue storage unit 33 is a storage means that holds the fragment data acquired by the queue reception unit 31, and is realized by a RAM, an HDD, or the like.
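The roles of the three units can be pictured with a small in-memory sketch. The class and method names are hypothetical, and first-in, first-out delivery is an assumption: the text says only that fragments are stored as a queue.

```python
from collections import deque


class FragmentQueue:
    """Stores fragments in arrival order and hands them out on request."""

    def __init__(self):
        self._items = deque()  # plays the role of the queue storage unit

    def accept(self, fragment):
        """Queue reception: store an incoming ID-tagged fragment."""
        self._items.append(fragment)

    def request(self):
        """Queue management: return the oldest fragment, or None if empty."""
        return self._items.popleft() if self._items else None
```

Because processing devices pull fragments on request rather than having work pushed to them, a slow device simply asks less often and does not hold up the others.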

<Processing device>
The processing device 40 acquires fragment data from a queue management device 30 and executes predetermined data processing on it. The processing device 40 includes a queue selection unit 41, a fragment data acquisition unit 42, a data processing unit 43, and a fragment result data output unit 44. Although not illustrated, the processing device 40 is realized by a computer including an input unit that handles input of various data, an output unit that handles output, a storage unit that stores various data, and a CPU.

The queue selection unit 41 selects the queue management device 30 from which fragment data is to be acquired. The selection is made using, for example, a distribution algorithm such as round robin.

The fragment data acquisition unit 42 transmits a fragment data acquisition request to the queue management device 30 selected by the queue selection unit 41 and acquires fragment data from that queue management device 30.

The data processing unit 43 executes predetermined data processing, such as Japanese-English translation processing or Japanese dependency analysis processing, on the fragment data acquired by the fragment data acquisition unit 42. For the video data described later, it executes cut point detection processing, scene-of-interest detection processing, or the like.

The fragment result data output unit 44 outputs the fragment result data, which is the result of the data processing by the data processing unit 43, to the division integration device 20. The sequential ID that the data management unit 23 of the division integration device 20 assigned to the fragment data is carried over unchanged to the fragment result data.
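The four units of a processing device can be sketched as a single worker loop. The sketch makes several assumptions: each queue is a `deque` of `(fragment_id, body)` tuples, `process` and `emit` are hypothetical callables, and the loop stops once every queue has been found empty in turn, whereas a real device would presumably keep polling.

```python
from collections import deque
from itertools import cycle


def run_worker(queues, process, emit):
    """Poll the queues in round-robin order until all of them are empty.

    queues:  list of deques holding (fragment_id, body) tuples
    process: function applied to each fragment body
    emit:    callback receiving (fragment_id, result_body)
    """
    order = cycle(range(len(queues)))      # queue selection (round robin)
    empty_in_a_row = 0
    while empty_in_a_row < len(queues):
        queue = queues[next(order)]
        if not queue:
            empty_in_a_row += 1
            continue
        empty_in_a_row = 0
        fragment_id, body = queue.popleft()  # fragment data acquisition
        emit(fragment_id, process(body))     # the ID is carried over unchanged
```

Passing the ID through untouched is what later allows the division integration device to restore the original order without knowing which worker handled which fragment.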

The reception response device 10, the division integration device 20, the queue management devices 30 (30a, 30b, ...), and the processing devices 40 (40a, 40b, 40c, ...) described above may each be realized by a separate computer (device), as shown in FIG. 1, or may be combined and realized as a single computer.

<Operation procedure>
Next, the operation procedure of the distributed processing system 1 according to the first embodiment of the present invention will be described in detail using FIG. 2, with reference to FIG. 1. FIG. 2 is a sequence diagram showing the flow of distributed processing by the distributed processing system 1 according to the first embodiment of the present invention.

First, when data is transmitted from the user terminal 2 via the network 3 (step S101), the data reception unit 11 of the reception response device 10 receives the data (step S102). The file processing unit 12 of the reception response device 10 then determines whether the data reception unit 11 has finished receiving the data (step S103). If reception is not complete, that is, if data is still being received (step S103 → No), the process returns to step S102 and the data reception unit 11 continues receiving. When reception is complete (step S103 → Yes), the file processing unit 12 stores the data received by the data reception unit 11 in the data storage unit 14 as a single file.

When the data reception (upload) is complete, the file processing unit 12 transmits an activation instruction message to the division integration device 20 via the transmission control unit 13. The transmission control unit 13 then transmits the data stored in the data storage unit 14 to the division integration device 20 (step S104).

Next, the division integration device 20, activated by receiving the activation instruction message from the reception response device 10, starts reading the data with the data reading unit 21 (step S105). The data reading unit 21 passes the data to the fragment data generation unit 22 as it reads.

Subsequently, the fragment data generation unit 22 determines the type of the data passed to it and performs fragment data generation processing, cutting the data into units that a processing device can process (step S106). Details of this processing are described with reference to FIG. 3 later.

Each time the fragment data generation unit 22 generates fragment data, the data management unit 23 receives it and assigns a sequential ID in order of generation, and attaches the last flag, which identifies the final fragment, to the last piece of fragment data cut out by the fragment data generation unit 22. When there are a plurality of queue management devices 30, the data management unit 23 determines the destination queue management device 30 using round robin or the like and transmits the fragment data to it (step S107). The data management unit 23 then determines whether all of the data read by the data reading unit 21 has been cut out and all of the fragment data has been transmitted (step S108). If data is still being read and not all fragment data has been transmitted (step S108 → No), the process returns to step S105 and the data reading unit 21 continues reading. If all of the fragment data has been transmitted (step S108 → Yes), the processing ends.

Next, the queue reception unit 31 of the queue management device 30 receives the ID-tagged fragment data from the division integration device 20 (step S109) and stores it as a queue in the queue storage unit 33. In response to a fragment data acquisition request from a processing device 40 (40a, 40b, 40c, ...), the queue management unit 32 transmits the fragment data stored in the queue storage unit 33 to that processing device 40 (step S110).

Subsequently, the queue selection unit 41 of the processing device 40 selects the queue management device 30 from which fragment data is to be acquired, using round robin or the like (step S111). The fragment data acquisition unit 42 of the processing device 40 transmits a fragment data acquisition request to the queue management device 30 selected by the queue selection unit 41 and acquires fragment data from that queue management device 30 (step S112). Next, the data processing unit 43 executes predetermined data processing on the fragment data acquired by the fragment data acquisition unit 42 (step S113). The fragment result data output unit 44 then transmits the fragment result data, which is the processing result of the data processing unit 43, to the division integration device 20 (step S114).

For example, the processing device 40 acquires fragment data of Japanese text from the queue management device 30 and performs Japanese-English translation processing in the data processing unit 43. It then outputs the resulting fragment result data to the division integration device 20 with the ID of the fragment data still attached. The other processing devices 40 perform the same processing, and the fragment result data processed by each processing device 40 (40a, 40b, 40c, ...) is transmitted to the division integration device 20.

In this way, time-consuming data processing such as Japanese-English translation is divided into a plurality of pieces of fragment data and executed in a distributed manner across the plurality of processing devices 40, which shortens the overall processing time.

Next, the fragment result data acquisition unit 24 of the division integration device 20 acquires fragment result data from each processing device 40 (40a, 40b, 40c, ...) (step S115). The fragment result data integration unit 25 saves the fragment result data acquired by the fragment result data acquisition unit 24 as a file, using the attached ID as the file name or the like, and stores it in the data storage unit 27. The fragment result data integration unit 25 also determines whether the last flag is attached to the acquired fragment result data. Whenever it acquires fragment data after having acquired the fragment result data bearing the last flag, the fragment result data integration unit 25 saves that fragment data as a file named by its ID or the like and determines whether all of the fragment data has been acquired (step S116).

If some fragment data has not yet been acquired (step S116 → No), the process returns to step S115 and acquisition continues. If all of the fragment data has been acquired (step S116 → Yes), the acquisition processing ends, and the fragment result data integration unit 25 merges the files in the order of their IDs to generate result data integrating the fragment result data (step S117), and stores the generated result data in the data storage unit 27.
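The completeness check in steps S115 and S116 can be illustrated with a small collector. It assumes that IDs run from 1 up to the ID of the fragment bearing the last flag with no gaps. The class and method names are hypothetical, and results are held in memory rather than in files.

```python
class ResultCollector:
    """Tracks fragment results and reports when every ID has arrived."""

    def __init__(self):
        self.results = {}    # fragment ID -> result body
        self.last_id = None  # known once the flagged result arrives

    def add(self, fragment_id, body, is_last=False):
        """Record one fragment result, noting the last flag if present."""
        self.results[fragment_id] = body
        if is_last:
            self.last_id = fragment_id

    def is_complete(self):
        """True once the flagged result and every earlier ID are present."""
        if self.last_id is None:
            return False
        # Assumes IDs run 1..last_id with no gaps.
        return all(i in self.results for i in range(1, self.last_id + 1))
```

Because processing devices finish at different times, the result carrying the last flag may arrive before earlier ones. The flag reveals the total number of fragments, and the sequential IDs reveal which ones are still outstanding.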

Subsequently, the result data output unit 26 generates a processing completion notification and transmits it to the reception response device 10 (step S118). The data reception unit 11 of the reception response device 10, having received the processing completion notification, transmits it to the user terminal 2 via the transmission control unit 13 (that is, the result data is returned to the user terminal 2 asynchronously) (step S119). Alternatively, in step S118 the result data output unit 26 of the division integration device 20 may transmit the result data to the reception response device 10 together with the processing completion notification, and the reception response device 10 may forward it to the user terminal 2 (that is, the result data is returned to the user terminal 2 synchronously). The processing ends when the user terminal 2 receives the processing completion notification (result data) (step S120).

(Fragment data generation processing)
Next, the fragment data generation processing performed by the fragment data generation unit 22 of the division integration device 20 in step S106 of FIG. 2 will be described. FIG. 3 is a flowchart showing the flow of the fragment data generation processing performed by the text fragment data generation unit 221 of the division integration device 20 according to the first embodiment of the present invention. In this embodiment, the description assumes that the Japanese text data shown in FIG. 4 is uploaded and that the processing devices 40 convert it into the English text data shown in FIG. 5 by Japanese-English translation processing. In this case, the text fragment data generation unit 221 provided in the fragment data generation unit 22 of the division integration device 20 generates fragment data by cutting the data into units that a processing device can process, based on the data size and a data pattern. For example, after reading a predetermined number of bytes, it searches for the pattern forward (or backward) from the last point of the data read. The pattern search determines whether a period character followed by a line feed code is present. If the pattern is found, the data up to that point is cut out as fragment data, that is, as a portion that a processing device can process. A specific description follows.

First, the fragment data generation unit 22 determines the type of the data that the data reading unit 21 has started reading. Here, it determines whether the data is Japanese text data (step S201). If the data read by the data reading unit 21 is not Japanese text data (step S201 → No), the processing ends. If it is Japanese text data (step S201 → Yes), the process proceeds to step S202.

In step S202, the text fragment data generation unit 221 provided in the fragment data generation unit 22 first reads a predetermined amount of data. For example, it reads the Japanese text data shown in FIG. 4 from the beginning up to the 1024 KB point.

Next, the text fragment data generation unit 221 determines whether the end of the file was reached while reading the data in step S202 (step S203). If the end of the file was reached (step S203 → Yes), the text fragment data generation unit 221 cuts out the data read as the last fragment data (step S207) and ends the fragment data generation process. If the end of the file was not reached (step S203 → No), the process proceeds to step S204.

In step S204, the text fragment data generation unit 221 performs a pattern search forward (or backward). Specifically, it determines whether a full stop followed by a line feed code is present (step S205). If no full stop followed by a line feed code is found (step S205 → No), it continues the pattern search further forward (or backward). If the pattern is found (step S205 → Yes), the text fragment data generation unit 221 cuts out the data up to that point, generates fragment data (step S206), and returns to step S202.

This process is illustrated with a concrete example. As shown in FIG. 4, suppose that the 1024 KB point measured from the beginning of the data falls on the character "流" at reference numeral 401. Searching onward from that point, the data up to the location of a full stop "。" followed by a line feed code "↓", that is, up to the end of line number "05", is generated as fragment data that a processing device can process. From the second pass onward, the read covers 1024 KB starting not from the beginning of the data but from the position immediately after the previously cut fragment. In FIG. 4, suppose that the 1024 KB point measured from the head of line number "06" falls on the character "鎖" at reference numeral 402. Searching onward from that point, the data up to the location of a full stop "。" followed by a line feed code "↓", that is, up to the end of line number "12", is generated as fragment data that a processing device can process.
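The loop of steps S202 to S207 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `iter_text_fragments`, the `lookahead` read size, and the choice to search onward from the chunk boundary (as in the FIG. 4 example) are all hypothetical. The text is assumed to be UTF-8, and the stream is assumed to return fewer bytes than requested only at end of file.

```python
import io
from typing import BinaryIO, Iterator

# Assumed pattern: a Japanese full stop followed by a line feed (UTF-8 bytes).
JA_DELIM = "。\n".encode("utf-8")


def iter_text_fragments(stream: BinaryIO,
                        chunk_size: int = 1024 * 1024,
                        delimiter: bytes = JA_DELIM,
                        lookahead: int = 4096) -> Iterator[bytes]:
    """Yield fragments that end just after `delimiter` (except the last one)."""
    pending = b""  # bytes already read but not yet emitted as a fragment
    while True:
        # S202: read until chunk_size bytes are available after the last cut.
        if len(pending) < chunk_size:
            pending += stream.read(chunk_size - len(pending))
        # S203 / S207: a short read means end of file; emit the last fragment.
        if len(pending) < chunk_size:
            if pending:
                yield pending
            return
        # S204 / S205: search onward from the chunk_size point for the pattern.
        start = max(0, chunk_size - len(delimiter))
        while True:
            pos = pending.find(delimiter, start)
            if pos != -1:
                break
            more = stream.read(lookahead)
            if not more:  # no further sentence boundary before end of file
                yield pending
                return
            # Resume where a delimiter straddling old and new data could begin.
            start = max(0, len(pending) - len(delimiter) + 1)
            pending += more
        # S206: cut just after the delimiter; keep the rest for the next pass.
        cut = pos + len(delimiter)
        yield pending[:cut]
        pending = pending[cut:]


if __name__ == "__main__":
    sample = "あ。\nいう。\nえ。\nお".encode("utf-8")
    for fragment in iter_text_fragments(io.BytesIO(sample), chunk_size=8, lookahead=4):
        print(repr(fragment.decode("utf-8")))
```

Because the cut always falls after a delimiter, concatenating the fragments reproduces the input byte for byte. Passing `delimiter=b".\n"` would give the period-plus-line-feed variant for English text.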

FIG. 6 is a diagram showing an example of the fragment data generated by the text fragment data generation unit 221 of the division integration device 20 according to the first embodiment of the present invention. As shown in FIG. 6, because the text fragment data generation unit 221 performs a pattern search, the fragment data it generates never breaks off in the middle of a Japanese sentence, unlike data divided purely by data size. The fragment data can therefore be cut out as units that a processing device can process.

Distributed processing with the distributed processing system 1 according to the first embodiment of the present invention can be expected to shorten the response (response time) most effectively in the following case.

The case is D ≪ P and M ≪ P, where D is the time the division integration device 20 takes to divide the data, M is the time the division integration device 20 takes to integrate the results, and P is the processing time in the processing device 40. For example, in the Japanese-English translation process of the first embodiment, let D_1 be the time to divide the Japanese text, M_1 the time to integrate the English text, and P_1 the translation processing time. Because the division time D_1 and the integration time M_1 are far shorter than the translation processing time P_1, a marked benefit from distributed processing can be expected.
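To see why the condition D ≪ P and M ≪ P matters, the following sketch uses an idealized model that is not part of the patent: it assumes the processing time P divides evenly over n processing devices and ignores queueing and transfer overhead, so the response time is roughly D + P/n + M. The function name and the model itself are assumptions for illustration only.

```python
def estimated_response_time(d: float, m: float, p: float, n_devices: int) -> float:
    """Idealized estimate (assumption): division + evenly shared processing + integration."""
    if n_devices < 1:
        raise ValueError("n_devices must be at least 1")
    return d + p / n_devices + m


if __name__ == "__main__":
    # With D = M = 1 s and P = 100 s, the D and M terms bound the gain as n grows.
    for n in (1, 2, 10, 100):
        print(n, estimated_response_time(1.0, 1.0, 100.0, n))
```

Under this model the response time can never fall below D + M, which is why the benefit is largest when both are small relative to P.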

The text fragment data generation unit 221 according to the first embodiment of the present invention has been described as a Japanese text fragment data generation unit that generates fragment data from Japanese text data. Alternatively, when English text data is uploaded to the reception response device 10 and the processing device 40 performs English-Japanese translation to convert it into Japanese text data, an English text fragment data generation unit may be provided as the text fragment data generation unit 221 to generate fragment data from English text data. In this case, the English text fragment data generation unit reads a predetermined amount of data and searches forward or backward from the end of the data read for a period followed by a line feed code. It then cuts out the data up to the detected point as fragment data, that is, as a portion that a processing device can process.

As described above, according to the distributed processing system 1 of the first embodiment of the present invention, even for a single piece of data with a large amount of information, providing a plurality of processing devices 40 speeds up the response to a processing request from the user terminal 2. The result can therefore be returned to the user terminal 2 within the browser timeout period. It also becomes possible to omit the construction of ancillary systems for reporting the processing result afterwards (for example, systems for mail address registration and mail transmission).
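The overall flow of the first embodiment (fragments placed in a queue, taken by several processing devices, and integrated in order) can be sketched as follows. This is a single-process illustration under stated assumptions: threads stand in for the processing devices 40, an in-memory `queue.Queue` stands in for the queue management device 30, and the function name `process_fragments` is hypothetical. Sequential IDs are used so that results arriving out of order can still be integrated in ID order.

```python
import queue
import threading
from typing import Callable, Dict, List


def process_fragments(fragments: List[str],
                      work: Callable[[str], str],
                      n_workers: int = 3) -> str:
    """Process fragments with worker threads and join the results in ID order."""
    tasks = queue.Queue()          # stands in for the queue management device
    results: Dict[int, str] = {}   # fragment results keyed by sequential ID
    lock = threading.Lock()

    # Assign a sequential ID to each fragment in generation order.
    for seq_id, fragment in enumerate(fragments):
        tasks.put((seq_id, fragment))

    def worker() -> None:
        # Each worker pulls fragments on its own initiative until none remain.
        while True:
            try:
                seq_id, fragment = tasks.get_nowait()
            except queue.Empty:
                return
            output = work(fragment)
            with lock:
                results[seq_id] = output

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Integrate only when a result exists for every sequential ID.
    if len(results) != len(fragments):
        raise RuntimeError("some fragment results are missing")
    return "".join(results[i] for i in range(len(fragments)))


if __name__ == "__main__":
    print(process_fragments(["abc\n", "def\n", "ghi\n"], str.upper))
```

Having workers pull from the queue, rather than having the splitter push to a chosen worker, lets faster workers naturally take more fragments.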

<< Second Embodiment >>
Next, a distributed processing system 1a according to a second embodiment of the present invention will be described. It differs from the distributed processing system 1 according to the first embodiment in that, as shown in FIG. 1, the file processing unit 12 of the reception response device 10 further includes a sequential file processing unit 121. In addition to the effect obtained with the distributed processing system 1 of the first embodiment, namely a shorter response (response time) through distributed processing of the data, the distributed processing system 1a of the second embodiment has the sequential file processing unit 121 activate the division integration device 20 as soon as the data upload starts and convert the data into files while it is still being received. The sequential file processing unit 121 then sends the filed data to the division integration device 20 in order, without waiting for all of the data to be uploaded. This shortens the response (response time) further. The details are described below.

In the processing of the file processing unit 12 according to the first embodiment of the present invention, the data was converted into a file and saved in the data accumulation unit 14 only after the data reception unit 11 had finished receiving all of the data, and an activation instruction message was then output to the division integration device 20 (see steps S102 to S104 in FIG. 2). In contrast, the sequential file processing unit 121 according to the second embodiment first sends an activation instruction message to the division integration device 20 when the data reception unit 11 starts to receive data from the user terminal 2. Then, each time a predetermined amount of data (for example, 10 bytes) is received, the sequential file processing unit 121 converts the received data into a file, saves it in the data accumulation unit 14, and sends the saved data to the division integration device 20 in order.

In this way, the received data can be sent to the division integration device 20 in order of arrival without waiting for the reception response device 10 to finish receiving the whole upload, which shortens the response (response time) further.

FIG. 7 is a sequence diagram showing the flow of distributed processing in the distributed processing system 1a according to the second embodiment of the present invention. Processing that is the same as in the sequence of the first embodiment (see FIG. 2) is given the same reference signs, and its description is omitted.

First, when data is transmitted from the user terminal 2 via the network 3 (step S101), the data reception unit 11 of the reception response device 10 receives the data. Triggered by this reception, the sequential file processing unit 121 sends an activation instruction message to the division integration device 20 via the transmission control unit 13 (step S301). On receiving this activation instruction message (step S302), the division integration device 20 starts up and waits to read data.

Next, each time the data reception unit 11 receives a predetermined amount of data (for example, 10 bytes), the sequential file processing unit 121 of the reception response device 10 converts the received data into a file, saves it in the data accumulation unit 14, and sends the saved data to the division integration device 20 in order (step S303). The sequential file processing unit 121 then determines whether all of the uploaded data has been read and sent to the division integration device 20 (step S304). If data remains that has not yet been read and sent (step S304 → No), the process returns to step S303 and continues. If all of the data has been read and sent (step S304 → Yes), the process ends.
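Steps S301 to S304 can be sketched as a simple loop. The code below is an illustration only: the callbacks `send_start_message` and `send_chunk`, the list `store` standing in for the data accumulation unit 14, and the function name are all hypothetical, and network transport is left out entirely.

```python
import io
from typing import BinaryIO, Callable, List


def forward_upload_sequentially(upload: BinaryIO,
                                send_start_message: Callable[[], None],
                                send_chunk: Callable[[bytes], None],
                                store: List[bytes],
                                chunk_size: int = 10) -> int:
    """Forward an upload chunk by chunk; return the number of bytes forwarded."""
    # S301: start the splitter before any payload has been fully received.
    send_start_message()
    total = 0
    while True:
        # S303: take the next predetermined amount of data (10 bytes by default).
        chunk = upload.read(chunk_size)
        if not chunk:
            # S304 -> Yes: everything has been read and sent.
            return total
        store.append(chunk)   # save the piece (stand-in for the accumulation unit)
        send_chunk(chunk)     # send it on without waiting for the rest
        total += len(chunk)


if __name__ == "__main__":
    events: list = []
    saved: List[bytes] = []
    sent = forward_upload_sequentially(
        io.BytesIO(b"0123456789abcdefghijXYZ"),
        lambda: events.append("start"),
        events.append,
        saved,
    )
    print(sent, events)
```

Sending the activation message first lets the receiving side start up while the upload is still in progress, which is the point of the second embodiment.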

The data reading unit 21 of the division integration device 20 reads the data acquired from the reception response device 10 without waiting for all of the data to be received (step S105) and passes the data read to the fragment data generation unit 22. The fragment data generation unit 22 then reads the data in order and performs the fragment data generation process (step S106). The data management unit 23 sends the generated fragment data to the queue management device 30 (step S107).
Thereafter, processing similar to steps S108 to S120 shown in FIG. 1 is performed: the division integration device 20 integrates the fragment result data, and a processing completion notification is sent to the user terminal 2.

As described above, according to the distributed processing system 1a of the second embodiment of the present invention, the upload of data from the user terminal 2 to the reception response device 10, the fragment data generation process in the division integration device 20, and the transmission of fragment data to the queue management device 30 can be executed in parallel, so the response (response time) can be shortened further.

<< Third Embodiment >>
Next, a distributed processing system 1b according to a third embodiment of the present invention will be described. It differs from the distributed processing system 1 according to the first embodiment in that the data uploaded from the user terminal 2 is video data (video content). This video data is, for example, MPEG-2 or H.264 video, or video in a successor or derivative codec. To divide this video data, the fragment data generation unit 22 of the division integration device 20 includes a video fragment data generation unit 222, as shown in FIG. 1. Examples of the processing that the processing device 40 performs on the uploaded video data include cut point detection, which detects the points at which one shot switches to another, and attention scene detection, which detects scenes of interest such as close-up shots in which a subject such as a person appears large.

When cutting fragment data out of the video data, the video fragment data generation unit 222 treats a GOP (Group Of Pictures) as a single indivisible unit and cuts out video data containing a predetermined number of GOPs as fragment data, which is a set of one or more video scenes.
As in the first embodiment, the fragment data generated by the video fragment data generation unit 222 is sent to the queue management device 30 by the data management unit 23, and the processing device 40 performs cut point detection or attention scene detection on each piece of fragment data.

Next, the fragment data generation process performed by the video fragment data generation unit 222 will be described.
FIG. 8 is a flowchart showing the flow of the fragment data generation process performed by the video fragment data generation unit 222 of the division integration device 20 according to the third embodiment of the present invention.

First, the fragment data generation unit 22 determines the type of the data that the data reading unit 21 has started to read. Here, it determines whether the data is video data such as MPEG-2 or H.264 video, or video in a successor or derivative codec (step S401). If the data read by the data reading unit 21 is not video data such as MPEG-2 (step S401 → No), the process ends. If the data is video data such as MPEG-2 (step S401 → Yes), the process proceeds to step S402.

In step S402, the video fragment data generation unit 222 provided in the fragment data generation unit 22 cuts out data containing a predetermined number of GOPs. Next, the video fragment data generation unit 222 determines whether the end of the file was reached while reading the data (step S403). If the end of the file was reached (step S403 → Yes), the video fragment data generation unit 222 cuts out the data read as the last fragment data (step S405) and places it in a container (video file) to generate fragment data. If the end of the file was not reached (step S403 → No), the video fragment data generation unit 222 places the data cut out in step S402 in a container to generate fragment data (step S404) and returns to step S402.
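The grouping in steps S402 to S405 can be sketched as follows. This is a minimal illustration under stated assumptions: the GOPs are assumed to have already been located in the stream and are treated as opaque byte strings, and both GOP parsing and the container wrapping of steps S404 and S405 are omitted. The function name `iter_gop_fragments` is hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_gop_fragments(gops: Iterable[bytes],
                       gops_per_fragment: int) -> Iterator[List[bytes]]:
    """Group GOPs into fragments without ever splitting a GOP."""
    if gops_per_fragment < 1:
        raise ValueError("gops_per_fragment must be at least 1")
    batch: List[bytes] = []
    for gop in gops:
        batch.append(gop)
        # S402 / S404: a full batch of GOPs becomes one fragment.
        if len(batch) == gops_per_fragment:
            yield batch
            batch = []
    # S403 / S405: at end of file, whatever remains is the last fragment.
    if batch:
        yield batch


if __name__ == "__main__":
    fake_gops = [bytes([i]) * 4 for i in range(7)]  # placeholder GOP payloads
    for fragment in iter_gop_fragments(fake_gops, 3):
        print(len(fragment), "GOPs")
```

Because each fragment is made of whole GOPs, no decoding is needed to find the cut points, which is what keeps the division time small.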

Compared with decoding each video in order to cut out fragment data, cutting out in GOP units with the video fragment data generation unit 222 according to the third embodiment of the present invention makes the division time D in the division integration device 20 much shorter than the processing time P in the processing device 40, so D ≪ P can be achieved. In addition, because the results of cut point detection and attention scene detection are items such as frame numbers within the video, it is only necessary to generate the images corresponding to those frame numbers. The amount of output data is therefore far smaller than the amount of uploaded video data, the integration time M in the division integration device 20 satisfies M ≪ P, and the benefit of distributed processing according to the present invention can be obtained.

The present invention can also be applied when, for example, the processing device 40 converts video in the MPEG-2 codec into video in the H.264 codec. However, joining the converted fragment result videos (videos encoded in H.264) can make the integration time M costly, so there are cases in which M ≪ P cannot be achieved. In such cases, using as the combined result a playlist or similar that groups a plurality of video containers and treats them as one video container reduces the integration load, so that M ≪ P can be achieved and the benefit of distributed processing according to the present invention can be obtained.
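The playlist idea can be sketched as follows. The patent does not name a playlist format, so the example assumes a plain-text list with one file path per line, in the style of a simple M3U playlist. The function name and the mapping from sequential ID to file path are hypothetical.

```python
from typing import Dict


def build_playlist(fragment_files: Dict[int, str]) -> str:
    """Return playlist text listing fragment result files in sequential-ID order."""
    ordered_ids = sorted(fragment_files)
    # Integration here is only a listing step; no video data is re-encoded or copied.
    return "".join(fragment_files[seq_id] + "\n" for seq_id in ordered_ids)


if __name__ == "__main__":
    files = {2: "part_0002.mp4", 0: "part_0000.mp4", 1: "part_0001.mp4"}
    print(build_playlist(files), end="")
```

The cost of building such a list depends on the number of fragments rather than on the size of the video, which is how it keeps M small.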

As described above, according to the distributed processing system 1b of the third embodiment of the present invention, when a single piece of video data (or content) with a large amount of information is input, providing a plurality of processing devices 40 shortens the response (response time) to a processing request from the user.

In the distributed processing system 1b according to the third embodiment of the present invention, the fragment data generation unit 22 of the division integration device 20 includes the video fragment data generation unit 222, but an audio fragment data generation unit may be provided instead, for example for audio data. In this case, the audio fragment data generation unit cuts the audio data received by the reception response device 10 at, for example, portions where silence continues for a predetermined time, thereby generating fragment data in units that a processing device can process. The processing device 40 can then, via the queue management device 30, execute data processing that converts the fragment data (audio data) into Japanese text data, for example.
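Splitting at silent portions can be sketched as follows. This is an illustration under stated assumptions: the audio is assumed to be a sequence of signed integer samples, "silence" is assumed to mean an absolute amplitude at or below `threshold`, and the "predetermined time" is expressed as a number of consecutive samples, `min_silence`. None of these details, nor the function name, come from the patent.

```python
from typing import Iterator, List, Sequence


def split_on_silence(samples: Sequence[int],
                     threshold: int,
                     min_silence: int) -> Iterator[List[int]]:
    """Cut after each run of at least `min_silence` consecutive quiet samples."""
    fragment: List[int] = []
    quiet_run = 0  # length of the current run of quiet samples
    for sample in samples:
        if abs(sample) <= threshold:
            quiet_run += 1
        else:
            # A loud sample after a long enough silence starts a new fragment.
            if quiet_run >= min_silence and fragment:
                yield fragment
                fragment = []
            quiet_run = 0
        fragment.append(sample)
    if fragment:
        yield fragment


if __name__ == "__main__":
    audio = [5, 6, 0, 0, 0, 7, 8, 0, 9]
    print(list(split_on_silence(audio, threshold=1, min_silence=3)))
```

The silent run stays attached to the end of the preceding fragment, so concatenating the fragments reproduces the original samples.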

<< Modification of the Embodiment >>
Next, a modification of the distributed processing system 1 according to the embodiment of the present invention will be described. FIG. 9 is a configuration diagram for explaining the modification of the distributed processing system 1 according to the present embodiment. The system 100 shown in FIG. 9 uses the append-and-reference type DMS (Data Management System) 4 described in Patent Document 1 mentioned above in place of the reception response device 10 of the distributed processing system 1 shown in FIG. 1.

In the DMS 4, when a plurality of pieces of data are uploaded from a plurality of user terminals 2 via the network 3, the reception distribution device 50 distributes the received data among a plurality of appending devices 60 for storage, and a plurality of processing devices 70 acquire the data from the appending devices 60 and process it. The processing result of each processing device 70 is sent to the reception distribution device 50 and returned to the user terminal 2. The DMS 4 can thus process a plurality of pieces of data in parallel and improve the throughput of the system as a whole.

In the system 100, the DMS 4 is provided in place of the reception response device 10 of the distributed processing system 1 according to the present embodiment. Here, the processing device 70 of the DMS 4 is assumed to perform Polder Engine processing. In Polder Engine processing, for example, when the user selects a plurality of data files with the GUI of the user terminal 2 and drops them into a folder in one operation, the folder acts as a processing folder: the processing indicated by the folder name is executed, and each processing result is stored in the folder. Processing folders can be nested, so that one processing folder contains further processing folders. An example is a case in which image data is converted into Japanese text data by OCR, the Japanese text data is then translated from Japanese into English, and the English text is then summarized. A nested structure of processing folders allows such a series of processes to be executed in one operation.

When this processing is performed in the system 100, data is acquired from the processing device 70 of the DMS 4 and handled as follows. (1) First, the OCR processing device 45 performs OCR processing. (2) Next, the processing device 70 sends the resulting data to the division integration device 20 of the distributed processing system 1, which divides the data and generates fragment data. Each processing device 40 then acquires fragment data stored in the queue management device 30, translates the Japanese text data into English text data, and sends the fragment result data to the division integration device 20. The division integration device 20 integrates the fragment result data and sends the integrated result data to the processing device 70 of the DMS 4. (3) Then, the English summary processing device 46 summarizes the English text data. Processing in this way shortens the response (response time) of the processing device 70. Furthermore, when many user terminals 2 request this processing at the same time (that is, when they upload at the same time), the processing devices 70 operate concurrently, which improves the throughput of the system.

Thus, according to the system 100, the DMS 4 improves throughput for requests from many users, and combining it with the distributed processing system 1 according to the present embodiment also improves the response (response time) for a single piece of user-supplied data with a large amount of information.

Description of Symbols
1 Distributed processing system
2 User terminal
3 Network
5 DMS
10 Reception response device
11 Data reception unit
12 File processing unit
13 Transmission control unit
14 Data accumulation unit
20 Division integration device
21 Data reading unit
22 Fragment data generation unit
23 Data management unit
24 Fragment result data acquisition unit
25 Fragment result data integration unit
26 Result data output unit
27 Data storage unit
30 Queue management device
31 Queue reception unit
32 Queue management unit
33 Queue storage unit
40, 70 Processing device
41 Queue selection unit
42 Fragment data acquisition unit
43 Data processing unit
44 Fragment result data output unit
45 OCR device
46 English summary processing device
50 Reception distribution device
60 Appending device
121 Sequential file processing unit
221 Text fragment data generation unit
222 Video fragment data generation unit

Claims

(1) an acceptance response device that accepts upload data from a user terminal via a communication network, and returns the result data of data processing on the accepted upload data to the user terminal synchronously or asynchronously; (2) the acceptance response Dividing the upload data acquired from the apparatus to generate a plurality of fragment data, and integrating the plurality of fragment result data, which is a result of data processing of the generated plurality of fragment data, to generate the result data an integrated device, (3) the saved from cutting integrating device as a queue to acquire the fragment data, one or more of transmitting the fragment data by from one request of the plurality of processing units to the processing unit (4) acquiring the fragment data from the queue management device, and acquiring the fragment data The data and data processing, wherein a plurality of processing devices to be transmitted to the cutting integrated device as said fragments result data, a distributed processing system comprising,
The reception response device includes:
A data reception unit that receives the upload data from the user terminal via the communication network, and obtains the upload data from the data reception unit into a file, and stores the filed data in a data storage unit. A file processing unit that generates a start instruction message for starting the division integration device, a start instruction message generated by the file processing unit and the filed data are transmitted to the division integration device, and acquired from the division integration device A transmission control unit that transmits the result data to the user terminal via the communication network;
The dividing and integrating device includes:
A data reading unit that reads the data transmitted from the reception response device in order from the beginning of the data; a fragment data generation unit that determines the type of the data read by the data reading unit and generates a plurality of fragment data by cutting out the data, according to the determined type, in units that the processing device can process; a data management unit that assigns a sequential ID to each of the plurality of fragment data in the order in which the fragment data generation unit generated them and, each time fragment data with a sequential ID is generated, transmits that fragment data to one of the one or more queue management devices;
A fragment result data acquisition unit that receives the fragment result data transmitted from the plurality of processing devices and determines, based on the sequential IDs attached to the fragment result data in association with the IDs attached to the fragment data, whether all the fragment result data have been acquired; a fragment result data integration unit that, when the fragment result data acquisition unit determines that all the fragment result data have been acquired, integrates the fragment result data in ID order using the sequential IDs attached to them and generates the result data; a data storage unit that stores the result data generated by the fragment result data integration unit; and a result data output unit that transmits the result data stored in the data storage unit to the reception response device;
The queue management device includes:
A queue reception unit that accepts the fragment data transmitted from the dividing and integrating device and stores the fragment data as a queue in a queue storage unit; a queue management unit that, based on a fragment data acquisition request from the processing device, transmits the fragment data stored in the queue storage unit to the processing device; and the queue storage unit, which stores the fragment data accepted by the queue reception unit;
The processing device includes:
A fragment data acquisition unit that transmits the fragment data acquisition request to the queue management device and acquires the fragment data from the queue management device; a data processing unit that executes predetermined data processing on the fragment data acquired by the fragment data acquisition unit; and a fragment result data output unit that attaches, to the fragment result data resulting from the data processing by the data processing unit, a sequential ID associated with the ID assigned to the fragment data, and transmits the fragment result data to the dividing and integrating device,
the distributed processing system being characterized by the above configuration.
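
As an illustration only, the pull-based flow recited in this claim (fragments receive sequential IDs, are queued, are fetched by whichever processing device asks next, and are reassembled in ID order) can be sketched within a single process. Everything named below is hypothetical and not taken from the specification: `split_into_fragments`, `worker`, and `run_pipeline` are invented names, fixed-size cutting stands in for type-aware cutting, threads stand in for separate processing devices, `queue.Queue` stands in for a queue management device, and `str.upper` stands in for the real data processing.

```python
import queue
import threading


def split_into_fragments(data, size):
    """Yield (sequential_id, fragment) pairs.

    Fixed-size cutting is a simplification made for this sketch.
    """
    for seq_id, start in enumerate(range(0, len(data), size)):
        yield seq_id, data[start:start + size]


def worker(fragment_queue, results, lock, process):
    """Stand-in for a processing device: pull fragments until a None sentinel."""
    while True:
        item = fragment_queue.get()
        if item is None:
            break
        seq_id, fragment = item
        result = process(fragment)
        with lock:
            # The sequential ID travels with the result so order can be restored.
            results[seq_id] = result


def run_pipeline(data, size=4, n_workers=3, process=str.upper):
    """Split, queue, process in parallel, then integrate in ID order."""
    fragment_queue = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(fragment_queue, results, lock, process))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    total = 0
    for seq_id, fragment in split_into_fragments(data, size):
        # Each fragment is queued as soon as it is generated.
        fragment_queue.put((seq_id, fragment))
        total += 1

    for _ in threads:
        fragment_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    if len(results) != total:
        raise RuntimeError("not all fragment results were acquired")
    return "".join(results[i] for i in range(total))
```

Because the workers pull from the queue rather than having fragments pushed to them, a slow worker simply takes fewer fragments; the sequential IDs are what make the out-of-order completion harmless.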

The distributed processing system according to claim 1, wherein:
The reception response device further includes:
A sequential file processing unit that, when the data reception unit starts receiving the upload data, generates the start instruction message and transmits it to the dividing and integrating device via the transmission control unit; that, each time a predetermined amount of the upload data received by the data reception unit is acquired, converts the acquired data into a file and sequentially stores the filed data in the data storage unit; and that, each time data is stored in the data storage unit, transmits that data to the dividing and integrating device via the transmission control unit; and
In the dividing and integrating device, each time the data reading unit reads the data from the reception response device, the data reading unit passes the data to the fragment data generation unit, and the fragment data generation unit generates the fragment data.
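
The sequential behavior of this claim, in which filing and fragment generation proceed while the upload is still arriving, might look roughly like the two generators below. This is a sketch under stated assumptions: the names `sequential_files` and `fragments_from_files` are hypothetical, the upload is modeled as an iterable of byte packets, and cutting at a line feed is a placeholder for the type-specific cutting rules.

```python
from typing import Iterable, Iterator


def sequential_files(stream: Iterable[bytes], chunk_size: int) -> Iterator[bytes]:
    """Yield one filed chunk each time chunk_size bytes have accumulated."""
    buffer = bytearray()
    for packet in stream:
        buffer.extend(packet)
        while len(buffer) >= chunk_size:
            yield bytes(buffer[:chunk_size])
            del buffer[:chunk_size]
    if buffer:
        # Whatever is left when the upload ends becomes the final chunk.
        yield bytes(buffer)


def fragments_from_files(files: Iterable[bytes],
                         delimiter: bytes = b"\n") -> Iterator[bytes]:
    """Emit fragments as soon as a complete unit is available.

    Data after the last delimiter is carried over to the next chunk.
    """
    carry = b""
    for filed in files:
        carry += filed
        *complete, carry = carry.split(delimiter)
        for piece in complete:
            yield piece + delimiter
    if carry:
        yield carry
```

Chaining the two, as in `fragments_from_files(sequential_files(packets, 4096))`, lets the first fragments reach the queue before the upload has finished.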

The distributed processing system according to claim 1, wherein the fragment data generation unit of the dividing and integrating device further comprises:
A Japanese text fragment data generation unit that, when the type of the data read by the data reading unit is Japanese text data, reads the data up to a predetermined amount, searches forward or backward from the end of the read data for a sentence-ending punctuation character followed by a line feed code, and cuts out the data up to the detected position as fragment data, that is, as a portion that the processing device can process.
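
A minimal sketch of one such cut follows, assuming the punctuation character is the Japanese full stop `。` and the line feed code is `\n`; both are assumptions, as are the function name `cut_japanese_fragment` and the choice to try the backward search first. The claim itself only says the search may run forward or backward.

```python
from typing import Tuple

MARKER = "。\n"  # assumed: Japanese full stop followed by a line feed


def cut_japanese_fragment(text: str, limit: int) -> Tuple[str, str]:
    """Return (fragment, remainder) for one cut of at most about `limit` characters."""
    if len(text) <= limit:
        return text, ""

    # Backward search: last marker that lies entirely within the first `limit` characters.
    pos = text[:limit].rfind(MARKER)
    if pos == -1:
        # Forward search: start one character early so a marker that
        # straddles the limit boundary is still found.
        pos = text.find(MARKER, max(limit - 1, 0))

    cut = len(text) if pos == -1 else pos + len(MARKER)
    return text[:cut], text[cut:]
```

Cutting only after a sentence end keeps each fragment self-contained for processing such as dependency analysis, which operates sentence by sentence.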

The distributed processing system according to claim 1, wherein the fragment data generation unit of the dividing and integrating device further comprises:
An English text fragment data generation unit that, when the type of the data read by the data reading unit is English text data, reads the data up to a predetermined amount, searches forward or backward from the end of the read data for a period followed by a line feed code, and cuts out the data up to the detected position as fragment data, that is, as a portion that the processing device can process.
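
Applied repeatedly over a whole document, the English rule could be sketched as the generator below. The name `english_fragments`, the `".\n"` marker string, and the backward-then-forward search order are assumptions made for illustration; `limit` plays the role of the predetermined amount.

```python
from typing import Iterator


def english_fragments(text: str, limit: int, marker: str = ".\n") -> Iterator[str]:
    """Yield fragments that each end just after a period plus line feed."""
    start = 0
    while start < len(text):
        end = start + limit
        if end >= len(text):
            # The remaining data fits within one read, so it is the last fragment.
            yield text[start:]
            return

        # Backward search inside the window that was read.
        pos = text.rfind(marker, start, end)
        if pos == -1:
            # Forward search, starting one character early to catch a
            # marker that straddles the end of the window.
            pos = text.find(marker, max(end - 1, start))

        cut = len(text) if pos == -1 else pos + len(marker)
        yield text[start:cut]
        start = cut
```

Joining the yielded fragments in order reproduces the input exactly, which is the property the integration step relies on.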

The distributed processing system according to claim 1, wherein the fragment data generation unit of the dividing and integrating device further comprises:
A video fragment data generation unit that, when the type of the data read by the data reading unit is video data organized in GOP (Group Of Pictures) units, generates fragment data by cutting out the video data every predetermined number of GOPs.
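
The GOP-based cutting can be illustrated without parsing a real video stream. The sketch below assumes a deliberately simplified model that is not from the specification: frames are represented by their type letters (`I`, `P`, `B`), and every GOP is taken to begin at an I-frame. The function names are hypothetical.

```python
from typing import List


def split_into_gops(frame_types: str) -> List[str]:
    """Split a frame-type sequence into GOPs, starting a new GOP at each I-frame."""
    gops: List[str] = []
    for frame in frame_types:
        if frame == "I" or not gops:
            gops.append(frame)
        else:
            gops[-1] += frame
    return gops


def gop_fragments(frame_types: str, gops_per_fragment: int) -> List[str]:
    """Group every `gops_per_fragment` consecutive GOPs into one fragment."""
    gops = split_into_gops(frame_types)
    return [
        "".join(gops[i:i + gops_per_fragment])
        for i in range(0, len(gops), gops_per_fragment)
    ]
```

Cutting on GOP boundaries matters because frames inside a GOP generally depend on that GOP's reference frames, so a fragment cut mid-GOP could not be decoded on its own.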

A distributed processing method for use in a distributed processing system comprising:
(1) a reception response device that accepts upload data from a user terminal via a communication network and returns result data, obtained by data processing of the accepted upload data, to the user terminal synchronously or asynchronously;
(2) a dividing and integrating device that divides the upload data acquired from the reception response device to generate a plurality of fragment data, and integrates a plurality of fragment result data, which are the results of data processing of the generated fragment data, to generate the result data;
(3) one or more queue management devices that store the fragment data acquired from the dividing and integrating device as a queue and, upon a request from any one of a plurality of processing devices, transmit the fragment data to that processing device; and
(4) the plurality of processing devices, which acquire the fragment data from the queue management device, perform data processing on the acquired fragment data, and transmit the result to the dividing and integrating device as the fragment result data, wherein:
The reception response device includes:
A data storage unit for storing the upload data as a file; and
performs the steps of: receiving the upload data from the user terminal via the communication network; acquiring the received upload data, converting it into a file, and storing the filed data in the data storage unit; generating a start instruction message for starting the dividing and integrating device; and transmitting the generated start instruction message and the filed data to the dividing and integrating device;
The dividing and integrating device performs the steps of:
Reading the data transmitted from the reception response device in order from the beginning of the data; determining the type of the read data and generating a plurality of fragment data by cutting out the data, according to the determined type, in units that the processing device can process; assigning a sequential ID to each of the plurality of fragment data in the order in which they were generated; and, each time fragment data with a sequential ID is generated, transmitting that fragment data to one of the one or more queue management devices;
The queue management device includes:
A queue storage unit that stores the fragment data transmitted from the dividing and integrating device as a queue; and
performs the steps of: receiving the fragment data transmitted from the dividing and integrating device and storing it in the queue storage unit as a queue; and transmitting the fragment data stored in the queue storage unit to the processing device based on an acquisition request for the fragment data from the processing device;
The processing device performs the steps of:
Transmitting the fragment data acquisition request to the queue management device and acquiring the fragment data from the queue management device; executing predetermined data processing on the acquired fragment data; and attaching, to the fragment result data resulting from the data processing, a sequential ID associated with the ID attached to the fragment data, and transmitting the fragment result data to the dividing and integrating device;
The dividing and integrating device further includes:
A data storage unit for storing the result data obtained by integrating the fragment result data; and
performs the steps of: receiving the fragment result data transmitted from the plurality of processing devices and determining, based on the sequential IDs attached to the fragment result data in association with the IDs attached to the fragment data, whether all the fragment result data have been acquired; when it is determined that all the fragment result data have been acquired, integrating the fragment result data in ID order using the sequential IDs attached to them to generate the result data; storing the generated result data in the data storage unit; and transmitting the result data stored in the data storage unit to the reception response device; and
The reception response device performs a step of transmitting the result data acquired from the dividing and integrating device to the user terminal via the communication network.
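
The completeness check and ID-ordered integration steps of this method can be sketched as a small class. The claim states only that completeness is judged from the sequential IDs; how the integrating side learns the total is not stated here, so the `set_expected` call below is an assumption, as are the class and method names.

```python
from typing import Dict, Optional


class FragmentResultIntegrator:
    """Collects fragment results that may arrive in any order."""

    def __init__(self) -> None:
        self._results: Dict[int, str] = {}
        self._expected: Optional[int] = None

    def set_expected(self, count: int) -> None:
        """Record how many fragments were issued (IDs 0 .. count - 1). Assumed step."""
        self._expected = count

    def receive(self, seq_id: int, result: str) -> None:
        """Store one fragment result under its sequential ID."""
        self._results[seq_id] = result

    def is_complete(self) -> bool:
        """True once every ID from 0 to expected - 1 has a result."""
        if self._expected is None:
            return False
        return all(i in self._results for i in range(self._expected))

    def integrate(self) -> str:
        """Concatenate the results in sequential-ID order."""
        if not self.is_complete():
            raise RuntimeError("some fragment results are still missing")
        return "".join(self._results[i] for i in range(self._expected))
```

For example, results received in the order 2, 0, 1 are still integrated as 0, 1, 2, and `is_complete()` stays false until the last one arrives.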

A program for causing a computer to execute the distributed processing method according to claim 6.