JP2016218926A

JP2016218926A - Information processor, information processing method, and computer program

Info

Publication number: JP2016218926A
Application number: JP2015105897A
Authority: JP
Inventors: 孝知川合; Takatomo Kawai
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-05-25
Filing date: 2015-05-25
Publication date: 2016-12-22

Abstract

PROBLEM TO BE SOLVED: To provide an information processor for distributing data processing to a plurality of arithmetic devices on a network to cause the plurality of arithmetic devices to perform the data processing while taking transfer times of the data on the network into consideration.

SOLUTION: The information processor for distributing data processing to a plurality of arithmetic devices on a network to cause the plurality of arithmetic devices to perform the data processing includes: first acquisition means for acquiring first information on a time of data transfer to each of the plurality of arithmetic devices; second acquisition means for acquiring second information on the respective arithmetic capabilities of the plurality of arithmetic devices; and selection means for selecting, on the basis of the first information and the second information, an arithmetic device to perform the data processing from among the plurality of arithmetic devices such that the time until completion of the data processing becomes smaller than a prescribed value.

SELECTED DRAWING: Figure 8

Description

The present invention relates to an information processing apparatus, an information processing method, and a computer program that use computing devices on a network.

Parallel computing technology that uses heterogeneous computing resources such as CPUs, GPUs, and Cell processors has been adopted in a variety of products, driven by improvements in processor computing power and by the arrival of processors that integrate many cores specialized for particular operations, as typified by Cell and GPUs. Conventionally, coding software for such a heterogeneous computing environment required a custom implementation specialized for the group of processors to be used.

OpenCL (Open Computing Language) is a standard framework for parallel programming in heterogeneous computing environments. OpenCL defines a model that abstracts parallel computing devices, memory, and the like, so that the same code can support multiple processing environments when the OpenCL framework is used. The specification is currently developed and published by the Khronos Group, a standards body.

Meanwhile, cloud computing, in which computer processing is used over a network such as the Internet, is gradually expanding as a service. In cloud computing as well, modeling the computing environment is expected to make it possible both to build applications that do not depend on a specific computing environment and to speed up computation.

Patent Document 1 discloses a technique that uses OpenCL as an interface to cloud computing to speed up application processing. According to this technique, when an application issues a processing request, apparatuses on the cloud are selected based on the characteristics of the computing resources on the cloud and the processing is distributed among them, which reduces the processing load. Here, computing resources means, specifically, the processors and memory on the cloud.

Patent Document 1: JP 2011-138506 A

To process operations such as image processing on a network such as a cloud using the technique disclosed in Patent Document 1, the data required for the processing must be sent to and received from the computing devices on the cloud. When this data is large, the data transfer time can no longer be ignored relative to the computation execution time. For example, suppose that two computing devices A and B with different computing capabilities exist on the cloud, and that executing a certain application takes 1 second on computing device A and 2 seconds on computing device B. The data transfer time to a computing device depends on the network environment leading to that device, so transfer rates are not uniform. If the data transfer rate to the high-capability computing device A is lower than the rate to the low-capability computing device B, selecting a device by computing capability alone does not necessarily shorten the application execution time. Suppose that transferring the data needed to run the application takes 4 seconds for computing device A and 2 seconds for computing device B. In this case, the application execution time, which is the sum of the computation execution time and the data transfer time, is 5 seconds on computing device A and 4 seconds on computing device B, so execution on computing device B is faster.
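The arithmetic in this example can be checked with a few lines of Python. This is only an illustration of the A/B scenario described above; the dictionary layout and the names `devices` and `total_time` are assumptions made for the sketch.

```python
# Figures from the A/B example above, in seconds.
devices = {
    "A": {"transfer_s": 4.0, "compute_s": 1.0},
    "B": {"transfer_s": 2.0, "compute_s": 2.0},
}


def total_time(entry):
    """Application execution time = data transfer time + computation time."""
    return entry["transfer_s"] + entry["compute_s"]


totals = {name: total_time(entry) for name, entry in devices.items()}

# Choosing by computing capability alone picks A; choosing by total time picks B.
fastest_by_compute = min(devices, key=lambda name: devices[name]["compute_s"])
fastest_overall = min(totals, key=totals.get)

print(totals)              # {'A': 5.0, 'B': 4.0}
print(fastest_by_compute)  # A
print(fastest_overall)     # B
```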

Meanwhile, many OpenCL applications already exist that were written for an in-device OpenCL environment such as a GPU, with no intention of using computing devices on the cloud. It is therefore desirable to be able to run such applications on computing devices on the cloud without modifying the existing OpenCL application.

In OpenCL, the device that performs a computation is specified with a device ID, an identifier that indirectly refers to a physical device. When OpenCL is used as an interface to cloud computing, device IDs are assigned separately on each node on the cloud, so the same device ID can appear on different nodes and a device ID alone does not identify a device uniquely. In addition, in-device applications often do not assume that multiple devices exist, as they do in a cloud environment. For example, an application may be written to run on a fixed device ID, such as the ID of the first device that was identified. Consequently, to select an appropriate device from the multiple devices on the cloud and run on it, the application would have to be modified to use that device.
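The device ID collision can be illustrated with a small sketch. It treats device IDs as small integers, as the examples later in this document do, and the `DeviceKey` tuple and the per-node ID lists are hypothetical names and values introduced only for illustration.

```python
from typing import NamedTuple


class DeviceKey(NamedTuple):
    """Hypothetical composite key: node + platform ID + device ID."""
    node: str         # for example, the node's IP address
    platform_id: int
    device_id: int


# Illustrative enumeration results: each node numbers its own devices from 0.
per_node_device_ids = {
    "10.20.30.40": [0, 1],
    "10.20.40.50": [0],
}

# Device IDs alone collide across nodes...
flat_ids = [d for ids in per_node_device_ids.values() for d in ids]
has_collision = len(flat_ids) != len(set(flat_ids))

# ...whereas a key that also carries the node stays unique.
keys = [
    DeviceKey(node, 0, d)
    for node, ids in per_node_device_ids.items()
    for d in ids
]
all_unique = len(keys) == len(set(keys))

print(has_collision, all_unique)  # True True
```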

The present invention has been made in view of the above problems, and its object is to provide an information processing apparatus that distributes the processing of data among a plurality of computing devices on a network while taking the transfer time of the data on the network into consideration.

To solve the above problems, the present invention is an information processing apparatus that distributes the processing of data among a plurality of computing devices on a network, comprising: first acquisition means for acquiring first information relating to the time of data transfer to each of the plurality of computing devices; second acquisition means for acquiring second information relating to the computing capability of each of the plurality of computing devices; and selection means for selecting, based on the first information and the second information, the computing device that is to process the data from among the plurality of computing devices such that the time until completion of the processing of the data becomes smaller than a predetermined value.

According to the present invention, it is possible to provide an information processing apparatus that distributes the processing of data among a plurality of computing devices on a network while taking the transfer time of the data on the network into consideration.

FIG. 1 is a block diagram showing the node configuration of a system that uses OpenCL as an interface for using computing resources on the cloud.
FIG. 2 is a block diagram showing the configuration of the cloud-compatible OpenCL driver in the first embodiment.
FIG. 3 is a flowchart showing the procedure for acquiring the resource information and data transfer time of computing devices according to the first embodiment.
FIG. 4 is a diagram showing an example of the computing device information managed by the computing device information management unit according to the first embodiment.
FIG. 5 is a flowchart showing the procedure by which the computing device selection unit according to the first embodiment selects a computing device suitable for executing an OpenCL application.
FIG. 6 is a diagram showing an example of the computing device information after the total of the data transfer time and the computation execution time has been estimated, according to the first embodiment.
FIG. 7 is a diagram showing an example of the computing device information after a node has been added to the network, according to the first embodiment.
FIG. 8 is a block diagram showing another configuration of the OpenCL driver in the second embodiment.
FIG. 9 is a flowchart showing the procedure for acquiring the resource information and data transfer time of computing devices and assigning device IDs according to the second embodiment.
FIG. 10 is a diagram showing an example of the computing device information managed by the computing device information management unit according to the second embodiment.
FIG. 11 is a diagram showing an example of the computing device information after the computing device that executes the OpenCL application has been determined, according to the second embodiment.
FIG. 12 is a diagram showing another example of the computing device information after the computing device that executes the OpenCL application has been determined, according to the second embodiment.

(First embodiment)
The first embodiment is described below with reference to the drawings and flowcharts. However, the technical scope of the present invention is not limited to this embodiment.

FIG. 1 is a block diagram showing the configuration of the nodes (information processing apparatuses) of a system that uses OpenCL as an interface for using computing resources on the cloud. Each node (information processing apparatus) has a CPU, a recording medium storing various computer programs, and the like, and executes those computer programs.

In the figure, reference numeral 101 denotes an OpenCL application execution node, which is the node on the network that executes the OpenCL application.

Reference numerals 102 to 104 denote OpenCL interface-compatible computing device nodes, which execute OpenCL interface requests received over the network and return the results. Three nodes are shown as an example, but the number is not limited to three.

Reference numeral 105 denotes a network, which provides communication between the OpenCL application execution node and the OpenCL interface-compatible computing device nodes.

Reference numeral 1011 denotes an OpenCL application, which is an application that performs parallel computation such as image processing. It is equivalent to an application that runs in a conventional OpenCL execution environment closed within a single device.

Reference numeral 1012 denotes a cloud-compatible OpenCL driver, which is a driver modified to use computing devices on the cloud as computing resources. The API it provides to the OpenCL application has the same form as in a conventional in-device OpenCL environment. Its functions are described in detail later.

Reference numeral 1013 denotes a network interface, which mediates communication between the cloud-compatible OpenCL driver and the OpenCL interface-compatible computing device nodes on the network.

Reference numeral 1021 denotes a network interface, which mediates communication between the cloud-compatible OpenCL driver on the network and the OpenCL driver of its own node.

Reference numeral 1022 denotes an OpenCL driver, which provides an API conforming to the OpenCL interface to OpenCL applications. It is equivalent to the driver in a conventional in-device OpenCL environment.

Reference numeral 1023 denotes a computing device, which executes computing functions conforming to the OpenCL interface. It is equivalent to the device in a conventional in-device OpenCL environment.

Elements 1011, 1022, and 1023 are equivalent to those in a conventional in-device OpenCL environment; the features of the present invention are concentrated in the configuration of the cloud-compatible OpenCL driver 1012.

FIG. 2 is a block diagram showing the configuration of an embodiment of the cloud-compatible OpenCL driver of the present invention.

In the figure, the OpenCL application 1011 and the network interface 1013 are the same as the blocks with the same reference numerals in FIG. 1.

The internal configuration of the cloud-compatible OpenCL driver 1012 is as follows.

Reference numeral 10121 denotes an OpenCL API conversion unit, which provides the OpenCL application with an API equivalent to that of the OpenCL driver in a conventional in-device OpenCL environment. For actual processing such as executing computations and acquiring resource information, it uses the functions of the blocks described below and the computing functions of the nodes on the network to virtualize the functions defined by the OpenCL interface, making them appear to be executed inside the device.

Reference numeral 10122 denotes a resource information acquisition unit, which acquires the computing resource information of the computing devices on the nodes connected via the network. It acquires information related to computing capability, such as the processor type (GPU, CPU, and so on), the number of compute units, the clock frequency, and the memory size.

Reference numeral 10123 denotes a data transfer rate acquisition unit, which acquires the data transfer rate between this node and each node connected to the network.

Reference numeral 10124 denotes a computing device selection unit, which selects the computing device that is to execute a computation request from the OpenCL application.

Reference numeral 10125 denotes a computing device information management unit, which manages the information about computing devices acquired and determined by the processing units 10122 to 10124, and provides that information in response to requests from the OpenCL API conversion unit and the other processing units.

FIG. 3 is a flowchart showing the procedure by which the resource information acquisition unit and the data transfer rate acquisition unit cooperate to acquire the resource information and the data transfer time of the computing devices on the nodes connected to the network.

This procedure may be triggered, for example, when the OpenCL application execution node is connected to the network or when a node having a computing device is connected to the network.

In step S301, the nodes on the network that have an OpenCL interface-compatible computing device are searched for a node whose resource information has not yet been acquired. If there is no such node, the procedure ends. If there is one, that node is selected as the next processing target and the procedure proceeds to step S302.

In step S302, the resource information of the computing devices on the selected node is acquired. The OpenCL interface defines APIs for acquiring information on the available platforms and on the devices of each platform. The resource information acquisition unit 10122 accesses the remote OpenCL driver 1022 through the network interfaces 1013 and 1021 and uses these APIs to acquire the computing resource information.

In step S303, the data transfer rate acquisition unit 10123 acquires the data transfer rate between the selected node and the OpenCL application execution node.

After step S303, the procedure returns to step S301. Steps S301 to S303 are thus repeated once for each node on the network that has an OpenCL interface-compatible computing device.
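The loop over steps S301 to S303 can be sketched as follows. This is a minimal illustration, not the patented implementation: `collect_device_info`, `get_resource_info`, and `measure_transfer_rate` are hypothetical names, and the two callables stand in for the remote OpenCL queries and the rate measurement. In a real OpenCL host program, platform and device information is exposed through `clGetPlatformIDs`, `clGetDeviceIDs`, and `clGetDeviceInfo`.

```python
def collect_device_info(nodes, get_resource_info, measure_transfer_rate):
    """Repeat steps S301 to S303 until every node has been processed."""
    table = {}
    while True:
        # S301: look for a node whose resource information is still missing.
        pending = [node for node in nodes if node not in table]
        if not pending:
            return table  # nothing left to acquire, so the procedure ends
        node = pending[0]
        # S302: acquire the resource information of the node's devices.
        resources = get_resource_info(node)
        # S303: acquire the data transfer rate to the node.
        rate = measure_transfer_rate(node)
        table[node] = {"resources": resources, "transfer_rate": rate}


# Usage with stub callables (all values are placeholders).
nodes = ["10.20.30.40", "10.20.40.50"]
table = collect_device_info(
    nodes,
    get_resource_info=lambda node: {"processor_type": "GPU"},
    measure_transfer_rate=lambda node: 1.0,
)
print(list(table))  # ['10.20.30.40', '10.20.40.50']
```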

As a result of the above steps, the information about the computing devices acquired by the resource information acquisition unit and the data transfer rate acquisition unit is collected in the computing device information management unit, which generates a data list associating the resource information with the data transfer rate. FIG. 4 shows an example of the computing device information managed by the computing device information management unit. A concrete example of the above steps is described below with reference to this figure.

In step S301, the network is searched for nodes having an OpenCL interface-compatible computing device, and suppose that the node with IP address 10.20.30.40 is found. The IP address is recorded as this node's identification information in the node identification information column 401.

In step S302, the resource information of this node is acquired. With the OpenCL interface, platform information is acquired first. In OpenCL, a single node can have multiple OpenCL execution environments (platforms), and the information of each platform can be acquired. Each platform is identified by a platform ID. Suppose the platform information shows that this node has two platforms, with platform IDs 0 and 1. This information is recorded in the acquired OpenCL platform ID column 402.

Next, computing device information is acquired for each platform. In OpenCL, a single platform can have multiple computing devices, and the information of each computing device can be acquired. Each computing device is identified by a device ID. Suppose the computing device information shows that the platform with platform ID 0 has two computing devices, with device IDs 0 and 1, and that the platform with platform ID 1 has one computing device, with device ID 0. This information is recorded in the acquired OpenCL device ID column 403.

Next, computing capability information is acquired for each computing device. OpenCL makes a wide range of information available for each computing device; here, the information needed to compare the computing capabilities of the devices is acquired, for example the processor type (GPU, CPU, and so on), the number of compute units, the clock frequency, the memory size, and the device name. Take the case where the processor type is acquired as one item of computing capability information. Suppose the processor type obtained is GPU for the device with platform ID 0 and device ID 0, GPU for the device with platform ID 0 and device ID 1, and CPU for the device with platform ID 1 and device ID 0. This information is recorded in the computing capability information column 404. This completes the acquisition of resource information in step S302.

In step S303, the data transfer rate to and from this node is acquired. One way to measure the transfer rate is to generate a temporary OpenCL application that involves input and output of a large amount of data and run it on the selected node. Alternatively, the data transfer rate of the network between the nodes may be measured by a method that does not depend on the OpenCL interface. Suppose the data transfer rate obtained is AA for the device with platform ID 0 and device ID 0, BB for the device with platform ID 0 and device ID 1, and CC for the device with platform ID 1 and device ID 0. The transfer rates obtained here are recorded in the data transfer rate column 405.
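One OpenCL-independent way to measure a transfer rate is simply to time the transfer of a payload of known size. The sketch below assumes a hypothetical `send` callable that blocks until the payload has reached the node; the function name, the default payload size, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


def measure_transfer_rate(send, payload_size=1_000_000, clock=time.perf_counter):
    """Return the observed transfer rate in bytes per second.

    `send` is a hypothetical blocking function that transfers the payload
    to the target node; it is not part of any real library.
    """
    payload = bytes(payload_size)  # zero-filled test data
    start = clock()
    send(payload)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("transfer too fast to time; use a larger payload")
    return payload_size / elapsed


# Usage with a fake clock so the result is deterministic:
ticks = iter([0.0, 0.5])  # the transfer "takes" 0.5 seconds
rate = measure_transfer_rate(lambda data: None, 1000, clock=lambda: next(ticks))
print(rate)  # 2000.0
```

A single measurement like this is sensitive to momentary network load, so repeating it and averaging may give a steadier figure.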

This completes the processing for one node, and the same processing is then performed for the next node. The resource information and data transfer rate of the node with IP address 10.20.40.50 are acquired and recorded in columns 401 to 405.
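The computing device information described for FIG. 4 can be pictured as a list of records, one per device. The sketch below fills in only the rows given in the text for node 10.20.30.40; the `DeviceRecord` class and the `find` helper are hypothetical, and the transfer rates are kept as the placeholder labels AA, BB, and CC used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceRecord:
    node: str            # column 401: node identification information (IP address)
    platform_id: int     # column 402: acquired OpenCL platform ID
    device_id: int       # column 403: acquired OpenCL device ID
    processor_type: str  # column 404: computing capability information
    transfer_rate: str   # column 405: data transfer rate (placeholder label)


device_table = [
    DeviceRecord("10.20.30.40", 0, 0, "GPU", "AA"),
    DeviceRecord("10.20.30.40", 0, 1, "GPU", "BB"),
    DeviceRecord("10.20.30.40", 1, 0, "CPU", "CC"),
]


def find(table, node, platform_id, device_id):
    """Look up a record by its (node, platform ID, device ID) triple."""
    for record in table:
        if (record.node, record.platform_id, record.device_id) == (
            node, platform_id, device_id
        ):
            return record
    return None


print(find(device_table, "10.20.30.40", 1, 0).processor_type)  # CPU
```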

FIG. 5 is a flowchart showing the procedure by which the computing device selection unit 10124 uses this computing device information to select a computing device suitable for executing an OpenCL application.

This procedure may be triggered when the OpenCL application is executed for the first time. Alternatively, it may be performed at system startup, for example, so that a suitable computing device is determined in advance for each OpenCL application that might be executed.

In step S501, information on the amount of data to be transferred and information on the computation are acquired from the OpenCL application to be executed. This information is needed in step S503, described below, to obtain the data transfer time and the computation execution time for each computing device. As the data amount information, the amount of data input to and output from the OpenCL application and the size of the OpenCL application program itself are acquired.

Steps S502 to S505 are repeated once for each computing device acquired earlier. In step S502, the device information is searched for a device on which the evaluation processing of steps S503 to S505 has not yet been performed; if there is none, the procedure ends. If there is a device that has not been evaluated, that computing device is selected as the processing target and the procedure proceeds to step S503.

In step S503, the total of the data transfer time and the computation execution time is estimated for the case where the OpenCL application is executed on the selected computing device. The data transfer rate and computing capability information of the device are acquired from the computing device information management unit 10125, and the total time is estimated from these together with the data amount information and computation information of the application acquired in step S501. The total time is then recorded as part of the computing device information managed by the computing device information management unit.

The data transfer time can be calculated from the data amount information and the data transfer rate.

For the computation execution time, one method is to prepare in advance a table that converts a computation amount into a computation execution time for each type of device, such as GPU or CPU, and to look up the execution time from the computation amount of the application obtained in step S501. Another method is to actually run the OpenCL application on the computing device and measure the computation execution time.
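A minimal sketch of the estimate in step S503 follows, assuming the table-based method. The conversion table, its per-unit costs, and the notion of a "work unit" are hypothetical values chosen so that the result reproduces the A/B example from the background section; the document does not specify the table's contents.

```python
# Hypothetical conversion table: seconds per unit of work, by processor type.
SECONDS_PER_WORK_UNIT = {"GPU": 0.5, "CPU": 1.0}


def estimate_total_time(data_bytes, rate_bytes_per_s, work_units, processor_type):
    """Estimated completion time = data transfer time + computation time."""
    transfer_s = data_bytes / rate_bytes_per_s  # data amount / transfer rate
    compute_s = work_units * SECONDS_PER_WORK_UNIT[processor_type]  # table lookup
    return transfer_s + compute_s


# A fast GPU behind a slow link versus a slower CPU behind a fast link.
gpu_total = estimate_total_time(8_000_000, 2_000_000, 2, "GPU")
cpu_total = estimate_total_time(8_000_000, 4_000_000, 2, "CPU")
print(gpu_total, cpu_total)  # 5.0 4.0
```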

In step S504, if the total time estimated in step S503 is smaller than the estimated total times of the other computing devices, the procedure proceeds to step S505. If it is not smaller than those of the other devices, the procedure returns to step S502 and moves on to evaluating another device.

In step S505, the device is selected as the device that executes the OpenCL application. Specifically, the computing device information management unit is notified that the device has been selected to execute the OpenCL application, and the computing device information management unit records this.

By repeating steps S502 to S505 once for each computing device in this way, the device with the smallest total time is found and the computing device that executes the OpenCL application is determined.
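The selection loop of steps S502 to S505 amounts to a single pass that keeps a running minimum. The sketch below assumes the totals from step S503 have already been estimated and are passed in as a dictionary; the function name and the sample values are hypothetical.

```python
def select_device(estimated_totals):
    """Walk the devices once, keeping the one with the smallest estimated total."""
    selected = None
    best = float("inf")
    for device, total in estimated_totals.items():  # S502: next unevaluated device
        # S503 would estimate `total` here; in this sketch it is supplied.
        if total < best:                    # S504: smaller than the others so far?
            selected, best = device, total  # S505: record as the execution device
    return selected


# Keys are (node, platform ID, device ID); the times are made-up values.
totals = {
    ("10.20.30.40", 0, 0): 6.0,
    ("10.20.30.40", 0, 1): 4.5,
    ("10.20.30.40", 1, 0): 7.0,
}
print(select_device(totals))  # ('10.20.30.40', 0, 1)
```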

FIG. 6 shows an example of the computing device information after this procedure has been executed and the total of the data transfer time and the computation execution time for a specific OpenCL application has been added. The value for each computing device is added in column 606, the total of the data transfer time and computation execution time for the specific application. In this example, the device with node identification information 10.20.30.40, platform ID 0, and device ID 1 has the smallest total time, so it is selected as the computing device that executes the application.

The following describes an example of operation when an OpenCL application is executed while the computing device information management unit 10125 manages computing device information such as that shown in FIG. 6.

The OpenCL application requests the OpenCL API conversion unit 10121 to execute computation. To determine which real device among the computing devices on the network will perform the computation, the OpenCL API conversion unit queries the computing device information management unit for device information. The computing device information management unit passes to the OpenCL API conversion unit the information of the device with the smallest total of data transfer time and computation time, namely the device with node identification information 10.20.30.40, platform ID 0, and device ID 0. The OpenCL API conversion unit then causes that device on the node to execute the computation.

Next, an example is described in which the device that executes the OpenCL application changes because a node having an OpenCL interface-compatible computing device is added to the network.

For the newly connected node having an OpenCL interface-compatible computing device, the cloud-compatible OpenCL driver executes the procedure for acquiring resource information and data transfer time described with reference to FIG. 3 and the procedure for selecting a computing device described with reference to FIG. 5. Suppose that, as a result, the computing device information managed by the computing device information management unit becomes as shown in FIG. 7. The newly added node has node identification information 50.40.30.20 and two platforms, each with one device. Taking into account the devices on the previously acquired nodes as well, the computing device with the smallest total of data transfer time and computation time (the time until processing completes) is the device with node identification information 50.40.30.20, platform ID 0, and device ID 0. This device is therefore selected to execute the OpenCL application, and the computation is executed on it. Note that although this embodiment selects the device that minimizes the time until processing completes, it suffices for that time to be smaller than a predetermined value set in advance; it need not be the minimum.
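The effect of adding a node, and the relaxed criterion of accepting any device whose completion time is below a predetermined value, can be sketched as follows. The totals, the threshold, and the function name `select_below_threshold` are hypothetical. Falling back to the minimum when no device meets the threshold is an assumption of this sketch and is not stated in the description.

```python
def select_below_threshold(totals, threshold):
    """Return the first device whose total time is below `threshold`.

    `totals` maps (node, platform ID, device ID) to an estimated total time.
    Falling back to the overall minimum is an assumption of this sketch.
    """
    for key, total in totals.items():
        if total < threshold:
            return key
    return min(totals, key=totals.get)


# Hypothetical totals (seconds) before the new node is connected.
totals = {
    ("10.20.30.40", 0, 0): 1.25,
    ("10.20.30.40", 0, 1): 0.75,
}
before = min(totals, key=totals.get)

# A node with two platforms, one device each, is added and evaluated.
totals[("50.40.30.20", 0, 0)] = 0.5
totals[("50.40.30.20", 1, 0)] = 0.875
after = min(totals, key=totals.get)

# Relaxed criterion: any device finishing in under 1.0 s is acceptable.
relaxed = select_below_threshold(totals, threshold=1.0)

print(before, after, relaxed)
```

In this made-up case the minimum moves to the new node once it is registered, while the threshold criterion is already satisfied by a device on the original node.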

According to the processing system of this embodiment described above, the computing device best suited to executing the OpenCL application can be selected from the computing devices on the network with the data transfer time taken into account.

(Second Embodiment)
The second embodiment is described below, focusing on its differences from the first embodiment. The technical scope of the present invention is not, however, limited to this embodiment.

This embodiment describes an example in which a unique virtual ID is assigned to each OpenCL interface-compatible computing device on the network and the optimal computing device is provided to the OpenCL application.

FIG. 8 shows the configuration of the cloud-compatible OpenCL driver in this embodiment. A device ID assignment unit 10126 is added to the blocks shown in FIG. 2. The other blocks are the same as the blocks with the same reference numerals in FIG. 2.

Reference numeral 10126 denotes the device ID assignment unit, which assigns a unique ID to each computing device on the nodes connected to the network.

FIG. 9 is a flowchart showing the procedure by which the resource acquisition unit, the data transfer time acquisition unit, and the device ID assignment unit cooperate to acquire the resource information and data transfer time of the computing devices of a node connected to the network and to assign unique device IDs.

Steps S301 to S303 are the same as the steps with the same reference numerals described with reference to FIG. 3. After step S303, the process proceeds to step S904.

In step S904, the device ID assignment unit assigns virtual device IDs to the computing devices on the node. Under the OpenCL interface, each computing device obtained in the resource information acquisition step has been given a device ID that identifies it uniquely within its node, but that device ID does not identify it uniquely across the nodes on the network. Ordinarily, therefore, a device is uniquely identified by combining the device ID with information that identifies the node, such as an IP address. In this step, a virtual device ID that is unique among all computing devices on the network is assigned. As a result, the OpenCL application does not need to know which node a computing device resides on and can access the device in the same way as in a conventional in-device OpenCL execution environment.

After step S904, the process returns to step S301. Steps S301 to S904 are then repeated once for each node on the network that has an OpenCL interface-compatible computing device.

As a result of the above steps, the information on the computing devices acquired and determined by each processing unit is gathered in the computing device information management unit, which generates a data list associating the resource information, data transfer speed, and virtual device ID. FIG. 10 shows an example of the computing device information managed by the computing device information management unit. A specific example of device ID assignment is described with reference to this figure.

The information in columns 401 to 405 is the same as the information with the same reference numerals described with reference to FIG. 4.

In step S904 for the node with node identification information 10.20.30.40, virtual device IDs that are unique across nodes are assigned to the computing devices on that node. Suppose the device with platform ID 0 and device ID 0 is given virtual device ID 0. Likewise, suppose the device with platform ID 0 and device ID 1 is given virtual device ID 1, and the device with platform ID 1 and device ID 0 is given virtual device ID 2. The virtual device IDs obtained here are recorded in column 1006, the assigned virtual device ID column.

This completes the processing for one node, and the same processing is performed for the other nodes. For the nodes with node identification information 10.20.40.50 and 50.40.30.20, virtual device IDs that remain unique across nodes are assigned and recorded in column 1006. In this example, numbers in ascending order from 0 are assigned as virtual device IDs, but any unique, identifiable values may be used.
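The ascending-number assignment described above can be sketched as follows. The function name `assign_virtual_ids` and the dictionary layout are hypothetical. The device layout of node 10.20.30.40 follows the example above and that of node 50.40.30.20 follows the first embodiment; the single device listed for node 10.20.40.50 is an assumption.

```python
from itertools import count


def assign_virtual_ids(nodes):
    """Map each (node, platform ID, device ID) to a network-wide unique virtual ID."""
    next_id = count(0)   # ascending numbers from 0, as in the example
    table = {}
    for node, devices in nodes.items():            # one pass per node
        for platform_id, device_id in devices:     # S904: assign IDs on this node
            table[(node, platform_id, device_id)] = next(next_id)
    return table


nodes = {
    "10.20.30.40": [(0, 0), (0, 1), (1, 0)],
    "10.20.40.50": [(0, 0)],                # assumed layout for this node
    "50.40.30.20": [(0, 0), (1, 0)],
}
virtual_ids = assign_virtual_ids(nodes)
for key, vid in virtual_ids.items():
    print(vid, key)
```

The node-local device ID 0 appears several times across nodes and platforms, but every entry receives a distinct virtual device ID.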

Next, FIG. 11 shows an example of the computing device information after the procedure for selecting a computing device suited to execution, described with reference to FIG. 5, has been executed for a plurality of OpenCL applications and the OpenCL applications to be executed on each computing device have been added.

Suppose the computing device selection procedure has been executed for the OpenCL applications App X, App Y, and App Z. Suppose that the device with the smallest total of data transfer time and computation time for App X is the device with node identification information 10.20.30.40, platform ID 0, and device ID 1, and that for both App Y and App Z it is the device with node identification information 50.40.30.20, platform ID 0, and device ID 0. In this case, each application is recorded in the row of the corresponding device in column 1107, the column of OpenCL applications to be executed on the device.

The following describes an example of operation when an OpenCL application developed for an existing in-device OpenCL execution environment is executed while the computing device information management unit 10125 manages computing device information such as that shown in FIG. 11.

Before executing computation, the OpenCL application obtains device information, using the API that retrieves information on the devices available on the platform, in order to decide which device to use. When the OpenCL application calls this API on the OpenCL API conversion unit 10121, the OpenCL API conversion unit acquires the device information from the computing device information management unit 10125 and provides it to the application. At this time, the virtual device ID is provided as the information corresponding to the OpenCL device ID. When the computing devices shown in FIG. 11 exist on the network, information on six devices is provided to the application.

In OpenCL, it is basically the application that chooses which device it runs on. Specifically, the application selects one device it considers suitable from the obtained device IDs and requests the OpenCL driver to execute the computation, specifying that device ID.

An OpenCL application developed for an in-device OpenCL execution environment often does not assume that multiple devices exist, so suppose, for example, that it selects the device ID of the first device identified. In this example, suppose it selects the device with virtual device ID 0.
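What the application sees at this point can be sketched as follows. The classes `DeviceInfoManager` and `ApiConversionUnit` and the method `get_device_ids` are hypothetical stand-ins for units 10125 and 10121; they are not real OpenCL API calls. Virtual device IDs 0 to 2 follow the example above, while the assignment of IDs 3 to 5 is an assumption.

```python
class DeviceInfoManager:
    """Hypothetical stand-in for the computing device information management unit."""

    def __init__(self, records):
        # virtual device ID -> (node, platform ID, node-local device ID)
        self._records = dict(records)

    def virtual_ids(self):
        return sorted(self._records)


class ApiConversionUnit:
    """Hypothetical stand-in for the OpenCL API conversion unit."""

    def __init__(self, manager):
        self._manager = manager

    def get_device_ids(self):
        # The application receives virtual IDs in place of OpenCL device IDs
        # and never sees node identification information.
        return self._manager.virtual_ids()


records = {
    0: ("10.20.30.40", 0, 0),
    1: ("10.20.30.40", 0, 1),
    2: ("10.20.30.40", 1, 0),
    3: ("10.20.40.50", 0, 0),   # IDs 3 to 5 are assumed for illustration
    4: ("50.40.30.20", 0, 0),
    5: ("50.40.30.20", 1, 0),
}
driver = ApiConversionUnit(DeviceInfoManager(records))

visible = driver.get_device_ids()
legacy_choice = visible[0]   # a legacy application simply takes the first device
print(visible, legacy_choice)
```

The application is handed six device IDs and, like the legacy application in the example, picks the first one, virtual device ID 0.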

Having been requested to execute the computation with device ID 0, the OpenCL API conversion unit queries the computing device information management unit, passing the application information along with the virtual device ID, in order to determine the real device on the network.

As a first example, suppose the application is one other than App X, App Y, or App Z. The computing device information management unit checks whether the queried application matches any OpenCL application in column 1107 of FIG. 11. In this case there is no match, so it passes to the OpenCL API conversion unit the information of the device with virtual device ID 0, that is, the device with node identification information 10.20.30.40, platform ID 0, and device ID 0. The OpenCL API conversion unit causes that device on the node to execute the computation. Because no computing device suited to execution has been identified for this application, the computation is executed as-is on the device the application specified.

As a different example, suppose the application is App X. The computing device information management unit checks whether the queried application matches any OpenCL application in column 1107 of FIG. 11 and finds a match. Because App X is better executed on the device with virtual device ID 1, the unit passes to the OpenCL API conversion unit the information of the device with node identification information 10.20.30.40, platform ID 0, and device ID 1. At the same time, it swaps the virtual device IDs that it manages internally. FIG. 12 shows the computing device information after the swap. The OpenCL API conversion unit causes that device on the node to execute the computation, so the computing device best suited to App X is given priority and the processing is executed on it.
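The two cases above can be sketched as a lookup that overrides the requested device when a preferred device is recorded for the application. The class `DeviceTable`, its `resolve` method, and the application name "App W" are hypothetical. Interpreting the swap as an exchange of the two virtual device IDs involved is one reading of the description, since the contents of FIG. 12 are not reproduced here.

```python
class DeviceTable:
    """Hypothetical model of the information held by the management unit."""

    def __init__(self, virtual_to_real, preferred):
        self.virtual_to_real = dict(virtual_to_real)  # virtual ID -> real device
        self.preferred = dict(preferred)              # application -> real device

    def resolve(self, app, virtual_id):
        """Return the real device that should run `app`."""
        requested = self.virtual_to_real[virtual_id]
        best = self.preferred.get(app)
        if best is None or best == requested:
            # No better device is recorded: run where the application asked.
            return requested
        # A better device is recorded: exchange the two virtual IDs so that
        # the ID chosen by the application now points at the better device.
        best_vid = next(v for v, r in self.virtual_to_real.items() if r == best)
        self.virtual_to_real[virtual_id] = best
        self.virtual_to_real[best_vid] = requested
        return best


table = DeviceTable(
    virtual_to_real={
        0: ("10.20.30.40", 0, 0),
        1: ("10.20.30.40", 0, 1),
        2: ("10.20.30.40", 1, 0),
    },
    preferred={"App X": ("10.20.30.40", 0, 1)},
)

first = table.resolve("App W", 0)    # an application with no recorded preference
second = table.resolve("App X", 0)   # App X is redirected to its preferred device
print(first, second, table.virtual_to_real)
```

After the second call, virtual device ID 0 refers to the device preferred by App X, so the unmodified application keeps using ID 0 while the computation runs on the better device.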

According to the processing system of this embodiment described above, the optimal device can be selected from the computing devices on the network with the data transfer time taken into account. In addition, an OpenCL application developed as a conventional in-device application can select and use the optimal OpenCL device on the network without being modified.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

101 OpenCL application execution node
102, 103, 104 OpenCL interface-compatible computing device nodes
105 Network
1011 OpenCL application
1012 Cloud-compatible OpenCL driver
1013, 1021 Network interface
1022 OpenCL driver
1023 Computing device
10121 OpenCL API conversion unit
10122 Resource information acquisition unit
10123 Data transfer speed acquisition unit
10124 Computing device selection unit
10125 Computing device information management unit
10126 Device ID assignment unit

Claims

1. An information processing apparatus that distributes data processing to a plurality of computing devices on a network and causes them to execute it, the apparatus comprising:
first acquisition means for acquiring first information on the data transfer time to each of the plurality of computing devices;
second acquisition means for acquiring second information on the computing capability of each of the plurality of computing devices; and
selection means for selecting, based on the first information and the second information, a computing device to perform the data processing from among the plurality of computing devices such that the time until completion of the data processing is smaller than a predetermined value.

2. The information processing apparatus according to claim 1, wherein the data processing is processing of an OpenCL application, and
the second acquisition means acquires the second information on the computing capability of each of the plurality of computing devices using an OpenCL interface.

3. The information processing apparatus according to claim 2, further comprising:
assignment means for assigning, to each of the plurality of computing devices, a virtual device ID that is unique on the network; and
management means for managing the virtual device ID and the OpenCL device ID in association with each other and providing the virtual device ID to the OpenCL application as the OpenCL device ID.

4. The information processing apparatus according to claim 3, further comprising setting means for giving priority to the selection made by the selection means when the computing device specified at the time of execution of the OpenCL application differs from the computing device selected by the selection means.

5. An information processing method for an information processing apparatus that distributes data processing to a plurality of computing devices on a network and causes them to execute it, the method comprising:
a first acquisition step in which first acquisition means acquires first information on the data transfer time to each of the plurality of computing devices;
a second acquisition step in which second acquisition means acquires second information on the computing capability of each of the plurality of computing devices; and
a selection step in which selection means selects, based on the first information and the second information, a computing device to perform the data processing from among the plurality of computing devices such that the time until completion of the data processing is smaller than a predetermined value.

6. A computer program for causing a computer to function as an information processing apparatus that distributes data processing to a plurality of computing devices on a network and causes them to execute it, the apparatus comprising:
first acquisition means for acquiring first information on the data transfer time to each of the plurality of computing devices;
second acquisition means for acquiring second information on the computing capability of each of the plurality of computing devices; and
selection means for selecting, based on the first information and the second information, a computing device to perform the data processing from among the plurality of computing devices such that the time until completion of the data processing is smaller than a predetermined value.