JP5832311B2

JP5832311B2 - Reconfiguration device, process allocation method, and program

Info

Publication number: JP5832311B2
Application number: JP2012003497A
Authority: JP
Inventors: 悠介谷内出
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2011-02-08
Filing date: 2012-01-11
Publication date: 2015-12-16
Anticipated expiration: 2032-01-11
Also published as: JP2012181824A

Description

The present invention relates to a technique for generating circuit configuration information for a reconfigurable device whose circuit configuration can be changed.

Reconfigurable devices have been proposed that allow the processing performed by a circuit to be changed, even in an LSI circuit device that has already been manufactured, by changing its internal circuit configuration. Because the processing can be changed after manufacture, the LSI does not have to be remade when, for example, the specification changes. Reconfigurable devices are now used in a variety of fields because they can hold down manufacturing cost and shorten the development period.

Typical reconfigurable devices are of the type equipped with a large number of LUTs (look-up tables) or the type equipped with a large number of processing elements. Each element is connected to switching elements such as multiplexers. Here, the settings for operating each component, such as the LUTs, processing elements, and switching elements, are collectively called circuit configuration information. There are various ways to generate circuit configuration information, but generation is generally divided into three steps that are usually performed in order: (1) logical assignment of processes, called technology mapping; (2) physical placement onto the components; and (3) routing between the components. The circuit configuration information is finally generated through these three steps.

(1) In the logical assignment step, processes are assigned to components. Specifically, each process is assigned to a logical component without identifying a physical component. The criteria generally used for changing the order are circuit area, operating speed, and power consumption. (2) In the placement step, the physical assignment is determined, that is, which component in the reconfigurable device performs which process. The number of switching elements involved in data communication depends on the distance between the processing elements that hold processes with a data input/output relationship, so the delay time (and hence the maximum operating frequency) varies greatly. For this reason, placing processes that have an input/output relationship on processing elements that are as close together as possible is normally important for shortening the delay time. (3) In the routing step, routes through the switching elements are determined for data communication between processing elements that have an input/output relationship. Because this step, unlike the placement step, fixes concrete routes, it is important to choose routes that shorten the delay between processing elements.

In recent years, the scale of processing that a reconfigurable device can execute has grown as integration density has improved. Recently, however, the demands placed on the processing itself have grown even faster in complexity and sophistication, and it can be difficult to perform all of the processing at once on a single reconfigurable device. One way to address this is to perform the processing sequentially on a single reconfigurable device by time division. More specifically, the desired processing is first divided, and circuit configuration information is generated for each divided process. The circuit configuration of the reconfigurable device is then changed according to that circuit configuration information and the process is executed, and this is repeated in sequence. This makes large-scale processing possible on the reconfigurable device. However, if the entire circuit configuration is changed every time, the overall processing time becomes long and speed performance degrades. Processing speed degrades in the same way when the processing is divided into many parts.

One way to solve this problem is the multi-context reconfigurable device. A context is circuit configuration information, and a multi-context reconfigurable device is one that has, inside the device, memory storing multiple sets of circuit configuration information. To change the circuit configuration, the device is reconfigured by switching between these memories; because the switch is fast, circuit reconfiguration time can be shortened considerably. However, the additional memory for circuit configuration information increases the circuit scale.

In contrast, Patent Document 1 proposes a method based on skeleton circuit technology as a way to shorten reconfiguration time. In this method, circuit configuration information called a preceding base circuit is configured in the reconfigurable device in advance. The preceding base circuit is circuit configuration information consisting of a common circuit portion that is common to all of a plurality of sets of circuit configuration information, and a non-exclusive independent circuit portion that is not common among the plurality of circuits and does not share circuit configuration information on the reconfigurable device. The circuit needed for a given process is then configured by partially reconfiguring only the circuit difference on the reconfigurable device. Unlike the multi-context type, this method needs no additional configuration memory, so the circuit scale does not increase.

Patent Document 1: Japanese Patent No. 3558119

However, a reconfigurable device may generally run a variety of applications, and for some applications the common portion is small. The number of sets of circuit configuration information to be reconfigured also differs from application to application. With the preceding base circuit generation described in Patent Document 1, it is difficult to reduce the time needed to change the circuit configuration efficiently when the common portion is small, or when there are so many sets of circuit configuration information that they greatly exceed the circuit scale of the reconfigurable device.

The present invention has been made in view of the problems described above, and its object is to reduce the circuit change period efficiently, without increasing the circuit scale, by taking the order of circuit configuration changes into account.

The process allocation method of the present invention allocates processes to the components of a reconfigurable device composed of a plurality of components. The method comprises a data flow input step of inputting at least two different data flows and the execution order of those data flows; a constraint step of inputting constraints on the components; and a process allocation determination step of determining the process allocation, based on the component constraints and the execution order, so that the number of setting changes required to reconfigure the components is reduced.

According to the present invention, creating the circuit configuration information so as to reduce the number of settings required for reconfiguration makes it possible to shorten the reconfiguration period of the reconfigurable device without increasing the circuit scale.
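As a rough illustration of what "reducing the number of setting changes" can mean, the following minimal sketch counts how many setting values differ between what an element already holds and what a process needs, and then tries every placement to find the cheapest one. It is not the allocation procedure of the embodiments: the dictionary representation, the names `changes` and `assign`, the example values, and the exhaustive search are all assumptions made for illustration, and routing settings and component constraints are ignored.

```python
from itertools import permutations

def changes(old, new):
    """Number of setting values in `new` that differ from what `old` holds."""
    return sum(1 for addr, value in new.items() if old.get(addr) != value)

def assign(current, processes):
    """Place each process on a distinct element so that the fewest settings change.

    current:   {element_name: {address: value}}  settings already loaded
    processes: {process_name: {address: value}}  settings each process needs
    Returns (total_changes, {process_name: element_name}).
    """
    elements = list(current)
    names = list(processes)
    best = None
    for perm in permutations(elements, len(names)):
        cost = sum(changes(current[e], processes[p]) for p, e in zip(names, perm))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, perm)))
    return best

# Hypothetical settings left behind by the previously executed data flow.
current = {
    "PE_a": {0x08: "mac", 0x1C: 3},
    "PE_b": {0x08: "cmp", 0x1C: 7},
}
# Hypothetical settings needed by the next data flow.
processes = {
    "p0": {0x08: "cmp", 0x1C: 7},  # identical to what PE_b already holds
    "p1": {0x08: "mac", 0x1C: 5},  # differs from PE_a in one value
}
cost, placement = assign(current, processes)
```

With these example values, placing `p0` on `PE_b` and `p1` on `PE_a` needs only one setting write, whereas the opposite placement needs four.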

A diagram showing a configuration example of a processing apparatus that includes a reconfigurable device.
A diagram showing a configuration example of the reconfigurable device.
A diagram showing an example of the data communication procedure between elements of the reconfigurable device.
A diagram showing a configuration example of a processing element.
A diagram showing an example of the configuration command format.
A diagram showing an outline example of the settings stored in the configuration memory of a processing element.
A flowchart showing the procedure for reading and writing settings.
A diagram showing a configuration example of a switching element.
A diagram showing an outline example of the settings stored in the configuration memory of a switching element.
A time chart for executing a plurality of data flows sequentially.
A diagram showing an outline example of process allocation.
A diagram showing an outline example of process allocation for data flows in the first embodiment.
A flowchart showing the processing for performing process allocation in the first embodiment.
A diagram showing an outline example of process allocation for data flows in the second embodiment.
A diagram showing an outline example of process allocation for data flows in the third embodiment.
A diagram showing an outline example of process allocation for data flows in the fourth embodiment.
A block diagram showing an example of the configuration of an apparatus that creates circuit configuration information.

Preferred embodiments to which the present invention is applied are described in detail below with reference to the accompanying drawings.

FIG. 1 shows an example of the overall configuration of a system having a reconfigurable device according to an embodiment of the present invention. The external memory 101 holds circuit configuration information 106. The circuit configuration information 106 is the group of settings that operate the elements making up the reconfigurable device 105. The configuration controller 102 acquires the circuit configuration information 106 from the memory 101 through connection 104 and sends it to the reconfigurable device 105 through connection 103. Here, a processing element array is used as an example of the reconfigurable device 105.

The inside of the processing element array is described concretely below, but the present invention is not limited to the processing element configuration or routing configuration described here.

FIG. 2 shows an outline of the processing element array that serves as the reconfigurable device in this embodiment. In the reconfigurable device, switching elements 201, which are 8-input, 8-output input/output processing means, are arranged in a two-dimensional grid, and processing elements 202, which are 4-input, 4-output arithmetic processing means, are placed within the grid of switching elements 201. The eight inputs and eight outputs of each of the switching elements 201a to 201i are grouped into pairs of one input and one output. Four pairs connect to the four other switching elements 201 to the east, west, south, and north (right, left, below, and above) via connections 203a and 203b. The other four pairs connect bidirectionally to four processing elements 202 to the northeast, southeast, southwest, and northwest via connections 204a and 204b. In addition, the switching elements 201a to 201i and the processing elements 202a to 202d are connected in a one-way daisy chain by connection 205.

Connections 203a, 203b, 204a, and 204b carry the data to be processed between the switching elements 201 and the processing elements 202. Connection 205 supplies settings to the switching elements 201 and the processing elements 202. These settings determine, for a switching element 201, the input and output destinations of the data to be processed, and, for a processing element 202, the input and output destinations of the data and the processing to be performed. The arrows on connections 203a, 203b, 204a, 204b, and 205 in FIG. 2 indicate the direction of data flow. The switching elements 201a to 201i all have the same configuration, and the processing elements 202a to 202d all have the same configuration.

FIG. 3 shows, as an example of the communication protocol on connections 203a, 203b, 204a, 204b, and 205, a two-wire handshake using a Valid signal and a Ready signal. In FIG. 3, a data signal line 303, a Valid signal line 304, and a Ready signal line 305 connect module A 301 on the sending side and module B 302 on the receiving side. The Valid signal, carried on the Valid signal line 304, tells the receiver that the sender is able to send. The Ready signal, carried on the Ready signal line 305, tells the sender that the receiver is able to accept data. In this protocol, the data on the data signal line 303 is transferred from module A 301 to module B 302 at each rising clock edge at which both the Valid signal line 304 from module A 301 and the Ready signal line 305 from module B 302 are asserted. In the waveform of FIG. 3, data A is transferred from module A 301 to module B 302 at timing 306a, data B at timing 306b, data C at timing 306c, and data D at timing 306d.

FIG. 4 shows the configuration of the processing element 202. The processing element 202 consists of a configuration unit 401, an input unit 402, a computational unit 403, an output unit 404, and a temporary buffer 405.
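The transfer rule of the Valid/Ready handshake in FIG. 3 can be illustrated with a short simulation. This is a sketch only: the function name `simulate` and the sample waveform are invented for illustration and do not reproduce the waveform drawn in FIG. 3.

```python
def simulate(valid, ready, data):
    """Return the data words transferred across the link.

    valid[i], ready[i]: signal levels sampled at rising clock edge i
    data[i]:            value on the data signal line at rising clock edge i
    A word is transferred only at edges where Valid and Ready are both asserted.
    """
    return [d for v, r, d in zip(valid, ready, data) if v and r]

# Hypothetical waveform: the receiver stalls at edge 2, so the sender
# keeps B on the data line until Ready is asserted again at edge 3.
valid = [0, 1, 1, 1, 0, 1, 1]
ready = [1, 1, 0, 1, 1, 1, 1]
data = [None, "A", "B", "B", None, "C", "D"]

transferred = simulate(valid, ready, data)
```

Holding the data stable while Valid is asserted and Ready is not is what lets the receiver apply back-pressure without losing a word.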

The configuration unit 401 manages the settings that determine how the processing element 202 operates. The input unit 402 performs input processing according to the settings of the configuration unit 401. The computational unit 403 performs arithmetic processing according to the settings of the configuration unit 401. The computational unit 403 can also hold a processed result in the temporary buffer 405 so that the result can be fed back into the computational unit 403. The output unit 404 performs output processing according to the settings of the configuration unit 401.

The operation of the processing element 202 is now described more concretely. The input unit 402 acquires, through connection 406, the setting from the configuration unit 401 that determines its input source. This setting specifies the input port through which the unit communicates with an externally connected module. Based on that information, the unit acquires the data to be processed through connections 204a-ne, 204a-se, 204a-sw, and 204a-nw. The suffixes ne, se, sw, and nw indicate directions: connection 204a-ne connects to the switching element located to the northeast, 204a-se to the one to the southeast, 204a-sw to the one to the southwest, and 204a-nw to the one to the northwest. The acquired data is sent to the computational unit 403 through connection 409.

The computational unit 403 acquires, through connection 407, the setting from the configuration unit 401 that determines the processing to perform. Based on that setting, it takes the data sent from the input unit 402 and performs the configured processing. The processed data is sent to the output unit 404 through connection 410.

The computational unit 403 has at least one arithmetic unit. The arithmetic unit may be, for example, an adder/subtractor, a comparator, a multiplier, a divider, or a logic unit, a combination of these, or a combination of these with other arithmetic units. As a concrete example, the description below assumes that the computational unit 403 can perform a multiply-accumulate operation and a comparison operation, and that it selects one of the two for each operation. The multiply-accumulate operation computes a·b + c·d. The comparison operation outputs c if a > b and outputs d otherwise. The computational unit 403 is also configured so that the arithmetic unit can be used repeatedly for a single input. When it is used repeatedly, the result produced by the arithmetic unit is first stored in the temporary buffer through connection 412 and is then fed back into the computational unit 403 through connection 411, where the arithmetic unit processes the fed-back data again. As described in detail later, the settings specify the type of operation, the repetition, and, for each operation, which value each of the variables a, b, c, and d refers to, or the value itself when the variable is a fixed value.
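The behavior described for the computational unit 403 can be sketched as follows, assuming each repetition is described by an operation name and four operand sources. The names `mac`, `compare`, `run`, and the `("in", ...)`, `("const", ...)`, `("tmp", ...)` source encoding are hypothetical; the source text only states that each variable refers to an input port, a fixed value, or the temporary buffer.

```python
def mac(a, b, c, d):
    """Multiply-accumulate: a*b + c*d."""
    return a * b + c * d

def compare(a, b, c, d):
    """Comparison: output c if a > b, otherwise d."""
    return c if a > b else d

OPS = {"mac": mac, "cmp": compare}

def run(iterations, inputs):
    """Apply the configured operations in order and return the last result.

    iterations: list of (op_name, sources); sources has four entries for
                a, b, c, d, each ("in", port_index), ("const", value)
                or ("tmp", None) for the previous result.
    inputs:     values present on the input ports.
    """
    tmp = None  # temporary buffer; empty before the first operation
    for op_name, sources in iterations:
        args = []
        for kind, x in sources:
            if kind == "in":
                args.append(inputs[x])
            elif kind == "const":
                args.append(x)
            else:
                args.append(tmp)
        tmp = OPS[op_name](*args)
    return tmp

# First pass: 2*3 + 4*5 = 26. Second pass: output 1 if 26 > 20, else 0.
program = [
    ("mac", [("in", 0), ("in", 1), ("in", 2), ("in", 3)]),
    ("cmp", [("tmp", None), ("const", 20), ("const", 1), ("const", 0)]),
]
result = run(program, [2, 3, 4, 5])
```

Feeding the first result back through the temporary buffer is what lets a single arithmetic unit implement a small chain of operations without occupying a second processing element.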

The output unit 404 acquires, through connection 408, the setting that indicates the output destination of the processed data. This setting specifies the output port through which the unit communicates with a switching element. Based on that information, the unit outputs to a switching element through connections 204b-ne, 204b-se, 204b-sw, and 204b-nw. Connection 204b-ne connects to the switching element located to the northeast, 204b-se to the one to the southeast, 204b-sw to the one to the southwest, and 204b-nw to the one to the northwest.

Next, the operation of the configuration unit 401 is described. The configuration unit 401 holds an ID that is unique to each processing element 202. The configuration unit acquires the settings sent over the input side of connection 205, processes them internally, and outputs settings over the output side of connection 205. The configuration unit 401 has a configuration memory 413 for storing the settings that correspond to its own ID.

FIG. 5 shows the configuration command 501 that the configuration unit sends and receives to carry settings. The configuration command 501 consists of a read/write mode 502, an ID 503, a configuration address 504, and a setting value 505. The read/write mode 502 is a signal that selects whether the configuration command performs a read or a write. The ID 503 is a signal that selects the target processing element 202. The configuration address 504 is a signal that specifies the address, within the memory of the configuration unit 401, at which the setting is held. The setting value 505 is a signal that carries the actual setting value. The bit widths M, N, O, and P shown in FIG. 5 are determined by the architecture that is actually implemented.
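The four-field command format can be sketched as a packed word. The bit widths below (2, 8, 32, 32) and the field order within the word are assumptions chosen for illustration, since the source leaves M, N, O, and P to the implemented architecture; the class and field names are also hypothetical.

```python
from dataclasses import dataclass

# (field name, bit width), most significant field first. Widths are assumed.
FIELDS = (("mode", 2), ("target_id", 8), ("address", 32), ("value", 32))

@dataclass(frozen=True)
class ConfigCommand:
    mode: int       # read/write mode 502
    target_id: int  # ID 503
    address: int    # configuration address 504
    value: int      # setting value 505

    def pack(self) -> int:
        """Concatenate the fields into one integer word."""
        word = 0
        for name, bits in FIELDS:
            field = getattr(self, name)
            if not 0 <= field < (1 << bits):
                raise ValueError(f"{name} does not fit in {bits} bits")
            word = (word << bits) | field
        return word

    @classmethod
    def unpack(cls, word: int) -> "ConfigCommand":
        """Split a packed word back into its fields."""
        values = {}
        for name, bits in reversed(FIELDS):
            values[name] = word & ((1 << bits) - 1)
            word >>= bits
        return cls(**values)

cmd = ConfigCommand(mode=2, target_id=5, address=0x0000_000C, value=1)
word = cmd.pack()
```

Two mode bits are assumed here because the processing flow of FIG. 7 distinguishes a read mode, a write mode, and a case that is neither.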

The settings based on the configuration described above are now explained more concretely. In FIG. 6, 601 denotes an address in the configuration memory 413 and corresponds to the address specified by the configuration address 504 in FIG. 5. 602 denotes the actual setting value and corresponds to the setting value 505 in FIG. 5. In this embodiment each of these entries is called a setting, but the present invention is not limited to this unit. The figure shows an example in which the settings for the input unit 402, the computational unit 403, and the output unit 404 are held in the configuration memory 413.

The setting value at address 0x0000_0000 ("0x" denotes hexadecimal) determines the input source of the input unit 402; the input source is selected according to the value. The iteration number at address 0x0000_0004 is the setting that determines how many times the computational unit 403 repeats its operation; the repetition count is selected according to the value. This embodiment assumes up to four operations.

The operation setting at address 0x0000_0008 determines the type of operation performed in the first operation; the value selects either the multiply-accumulate operation or the comparison operation. The variable setting at address 0x0000_000c determines where the value of variable a is taken from in the first operation. The possible sources are an input value from an input port, a fixed value held in the configuration memory 413, and the value in the temporary buffer, which holds a previous calculation result. One of these values is supplied to variable a according to this setting. In the same way, the variable settings at 0x0000_0010, 0x0000_0014, and 0x0000_0018 determine the sources of variables b, c, and d, respectively, in the first operation. The parameter at 0x0000_001c is the fixed value for variable a, used when the source specified at 0x0000_000c for the first operation is a fixed value. Likewise, the fixed values at 0x0000_0020, 0x0000_0024, and 0x0000_0028 are the values used for variables b, c, and d, respectively, in the first operation.

The addresses that follow, 0x0000_002c through 0x0000_0094, hold the setting values for the second, third, and fourth operations, laid out in the same way as the first-operation settings at 0x0000_0008 through 0x0000_0028. Finally, the output select value at 0x0000_0098 is the setting value that determines the output destination of the output unit 404; the output destination is selected according to the value.
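The address layout described above is regular enough to compute. The sketch below assumes, consistently with the listed addresses, a 4-byte stride and nine words per operation (one operation setting, four variable settings, four parameters); the constant and function names are hypothetical.

```python
WORD = 4                  # address stride between consecutive settings
INPUT_SELECT = 0x00       # input source of the input unit
ITERATION_NUMBER = 0x04   # repetition count of the computational unit
FIRST_OPERATION = 0x08    # start of the first operation's settings
WORDS_PER_OPERATION = 9   # 1 operation setting + 4 variable settings + 4 parameters
OUTPUT_SELECT = 0x98      # output destination of the output unit

def operation_base(n):
    """Address of the operation setting for the n-th operation (n = 1..4)."""
    if not 1 <= n <= 4:
        raise ValueError("this embodiment assumes up to four operations")
    return FIRST_OPERATION + (n - 1) * WORDS_PER_OPERATION * WORD

def variable_addr(n, var):
    """Address of the variable setting (source select) for a, b, c or d."""
    return operation_base(n) + WORD * (1 + "abcd".index(var))

def parameter_addr(n, var):
    """Address of the fixed-value parameter for a, b, c or d."""
    return operation_base(n) + WORD * (5 + "abcd".index(var))

# Print the full map for a quick visual check against the description.
for n in range(1, 5):
    print(f"operation {n}: setting at {operation_base(n):#06x}, "
          f"last parameter at {parameter_addr(n, 'd'):#06x}")
```

The last parameter of the fourth operation lands at 0x94, immediately before the output select value at 0x98, which matches the range given in the text.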

Next, FIG. 7 shows the processing flow performed in the configuration unit. In step S701, a configuration command 501 is input. In step S702, it is determined whether the ID 503 specified in the input configuration command 501 matches the configuration unit 401's own ID. If it does not match, the input configuration command 501 is output unchanged, without any processing, in steps S711 and S712. If it matches, step S703 determines whether the value of the read/write mode 502 indicates the read mode. If not, step S707 determines whether it indicates the write mode. If it indicates neither mode, the command is output unchanged in steps S711 and S712. In the read mode, the data at the address specified by the configuration address 504 is read from the configuration memory 413 in step S704. The read data is then written into the setting value 505 field of the input configuration command 501 in step S705, and the configuration command 501 is output in step S706. In the write mode, the setting value 505 of the input configuration command 501 is written in step S708 to the location in the configuration memory 413 specified by the configuration address 504. In step S709 the input configuration command 501 is left unchanged, and in step S710 it is output as is. One configuration command changes one setting value 505, so a desired process is realized by sending configuration commands one after another until all necessary setting values have been changed. In other words, the number of settings to be changed determines the time needed to switch the processing content.
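The flow of FIG. 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names, the numeric encodings of the read and write modes, and the value returned when an unwritten address is read are all hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, replace
from typing import Dict

READ, WRITE = 0, 1  # hypothetical encodings of the read/write mode 502


@dataclass(frozen=True)
class ConfigCommand:
    mode: int      # read/write mode 502
    unit_id: int   # ID 503
    address: int   # configuration address 504
    value: int     # setting value 505


class ConfigurationUnit:
    def __init__(self, own_id: int) -> None:
        self.own_id = own_id
        self.memory: Dict[int, int] = {}  # stands in for configuration memory 413

    def process(self, cmd: ConfigCommand) -> ConfigCommand:
        if cmd.unit_id != self.own_id:
            return cmd                    # S702 -> S711/S712: pass through untouched
        if cmd.mode == READ:
            # S704-S706: read the addressed setting and return it in the command.
            # Returning 0 for an unwritten address is an assumption.
            return replace(cmd, value=self.memory.get(cmd.address, 0))
        if cmd.mode == WRITE:
            self.memory[cmd.address] = cmd.value  # S708: store the setting value
            return cmd                            # S709/S710: output unchanged
        return cmd                        # neither mode: pass through (S711/S712)
```

Because each command carries exactly one setting value, reconfiguring a unit means calling `process` once per changed setting, which is why the number of changed settings governs the switching time.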

FIG. 8 shows the configuration of the switching element 201. The switching element 201 consists of a configuration unit 801 and a crossbar switch 802. The configuration unit 801 manages the setting values that determine the connection destinations to which data is transferred. The crossbar switch 802 connects inputs to outputs one to one according to the settings of the configuration unit 801. Like the configuration unit 401 of the processing element 202, the configuration unit 801 holds an ID that is unique to each switching element 201. The configuration unit 801 receives setting values sent over the input-side connection 205, processes them internally, and outputs setting values over the output-side connection 205. It holds the acquired settings in the configuration memory 804. The configuration command and processing flow of the configuration unit 801 are the same as the format shown in FIG. 5 and the processing flow shown in FIG. 7. The crossbar switch 802 obtains the settings that determine the data input/output destinations from the configuration unit 801 through the connection 803. Based on the acquired setting values, the switching element 201 receives data through the connections 203a-w, 203a-s, 203b-e, and 203b-n and the connections 204b-ne, 204b-se, 204b-sw, and 204b-nw. The received data is passed on through the connected connections 203a-e, 203a-n, 203b-w, and 203b-s and the connections 204a-ne, 204a-se, 204a-sw, and 204a-nw.

Here, the connections 203a-w and 203a-s are connected to the switching elements located to the west and south, respectively. The connections 203b-e and 203b-n are connected to the switching elements located to the east and north, respectively.

The connections 203a-e and 203a-n are connected to the switching elements located to the east and north, respectively. The connections 203b-w and 203b-s are connected to the switching elements located to the west and south, respectively.

The connections 204a-ne, 204a-se, 204a-sw, and 204a-nw are connected to the switching elements located to the northeast, southeast, southwest, and northwest, respectively. Likewise, the connections 204b-ne, 204b-se, 204b-sw, and 204b-nw are connected to the switching elements located to the northeast, southeast, southwest, and northwest, respectively.

FIG. 9 shows an example of the settings, held in the configuration memory 804, for the input/output connections of the crossbar switch 802 in the switching element 201. Reference numeral 901 denotes the memory address, which corresponds to the address specified by the configuration address 504 in FIG. 5. Reference numeral 902 denotes the setting value, which corresponds to the setting value 505 in FIG. 5. The connection setting at address 0x0000_0000 is the setting value that determines to which of the connections 203a-e, 203a-n, 203b-w, 203b-s, 204a-ne, 204a-se, 204a-sw, and 204a-nw the input from the connection 203a-w is output. The following addresses hold, in the same way, the setting value that determines the output connection for one input each: 0x0000_0004 for the input from the connection 203a-s, 0x0000_0008 for 203b-e, 0x0000_000c for 203b-n, 0x0000_0010 for 204b-ne, 0x0000_0014 for 204b-se, 0x0000_0018 for 204b-sw, and 0x0000_001c for 204b-nw.
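The address layout of FIG. 9 described above can be sketched as a small lookup. The one-word-per-input layout at 4-byte strides follows the text; encoding each setting value as an index into the list of output connections is purely an assumption made for illustration, since the text does not state how the value is encoded.

```python
# Input connections in the address order given for FIG. 9.
INPUTS = ["203a-w", "203a-s", "203b-e", "203b-n",
          "204b-ne", "204b-se", "204b-sw", "204b-nw"]

# Candidate output connections. Their order, and using the setting value
# as an index into this list, are hypothetical choices.
OUTPUTS = ["203a-e", "203a-n", "203b-w", "203b-s",
           "204a-ne", "204a-se", "204a-sw", "204a-nw"]


def connection_address(input_line: str) -> int:
    """Configuration address holding the setting for the given input."""
    return 4 * INPUTS.index(input_line)


def route(memory: dict, input_line: str) -> str:
    """Output connection selected for an input by the stored setting value."""
    return OUTPUTS[memory[connection_address(input_line)]]
```

For instance, with `memory = {0x0: 4}` the input from 203a-w would be forwarded to 204a-ne under this assumed encoding.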

FIG. 10 shows an example time chart in which the configuration of the reconfigurable device described above is changed several times to realize a desired process. The time chart shows a plurality of different data flows A to Z being executed in order on the same reconfigurable device. In this embodiment, these data flows and their execution order are supplied through a data flow input.
A data flow as handled in this embodiment is a data flow made up of a unit of processing that can be assigned to the reconfigurable device at one time. The settings for processing each data flow are generated in advance. The device is reconfigured according to the pre-generated settings and then performs processing under that configuration, and this sequence is repeated in the desired execution order.

In the following description, data flow A is assumed to have its processing already assigned, and data flow B is the target of process assignment. Specifically, the process assignment of data flow B is determined by referring to the setting values for executing the already assigned data flow A. Once the process assignment of data flow B has been determined in this way, data flow B is treated as already assigned and data flow C becomes the assignment target. As with data flow B, the process assignment of data flow C is determined by referring to the setting values for executing the already assigned data flow B. Repeating this procedure in order makes it possible to assign the processing of all data flows from A to Z.

Next, the assignment of a data flow's processing to the processing elements of the reconfigurable device will be described. Here, assigning a data flow to processing elements means logically assigning each process of the data flow to a processing element. More specifically, as shown in FIG. 11, it means deciding in which processing element, and in what order, the processing corresponding to each node of the data flow is performed. The left part of FIG. 11 shows a data flow A, the center part shows an example process assignment, and the right part shows the settings of FIG. 6 derived from that assignment. As described above, the functions of the processing element in this embodiment are assumed to comprise the number of repetitions, the processing content of each repetition, and the fixed values required for it, and it is these settings that are actually determined. In this embodiment the maximum number of repetitions is assumed to be four. As shown in the center part, the data flow is divided into the groups 1101a to 1104a, and each group is assigned, in order, to a different processing element 202. Based on 1101a to 1104a, the settings 1101b to 1104b for the processing elements 202-1 to 202-4, respectively, are determined.

This embodiment is a process assignment method whose purpose is to reduce the number of the settings shown in FIG. 11 that must be changed when the processing content (data flow) is switched on the same reconfigurable device as shown in FIG. 10. FIG. 12 outlines the setting changes of the processing elements 202-1 to 202-4 when the processing in the reconfigurable device changes from data flow A to data flow B. The processing of the data flow 1204 corresponds to the processing 1201 in the time chart and is assumed to be already assigned. The processing of the data flow 1205 corresponds to the processing 1203 in the time chart and is the data flow to be assigned. The processing elements 202-1 to 202-4 to which the data flow 1204 and the data flow 1205 are assigned are logically the same processing elements. 1101b to 1104b are the settings of the processing elements 202-1 to 202-4 when processing data flow A, and 1206 to 1209 are the settings of the processing elements 202-1 to 202-4 when processing data flow B. During the setting change period 1202 from data flow A to data flow B in the time chart, the processing element 202-1 undergoes the setting change 1210 from setting 1101b to setting 1206, the processing element 202-2 undergoes the setting change 1211 from setting 1102b to setting 1207, the processing element 202-3 undergoes the setting change 1212 from setting 1103b to setting 1208, and the processing element 202-4 undergoes the setting change 1213 from setting 1104b to setting 1209. In this embodiment, when assigning the processing of data flow B, the already assigned data flow A in each of the processing elements 202-1 to 202-4 is referred to first. The processing of data flow B is then assigned so as to reduce the number of settings that must be changed in the setting changes 1210, 1211, 1212, and 1213.

FIG. 17 is a block diagram of an apparatus for generating the circuit configuration information 106 used in the transition from data flow A to data flow B. In FIG. 17, reference numeral 2501 denotes a CPU that controls the entire apparatus. Reference numeral 2502 denotes a ROM that stores a boot program and a BIOS. Reference numeral 2503 denotes a RAM that is used as the work area of the CPU 2501 and holds the OS (operating system) and applications. Reference numeral 2504 denotes a hard disk drive (HDD) that stores the OS, the application for creating the circuit configuration information 106, and various data. Reference numeral 2505 denotes a keyboard and 2506 a mouse, which function as the user interface. Reference numeral 2507 denotes a display control unit containing a video memory and a display controller, and 2508 denotes a display device that receives and displays the video signal from the display control unit 2507. Reference numeral 2509 denotes an interface for communicating with various external devices; for example, when the external memory 101 shown in FIG. 1 is connected, the circuit configuration information 106 created by this apparatus is written into the external memory 101.

In the above configuration, when the apparatus is powered on, the CPU 2501 executes the boot program stored in the ROM 2502 and loads the OS stored in the HDD 2504 into the RAM 2503. It then launches the application that creates the circuit configuration information 106, whereby the apparatus functions as a circuit configuration information creating apparatus.

The processing procedure of this apparatus when functioning as the circuit configuration information creating apparatus is described below with reference to the flowchart of FIG. 13. The procedure is an example of a process assignment method based on simulated annealing, but the present invention is not limited to that technique; other approximate solution methods such as genetic algorithms, or numerical optimization methods, may be used.

First, the elements needed to explain the flowchart are introduced with reference to FIG. 12. Let i be the index of a data flow, j the index of a processing element, and k the configuration address of the memory held in the processing element, corresponding to the address 601 in FIG. 6. Each setting value in memory can then be written as u_{i,j,k}. Let i0 denote the data flow that has already been assigned and i1 the data flow to be assigned. First, in step S1301, a plurality of data flows and their execution order (the order of i) are input. For data flows that have already been assigned, the setting values u_{i0,j,k} are input as well. In this embodiment the data flow 1204 has already been assigned, so the setting values in each memory of each processing element for the data flow 1204 are given as fixed values. The data flow 1205 is the assignment target.

Next, in step S1302, the required specifications and the hardware constraints are input. The hardware constraints are limits imposed by the hardware configuration, such as the number of processing elements in the reconfigurable device, the number of repetitions a processing element can handle, and the types of arithmetic units. The required specifications are items that restrict how the hardware is used, such as limits on the number of processing elements used and on the number of repetitions, and the types of arithmetic units that may be used. Whether the input/output ordering of the processing is free of contradictions and whether there are deadlocks also belong to these constraints. A further constraint is that the process assignment of a data flow that has already been assigned is not changed. The present invention is not limited to the constraints listed here.

Subsequently, in step S1303, the processing of the data flow to be assigned is assigned. For the initial assignment, the processing may be assigned randomly or in order of depth in the data flow, but the method is not limited to these. If it is not the initial assignment, the assignment is changed according to simulated annealing, for example by randomly selecting two placements and exchanging them. In this embodiment, the initial assignment or the assignment change is performed on the data flow 1205, which is the assignment target. The assignment of already assigned data flows is not changed, in accordance with the constraints. In step S1304, it is determined whether the process assignment result satisfies the required specifications input in step S1302.

If the required specifications are satisfied, the penalty variable p_0 is set to 0; if they are violated, p_0 is set to the penalty value C_{p0}.
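The equation for this rule is not reproduced in the text; written out from the sentence above, it would take the following form (a reconstruction, using only the symbols p_0 and C_{p0} named in the description):

```latex
p_0 =
\begin{cases}
0      & \text{if the required specifications are satisfied} \\
C_{p0} & \text{if the required specifications are violated}
\end{cases}
```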

In this embodiment, C_{p0} is treated as a single constant applied to any violation, but it may instead be a variable that depends on the item violated. In step S1305, it is determined whether the process assignment result satisfies the hardware constraints input in step S1302. If the constraints are satisfied, the penalty variable P_1 is set to 0; if they are violated, P_1 is set to the penalty value C_{p1}.
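The corresponding equation is likewise missing from the text; reconstructed from the sentence above, with the symbols P_1 and C_{p1} as named in the description, it would read:

```latex
P_1 =
\begin{cases}
0      & \text{if the hardware constraints are satisfied} \\
C_{p1} & \text{if the hardware constraints are violated}
\end{cases}
```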

In this embodiment, C_{p1} is treated as a single constant applied to any violation, but it may instead be a variable that depends on the item violated. Next, in step S1306, the number of setting changes between the data flows concerned is calculated and the evaluation value is computed. In the example of FIG. 12, this is the number of settings that must be changed in the setting changes 1210, 1211, 1212, and 1213. More specifically, if the setting value u_{i0,j,k} of the already assigned data flow i0 and the setting value u_{i1,j,k} at the same address of the data flow i1 to be assigned do not match, α_1 is added to the number of setting changes. If the values match, nothing is added. This calculation is carried out for every memory address k of every processing element j, so the number of setting changes is the α_1-weighted count of mismatching settings over all j and k.
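The count described above can be written as a double sum. The formula below is a reconstruction from the prose, since the original equation is not reproduced in the text; the symbol N for the number of setting changes and the mismatch indicator δ are notation introduced here, not taken from the specification.

```latex
N_{i0 \to i1} = \sum_{j} \sum_{k} \alpha_1 \, \delta\!\left(u_{i0,j,k},\, u_{i1,j,k}\right),
\qquad
\delta(a, b) =
\begin{cases}
0 & \text{if } a = b \\
1 & \text{if } a \neq b
\end{cases}
```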

Here α_1 is normally 1, but the weight may be varied for each address at which a setting is stored, according to the structure of the processing element's configuration memory. The weight may also be varied per data flow in order to prioritize the switching times of particular data flows.
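A minimal Python sketch of the weighted count follows. It assumes settings are stored as a dictionary keyed by (processing element index j, configuration address k); that representation, the function names, and treating an address present in only one data flow as a mismatch are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# (processing element index j, configuration address k) -> setting value
Settings = Dict[Tuple[int, int], int]


def setting_changes(
    prev: Settings,
    nxt: Settings,
    weight: Callable[[int, int], float] = lambda j, k: 1.0,
) -> float:
    """Weighted number of settings that differ between two data flows.

    `weight(j, k)` plays the role of the per-address weight (normally 1).
    An address present in only one of the two flows counts as a mismatch,
    which is an assumption of this sketch.
    """
    keys = prev.keys() | nxt.keys()
    return sum(weight(j, k) for (j, k) in keys if prev.get((j, k)) != nxt.get((j, k)))
```

Passing a custom `weight` makes it possible to penalize changes to particular processing elements or addresses more heavily, in the spirit of the per-address weighting mentioned above.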

As shown in FIG. 7, one configuration command changes one setting value, so reducing this number reduces the time needed to switch the processing content. The evaluation value computed in this step is defined in terms of the number of setting changes given above.
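The defining equation for the evaluation value is not reproduced in the text. One form consistent with the surrounding description, in which the value grows with any penalty and with the number of setting changes, is a plain sum. The sum itself, the symbol E, the count symbol N, and the indicator δ are assumptions made here, not statements of the specification.

```latex
E = p_0 + P_1 + \sum_{j} \sum_{k} \alpha_1 \, \delta\!\left(u_{i0,j,k},\, u_{i1,j,k}\right),
\qquad
\delta(a, b) =
\begin{cases}
0 & \text{if } a = b \\
1 & \text{if } a \neq b
\end{cases}
```

With penalty values chosen much larger than any achievable change count, an assignment that violates a constraint would always evaluate worse than a feasible one under this form.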

In other words, provided the required specifications and hardware constraints are satisfied, the evaluation value becomes smaller as the number of setting changes needed for reconfiguration decreases. Finally, in step S1307, it is determined according to simulated annealing whether the target has been achieved, and if so the procedure ends. If the target has not been achieved, the procedure returns to step S1303 and steps S1303 to S1307 are repeated. The target is regarded as achieved when a sufficiently good result has been obtained or when the scheduled computation time has been reached.
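The loop of steps S1303 to S1307 can be sketched as a generic simulated annealing routine. Everything concrete below is an assumption for illustration: the cooling schedule, the fixed step budget standing in for the scheduled computation time, and the toy problem in which data flow B's four operations are reordered to match the fixed settings of data flow A.

```python
import math
import random


def anneal(initial, neighbor, evaluate, t0=5.0, cooling=0.99, steps=2000, seed=0):
    """Generic simulated annealing loop (all parameters are hypothetical)."""
    rng = random.Random(seed)
    current, current_cost = initial, evaluate(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):                    # S1307: stop after a fixed budget
        candidate = neighbor(current, rng)    # S1303: change the assignment
        cost = evaluate(candidate)            # S1304-S1306: compute evaluation value
        accept = (cost <= current_cost or
                  rng.random() < math.exp((current_cost - cost) / temperature))
        if accept:
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost


# Toy problem: flow A's settings are fixed; flow B's operations may be placed
# in any order, and each position whose value differs from A costs 1.
FLOW_A = [3, 1, 4, 1]


def evaluate(assignment):
    return sum(1 for a, b in zip(FLOW_A, assignment) if a != b)


def swap_two(assignment, rng):
    """Exchange two randomly chosen placements, as in the S1303 example."""
    i, j = rng.sample(range(len(assignment)), 2)
    out = list(assignment)
    out[i], out[j] = out[j], out[i]
    return out


best, best_cost = anneal([1, 4, 1, 3], swap_two, evaluate)
```

In a fuller version, `evaluate` would also add the penalty variables for violated specifications and hardware constraints; they are omitted here to keep the sketch short.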

As a result of the above, the circuit configuration information 106 is generated on the HDD 2504. It then only needs to be written, via the interface 2509, to the external memory 101 that will be used, which is mounted in the actual product.

The above embodiment showed an example in which the circuit configuration information 106 is created by an external apparatus (FIG. 17). The same applies to all embodiments described below. Alternatively, the configuration controller 102 may execute the processing of FIG. 13 in place of the external apparatus and create the circuit configuration information 106. For example, the external memory 101 may hold the settings for each of a plurality of data flows (the number of processors required and the processing parameters of each processor), and the configuration controller 102 may create the circuit configuration information 106 from those settings. This too applies to all embodiments described below.

Conventional methods of assigning a data flow's processing to the processing elements of a reconfigurable device do not take the number of setting changes into account, so all settings end up having to be changed whenever the data flow being processed changes. The present invention focuses on the processing order of the data flows and reduces the total number of changes at the level of individual settings, the smallest unit, which makes it possible to reduce the number of setting changes effectively.

Next, a second embodiment of the present invention will be described. FIG. 14 shows the processing time chart and an outline of the process assignment according to the second embodiment. This embodiment concerns process assignment for the case where a plurality of data flows are processed by changing the settings of the reconfigurable device and the processing content of each data flow is fixed, but the execution order of the data flows is not fixed and changes with the situation, the input data, and so on. Specifically, FIG. 14 assumes that the execution order of the processing performed on the reconfigurable device is not constant and varies with results, states, and the like, as shown in the time chart 1401.

In the time chart 1401, the processing of data flow A is performed in the period 1402, the processing of data flow C in the periods 1404 and 1408, and the processing of data flow B in the period 1406. The settings are changed from data flow A to data flow C in the period 1403, from data flow C to data flow B in the period 1405, and from data flow B to data flow C in the period 1407. In this embodiment the execution order of data flows A, B, and C is not constant, so the process assignment must take into account all setting changes between every pair of data flows. Data flows A, B, and C are all assignment targets.

Reference numerals 1409, 1410, and 1411 in FIG. 14 show example settings of the PEs 202-1 to 202-4 for data flows A, B, and C, respectively, and the processing of all of these data flows is assigned together in a single pass. As in the first embodiment, 1409, 1410, and 1411 are settings of the kind shown in FIG. 6. This embodiment focuses on the total number of setting changes required in the PEs 202-1 to 202-4 for the data flow changes between A and B, between B and C, and between C and A. By using this total as the evaluation value, the number of setting changes at reconfiguration of the reconfigurable device is reduced.

The setting changes of the PEs 202-1 to 202-4 between data flows A and B are denoted by 1412, 1415, 1418, and 1421, respectively. The setting changes of the PEs 202-1 to 202-4 between data flows B and C are denoted by 1413, 1416, 1419, and 1422, respectively. The setting changes of the PEs 202-1 to 202-4 between data flows C and A are denoted by 1414, 1417, 1420, and 1423, respectively.

This embodiment differs from the first embodiment in that the processing of a plurality of data flows is assigned at the same time. In step S1301 of FIG. 13, a plurality of data flows are input together with the information that their execution order is arbitrary.

In the process assignment of step S1303 in FIG. 13, processing is assigned for all of the data flows 1409, 1410, and 1411, which are the assignment targets. The number of setting changes used in step S1306 of FIG. 13 differs from the first embodiment as follows.

If the setting value u_{i0,j,k} of the data flow i0 and the setting value u_{i1,j,k} at the same address of the data flow i1 do not match, α_2 is added to the number of setting changes. If the setting value u_{i1,j,k} of the data flow i1 and the setting value u_{i2,j,k} at the same address of the data flow i2 do not match, β_2 is added. If the setting value u_{i2,j,k} of the data flow i2 and the setting value u_{i0,j,k} at the same address of the data flow i0 do not match, γ_2 is added. All three data flows are assignment targets. Where the setting values match, nothing is added. This calculation is carried out for every memory address k of every processing element j.
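Written as a formula, the three pairwise comparisons above give the following sum. This is a reconstruction from the prose, since the original equation is not reproduced in the text; the symbol N and the mismatch indicator δ are notation introduced here.

```latex
N = \sum_{j} \sum_{k} \Bigl[
      \alpha_2 \, \delta\!\left(u_{i0,j,k},\, u_{i1,j,k}\right)
    + \beta_2  \, \delta\!\left(u_{i1,j,k},\, u_{i2,j,k}\right)
    + \gamma_2 \, \delta\!\left(u_{i2,j,k},\, u_{i0,j,k}\right)
    \Bigr],
\qquad
\delta(a, b) =
\begin{cases}
0 & \text{if } a = b \\
1 & \text{if } a \neq b
\end{cases}
```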

Here i0 denotes data flow A 1409, i1 denotes data flow B 1410, and i2 denotes data flow C 1411, and the setting values u_{i0,j,k}, u_{i1,j,k}, and u_{i2,j,k} are determined so that the total number of setting changes computed in this way becomes small. The weights α_2, β_2, and γ_2 are normally each 1, but they may be varied for each address at which a setting is stored, according to the structure of the processing element's configuration memory. They may also be varied per data flow in order to prioritize the switching times of particular data flows. By taking all pairs of data flows into account, this embodiment reduces the number of setting changes on average even when the execution order of the processing is not fixed.
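The pairwise total for three data flows can be sketched as follows. The dictionary representation keyed by (processing element index, address), the function names, and the use of one scalar weight per pair of data flows (rather than per address) are simplifying assumptions of this sketch.

```python
def mismatches(a: dict, b: dict) -> int:
    """Number of (PE index, address) keys whose setting values differ."""
    return sum(1 for key in a.keys() | b.keys() if a.get(key) != b.get(key))


def total_changes(u0: dict, u1: dict, u2: dict,
                  alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Weighted sum of setting changes over the pairs A-B, B-C, and C-A."""
    return (alpha * mismatches(u0, u1)
            + beta * mismatches(u1, u2)
            + gamma * mismatches(u2, u0))
```

Raising one of the weights, for example `gamma`, biases the optimization toward assignments in which the corresponding switch (here between data flows C and A) needs fewer setting changes.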

Next, a third embodiment of the present invention will be described. FIG. 15 shows a time chart of the processing according to this embodiment and an outline of the process allocation. This embodiment assumes that the execution order of a plurality of data flows and the process allocation of each of them have already been determined. It concerns the process allocation needed to insert a process for a new data flow between the processes of any two data flows in that execution order, without changing the process allocations of the data flows before and after the insertion point.

Time chart 1501 in FIG. 15 is a time chart in which the execution order on the reconfigurable device and the corresponding process allocations have already been determined. In time chart 1501, after processing 1503 of data flow A, processing 1505 of data flow C is performed following a period 1504 in which the settings are changed from those of data flow A to those of data flow C. Time chart 1502 is obtained by newly inserting data flow B between data flow A and data flow C of time chart 1501. After processing 1503 of data flow A, processing 1507 of the newly inserted data flow B is performed following a period 1506 in which the settings are changed from those of data flow A to those of data flow B. Then, following a period 1508 in which the settings are changed from those of data flow B to those of data flow C, processing 1505 of data flow C is performed. Because the process allocation of data flow B is determined without changing the process allocations of data flows A and C, the amount of setting change relative to the already allocated data flows before and after the insertion point is taken into account.

Reference numerals 1509, 1510, and 1511 in FIG. 15 indicate the settings of PE 202-1 to PE 202-4 in data flows A, B, and C, respectively. As in the first embodiment, 1509, 1510, and 1511 are the settings shown in FIG. 6. In this embodiment, the process allocations of data flows A and C have already been determined, and process allocation is performed for data flow B.

In doing so, attention is paid to the total number of setting changes required to change the settings of PE 202-1 to PE 202-4 from data flow A to data flow B and from data flow B to data flow C. This total is calculated as the evaluation value indicated by 1308 in FIG. 13 of the first embodiment, which reduces the number of setting changes when the reconfigurable device is reconfigured.

The setting changes of PE 202-1 to PE 202-4 between data flows A and B are denoted by 1512, 1514, 1516, and 1518, respectively. The setting changes of PE 202-1 to PE 202-4 between data flows B and C are denoted by 1513, 1515, 1517, and 1519, respectively. This embodiment differs from the first embodiment in that, when processes are allocated for one data flow, a plurality of other, already allocated data flows are referred to simultaneously.

The process allocation performed in step S1303 of FIG. 13 targets data flow 1510. Data flows 1509 and 1511 have already been allocated, and their process allocations are not changed. The number of setting changes used in step S1306 of FIG. 13 differs from that of the first embodiment as follows.

If the setting value u_{i0,j,k} of the already allocated data flow i0 does not match the setting value u_{i1,j,k} at the same address in data flow i1, which is the process allocation target, α₃ is added to the number of setting changes. Likewise, if the setting value u_{i1,j,k} of data flow i1 does not match the setting value u_{i2,j,k} at the same address in the already allocated data flow i2, β₃ is added to the number of setting changes. Where the values match, nothing is added. This calculation is performed for all memories k of all processing elements j. The number of setting changes described above can be expressed by the following equation.
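The formula is not reproduced in this text. A reconstruction consistent with the description above might read as follows; E₃ and the mismatch indicator δ are symbols assumed here for illustration.

```latex
E_3 = \sum_{j}\sum_{k}\Bigl(
        \alpha_3\,\delta\bigl(u_{i0,j,k},\,u_{i1,j,k}\bigr)
      + \beta_3\,\delta\bigl(u_{i1,j,k},\,u_{i2,j,k}\bigr)
      \Bigr),
\qquad
\delta(a,b) =
\begin{cases}
  1 & \text{if } a \neq b,\\
  0 & \text{if } a = b.
\end{cases}
```

Unlike the second embodiment, there is no term between i2 and i0, because data flow B always runs between A and C in the fixed execution order.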

Here, i0 denotes data flow A 1509, i1 denotes data flow B 1510, and i2 denotes data flow C 1511. Of these setting values, u_{i0,j,k} and u_{i2,j,k} have already been allocated, so u_{i1,j,k} is determined so that the value given by the above equation becomes small. α₃ and β₃ are normally each 1, but the weighting can also be changed for each address at which a setting is stored, according to the structure of the configuration memory of the processing element. Weighting per data flow is also possible, in order to give priority to the switching time of particular data flows. When a new data flow is inserted, this reduces the number of setting changes relative to the data flows before and after the insertion point.
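One way to picture the insertion case is an exhaustive search over placements of data flow B while A and C stay fixed. The Python sketch below is only an illustration under simplifying assumptions: each operation of B is represented by a tuple of configuration words, any operation may be placed on any PE, and routing and other component constraints are ignored. The patent does not prescribe this search strategy, and all names and values are hypothetical.

```python
from itertools import permutations


def change_count(cfg_a, cfg_b, weight=1):
    """Weighted count of configuration words that differ between two data flows."""
    return sum(
        weight
        for pe_a, pe_b in zip(cfg_a, cfg_b)   # same processing element j
        for u_a, u_b in zip(pe_a, pe_b)       # same memory address k
        if u_a != u_b
    )


def best_insertion(prev_cfg, next_cfg, ops, alpha=1, beta=1):
    """Try every placement of `ops` onto the PEs and keep the cheapest one."""
    best = None
    for placement in permutations(ops):
        cost = (change_count(prev_cfg, placement, alpha)
                + change_count(placement, next_cfg, beta))
        if best is None or cost < best[0]:
            best = (cost, list(placement))
    return best


# Hypothetical settings of four PEs for the fixed data flows A and C.
cfg_a = [(1, 0), (2, 0), (3, 0), (4, 0)]
cfg_c = [(1, 1), (2, 0), (5, 0), (4, 0)]
# Operations of the inserted data flow B, in arbitrary order.
ops_b = [(4, 0), (2, 0), (1, 1), (3, 0)]

cost, placement = best_insertion(cfg_a, cfg_c, ops_b)
print(cost, placement)
```

Exhaustive enumeration grows factorially with the number of PEs, so a practical tool would presumably use a heuristic; the sketch only shows what is being minimized.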

Next, a fourth embodiment of the present invention will be described. FIG. 16 shows a time chart of the processing according to this embodiment and an outline of the process allocation. This embodiment concerns process allocation for the case in which, after a reference data flow has been processed, the data flow to be executed next differs depending on the result.

In time chart 1601 of FIG. 16, the reference data flow X is processed in period 1602, and then, depending on the result, the settings are changed from those of data flow X to those of data flow A, B, or C in period 1603. After the settings have been changed, data flow A, B, or C is processed in period 1604, and in period 1605 the settings are changed from those of data flow A, B, or C back to those of data flow X so that the reference data flow X can be processed again. This execution order is repeated, but which of data flows A, B, and C is executed depends on the result of data flow X.

Reference numerals 1606, 1607, 1608, and 1609 in FIG. 16 indicate the settings of PE 202-1 to PE 202-4 in data flows X, A, B, and C, respectively. As in the first embodiment, 1606, 1607, 1608, and 1609 are the settings shown in FIG. 6. In this embodiment, process allocation is performed for all of the data flows X, A, B, and C.

In the process allocation, attention is paid to the total number of setting changes required in PE 202-1 to PE 202-4 to switch between data flows X and A, between X and B, and between X and C. This total is calculated as the evaluation value indicated by 1304 in FIG. 13 of the first embodiment, which reduces the number of setting changes when the reconfigurable device is reconfigured. The setting changes of PE 202-1 to PE 202-4 between data flows X and A are denoted by 1610, 1613, 1616, and 1619, respectively. The setting changes of PE 202-1 to PE 202-4 between data flows X and B are denoted by 1611, 1614, 1617, and 1620, respectively. The setting changes of PE 202-1 to PE 202-4 between data flows X and C are denoted by 1612, 1615, 1618, and 1621, respectively. This embodiment differs from the first embodiment in that the execution order contains a branch, and process allocation is performed for the data flows at the branch source and the branch destinations.

In step S1301 of FIG. 13, a plurality of data flows are input together with an execution order that is partially arbitrary. The process allocation in step S1303 of FIG. 13 is performed for all of the target data flows 1606, 1607, 1608, and 1609. This embodiment shows an example in which all data flows are treated as process allocation targets, but it is not limited to this. The embodiment focuses on the execution order of the data flows; if at least one data flow has already been allocated, process allocation is performed for the remaining data flows.

The number of setting changes used in step S1306 of FIG. 13 differs from that of the first embodiment as follows. If the setting value u_{i0,j,k} of data flow i0, which is a process allocation target, does not match the setting value u_{i1,j,k} at the same address in data flow i1, which is also a process allocation target, α₄ is added to the number of setting changes. Likewise, if the setting value u_{i0,j,k} of data flow i0 does not match the setting value u_{i2,j,k} at the same address in data flow i2, β₄ is added to the number of setting changes. Further, if the setting value u_{i0,j,k} of data flow i0 does not match the setting value u_{i3,j,k} at the same address in data flow i3, γ₄ is added to the number of setting changes. Where the setting values match, nothing is added. This calculation is performed for all memories k of all processing elements j.
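The corresponding formula is not reproduced in this text. A reconstruction consistent with the description above might read as follows; E₄ and the mismatch indicator δ are symbols assumed here for illustration.

```latex
E_4 = \sum_{j}\sum_{k}\Bigl(
        \alpha_4\,\delta\bigl(u_{i0,j,k},\,u_{i1,j,k}\bigr)
      + \beta_4\,\delta\bigl(u_{i0,j,k},\,u_{i2,j,k}\bigr)
      + \gamma_4\,\delta\bigl(u_{i0,j,k},\,u_{i3,j,k}\bigr)
      \Bigr),
\qquad
\delta(a,b) =
\begin{cases}
  1 & \text{if } a \neq b,\\
  0 & \text{if } a = b.
\end{cases}
```

Every term compares a branch destination with the branch source i0; the destinations are never compared with one another, because the device always returns to the branch source between them.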

Here, i0 denotes data flow X 1606, i1 denotes data flow A 1607, i2 denotes data flow B 1608, and i3 denotes data flow C 1609. The setting values u_{i0,j,k}, u_{i1,j,k}, u_{i2,j,k}, and u_{i3,j,k} are determined so that the value given by the above equation becomes small. α₄, β₄, and γ₄ are normally each 1, but the weighting can also be changed for each address at which a setting is stored, according to the structure of the configuration memory of the processing element. Weighting per data flow is also possible, in order to give priority to the switching time of particular data flows.
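The branching evaluation can be sketched in Python as a sum of mismatch counts between the branch source X and each branch destination. The data layout, the function names, and the example values are assumptions made for illustration, and the per-flow weight stands in for α₄, β₄, and γ₄.

```python
def change_count(cfg_a, cfg_b, weight=1):
    """Weighted count of configuration words that differ between two data flows."""
    return sum(
        weight
        for pe_a, pe_b in zip(cfg_a, cfg_b)   # same processing element j
        for u_a, u_b in zip(pe_a, pe_b)       # same memory address k
        if u_a != u_b
    )


def branch_cost(source_cfg, branch_cfgs, weights=None):
    """Sum the changes between the branch source and every branch destination."""
    weights = weights or {}
    return sum(
        change_count(source_cfg, cfg, weights.get(name, 1))
        for name, cfg in branch_cfgs.items()
    )


# Hypothetical settings: two PEs, two configuration words each.
cfg_x = [(1, 0), (2, 0)]
branches = {
    "A": [(1, 0), (2, 1)],
    "B": [(1, 0), (2, 0)],
    "C": [(3, 1), (2, 0)],
}
print(branch_cost(cfg_x, branches))
```

Raising the weight of one destination, for example `{"C": 2}`, makes the search favor allocations that keep that particular switch cheap, which corresponds to prioritizing the switching time of a specific data flow.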

According to the present embodiment, when a plurality of data flows are executed in sequence, the number of setting changes can be reduced even if the execution order contains a branch, by taking into account the data flow at the branch source and the plurality of other data flows at the branch destinations.

In the embodiments described above, a route setting method was described for each use case, but the present invention may also combine these methods. The processing element was described as the component of the reconfigurable device, but the component is not limited to this and may be an LUT or a combination of LUTs and processing elements. The settings are not limited to those shown in the embodiments and may be settings used in an LUT-based reconfigurable device. In the embodiments, all of the input data flows were treated as process allocation targets, but process allocation may be performed for only part of a data flow by specifying a process allocation range. In the embodiments, the number of processing elements is the same for all data flows, but the number of processing elements to which processes are allocated may differ between data flows.

The present invention can also be realized by the following processing: software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or any of various storage media, and a computer (or a CPU, an MPU, or the like) of that system or apparatus reads and executes the program.

Claims

A process allocation method for allocating processes to the components of a reconfigurable device composed of a plurality of components, the method comprising:
a data flow input step of inputting at least two different data flows and an execution order of the data flows;
a constraint step of inputting constraints of the components; and
a process allocation determination step of determining a process allocation, based on the constraints of the components and the execution order, so that the number of setting changes required to reconfigure the components is reduced.

The process allocation method according to claim 1, wherein at least one of the input data flows has already been allocated processes, and processes are allocated to a data flow that has not been allocated processes by referring to the data flow that has already been allocated processes.

The process allocation method according to claim 1, wherein process allocation is performed for a plurality of data flows, among the data flows, that have not been allocated processes.

The process allocation method according to claim 1, wherein the number of setting changes is weighted for each setting that determines the processing content of each component, or for each component.

A program for causing a computer to execute each step of the process allocation method according to any one of claims 1 to 4.

A reconfiguration device that allocates processes to the components of a reconfigurable device composed of a plurality of components, the reconfiguration device comprising:
data flow input means for inputting at least two different data flows and an execution order of the data flows;
constraint means for inputting constraints of the components; and
process allocation determination means for determining a process allocation, based on the constraints of the components and the execution order, so that the number of setting changes required to reconfigure the components is reduced.