JP2006040254A

JP2006040254A - Reconfigurable circuit and processor

Info

Publication number: JP2006040254A
Application number: JP2005130462A
Authority: JP
Inventors: Hiroshi Nakajima (中島 洋); Makoto Kosone (小曽根 真); Kazuhisa Iizuka (飯塚 和久)
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2004-06-21
Filing date: 2005-04-27
Publication date: 2006-02-09
Anticipated expiration: 2025-04-27
Also published as: JP4484756B2

Abstract

PROBLEM TO BE SOLVED: To provide an integrated circuit device having a reconfigurable circuit whose function can be changed.
SOLUTION: The integrated circuit device 26 includes a reconfigurable circuit 12 on which a plurality of threads are executed simultaneously. RAMs provided in a memory unit 27 are assigned to the threads executed on the reconfigurable circuit 12. A first switching unit in a first switching circuit 23 is provided for each RAM; it selects the output from the reconfigurable circuit 12 according to the thread and supplies it to that RAM. When data is transferred between threads, the assignment of threads to the RAMs is changed after all thread processing has completed.
COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a reconfigurable circuit whose function can be changed, and to a processing apparatus including the reconfigurable circuit.

In recent years, reconfigurable processors whose hardware operation can be changed to suit the application have been under development. Architectures for realizing a reconfigurable processor include approaches based on a DSP (Digital Signal Processor) and on an FPGA (Field Programmable Gate Array).

With an FPGA, the circuit configuration can be designed relatively freely by writing circuit data after the LSI is manufactured, and FPGAs are used for designing dedicated hardware. An FPGA includes basic cells, each consisting of a lookup table (LUT) that stores the truth table of a logic circuit and an output flip-flop, together with programmable wiring resources that connect the basic cells. In an FPGA, the target logical operation is realized by writing the data to be stored in the LUTs and the wiring data.
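The LUT mechanism described above can be illustrated with a minimal Python sketch. The names `make_lut` and `lut_eval` and the two-input width are assumptions chosen for illustration; they do not come from the patent. The point is only that the "circuit data" is a truth table, and evaluating the logic amounts to a table lookup addressed by the input bits.

```python
from itertools import product

def make_lut(func, n_inputs):
    """Build the truth table (LUT contents) of an n-input Boolean function.

    Entries are ordered by input address, first input as the most significant bit.
    """
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def lut_eval(lut, *bits):
    """Evaluate the LUT: the input bits form the table address (MSB first)."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | (b & 1)
    return lut[addr]

# Writing different table contents yields a different logic function
# on the same (hypothetical) basic cell.
xor_lut = make_lut(lambda a, b: a ^ b, 2)
and_lut = make_lut(lambda a, b: a & b, 2)
```

Reprogramming the cell here means nothing more than replacing the list contents, which mirrors why configuring many LUT-based cells requires transferring a large amount of configuration data.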

However, an LSI designed with an FPGA has a much larger mounting area and a higher cost than one designed as an ASIC (Application Specific IC). A method has therefore been proposed that reuses the circuit configuration by dynamically reconfiguring the FPGA (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent Laid-Open No. 10-256383

In satellite broadcasting, for example, the broadcast mode may be switched depending on the season or other factors in order to adjust image quality. A receiver has a separate circuit for each broadcast mode built into hardware in advance, and a selector switches among these circuits to match the broadcast mode.

The receiver's circuits for the other broadcast modes are therefore idle during that time. When several dedicated circuits are used by switching among them, as in mode switching, and the switching interval is relatively long, the LSI can instead be reconfigured instantaneously at the time of switching rather than building in all of the dedicated circuits. This simplifies the circuit structure, improves versatility, and at the same time reduces mounting cost.

To meet such needs, dynamically reconfigurable LSIs are attracting attention in the manufacturing industry. In particular, LSIs mounted on mobile terminals such as cellular phones and PDAs (Personal Data Assistance) must be small. If an LSI can be dynamically reconfigured so that its functions are switched as the application requires, the mounting area of the LSI can be reduced.

An FPGA offers a high degree of freedom in circuit design and is general-purpose. However, to allow connections between all basic cells, it must include a large number of switches and a control circuit that turns those switches on and off, which inevitably increases the mounting area of the control circuit. In addition, because the connections between basic cells follow complicated wiring patterns, the wiring tends to be long, and because many switches are connected to each wire, the delay becomes large.

For this reason, FPGA-based LSIs are often used only for prototyping and experiments and, in view of mounting efficiency, performance, and cost, are not suitable for mass production. Furthermore, because configuration information must be sent to a large number of LUT-based basic cells, configuring the circuit takes considerable time. An FPGA is therefore not suitable for applications that require the circuit configuration to be switched instantaneously.

To solve these problems, ALU arrays have been studied in recent years. An ALU array arranges, in multiple stages, multifunction elements called ALUs (Arithmetic Logic Units), each of which has a plurality of basic arithmetic functions. In an ALU array, processing flows in one direction from the upper stages to the lower stages, so wiring that connects ALUs in the horizontal direction is basically unnecessary. The circuit scale can therefore be made smaller than that of an FPGA.

In an ALU array, circuit configuration is carried out by feeding the output of the ALUs back to the input of the ALUs. Performing this configuration at high speed raises the processing speed of the ALU array. In particular, when there are a plurality of threads to be executed, it is desirable to process them efficiently.

The present invention has been made in view of these circumstances, and its object is to provide a technique for executing a plurality of threads efficiently.

To solve the above problem, one aspect of the present invention relates to a reconfigurable circuit that has a plurality of logic circuits, each capable of selectively executing a plurality of arithmetic functions, and that can execute a plurality of threads simultaneously.

In the reconfigurable circuit of this aspect, a storage unit is provided between a preceding-stage logic circuit and a subsequent-stage logic circuit. During execution of the plurality of threads, the storage unit stores the data output from the preceding-stage logic circuit at a first timing. At a second timing following the first timing, it supplies the stored data to the subsequent-stage logic circuit, which executes the same thread that the preceding-stage logic circuit was executing at the first timing.

The storage unit may be configured with a data flip-flop circuit or the like. When the configuration of the reconfigurable circuit is switched in one clock, the second timing may be a timing delayed by one clock from the first timing.

The terms "preceding stage" and "subsequent stage" refer to the direction of processing. It suffices that the output of the preceding-stage logic circuit is processed as the input of the subsequent-stage logic circuit; the terms do not imply a physical positional relationship.

The reconfigurable circuit may have a multi-stage connection structure of logic circuits; even in that case, "preceding stage" and "subsequent stage" refer to the direction of processing. Providing a storage unit between the preceding-stage and subsequent-stage logic circuits allows the two stages to process different threads at the same time, so a reconfigurable circuit capable of executing a plurality of threads can be configured.

The first storage unit may include a plurality of storage means. Each storage means is assigned one of two threads corresponding to that storage means, and the assignment may be switched to the other thread every predetermined cycle.

The first storage unit may include at least one pair of storage means. Different threads are assigned to the two storage means of the pair, and the thread assignments of the pair may be swapped with each other every predetermined cycle.

The first storage unit may include an information storage unit having a plurality of storage means, and any of the storage means in the information storage unit may store the output of a given thread from the reconfigurable circuit.

Any of the storage means in the information storage unit may have only a specific thread fixedly assigned to it.

Two or more of the storage means in the information storage unit may be allocated to the same address space.

Whether any of the storage means in the information storage unit stores the output of a given thread from the reconfigurable circuit may be determined by a specific address range.

Another aspect of the present invention relates to a processing apparatus including a reconfigurable circuit that has a plurality of logic circuits, each capable of selectively executing a plurality of arithmetic functions, and that can execute a plurality of threads simultaneously, and a first storage unit that stores the output from the reconfigurable circuit.

In the processing apparatus of this aspect, the first storage unit is assigned to a thread executed on the reconfigurable circuit. The first storage unit may be configured with storage means such as a RAM. It may have a plurality of storage means such as RAMs, or a single storage means such as one RAM may be divided into a plurality of storage areas; in the latter case, the storage areas are preferably accessible simultaneously. When data is passed between threads, the assignment of threads to the first storage unit is preferably changed at a predetermined timing. Changing the thread assignment makes efficient data transfer between threads possible.

The logic circuit may be an arithmetic logic circuit capable of selectively executing a plurality of types of multi-bit operations.

Any combination of the above components, and any conversion of the expression of the present invention among a method, an apparatus, a system, and a computer program, is also effective as an aspect of the present invention.

According to the present invention, a technique for executing a plurality of threads efficiently can be provided.

FIG. 1 is a configuration diagram of a processing apparatus 10 according to an embodiment of the present invention. The processing apparatus 10 includes an integrated circuit device 26, a compiling unit 30, a setting data generation unit 32, and a storage unit 34. The integrated circuit device 26 has a function that allows its circuit configuration to be reconfigured.

The integrated circuit device 26 is configured as a single chip and includes a reconfigurable circuit 12, a setting unit 14, a control unit 18, an internal state holding circuit 20, an output circuit 22, a first switching circuit 23, a second switching circuit 25, a memory unit 27, a third switching circuit 28, and path units 24 and 29.

The reconfigurable circuit 12 has a pipeline configuration, and its function can be changed by changing the settings of its logic circuits. The reconfigurable circuit 12 of this embodiment can execute a plurality of threads simultaneously. A thread is a process that the reconfigurable circuit 12 is made to execute, and the processing of each thread is self-contained. The threads are executed independently of one another, although data may be passed between them. The processing apparatus 10 can realize the configurations of a plurality of types of circuits on the reconfigurable circuit 12 at the same time.

The setting unit 14 has a first circuit setting unit 14a, a second circuit setting unit 14b, a third circuit setting unit 14c, and a circuit processing control unit 16, and supplies the reconfigurable circuit 12 with setting data 40 for configuring the intended circuit. Specifically, the first circuit setting unit 14a, the second circuit setting unit 14b, and the third circuit setting unit 14c each supply the circuit processing control unit 16 with setting data 40 for executing a different thread. The circuit processing control unit 16 supplies the setting data 40 received from the three circuit setting units, in a predetermined order, to the reconfigurable units that correspond to the stages of the pipeline of the reconfigurable circuit 12. As a result, each stage of the reconfigurable circuit 12 is configured as a part of one of a plurality of types of circuits, and a multithread processing function is realized.

The path units 24 and 29 function as feedback paths and deliver the output of the reconfigurable circuit 12 to the third switching circuit 28. The setting unit 14 may be configured as a command memory that outputs stored data based on the count value of a program counter. In this case, the control unit 18 controls the output of the program counter. In this sense, the setting data 40 may also be called command data.

The memory unit 27 has a storage area for storing, in accordance with instructions from the control unit 18, the data signals output from the reconfigurable circuit 12. The memory unit 27 may be provided inside the reconfigurable circuit 12 or outside it.

The memory unit 27 is composed of a plurality of RAMs or the like. During execution of a plurality of threads, each RAM is assigned to a thread executed on the reconfigurable circuit 12. For example, when three threads A, B, and C are executed simultaneously, the first RAM is assigned to thread A, the second RAM to thread B, and the third RAM to thread C. The assignment of threads to RAMs is controlled by the control unit 18.

In particular, when data is passed between threads, the control unit 18 preferably changes the assignment of threads to the RAMs at a predetermined timing. For example, when thread B uses the processing result of thread A, the control unit 18 assigns thread B, at the timing that follows the end of thread A's processing, to the first RAM in which that result is stored. Thread B can then use the result of thread A held in the first RAM efficiently, without copying it from the first RAM to the second RAM. The first switching circuit 23 is provided in the stage before the memory unit 27, and the second switching circuit 25 in the stage after it.
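The hand-off of a RAM from one thread to another can be sketched in Python as follows. The class `MemoryUnit`, its method names, and the RAM sizes are hypothetical and serve only to illustrate the idea that changing the assignment replaces a copy; the patent does not specify an interface.

```python
class MemoryUnit:
    """Toy model of a memory unit holding several RAMs assigned to threads."""

    def __init__(self, n_rams, size):
        self.rams = [[0] * size for _ in range(n_rams)]
        self.owner = {}  # thread name -> index of the RAM assigned to it

    def assign(self, thread, ram_index):
        """Role played by the control unit: set or change a thread's RAM."""
        self.owner[thread] = ram_index

    def write(self, thread, addr, value):
        self.rams[self.owner[thread]][addr] = value

    def read(self, thread, addr):
        return self.rams[self.owner[thread]][addr]

mem = MemoryUnit(n_rams=3, size=4)
mem.assign("A", 0)
mem.assign("B", 1)
mem.write("A", 0, 42)       # thread A stores its result in the first RAM

# Thread A has finished: reassign the first RAM to thread B instead of copying.
mem.assign("B", 0)
result_seen_by_b = mem.read("B", 0)
```

No data moves between `rams[0]` and `rams[1]`; only the ownership table changes, which is the efficiency argument made in the text.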

The first switching circuit 23 selects the output from the reconfigurable circuit 12 according to the thread and supplies it to a RAM of the memory unit 27. This sets the assignment of threads to RAMs. Consequently, as long as the switch settings in the first switching circuit 23 are unchanged, a given RAM stores data from the same thread. The second switching circuit 25 selects one of the outputs of the plurality of RAMs according to the thread and feeds it back to the input of the reconfigurable circuit 12.

The second switching circuit 25 has a plurality of second switching units, each of which is set to select the data for one thread. The data signals stored in the memory unit 27 are delivered through the path unit 29 as input to the reconfigurable circuit 12 according to the switch settings in the second switching circuit 25. The operations of the first switching circuit 23, the second switching circuit 25, and the memory unit 27 are controlled by the control unit 18.
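The two switching circuits behave like banks of multiplexers around the RAMs. The following Python sketch models that selection under simplifying assumptions: one data word per thread, one selector per RAM on the write side, and one selector per thread on the read side. The function names and the dictionary-based representation are illustrative, not taken from the patent.

```python
def first_switch(outputs_by_thread, ram_select):
    """Write side: ram_select[i] names the thread whose output RAM i stores."""
    return [outputs_by_thread[thread] for thread in ram_select]

def second_switch(ram_outputs, thread_select):
    """Read side: thread_select[t] is the index of the RAM fed back for thread t."""
    return {thread: ram_outputs[i] for thread, i in thread_select.items()}

# Outputs of the reconfigurable circuit, keyed by thread (example values).
outputs = {"A": 10, "B": 20, "C": 30}

# Default assignment: RAM 0 <- A, RAM 1 <- B, RAM 2 <- C.
ram_contents = first_switch(outputs, ["A", "B", "C"])

# Thread B is switched to read RAM 0, so it receives thread A's data.
fed_back = second_switch(ram_contents, {"A": 0, "B": 0, "C": 2})
```

As long as `ram_select` is unchanged, each RAM keeps receiving data from the same thread, matching the behavior described for the first switching circuit.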

There are two feedback input routes to the reconfigurable circuit 12, the path unit 24 and the path unit 29. Because the path unit 24 does not pass through the memory unit 27, it allows high-speed feedback processing. In particular, when the memory unit 27 operates at low speed, the path unit 24 allows feedback processing that is faster still than that of the path unit 29.

In response to an instruction signal from the circuit processing control unit 16, the third switching circuit 28 selectively outputs the external input signal and the input signals from the path units 24 and 29 to the reconfigurable circuit 12. Specifically, the circuit processing control unit 16 issues a switching instruction at a predetermined timing determined from the setting data 40.

The internal state holding circuit 20 and the output circuit 22 are configured as sequential circuits such as data flip-flops (D-FFs) and receive the output of the reconfigurable circuit 12. The internal state holding circuit 20 is connected to the path unit 24, and the memory unit 27 is connected to the path unit 29. The reconfigurable circuit 12 is composed of combinational circuits and flip-flop circuits and realizes pipeline operation.

The reconfigurable circuit 12 includes logic circuits whose functions can be changed. Specifically, the reconfigurable circuit 12 has a structure in which logic circuits capable of selectively executing a plurality of arithmetic functions are arranged in a plurality of stages, and it further includes connection units that can set the connection relationship between the outputs of the preceding-stage logic circuit row and the inputs of the subsequent-stage logic circuit row.

Each connection unit also functions as a state holding circuit (hereinafter also called an FF circuit) that holds the output of the preceding-stage logic circuit row, that is, the internal state. The logic circuits are arranged in a matrix. The function of each logic circuit and the connection relationships between logic circuits are set based on the setting data 40 supplied by the setting unit 14. The setting data 40 is generated by the following procedure.

A program 36 to be realized by the integrated circuit device 26 is held in the storage unit 34. The program 36 is a behavioral description of the processing performed by the circuit, in which a signal processing circuit, a signal processing algorithm, or the like is written in a high-level language such as C.

The compiling unit 30 compiles the program 36 stored in the storage unit 34, converts it into a data flow graph (DFG) 38, and stores the graph in the storage unit 34. The data flow graph 38 expresses the execution-order dependencies between the operations in the circuit and shows, as a graph structure, the flow of operations on input variables and constants. In general, the data flow graph 38 is drawn so that the operations proceed from top to bottom.
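As a rough illustration of what such a conversion produces, the sketch below builds a small data flow graph from an arithmetic expression. It uses Python's standard `ast` module and a Python expression as a stand-in for the C source mentioned in the text; the function `build_dfg` and the tuple layout `(node_id, operation, operands)` are assumptions made for this example and are not the patent's compiler.

```python
import ast

def build_dfg(expr):
    """Return DFG nodes as (node_id, operation, (left, right)) in dependency order.

    Operands are input-variable names, constants, or ids of earlier nodes.
    """
    nodes = []

    def visit(n):
        if isinstance(n, ast.Name):
            return n.id                      # input variable
        if isinstance(n, ast.Constant):
            return n.value                   # constant
        if isinstance(n, ast.BinOp):
            left, right = visit(n.left), visit(n.right)
            node_id = f"n{len(nodes)}"
            nodes.append((node_id, type(n.op).__name__, (left, right)))
            return node_id
        raise ValueError("unsupported expression element")

    visit(ast.parse(expr, mode="eval").body)
    return nodes

dfg = build_dfg("(a + b) * (c - 3)")
```

Each node only refers to inputs, constants, or nodes listed before it, which is the execution-order dependency the data flow graph is meant to capture.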

The setting data generation unit 32 generates the setting data 40 from the data flow graph 38. The setting data 40 is data for mapping the data flow graph 38 onto the reconfigurable circuit 12, and it determines the functions of the logic circuits in the reconfigurable circuit 12 and the connection relationships between them. The setting data generation unit 32 generates setting data 40 for a plurality of divided circuits obtained by dividing a single target circuit to be generated.

The setting data generation unit 32 determines how to divide the target circuit from the arrangement of the logic circuits in the reconfigurable circuit 12 and the data flow graph 38. When it is known in advance that the reconfigurable circuit 12 will perform pipeline processing, the division method may instead be determined from the arrangement of the reconfigurable units in the reconfigurable circuit 12. The arrangement of the reconfigurable circuit 12 may be communicated from the control unit 18 to the setting data generation unit 32, or it may be recorded in the storage unit 34 in advance. The control unit 18 may also instruct the setting data generation unit 32 how to divide the target circuit.

Through the above procedure, the storage unit 34 stores a plurality of sets of setting data 40 for configuring the reconfigurable circuit 12 as the intended circuit. Each set of setting data 40 represents one of the divided circuits obtained by dividing the target circuit. Generating the setting data 40 of the target circuit according to the circuit scale of the reconfigurable circuit 12 in this way makes it possible to realize a highly versatile processing apparatus 10. Viewed differently, the processing apparatus 10 of the embodiment allows a desired circuit to be reconfigured using a reconfigurable circuit 12 of small circuit scale.
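One simple way to picture the division of a target circuit is to split its data flow graph by depth so that each part fits the number of stages available. The sketch below does this for a hand-written example graph. The node format, the function names, and the rule "group consecutive levels into chunks of `n_stages`" are assumptions for illustration; the patent does not prescribe this particular algorithm, and the sketch ignores the column-count limit of the array.

```python
def node_levels(dfg):
    """Level of each node: 1 + the deepest node it depends on (inputs count as 0)."""
    level = {}
    for node_id, _op, operands in dfg:
        deps = (level.get(o, 0) for o in operands if isinstance(o, str))
        level[node_id] = 1 + max(deps, default=0)
    return level

def split_by_stages(dfg, n_stages):
    """Divide the graph into parts that each span at most n_stages levels."""
    levels = node_levels(dfg)
    depth = max(levels.values())
    return [
        [n for n in dfg if start <= levels[n[0]] < start + n_stages]
        for start in range(1, depth + 1, n_stages)
    ]

# Example graph, four levels deep, in dependency order.
example_dfg = [
    ("n0", "Add", ("a", "b")),
    ("n1", "Mult", ("n0", "c")),
    ("n2", "Sub", ("n1", 1)),
    ("n3", "RShift", ("n2", 2)),
]
levels = node_levels(example_dfg)
parts = split_by_stages(example_dfg, n_stages=2)
```

With a two-stage array the four-level graph yields two divided circuits, each of which would be mapped in turn, with the intermediate result `n1` carried between them through a feedback path.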

FIG. 2 shows the configuration of the reconfigurable circuit 12. The reconfigurable circuit 12 includes a plurality of logic circuit rows, each composed of logic circuits 50 capable of selectively executing a plurality of arithmetic functions. Specifically, the reconfigurable circuit 12 consists of a multi-stage arrangement of logic circuit rows and a connection unit 52 provided at each stage.

The connection unit 52 can set either an arbitrary connection relationship between the outputs of the preceding-stage logic circuits and the inputs of the subsequent-stage logic circuits, or a connection relationship selected from a predetermined set of combinations. The connection unit 52 can also hold the output signals of the preceding-stage logic circuits. In the reconfigurable circuit 12, the multi-stage arrangement of logic circuits causes operations to proceed from the upper stages toward the lower stages.

The reconfigurable circuit 12 has ALUs (Arithmetic Logic Units) as the logic circuits 50. An ALU is an arithmetic logic circuit that can selectively execute, according to its settings, a plurality of types of multi-bit operations such as logical OR, logical AND, and bit shift. Each ALU has a selector for setting one of the plurality of arithmetic functions. In the illustrated example, each ALU has two input terminals and two output terminals.
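A selector-configured ALU can be modeled in a few lines of Python. The 16-bit word width, the operation names, and the single result value are assumptions made for this sketch; the text states only that the ALU selects among multi-bit operations such as OR, AND, and bit shift, and it does not say what the second output terminal carries.

```python
WIDTH = 16                     # assumed word width
MASK = (1 << WIDTH) - 1

# Operations the selector can choose from (illustrative set).
OPS = {
    "OR":   lambda a, b: a | b,
    "AND":  lambda a, b: a & b,
    "SHL":  lambda a, b: a << (b % WIDTH),
    "ADD":  lambda a, b: a + b,
    "PASS": lambda a, b: a,
}

class ALU:
    """Two-input ALU whose function is chosen by a configuration setting."""

    def __init__(self):
        self.op = "PASS"

    def configure(self, op):
        if op not in OPS:
            raise ValueError(f"unknown operation: {op}")
        self.op = op

    def compute(self, a, b):
        # Results are truncated to the word width, as in fixed-width hardware.
        return OPS[self.op](a, b) & MASK

alu = ALU()
alu.configure("AND")
and_result = alu.compute(0b1100, 0b1010)
alu.configure("SHL")
shl_result = alu.compute(0x8001, 1)
```

Reconfiguring the ALU changes only the selector value, which is why switching its function can be far quicker than rewriting LUT contents.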

As illustrated, the reconfigurable circuit 12 is configured as an ALU array of X stages and Y columns, with X ALUs arranged in the vertical direction and Y ALUs in the horizontal direction. Input variables and constants are input to the first-stage ALUs ALU11, ALU12, ..., ALU1Y, which perform the operations they have been set to. The results are input to the second-stage ALUs ALU21, ALU22, ..., ALU2Y according to the connections set in the first-stage connection unit 52.

The first-stage connection unit 52 is wired so that it can realize either an arbitrary connection relationship between the outputs of the first-stage ALU row and the inputs of the second-stage ALU row, or a connection relationship selected from a predetermined set of combinations; the intended connections are enabled by the settings. The same structure is repeated down to the connection unit 52 of the X-th stage, which is the final stage. Each connection unit 52 also functions as an FF circuit, and the final-stage connection unit 52 may serve as the internal state holding circuit 20 shown in FIG. 1.
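The data movement through an ALU array with configurable connection units can be sketched as follows. The representation is an assumption made for illustration: each ALU is a two-input function producing one result, and each connection unit is a list of index pairs stating which outputs of the previous row feed the two inputs of each ALU in the next row. Clocking and the FF function of the connection units are omitted here.

```python
import operator

def run_alu_array(first_inputs, stage_ops, connections):
    """Propagate data through the stages of a small ALU array.

    first_inputs: list of (a, b) input pairs for the first-stage ALUs
    stage_ops:    stage_ops[s][y] is the function set on the ALU in stage s, column y
    connections:  connections[s][y] = (i, j), the previous-row outputs wired
                  to the two inputs of the ALU in stage s + 1, column y
    """
    pairs = first_inputs
    outs = []
    for s, ops in enumerate(stage_ops):
        outs = [op(a, b) for op, (a, b) in zip(ops, pairs)]
        if s < len(connections):
            pairs = [(outs[i], outs[j]) for i, j in connections[s]]
    return outs

# A 2-stage, 2-column example.
final_outputs = run_alu_array(
    first_inputs=[(1, 2), (3, 4)],
    stage_ops=[[operator.add, operator.mul],
               [operator.sub, operator.add]],
    connections=[[(1, 0), (0, 1)]],
)
```

In this example the first row produces 3 and 12; the connection unit crosses them for the first ALU of the second row (12 - 3) and passes them straight to the second (3 + 12).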

The reconfigurable circuit 12 of FIG. 2 shows a configuration in which connection units 52 alternate with ALU rows, one stage at a time. Placing a connection unit 52 below each ALU row divides the reconfigurable circuit 12 into X stages of reconfigurable units, each consisting of a single ALU row.

具体的に、１段のリコンフィギュラブルユニットは、１段のＡＬＵ列と１段の接続部５２で構成される。この分割は、接続部５２に含まれるＦＦ回路にしたがうものであり、例えば２段のＡＬＵ列毎に接続部５２を設け、２段のＡＬＵ列の間を、ＦＦ回路を有しない接続部で接続する場合には、２段ずつのＡＬＵ列で構成されるＸ／２段のリコンフィギュラブルユニットに分割されることになる。それ以外にも、ＦＦ回路を所定段のＡＬＵ列毎に設けることにより、所望段のリコンフィギュラブルユニットを構成することができる。 Specifically, the one-stage reconfigurable unit includes one-stage ALU row and one-stage connection unit 52. This division is in accordance with the FF circuit included in the connection unit 52. For example, a connection unit 52 is provided for each two-stage ALU column, and a connection unit having no FF circuit is connected between two ALU columns. In this case, it is divided into X / 2-stage reconfigurable units each composed of two ALU rows. In addition, by providing an FF circuit for each ALU column of a predetermined stage, a reconfigurable unit of a desired stage can be configured.
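The relationship between FF placement and the number of reconfigurable units can be illustrated with a minimal sketch. The function name `split_into_units` and the parameter `ff_interval` (the number of ALU stages between two connection units that contain an FF circuit) are hypothetical and are introduced only for this illustration.

```python
def split_into_units(num_stages, ff_interval):
    """Group ALU stages 1..num_stages into reconfigurable units.

    ff_interval is the assumed number of ALU stages between two
    connection units that contain an FF circuit.
    """
    if ff_interval < 1 or num_stages % ff_interval != 0:
        raise ValueError("num_stages must be a positive multiple of ff_interval")
    # Each unit is the list of stage numbers it covers.
    return [
        list(range(first, first + ff_interval))
        for first in range(1, num_stages + 1, ff_interval)
    ]


# FF in every connection unit: X units of one stage each.
print(split_into_units(6, 1))
# FF in every second connection unit: X/2 units of two stages each.
print(split_into_units(6, 2))
```

With an FF circuit after every stage the array of X stages yields X units, and with an FF circuit after every second stage it yields X/2 units, matching the description above.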

回路のコンフィギュレーションは１クロックで行われる。具体的に、回路処理制御部１６が１クロック毎に設定データをリコンフィギュラブル回路１２にマッピングする。各ＡＬＵ列の出力は、後段の接続部５２に保持される。複数スレッドの実行中、接続部５２のＦＦ回路は、前段の論理回路から出力されるデータを格納し、次のクロックで、前段の論理回路が実行していたスレッドと同一のスレッドを実行する後段の論理回路に、格納したデータを供給する。 The circuit is configured in one clock. Specifically, the circuit processing control unit 16 maps setting data onto the reconfigurable circuit 12 at every clock. The output of each ALU column is held in the connection unit 52 that follows it. While a plurality of threads are running, the FF circuit of the connection unit 52 stores the data output from the preceding-stage logic circuit and, at the next clock, supplies the stored data to the subsequent-stage logic circuit that executes the same thread as the one the preceding-stage logic circuit was executing.

このように、１つのスレッドの処理は、クロック毎に１つ下段のＡＬＵ列において実行されることになる。最終段で処理されると、また最上段のＡＬＵ列からクロック毎に１段ずつ下がっていく。これにより、マルチスレッド処理を実行でき、効率的な回路コンフィギュレーションを実現できる。 In this way, the processing of one thread moves to the ALU column one stage lower at every clock. After it has been processed at the final stage, it starts again from the uppermost ALU column and moves down one stage per clock. This enables multi-thread processing and an efficient circuit configuration.

図３は、リコンフィギュラブル回路の構成の別の例を示す。図３に示すリコンフィギュラブル回路１２ａは、図２に示すリコンフィギュラブル回路１２の機能をさらに拡張している。図３に示すリコンフィギュラブル回路１２ａにおいて、接続部５２ａは、図２の接続部５２の機能に加えて、外部から入力される変数や定数を、所期のＡＬＵに供給する機能を有している。 FIG. 3 shows another example of the configuration of the reconfigurable circuit. The reconfigurable circuit 12a shown in FIG. 3 further extends the function of the reconfigurable circuit 12 shown in FIG. 2. In the reconfigurable circuit 12a shown in FIG. 3, the connection unit 52a has, in addition to the function of the connection unit 52 of FIG. 2, a function of supplying variables and constants input from the outside to the intended ALU.

また、接続部５２ａは、前段のＡＬＵの演算結果を外部に直接出力することもできる。この構成により、図２に示されるリコンフィギュラブル回路１２の構成よりも多様な組合せ回路を構成することが可能となり、設計の自由度が向上する。 The connection unit 52a can also directly output the calculation result of the previous ALU to the outside. With this configuration, it is possible to configure various combinational circuits as compared with the configuration of the reconfigurable circuit 12 shown in FIG. 2, and the degree of freedom in design is improved.

図４は、データフローグラフ３８の例を示す図である。データフローグラフ３８においては、入力される変数や定数の演算の流れが段階的にグラフ構造で表現されている。図中、演算子は丸印で示されている。設定データ生成部３２は、このデータフローグラフ３８をリコンフィギュラブル回路１２にマッピングするための設定データ４０を生成する。 FIG. 4 is a diagram illustrating an example of the data flow graph 38. In the data flow graph 38, the flow of operations of input variables and constants is expressed step by step in a graph structure. In the figure, operators are indicated by circles. The setting data generation unit 32 generates setting data 40 for mapping the data flow graph 38 to the reconfigurable circuit 12.

特にデータフローグラフ３８をリコンフィギュラブル回路１２にマッピングしきれない場合に、データフローグラフ３８を複数の領域に分割し、分割回路の設定データ４０を生成する。実施例では、リコンフィギュラブル回路１２上で複数のスレッドが実行されるが、各スレッドは、リコンフィギュラブル回路１２におけるリコンフィギュラブルユニットにてそれぞれ実行されることになる。 In particular, when the data flow graph 38 cannot be mapped to the reconfigurable circuit 12, the data flow graph 38 is divided into a plurality of regions, and setting data 40 for the divided circuit is generated. In the embodiment, a plurality of threads are executed on the reconfigurable circuit 12, but each thread is executed by a reconfigurable unit in the reconfigurable circuit 12.

したがって、設定データ生成部３２は、リコンフィギュラブルユニットの回路規模に応じて、データフローグラフ３８を複数の領域に分割し、分割回路の設定データ４０を生成する。データフローグラフ３８による演算の流れを回路上で実現するべく、設定データ４０は、演算機能を割り当てる論理回路を特定し、また論理回路間の接続関係を定め、さらに入力変数や入力定数などを定義したデータとなる。したがって、設定データ４０は、各論理回路５０の機能を選択するセレクタに供給する選択情報、接続部５２の結線を設定する接続情報、必要な変数データや定数データなどを含んで構成される。 Therefore, the setting data generation unit 32 divides the data flow graph 38 into a plurality of regions according to the circuit scale of the reconfigurable unit, and generates setting data 40 for the divided circuit. In order to realize the flow of calculation by the data flow graph 38 on the circuit, the setting data 40 specifies the logic circuit to which the calculation function is assigned, defines the connection relationship between the logic circuits, and further defines input variables, input constants, and the like. Data. Therefore, the setting data 40 includes selection information supplied to a selector that selects the function of each logic circuit 50, connection information for setting the connection of the connection unit 52, necessary variable data, constant data, and the like.
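The three kinds of information carried by the setting data 40 (function selection, connection information, and constants) can be pictured with a small sketch. The class `StageSetting`, the table `ALU_FUNCTIONS`, and the function `run_stage` are hypothetical names, and modelling each ALU as a two-input integer operator is an assumption made only for illustration; the patent does not specify an encoding.

```python
import operator
from dataclasses import dataclass
from typing import List, Tuple

# Assumed set of selectable ALU functions (illustrative only).
ALU_FUNCTIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "and": operator.and_,
}


@dataclass
class StageSetting:
    functions: List[str]                 # selection information, one per ALU
    connections: List[Tuple[int, int]]   # connection information: input indexes per ALU
    constants: List[int]                 # constant data appended to the variable inputs


def run_stage(setting, variables):
    """Evaluate one ALU column configured by `setting`."""
    inputs = list(variables) + list(setting.constants)
    return [
        ALU_FUNCTIONS[name](inputs[a], inputs[b])
        for name, (a, b) in zip(setting.functions, setting.connections)
    ]


setting = StageSetting(functions=["add", "mul"],
                       connections=[(0, 1), (1, 2)],
                       constants=[10])
print(run_stage(setting, [2, 3]))  # ALU 0 adds 2 and 3, ALU 1 multiplies 3 by 10
```

Keeping selection, connection, and constant data in one record per stage mirrors the statement that one piece of setting data fully describes a divided circuit.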

図５は、１つの生成すべきターゲット回路４２を分割してできる複数の回路の設定データ４０について説明するための図である。このターゲット回路４２には、独立した動作を実行する３つのターゲット回路Ａ、ターゲット回路Ｂ、ターゲット回路Ｃが含まれている。ターゲット回路Ａ、ターゲット回路Ｂ、ターゲット回路Ｃは、それぞれ独立したスレッドを構成し、リコンフィギュラブルユニットの回路規模に合わせて分割される。 FIG. 5 is a diagram for explaining setting data 40 of a plurality of circuits obtained by dividing one target circuit 42 to be generated. The target circuit 42 includes three target circuits A, B, and C that execute independent operations. The target circuit A, the target circuit B, and the target circuit C constitute independent threads, and are divided according to the circuit scale of the reconfigurable unit.

この例では、それぞれのターゲット回路が、３つの分割回路に分割されている。すなわち、ターゲット回路Ａは、分割回路Ａ＿０、分割回路Ａ＿１、分割回路Ａ＿２に分割され、ターゲット回路Ｂは、分割回路Ｂ＿０、分割回路Ｂ＿１、分割回路Ｂ＿２に分割され、ターゲット回路Ｃは、分割回路Ｃ＿０、分割回路Ｃ＿１、分割回路Ｃ＿２に分割される。設定データ生成部３２は、各分割回路に対して設定データ４０を生成する。 In this example, each target circuit is divided into three divided circuits. That is, the target circuit A is divided into a divided circuit A_0, a divided circuit A_1, and a divided circuit A_2, the target circuit B is divided into a divided circuit B_0, a divided circuit B_1, and a divided circuit B_2, and the target circuit C is divided into a divided circuit C_0. And divided circuit C_1 and divided circuit C_2. The setting data generation unit 32 generates setting data 40 for each divided circuit.

各ターゲット回路は、データフローグラフ３８における演算の流れにしたがって分割される。データフローグラフ３８において、上から下に向かう方向に演算の流れが表現される場合、そのデータフローグラフ３８を上から所定の間隔で切り取り、その切り取った部分を分割回路として設定する。流れにしたがって切り取る間隔は、リコンフィギュラブル回路１２におけるリコンフィギュラブルユニットの段数以下に定められる。ターゲット回路４２は、データフローグラフ３８の横方向で分割されてもよい。横方向に分割する幅は、リコンフィギュラブル回路１２における論理回路の１段当たりの個数以下に定められる。 Each target circuit is divided according to the flow of calculation in the data flow graph 38. In the data flow graph 38, when the calculation flow is expressed in a direction from the top to the bottom, the data flow graph 38 is cut from the top at a predetermined interval, and the cut portion is set as a dividing circuit. The interval to be cut according to the flow is determined to be equal to or less than the number of stages of the reconfigurable unit in the reconfigurable circuit 12. The target circuit 42 may be divided in the horizontal direction of the data flow graph 38. The width to be divided in the horizontal direction is determined to be equal to or less than the number of logic circuits in the reconfigurable circuit 12 per stage.

例えば、リコンフィギュラブル回路１２が３段のＡＬＵ列で構成され、各段に接続部５２が設けられている場合、リコンフィギュラブルユニットには、１段のＡＬＵ列が含まれることになる。このとき、各ターゲット回路の分割回路は、１段のデータフローグラフ分を表現することになる。 For example, when the reconfigurable circuit 12 is configured by three stages of ALU columns and the connection unit 52 is provided at each stage, the reconfigurable unit includes one stage of ALU columns. At this time, the division circuit of each target circuit represents one stage of data flow graph.

したがって、図５の例では、各ターゲット回路が３段のデータフローグラフ３８により表現されていることになる。実際のターゲット回路の回路規模は、数十以上の段数のデータフローグラフ３８で表現されることが多いが、本明細書では説明の簡便のため、図５に示す分割回路が設定された場合について説明する。 Therefore, in the example of FIG. 5, each target circuit is represented by a three-stage data flow graph 38. The circuit scale of an actual target circuit is often represented by a data flow graph 38 with several tens of stages or more, but for simplicity this specification describes the case where the divided circuits shown in FIG. 5 are set.
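The division rule described above can be sketched as follows, assuming the data flow graph is already arranged as a list of levels from top to bottom, each level being a list of operators. The function name `slice_dfg` and its parameters are hypothetical; horizontal splitting is only checked here, not performed.

```python
def slice_dfg(levels, unit_depth, alus_per_stage):
    """Cut a levelled data flow graph into divided circuits.

    levels         : list of levels, top to bottom; each level is a list of operators
    unit_depth     : number of ALU stages in one reconfigurable unit
    alus_per_stage : number of logic circuits available in one stage
    """
    for level in levels:
        if len(level) > alus_per_stage:
            raise ValueError("level is wider than one ALU column; split it horizontally first")
    # Cut from the top at an interval no larger than the unit depth.
    return [levels[i:i + unit_depth] for i in range(0, len(levels), unit_depth)]


# A three-level graph on units of one ALU stage gives three divided circuits,
# as in the A_0, A_1, A_2 example.
graph = [["+", "-"], ["*"], [">>"]]
print(slice_dfg(graph, unit_depth=1, alus_per_stage=2))
```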

図６は、リコンフィギュラブル回路上に構成するターゲット回路Ａの処理の流れを示す図である。リコンフィギュラブル回路１２では、１回のマッピング処理を１クロックで実行することができる。ここでは、３段のＡＬＵ列（リコンフィギュラブルユニット）で構成されるリコンフィギュラブル回路１２を想定する。 FIG. 6 is a diagram illustrating a processing flow of the target circuit A configured on the reconfigurable circuit. In the reconfigurable circuit 12, one mapping process can be executed in one clock. Here, a reconfigurable circuit 12 composed of three stages of ALU rows (reconfigurable units) is assumed.

１クロック目に、１段目のＡＬＵ列に分割回路Ａ＿０が生成され、１段目の接続部５２におけるＦＦ回路が、分割回路Ａ＿０から出力されるデータを格納する。２クロック目に、２段目のＡＬＵ列に分割回路Ａ＿１が生成され、１段目の接続部５２におけるＦＦ回路が、１クロック目に格納したデータを、生成された分割回路Ａ＿１に供給する。 At the first clock, the dividing circuit A_0 is generated in the first-stage ALU column, and the FF circuit in the first-stage connection unit 52 stores the data output from the dividing circuit A_0. At the second clock, the dividing circuit A_1 is generated in the second-stage ALU column, and the FF circuit in the first-stage connection unit 52 supplies the data stored at the first clock to the generated dividing circuit A_1.

３クロック目に、３段目のＡＬＵ列に分割回路Ａ＿２が生成され、２段目の接続部５２におけるＦＦ回路が、２クロック目に格納したデータを、生成された分割回路Ａ＿２に供給する。実施例では、ターゲット回路Ａが３つの分割回路により構成されているため、分割回路Ａ＿２がデータを出力することで、ターゲット回路Ａの処理というスレッドが完了する。なお、ターゲット回路Ａが４つ以上の分割回路により構成されている場合には、分割回路Ａ＿２の出力がフィードバックされて、１段目のＡＬＵ列に供給されることになる。 At the third clock, the dividing circuit A_2 is generated in the third-stage ALU column, and the FF circuit in the second-stage connection unit 52 supplies the data stored at the second clock to the generated dividing circuit A_2. In the embodiment, since the target circuit A is composed of three divided circuits, the divided circuit A_2 outputs data, and the thread of processing of the target circuit A is completed. When the target circuit A is composed of four or more divided circuits, the output of the divided circuit A_2 is fed back and supplied to the first ALU column.

このように、１つのスレッドは、リコンフィギュラブル回路１２の各段に構成されたリコンフィギュラブルユニット毎に処理される。図６からも明らかなように、リコンフィギュラブル回路１２を１つのスレッドの処理のみに用いると、動作しないＡＬＵ列が生じる。そこで、本実施例では、リコンフィギュラブル回路１２を有効に活用するために、空いたＡＬＵ列で別のスレッドを実行させるようにする。これにより、マルチスレッド処理を実現できる。 Thus, one thread is processed for each reconfigurable unit configured in each stage of the reconfigurable circuit 12. As apparent from FIG. 6, when the reconfigurable circuit 12 is used only for processing of one thread, an ALU sequence that does not operate is generated. Therefore, in this embodiment, in order to use the reconfigurable circuit 12 effectively, another thread is executed in an empty ALU string. Thereby, multi-thread processing can be realized.

図７は、リコンフィギュラブル回路上で実現するマルチスレッド動作の流れを示す図である。各スレッドは、互いに独立して実行される。リコンフィギュラブル回路１２におけるリコンフィギュラブルユニット間のデータの受け渡しについては、図６に関して説明したとおりである。 FIG. 7 is a diagram illustrating the flow of a multi-thread operation realized on the reconfigurable circuit. The threads are executed independently of one another. Data transfer between the reconfigurable units in the reconfigurable circuit 12 is as described with reference to FIG. 6.

１クロック目に、リコンフィギュラブル回路１２の１段目のＡＬＵ列に分割回路Ａ＿０が生成される。２クロック目に、１段目のＡＬＵ列に分割回路Ｂ＿０が生成され、２段目のＡＬＵ列に分割回路Ａ＿１が生成される。３クロック目に、１段目のＡＬＵ列に分割回路Ｃ＿０が生成され、２段目のＡＬＵ列に分割回路Ｂ＿１が生成され、３段目のＡＬＵ列に分割回路Ａ＿２が生成される。 At the first clock, the dividing circuit A_0 is generated in the first ALU column of the reconfigurable circuit 12. At the second clock, the dividing circuit B_0 is generated in the first ALU column, and the dividing circuit A_1 is generated in the second ALU column. At the third clock, the dividing circuit C_0 is generated in the first ALU column, the dividing circuit B_1 is generated in the second ALU column, and the dividing circuit A_2 is generated in the third ALU column.

３クロック目で、ターゲット回路Ａの処理、すなわちスレッドＡの実行は完了する。４クロック目に、１段目のＡＬＵ列に分割回路Ａ＿０が生成され、２段目のＡＬＵ列に分割回路Ｃ＿１が生成され、３段目のＡＬＵ列に分割回路Ｂ＿２が生成される。４クロック目で、ターゲット回路Ｂの処理、すなわちスレッドＢの実行は完了し、新たなスレッドＡが実行される。 At the third clock, the processing of the target circuit A, that is, the execution of the thread A is completed. At the fourth clock, the dividing circuit A_0 is generated in the first ALU column, the dividing circuit C_1 is generated in the second ALU column, and the dividing circuit B_2 is generated in the third ALU column. At the fourth clock, the processing of the target circuit B, that is, the execution of the thread B is completed, and a new thread A is executed.

５クロック目に、１段目のＡＬＵ列に分割回路Ｂ＿０が生成され、２段目のＡＬＵ列に分割回路Ａ＿１が生成され、３段目のＡＬＵ列に分割回路Ｃ＿２が生成される。５クロック目で、ターゲット回路Ｃの処理、すなわちスレッドＣの実行は完了し、新たなスレッドＢが実行される。６クロック目に、１段目のＡＬＵ列に分割回路Ｃ＿０が生成され、２段目のＡＬＵ列に分割回路Ｂ＿１が生成され、３段目のＡＬＵ列に分割回路Ａ＿２が生成される。６クロック目で、スレッドＡの実行は完了し、新たなスレッドＣが実行される。以後、同様に各スレッドがクロック毎に処理される。 At the fifth clock, the dividing circuit B_0 is generated in the first ALU column, the dividing circuit A_1 is generated in the second ALU column, and the dividing circuit C_2 is generated in the third ALU column. At the fifth clock, the processing of the target circuit C, that is, the execution of the thread C is completed, and a new thread B is executed. At the sixth clock, the dividing circuit C_0 is generated in the first ALU column, the dividing circuit B_1 is generated in the second ALU column, and the dividing circuit A_2 is generated in the third ALU column. At the sixth clock, the execution of the thread A is completed, and a new thread C is executed. Thereafter, each thread is similarly processed for each clock.
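The clock-by-clock mapping described for FIG. 7 can be reproduced with a short sketch. It assumes that every thread consists of exactly as many divided circuits as there are ALU stages and that one thread instance is launched on the first stage at each clock in a fixed rotation; the function name `schedule` is hypothetical.

```python
def schedule(threads, stages, clocks):
    """Return, for each clock, the divided circuit mapped on each ALU stage.

    Assumes each thread has exactly `stages` divided circuits and that a
    new thread instance enters stage 1 at every clock, in rotation.
    """
    table = []
    for clock in range(1, clocks + 1):
        row = []
        for stage in range(1, stages + 1):
            launch = clock - stage          # index of the launch that reached this stage
            if launch < 0:
                row.append(None)            # stage still idle during start-up
            else:
                name = threads[launch % len(threads)]
                row.append(f"{name}_{stage - 1}")
        table.append(row)
    return table


for clock, row in enumerate(schedule("ABC", stages=3, clocks=6), start=1):
    print(clock, row)
```

From the third clock onward every ALU stage is occupied, which is the utilization benefit the multi-thread mapping is intended to provide.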

このように、リコンフィギュラブル回路１２が複数のスレッドを同時に実行することで、リコンフィギュラブル回路１２のハード資源を有効に活用することができるとともに、もとのターゲット回路４２全体の処理速度を高速化できる。 Because the reconfigurable circuit 12 executes a plurality of threads simultaneously in this way, the hardware resources of the reconfigurable circuit 12 can be used effectively, and the processing speed of the original target circuit 42 as a whole can be increased.

図８は、集積回路装置２６の詳細な構成を示す。図８は、主としてマルチスレッド処理を実現するための構成、具体的にはリコンフィギュラブル回路１２の入出力に関与する構成を示す。ここでは、図３に示すような各段の接続部５２ａからの途中出力およびメモリ部２７からの途中入力が可能なリコンフィギュラブル回路１２を示している。図８では、説明の便宜上、接続部の符号は、「５２」を利用する。図１、図３および図４の構成と同一の符号を付した構成は、同一の構造および機能を有している。 FIG. 8 shows a detailed configuration of the integrated circuit device 26. FIG. 8 mainly shows the configuration for realizing multi-thread processing, specifically the configuration involved in the input and output of the reconfigurable circuit 12. Shown here is a reconfigurable circuit 12 that, like the one in FIG. 3, allows intermediate output from the connection unit 52a of each stage and intermediate input from the memory unit 27. In FIG. 8, for convenience of explanation, the reference numeral "52" is used for the connection units. Components denoted by the same reference numerals as those in FIGS. 1, 3 and 4 have the same structure and function.

メモリ部２７は、複数のＲＡＭ２７ａ、ＲＡＭ２７ｂ、ＲＡＭ２７ｃを有する。各ＲＡＭは、リコンフィギュラブル回路１２からの出力を記憶する。本実施例では、各ＲＡＭは、リコンフィギュラブル回路１２上で実行されるスレッドに割り当てられる。 The memory unit 27 includes a plurality of RAMs 27a, 27b, and 27c. Each RAM stores an output from the reconfigurable circuit 12. In this embodiment, each RAM is assigned to a thread executed on the reconfigurable circuit 12.

すなわち、１つのＲＡＭは、１つのスレッドからのデータを記憶し、またそのスレッドの実行に必要なデータをリコンフィギュラブル回路１２上のＡＬＵに供給する。したがってＲＡＭの個数は、少なくとも同時に実行するスレッドの数だけ存在していることが好ましい。 That is, one RAM stores data from one thread and supplies data necessary for execution of the thread to the ALU on the reconfigurable circuit 12. Therefore, it is preferable that there are as many RAMs as there are threads executed at the same time.

第１切替回路２３は、第１切替部２３ａ、第１切替部２３ｂ、第１切替部２３ｃを有する。第１切替部２３ａ、２３ｂ、２３ｃのそれぞれは、それぞれＲＡＭ２７ａ、２７ｂ、２７ｃに対応して設けられる。第１切替部２３ａ、２３ｂ、２３ｃは、リコンフィギュラブル回路１２からの出力をスレッドに応じて選択して、対応するＲＡＭ２７ａ、２７ｂ、２７ｃに供給する。それぞれの第１切替部は、全段の接続部５２の出力線と接続されている。 The first switching circuit 23 includes a first switching unit 23a, a first switching unit 23b, and a first switching unit 23c. The first switching units 23a, 23b, and 23c are provided corresponding to the RAMs 27a, 27b, and 27c, respectively. The first switching units 23a, 23b, and 23c select the output from the reconfigurable circuit 12 according to the thread and supply it to the corresponding RAMs 27a, 27b, and 27c. Each first switching unit is connected to the output lines of the connection units 52 of all stages.

同時実行されるスレッド数が３つの場合には、１つのスレッドは、いずれかのＡＬＵ列において実行されることとなり、したがって、各第１切替部は、対応するＲＡＭが割り当てられているスレッドからのデータを選択して、そのＲＡＭに供給する。 When three threads are executed simultaneously, each thread is being executed in one of the ALU columns. Each first switching unit therefore selects the data from the thread to which its corresponding RAM is assigned and supplies that data to the RAM.
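The selection performed by one first switching unit can be sketched as a simple multiplexer over the outputs of all stages. The function name `select_for_ram` and the list-based representation of stage outputs are assumptions made for illustration only.

```python
def select_for_ram(stage_outputs, stage_threads, ram_thread):
    """Model one first switching unit.

    stage_outputs : output value of the connection unit of each stage
    stage_threads : thread currently executing in each stage
    ram_thread    : thread to which this unit's RAM is assigned
    Returns the output of the stage running `ram_thread`, or None if
    that thread is not running in any stage at this clock.
    """
    for output, thread in zip(stage_outputs, stage_threads):
        if thread == ram_thread:
            return output
    return None


# At a clock where stages 1 to 3 run threads C, B and A respectively,
# the switching unit of the RAM assigned to thread A takes the stage 3 output.
print(select_for_ram([10, 20, 30], ["C", "B", "A"], "A"))
```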

第２切替回路２５は、第２切替部２５ａ、第２切替部２５ｂ、第２切替部２５ｃを有する。第２切替部２５ａ、２５ｂ、２５ｃのそれぞれは、スレッドに対応して設けられる。第２切替部２５ａ、２５ｂ、２５ｃのそれぞれは、ＲＡＭ２７ａ、２７ｂ、２７ｃからの出力の１つをスレッドに応じて選択して、リコンフィギュラブル回路１２の入力に供給する。具体的に、第２切替部２５ａは、スレッドＡの処理に必要なデータを選択して経路部２９に出力する。 The second switching circuit 25 includes a second switching unit 25a, a second switching unit 25b, and a second switching unit 25c. Each of the second switching units 25a, 25b, and 25c is provided corresponding to a thread. Each of the second switching units 25a, 25b, and 25c selects one of the outputs from the RAMs 27a, 27b, and 27c according to the thread and supplies the selected one to the input of the reconfigurable circuit 12. Specifically, the second switching unit 25 a selects data necessary for the processing of the thread A and outputs the data to the route unit 29.

同様に、第２切替部２５ｂは、スレッドＢの処理に必要なデータを選択して経路部２９に出力し、第２切替部２５ｃは、スレッドＣの処理に必要なデータを選択して経路部２９に出力する。これにより、経路部２９から供給されるデータとスレッドとを対応付けることができ、リコンフィギュラブル回路１２の入力設定および機能設定を容易にすることができる。 Similarly, the second switching unit 25b selects the data necessary for the processing of the thread B and outputs it to the path unit 29, and the second switching unit 25c selects the data necessary for the processing of the thread C and outputs it to the path unit 29. As a result, the data supplied from the path unit 29 can be associated with the threads, which simplifies the input setting and function setting of the reconfigurable circuit 12.

図９は、スレッド間のデータの受け渡しの一例を示す図である。スレッドＡは例えば外部からの入力ないしは設定データによる入力を受けて実行される。スレッドＢはスレッドＡの処理結果を利用し、スレッドＣはスレッドＢの処理結果を利用して、自身の処理結果を出力する。各スレッドは独立した処理を行うため、図７に示すようにマルチスレッドで実行することができる。 FIG. 9 is a diagram illustrating an example of data transfer between threads. The thread A is executed, for example, on receiving input from the outside or input given by setting data. The thread B uses the processing result of the thread A, and the thread C uses the processing result of the thread B to output its own processing result. Since each thread performs independent processing, the threads can be executed as multiple threads as shown in FIG. 7.

スレッドに対するＲＡＭの割当てについて考察する。ＲＡＭをスレッドに対して任意に割り当てる場合、各ＲＡＭに格納されているデータがどのスレッドのデータであるかを判断する必要がある。またデータを読み出すタイミングおよび書き込むタイミングも、適宜判断しなければならない。 Consider the allocation of RAM to threads. When the RAM is arbitrarily assigned to the thread, it is necessary to determine which thread the data stored in each RAM is. In addition, the timing for reading and writing data must be determined appropriately.

各スレッドＡ、Ｂ、Ｃは、他のスレッドと無関係に独立してリコンフィギュラブル回路１２上にマッピングされて動作しているため、ＲＡＭをスレッドに対して任意に割り当てたのであれば、データの読出および書込の制御が非常に複雑となる。以上の各構成の動作は、コンパイル部３０により実行されてＤＦＧに変換されるものであるが、１つのＲＡＭから複数のデータを同時に読み出すことができないなどの複雑な条件を全て考慮する必要があるため、コンパイルプログラムが複雑化することが確実である。 Since the threads A, B, and C are each mapped onto the reconfigurable circuit 12 and operate independently of the other threads, control of data reads and writes would become very complicated if the RAMs were assigned to the threads arbitrarily. The operation of each of the above components is processed by the compiling unit 30 and converted into a DFG, and because all the complicated conditions must be taken into account, such as the fact that a plurality of data items cannot be read from one RAM simultaneously, the compile program is certain to become complicated.

一方、スレッドＡをＲＡＭ２７ａに、スレッドＢをＲＡＭ２７ｂに、スレッドＣをＲＡＭ２７ｃに固定的に割り当てた場合を考える。スレッドＢの処理でスレッドＡの処理結果を利用するため、スレッドＢは、スレッドＡを割り当てられたＲＡＭ２７ａから処理結果を読み出すことになる。一方、ＲＡＭ２７ａには、スレッドＡからのデータが書き込まれる。ＲＡＭ２７ａのアドレスＸにスレッドＡの処理結果が格納されている場合、スレッドＢが読み出すタイミングが、スレッドＡがアドレスＸにデータを書き込むタイミングよりも早ければ問題は生じない。 On the other hand, let us consider a case in which the thread A is fixedly assigned to the RAM 27a, the thread B is assigned to the RAM 27b, and the thread C is fixedly assigned to the RAM 27c. Since the processing result of the thread A is used in the processing of the thread B, the thread B reads the processing result from the RAM 27a to which the thread A is assigned. On the other hand, data from the thread A is written in the RAM 27a. When the processing result of the thread A is stored at the address X of the RAM 27a, there is no problem if the timing at which the thread B reads is earlier than the timing at which the thread A writes data to the address X.

しかしながら、スレッドＡがデータを書き込むタイミングの方が早ければ、スレッドＢは、新たに書き込まれたデータを読み出すことになり、正しい処理結果を読み出すことができない。 However, if the timing at which the thread A writes data is earlier, the thread B will read the newly written data and cannot read the correct processing result.
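The overwrite problem under fixed RAM assignment can be illustrated with a toy event trace. The function name `run_events`, the event labels, and the use of a running counter as the "processing result" are assumptions for illustration; address X is modelled as a single dictionary key.

```python
def run_events(events):
    """Replay writes by thread A and reads by thread B on address X of RAM 27a.

    Returns the list of values that thread B actually read.
    """
    ram_27a = {}
    produced = 0        # number of results thread A has produced so far
    seen_by_b = []
    for event in events:
        if event == "A_writes":
            produced += 1
            ram_27a["X"] = produced          # the new result overwrites the old one
        elif event == "B_reads":
            seen_by_b.append(ram_27a["X"])
        else:
            raise ValueError(f"unknown event: {event}")
    return seen_by_b


# B reads before A writes again: B sees result 1, then result 2.
print(run_events(["A_writes", "B_reads", "A_writes", "B_reads"]))
# A writes twice before B reads: result 1 is lost and B sees only result 2.
print(run_events(["A_writes", "A_writes", "B_reads"]))
```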

この場合、スレッドＡの処理結果がＲＡＭ２７ａに書き込まれると、一旦スレッドＡの動作を中断して、スレッドＢで使用するデータをスレッドＢ用のＲＡＭ２７ｂにコピーすればよい。 In this case, when the processing result of the thread A is written in the RAM 27a, the operation of the thread A is temporarily interrupted, and the data used in the thread B may be copied to the RAM 27b for the thread B.

同様に、スレッドＢの処理結果がＲＡＭ２７ｂに書き込まれると、スレッドＢの動作を中断して、スレッドＣで使用するデータをスレッドＣ用のＲＡＭ２７ｃにコピーする。これにより、必要なデータを上書きして消失する事態を回避できる。 Similarly, when the processing result of the thread B is written in the RAM 27b, the operation of the thread B is interrupted, and the data used in the thread C is copied to the RAM 27c for the thread C. Thereby, it is possible to avoid a situation where necessary data is overwritten and lost.

このとき、データのコピーは、経路部２９を介してリコンフィギュラブル回路１２の入力にフィードバックし、データを別のＲＡＭに転送することで行われる。この間、ＡＬＵは、本来のスレッド処理とは異なる動作に使用されるため、非効率である。そこで、以下では、より効率的にマルチスレッド処理を実行する方法について説明する。具体的には、ＲＡＭに対するスレッドの割当てを、時分割的に変化させることで対応する。 At this time, copying of data is performed by feeding back to the input of the reconfigurable circuit 12 via the path unit 29 and transferring the data to another RAM. During this time, the ALU is inefficient because it is used for an operation different from the original thread processing. In the following, a method for executing multithread processing more efficiently will be described. Specifically, this is dealt with by changing the allocation of threads to the RAM in a time-sharing manner.

図１０は、メモリ部２７における複数のＲＡＭへのスレッドの割当てと、ＲＡＭの記憶領域の状態を示す。図中、ＲＡＭに入る矢印は、リコンフィギュラブル回路１２から書き込まれるデータの流れを示し、ＲＡＭから出る矢印は、リコンフィギュラブル回路１２にフィードバックされるデータの流れを示す。 FIG. 10 shows the assignment of threads to a plurality of RAMs in the memory unit 27 and the state of the RAM storage area. In the figure, an arrow entering the RAM indicates a flow of data written from the reconfigurable circuit 12, and an arrow exiting the RAM indicates a flow of data fed back to the reconfigurable circuit 12.

また、例えばＡ→Ｂは、スレッドＢに引き渡されるべきスレッドＡの処理結果を格納する領域を示す。また例えば、Ａ＿ａｒｅａは、スレッドＡ用に割り当てられるテンポラリな領域を示し、処理を実行するために必要な途中結果を格納する領域である。 Further, for example, A → B indicates an area for storing a processing result of the thread A to be delivered to the thread B. Further, for example, A_area indicates a temporary area allocated for the thread A, and is an area for storing intermediate results necessary for executing processing.

図１０（ａ）〜図１０（ｅ）に示されるＲＡＭの記憶領域の状態は、全てのスレッドが終了したときの状態をそれぞれ示している。上記した例では、３クロックで終了する単純なスレッドについて説明したが、実際には、数百クロック程度は必要となるスレッドが想定される。例えば、スレッドＡが１２０クロック、スレッドＢが１５０クロック、スレッドＣが１００クロック必要であるとすると、全てのスレッドが終了する時間は、少なくとも１５０クロックかかる。 The state of the storage area of the RAM shown in FIGS. 10A to 10E shows the state when all threads are finished. In the above example, a simple thread that ends in 3 clocks has been described. However, in reality, a thread that requires about several hundred clocks is assumed. For example, if thread A requires 120 clocks, thread B requires 150 clocks, and thread C requires 100 clocks, it takes at least 150 clocks for all threads to finish.

スレッド間でデータの受け渡しを行うため、各スレッドを確実に実行するためには、１５０クロックを単位時間として、各スレッドが実行されるようにしてもよい。本実施例では、この１５０クロックを単位時間に設定して、ＲＡＭに対するスレッドの割当ての変更を、１５０クロック毎に実行する。 Since data is exchanged between threads, in order to execute each thread reliably, each thread may be executed with 150 clocks as a unit time. In this embodiment, this 150 clock is set as a unit time, and a change in thread assignment to the RAM is executed every 150 clocks.

以下、この基準時間をサイクルとして表現する。なお、サイクルは、同時に実行するスレッドの処理時間のうち、最も時間のかかるスレッドの処理時間以上に設定される。したがって、同時に実行する複数のスレッドの処理は、１サイクルの間に終了されることになる。 Hereinafter, this reference time is expressed as a cycle. The cycle is set to be equal to or longer than the processing time of the thread that takes the longest time among the processing times of the threads that are executed simultaneously. Therefore, the processing of a plurality of threads that are executed at the same time is terminated during one cycle.
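As a small worked example of how the cycle follows from the per-thread processing times, the sketch below takes the longest thread as the cycle length and maps a clock number to its cycle. The function names `cycle_length` and `cycle_of` are hypothetical, and choosing exactly the maximum (rather than some larger value, which the text also permits) is an assumption.

```python
def cycle_length(clocks_per_thread):
    """Shortest admissible cycle: the processing time of the slowest thread."""
    return max(clocks_per_thread.values())


def cycle_of(clock, length):
    """1-based cycle number that contains the given 1-based clock."""
    return (clock - 1) // length + 1


length = cycle_length({"A": 120, "B": 150, "C": 100})
print(length)                  # clocks per cycle for the example in the text
print(cycle_of(150, length))   # last clock of the first cycle
print(cycle_of(151, length))   # first clock of the second cycle
```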

図１０（ａ）は、１サイクル目の状態を示す。１サイクル目では、スレッドＡがＲＡＭ２７ａに、スレッドＢがＲＡＭ２７ｂに、スレッドＣがＲＡＭ２７ｃに割り当てられる。１サイクル目では、ＲＡＭ２７ａに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。なお、Ａ→Ｂの前に表記される（Ｎ）は、Ｎサイクル目に生成された処理結果であることを示す。 FIG. 10A shows the state of the first cycle. In the first cycle, thread A is assigned to RAM 27a, thread B is assigned to RAM 27b, and thread C is assigned to RAM 27c. In the first cycle, an area (A → B) for storing the processing result to be delivered to the thread B in the next and subsequent cycles and an area (A_area) for storing the intermediate result of the thread A are set in the RAM 27a. Note that (N) written before A → B indicates a processing result generated in the Nth cycle.

同様に、ＲＡＭ２７ｂに、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。また、ＲＡＭ２７ｃに、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 Similarly, an area (B → C) for storing the processing result to be delivered to the thread C in the next and subsequent cycles and an area (B_area) for storing the intermediate result of the thread B are set in the RAM 27b. In addition, an area (C_area) for storing the intermediate result and processing result of the thread C is set in the RAM 27c.

図１０（ｂ）は、２サイクル目の状態を示す。２サイクル目では、スレッドＢがＲＡＭ２７ａに、スレッドＣがＲＡＭ２７ｂに、スレッドＡがＲＡＭ２７ｃに割り当てられる。ＲＡＭに対するスレッドの割当ては、制御部１８により実行される。制御部１８は、サイクル毎にスレッドの割当てを変更して、前回のサイクルとは異なるスレッドからのデータをＲＡＭに記憶させるようにする。 FIG. 10B shows the state of the second cycle. In the second cycle, thread B is assigned to RAM 27a, thread C is assigned to RAM 27b, and thread A is assigned to RAM 27c. The assignment of threads to the RAM is executed by the control unit 18. The control unit 18 changes the thread assignment for each cycle so that data from a thread different from the previous cycle is stored in the RAM.

なお、スレッドの割当てを変更する順番は、図９に示すスレッド間のデータの引渡しの関係をもとに定められる。図９に示すように、スレッドＡの処理結果がスレッドＢに引き渡される場合には、前回のサイクルでスレッドＡに割り当てられていたＲＡＭを、今回のサイクルでスレッドＢに割り当てるように定める。そのため、スレッドＢは、今回のサイクルで自身に割り当てられたＲＡＭ中に格納されるスレッドＡの処理結果を容易に利用することができる。 Note that the order in which the thread assignment is changed is determined based on the data delivery relationship between threads shown in FIG. 9. When, as shown in FIG. 9, the processing result of the thread A is handed over to the thread B, the RAM that was assigned to the thread A in the previous cycle is assigned to the thread B in the current cycle. The thread B can therefore easily use the processing result of the thread A stored in the RAM assigned to it in the current cycle.

すなわち、ＲＡＭ２７ａには、前回のサイクルでスレッドＡから引き渡されるべき処理結果が格納されているため、スレッドＢは、この処理結果を自身の処理に使用できる。同様に、前回のサイクルでスレッドＢに割り当てられていたＲＡＭは、今回のサイクルでスレッドＣに割り当てられる。 That is, since the RAM 27a stores the processing result to be delivered from the thread A in the previous cycle, the thread B can use this processing result for its own processing. Similarly, the RAM allocated to the thread B in the previous cycle is allocated to the thread C in the current cycle.

図示されるように、制御部１８は、スレッドの割当てを全てのＲＡＭに対して同時に変更させる。このタイミングは、既述したように全てのスレッドの実行が終了した後のタイミングである。このとき、スレッドの割当てを循環的に変更させることで、コンパイル処理を容易にするとともに、図９に示すスレッド間の関係にしたがったデータの引渡しを効率的に行うことが可能となる。このようにＲＡＭに対するスレッドの割当てを効率よく循環的に変更することで、処理装置１０の高速化を実現できる。 As shown in the figure, the control unit 18 changes the allocation of threads to all the RAMs at the same time. This timing is a timing after the execution of all the threads is completed as described above. At this time, by cyclically changing the thread assignment, the compiling process can be facilitated, and the data delivery according to the relationship between threads shown in FIG. 9 can be efficiently performed. In this way, the processing apparatus 10 can be speeded up by efficiently and cyclically changing the allocation of threads to the RAM.
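The cyclic reassignment described for FIGS. 10(a) to 10(d) can be written as a modular rotation. This is a minimal sketch, assuming the threads are listed in data-flow order (A hands over to B, B to C) and that there is one RAM per thread; the function name `ram_assignment` and the default arguments are hypothetical.

```python
def ram_assignment(cycle, threads=("A", "B", "C"), rams=("27a", "27b", "27c")):
    """Return {RAM: thread} for the given 1-based cycle.

    `threads` must be in data-flow order so that each consumer inherits
    the RAM its producer wrote in the previous cycle.
    """
    n = len(threads)
    return {rams[j]: threads[(j + cycle - 1) % n] for j in range(n)}


for cycle in range(1, 5):
    print(cycle, ram_assignment(cycle))
```

Under this rotation the RAM that held thread A in one cycle holds thread B in the next, so the A→B region is already in place when thread B starts and no copy between RAMs is needed.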

２サイクル目では、ＲＡＭ２７ａに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。スレッドＡの処理結果およびスレッドＢの途中結果は、スレッドＢの実行のためにリコンフィギュラブル回路１２の入力に読み出される。なお、スレッドＡの処理結果が読み出されて、その後、スレッドＢで使用しないことが分かっている場合には、スレッドＡの処理結果を格納した領域（Ａ→Ｂ）を開放して、データの書込みを許してもよい。 In the second cycle, the RAM 27a is given an area (A→B) holding the processing result of the thread A generated in the previous cycle, an area (B→C) for storing the processing result to be delivered to the thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate results of the thread B. The processing result of the thread A and the intermediate results of the thread B are read out to the input of the reconfigurable circuit 12 for the execution of the thread B. If the processing result of the thread A has been read and it is known that the thread B will not use it again, the area (A→B) holding that result may be released so that data can be written to it.

同様に、ＲＡＭ２７ｂに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。スレッドＢの処理結果およびスレッドＣの途中結果は、スレッドＣの実行のためにリコンフィギュラブル回路１２の入力に読み出される。なお、スレッドＢの処理結果が読み出されて、その後、スレッドＣで使用しないことが分かっている場合には、スレッドＢの処理結果を格納した領域（Ｂ→Ｃ）を開放して、データの書込みを許してもよい。 Similarly, the RAM 27b is given an area (B→C) holding the processing result of the thread B generated in the previous cycle and an area (C_area) for storing the intermediate results and processing result of the thread C. The processing result of the thread B and the intermediate results of the thread C are read out to the input of the reconfigurable circuit 12 for the execution of the thread C. If the processing result of the thread B has been read and it is known that the thread C will not use it again, the area (B→C) holding that result may be released so that data can be written to it.

また、ＲＡＭ２７ｃに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。スレッドＡの途中結果は、スレッドＡの実行のためにリコンフィギュラブル回路１２の入力に読み出される。 Further, an area (A → B) for storing the processing result to be delivered to the thread B in the next and subsequent cycles and an area (A_area) for storing the intermediate result of the thread A are set in the RAM 27c. The intermediate result of the thread A is read to the input of the reconfigurable circuit 12 for the execution of the thread A.

図１０（ｃ）は、３サイクル目の状態を示す。３サイクル目では、スレッドＣがＲＡＭ２７ａに、スレッドＡがＲＡＭ２７ｂに、スレッドＢがＲＡＭ２７ｃに割り当てられる。３サイクル目では、ＲＡＭ２７ａに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 FIG. 10C shows the state of the third cycle. In the third cycle, thread C is assigned to RAM 27a, thread A is assigned to RAM 27b, and thread B is assigned to RAM 27c. In the third cycle, an area for storing the processing result of the thread B generated in the previous cycle (B → C) and an area for storing the intermediate result of the thread C and the processing result (C_area) are set in the RAM 27a. .

なお、ＲＡＭ２７ａにおいて、２サイクル目で設定されていた領域（Ａ→Ｂ）は、２サイクル目でデータの読出しが終了しているため、この例ではＣ＿ａｒｅａとして利用されている。これにより、ＲＡＭ２７ａを効率的に利用することができる。 Note that, in the RAM 27a, the area (A → B) set in the second cycle is used as C_area in this example because data reading is completed in the second cycle. Thereby, the RAM 27a can be used efficiently.

同様にＲＡＭ２７ｂに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。ＲＡＭ２７ｂにおいて、２サイクル目で設定されていた領域（Ｂ→Ｃ）は、２サイクル目でデータの読出しが終了しているため、この例ではＡ＿ａｒｅａとして利用されている。 Similarly, an area (A → B) for storing the processing result to be delivered to the thread B in the next and subsequent cycles and an area (A_area) for storing the intermediate result of the thread A are set in the RAM 27b. In the RAM 27b, the area (B → C) set in the second cycle is used as A_area in this example because the data reading is completed in the second cycle.

またＲＡＭ２７ｃに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）と、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。 Further, the RAM 27c has an area for storing the processing result of the thread A generated in the previous cycle (A → B), an area for storing the processing result to be delivered to the thread C in the next and subsequent cycles (B → C), An area (B_area) for storing an intermediate result of thread B is set.

図１０（ｄ）は、４サイクル目の状態を示す。４サイクル目では、１サイクル目と同様に、スレッドＡがＲＡＭ２７ａに、スレッドＢがＲＡＭ２７ｂに、スレッドＣがＲＡＭ２７ｃに割り当てられる。４サイクル目では、ＲＡＭ２７ａに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。 FIG. 10D shows the state of the fourth cycle. In the fourth cycle, as in the first cycle, thread A is allocated to the RAM 27a, thread B is allocated to the RAM 27b, and thread C is allocated to the RAM 27c. In the fourth cycle, an area (A → B) for storing the processing result to be delivered to the thread B in the next and subsequent cycles and an area (A_area) for storing the intermediate result of the thread A are set in the RAM 27a.

同様にＲＡＭ２７ｂに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）と、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。またＲＡＭ２７ｃに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 Similarly, an area (A → B) storing the processing result of thread A generated in the previous cycle, an area (B → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate result of thread B are set in the RAM 27b. In addition, an area (B → C) storing the processing result of thread B generated in the previous cycle and an area (C_area) for storing the intermediate result and processing result of thread C are set in the RAM 27c.

図１０（ｅ）は、５サイクル目の状態を示す。５サイクル目では、２サイクル目と同様に、スレッドＢがＲＡＭ２７ａに、スレッドＣがＲＡＭ２７ｂに、スレッドＡがＲＡＭ２７ｃに割り当てられる。なお、ＲＡＭの記憶領域の状態についても、図１０（ｂ）に示す２サイクル目の状態と同一である。以後、図１０（ｂ）、図１０（ｃ）、図１０（ｄ）に示す状態をサイクリックに繰り返して、図９に示す処理が継続されることになる。 FIG. 10E shows the state of the fifth cycle. In the fifth cycle, as in the second cycle, thread B is assigned to RAM 27a, thread C is assigned to RAM 27b, and thread A is assigned to RAM 27c. The state of the RAM storage areas is also the same as the state of the second cycle shown in FIG. 10B. Thereafter, the states shown in FIGS. 10B, 10C, and 10D are repeated cyclically, and the processing shown in FIG. 9 continues.
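The rotation of thread assignments described for FIG. 10 can be illustrated with a minimal Python sketch. The function name `assignment` and the string labels are hypothetical and chosen only for illustration; the sketch models nothing beyond which thread is assigned to which RAM in a given cycle.

```python
# Illustrative sketch only: models the per-cycle rotation of threads over RAMs 27a-27c.
THREADS = ["A", "B", "C"]
RAMS = ["27a", "27b", "27c"]

def assignment(cycle):
    """Return {ram: thread} for a 1-based cycle number."""
    shift = (cycle - 1) % len(THREADS)
    return {ram: THREADS[(k + shift) % len(THREADS)] for k, ram in enumerate(RAMS)}

if __name__ == "__main__":
    for cycle in range(1, 6):
        print(cycle, assignment(cycle))
```

Under this model the assignment has a period of three cycles, which matches the statement that the fourth and fifth cycles use the same assignments as the first and second cycles.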

図１１は、スレッド間のデータの受け渡しの別の例を示す図である。スレッドＡは例えば外部からの入力ないしは設定データによる入力を受けて実行される。スレッドＢはスレッドＡの処理結果を利用し、スレッドＣはスレッドＡの処理結果およびスレッドＢの処理結果を利用して、自身の処理結果を出力する。各スレッドは独立した処理を行い、図７に示すようにマルチスレッドで実行することができる。図９および図１０に関して示した処理と同様に、マルチスレッド処理では、ＲＡＭに対するスレッドの割当てを、時分割的に変化させる。 FIG. 11 is a diagram illustrating another example of data transfer between threads. For example, the thread A is executed in response to input from the outside or input by setting data. Thread B uses the processing result of thread A, and thread C outputs the processing result of itself using the processing result of thread A and the processing result of thread B. Each thread performs independent processing and can be executed in multiple threads as shown in FIG. Similar to the processing shown with reference to FIGS. 9 and 10, in multithread processing, thread allocation to the RAM is changed in a time-sharing manner.

図１２および図１３は、メモリ部２７における複数のＲＡＭへのスレッドの割当てと、ＲＡＭの記憶領域の状態を示す。図中、ＲＡＭに入る矢印は、リコンフィギュラブル回路１２から書き込まれるデータの流れを示し、ＲＡＭから出る矢印は、リコンフィギュラブル回路１２にフィードバックされるデータの流れを示す。 12 and 13 show the assignment of threads to a plurality of RAMs in the memory unit 27 and the state of the RAM storage area. In the figure, an arrow entering the RAM indicates a flow of data written from the reconfigurable circuit 12, and an arrow exiting the RAM indicates a flow of data fed back to the reconfigurable circuit 12.

また、例えばＡ→Ｂは、スレッドＢに引き渡されるべきスレッドＡの処理結果を格納する領域を示す。また例えば、Ａ＿ａｒｅａは、スレッドＡ用に割り当てられるテンポラリな領域を示し、処理を実行するために必要な途中結果を格納する領域である。図１２（ａ）〜図１２（ｄ）および図１３（ａ）〜図１３（ｄ）に示されるＲＡＭの記憶領域の状態は、全てのスレッドが終了したときの状態をそれぞれ示している。図１２および図１３におけるスレッドの割当ての変更は、制御部１８により実行される。 Further, for example, A → B indicates an area for storing a processing result of the thread A to be delivered to the thread B. Further, for example, A_area indicates a temporary area allocated for the thread A, and is an area for storing intermediate results necessary for executing processing. The states of the RAM storage areas shown in FIGS. 12A to 12D and FIGS. 13A to 13D show the states when all the threads are finished. Changes in thread assignment in FIGS. 12 and 13 are executed by the control unit 18.

図１２（ａ）は、１サイクル目の状態を示す。１サイクル目では、スレッドＡがＲＡＭ２７ａに、スレッドＢがＲＡＭ２７ｂに、スレッドＣがＲＡＭ２７ｃに割り当てられる。１サイクル目では、ＲＡＭ２７ａに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）と、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ａ→Ｃ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。 FIG. 12A shows the state of the first cycle. In the first cycle, thread A is assigned to RAM 27a, thread B is assigned to RAM 27b, and thread C is assigned to RAM 27c. In the first cycle, an area (A → B) for storing the processing result to be delivered to thread B in the next and subsequent cycles, an area (A → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (A_area) for storing the intermediate result of thread A are set in the RAM 27a.

図１２（ｂ）は、２サイクル目の状態を示す。２サイクル目では、スレッドＢがＲＡＭ２７ａに、スレッドＣがＲＡＭ２７ｂに、スレッドＡがＲＡＭ２７ｃに割り当てられる。２サイクル目では、ＲＡＭ２７ａに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）、（Ａ→Ｃ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。 FIG. 12B shows the state of the second cycle. In the second cycle, thread B is assigned to RAM 27a, thread C is assigned to RAM 27b, and thread A is assigned to RAM 27c. In the second cycle, the areas (A → B) and (A → C) storing the processing results of thread A generated in the previous cycle, an area (B → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate result of thread B are set in the RAM 27a.

スレッドＢに引き渡されるべきスレッドＡの処理結果およびスレッドＢの途中結果は、スレッドＢの実行のためにリコンフィギュラブル回路１２の入力に読み出される。領域（Ａ→Ｃ）に格納されているスレッドＡの処理結果は、次のサイクルで使用されるため、領域（Ａ→Ｃ）への書き込みは禁止される。 The processing result of the thread A to be delivered to the thread B and the intermediate result of the thread B are read to the input of the reconfigurable circuit 12 for the execution of the thread B. Since the processing result of the thread A stored in the area (A → C) is used in the next cycle, writing to the area (A → C) is prohibited.

なお、領域（Ａ→Ｂ）に格納されているスレッドＡの処理結果が読み出されて、その後、スレッドＢで使用しないことが分かっている場合には、領域（Ａ→Ｂ）を開放して、データの書込みを許してもよい。 If the processing result of the thread A stored in the area (A → B) is read and then it is known that the thread B will not use it, the area (A → B) is released. Data writing may be allowed.

同様に、ＲＡＭ２７ｂに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。スレッドＢの処理結果およびスレッドＣの途中結果は、スレッドＣの実行のためにリコンフィギュラブル回路１２の入力に読み出される。 Similarly, an area for storing the processing result of the thread B generated in the previous cycle (B → C) and an area for storing the intermediate result of the thread C and the processing result (C_area) are set in the RAM 27b. The processing result of the thread B and the intermediate result of the thread C are read to the input of the reconfigurable circuit 12 for the execution of the thread C.

なお、スレッドＢの処理結果が読み出されて、その後、スレッドＣで使用しないことが分かっている場合には、スレッドＢの処理結果を格納した領域（Ｂ→Ｃ）を開放して、データの書込みを許してもよい。 If, after the processing result of thread B has been read out, it is known that thread C will not use it again, the area (B → C) storing the processing result of thread B may be released so that data can be written to it.

また、ＲＡＭ２７ｃに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ａ→Ｃ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。スレッドＡの途中結果は、スレッドＡの実行のためにリコンフィギュラブル回路１２の入力に読み出される。 In addition, an area (A → B) for storing the processing result to be delivered to thread B in the next and subsequent cycles, an area (A → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (A_area) for storing the intermediate result of thread A are set in the RAM 27c. The intermediate result of thread A is read out to the input of the reconfigurable circuit 12 for the execution of thread A.

図１２（ｃ）は、３サイクル目の状態を示す。３サイクル目では、スレッドＣがＲＡＭ２７ａに、スレッドＡがＲＡＭ２７ｂに、スレッドＢがＲＡＭ２７ｃに割り当てられる。３サイクル目では、ＲＡＭ２７ａに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、前々回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｃ）、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 FIG. 12C shows the state of the third cycle. In the third cycle, thread C is assigned to RAM 27a, thread A is assigned to RAM 27b, and thread B is assigned to RAM 27c. In the third cycle, an area (B → C) storing the processing result of thread B generated in the previous cycle, an area (A → C) storing the processing result of thread A generated in the cycle before last, and an area (C_area) for storing the intermediate result and processing result of thread C are set in the RAM 27a.

同様にＲＡＭ２７ｂに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ａ→Ｃ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。ＲＡＭ２７ｂにおいて、２サイクル目で設定されていた領域（Ｂ→Ｃ）は、２サイクル目でデータの読出しが終了しているため、この例ではＡ＿ａｒｅａとして利用されている。 Similarly, an area (A → B) for storing the processing result to be delivered to thread B in the next and subsequent cycles, an area (A → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (A_area) for storing the intermediate result of thread A are set in the RAM 27b. In the RAM 27b, the area (B → C) that was set in the second cycle is used as A_area in this example, because reading of its data was completed in the second cycle.

またＲＡＭ２７ｃに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）、（Ａ→Ｃ）と、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。 In addition, the areas (A → B) and (A → C) storing the processing results of thread A generated in the previous cycle, an area (B → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate result of thread B are set in the RAM 27c.

図１２（ｄ）は、４サイクル目の状態を示す。４サイクル目では、１サイクル目と同様に、スレッドＡがＲＡＭ２７ａに、スレッドＢがＲＡＭ２７ｂに、スレッドＣがＲＡＭ２７ｃに割り当てられる。４サイクル目では、ＲＡＭ２７ａに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ａ→Ｃ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。 FIG. 12D shows the state of the fourth cycle. In the fourth cycle, as in the first cycle, thread A is assigned to RAM 27a, thread B is assigned to RAM 27b, and thread C is assigned to RAM 27c. In the fourth cycle, an area (A → B) for storing the processing result to be delivered to thread B in the next and subsequent cycles, an area (A → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (A_area) for storing the intermediate result of thread A are set in the RAM 27a.

同様にＲＡＭ２７ｂに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）、（Ａ→Ｃ）と、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。 Similarly, the areas (A → B) and (A → C) storing the processing results of thread A generated in the previous cycle, an area (B → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate result of thread B are set in the RAM 27b.

またＲＡＭ２７ｃに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）と、前々回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 In addition, an area (B → C) storing the processing result of thread B generated in the previous cycle, an area (A → C) storing the processing result of thread A generated in the cycle before last, and an area (C_area) for storing the intermediate result and processing result of thread C are set in the RAM 27c.

図１３（ａ）は、５サイクル目の状態を示す。５サイクル目では、２サイクル目と同様に、スレッドＢがＲＡＭ２７ａに、スレッドＣがＲＡＭ２７ｂに、スレッドＡがＲＡＭ２７ｃに割り当てられる。５サイクル目では、ＲＡＭ２７ａに、前回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｂ）、（Ａ→Ｃ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ｂ→Ｃ）と、スレッドＢの途中結果を格納する領域（Ｂ＿ａｒｅａ）が設定される。 FIG. 13A shows the state of the fifth cycle. In the fifth cycle, as in the second cycle, thread B is assigned to RAM 27a, thread C is assigned to RAM 27b, and thread A is assigned to RAM 27c. In the fifth cycle, the areas (A → B) and (A → C) storing the processing results of thread A generated in the previous cycle, an area (B → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (B_area) for storing the intermediate result of thread B are set in the RAM 27a.

同様に、ＲＡＭ２７ｂに、前回のサイクルで生成されたスレッドＢの処理結果を格納する領域（Ｂ→Ｃ）、前々回のサイクルで生成されたスレッドＡの処理結果を格納する領域（Ａ→Ｃ）と、スレッドＣの途中結果および処理結果を格納する領域（Ｃ＿ａｒｅａ）が設定される。 Similarly, an area (B → C) storing the processing result of thread B generated in the previous cycle, an area (A → C) storing the processing result of thread A generated in the cycle before last, and an area (C_area) for storing the intermediate result and processing result of thread C are set in the RAM 27b.

また、ＲＡＭ２７ｃに、次回以降のサイクルでスレッドＢに引き渡すべき処理結果を格納する領域（Ａ→Ｂ）、次回以降のサイクルでスレッドＣに引き渡すべき処理結果を格納する領域（Ａ→Ｃ）と、スレッドＡの途中結果を格納する領域（Ａ＿ａｒｅａ）が設定される。 In addition, an area (A → B) for storing the processing result to be delivered to thread B in the next and subsequent cycles, an area (A → C) for storing the processing result to be delivered to thread C in the next and subsequent cycles, and an area (A_area) for storing the intermediate result of thread A are set in the RAM 27c.

図１３（ｂ）は、６サイクル目の状態を示す。６サイクル目では、３サイクル目と同様に、スレッドＣがＲＡＭ２７ａに、スレッドＡがＲＡＭ２７ｂに、スレッドＢがＲＡＭ２７ｃに割り当てられる。なお、ＲＡＭの記憶領域の状態についても、図１２（ｃ）に示す３サイクル目の状態と同一である。 FIG. 13B shows the state of the sixth cycle. In the sixth cycle, as in the third cycle, thread C is assigned to RAM 27a, thread A is assigned to RAM 27b, and thread B is assigned to RAM 27c. The state of the RAM storage areas is also the same as the state of the third cycle shown in FIG. 12C.

図１３（ｃ）は、７サイクル目の状態を示す。７サイクル目では、４サイクル目と同様に、スレッドＡがＲＡＭ２７ａに、スレッドＢがＲＡＭ２７ｂに、スレッドＣがＲＡＭ２７ｃに割り当てられる。なお、ＲＡＭの記憶領域の状態についても、図１２（ｄ）に示す４サイクル目の状態と同一である。 FIG. 13C shows the state of the seventh cycle. In the seventh cycle, as in the fourth cycle, thread A is assigned to RAM 27a, thread B is assigned to RAM 27b, and thread C is assigned to RAM 27c. The state of the RAM storage areas is also the same as the state of the fourth cycle shown in FIG. 12D.

図１３（ｄ）は、８サイクル目の状態を示す。８サイクル目では、５サイクル目と同様に、スレッドＢがＲＡＭ２７ａに、スレッドＣがＲＡＭ２７ｂに、スレッドＡがＲＡＭ２７ｃに割り当てられる。なお、ＲＡＭの記憶領域の状態についても、図１３（ａ）に示す５サイクル目の状態と同一である。以後、図１２（ｃ）、図１２（ｄ）、図１３（ａ）に示す状態をサイクリックに繰り返して、図１１に示す処理が継続されることになる。 FIG. 13D shows the state of the eighth cycle. In the eighth cycle, as in the fifth cycle, thread B is assigned to RAM 27a, thread C is assigned to RAM 27b, and thread A is assigned to RAM 27c. The state of the RAM storage areas is also the same as the state of the fifth cycle shown in FIG. 13A. Thereafter, the states shown in FIGS. 12C, 12D, and 13A are repeated cyclically, and the processing shown in FIG. 11 continues.
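The region lifetimes described for FIGS. 12 and 13 can be checked with a small simulation. The following Python sketch is an illustration under stated assumptions, not the control logic of the control unit 18: it assumes that a hand-over region X->Y stays in a RAM until the cycle after thread Y has owned that RAM, and that the owning thread rewrites its own output regions every cycle. The names `CONSUMERS`, `owner`, and `simulate` are hypothetical.

```python
# Illustrative sketch only; region retention rule is an assumption stated in the lead-in.
# Consumers of each thread's result, following the FIG. 11 dependency
# (A feeds B and C, B feeds C).
CONSUMERS = {"A": ["B", "C"], "B": ["C"], "C": []}
THREADS = ["A", "B", "C"]
RAMS = ["27a", "27b", "27c"]

def owner(ram_index, cycle):
    """Thread assigned to RAM number ram_index in a 1-based cycle."""
    return THREADS[(ram_index + cycle - 1) % len(THREADS)]

def simulate(cycles):
    """Return a list (index 0 = cycle 1) of {ram: set of region names}."""
    held = {ram: set() for ram in RAMS}       # hand-over regions as (src, dst)
    prev_owner = {ram: None for ram in RAMS}
    history = []
    for cycle in range(1, cycles + 1):
        snapshot = {}
        for k, ram in enumerate(RAMS):
            cur = owner(k, cycle)
            # Drop regions whose consumer owned this RAM in the previous cycle.
            kept = {(s, d) for (s, d) in held[ram] if d != prev_owner[ram]}
            # The current owner sets up the regions it hands to later threads.
            kept |= {(cur, d) for d in CONSUMERS[cur]}
            held[ram] = kept
            prev_owner[ram] = cur
            snapshot[ram] = {f"{s}->{d}" for s, d in kept} | {f"{cur}_area"}
        history.append(snapshot)
    return history

if __name__ == "__main__":
    for n, snap in enumerate(simulate(8), start=1):
        print(n, {ram: sorted(regions) for ram, regions in snap.items()})
```

Under these assumptions the states of cycles 3, 4, and 5 repeat from cycle 6 onward, while cycles 1 and 2 differ from cycles 4 and 5 because the RAMs start empty.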

また、図９又は図１１に示すスレッド間の関係は、他のスレッドからスレッドＡへのフィードバックデータがない場合のものであった。図１４（ａ）および図１４（ｂ）は、スレッドＡへのフィードバックデータが存在する場合のスレッド間のデータの受け渡しの例を示す図である。 Further, the relationship between threads shown in FIG. 9 or FIG. 11 is the case where there is no feedback data from other threads to the thread A. FIGS. 14A and 14B are diagrams illustrating an example of data transfer between threads when feedback data to the thread A exists.

スレッドＡへのフィードバックデータが存在する場合であっても、図９および図１１に関連して説明したように、ＲＡＭに対するスレッドの割当てをサイクル毎に変更することによって、効率的なリコンフィギュラブル回路１２のコンフィギュレーションを実現することが可能である。 Even when feedback data to thread A exists, an efficient configuration of the reconfigurable circuit 12 can be realized by changing the assignment of threads to the RAMs every cycle, as described with reference to FIGS. 9 and 11.

また、各スレッド間でデータの受け渡しの競合の無い構成を実現することで、データ受け渡しの調停回路を不要とできる。これにより、回路の小型化および低消費電力化が可能な技術を提供できる。 In addition, by realizing a configuration in which there is no data transfer contention among the threads, a data transfer arbitration circuit can be eliminated. As a result, it is possible to provide a technology capable of downsizing the circuit and reducing power consumption.

図１５は、図８に示す構成の別の例を示す図である。上記図８では、スレッドＡの処理結果がスレッドＢに引き渡される場合には、前回のサイクルでスレッドＡに割り当てられていたＲＡＭは、今回のサイクルでスレッドＢに割り当てられる。そして、スレッドＢの処理結果がスレッドＣに引き渡される場合には、前回のサイクルでスレッドＢに割り当てられていたＲＡＭは、今回のサイクルでスレッドＣに割り当てられる。 FIG. 15 is a diagram illustrating another example of the configuration illustrated in FIG. In FIG. 8, when the processing result of the thread A is delivered to the thread B, the RAM allocated to the thread A in the previous cycle is allocated to the thread B in the current cycle. When the processing result of the thread B is delivered to the thread C, the RAM assigned to the thread B in the previous cycle is assigned to the thread C in the current cycle.

このように、第１切替部２３がＲＡＭに割り当てられるスレッドをサイクル毎に順次切替（スレッドＡ→スレッドＢ→スレッドＣの順番で切替）を実行することにより、当該ＲＡＭにおいて一方のスレッドの処理結果が他方のスレッドに引き渡されている。このため、第１切替部２３は、当該一方のスレッド（ここではスレッドＡ）から他方のスレッド（ここではスレッドＣ）にデータを引き渡すには、最大（スレッド数−１）回（ここでは２回）の切替をしなければならず、処理時間を増大させていた。この点を解消するために図１５は以下に示す構成を備えている。以下詳細に説明する。 In this way, the first switching unit 23 switches the thread assigned to each RAM in sequence every cycle (in the order thread A → thread B → thread C), so that the processing result of one thread is handed over to another thread within that RAM. Consequently, to pass data from one thread (here, thread A) to another thread (here, thread C), the first switching unit 23 has to switch up to (number of threads − 1) times (here, twice), which increases the processing time. To eliminate this problem, the configuration of FIG. 15 is provided as follows. The details are described below.
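The switching count mentioned above for the rotating assignment of FIG. 8 can be expressed as a short calculation. The sketch below is illustrative only; the function name `switches_needed` is hypothetical, and it assumes the fixed rotation order thread A → thread B → thread C.

```python
# Illustrative sketch only: assumes the fixed rotation order A -> B -> C.
THREADS = ["A", "B", "C"]

def switches_needed(src, dst, threads=THREADS):
    """Number of assignment switches before a RAM written by src is owned by dst."""
    n = len(threads)
    return (threads.index(dst) - threads.index(src)) % n

if __name__ == "__main__":
    print(switches_needed("A", "B"))  # one switch
    print(switches_needed("A", "C"))  # two switches, the maximum for three threads
```

For three threads the largest value over distinct thread pairs is two, that is, (number of threads − 1).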

図１５に示すように、集積回路装置２６は、リコンフィギュラブル回路１２と、第１切替部２３と、第３切替部２８と、スレッドＡＢ用記憶ユニット６０と、スレッドＢＣ用記憶ユニット７０と、スレッドＣＡ用記憶ユニット８０とを備えている。 As shown in FIG. 15, the integrated circuit device 26 includes a reconfigurable circuit 12, a first switching unit 23, a third switching unit 28, a thread AB storage unit 60, a thread BC storage unit 70, And a thread CA storage unit 80.

なお、図８に示す第１切替部２３は、図１５に示す第１切替回路２３及び第４切替部（第４切替部６１，７１，８１）に対応する。すなわち、図８に示す第１切替回路２３は、リコンフィギュラブル回路１２から出力される各スレッドの選択、及びＲＡＭ２７のそれぞれに対応するスレッドの選択を実行しているが、図１５に示す第１切替回路２３はリコンフィギュラブル回路１２から出力される各スレッドの選択を実行し、図１５に示す第４切替部（第４切替部６１，７１，８１）はそれぞれのＲＡＭに対応するスレッドの選択を実行している。 The first switching unit 23 shown in FIG. 8 corresponds to the first switching circuit 23 and the fourth switching units (fourth switching units 61, 71, 81) shown in FIG. 15. That is, the first switching circuit 23 shown in FIG. 8 performs both the selection of each thread output from the reconfigurable circuit 12 and the selection of the thread corresponding to each of the RAMs 27, whereas the first switching circuit 23 shown in FIG. 15 performs the selection of each thread output from the reconfigurable circuit 12, and the fourth switching units (fourth switching units 61, 71, 81) shown in FIG. 15 perform the selection of the thread corresponding to each RAM.

例えば、第１切替部２３ａはスレッドＡの出力を選択し、第１切替部２３ｂはスレッドＢの出力を選択し、第１切替部２３ｃはスレッドＣの出力を選択する。 For example, the first switching unit 23a selects the output of the thread A, the first switching unit 23b selects the output of the thread B, and the first switching unit 23c selects the output of the thread C.

また、第４切替部６１はＲＡＭ６４ａ，ＲＡＭ６４ｂのそれぞれに対応するスレッドＡ又はスレッドＢの出力を供給する。また、第４切替部７１はＲＡＭ７４ａ,ＲＡＭ７４ｂのそれぞれに対応するスレッドＢ又はスレッドＣの出力を供給する。さらに、第４切替部８１はＲＡＭ８４ａ,ＲＡＭ８４ｂのそれぞれに対応するスレッドＣ又はスレッドＡの出力を供給する。 The fourth switching unit 61 supplies the output of the thread A or thread B corresponding to each of the RAM 64a and RAM 64b. The fourth switching unit 71 supplies the output of the thread B or thread C corresponding to each of the RAM 74a and RAM 74b. Further, the fourth switching unit 81 supplies the output of the thread C or thread A corresponding to each of the RAM 84a and RAM 84b.

また、図８に示す第２切替部２５は、図１５に示す第５切替部（第５切替部６５，７５，８５）に対応する。なお、図８に示す構成と共通する説明については省略する。 Further, the second switching unit 25 shown in FIG. 8 corresponds to the fifth switching unit (fifth switching units 65, 75, 85) shown in FIG. The description common to the configuration shown in FIG. 8 is omitted.

記憶ユニット６０は、第４切替部６１と、ＲＡＭ６２と、ＲＡＭ６３と、ＲＡＭ６４ａ，ＲＡＭ６４ｂ（一対の記憶手段）と、第５切替部６５と、第６切替部６６と、第６切替部６７とを備えている。 The storage unit 60 includes a fourth switching unit 61, a RAM 62, a RAM 63, a RAM 64a and a RAM 64b (a pair of storage means), a fifth switching unit 65, a sixth switching unit 66, and a sixth switching unit 67.

ＲＡＭ６２には、スレッドＡのみが固定的に割り当てられている。同様にして、ＲＡＭ６３には、スレッドＢのみが固定的に割り当てられている。ここで、本実施形態では、上記ＲＡＭ６２及びＲＡＭ６４ａ，ＲＡＭ６４ｂのそれぞれは、リコンフィギュラブル回路１２からの所定スレッド（ここではスレッドＡ）の出力を共通して記憶可能に構成されており、情報記憶ユニットを構成している。また、ＲＡＭ６３及びＲＡＭ６４ａ，ＲＡＭ６４ｂのそれぞれは、リコンフィギュラブル回路１２からの所定スレッド（ここではスレッドＢ）の出力を共通して記憶可能に構成されており、情報記憶ユニットを構成している。 Only thread A is fixedly assigned to the RAM 62. Similarly, only thread B is fixedly assigned to the RAM 63. In the present embodiment, the RAM 62, the RAM 64a, and the RAM 64b are each configured to be able to store, in common, the output of a predetermined thread (here, thread A) from the reconfigurable circuit 12, and together constitute an information storage unit. Likewise, the RAM 63, the RAM 64a, and the RAM 64b are each configured to be able to store, in common, the output of a predetermined thread (here, thread B) from the reconfigurable circuit 12, and together constitute an information storage unit.

ＲＡＭ６４ａには、２つのスレッドのうちの一方のスレッドが割り当てられており、一方のスレッドの割り当ては、所定サイクル毎に他方のスレッドへ割り当てを切り替えられる。同様にして、ＲＡＭ６４ｂには、２つのスレッドのうちの一方のスレッドが割り当てられており、一方のスレッドの割り当ては、所定サイクル毎に他方のスレッドへ割り当てを切り替えられる。なお、スレッドの割り当ての切り替えは、第４切替部６１及び第５切替部６５により実行される。すなわち、第４切替部６１は、ＲＡＭ６４ａ，ＲＡＭ６４ｂの入力側でスレッドの割り当てを切り替え、第５切替部６５は、ＲＡＭ６４ａ，ＲＡＭ６４ｂの出力側でスレッドの割り当てを切り替える。 One of the two threads is allocated to the RAM 64a, and the allocation of one thread can be switched to the other thread every predetermined cycle. Similarly, one of the two threads is allocated to the RAM 64b, and the allocation of one thread can be switched to the other thread every predetermined cycle. The switching of the thread assignment is executed by the fourth switching unit 61 and the fifth switching unit 65. That is, the fourth switching unit 61 switches thread assignment on the input side of the RAM 64a and RAM 64b, and the fifth switching unit 65 switches thread assignment on the output side of the RAM 64a and RAM 64b.

例えば、ＲＡＭ６４ａにスレッドＡ及びスレッドＢのうちの一方のスレッドＡが割り当てられている場合には、所定サイクルが経過した後に該スレッドＡの割り当ては他方のスレッドＢへ割り当てを切り替えられる。さらに所定サイクルが経過した後は、スレッドＢの割り当ては他方のスレッドＡへ割り当てを切り替えられる。 For example, when one of the threads A and B is assigned to the RAM 64a, the assignment of the thread A can be switched to the other thread B after a predetermined cycle has elapsed. Further, after a predetermined cycle elapses, the assignment of the thread B can be switched to the other thread A.

また、ＲＡＭ６４ａ，ＲＡＭ６４ｂには、異なるスレッドがそれぞれに割り当てられている場合には、ＲＡＭ６４ａ，ＲＡＭ６４ｂのそれぞれに対応するスレッドの割り当ては、所定サイクル毎に互いに切り替えられてもよい。なお、スレッドの割り当ての切り替えは、第４切替部６１及び第５切替部６５により実行される。 When different threads are assigned to the RAM 64a and RAM 64b, the assignment of threads corresponding to the RAM 64a and RAM 64b may be switched to each other every predetermined cycle. The switching of the thread assignment is executed by the fourth switching unit 61 and the fifth switching unit 65.

例えば、ＲＡＭ６４ａにスレッドＡが割り当てられ、ＲＡＭ６４ｂにスレッドＢが割り当てられている場合には、所定サイクルが経過した後に、ＲＡＭ６４ａに対応するスレッドＡの割り当てがスレッドＢの割り当てに切り替えられるとともに、ＲＡＭ６４ｂに対応するスレッドＢの割り当てがスレッドＡの割り当てに切り替えられる。さらに所定サイクルが経過した後は、ＲＡＭ６４ａに対応するスレッドＢの割り当てがスレッドＡの割り当てに切り替えられるとともに、ＲＡＭ６４ｂに対応するスレッドＡの割り当てがスレッドＢの割り当てに切り替えられる。 For example, when the thread A is assigned to the RAM 64a and the thread B is assigned to the RAM 64b, the thread A corresponding to the RAM 64a is switched to the thread B assignment after a predetermined cycle, and the RAM 64b The corresponding thread B assignment is switched to thread A assignment. Further, after a predetermined cycle elapses, the assignment of the thread B corresponding to the RAM 64a is switched to the assignment of the thread A, and the assignment of the thread A corresponding to the RAM 64b is changed to the assignment of the thread B.

したがって、第４切替部６１は、第１切替部２３ａのスレッドＡの出力と、第１切替部２３ｂのスレッドＢの出力とが入力されると、第１切替部２３ａのスレッドＡの出力をＲＡＭ６４ａに出力し、第１切替部２３ｂのスレッドＢの出力をＲＡＭ６４ｂに出力する。 Therefore, when the output of the thread A of the first switching unit 23a and the output of the thread B of the first switching unit 23b are input, the fourth switching unit 61 outputs the output of the thread A of the first switching unit 23a to the RAM 64a. And the output of the thread B of the first switching unit 23b is output to the RAM 64b.

また、所定のサイクルが経過した後は、ＲＡＭ６４ａに対応するスレッドＡの割り当てがスレッドＢの割り当てに切り替えられ、ＲＡＭ６４ｂに対応するスレッドＢの割り当てがスレッドＡの割り当てに切り替えられる。そして、第４切替部６１は、第１切替部２３ｂのスレッドＢの出力をＲＡＭ６４ａに出力し、第１切替部２３ａのスレッドＡの出力をＲＡＭ６４ｂに出力する。 Further, after a predetermined cycle elapses, the assignment of the thread A corresponding to the RAM 64a is switched to the assignment of the thread B, and the assignment of the thread B corresponding to the RAM 64b is changed to the assignment of the thread A. Then, the fourth switching unit 61 outputs the output of the thread B of the first switching unit 23b to the RAM 64a, and outputs the output of the thread A of the first switching unit 23a to the RAM 64b.

第５切替部６５は、ＲＡＭ６４ａにスレッドＡが割り当てられ、ＲＡＭ６４ｂにスレッドＢが割り当てられている場合には、ＲＡＭ６４ａの出力を第６切替部６６を通してリコンフィギュラブル回路１２におけるスレッドＡに対応する処理に出力し、ＲＡＭ６４ｂの出力を第６切替部６７を通してリコンフィギュラブル回路１２におけるスレッドＢに対応する処理へ出力する。 When thread A is assigned to the RAM 64a and thread B is assigned to the RAM 64b, the fifth switching unit 65 outputs the output of the RAM 64a, through the sixth switching unit 66, to the processing corresponding to thread A in the reconfigurable circuit 12, and outputs the output of the RAM 64b, through the sixth switching unit 67, to the processing corresponding to thread B in the reconfigurable circuit 12.

また、所定のサイクルが経過した後は、ＲＡＭ６４ａに対応するスレッドＡの割り当てがスレッドＢの割り当てに切り替えられ、ＲＡＭ６４ｂに対応するスレッドＢの割り当てがスレッドＡの割り当てに切り替えられるため、第５切替部６５は、ＲＡＭ６４ａの出力を第６切替部６７を通してリコンフィギュラブル回路１２におけるスレッドＢに対応する処理に出力し、ＲＡＭ６４ｂの出力を第６切替部６６を通してリコンフィギュラブル回路１２におけるスレッドＡに対応する処理へ出力する。 After the predetermined cycle has elapsed, the assignment of the RAM 64a is switched from thread A to thread B and the assignment of the RAM 64b is switched from thread B to thread A. The fifth switching unit 65 therefore outputs the output of the RAM 64a, through the sixth switching unit 67, to the processing corresponding to thread B in the reconfigurable circuit 12, and outputs the output of the RAM 64b, through the sixth switching unit 66, to the processing corresponding to thread A in the reconfigurable circuit 12.

第６切替部６６は、ＲＡＭ６２及び第５切替部６５のいずれかの出力を選択して出力する。第６切替部６７は、ＲＡＭ６３及び第５切替部６５のいずれかの出力を選択して出力する。 The sixth switching unit 66 selects and outputs one of the outputs of the RAM 62 and the fifth switching unit 65. The sixth switching unit 67 selects and outputs one of the outputs of the RAM 63 and the fifth switching unit 65.

ここで、スレッドＡ、スレッドＢ及びスレッドＣのそれぞれは、複数のアドレス空間を有している。本実施形態では、スレッドＡは、アドレス空間Ａ−１とアドレス空間Ａ−２とを有しているものとする。スレッドＢは、アドレス空間Ｂ−１とアドレス空間Ｂ−２とを有しているものとする。スレッドＣは、アドレス空間Ｃ−１とアドレス空間Ｃ−２とを有しているものとする。それぞれのアドレス空間には、００１番から２００番が割り振られている。以下では、スレッドとアドレス空間とＲＡＭとの関係について詳細に説明する。 Here, each of the thread A, the thread B, and the thread C has a plurality of address spaces. In the present embodiment, it is assumed that the thread A has an address space A-1 and an address space A-2. It is assumed that the thread B has an address space B-1 and an address space B-2. It is assumed that the thread C has an address space C-1 and an address space C-2. Numbers 001 to 200 are allocated to each address space. Hereinafter, the relationship among the thread, the address space, and the RAM will be described in detail.

先ず、ＲＡＭ６２は、スレッドＡのアドレス空間Ａ−１のうち００１番から１００番に（固定的に）割り振られている。ＲＡＭ６４ａがスレッドＡに割り当てられた場合には、ＲＡＭ６４ａはスレッドＡのアドレス空間Ａ−１のうち１０１番から２００番に割り振られる。また、ＲＡＭ６４ｂがスレッドＡに割り当てられた場合には、ＲＡＭ６４ｂはスレッドＡのアドレス空間Ａ−１のうち１０１番から２００番に割り振られる。 First, the RAM 62 is allocated (fixedly) from 001 to 100 in the address space A-1 of the thread A. When the RAM 64a is allocated to the thread A, the RAM 64a is allocated from the 101st to the 200th in the address space A-1 of the thread A. When the RAM 64b is allocated to the thread A, the RAM 64b is allocated from the 101st to the 200th in the address space A-1 of the thread A.

すなわち、スレッドＡのアドレス空間Ａ−１に対して行われる処理はアドレス範囲により決定され、アドレス空間Ａ−１の００１番から１００番に対する処理はＲＡＭ６２に行われ、アドレス空間Ａ−１の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ６４ａ又はＲＡＭ６４ｂに対して行われる。 That is, the target of processing performed on the address space A-1 of thread A is determined by the address range: processing on addresses 001 to 100 of the address space A-1 is performed on the RAM 62, and processing on addresses 101 to 200 of the address space A-1 is performed on the RAM 64a or the RAM 64b, depending on the assignment of threads to those RAMs.

ＲＡＭ６３は、スレッドＢのアドレス空間Ｂ−２のうち００１番から１００番に（固定的に）割り振られている。ＲＡＭ６４ａがスレッドＢに割り当てられた場合には、ＲＡＭ６４ａはスレッドＢのアドレス空間Ｂ−２のうち１０１番から２００番に割り振られる。また、ＲＡＭ６４ｂがスレッドＢに割り当てられた場合には、ＲＡＭ６４ｂはスレッドＢのアドレス空間Ｂ−２のうち１０１番から２００番に割り振られる。 The RAM 63 is allocated (fixedly) from 001 to 100 in the address space B-2 of the thread B. When the RAM 64a is allocated to the thread B, the RAM 64a is allocated from the 101st to the 200th in the address space B-2 of the thread B. When the RAM 64b is allocated to the thread B, the RAM 64b is allocated from the 101st to the 200th in the address space B-2 of the thread B.

すなわち、スレッドＢのアドレス空間Ｂ−２に対して行われる処理はアドレス範囲により決定され、アドレス空間Ｂ−２の００１番から１００番に対する処理はＲＡＭ６３に行われ、アドレス空間Ｂ−２の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ６４ａ又はＲＡＭ６４ｂに対して行われる。 That is, the target of processing performed on the address space B-2 of thread B is determined by the address range: processing on addresses 001 to 100 of the address space B-2 is performed on the RAM 63, and processing on addresses 101 to 200 of the address space B-2 is performed on the RAM 64a or the RAM 64b, depending on the assignment of threads to those RAMs.

同様にして、スレッドＢのアドレス空間Ｂ−１に対して行われる処理はアドレス範囲により決定され、アドレス空間Ｂ−１の００１番から１００番に対する処理はＲＡＭ７２に行われ、アドレス空間Ｂ−１の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ７４ａ又はＲＡＭ７４ｂに対して行われる。 Similarly, the processing performed on the address space B-1 of the thread B is determined by the address range, and the processing on the address space B-1 from 001 to 100 is performed on the RAM 72. The processing from No. 101 to No. 200 is performed on the RAM 74a or RAM 74b according to the assignment of threads corresponding to the RAM.

また、スレッドＣのアドレス空間Ｃ−２に対して行われる処理はアドレス範囲により決定され、アドレス空間Ｃ−２の００１番から１００番に対する処理はＲＡＭ７３に対して行われ、アドレス空間Ｃ−２の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ７４ａ又はＲＡＭ７４ｂに対して行われる。 Further, the processing performed on the address space C-2 of the thread C is determined by the address range, the processing on the address space C-2 from 001 to 100 is performed on the RAM 73, and the processing of the address space C-2 The processing from No. 101 to No. 200 is performed on the RAM 74a or RAM 74b according to the assignment of threads corresponding to the RAM.

また、スレッドＣがアドレス空間Ｃ−１に対して行われる処理はアドレス範囲により決定され、アドレス空間Ｃ−１の００１番から１００番に対する処理はＲＡＭ８２に対して行われ、アドレス空間Ｃ−１の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ８４ａ又はＲＡＭ８４ｂに対して行われる。 Further, the processing performed by the thread C on the address space C-1 is determined by the address range, and the processing on the address space C-1 from 001 to 100 is performed on the RAM 82. Processing from No. 101 to No. 200 is performed on the RAM 84a or the RAM 84b in accordance with the assignment of threads corresponding to the RAM.

さらに、スレッドＡのアドレス空間Ａ−２に対して行われる処理はアドレス範囲により決定され、アドレス空間Ａ−２の００１番から１００番に対する処理はＲＡＭ８３に対して行われ、アドレス空間Ａ−２の１０１番から２００番に対する処理は、ＲＡＭに対応するスレッドの割り当てに応じてＲＡＭ８４ａ又はＲＡＭ８４ｂに対して行われる。 Further, the processing performed on the address space A-2 of the thread A is determined by the address range, the processing on the address space A-2 from 001 to 100 is performed on the RAM 83, and the processing of the address space A-2 Processing from No. 101 to No. 200 is performed on the RAM 84a or the RAM 84b in accordance with the assignment of threads corresponding to the RAM.
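The address-range decoding described above for the six address spaces can be summarized in a lookup sketch. The following Python code is illustrative only: the table `SPACES` and the function `resolve` are hypothetical names, and the sketch assumes that, before any switching, the "a" RAM of each pair belongs to the first-named thread of its storage unit (the description gives this explicitly only as an example for the RAM 64a and the RAM 64b).

```python
# Illustrative sketch only. Per address space: (fixed RAM, paired RAMs, threads sharing the pair).
SPACES = {
    "A-1": ("62", ("64a", "64b"), ("A", "B")),
    "B-2": ("63", ("64a", "64b"), ("A", "B")),
    "B-1": ("72", ("74a", "74b"), ("B", "C")),
    "C-2": ("73", ("74a", "74b"), ("B", "C")),
    "C-1": ("82", ("84a", "84b"), ("C", "A")),
    "A-2": ("83", ("84a", "84b"), ("C", "A")),
}

def resolve(space, address, swapped=False):
    """Return the RAM that serves `address` (1..200) of an address space.

    `swapped` is True when the pair's thread assignment has been exchanged
    relative to the assumed initial assignment.
    """
    if not 1 <= address <= 200:
        raise ValueError("address must be in 001..200")
    fixed, pair, threads = SPACES[space]
    if address <= 100:
        return fixed                      # addresses 001..100: fixed RAM
    thread = space.split("-")[0]
    index = threads.index(thread)         # assumed initial position in the pair
    if swapped:
        index = 1 - index
    return pair[index]                    # addresses 101..200: paired RAM

if __name__ == "__main__":
    print(resolve("A-1", 50), resolve("A-1", 150), resolve("A-1", 150, swapped=True))
```

Keeping the lower half of each address space on a fixed RAM means that only accesses to addresses 101 to 200 depend on the current thread assignment.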

Data is exchanged between threads through the relationship among threads, address spaces, and RAMs described above. For example, when RAM 64a is assigned to thread A and RAM 64b is assigned to thread B, processing on address space A-1 of thread A is performed on RAM 62 or RAM 64a, and processing on address space B-2 of thread B is performed on RAM 63 or RAM 64b.

When, after a predetermined cycle has elapsed, RAM 64a is assigned to thread B and RAM 64b is assigned to thread A, processing on address space A-1 of thread A is performed on RAM 62 or RAM 64b, and processing on address space B-2 of thread B is performed on RAM 63 or RAM 64a.
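
The address-range rule described above can be illustrated with a small software model. This is only a sketch: the table layout, the function name `resolve_ram`, and the `assignment` dictionary are hypothetical and are not part of the disclosed hardware, which performs the selection with switching units.

```python
# Hypothetical software model of the address decoding in storage units 60, 70, and 80.
# Each address space sends addresses 001-100 to a fixed RAM and addresses 101-200
# to whichever RAM of a switchable pair is currently assigned to the owning thread.

# address space -> (owning thread, fixed RAM, switchable RAM pair)
ADDRESS_SPACES = {
    "A-1": ("A", "RAM62", ("RAM64a", "RAM64b")),
    "B-2": ("B", "RAM63", ("RAM64a", "RAM64b")),
    "B-1": ("B", "RAM72", ("RAM74a", "RAM74b")),
    "C-2": ("C", "RAM73", ("RAM74a", "RAM74b")),
    "C-1": ("C", "RAM82", ("RAM84a", "RAM84b")),
    "A-2": ("A", "RAM83", ("RAM84a", "RAM84b")),
}


def resolve_ram(space, address, assignment):
    """Return the name of the RAM that serves `address` in `space`.

    `assignment` maps each switchable RAM name to the thread it is
    currently assigned to (an assumed representation).
    """
    thread, fixed_ram, pair = ADDRESS_SPACES[space]
    if 1 <= address <= 100:
        return fixed_ram
    if 101 <= address <= 200:
        for ram in pair:
            if assignment[ram] == thread:
                return ram
        raise LookupError(f"no RAM of {pair} is assigned to thread {thread}")
    raise ValueError(f"address {address} is outside 001-200")


before = {"RAM64a": "A", "RAM64b": "B"}   # initial assignment
after = {"RAM64a": "B", "RAM64b": "A"}    # after the predetermined cycle
```

With `before`, address 150 of address space A-1 resolves to RAM 64a; with `after`, the same address resolves to RAM 64b, while address 50 resolves to RAM 62 in both cases.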

In other words, RAM 64a and RAM 64b are areas used for exchanging data between thread A and thread B, and data is passed in both directions between thread A and thread B by switching the threads assigned to these RAMs.

For example, when data is exchanged between thread A and thread B, thread A writes the data to be passed to thread B to addresses 101 through 200 of address space A-1, and thread B writes the data to be passed to thread A to addresses 101 through 200 of address space B-2. If RAM 64a is assigned to thread A and RAM 64b is assigned to thread B at this time, the data that thread A passes to thread B is written to RAM 64a, and the data that thread B passes to thread A is written to RAM 64b.

Then, when RAM 64b is assigned to thread A and RAM 64a is assigned to thread B after a predetermined cycle has elapsed, thread A reads the data of thread B written in RAM 64b from addresses 101 through 200 of address space A-1, and thread B reads the data of thread A written in RAM 64a from addresses 101 through 200 of address space B-2. Thus, data can be exchanged between thread A and thread B simply by swapping the threads assigned to RAM 64a and RAM 64b once.
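
The exchange sequence described above (both threads write, the assignment is swapped once, both threads read) can be sketched as follows. The class `ExchangePair` and its methods are hypothetical names introduced for illustration; each RAM is modeled as a plain dictionary, which is an assumption and not the disclosed circuit.

```python
class ExchangePair:
    """Two RAMs (modeled as dicts) whose thread assignment can be swapped."""

    def __init__(self, thread_x, thread_y):
        self.rams = {"a": {}, "b": {}}             # e.g. RAM 64a and RAM 64b
        self.owner = {thread_x: "a", thread_y: "b"}
        self._threads = (thread_x, thread_y)

    def write(self, thread, address, value):
        # A thread always writes to the RAM currently assigned to it.
        self.rams[self.owner[thread]][address] = value

    def read(self, thread, address):
        # A thread always reads from the RAM currently assigned to it.
        return self.rams[self.owner[thread]].get(address)

    def swap(self):
        # Corresponds to switching the assignment after the predetermined cycle.
        x, y = self._threads
        self.owner[x], self.owner[y] = self.owner[y], self.owner[x]


pair = ExchangePair("A", "B")          # RAM 64a -> thread A, RAM 64b -> thread B
pair.write("A", 101, "result of A")    # stored in RAM 64a
pair.write("B", 101, "result of B")    # stored in RAM 64b
pair.swap()                            # RAM 64a -> thread B, RAM 64b -> thread A
received_by_a = pair.read("A", 101)    # thread A now reads RAM 64b
received_by_b = pair.read("B", 101)    # thread B now reads RAM 64a
```

No data is copied in this model: only the ownership mapping changes, which mirrors the point that a single switch of the assignment delivers data in both directions.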

In this embodiment, a thread (for example, thread A) has two address spaces (for example, address space A-1 and address space A-2), and one address space (for example, address space A-1) is divided into two regions (for example, a region fixedly assigned to thread A (addresses 001 through 100) and a region for exchanging data with thread B (addresses 101 through 200)). However, the configuration is not limited to this.

For example, a thread may have an address space divided into three regions: a region fixedly assigned to thread A (addresses 001 through 100), a region for exchanging data with thread B (addresses 101 through 200), and a region for exchanging data with thread C (addresses 201 through 300).

Thread B and thread C are assigned to the storage unit 70. The storage unit 70 includes a fourth switching unit 71, a RAM 72, a RAM 73, a RAM 74a, a RAM 74b, a fifth switching unit 75, a sixth switching unit 76, and a sixth switching unit 77. Since the storage unit 70 has the same functions as the storage unit 60 described above, a detailed description is omitted.

Thread C and thread A are assigned to the storage unit 80. The storage unit 80 includes a fourth switching unit 81, a RAM 82, a RAM 83, a RAM 84a, a RAM 84b, a fifth switching unit 85, a sixth switching unit 86, and a sixth switching unit 87. Since the storage unit 80 has the same functions as the storage unit 60 described above, a detailed description is omitted.

Although FIG. 15 shows a configuration for three threads, the configuration is not limited to this and may accommodate four or more threads. For example, when there are four threads, six storage units are provided: a storage unit assigned to thread A and thread B (for AB), a storage unit assigned to thread A and thread C (for AC), a storage unit assigned to thread A and thread D (for AD), a storage unit assigned to thread B and thread C (for BC), a storage unit assigned to thread B and thread D (for BD), and a storage unit assigned to thread C and thread D (for CD).
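
The four-thread example provides one storage unit per unordered pair of threads, so n threads would need n(n-1)/2 units under that scheme. A minimal sketch of the enumeration follows; the function name `storage_units` is hypothetical.

```python
from itertools import combinations


def storage_units(threads):
    """List one storage unit label per unordered pair of threads."""
    return ["".join(pair) for pair in combinations(threads, 2)]


units_for_three = storage_units(["A", "B", "C"])
units_for_four = storage_units(["A", "B", "C", "D"])
```

For four threads this yields AB, AC, AD, BC, BD, and CD, the six units listed above. For three threads it yields three pairs, matching the three storage units 60, 70, and 80 of FIG. 15.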

The present invention is not limited to the RAMs shown in FIG. 15 and may include RAMs other than those shown in FIG. 15. Specifically, in addition to the RAMs shown in FIG. 15, a RAM to which a thread is fixedly assigned may be provided. For example, suppose that thread A has a new address space A-3 in addition to the address spaces A-1 and A-2 described above. The RAM to which thread A is fixedly assigned is then mapped to address space A-3.

According to the configuration shown in FIG. 15, the fourth and fifth switching units can swap, every predetermined cycle, the thread assignments of the paired RAMs (for example, RAM 64a and RAM 64b) within each of the storage units 60, 70, and 80. As a result, within each of the storage units 60, 70, and 80, the fourth and fifth switching units can hand the processing result of one thread (for example, thread A) to the other thread (for example, thread B) with a single switch and, at the same time, hand the processing result of the other thread (for example, thread B) to the first thread (for example, thread A) with a single switch.

Furthermore, not only between thread A and thread B but also between thread B and thread C and between thread C and thread A, the processing result of one thread can be handed to the other thread with a single switch while the processing result of the other thread is handed to the first thread with a single switch, so data can be passed simultaneously between arbitrary threads. This shortens the waiting time before each thread's processing result is used, and the processing time can be reduced significantly compared with the configuration shown in FIG. 8.

In addition, in the configuration shown in FIG. 8, one of the address spaces of a thread consists solely of a region used for passing data between threads, so such an address space cannot store a processing result that the same thread will use in the next cycle. In contrast, in the configuration shown in FIG. 15, each address space has not only a region for passing data between threads but also a region fixedly assigned to the thread, so any address space can store a processing result that the same thread will use in the next cycle.

For example, in the configuration shown in FIG. 8, such an address space is used for passing data between threads and cannot store a processing result that the same thread will use in the next cycle. A processing result that the same thread will use in the next cycle is therefore stored in another address space.

In contrast, in the configuration shown in FIG. 15, both of the two address spaces of a thread have a storage section that stores a processing result that the same thread will use in the next cycle. Therefore, even when there are two processing results that the same thread will use in the next cycle, the two results can be stored simultaneously in separate address spaces.

The present invention has been described above based on embodiments. Those skilled in the art will understand that the embodiments are examples, that various modifications are possible in the combinations of their components and processing steps, and that such modifications also fall within the scope of the present invention. For example, the integrated circuit device 26 shown in FIG. 8 is configured to allow intermediate input to the reconfigurable circuit 12 or intermediate output from the reconfigurable circuit 12, but the present invention can also be applied to a configuration without intermediate input or intermediate output.

For example, the arrangement of ALUs in the reconfigurable circuit 12 is not limited to a multistage arrangement that permits connections only in the vertical direction and may be a mesh arrangement that also permits connections in the horizontal direction. In the above description, no wiring is provided to connect logic circuits across skipped stages, but a configuration with such stage-skipping connections is also possible.

FIG. 1 shows the case where the processing apparatus 10 has one reconfigurable circuit 12, but the processing apparatus 10 may have a plurality of reconfigurable circuits 12.

In the configuration shown in FIG. 8, the integrated circuit device 26 has the first switching circuit 23 and the memory unit 27 composed of a plurality of RAMs. In the following modification, instead of having a plurality of RAMs, the memory unit 27 may have a single RAM with a plurality of input/output ports that can be accessed simultaneously. In this case, the storage area of the RAM is divided into a plurality of areas, and each divided storage area is assigned to a thread executed on the reconfigurable circuit 12. Each of the input/output ports corresponds to one of the divided storage areas.

The RAM is divided using, for example, the bit values at predetermined positions of the address. For example, the first input/output port is associated with the storage area whose two most significant address bits are "00", the second input/output port with the storage area whose two most significant address bits are "01", and the third input/output port with the storage area whose two most significant address bits are "10".
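
The port selection by the two most significant address bits can be sketched as below. The address width `ADDRESS_BITS` is an assumption (the description does not state one), and the function name `port_for_address` is hypothetical. Only the three prefixes named in the description are mapped.

```python
ADDRESS_BITS = 10  # assumed address width; not specified in the description

# Two most significant address bits -> input/output port number
PORT_BY_PREFIX = {0b00: 1, 0b01: 2, 0b10: 3}


def port_for_address(address, address_bits=ADDRESS_BITS):
    """Return the input/output port serving `address`."""
    if not 0 <= address < (1 << address_bits):
        raise ValueError("address out of range")
    prefix = address >> (address_bits - 2)   # keep only the top two bits
    if prefix not in PORT_BY_PREFIX:
        # The description names no port for prefix "11".
        raise LookupError("no port is associated with this storage area")
    return PORT_BY_PREFIX[prefix]
```

Because the region is determined purely by fixed bit positions, each port sees a contiguous block of addresses of equal size.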

By associating each of the threads with an input/output port, each thread is associated with a divided storage area. The other components, such as the reconfigurable circuit 12, the second switching circuit 25, and the control unit 18, are the same as in the embodiment described above.

That is, whereas the embodiment has a plurality of RAMs, in this modification one RAM of the embodiment corresponds to one divided storage area of the single RAM. When two bits of the address are used, the storage area can be divided into at most four areas; if the storage area needs to be divided further, three or more bits of the address are used.
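
The relation between the number of divided storage areas and the number of address bits follows from k bits distinguishing at most 2^k areas. A small sketch, with the hypothetical function name `select_bits_needed`:

```python
def select_bits_needed(regions):
    """Smallest number of address bits that can distinguish `regions` areas."""
    if regions < 1:
        raise ValueError("at least one region is required")
    # k bits distinguish up to 2**k regions, so k = ceil(log2(regions)).
    return (regions - 1).bit_length()
```

Four areas need two bits, while five to eight areas need three bits, consistent with the statement that more than four areas require three or more bits.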

As described above, the first switching circuit 23 and the memory unit 27 shown in FIG. 8 may be replaced with a single RAM having a plurality of input/output ports. By using, as the memory unit 27, a RAM that allows a plurality of data items to be written and/or read simultaneously, the same effects as described in the embodiment can be obtained. In addition, since the first switching circuit 23 can be omitted from the integrated circuit device 26, the circuit scale can be reduced.

The reconfigurable circuit 12 is not limited to the one described in this embodiment and also encompasses programmable devices such as a CPU, a DSP, or an FPGA. As for thread execution, only the case where a plurality of threads are executed on a single circuit, as shown in FIG. 2 for example, has been described, but the case where separate circuits exist and each circuit executes a different thread simultaneously is also included.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined not by the above description but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.

FIG. 1 is a block diagram of a processing apparatus according to an embodiment.
FIG. 2 is a block diagram of a reconfigurable circuit.
FIG. 3 is another block diagram of a reconfigurable circuit.
FIG. 4 is a diagram showing an example of a data flow graph.
FIG. 5 is a diagram for explaining setting data of a plurality of circuits obtained by dividing a target circuit to be generated.
FIG. 6 is a diagram showing the processing flow of a target circuit configured on the reconfigurable circuit.
FIG. 7 is a diagram showing the flow of multithread operation realized on the reconfigurable circuit.
FIG. 8 is a diagram showing a detailed configuration of an integrated circuit device.
FIG. 9 is a diagram showing an example of data transfer between threads.
FIG. 10 is a diagram showing the assignment of threads to a plurality of RAMs in the memory unit and the state of the RAM storage areas.
FIG. 11 is a diagram showing another example of data transfer between threads.
FIG. 12 is a diagram showing the assignment of threads to a plurality of RAMs in the memory unit and the state of the RAM storage areas.
FIG. 13 is a diagram showing the assignment of threads to a plurality of RAMs in the memory unit and the state of the RAM storage areas.
FIG. 14 is a diagram showing an example of data transfer between threads when feedback data to a thread exists.
FIG. 15 is a diagram showing another example of a detailed configuration of an integrated circuit device.

Explanation of symbols

10: processing apparatus, 12: reconfigurable circuit, 14: setting unit, 16: circuit processing control unit, 18: control unit, 20: internal state holding circuit, 22: output circuit, 23: first switching circuit, 24: path unit, 25: second switching circuit, 26: integrated circuit device, 27: memory unit, 28: third switching circuit, 29: path unit, 30: compilation unit, 32: setting data generation unit, 34: storage unit, 36: program, 38: data flow graph, 40: setting data, 50: logic circuit, 52: connection unit

Claims

A reconfigurable circuit having a plurality of logic circuits capable of selectively executing a plurality of arithmetic functions and capable of simultaneously executing a plurality of threads, wherein
a storage unit is provided between a preceding-stage logic circuit and a succeeding-stage logic circuit, and
during execution of the plurality of threads, the storage unit stores data output from the preceding-stage logic circuit at a first timing and, at a second timing following the first timing, supplies the stored data to the succeeding-stage logic circuit that executes the same thread as the thread executed by the preceding-stage logic circuit at the first timing.

A processing apparatus comprising: a reconfigurable circuit having a plurality of logic circuits capable of selectively executing a plurality of arithmetic functions and capable of simultaneously executing a plurality of threads; and
a first storage unit that stores an output from the reconfigurable circuit,
wherein the first storage unit is assigned to a thread that is executed on the reconfigurable circuit.

The first storage unit includes a plurality of storage units,
The processing apparatus according to claim 2, wherein a thread allocated to each of at least a part of the plurality of storage units is changed every predetermined cycle.

4. The processing apparatus according to claim 2, further comprising a first switching unit that selects an output from the reconfigurable circuit according to a thread and supplies the output to the first storage unit.

5. The processing apparatus according to claim 2, further comprising a second switching unit that selects one of the outputs from the plurality of storage units according to a thread and supplies the selected output to the input of the reconfigurable circuit.

The processing apparatus according to any one of claims 2 to 5, wherein the first storage unit includes a plurality of storage units,
one of two threads corresponding to each storage unit is assigned to that storage unit, and the assignment can be switched from the one thread to the other thread every predetermined cycle.

The processing apparatus according to any one of claims 5 to 6, wherein the first storage unit includes at least a pair of storage means,
a different thread is assigned to each of the pair of storage means, and the thread assignments of the pair of storage means are swapped with each other every predetermined cycle.

8. The processing apparatus according to claim 2, wherein the first storage unit includes an information storage unit having a plurality of storage means, and
one of the storage means provided in the information storage unit stores an output of a predetermined thread from the reconfigurable circuit.

9. The processing apparatus according to claim 8, wherein only one specific thread is fixedly assigned to any of the storage means provided in the information storage unit.

10. The processing apparatus according to claim 8, wherein each of the storage means provided in the information storage unit is assigned to the same address space.

The processing apparatus according to claim 8, wherein the storage means provided in the information storage unit determines, based on a specific address range, whether to store the output of the predetermined thread from the reconfigurable circuit.

The storage area of the first storage unit is divided into a plurality of areas,
The processing apparatus according to claim 2, wherein a thread allocated to each of the divided storage areas is changed every predetermined cycle.

13. The system further comprises a switching unit that selects one of the divided outputs from the plurality of storage areas according to a thread and supplies the selected output to the input of the reconfigurable circuit. The processing apparatus in any one of.

The processing apparatus according to claim 2, wherein the reconfigurable circuit further comprises a second storage unit that is provided between a preceding-stage logic circuit and a succeeding-stage logic circuit, stores data output from the preceding-stage logic circuit, and supplies the stored data to the succeeding-stage logic circuit that executes the same thread as the thread executed by the preceding-stage logic circuit.

15. The processing apparatus according to claim 2, further comprising a control unit that controls assignment of threads to the first storage unit,
wherein the control unit changes the assignment of a thread to the first storage unit so that data from a different thread is stored in the first storage unit.

The processing apparatus according to claim 15, wherein the control unit simultaneously changes the allocation of threads to the plurality of storage units or the plurality of divided storage areas.

The processing apparatus according to claim 15, wherein the control unit cyclically changes thread allocation to the first storage unit.

The processing apparatus according to claim 15, wherein the control unit changes the correspondence between the first storage unit and the threads at a timing after execution of all threads is completed.