JP4082439B2 - Parallel computer

Info

Publication number: JP4082439B2
Application number: JP2006527723A
Authority: JP
Inventors: 敦夫尾崎
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2004-07-26
Filing date: 2004-07-26
Publication date: 2008-04-30
Anticipated expiration: 2024-07-26
Also published as: WO2006011189A1; JPWO2006011189A1

Description

The present invention relates to a parallel computer that divides a single task into a plurality of execution units and processes the execution units in parallel on a plurality of processors, and in particular to a technique for saving power consumption while maintaining the processing capability of the parallel computer as a whole.

The invention further relates to a technique for saving the power consumption of the entire parallel computer while satisfying the constraint on processing completion time imposed on each task.

Portable information devices such as mobile phones and notebook computers are required to be lightweight. However, these devices often contain a large-capacity battery so that a processor with a high operating frequency can be driven for a long time. A large-capacity battery is also heavy, which is a major obstacle to reducing the weight of portable information devices.

A known technique for extending operating time while reducing battery capacity and weight is to change the operating frequency of the processor according to the type and content of the processing. It relies on the principle that operating a processor at a lower operating frequency saves power.

Portable information devices are also required to perform time-constrained processing such as multimedia data processing, and, as in embedded systems, real-time processing is often required. Japanese Unexamined Patent Application Publication No. 2002-99432 (hereinafter, Patent Document 1) is a known example of a power-saving technique that changes the operating frequency as appropriate while targeting processing with such time constraints.

This technique schedules tasks while determining whether the processing time requirement of each time-constrained task is satisfied, and, when there is slack in the processing time requirements of the tasks as a whole, saves power by changing the operating frequency and supply voltage of the processor.

Besides operating a processor at a high operating frequency, a common way to speed up processing is to combine a plurality of processors and process in parallel. Japanese Unexamined Patent Application Publication No. 2002-215599 (hereinafter, Patent Document 2) is a known example of a technique that saves power by controlling the operating frequency of each processor in such a multiprocessor system.

In the method of Patent Document 2, when a plurality of tasks are processed by a plurality of processors and some processors would complete their processing earlier than the others, the operating frequency and supply voltage of those processors are lowered to match the processing completion time of the other processors, thereby reducing power consumption.

However, the reference used by the method of Patent Document 2 is the processing completion time of the other processors, not a time constraint on the processing itself. The method of Patent Document 2 therefore cannot be applied to a system whose processing is subject to time constraints.

The method of Patent Document 1, on the other hand, presupposes a system with a single processor. It can clearly be applied to a multiprocessor system only if the tasks, which are the minimum processing units, have no dependencies on one another at all, or if the effect of those dependencies can be ignored.

In the field of parallel computers, parallel algorithms in which the processors cooperate to solve a single problem (task) have been widely studied. However, neither the method of Patent Document 1 alone nor a combination of the methods of Patent Documents 1 and 2 can make use of these research results.

The present invention has been made to solve these problems, and its object is to provide a computer that completes a single task by parallel processing within the required processing time while reducing power consumption.

A parallel computer according to the present invention divides a task into a plurality of processing units and executes the divided processing units in parallel, and comprises:
task dividing means for dividing the task into a plurality of processing units executable by individual processors and outputting the divided processing units as a plurality of subtasks;
a subtask attribute information file that holds attribute information on the subtasks divided by the task dividing means;
a plurality of processors that are configured so that their power consumption can be controlled externally and that execute the subtasks divided by the task dividing means; and
processor control means for distributing the subtasks divided by the task dividing means to the plurality of processors and instructing them to execute the subtasks, and for controlling the power consumption of the plurality of processors, based on the subtask attribute information held in the subtask attribute information file.

In the above, the concept of a subtask naturally includes a partial instruction code sequence obtained by dividing the instruction code sequence that constitutes the task, but it is not limited to this. Instead of dividing the instruction code of the task itself, the processing units may be obtained by dividing the data to be processed by the task into a plurality of pieces.

As described above, the parallel computer according to the present invention controls the power consumption of each processor while distributing subtasks to the plurality of processors based on the attribute information of the subtasks divided from the task. It can therefore reduce power consumption while satisfying the constraint on task execution time.

FIG. 1 is a block diagram showing the configuration of a parallel computer according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the characteristics of the processors of the parallel computer according to Embodiment 1 of the present invention.
FIG. 3 is a flowchart of the parallel computer according to Embodiment 1 of the present invention.
FIG. 4 is a diagram for explaining the method of selecting an execution method according to Embodiment 1 of the present invention.
FIG. 5 is a diagram showing the relationship among the boundary values considered when selecting among the execution methods.
FIG. 6 is a diagram showing the relationship between the number of processors and power consumption.

Embodiment 1.
FIG. 1 is a block diagram showing the configuration of a parallel computer according to Embodiment 1 of the present invention. In the figure, the task input terminal 10 is the input through which a task to be processed by the parallel computer is submitted. Here, a task refers to a unit of work inside a central processing unit (hereinafter, CPU). Work here means a predetermined unit of processing made up of a combination of computer instruction codes, and the size of a task is often chosen so that it is easy for computer operators and system administrators to understand or handle. However, the features of the invention are not lost whatever unit of processing is used to constitute one task.

The figure assumes a configuration in which tasks are input from outside through the task input terminal 10. However, the computer may instead autonomously acquire tasks stored in an external storage device under the control of an operating system. Computer systems with such a configuration are very common and need no further explanation here.

The task dividing means 11 is the part that divides a single task submitted through the task input terminal 10 into a plurality of subtasks.

The subtask attribute information file 12 is a file that stores additional information about each subtask. It is data held in a random access memory (RAM), a fixed disk device, or another storage device, storage element, or storage circuit. The subtask attribute information file 12 need not exist as a physically separate file. For example, the information may be stored in the program executable file of the task submitted through the task input terminal 10 (a binary program file containing instruction codes and static data), and that file may be treated as the subtask attribute information file 12.

The control processor 13, which serves as the processor control means, refers to the subtask attribute information file 12 to distribute the subtasks divided by the task dividing means 11 among the arithmetic processors 14-1 to 14-N, and instructs each processor that has received a subtask to process it. In addition, the control processor 13 controls the power consumption of the arithmetic processors 14-1 to 14-N, reducing power consumption while satisfying the constraint on task execution time.

Subtasks can be formed either by dividing the instruction code sequence of the task into sequences with fewer steps, or by dividing the data to be processed by the task into smaller pieces. Strictly, a subtask formed by dividing an instruction code sequence is "executed" and a subtask formed by dividing data is "processed", but for brevity the expression "process a subtask" is used throughout. The expression "process a subtask" is understood to include the meaning "execute a subtask".

The arithmetic processors 14-1 to 14-N are arithmetic devices or circuits that process the subtasks divided by the task dividing means 11. Their power consumption can be controlled externally. The arithmetic processors 14-1 to 14-N may themselves provide an interface for changing power consumption directly, through which the power consumption is changed. Alternatively, where the processors decode and execute the instruction codes of each subtask based on an externally supplied clock signal, the power consumption may be changed by changing that clock signal.

FIG. 2 shows an example of the characteristics of the arithmetic processors 14-1 to 14-N. As shown in the figure, the arithmetic processors 14-1 to 14-N can select among at least three operating states: a "high-speed operation state", a "standard operation state", and an "idle state". In the high-speed operation state, the arithmetic processors 14-1 to 14-N operate at an operating frequency of 380 MHz and a voltage of 1.8 V and consume 0.5 W. In the standard operation state, they operate at an operating frequency of 152 MHz and a voltage of 1.0 V and consume 0.053 W. In the idle state, the operating frequency is 33 MHz, the voltage is 1.0 V, and the power consumption is 0.0115 W.

As the characteristics in this figure show, it is generally known that the power consumption per unit time of an electronic circuit rises as the operating frequency is raised. When leakage power is ignored, the relationship among power consumption P, operating frequency F, and supply voltage V is given by equation (1), where t is the signal transition rate and C is the capacitance.

$$P = t \cdot C \cdot F \cdot V^{2} \qquad (1)$$
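As a rough check of equation (1), the sketch below compares the power ratios it predicts with the FIG. 2 figures quoted above. It assumes that t and C are identical in every operating state, which the text does not state, and the names `STATES`, `fv2`, `predicted_ratio`, and `quoted_ratio` are illustrative only.

```python
# Operating states quoted from FIG. 2: (frequency in MHz, voltage in V, power in W).
STATES = {
    "high_speed": (380.0, 1.8, 0.5),
    "standard": (152.0, 1.0, 0.053),
    "idle": (33.0, 1.0, 0.0115),
}


def fv2(state: str) -> float:
    """Return F * V^2, the state-dependent part of P = t * C * F * V^2."""
    freq_mhz, volts, _ = STATES[state]
    return freq_mhz * volts ** 2


def predicted_ratio(a: str, b: str) -> float:
    """Power ratio P_a / P_b predicted by equation (1), assuming equal t and C."""
    return fv2(a) / fv2(b)


def quoted_ratio(a: str, b: str) -> float:
    """Power ratio P_a / P_b taken directly from the quoted FIG. 2 values."""
    return STATES[a][2] / STATES[b][2]


if __name__ == "__main__":
    for a, b in [("standard", "idle"), ("high_speed", "standard")]:
        print(a, b, round(predicted_ratio(a, b), 2), round(quoted_ratio(a, b), 2))
```

Under this assumption the standard-to-idle ratio agrees closely with the quoted values (about 4.6 in both cases), while the high-speed-to-standard ratio is predicted at about 8.1 against a quoted ratio of about 9.4, so equation (1) is best read as an approximation.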

Although the arithmetic processors 14-1 to 14-N are described here, by way of example, as moving among three operating states, namely the "high-speed operation state", the "standard operation state", and the "idle state", the processors that can be used in the present invention are not limited to this example.

Because clock speed can vary with the temperature of the environment in which a computer is placed, practical commercial processors have a margin for variation in the external clock. Such a processor runs faster when the external clock is made faster and slower when it is made slower. Therefore, even with a commercial processor that, unlike the example above, does not explicitly support multiple operating states, the features of the invention can be obtained by deliberately exploiting this margin for external clock variation. Processors capable of reducing power consumption are now widely used in mobile applications and are well known in the art, so they are not described in further detail here.

The arithmetic processors 14-1 to 14-N should not be narrowly construed as, for example, separate LSI components. A vector processor, for instance, is a single arithmetic device yet can execute a plurality of operations in parallel, and the computer configuration shown in FIG. 1 covers such a case. Needless to say, the control processor 13 and the arithmetic processors 14-1 to 14-N can also be replaced with complete computers such as personal computers or workstations. That is, the invention is also applicable to a parallel computing system that combines a plurality of computers.

The task dividing means 11 may be configured as an independent control circuit or control device, or as a computer program executed by the control processor 13.

If the control processor 13 is regarded as the fetch circuit and decoder of a general processor architecture, and tasks and subtasks are regarded as machine-language instruction codes and microcode of that processor, the whole system shown in FIG. 1 can also be regarded as a single processor. In this case, array processing by a vector operation is regarded as a task, and the processing of each array element as a subtask. The task dividing means 11 then corresponds to a vectorizing compiler, that is, a compiler (language processor) supporting an optimization that generates vector operation instructions, together with a decoder that decodes vector operation instructions into microcode. Such compiler technology is already known. Alternatively, instead of processing units fixed at the processor-architecture level, the relationship between a process and its threads may be mapped to the relationship between a task and its subtasks. In that case the relationship between tasks and subtasks is defined flexibly according to the system design. The configuration of FIG. 1 can thus be applied at various levels.

Next, the operation of the parallel computer according to Embodiment 1 of the present invention is described. FIG. 3 is a flowchart showing the operation of this parallel computer. When a task to be executed is submitted through the task input terminal 10, the task dividing means 11 divides the task into subtasks (step S101). The control processor 13 then acquires the processing time limit T of the task (step S102). The processing time limit T is a value determined in advance by the system.

In the case of processes and threads, for example, T is determined by the purpose of the user or the system. If the system is intended to perform signal processing on input signals (for example, observed values of some kind) that arrive at fixed intervals, the sampling time, which is the period at which these signals are acquired, corresponds to the processing time limit T.

In some cases the processing time limit is not fixed by an external specification, and T is instead determined by the configuration of the parallel computer. For example, when configuring a processor that completes most instructions within one cycle of the external clock, the duration of one external clock cycle is the processing time limit T.

Next, the control processor 13 calculates the task processing time tmin for the case where the arithmetic processors 14-1 to 14-N are set to the high-speed operation state (step ST103). This requires that the expected processing completion time of each subtask be known in advance. For example, the processing time of each subtask on one of the arithmetic processors 14-1 to 14-N is measured beforehand in both the high-speed operation state and the standard operation state and stored in the subtask attribute information file. The control processor 13 then obtains the processing time of each subtask according to its type and calculates the task processing time tmin.

Alternatively, the subtask processing time may be measured in only one of the high-speed operation state and the standard operation state, and the processing time in the other state estimated by multiplying the measured time by the ratio of the operating frequency of the measured state to that of the other state.
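The frequency-ratio estimate described above might be sketched as follows. The function name `scale_processing_time` and its parameters are hypothetical, and the sketch assumes, as the text implies, that processing time is inversely proportional to operating frequency.

```python
def scale_processing_time(measured_s: float, measured_mhz: float, target_mhz: float) -> float:
    """Estimate the processing time at target_mhz from a time measured at measured_mhz.

    Assumes processing time is inversely proportional to operating frequency,
    so the measured time is multiplied by measured_mhz / target_mhz.
    """
    if measured_s < 0 or measured_mhz <= 0 or target_mhz <= 0:
        raise ValueError("time must be non-negative and frequencies positive")
    return measured_s * (measured_mhz / target_mhz)


if __name__ == "__main__":
    # A subtask measured at 1.0 s in the 380 MHz state, estimated for the 152 MHz state.
    print(scale_processing_time(1.0, 380.0, 152.0))
```

With the FIG. 2 frequencies, a subtask measured at 1.0 s in the high-speed operation state is estimated at 2.5 s in the standard operation state.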

If, as a result, tmin is below the processing time limit T (step ST104: Yes), the parallel processing capability of the arithmetic processors 14-1 to 14-N exceeds the amount of subtask processing required. Because there is spare processing capacity, the flow moves on to the power-saving processing from step ST105 onward.

If tmin is not below the processing time limit T, speed must take priority over power saving, so the arithmetic processors 14-1 to 14-N are set to the high-speed operation state (step ST106: execution method 1). The flow then proceeds to step ST111. The processing from step ST111 onward is described later.

In step ST105, the control processor 13 calculates the task processing time tstd for the case where one of the arithmetic processors 14-1 to 14-N is set to the standard operation state and all subtasks are executed by that processor alone. As with tmin in step ST103, tstd is calculated from the subtask processing times. If tstd exceeds T (ST107: Yes), processing by a single arithmetic processor cannot meet the requirement to complete the task within the processing time limit T, so the flow proceeds to parallel processing with a plurality of processors from step ST109 onward.

If tstd does not exceed T, a single processor can meet the requirement to complete the task within the processing time limit T, so one of the arithmetic processors 14-1 to 14-N, for example the arithmetic processor 14-1, is set to the standard operation state (step ST108). The remaining processors, that is, the arithmetic processors 14-2 to 14-N, are set to the idle state.

In this way, power consumption can be reduced while still satisfying the real-time requirement that processing of the task be completed within the predetermined processing time limit.
On the other hand, when tstd exceeds T, one of the following execution methods (execution method 3 and execution method 4) is selected based on the properties of the subtasks and the properties of each arithmetic processor (operating frequency, power consumption), and the number n of arithmetic processors used for subtask processing and their operating frequency are calculated according to that method (step ST109).

Execution method 3:
One of the arithmetic processors 14-1 to 14-N is selected, its operating frequency is set to the operating frequency β of the high-speed operation state, and all subtasks are executed by that processor. The arithmetic processors other than the selected one are set to the idle state.

Execution method 4:
n of the arithmetic processors 14-1 to 14-N are selected, their operating frequency is set to the operating frequency α of the standard operation state, and the subtasks are executed by the selected n (2 ≤ n ≤ N) arithmetic processors. The arithmetic processors other than the selected n are set to the idle state.

Execution method 5:
m (m < n) of the arithmetic processors 14-1 to 14-N are selected, their operating frequency is set to the operating frequency β of the high-speed operation state, and the subtasks are executed by the selected m (2 ≤ m < n ≤ N) processors. The processors other than the selected m are set to the idle state.
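The branching of the FIG. 3 flow described so far (steps ST104 to ST109) can be sketched as a small decision function. This is a minimal sketch that assumes tmin and tstd have already been estimated. The function name `choose_execution_path` and the returned labels are hypothetical, not taken from the patent.

```python
def choose_execution_path(t_min: float, t_std: float, limit: float) -> str:
    """Pick a branch of the described flow from pre-computed time estimates.

    t_min: estimated task time with all arithmetic processors in the high-speed state.
    t_std: estimated task time on a single processor in the standard state.
    limit: processing time limit T.
    """
    if not t_min < limit:
        # No slack: run every processor in the high-speed state (execution method 1).
        return "method 1: all processors high-speed"
    if t_std <= limit:
        # One standard-state processor meets the limit; the others idle (step ST108).
        return "single processor, standard state"
    # One standard-state processor is too slow: choose among methods 3 to 5 (step ST109).
    return "select among methods 3-5"


if __name__ == "__main__":
    print(choose_execution_path(t_min=2.0, t_std=8.0, limit=4.0))
```

The sketch only reproduces the comparisons against T. The choice among execution methods 3, 4, and 5 depends on the energy comparison and is left out here.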

Next, the method of selecting among execution method 3, execution method 4, and execution method 5 is described.

FIG. 4 shows an example time chart of execution method 3 and execution method 4 within the processing time limit T. The two differ only in the portion inside the thick-lined frame, so it is sufficient to compare energy consumption for that portion. In the case of FIG. 4, the processing time of execution method 4 is longer than that of execution method 3, so the processing time limit T can be expressed by equation (2). Here, Tc (= TS + TR) is the time required for one communication operation, the sum of the transmission processing time TS and the reception processing time TR. Tα is the execution time when one unit of processing data is processed by one processor at operating frequency α, and n is the number of processors.

$$T = (n-1) \cdot T_c + \frac{T_\alpha}{n} \qquad (2)$$
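Equation (2) can be evaluated directly, as in the sketch below. The helper `smallest_n_within` is an illustrative addition that is not described in the text: it simply searches for the smallest n from 2 up to an assumed maximum whose equation (2) time fits a given limit. All names are hypothetical.

```python
from typing import Optional


def parallel_time(n: int, t_c: float, t_alpha: float) -> float:
    """Equation (2): T = (n - 1) * Tc + T_alpha / n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) * t_c + t_alpha / n


def smallest_n_within(limit: float, n_max: int, t_c: float, t_alpha: float) -> Optional[int]:
    """Return the smallest n in 2..n_max whose equation (2) time is within limit, else None."""
    for n in range(2, n_max + 1):
        if parallel_time(n, t_c, t_alpha) <= limit:
            return n
    return None


if __name__ == "__main__":
    # Illustrative values only: Tc = 0.5, T_alpha = 12.0, limit = 5.0, up to 8 processors.
    print(parallel_time(4, 0.5, 12.0))
    print(smallest_n_within(5.0, 8, 0.5, 12.0))
```

Because the communication term grows with n while the data-processing term shrinks, adding processors does not always shorten the time, which is why the search can return `None`.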

Equation (3) gives the energy consumption C2 [W·s] of execution method 3 in this case. The first term of equation (3) is the energy needed to process the data at operating frequency β. The second term is the energy consumed by the processors that are idle (FIG. 4: arithmetic processors 14-1 to 14-N) and by the processor that has finished its data processing and become idle (FIG. 4: arithmetic processor 14-1). Here k = α/β.

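$$
\begin{aligned}
C_2 &= P_\beta \cdot k \cdot T_\alpha + k \cdot P_\gamma \cdot T_\alpha \cdot (n-1) + n \cdot P_\gamma \cdot \left[ T_\alpha \cdot \left( \frac{1}{n} - k \right) + (n-1) \cdot T_c \right] \\
&= P_\beta \cdot k \cdot T_\alpha + P_\gamma \cdot \left[ (1-k) \cdot T_\alpha + n \cdot (n-1) \cdot T_c \right] \qquad (3)
\end{aligned}
$$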
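The two forms of equation (3) can be checked against each other numerically. The sketch below assumes that Pβ is the power in the high-speed operation state and Pγ the power in the idle state, an interpretation the text does not spell out. The function names and sample values are illustrative.

```python
def energy_method3_expanded(n: int, k: float, t_alpha: float, t_c: float,
                            p_beta: float, p_gamma: float) -> float:
    """First line of equation (3), term by term."""
    busy = p_beta * k * t_alpha                      # data processing at frequency beta
    others_idle = k * p_gamma * t_alpha * (n - 1)    # other processors idle meanwhile
    all_idle = n * p_gamma * (t_alpha * (1.0 / n - k) + (n - 1) * t_c)  # remaining time
    return busy + others_idle + all_idle


def energy_method3(n: int, k: float, t_alpha: float, t_c: float,
                   p_beta: float, p_gamma: float) -> float:
    """Second (simplified) line of equation (3)."""
    return p_beta * k * t_alpha + p_gamma * ((1.0 - k) * t_alpha + n * (n - 1) * t_c)


if __name__ == "__main__":
    # Illustrative values: n = 4, k = 152/380, powers taken from the FIG. 2 example.
    args = dict(n=4, k=0.4, t_alpha=10.0, t_c=0.1, p_beta=0.5, p_gamma=0.0115)
    print(energy_method3_expanded(**args), energy_method3(**args))
```

Both functions should agree up to floating-point rounding, which confirms that the simplification in equation (3) is algebraically consistent.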

Similarly, equation (4) gives the energy consumption C3 [W·s] of execution method 4 in this case. The first term of equation (4) is the sum of the energy required for communication processing and the energy consumed in all idle periods, and the second term is the energy required for data processing.

$$
\begin{aligned}
C_3 &= (n-1)\,P_\alpha\,T_c + \frac{1}{n}\,P_\alpha\,T_\alpha + (n-1)\left[P_\alpha\,T_c + \frac{1}{n}\,P_\alpha\,T_\alpha + (n-2)\,P_\gamma\,T_c\right] \\
    &= (n-1)\left[2\,P_\alpha + (n-2)\,P_\gamma\right]T_c + P_\alpha\,T_\alpha \qquad (4)
\end{aligned}
$$
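As a sanity check on equation (4), the expanded and simplified forms can be evaluated side by side. This is an illustrative sketch only; the function names and sample values are hypothetical and do not come from the specification.

```python
def c3_expanded(p_alpha, p_gamma, t_alpha, t_c, n):
    """First line of equation (4), term by term."""
    first_group = (n - 1) * p_alpha * t_c + (1.0 / n) * p_alpha * t_alpha
    per_processor = p_alpha * t_c + (1.0 / n) * p_alpha * t_alpha + (n - 2) * p_gamma * t_c
    return first_group + (n - 1) * per_processor


def c3_simplified(p_alpha, p_gamma, t_alpha, t_c, n):
    """Second line of equation (4)."""
    return (n - 1) * (2 * p_alpha + (n - 2) * p_gamma) * t_c + p_alpha * t_alpha


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the formula.
    args = dict(p_alpha=1.0, p_gamma=0.1, t_alpha=10.0, t_c=0.1, n=4)
    print(c3_expanded(**args), c3_simplified(**args))
```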

Setting C2 = C3, equation (5) can be derived from equations (3) and (4). C2 = C3 holds when the two execution methods consume the same power. The parameter values satisfying C2 = C3 are therefore boundary values, and for any other parameter values one of the two execution methods is advantageous. Here, ρ denotes the ratio of the communication processing time to the data processing time (Tc/Tα).

$$\rho = \frac{k\,P_\beta - P_\alpha + P_\gamma\,(1-k)}{2\,(n-1)\,(P_\alpha - P_\gamma)} \qquad (5)$$
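The algebra leading from C2 = C3 to equation (5) can be written out as follows. This is a sketch of the intermediate steps using only the symbols already defined; dividing through assumes Tα > 0 and Pα ≠ Pγ.

```latex
\begin{aligned}
P_\beta k T_\alpha + P_\gamma\left[(1-k)T_\alpha + n(n-1)T_c\right]
  &= (n-1)\left[2P_\alpha + (n-2)P_\gamma\right]T_c + P_\alpha T_\alpha \\
T_\alpha\left[kP_\beta - P_\alpha + P_\gamma(1-k)\right]
  &= (n-1)\,T_c\left[2P_\alpha + (n-2)P_\gamma - nP_\gamma\right] \\
  &= 2(n-1)(P_\alpha - P_\gamma)\,T_c \\
\rho = \frac{T_c}{T_\alpha}
  &= \frac{kP_\beta - P_\alpha + P_\gamma(1-k)}{2(n-1)(P_\alpha - P_\gamma)}
\end{aligned}
```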

Comparing the ρ obtained from equation (5) with ρ3, the value selected for power-saving execution by execution method 2, shows which of execution methods 3 and 4 is superior: execution method 3 should be applied if ρ < ρ3, and execution method 4 if ρ > ρ3. The discussion so far, based on FIG. 4, concerns the case where the processing time of execution method 4 is longer than that of execution method 3. In the opposite case, equations (3) and (4) take different forms, but the same equation (5) is derived. However, when n = 2 or 3, depending on the relative magnitudes of the transmission processing time TS and the reception processing time TR, the arithmetic processor 14-1 of execution method 4 shown in FIG. 4, for example, may also be left idle. If TS = TR is assumed, ρ is given by equation (5) even when n = 2 or 3.
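The comparison described above can be sketched in a few lines. The function names and sample values below are hypothetical; the code simply evaluates the right-hand side of equation (5) and applies the stated rule (ρ < ρ3 selects execution method 3, ρ > ρ3 selects execution method 4). The text does not say what to do when the two values are equal, so that case is returned as undecided.

```python
def boundary_rho(p_alpha, p_beta, p_gamma, k, n):
    """Right-hand side of equation (5): the rho at which C2 equals C3."""
    if n < 2:
        raise ValueError("equation (5) is defined for n >= 2")
    numerator = k * p_beta - p_alpha + p_gamma * (1 - k)
    denominator = 2 * (n - 1) * (p_alpha - p_gamma)
    return numerator / denominator


def choose_method_3_or_4(rho, rho_3):
    """Apply the rule from the text; None means the two methods tie."""
    if rho < rho_3:
        return 3
    if rho > rho_3:
        return 4
    return None


if __name__ == "__main__":
    # Hypothetical power figures and frequency ratio, for illustration only.
    rho = boundary_rho(p_alpha=1.0, p_beta=8.0, p_gamma=0.1, k=0.2, n=4)
    print(rho, choose_method_3_or_4(rho, rho_3=0.2))
```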

FIG. 5 shows the value of ρ against the number of arithmetic processors (n ≥ 2) when the values of FIG. 2 are substituted for the parameters on the right-hand side of equation (5). Which of execution methods 3 and 4, and which of execution methods 1 and 4, is superior can be determined from FIG. 5 once the target task has been analyzed and the optimal number of processors for power saving under execution method 4, together with the corresponding value of ρ, has been obtained.

FIG. 6 shows, for suitable values of ρ (≤ 0.05), the ratio (E3/E4) of the power consumption when execution method 3 is selected and executed to that of execution method 4. If ρ ≤ 0.05, parallel processing is always beneficial when the number of processors is in the range of 2 to 20. The result (FIG. 6) confirms that, for a constant ρ, the ratio decreases as the number of processors increases, whereas the ratio increases as ρ decreases. Therefore, if ρ decreases as the number of arithmetic processors increases, the ratio falls more slowly over that range.

In this way, the relationship between the number of processors and ρ, as in FIG. 5, is obtained in advance from the ratio of communication processing time to data processing time and from the operating frequencies and power consumption of the arithmetic processors, and is stored in a storage area such as the subtask attribute information file 12. Then, in step S109, the control processor 13 selects either execution method 3 or execution method 4 according to the relationship of equation (5).
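One way to picture the precomputed relationship is as a table keyed by the number of processors. The sketch below is an assumption-laden illustration: the JSON encoding, the function names, and the sample power values are all hypothetical, and the specification does not describe the actual layout of the subtask attribute information file 12.

```python
import json


def boundary_rho(p_alpha, p_beta, p_gamma, k, n):
    """Right-hand side of equation (5) for n >= 2 processors."""
    return (k * p_beta - p_alpha + p_gamma * (1 - k)) / (2 * (n - 1) * (p_alpha - p_gamma))


def build_rho_table(p_alpha, p_beta, p_gamma, k, n_max):
    """Precompute rho for every processor count from 2 to n_max."""
    return {n: boundary_rho(p_alpha, p_beta, p_gamma, k, n) for n in range(2, n_max + 1)}


def encode_table(table):
    """Serialize the table; JSON object keys must be strings."""
    return json.dumps({str(n): rho for n, rho in table.items()})


def decode_table(text):
    """Restore integer processor counts when reading the table back."""
    return {int(n): rho for n, rho in json.loads(text).items()}


if __name__ == "__main__":
    # Hypothetical parameters standing in for the values of FIG. 2.
    table = build_rho_table(p_alpha=1.0, p_beta=8.0, p_gamma=0.1, k=0.2, n_max=20)
    stored = encode_table(table)           # what would be written to storage
    print(decode_table(stored)[4])         # looked up later at selection time
```

Computing the table once and looking it up at selection time keeps the run-time decision to a single comparison per task.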

The above example describes, as the dependency between subtasks, the communication processing for distributing the subtasks to the arithmetic processors 14-1 to 14-N. It is straightforward to extend this to other dependencies and derive relationships corresponding to equations (3) to (5).

As for the choice between execution methods 3 and 5, both use the same operating frequency, so execution method 3 is selected if both complete within the time limit. This is because using fewer processors saves more power.

As for the choice between execution methods 4 and 5, if execution method 4 requires more processing time than execution method 5, the power consumption C5 of execution method 5 is as follows.

$$
\begin{aligned}
C_5 &= (m-1)\left[2\,P_\beta + (m-2)\,P_\gamma\right]k\,T_c + P_\beta\,T_\beta \\
    &\quad + P_\gamma\left\{T_c\left[n\,(n-1) - k\,m\,(m-1)\right] + T_\alpha\,(1-k)\right\} \qquad (6)
\end{aligned}
$$

Here, the first and second terms of equation (6) are the power consumed by the processors to which processing is assigned. The third and fourth terms are the power consumed by the idle processors and by the processors that were assigned processing but have completed it and are waiting, and are therefore idle.

Therefore, the difference C5 − C3 in power consumption between execution method 4 and execution method 5 is given by equation (7):

$$
\begin{aligned}
C_5 - C_3 &= T_c\left\{2\,k\,(m-1)\,(P_\beta - P_\gamma) - 2\,(n-1)\,(P_\alpha - P_\gamma)\right\} \\
          &\quad + T_\alpha\left\{k\,(P_\beta - P_\gamma) - (P_\alpha - P_\gamma)\right\} \qquad (7)
\end{aligned}
$$

Equation (7) can be used to determine which of execution method 4 and execution method 5 is superior. If execution method 5 instead requires more processing time than execution method 4, equation (6) takes a different form, but the same equation (7) is derived.
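Equation (7) can be cross-checked by evaluating equations (4) and (6) directly and subtracting. The sketch below rests on two assumptions that are not stated in this passage: Tβ = k·Tα (the single-processor execution time at frequency β), which is what makes the Tα terms of (7) come out as written, and m being the number of processors used by execution method 5. Function names and sample values are hypothetical.

```python
def c3(p_alpha, p_gamma, t_alpha, t_c, n):
    """Simplified form of equation (4)."""
    return (n - 1) * (2 * p_alpha + (n - 2) * p_gamma) * t_c + p_alpha * t_alpha


def c5(p_beta, p_gamma, t_alpha, t_c, k, n, m):
    """Equation (6), with T_beta = k * T_alpha (assumption)."""
    t_beta = k * t_alpha
    assigned = (m - 1) * (2 * p_beta + (m - 2) * p_gamma) * k * t_c + p_beta * t_beta
    idle = p_gamma * (t_c * (n * (n - 1) - k * m * (m - 1)) + t_alpha * (1 - k))
    return assigned + idle


def c5_minus_c3(p_alpha, p_beta, p_gamma, t_alpha, t_c, k, n, m):
    """Equation (7)."""
    comm = t_c * (2 * k * (m - 1) * (p_beta - p_gamma) - 2 * (n - 1) * (p_alpha - p_gamma))
    data = t_alpha * (k * (p_beta - p_gamma) - (p_alpha - p_gamma))
    return comm + data


def choose_method_4_or_5(diff):
    """A negative C5 - C3 means method 5 uses less power; ties go to 4 (arbitrary)."""
    return 5 if diff < 0 else 4


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    diff = c5_minus_c3(1.0, 8.0, 0.1, 10.0, 0.1, 0.2, 4, 2)
    print(diff, choose_method_4_or_5(diff))
```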

Finally, the control processor 13 distributes the subtasks to the arithmetic processors 14-1 to 14-N according to the execution method determined in step S109, and instructs them to execute the subtasks (step ST110).

As described above, the parallel computer according to the first embodiment of the present invention divides a task into subtasks, selects one of execution methods 1 to 4 based on the dependencies between the subtasks, and executes the task in parallel. The total power consumption of the plurality of processors can therefore be reduced while the processing constraint time of the task is satisfied.

In the above description, the control processor 13 is a dedicated processor that distributes subtasks. However, since the load on the control processor 13 may be lower than that on the arithmetic processors 14-1 to 14-N, the control processor 13 may be configured to also serve as one of the arithmetic processors 14-1 to 14-N.

The present invention is widely applicable to computer processing systems intended for parallel operation, such as a parallel computer system in which a plurality of computers are arranged in a cluster, or a parallel processor having a plurality of operation instruction processing units.

Claims

A parallel computer that divides a task into a plurality of subtasks and executes the divided subtasks, comprising:
task dividing means for dividing the task into a plurality of subtasks executable by a processor;
a subtask information file that holds information on the processing time of the subtasks based on the operating frequency of the processor;
a plurality of processors for executing the subtasks divided by the task dividing means; and
processor control means for selecting the number of processors and the operating frequency based on the information on the processing time of the subtasks based on the operating frequency of the processor, a predetermined processing time limit for the task, information on the time required for communication processing between the plurality of processors, and information on the power consumption based on the operating frequency of the processor, and for distributing the subtasks divided by the task dividing means to the processors,
wherein
the processor control means acquires, from the subtask information file, the information on the processing time of the subtasks based on the operating frequency of the processor, calculates the task processing time for the case where all the subtasks are processed by all the processors in a high-speed operation state, in which the processors operate faster than in the standard state, and distributes all the subtasks to all the processors when that task processing time is longer than the task processing time limit.

A parallel computer that divides a task into a plurality of subtasks and executes the divided subtasks, comprising:
task dividing means for dividing the task into a plurality of subtasks executable by a processor;
a subtask information file that holds information on the processing time of the subtasks based on the operating frequency of the processor;
a plurality of processors for executing the subtasks divided by the task dividing means; and
processor control means for selecting the number of processors and the operating frequency based on the information on the processing time of the subtasks based on the operating frequency of the processor, a predetermined processing time limit for the task, information on the time required for communication processing between the plurality of processors, and information on the power consumption based on the operating frequency of the processor, and for distributing the subtasks divided by the task dividing means to the processors,
wherein
the processor control means acquires, from the subtask information file, the information on the processing time of the subtasks based on the operating frequency of the processor, and calculates the task processing time for the case where all the subtasks are processed by all the processors in a high-speed operation state, in which the processors operate faster than in the standard state; when that task processing time is shorter than the task processing time limit, calculates the task processing time for the case where all the subtasks are processed by one processor in the standard state; and, when the latter task processing time is longer than the task processing time limit, selects the number of processors and the operating frequency that reduce the power consumption, based on the information on the processing time of the subtasks based on the operating frequency of the processor, the information on the power consumption based on the operating frequency of the processor, and the information on the time required for communication processing between the plurality of processors, and distributes the subtasks to the processors.