WO2006011189A1 - Parallel computer - Google Patents

Parallel computer

Info

Publication number
WO2006011189A1
WO2006011189A1 PCT/JP2004/010610 JP2004010610W WO2006011189A1 WO 2006011189 A1 WO2006011189 A1 WO 2006011189A1 JP 2004010610 W JP2004010610 W JP 2004010610W WO 2006011189 A1 WO2006011189 A1 WO 2006011189A1
Authority
WO
WIPO (PCT)
Prior art keywords
subtask
processor
task
processors
attribute information
Prior art date
Application number
PCT/JP2004/010610
Other languages
French (fr)
Japanese (ja)
Inventor
Atsuo Ozaki
Original Assignee
Mitsubishi Denki Kabushiki Kaisha
Priority date
Filing date
Publication date
Application filed by Mitsubishi Denki Kabushiki Kaisha
Priority to JP2006527723A (patent JP4082439B2)
Priority to PCT/JP2004/010610 (patent WO2006011189A1)
Publication of WO2006011189A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5094 Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5017 Task decomposition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention relates to a parallel computer in which a single task is divided into a plurality of execution units and each execution unit is processed in parallel by a plurality of processors, and in particular to a technique for saving power consumption while maintaining the processing capability of the entire parallel computer.
  • The present invention further relates to a technique for saving the power consumption of the entire parallel computer while satisfying the constraint on processing completion time imposed on each task.
  • Portable information devices such as mobile phones and notebook computers are required to be lightweight.
  • However, these devices often incorporate large-capacity batteries in order to drive processors with a high operating frequency for a long time.
  • Because a large-capacity battery is also heavy, it is a major obstacle to reducing the weight of portable information devices.
  • To extend operating time while making the battery smaller and lighter, a technique is known that changes the operating frequency of the processor according to the type and content of processing. It is based on the principle that power consumption can be saved by operating the processor at a low operating frequency.
  • Patent Document 1: Japanese Patent Application Laid-Open No. 2002-99432
  • Patent Document 2: Japanese Patent Application Laid-Open No. 2002-215599
  • The method of Patent Document 2 takes the processing completion time of other processors as its reference, not the time constraint of the processing itself. Therefore, the method disclosed in Patent Document 2 cannot be applied to a system whose processing has time constraints.
  • The method of Patent Document 1 presupposes a system composed of a single processor.
  • When applied to a multiprocessor system, it clearly cannot be used unless the tasks, which are its minimum processing units, have no dependencies on one another or the influence of those dependencies can be ignored.
  • The present invention was made to solve these problems, and its object is to provide a computer that completes a single task by parallel processing within the requested processing time while reducing power consumption.
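  • The parallel computer according to the present invention divides a task into a plurality of processing units and executes the divided processing units in parallel. It comprises:
  • task dividing means that divides the task into a plurality of processing units executable on individual processors and outputs them as subtasks;
  • a subtask attribute information file that holds attribute information of the subtasks divided by the task dividing means;
  • a plurality of processors whose power consumption can be controlled from outside and which execute the subtasks divided by the task dividing means; and
  • processor control means that, based on the attribute information held in the subtask attribute information file, distributes the subtasks to the plurality of processors, instructs execution of the subtasks, and controls the power consumption of the plurality of processors.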
  • The concept of a subtask naturally includes a partial instruction code sequence obtained by dividing the instruction code sequence that constitutes a task, but it is not limited to this. Instead of dividing the instruction code itself, the data to be processed by the task may be divided into several pieces, so that the processing units are split along the data.
  • According to the present invention, subtasks are distributed to a plurality of processors, and the power consumption of each processor is controlled, based on the attribute information of the subtasks divided from the task. As a result, power consumption can be reduced while satisfying the task's execution time constraint.
  • FIG. 1 is a block diagram showing a configuration of a parallel computer according to Embodiment 1 of the present invention
  • FIG. 2 is a diagram showing the characteristics of the processor of the parallel computer according to Embodiment 1 of the present invention.
  • FIG. 3 is a flowchart of the parallel computer according to Embodiment 1 of the present invention.
  • FIG. 4 is a diagram for explaining a method for selecting an execution method according to the first embodiment of the present invention.
  • FIG. 5 is a diagram showing the relationship of boundary values to be considered when selecting among the execution methods.
  • FIG. 6 is a diagram showing the relationship between the number of processors and power consumption.
  • FIG. 1 is a block diagram showing a configuration of a parallel computer according to Embodiment 1 of the present invention.
  • The task input terminal 10 is an input terminal for inputting a task to be processed by the parallel computer.
  • A task is a unit of work inside a central processing unit (CPU): a predetermined processing unit composed of a combination of computer instruction codes. The size of a task is often chosen so that it is easy for computer operators and system administrators to handle. However, whatever processing unit constitutes one task, the features of the present invention are not lost.
  • The task input terminal 10 is provided so that a task can be input from outside.
  • Alternatively, the computer may be configured to autonomously acquire tasks stored in an external storage device under the control of the operating system. Computer systems with such a configuration are very common and need not be explained here.
  • The task dividing means 11 divides a single task input from the task input terminal 10 into a plurality of subtasks.
  • The subtask attribute information file 12 is a file that stores additional information about each subtask. It is data held in a random access memory (RAM), a fixed disk device, or another storage device, storage element, or storage circuit. The subtask attribute information file 12 need not exist as a physically separate file: for example, the attribute information may be stored inside the executable program file of the task input from the task input terminal 10 (the binary program file that holds the instruction code and static data), and that file may be handled as the subtask attribute information file 12.
  • The control processor 13, serving as the processor control means, refers to the subtask attribute information file 12, distributes the subtasks divided by the task dividing means 11 to the arithmetic processors 14-1 to 14-N, and instructs each processor that receives a subtask to process it.
  • The control processor 13 also controls the power consumption of the arithmetic processors 14-1 to 14-N, with the aim of reducing power consumption while satisfying the task's execution time constraint.
  • A task can be divided into subtasks either by dividing its instruction code string into strings with fewer steps or by dividing the data it processes into smaller pieces. Strictly speaking, a subtask formed by dividing instruction code is "executed" and a subtask formed by dividing data is "processed". To simplify the notation, the expression "process a subtask" is used uniformly here, and it includes the meaning of "execute a subtask".
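The data-division form of subtask described above can be illustrated with a minimal Python sketch. The function name `divide_data` and the near-equal chunking rule are assumptions made for illustration; the text does not specify how the data is partitioned.

```python
def divide_data(data, parts):
    """Split `data` into `parts` contiguous chunks of near-equal size.

    Each chunk stands for one data-division subtask. The chunking rule
    (earlier chunks get the extra elements) is an illustrative assumption.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    size, extra = divmod(len(data), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    print(divide_data(list(range(10)), 3))
```

Each returned chunk could then be handed to one arithmetic processor, with the same instruction code applied to every chunk.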
  • The arithmetic processors 14-1 to 14-N are arithmetic devices or circuits that process the subtasks divided by the task dividing means 11, and their power consumption can be controlled from outside. One way to control power consumption is for each arithmetic processor to provide an interface that directly changes its power consumption, so that the power consumption is changed through this interface. Alternatively, if a processor decodes and executes the instruction code of each subtask based on a clock signal supplied from outside, its power consumption can be changed by changing that clock signal.
  • FIG. 2 is a diagram showing an example of the characteristics of the arithmetic processors 14-1 to 14-N.
  • Each of the arithmetic processors 14-1 to 14-N can select among at least three operation states: a "high-speed operation state", a "standard operation state", and an "idle state".
  • In the high-speed operation state, an arithmetic processor operates at an operating frequency of 380 MHz and a voltage of 1.8 V and consumes 0.5 W.
  • In the standard operation state, it operates at an operating frequency of 152 MHz and a voltage of 1.0 V and consumes 0.053 W.
  • In the idle state, the operating frequency is 33 MHz, the voltage is 1.0 V, and the power consumption is 0.0115 W.
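The FIG. 2 figures listed above can be collected into a small table to compare the energy cost per clock cycle of each state. This is a minimal Python sketch; the class name `OperatingState` and the assignment of each row to a state name (inferred from the order of the list) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingState:
    name: str
    freq_mhz: float   # operating frequency in MHz
    voltage_v: float  # supply voltage in volts
    power_w: float    # power consumption in watts

    def energy_per_megacycle(self) -> float:
        """Energy in joules needed to run one million clock cycles."""
        return self.power_w / self.freq_mhz


# Values as listed for FIG. 2; the row-to-state mapping is an assumption.
HIGH_SPEED = OperatingState("high-speed", 380.0, 1.8, 0.5)
STANDARD = OperatingState("standard", 152.0, 1.0, 0.053)
IDLE = OperatingState("idle", 33.0, 1.0, 0.0115)

if __name__ == "__main__":
    for state in (HIGH_SPEED, STANDARD, IDLE):
        mj = state.energy_per_megacycle() * 1e3
        print(f"{state.name}: {mj:.3f} mJ per 10^6 cycles")
```

With these values the high-speed state spends roughly 3.8 times as much energy per clock cycle as the standard state, which is why running more slowly can save energy whenever the deadline permits it.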
  • It is generally known that the power an electronic circuit consumes per unit time increases as its operating frequency increases. When leakage power is ignored, the relationship among power consumption P, operating frequency F, and power supply voltage V is given by equation (1), where t is the signal transition rate and C is the capacitance.
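Equation (1) itself does not appear in this text. Assuming it is the standard dynamic power relation for CMOS circuits, which uses exactly the symbols named above, it would read:

```latex
P = t \cdot C \cdot V^{2} \cdot F \tag{1}
```

Under this assumed form, two states at the same voltage should differ in power only by their frequency ratio. The standard and idle states of FIG. 2 (both at 1.0 V) fit this: 0.053 W × 33/152 ≈ 0.0115 W, which matches the listed idle power.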
  • The arithmetic processors 14-1 to 14-N have, for example, a configuration that switches among three operation states: the "high-speed operation state", the "standard operation state", and the "idle state".
  • However, the processors that can be used in the present invention are not limited to this example.
  • Because clock speed can vary with the temperature of the environment in which the computer is placed, practical commercial processors are built with a margin for variation in the external clock. Such a processor runs faster when the external clock is raised and slower when it is lowered. Therefore, even with a commercial processor that, unlike the example above, does not explicitly support multiple operating states, the features of the present invention can be used by deliberately exploiting this margin for external clock variation. Processors capable of reducing power consumption are now widely used in mobile applications and are well known in the art, so they are not described further here.
  • The arithmetic processors 14-1 to 14-N should not be construed as being limited to, for example, independent LSI components.
  • For example, a vector processor is a single arithmetic device, yet it can execute a plurality of operations in parallel.
  • The configuration of the computer shown in FIG. 1 also covers such a configuration.
  • The control processor 13 and the arithmetic processors 14-1 to 14-N can each be replaced with a complete computer such as a personal computer or a workstation. That is, the present invention can also be applied to a parallel computing system in which a plurality of computers are combined.
  • The task dividing means 11 may be configured as an independent control circuit or control device, or as a computer program executed by the control processor 13.
  • The control processor 13 may also be configured as the fetch unit of a general processor architecture. If tasks and subtasks are regarded as machine-language instruction codes and microcode inside a processor, the entire system shown in FIG. 1 can be regarded as a single processor. In this case, array processing by a vector operation is regarded as a task, and the processing of each element of the array is regarded as a subtask. The task dividing means 11 then corresponds to a vectorizing compiler (a language processor that supports vector processing instructions) and to a decoder that decodes vector operation instructions into microcode. Such compiler technology is already known.
  • The relationship between processes and threads may likewise be considered to correspond to the relationship between tasks and subtasks.
  • The relationship between tasks and subtasks is defined flexibly according to the system design. In this way, the configuration in FIG. 1 can be applied at various levels.
  • FIG. 3 is a flowchart showing the operation of this parallel computer.
  • The task dividing means 11 divides the task into subtasks (step ST101).
  • The control processor 13 acquires the task's processing time limit T (step ST102).
  • The processing time limit T is a value predetermined by the system.
  • T is determined from the purpose of the user or the system. If the system performs signal processing on input signals (for example, observations) that arrive at fixed intervals (the sampling time), then the sampling time, which is the period at which these signals are acquired, corresponds to the processing time limit T.
  • Alternatively, the processing time limit T may be determined from the configuration of the parallel computer rather than from an external specification. For example, when configuring a processor that completes most instructions within one external clock cycle, the time corresponding to one external clock cycle is the processing time limit T.
  • The control processor 13 calculates the task processing time tmin for the case where the arithmetic processors 14-1 to 14-N are set to the high-speed operation state (step ST103).
  • To calculate this, the estimated completion time of each subtask must be known in advance. Therefore, for example, the processing time of each subtask on any of the processors 14-1 to 14-N in the high-speed operation state and in the standard operation state is measured in advance and stored in the subtask attribute information file 12. The control processor 13 then obtains the processing time of each subtask according to its type and calculates the task processing time tmin.
  • Alternatively, the processing time of a subtask may be measured for only one of the high-speed operation state and the standard operation state.
  • The processing time in the other state may then be approximated by multiplying the measured time by the ratio of the operating frequency of the measured state to the operating frequency of the other state.
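The lookup and frequency-ratio approximation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table layout, the subtask type names, and the measured times are all hypothetical, and the sketch covers only the simple case where one processor runs every subtask back to back.

```python
# Hypothetical attribute table: subtask type -> one measured time and the
# operating frequency at which it was measured.
SUBTASK_ATTRIBUTES = {
    "filter": {"measured_s": 0.004, "measured_mhz": 380.0},
    "fft": {"measured_s": 0.010, "measured_mhz": 380.0},
}


def scale_time(measured_s, measured_mhz, target_mhz):
    """Approximate the time at target_mhz from a measurement at measured_mhz.

    Assumes processing time is inversely proportional to operating frequency.
    """
    return measured_s * measured_mhz / target_mhz


def serial_time(subtask_types, target_mhz, attributes=SUBTASK_ATTRIBUTES):
    """Total time if a single processor processes all subtasks in sequence."""
    total = 0.0
    for kind in subtask_types:
        entry = attributes[kind]
        total += scale_time(entry["measured_s"], entry["measured_mhz"], target_mhz)
    return total


if __name__ == "__main__":
    # Estimate the standard-state (152 MHz) time from 380 MHz measurements.
    print(serial_time(["filter", "fft"], 152.0))
```

Estimating the time across several processors would additionally need a model of how subtasks are scheduled and how they communicate, which this sketch does not attempt.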
  • If the result of step ST104 is Yes, the arithmetic processors 14-1 to 14-N have processing capacity to spare for the subtasks to be processed, so the process moves to the power-saving processing from step ST105 onward.
  • Step ST106: execution method 1.
  • The processing from step ST111 onward is described later.
  • In step ST105, the control processor 13 sets any one of the arithmetic processors 14-1 to 14-N to the standard operation state and calculates the processing time tstd required when all subtasks are processed only by that processor.
  • tstd is calculated from the subtask processing times in the same way as tmin in step ST103. If tstd exceeds the processing time limit T (ST107: Yes), processing by a single one of the processors 14-1 to 14-N cannot satisfy the requirement to complete the task within T, so the process proceeds to parallel processing using a plurality of processors from step ST109 onward.
  • Otherwise, the arithmetic processor 14-1 is set to the standard operation state (step ST108), and the other processors, that is, the arithmetic processors 14-2 to 14-N, are set to the idle state.
  • In step ST109, one of the following processing methods (execution method 3 or execution method 4) is selected based on the nature of the subtasks and of each processor (operating frequency, power consumption), and the number of processors n and the operating frequency used for subtask processing are calculated for the selected method.
  • Execution method 3: one of the arithmetic processors 14-1 to 14-N is selected, its operating frequency is set to the high-speed operating frequency β, and it performs the subtasks. The arithmetic processors other than the selected one are set to the idle state.
  • Execution method 4: n of the arithmetic processors 14-1 to 14-N are selected (2 ≤ n ≤ N), their operating frequency is set to the standard operating frequency α, and the subtasks are executed by these n processors. The arithmetic processors other than the selected n are set to the idle state.
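The branching of steps ST104 to ST109 can be summarized in a short Python sketch. The text does not fully state the branch conditions, so the comparisons below are assumptions: ST104 is read as "tmin fits within T", and whether each boundary is inclusive is a guess. The function name and the returned labels are illustrative only.

```python
def choose_execution_method(t_min, t_std, limit):
    """Return a label for the branch taken in the flow of FIG. 3.

    t_min: estimated task time with the processors in the high-speed state
    t_std: estimated task time on one processor in the standard state
    limit: processing time limit T
    """
    if t_min > limit:
        # Assumed reading of ST104 "No": no spare capacity, go to ST106.
        return "execution method 1 (ST106)"
    if t_std <= limit:
        # ST107 "No": one standard-state processor suffices, rest idle.
        return "single standard-state processor (ST108)"
    # ST107 "Yes": parallel or high-speed processing is needed.
    return "execution method 3 or 4 (ST109)"


if __name__ == "__main__":
    print(choose_execution_method(t_min=0.5, t_std=2.0, limit=1.0))
```

The choice between execution methods 3 and 4 inside ST109 depends on the energy comparison discussed next, so this sketch stops at naming the branch.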
  • FIG. 4 shows example time charts of execution method 3 and execution method 4 within the processing constraint time T. The two differ only in the part inside the bold frame, so the power consumption is compared for that part.
  • Because the processing time of execution method 4 is longer than that of execution method 3, the processing constraint time T can be expressed as equation (2).
  • Here $T_\alpha$ is the execution time when one piece of processing data is processed by one processor at the operating frequency α, and n is the number of processors.
  • $T = (n-1) \cdot T_c + T_\alpha / n$ (2)
  • Equation (3) gives the energy consumption C2 [W·s] of execution method 3 in this case.
  • The first term of equation (3) is the energy required to process the data at the operating frequency β.
  • The second term is the energy consumed by the idle arithmetic processors shown in FIG. 4 and by the working processor (FIG. 4: processor 14-1) during its idle period after data processing.
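  • Here $k = \alpha / \beta$.
  • $C_2 = P_\beta \cdot k \cdot T_\alpha + k \cdot [\ldots] \cdot (n-1)$ (3), where $[\ldots]$ marks symbols that are illegible in the source.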
  • Equation (4) gives the energy consumption C3 [W·s] of execution method 4 in this case.
  • The first term of equation (4) is the sum of the energy required for communication processing and the energy consumed in all idle states, and the second term is the energy required for data processing.
  • $C_3 = (n-1) \cdot P_\alpha \cdot T_c + (1/n) \cdot [\ldots]$ (4), where $[\ldots]$ marks symbols that are illegible in the source.
  • Equation (5) can be derived from equations (3) and (4).
  • ρ represents the ratio of communication processing time to data processing time ($T_c / T_\alpha$).
  • FIG. 5 shows the value of ρ against the number of arithmetic processors (n ≥ 2) when the values in FIG. 2 are substituted for the parameters on the right side of equation (5).
  • The relative merits of execution methods 3 and 4, and of execution methods 1 and 4, can be judged from FIG. 5 by analyzing the target task and finding the number of processors that is optimal for power saving under execution method 4, together with the ρ value in that case.
  • FIG. 6 shows the ratio (E3/E4) of the energy consumed when execution method 3 is selected and executed to the energy consumed by execution method 4, for a suitable ρ (≤ 0.05). If ρ ≤ 0.05, parallel processing is always beneficial when the number of processors is in the range 2 to 20. FIG. 6 confirms that, when ρ is constant, this ratio decreases as the number of processors increases. Conversely, as ρ decreases, the ratio increases. Therefore, if ρ becomes smaller as the number of arithmetic processors increases, the ratio falls more slowly in that regime.
  • In step ST109, the control processor 13 selects execution method 3 or execution method 4 according to the relationship of equation (5).
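The kind of comparison made in step ST109 can be illustrated with a simplified energy model in Python. This is not equations (3) to (5), which are only partly legible here; it is an assumed accounting in which a processor draws its active power while computing or communicating and idle power otherwise. The window length is taken as (n − 1)·Tc + Tα/n, which is how equation (2) appears to read. All function names are hypothetical.

```python
def window(n, t_alpha, t_c):
    """Length of the compared time window (assumed form of equation (2))."""
    return (n - 1) * t_c + t_alpha / n


def energy_method3(n, t_alpha, t_c, k, p_beta, p_gamma):
    """One processor at high speed, then idle; the other n - 1 idle throughout."""
    t = window(n, t_alpha, t_c)
    busy = k * t_alpha  # processing time at the high-speed frequency
    if busy > t + 1e-12:
        raise ValueError("high-speed processing does not fit in the window")
    idle_after = max(0.0, t - busy)
    return p_beta * busy + p_gamma * (idle_after + (n - 1) * t)


def energy_method4(n, t_alpha, t_c, p_alpha, p_gamma):
    """n processors at standard speed; idle whenever not computing or communicating."""
    t = window(n, t_alpha, t_c)
    active = t_alpha + (n - 1) * t_c  # total busy processor-time
    idle = max(0.0, n * t - active)
    return p_alpha * active + p_gamma * idle


def cheaper_method(n, t_alpha, rho, k, p_alpha, p_beta, p_gamma):
    """Return 3 or 4, whichever has the lower modelled energy."""
    t_c = rho * t_alpha
    e3 = energy_method3(n, t_alpha, t_c, k, p_beta, p_gamma)
    e4 = energy_method4(n, t_alpha, t_c, p_alpha, p_gamma)
    return 3 if e3 <= e4 else 4


if __name__ == "__main__":
    # Powers and frequencies as listed for FIG. 2; k = 152 / 380.
    print(cheaper_method(n=2, t_alpha=1.0, rho=0.05, k=152.0 / 380.0,
                         p_alpha=0.053, p_beta=0.5, p_gamma=0.0115))
```

Under this model and the FIG. 2 values, two standard-state processors with ρ = 0.05 use less energy than one high-speed processor, but the numbers should be read as an illustration of the trade-off, not as a reproduction of FIG. 5 or FIG. 6.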
  • $C_5 = (m-1) \cdot [\, 2 \cdot P_\beta + (m-2) \cdot P_\gamma \,] \cdot k \cdot T_c + P_\beta \cdot T_\beta$ (6)
  • In equation (6), the first and second terms are the energy consumed by the processors to which processing is assigned, and the third and fourth terms are the energy consumed by idle processors and by processors that were assigned processing but are idle while waiting after completing it.
  • $C_5 - C_3 = T_c \cdot \{\, 2 \cdot k \cdot (m-1) \cdot [\ldots] \,\}$, where $[\ldots]$ marks the remaining terms, which are illegible in the source.
  • The control processor 13 distributes the subtasks to the arithmetic processors 14-1 to 14-N based on the execution method determined in step ST109 and instructs them to execute the subtasks (step ST110).
  • As described above, the task is divided into subtasks, one of execution methods 1 to 4 is selected based on the subtask dependencies, and the subtasks are executed in parallel, so the total power consumption of the processors can be reduced while satisfying the task's processing constraint time.
  • In the above, the control processor 13 is a dedicated processor for distributing subtasks.
  • However, the load on the control processor 13 may be lower than that on the arithmetic processors 14-1 to 14-N, so one of the arithmetic processors 14-1 to 14-N may be configured to also serve the function of the control processor 13.
  • The present invention can be widely applied to computer processing systems that perform parallel operations, such as a parallel computer system having a plurality of computers in a cluster configuration or a parallel processor having a plurality of instruction processing units.

Abstract

It is possible to parallel-process a task within the processing limit time of the task while reducing the power consumption. A parallel computer divides a task into a plurality of processing units and executes the divided processing units in parallel. The parallel computer includes: task dividing means (11) for dividing a task into a plurality of processing units which can be executed in separate processors and outputting the divided processing units as sub-tasks; a sub-task attribute information file (12) for holding attribute information on the sub-tasks divided by the task dividing means (11); processors (14-1 to 14-N) configured so that the power consumption amount is controlled from outside and executing the sub-tasks divided by the task dividing means (11); and processor control means (13) for distributing the sub-tasks divided by the task dividing means (11), to the processors (14-1 to 14-N) according to the sub-task attribute information held in the sub-task attribute information file (12), instructing execution of the sub-tasks, and controlling the power consumption amount of the processors (14-1 to 14-N).

Description

Specification
Parallel computer
Technical field
[0001] The present invention relates to a parallel computer in which a single task is divided into a plurality of execution units and each execution unit is processed in parallel by a plurality of processors, and in particular to a technique for saving power consumption while maintaining the processing capability of the entire parallel computer.
[0002] The present invention further relates to a technique for saving the power consumption of the entire parallel computer while satisfying the constraint on processing completion time imposed on each task.
Background art
[0003] Portable information devices such as mobile phones and notebook computers are required to be lightweight. However, these devices often incorporate large-capacity batteries in order to drive processors with a high operating frequency for a long time. Because a large-capacity battery is also heavy, it is a major obstacle to reducing the weight of portable information devices.
[0004] To extend operating time while making the battery smaller and lighter, a technique is known that changes the operating frequency of the processor according to the type and content of processing. It is based on the principle that power consumption can be saved by operating the processor at a low operating frequency.
[0005] Portable information devices are also required to perform processing with time constraints, such as multimedia data processing, and real-time processing is often required, as in embedded systems. Japanese Patent Application Laid-Open No. 2002-99432 (hereinafter, Patent Document 1) is known as a power-saving technique that changes the operating frequency as appropriate while targeting processing with such time constraints.
[0006] This technique schedules tasks while judging whether the processing time requirement of each time-constrained task is satisfied, and, when there is slack in the processing time requirements of the tasks as a whole, saves power by changing the operating frequency and power supply voltage of the processor.
[0007] As a technique for speeding up processing, besides operating a processor at a high operating frequency, combining a plurality of processors for parallel processing is also widely used. Japanese Patent Application Laid-Open No. 2002-215599 (hereinafter, Patent Document 2) is known as a technique for saving power by controlling the operating frequency of each processor in such a multiprocessor system.
[0008] In the method of Patent Document 2, when a plurality of processors process a plurality of tasks and some processors complete their processing earlier than others, the operating frequency and power supply voltage of those processors are kept low in accordance with the processing completion time of the other processors, thereby reducing power consumption.
Disclosure of the invention
Problems to be solved by the invention
[0009] However, the method of Patent Document 2 takes the processing completion time of other processors as its reference, not the time constraint of the processing itself. Therefore, the method disclosed in Patent Document 2 cannot be applied to a system whose processing has time constraints.
[0010] The method of Patent Document 1, on the other hand, presupposes a system composed of a single processor. When applied to a multiprocessor system, it clearly cannot be used unless the tasks, which are its minimum processing units, have no dependencies on one another or the influence of those dependencies can be ignored.
[0011] In the field of parallel computers, parallel algorithms in which the processors cooperate to solve a single problem (task) have been widely studied. However, neither the method of Patent Document 1 alone nor a combination of the methods of Patent Documents 1 and 2 can make use of these research results.
[0012] The present invention was made to solve these problems, and its object is to provide a computer that completes a single task by parallel processing within the requested processing time while reducing power consumption.
Means for Solving the Problems
[0013] A parallel computer according to the present invention divides a task into a plurality of processing units and executes the divided processing units in parallel, and comprises:
task dividing means for dividing the task into a plurality of processing units each executable by an individual processor, and outputting the divided processing units as a plurality of subtasks;
a subtask attribute information file that holds attribute information on the subtasks produced by the task dividing means;
a plurality of processors whose power consumption can be controlled externally and which execute the subtasks produced by the task dividing means; and
processor control means for distributing the subtasks produced by the task dividing means to the plurality of processors on the basis of the subtask attribute information held in the subtask attribute information file, instructing execution of those subtasks, and controlling the power consumption of the plurality of processors.
[0014] In the above, the concept of a subtask naturally includes a partial instruction code sequence obtained by splitting the instruction code sequence that constitutes the task, but it is not limited to this. Rather than splitting the task's instruction code itself, the data to be processed by the task may be split into multiple parts so that the processing is divided into multiple units.
Effects of the Invention
[0015] As described above, the parallel computer according to the present invention distributes the subtasks divided from a task to a plurality of processors on the basis of the subtask attribute information while controlling the power consumption of each processor. It can therefore reduce power consumption while satisfying the constraint on the task's execution time.
Brief Description of the Drawings
[0016] FIG. 1 is a block diagram showing the configuration of a parallel computer according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the characteristics of the processors of the parallel computer according to Embodiment 1 of the present invention.
FIG. 3 is a flowchart of the parallel computer according to Embodiment 1 of the present invention.
FIG. 4 is a diagram for explaining the method of selecting an execution method according to Embodiment 1 of the present invention.
FIG. 5 is a diagram showing the relationship of the boundary values considered when selecting among the execution methods.
FIG. 6 is a diagram showing the relationship between the number of processors and power consumption.
Best Mode for Carrying Out the Invention
[0017] Embodiment 1.
FIG. 1 is a block diagram showing the configuration of a parallel computer according to Embodiment 1 of the present invention. In the figure, a task input terminal 10 is the input through which tasks to be processed by the parallel computer are submitted. Here, a task is a unit of work inside a central processing unit (hereinafter, CPU). Work in this sense is a prescribed unit of processing made up of a combination of several machine instruction codes, and the size of a task is often chosen so that it is easy for computer operators and system administrators to understand or handle. Whatever unit of processing constitutes a single task, however, the features of the present invention are not lost.
[0018] The figure assumes a configuration in which the task input terminal 10 is provided so that tasks are submitted from outside. Alternatively, the computer may, under the control of an operating system, autonomously fetch tasks stored in an external storage device. Computer systems of this kind are so common that no further explanation is needed here.
[0019] The task dividing means 11 is the part that divides a single task submitted through the task input terminal 10 into a plurality of subtasks.
[0020] The subtask attribute information file 12 is a file that stores additional information about each subtask. It is data held in random access memory (RAM), a fixed disk device, or some other storage device, storage element, or storage circuit. The subtask attribute information file 12 need not exist physically as a separate entity. For example, the information may be embedded in the executable program file of the task submitted through the task input terminal 10 (a binary program file containing instruction codes and static data), and that portion may be treated as the subtask attribute information file 12.
[0021] The control processor 13, which serves as the processor control means, refers to the subtask attribute information file 12 to distribute the subtasks produced by the task dividing means 11 among the plurality of processors consisting of arithmetic processors 14-1 to 14-N, and then instructs each processor that has received subtasks to process them. In addition, the control processor 13 controls the power consumption of the arithmetic processors 14-1 to 14-N, thereby reducing power consumption while satisfying the constraint on the task's execution time.
[0022] A subtask may be formed either by splitting the task's instruction code sequence into sequences with fewer steps, or by splitting the data to be processed by the task into smaller pieces. Strictly, a subtask formed by splitting the instruction code sequence is "executed", whereas a subtask formed by splitting the data is "processed". For brevity, this description uses the expression "process a subtask" throughout, and that expression should be understood to include "execute a subtask".
[0023] The arithmetic processors 14-1 to 14-N are arithmetic devices or circuits that process the subtasks produced by the task dividing means 11. Their power consumption can be controlled externally. One way to do this is for the arithmetic processors 14-1 to 14-N themselves to provide an interface through which power consumption can be changed directly. Another, where each processor decodes and executes the instruction codes of its subtasks on the basis of an externally supplied clock signal, is to change power consumption by changing that clock signal.
[0024] FIG. 2 shows an example of the characteristics of the arithmetic processors 14-1 to 14-N. As shown, each arithmetic processor can select among at least three operating states: a "high-speed operating state", a "standard operating state", and an "idle state". In the high-speed operating state, the processor runs at an operating frequency of 380 MHz and a voltage of 1.8 V and consumes 0.5 W. In the standard operating state, it runs at 152 MHz and 1.0 V and consumes 0.053 W. In the idle state, the operating frequency is 33 MHz, the voltage is 1.0 V, and the power consumption is 0.0115 W.
[0025] As these characteristics suggest, it is generally known that the power consumed per unit time by an electronic circuit rises as its operating frequency is raised. When leakage power is ignored, the relationship among power consumption P, operating frequency F, and supply voltage V is given by equation (1), where t is the signal transition rate and C is the capacitance.
[0026] $$P = t \cdot C \cdot F \cdot V^{2} \tag{1}$$
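As a rough check of equation (1), the following sketch plugs the FIG. 2 operating points into P = t·C·F·V² and backs out the implied product t·C for each state. The names `OperatingState`, `STATES`, `dynamic_power`, and `implied_tc` are illustrative assumptions and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingState:
    name: str
    freq_hz: float   # operating frequency F
    volt: float      # supply voltage V
    power_w: float   # power consumption P quoted for FIG. 2


# Operating points quoted in paragraph [0024].
STATES = {
    "high_speed": OperatingState("high_speed", 380e6, 1.8, 0.5),
    "standard": OperatingState("standard", 152e6, 1.0, 0.053),
    "idle": OperatingState("idle", 33e6, 1.0, 0.0115),
}


def dynamic_power(tc: float, freq_hz: float, volt: float) -> float:
    """Equation (1): P = t * C * F * V^2, with t*C passed as a single product."""
    return tc * freq_hz * volt ** 2


def implied_tc(state: OperatingState) -> float:
    """Back out t*C from a quoted operating point (leakage ignored)."""
    return state.power_w / (state.freq_hz * state.volt ** 2)


if __name__ == "__main__":
    for s in STATES.values():
        print(f"{s.name}: implied t*C = {implied_tc(s):.3e}")
```

With the quoted figures, the implied t·C differs by less than about 20% across the three states, which suggests that equation (1) is a reasonable first approximation for this example processor.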
[0027] By way of example, the arithmetic processors 14-1 to 14-N are configured to move among three operating states: the "high-speed operating state", the "standard operating state", and the "idle state". The processors usable in the present invention are not limited to this example.
[0028] Because clock speed can vary with the ambient temperature of the environment in which a computer is installed, practical commercially available processors have a margin for variation in the external clock. Such processors run faster when the external clock is sped up and slower when it is slowed down. Consequently, even with a commercially available processor that, unlike the example above, does not explicitly support multiple operating states, the features of the present invention can be obtained by deliberately exploiting this margin for external clock variation. Processors capable of reducing power consumption are now widely used in mobile applications and are technically well known, so they are not described in further detail here.
[0029] The arithmetic processors 14-1 to 14-N should not be construed narrowly as, for example, each being an independent LSI component. A vector processor, for instance, is a single arithmetic device yet can execute multiple operations in parallel, and the configuration of FIG. 1 covers such a device as well. Needless to say, the control processor 13 and the arithmetic processors 14-1 to 14-N may also be replaced by complete computers such as personal computers or workstations. In other words, the present invention is also applicable to a parallel computing system built from multiple computers.
[0030] The task dividing means 11 may be implemented as an independent control circuit or control device, or as a computer program executed by the control processor 13.
[0031] Furthermore, if the control processor 13 is regarded as the fetch circuit and decoder of a general processor architecture, and tasks and subtasks are regarded as that processor's machine-level instruction codes and microcode, the entire system of FIG. 1 can be viewed as a single processor. In that case, array processing by vector operations is the task, and the processing of each array element is a subtask. The task dividing means 11 then corresponds to a so-called vectorizing compiler, that is, a compiler (language processor) whose optimization generates vector instructions, together with the decoder that decodes vector instructions into microcode. Such compiler technology is already well known. Alternatively, instead of units of processing fixed at the level of the processor architecture, the relationship between a process and its threads may be mapped to the relationship between a task and its subtasks. In that case the relationship between tasks and subtasks is defined flexibly according to the system design. The configuration of FIG. 1 can thus be applied at various levels.
[0032] Next, the operation of the parallel computer according to Embodiment 1 of the present invention is described. FIG. 3 is a flowchart of this operation. When a task to be executed is submitted through the task input terminal 10, the task dividing means 11 divides the task into subtasks (step S101). The control processor 13 then obtains the processing time limit T for the task (step S102). The processing time limit T is a value determined in advance by the system.
[0033] In the case of processes and threads, for example, T is determined by the purpose of the user or the system. If the system is intended to process input signals (for example, some kind of observed values) that arrive at fixed intervals (the sampling time), the sampling time, which is the period at which those signals are acquired, would be the processing time limit T.
[0034] In other cases the processing time limit is not fixed by the external specification, and T is determined by the configuration of the parallel computer instead. For example, when building a processor that completes most instructions within one external clock cycle, the duration of one external clock cycle becomes the processing time limit T.
[0035] Next, the control processor 13 calculates the task processing time tmin for the case where the arithmetic processors 14-1 to 14-N are set to the high-speed operating state (step ST103). This requires the expected completion time of each subtask to be known in advance. To that end, for example, the processing time of each subtask on one of the arithmetic processors 14-1 to 14-N is measured beforehand in both the high-speed and standard operating states and stored in the subtask attribute information file. The control processor 13 then obtains the processing time of each subtask according to its type and calculates the task processing time tmin.
[0036] Alternatively, the subtask processing time may be measured in only one of the high-speed and standard operating states, and the processing time in the other state estimated by multiplying the measured time by the ratio of the operating frequency of the measured state to that of the other state.
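The frequency-ratio estimate of paragraph [0036] could be sketched as follows. The function name `estimate_time` is hypothetical, and the sketch assumes that processing time is inversely proportional to the operating frequency (that is, the subtask is compute-bound), which the specification implies but does not state in these words.

```python
def estimate_time(measured_s: float, measured_hz: float, target_hz: float) -> float:
    """Estimate the processing time at target_hz from a measurement at measured_hz.

    Assumes time is inversely proportional to operating frequency, so the
    measured time is multiplied by the ratio measured_hz / target_hz.
    """
    if measured_s < 0 or measured_hz <= 0 or target_hz <= 0:
        raise ValueError("time must be non-negative and frequencies positive")
    return measured_s * (measured_hz / target_hz)


if __name__ == "__main__":
    # A subtask measured at 2.0 ms in the high-speed state (380 MHz),
    # estimated for the standard state (152 MHz).
    print(estimate_time(2.0e-3, 380e6, 152e6))
```

Storing one measurement per subtask and scaling it on demand keeps the subtask attribute information file small, at the cost of ignoring effects that do not scale with the clock, such as memory or communication latency.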
[0037] If tmin turns out to be less than the processing time limit T (step ST104: Yes), the parallel processing capacity of the arithmetic processors 14-1 to 14-N exceeds the amount of subtask processing required. Since there is capacity to spare, the procedure moves to the power-saving processing from step ST105 onward.
[0038] If, on the other hand, tmin is not less than the processing time limit T, speed must take priority over saving power, so the arithmetic processors 14-1 to 14-N are set to the high-speed operating state (step ST106: execution method 1). The procedure then proceeds to step ST111. The processing from step ST111 onward is described later.
[0039] In step ST105, the control processor 13 sets one of the arithmetic processors 14-1 to 14-N to the standard operating state and calculates the task processing time tstd for the case where all the subtasks are executed by that processor alone. As with tmin in step ST103, tstd is calculated from the subtask processing times. If tstd exceeds T (ST107: Yes), processing on a single one of the arithmetic processors 14-1 to 14-N cannot meet the requirement of completing the task within the processing time limit T, so the procedure proceeds to parallel processing on multiple processors from step ST109 onward.
[0040] If tstd does not exceed T, a single processor can meet the requirement of completing the task within the processing time limit T. One of the arithmetic processors 14-1 to 14-N, for example arithmetic processor 14-1, is therefore set to the standard operating state (step ST108). In addition, the remaining processors, namely arithmetic processors 14-2 to 14-N, are set to the idle state.
[0041] In this way, power consumption is reduced while the real-time requirement of completing the task within the prescribed processing time limit is still met.
If, on the other hand, tstd exceeds T, one of the following processing methods (execution method 3 and execution method 4) is selected on the basis of the nature of the subtasks and the characteristics of the arithmetic processors (operating frequency and power consumption), and the number n of arithmetic processors to be used for subtask processing and their operating frequency are calculated according to the selected method (step ST109).
[0042] Execution method 3:
One of the arithmetic processors 14-1 to 14-N is selected, its operating frequency is set to the operating frequency β of the high-speed operating state, and all the subtasks are executed by that processor. The arithmetic processors other than the selected one are set to the idle state.
[0043] Execution method 4:
n of the arithmetic processors 14-1 to 14-N are selected (2 ≤ n ≤ N), their operating frequency is set to the operating frequency α of the standard operating state, and the subtasks are executed by those n processors. The arithmetic processors other than the selected n are set to the idle state.
[0044] Execution method 5:
m of the arithmetic processors 14-1 to 14-N are selected (2 ≤ m < n ≤ N), their operating frequency is set to the operating frequency β of the high-speed operating state, and the subtasks are executed by those m processors. The processors other than the selected m are set to the idle state.
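The branch structure of steps ST104 to ST109 described above can be sketched as a small function. This is a minimal illustration under stated assumptions: the function name and return labels are hypothetical, and tmin and tstd are taken as already computed inputs because the specification does not give a formula for them at this point.

```python
def choose_execution_path(t_limit: float, t_min: float, t_std: float) -> str:
    """Mirror the branches of steps ST104 to ST109.

    t_limit: processing time limit T
    t_min:   task time with all processors in the high-speed state (ST103)
    t_std:   task time on one processor in the standard state (ST105)
    """
    if not t_min < t_limit:
        # ST106: no slack, so every processor runs in the high-speed state.
        return "execution method 1 (ST106)"
    if not t_std > t_limit:
        # ST108: one standard-state processor meets the limit; the rest idle.
        return "single processor, standard state (ST108)"
    # ST109: choose among execution methods 3, 4, and 5 by energy comparison.
    return "select among execution methods 3, 4, 5 (ST109)"


if __name__ == "__main__":
    print(choose_execution_path(t_limit=10.0, t_min=2.0, t_std=30.0))
```

The comparisons are written as `not a < b` and `not a > b` so that the boundary cases fall on the same side as in the text ("not less than T" and "does not exceed T").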
[0045] The method of selecting among execution methods 3, 4, and 5 is described next.
[0046] FIG. 4 shows example time charts for execution method 3 and execution method 4 within the processing constraint time (T). The two differ only in the portion inside the bold frame, so it suffices to compare the energy consumed in that portion. In the case of FIG. 4, the processing time of execution method 4 is longer than that of execution method 3, so the processing constraint time (T) can be expressed by equation (2). Here Tc (= TS + TR) is the time required for one communication operation, the sum of the transmission processing time TS and the reception processing time TR. Tα is the execution time when one unit of data is processed by a single processor at operating frequency α, and n is the number of processors.
[0047] $$T = (n-1)\,T_c + \frac{T_\alpha}{n} \tag{2}$$
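Equation (2) can be evaluated directly to see how the communication term grows with n while the data-processing term shrinks. In the sketch below, `parallel_time` implements equation (2); the helper `smallest_n_within` is a hypothetical illustration of how one might search for a processor count that meets a limit, and is not a procedure taken from the specification.

```python
from typing import Optional


def parallel_time(n: int, t_comm: float, t_alpha: float) -> float:
    """Equation (2): T = (n - 1) * Tc + T_alpha / n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) * t_comm + t_alpha / n


def smallest_n_within(limit: float, t_comm: float, t_alpha: float,
                      n_max: int) -> Optional[int]:
    """Return the smallest n in 2..n_max whose time meets the limit, else None."""
    for n in range(2, n_max + 1):
        if parallel_time(n, t_comm, t_alpha) <= limit:
            return n
    return None


if __name__ == "__main__":
    # Illustrative numbers only: T_alpha = 100, Tc = 1, limit = 30 (same units).
    print(smallest_n_within(30.0, 1.0, 100.0, n_max=20))
```

Because the communication term grows linearly in n, adding processors eventually stops helping; with a large enough Tc no value of n meets the limit, and the helper returns `None`.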
[0048] Equation (3) gives the energy consumption C2 [W·s] of execution method 3 in this case. The first term of equation (3) is the energy required to process the data at operating frequency β. The remaining second term is the energy consumed by the processors that are idle (FIG. 4: arithmetic processors 14-1 to 14-N) and by the processor that has become idle after finishing its data processing (FIG. 4: arithmetic processor 14-1). Here k = α/β.
[0049]
$$\begin{aligned}
C_2 &= P_\beta\, k\, T_\alpha + k\, P_\gamma\, T_\alpha\, (n-1) + n\, P_\gamma \left[ T_\alpha \left( \frac{1}{n} - k \right) + (n-1)\, T_c \right] \\
    &= P_\beta\, k\, T_\alpha + P_\gamma \left[ (1-k)\, T_\alpha + n\,(n-1)\, T_c \right]
\end{aligned} \tag{3}$$
[0050] Similarly, equation (4) gives the energy consumption C3 [W·s] of execution method 4 in this case. The first term of equation (4) is the sum of the energy required for communication processing and the energy consumed in all idle periods, and the second term is the energy required for data processing.
[0051]
$$\begin{aligned}
C_3 &= (n-1)\, P_\alpha\, T_c + \frac{1}{n}\, P_\alpha\, T_\alpha + (n-1) \left[ P_\alpha\, T_c + \frac{1}{n}\, P_\alpha\, T_\alpha + (n-2)\, P_\gamma\, T_c \right] \\
    &= (n-1) \left[ 2\, P_\alpha + (n-2)\, P_\gamma \right] T_c + P_\alpha\, T_\alpha
\end{aligned} \tag{4}$$
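Equations (3) and (4) can be checked numerically by coding both the expanded and the simplified forms and confirming that they agree. The sketch below does this with the FIG. 2 power figures; the function names and the choice of test values are illustrative assumptions.

```python
# FIG. 2 power figures in watts: standard (alpha), high-speed (beta), idle (gamma).
P_ALPHA, P_BETA, P_GAMMA = 0.053, 0.5, 0.0115
K = 152e6 / 380e6  # k = alpha / beta


def c2_expanded(n, t_alpha, t_c, k=K):
    """Equation (3), first line: execution method 3 (one high-speed processor)."""
    return (P_BETA * k * t_alpha
            + k * P_GAMMA * t_alpha * (n - 1)
            + n * P_GAMMA * (t_alpha * (1 / n - k) + (n - 1) * t_c))


def c2(n, t_alpha, t_c, k=K):
    """Equation (3), simplified second line."""
    return P_BETA * k * t_alpha + P_GAMMA * ((1 - k) * t_alpha + n * (n - 1) * t_c)


def c3_expanded(n, t_alpha, t_c):
    """Equation (4), first line: execution method 4 (n standard-state processors)."""
    return ((n - 1) * P_ALPHA * t_c + (1 / n) * P_ALPHA * t_alpha
            + (n - 1) * (P_ALPHA * t_c + (1 / n) * P_ALPHA * t_alpha
                         + (n - 2) * P_GAMMA * t_c))


def c3(n, t_alpha, t_c):
    """Equation (4), simplified second line."""
    return (n - 1) * (2 * P_ALPHA + (n - 2) * P_GAMMA) * t_c + P_ALPHA * t_alpha


if __name__ == "__main__":
    n, t_alpha, t_c = 4, 1.0, 0.01  # illustrative values
    print(c2(n, t_alpha, t_c), c3(n, t_alpha, t_c))
```

For the illustrative values shown, execution method 4 consumes less energy than execution method 3 under this model, because the communication time is small relative to the data-processing time.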
[0052] Setting C2 = C3, equation (5) can be derived from equations (3) and (4). C2 = C3 holds when the two execution methods consume the same energy. The parameter values that satisfy C2 = C3 form a boundary, and for parameter values off this boundary one of the two execution methods is more advantageous. Here ρ denotes the ratio of communication processing time to data processing time (Tc/Tα).
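The derivation is not spelled out in the text. One way to carry it out, starting from the simplified forms of equations (3) and (4) and collecting the Tα terms on one side and the Tc terms on the other, is sketched here:

```latex
\begin{align*}
C_2 &= C_3 \\
P_\beta k T_\alpha + P_\gamma\bigl[(1-k)T_\alpha + n(n-1)T_c\bigr]
  &= (n-1)\bigl[2P_\alpha + (n-2)P_\gamma\bigr]T_c + P_\alpha T_\alpha \\
T_\alpha\bigl[kP_\beta - P_\alpha + P_\gamma(1-k)\bigr]
  &= (n-1)\,T_c\bigl[2P_\alpha + (n-2)P_\gamma - nP_\gamma\bigr] \\
  &= 2(n-1)(P_\alpha - P_\gamma)\,T_c \\
\rho = \frac{T_c}{T_\alpha}
  &= \frac{kP_\beta - P_\alpha + P_\gamma(1-k)}{2(n-1)(P_\alpha - P_\gamma)}
\end{align*}
```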
[0053]
$$\rho = \frac{k\, P_\beta - P_\alpha + P_\gamma\, (1-k)}{2\,(n-1)\,(P_\alpha - P_\gamma)} \tag{5}$$
[0054] Comparing the ρ obtained from equation (5) with the value ρ3 for power-saving execution selected under execution method 2 determines which of execution methods 3 and 4 is better: execution method 3 should be applied if ρ < ρ3, and execution method 4 if ρ > ρ3. The discussion so far, based on FIG. 4, concerns the case where the processing time of execution method 4 is longer than that of execution method 3. In the opposite case, equations (3) and (4) take different forms but the same equation (5) is derived. For n = 2 or 3, however, depending on the relative sizes of the transmission processing time TS and the reception processing time TR, arithmetic processor 14-1 under execution method 4 shown in FIG. 4, for example, may also be left idle for a time. Even so, if TS = TR is assumed, ρ is given by equation (5) for n = 2 and 3 as well.
[0055] FIG. 5 shows the value of ρ against the number of arithmetic processors (n ≥ 2) when the values of FIG. 2 are substituted for the parameters on the right-hand side of equation (5). Whether execution method 3 or 4, or execution method 1 or 4, is better can be determined from FIG. 5 once the target task has been analyzed and the optimal number of processors for power saving under execution method 4, together with the corresponding value of ρ, has been found.
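The boundary values plotted in FIG. 5 can be approximated by evaluating equation (5) with the FIG. 2 figures. The sketch below assumes the form of equation (5) that follows from setting equation (3) equal to equation (4); the names `rho_boundary` and `prefer_method` are hypothetical, and the tie-breaking choice when the two values are equal is an assumption, since the text does not cover that case.

```python
# FIG. 2 power figures in watts: standard (alpha), high-speed (beta), idle (gamma).
P_ALPHA, P_BETA, P_GAMMA = 0.053, 0.5, 0.0115
K = 152e6 / 380e6  # k = alpha / beta


def rho_boundary(n: int, k: float = K) -> float:
    """Equation (5): the Tc/T_alpha ratio at which methods 3 and 4 use equal energy."""
    if n < 2:
        raise ValueError("equation (5) is defined for n >= 2")
    numerator = k * P_BETA - P_ALPHA + P_GAMMA * (1 - k)
    denominator = 2 * (n - 1) * (P_ALPHA - P_GAMMA)
    return numerator / denominator


def prefer_method(n: int, rho_task: float) -> int:
    """Return 3 if the task's ratio exceeds the boundary, otherwise 4.

    A tie is resolved in favor of method 4 here (an assumption).
    """
    return 3 if rho_task > rho_boundary(n) else 4


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, round(rho_boundary(n), 4))
```

Under these figures the boundary falls as n grows, so a task with a given communication ratio favors parallel execution at small n and may tip toward the single high-speed processor at large n.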
[0056] FIG. 6 shows, for suitable values of ρ (≤ 0.05), the ratio (E3/E4) of the energy consumed when execution method 3 is selected and executed to that of execution method 4. When ρ ≤ 0.05, parallel processing is always beneficial for 2 to 20 processors. The results in FIG. 6 confirm that, for a fixed ρ, this ratio falls as the number of processors increases, and conversely that it rises as ρ becomes smaller. Therefore, if ρ decreases as the number of arithmetic processors increases, the ratio falls more slowly over that range.
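The trend described for FIG. 6 can be reproduced approximately from equations (3) and (4). The sketch below assumes that the ratio E3/E4 corresponds to C2/C3 with Tc = ρ·Tα (so that Tα cancels); this identification, and the name `energy_ratio`, are assumptions rather than statements from the specification.

```python
# FIG. 2 power figures in watts: standard (alpha), high-speed (beta), idle (gamma).
P_ALPHA, P_BETA, P_GAMMA = 0.053, 0.5, 0.0115
K = 152e6 / 380e6  # k = alpha / beta


def energy_ratio(n: int, rho: float, k: float = K) -> float:
    """Ratio of method-3 energy (eq. 3) to method-4 energy (eq. 4), per unit T_alpha."""
    e3 = P_BETA * k + P_GAMMA * ((1 - k) + n * (n - 1) * rho)
    e4 = (n - 1) * (2 * P_ALPHA + (n - 2) * P_GAMMA) * rho + P_ALPHA
    return e3 / e4


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, round(energy_ratio(n, 0.05), 3))
```

With ρ = 0.05 the ratio stays above 1 for n from 2 to 20 and decreases as n grows, and at a fixed n a smaller ρ gives a larger ratio, which matches the behavior described in the text.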
[0057] In this way, the relationship between the number of processors and ρ shown in FIG. 5 is obtained in advance from the ratio of communication time to processing time and from the operating frequencies and power consumption of the arithmetic processors, and is stored in a storage area such as the subtask attribute information file 12. Then, in step S109, the control processor 13 selects either execution method 3 or execution method 4 according to the relationship of equation (5).
[0058] The example above treated, as the dependency among subtasks, the communication processing needed to distribute the subtasks to the arithmetic processors 14-1 to 14-N. It is straightforward to extend this to other dependencies and derive relationships corresponding to equations (3) to (5).
[0059] As for choosing between execution methods 3 and 5, both use the same operating frequency, so if both finish within the time limit, execution method 3 is selected, because using fewer processors consumes less power.
[0060] Further, as for choosing between execution methods 4 and 5, if execution method 4 is assumed to require more processing time than execution method 5, the energy consumption C5 of execution method 5 is as follows.
[0061]
$$
C_5 = (m-1)\,[\,2P_\beta + (m-2)P_\gamma\,]\,k\,T_c + P_\beta T_\beta + P_\gamma \{\, T_c\,[\,n(n-1) - k\,m(m-1)\,] + T_\alpha (1-k) \,\} \tag{6}
$$
[0062] Here, the first and second terms of equation (6) are the energy consumed by the processors to which processing has been allocated. The third and fourth terms are the energy consumed by processors that are idle, including processors that were allocated processing but have completed it and are now waiting.
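As a check on this bookkeeping, the terms of equation (6) that are proportional to T_c can be collected by plain algebra. The sketch below assumes the reading of equation (6) in which the first term is (m-1)[2P_β + (m-2)P_γ]·k·T_c and the third term is P_γ·T_c·[n(n-1) - k·m(m-1)]; no other definitions are used.

```latex
\begin{aligned}
&(m-1)\,[\,2P_\beta + (m-2)P_\gamma\,]\,k\,T_c
  + P_\gamma T_c\,[\,n(n-1) - k\,m(m-1)\,] \\
&\quad = k\,T_c\,(m-1)\,[\,2P_\beta + (m-2)P_\gamma - m P_\gamma\,]
  + n(n-1)\,P_\gamma T_c \\
&\quad = 2k\,(m-1)\,(P_\beta - P_\gamma)\,T_c + n(n-1)\,P_\gamma T_c
\end{aligned}
```

The first term of the result has the same form as the 2·k·(m-1)(P_β - P_γ) contribution inside the T_c bracket of equation (7).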
[0063] Therefore, the difference C5 - C3 in energy consumption between execution method 4 and execution method 5 is
[0064]
$$
C_5 - C_3 = T_c \{\, 2k(m-1)(P_\beta - P_\gamma) - 2(n-1)(P_\alpha - P_\gamma) \,\} + T_\alpha \{\, k(P_\beta - P_\gamma) - (P_\alpha - P_\gamma) \,\} \tag{7}
$$
[0065] Equation (7) can be used to decide which of execution methods 4 and 5 is preferable. Note that even if execution method 5 is assumed to require more processing time than execution method 4, equation (6) changes but the same equation (7) is derived.
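The comparison based on equation (7) can be sketched in a few lines. This is an illustration under stated assumptions: the function and parameter names are hypothetical, the symbols keep the meanings given earlier in the description, and the rule "a negative difference favors execution method 5" is inferred here from the sign of C5 - C3 rather than stated in the text.

```python
def energy_difference(tc: float, t_alpha: float, k: float,
                      m: int, n: int,
                      p_alpha: float, p_beta: float,
                      p_gamma: float) -> float:
    """Evaluate the right-hand side of equation (7), i.e. C5 - C3."""
    # Terms proportional to Tc.
    tc_term = tc * (2 * k * (m - 1) * (p_beta - p_gamma)
                    - 2 * (n - 1) * (p_alpha - p_gamma))
    # Terms proportional to T_alpha.
    t_alpha_term = t_alpha * (k * (p_beta - p_gamma)
                              - (p_alpha - p_gamma))
    return tc_term + t_alpha_term


def prefer_method_5(*args: float) -> bool:
    """Assumed decision rule: method 5 is preferred when C5 - C3 < 0."""
    return energy_difference(*args) < 0
```

With the purely illustrative values `tc=1.0, t_alpha=10.0, k=0.5, m=4, n=2, p_alpha=10.0, p_beta=4.0, p_gamma=1.0`, the difference evaluates to -84.0, so this sketch would favor execution method 5.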
[0066] Finally, the control processor 13 distributes the subtasks to the arithmetic processors 14-1 to 14-N according to the execution method determined in step S109, and instructs them to execute the subtasks (step ST110).
[0067] As described above, the parallel computer according to Embodiment 1 of the present invention divides a task into subtasks, selects one of execution methods 1 to 4 based on the dependencies among the subtasks, and executes the task in parallel. The total power consumption of the plurality of processors can therefore be reduced while the processing time constraint of the task is still satisfied.
[0068] In the above description, the control processor 13 is a dedicated processor that distributes subtasks. However, because the load on the control processor 13 may be lower than the load on the arithmetic processors 14-1 to 14-N, the control processor 13 may be configured to also serve as one of the arithmetic processors 14-1 to 14-N.
Industrial Applicability
 The present invention is widely applicable to computer processing systems intended for parallel operation, such as a parallel computer system in which a plurality of computers form a cluster, or a parallel processor having a plurality of arithmetic instruction processing units.

Claims

[1] A parallel computer that divides a task into a plurality of processing units and executes the divided processing units in parallel, comprising:
 task dividing means for dividing the task into a plurality of processing units each executable by an individual processor, and outputting the divided processing units as a plurality of subtasks;
 a subtask attribute information file that holds attribute information of the subtasks divided by the task dividing means;
 a plurality of processors that are configured so that their energy consumption can be controlled externally, and that execute the subtasks divided by the task dividing means; and
 processor control means for distributing, based on the subtask attribute information held in the subtask attribute information file, the subtasks divided by the task dividing means to the plurality of processors and instructing execution of the subtasks, and for controlling the energy consumption of the plurality of processors.
[2] The parallel computer according to claim 1, wherein
 the subtask attribute information holds the processing time of each subtask as attribute information.
[3] The parallel computer according to claim 2, wherein
 some or all of the plurality of processors consist of at least two types of processors: a first processor that operates in a standard operating state, and a second processor that can be set to operate in either the standard operating state or an idle state that consumes less power than the standard operating state, and
 the processor control means obtains the processing times of the subtasks from the subtask attribute information file, calculates a task processing time on the assumption that the subtasks are processed using only the first processor, and, when the calculated task processing time is shorter than a predetermined time, sets the second processor to the idle state, distributes the subtasks only to the first processor, and instructs execution of the subtasks.
[4] The parallel computer according to claim 3, wherein
 the first processor is configured as a processor that can be set to either the standard operating state or a high-speed operating state in which it operates faster than in the standard operating state, and
 the processor control means obtains the processing times of the subtasks from the subtask attribute information file, calculates a task processing time on the assumption that the subtasks are processed using only the first processor set to the high-speed operating state, and, when the calculated task processing time is shorter than a predetermined time, sets the second processor to the idle state, sets the first processor to the high-speed operating state, distributes the subtasks only to the first processor, and instructs execution of the subtasks.
[5] The parallel computer according to claim 1, wherein
 the subtask attribute information file stores the dependencies between each subtask and other subtasks.
[6] The parallel computer according to claim 5, wherein
 the subtask attribute information file stores, as the dependency between each subtask and other subtasks, the relationship between the number of processors and the time required to distribute subtasks to those processors, and
 the processor control means distributes the subtasks to the plurality of processors and controls the energy consumption of those processors based on the relationship, held in the subtask attribute information file, between the number of processors and the time required to distribute subtasks to those processors.
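
The decision recited in claim 3 can be sketched as follows. This is a hedged illustration only: the claims do not say how the task processing time is calculated from the subtask processing times, so the greedy longest-first assignment used here, along with all function and field names, is an assumption of this sketch.

```python
import heapq
from typing import Dict, List


def estimate_task_time(subtask_times: List[float], num_first: int) -> float:
    """Estimate completion time using only the first processors.

    Assumption: subtasks are independent and are assigned longest-first
    to the currently least-loaded processor (num_first must be >= 1).
    """
    loads = [0.0] * num_first
    heapq.heapify(loads)
    for t in sorted(subtask_times, reverse=True):
        lightest = heapq.heappop(loads)       # least-loaded processor
        heapq.heappush(loads, lightest + t)   # give it the next subtask
    return max(loads)


def plan(subtask_times: List[float], num_first: int,
         time_limit: float) -> Dict[str, object]:
    """Idle the second processors when the first ones alone are fast enough."""
    estimate = estimate_task_time(subtask_times, num_first)
    return {
        "estimated_time": estimate,
        # Claim 3 condition: estimate is shorter than the predetermined time.
        "idle_second_processors": estimate < time_limit,
    }
```

For subtask times `[4.0, 3.0, 2.0, 1.0]` on two first processors, the estimate is 5.0, so the second processors would be idled for a time limit of 6.0 but not for a limit of 5.0.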

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2006527723A JP4082439B2 (en) 2004-07-26 2004-07-26 Parallel computer
PCT/JP2004/010610 WO2006011189A1 (en) 2004-07-26 2004-07-26 Parallel computer

Publications (1)

Publication Number Publication Date
WO2006011189A1 (en) 2006-02-02

Family

ID=35785948

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2004/010610 WO2006011189A1 (en) 2004-07-26 2004-07-26 Parallel computer

Country Status (2)

Country Link
JP (1) JP4082439B2 (en)
WO (1) WO2006011189A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09185589A (en) * 1996-01-05 1997-07-15 Toshiba Corp Information processing system and power saving method for the system
JPH09218861A (en) * 1996-02-08 1997-08-19 Fuji Xerox Co Ltd Scheduler
JP2000066776A (en) * 1998-08-03 2000-03-03 Lucent Technol Inc Method for controlling power consumption in sub-circuit of system
JP2004513451A (en) * 2000-10-31 2004-04-30 ミレニアル・ネット・インコーポレーテッド Network processing system with optimized power efficiency

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009527828A (en) * 2006-02-17 2009-07-30 クゥアルコム・インコーポレイテッド System and method for multiprocessor application support
US8612981B2 (en) 2006-11-07 2013-12-17 Sony Corporation Task distribution method
JP2010537266A (en) * 2007-08-17 2010-12-02 インターナショナル・ビジネス・マシーンズ・コーポレーション Proactive power management in parallel computers
US7941681B2 (en) 2007-08-17 2011-05-10 International Business Machines Corporation Proactive power management in a parallel computer
JP2013502642A (en) * 2009-08-18 2013-01-24 インターナショナル・ビジネス・マシーンズ・コーポレーション Decentralized load balancing method and computer program in event-driven system
US9665407B2 (en) 2009-08-18 2017-05-30 International Business Machines Corporation Decentralized load distribution to reduce power and/or cooling costs in an event-driven system
JP2011134330A (en) * 2009-12-22 2011-07-07 Intel Corp Systems and methods for energy efficient load balancing at server clusters
JP2014142719A (en) * 2013-01-22 2014-08-07 Canon Inc Information processing device
JP2022516549A (en) * 2019-11-29 2022-02-28 上▲海▼商▲湯▼智能科技有限公司 Chip operating frequency setting

Also Published As

Publication number Publication date
JPWO2006011189A1 (en) 2008-05-01
JP4082439B2 (en) 2008-04-30

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2006527723

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase