JPH10124313A - Parallel processing computer

Info

Publication number: JPH10124313A
Application number: JP27635196A
Authority: JP
Inventors: Hidekazu Takahashi; 英一高橋
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-10-18
Filing date: 1996-10-18
Publication date: 1998-05-15

Abstract

PROBLEM TO BE SOLVED: To eliminate the need for complicated hardware and to prevent performance degradation caused by dependencies between instructions, by executing instructions of other programs while a dependency such as register interference within the instruction sequence of one program is being resolved. SOLUTION: Instruction fetch units 1a to 1c add a program identifier (ID) corresponding to the program to each instruction they fetch, and send the tagged instructions to an instruction merge unit 2 at intervals. When two or more instructions arrive from the fetch units 1a to 1c simultaneously, the merge unit 2 orders them and sends them one by one to an instruction decode unit 4. The decode unit 4 interprets each instruction and reads data from the internal storage device (register) 3 corresponding to the program ID. An operation unit 5 accesses data in the internal storage device 3 corresponding to the program ID, or in a main storage device.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a parallel processing computer that executes a plurality of programs in parallel at the instruction level.

[0002] As programs have recently grown more complex and larger in scale, faster processing is demanded of computer systems. Processor operating cycles have been shortened for this reason, but because there are technical and physical limits, parallel processing computers that execute a plurality of instructions simultaneously are needed.

[0003]

[Prior art] FIG. 3 schematically shows a conventional instruction-level parallel processing computer. Conventional instruction-level parallel processing computers use a pipeline scheme in which the execution of one instruction is divided into a plurality of stages and the stages are executed in parallel.

[0004] FIG. 3 shows the concept of this pipeline processing, in which the execution of one instruction is divided into a plurality of stages that are executed in parallel. IF is the instruction fetch stage, D is the decode stage that interprets the fetched instruction, E is the operation stage that executes the interpreted instruction, and W is the stage that stores the result of the operation stage in a register or the like.

[0005] In this case, if instruction 2 uses the result of instruction 1, for example, the E stage of instruction 2 cannot be executed until the W stage of instruction 1 has completed. As shown in the figure, the D stage of instruction 2 is therefore stretched to two stages, and the benefit of pipelined parallel processing is lost.
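The stall in [0005] can be traced with a small timing sketch. The cycle numbering, the function name `two_instruction_timeline`, and the rule that instruction 2 simply remains in D while it waits are illustrative assumptions; FIG. 3 itself is not reproduced in this text.

```python
def two_instruction_timeline(dependent):
    """Cycles occupied by each stage for two back-to-back instructions.

    Assumed model of the IF, D, E, W pipeline: one stage per cycle, and a
    dependent instruction may enter E only after the W stage of the
    instruction it depends on has completed.
    """
    first = {"IF": [1], "D": [2], "E": [3], "W": [4]}
    e_cycle = 4                          # earliest E for instruction 2 with no hazard
    if dependent:
        e_cycle = first["W"][0] + 1      # wait until instruction 1 has written its result
    second = {
        "IF": [2],
        "D": list(range(3, e_cycle)),    # instruction 2 stays in D while it waits
        "E": [e_cycle],
        "W": [e_cycle + 1],
    }
    return first, second


if __name__ == "__main__":
    for dep in (False, True):
        _, second = two_instruction_timeline(dep)
        print("dependent" if dep else "independent", second)
```

With a dependency, the D stage of instruction 2 occupies two cycles instead of one, which is the lengthening the paragraph describes.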

[0006] That is, a single program often contains instructions with, for example, register dependencies close together in its instruction sequence. Increasing the number of pipelines and stages in the processor of the parallel processing computer, or providing many arithmetic units, has made it possible to execute many instructions in parallel, but the processor must still stop execution of a subsequent instruction (instruction 2 in the example above) until the dependency between the instructions is resolved. It must also contain hardware to deal with erroneous execution caused by parallel execution. In the case of a branch instruction, for example, the instruction sequence at the branch destination or the sequence that follows the branch is executed ahead of time, and when the branch destination is determined (at the E stage) the useless work executed ahead of time is discarded and the original state is restored. This so-called recovery processing, means for storing the branch-destination instructions used in branch prediction, and the control means needed for branch prediction all have to be provided in the processor.

[0007]

[Problems to be solved by the invention] Consequently, in a conventional instruction-level parallel processing computer, even if the processor can execute many instructions in parallel, instructions with register dependencies, for example, prevent that parallel processing capability from being fully used. The result is lower performance because subsequent instructions are stalled, and more complex hardware because of the means needed to resolve register dependencies and the like.

[0008] In view of these drawbacks, the present invention addresses the execution of programs that contain many dependent instructions by noting that no such register dependency exists between instructions of different programs. Its object is to provide a parallel processing computer that fetches instructions independently from a plurality of programs and executes them in parallel, thereby achieving performance equal or close to that obtained when there are no dependencies between instructions, without requiring complicated hardware.

[0009]

[Means for solving the problems] FIG. 1 shows the principle configuration of the present invention. The above problems are solved by a parallel processing computer configured as follows.

[0010] The computer comprises: a plurality of instruction fetch units 1a, 1b, ... 1c that fetch instructions independently from a plurality of programs stored in one or more main storage devices and add a program identifier to each fetched instruction; an instruction merge unit 2 that receives the fetched instructions, puts them in order, and sends them to an instruction decode unit 4; the instruction decode unit 4, which interprets each instruction carrying a program identifier and, if necessary, calculates a new internal storage address from the address of the internal storage device (register) 3 given in the instruction and the program identifier, and reads data from that address; and an operation unit 5 that receives the program identifier and the internal storage address, and either calculates a new internal storage address from the internal storage address in the instruction and the program identifier and writes data to that address, or calculates a new main storage address from the main storage address in the instruction and the program identifier and accesses that address.

[0011] The number of instruction fetch units 1a, 1b, ... 1c corresponds to the degree of parallelism to be realized. For example, providing four instruction fetch units yields a parallel processing computer that executes four instructions in parallel.

[0012] In the processor of the parallel processing computer of the present invention, as shown in FIG. 1, the instruction fetch units 1a, 1b, ... 1c independently fetch the instructions of a plurality of programs stored in one or more main storage devices.

[0013] If the plurality of programs are stored in a single main storage device, either a main storage device that supports multi-port access is required, or the programs are stored in mutually different banks and a different bank is accessed in each machine cycle by the well-known interleaving method, so that instructions of different programs are fetched. If the programs are stored in separate main storage devices, each main storage device is accessed independently to fetch instructions from the corresponding program.

[0014] Each instruction fetched in this way by the instruction fetch units 1a, 1b, ... 1c has the program identifier corresponding to its program added to it and is sent to the instruction merge unit 2. The instruction merge unit 2 sends the instructions it receives to the instruction decode unit 4 one after another. If instructions arrive from two or more instruction fetch units 1a, 1b, ... 1c at the same time, the merge unit puts them in order and sends them to the instruction decode unit 4 in sequence.
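The ordering behaviour of the instruction merge unit in [0014] can be sketched as a queue. The class name `MergeUnit`, the tuple layout, and the fixed priority by fetch-unit index are assumptions made for illustration; the text only states that simultaneous arrivals are given some order.

```python
from collections import deque


class MergeUnit:
    """Serialises instructions that arrive from several fetch units.

    Assumption: simultaneous arrivals are ordered by fetch-unit index,
    lowest first. The patent text does not specify the ordering rule.
    """

    def __init__(self):
        self.queue = deque()

    def accept(self, arrivals):
        # arrivals: list of (fetch_unit_index, program_id, instruction)
        # observed in the same machine cycle
        for unit_index, program_id, instruction in sorted(arrivals, key=lambda a: a[0]):
            self.queue.append((program_id, instruction))

    def issue(self):
        # At most one instruction per cycle is passed on to the decode unit.
        return self.queue.popleft() if self.queue else None


if __name__ == "__main__":
    merge = MergeUnit()
    merge.accept([(2, "c", "add"), (0, "a", "load")])
    print(merge.issue(), merge.issue(), merge.issue())
```

A round-robin or age-based order would fit the description equally well; fixed priority is used here only because it is the shortest to write.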

[0015] The instruction decode unit 4 interprets the instruction. For an instruction that needs the data of a so-called general-purpose register, it reads the data from the internal storage device (register) 3, and it does so from the part of the internal storage device (register) 3 that corresponds to the program identifier added to the instruction. In other words, the parallel processing computer of the present invention provides, for example, a set of general-purpose registers R1, R2, ... R16 for each program identifier.

[0016] The operation unit 5, which performs the various operations, accesses the internal storage device (general-purpose registers) 3 corresponding to the program identifier and data in the main storage device. When it processes a branch instruction, the operation unit 5 also sends the address of the next instruction, which it has calculated, to the instruction fetch unit 1a, 1b, ... 1c that corresponds to the program identifier.

[0017] Each of the instruction fetch units 1a, 1b, ... 1c is configured to send instructions to the instruction merge unit 2 at intervals such that no dependency arises between instructions of its own program. At the instruction decode unit 4, therefore, the instructions of each program have no dependencies (such as the register interference described above) among themselves. For a program that contains instructions with register interference, the fetch unit does not send the next instruction until the dependency is resolved, as described above, but instructions of other programs can be executed in the meantime. The instruction decode unit 4 and the operation unit 5 therefore never stall instruction execution as in the prior art, and pipeline processing proceeds smoothly without hardware for resolving complex dependencies, such as the register interference detection means, recovery means, storage for branch prediction, and control means mentioned above.
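The claim in [0017] that other programs fill the cycles in which one program is waiting can be illustrated with a toy utilisation model. The function `decode_utilisation`, the single `latency` parameter, and the lowest-index-first selection are assumptions for illustration only and are not taken from the patent.

```python
def decode_utilisation(num_programs, latency, cycles):
    """Fraction of cycles in which the decode unit receives an instruction.

    Assumed model: a fetch unit sends one instruction and then stays silent
    until that instruction completes `latency` cycles later; at most one
    instruction per cycle is forwarded, lowest fetch-unit index first.
    """
    ready_at = [0] * num_programs        # first cycle each fetch unit may send again
    busy = 0
    for cycle in range(cycles):
        for program in range(num_programs):
            if ready_at[program] <= cycle:
                ready_at[program] = cycle + latency
                busy += 1
                break                    # only one instruction is forwarded per cycle
    return busy / cycles


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "program(s):", decode_utilisation(n, latency=4, cycles=40))
```

Under these assumptions a single program keeps the pipeline busy a quarter of the time, and four programs keep it fully busy, without any dependency-checking logic in the model.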

[0018]

[Embodiments of the invention] Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 shows the principle configuration of the present invention, and FIG. 2 shows one embodiment: FIG. 2(a) shows a configuration example of the parallel processing computer of the present invention, and FIG. 2(b) shows an example of a program identifier added to an instruction.

[0019] In the embodiment of the present invention, the necessary means are: a plurality of instruction fetch units 1a, 1b, ... 1c that fetch instructions independently from a plurality of programs stored in one or more main storage devices and add a program identifier to each fetched instruction; an instruction merge unit 2 that receives the fetched instructions, puts them in order, and sends them to an instruction decode unit 4; the instruction decode unit 4, which interprets each instruction carrying a program identifier and, if necessary, calculates a new internal storage address from the address of the internal storage device (register) 3 given in the instruction and the program identifier, and reads data from that address; and an operation unit 5 that receives the program identifier and the internal storage address, and either calculates a new internal storage address from the internal storage address in the instruction and the program identifier and writes data to that address, or calculates a new main storage address from the main storage address in the instruction and the program identifier and accesses that address. The same reference symbols denote the same objects throughout the drawings.

[0020] The configuration and operation of the parallel processing computer of the present invention are described below using FIG. 2, with reference to FIG. 1. In FIG. 2(a), a, b, ... c are a plurality of program areas. They may be placed in one main storage device, or each program a, b, ... c may be placed in a separate main storage device. If the program areas are placed in one main storage device, then, as described above, either the main storage device must be multi-ported so that a plurality of instructions can be fetched at the same time, or, if the programs a, b, ... c are stored in mutually different banks, each bank is accessed in its own machine cycle by the well-known interleaving method so that a plurality of instructions are fetched. If the programs a, b, ... c are placed in separate main storage devices, a plurality of instructions can be read independently by accessing each main storage device independently.

[0021] The instruction fetch units 1a, 1b, ... 1c each hold a program counter PCa, PCb, PCc indicating the address of the instruction to be read, and the identifier PIDa, PIDb, PIDc of the corresponding program.

[0022] The instruction fetch units 1a, 1b, ... 1c fetch instructions ca, cb, cc from the corresponding program areas a, b, c according to the program counters PCa, PCb, PCc, add the program identifiers PIDa, PIDb, PIDc to the fetched instructions ca, cb, cc, and send them to the instruction merge unit 2.

[0023] Each of the instruction fetch units 1a, 1b, ... 1c increments its program counter PCa, PCb, PCc by one and, after detecting that execution of the instruction it sent has completed, for example by detecting a completion signal sent from the operation unit 5 (described later) on the basis of the program identifier, repeats the fetch process for the next instruction. Depending on the type indicated by the completion signal of the preceding instruction, for example an inter-register operation instruction, there is a risk of register interference with the subsequent instruction. In that case a predetermined number of machine cycles is inserted before the next instruction is fetched, so that register interference can never occur. When the preceding instruction is a branch instruction, the branch destination has already been determined, so the determined branch-destination address is set in the program counter PCa, PCb, PCc and the instruction at the branch destination is read. This makes branch prediction, and the recovery processing that accompanies it, unnecessary.

[0024] As shown in FIG. 2(b), the program identifier can be added to an instruction by appending to the bit field of the instruction code a field with enough bits to distinguish the plurality of programs.
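The tagging in [0024] amounts to widening the instruction word by a few bits. In the sketch below the identifier is placed above the instruction bits and the helper names `pid_bits`, `tag`, and `untag` are invented; FIG. 2(b) is not available in this text, so the actual field position is an assumption.

```python
def pid_bits(num_programs):
    """Smallest field width that can distinguish `num_programs` programs."""
    return max(1, (num_programs - 1).bit_length())


def tag(instruction_word, pid, instr_bits, num_programs):
    """Append the program identifier as a field above the instruction bits (assumed layout)."""
    width = pid_bits(num_programs)
    if not 0 <= pid < (1 << width):
        raise ValueError("program identifier does not fit in the field")
    if not 0 <= instruction_word < (1 << instr_bits):
        raise ValueError("instruction word wider than instr_bits")
    return (pid << instr_bits) | instruction_word


def untag(tagged_word, instr_bits):
    """Split a tagged word back into (pid, instruction_word)."""
    return tagged_word >> instr_bits, tagged_word & ((1 << instr_bits) - 1)


if __name__ == "__main__":
    word = tag(0x1234, pid=2, instr_bits=16, num_programs=4)
    print(hex(word), untag(word, 16))
```

Four programs need a two-bit field, so a 16-bit instruction becomes an 18-bit tagged word in this layout.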

[0025] The instruction merge unit 2 sends the instructions ca, cb, cc received from the instruction fetch units 1a, 1b, ... 1c to the instruction decode unit 4 one after another. If instructions arrive from a plurality of instruction fetch units 1a, 1b, ... 1c at the same time, it puts them in order and sends them to the instruction decode unit 4 in sequence.

[0026] The configuration from the instruction decode unit 4 onward is in principle the same as that of a conventional pipeline, but the parallel processing computer of the present invention does not need the conventional means for detecting register interference, branch prediction means, or recovery means for mispredicted branches.

[0027] The instruction decode unit 4 interprets the instruction, selects one of the arithmetic units 50 provided in the operation unit 5, for example an adder, multiplier, divider, shifter, or branch-instruction arithmetic unit (for example, an adder that calculates the branch-destination address and a judging unit that evaluates the branch condition), and sends data to it. Depending on the type of data the instruction specifies, it reads the data of the internal storage device (register) 3 according to the program identifier and the internal storage address in the instruction, for example a register number, and sends the data it has read to the corresponding arithmetic unit 50.

[0028] Such means can be realized, for example, by applying an operation such as concatenation to the contents of the program-identifier bit field added by the instruction fetch units 1a, 1b, ... 1c and the internal storage address field in the instruction, for example the register-number field, and using the result as the address of the internal storage device (register) 3.
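The address formation in [0028] can be sketched as a bit concatenation. The 4-bit register field, the encoding of R1 to R16 as 0 to 15, and the function name `physical_register` are assumptions; the text gives concatenation only as one example of a suitable operation.

```python
REGS_PER_PROGRAM = 16   # one set of general-purpose registers R1..R16 per program
REG_BITS = 4            # assumed width of the register-number field


def physical_register(pid, reg_field):
    """Concatenate the program identifier (high bits) with the register number (low bits).

    reg_field is the register-number field of the instruction, 0..15,
    assumed here to encode R1..R16.
    """
    if not 0 <= reg_field < REGS_PER_PROGRAM:
        raise ValueError("register number outside the per-program register set")
    return (pid << REG_BITS) | reg_field


if __name__ == "__main__":
    # The same register number in two programs maps to two different physical registers.
    print(physical_register(0, 5), physical_register(2, 5))
```

Since the identifier occupies the high bits, each program sees a private block of sixteen registers, and instructions from different programs cannot interfere through a shared register.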

[0029] The operation unit 5 consists of a plurality of arithmetic units 50, one for each kind of instruction, that operate in parallel. An arithmetic unit 50 that writes to the internal storage device (register) 3 designates the address of the internal storage device (register) 3 for the instruction from the program identifier and the internal storage address in the instruction, in the same way as the instruction decode unit 4.

[0030] If the instruction specifies an access to the main storage device, the corresponding arithmetic unit 50 applies an operation such as concatenation to the main storage address specified by the instruction and the program identifier, and accesses the result as the address in the main storage device that corresponds to that program.
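The per-program main storage addressing in [0030] can be illustrated with a toy memory. The class `TaggedMemory`, the 16-bit region size, and the choice of concatenation are assumptions; the text says only that an operation such as combining the two values is applied.

```python
class TaggedMemory:
    """Toy main storage in which each program accesses its own region.

    Assumption: the physical address is the program identifier concatenated
    above a fixed-width address taken from the instruction.
    """

    def __init__(self, region_bits=16):
        self.region_bits = region_bits
        self.cells = {}

    def _physical(self, pid, address):
        if not 0 <= address < (1 << self.region_bits):
            raise ValueError("address outside the program's region")
        return (pid << self.region_bits) | address

    def store(self, pid, address, value):
        self.cells[self._physical(pid, address)] = value

    def load(self, pid, address):
        # Unwritten cells read as 0 in this sketch.
        return self.cells.get(self._physical(pid, address), 0)


if __name__ == "__main__":
    memory = TaggedMemory()
    memory.store(0, 0x100, 11)
    memory.store(1, 0x100, 22)
    print(memory.load(0, 0x100), memory.load(1, 0x100))
```

Two programs that use the same address in their instructions end up in different physical cells, so several programs can run side by side without relocating their code.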

[0031] If the arithmetic unit 50 is a branch-instruction arithmetic unit, it sets, for example, the address of the next (branch-destination) instruction in the program counter PCa, PCb, PCc of the instruction fetch unit 1a, 1b, ... 1c that corresponds to the program identifier.

[0032] In the above embodiment, the instruction fetch units 1a, 1b, ... 1c read the next instruction only after the previously read instruction has finished executing in the units from the instruction decode unit 4 onward. Needless to say, another scheme may be used instead: instructions are always read regardless of whether execution has completed; dependencies are checked at the instruction decode unit 4 as in the prior art; if a dependency exists, sending the instruction to the instruction decode unit 4 is held until the dependency is resolved; and an instruction that need not be executed, for example one read ahead by branch prediction, is invalidated. Even in this case, the instruction fetch units 1a, 1b, ... 1c only read the next instruction and do not execute it, so the instruction does not affect the state of the parallel processing computer, and the scheme can be realized with a simpler configuration than conventional recovery hardware.

[0033] As described above, the parallel processing computer of the present invention is characterized in that it comprises: a plurality of instruction fetch units that fetch instructions independently from a plurality of programs and add a program identifier to each fetched instruction; an instruction merge unit that receives the fetched instructions, puts them in order, and sends them to an instruction decode unit; the instruction decode unit, which interprets each instruction carrying a program identifier and, if necessary, calculates a new internal storage address from the internal storage address in the instruction and the program identifier, and reads data from that address; and an operation unit that receives the program identifier and the internal storage address, and either calculates a new internal storage address from the internal storage address in the instruction and the program identifier and writes data to that address, or calculates a new main storage address from the main storage address in the instruction and the program identifier and accesses that address.

[0034]

[Effects of the invention] As described in detail above, with the parallel processing computer of the present invention, even if the instruction sequence of one program contains an instruction that cannot be executed because of a mutual dependency such as register interference, instructions of other programs can be executed until the dependency is resolved. Program processing within the parallel processing computer therefore does not stop, and the complicated hardware conventionally needed for branch prediction, recovery processing, and the like becomes unnecessary. The invention thus contributes greatly to preventing performance degradation caused by dependencies between instructions.

[Brief description of the drawings]

FIG. 1 is a diagram showing the principle configuration of the present invention.

FIG. 2 is a diagram showing one embodiment of the present invention.

FIG. 3 is a diagram schematically showing a conventional instruction-level parallel processing computer.

[Explanation of symbols]

1a, 1b, 1c: instruction fetch units
2: instruction merge unit
3: internal storage device (register)
4: instruction decode unit
5: operation unit
50: arithmetic unit
Program identifier (PIDa, PIDb, PIDc); internal storage address; main storage address
a, b, c: program areas
PCa, PCb, PCc: program counters

Claims

[Claims]

1. A parallel processing computer comprising: a plurality of instruction fetch units that fetch instructions independently from a plurality of programs stored in one or more main storage devices and add a program identifier to each fetched instruction; an instruction merge unit that receives the fetched instructions, puts them in order, and sends them to an instruction decode unit; the instruction decode unit, which interprets each instruction carrying the program identifier and, if necessary, calculates a new internal storage address from the internal storage address in the instruction and the program identifier, and reads data from that address; and an operation unit that receives the program identifier and the internal storage address, and either calculates a new internal storage address from the internal storage address in the instruction and the program identifier and writes data to that address, or calculates a new main storage address from the main storage address in the instruction and the program identifier and accesses that address.