JPH05298094A

JPH05298094A - Processor

Info

Publication number: JPH05298094A
Application number: JP9766392A
Authority: JP
Inventors: Yutaka Harada; 豊原田; Kazumasa Takagi; 一正高木; Koji Nakahara; 宏治中原
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1992-04-17
Filing date: 1992-04-17
Publication date: 1993-11-12

Abstract

PURPOSE: To improve the utilization efficiency of computer hardware and obtain a high-performance computer by moving the register that controls instruction execution through the stages a plurality of times. CONSTITUTION: So that instruction fetch, operand fetch, and instruction execution are performed simultaneously for a single instruction stream, the instruction register (IR) 150 is built from shift registers that pass through the stages a plurality of times, allowing a fetched instruction to control the stages a plurality of times. The IR 150 consists of shift registers covering stage 2 to stage 5 of the second cycle, stage 0 to stage 5 of the third cycle, and stage 0 to stage 3 of the fourth cycle. The IR 150 receives an instruction from the instruction memory 170 in stage 2 of the second cycle, and execution of the instruction is controlled by the instruction code that the IR 150 receives at that time.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method of configuring a computer, and more particularly to a method of configuring a pipelined multiprocessor in which a large number of processing units operate in parallel.

[0002]

[Prior Art] Computer performance can be improved either by improving device performance, in particular circuit speed, or by improving the computer structure. For the latter, it is extremely effective to introduce the so-called pipeline method, in which the functions of the computer are divided and an instruction stream uses those functions in a time-division manner. The pipeline method is described, for example, in "The Architecture of Pipelined Computers", Hemisphere Publishing Corporation, 1981. In a conventional pipeline, the execution of each instruction is divided into, for example, five consecutive operations: instruction fetch, decode, operand fetch, execution, and write. Separate instructions carry out this series of operations while successively overlapping one another.
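The overlap of the five operations in a conventional pipeline can be illustrated with a minimal Python sketch. It assumes an idealized pipeline in which one instruction enters per time step and every operation takes exactly one step; the abbreviations and the names `stage_at` and `diagram` are illustrative and are not taken from the cited book.

```python
# Idealized five-operation pipeline; the one-step-per-operation timing is an assumption.
STAGES = ["IF", "D", "OF", "EX", "W"]  # fetch, decode, operand fetch, execute, write

def stage_at(instr_index: int, t: int):
    """Return the operation that instruction `instr_index` performs at step t, or None."""
    offset = t - instr_index  # instruction i is assumed to enter the pipeline at step i
    return STAGES[offset] if 0 <= offset < len(STAGES) else None

def diagram(num_instrs: int) -> str:
    """Build a text diagram with one row per instruction and one column per time step."""
    total_steps = num_instrs + len(STAGES) - 1
    rows = []
    for i in range(num_instrs):
        cells = [(stage_at(i, t) or "--").ljust(2) for t in range(total_steps)]
        rows.append(f"I{i}: " + " ".join(cells))
    return "\n".join(rows)

if __name__ == "__main__":
    print(diagram(4))
```

Each row is shifted one step to the right of the row above it, which is the overlapping execution the paragraph describes.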

[0003]

[Problems to be Solved by the Invention] The prior art described above makes effective use of the available hardware by time division, but it cannot be said to use that hardware with high efficiency. In the ultimate pipeline, time-division operation would be performed at the level of the individual gates that make up the computer, so that as many instructions as the computer has logic stages could be executed at once. To this end, a micropipelined computer configuration has been disclosed in which the operation of the computer is divided into fine stages and an independent computer function is assigned to each stage. In that previously disclosed micropipelined computer, the accesses to the instruction memory and the data memory were arranged to be executed serially within the series of stages of the computer's processing. Because accesses to the instruction memory and the data memory are slow, the prior art therefore had the drawback that the processing speed of the computer could not be increased.

[0004] An object of the present invention is to raise the utilization efficiency of computer hardware and realize a high-performance computer by providing a method of configuring a micropipelined computer in which the computer functions are more finely subdivided and in which, for the processing of a single instruction stream, instruction fetch, operand fetch, and instruction execution are performed simultaneously.

[0005]

[Means for Solving the Problems] The above object is achieved by a micropipelined processing device in which a register set is assigned to each of the subdivided computer functions and the data in the register sets is moved on in sequence as instruction execution proceeds, and in which the register that controls instruction execution is moved through the stages a plurality of times.

[0006]

[Operation] With the method of the present invention, a micropipelined processing device that executes a plurality of instruction streams (programs) can achieve a fine pipeline pitch even when it uses an instruction memory and a data memory with long access times. According to the present invention, a multiprocessor that executes a plurality of programs with high efficiency can therefore be built from a single set of hardware.

[0007]

[Embodiment] The present invention is described below by way of an embodiment. A simple computer is taken as the embodiment. The computer reads an instruction from the instruction memory (IM), reads an operand from the data memory (DM), and executes an operation. It has an adder and the following four registers.

ACC: accumulator (operation register; holds operation data)
MDR: memory data register (holds an operand read from the data memory and data to be written to the data memory)
PC: program counter (holds the program execution address)
IR: instruction register (holds the instruction code)

An instruction of this computer consists of an instruction code (OP code) and an instruction address (Address). The instruction is read from the instruction memory, set in the IR, decoded, and then executed. The computer has the following seven instructions.

JMP: unconditional jump to the instruction address
CAL: subroutine jump to the instruction address
RTN: return from a subroutine jump
JAN: jump to the instruction address if the accumulator is negative (conditional jump)
LAC: read the data at the instruction address into the accumulator
SAC: write the data in the accumulator to the instruction address
ADD: add the data at the instruction address to the accumulator

The computer of this embodiment has a simple structure and performs only the operations corresponding to instruction fetch, operand fetch, and execution in conventional pipeline technology. In more detail, however, instruction fetch, operand fetch, and instruction execution are each subdivided into more complex sequences of operations.
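The seven instructions can be summarized by a small sequential reference model in Python. This is only a sketch of the instruction semantics listed above, not of the pipelined hardware: it executes one instruction at a time and ignores all stage timing. The tuple encoding `(opcode, address)`, the name `run`, the `max_steps` guard, unbounded integer arithmetic, and the choice of data-memory address 0 as the return-address slot (`RETURN_SLOT`) are assumptions made for illustration.

```python
from typing import List, Tuple

Instr = Tuple[str, int]  # (OP code, address); this encoding is hypothetical

RETURN_SLOT = 0  # hypothetical data-memory address that holds the CAL return address

def run(program: List[Instr], data: List[int], max_steps: int = 1000):
    """Execute `program` sequentially and return (ACC, data memory)."""
    acc, pc = 0, 0
    dm = list(data)  # work on a copy of the data memory
    steps = 0
    # Stop when the PC leaves the program or the step guard is reached.
    while 0 <= pc < len(program) and steps < max_steps:
        op, addr = program[pc]
        pc += 1  # default: continue with the next instruction
        steps += 1
        if op == "JMP":
            pc = addr
        elif op == "CAL":
            dm[RETURN_SLOT] = pc  # save the return address, then jump
            pc = addr
        elif op == "RTN":
            pc = dm[RETURN_SLOT]
        elif op == "JAN":
            if acc < 0:
                pc = addr
        elif op == "LAC":
            acc = dm[addr]
        elif op == "SAC":
            dm[addr] = acc
        elif op == "ADD":
            acc += dm[addr]
        else:
            raise ValueError(f"unknown OP code: {op}")
    return acc, dm

if __name__ == "__main__":
    # Add the words at addresses 1 and 2 and store the sum at address 3.
    print(run([("LAC", 1), ("ADD", 2), ("SAC", 3)], [0, 5, 7, 0]))
```

A model like this is useful as a reference against which a stage-accurate simulation can be checked, since the pipelined machine must produce the same register and memory contents for programs that respect its timing rules.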

[0008] FIG. 1 shows an embodiment of the hardware configuration of a computer according to the present invention. In FIG. 1, the computer consists of an adder 100 and groups of registers placed at each operation stage (stage 0 to stage 5). The adder 100 has a pipeline structure that combines pipeline registers with arithmetic logic circuits, and it performs the addition from stage 5 through stage 1. Registers of the same kind form a shift register so that data is transferred in stage order. Depending on its use, a shift register omits the stages where it is not needed, and where data must be retained within a program (instruction stream), the shift register is connected in a ring so that the data circulates within it. Instruction fetch, operand fetch, and instruction execution are completed by passing through the stages for four cycles (cycle 1 to cycle 4).

The operation of the embodiment of FIG. 1 is described in detail below. The processing device of this embodiment consists of the adder 100, ACC 110, MDR 120, PC 140, IR 150, data memory 160, and instruction memory 170. The ACC 110 has a register for holding data at each stage from stage 0 to stage 5. The registers of the stages are configured as a shift register in the stage direction so that each bit can be transferred to the next stage. To retain data within a program (instruction stream), the data at stage 5 is transferred to stage 0. Similarly, the MDR 120 consists of a shift register from stage 3 to stage 5. At stage 4 of the MDR 120, data is transferred in from the data memory 160, and at stage 3, data is transferred out to the data memory 160. The PC 140 consists of a shift register from stage 0 to stage 5, and the data at stage 5 is transferred to stage 0 in order to hold the next instruction address. At stage 0, the data in the PC 140 is transferred to the instruction memory 170 as the address for reading an instruction, and the instruction memory is set to a read cycle. Stage 1 has an increment-by-one adder 141, which increases the held data by 1 to set the next instruction address.

So that instruction fetch, operand fetch, and instruction execution are performed simultaneously for a single instruction stream, the IR 150 is built as a shift register that passes through the stages a plurality of times, allowing a fetched instruction to control the stages a plurality of times. The IR 150 consists of a shift register covering stage 2 to stage 5 of the second cycle, stage 0 to stage 5 of the third cycle, and stage 0 to stage 3 of the fourth cycle. At stage 2 of the second cycle, the IR 150 receives an instruction from the instruction memory 170. Execution of the instruction is controlled by the instruction code that the IR 150 receives at that time. For example, an address is sent to the data memory at stage 2 of the second cycle, and the operand read from the data memory is set in the MDR 120 at stage 4 of the third cycle. The operation on the operands is executed from stage 5 of the third cycle to stage 2 of the fourth cycle.
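The two register organizations described above, a six-stage ring (ACC, PC) and a longer IR that spans several cycles, can be sketched in Python as follows. The class name `RingShiftRegister`, the function `ir_positions`, and the use of `collections.deque` are illustrative choices; the stage ranges are the ones stated for the IR 150 in the text.

```python
from collections import deque

NUM_STAGES = 6  # stage 0 to stage 5

def ir_positions():
    """List the (cycle, stage) slots that make up the IR shift register."""
    spans = {2: range(2, 6), 3: range(0, 6), 4: range(0, 4)}
    return [(cycle, stage) for cycle in sorted(spans) for stage in spans[cycle]]

class RingShiftRegister:
    """Ring-connected shift register such as ACC or PC: stage 5 feeds stage 0."""

    def __init__(self, values):
        if len(values) != NUM_STAGES:
            raise ValueError("one value per stage is required")
        self.slots = deque(values)  # slots[i] is the value held at stage i

    def tick(self):
        """Advance every value to the next stage; stage 5 wraps around to stage 0."""
        self.slots.rotate(1)

if __name__ == "__main__":
    print(len(ir_positions()))  # number of IR slots
    ring = RingShiftRegister(list("ABCDEF"))
    ring.tick()
    print(list(ring.slots))
```

Counting the slots gives 4 + 6 + 4 = 14 positions for the IR, compared with 6 for a ring register, which reflects that an instruction code must stay available across more than two full passes through the stages.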

[0009] The operation is now described in detail with reference to FIG. 1. For instruction fetch, at stage 0 of the first cycle the PC 140 serves as the address register: its data is sent as an address to the instruction memory 170 via channel 171, and the memory is put into a read cycle. At stage 1, the data in the PC 140 is then increased by 1 using the increment-by-one adder 141. At stage 2 of the second cycle, the instruction is read from the instruction memory 170 and transferred to the IR 150 via channel 172. Also at this stage 2, either the address part of the instruction code set in the IR 150 or the specific address that holds the return destination of a CAL instruction is transferred to the data memory 160 via channel 161. It is clear that instruction fetch is carried out by transferring data step by step from stage 0 of the first cycle to stage 2 of the second cycle. The instruction part transferred to the IR 150 at stage 2 of the second cycle controls the execution of the instruction from stage 3 of the second cycle to stage 2 of the fourth cycle, as follows.

For a JMP instruction, the jump destination address is transferred from the IR 150 to the PC 140 via channel 300 at stage 3 of the second cycle. For a CAL instruction, at stage 3 of the second cycle the return address is transferred from the PC 140 to the MDR 120 via channel 301 and on to the data memory 160 via the data memory channel 162, and the data memory 160 is put into a write cycle. The jump destination address is also transferred from the IR 150 to the PC 140 via channel 300. For an RTN instruction, the return address is transferred to the MDR 120 at stage 4 of the third cycle, and at stage 5 it is transferred from the MDR 120 to the PC 140 via channel 304. A JAN instruction tests the sign of the ACC 110 and performs the same operation as a JMP instruction. For an LAC instruction, data is transferred from the data memory 160 to the MDR 120 at stage 4 of the third cycle, and from stage 5 of the third cycle to stage 2 of the fourth cycle the transferred data passes from the MDR 120 to the ACC 110 via the adder 100 and channels 306 and 307. For an SAC instruction, at stage 3 of the second cycle the write data is transferred from the ACC 110 to the MDR 120 via channel 302, and at the same time the data memory 160 is put into a write cycle. For an ADD instruction, data is transferred from the data memory 160 to the MDR 120 at stage 4 of the third cycle, the data in the ACC 110 and the MDR 120 is supplied to the adder 100 via channels 305 and 306 at stage 5, and at stage 2 of the fourth cycle the output of the adder 100 is transferred to the ACC 110 via channel 307.

From the above it is clear that instruction execution is carried out by transferring data step by step to the next stage, from stage 3 of the second cycle to stage 2 of the fourth cycle. In the embodiment of FIG. 1 the IR 150 always holds the full instruction code that was read, but it is also clear that part of the held contents can be dropped as the data is transferred. In that case, the amount of hardware in the processing device can be reduced in proportion to the reduction in data to be transferred or held.
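The per-instruction timing described above can be collected into a table and converted to a single time axis. The following Python sketch assumes that each stage takes one step and that a cycle is exactly six steps (stage 0 to stage 5); the names `tick`, `FETCH`, `EXECUTE`, and `schedule` are illustrative. For LAC, splitting the transfer into an entry at stage 5 and an arrival at stage 2 of the fourth cycle is inferred by analogy with ADD rather than stated separately in the text.

```python
NUM_STAGES = 6  # stage 0 to stage 5

def tick(cycle: int, stage: int) -> int:
    """Stage steps elapsed since stage 0 of cycle 1 (assumes one step per stage)."""
    return (cycle - 1) * NUM_STAGES + stage

# (cycle, stage, action) entries common to every instruction.
FETCH = [
    (1, 0, "PC 140 -> instruction memory 170 address (channel 171)"),
    (1, 1, "PC 140 incremented by adder 141"),
    (2, 2, "instruction memory 170 -> IR 150 (channel 172); address -> data memory 160 (channel 161)"),
]

# Instruction-specific entries, keyed by OP code.
EXECUTE = {
    "JMP": [(2, 3, "IR 150 -> PC 140 (channel 300)")],
    "CAL": [(2, 3, "PC 140 -> MDR 120 (channel 301) -> data memory 160 (channel 162); IR 150 -> PC 140 (channel 300)")],
    "RTN": [(3, 4, "data memory 160 -> MDR 120"), (3, 5, "MDR 120 -> PC 140 (channel 304)")],
    "JAN": [(2, 3, "test sign of ACC 110, then as JMP")],
    "LAC": [(3, 4, "data memory 160 -> MDR 120"), (3, 5, "MDR 120 -> adder 100 (channel 306)"), (4, 2, "adder 100 -> ACC 110 (channel 307)")],
    "SAC": [(2, 3, "ACC 110 -> MDR 120 (channel 302); data memory 160 write cycle")],
    "ADD": [(3, 4, "data memory 160 -> MDR 120"), (3, 5, "ACC 110, MDR 120 -> adder 100 (channels 305, 306)"), (4, 2, "adder 100 -> ACC 110 (channel 307)")],
}

def schedule(op: str):
    """Return [(step, action), ...] for one instruction, in time order."""
    return [(tick(c, s), what) for c, s, what in FETCH + EXECUTE[op]]

# Steps between sending an address and receiving the data, under the same assumption.
IM_READ_STEPS = tick(2, 2) - tick(1, 0)
DM_READ_STEPS = tick(3, 4) - tick(2, 2)

if __name__ == "__main__":
    for step, action in schedule("ADD"):
        print(step, action)
```

Under these assumptions both `IM_READ_STEPS` and `DM_READ_STEPS` come out as 8, so each memory has eight stage steps between receiving an address and delivering its data.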

[0010] FIG. 2 shows the pipeline operation of the processing device of FIG. 1. With the embodiment of FIG. 1, a single set of hardware can process a plurality of instructions simultaneously. As already shown, an instruction proceeds through the stages one after another. This means that at any moment a single instruction occupies only one of the several stages. The embodiment of FIG. 1 therefore allows a separate instruction to be assigned to each stage: six instructions are assigned to the six stages, and one set of hardware processes the six instructions simultaneously. The embodiment of FIG. 1 can thus process six instruction streams at once, and looking at instruction stream A alone, instruction fetch (IF), operand fetch (OF), and instruction execution (EX1 or EX2) can be seen to take place simultaneously. In FIG. 2, the JMP, CAL, and JAN instructions, which do not need to read an operand, finish at EX1. The branch destination address of these branch instructions is used as the fetch address of the instruction after next. The other instructions, which need to read an operand, finish at EX2. Their result is transferred to the register in time for the execution of the instruction after next. In this embodiment the instruction memory and the data memory are accessed in parallel, and instruction fetch, operand fetch, and instruction execution proceed in parallel, so high-speed processing is achieved even when each memory has a long access time. Furthermore, because the embodiment of FIG. 1 places registers (ACC, MDR, PC, IR) at each stage, dependencies between the stages are eliminated. A plurality of unrelated instructions can therefore be executed simultaneously, and the present invention achieves the same function as a multiprocessor with a single set of hardware.
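The sharing of the six stages among six instruction streams can be sketched in Python. FIG. 2 is not reproduced here, so the particular assignment used below (stream k is one stage behind stream k-1) is an assumption, as are the names `stage_of`, `occupancy`, and `in_flight`. The second function also assumes that a stream starts one new instruction on every pass through the six stages, which follows from the PC being read at stage 0 and incremented at stage 1 on each pass.

```python
NUM_STAGES = 6
STREAMS = "ABCDEF"  # six independent instruction streams (programs)

def stage_of(stream_index: int, t: int) -> int:
    """Stage occupied by a stream at step t (assumed offset of one stage per stream)."""
    return (t - stream_index) % NUM_STAGES

def occupancy(t: int):
    """Return a list whose entry i names the stream occupying stage i at step t."""
    row = [None] * NUM_STAGES
    for k, name in enumerate(STREAMS):
        row[stage_of(k, t)] = name
    return row

def in_flight(pass_index: int):
    """Map cycle number (1 to 4) to the instruction of one stream in that cycle.

    Instruction i of a stream is assumed to start cycle 1 on pass i, so on pass p
    the instruction in cycle c is p - c + 1, if that index is not negative.
    """
    return {c: pass_index - c + 1 for c in range(1, 5) if pass_index - c + 1 >= 0}

if __name__ == "__main__":
    for t in range(3):
        print(t, occupancy(t))
    print(in_flight(5))
```

Every stage is occupied by a different stream at every step, and within one stream up to four consecutive instructions are in different cycles at once, which corresponds to the simultaneous fetch, operand read, and execution described for instruction stream A.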

[0011] The above description takes the case of six stages as an example, but the processing can be divided into more stages so that more instructions are executed simultaneously, and it is clear that the stages can be subdivided down to a single gate level.

[0012]

[Effects of the Invention] The present invention raises the utilization efficiency of hardware and provides a multiprocessor that executes a plurality of instruction streams simultaneously on a single set of hardware and that performs high-speed processing even with memories that have long access times. The present invention therefore provides an extremely effective means of realizing a high-performance processing device.

[Brief description of drawings]

FIG. 1 is a structural diagram of a processing device according to the present invention.

FIG. 2 shows the pipeline operation sequence of the processing device.

[Explanation of symbols]

100: adder
110: ACC
120: MDR
140: PC
141: increment-by-one adder
150: IR
160: data memory
170: instruction memory
161, 162, 171, 172, 200, 201, 202, 300, 301, 302, 304, 305, 306, 307: channels

Claims

1. A processing device having a structure in which the processing is divided into a plurality of stages, a register is provided for each stage, and the data in registers of the same kind is transferred sequentially in stage order, wherein a single instruction occupies one of the plurality of stages and is executed by advancing the stage it occupies, and wherein the instruction controls the stages a plurality of times.

2. The processing device according to claim 1, wherein the register that holds the fetched instruction comprises a shift register that passes through the stages a plurality of times.

3. The processing device according to claim 2, wherein part of the contents of the register holding the instruction is dropped as required when the contents are transferred between stages.