JPH0296234A

JPH0296234A - Data processor

Info

Publication number: JPH0296234A
Application number: JP63248108A
Authority: JP
Inventors: Masato Suzuki; 正人鈴木; Masashi Deguchi; 雅士出口; Yukinobu Nishikawa; 幸伸西川; Takashi Sakao; 坂尾　隆
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1988-09-30
Filing date: 1988-09-30
Publication date: 1990-04-09
Anticipated expiration: 2012-10-27
Also published as: JP2668987B2

Abstract

PURPOSE: To obtain a data processor that shortens the execution time of an instruction sequence without degrading pipeline efficiency, by providing first and second arithmetic units and operand reading and writing devices in the stages following the instruction decoder. CONSTITUTION: An independently operating arithmetic unit 3 executes instructions that require neither an operand fetch nor a write to memory, such as inter-register operations. An operand reading device 4, transfer means (6 - 8), and an arithmetic unit 5 execute instructions that require reading from operand memory. An operand writing device 6 and the transfer means (4, 7 and 8) execute instructions that require writing to operand memory. Consecutive instruction sequences, such as an instruction requiring an operand fetch followed by instructions requiring none, can thus be executed in parallel. As a result, the execution time of instruction sequences is reduced while the loss of pipeline efficiency is minimized.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a data processing device intended to increase the speed of computers.

Prior Art

An example of a conventional data processing device is described in Tohru Motooka (元岡達), "Computer System Technology" (計算機システム技術), Ohmsha, April 20, 1973, pp. 93-99.

FIG. 10 shows the configuration of this conventional data processing device. In FIG. 10, 11 is an instruction prefetch device that prefetches instruction codes; 12 is an instruction decoding device; 13 is an address calculation device that calculates operand addresses; 14 is an operand prefetch device that prefetches operands; 15 is an arithmetic unit; 16 is an input/output bus that connects memory, I/O, and the like; and 17 is a bus control device that arbitrates requests from the instruction prefetch device 11, the operand prefetch device 14, and (when operands are written) the arithmetic unit 15, and controls the input/output bus 16.

The instruction decoding device 12 decodes the instruction code prefetched by the instruction prefetch device 11 and issues control information to the address calculation device 13: the instruction-execution control information, plus control information for operand address calculation and prefetch when a memory operand fetch is involved, or control information for operand address calculation when a write to memory is involved.

The address calculation device 13 calculates the operand address and sends the operand address, the memory-reference control information, and the instruction-execution control information to the operand prefetch device 14.

When a memory operand fetch is required, the operand prefetch device 14 issues a request to the bus control device 17 and prefetches the operand according to the operand address received from the address calculation device 13. It sends to the arithmetic unit 15 the prefetched operand data received from the bus control device 17, together with the operand write address (when a write to memory is involved) and the instruction-execution control information received from the address calculation device 13.

The arithmetic unit 15 executes operations according to the prefetched data and the instruction-execution control information received from the operand prefetch device 14. When the result of an operation must be written to memory, it issues a request to the bus control device 17 and writes the result to memory at the write address received from the operand prefetch device 14.

The operation of the conventional data processing device configured as described above is explained below.

FIG. 11 shows the operation timing chart.

The chart shows, clock by clock, the instructions being executed in the instruction prefetch device 11, the instruction decoding device 12, the address calculation device 13, the operand prefetch device 14, and the arithmetic unit 15. Here the instruction prefetch device 11 contains an internal cache memory, and the number of clocks required by each device is: instruction prefetch device 11, 1 clock; instruction decoding device 12, 1 clock; address calculation device 13, 1 clock; operand prefetch device 14, 3 clocks; and arithmetic unit 15, 1 clock. The instruction sequence being executed consists of one instruction that requires a memory operand fetch followed by two instructions that do not, and this three-instruction pattern repeats. Specifically, instructions 1 and 4 require a memory operand fetch, and instructions 2, 3, 5, and 6 do not. The pipeline is assumed to be initially empty (for example, just after a conditional branch).

For instruction 1, the instruction prefetch device 11 prefetches the instruction code at clock t1 and issues it to the instruction decoding device 12. At clock t2, the instruction decoding device 12 decodes the instruction and issues the instruction-execution control information and the control information for operand address calculation and prefetch to the address calculation device 13. At clock t3, the address calculation device 13 calculates the operand address according to the control information for operand address calculation and prefetch, and sends the operand address, the memory-reference control information, and the instruction-execution control information to the operand prefetch device 14.

At clocks t4 to t6, the operand prefetch device 14 prefetches the operand according to the operand address and the memory-reference control information, and sends the prefetched data and the instruction-execution control information to the arithmetic unit 15. At clock t7, the arithmetic unit 15 executes the operation according to the instruction-execution control information.

For instruction 2, the instruction prefetch device 11 prefetches the instruction code at clock t2 and the instruction decoding device 12 decodes it at clock t3; because the instruction requires no memory operand fetch, only the instruction-execution control information is issued to the address calculation device 13. At clock t4 the instruction passes through the address calculation device 13 stage, but since there is no control information for operand address calculation and prefetch, nothing is done there, and the device attempts to send the instruction-execution control information to the operand prefetch device 14. However, because instruction 1 has not yet completed in the operand prefetch device 14, the sending of instruction 2's instruction-execution control information is delayed until clock t7.

At clock t7 instruction 2 passes through the operand prefetch device 14 stage, but since there is no control information for operand address calculation and prefetch, nothing is done there, and the instruction-execution control information is sent to the arithmetic unit 15. At clock t8, the arithmetic unit 15 executes the operation according to the instruction-execution control information.
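The timing walked through for FIG. 10 and FIG. 11 can be reproduced with a small model. The following Python sketch is an illustration, not part of the patent: it assumes a strictly in-order five-stage pipeline in which each stage holds one instruction and an instruction cannot advance until the next stage has been vacated. The names `STAGES` and `schedule` are hypothetical.

```python
# Stages: instruction prefetch, decode, address calculation,
# operand prefetch, arithmetic (devices 11, 12, 13, 14, 15).
STAGES = ["IF", "DEC", "AC", "OF", "EX"]

def schedule(needs_fetch, fetch_clocks=3):
    """Return, per instruction, {stage: (first_clock, last_clock)}.

    Assumption: every instruction passes through all five stages in order,
    and a stage is free only once the previous instruction has moved on.
    """
    result = []
    for fetch in needs_fetch:
        timing = {}
        ready = 1  # earliest clock at which the instruction may enter the next stage
        for s, stage in enumerate(STAGES):
            start = ready
            if result:
                prev = result[-1]
                if s + 1 < len(STAGES):
                    # Free once the previous instruction has entered the next stage.
                    free = prev[STAGES[s + 1]][0]
                else:
                    # Last stage: free once the previous instruction has finished.
                    free = prev[stage][1] + 1
                start = max(start, free)
            duration = fetch_clocks if (stage == "OF" and fetch) else 1
            timing[stage] = (start, start + duration - 1)
            ready = start + duration
        result.append(timing)
    return result

# Instructions 1 and 4 fetch a memory operand; 2, 3, 5, 6 do not.
for n, t in enumerate(schedule([True, False, False, True, False, False]), start=1):
    print(f"instruction {n}: OF t{t['OF'][0]}-t{t['OF'][1]}, EX t{t['EX'][0]}")
```

Under these assumptions the model places instruction 1 in the arithmetic stage at t7 and instruction 2 at t8, which agrees with the description. The clocks it reports for instructions 3 to 6 follow from the model only and are not stated in the text.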

With the above configuration, however, even instructions that need no memory operand fetch, such as inter-register operations, must pass needlessly through pipeline stages such as operand address calculation and prefetch. Consequently, when the pipeline is disturbed, for example at a branch, overhead is incurred in refilling the pipeline. A further problem is that the needless passage of such instructions through these pipeline stages restricts the bus bandwidth available for operand accesses by the instructions that actually require an operand fetch.

To solve the above problems, a patent application such as the following has been filed (Japanese Unexamined Patent Publication No. Sho 63-167935).

FIG. 12 shows the configuration of that data processing device. In FIG. 12, 18 is an instruction prefetch device that prefetches instruction codes; 19 is an instruction decoding device; 20 is an address calculation device that calculates operand addresses; 21 is an operand prefetch device that prefetches operands; 22 is an arithmetic control device; 23 is an arithmetic unit; 24 is an input/output bus that connects memory, I/O, and the like; and 25 is a bus control device that arbitrates requests from the instruction prefetch device 18, the operand prefetch device 21, and (when operands are written) the arithmetic unit 23, and controls the input/output bus 24.

The instruction decoding device 19 decodes the instruction code prefetched by the instruction prefetch device 18. When a memory operand fetch is involved, it issues control information for operand address calculation and prefetch to the address calculation device 20; when a write to memory is involved, it issues control information for operand address calculation to the address calculation device 20. It issues the instruction-execution control information to the arithmetic control device 22.

The address calculation device 20 calculates the operand address and sends the operand address and the memory-reference control information to the operand prefetch device 21.

When a memory operand fetch is required, the operand prefetch device 21 issues a request to the bus control device 25, prefetches from memory according to the operand address received from the address calculation device 20, and queues the data read.

When a write to memory is required, the operand prefetch device 21 queues the operand address. The queuing status of the prefetched data and the write addresses is reported to the arithmetic control device 22.

The arithmetic control device 22 queues the control information for the arithmetic unit 23 received from the instruction decoding device 19, and issues control information in response to requests from the arithmetic unit 23. If the control information to be issued requires prefetched data or a write address, the arithmetic control device 22 checks the queuing status of the prefetched data and write addresses in the operand prefetch device 21. If they are not ready, issuing the control information to the arithmetic unit 23 is delayed until the prefetched data or write address is ready.

The arithmetic unit 23 executes operations according to the control information received from the arithmetic control device 22 and the prefetched data received from the operand prefetch device 21. When the result of an operation must be written to memory, it issues a request to the bus control device 25 and writes the result to memory at the write address received from the operand prefetch device 21.

The operation of the second conventional data processing device configured as described above is explained below.

FIG. 13 shows the operation timing chart.

The chart shows, clock by clock, the instructions being executed in the instruction prefetch device 18, the instruction decoding device 19, the address calculation device 20, the operand prefetch device 21, and the arithmetic unit 23. Here the instruction prefetch device 18 contains an internal cache memory, and the number of clocks required by each device is: instruction prefetch device 18, 1 clock; instruction decoding device 19, 1 clock; address calculation device 20, 1 clock; operand prefetch device 21, 3 clocks; and arithmetic unit 23, 1 clock. The instruction sequence is the same as in the timing chart of FIG. 11: one instruction that requires a memory operand fetch followed by two instructions that do not, with this three-instruction pattern repeating. Specifically, instructions 1 and 4 require a memory operand fetch, and instructions 2, 3, 5, and 6 do not. The pipeline is assumed to be initially empty (for example, just after a conditional branch).

For instruction 1, the instruction prefetch device 18 prefetches the instruction code at clock t1 and issues it to the instruction decoding device 19. At clock t2, the instruction decoding device 19 decodes the instruction, issues the control information for operand address calculation and prefetch to the address calculation device 20, and issues the instruction-execution control information to the arithmetic control device 22. Because the data is not yet ready, however, the instruction-execution control information remains queued in the arithmetic control device 22 and its issue to the arithmetic unit 23 is delayed. In accordance with the control information for operand address calculation and prefetch, the address calculation device 20 calculates the operand address at clock t3, and the operand prefetch device 21 prefetches the operand at clocks t4 to t6. Once the data is ready, the arithmetic control device 22 issues the queued instruction-execution control information, and the instruction is executed in the arithmetic unit 23 at clock t7.

For instruction 2, the instruction prefetch device 18 prefetches the instruction code at clock t2 and the instruction decoding device 19 decodes it at clock t3; because the instruction requires no memory operand fetch, only the instruction-execution control information is issued, to the arithmetic control device 22. However, because the control information for instruction 1 has not yet been issued, instruction 2's instruction-execution control information remains queued in the arithmetic control device 22, and its issue to the arithmetic unit 23 is delayed until clock t8.
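The behavior of this second conventional device can be sketched as a queue in front of an in-order arithmetic unit. The Python below is a minimal illustration under assumptions that the text does not state: one instruction is prefetched per clock starting at t1, the queues never fill, and the arithmetic unit issues strictly in program order. The function name `ex_clocks_queued` is hypothetical.

```python
def ex_clocks_queued(needs_fetch, fetch_clocks=3):
    """Clock at which each instruction is executed in the arithmetic unit."""
    ex = []
    prefetch_free = 1  # first clock at which the operand prefetch device is idle
    for i, fetch in enumerate(needs_fetch):
        decode = i + 2      # prefetched at t(i+1), decoded one clock later
        ready = decode + 1  # register-only instruction: ready right after decode
        if fetch:
            addr_calc = decode + 1
            start = max(addr_calc + 1, prefetch_free)
            prefetch_free = start + fetch_clocks
            ready = start + fetch_clocks  # operand is available after the prefetch
        # In-order issue: never earlier than one clock after the previous instruction.
        ex.append(max(ready, ex[-1] + 1) if ex else ready)
    return ex

# Instructions 1 and 4 fetch a memory operand; 2, 3, 5, 6 do not.
print(ex_clocks_queued([True, False, False, True, False, False]))
```

With these assumptions instruction 1 executes at t7 and instruction 2 at t8, matching the description of FIG. 13. Instruction 2 is ready as early as t4 but is held in the queue behind instruction 1, which is the behavior the following paragraphs identify as the weakness of this configuration.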

According to the second conventional example described above, instructions that need no memory operand fetch, such as inter-register operations, no longer pass needlessly through pipeline stages such as operand address calculation and prefetch. Therefore, even when the pipeline is disturbed, for example at a branch, no overhead is incurred in refilling the pipeline. The restriction on bus bandwidth for operand accesses by instructions that actually require an operand fetch, which was caused by the needless passage of non-fetching instructions through those stages, is also eliminated.

Problems to Be Solved by the Invention

With the above configuration, however, when an instruction requiring an operand fetch is followed by an instruction requiring a write to memory, the arithmetic stage of the latter must wait until the operand prefetch stage and arithmetic stage of the former have completed, even if the latter does not use the former's result. If an instruction requiring neither an operand fetch nor a write to memory then follows, its arithmetic stage must wait until the operand prefetch and arithmetic stages of the preceding operand-fetch instruction and the arithmetic stage of the preceding memory-write instruction have completed, even if it does not use the result of the operand-fetch instruction.

Likewise, when an instruction requiring a write to memory is followed by an instruction requiring an operand fetch, the arithmetic stage of the latter must wait until the operand prefetch stage and arithmetic stage of the former have completed, even if the latter does not use the former's result. If an instruction requiring neither an operand fetch nor a write to memory then follows, its arithmetic stage must wait until the arithmetic stage of the preceding memory-write instruction and the operand prefetch and arithmetic stages of the preceding operand-fetch instruction have completed, even if it does not use the result of the operand-fetch instruction.

Thus, when instructions requiring an operand fetch and instructions requiring a write to memory, both of which occur frequently in programs, appear consecutively, the number of execution clocks increases, and the execution of subsequent instructions that require neither an operand fetch nor a write to memory, such as inter-register operations, is delayed. This is the first problem. It is explained with the operation timing chart of FIG. 14. The number of clocks required by each device is the same as in FIG. 13, except that queuing a write address in the operand prefetch device 21 takes 1 clock and the operation plus write to memory in the arithmetic unit 23 takes 4 clocks. FIG. 14 shows the case where the instruction sequence is an instruction requiring an operand fetch (instruction 1), an instruction that writes a register to memory (instruction 2), and an instruction requiring neither an operand fetch nor a write to memory (instruction 3). For instruction 1, the address is calculated at clock t3, the operand is fetched at clocks t4 to t6, and the operation is performed at clock t7. Instruction 2, however, after its address is calculated at clock t4, must wait for instruction 1's arithmetic stage to complete, and its operation and write to memory take place at clocks t8 to t11. The operation of instruction 3 is delayed until clock t12.
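The FIG. 14 timing can be checked with a short calculation. The sketch below is illustrative only: the tables `PREP` and `EX` and the function `ex_intervals` are hypothetical names, and the model assumes one instruction is prefetched per clock from t1 and that the arithmetic unit 23 handles instructions strictly in program order.

```python
# Clocks needed after decode before an instruction can enter the arithmetic unit:
#   load:  address calculation (1) + operand prefetch (3)
#   store: address calculation (1) + write-address queuing (1)
#   reg:   nothing to prepare
PREP = {"load": 4, "store": 2, "reg": 0}
# Clocks spent in the arithmetic unit (a store includes the write to memory).
EX = {"load": 1, "store": 4, "reg": 1}

def ex_intervals(kinds):
    """Return (ready, start, end) clocks of the arithmetic stage per instruction."""
    out = []
    ex_free = 1  # first clock at which the arithmetic unit is idle
    for i, kind in enumerate(kinds):
        decode = i + 2                   # prefetched at t(i+1), decoded at t(i+2)
        ready = decode + 1 + PREP[kind]  # earliest clock the operation could start
        start = max(ready, ex_free)      # in-order: wait for the previous instruction
        end = start + EX[kind] - 1
        out.append((ready, start, end))
        ex_free = end + 1
    return out

for n, (ready, start, end) in enumerate(ex_intervals(["load", "store", "reg"]), start=1):
    print(f"instruction {n}: ready t{ready}, executes t{start}-t{end}, waits {start - ready}")
```

In this model instruction 3 could start at t5 but does not start until t12, a wait of 7 clocks that comes entirely from in-order issue and not from any data dependency.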

In view of this, an object of the present invention is to provide a data processing device in which an instruction requiring an operand fetch and an instruction requiring a write to memory can be executed in parallel even when they appear consecutively, so that the execution of subsequent instructions requiring neither, such as inter-register operations, is not delayed.

Also, in the conventional configuration, once an instruction requiring an operand fetch has been executed, the arithmetic stage of each subsequent instruction becomes separated in time from its instruction prefetch stage, even if that instruction needs no operand fetch. As a result, gaps appear in the arithmetic stage, and the number of execution clocks of the first operand-fetch instruction to appear from an empty pipeline is not 1 clock but the sum of the clocks required by the address calculation device 20, the operand prefetch device 21, and the arithmetic unit 23. This is the second problem. Furthermore, when a conditional branch instruction follows, the instruction prefetch stage of the branch target instruction is delayed, so the pipeline interlock time lengthens and pipeline efficiency drops sharply. This is the third problem. The second problem is explained with the operation timing chart of FIG. 15 and the third with that of FIG. 16. In both figures the number of clocks required by each device is the same as in FIG. 13.

FIG. 15 shows the case where the instruction sequence is two instructions requiring no operand fetch (instructions 1 and 2), followed by one instruction requiring an operand fetch (instruction 3), followed by two more instructions requiring no operand fetch (instructions 4 and 5). Instructions 1 and 2 are executed at clocks t3 and t4, respectively, but instruction 3, because of its address calculation and operand prefetch stages, is executed at clock t9. The arithmetic stage is therefore idle during clocks t5 to t8, the number of execution clocks of instruction 3 (the first operand-fetch instruction to appear from the empty pipeline) is 5 clocks, and the arithmetic stages of instructions 4 and 5 are separated from their instruction prefetch stages by 6 clocks.

FIG. 16 shows the case where one instruction requiring an operand fetch (instruction 1) and four instructions requiring none (instructions 2 to 5) are followed by a conditional branch instruction (instruction 6), the branch is taken, and control transfers to an instruction requiring no operand fetch (instruction n). Because instruction 1 requires the address calculation and operand prefetch stages, the arithmetic stages of the subsequent instructions 2 to 6 are separated from their instruction prefetch stages by 6 clocks. Consequently the instruction prefetch stage of the branch target instruction n falls at clock t13, and the pipeline is interlocked at the instruction prefetch stage for 6 clocks, t7 to t12. Thus, in the conventional configuration, conditional branch instructions, which account for as much as 20 percent of the instructions in a program, cause a large drop in pipeline efficiency.
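The figures quoted for FIG. 15 and FIG. 16 follow from simple clock arithmetic. The Python sketch below is an illustration under stated assumptions, not the patent's method: one instruction is prefetched per clock from t1, the arithmetic unit issues in program order, each sequence contains at most one operand-fetch instruction, and the branch target is prefetched in the clock after the branch leaves the arithmetic stage. The names `ex_clocks` and `branch_interlock` are hypothetical, and the branch itself is modeled as a `"reg"` instruction.

```python
def ex_clocks(kinds, fetch_clocks=3):
    """Arithmetic-stage clock of each instruction ('load' or 'reg')."""
    clocks = []
    for i, kind in enumerate(kinds):
        ready = i + 3                  # prefetch t(i+1), decode t(i+2), ready t(i+3)
        if kind == "load":
            ready += 1 + fetch_clocks  # address calculation + operand prefetch
        clocks.append(max(ready, clocks[-1] + 1) if clocks else ready)
    return clocks

def branch_interlock(kinds):
    """kinds ends with the taken branch; return (target prefetch clock, stalled clocks)."""
    branch_prefetch = len(kinds)  # the last instruction is prefetched at t(len)
    target_prefetch = ex_clocks(kinds)[-1] + 1
    return target_prefetch, target_prefetch - branch_prefetch - 1

# FIG. 15: two register instructions, one operand fetch, two register instructions.
fig15 = ex_clocks(["reg", "reg", "load", "reg", "reg"])
idle = [t for t in range(fig15[0], fig15[-1]) if t not in fig15]
print(fig15, idle)

# FIG. 16: one operand fetch, four register instructions, then a taken branch.
print(branch_interlock(["load"] + ["reg"] * 5))
# Same sequence with the operand fetch replaced by a register instruction.
print(branch_interlock(["reg"] * 6))
```

The first line reports arithmetic-stage clocks t3, t4, t9, t10, t11 with t5 to t8 idle, and the second reports the branch target prefetched at t13 after a 6-clock stall, both as described in the text. The last line is the comparison case without an operand fetch, for which this model gives a 2-clock stall; that value comes from the model's assumptions and is not stated in the text.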

In view of this, an object of the present invention is to provide a data processing device in which, even when an instruction requiring an operand fetch is followed by instructions requiring none, no gap arises in the arithmetic stage and the number of execution clocks of the first operand-fetch instruction to appear from an empty pipeline is 1 clock; and in which, even when a conditional branch instruction follows, the pipeline interlock time is the same as when there is no operand-fetch instruction, so that the loss of pipeline efficiency is minimized.

In the conventional configuration, moreover, when a register rewritten by the operation of an instruction requiring no operand fetch is read in the address calculation of a subsequent instruction requiring an operand fetch, the address calculation stage is interlocked until the arithmetic stage of the non-fetching instruction completes (this is called address calculation interference), and pipeline efficiency drops. This is the fourth problem. It is explained with the operation timing chart of FIG. 17. The number of clocks required by each device is the same as in FIG. 13. FIG. 17 shows the case where the instruction sequence is an instruction requiring an operand fetch (instruction 1), followed by two instructions requiring none (instructions 2 and 3), followed by another instruction requiring an operand fetch (instruction 4), and the result of instruction 3 is used in the address calculation of instruction 4.

Without address calculation interference, the address calculation of instruction 4 could be performed at clock t6. Because of the interference, it must wait until clock t10, the clock following completion of the operation of instruction 3.

Thus the pipeline is interlocked in the address calculation stage for as many as four clocks, t6 to t9, and address calculation interference lowers pipeline efficiency.
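The interlock length quoted for FIG. 17 follows from simple clock arithmetic, which the following minimal Python sketch reproduces. The function name and parameters are illustrative assumptions, not terms from the specification, and the model covers only the single dependency described above.

```python
def address_calc_stall(earliest_clock: int, producer_done_clock: int) -> tuple[int, int]:
    """Return (start_clock, stalled_clocks) for an address calculation that
    reads a register written by an earlier instruction's operation stage.

    Hypothetical helper: models only the one dependency described for FIG. 17.
    """
    # The address calculation cannot start before the clock that follows
    # completion of the producing operation.
    start = max(earliest_clock, producer_done_clock + 1)
    return start, start - earliest_clock


# Values taken from the FIG. 17 description: instruction 4 could start at t6,
# but the operation of instruction 3 completes at t9.
start, stalled = address_calc_stall(earliest_clock=6, producer_done_clock=9)
print(f"address calculation at t{start}, interlocked for {stalled} clocks")
```

With the values from the text, the address calculation lands on t10 after four interlocked clocks (t6 to t9).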

In view of this point, an object of the present invention is to provide a data processing device in which no interlock occurs, and pipeline efficiency therefore does not fall, even when a register rewritten by the operation of an instruction requiring no operand fetch is read in the address calculation of a subsequent instruction requiring an operand fetch.

In the conventional configuration, moreover, once an instruction that requires multiple operand fetches, or one that writes multiple operands to memory, is executed, the operation of a following instruction must wait the long time until that instruction completes, even if the following instruction requires no operand fetch and does not need its execution result. This loss of pipeline efficiency was a fifth problem. As a sixth problem, the conventional configuration could not bring out the performance of software that optimizes a program so that the instructions following such a multiple-operand instruction do not need its execution result. The fifth and sixth problems are explained with the operation timing diagram of FIG. 18. The number of clocks required by each device is the same as shown in FIG. 13. FIG. 18 shows an instruction sequence in which an instruction performing three operand fetches (instruction 1) is followed by two instructions requiring no operand fetch (instructions 2 and 3), neither of which needs the execution result of instruction 1. Instruction 1 prefetches the first operand at clocks t4 to t6 and stores it in register R1 at clock t7, prefetches the second operand at clocks t7 to t9 and stores it in register R2 at clock t10, and prefetches the third operand at clocks t10 to t12 and stores it in register R3 at clock t13. The operation of instruction 2 is made to wait until clock t14, the clock following the third operation stage of instruction 1, even though arithmetic unit 23 is unused at clock t4 and instruction 2 does not use registers R1, R2, and R3, which hold the execution results of instruction 1. Likewise, the operation of instruction 3 is made to wait until clock t15, even though arithmetic unit 23 is unused at clock t6 and instruction 3 does not use registers R1, R2, and R3. Arithmetic unit 23 thus sits idle for a long time, and pipeline efficiency drops markedly.
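The clock numbers given for FIG. 18 can be regenerated with a small Python sketch of the conventional timing. It is a toy model under stated assumptions: each prefetch takes three clocks, the store of one operand shares a clock with the start of the next prefetch, and the following register-only instructions run one per clock after the last store. The function and variable names are hypothetical.

```python
def conventional_timing(first_fetch_clock, fetch_clocks, n_operands):
    """Toy model of the conventional timing described for FIG. 18."""
    operands = []
    clock = first_fetch_clock
    for index in range(1, n_operands + 1):
        fetch_span = (clock, clock + fetch_clocks - 1)  # clocks spent prefetching
        store_clock = clock + fetch_clocks              # operand written to R<index>
        operands.append((f"R{index}", fetch_span, store_clock))
        clock = store_clock  # the next prefetch begins in the same clock as the store
    last_store = operands[-1][2]
    # Following register-only instructions run one per clock after the last store.
    return operands, last_store + 1, last_store + 2


operands, instr2_clock, instr3_clock = conventional_timing(4, 3, 3)
for register, (start, end), store in operands:
    print(f"{register}: prefetch t{start}-t{end}, stored at t{store}")
print(f"instruction 2 operates at t{instr2_clock}, instruction 3 at t{instr3_clock}")
```

Under these assumptions the sketch places the operations of instructions 2 and 3 at t14 and t15, matching the description, although arithmetic unit 23 was already free at t4 and t6.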

In view of these points, an object of the present invention is to provide a data processing device in which, even when an instruction requiring multiple operand fetches or writing multiple operands to memory is executed, the operation of a following instruction that requires no operand fetch and does not need the execution result of that instruction is performed in parallel with, or ahead of, the operation of that instruction, so that pipeline efficiency does not fall. A further object of the present invention is to provide a data processing device that brings out the performance of software that optimizes a program so that the instructions following an instruction requiring multiple operand fetches or writing multiple operands to memory do not need its execution result.

All of the problems described above become more pronounced as the number of clocks required by operand prefetch device 21 increases. In the conventional configuration, it is therefore possible to lessen the first through sixth problems by giving operand prefetch device 21 a data cache memory and thereby reducing the number of clocks it requires on a cache hit. However, although the hit rate improves as the capacity of the built-in cache memory grows, the access time of the cache memory also grows, which makes it difficult to reduce the number of clocks operand prefetch device 21 requires on a cache hit. This was a new, seventh problem.

In view of this point, an object of the present invention is to provide a data processing device that solves the first through sixth problems regardless of any increase or decrease in the number of clocks needed to read or write an operand, whether caused by the presence or absence of a built-in cache memory, its hit rate, or its capacity, and in which the number of clocks needed to read or write an operand can be set independently of the number of execution clocks of inter-register operations.

In the conventional configuration, furthermore, to shorten the time arithmetic control device 22 waits for prefetched data, instruction decoding device 19 must send the control information for address calculation and prefetching to address calculation device 20 as early as possible, ahead of sending the control information for instruction execution to arithmetic control device 22. This complicates the control of instruction decoding device 19. Such a configuration also requires complex control in arithmetic control device 22 for waiting on prefetched data or write addresses. As the control of each device becomes more complex, the delay of the control circuits increases, and it is not easy to raise the clock frequency at which the processor operates. This was an eighth problem.

In view of this point, an object of the present invention is to provide a data processing device in which the clock frequency at which the processor operates can easily be raised.

Means for Solving the Problems

The first invention is a data processing device comprising: instruction prefetch means for prefetching instructions; instruction decoding means for decoding instructions; arithmetic means, consisting of at least general-purpose registers and an arithmetic unit, which, according to the decoding result, performs the operation specified by the instruction when the instruction format requires neither a read from nor a write to memory, and calculates the address of the operand to be read or written when the instruction format requires a read from or a write to memory; operand read means, provided exclusively for instructions whose format requires a read from memory, for reading an operand from memory at the address obtained by the arithmetic means, together with read operand transfer means for storing the data obtained by the operand read means in a general-purpose register in the arithmetic means; and operand write means, provided exclusively for instructions whose format requires a write to memory, for writing an operand to memory at the address obtained by the arithmetic means, together with write operand transfer means for taking the data that the operand write means writes to memory out of a general-purpose register in the arithmetic means.

The second invention is a data processing device comprising: instruction prefetch means for prefetching instructions; instruction decoding means for decoding instructions; first arithmetic means, consisting of at least general-purpose registers and an arithmetic unit, which, according to the decoding result, performs the operation specified by the instruction when the instruction format requires neither a read from nor a write to memory, and calculates the address of the operand to be read or written when the instruction format requires a read from or a write to memory; operand read means, provided exclusively for instructions whose format requires a read from memory, for reading an operand from memory at the address obtained by the first arithmetic means, second arithmetic means for operating on the operand that the operand read means has read from memory, and read operand transfer means for storing the data obtained by the second arithmetic means in a general-purpose register in the first arithmetic means; and operand write means, provided exclusively for instructions whose format requires a write to memory, for writing an operand to memory at the address obtained by the first arithmetic means, together with write operand transfer means for taking the data that the operand write means writes to memory out of a general-purpose register in the first arithmetic means.

The third invention is a data processing device comprising: instruction prefetch means for prefetching instructions; instruction decoding means for decoding instructions; arithmetic means, consisting of at least general-purpose registers and an arithmetic unit, which, according to the decoding result, performs the operation specified by the instruction when the instruction format requires neither a read from nor a write to memory, and, when the instruction format requires a read from or a write to memory, calculates the address of the operand to be read or written and issues information designating the general-purpose registers that will store the single or multiple operands read from memory or that hold the single or multiple operands to be written to memory; operand read means, provided exclusively for instructions whose format requires a read from memory, for reading single or multiple operands from memory at the address obtained by the arithmetic means, together with read operand transfer means which receives from the arithmetic means the information designating the general-purpose registers that will store the single or multiple operands read from memory and stores the single or multiple operands obtained by the operand read means in those general-purpose registers in the arithmetic means; and operand write means, provided exclusively for instructions whose format requires a write to memory, for writing single or multiple operands to memory at the address obtained by the arithmetic means, together with write operand transfer means which receives from the arithmetic means the information designating the general-purpose registers holding the single or multiple operands to be written to memory and takes the single or multiple operands that the operand write means writes to memory out of those general-purpose registers in the arithmetic means.

The fourth invention is a data processing device comprising: instruction prefetch means for prefetching instructions; instruction decoding means for decoding instructions; first arithmetic means, consisting of at least general-purpose registers and an arithmetic unit, which, according to the decoding result, performs the operation specified by the instruction when the instruction format requires neither a read from nor a write to memory, and, when the instruction format requires a read from or a write to memory, calculates the address of the operand to be read or written and issues information designating the general-purpose registers that will store the single or multiple operands read from memory or that hold the single or multiple operands to be written to memory; operand read means, provided exclusively for instructions whose format requires a read from memory, for reading single or multiple operands from memory at the address obtained by the first arithmetic means, second arithmetic means for operating on the single or multiple operands that the operand read means has read from memory, and read operand transfer means which receives from the first arithmetic means the information designating the general-purpose registers that will store the single or multiple operands read from memory and stores the single or multiple operands obtained by the second arithmetic means in those general-purpose registers in the first arithmetic means; and operand write means, provided exclusively for instructions whose format requires a write to memory, for writing single or multiple operands to memory at the address obtained by the first arithmetic means, together with write operand transfer means which receives from the first arithmetic means the information designating the general-purpose registers holding the single or multiple operands to be written to memory and takes the single or multiple operands that the operand write means writes to memory out of those general-purpose registers in the first arithmetic means.

The fifth invention is the data processing device of the first or third invention further comprising flag generation means that reflects execution results in a status register holding a plurality of execution result flags as follows. While the arithmetic means is executing an instruction that follows the instruction being executed by the operand read means, the flag generation means uses the plurality of pieces of execution result flag generation information issued by the arithmetic means. When the operand read means finishes executing its instruction, the flag generation means treats as valid only those pieces of flag generation information, among the plurality issued by the operand read means, that were not changed by the execution of the subsequent instruction in the arithmetic means.

The sixth invention is the data processing device of the second or fourth invention further comprising flag generation means that reflects execution results in a status register holding a plurality of execution result flags as follows. While the first arithmetic means is executing an instruction that follows the instruction being executed by the second arithmetic means, the flag generation means uses the plurality of pieces of execution result flag generation information issued by the first arithmetic means. When the second arithmetic means finishes executing its instruction, the flag generation means treats as valid only those pieces of flag generation information, among the plurality issued by the second arithmetic means, that were not changed by the execution of the subsequent instruction in the first arithmetic means.

Operation

With the configuration described above, the first and second inventions execute three classes of instruction in units that operate independently of one another: instructions that require neither an operand fetch nor a write to memory, such as inter-register operations, are executed by the first arithmetic means; instructions that require reading an operand from memory are executed by the operand read means, the read operand transfer means, and the second arithmetic means; and instructions that require writing an operand to memory are executed by the operand write means and the write operand transfer means.

As a result, each of the following pairs can be executed in parallel: an instruction requiring an operand fetch and a subsequent instruction requiring neither an operand fetch nor a write to memory; an instruction requiring a write to memory and a subsequent instruction requiring neither an operand fetch nor a write to memory; an instruction requiring an operand fetch and a subsequent instruction requiring a write to memory; and an instruction requiring a write to memory and a subsequent instruction requiring an operand fetch.

In addition, while the operand read means, the read operand transfer means, and the second arithmetic means are executing an instruction requiring an operand fetch, the first arithmetic means executes the subsequent instruction requiring no operand fetch. Consequently, even when an instruction requiring an operand fetch is followed by one that does not, no idle operation stage arises, and the first instruction requiring an operand fetch that appears from an empty pipeline can execute in one clock.

Moreover, because an instruction requiring an operand fetch is executed by the operand read means, the read operand transfer means, and the second arithmetic means independently of the first arithmetic means, the first operation stage of subsequent instructions never becomes separated in time from the instruction prefetch stage. Therefore, even when a conditional branch instruction follows an instruction requiring an operand fetch, the pipeline interlock time is the same as when no instruction requiring an operand fetch is present, and the loss of pipeline efficiency is minimized.

Furthermore, both the operations of instructions requiring no operand fetch, such as inter-register operations, and the operand address calculations of instructions requiring an operand fetch or a write to memory are performed by the first arithmetic means in instruction-sequence order. Hence no interlock occurs even when a register rewritten by the operation of an instruction requiring no operand fetch is read in the address calculation of a subsequent instruction requiring an operand fetch or a write to memory.

With the configuration described above, in the third and fourth inventions an instruction requiring multiple operand fetches has its address calculated by the first arithmetic means, after which the operand read means, the read operand transfer means, and the second arithmetic means operate multiple times, independently of the first arithmetic means, to execute it. Likewise, an instruction requiring multiple operands to be written to memory has its address calculated by the first arithmetic means, after which the operand write means and the write operand transfer means operate multiple times, independently of the first arithmetic means, to execute it. As a result, if a following instruction requires no operand fetch and does not need the execution result of the instruction requiring multiple operand fetches or multiple operand writes, its operation can be performed in the first arithmetic means in parallel with, or ahead of, the operation of that instruction.

According to the first, second, third, and fourth inventions, with the configuration described above, the following are independent of one another: execution by the first arithmetic means of instructions requiring neither an operand fetch nor a write to memory, such as inter-register operations; execution by the operand read means, the read operand transfer means, and the second arithmetic means of instructions requiring an operand to be read from memory; and execution by the operand write means and the write operand transfer means of instructions requiring an operand to be written to memory. Therefore, however the number of clocks needed to read an operand from memory or to write an operand to memory rises or falls with the presence or absence of a built-in cache memory, its hit rate, or its capacity, the temporal position of the operation stage of subsequent instructions requiring neither an operand fetch nor a write to memory, such as inter-register operations, does not change.

With the configuration described above, the first and second inventions do not prefetch memory operands. The instruction decoding means therefore has no need to issue the control information for address calculation ahead of the control information for instruction execution, and may issue all control information to the first arithmetic means at the same time. Control of the instruction decoding means is accordingly simple. Complex control for waiting on prefetched data or write addresses is also unnecessary. Because the control of each device is simplified in this way, the delay of the control circuits is shortened, and the clock frequency at which the processor operates can easily be raised.

With the configuration described above, in the fifth and sixth inventions the flag generation means reflects execution results in the status register as follows. While the first arithmetic means is executing an instruction that follows the instruction being executed by the operand read means or the second arithmetic means, it uses the plurality of pieces of execution result flag generation information issued by the first arithmetic means. When the operand read means or the second arithmetic means finishes executing its instruction, it treats as valid only those pieces of flag generation information, among the plurality issued by the operand read means or the second arithmetic means, that were not changed by the execution of the subsequent instruction in the first arithmetic means. Therefore, even if the order in which instructions are executed by the first arithmetic means and by the operand read means or the second arithmetic means differs from the order of the program's instruction sequence, the flags reflected in the processor's status register are consistent with those that would result from executing the instructions in program order.
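The flag-merging rule can be illustrated with a minimal Python sketch. It is not the circuit of the specification: the flag names (Z, N, C, V), the class and method names, and the restriction to a single in-flight older instruction are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlagGenerator:
    """Toy model of the flag generation means (hypothetical names throughout)."""
    status: dict = field(default_factory=lambda: {"Z": 0, "N": 0, "C": 0, "V": 0})
    overwritten: set = field(default_factory=set)  # flags already set by younger instructions

    def younger_result(self, flags: dict) -> None:
        """Flags issued by the first arithmetic means for a subsequent instruction
        that finished while the older memory-read instruction was still in flight."""
        self.status.update(flags)
        self.overwritten.update(flags)

    def older_result(self, flags: dict) -> None:
        """Flags issued when the older instruction finally completes: only flags
        that no younger instruction has changed are treated as valid."""
        for name, value in flags.items():
            if name not in self.overwritten:
                self.status[name] = value
        self.overwritten.clear()


gen = FlagGenerator()
gen.younger_result({"Z": 0, "N": 1})  # younger register operation completes first
gen.older_result({"Z": 1, "C": 1})    # older memory-read instruction completes later
print(gen.status)
```

In program order the older instruction would set Z=1 and C=1, and the younger one would then overwrite Z with 0 and set N=1. The sketch ends in the same state (Z=0, N=1, C=1, V=0) even though the results arrive in the opposite order.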

Embodiment

FIG. 1 is a block diagram of a data processing device in an embodiment of the present invention. In FIG. 1, 1 is an instruction prefetch device that prefetches instruction codes, 2 is an instruction decoding device, 3 is a first arithmetic unit, 4 is an operand read device that reads operands, 5 is a second arithmetic unit, 6 is an operand write device that writes operands to memory, 7 is an input/output bus connecting memory, I/O, and the like, and 8 is a bus control device that arbitrates requests from instruction prefetch device 1 and operand read device 4 and controls input/output bus 7.

Instruction decoding device 2 decodes the instruction code prefetched by instruction prefetch device 1 and issues to first arithmetic unit 3 the control information for instruction execution, the control information for operand address calculation and reading when the instruction involves a memory operand fetch, and the control information for operand address calculation and writing when the instruction involves a write to memory.

When the instruction is a register-to-register operation or a register-immediate operation, the first arithmetic unit 3 executes the operation according to the control information received from the instruction decoding device 2.

When the instruction involves fetching a memory operand, the first arithmetic unit 3 calculates the operand address according to the control information received from the instruction decoding device 2, and sends the operand address, the control information for the memory reference, and the control information for instruction execution to the operand reading device 4. When the instruction involves writing to memory, it calculates the write address according to the control information received from the instruction decoding device 2, and sends the write address and the control information for the memory write to the operand writing device 6.

The operand reading device 4 issues a request to the bus control device 8 and reads memory at the operand address received from the first arithmetic unit 3.

The data read is sent to the second arithmetic unit 5 together with the control information for instruction execution.

The second arithmetic unit 5 executes the operation according to the data and the instruction-execution control information received from the operand reading device 4. The result of the operation is returned to the first arithmetic unit 3.

The operand writing device 6 takes in the write data from the first arithmetic unit 3, issues a request to the bus control device 8 through the operand reading device 4, and writes to memory at the write address received from the first arithmetic unit 3.

FIG. 2 is a detailed block diagram of the first arithmetic unit 3, the operand reading device 4, the second arithmetic unit 5, and the operand writing device 6 shown in FIG. 1.

In FIG. 2, 301 is an arithmetic control circuit that controls the first arithmetic unit 3 according to the control information for instruction execution and for operand address calculation received from the instruction decoding device 2; 302 is a general-purpose register; 303 is a register control circuit that controls the general-purpose register 302; 304 is a first arithmetic and logic circuit that operates on the data stored in the general-purpose register 302; 305 is a status register; and 306 is a flag generation circuit that generates the operation-result flag information indicated by the status register 305. These are implemented in the first arithmetic unit 3.

401 is a read address register that holds the address at which memory is read; 402 is a read control circuit that receives from the arithmetic control circuit 301 the control information for memory reads, including the information (register list) that designates the registers to receive the results when a plurality of operands are read, and that controls the operand reading device 4 and the register control circuit 303; 403 is a read data buffer that holds the read data received from the bus control device 8; and 404 is a read address increment circuit that increases the contents of the read address register 401 by a fixed value. These are implemented in the operand reading device 4.

501 is a second arithmetic and logic circuit that operates on the data stored in the read data buffer 403 and the general-purpose register 302; it is implemented in the second arithmetic unit 5.

601 is a write address register that holds the address at which memory is written; 602 is a write control circuit that receives from the arithmetic control circuit 301 the control information for memory writes, including the information (register list) that designates the registers holding the operands when a plurality of operands are written, and that controls the operand writing device 6 and the register control circuit 303; 603 is a write data buffer that holds write data taken from the general-purpose register 302 or received from the second arithmetic and logic circuit 501; and 604 is a write address increment circuit that increases the contents of the write address register 601 by a fixed value. These are implemented in the operand writing device 6.

The flag generation circuit 306 receives information on flag generation from the first arithmetic and logic circuit 304 and the second arithmetic and logic circuit 501 and generates the flags.

The operation of the data processing device of this embodiment, configured as described above, is explained below.

FIG. 3 shows the relationship between instruction types and the flow through the pipeline stages. Here, IF is the instruction prefetch stage in the instruction prefetch device 1, DICG is the instruction decoding stage in the instruction decoding device 2, EX1 is the first arithmetic stage in the first arithmetic unit 3, OF is the operand read stage in the operand reading device 4, EX2 is the second arithmetic stage in the second arithmetic unit 5, and O8 is the operand write stage in the operand writing device 6. An instruction is prefetched by the instruction prefetch device 1 and decoded by the instruction decoding device 2. The subsequent flow of processing depends on the instruction type.

(a) is the case of a register-to-register operation instruction or a register-immediate operation instruction. Processing ends with the operation in the first arithmetic unit 3. That is, the first arithmetic unit 3 operates on data in the general-purpose register 302, or on an immediate value obtained from the arithmetic control circuit 301, in the first arithmetic and logic circuit 304, and stores the result in the general-purpose register 302.

(b) is the case of an instruction that performs a memory-register operation and stores the result in a register. Instructions that involve no operation (loads) are also included. Processing proceeds through the operand address calculation in the first arithmetic unit 3, the operand read in the operand reading device 4, and the operation in the second arithmetic unit 5, and then ends. That is, the first arithmetic unit 3 calculates the operand read address in the first arithmetic and logic circuit 304 and sends it to the read address register 401. The operand reading device 4 passes the read address held in the read address register 401 to the bus control device 8 and temporarily holds the read data received from the bus control device 8 in the read data buffer 403. The second arithmetic unit 5 operates, in the second arithmetic and logic circuit 501, on the read data held in the read data buffer 403 together with the contents of the general-purpose register 302, which is the other operand, and stores the result in the general-purpose register 302. When the instruction involves no operation, the second arithmetic and logic circuit 501 passes the contents of the read data buffer 403 through unchanged.

(c) is the case of an instruction that writes the contents of a register to memory (a store). Processing proceeds through the write address calculation in the first arithmetic unit 3 and the write to memory by the operand writing device 6, and then ends. That is, the first arithmetic unit 3 calculates the operand write address in the first arithmetic and logic circuit 304 and sends it to the write address register 601. The operand writing device 6 takes the write data from the general-purpose register 302, temporarily holds it in the write data buffer 603, and then sends it to the bus control device 8 together with the write address held in the write address register 601 to write to memory.

(d) is the case of an instruction that performs a register-memory operation and writes the result to memory. In this case the memory operand is read, updated, and then written. Processing proceeds through the operand address calculation in the first arithmetic unit 3, the operand read in the operand reading device 4, the operation in the second arithmetic unit 5, and the write of the result to memory by the operand writing device 6, and then ends. That is, the first arithmetic unit 3 calculates the operand read and write address in the first arithmetic and logic circuit 304 and sends it to the read address register 401 and the write address register 601. The operand reading device 4 passes the read address held in the read address register 401 to the bus control device 8 and temporarily holds the read data received from the bus control device 8 in the read data buffer 403. The second arithmetic unit 5 operates, in the second arithmetic and logic circuit 501, on the read data held in the read data buffer 403 together with the contents of the general-purpose register 302, which is the other operand, and stores the result in the write data buffer 603. The operand writing device 6 sends the write address held in the write address register 601 and the write data held in the write data buffer 603 to the bus control device 8 to write to memory.

(e) is the case of an instruction that performs operations between registers and a plurality of memory operands at consecutive addresses and stores the plurality of results in a plurality of registers. Instructions that involve no operation (multiple load) are also included.

Processing consists of the address calculation for the first memory operand in the first arithmetic unit 3, followed by repeated operand reads in the operand reading device 4 and operations in the second arithmetic unit 5. The read of the next operand in the operand reading device 4 and the operation in the second arithmetic unit 5 are, however, executed in parallel. That is, the first arithmetic unit 3 calculates the read address of the first operand in the first arithmetic and logic circuit 304 and sends it to the read address register 401, and the register list indicating the group of registers that are to receive the results is sent from the arithmetic control circuit 301 to the read control circuit 402.

At this time, the register control circuit 303 also prohibits reading of the group of registers in the general-purpose register 302 that are to receive the results. The operand reading device 4 passes the read address held in the read address register 401 to the bus control device 8 and, at the same time, increases it by a fixed value in the read address increment circuit 404 and holds it in the read address register 401 again. The read data received from the bus control device 8 is temporarily held in the read data buffer 403. The second arithmetic unit 5 operates, in the second arithmetic and logic circuit 501, on the read data held in the read data buffer 403 together with the contents of the general-purpose register 302, which is the other operand, and stores the result in the general-purpose register 302. When the instruction involves no operation, the second arithmetic and logic circuit 501 passes the contents of the read data buffer 403 through unchanged. At the same time, the read control circuit 402 notifies the register control circuit 303 that one of the results has been stored in the general-purpose register 302, and the register control circuit 303 then releases the read prohibition on that register. For the second and subsequent operands, the operand reading device 4 and the second arithmetic unit 5, independently of the first arithmetic unit 3, repeat the increment of the operand address, the read of the memory operand, the operation, the storing of the result, and the release of the read prohibition on the corresponding register. During this time, the operand reading device 4 and the second arithmetic unit 5 form a pipeline and execute in parallel.

(f) is the case of an instruction that writes the contents of a plurality of registers to memory at consecutive addresses (multiple store). Processing consists of the write address calculation for the first register in the first arithmetic unit 3, followed by repeated writes to memory by the operand writing device 6.

That is, the first arithmetic unit 3 calculates the write address of the first operand in the first arithmetic and logic circuit 304 and sends it to the write address register 601, and the register list indicating the group of registers to be written to memory is sent from the arithmetic control circuit 301 to the write control circuit 602. At this time, the register control circuit 303 also prohibits writing to the group of registers in the general-purpose register 302 that are to be written to memory. The operand writing device 6, under the control of the write control circuit 602, takes the write data from the general-purpose register 302, temporarily holds it in the write data buffer 603, and then sends it to the bus control device 8 together with the write address held in the write address register 601 to write to memory; at the same time, the write address is increased by a fixed value in the write address increment circuit 604 and held in the write address register 601 again. At the same time, the write control circuit 602 notifies the register control circuit 303 that one of the registers has been captured in the write data buffer 603, and the register control circuit 303 then releases the write prohibition on that register. For the second and subsequent operands, the operand writing device 6, independently of the first arithmetic unit 3, repeats the register read, the increment of the operand address, the write to memory, and the release of the write prohibition on the corresponding register.
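The multiple-load and multiple-store sequences described for cases (e) and (f) can be sketched as follows. This is only an illustrative software model, not the circuit itself: the function names, the dictionary-based memory and register file, and the `step` parameter (standing in for the unspecified "fixed value" of the increment circuits) are all assumptions introduced here.

```python
def multiple_load(memory, base, reg_list, regs, step=1):
    """Case (e) without an arithmetic operation (multiple load)."""
    read_prohibited = set(reg_list)    # register control circuit 303 blocks reads
    address = base                     # read address register 401
    trace = []
    for reg in reg_list:               # read control circuit 402 walks the register list
        data = memory[address]         # held in read data buffer 403
        address += step                # read address increment circuit 404
        regs[reg] = data               # passed through the second arithmetic and logic circuit 501
        read_prohibited.discard(reg)   # prohibition released for this register only
        trace.append((reg, sorted(read_prohibited)))
    return trace


def multiple_store(memory, base, reg_list, regs, step=1):
    """Case (f) (multiple store)."""
    write_prohibited = set(reg_list)   # register control circuit 303 blocks writes
    address = base                     # write address register 601
    trace = []
    for reg in reg_list:               # write control circuit 602 walks the register list
        buffered = regs[reg]           # captured in write data buffer 603
        write_prohibited.discard(reg)  # released once the value is buffered
        memory[address] = buffered     # sent to the bus control device 8
        address += step                # write address increment circuit 604
        trace.append((reg, sorted(write_prohibited)))
    return trace


memory = {0x100: 7, 0x101: 8, 0x102: 9}
regs = {}
load_trace = multiple_load(memory, 0x100, ["r1", "r2", "r3"], regs)
store_trace = multiple_store(memory, 0x200, ["r1", "r2"], regs)
```

Each trace entry pairs the register just handled with the registers still under prohibition, which makes the one-at-a-time release visible: a following instruction would only have to wait for the registers that remain in that set.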

As described above, in the data processing device of this embodiment, the first arithmetic unit 3 hands off the remaining processing, and the operand reading device 4 and the second arithmetic unit 5, or the operand writing device 6, execute the remaining processing of the instruction independently of the instruction prefetch device 1, the instruction decoding device 2, and the first arithmetic unit 3.
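The flows for instruction types (a) through (f) can be summarized as a small lookup table. The sketch below is an assumed representation for illustration only: the `FLOW` dictionary and the `handed_off` helper are hypothetical names, and the repetition of the read, operate, and write steps for the multiple-operand cases (e) and (f) is not modelled.

```python
FIRST = "first arithmetic unit 3"
READ = "operand reading device 4"
SECOND = "second arithmetic unit 5"
WRITE = "operand writing device 6"

# Units visited after prefetch and decode, keyed by instruction type.
FLOW = {
    "a": [FIRST],                       # register-register / register-immediate
    "b": [FIRST, READ, SECOND],         # memory-register operation or load
    "c": [FIRST, WRITE],                # store
    "d": [FIRST, READ, SECOND, WRITE],  # read, update, write
    "e": [FIRST, READ, SECOND],         # multiple load (read/operate repeated)
    "f": [FIRST, WRITE],                # multiple store (write repeated)
}


def handed_off(kind):
    """Units that finish the instruction independently of the first arithmetic unit."""
    return FLOW[kind][1:]
```

For type (a), `handed_off("a")` is empty, which corresponds to processing ending in the first arithmetic unit 3; for every other type the returned units are the ones that run while the first arithmetic unit 3 is free to take the next instruction.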

FIG. 4 is an operation timing chart.

It shows, clock by clock, the instructions being executed in the instruction prefetch device 1, the instruction decoding device 2, the first arithmetic unit 3, the operand reading device 4, and the second arithmetic unit 5, together with the changes in the contents of the status register 305. Here, the instruction prefetch device 1 has an internal cache memory, and the number of clocks required by each device is as follows: instruction prefetch device 1, 1 clock; instruction decoding device 2, 1 clock; first arithmetic unit 3, 1 clock; operand reading device 4, 3 clocks; second arithmetic unit 5, 1 clock. The instruction sequence being executed is the same as that shown in the operation timing chart of FIG. 13: a memory-register operation instruction that stores its result in a register is followed by two register-to-register operation instructions, and these three instructions repeat.

Specifically, instructions 1 and 4 are memory-register operation instructions that store their results in registers, and instructions 2, 3, 5, and 6 are register-to-register operation instructions. The pipeline is initially empty (for example, at the time of a conditional branch). For instruction 1, the instruction code is prefetched by the instruction prefetch device 1 at clock t1 and issued to the instruction decoding device 2. At clock t2 the instruction decoding device 2 decodes the instruction and issues to the first arithmetic unit 3 the control information for operand address calculation and reading and the control information for instruction execution. According to the control information for operand address calculation and reading, the operand address is calculated in the first arithmetic unit 3 at clock t3, and the operand is read by the operand reading device 4 at clocks t4 to t6. The read data and the control information for instruction execution are sent to the second arithmetic unit 5, and the operation is performed in the second arithmetic unit 5 at clock t7. For instruction 2, the instruction code is prefetched by the instruction prefetch device 1 at clock t2, the instruction is decoded by the instruction decoding device 2 at clock t3, and only the control information for instruction execution is issued to the first arithmetic unit 3.

Instruction 2 is operated on in the first arithmetic unit 3 at clock t4. Similarly, instruction 3 is operated on in the first arithmetic unit 3 at clock t5.
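The clock assignments walked through above can be reproduced with a small scheduling sketch. It is a simplified model under stated assumptions, not a description of the actual control logic: prefetch, decode, and the first arithmetic unit are assumed never to stall, register dependencies and bus contention are ignored, and the names `schedule` and `OF_CLOCKS` and the stage keys are hypothetical.

```python
OF_CLOCKS = 3  # clocks needed by the operand reading device 4 in FIG. 4


def schedule(kinds):
    """kinds: "mem" for a memory-register operation, "reg" for an inter-register one.

    Returns one dict per instruction mapping a stage name to its clock
    (or to a (first, last) clock pair for the operand read)."""
    timeline = []
    of_free = 1  # first clock at which the operand reading device is free
    for i, kind in enumerate(kinds, start=1):
        entry = {"IF": i, "decode": i + 1, "EX1": i + 2}
        if kind == "mem":
            start = max(entry["EX1"] + 1, of_free)
            entry["OF"] = (start, start + OF_CLOCKS - 1)
            entry["EX2"] = start + OF_CLOCKS
            of_free = start + OF_CLOCKS
        timeline.append(entry)
    return timeline


# The six-instruction sequence of FIG. 4: instructions 1 and 4 access memory.
timeline = schedule(["mem", "reg", "reg", "mem", "reg", "reg"])
```

In this model instruction 1 occupies the operand reading device at clocks t4 to t6 and the second arithmetic unit at t7, while instruction 5 reaches the first arithmetic unit at that same clock t7, which is the overlap the timing chart is meant to show.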

Thus, instructions 2 and 3 are operated on ahead of instruction 1, and instruction 5 is operated on in parallel with instruction 1. At clock t4 the flag generation circuit 306 generates the flags of instruction 2 from the information obtained from the first arithmetic and logic circuit 304 and reflects them in the status register 305 at clock t5. It remembers which flags were updated at this time. At clock t5 it generates the flags of instruction 3 from the information obtained from the first arithmetic and logic circuit 304 and reflects them in the status register 305 at clock t6, again remembering which flags were updated. Next, at clock t7, the flag generation circuit 306 generates the flags of instruction 5 from the information obtained from the first arithmetic and logic circuit 304 and the flags of instruction 1 from the information obtained from the second arithmetic and logic circuit 501. At this time, however, it does not generate those flags of instruction 1 that coincide with flags already updated at clocks t5 and t6, since only flag information that has not been changed by a subsequent instruction is validated. If a flag updated by instruction 5 is also updated by instruction 1, the flag of instruction 5 takes priority, even if it is the same as a flag updated at clocks t5 and t6. The generated flags are reflected in the status register 305 at clock t8.

As described above, according to this embodiment, the order in which instructions are operated on does not necessarily match the order of the program. However, because the flag generation circuit 306 arbitrates among the flags it generates, the flags reflected in the status register 305 are consistent with those that would result if the instructions were operated on in program order.
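The arbitration rule can be illustrated with a short sketch. It is an assumed software analogue of the behaviour described above, not the circuit: the function name `merge_flags`, its parameters, and the flag names Z, C, V, and N are hypothetical, since the text does not name the individual flags.

```python
def merge_flags(status, early_flags, later_updates, same_clock_flags=None):
    """Combine flags so the result matches in-order execution.

    status:           current status register contents (flag name -> bit)
    early_flags:      flags of the earlier instruction that finishes late
                      (from the second arithmetic and logic circuit 501)
    later_updates:    names of flags already updated by instructions that
                      follow it in program order (remembered by circuit 306)
    same_clock_flags: flags produced in the same clock by a following
                      instruction in the first arithmetic and logic circuit 304
    """
    same_clock_flags = same_clock_flags or {}
    result = dict(status)
    for name, bit in early_flags.items():
        # A flag already rewritten by a later instruction must not be overwritten.
        if name not in later_updates and name not in same_clock_flags:
            result[name] = bit
    # The following instruction is later in program order, so it takes priority.
    result.update(same_clock_flags)
    return result


status = {"Z": 0, "C": 1, "V": 0, "N": 0}     # after instructions 2 and 3
merged = merge_flags(
    status,
    early_flags={"Z": 1, "C": 0, "V": 1, "N": 1},  # instruction 1, finishing at t7
    later_updates={"Z", "C"},                      # touched by instructions 2 and 3
    same_clock_flags={"N": 0},                     # instruction 5, also at t7
)
```

Here only V is taken from instruction 1: Z and C were already rewritten by the later instructions 2 and 3, and N is claimed by instruction 5 in the same clock.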

FIG. 5 is an operation timing chart showing that this embodiment solves the first problem. The number of clocks required by each device is the same as in FIG. 4, and writing to memory by the operand writing device 6 is taken to require 3 clocks. The instruction sequence being executed is the same as that shown in the operation timing chart of FIG. 14: a memory-register operation instruction that stores its result in a register (instruction 1) is followed by an instruction that transfers a register to memory (a store, instruction 2) and then by a register-to-register operation instruction (instruction 3).

For instruction 1, the address is calculated at clock t3, the operand is fetched at clocks t4 to t6, and the operation is performed at clock t7. For instruction 2, however, once the address has been calculated in the first arithmetic unit 3 at clock t4, the remaining processing is handed off, and the transfer of the register to memory is performed by the operand writing device 6 independently of the first arithmetic unit 3. That is, the operand writing device 6 takes the write data from the general-purpose register 302 into the write data buffer 603 at clock t6, and at clocks t7 to t9 sends the write address held in the write address register 601 and the write data held in the write data buffer 603 to the bus control device 8 to write to memory. Instruction 3 is operated on at clock t6, in parallel with the memory read of instruction 1 and the memory write of instruction 2.

As described above, according to this embodiment, each of the following pairs can be executed in parallel, which shortens the execution time of the instruction sequence: an instruction that requires an operand fetch followed by an instruction that requires neither an operand fetch nor a write to memory; an instruction that requires a write to memory followed by an instruction that requires neither an operand fetch nor a write to memory; an instruction that requires an operand fetch followed by an instruction that requires a write to memory; and an instruction that requires a write to memory followed by an instruction that requires an operand fetch.

FIG. 6 is an operation timing chart showing that this embodiment solves the second problem. The number of clocks required by each device is the same as in FIG. 4. The instruction sequence being executed is the same as that shown in the operation timing chart of FIG. 15: two register-to-register operation instructions (instructions 1 and 2) are followed by one memory-register operation instruction that stores its result in a register (instruction 3), and then by two more register-to-register operation instructions (instructions 4 and 5). Instruction 3 performs its address calculation at clock t5, its operand is read from memory at clocks t6 to t8, and the operation is performed at clock t9. However, because the operand reading device 4 and the second arithmetic unit 5 operate independently of the first arithmetic unit 3, instructions 4 and 5 can be operated on in the first arithmetic unit 3 at clocks t6 and t7, respectively, in parallel with the memory read of instruction 3, and no pipeline stage is left idle.

As described above, according to this embodiment, even when an instruction that requires an operand fetch is followed by instructions that do not, no idle slot arises in the arithmetic stage, and the number of execution clocks of the first operand-fetch instruction to appear after the pipeline is empty can be made one clock.

FIG. 7 is an operation timing chart showing that this embodiment solves the third problem. The number of clocks required by each device is the same as in FIG. 4. The instruction sequence being executed is the same as that shown in the operation timing chart of FIG. 16: one memory-register operation instruction that stores its result in a register (instruction 1) and four register-to-register operation instructions (instructions 2 to 5) are followed by a conditional branch instruction (instruction 6); the branch of instruction 6 is taken, and control transfers to a register-to-register operation instruction (instruction n). Instruction 1 requires an operand address calculation and an operand read from memory, but because the operand reading device 4 and the second arithmetic unit 5 operate independently of the first arithmetic unit 3, the following instructions 2 to 6 can be operated on in the first arithmetic unit 3 at clocks t4 to t8, in parallel with the memory read of instruction 1.

Therefore, the instruction prefetch stage of the branch target instruction n is brought forward to clock t9, and in the instruction prefetch stage the pipeline incurs an interlock of only two clocks, t7 and t8.
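The two-clock figure follows from simple arithmetic on the stage positions, sketched below. The model assumes one-clock prefetch, decode, and first-arithmetic stages with no stalls, and assumes that the branch is resolved in the first arithmetic unit with the target prefetched on the next clock; the function name `branch_bubble` is hypothetical.

```python
def branch_bubble(branch_index):
    """branch_index: 1-based position of the taken conditional branch.

    Returns the prefetch clock of the branch target and the list of
    prefetch-stage clocks lost to the interlock."""
    branch_if = branch_index       # clock at which the branch is prefetched
    branch_ex1 = branch_index + 2  # clock at which it is resolved
    target_if = branch_ex1 + 1     # target is prefetched on the next clock
    idle = list(range(branch_if + 1, target_if))
    return target_if, idle


target_if, idle = branch_bubble(6)  # (9, [7, 8])
```

With the branch as instruction 6, the target is prefetched at clock t9 and the prefetch stage is idle at t7 and t8. The preceding memory-register instruction does not enter into the calculation, which is the point of this timing chart.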

As described above, according to this embodiment, even when an instruction requiring an operand fetch is followed by a conditional branch instruction, the pipeline interlock time is the same as when no instruction requiring an operand fetch is present, so the loss of pipeline efficiency is minimized.

FIG. 8 is an operation timing diagram showing how this embodiment solves the fourth problem. The number of clocks required by each device is the same as in FIG. 4. The instruction sequence is the same as that shown in the operation timing diagram of FIG. 17: a memory-register operation instruction that stores its result in a register (instruction 1) is followed by two inter-register operation instructions (instructions 2 and 3) and then by another memory-register operation instruction that stores its result in a register (instruction 4), and the result of instruction 3 is used in the address calculation of instruction 4. Because both address calculation and inter-register operations are performed in the first arithmetic unit 3, the address calculation of instruction 4 can be performed at clock t6, the clock immediately after the operation of instruction 3 completes, and no pipeline interlock due to address calculation interference occurs.
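The reason no interlock arises can be sketched with a generic hazard model in Python. This is an illustrative assumption, not the patent's own formulation: stages are numbered one per clock, the function name and the stage numbers used for the comparison case are hypothetical, and the comparison case is simply a pipeline whose address stage precedes its operation stage.

```python
def address_interlock_clocks(result_stage, address_stage, issue_gap=1):
    """Stall clocks for a consumer issued `issue_gap` clocks after a producer.

    The producer's result is ready at the end of pipeline stage `result_stage`;
    the consumer needs it at the start of its own stage `address_stage`.
    """
    return max(0, result_stage - address_stage - issue_gap + 1)


# Register operation and address calculation share one stage (same unit).
same_unit = address_interlock_clocks(result_stage=3, address_stage=3)

# Hypothetical comparison: address calculated two stages before the operation stage.
earlier_address_stage = address_interlock_clocks(result_stage=4, address_stage=2)

print(same_unit)              # 0
print(earlier_address_stage)  # 2
```

When the producing operation and the consuming address calculation occupy the same stage, the consumer, issued one clock later, always finds the result ready, which matches the behavior described for instructions 3 and 4.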

As described above, according to this embodiment, even when a register rewritten by the operation of an instruction that does not require an operand fetch is read in the address calculation of a subsequent instruction that does require an operand fetch, no interlock occurs, and pipeline efficiency is therefore not reduced.

FIG. 9 is an operation timing diagram showing how this embodiment solves the fifth and sixth problems. The number of clocks required by each device is the same as in FIG. 4.

The instruction sequence is the same as that shown in the operation timing diagram of FIG. 18: an instruction that transfers three memory operands at consecutive addresses to three registers (a multiple load; instruction 1) is followed by two inter-register operation instructions (instructions 2 and 3), neither of which needs the execution result of instruction 1. Instruction 1 reads the first operand at clocks t4 to t6 and stores it in register R1 at clock t7, reads the second operand at clocks t7 to t9 and stores it in register R2 at clock t10, and reads the third operand at clocks t10 to t12 and stores it in register R3 at clock t13. However, after the first arithmetic unit 3 calculates the address of the first operand at clock t3, the operand reading device 4 and the second arithmetic unit 5 operate independently of the first arithmetic unit 3, so the subsequent instructions 2 and 3 can be processed in the first arithmetic unit 3 at clocks t4 and t6, in parallel with the memory reads and register stores of instruction 1.

Therefore, the first arithmetic unit 3 does not incur a long idle period while the memory reads take place.
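The read and store clocks quoted for the multiple load can be reproduced with a short Python sketch. It is illustrative only: the function and parameter names are hypothetical, and it assumes that each read takes three clocks, that the register store happens in the clock after a read completes, and that the next read starts in that same clock, as the clock values in the text suggest.

```python
def multiple_load_schedule(num_operands, first_read_clock=4, read_clocks=3):
    """Return a list of (read_start, read_end, store_clock), one tuple per operand."""
    schedule = []
    start = first_read_clock
    for _ in range(num_operands):
        end = start + read_clocks - 1
        store = end + 1      # register store in the clock after the read completes
        schedule.append((start, end, store))
        start = store        # the next read overlaps this register store
    return schedule


load_schedule = multiple_load_schedule(3)
print(load_schedule)  # [(4, 6, 7), (7, 9, 10), (10, 12, 13)]
```

The three tuples correspond to registers R1, R2, and R3. Throughout clocks t4 to t13 the first arithmetic unit is not involved, which is why the following inter-register instructions can proceed in it.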

The same applies to an instruction that writes the contents of multiple registers to memory at consecutive addresses (a multiple store). After the first arithmetic unit 3 calculates the address of the first operand, the operand writing device 6 operates independently of the first arithmetic unit 3, so subsequent instructions can be processed in the first arithmetic unit 3 in parallel with the memory writes of the multiple store instruction. Therefore, the first arithmetic unit 3 does not incur a long idle period while the memory writes take place.

As described above, according to this embodiment, even when an instruction that requires multiple operand fetches or that writes multiple operands to memory is executed, if the instruction that follows requires no operand fetch and does not need the execution result of that instruction, its operation can be performed in parallel with, or ahead of, the operation of the multi-operand instruction, so that pipeline efficiency is not reduced.

Further, according to this embodiment, software that optimizes programs so that the instruction following an instruction requiring multiple operand fetches, or writing multiple operands to memory, does not need that instruction's execution result can deliver its full benefit.

Next, how this embodiment solves the seventh problem is explained. In the data processing device of this embodiment configured as described above, after the first arithmetic unit 3 calculates the operand address, the subsequent processing is detached, and the operand reading device 4, the second arithmetic unit 5, and the operand writing device 6 operate independently of the first arithmetic unit 3, so subsequent instructions can be processed in the first arithmetic unit 3 in parallel with memory reads and writes. Consequently, even if the number of clocks required by the operand reading device 4 or the operand writing device 6 increases, the operation stage of an instruction following an instruction that involves a memory read or write is not delayed. Likewise, even if a data cache memory is built in and the number of clocks required by the operand reading device 4 or the operand writing device 6 decreases, the temporal position of the operation stage of an instruction following an instruction that involves a memory read or write does not change.
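The independence from memory latency can be illustrated with a small Python sketch. Both functions are hypothetical models rather than anything stated in the patent: the decoupled one follows the behavior described above, and the coupled one is an assumed comparison in which a single execution resource waits for the memory access to finish.

```python
def next_op_clock_decoupled(addr_calc_clock, memory_clocks):
    """Clock of the following instruction's operation stage in the decoupled design.

    `memory_clocks` is deliberately unused: the memory access runs in a separate device.
    """
    return addr_calc_clock + 1


def next_op_clock_coupled(addr_calc_clock, memory_clocks):
    """Assumed comparison case: the following operation waits behind the memory access."""
    return addr_calc_clock + memory_clocks + 1


for memory_clocks in (1, 3, 10):  # e.g. cache hit, plain memory, slow memory (illustrative values)
    print(memory_clocks,
          next_op_clock_decoupled(3, memory_clocks),
          next_op_clock_coupled(3, memory_clocks))
```

In the decoupled model the following instruction is always processed at clock 4, whatever the memory latency, whereas the comparison model shifts with it.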

As described above, according to this embodiment, the first through sixth problems can be solved regardless of any increase or decrease in the number of clocks required for reading or writing operands caused by the presence or absence of a built-in cache memory, the cache hit rate, the cache capacity, and so on. In addition, the number of clocks required for reading and writing operands can be set independently of the number of execution clocks of inter-register operations.

Finally, how this embodiment solves the eighth problem is explained. The data processing device of this embodiment configured as described above does not prefetch memory operands. The instruction decoding device 2 therefore does not need to issue control information for address calculation ahead of control information for instruction execution, and can simply issue all control information to the first arithmetic unit 3 at the same time.

Therefore, control of the instruction decoding device 2 is simplified. In addition, no complicated control is needed for synchronizing prefetched data and write addresses.

As described above, according to this embodiment, simplifying the control of each device shortens the delay time of the control circuits, so the clock frequency at which the processor operates can readily be raised.

Note that in this embodiment, when executing an instruction that reads one or more operands from memory and stores them in registers without operating on them (a load or multiple load), the contents of the read data buffer 403 are passed unchanged through the second arithmetic logic circuit 501 and stored in the general-purpose register 302. Alternatively, a data line leading directly from the read data buffer 403 to the general-purpose register 302 may be provided and the data stored through it. With this arrangement, load and multiple load instructions, which occur frequently in programs, no longer need the second operation stage; pipeline interlocks that arise when a subsequent instruction uses the result of such an instruction are reduced, and pipeline efficiency improves. However, if a load or multiple load instruction changes the flags, information for flag generation must be sent from the read data buffer 403 to the flag generation circuit 306.

In addition, this embodiment assumes real memory and does not consider an address translation mechanism; to support virtual memory, an address translation mechanism may be incorporated into the instruction prefetch device 1 and the operand reading device 4, or into the bus control device 8.

In addition, although the operand reading device 4 and the second arithmetic unit 5 are separate in this embodiment, they may be implemented as a single device serving as the operand reading device.

In addition, although each device has been described as a single pipeline stage in this embodiment, each may be implemented as a device having multiple pipeline stages.

As described above, according to the first and second inventions, the following pairs of instructions can each be executed in parallel, shortening the execution time of the instruction sequence: an instruction requiring an operand fetch and a subsequent instruction requiring neither an operand fetch nor a memory write; an instruction requiring a memory write and a subsequent instruction requiring neither an operand fetch nor a memory write; an instruction requiring an operand fetch and a subsequent instruction requiring a memory write; and an instruction requiring a memory write and a subsequent instruction requiring an operand fetch.

In addition, even when an instruction requiring an operand fetch is followed by an instruction that does not, no vacancy arises in the operation stage, and the first instruction requiring an operand fetch that emerges from an empty pipeline can be executed in one clock.

In addition, even when an instruction requiring an operand fetch is followed by a conditional branch instruction, the pipeline interlock time is the same as when no instruction requiring an operand fetch is present, so the loss of pipeline efficiency is minimized.

In addition, even when a register rewritten by the operation of an instruction that does not require an operand fetch is read in the address calculation of a subsequent instruction that requires an operand fetch or a memory write, no interlock occurs and pipeline efficiency is not reduced.

In addition, as described above, according to the third and fourth inventions, even when an instruction that requires multiple operand fetches or that writes multiple operands to memory is executed, if the instruction that follows requires no operand fetch and does not need the execution result of that instruction, its operation can be performed in parallel with, or ahead of, the operation of the multi-operand instruction, preventing a loss of pipeline efficiency. Further, according to the present invention, software that optimizes programs so that the instruction following an instruction requiring multiple operand fetches, or writing multiple operands to memory, does not need that instruction's execution result can deliver its full benefit.

In addition, according to the first, second, third, and fourth inventions, the above effects are maintained however the number of clocks required to read one operand from memory, or to write an operand to memory, increases or decreases with the presence or absence of a built-in cache memory, the cache hit rate, the cache capacity, and so on. Also, the number of clocks required for reading and writing operands can be set independently of the number of execution clocks of inter-register operations.

In addition, according to the first and second inventions, simplifying the control of each device shortens the delay time of the control circuits, so the clock frequency at which the processor operates can readily be raised.

In addition, according to the fifth and sixth inventions, even if instructions are executed in an order different from their order in the program, flags consistent with execution in program order can be reflected in the status register of the processor.
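The flag handling summarized above can be sketched in Python. This is only an illustration of the rule that flag information from a late-finishing memory instruction is validated only where a later instruction has not already changed the flag; the flag names N, Z, V, and C, the function names, and the dictionary representation are all assumptions, not details from the patent.

```python
def merge_late_flags(status, late_flags, changed_by_later):
    """Apply flags from an earlier instruction that finishes after a later one.

    status: status register contents, already updated by the later instruction.
    late_flags: flags generated by the earlier (memory) instruction.
    changed_by_later: names of flags the later instruction has rewritten.
    """
    merged = dict(status)
    for name, value in late_flags.items():
        if name not in changed_by_later:   # validate only flags left untouched
            merged[name] = value
    return merged


def program_order_flags(initial, earlier_flags, later_flags):
    """Reference result: both instructions update the flags in program order."""
    result = dict(initial)
    result.update(earlier_flags)
    result.update(later_flags)
    return result


initial = {"N": 0, "Z": 0, "V": 0, "C": 0}
earlier = {"N": 1, "Z": 0, "V": 0, "C": 1}   # memory instruction, finishes late
later = {"N": 0, "Z": 1}                      # register instruction, finishes first

status_after_later = {**initial, **later}
out_of_order = merge_late_flags(status_after_later, earlier, set(later))
in_order = program_order_flags(initial, earlier, later)
print(out_of_order == in_order)  # True
```

In this example the merged status register matches what in-order execution would have produced, which is the consistency property the paragraph describes.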

As shown above, the practical effects of the present invention are very significant.

[Brief Description of the Drawings]

FIG. 1 is a configuration diagram of a data processing device according to an embodiment of the present invention; FIG. 2 is a detailed configuration diagram of the first arithmetic unit 3, operand reading device 4, second arithmetic unit 5, and operand writing device 6 of the embodiment; FIG. 3 is a diagram of the relationship between instruction types and pipeline stages in the embodiment; FIG. 4 is an operation timing diagram of the embodiment; FIG. 5 is an operation timing diagram of the embodiment showing how the first problem is solved; FIG. 6 is an operation timing diagram of the embodiment showing how the second problem is solved; FIG. 7 is an operation timing diagram of the embodiment showing how the third problem is solved; FIG. 8 is an operation timing diagram of the embodiment showing how the fourth problem is solved; FIG. 9 is an operation timing diagram of the embodiment showing how the fifth and sixth problems are solved; FIG. 10 is a configuration diagram of a first conventional data processing device; FIG. 11 is an operation timing diagram of the first conventional example; FIG. 12 is a configuration diagram of a second conventional data processing device; FIG. 13 is an operation timing diagram of the second conventional example; FIG. 14 is an operation timing diagram of the second conventional example illustrating the first problem; FIG. 15 is an operation timing diagram of the second conventional example illustrating the second problem; FIG. 16 is an operation timing diagram of the second conventional example illustrating the third problem; FIG. 17 is an operation timing diagram of the second conventional example illustrating the fourth problem; and FIG. 18 is an operation timing diagram of the second conventional example illustrating the fifth and sixth problems.

1: instruction prefetch device; 2: instruction decoding device; 3: first arithmetic unit; 4: operand reading device; 5: second arithmetic unit; 6: operand writing device; 7: input/output bus; 8: bus control device; 301: operation control circuit; 302: general-purpose register; 303: register control circuit; 304: first arithmetic logic circuit; 305: status register; 306: flag generation circuit; 401: read address register; 402: read control circuit; 403: read data buffer; 404: read address increment circuit; 501: second arithmetic logic circuit; 601: write address register; 602: write control circuit; 603: write data buffer; 604: write address increment circuit.

Agent: Patent attorney Shigetaka Awano and one other

Claims

[Scope of Claims]
(1) A data processing device comprising: instruction prefetching means for prefetching instructions; instruction decoding means for decoding instructions; arithmetic means, comprising at least a general-purpose register and an arithmetic unit, which, according to the decoding result, performs the arithmetic operation specified by the instruction when the instruction format requires neither reading from nor writing to memory, and calculates the address of the operand to be read or written when the instruction format requires reading from or writing to memory; operand reading means, provided exclusively for instructions whose format requires reading from memory, for reading an operand from memory according to the address obtained by the arithmetic means; read operand transfer means for storing the data obtained by the operand reading means in a general-purpose register in the arithmetic means; operand writing means, provided exclusively for instructions whose format requires writing to memory, for writing an operand into memory according to the address obtained by the arithmetic means; and write operand transfer means for retrieving the data to be written into memory by the operand writing means from a general-purpose register in the arithmetic means; wherein, during the execution of an instruction that requires reading from memory, the arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; while the operand writing means is executing an instruction that requires writing to memory, the arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; and the operand reading means also executes an instruction that requires reading from memory.
(2) A data processing device comprising: instruction prefetching means for prefetching instructions; instruction decoding means for decoding instructions; first arithmetic means, comprising at least a general-purpose register and an arithmetic unit, which, according to the decoding result, performs the arithmetic operation specified by the instruction when the instruction format requires neither reading from nor writing to memory, and calculates the address of the operand to be read or written when the instruction format requires reading from or writing to memory; operand reading means, provided exclusively for instructions whose format requires reading from memory, for reading an operand from memory according to the address obtained by the first arithmetic means; second arithmetic means for performing an arithmetic operation on the operand read from memory by the operand reading means; read operand transfer means for storing the data obtained by the second arithmetic means in a general-purpose register in the first arithmetic means; operand writing means, provided exclusively for instructions whose format requires writing to memory, for writing an operand into memory according to the address obtained by the first arithmetic means; and write operand transfer means for retrieving the data to be written into memory by the operand writing means from a general-purpose register in the first arithmetic means; wherein, while the operand reading means and the second arithmetic means are executing an instruction that requires reading from memory, the first arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; while the operand writing means is executing an instruction that requires writing to memory, the first arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; and, further, while the operand reading means and the second arithmetic means are executing an instruction that requires reading from memory, the operand writing means executes in parallel a subsequent or preceding instruction that requires writing to memory.
(3) A data processing device comprising: instruction prefetching means for prefetching instructions; instruction decoding means for decoding instructions; arithmetic means, comprising at least a general-purpose register and an arithmetic unit, which, according to the decoding result, performs the arithmetic operation specified by the instruction when the instruction format requires neither reading from nor writing to memory, and, when the instruction format requires reading from or writing to memory, calculates the address of the operand to be read or written and issues information specifying the general-purpose register that is to store the single or multiple operands read from memory, or the general-purpose register that holds the single or multiple operands to be written to memory; operand reading means, provided exclusively for instructions whose format requires reading from memory, for reading single or multiple operands from memory according to the address obtained by the arithmetic means; read operand transfer means for receiving from the arithmetic means the information specifying the general-purpose register that is to store the single or multiple operands read from memory, and storing the single or multiple operands obtained by the operand reading means in that general-purpose register in the arithmetic means; operand writing means, provided exclusively for instructions whose format requires writing to memory, for writing single or multiple operands into memory according to the address obtained by the arithmetic means; and write operand transfer means for receiving from the arithmetic means the information specifying the general-purpose register that holds the single or multiple operands to be written to memory, and retrieving the single or multiple operands to be written into memory by the operand writing means from that general-purpose register in the arithmetic means; wherein, while the operand reading means is executing an instruction that requires reading single or multiple operands from memory, the arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; while the operand writing means is executing an instruction that requires writing single or multiple operands to memory, the arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; and, further, while the operand reading means is executing an instruction that requires reading single or multiple operands from memory, the operand writing means executes in parallel a subsequent or preceding instruction that requires writing single or multiple operands to memory.
(4) A data processing device comprising: instruction prefetching means for prefetching instructions; instruction decoding means for decoding instructions; first arithmetic means, comprising at least a general-purpose register and an arithmetic unit, which, according to the decoding result, performs the arithmetic operation specified by the instruction when the instruction format requires neither reading from nor writing to memory, and, when the instruction format requires reading from or writing to memory, calculates the address of the operand to be read or written and issues information specifying the general-purpose register that is to store the single or multiple operands read from memory, or the general-purpose register that holds the single or multiple operands to be written to memory; operand reading means, provided exclusively for instructions whose format requires reading from memory, for reading single or multiple operands from memory according to the address obtained by the first arithmetic means; second arithmetic means for performing an arithmetic operation on the single or multiple operands; read operand transfer means for receiving from the first arithmetic means the information specifying the general-purpose register that is to store the single or multiple operands read from memory, and storing the obtained single or multiple operands in that general-purpose register in the first arithmetic means; operand writing means for writing single or multiple operands into memory according to the address obtained by the first arithmetic means; and write operand transfer means for receiving from the first arithmetic means the information specifying the general-purpose register that holds the single or multiple operands to be written to memory, and retrieving the single or multiple operands to be written into memory by the operand writing means from that general-purpose register in the first arithmetic means; wherein, while the operand reading means and the second arithmetic means are executing an instruction that requires reading single or multiple operands from memory, the first arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; while the operand writing means is executing an instruction that requires writing single or multiple operands to memory, the first arithmetic means executes in parallel a subsequent instruction that requires neither reading from nor writing to memory; and, further, the operand writing means executes in parallel a subsequent or preceding instruction that requires writing single or multiple operands to memory.
(5) The data processing device according to claim 1, further comprising flag generating means which, when the arithmetic means is executing an instruction subsequent to the instruction being executed by the operand reading means, uses the flag generation information for a plurality of execution results issued by the arithmetic means so that, when the operand reading means finishes executing its instruction, only that part of the flag generation information issued by the operand reading means which has not been changed by the arithmetic means' execution of the subsequent instruction is validated, and which reflects the execution result in a status register that stores a plurality of execution result flags.
(6) The data processing device according to claim 2, further comprising flag generating means which, when the first arithmetic means is executing an instruction subsequent to the instruction being executed by the second arithmetic means, uses the flag generation information for a plurality of execution results issued by the first arithmetic means so that, when the second arithmetic means finishes executing its instruction, only that part of the flag generation information issued by the second arithmetic means which has not been changed by the first arithmetic means' execution of the subsequent instruction is validated, and which reflects the execution result in a status register that stores a plurality of execution result flags.