JPH0991138A - Operation processing method

Info

Publication number: JPH0991138A
Application number: JP25059695A
Authority: JP
Inventors: Yoshihide Yabuki; 喜秀矢吹
Original assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Current assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Priority date: 1995-09-28
Filing date: 1995-09-28
Publication date: 1997-04-04

Abstract

PROBLEM TO BE SOLVED: To shorten instruction processing time by detecting a pair of instructions to be simultaneously executed by a single computing element and simultaneously executing this instruction pair by the single computing element. SOLUTION: A detection circuit 1 detects the instruction pair, which has data dependency relation and can be simultaneously executed by the single computing element, or the instruction pair which has no data dependency relation and can be simultaneously executed by the single computing element. When the detection circuit 1 provided inside an arithmetic unit detects the instruction pair which can be simultaneously executed by the single computing element, this detected result is reported to a control circuit 2 for controlling a computing element 13. The control circuit 2 controls operand data corresponding to the instruction pair to be simultaneously executed so as to be supplied to the computing element 13 and generates a control signal for controlling the operation of the computing element 13 as well. The computing element 13 simultaneously executes the plural instructions based on the supplied operand data and control signal.

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to an arithmetic processing method suited to a pipelined data processing apparatus capable of executing a plurality of instructions simultaneously.

[0002]

[Prior art] In a pipelined data processing apparatus that can execute a plurality of instructions in parallel, the "data dependency hazard" is a well-known factor that disturbs the smooth flow of the arithmetic pipeline. It is the pipeline hazard that arises when a second instruction needs the result of a first instruction as an operand.

[0003] When data dependency hazards occur frequently, the hardware resources of a pipelined processor that can execute a plurality of instructions in parallel are not used effectively. Various techniques for eliminating the data dependency hazard have therefore been proposed.

[0004] For example, there is a parallel execution support device for multiple scalar instructions that removes the data dependency hazard between a pair of instructions by turning the pair into a "compound instruction" and processing it in a single execution cycle (see Japanese Patent Laid-Open No. 5-73309).

[0005]

[Problems to be solved by the invention] The prior art described above resolves the data dependency hazard by turning a data-dependent instruction pair into a compound instruction and executing that pair in a dedicated "data dependency elimination ALU". This approach, however, substantially increases the amount of hardware, and it does not eliminate the data dependency for every combination of instructions.

[0006] An object of the present invention is to provide an arithmetic processing method that shortens instruction processing time, without requiring a large amount of hardware, by detecting an instruction pair that a single computing element can execute simultaneously and executing that pair simultaneously.

[0007]

[Means for solving the problems] To achieve this object, the present invention provides a pipelined arithmetic processing method that executes a plurality of instructions in parallel in a data processing apparatus comprising a storage unit that stores instructions and data, an instruction unit that fetches a plurality of instructions from the storage unit and decodes them in parallel, and an arithmetic unit that receives instruction codes and operands and executes operations. The method is characterized in that, when an instruction pair that has a data dependency and can be executed simultaneously by a single computing element is detected among the plurality of instructions, the pair is executed simultaneously by the single computing element.

[0008] The method is further characterized in that, when an instruction pair that has no data dependency and can be executed simultaneously by a single computing element is detected among the plurality of instructions, the pair is executed simultaneously by the single computing element.

[0009] With this arrangement, when the detection means provided in the arithmetic unit detects an instruction pair that a single computing element can execute simultaneously, it reports the detection result to the control means that controls the computing element. The control means arranges for the operand data of the instruction pair to be supplied to the computing element and also generates the control signals that govern the operation of the computing element. The computing element performs its operations on the basis of the supplied operand data and control signals.

[0010]

[Embodiments of the invention] An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 2 shows the configuration of a pipelined data processing apparatus that can execute a plurality of instructions in parallel. In the figure, 21 is an instruction unit that decodes a plurality of instructions in parallel, 22 is a storage unit that stores instructions and data, and 23 is an arithmetic unit that executes operations (floating-point operations). The instruction unit 21 comprises an instruction buffer 24 that holds instructions read from the storage unit 22, a first instruction register 25 that holds the preceding instruction extracted from the instruction buffer 24, a second instruction register 26 that holds the subsequent instruction extracted from the instruction buffer 24, first and second decoders 27 and 28 that decode the instructions in the first and second instruction registers simultaneously, and other components. The arithmetic unit 23 receives the instruction codes, memory operands, and so on to be executed in parallel from the instruction unit 21 and the storage unit 22, and executes the operations.

[0011] FIG. 1 shows the configuration of the floating-point arithmetic unit according to the embodiment. Reference numeral 1 denotes a detection circuit connected to the decoders 27 and 28 of the instruction unit 21. The detection circuit 1 detects either an instruction pair that has a data dependency and can be executed simultaneously by a single computing element, or an instruction pair that has no data dependency and can be executed simultaneously by a single computing element. Reference numeral 2 denotes a control circuit that controls the computing element as a whole according to the detection result from the detection circuit 1; 3, 4, 5, 6, and 7 are selectors; 8 and 9 are work registers; 10 is a register that holds an intermediate result; 11 is a register that holds store data; 12 is a register that holds the operation result; and 13 is the single computing element.

[0012] The register group 14 generally consists of two groups, general-purpose registers and floating-point registers; the floating-point registers are used for floating-point operations and the general-purpose registers for all other operations. For simplicity, FIG. 1 shows a single register group. Reference numerals 101, 102, 103, and 104 denote operand buses, and 105 denotes a data bus.

[0013] The computing element 13 of this embodiment has no special structure for executing a plurality of instructions simultaneously. The feature of the present invention, and its difference from the prior art, is that a single computing element executes a plurality of instructions (two in this embodiment) simultaneously with only a small addition of hardware.

[0014] With the configuration of FIG. 1, however, the combinations of instructions that can be executed simultaneously are limited to some extent, for the following reasons.
(1) There is only one set of work registers (8 and 9 in FIG. 1).
(2) There is only one computing element (13 in FIG. 1) capable of performing a given operation.

[0015] Because of (1), two instructions decoded in parallel cannot be executed simultaneously if together they require four operands, and if together they require three operands they can be executed simultaneously only when they have a data dependency. Because of (2), there are further restrictions, for example that two instructions performing the same operation cannot be executed simultaneously. These restrictions can largely be removed by adding hardware, for example by providing several computing elements capable of the same operation.
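
The operand-count restriction that follows from (1) can be sketched as a small predicate. This is only an illustration: the function name `can_pair` and the treatment of pairs that need two or fewer operands are assumptions, and restriction (2) and the per-group results of FIG. 4 are not modelled.

```python
def can_pair(total_operands: int, data_dependent: bool) -> bool:
    """Operand-count check implied by having one set of work registers (8, 9).

    total_operands: register operands the two instructions need together.
    data_dependent: True when the pair has a data dependency.
    """
    if total_operands >= 4:
        return False            # four operands never fit
    if total_operands == 3:
        return data_dependent   # three fit only when the pair is dependent
    return True                 # assumption: two or fewer always fit
```

For instance, a load followed by an add that consumes the loaded register needs three operands in total and has a data dependency, so the check passes.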

[0016] As stated above, the invention is characterized by executing a plurality of instructions simultaneously in a single computing element. FIG. 3 shows the classification of the floating-point instructions covered by the invention. Instructions are divided into the RR format and the RX format, and the instructions of each format are classified into four groups. For example, in the LER group, the load instruction <LER> is a single-precision load and <LDR> is a double-precision load. Details of these instructions are given in, for example, IBM "Enterprise Systems Architecture/370 Principles of Operation".

[0017] This classification matters for deciding whether an instruction pair can be executed simultaneously, because the path through the computing element, the instruction format, and the number of execution cycles differ from instruction to instruction. In the description below, of the two instructions executed simultaneously, the preceding instruction is called the X instruction and the subsequent instruction the Y instruction.

[0018] FIG. 4 shows, for each instruction pair, whether the pair can be executed simultaneously by a single computing element, based on the instruction classification of FIG. 3 and the configuration of FIG. 1. In the figure, a circle at the intersection of an X instruction and a Y instruction indicates that the preceding X instruction and the subsequent Y instruction can be executed simultaneously regardless of data dependency. A triangle indicates that the pair can be executed simultaneously only when there is a data dependency, and a cross indicates that the pair cannot be executed simultaneously. The judgment results of FIG. 4 are stored as a table in the detection circuit 1.
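
The table lookup performed by the detection circuit 1 might be sketched as follows. The three table entries are illustrative only: the (LDR, STD) entry follows the worked example in the text, while the other two are inferred from the stated restrictions and are not a transcription of FIG. 4. The names `Mark`, `PAIR_TABLE`, and `can_execute_together`, and the conservative default for unlisted pairs, are assumptions.

```python
from enum import Enum


class Mark(Enum):
    CIRCLE = "always"          # pair regardless of data dependency
    TRIANGLE = "if_dependent"  # pair only when a data dependency exists
    CROSS = "never"            # pair cannot be executed simultaneously


# Hypothetical excerpt keyed by (X instruction, Y instruction).
PAIR_TABLE = {
    ("LDR", "STD"): Mark.CIRCLE,
    ("LDR", "ADR"): Mark.TRIANGLE,
    ("ADR", "ADR"): Mark.CROSS,
}


def can_execute_together(x_op: str, y_op: str, data_dependent: bool) -> bool:
    # Unlisted pairs default to CROSS here, a conservative assumption.
    mark = PAIR_TABLE.get((x_op, y_op), Mark.CROSS)
    if mark is Mark.CIRCLE:
        return True
    if mark is Mark.TRIANGLE:
        return data_dependent
    return False
```

Keeping the decision in a table means the detection logic reduces to a lookup on the two instruction codes plus a register comparison for the dependency check.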

[0019] The operation of the invention is described below using concrete instruction pairs. Suppose the instruction pair of Example 1 is issued from the instruction unit 21 shown in FIG. 2 to the arithmetic unit 23. That is, the preceding instruction (LDR 0,2) is held in the instruction register 25 and the subsequent instruction (ADR 0,4) in the instruction register 26; in the next stage they are decoded simultaneously by the decoders 27 and 28 and sent to the detection circuit 1 of the arithmetic unit 23.

[0020] The LDR instruction loads the contents of the register designated by the second operand into the register designated by the first operand. The ADR instruction adds the contents of the register designated by the second operand to the contents of the register designated by the first operand and stores the sum in the register designated by the first operand.

[0021] The detection circuit 1 consults the judgment table of FIG. 4 using the instruction codes and operand addresses of the issued instructions. In Example 1 the preceding (X) instruction is LDR and the subsequent (Y) instruction is ADR, and the register designated by the first operand of the preceding instruction matches the register designated by the first operand of the subsequent instruction, so a data dependency exists. The pair of Example 1 is therefore judged to be simultaneously executable by a single computing element, and the result is reported to the control circuit 2. The control circuit 2 controls the computing element 13 as follows.

[0022] First, the selector 6 selects the second operand data of the X instruction (the contents of register 2), and at the same time the selector 7 selects the second operand data of the Y instruction (the contents of register 4). The selected operand data are sent over the operand buses 102 and 104 to the selectors 4 and 3, respectively. Next, the operand data on the operand bus 104 (the second operand data of ADR) is stored in the work register 8 via the selector 3, and the operand data on the operand bus 102 (the second operand data of LDR) is stored in the work register 9 via the selector 4.

[0023] The computing element 13 then computes (contents of work register 8) + (contents of work register 9), stores the result in the register 10 and then the register 12, and writes it over the data bus 105 to register 0 of the register group 14, which is the register designated by the first operand of ADR. This completes the processing of the instruction pair.

[0024] Thus the X instruction (LDR) only hands its second operand data to the Y instruction and is not itself executed (that is, the second operand data is not transferred to the register of the first operand), while the Y instruction (ADR) is controlled so that it adds the second operand data of the X instruction to its own second operand data. As a result, two instructions are in effect executed simultaneously by a single computing element, and instruction processing time is shorter than with conventional processing. When, unlike the case above, an instruction sequence cannot be executed simultaneously, the preceding instruction is processed by the arithmetic unit first and the subsequent instruction is processed afterwards.
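
The effect of pairing in Example 1 can be checked with a minimal functional sketch that compares sequential execution of LDR 0,2 and ADR 0,4 with the paired data path described above. The register file is modelled as a plain dictionary, the function names are hypothetical, and the initial register values are arbitrary; timing is not modelled.

```python
def run_sequential(regs):
    r = dict(regs)
    r[0] = r[2]          # LDR 0,2: load register 2 into register 0
    r[0] = r[0] + r[4]   # ADR 0,4: add register 4 to register 0
    return r


def run_paired(regs):
    r = dict(regs)
    wr8 = r[4]           # selector 7 -> bus 104 -> selector 3 -> work register 8
    wr9 = r[2]           # selector 6 -> bus 102 -> selector 4 -> work register 9
    r[0] = wr8 + wr9     # computing element 13 -> registers 10, 12 -> bus 105
    return r


initial = {0: 1.0, 2: 2.5, 4: 4.0}
```

Both paths leave the same value in register 0, which is why the LDR write to register 0 can be skipped when the pair is executed together.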

[0025] Suppose the instruction pair of Example 2-1 is issued from the instruction unit 21 shown in FIG. 2 to the arithmetic unit 23. The STD instruction stores the contents of the register designated by the first operand at the main storage address designated by the second operand. This address is obtained by adding the contents of the index address register designated by the X2 field, the contents of the base address register designated by the B2 field, and the contents of the D2 field.

[0026] As in Example 1, the detection circuit 1 consults the judgment table of FIG. 4 using the instruction codes and operand addresses of the issued instructions. In Example 2-1 the preceding (X) instruction is LDR and the subsequent (Y) instruction is STD, so the pair is judged to be simultaneously executable by a single computing element regardless of data dependency, and the result is reported to the control circuit 2. The control circuit 2 controls the computing element 13 as follows.

[0027] First, the selector 6 selects the second operand data of the X instruction (LDR), and at the same time the selector 7 selects the first operand data of the Y instruction (STD). The selected operand data are sent over the operand buses 102 and 103 to the selectors 3 and 4, respectively. Next, the operand data on the operand bus 102 (the second operand data of LDR) is stored in the work register 8 via the selector 3, and the operand data on the operand bus 103 (the first operand data of STD) is stored in the work register 9 via the selector 4.

[0028] Next, for the X instruction (LDR), the computing element 13 computes (contents of work register 8) + 0 (that is, the data simply passes through the computing element), stores the result in the register 10 and then the register 12, and writes it over the data bus 105 to the register in the register group 14 designated by the first operand.

[0029] For the Y instruction (STD), the contents of the work register 9 are stored in the register 11 via the selector 5. The control circuit 2 operates so that the contents of the work register 9 are selected as the store data. The contents of the register 11 are then sent over a bus (not shown) to the storage unit 22 shown in FIG. 2 and stored at the main storage address designated by the second operand, which completes the processing of the instruction pair.
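
A minimal functional sketch of Example 2-1 follows. The operands of the example are not given in the text, so the pair LDR 0,2 and STD 4,addr used here is hypothetical, as are the function names and the dictionary model of registers and main storage.

```python
def run_sequential(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    r[0] = r[2]        # LDR 0,2 (hypothetical operands)
    m[addr] = r[4]     # STD 4,addr (hypothetical operands)
    return r, m


def run_paired(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    wr8 = r[2]         # selector 6 -> bus 102 -> selector 3 -> work register 8
    wr9 = r[4]         # selector 7 -> bus 103 -> selector 4 -> work register 9
    result = wr8 + 0   # LDR data passes through computing element 13
    reg11 = wr9        # selector 5 picks work register 9 as store data
    r[0] = result      # registers 10, 12 -> data bus 105 -> register group 14
    m[addr] = reg11    # register 11 -> storage unit 22
    return r, m


regs = {0: 1.0, 2: 2.5, 4: 4.0}
```

The two instructions use separate resources here: the load occupies the computing element while the store only needs the path from work register 9 to register 11.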

[0030] Similarly, suppose the instruction pair of Example 2-2 is issued from the instruction unit 21 shown in FIG. 2 to the arithmetic unit 23.

[0031] The differences from Example 2-1 are that the selector 7 is controlled so as not to select the first operand data of the Y instruction, and that the operand bus selected by the selector 4 is switched from 103 to 102. The second operand data of the X instruction selected by the selector 6 is therefore stored in the work register 9 via the operand bus 102 and the selector 4. The rest of the processing is exactly the same as in Example 2-1 and is not described again.

[0032] Suppose the instruction combination of Example 3 is issued from the instruction unit 21 shown in FIG. 2 to the arithmetic unit 23.

[0033] As in the examples above, the detection circuit 1 examines the instruction codes and operand addresses of the issued instructions. Because a data dependency exists in this example, it judges that the pair can be executed simultaneously by a single computing element and notifies the control circuit 2. The control circuit 2 controls the computing element as follows.

[0034] First, the selector 6 selects the first operand data of the X instruction, and at the same time the selector 7 selects the second operand data of the Y instruction. The selected operand data are sent over the operand buses 101 and 104 to the selectors 4 and 3, respectively. Next, the operand data on the operand bus 104 (the second operand of ADR) is stored in the work register 8 via the selector 3, and the operand data on the operand bus 101 (the first operand of STD) is stored in the work register 9 via the selector 4.

[0035] Next, for the X instruction, the contents of the work register 9 are stored in the register 11 as store data via the selector 5. The contents of the register 11 are then sent to the storage unit shown in FIG. 2 over a bus that is not shown.

[0036] For the Y instruction, the computing element 13 computes (contents of work register 8) + (contents of work register 9), stores the result in the register 10 and then the register 12, and writes it over the data bus 105 to the register in the register group 14 designated by the first operand, which completes the processing of the instruction pair.
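
Example 3 can be sketched in the same functional style. The operands are not listed in the text, so the pair STD 0,addr followed by ADR 0,4 is a hypothetical instance in which both instructions name register 0; the function names and the dictionary model are likewise assumptions.

```python
def run_sequential(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    m[addr] = r[0]       # STD 0,addr (hypothetical operands)
    r[0] = r[0] + r[4]   # ADR 0,4 (hypothetical operands)
    return r, m


def run_paired(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    wr8 = r[4]           # selector 7 -> bus 104 -> selector 3 -> work register 8
    wr9 = r[0]           # selector 6 -> bus 101 -> selector 4 -> work register 9
    reg11 = wr9          # X (STD): work register 9 -> selector 5 -> register 11
    m[addr] = reg11      # register 11 -> storage unit
    r[0] = wr8 + wr9     # Y (ADR): computing element 13 -> registers 10, 12
    return r, m


regs = {0: 1.0, 4: 4.0}
```

Work register 9 serves both instructions at once: it supplies the store data for the X instruction and one addend for the Y instruction, so the pair fits in the single set of work registers.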

[0037] Similarly, suppose the instruction combination of Example 4 is issued from the instruction unit to the arithmetic unit.

[0038] As in the examples above, the detection circuit 1 examines the instruction codes and operand addresses of the issued instructions. Because this example has a data dependency, as Example 3 does, it judges that the pair can be executed simultaneously by a single computing element and notifies the control circuit 2. The control circuit 2 controls the computing element as follows.

[0039] First, the selector 6 selects the first and second operand data of the X instruction. The operand data of the Y instruction is not selected by the selector 7 at this time. The operand data of the X instruction selected by the selector 6 are sent over the operand buses 101 and 102 to the selectors 4 and 3, respectively. Next, the operand data on the operand bus 102 (the second operand of ADR) is stored in the work register 8 via the selector 3, and the operand data on the operand bus 101 (the first operand of ADR) is stored in the work register 9 via the selector 4.

[0040] Next, the computing element 13 computes (contents of work register 8) + (contents of work register 9), stores the result in the register 10 and then the register 12, and writes it over the data bus 105 to the register in the register group 14 designated by the first operand.

[0041] What needs attention here is the processing of the Y instruction in Example 4. Unlike the store instruction of Example 3, the data stored by the Y instruction of Example 4 is not the operand data held in the work register 9 but the operation result of the X instruction. The store data to be sent to the storage unit is therefore placed in the register 11 one machine cycle later than for the store instruction of Example 3.

[0042] The Y instruction is processed as follows. The result computed by the computing element 13 for the X instruction and held in the register 10 is sent to the selector 5. Under the control of the control circuit 2, the selector 5 selects the result of the X instruction as the store data of the Y instruction. The store data placed in the register 11 is then sent over a bus (not shown) to the storage unit shown in FIG. 2, and processing ends.
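
The forwarding of the X result to the store path in Example 4 can be sketched as follows. The operands are not listed in the text, so ADR 0,4 followed by STD 0,addr is a hypothetical instance; the function names and the dictionary model are assumptions, and the one-cycle delay of the store data is not modelled.

```python
def run_sequential(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    r[0] = r[0] + r[4]   # ADR 0,4 (hypothetical operands)
    m[addr] = r[0]       # STD 0,addr (hypothetical operands)
    return r, m


def run_paired(regs, memory, addr):
    r, m = dict(regs), dict(memory)
    wr8 = r[4]           # selector 6 -> bus 102 -> selector 3 -> work register 8
    wr9 = r[0]           # selector 6 -> bus 101 -> selector 4 -> work register 9
    reg10 = wr8 + wr9    # computing element 13 -> register 10
    reg11 = reg10        # selector 5 picks the X result as Y's store data
    r[0] = reg10         # register 12 -> data bus 105 -> register group 14
    m[addr] = reg11      # register 11 -> storage unit
    return r, m


regs = {0: 1.0, 4: 4.0}
```

The only difference from the Example 3 data path is the source chosen by selector 5: the intermediate result register 10 instead of work register 9.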

[0043] An embodiment of the invention has been described above using several instruction pairs as examples; it will be readily understood that the operation of the other simultaneously executable pairs shown in FIG. 4 can be explained in exactly the same way. The processing described above allows a single computing element to execute a plurality of instructions simultaneously without increasing the amount of hardware, which shortens instruction processing time.

[0044] The invention is effective in improving processing performance even with a single computing element, and it is clear that an even greater effect is obtained when it is applied to a data processing apparatus having several computing elements.

[0045]

[Effects of the invention] As described above, the present invention detects either an instruction pair that has a data dependency and can be executed simultaneously by a single computing element, or an instruction pair that has no data dependency and can be executed simultaneously by a single computing element, and executes the detected pair simultaneously in the single computing element. Instruction processing time can therefore be shortened without increasing the amount of hardware.

[Brief description of drawings]

[FIG. 1] Configuration of the floating-point arithmetic unit according to an embodiment of the present invention.

[FIG. 2] Configuration of a pipelined data processing apparatus that can execute a plurality of instructions in parallel.

[FIG. 3] Classification of the floating-point instructions covered by the present invention.

[FIG. 4] Judgment results showing whether each instruction pair can be executed simultaneously by a single computing element.

[Explanation of symbols]

1: detection circuit
2: control circuit
3, 4, 5, 6, 7: selectors
8, 9: work registers
10: intermediate result register
11: store data register
12: operation result register
13: computing element
14: register group
101, 102, 103, 104: operand buses
105: data bus

Claims

[Claims]

1. A pipelined arithmetic processing method for executing a plurality of instructions in parallel in a data processing apparatus comprising a storage unit that stores instructions and data, an instruction unit that fetches a plurality of instructions from the storage unit and decodes the plurality of instructions in parallel, and an arithmetic unit that receives instruction codes and operands and executes operations, characterized in that, when an instruction pair that has a data dependency and can be executed simultaneously by a single computing element is detected among the plurality of instructions, the instruction pair is executed simultaneously by the single computing element.

2. The arithmetic processing method according to claim 1, characterized in that, when an instruction pair that has no data dependency and can be executed simultaneously by a single computing element is detected among the plurality of instructions, the instruction pair is executed simultaneously by the single computing element.