JP2869376B2

JP2869376B2 - Pipeline data processing method for executing multiple data processing with data dependency

Info

Publication number: JP2869376B2
Application number: JP3218196A
Authority: JP
Inventors: 雅逸中島
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1995-03-03
Filing date: 1996-02-20
Publication date: 1999-03-10
Anticipated expiration: 2016-02-20
Also published as: JPH08305566A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to an improvement in a pipeline data processing method that executes data processing by pipeline processing, and more particularly to a method that processes a plurality of data in parallel using a plurality of pipelines.

[0002]

[Description of the Related Art] In recent years, microprocessor performance has improved remarkably. Higher clock frequencies and internal parallel processing have both contributed greatly to this improvement. The former has been driven by advances in VLSI (Very Large Scale Integration) technology, typified by process technology and high-speed circuit technology; the latter by advances in architecture technology, typified by pipelining and superscalar techniques.

[0003] Among internal parallel processing techniques, pipelining is a temporal parallel processing technique: a series of data processing operations is divided into a plurality of stages (pipelined) so that multiple data processing operations overlap in time. Superscalar processing is a spatial parallel processing technique in which multiple data processing operations are executed simultaneously in parallel. Microprocessor performance has been improved by combining the two.

[0004] The following describes, with reference to FIG. 17, an example configuration in which the superscalar technique is applied to a conventional pipeline data processing device, which processes multiple data processing operations in parallel in time, so that the operations are also processed in parallel in space.

[0005] In FIG. 17, 150, 151 and 152 are pipeline data processing circuits arranged in parallel with one another; this parallel arrangement allows three sets of data processing to be executed spatially in parallel. The pipeline data processing circuits 150 to 152 have identical configurations, so the internal configuration is described below using pipeline data processing circuit 150 as an example. In pipeline data processing circuit 150, 118 and 119 are first and second input ports to which data is input; 120 and 121 are first and second registers that store the data input to input ports 118 and 119, respectively; 130 is a two-input adder/subtractor that receives the data stored in the two registers 120 and 121 and adds or subtracts them; and 132 is a register that receives and stores the operation result 131 of adder/subtractor 130.

[0006] In each of the pipeline data processing circuits 150 to 152, one data processing operation is pipelined into three stages: a data read stage, an operation execution stage that actually adds or subtracts the read data, and a data write stage that stores the operation result. In FIG. 17, 140a to 140f are registers that hold in advance the data supplied to the two-input adder/subtractors 130 of the pipeline data processing circuits 150 to 152, and 140g to 140i are registers that store the operation results output from the pipeline data processing circuits 150 to 152, respectively.

[0007] Next, the operation of the conventional pipeline data processing device shown in FIG. 17 is described.

[0008] First, consider executing the three additions C, D and G shown below, with addition C executed by the first pipeline data processing circuit 150, addition D by the second pipeline data processing circuit 151, and addition G by the third pipeline data processing circuit 152.

[0009]

C = A + B
D = E + F
G = H + I

Since there is no data dependency among these three operations, they can be executed completely independently. That is, as shown in the operation timing of FIG. 18(a), in the data read stage the reads of data A and B, of data E and F, and of data H and I can be performed simultaneously. In the operation execution stage, the operations A + B, E + F and H + I can be executed simultaneously. In the data write stage, the writes of data C, D and G can be performed simultaneously. The three data processing operations are thus executed in parallel by the three pipeline data processing circuits 150 to 152.

[0010]

[Problems to Be Solved by the Invention] The conventional pipeline data processing device described above, however, has the following drawback. Consider executing the three additions C, J and G shown below,

C = A + B
J = E + C
G = H + I

where the execution order is preset to C, then J, then G, with addition C executed by the first pipeline data processing circuit 150, addition J by the second pipeline data processing circuit 151, and addition G by the third pipeline data processing circuit 152.

[0011] Of these three operations, a data dependency exists between operations C and J. That is, the second operation J = E + C must be executed using the result of the first operation C = A + B, so the two cannot be executed simultaneously. As a result, as shown in the operation timing of FIG. 18(b), execution of the second operation J = E + C must start one cycle later than the start of the first operation C = A + B.

[0012] There are two schemes for the execution order of the second operation J and the third operation G: the out-of-order scheme and the in-order scheme. The former permits the execution order of data processing operations to be rearranged; the latter guarantees a predetermined execution order among them and prohibits one operation from overtaking another. Under the out-of-order scheme, therefore, as shown in FIG. 18(b), the third operation G = H + I can overtake the second operation J and execute simultaneously with the first operation C. Under the in-order scheme, by contrast, reordering is prohibited, so, as shown in FIG. 18(c), the third operation G cannot overtake the second operation J and must execute at the same time as J, that is, one cycle after the first operation C.
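The conventional timing in FIG. 18(b) and 18(c) can be reproduced with a small model. The following is a minimal sketch, not part of the patent: the function name `conventional_start_cycles` and the tuple encoding of operations are illustrative assumptions, and the one-cycle delay between a producer and its consumer is taken from the description above.

```python
def conventional_start_cycles(ops, in_order):
    """Return {dest: start cycle} for the conventional device (no dummy cycles).

    ops: list of (dest, (src1, src2)) in program order.
    Assumption: a consumer starts one cycle after its producer starts,
    as described for FIG. 18(b).
    """
    start = {}
    latest = 1  # latest start cycle seen so far (used only when in_order)
    for dest, srcs in ops:
        cycle = 1
        for src in srcs:
            if src in start:  # src is produced by an earlier operation
                cycle = max(cycle, start[src] + 1)
        if in_order:
            cycle = max(cycle, latest)  # overtaking is prohibited
        start[dest] = cycle
        latest = max(latest, cycle)
    return start


ops = [("C", ("A", "B")), ("J", ("E", "C")), ("G", ("H", "I"))]
print(conventional_start_cycles(ops, in_order=False))  # out-of-order scheme
print(conventional_start_cycles(ops, in_order=True))   # in-order scheme
```

Under the out-of-order scheme G starts together with C, while under the in-order scheme G is held back to the same cycle as J.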

[0013] As described above, in the conventional pipeline data processing device, data processing operations C and J that have a data dependency cannot be executed simultaneously under either the in-order or the out-of-order scheme. Conventionally, therefore, when such a data dependency exists, the instructions for operations C and J are not issued simultaneously to pipeline data processing circuits 150 and 151; the instruction for the second operation J is issued one cycle later than the instruction for the first operation C. Under the in-order scheme, moreover, the instruction for the subsequent third operation G, which has no data dependency, is also issued one cycle later than the instruction for the first operation C. The conventional pipeline data processing device thus has the drawback that data processing operations with data dependencies cannot be performed at high speed, so processing speed does not improve.

[0014] In view of this drawback, an object of the present invention is to provide a pipeline data processing method that improves processing speed. Under either the in-order or the out-of-order scheme, even when a plurality of data processing operations with data dependencies exist, the method allows the instructions specifying those operations to be issued simultaneously to the respective pipeline data processing circuits and allows the operations to be executed correctly by those circuits. Furthermore, when an operation with no data dependency follows operations that have data dependencies, the method allows the instruction for that subsequent operation to be issued simultaneously with the instructions for the dependent operations.

[0015]

[Means for Solving the Problems] To achieve this object, the present invention adopts the following configuration in a pipeline data processing method for a superscalar arrangement that has a plurality of pipeline processing circuits and processes a plurality of data processing operations spatially in parallel. The number of stages is set to the three stages (data read, operation execution and data write) plus a predetermined number of additional stages, and within these stages the data read stage and the operation execution stage can be moved to any suitable stage according to whether dependencies exist among the data processing operations specified by the simultaneously issued instructions.

[0016] That is, the pipeline data processing method of the invention according to claim 1 is a method in which a plurality of instructions, each specifying a data processing operation, are executed by a plurality of pipeline data processing circuits, the method comprising: a step of detecting whether data dependencies exist among the data processing operations specified by instructions already in execution and by a plurality of instructions about to be issued simultaneously; a step of including, in each of the instructions to be issued simultaneously, information on the number of dummy cycles (cycles in which nothing is done) to be inserted, according to the presence or absence of the detected data dependencies, and issuing the instructions simultaneously to the respective pipeline data processing circuits; and a step in which each pipeline data processing circuit that receives an instruction executes the data processing specified by that instruction after a delay of a number of cycles equal to the number of dummy cycles included in the instruction.
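The detection step of claim 1 can be illustrated as a check for read-after-write relationships within a list of instructions. The following is a minimal sketch under stated assumptions: the `Instr` type and the function `find_dependencies` are hypothetical names, and the patent's actual detection circuit is not described in this passage. Instructions already in execution can be modeled by placing them at the front of the list.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    dest: str               # register or variable written
    srcs: Tuple[str, ...]   # registers or variables read


def find_dependencies(instrs: List[Instr]) -> List[Tuple[int, int]]:
    """Return (producer_index, consumer_index) pairs in program order.

    A pair (i, j) means instruction j reads the value written by the
    earlier instruction i.
    """
    deps = []
    for j, consumer in enumerate(instrs):
        for i in range(j):
            if instrs[i].dest in consumer.srcs:
                deps.append((i, j))
    return deps


group = [Instr("C", ("A", "B")), Instr("J", ("E", "C")), Instr("K", ("H", "J"))]
print(find_dependencies(group))
```

For this group the sketch reports that J depends on C and K depends on J, which is the chain of dependencies the dummy-cycle counts have to accommodate.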

[0017] The invention according to claim 2 is the pipeline data processing method of claim 1, wherein each of the pipeline data processing circuits has one or more pass-through stages that transfer the result of the data processing specified by an instruction unchanged to the next stage, and wherein, after the data processing has been executed in a pipeline data processing circuit, a number of pass-through stages equal to the number of dummy cycles included in the received instruction are bypassed.

[0018] The invention according to claim 3 is the pipeline data processing method of claim 1, wherein, in the step of detecting whether data dependencies exist among the data processing operations specified by instructions already in execution and by a plurality of instructions about to be issued simultaneously, the number of dummy cycles to be inserted is set to 0 when no data dependency exists among the data processing operations.

[0019] The invention according to claim 4 is the pipeline data processing method of claim 1, wherein, in the step of detecting whether data dependencies exist among the data processing operations specified by instructions already in execution and by a plurality of instructions about to be issued simultaneously, when a data dependency exists among the data processing operations, the number x of dummy cycles to be inserted is calculated and set for each pipeline data processing circuit according to

x = a + b - c

where a is the number of pipeline stages used to execute the data processing in the pipeline data processing circuit responsible for the operation that depends on a preceding operation, b is the number of dummy cycles inserted into the data processing of the pipeline data processing circuit responsible for the preceding operation that is depended upon, and c is the difference in start cycles between the two operations that have the data dependency.
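As a worked example of the formula in claim 4, the following sketch evaluates x = a + b - c for the dependent chain C = A + B, J = E + C, K = H + J. It assumes a single execution stage per circuit (a = 1) and simultaneous issue of all three instructions (c = 0); the function name `dummy_cycles` and the constant `A_STAGES` are illustrative, not taken from the patent.

```python
def dummy_cycles(a: int, b: int, c: int) -> int:
    """Number of dummy cycles x = a + b - c from claim 4.

    a: execution pipeline stages in the circuit handling the dependent operation
    b: dummy cycles already assigned to the operation it depends on
    c: difference in start cycles between the two operations
    """
    return a + b - c


A_STAGES = 1  # assumption: addition/subtraction takes one execution stage

x_c = 0                                        # C depends on nothing
x_j = dummy_cycles(a=A_STAGES, b=x_c, c=0)     # J depends on C, issued together
x_k = dummy_cycles(a=A_STAGES, b=x_j, c=0)     # K depends on J, issued together
print(x_c, x_j, x_k)

# If the producer had started one cycle earlier (c = 1), no dummy cycle is needed.
print(dummy_cycles(a=A_STAGES, b=0, c=1))
```

Under these assumptions the counts come out as 0, 1 and 2 for C, J and K.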

[0020] The invention according to claim 5 is the pipeline data processing method of claim 1, wherein whether each pipeline data processing circuit is able to execute an instruction is determined by judging whether that circuit is executing a dummy cycle, and no instruction is issued to any pipeline data processing circuit that is not able to execute one.

[0021] The invention according to claim 6 is the pipeline data processing method of claim 1, wherein, before the number of dummy cycles to be inserted is calculated, all or some of the data processing operations that have dependencies are replaced with data processing operations that have no data dependency.

[0022] The invention according to claim 7 is the pipeline data processing method of claim 1, wherein whether each pipeline data processing circuit is able to execute an instruction is determined by judging whether that circuit is executing a dummy cycle; when instructions are determined to be executable, an instruction is issued singly, or instructions are issued simultaneously, to the one or more pipeline data processing circuits that are not executing a dummy cycle; whether data dependencies exist among the data processing operations specified by instructions already in execution and by the one or more instructions about to be issued is detected; and information on the number of dummy cycles to be inserted is included in the one or more instructions so issued, according to the presence or absence of the detected data dependencies.

[0023] With the above configuration, in the pipeline data processing methods of claims 1 to 7, a plurality of (for example, three) pipeline data processing circuits are arranged in parallel. When a plurality of (three) data processing operations have data dependencies on one another (for example, C = A + B, J = E + C and K = H + J), the instructions specifying these operations are issued simultaneously to the respective pipeline data processing circuits. In the circuit performing the first operation C, the number of inserted dummy cycles is 0, and the data read stage and the operation execution stage are assigned to the first and second stages. In the circuit performing the second operation J, the number of inserted dummy cycles is 1, and the data read stage and the operation execution stage are assigned to the second and third stages. In the circuit performing the third operation K, the number of inserted dummy cycles is 2, and the data read stage and the operation execution stage are assigned to the third and fourth stages. The operations C, J and K are therefore executed in sequence, and they are executed correctly even though data dependencies exist among them.
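The stage assignment described in this paragraph follows directly from the dummy-cycle count: the read stage is pushed back by the number of dummy cycles and the execute stage follows it. The following is a minimal sketch with hypothetical names (`stage_assignment`, `schedule`). Note that in this assignment J's read stage coincides with C's execute stage; this passage does not say how the result reaches the consumer, so the sketch checks only that each consumer executes after its producer.

```python
def stage_assignment(dummy: int) -> dict:
    """Stages (1-based) assigned to data read and operation execution."""
    return {"read": dummy + 1, "execute": dummy + 2}


dummies = {"C": 0, "J": 1, "K": 2}  # counts stated in the paragraph above
schedule = {op: stage_assignment(x) for op, x in dummies.items()}
for op, stages in schedule.items():
    print(op, stages)

# Each dependent operation executes exactly one stage after its producer.
for producer, consumer in [("C", "J"), ("J", "K")]:
    assert schedule[consumer]["execute"] == schedule[producer]["execute"] + 1
```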

[0024] Here, the instructions specifying operations C, J and K are issued simultaneously, and the following instruction (that is, the instruction specifying the fourth data processing operation) is issued in the second cycle. Conventionally, this fourth instruction would be issued in the fourth cycle. The present invention can therefore issue a long run of consecutive instructions without stalling and can execute data processing operations that have data dependencies at high speed.

[0025] In particular, in the inventions according to claims 5 and 7, while a dummy cycle is being executed in the data processing of an already issued instruction, that is, while the next instruction cannot be executed, the next instruction is not issued to the pipeline data processing circuit executing the dummy cycle; it is issued only to pipeline data processing circuits that are not executing a dummy cycle. The data processing specified by each instruction is therefore executed correctly, without causing a resource hazard in the pipeline data processing circuit responsible for it.
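The issue restriction of claims 5 and 7 can be sketched as a filter over the pipelines. The timing model here is an assumption made for illustration: a pipeline that received an instruction with x dummy cycles in issue cycle t is treated as executing dummy cycles during cycles t+1 to t+x and as available again from cycle t+x+1. The function name `free_pipelines` is hypothetical.

```python
def free_pipelines(issue_cycle, dummy_counts, now):
    """Indices of pipelines that are not executing a dummy cycle at cycle `now`.

    issue_cycle:  cycle in which the previous group of instructions was issued
    dummy_counts: dummy cycles given to each pipeline in that group
    """
    return [i for i, x in enumerate(dummy_counts) if now > issue_cycle + x]


# C, J and K were issued together in cycle 1 with 0, 1 and 2 dummy cycles.
for cycle in (2, 3, 4):
    print(cycle, free_pipelines(issue_cycle=1, dummy_counts=[0, 1, 2], now=cycle))
```

With this model only the first pipeline can accept an instruction in cycle 2, which is consistent with the statement that the fourth instruction is issued in the second cycle.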

[0026]

[Embodiments of the Invention] A pipeline data processing method according to an embodiment of the present invention is described below with reference to the drawings.

[0027] In FIG. 1, 1 is a processor, 2a is an instruction memory that stores an instruction program in advance, and 2b is a data memory that stores a large amount of data in advance. The instruction memory 2a and the data memory 2b need not be separate; instructions and data may be stored together in a single memory.

[0028] The processor 1 comprises an instruction control unit 3 and an instruction execution unit 4 that executes instructions such as data arithmetic operations. The instruction control unit 3 comprises an instruction fetch unit 5, an instruction register 6, an instruction decoding unit 7 and an instruction issue control unit 8. The instruction fetch unit 5 generates one or more instruction addresses, sends them to the instruction memory 2a and fetches the one or more instructions required. The instruction fetch unit 5 stores the fetched instructions in the instruction register 6. The instruction decoding unit 7 decodes the instructions stored in the instruction register 6 several (for example, three) at a time. Based on the decoding result, the instruction issue control unit 8 issues each instruction to the resource that corresponds to it: for example, an instruction that specifies a data arithmetic operation is issued to the instruction execution unit 4, whereas an instruction such as a branch, which controls the flow of the instruction program, is issued to the instruction fetch unit 5. In outline, when issuing instructions to the instruction execution unit 4, the instruction issue control unit 8 also checks the state of the resources in the instruction execution unit 4 that are needed to execute each instruction and whether data dependencies exist among the data of the instructions, and issues only the issuable instructions to the instruction execution unit 4. The internal configuration and operation of the instruction issue control unit 8 are described in detail later.

[0029] The instruction execution unit 4 comprises a register file 9 that temporarily stores data, an execution processing unit 10 that performs data operations, and an execution control unit 11. The execution control unit 11 reads (loads) the data needed for operations in the execution processing unit 10 from the data memory 2b and stores it in the register file 9. When an operation is actually executed in the execution processing unit 10, the execution control unit 11 controls the execution processing unit 10 so that it reads the data stored in the register file 9 and performs the operation specified by the instruction, and stores the result in the register file 9.

[0030] Data operations in the processor 1 are pipelined. As shown in FIG. 2, the pipeline consists of five stages: instruction fetch in the first stage, instruction decode and instruction issue in the second stage, data read from the register file 9 in the third stage, execution of the data operation in the fourth stage, and write of the operation result to the register file 9 in the fifth stage. The second and third stages may be combined into a single stage.
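The five-stage flow can be visualized by computing which stage each instruction occupies in each cycle. This is a minimal sketch that assumes one instruction is fetched per cycle and that no stalls or dummy cycles occur; the names `STAGES` and `stage_in_cycle` are illustrative.

```python
STAGES = ["fetch", "decode/issue", "read", "execute", "write"]


def stage_in_cycle(fetch_cycle: int, cycle: int):
    """Stage occupied in `cycle` by an instruction fetched in `fetch_cycle`.

    Returns None when the instruction is not in the pipeline in that cycle.
    """
    index = cycle - fetch_cycle
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None


# Three instructions fetched in consecutive cycles overlap in time.
for cycle in range(1, 8):
    print(cycle, [stage_in_cycle(f, cycle) for f in (1, 2, 3)])
```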

[0031] FIG. 3 shows a specific configuration of the execution processing unit 10 in the pipeline data processing device. In the figure, 100, 101 and 102 are pipeline data processing circuits arranged in parallel with one another. The pipeline data processing circuits 100 to 102 have identical configurations, so the internal configuration is described using pipeline data processing circuit 100 as an example. In the figure, 18 and 19 are first and second data input ports, which receive data from the register file 9; 20 and 21 are first and second data input registers, which store the data received from data input ports 18 and 19, respectively; and 30 is a two-input arithmetic unit (data operation circuit) that receives the data stored in the two data input registers 20 and 21 and adds or subtracts them.

[0032] Further, 40 is a first (intermediate) pipeline register placed after the arithmetic unit 30, 60 is a second (intermediate) pipeline register placed after the first pipeline register 40, and 70 is a selector (path switching circuit) placed after the second pipeline register 60. The first pipeline register 40 receives and stores the data operation result 31 output from the arithmetic unit 30. The second pipeline register 60 receives and stores the output data of the first pipeline register 40. The selector 70 receives three inputs, namely the data operation result 31 output from the arithmetic unit 30 and the data stored in the first and second pipeline registers 40 and 60, and selects and outputs one of them. 81 is a final pipeline register that receives and stores the output data 80 of the selector 70.
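One way to read the selector 70 together with claim 2 is that the number of dummy cycles decides how many of the pass-through registers 40 and 60 are bypassed. The following sketch is an interpretation, not the patent's stated control logic: the mapping from dummy count to selector input, the stage numbering, and the names `selector_choice` and `write_stage` are all assumptions.

```python
# Selector 70 inputs, ordered by how many pass-through registers the data crossed.
SELECTOR_INPUTS = ["result_31", "register_40", "register_60"]
PASS_STAGES = len(SELECTOR_INPUTS) - 1  # registers 40 and 60


def selector_choice(dummy: int) -> str:
    """Input chosen by selector 70 when `dummy` pass-through stages are bypassed."""
    if not 0 <= dummy <= PASS_STAGES:
        raise ValueError("dummy count exceeds the available pass-through stages")
    return SELECTOR_INPUTS[PASS_STAGES - dummy]


def write_stage(dummy: int) -> int:
    """Stage in which the result is written, counting the read stage as stage 1."""
    execute_stage = dummy + 2
    return execute_stage + (PASS_STAGES - dummy) + 1


for x in range(PASS_STAGES + 1):
    print(x, selector_choice(x), write_stage(x))
```

Under this reading, every instruction reaches the write stage in the same stage regardless of its dummy count, because each inserted dummy cycle is offset by one bypassed pass-through stage.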

[0033] In each of the pipeline data processing circuits 100 to 102, the built-in registers 20, 21, 40 and 60 and the selector 70 are controlled by the execution control unit 11 of FIG. 1, as shown in FIG. 4 (FIG. 4 shows only one pipeline data processing circuit, 100). 9a to 9i denote registers in the register file 9.

【００３４】The pipeline data processing circuit 100 may instead be configured as any of the pipeline data processing circuits shown in FIGS. 5 to 7. The pipeline data processing circuit 120 shown in FIG. 5 has one more pipeline stage than the pipeline data processing circuit 100 of FIG. 4. That is, relative to FIG. 4, another pipeline register 90 and another selector (path switching circuit) 91 are added between the selector 70 and the final pipeline register 81. The other pipeline register 90 receives and stores the selection result of the selector 70. The other selector 91 receives the operation result 31 of the arithmetic unit 30 and the data stored in the intermediate (first and second) pipeline registers 40 and 60 and in the other pipeline register 90, and selects and outputs one of them. The output 92 of the other selector 91 is input to and stored in the final pipeline register 81. The pipeline data processing circuit 130 of FIG. 6 has one pipeline register 30a built into the arithmetic unit 30', so that addition or subtraction is executed as a two-stage pipeline. In the pipeline data processing circuit 140 of FIG. 7, a first selector 52 is placed between the first and second pipeline registers 40 and 60, and a second selector 72 is placed between the second and final pipeline registers 60 and 81. The first selector 52 receives the operation result of the arithmetic unit 30 and the data stored in the preceding (first) pipeline register 40, and selects and outputs one of them. The second selector 72 receives the selection result of the first selector 52 and the data stored in the preceding (second) pipeline register 60, selects one of them, and outputs it to the final pipeline register 81. The arithmetic unit 30 of the pipeline data processing circuits 120 and 140 of FIGS. 5 and 7 may be replaced with the arithmetic unit 30' of the pipeline data processing circuit 130 of FIG. 6. The three pipeline data processing circuits 100 to 102 shown in FIG. 1 may be built from any one type, or from a combination of several types, of the processing circuits 120 to 140 described above. When several types are combined, however, they must all have the same number of pipeline stages.

【００３５】Before describing how the pipeline data processing device shown in FIG. 3 executes a plurality of operations in parallel, the execution of a single operation by a single pipeline data processing circuit (for example, 100) is described first. The pipeline data processing circuits shown in FIGS. 4 to 7 all share the same basic operation, so the following description uses the pipeline data processing circuit 100 of FIG. 4.

【００３６】The pipeline data processing circuit 100 can execute data processing with any of the three timings shown in FIGS. 8, 9, and 10. In FIGS. 8 to 10, the cycles used to describe the pipeline timing are called the first to fifth cycles, and the hardware stages used to describe the data flow in the pipeline data processing circuit 100 are called the data read stage, the operation execution stage, Trough stage (1) (first pass-through stage), Trough stage (2) (second pass-through stage), and the data write stage. Data processing is basically executed by five pipeline stages. FIGS. 8 to 10 relate the pipeline timing to the actual data flow in the pipeline data processing circuit.

【００３７】FIG. 8 shows the most basic operation of the pipeline data processing circuit 100. A data process (for example, the operation C = A + B) is executed as follows. The selector 70 selects the output of the second pipeline register 60. In the first cycle, data A and B are read in the data read stage and stored in the input registers 20 and 21, respectively. In the second cycle, the operation A + B is executed in the operation execution stage, and the operation result C is stored in the first pipeline register 40. In the third cycle, the data (operation result C) passes through Trough stage (1) and is stored unchanged in the second pipeline register 60. In the fourth cycle, the data (operation result C) passes through Trough stage (2) and is stored unchanged in the third pipeline register 80. In other words, no modification or processing is applied to the data (operation result C) in either Trough stage; the data simply moves to the next stage in turn. In the fifth cycle, the operation result C is written in the data write stage. In FIG. 8, the data flows through the pipeline data processing circuit 100 without stalling.

【００３８】FIG. 9 shows the case in which the first cycle of the five-stage pipeline is a dummy cycle. The selector 70 selects the output of the first pipeline register 40. In the first cycle (the dummy cycle), nothing is executed and the pipeline is held. That is, even when the timing advances to the second cycle, the actual data does not advance from the data read stage to the operation execution stage. Only in the second cycle are data A and B read in the data read stage and stored in the registers 20 and 21, respectively. In the third cycle, the operation A + B is executed in the operation execution stage, and the operation result C is stored in the first pipeline register 40. In the fourth cycle, the operation result C passes through Trough stage (1) and is stored in the third pipeline register 80 via the selector 70. In the fifth cycle, the operation result C is written in the data write stage. In other words, the data from Trough stage (1) (that is, the operation result C stored in the first pipeline register 40) bypasses Trough stage (2) and proceeds to the data write stage.

【００３９】The important point here is the following. When the first cycle is a dummy cycle, the data J and K of the next data process (for example, L = J + K) cannot be input in the second cycle. In this case the data read stage is first reached in the second cycle, so the next data J and K can be input only after the data A and B currently being processed advance to the operation execution stage, that is, only in the third cycle.

【００４０】FIG. 10 illustrates the operation when dummy cycles are inserted in the first and second cycles. The selector 70 selects the output of the arithmetic unit 30. In the first and second cycles (the dummy cycles), nothing is executed and the pipeline is held. That is, even when the timing advances to the third cycle, the actual data A and B do not advance from the data read stage to the operation execution stage. Only in the third cycle are data A and B read in the data read stage and stored in the input registers 20 and 21, respectively. In the fourth cycle, the operation A + B is executed in the operation execution stage, and the operation result C is stored in the third pipeline register 80 via the selector 70. In the fifth cycle, the operation result C is written in the data write stage. In other words, the data from the operation execution stage (operation result C) bypasses Trough stages (1) and (2) and proceeds to the data write stage.

【００４１】The important point here is the following. When the first and second cycles are dummy cycles, the data of the next data process cannot be input until after the third cycle. In this case the data read stage is first reached in the third cycle, so the next data can be input only after the data A and B currently being processed advance to the operation execution stage, that is, only in the fourth cycle.
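As a rough illustration of the three timings of FIGS. 8 to 10, the following Python sketch lists the stage occupied in each of the five cycles for a given number of dummy cycles. The function names and stage labels are illustrative assumptions and do not appear in the disclosure. The value that `next_input_cycle` returns for zero dummy cycles assumes ordinary back-to-back pipelining, which the text does not state explicitly.

```python
TROUGH_STAGES = 2  # Trough stage (1) and Trough stage (2) of the five-stage pipeline


def stage_timeline(dummy_cycles):
    """Return the stage occupied in cycles 1..5 for 0, 1, or 2 dummy cycles."""
    if not 0 <= dummy_cycles <= TROUGH_STAGES:
        raise ValueError("the FIG. 4 circuit can bypass at most two Trough stages")
    # Each inserted dummy cycle is paid for by bypassing one Trough stage,
    # so the data write stage always falls in the fifth cycle.
    troughs = [f"Trough({i})" for i in range(1, TROUGH_STAGES - dummy_cycles + 1)]
    return ["Dummy"] * dummy_cycles + ["Read", "Execute"] + troughs + ["Write"]


def next_input_cycle(dummy_cycles):
    """First cycle (1-based) in which the same circuit can accept new data."""
    return dummy_cycles + 2


for x in range(3):
    print(x, stage_timeline(x), "next input in cycle", next_input_cycle(x))
```

In every case the list has five entries, which reflects the point that the write always lands in the fifth cycle regardless of how many dummy cycles are inserted.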

【００４２】Next, the internal configuration of the instruction issue control unit 8 of FIG. 1 is described. As shown in the figure, the instruction issue control unit 8 includes a data dependency detection unit 8a, a resource hazard control unit 8b, and an issue control unit 8c. The data dependency detection unit 8a receives the instruction decoding information from the instruction decoding unit 7 and, based on the decoding information of a plurality of (three) instructions, detects the data dependencies among the data processes specified by those three instructions (for example, C = A + B, E = C + D, G = E + F). The resource hazard control unit 8b receives a signal (dummy-cycle-in-progress signal) indicating that one of the pipeline data processing circuits 100 to 102 is executing a dummy cycle as shown in FIGS. 9 and 10; on receiving this signal it judges that a resource hazard exists and sends a resource hazard signal to the issue control unit 8c. The issue control unit 8c determines whether each of the pipeline data processing circuits 100 to 102 is able to execute an instruction. When a circuit is judged able to execute, the issue control unit 8c decides whether to insert the dummy cycles shown in FIGS. 9 and 10 and, if so, how many; it attaches this information to the instruction and sends the one or more instructions to the execution control unit 11 so that they are executed by the available pipeline data processing circuits.

【００４３】Specifically, the issue control unit 8c determines whether each of the pipeline data processing circuits 100 to 102 is able to execute an instruction by checking whether that circuit is in a dummy cycle; a circuit that is not in a dummy cycle is judged able to execute.

【００４４】The issue control unit 8c decides whether to insert dummy cycles, and how many, as follows. First, it determines whether a data dependency exists between an already executing data process and a data process about to be executed, and among the data processes about to be executed simultaneously. If no data dependency exists, the number of dummy cycles is 0, that is, no dummy cycle is inserted. If a data dependency exists, the number of dummy cycles is determined as follows. For ease of understanding, consider three data processes executed in the order C = A + B, D = E + C, G = F + D, as shown in FIG. 11. The number of dummy cycles is determined in order, starting from the data process to be executed first. Let a be the number of pipeline stages of the arithmetic unit 30 in the pipeline data processing circuit that executes the data process D (or G) depending on the preceding data process C (or D); a is 1 for the pipeline data processing circuit of FIG. 4 and 2 for that of FIG. 6, and a = 1 is used below, taking the circuit of FIG. 4 as the example. Let b be the number of dummy cycles inserted into the depended-on data process C (or D) in its pipeline data processing circuit. Let c be the difference between the start cycles of the two data processes that have the data dependency (C and D, or D and G). The number of dummy cycles x is then given by x = a + b − c.

【００４５】In the example of FIG. 11, the first data process C = A + B has no preceding data process on which it depends, so its number of dummy cycles is 0. For the second data process D = E + C, which depends on the first data process C, the number of dummy cycles b of the depended-on process C is 0 and the start-cycle difference c between the two processes is 0 (simultaneous execution), so the required number of dummy cycles is x = 1 + 0 − 0 = 1 cycle. For the third data process G = F + D, which depends on the preceding data process D, the number of dummy cycles b of the depended-on process D is 1 and the start-cycle difference c is 0 (simultaneous execution), so the required number of dummy cycles is x = 1 + 1 − 0 = 2 cycles.
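The rule x = a + b − c and the FIG. 11 example can be checked with a few lines of Python. This is a minimal sketch: the function names are assumptions introduced here, and clamping the result at zero is also an assumption, since the text gives no example in which a + b − c is negative.

```python
def dummy_cycles(a, b, c):
    """Number of dummy cycles x = a + b - c for a dependent data process.

    a: pipeline stages of the arithmetic unit (1 for FIG. 4, 2 for FIG. 6)
    b: dummy cycles already inserted into the depended-on data process
    c: start-cycle difference between the two data processes
    The clamp at zero is an assumption, not stated in the text.
    """
    return max(0, a + b - c)


def chain_dummy_cycles(n_ops, a=1, c=0):
    """Dummy counts for a chain in which each process depends on the previous one."""
    counts = [0]  # the first process has no depended-on predecessor
    for _ in range(n_ops - 1):
        counts.append(dummy_cycles(a, counts[-1], c))
    return counts


print(chain_dummy_cycles(3))  # C, D, G of FIG. 11 -> [0, 1, 2]
```

With a = 1 and all three processes started together (c = 0), each link in the chain adds one dummy cycle, which matches the values 0, 1, and 2 derived in the text.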

【００４６】As FIG. 11 shows, of the three data processes C, D, and G with data dependencies, the first data process C is executed with the timing of FIG. 8, the second data process D, which depends on the preceding process C, is executed with the timing of FIG. 9, and the third data process G, which depends on the preceding process D, is executed with the timing of FIG. 10. Therefore, the instructions for these three data-dependent processes C, D, and G can be issued simultaneously and still execute without problems. In-order completion of the data processes C, D, and G is guaranteed.

【００４７】In FIG. 1, the execution control unit 11 receives the instructions issued by the issue control unit 8c and causes the data process specified by each instruction to be performed in the pipeline data processing circuit 100 to 102 designated by that instruction. The execution control unit 11 controls the selector 70 of the designated pipeline data processing circuit according to the number of dummy cycles x included in the corresponding instruction: when x = 0 it selects the output of the second pipeline register 60, when x = 1 the output of the first pipeline register 40, and when x = 2 the output (operation result) of the arithmetic unit 30. In addition, when any of the pipeline data processing circuits 100 to 102 is executing a dummy cycle, the execution control unit 11 sends a dummy-cycle-in-progress signal to the resource hazard control unit 8b.
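The selector control described above amounts to a three-entry lookup from the dummy-cycle count x to an input of the selector 70. The sketch below is illustrative only: the enum and function names are assumptions, and the enum values simply reuse the reference numerals of the source elements.

```python
from enum import Enum


class SelectorInput(Enum):
    """Inputs of selector 70, valued by the reference numeral of their source."""
    SECOND_PIPELINE_REGISTER = 60
    FIRST_PIPELINE_REGISTER = 40
    ARITHMETIC_UNIT = 30


def selector_input_for(x):
    """Map the dummy-cycle count x carried by an instruction to a selector 70 input."""
    table = {
        0: SelectorInput.SECOND_PIPELINE_REGISTER,  # FIG. 8: no stage bypassed
        1: SelectorInput.FIRST_PIPELINE_REGISTER,   # FIG. 9: Trough stage (2) bypassed
        2: SelectorInput.ARITHMETIC_UNIT,           # FIG. 10: both Trough stages bypassed
    }
    if x not in table:
        raise ValueError(f"unsupported dummy-cycle count for the FIG. 4 circuit: {x}")
    return table[x]


for x in range(3):
    print(x, selector_input_for(x).name)
```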

【００４８】Next, the pipeline data processing method of this embodiment is described with reference to FIG. 12, using a more complex data processing sequence. Consider the following data processing sequence consisting of ten data processes.

【００４９】
C = A + B (1)
E = C + D (2)
G = E + F (3)
J = H + I (4)
L = J + K (5)
N = L + M (6)
P = N + O (7)
S = Q + R (8)
U = S + T (9)
X = U + V (10)
Of these ten data processes, all except (1), (4), and (8) have a data dependency on a preceding data process.

【００５０】In FIG. 12, the three instructions specifying the data processes C, E, and G are first decoded simultaneously and issued simultaneously. The first data process C proceeds in the pipeline data processing circuit 100, the second data process E in the pipeline data processing circuit 101, and the third data process G in the pipeline data processing circuit 102. However, data dependencies exist between the first and second data processes C and E, and between the second and third data processes E and G. For the pipeline data processing circuit 101, a = 1, b = 0, and c = 0 in the formula above, so the number of dummy cycles to insert is x = 1. For the pipeline data processing circuit 102, a = 1, b = 1, and c = 0, so x = 2.

【００５１】In the first cycle, the three instructions specifying the following three data processes J, L, and N are decoded. The two pipeline data processing circuits 101 and 102 are each executing a dummy cycle, so a resource hazard signal is generated for these processing circuits 101 and 102. As a result, only the instruction specifying the fourth data process J is issued, and this data process J proceeds in the pipeline data processing circuit 100. The fourth data process J has no data dependency on the existing data processes C, E, and G, so the number of dummy cycles in the pipeline data processing circuit 100 that performs J is x = 0.

【００５２】In the second cycle, the two instructions that were not issued in the previous cycle and the instruction specifying the seventh data process P are decoded. Only the pipeline data processing circuit 102 is executing a dummy cycle, so a resource hazard signal is generated for this processing circuit 102. As a result, the two instructions specifying the fifth and sixth data processes L and N are issued, and these data processes L and N proceed in the pipeline data processing circuits 100 and 101, respectively. Here, data dependencies exist between the fourth and fifth data processes J and L, and between the fifth and sixth data processes L and N. For the pipeline data processing circuit 100, a = 1, b = 0, and c = 1 in the formula above, so the number of dummy cycles inserted is x = 0. For the pipeline data processing circuit 101, a = 1, b = 0, and c = 0, so x = 1.

【００５３】In the third cycle, the one instruction that was not issued in the previous cycle and the instructions specifying the eighth and ninth data processes S and U are decoded. Only the pipeline data processing circuit 101 is executing a dummy cycle, so a resource hazard signal is generated for this processing circuit 101. As a result, the two instructions specifying the seventh and eighth data processes P and S are issued, and these data processes P and S proceed in the pipeline data processing circuits 100 and 102, respectively. Here, a data dependency exists between the sixth and seventh data processes N and P. For the pipeline data processing circuit 100, a = 1, b = 1, and c = 1 in the formula above, so the number of dummy cycles inserted is x = 1. No data dependency exists between the seventh and eighth data processes P and S, so the number of dummy cycles inserted into the pipeline data processing circuit 102 is x = 0.

【００５４】In the fourth cycle, the one instruction that was not issued in the previous cycle and the instruction specifying the tenth data process X are decoded. Only the pipeline data processing circuit 100 is executing a dummy cycle, so a resource hazard signal is generated for this processing circuit 100. As a result, the two instructions specifying the ninth and tenth data processes U and X are issued, and these data processes U and X proceed in the pipeline data processing circuits 101 and 102, respectively. Here, data dependencies exist between the eighth and ninth data processes S and U, and between the ninth and tenth data processes U and X. For the pipeline data processing circuit 101, a = 1, b = 0, and c = 1 in the formula above, so the number of dummy cycles inserted is x = 0. For the pipeline data processing circuit 102, a = 1, b = 0, and c = 0, so x = 1.

【００５５】In this embodiment, as FIG. 12 shows, all ten data processing instructions have been issued by the fifth cycle, and execution of all the data processes is complete in the ninth cycle. By contrast, when the same data processing sequence is executed on the conventional pipeline data processing device of FIG. 17, as the timing chart of FIG. 13 shows, all ten data processing instructions have been issued by the eighth cycle, and execution of all the data processes is complete in the tenth cycle.
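The issue sequence walked through for FIG. 12 can be reproduced with a small simulation. The sketch below rests on several assumptions that are labeled in the code: instructions are issued in order, at most three per cycle; free circuits are assigned in ascending order (index 0, 1, 2 standing for circuits 100, 101, 102); a circuit is unavailable only while its latest process is in a dummy cycle; a process decoded in cycle t starts in cycle t + 1; and a negative a + b − c is clamped to zero. The class and function names are hypothetical.

```python
from dataclasses import dataclass

A_STAGES = 1        # symbol a for the FIG. 4 circuit
PIPELINE_DEPTH = 5  # every process occupies five cycles, dummy cycles included


@dataclass
class Op:
    dest: str
    srcs: tuple
    circuit: int = -1  # 0, 1, 2 stand for circuits 100, 101, 102
    start: int = -1    # first cycle the process occupies its circuit
    dummy: int = 0     # number of dummy cycles x


def schedule(ops, n_circuits=3):
    """Greedy in-order issue; fills in circuit, start, and dummy for each Op."""
    producer = {}               # result name -> Op that produces it
    last = [None] * n_circuits  # latest Op issued to each circuit
    pending = list(ops)
    cycle = 0                   # decode cycle; issued ops start at cycle + 1
    while pending:
        # Resource hazard: the circuit's latest op is still in a dummy cycle.
        free = [i for i, op in enumerate(last)
                if op is None or not (op.start <= cycle < op.start + op.dummy)]
        for op, circuit in zip(pending[:n_circuits], free):
            op.circuit, op.start = circuit, cycle + 1
            deps = [producer[s] for s in op.srcs if s in producer]
            # x = a + b - c per depended-on op; the clamp at 0 is an assumption.
            op.dummy = max([0] + [A_STAGES + p.dummy - (op.start - p.start)
                                  for p in deps])
            producer[op.dest] = op
            last[circuit] = op
            pending.remove(op)
        cycle += 1
    return ops


SEQUENCE = [("C", "AB"), ("E", "CD"), ("G", "EF"), ("J", "HI"), ("L", "JK"),
            ("N", "LM"), ("P", "NO"), ("S", "QR"), ("U", "ST"), ("X", "UV")]
ops = schedule([Op(dest, tuple(srcs)) for dest, srcs in SEQUENCE])
for op in ops:
    print(op.dest, "circuit", 100 + op.circuit, "start", op.start,
          "x =", op.dummy, "done", op.start + PIPELINE_DEPTH - 1)
```

Under these assumptions the last two processes, U and X, start in cycle 5 and X finishes in cycle 9, in line with the figures quoted for FIG. 12. The sketch does not model the conventional device of FIG. 13.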

【００５６】Comparing the embodiment of the present invention in FIG. 12 with the conventional example in FIG. 13 therefore shows that the pipeline data processing method of the embodiment achieves higher performance in terms of operating speed.

【００５７】FIG. 14 shows another pipeline data processing device.

【００５８】In this embodiment, in addition to the configuration of FIG. 1, three pipeline data processing circuits 103 to 105 of identical configuration are provided in parallel. Each of the pipeline data processing circuits 103 to 105 has an arithmetic unit 30'' with three or more inputs (three in the figure). To match the three inputs, three data input ports 17 to 19 and three data input registers 20 to 22, which store the data input to those ports, are provided; the data stored in the data input registers 20 to 22 is input to the three-input arithmetic unit 30'' and added or subtracted. The remaining configuration is the same as that of the pipeline data processing circuit of the first embodiment shown in FIG. 4, so identical parts carry identical reference numerals and their description is omitted. In FIG. 14, reference numerals 9j to 9r denote registers in the register file 9.

【００５９】FIG. 15 shows the main configuration of a processor that includes the pipeline data processing device of FIG. 14. The overall configuration of this processor is the same as that of the processor of FIG. 1 described in the first embodiment, so only the differences are described below. The processor of FIG. 15 differs only in the issue control unit 8c' of the instruction issue control unit 8'. This issue control unit 8c' modifies the data processes so as to eliminate the data dependencies among them. For example, given the two data processes
C = A + B
D = E + C
which have a data dependency, the latter data process is replaced with an equivalent three-input operation in which C is substituted by A + B (that is, D = E + A + B). The instruction specifying this data process D is then issued and executed by the pipeline data processing circuit 103, which has the three-input arithmetic unit 30''. Data processes that originally have a data dependency can thus be executed simultaneously in parallel as data processes without a data dependency.
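The dependency-eliminating rewrite can be sketched as an operand substitution. This is an illustration under stated assumptions, not the disclosed circuit: it assumes that the rewrite replaces the depended-on result with its own operands, that only additions are involved (subtraction would need sign tracking), and that a rewrite is rejected when it would need more than three inputs. The function name `fuse` and the tuple representation are hypothetical.

```python
def fuse(producer, consumer, max_inputs=3):
    """Substitute the producer's operands for its result inside the consumer.

    Each data process is a (result, operands) pair. Returns the rewritten
    consumer, the unchanged consumer if it does not use the producer's
    result, or None if the rewrite would exceed max_inputs operands.
    """
    p_dest, p_srcs = producer
    c_dest, c_srcs = consumer
    if p_dest not in c_srcs:
        return consumer
    new_srcs = tuple(s for src in c_srcs
                     for s in (p_srcs if src == p_dest else (src,)))
    if len(new_srcs) > max_inputs:
        return None
    return (c_dest, new_srcs)


c_op = ("C", ("A", "B"))
d_op = ("D", ("E", "C"))
print(fuse(c_op, d_op))  # D rewritten to use E, A, B directly
```

Applying the same substitution a second time, for example to G = F + D after D has been rewritten, would need four operands and is rejected by this sketch, so a chain longer than two would still rely on dummy cycles for the remaining dependencies.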

【００６０】Therefore, with the pipeline data processing method of this embodiment, consider executing six data processes with mutual data dependencies such as the following.
C = A + B
D = E + C
G = F + D
H = I + G
J = K + H
L = M + J
As shown in FIG. 16, the instructions for these six data-dependent processes can be issued and executed simultaneously in parallel through two combined effects: the elimination of data dependencies by the pipeline data processing device having the three-input arithmetic unit 30'', and the insertion of dummy cycles described in the first embodiment.

【００６１】In this embodiment, the arithmetic unit 30 of the pipeline data processing circuit shown in FIG. 4 is replaced with the three-input arithmetic unit 30''. Although not illustrated, the arithmetic unit 30 of any of the pipeline data processing circuits shown in FIG. 5, FIG. 6, or FIG. 7 may of course likewise be replaced with the three-input arithmetic unit 30''.

【００６２】[0062]

【発明の効果】以上説明したように、請求項１ないし請
求項７記載の発明のパイプラインデータ処理方法によれ
ば、複数個のパイプライン処理回路を備えて複数のデー
タ処理を空間的に並列処理するスーパースカラー技術に
おいて、複数のデータ処理間のデータ依存関係の有無に
応じて、同時発行する複数の命令の各々に挿入するｄｕ
ｍｍｙサイクルの数を算出したので、複数ステージより
成る各パイプラインの中で、データ処理の実行ステージ
を任意のステージに移動することが可能となり、Ｏｕｔ
- ｏｆ- Ｏｒｄｅｒ方式及びＩｎ- Ｏｒｄｅｒ方式に拘
らず、複数の命令が指示するデータ処理相互間にデータ
依存関係があっても、これ等複数の命令を同時に発行し
ながら、これ等命令が指示するデータ処理を各々正しく
実行できると共に、後続するデータ依存関係のないデー
タ処理を指示する命令を、その先行するデータ処理を指
示する命令と同時に発行することが可能であり、データ
処理速度の向上を図ることができる。[Effects of the Invention] As described above, the pipeline data processing method of the inventions of claims 1 to 7 applies to superscalar technology, in which a plurality of pipeline processing circuits are provided and a plurality of data processes are processed spatially in parallel. In this method, the number of dummy cycles to be inserted into each of the simultaneously issued instructions is calculated according to whether a data dependency exists among the data processes. The execution stage of a data process can therefore be moved to any stage within each pipeline, which consists of a plurality of stages. Consequently, under either the Out-of-Order or the In-Order scheme, instructions can be issued simultaneously even when data dependencies exist among the data processes they designate, and each designated data process is still executed correctly. In addition, an instruction designating a subsequent data process that has no data dependency can be issued simultaneously with the instruction designating the preceding data process, so the data processing speed can be improved.

【００６３】特に、請求項５及び７記載の発明では、ｄ
ｕｍｍｙサイクルの実行中か否かにより各パイプライン
データ処理回路が次の命令の実行が可能か否かを判定し
たので、各命令が指示するデータ処理を、その担当パイ
プラインデータ処理回路で資源ハザードを招くことな
く、正しく実行させることが可能である。 In particular, in the inventions of claims 5 and 7, whether each pipeline data processing circuit can execute the next instruction is determined from whether that circuit is executing a dummy cycle. The data process designated by each instruction can therefore be executed correctly by the pipeline data processing circuit in charge of it, without causing a resource hazard.
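
The busy check described here can be sketched in Python as a counter of outstanding dummy cycles per circuit. The class, its fields, and the issue policy are hypothetical; the model tracks only dummy cycles and ignores the execution stage itself.

```python
# Illustrative sketch; the class, its fields, and the issue policy are
# hypothetical and only mirror the idea described in paragraph 0063.

class PipelineCircuit:
    def __init__(self, name):
        self.name = name
        self.remaining_dummy = 0  # dummy cycles still to be spent

    def can_accept(self):
        """A circuit still spending dummy cycles cannot take a new instruction."""
        return self.remaining_dummy == 0

    def issue(self, dummy_cycles):
        """Accept an instruction carrying its inserted dummy-cycle count."""
        if not self.can_accept():
            raise RuntimeError(f"{self.name} is busy (resource hazard)")
        self.remaining_dummy = dummy_cycles

    def tick(self):
        """Advance one clock cycle."""
        if self.remaining_dummy > 0:
            self.remaining_dummy -= 1


def issuable(circuits):
    """Return the circuits to which an instruction may be issued this cycle."""
    return [c for c in circuits if c.can_accept()]


circuits = [PipelineCircuit("p0"), PipelineCircuit("p1")]
circuits[0].issue(dummy_cycles=2)  # p0 will idle for two cycles
circuits[1].issue(dummy_cycles=0)  # p1 has no dummy cycles to spend
print([c.name for c in issuable(circuits)])  # ['p1']
for _ in range(2):
    circuits[0].tick()
print([c.name for c in issuable(circuits)])  # ['p0', 'p1']
```

Withholding issue from a circuit whose counter is non-zero is what avoids the resource hazard in this simplified model.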

[Brief description of the drawings]

【図１】プロセッサの全体構成を示す図である。FIG. 1 is a diagram illustrating an overall configuration of a processor.

【図２】パイプライン動作の説明図である。FIG. 2 is an explanatory diagram of a pipeline operation.

【図３】本発明の第１の実施の形態のパイプラインデー
タ処理方法に係るパイプラインデータ処理装置の全体構
成を示す図である。FIG. 3 is a diagram illustrating an overall configuration of a pipeline data processing device according to the pipeline data processing method according to the first embodiment of the present invention.

【図４】パイプラインデータ処理装置を構成する１つの
パイプラインデータ処理回路の内部構成を示す図であ
る。FIG. 4 is a diagram showing the internal configuration of one pipeline data processing circuit that constitutes the pipeline data processing device.

【図５】パイプラインデータ処理回路の第１変形例を示
す図である。FIG. 5 is a diagram showing a first modification of the pipeline data processing circuit.

【図６】パイプラインデータ処理回路の第２変形例を示
す図である。FIG. 6 is a diagram showing a second modification of the pipeline data processing circuit.

【図７】パイプラインデータ処理回路の第３変形例を示
す図である。FIG. 7 is a diagram showing a third modification of the pipeline data processing circuit.

【図８】パイプラインデータ処理回路の基本動作を示す
タイミング図である。FIG. 8 is a timing chart showing a basic operation of the pipeline data processing circuit.

【図９】パイプラインデータ処理回路の他の基本動作を
示すタイミング図である。FIG. 9 is a timing chart showing another basic operation of the pipeline data processing circuit.

【図１０】パイプラインデータ処理回路の別の基本動作
を示すタイミング図である。FIG. 10 is a timing chart showing another basic operation of the pipeline data processing circuit.

【図１１】本発明のパイプラインデータ処理方法により
３つのデータ処理を行う場合の動作タイミング図であ
る。FIG. 11 is an operation timing chart when three data processes are performed by the pipeline data processing method of the present invention.

【図１２】本発明のパイプラインデータ処理方法により
多数のデータ処理を行う場合の動作タイミング図であ
る。FIG. 12 is an operation timing chart when performing a large number of data processing by the pipeline data processing method of the present invention.

【図１３】従来のパイプラインデータ処理装置を用いて
多数のデータ処理を行う場合の動作タイミング図であ
る。FIG. 13 is an operation timing diagram when performing a large number of data processing using a conventional pipeline data processing device.

【図１４】本発明の第２の実施の形態のパイプラインデ
ータ処理方法に係るパイプラインデータ処理装置の全体
構成を示す図である。FIG. 14 is a diagram illustrating an overall configuration of a pipeline data processing device according to a pipeline data processing method according to a second embodiment of the present invention.

【図１５】同パイプラインデータ処理装置を含むプロセ
ッサの要部構成を示す図である。FIG. 15 is a diagram showing a main configuration of a processor including the pipeline data processing device.

【図１６】本発明の第２の実施の形態のパイプラインデ
ータ処理方法により６つのデータ処理を行う場合の動作
タイミング図である。FIG. 16 is an operation timing chart when six data processes are performed by the pipeline data processing method according to the second embodiment of the present invention.

【図１７】従来のパイプラインデータ処理装置の全体構
成を示す図である。FIG. 17 is a diagram showing an overall configuration of a conventional pipeline data processing device.

【図１８】従来のパイプラインデータ処理装置の基本動
作を示す動作タイミング図である。FIG. 18 is an operation timing chart showing a basic operation of a conventional pipeline data processing device.

[Explanation of symbols]

１プロセッサ８命令発行制御部１１実行制御部１７〜１２データ入力ポート２０〜２２入力データレジスタ３０、３０´ ２入力演算器（データ演算回路）３０'' ３入力演算器（データ演算回路）４０第１（中間）パイプラインレジスタ５２第１セレクタ（経路切換回路）６０第２（中間）パイプラインレジスタ７０セレクタ（経路切換回路）７２第２セレクタ（経路切換回路）８１最終パイプラインレジスタ９１他のセレクタ（経路切換回路）１００〜１０５パイプラインデータ処理回路
DESCRIPTION OF SYMBOLS
1 Processor
8 Instruction issue control unit
11 Execution control unit
17-12 Data input ports
20-22 Input data registers
30, 30' Two-input arithmetic unit (data arithmetic circuit)
30'' Three-input arithmetic unit (data arithmetic circuit)
40 First (intermediate) pipeline register
52 First selector (path switching circuit)
60 Second (intermediate) pipeline register
70 Selector (path switching circuit)
72 Second selector (path switching circuit)
81 Final pipeline register
91 Other selector (path switching circuit)
100-105 Pipeline data processing circuits

Claims

(57) [Claims]

1. A pipeline data processing method in which a plurality of instructions, each designating one of a plurality of data processes, are executed by a plurality of pipeline data processing circuits, the method comprising: a step of detecting whether a data dependency exists among the data processes designated by instructions already being executed and by a plurality of instructions to be issued simultaneously; a step of including, in each of the plurality of instructions to be issued simultaneously, information on the number of inserted dummy cycles in accordance with the presence or absence of the detected data dependency among the plurality of data processes, and then simultaneously issuing the instructions designating the respective data processes to the respective pipeline data processing circuits; and a step in which each pipeline data processing circuit receives its instruction and, after a number of cycles equal to the number of inserted dummy cycles included in that instruction, executes the data process designated by the received instruction.

2. The pipeline data processing method according to claim 1, wherein each of the plurality of pipeline data processing circuits has one or more pass-through stages that shift the execution result of the data process designated by an instruction, unchanged, to the next stage, and each pipeline data processing circuit, after executing the data process, bypasses a number of pass-through stages equal to the number of inserted dummy cycles included in the received instruction.

3. The pipeline data processing method according to claim 1, wherein, in the step of detecting whether a data dependency exists among the data processes designated by the instructions already being executed and by the plurality of instructions to be issued simultaneously, the number of inserted dummy cycles is set to "0" when no data dependency exists among the plurality of data processes.

4. The pipeline data processing method according to claim 1, wherein, in the step of detecting whether a data dependency exists among the data processes designated by the instructions already being executed and by the plurality of instructions to be issued simultaneously, when a data dependency exists among the plurality of data processes, the number x of inserted dummy cycles is calculated and set for each pipeline data processing circuit according to the following equation:
    x = a + b - c
where a is the number of pipeline stages in which the pipeline data processing circuit in charge of the preceding data process in the dependency executes that data process, b is the number of dummy cycles inserted into the data process of the pipeline data processing circuit in charge of that preceding data process, and c is the difference between the start cycles of the two data processes having the data dependency.

5. The pipeline data processing method according to claim 1, wherein whether each pipeline data processing circuit is in an instruction-executable state is determined by determining whether that pipeline data processing circuit is executing a dummy cycle, and when any pipeline data processing circuit is not in an instruction-executable state, no instruction is issued to that pipeline data processing circuit.

6. The pipeline data processing method according to claim 1, wherein, before the number of inserted dummy cycles is calculated, all or some of the plurality of data processes having a data dependency are replaced with data processes having no data dependency.

7. The pipeline data processing method according to claim 1, wherein: whether each pipeline data processing circuit is in an instruction-executable state is determined by determining whether that pipeline data processing circuit is executing a dummy cycle; instructions are issued, individually or simultaneously, to the one or more pipeline data processing circuits determined to be in an instruction-executable state; whether a data dependency exists among the data processes designated by the instructions already being executed and by the one or more instructions to be issued individually or simultaneously is detected; and, depending on the presence or absence of the detected data dependency, information on the number of inserted dummy cycles, in which nothing is executed, is included in the one or more instructions issued individually or simultaneously.
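
As a worked illustration of the dummy-cycle count in claim 4, the formula is read here as x = a + b - c. The Python sketch below evaluates it for two cases; the clamp to zero and the example numbers are assumptions, not part of the claim.

```python
# Illustrative sketch; parameter names follow claim 4, while the clamp
# to zero and the example numbers are assumptions.

def inserted_dummy_cycles(a, b, c):
    """x = a + b - c, clamped at zero.

    a: execution stages of the circuit running the preceding process
    b: dummy cycles inserted into that preceding process
    c: start-cycle difference between the two dependent processes
    """
    return max(a + b - c, 0)


# Both instructions issued in the same cycle (c = 0), one execution
# stage, no dummy cycles in the producer: the consumer waits one cycle.
print(inserted_dummy_cycles(a=1, b=0, c=0))  # 1

# Consumer issued two cycles after a producer that itself waited one
# dummy cycle: the result is already available, so no wait is needed.
print(inserted_dummy_cycles(a=1, b=1, c=2))  # 0
```

Intuitively, the producer's result becomes available a + b cycles after the producer starts, and the consumer starts c cycles later, so it must idle for the remainder.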