JP2797833B2 - Parallel instruction execution control method

Info

Publication number: JP2797833B2
Application number: JP4115216A
Authority: JP
Inventors: さおり中村
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1992-04-09
Filing date: 1992-04-09
Publication date: 1998-09-17
Anticipated expiration: 2013-09-17
Also published as: JPH05289870A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a parallel instruction execution control method for controlling, in parallel, the operation units of a processor that has a plurality of operation units.

[0002]

[Prior Art] In a processor of this kind, the dependencies among instructions and data are analyzed when the compiler generates the object program, and the assembly instructions are arranged on the basis of that analysis. The generated assembly instruction sequence, which is the object program, is a series of parallel assembly instructions, each of which contains as many assembly instructions as there are operation units. In a cycle in which one or more operation units have no instruction to execute, a null instruction, indicating that nothing is to be executed, is placed at the corresponding position in the parallel assembly instruction. At execution time, one parallel assembly instruction, that is, a group of assembly instructions equal in number to the operation units, is fetched from the object program in each cycle and decoded, and each operation unit is controlled on the basis of the result.

[0003]

[Problems to Be Solved by the Invention] In the conventional approach, however, it is often impossible to have every operation unit execute an assembly instruction in every cycle, and the object program therefore contains many null instructions, particularly when the program has low parallelism. As one means of solving this problem, Colwell R. P., Nix R. P., O'Donnell J. J., Papworth D. B. and Rodman P. K.: "A VLIW architecture for a trace scheduling compiler", IEEE Trans. Computers, 37, 8, pp. 967-979 (1988-08) proposes a method in which the object program is compressed by attaching information on the position each instruction occupies after expansion, and the compressed object program is expanded in cache memory at execution time by dedicated hardware. This method, however, has the drawback that the expansion hardware becomes very complicated.

[0004] Because the problem above arises from the compiler arranging the object program statically, a means of parallel execution control that takes an entirely different approach has also been proposed: the superscalar method, which analyzes instruction dependencies dynamically at execution time and uses the result to control execution on a processor having a plurality of operation units. It is described, for example, in McGeady S.: "The i960CA superscalar implementation of the 80960 architecture", Compcon Spring 90 digest of papers, pp. 232-240, IEEE (1990). The superscalar method, however, requires complicated and extensive hardware to analyze instruction dependencies and resource availability, and its dependency analysis is limited to a narrow range of instructions.

[0005] An object of the present invention is to solve the problems described above and to provide a parallel instruction execution control method that makes it easy to control parallel execution on such a processor, without complicated hardware, by means of a small object program that contains no unnecessary null instructions.

[0006]

[Means for Solving the Problems] The parallel instruction execution control method of the present invention controls the operation units of a processor having a plurality of operation units in parallel by means of the assembly instructions contained in a parallel assembly instruction sequence that is the object program. It is characterized in that a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and in that, when the processor executes, the instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle.
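The grouping rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented hardware: the `Instr` class, its `end` field, the function name `split_into_cycles` and the placeholder instruction names `i0` to `i5` are all hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instr:
    """One assembly instruction plus its boundary flag (hypothetical model)."""
    text: str          # placeholder name for the instruction
    end: bool = False  # True when this instruction closes a parallel group


def split_into_cycles(stream: List[Instr]) -> List[List[str]]:
    """Group a flat instruction stream into per-cycle parallel groups.

    Each group runs from the position where the previous group ended up to
    and including the next instruction whose flag is set.
    """
    cycles: List[List[str]] = []
    current: List[str] = []
    for ins in stream:
        current.append(ins.text)
        if ins.end:
            cycles.append(current)
            current = []
    if current:
        # Assumption: trailing instructions without a flag form a last group.
        cycles.append(current)
    return cycles


stream = [
    Instr("i0"), Instr("i1", end=True),
    Instr("i2"), Instr("i3"), Instr("i4", end=True),
    Instr("i5", end=True),
]
cycles = split_into_cycles(stream)
for number, group in enumerate(cycles, start=1):
    print(f"cycle {number}: {group}")
```

The point of the sketch is that the group width is carried by the flags alone, so a cycle that uses only one operation unit occupies one instruction slot in the stream rather than a full fixed-width bundle.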

[0007]

[Embodiment] An embodiment of the present invention is now described in detail with reference to the drawings. FIG. 1 shows one embodiment of an object program based on the parallel instruction execution control method of the present invention. The procedure of the embodiment of FIG. 1 is described concretely below. The processor considered here can execute, simultaneously in one cycle, two integer operation instructions (or load/store instructions), one floating-point operation instruction and one branch instruction, and for simplicity it is assumed that no delays occur.

[0008] In FIG. 1, 11, 12, 13 and 14 each denote a parallel assembly instruction to be executed simultaneously in one cycle. Each parallel assembly instruction contains a variable number of assembly instructions. For example, parallel assembly instruction 11 contains assembly instructions 111 and 113, each to be executed by one of the integer arithmetic units. Here, 111 adds the integer 10 to the contents of register r8 and places the result in r8, and 113 subtracts the integer 1 from the contents of register r10 and places the result in r10. 112 and 114 are flags indicating whether instructions 111 and 113, respectively, are the end of the parallel assembly instruction: 112 is a flag that is not set, and 114 is a flag that is set. Similarly, parallel assembly instruction 12 contains instruction 121, which stores the contents of r8 to variable a, and instruction 122, which adds the contents of r8 and r10 and places the result in r8. Parallel assembly instruction 13 contains instruction 131, which branches to the address indicated by the label "label" if the contents of r8 equal 0, and parallel assembly instruction 14 contains instruction 141, which loads the contents of a into register r9.

[0009] In FIG. 1, four assembly instructions, namely 111, 113, 121 and 122, are read first. Because flag 114 of instruction 113 is set, only 111 and 113, that is, parallel assembly instruction 11, are executed in this cycle; 121 and 122 are not executed until the next cycle. In the next cycle, because the flag of 122 is set, 121 and 122, that is, parallel assembly instruction 12, are executed. Parallel assembly instructions 13 and 14 are executed in the same way in the subsequent cycles. Instruction 131 has no flag because it is assumed that a branch instruction always ends a parallel assembly instruction.
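The cycle-by-cycle behaviour of FIG. 1 can be traced with a short Python sketch. It models instruction issue only, not register contents, and it rests on several assumptions that are not in the patent: the mnemonics are invented, the fetch window is taken to be four instructions (matching the four operation units), the branch is treated as not taken, and instruction 141 is assumed to carry a set flag.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    ref: int                 # reference numeral used in FIG. 1
    text: str                # hypothetical mnemonic, for readability only
    end_flag: bool           # True when the boundary flag is set
    is_branch: bool = False  # a branch always ends its parallel group


FETCH_WIDTH = 4  # assumed window: 2 integer/load-store + 1 FP + 1 branch

PROGRAM: List[Instr] = [
    Instr(111, "add r8, r8, 10", False),
    Instr(113, "sub r10, r10, 1", True),
    Instr(121, "st r8, a", False),
    Instr(122, "add r8, r8, r10", True),
    Instr(131, "beq r8, 0, label", False, is_branch=True),
    Instr(141, "ld r9, a", True),  # flag state assumed, not stated
]


def issue_trace(program: List[Instr]) -> List[List[int]]:
    """Return the reference numerals issued in each cycle.

    Straight-line trace only: the branch is assumed not taken.
    """
    trace: List[List[int]] = []
    pc = 0
    while pc < len(program):
        window = program[pc:pc + FETCH_WIDTH]
        group: List[int] = []
        for ins in window:
            group.append(ins.ref)
            if ins.end_flag or ins.is_branch:
                break  # boundary reached; the rest waits for a later cycle
        trace.append(group)
        pc += len(group)  # the next cycle resumes where this one ended
    return trace


trace = issue_trace(PROGRAM)
for cycle, refs in enumerate(trace, start=1):
    print(f"cycle {cycle}: {refs}")
```

Under these assumptions the trace issues 111 and 113 in the first cycle, 121 and 122 in the second, 131 alone in the third and 141 alone in the fourth, which matches the four parallel assembly instructions 11 to 14 in the description.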

[0010] For comparison, FIG. 2 shows an example of an object program according to the conventional method. In FIG. 2, 21, 22, 23 and 24 denote parallel assembly instructions; here each parallel assembly instruction always consists of four assembly instructions, of which one or more may be null instructions, written NOP. For example, parallel assembly instruction 21 contains assembly instructions 211, 212, 213 and 214, of which 213 and 214 are null instructions. In the conventional method the instructions are read and executed in the order 21, 22, 23, 24, one per cycle. In this example more than half of the instructions are null instructions, so about twice as much memory, bus capacity and other resources are consumed as in the present invention.
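The size difference between the two formats can be checked with a small Python sketch. It assumes that FIG. 2 encodes the same six instructions as FIG. 1, grouped 2, 2, 1 and 1 per cycle, which the text implies but does not spell out for bundles 22 to 24. It also pads at the end of each bundle for simplicity, whereas a real fixed-width format would place each null instruction in the slot of the idle operation unit, and it counts instruction slots only, ignoring the extra flag bit. The names `pad_to_fixed_width`, `WIDTH` and `NOP` are hypothetical.

```python
from typing import List

WIDTH = 4    # instructions per conventional parallel assembly instruction
NOP = "nop"  # placeholder for the null instruction


def pad_to_fixed_width(groups: List[List[str]], width: int = WIDTH) -> List[List[str]]:
    """Pad every per-cycle group with null instructions up to a fixed width."""
    padded = []
    for group in groups:
        if len(group) > width:
            raise ValueError("group exceeds the number of operation units")
        padded.append(group + [NOP] * (width - len(group)))
    return padded


# Per-cycle groups of FIG. 1, identified by reference numeral.
groups = [["111", "113"], ["121", "122"], ["131"], ["141"]]

fixed = pad_to_fixed_width(groups)
fixed_slots = sum(len(bundle) for bundle in fixed)
nop_slots = sum(bundle.count(NOP) for bundle in fixed)
flagged_slots = sum(len(group) for group in groups)

print(f"conventional: {fixed_slots} slots, {nop_slots} of them null")
print(f"flag-delimited: {flagged_slots} slots")
```

With these assumptions the conventional layout needs 16 slots, 10 of them null, against 6 slots for the flag-delimited layout, which is consistent with the statement that more than half of the conventional instructions are null.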

[0011]

[Effects of the Invention] As described above, according to the present invention a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and, when the processor executes, the instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle. Unnecessary null instructions can therefore be eliminated from the object program, and parallel execution control can be performed easily without wasting resources such as memory and buses.

[Brief description of the drawings]

FIG. 1 is a diagram showing one embodiment of an object program based on the parallel instruction execution control method of the present invention.

FIG. 2 is a diagram showing an object program according to the conventional method.

[Explanation of symbols]

11 to 14: parallel assembly instructions
111, 113, 121, 122, 131, 141: assembly instructions to be executed by the respective arithmetic units
112, 114: flags

Claims

(57) [Claims]

1. A parallel instruction execution control method in which the operation units of a processor having a plurality of operation units are controlled in parallel by the assembly instructions contained in a parallel assembly instruction sequence that is an object program, wherein a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and, when the processor executes, the instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set are executed in parallel by the respective operation units, sequentially in each cycle.