JP2797833B2 - Parallel instruction execution control method - Google Patents
Parallel instruction execution control method
- Publication number
- JP2797833B2 (application numbers JP4115216A, JP11521692A)
- Authority
- JP
- Japan
- Prior art keywords
- parallel
- instruction
- assembly
- instructions
- execution control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Advance Control (AREA)
Description
[0001] [Field of Industrial Application] The present invention relates to a parallel instruction execution control method for controlling, in parallel, the operation units of a processor that has a plurality of operation units.
[0002] [Prior Art] In a processor of this kind, the dependencies among instructions and data are analyzed when the compiler generates the object program, and the assembly instructions are placed according to the result. The generated object program, an assembly instruction sequence, is a sequence of parallel assembly instructions, each of which contains as many assembly instructions as there are operation units. In a cycle in which one or more operation units have no instruction to execute, a null instruction, indicating that nothing is to be executed, is placed at the corresponding position in the parallel assembly instruction. At run time, in every cycle, one parallel assembly instruction, that is, a group of assembly instructions equal in number to the operation units, is fetched from the object program and decoded, and each operation unit is controlled according to the result.
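As a rough illustration of the fixed-width format described above, the following Python sketch pads each cycle's instructions with null instructions so that every parallel assembly instruction has one slot per operation unit. The unit names, the `pack_fixed` helper and the instruction mnemonics are assumptions made for this sketch; the text does not specify them.

```python
# Hypothetical unit names; the text only says there is one slot per operation unit.
UNITS = ("int0", "int1", "fp", "br")
NOP = "nop"


def pack_fixed(cycles, units=UNITS):
    """Turn a per-cycle schedule into fixed-width parallel assembly instructions.

    `cycles` is a list of dicts mapping a unit name to the instruction that
    unit executes in that cycle. Units with nothing to do get a NOP slot.
    """
    words = []
    for cycle in cycles:
        unknown = set(cycle) - set(units)
        if unknown:
            raise ValueError(f"unknown units: {sorted(unknown)}")
        words.append([cycle.get(unit, NOP) for unit in units])
    return words


# Hypothetical two-cycle schedule with illustrative mnemonics.
schedule = [
    {"int0": "add r8, 10", "int1": "sub r10, 1"},
    {"br": "beq r8, 0, label"},
]
words = pack_fixed(schedule)
for word in words:
    print(word)
```

Every word has the same width regardless of how many units are busy, which is where the null instructions come from.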
[0003] [Problem to Be Solved by the Invention] In the conventional method, however, it is often impossible in practice to have every operation unit execute an assembly instruction in every cycle, so the object program contains many null instructions, particularly for programs with low parallelism. As one means of solving this problem, Colwell R. P., Nix R. P., O'Donnell J. J., Papworth D. B. and Rodman P. K.: "A VLIW architecture for a trace scheduling compiler", IEEE Trans. Computers, 37, 8, pp. 967-979 (1988-08), propose a method in which the object program is compressed by attaching information on the position each instruction will occupy after expansion, and dedicated hardware expands the compressed object program in the cache memory at run time. This method, however, has the drawback that the expansion hardware becomes very complex.
[0004] Because the above problem stems from the compiler placing the object program statically, a method called superscalar has been proposed as an entirely different approach to parallel execution control, for example in McGeady S.: "The i960CA superscalar implementation of the 80960 architecture", Compcon Spring 90 digest of papers, pp. 232-240, IEEE (1990). It analyzes instruction dependencies dynamically at run time and controls the execution of a processor with a plurality of operation units on that basis. The superscalar method, however, requires complex and extensive hardware to analyze instruction dependencies and resource availability, and its dependency analysis is confined to a narrow range of instructions.
[0005] An object of the present invention is to solve the above problems and to provide a parallel instruction execution control method that makes parallel execution control of such a processor easy, using a small object program free of unnecessary null instructions and without complex hardware.
[0006] [Means for Solving the Problem] The parallel instruction execution control method of the present invention controls, in parallel, each operation unit of a processor having a plurality of operation units by means of the assembly instructions contained in a parallel assembly instruction sequence that is the object program. It is characterized in that a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and in that, when the processor runs, the assembly instructions from the previous end position up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle.
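The compiler-side half of this scheme can be sketched as follows: a minimal Python example, under assumptions, that flattens per-cycle groups into a single instruction stream and sets the end flag only on the last instruction of each group. The `encode_with_flags` name, the tuple representation and the mnemonics are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def encode_with_flags(groups: List[List[str]]) -> List[Tuple[str, bool]]:
    """Flatten per-cycle groups into one instruction stream.

    Each element is (instruction, end_flag); end_flag is True only on the
    last instruction of a group, marking the boundary of that cycle.
    """
    stream = []
    for group in groups:
        if not group:
            # Sketch-level choice: a flag needs an instruction to sit on.
            raise ValueError("a cycle with no instructions cannot carry a flag")
        for i, insn in enumerate(group):
            stream.append((insn, i == len(group) - 1))
    return stream


# Two cycles with two instructions each (illustrative mnemonics).
groups = [["add r8, 10", "sub r10, 1"], ["st r8, a", "add r8, r10"]]
stream = encode_with_flags(groups)
for insn, end_flag in stream:
    print(f"{insn:<14} end={int(end_flag)}")
```

No null instructions appear in the stream; the group boundary is carried entirely by the one-bit flag.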
[0007] [Embodiment] An embodiment of the present invention will now be described in detail with reference to the drawings. FIG. 1 shows an embodiment of an object program based on the parallel instruction execution control method of the present invention. The procedure of the embodiment of FIG. 1 is described concretely below. The processor considered here is assumed to be able to execute, simultaneously in one cycle, two integer operation instructions (or load/store instructions), one floating-point operation instruction and one branch instruction; for simplicity, no delays are assumed to occur.
[0008] In FIG. 1, 11, 12, 13 and 14 each denote a parallel assembly instruction to be executed simultaneously in one cycle. Each parallel assembly instruction contains a variable number of assembly instructions. For example, parallel assembly instruction 11 contains assembly instructions 111 and 113, each to be executed by one of the integer arithmetic units. Instruction 111 adds the integer 10 to the contents of register r8 and places the result in r8; instruction 113 subtracts the integer 1 from the contents of register r10 and places the result in r10. Items 112 and 114 are flags indicating whether instructions 111 and 113, respectively, are the end of the parallel assembly instruction: 112 is a flag that is not set, and 114 is a flag that is set. Similarly, parallel assembly instruction 12 contains instruction 121, which stores the contents of r8 to variable a, and instruction 122, which adds the contents of r8 and r10 and places the result in r8. Parallel assembly instruction 13 contains instruction 131, which branches to the address indicated by the label "label" if the contents of r8 equal 0, and parallel assembly instruction 14 contains instruction 141, which loads the contents of a into register r9.
[0009] In FIG. 1, four assembly instructions, namely 111, 113, 121 and 122, are read first. Because flag 114 of instruction 113 is set, instructions 111 and 113, that is, parallel assembly instruction 11, are executed in this cycle, and 121 and 122 are held until the next cycle. In the next cycle, because the flag of 122 is set, instructions 121 and 122, that is, parallel assembly instruction 12, are executed. Parallel assembly instructions 13 and 14 are executed in the same way in the following cycles. Instruction 131 has no flag because a branch instruction is assumed always to end a parallel assembly instruction.
[0010] For comparison, FIG. 2 shows an example of an object program in the conventional format. In FIG. 2, 21, 22, 23 and 24 denote parallel assembly instructions; here each parallel assembly instruction always consists of four assembly instructions, one or more of which may be null instructions, written NOP. For example, parallel assembly instruction 21 contains assembly instructions 211, 212, 213 and 214, of which 213 and 214 are null instructions. In the conventional method, 21, 22, 23 and 24 are read and executed in that order, one per cycle. More than half of the instructions in this case are null instructions, so resources such as memory and buses are wasted to the extent of about twice what the present invention uses.
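The size comparison can be checked with a small slot count. The Python sketch below assumes that words 22 to 24 of FIG. 2 hold the same useful instructions as 12 to 14 of FIG. 1 (the text describes only word 21 explicitly), and it counts instruction slots only, ignoring the extra flag field that each instruction carries in the flagged format.

```python
SLOTS_PER_WORD = 4  # conventional format: four assembly instructions per word

# Useful instructions per cycle in the example (11/21, 12/22, 13/23, 14/24).
# Cycles 2 to 4 are an assumption carried over from the FIG. 1 description.
useful_per_cycle = [2, 2, 1, 1]

useful = sum(useful_per_cycle)
conventional_slots = SLOTS_PER_WORD * len(useful_per_cycle)
nops = conventional_slots - useful

nop_fraction = nops / conventional_slots
size_ratio = conventional_slots / useful

print(f"conventional slots: {conventional_slots}, of which NOPs: {nops}")
print(f"NOP fraction: {nop_fraction:.3f}")
print(f"flagged-format slots: {useful}, size ratio: {size_ratio:.2f}x")
```

Under these assumptions the conventional program needs 16 slots against 6, which agrees with the statement that more than half of the slots are null instructions; once the flag field is accounted for, the real saving would be somewhat smaller than this raw slot ratio.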
[0011] [Effects of the Invention] As described above, according to the present invention, a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and at run time the operation units execute in parallel, cycle after cycle, the assembly instructions from the previous end position up to the assembly instruction whose flag is set. Unnecessary null instructions can therefore be eliminated from the object program, and parallel execution control can be performed easily without wasting resources such as memory and buses.
[FIG. 1] A diagram showing an embodiment of an object program based on the parallel instruction execution control method of the present invention.
[FIG. 2] A diagram showing an object program in the conventional format.
11 to 14: parallel assembly instructions
111, 113, 121, 122, 131, 141: assembly instructions to be executed by the arithmetic units
112, 114: flags
Claims (1)
1. A parallel instruction execution control method in which each operation unit of a processor having a plurality of operation units is controlled in parallel by the assembly instructions contained in a parallel assembly instruction sequence that is the object program, characterized in that a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and in that, when the processor runs, the assembly instructions from the previous end position up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP4115216A JP2797833B2 (en) | 1992-04-09 | 1992-04-09 | Parallel instruction execution control method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP4115216A JP2797833B2 (en) | 1992-04-09 | 1992-04-09 | Parallel instruction execution control method |
Publications (2)
Publication Number | Publication Date |
---|---|
JPH05289870A (en) | 1993-11-05 |
JP2797833B2 (en) | 1998-09-17 |
Family
ID=14657241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP4115216A, Expired - Fee Related, JP2797833B2 (en) | Parallel instruction execution control method | 1992-04-09 | 1992-04-09 |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2797833B2 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5669001A (en) * | 1995-03-23 | 1997-09-16 | International Business Machines Corporation | Object code compatible representation of very long instruction word programs |
US6324639B1 (en) | 1998-03-30 | 2001-11-27 | Matsushita Electric Industrial Co., Ltd. | Instruction converting apparatus using parallel execution code |
EP0953898A3 (en) | 1998-04-28 | 2003-03-26 | Matsushita Electric Industrial Co., Ltd. | A processor for executing Instructions from memory according to a program counter, and a compiler, an assembler, a linker and a debugger for such a processor |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS61245239A (en) * | 1985-04-23 | 1986-10-31 | Toshiba Corp | Logical circuit system |
- 1992-04-09: JP application JP4115216A, patent JP2797833B2, not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
JPH05289870A (en) | 1993-11-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0437044B1 (en) | Data processing system with instruction tag apparatus | |
US5341482A (en) | Method for synchronization of arithmetic exceptions in central processing units having pipelined execution units simultaneously executing instructions | |
EP0762270B1 (en) | Microprocessor with load/store operation to/from multiple registers | |
US9355061B2 (en) | Data processing apparatus and method for performing scan operations | |
EP0312764A2 (en) | A data processor having multiple execution units for processing plural classes of instructions in parallel | |
JP2610821B2 (en) | Multi-processor system | |
US5119495A (en) | Minimizing hardware pipeline breaks using software scheduling techniques during compilation | |
JPH04336378A (en) | Information processor | |
JPH02227730A (en) | Data processing system | |
US5850551A (en) | Compiler and processor for processing loops at high speed | |
JP2004529405A (en) | Superscalar processor implementing content addressable memory for determining dependencies | |
JPH03286332A (en) | Digital data processor | |
JPH09152973A (en) | Method and device for support of speculative execution of count / link register change instruction | |
JP2797833B2 (en) | Parallel instruction execution control method | |
DeWitt | A control word model for detecting conflicts between microprograms | |
AU626263B2 (en) | Apparatus and method for providing an extended processing environment on nonmicrocoded data processing system | |
US7899995B1 (en) | Apparatus, system, and method for dependent computations of streaming multiprocessors | |
US7107478B2 (en) | Data processing system having a Cartesian Controller | |
IE880818L (en) | Apparatus and method for synchronization of arithmetic¹exceptions in central processing units having pipelined¹execution units simultaneously executing instructions | |
JPH0454638A (en) | Electronic computer | |
JP2729795B2 (en) | Parallel computer and control method thereof | |
JP2551163B2 (en) | Command processing control method | |
JPH0279122A (en) | Floating point arithmetic mechanism | |
JP3079090B2 (en) | Processor and program conversion device | |
JPH03127171A (en) | Vector processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FPAY | Renewal fee payment (event date is renewal date of database) |
Free format text: PAYMENT UNTIL: 20070703 Year of fee payment: 9 |
|
FPAY | Renewal fee payment (event date is renewal date of database) |
Free format text: PAYMENT UNTIL: 20080703 Year of fee payment: 10 |
|
FPAY | Renewal fee payment (event date is renewal date of database) |
Free format text: PAYMENT UNTIL: 20090703 Year of fee payment: 11 |
|
LAPS | Cancellation because of no payment of annual fees |