JP2797833B2 - Parallel instruction execution control method - Google Patents

Parallel instruction execution control method

Info

Publication number
JP2797833B2
Authority
JP
Japan
Prior art keywords
parallel
instruction
assembly
instructions
execution control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP4115216A
Other languages
Japanese (ja)
Other versions
JPH05289870A (en)
Inventor
さおり 中村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP4115216A
Publication of JPH05289870A
Application granted
Publication of JP2797833B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Advance Control (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a parallel instruction execution control method for controlling, in parallel, the operation units of a processor that has a plurality of operation units.

[0002]

[Prior Art] In a processor of this kind, the dependencies among instructions and data are analyzed when the compiler generates the object program, and the assembly instructions are arranged on the basis of that analysis. The generated object program is a sequence of parallel assembly instructions, each of which contains as many assembly instructions as there are operation units. In a cycle in which one or more operation units have no instruction to execute, a null instruction, indicating that nothing is to be executed, is placed in the corresponding position of the parallel assembly instruction. At run time, one parallel assembly instruction, that is, a group of assembly instructions equal in number to the operation units, is fetched from the object program and decoded every cycle, and each operation unit is controlled according to the result.
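The padding described above can be illustrated with a minimal Python sketch. The function name `pad_bundle`, the `"nop"` marker, and the mnemonics are hypothetical and chosen only for illustration; the description does not specify an instruction format.

```python
NOP = "nop"  # hypothetical marker for the null instruction


def pad_bundle(issued, num_units):
    """Build one fixed-width parallel assembly instruction.

    `issued` maps an operation-unit index to the instruction that unit
    runs in this cycle. Every unit without an instruction gets a NOP,
    so the bundle always has exactly `num_units` slots.
    """
    unknown = set(issued) - set(range(num_units))
    if unknown:
        raise ValueError(f"no such operation unit: {sorted(unknown)}")
    return [issued.get(unit, NOP) for unit in range(num_units)]


# Two of four units are busy, so half of the bundle is padding.
bundle = pad_bundle({0: "add", 1: "sub"}, num_units=4)
print(bundle)  # ['add', 'sub', 'nop', 'nop']
```

Because the bundle width is fixed at the number of operation units, every idle unit costs one stored and fetched slot, which is the overhead the rest of the description sets out to remove.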

[0003]

[Problems to Be Solved by the Invention] In the conventional approach, however, it is in practice often impossible to have every operation unit execute an assembly instruction in every cycle, and, particularly for programs with low parallelism, the object program ends up containing many null instructions. As one means of solving this problem, Colwell R.P., Nix R.P., O'Donnell J.J., Papworth D.B. and Rodman P.K.: "A VLIW architecture for a trace scheduling compiler", IEEE Trans. Computers, 37, 8, pp. 967-979 (1988-08) proposes a scheme in which the object program is compressed by attaching position information for the expanded instructions, and the compressed object program is expanded in cache memory at run time by dedicated hardware. The drawback of this scheme is that the expansion hardware becomes very complex.

[0004] Moreover, because the above problem stems from the compiler arranging the object program statically, a means of parallel execution control that approaches the problem from an entirely different angle has also been proposed, for example in McGeady S.: "The i960CA superscalar implementation of the 80960 architecture", Compcon Spring 90 digest of papers, pp. 232-240, IEEE (1990). This scheme, called superscalar, analyzes instruction dependencies dynamically at run time and controls the execution of a processor having a plurality of operation units on that basis. The superscalar scheme, however, requires complex and extensive hardware to analyze instruction dependencies and resource availability, and its dependency analysis is confined to a narrow range of instructions.

[0005] An object of the present invention is to resolve the above problems and to provide a parallel instruction execution control method that makes parallel execution control of such a processor easy, without complex hardware, by means of a small object program that contains no unnecessary null instructions.

[0006]

[Means for Solving the Problems] The parallel instruction execution control method of the present invention controls, in parallel, each operation unit of a processor having a plurality of operation units by means of the assembly instructions contained in a parallel assembly instruction sequence that constitutes the object program. It is characterized in that a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and in that, when the processor runs, the assembly instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle.

[0007]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 shows an example of an object program based on the parallel instruction execution control method of the present invention. The procedure of the embodiment of FIG. 1 is described concretely below. The processor considered here is assumed to be able to execute, simultaneously in one cycle, two integer operation instructions (or load/store instructions), one floating-point operation instruction, and one branch instruction. For simplicity, no delays are assumed to occur.

[0008] In FIG. 1, 11, 12, 13, and 14 each denote a parallel assembly instruction whose contents are to be executed simultaneously in one cycle. A parallel assembly instruction contains a variable number of assembly instructions. For example, parallel assembly instruction 11 contains assembly instructions 111 and 113, each to be executed by one of the integer operation units. Here, 111 adds the integer 10 to the contents of register r8 and places the result in r8, and 113 subtracts the integer 1 from the contents of register r10 and places the result in r10. Numerals 112 and 114 are flags indicating whether instructions 111 and 113, respectively, are the end of the parallel assembly instruction: 112 is a flag that is not set, and 114 is a flag that is set. Similarly, parallel assembly instruction 12 contains instruction 121, which stores the contents of r8 to the variable a, and instruction 122, which adds the contents of r8 and r10 and places the result in r8. Parallel assembly instruction 13 contains instruction 131, which branches to the address indicated by the label "label" if the contents of r8 equal 0, and parallel assembly instruction 14 contains instruction 141, which loads the contents of a into register r9.
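One way to picture the flag field is as a single bit inside each instruction word. The sketch below is only an illustration under stated assumptions: the 32-bit word width, the choice of the top bit, and the payload values are all hypothetical, since the description says only that the flag occupies a field of the instruction.

```python
FLAG_BIT = 31                 # assumed bit position of the end-of-group flag
FLAG_MASK = 1 << FLAG_BIT
PAYLOAD_MASK = FLAG_MASK - 1  # remaining 31 bits hold opcode and operands


def encode(payload, end_of_group):
    """Pack an instruction payload and its end-of-group flag into one word."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 31 bits")
    return payload | (FLAG_MASK if end_of_group else 0)


def decode(word):
    """Split a word back into (payload, end_of_group)."""
    return word & PAYLOAD_MASK, bool(word & FLAG_MASK)


# Placeholder payloads standing in for instructions 111 and 113.
word_111 = encode(0x0123, end_of_group=False)  # flag 112: not set
word_113 = encode(0x0456, end_of_group=True)   # flag 114: set

print(decode(word_111))  # (291, False)
print(decode(word_113))  # (1110, True)
```

With this layout the boundary information costs one bit per instruction rather than one full slot per idle operation unit.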

[0009] In FIG. 1, four assembly instructions, namely 111, 113, 121, and 122, are read first. Because flag 114 is set on instruction 113, only 111 and 113, that is, parallel assembly instruction 11, are executed in this cycle; 121 and 122 are not executed until the next cycle. In the next cycle, because the flag on 122 is set, 121 and 122, that is, parallel assembly instruction 12, are executed. Parallel assembly instructions 13 and 14 are executed in the subsequent cycles in the same way. Instruction 131 has no flag because a branch instruction is assumed always to end a parallel assembly instruction.
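The cycle-by-cycle grouping in this walk-through can be sketched in Python as follows. This is a simplified model, not the patented hardware: the tuple layout and the name `issue_groups` are hypothetical, the flag on instruction 141 is assumed to be set (the text does not say), and the branch is treated as not taken so that the program is simply read in order.

```python
FETCH_WIDTH = 4  # the walk-through reads four assembly instructions at a time

# (reference numeral, end flag set?, is branch?)
PROGRAM = [
    (111, False, False),  # flag 112 is not set
    (113, True, False),   # flag 114 is set
    (121, False, False),
    (122, True, False),
    (131, False, True),   # branch: has no flag, always ends its group
    (141, True, False),   # assumption: flag set, not stated in the text
]


def issue_groups(program, width=FETCH_WIDTH):
    """Return the list of instructions issued in each cycle.

    Each cycle starts where the previous one ended and takes
    instructions up to and including the first one that has its
    end flag set or is a branch. A group never exceeds `width`.
    """
    groups = []
    pos = 0
    while pos < len(program):
        group = []
        for ref, end_flag, is_branch in program[pos:pos + width]:
            group.append(ref)
            if end_flag or is_branch:
                break
        groups.append(group)
        pos += len(group)
    return groups


print(issue_groups(PROGRAM))  # [[111, 113], [121, 122], [131], [141]]
```

The four groups correspond to parallel assembly instructions 11 to 14. The only state carried between cycles is the position where the previous group ended, so no dependency analysis is needed at run time.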

[0010] For comparison, FIG. 2 shows an example of an object program under the conventional scheme. In FIG. 2, 21, 22, 23, and 24 denote parallel assembly instructions, but here each parallel assembly instruction always consists of four assembly instructions, one or more of which may be null instructions, written NOP. For example, parallel assembly instruction 21 contains assembly instructions 211, 212, 213, and 214, of which 213 and 214 are null instructions. In the conventional scheme, 21, 22, 23, and 24 are read and executed in that order, one per cycle. In this case more than half of the instructions are null instructions, so the conventional scheme wastes resources such as memory and buses, using roughly twice as much as the present invention.
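The size comparison can be checked with a short calculation. The sketch assumes that FIG. 2 encodes the same six instructions as FIG. 1, issued two, two, one, and one per cycle; the figures themselves are not reproduced here, so this mapping is an assumption. It also counts instruction slots only and ignores the extra flag bit carried by each instruction in the flagged format.

```python
SLOTS_PER_BUNDLE = 4           # conventional bundles 21 to 24 have four slots each
useful_per_cycle = [2, 2, 1, 1]  # assumed non-NOP instructions per cycle

useful = sum(useful_per_cycle)                           # 6 real instructions
conventional = SLOTS_PER_BUNDLE * len(useful_per_cycle)  # 16 slots in total
nops = conventional - useful                             # 10 null instructions
nop_share = nops / conventional                          # fraction of padding
size_ratio = conventional / useful                       # conventional vs. flagged

print(f"{nops} of {conventional} slots are NOPs ({nop_share:.1%})")
print(f"conventional program is {size_ratio:.2f} times the flagged one")
```

Under these assumptions ten of the sixteen slots are NOPs, which matches the statement that more than half are null instructions, and the slot-count ratio comes out somewhat above the "roughly twice" quoted in the description.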

[0011]

[Effects of the Invention] As described above, according to the present invention, a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and at run time the operation units execute in parallel, cycle after cycle, the assembly instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set. Unnecessary null instructions can therefore be eliminated from the object program, and parallel execution control can be performed easily without wasting resources such as memory and buses.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing an example of an object program based on the parallel instruction execution control method of the present invention.

FIG. 2 is a diagram showing an object program under the conventional scheme.

[Explanation of Symbols]

11 to 14: parallel assembly instructions
111, 113, 121, 122, 131, 141: assembly instructions to be executed by the operation units
112, 114: flags

Claims (1)

(57) [Claims]

1. A parallel instruction execution control method in which each operation unit of a processor having a plurality of operation units is controlled in parallel by the assembly instructions contained in a parallel assembly instruction sequence that constitutes an object program, characterized in that a flag indicating the boundary of the parallel assembly instruction for one cycle of the processor is set in a field of the parallel assembly instruction, and in that, when the processor runs, the assembly instructions from the position where the previous cycle ended up to the assembly instruction whose flag is set are executed in parallel by the operation units, cycle after cycle.
JP4115216A 1992-04-09 1992-04-09 Parallel instruction execution control method Expired - Fee Related JP2797833B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP4115216A JP2797833B2 (en) 1992-04-09 1992-04-09 Parallel instruction execution control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP4115216A JP2797833B2 (en) 1992-04-09 1992-04-09 Parallel instruction execution control method

Publications (2)

Publication Number Publication Date
JPH05289870A JPH05289870A (en) 1993-11-05
JP2797833B2 true JP2797833B2 (en) 1998-09-17

Family

ID=14657241

Family Applications (1)

Application Number Title Priority Date Filing Date
JP4115216A Expired - Fee Related JP2797833B2 (en) 1992-04-09 1992-04-09 Parallel instruction execution control method

Country Status (1)

Country Link
JP (1) JP2797833B2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5669001A (en) * 1995-03-23 1997-09-16 International Business Machines Corporation Object code compatible representation of very long instruction word programs
US6324639B1 (en) 1998-03-30 2001-11-27 Matsushita Electric Industrial Co., Ltd. Instruction converting apparatus using parallel execution code
EP0953898A3 (en) 1998-04-28 2003-03-26 Matsushita Electric Industrial Co., Ltd. A processor for executing Instructions from memory according to a program counter, and a compiler, an assembler, a linker and a debugger for such a processor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61245239A (en) * 1985-04-23 1986-10-31 Toshiba Corp Logical circuit system

Also Published As

Publication number Publication date
JPH05289870A (en) 1993-11-05

Similar Documents

Publication Publication Date Title
EP0437044B1 (en) Data processing system with instruction tag apparatus
US5341482A (en) Method for synchronization of arithmetic exceptions in central processing units having pipelined execution units simultaneously executing instructions
EP0762270B1 (en) Microprocessor with load/store operation to/from multiple registers
US9355061B2 (en) Data processing apparatus and method for performing scan operations
EP0312764A2 (en) A data processor having multiple execution units for processing plural classes of instructions in parallel
JP2610821B2 (en) Multi-processor system
US5119495A (en) Minimizing hardware pipeline breaks using software scheduling techniques during compilation
JPH04336378A (en) Information processor
JPH02227730A (en) Data processing system
US5850551A (en) Compiler and processor for processing loops at high speed
JP2004529405A (en) Superscalar processor implementing content addressable memory for determining dependencies
JPH03286332A (en) Digital data processor
JPH09152973A (en) Method and device for support of speculative execution of count / link register change instruction
JP2797833B2 (en) Parallel instruction execution control method
DeWitt A control word model for detecting conflicts between microprograms
AU626263B2 (en) Apparatus and method for providing an extended processing environment on nonmicrocoded data processing system
US7899995B1 (en) Apparatus, system, and method for dependent computations of streaming multiprocessors
US7107478B2 (en) Data processing system having a Cartesian Controller
IE880818L (en) Apparatus and method for synchronization of arithmetic¹exceptions in central processing units having pipelined¹execution units simultaneously executing instructions
JPH0454638A (en) Electronic computer
JP2729795B2 (en) Parallel computer and control method thereof
JP2551163B2 (en) Command processing control method
JPH0279122A (en) Floating point arithmetic mechanism
JP3079090B2 (en) Processor and program conversion device
JPH03127171A (en) Vector processor

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20070703

Year of fee payment: 9

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080703

Year of fee payment: 10

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090703

Year of fee payment: 11

LAPS Cancellation because of no payment of annual fees