JP3311381B2

JP3311381B2 - Instruction scheduling method in compiler

Info

Publication number: JP3311381B2
Application number: JP06198392A
Authority: JP
Inventors: 正和林; 晶彦塩谷; 寛五十嵐; 学松山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-03-18
Filing date: 1992-03-18
Publication date: 2002-08-05
Anticipated expiration: 2017-08-05
Also published as: JPH05265769A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Field of Industrial Application] The present invention relates to an instruction scheduling method in a compiler that can generate an object program with good execution performance for a program running on a computer having multiple arithmetic units.

[0002] Recently, processors that execute instructions by operating multiple arithmetic units simultaneously, such as so-called superscalar computers and VLIW computers, have come into use. To use the hardware resources of these computers effectively and raise program execution performance, the compiler must perform instruction scheduling and register allocation appropriately.

[0003] [Prior Art] Instruction scheduling is a compiler optimization technique that reduces hardware delays by reordering instructions. In particular, on hardware using a parallel RISC processor that can operate multiple arithmetic units simultaneously, the technique is called parallel instruction scheduling: it reorders instructions so that as many arithmetic units as possible are used at the same time.

[0004] Parallel instruction scheduling searches for multiple instructions that have no conflicts over data or arithmetic units and reorders the instructions so that they execute simultaneously. For instruction scheduling to be effective, hardware resources such as registers and arithmetic units must therefore be assigned separately to each instruction.

[0005] Register allocation is also an important compiler optimization technique that has existed from the beginning. In general, register allocation deals with the problem of how to use a finite number of registers effectively, so it has been carried out under a policy of sharing resources as much as possible. For example, given two different data items A and B whose live ranges do not overlap, the technique tries to assign the same register to both.

[0006] Parallel instruction scheduling and register allocation are both useful compiler optimization techniques, but with respect to register use they pursue conflicting goals. From the viewpoint of parallel instruction scheduling, the registers used by the instructions should have as few mutual dependencies as possible, that is, different registers should be assigned wherever possible. From the viewpoint of register allocation, the same register should be assigned to the instructions wherever possible.

[0007] However, in conventional compilers that support both instruction scheduling and register allocation, the instruction scheduling phase and the register allocation phase are generally independent: either the register allocation phase is entered after the instruction scheduling phase has finished, or the instruction scheduling phase is entered after the register allocation phase has finished.

[0008] [Problems to Be Solved by the Invention] Because instruction scheduling and register allocation differ in purpose as described above, the following problems arise depending on the order in which they are applied, and these have been performance problems.

[0009] 1) When instruction scheduling is performed before register allocation

To increase the effect of instruction scheduling, data renaming is performed frequently, and instructions are moved without regard to register live ranges.

[0010] If actual registers could be assigned to all data as is, this code would be the best possible. However, because the number of registers is finite, registers may run short. In that case the register allocation phase must issue spill code, that is, instructions that save register values, and performance degrades accordingly. Moreover, since this spill code is not subject to instruction scheduling, arithmetic units may be used wastefully, and execution performance suffers.

[0011] Data renaming is an optimization technique often used during instruction scheduling and refers to the following processing. For example, consider the following intermediate text sequence (a) to (d).

[0012]
    (a) A = B + C                     A = B + C
    (b) D = A + E         →           D = A + E
    (c) A = F - G     (renaming)      X = F - G
    (d) H = I - A                     H = I - X

Here, changing variable A in (c) to X and replacing every occurrence of variable A after (c) (in this case, the one in (d)) with X is called renaming.
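As an illustration, the renaming of (a) to (d) can be sketched in Python as follows. The representation of an instruction as a `(dest, expr)` pair and the function name `rename_from` are assumptions made for this sketch and do not come from the specification. The sketch also assumes that renaming stops if the old name is defined again later, a case the example above does not cover.

```python
import re

def rename_from(instrs, index, old, new):
    """Rename `old` to `new` in the definition at `index` and in later uses.

    Each instruction is a (dest, expr) pair, e.g. ("A", "B + C").
    Assumption: renaming stops once `old` is defined again.
    """
    result = list(instrs[:index])
    dest, expr = instrs[index]
    assert dest == old, "the instruction at `index` must define `old`"
    result.append((new, expr))            # the definition itself is renamed
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    active = True                         # True while uses refer to the renamed value
    for dest, expr in instrs[index + 1:]:
        if active:
            expr = pattern.sub(new, expr)
        if dest == old:
            active = False                # a fresh definition of `old` ends the renaming
        result.append((dest, expr))
    return result

before = [("A", "B + C"), ("D", "A + E"), ("A", "F - G"), ("H", "I - A")]
after = rename_from(before, 2, "A", "X")
for dest, expr in after:
    print(f"{dest} = {expr}")
```

After the call, instruction (c) writes X instead of A, so it no longer shares a destination with (a).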

[0013] Renaming removes the data dependency between (a) and (c), so the two can be executed in parallel.

2) When instruction scheduling is performed after register allocation

In this case, register allocation may assign the same register to two data items that could otherwise be processed in parallel, which hinders data parallelism. The effect of instruction scheduling is therefore reduced.

[0014] Regarding this problem, some existing systems try to allocate registers with as much awareness of parallelism as possible at register allocation time. However, they consider only the parallelism of the data and not the parallelism that the hardware provides, so the processing is insufficient.

[0015] An object of the present invention is to solve the above problems, to balance parallelism-first instruction scheduling against register allocation that reduces the number of registers, and to enable generation of object code with good execution efficiency.

[0016] [Means for Solving the Problems] FIG. 1 illustrates the principle of the present invention. In FIG. 1, 10 denotes the source program to be compiled; 11, the compile processing device on which the compiler runs; 12, the front-end processing phase, which reads the source program 10, performs syntax and semantic analysis, and outputs the result as intermediate text; 13, the optimization phase, which performs general optimizations on the intermediate text; 14, the instruction scheduling and register allocation phase according to the present invention; 15, the instruction output phase, which outputs the instructions of the object program; and 16, the object program resulting from compilation.

[0017] In the present invention, when compiling a source program 10 that is to run on a computer having multiple arithmetic units that can each operate in parallel, instruction scheduling, which determines the order of instructions, and register allocation, which determines the registers used by each instruction, are both executed in the same instruction scheduling and register allocation phase 14, based on the optimized intermediate text.

[0018] The instruction scheduling and register allocation phase 14 has a parallelism-priority scheduling means 14-2 and a scheduling means 14-3 that reduces the number of simultaneously active registers, and switches between them by means of a register usage checking means 14-1. That is, the intermediate text is analyzed and the estimated number of registers in use is compared with the number of registers actually available. Based on this comparison, the phase switches between instruction scheduling that raises parallelism in the use of the arithmetic units and instruction scheduling that reduces the number of simultaneously active registers. This makes it possible to output object code that uses the multiple arithmetic units and the registers effectively.

[0019] When the number of registers to be assigned is insufficient even after instructions are reordered, a spill code insertion means 14-4 inserts spill code, that is, instructions that save register contents to another location, and that spill code is also made subject to instruction scheduling.

[0020] [Operation] The problems pointed out in the description of the prior art have the following causes.

- Information about the hardware resources in use (registers, arithmetic units) is not propagated well between register allocation and instruction scheduling.

[0021] - Conventional instruction scheduling moved instructions solely under a policy of raising their parallelism and had no concept of register live ranges.

To remedy these two shortcomings, the present invention performs the following processing.

[0022] a) Instruction scheduling and register allocation are performed at the same time (instruction scheduling is performed with reference to register usage).

b) Two instruction scheduling methods are provided, one that raises parallelism and one that reduces register live ranges, and they are used selectively by comparing the number of usable registers with the number of registers actually needed.

[0023] c) When spill code must be output, the spill code is also made subject to scheduling.

With the above processing, the parallelism of execution on the arithmetic units can be raised while the available registers are used to the fullest, and the execution time of the compiled object program 16 can be shortened. In particular, because spill code is also subject to scheduling, waste in the use of hardware resources (registers, arithmetic units) can be eliminated even when the number of registers actually available is insufficient.

[0024] [Embodiments] Before the embodiment of the present invention is described, the terms used in the description are explained.

[0025] [Explanation of terms]

(1) Live range

A live range is the range over which the contents of a data item are valid.

[0026] It is the range from the instruction that defines the value of a register or variable (the instruction that writes the value) to the instruction that last uses that value. For example, suppose there is an instruction sequence in assembler notation as shown in FIG. 2. In this assembler notation, the last operand is the one being defined. The first instruction loads the variable X in memory into register R1 and thus defines the value of R1. The second instruction adds the value of register R1 and the value of register R2 and sets the result in register R3; it references the value of register R1. The third instruction is a subtraction and references the value of register R1. The fourth instruction assigns the value of a new variable Y to register R1.

[0027] Therefore, the value defined by the first instruction is last used by the third instruction, and the register live range of R1 extends from the first instruction to the third.

(2) Basic block

A basic block is a unit consisting of a sequence of consecutive instructions (intermediate text) in which control is given to the first instruction (intermediate text) and thereafter neither branches nor halts partway.

[0028] (3) Entry busy

In general, data that is valid (already has a value set) when control passes to a basic block is called entry-busy data.

[0029] In the present invention, multiple basic blocks are treated as one scheduling range. In this definition, "basic block" may therefore be read as "scheduling range" (the same applies below).

[0030] (4) Exit busy

In general, when data of the basic block being processed is referenced in another basic block (in the present invention, another scheduling range), that data is called exit-busy data.

[0031] (5) IP

IP is an abbreviation of Immediate Predecessor. Considering the flow of a program, a block that may pass control directly to a basic block A is called an IP of basic block A.

[0032] For example, given the control flow shown in FIG. 3(a), the IPs are as shown in FIG. 3(b). That is, basic block A has no IP, the IP of basic blocks B and C is A, and the IPs of basic block D are B and C.

[0033] (6) Global data

Global data is data that is defined and referenced across multiple basic blocks (in the present invention, multiple scheduling ranges).

[0034] (7) Dependency

Suppose there are instructions A and B, and instruction B references the execution result of instruction A. Instruction B then cannot execute until instruction A has executed. When one of two instructions is affected by the other in this way, the two are said to have a dependency.

[0035] (8) Simultaneously active (registers)

At an arbitrary clock t, the registers whose data is valid (whose live range contains t) are said to be simultaneously active.
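The definitions of live range (1) and simultaneously active registers (8) can be illustrated with a short Python sketch. FIG. 2 is not reproduced here, so the instruction sequence below is a hypothetical one that matches the textual description of the load, add, subtract, and load instructions; the operands R4 and R5 of the subtraction are assumptions. The sketch also uses the instruction position as a stand-in for the clock.

```python
def live_ranges(instrs):
    """Return (register, def_index, last_use_index) tuples.

    Each instruction is (opcode, source_registers, dest_register).
    A register that is redefined starts a new live range.
    """
    open_ranges = {}      # register -> [def_index, last_use_index]
    finished = []
    for i, (_op, sources, dest) in enumerate(instrs):
        for reg in sources:
            if reg in open_ranges:
                open_ranges[reg][1] = i           # extend the range to this use
        if dest in open_ranges:                   # redefinition closes the old range
            start, end = open_ranges.pop(dest)
            finished.append((dest, start, end))
        open_ranges[dest] = [i, i]
    finished.extend((r, s, e) for r, (s, e) in open_ranges.items())
    return sorted(finished, key=lambda x: (x[1], x[0]))

def simultaneously_active(ranges, t):
    """Registers whose live range contains position t."""
    return sorted({reg for reg, start, end in ranges if start <= t <= end})

program = [
    ("load", [], "R1"),              # load X from memory into R1
    ("add", ["R1", "R2"], "R3"),     # R3 = R1 + R2
    ("sub", ["R1", "R4"], "R5"),     # R5 = R1 - R4 (R4, R5 are assumed operands)
    ("load", [], "R1"),              # load Y from memory into R1
]
ranges = live_ranges(program)
print(ranges)
print(simultaneously_active(ranges, 1))
```

The first range reported for R1 runs from position 0 to position 2, which corresponds to the range from the first instruction to the third. Registers that are only read in the sequence (R2, R4) are ignored by this sketch; in the method described they would be entry-busy data.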

[0036] This concludes the explanation of terms. The following techniques used in the embodiment are described together at the end of the description of the embodiment, as Technical Descriptions 1 and 2.

[Technical Description 1] An instruction scheduling method that reduces the number of simultaneously active register live ranges.

[Technical Description 2] The procedure for inserting spill code.

[0037] FIG. 4 shows the processing flow of the instruction scheduling and register allocation phase according to the embodiment of the present invention. In the instruction scheduling and register allocation phase 14 shown in FIG. 1, processing steps 1) to 5) shown in FIG. 4 are performed, for example. The processing in the other phases may be the same as in the prior art and is well known, so a detailed description is omitted here. The description below follows steps 1) to 5) shown in FIG. 4.

[0038] 1) [Setting entry-busy information (see the explanation of terms above)]

This scheduling is performed on one scheduling range. A scheduling range is normally a basic block (see the explanation of terms). However, the scheduling range may also be widened, for example by global scheduling. Whether the scheduling range is a basic block is not essential to this method.

[0039] FIG. 5 illustrates a data structure representing the correspondence between global data (see the explanation of terms) and registers. For the entry and the exit of each scheduling range, correspondence information between global data and the number of the register to which that data is assigned is managed in a queue using the data structure shown in FIG. 5. Information between scheduling ranges (which global data is held in which register) is propagated using the data structure shown in FIG. 5.

[0040] Information such as whether a data item is global data can be obtained by the data flow analysis of an ordinary compiler. Here, setting the information means the following processing.

[0041] a) A virtual register (temporary) is prepared for each entry-busy data item. These are treated as registers already in use and are reflected in the initial value of the variable that records the number of registers, described in Technical Description 1 below.

[0042] b) The information (the relationship between data and register numbers) of the IPs (see the explanation of terms) of this scheduling range is taken in and recorded in a data structure like the one shown in FIG. 5. It is used during the register allocation described later.
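Steps a) and b) above can be sketched in Python as follows. FIG. 5 is not reproduced here, so the class `ScheduleRange`, its field names, and the use of plain dictionaries instead of queues are all assumptions made for illustration. How to resolve a conflict when two IPs hold the same data in different registers is not stated in this section, so the sketch simply keeps the first register it sees.

```python
from dataclasses import dataclass, field

@dataclass
class ScheduleRange:
    name: str
    preds: list = field(default_factory=list)        # IPs (immediate predecessors)
    entry_busy: set = field(default_factory=set)     # data valid on entry
    entry_regs: dict = field(default_factory=dict)   # global data -> register number
    exit_regs: dict = field(default_factory=dict)    # filled in by step 5)

def set_entry_info(rng):
    """Import register assignments from the IPs for entry-busy data.

    Returns the initial number of registers treated as already in use,
    one temporary per entry-busy data item (step a).
    """
    for pred in rng.preds:
        for data, reg in pred.exit_regs.items():
            if data in rng.entry_busy:
                # Assumption: the first IP seen wins; conflicts are not resolved here.
                rng.entry_regs.setdefault(data, reg)
    return len(rng.entry_busy)

a = ScheduleRange("A", exit_regs={"x": 3, "y": 5})
b = ScheduleRange("B", preds=[a], entry_busy={"x"})
initial_in_use = set_entry_info(b)
print(initial_in_use, b.entry_regs)
```

In this example only `x` is entry busy in range B, so one register counts as in use from the start and the assignment of `x` to register 3 is carried over from A.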

[0043] 2) [Creating the DAG]

A DAG (Directed Acyclic Graph) is created with each intermediate text as a node.

[0044] A DAG is a graph that shows the order in which instructions can be executed. For example, given an intermediate text instruction sequence like that in FIG. 6(a), the DAG has the content shown in FIG. 6(b). An instruction that references the value of t4 defined by another instruction cannot execute until that defining instruction has executed. Similarly, an instruction that references the values of t2 and t6, each defined by another instruction, cannot execute until both defining instructions have executed.

[0045] The details of the DAG and how to create it are well known (see, for example, Aho, Sethi, and Ullman, translated by Harada, "Compilers II: Principles, Techniques, and Tools", Chapter 9), so the explanation here is limited to the above.

[0046] At this time, the definition and reference relationships of operands among the instructions are held in a data structure like the one shown in FIG. 7. That is, as shown in FIG. 7, a list of the definition (DEF) and the references (USE) is managed by a chain for each virtual register (temporary) identifier id. The definition field indicates a DAG node, and the reference field indicates a list of nodes and clock information. This structure is used in the scheduling described below.
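The DAG and the DEF/USE table can be sketched together in Python. FIG. 6 and FIG. 7 are not reproduced here, so the instruction sequence, the tuple layout `(dest, op, sources)`, and the dictionary layout of the table are assumptions. The sketch records only dependencies from a definition to its uses and assumes each temporary is defined once.

```python
from collections import defaultdict

def build_dag(instrs):
    """Build dependency edges and a DEF/USE table.

    Returns (parents, def_use) where parents[i] is the set of nodes that must
    execute before node i, and def_use[temp] = {"def": node, "use": [[node, clock], ...]}.
    """
    parents = defaultdict(set)
    def_use = {}
    for i, (dest, _op, sources) in enumerate(instrs):
        parents[i]                                     # ensure every node has an entry
        for src in sources:
            if src in def_use:
                parents[i].add(def_use[src]["def"])    # node i depends on the definer
                def_use[src]["use"].append([i, None])  # clock is filled in when scheduled
        def_use[dest] = {"def": i, "use": []}
    return dict(parents), def_use

instrs = [
    ("t1", "load", ["a"]),
    ("t2", "load", ["b"]),
    ("t3", "add", ["t1", "t2"]),
    ("t4", "mul", ["t3", "t1"]),
]
parents, def_use = build_dag(instrs)
print(parents)
print(def_use["t1"])
```

The `None` placeholder in each USE entry stands for the clock information that the text says is attached to the reference list once an instruction has been scheduled.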

[0047] 3) [Instruction scheduling]

In instruction scheduling, steps 3-1) to 3-4) described below are executed for each node of the DAG, one at a time.

[0048] During this instruction scheduling, register usage (register live ranges) is tracked at all times. Depending on the balance between the register live ranges and the number of registers usable at that time, processing falls into the following three cases.

[0049] Case 1: There are enough registers. Instructions can be scheduled without regard to register usage.

Case 2: Depending on how instructions are placed, the number of registers used may exceed the number of usable registers. Instructions must therefore be placed with the number of registers in mind.

[0050] Case 3: The number of registers is absolutely insufficient for placing the instruction, so spill code must be output.

These cases are distinguished by comparing the increase or decrease in register live ranges when each instruction is executed with the registers usable in each scheduling range.

[0051] The flow of instruction scheduling is as follows.

3-1) Determining candidates

Scheduling candidates are searched for in the DAG.

[0052] A scheduling candidate is a node in the DAG that has no parent (predecessor) or whose parents have all been processed already. Scheduling candidates are placed in the scheduling candidate list in order of priority.
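Candidate selection in step 3-1) can be sketched as below. The text does not say how priority is computed, so the `priority` table here is a hypothetical input (for example, a node's height in the DAG), and the function name `ready_candidates` is likewise an assumption.

```python
def ready_candidates(parents, done, priority):
    """Nodes with no parent, or whose parents are all processed, highest priority first."""
    ready = [n for n, ps in parents.items() if n not in done and ps <= done]
    return sorted(ready, key=lambda n: (-priority[n], n))

# parents[i] is the set of nodes that must be processed before node i.
parents = {0: set(), 1: set(), 2: {0, 1}, 3: {0, 2}}
priority = {0: 3, 1: 2, 2: 2, 3: 1}       # hypothetical priorities

print(ready_candidates(parents, set(), priority))
print(ready_candidates(parents, {0, 1}, priority))
```

Node 2 becomes a candidate only after both of its parents, nodes 0 and 1, have been processed.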

[0053] Steps 3-2) to 3-4) below are applied to the nodes on the scheduling candidate list.

3-2) Determining the arithmetic unit

In an architecture with multiple arithmetic units, what can be executed is fixed for each instruction. The arithmetic unit for an instruction can be found by creating a table indexed by instruction.

[0054] 3-3) Finding the execution time τ of an instruction

The execution time τ of instruction n is obtained by the following steps.

a) Obtain the clock t at which n would be scheduled if the number of registers were ignored.

[0055] This uses the conventional technique that looks at the dependencies of n and at how the arithmetic units are used. However, the placement of instruction n is not decided at this point. Here, the placement of instruction n is expressed as an arithmetic unit and the clock t at which the instruction is (expected to be) executed on that unit.

[0056] b) For the t obtained in step a), calculate the number of registers active at that time. The estimated register usage can be calculated as follows.

[0057]
    estimated register usage (t, n) = current number of registers in use (t) + change in register count due to the instruction (n)

Here, when instruction n defines or references register data (r1, …, rn), the change in register count due to the instruction (n) is as follows.

[0058]
    change in register count (n) = Σ ( if (ri is a definition) +1
                                       else if (ri is a reference and ri does not appear afterward) -1
                                       else 0 )
    (where Σ is the sum over i from 1 to n)

The definitions and references of each register (and the reference counts) can be computed when the DAG is generated, so they need only be recorded at that time. The data structure of FIG. 7 can be used for this.
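The estimate in [0057] and [0058] translates directly into a few lines of Python. The encoding of an operand as a `(kind, is_last_use)` pair and the function names are assumptions made for this sketch; in the method described, whether a reference is the last one would be read from the DEF/USE data of FIG. 7.

```python
def register_delta(operands):
    """Change in the number of live registers caused by one instruction.

    `operands` is a list of (kind, is_last_use) pairs, with kind "def" or "use".
    """
    delta = 0
    for kind, is_last_use in operands:
        if kind == "def":
            delta += 1            # a newly defined register becomes active
        elif kind == "use" and is_last_use:
            delta -= 1            # the register does not appear again
        # any other reference leaves the count unchanged
    return delta

def estimated_usage(current_in_use, operands):
    """Estimated register usage if the instruction is placed at the clock considered."""
    return current_in_use + register_delta(operands)

# t3 = t1 + t2, where t1 is used again later and t2 is not
add_ops = [("def", False), ("use", False), ("use", True)]
print(register_delta(add_ops))
print(estimated_usage(5, [("def", False)]))    # a load that defines one register
print(estimated_usage(5, [("use", True)]))     # a store of a value not used again
```

A store whose operand is not referenced again lowers the estimate, which is why store instructions are natural candidates when registers are scarce.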

[0059] c) When the estimated register usage (t, n) obtained in step b) is smaller than the permitted number of registers, instruction n is placed at clock t.

[0060] d) When the estimated register usage (t, n) obtained in step b) is larger than the permitted number of registers, instruction n cannot be placed at clock t. Instruction scheduling that reduces the number of simultaneously active (see the explanation of terms) registers is therefore needed.

[0061] The following methods are conceivable for this case.

A) Start again from 3-2) for the next candidate n'.

B) For instruction n, search for a t' that satisfies the condition in c) above. The procedure is given in Technical Description 1 below.

[0062] C) Combine A) and B) above.

This embodiment adopts C). That is, when the candidates include an instruction that does not increase the number of registers in use, such as a store instruction, A) is applied to that instruction. When there is no such instruction, B) is applied.
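The decision logic of steps c) and d) with the combined method C) can be sketched as follows. This is a simplification under several assumptions: register usage is a single number rather than a per-clock value, equality with the limit is treated as not fitting (the text only covers "smaller" and "larger"), and the search for t' from Technical Description 1 is represented by a caller-supplied function `find_later_clock`. All names are hypothetical.

```python
def place_or_defer(candidates, current, limit, find_later_clock):
    """Decide what to do with the head of the candidate list.

    candidates: priority-ordered list of (name, clock, delta); clock is the slot
        found while ignoring registers, delta the instruction's register change.
    current: registers in use; limit: permitted number of registers.
    find_later_clock: callable(name) -> clock or None (stand-in for the t' search).
    Returns (action, name, clock).
    """
    name, clock, delta = candidates[0]
    if current + delta < limit:                  # step c): fits at clock t
        return ("place", name, clock)
    for other, other_clock, other_delta in candidates[1:]:
        if other_delta <= 0:                     # method A): e.g. a store instruction
            return ("try_next", other, other_clock)
    later = find_later_clock(name)               # method B): look for a clock t'
    if later is not None:
        return ("place", name, later)
    return ("spill", name, None)                 # no t' exists: spill code is needed

cands = [("load_t5", 4, +1), ("store_t2", 4, -1)]
print(place_or_defer(cands, 8, 8, lambda n: None))
print(place_or_defer(cands[:1], 8, 8, lambda n: 7))
print(place_or_defer(cands[:1], 8, 8, lambda n: None))
```

The `"try_next"` result means that processing restarts from step 3-2) for the returned candidate, as described for method A).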

[0063] e) In step d), a suitable t' for instruction n sometimes cannot be found. This is the case where the registers are simply insufficient, and spill code becomes necessary.

[0064] The procedure for inserting spill code is described later as Technical Description 2. With it, spill code is scheduled by the same processing as ordinary instructions.

[0065] 3-4) Registering data

Once the arithmetic unit that executes instruction n and the clock t at which it executes have been determined, that information is registered.

[0066] The following three kinds of data are registered.

- Data representing, for each arithmetic unit, the usage of the unit at each clock.

This is represented by a data structure like the one shown in FIG. 8. Each arithmetic unit has an in-use chain 80 and a gap chain 81. The in-use chain 80 manages the information of each instruction node that has been assigned time on the arithmetic unit, and the gap chain 81 manages the clocks at which the arithmetic unit is free and the length of each free interval.
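The in-use chain and gap chain can be sketched in Python with two sorted lists per arithmetic unit. FIG. 8 is not reproduced here, so the class `UnitTimeline`, the tuple layouts, and the finite `horizon` that bounds the initial gap are assumptions made for illustration.

```python
class UnitTimeline:
    """Per-unit record of busy slots (in-use chain) and free intervals (gap chain)."""

    def __init__(self, horizon):
        self.in_use = []               # sorted list of (start_clock, length, node)
        self.gaps = [(0, horizon)]     # sorted list of (start_clock, length)

    def reserve(self, node, start, length):
        """Assign `node` to clocks [start, start + length); return False if not free."""
        for k, (g_start, g_len) in enumerate(self.gaps):
            if g_start <= start and start + length <= g_start + g_len:
                pieces = []
                if start > g_start:                      # free time left before the node
                    pieces.append((g_start, start - g_start))
                tail = (g_start + g_len) - (start + length)
                if tail > 0:                             # free time left after the node
                    pieces.append((start + length, tail))
                self.gaps[k:k + 1] = pieces              # split the gap around the node
                self.in_use.append((start, length, node))
                self.in_use.sort()
                return True
        return False

unit = UnitTimeline(horizon=10)
print(unit.reserve("n1", 2, 3), unit.gaps)
print(unit.reserve("n2", 4, 2), unit.gaps)    # overlaps n1, so it is rejected
print(unit.reserve("n3", 5, 1), unit.gaps)
```

Keeping the free intervals explicitly makes it cheap to find the next clock at which a unit is available, which is what step 3-3) a) needs.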

[0067] - Data for recording, for each register type, how many registers are in use (active) at each clock.

This is explained in Technical Description 1 below.

【００６８】- Data showing the relationship between the defining instruction and the referencing instructions of each data item. A table such as that shown in FIG. 7 is prepared when the DAG described in 2) is generated, so this registration marks each instruction linked into that chain to show whether it has been scheduled. The mark is Clock, the clock at which the instruction was assigned to an arithmetic unit.
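The first of the three records above, the per-unit usage data of FIG. 8, might be sketched in Python as follows. This is only an illustration under stated assumptions: the class name UnitUsage, its methods, and the use of plain lists in place of linked chains are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class UnitUsage:
    """Per-unit record with an in-use chain and a gap chain (cf. FIG. 8)."""
    horizon: int                                  # assumed upper bound on clocks
    in_use: list = field(default_factory=list)    # (start, length, instruction)
    gaps: list = field(default_factory=list)      # (start, length) of idle spans

    def __post_init__(self):
        # Initially the unit is idle for the whole horizon.
        self.gaps = [(0, self.horizon)]

    def earliest_start(self, not_before, length):
        """Return the first clock >= not_before where `length` idle clocks fit."""
        for start, size in self.gaps:
            begin = max(start, not_before)
            if begin + length <= start + size:
                return begin
        return None

    def register(self, instruction, start, length):
        """Add the instruction to the in-use chain and split the gap it occupies."""
        for i, (g_start, g_size) in enumerate(self.gaps):
            g_end = g_start + g_size
            if g_start <= start and start + length <= g_end:
                pieces = []
                if start > g_start:
                    pieces.append((g_start, start - g_start))
                if start + length < g_end:
                    pieces.append((start + length, g_end - start - length))
                self.gaps[i:i + 1] = pieces
                self.in_use.append((start, length, instruction))
                self.in_use.sort()
                return
        raise ValueError("unit is not idle for the requested clocks")


alu = UnitUsage(horizon=24)
alu.register("add", start=4, length=4)
alu.register("load", start=0, length=2)
print(alu.gaps)                   # idle spans that remain
print(alu.earliest_start(0, 4))   # first clock where a 4-clock operation fits
```

Keeping the idle spans explicitly means that a search for a free slot only walks the gaps rather than every clock, which appears to be the purpose of the gap chain.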

【００６９】This makes it possible to tell which data is active at which time.
4) [Register allocation]
The procedure up to 3) above (in particular e) of 3-3) ensures that registers never run short for the data that must be held in registers.

【００７０】Register allocation at this point therefore needs to output no spill code and can be carried out by conventional local register allocation.
5) [Processing of exit-busy information (see the explanation of terms)]
Information is propagated to the next scheduling range. This information indicates which register has been allocated to each exit-busy data item.

【００７１】In this step, the register allocation result for the exit-busy data is stored in the exit-busy part of the data structure shown in FIG. 5. Next, an example of concrete processing results according to the embodiment of the present invention is described with reference to FIGS. 9 to 12.

【００７２】Here, a machine with the following features is assumed as the computer on which the object program runs. The present invention is not, however, limited to such a computer.

【００７３】- One instruction can execute three operations simultaneously.
- The machine has two integer arithmetic units and one floating-point arithmetic unit. Memory operations run on an integer arithmetic unit, but only one memory operation is possible per instruction.

【００７４】- Multiplication is performed by the floating-point arithmetic unit.
- Instructions are written as follows.
    load R1,0,R2 : add R3,R4,R5 : mul R6,R7,R8
    st R8,R1,4 : sub R5,R0,R0
    bne label1
Here one instruction is written on one line and operations are separated by ":". If an instruction contains two or fewer operations, the remaining slots are taken to hold NOPs (No Operation).

【００７５】- Instructions are executed one line at a time, and the next instruction cannot start until the previous one has finished.
- The load and st instructions take 2τ to execute; the add, sub and mul instructions take 4τ.
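As a rough illustration of the machine model listed above, the following Python sketch checks the per-instruction limits and adds up execution time. It assumes that an instruction takes as long as its slowest operation, which is one reading of the rule that the next instruction cannot start until the previous one has finished. The names LATENCY, parse and execution_time are illustrative, and bne is left out because the text does not state its latency.

```python
LATENCY = {"load": 2, "st": 2, "add": 4, "sub": 4, "mul": 4}  # in units of tau
MEMORY_OPS = {"load", "st"}
MAX_OPS_PER_INSTRUCTION = 3


def parse(line):
    """Split one instruction line into its operation mnemonics."""
    return [op.split()[0] for op in line.split(":") if op.strip()]


def execution_time(lines):
    """Sum, over all instructions, the latency of the slowest operation in each."""
    total = 0
    for line in lines:
        ops = parse(line)
        if len(ops) > MAX_OPS_PER_INSTRUCTION:
            raise ValueError("more than three operations in one instruction")
        if sum(op in MEMORY_OPS for op in ops) > 1:
            raise ValueError("more than one memory operation in one instruction")
        total += max(LATENCY[op] for op in ops)
    return total


program = [
    "load R1,0,R2 : add R3,R4,R5 : mul R6,R7,R8",
    "st R8,R1,4 : sub R5,R0,R0",
]
print(execution_time(program))  # total time in tau under the assumption above
```

Under this assumption, pairing a 2τ memory operation with a 4τ arithmetic operation costs nothing extra, which is why filling otherwise empty slots matters on such a machine.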

【００７６】Assume that the intermediate text optimized by the optimization processing phase 13 shown in FIG. 1 is the sequence of instructions (operations) shown in FIG. 9(a), and that instruction scheduling and register allocation are to be performed on it.

【００７７】In this intermediate text, t1 to t11 denote virtual registers, that is, registers assigned temporarily on the assumption that the number of registers is unlimited. In this scheduling range, t1, t3 and t5 are assumed to be both entry-busy and exit-busy.

【００７８】Because there are two integer arithmetic units and one floating-point arithmetic unit, an integer operation (load/store), a multiplication and an addition can be executed simultaneously as long as they do not conflict over registers or other resources.

【００７９】If instructions (1) to (10) of the intermediate text shown in FIG. 9(a) are placed at the times (time-1 to time-22) as shown in FIG. 10, the live ranges of registers t1 to t11 are as shown in that figure.

【００８０】Here the maximum number of simultaneously active registers is eight, at time-9 to time-10. Therefore, if eight or more registers are available, scheduling is not affected by the number of registers. The resulting instruction sequence that can be output as the object program is as shown in FIG. 9(b), and its execution time is 22τ.

【００８１】Now suppose that only seven registers were available. In the conventional method, register allocation is performed, for example, after instruction scheduling has finished, and spill code (instructions that save a register's value) is output at the point where registers run out.

【００８２】The instruction sequence produced by conventional processing is therefore, for example, as shown in FIG. 11(a). In FIG. 11(a), instruction (4) is a spill-store instruction that saves the contents of register R5 to the memory work area MEM, and instruction (7) is a spill-load instruction that restores the contents of register R5 from the memory work area MEM.

【００８３】The relationship between the register live ranges and the execution time in this case is as shown in FIG. 11(b). As is clear from the register usage drawn with solid lines in that figure, the execution time of the instruction sequence shown in FIG. 11(a) is 28τ.

【００８４】In the present invention, by contrast, instruction scheduling and register allocation are performed in the same phase, as described above. When only seven registers are available, fewer than initially estimated, the instructions are reordered so that as little spill code as possible is output.

【００８５】As a result, the instruction sequence output by the present invention is, for example, as shown in FIG. 12(a). The relationship between the register live ranges and the execution time in this case is as shown in FIG. 12(b). As is clear from that figure, the execution time of the instruction sequence shown in FIG. 12(a) is 24τ.

【００８６】This instruction sequence is therefore executed faster than the spill-code-emitting sequence shown in FIG. 11(a).
[Technical Description 1] An instruction scheduling method that reduces the number of simultaneously active register live ranges.

【００８７】The following variable is used.
    char Num-of-Live-reg[2][N];
This array stores, for each clock 0, 1, 2, ..., N, the number of active general-purpose registers and the number of active floating-point registers.

【００８８】For example, the value of Num-of-Live-reg[0][10] is the number of general-purpose registers in use (active) at clock 10. N is the final clock count, but it need not be exact and may simply be overestimated. Because the number of instructions becomes known while the DAG is being built, N can be estimated from it. Two kinds of register, general-purpose and floating-point, are assumed here, but more kinds can be added depending on the language processor and the hardware.
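The array described above could be maintained roughly as in the following Python sketch. The helper names mark_live and peak, the value chosen for N, and the three live ranges are hypothetical and serve only to show how the per-clock counts are used.

```python
GENERAL, FLOAT = 0, 1   # register-kind indices, as in Num-of-Live-reg[0][...]
N = 32                  # generous upper bound on the clock count (assumption)

# num_of_live_reg[kind][clock] = number of active registers of that kind
num_of_live_reg = [[0] * (N + 1) for _ in range(2)]


def mark_live(kind, first_clock, last_clock):
    """Count one register as active on every clock of its live range."""
    for clock in range(first_clock, last_clock + 1):
        num_of_live_reg[kind][clock] += 1


def peak(kind):
    """Largest number of simultaneously active registers of one kind."""
    return max(num_of_live_reg[kind])


# Three hypothetical live ranges for general-purpose registers.
mark_live(GENERAL, 1, 10)
mark_live(GENERAL, 5, 12)
mark_live(GENERAL, 9, 10)
print(num_of_live_reg[GENERAL][10], peak(GENERAL))
```

Comparing the peak with the number of physical registers gives the same kind of check as the example of FIG. 10, where eight registers are active at time-9 to time-10.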

【００８９】The procedure of this instruction scheduling is as follows.
(Continued from イ) of 3-3) d) in the main text.)
The number of registers is estimated at each clock from T, obtained in 3-3) a) (the clock obtained by the conventional scheduling method), up to TT (the clock at which the instruction currently assigned to the latest clock on that arithmetic unit finishes).

【００９０】If the estimated number of registers is smaller than the permitted number of registers, that t is the t being sought. If no such t is found, no scheduling that reduces the register live ranges was possible for the instruction n in question.
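The search just described can be sketched as a runnable Python function. This is a minimal illustration rather than the patent's implementation: find_clock and its parameters are hypothetical names, live_counts stands for one row of Num-of-Live-reg, and delta stands for the change in the register count caused by instruction n.

```python
def find_clock(T, TT, live_counts, delta, available):
    """Return the first clock t in [T, TT] whose estimate is below `available`.

    live_counts[t] is the number of active registers of the relevant kind at
    clock t, and delta is the change in that number caused by instruction n.
    Returns None when no such clock exists, in which case a spill is needed.
    """
    for t in range(T, TT + 1):
        estimate = live_counts[t] + delta
        if estimate < available:
            return t
    return None


# Hypothetical register pressure per clock, with 7 registers available.
live = [5, 7, 7, 6, 5, 4]
print(find_clock(1, 4, live, delta=1, available=7))
print(find_clock(1, 3, live, delta=1, available=7))
```

A result of None corresponds to the case in which step e) of 3-3) applies and spill code has to be inserted.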

【００９１】Written as a pseudo program, the above flow is as follows.
    LOOP from t = T : to t = TT; {
        register-usage estimate(t, n) = Num-of-Live-reg[I][t] + change in register count(n);
        if (register-usage estimate < number of available registers) return t;
    }
    return not found;
Here I is the index corresponding to the kind of register used, which is determined by instruction n.
[Technical Description 2] Spill code insertion procedure.

【００９２】Spill code consists of instructions generated when no register remains to be allocated to data that needs one. In most cases it consists of the following two parts.

【００９３】1) An instruction that saves the contents of a register to another location (memory, or a register of another kind), here called a store instruction, is issued so that the register becomes available.

【００９４】2) When the saved data is used, an instruction that brings it back into a register from the location where it was saved in 1), here called a load instruction, is issued.

【００９５】In this method, spill code is output by the following procedure.
1) Selection of the data to be spilled
Which register's contents to store is an important question, and one that existing compilers have also studied. Because it is not directly related to the gist of the present invention, however, the data to be spilled is assumed here to be chosen by the same procedure as in a conventional compiler.

【００９６】2) Processing of the store and load instructions
a) DAG nodes corresponding to the store instruction and the load instruction are created.
b) The DAG node corresponding to the store instruction is inserted into the DAG as a parent (executed earlier) of the instruction n that requires the spill.

【００９７】The DAG node corresponding to the load instruction is inserted as a parent of the instruction that actually uses the data saved by the store instruction and as a child of the instruction n that requires the spill.

【００９８】3) The definition/use information of the data is updated.
In this way, spill code can be incorporated into the scheduling procedure and handled in the same way as ordinary instructions.
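The placement of the spill nodes in step 2) might look like the following Python sketch. The Dag class, insert_spill and all node names are hypothetical. The edge from the defining instruction to the store is an added assumption (the store must follow the definition of the spilled data) and is not stated in the text.

```python
from collections import defaultdict


class Dag:
    """Dependence graph: an edge parent -> child means the parent executes first."""

    def __init__(self):
        self.children = defaultdict(set)

    def add_edge(self, parent, child):
        self.children[parent].add(child)

    def parents(self, node):
        return {p for p, cs in self.children.items() if node in cs}


def insert_spill(dag, definer, n, user, store="spill_store", load="spill_load"):
    """Add store and load nodes around instruction n for data later read by `user`."""
    dag.add_edge(definer, store)  # assumption: the store follows the definition
    dag.add_edge(store, n)        # the store runs before n, freeing a register
    dag.add_edge(n, load)         # the load runs after n ...
    dag.add_edge(load, user)      # ... and before the instruction that uses the data
    return store, load


dag = Dag()
dag.add_edge("def_x", "use_x")    # hypothetical: x is defined, then used later
dag.add_edge("def_x", "n")        # n needs a register while x is still live
insert_spill(dag, definer="def_x", n="n", user="use_x")
print(sorted(dag.parents("n")), sorted(dag.parents("use_x")))
```

Because the store and load are ordinary DAG nodes once inserted, the scheduler can place them in idle slots like any other operation.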

【００９９】[0099]

[Effects of the Invention] As described above, the present invention makes it possible to output efficient code in which instruction scheduling and register allocation are well balanced. Because instruction scheduling takes register live ranges into account, less spill code is generated. Even when spill code is generated, it is itself subject to instruction scheduling, so it is executed in parallel with other operations and the arithmetic units are used effectively.

[Brief description of the drawings]

FIG. 1 is a diagram explaining the principle of the present invention.

FIG. 2 is a diagram explaining live ranges, used to describe an embodiment of the present invention.

FIG. 3 is a diagram explaining IP, used to describe an embodiment of the present invention.

FIG. 4 is a diagram showing the processing flow of the instruction scheduling and register allocation phase according to an embodiment of the present invention.

FIG. 5 is a diagram explaining the data structure that represents the correspondence between global data and registers, used in the embodiment of the present invention.

FIG. 6 is a diagram explaining the DAG used in the embodiment of the present invention.

FIG. 7 is a diagram explaining the data structure that represents the definition/use chain of each data item, used in the embodiment of the present invention.

FIG. 8 is a diagram explaining the data structure that represents the usage of the arithmetic units, used in the embodiment of the present invention.

FIG. 9 is a diagram showing an example of processing results according to the embodiment of the present invention.

FIG. 10 is a diagram showing the register live ranges for the processing results of the embodiment shown in FIG. 9.

FIG. 11 is a diagram showing an example of processing results according to the prior art, for comparison with the processing results according to the embodiment of the present invention.

FIG. 12 is a diagram showing an example of processing results according to the embodiment of the present invention.

[Explanation of symbols]

10 Source program
11 Compile processing unit
12 Front-end processing phase
13 Optimization processing phase
14 Instruction scheduling and register allocation phase
14-1 Register usage inspection means
14-2 Parallelism-priority scheduling means
14-3 Scheduling means that reduces the number of simultaneously active registers
14-4 Spill code insertion means
15 Instruction output phase
16 Object program

Continuation of front page
(72) Inventor: Manabu Matsuyama, c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(56) References: JP-A-3-150637 (JP, A); J. R. Goodman et al., Code Scheduling and Register Allocation in Large Basic Blocks, International Conference on Supercomputing, 1988, pp. 442-452
(58) Fields searched (Int. Cl.7, DB name): G06F 9/38, G06F 9/45

Claims

(57) [Claims]

1. An instruction scheduling method in a compiler that compiles a program to be executed on a computer having a plurality of arithmetic units each capable of operating in parallel, wherein instruction scheduling, which determines the order of instructions, and register allocation, which determines the register used by each instruction, are performed in the same phase; the method switches, by checking register usage, between instruction scheduling that increases parallelism in the use of the plurality of arithmetic units and instruction scheduling that reduces the number of simultaneously active registers; when the register-usage estimate obtained by scheduling an instruction n without regard to the number of registers is greater than the number of available registers, another candidate instruction n' is searched for; if the candidate is found, scheduling is performed for the candidate; and if the candidate is not found, instruction n is scheduled so that the register-usage estimate is less than the number of available registers, whereby instruction scheduling that reduces the number of simultaneously active registers is performed.

2. The method according to claim 1, wherein, when the registers to be allocated are insufficient even if the instructions are reordered, spill code, which is an instruction that saves the contents of a register to another location, is inserted, and the spill code is also made subject to the instruction scheduling.