JPH0452987B2

Info

Publication number: JPH0452987B2
Application number: JP60106546A
Authority: JP
Inventors: Morie Sagawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1985-05-18
Filing date: 1985-05-18
Publication date: 1992-08-25
Also published as: JPS61264442A

Description

[Detailed description of the invention]

[Summary]

In a compiling device having an intermediate code optimization section, a vector length setting function section is provided in the intermediate code optimization section. The vector length setting function section refers to a vector length determination table and determines the vector length so that, as far as possible, extremely short vector lengths do not arise, thereby improving overall efficiency.

[Industrial Application Field]

The present invention relates to a compiling device, and in particular to a compiling device having an intermediate code optimization section that is given a function for determining a preferable vector length.

[Prior Art]

In a compiler that generates an object program from a given source program, the vectorized intermediate code is optimized at the stage where the intermediate code has been generated, so that the most favorable object program is obtained. This typically arises when the source program contains, for example, a loop.

FIG. 5 shows an example of a vector operation. In the case shown, for each pair of elements of the vectors a₁, a₂, ... and b₁, b₂, ..., the operation C_i = a_i + b_i is performed when the corresponding mask element m₁, m₂, ... is on, and the operation C_i = C_i (the element is left unchanged) is performed when it is off.

A vector operation such as that of FIG. 5 is executed by a vector computer having the hardware configuration shown in FIG. 6. In FIG. 6, 1 denotes a vector processor, 2 a main storage device, 3 a memory control device, 4 a channel processor, and 5 a large storage device.

FIG. 7 shows a given source program 6 being converted into vectorized intermediate code. The illustrated source program 6 expresses computing A(I) and D(I) as the value of I varies from 1 to 100. Intermediate code 7 expresses computing A(*) and D(*) with the vector length taken to be 100 (that is, VLENG=100).

[Problems to be Solved by the Invention]

The compiling device generates intermediate code 7 as shown in FIG. 7 and then generates an object program, but the number of vector registers available while the object program executes is limited. The question is therefore how many vector registers the generated object program should use.

The larger the vector length, the fewer vector registers tend to be used, and this can actually be a problem for executing the many operations of the whole program efficiently. In general, therefore, the vector length is determined by examining the busy number described later. However, suppose the logical vector length is 1025 and the physical vector length used for the actual computation is chosen as 512. An operation with a logical vector length of 1025 may then be carried out as two operations of physical vector length 512 and one operation of vector length 1. With such a split, the operation of vector length 1 is inefficient. It is preferable to be able to split the work into two operations, for example one of vector length 512 and one of vector length 513.

[Means for Solving the Problems]

The present invention solves the above problem. FIG. 1 is a block diagram showing the principle of the invention. In the figure, 8 denotes a compiler, 9 a source program, and 10 an object program. Further, 11 denotes a source interpretation section, 12 a storage allocation section, 13 a vectorization section, 14 an intermediate code optimization section (the part to which the invention directly relates), 15 a register usage determination section, 16 an object program output section, and 17 a vector length setting function section.

The vectorization section 13 outputs intermediate code vectorized in the manner described with reference to FIG. 7, and the intermediate code optimization section 14 optimizes that intermediate code.

[Operation]

In the present invention, the vector length setting function section 17 is provided to prevent operations with extremely short vector lengths, as described above, from being mixed in. The function section 17 obtains the maximum busy number in the manner described below, derives the optimum vector length, sets the preferable vector length, and notifies the processing (not shown) within the intermediate code optimization section. The optimum vector length is derived by referring to a vector length determination table.

[Example]

FIG. 2 is an explanatory diagram of how the maximum busy number is determined.

Suppose that intermediate code 7 as shown in the leftmost column of FIG. 2 is given. The vertical bars in the center of the figure show the range over which each of the vector registers vt₁, vt₂, vt₃, and vt₄ is occupied. In the illustrated case, register vt₁ is occupied from one processing step through a later one, and register vt₃ is likewise occupied over its own span of steps. Examining this occupancy across the whole span of processing identifies the step at which the largest number of registers is occupied at one time, and from this the maximum busy number for that span is determined to be 3.

Table 1 summarizes the guideline vector length DVL (the standard vector length), which indicates roughly what vector length should be chosen once the maximum busy number n has been obtained, for the data processing device models VP100 and VP200. When the maximum busy number obtained as above is 8 or less, a vector length of up to 1024 can be chosen on the VP200.
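The maximum busy number described for FIG. 2 can be sketched as a scan over register occupancy ranges. This is a minimal illustration, not the patented implementation: the function name `max_busy_number` and the step numbers in the sample data are hypothetical, since the text does not preserve the actual step labels of FIG. 2. The sample ranges were chosen only so that the result matches the value 3 stated in the text.

```python
def max_busy_number(live_ranges: dict[str, tuple[int, int]]) -> int:
    """Return the largest number of registers occupied at the same step.

    live_ranges maps a register name to (first_step, last_step), inclusive.
    """
    if not live_ranges:
        return 0
    first = min(start for start, _ in live_ranges.values())
    last = max(end for _, end in live_ranges.values())
    busiest = 0
    for step in range(first, last + 1):
        # Count the registers whose occupancy range covers this step.
        busy = sum(1 for start, end in live_ranges.values() if start <= step <= end)
        busiest = max(busiest, busy)
    return busiest


# Hypothetical occupancy ranges (not taken from FIG. 2).
example_ranges = {
    "vt1": (1, 3),
    "vt2": (2, 4),
    "vt3": (3, 6),
    "vt4": (5, 7),
}
print(max_busy_number(example_ranges))  # 3 (vt1, vt2, vt3 overlap at step 3)
```

The resulting number would then be used to look up the guideline vector length DVL in a table such as Table 1, whose contents are not reproduced in this text.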

[Numerical example]

Assume the device is the VP200, the maximum busy number is 8, and the logical vector length LVL is 1060. In this case:

MOD[LVL / DVL] = MOD[1060 / 1024] = 36
LVL > DVL
DVL > 128

and therefore:

m = [(LVL − 1) / DVL] + 1 = [1059 / 1024] + 1 = 2 (fraction truncated)
DVL′ = [(LVL − 1) / m] + 1 = [1059 / 2] + 1 = 530 (fraction truncated)

Further:

MOD[(LVL − 1) / DVL′] + 1 = MOD[1059 / 530] + 1 = 530

Column B shown in the figure therefore applies: the initial PVL is 530 and the steady PVL is 530.

[Numerical example]

Assume the device is the VP200, the maximum busy number is 8, and the logical vector length LVL is 1025. In this case as well, MOD[LVL / DVL] = 1 > 0, and:

m = [(LVL − 1) / DVL] + 1 = [1024 / 1024] + 1 = 2
DVL′ = [(LVL − 1) / m] + 1 = [1024 / 2] + 1 = 513
MOD[(LVL − 1) / DVL′] + 1 = MOD[1024 / 513] + 1 = 512

Column B shown in the figure therefore applies: the initial PVL is 512 and the steady PVL is 513.

In this numerical example, if the conventional guideline vector length DVL of 512 were used as is, then since 1025 = 1 + 512 + 512, the required processing would be carried out as one operation of vector length 1 and two operations of vector length 512. An extremely small vector length would be included, and efficiency would suffer.

[Effects of the Invention]

As explained above, according to the present invention the vector length is kept from becoming extremely small as far as possible, and efficiency is improved compared with the conventional case.
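The arithmetic of the two numerical examples can be checked with a short sketch. The function names `balanced_split` and `naive_split` are hypothetical. The sketch applies the three formulas unconditionally, which is an assumption: the text applies them under the conditions that select column B, and the other columns of the vector length determination table are not reproduced here.

```python
def balanced_split(lvl: int, dvl: int) -> tuple[int, int, int]:
    """Return (initial_pvl, steady_pvl, steady_repeats) for a logical length lvl.

    Follows the formulas of the numerical examples, with integer
    division standing in for "fraction truncated".
    """
    if lvl < 1 or dvl < 1:
        raise ValueError("lvl and dvl must be positive")
    m = (lvl - 1) // dvl + 1              # number of operations needed
    steady = (lvl - 1) // m + 1           # DVL' in the text
    initial = (lvl - 1) % steady + 1      # length of the first operation
    repeats = (lvl - initial) // steady   # how many steady operations follow
    return initial, steady, repeats


def naive_split(lvl: int, pvl: int) -> list[int]:
    """Conventional split: the remainder first, then full-length operations."""
    chunks = [pvl] * (lvl // pvl)
    if lvl % pvl:
        chunks.insert(0, lvl % pvl)
    return chunks


print(balanced_split(1060, 1024))  # (530, 530, 1)
print(balanced_split(1025, 1024))  # (512, 513, 1)
print(naive_split(1025, 512))      # [1, 512, 512]
```

In both examples the initial length plus the repeated steady lengths adds up to LVL, and the shortest operation is close in length to the others, in contrast to the length-1 operation produced by the conventional split.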

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the principle of the present invention; FIG. 2 is an explanatory diagram of how the maximum busy number is determined; FIG. 3 is an example of intermediate code output; FIG. 4 is one embodiment of the vector length determination table; FIG. 5 is an explanatory diagram of an example of a vector operation; FIG. 6 is an example of the hardware configuration of a vector computer; and FIG. 7 is an explanatory diagram of the vector length control range.

In the figures, 1 denotes a vector processor, 6 a source program, 7 intermediate code, 8 a compiler, 14 an intermediate code optimization section, 17 a vector length setting function section, and 18 a vector length determination table.

Claims

[Claims]

1. A compiling device 8 that generates intermediate code 7 from a given source program 9, performs optimization, and generates an object program 10, wherein:

an intermediate code optimization section 14 is provided that performs optimization when the object program 10 is obtained from the generated intermediate code 7;

the intermediate code optimization section 14 is provided with a vector length setting function section 17 that obtains the maximum busy number, then determines the optimum vector length and sets the vector length;

the vector length setting function section 17 is configured to refer to a vector length determination table 18 which, using the remainder obtained by dividing the given logical vector length by the standard vector length determined from the maximum busy number, divides the vector length for executing the processing of a given control range into an initial physical vector length and a steady physical vector length, such that the ratio of the initial physical vector length to the steady physical vector length approaches the value 1 as closely as possible; and

the intermediate code optimization section 14 is configured to output intermediate code that first executes processing according to the given initial physical vector length and then executes processing according to the given steady physical vector length one or more times.