JP7040187B2

JP7040187B2 - compiler

Info

Publication number: JP7040187B2
Application number: JP2018053204A
Authority: JP
Inventors: 敏也平田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2018-03-20
Filing date: 2018-03-20
Publication date: 2022-03-23
Anticipated expiration: 2038-03-20
Also published as: JP2019164704A

Description

The present invention relates to a compiler, a computer, and a code generation method for a computer.

In vector computers (vector processing devices) designed to speed up instruction processing, overtaking control of memory access instructions is generally performed. Under overtaking control, when the area accessed by a subsequently issued memory load instruction does not overlap the area accessed by a preceding memory store instruction, the memory access of the subsequent load instruction is executed first. This improves memory access performance. In addition to vector load and vector store instructions, the memory access instructions concerned include list vector instructions such as the vector scatter instruction and the vector gather instruction. A list vector instruction specifies its access destinations by a list address, which refers indirectly to a plurality of destination addresses through an array or the like.
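The overtaking condition described above can be sketched as a simple range comparison. This is only an illustrative sketch, not the hardware mechanism of the patent: the function names, the example addresses, and the use of half-open byte ranges `[start, end)` are assumptions made for the example.

```python
def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if the half-open ranges [a_start, a_end) and [b_start, b_end) share any byte."""
    return a_start < b_end and b_start < a_end


def load_may_overtake(store_range, load_range):
    """A later load may run ahead of an earlier store only if their ranges do not overlap."""
    return not ranges_overlap(*store_range, *load_range)


# Hypothetical address ranges, chosen only for illustration.
store = (0x1000, 0x1100)
load_disjoint = (0x2000, 0x2100)
load_overlapping = (0x10F8, 0x1200)

print(load_may_overtake(store, load_disjoint))
print(load_may_overtake(store, load_overlapping))
```

The first call reports that the load can be issued ahead of the store, while the second does not, because the two ranges share addresses.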

For example, in Patent Document 1, when a source program containing a loop control statement that repeatedly processes a list vector is compiled, the start address and end address of the array accessed by a list vector instruction are set in the instruction word of that list vector instruction. The vector computer then uses the start address and end address set in the instruction word to determine whether the range overlaps the memory address range accessed by other memory access instructions, and controls the execution of overtaking accordingly. Patent Document 2 describes a technique similar to that of Patent Document 1.

Patent Document 1: Japanese Patent No. 3698027
Patent Document 2: Japanese Patent No. 3789320

As described above, in Patent Document 1 the start address and end address of the array accessed by a list vector instruction are set in the instruction word of the list vector instruction. In other words, the list vector instruction is assumed to access the entire range of the array. However, a list vector instruction may access only part of an array. In a configuration that sets the start and end addresses of the array in the instruction word, the memory address range attributed to the list vector instruction may therefore be wider than the range actually accessed. When another memory access instruction accesses the same array, the address ranges are then judged to overlap even if the two instructions actually access different parts of the array. As a result, overtaking is sometimes judged impossible even when it is in fact possible.

An object of the present invention is to provide a compiler that solves the problem described above, namely that appropriate overtaking control of memory access instructions is difficult.

A compiler according to one aspect of the present invention
causes a computer that generates an object program for a vector computer from a source program
to perform a process of analyzing the source program and translating a loop control statement that repeatedly processes a list vector into an instruction list including a list vector instruction,
wherein, in the process, the start address and end address of the memory accessed by the list vector instruction are calculated for each iteration of the loop of the loop control statement, and the start address and the end address are set in the instruction word of the list vector instruction.

A computer according to another aspect of the present invention is
a computer that generates an object program for a vector computer from a source program, and includes:
a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes a list vector; and
a code generation unit that translates the loop control statement into an instruction list including a list vector instruction,
wherein the code generation unit calculates, for each iteration of the loop of the loop control statement, the start address and end address of the memory accessed by the list vector instruction, and sets the start address and the end address in the instruction word of the list vector instruction.

A code generation method for a computer according to another aspect of the present invention is
a code generation method executed by a computer that generates an object program for a vector computer from a source program, and includes:
analyzing the source program and recognizing a loop control statement that repeatedly processes a list vector; and
translating the loop control statement into an instruction list including a list vector instruction,
wherein, in the translating, the start address and end address of the memory accessed by the list vector instruction are calculated for each iteration of the loop of the loop control statement, and the start address and the end address are set in the instruction word of the list vector instruction.

With the configuration described above, the present invention enables appropriate overtaking control of memory access instructions.

FIG. 1 is a block diagram of the compiler according to the first embodiment of the present invention.
FIG. 2 is a diagram showing part of the instruction word of a list vector instruction generated by the compiler according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of a source program to be translated.
FIG. 4 is a diagram showing an example of part of the object program resulting from the translation.
FIG. 5 is a diagram showing a memory image of array A.
FIG. 6 is a block diagram of the computer according to the first embodiment of the present invention.
FIG. 7 is a diagram showing an example of a DO loop for explaining the problem to be solved by the present invention.
FIG. 8 is a diagram showing an example of the instruction sequences and the data layout of array A when J = 2 and J = 3 in the DO loop, for explaining the problem to be solved by the present invention.

Next, embodiments of the present invention will be described in detail with reference to the drawings.
[First Embodiment]
First, the first embodiment of the present invention will be described.

<Background of this embodiment>
In general, the gap between fast processor performance and slow memory performance has a large effect on the effective performance of a computer. For memory performance, the latency of memory load instructions is a particularly important factor.

In vector processing schemes designed to speed up instruction processing, overtaking control of memory access instructions is performed. Under overtaking control, when the area accessed by a subsequently issued memory load instruction does not overlap the area accessed by a preceding memory store instruction, the memory access of the subsequent load instruction is executed first. This improves memory access performance. However, vector gather and vector scatter instructions, which access memory at random through a list vector, have no regularity in their memory accesses. Address dependence is therefore hard to analyze, and effective overtaking control is difficult.

In Patent Document 1, the start address of the memory accessed by a vector gather or vector scatter instruction is the address at which the array indexed by the list address is allocated in memory, and the end address is that start address plus the size of the array. That is, whether an instruction may be overtaken is decided by treating the entire array as the accessed range, regardless of the contents of the list address. Consequently, when loads from and stores to the same array are mixed in a loop, overtaking is never performed. In contrast, the present embodiment sets, as the memory range accessed by the vector gather or vector scatter instruction, the range from the start address to the end address of the list-addressed dimension for each iteration of the loop. An instruction that accesses an area not overlapping the range accessed through the list address can then overtake. Compared with the invention described in Patent Document 1, more instructions can overtake, so a further reduction in memory access processing time and faster program execution can be expected.

<Features of this embodiment>
The present embodiment provides a compiler having a function of analyzing the address area accessed by a vector gather or vector scatter instruction and determining the range that overlaps areas accessed by other memory access instructions in the same loop. By attaching to the vector gather and vector scatter instructions the information on the overlapping memory access range determined by the compiler, hardware control of overtaking for these instructions is simplified and the number of instructions that can overtake increases. This shortens memory access processing time and speeds up the program.

<Problems to be solved by this embodiment>
In vector gather and vector scatter instructions, each element of a vector register is used as an effective address for accessing memory, so the memory accesses have no regularity and address dependence is hard to analyze. To address this, the overtaking control function proposed in Patent Document 1, for example, treats the range from the start address to the end address of the array accessed by the vector gather or vector scatter instruction as the access range of the instruction. If this access range does not overlap the access range of other memory access instructions, the hardware determines that overtaking is possible and performs overtaking control. With this method, however, the memory address range attributed to the vector gather or vector scatter instruction is wider than the range actually accessed, and cases arise in which overtaking is judged impossible even though it is in fact possible.

For example, in expression (1) of the code example shown in FIG. 7, there is no dependency between array A and array B, so it can be determined that the vector gather instruction for array B may overtake the vector scatter instruction for array A. With the technique of Patent Document 1, however, a definition-reference dependency on array A is found between (1) and (2), and overtaking is judged impossible.

FIG. 8 shows the instruction sequences for J = 2 and J = 3 in the code of the above example, together with the data layout of array A. When J = 2, the store instruction for array A accesses area 5-2 of FIG. 8 and the load instruction for array A accesses area 5-1 of FIG. 8, so the accessed areas do not overlap and there is no dependency between the store instruction and the load instruction. When J = 3, the store instruction accesses area 5-3 and the load instruction accesses area 5-2, so again the areas do not overlap. That is, when J is M (2 ≤ M ≤ N), the elements A(I, M-1) loaded in (2) do not overlap in memory with the store destination A(L(I), M) of (1), so the load instruction of (2) can overtake (1). In Patent Document 1, however, the access range is set from the start address to the end address of array A as described above, so the areas accessed by (1) and (2) are regarded as overlapping because both lie in array A, and overtaking is judged impossible.
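The non-overlap argument for J = M can be checked at the level of individual element addresses. The following sketch rests on several assumptions that are not taken from FIG. 7 or FIG. 8: array A is N×N and stored in column-major (Fortran) order, indices are 1-based, and the base address, element size, and contents of the list vector `L` are arbitrary example values.

```python
def element_address(base, elem_size, n_rows, i, j):
    """Byte address of A(i, j) with 1-based indices in column-major (Fortran) order."""
    return base + ((i - 1) + (j - 1) * n_rows) * elem_size


def scatter_targets(base, elem_size, n_rows, index_list, j):
    """Addresses written by a scatter to A(L(I), j) for every entry of the list vector."""
    return {element_address(base, elem_size, n_rows, i, j) for i in index_list}


def load_sources(base, elem_size, n_rows, j):
    """Addresses read by a contiguous load of A(I, j) for I = 1..n_rows."""
    return {element_address(base, elem_size, n_rows, i, j) for i in range(1, n_rows + 1)}


# Hypothetical values, chosen only for illustration.
BASE, ELEM_SIZE, N = 0x1000, 8, 4
L = [3, 1, 4, 2]  # example list vector; any values in 1..N behave the same way

# For every M in 2..N, the store to column M and the load from column M-1 are disjoint.
disjoint_for_all_m = all(
    scatter_targets(BASE, ELEM_SIZE, N, L, m).isdisjoint(
        load_sources(BASE, ELEM_SIZE, N, m - 1)
    )
    for m in range(2, N + 1)
)
print(disjoint_for_all_m)
```

Whatever row indices the list vector holds, the scatter stays inside column M, which is why the contents of `L` do not need to be known to conclude that the load from column M-1 is independent.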

<Solution provided by this embodiment>
First, the hardware and compiler requirements of the present embodiment are described. The present embodiment targets hardware (a vector computer) having a CPU whose instruction set includes a vector gather instruction, which loads into a destination vector register the data pointed to by the memory addresses stored in the elements of the vector register specified as the list vector, and a vector scatter instruction, which stores data from a vector register to the memory addresses stored in the elements of the vector register specified as the list vector. This hardware is assumed to have an overtaking control function that allows an instruction to overtake when the memory accesses of vector load/store instructions do not overlap. The compiler is assumed to be able to obtain statically, from the array declaration, information such as the number of dimensions, the number of elements, and the size of the array. These are the hardware and compiler requirements of the present embodiment; they are also used in the prior art.

The present embodiment provides a compiler having: a list vector analysis unit that analyzes the array accessed through a list vector and information about the list vector; means for generating vector gather and vector scatter instructions whose instruction words carry the start address and end address of the list-accessed memory area obtained by the analysis; and a list address overlap determination unit that uses the addresses set in the instruction word to determine whether the accessed memory area overlaps that of other memory access instructions. This makes overtaking control of vector instructions that perform list access simple and effective, thereby speeding up the program.

<Configuration of this embodiment>
FIG. 1 is a block diagram of the compiler 100 according to the first embodiment of the present invention. The compiler 100 translates a source program 101 to generate instruction words 102. The instruction words 102 are also called object code and are executed by a vector computer. In this way, the compiler 100 generates an object program for the vector computer.

The compiler 100 includes an in-loop instruction generation processing unit 110 and a list vector analysis unit 120. The in-loop instruction generation processing unit 110 includes a preceding instruction generation unit 111, a vector gather/scatter instruction generation unit 112, a subsequent instruction generation unit 113, and a list address overlap determination unit 114.

When the compiler 100 analyzes the source program 101 and recognizes a loop control statement that repeatedly processes a list vector, it applies the in-loop instruction generation processing unit 110 and the list vector analysis unit 120 to that loop control statement.

The list vector analysis unit 120 is configured to calculate, for each iteration of the loop of the loop control statement, the start address and end address of the addresses that can be accessed through the list vector.

The in-loop instruction generation processing unit 110 is configured to translate the loop control statement into an instruction list 130 including a list vector instruction (a vector gather instruction or a vector scatter instruction). In doing so, the preceding instruction generation unit 111 generates the instructions that precede the vector gather or vector scatter instruction. The vector gather/scatter instruction generation unit 112 generates the vector gather or vector scatter instruction and sets, in the instruction word of the generated instruction, the registers that hold the start address and end address calculated by the list vector analysis unit 120. The subsequent instruction generation unit 113 generates the instructions that follow the vector gather or vector scatter instruction. The list address overlap determination unit 114 takes the range from the start to the end of the list address calculated by the list vector analysis unit 120 as the memory access range of the list vector instruction (vector gather or vector scatter instruction). For a vector gather instruction, it determines whether this range overlaps the memory area accessed by a preceding store instruction; for a vector scatter instruction, it determines whether this range overlaps the memory area accessed by a subsequent load instruction. Based on the result, the list address overlap determination unit 114 sets, in the instruction word of the list vector instruction, an overtaking bit whose value indicates whether the memory access address areas overlap in the program.

FIG. 2 shows an example of the structure of part of the instruction word of a list vector instruction. In this example, the instruction word comprises an opcode OP, an overtaking bit V, a field VR specifying the vector register that holds the addresses used to access memory, a field Smin specifying the register that holds the start address calculated by the list vector analysis unit 120, and a field Smax specifying the register that holds the end address calculated by the list vector analysis unit 120. For a list vector instruction for which the absence of overlap is established at compile time, the overtaking bit V is "1"; otherwise it is "0".
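As a rough illustration of the fields listed for FIG. 2, the instruction word can be modeled as a small record. This is a sketch under assumptions: the text gives no field widths or encodings, so none are modeled here, and the class name, the `"VSC"` mnemonic, and the register names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListVectorInstructionWord:
    """Illustrative model of the fields named in the text; no bit layout is implied."""
    op: str    # opcode OP (the mnemonic used below is hypothetical)
    v: int     # overtaking bit V: 1 = no overlap established at compile time, 0 = otherwise
    vr: str    # vector register holding the addresses used to access memory
    smin: str  # register holding the start address of the accessed range
    smax: str  # register holding the end address of the accessed range

    def __post_init__(self):
        if self.v not in (0, 1):
            raise ValueError("overtaking bit V must be 0 or 1")


# Hypothetical scatter instruction whose range was proven non-overlapping by the compiler.
scatter = ListVectorInstructionWord(op="VSC", v=1, vr="v1", smin="s10", smax="s11")
print(scatter)
```

Note that Smin and Smax name registers rather than holding addresses directly, which is what lets the range change on every loop iteration without changing the instruction word itself.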

Accordingly, the hardware (vector computer) that executes the instruction word performs overtaking control unconditionally when the overtaking bit V of the instruction word is "1". When the overtaking bit V is "0", the hardware obtains the start address and end address of the overlap range from the registers specified by Smin and Smax of the instruction word, and thereby recognizes the range in which overtaking is prohibited. Overtaking control is thus possible without analyzing the list addresses in hardware.

<Operation of this embodiment>
Next, the operation of the present embodiment will be described using the source program shown in FIG. 3 as an example.

The source program shown in FIG. 3 is an example of a DO loop in Fortran source code. In array A on the left-hand side of expression 3-1, the element in the first dimension is specified by the list vector L. The compiler 100 translates the source program shown in FIG. 3 to the machine language (instruction word) level.

FIG. 4 shows part of the translation result for the source program of FIG. 3. Expression 3-1 of FIG. 3 is translated into 4-1 to 4-5 of FIG. 4, and expression 3-2 of FIG. 3 is translated into 4-6 to 4-7 of FIG. 4.

In Patent Document 1, when the vector scatter instruction 4-5 is generated for expression 3-1, the entire array A is set as the accessed range. The memory access area therefore overlaps that of the load instruction 4-6 for array A generated for the subsequent expression 3-2, and overtaking is judged impossible. In the present embodiment, when the vector scatter instruction 4-5 is generated for expression 3-1, the list vector analysis unit 120 performs the following analysis on array A.

First, the list vector analysis unit 120 obtains the following information from the declaration of the array.
(a) Base address of the array
(b) Size of one element of the array
(c) Number of dimensions of the array
(d) Information on each dimension (number of elements, distance between elements, lower bound)

Next, using the information obtained above, the list vector analysis unit 120 obtains the start address and end address of the first dimension for each iteration of the loop as follows.
Start address = base address + ((number of elements in the first dimension × element size) × (index of the second dimension − 1))
End address = start address + (number of elements in the first dimension × element size)
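The two formulas can be worked through with concrete numbers. The sketch below is only an illustration: the `ArrayInfo` record keeps just the declaration items the formulas use, and the base address, element size, and element count are made-up example values rather than values from the patent figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayInfo:
    """Subset of the declaration information used by the two formulas."""
    base_address: int  # (a) base address of the array
    element_size: int  # (b) size of one element in bytes
    dim1_count: int    # (d) number of elements in the first dimension


def list_access_range(info, second_dim_index):
    """Start and end address of the first dimension for one value of the second-dimension index."""
    span = info.dim1_count * info.element_size
    start = info.base_address + span * (second_dim_index - 1)
    end = start + span
    return start, end


# Hypothetical array: base address 0x1000, 8-byte elements, 100 elements in the first dimension.
info = ArrayInfo(base_address=0x1000, element_size=8, dim1_count=100)

start_j2, end_j2 = list_access_range(info, 2)
start_j3, end_j3 = list_access_range(info, 3)
print(start_j2, end_j2)  # 4896 5696
print(start_j3, end_j3)  # 5696 6496
```

With these values the end address for J = 2 equals the start address for J = 3, so the formula yields an address one past the last element of the column. Treating the range as half-open is an interpretation made for this example.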

The vector gather/scatter instruction generation unit 112 sets the start address and end address calculated by the list vector analysis unit 120 in the instruction word of the vector scatter instruction 4-5 generated for array A.

FIG. 5 shows the memory image of array A when J = 3 in the outer loop of FIG. 3. The memory area that is list-accessed in the J = 3 iteration of the outer loop is the hatched range labeled A(L(I), 3) in FIG. 5. The start address and end address calculated by the list vector analysis unit 120 are therefore the addresses of the beginning and end of the hatched range in FIG. 5. For expression 3-2 of FIG. 3, the load instruction 4-6 for A(I, J-1) is generated; when J = 3, the load instruction 4-6 accesses the range enclosed by the dotted line labeled A(I, 2) in FIG. 5. Because this range does not overlap the hatched, list-accessed range, the load instruction 4-6 for A(I, 2) can overtake the preceding vector scatter instruction 4-5 for A(L(I), 3).

As described above, the list vector analysis unit 120 obtains the start and end of the list-accessed addresses for each iteration of the loop, and the vector gather/scatter instruction generation unit 112 sets them in the instruction word of the vector scatter instruction 4-5.

The list address overlap determination unit 114 obtains the start address and end address set above from the instruction word of the vector scatter instruction 4-5, and determines whether that range overlaps the range accessed by the subsequent load instruction 4-6. In the loop of FIG. 3, as shown in FIG. 5, the load instruction 4-6 for array A corresponding to expression 3-2 accesses the range where the second dimension of array A is J-1, whereas the vector scatter instruction 4-5 corresponding to expression 3-1 accesses the range where the second dimension of array A is J. The list address overlap determination unit 114 therefore determines that there is no definition-reference dependency on array A in this loop, and that the load instruction 4-6 corresponding to expression 3-2 can overtake the vector scatter instruction 4-5 corresponding to expression 3-1. When no overlap is found with the memory area accessed by the subsequent load instruction, the list address overlap determination unit 114 sets the overtaking bit V in the instruction word of the vector scatter instruction 4-5 to "1". When the presence or absence of overlap cannot be determined from the array information that is statically available at compile time, it sets the overtaking bit V of the instruction word to "0".
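The decision on the overtaking bit V might be sketched as follows. This is an assumption-laden illustration, not the patent's implementation: it models only a scatter to column J followed by a load from column J + `load_column_offset` of the same column-major array, uses `None` to stand for declaration information that is not statically known, and all names are hypothetical.

```python
def column_range(base, elem_size, n_rows, j):
    """Half-open byte range covered by column j (1-based) of a column-major array."""
    span = n_rows * elem_size
    start = base + span * (j - 1)
    return start, start + span


def decide_overtake_bit(base, elem_size, n_rows, j_values, load_column_offset):
    """Return 1 only when non-overlap is established for every iteration, otherwise 0."""
    if base is None or elem_size is None or n_rows is None:
        return 0  # array information not available at compile time
    for j in j_values:
        s_start, s_end = column_range(base, elem_size, n_rows, j)
        l_start, l_end = column_range(base, elem_size, n_rows, j + load_column_offset)
        if s_start < l_end and l_start < s_end:
            return 0  # the store and load ranges overlap in this iteration
    return 1


# Scatter to A(L(I), J) followed by a load of A(I, J-1), for J = 2..4 (example values).
print(decide_overtake_bit(0x1000, 8, 4, range(2, 5), -1))
# A load of A(I, J) from the same column as the scatter.
print(decide_overtake_bit(0x1000, 8, 4, range(2, 5), 0))
```

The first call yields 1 because column J and column J-1 never share addresses. The second yields 0, which leaves the final decision to the run-time comparison against the registers named by Smin and Smax.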

The hardware (vector computer) performs overtaking control using the information in the instruction word of the list vector instruction generated by the above method. The hardware first checks the overtaking bit V that the compiler set in the instruction word; if the value of V is "1", it performs overtaking control for that instruction unconditionally. If the value of V is "0", the hardware obtains the start address and end address of the access range from the registers designated by the Smin and Smax fields of the instruction word, compares the address range of the vector scatter instruction with the addresses accessed by the subsequent load instruction to decide whether overtaking is permitted, and performs overtaking control when it is.
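The hardware-side decision can be modeled in software roughly as below. This is only a behavioral sketch: the class `ScatterInstruction`, the dictionary used as a register file, the register numbers, and the addresses are hypothetical, and a real vector computer would implement the comparison in its overtaking control logic rather than in code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ScatterInstruction:
    """Hypothetical model of the relevant instruction word fields."""
    v: int      # overtaking bit V set by the compiler
    smin: int   # number of the register holding the start address
    smax: int   # number of the register holding the end address


def may_overtake(scatter: ScatterInstruction,
                 registers: Dict[int, int],
                 load_start: int,
                 load_end: int) -> bool:
    """Decide whether a subsequent load may overtake the scatter."""
    if scatter.v == 1:
        # The compiler already proved there is no overlap.
        return True
    # V == 0: read the access range from the registers named by Smin / Smax
    # and compare it with the (inclusive) range accessed by the load.
    start = registers[scatter.smin]
    end = registers[scatter.smax]
    return load_end < start or end < load_start


# Hypothetical register contents: scatter range 7296..8095.
regs = {10: 7296, 11: 8095}
insn = ScatterInstruction(v=0, smin=10, smax=11)
ok_disjoint = may_overtake(insn, regs, 6496, 7295)
ok_overlap = may_overtake(insn, regs, 8000, 8100)
ok_forced = may_overtake(ScatterInstruction(v=1, smin=10, smax=11), regs, 8000, 8100)
```

The last call shows why V = 1 must only be set when the compiler is certain: with V = 1 the range comparison is skipped entirely.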

The case of the vector scatter instruction has been described using the source program shown in FIG. 3 as an example. The present invention processes the vector gather instruction in the same way: the compiler 100 determines whether the vector gather instruction can overtake a preceding memory store instruction, and sets the information the hardware needs for its own determination in the instruction word.

<Effect of this embodiment>
In Patent Document 1, for the source program shown in FIG. 3, the vector scatter instruction 4-5 corresponding to Equation 3-1 is given an access range extending from the start address to the end address of array A. The areas accessed by the vector scatter instruction 4-5 and the load instruction 4-6 therefore lie in the same array A and are regarded as overlapping, so overtaking is judged impossible even though the two instructions actually access areas whose addresses do not overlap. In the present embodiment, by contrast, the analysis of the memory access range for each loop iteration makes overtaking control possible even for instructions that Patent Document 1 judges unable to overtake. This reduces the loss of memory access performance caused by load instructions waiting for store instructions, and speeds up program execution. In addition, because the compiler determines whether addresses overlap, fewer hardware resources are consumed than when the hardware performs the same processing, and the determination is not subject to the physical limits of hardware resources. A further effect is that optimizations that change the order of instructions, such as instruction reordering, become easier to apply.
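The difference between the whole-array range of Patent Document 1 and the per-iteration range of this embodiment can be made concrete with a small sketch. The array shape (100 x 100 elements of 8 bytes), the base address, and the helper names are hypothetical values chosen for illustration and are not taken from FIG. 3; column-major layout is assumed, as in Fortran.

```python
from typing import Tuple

Range = Tuple[int, int]  # inclusive (start_address, end_address)

# Hypothetical layout of array A(ROWS, COLS), column-major.
BASE, ELEM, ROWS, COLS = 0x1000, 8, 100, 100


def overlaps(a: Range, b: Range) -> bool:
    """True when two inclusive address ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def whole_array_range() -> Range:
    """Range from the first to the last byte of A (whole-array granularity)."""
    return (BASE, BASE + ROWS * COLS * ELEM - 1)


def column_range(j: int) -> Range:
    """Range covering column j (1-based) of A, i.e. one loop iteration."""
    start = BASE + (j - 1) * ROWS * ELEM
    return (start, start + ROWS * ELEM - 1)


j = 5
load = column_range(j - 1)                          # load reads A(:, J-1)
coarse = overlaps(whole_array_range(), load)        # scatter range = all of A
fine = overlaps(column_range(j), load)              # scatter range = column J only
```

Under these assumptions `coarse` is true, so overtaking would be refused, while `fine` is false, so the load may overtake the scatter.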

[Second Embodiment]
Next, a second embodiment of the present invention will be described.

In the first embodiment, the list vector analysis unit 120 acquires the information needed to calculate the start address and end address of the memory accessed by a list vector instruction from the declaration information of the array. In some cases, however, such as Fortran assumed-shape arrays, the array information is unknown at compile time and becomes known only at run time. In the present embodiment, when the necessary information cannot be obtained, the compiler 100 does not have the list vector analysis unit 120 calculate the addresses of the access area; instead, it generates instructions (machine code) that calculate those values at run time and adds them to the object code. Because the address values are not fixed at compile time, the list address overlap determination unit 114 sets the overtaking bit V to "0". The contents of the registers designated by Smin and Smax are set to, for example, NULL values. Specifically, the information about the structure of the array described in paragraph [0033], namely (a) the base address of the array, (b) the size of one element of the array, (c) the number of dimensions of the array, and (d) the information on each dimension (number of elements, distance between elements, lower bound), is stored at run time in a data structure called an array descriptor. The list vector analysis unit 120 of the compiler 100 is therefore configured to generate instructions that acquire the information needed to calculate the start address and end address of the memory accessed by the list vector instruction from the array descriptor, calculate the start address and end address using, for example, a formula similar to the one described in paragraph [0033], and store them in the registers designated by the Smin and Smax fields of the instruction word of the list vector instruction.
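What the generated run-time instructions compute can be sketched in Python as follows. The formula of paragraph [0033] is not reproduced in this section, so the calculation below is an assumption: it takes the dimensions indexed by the list vector to span their full extent and the remaining dimensions to be fixed by the loop iteration. The class names, field names, and sample values are hypothetical, and positive strides are assumed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Dim:
    extent: int   # number of elements in this dimension
    stride: int   # distance between consecutive elements, in bytes
    lower: int    # lower bound of the index


@dataclass
class ArrayDescriptor:
    """Hypothetical run-time descriptor holding items (a) to (d)."""
    base: int          # (a) base address of the array
    elem_size: int     # (b) size of one element, in bytes
    dims: List[Dim]    # (c) number of dimensions = len(dims), (d) per-dimension info


def access_range(desc: ArrayDescriptor, fixed: Dict[int, int]) -> Tuple[int, int]:
    """Return inclusive (start, end) addresses for one loop iteration.

    `fixed` maps a dimension number (0-based) to the index value fixed in this
    iteration; every other dimension is assumed to be reachable in full through
    the list vector.
    """
    start = end = desc.base
    for k, d in enumerate(desc.dims):
        if k in fixed:
            offset = (fixed[k] - d.lower) * d.stride
            start += offset
            end += offset
        else:
            end += (d.extent - 1) * d.stride
    return start, end + desc.elem_size - 1


# Hypothetical 100 x 100 array of 8-byte elements, column-major, lower bounds 1.
desc = ArrayDescriptor(base=0x1000, elem_size=8,
                       dims=[Dim(100, 8, 1), Dim(100, 800, 1)])
smin_value, smax_value = access_range(desc, {1: 5})   # second dimension fixed at J = 5
```

In the embodiment the two resulting values would be written to the registers designated by Smin and Smax before the list vector instruction is issued.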

With the instructions that the compiler 100 generates in the list vector analysis unit 120, the start address and end address of the memory accessed by the list vector instruction are calculated and stored in registers when the object program is executed on the vector computer. The overtaking control device of the vector computer refers to those registers to obtain the memory access range of the list vector instruction and performs overtaking control of the instruction. Overtaking control can thus be performed in the same way as when the addresses of the access area are calculated statically at compile time.

[Third Embodiment]
Next, a third embodiment of the present invention will be described.

FIG. 6 is a block diagram of a computer according to the present embodiment. The computer 200 according to the present embodiment is configured to generate object code 202 for a vector computer from a source program 201. The computer 200 includes a loop analysis unit 210 and a code generation unit 220.

The loop analysis unit 210 is configured to analyze the source program 201 and recognize a loop control statement that repeatedly processes a list vector.

The code generation unit 220 is configured to translate the loop control statement recognized by the loop analysis unit 210 into a series of instruction lists including a list vector instruction. The code generation unit 220 is also configured to calculate, for each iteration of the loop of the loop control statement, the start address and end address of the memory accessed by the list vector instruction, and to set the start address and end address in the instruction word of the list vector instruction.

The computer 200 has a processor such as a CPU and a memory, and implements the loop analysis unit 210 and the code generation unit 220 on the computer by executing a program recorded on a hard disk or the like.

The computer 200 according to the present embodiment, configured as described above, operates as follows. When the source program 201 is input to the computer 200, the loop analysis unit 210 first analyzes the source program 201 and recognizes a loop control statement that repeatedly processes a list vector. Next, the code generation unit 220 translates the loop control statement recognized by the loop analysis unit 210 into a series of instruction lists including a list vector instruction. In doing so, the code generation unit 220 calculates, for each iteration of the loop of the loop control statement, the start address and end address of the memory accessed by the list vector instruction, and sets the start address and end address in the instruction word of the list vector instruction.

As described above, the present embodiment enables appropriate overtaking control. This is because the start address and end address of the memory accessed by the list vector instruction are calculated for each iteration of the loop of the loop control statement and are set in the instruction word of the list vector instruction.

The present invention can be applied to compilers, and is particularly useful for a compiler that translates a source program to generate an object program for a vector computer.

100 ... Compiler
101 ... Source program
102 ... Instruction word
110 ... In-loop instruction generation processing unit
111 ... Preceding instruction generation unit
112 ... Vector gather/scatter instruction generation unit
113 ... Subsequent instruction generation unit
114 ... List address overlap determination unit
120 ... List vector analysis unit
130 ... Instruction list
200 ... Computer
201 ... Source program
202 ... Object code
210 ... Loop analysis unit
220 ... Code generation unit
OP ... Opcode
V ... Overtaking bit
VR ... Field specifying the vector register that stores the addresses
Smin ... Field specifying the register that stores the start address
Smax ... Field specifying the register that stores the end address

Claims

A compiler that causes a computer, which generates an object program for a vector computer from a source program, to perform
a process of analyzing the source program and translating a loop control statement that repeatedly processes a list vector into a series of instruction lists including a list vector instruction,
wherein, in the process, a start address and an end address of the memory accessed by the list vector instruction are calculated for each iteration of the loop of the loop control statement, and the start address and the end address are set in the instruction word of the list vector instruction.

In the calculation, when an element of a specific dimension of an array is specified by the list vector, the base address of the array, the size of one element of the array, the number of dimensions of the array, and information on each dimension of the array are acquired from the declaration information of the array in the source program, and the start address and the end address are calculated based on the acquired information.
The compiler according to claim 1.

In the process, it is further determined whether the address range from the start address to the end address set in the instruction word of the list vector instruction overlaps the address range of the memory accessed by another memory access instruction included in the instruction list, and, based on the determination result, an overtaking bit, which takes a predetermined value according to whether the address areas of the memory accesses overlap within the program, is set in the instruction word.
The compiler according to claim 1 or 2.

In the determination, when the list vector instruction is a vector gather instruction, the other memory access instruction is a store instruction preceding the list vector instruction.
The compiler according to claim 3.

In the determination, when the list vector instruction is a vector scatter instruction, the other memory access instruction is a load instruction following the list vector instruction.
The compiler according to claim 3.

In the process, if the information required for the calculation cannot be acquired at compile time, instructions are generated that, when the object program is executed, acquire the information required for the calculation, calculate the start address and the end address, and store them in predetermined registers.
The compiler according to any one of claims 1 to 5.

A computer that generates an object program for a vector computer from a source program, the computer comprising:
a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes a list vector; and
a code generation unit that translates the loop control statement into a series of instruction lists including a list vector instruction,
wherein the code generation unit calculates, for each iteration of the loop of the loop control statement, a start address and an end address of the memory accessed by the list vector instruction, and sets the start address and the end address in the instruction word of the list vector instruction.

When an element of a specific dimension of an array is specified by the list vector, the code generation unit acquires the base address of the array, the size of one element of the array, the number of dimensions of the array, and information on each dimension of the array from the declaration information of the array in the source program, and calculates the start address and the end address based on the acquired information.
The computer according to claim 7.

The code generation unit further determines whether the address range from the start address to the end address set in the instruction word of the list vector instruction overlaps the address range of the memory accessed by another memory access instruction included in the instruction list, and, based on the determination result, sets in the instruction word an overtaking bit that takes a predetermined value according to whether the address areas of the memory accesses overlap within the program.
The computer according to claim 7 or 8.

A code generation method executed by a computer that generates an object program for a vector computer from a source program, the method comprising:
analyzing the source program and recognizing a loop control statement that repeatedly processes a list vector; and
translating the loop control statement into a series of instruction lists including a list vector instruction,
wherein, in the translation, a start address and an end address of the memory accessed by the list vector instruction are calculated for each iteration of the loop of the loop control statement, and the start address and the end address are set in the instruction word of the list vector instruction.