JP2019012324A

JP2019012324A - compiler

Info

Publication number: JP2019012324A
Application number: JP2017127370A
Authority: JP
Inventors: 健人岩川; Taketo Iwakawa
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-06-29
Filing date: 2017-06-29
Publication date: 2019-01-24
Anticipated expiration: 2037-06-29
Also published as: JP6907761B2

Abstract

To eliminate the limit on the speedup achievable by a method that vectorizes a loop control statement using a work array. SOLUTION: A computer that generates an object program for a vector computer from a source program comprises: a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes list structure data; and a vectorization execution unit that vectorizes the loop control statement. The vectorization execution unit inserts a first program portion and a second program portion into the object program. The first program portion writes the head address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register. The second program portion is a vectorized form of the loop control statement that uses the head addresses written to the vector register and the number of nodes written to the scalar register. SELECTED DRAWING: Figure 15

Description

The present invention relates to a compiler, a computer, and a code generation method for a computer.

List structure data is data that has, as a member, a pointer to a structure of the same type as itself; it is also called a self-referential structure. List structure data is well suited to processing large amounts of data in which elements are frequently added and deleted.

FIG. 1 is a diagram showing the in-memory layout of a list formed by the list structure data struct A. In FIG. 1, each rectangle represents one structure 1-1. A structure is also called a node. The structure in this example has the double members x, y, and z and the pointer member next. first denotes the head address of the section of the list formed by struct A. addr1, ..., addrN denote the head addresses of the individual structures. last denotes the end address of that section.

FIG. 2 is a diagram showing an example of a C language source program having a loop control statement that processes the list structure data struct A. Statement 2-1 is the data declaration of the list structure data struct A, in which struct A refers to itself recursively. Statement 2-2 is a loop control statement that follows the list and repeatedly processes the list structure data struct A.
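FIG. 2 itself is not reproduced in this text, so the following is only a sketch of the kind of declaration and loop that statements 2-1 and 2-2 describe. The member names come from the description of FIG. 1; the function name `scale_x`, the loop body, and the use of NULL as the end of the list are assumptions made for illustration.

```c
#include <stddef.h>

/* Self-referential structure: three doubles and a pointer to the next node. */
struct A {
    double x, y, z;
    struct A *next;
};

/* Follow the list from first and update every node.
 * The body (scaling x) is illustrative, not taken from FIG. 2.
 * Returns the number of nodes visited. */
size_t scale_x(struct A *first, double factor)
{
    size_t visited = 0;
    for (struct A *p = first; p != NULL; p = p->next) {
        p->x = p->x * factor;
        visited++;
    }
    return visited;
}
```

Each iteration needs `p->next` from the previous iteration before it can start, which is why a loop of this shape cannot be vectorized as written.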

A compiler that can speed up a program containing a loop control statement that processes list structure data struct A, such as the one shown in FIG. 2, is described in, for example, Patent Document 1. FIG. 13 is a diagram showing the result of the transformation that the compiler described in Patent Document 1 applies to the loop control statement. Referring to FIG. 13, the compiler described in Patent Document 1 vectorizes a loop containing list structure data by transforming the loop control statement 13-1 into statements 13-2 to 13-5. Statement 13-2 is a control statement that counts the number of nodes of the list structure data struct A. Statement 13-3 is a declaration that reserves a work array in memory. Statement 13-4 is a control statement that registers the head address of each node of the list structure data struct A in the reserved work array. Statement 13-5 is a loop control statement that performs processing equivalent to the original loop control statement 13-1, using a subscript to refer to the elements of the work array.
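The four steps attributed to statements 13-2 to 13-5 can be sketched in C as follows. FIG. 13 is not reproduced in this text, so the function name `sum_with_work_array`, the loop body `x = y + z`, and the use of `malloc` for the work array are assumptions; only the order of the steps follows the description above.

```c
#include <stddef.h>
#include <stdlib.h>

struct A {
    double x, y, z;
    struct A *next;
};

/* Work-array form of:
 *   for (p = first; p != NULL; p = p->next) p->x = p->y + p->z;
 * Returns 0 on success, -1 if the work array cannot be allocated. */
int sum_with_work_array(struct A *first)
{
    /* Step corresponding to 13-2: count the nodes. */
    size_t n = 0;
    for (struct A *p = first; p != NULL; p = p->next)
        n++;
    if (n == 0)
        return 0;

    /* Step corresponding to 13-3: reserve the work array in memory. */
    struct A **work = malloc(n * sizeof *work);
    if (work == NULL)
        return -1;

    /* Step corresponding to 13-4: register the head address of each node. */
    size_t i = 0;
    for (struct A *p = first; p != NULL; p = p->next)
        work[i++] = p;

    /* Step corresponding to 13-5: subscripted loop equivalent to the original.
     * No iteration depends on a pointer loaded by the previous one. */
    for (i = 0; i < n; i++)
        work[i]->x = work[i]->y + work[i]->z;

    free(work);
    return 0;
}
```

Every access to `work[i]` in the last loop goes through memory, which is the cost the present invention aims to avoid.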

Patent Document 1: JP 2003-337707 A

The compiler described in Patent Document 1 vectorizes a loop control statement that repeatedly processes list structure data by assigning the head addresses of the list structure data to a work array in memory. However, because the work array in memory is accessed when the program runs, there is a limit to how much the execution speed can be increased.

An object of the present invention is to provide a compiler that solves the problem described above, namely that a method of vectorizing a loop control statement using a work array is limited in the speedup it can achieve.

A compiler according to one aspect of the present invention causes a computer that generates an object program for a vector computer from a source program to execute:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein the vectorizing process inserts into the object program:
a first program portion that writes the head address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the head address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

A computer according to another aspect of the present invention is a computer that generates an object program for a vector computer from a source program, the computer including:
a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes list structure data; and
a vectorization execution unit that vectorizes the loop control statement,
wherein the vectorization execution unit is configured to insert into the object program:
a first program portion that writes the head address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the head address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

A code generation method for a computer according to another aspect of the present invention is a code generation method executed by a computer that generates an object program for a vector computer from a source program, the method including:
analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
vectorizing the loop control statement,
wherein the vectorizing inserts into the object program:
a first program portion that writes the head address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the head address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

With the configuration described above, the present invention makes it possible to execute the loop control statement faster than when the loop control statement is vectorized using a work array.

FIG. 1 is a diagram showing the in-memory layout of a list formed by the list structure data struct A.
FIG. 2 is a diagram showing an example of a C language source program having a loop control statement that processes the list structure data struct A.
FIG. 3 is an explanatory diagram of the vector register that stores the head addresses of the list structure data to be vectorized and of the processing of the vector collection instruction.
FIG. 4 is an explanatory diagram of the vector register that stores the head addresses of the list structure data to be vectorized and of the processing of the vector spread instruction.
FIG. 5 is a diagram showing a configuration example of the compiler according to the first embodiment of the present invention.
FIG. 6 is a diagram showing the in-memory layout of a list formed by list structure data whose section is defined to end with NULL (the terminator), which is the target of the VLIST2 instruction in the first embodiment of the present invention, together with the contents of the vector register.
FIG. 7 is a diagram showing the in-memory layout of a list formed by list structure data that is the target of the VLIST2 instruction in the first embodiment of the present invention, together with the contents of the vector register.
FIG. 8 is a diagram showing an example of an instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 9 is a diagram showing an example of another instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 10 is a diagram showing an example of a further instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 11 is an explanatory diagram of the processing of the mask generation instruction and the mask leading zero count instruction used in the first embodiment of the present invention.
FIG. 12 is a diagram showing an example of yet another instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 13 is a diagram showing an example of an instruction sequence generated from a source program by a compiler related to the present invention.
FIG. 14 is a diagram showing an example of the hardware configuration of a computer that implements the compiler according to the first embodiment of the present invention.
FIG. 15 is a block diagram of a compiler according to a second embodiment of the present invention.

Next, embodiments of the present invention will be described in detail with reference to the drawings.
[First embodiment]
Referring to FIG. 5, the compiler 5 according to the first embodiment of the present invention translates a source program 5-3 and generates object code 5-4. The generated object code is executed on a vector computer. In this way, the compiler 5 generates an object program for the vector computer.

The compiler 5 includes a loop analysis unit 5-1 and a vectorization execution unit 5-2.

The loop analysis unit 5-1 analyzes the source program 5-3 and determines whether a loop containing list structure data can be vectorized. The loop analysis unit 5-1 includes a recognition unit for loops containing list structure data (hereinafter, recognition unit) 5-1-1, a structure analysis unit for list structure data (hereinafter, structure analysis unit) 5-1-2, and a vectorization determination unit 5-1-3.

The recognition unit 5-1-1 detects loops containing list structure data in the source program 5-3. The structure analysis unit 5-1-2 analyzes the list structure contained in a loop detected by the recognition unit 5-1-1. The vectorization determination unit 5-1-3 determines whether the loop detected by the recognition unit 5-1-1 can be vectorized.

The vectorization execution unit 5-2 vectorizes the loop based on the analysis result of the loop analysis unit 5-1. The vectorization execution unit 5-2 includes a generation unit for the list structure head address and list structure node count acquisition instruction (hereinafter, VLIST instruction generation unit) 5-2-1, a vector length load instruction generation unit 5-2-2, a vector collection instruction generation unit 5-2-3, a vector operation instruction generation unit 5-2-4, and a vector spread instruction generation unit 5-2-5.

The VLIST instruction generation unit 5-2-1 generates a list structure head address and list structure node count acquisition instruction (hereinafter, VLIST instruction), a vector instruction that obtains the list of head addresses of the list structure and the number of nodes of the list structure data. The vector length load instruction generation unit 5-2-2 generates a vector length load instruction that sets the number of nodes obtained by the VLIST instruction as the vector length. The vector collection instruction generation unit 5-2-3 generates a vector collection instruction that loads the list structure data in memory into a vector register. The vector operation instruction generation unit 5-2-4 generates a vector operation instruction that operates on the data loaded into the vector register and stores the result in a vector register. The vector spread instruction generation unit 5-2-5 generates a vector spread instruction that stores the result held in the vector register into the list structure data in memory.

The compiler 5 can be implemented by, for example, an information processing apparatus 14-1 such as a personal computer and a program 14-2, as shown in FIG. 14. The information processing apparatus 14-1 has an operation input unit 14-1-1 such as a keyboard and a mouse, a screen display unit 14-1-2 such as a liquid crystal display, a communication interface unit 14-1-3, a storage unit 14-1-4 such as a memory and a hard disk, and an arithmetic processing unit 14-1-5 such as one or more microprocessors. The program 14-2 is read from an external computer-readable storage medium into the storage unit 14-1-4, for example when the information processing apparatus 14-1 starts up, and implements the compiler 5 on the arithmetic processing unit 14-1-5 by controlling the operation of the arithmetic processing unit 14-1-5.

<Description of this embodiment>
Next, the operation of the compiler 5 will be described with reference to FIGS. 1 to 12.

As described above, FIG. 2 shows a C language source program having a loop that contains list structure data. The loop analysis unit 5-1 of the compiler 5 analyzes a source program 5-3 having such a loop.

First, the recognition unit 5-1-1 detects, in the source program 5-3, a loop that contains list structure data, such as the loop control statement 2-2 in FIG. 2.

Next, the structure analysis unit 5-1-2 analyzes the list structure data contained in the detected loop. This analysis determines where, within the list structure data, the pointer to the next node is located. In the list structure data shown in FIG. 2, if a double is 8 bytes, next, which follows x, y, and z, is located 24 bytes from the head address of the node. That is, in the list structure data shown in FIG. 2, the pointer to the next node is at an offset of 24 bytes from the head of the node.
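The 24-byte figure can be checked with the standard `offsetof` macro. The sketch below assumes the member order x, y, z, next described for FIG. 1 and an 8-byte double; the helper names `next_offset` and `load_next` are hypothetical and are not part of the compiler described here.

```c
#include <stddef.h>
#include <string.h>

struct A {
    double x, y, z;
    struct A *next;
};

/* Byte offset of the next pointer from the head of a node.
 * With 8-byte doubles this is 3 * 8 = 24. */
size_t next_offset(void)
{
    return offsetof(struct A, next);
}

/* Read the next pointer knowing only a node's head address and the offset,
 * which is the information the structure analysis passes on. */
struct A *load_next(const void *node, size_t off)
{
    struct A *next;
    memcpy(&next, (const char *)node + off, sizeof next);
    return next;
}
```

Given only a head address and this offset, the whole list can be followed without knowing the rest of the node layout.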

Next, the vectorization determination unit 5-1-3 of the compiler 5 determines whether the detected loop can be vectorized. One condition for vectorization is that the definition and reference relationships among the list structure data, arrays, and variables in the loop contain no dependence that prevents vectorization.

When the loop analysis unit 5-1 has finished, the vectorization execution unit 5-2 takes over. The vectorization execution unit 5-2 vectorizes a loop containing list structure data that the loop analysis unit 5-1 has determined to be vectorizable, as follows.

First, the VLIST instruction generation unit 5-2-1 generates a VLIST instruction that loads into registers the list of head addresses of the list structure data to be vectorized and the number of nodes of the list structure data. When this VLIST instruction is executed, the head addresses of the list structure data to be vectorized are stored in a vector register. For example, when the VLIST instruction is executed on the list structure data in the section from first to last shown in FIG. 3, the list of head addresses of the list structure data is stored in the vector register 3-1, as shown in FIG. 3. The VLIST instruction also stores the number of nodes of the list structure data in a scalar register (not shown). The VLIST instruction is described in detail later.

Next, the vector length load instruction generation unit 5-2-2 generates a vector length load instruction that sets the number of nodes obtained by the VLIST instruction as the vector length.

Next, the vector collection instruction generation unit 5-2-3 generates a vector collection instruction. A vector collection instruction loads data from memory such that the memory data at the address held in each element of one vector register is stored in the corresponding element of another vector register (a gather operation). The vector collection instruction generation unit 5-2-3 generates a vector collection instruction that specifies the vector register 3-1, which holds the list of head addresses of the list structure data obtained by the VLIST instruction. As shown in FIG. 3, this produces a vector collection instruction that loads the memory data at the address held in each element of the vector register 3-1 from the memory 3-2 into the vector register 3-3.

Next, the vector operation instruction generation unit 5-2-4 generates a vector operation instruction that performs a vector operation on the data loaded into the vector register 3-3 by the vector collection instruction and stores the result in another vector register.

Next, the vector spread instruction generation unit 5-2-5 generates a vector spread instruction. A vector spread instruction stores data to memory such that the data in each element of one vector register is stored to the memory area at the address held in the corresponding element of another vector register (a scatter operation). The vector spread instruction generation unit 5-2-5 generates a vector spread instruction that specifies the vector register holding the result of the vector operation instruction and the vector register holding the list of head addresses of the list structure data. As shown in FIG. 4, this produces a vector spread instruction that stores the operation results held in the vector register 4-3 to the areas of the memory 4-2 addressed by the head addresses of the list structure data held in the vector register 4-1.
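The sequence described above (VLIST, vector length, collection, operation, spread) can be emulated in scalar C to show how the data flows. This is a sketch only: local arrays stand in for vector registers, `MAX_VL` is a hypothetical register length, the operation `x = y + z` is illustrative, and the per-member byte offset passed to `gather` and `scatter` is an assumption about how individual members would be addressed.

```c
#include <stddef.h>
#include <string.h>

#define MAX_VL 256 /* hypothetical vector register length */

struct A {
    double x, y, z;
    struct A *next;
};

/* Vector collection (gather): dst[i] = double stored at addr[i] + off. */
void gather(double *dst, struct A *const *addr, size_t vl, size_t off)
{
    for (size_t i = 0; i < vl; i++)
        memcpy(&dst[i], (const char *)addr[i] + off, sizeof(double));
}

/* Vector spread (scatter): double at addr[i] + off = src[i]. */
void scatter(const double *src, struct A *const *addr, size_t vl, size_t off)
{
    for (size_t i = 0; i < vl; i++)
        memcpy((char *)addr[i] + off, &src[i], sizeof(double));
}

/* Computes x = y + z for up to MAX_VL nodes of a NULL-terminated list.
 * Returns the number of nodes processed (the vector length). */
size_t vectorized_add(struct A *first)
{
    struct A *v_addr[MAX_VL];                  /* head addresses (VLIST) */
    double v_y[MAX_VL], v_z[MAX_VL], v_x[MAX_VL];
    size_t vl = 0;                             /* node count = vector length */

    /* First program portion: collect head addresses and count nodes. */
    for (struct A *p = first; p != NULL && vl < MAX_VL; p = p->next)
        v_addr[vl++] = p;

    /* Second program portion: gather, operate, scatter. */
    gather(v_y, v_addr, vl, offsetof(struct A, y));
    gather(v_z, v_addr, vl, offsetof(struct A, z));
    for (size_t i = 0; i < vl; i++)
        v_x[i] = v_y[i] + v_z[i];
    scatter(v_x, v_addr, vl, offsetof(struct A, x));
    return vl;
}
```

Unlike the work-array approach, the address list here never has to be written to and re-read from a separately allocated array; on the target machine it would stay in a vector register.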

Next, examples of the VLIST instruction are described in detail.

<Example 1 of the VLIST instruction>
The VLIST instruction of the first example (hereinafter, VLIST1 instruction) specifies a total of five registers, %v0, %s0, %s1, %s2, and %s3, in predetermined operand fields, as follows.
VLIST1 %v0, %s0, %s1, %s2, %s3

%v0 is the vector register to which the head address of each node of the list structure data is written. %s0 is the scalar register to which the number of nodes of the list structure data is written. %s1 is a scalar register that specifies the head address of the section of list structure data targeted by the instruction (first in FIG. 1). %s2 is a scalar register that specifies the end address of the section of list structure data targeted by the instruction (last in FIG. 1). %s3 is a scalar register that specifies the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis unit 5-1-2 for the list structure data targeted by the instruction.

In a vector computer whose instruction set includes the VLIST1 instruction, when the VLIST1 instruction is issued, two processes are performed as the processing of a single instruction: obtaining the head address of each node of the list structure data identified by first, last, and next, as specified by the scalar registers %s1, %s2, and %s3, and storing those addresses in the vector register %v0; and counting the nodes and storing the count in the scalar register %s0.
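As an aid to reading the operand description, the effect of VLIST1 can be emulated by a scalar C function. This is not the hardware definition: the function name `vlist1`, the register length `MAX_VL`, and the treatment of last as the head address of the final node (included in the result) are all assumptions, since the text does not state whether last is inclusive.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_VL 256 /* hypothetical vector register length */

struct A {
    double x, y, z;
    struct A *next;
};

/* Emulates VLIST1 %v0, %s0, %s1, %s2, %s3.
 *   v0       : receives the head address of each node   (%v0)
 *   first    : head address of the section              (%s1)
 *   last     : head address of the final node, assumed
 *              to be included in the result             (%s2)
 *   next_off : byte offset of the next pointer          (%s3)
 * Returns the node count, i.e. the value written to %s0.
 * Assumes a pointer and uintptr_t have the same size and representation. */
size_t vlist1(uintptr_t *v0, uintptr_t first, uintptr_t last, size_t next_off)
{
    size_t n = 0;
    uintptr_t p = first;
    while (p != 0 && n < MAX_VL) {
        v0[n++] = p;
        if (p == last)
            break;
        /* Follow the next pointer stored next_off bytes into the node. */
        memcpy(&p, (const void *)(p + next_off), sizeof p);
    }
    return n;
}
```

For a four-node list with last set to the third node, the function fills three elements of `v0` and returns 3, leaving the fourth node outside the section.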

<Example 2 of the VLIST instruction>
The VLIST instruction of the second example (hereinafter, VLIST2 instruction) specifies a total of four registers, %v0, %s0, %s1, and %s2, in predetermined operand fields, as follows.
VLIST2 %v0, %s0, %s1, %s2

%v0 is the vector register to which the head address of each node of the list structure data is written. %s0 is the scalar register to which the number of nodes of the list structure data is written. %s1 is a scalar register that specifies the head address of the section of list structure data targeted by the instruction (first in FIG. 1). %s2 is a scalar register that specifies the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis unit 5-1-2 for the list structure data targeted by the instruction. Unlike the VLIST1 instruction of the first example, the VLIST2 instruction has no operand that specifies the end address of the section of list structure data targeted by the instruction (last in FIG. 1). This is because the VLIST2 instruction defines the section as ending with NULL (the terminator), as indicated by NULL at 6-1 in FIG. 6. The VLIST2 instruction thus reduces the number of operands by dropping the scalar register that specifies the end address of the section.

In a vector computer whose instruction set includes the VLIST2 instruction, when the VLIST2 instruction is issued, two processes are performed as the processing of a single instruction: obtaining the head address of each node of the list structure data identified by first and next, as specified by the scalar registers %s1 and %s2, and storing those addresses in the vector register %v0; and counting the nodes and storing the count in the scalar register %s0.
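A scalar emulation of VLIST2 differs from one for VLIST1 only in how the walk ends: it stops when the next pointer is NULL rather than at a given end address. The function name `vlist2` and the register length `MAX_VL` are hypothetical, and the behavior for lists longer than `MAX_VL` (truncation) is an assumption the text does not address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_VL 256 /* hypothetical vector register length */

struct A {
    double x, y, z;
    struct A *next;
};

/* Emulates VLIST2 %v0, %s0, %s1, %s2.
 *   v0       : receives the head address of each node  (%v0)
 *   first    : head address of the section             (%s1)
 *   next_off : byte offset of the next pointer         (%s2)
 * The section ends at the node whose next pointer is NULL.
 * Returns the node count, i.e. the value written to %s0.
 * Assumes a pointer and uintptr_t have the same size and representation. */
size_t vlist2(uintptr_t *v0, uintptr_t first, size_t next_off)
{
    size_t n = 0;
    for (uintptr_t p = first; p != 0 && n < MAX_VL; ) {
        v0[n++] = p;
        memcpy(&p, (const void *)(p + next_off), sizeof p);
    }
    return n;
}
```

The saved operand comes at the cost of generality: this form can only process a list up to its natural end, not an arbitrary sub-section.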

<Example 3 of the VLIST instruction>
The VLIST instruction of the third example (hereinafter, VLIST3 instruction) specifies a total of four registers, %v0, %s0, %s1, and %s2, in predetermined operand fields, as follows.
VLIST3 %v0, %s0, %s1, %s2

%v0 is the vector register to which the head address of each node of the list structure data is written. %s0 is a scalar register that specifies the head address of the section of list structure data (first in FIG. 1). %s1 is a scalar register that specifies the end address of the section of list structure data (last in FIG. 1). %s2 is a scalar register that specifies the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis unit 5-1-2 for the list structure data. Unlike VLIST1 and VLIST2, VLIST3 has no operand that specifies a scalar register to which the number of nodes of the list structure data is written.

In a vector computer whose instruction set includes the VLIST3 instruction, when the VLIST3 instruction is issued, two processes are performed as the processing of a single instruction: obtaining the head address of each node of the list structure data identified by first, last, and next, as specified by the scalar registers %s0, %s1, and %s2, and storing those addresses in the vector register %v0 (7-1) as shown in FIG. 7; and setting the element following the last address to NULL (the terminator). The element of the vector register 7-1 that follows the last address is set to NULL so that it is possible to tell how many elements of the vector register 7-1 hold addresses. The number of nodes is calculated by separate instructions that follow the VLIST3 instruction (details are described later with reference to FIG. 10).
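The split between writing the addresses and counting them can be sketched as two scalar C functions. `vlist3` emulates the instruction as described above; `count_nodes` stands in for the follow-up instructions and simply finds the position of the first NULL element. Both names, the register length `MAX_VL`, the inclusive treatment of last, and the reservation of one element for the terminator are assumptions; the actual counting sequence is the one described with FIG. 10 and is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_VL 256 /* hypothetical vector register length */

struct A {
    double x, y, z;
    struct A *next;
};

/* Emulates VLIST3 %v0, %s0, %s1, %s2: writes the head addresses to v0 and
 * then writes NULL (0) to the element after the last address.
 * At most MAX_VL - 1 addresses are written so the terminator always fits.
 * Assumes a pointer and uintptr_t have the same size and representation. */
void vlist3(uintptr_t *v0, uintptr_t first, uintptr_t last, size_t next_off)
{
    size_t n = 0;
    uintptr_t p = first;
    while (p != 0 && n < MAX_VL - 1) {
        v0[n++] = p;
        if (p == last)
            break;
        memcpy(&p, (const void *)(p + next_off), sizeof p);
    }
    v0[n] = 0; /* terminator marks how many elements hold addresses */
}

/* Stand-in for the follow-up instructions: the node count is the index of
 * the first NULL element in v0. */
size_t count_nodes(const uintptr_t *v0)
{
    size_t n = 0;
    while (n < MAX_VL && v0[n] != 0)
        n++;
    return n;
}
```

Calling `vlist3` and then `count_nodes` yields the same count that VLIST1 would have written to its scalar register, at the cost of an extra step after the instruction.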

<Example 4 of VLIST instruction>
The VLIST instruction of the fourth example (hereinafter referred to as VLIST4) is configured to specify a total of three registers, %v0, %s0, and %s1, in predetermined fields of its operands, as follows.
VLIST4 instruction %v0, %s0, %s1

%v0 is a vector register into which each head address of the list structure data is written. %s0 is a scalar register that points to the head address of the section of the list structure data (first in FIG. 1). %s1 is a scalar register that points to the position of the pointer to the next data in the list structure (next in FIG. 1), as acquired by the structure analysis means 5-1-2 for list structure data. The VLIST4 instruction is obtained from the VLIST3 instruction of the third example by removing the scalar register operand that points to the end address of the section (last in FIG. 1). As with the VLIST2 instruction of the second example, the VLIST4 instruction reduces the number of operands by fixing the end of the section as NULL (termination character).

In a vector computer whose instruction set includes the VLIST4 instruction, issuing the VLIST4 instruction causes two processes to be performed as the processing of a single instruction: the head address of each node of the list structure data identified by first and next, which are specified by the scalar registers %s0 and %s1, is acquired and stored in the vector register %v0, and the element following the last address is set to NULL (termination character). As with the VLIST3 instruction, the number of nodes is calculated by other instructions that follow the VLIST4 instruction.

Next, examples of the instruction sequences generated when the source program shown in FIG. 2 is compiled are described with reference to FIGS. 8 to 10 and FIG. 12.

<Example 1 of instruction sequence>
FIG. 8 shows the instruction sequence generated when the VLIST1 instruction is used. The first VLIST1 instruction loads the list of head addresses of the nodes of the list structure data into the vector register vreg1 and loads the number of nodes into the scalar register sreg1. The next LVL instruction (vector length load instruction) loads the vector length, with the number of nodes specified by the scalar register sreg1. The next VGT instruction (vector gather instruction) specifies the vector register vreg1 and loads the x data from memory into the vector register vreg2. The next VADD instruction (vector addition instruction) stores the list of addresses of y in the vector register vreg3. That is, because y is located immediately after x, and assuming the double type is 8 bytes, y is located at byte offset 8 from the head of struct A. The VADD instruction therefore adds 8 to the list of head addresses (vreg1) and stores the result in the vector register vreg3. The next VGT instruction specifies the vector register vreg3 and loads the y data from memory into the vector register vreg4. The next VMUL instruction (vector multiplication instruction) multiplies x and y, loaded in the vector registers vreg2 and vreg4, and stores the result in the vector register vreg5. The next VADD instruction stores the list of addresses of z in the vector register vreg6. That is, because z is located at byte offset 16 from the head of struct A, 16 is added to the list of head addresses and the result is stored in the vector register vreg6. The next VSC instruction (vector scatter instruction) specifies the vector register vreg6 and stores the multiplication result held in the vector register vreg5 to the position of z in each node.
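The instruction sequence described for FIG. 8 can be traced step by step in software. The following is a minimal sketch, assuming a dict-based memory model, NULL encoded as 0, and the next pointer at byte offset 24; the helper names (`make_nodes`, `vlist1`, `vgt`, `vadd`, `vmul`, `vsc`, `run_figure8`) are hypothetical stand-ins for the instructions named in the text, not an actual instruction set definition.

```python
NULL = 0  # assumed encoding of the termination value


def make_nodes(values, base=0x1000, stride=0x40):
    """Lay out nodes (x at +0, y at +8, z at +16, next at +24) in a toy memory."""
    addrs = [base + i * stride for i in range(len(values))]
    mem = {}
    for i, (addr, (x, y)) in enumerate(zip(addrs, values)):
        mem[addr], mem[addr + 8], mem[addr + 16] = x, y, 0.0
        mem[addr + 24] = addrs[i + 1] if i + 1 < len(addrs) else NULL
    return mem, addrs


def vlist1(mem, first, last, next_offset):
    """Return (node head addresses, node count), as described for VLIST1."""
    addrs, addr = [], first
    while addr != NULL:
        addrs.append(addr)
        if addr == last:
            break
        addr = mem[addr + next_offset]
    return addrs, len(addrs)


def vgt(mem, addr_vec, vl):        # vector gather: load from each address
    return [mem[a] for a in addr_vec[:vl]]


def vadd(vec, imm, vl):            # vector add with a scalar immediate
    return [a + imm for a in vec[:vl]]


def vmul(a, b, vl):                # element-wise vector multiply
    return [p * q for p, q in zip(a[:vl], b[:vl])]


def vsc(mem, addr_vec, data, vl):  # vector scatter: store to each address
    for a, d in zip(addr_vec[:vl], data[:vl]):
        mem[a] = d


def run_figure8(mem, first, last, next_offset=24):
    vreg1, sreg1 = vlist1(mem, first, last, next_offset)
    vl = sreg1                      # LVL: vector length = node count
    vreg2 = vgt(mem, vreg1, vl)     # x
    vreg3 = vadd(vreg1, 8, vl)      # addresses of y
    vreg4 = vgt(mem, vreg3, vl)     # y
    vreg5 = vmul(vreg2, vreg4, vl)  # x * y
    vreg6 = vadd(vreg1, 16, vl)     # addresses of z
    vsc(mem, vreg6, vreg5, vl)      # z = x * y in every node


mem, addrs = make_nodes([(2.0, 3.0), (4.0, 5.0), (1.5, 2.0)])
run_figure8(mem, addrs[0], addrs[-1])
```

After the call, the z field of every node holds the product of that node's x and y, which is the effect the text attributes to the scalar loop being vectorized.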

<Example 2 of instruction sequence>
FIG. 9 shows the instruction sequence generated when the VLIST2 instruction is used. The VLIST2 instruction is used because the end of the list structure data is fixed as NULL (termination character). The rest is the same as in FIG. 8.

<Example 3 of instruction sequence>
FIG. 10 shows the instruction sequence generated when the VLIST3 instruction is used. As described above, when the VLIST3 instruction loads the list of head addresses of the list structure data into the vector register, it sets the element following the last address to NULL (termination character). That is, after the VLIST3 instruction has loaded the head addresses of the list structure data, the vector register looks like the vector register 11-1 in FIG. 11, with NULL as its last element. The instruction sequence of FIG. 10 therefore uses the VFMK.EQ instruction (mask generation instruction) to generate a mask in the vector register 11-2, as shown in FIG. 11. The VFMK.EQ instruction generates, in the mask register specified in a predetermined field of its operands, a mask whose elements are 1 where the corresponding element of the vector register vreg1 (11-1), specified in a predetermined field of its operands, equals NULL, and 0 otherwise. An LZVM instruction (mask leading zero count instruction) that specifies the generated mask follows. The LZVM instruction counts how many consecutive 0s appear from the head of the mask in the vector register 11-2 specified in a predetermined field of its operands, and stores the count, which is the number of nodes, in the scalar register 11-3 (sreg1) of FIG. 11 specified in a predetermined field of its operands. The rest is the same as in FIGS. 8 and 9.
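The way the node count is recovered from the NULL terminator can be sketched as follows. This is an illustrative emulation only: the function names `vfmk_eq` and `lzvm` are hypothetical Python stand-ins for the VFMK.EQ and LZVM instructions named in the text, NULL is assumed to be 0, and the register contents are made-up example addresses.

```python
NULL = 0  # assumed encoding of the termination value


def vfmk_eq(vreg, value=NULL):
    """Mask generation: 1 where the element equals value, 0 otherwise."""
    return [1 if element == value else 0 for element in vreg]


def lzvm(mask):
    """Count the consecutive 0s from the head of the mask."""
    count = 0
    for bit in mask:
        if bit:
            break
        count += 1
    return count


# Vector register contents after VLIST3: three node addresses, then NULL.
vreg1 = [0x1000, 0x1040, 0x1020, NULL]
mreg1 = vfmk_eq(vreg1)   # [0, 0, 0, 1]
sreg1 = lzvm(mreg1)      # number of nodes
print(mreg1, sreg1)
```

Because only the leading run of 0s is counted, whatever the elements after the terminator contain does not affect the result.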

<Example 4 of instruction sequence>
FIG. 12 shows the instruction sequence generated when the VLIST4 instruction is used. The VLIST4 instruction is used because the end of the list structure data is fixed as NULL (termination character). The rest is the same as in FIG. 10.

<Effect of this embodiment>
As described above, the present embodiment makes it possible to vectorize a loop control statement that involves list structure data. Vectorizing the loop in turn improves the execution performance of the program.

In Patent Document 1, memory for a work array is allocated and the head address of each node of the list structure data is stored in it, as in statements 13-3 and 13-4 of FIG. 13. In the present embodiment, by contrast, the head address of each node of the list structure data is loaded directly into a vector register, so no memory needs to be allocated. In addition, because the work array in memory is replaced by a vector register, data transfer is also faster.

Furthermore, in Patent Document 1, the list structure is traversed by two loops, as in statements 13-2 and 13-4 of FIG. 13, once to count the nodes of the list structure data and once to store the head addresses. In the present embodiment, by contrast, the VLIST instruction needs to traverse the list structure only once, which reduces the amount of processing.

The present embodiment is based on the configuration and operation described above, and various additions and modifications are possible. For example, when the vector length exceeds the maximum vector length (the number of elements of a vector register), the compiler 5 handles the loop by processing as many nodes as the maximum vector length and then resuming processing from the node that follows the last node in the vector register into which the head addresses of the list structure data were loaded.
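The handling of lists longer than the maximum vector length can be sketched as a chunked traversal. This is a hedged illustration rather than the compiler's actual generated code: the maximum vector length `MVL`, the dict-based memory model, the NULL-terminated list, the byte offsets, and the names `make_list`, `load_chunk`, and `process_list` are all assumptions, and the loop body z = x * y is borrowed from the instruction sequence described for FIG. 8.

```python
NULL = 0          # assumed encoding of the termination value
NEXT_OFFSET = 24  # assumption: x at +0, y at +8, z at +16, next at +24
MVL = 4           # hypothetical maximum vector length (elements per register)


def make_list(n, base=0x1000, stride=0x20):
    """Build n linked nodes with x = i + 1, y = 2.0, z = 0.0 in a toy memory."""
    addrs = [base + i * stride for i in range(n)]
    mem = {}
    for i, addr in enumerate(addrs):
        mem[addr], mem[addr + 8], mem[addr + 16] = float(i + 1), 2.0, 0.0
        mem[addr + NEXT_OFFSET] = addrs[i + 1] if i + 1 < n else NULL
    return mem, addrs


def load_chunk(mem, start, mvl):
    """Collect at most mvl node head addresses, starting at start."""
    addrs, addr = [], start
    while addr != NULL and len(addrs) < mvl:
        addrs.append(addr)
        addr = mem[addr + NEXT_OFFSET]
    return addrs


def process_list(mem, first, mvl=MVL):
    """Process the list in chunks of up to mvl nodes; return the chunk count."""
    chunks, start = 0, first
    while start != NULL:
        vreg1 = load_chunk(mem, start, mvl)
        for a in vreg1:                        # stands in for the vectorized body
            mem[a + 16] = mem[a] * mem[a + 8]  # z = x * y
        # Resume from the node after the last one held in the vector register.
        start = mem[vreg1[-1] + NEXT_OFFSET]
        chunks += 1
    return chunks


mem, addrs = make_list(10)
n_chunks = process_list(mem, addrs[0])
```

With ten nodes and a maximum vector length of four, the list is handled in three passes of four, four, and two nodes.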

[Second Embodiment]
Referring to FIG. 15, a computer 15 according to the second embodiment of the present invention includes a loop analysis unit 15-1 and a vectorization execution unit 15-2 in order to generate an object program 15-4 for a vector computer from a source program 15-3.

The loop analysis unit 15-1 analyzes the source program 15-3 and recognizes a loop control statement that repeatedly processes list structure data. The loop analysis unit 15-1 can be configured in the same manner as, for example, the loop analysis unit 5-1 in FIG. 5, but is not limited thereto. The vectorization execution unit 15-2 vectorizes the loop control statement recognized by the loop analysis unit 15-1. The vectorization execution unit 15-2 can be configured in the same manner as, for example, the vectorization execution unit 5-2 in FIG. 5, but is not limited thereto.

The vectorization execution unit 15-2 is configured to insert a first program part 15-5 and a second program part 15-6 into the object program 15-4. The first program part 15-5 includes instructions that write the head address of each node of the list structure data into a vector register and write the number of nodes of the list structure data into a scalar register. The second program part 15-6 is the loop control statement vectorized using the head address of each node of the list structure data written into the vector register and the number of nodes of the list structure data written into the scalar register.

Next, the code generation method performed by the computer 15 according to the present embodiment is described.

First, the loop analysis unit 15-1 analyzes the source program 15-3 and recognizes a loop control statement that repeatedly processes list structure data. Next, the vectorization execution unit 15-2 generates the object program 15-4 in which the loop control statement is vectorized. In this vectorization, the first program part 15-5 and the second program part 15-6 are inserted into the object program 15-4.

As described above, compared with vectorizing the loop control statement by using a work array, the present embodiment makes it possible to speed up the execution of a loop control statement that repeatedly processes list structure data.

This is because the object program 15-4 includes the first program part 15-5, which writes the head address of each node of the list structure data into a vector register, and the second program part 15-6, in which the loop control statement is vectorized using the head address of each node of the list structure data written into the vector register.

The present invention is applicable to the HPC (High-Performance Computing) field and to computers having a vector processor, and in particular to compiling a source program that includes a loop control statement that repeatedly processes list structure data to generate an object program.

1-1 ... Structure (node)
2-1 ... Data declaration statement of link structure
2-2 ... Loop control statement
3-1 ... Vector register
3-2 ... Memory
3-3 ... Vector register
4-1 ... Vector register
4-2 ... Memory
4-3 ... Vector register
5 ... Compiler
5-1 ... Loop analysis unit
5-1-1 ... Means for recognizing a loop including list structure data
5-1-2 ... Structure analysis means for list structure data
5-1-3 ... Vectorization determination means
5-2 ... Vectorization execution unit
5-2-1 ... List structure head address / list structure node count acquisition instruction generation means
5-2-2 ... Vector length load instruction generation means
5-2-3 ... Vector gather instruction generation means
5-2-4 ... Vector operation instruction generation means
5-2-5 ... Vector scatter instruction generation means
5-3 ... Source program
5-4 ... Object code
6-1 ... NULL (termination character)
7-1 ... Vector register
11-1 ... Vector register
11-2 ... Mask register
11-3 ... Scalar register
13-1 ... Loop control statement
13-2 to 13-5 ... Statements
14-1 ... Information processing device
14-2 ... Program
14-1-1 ... Operation input unit
14-1-2 ... Screen display unit
14-1-3 ... Communication interface unit
14-1-4 ... Storage unit
14-1-5 ... Arithmetic processing unit
15 ... Computer
15-1 ... Loop analysis unit
15-2 ... Vectorization execution unit
15-3 ... Source program
15-4 ... Object program
15-5 ... First program part
15-6 ... Second program part

Claims

A compiler that causes a computer that generates an object program for a vector computer from a source program to execute:
a process of analyzing the source program to recognize a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the process of vectorizing, the following are inserted into the object program:
a first program part that writes the head address of each node of the list structure data into a vector register and writes the number of nodes of the list structure data into a scalar register; and
a second program part in which the loop control statement is vectorized using the head address of each node of the list structure data written into the vector register and the number of nodes of the list structure data written into the scalar register.

The compiler according to claim 1, wherein
the first program part includes a first vector instruction, and
the first vector instruction is configured to specify, in predetermined fields of its operands, a vector register %v0 into which the head address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the head address of the section of the list structure data, a scalar register %s2 that points to the end address of the section of the list structure data, and a scalar register %s3 that points to the position of the pointer to the next data in the list structure.

The compiler according to claim 1, wherein
the first program part includes a second vector instruction, and
the second vector instruction is configured to specify, in predetermined fields of its operands, a vector register %v0 into which the head address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the head address of the section of the list structure data, and a scalar register %s2 that points to the position of the pointer to the next data in the list structure.

The compiler according to claim 1, wherein
the first program part includes a third vector instruction, a mask generation instruction, and a mask leading zero count instruction,
the third vector instruction is configured to specify, in predetermined fields of its operands, a vector register %v0 into which the head address of each node of the list structure data and a termination character indicating the end are written, a scalar register %s0 that points to the head address of the section of the list structure data, a scalar register %s1 that points to the end address of the section of the list structure data, and a scalar register %s2 that points to the position of the pointer to the next data in the list structure,
the mask generation instruction is configured to specify, in predetermined fields of its operands, the vector register %v0 and a register mreg1 in which a mask is generated, and
the mask leading zero count instruction is configured to specify, in predetermined fields of its operands, the register mreg1 and a scalar register that stores, as the number of nodes, the number of consecutive zeros from the head of the mask.

The compiler according to claim 1, wherein
the first program part includes a fourth vector instruction, a mask generation instruction, and a mask leading zero count instruction,
the fourth vector instruction is configured to specify, in predetermined fields of its operands, a vector register %v0 into which each head address of the list structure data and a termination character indicating the end are written, a scalar register %s0 that points to the head address of the section of the list structure data, and a scalar register %s1 that points to the position of the pointer to the next data in the list structure (next in FIG. 1),
the mask generation instruction is configured to specify, in predetermined fields of its operands, the vector register %v0 and a register mreg1 in which a mask is generated, and
the mask leading zero count instruction is configured to specify, in predetermined fields of its operands, the register mreg1 and a scalar register that stores, as the number of nodes, the number of consecutive zeros from the head of the mask.

A computer that generates an object program for a vector computer from a source program, the computer comprising:
a loop analysis unit that analyzes the source program to recognize a loop control statement that repeatedly processes list structure data; and
a vectorization execution unit that vectorizes the loop control statement,
wherein the vectorization execution unit is configured to insert into the object program:
a first program part that writes the head address of each node of the list structure data into a vector register and writes the number of nodes of the list structure data into a scalar register; and
a second program part in which the loop control statement is vectorized using the head address of each node of the list structure data written into the vector register and the number of nodes of the list structure data written into the scalar register.

A code generation method executed by a computer that generates an object program for a vector computer from a source program, the method comprising:
analyzing the source program to recognize a loop control statement that repeatedly processes list structure data; and
vectorizing the loop control statement,
wherein, in the vectorizing, the following are inserted into the object program:
a first program part that writes the head address of each node of the list structure data into a vector register and writes the number of nodes of the list structure data into a scalar register; and
a second program part in which the loop control statement is vectorized using the head address of each node of the list structure data written into the vector register and the number of nodes of the list structure data written into the scalar register.