JP6907761B2

JP6907761B2 - compiler

Info

Publication number: JP6907761B2
Application number: JP2017127370A
Authority: JP
Inventors: 健人岩川
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-06-29
Filing date: 2017-06-29
Publication date: 2021-07-21
Anticipated expiration: 2037-06-29
Also published as: JP2019012324A

Description

The present invention relates to a compiler, a computer, and a code generation method for a computer.

List structure data is data that has, as a member, a pointer to a structure of the same type as itself; it is also called a self-referential structure. List structure data is suited to processing large amounts of data to which additions and deletions are made frequently.

FIG. 1 is a diagram showing the in-memory layout of a list formed by list structure data struct A. In FIG. 1, each rectangle represents one structure 1-1. A structure is also called a node. The structure in this example has double-type data x, y, and z and a pointer next as members. first represents the start address of the section of the sequence of list structure data struct A. addr1, ..., addrN represent the start addresses of the respective structures. last represents the end address of the section of the sequence of list structure data struct A.

FIG. 2 is a diagram showing an example of a C source program having a loop control statement that processes the list structure data struct A. Statement 2-1 is the data declaration of the list structure data struct A, in which struct A refers to itself recursively. Statement 2-2 is a loop control statement that follows the list and repeatedly processes the list structure data struct A.
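FIG. 2 itself is not reproduced in this text. The following is a minimal sketch of what such a source program may look like. It assumes that the loop body computes z = x * y (inferred from the instruction sequence described for FIG. 8) and that last acts as an exclusive end marker; the function name process_list is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Self-referential structure (plays the role of statement 2-1). */
struct A {
    double x, y, z;
    struct A *next;   /* pointer to the next node of the same type */
};

/* Loop that follows the list (plays the role of statement 2-2).
 * Assumption: `last` is an exclusive end marker; pass NULL for a
 * NULL-terminated list. */
void process_list(struct A *first, struct A *last)
{
    /* An empty starting pointer is only valid when the end marker is NULL. */
    assert(first != NULL || last == NULL);

    for (struct A *p = first; p != last; p = p->next) {
        p->z = p->x * p->y;   /* assumed loop body */
    }
}
```

Because each iteration reaches its node only through the next pointer of the previous node, a compiler cannot turn this loop into vector instructions without first collecting the node addresses.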

A compiler that makes it possible to speed up a program containing a loop control statement that processes list structure data struct A, such as the one shown in FIG. 2, is described in, for example, Patent Document 1. FIG. 13 is a diagram showing the result of transforming a loop control statement by the compiler described in Patent Document 1. Referring to FIG. 13, the compiler described in Patent Document 1 vectorizes a loop containing list structure data by transforming the loop control statement 13-1 into statements 13-2 to 13-5. Here, statement 13-2 is a control statement that counts the number of nodes of the list structure data struct A. Statement 13-3 is a declaration that allocates a working array in memory. Statement 13-4 is a control statement that registers the start address of each node of the list structure data struct A in the allocated working array. Statement 13-5 is a loop control statement that performs processing equivalent to the original loop control statement 13-1, using a subscript to refer to the elements of the working array.
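The four-step transformation described for Patent Document 1 can be sketched in C as follows. FIG. 13 is not reproduced here, so the details are assumptions: the loop body z = x * y, the exclusive end marker last, the use of heap allocation for the working array, and the function name process_list_via_work_array are all illustrative.

```c
#include <assert.h>
#include <stdlib.h>

struct A { double x, y, z; struct A *next; };

/* Returns 0 on success, -1 if the working array cannot be allocated. */
int process_list_via_work_array(struct A *first, struct A *last)
{
    assert(first != NULL || last == NULL);

    /* Role of statement 13-2: count the nodes. */
    size_t n = 0;
    for (struct A *p = first; p != last; p = p->next)
        n++;
    if (n == 0)
        return 0;

    /* Role of statement 13-3: allocate a working array in memory. */
    struct A **work = malloc(n * sizeof *work);
    if (work == NULL)
        return -1;

    /* Role of statement 13-4: register the start address of each node. */
    size_t i = 0;
    for (struct A *p = first; p != last; p = p->next)
        work[i++] = p;

    /* Role of statement 13-5: subscripted loop equivalent to the original.
     * Its iterations no longer chase pointers, so it can be vectorized. */
    for (i = 0; i < n; i++)
        work[i]->z = work[i]->x * work[i]->y;

    free(work);
    return 0;
}
```

Every access to work[i] in the final loop is a memory access, which is the cost the present invention aims to avoid.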

Patent Document 1: Japanese Unexamined Patent Publication No. 2003-337707

The compiler described in Patent Document 1 vectorizes a loop control statement that repeatedly processes list structure data by assigning the start addresses of the list structure data to a working array in memory. However, because the working array in memory is accessed when the program is executed, there is a limit to how much the execution speed can be improved.

An object of the present invention is to provide a compiler that solves the problem described above, namely that a method of vectorizing a loop control statement by means of a working array offers only limited speedup.

A compiler according to one aspect of the present invention causes
a computer that generates an object program for a vector computer from a source program to perform:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the vectorizing process,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

A computer according to another aspect of the present invention is
a computer that generates an object program for a vector computer from a source program, the computer including:
a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes list structure data; and
a vectorization execution unit that vectorizes the loop control statement,
wherein the vectorization execution unit is configured to insert into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

A code generation method for a computer according to another aspect of the present invention is
a code generation method executed by a computer that generates an object program for a vector computer from a source program, the method including:
analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
vectorizing the loop control statement,
wherein, in the vectorizing,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

With the configuration described above, the present invention makes it possible to execute the loop control statement faster than when the loop control statement is vectorized by means of a working array.

FIG. 1 is a diagram showing the in-memory layout of a list formed by list structure data struct A.
FIG. 2 is a diagram showing an example of a C source program having a loop control statement that processes list structure data struct A.
FIG. 3 is an explanatory diagram of the vector register that stores the start addresses of the list structure data to be vectorized and of the processing of the vector gather instruction.
FIG. 4 is an explanatory diagram of the vector register that stores the start addresses of the list structure data to be vectorized and of the processing of the vector scatter instruction.
FIG. 5 is a diagram showing a configuration example of the compiler according to the first embodiment of the present invention.
FIG. 6 is a diagram showing the in-memory layout of a list formed by list structure data whose section is defined to end with NULL (terminator), which is the target of the VLIST2 instruction in the first embodiment of the present invention, together with the contents of the vector register.
FIG. 7 is a diagram showing the in-memory layout of a list formed by list structure data, which is the target of the VLIST2 instruction in the first embodiment of the present invention, together with the contents of the vector register.
FIG. 8 is a diagram showing an example of an instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 9 is a diagram showing another example of an instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 10 is a diagram showing a further example of an instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 11 is an explanatory diagram of the processing of the mask generation instruction and the mask leading-zero count instruction used in the first embodiment of the present invention.
FIG. 12 is a diagram showing yet another example of an instruction sequence generated from the source program in the first embodiment of the present invention.
FIG. 13 is a diagram showing an example of an instruction sequence generated from the source program by a compiler related to the present invention.
FIG. 14 is a diagram showing an example of the hardware configuration of a computer that implements the compiler according to the first embodiment of the present invention.
FIG. 15 is a block diagram of a compiler according to a second embodiment of the present invention.

Next, embodiments of the present invention will be described in detail with reference to the drawings.
[First Embodiment]
Referring to FIG. 5, the compiler 5 according to the first embodiment of the present invention translates a source program 5-3 to generate object code 5-4. The generated object code is executed on a vector computer. In this way, the compiler 5 generates an object program for a vector computer.

The compiler 5 includes a loop analysis unit 5-1 and a vectorization execution unit 5-2.

The loop analysis unit 5-1 analyzes the source program 5-3 and determines whether a loop containing list structure data can be vectorized. The loop analysis unit 5-1 includes a recognition means 5-1-1 for loops containing list structure data (hereinafter, the recognition means), a structure analysis means 5-1-2 for list structure data (hereinafter, the structure analysis means), and a vectorization determination means 5-1-3.

The recognition means 5-1-1 detects loops containing list structure data in the source program 5-3. The structure analysis means 5-1-2 analyzes the list structure contained in a loop detected by the recognition means 5-1-1. The vectorization determination means 5-1-3 determines whether the loop detected by the recognition means 5-1-1 can be vectorized.

The vectorization execution unit 5-2 vectorizes the loop based on the analysis result of the loop analysis unit 5-1. The vectorization execution unit 5-2 includes a list structure start address and node count acquisition instruction generation means 5-2-1 (hereinafter, the VLIST instruction generation means), a vector length load instruction generation means 5-2-2, a vector gather instruction generation means 5-2-3, a vector operation instruction generation means 5-2-4, and a vector scatter instruction generation means 5-2-5.

The VLIST instruction generation means 5-2-1 generates a list structure start address and node count acquisition instruction (hereinafter, the VLIST instruction), a vector instruction that acquires the list of start addresses of the list structure and the number of nodes of the list structure data. The vector length load instruction generation means 5-2-2 generates a vector length load instruction that sets the number of nodes acquired by the VLIST instruction as the vector length. The vector gather instruction generation means 5-2-3 generates a vector gather instruction that loads list structure data in memory into a vector register. The vector operation instruction generation means 5-2-4 generates a vector operation instruction that operates on the data loaded into the vector register and stores the result in a vector register. The vector scatter instruction generation means 5-2-5 generates a vector scatter instruction that stores the result held in the vector register into the list structure data in memory.

The compiler 5 can be implemented by, for example, an information processing device 14-1 such as a personal computer and a program 14-2, as shown in FIG. 14. The information processing device 14-1 has an operation input unit 14-1-1 such as a keyboard and a mouse, a screen display unit 14-1-2 such as a liquid crystal display, a communication interface unit 14-1-3, a storage unit 14-1-4 such as a memory and a hard disk, and an arithmetic processing unit 14-1-5 such as one or more microprocessors. The program 14-2 is read into the storage unit 14-1-4 from an external computer-readable storage medium, for example when the information processing device 14-1 starts up, and implements the compiler 5 on the arithmetic processing unit 14-1-5 by controlling the operation of the arithmetic processing unit 14-1-5.

<Description of the present embodiment>
Next, the operation of the compiler 5 will be described with reference to FIGS. 1 to 12.

As described above, FIG. 2 shows a C source program having a loop that contains list structure data. The loop analysis unit 5-1 of the compiler 5 analyzes a source program 5-3 having such a loop.

First, the recognition means 5-1-1 detects, in the source program 5-3, a loop that has list structure data inside it, such as the loop control statement 2-2 of FIG. 2.

Next, the structure analysis means 5-1-2 analyzes the list structure data contained in the detected loop. This analysis determines where in the list structure data the pointer to the next node is located. In the list structure data shown in FIG. 2, if double-type data occupies 8 bytes, next, which follows x, y, and z, is located 24 bytes from the start address of the node. In other words, in the list structure data shown in FIG. 2, the pointer to the next node is at byte offset 24 from the start of the node.
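The offset arithmetic above can be checked with the standard offsetof macro. This is only a sketch of the calculation, not of how the structure analysis means is implemented; the function name next_offset is hypothetical, and the result of 24 holds on platforms where double is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

struct A { double x, y, z; struct A *next; };

/* The 24-byte figure in the text presupposes an 8-byte double. */
static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");

/* Byte offset of the pointer to the next node within one node:
 * x at 0, y at 8, z at 16, next at 24. */
size_t next_offset(void)
{
    return offsetof(struct A, next);
}
```

This offset is the value that the VLIST instructions described later receive in a scalar register.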

Next, the vectorization determination means 5-1-3 of the compiler 5 determines whether the detected loop can be vectorized. One condition for vectorization is that the definition and reference relationships among the list structure data, arrays, and variables in the loop contain no dependence that inhibits vectorization.
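To illustrate the kind of dependence that inhibits vectorization, the sketch below uses a hypothetical loop (not the loop of FIG. 2) in which each iteration reads the z value written by the previous iteration. Both function names and the 8-element limit are assumptions made for the illustration. Executing the loop in vector order, with all loads before all stores, produces a different result from the original scalar order, so such a loop must not be vectorized.

```c
#include <assert.h>
#include <stddef.h>

struct A { double x, y, z; struct A *next; };

/* Original scalar order: iteration i reads the z written by iteration i-1. */
void carry_scalar_order(struct A *first)
{
    assert(first != NULL);
    for (struct A *p = first; p->next != NULL; p = p->next)
        p->next->z = p->z + p->x;
}

/* The same statement executed in vector order over collected node addresses:
 * every operand is loaded before any result is stored. */
void carry_vector_order(struct A **node, size_t n)
{
    double tmp[8];
    assert(n >= 1 && n <= 8);   /* illustrative size limit */

    for (size_t i = 0; i + 1 < n; i++)   /* all loads first */
        tmp[i] = node[i]->z + node[i]->x;
    for (size_t i = 0; i + 1 < n; i++)   /* all stores afterwards */
        node[i + 1]->z = tmp[i];
}
```

With three nodes whose x is 1 and z is 0, the scalar order leaves z = 2 in the third node, whereas the vector order leaves z = 1, because the stale value of the second node was loaded before it was updated.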

When the processing of the loop analysis unit 5-1 is complete, the vectorization execution unit 5-2 performs its processing. The vectorization execution unit 5-2 vectorizes a loop containing list structure data that the loop analysis unit 5-1 has determined to be vectorizable, as follows.

First, the VLIST instruction generation means 5-2-1 generates a VLIST instruction that loads into registers the list of start addresses of the list structure data to be vectorized and the number of nodes of the list structure data. When this VLIST instruction is executed, each start address of the list structure data to be vectorized is stored in a vector register. For example, when the VLIST instruction is executed on the list structure data in the section from first to last shown in FIG. 3, the list of start addresses of the list structure data is stored in the vector register 3-1, as shown in FIG. 3. The VLIST instruction also stores the number of nodes of the list structure data in a scalar register (not shown). The VLIST instruction is described in detail later.

Next, the vector length load instruction generation means 5-2-2 generates a vector length load instruction that sets the number of nodes obtained by the VLIST instruction as the vector length.

Next, the vector gather instruction generation means 5-2-3 generates a vector gather instruction. A vector gather instruction loads data from memory so that the memory data at the address held in each element of one vector register is stored in the corresponding element of another vector register. The vector gather instruction generation means 5-2-3 generates a vector gather instruction that specifies the vector register 3-1, which holds the list of start addresses of the list structure data obtained by the VLIST instruction. As shown in FIG. 3, this yields a vector gather instruction that loads the memory data at the addresses held in the elements of the vector register 3-1 from the memory 3-2 into the vector register 3-3.
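The effect of a vector gather can be emulated with a scalar loop. The sketch below is an illustration under assumptions: the function name vector_gather is hypothetical, a vector register is modeled as a plain C array, and addresses are modeled as uintptr_t values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar emulation of a vector gather: for each element i, load the double
 * found at address addr[i] into dst[i]. `vl` plays the role of the vector
 * length. */
void vector_gather(double *dst, const uintptr_t *addr, size_t vl)
{
    assert(dst != NULL && addr != NULL);
    for (size_t i = 0; i < vl; i++)
        memcpy(&dst[i], (const void *)addr[i], sizeof(double));
}
```

On a vector computer the element loads are issued by a single instruction rather than by a loop.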

Next, the vector operation instruction generation means 5-2-4 generates a vector operation instruction that performs a vector operation on the data loaded into the vector register 3-3 by the vector gather instruction and stores the result in another vector register.

Next, the vector scatter instruction generation means 5-2-5 generates a vector scatter instruction. A vector scatter instruction stores data to memory so that the data in each element of one vector register is written to the memory location whose address is held in the corresponding element of another vector register. The vector scatter instruction generation means 5-2-5 generates a vector scatter instruction that specifies the vector register holding the result of the vector operation instruction and the vector register holding the list of start addresses of the list structure data. As shown in FIG. 4, this yields a vector scatter instruction that stores the result of the vector operation held in the vector register 4-3 to the locations in the memory 4-2 addressed by the list structure data addresses held in the vector register 4-1.
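A vector scatter is the store-side counterpart of a gather and can likewise be emulated with a scalar loop. As before, the function name vector_scatter and the modeling of registers as C arrays and of addresses as uintptr_t values are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar emulation of a vector scatter: for each element i, store src[i]
 * to the memory location whose address is addr[i]. */
void vector_scatter(const uintptr_t *addr, const double *src, size_t vl)
{
    assert(addr != NULL && src != NULL);
    for (size_t i = 0; i < vl; i++)
        memcpy((void *)addr[i], &src[i], sizeof(double));
}
```

The sketch assumes that the addresses are distinct, which holds when each address belongs to a different node of the list.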

Next, examples of the VLIST instruction will be described in detail.

<Example 1 of the VLIST instruction>
The VLIST instruction of the first example (hereinafter, the VLIST1 instruction) specifies a total of five registers, %v0, %s0, %s1, %s2, and %s3, in predetermined operand fields, as follows.
VLIST1 instruction %v0, %s0, %s1, %s2, %s3

%v0 is the vector register to which the start address of each node of the list structure data is written. %s0 is the scalar register to which the number of nodes of the list structure data is written. %s1 is a scalar register that indicates the start address of the section of list structure data targeted by the instruction (first in FIG. 1). %s2 is a scalar register that indicates the end address of the section of list structure data targeted by the instruction (last in FIG. 1). %s3 is a scalar register that indicates the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis means 5-1-2 for the list structure data targeted by the instruction.

In a vector computer whose instruction set includes the VLIST1 instruction, when the VLIST1 instruction is issued, the following are performed as the processing of a single instruction: acquiring the start address of each node of the list structure data identified by first, last, and next, which are specified in the scalar registers %s1, %s2, and %s3, and storing those addresses in the vector register %v0; and counting the number of nodes and storing it in the scalar register %s0.
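The behavior of the VLIST1 instruction can be emulated in software as a rough sketch. Several points are assumptions not stated in the text: last is treated as an exclusive end marker, the vector register capacity MAX_VL is a hypothetical value, traversal simply stops when the register is full, and the function name vlist1 is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_VL 256   /* hypothetical vector register capacity */

/* Emulates the VLIST1 instruction with operands %v0, %s0, %s1, %s2, %s3.
 *   v0: receives the start address of each node
 *   s0: receives the number of nodes
 *   s1: first (start address of the section)
 *   s2: last (assumed here to be an exclusive end marker)
 *   s3: byte offset of the next pointer within a node */
void vlist1(uintptr_t v0[MAX_VL], size_t *s0,
            uintptr_t s1, uintptr_t s2, size_t s3)
{
    assert(v0 != NULL && s0 != NULL);

    size_t n = 0;
    uintptr_t p = s1;
    while (p != s2 && n < MAX_VL) {
        v0[n++] = p;
        /* Follow the pointer stored s3 bytes into the current node. */
        void *next;
        memcpy(&next, (const char *)p + s3, sizeof next);
        p = (uintptr_t)next;
    }
    *s0 = n;
}
```

For the struct A of FIG. 2, a call might look like vlist1(v, &n, (uintptr_t)first, (uintptr_t)last, offsetof(struct A, next)). The point of the invention is that the addresses end up in a register rather than in a working array in memory.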

<Example 2 of the VLIST instruction>
The VLIST instruction of the second example (hereinafter, the VLIST2 instruction) specifies a total of four registers, %v0, %s0, %s1, and %s2, in predetermined operand fields, as follows.
VLIST2 instruction %v0, %s0, %s1, %s2

%v0 is the vector register to which the start address of each node of the list structure data is written. %s0 is the scalar register to which the number of nodes of the list structure data is written. %s1 is a scalar register that indicates the start address of the section of list structure data targeted by the instruction (first in FIG. 1). %s2 is a scalar register that indicates the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis means 5-1-2 for the list structure data targeted by the instruction. Unlike the VLIST1 instruction of the first example, the VLIST2 instruction has no operand that indicates the end address of the section of list structure data targeted by the instruction (last in FIG. 1). This is because the VLIST2 instruction defines the section to end with NULL (terminator), as shown by NULL at 6-1 in FIG. 6. In this way, the VLIST2 instruction omits the scalar register that indicates the end address of the section and thereby reduces the number of operands.

In a vector computer whose instruction set includes the VLIST2 instruction, when the VLIST2 instruction is issued, the following are performed as the processing of a single instruction: acquiring the start address of each node of the list structure data identified by first and next, which are specified in the scalar registers %s1 and %s2, and storing those addresses in the vector register %v0; and counting the number of nodes and storing it in the scalar register %s0.

<Example 3 of the VLIST instruction>
The VLIST instruction of the third example (hereinafter, the VLIST3 instruction) specifies a total of four registers, %v0, %s0, %s1, and %s2, in predetermined operand fields, as follows.
VLIST3 instruction %v0, %s0, %s1, %s2

%v0 is the vector register to which the start address of each node of the list structure data is written. %s0 is a scalar register that indicates the start address of the section of list structure data (first in FIG. 1). %s1 is a scalar register that indicates the end address of the section of list structure data (last in FIG. 1). %s2 is a scalar register that indicates the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis means 5-1-2 for the list structure data. Unlike VLIST1 and VLIST2, VLIST3 has no operand that specifies a scalar register to which the number of nodes of the list structure data is written.

In a vector computer whose instruction set includes the VLIST3 instruction, when the VLIST3 instruction is issued, the following are performed as the processing of a single instruction: acquiring the start address of each node of the list structure data identified by first, last, and next, which are specified in the scalar registers %s0, %s1, and %s2, and storing those addresses in the vector register %v0 (7-1) as shown in FIG. 7; and setting the element that follows the last address to NULL (terminator). The element following the last address in the vector register 7-1 is set to NULL (terminator) so that it is possible to determine up to which element of the vector register 7-1 addresses have been stored. The number of nodes is calculated by separate instructions that follow the VLIST3 instruction (described in detail later with reference to FIG. 10).
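The NULL-terminated register contents produced by VLIST3, and the separate step that recovers the node count, can be sketched as follows. This is an emulation under assumptions: last is treated as an exclusive end marker, MAX_VL is a hypothetical register capacity, and the names vlist3 and count_until_null are illustrative. The instructions that the patent uses for the count are not detailed in this section, so a plain scalar scan stands in for them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_VL 256   /* hypothetical vector register capacity */

/* Emulates the VLIST3 instruction with operands %v0, %s0, %s1, %s2:
 * writes the node addresses to v0 and sets the following element to NULL. */
void vlist3(uintptr_t v0[MAX_VL], uintptr_t s0, uintptr_t s1, size_t s2)
{
    assert(v0 != NULL);

    size_t n = 0;
    uintptr_t p = s0;
    while (p != s1 && n < MAX_VL - 1) {   /* keep one element for the terminator */
        v0[n++] = p;
        void *next;
        memcpy(&next, (const char *)p + s2, sizeof next);
        p = (uintptr_t)next;
    }
    v0[n] = 0;   /* NULL terminator after the last address */
}

/* Stand-in for the follow-up instructions: the node count is the index of
 * the first NULL element. */
size_t count_until_null(const uintptr_t v0[MAX_VL])
{
    assert(v0 != NULL);

    size_t n = 0;
    while (n < MAX_VL && v0[n] != 0)
        n++;
    return n;
}
```

Omitting the count operand shortens the instruction at the cost of an extra step before the vector length can be set.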

<Example 4 of the VLIST instruction>
The VLIST instruction of the fourth example (hereinafter, the VLIST4 instruction) specifies a total of three registers, %v0, %s0, and %s1, in predetermined operand fields, as follows.
VLIST4 instruction %v0, %s0, %s1

%v0 is the vector register to which each start address of the list structure data is written. %s0 is a scalar register that indicates the start address of the section of list structure data (first in FIG. 1). %s1 is a scalar register that indicates the position, within the list structure, of the pointer to the next data (next in FIG. 1), as obtained by the structure analysis means 5-1-2 for the list structure data. The VLIST4 instruction is an example in which the scalar register operand indicating the end address of the section (last in FIG. 1) has been removed from the VLIST3 instruction of the third example. Like the VLIST2 instruction of the second example, the VLIST4 instruction defines the section to end with NULL (terminator) and thereby reduces the number of operands.

In a vector computer whose instruction set includes the VLIST4 instruction, issuing VLIST4 performs two operations as a single instruction: it obtains the start address of each node of the list structure data identified by first and next (specified by scalar registers %s0 and %s1) and stores those addresses in vector register %v0, and it sets the element following the last address to NULL (the terminator). As with the VLIST3 instruction, the number of nodes is calculated by separate instructions that follow the VLIST4 instruction.

Next, examples of the instruction sequences generated when the source program shown in FIG. 2 is compiled are described with reference to FIGS. 8 to 10 and FIG. 12.

<Example 1 of instruction sequence>
FIG. 8 shows the instruction sequence generated when the VLIST1 instruction is used. The first instruction, VLIST1, loads the list of start addresses of the nodes of the list structure data into vector register vreg1 and loads the number of nodes into scalar register sreg1. The next instruction, LVL (vector length load), sets the vector length to the number of nodes held in scalar register sreg1. The next instruction, VGT (vector gather), specifies vector register vreg1 and loads the x data from memory into vector register vreg2. The next instruction, VADD (vector add), stores the list of addresses of y in vector register vreg3. That is, because y is located immediately after x, and assuming the double type is 8 bytes, y is located 8 bytes from the start of struct A. The VADD instruction therefore adds 8 to the list of start addresses (vreg1) and stores the result in vector register vreg3. The next VGT instruction specifies vector register vreg3 and loads the y data from memory into vector register vreg4. The next instruction, VMUL (vector multiply), multiplies x and y, held in vector registers vreg2 and vreg4, and stores the result in vector register vreg5. The next VADD instruction stores the list of addresses of z in vector register vreg6. That is, because z is located 16 bytes from the start of struct A, 16 is added to the list of start addresses and the result is stored in vector register vreg6. The final instruction, VSC (vector scatter), specifies vector register vreg6 and stores the multiplication results held in vector register vreg5 to the z position of each node.
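The instruction sequence above can be followed step by step in a scalar C model, where each loop stands in for one vector instruction. This is a sketch under assumptions: the loop body z = x * y is inferred from the gather, multiply, and scatter steps described in the text, `MAX_VL` and the function name `fig8_model` are hypothetical, and the VLIST1 step is modeled as a traversal that stops at a NULL next pointer for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Node layout taken from FIG. 1. */
struct A {
    double x, y, z;
    struct A *next;
};

enum { MAX_VL = 256 };  /* assumed number of elements per vector register */

/* Scalar model of the FIG. 8 sequence; arrays stand in for registers. */
void fig8_model(struct A *first)
{
    char  *vreg1[MAX_VL], *vreg3[MAX_VL], *vreg6[MAX_VL];
    double vreg2[MAX_VL], vreg4[MAX_VL], vreg5[MAX_VL];
    size_t sreg1 = 0, vl, i;

    /* VLIST1: node start addresses -> vreg1, node count -> sreg1 */
    for (struct A *p = first; p != NULL && sreg1 < MAX_VL; p = p->next)
        vreg1[sreg1++] = (char *)p;
    assert(sreg1 <= MAX_VL);

    vl = sreg1;                                    /* LVL  */
    for (i = 0; i < vl; i++)                       /* VGT: gather x */
        vreg2[i] = *(double *)(vreg1[i] + offsetof(struct A, x));
    for (i = 0; i < vl; i++)                       /* VADD: addresses of y */
        vreg3[i] = vreg1[i] + offsetof(struct A, y);
    for (i = 0; i < vl; i++)                       /* VGT: gather y */
        vreg4[i] = *(double *)vreg3[i];
    for (i = 0; i < vl; i++)                       /* VMUL */
        vreg5[i] = vreg2[i] * vreg4[i];
    for (i = 0; i < vl; i++)                       /* VADD: addresses of z */
        vreg6[i] = vreg1[i] + offsetof(struct A, z);
    for (i = 0; i < vl; i++)                       /* VSC: scatter to z */
        *(double *)vreg6[i] = vreg5[i];
}
```

The sketch uses `offsetof` instead of the literal constants 8 and 16; the two agree when double is 8 bytes and the members are laid out without padding, as the text assumes.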

<Example 2 of instruction sequence>
FIG. 9 shows the instruction sequence generated when the VLIST2 instruction is used. The VLIST2 instruction is used because the end of the list structure data is fixed as NULL (the terminator). Otherwise the sequence is the same as in FIG. 8.

<Example 3 of instruction sequence>
FIG. 10 shows the instruction sequence generated when the VLIST3 instruction is used. As described above, when the VLIST3 instruction loads the list of start addresses of the list structure data into the vector register, it sets the element following the last address to NULL (the terminator). That is, after VLIST3 has loaded the start addresses, the vector register looks like vector register 11-1 in FIG. 11, with NULL as its final element. In the instruction sequence of FIG. 10, the VFMK.EQ instruction (mask generation instruction) therefore generates a mask in mask register 11-2, as shown in FIG. 11. VFMK.EQ generates, in the mask register specified in a predetermined operand field, a mask whose elements are 1 where the corresponding element of vector register vreg1 (11-1), specified in a predetermined operand field, equals NULL and 0 otherwise. This is followed by an LZVM instruction (mask leading zero count instruction) that specifies the generated mask. The LZVM instruction counts how many consecutive 0s appear from the head of the mask in register 11-2, specified in a predetermined operand field, and stores that count, which is the number of nodes, in scalar register 11-3 (sreg1) of FIG. 11, specified in a predetermined operand field. The rest of the sequence is the same as in FIGS. 8 and 9.
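The node count derivation described above can be sketched in C as two small helpers. These are illustrative models only; the names `vfmk_eq_null_model` and `lzvm_model`, and the use of one byte per mask element, are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the mask generation step (VFMK.EQ in the text):
 * mask[i] becomes 1 where vreg[i] equals NULL and 0 otherwise.
 */
void vfmk_eq_null_model(unsigned char *mask, void *const *vreg, size_t n)
{
    assert(mask != NULL && vreg != NULL);
    for (size_t i = 0; i < n; i++)
        mask[i] = (vreg[i] == NULL) ? 1 : 0;
}

/*
 * Model of the mask leading zero count step (LZVM in the text):
 * returns how many consecutive 0s appear from the head of the mask.
 */
size_t lzvm_model(const unsigned char *mask, size_t n)
{
    size_t count = 0;

    assert(mask != NULL);
    while (count < n && mask[count] == 0)
        count++;
    return count;
}
```

If the address register holds three node addresses followed by NULL, the mask starts 0, 0, 0, 1 and the leading zero count is 3, which is the number of nodes. Counting only the leading zeros means that whatever values remain in elements after the NULL do not affect the result.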

<Example 4 of instruction sequence>
FIG. 12 shows the instruction sequence generated when the VLIST4 instruction is used. The VLIST4 instruction is used because the end of the list structure data is fixed as NULL (the terminator). Otherwise the sequence is the same as in FIG. 10.

<Effects of this embodiment>
As described above, this embodiment makes it possible to vectorize a loop control statement that contains list structure data. Vectorizing the loop in turn improves the execution performance of the program.

Further, in Patent Document 1, memory for a working array is allocated and the start address of each node of the list structure data is stored in it, as in statements 13-3 and 13-4 of FIG. 13. In this embodiment, by contrast, the start address of each node of the list structure data is loaded directly into a vector register, so no memory needs to be allocated. In addition, because the working array in memory is replaced by a vector register, data transfer is also faster.

Further, in Patent Document 1, the list structure is traversed twice, once in each of the loops of statements 13-2 and 13-4 of FIG. 13, in order to count the nodes of the list structure data and to store the start addresses. In this embodiment, by contrast, the VLIST instruction needs to traverse the list structure only once, which reduces the amount of processing.
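For contrast, the work-array approach attributed to Patent Document 1 can be sketched in C as follows. This is a reconstruction from the description in the text, not the actual code of FIG. 13: the function name `work_array_mul`, the loop body z = x * y, and the NULL-terminated traversal are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Node layout taken from FIG. 1. */
struct A {
    double x, y, z;
    struct A *next;
};

/* Returns 0 on success and -1 if the working array cannot be allocated. */
int work_array_mul(struct A *first)
{
    struct A **work;
    size_t n = 0, i = 0;

    /* Traversal 1: count the nodes. */
    for (struct A *p = first; p != NULL; p = p->next)
        n++;
    if (n == 0)
        return 0;

    /* Allocate the working array in memory. */
    work = malloc(n * sizeof *work);
    if (work == NULL)
        return -1;

    /* Traversal 2: store the start address of each node. */
    for (struct A *p = first; p != NULL; p = p->next)
        work[i++] = p;
    assert(i == n);

    /* Array-indexed loop with no pointer chasing, so it can be vectorized. */
    for (i = 0; i < n; i++)
        work[i]->z = work[i]->x * work[i]->y;

    free(work);
    return 0;
}
```

The two traversals and the allocation are the costs the text identifies; in the embodiment they are replaced by a single VLIST instruction writing to a vector register.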

This embodiment is based on the configuration and operation described above, but various additions and modifications are possible. For example, when the vector length exceeds the maximum vector length (the number of elements in a vector register), the compiler 5 handles the loop by processing as many nodes as the maximum vector length and then resuming from the node that follows the last node in the vector register into which the start addresses of the list structure data were loaded.
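The chunked processing described above can be illustrated with a short C sketch. It is a model under assumptions: `MAX_VL` is set to a deliberately small hypothetical value so that the chunking is visible, the function name `process_in_chunks` is invented here, and the loop body z = x * y and NULL-terminated traversal are assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Node layout taken from FIG. 1. */
struct A {
    double x, y, z;
    struct A *next;
};

enum { MAX_VL = 4 };  /* assumed maximum vector length, kept small on purpose */

/* Processes the whole list in chunks of at most MAX_VL nodes.
 * Returns the number of chunks that were processed. */
size_t process_in_chunks(struct A *first)
{
    struct A *vreg1[MAX_VL];   /* stands in for the address vector register */
    struct A *resume = first;
    size_t chunks = 0;

    while (resume != NULL) {
        size_t vl = 0;

        /* Load up to MAX_VL node start addresses. */
        for (struct A *p = resume; p != NULL && vl < MAX_VL; p = p->next)
            vreg1[vl++] = p;
        assert(vl > 0);

        /* Vectorized loop body for this chunk. */
        for (size_t i = 0; i < vl; i++)
            vreg1[i]->z = vreg1[i]->x * vreg1[i]->y;

        /* Resume from the node after the last one held in the register. */
        resume = vreg1[vl - 1]->next;
        chunks++;
    }
    return chunks;
}
```

With six nodes and a maximum vector length of 4, the list is processed in two chunks of 4 and 2 nodes.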

[Second Embodiment]
Referring to FIG. 15, a computer 15 according to the second embodiment of the present invention includes a loop analysis unit 15-1 and a vectorization execution unit 15-2 in order to generate an object program 15-4 for a vector computer from a source program 15-3.

The loop analysis unit 15-1 analyzes the source program 15-3 and recognizes a loop control statement that repeatedly processes list structure data. The loop analysis unit 15-1 can be configured, for example, in the same way as the loop analysis unit 5-1 of FIG. 5, but is not limited to that configuration. The vectorization execution unit 15-2 vectorizes the loop control statement recognized by the loop analysis unit 15-1. The vectorization execution unit 15-2 can be configured, for example, in the same way as the vectorization execution unit 5-2 of FIG. 5, but is not limited to that configuration.

The vectorization execution unit 15-2 is configured to insert a first program portion 15-5 and a second program portion 15-6 into the object program 15-4. The first program portion 15-5 includes an instruction that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register. The second program portion 15-6 vectorizes the loop control statement using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register.

Next, the code generation method performed by the computer 15 according to this embodiment is described.

First, the loop analysis unit 15-1 analyzes the source program 15-3 and recognizes a loop control statement that repeatedly processes list structure data. Next, the vectorization execution unit 15-2 generates the object program 15-4, in which the loop control statement is vectorized. In this vectorization, the first program portion 15-5 and the second program portion 15-6 are inserted into the object program 15-4.

As described above, this embodiment makes it possible to execute a loop control statement that repeatedly processes list structure data faster than when the loop control statement is vectorized using a working array.

This is because the object program 15-4 includes the first program portion 15-5, which writes the start address of each node of the list structure data to a vector register, and the second program portion 15-6, which vectorizes the loop control statement using the start addresses of the nodes of the list structure data written to the vector register.

The present invention is applicable to the HPC (High-Performance Computing) field and to computers having a vector processor, and in particular to the field of compiling a source program containing a loop control statement that repeatedly processes list structure data to generate an object program.

1-1 ... Structure (node)
2-1 ... Data declaration statement of the link structure
2-2 ... Loop control statement
3-1 ... Vector register
3-2 ... Memory
3-3 ... Vector register
4-1 ... Vector register
4-2 ... Memory
4-3 ... Vector register
5 ... Compiler
5-1 ... Loop analysis unit
5-1-1 ... Means for recognizing a loop containing list structure data
5-1-2 ... List structure data structure analysis means
5-1-3 ... Vectorization determination means
5-2 ... Vectorization execution unit
5-2-1 ... List structure start address and list structure node count acquisition instruction generation means
5-2-2 ... Vector length load instruction generation means
5-2-3 ... Vector gather instruction generation means
5-2-4 ... Vector operation instruction generation means
5-2-5 ... Vector scatter instruction generation means
5-3 ... Source program
5-4 ... Object code
6-1 ... NULL (terminator)
7-1 ... Vector register
11-1 ... Vector register
11-2 ... Mask register
11-3 ... Scalar register
13-1 ... Loop control statement
13-2 to 13-5 ... Statements
14-1 ... Information processing device
14-2 ... Program
14-1-1 ... Operation input unit
14-1-2 ... Screen display unit
14-1-3 ... Communication interface unit
14-1-4 ... Storage unit
14-1-5 ... Arithmetic processing unit
15 ... Computer
15-1 ... Loop analysis unit
15-2 ... Vectorization execution unit
15-3 ... Source program
15-4 ... Object program
15-5 ... First program portion
15-6 ... Second program portion

Claims

A compiler that causes a computer that generates an object program for a vector computer from a source program to perform:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the vectorizing process,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a first vector instruction, and
the first vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which the start address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the start address of the section of the list structure data, and a scalar register %s2 that points to the end address of the section of the list structure data.

A compiler that causes a computer that generates an object program for a vector computer from a source program to perform:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the vectorizing process,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a second vector instruction, and
the second vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which the start address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the start address of the section of the list structure data, and a scalar register %s2 that points to the position of the pointer to the next data in the list structure.

A compiler that causes a computer that generates an object program for a vector computer from a source program to perform:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the vectorizing process,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a third vector instruction, a mask generation instruction, and a mask leading zero count instruction,
the third vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which the start address of each node of the list structure data and a terminator indicating the end are written, a scalar register %s0 that points to the start address of the section of the list structure data, a scalar register %s1 that points to the end address of the section of the list structure data, and a scalar register %s2 that points to the position of the pointer to the next data in the list structure,
the mask generation instruction is configured to specify, in predetermined operand fields, the vector register %v0 and a register mreg1 in which a mask is generated, and
the mask leading zero count instruction is configured to specify, in predetermined operand fields, the register mreg1 and a scalar register that stores, as the number of nodes, the number of consecutive 0s from the head of the mask.

A compiler that causes a computer that generates an object program for a vector computer from a source program to perform:
a process of analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
a process of vectorizing the loop control statement,
wherein, in the vectorizing process,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a fourth vector instruction, a mask generation instruction, and a mask leading zero count instruction,
the fourth vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which each start address of the list structure data and a terminator indicating the end are written, a scalar register %s0 that points to the start address of the section of the list structure data, and a scalar register %s1 that points to the position of the pointer to the next data in the list structure (next in FIG. 1),
the mask generation instruction is configured to specify, in predetermined operand fields, the vector register %v0 and a register mreg1 in which a mask is generated, and
the mask leading zero count instruction is configured to specify, in predetermined operand fields, the register mreg1 and a scalar register that stores, as the number of nodes, the number of consecutive 0s from the head of the mask.

A computer that generates an object program for a vector computer from a source program, the computer comprising:
a loop analysis unit that analyzes the source program and recognizes a loop control statement that repeatedly processes list structure data; and
a vectorization execution unit that vectorizes the loop control statement,
wherein the vectorization execution unit is configured to insert into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a first vector instruction, and
the first vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which the start address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the start address of the section of the list structure data, and a scalar register %s2 that points to the end address of the section of the list structure data.

A code generation method, executed by a computer, for generating an object program for a vector computer from a source program, the method comprising:
analyzing the source program and recognizing a loop control statement that repeatedly processes list structure data; and
vectorizing the loop control statement,
wherein, in the vectorizing,
the following are inserted into the object program:
a first program portion that writes the start address of each node of the list structure data to a vector register and writes the number of nodes of the list structure data to a scalar register; and
a second program portion in which the loop control statement is vectorized using the start address of each node of the list structure data written to the vector register and the number of nodes of the list structure data written to the scalar register,
the first program portion includes a first vector instruction, and
the first vector instruction is configured to specify, in predetermined operand fields, a vector register %v0 into which the start address of each node of the list structure data is written, a scalar register %s0 into which the number of nodes of the list structure data is written, a scalar register %s1 that points to the start address of the section of the list structure data, and a scalar register %s2 that points to the end address of the section of the list structure data.