JPH10177566A

JPH10177566A - Memory access fast processor and recording medium

Info

Publication number: JPH10177566A
Application number: JP9287445A
Authority: JP
Inventors: Masaki Aoki; 正樹青木; Masato Morishima; 政人森島
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-10-18
Filing date: 1997-10-20
Publication date: 1998-06-30

Abstract

PROBLEM TO BE SOLVED: To avoid bank conflicts and speed up list access without source tuning by the user, by providing a means that generates and inserts an instruction to determine the access position in a work array and a means that changes a list vector access to a target array into an access to the work array. SOLUTION: When an optimization control line analysis means 5 finds an optimization instruction line in a source program 1, a change processing means 8 generates and inserts an instruction to copy the target array of a list access to a work array, generates and inserts an instruction to determine the access position in the work array from the list value of the found list access, and then changes the list vector access to the target array into an access to the work array. Furthermore, the means 8 generates and inserts an instruction that generates a random access position, and changes all accesses in the found list access that depend on the loop iteration to the generated access positions.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speed-up processing device that accelerates memory access, and to a recording medium.

[0002]

2. Description of the Related Art

When scientific and engineering computations are performed on a vector computer, memory access performance is one of the important factors that determine the execution performance of a program.

[0003] Memory accesses in a vector computer can be classified into the following types.

(1) Contiguous access: for example, the following source (Fortran source).

[0004]
      do i=1,10
        a(i)=b(i)
      enddo
Arrays a and b are accessed contiguously by the DO index i.

(2) Access with distance (strided access): for example, the following source (Fortran source).

[0005]
      do i=1,10,2
        a(i)=b(i)
      enddo
Arrays a and b are accessed with a stride of 2 by the DO index i.

(3) List access: for example, the following source (Fortran source).

[0006]
      do i=1,10
        a(list(i))=b(list(i))
      enddo
Arrays a and b are accessed randomly through the list array list.

In terms of access efficiency (speed), the order is (1) > (2) > (3), and changing a program to an algorithm that performs fast memory accesses has been an important programming tuning task for the user.
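The three access types can be illustrated with a minimal Python sketch that lists the element indices each Fortran loop above touches. The function names and the sample list values are hypothetical and chosen only for illustration; indices are 1-based to match the Fortran examples.

```python
def contiguous_indices(n):
    """Indices touched by 'do i=1,n': every element, in order."""
    return list(range(1, n + 1))


def strided_indices(n, stride):
    """Indices touched by 'do i=1,n,stride': every stride-th element."""
    return list(range(1, n + 1, stride))


def list_indices(list_array):
    """Indices touched by 'a(list(i))': whatever the list array holds."""
    return list(list_array)


# Hypothetical list values; a real program computes them at run time.
SAMPLE_LIST = [7, 2, 2, 9, 1, 7, 4, 4, 10, 3]
```

Only in the third case does the index sequence depend on run-time data, which is why the access pattern of a list access cannot be predicted from the loop bounds alone.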

[0007]

[Problems to be solved by the invention]

List access, the slowest of the access types described above, arises unavoidably from the algorithm of some programs and cannot be changed into an access of type (1) or (2).

[0008] In addition, when memory bank conflicts occur, the speed varies greatly with the list access pattern, and when accesses collide on the same bank the access can become several tens of times slower.
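How strongly a list access pattern loads a single bank can be estimated with a small sketch. It assumes a simple word-interleaved memory in which element i (1-based) lives in bank (i - 1) mod NBANKS; that mapping, the bank count, and the function names are assumptions made for illustration and are not taken from the patent. The sketch only counts accesses per bank and does not model the timing behind the slowdown mentioned above.

```python
from collections import Counter

NBANKS = 8  # assumed number of memory banks


def bank_of(index, nbanks=NBANKS):
    """Bank holding 1-based element 'index' under word interleaving (assumed)."""
    return (index - 1) % nbanks


def worst_bank_load(indices, nbanks=NBANKS):
    """Largest number of accesses in 'indices' that fall on one bank."""
    load = Counter(bank_of(i, nbanks) for i in indices)
    return max(load.values())
```

Under this model a contiguous access to elements 1 to 8 gives a worst-case load of 1, while a list whose values are all 1 gives a load equal to the list length, because every access is serialized on one bank.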

[0009] To solve these problems, the present invention aims to speed up list access without source tuning by the user: an optimization instruction line directs the compiler to optimize how a list access is performed, and the compiler then copies the target array to a work array, randomizes the access order, or uses a scalar memory access, thereby avoiding bank conflicts.

[0010]

[Means for solving the problems]

The means for solving the problems will be described with reference to FIG. 1. In FIG. 1, a source program 1 is the original program whose memory accesses are to be sped up.

[0011] The processing device 2 performs various kinds of processing according to a program, and comprises a program input means 3, a source analysis means 4, a vector processing means 6, a code generation means 9, and the like.

[0012] The source analysis means 4 analyzes the source program 1, and here includes an optimization control line analysis means 5 that analyzes optimization control lines.

[0013] The vector processing means 6 performs vectorization, and here includes a list vector access change processing means 8 and the like. Next, the operation will be described.

[0014] When the optimization control line analysis means 5 finds an optimization instruction line in the source program 1, the change processing means 8 generates and inserts an instruction that copies the target array of the list access to a work array, generates and inserts an instruction that determines the access position in the work array from the list value of the found list access, and changes the list vector access to the target array into an access to the work array.

[0015] Also, when the optimization control line analysis means 5 finds an optimization instruction line in the source program 1, the change processing means 8 generates and inserts an instruction that generates a random access position, and changes all accesses in the found list access that depend on the loop iteration to the generated access position.

[0016] Also, when the optimization control line analysis means 5 finds an optimization instruction line in the source program 1, the change processing means 8 generates and inserts a scalar instruction that loads the head position of the target array, generates and inserts an instruction that transfers the loaded scalar data to the vector data of the found list access, and generates and inserts an instruction sequence that uses the transferred vector data.

[0017] In each of these cases, when an optimization instruction line that changes the access method of a list access has been inserted in the source program 1, the processing described above is applied to the list access that follows, as specified by that optimization instruction line.

[0018] A storage medium storing a program that has the above means is also produced. Accordingly, by using an optimization instruction line that directs the compiler to optimize the access method of a list access, and by copying to a work array, randomizing the access order, or using a scalar memory access, bank conflicts are avoided and list access can be sped up without source tuning by the user.

[0019]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Next, embodiments and the operation of the present invention will be described in detail, in order, with reference to FIGS. 1 to 8. Here, a program read from a recording medium (not shown) or from an external storage device such as a hard disk device, or a program transferred over a line from an external storage device at a center, is loaded into main memory and started, and performs the various kinds of processing described below.

[0020] FIG. 1 shows a configuration diagram of one embodiment of the present invention. In FIG. 1, the source program 1 is the original program whose memory accesses are to be performed at high speed, for example a program such as the source example of FIG. 3 described later.

[0021] The processing device 2 consists of a CPU and memory and performs various kinds of processing according to a program; here it comprises the program input means 3, the source analysis means 4, the vector processing means 6, the code generation means 9, and the like.

[0022] The source analysis means 4 analyzes the source program 1, and here includes the optimization control line analysis means 5 that analyzes optimization control lines.

[0023] The vector processing means 6 performs vectorization, and here comprises a vectorization means 7 that performs the vectorization, the list vector access change processing means 8, and the like.

[0024] The change processing means 8 copies the target array to a work array, randomizes the access order, or uses a scalar memory access, so as to avoid bank conflicts during list vector access.

[0025] The code generation means 9 generates code and produces an object 10 in executable form. Next, the avoidance of bank conflicts by the array copy method will be described in detail with reference to FIGS. 2 to 4.

[0026] FIG. 2 shows a flowchart explaining the operation of the present invention (array copy method). In FIG. 2, S1 analyzes the optimization instruction line. That is, it checks whether the optimization instruction line !ocl list copy(4,A), which indicates the array copy method and appears on the third line of the source example of FIG. 3(a) described later, is present. Here, !ocl list copy(4,A) denotes the array copy method in which array A is copied four times.
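Recognizing such an optimization instruction line can be sketched as follows. The directive spellings !ocl list copy(4,A), !ocl list random(B), and !ocl list vls(B) are the ones quoted in this description; the regular expression, the function name parse_directive, and the returned dictionary shape are hypothetical and do not describe the actual compiler.

```python
import re

# Matches "!ocl list <method>(<args>)" for the three methods in this description.
_DIRECTIVE = re.compile(
    r"^\s*!ocl\s+list\s+(copy|random|vls)\s*\(([^)]*)\)\s*$", re.IGNORECASE
)


def parse_directive(line):
    """Return a dict describing the directive, or None if the line is not one."""
    match = _DIRECTIVE.match(line)
    if match is None:
        return None
    method = match.group(1).lower()
    args = [arg.strip() for arg in match.group(2).split(",")]
    if method == "copy":
        # copy(<count>,<array>): the number of copies, then the target array.
        if len(args) != 2 or not args[0].isdigit() or not args[1]:
            return None
        return {"method": "copy", "copies": int(args[0]), "array": args[1]}
    # random(<array>) and vls(<array>) take only the target array.
    if len(args) != 1 or not args[0]:
        return None
    return {"method": method, "array": args[0]}
```

For example, parse_directive("!ocl list copy(4,A)") yields the method "copy", a copy count of 4, and the target array "A", while an ordinary statement yields None.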

[0027] S2 determines whether the array copy method is specified. If YES, the process proceeds to S3. If NO, the process ends here without further processing. S3 determines whether the specified array is the target of a list access, that is, whether the array specified for the array copy method in S2 is accessed by list access. If YES, the process proceeds to S4. If NO, the process ends here without further processing.

[0028] S4 determines whether the size of the specified array is known. If YES, the size of the specified array is known, so a static work array is allocated in S5 (see ① in FIG. 3(b)) and the process proceeds to S7. If NO, a dynamic work array is allocated in S6 and the process proceeds to S7.

[0029] S7 generates instructions that copy the target array into the work array as many times as the copy count (see ② in FIG. 3(b)). S8 generates an instruction that determines the access position in the work array from the list value (see ③ in FIG. 3(b)).
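The decisions S2 to S6 and the generation steps that follow them can be summarized in a short sketch. It is only a reading aid: the function name plan_array_copy, the directive dictionary shape, and the action strings are hypothetical, and a real compiler would emit intermediate code rather than text.

```python
def plan_array_copy(directive, is_list_access, array_size):
    """Return the actions the array copy flow would take, in order.

    directive      -- e.g. {"method": "copy", "copies": 4, "array": "A"}, or None
    is_list_access -- True if the specified array is the target of a list access
    array_size     -- the array size if known at compile time, else None
    """
    if directive is None or directive.get("method") != "copy":  # S2: NO
        return []
    if not is_list_access:                                       # S3: NO
        return []
    actions = []
    if array_size is not None:                                   # S4: size known
        actions.append("S5: allocate static work array")
    else:
        actions.append("S6: allocate dynamic work array")
    actions.append(
        "S7: copy %s into the work array %d times"
        % (directive["array"], directive["copies"])
    )
    actions.append("S8: compute work-array access position from list value")
    actions.append("S9: redirect list vector access to the work array")
    return actions
```

Returning an empty list in the two NO branches mirrors the flowchart, where the process ends without transforming the loop.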

[0030] S9 changes the list vector access to the target array into an access to the work array (see ④ in FIG. 3(b)).

As described above, when an optimization instruction line for the array copy method (for example, !ocl list copy(4,A) on the third line of FIG. 3(a)) is detected in the source program 1 and the specified array is the target of a list access, a work area for that array is allocated (for example, see ① in FIG. 3(b)), instructions that copy the target array into the work array as many times as the copy count are generated (for example, see ② in FIG. 3(b)), an instruction that determines the access position in the work array from the list value is generated (for example, see ③ in FIG. 3(b)), and the list vector access to the target array is changed into an access to the work array (for example, see ④ in FIG. 3(b)). The source program 1 of FIG. 3(a) is thereby changed into the source program of FIG. 3(b); compiling it into object code and executing it reduces bank conflicts and speeds up list access. As described later with reference to FIG. 4, the array copy method is particularly effective for avoiding bank conflicts when the values of the list are consecutive.
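That the transformed loop still computes the same result can be checked with a small model. The sketch assumes the list access is a gather of the form out(i) = A(LIST(i)) and that consecutive iterations use the copies in round-robin order; both are assumptions made for illustration, since FIG. 3 is not reproduced here. All names are hypothetical.

```python
def gather_original(a, lst):
    """Original loop: out(i) = A(LIST(i)), with 1-based list values."""
    return [a[v - 1] for v in lst]


def gather_with_copies(a, lst, ncopy):
    """Transformed loop: read each element from one of 'ncopy' copies of A."""
    # Allocate the work array and copy A into it ncopy times.
    work = [list(a) for _ in range(ncopy)]
    out = []
    for i, v in enumerate(lst):
        k = i % ncopy               # copy used by this iteration (assumed round-robin)
        out.append(work[k][v - 1])  # access the work array instead of A
    return out
```

With a = [10, 20, 30] and lst = [1, 3, 1], both functions return [10, 30, 10]; only the memory locations that are read differ.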

[0031] FIG. 3 shows a source example of the present invention (array copy method). FIG. 3(a) shows the source example. Here, !ocl list copy(4,A) on the third line is the optimization instruction line for the array copy method, and the fourth and fifth lines are the list access lines.

[0032] FIG. 3(b) shows an implementation image. Here, ①, ②, ③, and ④ denote the results corresponding to ①, ②, ③, and ④ of FIG. 2 described above. In FIG. 3(b), ① is the instruction that allocates the work array.

[0033] The first ② is the instruction for copy 1 (the first copy) into the work array. The second ② is the instruction for copy 2 (the second copy) into the work array. The third ② is the instruction for copy 3 (the third copy) into the work array.

[0034] The fourth ② is the instruction for copy 4 (the fourth copy) into the work array. ③ is the instruction that determines the access position in the work array from the list value. ④ is the instruction that reads from the work array.

[0035] FIG. 4 shows a diagram explaining the operation of the present invention (array copy method). FIG. 4(a) shows example values of the array LIST. To simplify the explanation, the array size and the DO loop iteration count N in the source example are both taken to be 3, and the number of memory banks is taken to be 8. The values of the array LIST are as follows.

[0036]
LIST(1) = 1
LIST(2) = 3
LIST(3) = 1

FIG. 4(b) shows an example of memory access before the change. With the array LIST holding the values shown in FIG. 4(a), iterations 1 and 3 of the DO loop access the same position (bank) of array A, and a bank conflict occurs.

[0037] FIG. 4(c) shows the target array A copied into the work array WORK. Here, the WORK array is allocated as WORK(12,4). Into this area, the target array A is copied four times as specified, each copy shifted by one position as shown in the figure.

[0038] FIG. 4(d) shows an example of memory access after the change. With the array LIST holding the values shown in FIG. 4(a), iterations 1 and 3 of the DO loop refer to the same content A(1) but access different banks in the WORK array, so the bank conflict is avoided.
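The FIG. 4 walkthrough can be reproduced numerically under a few stated assumptions, none of which is confirmed by the text because the figure itself is not available here: memory is word-interleaved over the 8 banks, array A and WORK each start at bank 0, WORK(12,4) is stored column-major, copy k is shifted by k - 1 elements within its column, and iteration i uses copy ((i - 1) mod 4) + 1. All names are hypothetical.

```python
NBANKS = 8   # number of memory banks in the FIG. 4 walkthrough
NCOPY = 4    # copy count from "!ocl list copy(4,A)"
LD = 12      # leading dimension of WORK(12,4)

LIST = [1, 3, 1]  # values of the array LIST from FIG. 4(a)


def bank(offset):
    """Bank of a 0-based word offset under word interleaving (assumed)."""
    return offset % NBANKS


def work_offset(j, k):
    """0-based offset of A(j) inside copy k of WORK (1-based j and k).

    Column-major storage; copy k is shifted by k - 1 elements (assumed).
    """
    return (k - 1) * LD + (j - 1) + (k - 1)


# Before the change: A(LIST(i)) is read directly from A.
banks_before = [bank(v - 1) for v in LIST]

# After the change: iteration i reads copy (i mod NCOPY) + 1 (i is 0-based here).
banks_after = [bank(work_offset(v, (i % NCOPY) + 1)) for i, v in enumerate(LIST)]
```

Under these assumptions banks_before is [0, 2, 0], so iterations 1 and 3 collide on bank 0, while banks_after is [0, 7, 2], with every iteration on a different bank. The one-element shift matters in this model: since 12 mod 8 = 4, unshifted copies of A(1) would fall only on banks 0 and 4.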

[0039] Next, the avoidance of bank conflicts by the random access method will be described in detail with reference to FIGS. 5 and 6. FIG. 5 shows a flowchart explaining the operation of the present invention (random access method).

[0040] In FIG. 5, S11 analyzes the optimization instruction line. That is, it checks whether the optimization instruction line !ocl list random(B), which indicates the random access method and appears on the third line of the source example of FIG. 6(a) described later, is present.

[0041] S12 determines whether the random access method is specified. If YES, the process proceeds to S13. If NO, the process ends here without further processing. S13 determines whether the specified array is the target of a list access, that is, whether the array specified for the random access method in S12 is accessed by list access. If YES, the process proceeds to S14. If NO, the process ends here without further processing.

[0042] S14 generates an instruction that generates a random access position (see ① in FIG. 6(b)). S15 changes all accesses that depend on the loop iteration to the access position generated in ① (see ② in FIG. 6(b)).

[0043] As described above, when an optimization instruction line for the random access method (for example, !ocl list random(B) on the third line of FIG. 6(a)) is detected in the source program 1 and the specified array is the target of a list access, a function that generates random access positions is generated (see ① in FIG. 6(b)) and the accesses are changed to use the random access positions (see ② in FIG. 6(b)). The source program 1 of FIG. 6(a) is thereby changed into the source program of FIG. 6(b); compiling it into object code and executing it reduces bank conflicts and speeds up list access. The random access method is particularly effective when the same list values recur at intervals, such as 1, 2, 3, 4, 1, 2, 3, 4: by accessing in a random order that does not depend on the values of the list vector, bank conflicts are avoided.
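The effect of the random access method on the loop can be sketched as a reordering of iterations. The text states only that the position generation function is derived from the loop iteration count and the original access position; the seeded shuffle below is a stand-in chosen for illustration, not the patent's function, and all names are hypothetical. The list access is again assumed to be a gather, out(i) = B(LIST(i)).

```python
import random


def random_positions(n, seed=0):
    """Return a permutation of 0..n-1 used as the visiting order (stand-in)."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order


def gather_in_order(b, lst):
    """Original loop: out(i) = B(LIST(i)) for i = 1..n."""
    return [b[v - 1] for v in lst]


def gather_randomized(b, lst, seed=0):
    """Transformed loop: every use of the loop index goes through the permutation."""
    order = random_positions(len(lst), seed)
    out = [None] * len(lst)
    for i in range(len(lst)):
        p = order[i]            # generated access position for iteration i
        out[p] = b[lst[p] - 1]  # both the list and the result are indexed by p
    return out
```

Because each iteration is visited exactly once, the result equals that of the original loop whenever the iterations are independent of one another; only the order in which memory is touched changes.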

[0044] FIG. 6 shows a source example of the present invention (random access method). FIG. 6(a) shows the source example. Here, !ocl list random(B) on the third line is the optimization instruction line for the random access method, and the fourth and fifth lines are the list access lines.

[0045] FIG. 6(b) shows an implementation image. Here, ① and ② denote the results corresponding to ① and ② of FIG. 5 described above. In FIG. 6(b), ① is the random access position generation function. This function computes the random access position from the loop iteration count and the original access position.

[0046] ② is the generation function that changes the access into an access at the random access position. Next, the avoidance of bank conflicts by the scalar load method will be described in detail with reference to FIGS. 7 and 8.

[0047] FIG. 7 shows a flowchart explaining the operation of the present invention (scalar load method). In FIG. 7, S21 analyzes the optimization instruction line. That is, it checks whether the optimization instruction line !ocl list vls(B), which indicates the scalar load method and appears on the third line of the source example of FIG. 8(a) described later, is present.

[0048] S22 determines whether the scalar load method is specified. If YES, the process proceeds to S23. If NO, the process ends here without further processing. S23 determines whether the specified array is the target of a list access, that is, whether the array specified for the scalar load method in S22 is accessed by list access. If YES, the process proceeds to S24. If NO, the process ends here without further processing.

[0049] S24 generates a scalar instruction that loads the head position of the target array (see ① in FIG. 8(b)). S25 generates an instruction that copies the scalar data loaded in ① into vector data (see ② in FIG. 8(b)).

[0050] S26 generates the exception handling ③: it generates an instruction that performs a list access to the target array when the list value is other than the head, and stores the result into the same vector data as ②. S27 generates the instruction sequence ④ that uses the vector data stored in ② and ③.

[0051] As described above, when an optimization instruction line for the scalar load method (for example, !ocl list vls(B) on the third line of FIG. 8(a)) is detected in the source program 1 and the specified array is the target of a list access, a scalar instruction that loads the head position of the target array is generated (see ① in FIG. 8(b)), an instruction that copies the scalar data loaded in ① into vector data is generated (②), the exception handling is generated (③), and an instruction sequence that uses the stored vector data is generated (④). The source program 1 of FIG. 8(a) is thereby changed into the source program of FIG. 8(b); compiling it into object code and executing it reduces bank conflicts and speeds up list access.

With the scalar load method, when the same list value continues throughout the loop iterations, issuing a single scalar memory access instruction and using its result as vector data is more efficient than accessing the array with a list vector instruction. The scalar load method performs the memory access with a single scalar memory access and spreads the result across vector data for use; the scalar access position is assumed to be the head of the array. For exception handling (when a list value is other than the head), the possibility that different list positions appear is taken into account: an exception mask is generated, and positions other than the head are accessed with an ordinary list vector instruction.
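The scalar load method can be modeled in a few lines. The sketch assumes the list access is a gather, out(i) = B(LIST(i)), and uses plain Python lists in place of vector registers; the function name and the returned exception count are hypothetical and serve only to make the behavior observable.

```python
def gather_scalar_load(b, lst, head=1):
    """Model of the scalar load method for out(i) = B(LIST(i)).

    Returns the result vector and the number of exception elements.
    """
    scalar = b[head - 1]               # S24: one scalar load of the head element
    vec = [scalar] * len(lst)          # S25: spread the scalar across the vector
    mask = [v != head for v in lst]    # S26: exception mask (list value != head)
    for i, is_exception in enumerate(mask):
        if is_exception:
            vec[i] = b[lst[i] - 1]     # S26: ordinary list access for exceptions
    return vec, sum(mask)              # S27 would consume 'vec'
```

With b = [10, 20, 30] and lst = [1, 1, 3, 1], the result is ([10, 10, 30, 10], 1): three elements come from the single scalar load and only one needs the ordinary list access. The fewer the exceptions, the closer the cost gets to a single memory access.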

[0052] FIG. 8 shows a source example of the present invention (scalar load method). FIG. 8(a) shows the source example. Here, !ocl list vls(B) on the third line is the optimization instruction line for the scalar load method, and the fourth and fifth lines are the list access lines.

[0053] FIG. 8(b) shows an implementation image. Here, ①, ②, ③, and ④ denote the results corresponding to ①, ②, ③, and ④ of FIG. 7 described above. In FIG. 8(b), ① is the scalar load, which assumes that all values of the list are 1.

[0054] ② is the transfer to vector data. ③ is the exception handling. ④ is the instruction sequence that uses the vector data.

[0055]

[Effects of the invention]

As described above, the present invention adopts a configuration in which an optimization instruction line directs the compiler to optimize the access method of a list access, and the target array is copied to a work array, the access order is randomized, or a scalar memory access is used. Bank conflicts are therefore avoided, and list access can be sped up without source tuning by the user.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of one embodiment of the present invention.

FIG. 2 is a flowchart explaining the operation of the present invention (array copy method).

FIG. 3 is a source example of the present invention (array copy method).

FIG. 4 is a diagram explaining the operation of the present invention (array copy method).

FIG. 5 is a flowchart explaining the operation of the present invention (random access method).

FIG. 6 is a source example of the present invention (random access method).

FIG. 7 is a flowchart explaining the operation of the present invention (scalar load method).

FIG. 8 is a source example of the present invention (scalar load method).

[Explanation of symbols]

1: Source program
2: Processing device
4: Optimization control line analysis means
8: List vector access change processing means
9: Code generation means
10: Object

Claims

[Claims]

1. A memory access speed-up processing device for speeding up memory access, comprising: means for generating and inserting, when a list access is found in a source at compile time, an instruction that copies the target array of the list access to a work array; means for generating and inserting an instruction that determines the access position in the work array from the list value of the found list access; and means for changing the list vector access to the target array into an access to the work array.

2. A memory access speed-up processing device for speeding up memory access, comprising: means for generating and inserting, when a list access is found in a source at compile time, an instruction that generates a random access position; and means for changing all accesses in the found list access that depend on the loop iteration to the generated access position.

3. A memory access speed-up processing device for speeding up memory access, comprising: means for generating and inserting, when a list access is found in a source at compile time, a scalar instruction that loads the head position of the target array; means for generating and inserting an instruction that transfers the loaded scalar data to the vector data of the found list access; and means for generating and inserting an instruction sequence that uses the transferred vector data.

4. The memory access speed-up processing device according to any one of claims 1 to 3, further comprising means for performing, at compile time, when an optimization instruction line that changes the access method of a list access has been inserted in the source, the processing for the list access specified by that optimization instruction line on the list access that follows it.

5. A computer-readable recording medium storing a program that causes a computer to function as: means for generating and inserting, when a list access is found in a source at compile time, an instruction that copies the target array of the list access to a work array; means for generating and inserting an instruction that determines the access position in the work array from the list value of the found list access; and means for changing the list vector access to the target array into an access to the work array.