JPH0713963A

JPH0713963A - Method for processing scheduling of vector instruction

Info

Publication number: JPH0713963A
Application number: JP5147543A
Authority: JP
Inventors: Eiji Nunohiro; 永示布広; Kazuhiro Aida; 一弘会田; Hiroyuki Sone; 広幸曽根; Hisato Ushijima; 寿人牛島
Original assignee: Hitachi Software Engineering Co Ltd; Hitachi Ltd
Current assignee: Hitachi Software Engineering Co Ltd; Hitachi Ltd
Priority date: 1993-06-18
Filing date: 1993-06-18
Publication date: 1995-01-17

Abstract

PURPOSE: To generate an object program that makes efficient use of the arithmetic units of a vector processor by avoiding register collisions when instructions are moved. CONSTITUTION: A vector instruction scheduling unit 8 operates before a vector processor register allocation unit 9 and takes as input the intermediate language of a vectorized loop from the intermediate language 3. An instruction table creation unit 81 analyzes, for each vector instruction, the type of arithmetic unit the instruction uses and the dependencies of the vector data it uses, and holds the results in instruction tables. Based on these tables, an instruction placement unit 82 rearranges the intermediate language so that the instruction arrangement matches the composition ratio of the arithmetic units of the vector processor.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a scheduling method for vector instructions. More particularly, it relates to a vector instruction scheduling method suitable for use in a compiler that generates an object program from a source program for a vector processor having a plurality of arithmetic units capable of operating in parallel, so as to improve the utilization of those arithmetic units and produce an object program with good execution performance.

[0002]

[Prior Art] A known vector instruction scheduling method, disclosed for example in Japanese Unexamined Patent Publication No. Sho 58-149570, works on an instruction sequence to which the vector registers of the vector processor have already been allocated. It treats each partial instruction sequence whose instructions depend on one another as one instruction group, extracts a plurality of instruction groups that do not depend on each other, simulates the execution of one instruction group to find idle arithmetic units, and fills each idle slot with an instruction from another group that uses that arithmetic unit.

[0003]

[Problem to Be Solved by the Invention] The conventional method above is sufficiently effective as long as no register collision arises between mutually independent instruction groups. However, because it schedules the instruction sequence after the vector registers of the vector processor have been allocated, an instruction cannot be moved wherever moving it would cause a register collision, even between independent instruction groups and even when an arithmetic unit is idle. The idle slot therefore cannot be filled, which hinders improvement of arithmetic unit utilization.
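To make the notion of a register collision concrete, the following is a minimal Python sketch that is not taken from the patent. It assumes a hypothetical instruction form `(dest, srcs)`, a hypothetical helper `can_swap`, and invented register names (`t1`, `VR0`, and so on), and it ignores memory dependencies. It illustrates how reusing a physical register after allocation can block a move that would be legal on the pre-allocation names.

```python
def can_swap(first, second):
    """Return True if two adjacent instructions can be exchanged.

    Each instruction is a hypothetical (dest, srcs) pair; only register
    dependencies are checked, memory dependencies are ignored.
    """
    d1, s1 = first
    d2, s2 = second
    raw = d1 is not None and d1 in s2   # second reads what first writes
    war = d2 is not None and d2 in s1   # second overwrites what first reads
    waw = d1 is not None and d1 == d2   # both write the same register
    return not (raw or war or waw)


# Before register allocation: every value has its own (virtual) name.
before = [("t1", ()), ("t2", ("t1",)), ("t3", ())]

# After allocation (invented assignment): t1 and t3 share VR0.
after = [("VR0", ()), ("VR1", ("VR0",)), ("VR0", ())]

print(can_swap(before[1], before[2]))  # the independent load can move up
print(can_swap(after[1], after[2]))    # reuse of VR0 blocks the same move
```

In this sketch the third instruction is independent of the second in both versions; only the shared register name in `after` prevents the exchange.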

[0004] Accordingly, an object of the present invention is to provide a vector instruction scheduling method that avoids register collisions when instructions are moved between mutually independent instruction groups, thereby reducing idle arithmetic units and further improving their utilization.

[0005]

[Means for Solving the Problem] According to the present invention, to achieve the above object, a compiler that generates an object program from a source program for a vector processor having a plurality of arithmetic units capable of operating in parallel performs vector instruction scheduling before the process that allocates the vector registers of the vector processor.

[0006] In this vector instruction scheduling, for each vector instruction, the dependencies of the vector data used by the instruction and the type of arithmetic unit used by the instruction are analyzed, and the results are held in an instruction table prepared for that vector instruction. Based on the instruction tables, the instructions are arranged in accordance with the composition ratio of the arithmetic units of the vector processor, that is, the ratio of the number of load/store units to the number of arithmetic/logic units.

[0007]

[Operation] Before the vector registers of the vector processor are allocated, the intermediate language for a vectorized loop in the source program is input. For each vector instruction, the dependencies of the vector data used by the instruction and the type of arithmetic unit used by the instruction are analyzed, and the results are held in an instruction table prepared for that vector instruction. Based on the instruction tables, the vector instructions are scheduled so that they are arranged in accordance with the composition ratio of the arithmetic units of the vector processor, that is, the ratio of the number of load/store units to the number of arithmetic/logic units. The vector registers of the vector processor are allocated only afterwards. In this way an object program that uses the arithmetic units efficiently can be generated.

[0008]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the configuration of a compiler to which the present invention is applied. The compiler 1 generates an object program from a source program for a vector processor having a plurality of arithmetic units capable of operating in parallel. Its components are a source program analysis unit 5, a loop analysis unit 6, a vectorized loop analysis unit 7, a vector instruction scheduling unit 8, a vector processor register allocation unit 9, a storage allocation unit 10, a scalar processor register allocation unit 11, and an object program output unit 12. The vector instruction scheduling unit 8 consists of an instruction table creation unit 81 and an instruction placement unit 82.

[0009] The operation of the compiler 1 is outlined as follows. The source program analysis unit 5 reads the source program 2 from a mass storage device and converts it into the intermediate language 3. The loop analysis unit 6 takes the intermediate language 3 as input, analyzes the control flow of the source program to detect conditional and loop structures, and at the same time performs dependency analysis by examining the definition and reference relationships of variables and array data. The vectorized loop analysis unit 7 detects loop structures that can be executed as vector operations and converts them into vectorized loop structures.

[0010] The vector instruction scheduling unit 8 rearranges the intermediate language so that the resulting instruction arrangement improves the utilization of the arithmetic units. This process is described in detail later.

[0011] The vector processor register allocation unit 9 allocates the vector registers of the vector processor to the intermediate language in the vectorized loop. The storage allocation unit 10 allocates the data areas needed to execute the object program. The scalar processor register allocation unit 11 allocates the scalar processor registers used by scalar instructions. The object program output unit 12 outputs the object program 4, which consists of machine instructions, to the mass storage device.

[0012] Next, the operation of the vector instruction scheduling unit 8, the essential part of the present invention, is described in detail. As noted above, the vector instruction scheduling unit 8 consists of the instruction table creation unit 81 and the instruction placement unit 82. The operation of the instruction table creation unit 81 is described first, using the processing flow shown in FIG. 2.

[0013] First, the intermediate language of a vectorized loop is input from the intermediate language 3 shown in FIG. 1 (process 101). This intermediate language is scanned in the reverse of the order in which it appears in the source program, and it is determined whether any unscanned intermediate-language statement remains (process 102). If one remains, the instruction table creation process (processes 103 to 105) is performed; if none remains, instruction table creation ends.

[0014] The instruction table creation process (processes 103 to 105) is as follows. First, one instruction table of the format shown in FIG. 4 is allocated and its contents are initialized to 0 (process 103). The type of arithmetic unit used by the instruction, either a load/store unit or an arithmetic/logic unit, is identified from the statement being scanned and set in the instruction table as the integer value 0 or 1, respectively (process 104). Next, the vector data used by the instruction is detected from the statement being scanned, the statements that have a dependency on that vector data are detected, and sync bit n in the instruction table of each such dependent statement is set to ON (1), where n is a value from 1 to 4 corresponding to the number of the instruction operand in which the vector data is used. At the same time, the instruction chain count in the instruction table allocated in process 103 is incremented by 1 (process 105). Control then returns to process 102. This completes the description of the instruction table creation unit 81.
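The table creation steps (processes 102 to 105) can be sketched roughly as follows. This is an illustrative Python reading, not the patent's implementation: the `Stmt` and `InstrTable` layouts, the mnemonics, and the vector names are invented (FIG. 4 is not reproduced here), only flow dependencies from a definition to its later uses are modeled, each vector is assumed to be defined once, and operand 1 is assumed to be the destination so that source operands map to sync bits 2 to 4.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

LOAD_STORE, ARITH_LOGIC = 0, 1   # unit-type codes 0 and 1, as in process 104


@dataclass
class Stmt:                      # hypothetical intermediate-language statement
    op: str                      # invented mnemonic: "VLD", "VST", "VADD", ...
    dest: Optional[str]          # vector defined by the statement (None for a store)
    srcs: Tuple[str, ...] = ()   # vectors read by the statement


@dataclass
class InstrTable:                # hypothetical layout of one instruction table
    unit: int = 0
    sync: list = field(default_factory=lambda: [0] * 5)  # indices 1..4 are used
    chain: int = 0


def build_tables(stmts):
    tables = {}
    for i in range(len(stmts) - 1, -1, -1):            # reverse scan (process 102)
        table = InstrTable()                           # allocate, zeroed (process 103)
        is_mem = stmts[i].op in ("VLD", "VST")
        table.unit = LOAD_STORE if is_mem else ARITH_LOGIC   # process 104
        tables[i] = table
        if stmts[i].dest is None:
            continue
        # Later statements were scanned earlier, so their tables already exist.
        for j in range(i + 1, len(stmts)):             # process 105
            for k, src in enumerate(stmts[j].srcs):
                if src == stmts[i].dest:
                    tables[j].sync[k + 2] = 1          # assumption: operand 1 is dest
                    table.chain += 1
    return tables


# Invented example, roughly A = B * C + D on vectors.
EXAMPLE = [
    Stmt("VLD", "v1"),
    Stmt("VLD", "v2"),
    Stmt("VMUL", "v3", ("v1", "v2")),
    Stmt("VLD", "v4"),
    Stmt("VADD", "v5", ("v3", "v4")),
    Stmt("VST", None, ("v5",)),
]
```

Calling `build_tables(EXAMPLE)` leaves the three loads with all sync bits OFF, which is the condition the placement step uses to decide that a statement is ready to be placed.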

[0015] Next, the operation of the instruction placement unit 82 is described using the processing flow shown in FIG. 3. First, the instruction tables created by the instruction table creation unit 81 are input one by one, and each instruction table whose sync bits are all OFF (0) is treated as placeable and initially registered under the arithmetic unit its instruction uses, as given by the unit type in the table (process 201).

[0016] Next, the instruction placement number counter is initialized to 1 (process 202), and it is determined whether any instruction table to be placed is registered (process 203). If one is registered, control passes to process 204; if none is registered, the processing of the instruction placement unit 82 ends. In process 204, the arithmetic unit counter is set to the number of load/store units of the vector processor (2 in this embodiment), and processing proceeds to the load/store instruction placement process (processes 205 and 206).

[0017] The load/store instruction placement process (processes 205 and 206) is as follows. It is determined whether the arithmetic unit counter is nonzero and an instruction table whose unit type is 0 (load/store unit) is registered (process 205). If this condition holds, control passes to instruction rearrangement process 1 (process 206); otherwise control passes to process 207.

[0018] Instruction rearrangement process 1 (process 206) decrements the arithmetic unit counter by 1, selects from the registered instruction tables the one with the largest instruction chain count, places the statement corresponding to that instruction table at the position indicated by the instruction placement number counter, and increments the instruction placement number counter by 1. Then, for each item of vector data in the placed statement, the statements that depend on it are detected, and in the instruction table of each such statement the sync bit corresponding to the operand position that uses the vector data is set to OFF (0), which releases the synchronization wait. If all sync bits in the instruction table of a statement released in this way are now OFF, that instruction table is registered as placeable. Finally, the instruction table of the placed statement is deleted and control returns to process 205.

[0019] Process 207 sets the arithmetic unit counter to the number of arithmetic/logic units of the vector processor (2 in this embodiment) and proceeds to the arithmetic/logic instruction placement process (processes 208 and 209). This process determines whether the arithmetic unit counter is nonzero and an instruction table whose unit type is 1 (arithmetic/logic unit) is registered (process 208). If this condition holds, control passes to instruction rearrangement process 2 (process 209); otherwise control returns to process 203. Instruction rearrangement process 2 performs the same processing as instruction rearrangement process 1 and returns control to process 208.

[0020] Operating as described above, the vector instruction scheduling unit 8 schedules the intermediate language in the vectorized loop so that the instructions are arranged in accordance with the composition ratio of the arithmetic units of the vector processor.
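The placement loop of processes 201 to 209 can be sketched as follows. This is a minimal Python reading under stated assumptions, not the patent's implementation: statements are hypothetical `(unit, dest, srcs)` tuples, the per-statement `pending` set and `chain` list stand in for the sync bits and chain count of the instruction tables, only flow dependencies are modeled, each vector is assumed to be defined once, and ties on the chain count are broken by registration order, which the text does not specify.

```python
LOAD_STORE, ARITH_LOGIC = 0, 1


def schedule(stmts, n_units=(2, 2)):
    """Return the new statement order as a list of original indices.

    n_units gives the number of load/store and arithmetic/logic units
    (2 and 2 in the embodiment).
    """
    n = len(stmts)
    pending = [set() for _ in range(n)]    # stand-in for sync bits that are ON
    chain = [0] * n                        # stand-in for the chain count
    consumers = [[] for _ in range(n)]
    for i, (_, dest, _) in enumerate(stmts):
        if dest is None:
            continue
        for j in range(i + 1, n):
            for k, src in enumerate(stmts[j][2]):
                if src == dest:
                    pending[j].add(k)
                    consumers[i].append((j, k))
                    chain[i] += 1

    ready = {LOAD_STORE: [], ARITH_LOGIC: []}
    for i in range(n):                                   # process 201
        if not pending[i]:
            ready[stmts[i][0]].append(i)

    order = []                    # next position is len(order) + 1 (process 202)
    while ready[LOAD_STORE] or ready[ARITH_LOGIC]:       # process 203
        for unit in (LOAD_STORE, ARITH_LOGIC):           # processes 204 and 207
            counter = n_units[unit]
            while counter and ready[unit]:               # processes 205 and 208
                counter -= 1                             # processes 206 and 209
                i = max(ready[unit], key=lambda t: chain[t])
                ready[unit].remove(i)                    # "delete" the table
                order.append(i)
                for j, k in consumers[i]:                # release sync waits
                    if k in pending[j]:
                        pending[j].remove(k)
                        if not pending[j]:
                            ready[stmts[j][0]].append(j)
    return order


# Invented example: two independent load -> compute -> store chains.
EXAMPLE = [
    (LOAD_STORE, "v1", ()),
    (ARITH_LOGIC, "v2", ("v1",)),
    (LOAD_STORE, None, ("v2",)),
    (LOAD_STORE, "v3", ()),
    (ARITH_LOGIC, "v4", ("v3",)),
    (LOAD_STORE, None, ("v4",)),
]
print(schedule(EXAMPLE))
```

With two units of each type, the sketch interleaves the two chains: both loads are placed first, then both arithmetic statements, then both stores, so that each group of placed statements matches the 2:2 unit ratio.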

[0021] An example of vector instruction scheduling using the scheduling method of the present invention is given below.

[0022] FIG. 5 shows an example of the intermediate language of a vectorized loop obtained when the compiler 1 shown in FIG. 1 takes the source program 2 as input and the source program analysis unit 5, the loop analysis unit 6, and the vectorized loop analysis unit 7 perform their respective processing. A scheduling example using this intermediate language is described below.

[0023] The instruction table creation unit 81 scans the intermediate language shown in FIG. 5 in reverse order, that is, in the order , , ..., , analyzes for each statement the type of arithmetic unit used by the instruction and the dependencies of the vector data used by the instruction, and creates the instruction tables shown in FIG. 6.

[0024] The instruction placement unit 82 inputs the instruction tables shown in FIG. 6 in the order , , ..., , initially registers the instruction tables , , , , whose sync bits are all OFF, as placeable instructions that use a load/store unit, and chains the instruction tables in the order , , , .

[0025] Next, the instruction placement number counter is set to 1. Because instruction tables to be placed ( , , , ) are registered, the arithmetic unit counter is set to the number of load/store units, 2, and the load/store instruction placement process is performed. In this process, the statement corresponding to instruction table  is placed as the first instruction, sync bit 2 of instruction table  is set to OFF, and instruction table  is deleted. Then the statement corresponding to instruction table  is placed as the second instruction and sync bit 3 of instruction table  is set to OFF. Because all sync bits of instruction table  are now OFF, instruction table  is registered as a placeable instruction that uses an arithmetic/logic unit, and instruction table  is deleted. The arithmetic unit counter is then set to the number of arithmetic/logic units, 2, and the arithmetic/logic instruction placement process is performed. In this process, the statement corresponding to instruction table  is placed as the third instruction and sync bit 2 of instruction table  is set to OFF. Because all sync bits of instruction table  are now OFF, instruction table  is registered as a placeable instruction that uses a load/store unit, and instruction table  is deleted.

[0026] By repeating the load/store instruction placement process and the arithmetic/logic instruction placement process in this way for as long as instruction tables to be placed remain registered, the arrangement of the intermediate language shown in FIG. 7 is obtained.

[0027] To compare the arithmetic unit utilization of the intermediate language of the vectorized loop before scheduling (FIG. 5) with that after scheduling (FIG. 7), instruction timing charts for the two are shown in FIGS. 8(a) and 8(b), respectively. As these timing charts make clear, the scheduling process of the present invention improves the utilization of the arithmetic units.

[0028]

[Effect of the Invention] As described above, the vector instruction scheduling method of the present invention makes it possible to schedule instructions while avoiding register collisions between mutually independent instructions. An object program with improved arithmetic unit utilization can thus be generated, and the execution time of the object program can be shortened.

[Brief description of drawings]

FIG. 1 is a configuration diagram of one embodiment of a compiler to which the present invention is applied.

FIG. 2 is a flowchart of the instruction table creation process in the vector instruction scheduling unit.

FIG. 3 is a flowchart of the instruction placement process in the vector instruction scheduling unit.

FIG. 4 shows an example of the contents of an instruction table.

FIG. 5 shows an example of the intermediate language of a vectorized loop.

FIG. 6 shows an example of the instruction tables created for the intermediate language of FIG. 5.

FIG. 7 shows the arrangement of the intermediate language of FIG. 5 after instruction scheduling.

FIG. 8(a) is an instruction timing chart for the intermediate language of FIG. 5. FIG. 8(b) is an instruction timing chart for the intermediate language of FIG. 7.

[Explanation of symbols]

1: compiler
2: source program
3: intermediate language
4: object program
5: source program analysis unit
6: loop analysis unit
7: vectorized loop analysis unit
8: vector instruction scheduling unit
81: instruction table creation unit
82: instruction placement unit
9: vector processor register allocation unit
10: storage allocation unit
11: scalar processor register allocation unit
12: object program output unit

Continuation of front page

(72) Inventor: Kazuhiro Aida, c/o Software Development Division, Hitachi, Ltd., 5030 Totsuka-cho, Totsuka-ku, Yokohama-shi, Kanagawa
(72) Inventor: Hiroyuki Sone, c/o Hitachi Software Engineering Co., Ltd., 6-81 Onoe-cho, Naka-ku, Yokohama-shi, Kanagawa
(72) Inventor: Hisato Ushijima, c/o Hitachi Software Engineering Co., Ltd., 6-81 Onoe-cho, Naka-ku, Yokohama-shi, Kanagawa

Claims

[Claims]

1. In a compiler that generates an object program from a source program for a vector processor having a plurality of arithmetic units capable of operating in parallel, a vector instruction scheduling method characterized by: sequentially inputting the intermediate language in a vectorized loop; analyzing, for each vector instruction, the dependencies of the vector data used by the instruction and the arithmetic unit used by the instruction; creating instruction tables that hold the analysis results; and, based on the created instruction tables, performing vector instruction scheduling immediately before vector register allocation so that the instructions are arranged in accordance with the composition ratio of the arithmetic units of the vector processor.