JP2007066128A - Compilation processing method, compilation processor and compilation processing program

Info

Publication number: JP2007066128A
Application number: JP2005252969A
Authority: JP
Inventors: Hiroko Sugiyama; 浩子杉山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-09-01
Filing date: 2005-09-01
Publication date: 2007-03-15
Anticipated expiration: 2025-09-01
Also published as: JP4819442B2

Abstract

PROBLEM TO BE SOLVED: To provide a new compilation processing technique capable of shortening the execution time of a loop that performs operations on a sparse matrix when a source program contains such a loop.

SOLUTION: Among the nested loop statements described in the source program, the compilation processor detects those whose innermost loop statement contains an operation that involves a sparse matrix and can be omitted when the sparse-matrix data is 0. For the innermost loop statement of each detected nested loop statement, it determines whether the sparse-matrix data used in the operation is data whose value is not updated as the loop iterates. Immediately before an innermost loop statement for which the data is determined not to be updated, it inserts a statement that tests whether the sparse-matrix data is 0 and, if so, proceeds to the next iteration without performing the operation in the innermost loop statement. The nested loop statement into which the statement has been inserted is then parallelized in cyclic fashion.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a compile processing method and apparatus for compiling a source program, and to a compile processing program used to implement that method. In particular, it relates to a compile processing method, apparatus, and program that can shorten execution time when the source program contains a loop that performs operations on a sparse matrix.

In the prior art, when a source program contains a loop that operates on a sparse matrix (a matrix with many zero elements), operations that need not be executed are executed anyway, so execution takes longer than necessary.

For example, in the double loop

DO J = 1, N
  DO I = 1, M
    A(I,J) = A(I,J) + B(J)*C
  ENDDO
ENDDO

which performs an operation on the matrix B, the inner loop is executed M times, as shown in FIG. 7.

In this case, the prior art executes the inner loop M times as is, without considering whether the matrix B is sparse. However, when the value of B(J) is 0, the result of the inner-loop operation is simply A(I,J), which means the inner loop need not be executed at all.

When the matrix B is sparse, the value of B(J) is often 0. Consequently, if the inner loop is executed M times as is, without considering whether B is sparse, as in the prior art, the inner loop is executed even though in many cases no computation is needed, and the execution time becomes long.
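To make the wasted work concrete, the following Python sketch (not part of the patent) runs the double loop as written and counts the inner-loop iterations spent on columns where B(J) is 0. The function name, array sizes, and values are made up for illustration.

```python
def run_double_loop(A, B, C):
    """Execute A(I,J) = A(I,J) + B(J)*C for every I and J, with no zero test.

    A is a list of M rows with N columns each; B has N entries.
    Returns the number of inner-loop iterations executed while B(J) == 0.
    """
    M, N = len(A), len(B)
    wasted = 0
    for j in range(N):            # DO J = 1, N
        for i in range(M):        # DO I = 1, M
            A[i][j] = A[i][j] + B[j] * C
            if B[j] == 0:
                wasted += 1       # this update left A(I,J) unchanged
    return wasted


# Hypothetical data: 3 rows, 4 columns, B is zero in 3 of its 4 entries.
A = [[1.0] * 4 for _ in range(3)]
B = [0.0, 2.0, 0.0, 0.0]
wasted = run_double_loop(A, B, C=5.0)
print(wasted)  # 9 of the 12 inner iterations changed nothing
```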

This problem cannot be solved simply by parallelizing the double loop described above.

That is, such a nested loop is normally parallelized over its outermost dimension. When this loop is parallelized for a parallel machine with two processors, CPU1 and CPU2, each of them executes a loop like the one shown in FIG. 8.

When the loop parallelized in this way is executed on n processors in parallel, its execution time is given by

MAX(CPU1 execution time, CPU2 execution time, ..., CPUn execution time)

Hence, if T is the execution time of the inner loop, the execution time of the double loop above is

Not parallelized: T * N
Parallelized: T * (N / 2)
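The timing model above can be written out as a small Python sketch. It is only an illustration under stated assumptions: every outer iteration is taken to cost the same time T, the iterations are split into equal contiguous blocks, and the function names and numbers are hypothetical.

```python
def parallel_time(per_cpu_times):
    """Elapsed time of a parallel region: the slowest CPU decides."""
    return max(per_cpu_times)


def block_partition_time(T, N, n_cpus):
    """Elapsed time when N outer iterations of cost T are split into equal blocks."""
    base, extra = divmod(N, n_cpus)
    # The first `extra` CPUs take one additional iteration when N is not divisible.
    counts = [base + (1 if p < extra else 0) for p in range(n_cpus)]
    return parallel_time([T * c for c in counts])


T, N = 3.0, 100
serial = T * N                              # not parallelized: T * N
two_cpu = block_partition_time(T, N, 2)     # two CPUs: T * (N / 2)
print(serial, two_cpu)  # 300.0 150.0
```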

Thus, when a source program contains a nested loop such as a double loop, parallelizing that loop can greatly shorten the execution time.

Even with such parallelization, however, each CPU still executes the inner loop as is, although in many cases no computation is needed. The problem of long execution time therefore remains unsolved.

Patent Documents 1 and 2 listed below are prior art related to the present invention.

In the invention described in Patent Document 1, when a matrix operation is performed on a sparse matrix (a matrix most of whose elements are 0), a preprocessing step detects the positions of all nonzero elements that appear in the course of the computation and creates a bitmap. Only the element positions indicated by the bitmap are then computed, so that the iterative matrix solution is carried out without wasted work.

In the invention described in Patent Document 2, when multiple tasks are executed in parallel on multiple processors to speed up a loop within a procedure, a processor that has finished executing its assigned loop goes on to execute loops that are still unprocessed, which makes the improvement in execution speed more certain.

Patent Document 1: JP-A-60-247782
Patent Document 2: JP-A-3-218556

As described above, in the prior art, when a source program contains a loop that operates on a sparse matrix, the source program is compiled as is, without regard to the fact that the loop includes operations that need not be executed.

Consequently, with the prior art, when a source program contains a loop that operates on a sparse matrix, operations that need not be executed are executed anyway, and the execution time becomes long.

The present invention has been made in view of these circumstances. Its object is to provide a new compile processing technique that, when a source program containing a loop that operates on a sparse matrix is compiled, can shorten the execution time of that loop.

To achieve this object, the compile processing apparatus of the present invention comprises: (1) detection means for detecting, among the nested loop statements described in a source program, those whose innermost loop statement contains an operation that involves a sparse matrix and whose execution can be omitted when the sparse-matrix data is 0; (2) determination means for determining, for the innermost loop statement of each nested loop statement detected by the detection means, whether the sparse-matrix data used in its operation is data whose value is not updated as the loop iterates; (3) insertion means for inserting, immediately before an innermost loop statement for which the determination means has determined that the data is not updated, a statement that tests whether the sparse-matrix data is 0 and, if it is, proceeds to the next iteration without performing the operations in the innermost loop statement; and (4) parallelization means for parallelizing, in cyclic fashion, the nested loop statement into which the insertion means has inserted the statement.

In this configuration, the detection means may also detect, among the single loop statements described in the source program, those that contain an operation that involves a sparse matrix and whose execution can be omitted when the sparse-matrix data is 0.

In that case, the determination means determines, for each single loop statement detected by the detection means, whether the sparse-matrix data used in its operation is data whose value is not updated as the loop iterates. The insertion means inserts, immediately before a single loop statement for which the data has been determined not to be updated, a statement that tests whether the sparse-matrix data is 0 and, if it is, proceeds to the next iteration without performing the operations in the single loop statement. The parallelization means then parallelizes, in cyclic fashion, the single loop statement into which the statement has been inserted.

The compile processing method of the present invention, realized by the operation of the processing means described above, can also be implemented as a computer program. This program may be provided recorded on a suitable computer-readable recording medium or provided over a network. It is installed when the invention is to be practiced and realizes the invention by running on control means such as a CPU.

In the compile processing apparatus of the present invention configured in this way, when a nested loop statement is detected whose innermost loop statement contains an operation that involves a sparse matrix and can be omitted when the sparse-matrix data is 0, the apparatus determines, for that innermost loop statement, whether the sparse-matrix data used in its operation is data whose value is not updated as the loop iterates.

Next, immediately before an innermost loop statement for which the data has been determined not to be updated, the apparatus inserts a statement that tests whether the sparse-matrix data is 0 and, if it is, proceeds to the next iteration without performing the operations in the innermost loop statement. It then parallelizes, in cyclic fashion, the nested loop statement into which the statement has been inserted.

Take as an example the double loop statement

DO J = 1, N
  DO I = 1, M
    A(I,J) = A(I,J) + B(J)*C
  ENDDO
ENDDO

which performs an operation on the sparse matrix B. As shown in FIG. 1(a), execution of the inner loop statement can be omitted when the data B(J) of the sparse matrix B is 0, and the value of B(J) is not updated as the inner loop iterates. The compile processing apparatus of the present invention therefore inserts, immediately before the inner loop statement, a statement (marked * in the figure) that tests whether the value of B(J) is 0 and, if it is, prevents the inner loop statement from being executed.
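The effect of the inserted statement can be sketched in Python as follows. This is an illustration rather than the patent's generated code: the zero test sits just before the inner loop and uses `continue` to move to the next J, and the function names and test data are hypothetical.

```python
def reference_double_loop(A, B, C):
    """Original loop: every (I, J) pair is updated, even when B(J) is 0."""
    for j in range(len(B)):
        for i in range(len(A)):
            A[i][j] += B[j] * C


def guarded_double_loop(A, B, C):
    """Loop with the zero test inserted immediately before the inner loop.

    Returns the number of inner-loop iterations actually executed.
    """
    executed = 0
    for j in range(len(B)):
        if B[j] == 0:      # inserted test on the loop-invariant operand B(J)
            continue       # skip the inner loop and go on to the next J
        for i in range(len(A)):
            A[i][j] += B[j] * C
            executed += 1
    return executed


B = [0.0, 2.0, 0.0, 0.0]
A_ref = [[1.0] * 4 for _ in range(3)]
A_opt = [[1.0] * 4 for _ in range(3)]
reference_double_loop(A_ref, B, 5.0)
executed = guarded_double_loop(A_opt, B, 5.0)
print(A_ref == A_opt, executed)  # True 3
```

Both versions produce the same A here because adding 0 * C leaves each element unchanged. That equivalence assumes C is finite; in floating-point arithmetic 0 multiplied by infinity or NaN is NaN, so the skipped update would not be a no-op in that case.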

In this way, the compile processing apparatus of the present invention inserts into the source program a statement that allows unnecessary operations to be skipped, so that those operations are not executed and the execution time is shortened.

However, consider the single loop statement shown in FIG. 1(b):

DO I = 1, M
  A(I) = A(I) + B(I)*C
ENDDO

If the above statement (marked * in the figure, that is, the statement that allows unnecessary operations to be skipped) is inserted into this loop, it must be executed on every iteration regardless of whether the value of B(I) is 0. Even if almost all values of B(I) are 0, the overhead of executing the statement makes it difficult to shorten the execution time.

For this reason, the compile processing apparatus of the present invention does not, in principle, insert the above statement into single loop statements. However, as shown in FIG. 1(c), when the operand in a single loop statement is scalar data and therefore does not change within the loop, the statement can be moved outside the loop, just as with the innermost loop statement of a nested loop statement. Its execution is then absorbed among the operations that precede it (the part denoted by ":" in the figure), so the overhead of executing the statement (marked * in the figure) need not be a concern.
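The difference between the two single-loop cases can be sketched in Python by counting how often the zero test runs. The figures themselves are not reproduced in this text, so the two functions below are only a reading of the description: one where the operand B(I) changes on every iteration, and one where the operand is a scalar that is invariant in the loop. All names and data are hypothetical.

```python
def single_loop_elementwise(A, B, C):
    """Operand B(I) varies with I, so the zero test must sit inside the loop.

    Returns the number of zero tests executed (one per iteration).
    """
    tests = 0
    for i in range(len(A)):
        tests += 1
        if B[i] == 0:
            continue
        A[i] += B[i] * C
    return tests


def single_loop_scalar(A, b, C):
    """Operand b is a loop-invariant scalar, so one test before the loop suffices.

    Returns the number of zero tests executed (always one).
    """
    tests = 1
    if b == 0:
        return tests          # the whole loop is skipped
    for i in range(len(A)):
        A[i] += b * C
    return tests


A1 = [1.0, 1.0, 1.0]
tests_elementwise = single_loop_elementwise(A1, [0.0, 2.0, 0.0], 5.0)
A2 = [1.0, 1.0, 1.0]
tests_scalar = single_loop_scalar(A2, 0.0, 5.0)
print(tests_elementwise, tests_scalar)  # 3 1
```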

Accordingly, when the compile processing apparatus of the present invention detects a single loop statement that contains an operation that involves a sparse matrix and can be omitted when the sparse-matrix data is 0, it determines whether the sparse-matrix data used in that operation is data whose value is not updated as the loop iterates. Immediately before a single loop statement for which the data has been determined not to be updated, it inserts a statement that tests whether the sparse-matrix data is 0 and, if it is, proceeds to the next iteration without performing the operations in the single loop statement. It then parallelizes, in cyclic fashion, the single loop statement into which the statement has been inserted.

Meanwhile, when a source program contains a nested loop, parallelizing that loop can greatly shorten the execution time.

For example, when the double loop

DO J = 1, N
  DO I = 1, M
    A(I,J) = A(I,J) + B(J)*C
  ENDDO
ENDDO

which performs an operation on the sparse matrix B, is parallelized for a parallel machine with two processors, CPU1 and CPU2, loops like those shown in FIG. 8 are executed and, as described above, the execution time is greatly shortened in proportion to the number of CPUs.

Suppose, however, that the first half of the values of the data B(J) of the sparse matrix B are all 0 and the second half are all nonzero. CPU2 must then execute the inner-loop operation N/2 times, whereas CPU1 need not execute it at all.

In this case, if parallelization is performed by the ordinary method shown in FIG. 7 (dividing the iterations evenly), the execution time is determined by the longest CPU execution time. With T as the execution time of the inner loop, the execution time is T * (N / 2), so in the end the execution time is shortened only by the effect of the number of CPUs.

Accordingly, when parallelizing nested loop statements and single loop statements, the compile processing apparatus of the present invention takes into account that the positions of the zeros in a sparse matrix may be unevenly distributed, and parallelizes those loop statements in cyclic fashion.

In cyclic parallelization for a parallel machine with two processors, CPU1 and CPU2, the loop is divided one iteration at a time and successive iterations are handed to the other CPU. For example, CPU1 executes J = 1, 3, 5, ... and CPU2 executes J = 2, 4, 6, ....
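A cyclic assignment of outer-loop iterations to CPUs can be sketched in Python as below. The helper name and the 1-based CPU numbering are assumptions chosen to match the J = 1, 3, 5, ... example in the text; they do not come from the patent.

```python
def cyclic_iterations(N, n_cpus, cpu):
    """Return the 1-based outer-loop indices J handled by `cpu` (also 1-based).

    Iterations are dealt out one at a time: CPU 1 gets J = 1, then CPU 2 gets
    J = 2, and so on, wrapping around after the last CPU.
    """
    return list(range(cpu, N + 1, n_cpus))


cpu1 = cyclic_iterations(10, 2, 1)
cpu2 = cyclic_iterations(10, 2, 2)
print(cpu1)  # [1, 3, 5, 7, 9]
print(cpu2)  # [2, 4, 6, 8, 10]
```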

As a result, if the first half of the values of the data B(J) of the sparse matrix B are all 0 and the second half are all nonzero, CPU1 and CPU2 share the loop execution evenly. With T as the execution time of the inner loop, the execution time is T * (N / 2) * (1 / 2), which is shorter than with the ordinary parallelization shown in FIG. 8.
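The comparison between the two distributions can be checked with a short Python sketch. It is a simplified cost model under stated assumptions: an outer iteration costs T when B(J) is nonzero and nothing when B(J) is 0 (the cost of the inserted test is ignored), N is divisible by the number of CPUs, and all function names and data are hypothetical.

```python
def block_iterations(N, n_cpus, cpu):
    """1-based indices for `cpu` under an even split into contiguous blocks."""
    size = N // n_cpus                       # assumes N is divisible by n_cpus
    start = (cpu - 1) * size + 1
    return list(range(start, start + size))


def cyclic_iterations(N, n_cpus, cpu):
    """1-based indices for `cpu` when iterations are dealt out one at a time."""
    return list(range(cpu, N + 1, n_cpus))


def elapsed(B, T, assignment, n_cpus):
    """Elapsed time: the maximum over CPUs of T per nonzero B(J) assigned to it."""
    N = len(B)
    per_cpu = []
    for cpu in range(1, n_cpus + 1):
        work = sum(T for j in assignment(N, n_cpus, cpu) if B[j - 1] != 0)
        per_cpu.append(work)
    return max(per_cpu)


N, T = 8, 1.0
B = [0.0] * (N // 2) + [1.0] * (N // 2)            # zeros first, nonzeros last
block_time = elapsed(B, T, block_iterations, 2)    # T * (N / 2)
cyclic_time = elapsed(B, T, cyclic_iterations, 2)  # T * (N / 2) * (1 / 2)
print(block_time, cyclic_time)  # 4.0 2.0
```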

When the positions of the zeros in the sparse matrix are not unevenly distributed, there is no difference between the execution time under ordinary parallelization and that under cyclic parallelization, so cyclic parallelization brings no disadvantage.

As described above, according to the present invention, when a source program containing a loop that operates on a sparse matrix is compiled, the execution time of that loop can be greatly shortened.

The present invention is described in detail below by way of embodiments.

FIG. 2 shows an embodiment of a compile processing apparatus 1 incorporating the present invention.

As shown in this figure, the compile processing apparatus 1 of the present invention generates an object program 3 by compiling a source program 2. To do so, it includes a program input unit 10 that inputs the source program 2, an optimization processing unit 11 that performs optimization processing on the source program 2 input by the program input unit 10, and an object generation unit 12 that generates the object program 3 based on the result of the optimization processing by the optimization processing unit 11.

To implement the present invention, the optimization processing unit 11 includes a detection unit 110, a determination unit 111, an insertion unit 112, and a parallelization unit 113.

The detection unit 110 detects, among the nested loop statements described in the source program 2, those whose innermost loop statement contains an operation that involves a sparse matrix and can be omitted when the sparse-matrix data is 0. It likewise detects, among the single loop statements described in the source program 2, those that contain such an operation.

The determination unit 111 determines, for the innermost loop statement of each nested loop statement detected by the detection unit 110, whether the sparse-matrix data used in its operation is data whose value is not updated as the loop iterates. It makes the same determination for each single loop statement detected by the detection unit 110.

The insertion unit 112 inserts, immediately before an innermost loop statement for which the determination unit 111 has determined that the data is not updated, a statement that tests whether the sparse-matrix data is 0 and, if it is, proceeds to the next iteration without performing the operations in the innermost loop statement. It inserts the corresponding statement immediately before a single loop statement for which the determination unit 111 has determined that the data is not updated.

The parallelization unit 113 parallelizes, in the cyclic fashion shown in FIG. 3, the nested loop statements and single loop statements into which the insertion unit 112 has inserted the statement.

Which of the matrices (arrays) appearing in the source program 2 are sparse and which are not is determined, for example, from information written in the source program 2 or from information input by the programmer.

FIGS. 4 and 5 show an embodiment of the processing flow that the optimization processing unit 11 executes to implement the present invention.

The compile optimization processing realized by the present invention is described next, following this processing flow.

On receiving the source program 2, the optimization processing unit 11 proceeds as shown in the processing flow of FIGS. 4 and 5. First, in step 10, it determines whether any unprocessed loop statements remain in the source program 2. If unprocessed loop statements remain, it proceeds to step 11 and selects one unprocessed loop statement described in the source program 2.

Next, in step 12, it determines whether the selected loop statement is a nested loop statement or a single loop statement. If it is a nested loop statement, the process proceeds to step 13, which checks whether the innermost loop statement of that nested loop statement contains one of the following operation patterns on the sparse matrix B:

A = A + B * C, A = A - B * C, A = A + B / C, A = A - B / C

These operation patterns are ones whose execution can be omitted when the data of the sparse matrix B is 0.

Step 13 therefore checks whether the innermost loop statement of the selected nested loop statement contains an operation that involves the sparse matrix B and can be omitted when the data of the sparse matrix B is 0.
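A check for these operation patterns can be sketched with Python's standard `ast` module. This is only an illustration: the patent does not say how the compiler represents statements, so the sketch assumes each assignment is available as a string in Python syntax (for example `A[i][j] = A[i][j] + B[j] * C` standing in for the Fortran statement). The function names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def base_name(node):
    """Strip subscripts and return the underlying variable name, or None."""
    while isinstance(node, ast.Subscript):
        node = node.value
    return node.id if isinstance(node, ast.Name) else None


def matches_skippable_pattern(stmt, sparse_name):
    """True if stmt has the form  A = A (+|-) B (*|/) C  with B the sparse operand."""
    body = ast.parse(stmt).body
    if len(body) != 1 or not isinstance(body[0], ast.Assign):
        return False
    assign = body[0]
    if len(assign.targets) != 1:
        return False
    target, value = assign.targets[0], assign.value
    # Outer operation must be A + (...) or A - (...), with the same A on both sides.
    if not (isinstance(value, ast.BinOp) and isinstance(value.op, (ast.Add, ast.Sub))):
        return False
    if ast.unparse(value.left) != ast.unparse(target):
        return False
    # Inner operation must be B * C or B / C, with B the sparse array.
    term = value.right
    if not (isinstance(term, ast.BinOp) and isinstance(term.op, (ast.Mult, ast.Div))):
        return False
    return base_name(term.left) == sparse_name


print(matches_skippable_pattern("A[i][j] = A[i][j] + B[j] * C", "B"))  # True
print(matches_skippable_pattern("A[i][j] = A[i][j] - B[j] / C", "B"))  # True
print(matches_skippable_pattern("A[i][j] = B[j] * C", "B"))            # False
```

The sketch accepts only the four listed forms, with B as the left operand of the product or quotient. Whether a real implementation would also accept commuted forms such as C * B is not stated in the source.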

Next, in step 14, if the check in step 13 shows that none of the above operation patterns is present in the innermost loop statement of the selected nested loop statement, the loop contains no operation that can be omitted, so the process returns to step 10 to handle the next loop statement.

If, on the other hand, step 14 finds that one of the above operation patterns is present, the process proceeds to step 15, which checks whether the operand corresponding to the sparse matrix B in the innermost loop statement is invariant data (vector data) whose value is not updated as the innermost loop iterates.

Next, in step 16, if the check in step 15 shows that the operand corresponding to the sparse matrix B in the innermost loop statement is not invariant data, inserting the statement described in step 17 would require that statement to be executed on every iteration and would actually lengthen the execution time. The process therefore returns to step 10 to handle the next loop statement.

If, on the other hand, step 16 finds that the operand is invariant data, the process proceeds to step 17, which inserts, immediately before the innermost loop statement, a statement that tests whether the value of the operand is 0 and, if it is, proceeds to the next iteration without performing the operations in the innermost loop statement.

That is, a statement such as

IF (B(k-1) .EQ. 0) GOTO 10

is inserted, as indicated by *1 in FIG. 6, or a statement such as

IF (B(k) .EQ. 0) GOTO 20

is inserted, as indicated by *2 in FIG. 6.

Next, in step 18, the nested loop statement selected in step 11 (into which the statement has been inserted by step 17) is parallelized in cyclic fashion, and the process returns to step 10 to handle the next loop statement.

If, on the other hand, step 12 finds that the selected loop statement is a single loop statement, the process proceeds to step 19, which checks whether that single loop statement contains one of the following operation patterns on the sparse matrix B:

A = A + B * C, A = A - B * C, A = A + B / C, A = A - B / C

Step 19 therefore checks whether the selected single loop statement contains an operation that involves the sparse matrix B and can be omitted when the data of the sparse matrix B is 0.

Next, in step 20, if the check in step 19 shows that none of the above operation patterns is present in the selected single loop statement, the loop contains no operation that can be omitted, so the process returns to step 10 to handle the next loop statement.

If, on the other hand, step 20 finds that one of the above operation patterns is present, the process proceeds to step 21, which checks whether the operand corresponding to the sparse matrix B in the single loop statement is invariant data (scalar data) whose value is not updated as the single loop iterates.

Next, in step 22, if the check in step 21 shows that the operand corresponding to the sparse matrix B in the single loop statement is not invariant data, inserting the statement described in step 23 would require that statement to be executed on every iteration and would actually lengthen the execution time. The process therefore returns to step 10 to handle the next loop statement.

On the other hand, when step 22 determines that the operand is invariant data, the process proceeds to step 23, which inserts, immediately before the single loop statement, a statement that checks whether the value of the operand is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the single loop statement.
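
The effect of the statement inserted in step 23 can be illustrated by comparing a loop before and after the insertion. The sketch below is illustrative only. It assumes that, for a single loop, proceeding without performing the operation amounts to bypassing the loop, that the loop is wrapped in a hypothetical function, and that the operand `b` is a scalar; the returned counter exists only to make the saved work visible.

```python
def accumulate(a, b, c):
    """Original loop: a[i] = a[i] + b * c[i] for every i. Returns updates performed."""
    ops = 0
    for i in range(len(a)):
        a[i] = a[i] + b * c[i]
        ops += 1
    return ops


def accumulate_guarded(a, b, c):
    """Same loop with the zero check inserted immediately before it."""
    if b == 0:       # inserted statement: b is invariant, so it is tested once
        return 0     # every update would add 0, so the loop is bypassed
    ops = 0
    for i in range(len(a)):
        a[i] = a[i] + b * c[i]
        ops += 1
    return ops
```

Because the operand is invariant, the check costs one comparison per loop rather than one per iteration, which is the reason step 22 rejects operands that change inside the loop.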

Next, in step 24, the single loop statement selected in step 11 (into which the statement has been inserted by step 23) is parallelized in a cyclic manner, and the process then returns to step 10 to handle the next loop statement.

Steps 10 to 24 are repeated in this way, and when step 10 determines that all the loop statements described in the source program 2 have been processed, the compilation optimization processing according to the present invention ends.

As described above, the compile processing apparatus 1 of the present invention takes the properties of sparse matrices into account and inserts into the source program 2 a statement that allows unnecessary operations to be skipped, so that those operations are not executed and the execution time is shortened.

Furthermore, when the compile processing apparatus 1 of the present invention parallelizes the source program 2 into which this statement has been inserted, it takes into account that the expected reduction in execution time may not be obtained when the positions of the zeros in the sparse matrix are unevenly distributed. It therefore parallelizes using the cyclic method rather than the usual equal-division method, which makes the reduction in execution time obtained by parallelization larger.
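
The advantage of the cyclic method when zeros are clustered can be seen in a small load calculation. The sketch below is a hypothetical model and not part of the embodiment: it assumes a nested loop whose outer iterations `j` are distributed over two workers, that an outer iteration costs `inner_trips` inner iterations when `b[j]` is nonzero and nothing when it is 0, and that the values in `b` are made up for illustration.

```python
def block_assign(n, workers):
    """Equal division: each worker gets one contiguous run of iterations."""
    base, extra = divmod(n, workers)
    chunks, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def cyclic_assign(n, workers):
    """Cyclic method: iteration j goes to worker j % workers."""
    return [list(range(w, n, workers)) for w in range(workers)]


def load_per_worker(assignment, b, inner_trips):
    """Inner-loop iterations actually executed by each worker after the zero check."""
    return [sum(inner_trips for j in js if b[j] != 0) for js in assignment]


# Hypothetical operand values: the zeros are clustered at the front.
b = [0, 0, 0, 0, 5, 7, 2, 9]
block_load = load_per_worker(block_assign(len(b), 2), b, inner_trips=100)
cyclic_load = load_per_worker(cyclic_assign(len(b), 2), b, inner_trips=100)
print(block_load)   # [0, 400]
print(cyclic_load)  # [200, 200]
```

With equal division, one worker receives only skipped iterations and the other does all the remaining work, so the parallel run takes as long as the slower worker. The cyclic assignment interleaves the iterations and, in this example, splits the remaining work evenly.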

(Supplementary note 1) A compile processing method for compiling a source program, characterized by comprising: a step of detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; a step of determining, for the innermost loop statement of the detected multiple loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; a step of inserting, immediately before an innermost loop statement for which the data has been determined not to be updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and a step of parallelizing, in a cyclic manner, the multiple loop statement into which the statement has been inserted.

(Supplementary note 2) The compile processing method according to supplementary note 1, characterized in that: in the detecting step, a single loop statement described in the source program is detected when it has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; in the determining step, it is determined, for the detected single loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; in the inserting step, a statement is inserted immediately before a single loop statement for which the data has been determined not to be updated, the statement checking whether the sparse matrix data is 0 and, if it is 0, directing execution to proceed to the next iteration without performing the operation in the single loop statement; and in the parallelizing step, the single loop statement into which the statement has been inserted is parallelized in a cyclic manner.

(Supplementary note 3) A compile processing apparatus for compiling a source program, characterized by comprising: detection means for detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; determination means for determining, for the innermost loop statement of the multiple loop statement detected by the detection means, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; insertion means for inserting, immediately before an innermost loop statement for which the determination means has determined that the data is not updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and parallelization means for parallelizing, in a cyclic manner, the multiple loop statement into which the insertion means has inserted the statement.

(Supplementary note 4) The compile processing apparatus according to supplementary note 3, characterized in that: the detection means detects a single loop statement described in the source program when it has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; the determination means determines, for the single loop statement detected by the detection means, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; the insertion means inserts, immediately before a single loop statement for which the determination means has determined that the data is not updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the single loop statement; and the parallelization means parallelizes, in a cyclic manner, the single loop statement into which the insertion means has inserted the statement.

(Supplementary note 5) A compile processing program for compiling a source program, the program causing a computer to execute: a process of detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; a process of determining, for the innermost loop statement of the detected multiple loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; a process of inserting, immediately before an innermost loop statement for which the data has been determined not to be updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and a process of parallelizing, in a cyclic manner, the multiple loop statement into which the statement has been inserted.

(Supplementary note 6) The compile processing program according to supplementary note 5, characterized in that: in the detecting process, a single loop statement described in the source program is detected when it has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0; in the determining process, it is determined, for the detected single loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates; in the inserting process, a statement is inserted immediately before a single loop statement for which the data has been determined not to be updated, the statement checking whether the sparse matrix data is 0 and, if it is 0, directing execution to proceed to the next iteration without performing the operation in the single loop statement; and in the parallelizing process, the single loop statement into which the statement has been inserted is parallelized in a cyclic manner.

An explanatory diagram of a source program used to explain the present invention.
An example embodiment of the compile processing apparatus of the present invention.
An explanatory diagram of parallelization by the cyclic method.
An example embodiment of the processing flow executed by the optimization processing unit.
An example embodiment of the processing flow executed by the optimization processing unit.
An explanatory diagram of statement insertion according to the present invention.
An explanatory diagram of a source program used to explain the prior art.
An explanatory diagram of a source program used to explain the prior art.

Explanation of symbols

1 Compile processing apparatus
2 Source program
3 Object program
10 Program input unit
11 Optimization processing unit
12 Object generation unit
110 Detection unit
111 Determination unit
112 Insertion unit
113 Parallelization unit

Claims

A compile processing method for compiling a source program, characterized by comprising:
a step of detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0;
a step of determining, for the innermost loop statement of the detected multiple loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates;
a step of inserting, immediately before an innermost loop statement for which the data has been determined not to be updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and
a step of parallelizing, in a cyclic manner, the multiple loop statement into which the statement has been inserted.

The compile processing method according to claim 1, characterized in that:
in the detecting step, a single loop statement described in the source program is detected when it has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0;
in the determining step, it is determined, for the detected single loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates;
in the inserting step, a statement is inserted immediately before a single loop statement for which the data has been determined not to be updated, the statement checking whether the sparse matrix data is 0 and, if it is 0, directing execution to proceed to the next iteration without performing the operation in the single loop statement; and
in the parallelizing step, the single loop statement into which the statement has been inserted is parallelized in a cyclic manner.

A compile processing apparatus for compiling a source program, characterized by comprising:
detection means for detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0;
determination means for determining, for the innermost loop statement of the multiple loop statement detected by the detection means, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates;
insertion means for inserting, immediately before an innermost loop statement for which the determination means has determined that the data is not updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and
parallelization means for parallelizing, in a cyclic manner, the multiple loop statement into which the insertion means has inserted the statement.

The compile processing apparatus according to claim 3, characterized in that:
the detection means detects a single loop statement described in the source program when it has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0;
the determination means determines, for the single loop statement detected by the detection means, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates;
the insertion means inserts, immediately before a single loop statement for which the determination means has determined that the data is not updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the single loop statement; and
the parallelization means parallelizes, in a cyclic manner, the single loop statement into which the insertion means has inserted the statement.

A compile processing program for compiling a source program, the program causing a computer to execute:
a process of detecting, among the multiple loop statements described in the source program, a multiple loop statement whose innermost loop statement has an operation that involves a sparse matrix and whose execution can be omitted when the data of the sparse matrix is 0;
a process of determining, for the innermost loop statement of the detected multiple loop statement, whether the sparse matrix data of that operation is data whose value is not updated as the loop iterates;
a process of inserting, immediately before an innermost loop statement for which the data has been determined not to be updated, a statement that checks whether the sparse matrix data is 0 and, if it is 0, directs execution to proceed to the next iteration without performing the operation in the innermost loop statement; and
a process of parallelizing, in a cyclic manner, the multiple loop statement into which the statement has been inserted.