JP4327533B2

JP4327533B2 - Arithmetic processing program, arithmetic processing method, and arithmetic processing apparatus

Info

Publication number: JP4327533B2
Application number: JP2003299311A
Authority: JP
Inventors: 剛司山本
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2003-08-22
Filing date: 2003-08-22
Publication date: 2009-09-09
Anticipated expiration: 2023-08-22
Also published as: JP2005071056A

Description

The present invention relates to an arithmetic processing program, an arithmetic processing method, and an arithmetic processing apparatus that translate a multi-term operation (an expression chaining several operands, such as a * b * c * d) contained in a source program.

Conventionally, when a compiler translates a multi-term operation written in a source program into hardware instructions, it expands the operation into an instruction sequence that evaluates the terms one after another from left to right, following the precedence of the operators as written.

For example, consider the multi-term operation shown in FIG. 4(a):
16-byte binary = 8-byte binary * 2-byte binary * 4-byte binary * 2-byte binary
A 32-bit (4-byte) CPU can multiply two 32-bit (4-byte) values in a single operation, and wider operands must be split. The multiplications (1) to (3) in FIG. 4(a) were therefore each expanded into hardware instructions as shown in (1) to (3) of FIG. 4(b).

As a result,
(1) of FIG. 4(b) requires 2 multiplications,
(2) of FIG. 4(b) requires 3 multiplications, and
(3) of FIG. 4(b) requires 4 multiplications,
so a total of 9 multiplications (multiplications by hardware instruction) were needed.
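The count of 9 can be reproduced with a small cost model. The sketch below is illustrative only and is not taken from the patent: it assumes that an operand of n bytes is split into ceil(n / 4) word-sized pieces, that one hardware multiply is issued per pair of pieces, and that the product of an m-byte and an n-byte value occupies m + n bytes. The names `WORD_BYTES`, `pieces`, and `count_left_to_right` are hypothetical.

```python
from math import ceil

WORD_BYTES = 4  # assumption: one hardware multiply takes two 4-byte operands


def pieces(nbytes: int) -> int:
    """Number of word-sized pieces needed to hold a value of nbytes bytes."""
    return ceil(nbytes / WORD_BYTES)


def count_left_to_right(widths: list[int]) -> int:
    """Count hardware multiplies when the terms are multiplied in the given order."""
    acc = widths[0]  # width in bytes of the running intermediate result
    total = 0
    for w in widths[1:]:
        total += pieces(acc) * pieces(w)  # one multiply per pair of pieces
        acc += w  # assumed width of the product: sum of the operand widths
    return total


# FIG. 4(a): 8-byte * 2-byte * 4-byte * 2-byte, evaluated left to right
print(count_left_to_right([8, 2, 4, 2]))  # 2 + 3 + 4 = 9
```

Under these assumptions the three steps cost 2, 3, and 4 multiplies, matching the figures quoted for (1) to (3) of FIG. 4(b).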

Because the conventional approach expands the multi-term operation into hardware instructions strictly from left to right, as shown in FIG. 4(a) above, the number of generated hardware instructions grows, and the operation cannot be executed quickly.

To solve these problems, in the present invention the operation sorting means 8 sorts the operands of a multi-term operation contained in the source program in ascending order of the number of digits. When hardware instructions are generated to evaluate the sorted operation in order from its head, the dividing means 9 splits any operand that exceeds the number of digits the target machine's hardware instruction can handle into pieces that do not exceed it. The object generation means 11 generates hardware instructions that evaluate the multi-term operation in order from its head or, where splitting occurred, generates a hardware instruction for each of the operations after splitting.

Alternatively, the combination selection means 10 selects, from the operands of the multi-term operation contained in the source program, the two whose result has the smallest number of digits; thereafter it selects, from among the results of previously selected pairs and the remaining operands, the two whose result has the smallest number of digits. When hardware instructions are generated for each selected pair, the dividing means 9 splits any operand that exceeds the number of digits the target machine's hardware instruction can handle into pieces that do not exceed it. The object generation means 11 generates the hardware instructions for the operation or, where splitting occurred, generates a hardware instruction for each of the operations after splitting.

Accordingly, when a multi-term operation is expanded into hardware instructions at compile time, either sorting its operands in ascending order of the number of digits before expansion, or expanding in order starting from the combination whose result (intermediate results included) is smallest, minimizes the number of hardware instructions that execute the operation or suppresses the generation of time-consuming instructions, and thereby improves execution speed.

Because the present invention expands a multi-term operation into the minimum number of hardware instructions at compile time, it improves the execution speed of multi-term operations appearing in a source program.

The present invention achieves this improvement in execution speed by sorting the operands of a multi-term operation in ascending order of the number of digits before expanding it into hardware instructions at compile time, or by expanding in order starting from the combination whose result (intermediate results included) is smallest, so that the number of hardware instructions is minimized or the generation of time-consuming instructions is suppressed.

FIG. 1 shows a system configuration diagram of the present invention.

In FIG. 1, the compiler 1 is a program that, when executed by a computer, compiles (translates) the source program 2 and generates the object program 3 in executable form. Here it comprises the lexical analysis means 2, syntax analysis means 3, semantic analysis means 4, optimization means 5, object generation means 11, and so on.

The lexical analysis means 2 reads the source program 2 to be compiled and analyzes its tokens; it is a known means.

The syntax analysis means 3 analyzes the syntax (structure) of the source program 2 based on the result of the lexical analysis performed by the lexical analysis means 2; it is a known means.

The semantic analysis means 4 analyzes the meaning of the source program 2 as parsed by the syntax analysis means 3; it is a known means.

The optimization means 5 performs optimization based on the result of the semantic analysis means 4 (for example, so that the executable object program runs faster or makes effective use of registers). Here it comprises the branch optimization means 6, the operation optimization means 7, and so on.

The branch optimization means 6 optimizes the branches of the source program 2; it is a known means.

The operation optimization means 7 is the part to which the present invention relates. It optimizes the compilation of multi-term operations so as to minimize the number of hardware instructions, and comprises the operation sorting means 8, the dividing means 9, the combination selection means 10, and so on (described later with reference to FIGS. 2 and 3).

The operation sorting means 8 sorts the operands of a multi-term operation in ascending order of the number of digits (described later with reference to FIG. 2).

The dividing means 9 splits an operand when, in expanding a multi-term operation into hardware instructions, the operand exceeds the number of digits the hardware can operate on (for example, it splits the operand into an upper 4 bytes and a lower 4 bytes) (described later with reference to FIGS. 2 and 3).

The combination selection means 10 selects combinations of operands with a small number of digits in the multi-term operation (described later with reference to FIG. 3).

The object generation means 11 generates the object program 3, the executable form of the source program.

The source program 2 is the source program to be compiled (translated).

The object program 3 is the executable program produced by compilation.

Next, following the order of the flowchart of FIG. 2(b), the procedure for sorting the operands of a multi-term operation in ascending order of the number of digits, expanding them into hardware instructions in sequence, and thereby minimizing the number of hardware instructions is described in detail.

FIG. 2 is an explanatory diagram (part 1) of the present invention.

FIG. 2(a) shows an example of a multi-term operation. Here the operation is the one shown in the figure:
・16-byte binary = 8-byte binary * 2-byte binary * 4-byte binary * 2-byte binary
= a * b * c * d
(a denotes an arbitrary value represented as an 8-byte binary number, and likewise for the other operands).

FIG. 2(b) shows a flowchart. It shows how the multi-term operation of FIG. 2(a) is expanded when the CPU executes it with 4-byte (32-bit) arithmetic instructions.

In FIG. 2(b), S1 reads in the expression. That is, the multi-term operation (a chain of multiplications) of FIG. 2(a) is read in as
a * b * c * d
as noted on the right-hand side of the flowchart.

S2 reorders the operands in ascending order of the number of bits. That is, the a * b * c * d read in S1 is rearranged, as noted on the right-hand side of the flowchart, into
b * d * c * a

S3 reads in the first data item, that is, the head of the multi-term operation as reordered in S2. On the first pass this is b.

S4 determines whether splitting is required, that is, whether the data read in S3 is wider than the 4 bytes that an arithmetic instruction of the CPU can handle. On the first pass, the b read in S3 is a 2-byte binary value, smaller than 4 bytes, so no splitting is needed (NO) and processing proceeds to S6. If YES, S5 splits the data into upper and lower parts no wider than the 4 bytes an arithmetic instruction can handle, and processing proceeds to S6. (If a part still exceeds 4 bytes after splitting into upper and lower parts, the data is split further into three parts, four parts, and so on, so that no part exceeds 4 bytes, before proceeding to S6.)

S6 reads in the next data item. Here this is the second operand, d (a 2-byte binary value).

S7 determines whether splitting is required. As in S4 for the multiplicand, it checks whether the multiplier is wider than the 4 bytes that an arithmetic instruction of the CPU can handle. If YES, S8 splits it into upper and lower parts (and into three parts, four parts, and so on if a part is still wider than 4 bytes), and processing proceeds to S9. If NO, processing proceeds directly to S9.

S9 performs the expansion for processing the operation. Because S3 to S5 have reduced the multiplicand to pieces of 4 bytes or less, and S7 and S8 have done the same for the multiplier, the multiplicand and multiplier are expanded into a form that can be computed (for example, a form that a hardware instruction can execute, such as the "2-byte binary * 2-byte binary" of FIG. 2(c), described later).

S10 determines whether processing is finished. If YES, processing proceeds to S12. If NO, S11 places the result at the head of the expression, and S3 onward is repeated for the second pass, the third pass, and so on.

S12 performs the expansion that carries out the operations. Because S9 has expanded the expression into operations whose operands are within the number of digits a hardware instruction can handle, those operations are now expanded into a form that uses hardware instructions.

By the above procedure, the multi-term operation a * b * c * d of FIG. 2(a) is reordered in S2 into b * d * c * a, in ascending order of the number of digits. Processing then proceeds from the head through the first pass (no splitting), the second pass (no splitting), and the third pass (intermediate result split into three) shown in the figure, yielding the expansion shown as the first, second, and third passes in FIG. 2(c), described later. A total of 6 hardware instruction executions (multiplications) suffices, 3 fewer than the conventional total of 9 described earlier, so an object program that executes faster can be generated.
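The flow S1 to S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: operands are modeled only by their byte widths, each operand is cut into pieces of at most 4 bytes, one multiply is emitted per pair of pieces, and the product of an m-byte and an n-byte value is taken to be m + n bytes wide. The function names and the textual form of the emitted multiplies are hypothetical.

```python
WORD_BYTES = 4  # assumption: a hardware multiply handles operands of up to 4 bytes


def split(nbytes: int) -> list[int]:
    """S4/S5 and S7/S8: byte widths of the pieces, none wider than a word."""
    parts = []
    while nbytes > WORD_BYTES:
        parts.append(WORD_BYTES)
        nbytes -= WORD_BYTES
    parts.append(nbytes)
    return parts


def expand_sorted(terms: dict[str, int]) -> list[list[str]]:
    """Return, for each pass, the word-sized multiplies to emit."""
    order = sorted(terms, key=terms.get)  # S2: ascending width (stable sort)
    name, width = order[0], terms[order[0]]  # S3: head of the expression
    passes = []
    for nxt in order[1:]:  # S6: next operand
        left, right = split(width), split(terms[nxt])
        passes.append([f"{name}[{i}]*{nxt}[{j}]"  # S9: one multiply per pair
                       for i in range(len(left))
                       for j in range(len(right))])
        # S11: the result becomes the new head of the expression
        name, width = f"({name}*{nxt})", width + terms[nxt]
    return passes


# FIG. 2(a): a = 8 bytes, b = 2 bytes, c = 4 bytes, d = 2 bytes
passes = expand_sorted({"a": 8, "b": 2, "c": 4, "d": 2})
for n, p in enumerate(passes, start=1):
    print(f"pass {n}: {len(p)} multiply(ies): {p}")
print("total:", sum(len(p) for p in passes))  # 1 + 1 + 4 = 6
```

Under this model the sorted order is b, d, c, a, and the passes cost 1, 1, and 4 multiplies, which agrees with the total of 6 stated in the text. Carry propagation and the additions that combine partial products are deliberately left out.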

FIG. 2(c) schematically shows the result of processing the multi-term operation of FIG. 2(a) according to the flowchart of FIG. 2(b).

・First pass: corresponds to the first pass of FIG. 2(b). In b * d (where b is a 2-byte binary value and d is a 2-byte binary value), both operands are 4 bytes or less, so they are multiplied as they are, giving a 4-byte intermediate result.

・Second pass: corresponds to the second pass of FIG. 2(b). The 4-byte intermediate result and c (a 4-byte binary value) are each 4 bytes or less, so they are multiplied as they are, giving an 8-byte intermediate result.

・Third pass: corresponds to the third pass of FIG. 2(b). The 8-byte intermediate result and a (an 8-byte binary value) are each wider than 4 bytes, so they are split as shown in the figure and the pieces are multiplied, giving a 16-byte intermediate result. Here each operand is split into an upper 4 bytes and a lower 4 bytes, and the pieces are combined into the four multiplications shown in the figure.
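The four multiplications of the third pass correspond to the usual schoolbook product of two values split into upper and lower halves. The sketch below illustrates that arithmetic in Python; it is not code from the patent. It assumes unsigned operands (the patent does not state signedness), and `mul32` is a hypothetical stand-in for a single 32-bit by 32-bit hardware multiply producing a 64-bit result.

```python
MASK32 = 0xFFFFFFFF


def mul32(x: int, y: int) -> int:
    """Hypothetical stand-in for one hardware multiply: 32-bit x 32-bit -> 64-bit."""
    assert 0 <= x <= MASK32 and 0 <= y <= MASK32
    return x * y


def mul64_via_32(p: int, q: int) -> int:
    """Multiply two unsigned 8-byte values using four 4-byte multiplies."""
    p_hi, p_lo = p >> 32, p & MASK32  # upper 4 bytes, lower 4 bytes
    q_hi, q_lo = q >> 32, q & MASK32
    return ((mul32(p_hi, q_hi) << 64)
            + ((mul32(p_hi, q_lo) + mul32(p_lo, q_hi)) << 32)
            + mul32(p_lo, q_lo))


p = 0x0123456789ABCDEF  # plays the role of the 8-byte intermediate result
q = 0xFEDCBA9876543210  # plays the role of the 8-byte operand a
print(mul64_via_32(p, q) == p * q)  # True
```

The shifts and additions that combine the partial products are ordinary instructions on the target machine; only the four calls to `mul32` count as multiplications in the tally used by the text.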

FIG. 2(d) shows that the multiplications of FIG. 2(c) total only 6. Compared with the conventional total of 9 in FIG. 4 described earlier, this is 3 fewer multiplications.

FIG. 3 is an explanatory diagram (part 2) of the present invention.

FIG. 3(a) shows an example of a multi-term operation. Here the operation is the one shown in the figure:
・8-byte binary = 2-byte binary * 2-byte binary * 2-byte binary * 2-byte binary
= a * b * c * d
(a denotes an arbitrary value represented as a 2-byte binary number, and likewise for the other operands).

FIG. 3(b) shows a flowchart. It shows how the multi-term operation of FIG. 3(a) is expanded when the CPU executes it with 4-byte (32-bit) arithmetic instructions.

In FIG. 3(b), S21 reads in the expression. That is, the multi-term operation (a chain of multiplications) of FIG. 3(a) is read in as
a * b * c * d

S22 reorders the operands in ascending order of the number of bits. That is, the a * b * c * d of FIG. 3(a) read in S21 is rearranged in ascending order of the number of bits. Here all operands have the same number of bits, so the order remains a * b * c * d.

S23 operates on the first and second operands, that is, it computes a * b from the first (head) operand a and the second operand b. If the multiplicand or multiplier (here a and b) is wider than 4 bytes, it is split so as not to exceed 4 bytes, as in S3 to S5 or S6 to S8 of FIG. 2(b) described earlier, and the operation is carried out on the split pieces.

S24 finds, among the operation results and the remaining operands, the combination whose result has the smallest number of bits. On the first pass the candidates are the result computed in S23 and the remaining operands; from the second pass on they are the previous results and the remaining operands. From these, the pair whose result has the smallest number of bits (digits) is found.

S25 performs splitting. For the two operands found in S24 (two original operands, or a previous intermediate result and an operand), any operand wider than 4 bytes (the size a hardware instruction of the CPU can handle) is split so as not to exceed 4 bytes, in the same way as S3 to S8 of FIG. 2 described earlier.

S26 performs the expansion for processing the operation. The multiplicand and multiplier selected in S24 (or their pieces after splitting in S25) are expanded into a form that can be computed (for example, a form that a hardware instruction can execute, such as the "2-byte binary * 2-byte binary" of FIG. 3(d), described later).

S27 determines whether processing is finished. If YES, processing proceeds to S28. If NO, S24 onward is repeated.

S28 performs the expansion that carries out the operations. Because S26 has expanded the expression into operations whose operands are within the number of digits a hardware instruction can handle, those operations are now expanded into a form that uses hardware instructions.

By the above procedure, the multi-term operation a * b * c * d of FIG. 3(a) is reordered in S22 in ascending order of the number of digits, the first pass multiplies the first and second operands, and thereafter the combination yielding the smallest result among the previous results and the remaining operands is repeatedly selected and computed (after splitting if it exceeds 4 bytes). For the example of FIG. 3(a), as shown in FIG. 3(c) described later, this gives
・First pass: a * b
・Second pass: c * d
・Third pass: first-pass result e * second-pass result f
so that an object program 3 that executes with a total of 3 operations (multiplications) can be generated.

FIG. 3(c) schematically shows the result of processing the multi-term operation of FIG. 3(a) according to the flowchart of FIG. 3(b).

・First pass: a * b of FIG. 3(a) is selected (a and b are 2-byte binary values). Both are 4 bytes or less, so they are multiplied as they are, giving a 4-byte intermediate result e.

・Second pass: c * d of FIG. 3(a) is selected (c and d are 2-byte binary values). Both are 4 bytes or less, so they are multiplied as they are, giving a 4-byte intermediate result f.

・Third pass: the result e of the first-pass a * b and the result f of the second-pass c * d are selected. Both are 4 bytes or less, so they are multiplied as they are, giving an 8-byte intermediate result.

FIG. 3(d) shows the operations of FIG. 3(c) in a state ready to be expanded using hardware instructions.

・First pass: corresponds to the first pass of FIG. 3(c) and shows the expansion into a * b.

・Second pass: corresponds to the second pass of FIG. 3(c) and shows the expansion into c * d.

・Third pass: corresponds to the third pass of FIG. 3(c) and shows the expansion that multiplies the first-pass result e by the second-pass result f.

FIG. 3(e) shows that a total of 3 multiplications suffices. As explained for FIGS. 3(c) and 3(d), following the flowchart of FIG. 3(b) allows the multi-term operation of FIG. 3(a) to be executed with only 3 operations (multiplications) in total.
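The selection loop of S24 to S27 can be sketched as a greedy pairing over operand widths. The code below is an illustration under assumptions, not the patent's implementation: operands are modeled by byte width only, a product of an m-byte and an n-byte value is taken to be m + n bytes wide, one multiply is counted per pair of 4-byte pieces, and the first pass is treated like every later pass (after sorting, the first two operands are already a smallest pair). The left-to-right count is included only for comparison and is derived from the same model, not quoted from the patent. All names are hypothetical.

```python
from itertools import combinations
from math import ceil

WORD_BYTES = 4  # assumption: a hardware multiply handles operands of up to 4 bytes


def pieces(nbytes: int) -> int:
    """Number of word-sized pieces needed to hold a value of nbytes bytes."""
    return ceil(nbytes / WORD_BYTES)


def count_greedy(widths: list[int]) -> int:
    """Repeatedly multiply the pair whose product is narrowest (S24 to S27)."""
    pool = sorted(widths)  # S22: ascending width
    total = 0
    while len(pool) > 1:
        # S24: choose the pair of pool entries with the smallest result width
        i, j = min(combinations(range(len(pool)), 2),
                   key=lambda ij: pool[ij[0]] + pool[ij[1]])
        total += pieces(pool[i]) * pieces(pool[j])  # S25/S26: split, then multiply
        merged = pool[i] + pool[j]
        pool = [w for k, w in enumerate(pool) if k not in (i, j)] + [merged]
    return total


def count_left_to_right(widths: list[int]) -> int:
    """Comparison baseline: multiply the terms strictly in the given order."""
    acc, total = widths[0], 0
    for w in widths[1:]:
        total += pieces(acc) * pieces(w)
        acc += w
    return total


print(count_greedy([2, 2, 2, 2]))         # a*b, c*d, e*f: 3
print(count_left_to_right([2, 2, 2, 2]))  # 1 + 1 + 2 = 4 under this model
```

For the operand widths of FIG. 3(a) the greedy pairing yields the three multiplications a * b, c * d, and e * f described in the text.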

In an arithmetic processing program that translates a multi-term operation contained in a source program, the present invention, when expanding the operation into hardware instructions at compile time, either sorts the operands in ascending order of the number of digits before expansion or expands in order starting from the combination whose result (intermediate results included) is smallest. This minimizes the number of hardware instructions that execute the multi-term operation and improves execution speed.

FIG. 1 is a system configuration diagram of the present invention.
FIG. 2 is an explanatory diagram (part 1) of the present invention.
FIG. 3 is an explanatory diagram (part 2) of the present invention.
FIG. 4 is an explanatory diagram of the prior art.

Explanation of symbols

1: Compiler
2: Lexical analysis means
3: Syntax analysis means
4: Semantic analysis means
5: Optimization means
6: Branch optimization means
7: Operation optimization means
8: Operation sorting means
9: Dividing means
10: Combination selection means
11: Object generation means
2: Source program
3: Object program

Claims

An arithmetic processing program for causing a computer to translate a multi-term operation included in a source program, the program causing the computer to execute:
a step of sorting the multi-term multiplication in the multi-term operation included in the source program in ascending order of the number of digits; and
a step of generating hardware instructions that operate in order from the head of the sorted multi-term multiplication and, when the number of digits that a hardware instruction of the executing device can operate on is exceeded, dividing the operation into a plurality of operations within a range that does not exceed that number and then generating the hardware instructions.

An arithmetic processing program for causing a computer to translate a multi-term operation included in a source program, the program causing the computer to execute:
a selection step of selecting, from the multi-term multiplication in the multi-term operation included in the source program, the two operands of the combination whose result has the smallest number of digits;
a generation step of generating, when the two operands of the combination have been selected, a hardware instruction that operates on the selected two and, when the number of digits that a hardware instruction of the executing device can operate on is exceeded, dividing the operation into a plurality of operations and then generating the hardware instructions; and
a step of repeating the selection step and the generation step in order until all terms of the multi-term operation have been selected.

An arithmetic processing method for translating a multi-term operation included in a source program, in which a computer executes:
a step of sorting the multi-term multiplication in the multi-term operation included in the source program in ascending order of the number of digits; and
a step of generating hardware instructions that operate in order from the head of the sorted multi-term multiplication and, when the number of digits that a hardware instruction of the executing device can operate on is exceeded, dividing the operation into a plurality of operations within a range that does not exceed that number and then generating the hardware instructions.

An arithmetic processing method for translating a multi-term operation included in a source program, in which a computer executes:
a selection step of selecting, from the multi-term multiplication in the multi-term operation included in the source program, the two operands of the combination whose result has the smallest number of digits;
a generation step of generating, when the two operands of the combination have been selected, a hardware instruction that operates on the selected two and, when the number of digits that a hardware instruction of the executing device can operate on is exceeded, dividing the operation into a plurality of operations and then generating the hardware instructions; and
a step of repeating the selection step and the generation step in order until all terms of the multi-term operation have been selected.

An arithmetic processing apparatus that translates a multi-term operation included in a source program, comprising:
means for sorting the multi-term multiplication in the multi-term operation included in the source program in ascending order of the number of digits; and
means for generating hardware instructions that operate in order from the head of the sorted multi-term multiplication and, when the number of digits that a hardware instruction of the executing device can operate on is exceeded, dividing the operation into a plurality of operations within a range that does not exceed that number and then generating the hardware instructions.