JP2020166661A

JP2020166661A - Division device, division method and program

Info

Publication number: JP2020166661A
Application number: JP2019067538A
Authority: JP
Inventors: 克維久保; Katsutsuna Kubo; 彰一左近; Shoichi Sakon
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2020-10-08

Abstract

To provide a division device, a division method, and a program capable of speeding up a plurality of division operations while also improving calculation accuracy when an array of dividends is divided by a divisor that is a scalar variable.

SOLUTION: The division device includes a multiplication value calculation unit that calculates a multiplication value for each of a plurality of elements included in an array of dividends by multiplying each element by the reciprocal of a divisor that is a scalar variable, and a correction value calculation unit that calculates, for each of the plurality of elements, a correction value of the multiplication value by performing a plurality of product-sum operations on each multiplication value calculated by the multiplication value calculation unit.

SELECTED DRAWING: Figure 5

Description

The present invention relates to a division device, a division method, and a program.

Patent Document 1 and Patent Document 2 disclose a technique for improving the calculation speed of an arithmetic program when loop processing that is executed repeatedly includes division by a fixed-value divisor: the reciprocal of the divisor is calculated in advance, outside the loop, and inside the loop the division is replaced with multiplication by that reciprocal.

Patent Document 3 discloses a technique in which, when an arithmetic program divides a fixed-value dividend by a fixed-value divisor, the reciprocal of the divisor is calculated and multiplied by the dividend, and the resulting product is then corrected.

Patent Document 1: Japanese Unexamined Patent Publication No. 2000-181722
Patent Document 2: Japanese Patent Application Laid-Open No. 01-140336
Patent Document 3: Japanese Patent Application Laid-Open No. 02-227726

However, in Patent Document 1 and Patent Document 2, an error may arise when the reciprocal of a floating-point divisor is calculated, and even if the division result is corrected, an error may remain in the least significant bit.
Further, in Patent Document 3, when division is performed a plurality of times, the reciprocal of the divisor must be calculated and then multiplied by the dividend for each division, so the calculation speed may decrease.
Therefore, with Patent Documents 1 to 3, when a dividend that is an array is divided by a divisor that is a scalar variable, it is not possible to speed up the plurality of division operations while also improving the calculation accuracy.

Therefore, an object of the present invention is to provide a division device, a division method, and a program that solve the above-mentioned problems.

Some aspects of the present invention have been made to solve the above-mentioned problems. A division device according to a first aspect of the present invention includes:
a multiplication value calculation unit that calculates a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
a correction value calculation unit that calculates, for each of the plurality of elements, a correction value of the multiplication value calculated by the multiplication value calculation unit, by performing a product-sum operation a plurality of times on each of the multiplication values.

Further, a division method according to a second aspect of the present invention includes:
calculating a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
calculating, for each of the plurality of elements, a correction value of the multiplication value by performing a product-sum operation a plurality of times on each of the calculated multiplication values.

Further, a program according to a third aspect of the present invention causes a computer of a division device to function as:
multiplication value calculation means for calculating a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
correction value calculation means for calculating, for each of the plurality of elements, a correction value of the multiplication value calculated by the multiplication value calculation means, by performing a product-sum operation a plurality of times on each of the multiplication values.

According to some aspects of the present invention, when a dividend that is an array is divided by a divisor that is a scalar variable, the plurality of division operations can be sped up while the calculation accuracy is also improved.

FIG. 1 is a first diagram illustrating an outline of the processing performed by the division device according to the embodiment of the present invention.
FIG. 2 is a second diagram illustrating an outline of the processing performed by the division device according to the embodiment of the present invention.
FIG. 3 is a third diagram illustrating an outline of the processing performed by the division device according to the embodiment of the present invention.
FIG. 4 is a fourth diagram illustrating an outline of the processing performed by the division device according to the embodiment of the present invention.
FIG. 5 is a block diagram showing the configuration of the division device according to the embodiment of the present invention.
FIG. 6 is a flowchart showing the processing of the division device according to the embodiment of the present invention.
FIG. 7 is a block diagram showing the configuration of a division device having the minimum configuration.
FIG. 8 is a flowchart showing the processing of the division device having the minimum configuration.

Hereinafter, embodiments of the present invention will be described.
FIGS. 1 to 4 are diagrams illustrating an outline of the processing performed by the division device 10a (FIG. 5) according to the embodiment of the present invention.
The division device 10a is, for example, a computer that performs simulation, data analysis, or the like. An operation instruction including a division operation, such as the one shown at reference sign (a) in FIG. 1, is input to the division device 10a.

The operation shown at reference sign (a) in FIG. 1 uses an array A(N) containing N real (REAL) elements, an array B(N) containing N real elements, and a real number S.
At reference sign (a) in FIG. 1, the operation A(i) = B(i) / S is performed while the value i is incremented from 1 to N (loop processing). Here, i denotes an integer from 1 to N. In the present application, the symbols "/" and "÷" denote the division operator.
The real number S is constant and does not change while the loop processing is performed.

The division device 10a replaces the operation instruction shown at reference sign (a) in FIG. 1 with the operation instruction shown at reference sign (b) in FIG. 1.
The operation shown at reference sign (b) in FIG. 1 uses an array A(N) containing N real elements, an array B(N) containing N real elements, a real number S, and a real number T.
In the operation shown at reference sign (b) in FIG. 1, the real number T, which is the reciprocal of the real number S, is first calculated by the formula T = 1.0 / S before the loop processing is performed.

Next, the loop processing is performed. That is, the operation A(i) = B(i) * T is performed while the value i is incremented from 1 to N (loop processing). In the present application, the symbols "*" and "×" denote the multiplication operator.
The operation at reference sign (a) in FIG. 1 divides by the real number S, whereas the operation at reference sign (b) in FIG. 1 multiplies by the real number T.
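The rewrite from FIG. 1(a) to FIG. 1(b) can be sketched in Python as follows. This is an illustrative sketch only, not the implementation of the division device: the function names `divide_loop` and `reciprocal_loop` are hypothetical, and Python floats (IEEE 754 double precision) stand in for the REAL variables.

```python
def divide_loop(b, s):
    """FIG. 1(a): A(i) = B(i) / S, one division per element."""
    return [b_i / s for b_i in b]


def reciprocal_loop(b, s):
    """FIG. 1(b): T = 1.0 / S once, then A(i) = B(i) * T per element."""
    t = 1.0 / s  # computed once, before the loop
    return [b_i * t for b_i in b]


if __name__ == "__main__":
    b = [49.0, 98.0, 7.0]  # hypothetical sample dividends
    s = 49.0               # hypothetical sample divisor
    print(divide_loop(b, s))
    print(reciprocal_loop(b, s))
```

With S = 49.0 and B(i) = 49.0, for instance, the division yields exactly 1.0, whereas multiplying by the rounded reciprocal yields a value just below 1.0. This is the kind of last-bit discrepancy that the correction described later is meant to address.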

According to the operation instruction shown at reference sign (a) in FIG. 1, the operation shown in FIG. 2(a) is performed on each element B(i) of the array B and mapped to registers.
That is, the i-th element A(i) of the array A is calculated by dividing the i-th element B(i) of the array B by the real number S. For example, the first element A(1) of the array A is first calculated by dividing the first element B(1) of the array B by the real number S. The same processing is repeated while i is incremented by 1, and finally the N-th element A(N) of the array A is calculated by dividing the N-th element B(N) of the array B by the real number S.

On the other hand, according to the operation instruction shown at reference sign (b) in FIG. 1, the operation shown in FIG. 2(b) is performed on each element B(i) of the array B.
That is, the i-th element A(i) of the array A is calculated by multiplying the i-th element B(i) of the array B by the real number T. For example, the first element A(1) of the array A is first calculated by multiplying the first element B(1) of the array B by the real number T. The same processing is repeated while i is incremented by 1, and finally the N-th element A(N) of the array A is calculated by multiplying the N-th element B(N) of the array B by the real number T.

FIG. 3 shows the operation instructions executed by the division device 10a according to the embodiment of the present invention.
The operation shown in FIG. 3 uses an array A(N) containing N real elements, an array B(N) containing N real elements, a real number S, a real number T, and a real number R.
In FIG. 3, as described for reference sign (b) in FIG. 1, the real number T, which is the reciprocal of the real number S, is first calculated by the formula T = 1.0 / S before the loop processing is performed.

Next, the loop processing is performed. That is, as described for reference sign (b) in FIG. 1, the operation A(i) = B(i) * T is performed while the value i is incremented from 1 to N (loop processing).
Then, a first FMA (Fused Multiply and Add) operation is performed using the formula R = FMA(B(i) - A(i) * S).
Then, a second FMA operation is performed using the formula A'(i) = FMA(A(i) + T * R).

According to the operation instructions shown in FIG. 3, the operations shown in FIG. 4 are performed on each element B(i) of the array B and mapped to registers.
That is, first, as shown at reference sign (a) in FIG. 4, the i-th element B(i) of the array B is multiplied by the real number T to calculate the i-th element A(i) of the array A. For example, the first element A(1) of the array A is first calculated by multiplying the first element B(1) of the array B by the real number T. The same processing is repeated while i is incremented by 1, and finally the N-th element A(N) of the array A is calculated by multiplying the N-th element B(N) of the array B by the real number T.

Next, as shown at reference sign (b) in FIG. 4, the operation R_i = B(i) - A(i) * S is performed using the result of the operation at reference sign (a) in FIG. 4. For example, the predetermined value R_1 is first calculated by performing the operation R_1 = B(1) - A(1) * S. The same processing is repeated while i is incremented by 1, and finally the predetermined value R_N is calculated by performing the operation R_N = B(N) - A(N) * S.
Next, as shown at reference sign (c) in FIG. 4, the operation A'(i) = A(i) + T * R_i is performed using the result of the operation at reference sign (b) in FIG. 4. For example, the correction value A'(1) is first calculated by performing the operation A'(1) = A(1) + T * R_1. The same processing is repeated while i is incremented by 1, and finally the correction value A'(N) is calculated by performing the operation A'(N) = A(N) + T * R_N.
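Why these two steps act as a correction can be seen from a short calculation. The following is an idealized sketch, not part of the original description: it ignores the rounding of the two FMA results themselves and writes the stored reciprocal as T = (1 + ε) / S, where ε is a symbol introduced here for the small relative rounding error of T.

```latex
\begin{align*}
R_i   &= B(i) - A(i)\,S, \\
A'(i) &= A(i) + T\,R_i
       = A(i) + \frac{1+\varepsilon}{S}\,\bigl(B(i) - A(i)\,S\bigr) \\
      &= \frac{B(i)}{S} + \varepsilon\left(\frac{B(i)}{S} - A(i)\right).
\end{align*}
```

Because A(i) already approximates B(i) / S to within rounding error, the remaining term is the product of two small quantities, so before its final rounding A'(i) lies much closer to B(i) / S than A(i) does.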

Next, a specific configuration and processing for realizing the processing described above with reference to FIGS. 1 to 4, which is performed by the division device 10a according to the embodiment of the present invention, will be described.

FIG. 5 is a block diagram showing the configuration of the division device 10a according to the embodiment of the present invention. The division device 10a includes a dividend acquisition unit 11, a divisor acquisition unit 12, a reciprocal calculation unit 13, a multiplication value calculation unit 14a, and a correction value calculation unit 15a.
The correction value calculation unit 15a includes a first calculation unit 151 and a second calculation unit 152.

The dividend acquisition unit 11 acquires an array B = {B(1), B(2), B(3), ..., B(N)} (N is a positive integer) of N real-valued elements input from, for example, the keyboard of a PC (personal computer), and outputs the array B to the multiplication value calculation unit 14a and the first calculation unit 151.
The divisor acquisition unit 12 acquires a divisor S, which is a real-valued scalar variable input from, for example, the keyboard of the PC, and outputs the divisor S to the reciprocal calculation unit 13 and the first calculation unit 151.

The reciprocal calculation unit 13 calculates the reciprocal T of the divisor S output from the divisor acquisition unit 12 by the formula T = 1.0 / S, and outputs the reciprocal T to the multiplication value calculation unit 14a and the second calculation unit 152.
The multiplication value calculation unit 14a calculates the multiplication value A by the formula A(i) = B(i) * T, using the array B output from the dividend acquisition unit 11 and the reciprocal T output from the reciprocal calculation unit 13, and outputs the multiplication value A to the first calculation unit 151 and the second calculation unit 152.

The first calculation unit 151 calculates the predetermined value R_i by the formula R_i = FMA(B(i) - A(i) * S), using the dividend B(i) output from the dividend acquisition unit 11, the divisor S output from the divisor acquisition unit 12, and the multiplication value A(i) output from the multiplication value calculation unit 14a, and outputs the predetermined value R_i to the second calculation unit 152. Here, FMA() indicates that an FMA (Fused Multiply and Add) operation is performed.

The second calculation unit 152 calculates the correction value A'(i) by the formula A'(i) = FMA(A(i) + T * R_i), using the reciprocal T output from the reciprocal calculation unit 13, the multiplication value A(i) output from the multiplication value calculation unit 14a, and the predetermined value R_i output from the first calculation unit 151, and outputs the correction value A' to the outside of the division device 10a.

FIG. 6 is a flowchart showing the processing of the division device 10a according to the embodiment of the present invention.
First, the divisor acquisition unit 12 acquires the divisor S, which is a real-valued scalar variable (step S101).
Next, the dividend acquisition unit 11 acquires the array B, which is the dividend and contains the real elements B(1), ..., B(N) (step S102).

Next, the reciprocal calculation unit 13 calculates the reciprocal T of the divisor S acquired in step S101 by the formula T = 1.0 / S (step S103).
Next, the multiplication value calculation unit 14a sets the variable i to 1 (step S104).
Next, the multiplication value calculation unit 14a calculates the i-th element A(i) of the multiplication value A by the formula A(i) = B(i) * T, using the element B(i) of the array B acquired in step S102 and the reciprocal T calculated in step S103 (step S105).

Next, the first calculation unit 151 calculates the predetermined value R_i by the formula R_i = FMA(B(i) - A(i) * S), using the element B(i) of the array B acquired in step S102, the divisor S acquired in step S101, and the multiplication value A(i) calculated in step S105 (step S106).
Next, the second calculation unit 152 calculates the correction value A'(i) by the formula A'(i) = FMA(A(i) + T * R_i), using the reciprocal T calculated in step S103, the multiplication value A(i) calculated in step S105, and the predetermined value R_i calculated in step S106 (step S107).

Next, the multiplication value calculation unit 14a determines whether or not the variable i is equal to the integer N, which is the number of elements included in the array B (step S108).
When the multiplication value calculation unit 14a determines that the variable i is not equal to the integer N (NO in step S108), it adds 1 to the variable i (step S109) and performs the processing of step S105 described above again (loop processing).
On the other hand, when the multiplication value calculation unit 14a determines that the variable i is equal to the integer N (YES in step S108), the second calculation unit 152 outputs the N correction values A'(i) calculated so far to the outside of the division device 10a as the correction value A', which is an array (step S110).
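The flow of steps S101 to S110 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation of the division device: the names `fma` and `corrected_divide` are hypothetical, the inputs are assumed to be finite floats, and the fused multiply-add is emulated with exact rational arithmetic (`fractions.Fraction`) so that each FMA result is rounded only once. A real system would use a hardware FMA instruction instead.

```python
from fractions import Fraction


def fma(x, y, z):
    """Emulate a fused multiply-add: x * y + z with a single rounding.

    Assumption: finite inputs. The exact rational result is rounded once
    when it is converted back to float.
    """
    return float(Fraction(x) * Fraction(y) + Fraction(z))


def corrected_divide(b, s):
    """Divide every element of b by the scalar s, following steps S101-S110."""
    t = 1.0 / s                      # S103: reciprocal, computed once
    corrected = []
    for b_i in b:                    # S104, S108, S109: loop over i = 1..N
        a_i = b_i * t                # S105: multiplication value A(i)
        r_i = fma(-a_i, s, b_i)      # S106: R_i = B(i) - A(i) * S
        corrected.append(fma(t, r_i, a_i))  # S107: A'(i) = A(i) + T * R_i
    return corrected                 # S110: output the array A'


if __name__ == "__main__":
    # Hypothetical sample inputs.
    print(corrected_divide([49.0, 1.0, 10.0], 49.0))
```

For B(i) = 49.0 and S = 49.0, the uncorrected product B(i) * T falls just below 1.0, while the two FMA steps bring the result back to exactly 1.0. The sketch only demonstrates the data flow; it makes no claim about every possible pair of inputs.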

According to the embodiment of the present invention, division by the divisor S, which is invariant during the loop processing, is replaced with multiplication by the reciprocal of the divisor S, so the reciprocal does not need to be calculated for each element of the array during the loop processing. That is, the multiplication value A(i) is calculated N times, once per element of the array B, whereas the reciprocal of the divisor S is calculated only once. Therefore, the calculation error can be eliminated without sacrificing calculation speed.

Further, performing the calculations using FMA makes it possible to obtain the same value as when the division itself is executed, and to eliminate the calculation error that arises when division is replaced with multiplication. Although the FMA calculations add to the calculation time, FMA operations are fast, so the calculation time is still shorter than when division is performed.

FIG. 7 is a block diagram showing the configuration of a division device 10b having the minimum configuration. The division device 10b includes a multiplication value calculation unit 14b and a correction value calculation unit 15b.

FIG. 8 is a flowchart showing the processing of the division device 10b having the minimum configuration.
First, the multiplication value calculation unit 14b calculates the multiplication value A(i) for each of the plurality of elements B(i) included in the array B, which is the dividend, by multiplying each element B(i) by the reciprocal T (= 1.0 / S) of the divisor S, which is a scalar variable (step S201).
Next, the correction value calculation unit 15b calculates, for each of the plurality of elements B(i), the correction value A'(i) of the multiplication value A(i) calculated by the multiplication value calculation unit 14b, by performing a product-sum operation (for example, an FMA operation) a plurality of times (for example, twice) on each multiplication value A(i) (step S202).

A program for realizing the functions of the units in FIG. 5 may be recorded on a computer-readable recording medium, and the processing of the units in FIG. 5 may be performed by causing a computer system to read and execute the program recorded on the recording medium. The term "computer system" as used herein includes an OS and hardware such as peripheral devices. The "computer system" also includes a WWW system provided with a homepage providing environment (or display environment). The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. The "computer-readable recording medium" further includes a medium that holds the program for a certain period of time, such as a volatile memory (RAM) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.

Some aspects of the present invention can be applied to division devices, division methods, programs, and the like that need to speed up a plurality of division operations while also improving calculation accuracy when a dividend that is an array is divided by a divisor that is a scalar variable.

10a, 10b ... Division device
11 ... Dividend acquisition unit
12 ... Divisor acquisition unit
13 ... Reciprocal calculation unit
14a, 14b ... Multiplication value calculation unit (multiplication value calculation means)
15a, 15b ... Correction value calculation unit (correction value calculation means)
151 ... First calculation unit
152 ... Second calculation unit

Claims

1. A division device comprising:
a multiplication value calculation unit that calculates a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
a correction value calculation unit that calculates, for each of the plurality of elements, a correction value of the multiplication value calculated by the multiplication value calculation unit, by performing a product-sum operation a plurality of times on each of the multiplication values.

2. The division device according to claim 1, wherein the correction value calculation unit performs two product-sum operations as the plurality of product-sum operations.

3. The division device according to claim 1 or 2, wherein the correction value calculation unit performs the plurality of product-sum operations by using FMA (Fused Multiply and Add) operations.

4. The division device according to any one of claims 1 to 3, wherein the correction value calculation unit includes a first calculation unit that calculates a predetermined value for each of the plurality of elements based on the dividend, the divisor, and the multiplication value calculated by the multiplication value calculation unit.

5. The division device according to claim 4, wherein the correction value calculation unit includes a second calculation unit that calculates the correction value of the multiplication value for each of the plurality of elements based on the reciprocal, the multiplication value calculated by the multiplication value calculation unit, and the predetermined value calculated by the first calculation unit.

6. The division device according to claim 4 or 5, wherein, when the i-th element of the predetermined value (i is any integer from 1 to N) is R_i, the i-th element of the dividend is B(i), the i-th element of the multiplication value is A(i), and the divisor is S, the first calculation unit calculates the predetermined value for each of the plurality of elements by the formula R_i = FMA(B(i) - A(i) * S).

7. The division device according to claim 5, wherein, when the i-th element of the correction value is A'(i), the second calculation unit calculates the correction value for each of the plurality of elements by the formula A'(i) = FMA(A(i) + T * R_i).

8. The division device according to any one of claims 1 to 7, further comprising a reciprocal calculation unit that calculates the reciprocal of the divisor only once, wherein the multiplication value calculation unit performs the process of calculating the multiplication value as many times as the number of the plurality of elements.

9. A division method comprising:
calculating a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
calculating, for each of the plurality of elements, a correction value of the multiplication value by performing a product-sum operation a plurality of times on each of the calculated multiplication values.

10. A program that causes a computer of a division device to function as:
multiplication value calculation means for calculating a multiplication value for each of a plurality of elements included in an array that is a dividend, by multiplying each of the elements by the reciprocal of a divisor that is a scalar variable; and
correction value calculation means for calculating, for each of the plurality of elements, a correction value of the multiplication value calculated by the multiplication value calculation means, by performing a product-sum operation a plurality of times on each of the multiplication values.