JP2011145999A

JP2011145999A - Method and apparatus for calculating simultaneous linear equation

Info

Publication number: JP2011145999A
Application number: JP2010008419A
Authority: JP
Inventors: Yuichiro Miki; 雄一郎三木
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-01-18
Filing date: 2010-01-18
Publication date: 2011-07-28
Anticipated expiration: 2030-01-18
Also published as: JP5570228B2

Abstract

<P>PROBLEM TO BE SOLVED: To provide a preconditioned iterative solution that has a highly accelerated effect of preconditioning, and a practical preconditioned iterative solution that does not leave parameter determination to a user by providing the step of determining appropriate parameters depending on a coefficient matrix. <P>SOLUTION: A coefficient matrix A is incompletely triangular-factorized into a lower triangular matrix L and an upper triangular matrix U, and diagonal elements and off-diagonal elements of L and U are multiplied by respective scaling parameters to form a lower triangular matrix L' and an upper triangular matrix U'. The preconditioned iterative solution is executed by the use of a preconditioned matrix M given by L' and U'. The step of scaling parameter determination estimates such scaling parameters that minimize an evaluation function defined by the coefficient matrix A and preconditioned matrix M. The evaluation function is the Euclidean norm of a vector expressed by a formula whose component is the sum of columns of a difference matrix between the coefficient matrix A and the preconditioned matrix M. <P>COPYRIGHT: (C)2011,JPO&INPIT

Description

本発明は、連立一次方程式の計算方法に関し、特に、係数行列が大規模な疎行列である連立一次方程式の前処理付き反復解法に関する。 The present invention relates to a method for calculating simultaneous linear equations, and more particularly to a preprocessed iterative solution of simultaneous linear equations whose coefficient matrix is a large-scale sparse matrix.

物理現象の解析など各種の工学的な問題は、模擬的に表現された偏微分方程式を離散化することにより、最終的に多元連立一次方程式に帰着する。多くの場合、離散化の近似精度は高々数次程度なので方程式の係数は疎行列となる。一方問題を精度よく解くためには離散化の要素数を増やす必要があるため、係数行列は大規模な行列となることが多い。そのため大規模な疎行列を係数行列にもつ連立一次方程式を求解することは、非常に重要である。 Various engineering problems such as the analysis of physical phenomena are ultimately reduced to a multi-dimensional simultaneous linear equation by discretizing a partial differential equation expressed in a simulated manner. In many cases, the approximation accuracy of discretization is at most several orders, so the coefficients of the equations are sparse matrices. On the other hand, in order to solve the problem with high accuracy, it is necessary to increase the number of discretization elements, and thus the coefficient matrix is often a large matrix. Therefore, it is very important to solve simultaneous linear equations having a large-scale sparse matrix as a coefficient matrix.

反復解法とは連立一次方程式の求解手法の一つであり、特に係数行列が大規模疎行列の場合に最も有効な手法として知られている。具体的なアルゴリズムは、最初に近似解として適当な初期値を与え、より正確な解となるよう近似解の更新処理を繰り返すことで、徐々に真の解に近づけていくというものである。例えば正定値対称行列に特化したCG法（共役勾配法）や、非対称行列にも対応したBiCGStab法（安定化双共役勾配法）などがよく利用されている。 The iterative solving method is one of solving methods of simultaneous linear equations, and is known as the most effective method especially when the coefficient matrix is a large-scale sparse matrix. A specific algorithm is to first give an appropriate initial value as an approximate solution, and repeat the update process of the approximate solution to obtain a more accurate solution, thereby gradually approaching the true solution. For example, the CG method (conjugate gradient method) specialized for positive definite symmetric matrices and the BiCGStab method (stabilized biconjugate gradient method) corresponding to asymmetric matrices are often used.

反復解法においてはその収束性、すなわちいかに少ない反復回数で近似解を真の解に十分近づけることができるか、が重要なポイントとなる。この収束性は係数行列の条件数、すなわち固有値の最大値を最小値で割った値に大きく左右され、条件数が小さいほど収束性が向上する。この性質を踏まえ、反復解法を高速化する手段として各種の前処理法が提案されている。 In the iterative solution method, the convergence point, that is, how close the approximate solution can be to be sufficiently close to the true solution with a small number of iterations is an important point. This convergence is greatly influenced by the condition number of the coefficient matrix, that is, the value obtained by dividing the maximum value of the eigenvalues by the minimum value. The smaller the condition number, the better the convergence. Based on this property, various preprocessing methods have been proposed as means for speeding up the iterative solution.

前処理の基本的な考え方は、解くべき方程式の係数行列を近似した前処理行列を用意し、その逆行列を方程式の両辺に乗じることで、方程式自体をより条件数の小さい係数行列で再構成し、反復解法における収束性を高めるというものである。実際は、方程式の再構成は負荷が大きいため陽には行わず、反復解法の反復ルーチン内に、前処理行列によるベクトルの修正処理（以降前処理と呼ぶ）を組み込む事で、再構成後の方程式に対する反復解法と数学的に同値となるアルゴリズムを構成する。 The basic idea of preprocessing is to prepare a preprocessing matrix that approximates the coefficient matrix of the equation to be solved, and multiply both sides of the equation by the inverse matrix, thereby reconstructing the equation itself with a coefficient matrix with a smaller condition number In addition, the convergence in the iterative solution is improved. Actually, the reconstruction of the equation is not performed explicitly because of the heavy load, and the reconstructed equation is incorporated by incorporating a vector correction process (hereinafter referred to as preprocessing) by a preprocessing matrix in the iterative routine of the iterative solution method. Construct an algorithm that is mathematically equivalent to the iterative solution to.

前処理行列に求められる性質は２つある。１つは係数行列に対する近似性が高いことである。近似性が高いほど、前処理によって係数行列が単位行列に近づき条件数が小さくなるため、反復解法の収束性が向上する。もう１つは、前処理行列の作成及び前処理に要する時間が少ないことである。前処理により収束性が向上しても、これらの処理に時間がかかり過ぎると解法全体として計算時間が増加する恐れがあるからである。一般にこの両者はトレードオフの関係にあり、バランスよく両立させるために様々な前処理法が提案されている。 There are two properties required for the preprocessing matrix. One is that the closeness to the coefficient matrix is high. The higher the approximation, the closer the coefficient matrix to the unit matrix by the preprocessing and the smaller the condition number, so the convergence of the iterative solution improves. The other is that the time required for creating and preprocessing the preprocessing matrix is small. This is because even if the convergence is improved by the preprocessing, if these processes take too long, the calculation time may increase as a whole solution. In general, the two are in a trade-off relationship, and various preprocessing methods have been proposed in order to achieve a good balance.

その中でも最もポピュラーな方法の一つとして、不完全三角分解前処理（非特許文献１参照）が挙げられる。この方法は係数行列の三角分解を不完全に行い、その結果を前処理行列として用いる前処理法である。以下、その詳細について説明する。
三角分解とは、行列を下三角行列Ｌと上三角行列Ｕの積に分解するＬＵ分解と、下三角行列Ｌ、対角行列Ｄ、及び上三角行列Ｕとの積に分解するＬＤＵ分解の総称である。三角分解は、本来直接解法と呼ばれる連立一次方程式の求解手段の一種であるが、係数行列が大規模な疎行列の場合は通常使用されない。なぜならば、疎行列の三角分解には多くのｆ
ｉｌｌ−ｉｎ（フィルイン）が発生し、行列が大規模な場合に膨大な計算時間とメモリを必要とするからである。ここでｆｉｌｌ−ｉｎとは、分解中に元の行列では零であった位置へ新たに非零要素の書き込みが発生することを指す。
不完全三角分解前処理とは、係数行列の三角分解において、上記ｆｉｌｌ−ｉｎを棄却することで不完全に分解された行列を前処理行列として用いる前処理法である。前処理行列の係数行列に対する近似性は、例えば係数行列の対角成分のみを抜き出した行列を前処理行列とする対角スケーリング前処理に比べて高く、より良い収束性を示す。一方前処理行列の作成には多少の時間を要するが、ｆｉｌｌ−ｉｎを棄却するため比較的短時間で可能である。また前処理についても同様で、前処理行列がＬ＊ＵもしくはＬ＊Ｄ＊Ｕの形に分解されているため前進代入・後退代入という簡便な操作で済むことと、ＬやＵが係数行列の疎性を継承するため演算量が少ないことから、こちらも短時間で実行可能である。これらの特徴により、効率よく反復解法を高速化することができるため、最もポピュラーな前処理法の一つとして利用されている。 Among them, one of the most popular methods is incomplete trigonometric decomposition preprocessing (see Non-Patent Document 1). This method is a preprocessing method in which triangulation of a coefficient matrix is incomplete and the result is used as a preprocessing matrix. The details will be described below.
Triangular decomposition is a generic term for LU decomposition that decomposes a matrix into a product of a lower triangular matrix L and an upper triangular matrix U, and LDU decomposition that decomposes into a product of a lower triangular matrix L, a diagonal matrix D, and an upper triangular matrix U. It is. Triangular decomposition is a kind of solution method for simultaneous linear equations, which is essentially called direct solution, but is not usually used when the coefficient matrix is a large-scale sparse matrix. This is because there are many f
This is because ill-in (fill-in) occurs and a large amount of calculation time and memory are required when the matrix is large. Here, fill-in means that a new non-zero element is written to a position that was zero in the original matrix during decomposition.
The incomplete triangulation preprocessing is a preprocessing method that uses, as a preprocessing matrix, a matrix that is incompletely decomposed by rejecting the fill-in in the triangulation of the coefficient matrix. The closeness of the preprocessing matrix to the coefficient matrix is higher than, for example, diagonal scaling preprocessing using a matrix obtained by extracting only the diagonal components of the coefficient matrix as a preprocessing matrix, and shows better convergence. On the other hand, it takes some time to create the preprocessing matrix, but it can be done in a relatively short time because the fill-in is rejected. The same applies to pre-processing. Since the pre-processing matrix is decomposed into L * U or L * D * U, simple operations such as forward substitution and backward substitution are sufficient, and L and U are coefficient matrices. Since the sparseness is inherited, the amount of calculation is small, so this can also be executed in a short time. These features enable efficient iterative solution speedup and are used as one of the most popular preprocessing methods.

不完全三角分解前処理には、より収束性を高めるためにいくつかの派生法が提案されている。いずれも前処理行列の作成時に何らかの工夫を施すことを特徴とする。
シフトパラメタ付き不完全三角分解前処理（非特許文献１参照）と、補正パラメタ付き不完全三角分解前処理（非特許文献２参照）は、どちらも前処理行列作成の際に、元の係数行列の対角項にある変更を施してから不完全三角分解を行うことを特徴とする。前者は対角項にシフトパラメタを加算し、後者は対角項に補正パラメタを乗算する。特に係数行列の対角優位性が低い場合に、分解により対角項がゼロに近づくことを防ぎ、収束性の悪化を避けることを目的としている。
修正不完全三角分解前処理（非特許文献１参照）は、不完全三角分解においてｆｉｌｌ−ｉｎを単に棄却するのではなく、ｆｉｌｌ−ｉｎが発生した行の対角項からこの量を引くことを特徴とする。これは前処理行列が元の行列と同じ行和を持つよう強制するための操作である。実際は、ｆｉｌｌ−ｉｎに緩和パラメタを乗じた上で対角項から引くことが多く、この緩和パラメタ次第で収束性が向上することが知られている。
ｆｉｌｌ−ｉｎレベル付き不完全三角分解前処理（非特許文献１参照）は、不完全三角分解時のｆｉｌｌ−ｉｎを完全に棄却するのではなく、一定レベル以下のｆｉｌｌ−ｉｎを許容する手法である。ここでｆｉｌｌ−ｉｎのレベルとは、元の係数行列では非零であった位置をレベル０として、以降分解によってｋレベルの要素を参照するｆｉｌｌ−ｉｎはレベルｋ＋１と定義されるというものである。許容するレベルを上げるほど、不完全分解の“不完全さ”が低減するため、係数行列に対する近似性が高まり収束性が向上する。 Several derivation methods have been proposed for incomplete triangulation preprocessing in order to further improve convergence. Both are characterized in that some kind of contrivance is applied when the preprocessing matrix is created.
Both the incomplete triangulation preprocessing with shift parameters (see Non-Patent Document 1) and the incomplete triangulation preprocessing with correction parameters (see Non-Patent Document 2) are both performed at the time of creating the preprocessing matrix. Incomplete triangulation is performed after the change in the diagonal term. The former adds a shift parameter to the diagonal term, and the latter multiplies the diagonal term by a correction parameter. In particular, when the diagonal advantage of the coefficient matrix is low, the objective is to prevent the diagonal term from approaching zero due to the decomposition and to avoid deterioration of convergence.
The modified incomplete triangulation preprocessing (see Non-Patent Document 1) does not simply reject the fill-in in the incomplete triangulation, but subtracts this amount from the diagonal term of the line where the fill-in occurs. Features. This is an operation for forcing the preprocessing matrix to have the same row sum as the original matrix. Actually, the fill-in is often multiplied by a relaxation parameter and then subtracted from the diagonal term, and it is known that the convergence is improved depending on the relaxation parameter.
The pre-processing of incomplete triangulation with a fill-in level (see Non-Patent Document 1) is a technique that does not completely reject the fill-in at the time of incomplete triangulation but allows a fill-in below a certain level. is there. Here, the fill-in level means that a position that is non-zero in the original coefficient matrix is set to level 0, and a fill-in that refers to an element at k level by decomposition is defined as level k + 1. . As the permissible level is increased, the “incompleteness” of the incomplete decomposition is reduced, so that the approximation to the coefficient matrix is increased and the convergence is improved.

J.J.ドンガラ他著／長谷川里美長谷川秀彦藤野清次訳，「反復法Ｔｅｍｐｌａｔｅｓ」，朝倉書店，１９９６年，ｐｐ．５３−５７J.J. Dongara et al./Satomi Hasegawa Hidehiko Hasegawa Kiyoji Fujino, “Iteration Method Templates”, Asakura Shoten, 1996, pp. 53-57 片岡勲安田秀辛高野直樹芝原正彦著，「数値解析入門」，コロナ社，２００２年，ｐｐ．５７Isao Kataoka Hideshi Yasuda Naoki Takano, Masahiko Shibahara, “Introduction to Numerical Analysis”, Corona, 2002, pp. 57

以上説明した背景技術には２つの課題がある。１つは前処理による高速化の効果が低い点で、もう１つは各派生法に関し、それぞれ固有のパラメタについて、適切な値を決定する工程を持たない点である。 The background art described above has two problems. One is that the effect of speeding up by preprocessing is low, and the other is that each derivative method does not have a step of determining an appropriate value for each unique parameter.

第１の課題である、前処理による高速化の効果が低い点について、その要因を以下に詳しく説明する。まず基本となる不完全三角分解前処理は、係数行列の三角分解を不完全な状態で中断し、そのまま前処理行列として利用する。しかし分解直後の前処理行列が、所
望の疎性を満たす範囲で係数行列に対し最良の近似性を持つという保障はなく、前処理による収束性の向上は限定的となる。またそれ以外の各派生法についても、基本となる不完全三角分解前処理に対して、不完全分解前および分解中に何らかの補正を施すものであり、不完全分解直後の行列をそのまま前処理行列として利用することに変わりはなく、やはり収束性の向上は限定的となる。 The reason why the effect of speeding up by preprocessing, which is the first problem, is low will be described in detail below. First, in the basic incomplete triangulation, the triangulation of the coefficient matrix is interrupted in an incomplete state and used as it is as a preprocessing matrix. However, there is no guarantee that the preprocessing matrix immediately after decomposition has the best approximation to the coefficient matrix within the range satisfying the desired sparseness, and the improvement in convergence by the preprocessing is limited. For other derivation methods, the basic incomplete triangulation preprocessing is corrected before and during incomplete decomposition, and the matrix immediately after the incomplete decomposition is used as the preprocessing matrix. However, the improvement in convergence is limited.

また各派生法はそれぞれ、高速化の効果を阻害する以下の要因を抱えている。まずシフトパラメタ付き及び補正パラメタ付き不完全三角分解前処理については、分解前の係数行列の対角項に変更を加えることが、前処理行列の係数行列に対する近似性を向上させる直接的な手段とはならないため、収束性の向上が限定的となる。また修正不完全三角分解については、本来計算する必要のないｆｉｌｌ−ｉｎを計算し、対角項から引くという操作を行うため、前処理行列の作成に多くの時間が必要となる。ｆｉｌｌ−ｉｎレベル付き不完全三角分解前処理についても、ｆｉｌｌ−ｉｎ許容量を増やすほど収束性はより向上するが、その代償として前処理行列の作成および前処理に要する計算時間が増加してしまう。 In addition, each derivative method has the following factors that impede the speed-up effect. First, for incomplete triangulation preprocessing with shift parameters and correction parameters, changing the diagonal terms of the coefficient matrix before decomposition is a direct means of improving the approximation of the preprocessing matrix to the coefficient matrix. Therefore, the improvement in convergence is limited. In addition, with respect to the modified incomplete triangulation, since the operation of calculating fill-in which is not originally required for calculation and subtracting from the diagonal term is performed, it takes a lot of time to create the preprocessing matrix. Concerning incomplete triangulation preprocessing with a fill-in level, the convergence is further improved as the fill-in permissible amount is increased, but at the cost of the preprocessing matrix creation and the calculation time required for the preprocessing increases. .

このように、上記で説明した全ての従来手法に関して、解法全体としての高速化の効果が低いという課題があり、より高速化の効果の高い前処理付き反復解法が望まれている。 As described above, with respect to all the conventional methods described above, there is a problem that the speed-up effect as a whole solution is low, and a preprocessed iterative solution method with a higher speed-up effect is desired.

次に第２の課題である、各派生法における固有のパラメタを適切に決定する工程を持たない点について、以下に詳しく説明する。背景技術に示した全ての派生法は、それぞれ固有のパラメタを持っている。いずれも各パラメタが高速化の度合いを大きく左右するが、係数行列毎に適切な値が異なるため注意が必要である。まずシフトパラメタ付き及び補正パラメタ付き不完全三角分解前処理については、それぞれパラメタが０及び１から離れすぎると、分解する行列が本来の係数行列とかけ離れてしまい、前処理行列の近似性が低下して収束性の悪化を招いてしまう。また修正不完全三角分解前処理についても、緩和パラメタが大きすぎると、分解により対角項がゼロに近づき収束性が悪化する事が知られている。ｆｉｌｌ−ｉｎレベル付き不完全三角分解については、ｆｉｌｌ−ｉｎ許容レベルを上げるほど収束性が向上するが、逆に前処理行列の作成および前処理に時間がかかってしまう。以上よりこれらの派生法に関しては、係数行列に応じて適切なパラメタを設定することが重要となる。しかし前処理による高速化の効果を損なわない程度の時間で、適切な値を推定する手段は明らかではない。そのためこれらの派生法は適切なパラメタを決定する工程を持たず、パラメタの決定は利用者に委ねられている。従って前処理による高速化の効果を確実に得るためには、係数行列に対していくつか適当なパラメタを用意して試行を繰り返し、適切なパラメタを手動で探索するしかない。 Next, the point which does not have the process of determining appropriately the intrinsic | native parameter in each derivation method which is the 2nd subject is demonstrated in detail below. All derivations shown in the background art have their own parameters. In any case, each parameter greatly affects the degree of speedup, but care must be taken because appropriate values differ for each coefficient matrix. First, for incomplete triangulation preprocessing with shift parameters and correction parameters, if the parameters are too far from 0 and 1, respectively, the matrix to be decomposed is far from the original coefficient matrix, and the approximation of the preprocessing matrix decreases. This leads to deterioration of convergence. In addition, it is known that the modified incomplete triangulation decomposition preconditioning also causes the diagonal term to approach zero due to the decomposition and deteriorates the convergence if the relaxation parameter is too large. Concerning the incomplete triangulation with the fill-in level, the convergence improves as the fill-in tolerance level increases, but conversely, the creation of the preprocessing matrix and the preprocessing take time. As described above, regarding these derivation methods, it is important to set appropriate parameters according to the coefficient matrix. However, it is not clear how to estimate an appropriate value in a time that does not impair the effect of speeding up by preprocessing. As a result, these derivations do not have the process of determining appropriate parameters, and parameter determination is left to the user. Therefore, in order to surely obtain the effect of speeding up by preprocessing, there is no choice but to prepare several appropriate parameters for the coefficient matrix, repeat the trial, and manually search for the appropriate parameters.

有限差分法や有限要素法といった数値解析手法においては、短い時間ステップごとに係数行列の異なる連立一次方程式を次々と解いていく必要がある。これに背景技術の各派生法を適用する場合、時間ステップごとに上記パラメタの手動探索を行うことは計算時間の点で非現実的であり、パラメタを適当な値に固定して計算を行うしかない。しかしその場合、時間ステップによって前処理の効果が変動し、最悪の場合収束が得られず計算が停止してしまう可能性もあり危険である。そのため、係数行列に応じた適切なパラメタを決定する工程を有する前処理付き反復解法が望まれている。 In numerical analysis methods such as the finite difference method and the finite element method, simultaneous linear equations having different coefficient matrices need to be solved one after another for each short time step. When applying each derivation method of the background art to this, it is unrealistic in terms of calculation time to perform manual search of the above parameters at each time step, and it is only possible to perform calculation with the parameters fixed to appropriate values. Absent. However, in that case, the effect of the preprocessing varies depending on the time step, and in the worst case, convergence is not obtained and the calculation may be stopped, which is dangerous. Therefore, a preprocessed iterative solution having a step of determining appropriate parameters according to the coefficient matrix is desired.

本発明は、このような背景の下になされたもので、その第１の目的は、前処理による高速化の効果が高い連立一次方程式の数値計算方法及び装置を提供することにある。また第２の目的は、第１の目的に加え、係数行列に応じて適切なパラメタを決定することが可能な連立一次方程式の数値計算方法及び装置を提供することにある。 The present invention has been made under such a background, and a first object of the present invention is to provide a numerical calculation method and apparatus for simultaneous linear equations that are highly effective in speeding up by preprocessing. In addition to the first object, the second object is to provide a numerical calculation method and apparatus for simultaneous linear equations capable of determining an appropriate parameter according to a coefficient matrix.

上記第１の目的を達成するため、本発明は、連立一次方程式の数値解法であって、係数行列Ａを不完全三角分解して下三角行列Ｌ、及び上三角行列Ｕを得る分解工程と、前記分解工程において分解された前記Ｌ及び前記Ｕの対角要素にスケーリングパラメタγを乗じ、非対角要素にスケーリングパラメタφを乗じることによって下三角行列Ｌ′及び上三角行列Ｕ′を得るスケーリング工程と、前記Ｌ′及び前記Ｕ′より得られる前処理行列Ｍを用いて前処理を行う前処理工程と、前記前処理工程で得られた結果を用いて近似解を更新する近似解更新工程と、収束判定条件により解の収束を判定し、収束していれば処理を終了し、そうでなければ前記前処理工程と前記近似解更新工程とを繰り返す反復工程と、を含むことを特徴とする。 In order to achieve the first object, the present invention is a numerical solution of simultaneous linear equations, in which a coefficient matrix A is incompletely triangulated to obtain a lower triangular matrix L and an upper triangular matrix U; A scaling step of obtaining a lower triangular matrix L ′ and an upper triangular matrix U ′ by multiplying the diagonal elements of L and U decomposed in the decomposing step by a scaling parameter γ and multiplying the non-diagonal elements by a scaling parameter φ. A preprocessing step of performing preprocessing using the preprocessing matrix M obtained from the L ′ and the U ′, and an approximate solution updating step of updating the approximate solution using the result obtained in the preprocessing step; , Determining the convergence of the solution according to the convergence determination condition, and if it has converged, the process is terminated, and if not, an iterative process that repeats the preprocessing process and the approximate solution update process is included. .

また、上記第２の目的を達成するために、本発明は、前記スケーリングパラメタγ及びφを、前記係数行列Ａと前記前処理行列Ｍより定義される評価関数を最小とするパラメタを推定することによって決定するパラメタ決定工程をさらに含むことを特徴とする。この評価関数として、前記係数行列Ａと前記前処理行列Ｍの差行列の列毎の和を成分とするベクトル

のユークリッドノルムを採用することが好適である。 In order to achieve the second object, the present invention estimates the parameters that minimize the evaluation function defined by the coefficient matrix A and the preprocessing matrix M for the scaling parameters γ and φ. The method further includes a parameter determining step that is determined by: As this evaluation function, a vector whose component is the sum of each difference matrix of the coefficient matrix A and the preprocessing matrix M

It is preferable to adopt the Euclidean norm.

本発明によれば、前記スケーリング工程において下三角行列Ｌ及び上三角行列Ｕの対角項と非対角項のそれぞれにスケーリングパラメタを乗じることで、不完全三角分解後の前処理行列に対し、係数行列への近似性を向上させるための直接的な修正が可能となる。そのため収束性の向上が期待できる。一方前記スケーリング工程における演算量は微小であり、また前記スケーリング工程を経ても前処理行列の疎（スパース）性は保持される。そのため収束性の向上に対するトレードオフの要素、すなわち前処理行列の作成及び前処理に要する時間はほとんど増加せず、前処理による高速化の効果が高くなる。 According to the present invention, by multiplying the diagonal and non-diagonal terms of the lower triangular matrix L and the upper triangular matrix U by the scaling parameter in the scaling step, A direct correction for improving the closeness to the coefficient matrix is possible. Therefore, improvement in convergence can be expected. On the other hand, the amount of calculation in the scaling step is very small, and the sparseness of the preprocessing matrix is maintained even after the scaling step. Therefore, the trade-off factor for improving the convergence, that is, the time required for creating the preprocessing matrix and the preprocessing hardly increases, and the effect of speeding up by the preprocessing is enhanced.

また本発明によれば、前記パラメタ決定工程において、前処理の効果を損なわない程度の時間で、係数行列に応じた適切なスケーリングパラメタを推定することができる。 According to the present invention, in the parameter determination step, an appropriate scaling parameter corresponding to the coefficient matrix can be estimated in a time that does not impair the effect of the preprocessing.

さらに背景技術のいくつかの派生法に対して本発明を適用することで、派生法固有のパラメタに対する感度を低減させ、実用性を向上させることができる。 Furthermore, by applying the present invention to several derivation methods of the background art, the sensitivity to the parameters unique to the derivation method can be reduced and the practicality can be improved.

本発明の数値解法を示すフローチャートFlow chart showing the numerical solution of the present invention 本発明の数値計算装置の構成を示す図The figure which shows the structure of the numerical calculation apparatus of this invention 前処理つき共役勾配法（ＣＧ法）のアルゴリズムPreconditioned conjugate gradient method (CG method) algorithm 本発明を適用した際のスケーリングパラメタと反復回数・計算時間の関係Relationship between scaling parameter and number of iterations / calculation time when applying the present invention 従来手法と本発明との比較Comparison between the conventional method and the present invention 従来手法に本発明を適用した場合の効果Effects of applying the present invention to the conventional method 従来手法に本発明を適用した場合の効果Effects of applying the present invention to the conventional method

図２は、本実施形態に係る数値計算装置として機能するコンピュータの基本構成を示すブロック図である。該コンピュータは、大略、バス２０７を介して接続されたＣＰＵ２０１、表示装置２０２，入力装置２０３、ＲＡＭ２０４、ハードディスク装置２０５、ＮＩ
Ｃ２０６等から構成される。 FIG. 2 is a block diagram showing a basic configuration of a computer that functions as a numerical calculation apparatus according to the present embodiment. The computer generally includes a CPU 201, a display device 202, an input device 203, a RAM 204, a hard disk device 205, an NI connected via a bus 207.
C206 and the like.

ＣＰＵ（中央演算装置）２０１で、ＲＡＭ（主記憶装置）２０４に記憶されているプログラムやデータを用いて本装置全体の制御を行うと共に、後述する各処理を実行する。表示装置２０２は、液晶画面などにより構成されており、ＣＰＵ２０１による処理結果を文字が画像などでもって表示することができる。入力装置２０３は、キーボードやマウスなどの操作系の装置により構成されており、各種の指示をＣＰＵ２０１に対して入力することができる。ＲＡＭ２０４は、ハードディスク装置２０５からロードされたプログラムやデータを一時的に記憶する為のエリアを備えると共に、ＣＰＵ２０１が各種の処理を実行する際に必要なワークエリアを備える。ハードディスク装置２０５には、ＯＳ（オペレーティングシステム）や、コンピュータが行うべき後述の各処理をＣＰＵ２０１に実行させるためのプログラムやデータが保存されている。これらのプログラムはＲＡＭ２０４にロードされて、ＣＰＵ２０１によって実行されることで、本コンピュータが連立一次方程式を数値的に解くための数値計算装置として機能する。ＮＩＣ（ネットワークインターフェース）２０６は、外部装置とのデータ通信を行う際のインターフェースとして機能するものであり、本コンピュータはこのＮＩＣ２０６を介して外部装置とプログラムやデータの送受信を行う。なお、ハードディスク装置２０５が保存しているプログラムやデータの代わりに、ＮＩＣ２０６を介して外部装置から受信したプログラムやデータをＣＰＵ２０１による処理対象としても良い。 A CPU (central processing unit) 201 controls the entire apparatus using programs and data stored in a RAM (main storage device) 204 and executes each process described later. The display device 202 is configured by a liquid crystal screen or the like, and can display the processing result by the CPU 201 with characters as images. The input device 203 is configured by an operation device such as a keyboard and a mouse, and can input various instructions to the CPU 201. The RAM 204 includes an area for temporarily storing programs and data loaded from the hard disk device 205 and a work area required when the CPU 201 executes various processes. The hard disk device 205 stores an OS (operating system) and programs and data for causing the CPU 201 to execute processes to be described later that should be performed by the computer. These programs are loaded into the RAM 204 and executed by the CPU 201, whereby the computer functions as a numerical calculation device for numerically solving the simultaneous linear equations. The NIC (network interface) 206 functions as an interface when performing data communication with an external device, and the computer transmits and receives programs and data to and from the external device via the NIC 206. Note that instead of the programs and data stored in the hard disk device 205, the programs and data received from the external device via the NIC 206 may be processed by the CPU 201.

次に、上記構成を備えるコンピュータが行う、連立一次方程式の解を求める為の数値解法について説明する。図１は、本実施形態に係るコンピュータが行う、連立一次方程式の解を求めるための数値計算処理のフローチャートである。なお、ここでは係数行列Ａ、変数ベクトルｘ、右辺ベクトルｂとして、Ａｘ＝ｂで表される連立一次方程式の求解処理に、本発明の前処理付き反復解法を適用する。 Next, a numerical solution method for obtaining a solution of simultaneous linear equations performed by a computer having the above configuration will be described. FIG. 1 is a flowchart of numerical calculation processing for obtaining a solution of simultaneous linear equations performed by the computer according to the present embodiment. Here, the iterative solution method with preprocessing of the present invention is applied to the solution processing of the simultaneous linear equations represented by Ax = b as the coefficient matrix A, the variable vector x, and the right side vector b.

まず、係数行列Ａを不完全三角分解して下三角行列Ｌ、及び上三角行列Ｕを得る（ステップＳ１０１：分解工程）。なお、ここでは不完全三角分解としてＩＬＤＵ分解を例に説明するが、これは本発明を限定するものではなく、上記不完全三角分解はＩＬＵ分解やＩＣ（不完全コレスキー）分解であってもよい。また、背景技術に記載したシフトパラメタ付き不完全三角分解前処理、補正パラメタ付き不完全三角分解前処理、修正不完全三角分解前処理、及びｆｉｌｌ−ｉｎレベル付き不完全三角分解前処理のいずれの派生法における不完全三角分解処理であってもよい。すなわち、ここでの分解処理は、係数行列を不完全に分解し、Ｌ＊ＵもしくはＬ＊Ｄ＊Ｕの行列積を導出するいかなる処理であってよい。 First, the coefficient matrix A is incompletely triangularly decomposed to obtain a lower triangular matrix L and an upper triangular matrix U (step S101: decomposition step). Here, ILDU decomposition will be described as an example of incomplete triangulation, but this does not limit the present invention, and the incomplete triangulation may be ILU decomposition or IC (incomplete Cholesky) decomposition. Good. In addition, any of the incomplete triangulation preprocessing with shift parameters, the incomplete triangulation preprocessing with correction parameters, the corrected incomplete triangulation preprocessing, and the incomplete triangulation preprocessing with the fill-in level described in Background Art It may be incomplete triangulation processing in the derivation method. That is, the decomposition process here may be any process that incompletely decomposes the coefficient matrix and derives the matrix product of L * U or L * D * U.

次に、下三角行列Ｌの対角要素Ｌ_ｉｉ及び上三角行列Ｕの対角要素Ｕ_ｉｉに乗じるスケーリングパラメタγと、Ｌの非対角要素Ｌ_ｉｊ（ｉ＞ｊ）及びＵの非対角要素Ｕ_ｉｊ（ｉ＜ｊ）に乗じるスケーリングパラメタφを決定する（ステップＳ１０２）。スケーリングパラメタγ及びφは、それぞれＬ及びＵの対角項と非対角項に属する要素全てに対して一律に掛かる単一の定数である。なお、ｉ及びｊは行列の行方向及び列方向の添え字を表す。ここでのパラメタ決定方法は、決定に要する時間が、前処理による高速化の効果を損なわない程度であることが望ましい。またパラメタを作用させることで前処理行列の係数行列に対する近似性が向上するような、適切な値を導出できる方法が望ましい。そこで、スケーリングパラメタφ及びγを、係数行列Ａと前処理行列Ｍより定義される評価関数を最小とする値として決定する。この評価関数は、係数行列Ａと前処理行列Ｍの近似性が高いほど、値が小さくなる関数であることが好ましい。以下では、係数行列Ａと前処理行列Ｍとの差行列の列毎の和を成分とする下記ベクトルＲのユークリッドノルムを評価関数として、この評価関数を最小にするパラメタを推定する方法について説明する。すなわち評価関数は以下の式で表されるベクトルＲのユークリッドノルムとする。なお、nは行列の次
数を表す。

Then, the off-diagonal of the diagonal elements L and scaling parameters γ multiplying the diagonal elements U _ii of _ii and upper triangular matrix U, the off-diagonal elements L _{ij (i>} j) of the L and U of the lower triangular matrix L A scaling parameter φ to be multiplied by the element U _ij (i <j) is determined (step S102). The scaling parameters γ and φ are single constants that are uniformly applied to all elements belonging to the diagonal and non-diagonal terms of L and U, respectively. Note that i and j represent subscripts in the row direction and column direction of the matrix. In this parameter determination method, it is desirable that the time required for the determination is such that the speed-up effect by the preprocessing is not impaired. In addition, it is desirable to have a method capable of deriving appropriate values such that the approximation of the preprocessing matrix to the coefficient matrix is improved by applying parameters. Therefore, the scaling parameters φ and γ are determined as values that minimize the evaluation function defined by the coefficient matrix A and the preprocessing matrix M. This evaluation function is preferably a function whose value decreases as the closeness of the coefficient matrix A and the preprocessing matrix M increases. In the following, a method for estimating a parameter that minimizes the evaluation function using the Euclidean norm of the following vector R whose component is the sum of the difference matrix between the coefficient matrix A and the preprocessing matrix M as a component will be described. . That is, the evaluation function is the Euclidean norm of the vector R expressed by the following equation. Note that n represents the order of the matrix.

評価関数の最小化は、評価関数の一次微分をゼロにするγおよびφをニュートン法等の手法で数値的に解けばよいが、これは本発明を限定するものではなく、多変数関数の最適化手法のいずれを用いても良い。 The minimization of the evaluation function may be performed by numerically solving γ and φ that make the first derivative of the evaluation function zero by a method such as Newton's method. However, this does not limit the present invention, and the optimization of the multivariable function is not limited. Any of the methods for making it possible may be used.

次に、パラメタ決定工程で得られたスケーリングパラメタγ_ｏｐｔおよびφ_ｏｐｔを用いて、ＬおよびＵをスケーリングして、下三角行列Ｌ′と上三角行列Ｕ′を得る（ステップＳ１０３）。すなわち以下の式で表されるように、スケーリングパラメタγ_ｏｐｔをＬの対角要素Ｌ_ｉｉ及びＵの対角要素Ｕ_ｉｉに乗じ、スケーリングパラメタφ_ｏｐｔをＬの非対角要素Ｌ_ｉｊ（ｉ＞ｊ）及びＵの非対角要素Ｕ_ｉｊ（ｉ＜ｊ）に乗じる。

Next, using the scaling parameters γ _opt and φ _opt obtained in the parameter determination step, L and U are scaled to obtain a lower triangular matrix L ′ and an upper triangular matrix U ′ (step S103). That is, as expressed by the following equation, the scaling parameter γ _opt is multiplied by the diagonal element L _ii of L and the diagonal element U _ii of U, and the scaling parameter φ _opt is _converted to the non-diagonal element L _{ij of} L (i> j) and U off-diagonal elements U _ij (i <j).

以上説明したステップＳ１０１からＳ１０３の操作により前処理行列Ｍ＝Ｌ′＊Ｄ＊Ｕ′が決定される。この前処理行列Ｍを用いて、以降のステップＳ１０４からＳ１０６では、前処理付き反復解法を行う。ここではＲｉｃｈａｒｄｓｏｎ反復法等の定常反復解法や、ＣＧ法、ＢｉＣＧ法、ＣＧＳ法、ＢｉＣＧＳｔａｂ法等の非定常反復解法をはじめ、その他いかなる前処理付き反復解法であってもよい。以下では前処理つきＣＧ法を例にとって説明する。前処理つきＣＧ法は公知であるため詳しい説明は省略するが、その実装法の一例を図３に示す。ただし、数学的に同値となるその他の手法によって計算しても構わない。 The preprocessing matrix M = L ′ * D * U ′ is determined by the operations of steps S101 to S103 described above. Using this preprocessing matrix M, the iterative solution with preprocessing is performed in subsequent steps S104 to S106. Here, any other iterative solution with pre-processing may be used, such as a steady iterative solution method such as the Richardson iterative method, a non-stationary iterative solution method such as the CG method, BiCG method, CGS method, or BiCGStab method. Hereinafter, the CG method with preprocessing will be described as an example. Since the CG method with preprocessing is well known, detailed description is omitted, but an example of the mounting method is shown in FIG. However, it may be calculated by other methods that are mathematically equivalent.

まずＣＧ法における残差ベクトルｒに対して前処理を行う（ステップＳ１０４）。すなわち、以下の式で表されるベクトルｚを、前進代入・後退代入を用いて求める。

なお、前進代入・後退代入の詳細については公知の技術であるためここでは記載しない。 First, preprocessing is performed on the residual vector r in the CG method (step S104). That is, a vector z expressed by the following equation is obtained using forward substitution / backward substitution.

The details of forward substitution and backward substitution are well-known techniques, and are not described here.

次に、ステップＳ１０４で得られたベクトルｚを用いて、前処理付きＣＧ法における近似解の更新を行う（ステップＳ１０５）。なお、前処理付きＣＧ法の近似解更新は公知の技術であるため詳細な説明は省略するが、図３に示す近似解更新処理を行って、近似解を更新する。 Next, the approximate solution in the preprocessed CG method is updated using the vector z obtained in step S104 (step S105). Note that since the approximate solution update of the preprocessed CG method is a known technique, a detailed description thereof will be omitted, but the approximate solution is updated by performing the approximate solution update process shown in FIG.

次に、近似解の収束判定を行い、収束していれば処理を終了し、そうでなければステップＳ１０４に戻る（ステップＳ１０６）。ここでは、例えば、残差ベクトルｒのユークリッドノルムを右辺ベクトルｂのユークリッドノルムで割った値が収束判定用の閾値以下であるという収束判定条件を満たす場合に収束、そうでなければ未収束と判定する。ただし
これは本発明を限定するものではなく、前処理付き反復解法の収束判定であればいかなる方法を用いても良い。 Next, the convergence determination of the approximate solution is performed. If it has converged, the process is terminated, and if not, the process returns to step S104 (step S106). Here, for example, when the convergence determination condition that the value obtained by dividing the Euclidean norm of the residual vector r by the Euclidean norm of the right-hand vector b is equal to or less than the threshold for convergence determination is satisfied, it is determined that it is not converged. To do. However, this does not limit the present invention, and any method may be used as long as it is a convergence determination of the iterative solution with preprocessing.

なお、上記の説明では、前処理行列ＭをＬ＊ＵもしくはＬ＊Ｄ＊Ｕで構成する場合について説明したが、ここに任意のスカラー値ａを乗じ、Ｍ＝ａ＊Ｌ＊Ｕもしくはａ＊Ｌ＊Ｄ＊Ｕとして構成しても構わない。例えば、ａにスケーリングパラメタγの逆数を用いても良い。 In the above description, the case where the preprocessing matrix M is configured by L * U or L * D * U has been described. However, this is multiplied by an arbitrary scalar value a, and M = a * L * U or a *. You may comprise as L * D * U. For example, the reciprocal of the scaling parameter γ may be used for a.

また、上記の説明では、差行列の列毎の和を成分とするベクトルＲのユークリッドノルムを評価関数とし、これを最小とするスケーリングパラメタγ，φを採用しているが、その他の評価関数を用いても良い。例えば、差行列の行毎の和を成分とするベクトルのユークリッドノルムを評価関数としても良い。その場合、評価関数は以下の式で表されるベクトルＲのユークリッドノルムとなる。なお、nは行列の次数を表す。

また、スケーリングパラメタγ，φをこのような評価関数を最小とするものとして決定することが好適であるが、利用者が適当なパラメタを用意して試行を繰り返して適切なパラメタを手動で探索しても良い。 In the above description, the Euclidean norm of the vector R whose component is the sum for each column of the difference matrix is used as the evaluation function, and the scaling parameters γ and φ that minimize this are used. It may be used. For example, the Euclidean norm of a vector whose component is the sum of each row of the difference matrix may be used as the evaluation function. In that case, the evaluation function is the Euclidean norm of the vector R expressed by the following equation. Note that n represents the order of the matrix.

In addition, it is preferable to determine the scaling parameters γ and φ as those that minimize such an evaluation function. However, the user prepares an appropriate parameter and repeats the trial to manually search for the appropriate parameter. May be.

また、スケーリングパラメタ決定処理を簡易かつ高速なものとするために、スケーリングパラメタφを１．０で固定して、この条件の下で上記評価関数が最小となるスケーリングパラメタγを決定しても良い。この場合、スケーリング工程では、スケーリングパラメタγとφを不完全三角分解後の三角行列Ｌ，Ｕの対角要素と非対角要素に乗じていると捉えることもできるし、スケーリングパラメタγのみを三角行列Ｌ，Ｕの対角要素に乗じていると捉えることもできる。 Also, in order to make the scaling parameter determination process simple and fast, the scaling parameter φ may be fixed at 1.0, and the scaling parameter γ that minimizes the evaluation function under this condition may be determined. . In this case, in the scaling step, the scaling parameters γ and φ can be regarded as being multiplied by the diagonal elements and non-diagonal elements of the triangular matrices L and U after incomplete triangulation, or only the scaling parameter γ is triangular. It can also be understood that the diagonal elements of the matrices L and U are multiplied.

以上説明した本発明の一実施形態に係る前処理付き反復解法の効果について、以下に詳しく説明する。なお求解の対象とする連立一次方程式には、流体解析で現れる３次元ポアソン方程式を有限差分法で解いた際の、ある時間ステップにおけるデータを用いた。係数行列は正定値対称行列で、規模は５４万元である。また本発明における非対角要素のスケーリングパラメタφは１．０に固定してある。なお、ベースとなる不完全三角分解にはＩＬＤＵ分解を用い、反復解法にはＣＧ法を用いる。 The effect of the iterative method with preprocessing according to the embodiment of the present invention described above will be described in detail below. Note that the data at a certain time step when the three-dimensional Poisson equation appearing in the fluid analysis was solved by the finite difference method was used for the simultaneous linear equations to be solved. The coefficient matrix is a positive definite symmetric matrix with a scale of 540,000 yuan. In the present invention, the scaling parameter φ of the non-diagonal element is fixed at 1.0. Note that ILDU decomposition is used for incomplete triangulation as a base, and CG method is used for iterative solution.

まず図４において、対角要素のスケーリングパラメタγの値を手動で変化させた場合の結果と、評価関数（ベクトルＲのユークリッドノルム）を最小にするようにスケーリングパラメタγを決定した場合の結果とを比較する。図４（Ａ）は反復回数を、図４（Ｂ）は計算時間を示すグラフである。なお、図中、白丸及び白三角がγを手動で変化させた場合、黒丸及び黒三角が評価関数を最小とするようにγを定めた場合である。 First, in FIG. 4, the result when the value of the scaling parameter γ of the diagonal element is manually changed and the result when the scaling parameter γ is determined so as to minimize the evaluation function (Euclidean norm of the vector R) Compare 4A is a graph showing the number of iterations, and FIG. 4B is a graph showing calculation time. In the figure, white circles and white triangles change γ manually, and black circles and black triangles set γ so that the evaluation function is minimized.

γを手動で変化させた結果、γ＝０．６３において反復回数と計算時間の両者が共に最小値を示した。γ＝１．０すなわち基本となるＩＬＤＵ分解前処理付きＣＧ法と比べると、反復回数は１７１回から９５回に低減し、収束性が大幅に向上している。一方計算時間も１２．７秒から７．３秒に低減しており、前処理行列の作成や前処理に要する時間はほとんど増加しないことがわかる。その結果、約１．７倍という大幅な高速化が得られている。
一方、評価関数を最小とするようにγを定めた場合はγ＝０．６４となり、手動探索による最適値に近い値が推定された。その結果、反復回数は９６回、計算時間は７．４秒となっており、手動でのパラメタ最適化と遜色ない結果が得られた。この結果は、上記のパラメタ決定方法が、本手法の高速化の効果を損なわない程度の時間で、係数行列に応じた適切なスケーリングパラメタを推定していることを示すものである。 As a result of manually changing γ, both the number of iterations and the calculation time showed minimum values at γ = 0.63. Compared to the basic CG method with ILDU decomposition preprocessing, γ = 1.0, the number of iterations is reduced from 171 times to 95 times, and the convergence is greatly improved. On the other hand, the calculation time is also reduced from 12.7 seconds to 7.3 seconds, and it can be seen that the time required for creating the preprocessing matrix and preprocessing hardly increases. As a result, a significant speed increase of about 1.7 times is obtained.
On the other hand, when γ was determined so as to minimize the evaluation function, γ = 0.64, and a value close to the optimum value by manual search was estimated. As a result, the number of iterations was 96 and the calculation time was 7.4 seconds, which was inferior to manual parameter optimization. This result shows that the above parameter determination method estimates an appropriate scaling parameter according to the coefficient matrix in a time that does not impair the speed-up effect of the present method.

次に、図５において、背景技術で説明した各種の従来手法（不完全三角分解前処理、その派生法であるシフトパラメタ付き、補正パラメタ付き、フィルインレベル付きの不完全三角分解前処理、修正不完全分解前処理）と本発明の手法とを比較する。図５（Ａ）では反復回数を、図５（Ｂ）では計算時間を比較してある。なお、本発明に関してはスケーリングパラメタを、上記のパラメタ決定手法により決定している。一方、背景技術の４つの派生法に関しては、それぞれの固有パラメタについて、様々な値を試して最も良い結果を与える最適値をあらかじめ探索しておき、そのパラメタを用いた結果を示す。図より、反復回数及び計算時間の両方で、本発明による結果が最も優れている。すなわち本発明の解法は、たった一度の計算で、何度も試行を繰り返してパラメタを最適化した背景技術の各派生法よりも優れたパフォーマンスを示し、実用性に優れることがわかる。 Next, in FIG. 5, various conventional methods described in the background art (incomplete triangulation pre-processing, its derived method with shift parameter, with correction parameter, incomplete triangulation pre-processing with fill-in level, uncorrected The complete decomposition pretreatment) is compared with the method of the present invention. In FIG. 5A, the number of iterations is compared, and in FIG. 5B, the calculation time is compared. In the present invention, the scaling parameter is determined by the parameter determination method described above. On the other hand, with respect to the four derivation methods of the background art, the optimum values that give the best results by trying various values for each unique parameter are searched in advance, and the results using the parameters are shown. From the figure, the results according to the present invention are the best in both the number of iterations and the calculation time. In other words, it can be seen that the solution of the present invention shows superior performance and is more practical than the respective derivation methods of the background art in which the parameters are optimized through repeated trials in a single calculation.

なお、下三角行列Ｌ及び上三角行列Ｕにスケーリングパラメタを乗じる本発明の手法は、不完全三角分解前処理の各派生法に対して同時に適用することができる。図６に、各派生法に対して本発明の手法を適用した場合の結果を示す。すなわち、本発明における不完全三角分解に、基本となるＩＬＤＵ分解だけでなく、各派生法で行われる独自のＩＬＤＵ分解を利用している。なお、スケーリングパラメタは上記パラメタ決定処理により評価関数を最小にするものを採用している。 Note that the method of the present invention in which the lower triangular matrix L and the upper triangular matrix U are multiplied by the scaling parameter can be applied simultaneously to each derivation method of the incomplete triangulation preprocessing. FIG. 6 shows the results when the method of the present invention is applied to each derivation method. That is, not only the basic ILDU decomposition but also the original ILDU decomposition performed in each derivation method is used for the incomplete triangulation in the present invention. A scaling parameter that minimizes the evaluation function by the parameter determination process is adopted.

図６（Ａ）では反復回数を、図６（Ｂ）では計算時間を示してある。また図５と同様、従来技術の４つの派生法については、固有パラメタを事前に最適化した結果を示している。図より、基本となるＩＬＤＵ分解前処理だけでなく、４つの派生法全てについて、本発明を適用する事で高速化されることがわかる。すなわち本発明はいずれの従来技術とも競合するものではなく、それらと組み合わせることでより効果を高めるという性質を持つ。 FIG. 6A shows the number of iterations, and FIG. 6B shows the calculation time. Similarly to FIG. 5, the four derivation methods of the prior art show the result of optimizing the eigenparameters in advance. From the figure, it can be seen that not only the basic ILDU decomposition preprocessing but also all four derivation methods can be speeded up by applying the present invention. In other words, the present invention does not compete with any prior art, and has the property of enhancing the effect when combined with them.

図６では各派生法に関し、固有パラメタを最適化した結果のみ示していたが、図７では固有パラメタに対して様々な値を試行した結果を示す。図７（Ａ）（Ｂ）がシフトパラメタ付きＩＬＤＵ分解前処理付きＣＧ法と、それに本発明を適用した場合の結果を示す。以降同様に、図７（Ｃ）（Ｄ）が補正パラメタ付きＩＬＤＵ分解、図７（Ｅ）（Ｆ）が修正ＩＬＤＵ分解、図７（Ｇ）（Ｈ）がｆｉｌｌ−ｉｎレベル付きＩＬＤＵ分解と続く。図中、白丸及び白三角は、従来手法のみによる解法での反復回数及び計算時間を示し、黒丸及び黒三角は、本発明と従来手法を組み合わせた場合の反復回数及び計算時間を示す。 FIG. 6 shows only the result of optimizing the eigenparameter for each derivation method, but FIG. 7 shows the result of trying various values for the eigenparameter. FIGS. 7A and 7B show the CG method with ILDU decomposition preprocessing with shift parameters and the results when the present invention is applied thereto. Thereafter, similarly, FIGS. 7C and 7D are ILDU decompositions with correction parameters, FIGS. 7E and 7F are modified ILDU decompositions, and FIGS. 7G and 7H are ILDU decompositions with a fill-in level. . In the figure, white circles and white triangles indicate the number of iterations and calculation time in the solution only by the conventional method, and black circles and black triangles indicate the number of iterations and the calculation time when the present invention and the conventional method are combined.

まず全ての派生法に関して、それぞれ固有のパラメタ（シフトパラメタ、補正パラメタ緩和パラメタ、フィルインレベル）に対する感度が非常に高く、パラメタの設定が高速化の度合いを大きく左右することがわかる。一方、本発明を各派生法に適用すると、固有パラメタの値に関わらず反復回数と計算時間の両方が低減しており、固有パラメタに対するロバスト性が表れている。 First of all, for all derivation methods, it can be seen that the sensitivity to unique parameters (shift parameter, correction parameter relaxation parameter, fill-in level) is very high, and the parameter setting greatly affects the degree of speedup. On the other hand, when the present invention is applied to each derivation method, both the number of iterations and the calculation time are reduced regardless of the value of the eigenparameter, and robustness with respect to the eigenparameter appears.

また特に、補正パラメタ付きＩＬＤＵ分解前処理（図７（Ｃ）（Ｄ））、及び修正ＩＬＤＵ分解前処理（図７（Ｅ）（Ｆ））に本発明を適用することで、それぞれ補正パラメタと緩和パラメタに対する感度が低減していることがわかる。その結果ある範囲のパラメタが一律に最適値となっており、パラメタを固定して利用する際の安全性が増している。すなわち本発明の解法は、上記２つの派生法に対して、高速化と実用性の向上という２つの効果を与えることがわかる。 In particular, by applying the present invention to ILDU decomposition pre-processing with correction parameters (FIGS. 7C and 7D) and modified ILDU decomposition pre-processing (FIGS. 7E and 7F), the correction parameters and It can be seen that the sensitivity to the relaxation parameter is reduced. As a result, a certain range of parameters are uniformly optimal values, which increases the safety when using fixed parameters. In other words, it can be seen that the solution of the present invention provides two effects of speeding up and improving practicality with respect to the above two derivative methods.

２０１…ＣＰＵ，２０２…表示装置，２０３…入力装置，２０４…ＲＡＭ 201 ... CPU, 202 ... display device, 203 ... input device, 204 ... RAM

Claims

A method for calculating simultaneous linear equations in a numerical computer,
The numerical calculator is
A decomposition step of obtaining a lower triangular matrix L and an upper triangular matrix U by incomplete triangular decomposition of the coefficient matrix A;
A scaling step of obtaining a lower triangular matrix L ′ and an upper triangular matrix U ′ by multiplying the diagonal elements of L and U decomposed in the decomposing step by a scaling parameter γ and multiplying the non-diagonal elements by a scaling parameter φ. When,
A preprocessing step of performing preprocessing using a preprocessing matrix M obtained from L ′ and U ′;
An approximate solution update step of updating the approximate solution using the result obtained in the preprocessing step;
It determines the convergence of the solution according to the convergence determination condition, and if it has converged, terminates the process; otherwise, repeats the preprocessing step and the approximate solution update step; and
A method of calculating simultaneous linear equations, characterized in that

2. The parameter determining step of determining the scaling parameters γ and φ by estimating a parameter that minimizes an evaluation function defined by the coefficient matrix A and the preprocessing matrix M. Calculation method of simultaneous linear equations described in 1.

The evaluation function is a vector whose component is a sum for each column of a difference matrix between the coefficient matrix A and the preprocessing matrix M.

The simultaneous linear equation calculation method according to claim 2, wherein the Euclidean norm is

A numerical calculation device for finding a solution of simultaneous linear equations,
Decomposition means for obtaining a lower triangular matrix L and an upper triangular matrix U by incomplete triangular decomposition of the coefficient matrix A;
Scaling means for obtaining a lower triangular matrix L ′ and an upper triangular matrix U ′ by multiplying the diagonal elements of L and U decomposed by the decomposing means by a scaling parameter γ and multiplying non-diagonal elements by a scaling parameter φ. When,
Preprocessing means for performing preprocessing using a preprocessing matrix M obtained from the L ′ and the U ′;
Approximate solution update means for updating the approximate solution using the result of the preprocessing means;
It determines the convergence of the solution according to the convergence determination condition, ends the process if it has converged, otherwise repeats the processing by the preprocessing means and the approximate solution update means,
A numerical calculation apparatus comprising:

5. The apparatus according to claim 4, further comprising parameter determining means for determining the scaling parameters γ and φ by estimating a parameter that minimizes an evaluation function defined by the coefficient matrix A and the preprocessing matrix M. The numerical calculation device described in 1.

The numerical calculation apparatus according to claim 5, wherein the numerical calculation apparatus is an Euclidean norm.