JPH04365171A

JPH04365171A - Parallel numerical arithmetic system

Info

Publication number: JPH04365171A
Application number: JP16767291A
Authority: JP
Inventors: Satoshi Matsushita; 智松下
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1991-06-12
Filing date: 1991-06-12
Publication date: 1992-12-17

Abstract

PURPOSE: To perform numerical computation at high speed on a MIMD parallel computer by parallelizing the explicit method part through vector partitioning and the implicit method part through pipeline processing. CONSTITUTION: In the explicit method, which updates the vector U from the vectors U and W shown in the figure, the new element 13 of U is computed by referring to elements 10, 11, and 12. With the elements distributed as shown in the figure, processor PE2 can compute its share of the explicit method part after receiving only elements 10 and 14 from the neighboring processors PE1 and PE3. In the implicit method part, v_{n,k} is obtained from u_{n,k} by forward elimination and w_{n,k} is then obtained by back substitution; the processors compute v_{n,k} and w_{n,k} in a pipelined fashion.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a parallel numerical computation method that uses a plurality of processors.

[0002]

[Prior Art] Consider a numerical computation that consists of an explicit method part and an implicit method part and comprises N independent sets of equations indexed by i. The explicit method part, which obtains y_i from x_i, is written y_i = f(x_i). The implicit method part, which obtains w_i from u_i, is written u_i = E_i w_i, where the E_i are N matrices and each E_i is sparse.

[0003] In this case the implicit method part is difficult to vectorize. Conventionally, speed has been improved by vector-processing only the explicit method part, which accounts for most of the computation, on a vector computer. Once the explicit method part has been sped up, however, the implicit method part, which cannot be vector-processed, becomes the bottleneck, so no large overall speedup can be expected from the vector computer.
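The bottleneck described in this paragraph can be illustrated with Amdahl's law. The sketch below is only an illustration: the 80% explicit fraction and the speedup factors are hypothetical numbers, not figures from this document, and `overall_speedup` is a name chosen here.

```python
def overall_speedup(explicit_fraction: float, explicit_speedup: float) -> float:
    """Amdahl-style estimate: only the explicit fraction of the run time is accelerated.

    explicit_fraction is the share of serial run time spent in the explicit part
    (hypothetical input); the implicit part keeps its original speed.
    """
    implicit_fraction = 1.0 - explicit_fraction
    return 1.0 / (implicit_fraction + explicit_fraction / explicit_speedup)


# Hypothetical workload: 80% of the serial time is in the explicit part.
for factor in (2, 10, 100):
    print(factor, round(overall_speedup(0.8, factor), 2))
```

Under this assumed split the overall speedup can never reach 1 / (1 - 0.8) = 5, however fast the explicit part becomes.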

[0004] One idea, therefore, has been to improve speed by parallelizing the implicit method part, exploiting the fact that it consists of N independent sets of equations.

[0005]

[Problem to be Solved by the Invention] Two straightforward approaches to improving speed can be considered:
1. Process the explicit method part on a vector computer and the implicit method part on N parallel processors.
2. Process both the explicit method part and the implicit method part on N processors.

[0006] Approach 1, however, requires an expensive vector computer and also requires data to be redistributed between the vector computer and the parallel processors, so communication overhead may limit the speedup. In approach 2, it is difficult to speed up the explicit method part, which carries a large computational load, unless N is sufficiently large. In general N is only about ten to several tens, which is too small to achieve a large speedup.

[0007]

[Means for Solving the Problems] The parallel numerical computation method according to the present invention parallelizes, using P processors, a numerical computation consisting of an explicit method part and an implicit method part. The explicit method part is expressed as an operation that derives y from x by the relation y = f(x), using vectors x and y. The implicit method part is expressed as a process that derives w_i from u_i by the relation u_i = A_i w_i, using N vectors u_i, w_i and N independent matrices A_i.

The explicit method part is parallelized by dividing the vector x into P contiguous subvectors x_k and processing x_k on the k-th processor PE_k.

For the implicit method part, each matrix A_i is LU-decomposed, which reduces the problem to forward elimination, computing v_i from u_i by u_i = L_i v_i, and back substitution, computing w_i from v_i by v_i = U_i w_i. The vectors u_i, v_i, and w_i are divided into u_{i,k}, v_{i,k}, and w_{i,k} to match the division of the vector x.

In forward elimination, the k-th processor PE_k performs part of the forward elimination using u_{i,k}, L_i, and the intermediate solution v_{i,k-1} received from PE_{k-1}, produces the intermediate solution v_{i,k}, and sends it to PE_{k+1}. PE_k then performs part of another forward elimination using another vector u_{j,k} and the intermediate solution v_{j,k-1} received from PE_{k-1}. The v_i are thus computed in parallel by pipeline processing.

In back substitution, the k-th processor PE_k performs part of the back substitution using the v_{i,k} computed in the forward elimination, U_i, and the partial solution w_{i,k+1} received from PE_{k+1}, produces the partial solution w_{i,k}, and sends it to PE_{k-1}. PE_k then performs part of another back substitution using another vector v_{j,k} and the intermediate solution w_{j,k+1} received from PE_{k+1}. The w_i are thus computed in parallel by pipeline processing, and the implicit method part is thereby parallelized.

[0008]

[Operation] In the present invention, the explicit method part can be parallelized by exploiting its high degree of parallelism, which equals the number of elements of the vector x. The limit on parallelism noted for approach 2 above therefore does not apply, and a parallel machine with many processors can speed up the explicit method part beyond what a vector computer achieves.

[0009] In the present invention, the implicit method part can also be parallelized, using its N-fold parallelism. Because the data is partitioned in the same way for the implicit and explicit methods, the data redistribution required by approach 1 above is unnecessary. In both the explicit and implicit processing, communication consists only of exchanging boundary data between adjacent processors, so the communication volume is small. Moreover, all processors can communicate at the same time, so the communication overhead can be made extremely small.

[0010] As described above, the present invention parallelizes the implicit method part, which has been the problem on vector computers, and makes numerical computation possible at speeds exceeding those of a vector computer. In addition, because the coefficient matrices and data can be distributed over all processors, numerical computation on large-scale data is possible.

[0011]

[Embodiment] The parallel numerical computation method of the present invention is described with reference to FIGS. 1 and 2, taking the solution of equations (1) and (2) as an example.

[0012] (Equations)

[Math 1]

[Math 2]

Here, a_{k,l,m}(r), b_{k,l,m}(r), and c_{k,l,m}(r) are coefficients determined by k, l, m, and r. The operator [ ]_{m,n} is defined by equation (3).

[Math 3]

[0013] (Numerical method) In the r direction, the equations are discretized using the central difference shown in equation (4).

[Math 4]

The time integration of equation (1) can then be carried out by an explicit method.

[0014] Equation (2), on the other hand, can be solved from the time-integrated U, but this computation is an implicit method that involves inverting a matrix. Because equation (2) is independent for each n, it is expressed as N matrix equations.

[0015] Let u_n (with elements u_{i,m,n}) and w_n (with elements w_{i,l,n}) denote the discretized U_{m,n} and φ_{l,n}. The matrix computation is then written as equation (5). If u_n and w_n are ordered so that the m (or l) elements are contiguous, the matrix A becomes the block tridiagonal matrix with block size m shown in FIG. 3. Here i ranges from 1 to I, m and l range from 1 to M, and n ranges from 1 to N.

[0016] Because A_n does not change over time, equation (5) is LU-decomposed as A_n = L_n U_n, the intermediate vector v_n is introduced, and the solve is split into the forward elimination of equation (6) and the back substitution of equation (7).

    u_n = A_n w_n    (5)
    u_n = L_n v_n    (6)
    v_n = U_n w_n    (7)
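The factor-once, solve-many structure of equations (5) to (7) can be sketched as follows. This is a minimal illustration under a simplifying assumption: A is a scalar tridiagonal matrix rather than the block tridiagonal matrix of FIG. 3, and no pivoting is done. The function names and the array layout are chosen here and do not come from the patent.

```python
def lu_tridiagonal(lower, diag, upper):
    """Factor a tridiagonal A = L U without pivoting.

    lower[i] = A[i][i-1] (lower[0] is unused), diag[i] = A[i][i],
    upper[i] = A[i][i+1] (upper[-1] is unused).
    Returns (l, d): L has a unit diagonal and sub-diagonal l,
    U has diagonal d and super-diagonal upper.
    """
    n = len(diag)
    l = [0.0] * n
    d = [0.0] * n
    d[0] = diag[0]
    for i in range(1, n):
        l[i] = lower[i] / d[i - 1]
        d[i] = diag[i] - l[i] * upper[i - 1]
    return l, d


def forward_elimination(l, u_vec):
    """Solve L v = u for v, as in equation (6)."""
    v = [0.0] * len(u_vec)
    v[0] = u_vec[0]
    for i in range(1, len(u_vec)):
        v[i] = u_vec[i] - l[i] * v[i - 1]
    return v


def back_substitution(d, upper, v):
    """Solve U w = v for w, the back substitution step."""
    n = len(v)
    w = [0.0] * n
    w[n - 1] = v[n - 1] / d[n - 1]
    for i in range(n - 2, -1, -1):
        w[i] = (v[i] - upper[i] * w[i + 1]) / d[i]
    return w
```

Because A_n is constant in time, `lu_tridiagonal` would be called once per n, and only `forward_elimination` and `back_substitution` would run in each time step.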

[0017] (Flow of the numerical computation) Time integration is carried out by repeating the following steps.
1. Integrate u_n over one time step by the explicit method.
2. Obtain the intermediate solution v_n from u_n by forward elimination.
3. Obtain w_n from v_n by back substitution.

[0018] (Parallelization) In an embodiment of the present invention, the numerical computation is parallelized using P processors. The processors are referred to below as PE_k (1 ≤ k ≤ P). The vectors u_n and w_n are divided into P contiguous subvectors, and the k-th subvectors u_{n,k} and w_{n,k}, whose elements are the u_{i,m,n} and w_{i,l,n} with imin(k) ≤ i ≤ imax(k), are placed on PE_k.
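The text does not state how imin(k) and imax(k) are chosen. One common choice, assumed here purely for illustration, is a contiguous split that is as even as possible; `partition_bounds` is a hypothetical helper name.

```python
def partition_bounds(num_points, num_pes):
    """Return [(imin(k), imax(k))] for k = 1..P, 1-based and inclusive.

    Assumption: the I grid points are split into contiguous, nearly equal
    ranges, with any remainder given to the lowest-numbered PEs.
    """
    base, extra = divmod(num_points, num_pes)
    bounds = []
    start = 1
    for k in range(num_pes):
        size = base + (1 if k < extra else 0)
        bounds.append((start, start + size - 1))
        start += size
    return bounds


print(partition_bounds(10, 3))  # [(1, 4), (5, 7), (8, 10)]
```

With this layout every PE owns one contiguous index range, which is what allows the explicit and implicit parts to share the same data distribution.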

[0019] (Parallelization of the explicit method part) On PE_k, the explicit method part can be computed by referring to the elements u_{i,m,n} and w_{i,l,n} with imin(k) - 1 ≤ i ≤ imax(k) + 1.

[0020] FIG. 1 shows the data references when the explicit method part is parallelized. To update element 13, the neighboring elements 10, 11, and 12 are referenced.

[0021] (Processing flow of the parallelized explicit method part) The explicit method part is processed as follows.
1. PE_k (k ≤ P - 1) transfers the elements u_{i,m,n} with i = imax(k) to PE_{k+1}.
2. PE_k (2 ≤ k) transfers the elements u_{i,m,n} with i = imin(k) to PE_{k-1}.
3. Each PE_k (1 ≤ k ≤ P) computes u_{n,k} independently.
In FIG. 1, steps 1 and 2 transfer the vector elements 10 and 14 that PE2 needs for its computation.
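The three steps above can be sketched as a single-process simulation of the boundary exchange. The 3-point stencil, the step size `dt`, the fixed end values, and the function names are all assumptions made for illustration; the actual update formula f comes from equations (1) and (4), which are not reproduced in this text.

```python
def explicit_step_serial(u, dt):
    """One explicit step of a hypothetical 3-point stencil with fixed end values."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + dt * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def explicit_step_partitioned(chunks, dt):
    """The same step with u split into contiguous chunks, one per simulated PE."""
    num_pes = len(chunks)
    # Steps 1 and 2: each PE sends its last element to the right neighbor
    # and its first element to the left neighbor.
    from_left = [None] + [chunks[k][-1] for k in range(num_pes - 1)]
    from_right = [chunks[k + 1][0] for k in range(num_pes - 1)] + [None]
    # Step 3: each PE updates its own chunk from local data plus two halo values.
    result = []
    for k in range(num_pes):
        local = chunks[k]
        padded = [from_left[k]] + local + [from_right[k]]
        new = local[:]
        for j in range(len(local)):
            left, right = padded[j], padded[j + 2]
            if left is None or right is None:
                continue  # global boundary point keeps its value
            new[j] = local[j] + dt * (left - 2.0 * local[j] + right)
        result.append(new)
    return result
```

Each simulated PE reads only one value from each neighbor, matching the role of elements 10 and 14 for PE2 in FIG. 1. On a real MIMD machine the two list constructions would be replaced by message passing between neighbors.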

[0022] (Parallelization of the implicit method part) As shown in FIG. 2, the implicit method part is parallelized by pipelining the processing between PE_k and PE_{k-1}. For simplicity, FIG. 2 shows the case of P = 3 with two systems (n = 1, 2).
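The pipelining can be made concrete by listing which PE works on which system at each stage. The sketch below assumes that every partial forward elimination takes one stage of equal length and that PE_k handles system n at stage (n - 1) + (k - 1); both are idealizations introduced here, and `pipeline_schedule` is a hypothetical name.

```python
def pipeline_schedule(num_pes, num_systems):
    """Return, for each stage, the list of (k, n) pairs active in that stage.

    Assumption: PE_k handles system n at stage (n - 1) + (k - 1), so a PE
    starts the next system as soon as it has passed the current one on.
    """
    stages = [[] for _ in range(num_pes + num_systems - 1)]
    for n in range(1, num_systems + 1):
        for k in range(1, num_pes + 1):
            stages[(n - 1) + (k - 1)].append((k, n))
    return stages


# The configuration of FIG. 2: P = 3 processors, two systems.
for stage, work in enumerate(pipeline_schedule(3, 2)):
    print(stage, work)
```

Under these assumptions the forward sweep takes P + N - 1 stages (4 for the FIG. 2 case) instead of the P × N stages needed if the chunks were processed one after another.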

[0023] (Processing flow of the parallelized implicit method part) The processing flow for each n component is shown below with reference to FIG. 2. For simplicity, FIG. 2 shows the case of P = 3 with two systems (n = 1, 2).
1. (Forward elimination part, from here on)
2. PE_1 computes v_{n,1} from u_{n,1}.
3. PE_1 sends the elements v_{i,m,n} with i = imax(1) to PE_2.
4. PE_2 receives the elements v_{i,m,n} with i = imax(1) and computes v_{n,2}.
5. PE_2 sends the elements v_{i,m,n} with i = imax(2) to PE_3.
6. …
7. PE_P receives the elements v_{i,m,n} with i = imax(P-1) and computes v_{n,P}.
8. (Back substitution part)
9. PE_P computes w_{n,P} from v_{n,P}.
10. PE_P sends the elements w_{i,m,n} with i = imin(P) to PE_{P-1}.
11. PE_{P-1} receives the elements w_{i,m,n} with i = imin(P) and computes w_{n,P-1}.
12. PE_{P-1} sends the elements w_{i,m,n} with i = imin(P-1) to PE_{P-2}.
13. …
14. PE_1 receives the elements w_{i,m,n} with i = imin(2) and computes w_{n,1}.
In this way w_{n,1} is obtained. After finishing its part of the solve for one n, PE_k starts the solve for a different n, so that forward elimination proceeds in a pipelined fashion as shown in FIG. 2.
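The step list above can be simulated in a single process to check that the chunked, pipelined solve reproduces the ordinary one. The sketch makes several assumptions that are not in the patent: the factors are scalar bidiagonal (L with a unit diagonal and sub-diagonal `l`, U with diagonal `d` and super-diagonal `upper`) rather than block matrices, rows are 0-based, each partial solve takes one stage, and "send" and "receive" are modeled as reads from shared lists. All names are hypothetical.

```python
def forward_chunk(l, u, lo, hi, v_prev):
    """Rows lo..hi-1 of L v = u. v_prev is v[lo-1] from the previous PE (None on the first PE)."""
    v = []
    for i in range(lo, hi):
        carry = 0.0 if v_prev is None else l[i] * v_prev
        v_prev = u[i] - carry
        v.append(v_prev)
    return v


def backward_chunk(d, upper, v, lo, hi, w_next):
    """Rows lo..hi-1 of U w = v. w_next is w[hi] from the next PE (None on the last PE)."""
    w = [0.0] * (hi - lo)
    for i in range(hi - 1, lo - 1, -1):
        carry = 0.0 if w_next is None else upper[i] * w_next
        w_next = (v[i] - carry) / d[i]
        w[i - lo] = w_next
    return w


def pipelined_solve(l, d, upper, rhs_list, bounds):
    """Solve L U w = u for every u in rhs_list, stage by stage.

    bounds[k] = (lo, hi) is the half-open row range owned by simulated PE k.
    """
    num_pes, num_sys, size = len(bounds), len(rhs_list), len(d)
    v = [[None] * size for _ in range(num_sys)]
    w = [[None] * size for _ in range(num_sys)]
    # Forward sweep: at stage t, PE k works on system n = t - k if it exists.
    for t in range(num_pes + num_sys - 1):
        for k in range(num_pes):
            n = t - k
            if 0 <= n < num_sys:
                lo, hi = bounds[k]
                v_prev = v[n][lo - 1] if k > 0 else None  # value "received" from PE k-1
                v[n][lo:hi] = forward_chunk(l, rhs_list[n], lo, hi, v_prev)
    # Backward sweep: the same pipeline, running from the last PE to the first.
    for t in range(num_pes + num_sys - 1):
        for k in range(num_pes - 1, -1, -1):
            n = t - (num_pes - 1 - k)
            if 0 <= n < num_sys:
                lo, hi = bounds[k]
                w_next = w[n][hi] if k < num_pes - 1 else None  # value "received" from PE k+1
                w[n][lo:hi] = backward_chunk(d, upper, v[n], lo, hi, w_next)
    return w
```

In the scalar case only one value crosses each PE boundary per system. In the block tridiagonal case of FIG. 3 the boundary data would presumably be the whole block of elements at i = imax(k) or i = imin(k), as the step list states.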

[0024]

[Effects of the Invention] In the present invention, because the vector length of the problem is generally larger than N, the explicit method part has a high degree of parallelism and can be sped up substantially on a highly parallel machine. In addition, because the data is distributed over the PEs in the same way for the implicit method part and the explicit method part, no data redistribution is needed and the communication overhead is small.

[0025] Furthermore, because the coefficient matrices and data can be distributed over all processors, the size of problem that can be handled grows with the number of processors. The parallel numerical computation method of the present invention is a parallelization technique suited to MIMD highly parallel machines.

[Brief explanation of drawings]

FIG. 1 is a conceptual diagram showing the parallelization of the explicit method part.

FIG. 2 is a conceptual diagram showing the parallelization of the implicit method part.

FIG. 3 is a diagram showing the coefficient matrix of the implicit method part.

FIG. 4 is a diagram showing forward elimination in matrix form.

FIG. 5 is a diagram showing back substitution in matrix form.

[Explanation of symbols]

10: Elements of w_n, u_n needed for the computation on PE2 and received from PE1
11, 12: Elements of w_n, u_n that are referenced
13: Element of u_n that is updated
14: Elements of w_n, u_n needed for the computation on PE2 and received from PE3
20, 21, 22: Elements of v_{1,k} computed from u_{1,k} by forward elimination
23, 24, 25: Elements of v_{2,k} computed from u_{2,k} by forward elimination
26, 27, 28: Elements of w_{1,k} computed from v_{1,k} by back substitution
29, 30, 31: Elements of w_{2,k} computed from v_{2,k} by back substitution

Claims

[Claims]

Claim 1: A parallel numerical computation method for parallelizing, using P processors, a numerical computation consisting of an explicit method part and an implicit method part, in which the explicit method part is expressed as an operation that derives y from x by the relation y = f(x) using vectors x and y, and the implicit method part is expressed as a process that derives w_i from u_i by the relation u_i = A_i w_i using N vectors u_i, w_i and N independent matrices A_i, characterized in that: the explicit method part is parallelized by dividing the vector x into P contiguous subvectors x_k and processing x_k on the k-th processor PE_k; the implicit method part is reduced, by LU decomposition of the matrix A_i, to forward elimination, which computes v_i from u_i by u_i = L_i v_i, and back substitution, which computes w_i from v_i by v_i = U_i w_i; the vectors u_i, v_i, w_i are divided into u_{i,k}, v_{i,k}, w_{i,k} in accordance with the division of the vector x; in the forward elimination, the k-th processor PE_k performs part of the forward elimination using u_{i,k}, L_i, and the intermediate solution v_{i,k-1} received from PE_{k-1}, produces the intermediate solution v_{i,k}, and sends it to PE_{k+1}, after which PE_k performs part of another forward elimination using another vector u_{j,k} and the intermediate solution v_{j,k-1} received from PE_{k-1}, so that the v_i are computed in parallel by pipeline processing; and in the back substitution, the k-th processor PE_k performs part of the back substitution using the v_{i,k} computed in the forward elimination, U_i, and the partial solution w_{i,k+1} received from PE_{k+1}, produces the partial solution w_{i,k}, and sends it to PE_{k-1}, after which PE_k performs part of another back substitution using another vector v_{j,k} and the intermediate solution w_{j,k+1} received from PE_{k+1}, so that the w_i are computed in parallel by pipeline processing, whereby the implicit method part is parallelized.