JPH04365171A - Parallel numerical arithmetic system - Google Patents

Parallel numerical arithmetic system

Info

Publication number
JPH04365171A
JPH04365171A JP16767291A JP16767291A JPH04365171A JP H04365171 A JPH04365171 A JP H04365171A JP 16767291 A JP16767291 A JP 16767291A JP 16767291 A JP16767291 A JP 16767291A JP H04365171 A JPH04365171 A JP H04365171A
Authority
JP
Japan
Prior art keywords
pek
vector
implicit
calculate
solution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP16767291A
Other languages
Japanese (ja)
Inventor
Satoshi Matsushita
智 松下
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP16767291A
Publication of JPH04365171A
Legal status: Withdrawn

Landscapes

  • Advance Control (AREA)
  • Multi Processors (AREA)
  • Complex Calculations (AREA)

Abstract

PURPOSE: To execute the computation at high speed on MIMD parallel computers by parallelizing the explicit part through vector partitioning and the implicit part through pipeline processing. CONSTITUTION: In the explicit method, which updates the vector U from the vectors U and W shown in the figure, the new element 13 of U is computed by referring to elements 10, 11 and 12. With the elements assigned to processor PE2 as shown in the figure, PE2 can compute its explicit part after receiving only elements 10 and 14 from the neighboring processors PE1 and PE3. In the implicit part, V_{n,k} is obtained from U_{n,k} by forward elimination and W_{n,k} is then obtained by back substitution; computing V_{n,k} and W_{n,k} on several processors in a pipelined manner parallelizes this part.

Description

[Detailed description of the invention]

[0001]

[Industrial field of application] The present invention relates to a parallel numerical computation method that uses a plurality of processors.

[0002]

[Prior art] Consider a numerical computation that consists of an explicit part and an implicit part and is made up of N independent systems indexed by i. The explicit part derives y_i from x_i through the relation y_i = f(x_i). The implicit part derives w_i from u_i through the relation u_i = E_i w_i, where the E_i are N matrices and each E_i is sparse.

[0003] In this case the implicit part is difficult to vectorize, so speed has conventionally been improved by vectorizing only the explicit part, which has the larger operation count, on a vector computer. However, once the explicit part has been accelerated, the implicit part, which cannot be vectorized, becomes the bottleneck on the vector computer, and no large overall speed-up can be expected.
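The bottleneck described in [0003] can be illustrated with Amdahl's law. The law is not named in the patent text and is used here only as a worked illustration. Assume that a fraction f of the run time is spent in the explicit part, that this part is accelerated by a factor s, and that the implicit part is not accelerated at all:

```latex
S(s) = \frac{1}{(1 - f) + f/s},
\qquad
\lim_{s \to \infty} S(s) = \frac{1}{1 - f}
```

With an assumed f = 0.9, for instance, the overall speed-up cannot exceed 10 however fast the explicit part becomes.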

[0004] Hence the idea arose of improving speed by parallelizing the implicit part, exploiting the fact that it consists of N independent systems.

[0005]

[Problems to be solved by the invention] Two straightforward ways to improve speed come to mind.
1. Run the explicit part on a vector computer and the implicit part on N parallel processors.
2. Run both the explicit part and the implicit part on N processors.

[0006] However, item 1 requires an expensive vector computer and also requires data to be redistributed between the vector computer and the parallel processors, so communication overhead may limit the speed-up. With item 2, it is difficult to speed up the explicit part, which has the larger workload, unless N is sufficiently large. In general N is only about 10 to a few tens, so a large speed-up is difficult to achieve.

[0007]

[Means for solving the problems] The parallel numerical computation method according to the present invention parallelizes, using P processors, a numerical computation that consists of an explicit part and an implicit part. The explicit part is expressed as an operation that derives y from x through the relation y = f(x), using vectors x and y. The implicit part is expressed as a process that derives w_i from u_i through the relation u_i = A_i w_i, using N vectors u_i, w_i and N independent matrices A_i.

The explicit part is parallelized by dividing the vector x into P contiguous sub-vectors x_k and solving x_k on the k-th processor PE_k.

For the implicit part, each matrix A_i is LU-decomposed, which reduces the problem to a forward elimination that computes v_i from u_i through u_i = L_i v_i and a back substitution that computes w_i from v_i through v_i = U_i w_i. The vectors u_i, v_i, w_i are divided into u_{i,k}, v_{i,k}, w_{i,k} to match the division of x.

In the forward elimination, the k-th processor PE_k performs part of the forward elimination using u_{i,k}, L_i and the intermediate solution v_{i,k-1} received from PE_{k-1}, creates the intermediate solution v_{i,k} and sends it to PE_{k+1}. PE_k then performs part of another forward elimination using another vector u_{j,k} and the intermediate solution v_{j,k-1} received from PE_{k-1}. This pipeline processing computes the v_i in parallel.

In the back substitution, PE_k performs part of the back substitution using the v_{i,k} computed in the forward elimination, U_i and the partial solution w_{i,k+1} received from PE_{k+1}, creates the partial solution w_{i,k} and sends it to PE_{k-1}. PE_k then performs part of another back substitution using another vector v_{j,k} and the intermediate solution w_{j,k+1} received from PE_{k+1}. This pipeline processing computes the w_i in parallel, and the implicit part is thereby parallelized.

[0008]

[Operation] In the present invention, the explicit part can be parallelized by exploiting its high degree of parallelism, which is as large as the number of elements of the vector x. The degree of parallelism is therefore not limited as it is in item 2 above, and with a parallel machine made up of many processors the explicit part can be made faster than on a vector computer.

[0009] In the present invention, the implicit part can also be parallelized, using its N-fold parallelism. Because the data partitioning has the same shape in the implicit and explicit parts, the data redistribution required in item 1 above is unnecessary. Communication in both the explicit and the implicit processing consists only of exchanging boundary data between adjacent processors, so the communication volume is small. Moreover, all processors can communicate at the same time, so communication overhead can be made very small.

[0010] From the above, the present invention parallelizes the implicit part, which was the problem on vector computers, and enables numerical computation at speeds exceeding those of a vector computer. In addition, because the coefficient matrices and data can be distributed over all processors, numerical computation on large-scale data becomes possible.

[0011]

[Embodiment] The parallel numerical computation method of the present invention is described with reference to FIGS. 1 and 2, taking the solution of equations (1) and (2) as an example.

[0012] (Equations)

[Math 1]

[Math 2]

Here a_{k,l,m}(r), b_{k,l,m}(r) and c_{k,l,m}(r) are coefficients that depend on k, l, m and r. The operator [ ]_{m,n} is defined by equation (3).

[Math 3]

[0013] (Numerical method) The r direction is discretized by the central difference shown in equation (4).

[Math 4]

The time integration of equation (1) can then be carried out by an explicit method.

[0014] Equation (2), on the other hand, can be solved from the time-integrated U, but this computation is an implicit method that involves a matrix inversion. Because equation (2) is independent for each n, it is expressed as N matrix equations.
Considering that each n is independent, it is expressed by N matrix equations.

【0015】前記行列演算はUm,n,φl,n を離
散化したものをun (要素はui,m,n ),wn
 (要素は、wi,l,n )とおくと行列表現(5)
式になる。このとき、un wn をmまたは、l要素
が連続になるようにソートすると、行列Aは図3に示さ
れるブロックサイズmのブロック3重対角行列になる。 ただし、iの範囲を1からI、m,lの範囲を1からM
、nの範囲を1からNとする。
[0015] In the matrix operation, the discretized Um, n, φl, n are un (elements are ui, m, n), wn
(Elements are wi, l, n) Matrix representation (5)
It becomes a ceremony. At this time, if un wn is sorted so that m or l elements are continuous, matrix A becomes a block tridiagonal matrix with block size m shown in FIG. However, the range of i is from 1 to I, and the range of m, l is from 1 to M.
, n range from 1 to N.

[0016] Because A_n does not change with time, A_n is LU-decomposed as A_n = L_n U_n, an intermediate vector v_n is introduced, and equation (5) is split into the forward elimination (6) and the back substitution (7).

u_n = A_n w_n    (5)
u_n = L_n v_n    (6)
v_n = U_n w_n    (7)
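The split into an LU factorization, a forward elimination (L v = u) and a back substitution (U w = v) can be sketched in plain Python as follows. This is a minimal dense illustration without pivoting, not the patent's block tridiagonal implementation; the function names and the small test matrix are assumptions made for the example. Since A_n does not change with time, `lu_decompose` would be called once and the two substitutions repeated every time step.

```python
def lu_decompose(a):
    """Doolittle LU factorization without pivoting: a = lower * upper."""
    n = len(a)
    lower = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    upper = [[float(x) for x in row] for row in a]
    for c in range(n):
        for r in range(c + 1, n):
            # Multiplier that eliminates entry (r, c); assumes a nonzero pivot.
            lower[r][c] = upper[r][c] / upper[c][c]
            for j in range(c, n):
                upper[r][j] -= lower[r][c] * upper[c][j]
    return lower, upper


def forward_elimination(lower, u_vec):
    """Solve lower * v = u_vec from the first row to the last."""
    v = []
    for r in range(len(u_vec)):
        partial = sum(lower[r][c] * v[c] for c in range(r))
        v.append((u_vec[r] - partial) / lower[r][r])
    return v


def back_substitution(upper, v):
    """Solve upper * w = v from the last row to the first."""
    n = len(v)
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(upper[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (v[r] - partial) / upper[r][r]
    return w


# Hypothetical 3x3 tridiagonal example: factor once, then solve.
a = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
lower, upper = lu_decompose(a)
w = back_substitution(upper, forward_elimination(lower, [6.0, 12.0, 14.0]))
```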

[0017] (Flow of the numerical computation) Time integration proceeds by repeating the following steps.
1. Integrate u_n over one time step by the explicit method.
2. Solve for the intermediate solution v_n from u_n by forward elimination.
3. Solve for w_n from v_n by back substitution.

[0018] (Parallelization) In this embodiment of the present invention, the above computation is parallelized over P processors, referred to below as PE_k (1 <= k <= P). u_n and w_n are divided into P contiguous sub-vectors, and the k-th sub-vectors u_{n,k} and w_{n,k}, that is, the elements u_{i,m,n} and w_{i,l,n} with imin(k) <= i <= imax(k), are placed on PE_k.
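One way to choose the index ranges imin(k) and imax(k) is sketched below. The patent only requires the sub-vectors to be contiguous; the near-equal split used here, and the function name `partition`, are assumptions made for illustration.

```python
def partition(num_points, num_procs):
    """Return [(imin(k), imax(k))] for k = 1..P as 1-based inclusive ranges."""
    base, extra = divmod(num_points, num_procs)
    ranges = []
    start = 1
    for k in range(num_procs):
        # The first `extra` processors take one additional index each.
        size = base + (1 if k < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Ten grid indices i = 1..10 spread over three processors.
ranges = partition(10, 3)
```

With these ranges, imax(k) + 1 equals imin(k+1), so adjacent processors share only a boundary.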

[0019] (Parallelization of the explicit part) The explicit part on PE_k can be solved by referring to the elements u_{i,m,n} and w_{i,l,n} with imin(k) - 1 <= i <= imax(k) + 1.

[0020] FIG. 1 shows the data references when the explicit part is parallelized. Updating element 13 requires the neighboring elements 10, 11 and 12.

[0021] (Processing flow of the parallelized explicit part) The explicit part is processed as follows.
1. PE_k (k <= P-1) transfers u_{i,m,n} with i = imax(k) to PE_{k+1}.
2. PE_k (2 <= k) transfers u_{i,m,n} with i = imin(k) to PE_{k-1}.
3. Each PE_k (1 <= k <= P) computes u_{n,k} independently.
In FIG. 1, steps 1 and 2 transfer the vector elements 10 and 14 that PE_2 needs for its computation.
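The three steps of the explicit part can be simulated in a single process as in the sketch below. The patent does not give the update function f, so a 3-point stencil with zero values outside the domain is assumed here; the message transfers are modeled as list operations, and all names are hypothetical. The sketch requires at least one element per processor.

```python
def stencil(padded, c):
    """Update every interior element of `padded` from itself and its two neighbors."""
    return [padded[i] + c * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


def explicit_step_partitioned(u, c, num_procs):
    """One explicit step with the list u split into contiguous, non-empty chunks."""
    base, extra = divmod(len(u), num_procs)
    chunks, start = [], 0
    for k in range(num_procs):
        size = base + (1 if k < extra else 0)
        chunks.append(u[start:start + size])
        start += size
    # Step 1: each PE except the last sends its element i = imax(k) to the right.
    from_left = [0.0] + [chunk[-1] for chunk in chunks[:-1]]
    # Step 2: each PE except the first sends its element i = imin(k) to the left.
    from_right = [chunk[0] for chunk in chunks[1:]] + [0.0]
    # Step 3: every PE updates its own chunk independently of the others.
    result = []
    for chunk, left, right in zip(chunks, from_left, from_right):
        result.extend(stencil([left] + chunk + [right], c))
    return result
```

Each processor receives only one boundary element from each neighbor, which matches the elements 10 and 14 transferred to PE_2 in FIG. 1.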

[0022] (Parallelization of the implicit part) As shown in FIG. 2, the implicit part is parallelized by pipelined processing between PE_k and PE_{k-1}. For simplicity, FIG. 2 shows the case P = 3 with two values of n.
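The pipelining across different n can be pictured as a schedule table. The sketch below assumes that every processor needs one unit of time for its part of one system, which the patent does not state; the function name is hypothetical. Under that assumption, N systems on P processors finish the forward elimination in N + P - 1 steps rather than the N * P steps needed if the systems were handled strictly one after another.

```python
def forward_schedule(num_systems, num_procs):
    """schedule[t][k] is the system n handled by PE_(k+1) at step t, or None if idle."""
    steps = num_systems + num_procs - 1
    schedule = []
    for t in range(steps):
        row = []
        for k in range(num_procs):
            # PE_(k+1) can start system n only after PE_k has passed on its part.
            n = t - k
            row.append(n + 1 if 0 <= n < num_systems else None)
        schedule.append(row)
    return schedule


# The case drawn in FIG. 2: three processors, two systems.
for row in forward_schedule(2, 3):
    print(row)
```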

[0023] (Processing flow of the parallelized implicit part) The processing flow for each n component is given below with reference to FIG. 2. For simplicity, FIG. 2 shows the case P = 3 with two values of n.
1. (Forward elimination part, from here on)
2. PE_1 computes v_{n,1} from u_{n,1}.
3. PE_1 sends the elements v_{i,m,n} with i = imax(1) to PE_2.
4. PE_2 receives the elements v_{i,m,n} with i = imax(1) and computes v_{n,2}.
5. PE_2 sends the elements v_{i,m,n} with i = imax(2) to PE_3.
6. …
7. PE_P receives the elements v_{i,m,n} with i = imax(P-1) and computes v_{n,P}.
8. (Back substitution part)
9. PE_P computes w_{n,P} from v_{n,P}.
10. PE_P sends the elements w_{i,m,n} with i = imin(P) to PE_{P-1}.
11. PE_{P-1} receives the elements w_{i,m,n} with i = imin(P) and computes w_{n,P-1}.
12. PE_{P-1} sends the elements w_{i,m,n} with i = imin(P-1) to PE_{P-2}.
13. …
14. PE_1 receives the elements w_{i,m,n} with i = imin(2) and computes w_{n,1}.
In this way w_{n,1} is obtained. After finishing its part for one n, PE_k starts solving for a different n, which realizes the pipelined processing shown in FIG. 2.
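The forward and backward sweeps for a single n can be simulated as below. To keep the sketch short it assumes scalar bidiagonal factors (a unit lower factor with sub-diagonal `l_sub` and an upper factor with diagonal `u_diag` and super-diagonal `u_sup`) instead of the block factors of FIG. 3, and it models each send and receive as a variable handed from one loop iteration to the next. All names are hypothetical.

```python
def pipelined_solve(l_sub, u_diag, u_sup, rhs, bounds):
    """Solve L U w = rhs chunk by chunk.

    bounds holds one 0-based half-open index range (lo, hi) per PE, in order.
    """
    size = len(rhs)
    v = [0.0] * size
    w = [0.0] * size
    # Forward elimination, PE_1 to PE_P.
    carry = 0.0  # boundary value of v received from the left neighbor
    for lo, hi in bounds:
        prev = carry
        for i in range(lo, hi):
            v[i] = rhs[i] - l_sub[i] * prev
            prev = v[i]
        carry = prev  # "send" v at i = imax(k) to PE_(k+1)
    # Back substitution, PE_P to PE_1.
    carry = 0.0  # boundary value of w received from the right neighbor
    for lo, hi in reversed(bounds):
        nxt = carry
        for i in range(hi - 1, lo - 1, -1):
            w[i] = (v[i] - u_sup[i] * nxt) / u_diag[i]
            nxt = w[i]
        carry = nxt  # "send" w at i = imin(k) to PE_(k-1)
    return w
```

Within one n the processors work one after another, so the parallel speed-up comes only from overlapping the sweeps of different n, as the text describes.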

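[0024]

[Effects of the invention] In the present invention, the vector length of the problem is generally larger than N, so the explicit part has a high degree of parallelism and can be accelerated substantially on a highly parallel machine. Moreover, because the data distribution over the PEs has the same shape in the implicit and explicit parts, no data redistribution is needed and communication overhead is small.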
According to the present invention, since the vector length of a problem is generally larger than N, the explicit solution part can be significantly speeded up by using a highly parallel machine with a large degree of parallelism in the explicit solution part. On the other hand, since the shape of data distribution to PEs is the same between the implicit method part and the explicit method part, there is no need for data redistribution and communication overhead is small.

[0025] Furthermore, because the coefficient matrices and data can be distributed over all processors, the size of the problem that can be handled grows with the number of processors. The parallel numerical computation method of the present invention is a parallelization technique suited to highly parallel MIMD machines.

[Brief description of the drawings]

FIG. 1 is a conceptual diagram showing the parallelization of the explicit part.

FIG. 2 is a conceptual diagram showing the parallelization of the implicit part.

FIG. 3 is a diagram showing the coefficient matrix of the implicit part.

FIG. 4 is a diagram showing forward elimination in matrix form.

FIG. 5 is a diagram showing back substitution in matrix form.

[Explanation of symbols]

10: elements of w_n, u_n needed for the computation on PE2 and received from PE1
11, 12: referenced elements of w_n, u_n
13: element of u_n being updated
14: elements of w_n, u_n needed for the computation on PE2 and received from PE3
20, 21, 22: elements of v_{1,k} computed from u_{1,k} by forward elimination
23, 24, 25: elements of v_{2,k} computed from u_{2,k} by forward elimination
26, 27, 28: elements of w_{1,k} computed from v_{1,k} by back substitution
29, 30, 31: elements of w_{2,k} computed from v_{2,k} by back substitution

Claims (1)

[Claims]

[Claim 1] A parallel numerical computation method for parallelizing, using P processors, a numerical computation consisting of an explicit part and an implicit part, the explicit part being expressed as an operation that derives y from x through the relation y = f(x) using vectors x and y, and the implicit part being expressed as a process that derives w_i from u_i through the relation u_i = A_i w_i using N vectors u_i, w_i and N independent matrices A_i, characterized in that:
the explicit part is parallelized by dividing the vector x into P contiguous sub-vectors x_k and solving x_k on the k-th processor PE_k;
the implicit part is reduced, by LU-decomposing the matrix A_i, to a forward elimination that computes v_i from u_i through u_i = L_i v_i and a back substitution that computes w_i from v_i through v_i = U_i w_i, and the vectors u_i, v_i, w_i are divided into u_{i,k}, v_{i,k}, w_{i,k} to match the division of the vector x;
in the forward elimination, the k-th processor PE_k performs part of the forward elimination using u_{i,k}, L_i and the intermediate solution v_{i,k-1} received from PE_{k-1}, creates the intermediate solution v_{i,k} and sends it to PE_{k+1}, after which PE_k performs part of another forward elimination using another vector u_{j,k} and the intermediate solution v_{j,k-1} received from PE_{k-1}, so that v_i is computed in parallel by pipeline processing; and
in the back substitution, the k-th processor PE_k performs part of the back substitution using the v_{i,k} computed in the forward elimination, U_i and the partial solution w_{i,k+1} received from PE_{k+1}, creates the partial solution w_{i,k} and sends it to PE_{k-1}, after which PE_k performs part of another back substitution using another vector v_{j,k} and the intermediate solution w_{j,k+1} received from PE_{k+1}, so that w_i is computed in parallel by pipeline processing,
whereby the implicit part is parallelized.
JP16767291A 1991-06-12 1991-06-12 Parallel numerical arithmetic system Withdrawn JPH04365171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP16767291A JPH04365171A (en) 1991-06-12 1991-06-12 Parallel numerical arithmetic system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP16767291A JPH04365171A (en) 1991-06-12 1991-06-12 Parallel numerical arithmetic system

Publications (1)

Publication Number Publication Date
JPH04365171A true JPH04365171A (en) 1992-12-17

Family

ID=15854082

Family Applications (1)

Application Number Title Priority Date Filing Date
JP16767291A Withdrawn JPH04365171A (en) 1991-06-12 1991-06-12 Parallel numerical arithmetic system

Country Status (1)

Country Link
JP (1) JPH04365171A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08263470A (en) * 1995-03-24 1996-10-11 Nec Corp Numerical fluid analytic method using two kinds of time integration methods
US5819279A (en) * 1995-04-27 1998-10-06 Fujitsu Limited Object data processing apparatus

Similar Documents

Publication Publication Date Title
Gurbuzbalaban et al. On the convergence rate of incremental aggregated gradient algorithms
Stone Parallel tridiagonal equation solvers
Dennis, Jr et al. Direct search methods on parallel machines
JPH07271760A (en) Method and computer for simultaneous linear equation calculating process by memory decentralized type parallel computer
Kindervater et al. Experiments with parallel algorithms for combinatorial problems
JPH04365171A (en) Parallel numerical arithmetic system
Srinivas Optimal parallel scheduling of Gaussian elimination DAG's
Chikalov et al. Sequential optimization of matrix chain multiplication relative to different cost functions
Cheng et al. Algorithm partition for a fixed-size VLSI architecture using space-time domain expansion
Yakovlev Bounds on the minimum of convex functions on Euclidean combinatorial sets
Kroonenberg et al. Gram-Schmidt versus Bauer-Rutishauser in alternating least-squares algorithms for three-mode principal component analysis
Proudler et al. Formal derivation of a systolic array for recursive least squares estimation
JP2018206078A (en) Parallel processing apparatus, parallel operation method, and parallel operation program
Lin A hardware implementable two-level parallel computing algorithm for general minimum-time control
Carlson Solving linear recurrence systems on mesh-connected computers with multiple global buses
Yang et al. Unified gpu-parallelizable robot forward dynamics computation using band sparsity
Song et al. On the convergence of relaxed parallel chaotic iterations for h-matrix yongzhong song
Provot et al. Recognition of blurred pieces of discrete planes
Marrakchi et al. Static scheduling with load balancing for solving triangular band linear systems on multicore processors
Onaga et al. On design of rotary array communication and wavefront-driven algorithms for solving large-scale band-limited matrix equations
UA155815U (en) Device for optimal placement of distributed database fragments in the network structure of the cloud environment
Dunbar Analysis and design of parallel algorithms
JP3542184B2 (en) Linear calculation method
Maria et al. 1d and 2d systolic implementations for radial basis function networks
Fan et al. An iterative procedure for multidimensional realization by lft techniques

Legal Events

Date Code Title Description
A300 Application deemed to be withdrawn because no request for examination was validly filed

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 19980903