JPH04321158A

JPH04321158A - Distribution/collection processor for array data

Info

Publication number: JPH04321158A
Application number: JP3090146A
Authority: JP
Inventors: Hidetoshi Iwashita; 英俊岩下
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1991-04-22
Filing date: 1991-04-22
Publication date: 1992-11-11

Abstract

PURPOSE: To provide means for dividing and allocating array data arbitrarily, simply, and efficiently, in a distribution/collection processor that divides array data among a plurality of processors of a parallel computer and automatically distributes it according to an arbitrary transfer pattern. CONSTITUTION: A transfer table creation means 11 creates transfer tables 16. Each table holds the per-dimension range of one of the blocks that result from division along one or more arbitrary dimensions and that are to be allocated to a storage space or a processor 10. A data transfer means 12 refers to the tables 16 and transfers the data of each divided block to the corresponding storage space or processor 10. The array data can thus be distributed automatically according to an arbitrary transfer pattern.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to an apparatus that divides array data and allocates it to a plurality of processors in a parallel computer, and in particular to an array data distribution/collection processing apparatus that automatically distributes or automatically collects data according to an arbitrary transfer pattern.

[0002] To process huge array data, systems are used that divide the data, allocate it to a plurality of processors, and process it in parallel. To allow the way the data is divided to be switched arbitrarily and dynamically during processing, a technique for rearranging data efficiently is required.

[0003]

[Prior Art] In parallel computers, to handle array data efficiently, the array is often partitioned into hyper-rectangular blocks along its dimensions, the memory allocation of each block is kept separate, and each block is assigned to and managed by one processor.

[0004] FIG. 13 shows examples of such array partitioning. In the examples of FIG. 13, a three-dimensional array A(100, 100, 100) is partitioned. (a) shows one-dimensional partitioning, (b) shows multidimensional partitioning, and (c) shows multidimensional non-uniform partitioning. The slab partitioning shown in (a) can also be interpreted as dividing the array into four in the k direction while dividing it into one in each of the i and j directions.

[0005] In general, if an n-dimensional array A(M1, M2, ..., Mn) is cut into p1, p2, ..., pn parts (each an integer of 1 or more) along the respective dimensions, it is divided into (p1 × ... × pn) hyper-rectangular blocks. On a distributed-memory parallel computer, dividing and placing the data to suit the processing of each processor reduces the amount of data transferred during computation and improves efficiency. On a shared-memory parallel computer as well, the same partitioning reduces memory contention and lets processing proceed efficiently.
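The block count and block ranges described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `section_bounds`, `block_count`, and `blocks` are hypothetical, and the near-even section widths are an assumption of the sketch (the text also allows non-uniform partitioning). Indices are counted from 0.

```python
from itertools import product
from math import prod


def section_bounds(extent, parts):
    """Split indices 0..extent-1 into `parts` consecutive sections.

    Returns a list of inclusive (lower, upper) pairs. Near-even widths are
    an assumption of this sketch; any increasing cut points would do.
    """
    base, extra = divmod(extent, parts)
    bounds, lower = [], 0
    for s in range(parts):
        width = base + (1 if s < extra else 0)
        bounds.append((lower, lower + width - 1))
        lower += width
    return bounds


def block_count(parts):
    """Number of hyper-rectangular blocks: p1 x ... x pn."""
    return prod(parts)


def blocks(shape, parts):
    """Map each block number tuple to its per-dimension inclusive ranges."""
    per_dim = [section_bounds(m, p) for m, p in zip(shape, parts)]
    return {
        r: tuple(per_dim[d][r[d]] for d in range(len(shape)))
        for r in product(*(range(p) for p in parts))
    }


# Slab partitioning of A(100, 100, 100): 1 x 1 x 4 sections.
slabs = blocks((100, 100, 100), (1, 1, 4))
# One possible two-dimensional partitioning that yields 200 blocks.
grid = blocks((100, 100, 100), (1, 10, 20))
```

Here `slabs` has four entries, one per slab, and `grid` has 200 entries, one per processor in the 200-processor case mentioned later in the text.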

[0006] In addition, by dividing the physical address space block by block, the data handled by each processor can be gathered at physically contiguous addresses. This reduces page switching and cache misses, and makes it possible to address huge array data that does not fit in the address space of a single processor.

[0007] Partitioned allocation of arrays is thus important, but in real application programs the way the data is divided is rarely fixed, and repartitioning accompanied by data transfer is frequently required during execution. Examples are as follows.

[0008] (a) Change in the direction of data correlation. ADE (alternating direction editing) is a particularly well-known case. This edit changes a slab partitioning along one dimension into a slab partitioning along another dimension. It is needed in matrix computations such as LU decomposition and in multidimensional FET, and so it is used heavily in numerical computation for continuous systems. As the number of processors grows, multidimensional partitioning becomes unavoidable, and the transfer patterns inevitably become more complex. For example, to allocate the array A(100, 100, 100) to 200 processors, the array must be partitioned along two or more dimensions, and the ADE transfer in that case is considerably complicated.

[0009] (b) Repartitioning such as subdivision. In general, overheads such as data transfer and synchronization grow as the degree of parallelism rises, so the optimal number of processors differs from procedure to procedure. Consequently, the data partitioning may need to be changed at the boundary between procedures that benefit differently from parallelism.

[0010] (c) Operational reasons. A failure of some processors, a change in the number of concurrent users, and the like may require moving data between processors together with a change of partitioning.

[0011] Conventionally, when a user writes such repartitioning in a programming language such as FORTRAN or C, computing the array subscripts takes a great deal of effort and is intricate, so mistakes are easy to make and verification is difficult. In addition, the timing of data transfers is often unpredictable to the user, which makes it difficult to improve efficiency.

[0012] If the shape of the partition, the number of parts, and so on are restricted, repartitioning may be realized relatively easily, and computers with hardware aimed at making specific transfer patterns efficient have been considered. However, such restrictions severely limit the range of applications and the modes of operation. In particular, a general-purpose computer must support whatever partition shape or transfer pattern the user needs. Conventionally, however, there has been no means that flexibly handles arbitrary partition shapes and transfer patterns.

[0013]

[Problems to Be Solved by the Invention] The present invention aims to solve the above problems and to provide means for realizing arbitrary partitioned allocation of array data easily and efficiently. That is, the aim is to let the user allocate an array in partitioned form, or change its partitioning, merely by specifying the partition shape and the array range, without being explicitly aware of the details of the data transfer.

[0014] A further aim of the present invention is to provide data distribution means and data collection means that apply to a wide range of cases, without restrictions on the number of dimensions or size of the array, the number of partitioned dimensions, or whether the division widths are uniform.

[0015]

[Means for Solving the Problems] FIG. 1 shows the principle of the present invention. In FIG. 1, 10a, 10b, ..., 10c are processors, each with an independent data processing function; 11 is a transfer table creation means; 12 is a data transfer means that distributes or collects data; 13 is a division section number designation means that designates the division section numbers of the blocks to be divided or collected; 14 is a division section upper and lower limit designation means that designates the upper and lower limits of the division section of each block; 15 is a shared memory accessible from every processor; 16 is a transfer table used to distribute or collect data; 17 is a distribution data storage area that holds the data to be distributed; 18 is a transfer destination data storage area that holds the data each processor processes; 19 is a collection data storage area that holds the collected data; and 20 is a transfer source data storage area that holds the data to be collected.

[0016] The invention of claim 1 is configured, for example, as shown in (a). An arbitrary subarray that one partitioning method allocates within a single block is divided and allocated according to another partitioning method. To do so, the processor 10a obtains from the division section number designation means 13 the division section numbers of the blocks to be divided, and obtains from the division section upper and lower limit designation means 14 the upper and lower limits of the divided block that corresponds to each division section number. The division section number designation means 13 and the division section upper and lower limit designation means 14 may each be any means, such as an input/output device, an external storage device, or a program.

[0017] Based on the information obtained from the division section number designation means 13 and the division section upper and lower limit designation means 14, the transfer table creation means 11 creates transfer tables 16. Each table corresponds to one divided block, produced by division along one or more arbitrary dimensions, that is to be allocated to a transfer destination data storage area 18 in the shared memory 15, and each table holds the range of that block in every dimension.

[0018] Based on the created transfer tables 16, the data transfer means 12 transfers the data of each divided block to the corresponding storage space. That is, following the transfer tables 16, it distributes the data in the distribution data storage area 17 to the transfer destination data storage areas 18 designated for the respective processors.

[0019] In the example of FIG. 1(a), data is distributed within the shared memory 15; on a distributed-memory computer, the data is instead transferred to the local memory of each processor. The invention of claim 2 is configured, for example, as shown in (b).

[0020] Data is collected from an array that is allocated in distributed form under one partitioning method, so as to gather an arbitrary subarray that belongs to a single block under another partitioning method. To do so, the processor 10a obtains from the division section number designation means 13 the division section numbers of the blocks to be collected, and obtains from the division section upper and lower limit designation means 14 the upper and lower limits of the divided block that corresponds to each division section number.

[0021] Based on the information obtained from the division section number designation means 13 and the division section upper and lower limit designation means 14, the transfer table creation means 11 creates transfer tables 16. Each table corresponds to one divided block, produced by division along one or more arbitrary dimensions, that is placed in distributed form in a transfer source data storage area 20 in the shared memory 15, and each table holds the range, in every dimension, of the data to be collected.

[0022] Based on the created transfer tables 16, the data transfer means 12 transfers the data to be collected for each divided block from the transfer source data storage areas 20 to the collection data storage area 19. In the example of FIG. 1(b), data is collected within the shared memory 15; on a distributed-memory computer, the data is instead transferred from the local memory of each processor to the local memory of the collecting processor.

[0023]

[Operation] In the invention of claim 1, when a hyper-rectangular subarray A(i1:j1, i2:j2, ..., in:jn) of, for example, an n-dimensional array A(M1, M2, ..., Mn) lies within a single divided block, it is divided and allocated according to another partitioning method. Here n ≥ 1, Md is the extent of dimension d, and id and jd are the lower and upper index limits in dimension d, with 0 ≤ id ≤ jd ≤ Md − 1.

[0024] The invention of claim 2 does the reverse: a subarray that is allocated in divided form is collected into a single divided block according to another partitioning method. The subarray may be the entire array. To simplify the explanation, array indices and division section numbers in every dimension are counted from 0. In other cases (for example, counting from 1), the invention applies after shifting the values. Within each divided block, the array element A(k1, ..., kn) is placed at the address given by the following formula.

[0025]

$$(\text{base address}) + (\text{word length}) \times \bigl(k_1 + m_1 \times (k_2 + m_2 \times (\cdots + m_{n-1} \times k_n \cdots))\bigr) \qquad \text{(Formula 1)}$$

On a distributed-memory computer, the correspondence between divided blocks and processors is managed with a table or the like.
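The address calculation of Formula 1 can be sketched as a short function. This is an illustration under stated assumptions: the name `element_address` is hypothetical, and the sketch takes the indices as given, leaving the choice of base address (and therefore whether indices are global or block-relative) to the caller.

```python
def element_address(base, word_length, k, m):
    """Address of A(k1, ..., kn) inside one divided block, per Formula 1.

    k: indices (k1, ..., kn), counted from 0.
    m: coefficients (m1, ..., m(n-1)), one fewer than the number of indices.
    """
    if len(m) != len(k) - 1:
        raise ValueError("need n-1 coefficients for n indices")
    offset = k[-1]
    # Evaluate k1 + m1*(k2 + m2*(... + m(n-1)*kn)) from the innermost term out.
    for kd, md in zip(reversed(k[:-1]), reversed(m)):
        offset = kd + md * offset
    return base + word_length * offset
```

For instance, with a hypothetical base address of 1000, a word length of 8, and m1 = 30, the element with indices (3, 0) sits at 1000 + 8 × 3 = 1024.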

[0026] The coefficient md (d = 1, 2, ..., n−1) is called the division coefficient of dimension d of the divided block. The division coefficient is a constant no smaller than the division width (number of indices) in that dimension. When it equals the division width, the array elements in the block are allocated to consecutive addresses.

[0027] [1] On a shared-memory computer, an arbitrary subarray that partitioning method A allocates within a single block is distributed according to another partitioning method B by the following procedure.
(1) Using the division section number designation means 13, the processor 10a obtains, for every dimension d = 1, ..., n, the division section number pd that corresponds to id and the division section number qd that corresponds to jd under partitioning method B. This shows that the divided blocks of partitioning method B involved are (p1:q1, ..., pn:qn).

[0028] (2) Areas for (q1 − p1 + 1) × ... × (qn − pn + 1) transfer tables 16 are reserved, one for each of these divided blocks of partitioning method B.
(3) For each divided block (r1, ..., rn) (rd = pd, ..., qd; d = 1, ..., n), the processor 10a completes the corresponding transfer table 16 as follows.

[0029] The lower limit of dimension d is id when rd = pd, and the lower limit of division section rd when rd ≠ pd. The upper limit of dimension d is jd when rd = qd, and the upper limit of division section rd when rd ≠ qd.

[0030] (4) Following all the transfer tables 16, the processor 10a reads the designated array elements from the distribution data storage area 17 and writes them to the transfer destination data storage areas 18. For both partitioning method A and partitioning method B, addresses are computed by Formula 1 above. The transfer tables 16 may instead be reserved in advance, as many as correspond to the whole array. In that case, the step that reserves the table areas, step (2), is unnecessary.
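The table-building part of this procedure can be sketched for any number of dimensions. This is a minimal illustration, not the patent's implementation: the names `section_of` and `build_transfer_tables` are hypothetical, and the partition is assumed to be given as per-dimension lists of inclusive section bounds, which stands in for the designation means 13 and 14.

```python
from itertools import product


def section_of(index, bounds):
    """Return the number of the section whose inclusive range holds index."""
    for number, (lower, upper) in enumerate(bounds):
        if lower <= index <= upper:
            return number
    raise ValueError(f"index {index} lies outside every section")


def build_transfer_tables(sub_lower, sub_upper, partition):
    """Build one transfer table per block that the subarray touches.

    sub_lower, sub_upper: per-dimension inclusive bounds (i_d, j_d).
    partition: per-dimension lists of inclusive (lower, upper) section bounds.
    Returns {block number tuple: [(lower, upper) per dimension]}.
    """
    n = len(partition)
    p = [section_of(sub_lower[d], partition[d]) for d in range(n)]
    q = [section_of(sub_upper[d], partition[d]) for d in range(n)]
    tables = {}
    for r in product(*(range(p[d], q[d] + 1) for d in range(n))):
        table = []
        for d in range(n):
            sec_lower, sec_upper = partition[d][r[d]]
            # The first and last sections are clipped to the subarray bounds.
            lower = sub_lower[d] if r[d] == p[d] else sec_lower
            upper = sub_upper[d] if r[d] == q[d] else sec_upper
            table.append((lower, upper))
        tables[r] = table
    return tables


# Hypothetical partition: 10 sections of width 3 in dimension 1, one in dimension 2.
partition_b = [[(3 * s, 3 * s + 2) for s in range(10)], [(0, 49)]]
tables = build_transfer_tables((0, 0), (29, 4), partition_b)
```

With these inputs the subarray (0:29, 0:4) touches ten blocks, and the table for block (1, 0) covers 3:5 in dimension 1 and 0:4 in dimension 2.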

[0031] [2] When the same processing is implemented on a distributed-memory computer, it becomes data transmission to the processors 10b, ..., 10c that correspond to the divided blocks. The first steps (1) to (3) are the same as for the shared-memory type. Step (4) is performed as follows.
(4) The data transfer means 12 executes a data transfer for each transfer table 16. If the mechanism can transfer only one word at a time, array elements are transferred one by one while iterating over all dimensions. If the mechanism can transfer a run of consecutive addresses at once, the consecutive array elements of the first dimension can be transferred at once while iterating over the second and higher dimensions. If the mechanism supports addressing with an interval (stride), the array elements of the rectangular region spanning the first and second dimensions can be transferred at once while iterating over the third and higher dimensions. If intervals in multiple dimensions are supported, still more array elements can be transferred at once.
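The effect of the transfer mechanism on the number of transfer operations can be sketched as follows. The function names and the three mechanism labels are hypothetical, and offsets are computed in words per Formula 1 without the base address; this is an illustration rather than a description of any particular hardware.

```python
from itertools import product
from math import prod


def transfer_counts(table):
    """Number of transfer operations needed for one transfer table.

    table: [(lower, upper) per dimension], inclusive.
    Counts are given for three hypothetical mechanisms: one word at a time,
    one consecutive first-dimension run at a time, and one strided
    first-and-second-dimension rectangle at a time.
    """
    extents = [upper - lower + 1 for lower, upper in table]
    return {
        "word": prod(extents),
        "contiguous": prod(extents[1:]),
        "strided": prod(extents[2:]),
    }


def contiguous_runs(table, m):
    """Yield (word offset, run length) for each first-dimension run.

    m holds the division coefficients (m1, ..., m(n-1)).
    """
    run_length = table[0][1] - table[0][0] + 1
    outer = [range(lower, upper + 1) for lower, upper in table[1:]]
    # Reversing the ranges makes the second dimension vary fastest.
    for rest in product(*reversed(outer)):
        k = (table[0][0],) + tuple(reversed(rest))
        offset = k[-1]
        for kd, md in zip(reversed(k[:-1]), reversed(m)):
            offset = kd + md * offset
        yield offset, run_length
```

For a two-dimensional table covering 3:5 and 0:4, word-at-a-time transfer needs 15 operations, run-at-a-time transfer needs 5, and a strided mechanism needs 1.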

[0032] [3] On a shared-memory computer, the procedure for collecting an arbitrary subarray that belongs to a single block under partitioning method B from an array allocated by partitioning method A, and allocating it by partitioning method B, is as follows.
(1) Using the division section number designation means 13, the processor 10a obtains, for every dimension d = 1, ..., n, the division section number pd that corresponds to id and the division section number qd that corresponds to jd under partitioning method A. This shows that the divided blocks of partitioning method A involved are (p1:q1, ..., pn:qn).

[0033] (2) Areas for (q1 − p1 + 1) × ... × (qn − pn + 1) transfer tables 16 are reserved, one for each of these divided blocks of partitioning method A.
(3) As in the distribution of [1], all the transfer tables 16 that correspond to the divided blocks are completed.

[0034] (4) Following all the transfer tables 16, the processor 10a reads the designated array elements from the transfer source data storage areas 20 and writes them to the collection data storage area 19. For both partitioning method A and partitioning method B, addresses are computed by Formula 1 above.
[4] When collection is performed on a distributed-memory computer, it becomes data reception from the processors 10b, ..., 10c that are responsible for the divided blocks.

[0035]

[Embodiment] FIG. 2 shows a configuration example of the present invention applied to a distributed-memory parallel computer. In the example of FIG. 1, the data that the processors 10a to 10c process is divided and stored in the shared memory 15. In the example of FIG. 2, by contrast, the data is distributed among the data storage areas 31 of the distributed memories 30 provided for the respective processors 10a to 10c. In this case, the data transfer means 12 consists of a bus, a transfer device, or the like that sends and receives data between the distributed memories 30 of the processors. The rest of the configuration is almost the same as in FIG. 1.

[0036] FIG. 3 shows an example of the structure of the transfer table used in the embodiment of the present invention. At most, as many transfer tables 16 as there are divided blocks are needed. As FIG. 3 shows, each table holds, for every dimension of the array, an integer giving the lower index limit and an integer giving the upper index limit, and thereby indicates the range of the partitioned allocation.
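A transfer table of this shape can be sketched as a small record type. The class name `TransferTable` and its method are hypothetical; FIG. 3 is not reproduced here, so the sketch only assumes what the text states, namely one lower and one upper integer per dimension.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class TransferTable:
    """One transfer table: inclusive index bounds for every array dimension."""

    lower: Tuple[int, ...]
    upper: Tuple[int, ...]

    def __post_init__(self):
        if len(self.lower) != len(self.upper):
            raise ValueError("lower and upper need one entry per dimension")
        if any(lo > hi for lo, hi in zip(self.lower, self.upper)):
            raise ValueError("each lower bound must not exceed its upper bound")

    def element_count(self) -> int:
        """Number of array elements the table covers."""
        return prod(hi - lo + 1 for lo, hi in zip(self.lower, self.upper))
```

A table with lower limits (3, 0) and upper limits (5, 4) covers 3 × 5 = 15 elements.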

[0037] As an embodiment of ADE transfer, a specific example is described in which two-dimensional arrays A(30, 50) and B(30, 50) are used on a distributed-memory parallel computer with 10 processors and are partitioned as shown in FIG. 4. All elements of array A are assigned to array B.

[0038] Each processor distributes to B the whole of the subarray of A it is responsible for. For example, processor P0 sends A(0:29, 0:4) to the corresponding elements of B by the following procedure.
(1) For arrays A and B, information that relates indices to division section numbers is held, for example, in the division information table shown in FIG. 5, or in the form of a formula. The processor looks this up or evaluates it to determine the range of each division section. The division section number designation means 13 and the division section upper and lower limit designation means 14 of FIG. 1 are means for obtaining such a division information table or the like.

[0039] (2) From the section 0:29 in dimension 1 and the section 0:4 in dimension 2, the part of array B that is the transfer destination is found to span the 10 divided blocks from (0, 0) to (9, 0).
(3) Ten transfer tables 16 are prepared, one per divided block.

[0040] (4) The transfer tables 16 are completed as the transfer tables 16-0, ..., 16-9 shown in FIG. 6.
(5) The transfer device is given the necessary parameters and started. For example, with hardware that transmits from an address range with an interval on the local processor to an address range with an interval on another processor, the following parameters are set for block (1, 0) before the device is started.

[0041]
(a) Number of data items to transfer, N: (5 − 3 + 1) × (4 − 0 + 1) = 15
(b) Destination processor, P: processor P1
(c) Source start address, b1: address of A(3, 0)
    Number of consecutive data items, w1: 5 − 3 + 1 = 3
    Interval, i1: 30
(d) Destination start address, b2: address of B(3, 0)
    Number of consecutive data items, w2: 5 − 3 + 1 = 3
    Interval, i2: 3
The number of data items N is computed from the transfer table 16-1 shown in FIG. 6. The source start address b1 is determined from the lower limits in transfer table 16-1. The consecutive count w1 and the interval i1 are determined, respectively, from the consecutive length in dimension 1 of transfer table 16-1 and from the dimension-1 division width in the table of FIG. 5. The same applies to the destination.
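The derivation of these parameters from a two-dimensional transfer table can be sketched as follows. The function name `strided_parameters` and the symbolic address callables are hypothetical stand-ins; real start addresses would come from Formula 1 and the memory layout on each processor.

```python
def strided_parameters(table, src_m1, dst_m1, src_addr, dst_addr, dst_processor):
    """Derive the transfer-device parameters for a two-dimensional table.

    table: [(lower, upper) for dimension 1, (lower, upper) for dimension 2].
    src_m1, dst_m1: dimension-1 division widths on each side (the intervals).
    src_addr, dst_addr: hypothetical callables mapping indices to an address.
    """
    (lo1, hi1), (lo2, hi2) = table
    run = hi1 - lo1 + 1
    return {
        "N": run * (hi2 - lo2 + 1),
        "P": dst_processor,
        "b1": src_addr(lo1, lo2), "w1": run, "i1": src_m1,
        "b2": dst_addr(lo1, lo2), "w2": run, "i2": dst_m1,
    }


# Transfer table 16-1 for block (1, 0): A(3:5, 0:4) goes to B(3:5, 0:4).
params = strided_parameters(
    table=[(3, 5), (0, 4)],
    src_m1=30,  # A is held in slabs spanning all 30 first-dimension indices
    dst_m1=3,   # B is held in sections three indices wide
    src_addr=lambda k1, k2: ("A", k1, k2),  # symbolic stand-in for an address
    dst_addr=lambda k1, k2: ("B", k1, k2),
    dst_processor="P1",
)
```

The resulting dictionary reproduces the values listed above: N = 15, w1 = w2 = 3, i1 = 30, and i2 = 3.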

[0042] The transfer device that receives the parameters reads N data items from memory while generating the source addresses, for example by the procedure described later with reference to FIG. 10, and sends them together with the parameters N, b2, w2, and i2 to the receiving device of processor P1 or the like. The receiving device of processor P1 or the like likewise generates the destination addresses by the procedure of FIG. 10 and writes the N data items in order.
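One plausible way to generate the addresses from a start address b, a consecutive count w, an interval i, and a total count N is sketched below. The procedure of FIG. 10 is not reproduced in this passage, so this is an assumption about its general shape rather than a transcription; the function names and the list-based memories are hypothetical.

```python
def strided_addresses(b, w, i, n, word_length=1):
    """Yield n addresses: runs of w consecutive words, run starts i words apart."""
    if w <= 0 or n < 0:
        raise ValueError("w must be positive and n non-negative")
    for t in range(n):
        run, within = divmod(t, w)
        yield b + word_length * (run * i + within)


def transfer(src_memory, dst_memory, params):
    """Copy N words from the sender's memory to the receiver's memory."""
    reads = strided_addresses(params["b1"], params["w1"], params["i1"], params["N"])
    writes = strided_addresses(params["b2"], params["w2"], params["i2"], params["N"])
    for src, dst in zip(reads, writes):
        dst_memory[dst] = src_memory[src]


# Hypothetical memories: the sender holds 200 words, the receiver 20.
sender = list(range(200))
receiver = [None] * 20
transfer(sender, receiver,
         {"N": 15, "b1": 3, "w1": 3, "i1": 30, "b2": 0, "w2": 3, "i2": 3})
```

In this sketch the sender reads three consecutive words every 30 words, and the receiver, whose interval equals its run length, stores them back to back.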

[0043] As a result, the data that was allocated across processors P0 to P9 as array A in FIG. 7 is rearranged as array B. In the example above, the data of array A is rearranged into the layout of array B by distribution, but the same rearrangement can be achieved by collecting the distributed data. In that case, each of the processors P0 to P9 collects from array A the whole of the subarray of array B it is responsible for. For example, processor P0 receives the subarray B(0:2, 0:49) from the corresponding elements of array A. The procedure is as follows.
Each processor P0 to P9 collects the entire partial array of array B for which it is responsible, from array A. For example, processor P0 receives partial array B (0:2, 0:49) from the corresponding element of array A. The procedure is as follows.

[0044] (1) Same as step (1) of the distribution above.
(2) The part of array A that is the transfer source spans the divided blocks from (0, 0) to (0, 9).
(3) Ten transfer tables 16 are prepared, one per divided block.

[0045] (4) The transfer tables are filled in by the procedure shown in FIGS. 8 and 9.
(5) As in the distribution above, except that the processor requests each processor to transfer data to it and waits for the responses.
The transfer tables 16 in the above procedure are created as shown in FIGS. 8 and 9. This example divides the data of a three-dimensional array. Items (a) to (z) in the following description correspond to (a) to (z) in FIGS. 8 and 9.

[0046] (a) Set the variables i1 and j1 to the lower and upper index limits of the subarray in dimension 1. Likewise, set i2 and j2 to the lower and upper index limits in dimension 2, and i3 and j3 to the lower and upper index limits in dimension 3. Also set p1, q1, p2, q2, p3, and q3 to the division section numbers that correspond to i1, j1, i2, j2, i3, and j3, respectively.

[0047] (b) Set the variable r1 to p1.
(c) For all r2 = p2, ..., q2 and r3 = p3, ..., q3, set the dimension-1 lower limit of the transfer table table(r1, r2, r3) to the value of i1.

[0048] (d) If r1 = q1, go to step (i).
(e) Set the variable top to the upper limit of division section r1 of dimension 1.
(f) For all r2 = p2, ..., q2 and r3 = p3, ..., q3, set the dimension-1 upper limit of the transfer table table(r1, r2, r3) to the value of top.

[0049] (g) Add 1 to r1.
(h) For all r2 = p2, ..., q2 and r3 = p3, ..., q3, set the dimension-1 lower limit of the transfer table table(r1, r2, r3) to the value of top + 1. Then return to step (d).

[0050] (i) For all r2 = p2, ..., q2 and r3 = p3, ..., q3, set the dimension-1 upper limit of the transfer table table(r1, r2, r3) to the value of j1.
(j) Set the variable r2 to p2.

[0051] (k) For all r1 = p1, ..., q1 and r3 = p3, ..., q3, set the dimension-2 lower limit of the transfer table table(r1, r2, r3) to the value of i2.
(l) If r2 = q2, go to step (q).

[0052] (m) Set variable top to the upper limit of division section r2 of dimension 2.
(n) For all r1 = p1, ..., q1; r3 = p3, ..., q3, set the value of top as the upper limit of dimension 2 of transfer table table(r1, r2, r3).

[0053] (o) Add 1 to r2.
(p) For all r1 = p1, ..., q1; r3 = p3, ..., q3, set the value of top+1 as the lower limit of dimension 2 of transfer table table(r1, r2, r3). Then return to step (l).

[0054] (q) For all r1 = p1, ..., q1; r3 = p3, ..., q3, set the value of j2 as the upper limit of dimension 2 of transfer table table(r1, r2, r3).
(r) Set variable r3 to p3.

[0055] (s) For all r1 = p1, ..., q1; r2 = p2, ..., q2, set the value of i3 as the lower limit of dimension 3 of transfer table table(r1, r2, r3).
(t) If r3 = q3, proceed to step (y).

[0056] (u) Set variable top to the upper limit of division section r3 of dimension 3.
(v) For all r1 = p1, ..., q1; r2 = p2, ..., q2, set the value of top as the upper limit of dimension 3 of transfer table table(r1, r2, r3).

[0057] (w) Add 1 to r3.
(x) For all r1 = p1, ..., q1; r2 = p2, ..., q2, set the value of top+1 as the lower limit of dimension 3 of transfer table table(r1, r2, r3). Then return to step (t).

[0058] (y) For all r1 = p1, ..., q1; r2 = p2, ..., q2, set the value of j3 as the upper limit of dimension 3 of transfer table table(r1, r2, r3). This completes the creation of the transfer table. The transfer device that transfers the data determines the transfer source or transfer destination addresses using the processing logic of FIG. 10, whose steps are listed below.
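The table-creation steps (a) to (y) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the dictionary used for the table, and the `section_uppers` lists (the upper limit of every division section, standing in for the division information table) are hypothetical. Steps (h), (p), and (x) are read here as setting the lower limit of the next section to top+1. Because the loop is the same for each dimension, the sketch runs it once per dimension and then combines the per-dimension ranges for every block.

```python
from itertools import product


def section_of(index, uppers):
    """Return the 1-based number of the division section containing index.

    uppers[k] is the last index of section k+1 (hypothetical layout).
    """
    for number, upper in enumerate(uppers, start=1):
        if index <= upper:
            return number
    raise ValueError("index lies outside the divided array")


def build_transfer_table(bounds, section_uppers):
    """Build {(r1, r2, ...): [(low1, high1), (low2, high2), ...]}.

    bounds:         inclusive (i, j) index range of the partial array,
                    one pair per dimension.
    section_uppers: per dimension, the upper limit of each division section.
    """
    per_dimension = []
    for (i, j), uppers in zip(bounds, section_uppers):
        # Step (a): section numbers of the lower and upper limits.
        p, q = section_of(i, uppers), section_of(j, uppers)
        ranges = {}
        r, low = p, i                  # steps (b), (c): first lower limit is i
        while r != q:                  # step (d)
            top = uppers[r - 1]        # step (e)
            ranges[r] = (low, top)     # step (f)
            r += 1                     # step (g)
            low = top + 1              # step (h), read as the next lower limit
        ranges[r] = (low, j)           # step (i): last upper limit is j
        per_dimension.append(ranges)

    # "For all r2, r3": a range in one dimension is shared by every block
    # that has the same section number in that dimension.
    table = {}
    for block in product(*(sorted(ranges) for ranges in per_dimension)):
        table[block] = [per_dimension[d][r] for d, r in enumerate(block)]
    return table


if __name__ == "__main__":
    example = build_transfer_table(
        bounds=[(1, 6), (1, 7), (2, 3)],
        section_uppers=[[2, 5, 8], [4, 8, 12], [2, 4]],
    )
    for block, ranges in sorted(example.items()):
        print(block, ranges)
```

Each table entry holds, for one divided block, the index range of the partial array that falls inside that block, which is the information the data transfer means needs per block.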

[0059] (1) Let the number of data items be N, the start address be b, the number of consecutive data items be w, and the interval be i.
(2) Set variable k0 to 0, variable d to (i - w), and address addr to b.
(3) Set variable k1 to 0.

[0060] (4) When variable k0 reaches N, end the processing.
(5) Access address addr: read (Read) at the transfer source, write (Write) at the transfer destination.
(6) Advance address addr by one. Add 1 to each of variables k0 and k1.

[0061] (7) Until variable k1 reaches w, return to the access steps above and repeat the processing.
(8) When variable k1 reaches w, add d to address addr, then return to step (3) and repeat the processing in the same way.
The transfer device that transfers the data can easily be built as hardware that determines addresses by the above processing logic. Alternatively, it can be implemented in software on each processor.
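The address logic of FIG. 10 can be sketched as a small Python generator. It is a minimal illustration, not the hardware design: the function name is hypothetical, and it assumes that the inner loop goes back to the test of k0 against N before every access, so the transfer stops after exactly N items even when N is not a multiple of w.

```python
def strided_addresses(n, b, w, i):
    """Yield the n addresses visited by the transfer logic.

    n: number of data items, b: start address,
    w: number of consecutive items, i: interval between runs.
    """
    k0, d, addr = 0, i - w, b      # set up counters, gap d, and start address
    while True:
        k1 = 0                     # start a new run of consecutive items
        while True:
            if k0 == n:            # all items transferred
                return
            yield addr             # read (source) or write (destination) here
            addr += 1
            k0 += 1
            k1 += 1
            if k1 == w:            # run finished: skip the gap of d addresses
                addr += d
                break


def gather(memory, n, b, w, i):
    """Read n items from a list-like memory using the strided pattern."""
    return [memory[a] for a in strided_addresses(n, b, w, i)]
```

For example, `list(strided_addresses(5, 100, 2, 4))` visits addresses 100, 101, 104, 105, 108: runs of two consecutive addresses whose starting points are four apart.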

[0062] The present invention is applied to the initialization of a divided array as follows. Initializing a divided array with values written in a file or in object code can be achieved by regarding the initialization data sequence as an array divided 1×…×1 and having each processor perform distribution or collection. For input from a sequential file, however, the processing must be serialized among the processors. Mutual exclusion may also be necessary.

[0063] The present invention is applied to the output of a divided array as follows. Output of a divided array to a file or printer can be achieved by regarding the output destination as an array divided 1×…×1 and applying the distribution according to the present invention on each processor. For output to a sequential file, or for printer output in which the output order is significant, however, the processing must be serialized among the processors. Mutual exclusion may also be necessary.

[0064] FIGS. 11 and 12 show an example of divisional allocation to which the present invention applies and an example of how the data is arranged in actual memory. For example, when the rectangular partial array X(1:6, 1:7) of the two-dimensional array X(8, 12) shown in FIG. 11(a) is divided with section widths 2, 3, 3 in order along the first dimension and section widths 4, 4, 4 in order along the second dimension, the resulting division is as shown in FIG. 11(b).
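The division in this example can be checked with a short Python sketch. The helper names are hypothetical, and indices are taken to be 1-based and inclusive, as in the X(1:6, 1:7) notation.

```python
from itertools import accumulate, product


def sections_from_widths(widths, first=1):
    """Turn section widths into inclusive (low, high) index ranges."""
    edges = list(accumulate(widths, initial=first - 1))   # e.g. [0, 2, 5, 8]
    return [(low + 1, high) for low, high in zip(edges, edges[1:])]


def overlap(sub, sections):
    """Map section number -> part of the inclusive range `sub` inside it."""
    low, high = sub
    return {
        number: (max(low, s_low), min(high, s_high))
        for number, (s_low, s_high) in enumerate(sections, start=1)
        if max(low, s_low) <= min(high, s_high)
    }


dim1 = sections_from_widths([2, 3, 3])   # sections of the first dimension
dim2 = sections_from_widths([4, 4, 4])   # sections of the second dimension
rows = overlap((1, 6), dim1)             # X(1:6, ...) touches 3 sections
cols = overlap((1, 7), dim2)             # X(..., 1:7) touches 2 sections
blocks = list(product(rows, cols))       # blocks the partial array spans
```

With these widths the first dimension splits into 1:2, 3:5, 6:8 and the second into 1:4, 5:8, 9:12, so the partial array touches three sections in the first dimension and two in the second, that is, six of the nine blocks.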

[0065] If the data in each block of this 3×3 division is allocated to consecutive addresses in memory, the memory allocation is as shown in (a) to (f) of FIG. 12. The partial array spans six blocks and ends up distributed over seemingly irregular addresses. However, in a distributed-memory machine, for example, each processor can take into account the memory allocation that follows from the division method and process the data efficiently.

[0066]

[Effect of the invention] As explained above, according to the present invention, transfers between arrays with different division shapes, and input/output processing on divided arrays, can be performed simply and efficiently in a unified manner. In particular, data transfer processing on distributed-memory parallel computers can be made more efficient, which greatly reduces the burden on user programs and the like.

[Brief explanation of drawings]

FIG. 1 is a diagram showing the principle configuration of the present invention.

FIG. 2 is a diagram showing another configuration example of the present invention.

FIG. 3 is a diagram showing a configuration example of a transfer table according to an embodiment of the present invention.

FIG. 4 is an explanatory diagram of redistribution according to an embodiment of the present invention.

FIG. 5 is a diagram showing an example of a division information table used in an embodiment of the present invention.

FIG. 6 is a diagram showing an example of a transfer table according to an embodiment of the present invention.

FIG. 7 is a diagram showing an example of memory allocation according to an embodiment of the present invention.

FIG. 8 is an explanatory diagram of transfer table creation processing according to an embodiment of the present invention.

FIG. 9 is an explanatory diagram of transfer table creation processing according to an embodiment of the present invention.

FIG. 10 is an explanatory diagram of data transfer control according to an embodiment of the present invention.

FIG. 11 is a diagram showing an example of divisional allocation related to the present invention.

FIG. 12 is a diagram showing an example of memory allocation related to the present invention.

FIG. 13 is a diagram showing an example of dividing array data in a parallel computer.

[Explanation of symbols]

10a to 10c  Processor
11  Transfer table creation means
12  Data transfer means
13  Division section number designation means
14  Division section upper/lower limit designation means
15  Shared memory
16  Transfer table
17  Distribution data storage area
18  Transfer destination data storage area
19  Collection data storage area
20  Transfer source data storage area

Claims

[Claims]

Claim 1: In a computer in which a plurality of processors (10) each process the data allocated to them, an array data distribution/collection processing device that divides all or part of data input from an input device, or of array data stored in a storage device, and allocates it to each storage space or each processor (10), the device comprising: transfer table creation means (11) for creating a transfer table (16) having information on the range of each dimension corresponding to the divided blocks, divided in one or more arbitrary dimension directions, that are to be allocated to each storage space or each processor (10); and data transfer means (12) for transferring the data of each divided block to each storage space or each processor (10) based on the created transfer table (16), whereby the data is automatically distributed according to an arbitrary transfer pattern.

Claim 2: In a computer in which a plurality of processors (10) each process the data allocated to them, an array data distribution/collection processing device that collects all or part of the data distributed over each storage space or each processor (10) into one storage space or one processor (10), the device comprising: transfer table creation means (11) for creating a transfer table (16) having information on the range of each dimension of the data to be collected, corresponding to the divided blocks, divided in one or more arbitrary dimension directions, that are distributed over each storage space or each processor (10); and data transfer means (12) for transferring the data to be collected, for each divided block, to the one storage space or the one processor (10) based on the created transfer table (16), whereby the data is automatically collected according to an arbitrary transfer pattern.