JPH02733B2


Info

Publication number: JPH02733B2
Application number: JP59006581A
Authority: JP
Inventors: Tsutomu Hoshino; Tomonori Shirakawa; Toshio Kawai
Original assignee: Shingijutsu Kaihatsu Jigyodan
Current assignee: Shingijutsu Kaihatsu Jigyodan
Priority date: 1984-01-18
Filing date: 1984-01-18
Publication date: 1990-01-09
Also published as: JPS60151776A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field of the Invention]

The present invention relates to a composite computer system, and in particular to means for efficiently communicating with and controlling computers arranged in parallel.

[Technical Background]

Many scientific and engineering problems involve a system made up of a large number of points distributed in physical space, where each point interacts with its neighboring points, and require calculating how the value at each point changes over time under given conditions. Weather and flood prediction are examples. Because these problems involve an enormous amount of computation, the computers that process them must be particularly fast.

A composite computer system in which a large number of computers are arranged in a two-dimensional array, with adjacent computers connectable to each other, is well suited to such problems. By mapping each point in the physical space described above to an element computer in the two-dimensional array and operating the element computers simultaneously in parallel, high-speed processing becomes possible. In conventional composite computer systems of this type, however, communication between the host computer that supervises the whole system and the element computers in the two-dimensional array, and communication among the element computers themselves, takes considerable time, which limits how far the processing of the system as a whole can be accelerated. There was also room for technical improvement in areas such as synchronization control of the element computers.

[Purpose of the invention]

An object of the present invention is to improve the communication and synchronization control schemes in a composite computer system consisting of a plurality of computers arranged in a two-dimensional array, and thereby to realize a faster and more reliable system.

[Configuration and Embodiments of the Invention]

The configuration of the present invention is described below with reference to one embodiment.

(1) Overall configuration of the system

FIG. 1 is an overall configuration diagram of a composite computer system embodying the present invention. In the figure, 1 is a PU array in which 32 processing units PU are arranged two-dimensionally in 8 rows and 4 columns, 2 is a control unit, 3 is a host computer, 4 is a digital input/output line DI/O, 5 is a disk pack storage device DP, 6 is a magnetic tape device MT, 7 and 8 are display terminal devices CRT, 9 is a line printer LP, 10 is a communication control device COM, 11 is a display device CRT, 12 is a keyboard input device KB, 13 is a printer PR, and 14 is a cassette tape device CMT.

The PU array 1 executes multiple tasks in parallel. As described later, each PU has essentially the same functions as a single-board microcomputer. The number of PUs making up the array is not limited to 32; in general, any suitable number may be chosen. The PU in row m and column n is denoted (m, n).

The control unit 2 is a single microcomputer. It controls the PU array 1 and performs data communication with the host computer 3 or the input/output devices 7 to 10 via the digital input/output line DI/O.

The host computer 3 is a general-purpose minicomputer. It compiles or assembles source programs and loads the resulting object programs into the control unit 2 and the PUs. It also starts the parallel tasks, transfers the necessary data to and from the control unit 2, and outputs the processing results.

(2) Configuration of the PU array

FIG. 2 is a detailed configuration diagram of one processing unit PU, located in row m and column n of the PU array 1. In the figure, 16 is a microprocessor MPU, 17 is an arithmetic unit APU, 18 is a control register CR, 19 is a status register SR, 20 is a local memory LM, 21 is a program memory PM, 22 is a result memory RM, 22a is a synchronization register SC, 23 is the front communication memory CM, 24 is the rear communication memory CM, 25 is the left communication memory CM, and 26 is the right communication memory CM.

The microprocessor MPU 16 executes application programs and performs 8-bit fixed-point arithmetic and logical operations, data transfers between memories, control of the arithmetic unit APU 17, and so on.

The arithmetic unit APU 17 performs 16-bit and 32-bit fixed-point arithmetic, 32-bit floating-point arithmetic, and elementary function calculations such as logarithms and square roots. Transfers to the APU 17 and the start of its operations are completely controlled by the MPU 16. The data themselves, however, can be transferred directly between the APU and memory without passing through the MPU registers.

The control register CR 18 is used to transfer control words from the control unit 2 to the PU.

The status register SR 19 is used to report the state of the PU to the control unit 2. This register is written by the PU and can be read only by the control unit 2.

The local memory LM 20 is used to store the PU's local data and programs.

The program memory PM 21 is used to store programs and read-only data.

The result memory RM 22 is shared by the PU and the control unit 2 and is used to transfer data between them.

The synchronization register SC 22a is used to synchronize the processing of the PUs in the PU array.

The communication memories CM 23 to 26 are shared with the adjacent PUs to the front, rear, left, and right, respectively, and are used for data communication with each of those PUs.

Each PU is synchronized by the system clock, which is supplied by the control unit 2. To avoid contention between PUs, access to a CM shared by adjacent PUs is controlled so that even-numbered PUs access it during one half of the system clock cycle and odd-numbered PUs access it during the other half.
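The half-cycle access rule can be illustrated with a small Python sketch. The text does not say how PUs are classified as even or odd in a two-dimensional array, so the sketch assumes checkerboard parity of (m + n), which guarantees that any two adjacent PUs fall in different classes. The function names and the 8 by 4 array size used as defaults are illustrative only.

```python
def may_access_cm(m: int, n: int, half_cycle: int) -> bool:
    """Return True if PU (m, n) may access a shared CM in this half cycle.

    Assumption: a PU is "even" when m + n is even (checkerboard parity).
    Even PUs use half cycle 0, odd PUs use half cycle 1.
    """
    return (m + n) % 2 == half_cycle % 2


def neighbors(m: int, n: int, rows: int = 8, cols: int = 4):
    """Yield the front, rear, left and right neighbors that exist in the array."""
    for dm, dn in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        mm, nn = m + dm, n + dn
        if 0 <= mm < rows and 0 <= nn < cols:
            yield mm, nn
```

Under this assumed numbering, two PUs that share a CM are never granted access in the same half cycle.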

(3) Data transfer between adjacent PUs

Data transfer between adjacent PUs is performed using the front, rear, left, and right communication memories CM 23 to 26 described above. Conventionally, when data in the local memory LM 20 was to be transferred to an adjacent PU, the MPU 16 read the data from the LM 20 and stored it in the communication memories CM 23, 24, 25, and 26 one after another.

Because this makes communication take too long, in the present invention, when the MPU 16 stores data in the LM 20 and the data is to be transferred to an adjacent PU, a special address (or address region) is specified so that the data is simultaneously written to the CMs 23 to 26 as well.

FIG. 3 shows an example of the LM and CM address space as seen from the MPU 16 of one PU. Addresses 0 to 99 are a region in which only the LM 20 is accessed; addresses 100 to 199 are a region in which the front CM 23 and the LM 20 are accessed in common; addresses 200 to 299 are a region in which the rear CM 24 and the LM 20 are accessed in common; addresses 300 to 399 and addresses 400 to 499 are regions in which the left CM 25 and the right CM 26, respectively, are accessed in common with the LM 20; and addresses 500 to 599 are a region in which the front, rear, left, and right CMs 23 to 26 and the LM 20 are all accessed in common.
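A minimal Python sketch of this address decoding follows, using the region boundaries of the FIG. 3 example. The class and function names are hypothetical, and the sketch assumes that a communication memory stores the value under the same address as the LM, which the text does not specify.

```python
# Address regions of the FIG. 3 example: (low, high, CMs written along with the LM).
REGIONS = [
    (0, 99, ()),
    (100, 199, ("front",)),
    (200, 299, ("rear",)),
    (300, 399, ("left",)),
    (400, 499, ("right",)),
    (500, 599, ("front", "rear", "left", "right")),
]


def cm_targets(address: int) -> tuple:
    """Return the communication memories that a write to this address also reaches."""
    for low, high, targets in REGIONS:
        if low <= address <= high:
            return targets
    raise ValueError(f"address {address} is outside the example address space")


class PU:
    """Toy model of one PU: a local memory plus four communication memories."""

    def __init__(self):
        self.lm = {}
        self.cm = {d: {} for d in ("front", "rear", "left", "right")}

    def write(self, address: int, value) -> None:
        targets = cm_targets(address)   # validate the address before storing
        self.lm[address] = value        # every write lands in the LM
        for direction in targets:
            # Assumption: the CM keeps the value under the same address.
            self.cm[direction][address] = value
```

A single store to an address in 500 to 599 therefore reaches the LM and all four CMs in one operation, instead of one LM read followed by four separate CM writes.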

As a result, the MPU 16 in each PU can communicate freely with adjacent PUs simply by choosing the address region used for a write to the LM 20 according to the direction of the adjacent PU to which the data should be transferred.

In this case, the PU array is not limited to a two-dimensional arrangement; the scheme can be applied to arrays of any dimension, such as one or three dimensions.

(4) Data transfer between the memories of multiple PUs, or between PU memory and peripheral devices

Data is transferred by direct memory access (DMA) between the memories of multiple PUs in the PU array 1, or between those memories and peripheral devices.

Conventional DMA controllers were built to transfer data between the memory of a single PU and a peripheral device, so the following method has been used to apply them to multiple PUs.

FIG. 4 is a schematic diagram of that method. In the figure, 1 is the PU array, 2 is the control unit, 27 is a PU connection switching circuit, 28 is a common bus, and 29 is a data transfer control device having a DMA controller function. Under software control, the control unit 2 controls the PU connection switching circuit 27 provided for each PU and selects the PU or group of PUs to be read from or written to. As an example, the case of broadcasting the memory contents of PU (01) to the memories of all PUs, PU (00) to PU (73), is shown below.

① First, the control unit 2 controls the PU connection switching circuit 27 to connect the common bus 28 to PU (01).

② Next, it informs the data transfer control device 29, which has the DMA controller function, of values such as the memory address at which reading should start and the amount of data, and passes control to the device 29.

③ The data transfer control device 29 then reads the data from the memory of PU (01), stores it temporarily in, for example, the memory of the control unit 2, and returns control to the control unit 2.

④ The control unit 2 controls the PU connection switching circuit 27 to connect the common bus 28 to all PUs, PU (00) to PU (73).

⑤ The control unit 2 then informs the data transfer control device 29 of the address to which the data should be written and other values, and passes control to it.

⑥ The data transfer control device 29 writes the previously stored data to the memories of all PUs simultaneously and then releases control.

Through these six steps, the memory contents of PU (01) are copied to the memories of all PUs.

Although this method allows PUs to be selected and specified relatively freely, it has the drawback that control takes time and the efficiency of the whole system is reduced.

To overcome this drawback of the conventional method, the present invention provides, within the data transfer control device 29, indicators that work in conjunction with the data read/write address indicators and the data counter to automatically select and control the PU or group of PUs to be read from or written to. This shortens the time needed to control data transfers and improves the processing efficiency of the PU array.

FIG. 5 is a configuration diagram of one embodiment of the data transfer control device according to the present invention. In the figure, 31 is a read PU indicator, 32 is a write PU indicator, 33 is a read address indicator, 34 is a write address indicator, 35 is a data latch, 36 is a read/write switcher, 37 is a transfer data number indicator, 38 is a transfer data number counter, 39 is a comparator, 40 is an instruction register, 41 is an instruction interpreter, 42 to 45 are arithmetic units, and 46 to 48 are switches. The function of each part is described below.

Read PU indicator 31 and write PU indicator 32: These are registers that hold codes specifying the PU from which data is read and the PU to which data is written, respectively. When such a code is output to the PU selection bus, the address bus and data bus are connected to the specified PU, and data in that PU can be read or written. The arithmetic units 42 and 43 attached to the read PU indicator 31 and the write PU indicator 32 compute the code of the next PU to be read or written once reading or writing of a given PU has finished.

Read address indicator 33 and write address indicator 34: These are registers that hold the address from which data is read and the address to which data is written, respectively. The arithmetic units 44 and 45 attached to the read address indicator 33 and the write address indicator 34 compute the next address to be read or written once reading or writing at a given address has finished.

Data latch 35: Temporarily holds the data being transferred.

Read/write switcher 36: When the device transfers data, this switcher alternates between the read operation and the write operation. To read data, switches 46, 47, and 48 are thrown to the upper position, connecting the read PU indicator to the PU selection bus and the read address indicator to the address bus; the data latch 35 is put in the input state and the READ/WRITE signal in the READ state. In this state, data is read from the PU and held temporarily in the data latch 35. To write data, switches 46, 47, and 48 are thrown to the lower position, connecting the write PU indicator to the PU selection bus and the write address indicator to the address bus; the data latch is put in the output state and the READ/WRITE signal in the WRITE state. In this state, data is transferred from the data latch 35 to the PU.

Transfer data number indicator 37: A register that holds the number of data items to be transferred.

Transfer data number counter 38: A register that counts and holds the number of data items transferred so far.

Comparator 39: Compares the contents of the transfer data number indicator 37 with those of the transfer data number counter 38, that is, the number of data items to be transferred with the number already transferred, and detects when the required number of items has been transferred.

Instruction register 40: A register that holds the instruction specifying the data transfer procedure to be carried out. The instruction is decoded by the instruction interpreter 41, which tells each of the arithmetic units 42 to 45, with appropriate timing, what operation to perform.

For example, to move eight data items starting at address 100 in the memory of each of PUs No. 5, No. 10, and No. 15 to address 300 in the memories of all PUs, 5 is stored in the read PU indicator 31, a code representing all PUs is stored in the write PU indicator 32, 100 and 300 are stored in the read address indicator 33 and the write address indicator 34, respectively, and 8 is stored in the transfer data number indicator 37. When an instruction specifying this transfer procedure is then stored in the instruction register 40, the instruction is decoded by the instruction interpreter and execution begins.

First, the transfer data number counter 38 is cleared, and data is read from address 100 of PU No. 5 and written to address 300 of all PUs. The read address indicator 33, the write address indicator 34, and the transfer data number counter 38 are then each incremented by 1. Since the read address indicator 33 and the write address indicator 34 now hold 101 and 301, respectively, the next transfer is from address 101 of PU No. 5 to address 301 of all PUs. After this has been repeated eight times, the values of the transfer data number indicator 37 and the transfer data number counter 38 become equal; the comparator 39 detects the match and signals the instruction interpreter 41. At this point the instruction interpreter 41 adds 5 to the read PU indicator 31, stores the original value 100 in the read address indicator 33, clears the transfer data number counter 38, and continues execution. When the data transfer from PU No. 15 has finished, completion of the whole instruction is detected and the bus is released.
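The register-level behavior of this example can be traced with a short Python simulation. It is a sketch under stated assumptions, not the circuit itself: PU memories are modeled as dictionaries, "all PUs" is modeled as a loop over every memory, and the function name and the `last_pu` stop condition are hypothetical, since the text does not say how the device recognizes that PU No. 15 is the last source. Because the text resets only the read address after each PU, the sketch lets the write address keep advancing.

```python
def run_transfer(memories, first_pu, pu_step, last_pu,
                 read_start, write_start, count):
    """Simulate the FIG. 5 example and return the final write address."""
    write_addr = write_start            # write address indicator 34
    read_pu = first_pu                  # read PU indicator 31
    while True:
        read_addr = read_start          # read address indicator 33 (reloaded)
        transferred = 0                 # transfer data number counter 38 (cleared)
        while transferred != count:     # comparator 39 against indicator 37
            latch = memories[read_pu][read_addr]   # data latch 35, READ phase
            for mem in memories:                   # WRITE phase to all PUs
                mem[write_addr] = latch
            read_addr += 1
            write_addr += 1
            transferred += 1
        if read_pu >= last_pu:          # assumed completion test
            break
        read_pu += pu_step              # arithmetic unit 42 adds 5
    return write_addr
```

With 32 PUs, `run_transfer(memories, 5, 5, 15, 100, 300, 8)` copies the eight items of PU No. 5 to addresses 300 to 307, those of PU No. 10 to 308 to 315, and those of PU No. 15 to 316 to 323 in every PU.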

In a composite computer system, it is also often necessary to copy a few data items scattered in each PU to all PUs. FIG. 6 shows the configuration of another embodiment of the data transfer control device, which performs this kind of copy.

Listing only the elements in FIG. 6 that differ from FIG. 5, 49 is a read start address latch, 50 is a write start address latch, 51 is a transfer PU number indicator, 52 is a transfer PU number counter, 53 is a comparator, and 54 to 56 are +1 adders. The operation of the device is described below.

① The number (code) of the PU from which reading is to start is stored in the read PU indicator 31, and a code representing all PUs is stored in the write PU indicator 32. The read start address and the write start address are stored in the read start address latch 49 and the write start address latch 50, respectively. The number of data items is stored in the transfer data number indicator 37, the total number of PUs is stored in the transfer PU number indicator 51, and a copy instruction is stored in the instruction register 40.

② Storing the instruction starts execution. The write start address is transferred to the write address indicator 34.

③ The transfer data number counter 38 is cleared, and the read start address is transferred to the read address indicator 33.

④ Data is read from the address specified by the read address indicator 33 in the PU specified by the read PU indicator 31, and written to the address specified by the write address indicator 34 in the PUs specified by the write PU indicator 32.

⑤ The read address indicator 33, the write address indicator 34, and the transfer data number counter 38 are each incremented by 1. If the values of the transfer data number indicator 37 and the transfer data number counter 38 are not equal, the procedure returns to ④.

⑥ When the values of the transfer data number indicator 37 and the transfer data number counter 38 become equal, the read PU indicator 31 and the transfer PU number counter 52 are each incremented by 1. If the values of the transfer PU number indicator 51 and the transfer PU number counter 52 are not equal, the procedure returns to ③.

⑦ If the values of the transfer PU number indicator 51 and the transfer PU number counter 52 are equal, the transfer ends, the bus is released, and control is returned to the parent computer or the PU array.
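The copy procedure of FIG. 6 amounts to two nested counting loops, which the following Python sketch makes explicit. It is an illustration under assumptions: memories are modeled as dictionaries, the write PU code is taken to mean every PU, the transfer PU number counter 52 is assumed to start at zero, and the function name is hypothetical.

```python
def copy_to_all(memories, start_pu, read_start, write_start,
                data_count, pu_count):
    """Gather data_count items from each of pu_count PUs into every PU."""
    if data_count < 1 or pu_count < 1:
        raise ValueError("data_count and pu_count must be at least 1")
    read_pu = start_pu              # read PU indicator 31
    write_addr = write_start        # write address indicator 34, from latch 50
    pus_done = 0                    # transfer PU number counter 52 (assumed cleared)
    while True:
        data_done = 0               # transfer data number counter 38 cleared
        read_addr = read_start      # read address indicator 33, from latch 49
        while True:
            value = memories[read_pu][read_addr]   # read from the selected PU
            for mem in memories:                   # write to all PUs
                mem[write_addr] = value
            read_addr += 1          # +1 adders
            write_addr += 1
            data_done += 1
            if data_done == data_count:            # comparator 39
                break
        read_pu += 1
        pus_done += 1
        if pus_done == pu_count:                   # comparator 53
            break
    return write_addr
```

After the call, every PU holds the items of all PUs laid out consecutively from the write start address, PU by PU.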

The read PU indicator 31 and the write PU indicator 32 can also be made capable of specifying the computer in the control unit. In this way, data can be transferred efficiently between any PU and the memory of the control unit or a peripheral device, with the data layout matched to the structure of the PU array.

(5) Coupling of the PU array and the host computer

In the present invention, a simple, high-speed bus coupling means is used for data references between the PU array 1 and the host computer 3 shown in FIG. 1. For the PU array, one can consider the logical structure of the array as seen from the host computer, separately from how the PUs are actually physically interconnected.

FIG. 7 is an example of a PU array with a one-dimensional structure in which eight PUs, PU(0) to PU(7), are arranged in a single line. The data handled in the host computer can likewise be regarded as having a logical structure. FIG. 8 is an example of 8×8 two-dimensional matrix data. When this data is divided among the PU array of FIG. 7 for processing, it can be divided in various ways. For example, PU(0) may take the column a₁₁ to a₈₁, PU(1) the column a₁₂ to a₈₂, and so on, with each PU handling one column. In this case, the data is stored in the host computer's memory or peripheral device in the order of the parenthesized numbers in the matrix of FIG. 8, whereas in the PU array it is stored in that order at the eight addresses (0) to (7) of each PU.

As a slightly more complex example, consider a PU array with a 2×4 two-dimensional structure, as shown in FIG. 9. In this case, the 8×8 matrix data can be divided into eight 4×2 submatrices, one assigned to each PU. The order (addresses) of the data assigned to, for example, PU(0) and PU(1) is then offset as shown in FIG. 10(a) and (b), respectively. In the same way, the data order in each of the other PUs is offset by a fixed amount from the data order in the host computer.
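The offset between the two orderings can be seen by listing which host positions end up in a given PU. The Python sketch below assumes that the host stores the 8×8 matrix column by column (a₁₁ to a₈₁ first, then a₁₂ to a₈₂), counts positions from 0, and stores each 4×2 block column by column inside its PU. The figures are not reproduced here, so these conventions and the function name are assumptions.

```python
def host_offsets_for_pu(pu_row, pu_col, matrix_rows=8,
                        block_rows=4, block_cols=2):
    """Return the 0-based host offsets of the block held by PU (pu_row, pu_col).

    Offsets are listed in the order the items are stored inside the PU.
    """
    offsets = []
    for c in range(block_cols):          # column inside the block
        for r in range(block_rows):      # row inside the block
            row = pu_row * block_rows + r
            col = pu_col * block_cols + c
            offsets.append(col * matrix_rows + row)   # column-major host order
    return offsets
```

Under these assumptions the PU in the first row and first column holds host offsets 0 to 3 and 8 to 11, while the PU directly below it holds 4 to 7 and 12 to 15, so consecutive addresses inside a PU do not correspond to consecutive host addresses.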

In the present invention, a conversion circuit generates this offset automatically by simple operations on the address line and PU selection line signals, so that corresponding data can be referenced at high speed between the host computer and the PU array.

FIG. 11 shows an embodiment of the address conversion circuit for the case where the 8×8 matrix data of FIG. 8 in the host computer is assigned to the 2×4 PU array structure of FIG. 9. In the figure, 1 is the PU array, 3 is the host computer, 61 is an address line, 62 is a subtraction circuit that subtracts the address of a₁₁ in the host computer 3 (the start address) from the address line signal, 63 is a division circuit, 64 is an addition circuit, 65 is a PU selection line, and 66 is a PU address line. The circuits 62, 63, and 64 make up the conversion circuit and are placed in the control unit 2. The division circuit 63 divides the address signal by the number of rows of data in one PU, which is 4 according to FIG. 10, to obtain the quotient b₁ and the remainder b₀; divides the quotient b₁ by the number of PUs in the vertical direction, which is 2, to obtain the quotient b₃ and the remainder b₂; and divides the quotient b₃ by the number of columns of data in one PU, which is 2 according to FIG. 10, to obtain the quotient b₅ and the remainder b₄. It outputs b₀, b₂, b₄, and b₅.

In general, when the PU in row m and column n of the PU array is denoted PU(m, n), b₅ and b₂ select PU(b₅, b₂). If, however, the PUs are numbered one-dimensionally from 0 to 7 as shown in FIG. 11, the PU number is

b₅ × (number of rows in the PU array) + b₂.

b₄ and b₀ indicate that the two-dimensional position of the selected data within the PU selected by b₅ and b₂ is (b₄, b₀), that is, row b₄ and column b₀. Expressed as a one-dimensional address, this is

b₄ × (number of rows of the data array in the PU) + b₀ + (base address in the PU).

Here, as shown in FIG. 11, let A₀ be the address value on the address line 61, A₁ the output address value of the subtraction circuit 62, A₂ the value formed from b₄ and b₀, A₃ the value formed from b₅ and b₂, which is also the value on the PU selection line 65, and A₄ the value on the PU address line 66. A₁, A₂, A₃, and A₄ are then given by the following expressions.

A₁ = A₀ − (start address in the host computer)
b₀ = remainder of A₁ / (number of rows of the data array in one PU)
b₁ = quotient of A₁ / (number of rows of the data array in one PU)
b₂ = remainder of b₁ / (number of rows of the PU array in the vertical direction)
b₃ = quotient of b₁ / (number of rows of the PU array in the vertical direction)
b₄ = remainder of b₃ / (number of columns of the data array in one PU)
b₅ = quotient of b₃ / (number of columns of the data array in one PU)
A₂ = b₄ × (number of rows of the data array in one PU) + b₀
A₃ = b₅ × (number of rows of the PU array) + b₂
A₄ = A₂ + (base address in the PU)

In the embodiment of FIG. 11, the lowest two bits of the address line 61 specify, as b₀, one of the four data positions (order) within each column of the 4×2 matrix data assigned to each PU as shown in FIG. 10(a) and (b).

The third bit from the bottom of address line 61 serves as b₂ and specifies whether the PU is in the first or the second row of the 2×4 PU array.

The fourth bit from the bottom of address line 61 serves as b₄ and specifies the first or the second column of the 4×2 matrix data in each PU.

The fifth and sixth bits from the bottom of address line 61 serve as b₅ and specify one of the four column positions in the PU array.
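The bit assignment described for the FIG. 11 embodiment can be sketched in Python as follows. This is only an illustration of the field layout, not the circuit itself: the function name `split_address` is hypothetical, and the 6-bit width is an assumption derived from the example (8×8 data, 2×4 PU array, 4×2 data block per PU).

```python
def split_address(a1: int) -> dict:
    """Split a 6-bit offset A1 into the fields b0, b2, b4, b5 (hypothetical helper)."""
    if not 0 <= a1 < 64:
        raise ValueError("A1 must fit in 6 bits for the 8x8 example")
    return {
        "b0": a1 & 0b11,         # bits 0-1: position within a data column (0..3)
        "b2": (a1 >> 2) & 0b1,   # bit 2: PU row in the 2x4 PU array
        "b4": (a1 >> 3) & 0b1,   # bit 3: data column inside the PU
        "b5": (a1 >> 4) & 0b11,  # bits 4-5: PU column in the PU array
    }


fields = split_address(0b101101)
```

Because every field boundary falls on a bit boundary, the fields are obtained by selecting address lines only; no arithmetic is needed in this configuration.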

In this embodiment, the numbers of data rows, data columns, PU rows and PU columns are all powers of 2, so the operation of division circuit 63 reduces to a mere rearrangement of address lines. For general numbers of rows and columns, however, the divisions described above are required. The computed b₅ and b₂ are used as the PU selection line signal, and adder circuit 64 adds the base address of this data group within the PU to b₄ and b₀; its output is used as the PU address line signal. With this mapping, host computer 3 can quickly reference the data distributed across the PU array as the structure shown in FIG. 8.
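For general row and column counts, the chain of divisions can be sketched as below. This is a minimal sketch under stated assumptions: the function name `convert_address`, its parameter names, and the base addresses in the usage lines are hypothetical, and A₃ follows the one-dimensional PU numbering b₅ × (number of rows of the PU array) + b₂ given in the text.

```python
def convert_address(a0, host_base, data_rows, data_cols, pu_rows, pu_base):
    """Return (A3, A4): the PU selection value and the address inside that PU."""
    a1 = a0 - host_base              # A1: offset from the host start address
    b1, b0 = divmod(a1, data_rows)   # b0: position within a data column
    b3, b2 = divmod(b1, pu_rows)     # b2: PU row
    b5, b4 = divmod(b3, data_cols)   # b4: data column in the PU, b5: PU column
    a2 = b4 * data_rows + b0         # A2: offset inside the PU's data block
    a3 = b5 * pu_rows + b2           # A3: one-dimensional PU number
    a4 = a2 + pu_base                # A4: address presented to the PU
    return a3, a4


# Example values: 4x2 data block per PU, 2 PU rows; both base addresses are made up.
HOST_BASE, PU_BASE = 0x1000, 0x200
selected = convert_address(HOST_BASE + 45, HOST_BASE, 4, 2, 2, PU_BASE)
all_targets = {convert_address(HOST_BASE + i, HOST_BASE, 4, 2, 2, PU_BASE)
               for i in range(64)}
```

With these dimensions the 64 host addresses map onto 64 distinct (PU, address) pairs, which is the property the conversion needs so that no two host addresses collide.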

Because the above scheme provides a general address conversion method, it can also be applied to data structures other than the example above (three-dimensional data, for instance) and to other PU array structures.

(6) Synchronization between PUs

For each PU in the PU array to execute its next process, the results of the current processing of other PUs may be needed. In that case, every PU must have finished its current processing before any PU in the array starts the next process. In the conventional synchronization control circuit, as shown in FIG. 12, each PU has a one-digit flag register 67 that is normally set to "0". In a state (mode) in which synchronization is requested, each PU writes "1" when it finishes its current operation. AND gate 68 detects that all flags agree, and control device 69 interrupts each PU to synchronize them once the output of AND gate 68 has become "1".

However, when synchronization is needed at several places in a program, it must be achieved separately at each place. The scheme of FIG. 12 cannot distinguish the individual synchronization points of such multiple synchronization requests, so an error may cause synchronization control to be applied even to PUs that are at different synchronization points. To prevent this, after the outputs of the flag registers have become "1", the control device must check whether the synchronization requests of all PUs are of the same kind. Because the control device does this by examining each PU in turn, the performance of the entire system is degraded.
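The weakness of the one-digit flag scheme can be illustrated with a small Python model. It is a sketch only: the names `and_gate`, `sync_points` and `flags` are hypothetical, and the fault scenario is invented for illustration.

```python
def and_gate(flags):
    """Model of AND gate 68: the output is 1 only when every flag register holds 1."""
    return int(all(f == 1 for f in flags))


# Hypothetical fault: PU 2 has stopped at sync point "B" while the others are at "A".
sync_points = ["A", "A", "B", "A"]
flags = [1 for _ in sync_points]          # every PU has raised its one-digit flag

gate_output = and_gate(flags)             # the gate sees only the flags
same_point = len(set(sync_points)) == 1   # what the controller must check PU by PU
```

Here the gate output is 1 although the PUs are not at the same synchronization point; only a sequential check of every PU reveals the mismatch.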

For this reason, the synchronization control circuit of the present invention provides a multi-digit synchronization register in place of the one-digit flag register, and makes the synchronization points distinguishable by assigning a different synchronization code to each.

FIG. 13 is a schematic diagram of this circuit; 70 denotes a synchronization register and 71 a coincidence detection circuit.

When the synchronization codes written to the synchronization registers 70 all match, coincidence detection circuit 71 notifies control device 69 of the match and of the synchronization code. Control device 69 achieves synchronization by instructing each PU to restart. Thus, even when there are several synchronization points, synchronization can be achieved reliably and quickly at each of them.

Moreover, because the control device is informed of the matched synchronization code, it can perform not only plain synchronization but also other control, such as stopping the PUs.
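The behaviour of the synchronization registers and the coincidence detection circuit of FIG. 13 might be modelled as follows. This is a hedged sketch: `coincidence_detect` and `control_device` are hypothetical names, `None` as the marker for "no code written yet" and the value of `stop_code` are assumptions, and the text does not specify which codes trigger which control action.

```python
from typing import Optional, Sequence, Tuple


def coincidence_detect(sync_registers: Sequence[Optional[int]]) -> Tuple[bool, Optional[int]]:
    """Model of coincidence detection circuit 71: returns (matched, code).

    None marks a PU that has not yet written a code (an assumption of this sketch).
    """
    if not sync_registers:
        return False, None
    first = sync_registers[0]
    if first is None or any(r != first for r in sync_registers):
        return False, None
    return True, first


def control_device(matched: bool, code: Optional[int], stop_code: int = 0xFF) -> str:
    """Hypothetical dispatch in control device 69 based on the reported code."""
    if not matched:
        return "wait"
    return "stop" if code == stop_code else "restart"


action = control_device(*coincidence_detect([3, 3, 3, 3]))
```

Reporting the code together with the match is what lets the control device choose between restarting and some other action without polling the PUs individually.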

FIG. 14 is a block diagram of one embodiment of the synchronization control circuit and shows an example of PU synchronization control using DMA requests. In the figure, 2 is the control unit, 72 is a PU, 73 is an OR gate, 74 is synchronization request flag register SF, 75 is communication request flag register CF, 76 is synchronization register SC, 77 is a coincidence detection circuit, 78 is OR and NOR gates, 79 is AND and OR gates, 80 is a timer, and 81 and 82 are AND gates. The operation of the circuit is described next.

Synchronization is performed as follows. A PU that has executed up to a synchronization point in the program suspends execution by setting hardware that applies HALT, WAIT or the like to itself; once all PUs are detected to be in this state, they are made to resume execution simultaneously. When a microprocessor without a HALT state is used, however, the PU is stopped by a DMA request instead. A PU that has executed up to the synchronization point suspends execution by setting hardware that applies a DMA request to itself in place of HALT or WAIT; once all PUs are detected to be in this state, the DMA requests to the PUs are released simultaneously so that all PUs start executing at once.

To make sure that the synchronization points of the PUs have not drifted apart, each PU writes a multi-bit synchronization code to synchronization register SC, and the HALT request is released only after the synchronization codes of all PUs are detected to match.

For some synchronization points, a detected match should be reported to control unit 2 rather than releasing HALT immediately. In that case, communication request flag register CF is used to generate a notification request to the control unit.

When a DMA request is used in place of HALT or WAIT, the DMA request for synchronization must be distinguished from that for data transfer, because DMA is generally used for data transfer. To this end, the DMA flag for data transfer and the DMA flag for synchronization (synchronization request flag register SF) are provided independently of each other, and their logical OR (OR gate 73) produces the actual DMA request to the PU.

Detection of synchronization errors. If a fixed time elapses with the contents of the synchronization registers SC still mismatched even though at least one synchronization request is present, timer 80 detects this and notifies the control unit. The time limit and ENABLE/DISABLE are set by software from the control unit.
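The timeout-based error detection can be sketched as a tick counter. This is an illustrative model only: the class name `SyncTimer`, the tick-based time unit and the per-PU list arguments are assumptions, not details given in the text.

```python
class SyncTimer:
    """Hypothetical model of timer 80: counts ticks while a synchronization request
    is pending and the synchronization codes do not all match."""

    def __init__(self, limit: int, enabled: bool = True):
        self.limit = limit        # time limit in ticks, set by the control unit
        self.enabled = enabled    # ENABLE / DISABLE, set by the control unit
        self.count = 0

    def tick(self, sf_flags, sc_codes) -> bool:
        """Advance one tick; return True when a synchronization error is reported."""
        pending = any(sf_flags)               # at least one synchronization request
        mismatch = len(set(sc_codes)) > 1     # SC contents differ between PUs
        if not (self.enabled and pending and mismatch):
            self.count = 0
            return False
        self.count += 1
        return self.count >= self.limit


timer = SyncTimer(limit=3)
reports = [timer.tick([1, 0], [5, 0]) for _ in range(3)]
```

In this model the counter restarts whenever the codes match or no request is pending, so only an uninterrupted mismatch of `limit` ticks is reported.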

Synchronization request flag register SF is set by a write to synchronization register SC and raises a HALT (or WAIT, DMA) request to the PU itself. It is reset when HALT is released to achieve synchronization. Instead of providing dedicated flag hardware such as a flip-flop, a circuit may be used in which a default (idle) value of synchronization register SC is fixed in advance and any other value is interpreted as a synchronization request.

Communication request flag CF, addressed to the control unit, is set or reset by a write to synchronization register SC and generates a notification request to the control unit.

Synchronization codes are written to synchronization register SC only by the PU.

In this synchronization circuit, when all SF are 1, all SC match, and CF is 0, the synchronization HALT requests and flags of all PUs are released. When CF is 1, the control unit is interrupted.
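The release conditions of the FIG. 14 circuit can be summarised in a small decision function. This is a simplified sketch under assumptions: the function names are hypothetical, the per-PU CF flags are combined with a logical OR (the text does not say how), and the interrupt is assumed to be raised only once all SF are 1 and all SC match.

```python
def pu_dma_request(transfer_flag: int, sf: int) -> int:
    """Model of OR gate 73: DMA is requested for data transfer or for synchronization."""
    return int(bool(transfer_flag) or bool(sf))


def sync_control_step(sf, sc, cf) -> str:
    """Evaluate the FIG. 14 conditions once.

    sf, sc, cf: per-PU lists of SF flags, SC codes and CF flags.
    Returns 'release', 'interrupt' or 'wait'.
    """
    all_requested = all(sf)
    all_match = len(set(sc)) == 1
    if not (all_requested and all_match):
        return "wait"
    if any(cf):
        return "interrupt"   # notify the control unit instead of releasing
    return "release"         # clear the synchronization requests and flags


outcome = sync_control_step([1, 1, 1], [7, 7, 7], [0, 0, 0])
```

Keeping the synchronization flag separate from the data-transfer flag means that releasing synchronization never cancels a DMA transfer that happens to be pending on the same PU.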

FIG. 15 shows an example for the case where a signal separate from the DMA request, such as HALT or WAIT, can be used for synchronization. Instead of using a synchronization request flag, synchronization control is driven by detecting that the synchronization code differs from its idle value. The illustrated circuit uses zero as the idle value of the synchronization code, so synchronization control takes place when the synchronization registers SC of all PUs hold the same non-zero value. Reference numeral 83 in the figure denotes a coincidence detection circuit that detects that all inputs have the same non-zero value.

The operation of the circuit of FIG. 15 is as follows. First, a PU whose program execution has reached a synchronization point writes a non-zero synchronization code to synchronization register SC. The write to SC applies HALT to the PU. The PUs write their synchronization codes and stop one after another; when the coincidence detection circuit detects that all synchronization codes match and are non-zero, its output signal clears all SC. In conjunction with this, HALT is released on the PUs and all PUs resume execution simultaneously.
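The FIG. 15 scheme, in which a non-zero register value doubles as the synchronization request, can be sketched as below. The function name `fig15_step` and the list-based representation of the registers are assumptions made for illustration.

```python
def fig15_step(sc):
    """Model of coincidence detection circuit 83 plus the clear/release action.

    sc: per-PU synchronization register values, where 0 is the idle value.
    Returns (new_sc, halted); halted[i] tells whether PU i remains halted.
    """
    first = sc[0]
    if first != 0 and all(code == first for code in sc):
        # All codes equal and non-zero: clear every SC and release every HALT.
        return [0] * len(sc), [False] * len(sc)
    # Otherwise the registers are unchanged and PUs that wrote a code stay halted.
    return list(sc), [code != 0 for code in sc]


new_sc, halted = fig15_step([4, 4, 4])
```

Clearing the registers on a match returns them to the idle value, so the same hardware is immediately ready for the next synchronization point.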

The circuits of FIGS. 14 and 15 can also be used with different combinations of the synchronization control conditions. That is, the synchronization request flag can be combined with HALT or the like, and the idle value of the synchronization code can be combined with DMA. A communication request can also be added to these combinations or to the circuit of FIG. 15. FIG. 16 shows an example of a circuit that uses the synchronization request flag and the WAIT signal and generates a communication request when the value of the synchronization code lies in a certain range.

In FIG. 16, 84 is a coincidence detection circuit that detects that all inputs match and identifies whether the matched value is positive or negative. The operation of the circuit of FIG. 16 is as follows. First, a PU whose program execution has reached a synchronization point writes a synchronization code to synchronization register SC. At a synchronization point that requires communication it writes a negative synchronization code; at any other synchronization point it writes a positive one. The write to SC sets the synchronization request flag (SF), WAIT is applied to the PU, and the PU stops. The PUs write their synchronization codes and stop one after another; when coincidence detection circuit 84 detects that all synchronization codes match and are positive, its output signal clears all SF. In conjunction with this, WAIT is released on the PUs and all PUs resume execution simultaneously. When the synchronization code is negative, coincidence detection circuit 84 passes a communication request to the control unit of the parent computer.
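The sign-based variant of FIG. 16 can be modelled in the same style. This is a sketch under assumptions: the name `fig16_step` is hypothetical, a match is only evaluated once every SF is set, and a code of zero is treated as neither positive nor negative, which the text does not address.

```python
def fig16_step(sc, sf):
    """Model of coincidence detection circuit 84.

    sc: per-PU signed synchronization codes; sf: per-PU synchronization request flags.
    Returns (new_sf, comm_request).
    """
    if all(sf) and len(set(sc)) == 1:
        if sc[0] > 0:
            return [0] * len(sf), False   # clear all SF, which releases WAIT
        if sc[0] < 0:
            return list(sf), True         # PUs keep waiting; notify the control unit
    return list(sf), False


released = fig16_step([5, 5, 5], [1, 1, 1])
notified = fig16_step([-2, -2, -2], [1, 1, 1])
```

Encoding the need for communication in the sign of the code lets a single write to SC both request synchronization and tell the control unit whether it must intervene.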

[Effect of the invention]

As described above, the present invention shortens the communication time between element computers, and between peripheral devices and element computers, in a composite computer system, and makes reliable synchronization control possible, thereby improving the performance of the entire system.

[Brief explanation of drawings]

FIG. 1 is an overall configuration diagram of a composite computer system according to the present invention; FIG. 2 is a detailed configuration diagram of one PU in the PU array; FIG. 3 is an explanatory diagram of the address space of communication memory CM; FIG. 4 is a schematic diagram for explaining a conventional example of data transfer to a PU array; FIG. 5 is a configuration diagram of one embodiment of a data transfer control device; FIG. 6 is a configuration diagram of another embodiment of the data transfer control device; FIG. 7 is a diagram showing an example of a PU array with a one-dimensional structure; FIG. 8 is an explanatory diagram of 8×8 two-dimensional matrix data; FIG. 9 is an explanatory diagram of a PU array with a 2×4 two-dimensional structure; FIGS. 10(a) and 10(b) are explanatory diagrams showing the data arrays in PU(0) and PU(1) of FIG. 9, respectively; FIG. 11 is a configuration diagram of one embodiment of an address conversion circuit; FIG. 12 is a diagram showing an example of a conventional synchronization control circuit; FIG. 13 is a schematic diagram of a synchronization control circuit according to the present invention; FIG. 14 is a configuration diagram of one embodiment of the synchronization control circuit; and FIGS. 15 and 16 are configuration diagrams of other embodiments of the synchronization control circuit.

In the figures, 1 is the PU array, 2 is the control unit, 3 is the host computer, 16 is microprocessor MPU, 20 is local memory LM, 22a is synchronization register SC, and 23 to 26 are communication memories CM.

Claims

[Claims]

1. A composite computer system in which a plurality of computers are arranged in parallel and communication is possible between each two adjacent computers, characterized in that each computer has a local memory, a communication memory is provided between each two adjacent computers, a common address area is set for each computer in at least part of its local memory and communication memory, and each computer, when communicating with an adjacent computer, uses the common address area to write the same data simultaneously to both the local memory and the communication memory.