JPH0554008A

JPH0554008A - Multiprocessor system

Info

Publication number: JPH0554008A
Application number: JP3218213A
Authority: JP
Inventors: Toru Ueda; 徹上田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1991-08-29
Filing date: 1991-08-29
Publication date: 1993-03-05
Anticipated expiration: 2013-09-17
Also published as: JP2799528B2

Abstract

PURPOSE: To provide a multiprocessor system in which processors can arbitrarily access one another's local memories, increasing processing speed and processing capability without increasing the number of processing program steps. CONSTITUTION: A multiprocessor system 1 includes a plurality of processors PEi, each provided with an independently accessible local memory 3i. The local memories 3i are dynamically rearranged in accordance with a set degree of parallelism so that the processors PEi that perform memory access can refer to the local memories of the processors PEi that do not, and the memory space accessible to each processor PEi is extended without changing the program.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a multiprocessor system, and more particularly to a multiprocessor system having at least two processors that execute operations in parallel, each processor having a local memory that it can access independently, and the processors being connected to one another via the same bus.

[0002]

[Prior Art] Conventionally, various multiprocessor systems have been provided to speed up arithmetic processing such as numerical analysis and neural networks.

[0003] FIG. 12 is a schematic configuration diagram of a conventional multiprocessor system in which a plurality of processors are connected in parallel.

[0004] In FIG. 12, a multiprocessor system 2 includes computing elements (hereinafter referred to as processors) Pei (hereinafter, i = 0, 1, 2, ..., 7), a sequencer 50 connected in parallel to these processors Pei via an input data bus 10 and an output data bus 20, and a memory 51 that is accessed by the sequencer 50 and whose stored data is read and written. The processors Pei are further connected to one another via an internal data bus 80.

[0005] The multiprocessor system 2 configured as described above has eight processors. Each processor Pei has a local memory that it can access independently, as described later, and the system is characterized in that operations can be executed in parallel on externally supplied data. In the multiprocessor system 2 of FIG. 12, eight processors can independently and simultaneously perform arithmetic processing on a single item of input data.

[0006] In operation, the sequencer 50 of the multiprocessor system 2 sequentially reads and decodes a program stored in advance in the memory 51, and supplies the decoded instructions (hereinafter referred to as commands) one after another to each of the processors Pei in parallel via the input data bus 10. In response, each processor Pei simultaneously executes arithmetic processing in accordance with the supplied command and outputs the result to the outside via the output data bus 20. Each processor Pei is configured to perform arithmetic processing in accordance with either a command supplied from the sequencer 50 or a command sequence stored internally in advance.

[0007] FIG. 13 is a schematic configuration diagram of the processor Pei shown in FIG. 12. In FIG. 13, the processor Pei includes an arithmetic unit 1i and a local memory 3i. Like a general CPU (central processing unit), the arithmetic unit 1i has an adder, registers, and the like, and sequentially accesses the local memory 3i in parallel with arithmetic processing.

[0008] The arithmetic unit 1i is connected to the sequencer 50 via the input data bus 10 and the output data bus 20, and is connected to the adjacent processors Pei via the internal data bus 80.

[0009] The arithmetic processing operation of the processor Pei will now be described. As an example, assume the multiplication typical of neural networks, in which a numerical value stored internally in advance (called a weight) is multiplied by input data supplied via the input data bus 10.

[0010] The sequencer 50 accesses the memory 51 and supplies the data it reads to each of the processors Pei in parallel via the input data bus 10. In response, each processor Pei executes arithmetic processing on the input data in parallel. First, the arithmetic unit 1i accesses the local memory 3i, reads the weight stored there in advance, and temporarily stores it in an internal register. The arithmetic unit 1i then multiplies the supplied input data by the weight held in the internal register, using the adder, to obtain the product of the input data and the weight. Because this multiplication is performed simultaneously in parallel on the eight processors, the operation speed is eight times that of a single processor.
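The broadcast-and-multiply step described above can be sketched as a small simulation. This is only an illustrative model, assuming that a Python list stands in for the eight local memories; the names `NUM_PES` and `broadcast_multiply` are hypothetical and do not appear in the patent.

```python
NUM_PES = 8  # the conventional system 2 has eight processors Pe0..Pe7


def broadcast_multiply(weights, x):
    """Model one broadcast step: every processor multiplies the same input x
    by the weight held in its own local memory."""
    if len(weights) != NUM_PES:
        raise ValueError("expected one weight per processor")
    # In hardware the eight products are formed simultaneously;
    # the list comprehension models only the result, not the timing.
    return [w * x for w in weights]


products = broadcast_multiply([1, 2, 3, 4, 5, 6, 7, 8], 3)
print(products)
```

Each list element corresponds to the product formed by one processor from the shared input and its private weight.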

[0011]

[Problems to Be Solved by the Invention] In the conventional multiprocessor system 2 described above, high operation speed can be obtained for arithmetic processing that each processor Pei performs while referring only to its own local memory 3i. However, for processing that rearranges the data stored in advance in the local memories 3i of the processors Pei, there has been a problem that the processing speed drops significantly. For example, when executing processing that sorts into descending order the numerical data stored in the eight local memories 30 to 37 of the processors Pe0 to Pe7, the processors Pei must compare the numerical data stored in one another's local memories 3i, so the numerical data stored in each processor's own local memory 3i is transferred to the other processors Pei via the internal data bus 80. Moreover, the arithmetic unit 1i of each processor Pei reads the numerical data stored in its local memory 3i, transfers it to another processor Pei via the internal data bus 80, and only then performs the comparison, so the processing speed decreases.

[0012] In addition, as described above, processing in which a processor Pei also refers to data stored in the local memory 3i of another processor requires significantly more program steps than processing that refers only to data stored in its own local memory 3i. As a result, programming errors become more likely and program maintenance costs rise.

[0013] An object of the present invention is therefore to provide a multiprocessor system in which processors can arbitrarily access one another's local memories without increasing the number of processing program steps, thereby improving the processing speed and processing capability of the system itself.

[0014]

[Means for Solving the Problems] A multiprocessor system according to the present invention has at least two processors connected in parallel via the same bus, with the processors performing processing in parallel. Specifically, each of the processors comprises: arithmetic means to which the same bus is connected; storage means; bus switching means; a first bus that connects, via the bus switching means, the storage means and the bus switching means of one adjacent processor; a second bus that connects the arithmetic means and the bus switching means; and a third bus that connects the bus switching means and the bus switching means of the other adjacent processor.

[0015] The arithmetic means further comprises signal deriving means for deriving a bus switching signal based on identification data that uniquely identifies the processor and parallelism data that specifies the number of processors accessing the storage means in parallel, both of which are stored internally in advance.

[0016] The bus switching means further comprises bus selection means for selecting one of the second and third buses based on the switching signal derived by the signal deriving means and connecting it to the first bus.

[0017]

[Operation] In the multiprocessor system according to the present invention, configured as described above, the signal deriving means derives the bus switching signal based on the parallelism data and the identification data that uniquely identifies the processor. The bus selection means selects one of the second and third buses based on the derived switching signal and connects it to the first bus. Accordingly, when the parallelism data is such that the second bus is connected to the first bus, the storage means of the processor becomes accessible by the arithmetic means of the same processor. Conversely, when the parallelism data is such that the third bus is connected to the first bus, the storage means of the processor becomes accessible by the arithmetic means of another processor via the bus switching means of the adjacent processor. Thus, merely by changing the parallelism data, the storage means belonging to each processor is dynamically rearranged. Although the system configuration is fixed, each processor can access the storage means of other processors without performing data transfer, and the memory address space accessible to each processor is varied dynamically.

[0018]

[Embodiment] An embodiment of the present invention will be described in detail below with reference to the drawings.

[0019] The following embodiment assumes a system configuration with eight processors as the multiprocessor system, but the only condition is that the system has at least two processors; there are no other particular restrictions.

[0020] FIGS. 1(a) to 1(c) are system configuration diagrams for explaining the outline of a multiprocessor system 1 according to an embodiment of the present invention.

[0021] The multiprocessor system 1 shown in FIG. 1 is configured to dynamically rearrange the local memories of the processors while the system is running, so that a processor can refer to data stored in the local memory of another processor without data being transferred between processors.

[0022] FIG. 1 shows a system configured with eight processors (computing elements) PE0 to PE7. In FIG. 1(a), each processor PEi has a local memory 3i as a memory space that it can access independently. In the case of FIG. 1(a), eight arithmetic processes can be executed in parallel, as in the conventional system.

[0023] FIGS. 1(b) and 1(c) are conceptual diagrams for explaining the dynamic rearrangement of local memories in the multiprocessor system according to this embodiment.

[0024] The degree of parallelism will now be described. The multiprocessor system 1 shown in FIG. 1(a) has eight processors, all of which access their own memory 3i, so the degree of parallelism is 8.

[0025] In FIG. 1(b), the hatched processors PEi are processors that do not access their own local memory 3i. The other processors PEi are processors that access a local memory 3i. As illustrated, a processor PEi that accesses the local memory 3i can access the local memory 3i of the adjacent hatched processor PEi as if it were its own local memory. For example, the processor PE0 can access the local memory 31 of the adjacent processor PE1 in the same way as its own local memory 30. Since four processors PEi then access the local memories 3i in parallel, the degree of parallelism is 4.

[0026] Similarly, when the degree of parallelism is 2, as shown in FIG. 1(c), the hatched processors that do not access the local memory 3i are the processors PE1, PE2, PE3, PE5, PE6, and PE7. For example, the processor PE0, which accesses the local memory 3i, accesses the local memories 31 to 33 of the processors PE1 to PE3 in the same way as its own local memory 30. The memory space accessible to the processor PE0 is therefore four times that of the normal case of parallelism 8 shown in FIG. 1(a). Likewise, the processor PE4 accesses the local memories 35 to 37 of the processors PE5 to PE7 in the same way as its own local memory 34, so the memory space accessible to the processor PE4 is also four times that of the normal case of parallelism 8 shown in FIG. 1(a).

[0027] Further, although not illustrated, when the degree of parallelism is 1, a single processor PEi obtains an accessible memory space eight times that of the normal case of parallelism 8.

[0028] As described above, the degree of parallelism indicates the number of processors PEi that can access the local memories 3i in parallel. The hatched processors PEi in FIG. 1, which do not access the local memory 3i, may be performing operations that do not access their own local memory 3i, or may be idle and performing no operation at all.
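The relationship between the degree of parallelism and the memory space each active processor sees can be sketched as follows. This is a minimal model, assuming the eight-processor embodiment and the groupings drawn in FIG. 1 (each active processor owns a contiguous run of local memories starting with its own); the function name `memory_groups` is hypothetical.

```python
NUM_PES = 8


def memory_groups(parallelism):
    """Return {active PE index: [indices of the local memories it can access]}."""
    if parallelism not in (1, 2, 4, 8):
        raise ValueError("parallelism must be 1, 2, 4 or 8 for eight processors")
    group = NUM_PES // parallelism  # local memories per active processor
    return {first: list(range(first, first + group))
            for first in range(0, NUM_PES, group)}


for p in (8, 4, 2, 1):
    print(p, memory_groups(p))
```

For parallelism 2 the model gives PE0 the memories 0 to 3 and PE4 the memories 4 to 7, which corresponds to the fourfold expansion described for FIG. 1(c).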

[0029] FIGS. 2(a) and 2(b) are schematic block diagrams of a processor constituting the multiprocessor system according to an embodiment of the present invention.

[0030] In FIG. 2(a), the processor PEi is connected via the input data bus 10 and the output data bus 20 to the sequencer 50, to which the memory 51 is connected. The processor PEi further includes an arithmetic unit 1i, a memory management unit 2i, and a local memory 3i.

[0031] The arithmetic unit 1i is connected to the sequencer 50 via the input data bus 10 and the output data bus 20. The arithmetic unit 1i is also connected to the adjacent processors via the internal data bus 80.

[0032] The memory management unit 2i is connected to the arithmetic unit 1i, and is connected via a local memory bus 40 to the memory management unit 2(i-1) or 2(i+1) of an adjacent processor. The local memory 3i is connected to the memory management unit 2i. The processor PEi is thus configured so that the arithmetic unit 1i accesses the local memory 3i via the memory management unit 2i.

[0033] FIG. 2(b) is a schematic block diagram of the arithmetic unit 1i shown in FIG. 2(a). Like a general CPU, the arithmetic unit 1i includes an instruction processing device 1a and a main storage device 1b. The instruction processing device 1a includes an adder 1c and a register group 1d; it processes data in accordance with a program (command sequence) stored in advance in the main storage device 1b and stores the result back in the main storage device 1b. The instruction processing device 1a is also connected to the input data bus 10, the output data bus 20, and the internal data bus 80, and is thereby connected to the sequencer 50 and the adjacent processors PEi.

[0034] FIG. 3 is a schematic diagram for explaining the connection state of the input/output buses of the memory management unit according to an embodiment of the present invention.

[0035] The memory management unit 2i is connected to input/output buses as shown in FIG. 3, which is a schematic diagram centered on the memory management unit 2i of the processor PEi. The memory management unit 2i is connected to the memory management units 2(i-1) and 2(i+1) of the adjacent processors via the local memory bus 40. Specifically, the local memory bus 40 comprises an address bus and a data bus: the memory management units 2i and 2(i-1) are connected via an address bus B and a data bus DB, and, likewise, the memory management unit 2i and the memory management unit 2(i+1) are connected via an address bus Y and a data bus DY.

[0036] The memory management unit 2i is further connected to the arithmetic unit 1i via an address bus A and a data bus DA, and receives a bus switching signal CH from the arithmetic unit 1i. The memory management unit 2i is also connected to the local memory 3i via an address bus X and a data bus DX, and outputs a memory select signal MS to the local memory 3i. The memory management unit 2i switches the connections among the address buses and data buses connected to it as described above.

[0037] FIGS. 4(a) and 4(b) are block diagrams of the memory management unit 2i shown in FIG. 3.

[0038] FIG. 4(a) shows the initial internal state of the memory management unit 2i before the bus connection is switched.

[0039] In FIG. 4(a), the data bus and the address bus are drawn together to simplify the description. Accordingly, the address bus A and data bus DA of FIG. 3 are denoted bus A, the address bus B and data bus DB are denoted bus B, the address bus X and data bus DX are denoted bus X, and the address bus Y and data bus DY are denoted bus Y.

[0040] As shown in FIG. 4(a), in the initial state the memory management unit 2i connects the bus X, to which the local memory 3i is connected, to the bus Y. The memory management unit 2i switches the bus connection based on the bus switching signal CH supplied from the arithmetic unit 1i. That is, it selectively switches between two connections: connecting the bus A to the buses X and Y so that the arithmetic unit 1i can access the local memory 3i, or connecting the bus B, which leads to the adjacent memory management unit 2(i-1), to the buses X and Y so that the adjacent processor PE(i-1) can access the local memory 3i.

[0041] FIG. 4(b) is a schematic block diagram of the memory management unit 2i. In FIG. 4(b), the memory management unit 2i includes an address bus switch 11i, a data bus switch 12i, a mode register 13i, a pe register 14i, and a memory select determiner 15i.

[0042] The bus switching signal CH is supplied from the arithmetic unit 1i to the address bus switch 11i and the data bus switch 12i. The address bus switch 11i selects one of the connected address buses A and B based on the switching signal CH and passes the address supplied on the selected address bus to its output side. The address output by the address bus switch 11i is supplied to the adjacent memory management unit 2(i+1) and to the local memory 3i. The data bus switch 12i connects one of the connected data buses A and B (the data buses DA and DB of FIG. 3) based on the supplied bus switching signal CH and passes the data supplied on the connected data bus to its output side. The data output by the data bus switch 12i is supplied to the adjacent memory management unit 2(i+1) and to the local memory 3i.

[0043] As described above, the address and data output by the address bus switch 11i and the data bus switch 12i are supplied in parallel, over the transmission paths of the buses X and Y shown in FIG. 4(a), to both the local memory 3i and the adjacent memory management unit 2(i+1).
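Because each memory management unit forwards its selected bus both to its own local memory and to the next unit, the selections form a chain. The sketch below traces which arithmetic unit ends up driving each local memory. It is a simplified model under stated assumptions: the function name `memory_drivers` is hypothetical, and unit 0 is assumed to have no predecessor, which this passage does not specify.

```python
def memory_drivers(selections):
    """selections[i] is 'A' (own arithmetic unit) or 'B' (bus from unit i-1).

    Return, for each local memory i, the index of the arithmetic unit
    whose address and data reach it."""
    drivers = []
    for i, sel in enumerate(selections):
        if sel == "A":
            drivers.append(i)  # bus A is connected to buses X and Y
        elif sel == "B":
            if i == 0:
                raise ValueError("unit 0 has no predecessor in this model")
            # bus B carries whatever unit i-1 placed on its bus Y
            drivers.append(drivers[i - 1])
        else:
            raise ValueError(f"unknown selection {sel!r}")
    return drivers


print(memory_drivers(list("ABBBABBB")))
```

With the pattern shown, the first four local memories are driven by unit 0 and the last four by unit 4, matching the grouping of FIG. 1(c).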

[0044] A command bus 10i is connected to each of the mode register 13i and the pe register 14i. The command bus 10i, which is not shown in FIG. 3, carries a 2-bit command signal from the arithmetic unit 1i to the registers 13i and 14i. The arithmetic unit 1i supplies commands over the command bus 10i in order to rewrite the contents of the registers 13i and 14i. When the 2-bit command is "00", the mode register 13i and the pe register 14i are set to the data-readable mode. When the command is "01", only the pe register 14i is set to the writable mode; the data supplied via the data bus DA is written into the register 14i and its contents are updated. When the command is "10", only the mode register 13i is set to the data-writable mode; the data supplied via the data bus DA is written into the mode register 13i and its contents are updated. A command of "11" is assumed not to occur in the multiprocessor system 1.
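The 2-bit command encoding for the two registers can be illustrated with a toy model. This is a sketch only: the class name `MmuRegisters`, its method, and the choice to raise an error on "11" are assumptions made for illustration, since the text says only that "11" does not occur.

```python
class MmuRegisters:
    """Toy model of the mode register 13i and the pe register 14i."""

    def __init__(self, mode=0, pe=0):
        self.mode = mode
        self.pe = pe

    def command(self, bits, data=None):
        """Apply a 2-bit command; `data` models the value on data bus DA."""
        if bits == "00":  # both registers readable
            return self.mode, self.pe
        if bits in ("01", "10") and data is None:
            raise ValueError("a write command needs data")
        if bits == "01":  # only the pe register is writable
            self.pe = data
        elif bits == "10":  # only the mode register is writable
            self.mode = data
        else:  # "11" is assumed never to occur
            raise ValueError(f"unexpected command {bits!r}")
        return None


regs = MmuRegisters()
regs.command("01", 5)  # write processor number 5
regs.command("10", 3)  # write parallelism value 3
print(regs.command("00"))
```

The final read returns the pair (mode, pe) after both writes.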

[0045] The memory select determiner 15i receives the address output from the address bus selected by the bus switching signal CH, the data read from the mode register 13i, and the data read from the pe register 14i, derives the memory select signal MS accordingly, and supplies it to the local memory 3i. The memory select signal MS takes one of the two signal levels "1" and "0". When MS is at level "1", the local memory 3i is set to the writable mode; when MS is at level "0", the local memory 3i is set to the data-readable mode. Accordingly, only when the memory select signal MS from the memory select determiner 15i is at level "1" does the local memory 3i write the data output by the data bus switch 12i to the address output by the address bus switch 11i. When the memory select signal MS is at level "0", the local memory 3i reads data from the address output by the address bus switch 11i.
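How the local memory responds to the memory select signal MS can be sketched as below. The model covers only the memory side: the rule by which the memory select determiner 15i derives MS from the address and the two register values is not given in this passage and is deliberately not modelled. The class name `LocalMemory` and the method `cycle` are hypothetical.

```python
class LocalMemory:
    """Toy model of local memory 3i driven by the memory select signal MS."""

    def __init__(self, size):
        self.cells = [0] * size

    def cycle(self, ms, address, data=None):
        """MS = 1: write `data` at `address`. MS = 0: read from `address`."""
        if ms == 1:
            self.cells[address] = data
            return None
        if ms == 0:
            return self.cells[address]
        raise ValueError("MS must be 0 or 1")


mem = LocalMemory(16)
mem.cycle(1, 4, 99)      # MS = 1: write 99 to address 4
print(mem.cycle(0, 4))   # MS = 0: read address 4 back
```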

[0046] FIG. 5 is a flowchart showing the connection bus selection operation in the memory management unit 2i according to an embodiment of the present invention.

[0047] In this embodiment, it is assumed that each processor PEi is assigned in advance, as data, a processor number pe that uniquely identifies it. The processors PE0 to PE7 shown in FIG. 1 are assigned the processor numbers pe = 0 to 7, respectively. The parallelism value mode, the variable data that determines the degree of parallelism of the multiprocessor system 1, is assumed to take the values 0, 1, 3, and 7 for degrees of parallelism 8, 4, 2, and 1, respectively. The processor number pe and the parallelism value mode are assumed to be stored in advance in the main storage device 1b or the register group 1d of the arithmetic unit 1i of the processor PEi. The flow shown in FIG. 5 is assumed to be stored in advance as a program in the main storage device 1b of the arithmetic unit 1i and executed under the control of the instruction processing device 1a.

[0048] The instruction processing device 1a reads the program stored in the main storage device 1b and, in step ST1 of FIG. 5 (abbreviated ST1 in the figure), reads the processor number pe and the parallelism value mode stored in advance and determines whether (pe & mode) = 0 holds. The operation denoted by & is the bitwise logical AND of the processor number pe and the parallelism value mode.

【００４９】ステップＳＴ１の処理において、演算結果
が０であれば、次のステップＳＴ２の処理に移行し、命
令処理装置１ａは図４（ａ）のバスＡをバスＸおよびＹ
に接続させるようにバス切換信号ＣＨを導出する。逆
に、ステップＳＴ１の処理において演算結果が０でなけ
れば、ステップＳＴ３の処理に移行し、図４（ａ）のバ
スＢをバスＸおよびＹに接続させるようなバス切換信号
ＣＨを導出する。その後、接続バス選択の処理は終了す
る。In the process of step ST1, if the operation result is 0, the process proceeds to the next step ST2, and the instruction processing device 1a connects the bus A of FIG. 4A to the buses X and Y.
The bus switching signal CH is derived so as to be connected to. On the contrary, if the calculation result is not 0 in the process of step ST1, the process shifts to the process of step ST3, and the bus switching signal CH for connecting the bus B of FIG. 4A to the buses X and Y is derived. After that, the connection bus selection processing ends.
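
The decision made in steps ST1 to ST3 can be sketched in a few lines. The function name `select_bus` and the use of the strings "A" and "B" to stand for the bus switching signal CH are assumptions made for illustration; the specification describes a hardware signal, not this interface.

```python
def select_bus(pe: int, mode: int) -> str:
    """Sketch of steps ST1 to ST3: choose the bus connected to buses X and Y.

    Returns "A" when (pe & mode) == 0 (step ST2) and "B" otherwise (step ST3).
    """
    return "A" if (pe & mode) == 0 else "B"


# Bus chosen by each of the eight processors when mode = 1 (parallelism 4).
for pe in range(8):
    print(f"PE{pe}: bus {select_bus(pe, 0b001)}")
```

With mode = 1 the even-numbered processors select bus A and the odd-numbered ones select bus B.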

[0050] For example, when the multiprocessor system 1 has a degree of parallelism of 4, the parallelism mode is 1, that is, (001)₂. The memory management unit 2i of each processor PEi whose processor number pe is even therefore connects bus A to the local memory 3i so that the arithmetic unit 1i can access the local memory 3i. The memory management unit 2i of each processor PEi whose processor number pe is odd connects bus B to the local memory 3i, switching the bus connection so that the processor PE(i-1) can access the local memory 3i.

[0051] FIGS. 6(a) and 6(b) are schematic diagrams showing the bus connection states in the memory management unit 2i obtained according to the operation flow of FIG. 5 described above.

[0052] FIG. 6(a) shows the state in which the memory management unit 2i selects bus A on the basis of the supplied bus switching signal CH and connects it to the local memory 3i. FIG. 6(b) shows the state in which the memory management unit 2i selects bus B on the basis of the supplied bus switching signal CH and connects the memory management unit 2(i-1) of the processor PE(i-1) to the local memory 3i.

[0053] Through this bus connection switching at a degree of parallelism of 4, the local memory 3i of each processor PEi whose processor number pe is odd can be referenced as local memory of a processor PEi whose processor number pe is even.

[0054] FIG. 7 is a schematic diagram showing the bus connection state at a degree of parallelism of 2, obtained according to the operation flow of FIG. 5 described above.

[0055] In FIG. 7, at a degree of parallelism of 2, as shown in FIG. 1(c) above, the processor PE0, which includes the memory management unit 20, can access the local memories 31 to 33 in the same way as its own local memory 30, owing to the bus connection switching operation of FIG. 5 in the memory management units 21, 22 and 23. The accessible memory space is thus expanded fourfold.
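
The fourfold expansion can be checked with a small model of which processor ends up serving each local memory. The helpers `owner_of` and `accessible_memories` are hypothetical, and the rule "the serving processor is pe with the mode bits cleared" is an assumption inferred from the bus A / bus B selection rule; it is not stated in this form in the specification.

```python
NUM_PE = 8  # eight processors PE0..PE7, as in the embodiment


def owner_of(pe: int, mode: int) -> int:
    """Number of the processor assumed to access local memory 3<pe>.

    Assumption: clearing the bits set in mode gives the processor whose
    (pe & mode) == 0, i.e. the one that keeps bus A connected.
    """
    return pe & ~mode & (NUM_PE - 1)


def accessible_memories(owner: int, mode: int) -> list:
    """Local memories reachable by the given processor at this mode."""
    return [pe for pe in range(NUM_PE) if owner_of(pe, mode) == owner]


mems = accessible_memories(0, 0b011)  # parallelism 2, processor PE0
print(mems, "expansion factor:", len(mems))
```

For mode = 3 the model lists the local memories of PE0 to PE3 for processor PE0, which matches the fourfold expansion described for FIG. 7.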

[0056] FIG. 8 is a flow diagram showing the memory select operation according to an embodiment of the present invention. The flow shown in FIG. 8 represents the operation of the memory select determiner 15i of each processor PEi. The memory select operation is the operation in which the memory select determiner 15i derives the memory select signal MS, on the basis of the supplied address, to determine whether the local memory 3i is made accessible. Addr3 in the figure denotes the upper 3 bits of the supplied address and specifies the processor number pe of the processor PEi whose local memory 3i is to be accessed. In this embodiment eight processors PEi are connected, so the local memory 3i to be accessed is determined from the upper 3 bits of the supplied address; however, the number of bits referenced need not be fixed at 3 and may be determined according to the number of processors PEi constituting the system 1. The address supplied to the memory management unit 2i therefore consists of 3 or more bits, including at least these upper 3 bits.

[0057] Next, the memory select operation according to an embodiment of the present invention is described with reference to the flow of FIG. 8.

[0058] The memory select determiner 15i of FIG. 4(b) first determines, in step ST10, whether (Addr3 = pe & mode) holds. That is, the memory select determiner 15i reads the prestored processor number pe and parallelism mode from the registers 14i and 13i and executes step ST10 using them together with the upper 3 bits Addr3 of the address supplied via the address bus switch 11i. If the logical expression holds, a memory select signal MS of level "1" is derived, which makes the local memory 3i accessible.

[0059] Returning to step ST10, if the logical expression does not hold, a memory select signal MS of level "0" is derived and supplied to the local memory 3i, which makes the local memory 3i inaccessible.
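
The test performed in step ST10 can be written as a one-line predicate. The function name `memory_select` is an assumption, and the expression is read here as Addr3 == (pe & mode), which is one plausible reading of the notation (Addr3 = pe & mode) in the text.

```python
def memory_select(addr3: int, pe: int, mode: int) -> int:
    """Sketch of step ST10: return the level of the memory select signal MS.

    MS is 1 (local memory accessible) when the upper 3 address bits equal
    the bitwise AND of the processor number and the mode, otherwise 0.
    """
    return 1 if addr3 == (pe & mode) else 0


print(memory_select(addr3=0b010, pe=2, mode=0b011))
print(memory_select(addr3=0b010, pe=0, mode=0b011))
```

The first call models processor PE2 at a degree of parallelism of 2 and the second models processor PE0 for the same address.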

[0060] FIG. 9 is a schematic diagram showing an example of the memory select state at a degree of parallelism of 2, obtained according to the operation flow of FIG. 8 described above.

[0061] Assume now that the same address is supplied from the sequencer 50 to all processors PEi via the input data bus 10.

[0062] When the upper 3 bits of the address are 2, that is, Addr3 = (10)₂, only the memory select signal MS of the processor PE2 goes to level "1", as shown in FIG. 9, and only the local memory 32 establishes a connection bus with the memory management unit 22. The processor PE0 can therefore access the local memory 32, within its fourfold-expanded memory space, on the basis of the supplied address.
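
The example of FIG. 9 can be reproduced by evaluating the step ST10 expression for the four processors that share the memory space of processor PE0. The grouping of PE0 to PE3 and the helper name `memory_select` are assumptions made for this sketch.

```python
def memory_select(addr3: int, pe: int, mode: int) -> int:
    """Level of the memory select signal MS per step ST10 (hypothetical helper)."""
    return 1 if addr3 == (pe & mode) else 0


MODE = 0b011   # degree of parallelism 2
ADDR3 = 0b010  # upper 3 bits of the broadcast address

# Processors PE0..PE3 are assumed to form the group served by processor PE0.
selected = [pe for pe in range(4) if memory_select(ADDR3, pe, MODE) == 1]
print(selected)
```

Applying the same expression to PE4 to PE7 selects PE6, so under this reading the second group, served by PE4, would resolve the same address to local memory 36.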

[0063] As described above, each processor PEi selects the bus to be connected to its own local memory 3i and switches the connection according to the operation flow of FIG. 5, on the basis of the processor number pe and the current parallelism mode. In addition, according to the process flow of FIG. 8, it is determined which of the local memories constituting the memory space expanded at the current degree of parallelism is to be accessed, on the basis of the value of the upper 3 bits Addr3 of the supplied address together with the processor number pe and the parallelism mode. Each processor PEi can thus access the memory space expanded according to the degree of parallelism by using the supplied address.

[0064] FIG. 10 is a schematic diagram for explaining the state in which only the data bus is switched in the memory management unit 2i according to an embodiment of the present invention.

[0065] In the embodiment described above, the memory management unit 2i is structured to switch the connections of both the address bus and the data bus, but a simplified structure that switches only the data bus can achieve the same result.

[0066] As shown in FIG. 10, in this case the address for accessing the local memory 3i is always supplied from the corresponding arithmetic unit 1i. However, the simplified structure of FIG. 10 has the restriction that memory access is not possible when the address for accessing memory changes during execution of an operation (changes dynamically). For example, when the processor PE0 is to access the local memory 31 of the adjacent processor PE1, the access is possible with this simplified structure if the address for the memory reference is written in a command stored in the memory 51. That is, the sequencer 50 supplies the commands it reads sequentially from the memory 51 to the processors PE0 and PE1 simultaneously via the input data bus 10; the arithmetic units 10 and 11 receive the same command at the same time and simultaneously supply address signals to their own local memories 30 and 31, so the processor PE0 can access the local memory 31 of the adjacent processor PE1. If, however, the same command is not supplied to all processors at once via the sequencer 50 and the address for the memory access is instead stored in, for example, the register group 1d of the processor PE0, the adjacent processor PE1 cannot refer to the address stored there and therefore cannot supply an appropriate address signal to its own local memory 31.

[0067] As described above, when the multiprocessor system 1 operates solely according to the command sequence stored in advance in the memory 51 and supplied via the sequencer 50, the same command is supplied to all processors at once, so the desired memory space can be accessed using the simplified structure of FIG. 10, which switches only the data bus.

[0068] When processing in the multiprocessor system 1 is executed according to both commands supplied from the sequencer 50 and commands stored in the register group 1d inside each processor PEi, the structure that switches both the data bus and the address bus, as shown in FIG. 3, is adopted so that the desired memory can be accessed.

[0069] In this way, depending on how the commands that control the operation of the multiprocessor system 1 are supplied, either the method of switching only the data bus or the method of switching the data bus and the address bus together may be used.

[0070] FIG. 11 is a flow diagram showing the operation of variably setting the degree of parallelism of the multiprocessor system according to an embodiment of the present invention.

[0071] The illustrated flow may be stored in advance as a program in the main storage device 1b of each processor PEi and executed under the control of the instruction processing device 1a, or it may be stored in advance as a program in the memory 51 and read out sequentially and executed under the control of the sequencer 50.

[0072] In the embodiment described above, the parallelism mode of the multiprocessor system 1 is set in advance as fixed data in each processor PEi, but the parallelism mode may instead be set variably. That is, when an address supplied or generated during execution exceeds the address space of the memory that a processor PEi can access, the processor sets the parallelism mode to a value that lowers the current degree of parallelism, which makes it possible to read the data stored at the desired address from memory. The user can thus program without being aware of the degree of parallelism of the multiprocessor system 1.

[0073] Assume now that the multiprocessor system 1 has a degree of parallelism of 8, that is, parallelism mode = 0, as shown in FIG. 1(a). Assume also that each processor PEi stores in advance, as data in its main storage device 1b, the size S of the memory space of its own local memory 3i.

[0074] The sequencer 50 supplies the address data addr, obtained by decoding the program stored in the memory 51, to all processors PEi at once via the input data bus 10.

[0075] The arithmetic unit 1i of each processor PEi receives the supplied address data addr and, in response, executes step ST20 of FIG. 11. In step ST20, the instruction processing device 1a temporarily stores the supplied address data addr in the register group 1d, reads the size S of the local memory space from the main storage device 1b, and evaluates (addr ≧ S). If it determines that this expression does not hold, that is, that the supplied address data addr designates a location within the memory space of its own local memory 3i, the process proceeds to step ST21, where the instruction processing device 1a sets the parallelism mode in the main storage device 1b to the value 0 and ends the process. The degree of parallelism of the multiprocessor system 1 is thereby maintained at 8, and each processor PEi can read and write the desired data in its own local memory 3i by addressing it with the supplied address data addr.

[0076] Returning to step ST20, if the instruction processing device 1a determines that (addr ≧ S) holds, that is, that the supplied address data addr exceeds the size of the memory space of its own local memory 3i, the process proceeds to step ST22.

[0077] In step ST22, the instruction processing device 1a sets the data mode in the main storage device 1b to the value 1, lowering the degree of parallelism of the multiprocessor system 1 to 4. The system configuration thereby changes from that of FIG. 1(a) to that of FIG. 1(b), and each processor PEi whose processor number pe is even can access, as part of its own memory space, the local memory 3i of the adjacent processor PEi with an odd processor number pe, and can read and write the desired data using the supplied address data addr.
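
The variable setting of steps ST20 to ST22 reduces to a single comparison. The function name `update_mode` and the example size of 1024 words are hypothetical; the flow described above only distinguishes the two outcomes mode = 0 and mode = 1.

```python
def update_mode(addr: int, local_size: int) -> int:
    """Sketch of steps ST20 to ST22: pick the mode from the supplied address.

    ST20 tests addr >= S. If it does not hold, ST21 sets mode = 0
    (parallelism 8); if it holds, ST22 sets mode = 1 (parallelism 4).
    """
    if addr >= local_size:
        return 1  # step ST22: address lies beyond the processor's own local memory
    return 0      # step ST21: address lies within the processor's own local memory


S = 1024  # hypothetical size of one local memory space
print(update_mode(100, S), update_mode(1500, S))
```

This sketch follows the two-way branch of FIG. 11 as described; lowering the degree of parallelism further for even larger addresses would need additional branches that the text does not spell out.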

[0078] As described in this embodiment, because the multiprocessor system 1 has 2ⁿ (= N) processors, the symmetry of the system configuration is preserved whether the degree of parallelism is lowered to N/2, N/4, and so on, so control can be performed efficiently.

【００７９】[0079]

[Effects of the Invention] As described above, according to the present invention, the signal deriving means derives a bus switching signal on the basis of the parallelism data and the specific data that uniquely identifies the processor, and the bus selection means operates to connect either the second or the third bus to the first bus on the basis of the derived bus switching signal. When the second bus is connected to the first bus according to the parallelism data, the storage means of the processor is set to be accessible by the arithmetic means of the same processor; conversely, when the third bus is connected to the first bus, the storage means of the processor is set to be accessible, via the bus switching means of the adjacent processor, by the arithmetic means of another processor. The multiprocessor system according to the present invention therefore has the effect that the storage means each processor accesses independently can be dynamically rearranged simply by variably setting the parallelism data, without increasing the number of program steps.

[0080] The effect described above means that, while the system configuration remains fixed, each processor can access the storage means of other processors via the buses switched by the bus switching means, without data transfer between processors, so the address space of local memory accessible to each processor can be set arbitrarily and flexibly.

[0081] Furthermore, these effects make it possible to improve the processing speed and processing capability of each processor and, in turn, of the multiprocessor system itself.

[Brief description of drawings]

[FIG. 1] (a) to (c) are system configuration diagrams for explaining the outline of a multiprocessor system according to a first embodiment of the present invention.

[FIG. 2] (a) and (b) are schematic block diagrams of a processor constituting a multiprocessor system according to an embodiment of the present invention.

[FIG. 3] A schematic diagram for explaining the connection state of the input/output buses of a memory management unit according to an embodiment of the present invention.

[FIG. 4] (a) and (b) are block diagrams of the memory management unit shown in FIG. 3.

[FIG. 5] A flow diagram showing the connection bus selection operation in the memory management unit according to an embodiment of the present invention.

[FIG. 6] (a) and (b) are schematic diagrams showing the bus connection states in the memory management unit obtained according to the operation flow of FIG. 5.

[FIG. 7] A schematic diagram showing the bus connection state at a degree of parallelism of 2, obtained according to the operation flow of FIG. 5.

[FIG. 8] A flow diagram showing the memory select operation according to an embodiment of the present invention.

[FIG. 9] A schematic diagram showing an example of the memory select state at a degree of parallelism of 2, obtained according to the operation flow of FIG. 8.

[FIG. 10] A schematic diagram for explaining the state in which only the data bus is switched in the memory management unit according to an embodiment of the present invention.

[FIG. 11] A flow diagram showing the operation of variably setting the degree of parallelism of the multiprocessor system according to an embodiment of the present invention.

[FIG. 12] A schematic configuration diagram of a conventional multiprocessor system in which a plurality of processors are connected in parallel.

[FIG. 13] A schematic configuration diagram of the processor shown in FIG. 12.

[Explanation of Reference Signs]
1: multiprocessor system
PEi: processor (computing element)
1i: arithmetic unit
2i: memory management unit
3i: local memory
10: input data bus
20: output data bus
40: local memory bus
50: sequencer
51: memory
80: internal data bus
CH: bus switching signal
MS: memory select signal
pe: processor number
mode: parallelism
Addr3: upper 3 bits of the address
S: size of the local memory space
(i = 0, 1, 2, ..., 7)
In the drawings, the same reference signs denote the same or corresponding parts.

Claims

1. A multiprocessor system having at least two processors that are connected in parallel via the same bus and perform processing in parallel, wherein each processor comprises: arithmetic means connected to the same bus; storage means; bus switching means; a first bus connecting the storage means, the bus switching means, and the bus switching means of one adjacent processor; a second bus connecting the arithmetic means and the bus switching means; and a third bus connecting the bus switching means and the bus switching means of the other adjacent processor; the arithmetic means further includes signal deriving means for deriving a bus switching signal on the basis of internally prestored specific data that uniquely identifies the processor and parallelism data; the bus switching means further includes bus selection means for selecting either the second or the third bus on the basis of the switching signal derived by the signal deriving means and connecting it to the first bus; and the parallelism data is data specifying the number of processors that access the storage means in parallel.