JP2799528B2

JP2799528B2 - Multiprocessor system

Info

Publication number: JP2799528B2
Application number: JP3218213A
Authority: JP
Inventors: 徹上田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1991-08-29
Filing date: 1991-08-29
Publication date: 1998-09-17
Anticipated expiration: 2013-09-17
Also published as: JPH0554008A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a multiprocessor system, and more particularly to a multiprocessor system that has at least two processors executing operations in parallel, in which each processor has a local memory that it can access independently and the processors are interconnected via the same bus.

[0002]

[Prior Art] Various multiprocessor systems have been provided to speed up arithmetic processing such as numerical analysis and neural networks.

[0003] FIG. 12 is a schematic configuration diagram of a conventional multiprocessor system in which a plurality of processors are connected in parallel.

[0004] In FIG. 12, the multiprocessor system 2 includes computing elements (hereinafter called processors) Pei (hereinafter, i = 0, 1, 2, ..., 7), a sequencer 50 connected in parallel to the processors Pei via an input data bus 10 and an output data bus 20, and a memory 51 that is accessed by the sequencer 50 so that the data stored in it can be read and written. The processors Pei are also connected to one another via an internal data bus 80.

[0005] The multiprocessor system 2 configured as described above has eight processors. Each processor Pei has a local memory that it can access independently, as described later, so that operations can be executed in parallel on externally supplied data. In the multiprocessor system 2 of FIG. 12, eight processors can process a single input datum independently and simultaneously.

[0006] In operation, the sequencer 50 of the multiprocessor system 2 sequentially reads and decodes a program stored in advance in the memory 51, and supplies the decoded instructions (hereinafter called commands) one after another to all of the processors Pei in parallel via the input data bus 10. In response, the processors Pei simultaneously execute arithmetic processing according to the supplied command and output the results to the outside via the output data bus 20. Each processor Pei is configured to perform arithmetic processing according to either the commands supplied from the sequencer 50 or a command sequence stored internally in advance.

[0007] FIG. 13 is a schematic configuration diagram of the processor Pei shown in FIG. 12. In FIG. 13, the processor Pei includes an arithmetic unit 1i and a local memory 3i. Like a general CPU (central processing unit), the arithmetic unit 1i has an adder, registers, and the like, and it accesses the local memory 3i sequentially in parallel with its arithmetic processing.

[0008] The arithmetic unit 1i is connected to the sequencer 50 via the input data bus 10 and the output data bus 20, and to the adjacent processors Pei via the internal data bus 80.

[0009] The arithmetic processing of the processor Pei is described next. As an example, consider the multiplication that is typical of neural networks: a numerical value stored internally in advance (called a weight) is multiplied by input data supplied via the input data bus 10.

[0010] The sequencer 50 accesses the memory 51 and supplies the data it reads to all of the processors Pei in parallel via the input data bus 10. In response, each processor Pei processes the input data in parallel. First, the arithmetic unit 1i accesses the local memory 3i, reads the weight stored there in advance, and holds it temporarily in an internal register. The arithmetic unit 1i then multiplies the supplied input data by the weight held in the internal register, using the adder, to obtain the product of the input data and the weight. Because this multiplication is carried out simultaneously in the eight processors, the system achieves eight times the operation speed of a single processor.
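
The broadcast-and-multiply step described above can be illustrated with a short Python sketch. It is a sequential simulation under stated assumptions, not the hardware itself: the class name `Processor`, the function `broadcast_multiply`, and the weight values are hypothetical, and the list comprehension stands in for eight units that would operate at the same time.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """One computing element Pei holding a single weight (assumed layout)."""
    weight: float  # stands in for the contents of local memory 3i

    def multiply(self, input_value: float) -> float:
        register = self.weight          # read the weight into an internal register
        return input_value * register   # product of input data and weight


def broadcast_multiply(processors, input_value):
    """Give the same input to every processor and collect the products."""
    return [pe.multiply(input_value) for pe in processors]


# Hypothetical weights 0.0 .. 7.0, one per processor Pe0 .. Pe7.
processors = [Processor(weight=float(i)) for i in range(8)]
products = broadcast_multiply(processors, 2.0)
```

Each processor touches only its own weight, which is why this kind of workload scales with the number of processors in the conventional system.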

[0011]

[Problems to Be Solved by the Invention] In the conventional multiprocessor system 2 described above, high operation speed is obtained for processing in which each processor Pei refers only to its own local memory 3i. However, processing that rearranges the data stored in advance in the local memories 3i of the processors Pei suffers a marked drop in processing speed. For example, to sort the numerical data stored in the eight local memories 30 to 37 of processors Pe0 to Pe7 into descending order, the processors Pei must compare the numerical data stored in one another's local memories 3i, so the numerical data stored in each processor's own local memory 3i is transferred to other processors Pei via the internal data bus 80. In other words, the arithmetic unit 1i of each processor Pei reads the numerical data stored in its local memory 3i, transfers it to another processor Pei via the internal data bus 80, and only then performs the comparison, so the processing speed falls.
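
To make the transfer cost concrete, the following Python sketch counts bus transfers for one possible way of sorting one value per processor into descending order. The source does not name a sorting algorithm; odd-even transposition sort is used here only as an assumed example, and the function name and the transfer-counting model (one transfer to bring a neighbor's value over, one more to send a value back when the pair is swapped) are hypothetical.

```python
def sort_descending_with_transfers(values):
    """Odd-even transposition sort over neighboring processors.

    Returns the sorted list and the number of simulated transfers
    over the internal data bus 80 (assumed cost model).
    """
    data = list(values)          # data[i] is the value held by processor Pei
    n = len(data)
    transfers = 0
    for phase in range(n):       # n phases are enough for n elements
        start = phase % 2        # alternate between even and odd pairs
        for i in range(start, n - 1, 2):
            transfers += 1       # Pe(i+1) sends its value to Pe(i) for comparison
            if data[i] < data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                transfers += 1   # Pe(i) sends the smaller value back
    return data, transfers


sorted_values, transfer_count = sort_descending_with_transfers([3, 7, 1, 8, 2, 6, 5, 4])
```

Under this model every comparison costs at least one transfer, which is the overhead the paragraph above attributes to the conventional system.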

[0012] Moreover, as noted above, processing in which a processor Pei also refers to data stored in the local memory 3i of another processor requires far more program steps than processing that refers only to data in its own local memory 3i. This makes programming errors more likely and raises the cost of maintaining the program.

[0013] An object of the present invention is therefore to provide a multiprocessor system in which the processors can freely access one another's local memories without an increase in the number of program steps, thereby improving the processing speed and processing capability of the system itself.

[0014]

[Means for Solving the Problems] The multiprocessor system according to the present invention has at least two processors connected in parallel via the same bus, with the processors performing processing in parallel. Specifically, each processor comprises: arithmetic means to which the same bus is connected; storage means; bus switching means; a first bus that connects, via the bus switching means, the storage means and the bus switching means of one adjacent processor; a second bus that connects the arithmetic means and the bus switching means; and a third bus that connects the bus switching means and the bus switching means of the other adjacent processor.

[0015] The arithmetic means further comprises signal deriving means that derives a bus switching signal based on two items of data stored internally in advance: identification data that uniquely identifies the processor, and parallelism data that specifies the number of processors accessing the storage means in parallel.

[0016] The bus switching means further comprises bus selection means that, based on the switching signal derived by the signal deriving means, selects one of the second and third buses and connects it to the first bus.

[0017]

[Operation] In the multiprocessor system according to the present invention, configured as described above, the signal deriving means derives the bus switching signal based on the parallelism data and the identification data that uniquely identifies the processor. The bus selection means selects one of the second and third buses based on the derived switching signal and connects it to the first bus. Consequently, when the parallelism data causes the second bus to be connected to the first bus, the storage means of the processor can be accessed by the arithmetic means of the same processor. Conversely, when the parallelism data causes the third bus to be connected to the first bus, the storage means of the processor can be accessed by the arithmetic means of another processor via the bus switching means of the adjacent processor. Thus, merely by changing the parallelism data, the storage means belonging to each processor is dynamically relocated. Although the system configuration is fixed, each processor can access the storage means of other processors without transferring data, and the memory address space accessible to each processor is set dynamically.

[0018]

[Embodiments] An embodiment of the present invention is described in detail below with reference to the drawings.

[0019] The embodiment below assumes a multiprocessor system with eight processors. The only requirement, however, is that the system have at least two processors; there are no other particular restrictions.

[0020] FIGS. 1(a) to 1(c) are system configuration diagrams outlining a multiprocessor system 1 according to an embodiment of the present invention.

[0021] The multiprocessor system 1 shown in FIG. 1 dynamically relocates the local memory of each processor while the system is running, so that data stored in the local memory of another processor can be referenced without any data transfer between processors.

[0022] FIG. 1 shows a system comprising eight processors (computing elements) PE0 to PE7. In FIG. 1(a), each processor PEi has a local memory 3i as a memory space that it can access independently. In the case of FIG. 1(a), eight operations can be executed in parallel, as in the conventional system.

[0023] FIGS. 1(b) and 1(c) are conceptual diagrams explaining the dynamic relocation of local memory in the multiprocessor system of this embodiment.

[0024] The degree of parallelism is explained here. The multiprocessor system 1 shown in FIG. 1(a) has eight processors, all of which access their own memory 3i, so the degree of parallelism is 8.

[0025] In FIG. 1(b), the shaded processors PEi are processors that do not access their own local memory 3i; the other processors PEi are those that do access local memory 3i. As illustrated, a processor PEi that accesses local memory can access the local memory 3i of the adjacent shaded processor as if it were its own. For example, processor PE0 can access the local memory 31 of the adjacent processor PE1 in the same way as its own local memory 30. In this case four processors PEi access local memory 3i in parallel, so the degree of parallelism is 4.

[0026] Similarly, at a degree of parallelism of 2, as shown in FIG. 1(c), the shaded processors that do not access local memory 3i are PE1, PE2, PE3, PE5, PE6, and PE7. Processor PE0, which does access local memory, accesses the local memories 31 to 33 of processors PE1 to PE3 in the same way as its own local memory 30. The memory space accessible to processor PE0 is therefore four times that of the normal case of parallelism 8 shown in FIG. 1(a). Likewise, processor PE4 accesses the local memories 35 to 37 of processors PE5 to PE7 in the same way as its own local memory 34, so its accessible memory space is also four times that of the normal case of parallelism 8.

[0027] Although not illustrated, at a degree of parallelism of 1 a single processor PEi obtains eight times the accessible memory space of the normal case of parallelism 8.

[0028] As described above, the degree of parallelism is the number of processors PEi that can access local memory 3i in parallel. The shaded processors PEi in FIG. 1, which do not access local memory 3i, may be performing operations that do not access their own local memory 3i, or may be idle and performing no operation at all.
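
The grouping described for parallelism 8, 4, 2, and 1 can be summarized in a small Python sketch. It only restates the examples above for eight processors; the function name `memory_groups` and the assumption that the degree of parallelism evenly divides the processor count are illustrative, not taken from the source.

```python
def memory_groups(parallelism, n_processors=8):
    """Map each memory-accessing processor to the local memories it can reach.

    Keys are the numbers of the processors that access memory; values list
    the indices i of the local memories 3i available to that processor.
    """
    if parallelism <= 0 or n_processors % parallelism != 0:
        raise ValueError("parallelism must evenly divide the processor count")
    group_size = n_processors // parallelism
    return {
        leader: list(range(leader, leader + group_size))
        for leader in range(0, n_processors, group_size)
    }


groups_at_2 = memory_groups(2)   # corresponds to the case of FIG. 1(c)
```

For parallelism 2 this gives PE0 the memories of PE0 to PE3 and PE4 the memories of PE4 to PE7, matching the fourfold expansion described in the text.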

[0029] FIGS. 2(a) and 2(b) are schematic block diagrams of a processor in a multiprocessor system according to an embodiment of the present invention.

[0030] In FIG. 2(a), the processor PEi is connected via the input data bus 10 and the output data bus 20 to the sequencer 50, to which the memory 51 is connected. The processor PEi further includes an arithmetic unit 1i, a memory management unit 2i, and a local memory 3i.

[0031] The arithmetic unit 1i is connected to the sequencer 50 via the input data bus 10 and the output data bus 20. It is also connected to the adjacent processors via the internal data bus 80.

[0032] The memory management unit 2i is connected to the arithmetic unit 1i and, via a local memory bus 40, to the memory management unit 2(i-1) or 2(i+1) of an adjacent processor. The local memory 3i is connected to the memory management unit 2i. The processor PEi is thus configured so that the arithmetic unit 1i accesses the local memory 3i through the memory management unit 2i.

[0033] FIG. 2(b) is a schematic block diagram of the arithmetic unit 1i shown in FIG. 2(a). Like a general CPU, the arithmetic unit 1i includes an instruction processing device 1a and a main storage device 1b. The instruction processing device 1a includes an adder 1c and a register group 1d. It processes data according to a program (command sequence) stored in advance in the main storage device 1b and stores the results back in the main storage device 1b. The instruction processing device 1a is also connected to the input data bus 10, the output data bus 20, and the internal data bus 80, and through them to the sequencer 50 and the adjacent processors PEi.

[0034] FIG. 3 is a schematic diagram explaining how the input/output buses are connected to the memory management unit according to an embodiment of the present invention.

[0035] The input/output buses shown in FIG. 3 are connected to the memory management unit 2i. FIG. 3 is drawn with the memory management unit 2i of processor PEi at its center. The memory management unit 2i is connected to the memory management units 2(i-1) and 2(i+1) of the adjacent processors via the local memory bus 40. More specifically, the local memory bus 40 comprises an address bus and a data bus: memory management units 2i and 2(i-1) are connected via address bus B and data bus DB, and memory management units 2i and 2(i+1) are likewise connected via address bus Y and data bus DY.

[0036] The memory management unit 2i is also connected to the arithmetic unit 1i via address bus A and data bus DA, and receives a bus switching signal CH from the arithmetic unit 1i. In addition, the memory management unit 2i is connected to the local memory 3i via address bus X and data bus DX, and outputs a memory select signal MS to the local memory 3i. The memory management unit 2i switches the connections among these address buses and data buses.

[0037] FIGS. 4(a) and 4(b) are block diagrams of the memory management unit 2i shown in FIG. 3.

[0038] FIG. 4(a) shows the initial internal state of the memory management unit 2i, before any bus connection switching.

[0039] To simplify the description, FIG. 4(a) draws each data bus together with its address bus. Accordingly, address bus A and data bus DA of FIG. 3 are shown as bus A, address bus B and data bus DB as bus B, address bus X and data bus DX as bus X, and address bus Y and data bus DY as bus Y.

[0040] As shown in FIG. 4(a), in the initial state the memory management unit 2i connects bus X, which leads to the local memory 3i, to bus Y. The memory management unit 2i switches bus connections based on the bus switching signal CH supplied from the arithmetic unit 1i. That is, it selects between two connections: either bus A is connected to buses X and Y, so that the arithmetic unit 1i can access the local memory 3i, or bus B, which leads to the adjacent memory management unit 2(i-1), is connected to buses X and Y, so that the adjacent processor PE(i-1) can access the local memory 3i.

[0041] FIG. 4(b) is a schematic block diagram of the memory management unit 2i. In FIG. 4(b), the memory management unit 2i includes an address bus switch 11i, a data bus switch 12i, a mode register 13i, a pe register 14i, and a memory select determiner 15i.

[0042] The bus switching signal CH is supplied from the arithmetic unit 1i to the address bus switch 11i and the data bus switch 12i. The address bus switch 11i selects one of the connected address buses A and B according to the switching signal CH and passes the address received on the selected bus to its output. The address output by the address bus switch 11i is supplied to the adjacent memory management unit 2(i+1) and to the local memory 3i. The data bus switch 12i likewise connects one of the connected data buses A and B according to the bus switching signal CH and passes the data received on the connected bus to its output. The data output by the data bus switch 12i is supplied to the adjacent memory management unit 2(i+1) and to the local memory 3i.

[0043] Thus the address and data output by the address bus switch 11i and the data bus switch 12i are supplied in parallel, over the transmission paths of bus X and bus Y shown in FIG. 4(a), to both the local memory 3i and the adjacent memory management unit 2(i+1).
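
The behavior of the two switches can be pictured as a two-input selector whose output fans out to bus X and bus Y. The Python sketch below models that idea only; the names `BusValue` and `switch_buses`, and the representation of the switching signal CH as a boolean, are assumptions made for illustration, since the text here does not give the encoding of CH.

```python
from typing import NamedTuple, Tuple


class BusValue(NamedTuple):
    """Address and data carried together, as in the simplified FIG. 4(a) view."""
    address: int
    data: int


def switch_buses(ch_selects_a: bool, bus_a: BusValue, bus_b: BusValue) -> Tuple[BusValue, BusValue]:
    """Select bus A or bus B and drive the result onto both bus X and bus Y.

    Bus X leads to local memory 3i; bus Y leads to memory management
    unit 2(i+1). Both receive the same selected value.
    """
    selected = bus_a if ch_selects_a else bus_b
    return selected, selected   # (bus X, bus Y)


own_request = BusValue(address=0x10, data=42)        # from arithmetic unit 1i
upstream_request = BusValue(address=0x20, data=7)    # from unit 2(i-1)
bus_x, bus_y = switch_buses(False, own_request, upstream_request)
```

Because the output is forwarded on bus Y as well, a request selected at one unit can continue to the next unit downstream.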

[0044] A command bus 10i is connected to both the mode register 13i and the pe register 14i. The command bus 10i, which is not shown in FIG. 3, carries a 2-bit command signal from the arithmetic unit 1i to registers 13i and 14i. The arithmetic unit 1i supplies commands over the command bus 10i in order to rewrite the contents of registers 13i and 14i. When the 2-bit command is "00", the mode register 13i and the pe register 14i are both set to data-readable mode. When the command is "01", only the pe register 14i is set to writable mode; the data supplied via data bus DA is written to register 14i and its contents are updated. When the command is "10", only the mode register 13i is set to writable mode; the data supplied via data bus DA is written to the mode register 13i and its contents are updated. The command "11" is assumed never to occur in the multiprocessor system 1.
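
The 2-bit command decoding can be sketched in Python as follows. The class name `MmuRegisters` and the choice to raise an error for "11" are assumptions for illustration; the source says only that "11" is assumed not to occur.

```python
class MmuRegisters:
    """Model of the mode register 13i and pe register 14i (illustrative only)."""

    def __init__(self, mode: int = 0, pe: int = 0):
        self.mode = mode
        self.pe = pe

    def command(self, cmd: int, data_da: int = 0):
        """Apply a 2-bit command; data_da is the value on data bus DA."""
        if cmd == 0b00:
            pass                      # both registers readable, nothing written
        elif cmd == 0b01:
            self.pe = data_da         # only the pe register is written
        elif cmd == 0b10:
            self.mode = data_da       # only the mode register is written
        else:
            # "11" is assumed never to occur; rejecting it is a design choice here.
            raise ValueError("command 11 is not expected")
        return self.mode, self.pe


regs = MmuRegisters()
regs.command(0b01, data_da=5)   # set processor number pe
regs.command(0b10, data_da=3)   # set parallelism value mode
current = regs.command(0b00)    # read back (mode, pe)
```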

[0045] The memory select determiner 15i receives the address output from the address bus selected according to the bus switching signal CH, the data read from the mode register 13i, and the data read from the pe register 14i; from these it derives the memory select signal MS and supplies it to the local memory 3i. The memory select signal MS takes one of two levels, "1" or "0". When MS is at level "1" the local memory 3i is set to writable mode, and when MS is at level "0" it is set to data-readable mode. Accordingly, only when the memory select determiner 15i supplies MS at level "1" does the local memory 3i write the data output by the data bus switch 12i to the address output by the address bus switch 11i. When MS is at level "0", the local memory 3i reads data from the address output by the address bus switch 11i.
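
How the local memory responds to the two levels of MS can be sketched as below. This models only the effect of MS on the memory; how the determiner computes MS from the address and the two registers is not covered by this paragraph and is not modeled. The class name `LocalMemory` and the memory size are hypothetical.

```python
class LocalMemory:
    """Local memory 3i reacting to the memory select signal MS (illustrative)."""

    def __init__(self, size: int):
        self.cells = [0] * size

    def access(self, ms: int, address: int, data: int = 0):
        """MS = 1: write data at address. MS = 0: read the value at address."""
        if ms == 1:
            self.cells[address] = data
            return None
        if ms == 0:
            return self.cells[address]
        raise ValueError("MS must be 0 or 1")


memory = LocalMemory(size=16)       # hypothetical size
memory.access(ms=1, address=3, data=99)
value = memory.access(ms=0, address=3)
```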

[0046] FIG. 5 is a flowchart showing the connection-bus selection operation in the memory management unit 2i according to an embodiment of the present invention.

[0047] In this embodiment, each processor PEi is assumed to be assigned in advance, as data, a processor number pe that uniquely identifies it. Processors PE0 to PE7 shown in FIG. 1 are assigned processor numbers pe of 0 to 7, respectively. The parallelism value mode, a variable that determines the degree of parallelism of the multiprocessor system 1, is assumed to take the values 0, 1, 3, and 7 for degrees of parallelism of 8, 4, 2, and 1, respectively. The processor number pe and the parallelism value mode are assumed to be stored in advance in the main storage device 1b or the register group 1d of the arithmetic unit 1i of processor PEi. The flow shown in FIG. 5 is assumed to be stored in advance as a program in the main storage device 1b of the arithmetic unit 1i and executed under the control of the instruction processing device 1a.
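
The four listed pairs (8, 0), (4, 1), (2, 3), and (1, 7) all fit the pattern mode = 8 / parallelism - 1. The source gives only the table of values, so the closed form in the Python sketch below is an observation about that table rather than something the text states; the function name `mode_for_parallelism` is hypothetical.

```python
def mode_for_parallelism(parallelism: int, n_processors: int = 8) -> int:
    """Return the parallelism value mode for a given degree of parallelism.

    Uses mode = n_processors / parallelism - 1, which reproduces the
    values 0, 1, 3, 7 listed for parallelism 8, 4, 2, 1.
    """
    if parallelism <= 0 or n_processors % parallelism != 0:
        raise ValueError("parallelism must evenly divide the processor count")
    return n_processors // parallelism - 1


mode_table = {p: mode_for_parallelism(p) for p in (8, 4, 2, 1)}
```

In binary these values are 000, 001, 011, and 111, that is, a run of low-order ones whose length grows as the parallelism is halved.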

[0048] The instruction processing device 1a reads the program stored in the main storage device 1b. In step ST1 of FIG. 5 (abbreviated ST1 in the figure), it reads the stored processor number pe and parallelism value mode and determines whether (pe & mode) = 0 holds. The operator & denotes the bitwise logical AND of the processor number pe and the parallelism value mode.

[0049] If the result in step ST1 is 0, processing proceeds to step ST2, where the instruction processing device 1a derives a bus switching signal CH that connects bus A of FIG. 4(a) to buses X and Y. Conversely, if the result in step ST1 is not 0, processing proceeds to step ST3, where it derives a bus switching signal CH that connects bus B of FIG. 4(a) to buses X and Y. The connection-bus selection process then ends.
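
Steps ST1 to ST3 amount to a single bitwise test. A minimal Python sketch follows; the function name `select_bus` and the use of the strings "A" and "B" to stand for the bus switching signal CH are assumptions, since the actual signal encoding is not given here.

```python
def select_bus(pe: int, mode: int) -> str:
    """Return which bus is connected to buses X and Y for processor number pe.

    ST1: test (pe & mode) == 0.
    ST2: if true, connect bus A (own arithmetic unit).
    ST3: otherwise, connect bus B (adjacent unit 2(i-1)).
    """
    return "A" if (pe & mode) == 0 else "B"


# mode = 1 corresponds to a degree of parallelism of 4.
selection_at_4 = [select_bus(pe, mode=1) for pe in range(8)]
```

With mode = 1 the even-numbered processors select bus A and the odd-numbered ones select bus B, which is the parallelism-4 arrangement.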

[0050] For example, when the multiprocessor system 1 operates at a degree of parallelism of 4, the parallelism value mode is (001)₂. The memory management unit 2i of each processor PEi whose processor number pe is even therefore connects bus A to the local memory 3i, so that the arithmetic unit 1i can access the local memory 3i. The memory management unit 2i of each processor PEi whose processor number pe is odd connects bus B to the local memory 3i, switching the bus connection so that processor PE(i-1) can access the local memory 3i.

[0051] FIGS. 6(a) and 6(b) are schematic diagrams showing the bus connection states in the memory management unit 2i obtained according to the operation flow of FIG. 5.

[0052] FIG. 6(a) shows the state in which the memory management unit 2i selects bus A based on the supplied bus switching signal CH and connects it to the local memory 3i. FIG. 6(b) shows the state in which the memory management unit 2i selects bus B based on the supplied bus switching signal CH and connects the memory management unit 2(i−1) of the processor PE(i−1) to the local memory 3i.

[0053] Through this bus connection switching operation at a parallelism of 4, the local memory 3i of each processor PEi whose processor number pe is odd can be referenced as local memory of the processor PEi whose processor number pe is even.

[0054] FIG. 7 is a schematic diagram showing the bus connection state at a parallelism of 2, obtained according to the operation flow of FIG. 5.

[0055] In FIG. 7, at a parallelism of 2, the processor PE0 including the memory management unit 20, as shown in FIG. 1(c), can access the local memories 31 to 33 in the same way as its own local memory 30 through the bus connection switching operation of FIG. 5 performed in the memory management units 21, 22 and 23. The accessible memory space is therefore expanded fourfold.

[0056] FIG. 8 is a flowchart showing a memory select operation according to one embodiment of the present invention. The flow shown in FIG. 8 represents the operation of the memory select determiner 15i of each processor PEi. The memory select operation is the operation in which the memory select determiner 15i derives the memory select signal MS, based on the given address, so as to determine whether or not the local memory 3i is made accessible. Addr3 in the figure denotes the upper 3 bits of the given address and specifies the processor number pe of the processor PEi that has the local memory 3i to be accessed. In this embodiment eight processors PEi are connected, so the local memory 3i to be accessed is determined from the upper 3 bits of the given address; however, the number of bits referred to is not fixed at 3 and may be determined depending on the number of processors PEi constituting the system 1. Accordingly, the address given to the memory management unit 2i consists of three or more bits, including at least these upper 3 bits.

[0057] Next, the memory select operation according to one embodiment of the present invention will be described with reference to the flow of FIG. 8.

[0058] The memory select determiner 15i of FIG. 4(b) first determines, in step ST10, whether (Addr3 = pe & mode) holds. That is, the memory select determiner 15i reads the processor number pe and the parallelism mode stored in advance from the registers 14i and 13i, and executes step ST10 together with the upper 3 bits Addr3 of the address given via the address bus switch 11i. If the logical expression holds, a memory select signal MS of level "1" is derived. The local memory 3i is thereby made accessible.

[0059] Returning to step ST10, if on the other hand the logical expression does not hold, a memory select signal MS of level "0" is derived and applied to the local memory 3i. The local memory 3i is thereby made inaccessible.
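The test performed in step ST10 can be sketched as follows. The function name memory_select is hypothetical, and the value mode = 3 used for a parallelism of 2 is an assumption made for this example (a mask over the low bits of the processor number); the text itself only states the comparison Addr3 = pe & mode.

```python
def memory_select(addr3: int, pe: int, mode: int) -> int:
    """Return the level of the memory select signal MS (1 or 0).

    addr3 : upper 3 bits of the given address
    pe    : processor number of this processor
    mode  : parallelism data (encoding assumed, see lead-in)
    """
    # Step ST10: MS is "1" only when Addr3 equals (pe & mode).
    return 1 if addr3 == (pe & mode) else 0


if __name__ == "__main__":
    # Same address given to PE0..PE3 with Addr3 = 2 and assumed mode = 3:
    # only PE2 asserts MS.
    print([memory_select(2, pe, 3) for pe in range(4)])
```

Under the assumed encoding, processors PE0 to PE3 given Addr3 = 2 produce MS levels 0, 0, 1, 0, so only the local memory of PE2 is enabled within that group.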

[0060] FIG. 9 is a schematic diagram showing an example of the memory select state at a parallelism of 2, obtained according to the operation flow of FIG. 8.

[0061] Assume now that the same address is given from the sequencer 50 to all the processors PEi via the input data bus 10.

[0062] When the upper 3 bits of the address are 2, that is, Addr3 = (10)₂, only the memory select signal MS of the processor PE2 goes to level "1" as shown in FIG. 9, and only the local memory 32 establishes a connection bus with the memory management unit 22. Therefore, the processor PE0 can access the local memory 32, within the fourfold-expanded memory space, based on the given address.

[0063] As described above, each processor PEi selects, according to the operation flow shown in FIG. 5, the bus to be connected to its own local memory 3i based on the processor number pe and the current parallelism mode, and switches the connection. Further, according to the processing flow of FIG. 8, it determines, from the value of the upper 3 bits Addr3 of the given address together with the processor number pe and the parallelism mode, which of the local memories making up the memory space expanded under the current parallelism is to be accessed. Each processor PEi can thereby access, using the given address, the memory space expanded in accordance with the parallelism.

[0064] FIG. 10 is a schematic diagram for explaining a state in which only the data bus is switched in the memory management unit 2i according to one embodiment of the present invention.

[0065] In the embodiment described above, the memory management unit 2i is structured to switch the connections of both the address bus and the data bus, but a simplified structure that switches only the data bus can achieve the same result.

[0066] As shown in FIG. 10, in this case the address for accessing the local memory 3i is always supplied by the corresponding arithmetic unit 1i. However, there is a restriction: if the address for accessing the memory changes during execution of an operation (changes dynamically), memory access is not possible with the simple structure shown in FIG. 10. For example, when the processor PE0 attempts to access the local memory 31 of the adjacent processor PE1, memory access is possible with this simple structure if the address for the memory reference is written in a command stored in the memory 51. That is, the sequencer 50 supplies the commands it reads sequentially from the memory 51 to the processors PE0 and PE1 simultaneously via the input data bus 10, so the arithmetic units 10 and 11 receive the same command at the same time and simultaneously supply address signals to their own local memories 30 and 31, which allows the processor PE0 to access the local memory 31 of the adjacent processor PE1. However, when the same command is not supplied to all processors at once via the sequencer 50 and the address for memory access is instead stored in, for example, the register group 1d of the processor PE0, the adjacent processor PE1 cannot refer to the address stored there and therefore cannot supply an appropriate address signal to its own local memory 31.

[0067] As described above, when the multiprocessor system 1 operates solely according to the command sequence stored in advance in the memory 51 and supplied via the sequencer 50, the same command is given to every processor at once, so the desired memory space can be accessed using the simple structure of FIG. 10 that switches only the data bus.

[0068] When, in the multiprocessor system 1, processing is executed according to both commands given from the sequencer 50 and commands stored in the register group 1d inside each processor PEi, the structure that switches both the data bus and the address bus, as shown in FIG. 3, is adopted so that the desired memory can be accessed.

[0069] Thus, depending on how the commands that control the operation of the multiprocessor system 1 are supplied, either the method of switching only the data bus or the method of switching the data bus and the address bus together may be chosen.

[0070] FIG. 11 is a flowchart showing the operation of variably setting the parallelism of the multiprocessor system according to one embodiment of the present invention.

[0071] The illustrated flow may be stored in advance as a program in the main storage device 1b of each processor PEi and executed under the control of the instruction processing device 1a, or may be stored in advance as a program in the memory 51 and sequentially read out and executed under the control of the sequencer 50.

[0072] In the embodiment described above, the parallelism mode of the multiprocessor system 1 is set in advance as fixed data in each processor PEi, but the parallelism mode may instead be set variably. That is, when an address given or generated during execution exceeds the address space of the memory that a processor PEi can access, the processor sets the parallelism mode to a value that lowers the current parallelism, which makes it possible to read the data stored at the desired address from the memory. The user can thus program without being aware of the parallelism of the multiprocessor system 1.

[0073] Assume now that the multiprocessor system 1 has a parallelism of 8, that is, parallelism mode = 0, as shown in FIG. 1(a). Assume also that each processor PEi stores in advance, in its main storage device 1b, the size S of the memory space of its own local memory 3i as data.

[0074] The sequencer 50 supplies the address data addr, obtained by decoding the program stored in the memory 51, to all the processors PEi simultaneously via the input data bus 10.

[0075] The arithmetic unit 1i of each processor PEi receives the given address data addr and in response executes step ST20 of FIG. 11. In step ST20, the instruction processing device 1a temporarily stores the given address data addr in its register group 1d, reads the size S of the local memory space from the main storage device 1b, and evaluates (addr ≥ S). If the expression does not hold, that is, if the given address data addr designates a location within the memory space of its own local memory 3i, the process proceeds to step ST21, where the instruction processing device 1a sets the parallelism mode in the main storage device 1b to 0 and ends the process. The parallelism of the multiprocessor system 1 is thereby kept at 8, and each processor PEi can read and write the desired data in its own local memory 3i by addressing it with the given address data addr.

[0076] Returning to step ST20, if the instruction processing device 1a determines that (addr ≥ S) holds, that is, that the given address data addr exceeds the size of the memory space of its own local memory 3i, the process proceeds to step ST22.

[0077] In step ST22, the instruction processing device 1a sets the data mode in the main storage device 1b to 1, lowering the parallelism of the multiprocessor system 1 to 4. The system configuration thereby changes from that of FIG. 1(a) to that of FIG. 1(b), and each processor PEi whose processor number pe is even can access the local memory 3i of the adjacent odd-numbered processor PEi as its own memory space and read and write the desired data using the given address data addr.
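Steps ST20 to ST22 amount to a single comparison followed by an assignment to mode. A minimal sketch, assuming a hypothetical function name update_mode and a hypothetical local memory size S = 1024 used only for the usage lines:

```python
def update_mode(addr: int, size_s: int) -> int:
    """Return the new value of the parallelism data mode.

    addr   : address data supplied to the processor
    size_s : size S of the processor's own local memory space
    """
    # Step ST20: does the address fall outside the local memory space?
    if addr >= size_s:
        return 1  # Step ST22: mode = 1, parallelism lowered to 4.
    return 0      # Step ST21: mode = 0, parallelism kept at 8.


if __name__ == "__main__":
    S = 1024  # hypothetical size, not taken from the source
    print(update_mode(100, S))   # address inside the local memory
    print(update_mode(1024, S))  # first address beyond the local memory
```

The sketch covers only the two outcomes the flow of FIG. 11 describes; lowering the parallelism further is not modelled here.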

[0078] As described in this embodiment, because the multiprocessor system 1 has 2ⁿ (= N) processors, the symmetry of the system configuration is preserved whether the parallelism is lowered to N/2 or to N/4, so control can be performed efficiently.
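The symmetry argument can be illustrated by listing which processors share memory at each parallelism. The sketch below assumes that each group consists of consecutive processor numbers, which matches the even/odd pairing at a parallelism of 4 and PE0 using local memories 31 to 33 at a parallelism of 2; the function name groups is hypothetical.

```python
def groups(n_processors: int, parallelism: int) -> list:
    """Partition processor numbers 0..N-1 into `parallelism` equal groups.

    Assumes N is a power of two and `parallelism` divides N, so every
    group has the same size N // parallelism.
    """
    size = n_processors // parallelism
    return [list(range(g * size, (g + 1) * size)) for g in range(parallelism)]


if __name__ == "__main__":
    for p in (8, 4, 2):
        print(p, groups(8, p))
```

Because N is a power of two, halving the parallelism always doubles the size of every group, so all groups stay the same size.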

【００７９】[0079]

[Effects of the Invention] According to the present invention as described above, the signal deriving means derives a bus switching signal based on the parallelism data and the specific data uniquely identifying the processor, and the bus selection means operates to connect either the second or the third bus to the first bus based on the derived bus switching signal. When the parallelism data causes the second bus to be connected to the first bus, the storage means of the processor is set to be accessible by the arithmetic means of the same processor; conversely, when the third bus is connected to the first bus, the storage means of the processor is set to be accessible by the arithmetic means of a processor other than that processor, via the bus switching means of the adjacent processor. Therefore, in the multiprocessor system according to the present invention, simply by setting the parallelism data to any desired value, the storage means accessed individually by each processor can be dynamically rearranged without increasing the number of program steps.

[0080] The above effect means that, although the system configuration is fixed, each processor can access the storage means of another processor through the bus switched by the bus switching means without performing inter-processor data transfer, so the address space of the local memory accessible to each processor can be set variably, arbitrarily and flexibly.

[0081] Furthermore, these effects make it possible to improve the processing speed and processing capability of each processor and, in turn, of the multiprocessor system itself.

[Brief description of the drawings]

FIGS. 1(a) to 1(c) are system configuration diagrams for explaining an outline of a multiprocessor system according to a first embodiment of the present invention.

FIGS. 2(a) and 2(b) are schematic block diagrams of a processor constituting a multiprocessor system according to one embodiment of the present invention.

FIG. 3 is a schematic diagram for explaining the connection state of the input/output buses of a memory management unit according to one embodiment of the present invention.

FIGS. 4(a) and 4(b) are block diagrams of the memory management unit shown in FIG. 3.

FIG. 5 is a flowchart showing the connection bus selection operation in a memory management unit according to one embodiment of the present invention.

FIGS. 6(a) and 6(b) are schematic diagrams showing the bus connection states in the memory management unit obtained according to the operation flow of FIG. 5.

FIG. 7 is a schematic diagram showing the bus connection state at a parallelism of 2, obtained according to the operation flow of FIG. 5.

FIG. 8 is a flowchart showing a memory select operation according to one embodiment of the present invention.

FIG. 9 is a schematic diagram showing an example of the memory select state at a parallelism of 2, obtained according to the operation flow of FIG. 8.

FIG. 10 is a schematic diagram for explaining a state in which only the data bus is switched in a memory management unit according to one embodiment of the present invention.

FIG. 11 is a flowchart showing the operation of variably setting the parallelism of the multiprocessor system according to one embodiment of the present invention.

FIG. 12 is a schematic configuration diagram of a conventional multiprocessor system in which a plurality of processors are connected in parallel.

FIG. 13 is a schematic configuration diagram of the processor shown in FIG. 12.

[Explanation of symbols]

1: multiprocessor system
PEi: processor (computing element)
1i: arithmetic unit
2i: memory management unit
3i: local memory
10: input data bus
20: output data bus
40: local memory bus
50: sequencer
51: memory
80: internal data bus
CH: bus switching signal
MS: memory select signal
pe: processor number
mode: parallelism
Addr3: upper 3 bits of the address
S: size of the local memory space
(i = 0, 1, 2, ..., 7)
In the drawings, the same reference numerals indicate the same or corresponding parts.

Claims

(57) [Claims]

1. A multiprocessor system having at least two processors connected in parallel via a same bus, each processor performing processing in parallel, wherein each of the processors comprises: arithmetic means connected to the same bus; storage means; bus switching means; a first bus connecting the storage means and, via the bus switching means, the bus switching means of one adjacent processor; a second bus connecting the arithmetic means and the bus switching means; and a third bus connecting the bus switching means and the bus switching means of the other adjacent processor; wherein the arithmetic means further includes signal deriving means for deriving a bus switching signal based on specific data uniquely identifying the processor and parallelism data stored in advance therein; the bus switching means further includes bus selection means for selecting either one of the second and third buses, based on the switching signal derived by the signal deriving means, and connecting it to the first bus; and the parallelism data is data specifying the number of processors that access the storage means in parallel.