JP2002342295A

JP2002342295A - Multiprocessor system

Info

Publication number: JP2002342295A
Application number: JP2001145741A
Authority: JP
Inventors: Hideyuki Shimonishi; 英之下西
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2001-05-16
Filing date: 2001-05-16
Publication date: 2002-11-29
Anticipated expiration: 2021-05-16
Also published as: JP3925105B2

Abstract

PROBLEM TO BE SOLVED: To overcome the difficulty of high-speed communication between processors when a multiprocessor system, especially an on-chip multiprocessor system that performs packet transfer processing in a network node device, is scaled up. SOLUTION: The multiprocessor system comprises a plurality of processors 1-1 to 1-4, each equipped with at least one arithmetic unit, and buses 1-8 to 1-10 that connect those arithmetic units. Rather than connecting all the processors to a single bus, a small number of processors is connected to each of a plurality of buses, so that processors on the same bus can exchange data rapidly between their arithmetic units without going through registers or memories. Processors connected to different buses exchange data via a plurality of buses and bridges 1-5 and 1-6, so that inter-processor communication remains fast even in a large-scale multiprocessor system.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a multiprocessor system, and more particularly to an inter-processor communication method in an on-chip multiprocessor system that performs packet transfer processing in a network node device.

[0002]

[Prior Art] Advances in semiconductor technology have made it possible to integrate many processors on a single semiconductor chip. One advantage of doing so is that a large bandwidth is available for communication between processors, which shortens the delay of inter-processor communication. Three main methods of communication between processors can be considered.

[0003] The first inter-processor communication method uses a shared storage area such as a shared register, a shared cache memory, or main memory. One processor writes data to the shared storage area, and another processor sharing that area reads the data, thereby communicating. This method is widely used in multiprocessor systems. Examples that use a shared memory are disclosed in Japanese Patent Application Laid-Open Nos. H2-244252, H6-35840, and H7-152647. An example that uses a shared register is disclosed in Japanese Patent Application Laid-Open No. H10-78880.

[0004] The second inter-processor communication method transmits and receives data directly between processors. For example, Japanese Patent Application No. H11-348545 describes a method in which the register files of the individual processors are interconnected by a bus and register data is copied between register files.

[0005] The third inter-processor communication method directly wires together the arithmetic units of the individual processors. In general, this is not an inter-processor communication method but a method for exchanging intermediate results among a plurality of arithmetic units inside a superscalar processor or a VLIW (Very Long Instruction Word) processor. In this method, the arithmetic units within a processor, or the arithmetic units of different processors, are directly connected in a mesh, so that data can be exchanged between arithmetic units at high speed.

[0006] When processors communicate through a shared storage area as in the first method, accesses to that storage area from the plurality of processors must be arbitrated; likewise, when processors or arithmetic units are directly connected as in the second and third methods, accesses from the plurality of processors to the bus used for the connection must be arbitrated. The first access arbitration method assumes that contention can occur at run time: when contention occurs, it selects one of the processors requesting access, grants access to that processor, and denies access to the others.

[0007] The second access arbitration method operates the processors at fixed timing and supplies them in advance with programs in which no contention occurs. In this method, the program code is reordered beforehand, for example by a compiler, so that the code given to the processors is free of contention; each processor then executes its code exactly at the timing assumed during reordering, so that no access contention arises at run time. However, the operation timing assumed during reordering must match the actual run-time timing exactly. To achieve this, the reordering must anticipate execution disturbances caused by branches, memory accesses, and the like, and the processor hardware must be designed so that no unpredictable execution disturbance occurs. The technique disclosed in Japanese Patent Application No. H11-348545 is an example of a processor that uses this method.
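
The compile-time side of this second arbitration method can be illustrated with a small check. The following is a minimal sketch, not the method of the cited application: it assumes each processor's reordered program is summarized as a timetable mapping a cycle number to the shared resource accessed in that cycle. The function name `find_conflicts`, the processor names, and the channel names are hypothetical.

```python
from collections import defaultdict

def find_conflicts(schedules):
    """Return (cycle, resource, processors) for every slot claimed by more than one processor.

    schedules maps a processor name to a dict {cycle: resource}, i.e. the
    fixed access timetable assumed when the code was reordered.
    """
    claims = defaultdict(list)
    for proc, timetable in schedules.items():
        for cycle, resource in timetable.items():
            claims[(cycle, resource)].append(proc)
    return sorted(
        (cycle, resource, sorted(procs))
        for (cycle, resource), procs in claims.items()
        if len(procs) > 1
    )

# Hypothetical timetables: P0 and P1 both want bus channel "ch0" in cycle 2.
schedules = {
    "P0": {0: "ch0", 2: "ch0"},
    "P1": {1: "ch0", 2: "ch0"},
    "P2": {2: "ch1"},
}
conflicts = find_conflicts(schedules)

# Delaying P1's second access by one cycle removes the conflict.
fixed = dict(schedules, P1={1: "ch0", 3: "ch0"})
```

A reordering pass would repeat this kind of check, inserting wait cycles until the conflict list is empty. The check is only meaningful if the hardware really executes at the assumed timing, which is the constraint the paragraph above describes.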

[0008]

[Problems to Be Solved by the Invention] The first problem is that the first and second inter-processor communication methods incur a large communication delay, in particular when one processor performs an operation and another processor performs a further operation on the result. In the first method, the data must first be written to a storage area that both the sending and the receiving processor can access, so a large delay occurs before data computed in an arithmetic unit of the sending processor reaches an arithmetic unit of the receiving processor.

[0009] In the second method as well, the sending processor must first write the operation result to its register file and then transfer it to the register file of the receiving processor, so a large delay again occurs between arithmetic units.

[0010] The second problem is that the third inter-processor communication method cannot scale to a large number of processors and arithmetic units. Because the output of every arithmetic unit must be connected to the input of every arithmetic unit, the amount of wiring grows with the square of the number of processors and arithmetic units, so only a small number of processors and arithmetic units can be connected.
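
The quadratic growth can be made concrete with a rough count. The sketch below is an illustration under stated assumptions, not a figure from the source: it counts one directed link per ordered pair of distinct units for the mesh, and for comparison counts bus attachments when each unit taps exactly one bus and neighbouring buses are joined in a chain by bridges. The function names and the choice of four units per bus are hypothetical, and attachments are not the same thing as physical wires (a bus may carry several channels).

```python
import math

def mesh_links(n_units):
    """Directed output-to-input links when every unit feeds every other unit."""
    return n_units * (n_units - 1)

def bus_taps(n_units, units_per_bus):
    """Bus attachments when each unit taps one bus and adjacent buses are bridged in a chain."""
    n_buses = math.ceil(n_units / units_per_bus)
    bridge_taps = 2 * (n_buses - 1)  # each bridge attaches to two buses
    return n_units + bridge_taps

for n in (4, 16, 64):
    print(n, mesh_links(n), bus_taps(n, units_per_bus=4))
```

With these assumptions, 16 units need 240 mesh links but 22 bus attachments, and 64 units need 4032 mesh links but 94 bus attachments.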

[0011] The third problem is that, in the first and second inter-processor communication methods, handing over processing from one processor to another takes time. In those methods, data in memory, data in registers, and arithmetic unit outputs can be passed between processors, but the contents of the status register and the program counter cannot. Therefore, when processing is handed over, the information in the status register must first be copied into memory or an ordinary register, and the processor that takes over must examine that memory or register and set up its own status register again.

[0012] The same applies to the program counter. When processing on one processor branches into several paths and each branch target hands over to another processor, the originating processor must first record in memory or an ordinary register which branch it was executing, and the processor that takes over must examine that memory or register and branch again accordingly.

[0013] The fourth problem is that the first access arbitration method is not suited to large-scale multiprocessing. In this method, access is granted to a single processor chosen from the access requests of all processors. As the number of processors grows, the delay of the arbitration decision increases, and the physical distance from the processors to the arbitration circuit also increases, so the delay between a processor issuing an access request and the start of communication becomes longer. As a result, the communication speed between processors falls as the number of processors increases.

[0014] The fifth problem is that the first access arbitration method is not suited to real-time processing. Because access arbitration is performed dynamically during execution, the processing time of each task cannot be guaranteed. This makes it difficult to implement processing that demands very strict real-time behavior, such as packet scheduling in packet transfer processing.

[0015] The sixth problem is that, with the second access arbitration method, program execution efficiency falls for some kinds of processing. If the same processing is simply repeated, fixing the times at which shared resources are accessed does not reduce execution efficiency. However, when a program contains many branches and performs different processing from case to case, the program code must be written so that accesses to shared resources do not conflict on any branch path, and the wasted access wait time grows as the number of branch paths increases.

[0016] Taking packet transfer processing as an example, packet scheduling repeats the same processing, so this method does not reduce its efficiency. In packet header processing that handles many kinds of packets, however, the processing to be executed differs with the packet type, and program execution efficiency may fall.

[0017] The present invention has been devised in view of the above problems. Its object is, in a multiprocessor system, particularly an on-chip multiprocessor system that performs packet transfer processing in a network node device, to speed up communication between processors even in a large-scale system with a very large number of processors, and to perform access arbitration that satisfies both program execution efficiency and real-time processing.

[0018]

[Means for Solving the Problems] To solve the first problem, the multiprocessor system according to the present invention has, in addition to the conventional inter-processor communication methods, a communication path for direct communication between the arithmetic units of the individual processors, so that data is exchanged between arithmetic units at high speed.

[0019] To solve the second problem, the arithmetic units are not all connected in a mesh but are connected by one or more buses. With a single bus, communication efficiency falls as the number of processors grows, because the bus becomes congested and physically longer. Therefore, when the number of processors is large, a plurality of short buses is provided instead of connecting all processors to one bus, and each processor is connected only to its nearest bus. To allow communication between processors on different buses, the buses are connected by bridges, which enables communication that passes through several buses.

[0020] To solve the third problem, the system has a communication path for direct communication between the status registers and the program counters of the individual processors.

[0021] To solve the fourth problem, access arbitration is not performed across all processors for all shared resources but only among the processors that share the same bus. The multiprocessor system has a plurality of buses, and each processor is connected only to its nearest bus. The processors sharing one bus are therefore close to one another and limited in number, so access arbitration among them can be performed at high speed.

[0022] When the number of processors is increased, the number of buses is increased rather than the number of processors connected to any one bus, so that access arbitration remains fast even in large-scale multiprocessing. For communication that passes through several buses, the data output by a processor is first received by the bridge on its route, and that bridge arbitrates for access to the next bus and outputs the data onto it.

[0023] To solve the fifth and sixth problems, each processor is configured so that it can select between an asynchronous mode of operation, corresponding to the first access arbitration method, and a synchronous mode of operation, corresponding to the second access arbitration method, in which processing that causes unpredictable execution disturbances is prohibited. For access control of shared resources, requests from processors using the first access arbitration method are arbitrated dynamically, while access requests from processors using the second access arbitration method are always granted.

[0024] That is, the program code for processors using the second access arbitration method is written so that their accesses do not conflict with one another. At run time, access requests from processors using the first access arbitration method are arbitrated dynamically, but only for resources that the processors using the second access arbitration method are not accessing.
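
The combined policy can be sketched as a per-cycle grant function. This is a minimal illustration under stated assumptions, not the circuit of the invention: requests from synchronous processors are granted unconditionally, and requests from asynchronous processors compete only for the resources left over. The fixed-priority order among asynchronous requesters, the function name `arbitrate`, and the processor and channel names are all hypothetical; the source does not say how the dynamic winner is chosen.

```python
def arbitrate(static_requests, dynamic_requests):
    """Grant resource accesses for one cycle.

    static_requests:  {processor: resource} from synchronous processors.
    dynamic_requests: list of (processor, resource) from asynchronous
                      processors, in priority order (an assumed policy).
    Returns {processor: resource} for every granted access.
    """
    grants = {}
    busy = set()
    for proc, resource in static_requests.items():
        # The program code is assumed to be conflict-free among static processors.
        if resource in busy:
            raise ValueError(f"static schedule conflict on {resource}")
        grants[proc] = resource
        busy.add(resource)
    for proc, resource in dynamic_requests:
        if resource not in busy:  # free of static use and of earlier dynamic winners
            grants[proc] = resource
            busy.add(resource)
    return grants

grants = arbitrate(
    static_requests={"sched": "ch0"},
    dynamic_requests=[("hdr0", "ch0"), ("hdr1", "ch1"), ("hdr2", "ch1")],
)
```

Here the synchronous processor `sched` always gets `ch0`, so `hdr0` is refused; `hdr1` wins `ch1` over `hdr2` by the assumed priority. The synchronous processor never waits, which is what preserves its timing guarantee.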

[0025] When packet transfer processing is implemented on this multiprocessor system, it is divided into a number of small tasks, and each task is statically assigned to a processor. The mode of operation of each processor is then chosen according to the characteristics of its task: a processor in charge of packet header processing, which involves varied operations, uses the first access arbitration method, while a processor in charge of packet scheduling, which has strong real-time requirements and consists mainly of simple repetition, uses the second access arbitration method. In this way both processing efficiency and real-time behavior are achieved.

[0026]

[Embodiments of the Invention] Embodiments of the present invention will now be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a first embodiment of the multiprocessor system according to the present invention. Referring to FIG. 1, the multiprocessor system of this embodiment comprises a plurality of processors 1-1 to 1-4, data buses 1-8 to 1-10 for exchanging data between the arithmetic units of the processors, bridges 1-5 to 1-6 for exchanging data between data buses, a shared memory 1-7, a memory bus 1-11 connecting each processor to the shared memory 1-7, and coprocessors 1-12 to 1-13.

[0027] The processor 1-1 comprises a register file 1-1-1 that holds data, one or more arithmetic units 1-1-2 to 1-1-3, and one or more memory units 1-1-4; the processors 1-2 to 1-3 have the same configuration. Each processor also has the mechanisms that any general processor requires, such as a program counter and a status register, but these are omitted from the figure because they are not directly related to the inter-processor communication operation of this embodiment.

[0028] Each of the data buses 1-8 to 1-10 consists of a plurality of channels so that several items of data can be carried on the bus at the same time, and each channel consists of a communication path that carries the data and its destination. The memory bus 1-11 consists of a communication path that carries addresses and data. The coprocessors 1-12 and 1-13 are mechanisms, such as a floating-point arithmetic unit or a character-string copy unit, that assist the operation of the (parent) processors 1-1 to 1-2 and 1-3 to 1-4, respectively; they carry out their processing by communicating with the processors and the memory over a data bus or the memory bus.

[0029] FIG. 2 is a block diagram showing the structure of the arithmetic unit 1-1-2 in this embodiment. The arithmetic unit 1-1-2 comprises an input circuit 1-1-9 that controls data input from the data bus to the input register, an input register 1-1-5 that temporarily holds data taken from the data bus, selection circuits 1-1-6 to 1-1-7 that select whether the data supplied to the arithmetic circuit comes from the input register or from the register file, an arithmetic circuit 1-1-8, and an output circuit 1-1-10 that controls data output from the arithmetic circuit to the data bus. All the other arithmetic units have the same structure.

[0030] Next, the operation of this embodiment will be described with reference to FIGS. 1 and 2. The memory access operation of each processor in this embodiment is the same as that of a general multiprocessor system, as shown for example in Japanese Patent Application Laid-Open Nos. H2-244252, H6-35840, and H7-152647. Accordingly, when processors communicate through memory, the multiprocessor system of this embodiment operates in the same way as a general multiprocessor system.

[0031] The multiprocessor system of this embodiment has data buses that directly connect the arithmetic units. The inter-processor communication method using the data bus is described next. When a processor executes an ordinary arithmetic instruction, the result output by the arithmetic circuit is sent to the register file. When it executes an arithmetic instruction that involves data output, the result is sent to the register file and, at the same time, output to the data bus through the output circuit. The output circuit selects the data bus channel to use as directed by the program, and outputs the data to the data bus together with the destination processor number or coprocessor number specified by the program.
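
The output path described above can be sketched as follows. This is a simplified software model under stated assumptions, not the hardware of the embodiment: a bus channel is modelled as a slot holding a (destination, value) pair, and an instruction with data output is modelled as an `add` call with an optional `out` argument. The class names, the method signature, and the register and processor names are hypothetical.

```python
class DataBus:
    """A bus with several channels; each slot holds (destination, value) or None."""
    def __init__(self, n_channels):
        self.channels = [None] * n_channels

class ArithmeticUnit:
    def __init__(self, registers, bus):
        self.registers = registers  # stands in for the register file
        self.bus = bus

    def add(self, dst, a, b, out=None):
        """dst <- a + b; if out=(channel, destination), also drive the data bus."""
        result = self.registers[a] + self.registers[b]
        self.registers[dst] = result  # the result always goes to the register file
        if out is not None:
            channel, destination = out  # both are chosen by the program
            self.bus.channels[channel] = (destination, result)
        return result

bus = DataBus(n_channels=2)
alu = ArithmeticUnit({"r1": 3, "r2": 4, "r3": 0}, bus)

alu.add("r3", "r1", "r2")                  # ordinary instruction: bus untouched
quiet = list(bus.channels)
alu.add("r3", "r1", "r2", out=(1, "P2"))   # instruction with data output
```

After the second call, channel 1 carries the pair `("P2", 7)` while register `r3` also holds 7, reflecting the simultaneous write to the register file and the bus.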

[0032] Each bridge examines the destination of the data flowing on a data bus. If it determines that the data bus on which the data is flowing is the same as the data bus to which the destination is connected, it does not transfer the data. Otherwise, it transfers the data to the data bus that lies on the route to the destination.
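
The bridge's decision can be sketched as a table lookup. The following is a minimal illustration under stated assumptions: the source does not say which processors sit on which bus, so the chain topology below (`bus8`, `bus9`, `bus10` joined by two bridges, with `P1` on `bus8`, `P2` and `P3` on `bus9`, and `P4` on `bus10`) and the per-bridge routing table `reachable_via` are hypothetical.

```python
class Bridge:
    def __init__(self, side_a, side_b, reachable_via):
        """reachable_via maps each destination to the bus (side_a or side_b)
        on this bridge that leads toward it."""
        self.sides = {side_a, side_b}
        self.reachable_via = reachable_via

    def forward(self, seen_on, destination):
        """Return the bus to copy the data to, or None to leave it alone."""
        if seen_on not in self.sides:
            raise ValueError("bridge is not attached to that bus")
        target = self.reachable_via[destination]
        return None if target == seen_on else target

# Assumed chain: bus8 -- bridge5 -- bus9 -- bridge6 -- bus10
bridge5 = Bridge("bus8", "bus9",
                 {"P1": "bus8", "P2": "bus9", "P3": "bus9", "P4": "bus9"})
bridge6 = Bridge("bus9", "bus10",
                 {"P1": "bus9", "P2": "bus9", "P3": "bus9", "P4": "bus10"})

# Data from P1 to P4 starts on bus8 and crosses both bridges.
hop1 = bridge5.forward("bus8", "P4")
hop2 = bridge6.forward(hop1, "P4")
```

Because a bridge returns `None` when the destination is reachable on the bus where the data was seen, data for `P2` on `bus9` is ignored by both bridges, and `bridge5` does not echo the `P4` data back onto `bus8`.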

[0033] The input circuit of each arithmetic unit monitors every channel of the data bus to which it is connected. When it determines that the destination of the data flowing on a channel is its own processor, it takes in the data, stores it in the input register, and then sets the input register to the valid state. When the arithmetic unit executes an ordinary operation, it obtains its data from the register file; when it executes an operation that involves data input, it obtains the data from the input register. If the input register is in the invalid state when its data is to be used, the operation is suspended until the input register changes to the valid state. When the input register becomes valid, the operation resumes, and when the operation finishes, the input register is set to the invalid state.
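
The valid/invalid handshake can be sketched as a small state machine. This is a simplified model under stated assumptions, not the circuit of FIG. 2: consuming the operand and invalidating the register are collapsed into a single step, whereas the text invalidates the register when the operation finishes. The class and method names, the processor identifiers, and the bus traffic are hypothetical.

```python
class InputRegister:
    def __init__(self):
        self.valid = False
        self.value = None

    def snoop(self, my_id, destination, value):
        """Input circuit: latch bus data addressed to this processor."""
        if destination == my_id:
            self.value = value
            self.valid = True

    def try_consume(self):
        """Return (True, value) and invalidate, or (False, None) to signal a stall."""
        if not self.valid:
            return (False, None)
        value, self.value, self.valid = self.value, None, False
        return (True, value)

# One entry per cycle on a single channel: (destination, value) or None.
bus_traffic = [None, ("P9", 5), ("P2", 42), None]

reg = InputRegister()
trace = []
for item in bus_traffic:
    if item is not None:
        reg.snoop("P2", *item)
    ok, value = reg.try_consume()
    trace.append(value if ok else "stall")
```

In this run processor `P2` stalls for two cycles, consumes 42 in the third, and stalls again in the fourth because the register has returned to the invalid state.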

[0034] This input register mechanism is needed to synchronize processing between processors: it prevents the receiving processor from starting an operation before the data has arrived. When every processor operates at fixed timing, as described in Japanese Patent Application No. H11-348545, such a synchronization mechanism is unnecessary.

[0035] When a processor sends data to a coprocessor in order to delegate processing, the processor outputs the data to the data bus in the same way as described above, and the coprocessor receives the data from the data bus and performs the prescribed processing. The result is then sent back over the data bus to the processor that requested the processing. For processing in which the coprocessor itself needs to access memory, such as transferring a character string in memory, the coprocessor accesses memory directly over the memory bus.

[0036] FIG. 3 is a block diagram showing the configuration of a second embodiment of the multiprocessor system according to the present invention. Referring to FIG. 3, the multiprocessor system of this embodiment comprises a plurality of processors 2-1 to 2-4, data buses 2-8 to 2-10 for exchanging data between the arithmetic units and between the register files of the processors, bridges 2-5 to 2-6 for exchanging data between data buses, a shared memory 2-7, and coprocessors 2-11 to 2-12.

[0037] The processor 2-1 comprises a register file 2-1-1 that holds data and one or more arithmetic units 2-1-2 to 2-1-3; the processors 2-2 to 2-3 have the same configuration. Each of the data buses 2-8 to 2-10 consists of a plurality of channels so that several items of data can be carried on the bus at the same time, and each channel consists of a communication path that carries the data and its destination. This embodiment has no memory bus; the shared memory 2-7 is connected using two channels of the data bus.

[0038] The structure of each arithmetic unit is the same as in the first embodiment, so its block diagram is omitted here.

[0039] Next, the operation of this embodiment will be described with reference to FIG. 3. Because the shared memory is connected through the data bus, the memory access operation differs from that of the first embodiment. To write to memory, the write address is set in one register of the register file and the write data is set in another register. The contents of the address register are then output to one channel of the data bus with the destination set to the address input of the shared memory, and the contents of the data register are output to another channel of the data bus with the destination set to the data input/output of the shared memory.

[0040] When the shared memory receives the address and the data from the data bus, it performs the write. To read from the shared memory, the read address is set in one register of the register file, and the contents of that register are output to one channel of the data bus with the destination set to the address input of the shared memory. The shared memory receives the read address, reads the data, sets the destination to the requesting processor, and outputs the data to one channel of the data bus. When the processor receives this data, it stores it in the designated register.

[0041] Next, the inter-processor communication method using the data bus will be described. In this embodiment too, the arithmetic units are connected by the data bus, so arithmetic units can communicate directly as in the first embodiment; the operation of the arithmetic units and bridges in that case is the same as in the first embodiment. In addition, because the register files are also connected to the data bus in this embodiment, register data can be output from the register file of one processor onto the data bus and transferred directly to a register in the register file of another processor. In this case, the data of several registers can also be sent and received simultaneously using several channels.

【００４２】また、本実施例においては、演算器から出
力したデータをレジスタファイルで受けたり、レジスタ
ファイルから出力したデータを演算器で受けることも可
能である。さらには、メモリアクセス時において、アド
レス送信の際やデータ送受信の際にレジスタファイルで
はなく、演算器の入出力を用いることも可能である。例
えば、メモリへデータを書き込み際、演算器を用いて計
算した書き込みアドレスを直接データバスに出力し、書
き込みデータはレジスタファイルから出力することが可
能である。In this embodiment, it is also possible to receive the data output from the arithmetic unit by the register file, or to receive the data output from the register file by the arithmetic unit. Further, at the time of memory access, it is also possible to use not the register file but the input / output of the arithmetic unit at the time of address transmission or data transmission / reception. For example, when writing data to a memory, it is possible to directly output a write address calculated using an arithmetic unit to a data bus and output write data from a register file.

【００４３】本実施例において、レジスタファイルを用
いて通信を行う場合、第一の実施例における入力レジス
タと同様の構成を用いてレジスタファイル内の各レジス
タに有効状態／無効状態を持たせることで、プロセッサ
間の同期を取ることが可能である。また、各プロセッサ
が固定タイミングで同期動作を行っている場合、このよ
うな同期のための機構は不必要である。In this embodiment, when communication is performed using the register file, the processors can be synchronized by giving each register in the register file a valid/invalid state, using the same configuration as the input register in the first embodiment. When each processor operates synchronously at fixed timing, such a synchronization mechanism is unnecessary.

【００４４】レジスタファイルや演算器、共有メモリ、
コプロセッサがデータバス上の全てのチャネルに接続さ
れている必要は無く、図３のように特定のチャネルに接
続することも可能である。この場合、通信の自由度は低
下するが、ハードウエア量を低減する効果がある。これ
は以下の実施例においても同様である。The register files, arithmetic units, shared memory, and coprocessors need not be connected to every channel of the data bus; they may be connected to specific channels as shown in FIG. 3. This reduces the freedom of communication but also reduces the amount of hardware. The same applies to the following embodiments.

【００４５】本実施例におけるプロセッサとコプロセッ
サの間の通信は第一の実施例と同一であるが、コプロセ
ッサとメモリ間の通信は以上で説明したようにデータバ
スを用いて行う。The communication between the processor and the coprocessor in this embodiment is the same as that in the first embodiment, but the communication between the coprocessor and the memory is performed using the data bus as described above.

【００４６】図４は本発明によるマルチプロセッサシス
テムの第三の実施例の構成を示すブロック図である。図
４を参照すると、本実施例におけるマルチプロセッサシ
ステムは、複数のプロセッサ３−１〜３−４、複数のプ
ロセッサの演算器及びレジスタファイル、プログラムカ
ウンタ、状態レジスタの間でデータを交換するデータバ
ス３−８〜３−１０、データバス間でデータを交換する
ブリッジ３−５〜３−６、共有メモリ３−７で構成され
る。FIG. 4 is a block diagram showing the configuration of a third embodiment of the multiprocessor system according to the present invention. Referring to FIG. 4, the multiprocessor system of this embodiment comprises a plurality of processors 3-1 to 3-4; data buses 3-8 to 3-10 for exchanging data among the arithmetic units, register files, program counters, and status registers of the processors; bridges 3-5 to 3-6 for exchanging data between the data buses; and a shared memory 3-7.

【００４７】プロセッサ３−１はデータを保持するレジ
スタファイル３−１−１と、１または複数の演算器３−
１−２〜３−１−３、プログラムカウンタ３−１−４、
状態レジスタ３−１−５で構成され、プロセッサ３−２
〜３−３も同様の構成である。The processor 3-1 comprises a register file 3-1-1 for holding data, one or more arithmetic units 3-1-2 to 3-1-3, a program counter 3-1-4, and a status register 3-1-5. The processors 3-2 to 3-3 have the same configuration.

【００４８】データバス１−８〜１−１０は同時に複数
のデータをバス内で通信することが出来るように複数の
チャネルで構成し、各チャネルはデータ及びその宛先を
転送するための通信路で構成される。また、本実施例は
メモリバスを持たず、共有メモリ２−７はデータバス内
の２本以上のチャネルを用いて接続される。本実施例に
おいても、各演算器の構成は第一の実施例における演算
器の構成と同様であるので、その構成図は省略する。本
実施例の特徴は、プログラムカウンタ３−１−４、状態
レジスタ３−１−５がデータバスに直接接続されている
点である。The data buses 1-8 to 1-10 each consist of a plurality of channels so that several data items can be communicated on the bus simultaneously, and each channel consists of a communication path for transferring data together with its destination. This embodiment has no memory bus; the shared memory 2-7 is connected using two or more channels of the data bus. In this embodiment as well, the configuration of each arithmetic unit is the same as in the first embodiment, so its configuration diagram is omitted. The feature of this embodiment is that the program counter 3-1-4 and the status register 3-1-5 are connected directly to the data bus.

【００４９】次に、図４を参照して本実施例の動作を説
明する。本実施例におけるメモリアクセス動作及びプロ
セッサ間通信動作は第二の実施例と同様であるため、こ
こでは省略する。本実施例においては、プロセッサ間で
プログラムカウンタ及び状態レジスタの内容を送受信す
ることが可能であり、プロセッサ間での処理の引継ぎを
効率的に行うことが出来る。すなわち、プロセッサ間で
処理を引き継ぐ際、引継ぎ元のプロセッサがレジスタフ
ァイル上のレジスタの内容だけでなくプログラムカウン
タ及び状態レジスタの内容をデータバスを用いて引継ぎ
先のプロセッサに送ることが可能であり、引継ぎ先のプ
ロセッサでは引継ぎ元のプロセッサの最終的な状態と全
く同じ状態で引き継いだ処理を再開することができる。Next, the operation of this embodiment will be described with reference to FIG. 4. The memory access operation and the inter-processor communication operation in this embodiment are the same as in the second embodiment and are not described again. In this embodiment, the contents of the program counter and the status register can be sent and received between processors, so processing can be handed over between processors efficiently. That is, when processing is handed over, the source processor can send not only the contents of the registers in its register file but also the contents of its program counter and status register to the destination processor over the data bus, and the destination processor can resume the handed-over processing in exactly the same state as the final state of the source processor.

【００５０】各プロセッサが異なる処理を行うために異
なるプログラムメモリの内容を持っている場合、引継ぎ
元とプロセッサと引継ぎ先のプロセッサの間にアドレス
の整合性が無いため、単純にプログラムカウンタを引き
渡すことはできない。このような場合、引継ぎ元のプロ
セッサにおいて引継ぎ先のプロセッサの実行開始アドレ
スを計算してレジスタファイル上のレジスタに格納して
おき、処理の引継ぎ時にはこのレジスタの内容を引き継
ぎ先のプロセッサのプログラムカウンタへと送信すれば
良い。When the processors hold different program memory contents in order to perform different processing, addresses are not consistent between the source processor and the destination processor, so the program counter cannot simply be passed on. In that case, the source processor calculates the execution start address for the destination processor and stores it in a register of the register file; at handover, the contents of this register are sent to the program counter of the destination processor.
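
The handover described in the two paragraphs above can be illustrated with a short Python sketch. The `Processor` class, the four-register file, the message tuples, and the `entry_reg` parameter are hypothetical names introduced only for illustration; the patent does not define a message format.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    name: str
    regs: list = field(default_factory=lambda: [0] * 4)  # assumed 4-entry register file
    pc: int = 0
    status: int = 0


def hand_over(src, dst, entry_reg=None):
    """Send src's context to dst and return the number of bus messages used.

    If entry_reg is None, src's program counter is copied as-is (identical
    program memories). Otherwise dst's program counter is loaded from the
    given register of src, which is assumed to hold a precomputed start
    address valid in dst's program memory.
    """
    # Each message is (target kind, index, value), modelling "data + destination".
    messages = [("reg", i, value) for i, value in enumerate(src.regs)]
    messages.append(("status", None, src.status))
    new_pc = src.pc if entry_reg is None else src.regs[entry_reg]
    messages.append(("pc", None, new_pc))

    for kind, index, value in messages:
        if kind == "reg":
            dst.regs[index] = value
        elif kind == "status":
            dst.status = value
        else:
            dst.pc = value
    return len(messages)
```

With `entry_reg` left unset the destination resumes at the source's own program counter; passing a register index models the case where the two program memories differ.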

【００５１】図５は本発明によるマルチプロセッサシス
テムの第四の実施例の構成を示すブロック図である。図
５を参照すると、本実施例におけるマルチプロセッサシ
ステムは、第二の実施例におけるマルチプロセッサシス
テムに、コプロセッサ４−１５〜４−１６、プロセッサ
４−１〜４−２及びコプロセッサ４−１５のデータバス
アクセスを調停するアービタ４−１１、プロセッサ４−
１〜４−２及びコプロセッサ４−１６及び共有メモリ４
−７のデータバスアクセスを調停するアービタ４−１
２、ブリッジ４−５のデータ転送動作を記述したタイム
テーブル４−１３、ブリッジ４−６のデータ転送動作を
記述したタイムテーブル４−１４を加えた構成である。FIG. 5 is a block diagram showing the configuration of a fourth embodiment of the multiprocessor system according to the present invention. Referring to FIG. 5, the multiprocessor system of this embodiment is the multiprocessor system of the second embodiment with the following added: coprocessors 4-15 to 4-16; an arbiter 4-11 that arbitrates data bus access by the processors 4-1 to 4-2 and the coprocessor 4-15; an arbiter 4-12 that arbitrates data bus access by the processors 4-1 to 4-2, the coprocessor 4-16, and the shared memory 4-7; a time table 4-13 describing the data transfer operation of the bridge 4-5; and a time table 4-14 describing the data transfer operation of the bridge 4-6.

【００５２】また、プロセッサ４−１は動作選択レジス
タ４−１−４を持ち、プロセッサ４−２〜４−４におい
ても同様である。各演算器の構成は第一の実施例におけ
る演算器の構成と同様であるので、ここではその構成図
を省略する。The processor 4-1 has an operation selection register 4-1-4, and the same applies to the processors 4-2 to 4-4. Since the configuration of each arithmetic unit is the same as the configuration of the arithmetic unit in the first embodiment, the configuration diagram is omitted here.

【００５３】本実施例においては、各プロセッサは動作
選択レジスタによって、ダイナミックスケジューリング
動作かスタティックスケジューリング動作かを選択す
る。スタティックスケジューリング動作とはプロセッサ
が固定タイミングで動作すことであり、プロセッサに対
して予測不能な実行の乱れを生じる処理を禁止し、各プ
ロセッサを予め設定した固定長の周期に従って繰り返し
処理を行なわせる動作のことである。そのため、プロセ
ッサがどの時刻にどのようなデータバスアクセスを行う
かが予測可能であり、プログラムコードを最適に並び替
える事で各プロセッサからのアクセスが競合しないよう
に調整しておくことが可能である。In this embodiment, each processor selects either a dynamic scheduling operation or a static scheduling operation by means of its operation selection register. In the static scheduling operation the processor runs at fixed timing: processing that causes unpredictable disturbances in execution is prohibited, and each processor repeats its processing according to a preset fixed-length cycle. It is therefore predictable which data bus access each processor performs at which time, and the program code can be reordered in advance so that accesses from different processors do not conflict.

【００５４】一方、ダイナミックスケジューリング動作
では各プロセッサは固定長の動作周期を持たず、予測不
能な実行の乱れを生じる処理も禁止されない。そのた
め、プロセッサのデータバスアクセス動作を予め予測す
ることは不可能でり、アクセス調停はプログラム実行時
に動的に行う必要がある。On the other hand, in the dynamic scheduling operation, each processor does not have a fixed-length operation cycle, and processing that causes unpredictable execution disturbance is not prohibited. For this reason, it is impossible to predict the data bus access operation of the processor in advance, and access arbitration must be performed dynamically at the time of program execution.

【００５５】図６はタイムテーブル４−１３の構成例で
ある。タイムテーブルは、ブリッジに接続された各デー
タバスの各チャネルから受信したデータの転送の可否を
定義するのもであり、受信データの宛先によって転送の
可否を時刻毎に定義する。図６の例では、時刻０におい
てはデータバス４−８のチャネル１から入力したデータ
をプロセッサ１に向けて出力することを許可し、時刻１
においてはデータバス４−８のチャネルｎから入力した
データを共有メモリに向けて出力することを許可し、時
刻２においてはデータバス４−９のチャネル１から入力
したデータを共有メモリに向けて出力することを許可
し、その他のブリッジにおける転送処理は全て不許可で
あることを示している。FIG. 6 shows a configuration example of the time table 4-13. The time table defines whether data received from each channel of each data bus connected to the bridge may be forwarded; permission is defined for each time slot according to the destination of the received data. In the example of FIG. 6, at time 0 data input from channel 1 of the data bus 4-8 may be output toward processor 1; at time 1 data input from channel n of the data bus 4-8 may be output toward the shared memory; at time 2 data input from channel 1 of the data bus 4-9 may be output toward the shared memory; and all other forwarding by the bridge is prohibited.

【００５６】なお、ブリッジとスタティックスケジュー
リング動作を行うプロセッサは同期して動作し、ブリッ
ジは周期的にタイムテーブルを繰り返し参照して転送処
理を行う。The bridge and the processor performing the static scheduling operation operate in synchronization with each other, and the bridge periodically performs the transfer process by repeatedly referring to the time table.
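
As a rough illustration of the time table, the FIG. 6 example can be written as a lookup keyed by time slot, input bus, and input channel. The dictionary layout, the string labels, the function name `may_forward`, and the table period of 3 slots are assumptions made for this sketch; the patent gives only the three permitted entries and states that the table is consulted cyclically.

```python
# (time slot, input data bus, input channel) -> destination allowed in that slot.
# Entries follow the FIG. 6 example; anything not listed is not forwarded.
TIME_TABLE = {
    (0, "4-8", 1): "processor 1",
    (1, "4-8", "n"): "shared memory",
    (2, "4-9", 1): "shared memory",
}
PERIOD = 3  # assumed table length; not stated in the text


def may_forward(cycle, bus, channel, destination):
    """Return True if the bridge may forward this data in the given cycle."""
    slot = cycle % PERIOD  # the bridge walks the table repeatedly
    return TIME_TABLE.get((slot, bus, channel)) == destination
```

Because the lookup depends only on the cycle count, a statically scheduled processor that runs in lockstep with the bridge can know at code-generation time which of its transfers will be forwarded.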

【００５７】次に、図５及び図６を参照して本実施例の
動作を説明する。本実施例におけるプロセッサ間通信動
作は第二の実施例と同様であり、以下では、特にデータ
バスに対するアクセス調停の動作を詳しく説明する。ま
ず、同一データバス内での通信におけるアクセス調停動
作を説明する。プロセッサはデータバスにデータを出力
する際、宛先と使用するチャネルをアービタに伝えてチ
ャネルの使用許可を要求し、その許可を得てからデータ
の出力を行う。もし使用許可が与えられなければデータ
出力を中止し、使用許可が与えられるまで要求を繰り返
し行う。アービタは各プロセッサからの使用許可要求を
受け、それぞれのチャネルに関してデータ出力を許可す
るプロセッサを決定し、その結果を各プロセッサに伝え
る。Next, the operation of this embodiment will be described with reference to FIGS. 5 and 6. The inter-processor communication operation in this embodiment is the same as in the second embodiment; the following describes in detail the arbitration of access to the data bus. First, access arbitration for communication within a single data bus is described. Before outputting data to the data bus, a processor tells the arbiter the destination and the channel it intends to use, requests permission to use the channel, and outputs the data only after permission is granted. If permission is not granted, the processor withholds the output and repeats the request until permission is granted. The arbiter receives the requests from the processors, decides for each channel which processor may output data, and reports the result to each processor.

【００５８】スタティックスケジューリング動作のプロ
セッサではプログラムコードを実行前に適切に並べ替え
ておくことによって、データバスのアクセス競合を回避
するが、ダイナミックスケジューリング動作のプロセッ
サでは、プログラム実行中に動的にアクセス調停を行う
必要がある。そのため、アービタでは、複数のプロセッ
サが同じチャネルの使用許可を要求した場合、まず、ス
タティックスケジューリング動作のプロセッサに対して
使用許可を与え、もしスタティックスケジューリング動
作のプロセッサからの要求が無い場合には、その他のプ
ロセッサの中から１つのプロセッサを選択して使用許可
を与える。A processor in the static scheduling operation avoids data bus access conflicts by having its program code suitably reordered before execution, whereas for processors in the dynamic scheduling operation access must be arbitrated dynamically during program execution. Therefore, when several processors request the same channel, the arbiter first grants permission to a processor in the static scheduling operation; if no such processor has made a request, it selects one of the other processors and grants permission to it.
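
The per-channel grant rule can be sketched as follows. The function name `arbitrate`, the request tuples, and the tie-break among dynamically scheduled processors (first request in list order wins) are assumptions; the text only says that one of the other processors is selected. The sketch also assumes that statically scheduled processors never collide with each other, since their code is arranged in advance.

```python
def arbitrate(requests, static_processors):
    """Decide one winner per channel.

    requests: list of (processor, channel) pairs for the current cycle.
    static_processors: set of processors in the static scheduling operation.
    Returns a dict mapping channel -> granted processor.
    """
    grants = {}
    for processor, channel in requests:
        current = grants.get(channel)
        if current is None:
            grants[channel] = processor          # first requester so far
        elif processor in static_processors and current not in static_processors:
            grants[channel] = processor          # static request displaces a dynamic one
    return grants
```

Processors that do not appear in the returned grants are expected to withhold their output and repeat the request in a later cycle.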

【００５９】コプロセッサや共有メモリからのデータ出
力動作に関しても同様である。ただし、アービタでは、
コプロセッサや共有メモリからダイナミックスケジュー
リング動作を行っているプロセッサに対してデータを出
力する場合、このデータ出力がスタティックスケジュー
リング動作を行っているプロセッサからのデータ出力を
妨げないように、コプロセッサや共有メモリからのチャ
ネル使用要求よりもスタティックスケジューリング動作
を行っているプロセッサからのチャネル使用要求を優先
させる。The same applies to data output from the coprocessors and the shared memory. However, when a coprocessor or the shared memory outputs data to a processor performing the dynamic scheduling operation, the arbiter gives channel use requests from processors performing the static scheduling operation priority over the channel use request from the coprocessor or shared memory, so that this output does not obstruct data output from the statically scheduled processors.

【００６０】一方、コプロセッサや共有メモリからスタ
ティックスケジューリング動作を行っているプロセッサ
に対してデータを出力する場合には、このデータ出力を
ダイナミックスタティックスケジューリング動作を行っ
ているプロセッサからのチャネル使用要求よりも優先さ
せる。Conversely, when a coprocessor or the shared memory outputs data to a processor performing the static scheduling operation, this data output is given priority over channel use requests from processors performing the dynamic scheduling operation.

【００６１】次に、ブリッジを越える通信におけるアク
セス調停動作を説明する。この場合でも、プロセッサの
動作は上記と同様であり、アービタの動作が異なる。ア
ービタは他のデータバスに向かうチャネル使用許可要求
があれば、ブリッジに対して現時刻で転送可能な宛先を
問い合わせ、ブリッジはアービタが接続されているバス
の該当するチャネルに関してタイムテーブルを参照して
転送可能な宛先を回答する。そして、アービタは回答さ
れた宛先と同じ宛先をもつチャネル使用許可要求があれ
ばこの要求に対して許可を与え、さもなければブリッジ
を越える通信は全て不許可としてデータバス内での通信
に対してチャネル使用許可を与える。Next, access arbitration for communication that crosses a bridge is described. The operation of the processors is the same as above; only the operation of the arbiter differs. When there is a channel use request directed to another data bus, the arbiter asks the bridge which destinations can be forwarded at the current time, and the bridge consults its time table for the corresponding channel of the bus to which the arbiter is attached and answers with the forwardable destination. If there is a channel use request whose destination matches the answer, the arbiter grants that request; otherwise it denies all communication across the bridge and grants channel use to communication within the data bus.

【００６２】ただし、スタティックスケジューリング動
作を行っているプロセッサの通信は常に優先し、これら
のプロセッサからのチャネル使用許可要求があれば、ダ
イナミックスケジューリング動作を行っているプロセッ
サのチャネル使用許可要求はブリッジを経由するしない
に係わり無く、全て不許可とする。However, communication by processors performing the static scheduling operation always has priority: if any of these processors requests a channel, all channel use requests from processors performing the dynamic scheduling operation are denied, whether or not they pass through a bridge.
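
The arbitration order for a single channel, combining the bridge query with the static-priority rule, might look like the sketch below. All names (`arbitrate_channel`, `local_destinations`, `bridge_allowed`) are hypothetical, and the bridge's answer is passed in as a plain value rather than modelled as a separate query. Requests from statically scheduled processors are granted without consulting the time table, on the assumption that the table was configured to match their schedule.

```python
def arbitrate_channel(requests, static_processors, local_destinations, bridge_allowed):
    """Pick the request to grant on one channel, or None.

    requests: list of (processor, destination) pairs.
    static_processors: set of processors in the static scheduling operation.
    local_destinations: set of destinations reachable on this data bus.
    bridge_allowed: destination the bridge reports as forwardable now, or None.
    """
    # 1. Statically scheduled processors always win.
    for request in requests:
        if request[0] in static_processors:
            return request
    # 2. A bridge-crossing request is granted only if the time table allows it.
    for request in requests:
        if request[1] not in local_destinations and request[1] == bridge_allowed:
            return request
    # 3. Otherwise deny bridge-crossing traffic and grant a local request.
    for request in requests:
        if request[1] in local_destinations:
            return request
    return None
```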

【００６３】ブリッジでは、ブリッジを越える通信のデ
ータを受信すると、このデータの宛先への経路となるデ
ータバスに対してデータを出力するため、このデータバ
スを管轄するアービタに対してチャネル使用許可を求
め、そしてこのデータバスへデータを出力する。アービ
タでは、ブリッジからのチャネル使用要求は必ず許可
し、ブリッジにおいてデータが滞留することが無いよう
にする。そのため、各ブリッジのタイムテーブルではブ
リッジ間でデータバスを競合することが無い様に設定し
ておく必要があり、またスタティックスケジューリング
動作を行うプロセッサではブリッジ間の通信を妨げるこ
とが無い様に、プログラムコードを作成しておく必要が
ある。When a bridge receives data for communication across the bridge, it must output the data to the data bus on the route to the data's destination, so it requests channel use permission from the arbiter responsible for that data bus and then outputs the data to the bus. The arbiter always grants channel use requests from a bridge so that data never stalls in the bridge. The time tables of the bridges must therefore be set so that bridges do not contend for a data bus, and the program code for processors performing the static scheduling operation must be written so that it does not obstruct communication between bridges.

【００６４】以上の様に、プロセッサが最寄のアービタ
からチャネル使用許可が得られると、その後の通信経路
ではタイムテーブルの設定によって必ずチャネルが予約
されているため、データの転送が滞ることなく宛先まで
到着する。そのため、スタティックスケジューリングを
行うプロセッサでは、接続されたデータバスだけではな
く、通信途中の全てのデータバスにおいてデータが競合
しなようにプログラムを作成し、その結果に従ってタイ
ムテーブルを設定する必要がある。As described above, once a processor obtains channel use permission from its nearest arbiter, the channels on the rest of the communication path are guaranteed to be reserved by the time table settings, so the data reaches its destination without stalling. For processors performing static scheduling, the program must therefore be written so that data does not conflict on any data bus along the communication path, not only on the directly connected data bus, and the time tables must be set according to the result.

【００６５】一方、ダイナミックスケジューリング動作
を行うプロセッサに関しては、アービタによってアクセ
ス制御が行われているため、プログラムコードを自由に
作成することが可能である。ただし、ブリッジを超える
通信を行うためには、スタティックスケジューリング動
作を行うプロセッサの通信を遮らないように、タイムテ
ーブルの適切な時刻に通信経路を設定しておき、プロセ
ッサがこの通信を行う場合には、設定した時刻になるま
で通信を待つ必要がある。ただし、タイムテーブル余裕
があれば複数の時刻に通信経路を設定することで、通信
を待つ時間を軽減することが可能である。For processors performing the dynamic scheduling operation, on the other hand, access is controlled by the arbiter, so program code can be written freely. To communicate across a bridge, however, a communication path must be set at a suitable time in the time table so that it does not interrupt communication by statically scheduled processors, and a processor wishing to use this path must wait until the set time. If the time table has spare slots, setting the communication path at several times reduces the waiting time.

【００６６】なお、データバス４−９のように、ブリッ
ジしか接続されていないデータバスでは競合が発生しな
いのでアービタは不要である。An arbiter is not required for a data bus connected only to a bridge, such as the data bus 4-9, since no conflict occurs.

【００６７】図７は本発明によるマルチプロセッサシス
テムの第五の実施例の構成を示すブロック図である。図
５を参照すると、本実施例におけるマルチプロセッサシ
ステムは、第四の実施例におけるマルチプロセッサシス
テムとほぼ同様の構成であり、タイムテーブルの代わり
に一時バッファ５−１３〜５−１４を有する点が異な
る。FIG. 7 is a block diagram showing the configuration of a fifth embodiment of the multiprocessor system according to the present invention. Referring to FIG. 5, the multiprocessor system of this embodiment has almost the same configuration as that of the fourth embodiment, differing in that it has temporary buffers 5-13 to 5-14 in place of the time tables.

【００６８】次に、図７を参照して本実施例の動作を説
明する。本実施例では、同一データバス内でのアクセス
調停動作は第四の実施例と同様でり、ブリッジにデータ
を一時蓄えておくことで、タイムテーブルによる制御を
必要としない点が第四の実施例と異なる。すなわち、第
四の実施例ではブリッジを越える通信はタイムテーブル
を用いて全経路に関するアクセス制御を行っておくた
め、初段のアービタで送信が許可されたデータは途中の
ブリッジで滞留することはないが、本実施例では、初段
のデータバスさえ使用可能であればデータの送信を開始
し、通信経路の途中でデータバスが使用不可能となって
いれば、その直前のブリッジにおいてデータバスが使用
可能になるまでデータを待たせておく。そのため、本実
施例では、ダイナミックスケジューリング動作を行って
いるプロセッサが通信開始を待つ時間が軽減されるとい
う利点を持つが、その代わりブリッジが複雑になるとい
う欠点を持つ。Next, the operation of this embodiment will be described with reference to FIG. 7. In this embodiment, access arbitration within a single data bus is the same as in the fourth embodiment; the difference is that data is held temporarily in the bridge, so control by a time table is not needed. In the fourth embodiment, access control for the entire route of a bridge-crossing communication is arranged in advance with the time tables, so data whose transmission is permitted by the first-stage arbiter never stalls in an intermediate bridge. In this embodiment, by contrast, transmission starts as soon as the first-stage data bus is available, and if a data bus further along the route is unavailable, the data waits in the bridge immediately before it until that data bus becomes available. This embodiment therefore has the advantage of reducing the time that dynamically scheduled processors wait to start communicating, at the cost of a more complex bridge.

【００６９】なお、スタティックスケジューリング動作
を行うプロセッサのデータがブリッジにおいて滞留する
ことは許されないため、転送するデータに対して優先度
を定め、スタティックスケジューリング動作を行うプロ
セッサからのデータやこのプロセッサ宛てのデータは、
優先データとして待ち合わせが発生しないようにアービ
タを制御し、ダイナミックスケジューリング動作を行う
プロセッサからのデータやこのプロセッサ宛てのデータ
は非優先として、他のデータと競合した場合にはブリッ
ジにおいて待ち合わせを行う。Data of processors performing the static scheduling operation must not stall in a bridge. A priority is therefore assigned to the data to be transferred: data from or addressed to a processor performing the static scheduling operation is treated as priority data, and the arbiter is controlled so that it never has to wait; data from or addressed to a processor performing the dynamic scheduling operation is treated as non-priority data and waits in the bridge when it conflicts with other data.

【００７０】次に、ブリッジを経由する通信におけるア
クセス調停動作に関して、アービタとブリッジの動作を
説明する。アービタは優先データに対するチャネル使用
許可要求があれば、必ずこの要求を許可する。優先デー
タが無ければ該アービタに接続された全てのブリッジに
対して一時バッファの空き容量を問い合わせ、空き容量
のあるブリッジを経由する非優先データの中から１つの
非優先データを選択してチャネル使用許可を与える。も
し、ブリッジを経由する非優先データにチャネル使用
許可が与えられない場合、データバス内での非優先の通
信に対してチャネル使用許可を与える。Next, the operation of the arbiter and the bridge in access arbitration for communication via a bridge is described. If there is a channel use request for priority data, the arbiter always grants it. If there is no priority data, the arbiter asks every bridge connected to it for the free space in its temporary buffer, selects one item from the non-priority data that would pass through a bridge with free space, and grants it channel use. If no non-priority data passing through a bridge is granted channel use, the arbiter grants channel use to non-priority communication within the data bus.

【００７１】ブリッジでは、ブリッジを越える通信のデ
ータを受信すると、データの出力先となるデータバスを
管轄するアービタに対してチャネル使用許可を要求す
る。受信データが優先データであれば、該アービタは必
ず使用許可を与えるので、ブリッジは到着データを滞留
させること無く転送処理を行う。到着データが非優先デ
ータでありかつアービタが使用許可を与え、かつ同じデ
ータバスへ出力される優先データや一時バッファ内の非
優先データが無い場合は、この非優先データの転送処理
を行い、さもなければこのデータを一時バッファへと格
納する。When a bridge receives data for communication across the bridge, it requests channel use permission from the arbiter responsible for the data bus to which the data is to be output. If the received data is priority data, the arbiter always grants permission, so the bridge forwards the arriving data without holding it. If the arriving data is non-priority data, the arbiter grants permission, and there is neither priority data to be output to the same data bus nor non-priority data in the temporary buffer, the bridge forwards the non-priority data; otherwise it stores the data in the temporary buffer.

【００７２】複数の非優先データが同じデータバスへ出
力される場合、このうちの１つの非優先データに関して
転送処理を行い、残りの非優先データは全て一時バッフ
ァに格納する。一時バッファ内に蓄えられた非優先デー
タは、出力先のデータバスが同じである優先データがな
く、かつ該データバスを管轄するアービタから許可が得
られれば、一時バッファから取り出されて転送処理が行
われる。When several non-priority data items are to be output to the same data bus, one of them is forwarded and all the others are stored in the temporary buffer. Non-priority data held in the temporary buffer is taken out and forwarded when there is no priority data bound for the same output data bus and permission is obtained from the arbiter responsible for that data bus.
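
The forwarding rules of the buffered bridge can be condensed into a small Python model for a single output data bus. The `Bridge` class, its method names, the first-in-first-out order of the temporary buffer, and the assumption of at most one priority item per cycle are choices made for this sketch and are not stated in the text. The buffer capacity is only reported through `free_space`, since the text leaves it to the upstream arbiter to admit non-priority data only when space is available.

```python
from collections import deque


class Bridge:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()          # temporary buffer for non-priority data

    def free_space(self):
        """Value the upstream arbiter would query before admitting data."""
        return self.capacity - len(self.buffer)

    def step(self, arrivals, granted):
        """Run one cycle and return the payload forwarded, or None.

        arrivals: list of (payload, is_priority) received this cycle.
        granted: True if the output-bus arbiter grants a non-priority request.
        """
        priority = [payload for payload, is_priority in arrivals if is_priority]
        # Non-priority arrivals join the queue behind anything already waiting.
        self.buffer.extend(payload for payload, is_priority in arrivals if not is_priority)
        if priority:
            return priority[0]         # priority data is never held back
        if granted and self.buffer:
            return self.buffer.popleft()
        return None
```

Appending arrivals before popping means a newly arrived non-priority item is forwarded immediately only when nothing older is waiting, which is the condition given in the text.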

【００７３】図８は本発明によるマルチプロセッサシス
テムの第六の実施例の構成を示すブロック図である。図
８を参照すると、本実施例におけるマルチプロセッサシ
ステムは、複数のプロセッサ６−１〜６−４、全プロセ
ッサによって共有される共有メモリ６−８、プロセッサ
６−１及び６−２から共有メモリ６−８へアクセスする
ためのメモリバス６−３、プロセッサ６−１及び６−２
とプロセッサ６−３及び６−４との共有メモリ６−８へ
のアクセスを調停するメモリ制御回路６−６、プロセッ
サ６−３及び６−４の演算器間及びレジスタファイル間
でデータを交換するデータバス６−７、プロセッサ６−
３及び６−４によって共有される共有メモリ６−９で構
成される。FIG. 8 is a block diagram showing the configuration of a sixth embodiment of the multiprocessor system according to the present invention. Referring to FIG. 8, the multiprocessor system of this embodiment comprises a plurality of processors 6-1 to 6-4; a shared memory 6-8 shared by all the processors; a memory bus 6-3 through which the processors 6-1 and 6-2 access the shared memory 6-8; a memory control circuit 6-6 that arbitrates access to the shared memory 6-8 between the processors 6-1 and 6-2 and the processors 6-3 and 6-4; a data bus 6-7 for exchanging data between the arithmetic units and between the register files of the processors 6-3 and 6-4; and a shared memory 6-9 shared by the processors 6-3 and 6-4.

【００７４】プロセッサ６−１はデータを保持するレジ
スタファイル６−１−１と、１または複数の演算器６−
１−２から６−１−３、１または複数のメモリユニット
６−１−４で構成され、プロセッサ１−２も同様の構成
である。プロセッサ６−３はデータを保持するレジスタ
ファイル６−３−１と、１または複数の演算器６−３−
２から６−３−３で構成され、プロセッサ６−４も同様
の構成である。The processor 6-1 comprises a register file 6-1-1 for holding data, one or more arithmetic units 6-1-2 to 6-1-3, and one or more memory units 6-1-4; the processor 1-2 has the same configuration. The processor 6-3 comprises a register file 6-3-1 for holding data and one or more arithmetic units 6-3-2 to 6-3-3; the processor 6-4 has the same configuration.

【００７５】次に、図８を参照して本実施例の動作を説
明する。プロセッサ６−１及び６−２はダイナミックス
ケジューリング動作を行い、動的に共有メモリ６−８に
対するアクセス制御を行う。これらのプロセッサは共有
メモリ６−８のみを介して通信し、これはメモリを共有
した一般的なマルチプロセッサシステムと同様の動作で
ある。Next, the operation of this embodiment will be described with reference to FIG. The processors 6-1 and 6-2 perform a dynamic scheduling operation, and dynamically control access to the shared memory 6-8. These processors communicate only via the shared memory 6-8, which is similar to the operation of a general multiprocessor system sharing the memory.

【００７６】一方、プロセッサ６−３及び６−４は第二
の実施例におけるプロセッサ間通信と同様の動作を行
う。すなわちこれらのプロセッサはデータバスを用いて
演算器間及びレジスタファイル間で通信を行い、また共
有メモリ６−８や共有メモリ６−９を用いた通信も行
う。また、これらのプロセッサはスタティックスケジュ
ーリング動作を行っているため、データバス６−７に対
する様なアクセス制御は行っていない。The processors 6-3 and 6-4, on the other hand, operate in the same way as the inter-processor communication of the second embodiment. That is, they communicate between arithmetic units and between register files using the data bus, and also communicate via the shared memory 6-8 and the shared memory 6-9. Because these processors perform the static scheduling operation, no access control is performed for the data bus 6-7.

【００７７】プロセッサ６−１及び６−２とプロセッサ
６−３及び６−４の間の通信は共有メモリ６−８を用い
て通信を行う。プロセッサ６−１及び６−２からのメモ
リアクセスとプロセッサ６−３及び６−３空のメモリア
クセスはメモリ制御回路６−６によって調停され、ダイ
ナミックスケジューリング動作を行っているプロセッサ
６−３及び６−４からのアクセスを常に優先するように
制御を行う。Communication between the processors 6-1 and 6-2 and the processors 6-3 and 6-4 takes place through the shared memory 6-8. Memory accesses from the processors 6-1 and 6-2 and memory accesses from the processors 6-3 and 6-3 are arbitrated by the memory control circuit 6-6, which is controlled so that accesses from the processors 6-3 and 6-4, which perform the dynamic scheduling operation, always have priority.

【００７８】本実施例は第一の実施例及び第二の実施例
を組み合わせ、特にパケット転送処理に特化した構成で
ある。すなわち、パケットヘッダ処理ではプロセッサ間
の通信が比較的少なく各プロセッサでは多様な処理が行
われるため、プロセッサ６−１及び６−２を一般的なマ
ルチプロセッサシステムと同様の構成として処理を行う
が、パケットスケジューリング処理等ではプロセッサ間
の通信が頻繁で厳密な実時間処理が必要であるため、プ
ロセッサ６−３及び６−４を第二の実施例と同様の構成
としてスタティックスケジューリング動作させて処理を
行う。This embodiment combines the first and second embodiments in a configuration specialized for packet transfer processing. In packet header processing, communication between processors is relatively infrequent and each processor performs varied processing, so the processors 6-1 and 6-2 are configured like a general multiprocessor system. In packet scheduling and similar processing, communication between processors is frequent and strict real-time processing is required, so the processors 6-3 and 6-4 are configured as in the second embodiment and run in the static scheduling operation.

【００７９】[0079]

【発明の効果】本発明の第一の効果は、マルチプロセッ
サシステムにおいて、特にネットワークノード装置内で
パケット転送処理を行うオンチップマルチプロセッサシ
ステムにおいて、夫々のプロセッサが持つ演算器間で直
接通信を行うための通信路を備えることによって、プロ
セッサ間で非常に遅延時間の短い通信を実現できること
である。The first effect of the present invention is that, in a multiprocessor system, and in particular an on-chip multiprocessor system that performs packet transfer processing in a network node device, providing a communication path for direct communication between the arithmetic units of the respective processors enables communication between processors with very short delay.

【００８０】本発明の第二の効果は、全プロセッサを１
本のバスに接続するのではなく、短いバスを複数用意し
て各プロセッサを最寄のバスにのみ接続することによ
り、またバス間をブリッジを用いて接続して複数のバス
を経由した通信を可能とすることにより、多数のプロセ
ッサ及び演算器を用いる場合でも高い通信速度を実現で
きることである。The second effect of the present invention is that a high communication speed can be achieved even with a large number of processors and arithmetic units, because instead of connecting all processors to a single bus, several short buses are provided and each processor is connected only to its nearest bus, and the buses are connected by bridges so that communication can pass through several buses.

【００８１】本発明の第三の効果は、夫々のプロセッサ
が持つレジスタファイルや状態レジスタ、プログラムカ
ウンタの間で直接通信を行うことで、プロセッサ間の処
理の引継ぎが効率的に行われることである。A third effect of the present invention is that, by directly communicating between a register file, a status register, and a program counter of each processor, processing between processors can be efficiently taken over. .

【００８２】本発明の第四の効果は、同一のバスを共有
しているプロセッサ間のみで動的なアクセス調停を行
い、ブリッジを越える通信はタイムテーブルによって静
的にスケジューリングするかもしくはブリッジ内の一時
バッファで待ち合わせを行うことにより、大規模なマル
チプロセッシングにおいて高速にアクセス調停を行うこ
とができることである。The fourth effect of the present invention is that access arbitration can be performed at high speed in large-scale multiprocessing, because dynamic access arbitration is performed only among processors sharing the same bus, while communication across a bridge is either scheduled statically by a time table or made to wait in a temporary buffer in the bridge.

【００８３】本発明の第五の効果は、各プロセッサをダ
イナミックスケジューリング動作とスタティックスケジ
ューリング動作とを選択可能であるように構成すること
で、処理の特性によってプロセッサの動作を選択し、処
理の効率性と実時間性を共に満足させることができるこ
とである。The fifth effect of the present invention is that, by configuring each processor so that either the dynamic scheduling operation or the static scheduling operation can be selected, the operation of each processor can be chosen according to the characteristics of its processing, satisfying both processing efficiency and real-time requirements.

[Brief description of the drawings]

【図１】本発明の第一の実施例の構成を示すブロック図
である。FIG. 1 is a block diagram showing a configuration of a first embodiment of the present invention.

【図２】図１の演算器の構成を示すブロック図である。FIG. 2 is a block diagram showing a configuration of an arithmetic unit in FIG. 1.

【図３】本発明の第二の実施例の構成を示すブロック図
である。FIG. 3 is a block diagram showing a configuration of a second exemplary embodiment of the present invention.

【図４】本発明の第三の実施例の構成を示すブロック図
である。FIG. 4 is a block diagram showing a configuration of a third embodiment of the present invention.

【図５】本発明の第四の実施例の構成を示すブロック図
である。FIG. 5 is a block diagram showing a configuration of a fourth embodiment of the present invention.

【図６】図５の実施例におけるタイムテーブルの構成例
を示す図表である。FIG. 6 is a chart showing a configuration example of a time table in the embodiment of FIG. 5.

【図７】本発明の第五の実施例の構成を示すブロック図
である。FIG. 7 is a block diagram showing a configuration of a fifth embodiment of the present invention.

【図８】本発明の第六の実施例の構成を示すブロック図
である。FIG. 8 is a block diagram showing a configuration of a sixth embodiment of the present invention.

[Explanation of symbols]

１−１〜１−４，２−１〜２−４，３−１〜３−４，４−１〜４−４，５−１〜５−４，６−１〜６−４ プロセッサ
１−５，１−６，２−５，２−６，３−５，３−６，４−５，４−６ ブリッジ
１−７，２−７，３−７，４−７，５−７，６−８，６−９ 共有メモリ
１−８〜１−１０，２−８〜２−１０，３−８〜３−１０，４−８〜４−１０，５−８〜５−１０，６−５，６−７ データバス
１−１１ メモリバス
１−１２，１−１３，２−１１，２−１２，４−１５，４−１６ コプロセッサ
４−１１，４−１２，５−１１，５−１２ アービタ
４−１３，４−１４ タイムテーブル

1-1 to 1-4, 2-1 to 2-4, 3-1 to 3-4, 4-1 to 4-4, 5-1 to 5-4, 6-1 to 6-4: Processor
1-5, 1-6, 2-5, 2-6, 3-5, 3-6, 4-5, 4-6: Bridge
1-7, 2-7, 3-7, 4-7, 5-7, 6-8, 6-9: Shared memory
1-8 to 1-10, 2-8 to 2-10, 3-8 to 3-10, 4-8 to 4-10, 5-8 to 5-10, 6-5, 6-7: Data bus
1-11: Memory bus
1-12, 1-13, 2-11, 2-12, 4-15, 4-16: Coprocessor
4-11, 4-12, 5-11, 5-12: Arbiter
4-13, 4-14: Time table

Continued from the front page

(51) Int.Cl.⁷    Identification symbol    FI                 Theme code (reference)
G06F 15/16       640                      G06F 15/16 640B
     15/177      682                           15/177 682G

F-term (reference)
5B033 AA03 AA14 DD09
5B045 BB12 BB14 BB32 BB34 BB47 CC06 DD01 EE03 EE07 EE12 EE22 GG01 GG09
5B061 BB01 BC01 FF07 GG01 GG11 RR05
5B077 BA02 BA06

Claims

[Claims]

1. A multiprocessor system having a large number of processors, comprising: a plurality of processors each having one or more arithmetic units; and a plurality of buses connecting the arithmetic units, wherein the arithmetic units communicate directly with one another over the buses, and a plurality of communications between arithmetic units are performed simultaneously.

2. The multiprocessor system according to claim 1, further comprising bridges connecting the plurality of buses to one another, wherein a subset of the arithmetic units is connected to each of the buses, arithmetic units connected to the same bus communicate directly via that bus, and arithmetic units connected to different buses communicate via a plurality of the buses and the bridges.

3. The multiprocessor system according to claim 1, further comprising one or more buses connecting the register files of the processors, wherein communication is performed between the register files of different processors or between a register file and an arithmetic unit.

4. The multiprocessor system according to claim 1, further comprising one or more coprocessors that assist the processors, and one or more memories, wherein the coprocessors and the memories are connected to the bus, so that communication is performed between an arithmetic unit and a coprocessor or memory, between a register file and a coprocessor or memory, or between a coprocessor and a memory.

5. The multiprocessor system according to claim 1, further comprising, for each arithmetic unit, a status register indicating data arrival, wherein: when data arrives at an arithmetic unit from the bus, the status register of that arithmetic unit is set to true; after the arithmetic unit performs an operation using the data that arrived from the bus, the status register is set to false; and when the arithmetic unit is to perform an operation using data arriving from the bus but the data has not yet arrived, that is, the status register is false, the operation waits until the status register becomes true.

6. The multiprocessor system according to claim 1, wherein the status registers, such as a flag register, and the program counter of each processor are connected to the bus, so that communication is performed among the arithmetic units, register files, status registers, and program counters of different processors.

7. The multiprocessor system according to claim 1, wherein the register files of the processors are connected by a bus and the contents of registers are copied directly between the register files to communicate between processors, and wherein the status registers, such as a flag register, and the program counter of each processor are connected to the bus, so that communication is performed among the arithmetic units, register files, status registers, and program counters of different processors.

8. The multiprocessor system according to claim 2, wherein each bus is provided with an arbiter that arbitrates bus use requests from the arithmetic units, register files, coprocessors, or memories connected to that bus, and with a time table that specifies the data transfer operation at each time; for communication that does not cross a bridge, bus use requests are arbitrated by the arbiter of each bus; and for communication that crosses a bridge, the arbiter permits only communication on a route permitted by the time table at that time.

9. The multiprocessor system according to claim 2, wherein an arbiter that arbitrates bus use requests from the arithmetic units, register files, coprocessors, or memories connected to a bus is provided for each bus, and a temporary buffer is provided for temporarily holding data that cannot be output from a bridge; for communication that does not cross a bridge, bus use requests are arbitrated by the arbiter of each bus; and for communication that crosses a bridge, the data is transmitted to the next bridge if the next bus on the path is available, and is held in the bridge until that bus becomes available if the next bus on the path is unavailable.

10. The multiprocessor system according to claim 8, wherein the processors include processors that operate at fixed timing and processors that operate at dynamic timing, and the arbiter always grants bus use permission to communication performed by a processor operating at fixed timing in preference to communication by other processors.

11. The multiprocessor system according to claim 9, wherein the processors include processors that operate at fixed timing and processors that operate at dynamic timing, the arbiter always grants bus use permission to communication destined for a processor operating at fixed timing in preference to communication by other processors, and the bridge outputs data destined for such a processor to the bus in preference to other data held in the bridge.
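
The data-arrival handshake recited in claim 5 can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `ArithmeticUnitPort`, the `STALL` sentinel, and the representation of a hardware wait as a call that returns `STALL` and is retried later are all hypothetical and do not appear in the claims.

```python
STALL = object()  # hypothetical marker meaning "operation must wait"


class ArithmeticUnitPort:
    """Bus-facing input of one arithmetic unit with its status register (assumed model)."""

    def __init__(self):
        self.status = False  # status register: True once data has arrived
        self.data = None     # value most recently delivered by the bus

    def deliver(self, value):
        # Bus side: data arrives at the arithmetic unit, status becomes true.
        self.data = value
        self.status = True

    def try_operate(self, op):
        # Unit side: if no data has arrived (status is false), the operation waits.
        if not self.status:
            return STALL
        result = op(self.data)
        # After the operation has used the data, status returns to false.
        self.status = False
        return result
```

A caller would retry `try_operate` on each cycle until it returns something other than `STALL`, which models an arithmetic unit waiting for its operand without polling a register file or memory.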