JP2008293524A

JP2008293524A - Multiprocessor system

Info

Publication number: JP2008293524A
Application number: JP2008186812A
Authority: JP
Inventors: Toshiki Takeuchi; 俊樹竹内; Hiroyuki Igura; 裕之井倉
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-07-18
Filing date: 2008-07-18
Publication date: 2008-12-04
Anticipated expiration: 2023-04-24
Also published as: JP4609540B2

Abstract

PROBLEM TO BE SOLVED: To speed up data transfer between processor elements, improve the data processing efficiency of the internal processor of each processor element, and improve debugging efficiency, while minimizing the increase in circuit scale.

SOLUTION: This multiprocessor system has a plurality of processor elements 01 to 0n, each of which acquires the right to use a first or second shared bus in response to a transfer request for control system data or input/output data and performs a multiplex transfer or a burst transfer as a master. In response to a transfer request for control system data, a processor element outputs a bus request signal for the first shared bus and, on receiving a bus permission signal, acts as a master and outputs a selection signal for the destination, a control signal, an address signal, and the control system data in one cycle. The processor element selected on the basis of the selection signal via the first shared bus acts as a slave, receives the control system data, and processes it according to the control signal and the address signal.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a multiprocessor system, and more particularly to a multiprocessor system that transfers data via a plurality of shared buses.

Conventionally, multiprocessor systems of this type include, in addition to systems in which each processor element has a fixed role as master or slave, systems in which each processor element can operate dynamically as either master or slave. To make data transfer between processor elements more efficient, such systems have used a plurality of shared buses to carry both message transfers between processor elements and input/output transfers between processor elements and input/output devices.

For example, FIG. 18 is a block diagram showing a configuration example of such a conventional multiprocessor system (see Patent Document 1).

This conventional multiprocessor system includes a plurality of processors 12-1, 12-2 and a plurality of bus controllers 13-1, 13-2, which together constitute the processor elements, as well as a plurality of input/output devices 16-1, 16-2, 16-3 and a plurality of adapters 15-1, 15-2. The processors 12-1, 12-2 are connected to a plurality of shared buses 14-1, 14-2 via the bus controllers 13-1, 13-2, and the input/output devices 16-1, 16-2, 16-3 are connected to the shared buses 14-1, 14-2 via the adapters 15-1, 15-2.

Further, each of the processors 12-1, 12-2 includes an input/output processing unit and a message communication processing unit as kernel processing means of the operating system.

In response to an input/output request directed to the input/output devices 16-1, 16-2, 16-3, the input/output processing unit passes the address information of the input/output device and the transfer data information to the bus controller 13-1 or 13-2 and starts the input/output. When the input/output completes, it receives an interrupt notification from the bus controller 13-1 or 13-2 and notifies the program that issued the input/output request of the completion.

On receiving a request for data communication between processors, the message communication processing unit passes the address of the destination processor, the transfer data information, and the like to the bus controllers 13-1, 13-2 and requests transmission of the data. In reception processing, when another processor has transmitted data, it receives an interrupt notification from the bus controllers 13-1, 13-2, receives the data, and passes the data to the requesting program.

In this conventional multiprocessor system, each processor element acts as a master or a slave, and input/output transfers with the input/output devices and message transfers between processor elements are carried out over the shared buses 14-1, 14-2. Because any one bus can be used for both input/output transfers and message transfers, multiple message transfers and multiple input/output transfers can proceed simultaneously over the shared buses, according to the amount of data and the transfer traffic between processor elements. Consequently, as long as the number of simultaneous transfer requests, message transfers and input/output transfers combined, does not exceed the number of shared buses, no processing is kept waiting by a busy bus.

Patent Document 1: Japanese Patent Laid-Open No. 5-6333 (paragraphs 0007 to 0013, FIG. 1)

In general, the following requirements are placed on a shared bus that transfers data between a plurality of processor elements in a multiprocessor system.
(1) Performance: high-speed data transfer should be achieved with a small circuit area and low power consumption.
(2) Extensibility and resource reusability: even when processor elements are physically added, changed, or removed, design changes to the other processor elements and to the shared bus should be kept to a minimum.
(3) Ease of verification: the state of data transfers between processor elements and the debug information of each processor element should be selectable for monitoring.

In the conventional multiprocessor system described above, message transfers between processor elements are separated from the input/output transfers of the input/output devices, so they proceed at high speed without having to wait for an input/output transfer to finish. However, when a message transfer between processor elements carries a large amount of data, it occupies the shared bus for a long time and message transfers between other processor elements are kept waiting, so that high-speed data transfer between processor elements is difficult to achieve for the system as a whole.

If the number of shared buses is increased as a countermeasure, the circuit-scale overhead becomes enormous.

Even when a message transfer between processor elements carries only a small amount of data, each message transfer must raise an interrupt to the internal processor of the processor element and be handled by interrupt processing, so the efficiency of data processing by the internal processor is reduced correspondingly.

Further, when debugging the whole system or the program of a processor element, the state of data transfers between processor elements and the debug information of each processor element cannot be selected and monitored, so debugging efficiency is low.

If, as a countermeasure, a bus monitor circuit or an address trace function is implemented for each shared bus or each processor element, as disclosed for example in Japanese Patent Laid-Open No. 2000-330877 or Japanese Patent Laid-Open No. 4-195552, the circuit-scale overhead becomes enormous.

Accordingly, an object of the present invention is, in a multiprocessor system, to speed up data transfer between processor elements, to improve the data processing efficiency of the internal processors of the processor elements, and to improve debugging efficiency, while keeping the increase in circuit scale to a minimum.

To that end, the present invention provides a multiprocessor system comprising a plurality of processor elements, each of which processes data and, in response to a transfer request for control system data or input/output data, acquires the right to use a first or second shared bus and performs a multiplex transfer or a burst transfer as a master. In response to a transfer request for control system data, a processor element outputs a bus request signal for the first shared bus and, on receiving a bus permission signal, acts as a master and outputs a selection signal for the transfer destination, a control signal, an address signal, and the control system data in one cycle. A processor element selected on the basis of the selection signal via the first shared bus acts as a slave, receives the control system data, and processes it according to the control signal and the address signal.

The system further comprises a first shared bus circuit and a first bus arbiter. The first shared bus circuit receives the selection signal, the control signal, the address signal, and the control system data from each of the plurality of processor elements, selectively switches them onto the first shared bus according to the right to use the first shared bus, selects one of the processor elements as a slave on the basis of the selection signal on the first shared bus, and outputs the control signal, the address signal, and the control system data to it. The first bus arbiter accepts bus request signals from the processor elements every cycle, issues the bus permission signal for the first shared bus to the requesting processor element with the highest priority, and thereby arbitrates the right to use the bus in the next cycle.
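The arbitration described above can be sketched as follows. This is a minimal illustration, not the patented circuit: the text only says that the requester with the highest priority receives the bus permission signal, so the fixed priority order and the function names `arbitrate` and `run_cycles` are assumptions made here for the example.

```python
from typing import List, Optional, Sequence


def arbitrate(requests: Sequence[bool], priority: Sequence[int]) -> Optional[int]:
    """Return the requesting processor element with the highest priority.

    `priority` lists processor-element indices from highest to lowest
    priority (a fixed order, assumed for this sketch).
    """
    for pe in priority:
        if requests[pe]:
            return pe
    return None  # no request: the bus stays idle


def run_cycles(request_trace: Sequence[Sequence[bool]],
               priority: Sequence[int]) -> List[Optional[int]]:
    """Requests sampled in cycle t decide the bus owner of cycle t + 1."""
    owners: List[Optional[int]] = [None]  # nothing was arbitrated before cycle 0
    for requests in request_trace:
        owners.append(arbitrate(requests, priority))
    return owners


trace = [
    [False, True, True],    # cycle 0: PE1 and PE2 request
    [True, False, True],    # cycle 1: PE0 and PE2 request
    [False, False, False],  # cycle 2: no request
]
print(run_cycles(trace, priority=[0, 1, 2]))
```

Because the grant is decided one cycle ahead, the bus can carry a transfer in every cycle without an arbitration gap, which is the point of arbitrating "the next cycle".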

The first shared bus circuit comprises a multiplexer that receives the selection signal, the control signal, the address signal, and the control system data from each of the plurality of processor elements and selectively switches them onto the first shared bus according to the right to use the first shared bus; a decoder that decodes the selection signal on the first shared bus and selects one of the plurality of processor elements as the destination slave; and a demultiplexer that receives the control signal, the address signal, and the control system data on the first shared bus and distributes them to the destination slave according to the output of the decoder.
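The multiplexer, decoder, and demultiplexer path can be modeled for a single bus cycle as below. This is a behavioral sketch under assumptions: the `BusWord` type, its field encodings, and the function name `shared_bus_cycle` are hypothetical and only mirror the four signals named in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BusWord:
    """What a master drives in one cycle (field types are assumptions)."""
    select: int    # selection signal: index of the destination processor element
    control: str   # control signal, e.g. "write"
    address: int   # address signal
    data: int      # control system data


def shared_bus_cycle(master_outputs: List[Optional[BusWord]],
                     owner: Optional[int]
                     ) -> List[Optional[Tuple[str, int, int]]]:
    """Model one cycle of the first shared bus circuit."""
    n = len(master_outputs)
    slave_inputs: List[Optional[Tuple[str, int, int]]] = [None] * n
    if owner is None or master_outputs[owner] is None:
        return slave_inputs  # idle cycle: no slave is selected

    # Multiplexer: only the bus owner's signals reach the shared bus.
    word = master_outputs[owner]

    # Decoder: turn the selection signal into one-hot slave enables.
    enables = [index == word.select for index in range(n)]

    # Demultiplexer: route control, address, and data to the enabled slave.
    for index, enabled in enumerate(enables):
        if enabled:
            slave_inputs[index] = (word.control, word.address, word.data)
    return slave_inputs


outputs = [BusWord(2, "write", 0x10, 0xAB), BusWord(0, "write", 0x20, 0xCD), None]
print(shared_bus_cycle(outputs, owner=1))
```

In this model PE0 is also driving a word, but since PE1 owns the bus only PE1's word is delivered, and only to the slave named by its selection signal.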

The processor elements perform the following three kinds of transfer.

In a write transfer, a processor element, in response to a transfer request for control system data, outputs a bus request signal for the first shared bus and, on receiving the bus permission signal, outputs the control system data as a master. The processor element selected on the basis of the selection signal via the first shared bus receives the control system data as a slave and writes it to memory according to the control signal and the address signal.

In a read return request transfer, a processor element, in response to a transfer request for control system data containing a return destination code, outputs a bus request signal for the first shared bus and, on receiving the bus permission signal, outputs the return destination code as a master. The processor element selected on the basis of the selection signal via the first shared bus receives the return destination code as a slave, reads memory data according to the control signal and the address signal, treats the data read as control system data, and issues a return request.

In a return write transfer, the processor element that issued the return request outputs a bus request signal for the first shared bus and, on receiving the bus permission signal, outputs as a master a selection signal corresponding to the return destination code. The processor element selected on the basis of that selection signal via the first shared bus receives the control system data as a slave and writes it to memory according to the control signal and the address signal.
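A remote read is therefore built from two independent write-style bus transactions. The sketch below illustrates that split under stated assumptions: the class `ProcessorElement`, the control strings `"write"` and `"read_request"`, and the choice to carry the return destination code as a `(processor element, return address)` pair are all hypothetical; the text does not specify how the return address is conveyed.

```python
class ProcessorElement:
    """Toy processor element with a local memory (illustrative only)."""

    def __init__(self, pe_id):
        self.pe_id = pe_id
        self.memory = {}
        self.pending_returns = []  # return requests waiting for the bus

    def receive(self, control, address, data):
        """Slave-side handling of one bus word."""
        if control == "write":
            self.memory[address] = data
        elif control == "read_request":
            # `data` is the return destination code (assumed layout).
            return_to, return_address = data
            value = self.memory[address]
            self.pending_returns.append((return_to, return_address, value))
        else:
            raise ValueError(f"unknown control signal: {control}")


def bus_transfer(pes, select, control, address, data):
    """One bus cycle: deliver the word to the slave named by `select`."""
    pes[select].receive(control, address, data)


def remote_read(pes, requester, target, address, return_address):
    # Transaction 1: read return request transfer (requester is master).
    bus_transfer(pes, target, "read_request", address, (requester, return_address))
    # The bus is released here; other masters could use it while the
    # target prepares the data.
    # Transaction 2: return write transfer (the target is now the master).
    return_to, return_addr, value = pes[target].pending_returns.pop(0)
    bus_transfer(pes, return_to, "write", return_addr, value)


pes = [ProcessorElement(0), ProcessorElement(1)]
bus_transfer(pes, 1, "write", 0x40, 99)  # plain write transfer
remote_read(pes, requester=0, target=1, address=0x40, return_address=0x08)
print(pes[0].memory)
```

The design choice this illustrates is that the bus never has to be held while a slave fetches data, so the bus itself only needs to support one-cycle writes.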

When selected as a slave on the basis of the selection signal via the first shared bus, the processor element performs the memory write or the memory read return request according to the control signal and the address signal by means of a dedicated memory control unit rather than by internal interrupt processing.

The processor element also performs an interrupt request transfer. In response to a transfer request for control system data containing an interrupt request, it outputs a bus request signal for the first shared bus and, on receiving the bus permission signal, outputs the interrupt request as a master. The processor element selected on the basis of the selection signal via the first shared bus receives the interrupt request as a slave and performs the internal interrupt processing corresponding to the interrupt request according to the control signal and the address signal.

The interrupt request contains an interrupt cause and a transfer source code.

The system further comprises a debug processing element that snoops the control system data and the input/output data on the first and second shared buses when the transfer path and the address range match, and stores them in a debug memory.

The processor element traces the addresses of the instructions executed by its internal processor and generates trace data as control system data. In response to the resulting transfer request, it outputs a bus request signal for the first shared bus and, on receiving the bus permission signal, outputs the trace data as a master.

When selected on the basis of the selection signal via the first shared bus, the debug processing element receives the trace data as a slave and stores it in the debug memory according to the control signal and the address signal.

The system further comprises a clock generation circuit, an arbiter synchronization circuit, and a slave synchronization circuit. The clock generation circuit generates a bus clock signal that is synchronized with the basic clock signal of the processor elements and whose frequency is an integer multiple of the basic clock frequency, chosen according to the transfer traffic on the first shared bus. The arbiter synchronization circuit receives the bus request signal for the first shared bus from a processor element and outputs it to the first bus arbiter in synchronization with the bus clock signal, and receives the bus permission signal for the first shared bus from the first bus arbiter and outputs it to the processor element in synchronization with the basic clock signal. The slave synchronization circuit receives the selection signal, the control signal, the address signal, and the control system data via the first shared bus and outputs them to the processor element in synchronization with the basic clock signal.

The first bus arbiter accepts the bus request signals for the first shared bus from the processor elements, via the arbiter synchronization circuit, only once per cycle of the basic clock signal. In each bus cycle of the bus clock signal, it issues the bus permission signal for the first shared bus, via the arbiter synchronization circuit, to the processor element with the highest priority, thereby arbitrating the right to use the bus in each bus cycle of the next cycle.
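The effect of running the bus at an integer multiple of the basic clock can be illustrated with a small scheduling sketch. It assumes a fixed priority order and models only the grant decision; the function name `schedule_bus_cycles` and the representation of idle bus cycles as `None` are assumptions for this example, and the policy for choosing the multiplier from the traffic is not modeled because the text does not specify it.

```python
from typing import List, Optional, Sequence


def schedule_bus_cycles(requests: Sequence[bool],
                        priority: Sequence[int],
                        multiplier: int) -> List[Optional[int]]:
    """Assign the bus cycles of the next basic clock cycle.

    Requests are sampled once per basic clock cycle, so each processor
    element is granted at most one bus cycle. With a bus clock running
    `multiplier` times faster, up to `multiplier` requesters are served,
    in priority order; unused bus cycles are returned as None.
    """
    if multiplier < 1:
        raise ValueError("multiplier must be a positive integer")
    pending = [pe for pe in priority if requests[pe]]
    granted = pending[:multiplier]
    return granted + [None] * (multiplier - len(granted))


requests = [True, True, False, True]
print(schedule_bus_cycles(requests, priority=[0, 1, 2, 3], multiplier=1))
print(schedule_bus_cycles(requests, priority=[0, 1, 2, 3], multiplier=2))
print(schedule_bus_cycles(requests, priority=[0, 1, 2, 3], multiplier=4))
```

With a multiplier of 1 only the highest-priority requester is served per basic clock cycle; with a multiplier of 4 all three requesters in this example are served within one basic clock cycle and one bus cycle stays idle.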

The system also comprises a processor element that operates as one of the plurality of processor elements and, in response to a transfer request for input/output data, outputs a bus request signal for the second shared bus and, on receiving the bus permission signal, burst-transfers the input/output data as a master; and a processor element that operates as one of the plurality of processor elements and burst-transfers the input/output data as a slave connected via the second shared bus.

The system further comprises a second shared bus circuit and a second bus arbiter. The second shared bus circuit selectively connects the processor elements operating as master or slave of the second shared bus to that bus according to the right to use it, and burst-transfers the input/output data between master and slave via the second shared bus. The second bus arbiter accepts bus request signals for the second shared bus from the processor elements every cycle, issues the bus permission signal for the second shared bus to the requesting processor element with the highest priority, and thereby arbitrates the right to use the bus.

As described above, the present invention can be expected to provide the following effects.

The first effect is that both transfers among all processor elements and high-speed data transfers can be performed efficiently while the increase in circuit scale is kept down.

The reason is that, unlike conventional processor elements, the processor elements divide all transferred data into two types, control system data and input/output data, and, in response to a transfer request for either type, acquire the right to use the first or the second shared bus and perform a multiplex transfer or a burst transfer as a master. The first shared bus allows transfers between all processor elements and is restricted to the bare minimum function of write transfers only: the right to use the bus is released as soon as a read return request has been written to the slave, so that the bus can be assigned to other processor elements wishing to transfer while the return data is being prepared. The second shared bus restricts both the processor elements connected to it and the transfer direction.

The second effect is that the transfer of control system data among all processor elements and the data processing within each processor element become faster and consume less power, so that the multiprocessor system as a whole becomes faster and consumes less power.

The reason is that a transfer of control system data over the first shared bus does not itself require a specific master processor element to be started: data is transferred over the first shared bus directly between the bus interfaces of the processor elements, without processing by the internal processor of each processor element. In addition, the first shared bus circuit can be operated, according to the transfer traffic, with a bus cycle an integer multiple faster than the cycle of the basic clock signal of the processor elements.

The third effect is excellent extensibility and resource reusability, which shortens the development period of the multiprocessor system and reduces its development cost.

The reason is that, even when adding or changing processor elements following a change in the system specification creates a transfer path between processor elements that was not originally anticipated, control system data, including interrupt requests, can still be transferred over the first shared bus, which allows transfers between all processor elements. Such a change can therefore be accommodated flexibly with almost no modification of the overall bus specification or connection configuration. Likewise, when an unanticipated large-volume data transfer between processor elements is added or changed, it can be accommodated by adding to or changing the connection configuration of the second shared bus.

The fourth effect is improved testability and debuggability.

The reason is threefold. The trace data generated by the address trace function of each processor element is transferred to a debug storage device common to all processor elements during the periods in which the first shared bus is idle in normal operation. Combined with the bus monitor function, this allows the data transferred between processor elements in normal operation and the address trace data of one or more processor elements to be monitored at the same time. Finally, the first shared bus circuit can be operated, according to the transfer traffic, with a bus cycle an integer multiple faster than the cycle of the basic clock signal of the processor elements.

Next, the present invention will be described with reference to the drawings. FIG. 1 is an overall block diagram showing Embodiment 1 of a multiprocessor system according to the present invention. Referring to FIG. 1, the multiprocessor system of this embodiment includes a plurality of processor elements 01 to 0n, first and second shared bus circuits 100 and 200, first and second bus arbiters 105 and 205, and a debug processing element 10.

Each of the processor elements 01 to 0n processes data and, unlike the processor elements of the conventional multiprocessor system shown in FIG. 10, divides all transferred data into two types, control system data and input/output data. In response to a transfer request for control system data or input/output data, it acquires the right to use the first or second shared bus and performs a multiplex transfer or a burst transfer as a master.

A processor element may, for example, be composed of an internal processor such as an MPU or DSP that performs various operations and controls the processor element, storage such as memories and registers, a dedicated hardware accelerator that performs data processing, a data input/output device (DMA controller), and the like, although embodiments of the present invention are not limited to this configuration.

At least one of the processor elements 01 to 0n, as in the conventional system, outputs a bus request signal for the second shared bus in response to a transfer request for input/output data and, on receiving the bus permission signal, burst-transfers the input/output data as a master. At least one of the processor elements 01 to 0n, again as in the conventional system, burst-transfers the input/output data as a slave connected via the second shared bus.

The first and second shared bus circuits 100 and 200 transfer control system data and input/output data, respectively, between the processor elements 01 to 0n via the first and second shared buses, using mutually different specifications. The first shared bus circuit 100 has only the bare minimum write transfer function and performs multiplex transfers, one per cycle, bidirectionally between some or all of the processor elements. The second shared bus circuit 200 restricts the processor elements involved and the transfer direction, and performs burst transfers from master to slave or from slave to master. One or more of each of the first and second shared buses may physically exist in a single multiprocessor. When there are several second shared buses, the processor elements connected to them and their bus specifications need not be identical.

The first and second bus arbiters 105 and 205 accept bus requests for the first and second shared buses, respectively, from the processor elements 01 to 0n every cycle, issue the bus permission signal for the first or second shared bus to the requesting processor element with the highest priority, and thereby arbitrate the right to use the first and second buses.

The debug processing element 10 snoops the control system data and input/output data on the first and second shared buses when the transfer path and the address range match, stores them in a debug memory, and can output them for monitoring.

As described above, in the multiprocessor system of this embodiment, data of which only a small amount is transferred at a time but which may be transferred between any pair of processor elements, chiefly operation timing signals and parameter setting signals, is treated as control system data and multiplex-transferred from master to slave among the processor elements 01 to 0n over the first shared bus. Data of which a large amount is transferred at a time and whose transfer path is determined in advance, chiefly stream data, is treated as input/output data and burst-transferred over the second shared bus between a restricted set of masters and slaves among the processor elements 01 to 0n.

In other words, transfers whose traffic is so heavy that carrying them on the first shared bus would affect other transfers and overall system performance are carried on the second shared bus. This allows the specification of the first shared bus 100, which tends to become complex because of its many connection destinations, to be simplified as far as possible.

また、デバグ用処理要素１０により、第１，第２の共有バス上の転送データまたは信号の
転送経路およびアドレスが所望の範囲と一致した場合にだけ、その転送データをスヌーピ
ングし内部のデバグ用メモリに記憶させ、モニタすることができる。このとき、デバグ用
処理要素１０が、第１，第２の共有バス上の転送データを同時にモニタするために、動作
クロックを速くしてマルチプレクサなどを用いて切り替えながらモニタする機能を有して
いても何ら問題はない。 In addition, the debug processing element 10 snoops transfer data on the first and second shared buses only when the transfer path and address of the data or signal fall within a desired range, stores the data in its internal debug memory, and makes it available for monitoring. To monitor transfer data on both shared buses at the same time, the debug processing element 10 may also run at a faster operating clock and monitor the two buses while switching between them with a multiplexer or the like.
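The snooping condition described above can be illustrated with a minimal Python sketch. The names `Transfer` and `SnoopFilter`, the representation of the transfer path as sets of master and slave indices, and the inclusive address range are all assumptions made for illustration; the patent does not specify how the condition is encoded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transfer:
    """One transfer observed on a shared bus (hypothetical model)."""
    bus: int      # 1 = first shared bus, 2 = second shared bus
    master: int   # index of the sending processor element
    slave: int    # index of the receiving processor element
    addr: int
    data: int


@dataclass
class SnoopFilter:
    """Captures a transfer only when its path and address match (assumed encoding)."""
    masters: frozenset   # allowed senders (part of the transfer path condition)
    slaves: frozenset    # allowed receivers (part of the transfer path condition)
    addr_lo: int         # inclusive lower bound of the address range
    addr_hi: int         # inclusive upper bound of the address range
    debug_memory: list = field(default_factory=list)

    def observe(self, t: Transfer) -> bool:
        hit = (
            t.master in self.masters
            and t.slave in self.slaves
            and self.addr_lo <= t.addr <= self.addr_hi
        )
        if hit:
            # Store the snooped transfer in the debug memory.
            self.debug_memory.append(t)
        return hit
```

In this sketch a transfer that misses either the path condition or the address range is ignored, so the debug memory holds only the traffic selected for monitoring.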

次に、本実施形態のマルチプロセサシステムにおける第１，第２の共有バスを介したデー
タ転送についてそれぞれ詳細説明する。 Next, data transfer via the first and second shared buses in the multiprocessor system of this embodiment will be described in detail.

図２は、本実施形態のマルチプロセサシステムにおける第１の共有バスを介したデータ転
送を説明するための説明図である。第１の共有バスを介したデータ転送では、図２に示す
ように、転送を行うプロセッサ要素のマスタもしくはスレーブとしての役割が動的に変化
し、全てのプロセッサ要素間で第１の共有バスを介したデータ転送が許容されている。こ
のような構成をとることにより、第１の共有バスを介したデータ転送そのものに対しては
、特定マスタのプロセッサ要素の起動が不要になり、各プロセッサ要素のバスインタフェ
ース間で第１の共有バスを介したデータ転送ができ、転送の効率化および低消費電力化を
図ることができる。 FIG. 2 is an explanatory diagram of data transfer via the first shared bus in the multiprocessor system of this embodiment. In data transfer via the first shared bus, as shown in FIG. 2, the role of each processor element as master or slave changes dynamically, and data transfer via the first shared bus is allowed between all processor elements. With this configuration, a data transfer via the first shared bus does not itself require a specific master processor element to be activated; data can be transferred over the first shared bus directly between the bus interfaces of the processor elements, which improves transfer efficiency and reduces power consumption.

図３は、第１の共有バス回路１００の内部構成例および周辺接続例を示すブロック図であ
り、図４は、プロセサ要素内の第１の共有バスのマスタ側およびスレーブ側インタフェー
スの１部を示す部分ブロック図である。 FIG. 3 is a block diagram showing an example of the internal configuration and peripheral connections of the first shared bus circuit 100, and FIG. 4 is a partial block diagram showing part of the master-side and slave-side interfaces of the first shared bus in a processor element.

図３を参照すると、第１の共有バス回路１００は、マルチプレクサ，デコーダおよびデマ
ルチプレクサを備えて構成される。ここで、マルチプレクサは、第１の共有バスのマスタ
として動作するプロセサ要素から選択信号ＭＳＥＬ，制御信号ＭＷＥ，ＭＲＥＳ，アドレ
ス信号ＭＡＤＤＲおよび制御系データＭＤＢＯをそれぞれ入力し、第１のアービタ１０５
からの第１の共有バスのバス使用権に対応した信号により第１の共有バスへ選択的に切り
替え出力し、デコーダは、第１の共有バス上の選択信号をデコードし複数のプロセサ要素
０１〜０ｎの１つを転送先のスレーブとして選択し、デマルチプレクサは、第１の共有バ
ス上の制御信号，アドレス信号および制御系データをそれぞれ入力し、デコーダの出力に
対応して、第１の共有バスのスレーブとして動作する転送先のプロセサ要素へそれぞれ切
り替え分配する。 Referring to FIG. 3, the first shared bus circuit 100 comprises a multiplexer, a decoder, and a demultiplexer. The multiplexer receives the selection signal MSEL, the control signals MWE and MRES, the address signal MADDR, and the control system data MDBO from each processor element operating as a master of the first shared bus, and selectively switches them onto the first shared bus according to a signal from the first arbiter 105 that corresponds to the bus usage right. The decoder decodes the selection signal on the first shared bus and selects one of the processor elements 01 to 0n as the destination slave. The demultiplexer receives the control signals, the address signal, and the control system data on the first shared bus and, according to the decoder output, distributes them to the destination processor element operating as a slave of the first shared bus.
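The multiplexer, decoder, and demultiplexer path can be sketched as a single routing function. This is a behavioral illustration only, not the circuit of FIG. 3: the function name `route_first_bus`, the dictionary layout, and the use of a plain integer for MSEL are assumptions.

```python
def route_first_bus(master_outputs, granted, slave_ids):
    """Route one cycle of first-shared-bus traffic (behavioral sketch).

    master_outputs: dict mapping master index -> dict with keys
                    MSEL, MWE, MRES, MADDR, MDBO (assumed layout).
    granted:        index of the master holding the bus, or None.
    slave_ids:      set of valid slave indices.
    Returns a dict {slave index: signals delivered to that slave}.
    """
    if granted is None:
        return {}
    # Multiplexer: only the granted master's outputs reach the bus.
    on_bus = master_outputs[granted]
    # Decoder: the selection signal picks exactly one destination slave.
    target = on_bus["MSEL"]
    if target not in slave_ids:
        return {}
    # Demultiplexer: control, address, and data go to the selected slave only.
    return {target: {name: on_bus[name] for name in ("MWE", "MRES", "MADDR", "MDBO")}}
```

Because every element appears both as a possible key of `master_outputs` and as a member of `slave_ids`, the same write-only path serves transfers in either direction between any pair of elements.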

上述のように、本実施形態の第１の共有バス回路１００は、従来バスでは書込み転送（マ
スタからスレーブへ）および読み出し転送（スレーブからマスタへ）の両方を考慮して回
路を構成しなければならないのに対して、書込み転送のみを可能にするだけの回路構成と
する。このような回路構成にしても、全てのプロセッサ要素がマスタになれるために双方
向のデータ転送が実現でき、回路規模が削減される。また、図３では、第１の共有バス回
路１００が、論理合成等のインプリメント容易性を考慮して、ＭＵＸ型のバス構成となっ
ているが、インプリメント容易性および動作遅延の見積りが許せば、３ステート型などの
バス構成でも何ら問題はない。 As described above, whereas a conventional bus circuit must be designed for both write transfers (master to slave) and read transfers (slave to master), the first shared bus circuit 100 of this embodiment is configured to support write transfers only. Even with this configuration, bidirectional data transfer is possible because every processor element can become a master, and the circuit scale is reduced. In FIG. 3, the first shared bus circuit 100 has a MUX-type bus configuration for ease of implementation, such as logic synthesis, but a 3-state bus configuration or the like may also be used if ease of implementation and the estimated operation delay allow.

各プロセサ要素０１〜０ｎは、制御系データの転送要求に対応して第１の共有バスのバス
要求信号ＭＲＥＱを出力し、バス許可信号ＭＧＲＡＮＴの入力に応じてマスタとして転送
先の選択信号ＭＳＥＬと、制御信号ＭＷＥ，ＭＲＥＳと、アドレス信号ＭＡＤＤＲおよび
制御系データＭＤＢＯを各出力端子から１サイクルで転送出力し、また、第１の共有バス
を介して選択信号に基づき選択されスレーブとして、制御系データを入力し制御信号およ
びアドレス信号に基づき処理する。また、図４に示すように、各プロセサ要素０１〜０ｎ
内の第１の共有バスのインタフェースは、割込み要求信号をエンコードし制御系データと
して転送出力するマスタ出力部と、転送入力した制御系データを１時保持およびデコード
し割込み要求信号を生成するスレーブ入力部とを備える。 Each of the processor elements 01 to 0n outputs the bus request signal MREQ of the first shared bus in response to a transfer request for control system data and, on receiving the bus permission signal MGRANT, acts as a master and outputs the destination selection signal MSEL, the control signals MWE and MRES, the address signal MADDR, and the control system data MDBO from its output terminals in one cycle. When selected as a slave by the selection signal via the first shared bus, it receives the control system data and processes it based on the control signals and the address signal. As shown in FIG. 4, the first shared bus interface in each of the processor elements 01 to 0n includes a master output unit that encodes an interrupt request signal and outputs it as control system data, and a slave input unit that temporarily holds and decodes received control system data and generates an interrupt request signal.

第１のアービタ１０５は、複数のプロセサ要素０１〜０ｎからバス要求信号ＭＲＥＱおよ
び優先度信号ＭＰＲＩをサイクルごとにそれぞれ受け付け、最も優先度の高いプロセサ要
素に対し第１の共有バスのバス許可信号ＭＧＲＡＮＴを発行して、次のサイクルの第１の
共有バスのバス使用権を調停し第１の共有バス回路１００へ信号出力する。 The first arbiter 105 receives the bus request signal MREQ and the priority signal MPRI from the plurality of processor elements 01 to 0n for each cycle, and the bus grant signal MGRANT of the first shared bus for the processor element with the highest priority. Is issued to arbitrate the right to use the first shared bus in the next cycle and output a signal to the first shared bus circuit 100.

図５は、第１の共有バスによる制御系データの転送例を示すタイミング図である。図５に
示すように、第１の共有バスによるデータ転送は、ＲｅｑｕｅｓｔｐｈａｓｅとＴｒａ
ｎｓｆｅｒｐｈａｓｅの２種類のフェーズにより実現される。Ｒｅｑｕｅｓｔｐｈａ
ｓｅは、１または複数サイクルを必要とし、転送を行いたいプロセサ要素がバス要求信号
を発行してから、バスの使用権付与を示すバス許可信号がアクティブになるまでの期間で
ある。また、Ｔｒａｎｓｆｅｒｐｈａｓｅは、バス許可信号をクロック信号でラッチし
た信号がアクティブな期間がそのプロセサ要素にバスが割り当てられている期間であり、
そのプロセサ要素がマスタとなれる期間であり、基本的に１サイクルでの転送となる。 FIG. 5 is a timing chart showing an example of control system data transfer over the first shared bus. As shown in FIG. 5, data transfer over the first shared bus consists of two phases, a Request phase and a Transfer phase. The Request phase takes one or more cycles and lasts from when the processor element that wants to transfer issues a bus request signal until the bus permission signal granting the bus usage right becomes active. The Transfer phase is the period during which the signal obtained by latching the bus permission signal with the clock is active; during this period the bus is allocated to that processor element and it can act as master. The transfer basically completes in one cycle.

つまり、Ｒｅｑｕｅｓｔｐｈａｓｅにてマスタに対してバス許可信号が発行された次の
１サイクルで、アドレス信号などのコントロール信号およびデータ信号全てを出力し、転
送を完了する。データの転送が終了したらバス要求信号を立ち下げる。すると、バスアー
ビタは他のバス割り当てを要求しているプロセサ要素にバスの使用権を割り当てることが
できる。また、マスタ側選択信号ＭＳＥＬによって選択され、スレーブ側選択信号ＳＳＥ
Ｌがアクティブとなったプロセサ要素はスレーブとなり、コントロール信号およびデータ
信号などの全ての転送データをＴｒａｎｓｆｅｒｐｈａｓｅの末尾のクロックタイミン
グでラッチする。 That is, in the one cycle following the issue of the bus permission signal to the master in the Request phase, the master outputs all control signals, such as the address signal, together with the data signals, and completes the transfer. When the data transfer ends, the master deasserts the bus request signal. The bus arbiter can then assign the bus usage right to another processor element that is requesting the bus. The processor element that is selected by the master-side selection signal MSEL, and whose slave-side selection signal SSEL has therefore become active, becomes the slave and latches all transferred data, including the control signals and data signals, at the clock edge at the end of the Transfer phase.
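The relationship between the two phases, with a grant issued in one cycle and the one-cycle transfer in the next, can be sketched as a small cycle-level simulation. This is an illustration under assumptions rather than the timing of FIG. 5: the function name `run_bus` is hypothetical, larger priority values are assumed to win, ties go to the lowest index, and an element that is transferring in a given cycle is assumed not to request again until the following cycle.

```python
from collections import deque


def run_bus(queues, priorities, n_cycles):
    """Simulate Request/Transfer phases on the first shared bus (sketch).

    queues:     dict mapping element index -> deque of pending payloads.
    priorities: dict mapping element index -> priority value (assumed: larger wins).
    Returns a list of (cycle, master index, payload) for each completed transfer.
    """
    granted = None   # grant latched at the previous clock edge
    log = []
    for cycle in range(n_cycles):
        # Transfer phase: the master granted in the previous cycle sends one item.
        if granted is not None:
            log.append((cycle, granted, queues[granted].popleft()))
        # Request phase: elements with pending data request the bus.
        # Assumption: the element transferring now re-requests only next cycle.
        requesters = [pe for pe, q in queues.items() if q and pe != granted]
        if requesters:
            granted = max(requesters, key=lambda pe: (priorities[pe], -pe))
        else:
            granted = None
    return log
```

With two elements contending, the higher-priority one transfers first and the other follows in the next cycle, which shows how the bus usage right can change hands every cycle.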

上述のように、第１の共有バスで制御系データの転送するため、バスアービタ１０５は、
サイクルごとにバス使用権の切り替えを行い、バス許可信号がアクティブになった次のサ
イクルが、そのマスタにバスの使用権がある期間である。したがって、各プロセサ要素は
、制御系データの転送要求の発生ごとにバスアービタに対してバス要求する必要があり、
また、バス許可に基づき、制御系データの種類に対応して、次に示す書込み転送，読出し
返送要求転送，返送書込み転送または割込み要求転送を１サイクルごとのマルチプレクス
モードでそれぞれ行う。 As described above, because the first shared bus carries control system data, the bus arbiter 105 switches the bus usage right every cycle, and the cycle following the one in which the bus permission signal becomes active is the period in which that master holds the bus. Each processor element must therefore issue a bus request to the bus arbiter every time a transfer request for control system data occurs. Once granted the bus, it performs one of the following transfers, depending on the type of control system data, in a multiplex mode with one transfer per cycle: write transfer, read return request transfer, return write transfer, or interrupt request transfer.

書込み転送において、プロセサ要素は、制御系データの転送要求に対応して第１の共有バ
スのバス要求信号を出力しバス許可信号の入力に応じてマスタとして制御系データを転送
出力し第１の共有バスを介して選択信号に基づき選択されスレーブとして制御系データを
入力し制御信号およびアドレス信号に基づきメモリ書込みを行う。 In a write transfer, a processor element outputs the bus request signal of the first shared bus in response to a transfer request for control system data and, on receiving the bus permission signal, outputs the control system data as a master. The processor element selected as a slave by the selection signal via the first shared bus receives the control system data and writes it to memory based on the control signals and the address signal.

読出し返送要求転送において、プロセサ要素は、返送先コードを含む制御系データの転送
要求に対応して第１の共有バスのバス要求信号を出力しバス許可信号の入力に応じてマス
タとして返送先コードを転送出力し、第１の共有バスを介して選択信号に基づき選択され
スレーブとして返送先コードを入力し制御信号およびアドレス信号に基づきメモリデータ
を読み出して制御系データとし返送要求を行う。 In a read return request transfer, a processor element outputs the bus request signal of the first shared bus in response to a transfer request for control system data that includes a return destination code and, on receiving the bus permission signal, outputs the return destination code as a master. The processor element selected as a slave by the selection signal via the first shared bus receives the return destination code, reads memory data based on the control signals and the address signal, and issues a return request with the read data as control system data.

返送書込み転送において、プロセサ要素は、読出し返送要求転送の返送要求に対応して第
１の共有バスのバス要求信号を出力しバス許可信号の入力に応じてマスタとして返送先コ
ード対応の選択信号を転送出力し、第１の共有バスを介して選択信号に基づき選択されス
レーブとして制御系データを入力し制御信号およびアドレス信号に基づきメモリ書込みを
行う。 In a return write transfer, the processor element outputs the bus request signal of the first shared bus in response to the return request raised by the read return request transfer and, on receiving the bus permission signal, outputs as a master the selection signal corresponding to the return destination code. The processor element selected as a slave by that selection signal via the first shared bus receives the control system data and writes it to memory based on the control signals and the address signal.

図６は、これら読出し返送要求転送および返送書込み転送のリンク動作のシーケンスを説
明するための説明図である。ここで、分図（ａ），（ｂ），（ｃ）は、ステップ１，２，
３の動作をそれぞれ示す。図６に示すように、まず、図６（ａ）のステップ１において、
プロセサ要素０１が、返送先コードを含む制御系データの転送要求に対応してマスタとな
り、読み出したいメモリのアドレスＲＡＤＤＲおよび返送先コードを転送する。このとき
、制御信号の１つであるライトイネーブル信号ＭＷＥをインアクティブにすることで読出
し返送要求であることをスレーブのプロセサ要素０２へ伝える。データ出力信号（ＭＤＢ
Ｏ）には、要求元がプロセサ要素０１であることがわかる情報を転送する。 FIG. 6 is an explanatory diagram of the sequence in which the read return request transfer and the return write transfer operate together. Parts (a), (b), and (c) of the figure show steps 1, 2, and 3, respectively. First, in step 1 of FIG. 6(a), the processor element 01 becomes a master in response to a transfer request for control system data that includes a return destination code, and transfers the address RADDR of the memory location to be read together with the return destination code. At this time, it makes the write enable signal MWE, one of the control signals, inactive to tell the slave processor element 02 that this is a read return request. The data output signal (MDBO) carries information identifying the processor element 01 as the requester.

次に、図６（ｂ）のステップ２において、プロセサ要素０２が内部メモリからデータを読
み出し返送要求を行う。この期間、同時にバスの使用権は解放され、バスアービタは他の
データ転送をバスに割り当てることができる。 Next, in step 2 of FIG. 6(b), the processor element 02 reads the data from its internal memory and issues a return request. During this period the bus usage right is released, so the bus arbiter can allocate the bus to other data transfers.

次に、図６（ｃ）ステップ３において、今度はプロセサ要素０２がマスタとなり、読出し
データを返送先コード対応の転送先へ返送する。このとき、制御信号の１つであるレスポ
ンス信号ＭＲＥＳをアクティブにすることで転送データが読み出しデータであることを伝
えることができる。または、アドレス信号を読み出しデータ返送用の専用アドレスにする
ことで伝えてもよい。 Next, in step 3 of FIG. 6 (c), the processor element 02 now becomes the master, and returns the read data to the transfer destination corresponding to the return destination code. At this time, it is possible to inform that the transfer data is read data by activating the response signal MRES which is one of the control signals. Alternatively, the address signal may be transmitted by using a dedicated address for returning read data.
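The three-step sequence of FIG. 6 amounts to a split read built from two write-only transfers. The following sketch models it under several assumptions that the source does not fix: the return destination code is simply the requester's index carried in MDBO, the returned data is sent with the same address RADDR, MRES alone marks a return write, and the class and field names (`Xfer`, `ProcessorElement`, `returned`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Xfer:
    """One first-shared-bus transfer (assumed field layout)."""
    sel: int     # MSEL: destination element index
    mwe: bool    # MWE: write enable
    mres: bool   # MRES: response flag for returned read data
    addr: int    # MADDR
    data: int    # MDBO


class ProcessorElement:
    def __init__(self, index: int):
        self.index = index
        self.memory = {}     # local memory written by ordinary write transfers
        self.returned = {}   # read data received through return write transfers

    def read_request(self, target: int, addr: int) -> Xfer:
        # Step 1: MWE inactive marks a read return request; MDBO names the requester.
        return Xfer(sel=target, mwe=False, mres=False, addr=addr, data=self.index)

    def receive(self, x: Xfer) -> Optional[Xfer]:
        if x.mres:
            # Step 3 (requester side): store the returned read data.
            self.returned[x.addr] = x.data
            return None
        if x.mwe:
            # Ordinary write transfer.
            self.memory[x.addr] = x.data
            return None
        # Step 2 (responder side): read memory and prepare the return write,
        # to be sent later when this element is granted the bus as master.
        return Xfer(sel=x.data, mwe=False, mres=True, addr=x.addr, data=self.memory[x.addr])
```

Between the request and the reply the bus carries no state for the pending read, which is why the arbiter is free to schedule unrelated transfers in step 2.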

これら書込み転送，読出し返送要求転送または返送書込み転送において、プロセサ要素は
、第１の共有バスを介して選択信号に基づき選択されスレーブとして、内部プロセッサの
内部割込み処理でなく専用のメモリ制御部により、制御信号およびアドレス信号に基づき
メモリ書込みまたはメモリ読出し返送要求を行う。これにより、プロセサ要素間のデータ
転送が高速化し、プロセサ要素の内部プロセッサのデータ処理効率が向上する。 In these write, read return request, and return write transfers, the processor element selected as a slave by the selection signal via the first shared bus performs the memory write or the memory read and return request with a dedicated memory control unit, based on the control signals and the address signal, rather than through internal interrupt processing by its internal processor. This speeds up data transfer between processor elements and improves the data processing efficiency of the internal processor of each processor element.

また、割込み要求転送において、プロセサ要素が、図４に示した第１の共有バスのインタ
フェースのマスタ出力手段およびスレーブ入力手段により、割込み要求を含む制御系デー
タの転送要求に対応して第１の共有バスのバス要求信号を出力しバス許可信号の入力に応
じてマスタとして割込み要求を転送出力し、第１の共有バスを介して選択信号に基づき選
択されスレーブとして割込み要求を入力し制御信号およびアドレス信号に基づき割込み要
求内部割込み処理を行う。このとき、必要に応じて、マスタにおいて、割込み要求として
割込み要因および転送元コードを転送し、スレーブにおいて、割込み要因に対応した内部
割込み処理を行い、その終了時に、処理結果をマスタとして転送元コードに応じて書込み
転送することもできる。 In an interrupt request transfer, a processor element uses the master output unit and slave input unit of the first shared bus interface shown in FIG. 4. It outputs the bus request signal of the first shared bus in response to a transfer request for control system data that includes an interrupt request and, on receiving the bus permission signal, outputs the interrupt request as a master. The processor element selected as a slave by the selection signal via the first shared bus receives the interrupt request and performs the requested internal interrupt processing based on the control signals and the address signal. If necessary, the master can transfer an interrupt factor and a transfer source code as the interrupt request; the slave then performs the internal interrupt processing corresponding to the interrupt factor and, when it finishes, acts as a master and write-transfers the processing result to the destination indicated by the transfer source code.
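Carrying an interrupt as ordinary control system data requires the master output unit to encode it and the slave input unit to decode it. A minimal sketch follows; the 8-bit field widths, the bit layout, and the function names are assumptions chosen for illustration, since the source does not give the encoding.

```python
FIELD_BITS = 8                     # assumed width of each field
FIELD_MASK = (1 << FIELD_BITS) - 1


def encode_interrupt(factor: int, source: int) -> int:
    """Master output unit: pack interrupt factor and transfer source code into one word."""
    if not (0 <= factor <= FIELD_MASK and 0 <= source <= FIELD_MASK):
        raise ValueError("factor and source must fit in the assumed 8-bit fields")
    return (source << FIELD_BITS) | factor


def decode_interrupt(word: int):
    """Slave input unit: recover (factor, source) from a received control word."""
    return word & FIELD_MASK, (word >> FIELD_BITS) & FIELD_MASK
```

The decoded source code is what would let the slave address its result back to the requester with an ordinary write transfer once the interrupt processing finishes.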

この割込み要求転送により、割込み要求信号の専用線を用いず、データ生成終了タイミン
グなどをＣＰＵに知らせることができ、割込み要求信号の追加・変更・削除、または、プ
ロセサ要素の物理的な追加・変更・削除が行われても、他のプロセサ要素および共有バス
の設計変更点を最小にできる。 With this interrupt request transfer, events such as the end of data generation can be reported to the CPU without dedicated interrupt request lines. Even when interrupt request signals are added, changed, or deleted, or processor elements are physically added, changed, or removed, design changes to the other processor elements and the shared bus can be kept to a minimum.

図７は、本実施形態のマルチプロセサシステムにおける第２の共有バスを介したデータ転
送を説明するための説明図である。第２の共有バスを介したデータ転送は、予め転送方向
が決まっている入出力データの転送を想定している。このため、転送を行うプロセサ要素
は限定され、マスタまたはスレーブの一方は、そのマルチプロセサシステムのホストプロ
セサ，ＤＭＡコントローラまたはメインメモリとなる可能性が高い。また、転送方向も書
込み転送に限定され、マスタからスレーブへの一方向の転送とする。また、第２の共有バ
スを介したデータ転送では、転送方向および転送トラフィックに応じて、図７（ａ）に示
すような１対１の転送に限定する個別バスの形態と、図７（ｂ）に示すような複数の転送
間で共有する共有バスの形態とが考えられる。これにより、読み出し転送を考慮した場合
の回路構成に比べて、回路規模を削減できる。しかしながら、図７（ｃ）に示すようなマ
スタおよびスレーブ間の両方向の転送を考慮した構成でも何ら問題はない。 FIG. 7 is an explanatory diagram of data transfer via the second shared bus in the multiprocessor system of this embodiment. Data transfer via the second shared bus assumes input/output data whose transfer direction is determined in advance. The processor elements that take part are therefore limited, and either the master or the slave is likely to be the host processor, a DMA controller, or the main memory of the multiprocessor system. The transfer direction is also limited to write transfers, that is, one-way transfers from master to slave. Depending on the transfer direction and the transfer traffic, the second shared bus can take the form of an individual bus limited to one-to-one transfer, as shown in FIG. 7(a), or a shared bus shared by several transfers, as shown in FIG. 7(b). This reduces the circuit scale compared with a circuit configuration that also supports read transfers. A configuration that supports transfers in both directions between master and slave, as shown in FIG. 7(c), may nevertheless be used.

図８は、第２の共有バス回路２００の内部構成例および周辺接続例を示すブロック図であ
る。図８を参照すると、第２の共有バス回路２００は、マルチプレクサ，デマルチプレク
サを備え、従来と同じく、マルチプレクサは、第２の共有バスのマスタとして動作するプ
ロセサ要素から選択信号ＭＳＥＬ，制御信号ＭＷＥ，アドレス信号ＭＡＤＤＲおよび制御
系データＭＤＡＴＡをそれぞれ入力し、第２のアービタ１０５からの第２の共有バスのバ
ス使用権に対応した信号により第２の共有バスへ選択的に切り替え出力し、第２の共有バ
スを介して、第２の共有バスのスレーブとして動作するプロセサ要素へ出力し、デマルチ
プレクサは、スレーブとして動作するプロセサ要素から第２の共有バスを介して制御信号
ＳＲＥＡＤＹを入力し、マスタとして動作する転送先のプロセサ要素へそれぞれ切り替え
分配する。 FIG. 8 is a block diagram showing an example of the internal configuration and peripheral connections of the second shared bus circuit 200. Referring to FIG. 8, the second shared bus circuit 200 includes a multiplexer and a demultiplexer. As in the conventional art, the multiplexer receives the selection signal MSEL, the control signal MWE, the address signal MADDR, and the control system data MDATA from each processor element operating as a master of the second shared bus, selectively switches them onto the second shared bus according to a signal from the second arbiter 105 that corresponds to the bus usage right, and outputs them via the second shared bus to the processor element operating as a slave. The demultiplexer receives the control signal SREADY via the second shared bus from the processor element operating as a slave and distributes it to the destination processor element operating as the master.

上述のように、図８では、第２の共有バス回路２００が、論理合成等のインプリメント容
易性を考慮して、ＭＵＸ型のバス構成となっているが、第１の共有バスと同様に、インプ
リメント容易性および動作遅延の見積りが許せば、３−ｓｔａｔｅ型のバス構成でも何ら
問題はない。 As described above, in FIG. 8, the second shared bus circuit 200 has a MUX type bus configuration in consideration of the ease of implementation such as logic synthesis, but as with the first shared bus, There is no problem with the 3-state type bus configuration if the ease of implementation and the estimation of the operation delay allow.

複数のプロセサ要素０１〜０ｎの少なくとも１つが、従来と同じく、入出力データの転送
要求に対応して、第２の共有バスのバス要求信号ＭＲＥＱを出力し、バス許可信号ＭＧＲ
ＡＮＴの入力に応じてマスタとして制御信号ＭＷＥ，アドレス信号ＭＡＤＤＲを出力し制
御信号ＭＲＥＡＤＹに応じて入出力データＭＤＡＴＡをバースト転送し、また、複数のプ
ロセサ要素０１〜０ｎの少なくとも１つが、従来と同じく、第２の共有バスを介して接続
されたスレーブとして、制御信号ＳＷＥ，アドレス信号ＳＡＤＤＲを入力し制御信号ＳＲ
ＥＡＤＹを出力して入出力データＳＤＡＴＡをバースト転送する。 As in the conventional art, at least one of the processor elements 01 to 0n outputs the bus request signal MREQ of the second shared bus in response to a transfer request for input/output data and, on receiving the bus permission signal MGRANT, acts as a master: it outputs the control signal MWE and the address signal MADDR and burst-transfers the input/output data MDATA in accordance with the control signal MREADY. Also as in the conventional art, at least one of the processor elements 01 to 0n acts as a slave connected via the second shared bus: it receives the control signal SWE and the address signal SADDR, outputs the control signal SREADY, and burst-transfers the input/output data SDATA.

第２のバスアービタ２０５は、従来と同じく、複数のプロセサ要素０１〜０ｎから第２の
共有バスのバス要求信号をサイクルごとにそれぞれ受付け、最も優先度の高いプロセサ要
素に対し第２の共有バスのバス許可信号を発行してバス使用権を調停する。 As in the conventional art, the second bus arbiter 205 accepts bus request signals for the second shared bus from the processor elements 01 to 0n every cycle, issues the bus permission signal of the second shared bus to the processor element with the highest priority, and thereby arbitrates the bus usage right.

なお、第２の共有バスを共有化しない場合は、従来と同じく、第２の共有バス回路２００
および第２のバスアービタ２０５は不要であり、個別バスにより、マスタおよびスレーブ
間が接続される。図９は、第２の共有バスによる入出力データの転送例を示すタイミン
グ図である。図９に示すように、第２の共有バスによるデータ転送は、例えば、アドレス
信号ＭＡＤＤＲおよび制御信号ＭＷＥが発行された次のサイクルに、そのアドレスに対応
するデータ信号ＭＤＡＴＡが出力される。まず、転送を行いたいプロセサ要素はバス要求
信号ＭＲＥＱを発行し、タイミングＴ２でバスの使用権を要求する。次に、タイミングＴ
４で、マスタがアクティブであるバス許可信号ＭＧＲＡＮＴをラッチしたら、アドレス信
号ＭＡＤＤＲを出力し、制御信号ＭＷＥをアクティブにする。スレーブは、制御信号ＭＷ
Ｅがアクティブであることを認識し、次のタイミングＴ５で、アドレス信号ＭＡＤＤＲを
ラッチする。同時に、マスタはアドレス信号ＭＡＤＤＲに対応するデータ信号ＭＤＡＴＡ
を出力する。スレーブは、ラッチしたアドレスに対して書き込めるかどうかのレスポンス
を、制御信号ＳＲＥＡＤＹとして返送する。マスタがアクティブである制御信号ＳＲＥＡ
ＤＹをラッチしたタイミングＴ６で転送は完了し、スレーブ側で転送データの書込み処理
が行われる。また、読み出し転送を考慮したバスの場合には、スレーブは、アクティブな
制御信号ＳＲＥＡＤＹの発行と同時に読み出しデータの返送を行う。 When the second shared bus is not shared, the second shared bus circuit 200 and the second bus arbiter 205 are unnecessary, as in the conventional art, and the master and slave are connected by an individual bus. FIG. 9 is a timing chart showing an example of input/output data transfer over the second shared bus. As shown in FIG. 9, in data transfer over the second shared bus, for example, the data signal MDATA corresponding to an address is output in the cycle after the address signal MADDR and the control signal MWE are issued. First, the processor element that wants to transfer issues the bus request signal MREQ and requests the bus usage right at timing T2. Next, at timing T4, when the master latches the active bus permission signal MGRANT, it outputs the address signal MADDR and makes the control signal MWE active. The slave recognizes that the control signal MWE is active and latches the address signal MADDR at the next timing, T5. At the same time, the master outputs the data signal MDATA corresponding to the address signal MADDR. The slave returns, as the control signal SREADY, a response indicating whether the latched address can be written. The transfer completes at timing T6, when the master latches the active control signal SREADY, and the slave writes the transferred data. On a bus that also supports read transfers, the slave returns the read data at the same time as it issues the active control signal SREADY.
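The SREADY handshake that paces a burst write can be sketched behaviorally. The sketch below is a simplification under assumptions, not the timing of FIG. 9: the function name `burst_write` is hypothetical, the address is assumed to advance by one per word, the address cycle is not modeled separately, and a `max_cycles` guard is added so the loop terminates if the slave never becomes ready.

```python
def burst_write(memory, start_addr, words, is_ready, max_cycles=1000):
    """Write a burst of words, holding each word until the slave signals ready.

    memory:   dict standing in for the slave's memory.
    is_ready: callable(cycle) -> bool modelling the slave's SREADY response.
    Returns the number of data cycles used.
    Assumption: consecutive words go to consecutive addresses.
    """
    cycle = 0
    for offset, word in enumerate(words):
        while True:
            cycle += 1
            if cycle > max_cycles:
                raise TimeoutError("slave never became ready")
            if is_ready(cycle):
                # The master sees SREADY active: this word's transfer completes.
                memory[start_addr + offset] = word
                break
    return cycle
```

A slave that is ready only on every second cycle doubles the burst length in this model, which illustrates why heavy input/output traffic is kept off the single-cycle first shared bus.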

図１０は、上述した本実施形態のマルチプロセサシステムを具体的なＷ−ＣＤＭＡディジ
タルベースバンドＬＳＩに適用した実施例を示すブロック図である。この実施例のマルチ
プロセサシステムは、Ｗ−ＣＤＭＡディジタルベースバンドＬＳＩのシステム全体を制御
するプロセサ要素のＣＣＰＵ３００，制御系データ用メモリ３０１，入出力データ用メモ
リ３０２，入出力データ転送用ＤＭＡコントローラ３０３と、ディジタルベースバンドＬ
ＳＩの各処理を行うプロセサ要素０１〜０８，デバグ用処理要素１０と、主に制御系デー
タを転送し全てのプロセサ要素間で双方向に転送可能な第１の共有バス回路１００と、主
に転送経路の決まっている受信データ，送信データ用の入出力データを転送する第２の共
有バス回路２００，２０１と、それぞれの共有バスの使用権を調停する第１，第２のバス
アービタ１０５、２０５と、それぞれの共有バスおよびＣＣＰＵバスの間のブリッジ回路
１１０，２１０とから構成される。ここで、プロセサ要素０７，０８は、ＨＳＤＰＡ処理
，ＧＳＭ処理の拡張プロセサ要素として存在する。 FIG. 10 is a block diagram showing an example in which the multiprocessor system of this embodiment is applied to a specific W-CDMA digital baseband LSI. The multiprocessor system of this example comprises a CCPU 300, a processor element that controls the entire W-CDMA digital baseband LSI system; a control system data memory 301; an input/output data memory 302; a DMA controller 303 for input/output data transfer; processor elements 01 to 08 that perform the individual processes of the digital baseband LSI; a debug processing element 10; a first shared bus circuit 100 that mainly transfers control system data and allows bidirectional transfer between all processor elements; second shared bus circuits 200 and 201 that mainly transfer input/output data with fixed transfer paths, for received data and transmission data respectively; first and second bus arbiters 105 and 205 that arbitrate the usage rights of the respective shared buses; and bridge circuits 110 and 210 between the respective shared buses and the CCPU bus. The processor elements 07 and 08 are extension processor elements for HSDPA processing and GSM processing.

この実施例のマルチプロセサシステムにおいては、基本的には、プロセサ要素ＣＣＰＵ３
００がマスタとなって、第１の共有バスを介して各プロセサ要素０１〜０８を制御するこ
とでシステムを実現している。しかしながら、ＣＣＰＵ３００以外の各プロセサ要素０１
〜０８も第１の共有バスのマスタになることができ、従来バスであればスレーブ同士とな
るプロセサ要素０１〜０８間の転送もＣＣＰＵ３００を介さずに直接行うことができる。
具体的には、プロセサ要素０１〜０８間における動作タイミング信号，パラメータ信号，
ステータス信号および割込み信号など制御系データの転送を、第１の共有バスを用いて直
接行う。 In the multiprocessor system of this example, the system is basically realized with the processor element CCPU 300 acting as master and controlling the processor elements 01 to 08 via the first shared bus. However, each of the processor elements 01 to 08 other than the CCPU 300 can also become a master of the first shared bus, so transfers between the processor elements 01 to 08, which would all be slaves on a conventional bus, can be performed directly without going through the CCPU 300. Specifically, control system data such as operation timing signals, parameter signals, status signals, and interrupt signals is transferred directly between the processor elements 01 to 08 over the first shared bus.

各々のプロセサ要素で処理される入出力データの転送は、第２の共有バスを介して行われ
る。図１０に示した例では、第２の共有バスはその転送データの転送方向から２本に分離
されており、主に受信データ，送信データ用の入出力データを転送する第２の共有バス回
路２００，２０１が存在する。受信データ用の第２の共有バス回路２００では、マスタと
して接続されているプロセサ要素０５，０７，０８であるＦＥＣ，ＨＳＤＰＡ，ＧＳＭか
ら、スレーブとして接続されているプロセサ要素である入出力データ用メモリ３０２に対
して受信データの転送が行われる。また、送信データ用の第２の共有バス回路２０１では
、マスタとして接続されているプロセサ要素０４であるＤＥＭおよびブリッジ回路２１０
経由のＤＭＡコントローラ３０３から、スレーブとして接続されているプロセサ要素０５
，０６，０８であるＦＥＣ，ＭＯＤ，ＧＳＭに対して、復調データおよび送信データの転
送が行われる。 Input/output data processed by each processor element is transferred via the second shared bus. In the example shown in FIG. 10, the second shared bus is split in two according to the direction of the transferred data, giving second shared bus circuits 200 and 201 that mainly transfer input/output data for received data and transmission data, respectively. On the second shared bus circuit 200 for received data, received data is transferred from the processor elements 05, 07, and 08 connected as masters (FEC, HSDPA, and GSM) to the input/output data memory 302, the processor element connected as a slave. On the second shared bus circuit 201 for transmission data, demodulated data and transmission data are transferred from the processor element 04 connected as a master (DEM) and from the DMA controller 303, via the bridge circuit 210, to the processor elements 05, 06, and 08 connected as slaves (FEC, MOD, and GSM).

効果としては、転送トラフィックの大きい入出力データ転送中に、複雑な制御系データの
転送が発生した場合でも、制御系データおよび入出力データの転送を異なるバスを用いて
行うことにより、柔軟なシステムが実現できる。例えば、ＣＰＵ３００による制御系デー
タの転送と、ＤＭＡコントローラ３０３による入出力データの転送を同時に行うことが可
能になる。また、他のプロセサ要素間においても同様に、転送データ数の多い入出力デー
タの転送中に制御系データを転送することが可能になる。 As a result, even when a complicated control system data transfer occurs during a heavy input/output data transfer, a flexible system can be realized because control system data and input/output data are transferred over different buses. For example, the CPU 300 can transfer control system data while the DMA controller 303 transfers input/output data. Likewise, control system data can be transferred between other processor elements while a large input/output data transfer is in progress.

また、プロセサ要素０７，０８がＨＳＤＰＡ処理，ＧＳＭ処理の拡張プロセサ要素として
接続されるように、本実施形態における第１，第２の共有バスを用いて全体のマルチプロ
セサシステムを構成することによって、第１，第２の共有バスの仕様をほとんど変更する
ことなく、このような拡張プロセサ要素の追加・変更・削除にも柔軟に対応可能である。 Further, by configuring the entire multiprocessor system using the first and second shared buses in this embodiment so that the processor elements 07 and 08 are connected as extended processor elements for HSDPA processing and GSM processing, It is possible to flexibly cope with the addition / change / deletion of such an extended processor element without substantially changing the specifications of the first and second shared buses.

図１１，図１２は、本発明のマルチプロセサシステムの実施形態２における各プロセサ要
素，デバグ用処理要素の構成の一部をそれぞれ示す部分ブロック図である。 11 and 12 are partial block diagrams respectively showing part of the configuration of each processor element and debugging processing element in the second embodiment of the multiprocessor system of the present invention.

本実施形態のマルチプロセサシステムの全体の構成は、図１に示した実施形態１のマルチ
プロセサシステムと同構成であり、各プロセサ要素およびデバグ用処理要素以外の各ブロ
ックも同構成であり、各プロセサ要素およびデバグ用処理要素の内部構成が異なる。 The overall configuration of the multiprocessor system of this embodiment is the same as that of the multiprocessor system of the first embodiment shown in FIG. 1, and every block other than the processor elements and the debug processing element is also the same; only the internal configurations of the processor elements and the debug processing element differ.

図１１を参照すると、本実施形態のマルチプロセサシステムにおける各プロセサ要素０１
〜０ｎは、第１の共有バスに接続するためのインタフェース回路２１と、種々の演算およ
びプロセサ要素内のコントロールを行うＤＳＰおよびＭＰＵなどの内部プロセサ２２と、
内部プロセサ２２の命令コードを格納している命令コード記憶装置２３と、命令アドレス
のトレース機能をもつアドレストレーサ２４とを備える。もちろん、各プロセサ要素内に
は、これらの他にもデータ処理を行う専用ハードウェアアクセラレータ、各種レジスタお
よびメモリなどの記憶装置などを備えていても何ら問題はない。 Referring to FIG. 11, each of the processor elements 01 to 0n in the multiprocessor system of this embodiment includes an interface circuit 21 for connection to the first shared bus, an internal processor 22 such as a DSP or MPU that performs various operations and controls the processor element, an instruction code storage device 23 that stores the instruction code of the internal processor 22, and an address tracer 24 that traces instruction addresses. Each processor element may of course also include a dedicated hardware accelerator for data processing and storage such as registers and memories.

これら各プロセサ要素０１〜０ｎの動作について説明すると、まず、本実施形態のマルチ
プロセサシステム全体およびプロセサ要素単体の仕様により、その処理がデバグルーチン
に入った場合、ＤＳＰおよびＭＰＵなどの内部プロセサ２２は、アドレストレーサ２４に
対する制御信号Ｃｏｎｔｒｏｌｓｉｇｎａｌｓを用いて、アドレストレーサ２４に対し
て命令アドレストレース開始を指示する。 The operation of the processor elements 01 to 0n is as follows. First, when processing enters a debug routine, as defined by the specifications of the overall multiprocessor system and of the individual processor element, the internal processor 22, such as a DSP or MPU, instructs the address tracer 24 to start tracing instruction addresses by means of the control signals (Control signals) to the address tracer 24.

次に、アドレストレーサ２４は、命令コード記憶装置２３に対する命令アドレスを監視す
ることでトレースデータＴｒａｃｅＤａｔａを作成し、バスインタフェース回路２１に
トレースデータを転送する。このとき、トレースデータの生成法としては、読み出された
命令アドレス全てをそのまま転送する方法を用いても良いし、または、トレースデータの
転送データ数を削減するために、通常の動作シーケンスは単調増加のインクリメント方式
で行われることに着目し、それ以外の分岐またはウェイトなどのアドレスジャンプが発生
した場合にだけトレースデータを生成し、転送する方法を用いても何ら問題はない。 Next, the address tracer 24 creates trace data (Trace Data) by monitoring the instruction addresses presented to the instruction code storage device 23, and transfers the trace data to the bus interface circuit 21. The trace data may consist of every instruction address that is read, transferred as is. Alternatively, to reduce the amount of trace data transferred, the tracer may exploit the fact that a normal operation sequence increments the address monotonically, and generate and transfer trace data only when an address jump occurs, such as a branch or a wait.
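The second trace generation method, which records only address jumps, can be sketched as a simple filter over the instruction address stream. The function name `compress_trace`, the `step` parameter, and the choice to always record the first address as a starting point are assumptions made for illustration.

```python
def compress_trace(addresses, step=1):
    """Keep only the addresses that break the normal increment pattern (sketch).

    addresses: iterable of instruction addresses in fetch order.
    step:      assumed address increment of sequential execution.
    The first address is always recorded so the trace has a starting point
    (an assumption; the source does not say how a trace begins).
    """
    trace = []
    prev = None
    for addr in addresses:
        if prev is None or addr != prev + step:
            # Branch, wait (repeated address), or any other jump: record it.
            trace.append(addr)
        prev = addr
    return trace
```

A reader of the compressed trace can rebuild the full address sequence by incrementing from each recorded address, provided the count of sequential fetches between records is known or can be inferred; the source does not describe how that is done.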

Finally, the bus interface circuit 21 gives priority to normal transfer data Output Data whenever any is present and bus-transfers it first. Only when none remains, that is, in the gaps between normal data transfers, does it transfer the generated trace data to the first shared bus circuit 100, addressed to the DBGIF of the debugging processing element 10, which is the transfer destination. Specifically, the bus interface circuit 21 has FIFO buffers for data output to the first shared bus circuit 100, and it reads and transfers the trace data in the trace-data FIFO buffer only after the FIFO buffer for normal data transfer has become empty.
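The two-FIFO priority rule can be sketched as follows. The class name `BusInterface`, its attribute names, and the one-word-per-call granularity are hypothetical choices for illustration. The source only states that the trace FIFO is drained after the normal FIFO is empty.

```python
from collections import deque


class BusInterface:
    """Toy model of the output side of bus interface circuit 21 (names assumed)."""

    def __init__(self):
        self.normal_fifo = deque()  # normal transfer data (Output Data)
        self.trace_fifo = deque()   # trace data from the address tracer

    def next_transfer(self):
        """Pick the word to send in the next granted bus cycle, or None if idle."""
        if self.normal_fifo:                 # normal data always goes first
            return ("normal", self.normal_fifo.popleft())
        if self.trace_fifo:                  # trace data only fills the gaps
            return ("trace", self.trace_fifo.popleft())
        return None


bif = BusInterface()
bif.trace_fifo.extend([0x100, 0x200])
bif.normal_fifo.append("msg0")
sent = [bif.next_transfer() for _ in range(4)]
print(sent)
```

Even though the trace words were queued first, the normal word is sent before them, so trace traffic never delays normal transfers in this model.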

Referring to FIG. 12(a), the debugging processing element 10 in the multiprocessor system of this embodiment includes a receive unit that latches transfer data on the shared buses, a snooping unit that determines, for debugging, whether the transfer path on a shared bus satisfies the desired conditions, and two storage devices that store the data latched from the first and second shared buses, respectively. FIG. 12(b) shows an alternative configuration in which these two storage devices are merged into one that is shared by the first and second shared buses, with data written while a multiplexer switches between the buses.

The operation of the debugging processing element 10 is as follows. The receive unit latches data in two cases. The first is when the element operates as a slave on the first or second shared bus: it takes in the data when the transfer destination is the debugging processing element 10. The second is when it monitors the bus for debugging. In this case, when the snooping unit finds that the transfer path conditions BSEL and BDEC and the write address SADDR of data transferred on the first or second shared bus fall within the desired range, the receive unit takes in the data and writes it to the debugging storage device. The write address SADDR is the transferred address itself in normal slave operation, and the address specified by the snooping unit in bus monitor operation.
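The two latch cases of the receive unit can be sketched as follows. Several details are assumptions for illustration only: BSEL is treated as the master identifier and BDEC as the slave identifier, the snooping unit is assumed to assign sequential write addresses, and the names `SnoopFilter` and `receive` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnoopFilter:
    masters: set         # accepted BSEL values (assumed: master id)
    slaves: set          # accepted BDEC values (assumed: slave id)
    addr_lo: int         # lower bound of the monitored address range
    addr_hi: int         # upper bound of the monitored address range
    next_write: int = 0  # write pointer chosen by the snooping unit (assumption)

    def matches(self, bsel, bdec, saddr):
        return (bsel in self.masters and bdec in self.slaves
                and self.addr_lo <= saddr <= self.addr_hi)


def receive(memory, filt, my_id, bsel, bdec, saddr, data):
    """Store one bus transfer in debug memory; return the write address or None."""
    if bdec == my_id:                     # normal slave operation
        memory[saddr] = data              # use the transferred address as is
        return saddr
    if filt.matches(bsel, bdec, saddr):   # bus monitor operation
        waddr = filt.next_write           # address specified by the snooping unit
        memory[waddr] = data
        filt.next_write += 1
        return waddr
    return None                           # transfer is ignored


mem = {}
f = SnoopFilter(masters={1}, slaves={2}, addr_lo=0x10, addr_hi=0x1F)
print(receive(mem, f, 10, 3, 10, 0x40, "a"))  # addressed to the debug element
print(receive(mem, f, 10, 1, 2, 0x12, "b"))   # snooped transfer
print(receive(mem, f, 10, 1, 2, 0x30, "c"))   # outside the address range
```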

Although FIG. 12 shows the case where the data acquired by the debugging processing element 10 is written to a dedicated storage device, a configuration that outputs the data directly to the outside for monitoring, without writing it to a storage device, poses no problem.

In Embodiment 2, the trace data generated by the address trace function of each processor element is transferred over the first shared bus to a debugging storage device common to all processor elements, which reduces the trace memory that was conventionally implemented in each processor element. This is made possible by the property of the first shared bus that every processor element can become a master. Because the trace memory conventionally required per processor element can be reduced and consolidated into a common debugging storage device, the memory is used efficiently across the multiprocessor system.

In addition, because trace data is transferred during periods in which the shared bus is idle during normal operation, combining this with the bus monitor function has the further advantage that the data transferred between processor elements in normal operation and the address trace data of one or more processor elements can be monitored simultaneously. In other words, address trace information can be acquired during normal operation, with no need to stop operation first and read it out afterward. In particular, reducing the trace data by generating it only at branches and similar events improves the real-time performance of the address tracer.

Note also that, by the same principle as in Embodiment 2, not only address trace data but any debugging data signal inside a processor element during debugging can be transferred to the debugging processing element 10 over the first shared bus.

FIG. 13 is an overall block diagram showing Embodiment 3 of the multiprocessor system according to the present invention. Referring to FIG. 13, the overall configuration of the multiprocessor system of this embodiment adds, to the multiprocessor system of Embodiment 1 shown in FIG. 1, a synchronization circuit 30 inserted between each of the processor elements 01 to 0n and the first bus arbiter 105 and the first shared bus circuit 100. Every block other than the first bus arbiter 105 has the same configuration as the corresponding block in Embodiment 1; only the internal configuration of the first bus arbiter 105 differs. Although not shown, the system also includes a clock generation circuit that generates a bus clock signal synchronized with the basic clock signal of the processor elements 01 to 0n, at an integer multiple of the basic clock frequency chosen according to the transfer traffic on the first shared bus.

FIG. 14 is an explanatory diagram showing where the added synchronization circuits 30 are inserted in the multiprocessor system of this embodiment. An arbiter synchronization circuit 30a is inserted between each of the processor elements 01 to 0n and the first bus arbiter 105, and a slave synchronization circuit 30b is inserted between the slave input of each of the processor elements 01 to 0n and the first shared bus circuit 100. FIG. 15 is a block diagram showing configuration examples of the arbiter synchronization circuit 30a and the slave synchronization circuit 30b shown in FIG. 14, and FIG. 16 is a timing diagram showing the operation of the bus clock signal and the basic clock signal supplied to the arbiter synchronization circuit 30a and the slave synchronization circuit 30b of FIG. 15.

The arbiter synchronization circuit 30a includes an additional circuit that synchronizes the bus request signal MREQ, issued by each processor element in synchronization with the basic clock signal, to the bus clock signal and forwards it to the bus arbiter as the signal BREQ, and an additional circuit that synchronizes the bus grant signal BGRANT, issued by the first bus arbiter 105 in synchronization with the variable bus clock signal, to the basic clock signal and forwards it to the bus interface circuit 21 of each processor element as the signal MGRANT.

With these additional circuits, even though the bus request signal MREQ is issued in synchronization with the basic clock signal, once the bus grant signal BGRANT becomes active the bus request signal BREQ to the bus arbiter is made inactive for the remaining bus cycles within that basic clock cycle. Likewise, although the bus grant signal BGRANT is issued in synchronization with the bus clock signal, it can be held until the next rising edge of the basic clock signal and then passed to the bus interface circuit of the processor element.
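The behavior of the arbiter synchronization circuit within one basic clock cycle can be sketched as follows. This is a behavioral toy model under stated assumptions, not the circuit of FIG. 15: the function name `arbiter_sync`, the representation of one basic cycle as `ratio` bus cycles, and the exact cycle in which BREQ drops (the bus cycle after the grant) are all assumptions.

```python
def arbiter_sync(mreq, grant_cycle, ratio):
    """Model one basic clock cycle made of `ratio` bus cycles.

    mreq        -- MREQ level held for the whole basic clock cycle
    grant_cycle -- bus cycle index in which BGRANT is active, or None
    Returns (BREQ level per bus cycle, MGRANT seen at the next basic edge).
    """
    breq = []
    granted = False  # latch that holds BGRANT until the next basic clock edge
    for cycle in range(ratio):
        breq.append(mreq and not granted)  # BREQ is masked once a grant arrived
        if mreq and cycle == grant_cycle:
            granted = True
    return breq, granted


# Bus clock at 4x the basic clock, grant arrives in the second bus cycle.
print(arbiter_sync(True, 1, 4))
```

The request is withdrawn for the rest of the basic cycle after the grant, and the grant is still visible when the processor element samples it at the next basic clock edge.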

The slave synchronization circuit 30b includes an additional circuit that synchronizes the transfer data signals BSSEL, BADDR, BDBI, and so on, which arrive from the first shared bus circuit 100 in synchronization with the bus clock signal, to the basic clock signal and forwards them as the signals SSEL, SADDR, SDBI, and so on.

With this additional circuit, when a transfer to a slave is made from the first shared bus in synchronization with the bus clock signal, the transfer data can be held until the next rising edge of the basic clock, at which the bus interface circuit of the processor element latches the data. At most one transfer to the same slave occurs within one basic clock cycle.

The arbiter synchronization circuit 30a and the slave synchronization circuit 30b basically operate in synchronization with the variable bus clock signal, but when the bus clock signal is at its lowest frequency, that is, the frequency of the basic clock signal, the clock to their internal registers can be stopped.

FIG. 17 is a block diagram showing a configuration example of the first bus arbiter in the multiprocessor system of this embodiment. Referring to FIG. 17, the first bus arbiter in this embodiment adds two functions to the first bus arbiter of FIG. 1. The first is a mask function that accepts the bus request signal for the first shared bus from each of the processor elements, via the arbiter synchronization circuit, only once per basic clock cycle and masks it thereafter. The second is a delay function that, starting from the issuance of the bus grant signal for the first shared bus in each bus cycle, arbitrates the bus use right for the corresponding bus cycle of the next cycle and outputs the resulting signal.

With these mask and delay functions, for example, as shown in FIG. 17, when the bus clock frequency is 1, 2, or 4 times the basic clock frequency (30 MHz), the first shared bus circuit 100 is operated by the bus selection signal BSEL corresponding to the bus arbitration result from 1, 2, or 4 bus cycles earlier, respectively. Within a single basic clock cycle, at most one transfer can be made to the same processor element as a slave, so each processor element can always run on the basic clock regardless of the bus clock signal.
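The delay function can be pictured as a shift register whose depth equals the clock ratio. The sketch below is a simplified model under assumptions: the function name `delayed_bsel`, the `idle` placeholder for empty pipeline slots, and the representation of arbitration winners as plain labels are all hypothetical.

```python
from collections import deque


def delayed_bsel(arbitration_results, ratio, idle=None):
    """Delay each arbitration result by `ratio` bus cycles before it drives BSEL.

    ratio -- bus clock frequency divided by basic clock frequency (1, 2, or 4)
    """
    pipeline = deque([idle] * ratio, maxlen=ratio)
    bsel = []
    for winner in arbitration_results:
        bsel.append(pipeline[0])  # result decided `ratio` bus cycles ago
        pipeline.append(winner)   # maxlen drops the oldest entry automatically
    return bsel


print(delayed_bsel(["A", "B", "C", "D"], ratio=2))
```

With a ratio of 2, the winner chosen in bus cycle 0 drives BSEL in bus cycle 2, which corresponds to the same position in the next basic clock cycle.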

As described above, the first bus arbiter in the multiprocessor system of this embodiment operates the first shared bus circuit 100 at a bus clock frequency that is a constant multiple of the basic clock frequency, both to guarantee the transfer traffic of the first shared bus circuit 100 and to avoid the increase in circuit scale that physically adding buses would cause.

However, always running at the multiplied clock would increase the number of switching events in the circuit and hence the power consumption, so the bus clock is made variable. For example, the first shared bus circuit 100 is operated with a bus clock signal faster than the basic clock signal for processing routines with heavy transfer traffic and for the address trace during debugging described in Embodiment 2.

That is, the variable-clock first shared bus circuit 100 of this embodiment does not use dedicated hardware to monitor some signal continuously and switch clocks fully dynamically. Instead, a switching signal is issued by the CPU or the like, and the bus clock switches, only when the processing routine of the overall system enters a routine with heavy transfer traffic or when some other specific condition is met.

The only extra signal needed by the synchronization circuits 30 and the first bus arbiter is, for example, the start position signal sta shown in FIG. 16. With a basic clock signal of 30 MHz and a variable bus clock signal of 30 MHz, 60 MHz, or 120 MHz, the start position signal sta becomes active for just one cycle, in synchronization with the variable bus clock signal, at each rising edge of the basic clock signal. No additional circuit is needed on the data output side of the processor elements toward the first shared bus.
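The start position signal can be sketched as a pulse that marks the first bus cycle of every basic clock cycle. The function name `sta_waveform` and the assumption that bus cycle 0 coincides with a rising edge of the basic clock are illustrative choices, not taken from FIG. 16.

```python
BASIC_MHZ = 30  # basic clock frequency used in the example of the text


def sta_waveform(bus_mhz, bus_cycles):
    """Return the sta level (1 or 0) for each bus clock cycle.

    Assumes bus cycle 0 is aligned with a rising edge of the basic clock.
    """
    ratio, rem = divmod(bus_mhz, BASIC_MHZ)
    if rem or ratio < 1:
        raise ValueError("bus clock must be an integer multiple of the basic clock")
    # sta is active for exactly one bus cycle per basic clock cycle.
    return [1 if n % ratio == 0 else 0 for n in range(bus_cycles)]


for mhz in (30, 60, 120):
    print(mhz, sta_waveform(mhz, 8))
```

At 30 MHz the pulse is active in every bus cycle, which is consistent with the remark that the synchronization registers need no clocking when the bus runs at the basic clock frequency.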

In the multiprocessor system of this embodiment, making the bus operating clock variable (a constant multiple) allows transfer traffic to be guaranteed over a wide range while keeping the circuit scale overhead lower than physically increasing the number of buses. This makes it more likely that newly arising transfer traffic can be accommodated flexibly, which also improves extensibility. Furthermore, compared with a high-speed transfer bus whose operating clock is always fast, raising the clock only when needed reduces power consumption.

Moreover, controlling the switching of the bus operating clock at a macro, system level, for example by using a fast bus operating clock for transfer-heavy processing routines and during debugging, yields a system that is efficient in circuit scale and power consumption. Finally, when the shared bus operates in synchronization with the basic clock, the input clock to every register in the synchronization circuits 30 can be stopped, which further reduces power consumption.

FIG. 1 is an overall block diagram showing Embodiment 1 of the multiprocessor system according to the present invention.
FIG. 2 is an explanatory diagram illustrating data transfer via the first shared bus in FIG. 1.
FIG. 3 is a block diagram showing an example of the internal configuration and peripheral connections of the first shared bus circuit 100 in FIG. 1.
FIG. 4 is a partial block diagram showing part of the master-side and slave-side interfaces of the first shared bus in each processor element in FIG. 1.
FIG. 5 is a timing diagram showing an example of control system data transfer over the first shared bus in FIG. 3.
FIG. 6 is an explanatory diagram illustrating the sequence of the linked operation of read return request transfer and return write transfer over the first shared bus in FIG. 3.
FIG. 7 is an explanatory diagram illustrating data transfer via the second shared bus in FIG. 1.
FIG. 8 is a block diagram showing an example of the internal configuration and peripheral connections of the second shared bus circuit 200 in FIG. 1.
FIG. 9 is a timing diagram showing an example of input/output data transfer over the second shared bus in FIG. 8.
FIG. 10 is a block diagram showing an example in which the multiprocessor system of FIG. 1 is applied to a specific W-CDMA digital baseband LSI.
FIG. 11 is a partial block diagram showing part of the configuration of each processor element in Embodiment 2 of the multiprocessor system of the present invention.
FIG. 12 is a partial block diagram showing part of the configuration of the debugging processing element in Embodiment 2 of the multiprocessor system of the present invention.
FIG. 13 is an overall block diagram showing Embodiment 3 of the multiprocessor system according to the present invention.
FIG. 14 is an explanatory diagram showing where the added synchronization circuits 30 are inserted in the multiprocessor system of FIG. 13.
FIG. 15 is a block diagram showing configuration examples of the arbiter synchronization circuit 30a and the slave synchronization circuit 30b shown in FIG. 14.
FIG. 16 is a timing diagram showing the operation of the bus clock signal and the basic clock signal supplied to the arbiter synchronization circuit 30a and the slave synchronization circuit 30b of FIG. 15.
FIG. 17 is a block diagram showing a configuration example of the first bus arbiter in the multiprocessor system of FIG. 13.
FIG. 18 is a block diagram showing a configuration example of a conventional multiprocessor system.

Explanation of symbols

01 to 0n Processor elements
10 Debugging processing element
12-1, 12-2 Processors
13-1, 13-2 Bus controllers
15-1, 15-2 Adapters
16-1, 16-2, 16-3 Input/output devices
21 Bus interface circuit
22 Internal processor
23 Instruction code storage device
24 Address tracer
30, 30a, 30b Synchronization circuits
100 First shared bus circuit
105 First bus arbiter
110, 210 Bridge circuits
200, 201 Second shared bus
205 Second bus arbiter
300 CCPU
301 Control system data memory
302 Input/output data memory
303 DMA controller

Claims

A multiprocessor system comprising a plurality of processor elements, each capable of processing data, that acquire the right to use a first or second shared bus in response to a transfer request for control system data or input/output data and, as a master, perform multiplex transfer, in which the bus use right is switched every cycle, or burst transfer, wherein
a first processor element and a second processor element among the plurality of processor elements each have:
a master function of outputting a bus request signal for the first shared bus in response to the transfer request for the control system data and, as a master, transferring and outputting a transfer destination selection signal, a control signal, an address signal, and the control system data in one cycle in response to input of a bus permission signal; and
a slave function of being selected based on the selection signal via the first shared bus, inputting the control system data as a slave, and performing processing based on the control signal and the address signal;

and wherein the first processor element and the second processor element further have:
a first function of outputting a bus request signal for the first shared bus in response to a transfer request for control system data including a return destination code, and transferring and outputting the return destination code as a master in response to input of a bus permission signal; and
a second function of being selected based on the selection signal via the first shared bus, inputting the return destination code as a slave, reading memory data based on the control signal and the address signal, and issuing a return request as control system data,
thereby performing a read return request transfer between the processor elements; and
a third function of outputting a bus request signal for the first shared bus in response to the return request, and transferring and outputting the selection signal corresponding to the return destination code as a master in response to input of the bus permission signal; and
a fourth function of being selected based on the selection signal via the first shared bus, inputting the control system data as a slave, and performing a memory write based on the control signal and the address signal,
thereby performing a return write transfer between the processor elements.

The first processor element outputs a bus request signal for the first shared bus in response to a transfer request for control system data including a return destination code, and the return destination code as a master in response to an input of a bus permission signal. Transfer and output
The second processor element is selected based on the selection signal via the first shared bus, inputs the return destination code as a slave, reads memory data based on the control signal and the address signal, and controls system data A return request, and a read return request transfer between the processor elements,
The second processor element outputs a bus request signal of the first shared bus in response to the return request, and transfers and outputs a selection signal corresponding to the return destination code as a master in response to the input of the bus permission signal. ,
The first processor element is selected based on the selection signal via the first shared bus, inputs the control system data as a slave, performs memory writing based on the control signal and the address signal, and the processor element The multiprocessor system according to claim 1, wherein a return write transfer is performed.

The first processor element and the second processor element are:
A fifth function of outputting a bus request signal of the first shared bus in response to the transfer request of the control system data, and transferring and outputting the control system data as a master in response to an input of a bus permission signal;
And a sixth function for inputting the control system data as a slave through the first shared bus as a slave and writing to the memory based on the control signal and the address signal. 3. The multiprocessor system according to claim 1, wherein write transfer between elements is performed.

The first processor element is:
In response to a transfer request for the control system data, a bus request signal of the first shared bus is output, and the control system data is transferred and output as a master in response to an input of a bus permission signal,
The second processor element is
The control system data selected by the selection signal via the first shared bus is input as a slave, the memory is written based on the control signal and the address signal, and the write transfer between the processor elements is performed. The multiprocessor system according to claim 3.

The selection signal, the control signal, the address signal, and the control system data are input from the plurality of processor elements, respectively, and selectively to the first shared bus corresponding to the right to use the first shared bus. A first shared output for switching, selecting one of the plurality of processor elements as a slave based on the selection signal via a first shared bus, and outputting the control signal, the address signal, and the control system data A bus circuit;
A bus request signal is received from each of the plurality of processor elements for each cycle, and a bus permission signal for the first shared bus is issued to the processor element with the highest priority to arbitrate the bus use right for the next cycle. The multiprocessor system according to claim 1, further comprising: a bus arbiter.

The first shared bus circuit is
The selection signal, the control signal, the address signal, and the control system data are input from the plurality of processor elements, respectively, and selectively switched to the first shared bus corresponding to the right to use the first shared bus An output multiplexer;
A decoder that decodes the selection signal on a first shared bus and selects one of the plurality of processor elements as a transfer destination slave;
6. A demultiplexer that inputs the control signal, the address signal, and the control system data on a first shared bus, respectively, and switches and distributes to each of the transfer destination slaves corresponding to the output of the decoder. The described multiprocessor system.

The multiprocessor system according to any one of claims 1 to 6, wherein the processor element, when selected based on the selection signal via the first shared bus, performs as a slave the memory write or the memory read return request based on the control signal and the address signal by means of a dedicated memory control unit rather than as internal interrupt processing.

The multiprocessor system according to any one of claims 1 to 7, wherein the processor element outputs a bus request signal for the first shared bus in response to a transfer request for control system data including an interrupt request, transfers and outputs the interrupt request as a master in response to input of a bus permission signal, and, when selected based on the selection signal via the first shared bus, inputs the interrupt request as a slave and performs an interrupt request transfer in which internal interrupt processing corresponding to the interrupt request is carried out based on the control signal and the address signal.

The interrupt request includes an interrupt factor and a transfer source code.
The described multiprocessor system.

A debugging processing element for snooping the control system data and the input / output data on the first and second shared buses according to a match of a transfer path and an address range and storing the data in a debugging memory is provided. The multiprocessor system according to claim 9.

The processor element traces the address of the execution instruction of the internal processor and creates trace data as control system data. In response to the transfer request, the processor element outputs a bus request signal of the first shared bus. The multiprocessor system according to any one of claims 1 to 10, wherein the trace data is transferred and output as a master in response to an input.

The processor element traces the address of the execution instruction of the internal processor and creates trace data as control system data. In response to the transfer request, the processor element outputs a bus request signal of the first shared bus. Transfer and output the trace data as a master according to the input,
11. The debugging processing element is selected based on the selection signal via a first shared bus, inputs the trace data as a slave, and stores the trace data in a debugging memory based on the control signal and the address signal. The described multiprocessor system.

A clock generation circuit that generates a bus clock signal that is an integral multiple of the basic clock signal in response to the transfer traffic of the first shared bus in synchronization with the basic clock signal of the processor element;
A bus request signal for the first shared bus is input from the processor element and output in synchronization with the bus clock signal to the first bus arbiter, and a bus permission signal for the first shared bus is input from the first bus arbiter. An arbiter synchronization circuit that outputs the processor element in synchronization with the basic clock signal;
A slave synchronization circuit that inputs the selection signal, the control signal, the address signal, and the control system data via a first shared bus and outputs them to the processor element in synchronization with the basic clock signal;
A first bus arbiter receives a bus request signal of the first shared bus from each of the plurality of processor elements through the arbiter synchronization circuit only once every cycle of the basic clock signal, and each bus of the bus clock signal 6. The bus use right of each bus cycle of the next cycle is arbitrated by issuing a bus permission signal of the first shared bus to the processor element having the highest priority in the cycle via the arbiter synchronization circuit. The multiprocessor system according to any one of 12 above.

Operates as one of the plurality of processor elements, outputs a bus request signal of the second shared bus in response to the transfer request of the input / output data, and inputs / outputs the master as a master in response to the input of the bus permission signal A processor element for burst transfer of data;
A processor element that operates as one of the plurality of processor elements and that burst-transfers the input / output data as a slave connected via a second shared bus. The multiprocessor system described in 1.

A processor element operating as a master or slave of the second shared bus is selectively switched and connected to the second shared bus corresponding to the right to use the second shared bus;
A second shared bus circuit for burst-transferring the input / output data between a master and a slave via a second shared bus;
A bus request signal for the second shared bus is received for each cycle from the plurality of processor elements, and a bus permission signal for the second shared bus is issued to the processor element with the highest priority to arbitrate the bus use right. The multiprocessor system of claim 14, comprising a second bus arbiter.