JP2007280127A

JP2007280127A - Inter-processor communication method

Info

Publication number: JP2007280127A
Application number: JP2006106647A
Authority: JP
Inventors: Koichi Takeda; 浩一武田
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2006-04-07
Filing date: 2006-04-07
Publication date: 2007-10-25

Abstract

PROBLEM TO BE SOLVED: To communicate a large amount of data between processors at high speed in an inter-processor communication method for a controller with a loosely coupled multiprocessor configuration.

SOLUTION: Transfer data exchanged between processing units (PUs) is written to a local memory LM, which eliminates the overhead of accessing a shared memory SM. After the transfer data has been written, a block transfer instruction of a processor 1a or a DMA transfers it in bulk from the transfer-source local memory LM to the shared memory SM, achieving high-speed transfer of a large amount of data. Writing of the transfer data is deemed complete when a release-type synchronization primitive provided by an OS or the like is executed. Then, before the transfer data is read, it is transferred in bulk from the shared memory SM to the transfer-destination local memory LM by a block transfer instruction of the processor 1a or a DMA. Reading of the transfer data is deemed to start when an acquire-type synchronization primitive of the OS is executed.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to an inter-processor communication method for a system with a loosely coupled multiprocessor configuration that has a shared memory for inter-processor communication.

Conventionally, multiprocessor systems, in which a plurality of processors are mounted in one system and operated simultaneously, are divided into tightly coupled multiprocessor systems that share a main memory (shared memory system (a)) and loosely coupled multiprocessor systems in which each processor has its own main memory. Loosely coupled multiprocessor systems can be further classified into those in which a single address space is shared among the processors and each processor can access the memory (storage device) of the other processors (distributed shared memory system (b)), and those in which each processor has a separate address space (hereinafter simply referred to as the "loosely coupled multiprocessor system" (c)).

(a) In the shared memory system, each processor is equipped with a cache memory in order to reduce the overhead (extra time) of accessing the shared memory and contention for access to the shared memory. To maintain the data consistency of these cache memories, a special mechanism is provided (a coherent cache: a function that simultaneously rewrites or invalidates the shared memory and the cache memories of the other processors). Because of this mechanism, inter-processor communication is possible simply by writing to and reading from memory, which provides an easy-to-use programming model.

(b) In the distributed shared memory system, a processor can access its own local shared memory at high speed, but accessing the local shared memory of another processor incurs a large overhead. For this reason, the contents of another processor's local shared memory are often cached in the processor's own memory as needed. In this case as well, a special mechanism (coherent cache) is provided to maintain data consistency. When this mechanism is present, inter-processor communication is possible simply by writing to and reading from memory, which provides an easy-to-use programming model.

(c) A loosely coupled multiprocessor system with a separate address space for each processor, such as ordinary networked personal computers (hereinafter "PCs"), does not perform inter-processor communication using a shared memory. It therefore needs no mechanism for data consistency (coherent cache), and the hardware can be simplified.

Even among loosely coupled multiprocessors, there are systems (d) that have a shared memory for inter-processor communication in order to increase the bandwidth (that is, the transfer rate) of inter-processor communication (see, for example, Patent Document 1 below). In this case, inter-processor communication uses the shared memory, but because the configuration also accesses the main memory of other processors and does not cache the shared memory, no data consistency mechanism is needed.

In the conventional inter-processor communication method for a system configured as described in Patent Document 1 and the like, the sending processor stores a message in an inter-processor communication area on the shared memory, the sending processor issues an inter-processor communication interrupt request to the receiving processor, and the receiving processor retrieves the message from the inter-processor communication area.

Japanese Patent Laid-Open No. 11-120156

However, the conventional inter-processor communication methods have the following problems (a) to (d).

(a) The shared memory system implements a coherent cache, which complicates the system configuration. In addition, when a processor core with a cache licensed as intellectual property (hereinafter "IP") is used, a core that does not support a mechanism for maintaining cache data consistency cannot be used.

(b) The distributed shared memory system also implements a coherent cache and therefore has the same problems as the shared memory system.

(c) In the loosely coupled multiprocessor, inter-processor communication must be implemented not by reading and writing memory but by having a program control a communication device, which complicates the program. In addition, although it depends on the communication device, it is generally difficult to achieve high-bandwidth transfer with low latency (waiting time).

(d) The loosely coupled multiprocessor with a shared memory for inter-processor communication has the following problems.

Accessing the shared memory requires bus arbitration, so each access takes time. Consequently, when a large amount of data is to be transferred to another processor at high speed, the overhead is large; not only does the transfer take a long time, but processor time is also wasted.

In addition, the program must be aware of a special region, the inter-processor communication area, and must control inter-processor interrupts and the like, which makes the program complicated.

To solve these problems, an object of the present invention is to provide an inter-processor communication method that enables a large amount of data to be communicated between processors at high speed. The method is based on a loosely coupled multiprocessor configuration having a shared memory for inter-processor communication, does not complicate the system configuration or the program structure, and can be used even with a processor core IP that provides no mechanism for maintaining data consistency.

To solve the above problems, the invention according to claim 1 is an inter-processor communication method for a multiprocessor system in which a plurality of processors, each having a local memory, and a shared memory for inter-processor communication are connected by a shared bus, wherein, when data is transferred from one of the plurality of processors to another, the data transfer between the local memory and the shared memory is performed within an acquire-type synchronization primitive and a release-type synchronization primitive.

In the inventions according to claims 2 and 3, the transfer data is written to the local memory, which eliminates the overhead of shared memory access. After the transfer data has been written, it is transferred in bulk from the transfer-source local memory to the shared memory by a block transfer instruction of the processor or by direct memory access (hereinafter "DMA"), which achieves high-speed transfer of a large amount of data. Writing of the transfer data is deemed complete at the time a release-type synchronization primitive provided by the operating system (hereinafter "OS") or the like is executed (for example, the semaphore V operation, signal, or unlock, which are means of exclusive control). Then, before the transfer data is read, it is transferred in bulk from the shared memory to the transfer-destination local memory by a block transfer instruction of the processor or by DMA. Reading of the transfer data is deemed to start at the time an acquire-type synchronization primitive of the OS is executed (for example, the semaphore P operation, wait, or lock).
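The benefit of bulk transfer can be illustrated with a toy cost model. The following Python sketch is not taken from the patent; it assumes that every individual shared-memory access costs one bus arbitration and that a block transfer costs one arbitration for the whole burst. The names `SharedBus`, `send_direct`, and `send_via_local` are hypothetical.

```python
class SharedBus:
    """Toy model of a shared bus: every grant by the arbiter is counted."""

    def __init__(self, shared_size):
        self.shared_memory = bytearray(shared_size)
        self.arbitrations = 0

    def write_byte(self, offset, value):
        # Assumed cost model: one arbitration per individual access.
        self.arbitrations += 1
        self.shared_memory[offset] = value

    def block_write(self, offset, data):
        # Assumed cost model: one arbitration for the whole burst.
        self.arbitrations += 1
        self.shared_memory[offset:offset + len(data)] = data


def send_direct(bus, payload):
    """Conventional style: the program writes each item to shared memory."""
    for i, b in enumerate(payload):
        bus.write_byte(i, b)


def send_via_local(bus, payload):
    """Style described above: fill local memory, then one bulk transfer."""
    local_memory = bytearray(len(payload))
    for i, b in enumerate(payload):
        local_memory[i] = b           # local write, no bus access
    bus.block_write(0, local_memory)  # done when the release primitive runs


payload = bytes(range(64))
direct, bulk = SharedBus(64), SharedBus(64)
send_direct(direct, payload)
send_via_local(bulk, payload)
print(direct.arbitrations, bulk.arbitrations)  # 64 1
```

Under these assumptions both paths leave the same bytes in shared memory, but the bulk path arbitrates for the bus once instead of once per byte. Real hardware may split long bursts, so the ratio is illustrative only.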

According to the invention of claim 1, data transfer between the local memory and the shared memory is performed within the synchronization primitives (the acquire-type and release-type synchronization primitives), which provides the following effects (i) and (ii).

(i) While a processor is performing a series of computations, it writes to the local memory, so the program need not be aware of the shared memory and the writes are fast. When the series of computations ends and a synchronization primitive is executed, the results to be passed to other processors are written in bulk between the local memory and the shared memory, so this write is also fast.

(ii) Because the write operation between the local memory and the shared memory is built into the synchronization primitives, the main program does not need to describe writes between the local memory and the shared memory. The program can therefore be simplified.

According to the inventions of claims 2 and 3, the overhead of shared memory access is reduced and high-speed transfer of a large amount of data is achieved. Furthermore, tying the timing of the data transfer to the synchronization primitives reduces the complexity of the program as a whole.

In the multiprocessor system, a plurality of processors, each having a local memory, and a shared memory for inter-processor communication are connected by a shared bus. In the inter-processor communication method for such a multiprocessor system, when data is transferred from one of the processors to another, the data transfer between the local memory and the shared memory is performed within the synchronization primitives (the acquire-type synchronization primitive (wait_sem) and the release-type synchronization primitive (sig_sem)).

(Configuration of Embodiment 1)
FIG. 1 is a block diagram showing a configuration example of a multiprocessor system that implements the inter-processor communication method according to Embodiment 1 of the present invention.

This multiprocessor system has a plurality of processing units (hereinafter "PUs") 1-0 to 1-n and a shared memory SM used by the PUs 1-0 to 1-n; the PUs 1-0 to 1-n and the shared memory SM are connected by a shared bus 3. A DMA controller and the like are connected to the shared bus 3. Each of the PUs 1-0 to 1-n consists of a processor 1a and a local memory LM, which is its main memory. The system is a so-called loosely coupled multiprocessor in which the address space of the processor 1a is separate for each of the PUs 1-0 to 1-n, and each local memory LM can be accessed only from the processor 1a in the same PU. Each processor 1a may also have a cache memory 1b as needed. This cache memory 1b need not guarantee data consistency between processors and may be a cache memory of the kind normally used with a single processor.

The shared memory SM is mapped to a specific region of the address space of each processor 1a, and the processors 1a can share data through this region. The shared memory region is not cacheable; that is, the cache memory 1b is not used for it.

When a processor 1a accesses the shared memory SM, it requests the right to use the shared bus 3 from a bus arbiter (bus arbitration means, not shown) and can access the shared memory SM only when it has acquired that right. The processors 1a can also interrupt one another.

An embodiment of inter-processor communication using the shared memory SM in a multiprocessor system configured in this way is described below.

On the shared memory SM, a communication area CBUF01 for the PU 1-0 and the PU 1-1 is placed, together with an exclusive control variable area SEM_CBUF01 associated with the communication area CBUF01. The exclusive control variable area SEM_CBUF01 stores, for example, the start addresses of the PU 1-0 data read callback routine, the PU 1-1 data read callback routine, the PU 1-0 data write callback routine, and the PU 1-1 data write callback routine.

A copy area LBUF0 of the communication area CBUF01 is placed on the local memory LM of the PU 1-0, and a copy area LBUF1 of the communication area CBUF01 is likewise placed on the local memory LM of the PU 1-1. The copy areas LBUF0 and LBUF1 are used on the PU 1-0 and the PU 1-1, respectively, as buffer areas for the communication area CBUF01, and the correspondence between the communication area CBUF01 and the copy areas LBUF0 and LBUF1 is managed by the PU 1-0 and the PU 1-1, respectively.

FIG. 2 is a diagram showing the structure of the communication area CBUF01 in FIG. 1.
The communication area CBUF01 contains a data size 2a and a transfer data area 2b.

The data size 2a of the communication area CBUF01 is assumed to have been initialized to 0 by the PU 1-0 or the PU 1-1.
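As a rough illustration of this layout, the sketch below models the communication area and its local copies as byte buffers with a size header followed by a data region. The field width (4 bytes, little-endian), the capacity of 256 bytes, and all function names are assumptions made for the example; the patent does not specify them.

```python
import struct

SIZE_FIELD = struct.Struct("<I")  # assumed: 4-byte little-endian data size 2a
DATA_CAPACITY = 256               # assumed capacity of transfer data area 2b


def new_buffer():
    """Return a zero-filled buffer, so data size 2a starts at 0."""
    return bytearray(SIZE_FIELD.size + DATA_CAPACITY)


def store(buf, payload):
    """Write payload into transfer data area 2b and record its size in 2a."""
    if len(payload) > DATA_CAPACITY:
        raise ValueError("payload exceeds transfer data area")
    SIZE_FIELD.pack_into(buf, 0, len(payload))
    buf[SIZE_FIELD.size:SIZE_FIELD.size + len(payload)] = payload


def load(buf):
    """Return the valid bytes of transfer data area 2b."""
    (size,) = SIZE_FIELD.unpack_from(buf, 0)
    return bytes(buf[SIZE_FIELD.size:SIZE_FIELD.size + size])


def copy_valid(src, dst):
    """Copy data size 2a plus exactly that many data bytes from src to dst."""
    (size,) = SIZE_FIELD.unpack_from(src, 0)
    end = SIZE_FIELD.size + size
    dst[:end] = src[:end]
    return size


cbuf01 = new_buffer()   # stands in for CBUF01 on the shared memory
lbuf0 = new_buffer()    # stands in for LBUF0 on the local memory of PU 1-0
store(lbuf0, b"hello")
copy_valid(lbuf0, cbuf01)
print(load(cbuf01))     # b'hello'
```

Because the size travels with the data, a copy only has to move the header and the valid bytes, not the whole capacity of the area.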

(Inter-processor communication method of Embodiment 1)
FIGS. 3-1 and 3-2 are diagrams showing the processing procedure in the inter-processor communication method of Embodiment 1 of the present invention. FIG. 3-1 shows, as an example, a procedure in which a user program on the PU 1-0 sends data to the PU 1-1, a user program on the PU 1-1 performs some processing on that data, and the result is returned to the PU 1-0. FIG. 3-2 shows the processing procedures of the routines and the like in FIG. 3-1.

In the inter-processor communication method of Embodiment 1, the synchronization primitives (wait_sem, sig_sem) are assumed to invoke user-defined data read and data write callback routines. In the exclusive control variable area SEM_CBUF01, the data read callback routine and the data write callback routine invoked by the synchronization primitives (wait_sem, sig_sem) are assumed to be associated with each PU.
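One way to picture synchronization primitives that carry per-PU callbacks is the following Python sketch. It is an illustration only: the class `CallbackSemaphore`, the `register` method, and the use of `threading.Semaphore` in place of an OS semaphore on shared memory are all assumptions, not the patent's implementation.

```python
import threading


class CallbackSemaphore:
    """Binary semaphore whose wait/signal run per-PU callbacks (a sketch)."""

    def __init__(self):
        self._sem = threading.Semaphore(1)
        self._read_cb = {}   # pu_id -> callable run after acquisition
        self._write_cb = {}  # pu_id -> callable run before release

    def register(self, pu_id, read_cb, write_cb):
        """Associate a read and a write callback with one PU."""
        self._read_cb[pu_id] = read_cb
        self._write_cb[pu_id] = write_cb

    def wait_sem(self, pu_id):
        self._sem.acquire()     # block until the area is free
        self._read_cb[pu_id]()  # shared memory -> local copy

    def sig_sem(self, pu_id):
        self._write_cb[pu_id]() # local copy -> shared memory
        self._sem.release()     # only now is the area released


log = []
sem = CallbackSemaphore()
sem.register(0, lambda: log.append("read"), lambda: log.append("write"))
sem.wait_sem(0)
log.append("user code")
sem.sig_sem(0)
print(log)  # ['read', 'user code', 'write']
```

The ordering is the point of the design: the read callback runs only after the area has been acquired, and the write callback finishes before the area is released, so no other PU can observe a half-copied buffer.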

In the inter-processor communication method shown in FIG. 3-1, processing proceeds through the following steps: (A) step S1, (B) step S2, and (C) step S3.

(A) PU 1-0: Processing of the sending program (step S1)
First, to send data from the PU 1-0 to the PU 1-1, the user's sending program on the PU 1-0 is started. The sending program on the PU 1-0 first calls the OS resource-acquisition wait system call wait_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM, to attempt to acquire the communication area CBUF01 placed on the shared memory SM under exclusive control.

The system call wait_sem refers to the exclusive control variable area SEM_CBUF01 on the shared memory SM to determine whether the communication area CBUF01 can be acquired; if it cannot, the caller enters a wait state. If the communication area CBUF01 is acquired, wait_sem invokes the data read callback routine associated with the exclusive control variable area SEM_CBUF01 on the shared memory SM.

The data read callback routine copies the data in the communication area CBUF01 to the copy area LBUF0, up to the data size written in the communication area CBUF01. Because the data size 2a is 0 at this point, no data is actually copied; the routine only reads the data size 2a from the communication area CBUF01 into the copy area LBUF0 and then ends.

When the data read callback routine ends, wait_sem returns control to its caller.

On returning from wait_sem with the communication area CBUF01 acquired, the sending program writes the transfer data and the data size 2a to LBUF0, the copy area of the communication area CBUF01. (Conventionally, the data was written to the shared memory SM.)

After the data has been written, the sending program calls the OS resource-release system call sig_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM, to release the communication area CBUF01.

Before actually releasing the communication area CBUF01, sig_sem invokes the data write callback routine.

The data write callback routine copies the data in the copy area LBUF0 to the communication area CBUF01, up to the data size written in the copy area LBUF0. It also copies the data size itself. The copy is performed by programmed transfer or by starting a DMA. When the copy is complete, the data write callback routine ends.

When the data write callback routine ends, sig_sem releases the communication area CBUF01 and returns control to its caller.

Next, the sending program interrupts the PU 1-1 from the PU 1-0 to notify the PU 1-1 that the data is ready.

(B) PU 1-1: Processing of the processing program (step S2)
On receiving the interrupt, the PU 1-1 starts the user's processing program. The processing program on the PU 1-1 first calls the OS resource-acquisition wait system call wait_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM, to attempt to acquire the communication area CBUF01 placed on the shared memory SM under exclusive control.

The data read callback routine copies the data in the communication area CBUF01 to the copy area LBUF1, up to the data size written in the communication area CBUF01. The copy is performed by programmed transfer or by starting a DMA. When the copy is complete, the data read callback routine ends.

On returning from wait_sem with the communication area CBUF01 acquired, the entire contents of the communication area CBUF01 have been copied to LBUF1, its copy area, so the processing program reads the transfer data from the copy area LBUF1, up to the data size. When the read is complete, it sets the data size 2a to 0.

Next, the processing program performs some processing on the received data and writes the resulting data and its data size 2a to the copy area LBUF1.

After the data has been written, the processing program calls the OS resource-release system call sig_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM, to release the communication area CBUF01.

Before actually releasing the communication area CBUF01, sig_sem invokes the data write callback routine.

The data write callback routine copies the data in the copy area LBUF1 to the communication area CBUF01, up to the data size written in the copy area LBUF1. The copy is performed by programmed transfer or by starting a DMA. When the copy is complete, the data write callback routine ends.

When the data write callback routine ends, sig_sem releases the communication area CBUF01 and returns control to its caller.

Next, the processing program interrupts the PU 1-0 from the PU 1-1 to notify the PU 1-0 that processing is complete.

(C) PU 1-0: Processing of the receiving program (step S3)
On receiving the interrupt, the PU 1-0 starts the user's receiving program. The receiving program on the PU 1-0 first calls the OS resource-acquisition wait system call wait_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM, to attempt to acquire the communication area CBUF01 placed on the shared memory SM under exclusive control.

The data read callback routine copies the data in the communication area CBUF01 to the copy area LBUF0, up to the data size written in the communication area CBUF01. The copy is performed by programmed transfer or by starting a DMA. When the copy is complete, the data read callback routine ends.

On returning from wait_sem with the communication area CBUF01 acquired, the entire contents of the communication area CBUF01 have been copied to LBUF0, its copy area, so the receiving program reads the result data from the copy area LBUF0, up to the data size. When the read is complete, it sets the data size 2a to 0.

Next, to release the communication area CBUF01, the receiving program calls the OS resource-release system call sig_sem, using the exclusive control variable area SEM_CBUF01 on the shared memory SM.

The data write callback routine copies the data in the copy area LBUF0 to the communication area CBUF01, up to the data size written in the copy area LBUF0. Because the data size is 0, no data is actually copied; the routine only writes the data size 2a back from the copy area LBUF0 to the communication area CBUF01 and then ends.

The calling receiving program ends because reception is complete, and the user program continues processing based on the result data.
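Steps S1 to S3 can be traced end to end with a small single-threaded simulation. This is a sketch under several assumptions: the inter-processor interrupts are replaced by running the three programs in sequence, the bulk copy stands in for a block transfer or DMA, the "some processing" in step S2 is arbitrarily chosen to be upper-casing, and the names `Buffer`, `Channel`, `put`, `take`, and `round_trip` are hypothetical.

```python
import threading

CAPACITY = 64  # assumed size of the transfer data area


class Buffer:
    """Data size 2a plus transfer data area 2b."""

    def __init__(self):
        self.size = 0
        self.data = bytearray(CAPACITY)


def bulk_copy(src, dst):
    """Stand-in for a block transfer or DMA: size field plus valid bytes."""
    dst.size = src.size
    dst.data[:src.size] = src.data[:src.size]


class Channel:
    """CBUF01 guarded by SEM_CBUF01, with copies done inside the primitives."""

    def __init__(self):
        self.cbuf = Buffer()
        self._sem = threading.Semaphore(1)

    def wait_sem(self, lbuf):
        self._sem.acquire()
        bulk_copy(self.cbuf, lbuf)  # data read callback

    def sig_sem(self, lbuf):
        bulk_copy(lbuf, self.cbuf)  # data write callback
        self._sem.release()


def put(lbuf, payload):
    """User program writes data and its size to the local copy area."""
    lbuf.data[:len(payload)] = payload
    lbuf.size = len(payload)


def take(lbuf):
    """User program reads the data, then sets the data size to 0."""
    payload = bytes(lbuf.data[:lbuf.size])
    lbuf.size = 0
    return payload


def round_trip(request):
    ch, lbuf0, lbuf1 = Channel(), Buffer(), Buffer()

    # Step S1: sending program on PU 1-0.
    ch.wait_sem(lbuf0)
    put(lbuf0, request)
    ch.sig_sem(lbuf0)
    # (the interrupt from PU 1-0 to PU 1-1 would occur here)

    # Step S2: processing program on PU 1-1.
    ch.wait_sem(lbuf1)
    received = take(lbuf1)
    put(lbuf1, received.upper())  # arbitrary stand-in for "some processing"
    ch.sig_sem(lbuf1)
    # (the interrupt from PU 1-1 to PU 1-0 would occur here)

    # Step S3: receiving program on PU 1-0.
    ch.wait_sem(lbuf0)
    result = take(lbuf0)
    ch.sig_sem(lbuf0)             # writes the data size 0 back to CBUF01
    return result, ch.cbuf.size


print(round_trip(b"hello"))  # (b'HELLO', 0)
```

Note that the user-level code in each step only touches its local buffer; every access to the shared buffer happens inside `wait_sem` or `sig_sem`, which mirrors the separation the embodiment describes.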

（実施例１の効果）
本実施例１によれば、次の（ａ）〜（ｄ）のような効果がある。 (Effect of Example 1)
According to the first embodiment, there are the following effects (a) to (d).

（ａ）同期基本操作（wait_sem,sig_sem）の中でローカルメモリLMと共有メモリSM間でデータ転送を行うようにしている。これにより、あるプロセッサ１aにおいて一連の演算処理を行っている間はローカルメモリLMに書き込むため、プログラムで共有メモリSMを意識する必要がなく、かつ高速に書き込みができる。そして、一連の演算処理が終了し、同期基本操作（wait_sem,sig_sem）を行う際に、他のプロセッサ１aに伝達すべき演算結果をまとめてローカルメモリLMと共有メモリSM間でデータ転送を行うため、高速に書き込みができる。 (A) Data is transferred between the local memory LM and the shared memory SM in the basic synchronous operation (wait_sem, sig_sem). As a result, the data is written to the local memory LM while a series of arithmetic processing is performed in a certain processor 1a, so that it is not necessary to be aware of the shared memory SM by the program, and writing can be performed at high speed. Then, when a series of arithmetic processing is completed and the synchronous basic operation (wait_sem, sig_sem) is performed, the arithmetic results to be transmitted to the other processors 1a are collectively transferred between the local memory LM and the shared memory SM. Can write at high speed.

（ｂ）前記（a)において、ローカルメモリLMと共有メモリSM間の書き込み動作は、同期基本操作（wait_sem,sig_sem）中に組み込まれているため、メインのプログラムではローカルメモリLMと共有メモリSM間の書き込みを記述する必要がない。よって、プログラムを簡素化することができる。 (B) In (a), the write operation between the local memory LM and the shared memory SM is incorporated in the basic synchronous operation (wait_sem, sig_sem), so that the main program is between the local memory LM and the shared memory SM. There is no need to describe the writing. Therefore, the program can be simplified.

(c) Even when a cached processor core IP is used that provides no mechanism for maintaining cache data consistency for shared memory accesses, inter-processor communication using shared memory SM can be performed quickly and efficiently with a simple hardware configuration and without modifying the processor core IP.

(d) Because block transfers or DMA transfers to and from shared memory SM are performed at the timing of the basic synchronization operations (wait_sem, sig_sem), the part that actually transfers data to shared memory SM can be separated from the user program, which reduces the complexity of the program as a whole.
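The mechanism behind effects (a) to (d) can be sketched as a small Python model. This is an illustrative sketch only, not the patent's implementation: the class CallbackSemaphore, the byte arrays standing in for SM and LM, and the copy function names are all assumptions; only wait_sem and sig_sem are names taken from the text. Where a real OS would put the caller into a wait state, the model simply returns False.

```python
# Hypothetical model: class, attribute, and function names are illustrative.
class CallbackSemaphore:
    """Counting semaphore whose acquire/release run per-PU copy callbacks."""

    def __init__(self, count):
        self.count = count
        self.read_cb = {}   # pu_id -> callback run after a successful acquire
        self.write_cb = {}  # pu_id -> callback run before the release is signalled

    def wait_sem(self, pu_id):
        """Acquire: returns False where a real OS would block the caller."""
        if self.count == 0:
            return False
        self.count -= 1
        if pu_id in self.read_cb:
            self.read_cb[pu_id]()   # shared memory -> local memory
        return True

    def sig_sem(self, pu_id):
        """Release: flush local data to shared memory, then signal."""
        if pu_id in self.write_cb:
            self.write_cb[pu_id]()  # local memory -> shared memory
        self.count += 1


shared = bytearray(8)   # stands in for shared memory SM
local0 = bytearray(8)   # stands in for the local memory LM of PU1-0
local1 = bytearray(8)   # stands in for the local memory LM of PU1-1

def copy_local0_to_shared():
    shared[:] = local0

def copy_shared_to_local1():
    local1[:] = shared

ready = CallbackSemaphore(0)
ready.write_cb[0] = copy_local0_to_shared
ready.read_cb[1] = copy_shared_to_local1

local0[:5] = b"hello"       # the user program only touches local memory
ready.sig_sem(0)            # the batch copy to SM happens inside the release
ok = ready.wait_sem(1)      # the batch copy to LM happens inside the acquire
```

In this model the user code never references `shared` directly, which mirrors effect (d): the transfer to and from shared memory lives entirely inside the synchronization calls.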

(Configuration of Embodiment 2)
FIG. 4 is a block diagram showing a configuration example of a multiprocessor system that implements the inter-processor communication method according to Embodiment 2 of the present invention. Elements common to those in FIG. 1, which shows Embodiment 1, are given the same reference symbols.

The multiprocessor system of Embodiment 2 has the same configuration as that of Embodiment 1.
Embodiment 2 describes the case where data is transferred as a stream from one processor to the other.

On shared memory SM, communication area CBUF01 for PU1-0 and PU1-1 is placed together with exclusive control variable areas SEM_EMPTY01 and SEM_READY01, which provide exclusive control for CBUF01. SEM_EMPTY01 is a counting semaphore that represents the number of empty transfer data areas, and SEM_READY01 is a counting semaphore that represents the number of transfer data areas to which data has been written; together, this pair performs exclusive control of CBUF01. SEM_EMPTY01 is assumed to be initialized to m and SEM_READY01 to 0.
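How the two counting semaphores bound the sender can be illustrated with plain counters. This is a minimal sketch under stated assumptions: the value m = 4, the dictionary `sem`, and the non-blocking return value of wait_sem are all choices made for the illustration, not details from the patent.

```python
M = 4  # number of transfer data areas (the text's m); the value 4 is an assumption

sem = {"EMPTY": M, "READY": 0}   # initial values stated in the text

def wait_sem(name):
    """Acquire one unit; returns False where a real OS would block."""
    if sem[name] == 0:
        return False
    sem[name] -= 1
    return True

def sig_sem(name):
    """Release one unit."""
    sem[name] += 1

# The sender can fill exactly M areas before it would have to wait.
filled = 0
while wait_sem("EMPTY"):
    sig_sem("READY")     # one more area now holds data
    filled += 1

# The receiver drains one area, which lets the sender proceed again.
wait_sem("READY")
sig_sem("EMPTY")
```

After the sender loop, EMPTY is 0 and READY is M; after the receiver's step, one area is free again. At every quiescent point the two counts sum to M.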

Exclusive control variable area SEM_EMPTY01 stores, for example, the PU1-0 data read callback routine and the PU1-1 data write callback routine. Exclusive control variable area SEM_READY01 stores, for example, the PU1-0 data write callback routine and the PU1-1 data read callback routine.
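The association between semaphores, PUs, and callback routines can be written out as a lookup table. The sketch below is illustrative: the callback names are hypothetical labels for the four routines in the text, and the rule that read callbacks run on wait_sem and write callbacks run on sig_sem follows the processing procedure described for this embodiment.

```python
# Callback names are hypothetical labels for the four routines in the text.
CALLBACKS = {
    ("SEM_EMPTY01", "PU1-0"): {"wait_sem": "pu0_read_callback"},
    ("SEM_EMPTY01", "PU1-1"): {"sig_sem": "pu1_write_callback"},
    ("SEM_READY01", "PU1-0"): {"sig_sem": "pu0_write_callback"},
    ("SEM_READY01", "PU1-1"): {"wait_sem": "pu1_read_callback"},
}

def callback_for(sem, pu, operation):
    """Return the callback the OS would start, or None if none is registered."""
    return CALLBACKS.get((sem, pu), {}).get(operation)

# The sender (PU1-0) acquires an empty area, then signals that data is ready.
sender_steps = [
    callback_for("SEM_EMPTY01", "PU1-0", "wait_sem"),
    callback_for("SEM_READY01", "PU1-0", "sig_sem"),
]
```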

As in Embodiment 1, copy area LBUF0 of communication area CBUF01 is placed in the local memory LM of PU1-0, and copy area LBUF1 of CBUF01 is placed in the local memory LM of PU1-1. Copy areas LBUF0 and LBUF1 serve as buffer areas for CBUF01 on PU1-0 and PU1-1, respectively, and the correspondence between CBUF01 and each copy area is managed by PU1-0 and PU1-1, respectively.

FIG. 5 is a diagram showing the configuration of communication area CBUF01 in FIG. 4.
Communication area CBUF01 contains m transfer data areas 2b-0 to 2b-(m-1). Each transfer data area 2b-0 to 2b-(m-1) is accompanied by a field that holds the size of the data written to it, data sizes 2a-0 to 2a-(m-1). Data sizes 2a-0 to 2a-(m-1) are assumed to have been initialized to 0 by PU1-0 or PU1-1. In addition, to manage the transfer data areas 2b that are in use, CBUF01 contains headbufno, the number of the first transfer data area 2b in use, and tailbufno, the number following the last one in use.

These two numbers are used to manage the m transfer data areas 2b-0 to 2b-(m-1) of communication area CBUF01 as ring buffer 2B. headbufno holds the number of the transfer data area 2b at the head of the in-use part of ring buffer 2B, and tailbufno holds the number following the transfer data area at its tail. Both are assumed to be initialized to 0.
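The layout of CBUF01 can be modeled with two small data classes. This is a sketch under stated assumptions: the class names, the 16-byte area capacity, and the helper used_numbers are invented for illustration; only headbufno, tailbufno, and the per-area size field come from the text. Because headbufno equals tailbufno both when the ring is empty and when it is full, the helper takes the number of filled areas (the SEM_READY01 count in this model) as an argument.

```python
from dataclasses import dataclass, field

AREA_BYTES = 16  # capacity of one transfer data area; an assumed value

@dataclass
class TransferArea:
    size: int = 0  # data size 2a, initialized to 0
    data: bytearray = field(default_factory=lambda: bytearray(AREA_BYTES))  # area 2b

@dataclass
class CommArea:
    """Model of communication area CBUF01 managed as ring buffer 2B."""
    areas: list
    headbufno: int = 0  # number of the first area in use
    tailbufno: int = 0  # number following the last area in use

    def used_numbers(self, ready_count):
        """Area numbers currently in use, given the count of filled areas."""
        m = len(self.areas)
        return [(self.headbufno + i) % m for i in range(ready_count)]

def make_comm_area(m):
    """Create a CBUF01 model with m empty transfer data areas."""
    return CommArea(areas=[TransferArea() for _ in range(m)])

cbuf01 = make_comm_area(4)
cbuf01.headbufno, cbuf01.tailbufno = 3, 1   # areas 3 and 0 hold data
in_use = cbuf01.used_numbers(2)
```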

(Inter-processor communication method of Embodiment 2)
FIGS. 6-1, 6-2, and 6-3 are diagrams showing the processing procedure of the inter-processor communication method of Embodiment 2 of the present invention. FIG. 6-1 shows, as an example, a procedure in which a user program on PU1-0 repeatedly sends data to PU1-1 and PU1-1 receives it. FIGS. 6-2 and 6-3 show the processing procedures of the routines and other elements in FIG. 6-1.

As in Embodiment 1, the basic synchronization operations (wait_sem, sig_sem) are assumed to start user-defined data read and data write callback routines. For each PU, the data read and data write callback routines started by the basic synchronization operations (wait_sem, sig_sem) are assumed to be associated with exclusive control variable areas SEM_EMPTY01 and SEM_READY01.

In the inter-processor communication method shown in FIG. 6-1, when, for example, a user program on PU1-0 repeatedly sends data to PU1-1 and a user program on PU1-1 receives that data, processing proceeds according to (A) step S10 and (B) step S11 below.

(A) PU0: Communication program processing (step S10)
First, to send data from PU1-0 to PU1-1, the user's transmission program on PU1-0 is started. The transmission program first tries to acquire a free transfer data area 2b in communication area CBUF01, which is located in shared memory SM, by calling the OS resource-acquisition wait system call wait_sem with exclusive control variable area SEM_EMPTY01 in shared memory SM.

The wait_sem system call refers to exclusive control variable SEM_EMPTY01 in shared memory SM to determine whether a free transfer data area 2b in CBUF01 can be acquired; if not, the caller enters a wait state. If a free transfer data area 2b is acquired, wait_sem starts the PU1-0 data read callback routine associated with SEM_EMPTY01.

The PU1-0 data read callback routine copies the data in the transfer data area 2b designated by tailbufno in CBUF01 to copy area LBUF0, limited to the data size recorded in that transfer data area 2b. Because data size 2a has been initialized to 0, no data is actually copied; only data size 2a is read into LBUF0, and the data read callback routine ends.

On return from wait_sem, if a free transfer data area 2b in CBUF01 has been acquired, the transmission program writes the transfer data and data size 2a to copy area LBUF0. The transfer data and data size 2a may be written in several separate steps.

After the data has been written, the transmission program signals this by calling the OS resource-release system call sig_sem with exclusive control variable area SEM_READY01 in shared memory SM.
Before it actually issues the data-ready notification, sig_sem starts the PU1-0 data write callback routine associated with SEM_READY01.

The PU1-0 data write callback routine copies the data in copy area LBUF0, limited to the data size recorded in LBUF0, to the transfer data area designated by tailbufno in communication area CBUF01. Data size 2a is copied as well. The copy is performed by program transfer or by starting a DMA transfer. When the copy is complete, tailbufno is incremented by 1 so that it points to a new area. To form ring buffer 2B, tailbufno is reset to 0 if its value exceeds (m-1). When tailbufno has been updated, the data write callback routine ends.
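The sender-side write callback and its wraparound rule can be sketched as follows. This is an illustrative model only: dictionaries stand in for CBUF01 and LBUF0, m is assumed to be 3, the function name pu0_write_callback is hypothetical, and an ordinary slice copy stands in for the block transfer or DMA.

```python
M = 3  # number of transfer data areas; an assumed value

# Model of CBUF01 in shared memory: each area holds a size and a payload.
cbuf01 = {"tailbufno": 0, "areas": [{"size": 0, "data": b""} for _ in range(M)]}
# Model of copy area LBUF0 in the local memory of PU1-0.
lbuf0 = {"size": 0, "data": b""}

def pu0_write_callback():
    """Copy LBUF0 into the area named by tailbufno, then advance tailbufno."""
    area = cbuf01["areas"][cbuf01["tailbufno"]]
    area["data"] = lbuf0["data"][:lbuf0["size"]]  # copy only `size` bytes
    area["size"] = lbuf0["size"]                  # data size 2a is copied too
    tail = cbuf01["tailbufno"] + 1
    if tail > M - 1:                              # wrap as described in the text
        tail = 0
    cbuf01["tailbufno"] = tail

# Three sends fill areas 0, 1, and 2 and bring tailbufno back to 0.
for payload in (b"aa", b"bbb", b"c"):
    lbuf0["data"], lbuf0["size"] = payload, len(payload)
    pu0_write_callback()
```

The model omits the semaphores, so nothing stops a fourth send from overwriting area 0; in the described method, wait_sem on SEM_EMPTY01 prevents that.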

When the data write callback routine ends, sig_sem signals through SEM_READY01 that the data is ready and returns control to the caller.

The transmission program repeats the above processing until all data has been sent.

(B) PU1: Reception program processing (step S11)
PU1-1 starts the user's reception program when it receives an interrupt or when it wants to receive data. The reception program on PU1-1 first calls the OS resource-acquisition wait system call wait_sem with exclusive control variable area SEM_READY01 in shared memory SM to check whether data is ready in communication area CBUF01, which is located in shared memory SM.

The wait_sem system call refers to exclusive control variable area SEM_READY01 in shared memory SM to determine whether data is ready in CBUF01; if not, the caller enters a wait state. If data is ready, wait_sem starts the PU1-1 data read callback routine associated with SEM_READY01.

The PU1-1 data read callback routine copies the data in the transfer data area designated by headbufno in CBUF01 to copy area LBUF1, limited to the data size recorded in that transfer data area. Data size 2a is copied as well. The copy is performed by program transfer or by starting a DMA transfer. When the copy is complete, the data read callback routine ends.

On return from wait_sem, once it has confirmed that received data is ready, the reception program reads the transfer data from copy area LBUF1, up to the recorded data size. The transfer data may be read in several separate steps. When reading is complete, the program updates data size 2a to 0.

When the data has been read, the reception program calls the OS resource-release system call sig_sem with exclusive control variable area SEM_EMPTY01 in shared memory SM to release the transfer data area 2b it used in CBUF01.

Before releasing the used transfer data area 2b, sig_sem starts the PU1-1 data write callback routine.

The PU1-1 data write callback routine copies the data in copy area LBUF1, limited to the data size recorded in LBUF1, to the transfer data area 2b designated by headbufno. Because data size 2a has been updated to 0, no data is actually copied; only data size 2a is written to the transfer data area 2b designated by headbufno. headbufno is then incremented so that it points to the next data storage area. To form ring buffer 2B, headbufno is reset to 0 if its value exceeds (m-1). When headbufno has been updated, the data write callback routine ends.
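The receiver-side write callback differs from the sender's in that the recorded size is 0, so only the size field travels back to shared memory. The sketch below is illustrative: the dictionaries, the value m = 3, and the name pu1_write_callback are assumptions, and the modulo operation is used as an equivalent form of the "reset to 0 when the value exceeds (m-1)" rule.

```python
M = 3  # number of transfer data areas; an assumed value

# Model of CBUF01: area 2 was filled by the sender and is at the head of the ring.
cbuf01 = {"headbufno": 2, "areas": [{"size": 0, "data": b""} for _ in range(M)]}
cbuf01["areas"][2] = {"size": 4, "data": b"data"}

# Model of LBUF1 after the reception program has read the data and reset the size.
lbuf1 = {"size": 0, "data": b"data"}

def pu1_write_callback():
    """Write LBUF1 back to the area named by headbufno, then advance headbufno."""
    area = cbuf01["areas"][cbuf01["headbufno"]]
    n = lbuf1["size"]
    if n > 0:                      # size is 0 here, so no payload is copied
        area["data"] = lbuf1["data"][:n]
    area["size"] = n               # only the size field is written back
    cbuf01["headbufno"] = (cbuf01["headbufno"] + 1) % M

pu1_write_callback()
```

After the call, area 2 is marked empty (size 0) even though its old payload bytes are still present, and headbufno has wrapped from 2 to 0.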

When the data write callback routine ends, sig_sem signals the release of the transfer data area in CBUF01 and returns control to the caller.

Next, the reception program raises an interrupt from PU1-1 to PU1-0 to notify PU1-0 that reception is complete.

The reception program repeats the above processing until all data has been received.
Interrupts between PU1-0 and PU1-1 are used to notify a process that entered a wait state in the wait_sem system call that the wait can now be released.
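Steps S10 and S11 can be put together in a small end-to-end simulation. This is a sketch under stated assumptions, not the patented implementation: two Python threads stand in for PU1-0 and PU1-1, `threading.Semaphore` stands in for wait_sem and sig_sem (its blocking and wake-up replace the wait state and the inter-PU interrupt), the callbacks are inlined into the loops, and m is assumed to be 2.

```python
import threading

M = 2                                   # number of transfer data areas (assumed)
areas = [b""] * M                       # transfer data areas 2b in shared memory
pos = {"head": 0, "tail": 0}            # headbufno and tailbufno
sem_empty = threading.Semaphore(M)      # SEM_EMPTY01, initial value m
sem_ready = threading.Semaphore(0)      # SEM_READY01, initial value 0
received = []

def sender(messages):
    """Model of the transmission program on PU1-0 (step S10)."""
    for msg in messages:
        sem_empty.acquire()             # wait_sem: block until an area is free
        lbuf0 = bytes(msg)              # user program writes to local memory only
        areas[pos["tail"]] = lbuf0      # write callback: LBUF0 -> CBUF01
        pos["tail"] = (pos["tail"] + 1) % M
        sem_ready.release()             # sig_sem: announce that data is ready

def receiver(count):
    """Model of the reception program on PU1-1 (step S11)."""
    for _ in range(count):
        sem_ready.acquire()             # wait_sem: block until data is ready
        lbuf1 = areas[pos["head"]]      # read callback: CBUF01 -> LBUF1
        received.append(lbuf1)          # user program reads from local memory only
        pos["head"] = (pos["head"] + 1) % M
        sem_empty.release()             # sig_sem: free the area for the sender

messages = [b"m%d" % i for i in range(5)]
t_recv = threading.Thread(target=receiver, args=(len(messages),))
t_send = threading.Thread(target=sender, args=(messages,))
t_recv.start()
t_send.start()
t_send.join()
t_recv.join()
```

With only two areas and five messages, the sender has to wait for the receiver at least once, yet the messages arrive complete and in order because each side touches an area only while holding the corresponding semaphore unit.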

(Effects of Embodiment 2)
Embodiment 2 provides the same effects as effects (a) and (b) of Embodiment 1 and, in addition, the following effects (e) and (f).

(e) Even when a cached processor core IP is used that provides no mechanism for maintaining cache data consistency for shared memory accesses, stream data transfer between processors using shared memory SM can be performed quickly and efficiently with a simple hardware configuration and without modifying the processor core IP.

(f) Because block transfers or DMA transfers to and from shared memory SM are performed at the timing of the basic synchronization operations (wait_sem, sig_sem), the part that actually transfers data to shared memory SM can be separated from the user program, which reduces the complexity of the program as a whole.

The present invention is not limited to Embodiments 1 and 2 above; various modifications and applications are possible. Examples include (I) and (II) below.

(I) Embodiments 1 and 2 describe two typical forms of inter-processor communication, but the communication procedure can be changed to one other than that illustrated, and the invention is also applicable to forms of inter-processor communication other than these two.

(II) Inter-processor communication between two PUs has been described by way of example, but the invention is also applicable when three or more PUs communicate.

A block diagram showing a configuration example of a multiprocessor system that implements the inter-processor communication method according to Embodiment 1 of the present invention.
A diagram showing the configuration of communication area CBUF01 in FIG. 1.
A diagram showing a processing procedure in the inter-processor communication method of Embodiment 1 of the present invention.
A diagram showing a processing procedure in the inter-processor communication method of Embodiment 1 of the present invention.
A block diagram showing a configuration example of a multiprocessor system that implements the inter-processor communication method according to Embodiment 2 of the present invention.
A diagram showing the configuration of communication area CBUF01 in FIG. 4.
A diagram showing a processing procedure in the inter-processor communication method of Embodiment 2 of the present invention.
A diagram showing a processing procedure in the inter-processor communication method of Embodiment 2 of the present invention.
A diagram showing a processing procedure in the inter-processor communication method of Embodiment 2 of the present invention.

Explanation of symbols

1-0 to 1-n: PU (processing unit)
1a: processor
1b: cache memory
2a: data size
2b: transfer data area
3: shared bus
LM: local memory
SM: shared memory

Claims

1. An inter-processor communication method for a multiprocessor system in which a plurality of processors, each having a local memory, and a shared memory for inter-processor communication are connected by a shared bus,
wherein, when data is transferred from one of the plurality of processors to another processor, data transfer between the local memory and the shared memory is performed within an acquire-type basic synchronization operation and a release-type basic synchronization operation.

2. An inter-processor communication method for a multiprocessor system in which a plurality of processors, each having a local memory, and a shared memory for inter-processor communication having a communication data area are connected by a shared bus, and which, whether or not each of the processors has a cache memory, has no mechanism for maintaining data consistency, wherein:
the communication data area of the shared memory is associated with a transfer-source local memory of a transmitting processor, which sends data, and a transfer-destination local memory of a receiving processor, which receives data, among the plurality of processors, and an exclusive control variable is provided for exclusive control of the communication data area of the shared memory;
when the transmitting processor sends data, the data is written to the transfer-source local memory and, at the timing when the release-type basic synchronization operation is performed on the associated communication data area of the shared memory, is transferred in a batch from the transfer-source local memory to the shared memory; and
when the receiving processor receives data, before the data is read, the data is transferred in a batch from the shared memory to the transfer-destination local memory at the timing when the acquire-type basic synchronization operation is performed on the communication data area of the shared memory, and the data is then read from the transfer-destination local memory.

3. The inter-processor communication method according to claim 2, wherein a block transfer instruction or direct memory access of the processor is used as a method of transferring data collectively between the local memory and the shared memory.