JP2005301607A

JP2005301607A - Information processor, information processing system, and information processing method

Info

Publication number: JP2005301607A
Application number: JP2004115895A
Authority: JP
Inventors: Shinji Nakagawa; 伸治中川
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-04-09
Filing date: 2004-04-09
Publication date: 2005-10-27

Abstract

PROBLEM TO BE SOLVED: To provide an information processing apparatus capable of efficiently operating a plurality of processors in parallel, together with a corresponding information processing system and information processing method.
SOLUTION: Each of the sub-processors 24-1 to 24-n includes a compression/decompression unit 24B-1 to 24B-n, which compresses data to be transferred on a system bus 29 and decompresses data received from the bus. Using this function, the data transferred between a main memory 23 and the local memories 24A-1 to 24A-n in the sub-processors 24-1 to 24-n is compressed. This resolves the system-wide bottleneck caused by the data transfer time between the memories.
COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to an information processing apparatus including a plurality of processors, an information processing system in which such information processing apparatuses are connected to a network, and an information processing method applied to such an information processing apparatus.

Multiprocessor distributed parallel processing systems (hereinafter referred to as multiprocessor systems) configured using a plurality of processors (arithmetic units) are known in the art. Examples of such systems are described in Patent Document 1 and Patent Document 2.
Patent Document 1: JP 7-152664 A
Patent Document 2: JP 2002-351850 A

In such a multiprocessor system, processing is distributed across a plurality of processors connected to a system bus, and arithmetic processing (signal processing) is performed in parallel. A significant improvement in computing power can therefore be expected compared with signal processing performed by a single processor. In this system, for example, each processor includes an arithmetic unit and a work memory (hereinafter referred to as a local memory), and the processors share a main memory via the system bus. The arithmetic unit in each processor executes operations using data transferred from the main memory to the local memory. During an operation, the arithmetic unit accesses only the local memory and does not refer to the main memory directly. When the operation is completed, the result held in the local memory is written back to the main memory.

This configuration addresses the following problems.
1. Access contention caused by sharing the main memory
2. Performance degradation caused by using a slow main memory
3. Increased system bus traffic

In the multiprocessor system described above, the main memory is accessed only when data is input or output before and after an operation. If the data transfer rate is sufficiently high, the processing capability of the entire system improves as the degree of processor parallelism increases.

In the conventional system, however, the data transfer rate between the main memory and the local memory can become a bottleneck for the following reasons, and the system may be unable to deliver processing capability commensurate with the degree of parallelism of the arithmetic units.

1. Because the main memory must have a large capacity, low-cost DRAM (dynamic random access memory) is often used, but DRAM access speed is generally slower than the system bus speed.
2. Because the system bus is also used to transfer transaction data from various peripheral devices, data transfers between processors, or between a processor and the main memory, can never occupy the bus exclusively. As a result, a sufficient transfer rate cannot be secured.

The present invention has been made in view of these problems, and an object thereof is to provide an information processing apparatus, an information processing system, and an information processing method capable of efficiently operating a plurality of processors in parallel.

An information processing apparatus according to the present invention includes a bus and a plurality of arithmetic processors commonly connected to the bus. Each arithmetic processor includes an arithmetic unit and a compression/decompression unit that has a function of compressing operation result data produced by the arithmetic unit and a function of decompressing input data taken in via the bus.

An information processing method according to the present invention is applied to an information processing apparatus including a bus and a plurality of arithmetic processors commonly connected to the bus. In each arithmetic processor, input data taken in via the bus is selectively decompressed, arithmetic processing is performed on the basis of the decompressed input data, and the operation result data resulting from the arithmetic processing is selectively compressed and sent out onto the bus.

An information processing system according to the present invention is formed by connecting a plurality of information processing apparatuses to a network. Each information processing apparatus includes a bus and a plurality of arithmetic processors commonly connected to the bus, and each arithmetic processor includes an arithmetic unit and a compression/decompression unit that has a function of compressing operation result data produced by the arithmetic unit and a function of decompressing input data taken in via the bus.

In the information processing apparatus, the information processing system, and the information processing method of the present invention, input data taken in via the bus is selectively decompressed, and arithmetic processing is performed on the basis of the decompressed input data. The resulting operation result data is selectively compressed and sent out onto the bus. Because the data carried on the bus is compressed, the amount of data transferred between processors is reduced.

According to the information processing apparatus, the information processing system, and the information processing method of the present invention, data transferred on the bus between the plurality of processors is selectively compressed, so the amount of data is smaller and the transfer time shorter than when no compression is performed at all. As a result, in a distributed processing system in which the data transfer time to each processor is the bottleneck in computing performance, the bottleneck can be shifted from the transfer time to the arithmetic processing time, including the compression and decompression processing. This allows the plurality of processors to operate efficiently in parallel and improves the processing capability of the entire system.
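The shift of the bottleneck from transfer time to processing time can be illustrated with a rough timing model. The sketch below is not taken from the specification: the function name `stage_times`, the assumption that input and result have the same size, and all numeric values are hypothetical and chosen only for illustration.

```python
def stage_times(payload_bytes, bus_bytes_per_s, compute_s, ratio=1.0, codec_s=0.0):
    """Return (transfer_time, processing_time) in seconds for one sub-processor job.

    ratio   -- compressed size / original size (1.0 means no compression)
    codec_s -- extra time spent on compression and decompression
    Assumes (for illustration only) that input and result are the same size.
    """
    # One transfer to load the input and one to write the result back.
    transfer = 2 * payload_bytes * ratio / bus_bytes_per_s
    processing = compute_s + codec_s
    return transfer, processing


# Hypothetical figures: 1 MB payload, 100 MB/s effective bus rate, 5 ms of arithmetic.
plain = stage_times(1_000_000, 100_000_000, compute_s=0.005)
packed = stage_times(1_000_000, 100_000_000, compute_s=0.005, ratio=0.25, codec_s=0.002)

print("uncompressed:", plain)   # transfer dominates
print("compressed:  ", packed)  # processing dominates
```

With these made-up figures the uncompressed job is limited by the bus, while the compressed job is limited by the arithmetic and codec time, which is the situation the text describes.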

Hereinafter, the best mode for carrying out the present invention (hereinafter simply referred to as the embodiment) will be described in detail with reference to the drawings.

FIG. 1 shows the overall configuration of an information processing system according to an embodiment of the present invention. Because the information processing apparatus and the information processing method according to the embodiment are embodied by this information processing system, they are described together below.

This information processing system includes a network 1 and a plurality of information processing apparatuses 2, 3, and 4. The information processing apparatuses 2 to 4 all have the same configuration, so the information processing apparatus 2 is described below as a representative. Although three information processing apparatuses are shown here, there may be two, or four or more.

Each of the information processing apparatuses 2 to 4 has a function of using software cells, described later, to exchange various commands, the apparatus information held by each information processing apparatus, required programs, and the like with the other information processing apparatuses via the network 1. Here, the apparatus information is information indicating, for example, the power status of each information processing apparatus (whether the power is ON or OFF), the type of the information processing apparatus, the input/output interfaces it has, and what each of those interfaces is connected to.

The information processing apparatus 2 includes a main processor 21, a memory controller 22, a main memory 23, and a plurality of sub-processors 24-1 to 24-n (n is a positive integer). Together these constitute one multiprocessor unit 20, which can be implemented as a single integrated circuit chip. The multiprocessor unit 20 is assigned an information processing apparatus ID, an identifier that uniquely identifies the information processing apparatus 2 throughout the information processing system. In the following description, the sub-processors 24-1 to 24-n are referred to collectively as "sub-processor 24" where appropriate.

The information processing apparatus 2 also includes a disk controller 25, an external recording device 26, a bus arbiter 27, and a network connection unit 28. The blocks other than the main memory 23 and the external recording device 26 are connected to one another by a system bus 29. The main memory 23 is connected to the memory controller 22, and the external recording device 26 is connected to the disk controller 25.

The main processor 21 manages the schedule of program execution (data processing) by the sub-processors 24 and performs general management of the multiprocessor unit 20. The main processor 21 may, however, also execute programs other than the management program, in which case it functions in the same manner as a sub-processor 24. The main processor 21 has a local memory (local storage) 21A that serves as a temporary work memory.

The main memory 23 is shared by the main processor 21 and the sub-processors 24-1 to 24-n, and stores the various function programs that govern the functions of the information processing apparatus 2 as well as the data required for processing (including apparatus information on each information processing apparatus). A relatively slow memory such as DRAM (Dynamic Random Access Memory) can be used as the main memory.

The memory controller 22 is located between the system bus 29 and the main memory 23 and functions as a so-called DMAC (Direct Memory Access Controller). It allows the sub-processors 24-1 to 24-n to access the programs and data stored in the main memory 23 directly (without going through the main processor 21) and to perform DMA (Direct Memory Access) transfers.

Each of the sub-processors 24-1 to 24-n executes programs and processes data in parallel and independently under the control of the main processor 21. Where necessary, however, a program in the main processor 21 may operate in cooperation with a program in a sub-processor 24. The function programs described later are also programs that run in the main processor 21.

The sub-processor 24-1 includes a local memory (local storage) 24A-1, a compression/decompression unit 24B-1, and an arithmetic unit 24C-1. The local memory 24A-1 is used by the arithmetic unit 24C-1 as a temporary work memory. The compression/decompression unit 24B-1 decompresses, as necessary, data transferred from the main memory 23 via the system bus 29, and compresses, as necessary, the data to be sent that is stored in the local memory 24A-1 (the operation result data produced by the arithmetic unit 24C-1). The arithmetic unit 24C-1 performs predetermined arithmetic processing (signal processing) on data that has been input from the main memory 23 and stored in the local memory 24A-1. Examples of the arithmetic processing include image processing, audio processing, and scientific and engineering computation. In the sub-processor 24-1, a control unit (not shown) handles timing control and the like when the sub-processor obtains the right to use the bus (described later) from the bus arbiter 27 and transfers data to or from the main memory 23. The arithmetic unit 24C-1 may perform this control instead of the control unit.
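The flow through one sub-processor (decompress the input if needed, compute, compress the result if needed) can be sketched as follows. This is only an illustration under stated assumptions: Python's standard `zlib` module stands in for the compression/decompression unit, the `(compressed_flag, payload)` tuple is a hypothetical packet representation, and `invert` is a placeholder for the arithmetic processing.

```python
import zlib


def subprocessor_step(packet, compute, compress_output=True):
    """Model one job: selective decompression, arithmetic, selective compression.

    packet is a hypothetical (compressed_flag, payload_bytes) pair.
    """
    compressed, payload = packet
    # Decompress only when the incoming data is marked as compressed.
    data = zlib.decompress(payload) if compressed else payload
    result = compute(data)
    # Compress the result only when requested.
    if compress_output:
        return (True, zlib.compress(result))
    return (False, result)


def invert(data: bytes) -> bytes:
    """Placeholder arithmetic: invert every byte."""
    return bytes(255 - b for b in data)


incoming = (True, zlib.compress(bytes(64)))  # 64 zero bytes, sent compressed
flag, body = subprocessor_step(incoming, invert)
print(flag, len(body), zlib.decompress(body) == bytes([255]) * 64)
```

Passing `compress_output=False` models the case in which the result is written back uncompressed.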

Any compression algorithm may be used by the compression/decompression unit 24B-1 as long as it guarantees reversibility (that is, it is lossless). A typical example is entropy coding, of which Huffman coding is representative. It is difficult, however, to apply a coding scheme that exploits bias in the frequency of occurrence of signal values to arbitrary signal processing. Considering the cost of compression and decompression against the compression gain, a compression algorithm based on differential predictive coding is preferable. For example, image signal processing of moving images is one application to which distributed parallel processing can be applied. In such processing, an image frame is usually divided into small blocks called macroblocks, and signal processing is performed macroblock by macroblock. When this is applied to distributed parallel processing, macroblocks are assigned to the individual processors, which operate on them in parallel. Because image signals have relatively strong spatial and temporal correlation, data compression using prediction is effective.
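A minimal sketch of reversible differential predictive coding follows. It assumes 8-bit samples and the simplest possible predictor (the previous sample), neither of which is specified in the text. The function names are hypothetical.

```python
def dpcm_encode(samples):
    """Replace each 8-bit sample with its difference from the previous sample (mod 256)."""
    prev = 0
    residuals = []
    for s in samples:
        residuals.append((s - prev) % 256)
        prev = s
    return residuals


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the residuals (mod 256)."""
    prev = 0
    samples = []
    for r in residuals:
        prev = (prev + r) % 256
        samples.append(prev)
    return samples


row = [100, 101, 103, 103, 102, 104]   # neighbouring pixel values are similar
residuals = dpcm_encode(row)
print(residuals)                        # [100, 1, 2, 0, 255, 2]
print(dpcm_decode(residuals) == row)    # True
```

The residuals are the same length as the input. The saving comes from the fact that, for correlated data, they cluster around 0 (and 255, which represents -1) and can then be represented with short codes by a subsequent coding stage.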

There is no particular limit on the number n of sub-processors, but, as described later, providing more sub-processors is preferable for increasing the processing speed of the information processing apparatus as a whole. The other sub-processors 24-2 to 24-n have the same configuration as the sub-processor 24-1.

The network connection unit 28 is a LAN interface unit, a so-called NIC (Network Interface Card), and connects the LAN cable constituting the network 1 to the system bus 29.

The external recording device 26 stores the OS (operating system), which is the basic software required for the operation of the information processing apparatus 2, as well as application programs such as the function programs and driver programs described later, and various data. Examples of the external recording device 26 include a hard disk device and a removable hard disk device, optical disk devices such as a DVD (Digital Versatile Disk) device, an MO (Magneto Optical) drive, and a CD±RW (Compact Disc±ReWritable) drive, as well as a memory disk and an SRAM (Static Random Access Memory). The disk controller 25 performs access control when the main processor 21 reads data from or writes data to the external recording device 26.

The bus arbiter 27 arbitrates the right to use the bus among the main processor 21 and the sub-processors 24-1 to 24-n. Specifically, the bus arbiter 27 receives bus use requests from the processors and, in accordance with a predetermined algorithm, grants the right to use the bus to only one processor. This avoids data collisions on the system bus 29.
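The text does not say which arbitration algorithm the bus arbiter uses. As one possible illustration, the sketch below uses round-robin arbitration. That choice, the class name `RoundRobinArbiter`, and the processor numbering are all assumptions.

```python
class RoundRobinArbiter:
    """Grant the bus to at most one requester per call, rotating priority."""

    def __init__(self, num_processors):
        self.num = num_processors
        self.last = -1  # index of the processor that held the bus most recently

    def grant(self, requests):
        """Return the index of the single processor granted the bus, or None."""
        # Start searching just after the previous holder so that no requester starves.
        for offset in range(1, self.num + 1):
            candidate = (self.last + offset) % self.num
            if candidate in requests:
                self.last = candidate
                return candidate
        return None


arbiter = RoundRobinArbiter(4)
grants = [
    arbiter.grant({0, 2}),
    arbiter.grant({0, 2}),
    arbiter.grant({0, 2}),
    arbiter.grant(set()),
]
print(grants)  # [0, 2, 0, None]
```

Each call returns at most one grant, which matches the requirement that only one processor holds the bus at a time.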

As described above, the sub-processors 24-1 to 24-n in the multiprocessor unit 20 execute programs and process data independently. If different sub-processors read from or write to the same area of the main memory 23 at the same time, however, data inconsistency may result. To avoid such inconsistency, the multiprocessor unit 20 performs the exclusive control described below.

FIG. 2 illustrates the data arrangement in the main memory 23 and the local memories 24A-1 to 24A-n and the method of controlling access to them.

As shown in FIG. 2(A), the main memory 23 consists of a plurality of memory locations MML0 to MMLm (m is a positive integer). Each of the memory locations MML0 to MMLm (hereinafter referred to collectively simply as MML) is a storage area identified by specifying an address. Additional segments MSEG0 to MSEGm, which store information indicating the state of the data held in each location, are allocated to the memory locations MML. Access keys AK0 to AKm are also allocated to the memory locations MML.

Each of the additional segments MSEG0 to MSEGm (hereinafter referred to collectively simply as MSEG) includes an F/E bit, a sub-processor ID, and a local memory address.

The F/E bit is defined as follows. "F/E bit = 0" indicates that the data in that memory location MML is either data being processed after having been read by a sub-processor 24, or invalid data that is not the latest data because the location is empty. In this state the data cannot be read, but data can be written. After data is written, the F/E bit is set to 1.

"F/E bit = 1" indicates that the data in that memory location MML has not yet been read by a sub-processor 24 and is the latest, unprocessed data. Data in a memory location MML with "F/E bit = 1" can be read but cannot be overwritten. After the data is read by a sub-processor 24, the F/E bit is set to 0.

While a memory location MML is in the state "F/E bit = 0 (not readable / writable)", a read reservation can be set for it. A sub-processor 24 makes a read reservation for a memory location MML with "F/E bit = 0" by writing its own sub-processor ID and a local memory address, as read reservation information, into the additional segment MSEG of that memory location. When the sub-processor 24 on the writing side subsequently writes data to the reserved memory location MML and the F/E bit is set to 1 (readable / not writable), the written data is read out to the local memory address of the sub-processor 24 whose sub-processor ID was written in advance in the additional segment MSEG as read reservation information. This control is effective when data must be processed in multiple stages by the sub-processors 24-1 to 24-n, because as soon as the sub-processor 24 performing the earlier stage writes its processed data to a given address in the main memory 23, another sub-processor 24 performing the later stage can read that data.
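The F/E bit and read reservation behaviour described above can be modelled as a small state machine. This is a sketch, not the actual hardware logic: the class name `MainMemoryLocation`, the use of exceptions to signal a refused access, and the dictionary that stands in for the local memories are all assumptions.

```python
class MainMemoryLocation:
    """One memory location MML with its additional segment (F/E bit, reservation)."""

    def __init__(self):
        self.data = None
        self.fe = 0              # 0: not readable / writable, 1: readable / not writable
        self.reservation = None  # (sub_processor_id, local_memory_address) or None

    def reserve_read(self, sub_id, local_addr):
        """Record a read reservation while the location is still empty."""
        if self.fe != 0:
            raise RuntimeError("reservation is only modelled for F/E = 0")
        self.reservation = (sub_id, local_addr)

    def write(self, value, local_memories):
        """Write data, set F/E = 1, and forward the data to any reserving sub-processor."""
        if self.fe != 0:
            raise RuntimeError("write refused: F/E = 1")
        self.data, self.fe = value, 1
        if self.reservation is not None:
            sub_id, local_addr = self.reservation
            self.reservation = None
            local_memories[sub_id][local_addr] = self.read()

    def read(self):
        """Read data and set F/E back to 0."""
        if self.fe != 1:
            raise RuntimeError("read refused: F/E = 0")
        self.fe = 0
        return self.data


local_memories = {1: {}, 2: {}}   # hypothetical local memories keyed by sub-processor ID
loc = MainMemoryLocation()
loc.reserve_read(sub_id=2, local_addr=0x10)       # later stage reserves the result
loc.write("stage-1 result", local_memories)       # earlier stage writes it
print(local_memories[2][0x10], loc.fe)            # stage-1 result 0
```

Here the later-stage sub-processor 2 receives the data as soon as the earlier stage writes it, without polling the location.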

The main memory 23 is provided with a plurality of sandboxes SB0, SB1, and so on, each consisting of a set of memory locations. Each sandbox delimits an area within the main memory 23. Each sandbox is assigned to a sub-processor 24 and can be used exclusively by that sub-processor. That is, in principle each sub-processor 24 can use the sandbox assigned to it but cannot access data beyond that area.

As shown in FIG. 2(B), the local memory 24A-1 in the sub-processor 24-1 also consists of a plurality of memory locations SML. Additional segments SSEG0 to SSEGk (hereinafter referred to collectively simply as SSEG) are allocated to the memory locations SML. Each additional segment SSEG includes a busy bit. When a sub-processor 24 reads data in the main memory 23 into a memory location SML of its own local memory 24A, it reserves that location by setting the corresponding busy bit to 1. No other data can be stored in a memory location whose busy bit is 1. Once the data has been read into the memory location of the local memory 24A, the busy bit returns to 0 and the location can be used for any purpose.

A key management table KMT as shown in FIG. 2(C) is used to implement exclusive control of the main memory 23. The key management table KMT is stored in a relatively fast memory (not shown), such as an SRAM (Static Random Access Memory), provided in the memory controller 22, for example. Each entry in the key management table KMT contains a sub-processor ID (PID1 to PIDn), a sub-processor key PK1 to PKn (hereinafter referred to collectively simply as PK), and a key mask KM1 to KMn (hereinafter referred to collectively simply as KM).

A sub-processor 24 uses the main memory 23 as follows. First, the sub-processor 24 issues a read or write command to the memory controller 22. The command contains the sub-processor's own sub-processor ID and the requested address in the main memory 23. Before executing the command, the memory controller 22 refers to the key management table KMT and looks up the sub-processor key PK of the requesting sub-processor 24. The memory controller 22 then compares that sub-processor key PK with the access key AK allocated to the requested memory location MML in the main memory 23 (FIG. 2(A)) and executes the command only if the two keys match.

In the key mask KM of the key management table KMT shown in FIG. 2(C), setting any bit to 1 allows the corresponding bit of the sub-processor key PK associated with that key mask KM to take either the value 0 or the value 1. For example, if the sub-processor key PK is "1010", then in principle access is permitted only to a sandbox SB whose access key AK is the same value, "1010". If the key mask KM associated with this sub-processor key PK is set to, for example, "0001", however, the comparison between the sub-processor key PK and the access key AK is masked, and therefore not performed, for the digit in which the key mask bit is set to 1. As a result, access is permitted not only to a sandbox SB with the access key "1010" but also to one with the access key "1011".
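The masked key comparison can be expressed compactly with bitwise operations. The sketch below is an illustration only: the function names and the contents of `KEY_TABLE` are hypothetical, and keys are modelled as plain integers.

```python
def access_allowed(processor_key, access_key, key_mask=0):
    """Compare PK and AK, ignoring every bit position that is set in the key mask."""
    differing_bits = processor_key ^ access_key
    return (differing_bits & ~key_mask) == 0


# Hypothetical key management table: sub-processor ID -> (sub-processor key, key mask).
KEY_TABLE = {
    1: (0b1010, 0b0000),
    2: (0b1010, 0b0001),
}


def controller_check(sub_id, access_key):
    """Model the memory controller's check before executing a read or write command."""
    processor_key, key_mask = KEY_TABLE[sub_id]
    return access_allowed(processor_key, access_key, key_mask)


print(controller_check(1, 0b1010))  # True:  keys match exactly
print(controller_check(1, 0b1011))  # False: lowest bit differs, no mask
print(controller_check(2, 0b1011))  # True:  lowest bit is masked
print(controller_check(2, 0b1110))  # False: an unmasked bit differs
```

The four calls reproduce the example in the text: with mask "0001", the key "1010" opens both "1010" and "1011", and nothing else.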

Next, the structure of the data transferred when each sub-processor 24 exchanges data with the main memory 23 via the system bus 29 will be described. The transfer data structure is the same when the main processor 21 operates as a sub-processor.

As shown in FIG. 3(A), the data transferred on the system bus 29 between a sub-processor 24 and the main memory 23 has a packet-format data structure. The packet data includes a packet header PH and a data part DP. The packet header PH includes a compression mode CM indicating the compression state, a data type DT indicating the type (kind) of the data, and a transfer destination (not shown). The data part DP includes a command specifying the processing to be executed and the data required for that processing.

As shown in FIG. 3(B), the compression mode CM consists of, for example, 4 bits of information and, depending on its bit pattern, indicates "compression off", "method A compression", "method B compression", "method C compression", and so on. Here, "compression off" indicates that the data in the data part DP is uncompressed, and "method A compression" indicates that the data in the data part DP has been compressed by method A. As shown in FIG. 3(C), the data type DT likewise consists of, for example, 4 bits of information and, depending on its bit pattern, indicates that the data in the data part DP is a "still image", a "moving image", "audio", "text", and so on.
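Packing the two 4-bit fields into a header byte might look like the sketch below. The text gives only the field widths, so the numeric codes in `COMPRESSION_MODES` and `DATA_TYPES`, the placement of CM in the upper four bits, and the omission of the transfer destination field are all assumptions made for illustration.

```python
# Hypothetical code assignments; the actual bit patterns are defined in FIG. 3(B) and 3(C).
COMPRESSION_MODES = {0: "compression off", 1: "method A", 2: "method B", 3: "method C"}
DATA_TYPES = {0: "still image", 1: "moving image", 2: "audio", 3: "text"}


def pack_header(cm, dt):
    """Pack a 4-bit compression mode and a 4-bit data type into one byte."""
    if not (0 <= cm < 16 and 0 <= dt < 16):
        raise ValueError("CM and DT must each fit in 4 bits")
    return bytes([(cm << 4) | dt])


def unpack_header(header):
    """Return (cm, dt) from the first byte of a packet header."""
    byte = header[0]
    return byte >> 4, byte & 0x0F


header = pack_header(1, 2)          # method A compression, audio
cm, dt = unpack_header(header)
print(header.hex(), COMPRESSION_MODES[cm], DATA_TYPES[dt])  # 12 method A audio
```

A receiver would inspect `cm` first to decide whether the data part must pass through the decompression function, and with which method.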

In the information processing system according to the present embodiment, data is exchanged among the information processing apparatuses 2 to 5 using software cells, which are control packets having a predetermined format. Specifically, the main processor 21 included in the multiprocessor unit 20 of an information processing apparatus generates a software cell composed of data including commands, programs, apparatus information, and the like, and exchanges it with other information processing apparatuses via the network. The software cell is described below.

図４は、ソフトウェアセルの構成例を表すものである。このソフトウェアセルは、送信元ＩＤ１１、送信先ＩＤ１２、応答先ＩＤ１３、セルインターフェース１４、ＤＭＡ（Direct Memory Access）コマンド１５、サブプロセッサプログラム１６およびデータ１７から構成される。 FIG. 4 shows a configuration example of the software cell. This software cell includes a transmission source ID 11, a transmission destination ID 12, a response destination ID 13, a cell interface 14, a DMA (Direct Memory Access) command 15, a sub processor program 16, and data 17.

The transmission source ID 11 contains the IP address of the information processing apparatus that is the source of the software cell, the information processing apparatus ID of the multiprocessor unit in that apparatus, and the identifiers (sub-processor IDs) of the sub-processors included in that multiprocessor unit.

送信先ＩＤ１２および応答先ＩＤ１３には、それぞれ、ソフトウェアセルの送信先である情報処理装置、およびソフトウェアセルの実行結果の応答先である情報処理装置について、送信元ＩＤ１１に含まれる情報と同じ情報が含まれる。 The transmission destination ID 12 and the response destination ID 13 have the same information as the information included in the transmission source ID 11 for the information processing device that is the transmission destination of the software cell and the information processing device that is the response destination of the execution result of the software cell, respectively. included.

The cell interface 14 is information necessary for using the software cell, and includes a global ID 141, a number of sub-processors 142, a sandbox size 143, and a previous software cell ID 144. The global ID 141 uniquely identifies the software cell throughout the network, and is created based on the transmission source ID 11 and the date and time of creation or transmission of the software cell. The number of sub-processors 142 indicates the number of sub-processors necessary for executing the software cell. The sandbox size 143 indicates the amount of memory in the main memory 23 and the amount of memory in the local memory 24A of the sub-processor 24 necessary for executing the software cell. The previous software cell ID 144 is the identifier of the previously transferred software cell within a group of software cells that require sequential execution, such as streaming data.

ＤＭＡコマンド１５、サブプロセッサプログラム１６およびデータ１７は、ソフトウェアセルの実行セクション１８を構成するものである。ＤＭＡコマンド１５には、サブプロセッサプログラム１６の起動に必要な一連のＤＭＡコマンドが含まれ、サブプロセッサプログラム１６には、サブプロセッサ２４によって実行されるサブプロセッサプログラムが含まれる。データ１７は、このサブプロセッサプログラム１６を含むプログラムによって処理されるデータである。 The DMA command 15, the sub processor program 16 and the data 17 constitute the execution section 18 of the software cell. The DMA command 15 includes a series of DMA commands necessary for starting the sub processor program 16, and the sub processor program 16 includes a sub processor program executed by the sub processor 24. Data 17 is data processed by a program including the sub processor program 16.
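The layout of FIG. 4 can be pictured as nested records. The sketch below is only an illustration under assumed field types; the class and field names are hypothetical, and the reference numerals in the comments tie each field back to the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApparatusId:
    """Content shared by the source ID 11, destination ID 12, and response ID 13."""
    ip_address: str
    apparatus_id: str
    sub_processor_ids: List[str]


@dataclass
class CellInterface:                  # cell interface 14
    global_id: str                    # 141
    sub_processor_count: int          # 142
    sandbox_main_memory_bytes: int    # 143 (main memory 23 portion)
    sandbox_local_memory_bytes: int   # 143 (local memory 24A portion)
    previous_cell_id: str = ""        # 144, empty when there is no predecessor


@dataclass
class ExecutionSection:               # execution section 18
    dma_commands: List[str] = field(default_factory=list)  # 15
    sub_processor_program: bytes = b""                      # 16
    data: bytes = b""                                       # 17


@dataclass
class SoftwareCell:
    source_id: ApparatusId
    destination_id: ApparatusId
    response_id: ApparatusId
    interface: CellInterface
    execution: ExecutionSection


# Example values are placeholders chosen for illustration.
sender = ApparatusId("192.0.2.1", "apparatus-2", ["24-1", "24-2"])
receiver = ApparatusId("192.0.2.2", "apparatus-3", ["24-1"])
cell = SoftwareCell(
    source_id=sender,
    destination_id=receiver,
    response_id=sender,
    interface=CellInterface("cell-0001", 1, 4096, 1024),
    execution=ExecutionSection(dma_commands=["status_request"]),
)
```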

The DMA command 15 includes a load command 151, a kick command 152, a status request command 153, a status return command 154, and a function program execution command 155.

The load command 151 is a command for loading information in the main memory 23 into the local memory 24A in the sub-processor 24. In addition to a load command main body 151A, it includes a main memory address 151B, a sub-processor ID 151C, and a local memory address 151D. The main memory address 151B indicates the address of the load source area in the main memory 23. The sub-processor ID 151C is the identifier of the sub-processor 24 that is the load destination, and the local memory address 151D indicates the load destination address in the local memory 24A.

キックコマンド１５２は、プログラムの実行を開始するコマンドであり、キックコマンド本体部１５２Ａのほかに、サブプロセッサＩＤ１５２Ｂおよびプログラムカウンタ１５２Ｃを含む。サブプロセッサＩＤ１５２Ｂは、キック対象のサブプロセッサを識別するためのものであり、プログラムカウンタ１５２Ｃは、プログラム実行用プログラムカウンタのためのアドレスを与えるものである。 The kick command 152 is a command for starting execution of the program, and includes a sub processor ID 152B and a program counter 152C in addition to the kick command main body 152A. The sub processor ID 152B is for identifying the sub processor to be kicked, and the program counter 152C is for giving an address for the program counter for program execution.

The status request command 153 is a command requesting that the apparatus information of the information processing apparatus indicated by the transmission destination ID 12 be sent to the information processing apparatus indicated by the response destination ID 13. The status return command 154 is a command by which the information processing apparatus that received the status request command 153 returns its own apparatus information to the information processing apparatus indicated by the response destination ID 13 included in that status request command 153. When the DMA command 15 is the status return command 154, the apparatus information is stored in the data 17 area of the execution section 18.

機能プログラム実行コマンド１５５は、ある情報処理装置が他の情報処理装置に対して、機能プログラムの実行を要求するコマンドである。コマンド本体部１５５Ａと、機能プログラムＩＤ１５５Ｂとを含む。機能プログラム実行コマンド１５５を受信した情報処理装置内のマルチプロセッサユニット２０では、メインプロセッサ２１が、機能プログラムＩＤ１５５Ｂに基づき、起動すべき機能プログラムを識別してメインメモリ２３にロードすると共に、そのロードされた機能プログラムを実行するようになっている。 The function program execution command 155 is a command in which a certain information processing apparatus requests another information processing apparatus to execute a function program. It includes a command main body 155A and a function program ID 155B. In the multiprocessor unit 20 in the information processing apparatus that has received the function program execution command 155, the main processor 21 identifies the function program to be activated based on the function program ID 155B and loads the function program into the main memory 23. The function program is executed.
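The command fields listed above can be modeled as simple records. This is a hedged sketch: the class names are hypothetical, and the helper `startup_sequence` assumes that starting a sub-processor program amounts to one load followed by one kick whose program counter points at the load address, which the specification does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class LoadCommand:                # load command 151
    main_memory_address: int      # 151B: load source in main memory 23
    sub_processor_id: str         # 151C: destination sub-processor
    local_memory_address: int     # 151D: load destination in local memory 24A


@dataclass
class KickCommand:                # kick command 152
    sub_processor_id: str         # 152B: sub-processor to be kicked
    program_counter: int          # 152C: address for the program counter


@dataclass
class FunctionProgramExecCommand:  # function program execution command 155
    function_program_id: str       # 155B


def startup_sequence(sub_processor_id: str, main_addr: int,
                     local_addr: int) -> List[Union[LoadCommand, KickCommand]]:
    """Assumed minimal sequence: load the program, then start it at its load address."""
    return [
        LoadCommand(main_addr, sub_processor_id, local_addr),
        KickCommand(sub_processor_id, local_addr),
    ]


sequence = startup_sequence("24-1", 0x1000, 0x0000)
```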

FIG. 5 shows a data configuration example of the status return command 154. The status return command 154 includes a command main body 154A and a data part 154B. The data part 154B holds the apparatus information of the information processing apparatus that received the status request command 153. This apparatus information includes an information processing apparatus ID 154B-1, an information processing apparatus type ID 154B-2, an MS (master/slave) status 154B-3, a main processor operating frequency 154B-4, a main processor usage rate 154B-5, a number of sub-processors 154B-6, a sub-processor ID 154B-7, a sub-processor status 154B-8, a sub-processor usage rate 154B-9, a main memory total capacity 154B-10, a main memory usage 154B-11, a number of external recording devices 154B-12, an external recording device ID 154B-13, an external recording device type ID 154B-14, an external recording device total capacity 154B-15, and an external recording device usage 154B-16. All of this apparatus information is collected and stored in the main memory 23.

情報処理装置ＩＤ１５４Ｂ−１は、情報処理装置内のマルチプロセッサユニット２０を識別するための識別子であり、ステータス返信コマンド１５４を送信する情報処理装置を示すものである。情報処理装置ＩＤ１５４Ｂ−１は、電源投入時、その情報処理装置内のマルチプロセッサユニット２０に含まれるメインプロセッサ２１によって生成されるようになっている。具体的には例えば、電源投入時の日時、情報処理装置のＩＰアドレスおよび情報処理装置内のマルチプロセッサユニット２０に含まれるサブプロセッサ２４の数などに基づいて生成される。 The information processing apparatus ID 154B-1 is an identifier for identifying the multiprocessor unit 20 in the information processing apparatus, and indicates the information processing apparatus that transmits the status reply command 154. The information processing apparatus ID 154B-1 is generated by the main processor 21 included in the multiprocessor unit 20 in the information processing apparatus when the power is turned on. Specifically, for example, it is generated based on the date and time when the power is turned on, the IP address of the information processing apparatus, the number of sub-processors 24 included in the multiprocessor unit 20 in the information processing apparatus, and the like.
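The specification names the inputs to the ID (power-on date and time, IP address, number of sub-processors) but not how they are combined. The sketch below assumes, purely for illustration, that they are concatenated and hashed with SHA-256; the function name and the 16-character truncation are likewise assumptions.

```python
import hashlib
from datetime import datetime


def make_apparatus_id(power_on: datetime, ip_address: str,
                      sub_processor_count: int) -> str:
    """Derive an ID from the three inputs named in the text (assumed scheme)."""
    seed = f"{power_on.isoformat()}|{ip_address}|{sub_processor_count}"
    # Truncated hex digest; the length is an arbitrary illustrative choice.
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:16]


boot_time = datetime(2004, 4, 9, 12, 0, 0)
apparatus_id = make_apparatus_id(boot_time, "192.0.2.1", 8)
```

Because the power-on time is part of the seed, an apparatus would receive a new ID each time it is switched on under this scheme.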

情報処理装置種別ＩＤ１５４Ｂ−２には、その情報処理装置の特徴を表す値が含まれる。情報処理装置の特徴とは、例えば本実施の形態の場合、ＴＶ装置、ＤＶＤレコーダ、マルチプレーヤなどである。また、情報処理装置種別ＩＤ１５４Ｂ−２は、映像音声記録、映像音声再生など、装置の機能を表すものであってもよい。情報処理装置の特徴や機能を表す値は予め決定されているものとし、情報処理装置種別ＩＤ１５４Ｂ−２に基づいて、その情報処理装置の特徴や機能を把握することが可能である。 The information processing apparatus type ID 154B-2 includes a value representing the characteristics of the information processing apparatus. The characteristics of the information processing apparatus are, for example, a TV apparatus, a DVD recorder, and a multiplayer in the case of this embodiment. Further, the information processing apparatus type ID 154B-2 may represent a function of the apparatus such as video / audio recording or video / audio reproduction. It is assumed that values representing the characteristics and functions of the information processing apparatus are determined in advance, and it is possible to grasp the characteristics and functions of the information processing apparatus based on the information processing apparatus type ID 154B-2.

The MS status 154B-3 indicates whether each information processing apparatus is operating as a master or as a slave, as described later. Specifically, for example, an MS status 154B-3 set to “0” indicates that the apparatus is operating as a master, and an MS status 154B-3 set to “1” indicates that it is operating as a slave.

The main processor operating frequency 154B-4 represents the operating frequency of the main processor in the multiprocessor unit. The main processor usage rate 154B-5 represents the usage rate of the main processor for all programs currently running on it. The main processor usage rate 154B-5 is the ratio of the processing capacity in use to the total processing capacity of the main processor in question. It is calculated, for example, in units of MIPS (Million Instructions Per Second), a unit for evaluating processor performance, or based on the processor usage time per unit time. The sub-processor usage rate 154B-9, described later, is calculated in the same way.
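The two ways of calculating the usage rate mentioned above reduce to simple ratios. The function names and argument units below are assumptions for illustration only.

```python
def usage_from_mips(used_mips: float, total_mips: float) -> float:
    """Usage rate as processing capacity in use over total capacity, both in MIPS."""
    if total_mips <= 0:
        raise ValueError("total_mips must be positive")
    return used_mips / total_mips


def usage_from_time(busy_seconds: float, interval_seconds: float) -> float:
    """Usage rate as processor busy time over the length of the measured interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return busy_seconds / interval_seconds
```

For example, a processor rated at 1000 MIPS that is currently delivering 250 MIPS has a usage rate of 0.25 under the first definition.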

サブプロセッサ数１５４Ｂ−６は、そのマルチプロセッサユニットが備えるサブプロセッサの数を表し、サブプロセッサＩＤ１５４Ｂ−７は、そのマルチプロセッサユニット内の各サブプロセッサを識別するための識別子である。 The number of sub processors 154B-6 represents the number of sub processors included in the multiprocessor unit, and the sub processor ID 154B-7 is an identifier for identifying each sub processor in the multi processor unit.

The sub-processor status 154B-8 represents the state of each sub-processor, and takes values such as unused, reserved, and busy. “unused” means that the sub-processor is not currently in use and has not been reserved for use; “reserved” means that it is not currently in use but has been reserved; and “busy” means that it is currently in use.

The sub-processor usage rate 154B-9 represents the usage rate of the sub-processor for the program that is currently running on it or that is reserved to run on it. That is, when the sub-processor status is busy, the sub-processor usage rate 154B-9 indicates the current usage rate, and when the sub-processor status is reserved, it indicates the estimated usage rate expected for the later use.

The sub-processor ID 154B-7, the sub-processor status 154B-8, and the sub-processor usage rate 154B-9 form one set per sub-processor 24, and as many sets are provided as there are sub-processors in one multiprocessor unit 20.

メインメモリ総容量１５４Ｂ−１０およびメインメモリ使用量１５４Ｂ−１１は、それぞれ、メインメモリ２３の総容量および現在使用中の容量を表すものである。 The main memory total capacity 154B-10 and the main memory usage 154B-11 represent the total capacity of the main memory 23 and the capacity currently in use, respectively.

外部記録装置数１５４Ｂ−１２は、そのマルチプロセッサユニット２０に接続されている外部記録装置２６の数を表す。外部記録装置ＩＤ１５４Ｂ−１３は、そのマルチプロセッサユニットに接続されている外部記録装置を一意的に識別する情報である。外部記録装置種別ＩＤ１５４Ｂ−１４は、その外部記録装置の種類（例えば、ハードディスク、ＣＤ±ＲＷ、ＤＶＤ±ＲＷ、メモリディスク、ＳＲＡＭ、ＲＯＭなど）を表す。また、外部記録装置総容量１５４Ｂ−１５および外部記録装置使用量１５４Ｂ−１６は、それぞれ、外部記録装置ＩＤ１５４Ｂ−１３によって識別される外部記録装置２６の総容量および現在使用中の容量を表す。 The number of external recording devices 154B-12 represents the number of external recording devices 26 connected to the multiprocessor unit 20. The external recording device ID 154B-13 is information for uniquely identifying the external recording device connected to the multiprocessor unit. The external recording device type ID 154B-14 represents the type of the external recording device (for example, hard disk, CD ± RW, DVD ± RW, memory disk, SRAM, ROM, etc.). The external recording device total capacity 154B-15 and the external recording device usage amount 154B-16 represent the total capacity of the external recording device 26 identified by the external recording device ID 154B-13 and the capacity currently in use, respectively.

The external recording device ID 154B-13, the external recording device type ID 154B-14, the external recording device total capacity 154B-15, and the external recording device usage 154B-16 form one set per external recording device 26, and as many sets are provided as there are external recording devices connected to the multiprocessor unit 20.
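The apparatus information of FIG. 5, with its repeated per-sub-processor and per-device sets, can be sketched as nested records. All class and field names here are hypothetical; deriving the two counts (154B-6 and 154B-12) from the list lengths is a design choice of this sketch that keeps them consistent with the number of sets.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SubProcessorStatus(Enum):
    UNUSED = "unused"
    RESERVED = "reserved"
    BUSY = "busy"


@dataclass
class SubProcessorInfo:           # one set: 154B-7, 154B-8, 154B-9
    sub_processor_id: str
    status: SubProcessorStatus
    usage_rate: float


@dataclass
class ExternalRecorderInfo:       # one set: 154B-13 to 154B-16
    recorder_id: str
    type_id: str
    total_capacity: int
    used_capacity: int


@dataclass
class DeviceInfo:
    apparatus_id: str             # 154B-1
    apparatus_type_id: str        # 154B-2
    ms_status: int                # 154B-3: 0 = master, 1 = slave
    main_processor_hz: int        # 154B-4
    main_processor_usage: float   # 154B-5
    main_memory_total: int        # 154B-10
    main_memory_used: int         # 154B-11
    sub_processors: List[SubProcessorInfo] = field(default_factory=list)
    external_recorders: List[ExternalRecorderInfo] = field(default_factory=list)

    @property
    def sub_processor_count(self) -> int:        # 154B-6
        return len(self.sub_processors)

    @property
    def external_recorder_count(self) -> int:    # 154B-12
        return len(self.external_recorders)


# Placeholder values for illustration only.
info = DeviceInfo(
    apparatus_id="apparatus-2", apparatus_type_id="dvd-recorder", ms_status=0,
    main_processor_hz=1_000_000_000, main_processor_usage=0.25,
    main_memory_total=256, main_memory_used=64,
    sub_processors=[
        SubProcessorInfo("24-1", SubProcessorStatus.BUSY, 0.5),
        SubProcessorInfo("24-2", SubProcessorStatus.UNUSED, 0.0),
    ],
    external_recorders=[ExternalRecorderInfo("26-1", "hard-disk", 1000, 400)],
)
```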

Next, the software held by the information processing apparatus 2 will be described. This software comprises a control program, function programs, and device drivers. These are recorded in advance in the external recording device 26 connected to the multiprocessor unit 20, and are read from the external recording device 26 when the information processing apparatus 2 is powered on.

制御プログラムは、情報処理システム全体の構成を制御および管理する機能を有し、マルチプロセッサユニット２０のメインプロセッサ２１が実行するソフトウェアである。この制御プログラムは、ＭＳ（マスタ／スレーブ）マネージャおよび能力交換プログラムを含む。ＭＳマネージャは、後述するように各情報処理装置がマスタであるかスレーブであるかという情報を設定する機能を有する。また、能力交換プログラムは、各情報処理装置が保有する装置情報を取得する機能を有する。 The control program is software executed by the main processor 21 of the multiprocessor unit 20 having a function of controlling and managing the configuration of the entire information processing system. This control program includes an MS (master / slave) manager and a capability exchange program. The MS manager has a function of setting information as to whether each information processing apparatus is a master or a slave, as will be described later. The capability exchange program has a function of acquiring device information held by each information processing device.

機能プログラムは、情報処理装置２におけるアプリケーション処理動作を担う機能を有し、メインプロセッサ２１が実行するソフトウェアである。例えば、記録用プログラム、再生用プログラムおよび素材検索用プログラムなど、その情報処理装置に応じたものを含む。 The function program is software executed by the main processor 21 and having a function responsible for application processing operations in the information processing apparatus 2. For example, a program according to the information processing apparatus such as a recording program, a reproduction program, and a material search program is included.

デバイスドライバは、マルチプロセッサユニット２０のデータ入出力（送受信）用のソフトウェアであり、例えば、放送受信用ドライバ、モニタ出力用ドライバ、ビットストリーム入出力用ドライバ、ネットワーク入出力用ドライバなど、その情報処理装置に応じたものを含む。 The device driver is software for data input / output (transmission / reception) of the multiprocessor unit 20, and includes information processing such as a broadcast reception driver, a monitor output driver, a bitstream input / output driver, and a network input / output driver. Including those according to the device.

次に、以上のような構成の情報処理システムの動作を説明する。 Next, the operation of the information processing system configured as described above will be described.

In this information processing system, the information processing apparatuses 2 to 4 exchange apparatus information with one another using the software cell shown in FIG. 4. Through this exchange, one of the information processing apparatuses is set as the master and the others are set as slaves. Each information processing apparatus monitors the status of the other apparatuses by periodically sending a software cell containing a status request command as its DMA command to the other information processing apparatuses on the network and querying their status information. The apparatus information of all information processing apparatuses, including the master itself, is collected in the main memory 23 of the apparatus set as the master. Each information processing apparatus can use software cells to delegate necessary processing to another information processing apparatus and have it executed.

他の情報処理装置からコマンド実行の指示がなされ、または、ユーザからの操作によって直接に処理指示を受けた情報処理装置では、メインプロセッサ２１の制御の下、複数のサブプロセッサ２４−１〜２４−ｎが、与えられた処理を並列に実行する。このとき、マルチプロセッサユニット２０では、メインメモリ２３のサンドボックスに関する排他制御が行われる。まず、この排他制御について説明する。 In an information processing apparatus that is instructed to execute a command from another information processing apparatus or receives a processing instruction directly by a user operation, a plurality of sub-processors 24-1 to 24- n executes a given process in parallel. At this time, the multiprocessor unit 20 performs exclusive control related to the sandbox of the main memory 23. First, this exclusive control will be described.

サブプロセッサ２４−１〜２４−ｎによってデータを多段階に処理する必要がある場合、図２において説明したようなメモリ構造に基づく制御を行うことによって、前段階の処理を行うサブプロセッサ２４と、後段階の処理を行うサブプロセッサ２４のみが、メインメモリ２３の所定アドレスにアクセスできるようになり、データを保護することができる。 When it is necessary to process data in multiple stages by the sub-processors 24-1 to 24-n, the sub-processor 24 that performs the process in the previous stage by performing control based on the memory structure as described in FIG. Only the sub-processor 24 that performs the subsequent processing can access a predetermined address in the main memory 23 and can protect data.

The process by which the sub-processor 24-1 uses the main memory 23 is as follows. First, the sub-processor 24-1 issues a read or write command to the memory controller 22. This command contains the sub-processor's own sub-processor ID and the requested address in the main memory 23. Before executing the command, the memory controller 22 refers to the key management table KMT and looks up the sub-processor key PK of the requesting sub-processor 24-1. The memory controller 22 then compares that sub-processor key PK with the access key AK assigned to the requested memory location MML (FIG. 2A) in the main memory 23, and executes the command only when the two keys match.

また、マルチプロセッサユニット２０では、キー管理テーブルＫＭＴのキーマスクＫＭを利用することにより、以下のような制御が可能である。 In the multiprocessor unit 20, the following control is possible by using the key mask KM of the key management table KMT.

That is, immediately after the information processing apparatus 2 starts up, all values of the key mask KM are zero. Here, the program in the main processor 21 is assumed to operate in cooperation with the programs in the sub-processors 24. Assume that the processing result data output from the sub-processor 24-1 is temporarily stored in the main memory 23 and then input to the sub-processor 24-2. In this case, the storage area in the main memory 23 must be accessible from both sub-processors 24-1 and 24-2. In such a case, the program in the main processor 21 changes the value of the key mask KM appropriately to provide, in the main memory 23, an area that multiple sub-processors 24 can access, thereby enabling multi-stage processing by the sub-processors 24.

For example, assume that multi-stage processing is performed in the following procedure.
(1) Data input from another information processing apparatus
(2) Processing by the sub-processor 24-1
(3) Storage in memory location SML1 in the main memory 23
(4) Processing by the sub-processor 24-2
(5) Storage in memory location SML2 in the main memory 23

この場合、例えば、サブプロセッサ２４−１のサブプロセッサキーＰＫ＝「０１００」、メインメモリ２３内のメモリロケーションＳＭＬ１のアクセスキーＡＫ＝「０１００」、サブプロセッサ２４−２のサブプロセッサキーＰＫ＝「０１０１」、メインメモリ２３内のメモリロケーションＳＭＬ２のアクセスキーＡＫ＝「０１０１」という設定であったとすると、このままでは、サブプロセッサ２４−２はメインメモリ２３内のメモリロケーションＳＭＬ１にアクセスすることができない。そこで、サブプロセッサ２４−２のキーマスクＫＭ２を「０００１」に設定することにより、サブプロセッサ２４−２によるメインメモリ２３内のメモリロケーションＳＭＬ１へのアクセスを可能にすることができる。 In this case, for example, the sub processor key PK = “0100” of the sub processor 24-1, the access key AK = “0100” of the memory location SML1 in the main memory 23, and the sub processor key PK = “0101” of the sub processor 24-2. If the access key AK of the memory location SML2 in the main memory 23 is set to “0101”, the sub-processor 24-2 cannot access the memory location SML1 in the main memory 23 as it is. Therefore, by setting the key mask KM2 of the sub processor 24-2 to “0001”, the sub processor 24-2 can access the memory location SML1 in the main memory 23.
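The effect of the key mask setting in this example can be checked with a small model of the memory controller's lookup. The class names and the use of dictionaries for the key management table and the access keys are assumptions for illustration; the key values are the ones given in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class KeyEntry:
    processor_key: int
    key_mask: int = 0b0000  # all zero immediately after start-up


class MemoryController:
    """Hypothetical model of the check made before a read or write is executed."""

    def __init__(self, key_table: Dict[str, KeyEntry], access_keys: Dict[str, int]):
        self.key_table = key_table      # sub-processor ID -> PK and KM
        self.access_keys = access_keys  # memory location -> AK

    def may_access(self, sub_processor_id: str, location: str) -> bool:
        entry = self.key_table[sub_processor_id]
        differing = entry.processor_key ^ self.access_keys[location]
        # Digits whose key mask bit is 1 are left out of the comparison.
        return (differing & ~entry.key_mask) == 0


controller = MemoryController(
    key_table={"24-1": KeyEntry(0b0100), "24-2": KeyEntry(0b0101)},
    access_keys={"SML1": 0b0100, "SML2": 0b0101},
)

before = controller.may_access("24-2", "SML1")     # keys differ in the last digit
controller.key_table["24-2"].key_mask = 0b0001     # set KM2 as in the text
after = controller.may_access("24-2", "SML1")
```

Under this model the sub-processor 24-2 keeps its access to SML2 after the mask is set, while the sub-processor 24-1, whose mask is still zero, remains unable to reach SML2.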

次に、図６を参照して、マルチプロセッサユニット２０内のサブプロセッサ２４による並列処理動作について説明する。図６は、あるサブプロセッサ２４がメインメモリ２３から読み出したデータに対して所定の演算処理を行った後、その処理結果をメインメモリ２３に書き戻す場合の処理手順を表すものである。なお、サブプロセッサ２４−１〜２４−ｎにおける各動作はすべて同様であるので、ここでは、代表してサブプロセッサ２４−１について説明する。 Next, parallel processing operations performed by the sub-processors 24 in the multiprocessor unit 20 will be described with reference to FIG. FIG. 6 shows a processing procedure when a certain sub processor 24 performs predetermined arithmetic processing on data read from the main memory 23 and then writes the processing result back to the main memory 23. Since the operations of the sub processors 24-1 to 24-n are all the same, only the sub processor 24-1 will be described here.

サブプロセッサ２４−１では、図示しない制御部が、バスアービタ２７からバス使用権を得て、演算器２４Ｃ−１の演算処理に必要なデータをメインメモリ２３の所定のメモリロケーションＭＭＬからローカルメモリ２４Ａ−１の所定のメモリロケーションＳＭＬに転送する（ステップＳ１０１）。このデータは、図３（Ａ）に示したようなパケットデータとして転送される。 In the sub-processor 24-1, a control unit (not shown) obtains the right to use the bus from the bus arbiter 27 and transfers data necessary for the arithmetic processing of the arithmetic unit 24C-1 from a predetermined memory location MML of the main memory 23 to the local memory 24A-. Transfer to one predetermined memory location SML (step S101). This data is transferred as packet data as shown in FIG.

サブプロセッサ２４−１の圧縮伸長器２４Ｂ−１は、ローカルメモリ２４Ａ−１に格納されたデータのパケットヘッダＰＨを解析してデータ伸張の要否を判定する（ステップＳ１０２）。この結果、このデータが圧縮されたものであったときには、伸長処理を行うべきと判定し（ステップＳ１０３；Ｙ）、その圧縮モード（図３（Ｂ））に対応した方式で伸長処理を行う（ステップＳ１０４）。例えば、パケットヘッダＰＨの圧縮モードＣＭが「０００１」であったとすると、圧縮伸長器２４Ｂ−１は、そのデータが「Ａ方式」で圧縮されたものであると判定し、その「Ａ方式」に対応した伸長方式「ａ方式」で伸長処理を行い、その伸長されたデータをローカルメモリ２４Ａ−１に書き戻す。一方、圧縮モードＣＭが「００００」であったときには、非圧縮データであると判定し、伸長処理を行わず（ステップＳ１０３；Ｎ）、直接、ステップＳ１０５の演算処理に移行する。なお、本実施の形態では、パケットヘッダＰＨのデータタイプＤＴの情報は使用しないので、不要である。このデータタイプＤＴは、後述する変形例において使用される。 The compression / decompression unit 24B-1 of the sub processor 24-1 analyzes the packet header PH of the data stored in the local memory 24A-1, and determines whether or not data expansion is necessary (step S102). As a result, when this data is compressed, it is determined that the decompression process should be performed (step S103; Y), and the decompression process is performed by a method corresponding to the compression mode (FIG. 3B) ( Step S104). For example, if the compression mode CM of the packet header PH is “0001”, the compression / decompression unit 24B-1 determines that the data is compressed by the “A method”, and sets the “A method”. The decompression process is performed by the corresponding decompression method “a method”, and the decompressed data is written back to the local memory 24A-1. On the other hand, when the compression mode CM is “0000”, it is determined that the data is uncompressed data, and the decompression process is not performed (step S103; N), and the process directly proceeds to the operation process of step S105. In the present embodiment, the information of the data type DT of the packet header PH is not used and is unnecessary. This data type DT is used in a modification described later.
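Steps S102 to S104 amount to a dispatch on the compression mode in the packet header. In the sketch below, the mode codes “0000” and “0001” come from the text, while the use of `zlib` as the decompression routine for the A method is only a stand-in: the specification does not say what the A method is. The function and table names are hypothetical.

```python
import zlib

CM_OFF = 0b0000        # uncompressed data
CM_METHOD_A = 0b0001   # data compressed by the A method

# Stand-in table: zlib plays the role of decompression method "a" here.
DECOMPRESSORS = {CM_METHOD_A: zlib.decompress}


def prepare_for_computation(compression_mode: int, payload: bytes) -> bytes:
    """Return the data the arithmetic unit should operate on (steps S102 to S104)."""
    if compression_mode == CM_OFF:
        # Step S103; N: no decompression, go straight to the computation.
        return payload
    try:
        decompress = DECOMPRESSORS[compression_mode]
    except KeyError:
        raise ValueError(f"unsupported compression mode: {compression_mode:04b}")
    # Step S104: decompress with the method that matches the compression mode.
    return decompress(payload)


raw = b"sample data " * 8
restored = prepare_for_computation(CM_METHOD_A, zlib.compress(raw))
```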

演算器２４Ｃ−１は、ローカルメモリ２４Ａ−１の所定のメモリロケーションＳＭＬに格納されたデータに対して、演算処理を行う（ステップＳ１０５）。この演算処理は、予めメインプロセッサ２１によって各サブプロセッサ２４−１〜２４−ｎに割り当てられたものであり、例えば、画像処理、音響処理および数値演算等である。演算器２４Ｃ−１は、演算結果を、再びローカルメモリ２４Ａ−１の所定のメモリロケーションＳＭＬに格納する。 The computing unit 24C-1 performs arithmetic processing on the data stored in the predetermined memory location SML of the local memory 24A-1 (Step S105). This arithmetic processing is assigned in advance to each of the sub processors 24-1 to 24-n by the main processor 21, and is, for example, image processing, acoustic processing, numerical calculation, or the like. The calculator 24C-1 stores the calculation result again in a predetermined memory location SML of the local memory 24A-1.

Next, the compression/decompression unit 24B-1 determines whether it is appropriate to compress the operation result data stored in the local memory 24A-1 (step S106). If the data transfer amount is large and compression is needed, it determines that compression is appropriate (step S107; Y); if the data transfer amount is small and compression is not needed, it determines that leaving the data uncompressed is appropriate (step S107; N). This determination is made based on, for example, a comparison between the required data transfer time Tt and the required computation time Tp. Here, the required data transfer time Tt is the time needed to send the data onto the system bus 29 as is, without compression, and transfer it to the main memory 23. The required computation time Tp is the time needed for computation in the sub-processor 24 (for example, 24-2) that holds the bus use right next. Because the relationship between Tt and Tp can be estimated statically, whether to compress on output can be set when the sub-processor 24-1 is started, so that compression is skipped when it would bring no benefit. For example, when the amount of data to transfer is small to begin with but the computation takes a long time, so that no other processor is ready to transfer even after one processor has finished its transfer, compressing the data before transfer brings no benefit, and outputting the data as is does not affect overall performance. Therefore, for example, when Tp < Tt, skipping the compression process avoids wasting energy.

圧縮伸長器２４Ｂ−１は、圧縮処理の妥当性ありと判定したときは、ローカルメモリ２４Ａ−１のデータに対して圧縮処理を実行し、その結果を、一旦、ローカルメモリ２４Ａ−１に書き戻す（ステップＳ１０８）。一方、圧縮処理の妥当性なしと判定したときは、圧縮せず、非圧縮データ（ローカルメモリ内の生データ）を選択する（ステップＳ１１２）。 When the compression / decompression unit 24B-1 determines that the compression process is valid, the compression / decompression unit 24B-1 executes the compression process on the data in the local memory 24A-1, and temporarily writes the result back to the local memory 24A-1. (Step S108). On the other hand, when it is determined that the compression process is not valid, uncompressed data (raw data in the local memory) is selected without compression (step S112).

圧縮処理を実行した場合、圧縮伸長器２４Ｂ−１はさらに、圧縮結果を評価する（ステップＳ１０９）。この評価は、例えば圧縮前のデータ量と圧縮後のデータ量とを比較することで行う。具体的には、圧縮前データ量に対する圧縮後データ量の比である圧縮率が所定のしきい値を越える場合には、圧縮効果が大きいと判定し（ステップＳ１１０；Ｙ）、圧縮データを選択する（ステップＳ１１１）。一方、圧縮率が所定のしきい値以下の場合には、圧縮効果が少ないと判定し（ステップＳ１１０；Ｎ）、非圧縮データを選択する（ステップＳ１１２）。 When the compression process is executed, the compression / decompression unit 24B-1 further evaluates the compression result (step S109). This evaluation is performed, for example, by comparing the data amount before compression with the data amount after compression. Specifically, when the compression rate, which is the ratio of the amount of data after compression to the amount of data before compression, exceeds a predetermined threshold, it is determined that the compression effect is large (step S110; Y), and the compressed data is selected. (Step S111). On the other hand, if the compression rate is equal to or lower than the predetermined threshold value, it is determined that the compression effect is small (step S110; N), and uncompressed data is selected (step S112).

The compression/decompression unit 24B-1 then generates a data packet (FIG. 3(A)) by adding a packet header PH to a data part DP containing the data selected above (step S113), and sends the packet onto the system bus 29. As the compression mode, the packet header PH contains information on whether compression was performed and, if so, information on the compression method used. Through this compression mode, the transfer destination, the main memory 23, can be notified of the compression result.
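The role of the compression mode in the packet header can be illustrated with a small sketch. The header layout used here (one mode byte followed by a four-byte payload length) and the mode codes `CM_NONE` and `CM_ZLIB` are hypothetical; the actual bit assignments are those of FIG. 3, which this sketch does not reproduce.

```python
import struct
import zlib

# Hypothetical compression-mode codes (not the bit values of FIG. 3(B)).
CM_NONE = 0
CM_ZLIB = 1

HEADER_FORMAT = ">BI"  # assumed layout: mode (1 byte), payload length (4 bytes)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_packet(payload: bytes, compression_mode: int) -> bytes:
    """Prepend a packet header PH to the data part DP."""
    return struct.pack(HEADER_FORMAT, compression_mode, len(payload)) + payload


def read_packet(packet: bytes) -> bytes:
    """Parse the header and return the original data, decompressing if flagged."""
    mode, length = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    data = packet[HEADER_SIZE:HEADER_SIZE + length]
    if mode == CM_ZLIB:
        return zlib.decompress(data)
    if mode == CM_NONE:
        return data
    raise ValueError(f"unknown compression mode {mode}")
```

Because the mode travels with the data, a receiver can skip decompression entirely whenever the sender chose the uncompressed data.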

In this way, the operation result data computed by the arithmetic unit 24C-1 is transferred, either as is or in compressed form, from the local memory 24A-1 to the main memory 23 and stored in a predetermined memory location MML in the main memory 23 (step S114).

Before describing the characteristic operation of the multiprocessor unit 20 of the present embodiment, a comparative example will first be described with reference to FIG. 10.

FIG. 10 shows the timing of the parallel processing operation in the comparative example. This comparative example assumes that each sub-processor transfers data without performing compression. The horizontal axis represents time, and the vertical axis represents the degree of parallelism (that is, the number of sub-processors engaged in parallel processing).

In this comparative example, three sub-processors A, B, and C perform parallel processing. At time t0, input data is transferred from the main memory to the local memory of sub-processor A, which is connected to the system bus. At time t2, the transfer of input data to sub-processor A is completed and sub-processor A starts its computation; at the same time, the data transfer (input) to sub-processor B starts. Sub-processor A finishes its computation at time t3, but because the data transfer to sub-processor B is still in progress at that point, sub-processor A must wait to start its data transfer (output).

At time t4, the data input to sub-processor B is completed and sub-processor B starts its computation; at the same time, the data transfer (input) to sub-processor C starts. Sub-processor B finishes its computation at time t5, but because the data transfer to sub-processor C is still in progress at that point, sub-processor B must wait to start its data transfer (output).

At time t6, the data input to sub-processor C is completed and sub-processor C starts its computation. Because the bus becomes free at this point, the output result of sub-processor A is transferred. At time t7, sub-processor C finishes its computation. When the output from sub-processor A is completed at time t8, sub-processor B starts its data transfer (output). When the output from sub-processor B is completed at time t10, sub-processor C starts its data transfer (output). The output from sub-processor C is completed at time t12.

Because the sub-processors share the main memory in this way, data transfers between the main memory and the local memories in sub-processors A, B, and C occur sequentially, as shown in FIG. 10. In this comparative example, data transfer between the memories is the bottleneck (the dominant factor) between t3 and t4 and between t5 and t6, and the sub-processors are left waiting. It follows that increasing the degree of parallelism of the sub-processors any further would not improve the processing capability.

In contrast, in the present embodiment, each sub-processor is provided with a compression/decompression unit 24B-1, which provides the functions of compressing data to be transferred on the system bus 29 and decompressing data received from the bus. With these functions, the data transferred between the main memory 23 and the local memory 24A in the sub-processor 24 is compressed, so that the data transfer time between the memories no longer becomes the bottleneck of the entire system. The parallel processing in the present embodiment is described below with reference to FIG. 7.

FIG. 7 shows the timing of the parallel processing operation in the multiprocessor unit 20 of the present embodiment. This example assumes that the sub-processors 24-1 to 24-4 always transfer compressed data. The horizontal axis represents time, and the vertical axis represents the degree of parallelism.

In FIG. 7, at time t0, data is transferred (input) from the main memory to the sub-processor 24-1 connected to the system bus. When the data transfer to the sub-processor 24-1 is completed at time t1, the sub-processor 24-1 starts decompressing the data and, at the same time, the data transfer to the sub-processor 24-2 starts.

At time t2, the data transfer to the sub-processor 24-2 is completed and the sub-processor 24-2 starts decompression. At this point the sub-processor 24-1 is still in the middle of its signal processing and the system bus 29 is free, so data transfer from the main memory 23 to the sub-processor 24-3 starts.

At time t3, the data transfer to the sub-processor 24-3 is completed and the sub-processor 24-3 starts decompression. At this point the sub-processor 24-1 is still in the middle of its signal processing and the system bus 29 is free, so data transfer from the main memory 23 to the sub-processor 24-4 starts.

At time t4, the data transfer to the sub-processor 24-4 is completed and the sub-processor 24-4 starts decompression. By this point the sub-processor 24-1 has finished everything up to and including compression of its data, so the operation result data is transferred from the sub-processor 24-1 to the main memory 23.

At time t5, the transfer of operation result data from the sub-processor 24-1 to the main memory 23 is completed, and the transfer of operation result data from the sub-processor 24-2 to the main memory 23 starts. At time t6, the transfer from the sub-processor 24-2 is completed and the transfer from the sub-processor 24-3 starts. At time t7, the transfer from the sub-processor 24-3 is completed and the transfer from the sub-processor 24-4 starts. At time t8, the transfer of operation result data from the sub-processor 24-4 is completed.

In the parallel processing of the present embodiment, data travels over the system bus in compressed form, so the transfer time is shorter. On the other hand, decompression is added as preprocessing before the computation and compression is added as postprocessing after it, so the overall computation time is longer. As a result, the bottleneck of the multiprocessor unit 20 shifts from data transfer time to computation time. However, a performance loss caused by longer computation time can readily be offset by increasing the degree of parallelism of the sub-processors. Viewed as a whole, therefore, the processing capability of the multiprocessor unit 20 improves.
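The two timing descriptions can be reproduced with a small model of a single shared bus. This is a sketch under stated assumptions: the function name `bus_makespan` is hypothetical, the time marks t0, t1, ... are taken to be evenly spaced, all input transfers are issued back to back before any output transfer, and every sub-processor has the same transfer and computation times. Under those assumptions the text implies transfers of 2 units and a computation of 1 unit for the comparative example, and transfers of 1 unit with 3 units of decompression, computation, and compression for FIG. 7.

```python
def bus_makespan(n: int, t_in: float, t_compute: float, t_out: float) -> float:
    """Time at which the last output transfer finishes on one shared bus.

    Assumed policy: inputs for sub-processors 1..n are sent back to back,
    then outputs are sent in the same order as soon as both the bus and
    the result are available.
    """
    bus_free = 0.0
    result_ready = []
    for _ in range(n):
        bus_free += t_in                          # input transfer occupies the bus
        result_ready.append(bus_free + t_compute)  # processing starts on arrival
    for ready in result_ready:
        start = max(bus_free, ready)              # wait for the bus and the result
        bus_free = start + t_out                  # output transfer occupies the bus
    return bus_free


comparative = bus_makespan(n=3, t_in=2, t_compute=1, t_out=2)  # FIG. 10 case
embodiment = bus_makespan(n=4, t_in=1, t_compute=3, t_out=1)   # FIG. 7 case
```

In this model the comparative case needs 12 time units for three sub-processors, while the compressed case needs 8 units for four, which matches the end times t12 and t8 given in the descriptions.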

Each sub-processor is expected to perform a variety of computations. For some computations the amount of data is small and the transfer time does not become a bottleneck. In addition, some data gains little from compression; in such cases system resources are wasted to the extent that the receiving side has to perform decompression.

FIG. 8 shows the operation timing when each of the sub-processors 24-1 to 24-n of the multiprocessor unit 20 in the present embodiment determines that data compression is unnecessary.

In this example, data is transferred from the main memory to the sub-processor 24-1 at time t0. When the data transfer to the sub-processor 24-1 is completed at time t1, the sub-processor 24-1 starts decompressing the data. At the same time, the data transfer to the sub-processor 24-2 starts. At time t2, the data transfer to the sub-processor 24-2 is completed and the sub-processor 24-2 starts decompression. At this point the sub-processor 24-1 is still in the middle of its signal processing and the system bus is free, so the data transfer to the sub-processor 24-3 starts. At time t3, the data transfer to the sub-processor 24-3 is completed and the sub-processor 24-3 starts decompression. By this point the sub-processor 24-1 has finished its computation, so in the compression necessity determination it decides not to compress the data and immediately starts transferring the operation result. At time t4, the data transfer from the sub-processor 24-1 has already been completed. The sub-processor 24-2 therefore decides, in the compression necessity determination, not to compress the data and immediately starts transferring the operation result. At time t5, the data transfer from the sub-processor 24-2 has already been completed. The sub-processor 24-3 therefore decides, in the compression necessity determination, not to compress the data and immediately starts transferring the operation result. At time t6, the data transfer from the sub-processor 24-3 is completed.

By determining in advance whether compression is effective and deciding whether the processing is needed, the problem described above (wasting sub-processor resources) can be avoided. Specifically, when the amount of data to be transferred is small to begin with, one sub-processor may finish its transfer while the other sub-processors are not yet ready to transfer (they are still computing). In such a case, compressing the data before transfer has no benefit, and outputting the data as is does not affect overall performance. For example, at t4 in FIG. 8, the sub-processor 24-1 has finished its transfer, and the next transfer cannot start until a computation ends. This means that the system bottleneck is not in the transfer process, so compression is unnecessary. Because the relationship between computation time and data transfer time can be estimated statically, whether the output stage performs compression can be set when the processor is started; skipping compression that would bring no benefit prevents processor resources from being wasted.
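One way such a static estimate could look is sketched below. The criterion is an assumption made for illustration, not a rule stated in the text: with n sub-processors fed one after another over a single bus, the first result has to wait for the bus only if the remaining (n - 1) input transfers take longer than one computation. The function name `output_compression_enabled` and its parameters are hypothetical.

```python
def output_compression_enabled(n: int, t_in: float, t_compute: float) -> bool:
    """Decide at start-up whether the output stage should compress.

    Assumed criterion: the bus is the bottleneck when the first
    sub-processor finishes computing before the remaining (n - 1)
    input transfers have released the bus.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) * t_in > t_compute


# Comparative-example timings (transfer 2 units, computation 1 unit): bus-bound.
bus_bound = output_compression_enabled(n=3, t_in=2, t_compute=1)
# FIG. 8 timings (transfer 1 unit, computation 2 units): compute-bound.
compute_bound = output_compression_enabled(n=3, t_in=1, t_compute=2)
```

With the durations inferred from the timing descriptions, the flag is set for the comparative example and cleared for the FIG. 8 case, which agrees with the behavior described for each.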

In general, in data compression based on predictive coding, the stronger the correlation between signals, the higher the compression efficiency; when the correlation is weak, a compression method that transmits differences may not achieve a sufficient compression effect. For this reason, in the present embodiment, the signal to be output to the system bus is compressed and the compression result is then evaluated; if the compression rate does not exceed a certain threshold, the signal as it was before compression is sent to the bus. In this case the amount of transferred data is not reduced, and the source sub-processor has wasted the energy spent on compression. On the other hand, when that signal becomes the input of another sub-processor, no decompression is needed, so further wasteful energy consumption in that sub-processor can be prevented.
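The dependence of difference-based compression on signal correlation can be seen in a short experiment. This is only an illustration: previous-sample prediction on 16-bit samples and `zlib` as the back-end coder are assumptions, as are the helper names `delta_encode` and `compressed_size` and the two synthetic signals.

```python
import random
import struct
import zlib


def delta_encode(samples):
    """Predict each sample from the previous one and keep the difference (mod 2**16)."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append((sample - previous) & 0xFFFF)
        previous = sample
    return differences


def compressed_size(values):
    """Pack values as unsigned 16-bit integers and return the zlib-compressed size."""
    raw = struct.pack(f">{len(values)}H", *values)
    return len(zlib.compress(raw))


ramp = list(range(4000))                              # strongly correlated signal
rng = random.Random(0)
noise = [rng.randrange(65536) for _ in range(4000)]   # uncorrelated signal

ramp_gain = compressed_size(ramp) - compressed_size(delta_encode(ramp))
noise_size = compressed_size(delta_encode(noise))
```

For the ramp, the differences are almost all identical and compress far better than the raw samples; for the noise, the differences are as unpredictable as the input and the compressed size stays close to the original 8000 bytes.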

The algorithm and implementation method for data compression and decompression in the compression/decompression unit 24B-1 depend on the configuration of the multiprocessor unit 20 and can be realized in either hardware or software. With a hardware implementation, compression and decompression take little time, so the number of processors that must be added to hide the compression and decompression processing (that is, to keep it from affecting performance), in other words the added degree of processor parallelism, can be kept small. With a software implementation, the optimal data compression method can be selected according to the characteristics of the transferred data, and the computing performance of the system can be improved without changing the hardware.

As described above, the information processing apparatus according to the present embodiment provides the following effects.
(1) Because the sub-processors can be used efficiently, the computing capability per unit of operating frequency can be improved without requiring hardware changes.
(2) Compared with the comparative example, equivalent computing capability can be achieved at a lower operating frequency, so the operating voltage can be lowered and the computing capability per unit of power consumption can be improved.
(3) Because there is no particular restriction on how data compression and decompression are implemented, either a hardware or a software implementation can be chosen according to system requirements.
(4) By providing a mechanism that monitors the compression result and transfers the uncompressed data to the bus when a sufficient compression result is not obtained, decompression at the input stage of the next sub-processor becomes unnecessary when that data is its input, so wasteful energy consumption can be avoided.

The present invention has been described above with reference to an embodiment, but the invention is not limited to this embodiment and can be modified in various ways. For example, in the present embodiment, data compression is executed once, the compression result is evaluated, and depending on that result the original (uncompressed) data may be transferred instead. In place of such an after-the-fact evaluation, the evaluation may be performed in advance (before compression), as shown for example in FIG. 9. In FIG. 9, the processing of steps S201 to S207 and S212 to S215 is the same as that of steps S101 to S107 and S111 to S114 in the above embodiment (FIG. 6), respectively, and its description is omitted where appropriate.

In the modification shown in FIG. 9, after the appropriateness of data compression is determined in step S207, the type of the data to be compressed is determined (step S208) and processing is performed according to the result. Specifically, the kind of data (still image, moving image, audio, text, and so on) is determined from the bit information read from the data type DT of the packet header PH when the data was input. If the data turns out to be of a kind for which compression has little effect, such as text data (step S209; Y), it is decided not to compress, and the uncompressed data is selected (step S213). If the data is of a kind for which compression has a large effect, such as still image, moving image, or audio data (step S209; N), it is decided to compress; a compression method is then selected according to the kind of data (step S210), and data compression is executed using that method (step S211). The compressed data is selected as the transfer data (step S212), a packet header PH is added to it, and the packet is sent onto the system bus 29 (steps S214 and S215). At this time, the bit information shown in FIG. 3(B) is set in the compression mode CM of the packet header PH according to the compression method, and the bit information shown in FIG. 3(C) is set in the data type DT of the packet header PH according to the kind of data.
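The type-based decision of steps S208 to S213 can be sketched as a lookup from data type to compression method. Everything concrete here is assumed for illustration: the type names, the choice of `zlib` and `bz2` as stand-ins for type-specific codecs, the function name `choose_and_compress`, and the rule that unknown types are sent uncompressed.

```python
import bz2
import zlib
from typing import Callable, Dict, Optional, Tuple

# Hypothetical table: data type -> (method name, compressor).
# zlib and bz2 merely stand in for codecs suited to each kind of data.
COMPRESSORS: Dict[str, Tuple[str, Callable[[bytes], bytes]]] = {
    "still_image": ("zlib", zlib.compress),
    "moving_image": ("zlib", zlib.compress),
    "audio": ("bz2", bz2.compress),
}

# Types treated as gaining little from compression (per the text: e.g. text data).
LOW_GAIN_TYPES = {"text"}


def choose_and_compress(data: bytes, data_type: str) -> Tuple[bytes, Optional[str]]:
    """Return (payload, method name or None), loosely following steps S208-S213."""
    # Steps S208-S209: decide from the data type alone, before compressing.
    if data_type in LOW_GAIN_TYPES or data_type not in COMPRESSORS:
        return data, None                      # Step S213: send uncompressed
    method, compress = COMPRESSORS[data_type]  # Step S210: pick method by type
    return compress(data), method              # Steps S211-S212
```

The returned method name plays the part of the value that would be written to the compression mode CM, so the receiver knows which decompressor, if any, to apply.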

According to this modification, whether compression is needed is determined in advance, which avoids the situation that can arise in the above embodiment, where compression is performed once but the uncompressed data is transferred in the end. Wasteful energy consumption in the source sub-processor can therefore be avoided.

The above embodiment has been described for data transfer between the main memory and the local memories in the sub-processors, but the invention can also be applied to data transfer between sub-processors (between local memories).

FIG. 1 is a block diagram showing the configuration of an information processing system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the storage areas of the main memory in the multiprocessor unit of FIG. 1 and of the local memories in the sub-processors, and the configuration of a key management table for realizing exclusive access control for the main memory.
FIG. 3 is a diagram showing an example of the format of packet data transferred between the main memory and the local memory of a sub-processor.
FIG. 4 is a diagram showing the structure of a software cell used for exchanges between information processing apparatuses in the information processing system of FIG. 1.
FIG. 5 is a diagram showing an example of a status reply command transmitted using the software cell of FIG. 4.
FIG. 6 is a flowchart showing the parallel processing of the plurality of sub-processors shown in FIG. 1.
FIG. 7 is a timing diagram showing a timing example of the parallel processing operation of the plurality of sub-processors shown in FIG. 1.
FIG. 8 is a timing diagram showing another timing example of the parallel processing operation of the plurality of sub-processors shown in FIG. 1.
FIG. 9 is a flowchart showing a modification of the parallel processing operation of the plurality of sub-processors shown in FIG. 1.
FIG. 10 is a timing diagram showing a timing example of the parallel processing operation of a plurality of processors in a comparative example.

Explanation of symbols

1: network; 2, 3, 4: information processing apparatus; 20: multiprocessor unit; 21: main processor; 22: memory controller; 23: main memory; 24-1 to 24-n: sub-processors; 21A, 24A-1 to 24A-N: local memories; 24B-1 to 24B-N: compression/decompression units; 24C-1 to 24C-N: arithmetic units; 27: bus arbiter; 28: network connection unit; 29: system bus; PH: packet header; DP: data part; CM: compression mode; DT: data type.

Claims

An information processing apparatus comprising:
a bus; and
a plurality of arithmetic processors commonly connected to the bus,
wherein each arithmetic processor includes:
an arithmetic unit; and
a compression/decompression unit having a function of compressing operation result data produced by the arithmetic unit and a function of decompressing input data taken in via the bus.

The information processing apparatus according to claim 1, wherein the compression/decompression unit determines whether to perform compression processing on the operation result data produced by the arithmetic unit and, according to the determination result, sends the operation result data onto the bus either as is or after compression processing.

The information processing apparatus according to claim 2, wherein the compression/decompression unit determines whether to perform compression processing on the operation result data to be sent onto the bus based on the magnitude relationship between the time required for computation by the arithmetic unit and the time required for data transfer on the bus.

An information processing apparatus in which the compression/decompression unit determines whether to perform compression processing on the operation result data to be sent onto the bus in consideration of the type of the operation result data.

The information processing apparatus according to claim 2, wherein the compression/decompression unit first compresses the operation result data and, based on the result of that compression processing, sends either the compressed operation result data or the operation result data as it was before compression onto the bus.

The information processing apparatus according to claim 2, wherein the compression / decompression unit further selects a compression processing method when performing compression processing on the operation result data.

The information processing apparatus according to claim 6, wherein the compression / decompression unit selects the compression processing method based on a type of the operation result data.

The information processing apparatus according to claim 1, wherein the compression/decompression unit determines whether to perform decompression processing on the input data and, according to the determination result, inputs the input data to the arithmetic unit either as is or after decompression processing.

The information processing apparatus according to claim 8, wherein the compression / decompression unit further selects a decompression processing method when performing decompression processing on the input data.

The information processing apparatus according to claim 1, wherein the plurality of arithmetic processors include a main processor and a plurality of sub-processors.

An information processing method applied to an information processing apparatus including a bus and a plurality of arithmetic processors commonly connected to the bus, the method comprising, in each arithmetic processor:
selectively decompressing input data taken in via the bus;
performing arithmetic processing based on the decompressed input data; and
selectively compressing operation result data resulting from the arithmetic processing and sending it onto the bus.

An information processing system comprising a plurality of information processing apparatuses connected to a network,
wherein each information processing apparatus includes a bus and a plurality of arithmetic processors commonly connected to the bus, and
each arithmetic processor includes:
an arithmetic unit; and
a compression/decompression unit having a function of compressing operation result data produced by the arithmetic unit and a function of decompressing input data taken in via the bus.