JP4403009B2

JP4403009B2 - Microprocessor

Info

Publication number: JP4403009B2
Application number: JP2004136382A
Authority: JP
Inventors: 和彦岩永
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2004-04-30
Filing date: 2004-04-30
Publication date: 2010-01-20
Anticipated expiration: 2024-04-30
Also published as: JP2005316887A

Description

The present invention relates to a microprocessor that, even if any of the plurality of processor elements constituting a SIMD type processor is defective, performs predetermined arithmetic processing in parallel without problems using a simple configuration, and that is also suited to scaling (magnification change) processing.

In recent years, miniaturization of manufacturing processes has steadily increased the integration density of LSIs, making it possible to increase the number of arithmetic units (processor elements) in SIMD (Single Instruction Stream, Multiple Data Stream) type processors as well.
In a SIMD type processor, a plurality of processor elements (hereinafter, PEs) can execute the same arithmetic operation on a plurality of data items simultaneously with a single instruction. Such PE arrays are frequently used in applications where the operation is the same but the amount of data is very large, for example image processing in digital copiers.
In typical image processing on a SIMD type processor, a plurality of PEs are arranged along the main scanning direction, and high-speed processing is achieved by executing the same operation on a plurality of data items simultaneously.
Even in a SIMD type processor configured for parallel processing with a plurality of PEs, the failure of a single PE renders the entire processor faulty. Conventionally, therefore, the entire processor has been repaired by, for example, providing redundant PEs and substituting a normal PE for the failed one.

Specifically, the following techniques for repairing the entire processor are known. First, there is a redundancy method for highly parallel processors in which, when a basic element processor has failed and a spare element processor is substituted for it to reconfigure the basic array interconnection structure, the spare processor is connected to the basic element processors adjacent to the failed one (see, for example, Patent Document 1).

Second, there is a redundancy switching device for parallel processors that assigns the signal lines controlling the run/stop state of each PE to a redundant replacement PE rather than to a defective PE, so that a parallel processor containing a defective PE operates normally (see, for example, Patent Document 2).

Third, there is a parallel machine in which a large number of PEs are interconnected for parallel processing. Its basic element is a processing module (PM) containing a plurality of combinations of one PE and a plurality of connection switching means that connect the PE to the circuit or disconnect it and pass signals through. The machine consists of such identically structured processing modules and one or more wires linking adjacent processing modules. Faulty PEs are bypassed in units of PMs, and the topology that existed before the fault is reproduced (see, for example, Patent Document 3).

Fourth, there is a data processing device comprising n+1 storage devices; n arithmetic devices; n registers, each holding information on whether a storage device is defective; a switching signal generation circuit that outputs a high-level switching signal to the next-stage circuit and to the switching circuit of its own stage when the register indicates a defective storage device or when a high-level switching signal is received from the preceding stage; and a switching circuit that connects the data input/output line of an arithmetic device to the input/output terminal of its storage device while the switching signal is low, and reconnects that line to the input/output terminal of the next-stage storage device when the switching signal goes high (see, for example, Patent Document 4).

Fifth, there is a semiconductor device having a first data communication path connecting one processor unit to another processor unit adjacent on its first side, and a second data communication path connecting the processor unit to another processor unit adjacent on its second side. Each processor unit has an invalidation control circuit and a bypass circuit that, when a failed processor unit is invalidated by operation of the invalidation control circuit, outputs data input from the first data communication path to the second data communication path (see, for example, Patent Document 5).

Sixth, there is a method of repairing an integrated circuit in which an additional, identical configuration group is assigned to one or more configuration groups. A multiplexer placed ahead of the inputs of the configuration groups can connect the input bus of one configuration group to the following configuration group, and a multiplexer placed after the outputs of the configuration groups can receive the output bus of one of the following configuration groups. When one configuration group fails, the multiplexers replace the failed group with the following group, and the last group with the additional group (see, for example, Patent Document 6).

Patent Document 1: Japanese Patent Laid-Open No. 9-22400
Patent Document 2: Japanese Patent Laid-Open No. 9-288652
Patent Document 3: Japanese Patent No. 3005243
Patent Document 4: Japanese Patent Laid-Open No. 2000-148998
Patent Document 5: Japanese Patent Laid-Open No. 2002-169787
Patent Document 6: Japanese Translation of PCT Application No. 2001-527218

As described above, in the conventional examples of Japanese Patent Laid-Open No. 9-22400, Japanese Patent Laid-Open No. 9-288652, and Japanese Patent No. 3005243, redundant PEs are provided to replace defective processors, and when a PE has failed the parallel processor is repaired by assigning the signal that identifies the PE to a redundant PE. In particular, in the example of Japanese Patent Laid-Open No. 9-288652, data is passed to the outside of the processor through three-state buffers driving an external output bus, and the buffer output enable signal itself is switched between the failed PE and the redundant PE.
These methods are effective when the number of PEs is small. However, because redundant PEs must be provided in advance, a large PE count means a very large number of control lines to be switched, along with increased circuit area, more complex manufacturing, and higher manufacturing cost.

In the conventional examples of Japanese Patent Laid-Open No. 2000-148998, Japanese Patent Laid-Open No. 2002-169787, and Japanese Translation of PCT Application No. 2001-527218, each PE has a multiplexer shared between the output signal of an adjacent PE and the signal the PE itself outputs, and a failed PE repairs the parallel processor by bypassing the adjacent PE's output signal to its own output. Japanese Patent Laid-Open No. 2002-169787 says nothing on the point, but in the example of Japanese Patent Laid-Open No. 2000-148998, data exchange with the outside of the processor is handled by a shift register configuration.
These configurations are simple, but because they rely on a shift register, even a transfer that involves only PEs numbered above a particular PE requires shifting data through the PEs numbered below it as well, which lengthens the transfer time.

In view of the above problems, a first object of the present invention is to provide a microprocessor that, even when a SIMD type processor contains a defective PE, can always repair the processor and transfer data correctly and promptly with a simple configuration. A second object is to provide a microprocessor that, in image processing, can perform scaling at the same time as it repairs the processor.

To achieve the above objects, the present invention provides a microprocessor including a SIMD type processor having a plurality of processor elements for processing a plurality of data items, characterized in that: each processor element is provided with a defect flag, as a register separate from its general-purpose registers, indicating whether that processor element is defective, the defect flag being settable with individual data for each processor element by an instruction; an ID register is provided that stores data indicating the number of its own processor element and that, once defective processor elements have been identified, stores a number incremented while skipping the flagged processor elements; a data transfer port is provided for accessing the general-purpose registers and the defect flag from outside; a data transfer device for transferring data between the general-purpose registers of each processor element and an external memory is connected to the data transfer port; and the data transfer device suppresses the data transfer according to the values of the defect flag and the ID register.
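The ID numbering described above can be illustrated with a small software model. This is only a sketch, not the circuit of the invention: the function name `assign_ids`, the zero-based numbering, and the use of `None` for a defective PE are assumptions made for the example.

```python
def assign_ids(defect_flags):
    """Return one logical ID per PE, skipping PEs whose defect flag is set.

    defect_flags: sequence of 0/1 values, 1 meaning "defective".
    Assumption: IDs start at 0 and a defective PE receives None.
    """
    ids = []
    next_id = 0
    for flag in defect_flags:
        if flag:
            ids.append(None)      # defective PE is left out of the numbering
        else:
            ids.append(next_id)   # working PE takes the next consecutive ID
            next_id += 1
    return ids


if __name__ == "__main__":
    # PE 2 is defective, so PE 3 takes over logical ID 2.
    print(assign_ids([0, 0, 1, 0]))
```

With this numbering, the working PEs always present a gap-free sequence of IDs regardless of where the defective PEs sit in the array.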

The individual data set for each processor element is a scaling control bit used in image processing.

The value of the defect flag is set based on the result of a self-test of each processor element in the SIMD type processor.

According to the present invention, even if any of the processor elements is defective, its adverse effect is eliminated and the SIMD type processor as a whole is repaired. Consequently, even with a defective processor element present, the processor always performs the same arithmetic processing as a fully working one and transfers data (sends or receives) correctly.
Further, according to the present invention, even if any processor element is defective, the SIMD type processor can be repaired while scaling (including scaling at an arbitrary ratio) is reliably performed, which also improves functionality.

First, the technology underlying the present invention is described with reference to FIGS. 1 and 2. FIG. 1 is a block diagram showing its schematic configuration, and FIG. 2 is a block diagram showing part of its configuration in detail.

As shown in FIG. 1, the underlying technology is a microprocessor comprising a global processor (hereinafter, GP) 13, which is a SISD (Single Instruction stream, Single Data stream) type processor; a processor element group 15 constituting a SIMD (Single Instruction stream, Multiple Data stream) type processor 11 (see FIG. 2); and an external interface 17 connected to a memory controller 21 serving as a data transfer device.

The GP 13 controls the memory controller 21 and the processor element group 15. Specifically, the GP 13 activates the memory controller 21 to read image data or the like from the external memory 23 and store it, via the external interface 17, in a predetermined general-purpose register (an R register, described later) in the register file 31 of the processor element group 15. The GP 13 then controls the arithmetic array 33 to apply the same arithmetic processing to that data and store the result in another general-purpose register (R register) in the register file 31. Finally, the GP 13 activates the memory controller 21 to fetch the processed image data via the external interface 17 and write it to the external memory 23.

Note that the GP 13 does not necessarily use the same memory controller 21 and the same external memory 23 for both reading and writing image data. It may control a different memory controller to write the processed image data to a different external memory.

In outline, the memory controller 21 controls the external interface 17 based on predetermined commands from the GP 13. It sequentially reads the image data to be processed from a memory (single-port memory: RAM) 23 provided in, for example, the image input system (including a CCD sensor or the like) of a digital copier or facsimile machine, and temporarily stores the data, via the external interface 17, in the register file (R registers, described later) 31 of each processor element (hereinafter, PE) in the processor element group 15. It also stores the results of processing that data in the arithmetic array 33 back into the memory 23 via the external interface 17. The processed image data stored in the memory 23 is output to, for example, the image forming system of a digital copier or facsimile machine, which performs electrostatic latent image formation, development, transfer to paper, and so on.

That is, a memory controller 21 is provided for each of the external interfaces 17, which in turn are provided for each R register. Specifically, as shown in FIG. 4, the memory controller 21 includes a write buffer 71 used when writing image data to the memory (RAM) 23; a read buffer 73 used when reading image data from the memory 23; an external I/F control unit 75 that controls the external interface 17 to access the register file 31 of the PEs; a RAM control unit 77 that controls writing image data to the memory 23, reading it out, and writing the processed image data; and a sequence unit (hereinafter, SCU) 79 that controls the whole.

The memory controller 21 is connected to the register file 31 of the processor element group 15 constituting the SIMD type processor via data transfer ports (not shown; an output port and an input port) in the external interface 17, and controls data transfer from the register file 31 to the memory 23 and from the memory 23 to the register file 31. The registers controlled by the memory controller 21 are mapped into the I/O space, so reads and writes can be controlled in response to requests from the GP 13.

The output port of the external interface 17 of the SIMD type processor 11 is connected to the write buffer 71, and the input port of the external interface 17 is connected to the read buffer 73. The data transfer ports comprise independent input and output ports for even-numbered PEs and for odd-numbered PEs, so that the data of one even/odd pair of PEs can be accessed in a single cycle.

The data buses between the write buffer 71 and read buffer 73 and the memory 23 are each four PEs wide, so the data of four PEs can be accessed in one cycle. The data of one PE is 8 bits, so the data bus between the external interface 17 and the write buffer 71 and read buffer 73 is 16 bits wide (two PEs), and the bus between the memory controller 21 and the memory 23 is 32 bits wide (four PEs).

As a result, one data transfer between the memory 23 and the memory controller 21 suffices for every two data transfers between the data transfer port of the external interface 17 and the memory controller 21. The write buffer 71 of the memory controller 21 takes in the data output from the external interface 17 of the PEs twice and, once the data of four PEs has been stored, transfers it to the memory 23. The read buffer 73 splits the four PEs' worth of data read from the memory 23 into two transfers to the external interface 17 of the PEs.
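The width arithmetic above (8 bits per PE, 16 bits per port transfer, 32 bits per memory access) can be modeled in a few lines. This is a rough sketch only: the function names `port_word` and `pack_write_buffer` are hypothetical, and the byte order within each word is an assumption, since the text does not state it.

```python
def port_word(even_pe, odd_pe):
    """Form one 16-bit port transfer from an even/odd pair of 8-bit PE values.

    Assumption: the even PE occupies the low byte.
    """
    return ((odd_pe & 0xFF) << 8) | (even_pe & 0xFF)


def pack_write_buffer(port_words):
    """Combine pairs of 16-bit port transfers into 32-bit memory words.

    Two port transfers (four PEs) are collected before one memory write.
    Assumption: the first transfer occupies the low half of the word.
    """
    if len(port_words) % 2:
        raise ValueError("expected an even number of 16-bit port transfers")
    memory_words = []
    for i in range(0, len(port_words), 2):
        first, second = port_words[i], port_words[i + 1]
        memory_words.append(((second & 0xFFFF) << 16) | (first & 0xFFFF))
    return memory_words


if __name__ == "__main__":
    words = [port_word(0x01, 0x02), port_word(0x03, 0x04)]
    print([hex(w) for w in pack_write_buffer(words)])
```

The read buffer would perform the inverse operation, splitting each 32-bit word into two 16-bit port transfers.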

The SCU 79 is connected to a control bit port (not shown) of the external interface 17. During a data transfer it reads transfer suppression data through this port and controls the transfer accordingly.

Next, the specific configurations of the GP 13 and of the PEs constituting the SIMD type processor are described with reference to FIG. 2. As shown in FIG. 2, the GP 13 contains, first, a program RAM 41 that stores the program controlling each PE 40 and a data RAM 43 that stores operation data.

Second, the GP 13 contains a program counter PC that holds the program address; general-purpose registers G0, G1, G2, and G3 that hold data for arithmetic processing; a stack pointer SP that holds the address in the data RAM 43 used when registers are saved and restored; a link register LS that holds the caller's address during a subroutine call; registers LI and LN that hold the branch source address for an IRQ (interrupt signal input) and for an NMI (high-priority interrupt processing), respectively; and a processor status register P that holds the state of the GP 13.

Third, although not shown in detail, the GP 13 contains an instruction decoder for executing predetermined instructions, an ALU (arithmetic logic unit), a memory control circuit, an interrupt control circuit, an external I/O control circuit, and a GP operation control circuit.

When executing a PE instruction, the GP 13 uses an instruction decoder, a register file control circuit, and a PE operation control circuit (none shown) to control the register file 31 of each PE 40 in the processor element group 15, the arithmetic array 33 of each PE 40 (which includes the arithmetic logic unit described later), and the 7-to-1 multiplexer (hereinafter, 7to1MUX) 35 of each PE 40.

As shown in FIG. 2, each PE 40 has, first, for example twenty-four 8-bit registers (hereinafter, R registers) R0 to R23 that selectively and temporarily store, for example, one pixel of image data (Data) read from the memory 23, and for example eight 8-bit registers (likewise R registers) R24 to R31 that selectively and temporarily store one pixel of processed image data, as described later.

Each of the R registers R0 to R31 has one read port and one write port to the arithmetic logic unit 47 described later, and is accessed from the arithmetic array, also described later, via an 8-bit shared read/write bus B1. Of these, the R registers R0 to R23 additionally have a data transfer port for external access, and can exchange image data and other data with the external memory controller 21 and memory 23 via the external interface 17.

In particular, as noted above, the memory controller 21 can read or write any of the R registers R0 to R23 by outputting an address signal (Address), a clock signal (CLK), and a read/write control signal (RWB).

Each PE 40 has the R registers R0 to R31 described above, and the PEs are arranged in an array of, for example, 256. In that case the total number of R registers is 256 × 32 = 8192. An array of, for example, 1024 PEs may also be adopted.

Second, each PE 40 has the 7to1MUX 35, a shift and expand circuit (ShiftExpand; hereinafter, SE) 45, a 16-bit arithmetic logic unit (hereinafter, ALU) 47, an A register 48, and an F register 49. Although not shown in detail, each PE 40 also has an 8-bit condition register that enables or disables execution of operations on a per-PE basis, so that specific PEs 40 can be selected as the targets of an operation.

The arithmetic array in this example uses an accumulator architecture, but a general-purpose register architecture or the like may of course be used instead.

The 7to1MUX 35 is connected to the bus B1 serving the R registers R0 to R31 of its own PE and also to the buses B1 of the three PEs 40 on either side. It thus selects access to one R register (R0 etc., or R24 etc.) from among the R registers R0 to R31 of seven PEs 40 in total. That is, based on a selection command from the GP 13, the 7to1MUX 35 either outputs the image data temporarily stored in the selected R register (R0 etc.) to the SE 45, or stores processed image data temporarily in an R register (R24 etc.). The ALU 47 can therefore operate on image data held in the R registers R0 to R23 of any of seven PEs 40 (its own and the three on either side), which increases processing capability.
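The neighbor selection performed by the 7-to-1 multiplexer can be sketched as a lookup over a PE's own registers and those of the three PEs on either side. The function name `mux7_select` and the return value `None` at the ends of the array are assumptions for illustration; the text does not describe what happens at the array boundary.

```python
def mux7_select(registers, pe_index, offset, reg_index):
    """Select a register value from the PE at pe_index + offset.

    registers: list of per-PE register lists, registers[pe][r].
    offset:    -3..+3, where 0 means the PE's own bus B1.
    Assumption: a neighbor outside the array yields None.
    """
    if not -3 <= offset <= 3:
        raise ValueError("a 7-to-1 multiplexer reaches at most 3 PEs away")
    source = pe_index + offset
    if not 0 <= source < len(registers):
        return None
    return registers[source][reg_index]


if __name__ == "__main__":
    # 8 PEs with 32 registers each; the value encodes (PE, register).
    regs = [[pe * 100 + r for r in range(32)] for pe in range(8)]
    print(mux7_select(regs, 4, 2, 5))   # register R5 of PE 6
```

Access to neighboring pixels in this way is what makes operations such as small one-dimensional filters along the main scanning direction possible without moving data between PEs first.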

The SE 45 either shifts the image data from the 7to1MUX 35 by a predetermined number of bits and outputs it to the ALU 47, or shifts the processed image data from the ALU 47 by a predetermined number of bits and outputs it to the 7to1MUX 35. The processing of the SE 45 also includes, for example, zero extension of the image data and multiplication by a predetermined constant.
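As a minimal sketch of the shift and zero-extension step, the following models an 8-bit register value being widened to the 16-bit ALU input and shifted. The function name `shift_expand`, the left-shift direction, and the truncation to 16 bits are assumptions for the example; the text does not specify them.

```python
def shift_expand(pixel, shift):
    """Zero-extend an 8-bit value to 16 bits, then shift it left.

    Assumptions: the shift is to the left, and bits shifted beyond
    bit 15 are discarded.
    """
    if not 0 <= shift <= 15:
        raise ValueError("shift must be between 0 and 15")
    value = pixel & 0xFF              # zero extension: upper 8 bits are 0
    return (value << shift) & 0xFFFF  # keep the result within 16 bits


if __name__ == "__main__":
    # A left shift by one bit doubles the pixel value.
    print(hex(shift_expand(0x81, 1)))
```

A left shift by k bits is equivalent to multiplying by 2^k, which is one simple way a fixed constant multiple could be realized.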

The ALU 47 performs a predetermined arithmetic or logic operation on the image data from the SE 45 and the data held in the A register 48, and can handle 16-bit image data and the like. The result is stored in the A register 48 and is also stored in the F register (temporary register) 49. These operations are controlled by the GP 13.

Next, the operation of the memory controller 21 and the PEs 40 is outlined. Assume that the external memory 23 holds one line of image data, captured from a scanner or the like, that has not yet undergone image processing (arithmetic processing). In response to a request from the GP 13, the memory controller 21 reads the image data from the memory 23 and transfers it to, for example, the R register R0 of each PE 40 for temporary storage.

The PEs 40 then collectively perform image processing (predetermined arithmetic processing) on, for example, 256 pixels of image data. The output image is formed through, for example, several hundred operations. At the request of the GP 13, the processed image data is temporarily stored in, for example, the R register R1 of each PE 40. The GP 13 then activates a predetermined memory controller (for example, a different memory controller), which stores the processed image data held in the R registers R1 in an external memory (for example, a different memory).

If the width of the image data to be processed (the number of pixels in the main scanning direction) is 256 or less, the above operation completes the processing of one line. If the width exceeds 256, the line is processed by performing the above operation several times. Repeating this for the number of pixels in the vertical direction (the sub-scanning direction) completes the image processing (arithmetic processing).
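The line-by-line, chunk-by-chunk procedure just described can be sketched as two nested loops. The function names `process_image` and `process_chunk` are hypothetical, and `process_chunk` simply stands in for whatever per-pixel operation the PE array executes.

```python
def process_image(image, process_chunk, num_pes=256):
    """Process an image one line at a time, at most num_pes pixels per pass.

    image:         list of lines, each a list of pixel values.
    process_chunk: stand-in for the SIMD operation applied to up to
                   num_pes pixels at once; returns the processed pixels.
    """
    output = []
    for line in image:                                # sub-scanning direction
        out_line = []
        for start in range(0, len(line), num_pes):    # main scanning direction
            out_line.extend(process_chunk(line[start:start + num_pes]))
        output.append(out_line)
    return output


if __name__ == "__main__":
    # A 600-pixel line needs three passes: 256 + 256 + 88 pixels.
    image = [list(range(600))]
    result = process_image(image, lambda chunk: [p + 1 for p in chunk])
    print(len(result[0]))
```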

The microprocessor according to the first embodiment of the present invention will be described below with reference to FIGS. 3 to 9. The microprocessor of this embodiment is also applied to a SIMD type processor. FIG. 3 is a block diagram showing the detailed configuration of the SIMD type processor and the GP 13 of this embodiment. In FIG. 3, parts identical to those shown in FIG. 2 are given the same reference numerals, and their description is omitted.

As shown in FIG. 3, the SIMD type processor 51 of this example newly provides, for each PE 40, a defect flag 53, an ID register 55, a T register 57, two multiplexers MPX 59 and MPX 61, and a PE selection unit 63.

Each defect flag 53 is a 1-bit register. When the self-test program stored in the program RAM 41 of the GP 13 is executed, each PE 40 is tested for normal operation, and information indicating whether or not that PE 40 is defective is set in its defect flag 53 as the result. Each defect flag 53 is set to, for example, '0' when its PE 40 is normal and to '1' when it is defective. The defect flags 53 are configured as a shift register, so that the GP 13 can set each value to '0' or '1'.

The defect flag 53 is connected to a data transfer port (not shown) that links it to the external interface 17, which relays accesses from outside (the memory controller 21). The value of the defect flag 53 can thereby cause the memory controller 21 and the like to inhibit data transfer. Data transfer here mainly means the transfer of data such as image data before or after processing.

The value of the defect flag 53 can be output from the transfer port through the external interface 17 to the external memory controller 21 and the like, so that the memory controller 21 can recognize a defective PE 40. Because the external memory controller 21 accesses an even-numbered PE 40 and an odd-numbered PE 40 together as one unit of data, the defect flag values are likewise conveyed to the external memory controller 21 as 2 bits per access, one bit for the even-numbered PE 40 and one for the odd-numbered PE 40.

The ID register 55 stores data indicating the position of its own PE 40 counted from, for example, the leftmost (first) PE 40 in the figure. More precisely, it stores a number incremented sequentially from '0', skipping any defective PE 40 identified in the self-test described later. The value of the ID register 55 is input to the ALU (16-bit ALU) 47 via the data bus, and the result of comparing it in the ALU 47 with immediate data from the GP 13 is reflected in the T register 57.
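The numbering rule for the ID register can be sketched as follows. This is only an illustration under the assumption that defect flags are given as a list of 0/1 values in PE order; the function name `assign_ids` is hypothetical, and `None` stands in for the (unused) ID of a defective PE.

```python
from typing import List, Optional


def assign_ids(defect_flags: List[int]) -> List[Optional[int]]:
    """Number the working PEs 0, 1, 2, ... from the first PE, skipping defective ones."""
    ids: List[Optional[int]] = []
    next_id = 0
    for flag in defect_flags:
        if flag:
            ids.append(None)  # defective PE: takes no part in the numbering
        else:
            ids.append(next_id)
            next_id += 1
    return ids
```

With the third PE defective, `assign_ids([0, 0, 1, 0, 0])` gives `[0, 1, None, 2, 3]`, so the working PEs still form a gap-free sequence.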

Image processing sometimes requires loading the same value into, for example, every fourth PE 40, as when loading a dither table. Incorporating the ID register 55 makes this possible even when a defective PE 40 is present.

In detail, as shown in FIG. 5, the T register 57 comprises a multiplexer 64, a plurality of 1-bit flag registers T7 to T0, a multiplexer 65, and an AND circuit 67 that ANDs the inverted defect flag 53 with the output of the multiplexer 65. The multiplexer 64 receives the value of an operation flag (not shown) output from the ALU 47 of each PE 40 and data from the data bus used to initialize the registers T7 to T0, and can set any of the registers T7 to T0 according to the operation result. The multiplexer 65 selects one of the eight registers T7 to T0 as the condition for the operation type. Through the AND circuit 67, the selected result is always negated in a defective PE 40, so that a defective PE performs no operation. That is, the AND circuit 67 receives the output of the multiplexer 65 and the inverted output of the defect flag 53; when execution of the operation selected by the multiplexer 65 is affirmed, it passes an output to that effect to the A register 48, the F register 49, and the MPX 59, thereby always controlling whether execution by the 16-bit ALU 47 is enabled or disabled.
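The gating performed by the multiplexer 65 and the AND circuit 67 can be sketched as follows. This is a behavioral illustration only, not the circuit of FIG. 5; the function name `alu_enabled` and the representation of T0 to T7 as a list indexed 0 to 7 are assumptions.

```python
from typing import List


def alu_enabled(t_flags: List[int], selected: int, defect_flag: int) -> int:
    """Return 1 if this PE may execute the operation, else 0.

    t_flags     : values of T0..T7 (index 0 is T0), each 0 or 1
    selected    : which T flag the multiplexer 65 selects as the condition
    defect_flag : value of the defect flag 53 (1 = defective)
    """
    selected_condition = t_flags[selected]                # multiplexer 65
    return int(bool(selected_condition) and not defect_flag)  # AND circuit 67
```

Whatever the selected T flag holds, a PE whose defect flag is 1 always yields 0 and therefore never executes.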

The MPX 59 receives immediate data from the immediate-1 data bus of the GP 13 as well as image data and the like from the shifter 45 (identical to the SE shown in FIG. 2; hereinafter referred to as the shifter), and forwards the image data and the like to the ALU 47. Immediate data is a value written in the instruction itself and is used, for example, when setting common data in all PEs 40.

The MPX 61 receives immediate data from the immediate-2 data bus of the GP 13 as well as data from the A register 48, and forwards the data from the A register 48 to the ALU 47. Here too, immediate data is a value written in the instruction itself and is used, for example, when setting common data in all PEs 40.

When processing such as loading the same value into every fourth PE 40 is required, as in the dither table load mentioned above, the ALU of each PE 40 first ANDs the value of the ID register with the immediate data "3" supplied from the GP and stores the result in the A register (Step 1).
Next, the value of the A register is compared with the immediate data "0" supplied from the GP, and the resulting operation flag value (the Z flag, which becomes "1" when the compared values are equal) is stored in the T1 flag of the T register (Step 2).
Similarly, the Z flag resulting from comparing the value of the A register with the immediate data "1" supplied from the GP is stored in the T2 flag (Step 3).
Similarly, the Z flag resulting from comparing the value of the A register with the immediate data "2" supplied from the GP is stored in the T3 flag (Step 4).
Similarly, the Z flag resulting from comparing the value of the A register with the immediate data "3" supplied from the GP is stored in the T4 flag (Step 5).
As a result, a PE whose PE number is 4n holds "1" in T1, a PE whose PE number is 4n+1 holds "1" in T2, a PE whose PE number is 4n+2 holds "1" in T3, and a PE whose PE number is 4n+3 holds "1" in T4. Using these four flags (T1 to T4), the same value can be loaded into every fourth PE.
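Steps 1 to 5 can be traced with a short sketch. It models one PE in software, assuming the PE number is the value held in the ID register; the function name `dither_flags` and the dictionary representation of the T flags are hypothetical.

```python
from typing import Dict


def dither_flags(pe_id: int) -> Dict[str, int]:
    """Compute the T1..T4 flags of one PE from its ID register value."""
    a_register = pe_id & 3  # Step 1: AND the ID with the immediate data 3
    flags = {}
    for immediate in range(4):  # Steps 2 to 5: compare with 0, 1, 2, 3
        z_flag = int(a_register == immediate)  # Z flag is 1 when the values are equal
        flags[f"T{immediate + 1}"] = z_flag
    return flags


# Exactly one of T1..T4 is set in each PE, chosen by (PE number mod 4).
for pe_id in range(8):
    print(pe_id, dither_flags(pe_id))
```

Because the IDs skip defective PEs, the 4n, 4n+1, 4n+2, 4n+3 pattern remains contiguous over the working PEs.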

As before, the ALU 47 performs a predetermined arithmetic or logic operation on the image data and the like from the shifter 45 and the data from the A register 48. The ALU 47 can handle 16-bit image data and the like. The result is stored in the A register 48 and may also be stored in the F register (temporary register) 49. These operations are controlled by the GP 13.

The PE selection unit 63 has a total of nine wiring paths 68: first, a path to the bus B1 connected to the R registers R0 to R31 of its own PE 40; second, individual paths to the buses B1 connected to the R registers R0 to R31 of each of the three neighboring PEs 40, and of the fourth neighboring PE 40, on the front side (left in the figure) in the arrangement direction of the PEs 40; and third, individual paths to the buses B1 connected to the R registers R0 to R31 of each of the three neighboring PEs 40, and of the fourth neighboring PE 40, on the rear side (right in the figure). It thus includes a configuration as a so-called 9-to-1 multiplexer that allows selective access to the R registers R0 to R31 of any one of these PEs 40.

That is, in response to, for example, a selection instruction from the GP 13, the PE selection unit 63 selects access to the R registers R0 to R31 of one PE 40 among the nine PEs 40 associated with it. The fourth neighbors on the front and rear sides are treated as relief (spare) PEs for the case where one of the three neighboring PEs 40 on that side is defective.

Each PE selection unit 63 also receives, among the nine PEs 40 associated with it, the values of the defect flags 53 (in particular the value '1' indicating a defect) of its own PE 40 and of the three PEs 40 on each side. For example, when the PE 40 immediately in front of it is defective, it selects access to the R registers R0 to R31 of the PE 40 two positions in front. Likewise, whenever a defective PE 40 is present, it selects access to the R registers R0 to R31 of the PE 40 one position further out.

As shown in FIG. 6, the PE selection unit 63 is built, as one example, around a logic circuit. Its inputs are, first, the defect-indicating outputs ('1') of the defect flags 53 of the three neighbors on each side, and in addition the following seven control signals, supplied for example from the GP 13, which indicate which PE 40 holds the image data referenced by the instruction currently being executed. C_en is active when an instruction referencing the unit's own PE 40 is issued. L1_en, L2_en, and L3_en are active when an instruction referencing the PE 40 one, two, or three positions before is issued, respectively. U1_en, U2_en, and U3_en are active when an instruction referencing the PE 40 one, two, or three positions after is issued, respectively.

The control signals of the logic circuit also include the following nine signals, which determine which wiring path (that is, which path to the bus B1 leading to the R registers of a PE 40) is actually opened in each PE 40. C_enable opens the gate that references the unit's own PE 40 (that is, its R register R0 and the like; the same applies below). L1_enable, L2_enable, L3_enable, and L4_enable open the gate of the path referencing the PE 40 one, two, three, or four positions before, respectively. U1_enable, U2_enable, U3_enable, and U4_enable open the gate of the path referencing the PE 40 one, two, three, or four positions after, respectively.

The logic circuit of the PE selection unit 63 has the configuration shown in FIG. 6; since it is evident from the drawing, a detailed description is omitted.

As described above, because the PE selection unit 63 includes this logic circuit configuration, when the adjacent PE 40 that it should access according to an instruction from the GP 13, among the nine PEs 40 associated with it, is defective, it can instead access the R register R0 and the like of the PE 40 one position further along. This relieves the SIMD type processor 51 as a whole and allows correct processing and normal data transfer.
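The path selection described above can be sketched behaviorally as follows. The actual logic is that of FIG. 6, which is not reproduced in the text, so this is an assumed reading: a requested neighbor offset is pushed one position further out when a defective PE lies between this PE and the requested neighbor (inclusive). Negative offsets stand for the front (L) side and positive offsets for the rear (U) side; the function name `select_path` and the mapping of offsets to defect flags are hypothetical.

```python
from typing import Mapping


def select_path(requested: int, defect: Mapping[int, bool]) -> int:
    """Return the physical offset (-4..+4) whose wiring path is opened.

    requested : logical offset named by the instruction, -3..+3 (0 = own PE)
    defect    : defect flags of neighbors keyed by physical offset -3..+3
    """
    if not -3 <= requested <= 3:
        raise ValueError("requested offset must be within -3..+3")
    if requested == 0:
        return 0  # corresponds to C_enable
    step = 1 if requested > 0 else -1
    # Count defective PEs from the nearest neighbor up to the requested one.
    skipped = sum(
        1 for d in range(1, abs(requested) + 1) if defect.get(step * d, False)
    )
    if skipped > 1:
        raise ValueError("more than one defect among three neighbors cannot be relieved")
    return requested + step * skipped  # may reach the spare fourth neighbor
```

For example, with the PE immediately in front defective, a request for one position in front opens the path two positions in front, and a request for three positions in front opens the spare fourth path.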

If more spare wiring paths are added, two or more failures can also be handled; however, weighing the increase in circuit scale against the failure frequency, the extent shown in this embodiment is optimal.

Next, the main points of the operation of this embodiment after the self-test will be described. FIG. 7 is a conceptual diagram showing the flow of data when the memory controller 21 performs, for example, write control and read control.

The transfer inhibition data shown in FIG. 7 is the value of the defect flag 53 of each PE 40, read out through the control bit port of the external interface 17 in synchronization with data transfer, and is used to inhibit data transfer for failed PEs. It reads as '0' for a normal PE 40 and as '1' for a defective PE 40.

When the memory controller 21 performs write control (transfer from the R register R0 and the like of the PEs 40 to the memory 23), only the data D0, D2, D4, and so on of the PEs 40 whose transfer inhibition data is '0' is stored in the memory 23 under the control of the SCU 79 and the RAM control unit 77. That is, the data D1, D3, and so on of the PEs 40 whose transfer inhibition data is '1' is thinned out under the control of the SCU 79 and is not stored in the memory 23.

When the memory controller 21 performs read control (transfer from the memory 23 to the R register R0 and the like of the PEs 40), the data from the memory 23 is not advanced for a PE 40 whose transfer inhibition data is '1'; instead, that PE 40 receives the same data (D0, D1, and so on) as was transferred to the PE 40 whose PE number is one lower.
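The write and read behavior can be sketched as follows. This models only the data flow of FIG. 7 as described in the text, not the SCU 79 or the RAM control unit 77 themselves; the function names and the list representation are assumptions.

```python
from typing import List, Optional


def write_to_memory(pe_data: List[str], inhibit: List[int]) -> List[str]:
    """Write control: store only the data of PEs whose inhibit bit is 0."""
    return [d for d, flag in zip(pe_data, inhibit) if flag == 0]


def read_from_memory(memory: List[str], inhibit: List[int]) -> List[Optional[str]]:
    """Read control: an inhibited PE receives a copy of the previous PE's data."""
    out: List[Optional[str]] = []
    source = iter(memory)
    last: Optional[str] = None  # stays None if the very first PE is inhibited
    for flag in inhibit:
        if flag == 0:
            last = next(source)  # advance to the next memory word
        out.append(last)         # inhibited PE: memory position is not advanced
    return out


inhibit = [0, 1, 0, 1, 0]
print(write_to_memory(["D0", "D1", "D2", "D3", "D4"], inhibit))
print(read_from_memory(["D0", "D1", "D2"], inhibit))
```

In this sketch, writing back what was just read reproduces the original memory contents, which is the sense in which the transfer remains correct despite the defective PEs.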

With this configuration, even when a defective PE 40 is present, data can be transferred correctly between the memory 23 and the R register R0 and the like built into each PE 40, so that the SIMD type processor 51 can be relieved effectively.

The self-test program is downloaded into the program RAM 41 from, for example, a higher-level host processor.

As shown in FIG. 4, the memory controller 21 can recognize a defective PE 40 because the SCU 79 receives the control data output from the external interface 17 on the basis of each defect flag 53. Apart from this processing, the relationship between the memory controller 21 and the SIMD type processor 51 is the same as that between the memory controller 21 and the SIMD type processor 11 shown in FIG. 2, and a detailed description is omitted.

Incidentally, as shown in FIG. 8, a register controller 69, which controls the R registers through the external interface, is provided individually for each R register R0 and the like (shown as registers in FIGS. 6 and 7) of each even-numbered and odd-numbered PE 40. The register controller 69 operates as in the example described earlier. Specifically, first, when it receives through the external interface 17 the address of the PE 40 that the memory controller is accessing, together with a write control signal, it outputs a write instruction signal to port W1 of, for example, the R register R0, causing that register to take in through port D1 the image data and the like transferred via the external interface 17 and to store it temporarily. Second, when it receives a read control signal through the external interface 17, it outputs a read instruction signal to port R1 of, for example, the R register R0, causing that register to transfer the temporarily stored processed image data and the like through port D1 to the external interface 17.

Meanwhile, when the R register R0 or the like receives, for example, a read control signal from the GP 13 at port R2, it transfers the stored image data and the like to the PE selection unit 63 through port D2. When it receives, for example, a write control signal from the GP 13 at port W2, it takes in the processed image data and the like from the PE selection unit 63 through port D2 and stores it temporarily. The processed image data thus stored is transferred to the external memory controller 21 through the external interface 17 under the second control of the register controller 69 described above.

Next, the processing performed by the GP 13 during the self-test will be described with reference to the flowchart shown in FIG. 9. First, power is turned on in step 801. Then, in step 802, a self-test program is downloaded from, for example, a higher-level host computer, and in step 803 the test of the GP 13 is started. In step 804, it is determined whether the test result of the GP 13 is OK.

If the test result of the GP 13 is NG, the GP 13 is judged defective and, for example, maintenance is performed. If the result is OK, testing of all PEs 40 begins in step 805 by first selecting the PE 40 at address PE = 0 as the target. In step 806, the selected PE 40 is tested. Any suitable test may be used. For example, the same image data (test data) may be transferred sequentially to all PEs 40, each PE made to perform the same operation, and any PE 40 whose result differs from the expected result prepared in advance identified; alternatively, image data (test data) with predetermined characteristics, possibly different for each PE, may be transferred sequentially and each result checked.

Then, in step 807, it is determined whether the test result of the selected PE 40 is OK. If it is OK, the address of the PE 40 under test is incremented in step 808 and it is determined whether the last PE 40 has been tested. If not, the process returns to step 806 and the next selected PE 40 is tested. Once testing has reached the last PE 40, the process proceeds to step 819, where the main program is downloaded, and the main program is executed in step 820.

If, however, step 807 finds a PE 40 whose test result is NG, the address of the PE 40 under test is incremented in step 809 and it is determined whether the last PE 40 has been tested. If not, the process proceeds to step 810. If testing has reached the last PE 40, the process proceeds to step 819, where the main program is downloaded, and the main program is executed in step 820.

In step 810, the PE 40 following the NG PE 40 is tested, and in step 811 it is determined whether the result is OK. The PE selection unit 63 shown in FIG. 6 is configured to cope with the case where only one of three adjacent PEs is defective; when two or more of three adjacent PEs are defective, relief is not possible. Therefore, if this test finds the PE 40 to be NG, the entire SIMD type processor 51 is treated as defective. If the result is OK, the process proceeds to step 812.

In step 812, the address of the PE 40 under test is incremented and it is determined whether the last PE 40 has been tested. If so, the process proceeds to step 819. If not, the next selected PE 40 is tested in step 813, and in step 814 it is determined whether the result is OK. If the PE 40 is found to be NG, the entire SIMD type processor 51 is treated as defective for the same reason as above. If the result is OK, the process proceeds to step 815.

In step 815, the address of the PE 40 under test is incremented and it is determined whether the last PE 40 has been tested. If so, the process proceeds to step 819. If not, the next selected PE 40 is tested in step 816, and in step 817 it is determined whether the result is OK. If the PE 40 is found to be NG, the entire SIMD type processor 51 is treated as defective for the same reason as above. If the result is OK, the process proceeds to step 818.

In step 818, the address of the PE 40 under test is incremented and it is determined whether the last PE 40 has been tested. If so, the process proceeds to step 819; if not, the process returns to step 806.

Thus, when a defective PE is detected in step 807, the SIMD type processor 51 can be relieved only if the next PE tested in step 810 is found OK in step 811, the PE after that is likewise found OK in step 814, and the PE after that is likewise found OK in step 817. Steps 810 to 817 carry out the tests needed to establish this.
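The PE-testing portion of the flowchart (steps 805 to 818) can be condensed into a short sketch. It assumes the per-PE test results are already available as a list of booleans and leaves out the GP test and program download steps; the function name `is_repairable` is hypothetical.

```python
from typing import List


def is_repairable(pe_ok: List[bool]) -> bool:
    """Return True if the processor can be relieved, following steps 805 to 818.

    pe_ok[i] is True when the test of PE i is OK, False when it is NG.
    """
    n = len(pe_ok)
    i = 0  # step 805: start with PE = 0
    while i < n:
        if pe_ok[i]:  # steps 806, 807: test the selected PE
            i += 1    # step 808: next PE
            continue
        # A defective PE was found: the next three PEs (if they exist)
        # must all be OK (steps 810/811, 813/814, 816/817).
        for j in range(i + 1, min(i + 4, n)):
            if not pe_ok[j]:
                return False  # whole processor treated as defective
        i += 4  # step 818: resume normal testing at step 806
    return True  # step 819: proceed to the main program
```

Under this reading, two defective PEs are tolerated only when at least three working PEs separate them.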

Next, a microprocessor according to a second embodiment of the present invention will be described with reference to FIGS. 10 to 13. The microprocessor of this embodiment is also applied to a SIMD type processor. FIG. 10 is a block diagram showing the detailed configuration of the SIMD type processor of this embodiment. In FIG. 10, parts identical to those shown in FIG. 3 are given the same reference numerals, and their description is omitted.

As shown in FIG. 10, the SIMD type processor 111 of this embodiment newly provides a scaling flag 113 for each PE 40.

Specifically, as shown in FIG. 11, the scaling flag 113 block comprises a defect flag 53 similar to that shown in the first embodiment, the scaling flag 113 connected to the data bus of each PE 40, and an OR circuit 115 whose output becomes active when it receives the output of the defect flag 53 or the output of the scaling flag 113, and which supplies that output to the PE general-purpose registers (R register R0 and the like) of the PE 40.

That is, a value input from the data bus is written into the scaling flag 113 in response to, for example, a write control signal output from the GP 13. The value of the scaling flag 113 is ORed with the value of the defect flag 53, and the result is supplied to the PE general-purpose registers (R register R0 and the like).

The scaling flag 113 is connected to a data transfer port (not shown) that links it to the external interface 17, which controls accesses from outside (the memory controller 21). The value of the scaling flag 113 can thereby cause the memory controller 21 and the like to inhibit data transfer. Data transfer here mainly means the transfer of data such as image data before or after processing. The value of the scaling flag 113 can also switch the read/write control of the external memory controller 21 and the like to an enlargement or reduction mode.

The upper part of FIG. 12 illustrates the thinning write operation that implements reduction, one of the scaling operations commonly performed in digital copiers, facsimile machines, and the like. The value of the scaling flag 113 of each PE 40 is 1-bit data, for example '0' or '1'. Here, as an example, the scaling flags 113 of PE1, PE4, and PE7 are set to '1' by the GP 13 so as to form the correct thinning control data. As a result, the RAM control unit 77 (see FIG. 4), under the control of the SCU 79, thins out the image data of PE1, PE4, PE7, and so on, and writes the data in the general-purpose registers (R register R0 and the like) of PE0, PE2, PE3, PE5, PE6, PE8, PE9, and so on to the memory (RAM) 23.

The lower part of FIG. 12 illustrates the reduction operation when some PEs are defective. Here, as an example, the scaling flags 113 of PE1 and PE6 are set to '1' based on settings from the GP 13 so as to form the correct thinning control data, and the defect flags 53 of PE2, PE4, and PE8 output '1', indicating a defect. In this case the thinning control data (transfer inhibition data) contains '1', indicating thinning, at PE1, PE2, PE4, PE6, PE8, etc. As a result, the RAM control unit 77 (see FIG. 4), under the control of the SCU 79, writes the data of the general-purpose registers (R register R0, etc.) of PE0, PE3, PE5, PE7, PE9, etc. to the memory (RAM) 23. That is, because the SIMD type processor is reconfigured by skipping the defective PEs, the write control data and the image data are stored as shown in the lower part of FIG. 12. When the logical OR of the write control data and the failed-PE information is used as the transfer inhibition data, the data of PE1, PE2, PE4, PE6, and PE8 are thinned out as shown in the figure, and the data stored in the memory 23 and the like is exactly the same as in the case without any failure.
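The equivalence described for FIG. 12 can be checked with a small simulation. The Python sketch below is illustrative only and is not taken from the patent: the function names `assign_logical_ids` and `thinning_write`, the ten-PE array, and the pixel labels are hypothetical. It assumes that good PEs are numbered consecutively with defective PEs skipped, and that the GP places each scaling flag on the good PE holding the pixel to be dropped.

```python
def assign_logical_ids(defect):
    """Number the good PEs consecutively; defective PEs get None (assumed scheme)."""
    ids, next_id = [], 0
    for d in defect:
        ids.append(None if d else next_id)
        next_id += 0 if d else 1
    return ids


def thinning_write(pixels, thin_ids, defect):
    """Return (inhibit bit per physical PE, data written to memory)."""
    ids = assign_logical_ids(defect)
    # Scaling flag: '1' on the good PE that holds a pixel to be dropped.
    scale = [int(i is not None and i in thin_ids) for i in ids]
    # Transfer inhibition data = scaling flag OR defect flag.
    inhibit = [s | d for s, d in zip(scale, defect)]
    # Only PEs whose inhibit bit is 0 are written; those are always good PEs.
    written = [pixels[i] for i, inh in zip(ids, inhibit) if not inh]
    return inhibit, written


pixels = [f"p{k}" for k in range(10)]      # hypothetical pixel labels
thin_ids = {1, 4, 7}                       # logical positions to drop
no_defect = [0] * 10
with_defect = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]   # PE2, PE4, PE8 defective

inhibit_ok, written_ok = thinning_write(pixels, thin_ids, no_defect)
inhibit_bad, written_bad = thinning_write(pixels, thin_ids, with_defect)
```

In this sketch the defective case inhibits PE1, PE2, PE4, PE6, and PE8, matching the text. Because ten physical PEs with three defects leave only seven logical PEs, the data written in the defective case equals the leading part of the defect-free result rather than all of it.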

The upper part of FIG. 13 illustrates the duplicated read operation that realizes enlargement, another of the scaling processes commonly performed in digital copying, facsimile, and the like. Here, as an example, the image data read from the memory 23 and the like is arranged so that PE1, PE4, PE7, etc. hold the same values as the data stored in the immediately preceding PEs, namely PE0, PE3, and PE6. Accordingly, PE1, PE4, and PE7 are skipped, and the image data read from the memory 23 and the like is stored in PE0, PE2, PE3, PE5, PE6, PE8, and PE9.

The lower part of FIG. 13 illustrates the enlargement operation when some PEs are defective. Here, as an example, PE2, PE4, PE8, etc. are defective. When defective PEs exist, the SIMD type processor 111 is reconfigured by skipping them, so the read control data is stored as shown in the figure, and the image data is likewise stored as shown. When the logical OR of the read control data and the failed-PE information is used as the transfer inhibition data, the data stored in PE1, PE2, PE4, PE6, and PE8 duplicates the data of the immediately preceding PE, as shown in the figure. The data is stored in order with the failed PEs skipped, and is exactly the same as the data stored in the PE register files when there is no failure. In other words, the configuration of this embodiment allows scaling to be performed correctly even when some PEs have failed.
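The duplicated read of FIG. 13 can be sketched in the same way. The following Python example is an assumption-laden illustration, not the patent's implementation: `duplicated_read` and `logical_view` are hypothetical names, good PEs are assumed to be numbered consecutively with defective PEs skipped, and an inhibited PE is modeled as simply repeating the value of the physically preceding PE.

```python
def duplicated_read(memory, dup_ids, defect):
    """Return (inhibit bit per physical PE, register value per physical PE)."""
    # Assumed numbering: good PEs get consecutive logical IDs.
    ids, next_id = [], 0
    for d in defect:
        ids.append(None if d else next_id)
        next_id += 0 if d else 1
    # Scaling flag marks the good PEs that should repeat the previous pixel.
    scale = [int(i is not None and i in dup_ids) for i in ids]
    # Transfer inhibition data = scaling flag OR defect flag.
    inhibit = [s | d for s, d in zip(scale, defect)]
    regs, source = [], iter(memory)
    for inh in inhibit:
        if inh:
            # No new word is transferred; the PE repeats its predecessor.
            regs.append(regs[-1] if regs else None)
        else:
            regs.append(next(source))
    return inhibit, regs


def logical_view(regs, defect):
    """Register contents as seen across the good PEs only."""
    return [r for r, d in zip(regs, defect) if not d]


memory = [f"m{k}" for k in range(10)]      # hypothetical memory words
dup_ids = {1, 4, 7}                        # logical positions that repeat
no_defect = [0] * 10
with_defect = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]   # PE2, PE4, PE8 defective

_, regs_ok = duplicated_read(memory, dup_ids, no_defect)
inhibit_bad, regs_bad = duplicated_read(memory, dup_ids, with_defect)
view_bad = logical_view(regs_bad, with_defect)
```

Under these assumptions the good PEs of the defective array hold the same sequence as the first seven PEs of a defect-free array, which is the property the text describes.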

With this configuration, even when a PE 40 is defective, ORing the defect flag with the value of the scaling flag 113 allows normal scaling to be performed on image data and the like regardless of the defective PE. The configuration therefore both rescues the SIMD type processor 111 and supports scaling, which is a functional advantage.

Next, a microprocessor according to a third embodiment of the present invention will be described with reference to FIG. 14. The microprocessor of this embodiment is also applied to a SIMD type processor, and FIG. 14 is a block diagram showing the detailed configuration of the SIMD type processor of this embodiment. In FIG. 14, parts identical to those shown in FIG. 3 are given the same reference numerals, and their description is omitted.

As shown in FIG. 14, the SIMD type processor 211 of this embodiment is newly provided with scaling flags 213, one independently for each general-purpose R register (R register R0, etc.) of each PE 40, in place of the single scaling flag 113 provided per PE 40 in the second embodiment. When a write control signal from the GP 13 is input to port W1, the scaling flag 213 takes in, for example, scaling-ratio data from port D1. When a read control signal is input to port R1, it outputs, for example, the scaling-ratio data to the general-purpose register (R register R0, etc.) and to the OR circuit 215. The scaling flag 213 can be set with scaling data for an arbitrary scaling ratio under the program of the GP 13.

The scaling flag 213 is connected to the data bus B1 through terminal D1, and its data can be set under the control of the GP 13. When the write control signal is asserted at port W1, data from the data bus B1 is written into the scaling flag 213. When the read control signal is asserted at port R1, the contents of the scaling flag 213 are output to the data bus B1. The value of the scaling flag 213 is always output to the OR circuit 215.

In the second embodiment described above, only one scaling flag 113 is provided per PE 40. Consequently, for the plural memory controllers 21 connected to the R registers via the external interface 17, scaling is possible across the image processing as a whole only when the memory controllers 21 all use a common scaling ratio at the same time. That configuration cannot handle cases where each memory controller 21 uses a different scaling ratio, or where the memory controllers 21 share a common ratio but perform scaling asynchronously rather than simultaneously.

In this embodiment, by contrast, each general-purpose register (R register R0, etc.) has its own scaling flag 213 whose scaling ratio can be changed. Each flag's value is ORed, by its own OR circuit 215, with the value of the defect flag 53 of the same PE 40 and supplied to the memory controller 21 via the external interface 17. The memory controller 21 can thus perform scaling at the ratio indicated by the scaling flag 213, for example through the operation of the RAM control unit 77 under the control of the SCU 79.
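The per-register arrangement can be pictured with a short model. This Python sketch is only an illustration under stated assumptions: the `PE` class, its `inhibit` method, the `inhibit_vector` helper, the six-PE array, and the chosen flag positions are all hypothetical, and each scaling flag is modeled as a single bit.

```python
from dataclasses import dataclass, field


@dataclass
class PE:
    defect: int = 0                              # defect flag 53 (one per PE)
    scale: dict = field(default_factory=dict)    # register name -> scaling flag bit

    def inhibit(self, reg):
        """Model of one OR circuit 215: that register's scaling flag OR the defect flag."""
        return self.scale.get(reg, 0) | self.defect


def inhibit_vector(pes, reg):
    """Inhibit bits that a memory controller serving register `reg` would see."""
    return [pe.inhibit(reg) for pe in pes]


pes = [PE() for _ in range(6)]
pes[2].defect = 1            # PE2 is defective for every register
pes[1].scale["R0"] = 1       # controller using R0 skips PE1
pes[4].scale["R1"] = 1       # controller using R1 skips PE4

r0_bits = inhibit_vector(pes, "R0")
r1_bits = inhibit_vector(pes, "R1")
```

The two vectors differ in their scaling positions but share the bit contributed by the defective PE, which is why memory controllers attached to different registers could apply different ratios without coordinating with each other.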

With this configuration, the scaling flags 213 provided for each of the plural general-purpose registers allow an independent scaling ratio to be set for each of the plural memory controllers 21. The adverse effect of a defective PE 40 is eliminated, and image processing that involves several scaling ratios can be handled. In addition, the memory controllers 21 can each operate independently and asynchronously even while scaling, which increases the freedom of system configuration for an image processing apparatus that uses the SIMD type processor 211 of this embodiment.

FIG. 1 is a block diagram showing the schematic configuration of the technology on which the present invention is premised.
FIG. 2 is a block diagram showing a partial detailed configuration of the technology on which the present invention is premised.
FIG. 3 is a block diagram showing the configuration of the SIMD type processor of the first embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of the memory controller of the first embodiment.
FIG. 5 is a block diagram showing the configuration of the T register of the first embodiment.
FIG. 6 is a circuit diagram showing the configuration of the PE selection unit of the first embodiment.
FIG. 7 is an explanatory diagram of the data flow during read/write control in the first embodiment.
FIG. 8 is an enlarged view of the main part of the SIMD type processor of the first embodiment.
FIG. 9 is a flowchart showing the processing performed during the self-test of the first embodiment.
FIG. 10 is a block diagram showing the configuration of the SIMD type processor of the second embodiment of the present invention.
FIG. 11 is a circuit diagram showing the configuration of the scaling flag of the second embodiment.
FIG. 12 is an explanatory diagram of the data flow during reduction scaling in the second embodiment.
FIG. 13 is an explanatory diagram of the data flow during enlargement scaling in the second embodiment.
FIG. 14 is an enlarged block diagram showing the main part of the SIMD type processor of the third embodiment of the present invention.

Explanation of symbols

11, 51, 111, 211 SIMD type processor
13 GP (global processor)
15 Processor element group
17 External interface
21 Memory controller
23 Memory
31 Register file
33 Arithmetic array
35 7-to-1 MUX
41 Program RAM
43 Data RAM
45 SE (Shift Expand: shifter)
47 ALU
48 A register
49 F register
B1 Bus
R0 to R23, R24 to R31 General-purpose registers (R registers)
53 Defect flag
55 ID register
58 T register
59, 61 MPX
63 PE selection unit
64, 65 Multiplexer
67 AND circuit
T7 to T0 Registers
68 Wiring path
69 Register controller
71 Write buffer
73 Read buffer
75 External I/F control unit
77 RAM control unit
79 SCU (sequencer unit)
92 Buffer
113, 213 Scaling flag
115, 215 OR circuit

Claims

1. A microprocessor comprising a SIMD type processor having a plurality of processor elements for processing a plurality of data, wherein:
each processor element is provided with a defect flag, which indicates whether that processor element is defective, as a register separate from the general-purpose registers provided in advance for the processor element, and individual data can be set in the defect flag of each processor element by an instruction;
each processor element is provided with an ID register that stores data indicating the number of that processor element and that, when a set defect flag is found, stores a number incremented so as to exclude the processor element whose defect flag is set;
a data transfer port is provided for accessing the general-purpose registers and the defect flag from outside;
a data transfer device that transfers data between the general-purpose registers of each processor element and an external memory is connected to the data transfer port; and
the data transfer device suppresses the data transfer according to the values of the defect flag and the ID register.

2. The microprocessor according to claim 1, wherein the individual data set for each processor element is a scaling control bit in image processing.

3. The microprocessor according to claim 1, wherein the value of the defect flag is set based on a result of a self-test of each processor element in the SIMD type processor.