JP4825154B2 - Data processing device

Info

Publication number: JP4825154B2
Application number: JP2007049413A
Authority: JP
Inventors: 和彦岩永
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2007-02-28
Filing date: 2007-02-28
Publication date: 2011-11-30
Anticipated expiration: 2027-02-28
Also published as: JP2008217065A

Description

The present invention relates to a data processing device, and more particularly to a SIMD microprocessor.

Image processing sometimes requires operations such as setting or changing an image-processing formula using, as a feature quantity, the maximum or minimum value of all pixel data within a predetermined range. SIMD microprocessors, whose ability to operate on many data items at once is considered well suited to image processing, have been the subject of several previously disclosed techniques for selecting the maximum or minimum value from among the pixel data stored in the individual processor elements (hereinafter, PEs).

The techniques disclosed in Patent Document 1 and Patent Document 2 are essentially sequential. They read the target pixel data from all PEs and compare the values one after another, keeping the larger (or the smaller) each time, to obtain the maximum (or minimum) of all target pixel data. With this approach the time required for detection grows with the number of target pixels, so it is not well suited to a processor with a large number of PEs.

The technique disclosed in Patent Document 3 is also essentially sequential. Unlike Patent Documents 1 and 2, it supplies the data held by each PE to all PEs in turn and collects the comparison results to obtain the maximum or minimum value, but it shares the same advantages and disadvantages as those techniques.

Patent Document 4 discloses a circuit configuration in which arithmetic units are arranged in a tree between the PEs and each tree level is pipelined, so that maximum detection and summation are performed at high speed while the load on each arithmetic unit is kept small. In this configuration the number of arithmetic units grows with the number of PEs, which increases circuit scale, and the final tree level requires an operation spanning half the length of the entire PE array, which raises concerns about operating speed. Patent Document 5 discloses essentially the same technique.

The techniques disclosed in Patent Document 6 and Patent Document 7 compare the data one bit at a time, starting from the most significant bit, and use flags to exclude data that can no longer be the maximum or minimum. The maximum or minimum is obtained after repeating the process as many times as the bit width of the target data, but when the number of target data items is large, the time taken by each pass increases.
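For comparison with the invention described later, the bit-serial elimination summarized above can be sketched as follows. This is only an illustration of the general idea (one pass per bit, MSB first, with a candidate flag per data item); it is not taken from Patent Documents 6 or 7, and the function name `bit_serial_max` is hypothetical.

```python
def bit_serial_max(data: list[int], width: int) -> tuple[int, list[bool]]:
    """Find the maximum by scanning bit positions from MSB to LSB."""
    candidate = [True] * len(data)          # one flag per data item
    result = 0
    for bit in range(width - 1, -1, -1):    # one pass per bit of the data width
        # Does any remaining candidate have a 1 at this bit position?
        any_one = any(c and (d >> bit) & 1 for c, d in zip(candidate, data))
        if any_one:
            # Candidates with a 0 here can no longer be the maximum.
            candidate = [c and bool((d >> bit) & 1)
                         for c, d in zip(candidate, data)]
            result |= 1 << bit
    return result, candidate
```

The number of passes equals the bit width, which is the cost the paragraph above refers to.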

Patent Document 8 discloses a circuit configuration that first decodes the data to be compared and then detects the maximum and minimum values by taking the logical OR of the decode results. This configuration does not resolve the problems that arise when the bit width of the data to be compared is wide or when the number of data items is large.

Patent Document 1: JP 2001-265592 A
Patent Document 2: JP H08-030577 A
Patent Document 3: Japanese Patent No. 2969115
Patent Document 4: JP H08-14816 B
Patent Document 5: JP 2002-207706 A
Patent Document 6: JP H05-100824 A
Patent Document 7: JP H06-139048 A
Patent Document 8: JP H11-85467 A

In view of the problems of the prior art described above, an object of the present invention is to construct a data processing device, in particular a microprocessor, that can select a maximum or minimum value in a small number of cycles while keeping the circuit scale small. Specifically, the objects are: to obtain the maximum or minimum value in a small number of cycles even when the number of binary data items is large; to obtain it without increasing the circuit scale even when the binary data have a wide bit width; and, in particular, to obtain it from the data being processed by a parallel processor such as a SIMD processor without increasing the circuit scale.

The present invention has been made to achieve the above object. The data processing device according to claim 1 of the present invention is a data processing device for obtaining a maximum value or a minimum value from among a plurality of binary data, comprising:
condition flags at least equal in number to the binary data;
decoders, at least equal in number to the binary data, for decoding the respective binary data;
comparators at least equal in number to the binary data; and
a bus, one line per bit, onto which the decode results from the decoders are output after being Wired-ORed,
wherein each condition flag, each decoder, and each comparator is associated with a target binary datum,
the decode result of each decoder is Wired-ORed bit by bit and output to the bus if the value of the associated condition flag is true, and is not output to the bus if the associated condition flag is false, and
each comparator compares the decode result of its associated decoder with the value on the Wired-OR bus and, if the decode result is smaller than the Wired-OR bus value, resets the associated condition flag.

The data processing device according to claim 2 of the present invention is the data processing device according to claim 1, wherein
a bit shift circuit is further provided in association with each set of condition flag, decoder, and comparator,
each bit shift circuit receives its binary datum, shifts it by a predetermined number of bits, and outputs it to the associated decoder, and
the maximum or minimum value is calculated for the data in a specific bit field within the bit width of the plurality of binary data.

The data processing device according to claim 3 of the present invention is the data processing device according to claim 1, wherein
each decoder is configured to
output, as its decode result, "1" for every bit from the LSB up to the bit that becomes "1" when the input data is decoded, and "0" for all other bits,
or to output the negative logic thereof.

The data processing device according to claim 4 of the present invention is
the data processing device according to any one of claims 1 to 3, wherein
each comparator is constituted by an arithmetic logic unit (ALU) for performing arithmetic processing on the plurality of binary data.

By using the present invention, a data processing device can obtain the maximum or minimum value of a large number of binary data in a small number of cycles, even when the number of target binary data items is large.

Preferred embodiments of the present invention are described below with reference to the drawings.

[First Embodiment]

FIG. 10 is a schematic configuration diagram of the microprocessor 2 according to the present invention. The microprocessor 2 shown in FIG. 10 includes a plurality of processor elements (n in the figure: 4-(1), 4-(2), ..., 4-(n)), and each processor element 4 includes registers and an arithmetic unit (not shown). The processor elements 4 are interconnected as appropriate, and their operation is controlled by a global processor or the like (not shown). A parallel processor of this kind is usually called a "SIMD microprocessor".

FIG. 1 is a partially enlarged view of the microprocessor 2 according to the first embodiment of the present invention. Specifically, it shows a portion of each of three processor elements. The figure illustrates the case where the binary data to be compared are 4 bits wide and there are three of them, target data 1 to target data 3. Target data 1 to 3 are each stored in, for example, a register (not shown) and are input to the decoders (decoder 1, decoder 2, decoder 3).

Each decoder (decoder 1, decoder 2, decoder 3) receives the binary data described above and the value of its condition flag. The configuration of each decoder is as shown in FIG. 2. Each decoder outputs the result of decoding its binary data, via a 3-state buffer, to the bus shown at the bottom of FIG. 1. As shown in FIG. 2, the condition flag is connected to the output-enable signal of the 3-state buffer. That is, in a processor element whose condition flag is true, the decode result is output to the bus via the 3-state buffer.

It is evident that the same operation can be achieved by adopting a dynamic bus configuration or an open-drain configuration in place of the 3-state buffers.

Each comparator (comparator 1, comparator 2, comparator 3) receives the output of its decoder and the data on the bus, and the comparison result of the comparator is connected to the reset terminal of the condition flag.

FIG. 3 shows the configuration (FIG. 3(b)) and the function (FIG. 3(a)) of the comparator included in the microprocessor according to the first embodiment of the present invention. FIG. 3(a) describes the operation of the two types of comparison elements, CMP1 and CMP2, that make up the comparator of FIG. 3(b).

A microprocessor having the configuration shown in FIGS. 1, 2, and 3 obtains the maximum value from among the three target data through the following steps. As an example, target data 1, target data 2, and target data 3 are assumed to be (1010b), (0111b), and (1001b), respectively.

(Step 1): All condition flags are set to "1".

(Step 2-1): The decoder outputs and the data on the bus are as shown in Table 1.

(Step 2-2): The comparison results are as shown in Table 2.

(Step 2-3): As a result, condition flag 2 and condition flag 3 are reset, the decoder outputs and the data on the bus become as shown in Table 3, and the maximum value is obtained.

Here, encoding the bus data "0000010000000000b" yields "1010b".
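The walk-through above can be modeled in a few lines of Python. This is a minimal behavioral sketch, not the circuit of FIGS. 1 to 3: it assumes each decoder produces a one-hot code (bit v set for input value v), which is consistent with the bus value "0000010000000000b" encoding to "1010b", and it assumes the comparator resets a flag whenever the bus carries a set bit above the decoder's own bit. The function names are hypothetical.

```python
def one_hot(value: int) -> int:
    """Assumed decoder behavior: only the bit indexed by `value` is 1."""
    return 1 << value

def wired_or(decoded: list[int], flags: list[bool]) -> int:
    """OR together the decode results of every PE whose flag is true."""
    bus = 0
    for dec, flag in zip(decoded, flags):
        if flag:                # 3-state buffer enabled by the condition flag
            bus |= dec
    return bus

def select_max(data: list[int]) -> tuple[int, list[bool], int]:
    flags = [True] * len(data)                    # Step 1
    decoded = [one_hot(d) for d in data]
    bus = wired_or(decoded, flags)                # Step 2-1
    # Step 2-2: a PE loses if the bus has any bit above its own decoded bit.
    flags = [f and (bus >> dec.bit_length()) == 0
             for f, dec in zip(flags, decoded)]
    bus = wired_or(decoded, flags)                # Step 2-3
    maximum = bus.bit_length() - 1                # encode the remaining line
    return maximum, flags, bus
```

Calling `select_max([0b1010, 0b0111, 0b1001])` leaves only the first flag set and a single bus line active, matching the outcome of Step 2-3.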

The above describes obtaining the maximum of three data items; the maximum of a larger number of data items can be obtained in the same way. The minimum value can be obtained by inputting the inverted binary data to the decoders and encoding the bus data after swapping (exchanging) its bit order.
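The minimum-value variant can be sketched in the same style. The sketch assumes 4-bit data, a one-hot decoder, and that "swapping the bit order" means reversing the 16 bus lines end to end; these are interpretations for illustration, and `select_min` is a hypothetical name.

```python
WIDTH = 4             # bit width of the target data (as in the example)
LINES = 1 << WIDTH    # number of bus lines after decoding

def select_min(data: list[int]) -> int:
    # Invert each datum before decoding (bitwise NOT within WIDTH bits).
    inverted = [~d & (LINES - 1) for d in data]
    bus = 0
    for v in inverted:
        bus |= 1 << v                      # one-hot decode, wired-OR
    # After the flag-reset step only the highest active line remains.
    winner_bus = 1 << (bus.bit_length() - 1)
    # Reverse the bit order of the bus, then encode the active line.
    swapped = int(format(winner_bus, f"0{LINES}b")[::-1], 2)
    return swapped.bit_length() - 1
```

Inversion maps the smallest value to the highest bus line, so the same maximum-selection hardware picks it; reversing the lines undoes the inversion at the encoder.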

[Second Embodiment]

FIG. 4 is a partially enlarged view of the microprocessor 2 according to the second embodiment of the present invention. Specifically, it shows a portion of each of three processor elements. The microprocessor according to the second embodiment is substantially the same as that of the first embodiment, so the description focuses on the differences between the two.

The figure illustrates the case where the binary data to be compared are 16 bits wide and there are three of them, target data 1 to target data 3. Target data 1 to 3 are each stored in, for example, a register (not shown) and are input to barrel shifters, where they are shifted by an arbitrary number of bits; 4 bits of the result are then input to the decoders (decoder 1, decoder 2, decoder 3).

For example, the 16-bit data are shifted right, and the lower 4 bits of the shift result are input to the decoders. With this configuration, the maximum value can be obtained from among three 16-bit data items as follows. As an example, target data 1, target data 2, and target data 3 are assumed to be (32FFh), (3750h), and (289Ch), respectively.

(Step 2-1): The barrel shifter shifts right by 12 bits and outputs the lower 4 bits of the result to the decoder. The decoder outputs and the data on the bus are as shown in Table 4.

(Step 2-2): The comparison results in the comparators are as shown in Table 5.

As a result, condition flag 3 is reset, the decoder outputs and the data on the bus become as shown in Table 6, and the maximum value in bits 15 to 12 is obtained.

Here, encoding the bus data "0000000000001000b" yields "0011b (3h)".

(Step 3-1): The barrel shifter shifts right by 8 bits and outputs the lower 4 bits of the result to the decoder. The decoder outputs and the data on the bus are as shown in Table 7.

(Step 3-2): The comparison results in the comparators are as shown in Table 8.

As a result, condition flag 1 is reset in addition to condition flag 3, the decoder outputs and the data on the bus become as shown in Table 9, and the maximum value in bits 11 to 8 is obtained.

Here, encoding the bus data "0000000010000000b" yields "0111b (7h)".

In the above example, the maximum value has been determined at this point. If it has not yet been determined, the barrel shifter shifts right by 4 bits and outputs the lower 4 bits of the result to the decoder, in the same manner as above, so that the maximum in bits 7 to 4 is obtained. If it has still not been determined, the barrel shifter shifts right by 0 bits (that is, does not shift) and outputs the lower 4 bits to the decoder, so that the maximum in bits 3 to 0 is obtained. In short, the target data in the processor element whose condition flag remains unreset to the end is the maximum value.
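The field-by-field procedure of this embodiment can be sketched as follows. The sketch models the barrel shifter as a right shift followed by a 4-bit mask and again assumes a one-hot decoder; unlike the description, it always runs all four passes rather than stopping as soon as a single flag remains. The name `select_max_by_field` is hypothetical.

```python
def select_max_by_field(data: list[int], width: int = 16,
                        field: int = 4) -> tuple[int, list[bool]]:
    flags = [True] * len(data)          # all condition flags start set
    result = 0
    mask = (1 << field) - 1
    # Shift amounts 12, 8, 4, 0 for 16-bit data and 4-bit fields.
    for shift in range(width - field, -1, -field):
        nibbles = [(d >> shift) & mask for d in data]   # barrel shifter output
        bus = 0
        for n, f in zip(nibbles, flags):
            if f:
                bus |= 1 << n           # one-hot decode, wired-OR
        top = bus.bit_length() - 1      # highest active bus line
        # A PE keeps its flag only if no higher line is active on the bus.
        flags = [f and n == top for f, n in zip(flags, nibbles)]
        result = (result << field) | top
    return result, flags
```

With the example data (32FFh, 3750h, 289Ch), flag 3 drops out in the first pass and flag 1 in the second, leaving the second processor element as the holder of the maximum.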

The above describes obtaining the maximum of three data items; the maximum of a larger number of data items can be obtained in the same way.

[Third Embodiment]

FIG. 5 is a circuit configuration diagram of the decoder in the microprocessor according to the third embodiment of the present invention.

In the microprocessor according to the first embodiment shown in FIG. 1 and the microprocessor according to the second embodiment shown in FIG. 4, replacing the decoder shown in FIG. 2 with the one shown in FIG. 5 makes it possible to use an ordinary comparator or a subtractor as the comparator.

For target data 1 (1010b), target data 2 (0111b), and target data 3 (1001b), the decoder outputs and the data on the bus are as shown in Table 10.

As can be seen, the data on the bus are exactly identical to the output of decoder 1, which holds the largest value. Therefore, a simple comparator such as that shown in FIG. 6, or a subtractor (not shown), can be used as the comparator.
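The effect of this decoder can be checked with a short sketch. It assumes the positive-logic form described for claim 3 (every bit from the LSB up to the decoded bit is 1) and uses an ordinary magnitude comparison in place of the comparator of FIG. 6; the function names are hypothetical.

```python
def thermometer(value: int) -> int:
    """Decoder of the third embodiment: bits 0..value are all 1."""
    return (1 << (value + 1)) - 1

def select_max_thermometer(data: list[int]) -> tuple[int, list[bool]]:
    decoded = [thermometer(d) for d in data]
    bus = 0
    for dec in decoded:
        bus |= dec                      # wired-OR with all flags true
    # An ordinary comparator suffices: reset the flag when decode < bus.
    flags = [dec >= bus for dec in decoded]
    return bus, flags
```

Because each code contains every smaller code, the OR of all codes equals the largest one, so the bus value can be compared numerically with each decoder output.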

[Fourth Embodiment]

FIG. 7 is a partially enlarged view of the microprocessor 2 according to the fourth embodiment of the present invention. Specifically, it shows a portion of each of three processor elements. The microprocessor according to the fourth embodiment is substantially the same as that of the third embodiment, so the description focuses on the differences between the two.

In the microprocessor according to the fourth embodiment shown in FIG. 7, the decoder in the microprocessor of the first embodiment (FIG. 1) or of the second embodiment (FIG. 4) is replaced, from the one shown in FIG. 2 to the one shown in FIG. 5, and in addition the comparators (comparator 1, comparator 2, comparator 3) are replaced with ALUs (arithmetic logic units; ALU1, ALU2, ALU3). Furthermore, although not shown, each ALU (ALU1, ALU2, ALU3) is configured to receive not only the decoder output and the bus data but also data from an accumulator (not shown) and binary data from a register (not shown).

This configuration is realized by adding, to an ordinary parallel microprocessor, the decoders, the condition flags, and a bus that carries the Wired-OR of the decode results. These additions make it possible to obtain the maximum or minimum value of a large number of data items.

FIG. 8 is a partially enlarged view of another example of the microprocessor 2 according to the fourth embodiment of the present invention. In this example, the decoder output is not driven directly onto the bus; it is first stored in the accumulator 12, and the output of the accumulator 12 is supplied to the bus according to the value of the condition flag. Except when the maximum or minimum value of a large number of data items is being obtained, the accumulator 12 stores the results of ALU operations; that is, it functions as an ordinary accumulator.

This configuration shares a single bus between the operation of obtaining the maximum or minimum value of a large number of data items and other ordinary operations. As a result, the circuit as a whole can be made more compact.

[Fifth Embodiment]

FIG. 9 is a partially enlarged view of the microprocessor 2 according to the fifth embodiment of the present invention. The microprocessor according to the fifth embodiment is substantially the same as the other example of the microprocessor according to the fourth embodiment shown in FIG. 8.

In the microprocessor according to the fifth embodiment shown in FIG. 9, the value output from the accumulators to the bus is first stored in an external register 20, and the value of the register 20 is input to the ALUs (ALU1, ALU2, ALU3) via a separate bus.

With this configuration, the Wired-OR result is first stored in the register 20, so the comparison between the Wired-OR result and the value of each accumulator can be performed in the following cycle. This allows the microprocessor as a whole to operate at high speed even when the number of processor elements increases.
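The two-cycle behavior described above can be illustrated with a small model. It assumes the accumulators hold the thermometer-style decode results of the FIG. 5 decoder, splits the work into a latch cycle and a compare cycle, and uses hypothetical class and method names; timing and bus details of FIG. 9 are not modeled.

```python
class PipelinedMaxUnit:
    """Behavioral sketch: Wired-OR latched in register 20, compared next cycle."""

    def __init__(self, data: list[int]):
        # Accumulators hold the decode results (bits 0..value set; assumption).
        self.acc = [(1 << (d + 1)) - 1 for d in data]
        self.flags = [True] * len(data)
        self.reg20 = 0                       # external register 20

    def cycle_latch(self) -> None:
        """Cycle 1: drive flagged accumulators onto the bus and latch the OR."""
        bus = 0
        for a, f in zip(self.acc, self.flags):
            if f:
                bus |= a
        self.reg20 = bus

    def cycle_compare(self) -> None:
        """Cycle 2: each ALU compares its accumulator with register 20."""
        self.flags = [f and a >= self.reg20
                      for f, a in zip(self.flags, self.acc)]
```

Separating the bus settle time from the comparison is what lets each cycle stay short as the number of processor elements, and hence the bus length, grows.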

FIG. 1 is a partially enlarged view of the microprocessor according to the first embodiment of the present invention.
FIG. 2 is a schematic configuration diagram of the decoder used in the microprocessor according to the first embodiment of the present invention.
FIG. 3 shows the configuration (FIG. 3(b)) and the function (FIG. 3(a)) of the comparator included in the microprocessor according to the first embodiment of the present invention.
FIG. 4 is a partially enlarged view of the microprocessor according to the second embodiment of the present invention.
FIG. 5 is a circuit configuration diagram of the decoder in the microprocessor according to the third embodiment of the present invention.
FIG. 6 is a schematic configuration diagram of the comparator used in the microprocessor according to the third embodiment of the present invention.
FIG. 7 is a partially enlarged view of the microprocessor according to the fourth embodiment of the present invention.
FIG. 8 is a partially enlarged view of another example of the microprocessor according to the fourth embodiment of the present invention.
FIG. 9 is a partially enlarged view of the microprocessor according to the fifth embodiment of the present invention.
FIG. 10 is a schematic configuration diagram of the microprocessor according to the present invention.

Explanation of symbols

2 ... microprocessor, 4 ... processor element, 10 ... barrel shifter, 12 ... accumulator, 20 ... external register.

Claims

1. A data processing device for obtaining a maximum value or a minimum value from among a plurality of binary data, comprising:
condition flags at least equal in number to the binary data;
decoders, at least equal in number to the binary data, for decoding the respective binary data;
comparators at least equal in number to the binary data; and
a bus, one line per bit, onto which the decode results from the decoders are output after being Wired-ORed,
wherein each condition flag, each decoder, and each comparator is associated with a target binary datum,
the decode result of each decoder is Wired-ORed bit by bit and output to the bus if the value of the associated condition flag is true, and is not output to the bus if the associated condition flag is false, and
each comparator compares the decode result of its associated decoder with the value on the Wired-OR bus and, if the decode result is smaller than the Wired-OR bus value, resets the associated condition flag.

2. The data processing device according to claim 1, wherein
a bit shift circuit is further provided in association with each set of condition flag, decoder, and comparator,
each bit shift circuit receives its binary datum, shifts it by a predetermined number of bits, and outputs it to the associated decoder, and
the maximum or minimum value is calculated for the data in a specific bit field within the bit width of the plurality of binary data.

3. The data processing device according to claim 1, wherein
each decoder is configured to
output, as its decode result, "1" for every bit from the LSB up to the bit that becomes "1" when the input data is decoded, and "0" for all other bits,
or to output the negative logic thereof.

4. The data processing device according to any one of claims 1 to 3,
wherein each comparator is constituted by an arithmetic logic unit (ALU) for performing arithmetic processing on the plurality of binary data.