JP2005322198A

JP2005322198A - Data processor

Info

Publication number: JP2005322198A
Application number: JP2004219911A
Authority: JP
Inventors: Hiroshi Koya; 啓小屋
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-04-07
Filing date: 2004-07-28
Publication date: 2005-11-17

Abstract

PROBLEM TO BE SOLVED: To provide a data processor that is advantageous in reducing power consumption and processing time by suppressing how frequently the CPU is used.

SOLUTION: This data processor 4 is equipped with a memory 42 and an arithmetic control circuit 44. The bit configuration of the address data allocated to the memory 42 is divided into an upper bit side and a lower bit side; the upper bit side is allocated to a first memory array 42A and the lower bit side is allocated to a second memory array 42B. The arithmetic control circuit 44 is structured to execute a predetermined calculation on first and second data Da and Db read simultaneously from the first and second memory arrays 42A and 42B, and to output the calculation result to a data bus DB as output data Dout.

COPYRIGHT: (C) 2006, JPO & NCIPI

Description

The present invention relates to a data processing apparatus.

A system LSI equipped with a CPU (microprocessor) has memory mounted on the chip in order to speed up processing (see, for example, Patent Document 1).
In order to perform arithmetic processing by the CPU, the CPU fetches a plurality of operands from the data memory, and in some cases, processing such as writing back the arithmetic result to the memory is required.
In a system in which a large amount of data processing is performed by a CPU, the number of fetches from the memory becomes enormous, leading to an increase in power consumption due to memory access and bus access, and an increase in the number of software cycles of the CPU.
For example, in image processing and the like, comparison, addition, multiplication, and similar operations between byte data are performed frequently. Because all such data generally resides in memory, a memory-to-memory operation requires multiple memory accesses even for a single operation.
As the amount of calculation data increases, the number of cycles naturally increases, so the throughput of the system decreases.
FIG. 6 is an explanatory diagram showing an example of a program for fetching two operands and storing the operation result in a register in the CPU, and its timing.
The first instruction fetches an operand from memory address A0 into CPU register R0, the second instruction fetches an operand from memory address A1 into CPU register R1, and the third instruction performs an operation on the two operands fetched by the preceding two instructions and stores the result in CPU register R2.
Even for a typical pipelined CPU as in this example, four cycles are required to process one set of data. Processing 100 sets of data without loop instructions or the like therefore requires 400 cycles, of which 200 cycles are spent on memory access.
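The cycle arithmetic above can be checked with a small sketch. The function name and constants below are illustrative and do not come from the publication; the only figures taken from the text are four cycles per data set, two of which are memory accesses.

```python
# Illustrative cycle model for the conventional flow of FIG. 6.
# The names are hypothetical; the per-set figures come from the text.
CYCLES_PER_SET = 4         # stated for a typical pipelined CPU
MEMORY_CYCLES_PER_SET = 2  # two operand fetches (A0 -> R0, A1 -> R1)


def conventional_cycles(num_sets):
    """Return (total cycles, memory-access cycles) when no loop instructions are used."""
    return num_sets * CYCLES_PER_SET, num_sets * MEMORY_CYCLES_PER_SET


total, memory = conventional_cycles(100)
print(total, memory)  # 400 200
```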
JP-A-6-52045

In image processing and similar workloads, the types of operations are in many cases limited to relatively simple ones, while the amount of data to be processed is large. Using the CPU for such simple, high-volume arithmetic processing therefore increases the usage of the system bus and the number of cycles, which is a disadvantage when trying to reduce power consumption and processing time.
The present invention has been made in view of such circumstances, and an object thereof is to provide a data processing apparatus that is advantageous in reducing power consumption and processing time by suppressing the frequency of use of a CPU.

In order to achieve the above object, a data processing apparatus according to the present invention has a memory and an arithmetic control circuit that performs logical operations, the memory and the arithmetic control circuit being connected to a CPU via an address bus and a data bus. A data address assigned to the data processing apparatus includes a bit component indicating a first memory address and a bit component indicating a second memory address. The memory includes a first memory array and a second memory array; the first memory address is assigned as the address of the first memory array, and the second memory address is assigned as the address of the second memory array. The arithmetic control circuit is configured to perform a first process of receiving first and second data that are read simultaneously from the first and second memory arrays when the first and second memory addresses are supplied from the CPU to the first and second memory arrays via the address bus, and a second process of performing a predetermined operation on the received first and second data and outputting the operation result to the CPU via the data bus.
The present invention also provides a data processing apparatus having a memory and an arithmetic control circuit that performs logical operations, the memory and the arithmetic control circuit being connected to a CPU via an address bus and a data bus. A data address assigned to the data processing apparatus includes a bit component indicating a first memory address and a bit component indicating a second memory address. The memory is configured as a single memory array. A bit configuration selection circuit is provided that divides the data supplied from the memory array into a first bit component to which the first memory address is assigned and a second bit component to which the second memory address is assigned, supplies first data formed from the first bit component and second data formed from the second bit component to the arithmetic control circuit, and sets the bit widths of the first and second bit components based on a bit width selection signal. The arithmetic control circuit is configured to perform a first process of receiving the first and second data supplied simultaneously from the bit configuration selection circuit when the first and second memory addresses are supplied from the CPU to the memory array via the address bus, and a second process of performing a predetermined operation on the received first and second data and outputting the operation result to the CPU via the data bus.

According to the data processing apparatus of the present invention, the first and second memory addresses contained in the address supplied from the CPU via the address bus are supplied to the first and second memory arrays, the first and second data read simultaneously from those arrays are input to the arithmetic control circuit, and the arithmetic control circuit performs a predetermined operation on them and outputs the operation result to the CPU via the data bus. The CPU can therefore obtain one operation result with a single access. Reducing how often the CPU accesses memory over the address bus reduces the current consumed by address bus transitions, which is advantageous in reducing power consumption. In addition, because the CPU no longer needs to perform the arithmetic operation itself, the number of cycles required to process one set of data is reduced, which is advantageous in shortening the processing time.
Furthermore, according to the data processing apparatus of the present invention, the bit widths of the first data and the second data read from a single memory array can be set by the bit configuration selection circuit, and the first and second processes are performed by supplying the memory array with the first memory address assigned to the first data and the second memory address assigned to the second data. Therefore, in addition to the effects described above, this is advantageous for performing operations on data of various bit configurations with different bit widths while suppressing an increase in the cost of the data processing apparatus.
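As a rough illustration of a selectable bit configuration, the sketch below splits one memory word into first and second data with a configurable width. This is only an assumption about the behavior: the function name is hypothetical, `first_bits` stands in for the bit width selection signal, and placing the first data in the upper bits simply follows the upper/lower convention used in the first embodiment.

```python
def select_bit_configuration(word, word_bits, first_bits):
    """Split `word` into (first_data, second_data).

    first_data takes the upper `first_bits` bits and second_data takes the
    remaining lower bits. `first_bits` plays the role of the bit width
    selection signal (an assumption made for this sketch).
    """
    if not 0 < first_bits < word_bits:
        raise ValueError("first_bits must leave at least one bit for each operand")
    second_bits = word_bits - first_bits
    first_data = (word >> second_bits) & ((1 << first_bits) - 1)
    second_data = word & ((1 << second_bits) - 1)
    return first_data, second_data


# The same 16-bit word read under two different bit configurations.
print(select_bit_configuration(0xABCD, 16, 8))  # (171, 205), i.e. (0xAB, 0xCD)
print(select_bit_configuration(0xABCD, 16, 4))  # (10, 3021), i.e. (0xA, 0xBCD)
```

With a single array and a configurable split, one memory can serve several operand widths, which is the cost argument made in the paragraph above.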

The object of reducing power consumption and processing time is achieved by providing an arithmetic control circuit and a plurality of memory arrays.

Embodiments of the data processing apparatus according to the present invention will be described below in detail with reference to the drawings.
FIG. 1 is a block diagram showing a configuration of an LSI including a data processing apparatus according to the first embodiment of the present invention, and FIG. 2 is a block diagram showing a configuration of the data processing apparatus according to the first embodiment of the present invention.

As shown in FIG. 1, the LSI 10 includes a CPU 1, an instruction memory 2, a data memory 3, and a data processing device 4. In other words, the CPU 1, the instruction memory 2, the data memory 3, and the data processing device 4 are provided on a single semiconductor chip.
The CPU 1 is connected to the data memory 3 and the data processing device 4 via the data bus DB and the address bus AB.
The instruction memory 2 stores a control program to be executed by the CPU 1 and is configured such that the control program is read out to the CPU 1 via a dedicated instruction bus.
The data memory 3 stores various data to be processed by the CPU 1 and is configured such that data is read from and written to the data memory 3 by the CPU 1.

As shown in FIG. 2, the data processing device 4 includes a memory 42 and an arithmetic control circuit 44, which are connected via an internal connection line 46 that is separate from the data bus DB.
The memory 42 is a single memory in hardware terms. As described later, the bit configuration of the address data allocated to the memory 42 is divided into an upper bit side and a lower bit side; the upper bit side is assigned to the first memory array 42A, and the lower bit side is assigned to the second memory array 42B.
The bit configuration of the data in the memory 42 is likewise divided into an upper bit side and a lower bit side. The upper bit side is assigned as the bit component forming the first data Da stored in the first memory array 42A, and the lower bit side is assigned as the bit component forming the second data Db stored in the second memory array 42B. In other words, the data in the memory 42 is divided into two operands, the first data Da and the second data Db.

The arithmetic control circuit 44 receives the first and second data Da and Db read simultaneously from the first and second memory arrays 42A and 42B via the internal connection line 46, performs a predetermined operation on them, and outputs the operation result to the data bus DB as output data Dout.
The arithmetic control circuit 44 is also configured to perform an operation of writing data supplied via the data bus DB to the memory 42 as input data Din.
The address supplied to the memory 42 is passed from the address bus AB to the address decoder 6, decoded there, and then supplied to the memory 42. The operation and function of the address decoder 6 are no different from those of a general address decoder. Whether the address decoder 6 is provided inside or outside the data processing device 4 is arbitrary.

FIG. 3 is an explanatory diagram showing the bit configuration of the address data allocated from the CPU 1 to the data processing device 4.
The bit configuration of the address data assigned to the data processing device 4, that is, the address data assigned to the memory 42 is as follows, and is assigned as follows in order from the upper bit to the lower bit.
Chip enable decoding bits I (used as chip enable data CE): i bits
Arithmetic function designation bits J (used as arithmetic function designation data CT): j bits
Memory address Aa of the first memory array 42A: m bits
Memory address Ab of the second memory array 42B: n bits
Bit width of the address data: B = (i + j + m + n) bits
The chip enable decoding bits I (equivalent to chip select data) represent the chip enable data (chip enable signal) for the first and second memory arrays 42A and 42B, and correspond to the control data recited in the claims.
The arithmetic function designation bits J (operation designation data) specify the type of operation that the arithmetic control circuit 44 performs on the first and second data Da and Db, for example an arithmetic operation (addition, subtraction, and so on) or a logical operation (logical sum, logical product, exclusive logical sum), and also correspond to the control data recited in the claims.
In this embodiment, the bit width (m bits) of the memory address Aa of the first memory array 42A and the bit width (n bits) of the memory address Ab of the second memory array 42B are set to the same number of bits. The memory address Aa of the first memory array 42A is assigned to the upper side, and the memory address Ab of the second memory array 42B is assigned to the lower side.
In other words, each item of address data includes a bit component indicating the control data, a bit component indicating the memory address Aa of the first memory array 42A, and a bit component indicating the memory address Ab of the second memory array 42B.
In this embodiment, the bit width of the first data Da and the bit width of the second data Db are also set to the same number of bits. Accordingly, the sum of the bit widths of the first and second data Da and Db is equal to the bit width of the data in the memory 42 and to the bit width of the output data Dout of the arithmetic control circuit 44.

A more specific numerical example follows.
The bit width of the address data actually required is determined by the capacities of the first and second memory arrays 42A and 42B. With 14 address bits (lines), 16K words can be addressed. If the CPU 1 supports a 32-bit address space, then even when 14 × 2 = 28 bits are used to address the first and second memory arrays 42A and 42B, the remaining 32 - 28 = 4 bits can be allocated to the chip enable decoding bits I and the arithmetic function designation bits J. If the arithmetic control circuit 44 performs only one type of operation, the arithmetic function designation bits J can be omitted.
In this embodiment, as shown in FIG. 3, the data address range 0x0000 to 0x3FFF of the memory 42 is mapped as normal memory that can be read and written like ordinary memory, and the data address range 0x6000 to 0x7FFF of the memory 42 is mapped as the first and second memory arrays 42A and 42B.
Therefore, in this case, the chip enable decoding bit I is configured so that the normal memory and the first and second memory arrays 42A and 42B can be selected.
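The address layout described for FIG. 3 can be sketched as simple bit-field packing and unpacking. The sketch below uses the 32-bit numerical example (m = n = 14); splitting the remaining 4 bits as i = 2 and j = 2 is an assumption made here for illustration, as are the function names. It does not attempt to reproduce the specific address map of FIG. 3.

```python
# Field widths, from the upper bits down: I (chip enable), J (operation), Aa, Ab.
# m = n = 14 follows the numerical example; i = 2 and j = 2 are assumed.
I_BITS, J_BITS, M_BITS, N_BITS = 2, 2, 14, 14


def encode_address(ce, ct, aa, ab):
    """Pack chip enable CE, operation code CT, and the addresses Aa and Ab into one address."""
    for value, bits in ((ce, I_BITS), (ct, J_BITS), (aa, M_BITS), (ab, N_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value does not fit in its bit width")
    return (((ce << J_BITS | ct) << M_BITS | aa) << N_BITS) | ab


def decode_address(addr):
    """Unpack an address into its CE, CT, Aa, and Ab fields."""
    ab = addr & ((1 << N_BITS) - 1)
    aa = (addr >> N_BITS) & ((1 << M_BITS) - 1)
    ct = (addr >> (N_BITS + M_BITS)) & ((1 << J_BITS) - 1)
    ce = (addr >> (N_BITS + M_BITS + J_BITS)) & ((1 << I_BITS) - 1)
    return {"CE": ce, "CT": ct, "Aa": aa, "Ab": ab}


addr = encode_address(ce=0b10, ct=0b01, aa=0x1234, ab=0x0ABC)
print(decode_address(addr))
```

A single address word therefore carries both operand locations and the operation selector, which is what lets one bus access name a complete operation.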

In this embodiment, the arithmetic function designation bits J are 2 bits wide. A value of 01 means that the logical product (AND) of the first data Da and the second data Db is computed. A value of 00 means that the arithmetic control circuit 44 performs no operation and the data in the memory 42 (the normal memory data, or the first data Da and the second data Db) is read as is. The arithmetic function designation bits J therefore include the arithmetic control circuit control data recited in the claims.
Data is written to the normal memory portion of the memory 42 and to the first and second memory arrays 42A and 42B in the same way as to ordinary memory. Although omitted from FIG. 2, read/write control data specifying whether the memory 42 is to be written or read is supplied to the arithmetic control circuit 44 as control data.
To write to the normal memory portion, as shown in FIG. 3, the arithmetic control circuit 44, having received read/write control data in the write state, writes the data supplied via the data bus DB to the address specified via the address bus AB.
To read from the normal memory portion, as shown in FIG. 3, the arithmetic control circuit 44, having received read/write control data in the read state, reads the data from the address specified via the address bus AB and outputs it to the data bus DB.
Writing and reading the first and second data Da and Db to and from the first and second memory arrays 42A and 42B work in the same way as for the normal memory portion, except that the data is divided into the first and second data Da and Db.
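The read behavior selected by the bits J can be modeled in a few lines. This is a behavioral sketch only: the class and method names are hypothetical, the 8-bit operand width is assumed, and returning Da in the upper half and Db in the lower half for J = 00 is one reading of "read as is". Only the codes 00 and 01 are defined in the text, so other codes are rejected.

```python
DATA_BITS = 8  # assumed operand width; the text only says Da and Db have equal widths


class DataProcessorModel:
    """Behavioral model of memory arrays 42A/42B plus arithmetic control circuit 44."""

    def __init__(self, words):
        self.array_a = [0] * words  # first memory array 42A (holds Da)
        self.array_b = [0] * words  # second memory array 42B (holds Db)

    def write(self, aa, ab, da, db):
        """Store Da at address Aa and Db at address Ab."""
        mask = (1 << DATA_BITS) - 1
        self.array_a[aa] = da & mask
        self.array_b[ab] = db & mask

    def read(self, ct, aa, ab):
        """Read both arrays at once and apply the operation selected by CT (bits J)."""
        da, db = self.array_a[aa], self.array_b[ab]
        if ct == 0b00:  # no operation: return the stored data unchanged
            return (da << DATA_BITS) | db
        if ct == 0b01:  # logical product (AND)
            return da & db
        raise ValueError("operation code not defined in this sketch")


dp = DataProcessorModel(words=16)
dp.write(aa=3, ab=5, da=0xF0, db=0x3C)
print(hex(dp.read(0b01, 3, 5)))  # 0x30
print(hex(dp.read(0b00, 3, 5)))  # 0xf03c
```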

Next, the operation of the LSI 10 will be described with reference to the timing chart of FIG. 4.
Assume that first and second data Da and Db are stored in advance in the first and second memory arrays 42A and 42B.
In the figure, CLK indicates an operation clock of the CPU 1, ADRS indicates an address supplied from the CPU 1 to the data processing device 4 via the address bus AB, and DATA indicates data output from the data processing device 4 to the data bus DB. Show. Further, the CPU calculation indicates the presence / absence of a calculation performed by the CPU 1, and the CPU command indicates a command executed by the CPU 1.
First, a case where the instruction i0 is a LOAD instruction indicating that the address A01 of the memory 42 is read and stored in the register R0 will be described.
In this case, the CPU 1 supplies the address A01 to the data processing device 4 in the first cycle. The address A01 contains the decoding bits I, the arithmetic function designation bits J (01 in this example), the memory address Aa of the first memory array 42A, and the memory address Ab of the second memory array 42B. Accordingly, the first and second data Da and Db selected by the decoding bits I and the memory addresses Aa and Ab are supplied from the first and second memory arrays 42A and 42B to the arithmetic control circuit 44 (first process).
While the CPU 1 executes the instruction i1 in the second cycle, the arithmetic control circuit 44 performs the operation specified by the arithmetic function designation bits J (a logical product in this example) and outputs the result to the data bus DB as output data Dout (second process). The output data Dout on the data bus DB is stored in the register R0 of the CPU 1.
Because the CPU 1 is not involved in the operation itself, no CPU cycle is needed for it.
The instructions i1, i2, i3, i4, ... are then executed in sequence in the same manner.
Therefore, obtaining one operation result, that is, processing one set of data, takes only two cycles. When multiple instructions are executed back to back, the address supply by the CPU 1 and the output of the output data Dout from the data processing device 4 overlap almost completely, so that, for example, processing 3 sets of data takes 4 cycles and processing 100 sets takes 101 cycles.
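The cycle counts quoted for back-to-back accesses fit a simple overlap model: each set needs an address cycle followed by a data cycle, and the data cycle of one set coincides with the address cycle of the next. The sketch below assumes that model (N sets take N + 1 cycles); the function names are illustrative, and the 4-cycles-per-set conventional figure is the one given for FIG. 6.

```python
def proposed_cycles(num_sets):
    """Cycles for `num_sets` back-to-back accesses under the assumed overlap model.

    One address cycle per set, plus one final cycle for the last data output.
    """
    return num_sets + 1 if num_sets > 0 else 0


def conventional_cycles(num_sets):
    """Cycles for the conventional flow of FIG. 6: four cycles per set, no loop instructions."""
    return 4 * num_sets


for n in (1, 3, 100):
    print(n, proposed_cycles(n), conventional_cycles(n))
```

For 100 sets this gives 101 cycles against 400 in the conventional flow.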

As described above, according to this embodiment, the arithmetic control circuit 44 is configured to perform a first process, in which the first and second memory addresses Aa and Ab contained in the address supplied from the CPU 1 via the address bus AB are supplied to the first and second memory arrays 42A and 42B and the first and second data Da and Db read simultaneously from those arrays are input to the arithmetic control circuit 44, and a second process, in which a predetermined operation is performed on the input first and second data and the operation result is output to the CPU 1 via the data bus DB.
Therefore, since the CPU 1 can obtain one calculation result in one access, the current consumed by the transition of the address bus AB can be reduced by reducing the frequency of access to the memory using the address bus AB by the CPU 1. This is advantageous for reducing power consumption. Further, since the arithmetic operation by the CPU 1 is not required, the number of cycles required for processing one set of data can be reduced, which is advantageous in reducing the processing time.
Specifically, in this embodiment the CPU 1 obtains one operation result with a single access. Compared with the prior-art timing chart of FIG. 6, where processing one set of data took 4 cycles, 2 of which were memory accesses, the present invention needs only 2 cycles, 1 of which is a memory access, so the total processing time is halved.
In particular, when simple, high-volume arithmetic processing such as image processing is performed by the data processing device 4, the reduction in power consumption and processing time is significant compared with performing the same processing on the CPU 1.
In this embodiment, the transfer of the first and second data Da and Db between the arithmetic control circuit 44 and the first and second memory arrays 42A and 42B is independent of the data bus DB connected to the CPU 1. Therefore, the current flowing on the data bus DB can be reduced as compared with the case where data transfer is performed via the data bus DB, which is further advantageous in reducing power consumption.

Next, the second embodiment will be described.
In the second embodiment, the data processing device 4 is applied to image processing.
FIG. 5 is an explanatory diagram showing the concept of image data mask processing.
There is a mask process as a typical process of image data. The mask process is performed when it is desired to extract only a predetermined portion from the image data to be processed.
As shown in FIG. 5, the original image data to be processed is X, and the mask image data is M.
The original image data is composed of a plurality of pixel data X1, X2, X3,.
The mask image data is composed of a plurality of mask data M1, M2, M3,..., In the figure, the white portion indicates the mask data to leave the pixel data of the original image data, and the black portion indicates the pixel data of the original image data. The mask data for removing is shown.
In the conventional mask processing, the original image data and the mask image data are respectively stored in the memory, and the CPU reads one pixel data of the original image data, reads one mask data of the mask image data, and reads them. The operation is performed by taking the logical sum of the pixel data and the mask data and outputting the operation result as output data. Therefore, two readings (memory access) are required to read pixel data and mask data.
When the data processing circuit 4 of the present invention is used, processing may be performed as follows. That is, each pixel data X1, X2, X3,... Of the original image data is stored as the first data Da in the first memory array 42A, and each mask data M1, M2, M3 of the mask image data is stored in the second memory array 42B. Are stored as the second data Db, and an operation performed by the operation processing circuit 44 is a logical sum.
After that, pixel data subjected to mask processing is sequentially output by sequentially supplying addresses to the data processing circuit 4 via the address bus AB. In order to obtain one pixel data subjected to mask processing, it is performed once. All that is required is access, and the processing time is half that of the prior art.
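The difference in access counts described above can be sketched in a few lines of Python. This is a behavioral illustration only, not the circuit itself: the two lists stand in for the memory arrays 42A and 42B, and the function names and access counters are hypothetical names introduced here.

```python
def conventional_mask(pixels, masks):
    """CPU-side masking: one read for the pixel and one for the mask."""
    reads = 0
    out = []
    for addr in range(len(pixels)):
        x = pixels[addr]
        reads += 1
        m = masks[addr]
        reads += 1
        out.append(x | m)  # logical sum (OR), as stated in the text
    return out, reads


def paired_array_mask(array_a, array_b):
    """One access per address: both arrays are read together and the
    OR is taken on the memory side (modeled here, not in hardware)."""
    accesses = 0
    out = []
    for da, db in zip(array_a, array_b):
        accesses += 1      # a single address supplies both Da and Db
        out.append(da | db)
    return out, accesses


pixels = [0x12, 0x80, 0x0F]
masks = [0x00, 0xFF, 0xF0]
print(conventional_mask(pixels, masks))
print(paired_array_mask(pixels, masks))
```

Both functions produce the same masked data; only the number of accesses differs, which is the halving the text refers to.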

In each of the embodiments described above, the memory 42 of the data processing circuit 4 is divided into two memory arrays, the first and second memory arrays 42A and 42B. However, the number of memory arrays may be three or more, and the same effects as in the present embodiment are of course obtained. In that case, the data address assigned to the data processing device 4 need only include the same number of bit components as the number of memory arrays.
In each embodiment, image processing is given as an example of the arithmetic processing performed by the data processing circuit 4, but arithmetic processing other than image processing may of course be used.
In the present embodiment, the data processing circuit 4 is described as being included in the LSI 10 formed on a single semiconductor chip together with the CPU 1, the instruction memory 2, and the data memory 3. However, the present invention is not limited to this. The data processing circuit 4 may be configured as a component separate from the CPU 1, the instruction memory 2, and the data memory 3, and its configuration is arbitrary.

Next, Example 3 will be described.
In the first and second embodiments described above, the memory 42 of the data processing circuit 4 is divided into a plurality of memory arrays, so the bit widths of the first data and the second data are fixed.
Consequently, to operate on data with various bit configurations of different bit widths, a memory 42 would have to be provided for each bit configuration, which has the disadvantage of increasing the cost of the data processing device 4.
To solve this problem, the third embodiment realizes a data processing circuit in which the bit width of the operation operands can be changed.

FIG. 7 is a block diagram showing the configuration of an LSI including the data processing device in the third embodiment, FIG. 8 is a block diagram showing the configuration of the data processing device in the third embodiment, and FIG. 9 is a diagram explaining the configuration and operation of the bit configuration selection circuit of the data processing device in the third embodiment. In the following description, parts identical to those in the first embodiment are denoted by the same reference numerals.
First, the schematic configuration of the data processing device 5 will be described with reference to FIG. 7.
The data processing device 5 includes a memory 52, an arithmetic control circuit 54, a bit configuration selection circuit 56, and a selection control circuit 58. The memory 52 and the bit configuration selection circuit 56 are connected via an internal connection line 502 that is separate from the data bus DB, and the bit configuration selection circuit 56 and the arithmetic control circuit 54 are connected via an internal connection line 504 that is separate from the data bus DB.
The memory 52 consists of a single memory in hardware terms and is configured as a single memory array.
The bit configuration selection circuit 56 divides the data of bit width k supplied from the memory 52 into a first bit component of bit width m and a second bit component of bit width n, and supplies first data Da composed of the first bit component and second data Db composed of the second bit component to the arithmetic control circuit 54. The data in the memory 52 is therefore read out divided into two operands, the first data Da and the second data Db.
Of the data stored in the memory 52, the first bit component (first data Da) is assigned a first memory address as in the first embodiment, and the second bit component (second data Db) is assigned a second memory address as in the first embodiment.
The bit configuration selection circuit 56 sets the bit widths n and m of the first and second bit components based on the bit width selection signal Sb supplied from the selection control circuit 58.
The selection control circuit 58 generates the bit width selection signal Sb based on an instruction from the CPU 1.

The arithmetic control circuit 54 receives the first and second data Da and Db read simultaneously from the memory 52 via the internal connection line 502, the bit configuration selection circuit 56, and the internal connection line 504, performs a predetermined operation on them, and outputs the result to the data bus DB as output data Dout.
As in the first embodiment, the arithmetic control circuit 54 is also configured to write data supplied via the data bus DB into the memory 52 as input data Din.
Also as in the first embodiment, the address supplied to the memory 52 is passed from the address bus AB to the address decoder 6 (see FIG. 8), decoded there, and then supplied to the memory 52. The operation and function of the address decoder 6 do not differ from those of a general address decoder. As in the first embodiment, the address decoder 6 may be provided either inside or outside the data processing device 5.

Next, the configuration of the data processing circuit 5 will be specifically described with reference to FIGS. 8 and 9. In FIGS. 8 and 9, some of the multiplexers 5602 and arithmetic circuits 5402 are omitted to avoid cluttering the drawings.
In this example, as shown in FIG. 9, the bit width k of the memory 52 is 16 bits, and the bit widths n and m of the first and second data Da and Db are switched between the following two settings. In other words, the division of the operation operands is selected from the following two settings, and the bit width of the data output from the data processing device 5 is 12 bits.
First setting: n = 4 bits, m = 12 bits
Second setting: n = 8 bits, m = 8 bits

As shown in FIG. 9, in the third embodiment the bit configuration selection circuit 56 has eight multiplexers 5602.
The selection control circuit 58 includes a bit selection control register 5802 controlled by the CPU 1, and the bit width selection signal Sb supplied to each multiplexer 5602 is generated by decoding the output signal of the bit selection control register 5802.
The arithmetic control circuit 54 has twelve arithmetic circuits 5402 that perform the four arithmetic operations or logical operations, and an arithmetic function selection control circuit 5404 (omitted in FIG. 8) that selects the arithmetic function of each arithmetic circuit 5402. The arithmetic function selection control circuit 5404 performs its control operation based on arithmetic function designation data CT, as in the first embodiment.
As shown in FIG. 8, the output terminal of each arithmetic circuit 5402 is connected to the data bus DB via a buffer 5002, and the buffer 5002 is configured to operate according to chip enable data CE, as in the first embodiment.

Next, the configuration of each multiplexer 5602 of the bit configuration selection circuit 56 will be specifically described.
As shown in FIG. 9, the selection control circuit 58 generates, for each multiplexer 5602, a control signal (the bit width selection signal Sb) for dividing the data output from the memory 52 into the first bit component and the second bit component according to the first setting or the second setting.
Accordingly, in the first setting, the selection control circuit 58 is configured so that, of the 16 bits of data output from the memory 52, the lower 4 bits (bit 0 to bit 3) are input to the arithmetic circuits 5402 as the first bit component, and the upper bits (bit 4 to bit 15) are input to the arithmetic circuits 5402 as the second bit component.
In the second setting, the selection control circuit 58 is configured so that, of the 16 bits of data output from the memory 52, the 8 bits from bit 0 to bit 7 are input to the arithmetic circuits 5402 as the first bit component, and bit 8 to bit 15 are input to the arithmetic circuits 5402 as the second bit component.
Specifically, numbering the eight multiplexers 5602 from the lower bit to the upper bit as the first to eighth multiplexers 5602, the first multiplexer 5602 selects between bit 4 and bit 8, the second between bit 5 and bit 9, the third between bit 6 and bit 10, the fourth between bit 7 and bit 11, the fifth between data 0 and bit 12, the sixth between data 0 and bit 13, the seventh between data 0 and bit 14, and the eighth between data 0 and bit 15.
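The two operand splits stated above (bits 0 to 3 and bits 4 to 15 in the first setting, bits 0 to 7 and bits 8 to 15 in the second) can be modeled behaviorally as follows. This is a minimal sketch, not a model of the multiplexer wiring in FIG. 9; the function name `split_word` and the integer encoding of the setting are assumptions introduced for illustration.

```python
# Width of the first (lower) bit component for each setting, per the text.
LOW_WIDTH = {1: 4, 2: 8}


def split_word(word, setting):
    """Split a 16-bit memory word into (first, second) bit components.

    first  = lower bits (bit 0 upward)
    second = remaining upper bits (up to bit 15)
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    low = LOW_WIDTH[setting]
    first = word & ((1 << low) - 1)
    second = word >> low
    return first, second


print([hex(v) for v in split_word(0xABCD, 1)])
print([hex(v) for v in split_word(0xABCD, 2)])
```

Selecting the setting plays the role of the bit width selection signal Sb; in the device this selection is made once by the CPU 1 before the data is read.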

Next, the configuration of each arithmetic circuit 5402 of the arithmetic control circuit 54 will be specifically described.
In this embodiment, twelve arithmetic circuits 5402 are provided, and addition is performed by these arithmetic circuits 5402. Accordingly, a 12-bit value and a 4-bit value are added in the first setting, and two 8-bit values are added in the second setting.
Specifically, numbering the twelve arithmetic circuits 5402 from the lower bit to the upper bit as the first to twelfth arithmetic circuits 5402, the first arithmetic circuit 5402 receives bit 0 and the output data of the first multiplexer 5602, the second receives bit 1 and the output data of the second multiplexer 5602, the third receives bit 2 and the output data of the third multiplexer 5602, the fourth receives bit 3 and the output data of the fourth multiplexer 5602, the fifth receives bit 4 and the output data of the fifth multiplexer 5602, the sixth receives bit 5 and the output data of the sixth multiplexer 5602, the seventh receives bit 6 and the output data of the seventh multiplexer 5602, the eighth receives bit 7 and the output data of the eighth multiplexer 5602, the ninth receives bit 8 and data 0, the tenth receives bit 9 and data 0, the eleventh receives bit 10 and data 0, and the twelfth receives bit 11 and data 0.

As described above, data 0 is set as one input of the fifth to eighth multiplexers 5602 and as one input of the ninth to twelfth arithmetic circuits 5402. The reason is as follows.
In the first setting, the first bit component (upper 12 bits) and the second bit component (lower 4 bits) are added. For the upper 8 bits of the 12-bit first bit component, that is, the bits other than its lower 4 bits, the second bit component has no corresponding data. In other words, the second bit component lacks 8 bits of data. For this 8-bit portion, data 0 is therefore set as described above if the second bit component is a positive number, and a sign-extended fixed value is set if it is a negative number. The bit configuration selection circuit 56 sets this data 0 or fixed value.
In the second setting, the first bit component (upper 8 bits) and the second bit component (lower 8 bits) are added, so both components lack the upper 4 bits relative to the 12-bit width. For this 4-bit portion, the bit configuration selection circuit 56 sets data 0 or a fixed value for positive or negative numbers in the same way as above.
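The padding described above amounts to zero extension for positive values and sign extension for negative ones before a 12-bit addition. The following sketch illustrates that idea under stated assumptions: operands are treated as two's-complement fields when `signed=True`, and any carry out of bit 11 is discarded. Neither assumption is confirmed by the text, and the names `extend` and `add_fields` are hypothetical.

```python
OUT_BITS = 12  # output width stated for the data processing device 5


def extend(value, width, signed=False):
    """Extend a `width`-bit field to OUT_BITS bits.

    Missing upper bits are filled with 0, or with 1s when the field is
    treated as signed and its top bit is set (sign extension).
    """
    value &= (1 << width) - 1
    if signed and ((value >> (width - 1)) & 1):
        fill = ((1 << (OUT_BITS - width)) - 1) << width
        return value | fill
    return value


def add_fields(a, a_width, b, b_width, signed=False):
    """Add two fields after extension, keeping only the low 12 bits
    (assumption: carry out of bit 11 is dropped)."""
    mask = (1 << OUT_BITS) - 1
    return (extend(a, a_width, signed) + extend(b, b_width, signed)) & mask


print(hex(add_fields(0xABC, 12, 0xD, 4)))               # 12-bit + 4-bit
print(hex(add_fields(0xCD, 8, 0xAB, 8)))                # 8-bit + 8-bit
print(hex(add_fields(0xABC, 12, 0xD, 4, signed=True)))  # 0xD read as -3
```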

Next, the operation of the data processing device 5 will be described, reusing FIG. 4.
The memory 52 stores in advance the first and second data Da and Db as operation operands divided according to either the first setting or the second setting.
First, the CPU 1 sets the bit width of the operation operands for the data processing circuit 5. That is, the CPU 1 selects either the first setting or the second setting and controls the bit selection control register 5802 so that the bit width selection signal Sb is supplied to each multiplexer 5602. Each multiplexer 5602 then performs the selection corresponding to the bit width selection signal Sb.
Next, the CPU 1 supplies the address A01 to the data processing circuit 5 in the first cycle. The address A01 includes the decode bits I described above, the arithmetic function designation bits J (01 in this example), the first memory address Aa, and the second memory address Ab. Accordingly, data D corresponding to the memory addresses Aa and Ab is input from the memory 52 to the bit configuration selection circuit 56, which divides the data D into the first and second data Da and Db as two operands and supplies them to the arithmetic control circuit 54 (first process).
While the CPU 1 executes the instruction i1 in the second cycle, the arithmetic control circuit 54 performs the operation designated by the arithmetic function designation bits J (logical product in this example) and outputs the result as output data Dout to the data bus DB via the buffer 5002 (second process). The output data Dout on the data bus DB is stored in the register R0 of the CPU 1.
Since the CPU 1 is not involved in the arithmetic processing, it spends no cycles on the operation itself.
Thereafter, the instructions i1, i2, i3, i4, ... are executed sequentially in the same way.
Therefore, the number of cycles needed to obtain one operation result, that is, to process one set of data, is only two. When a plurality of instructions are executed in succession, the supply of an address by the CPU 1 and the output of the output data Dout from the data processing device 5 take place almost simultaneously, so that, for example, three sets of data are processed in 4 cycles and 100 sets in 101 cycles.
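The cycle counts quoted above follow a simple pattern: with the address supply and the result output overlapped, N sets take N + 1 cycles, compared with 4 cycles per set for the conventional fetch, fetch, operate sequence described for FIG. 6. The short sketch below just reproduces that arithmetic; the function names are hypothetical, and the 4-cycle figure is taken from the background description rather than derived here.

```python
def conventional_cycles(sets, cycles_per_set=4):
    """Cycles for the conventional CPU sequence (no loop instructions)."""
    return sets * cycles_per_set


def overlapped_cycles(sets):
    """Cycles when each address supply overlaps the previous result
    output: one cycle per set plus one cycle to drain the last result."""
    return sets + 1 if sets > 0 else 0


for n in (1, 3, 100):
    print(n, conventional_cycles(n), overlapped_cycles(n))
```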

According to the third embodiment, as in the first embodiment, reducing accesses to the memory 52 is of course advantageous in reducing power consumption and processing time. In addition, the data processing circuit 5 can change the bit width of the data to be operated on by selecting the division of the operation operands from the first setting and the second setting in accordance with an instruction from the CPU 1. This is advantageous for operating on data of various bit configurations with different bit widths while suppressing any increase in the cost of the data processing device 5.

In the third embodiment, the case where the bit configuration selection circuit 56 divides the data of the memory 52 into two bit components, the first bit component and the second bit component, has been described. However, the number of bit components may be three or more, and the same effects as in the third embodiment are of course obtained. In that case, the data address assigned to the data processing device 5 need only be divided into the same number of memory addresses as the number of bit components into which the data of the memory 52 is divided.
In the third embodiment, the division of the operation operands by the bit configuration selection circuit 56 is selected from two settings. However, the number of settings to choose from is arbitrary, and the bit widths of the divided operation operands may also be set arbitrarily.
In the third embodiment, the case where the selection control circuit 58 is formed by the bit selection control register 5802 and this register is provided inside the data processing device 5 has been described. However, a register equivalent to the bit selection control register 5802 may be provided outside the data processing device 5. For example, the register may be provided in a memory that forms the working area of the CPU 1.
As in the first and second embodiments, the arithmetic processing performed by the data processing circuit 5 may be any of the four arithmetic operations, a logical operation, or arithmetic processing other than image processing.
Also as in the first and second embodiments, the data processing circuit 5 is not limited to being included in the LSI 10 formed on a single semiconductor chip together with the CPU 1, the instruction memory 2, and the data memory 3. The data processing circuit 5 may be configured as a component separate from the CPU 1, the instruction memory 2, and the data memory 3, and its configuration is arbitrary.

FIG. 1 is a block diagram showing the configuration of an LSI including the data processing device in the first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the data processing device in the first embodiment of the present invention.
FIG. 3 is an explanatory diagram showing the bit configuration of the address data assigned to the data processing device 4 by the CPU 1.
FIG. 4 is a timing chart showing the operation of the LSI 10 in the first embodiment.
FIG. 5 is an explanatory diagram showing the concept of mask processing of image data.
FIG. 6 is an explanatory diagram showing an example program that fetches two operands and stores the operation result in a register inside the CPU, together with its timing.
FIG. 7 is a block diagram showing the configuration of an LSI including the data processing device in the third embodiment.
FIG. 8 is a block diagram showing the configuration of the data processing device in the third embodiment.
FIG. 9 is a diagram explaining the configuration and operation of the bit configuration selection circuit of the data processing device in the third embodiment.

Explanation of symbols

1 ... CPU, 4 ... data processing device, 42 ... memory, 42A ... first memory array, 42B ... second memory array, 44 ... arithmetic control circuit, Da ... first data, Db ... second data, Aa ... address of the first memory array 42A, Ab ... address of the second memory array 42B, AB ... address bus, DB ... data bus.

Claims

A data processing apparatus comprising a memory and an arithmetic control circuit that performs a logical operation, the memory and the arithmetic control circuit being connected to a CPU via an address bus and a data bus, wherein:
the data address assigned to the data processing apparatus includes a bit component indicating a first memory address and a bit component indicating a second memory address;
the memory includes a first memory array and a second memory array;
the first memory address is assigned as an address of the first memory array, and the second memory address is assigned as an address of the second memory array; and
the arithmetic control circuit is configured to perform:
a first process of inputting, to the arithmetic control circuit, first and second data that are read simultaneously from the first and second memory arrays when the first and second memory addresses are supplied from the CPU to the first and second memory arrays via the address bus; and
a second process of performing a predetermined calculation on the input first and second data and outputting the calculation result to the CPU via the data bus.

The data processing apparatus according to claim 1, wherein the data of the memory includes a bit component constituting the first data and a bit component constituting the second data.

The data processing apparatus according to claim 1, wherein the data address assigned to the data processing apparatus further includes a bit component indicating control data, the control data includes a bit component indicating operation designation data that specifies the type of operation performed by the arithmetic control circuit, and, in the second process, the arithmetic control circuit performs the predetermined calculation on the first and second data based on the operation designation data included in the control data and outputs the calculation result to the CPU via the data bus.

The data processing apparatus according to claim 1, wherein the data address assigned to the data processing apparatus further includes a bit component indicating control data, and the control data includes chip select data for selecting the first and second memory arrays.

The data processing apparatus according to claim 1, wherein the first and second data are transferred between the first and second memory arrays and the arithmetic control circuit via an internal connection line independent of the data bus.

The data processing apparatus according to claim 1, wherein the memory further includes a normal memory unit in addition to the first and second memory arrays, the data address assigned to the data processing apparatus further includes a bit component indicating control data, and the control data includes chip select data for selecting the first and second memory arrays and the normal memory unit, and arithmetic control circuit control data for designating whether or not the arithmetic control circuit executes the operation on the first and second data.

The data processing apparatus according to claim 1, wherein the memory and the arithmetic control circuit are provided on a single semiconductor chip.

The data processing apparatus according to claim 1, wherein the bit width of the address data in the first memory array is equal to the bit width of the address data in the second memory array.

The data processing apparatus according to claim 1, wherein the bit width of data in the first memory array and the bit width of data in the second memory array are equal to each other.

A data processing apparatus comprising a memory and an arithmetic control circuit that performs a logical operation, the memory and the arithmetic control circuit being connected to a CPU via an address bus and a data bus, wherein:
the data address assigned to the data processing apparatus includes a bit component indicating a first memory address and a bit component indicating a second memory address;
the memory is composed of a single memory array;
a bit configuration selection circuit is provided that divides the data supplied from the memory array into a first bit component to which the first memory address is assigned and a second bit component to which the second memory address is assigned, supplies first data composed of the first bit component and second data composed of the second bit component to the arithmetic control circuit, and sets the bit widths of the first and second bit components based on a bit width selection signal; and
the arithmetic control circuit is configured to perform:
a first process of inputting, to the arithmetic control circuit, the first and second data supplied simultaneously from the bit configuration selection circuit when the first and second memory addresses are supplied from the CPU to the memory array via the address bus; and
a second process of performing a predetermined calculation on the input first and second data and outputting the calculation result to the CPU via the data bus.

The data processing apparatus according to claim 10, wherein the data address assigned to the data processing apparatus further includes a bit component indicating control data, the control data includes a bit component indicating operation designation data that specifies the type of operation performed by the arithmetic control circuit, and, in the second process, the arithmetic control circuit performs the predetermined calculation on the first and second data based on the operation designation data included in the control data and outputs the calculation result to the CPU via the data bus.

The data processing apparatus according to claim 10, wherein the first and second data are transferred between the memory array and the arithmetic control circuit via an internal connection line independent of the data bus.

The data processing apparatus according to claim 10, wherein the memory further includes a normal memory unit in addition to the memory array, the data address assigned to the data processing apparatus further includes a bit component indicating control data, and the control data includes chip select data for selecting the memory array and the normal memory unit, and arithmetic control circuit control data for designating whether or not the arithmetic control circuit executes the operation on the first and second data.

The data processing apparatus according to claim 10, wherein the memory, the arithmetic control circuit, and the bit configuration selection circuit are provided on a single semiconductor chip.

The data processing apparatus according to claim 10, wherein a bit selection control register whose register value is set by the CPU is provided, and the bit width selection signal is generated by decoding an output signal of the bit selection control register.