JP2006155223A

JP2006155223A - Data processor

Info

Publication number: JP2006155223A
Application number: JP2004344571A
Authority: JP
Inventors: Maiko Taruki; 麻衣子樽木; Takeshi Nakamura; 中村　　剛
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-11-29
Filing date: 2004-11-29
Publication date: 2006-06-15

Abstract

PROBLEM TO BE SOLVED: To provide a data processor capable of handling the complex applications required of it while preventing increases in circuit scale and power consumption as well as deterioration of processing performance.

SOLUTION: This data processor comprises a plurality of exclusive arithmetic parts 4, 5 and 6 performing predetermined arithmetic processing, a signal line 10 connected to the plurality of exclusive arithmetic parts 4, 5 and 6, and a common arithmetic part 9 performing specified processing, which is connected to the plurality of exclusive arithmetic parts through the signal line 10. The common arithmetic part 9 is used in common by at least two of the plurality of exclusive arithmetic parts 4, 5 and 6.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

本発明は、所定のデータ処理を実行する専用演算部を搭載したデータ処理装置に関する。 The present invention relates to a data processing apparatus equipped with a dedicated arithmetic unit that executes predetermined data processing.

A conventional data processing apparatus that performs image processing or data compression is built around a main processor for control, to which dedicated arithmetic units that perform specific arithmetic processing are connected. Each dedicated arithmetic unit may be a dedicated circuit that performs specific arithmetic processing, or part or all of a dedicated LSI or ASIC (application-specific IC).

図１４は従来のデータ処理装置のブロック図である。 FIG. 14 is a block diagram of a conventional data processing apparatus.

The data processing apparatus 100 includes a main processor 101 and a plurality of dedicated arithmetic units 102, and performs arbitrary data processing. In FIG. 14, three dedicated arithmetic units 102 are provided. The main processor 101 is not an essential component; any element that controls the plurality of dedicated arithmetic units 102 (synchronization control, instruction control, and so on) may be used instead.

The dedicated arithmetic units 102 may each execute different arithmetic processing, or may execute similar or identical arithmetic processing. Because the plurality of dedicated arithmetic units 102 execute the computation-heavy processing, power consumption can be reduced and real-time processing becomes possible.

なお、データ処理装置１００は、その全てもしくは一部がＬＳＩで実現されることも多い。 Note that the data processing apparatus 100 is often realized entirely or partially by an LSI.

図１４に示されるような従来のデータ処理装置の構成が、特許庁資料室ホームページの技術動向トピックス／システムＬＳＩのレイアウト／画像処理用ＬＳＩ／データ圧縮用ＬＳＩに紹介されている。 The configuration of a conventional data processing apparatus as shown in FIG. 14 is introduced in the technical trend topics / system LSI layout / image processing LSI / data compression LSI on the JPO data room homepage.

Examples of such data processing apparatuses include compression encoding and decoding apparatuses and processing LSIs for MPEG2, MPEG4, JPEG, and the like.

また、このようなデータ処理装置での消費電力削減のために、動作周波数制御が有効である（例えば特許文献１、特許文献２参照）。 Further, in order to reduce power consumption in such a data processing apparatus, operation frequency control is effective (see, for example, Patent Document 1 and Patent Document 2).

また、近年は複数の画像処理フォーマットに対応した符号復号化処理の実現や、画像処理に加えた音声処理の実現なども求められている。即ち、一つのデータ処理装置やこれを実現するＬＳＩが処理するアプリケーションの量が増加傾向にある。 In recent years, there has also been a demand for realization of encoding / decoding processing corresponding to a plurality of image processing formats, realization of audio processing in addition to image processing, and the like. That is, the amount of applications processed by one data processing device or an LSI that realizes the data processing device tends to increase.

しかしながら、従来のデータ処理装置では、次のような問題点を有していた。 However, the conventional data processing apparatus has the following problems.

As the applications handled by a single data processing apparatus, or by the LSI that implements it, grow in number and complexity, the number and circuit scale of the dedicated arithmetic units 102 that must be mounted to maintain low power consumption and real-time performance increase. When the apparatus is implemented as an LSI, the larger circuit scale increases the chip area, which is also a problem in terms of cost. Even when the apparatus is implemented as a system or as separate circuits, the larger circuit scale similarly causes problems such as an increase in mounting area.

Further, when the processing is implemented by a program, power consumption increases as the program grows in scale.

Patent documents: JP 2003-248524 A; JP-A-7-334267; JP 2000-298652 A

Accordingly, an object of the present invention is to provide a data processing apparatus that reduces circuit scale and power consumption while preventing deterioration in processing performance, and that can handle the complex applications required of it.

A data processing apparatus according to a first aspect of the present invention includes a plurality of dedicated arithmetic units that perform predetermined operations, a signal line connected to the plurality of dedicated arithmetic units, and a common arithmetic unit that is connected to the plurality of dedicated arithmetic units via the signal line and performs common arithmetic processing. The common arithmetic unit is used in common by at least two of the plurality of dedicated arithmetic units.

An arithmetic unit that was provided redundantly in the plurality of dedicated arithmetic units is removed from them, provided separately as a common arithmetic unit, and shared by the plurality of dedicated arithmetic units, so the circuit scale of the data processing apparatus can be reduced appropriately.

第２の発明のデータ処理装置は、共通演算部が、複数の専用演算部の少なくとも２以上で共通に用いられる積和演算を行う少なくとも１つの共通積和演算部を備える。 In the data processing device according to the second aspect of the present invention, the common operation unit includes at least one common product-sum operation unit that performs a product-sum operation commonly used by at least two or more of the plurality of dedicated operation units.

Product-sum operation units, which are large in both circuit scale and power consumption and are duplicated across the plurality of dedicated arithmetic units, are removed from the dedicated arithmetic units and provided separately as a shared common product-sum operation unit, which further reduces the circuit scale and power consumption of the data processing apparatus.

第３の発明のデータ処理装置は、共通演算部が、複数の専用演算部で非共通に用いられる演算を行う少なくとも１つの個別演算部をさらに備える。 In the data processing device according to the third aspect of the present invention, the common arithmetic unit further includes at least one individual arithmetic unit that performs arithmetic operations that are not commonly used by the plurality of dedicated arithmetic units.

Providing the common arithmetic unit not only with arithmetic units used in common by the plurality of dedicated arithmetic units but also with arithmetic units used individually raises the processing capability of the common arithmetic unit and lightens the processing load on the dedicated arithmetic units. Processing capability is thus improved in addition to the reduction in circuit scale.

第４の発明のデータ処理装置は、複数の専用演算部のそれぞれが、共通積和演算部と個別演算部の組み合わされた演算結果を用いる。 In the data processing apparatus according to the fourth aspect of the invention, each of the plurality of dedicated calculation units uses a calculation result obtained by combining the common product-sum calculation unit and the individual calculation unit.

共通演算部の処理能力を高め、結果としてデータ処理装置全体の処理能力を向上させつつ、回路規模の適切な削減が可能となる。 It is possible to appropriately reduce the circuit scale while increasing the processing capability of the common arithmetic unit and consequently improving the processing capability of the entire data processing apparatus.

第５の発明のデータ処理装置は、共通演算部が個別演算部を複数備え、複数の個別演算部から専用演算部の要求に対応する所定の個別演算部を選択する選択部をさらに備える。 In a data processing device according to a fifth aspect of the present invention, the common calculation unit includes a plurality of individual calculation units, and further includes a selection unit that selects a predetermined individual calculation unit corresponding to the request of the dedicated calculation unit from the plurality of individual calculation units.

個別演算部が、専用演算部の要求に適切に対応して演算を実行することで、データ処理装置全体のパフォーマンスを向上させることができる。 When the individual calculation unit performs the calculation appropriately corresponding to the request of the dedicated calculation unit, the performance of the entire data processing apparatus can be improved.

In the data processing apparatus according to a sixth aspect of the invention, a plurality of common arithmetic units are provided, and their number is less than the number of dedicated arithmetic units.

By sharing the arithmetic units duplicated across the dedicated arithmetic units to the greatest possible extent, the circuit scale can be reduced even in a data processing apparatus that has various dedicated arithmetic units for handling complex applications.

第７の発明のデータ処理装置は、複数の専用演算部が、フィルタ処理部、直交変換処理部、動き検出部、および動き補償部の少なくとも一つである。 In the data processing device of the seventh invention, the plurality of dedicated arithmetic units are at least one of a filter processing unit, an orthogonal transformation processing unit, a motion detection unit, and a motion compensation unit.

Because these dedicated arithmetic units redundantly contain product-sum operation units of large circuit scale, built from multipliers, adders, and the like, the common arithmetic units are easy to share, and the circuit scale can be reduced even in a data processing apparatus that handles a plurality of complex applications.

第８の発明のデータ処理装置は、共通演算部が、専用演算部とは別のクロックを出力するクロック制御部をさらに備える。 In the data processing device according to the eighth aspect of the present invention, the common calculation unit further includes a clock control unit that outputs a clock different from that of the dedicated calculation unit.

Because the common arithmetic unit operates on its own independent clock, when a plurality of dedicated arithmetic units operate in parallel, the shared common arithmetic unit can be made to operate in a pseudo-parallel manner, for example. This prevents the loss of processing capability that circuit sharing would otherwise cause.

第９の発明のデータ処理装置は、クロック制御部は、共通演算部が非動作時にクロック信号を未出力とする。 In the data processing device according to the ninth aspect of the invention, the clock control unit outputs no clock signal when the common operation unit is not operating.

Power consumption, already reduced by sharing the commonly used arithmetic units, can be reduced further.
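The effect of stopping the clock while the common arithmetic unit is idle can be illustrated with a small behavioral model. This is only a sketch: the function name and the activity trace are hypothetical, and counting delivered clock edges is a rough stand-in for dynamic power, not a figure taken from the patent.

```python
def gated_clock_edges(busy_per_cycle):
    """Count the clock edges delivered to the common unit when its clock is gated."""
    # An edge is delivered only in cycles where the common unit has work to do.
    return sum(1 for busy in busy_per_cycle if busy)


# Hypothetical activity trace: True means some dedicated unit requested an operation.
activity = [True, True, False, False, True, False, False, False]

ungated_edges = len(activity)               # a free-running clock toggles every cycle
gated_edges = gated_clock_edges(activity)   # the gated clock toggles only when busy

print(ungated_edges, gated_edges)
```

With this trace the gated clock delivers fewer edges than the free-running one, which is the sense in which gating adds to the saving obtained from sharing.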

In the data processing apparatus according to a tenth aspect of the present invention, where N is the number of dedicated arithmetic units, F is the clock frequency of the dedicated arithmetic units, and f is the clock frequency of the common arithmetic unit, the clock frequency f is given by f = N × F.

By setting the clock according to the number of dedicated arithmetic units, the processing of the common arithmetic unit becomes equivalent, in a pseudo manner, to parallel processing, and all of the dedicated arithmetic units can run simultaneously.
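A short sketch of the relation f = N × F follows. The round-robin slot assignment is an assumption made for illustration, since the text only states that the common unit runs N times faster; the function names and the 100 MHz figure are likewise hypothetical.

```python
def common_clock_hz(num_dedicated_units, dedicated_clock_hz):
    """Return the common-unit clock frequency f = N * F."""
    return num_dedicated_units * dedicated_clock_hz


def round_robin_slots(num_dedicated_units, dedicated_cycles):
    """List which dedicated unit owns each fast-clock cycle of the common unit.

    Assumes a simple round-robin schedule: within one dedicated-unit cycle,
    the common unit serves units 0..N-1 in turn.
    """
    return [unit
            for _ in range(dedicated_cycles)
            for unit in range(num_dedicated_units)]


f = common_clock_hz(3, 100_000_000)   # three dedicated units at a hypothetical 100 MHz
slots = round_robin_slots(3, 2)       # two dedicated-unit cycles

print(f)      # 300000000
print(slots)  # [0, 1, 2, 0, 1, 2]
```

Under this schedule every dedicated unit receives one common-unit slot per cycle of its own clock, so from each unit's point of view the shared block behaves as if it were private.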

第１１の発明のデータ処理装置は、共通演算部が、プログラム処理を実行するプロセッサユニットを備える。 In a data processing device according to an eleventh aspect, the common arithmetic unit includes a processor unit that executes program processing.

By consolidating the operations duplicated across the plurality of dedicated arithmetic units into a shared processor unit that handles them by program processing, the circuit scale is reduced. The common arithmetic unit can also accommodate after-the-fact changes, such as changes to operation coefficients or processing procedures, which improves flexibility.

第１２の発明のデータ処理装置は、プロセッサユニットが、複数の専用演算部の少なくとも２以上で共通に用いられる積和演算を実行する共通積和演算プログラムを備える。 In a data processing device according to a twelfth aspect, the processor unit includes a common product-sum operation program for executing a product-sum operation commonly used by at least two or more of the plurality of dedicated arithmetic units.

Because the product-sum operation, which would otherwise need a large circuit built from multipliers, adders, and the like, can be handled by a common program, flexibility is high and the circuit scale is reduced.

According to the present invention, the common arithmetic processing that a plurality of dedicated arithmetic units redundantly contain is moved outside them into a common arithmetic unit that is connected and shared, which reduces circuit scale, chip area, and mounting area.

また、共有化される共通演算部が、独立したクロック制御部を有することで、処理速度の低下を防止すると共に、消費電力を削減できる。 In addition, since the shared common operation unit has an independent clock control unit, it is possible to prevent a reduction in processing speed and reduce power consumption.

Furthermore, by mounting the common arithmetic unit on a sub-processor or the like and implementing it in software, the data processing apparatus not only has a smaller circuit scale but can also respond flexibly to later specification changes, coefficient changes, and the like.

以下、図面を参照しながら、本発明の実施の形態を説明する。 Hereinafter, embodiments of the present invention will be described with reference to the drawings.

(Embodiment 1)
First, the data processing apparatus according to Embodiment 1 of the present invention will be described with reference to FIGS. 1A, 1B, and 2, including the changes from the prior art.

FIG. 1A is a block diagram of a conventional data processing apparatus, and FIG. 1B is a block diagram of the data processing apparatus according to Embodiment 1 of the present invention. FIG. 1A is shown for comparison with FIG. 1B. FIG. 2 is also a block diagram of the data processing apparatus according to Embodiment 1 of the present invention.

In the conventional data processing apparatus 1 shown in FIG. 1A, a plurality of dedicated arithmetic units 4, 5, and 6 that perform predetermined operations are connected via a signal line 8. Each of the dedicated arithmetic units 4, 5, and 6 redundantly contains an arithmetic unit 7 that performs a common operation.

The data processing apparatus 2 shown in FIG. 1B includes a plurality of dedicated arithmetic units 4, 5, and 6 that perform predetermined operations, a signal line 10 connected to the dedicated arithmetic units 4, 5, and 6, and a common arithmetic unit 9 that is connected to the dedicated arithmetic units 4, 5, and 6 via the signal line 10 and performs common arithmetic processing. The common arithmetic unit 9 is used in common by at least two of the dedicated arithmetic units 4, 5, and 6. The signal lines 8 and 10 may each be a single-bit or a multi-bit signal line.

In this way, the arithmetic unit 7 that executes the common operation, which was duplicated across the dedicated arithmetic units 4, 5, and 6 as shown in FIG. 1A, is factored out of each of them and connected as a separately provided, shared common arithmetic unit 9, so the circuit scale can be reduced efficiently.

In both FIG. 1A and FIG. 1B, the main processor 3 is connected via the signal line 8, but the main processor 3 is connected only as needed. The connected element is not limited to the main processor 3; a control circuit or the like may be connected as appropriate. Although three dedicated arithmetic units 4, 5, and 6 are shown in each of FIGS. 1A and 1B, there may be four or more.

Next, FIG. 2 shows a data processing apparatus 2 that includes three or more dedicated arithmetic units 4, 5, 6, and 12, with a plurality of common arithmetic units 9 connected via the signal line 10. Although a plurality of common arithmetic units are connected, their number is less than the number of dedicated arithmetic units, because an equal number would not reduce the circuit scale. Thus, there may be one common arithmetic unit 9 or several.
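The reason the number of common arithmetic units must stay below the number of dedicated arithmetic units can be checked with a rough area count. The sketch below is a simplification under stated assumptions: areas are in arbitrary units, the values are hypothetical, and the overhead of the signal line and any control logic is ignored.

```python
def total_area(num_dedicated, private_area, shared_block_area, num_common=None):
    """Return total circuit area in arbitrary units.

    num_common=None models the conventional layout, in which every dedicated
    unit carries its own copy of the duplicated block.
    """
    copies = num_dedicated if num_common is None else num_common
    return num_dedicated * private_area + copies * shared_block_area


# Hypothetical figures: 4 dedicated units, 10 units of private logic each,
# and a duplicated block (for example a product-sum unit) of 6 units.
conventional = total_area(4, 10, 6)              # 4 private copies of the block
two_common = total_area(4, 10, 6, num_common=2)  # 2 shared copies
four_common = total_area(4, 10, 6, num_common=4) # as many copies as dedicated units

print(conventional, two_common, four_common)  # 64 52 64
```

With as many common units as dedicated units the total equals the conventional layout, so under this model a saving appears only when the number of common units is smaller.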

In FIG. 2, the common arithmetic unit 9 further includes a common product-sum operation unit 15 and an individual arithmetic unit 16. The common product-sum operation unit 15 is an operation block that executes product-sum operations common to at least two of the plurality of dedicated arithmetic units, for example the additions and multiplications they share. Among the operations duplicated across the dedicated arithmetic units, product-sum operations, which combine addition and multiplication, are particularly large in circuit scale, processing scale, and power consumption, so sharing them as a separate block is especially beneficial.

The individual arithmetic unit 16 is a block that executes operations not used in common by the plurality of dedicated arithmetic units, and is provided in the common arithmetic unit 9 as needed. In this case, the common arithmetic unit 9 can execute individual operations, such as individual data addition or bit swapping, in addition to the operations used in common by at least two of the dedicated arithmetic units (for example, a common product-sum operation). Even when the individual arithmetic unit 16 is included, the circuit elements it requires (for example, flip-flops, shift registers, adders, and multipliers) can be shared with those of the common product-sum operation unit 15, so the circuit scale is smaller than when only the dedicated arithmetic units 4, 5, and 6 are provided as in FIG. 1A.

次に、図１（ｂ）や図２に表されるデータ処理装置２により、回路規模などが削減できることについての詳細を説明する。 Next, details of the reduction in circuit scale and the like by the data processing device 2 shown in FIG. 1B and FIG. 2 will be described.

Each of the dedicated arithmetic units 4, 5, and 6 contains an arithmetic unit 7 that executes common arithmetic processing. For example, when the dedicated arithmetic unit 4 executes MPEG2, the dedicated arithmetic unit 5 executes MPEG4, and the dedicated arithmetic unit 6 executes JPEG, the discrete cosine transform (hereinafter "DCT") is a common operation, as are the product-sum operations contained in the DCT. The arithmetic unit 7 is a block that performs such operations common to the dedicated arithmetic units 4 to 6. In the prior-art data processing apparatus 1, this common arithmetic unit 7 was duplicated across the dedicated arithmetic units 4, 5, and 6, which was wasteful and increased the circuit scale.

In the data processing apparatus 2, by contrast, the arithmetic unit 7 is extracted from each of the dedicated arithmetic units 4, 5, and 6 and connected as the common arithmetic unit 9 via the bus 10. The arithmetic unit 7 that was duplicated in the dedicated arithmetic units 4, 5, and 6 is thereby removed, and the circuit scale of each dedicated arithmetic unit is reduced.

Because the redundantly provided arithmetic unit 7 is removed from each of the dedicated arithmetic units 4, 5, and 6 and connected as a shared common arithmetic unit 9, the circuit scale can be reduced substantially.

In particular, when the common arithmetic unit 9 includes the common product-sum operation unit 15, which factors out the product-sum operations duplicated across the dedicated arithmetic units, the reduction in circuit scale is even greater, because a product-sum operation involves many circuit elements that tend to be large, such as adders, multipliers, and registers.

Because the common arithmetic unit 9 is connected via the bus 10, each of the dedicated arithmetic units 4 to 6 can exchange data with it. In other words, the dedicated arithmetic units 4, 5, and 6 can use the common arithmetic unit 9 in common.

As described above, the common arithmetic unit 9 often includes the common product-sum operation unit 15 shared by the dedicated arithmetic units 4, 5, and 6, but it is not limited to product-sum operations. For example, it may include a shared arithmetic unit for a specific control operation, error correction, error coding, or error detection.

The shared arithmetic unit is preferably common to all of the dedicated arithmetic units, but it may be common to only two or more of them, because the circuit scale can be reduced in the same way.

更に、共通演算部は、共通積和演算部１５と個別演算部１６両方を含むものであってもよい。共通積和演算部１５と親和性の高い演算部が、個別演算部１６として共通演算部９に含まれることで処理性能低下を防止できるからである。さらに、個別演算部１６が、共通積和演算部１５と回路素子を共有することで、回路規模も削減可能である。 Furthermore, the common calculation unit may include both the common product-sum calculation unit 15 and the individual calculation unit 16. This is because a calculation unit having high affinity with the common product-sum calculation unit 15 is included in the common calculation unit 9 as the individual calculation unit 16, thereby preventing deterioration in processing performance. Furthermore, the circuit scale can also be reduced by sharing the circuit element with the common product-sum operation unit 15 by the individual operation unit 16.

次に、図３（ａ）、図３（ｂ）を用いて、共通演算部９の他のバリエーションについて説明する。図３（ａ）、図３（ｂ）は本発明の実施の形態１における共通演算部の内部ブロック図である。 Next, another variation of the common arithmetic unit 9 will be described with reference to FIGS. 3 (a) and 3 (b). 3 (a) and 3 (b) are internal block diagrams of the common arithmetic unit in the first embodiment of the present invention.

図３（ａ）には、共通積和演算部１５と個別演算部１６、および選択部１７により構成された共通演算部９が表されている。 FIG. 3A shows a common calculation unit 9 including a common product-sum calculation unit 15, an individual calculation unit 16, and a selection unit 17.

The selection unit 17 selects the individual arithmetic unit 16 that corresponds to whichever of the dedicated arithmetic units 4, 5, and 6 is operating. The required one of the plurality of individual arithmetic units 16 is thereby selected, and the processing of the dedicated arithmetic units 4, 5, and 6 is carried out. The selection unit 17 selects the corresponding individual arithmetic unit 16 on the basis of, for example, a control signal from the dedicated arithmetic units 4 to 6. At this time, the common product-sum operation unit 15 also operates as needed, and its result, combined with that of the selected individual arithmetic unit 16, is output to the dedicated arithmetic units 4, 5, and 6.

以上の動作により、専用演算部４、５、６で必要となる演算結果が、共通演算部９により得られる。 Through the above operations, the calculation results required by the dedicated calculation units 4, 5, 6 are obtained by the common calculation unit 9.
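The cooperation of the selection unit, the individual arithmetic units, and the common product-sum operation unit can be sketched as a small behavioral model. Everything named here is hypothetical: the class, the requester identifiers 4 and 5, the two individual operations, and the choice to apply the individual operation after the product-sum are assumptions for illustration, since the text does not fix the order in which the two results are combined.

```python
class CommonArithmeticUnit:
    """Behavioral sketch of a common unit with a shared MAC and selectable individual ops."""

    def __init__(self, individual_ops):
        # Maps the id of a requesting dedicated unit to its individual operation.
        self.individual_ops = individual_ops

    def product_sum(self, samples, coeffs):
        """Shared product-sum (multiply-accumulate) block."""
        acc = 0
        for s, c in zip(samples, coeffs):
            acc += s * c
        return acc

    def process(self, requester, samples, coeffs):
        """Select the requester's individual op and combine it with the shared MAC."""
        individual_op = self.individual_ops[requester]  # role of the selection unit
        return individual_op(self.product_sum(samples, coeffs))


unit = CommonArithmeticUnit({
    4: lambda v: v + 1,   # hypothetical individual data addition for unit 4
    5: lambda v: v >> 2,  # hypothetical bit shift for unit 5
})

print(unit.process(4, [1, 2, 3], [4, 5, 6]))  # 33
print(unit.process(5, [1, 2, 3], [4, 5, 6]))  # 8
```

Both requests reuse the same product-sum block (1×4 + 2×5 + 3×6 = 32); only the selected individual operation differs per requester.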

また、図３（ｂ）に表されるように、個別演算部１６ではなく、共通積和演算制御部１８を設けてもよい。 Further, as illustrated in FIG. 3B, a common product-sum operation control unit 18 may be provided instead of the individual operation unit 16.

The common product-sum operation control unit 18 is used when the processing required of the shared product-sum operation unit is not exactly the same for the dedicated arithmetic units 4, 5, and 6. For example, the product-sum operation required by the dedicated arithmetic unit 4 may need at most three multiplications, whereas that required by the dedicated arithmetic unit 5 needs five. That is, the same multiplier circuit element is shared, but the required processing result differs.

In such a case, the common product-sum operation control unit 18 controls the product-sum operation in the common product-sum operation unit 15 according to the request of the corresponding dedicated arithmetic unit. In the above example, for the dedicated arithmetic unit 4, the common product-sum operation control unit 18 controls the common product-sum operation unit 15 so that it finishes after three multiplications, and the result is output from the selection unit 17. For the dedicated arithmetic unit 5, it controls the common product-sum operation unit 15 so that it finishes after five multiplications.

以上のように、共通積和演算制御部１８は、専用演算部の相違する演算要求に対応した制御を行う。 As described above, the common product-sum calculation control unit 18 performs control corresponding to different calculation requests of the dedicated calculation unit.
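The three-versus-five multiplication example can be modeled as a single product-sum routine whose iteration count is set per requester. This is a minimal sketch: the function name, the limit table keyed by the unit numbers 4 and 5, and the sample data are all hypothetical.

```python
def controlled_product_sum(samples, coeffs, max_multiplies):
    """Run the shared product-sum, stopping after max_multiplies multiplications."""
    acc = 0
    for s, c in list(zip(samples, coeffs))[:max_multiplies]:
        acc += s * c
    return acc


# Role of the control unit: number of multiplications each dedicated unit requires.
MULTIPLY_LIMIT = {4: 3, 5: 5}

samples = [1, 2, 3, 4, 5]
coeffs = [1, 1, 1, 1, 1]

result_for_unit_4 = controlled_product_sum(samples, coeffs, MULTIPLY_LIMIT[4])
result_for_unit_5 = controlled_product_sum(samples, coeffs, MULTIPLY_LIMIT[5])

print(result_for_unit_4, result_for_unit_5)  # 6 15
```

The same multiplier and accumulator serve both requesters; only the termination condition supplied by the control table changes.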

次に、図４（ａ）、図４（ｂ）を用いて、専用演算部が特定の処理を行う具体的な事例について、従来からの変化を含めて説明する。図４（ａ）は、従来のデータ処理装置のブロック図であり、図４（ｂ）は本発明の実施の形態１におけるデータ処理装置のブロック図である。ここでは、本発明のデータ処理装置２の特徴とメリットを効果的に説明するために、従来技術のデータ処理装置１も並べて表示したものである。 Next, with reference to FIGS. 4A and 4B, specific examples in which the dedicated calculation unit performs specific processing will be described including changes from the past. FIG. 4A is a block diagram of a conventional data processing apparatus, and FIG. 4B is a block diagram of the data processing apparatus in Embodiment 1 of the present invention. Here, in order to effectively explain the features and merits of the data processing apparatus 2 of the present invention, the conventional data processing apparatus 1 is also displayed side by side.

図４では、二つの専用演算部が、それぞれフィルタ処理部２０と直交変換部２１である場合が示されている。 FIG. 4 shows a case where the two dedicated arithmetic units are the filter processing unit 20 and the orthogonal transform unit 21, respectively.

Dedicated arithmetic units other than these, such as a motion detection unit or a motion compensation unit used in image compression, are also possible. Units such as the filter processing unit 20 and the orthogonal transform unit 21 perform complex processing and therefore have a very large circuit scale, yet they often redundantly contain common operations such as product-sum operations (multiplication and addition), bit shifts, and bit swaps. Redundant, wasteful arithmetic circuits therefore exist, and the circuit scale tends to grow.

例えば、図４（ａ）に表されるように、フィルタ処理部２０も直交変換部２１も共通する積和演算部２２を含んでいる。この積和演算部２２は、フィルタ処理部でのデータ処理に必要な積和演算を実行する、あるいは、直交変換部でのデータ処理に必要な積和演算を実行する。 For example, as shown in FIG. 4A, the filter processing unit 20 and the orthogonal transform unit 21 include a common product-sum operation unit 22. The product-sum operation unit 22 performs a product-sum operation necessary for data processing in the filter processing unit, or performs a product-sum operation necessary for data processing in the orthogonal transform unit.

更に、積和演算部２２は、積和にかかわる係数などに差異があっても、基本的には同等の演算処理を実行するものである。結果として、フィルタ処理部２０、直交変換部２１が回路やＬＳＩで実現されている場合には、回路規模の増大と、チップ面積の増加を招くものである。 Furthermore, the product-sum operation unit 22 basically executes the same operation processing even if there is a difference in the coefficients related to the product-sum. As a result, when the filter processing unit 20 and the orthogonal transformation unit 21 are realized by a circuit or an LSI, an increase in circuit scale and an increase in chip area are caused.

このため、データ処理装置２のように、積和演算部２２は、共通演算部９に備えられることが好ましい。このように共有化されることで、回路規模が削減される。 For this reason, as in the data processing device 2, the product-sum operation unit 22 is preferably provided in the common operation unit 9. By sharing in this way, the circuit scale is reduced.

共通演算部９は、重複していた積和演算部２２に相当する共通積和演算部２４と制御部２３を有している。制御部２３は、共通積和演算部２４を、フィルタ処理部２０と直交変換部２１に対応するように制御する。これにより、フィルタ処理部２０及び直交変換部２１に最適な積和演算を、共通積和演算部２４が実現することができる。 The common arithmetic unit 9 includes a control unit 23 and a common product-sum operation unit 24, which takes the place of the duplicated product-sum operation units 22. The control unit 23 controls the common product-sum operation unit 24 so that it serves both the filter processing unit 20 and the orthogonal transform unit 21. In this way, the common product-sum operation unit 24 can perform the product-sum operation suited to each of the filter processing unit 20 and the orthogonal transform unit 21.

例えば、制御部２３は、積和演算に用いる係数をそれぞれで変えたり、積和順序の組換えを行ったりする。 For example, the control unit 23 changes the coefficient used for the product-sum operation, or recombines the product-sum order.
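The idea of one product-sum routine serving several requesters with different coefficients can be sketched in software as follows. This is only an illustrative sketch: the names `multiply_accumulate`, `shared_mac`, and `COEFFS`, and the coefficient values themselves, are assumptions made for the example and do not appear in the specification.

```python
from typing import Sequence

def multiply_accumulate(data: Sequence[float], coeffs: Sequence[float]) -> float:
    """Shared product-sum: multiply element-wise, then add the products."""
    if len(data) != len(coeffs):
        raise ValueError("data and coeffs must have the same length")
    acc = 0.0
    for x, c in zip(data, coeffs):
        acc += x * c
    return acc

# Hypothetical coefficient sets that a control unit could select per requester.
COEFFS = {
    "filter": [0.25, 0.5, 0.25],    # illustrative 3-tap smoothing filter
    "transform": [1.0, 0.0, -1.0],  # illustrative row of a toy transform
}

def shared_mac(requester: str, data: Sequence[float]) -> float:
    """Run the one shared routine with the coefficients chosen for the requester."""
    return multiply_accumulate(data, COEFFS[requester])
```

Both requesters call the same routine; only the coefficient set chosen on their behalf differs, which mirrors the role the text gives to the control unit 23.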

以上のように、フィルタ処理部２０や直交変換部２１、あるいは動き検出部や画像処理、音声処理などを実行する専用演算部に、重複して含まれていた演算部を、共通演算部９として、信号線１０を介して接続することで、重複回路を削減し、回路規模の削減や実装面積の削減、コストの削減を実現できるものである。 As described above, the arithmetic units that were duplicated across the filter processing unit 20, the orthogonal transform unit 21, and other dedicated arithmetic units (for motion detection, image processing, audio processing, and so on) are consolidated into the common arithmetic unit 9 and connected via the signal line 10. This removes the duplicated circuits and reduces circuit scale, mounting area, and cost.

なお、共通演算部９はハードウェアによる回路で実現されても良く、ＬＳＩの一部として実現されても良く、単体のＬＳＩで個別に実現されても良く、あるいはソフトウェアとして実現されても良いものである。 The common arithmetic unit 9 may be realized as a hardware circuit, as part of an LSI, as a separate standalone LSI, or as software.

回路やＬＳＩなどのハードウェアで実現された場合には、回路規模の削減が実現でき、ソフトウェアで実現された場合には、消費電力の削減やメモリの削減などが実現できる。 When implemented with hardware such as a circuit or LSI, the circuit scale can be reduced, and when implemented with software, power consumption or memory can be reduced.

また、フィルタ処理部２０などは複数でも良いものであり、共通演算部９が複数であってもよい。 Further, a plurality of filter processing units 20 may be provided, and a plurality of common calculation units 9 may be provided.

次に、図５、図６を用いて図４に表されているフィルタ処理部２０、および直交変換部２１の動作と、これに対応する共通演算部９での動作をあわせて説明する。回路規模を削減するために共通積和演算部２４を含む共通演算部９が、適切に動作して処理が実行されることが理解される。 Next, the operations of the filter processing unit 20 and the orthogonal transform unit 21 shown in FIG. 4, together with the corresponding operation of the common arithmetic unit 9, will be described with reference to FIGS. 5 and 6. The description shows that the common arithmetic unit 9, which contains the common product-sum operation unit 24 introduced to reduce the circuit scale, operates properly and executes the processing.

図５、図６はそれぞれ本発明の実施の形態１におけるデータ処理装置の動作フローチャートである。 5 and 6 are operation flowcharts of the data processing apparatus according to Embodiment 1 of the present invention.

まず、図５を用いてフィルタ処理部２０と共通積和演算部２４との動作を説明する。 First, the operations of the filter processing unit 20 and the common product-sum operation unit 24 will be described with reference to FIG.

まず、ステップ１にて、フィルタ処理部２０ならびに直交変換部２１は、起動後の処理に入る。次に、ステップ２にて、データ処理装置２は積和演算ステージに入る。ここで、フィルタ処理部２０は積和演算ステージ（ステップ２）に入ると、ステップ３にて、共通積和演算部２４に対して使用要求を出力する。ステップ３による使用要求が出力されると、ステップ４にて、積和演算状態が継続する。この積和演算状態は、ステップ５にて、共通積和演算部２４が終了通知を出力するまで継続する。 First, in step 1, the filter processing unit 20 and the orthogonal transform unit 21 enter into the process after activation. Next, in step 2, the data processing device 2 enters the product-sum operation stage. Here, when entering the product-sum operation stage (step 2), the filter processing unit 20 outputs a use request to the common product-sum operation unit 24 in step 3. When the use request in Step 3 is output, the product-sum operation state continues in Step 4. This product-sum operation state continues until the common product-sum operation unit 24 outputs an end notification in step 5.

ステップ４の積和演算状態では、フィルタ処理部２０は、共通積和演算部２４に、必要なデータを出力して積和演算を実行させる。具体的にはデータに所定の係数を乗じ、乗算結果を加算するなどの積和処理が実行される。 In the product-sum operation state of step 4, the filter processing unit 20 outputs the necessary data to the common product-sum operation unit 24 and has it execute the product-sum operation. Specifically, product-sum processing is executed, such as multiplying the data by predetermined coefficients and adding the multiplication results.

積和演算が終了すると、ステップ５にて、共通積和演算部２４は終了通知をフィルタ処理部２０に出力する。ステップ６にて、終了通知を受け取ったフィルタ処理部２０は、次の処理へ移行する。これにより、フィルタ処理部２０の所定の処理動作が終了する。 When the product-sum operation is completed, in step 5, the common product-sum operation unit 24 outputs an end notification to the filter processing unit 20. In step 6, the filter processing unit 20 that has received the end notification moves to the next processing. Thereby, the predetermined processing operation of the filter processing unit 20 ends.
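The requester-side flow of FIG. 5 can be read as a small state machine. The sketch below is one possible reading of steps 1 to 6, not a definitive implementation; the `State` names and the `next_state` function are hypothetical and chosen only for illustration.

```python
from enum import Enum, auto

class State(Enum):
    STARTED = auto()          # step 1: processing after start-up
    MAC_STAGE = auto()        # step 2: product-sum stage entered
    REQUESTED = auto()        # step 3: use request issued to the shared unit
    MAC_RUNNING = auto()      # step 4: product-sum state continues
    NEXT_PROCESSING = auto()  # step 6: move on after the end notification

def next_state(state: State, end_notified: bool = False) -> State:
    """Advance the requester by one step of the FIG. 5 flow."""
    if state is State.STARTED:
        return State.MAC_STAGE
    if state is State.MAC_STAGE:
        return State.REQUESTED
    if state is State.REQUESTED:
        return State.MAC_RUNNING
    if state is State.MAC_RUNNING:
        # Step 5: stay in the product-sum state until the end notification arrives.
        return State.NEXT_PROCESSING if end_notified else State.MAC_RUNNING
    return State.NEXT_PROCESSING
```

The only branch in the flow is the wait in step 4, which ends when the shared unit reports completion.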

次に図６を用いて、共通積和演算部２４の動作について説明する。 Next, the operation of the common product-sum operation unit 24 will be described with reference to FIG.

共通演算部９に含まれる制御部２３は、フィルタ処理部２０や直交変換部２１からの、共通積和演算処理部２４に対する使用要求に対する制御を行う。 The control unit 23 included in the common arithmetic unit 9 handles the use requests that the filter processing unit 20 and the orthogonal transform unit 21 issue for the common product-sum operation unit 24.

制御は、ステップ１１によるキューイング処理と、ステップ１２による共通積和演算部２４の処理に分けられる。 Control is divided into a queuing process in step 11 and a process of the common product-sum operation unit 24 in step 12.

ステップ１１によるキューイング処理を説明する。 The queuing process in step 11 will be described.

まず、ステップ１３にて、フィルタ処理部２０もしくは直交変換部２１が使用要求を出力する。次に、ステップ１４にて、フィルタ処理部２０または直交変換部２１からの使用要求を要求キューにキューイングする。このキューイングにより、使用要求信号がストックされる。 First, in step 13, the filter processing unit 20 or the orthogonal transform unit 21 outputs a use request. Next, in step 14, the use request from the filter processing unit 20 or the orthogonal transform unit 21 is queued in the request queue. The use request signal is stocked by this queuing.

次に、ステップ１２による共通積和演算部２４の処理を説明する。 Next, the processing of the common product-sum operation unit 24 in step 12 will be described.

まず、ステップ１５にて、制御部２３は要求キューにキューイングされている使用要求を取り出す。次いで、ステップ１６にて、制御部２３は要求キューに使用要求が存在することを確認する。使用要求が存在する場合には、ステップ１７にて、制御部２３は共通積和演算部を使用状態とする。 First, in step 15, the control unit 23 takes out a use request queued in the request queue. Next, in step 16, the control unit 23 confirms that a use request exists in the request queue. If there is a use request, in step 17, the control unit 23 sets the common product-sum operation unit to the use state.

次に、ステップ１８にて、制御部２３は共通積和演算が終了したことを確認する。ステップ１８での共通積和演算の終了確認により、ステップ１９にて、制御部２３はフィルタ処理部２０または直交変換部２１に対して終了通知を出力する。 Next, in step 18, the control unit 23 confirms that the common product-sum operation has been completed. Upon confirming the end of the common product-sum operation in step 18, the control unit 23 outputs an end notification to the filter processing unit 20 or the orthogonal transform unit 21 in step 19.

なお、ここで、ステップ１１によるキューイング処理と、ステップ１２による共通積和演算部処理は並列して行われる。 Here, the queuing process in step 11 and the common product-sum operation unit process in step 12 are performed in parallel.
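The two parallel activities of FIG. 6, queuing of use requests and processing by the shared unit, can be sketched with a request queue and a worker thread. This is a software analogy under stated assumptions: the function names, the `None` stop sentinel, and the per-request reply queue standing in for the end notification are all inventions of this sketch.

```python
import queue
import threading

def control_unit(requests: queue.Queue) -> None:
    """Steps 15 to 19: take a request, run the shared product-sum, notify the requester."""
    while True:
        item = requests.get()   # step 15: blocks until a request is queued
        if item is None:        # sentinel used only by this sketch to stop the worker
            break
        data, coeffs, reply = item
        result = sum(x * c for x, c in zip(data, coeffs))  # step 17: shared unit in use
        reply.put(result)       # step 19: end notification, carrying the result

def issue_request(requests: queue.Queue, data, coeffs) -> queue.Queue:
    """Steps 13 and 14: a dedicated unit issues a use request, which is queued."""
    reply = queue.Queue(maxsize=1)
    requests.put((data, coeffs, reply))
    return reply

def demo():
    requests = queue.Queue()
    worker = threading.Thread(target=control_unit, args=(requests,))
    worker.start()
    # Two requesters queue their work without waiting for each other.
    filter_reply = issue_request(requests, [4, 8, 4], [0.25, 0.5, 0.25])
    transform_reply = issue_request(requests, [5, 7, 2], [1, 0, -1])
    results = (filter_reply.get(), transform_reply.get())
    requests.put(None)
    worker.join()
    return results
```

Requests are served in arrival order while new requests can still be queued, which corresponds to the statement that the queuing process and the shared-unit process run in parallel.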

以上のように、重複して設けられていた演算部７を、外部に共通演算部９として接続して共有することで、回路規模が削減されると共に、動作上においても、上記に説明したフローで実行されるので、性能の劣化などは生じない。 As described above, the arithmetic units 7 that were provided in duplicate are moved outside the dedicated arithmetic units and shared as the common arithmetic unit 9, which reduces the circuit scale. Because operation follows the flow described above, no degradation in performance occurs.

また、図１などでは回路やＬＳＩなどのハードウェアでの実現の場合について説明したが、一部、もしくは全部がソフトウェアの場合でも同様である。特に、共通演算部９がソフトウェアで実現される場合は、プログラムの圧縮と、これに伴うメモリの削減、消費電力の低減が実現される。 FIG. 1 and the related description cover realization in hardware such as a circuit or an LSI, but the same applies when part or all of the apparatus is software. In particular, when the common arithmetic unit 9 is realized in software, the program becomes smaller, which in turn reduces memory usage and power consumption.

図７を用いて、プロセッサユニットを用いてソフトウェア処理する場合について説明する。図７は、本発明の実施の形態１におけるデータ処理装置のブロック図である。 A case where software processing is performed using a processor unit will be described with reference to FIG. FIG. 7 is a block diagram of the data processing apparatus according to Embodiment 1 of the present invention.

図７には、共通演算がソフトウェアで実現された構成が示されている。 FIG. 7 shows a configuration in which common operations are realized by software.

従来のデータ処理装置では、複数の専用演算部３０は、複数の専用演算部３０に重複する積和演算３１と、個別演算３２、３３を含んでいた。 In the conventional data processing apparatus, the plurality of dedicated calculation units 30 include product-sum calculation 31 and individual calculations 32 and 33 that overlap the plurality of dedicated calculation units 30.

図７に表される本発明のデータ処理装置２は、これらの重複する積和演算３１と、個別演算３２、３３を、プロセッサユニット３４にソフトウェアプログラムとして格納している。さらに、プロセッサユニット３４は信号線１０を介して、複数の専用演算部３０と接続されている。 The data processing apparatus 2 of the present invention shown in FIG. 7 stores these overlapping product-sum operations 31 and individual operations 32 and 33 in the processor unit 34 as a software program. Further, the processor unit 34 is connected to a plurality of dedicated arithmetic units 30 via the signal line 10.

各専用演算部３０で重複していた積和演算３１や個別演算３２、３３を回路ではなく、ソフトウェアプログラムとしてプロセッサユニット３４に設けられることで、回路規模の削減を実現できるものである。さらに、回路として共通化する場合に比べて、係数の変更や処理手順の変更などの事後的な変更にフレキシブルに対応できるメリットもある。 By providing the product-sum operation 31 and the individual operations 32 and 33 that have been duplicated in each dedicated operation unit 30 in the processor unit 34 as a software program instead of a circuit, the circuit scale can be reduced. Furthermore, there is an advantage that it is possible to flexibly cope with a subsequent change such as a change of a coefficient or a change of a processing procedure as compared with a case where the circuit is shared.

また、プロセッサユニット３４で共通化されたプログラムが、もともと各専用演算部３０にて重複していたプログラムである場合には、プログラム規模の削減に加えて、消費電力も削減できるメリットがある。 Further, when the program shared by the processor unit 34 is a program originally duplicated in each dedicated arithmetic unit 30, there is an advantage that the power consumption can be reduced in addition to the reduction of the program scale.

以上より、ハードウェアの場合と同様、プログラム規模の縮小と消費電力の削減、コストの削減などが実現される。 As described above, as in the case of hardware, reduction of the program scale, reduction of power consumption, cost reduction, and the like are realized.

なお、動作については、図５、図６を用いて説明した場合と同様である。 The operation is the same as that described with reference to FIGS.

また、共通演算部９として共有化される回路やプログラムは、積和演算のみならず、乗算器や加算器、あるいは、ある程度まとまった処理を行う演算部などであっても良いものである。 The circuit or program shared as the common arithmetic unit 9 is not limited to a product-sum operation, but may be a multiplier or an adder, or an arithmetic unit that performs a certain amount of processing.

以上の構成により、演算処理上のデメリットを来たさず、回路規模やプログラム規模を、効率的に削減して、ＬＳＩのチップ面積の低減や、回路面積の削減、消費電力の削減が実現される。 With the above configuration, the circuit scale and program scale are reduced efficiently without any drawback in arithmetic processing, which reduces the LSI chip area, the circuit area, and the power consumption.

（実施の形態２） (Embodiment 2)
実施の形態２では、クロック制御が共通演算部９に独立に含まれる場合について説明する。 Embodiment 2 describes a case where the common arithmetic unit 9 has its own independent clock control.

まず、図８を用いて本発明の実施の形態２におけるデータ処理装置の構成について説明する。図８は、本発明の実施の形態２におけるデータ処理装置のブロック図である。 First, the configuration of the data processing apparatus according to the second embodiment of the present invention will be described with reference to FIG. FIG. 8 is a block diagram of the data processing apparatus according to Embodiment 2 of the present invention.

図８に表されるデータ処理装置２では、共通演算部９が専用演算部であるフィルタ処理部２０や直交変換部２１で用いられるクロックとは別のクロックを出力するクロック制御部４０をさらに備えている。 In the data processing apparatus 2 shown in FIG. 8, the common arithmetic unit 9 further includes a clock control unit 40 that outputs a clock separate from the clock used by the dedicated arithmetic units, namely the filter processing unit 20 and the orthogonal transform unit 21.

クロック制御部４０は、専用演算部とは独立したクロック信号を共通演算部９に出力する。制御部２３や共通積和演算部２４は、クロック制御部４０から出力されるクロック信号を用いる。 The clock control unit 40 outputs a clock signal independent of the dedicated calculation unit to the common calculation unit 9. The control unit 23 and the common product-sum operation unit 24 use the clock signal output from the clock control unit 40.

ここで、フィルタ処理部２０と直交変換部２１を並列動作させたい場合がある。このような場合にフィルタ処理部２０と直交変換部２１におけるクロック周波数と共通演算部９におけるクロック周波数が同一であると、並列動作させることができない。共通演算部９は、フィルタ処理部２０と直交変換部２１に共有化されているためである。即ち、回路規模削減のために、共有化されたことで、フィルタ処理部２０などの専用演算部は、各々の内部に共通の演算部を有していないため、各専用演算部は共通の演算を同時動作できず並列動作ができない。 There are cases where the filter processing unit 20 and the orthogonal transform unit 21 should operate in parallel. If the clock frequency of the common arithmetic unit 9 is the same as that of the filter processing unit 20 and the orthogonal transform unit 21, they cannot operate in parallel, because the common arithmetic unit 9 is shared between them. That is, once the common operation has been shared to reduce the circuit scale, dedicated arithmetic units such as the filter processing unit 20 no longer contain that operation internally, so they cannot execute it at the same time and therefore cannot operate in parallel.

一方、フィルタ処理部２０、直交変換部２１の大部分は、メインプロセッサ３を始めとした全体を制御する制御機構とクロックを同一にして、同期を取る必要があるため、これらの処理速度向上には限界がある。 On the other hand, most of the filter processing unit 20 and the orthogonal transform unit 21 must use the same clock as the control mechanism that governs the whole apparatus, including the main processor 3, in order to stay synchronized, so there is a limit to how far their processing speed can be raised.

ここで、クロック制御部４０が、専用演算部など他の部分と別個独立に設けられていることで、共通演算部９の処理速度を個別に制御でき、全体としての処理速度の低下を防止しつつ、回路規模削減を実現できるものである。 Here, since the clock control unit 40 is provided separately and independently from other parts such as a dedicated calculation unit, the processing speed of the common calculation unit 9 can be individually controlled, and a decrease in the overall processing speed is prevented. However, the circuit scale can be reduced.

例えば、クロック制御部４０から出力されるクロック周波数が、専用演算部などでのクロック周波数の倍とする。この場合には、共通演算部９はフィルタ処理部２０などの倍の速度で動作する。並列処理される専用演算部が２つである場合でも、共通に使用される共通演算部９が、専用演算部の倍の速度で動作するので、２つの専用演算部は見た目上並列動作しているのと同じである。 For example, suppose the clock frequency output by the clock control unit 40 is twice the clock frequency of the dedicated arithmetic units. The common arithmetic unit 9 then operates at twice the speed of the filter processing unit 20 and the like. Even when two dedicated arithmetic units are to be processed in parallel, the shared common arithmetic unit 9 runs at twice their speed, so the two dedicated arithmetic units appear to operate in parallel.

図８に表されるフィルタ処理部２０と直交変換部２１が必要とする積和演算を順次処理しても、結果的には従来と同一クロック数で全体の処理が終了する。即ち、演算部を共有化したために生じる、並列処理の困難性をカバーできる。これにより、見た目上並列処理と変わらないスピードで処理が可能となり、性能低下を防止することができる。 Even if the product-sum operations required by the filter processing unit 20 and the orthogonal transform unit 21 shown in FIG. 8 are processed one after the other, the whole processing finishes in the same number of clocks as in the conventional apparatus. In other words, the difficulty with parallel processing that arises from sharing the arithmetic unit is compensated for. Processing therefore proceeds at a speed that appears no different from parallel processing, and a drop in performance is prevented.

なお、クロック制御部４０の出力するクロック周波数は、倍速のみならず、３倍速や４倍速（専用演算部の要求に合わせて）、あるいは他の種類、あるいは選択可能とすることも好適である。 The clock frequency output by the clock control unit 40 is not limited to double speed. It may suitably be triple or quadruple speed (matched to the demands of the dedicated arithmetic units), some other multiple, or selectable.

例えば、専用演算部の個数をN、専用演算部のクロック周波数をFとし、共通演算部のクロック周波数をｆとしたときに、クロック周波数ｆが
ｆ＝Ｎ＊Ｆ
の式で表される周波数であれば、Ｎ個の専用演算部を全て並列処理させるのと同一クロック数で処理できる。 For example, let N be the number of dedicated arithmetic units, F the clock frequency of the dedicated arithmetic units, and f the clock frequency of the common arithmetic unit. If the clock frequency f is given by
f = N × F
then the processing completes in the same number of clocks as when all N dedicated arithmetic units are processed in parallel.
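The clock relation can be checked with a short calculation. The sketch below assumes, for simplicity, that every job needs the same number of shared-unit cycles and that queuing overhead is zero; neither assumption comes from the specification, and the function name is hypothetical.

```python
def shared_unit_reference_cycles(n_units: int, cycles_per_job: int, multiplier: int) -> float:
    """Reference-clock cycles needed to serve n_units jobs one after another
    on a shared unit clocked at multiplier * F (F = reference clock frequency)."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return n_units * cycles_per_job / multiplier
```

With `multiplier` equal to `n_units` (that is, f = N × F), the sequential total equals `cycles_per_job`, the time one job would take if every dedicated unit had its own copy of the circuit and all ran in parallel. With `multiplier` equal to 1, the total grows to `n_units * cycles_per_job`.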

また、共通演算部９が動作しているときのみクロック制御部４０がクロック信号を出力し、非動作時にはクロック信号を未出力とすることで、消費電力の削減などを効果的に行うことができる。クロックの出力、未出力は、例えばゲーテッド・クロックを用いるなどして実現する。 In addition, the clock controller 40 outputs a clock signal only when the common arithmetic unit 9 is operating, and the clock signal is not output when the common arithmetic unit 9 is not operating, so that power consumption can be effectively reduced. . Clock output and non-output are realized by using, for example, a gated clock.

特に、共通演算部９に含まれる積和演算は、シフトレジスタなどのクロックを用いる順序回路を多く含むため、非動作時にクロック出力を停止することは、消費電力の削減において高い効果を有する。消費電力はクロック周波数に比例して増加するからである。 In particular, the product-sum operation included in the common arithmetic unit 9 includes many sequential circuits using a clock such as a shift register. Therefore, stopping the clock output when not operating is highly effective in reducing power consumption. This is because the power consumption increases in proportion to the clock frequency.
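The effect of stopping the clock while the shared unit is idle can be illustrated by counting the clock cycles that actually reach the unit. This is a rough sketch: it assumes that power spent in clocked circuits scales with the number of delivered clock cycles, and the activity pattern and function name are made up for the example.

```python
from typing import Sequence

def delivered_cycles(activity: Sequence[bool], gated: bool) -> int:
    """Count clock cycles delivered to the shared unit.

    activity[i] is True when the unit is busy in reference cycle i.
    With gating, the clock is delivered only in busy cycles;
    without gating, it is delivered in every cycle.
    """
    return sum(1 for busy in activity if busy or not gated)
```

For an activity pattern in which the unit is busy in two of four cycles, gating halves the number of delivered cycles.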

次に、図９を用いて共通演算部９の動作を説明する。 Next, the operation of the common arithmetic unit 9 will be described with reference to FIG.

図９は、本発明の実施の形態２における共通演算部の動作フローチャートである。共通演算部９に含まれる制御部２３を動作の基準として説明する。また、ここでは図８に表されるようにフィルタ処理部２０と直交変換部２１を備えたデータ処理装置２として説明する。 FIG. 9 is an operation flowchart of the common arithmetic unit in the second embodiment of the present invention. The control unit 23 included in the common calculation unit 9 will be described as an operation reference. In addition, here, the data processing apparatus 2 including the filter processing unit 20 and the orthogonal transform unit 21 as illustrated in FIG. 8 will be described.

また、メインプロセッサ３、フィルタ処理部２０、直交変換部２１に供給されているクロックの動作周波数は等しいものとし、以降の説明における基準クロックとする。 Further, it is assumed that the operating frequencies of the clocks supplied to the main processor 3, the filter processing unit 20, and the orthogonal transform unit 21 are the same, and are used as reference clocks in the following description.

制御部２３は、ステップ２１によるフィルタ処理部２０などからの共通演算部使用の要求信号のキューイング処理と、ステップ２２による共通積和演算部２４の動作処理を行う。 The control unit 23 performs queuing processing of a request signal for use of the common arithmetic unit from the filter processing unit 20 or the like in step 21 and operation processing of the common product-sum arithmetic unit 24 in step 22.

まず、ステップ２１による使用要求のキューイング処理について説明する。 First, the use request queuing process in step 21 will be described.

まず、ステップ１３にて、制御部２３はフィルタ処理部２０、または直交変換部２１が出力した使用要求を確認する。次いで、ステップ１４にて、制御部２３は確認された使用要求を要求キューにキューイングする。 First, in step 13, the control unit 23 confirms the use request output from the filter processing unit 20 or the orthogonal transform unit 21. Next, in step 14, the control unit 23 queues the confirmed use request in the request queue.

次に、ステップ２３にて、制御部２３はキューイングと同時にタイムカウントを開始する。タイムカウントは、フィルタ処理部２０、直交変換部２１のそれぞれの使用要求毎にカウントする。これにより、異なる専用演算部が出力した使用要求信号が、適切にキューイングされる。 Next, in step 23, the control unit 23 starts time counting simultaneously with queuing. The time count is counted for each use request of the filter processing unit 20 and the orthogonal transform unit 21. As a result, use request signals output from different dedicated arithmetic units are appropriately queued.

次に、ステップ２２における共通積和演算部２４の動作処理について説明する。 Next, the operation process of the common product-sum operation unit 24 in step 22 will be described.

まず、ステップ１５にて、制御部２３は要求キューから要求信号が取り出す。次いで、ステップ１６にて、制御部２３は要求信号の有無を確認する。ステップ２４にて、要求キューに要求信号があれば、制御部２３は要求信号に附随するタイムカウント値を調べた上で、タイムカウント値を閾値ｔｈと比較する。 First, in step 15, the control unit 23 extracts a request signal from the request queue. Next, in step 16, the control unit 23 confirms the presence or absence of a request signal. If there is a request signal in the request queue at step 24, the control unit 23 checks the time count value associated with the request signal and compares the time count value with the threshold th.

ステップ２５にて、タイムカウント値が閾値ｔｈ以上では、クロック制御部４０は基準クロックの倍速のクロックを出力する。一方、ステップ２６にて、タイムカウント値が閾値ｔｈ未満では、クロック制御部４０は基準クロックと同速のクロックを出力する。即ち、タイムカウント値は、同時動作する専用演算部の数を表しているので、これに対応して共通演算部９の動作速度を増減させるものである。 In step 25, when the time count value is equal to or greater than the threshold th, the clock control unit 40 outputs a clock that is double the reference clock. On the other hand, if the time count value is less than the threshold th in step 26, the clock control unit 40 outputs a clock having the same speed as the reference clock. That is, the time count value represents the number of dedicated operation units operating simultaneously, and accordingly, the operation speed of the common operation unit 9 is increased or decreased.

次いで、ステップ１７にて、共通積和演算部２４が動作状態となり、積和演算が実行される。更に、ステップ１８にて、制御部２３は積和演算の終了を確認する。次に、ステップ１９にて、制御部２３は、積和演算の終了通知を専用演算部であるフィルタ処理部２０、もしくは直交変換部２１、およびクロック制御部４０に出力する。 Next, in step 17, the common product-sum operation unit 24 enters the operating state and the product-sum operation is executed. In step 18, the control unit 23 confirms that the product-sum operation has ended. Then, in step 19, the control unit 23 outputs a product-sum operation end notification to the requesting dedicated arithmetic unit (the filter processing unit 20 or the orthogonal transform unit 21) and to the clock control unit 40.

さらに、ステップ２７にて、終了通知をうけたクロック制御部４０はクロック信号の供給を停止する。これにより、動作不要の期間に於いては共通演算部９の不要な電力消費が削減される。 Furthermore, in step 27, the clock control unit 40 that has received the end notification stops the supply of the clock signal. As a result, unnecessary power consumption of the common arithmetic unit 9 is reduced during a period where no operation is required.

なお、ステップ２１である使用要求に対するキューイング処理と、ステップ２２である共通積和演算部２４の動作処理（２２）は並列に実行される。 Note that the queuing process for the use request in step 21 and the operation process (22) of the common product-sum operation unit 24 in step 22 are executed in parallel.
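The clock selection of FIG. 9 can be sketched as follows. The sketch assumes that the time count is measured in reference-clock cycles and that a queued request keeps counting while an earlier job runs; the `Request` class, `clock_multiplier`, and `serve` are hypothetical names used only here.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    requester: str
    wait_count: int = 0  # time count started when the request is queued (step 23)

def clock_multiplier(wait_count: int, th: int) -> int:
    """Steps 24 to 26: double speed at or above the threshold, otherwise reference speed."""
    return 2 if wait_count >= th else 1

def serve(pending: deque, th: int, job_cycles: int) -> list:
    """Serve queued requests in order and return (requester, multiplier) for each job."""
    served = []
    while pending:
        req = pending.popleft()                      # step 15: take a request
        mult = clock_multiplier(req.wait_count, th)  # steps 24 to 26: choose the clock
        served.append((req.requester, mult))         # step 17: shared unit runs the job
        elapsed = job_cycles // mult                 # reference cycles this job occupies
        for waiting in pending:                      # requests still queued keep counting
            waiting.wait_count += elapsed
        # Step 27: the clock would be stopped here until the next job starts.
    return served
```

A request that has waited behind another job crosses the threshold and is served at double speed, while a request that arrives at an idle unit runs at the reference speed.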

以上のような処理により、共通演算となる回路やプログラムの共有化による処理速度の低下を防止できる。また、共通演算部９での動作が不要な場合にクロック信号を停止することで、消費電力の削減も可能となるものである。 Through the processing as described above, it is possible to prevent a reduction in processing speed due to sharing of circuits and programs that are common operations. Further, the power consumption can be reduced by stopping the clock signal when the operation in the common arithmetic unit 9 is unnecessary.

（実施の形態３） (Embodiment 3)
実施の形態３では、更にアプリケーションの複雑化に伴い、専用演算部が複雑化した場合について説明する。 Embodiment 3 describes a case where the dedicated arithmetic units become more complex as applications grow more complex.

まず、図１０（ａ）、図１０（ｂ）を用いて、本発明の実施の形態３におけるデータ処理装置の構成について、従来からの変化を含めて説明する。図１０（ａ）は、従来のデータ処理装置のブロック図であり、図１０（ｂ）は、本発明の実施の形態３におけるデータ処理装置のブロック図である。ここでは、本発明のデータ処理装置のメリットを説明するための対比として、従来のデータ処理装置５０を表している。 First, the configuration of the data processing apparatus according to Embodiment 3 of the present invention, including the changes from the conventional configuration, will be described with reference to FIGS. 10(a) and 10(b). FIG. 10(a) is a block diagram of a conventional data processing apparatus, and FIG. 10(b) is a block diagram of the data processing apparatus in Embodiment 3 of the present invention. The conventional data processing apparatus 50 is shown as a comparison to explain the merits of the data processing apparatus of the present invention.

従来のデータ処理装置５０は、動き検出部５２、５３、５４（図１０中では「ＭＥ」と示す）、絶対誤差合計（図１０中では「ＳＡＤ」と示す）演算器５５、フィルタ処理部２０、直交変換部２１、積和演算部２２を備える。ここで、絶対誤差合計演算器５５は動き検出部５２、５３、５４に重複して含まれ、積和演算部２２はフィルタ処理部２０と直交変換部２１に重複して含まれている。即ち、共通の動作を行うブロックが重複して設けられ、回路規模が増加しているものである。 The conventional data processing apparatus 50 includes motion detection units 52, 53, and 54 (labeled “ME” in FIG. 10), absolute error sum calculators 55 (labeled “SAD” in FIG. 10), a filter processing unit 20, an orthogonal transform unit 21, and product-sum operation units 22. An absolute error sum calculator 55 is included in each of the motion detection units 52, 53, and 54, and a product-sum operation unit 22 is included in both the filter processing unit 20 and the orthogonal transform unit 21. In other words, blocks that perform the same operation are provided in duplicate, which increases the circuit scale.

一方、実施の形態３におけるデータ処理装置５１は、動き検出部５２、５３、５４、フィルタ処理部２０、直交変換部２１、およびこれらと信号線１０を介して接続されるサブプロセッサ６０を備えている。 In contrast, the data processing apparatus 51 of Embodiment 3 includes motion detection units 52, 53, and 54, a filter processing unit 20, an orthogonal transform unit 21, and a sub-processor 60 connected to them via the signal line 10.

さらにサブプロセッサ６０は、エンジンインターフェース５６（図１０中では「エンジンＩ／Ｆ」と表す）、プロセッサユニット５７（図１０中では「ＰＵ」と示す）、絶対誤差合計演算プログラム（図１０中では「ＳＡＤ演算プログラム」と表す）、積和演算プログラム５９を含んでいる。 The sub-processor 60 in turn includes an engine interface 56 (labeled “engine I/F” in FIG. 10), a processor unit 57 (labeled “PU” in FIG. 10), an absolute error sum calculation program 58 (labeled “SAD calculation program” in FIG. 10), and a product-sum operation program 59.

このデータ処理装置５０、５１は例えばＭＰＥＧ２とＭＰＥＧ４とＪＰＥＧの全てを一つの処理装置やＬＳＩで実現する場合などに用いられるものである。 The data processing devices 50 and 51 are used when, for example, all of MPEG2, MPEG4 and JPEG are realized by a single processing device or LSI.

動き検出部５２、５３、５４は、ＭＰＥＧ２などの画像圧縮においてその動きベクトルを検出する。絶対誤差合計演算器５５は、動きベクトル検出での絶対誤差を演算する。積和演算部２２は、フィルタ処理などで必要となる加算や乗算などの積和演算を実行する。 The motion detectors 52, 53, and 54 detect the motion vector in image compression such as MPEG2. The absolute error sum calculator 55 calculates an absolute error in motion vector detection. The product-sum operation unit 22 performs product-sum operations such as addition and multiplication necessary for filter processing and the like.
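The absolute error sum used in motion vector detection is commonly computed as a sum of absolute differences (SAD) between a current block and a candidate reference block. The sketch below shows that common formulation under the assumption that it matches what the SAD calculator 55 computes; the function names and the candidate layout are hypothetical.

```python
def sad(block_a, block_b) -> int:
    """Sum of absolute differences between two equally sized 2-D blocks."""
    if len(block_a) != len(block_b) or any(
        len(ra) != len(rb) for ra, rb in zip(block_a, block_b)
    ):
        raise ValueError("blocks must have the same shape")
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )

def best_vector(current, candidates):
    """Pick the candidate motion vector whose reference block has the lowest SAD.

    candidates maps a motion vector (dx, dy) to the reference block at that offset.
    """
    return min(candidates, key=lambda v: sad(current, candidates[v]))
```

Because every motion detection unit needs this same calculation, it is a natural candidate for sharing.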

なお、図１０（ａ）、図１０（ｂ）では、動き検出部５２、５３、５４を３つと、一つのフィルタ処理部２０、一つの直交変換部２１が表されているが、この構成に限られるものではなく、その他のブロックが含まれても良い。 In FIGS. 10A and 10B, three motion detection units 52, 53, and 54, one filter processing unit 20, and one orthogonal transform unit 21 are shown. It is not limited, and other blocks may be included.

従来技術のデータ処理装置５０では、絶対誤差合計演算器５５が３つ重複して設けられている。更に、積和演算部２２も２つ重複して設けられている。絶対誤差合計演算器５５と積和演算部２２はそれぞれ同一の演算処理を行うものであり、重複していることで回路規模が不要に増大している。 In the conventional data processing apparatus 50, three absolute error sum calculators 55 are provided in duplicate. Furthermore, two product-sum operation units 22 are also provided. The absolute error total calculator 55 and the product-sum calculator 22 perform the same calculation process, and the circuit scale is unnecessarily increased due to the overlap.

このため、本発明のデータ処理装置５１は、重複している絶対誤差合計演算器５５と積和演算部２２を抽出し、まとめることで回路規模の削減を実現している。 For this reason, the data processing apparatus 51 of the present invention reduces the circuit scale by extracting the duplicated absolute error sum calculators 55 and product-sum operation units 22 and consolidating them.

このとき、絶対誤差合計演算器５５と積和演算部２２は、それぞれプログラムで実現され、サブプロセッサ６０に実装されている。サブプロセッサ６０がバス１０を介して、動き検出部５２、５３、５４などの専用演算部と接続されることで、これらのプログラムを共有できる。さらに、共有される演算処理が、ソフトウェアで実装されることで、事後的な変更や各種係数など柔軟な変更に対応できるメリットがある。ソフトウェアは、共有される演算処理を専用の電子回路で実装した場合に比べて柔軟性が高い。勿論、回路により実装されてもよい。 Here, the absolute error sum calculator 55 and the product-sum operation unit 22 are each realized as a program and implemented on the sub-processor 60. Because the sub-processor 60 is connected via the bus 10 to dedicated arithmetic units such as the motion detection units 52, 53, and 54, these programs can be shared. Implementing the shared arithmetic processing in software also has the advantage that later changes, such as changes to coefficients, can be accommodated flexibly. Software is more flexible than a dedicated electronic circuit implementing the same shared processing. The shared processing may, of course, also be implemented as a circuit.

サブプロセッサ６０は、処理の切り替えやバス１０の制御を行うエンジンインターフェース５６とプロセッサユニット５７を搭載している。プロセッサユニット５７は、絶対誤差合計演算プログラム５８と積和演算プログラム５９を搭載している。 The sub-processor 60 includes a processor unit 57 and an engine interface 56, which switches between processes and controls the bus 10. The processor unit 57 holds the absolute error sum calculation program 58 and the product-sum operation program 59.

このような構成により、重複して設けられていた絶対誤差合計演算器５５や積和演算部２２がまとめられ、回路規模を削減できるものである。更に、サブプロセッサ６０への実装で、柔軟性も高くなるものである。 With such a configuration, the absolute error total computing unit 55 and the product-sum computing unit 22 that are provided in an overlapping manner are collected, and the circuit scale can be reduced. Furthermore, mounting on the sub-processor 60 increases flexibility.

次に、図１１、図１２を用いてサブプロセッサ６０での動作処理について説明する。図１１は本発明の実施の形態３におけるサブプロセッサ６０の動作フローチャート、図１２は、本発明の実施の形態３における割込み処理のフローチャートである。 Next, operation processing in the sub processor 60 will be described with reference to FIGS. FIG. 11 is an operation flowchart of the sub-processor 60 according to the third embodiment of the present invention, and FIG. 12 is a flowchart of interrupt processing according to the third embodiment of the present invention.

エンジンインターフェース５６は、ステップ１１による使用要求のキューイング処理と、ステップ３１によるプロセッサユニット起動処理とを実行する。 The engine interface 56 executes a use request queuing process in step 11 and a processor unit activation process in step 31.

まず、ステップ１１による使用要求のキューイング処理について説明する。 First, the use request queuing process in step 11 will be described.

最初にステップ１３にて、エンジンインターフェース５６は、動き検出部などが出力した使用要求信号の有無を検出する。次いで、ステップ１４にて、検出した使用要求信号を要求キューにキューイングする（即ちストックする）。 First, at step 13, the engine interface 56 detects the presence / absence of a use request signal output by the motion detector or the like. Next, in step 14, the detected use request signal is queued (that is, stocked) in the request queue.

次に、ステップ３１によるプロセッサユニット起動処理について説明する。 Next, the processor unit activation process in step 31 will be described.

最初に、ステップ１５にて、エンジンインターフェース５６は要求キューの要求信号の取り出しを行う。次いで、ステップ１６にて、要求信号の有無を確認する。要求信号がある場合には、ステップ３２にて、エンジンインターフェースはプロセッサユニット５７に割り込み信号を出力して、プロセッサユニットを動作状態にする。 First, in step 15, the engine interface 56 extracts a request signal from the request queue. Next, at step 16, the presence or absence of a request signal is confirmed. If there is a request signal, in step 32, the engine interface outputs an interrupt signal to the processor unit 57 to put the processor unit into an operating state.

次いで、ステップ１７にて、プロセッサユニット５７に含まれるプログラムが動作状態とされる。動作状態となることで、積和演算などのプログラムが実行される。プログラムによる結果は、動き検出部などに出力される。 Next, in step 17, the program included in the processor unit 57 is put into an operating state. By entering the operating state, a program such as a product-sum operation is executed. The result of the program is output to a motion detection unit or the like.

次いで、ステップ１８にて、プログラム動作の終了が確認される。プログラム動作の終了が確認されれば、ステップ１９にて、プロセッサユニット５７は、エンジンインターフェース５６に対して終了通知信号を出力する。これにより、プロセッサユニット５７でのプログラム動作が終了する。さらに、最初の状態に戻り、必要に応じて次の動作処理が開始される。 Next, in step 18, the end of the program operation is confirmed. Once it is confirmed, the processor unit 57 outputs an end notification signal to the engine interface 56 in step 19. This completes the program operation in the processor unit 57. The flow then returns to the initial state, and the next operation is started as needed.

次に図１２を用いて、割り込み処理について説明する。 Next, interrupt processing will be described with reference to FIG.

まず、ステップ４１にて、プロセッサユニット５７の動作が終了している間は、プロセッサユニット５７は、割り込み待ち状態で待機している。 First, in step 41, while the processor unit 57 has no operation in progress, it waits in an interrupt wait state.

次に、ステップ４２にて、プロセッサユニット５７は、エンジンインターフェース５６から割り込み要求が発生した場合、割り込みハンドラへ移行する。 Next, in step 42, when an interrupt request is generated from the engine interface 56, the processor unit 57 shifts to an interrupt handler.

Next, in step 43, the interrupt handler first identifies the request source. Depending on the source identified, it executes either the absolute error sum calculation program in step 44 or the product-sum operation program in step 45.

Next, in step 46, once the calculation has finished, the program sets a calculation end flag and exits the interrupt handler. The processor unit 57 then returns to the interrupt-wait state.
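The interrupt flow of steps 41 to 46 amounts to a dispatch on the request source. The following is a minimal sketch under stated assumptions: the source names `motion_detection` and `filter`, and their mapping to the two programs, are hypothetical choices for illustration, since the text does not say which requester selects which program.

```python
def absolute_error_sum(a, b):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def product_sum(a, b):
    """Product-sum (multiply-accumulate) over two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical mapping from request source to program (an assumption for this sketch).
PROGRAMS = {
    "motion_detection": absolute_error_sum,  # step 44
    "filter": product_sum,                   # step 45
}


class ProcessorUnit:
    """Hypothetical model of processor unit 57 and its interrupt handler."""

    def __init__(self):
        self.state = "waiting"   # step 41: interrupt-wait state
        self.end_flag = False
        self.result = None

    def on_interrupt(self, source, a, b):
        self.state = "handler"       # step 42: enter the interrupt handler
        self.end_flag = False
        program = PROGRAMS[source]   # step 43: identify the request source
        self.result = program(a, b)  # step 44 or step 45
        self.end_flag = True         # step 46: set the calculation end flag
        self.state = "waiting"       # back to the interrupt-wait state
        return self.result


pu = ProcessorUnit()
sad = pu.on_interrupt("motion_detection", [10, 20, 30], [12, 18, 33])
mac = pu.on_interrupt("filter", [1, 2, 3], [4, 5, 6])
print(sad, mac)
```

Adding another operation in this model means adding one entry to the dispatch table, which reflects the flexibility the description attributes to a program-based implementation.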

As described above, by using the engine interface 56 and having the interrupt handler identify the request source and execute the required calculation, a wide variety of arithmetic processing can be realized with a single sub-processor 60.

Furthermore, consolidating the previously duplicated arithmetic units into the sub-processor 60 as programs reduces the circuit scale, the LSI chip area, and the mounting area.

Implementing the operations as programs also has the advantage of greater flexibility.

If processing other than the queuing process and the processor unit activation process is required, the engine interface 56 need only perform further switching.

Further, as shown in FIG. 13, having the clock control unit 40 control the clock of the sub-processor 60 independently also prevents a drop in processing speed and reduces power consumption. FIG. 13 is a block diagram of a data processing apparatus according to Embodiment 3 of the present invention.

As described in Embodiment 1, because the sub-processor 60 has its own independent clock, the overall processing speed is not reduced.
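The claims of this document state two properties of the independent clock: the clock control unit outputs no clock while the common arithmetic unit is not operating, and the common unit's clock frequency f is related to the dedicated units' frequency F by f = N * F, where N is the number of dedicated units. A small worked sketch follows; the function name `common_clock_hz` and the example values (N = 3, F = 27 MHz) are hypothetical and chosen only for illustration.

```python
def common_clock_hz(n_dedicated_units, dedicated_clock_hz, operating):
    """Clock frequency supplied to the common arithmetic unit.

    Returns 0 when the unit is not operating (clock stopped),
    otherwise f = N * F as stated in the claims.
    """
    if not operating:
        return 0
    return n_dedicated_units * dedicated_clock_hz


# Hypothetical example values: three dedicated units clocked at 27 MHz.
active_hz = common_clock_hz(3, 27_000_000, operating=True)
idle_hz = common_clock_hz(3, 27_000_000, operating=False)
print(active_hz, idle_hz)
```

One plausible reading, offered here as an interpretation rather than a statement from the source, is that running the shared unit N times faster lets it serve each of the N dedicated units within one of their clock periods, while stopping the clock when idle saves power.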

The data processing apparatus according to the present invention is suitable for use in, for example, technical fields that require support for a variety of applications with a reduced circuit scale.

FIG. 1: (a) Block diagram of a conventional data processing apparatus; (b) block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 2: Block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 3: (a) Internal block diagram of the common arithmetic unit according to Embodiment 1 of the present invention; (b) internal block diagram of the common arithmetic unit according to Embodiment 1 of the present invention
FIG. 4: (a) Block diagram of a conventional data processing apparatus; (b) block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 5: Operation flowchart of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 6: Operation flowchart of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 7: Block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 8: Block diagram of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 9: Operation flowchart of the common arithmetic unit according to Embodiment 2 of the present invention
FIG. 10: (a) Block diagram of a conventional data processing apparatus; (b) block diagram of the data processing apparatus according to Embodiment 3 of the present invention
FIG. 11: Operation flowchart of the sub-processor 60 according to Embodiment 3 of the present invention
FIG. 12: Flowchart of interrupt processing according to Embodiment 3 of the present invention
FIG. 13: Block diagram of the data processing apparatus according to Embodiment 3 of the present invention
FIG. 14: Block diagram of a conventional data processing apparatus

Explanation of symbols

1, 2, 50, 51 Data processing apparatus
3 Main processor
4, 5, 6, 12 Dedicated arithmetic unit
7 Arithmetic unit
8 Signal line
9, 13 Common arithmetic unit
10 Signal line
15 Common product-sum operation unit
16 Individual arithmetic unit
17 Selection unit
18 Common product-sum operation control unit
20 Filter processing unit
21 Orthogonal transformation unit
22 Product-sum operation unit
23 Control unit
24 Common product-sum operation unit
30 Dedicated arithmetic unit
31 Common product-sum operation program
32, 33 Individual operation program
34 Processor unit
40 Clock control unit
52, 53, 54 Motion detection unit
55 Absolute error sum calculator
56 Engine interface
57 Processor unit
58 Absolute error sum calculation program
59 Product-sum operation program
100 Data processing apparatus
101 Main processor
102 Arithmetic processing unit

Claims

1. A data processing apparatus comprising:
a plurality of dedicated arithmetic units that perform predetermined calculations;
a signal line connected to the plurality of dedicated arithmetic units; and
a common arithmetic unit that is connected to the plurality of dedicated arithmetic units via the signal line and performs common arithmetic processing,
wherein the common arithmetic unit is used in common by at least two of the plurality of dedicated arithmetic units.

2. The data processing apparatus according to claim 1, wherein the common arithmetic unit includes at least one common product-sum operation unit that performs a product-sum operation used in common by at least two of the plurality of dedicated arithmetic units.

3. The data processing apparatus according to claim 2, wherein the common arithmetic unit further includes at least one individual arithmetic unit that performs a calculation not used in common by the plurality of dedicated arithmetic units.

4. The data processing apparatus according to claim 1, wherein each of the plurality of dedicated arithmetic units uses a calculation result obtained by combining the common product-sum operation unit and the individual arithmetic unit.

5. The data processing apparatus according to claim 3, wherein the common arithmetic unit includes a plurality of the individual arithmetic units and further includes a selection unit that selects, from the plurality of individual arithmetic units, a predetermined individual arithmetic unit corresponding to a request from the dedicated arithmetic unit.

6. The data processing apparatus according to claim 1, wherein a plurality of the common arithmetic units are provided, the number of which is less than the number of the plurality of dedicated arithmetic units.

7. The data processing apparatus according to claim 1, wherein the plurality of dedicated arithmetic units are at least one of a filter processing unit, an orthogonal transformation processing unit, a motion detection unit, and a motion compensation unit.

8. The data processing apparatus according to claim 1, wherein the common arithmetic unit further includes a clock control unit that outputs a clock independent of the dedicated arithmetic units.

9. The data processing apparatus according to claim 8, wherein the clock control unit outputs no clock signal when the common arithmetic unit is not operating.

10. The data processing apparatus according to claim 8, wherein, when the number of the dedicated arithmetic units is N, the clock frequency of the dedicated arithmetic units is F, and the clock frequency of the common arithmetic unit is f, the clock frequency f is defined by f = N * F.

11. The data processing apparatus according to claim 1, wherein the common arithmetic unit includes a processor unit that executes program processing.

12. The data processing apparatus according to claim 11, wherein the processor unit includes a common product-sum operation program that executes a product-sum operation used in common by at least two of the plurality of dedicated arithmetic units.