JP2004326215A

JP2004326215A - Signal processor and signal processing method

Info

Publication number: JP2004326215A
Application number: JP2003116497A
Authority: JP
Inventors: Shinichiro Yamashita; 真一郎山下; Katsuaki Moriwake; 且明守分; Kenichi Kawamura; 研一河村
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-04-22
Filing date: 2003-04-22
Publication date: 2004-11-18

Abstract

PROBLEM TO BE SOLVED: To prevent the processing of video data from failing even when the functions of a processor are enhanced or the number of input channels is increased.

SOLUTION: Data generated by a CPU 6 are supplied through a bus 8 to a decoder 16 and written into a FIFO 17. In the same way, data generated by a CPU 10 are supplied through a bus 12 to a decoder 19 and written into a FIFO 20. A bus switch 15 alternately switches between the FIFOs 17 and 20 in synchronization with the clock of the bus 8, reads the data of the selected side compressed in the time direction, and supplies the data to a bus 2. The data supplied to the bus 2 are set in the register of a processor 3. Thus, the load on the CPUs can be distributed without speeding up the CPU clock or changing an interface. In addition, the setting of data in the register of the processor 3 can be completed within one frame period even for processing that places a much higher load on the CPU.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
この発明は、複数のＣＰＵを用いて映像信号を処理する信号処理装置および信号処理方法に関する。
【０００２】
【従来の技術】
放送局やプロダクションは、放送用の映像等を編集するために、業務用画像エフェクタを備えた編集室を設けていることが多い。業務用画像エフェクタは、テレビカメラで得られるライブ映像や、既に記録されている映像等に対して、画像の拡大、縮小処理といった特殊効果処理を施すなどの編集、加工を行う機器である。
【０００３】
放送局等では、この業務用画像エフェクタによって編集、加工された映像に他の映像を組み合わせ、更にＣＧ（ＣｏｍｐｕｔｅｒＧｒａｐｈｉｃｓ）画像やサウンドを付加するなどして放送用の映像を作成する。例えば、テレビカメラから得られた映像を拡大または縮小したものを他の映像にはめ込み、更にその映像にテロップを挿入することによって、放送用の映像を作成する。
【０００４】
こうした映像処理は、例えば、ライブ中継で複数のテレビカメラからの映像を同時に一画面に出力するように編集して、一度に多くの情報を伝えると共に、現場の臨場感を持たせることができる。また、映像に様々な特殊効果処理を施し、更にＣＧ映像等を挿入することによって、番組の表現にバリエーションを持たせることも可能である。
【０００５】
上記業務用画像エフェクタとして使用されるものにディジタルマルチエフェクトシステムがある。このシステムは、ディジタル映像信号について、特殊効果処理を施す等の編集、加工を行うものであり、複数の入力映像について並列的かつリアルタイムに処理することができる。
【０００６】
また、ディジタルマルチエフェクトシステムは、映像信号経路の切り換えを行うスイッチャと組み合わされ、ディジタルマルチエフェクトスイッチャとして利用されることもある。
【０００７】
ディジタルマルチエフェクトシステムは、複数のモニタなどの表示装置と、キーボード、スイッチ、ジョイスティックなどの入力装置を備え、ユーザの入力装置に対する操作に応じて、テレビカメラなどから送信されてくる入力映像や、ＶＴＲ（ＶｉｄｅｏＴａｐｅＲｅｃｏｒｄｅｒ）などで再生された映像素材といった、複数の映像信号を用いて編集処理を行う。
【０００８】
ディジタルマルチエフェクトシステムでは、この編集作業の際に、例えば画像の拡大、縮小処理といった特殊効果処理を、映像信号に対して施すことができる。この特殊効果処理は、例えばビデオプロセッサを用いて行われる。一例として、ユーザの入力装置に対する操作に応じてＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）から制御信号が出力され、この制御信号によりビデオプロセッサが制御され、入力された映像信号に対して所定の処理がなされる。ビデオプロセッサは、例えばＤＳＰ（ＤｉｇｉｔａｌＳｉｇｎａｌＰｒｏｃｅｓｓｏｒ）からなり、レジスタに係数を設定することで、入力された信号に対して所望の処理を施すようにされている。
【０００９】
図９は、このような、入力された映像信号に対してビデオプロセッサを用いて特殊効果などを施すようにされた、従来の技術によるビデオ処理部１２０の一例の構成を示す。このビデオ処理部１２０は、例えば上述のディジタルマルチエフェクトシステムに組み込まれて用いられる。
【００１０】
ビデオ処理部１２０は、ＲＡＭ／ＲＯＭ２００、ＣＰＵモジュール２１０、Ｉ／Ｏポート２２０およびビデオプロセッサ２３０を有し、それぞれがローカルバス２４０に接続され、互いにデータのやりとりなどが可能なようにされている。ＲＡＭ／ＲＯＭ２００は、ＣＰＵモジュール２１０によって使用されるプログラム等を記憶し、また、必要に応じて実行時に使用されるデータを一時的に記憶する。
【００１１】
また、Ｉ／Ｏポート２２０は、外部との制御データのやりとりを行う。例えばこのビデオ処理部１２０が組み込まれるディジタルマルチエフェクトシステムの全体を制御するメインＣＰＵから、ユーザの入力装置に対する操作に応じて生成された制御データが、Ｉ／Ｏポート２２０を介して入力され、ＣＰＵモジュール２１０に供給される。
【００１２】
一方、処理されるべきビデオデータは、ビデオプロセッサ２３０に入力される。ビデオプロセッサ２３０は、レジスタに設定されたフィルタ係数に応じた処理を、入力されたビデオデータに対して施す。例えば、上述のＩ／Ｏポート２２０を介して供給された制御データに基づきＣＰＵモジュール２１０でフィルタ係数が生成され、ビデオプロセッサ２３０に設定される。ビデオプロセッサ２３０に入力されたビデオデータは、ビデオプロセッサ２３０で、設定されたフィルタ係数に応じた処理を施され、出力される。
【００１３】
すなわち、ビデオプロセッサ２３０では、ＣＰＵモジュール２１０によるレジスタ設定によってビデオデータの出力が変化する。この処理をフレーム周期内で実行することで、ビデオプロセッサ２３０による画像処理が実現される。
【００１４】
ここで、ビデオ処理部１２０は、複数チャンネルのビデオデータに対して並列的に処理を行うことが可能とされる。例えば、ビデオ処理部１２０が４チャンネルのビデオデータ入力に対応可能であるとすると、並列的に入力された４チャンネルのビデオデータは、４チャンネル分のフレームが１フレームの期間内に収まるようにそれぞれ時間方向に圧縮され、１チャンネルのデータに時分割多重されてビデオプロセッサ２３０に供給される。ＣＰＵモジュール２１０は、１フレームの期間内に時分割多重された４チャンネルのビデオデータそれぞれに対するフィルタ係数を、それぞれのチャンネルに同期するように切り換えながら、ビデオプロセッサ２３０のレジスタに設定する。ビデオプロセッサ２３０では、供給されたビデオデータに対してレジスタに設定されたフィルタ係数に応じた処理を施し、出力する。出力されたビデオデータは、時分割多重を解かれ、再び元の４チャンネルのビデオデータに組み立て直されて、チャンネル毎個別に出力される。
【００１５】
ここで、ビデオプロセッサ２３０から出力されたビデオデータが正常に再生できるためには、１フレーム分の画像に関する指定された処理が１フレーム周期内に完了しなければならない。この処理が１フレーム周期内に完了しない場合は、映像の一部の欠落や映像の乱れが生じることになる。例えば、処理するビデオデータのチャンネル数が増加したり、ビデオプロセッサ２３０の機能が向上し、より複雑な画像処理が可能となってくると、ビデオプロセッサ２３０内のレジスタ数が単純に増加することになり、ＣＰＵモジュール２１０によるフィルタ係数の演算やビデオプロセッサ２３０に対するレジスタ設定が１フレーム内で完了できない可能性がある。
【００１６】
図１０を用いて説明する。図１０は、ＣＰＵモジュール２１０の処理時間をフレーム毎に示す一例のタイミングチャートである。ここで、図１０Ａに示されるように、１フレーム周期が時間ｔであるとする。
【００１７】
図１０Ｂは、ビデオデータの第１フレームから第４フレームまでのフレーム画像に対して処理を施す場合に、ＣＰＵモジュール２１０による必要なフィルタ係数の生成やレジスタ設定などの処理が、それぞれ１フレーム周期ｔ内に完了している例である。なお、各フレーム画像に対する処理内容が異なる場合、ＣＰＵモジュール２１０の処理時間もフレーム毎に変動する。
【００１８】
図１０Ｃは、図１０Ｂと同様のビデオデータを処理する場合において、上述したような機能アップや入力チャンネル数の増加などによってＣＰＵモジュール２１０の処理時間が長くなった場合を表している。この図１０Ｃに示される例では、図１０Ｂに例示される場合に比べ、ＣＰＵモジュール２１０の処理時間は、各フレームについてほぼ倍になっている。この例では、第１、第２および第４フレームについては、１フレーム周期ｔ内でＣＰＵモジュール２１０の処理が終了しているが、第３フレームについては、ＣＰＵモジュール２１０の処理時間が１フレーム周期ｔを上回っており、第３フレームの処理がフレーム周期内で完全には終了しなかったことを示している。この時点で、第３フレームの処理は、破綻を来している。
【００１９】
こうした処理の破綻によって、第３フレームがリアルタイム処理に間に合わず、処理が完全に終わらない状態でビデオデータ出力として送信され、その結果、映像の一部が欠落したり、映像の乱れが生じたりする。
【００２０】
このような、フレーム周期内に処理が完了しない問題を解消するために、マルチメディアデータの処理を高速化する方法として、ＰＣＩローカルバスを利用してオーディオデータ、ビデオデータ、およびグラフィックスデータを高速に同時処理する統合マルチメディアボード回路（特許文献１）、およびコマンドの実行順序を調整することによってシステムのデータ処理速度を向上させたバスシステムおよび実行順序の調整方法（特許文献２）がある。
【００２１】
【特許文献１】
特開平７−２６２１３０号公報
【特許文献２】
特開２００２−４１４４９号公報
【００２２】
【発明が解決しようとする課題】
ところで、このような処理の破綻を回避する他の手段として、単純にＣＰＵモジュール２１０の演算処理速度（動作周波数）を上げてＣＰＵの演算時間を短縮する方法が考えられる。
【００２３】
しかしながら、ＣＰＵモジュールの演算処理速度を速くすると、ＲＡＭ／ＲＯＭなどのメモリ、各種データインタフェース、バス等の、ＣＰＵモジュールと同期して動作する構成要素もこれに対応させる必要があり、また、別途ソフトウエアの開発が必要となる場合もある。従って、ＣＰＵの演算処理速度を速くする場合は、ハードウエアの交換・調整やソフトウエアの開発に多大なコストが必要になるという問題点があった。更に、ソフトウエアの開発については、場合によっては極めて長い期間を要するという問題点があった。
【００２４】
こうした事情から、ＣＰＵモジュールの演算処理速度を単に上げるという手段を容易に選択することはできず、コストおよび期間の面でより制約のない別のアプローチが必要になる。
【００２５】
従って、この発明の目的は、ビデオプロセッサの機能アップや入力チャンネル数の増加があっても、ビデオデータに対する処理が破綻を来さないような信号処理装置および信号処理方法を提供することにある。
【００２６】
【課題を解決するための手段】
この発明は、上述した課題を解決するために、受信したデータによるレジスタの更新結果に応じて所定の処理を実行する処理手段と、処理手段のためのデータを生成し、データを送信する複数の制御手段と、複数の制御手段のそれぞれに対応して設けられ、制御手段から送信されたデータを一時的に記憶する複数の記憶手段と、記憶手段を、少なくとも制御手段のデータ送信周期よりも短い周期で選択し、選択された記憶手段に記憶されているデータを、順次、処理手段に送信する切換手段とを有することを特徴とする信号処理装置である。
【００２７】
また、この発明は、受信したデータによるレジスタの更新結果に応じて所定の処理を実行する処理のステップと、処理のステップのためのデータを生成データを送信する複数の制御のステップと、複数の制御のステップのそれぞれに対応して設けられた複数の記憶手段に対し、複数の制御のステップから送信されたデータを一時的にそれぞれ記憶する複数の記憶のステップと、記憶手段を、少なくとも制御のステップのデータ送信周期よりも短い周期で選択し、選択された記憶手段に記憶されているデータを、順次、処理のステップに送信する切換のステップとを有することを特徴とする信号処理方法である。
【００２８】
また、この発明は、受信したコマンドによるレジスタの更新結果に応じて所定の処理を実行する処理手段と、処理手段のためのデータを生成し、データを送信する複数の制御手段と、複数の制御手段のそれぞれに対応して設けられ、制御手段から送信されたデータを一時的に記憶する複数の記憶手段と、複数の記憶手段に対応して設けられ、複数の記憶手段内のデータ数をそれぞれ検出する検出手段と、検出手段により検出されたデータ数に基づき複数の記憶手段から１の記憶手段を選択し、選択された記憶手段に記憶されているデータを、少なくとも制御手段のデータ送信周期よりも短い周期で順次、処理手段に送信する切換手段とを有することを特徴とする信号処理装置である。
【００２９】
また、この発明は、受信したコマンドによるレジスタの更新結果に応じて所定の処理を実行する処理のステップと、処理のステップのためのデータを生成し、データを送信する複数の制御のステップと、複数の制御のステップのそれぞれに対応して設けられた複数の記憶手段に対し、複数の制御のステップから送信されたデータを一時的にそれぞれ記憶する複数の記憶のステップと、複数の記憶手段に対応して設けられ、複数の記憶手段内のデータ数をそれぞれ検出する検出のステップと、検出のステップにより検出されたデータ数に基づき複数の記憶手段から１の記憶手段を選択し、選択された記憶手段に記憶されているデータを、少なくとも制御のステップのデータ送信周期よりも短い周期で順次、処理のステップに送信する切換のステップとを有することを特徴とする信号処理装置である。
【００３０】
上述したように、この発明は、複数のＣＰＵモジュールにより生成されたデータを複数のメモリにそれぞれ一時的に記憶し、メモリを少なくともＣＰＵモジュールによるデータ送信周期よりも短い周期で選択し、選択されたメモリに記憶されているデータを順次、プロセッサに送信するようにしているため、プロセッサに送信するデータを生成する負荷が複数のＣＰＵモジュールに分散される。
【００３１】
また、この発明は、複数のＣＰＵモジュールにより生成されたデータを複数のメモリにそれぞれ一時的に記憶し、複数のメモリにそれぞれ記憶されたデータ数を検出した結果に基づき複数のメモリから１を選択し、選択されたメモリに記憶されているデータを少なくともＣＰＵモジュールのデータ送信周期よりも短い周期で順次、プロセッサに送信するようにしているため、プロセッサに送信するデータを生成する負荷が複数のＣＰＵモジュールに対して適応的に分散できる。
【００３２】
【発明の実施の形態】
以下、この発明の実施の第１の形態を、図面を参照しながら説明する。図１は、この発明が適用できるディジタルマルチエフェクトシステムの一例の構成を示す。デジタルマルチエフェクトシステム１００は、メインバス１４０に接続されたメインＣＰＵ１１０、ビデオ処理部１７０Ａ、１７０Ｂおよび１７０Ｃ、ビデオＩ／Ｆ１３０を備える。メインＣＰＵ１１０には、操作パネル１５０が接続される。
【００３３】
図示は省略するが、操作パネル１５０は、複数のモニタなどの表示装置と、キーボード、スイッチ、ジョイスティックなどの入力装置を備える。複数のモニタは、テレビカメラなどから供給される映像や、このディジタルマルチエフェクトシステムで処理された映像をそれぞれ表示する。ユーザは、ディジタルマルチエフェクトシステム１００の操作パネル１５０に備えられたキーボードやスイッチなどを操作することで、テレビカメラなどから供給される複数の入力映像やＶＴＲにより再生された映像素材をどのように編集するかを指示する。ライブ映像の中継映像を提供するような場面においては、ユーザは、幾つかのテレビカメラの映像から必要なものを選択すると共に、操作パネル１５０のキーボードなどを操作して、選択された各映像に所望の処理を施す。
【００３４】
次に、図１を参照して、デジタルマルチエフェクトシステム１００の処理をより詳細に説明する。ここでは、テレビカメラからの映像や記録されている映像素材等から、複数チャンネルのデジタルビデオデータがディジタルマルチエフェクトシステム１００に対して入力されるものとする。
【００３５】
複数チャンネルのビデオデータがビデオＩ／Ｆ１３０に入力され、ビデオ処理部１７０Ａ、１７０Ｂおよび１７０Ｃにそれぞれ転送される。ビデオデータの転送は、例えば、ビデオＩ／Ｆ１３０からビデオ処理部１７０との間にチャンネル毎に設けられた信号線１６０を介して行われる。なお、図１では、信号線１６０は、ビデオＩ／Ｆ１３０とビデオ処理部１７０Ｂとの間だけが示されており、他は省略されている。
【００３６】
一方、ユーザによって、操作パネル１５０の入力装置によりどのチャンネルのビデオデータにどのような処理を施すのかといった編集指示がなされ、この指示がメインＣＰＵ１１０に送信される。メインＣＰＵ１１０は、この編集指示の内容に基づいて、ビデオデータを処理するビデオ処理部をビデオ処理部１７０Ａ、１７０Ｂおよび１７０Ｃの中から選択し（例えばビデオ処理部１７０Ｂ）、選択されたビデオ処理部１７０ＢとビデオＩ／Ｆ１３０との間のビデオデータの転送と、当該ビデオ処理部１７０Ｂにおけるビデオデータの処理動作を制御する。この制御は、例えばメインＣＰＵ１１０からビデオ処理部１７０およびビデオＩ／Ｆ１３０に対して、バス１４０を介して制御データを送信することにより行われる。
【００３７】
ビデオ処理部１７０は、信号線１６０を介してビデオＩ／Ｆ１３０からビデオデータを受信すると、そのビデオデータに対して指示されている処理を施して出力する。この出力は、信号線１６０を介してビデオＩ／Ｆ１３０に送信され、出力ビデオデータとしてディジタルマルチエフェクトシステム１００の外部に出力される。ディジタルマルチエフェクトシステム１００内部におけるこれらのビデオデータの送受信は、上述したように、チャンネル毎に設けられた信号線１６０を介して個別に行われる。
【００３８】
図２は、この発明の実施の第１の形態によるビデオ処理部の一例の構成を示す。図２において、ビデオ処理部１は、上述の図１に示されるビデオ処理部１７０Ａ、１７０Ｂおよび１７０Ｃにそれぞれ対応する。ビデオプロセッサ３は、上述の従来の技術で説明したビデオプロセッサ２３０と同等のもので、入力されたビデオデータに対して、レジスタに設定されたフィルタ係数に応じて処理を施す。また、詳細は後述するが、ビデオプロセッサ３は、複数チャンネルのビデオデータを並列的に入力してそれぞれ処理を行うことができる。
【００３９】
ビデオ処理部１は、このように構成されるため、従来技術で説明したビデオ処理部１２０と共通のインタフェースを多く有している。したがって、ソフトウエアの基本アーキテクチャ等を変更せずに、従来のビデオ処理部１２０をこの発明のビデオ処理部１に置き換えることによって、この発明の効果を奏するディジタルマルチエフェクトシステムが容易に構成されうる。
【００４０】
ビデオ処理部１は、レジスタバス２にそれぞれ接続されたビデオプロセッサ３およびバスＩ／Ｆ４を有し、バスＩ／Ｆ４には、さらに、ローカルバス８およびローカルバス１２が並列的に接続される。
【００４１】
ローカルバス８には、ＲＡＭ／ＲＯＭ５、ＣＰＵモジュール６、およびＩ／Ｏポート７が接続される。ＲＡＭ／ＲＯＭ５は、ＣＰＵモジュール６によって使用されるプログラム等を記憶し、また、必要に応じて実行時に使用されるデータを一時的に記憶する。Ｉ／Ｏポート７は、図１におけるメインＣＰＵ１１０との間で制御データのやりとりを行う。さらに、制御データは、ローカルバス８を介して、Ｉ／Ｏポート７とＣＰＵモジュール６との間でやりとりされる。
【００４２】
ローカルバス１２も、ローカルバス８と同様の構成とされる。すなわち、ローカルバス１２に対して、ＲＡＭ／ＲＯＭ９、ＣＰＵモジュール１０およびＩ／Ｏポート１１が接続され、Ｉ／Ｏポート１１とメインＣＰＵ１１０との間で制御データがやりとりされる。さらに、制御データは、ローカルバス１２を介して、Ｉ／Ｏポート１１とＣＰＵモジュール１０との間でやりとりされる。
【００４３】
なお、この例では、ローカルバス８およびレジスタバス２の間の双方向性、ならびに、ローカルバス１２およびレジスタバス２の間の双方向性は確保されるが、ローカルバス８および１２間でのデータのやりとりは想定されない。また、この例では、レジスタバス２、ローカルバス８、１２のバス幅は、例えば３２ビットとされる。なお、この発明によるシステムの前提条件として、ビデオプロセッサ３に対するレジスタのアクセスによるバスの占有率は、単位フレーム時間に対して低いものとする。これは、後述するこの発明の実施の第１の形態の変形例および実施の第２の形態においても、同様である。
【００４４】
ビデオプロセッサ３に対して、図１のビデオＩ／Ｆ１３０からビデオデータが入力される。また、ビデオプロセッサ３から出力されたビデオデータは、ビデオＩ／Ｆ１３０に供給される。また、ＣＰＵモジュール６および／または１０で生成されたフィルタ係数がバスＩ／Ｆ４を介してレジスタバス２に供給され、ビデオプロセッサ３のレジスタに所定に設定される。バスＩ／Ｆ４は、ローカルバス８、１２とレジスタバス２とを接続するブリッジのような役割を担う。
【００４５】
なお、上述したように、ローカルバス８とローカルバス１２との間では、基本的にデータのやり取りは発生しない。したがって、メインＣＰＵ１１０は、ＣＰＵモジュール６、１０を制御する上位のＣＰＵに相当する。メインＣＰＵ１１０によって、このビデオ処理部１におけるＣＰＵモジュール６および１０間での動的な最適化が図られる。
【００４６】
図３は、ビデオプロセッサ３の一例の構成を示す。ビデオプロセッサ３は、例えばＤＳＰからなる演算器３００を備えると共に、多重装置３１０および分離装置３２０を有している。図３の例では、第１チャンネルから第４チャンネルまでの、４チャンネル分のビデオデータが独立的に多重装置３１０に入力される。多重装置３１０は、パラレルに提供された上記第１チャンネルから第４チャンネルまでのビデオデータを時分割多重してシリアルに演算器３００に渡す。
【００４７】
演算器３００には、レジスタバス２が接続され、上述の図２で示したＣＰＵモジュール６および１０で生成されたフィルタ係数がレジスタバス２を介して供給される。フィルタ係数は、演算器３００のレジスタに所定に設定される。
【００４８】
演算器３００は、レジスタに設定されたフィルタ係数の内容に従って、時分割多重された第１チャンネルから第４チャンネルのうちの何れかのチャンネルのビデオデータに対して、例えば、１フレームまたは１ラインなどの単位で処理を施す。第１〜第４チャンネルそれぞれのビデオデータが入力されるタイミングに従って演算器３００に設定するフィルタ係数を入れ替えることで、チャンネル毎の独立的な処理を実現することができる。こうして得られた各チャンネルのビデオデータは、分離装置３２０で元のチャンネル単位に組み立てられ、第１チャンネルから第４チャンネルのビデオデータとして独立的に出力される。
【００４９】
次に、この発明の実施の第１の形態によるビデオ処理部１の動作について、より具体的に説明する。ＣＰＵモジュール６、ＣＰＵモジュール１０は、ビデオデータに対する処理の機能毎および／またはチャンネル毎に負荷が分散される。チャンネル毎に負荷分散がなされる場合、例えば、ビデオプロセッサ３に４チャンネル分のビデオデータが入力されていると仮定すると、第１チャンネルと第３チャンネルに対してなされる処理のためのフィルタ係数の演算は、ＣＰＵモジュール６が行い、第２チャンネルと第４チャンネルに対してなされる処理のためのフィルタ係数の演算は、ＣＰＵモジュール１０が行うように制御することが可能である。このような負荷分散についての制御は、メインＣＰＵ１１０からの制御データによって行われる。
【００５０】
バスＩ／Ｆ４は、ローカルバス８、１２とビデオプロセッサ３との接続を一定周期で交互に切り換えることによって、ＣＰＵモジュール６およびＣＰＵモジュール１０から出力される、ビデオプロセッサ３に対する命令（例えばフィルタ係数をレジスタに設定するためのライトコマンド）を、交互にビデオプロセッサ３に供給する。その結果、ビデオプロセッサ３は、ＣＰＵモジュール６からのコマンドとＣＰＵモジュール１０からのコマンドを交互に受け取ってレジスタに設定し、設定されたレジスタの内容に基づく処理を、入力されたビデオデータに対して繰り返し実行する。
【００５１】
図４は、この実施の第１の形態によるフレーム毎の処理時間を表した一例のタイミングチャートである。この図４で示される第１フレーム１〜第４フレームの各フレーム画像に対する処理内容は、上述した図１０に示すものと同様であるものとする。ここで、図４Ａに示されるように、１フレーム周期が時間ｔであるとする。図４Ｂは、ビデオデータの第１フレームから第４フレームまでのフレーム画像に対して処理を施す場合に、ＣＰＵモジュール６および１０による必要なフィルタ係数の生成やレジスタ設定などの処理が、それぞれ１フレーム周期ｔ内に完了している例である。
【００５２】
図４Ｃは、ビデオプロセッサ３の機能アップや、入力チャンネル数が増加した場合を想定している。この実施の第１の形態では、２つのＣＰＵモジュール６および１０を備えるビデオ処理部１を使用してＣＰＵモジュール６および１０に負荷を分散した結果、各ＣＰＵモジュール６および１０の処理時間を１フレーム周期内に収めることができる。
【００５３】
次に、この実施の第１の形態におけるバスＩ／Ｆ４の構成、ならびに、ビデオプロセッサ３、バスＩ／Ｆ４およびＣＰＵモジュール６、１０間の動作について、図５および図６を参照して、より詳細に説明する。
【００５４】
図５は、バスＩ／Ｆ４の一例の構成をより詳細に示すと共に、バスＩ／Ｆ４とＣＰＵモジュール６および１０との接続、ならびに、バスＩ／Ｆ４とビデオプロセッサ３との接続を示す。
【００５５】
なお、以下において、ローカルバス８および１２は、それぞれＣＰＵモジュール６および１０の出力するクロックに同期したバスである。また、レジスタバス２も、クロックに同期したバスである。レジスタバス２のクロックは、ＣＰＵモジュール６およびＣＰＵモジュール１０の何れか一方のクロックに同期される。以下では、レジスタバス２のクロックがＣＰＵモジュール６のクロックに同期しているものとする。
【００５６】
バスＩ／Ｆ４は、バススイッチ１５、アドレスデコーダ１６、ライトＦＩＦＯ１７、リードＦＩＦＯ１８、アドレスデコーダ１９、ライトＦＩＦＯ２０、リードＦＩＦＯ２１、およびレジスタバスＩ／Ｆ２２を備える。
【００５７】
ＣＰＵモジュール６側から説明する。ＣＰＵモジュール６からアドレス（１）データおよびリード／ライトコマンドが出力され、ローカルバス８を介してアドレスデコーダ１６に供給される。また、ＣＰＵモジュール６とアドレスデコーダ１６との間で、ローカルバス８を介してデータのやりとりがなされる。ＣＰＵモジュール６から出力されるデータは、例えばビデオプロセッサ３のレジスタに設定するためのフィルタ係数である。
【００５８】
ライト／リードコマンドは、アドレスデコーダ１６で解釈され、ビデオプロセッサ３に対するアクセスであれば、アドレス（１）データおよびデータ（１）と共に、先入れ先出し方式（Ｆａｓｔ−ＩｎＦａｓｔ−Ｏｕｔ）のメモリであるライトＦＩＦＯ１７に書き込まれる。このとき、アドレス（１）データは、例えばビデオプロセッサ３のレジスタを指定するためのアドレスである。
【００５９】
ＣＰＵモジュール１０側も、ＣＰＵモジュール６側と同様である。すなわち、ＣＰＵモジュール１０から出力されたアドレス（２）データおよびリード／ライトコマンドは、ローカルバス１２を介してアドレスデコーダ１９に供給されると共に、ＣＰＵモジュール１０とアドレスデコーダ１９との間で、ローカルバス１２を介してデータのやりとりがなされる。リード／ライトコマンドは、アドレスデコーダ１９で解釈され、ビデオプロセッサ３に対するアクセスであれば、アドレス（２）データおよびデータ（２）と共に、ライトＦＩＦＯ２０に書き込まれる。
【００６０】
バススイッチ１５では、ライトＦＩＦＯ１７から供給されたアドレス（１）データ、リード／ライトコマンドおよびデータ（１）と、ライトＦＩＦＯ２０から供給されたアドレス（２）データ、リード／ライトコマンドおよびデータ（２）とを所定の周期で切り換え、交互に出力する。
【００６１】
このとき、バススイッチ１５により、ライトＦＩＦＯ１７または２０から読み出されたアドレスデータ、リード／ライトコマンドおよびデータは、ＣＰＵモジュール６のクロックに同期させてレジスタバス２側に転送される。また、バスＩ／Ｆ４では、ローカルバス８および１２からの制御を時間方向に圧縮させて、ビデオプロセッサ３に対して出力する。これにより、レジスタバス２におけるローカルバス８および１２への時間的な影響を無くしている。
【００６２】
例えば、レジスタバス２のクロック周波数を、ＣＰＵモジュール６のクロック周波数よりも高くする。このレジスタバス２のクロック周波数に基づきバススイッチ１５を交互に切り換え、ライトＦＩＦＯ１７および２０から交互にデータを読み出す。好適な例として、レジスタバス２のクロック周波数を、ＣＰＵモジュール６のクロック周波数の２倍とし、このタイミングでバススイッチ１５を切り換えると共にライトＦＩＦＯ１７および２０から交互にデータを読み出す。
【００６３】
バススイッチ１５から出力されたアドレスデータ、リード／ライトコマンドおよびデータは、レジスタバスＩ／Ｆ２２に供給される。アドレスデータ、リード／ライトコマンドおよびデータは、レジスタバスＩ／Ｆ２２からレジスタバス２を介してビデオプロセッサ３に供給される。ビデオプロセッサ３では、例えば、コマンドがライトコマンドであれば、アドレスデータにより指定されるレジスタに対してデータ（フィルタ係数）を設定する。
【００６４】
ビデオプロセッサ３に供給されるコマンドがリードコマンドであれば、例えば、ビデオプロセッサ３からアドレスで指定されたレジスタからデータが読み出され、レジスタバス２を介してレジスタバスＩ／Ｆ２２に供給される。このデータは、レジスタバスＩ／Ｆ２２からバススイッチ１５に供給され、所定に切り換えられ例えばデータ（１）としてリードＦＩＦＯ１８に書き込まれる。データ（１）は、リードＦＩＦＯ１８から読み出され、アドレスデコーダ１６およびローカルバス８を介してＣＰＵモジュール６に供給される。
【００６５】
ＣＰＵモジュール１０側でも同様に、バススイッチ１５からデータ（２）として出力され、リードＦＩＦＯ２１に書き込まれる。データ（２）は、リードＦＩＦＯ２１から読み出され、アドレスデコーダ１９およびローカルバス１２を介してＣＰＵモジュール１０に供給される。
【００６６】
なお、上述において、バススイッチ１５、レジスタバスＩ／Ｆ２２およびビデオプロセッサ３は、同期して動作する。一方、ＣＰＵモジュール６およびアドレスデコーダ１６と、ＣＰＵモジュール１０およびアドレスデコーダ１９は、互いに非同期とすることができる。ＣＰＵモジュール６およびアドレスデコーダ１６と、ＣＰＵモジュール１０およびアドレスデコーダ１９の組から非同期に送信されるコマンドなどは、ライトＦＩＦＯ１７、２０を用いてタイミングの調整がなされ、バススイッチ１５、レジスタバスＩ／Ｆ２２、およびビデオプロセッサ３の組と同期が取られる。
【００６７】
この図５の例では、リード／ライトコマンドなどを一時的に格納する待ち行列としてＦＩＦＯが用いられている。この実施の第１の形態のように、要求順に処理が行われていくようなケースでは、ＦＩＦＯのような先入れ先出し方式が一般的であるが、更新順序その他の事情により、他の方式を使用することも可能である。
【００６８】
ＣＰＵモジュール６からリードコマンドが発行される場合、アドレス（１）データは、例えばビデオプロセッサ３のレジスタを指定する。リードコマンドの応答データは、ビデオプロセッサ３のアドレス（１）データで指定されたレジスタから読み出されたデータがレジスタバスＩ／Ｆ２２およびバススイッチ１５を介してリードＦＩＦＯ１８に書き込まれる。その後、その応答データは、アドレスデコーダ１６を介してＣＰＵモジュール６に供給される。こうしたリードコマンドは、例えば、頻繁に更新されるＦＰＧＡのバージョン情報を読み取る命令であったり、処理対象のビデオデータの一部をサンプルするための命令であったりする。なお、バススイッチ１５は、リードコマンドをビデオプロセッサ３に送信した後、ホールド状態となる。
【００６９】
図６は、ローカルバス８、１２およびレジスタバス２における、リード／ライトコマンドの転送タイミングを示す一例のタイミングチャートである。図６Ａおよび図６Ｂは、それぞれローカルバス８および１２におけるタイミングを示し、図６Ｃは、レジスタバス２におけるタイミングを示す。
【００７０】
図６において、ローカルバス８からのライトコマンドＣ１−１、Ｃ１−２、およびローカルバス１２からのライトコマンドＣ２−１、Ｃ２−２、リードコマンドＣ２−３までは、ライトＦＩＦＯ１７、２０に書き込まれた順序で、レジスタバス２に時間圧縮されて出力される。また、上述のように、リードコマンドＣ２−３のようなコマンドが発生すると、ローカルバス１２側にはウエイト信号が出力されて、バススイッチ１５の動作がホールドされる。
【００７１】
リードコマンドＣ２−３は、ビデオプロセッサ３で処理される。当該リードコマンドＣ２−３による処理の結果得られた応答データがビデオプロセッサ３から取り出され、レジスタバス２、レジスタバスＩ／Ｆ２２、バススイッチ１５、リードＦＩＦＯ２１、アドレスデコーダ１９およびローカルバス１２を介してＣＰＵモジュール１０に転送される。
【００７２】
リードコマンドＣ２−３に対する応答データがレジスタバス２に現れるまで、ローカルバス８側からのリード／ライトコマンドは、ライトＦＩＦＯ１７に溜め込まれる。リードコマンドＣ２−３に対する応答データがレジスタバス２に現れると上述したバススイッチ１５の動作のホールドが解除されるので、その後、ライトＦＩＦＯ１７、２０から、順次レジスタバス２にリード／ライトコマンドが転送される。
【００７３】
このとき、レジスタバス２側の最小コマンドサイクルと、ローカルバス８、１２からのリードコマンドとライトコマンドの発生頻度によって、ローカルバス８、１２側のコマンドサイクルと最低限必要なライトＦＩＦＯ１７、２０の深さ（容量）を見積もることができる。
【００７４】
ここで、レジスタバス２における処理タイミングを、図６Ｃのタイミングチャートに沿って説明する。レジスタバス２の最初のコマンドサイクルＣＳ１の時点においては、ローカルバス８、ローカルバス１２共に、ライトコマンド（Ｃ１−１、Ｃ２−１）、アドレス、およびデータの転送を完了しておらず、処理を行わない（ノーオペレーション（ＮＯＰ））。次のコマンドサイクルＣＳ２においては、ローカルバス８からライトコマンドＣ１−１が、ローカルバス１２からライトコマンドＣ２−１が転送され、それぞれライトＦＩＦＯ１７、２０に、アドレスおよびデータと共に書き込まれる。この時点でバススイッチ１５が、最初にライトＦＩＦＯ２０のコマンドの方をレジスタバス２に供給するものとすると、図のように、コマンドサイクルＣＳ２では、ライトコマンドＣ２−１がアドレスおよびデータと共にレジスタバス２に転送される。
【００７５】
次のコマンドサイクルＣＳ３においては、バススイッチ１５が切り換えられ、ライトＦＩＦＯ１７から、ライトコマンドＣ１−１が、アドレス（１）データおよびデータ（１）と共にレジスタバス２に転送される。
【００７６】
コマンドサイクルＣＳ４においては、ＣＳ２の場合と同様に、ライトＦＩＦＯ１７、２０にコマンドＣ２−２およびＣ１−２が記憶されており、コマンドＣ２−２および対応するアドレス（１）データおよびデータ（１）がレジスタバス２に転送され、以降同様の処理が繰り返される。
【００７７】
レジスタバス２が、コマンドサイクルＣＳ６でリードコマンドを受け取ると、上述のようにレジスタバス２の動作はホールドされるので、ビデオプロセッサ３からリードコマンドの応答があるまで、レジスタバス２は、新たなコマンドを受け取らない。
【００７８】
リードコマンドの応答がビデオプロセッサ３からあった時点で、ライトＦＩＦＯ１７にはライトコマンドＣ１−３、Ｃ１−４、Ｃ１−５が記憶されており、ライトＦＩＦＯ２０には何も記憶されていない。ここで、コマンドサイクルＣＳ７においては、ライトＦＩＦＯ１７から転送が行われるので、ライトコマンドＣ１−３とアドレスデータがレジスタバス２に提供される。次のコマンドサイクルＣＳ８では、バススイッチ１５がライトＦＩＦＯ２０からレジスタバス２への転送を行うよう試みるが、この時点では、まだローカルバス１２からコマンドＣ２−４、アドレスおよびデータの転送が完了しておらず、レジスタバス２はＮＯＰとなる。
【００７９】
コマンドサイクルＣＳ９以降は、ライトＦＩＦＯ１７、２０から交互にライトコマンド、アドレス、およびデータがレジスタバス２に転送される。
【００８０】
上述のこの発明の実施の第１の形態では、２つのＣＰＵモジュール６、１０を有するように構成しているように説明したが、これはこの例に限らず、例えばＣＰＵモジュールを３つ以上有するように構成することも可能である。
【００８１】
図７は、上述した実施の第１の形態の変形例による、３以上のＣＰＵモジュールを有するビデオ処理部の一例の構成を示す。バスＩ／Ｆ２５には、３以上のローカルバス２４Ａ、２４Ｂ、２４Ｃ、・・・が接続されており、その点で図２に示すビデオ処理部１の構成と異なる。処理を指定する対象となるビデオプロセッサ３に対して、複数個のＣＰＵモジュール２２Ａ、２２Ｂ、２２Ｃ、・・・を設け、これらのＣＰＵモジュール２２Ａ、２２Ｂ、２２Ｃ、・・・がそれぞれ接続されるローカルバス２４Ａ、２４Ｂ、２４Ｃ、・・・とビデオプロセッサ３との間を、バスＩ／Ｆ２５で接続する。なお、ビデオプロセッサ３は、上述の実施の第１の形態で用いたものと同様のものを用いることができる。
【００８２】
バスＩ／Ｆ２５は、上述の図５に示されるバスＩ／Ｆ４の構成を参照して、ローカルバス２４Ａ、２４Ｂ、２４Ｃ、・・・にそれぞれ対応するアドレスデコーダ、ライトＦＩＦＯ、リードＦＩＦＯをそれぞれ有し、ローカルバス２４Ａ、２４Ｂ、２４Ｃ、・・・にそれぞれ対応するライトＦＩＦＯ、リードＦＩＦＯを切り換えるバススイッチを有する。バススイッチは、レジスタバス２６を介してビデオプロセッサ３と接続される。
【００８３】
バスＩ／Ｆ２５は、上述の図示されないバススイッチにより、これらローカルバス２４Ａ、２４Ｂ、２４Ｃ、・・・にそれぞれ対応する各ＦＩＦＯとレジスタバス２６との接続を周期的に切り換え、各ＦＩＦＯ内のデータが均等にビデオプロセッサ３に提供されるようにする。これと同時に、レジスタバス２６において最適のタイミングとなるように、コマンドサイクルを時間的に圧縮する。
【００８４】
また、バスＩ／Ｆ２５内のＦＩＦＯおよびバススイッチは、図５で説明したように、双方向のデータ切り換えに対応できる構成とする。これによって、ＣＰＵモジュール２２Ａ、２２Ｂ、２２Ｃ、・・・からビデオプロセッサ３に対する、図５を用いて説明したようなリード／ライト動作が実現できる。
【００８５】
この実施の第１の形態の変形例においては、バスＩ／Ｆ２５のバススイッチの切り換え周期およびレジスタバス２６のクロック周波数を短縮し、これに合わせてビデオプロセッサ３のクロック周波数を上げれば、より多くのＣＰＵモジュールを接続することができる。
【００８６】
次に、この発明の実施の第２の形態について、図８を参照して説明する。この実施の第２の形態では、バススイッチの切り換え方式を、上述の実施の第１の形態のような周期的なものではなく、動的なものとする。この構成は、ＦＩＦＯに記憶されたローカルバス側からのリード／ライトコマンドの量によって、バススイッチ１５の切り換えを制御し、２つのローカルバス間で、リード／ライトコマンドの待ち時間を同等にしようとするものである。
【００８７】
図８は、実施の第２の形態におけるバスＩ／Ｆ４’の一例の構成をより詳細に示すと共に、バスＩ／Ｆ４’とＣＰＵモジュール６および１０との接続、ならびに、バスＩ／Ｆ４’とビデオプロセッサ３との接続を示す。なお、図８において、上述の図５と共通する部分には同一の符号を付し、詳細な説明を省略する。この実施の第２の形態による構成は、基本的には、上述の実施の第１の形態において図５に示した、ＣＰＵモジュール６および１０、バスＩ／Ｆ４、およびビデオプロセッサ３からなる構成と同様である。
【００８８】
この実施の第２の形態では、図５のバスＩ／Ｆ４に対応するバスＩ／Ｆ４’において、リクエストカウンタ３３、３７が新たに設けられている。リクエストカウンタ３３および３７は、それぞれ、ライトＦＩＦＯ１７および２０内に保持されているデータ数を逐次検出する。
【００８９】
例えば、アドレスデコーダ１６からライトＦＩＦＯ１７へリード／ライトコマンドを転送する際には、リクエストカウンタ３３の内容が例えば１だけカウントアップされる。一方、リード／ライトコマンドがライトＦＩＦＯ１７からレジスタバス２側に転送された場合には、リクエストカウンタ３３の内容が、例えば１だけ減じられる。同様に、アドレスデコーダ１９からライトＦＩＦＯ２０へリード／ライトコマンドを転送する際には、リクエストカウンタ３７の内容が例えば１だけカウントアップされる。また、リード／ライトコマンドがライトＦＩＦＯ２０からレジスタバス２側に転送された場合には、リクエストカウンタ３７の内容が、例えば１だけ減じられる。このようにすることで、リクエストカウンタ３３および３７を用いて、ライトＦＩＦＯ１７および２０内のデータ数の逐次検出が可能である。
【００９０】
勿論、ライトＦＩＦＯ１７および２０内のデータ数を、他の方法によって検出してもよい。例えば、クロックに基づきライトＦＩＦＯ１７および２０にアクセスし、書き込まれているデータ数を検出するようにできる。
【００９１】
上述の実施の第１の形態では、バススイッチ１５は、複数のライトＦＩＦＯ１７および２０とレジスタバス２との接続を、単に周期的に切り換えていた。この実施の第２の形態では、バススイッチ１５’は、上述のリクエストカウンタ３３および３７によるライトＦＩＦＯ１７および２０内のデータ数の検出結果に基づき、バススイッチ１５’の切り換えタイミングを動的に制御する。これにより、ローカルバス８および１２のコマンドの待ち数を均一化する。
【００９２】
より具体的には、リクエストカウンタ３３の値（カウント値（１）とする）およびリクエストカウンタ３７の値（カウント値（２）とする）を読み込み、両者を比較する。比較の結果、カウント値（１）＞＝カウント値（２）の場合は、ライトＦＩＦＯ１７とレジスタバスＩ／Ｆ２２’との間を接続するようにバススイッチ１５’を制御し、それ以外の場合は、ライトＦＩＦＯ２０とレジスタバスＩ／Ｆ２２’との間を接続するようにバススイッチ１５’を制御する。カウント値（１）、カウント値（２）は、それぞれライトＦＩＦＯ１７、ライトＦＩＦＯ２０におけるリード／ライトコマンドの待ち数を意味する。
【００９３】
リードコマンドを受け付けた場合は、上述の実施の第１の形態の場合と同様、バススイッチ１５’をホールドさせる。
【００９４】
ここで、バススイッチ１５’は、それ自身が、上述のカウント値（１）とカウント値（２）を読み込み、それらを比較して接続先の判断を行うようにプログラムされ得る。これはこの例に限らず、図１に示したメインＣＰＵ１１０によってこのような接続制御処理が行われるように構成することもできる。
【００９５】
なお、上述では、この発明がディジタルマルチエフェクトシステムのビデオ処理部に対して適用されるように説明したが、この発明はこの例に限定されるものではない。例えば、この発明は、こうしたフレーム画像に対する処理を行う構成に限られず、信号を入力して一定の処理を行うプロセッサに広く適用することが可能である。
【００９６】
【発明の効果】
この発明によれば、画像に特殊効果処理などを施すためのパラメータの演算、生成を、複数のＣＰＵモジュールで分散して処理することができるため、ＣＰＵモジュールに対する負荷がより高くなるような場合でも、１フレーム周期内に処理を完了させることが可能となるという効果がある。
【００９７】
また、この発明によるビデオ処理部の変更を行っても、同一性能のＣＰＵモジュールを増設し、従来のビデオプロセッサをそのまま使用するため、ビデオプロセッサの機能アップや入力チャンネル数の増加があった場合でも、ソフトウエアの基本アーキテクチャに変更を加えることなく、システム全体の処理速度を向上させることができ、その結果、経済的、かつ短期にビデオプロセッサの機能アップ等に対応することができる効果がある。
【００９８】
さらに、この発明の実施の第２の形態によれば、複数のＣＰＵモジュールからのリクエストに応じてバススイッチの切り換えタイミングを適応的に制御できるので、より効率的な処理が可能となるという効果がある。
【図面の簡単な説明】
【図１】この発明が適用できるディジタルマルチエフェクトシステムの一例の構成を示すブロック図である。
【図２】この発明の実施の第１の形態によるビデオ処理部の一例の構成を示すブロック図である。
【図３】ビデオプロセッサの一例の構成を示すブロック図である。
【図４】この実施の第１の形態によるフレーム毎の処理時間を表した一例のタイミングチャートである。
【図５】バスＩ／Ｆの一例の構成をより詳細に示すと共に、バスＩ／Ｆと２つのＣＰＵモジュールとの接続、ならびに、バスＩ／Ｆとビデオプロセッサとの接続を示す略線図である。
【図６】２つのローカルバスおよびレジスタバスにおけるリード／ライトコマンドの転送タイミングを示す一例のタイミングチャートである。
【図７】実施の第１の形態の変形例による、３以上のＣＰＵモジュールを有するビデオ処理部の一例の構成を示すブロック図である。
【図８】実施の第２の形態におけるバスＩ／Ｆの一例の構成をより詳細に示すと共に、バスＩ／Ｆと２つのＣＰＵモジュールとの接続、ならびに、バスＩ／Ｆとビデオプロセッサとの接続を示す略線図である。
【図９】従来の技術によるビデオ処理部の一例の構成を示すブロック図である。
【図１０】ＣＰＵモジュールの処理時間をフレーム毎に示す一例のタイミングチャートである。
【符号の説明】
１・・・ビデオ処理部、２・・・レジスタバス、３・・・ビデオプロセッサ、４・・・バスＩ／Ｆ、５、９・・・ＲＡＭ／ＲＯＭ、６、１０・・・ＣＰＵモジュール、７、１１・・・Ｉ／Ｏポート、８、１２・・・ローカルバス、１７、２０・・・ライトＦＩＦＯ
[0001]
[Technical field of the invention]
The present invention relates to a signal processing device and a signal processing method for processing a video signal using a plurality of CPUs.
[0002]
[Prior art]
Broadcasting stations and production companies often have an editing room equipped with a professional image effector for editing broadcast video and the like. A professional image effector is a device that edits and processes live video obtained by a television camera, previously recorded video, and the like, for example by applying special effect processing such as image enlargement and reduction.
[0003]
A broadcasting station or the like creates broadcast video by combining video edited and processed by the professional image effector with other video and further adding CG (Computer Graphics) images and sound. For example, video obtained from a television camera is enlarged or reduced and inserted into other video, and captions (telops) are then inserted into that video to create the broadcast video.
[0004]
With such video processing, for example, video from a plurality of television cameras can be edited in a live broadcast so that it is output simultaneously on a single screen, which conveys a large amount of information at once and gives viewers the feeling of being at the scene. It is also possible to add variety to the presentation of a program by applying various special effects to the video and further inserting CG video or the like.
[0005]
One device used as such a professional image effector is the digital multi-effect system. This system edits and processes digital video signals, for example by applying special effect processing, and can process a plurality of input videos in parallel and in real time.
[0006]
Further, the digital multi-effect system may be used as a digital multi-effect switcher in combination with a switcher for switching a video signal path.
[0007]
A digital multi-effect system includes display devices such as a plurality of monitors and input devices such as a keyboard, switches, and a joystick. In response to the user's operation of the input devices, it performs editing using a plurality of video signals, such as input video transmitted from television cameras and video material reproduced by a VTR (Video Tape Recorder).
[0008]
In the digital multi-effect system, special effect processing such as image enlargement and reduction can be applied to a video signal during this editing work. This special effect processing is performed using, for example, a video processor. As an example, a control signal is output from a CPU (Central Processing Unit) in response to the user's operation of an input device, the video processor is controlled by this control signal, and predetermined processing is applied to the input video signal. The video processor is, for example, a DSP (Digital Signal Processor), and applies the desired processing to the input signal according to coefficients set in its registers.
[0009]
FIG. 9 shows an example of a configuration of a video processing unit 120 according to the related art in which a special effect is applied to such an input video signal using a video processor. The video processing unit 120 is used by being incorporated in, for example, the digital multi-effect system described above.
[0010]
The video processing unit 120 has a RAM/ROM 200, a CPU module 210, an I/O port 220, and a video processor 230, each of which is connected to a local bus 240 so that they can exchange data with one another. The RAM/ROM 200 stores programs and the like used by the CPU module 210, and temporarily stores data used at the time of execution as necessary.
[0011]
The I/O port 220 exchanges control data with the outside. For example, control data generated in response to the user's operation of an input device is input via the I/O port 220 from a main CPU that controls the entire digital multi-effect system in which the video processing unit 120 is incorporated, and is supplied to the CPU module 210.
[0012]
Meanwhile, the video data to be processed is input to the video processor 230. The video processor 230 processes the input video data according to the filter coefficients set in its registers. For example, filter coefficients are generated in the CPU module 210 based on the control data supplied via the above-described I/O port 220, and are set in the video processor 230. The video data input to the video processor 230 is processed by the video processor 230 according to the set filter coefficients, and output.
[0013]
That is, in the video processor 230, the output of the video data changes according to the register setting by the CPU module 210. By executing this processing within a frame period, image processing by the video processor 230 is realized.
[0014]
Here, the video processing unit 120 can process video data of a plurality of channels in parallel. For example, assuming that the video processing unit 120 supports four channels of video data input, the four channels of video data input in parallel are each compressed in the time direction so that the frames of all four channels fit within one frame period, time-division multiplexed into one channel of data, and supplied to the video processor 230. The CPU module 210 sets the filter coefficients for each of the four channels of video data time-division multiplexed within one frame period in the registers of the video processor 230, switching them in synchronization with the respective channels. The video processor 230 processes the supplied video data according to the filter coefficients set in the registers, and outputs the processed video data. The output video data is demultiplexed, reassembled into the original four channels of video data, and output individually for each channel.
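The channel handling described in this paragraph can be illustrated with a minimal sketch. It is not the circuit of the video processing unit 120: the function names, the list-of-integers frame representation, and the use of a single multiplicative coefficient per channel are all assumptions made only to show the order of operations (multiplex, switch the register per channel, process, demultiplex).

```python
from typing import Dict, List, Tuple

Frame = List[int]          # toy stand-in for one frame of samples (assumption)
Slot = Tuple[int, Frame]   # (channel index, frame data) in the multiplexed stream


def multiplex(channels: List[Frame]) -> List[Slot]:
    """Time-division multiplex: place each channel's frame in its own slot,
    back to back, so that all channels share one stream per frame period."""
    return [(index, frame) for index, frame in enumerate(channels)]


def process(slots: List[Slot], coefficients: Dict[int, int]) -> List[Slot]:
    """Stand-in for the video processor: the register is reloaded with the
    coefficient of the channel occupying the current slot, then applied."""
    processed = []
    for index, frame in slots:
        register = coefficients[index]  # register setting switched per channel
        processed.append((index, [register * sample for sample in frame]))
    return processed


def demultiplex(slots: List[Slot], channel_count: int) -> List[Frame]:
    """Undo the multiplexing and return one frame per original channel."""
    channels: List[Frame] = [[] for _ in range(channel_count)]
    for index, frame in slots:
        channels[index] = frame
    return channels


inputs = [[1, 2], [3, 4], [5, 6], [7, 8]]       # four channels, hypothetical data
gains = {0: 1, 1: 2, 2: 3, 3: 4}                # hypothetical per-channel coefficients
outputs = demultiplex(process(multiplex(inputs), gains), len(inputs))
```

In this sketch each channel keeps its own coefficient even though a single processing path is shared, which is the point of switching the register setting in step with the multiplexed channels.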
[0015]
Here, in order for the video data output from the video processor 230 to be able to be normally reproduced, the designated processing for one frame of image must be completed within one frame period. If this processing is not completed within one frame period, a part of the video is lost or the video is disturbed. For example, when the number of channels of video data to be processed increases or the function of the video processor 230 is improved and more complex image processing becomes possible, the number of registers in the video processor 230 simply increases. Therefore, there is a possibility that the calculation of the filter coefficient by the CPU module 210 and the register setting for the video processor 230 cannot be completed within one frame.
[0016]
This will be described with reference to FIG. 10. FIG. 10 is an example timing chart showing the processing time of the CPU module 210 for each frame. Here, as shown in FIG. 10A, it is assumed that one frame period is time t.
[0017]
FIG. 10B shows an example in which, when processing is performed on the frame images from the first frame to the fourth frame of the video data, the processing by the CPU module 210, such as generating the necessary filter coefficients and setting the registers, is completed within one frame period t for each frame. When the processing content differs from frame to frame, the processing time of the CPU module 210 also varies from frame to frame.
[0018]
FIG. 10C illustrates a case where the same video data as in FIG. 10B is processed but the processing time of the CPU module 210 has become longer because of the functional enhancement or the increase in the number of input channels described above. In the example shown in FIG. 10C, the processing time of the CPU module 210 is almost doubled for each frame compared with the case shown in FIG. 10B. In this example, the processing of the CPU module 210 finishes within one frame period t for the first, second, and fourth frames, but for the third frame the processing time of the CPU module 210 exceeds one frame period t, indicating that the processing of the third frame was not fully completed within the frame period. At this point, the processing of the third frame has failed.
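The deadline condition behind FIG. 10 can be written down as a short check. The sketch below uses made-up processing times (the patent gives no numbers), normalizes the frame period t to 1.0, and models the doubled load of FIG. 10C simply by multiplying each time by two; `overrun_frames` is a hypothetical helper name.

```python
from typing import List


def overrun_frames(cpu_times: List[float], frame_period: float) -> List[int]:
    """Return the 1-based numbers of the frames whose CPU processing time
    exceeds the frame period, i.e. the frames whose processing fails."""
    return [number for number, time in enumerate(cpu_times, start=1)
            if time > frame_period]


FRAME_PERIOD = 1.0                       # frame period t, normalized (assumption)
baseline = [0.40, 0.45, 0.55, 0.35]      # hypothetical per-frame times, FIG. 10B case
doubled = [2 * time for time in baseline]  # roughly doubled load, FIG. 10C case

ok_case = overrun_frames(baseline, FRAME_PERIOD)
failing_case = overrun_frames(doubled, FRAME_PERIOD)
```

With these illustrative numbers only the third frame exceeds the period after doubling, which mirrors the situation the text describes for FIG. 10C.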
[0019]
Because of this failure, the third frame misses the real-time deadline and is transmitted as video data output before its processing has finished. As a result, part of the video is lost or the video is disturbed.
[0020]
As methods of speeding up the processing of multimedia data in order to solve the problem that processing is not completed within the frame period, there are an integrated multimedia board circuit that uses a PCI local bus to process audio data, video data, and graphics data simultaneously at high speed (Patent Document 1), and a bus system and an execution order adjustment method that improve the data processing speed of a system by adjusting the execution order of commands (Patent Document 2).
[0021]
[Patent Document 1]
JP-A-7-262130
[Patent Document 2]
JP 2002-41449 A
[0022]
[Problems to be solved by the invention]
Incidentally, another conceivable way of avoiding such a processing failure is simply to raise the arithmetic processing speed (operating frequency) of the CPU module 210 and thereby shorten the CPU's computation time.
[0023]
However, when the arithmetic processing speed of the CPU module is increased, the components that operate in synchronization with the CPU module, such as memories (RAM/ROM), the various data interfaces, and the buses, must also be adapted to it, and separate software development may be required. Therefore, increasing the arithmetic processing speed of the CPU has the problem that replacing and adjusting hardware and developing software incur a great deal of cost. There is the further problem that software development takes an extremely long time in some cases.
[0024]
Under these circumstances, it is not possible to easily select a means for simply increasing the arithmetic processing speed of the CPU module, and another approach is required which is less restrictive in terms of cost and time.
[0025]
Therefore, an object of the present invention is to provide a signal processing device and a signal processing method that do not cause a failure in processing of video data even if the function of a video processor is improved or the number of input channels is increased.
[0026]
[Means for Solving the Problems]
In order to solve the above-described problem, the present invention is a signal processing device comprising: processing means for executing predetermined processing in accordance with the result of updating a register with received data; a plurality of control means for generating data for the processing means and transmitting the data; a plurality of storage means, provided in correspondence with the respective control means, for temporarily storing the data transmitted from the control means; and switching means for selecting the storage means with a period at least shorter than the data transmission period of the control means and sequentially transmitting the data stored in the selected storage means to the processing means.
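As a rough illustration of the arrangement in this paragraph, the storage means can be modeled as queues and the switching means as a selector that visits them in turn. This is a sketch under assumptions, not the claimed circuit: the class and method names are hypothetical, one call to `tick` stands for one cycle of the bus toward the processing means, and returning `None` for an empty queue is a modeling choice for a cycle in which nothing is transferred.

```python
from collections import deque
from typing import Deque, List, Optional


class BusSwitch:
    """Toy model of a switch that selects the per-controller queues in turn."""

    def __init__(self, queue_count: int) -> None:
        # One first-in first-out queue per controller (the "storage means").
        self.queues: List[Deque[str]] = [deque() for _ in range(queue_count)]
        self._next = 0  # index of the queue to select on the next cycle

    def write(self, source: int, command: str) -> None:
        """A controller stores a command in its own queue."""
        self.queues[source].append(command)

    def tick(self) -> Optional[str]:
        """One output cycle: select the next queue in rotation and forward
        its oldest command, or None if that queue is currently empty."""
        queue = self.queues[self._next]
        self._next = (self._next + 1) % len(self.queues)
        return queue.popleft() if queue else None


switch = BusSwitch(2)
switch.write(0, "C1-1")
switch.write(1, "C2-1")
switch.write(0, "C1-2")
forwarded = [switch.tick() for _ in range(4)]
```

Because the selector advances every output cycle regardless of which controller wrote last, commands from the two controllers reach the processor interleaved, and a controller that has nothing queued simply yields an idle cycle.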
[0027]
Further, the present invention provides a signal processing method comprising: a processing step of executing a predetermined process in accordance with a result of updating a register with received data; a plurality of control steps of generating data for the processing step and transmitting the data; a plurality of storage steps of temporarily storing the data transmitted in the plurality of control steps in a plurality of storage means provided corresponding to each of the control steps; and a switching step of selecting among the storage means in a cycle at least shorter than the data transmission cycle of the control steps and sequentially transmitting the data stored in the selected storage means to the processing step.
[0028]
Also, the present invention provides a signal processing device comprising: processing means for executing a predetermined process according to a result of updating a register by a received command; a plurality of control means for generating data for the processing means and transmitting the data; a plurality of storage means, provided corresponding to each of the plurality of control means, for temporarily storing the data transmitted from the control means; detecting means, provided corresponding to the plurality of storage means, for detecting the number of data in each of the plurality of storage means; and switching means for selecting one of the plurality of storage means based on the number of data detected by the detecting means and sequentially transmitting the data stored in the selected storage means to the processing means in a cycle at least shorter than the data transmission cycle of the control means.
[0029]
Also, the present invention provides a signal processing method comprising: a processing step of executing a predetermined process according to a result of updating a register by a received command; a plurality of control steps of generating data for the processing step and transmitting the data; a plurality of storage steps of temporarily storing the data transmitted in the plurality of control steps in a plurality of storage means provided corresponding to each of the plurality of control steps; a detecting step, provided corresponding to the plurality of storage means, of detecting the number of data in each of the plurality of storage means; and a switching step of selecting one of the plurality of storage means based on the number of data detected in the detecting step and sequentially transmitting the data stored in the selected storage means to the processing step in a cycle at least shorter than the data transmission cycle of the control steps.
[0030]
As described above, the present invention temporarily stores data generated by a plurality of CPU modules in a plurality of memories, respectively, and selects a memory at least in a cycle shorter than a data transmission cycle by the CPU module. Since the data stored in the memory is sequentially transmitted to the processor, the load for generating the data to be transmitted to the processor is distributed to the plurality of CPU modules.
[0031]
In addition, the present invention temporarily stores data generated by a plurality of CPU modules in a plurality of memories, respectively, and selects one of the plurality of memories based on a result of detecting the number of data stored in each of the memories. Since the data stored in the selected memory is sequentially transmitted to the processor in a cycle at least shorter than the data transmission cycle of the CPU modules, the load for generating the data to be transmitted to the processor can be adaptively distributed among the plurality of CPU modules.
[0032]
[Best Mode for Carrying Out the Invention]
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 shows a configuration of an example of a digital multi-effect system to which the present invention can be applied. The digital multi-effect system 100 includes a main CPU 110 connected to a main bus 140, video processing units 170A, 170B and 170C, and a video I / F 130. An operation panel 150 is connected to the main CPU 110.
[0033]
Although not shown, the operation panel 150 includes display devices such as a plurality of monitors and input devices such as a keyboard, switches, and a joystick. The plurality of monitors respectively display images supplied from a television camera or the like and images processed by the digital multi-effect system. By operating the keyboard and switches provided on the operation panel 150 of the digital multi-effect system 100, the user specifies what editing is to be performed on the plurality of input images supplied from a television camera or the like and on image material reproduced by a VTR. In a scene where live video is provided as relay video, the user selects the necessary video from several television cameras and operates the keyboard or the like of the operation panel 150 to apply the desired processing to the selected video.
[0034]
Next, the processing of the digital multi-effect system 100 will be described in more detail with reference to FIG. Here, it is assumed that digital video data of a plurality of channels is input to the digital multi-effect system 100 from video from a television camera or recorded video material.
[0035]
Video data of a plurality of channels is input to the video I / F 130 and transferred to the video processing units 170A, 170B and 170C, respectively. The transfer of the video data is performed, for example, via a signal line 160 provided for each channel between the video I / F 130 and the video processing unit 170. In FIG. 1, only the signal line 160 is shown between the video I / F 130 and the video processing unit 170B, and the others are omitted.
[0036]
On the other hand, the user issues an editing instruction, such as what processing is to be performed on the video data of which channel, using the input device of the operation panel 150, and the instruction is transmitted to the main CPU 110. Based on the content of the editing instruction, the main CPU 110 selects a video processing unit for processing the video data from the video processing units 170A, 170B and 170C (for example, the video processing unit 170B), and controls the transfer of video data between the selected video processing unit 170B and the video I / F 130 and the processing operation on the video data in the video processing unit 170B. This control is performed, for example, by transmitting control data from the main CPU 110 to the video processing unit 170 and the video I / F 130 via the bus 140.
[0037]
Upon receiving the video data from the video I / F 130 via the signal line 160, the video processing unit 170 performs the specified processing on the video data and outputs the processed data. This output is transmitted to the video I / F 130 via the signal line 160 and output to the outside of the digital multi-effect system 100 as output video data. The transmission and reception of these video data inside the digital multi-effect system 100 are individually performed via the signal lines 160 provided for each channel as described above.
[0038]
FIG. 2 shows a configuration of an example of the video processing unit according to the first embodiment of the present invention. In FIG. 2, the video processing unit 1 corresponds to each of the video processing units 170A, 170B, and 170C shown in FIG. 1 described above. The video processor 3 is equivalent to the video processor 230 described in the above-described conventional technique, and performs processing on input video data according to a filter coefficient set in a register. As will be described later in detail, the video processor 3 can accept video data of a plurality of channels in parallel and process it.
[0039]
Since the video processing unit 1 is configured as described above, the video processing unit 1 has many interfaces common to the video processing unit 120 described in the related art. Therefore, by replacing the conventional video processing unit 120 with the video processing unit 1 of the present invention without changing the basic architecture of the software and the like, a digital multi-effect system having the effects of the present invention can be easily configured.
[0040]
The video processing unit 1 has a video processor 3 and a bus I / F 4 connected to the register bus 2, respectively, and a local bus 8 and a local bus 12 are further connected to the bus I / F 4 in parallel.
[0041]
The RAM / ROM 5, the CPU module 6, and the I / O port 7 are connected to the local bus 8. The RAM / ROM 5 stores programs and the like used by the CPU module 6, and temporarily stores data used at the time of execution as necessary. The I / O port 7 exchanges control data with the main CPU 110 in FIG. Further, control data is exchanged between the I / O port 7 and the CPU module 6 via the local bus 8.
[0042]
The local bus 12 has the same configuration as the local bus 8. That is, the RAM / ROM 9, the CPU module 10, and the I / O port 11 are connected to the local bus 12, and control data is exchanged between the I / O port 11 and the main CPU 110. Further, control data is exchanged between the I / O port 11 and the CPU module 10 via the local bus 12.
[0043]
In this example, bidirectional communication between the local bus 8 and the register bus 2 and between the local bus 12 and the register bus 2 is ensured, but no data exchange between the local bus 8 and the local bus 12 is assumed. In this example, the bus width of the register bus 2 and the local buses 8 and 12 is, for example, 32 bits. As a precondition for the system according to the present invention, the time for which the bus is occupied by register access to the video processor 3 is assumed to be shorter than the unit frame time. This is the same in a modified example of the first embodiment of the present invention and a second embodiment described later.
[0044]
Video data is input to the video processor 3 from the video I / F 130 in FIG. The video data output from the video processor 3 is supplied to the video I / F 130. Further, the filter coefficients generated by the CPU modules 6 and / or 10 are supplied to the register bus 2 via the bus I / F 4 and set in the registers of the video processor 3 in a predetermined manner. The bus I / F 4 functions as a bridge connecting the local buses 8 and 12 and the register bus 2.
[0045]
As described above, basically, no data exchange occurs between the local bus 8 and the local bus 12. Therefore, the main CPU 110 corresponds to a higher-level CPU that controls the CPU modules 6 and 10. The main CPU 110 achieves dynamic optimization between the CPU modules 6 and 10 in the video processing unit 1.
[0046]
FIG. 3 shows an example of the configuration of the video processor 3. The video processor 3 includes an arithmetic unit 300 composed of, for example, a DSP, and has a multiplexing device 310 and a separating device 320. In the example of FIG. 3, video data for four channels from the first channel to the fourth channel is independently input to the multiplexer 310. The multiplexing device 310 time-division multiplexes the video data of the first channel to the fourth channel provided in parallel and transfers the video data serially to the arithmetic unit 300.
[0047]
The arithmetic unit 300 is connected to the register bus 2, and receives the filter coefficients generated by the CPU modules 6 and 10 shown in FIG. 2 through the register bus 2. The filter coefficient is set in a register of the arithmetic unit 300 in a predetermined manner.
[0048]
The arithmetic unit 300 performs processing, for example in units of one frame or one line, on the video data of any one of the multiplexed first to fourth channels according to the content of the filter coefficient set in the register. By exchanging the filter coefficients set in the arithmetic unit 300 according to the timing at which the video data of each of the first to fourth channels is input, independent processing for each channel can be realized. The video data of each channel thus obtained is reassembled into the original channel units by the separation device 320, and is independently output as video data of the first to fourth channels.
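The time-division handling described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: each channel's unit (frame or line) is reduced to a single number, the "filter coefficient" is modeled as a scalar gain, and the names `multiplex` and `process` are hypothetical and do not appear in the specification.

```python
from typing import Dict, List, Tuple


def multiplex(channels: Dict[int, List[float]]) -> List[Tuple[int, float]]:
    """Interleave per-channel units into one serial stream (role of device 310)."""
    stream = []
    length = max(len(units) for units in channels.values())
    for i in range(length):
        for ch in sorted(channels):
            if i < len(channels[ch]):
                stream.append((ch, channels[ch][i]))
    return stream


def process(stream: List[Tuple[int, float]],
            coefficients: Dict[int, float]) -> Dict[int, List[float]]:
    """Swap in the coefficient for each unit's channel, then separate by channel."""
    out: Dict[int, List[float]] = {ch: [] for ch in coefficients}
    for ch, unit in stream:
        register = coefficients[ch]      # coefficient exchanged per time slot
        out[ch].append(unit * register)  # assumed processing: a scalar gain
    return out


channels = {1: [1.0, 2.0], 2: [10.0, 20.0], 3: [100.0, 200.0], 4: [1000.0, 2000.0]}
coefficients = {1: 0.5, 2: 1.0, 3: 2.0, 4: 0.0}
result = process(multiplex(channels), coefficients)
```

Each channel is processed with its own coefficient even though a single arithmetic unit handles the serial stream.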
[0049]
Next, the operation of the video processing unit 1 according to the first embodiment of the present invention will be described more specifically. The load of the CPU module 6 and the CPU module 10 is distributed for each function of processing video data and / or for each channel. When load distribution is performed for each channel, for example, assuming that video data of four channels is input to the video processor 3, control can be performed so that the filter coefficients for the processing performed on the first and third channels are calculated by the CPU module 6 and the filter coefficients for the processing performed on the second and fourth channels are calculated by the CPU module 10. Such load distribution control is performed by control data from the main CPU 110.
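The per-channel distribution in this example can be written as a small Python sketch. The function name `assign_channels` and the round-robin rule are assumptions made for illustration; the specification only gives the four-channel example and leaves the actual policy to the main CPU 110.

```python
from typing import Dict, List


def assign_channels(num_channels: int, cpu_ids: List[int]) -> Dict[int, List[int]]:
    """Assign channels 1..num_channels to CPU modules in round-robin order."""
    assignment: Dict[int, List[int]] = {cpu: [] for cpu in cpu_ids}
    for ch in range(1, num_channels + 1):
        cpu = cpu_ids[(ch - 1) % len(cpu_ids)]
        assignment[cpu].append(ch)
    return assignment


# Four channels shared by CPU module 6 and CPU module 10.
assignment = assign_channels(4, [6, 10])
```

With two CPU modules this yields the first and third channels for the CPU module 6 and the second and fourth channels for the CPU module 10, matching the example in the text.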
[0050]
The bus I / F 4 alternately switches the connection between the local buses 8 and 12 and the video processor 3 at a constant period, so that the instructions for the video processor 3 output from the CPU module 6 and the CPU module 10 (for example, write commands for setting filter coefficients in the register) are alternately supplied to the video processor 3. As a result, the video processor 3 alternately receives the command from the CPU module 6 and the command from the CPU module 10, sets them in the register, and repeatedly executes processing on the input video data based on the set contents of the register.
[0051]
FIG. 4 is a timing chart showing an example of a processing time for each frame according to the first embodiment. It is assumed that the processing content for each of the first to fourth frame images shown in FIG. 4 is the same as that shown in FIG. 10 described above. Here, as shown in FIG. 4A, it is assumed that one frame period is time t. FIG. 4B shows an example in which, when processing is performed on the frame images from the first frame to the fourth frame of the video data, processing such as generation of the necessary filter coefficients and register setting by the CPU modules 6 and 10 is completed within one frame period t.
[0052]
FIG. 4C assumes a case where the function of the video processor 3 is improved and the number of input channels is increased. In the first embodiment, as a result of distributing the load to the CPU modules 6 and 10 using the video processing unit 1 including the two CPU modules 6 and 10, the processing time of each of the CPU modules 6 and 10 can fit within one frame period.
[0053]
Next, the configuration of the bus I / F 4 in the first embodiment and the operation among the video processor 3, the bus I / F 4 and the CPU modules 6, 10 will be described in detail with reference to FIGS.
[0054]
FIG. 5 shows an example of the configuration of the bus I / F 4 in more detail, and shows the connection between the bus I / F 4 and the CPU modules 6 and 10 and the connection between the bus I / F 4 and the video processor 3.
[0055]
In the following, local buses 8 and 12 are buses synchronized with clocks output from CPU modules 6 and 10, respectively. The register bus 2 is also a bus synchronized with the clock. The clock of the register bus 2 is synchronized with one of the clocks of the CPU module 6 and the CPU module 10. Hereinafter, it is assumed that the clock of the register bus 2 is synchronized with the clock of the CPU module 6.
[0056]
The bus I / F 4 includes a bus switch 15, an address decoder 16, a write FIFO 17, a read FIFO 18, an address decoder 19, a write FIFO 20, a read FIFO 21, and a register bus I / F 22.
[0057]
The description starts from the CPU module 6 side. Address (1) data and a read / write command are output from the CPU module 6 and supplied to the address decoder 16 via the local bus 8. Data is exchanged between the CPU module 6 and the address decoder 16 via the local bus 8. The data output from the CPU module 6 is, for example, a filter coefficient for setting in a register of the video processor 3.
[0058]
The read / write command is interpreted by the address decoder 16 and, when the access is to the video processor 3, is written to the write FIFO 17, which is a first-in first-out (First-In First-Out) memory, together with the address (1) data and the data (1). At this time, the address (1) data is, for example, an address for specifying a register of the video processor 3.
[0059]
The CPU module 10 side is the same as the CPU module 6 side. That is, the address (2) data and the read / write command output from the CPU module 10 are supplied to the address decoder 19 via the local bus 12, and data is exchanged between the CPU module 10 and the address decoder 19 via the local bus 12. The read / write command is interpreted by the address decoder 19 and, when the access is to the video processor 3, is written to the write FIFO 20 together with the address (2) data and the data (2).
[0060]
In the bus switch 15, the address (1) data, read / write command and data (1) supplied from the write FIFO 17, and the address (2) data, read / write command and data (2) supplied from the write FIFO 20 are switched at a predetermined cycle and output alternately.
[0061]
At this time, the address data, read / write command and data read from the write FIFO 17 or 20 by the bus switch 15 are transferred to the register bus 2 in synchronization with the clock of the CPU module 6. In the bus I / F 4, the control from the local buses 8 and 12 is compressed in the time direction and output to the video processor 3. This eliminates the temporal influence of the register bus 2 on the local buses 8 and 12.
[0062]
For example, the clock frequency of the register bus 2 is set higher than the clock frequency of the CPU module 6. The bus switch 15 is alternately switched based on the clock frequency of the register bus 2, and data is alternately read from the write FIFOs 17 and 20. As a preferred example, the clock frequency of the register bus 2 is set to twice the clock frequency of the CPU module 6, and at this timing, the bus switch 15 is switched and data is alternately read from the write FIFOs 17 and 20.
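The effect of running the register bus at twice the CPU clock can be illustrated with a minimal Python simulation. It assumes, purely for illustration, that each CPU module writes exactly one command per CPU clock into its write FIFO; the function `simulate` and the parameter `bus_ratio` (register bus cycles per CPU clock) are hypothetical names.

```python
from collections import deque
from typing import List, Tuple


def simulate(cpu_cycles: int, bus_ratio: int = 2) -> Tuple[List[str], int]:
    """Return the register bus trace and the largest write FIFO depth observed."""
    fifos = [deque(), deque()]  # stand-ins for the write FIFOs 17 and 20
    bus_trace: List[str] = []
    max_depth = 0
    select = 0                  # which FIFO the bus switch reads next
    for cycle in range(cpu_cycles):
        # Assumption: each CPU module issues one command per CPU clock.
        fifos[0].append(f"C1-{cycle + 1}")
        fifos[1].append(f"C2-{cycle + 1}")
        max_depth = max(max_depth, len(fifos[0]), len(fifos[1]))
        # The register bus runs bus_ratio cycles per CPU clock, alternating FIFOs.
        for _ in range(bus_ratio):
            fifo = fifos[select]
            bus_trace.append(fifo.popleft() if fifo else "NOP")
            select = 1 - select
    return bus_trace, max_depth


bus_trace, max_depth = simulate(3)          # register bus at 2x the CPU clock
_, depth_same_clock = simulate(4, bus_ratio=1)  # register bus at the CPU clock
```

Under these assumptions the doubled clock drains both FIFOs every CPU cycle, so the depth stays at one entry, whereas at the same clock rate the FIFOs keep growing.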
[0063]
The address data, read / write command and data output from the bus switch 15 are supplied to the register bus I / F 22. The address data, read / write command and data are supplied from the register bus I / F 22 to the video processor 3 via the register bus 2. For example, if the command is a write command, the video processor 3 sets data (filter coefficient) in a register specified by the address data.
[0064]
If the command supplied to the video processor 3 is a read command, for example, data is read from the register specified by the address from the video processor 3 and supplied to the register bus I / F 22 via the register bus 2. This data is supplied from the register bus I / F 22 to the bus switch 15, is switched in a predetermined manner, and is written to the read FIFO 18 as, for example, data (1). The data (1) is read from the read FIFO 18 and supplied to the CPU module 6 via the address decoder 16 and the local bus 8.
[0065]
Similarly, on the CPU module 10 side, the data is output from the bus switch 15 as data (2) and written to the read FIFO 21. The data (2) is read from the read FIFO 21 and supplied to the CPU module 10 via the address decoder 19 and the local bus 12.
[0066]
In the above description, the bus switch 15, the register bus I / F 22, and the video processor 3 operate in synchronization. On the other hand, the set of the CPU module 6 and the address decoder 16 and the set of the CPU module 10 and the address decoder 19 can be asynchronous with each other. Commands and the like transmitted asynchronously from the set of the CPU module 6 and the address decoder 16 and the set of the CPU module 10 and the address decoder 19 are adjusted in timing using the write FIFOs 17 and 20 and supplied to the set of the bus switch 15, the register bus I / F 22, and the video processor 3.
[0067]
In the example of FIG. 5, a FIFO is used as a queue for temporarily storing read / write commands and the like. In the case where processing is performed in the order of requests as in the first embodiment, a first-in first-out method such as FIFO is generally used, but another method is used depending on the update order and other circumstances. It is also possible.
[0068]
When a read command is issued from the CPU module 6, the address (1) data specifies, for example, a register of the video processor 3. As read command response data, data read from the register specified by the address (1) data of the video processor 3 is written to the read FIFO 18 via the register bus I / F 22 and the bus switch 15. After that, the response data is supplied to the CPU module 6 via the address decoder 16. Such a read command is, for example, an instruction to read frequently updated FPGA version information or an instruction to sample a part of video data to be processed. Note that the bus switch 15 enters a hold state after transmitting a read command to the video processor 3.
[0069]
FIG. 6 is an example timing chart showing the transfer timing of the read / write commands on the local buses 8 and 12 and the register bus 2. FIGS. 6A and 6B show the timing on the local buses 8 and 12, respectively, and FIG. 6C shows the timing on the register bus 2.
[0070]
In FIG. 6, the write commands C1-1 and C1-2 from the local bus 8 and the write commands C2-1 and C2-2 and the read command C2-3 from the local bus 12 are written to the write FIFOs 17 and 20, and are then time-compressed and output in order to the register bus 2. As described above, when a read command such as the read command C2-3 occurs, a wait signal is output to the local bus 12 side, and the operation of the bus switch 15 is held.
[0071]
The read command C2-3 is processed by the video processor 3. Response data obtained as a result of the processing by the read command C2-3 is taken out from the video processor 3, and is passed through the register bus 2, the register bus I / F 22, the bus switch 15, the read FIFO 21, the address decoder 19, and the local bus 12. The data is transferred to the CPU module 10.
[0072]
Until response data to the read command C2-3 appears on the register bus 2, read / write commands from the local bus 8 are accumulated in the write FIFO 17. When the response data to the read command C2-3 appears on the register bus 2, the above-mentioned hold of the operation of the bus switch 15 is released. Thereafter, the read / write commands are sequentially transferred from the write FIFOs 17 and 20 to the register bus 2.
[0073]
At this time, the minimum required depth (capacity) of the write FIFOs 17 and 20 can be estimated from the minimum command cycle on the register bus 2 side, the occurrence frequency of read commands and write commands from the local buses 8 and 12, and the command cycle on the local buses 8 and 12.
[0074]
Here, the processing timing in the register bus 2 will be described with reference to the timing chart of FIG. 6C. At the time of the first command cycle CS1 of the register bus 2, the transfer of the write commands (C1-1, C2-1), the addresses, and the data has not been completed on either the local bus 8 or the local bus 12, so no processing is performed (no operation (NOP)). In the next command cycle CS2, the write command C1-1 is transferred from the local bus 8 and the write command C2-1 is transferred from the local bus 12, and they are written to the write FIFOs 17 and 20 together with the addresses and data. Assuming that the bus switch 15 first supplies the command of the write FIFO 20 to the register bus 2 at this point in time, the write command C2-1 is transferred to the register bus 2, as shown in the timing chart.
[0075]
In the next command cycle CS3, the bus switch 15 is switched, and the write command C1-1 is transferred from the write FIFO 17 to the register bus 2 together with the address (1) data and the data (1).
[0076]
In the command cycle CS4, as in the case of CS2, the commands C1-2 and C2-2 are stored in the write FIFOs 17 and 20, and the command C2-2 is transferred to the register bus 2 together with the corresponding address data and data. The same processing is repeated thereafter.
[0077]
When the register bus 2 receives the read command in the command cycle CS6, the operation of the register bus 2 is held as described above, so that the register bus 2 does not receive a new command until a response to the read command is received from the video processor 3.
[0078]
When a response to the read command is received from the video processor 3, the write FIFO 17 stores the write commands C1-3, C1-4, and C1-5, and the write FIFO 20 does not store anything. Here, in the command cycle CS7, since the transfer is performed from the write FIFO 17, the write command C1-3 and the address data are provided to the register bus 2. In the next command cycle CS8, the bus switch 15 attempts to perform the transfer from the write FIFO 20 to the register bus 2, but at this point, the transfer of the command C2-4, the address, and the data from the local bus 12 has not been completed. Therefore, the register bus 2 becomes NOP.
[0079]
After the command cycle CS9, a write command, an address, and data are alternately transferred from the write FIFOs 17, 20 to the register bus 2.
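The hold behavior in this walkthrough can be sketched in Python as follows. This is a simplified model, not a reproduction of FIG. 6C: the read response latency (`read_latency`, in register bus cycles), the function name `run_with_hold`, and the `("W", name)` / `("R", name)` command encoding are all assumptions introduced for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple

Command = Tuple[str, str]  # ("W" or "R", command name); hypothetical encoding


def run_with_hold(fifos: List[Deque[Command]], read_latency: int,
                  cycles: int) -> List[str]:
    """Alternate between write FIFOs, holding the switch after a read command."""
    trace: List[str] = []
    select = 0
    hold = 0
    for _ in range(cycles):
        if hold > 0:                 # waiting for the read response
            hold -= 1
            trace.append("HOLD")
            continue
        fifo = fifos[select]
        select = (select + 1) % len(fifos)
        if not fifo:                 # selected FIFO has nothing ready
            trace.append("NOP")
            continue
        kind, name = fifo.popleft()
        trace.append(name)
        if kind == "R":              # a read command holds the bus switch
            hold = read_latency
    return trace


fifos = [
    deque([("W", "C1-1"), ("W", "C1-2")]),  # stand-in for write FIFO 17
    deque([("W", "C2-1"), ("R", "C2-3")]),  # stand-in for write FIFO 20
]
trace = run_with_hold(fifos, read_latency=2, cycles=7)
```

Commands arriving from the other local bus during the hold would simply accumulate in its write FIFO until the hold is released.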
[0080]
In the above-described first embodiment of the present invention, the configuration has been described as having two CPU modules 6 and 10. However, the present invention is not limited to this example. For example, a configuration in which three or more CPU modules are provided is also possible.
[0081]
FIG. 7 shows an example of a configuration of a video processing unit having three or more CPU modules according to a modification of the first embodiment described above. The bus I / F 25 is connected to three or more local buses 24A, 24B, 24C,..., Which differs from the configuration of the video processing unit 1 shown in FIG. A plurality of CPU modules 22A, 22B, 22C,... are provided for the video processor 3 for which processing is to be specified, and the local buses 24A, 24B, 24C,... to which these CPU modules 22A, 22B, 22C,... are respectively connected and the video processor 3 are connected by the bus I / F 25. Note that the video processor 3 may be the same as that used in the first embodiment.
[0082]
Like the bus I / F 4 described above, the bus I / F 25 has an address decoder, a write FIFO, and a read FIFO corresponding to each of the local buses 24A, 24B, 24C,..., and a bus switch for switching among the write FIFOs and read FIFOs corresponding to the local buses 24A, 24B, 24C,.... The bus switch is connected to the video processor 3 via the register bus 26.
[0083]
The bus I / F 25 periodically switches the connection between each FIFO corresponding to each of the local buses 24A, 24B, 24C,... and the register bus 26 by the above-described bus switch (not shown), and supplies the data stored in each FIFO evenly to the video processor 3. At the same time, the command cycle is temporally compressed so that the timing is optimal on the register bus 26.
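Periodic switching over three or more FIFOs amounts to a fixed-order round robin, which can be sketched in Python as follows. The function name `round_robin` and the command labels are hypothetical, and the sketch assumes the FIFO contents are already present rather than arriving over time.

```python
from collections import deque
from itertools import cycle
from typing import Deque, List


def round_robin(fifos: List[Deque[str]]) -> List[str]:
    """Visit the FIFOs in fixed order, one per register bus cycle, until all are empty."""
    order = cycle(range(len(fifos)))
    trace: List[str] = []
    while any(fifos):                # an empty deque is falsy
        fifo = fifos[next(order)]
        trace.append(fifo.popleft() if fifo else "NOP")
    return trace


# Stand-ins for the write FIFOs of the local buses 24A, 24B and 24C.
fifos = [deque(["A1", "A2"]), deque(["B1"]), deque(["C1", "C2"])]
trace = round_robin(fifos)
```

A slot whose FIFO is empty is spent as a NOP, which is the cost of purely periodic switching.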
[0084]
The FIFOs and the bus switch in the bus I / F 25 are configured to support bidirectional data switching as described with reference to FIG. Thus, the read / write operations described with reference to FIG. 5 can be performed from the CPU modules 22A, 22B, 22C,....
[0085]
In the modification of the first embodiment, if the switching cycle of the bus switch of the bus I / F 25 and the clock period of the register bus 26 are shortened, and the clock frequency of the video processor 3 is increased accordingly, more CPU modules can be connected.
[0086]
Next, a second embodiment of the present invention will be described with reference to FIG. In the second embodiment, the switching method of the bus switch is dynamic rather than periodic as in the first embodiment. This configuration controls the switching of the bus switch 15 in accordance with the number of read / write commands from the local bus side stored in the FIFOs, and attempts to equalize the waiting time of the read / write commands between the two local buses.
[0087]
FIG. 8 shows an example of the configuration of the bus I / F 4 ′ in the second embodiment in more detail, together with the connection between the bus I / F 4 ′ and the CPU modules 6 and 10 and the connection between the bus I / F 4 ′ and the video processor 3. Note that, in FIG. 8, the same reference numerals are given to portions common to FIG. 5 described above, and detailed description will be omitted. The configuration according to the second embodiment is basically the same as the configuration including the CPU modules 6 and 10, the bus I / F 4, and the video processor 3 illustrated in FIG. 5 in the first embodiment.
[0088]
In the second embodiment, request counters 33 and 37 are newly provided in a bus I / F 4 ′ corresponding to the bus I / F 4 in FIG. The request counters 33 and 37 sequentially detect the number of data held in the write FIFOs 17 and 20, respectively.
[0089]
For example, when transferring a read / write command from the address decoder 16 to the write FIFO 17, the content of the request counter 33 is counted up by, for example, one. On the other hand, when the read / write command is transferred from the write FIFO 17 to the register bus 2, the content of the request counter 33 is reduced by, for example, one. Similarly, when transferring a read / write command from the address decoder 19 to the write FIFO 20, the content of the request counter 37 is counted up by, for example, one. When the read / write command is transferred from the write FIFO 20 to the register bus 2, the content of the request counter 37 is reduced by, for example, one. In this way, the number of data in the write FIFOs 17 and 20 can be sequentially detected using the request counters 33 and 37.
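The counting scheme described here can be modeled with a small Python class. The class name `CountedFifo` and its methods are hypothetical; the sketch simply pairs a queue with a counter that is incremented on each transfer into the FIFO and decremented on each transfer out to the register bus.

```python
from collections import deque


class CountedFifo:
    """A write FIFO paired with a request counter (illustrative model only)."""

    def __init__(self) -> None:
        self._queue: deque = deque()
        self.request_count = 0  # role of the request counter 33 or 37

    def push(self, command: str) -> None:
        """Transfer from the address decoder into the FIFO: count up by one."""
        self._queue.append(command)
        self.request_count += 1

    def pop(self) -> str:
        """Transfer from the FIFO to the register bus: count down by one."""
        command = self._queue.popleft()
        self.request_count -= 1
        return command


fifo17 = CountedFifo()
fifo17.push("W1")
fifo17.push("W2")
first = fifo17.pop()
```

After two pushes and one pop the counter reads one, which equals the number of commands still waiting.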
[0090]
Of course, the number of data in the write FIFOs 17 and 20 may be detected by another method. For example, the write FIFOs 17 and 20 can be accessed based on a clock to detect the number of written data.
[0091]
In the first embodiment described above, the bus switch 15 simply switches the connection between the plurality of write FIFOs 17 and 20 and the register bus 2 periodically. In the second embodiment, the switching timing of the bus switch 15 ' is dynamically controlled based on the detection result of the number of data in the write FIFOs 17 and 20 by the request counters 33 and 37 described above. As a result, the number of commands waiting on the local buses 8 and 12 is equalized.
[0092]
More specifically, the value of the request counter 33 (referred to as a count value (1)) and the value of the request counter 37 (referred to as a count value (2)) are read and compared. As a result of the comparison, when the count value (1) >= the count value (2), the bus switch 15 ' is controlled so as to connect the write FIFO 17 and the register bus I / F 22 '. Otherwise, the bus switch 15 ' is controlled so as to connect the write FIFO 20 and the register bus I / F 22 '. The count value (1) and the count value (2) mean the number of waiting read / write commands in the write FIFO 17 and the write FIFO 20, respectively.
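The comparison rule can be sketched in Python as follows. The sketch assumes that the write FIFO 20 is selected whenever the count value (1) is smaller than the count value (2), uses the queue lengths directly as the count values, and ignores the hold on read commands; `select_fifo` and `drain` are hypothetical names.

```python
from collections import deque
from typing import Deque, List


def select_fifo(count1: int, count2: int) -> int:
    """Return 1 (write FIFO 17) when count (1) >= count (2), else 2 (write FIFO 20)."""
    return 1 if count1 >= count2 else 2


def drain(fifo17: Deque[str], fifo20: Deque[str]) -> List[str]:
    """Transfer commands to the register bus, always serving the fuller FIFO."""
    trace: List[str] = []
    while fifo17 or fifo20:
        if select_fifo(len(fifo17), len(fifo20)) == 1:
            trace.append(fifo17.popleft())
        else:
            trace.append(fifo20.popleft())
    return trace


# Example backlog: three commands waiting on one side, one on the other.
trace = drain(deque(["C1-3", "C1-4", "C1-5"]), deque(["C2-4"]))
```

Because the fuller FIFO is always served first, no register bus cycle is spent as a NOP while either FIFO still holds a command.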
[0093]
When the read command is received, the bus switch 15 'is held as in the case of the first embodiment.
[0094]
Here, the bus switch 15 'itself can be programmed to read the above-mentioned count value (1) and count value (2) and compare them to determine the connection destination. This is not limited to this example, and it is also possible to configure such that such a connection control process is performed by the main CPU 110 shown in FIG.
[0095]
In the above description, the present invention has been described as applied to the video processing unit of a digital multi-effect system, but the present invention is not limited to this example. It is not restricted to configurations that process frame images, and can be applied widely to processors that receive a signal as input and perform given processing on it.
[0096]
[Effects of the invention]
According to the present invention, the calculation and generation of parameters for performing special effect processing and the like on an image can be distributed among a plurality of CPU modules, so that the processing can be completed within one frame period even when the load on the CPU modules becomes higher.
[0097]
Further, when the video processing unit according to the present invention is modified, CPU modules of the same performance are added and the conventional video processor is used as it is. Therefore, even when the function of the video processor is improved or the number of input channels is increased, the processing speed of the entire system can be improved without changing the basic architecture of the software. As a result, the function of the video processor can be improved economically and in a short time.
[0098]
Further, according to the second embodiment of the present invention, the switching timing of the bus switch can be controlled adaptively according to requests from the plurality of CPU modules, so that more efficient processing can be performed.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an example of a digital multi-effect system to which the present invention can be applied.
FIG. 2 is a block diagram illustrating a configuration of an example of a video processing unit according to the first embodiment of the present invention.
FIG. 3 is a block diagram illustrating a configuration of an example of a video processor.
FIG. 4 is a timing chart illustrating an example of a processing time for each frame according to the first embodiment.
FIG. 5 is a schematic diagram showing an example of the configuration of a bus I/F in more detail, together with the connection between the bus I/F and two CPU modules and the connection between the bus I/F and a video processor.
FIG. 6 is a timing chart showing an example of a transfer timing of a read / write command on two local buses and a register bus.
FIG. 7 is a block diagram illustrating a configuration of an example of a video processing unit having three or more CPU modules according to a modification of the first embodiment.
FIG. 8 is a schematic diagram showing an example of the configuration of a bus I/F according to the second embodiment in more detail, together with the connection between the bus I/F and two CPU modules and the connection between the bus I/F and a video processor.
FIG. 9 is a block diagram showing a configuration of an example of a video processing unit according to a conventional technique.
FIG. 10 is an example timing chart showing the processing time of the CPU module for each frame.
[Explanation of symbols]
1: video processing unit; 2: register bus; 3: video processor; 4: bus I/F; 5, 9: RAM/ROM; 6, 10: CPU module; 7, 11: I/O port; 8, 12: local bus; 17, 20: write FIFO

Claims

A signal processing device comprising:
processing means for executing predetermined processing in accordance with the result of updating a register with received data;
a plurality of control means for generating the data for the processing means and transmitting the data;
a plurality of storage means, provided corresponding to the respective control means, for temporarily storing the data transmitted from the control means; and
switching means for selecting the storage means in a cycle shorter than at least the data transmission cycle of the control means, and sequentially transmitting the data stored in the selected storage means to the processing means.

The signal processing device according to claim 1, wherein the processing means performs predetermined signal processing on a video signal.

The signal processing device according to claim 1, wherein the switching means compresses the data in a time direction and transmits the data to the processing means.

The signal processing device according to claim 1, wherein the plurality of control means transmit data asynchronously with each other.

The signal processing device according to claim 1, wherein transmission of data to the processing means by the switching means is synchronized with a data transmission cycle of at least one of the plurality of control means.

The signal processing device according to claim 1, wherein the data includes a read command, and when the switching means transmits the read command to the processing means, the switching means stops the periodic selection of the storage means and keeps the storage means in which the read command was stored fixedly connected to the processing means until a response message to the read command is received from the processing means.

A signal processing method comprising:
a processing step of executing predetermined processing in accordance with the result of updating a register with received data;
a plurality of control steps of generating the data for the processing step and transmitting the data;
a plurality of storage steps of temporarily storing the data transmitted in the plurality of control steps in a plurality of storage means provided corresponding to the respective control steps; and
a switching step of selecting the storage means in a cycle shorter than at least the data transmission cycle of the control steps, and sequentially transmitting the data stored in the selected storage means to the processing step.

A signal processing device comprising:
processing means for executing predetermined processing in accordance with the result of updating a register with a received command;
a plurality of control means for generating data for the processing means and transmitting the data;
a plurality of storage means, provided corresponding to the respective control means, for temporarily storing the data transmitted from the control means;
detection means, provided corresponding to the plurality of storage means, for detecting the number of data in the plurality of storage means; and
switching means for selecting one of the plurality of storage means based on the number of data detected by the detection means, and sequentially transmitting the data stored in the selected storage means to the processing means in a cycle shorter than at least the data transmission cycle of the control means.

The signal processing device according to claim 8, wherein the switching means selects, from the plurality of storage means, the storage means having the largest number of data detected by the detection means.

The signal processing device according to claim 8, wherein the processing means performs predetermined signal processing on a video signal.

The signal processing device according to claim 8, wherein the switching means compresses the data in a time direction and transmits the data to the processing means.

The signal processing device according to claim 8, wherein the plurality of control means transmit data asynchronously with each other.

The signal processing device according to claim 8, wherein transmission of data to the processing means by the switching means is synchronized with a data transmission cycle of at least one of the plurality of control means.

The signal processing device according to claim 8, wherein the data includes a read command, and when the switching means transmits the read command to the processing means, the switching means stops the periodic selection of the storage means and keeps the storage means in which the read command was stored fixedly connected to the processing means until a response message to the read command is received from the processing means.

A signal processing method comprising:
a processing step of executing predetermined processing in accordance with the result of updating a register with a received command;
a plurality of control steps of generating data for the processing step and transmitting the data;
a plurality of storage steps of temporarily storing the data transmitted in the plurality of control steps in a plurality of storage means provided corresponding to the respective control steps;
a detection step of detecting the number of data in the plurality of storage means; and
a switching step of selecting one of the plurality of storage means based on the number of data detected in the detection step, and sequentially transmitting the data stored in the selected storage means to the processing step in a cycle shorter than at least the data transmission cycle of the control steps.