JP3934772B2

JP3934772B2 - Variable transfer rate encoding method and apparatus

Info

Publication number: JP3934772B2
Application number: JP03637698A
Authority: JP
Inventors: 隆幸菅原
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1998-02-18
Filing date: 1998-02-18
Publication date: 2007-06-20
Anticipated expiration: 2018-02-18
Also published as: JPH11234676A

Description

【０００１】
【発明の属する技術分野】
本発明は、動画像信号（ビデオ信号）を符号化するビデオ信号符号化方法及びそれに対応するビデオ信号符号化装置、特に直交変換と量子化を使用した符号化を行うビデオ信号符号化方法および装置に適用可能なものであって、特に符号化した符号化データを固定転送レートで一時記録した後に可変レート符号化データへ変換（再符号化）することを特徴とする可変転送レート符号化方法および装置に関する。
【０００２】
【従来の技術】
従来の可変転送レート符号化技術の一例として特開平７−２８４０９７号公報に記載の技術によると、ビデオ信号を第１のパスと第２のパスに分けてそれぞれ符号化し、第１のパスでは第２のパスの符号化に必要な情報を生成して出力するようにしている。なお、このときの符号化方式にはいわゆるＭＰＥＧなどの方式が使われる。
【０００３】
ＭＰＥＧについては、ＩＳＯ−ＩＥＣ１１１７２−２、ＩＴＵ−ＴＨ．２６２／ＩＳＯ−ＩＥＣ１３８１８−２に詳細な説明がなされているので、ここでは概略のみ説明する。
【０００４】
ＭＰＥＧは１９８８年、ＩＳＯ／ＩＥＣＪＴＣ１／ＳＣ２（国際標準化機構／国際電気標準化会合同技術委員会１／専門部会２，現在のＳＣ２９）に設立された動画像符号化標準を検討する組織の名称（Moving Pictures Expert Group）の略称である。ＭＰＥＧ１（ＭＰＥＧフェーズ１）は１．５Ｍｂｐｓ程度の蓄積メディアを対象とした標準で、静止画符号化を目的としたＪＰＥＧ（Joint Photographic Coding Experts Group）と、ＩＳＤＮ（Integrated Services Digital Network：統合サービスディジタル通信網）のテレビ会議やテレビ電話の低転送レート用の動画像圧縮を目的としたＨ．２６１（ＣＣＩＴＴＳＧＸＶ、現在のＩＴＵ−ＴＳＧ１５で標準化）の基本的な技術を受け継ぎ、蓄積メディア用に新しい技術を導入したものである。これらは１９９３年８月、ＩＳＯ／ＩＥＣ１１１７２として成立している。
【０００５】
ＭＰＥＧ１は、幾つかの技術を組み合わせて作成されている。
【０００６】
入力画像信号からは、動き補償器で復号化した画像信号と当該入力画像信号との差分を取ることで時間冗長部分が削減される。
【０００７】
予測の方法は、基本的なモードとして、過去の画像からの予測を行うモードと、未来の画像からの予測を行うモードと、過去と未来の両方の画像からの予測を行うモードとの３モードが存在する。またこれらのモードは、１６画素×１６画素のマクロブロック（ＭＢ：Macro Block）毎に切り替えて使用できる。予測方法は、入力画像に与えられたピクチャタイプ（Picture＿Type）によって決定される。過去の画像から予測を行って符号化するモードと予測をしないでそのマクロブロックを独立に符号化するモードとの２つのモードが存在するのが、片方向ピクチャ間予測符号化画像（Ｐピクチャ：P-picture）である。また、未来の画像からの予測を行うモードと、過去の画像からの予測を行うモードと、
過去と未来の両方の画像からの予測を行うモードと、予測をしないで独立に符号化するモードの４つのモードが存在するのが、双方向ピクチャ間予測符号化画像（Ｂピクチャ：B-Picture）である。そして、全てのマクロブロックを独立に符号化するのが、ピクチャ内独立符号化画像（Ｉピクチャ：I-picture）である。なお、ピクチャ内独立符号化画像はイントラピクチャと呼ばれ、このため、片方向ピクチャ間予測符号化画像と双方向ピクチャ間予測符号化画像は非イントラピクチャということができる。
【０００８】
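In motion compensation, pattern matching is performed on the motion region for each macroblock to detect a motion vector with half-pel accuracy, and prediction is performed after shifting by the amount of the detected motion vector. A motion vector has horizontal and vertical components, and it is transmitted as additional information of the macroblock together with the MC (Motion Compensation) mode, which indicates where the prediction comes from.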
【０００９】
図８には、ＭＰＥＧ１が適用されるビデオ信号符号化装置の基本的な構成を示している。
【００１０】
この図８において、入力端子１０１には入力画像信号が供給され、この入力画像信号は演算器１０２と後述する動き補償予測器１１１に送られる。
【００１１】
演算器１０２では、動き補償予測器１１１にて復号化した画像信号と入力画像信号との差分が求められ、その差分画像信号がＤＣＴ器１０３に送られる。
【００１２】
ＤＣＴ器１０３では、供給された差分画像信号を直交変換する。ここでＤＣＴ（Discrete Cosine Transform）とは、余弦関数を積分核とした積分変換を有限空間への離散変換とする直交変換である。ＭＰＥＧではマクロブロックを４分割した８×８のＤＣＴブロックに対して、２次元ＤＣＴを行う。なお、一般に、ビデオ信号は低域成分が多く、高域成分が少ないため、ＤＣＴを行うと係数が低域に集中する。
【００１３】
ＤＣＴ器１０３でのＤＣＴによって得られたデータ（ＤＣＴ係数）は、量子化器１０４で量子化が行われる。この量子化器１０４における量子化では、量子化マトリックスという８×８の２次元周波数を視覚特性で重み付けした値と、その全体をスカラー倍する量子化スケールという値で乗算した値とを量子化値として、ＤＣＴ係数をその量子化値で除算する。
【００１４】
なお、当該ビデオ信号符号化装置にて符号化された後の符号化データを、後に図示しないビデオ信号復号装置（デコーダ）で復号して逆量子化するときは、そのビデオ信号符号化装置にて使用した量子化値で乗算を行うことにより、元のＤＣＴ係数に近似している値を得ることができる。
【００１５】
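The data quantized by the quantizer 104 is variable-length coded by the VLC unit 105. Of the quantized values, the VLC unit 105 codes the direct current (DC) component using DPCM (Differential Pulse Code Modulation), a form of predictive coding. For the alternating current (AC) components, it performs a so-called zigzag scan from low to high frequencies, treats each pair of zero run length and nonzero coefficient value as one event, and applies so-called Huffman coding, which assigns shorter codes to events with a higher probability of occurrence.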
【００１６】
ＶＬＣ器１０５にて可変長符号化されたデータは、一時、バッファメモリ１０６に蓄えられた後、このバッファメモリ１０６から所定の転送レートで読み出され、符号化データ（符号化ビットストリーム）として出力端子１０７から出力される。
【００１７】
また、その出力される符号化データにおけるマクロブロック毎の発生符号量は、後述するバッファ管理器１１４を介して符号量制御器１１３に送信される。この符号量制御器１１３は、マクロブロック毎の発生符号量と目標符号量との差分を求め、当該差分に応じた符号量制御信号を生成して量子化器１０４にフィードバックすることにより、発生符号量制御を行う。当該符号量制御のために量子化器１０４にフィードバックされる符号量制御信号は、量子化器１０４における量子化スケールを制御するための信号である。具体的な符号量制御の方法については後述の通りである。
【００１８】
一方、量子化された画像データは、逆量子化器１０８に送られ、ここで逆量子化される。
【００１９】
さらに、この逆量子化により得られたＤＣＴ係数データは、逆ＤＣＴ器１０９に送られて逆ＤＣＴされた後、演算器１１２にて動き補償予測器１１１からの予測差分画像が加算されて画像信号が復元される。
【００２０】
この復元された画像信号は、一時、画像メモリ１１０に蓄えられた後、動き補償予測器１１１に送られる。画像メモリ１１０から動き補償予測器１１１に送られた画像信号は、
演算器１０２にて差分画像を計算するためのリファレンスの復号化画像を生成するために使用される。
【００２１】
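The encoded bit stream, which is the encoded data output from the output terminal 107, has a variable code amount per picture in the case of a video signal. This is because MPEG uses information transforms such as DCT, quantization, and Huffman coding, and also because the code amount allocated to each picture is changed adaptively to improve image quality. That is, since MPEG performs motion compensated prediction, the input image signal is sometimes coded as is and sometimes the difference image signal between the predicted image and the input image signal is coded, so the entropy of the coded image itself varies greatly. In this case, code amount control is usually performed by allocating code in proportion to the image entropy while observing the capacity limit of the buffer memory.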
【００２２】
したがってバッファ管理器１１４は、符号化により発生した符号量と、使用可能な符号化レートとの関係を監視し、バッファメモリ１０６において所定のバッファ容量内に収まるように目標符号量を設定する。
【００２３】
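Information corresponding to the difference between this target code amount and the actual generated code amount is fed back to the variable length encoder 105 and enters the code amount controller 113. The code amount controller 113 generates a code amount control signal that either raises the quantization scale set in the quantizer 104 to suppress the generated code amount or, conversely, lowers the quantization scale to increase the generated code amount.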
【００２４】
上述のように、可変長データを固定の転送レートのデータ（固定転送レート符号化データ）として転送する場合、そのデータの受信側となるビデオ信号復号装置側の最大バッファ量が、その送信側のビデオ信号符号化装置における発生符号量の上限値となる。すなわち、一定速度で符号化データが入力されて、所定の値だけ蓄積されたところから、所定の時刻（テレビジョン標準放送方式の一つであるＮＴＳＣ方式のビデオ信号なら１／２９．９７ｓｅｃ単位）で復号化を一瞬で行う仮想的な復号器のモデルを使用し、そのモデルの仮想バッファメモリ（いわゆるＶＢＶバッファ）にオーバーフローもアンダーフローも発生しないように、符号化装置側で符号化することがＭＰＥＧで規定されている。この規定を守っていればＶＢＶバッファ内でのレートは局部的に変化しているものの、観測時間を長く取れば固定の転送レートとなり、ＭＰＥＧではこのことを固定レートであると定義している。
【００２５】
ここで、上述したように定義された固定転送レートの場合において、符号化装置側で発生符号量が少ないときには、復号装置側ではバッファ占有量が上限値に張り付いた状態になる。この場合、例えば無効ビットを追加してバッファ（ＶＢＶバッファ）がオーバーフローしないように、符号化装置側において符号量を増やさなければならない。
【００２６】
一方、可変転送レートの場合には、この固定転送レートの定義を拡張して、バッファ占有率が上限値になったときに、復号装置における読み出しを中止することにより、原理的にオーバーフローが起きないように定義されている。したがって、可変転送レートの場合において、仮に非常に発生符号量が少なくても、復号装置の読み出しが中止されるので、固定転送レートの時のように無効ビットを入れる必要はない。このため、可変転送レートの場合にはアンダーフローだけが発生しないように符号化する。
【００２７】
このような技術を背景にし、従来の可変転送レート符号化においてビデオ信号を第１のパスと第２のパスに分けて符号化し、第１のパスでは第２のパスの符号化に必要な情報を出力するような技術の説明を行う。
【００２８】
図９には、従来例の第１のパスの符号化を行うための構成を示す。なお、この図９において、図８に示した基本構成の各構成要素と同様に動作する部分にはそれぞれ同じ指示符号を付加してそれらの説明は省略する。
【００２９】
この図９に示す第１のパスの構成では、入力端子１０１にビデオ信号が再生入力される。その際、符号化情報には、全ビデオシーケンスに対して短区間ごとに発生する発生符号量情報などが付加される。
【００３０】
この発生符号量は、ＶＬＣ器１０５でのＶＬＣ後に、符号量カウンタ１２１にて計算され、記憶回路１２２に送られて記憶される。
【００３１】
記憶回路１２２は、例えばハードディスクや光ディスクなど、高速のストレージメディアなど何でもよい。
【００３２】
なお、当該第１のパスは、正確な圧縮が行われたか否かをモニタする程度に用いられるので、この図９には図示しているが、バッファメモリ１０６とその後の符号化データの出力は、符号量をカウントできれば必ずしも必要ない。
【００３３】
ところで、ＭＰＥＧに代表されるような符号化方式では、可変長符号化を行っているので量子化幅を固定にして第１のパスの符号化を行うと、符号化画像の複雑さや、動き補償の差分（残差成分）量に応じて発生符号量が多くなる。
【００３４】
したがって、この性質を利用して、発生符号量の配分を行うようにすれば、画質をほぼ均一にすることが可能となる。なお、後述する第２のパスの符号化では、その符号量配分比率を保つと同時に、発生符号量を全体の目標符号量に制御しなければならない。
【００３５】
発生符号量は、第１のパスの符号化で発生した短区間単位で検出し、その情報を記憶回路１２２に記憶する。短区間の例としては、ピクチャ内独立符号化ごとに区切ることができ、約１５ピクチャ程度の１ＧＯＰ（グループオブピクチャ）が考えられる。この場合、
各ＧＯＰ単位にどの位の発生符号量であったかが記憶回路１２２に記憶される。
【００３６】
第１のパスの符号化では、一般的に量子化幅を小さめで且つ固定にして、第２パスで出力される最終的な符号量より多くの符号量を発生させるのが普通である。このように、第１のパスの符号化において量子化幅を小さくするのは、画像の高周波成分まで細かく情報を分解し、その画像の特性を検出する必要があるからである。
【００３７】
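Here, let PS1B(i) be the code amount generated in the i-th short interval in the first pass. The second pass keeps the ratio among these per-interval code amounts PS1B(i) almost the same, and the final total target code amount is distributed in that ratio to give the target code amount of each short interval in the second pass.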
【００３８】
例えば、短区間を１ＧＯＰとすると、以下の様な方法で画質をある程度保ちながら符号を制御することができる。
【００３９】
次に、図１０には、従来例の第２のパスの符号化を行うための構成を示す。なお、この図１０において、図８に示した基本構成の各構成要素と同様に動作する部分にはそれぞれ同じ指示符号を付加してそれらの説明は省略する。
【００４０】
この図１０に示す第２のパスの構成において、発生符号量を増加させるには、量子化スケールを小さくし、一方、発生符号量を減少させるには量子化スケールを大きくする。
【００４１】
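Using this principle, one possible method controls the quantization width based on, for example, the occupancy of the buffer memory 106. The point to note is that a picture coded by unidirectional inter-picture prediction (P picture) is predicted from the immediately preceding intra-picture independently coded picture (I picture) or P picture, and a bidirectional inter-picture predictive coded picture (B picture) is predicted from the I and P pictures on both sides of it in time. As a result, if for example an I picture is degraded, the other pictures are degraded along with it.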
【００４２】
以下の方式例では、これらのピクチャに対する符号量配分を考慮しながら全体の符号量制御を実現している。
【００４３】
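First, the target code amount determination circuit 124 uses the generated code amounts PS1B(i) obtained in the first pass, read from the storage circuit 122, to obtain the target code amount PS2B(i) of each GOP as follows.
$$\mathrm{PS2B}(i) = (\text{final total target code amount}) \times \frac{\mathrm{PS1B}(i)}{\sum_{k} \mathrm{PS1B}(k)}$$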
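The proportional allocation above can be sketched in Python as follows. This is only a minimal illustration of the formula, not the circuit of the prior art; the function name `gop_targets` and its argument names are hypothetical.

```python
def gop_targets(first_pass_bits, total_target_bits):
    """Split total_target_bits over GOPs in proportion to first-pass bits.

    first_pass_bits[i] plays the role of PS1B(i); the returned list
    plays the role of PS2B(i).
    """
    total_first_pass = sum(first_pass_bits)
    if total_first_pass <= 0:
        raise ValueError("first pass produced no bits")
    return [total_target_bits * b / total_first_pass for b in first_pass_bits]


if __name__ == "__main__":
    # Three GOPs; the first was twice as costly in the first pass.
    print(gop_targets([200, 100, 100], 1000))
```

A GOP that needed more bits at a fixed quantization width in the first pass receives a proportionally larger share of the final budget.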
【００４４】
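Here, letting R be the target code amount PS2B(i) given to one GOP, the code amount controller 113 performs the actual code amount control with an algorithm consisting of the following first and second steps.
【００４５】
(A) First step
In the first step, the code amount assigned to each picture of the GOP is distributed, with certain weights, over the pictures in the GOP that have not yet been coded. The weights are based on the following quantities.
$$X_i = S_i \times Q_i$$
$$X_p = S_p \times Q_p$$
$$X_b = S_b \times Q_b$$
【００４６】
Here X is called the global complexity measure and is defined as the product of the generated code amount S and the average quantization scale Q of the most recently coded picture of the same picture type. The subscript i denotes an I picture, p a P picture, and b a B picture. The quantization scale ratios that achieve ideal image quality are assumed to be Kp = 1.0 for P pictures relative to I pictures and Kb = 1.4 for B pictures relative to I pictures.
【００４７】
The code amounts Ti, Tp, Tb assigned to each picture in the first step are then obtained from the following expressions.
$$T_i = \max\left\{ \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}},\ \frac{br}{8 \times pr} \right\}$$
$$T_p = \max\left\{ \frac{R}{N_p + \dfrac{N_b K_p X_b}{K_b X_p}},\ \frac{br}{8 \times pr} \right\}$$
$$T_b = \max\left\{ \frac{R}{N_b + \dfrac{N_p K_b X_p}{K_p X_b}},\ \frac{br}{8 \times pr} \right\}$$
【００４８】
In these expressions, Ti is the code amount of an I picture, Tp that of a P picture, and Tb that of a B picture. MAX takes the larger of its arguments, R is the code amount initially given to the GOP, Np is the number of P pictures in the GOP not yet coded, Nb is the number of B pictures in the GOP not yet coded, pr is the picture rate, and br is the bit rate.
【００４９】
The code amount R is updated as follows each time a picture in the GOP is coded, where S_{i,p,b} is the generated code amount of the I, P, or B picture just coded.
$$R = R - S_{i,p,b}$$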
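The first step can be sketched in Python as below. This is a minimal sketch under stated assumptions: the second-term factors follow the form in which each picture type's remaining count is weighted by its own complexity and K ratio, and the function name `picture_target`, the dictionary keys, and the argument names are hypothetical, not taken from the source.

```python
def picture_target(ptype, R, Np, Nb, X, bit_rate, picture_rate, Kp=1.0, Kb=1.4):
    """Target bits for the next picture of type ptype ('I', 'P' or 'B').

    R      : bits still available in the current GOP
    Np, Nb : P and B pictures of the GOP not yet coded
    X      : dict of global complexity measures, e.g. {'I': .., 'P': .., 'B': ..}
    """
    Xi, Xp, Xb = X["I"], X["P"], X["B"]
    floor = bit_rate / (8 * picture_rate)  # lower bound br / (8 * pr)
    if ptype == "I":
        denom = 1 + Np * Xp / (Xi * Kp) + Nb * Xb / (Xi * Kb)
    elif ptype == "P":
        denom = Np + Nb * Kp * Xb / (Kb * Xp)
    elif ptype == "B":
        denom = Nb + Np * Kb * Xp / (Kp * Xb)
    else:
        raise ValueError("ptype must be 'I', 'P' or 'B'")
    return max(R / denom, floor)


def update_gop_budget(R, generated_bits):
    """R = R - S: subtract the bits the picture actually produced."""
    return R - generated_bits
```

After each picture is coded, `update_gop_budget` would be applied with the actual generated code amount and the matching counter (Np or Nb) decremented before the next target is computed.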
【００５０】
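(B) Second step
In the second step, in order to match the actual generated code amount to the code amounts (Ti, Tp, Tb) assigned to each picture in the first step, the generated code amount is accumulated macroblock by macroblock, and the difference between it and the target code amount expected up to that point in the picture is fed back to the quantization scale in macroblock units, as shown in the following expressions.
$$d_j^i = d_0^i + B_{j-1} - \frac{T_i \,(j-1)}{\mathrm{MB\_cnt}}$$
$$d_j^p = d_0^p + B_{j-1} - \frac{T_p \,(j-1)}{\mathrm{MB\_cnt}}$$
$$d_j^b = d_0^b + B_{j-1} - \frac{T_b \,(j-1)}{\mathrm{MB\_cnt}}$$
【００５１】
In these expressions, d0i, d0p, and d0b are the initial occupancies of the virtual buffer (VBV buffer) for I, P, and B pictures respectively; j is the number of the macroblock counted from the head of the picture by the code amount counter 121; Bj is the generated code amount from the head of the picture up to the j-th macroblock, as counted by the code amount counter 121; MB_cnt is the number of macroblocks in one picture; and dji, djp, and djb are the feedback amounts for I, P, and B pictures respectively.
【００５２】
Further, the quantization scale Q is obtained from the following expressions.
$$Q = \frac{d_j \times 31}{r}$$
$$r = \frac{2 \times br}{pr}$$
Here Q is the quantization scale and r is a parameter that determines the response speed of the feedback.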
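The macroblock-level feedback of the second step can be sketched as follows. This is a hedged illustration: the function and argument names are hypothetical, and clamping the result to the range 1 to 31 is an assumption added here so that the value fits a 5-bit quantization scale.

```python
def macroblock_quantizer(d0, bits_so_far, picture_target, j, mb_cnt,
                         bit_rate, picture_rate):
    """Quantization scale for the j-th macroblock (j starts at 1).

    d0             : initial virtual buffer occupancy for this picture type
    bits_so_far    : B_{j-1}, bits generated by macroblocks 1 .. j-1
    picture_target : T, the bits assigned to this picture in the first step
    """
    r = 2 * bit_rate / picture_rate                       # reaction parameter
    dj = d0 + bits_so_far - picture_target * (j - 1) / mb_cnt
    q = dj * 31 / r
    return max(1, min(31, round(q)))                      # assumed clamp to 1..31


if __name__ == "__main__":
    # On schedule at the first macroblock, then overspending by macroblock 11.
    print(macroblock_quantizer(80_000, 0, 100_000, 1, 100, 4_000_000, 25))
    print(macroblock_quantizer(80_000, 40_000, 100_000, 11, 100, 4_000_000, 25))
```

When the picture has produced more bits than its pro-rata share, dj grows and the quantization scale rises, which coarsens quantization and reduces the bits of the following macroblocks.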
【００５３】
従来の構成では、上述のような各演算を行うことにより符号量制御を行うことが可能となる。
【００５４】
なお、ＭＰＥＧについては、ＩＳＯ−ＩＥＣ１１１７２−２、ＩＴＵ−ＴＨ．２６２／ＩＳＯ−ＩＥＣ１３８１８−２に詳細な説明がなされているため、ここではこれ以上の説明は省略する。
【００５５】
【発明が解決しようとする課題】
このように、従来の構成では可変転送レート符号化を実現するために、上述したような２回のパスの符号化を通さなければならない。
【００５６】
すなわち、図９，図１０に示したように、ビデオ信号符号化装置に対して、符号化する動画像信号を２度入力させなければならなかった。このため、例えば放送や通信などによってリアルタイムで送信されてくる動画像信号などのように、１度しか送信されてこない画像信号に関しては、可変転送レート符号化を行うことができなかった。
【００５７】
また、例えば符号化されたデータの編集をするような場合には、再度符号化し直さなければならなかったり、符号化データをＭＰＥＧに準拠させるために、実際に編集を行いたい部分の他に数フレームを部分的に修正しなければならなくなったりするなど、非常に手間を要する問題があった。
【００５８】
本発明は、上述の課題に鑑みてなされたものであり、例えば放送や通信などによってリアルタイムで送信されてくる動画像信号に対しても可変転送レートでの符号化を行うことが可能であるとともに、編集も容易な、可変転送レート符号化方法および装置の提供を目的とする。
【００５９】
【課題を解決するための手段】
上記課題を解決するために本発明は、下記の方法及び装置を提供するものである。
（１）ビデオ信号を直交変換と量子化を使用して符号化して、可変転送レートにて出力する可変転送レート符号化方法であって、
入来するビデオ信号を直交変換と量子化を使用して符号化するステップと、
前記入来するビデオ信号の符号化と同時に、その符号化された符号化データの１画像単位毎の発生符号量と１画像単位毎の平均量子化幅とを検出し、検出した１画像単位毎の発生符号量と１画像単位毎の平均量子化幅との情報を有する符号化情報を生成するステップと、
前記符号化された符号化データを固定転送レートで記録すると共に、前記符号化情報を記録するステップと、
前記記録された符号化データを復号するステップと、
前記記録された符号化情報と前記符号化データの再生時間と目標とする平均可変転送レートとをもとに、もしくは前記符号化情報と目標符号量とをもとに、１画像単位毎の新たな目標符号量を設定し、１画像単位毎の発生符号量がその新たな目標符号量となるように、前記符号化データを復号したデータを可変長符号化する可変転送レート符号化ステップと、
を有することを特徴とする可変転送レート符号化方法。
（２）前記記録するステップにおいて、前記符号化するステップでの符号化時の動き補償に関するパラメータを記録するようにし、
前記可変転送レート符号化ステップにおいて、前記記録された符号化データを復号したデータを可変長符号化する際に、動き補償に関するパラメータとして、前記記録された動き補償に関するパラメータを使用することを特徴とする上記（１）記載の可変転送レート符号化方法。
（３）編集情報を入力するステップを設け、
前記符号化情報を生成するステップでは、編集情報に基づく必要な区間のみの符号化情報を生成し、
前記復号するステップでは、編集情報に基づく必要な区間のみのデータを復号し、
前記可変転送レート符号化ステップでは、前記符号化情報として編集情報に基づく必要な区間のみの符号化情報を使用すると共に、符号化するデータとして前記編集情報に基づき必要な区間のみ復号されたデータを使用することを特徴とする上記（１）または（２）に記載の可変転送レート符号化方法。
（４）ビデオ信号を直交変換と量子化を使用して符号化して、可変転送レートにて出力する可変転送レート符号化装置であって、
入来するビデオ信号を直交変換と量子化を使用して符号化する入来ビデオ信号符号化手段と、
前記入来するビデオ信号の符号化と同時に、その符号化された符号化データの１画像単位毎の発生符号量と１画像単位毎の平均量子化幅とを検出し、検出した１画像単位毎の発生符号量と１画像単位毎の平均量子化幅との情報を有する符号化情報を生成する符号化情報検出手段と、
前記符号化された符号化データを固定転送レートで記録すると共に、前記符号化情報を記録する記録手段と、
前記記録された符号化データを復号する復号手段と、
前記記録された符号化情報と前記符号化データの再生時間と目標とする平均可変転送レートとをもとに、もしくは前記符号化情報と目標符号量とをもとに、１画像単位毎の新たな目標符号量を設定し、１画像単位毎の発生符号量がその新たな目標符号量となるように、前記符号化データを復号したデータを可変長符号化する可変転送レート符号化手段と、
を有することを特徴とする可変転送レート符号化装置。
（５）前記入来ビデオ信号符号化手段は、符号化時の動き補償に関するパラメータを出力するものであり、
前記記録手段はその動き補償に関するパラメータを記録するものであり、
前記可変転送レート符号化手段は、前記記録された符号化データを復号したデータを可変長符号化する際に、動き補償に関するパラメータとして、前記記録された動き補償に関するパラメータを使用するものであることを特徴とする上記（４）記載の可変転送レート符号化装置。
（６）編集情報を入力する編集情報入力手段と、
その編集情報に基づいて前記符号化情報検出手段を制御する検出制御手段と、
前記編集情報に基づいて前記復号手段を制御する復号化制御手段とを設け、
前記符号化情報検出手段では、前記検出制御手段の制御により、前記編集情報に基づく必要な区間のみの符号化情報を生成し、
前記復号手段では、前記復号化制御手段の制御により、前記編集情報に基づく必要な区間のみ符号化データを復号し、
前記可変転送レート符号化手段では、前記符号化情報として前記編集情報に基づく必要な区間のみの符号化情報を使用すると共に、符号化するデータとして前記編集情報に基づき必要な区間のみ復号されたデータを使用することを特徴とする上記（４）または（５）に記載の可変転送レート符号化装置。
【００６０】
【発明の実施の形態】
以下、本発明に係る可変転送レート符号化方法および装置の好ましい実施の形態について図面を参照しながら詳細に説明する。
【００６１】
図１には、本発明に係る可変転送レート符号化方法及び装置の参考例としてのビデオ信号符号化装置の基本構成を示す。なお、本参考例では、動画像信号の符号化手法として例えばＭＰＥＧ１符号化を用いた例を挙げて説明する。
【００６２】
この図１において、ビデオ信号符号化装置の入力端子１に入力される入力画像信号は、輝度信号と色差信号で構成されたビデオ信号であり、ディジタル化された後にピクチャタイプにあわせて画像の並べ替えが行われているものである。当該入力画像信号が符号化データとして記憶回路２２に記録され、さらにこの記憶回路２２に記録された符号化データを可変転送レート符号化データに変換（再符号化）するまでの概略構成例を、この図１を用いて説明する。
【００６３】
入力端子１に供給された入力画像信号は、演算器２と動き補償予測器１１に送られる。
【００６４】
動き補償予測器１１では、入力画像信号をその符号化順に動き補償予測し、演算器２では、入力画像信号と動き補償予測器１１からの予測画像との差分が計算される。
【００６５】
当該演算器２での演算により得られた差分画像データは、ＤＣＴ器３においてＤＣＴが行われる。
【００６６】
このＤＣＴ器３からのＤＣＴ係数は、量子化器４で量子化される。その量子化データは、動き補償予測器１１からの動きベクトルや符号化モードと共にＶＬＣ器５に送られ、当該ＶＬＣ器５で可変長符号化（ＶＬＣ）される。
【００６７】
このＶＬＣ器５での可変長符号化によって得られた符号化データは、バッファメモリ６に一時蓄積され、その後、ＭＰＥＧのビデオストリームとして当該バッファメモリ６から出力される。
【００６８】
また、レート制御器２３は、バッファメモリ６の充足度を監視しており、基本的には、
バッファメモリ６の充足度が多くなると量子化を粗く、少なくなると量子化を細かくするような符号化制御信号を、量子化器４にフィードバックする。
すなわち、当該量子化器４にフィードバックされる符号化制御信号は、量子化幅を制御するための制御信号である。
【００６９】
ここで、図１に示す装置においても、前述したように、入力画像信号はＭＰＥＧで定義される固定転送レートで符号化される。この符号化レートは、記録メディアの容量と、記録する入力画像信号の再生時間に依存するものであるが、その条件内で当該符号化レートは極力高いほうが望ましい。すなわち画質がよいほうが望ましい。なぜならば、当該符号化レートが後述する可変転送レート符号化データ変換において符号化される条件の最大転送レートに等しくなるからである。
【００７０】
このため、図１に示す装置では、バッファメモリ６から出力される符号化データを、記憶回路２２に記録するようにしている。なお、図１に示す装置にて用いる記憶回路２２は、記録再生可能なハードディスクや光ディスク、高速のストレージメディアなど何でもよい。
【００７１】
一方、Ｉピクチャ、Ｐピクチャは、後に動き補償予測の参照画像として用いる必要があるため、量子化器４から出力される当該ＩピクチャやＰピクチャの量子化データは、逆量子化器８以降にも送られる。
【００７２】
すなわち、この逆量子化器８での逆量子化により得られたＤＣＴ係数データは、逆ＤＣＴ器９に送られて逆ＤＣＴされた後、演算器１２にて動き補償予測器１１からの予測差分画像が加算されて画像信号が復元される。
【００７３】
この復元された画像信号は、一時、画像メモリ１０に蓄えられる。当該復元されて画像メモリ１０に蓄えられた画像信号は、後のビデオ信号復号装置において再生されるものと同じ画像信号である。
【００７４】
当該画像メモリ１０に蓄えられた画像信号は、動き補償予測器１１に送られ、次の動き補償予測の参照画像となされる。つまり、画像メモリ１０から動き補償予測器１１に送られた画像信号は、演算器２にて差分画像を計算するためのリファレンスの復号化画像を生成するために使用される。
【００７５】
次に、図１に示すビデオ信号符号化装置では、記憶回路２２に記録された符号化データを、符号化情報検出器２４に送る。この符号化情報検出器２４では、符号化データから各ピクチャの発生符号量や量子化幅を検出し、その発生符号量や量子化幅を再び記憶回路２２に送って記録させる。当該記憶回路２２に記録される具体的な符号化情報としては、図２に示すようなフォーマットのピクチャ情報を挙げることができる。このピクチャ情報の部分がピクチャの枚数分だけ、符号化の順番で記憶回路２２に記録される。
【００７６】
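In data compressed by MPEG coding, a 4-byte picture start code is placed at the head of each picture. This picture start code, 0x00000100, is a byte-aligned unique code that can be distinguished from other data. The coding information detector 24 therefore first detects a picture start code and then counts the code amount until the next picture start code is detected, thereby calculating the generated code amount of that picture.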
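Counting the code amount between picture start codes can be sketched in Python as below. This is a simplified sketch, not the detector 24 itself: the function name is hypothetical, the count includes the start code bytes, and the last picture is assumed to run to the end of the buffer (so any trailing headers or end code would be attributed to it).

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"


def picture_code_amounts(stream: bytes):
    """Return the size in bits of each picture found in an elementary stream."""
    positions = []
    pos = stream.find(PICTURE_START_CODE)
    while pos != -1:
        positions.append(pos)
        pos = stream.find(PICTURE_START_CODE, pos + len(PICTURE_START_CODE))
    # Each picture spans from its own start code to the next one (or the end).
    ends = positions[1:] + [len(stream)]
    return [(end - start) * 8 for start, end in zip(positions, ends)]


if __name__ == "__main__":
    demo = PICTURE_START_CODE + b"\xAA" * 6 + PICTURE_START_CODE + b"\xBB" * 2
    print(picture_code_amounts(demo))
```

Because the start code is byte aligned, a plain byte search is enough and no variable-length decoding is needed.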
【００７７】
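The generated code amount depends on the coding rate, but at the equivalent of 15 Mbps it is at most about 1.75 Mbit, and a precision of about 1000 bits is sufficient.
That calls for about 11 to 12 bits of information per picture (1.75 Mbit / 1000 bit = 1750 steps, which fits in 11 bits), so, allowing for byte alignment, about 2 bytes of information are recorded in the storage circuit 22 for each picture, in coded picture order.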
【００７８】
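As quantization width information, it is ideal to obtain the sum or average of the quantization widths determined for each macroblock, a block of 16 × 16 pixels, but for a 720 × 480 pixel picture of the NTSC broadcasting system this means averaging over 1350 macroblocks (45 × 30). In that case the compressed data would have to be variable-length decoded (VLD) down to the macroblock layer, a relatively deep part of the MPEG layer structure. For higher speed, it is also possible to use, as the quantization width information, the slice quantization width at the head of each slice layer, where a slice corresponds to one row of macroblocks. For a 720 × 480 pixel NTSC picture there are 30 such slice quantization widths (480 / 16), and they can be found at the slice start codes 0x00000101 to 0x0000011E, which, like the picture start code, are unique codes.
The slice quantization width is given by the 5 bits immediately following the slice start code.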
【００７９】
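The coding information detector 24 detects the generated code amount described above and the 30 quantization width values, and takes the sum or average of the latter. Since the quantization width is expressed in 5 bits as a value from 1 to 31 in both the macroblock layer and the slice layer, this quantization width information can be expressed in 2 bytes. The storage circuit 22 therefore records the quantization width information appended after the generated code amount information.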
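Reading the slice quantization widths of one picture can be sketched as follows. This is a minimal sketch under the assumptions stated in the text (slice start codes 0x00000101 to 0x0000011E, quantization width in the 5 bits right after the start code); the function names and the `first`/`last` parameters are hypothetical.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def slice_quantizer_scales(picture: bytes, first=0x01, last=0x1E):
    """Collect the 5-bit quantization width that follows each slice start code."""
    scales = []
    pos = picture.find(START_CODE_PREFIX)
    while pos != -1 and pos + 4 < len(picture):
        code = picture[pos + 3]                  # fourth byte identifies the code
        if first <= code <= last:
            scales.append(picture[pos + 4] >> 3)  # top 5 bits of the next byte
        pos = picture.find(START_CODE_PREFIX, pos + 3)
    return scales


def average_slice_quantizer(picture: bytes):
    """Average slice quantization width of a picture, or None if no slice found."""
    scales = slice_quantizer_scales(picture)
    return sum(scales) / len(scales) if scales else None
```

With 30 slices and values of at most 31, the sum is at most 930, which is consistent with the statement that 2 bytes are enough to store it.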
【００８０】
図１の構成では、符号化した後の符号化データから発生符号量や量子化幅情報を検出する例を挙げたが、ＶＬＣ器５での符号化と同時に、発生符号量や量子化幅情報を検出するようにしてもよい。
【００８１】
当該ＶＬＣ器５での符号化と同時に発生符号量や量子化幅情報を符号化情報検出器２４にて検出する場合の基本構成を、図３に示す。なお、この図３において、図１に示した構成の各構成要素と同様に動作する部分にはそれぞれ同じ指示符号を付加してそれらの説明は省略する。
【００８２】
この図３に示す構成の符号化情報検出器２４では、例えば、ＶＬＣ器５で可変長符号化を行っているときの符号化データを用いて、例えばピクチャスタートコード間の符号をカウントすることにより、符号化時のマクロブロック毎の量子化幅の和もしくは平均値を検出するようにしている。もちろん、符号化情報検出器２４では、ピクチャスタートコード間で量子化幅を求める代わりに、前述したように、スライススタートコード間でスライス量子化幅を求めることも可能である。当該符号化情報検出器２４にて検出された発生符号量や量子化幅情報は、記憶回路２２に記録される。
【００８３】
上述した図１及び図３の構成においては、記憶回路２２に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、後述するように、当該符号化データを可変転送レート符号化データに変換（再符号化）するようにしている。
【００８４】
すなわち、図１及び図３の基本構成には、上述した構成に加えて、記憶回路２２に記録された符号化データを復号画像データに変換する復号器４０と、同じく記憶回路２２に記録された発生符号量情報と量子化幅情報、及び、後述する目標とする平均可変転送レートもしくは目標符号量に関する情報に基づいて、その復号画像データを可変転送レート符号化データに変換して出力端子７から出力するための可変転送レート符号化器５０とを備えている。この出力端子７から出力された可変転送レート符号化データは、図示しない記録媒体に記録されることになる。
【００８５】
図４には、記憶回路２２に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第１の具体例の構成、すなわち、図１及び図３の復号器４０と可変転送レート符号化器５０、及びその周辺回路（図１，図３では図示を省略）の、より具体的な構成を示す。なお、この図４には、説明の都合上、記憶回路２２も同時に示している。
【００８６】
この図４において、記憶回路２２から読み出された符号化データは、復号器４０にて復号され、復号画像データとして可変転送レート符号化器５０に送られる。
【００８７】
この可変転送レート符号化器５０に供給された復号画像データは、演算器５２と動き補償予測器６１に送られる。
【００８８】
動き補償予測器６１では、復号画像データを符号化順に動き補償予測し、演算器５２では、復号画像データと動き補償予測器６１からの予測画像との差分が計算される。
【００８９】
当該演算器５２での演算により得られた差分画像データは、ＤＣＴ器５３においてＤＣＴが行われる。
【００９０】
このＤＣＴ器５３からのＤＣＴ係数は、量子化器５４で量子化される。その量子化データは、動き補償予測器６１からの動きベクトルや符号化モードと共にＶＬＣ器５５に送られ、当該ＶＬＣ器５５で可変長符号化（ＶＬＣ）される。
【００９１】
このＶＬＣ器５５での可変長符号化によって得られた符号化データは、バッファメモリ５６に一時蓄積され、その後、可変転送レート符号化データとして出力端子５７から出力され、図１または図３の出力端子７に送られる。この出力された可変転送レート符号化データは図示しない記録媒体に記録されることになる。
【００９２】
一方、Ｉピクチャ、Ｐピクチャは、後に動き補償予測の参照画像として用いる必要があるため、量子化器５４から出力される当該ＩピクチャやＰピクチャの量子化データは、逆量子化器５８以降にも送られる。
【００９３】
すなわち、この逆量子化器５８での逆量子化により得られたＩピクチャやＰピクチャのＤＣＴ係数データは、逆ＤＣＴ器５９に送られて逆ＤＣＴされた後、演算器６２にて動き補償予測器６１からの予測差分画像が加算されて画像信号が復元される。
【００９４】
この復元された画像信号は、一時、画像メモリ６０に蓄えられる。当該復元されて画像メモリ６０に蓄えられた画像信号は、後のビデオ信号復号装置において再生されるものと同じ画像信号である。
【００９５】
当該画像メモリ６０に蓄えられた画像信号は、動き補償予測器６１に送られ、次の動き補償予測の参照画像となされる。つまり、画像メモリ６０から動き補償予測器６１に送られた画像信号は、演算器５２にて差分画像を計算するためのリファレンスの復号化画像を生成するために使用される。
【００９６】
また、符号量カウンタ６２は、ＶＬＣ器５５でのＶＬＣ後に、発生符号量を計算し、この発生符号量を示すピクチャ符号量情報を符号量制御回路６３に供給する。
【００９７】
符号量制御回路６３では、当該符号量カウンタ６２からのピクチャ符号量情報と後述するピクチャ目標符号量情報とに基づいて、量子化器５４における量子化ステップを制御する。
【００９８】
一方、記憶回路２２から読み出された発生符号量情報および量子化幅情報等を含む符号化情報は、図１及び図３では図示を省略した目標符号量決定回路２６に入力される。
【００９９】
また、当該目標符号量決定回路２６には、例えばＣＰＵ（中央処理ユニット）２５から、可変転送レート符号化データの目標符号量が設定される。なお、ＣＰＵ２５は、外部に設けられるものであるが、内部に設けることも可能である。ＣＰＵ２５にて設定される目標符号量は、これから可変転送レート符号化データに変換しようとする符号化データの再生時間がわかれば、平均可変転送レートを入力するのと等価となる。
【０１００】
ここで、この目標符号量決定回路２６における符号化量決定のアルゴリズム例を以下に説明する。
【０１０１】
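For example, let BIT(i) be the generated code amount of each picture detected by the coding information detector 24 as described above, Q(i) the average quantization width over that whole picture, and TB the total target code amount after the encoded data has been converted (re-encoded) into variable transfer rate encoded data, where i is the picture number. Using these, the target code amount TG(i) given to each picture of the variable transfer rate encoded data can be obtained from the following expressions.
$$\mathrm{EN}(i) = \mathrm{BIT}(i)^{0.8} \times Q(i)$$
$$\mathrm{TG}(i) = \mathrm{TB} \times \frac{\mathrm{EN}(i)}{\sum_{k} \mathrm{EN}(k)}$$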
【０１０２】
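The target code amount TG(i) given to each picture of the variable transfer rate encoded data (the picture target code amount information), obtained from these expressions, is sent to the code amount control circuit 63 of the variable transfer rate encoder 50. EN in the expressions is roughly proportional to the detected picture complexity and to the amount of motion compensation error, and represents the difficulty of coding. Increasing the code amount when EN is large and reducing it when EN is small makes possible a code amount allocation that keeps image quality constant. The target code amount TG(i) given to each picture is nothing other than the total target code amount TB, after conversion to variable transfer rate encoded data, distributed in this ratio. Alternatively, as in the conventional example described above, the code amount may simply be distributed in proportion to the generated code amounts, and code amount control within a picture can be realized by the method of the second step described above.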
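The per-picture allocation can be sketched in Python as below. This is only an illustration of the two expressions for EN(i) and TG(i); the function name, the argument names, and the `exponent` parameter are hypothetical.

```python
def picture_target_code_amounts(bits, qwidths, total_target, exponent=0.8):
    """Distribute total_target (TB) over pictures in proportion to EN(i).

    bits[i]    : BIT(i), generated code amount of picture i at the fixed rate
    qwidths[i] : Q(i), average quantization width of picture i
    """
    if len(bits) != len(qwidths):
        raise ValueError("bits and qwidths must have the same length")
    difficulty = [b ** exponent * q for b, q in zip(bits, qwidths)]  # EN(i)
    total = sum(difficulty)
    return [total_target * en / total for en in difficulty]          # TG(i)


if __name__ == "__main__":
    # Two pictures coded with the same quantization width; the first used 4x the bits.
    print(picture_target_code_amounts([400_000, 100_000], [10, 10], 250_000))
```

The exponent below 1 compresses differences in BIT(i), so a picture that used four times the bits receives roughly three times the target rather than four.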
【０１０３】
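The code amount control circuit 63 controls the code amount by controlling the quantization scale in the quantizer 54, based on the picture target code amount TG(i) obtained as described above and the picture code amount information counted by the code amount counter 62.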
【０１０４】
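According to the video signal encoding apparatus of this embodiment, the processing described above makes it possible to convert (re-encode) encoded data into variable transfer rate encoded data.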
【０１０５】
また、図１及び図３の構成においては、図４の構成に代えて、図５に示すような構成により、符号化データを可変転送レート符号化データに変換（再符号化）することも可能である。
【０１０６】
すなわち、図５には、記憶回路２２に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第２の具体例の構成を示している。なお、この図５において、図４に示した各構成要素と同様に動作する部分にはそれぞれ同じ指示符号を付加してそれらの説明は省略する。
【０１０７】
この図５の構成では、図１の例のように記憶回路２２に記録された符号化データに基づいて符号化情報検出器２４で各ピクチャの発生符号量や量子化幅を検出する構成、もしくは、図３の例のように符号化しながら発生符号量や量子化幅情報を符号化情報検出器２４で検出する構成における符号化にて使用し、その符号化データに記述されたマクロブロック毎の動き補償に関するパラメータを、記憶回路２２に記録しておくようにしており、可変転送レート符号化器５０にて復号画像データを可変転送レート符号化するときに、当該記憶回路２２に記録された動き補償に関するパラメータを使用するようにしている。
【０１０８】
この図５の構成と図４の構成との違いは、記憶回路２２から読み出された動き補償に関するパラメータが、動き補償関連情報として可変転送レート符号化器５０の符号化情報入力器６４に入力され、当該可変転送レート符号化器５０の動き補償予測器６１ではこの動き補償関係情報を用いて動き補償予測を行うようにしていることである。
【０１０９】
ここで、記憶回路２２に記録される動き補償に関するパラメータとしては、具体的にはマクロブロック毎の動きベクトルと動き補償のタイプ等を挙げることができる。すなわち、この図５に示す構成例の記憶回路２２に記憶される具体的な符号化情報としては、前述の図２に示したようなフォーマットのピクチャ情報に加えて、図６に示すようなマクロブロック情報の部分がマクロブロックの個数分だけ、ピクチャの左上から右下方向へのラスタ順番に記録される。
【０１１０】
こうすることで、記憶回路２２に記録されている符号化データが例えばある程度の符号化劣化を伴っているような場合であっても、その符号化データを復号器４０にて復号した復号画像を用いて動き補償予測器６１が動きベクトルを求める際に、符号化劣化ノイズに乱されることがなくなる。また、動きベクトルを求める処理量も削減できる。これ以外にも、例えばいわゆるＭＰＥＧ２に拡張する場合に、フレームとフィールドで適応的に切り換える類の情報を記録することは、十分効果的である。
【０１１１】
さらに、図１及び図３の構成においては、図４や図５の構成に代えて、図７に示すような構成により、符号化データを可変転送レート符号化データに変換（再符号化）することも可能である。
【０１１２】
すなわち、図７には、記憶回路２２に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第３の具体例の構成を示している。なお、この図７において、図４や図５に示した各構成要素と同様に動作する部分にはそれぞれ同じ指示符号を付加してそれらの説明は省略する。
【０１１３】
すなわち、この図７の構成では、図１や図３の符号化により得られた符号化データに対して、例えば編集処理を施すような場合に、可変転送レート符号化器５０における可変転送レート符号化の際に、その編集処理に使用した編集情報に従った符号化を行うようにする。なお、具体的な編集情報としては、例えば編集を有効にしたい画像の時間情報を示す、時間：分：秒：フレームからなるフォーマットの情報を挙げることができる。
【０１１４】
より具体的に説明すると、この図７では、編集開始点と編集終了点を組み合わせて生成される編集情報が編集情報入力器４１から入力され、その編集情報が復号制御器４２に送られる。
【０１１５】
当該復号制御器４２は、その編集情報に記述されている、編集を有効にしたい画像の時間部分だけを復号再生するように、復号器４０を制御する。したがって、このときの復号器４０では、記憶回路２２から供給される符号化データのうち、編集情報にて記述されている編集を有効にしたい画像の時間部分だけを復号再生することになる。
【０１１６】
また、この図７の構成の場合、編集情報入力器４１からの編集情報は、検出制御器４３にも入力される。この検出制御器４３は、編集を有効にしたい画像の時間部分だけの、各ピクチャの発生符号量や量子化幅を検出するように、符号化情報検出器２４を制御する。
このため、このときの記憶回路２２には、編集を有効にしたい画像の時間部分だけの、各ピクチャの発生符号量情報や量子化幅情報が符号化情報として記録されることになる。
【０１１７】
さらに、この図７の構成の場合の目標符号量決定回路２６では、記憶回路２２に記憶された、編集を有効にしたい画像の時間部分だけの各ピクチャの発生符号量情報や量子化幅情報からなる符号化情報と、ＣＰＵ２５からの目標符号量もしくは平均可変転送レートとに基づいて、編集を有効にしたい画像の時間部分だけのピクチャ目標符号量が計算されることになる。
【０１１８】
このようなことから、可変転送レート符号化器５０では、編集を有効にしたい画像の時間部分だけの可変転送レート符号化が可能となる。
【０１１９】
なお、このような編集処理において、例えば編集点がＰピクチャの画像部分になった場合、その画像はＩピクチャとして符号化する必要がある。また、編集点がＢピクチャの画像部分になった場合は、その前後の動き補償関係の情報が無効となる必要がある。したがって、このような場合には、関連するピクチャをＩピクチャとして符号化するなどの処理が必要である。
【０１２０】
本発明の可変転送レート符号化方法及び装置が適用される本実施の形態のビデオ信号符号化装置によれば、上述のような構成を用いることによって符号化データを可変転送レート符号化データに変換（再符号化）することを可能にしている。
【０１２１】
また、本実施の形態のビデオ信号符号化装置によって上述したようにして生成された可変転送レート符号化データを不図示の記録媒体に記録することで、記録媒体の容量は有効に活用されることになり、その可変転送レート符号化データを後に復号した場合にも良好な復号データが得られることになる。
【０１２２】
本発明は、上述した実施の形態に限定されることはなく、本発明に係る技術的思想を逸脱しない範囲であれば、設計等に応じて種々の変更が可能であることは勿論であり、符号化の手法も前述したＭＰＥＧ１に限らない。
【０１２３】
【発明の効果】
上述したように、本発明に係る可変転送レート符号化方法および装置によれば、従来の可変転送レート符号化装置のように、同じ画像を２回符号化装置に入力する必要がなくなる。すなわち、符号化する動画像信号として、放送や通信などからリアルタイムで送信されてくる動画像信号など１度しか送信されない画像信号に関しても、高レートで符号化を行ってそのデータを、書き込み可能なディスクやテープメディアなどに一時的に記録し、
しかる後にそのデータを用いて、高画質な可変転送レート符号化を行い、最終的に必要な可変転送レート符号化データを生成して例えば記録することが可能となる。
また、一時的に記録した符号化データの動き補償に関するパラメータを使用して、可変転送レート符号化を行うようにした場合には、動きベクトルを求める際に、符号化劣化ノイズに乱されることがなくなる。
さらに、一時的に記録した符号化データのうち、編集でカットしたい部分などの編集情報を用いて、可変転送レート符号化を行うようにした場合には、編集情報を反映した可変転送レート符号化を行うことができる。
【図面の簡単な説明】
【図１】本発明の可変転送レート符号化方法および装置の参考例としてのビデオ信号符号化装置において、符号化データを符号化情報検出器で検出して可変転送レート符号化データに変換（再符号化）する場合の概略構成を示すブロック図である。
【図２】本実施の形態のビデオ信号符号化装置の記憶回路に記録される符号化情報（ピクチャ情報）のフォーマットを示す図である。
【図３】本実施の形態のビデオ信号符号化装置において、符号化しながら発生符号化量や量子化幅情報を符号化情報検出器で検出して符号化データを可変長符号化データに変換（再符号化）する場合の概略構成例を示すブロック図である。
【図４】本実施の形態のビデオ信号符号化装置において、記憶回路に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第１の具体例の構成を示すブロック図である。
【図５】本実施の形態のビデオ信号符号化装置において、記憶回路に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第２の具体例の構成を示すブロック図である。
【図６】本実施の形態のビデオ信号符号化装置の記憶回路に記録される符号化情報（マクロブロック情報）のフォーマットを示す図である。
【図７】本実施の形態のビデオ信号符号化装置において、記憶回路に記録された発生符号量情報と量子化幅情報、符号化データ等を用いて、当該符号化データを可変転送レート符号化データに変換（再符号化）するための第３の具体例の構成を示すブロック図である。
【図８】従来例のビデオ信号符号化装置の基本構成を示すブロック図である。
【図９】従来例のビデオ信号符号化装置において第１のパスの符号化を行う場合の構成を示すブロック図である。
【図１０】従来例のビデオ信号符号化装置において第２のパスの符号化を行う場合の構成を示すブロック図である。
【符号の説明】
１…入力端子、２，５２…演算器、３，５３…ＤＣＴ器、４，５４…量子化器、
５，５５…ＶＬＣ器、６，５６…バッファメモリ、７，５７…出力端子、
８，５８…逆量子化器、９，５９…逆ＤＣＴ器、１０，６０…画像メモリ、
１１，６１…動き補償予測器、１２，６２…演算器、６２…符号量カウンタ、
６３…符号量制御回路、２２…記憶回路、２４…符号化情報検出器、２５…ＣＰＵ、
２６…目標符号量決定回路、４０…復号器、４１…編集情報入力器、
４２…復号化制御器、４３…検出制御器、５０…可変転送レート符号化器。
[0001]
[Technical field of the invention]
  The present invention relates to a video signal encoding method for encoding a moving image signal (video signal) and to a corresponding video signal encoding apparatus, and is applicable in particular to video signal encoding methods and apparatuses that encode using orthogonal transform and quantization. More specifically, it relates to a variable transfer rate encoding method and apparatus characterized in that encoded data is temporarily recorded at a fixed transfer rate and then converted (re-encoded) into variable rate encoded data.
[0002]
[Prior art]
  In the technique described in Japanese Patent Laid-Open No. 7-284097, an example of a conventional variable transfer rate encoding technique, a video signal is encoded in two passes, a first pass and a second pass, and the first pass generates and outputs the information needed for encoding in the second pass. An encoding scheme such as MPEG is used for this encoding.
[0003]
  MPEG is described in detail in ISO/IEC 11172-2 and ITU-T H.262 / ISO/IEC 13818-2, so only an outline is given here.
[0004]
  MPEG is the abbreviation of the name (Moving Picture Experts Group) of the organization, established in 1988 under ISO/IEC JTC1/SC2 (International Organization for Standardization / International Electrotechnical Commission, Joint Technical Committee 1 / Subcommittee 2; now SC29), that studies video coding standards. MPEG1 (MPEG phase 1) is a standard for storage media at about 1.5 Mbps. It inherits the basic techniques of JPEG (Joint Photographic Experts Group), intended for still image coding, and of H.261 (standardized by CCITT SG XV, now ITU-T SG15), intended for low transfer rate video compression for videoconferencing and videophones over ISDN (Integrated Services Digital Network), and it introduces new techniques for storage media. It was established as ISO/IEC 11172 in August 1993.
[0005]
  MPEG1 is created by combining several technologies.
[0006]
  Temporal redundancy in the input image signal is reduced by taking the difference between the input image signal and the image signal decoded by the motion compensator.
[0007]
  The prediction method has three basic modes: prediction from a past image, prediction from a future image, and prediction from both past and future images. These modes can be switched for each macroblock (MB: Macro Block) of 16 pixels × 16 pixels. The prediction method is determined by the picture type (Picture_Type) given to the input image. A unidirectional inter-picture predictive coded image (P picture: P-picture) has two modes: one that codes with prediction from a past image and one that codes the macroblock independently without prediction. A bidirectional inter-picture predictive coded image (B picture: B-Picture) has four modes: prediction from a future image, prediction from a past image, prediction from both past and future images, and independent coding without prediction. An intra-picture independently coded image (I picture: I-picture) codes all macroblocks independently. The intra-picture independently coded image is called an intra picture, and accordingly the unidirectional and bidirectional inter-picture predictive coded images can be called non-intra pictures.
[0008]
  In motion compensation, pattern matching is performed on the motion region for each macroblock to detect a motion vector with half-pel accuracy, and prediction is performed after shifting by the amount of the detected motion vector. A motion vector has horizontal and vertical components, and it is transmitted as additional information of the macroblock together with the MC (Motion Compensation) mode, which indicates where the prediction comes from.
[0009]
  FIG. 8 shows a basic configuration of a video signal encoding apparatus to which MPEG1 is applied.
[0010]
  In FIG. 8, an input image signal is supplied to an input terminal 101, and this input image signal is sent to a computing unit 102 and to a motion compensation predictor 111 described later.
[0011]
  The computing unit 102 obtains the difference between the image signal decoded by the motion compensation predictor 111 and the input image signal, and sends the difference image signal to the DCT unit 103.
[0012]
  The DCT unit 103 orthogonally transforms the supplied difference image signal. Here, DCT (Discrete Cosine Transform) is an orthogonal transformation in which an integral transformation using a cosine function as an integral kernel is a discrete transformation into a finite space. In MPEG, two-dimensional DCT is performed on an 8 × 8 DCT block obtained by dividing a macroblock into four. In general, a video signal has many low-frequency components and few high-frequency components. Therefore, when DCT is performed, coefficients are concentrated in a low frequency.
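  The two-dimensional DCT of one 8 × 8 block can be sketched in plain Python as follows. This is a direct, unoptimized evaluation of the transform for illustration only; the function name `dct_2d` and the use of orthonormal scaling are choices made for this sketch, not details taken from the source.

```python
import math

N = 8  # MPEG applies the DCT to 8 x 8 blocks


def dct_2d(block):
    """Two-dimensional DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


if __name__ == "__main__":
    flat = [[100] * N for _ in range(N)]
    coeffs = dct_2d(flat)
    print(round(coeffs[0][0], 6))  # all energy of a flat block lands in the DC term
```

  A flat block has no spatial variation, so every coefficient other than the DC term comes out as zero (up to rounding), which illustrates why smooth image areas concentrate their energy in the low frequencies.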
[0013]
  The data (DCT coefficients) obtained by the DCT in the DCT unit 103 is quantized by the quantizer 104. The quantizer 104 forms a quantization value by multiplying the quantization matrix, an 8 × 8 array that weights each two-dimensional frequency according to visual characteristics, by the quantization scale, a value that scales the whole matrix by a scalar, and it divides each DCT coefficient by that quantization value.
[0014]
  When the encoded data produced by this video signal encoding apparatus is later decoded and inverse quantized by a video signal decoding apparatus (decoder, not shown), multiplying by the quantization value used in the encoding apparatus yields values that approximate the original DCT coefficients.
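  The division and the later multiplication can be sketched as below. This is a simplified model that follows the description here (quantization value = matrix entry × quantization scale); the additional constant factors and rounding rules of the actual MPEG arithmetic are deliberately left out, and the function names are hypothetical.

```python
def quantize(coeffs, matrix, scale):
    """Divide each DCT coefficient by (matrix entry * quantization scale)."""
    return [[round(c / (m * scale)) for c, m in zip(c_row, m_row)]
            for c_row, m_row in zip(coeffs, matrix)]


def dequantize(levels, matrix, scale):
    """Multiply each level by the same quantization value used by the encoder."""
    return [[level * m * scale for level, m in zip(l_row, m_row)]
            for l_row, m_row in zip(levels, matrix)]


if __name__ == "__main__":
    # A 2 x 2 corner of a block is enough to show the effect.
    coeffs = [[800, 37], [-45, 3]]
    matrix = [[8, 16], [16, 22]]
    levels = quantize(coeffs, matrix, scale=2)
    print(levels)
    print(dequantize(levels, matrix, scale=2))
```

  The reconstructed values only approximate the originals, and a larger quantization scale makes the approximation coarser while producing smaller levels, which is what the rate control exploits.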
[0015]
  The data quantized by the quantizer 104 is variable-length coded by the VLC unit 105. Of the quantized values, the VLC unit 105 codes the direct current (DC) component using DPCM (Differential Pulse Code Modulation), a form of predictive coding. For the alternating current (AC) components, it performs a so-called zigzag scan from low to high frequencies, treats each pair of zero run length and nonzero coefficient value as one event, and applies so-called Huffman coding, which assigns shorter codes to events with a higher probability of occurrence.
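  The steps that precede the Huffman table lookup can be sketched as follows. This is a minimal sketch: the function names are hypothetical, the initial DPCM predictor of 0 is an assumption, trailing zeros are simply left implicit, and the code table itself is not reproduced.

```python
def zigzag_order(n=8):
    """Block positions (row, col) ordered from low to high frequency."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_level_events(levels):
    """(zero run, nonzero level) events for the AC coefficients of one block."""
    events, run = [], 0
    for r, c in zigzag_order(len(levels))[1:]:   # position 0 is the DC term
        value = levels[r][c]
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    return events                                # trailing zeros are not emitted


def dc_dpcm(dc_values, predictor=0):
    """Differences between successive DC terms (assumed initial predictor 0)."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs
```

  Each (run, level) event would then be mapped to a variable-length code, with the most frequent events getting the shortest codes.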
[0016]
  The data variable-length coded by the VLC unit 105 is temporarily stored in the buffer memory 106, then read out from the buffer memory 106 at a predetermined transfer rate and output from the output terminal 107 as encoded data (an encoded bit stream).
[0017]
  The generated code amount for each macroblock in the output encoded data is sent to the code amount controller 113 via the buffer manager 114 described later. The code amount controller 113 obtains the difference between the generated code amount for each macroblock and the target code amount, generates a code amount control signal corresponding to that difference, and feeds it back to the quantizer 104, thereby controlling the generated code amount. The code amount control signal fed back to the quantizer 104 is a signal for controlling the quantization scale in the quantizer 104. A specific code amount control method is described later.
[0018]
  On the other hand, the quantized image data is sent to the inverse quantizer 108 where it is inversely quantized.
[0019]
  The DCT coefficient data obtained by this inverse quantization is then sent to the inverse DCT unit 109 and inverse transformed, after which the computing unit 112 adds the prediction difference image from the motion compensation predictor 111 to restore the image signal.
[0020]
  The restored image signal is temporarily stored in the image memory 110 and then sent to the motion compensation predictor 111. The image signal sent from the image memory 110 to the motion compensation predictor 111 is used to generate the reference decoded image with which the computing unit 102 calculates the difference image.
[0021]
  The encoded bit stream, which is the encoded data output from the output terminal 107, has a variable code amount per picture in the case of a video signal. This is because MPEG uses information transforms such as DCT, quantization, and Huffman coding, and also because the code amount allocated to each picture is changed adaptively to improve image quality. That is, since MPEG performs motion compensated prediction, the input image signal is sometimes coded as is and sometimes the difference image signal between the predicted image and the input image signal is coded, so the entropy of the coded image itself varies greatly. In this case, code amount control is usually performed by allocating code in proportion to the image entropy while observing the capacity limit of the buffer memory.
[0022]
  Therefore, the buffer manager 114 monitors the relationship between the code amount generated by encoding and the usable coding rate, and sets the target code amount so that the buffer memory 106 stays within a predetermined buffer capacity.
[0023]
  Information corresponding to the difference between this target code amount and the actual generated code amount is fed back to the variable length encoder 105 and enters the code amount controller 113. The code amount controller 113 generates a code amount control signal that either raises the quantization scale set in the quantizer 104 to suppress the generated code amount or, conversely, lowers the quantization scale to increase the generated code amount.
[0024]
  As described above, when variable length data is transferred as fixed transfer rate data (fixed transfer rate encoded data), the maximum buffer amount on the side of the video signal decoding device that receives the data is the upper limit of the generated code amount in the video signal encoding device. That is, MPEG defines a virtual decoder model in which the encoded data is input at a constant rate, accumulated up to a predetermined value, and then decoded instantaneously at predetermined time intervals (units of 1 / 29.97 sec in the case of an NTSC video signal, which is one of the standard television broadcasting systems), and requires the encoding device to encode so that neither overflow nor underflow occurs in the virtual buffer memory (the so-called VBV buffer) of this model. If this rule is observed, the rate varies locally within the VBV buffer, but over a long observation time it becomes a fixed transfer rate, and MPEG defines this as a fixed rate.
[0025]
  Here, in the case of the fixed transfer rate defined as described above, when the generated code amount is small on the encoding device side, the buffer occupancy on the decoding device side sticks to the upper limit value. In this case, the encoding device must increase the code amount, for example by adding invalid bits, so that the buffer (VBV buffer) does not overflow.
[0026]
  On the other hand, the variable transfer rate is defined by extending this definition of the fixed transfer rate: when the buffer occupancy reaches the upper limit value, reading by the decoding device is stopped, so that overflow does not occur in principle. Therefore, in the case of the variable transfer rate, even if the generated code amount is very small, reading by the decoding device simply stops, and there is no need to insert invalid bits as in the case of the fixed transfer rate. For this reason, in the case of the variable transfer rate, encoding only has to ensure that underflow does not occur.
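The difference between the two buffer rules can be illustrated with a small simulation. The following is only a sketch under simplifying assumptions: the function name `simulate_vbv`, the choice of starting with a full buffer, and the per-picture granularity are all illustrative and are not taken from the MPEG specification text.

```python
def simulate_vbv(picture_bits, bit_rate, picture_rate, vbv_size, variable_rate=False):
    """Track virtual-buffer occupancy picture by picture.

    Returns (occupancies, status), where status is "ok", "overflow" or "underflow".
    """
    per_picture_input = bit_rate / picture_rate  # bits arriving per picture period
    occupancy = vbv_size  # assumption: decoding starts with a full buffer
    history = []
    for bits in picture_bits:
        if bits > occupancy:
            return history, "underflow"  # picture not completely received in time
        occupancy -= bits                # instantaneous decode removes the whole picture
        occupancy += per_picture_input   # constant-rate input until the next decode
        if occupancy > vbv_size:
            if variable_rate:
                occupancy = vbv_size     # input simply stops while the buffer is full
            else:
                return history, "overflow"  # fixed rate would need invalid bits here
        history.append(occupancy)
    return history, "ok"
```

With small pictures, the fixed-rate model reports an overflow, whereas the variable-rate model merely clamps the occupancy at the buffer size; both models still report underflow when a picture is larger than the data received so far.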
[0027]
  Against this technical background, a conventional variable transfer rate encoding will now be described in which a video signal is encoded in a first pass and a second pass, and the first pass generates the information necessary for the encoding in the second pass.
[0028]
  FIG. 9 shows a configuration for performing the first pass encoding of the conventional example. In FIG. 9, parts that operate in the same manner as the components of the basic configuration are given the same reference numerals, and descriptions thereof are omitted.
[0029]
  In the first pass configuration shown in FIG. 9, a video signal is reproduced and input to the input terminal 101. At this time, information on the code amount generated in each short section over the entire video sequence is obtained as encoding information.
[0030]
  This generated code amount is calculated by the code amount counter 121 after VLC in the VLC unit 105, and is sent to the storage circuit 122 for storage.
[0031]
  The storage circuit 122 may be any high-speed storage medium, such as a hard disk or an optical disk.
[0032]
  The buffer memory 106 and the subsequent output of the encoded data are shown in the figure because they are used in the first pass to monitor whether compression is being performed correctly; they are not strictly necessary as long as the code amount can be counted.
[0033]
  In encoding methods represented by MPEG, variable length encoding is performed, so if the first pass encoding is performed with a fixed quantization width, the generated code amount increases in accordance with the complexity of the image being encoded and the amount of the motion compensation difference (residual component).
[0034]
  Therefore, if the generated code amount is distributed using this property, the image quality can be made substantially uniform. In the second pass encoding described later, the generated code amount must be controlled to the entire target code amount while maintaining the code amount distribution ratio.
[0035]
  The generated code amount is detected in units of short sections in the first pass encoding, and the information is stored in the storage circuit 122. One example of such a short section is 1 GOP (group of pictures) of about 15 pictures, delimited at each intra-picture independently encoded image (I picture). In this case, the generated code amount of each GOP is stored in the storage circuit 122.
[0036]
  In the first pass encoding, it is common to use a small, fixed quantization width and thus to generate a larger code amount than the final code amount output in the second pass. The quantization width is made small in the first pass encoding because the information must be resolved finely, up to the high frequency components of the image, in order to detect the characteristics of the image.
[0037]
  Here, let PS1B (i) be the code amount generated in the i-th short section in the first pass. The final target total code amount is distributed so that the ratio of the code amounts PS1B (i) generated in the short sections is substantially preserved in the second pass, and the result is set as the target code amount of each short section in the second pass.
[0038]
  For example, if the short interval is 1 GOP, the code can be controlled while maintaining a certain level of image quality by the following method.
[0039]
  Next, FIG. 10 shows a configuration for performing the second pass encoding of the conventional example. In FIG. 10, parts that operate in the same manner as the components of the basic configuration shown in FIG. 8 are given the same reference numerals, and descriptions thereof are omitted.
[0040]
  In the configuration of the second pass shown in FIG. 10, the quantization scale is decreased to increase the generated code amount, while the quantization scale is increased to decrease the generated code amount.
[0041]
  Using this principle, a method of controlling the quantization width based on the occupancy of the buffer memory 106 can be considered. It should be noted that a unidirectionally inter-picture predictively encoded image (P picture) is predicted from the preceding intra-picture independently encoded image (I picture) or P picture, and a bidirectionally inter-picture predictively encoded image (B picture) is predicted from the I pictures and P pictures on both sides of it in time, so that, for example, when the I picture deteriorates, the other pictures deteriorate with it.
[0042]
  In the following method example, the entire code amount control is realized while considering the code amount distribution for these pictures.
[0043]
  First, the target code amount determination circuit 124 reads from the storage circuit 122 the generated code amount PS1B (i) obtained in the first pass and determines the target code amount PS2B (i) of each GOP as shown in the following equation.
  PS2B (i) = final target total code amount × PS1B (i) / ΣPS1B (i)
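The proportional distribution in this equation can be sketched as follows. The function name `gop_targets` and the sample first-pass code amounts are hypothetical and serve only to illustrate the arithmetic.

```python
def gop_targets(first_pass_bits, total_target_bits):
    """PS2B(i) = final target total code amount x PS1B(i) / sum of PS1B."""
    total_first_pass = sum(first_pass_bits)
    return [total_target_bits * b / total_first_pass for b in first_pass_bits]

# Made-up first-pass code amounts for three GOPs.
ps1b = [6_000_000, 3_000_000, 9_000_000]
ps2b = gop_targets(ps1b, 9_000_000)
```

The second-pass targets keep the 2 : 1 : 3 ratio of the first pass while summing to the final target total code amount.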
[0044]
  Here, letting R be the target code amount PS2B (i) given to one GOP, the specific code amount control is performed by the code amount controller 113 using an algorithm such as the following first and second steps.
[0045]
(A) First step
  In the first step, as shown in the following equation, the allocated code amount for each picture of the GOP is distributed with a certain weight assigned to the pictures that have not yet been encoded in the GOP.
      Xi = Si × Qi
      Xp = Sp × Qp
      Xb = Sb × Qb
[0046]
  Here, X is called the global complexity measure and is defined as the product of the generated code amount S and the average quantization scale Q from the previous encoding result of the same picture type; the subscripts i, p, and b denote the I picture, P picture, and B picture, respectively. In addition, taking the I picture as the reference, the ratio of the quantization scale that achieves ideal image quality is assumed to be Kp = 1.0 for the P picture and Kb = 1.4 for the B picture.
[0047]
  At this time, the code amounts Ti, Tp, and Tb of each picture assigned in the first step are obtained by the following equations.
    Ti = MAX {R / (1 + (Np × Xp) / (Xi × Kp) + (Nb × Xb) / (Xi × Kb)), br / (8 × pr)}
    Tp = MAX {R / (Np + (Nb × Kp × Xb) / (Kb × Xp)), br / (8 × pr)}
    Tb = MAX {R / (Nb + (Np × Kb × Xp) / (Kp × Xb)), br / (8 × pr)}
[0048]
  In each equation, Ti, Tp, and Tb represent the code amounts of the I picture, P picture, and B picture, respectively; MAX denotes taking the larger of the two values; R is the code amount given to the GOP, whose initial value is the target code amount of the GOP; Np and Nb are the numbers of P pictures and B pictures in the GOP that have not yet been encoded; pr is the picture rate; and br is the bit rate.
[0049]
  Here, the code amount R is updated as follows each time a picture in the GOP is encoded, where Si,p,b denotes the generated code amount (Si, Sp, or Sb) of the picture just encoded.
      R = R − Si,p,b
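The first step can be sketched in code as below. This is an illustrative reading rather than a definitive implementation: the function name `picture_target` is hypothetical, and the denominators assume that the remaining P pictures and B pictures are weighted by Kp and Kb relative to the picture type being encoded.

```python
KP, KB = 1.0, 1.4  # quantization scale ratios assumed in the text


def picture_target(ptype, R, Np, Nb, Xi, Xp, Xb, br, pr):
    """Target code amount for the next picture of type "I", "P" or "B".

    R      : code amount still available to the GOP
    Np, Nb : numbers of P and B pictures in the GOP not yet encoded
    Xi..Xb : global complexity measures (X = S x Q) per picture type
    """
    floor = br / (8 * pr)  # lower bound shared by all three equations
    if ptype == "I":
        t = R / (1 + Np * Xp / (Xi * KP) + Nb * Xb / (Xi * KB))
    elif ptype == "P":
        t = R / (Np + Nb * KP * Xb / (KB * Xp))
    else:
        t = R / (Nb + Np * KB * Xp / (KP * Xb))
    return max(t, floor)
```

After each picture is encoded, its generated code amount is subtracted from R and the corresponding count Np or Nb is reduced by one before the next target is computed.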
[0050]
(B) Second step
  In the second step, in order to make the actual generated code amount match the code amount (Ti, Tp, Tb) assigned to each picture in the first step, the generated code amount is accumulated macroblock by macroblock as shown in the following equations, and the difference between the accumulated code amount and the target code amount expected at that point is fed back to the quantization scale in units of macroblocks.
      dji = dOi + Bj-1 − (Ti × (j-1) / MB_cnt)
      djp = dOp + Bj-1 − (Tp × (j-1) / MB_cnt)
      djb = dOb + Bj-1 − (Tb × (j-1) / MB_cnt)
[0051]
  In each equation, dOi, dOp, and dOb are the initial occupancies of the virtual buffer (VBV buffer) for the I picture, P picture, and B picture, respectively; j is the number of the macroblock counted from the top of the picture; Bj is the generated code amount from the top of the picture up to the j-th macroblock, as counted by the code amount counter 121; MB_cnt is the number of macroblocks in one picture; and dji, djp, and djb are the feedback amounts for the I picture, P picture, and B picture, respectively.
[0052]
  Further, the quantization scale Q is obtained by the following equations.
      Q = dj × 31 / r
      r = 2 × br / pr
  In the equations, Q is the quantization scale, dj is the feedback amount (dji, djp, or djb) obtained above, and r is a parameter that determines the response speed of the feedback.
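The second step can be sketched as a single function that maps the feedback amount to a quantization scale. The function name `macroblock_quant_scale`, the rounding, and the clamping to the range 1 to 31 are assumptions added for illustration; the text gives only the two equations.

```python
def macroblock_quant_scale(d0, bits_so_far, target_bits, j, mb_cnt, br, pr):
    """Quantization scale for macroblock j (1-based) of a picture.

    d0          : initial virtual-buffer occupancy for this picture type
    bits_so_far : code amount generated by macroblocks 1 .. j-1
    target_bits : Ti, Tp or Tb obtained in the first step
    """
    r = 2 * br / pr  # parameter setting the response speed of the feedback
    dj = d0 + bits_so_far - target_bits * (j - 1) / mb_cnt
    q = dj * 31 / r
    # Assumption: round and clamp to the 5-bit range 1..31.
    return min(31, max(1, round(q)))
```

When more code has been generated than the proportional share of the target, dj grows and the quantization scale becomes coarser; when less has been generated, the scale becomes finer.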
[0053]
  In the conventional configuration, it is possible to perform code amount control by performing each calculation as described above.
[0054]
  MPEG is described in detail in ISO-IEC 11172-2 and ITU-T H.262 / ISO-IEC 13818-2, so further description is omitted here.
[0055]
[Problems to be solved by the invention]
  Thus, in the conventional configuration, the two-pass encoding described above must be performed in order to realize variable transfer rate encoding.
[0056]
  That is, as shown in FIGS. 9 and 10, the video signal to be encoded has to be input twice to the video signal encoding device. For this reason, variable transfer rate encoding cannot be performed on an image signal that is transmitted only once, such as a moving image signal transmitted in real time by broadcasting or communication.
[0057]
  Also, for example, when encoded data is edited, it must be re-encoded, or, in order to keep the encoded data compliant with MPEG, frames other than the part actually edited must also be partially corrected; either way, a great deal of work is required.
[0058]
  The present invention has been made in view of the above-described problems, and its object is to provide a variable transfer rate encoding method and apparatus that can perform encoding at a variable transfer rate even for a moving image signal transmitted in real time by broadcasting or communication, for example, and that make editing easy.
[0059]
[Means for Solving the Problems]
  In order to solve the above problems, the present invention provides the following method and apparatus.
(1) A variable transfer rate encoding method for encoding a video signal using orthogonal transform and quantization and outputting the video signal at a variable transfer rate,
  Encoding an incoming video signal using orthogonal transform and quantization;
  Simultaneously with the encoding of the incoming video signal, detecting the generated code amount for each image unit and the average quantization width for each image unit of the encoded data, and generating encoded information having information on the detected generated code amount for each image unit and average quantization width for each image unit;
  Recording the encoded data at a fixed transfer rate and recording the encoded information;
  Decoding the recorded encoded data;
  A variable transfer rate encoding step of setting a new target code amount for each image unit, based on the recorded encoded information, the reproduction time of the encoded data, and the target average variable transfer rate, or based on the encoded information and the target code amount, and variable-length encoding the data obtained by decoding the encoded data so that the generated code amount for each image unit becomes the new target code amount;
A variable transfer rate encoding method comprising:
(2) The variable transfer rate encoding method according to (1) above, wherein, in the recording step, parameters related to motion compensation at the time of encoding in the encoding step are recorded, and
  In the variable transfer rate encoding step, when the data obtained by decoding the recorded encoded data is variable-length encoded, the recorded parameters related to motion compensation are used as the parameters for motion compensation.
(3) The variable transfer rate encoding method according to (1) or (2) above, further comprising a step of inputting editing information, wherein
  In the step of generating the encoded information, encoded information is generated only for the necessary section based on the editing information,
  In the decoding step, only the data of the necessary section based on the editing information is decoded, and
  In the variable transfer rate encoding step, the encoded information of only the necessary section based on the editing information is used as the encoded information, and the data decoded only for the necessary section based on the editing information is used as the data to be encoded.
(4) A variable transfer rate encoding device that encodes a video signal using orthogonal transform and quantization and outputs the video signal at a variable transfer rate,
  An incoming video signal encoding means for encoding the incoming video signal using orthogonal transform and quantization;
  Encoded information detecting means for detecting, simultaneously with the encoding of the incoming video signal, the generated code amount for each image unit and the average quantization width for each image unit of the encoded data, and generating encoded information having information on the detected generated code amount for each image unit and average quantization width for each image unit;
  Recording means for recording the encoded data at a fixed transfer rate and recording the encoded information;
  Decoding means for decoding the recorded encoded data;
  Variable transfer rate encoding means for setting a new target code amount for each image unit, based on the recorded encoded information, the reproduction time of the encoded data, and the target average variable transfer rate, or based on the encoded information and the target code amount, and variable-length encoding the data obtained by decoding the encoded data so that the generated code amount for each image unit becomes the new target code amount;
A variable transfer rate encoding device comprising:
(5) The variable transfer rate encoding device according to (4) above, wherein the incoming video signal encoding means outputs parameters related to motion compensation at the time of encoding,
  The recording means records the parameters related to motion compensation, and
  The variable transfer rate encoding means uses the recorded parameters related to motion compensation as the parameters for motion compensation when variable-length encoding the data obtained by decoding the recorded encoded data.
(6) The variable transfer rate encoding device according to (4) or (5) above, further comprising:
  Editing information input means for inputting editing information;
  Detection control means for controlling the encoded information detecting means based on the editing information; and
  Decoding control means for controlling the decoding means based on the editing information, wherein
  The encoded information detecting means, under the control of the detection control means, generates encoded information only for the necessary section based on the editing information,
  The decoding means, under the control of the decoding control means, decodes the encoded data only in the necessary section based on the editing information, and
  The variable transfer rate encoding means uses the encoded information of only the necessary section based on the editing information as the encoded information, and uses the data decoded only for the necessary section based on the editing information as the data to be encoded.
[0060]
DETAILED DESCRIPTION OF THE INVENTION
  Hereinafter, preferred embodiments of the variable transfer rate encoding method and apparatus according to the present invention will be described in detail with reference to the drawings.
[0061]
  FIG. 1 shows the basic configuration of a video signal encoding apparatus as a reference example of the variable transfer rate encoding method and apparatus according to the present invention. In this reference example, MPEG1 encoding is used as an example of the moving image signal encoding method.
[0062]
  The input image signal input to the input terminal 1 of this video signal encoding apparatus is a video signal composed of a luminance signal and color difference signals that has been digitized and whose pictures have then been rearranged according to picture type. The following describes, with reference to the drawing, a schematic configuration example in which this input image signal is recorded in the storage circuit 22 as encoded data and the encoded data recorded in the storage circuit 22 is then converted (re-encoded) into variable transfer rate encoded data.
[0063]
  The input image signal supplied to the input terminal 1 is sent to the calculator 2 and the motion compensation predictor 11.
[0064]
  The motion compensated predictor 11 performs motion compensation prediction on the input image signal in the coding order, and the computing unit 2 calculates the difference between the input image signal and the predicted image from the motion compensated predictor 11.
[0065]
  The DCT unit 3 performs DCT on the difference image data obtained by the calculation in the calculator 2.
[0066]
  The DCT coefficient from the DCT unit 3 is quantized by the quantizer 4. The quantized data is sent to the VLC unit 5 together with the motion vector and the encoding mode from the motion compensated predictor 11, and is variable length encoded (VLC) by the VLC unit 5.
[0067]
  The encoded data obtained by the variable length encoding in the VLC unit 5 is temporarily stored in the buffer memory 6 and then output from the buffer memory 6 as an MPEG video stream.
[0068]
  In addition, the rate controller 23 monitors the fullness of the buffer memory 6 and basically feeds back to the quantizer 4 an encoding control signal that makes the quantization coarser when the fullness of the buffer memory 6 increases and finer when it decreases. That is, the encoding control signal fed back to the quantizer 4 is a control signal for controlling the quantization width.
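The direction of this feedback can be illustrated with a deliberately simple mapping from buffer fullness to quantization width. The linear mapping and the function name `quant_width_from_fullness` are hypothetical; the text states only that quantization becomes coarser as the buffer fills and finer as it empties.

```python
def quant_width_from_fullness(occupancy_bits, buffer_size_bits, q_min=1, q_max=31):
    """Map buffer fullness (0..1) linearly onto the quantization width range."""
    fullness = min(1.0, max(0.0, occupancy_bits / buffer_size_bits))
    return q_min + round(fullness * (q_max - q_min))
```

An empty buffer yields the finest width and a full buffer the coarsest, which reduces the generated code amount exactly when the buffer is in danger of overflowing.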
[0069]
  Here, in this apparatus, the input image signal is encoded at a fixed transfer rate defined by MPEG, as described above. The encoding rate depends on the capacity of the recording medium and on the reproduction time of the input image signal to be recorded, but it is desirable that the encoding rate be as high as those conditions allow; in other words, the better the image quality, the better. This is because this encoding rate becomes the maximum transfer rate in the conversion to variable transfer rate encoded data described later.
[0070]
  For this reason, in this apparatus, the encoded data output from the buffer memory 6 is recorded in the storage circuit 22. The storage circuit 22 used in the apparatus shown in FIG. 1 can be any recordable and reproducible medium, such as a hard disk, an optical disk, or another high-speed storage medium.
[0071]
  On the other hand, since the I picture and P picture need to be used later as reference images for motion compensation prediction, the quantized data of the I picture and P picture output from the quantizer 4 is also sent to the inverse quantizer 8 and the subsequent stages.
[0072]
  That is, the DCT coefficient data obtained by the inverse quantization in the inverse quantizer 8 is sent to the inverse DCT unit 9 and subjected to inverse DCT, and then the arithmetic unit 12 adds the prediction difference image from the motion compensated predictor 11 to restore the image signal.
[0073]
  The restored image signal is temporarily stored in the image memory 10. The restored image signal stored in the image memory 10 is the same image signal that is reproduced in a later video signal decoding apparatus.
[0074]
  The image signal stored in the image memory 10 is sent to the motion compensation predictor 11 and becomes a reference image for the next motion compensation prediction. That is, the image signal sent from the image memory 10 to the motion compensation predictor 11 is used for generating a reference decoded image for calculating a difference image in the computing unit 2.
[0075]
  Next, in the video signal encoding apparatus, the encoded data recorded in the storage circuit 22 is sent to the encoded information detector 24. The encoded information detector 24 detects the generated code amount and quantization width of each picture from the encoded data and sends them back to the storage circuit 22 for recording. The specific encoded information recorded in the storage circuit 22 may be, for example, picture information in a format such as that shown in the drawings. This picture information is recorded in the storage circuit 22 in the order of encoding, one entry for each picture.
[0076]
  Here, data compressed by MPEG encoding is defined so that a 4-byte picture start code is attached to the head of each picture. This picture start code, "0x00000100", is a byte-aligned unique code that can be distinguished from all other data. The encoded information detector 24 first detects a picture start code and then counts the code amount until the next picture start code is detected, thereby calculating the generated code amount of the picture.
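Counting the code amount between picture start codes can be sketched as follows. The function name `picture_code_amounts` is hypothetical, and the sketch makes two simplifications: any headers lying between two pictures are counted toward the preceding picture, and the last picture is taken to run to the end of the data.

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"


def picture_code_amounts(stream: bytes):
    """Return the size in bits of each picture, measured start code to start code."""
    positions = []
    pos = stream.find(PICTURE_START_CODE)
    while pos != -1:
        positions.append(pos)
        pos = stream.find(PICTURE_START_CODE, pos + 4)
    # Simplification: the last picture runs to the end of the stream.
    ends = positions[1:] + [len(stream)]
    return [(end - start) * 8 for start, end in zip(positions, ends)]


# Tiny synthetic stream: two "pictures" with 6 and 2 payload bytes.
demo = PICTURE_START_CODE + b"\xaa" * 6 + PICTURE_START_CODE + b"\xbb" * 2
sizes = picture_code_amounts(demo)
```

Because the start code is byte-aligned, a plain byte search is sufficient and no variable length decoding is needed.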
[0077]
  Although the generated code amount depends on the encoding rate, for rates of up to 15 Mbps it is at most about 1.75 Mbits per picture, and an accuracy of about 1000 bits is sufficient. The generated code amount can therefore be expressed with about 11 to 12 bits per picture, and, taking byte alignment into account, the storage circuit 22 records about 2 bytes of information per picture in the order of the encoded pictures.
[0078]
  As the quantization width information, it would be ideal to obtain the sum or average of the quantization widths determined for each block of 16 pixels × 16 pixels called a macroblock, but for a picture of 720 × 480 pixels of the so-called NTSC broadcasting system, this means averaging over 1350 macroblocks. In this case, the compressed data would have to be variable-length decoded down to the macroblock layer, which is a relatively deep part of the MPEG layer structure. To increase speed, therefore, the slice quantization width at the head of each slice layer, which covers one row of macroblocks, can be used as the quantization width information. For a picture of 720 × 480 pixels of the NTSC broadcasting system there are 30 slice quantization widths (480/16), and they can be detected from the slice start codes, which, like the picture start code, are unique codes and range from "0x00000101" to "0x0000011E". The slice quantization width is indicated by the 5 bits immediately after the slice start code.
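Reading the slice quantization widths without variable length decoding can be sketched as below. The function name `slice_quant_widths` is hypothetical, and the sketch assumes that the 5-bit width occupies the upper 5 bits of the byte that follows each slice start code.

```python
def slice_quant_widths(picture: bytes, rows=30):
    """Collect the 5-bit quantization width following each slice start code."""
    widths = []
    pos = picture.find(b"\x00\x00\x01")
    while pos != -1 and pos + 4 < len(picture):
        code = picture[pos + 3]
        if 0x01 <= code <= rows:  # slice start codes 0x101 .. 0x11E for 30 rows
            widths.append(picture[pos + 4] >> 3)  # upper 5 bits of the next byte
        pos = picture.find(b"\x00\x00\x01", pos + 3)
    return widths


# Synthetic picture: a picture start code, then two slices with widths 10 and 20.
demo_picture = (
    b"\x00\x00\x01\x00\xff"
    + b"\x00\x00\x01\x01" + bytes([10 << 3]) + b"\x55"
    + b"\x00\x00\x01\x02" + bytes([(20 << 3) | 0x7]) + b"\x55"
)
demo_widths = slice_quant_widths(demo_picture)
```

The sum or average of the returned values then serves as the quantization width information for the picture.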
[0079]
  The encoded information detector 24 detects the generated code amount described above and the 30 quantization width values, and takes the sum or average of the latter. Since the quantization width is indicated by 5 bits, with values from 1 to 31, in both the macroblock layer and the slice layer, the quantization width information is a value that can be expressed in 2 bytes. For this reason, in the storage circuit 22, the quantization width information is recorded immediately after the generated code amount information.
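One possible layout for such a per-picture record is sketched below. The 4-byte big-endian layout, the 1000-bit unit for the code amount, and the function names are assumptions made for illustration; the actual recording format is the one shown in the drawings.

```python
import struct


def pack_picture_info(code_bits, quant_widths):
    """Pack one picture's record: 2 bytes code amount, 2 bytes quantization sum."""
    code_units = min(0xFFFF, round(code_bits / 1000))  # about 1000-bit accuracy
    return struct.pack(">HH", code_units, sum(quant_widths))


def unpack_picture_info(record):
    """Inverse of pack_picture_info; returns (code amount in bits, quantization sum)."""
    code_units, quant_sum = struct.unpack(">HH", record)
    return code_units * 1000, quant_sum


record = pack_picture_info(1_750_000, [31] * 30)
```

Even the largest case in the text (1.75 Mbits and thirty widths of 31) fits comfortably in the two 2-byte fields.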
[0080]
  In the configuration of FIG. 1, the generated code amount and quantization width information are detected from the encoded data after encoding; alternatively, the generated code amount and quantization width information may be detected simultaneously with the encoding in the VLC unit 5.
[0081]
  FIG. 3 shows a basic configuration in which the encoded information detector 24 detects the generated code amount and quantization width information simultaneously with the encoding by the VLC unit 5. In FIG. 3, parts that operate in the same manner as the components of the configuration already described are given the same reference numerals, and descriptions thereof are omitted.
[0082]
  The encoded information detector 24 in the configuration shown in FIG. 3 uses the encoded data produced when the VLC unit 5 performs variable length encoding; for example, it counts the code between picture start codes to obtain the generated code amount and detects the sum or average of the quantization widths of the macroblocks at the time of encoding. Of course, instead of obtaining the quantization width between picture start codes, the encoded information detector 24 may obtain the slice quantization widths between slice start codes, as described above. The generated code amount and quantization width information detected by the encoded information detector 24 are recorded in the storage circuit 22.
[0083]
  In the configurations of FIG. 1 and FIG. 3 described above, the encoded data is converted (re-encoded) into variable transfer rate encoded data, as described later, using the generated code amount information, the quantization width information, the encoded data, and so on recorded in the storage circuit 22.
[0084]
  That is, in addition to the configuration described above, the basic configuration of FIGS. 1 and 3 includes a decoder 40 that decodes the encoded data recorded in the storage circuit 22 into decoded image data, and a variable transfer rate encoder 50 that converts the decoded image data into variable transfer rate encoded data, based on the generated code amount information and quantization width information also recorded in the storage circuit 22 and on the target average variable transfer rate or target code amount information described later, and outputs the result from the output terminal 7. The variable transfer rate encoded data output from the output terminal 7 is recorded on a recording medium (not shown).
[0085]
  FIG. 4 shows the configuration of a first specific example that converts (re-encodes) the encoded data into variable transfer rate encoded data using the generated code amount information, the quantization width information, the encoded data, and so on recorded in the storage circuit 22; that is, it shows a more specific configuration of the decoder 40 and the variable transfer rate encoder 50 of FIG. 1 and FIG. 3 and their peripheral circuits (not shown in FIG. 1 and FIG. 3). In FIG. 4, the storage circuit 22 is also shown for convenience of explanation.
[0086]
  In FIG. 4, the encoded data read from the storage circuit 22 is decoded by the decoder 40 and sent to the variable transfer rate encoder 50 as decoded image data.
[0087]
  The decoded image data supplied to the variable transfer rate encoder 50 is sent to the calculator 52 and the motion compensation predictor 61.
[0088]
  The motion compensated predictor 61 performs motion compensation prediction on the decoded image data in the order of encoding, and the calculator 52 calculates the difference between the decoded image data and the predicted image from the motion compensated predictor 61.
[0089]
  The DCT unit 53 performs DCT on the difference image data obtained by the calculation in the calculator 52.
[0090]
  The DCT coefficient from the DCT unit 53 is quantized by the quantizer 54. The quantized data is sent to the VLC unit 55 together with the motion vector and the encoding mode from the motion compensated predictor 61, and is variable length encoded (VLC) by the VLC unit 55.
[0091]
  The encoded data obtained by the variable length encoding in the VLC unit 55 is temporarily stored in the buffer memory 56, then output from the output terminal 57 as variable transfer rate encoded data and sent to the output terminal 7 of the basic configuration. The output variable transfer rate encoded data is recorded on a recording medium (not shown).
[0092]
  On the other hand, since the I picture and P picture need to be used later as reference images for motion compensation prediction, the quantized data of the I picture and P picture output from the quantizer 54 is also sent to the inverse quantizer 58 and the subsequent stages.
[0093]
  That is, the DCT coefficient data of the I picture and P picture obtained by the inverse quantization in the inverse quantizer 58 is sent to the inverse DCT unit 59 and subjected to inverse DCT, and then the arithmetic unit 62 adds the prediction difference image from the motion compensation predictor 61 to restore the image signal.
[0094]
  The restored image signal is temporarily stored in the image memory 60. The restored image signal stored in the image memory 60 is the same image signal that is reproduced in a later video signal decoding apparatus.
[0095]
  The image signal stored in the image memory 60 is sent to the motion compensation predictor 61 to be a reference image for the next motion compensation prediction. In other words, the image signal sent from the image memory 60 to the motion compensation predictor 61 is used by the calculator 52 to generate a reference decoded image for calculating a difference image.
[0096]
  The code amount counter 62 calculates a generated code amount after VLC in the VLC unit 55 and supplies picture code amount information indicating the generated code amount to the code amount control circuit 63.
[0097]
  The code amount control circuit 63 controls the quantization step in the quantizer 54 based on picture code amount information from the code amount counter 62 and picture target code amount information described later.
[0098]
  On the other hand, the encoded information including the generated code amount information and quantization width information read from the storage circuit 22 is input to the target code amount determination circuit 26 not shown in FIGS.
[0099]
  A target code amount for the variable transfer rate encoded data is set in the target code amount determination circuit 26 by, for example, a CPU (central processing unit) 25. The CPU 25 is provided externally here, but can also be provided internally. If the reproduction time of the encoded data to be converted into variable transfer rate encoded data is known, setting the target code amount from the CPU 25 is equivalent to inputting the average variable transfer rate.
[0100]
  Here, an example of the algorithm by which the target code amount determination circuit 26 determines the target code amount is described below.
[0101]
  For example, as described above, let BIT(i) be the generated code amount of each picture detected by the encoded information detector 24, let Q(i) be the average quantization width over the whole picture at that time, and let TB be the total target code amount after the encoded data has been converted (re-encoded) into variable transfer rate encoded data. Here (i) denotes the picture number. Using these values, the target code amount TG(i) given to each picture of the variable transfer rate encoded data can be obtained from the following equations.
    EN(i) = BIT(i)^0.8 × Q(i)
    TG(i) = TB × EN(i) / ΣEN(i)
[0102]
  The target code amount (picture target code amount information) TG(i) obtained from these equations for each picture of the variable transfer rate encoded data is sent to the code amount control circuit 63 of the variable transfer rate encoder 50. Note that EN in the equations is roughly proportional to the complexity of the detected picture and to the amount of error in motion compensation, and so represents the difficulty of encoding. By increasing the code amount when EN is large and decreasing it when EN is small, code amounts can be assigned so that image quality stays constant. The total target code amount TB after conversion into variable transfer rate encoded data is distributed among the pictures in the ratio given by the target code amounts TG(i). As in the conventional example described above, the code amount may instead be distributed simply in proportion to the generated code amounts, and code amount control within a picture can be realized by the method from the second step onward described above.
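The allocation described by the equations for EN(i) and TG(i) can be sketched as follows. The function name and argument names are assumptions made for this illustration; only the formulas themselves come from the text.

```python
def picture_target_code_amounts(bits, qwidths, total_target_bits, exponent=0.8):
    """Distribute the total target code amount TB over pictures.

    bits[i]    : generated code amount BIT(i) of picture i
    qwidths[i] : average quantization width Q(i) of picture i
    Returns the list of TG(i) = TB * EN(i) / sum(EN), with
    EN(i) = BIT(i) ** exponent * Q(i).
    """
    if len(bits) != len(qwidths):
        raise ValueError("bits and qwidths must have the same length")
    difficulty = [b ** exponent * q for b, q in zip(bits, qwidths)]
    total_difficulty = sum(difficulty)
    if total_difficulty <= 0:
        raise ValueError("the sum of EN(i) must be positive")
    return [total_target_bits * en / total_difficulty for en in difficulty]
```

Pictures with a larger EN(i) receive a larger share of TB, and the shares always sum to TB (up to floating-point rounding).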
[0103]
  The code amount control circuit 63 controls the code amount by controlling the quantization scale in the quantizer 54, based on the picture target code amount TG(i) obtained as described above and on the picture code amount information counted by the code amount counter 62.
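The text does not spell out the control law at this point. One common approach, shown here purely as an assumption, is a virtual-buffer feedback loop in the style of the MPEG-2 Test Model rate control: the quantization scale rises when more bits have been spent than the picture target allows so far, and falls otherwise. The class name, the `reaction` parameter, and the initial buffer fullness are all hypothetical; the clamp to 1..31 reflects the MPEG-1 quantizer scale range.

```python
class VirtualBufferRateControl:
    """Hypothetical per-picture quantization scale controller (not from the patent)."""

    def __init__(self, picture_target_bits, macroblock_count, reaction, initial_fullness=0.0):
        if macroblock_count <= 0 or reaction <= 0:
            raise ValueError("macroblock_count and reaction must be positive")
        self.target = picture_target_bits   # picture target code amount TG(i)
        self.count = macroblock_count       # macroblocks in the picture
        self.reaction = reaction            # assumed feedback strength parameter
        self.initial = initial_fullness     # assumed starting buffer fullness
        self.spent = 0                      # bits counted so far in this picture
        self.coded = 0                      # macroblocks coded so far

    def quant_scale(self):
        """Quantization scale for the next macroblock, clamped to 1..31."""
        budget_so_far = self.target * self.coded / self.count
        fullness = self.initial + self.spent - budget_so_far
        scale = round(fullness * 31 / self.reaction)
        return max(1, min(31, scale))

    def record(self, macroblock_bits):
        """Feed back the code amount counted for the macroblock just coded."""
        self.spent += macroblock_bits
        self.coded += 1
```

In this sketch, `record` plays the role of the counted code amount being fed back, and `quant_scale` plays the role of the quantization scale decision for the next macroblock.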
[0104]
  By performing the above processing, the video signal encoding apparatus of the present embodiment can convert (re-encode) the encoded data into variable transfer rate encoded data.
[0105]
  In the configurations of FIGS. 1 and 3, the encoded data can also be converted (re-encoded) into variable transfer rate encoded data by using the configuration shown in FIG. 5 in place of the configuration described above.
[0106]
  That is, FIG. 5 shows the configuration of a second specific example for converting (re-encoding) the encoded data into variable transfer rate encoded data using the generated code amount information and quantization width information recorded in the storage circuit 22, the encoded data, and so on. In FIG. 5, parts that operate in the same manner as the components shown in FIG. 4 are given the same reference numerals, and descriptions thereof are omitted.
[0107]
  In the configuration of FIG. 5, the encoded information detector 24 detects the generated code amount and quantization width of each picture, either from the encoded data recorded in the storage circuit 22 or during encoding, as in the earlier examples. In addition, the parameters relating to motion compensation for each macroblock, which are used in encoding and are described in the encoded data, are recorded in the storage circuit 22. When the variable transfer rate encoder 50 applies variable transfer rate encoding to the decoded image data, it uses the motion compensation parameters recorded in the storage circuit 22.
[0108]
  The difference between the configuration of FIG. 5 and the configuration of FIG. 4 is that the parameters related to motion compensation read from the storage circuit 22 are input to the encoded information input unit 64 of the variable transfer rate encoder 50 as motion compensation related information. Thus, the motion compensation predictor 61 of the variable transfer rate encoder 50 uses this motion compensation related information to perform motion compensation prediction.
[0109]
  Here, specific examples of the parameters relating to motion compensation recorded in the storage circuit 22 include the motion vector and the motion compensation type of each macroblock. That is, the encoded information stored in the storage circuit 22 in the configuration example of FIG. 5 includes macroblock information as shown in FIG. 6 in addition to the picture information in the format described earlier. The macroblock information is recorded in raster order, from the upper left to the lower right of the picture, with one entry per macroblock.
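A record layout for this per-macroblock information might look like the sketch below. The field names, the string used for the motion compensation type, and the helper functions are assumptions for illustration; the actual format is the one defined in FIG. 6. The 16 × 16 macroblock size is the one given earlier in the description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MacroblockInfo:
    """Hypothetical per-macroblock record: motion vector and motion compensation type."""
    motion_vector: Tuple[int, int]   # (horizontal, vertical) displacement
    mc_type: str                     # e.g. "forward", "backward", "bidirectional", "intra"


def macroblock_grid(width, height, mb_size=16):
    """Number of macroblocks per row and per column, rounding partial blocks up."""
    return ((width + mb_size - 1) // mb_size, (height + mb_size - 1) // mb_size)


def raster_index(mb_x, mb_y, mbs_per_row):
    """Position of a macroblock in raster order (upper left first, lower right last)."""
    return mb_y * mbs_per_row + mb_x
```

With these helpers, a 352 × 240 picture has 22 × 15 macroblocks, so its stored list would hold 330 entries, with the lower-right macroblock at index 329.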
[0110]
  By doing so, even if the encoded data recorded in the storage circuit 22 has suffered some encoding degradation, the motion vectors are not disturbed by the degradation noise, as they could be if, for example, the motion compensation predictor 61 obtained them from the decoded image produced when the decoder 40 decodes the encoded data. The amount of processing needed to obtain motion vectors is also reduced. In addition, when extending to MPEG2, for example, it is also effective to record information such as the adaptive switching between frames and fields.
[0111]
  Further, in the configurations shown in FIGS. 1 and 3, the encoded data can also be converted (re-encoded) into variable transfer rate encoded data by using the configuration shown in FIG. 7 in place of the configurations described above.
[0112]
  That is, FIG. 7 shows the configuration of a third specific example for converting (re-encoding) the encoded data into variable transfer rate encoded data using the generated code amount information and quantization width information recorded in the storage circuit 22, the encoded data, and so on. In FIG. 7, parts that operate in the same manner as the components shown in the earlier figures, including FIG. 4, are given the same reference numerals, and descriptions thereof are omitted.
[0113]
  That is, in the configuration of FIG. 7, when the encoded data obtained by encoding is subjected to, for example, an editing process, the variable transfer rate encoder 50 performs variable transfer rate encoding in accordance with the editing information used for that editing process. A specific example of editing information is time information, in hours:minutes:seconds:frames format, identifying the images for which editing is to be enabled.
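As an illustration of how editing information in hours:minutes:seconds:frames form could be mapped to picture numbers, consider the sketch below. The function names, the default of 30 frames per second, non-drop-frame counting, and the inclusive end point are all assumptions for this example.

```python
def timecode_to_frame(timecode, fps=30):
    """Convert 'HH:MM:SS:FF' to a frame number, assuming non-drop-frame counting."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError("timecode field out of range: " + timecode)
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def pictures_in_edit_range(start_timecode, end_timecode, fps=30):
    """Frame numbers from the edit start point to the edit end point, inclusive."""
    start = timecode_to_frame(start_timecode, fps)
    end = timecode_to_frame(end_timecode, fps)
    if end < start:
        raise ValueError("edit end point precedes edit start point")
    return range(start, end + 1)
```

The resulting range identifies the pictures to be decoded, analyzed, and re-encoded; pictures outside it would simply be skipped.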
[0114]
  More specifically, in FIG. 7, editing information consisting of paired edit start points and edit end points is input from the editing information input device 41 and sent to the decoding controller 42.
[0115]
  The decoding controller 42 controls the decoder 40 so that it decodes and reproduces only the time portion of the image for which editing is to be enabled, as described in the editing information. Accordingly, of the encoded data supplied from the storage circuit 22, the decoder 40 decodes and reproduces only the time portion of the image for which the editing described in the editing information is to be enabled.
[0116]
  In the configuration of FIG. 7, the editing information from the editing information input device 41 is also input to the detection controller 43. The detection controller 43 controls the encoded information detector 24 so that it detects the generated code amount and quantization width of each picture only for the time portion of the image for which editing is to be enabled.
Therefore, the storage circuit 22 records, as encoded information, the generated code amount information and quantization width information of each picture only for the time portion of the image for which editing is to be enabled.
[0117]
  Further, in the configuration of FIG. 7, the target code amount determination circuit 26 calculates the picture target code amount only for the time portion of the image for which editing is to be enabled. It does so based on the encoded information stored in the storage circuit 22, namely the generated code amount information and quantization width information of each picture in that time portion, and on the target code amount or average variable transfer rate from the CPU 25.
[0118]
  For this reason, the variable transfer rate encoder 50 can perform variable transfer rate encoding only for the time portion of the image for which editing is to be enabled.
[0119]
  In such an editing process, for example, when the editing point is an image portion of a P picture, the image needs to be encoded as an I picture. In addition, when the edit point is an image portion of a B picture, the motion compensation information before and after the edit point needs to be invalidated. Therefore, in such a case, processing such as encoding the related picture as an I picture is necessary.
[0120]
  With the configuration described above, the video signal encoding apparatus of the present embodiment, to which the variable transfer rate encoding method and apparatus of the present invention are applied, can convert (re-encode) the encoded data into variable transfer rate encoded data.
[0121]
  Also, when the variable transfer rate encoded data generated as described above by the video signal encoding apparatus of the present embodiment is recorded on a recording medium (not shown), the capacity of the recording medium is used effectively, and good decoded data is obtained even when the variable transfer rate encoded data is decoded later.
[0122]
  The present invention is not limited to the above-described embodiment, and various modifications can be made according to the design and the like as long as the technical idea of the present invention is not deviated from. The encoding method is not limited to MPEG1 described above.
[0123]
[Effects of the invention]
  As described above, according to the variable transfer rate encoding method and apparatus of the present invention, it is not necessary to input the same image to the encoding apparatus twice, as it is in the conventional variable transfer rate encoding apparatus. That is, the moving image signal to be encoded may be an image signal that is transmitted only once, such as a moving image signal transmitted in real time by broadcasting or communication. Such a signal is encoded at a high rate, and the data is temporarily recorded on writable disc or tape media.
Thereafter, that data is used to perform variable transfer rate encoding with high image quality, so that the variable transfer rate encoded data finally required can be generated and, for example, recorded.
  Also, if variable transfer rate encoding is performed using the motion compensation parameters of the temporarily recorded encoded data, the motion vectors are not disturbed by encoding degradation noise.
  In addition, if variable transfer rate encoding is performed using editing information for the temporarily recorded encoded data, such as the portions to be cut, variable transfer rate encoding that reflects the editing information can be performed.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating, as a reference example for the variable transfer rate encoding method and apparatus according to the present invention, a schematic configuration of a video signal encoding apparatus in which the encoded information detector performs detection on the encoded data and the encoded data is converted (re-encoded) into variable transfer rate encoded data.
FIG. 2 is a diagram illustrating a format of encoded information (picture information) recorded in a storage circuit of the video signal encoding apparatus according to the present embodiment.
FIG. 3 is a block diagram showing an example of a schematic configuration of the video signal encoding apparatus according to the present embodiment in which the encoded information detector performs detection and the encoded data is converted (re-encoded) into variable-length encoded data.
FIG. 4 is a block diagram showing the configuration of a first specific example for converting (re-encoding) the encoded data into variable transfer rate encoded data in the video signal encoding apparatus of the present embodiment, using the generated code amount information and quantization width information recorded in the storage circuit, the encoded data, and so on.
FIG. 5 is a block diagram showing the configuration of a second specific example for converting (re-encoding) the encoded data into variable transfer rate encoded data in the video signal encoding apparatus of the present embodiment, using the generated code amount information and quantization width information recorded in the storage circuit, the encoded data, and so on.
FIG. 6 is a diagram illustrating a format of encoded information (macroblock information) recorded in a storage circuit of the video signal encoding apparatus according to the present embodiment.
FIG. 7 is a block diagram showing the configuration of a third specific example for converting (re-encoding) the encoded data into variable transfer rate encoded data in the video signal encoding apparatus of the present embodiment, using the generated code amount information and quantization width information recorded in the storage circuit, the encoded data, and so on.
FIG. 8 is a block diagram showing a basic configuration of a conventional video signal encoding device.
FIG. 9 is a block diagram illustrating a configuration in the case where the first pass encoding is performed in the conventional video signal encoding device.
FIG. 10 is a block diagram showing a configuration when performing a second pass encoding in a video signal encoding device of a conventional example.
[Explanation of symbols]
  1 ... input terminal, 2, 52 ... operation unit, 3, 53 ... DCT unit, 4, 54 ... quantizer,
5, 55 ... VLC unit, 6, 56 ... buffer memory, 7, 57 ... output terminal,
8, 58 ... inverse quantizer, 9, 59 ... inverse DCT unit, 10, 60 ... image memory,
11, 61 ... motion compensation predictor, 12, 62 ... arithmetic unit, 62 ... code amount counter,
63 ... code amount control circuit, 22 ... storage circuit, 24 ... encoded information detector, 25 ... CPU,
26 ... target code amount determination circuit, 40 ... decoder, 41 ... editing information input device,
42 ... decoding controller, 43 ... detection controller, 50 ... variable transfer rate encoder.

Claims

A variable transfer rate encoding method that encodes a video signal using orthogonal transform and quantization and outputs the video signal at a variable transfer rate,
Encoding an incoming video signal using orthogonal transform and quantization;
Detecting, simultaneously with the encoding of the incoming video signal, a generated code amount for each image unit and an average quantization width for each image unit of the encoded data, and generating encoded information containing the detected generated code amount and average quantization width for each image unit;
Recording the encoded data at a fixed transfer rate and recording the encoded information;
Decoding the recorded encoded data;
A variable transfer rate encoding step of setting a new target code amount for each image unit, based either on the recorded encoded information, the reproduction time of the encoded data, and the target average variable transfer rate, or on the encoded information and the target code amount, and of variable-length encoding the data obtained by decoding the encoded data so that the generated code amount for each image unit becomes the new target code amount;
A variable transfer rate encoding method comprising the above steps.

In the recording step, parameters relating to motion compensation used at the time of encoding in the encoding step are recorded,
In the variable transfer rate encoding step, when the data obtained by decoding the recorded encoded data is variable-length encoded, the recorded parameters relating to motion compensation are used as the motion compensation parameters. The variable transfer rate encoding method according to claim 1.

A step of inputting editing information is further provided,
In the step of generating the encoded information, encoded information is generated only for the necessary section based on the editing information,
In the decoding step, only the data of the necessary section based on the editing information is decoded,
In the variable transfer rate encoding step, the encoded information for only the necessary section based on the editing information is used as the encoded information, and the data decoded only for the necessary section based on the editing information is used as the data to be encoded. The variable transfer rate encoding method according to claim 1.

A variable transfer rate encoding device that encodes a video signal using orthogonal transform and quantization and outputs the video signal at a variable transfer rate,
An incoming video signal encoding means for encoding the incoming video signal using orthogonal transform and quantization;
Encoded information detecting means for detecting, simultaneously with the encoding of the incoming video signal, a generated code amount for each image unit and an average quantization width for each image unit of the encoded data, and for generating encoded information containing the detected generated code amount and average quantization width for each image unit;
Recording means for recording the encoded data at a fixed transfer rate and recording the encoded information;
Decoding means for decoding the recorded encoded data;
Variable transfer rate encoding means for setting a new target code amount for each image unit, based either on the recorded encoded information, the reproduction time of the encoded data, and the target average variable transfer rate, or on the encoded information and the target code amount, and for variable-length encoding the data obtained by decoding the encoded data so that the generated code amount for each image unit becomes the new target code amount;
A variable transfer rate encoding device comprising the above means.

The incoming video signal encoding means outputs parameters relating to motion compensation at the time of encoding,
The recording means records the parameters relating to motion compensation,
The variable transfer rate encoding means uses the recorded parameters relating to motion compensation as the motion compensation parameters when variable-length encoding the data obtained by decoding the recorded encoded data. The variable transfer rate encoding apparatus according to claim 4.

Editing information input means for inputting editing information;
Detection control means for controlling the encoded information detection means based on the editing information;
A decoding control means for controlling the decoding means based on the editing information;
The encoded information detection means, under the control of the detection control means, generates encoded information only for the necessary section based on the editing information,
The decoding means, under the control of the decoding control means, decodes the encoded data only for the necessary section based on the editing information,
The variable transfer rate encoding means uses, as the encoded information, the encoded information for only the necessary section based on the editing information, and uses, as the data to be encoded, the data decoded only for the necessary section based on the editing information. The variable transfer rate encoding device according to claim 4 or 5.