JP2004015351A - Encoding apparatus and method, program, and recording medium

Info

Publication number: JP2004015351A
Application number: JP2002164919A
Authority: JP
Inventors: Shinya Iki; 伊木　信弥
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-06-05
Filing date: 2002-06-05
Publication date: 2004-01-15
Anticipated expiration: 2022-06-05
Also published as: JP4288897B2

Abstract

PROBLEM TO BE SOLVED: To encode a signal with a bit amount adjusted in response to a scene change.

SOLUTION: An encoder is provided with: a scene change position detection section 12 for detecting the position of a scene change from a moving picture signal; a degree of encoding difficulty calculation section 10 for calculating the degree of encoding difficulty d for each unit time of the moving picture signal; a bit amount calculation section 11 for calculating a bit amount b required for transfer of the moving picture signal within the unit time on the basis of the degree of encoding difficulty d; a bit amount adjustment section for adjusting the bit amounts assigned before and after a prescribed position decided on the basis of the scene change position when the bit amount b is lower than a prescribed threshold value; a delay section 13 for delaying the moving picture signal; and an encoding section 14 for revising part of the image structure into a prescribed image structure on the basis of the scene change position and encoding the moving picture signal outputted from the delay section, according to the revised image structure and on the basis of the adjustment performed by the bit amount adjustment section.

COPYRIGHT: (C)2004, JPO

Description

【０００１】
【発明の属する技術分野】本発明は、ディジタル信号を効率的に符号化する高能率符号化における符号化装置及び方法、プログラム、記録媒体に関し、特に、動画像信号の符号化において、１パス方式で可変ビットレートの割当量を制御し符号化を行う符号化装置及び方法、プログラム、記録媒体に関する。
【０００２】
【従来の技術】ディジタルビデオ信号はデータ量が極めて多いため、これを小型で記憶容量の少ない記録媒体に長時間記録したい場合、ビデオ信号を高い圧縮率で効率よく符号化する高能率符号化手段が不可欠となる。このような要求に応えるべく、ビデオ信号の相関を利用した高能率符号化方法が提案されており、その一つにＭＰＥＧ方式がある。
【０００３】
このＭＰＥＧ（Ｍｏｖｉｎｇ　Ｐｉｃｔｕｒｅ　Ｉｍａｇｅ　Ｃｏｄｉｎｇ　Ｅｘｐｅｒｔｓ　Ｇｒｏｕｐ）とは、ＩＳＯ−ＩＥＣ／ＪＴＣ１／ＳＣ２９／ＷＧ１１にて議論され、標準案として提案されたものであり、動き補償予測符号化と離散コサイン変換（ＤＣＴ：Ｄｉｓｃｒｅｔｅ　Ｃｏｓｉｎｅ　Ｔｒａｎｓｆｏｒｍ）符号化とを組み合わせたハイブリッド方式である。このＭＰＥＧ方式では、まずビデオ信号のフレーム間の差分を取ることにより時間軸方向の冗長度を落とし、その後、離散コサイン変換を用いて空間軸方向の冗長度を落とし、このようにしてビデオ信号を能率よく符号化する。また、ＭＰＥＧ方式では、上記動き補償予測符号化を行うために、Ｉピクチャ（Ｉｎｔｒａ符号化画像）、Ｐピクチャ（Ｐｒｅｄｉｃｔｉｖｅ符号化画像）及びＢピクチャ（Ｂｉｄｉｒｅｃｔｉｏｎａｌｌｙ　Ｐｒｅｄｉｃｔｉｖｅ符号化画像）という３つの要素によるＧＯＰ（Ｇｒｏｕｐ　ｏｆ　Ｐｉｃｔｕｒｅ）構造を用いている。
【０００４】
一般に、ビデオ信号は定常的でなく、各ピクチャの情報量は時間経過に伴って変化する。そのため可変ビットレート符号化を用いると同じ符号量で一定ビットレート符号化に比べて高画質が得られることが知られている。
【０００５】
例えば、いわゆるＤＶＤ−ｖｉｄｅｏに記録されるビデオ信号は、２パス方式の可変ビットレート符号化が、一般に用いられている。この２パス方式は、符号量を求めるための符号化処理と、求められた符号量に基づいてビットレートを可変制御しながら行う符号化処理との２度の符号化を行うものであり、使用可能な符号化ビット総量を有効に使うことが出来る利点があるが、処理時間が動画像シーケンスの時間長の約２倍必要という欠点があるため、リアルタイム処理には不向きである。
【０００６】
この処理時間を短くすることを目的とした１パス方式の可変ビットレート符号化方式が、例えば特願平７−３１１４１８号及び特願平９−１１３１４１号等の明細書及び図面等に開示されている。
【０００７】
ここで、図８に、従来の１パス方式の可変ビットレート符号化方法を適用した動画像の符号化装置の構成例を示し、また図９に１パス方式の可変ビットレート符号処理のフローチャートを示す。
【０００８】
符号化装置２は、図８に示すように、符号化難易度計算器２０と、割当ビット量計算器２１と、遅延器２２と、動画像符号化器２３とを備えている。符号化装置２において、入力端子Ａに供給された入力動画像信号は、符号化難易度計算器２０及び遅延器２２に送られる。符号化難易度計算器２０からの出力は、単位時間毎の割当ビット量を計算する割当ビット量計算器２１に送られ、割当ビット量計算器２１からの出力は動画像符号化器２３に送られる。動画像符号化器２３は、遅延器２２からの出力信号を、割当ビット量計算器２１からの割当ビット量に応じて符号化し、端子２０５より符号化ビットストリームとして出力する。
【０００９】
ここで、符号化装置２の動作を、図９のフローチャートに従って説明する。
【００１０】
ステップＳＴ１０において、符号化装置２は、入力端子Ａに供給された動画像信号を符号化難易度計算器２０に入力し、単位時間毎の入力画像の符号化難易度ｄを計算する。上記単位時間は、例えば０．５秒程度とされる。この符号化難易度の計算は、例えば、量子化ステップを固定して入力動画像信号をエンコードして、所定時間毎の発生符号量を計算することにより行われる。
【００１１】
ステップＳＴ１１において、符号化装置２は、割当ビット量計算器２１により、符号化難易度計算器２０から得られた符号化難易度ｄに対する割当ビット量ｂを計算により求める。この場合、予め、基準となる動画像シーケンスを所定の平均ビットレートで可変ビットレート符号化する時の単位時間毎の符号化難易度ｄと割当ビット量ｂを関係付けておく。ここで、基準となる動画像シーケンスに対する単位時間毎の割当ビット量の総和は、目的の記録媒体の記憶容量以下にされている。この符号化難易度ｄと割当ビット量ｂの関係の例を図４に示す。
【００１２】
図４において、横軸は符号化難易度ｄを示し、縦軸は、基準となる動画像シーケンス内で符号化難易度ｄの出現確立ｈ（ｄ）を示している。そして、任意の符号化難易度に対する割当ビット量を関数ｂ（ｄ）に基づいて計算する。この関係は、例えば、映画等の動画像シーケンスを所定の平均ビットレートで符号化する実験を行い、その画質を評価し、試行錯誤を通じて経験的に求められるものであり、世の中のほとんどのシーケンスに適応可能な一般的な関係になっている。その求め方については、例えば、特願平７−３１１４１８号等の明細書及び図面に開示されている。割当ビット量計算器２１では、この図４の関係に基づいて、入力端子Ａからの入力画像の単位時間の符号化難易度ｄに対して、割当ビット量ｂを求める。
【００１３】
この１パス方式の符号化装置２における遅延器２２は、単位時間長の入力画像に対しての符号化難易度計算器２０と割当ビット量計算器２１での処理が単位時間内に終了するので、その画像信号の動画像符号化器２３への入力を単位時間だけ遅延するために設けられている。
【００１４】
また、ステップＳＴ１２において、符号化装置２は、動画像符号化器２３が、単位時間毎の入力動画像を、これに対応して割当ビット量計算器２１から与えられる割当ビット量になるように符号化する。すなわち、動画像符号化器２３は、割当符号量に基づいた量子化ステップサイズにより、単位時間毎の入力動画像をエンコードする。
【００１５】
このような１パス方式においては、画像信号の入力に応じて、ほぼリアルタイムで信号の符号化難易度に応じた最適な割当ビット量での可変ビットレート符号化が行える。
【００１６】
【発明が解決しようとする課題】
ところで、図４の関係は、ほとんどの動画像シーケンスに適用できるが、いくつかの特殊なシーケンスには対応できない。例えば、異なる入力画像信号が時間的に連続するつなぎ目の部分である。なお、このような入力画像信号のつなぎ目をシーンチェンジという。
【００１７】
上記シーンチェンジの部分では、連続した画像から差分を取り出して符号化し、元の画像を再構成する、いわゆる動き補償予測符号化が適用できないため一般的な入力画像信号と同等の画質を得るためには、シーンチェンジの部分により多くのビット量を割り当てるか、シーンチェンジの部分でＭＰＥＧのＧＯＰ構造を変える必要がある。例えば、ＰピクチャをＩピクチャに変更することで、シーンチェンジ時の画質の劣化を防ぐことが可能である。
【００１８】
しかし、ＧＯＰ構造を変更すると一般にそのＧＯＰでは、符号化効率が悪くなるため、単にＧＯＰ構造を変更しただけでは、一般的な入力画像信号と同等の画質を得るのは難しい。特に、１ＧＯＰに低レートのビット量しか割り当てられていないときには、ＰピクチャからＩピクチャに変更するときに必要となるビット量が足りなくなり、ＧＯＰ構造を変更したシーンの画質が劣化してしまう。
【００１９】
そこで、本発明では、シーンチェンジを検出し、ＧＯＰ構造を変更した場合でもシーンチェンジ部分の画質を向上させることが可能な符号化装置及び方法、プログラム、記録媒体を提案することを目的とする。
【００２０】
【課題を解決するための手段】
本発明に係る符号化装置は、上述の問題を解決するために、動画像信号を所定の画像構造に基づき、符号化する符号化装置であって、動画像信号の単位時間ごとの符号化難易度ｄを算出する符号化難易度算出手段と、上記符号化難易度ｄに基づき、上記動画像信号を単位時間内に転送するビット量ｂを算出するビット量算出手段と、動画像信号からシーンチェンジ位置を検出するシーンチェンジ位置検出手段と、上記シーンチェンジ位置に基づき、上記所定の画像構造を変更する画像構造変更手段と、上記ビット量ｂが所定の閾値より低い場合に、上記画像構造変更手段で画像構造の変更を行った位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行うビット量調整手段と、動画像信号を遅延させて出力する遅延手段と、上記画像構造変更手段により変更した画像構造に基づき、上記遅延手段から出力される動画像信号を上記ビット量調整手段の調整に応じて符号化する符号化手段とを備える。
【００２１】
このような符号化装置は、ビット量算出手段で算出したビット量ｂが所定の閾値より低い場合に、ビット量調整手段で画像構造の変更を行った位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行い、符号化手段で変更した画像構造に基づき、動画像信号をビット量調整手段の調整に応じて符号化を行う。
【００２２】
本発明に係る符号化方法は、上述の課題を解決するために、動画像信号を所定の画像構造に基づき、符号化する符号化方法であって、動画像信号の単位時間ごとの符号化難易度ｄを算出し、上記符号化難易度ｄに基づき、上記動画像信号を単位時間内に転送するビット量ｂを算出し、動画像信号からシーンチェンジ位置を検出し、上記シーンチェンジ位置に基づき、上記所定の画像構造を変更し、上記ビット量ｂが所定の閾値より低い場合に、画像構造の変更を行った位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行い、動画像信号を遅延させて出力し、変更した画像構造に基づき、遅延させて出力した動画像信号を上記調整に応じて符号化する。
【００２３】
このような符号化方法は、動画像信号を単位時間内に転送するビット量ｂが所定の閾値より低い場合に、シーンチェンジ位置に基づき変更した画像構造の位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行い、変更した画像構造に基づき、動画像信号を上記調整に応じて符号化する。
【００２４】
本発明に係るコンピュータにより実行させるためのプログラムは、上述の課題を解決するために、動画像信号の単位時間ごとの符号化難易度ｄを算出する工程と、上記符号化難易度ｄに基づき、上記動画像信号を単位時間内に転送するビット量ｂを算出する工程と、動画像信号からシーンチェンジ位置を検出する工程と、上記シーンチェンジ位置に基づき、上記所定の画像構造を変更する工程と、上記ビット量ｂが所定の閾値より低い場合に、画像構造の変更を行った位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行う工程と、動画像信号を遅延させて出力する工程と、変更した画像構造に基づき、遅延させて出力した動画像信号を上記調整に応じて符号化する工程とを有する。
【００２５】
このようなプログラムは、コンピュータにより実行した際には、動画像信号を単位時間内に転送するビット量ｂが所定の閾値より低い場合に、シーンチェンジ位置に基づき変更した画像構造の位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行い、変更した画像構造に基づき、動画像信号を上記調整に応じて符号化する。
【００２６】
本発明に係る記録媒体は、上述の課題を解決するために、動画像信号の単位時間ごとの符号化難易度ｄを算出する工程と、上記符号化難易度ｄに基づき、上記動画像信号を単位時間内に転送するビット量ｂを算出する工程と、動画像信号からシーンチェンジ位置を検出する工程と、上記シーンチェンジ位置に基づき、上記所定の画像構造を変更する工程と、上記ビット量ｂが所定の閾値より低い場合に、画像構造の変更を行った位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行う工程と、動画像信号を遅延させて出力する工程と、変更した画像構造に基づき、遅延させて出力した動画像信号を上記調整に応じて符号化する工程を実行させるためのプログラムを記録したコンピュータ読み取り可能な媒体である。
【００２７】
このような記録媒体を読み取り可能なコンピュータにより実行した際に、動画像信号を単位時間内に転送するビット量ｂが所定の閾値より低い場合に、シーンチェンジ位置に基づき変更した画像構造の位置以前に割り当てるビット量を減少させ、これによって得た余剰のビット量を上記位置以降に割り当てる調整を行い、変更した画像構造に基づき、動画像信号を上記調整に応じて符号化する。
【００２８】
【発明の実施の形態】
以下、本発明の実施の形態について図面を参照しながら詳細に説明する。
【００２９】
まず、本発明の背景について説明する。例えば、光ディスクメディアのひとつである片面一層方式のＤＶＤ（Ｄｉｇｉｔａｌ　Ｖｅｒｓａｔｉｌｅ　Ｄｉｓｃ）は、記録できるデータ量がＣＤの約７倍の約４．７ＧＢを実現した大容量メディアである。しかし、映像信号を何の処理も行わずにディジタル化し、記録しようとすると、長時間の記録ができない。例えば、ＮＴＳＣ方式の映像信号をディジタル化すると、１秒当たりのデータ量は、２０ＭＢ以上になる。なお、２０ＭＢは、７２０×４８０画素の画面を１秒に２９．９７枚、１画素あたり、輝度に８ビット、色に８ビットを与えるものとして計算した値である。
【００３０】
したがって、ＮＴＳＣ方式の映像信号を上記ＤＶＤに記録しようとすると、約４分程度の映像しか記録することができない。そこで、ＤＶＤでは、約１３３分の映像を記録できるようにＮＴＳＣ方式の映像信号を圧縮するＭＰＥＧ（ＭｏｖｉｎｇＰｉｃｔｕｒｅ　Ｉｍａｇｅ　Ｃｏｄｉｎｇ　Ｅｘｐｅｒｔｓ　Ｇｒｏｕｐ）２を採用している。
【００３１】
ＭＰＥＧ２による動画像信号の圧縮方法は、画面内の相関を利用した圧縮である離散コサイン変換（ＤＣＴ、Ｄｉｓｃｒｅｔｅ　Ｃｏｓｉｎｅ　Ｔｒａｎｓｆｏｒｍ）と、画面間の相関を利用した動き補償と、符号列の相関を利用した圧縮であるハフマン符号化の３つを組み合わせたものである。
【００３２】
ここで、動き補償について説明する。ＮＴＳＣ方式の映像信号は、１秒当たり３０フレームで構成されており、ちらつきを無くすためにこれをインターレース（飛び越し走査）方式でスキャンする。そのため、ディスプレイ装置では、１秒当たり６０フィールドで画面を表示している。なお、３０フレームは、５２５本の走査線から形成される３０枚の絵としている。
【００３３】
１秒あたり３０枚の絵のそれぞれについて見ると、連続した画像は、画面中の大きな面積が同一の要素で占められていることが多く、しかも、それは、短時間のうちには余り変化しない。
【００３４】
そこで、例えば、３０枚の絵のうちで変化する部分がある場合、その変化分（差分）のみを取り出して、その他の部分は１回だけ記録するようにする。そして、再生時にその差分を合成すれば、記録・再生のために必要な情報量は、少なくて済むことになる。このような、連続した画像から差分を取り出して符号化しもとの画像を再構成する手法は、１枚前の画像から現在の画像を予測するため予測符号化と呼ばれている。
【００３５】
また、画像の中の変化する部分、すなわち動きのある部分には、形が変わらず時間とともにただ画面上の位置が変わっていくものと、時間とともに形が変わっていくものとがある。前者の場合には、形のデータは、そのまま使うことが可能である。形が変わらずに時間とともに画面上の位置が変わっていく要素について、その変化分、すなわち、動きのずれの量を動きベクトルと呼ぶ。この動きベクトルを符号化し、符号化した信号を伝送すれば、より少ないデータ量で元の画像を再構成することができる。なお、予測符号化の中でこの動きベクトルを利用する方法を動き補償と呼ぶ。
【００３６】
つぎに、上述した予測符号化と動き補償の処理について図１を用いて説明する。予測符号化の処理では、例えば、時間の流れにしたがって再生されるべき画像Ａ及び画像Ｂがあった場合に、画像Ａ及び画像Ｂの内容に共通要素が多いので、画像Ａ及び画像Ｂをそのまま符号化するのではなく、画像Ａと画像Ｂからその差分である画像Ｃを生成し、生成した画像Ｃを画像Ａとともに符号化する。このように符号化された画像Ａ及び画像Ｃは、復号時に、画像Ａと画像Ｃを合成して画像Ｂを再構築することができる。なお、画像Ｃは、予測符号化画像と呼ばれている。
【００３７】
また、動き補償の処理では、画像をブロックに分割して、形が変わらず位置だけが変化した部分から、どの方向にどれだけ動いたかを示す動きベクトルを取り出して符号化を行う。上記動き補償と予測符号化と組み合わせることにより、効率良くデータの圧縮を行うことが可能となる。
【００３８】
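In MPEG2, in order to perform predictive coding with motion compensation, a GOP (Group of Pictures) structure composed of three elements is used, as shown in FIG. 2: the I picture (intra-coded picture), the P picture (predictive-coded picture), and the B picture (bidirectionally predictive-coded picture). An I picture is produced by intra-frame coding and is not predicted from a preceding picture. A P picture is an inter-frame forward-predictive-coded picture produced by predictive coding from the preceding picture, and is derived from an I picture. A B picture is a bidirectionally predictive-coded picture produced by prediction from the two P pictures before and after it. In general, one GOP is formed of a group of 15 pictures: one I picture, four P pictures, and ten B pictures. The bit amount needed to produce an I picture is 5 to 6 times that needed for a B picture, and the bit amount needed to produce a P picture is 2 to 3 times that needed for a B picture.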
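The GOP composition and relative picture costs described in paragraph [0038] can be illustrated with a small budget split. This is only a sketch: the source gives ranges (I is 5 to 6 times B, P is 2 to 3 times B), so the midpoint weights, the function name `split_gop_budget`, and the example budget are assumptions made for illustration.

```python
# Relative bit cost per picture type, normalised so that B = 1.
# The midpoints of the ranges in the text are an assumption.
WEIGHTS = {"I": 5.5, "P": 2.5, "B": 1.0}


def split_gop_budget(gop_bits, counts):
    """Return the bits given to ONE picture of each type in a GOP."""
    total_weight = sum(WEIGHTS[t] * n for t, n in counts.items())
    return {t: gop_bits * WEIGHTS[t] / total_weight for t in counts}


def gop_weight(counts):
    """Total relative cost of a GOP with the given picture counts."""
    return sum(WEIGHTS[t] * n for t, n in counts.items())


# Standard 15-picture GOP: 1 I, 4 P, 10 B.
standard = {"I": 1, "P": 4, "B": 10}
# Same GOP after one P picture has been changed to an I picture.
changed = {"I": 2, "P": 3, "B": 10}

per_picture = split_gop_budget(2_550_000, standard)
extra_cost_ratio = gop_weight(changed) / gop_weight(standard)
```

With these assumed weights, changing one P picture to an I picture raises the relative cost of the GOP, which is why a GOP that was given few bits can run short after the change.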
【００３９】
また、一般に、映像信号は定常的でなく、各ピクチャの情報量は時間経過に伴って変化する。そのため、可変ビットレート符号化を用いると、同じ符号量で符号化する固定ビットレート符号化に比べて高画質が得られることが知られている。
【００４０】
また、ＤＶＤに記録される映像信号は、２パス方式と呼ばれる可変ビットレート符号化が、一般に用いられている。この２パス方式は、符号量を求めるための符号化処理と、求められた符号量に基づいてビットレートを可変制御しながら符号化を行う符号化処理の２度の符号化処理を行うものであり、使用可能な符号化ビットの総量を有効に使うことが出来る利点がある。しかし、処理時間が動画像シーケンスの時間長の約２倍必要となる欠点があるため、リアルタイム処理には不向きである。
【００４１】
上記リアルタイム処理には、符号量の計算と、上記符号量に基づいてビットレートを可変制御することを１度に行い符号化する必要がある。このような処理を１パス方式という。本発明を適用した符号化装置１は、上記１パス方式を適用したものであり、入力された動画像信号からシーンチェンジ位置を検出し、上記動画像信号の単位時間ごとに割り当てるビット量が所定の閾値より低い場合に、上記シーンチェンジ位置に基づき決定される所定の位置以前に本来割り立てる予定だったビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整を行い、上記調整に応じて上記動画像信号を符号化する装置である。以下に、符号化装置１の構成と動作について詳述する。
【００４２】
符号化装置１は、図３に示すように、符号化難易度計算器１０と、割当ビット量計算器１１と、シーンチェンジ位置検出器１２と、遅延器１３と、動画像符号化器１４とを備える。符号化難易度計算器１０は、入力端子から供給された動画像信号を単位時間ごとに符号化難易度ｄを計算し、算出した符号化難易度ｄを割当ビット量計算器１１に出力する。上記単位時間は、０．５秒又は１ＧＯＰの経過時間とする。また、符号化難易度計算器１０は、量子化ステップを固定して入力された動画像信号をエンコードし、所定時間ごとの発生符号量を計算することにより符号化難易度ｄを計算している。なお、符号化難易度計算器１０は、動きが多いシーンには、符号化難易度ｄを高めに計算し、動きが少ないシーンには、符号化難易度ｄを低めに計算する。
【００４３】
割当ビット量計算器１１は、入力した符号化難易度ｄに対する割当ビット量ｂを計算により求める。この場合、予め、基準となる動画像シーケンスを所定の平均ビットレートで可変ビットレート符号化する時の単位時間毎の符号化難易度ｄと割当ビット量ｂを関係付けておく。ここで、基準となる動画像シーケンスに対する単位時間毎の割当ビット量の総和は、目的の記録媒体の記憶容量以下にされている。この符号化難易度ｄと割当ビット量ｂの関係の例を図４に示す。
【００４４】
この図４において、横軸は符号化難易度ｄを示し、縦軸は、基準となる動画像シーケンス内で符号化難易度ｄの出現確立ｈ（ｄ）を示している。そして、任意の符号化難易度に対する割当ビット量を関数ｂ（ｄ）に基づいて計算する。この関係は、多くの動画像シーケンス（例えば映画）を所定の平均ビットレートで符号化する実験を行い、その画質を評価し、試行錯誤を通じて経験的に求められるものであり、世の中のほとんどのシーケンスに適応可能な一般的な関係になっている。割当ビット量計算器１１は、図４の関係に基づき、動画像信号の単位時間当たりの符号化難易度ｄに対して、割当ビット量ｂを算出する。割当ビット量計算器１１は、符号化難易度ｄが高い場合には、符号化するレートを高めることにより符号化によるノイズを低減する。なお、符号化難易度ｄが低い場合には、符号化によるノイズがあまり発生しないため、符号化するレートを低めに計算する。こうすることにより、動画像信号全体の画質は、均一になり、単位時間ごとの平均的なビット量が小さくなる。
【００４５】
しかし、従来の符号化装置では、３０分内又は１時間内で単位時間当たりのビット量を低く制御するアルゴリズムであるため、最初の数分間は特に制御をせずに多くのビット量を割り当て、後半に少ないビット量を割り当てる処理をすることがある。このような制御だと、例えば、動画像信号を１分間記録して、停止し、再び１分間記録するという細切れの再生動作を繰り返すと、多くのビット量が割り当てられてしまい、単位時間当たりの平均ビット量が高くなってしまう問題がある。本発明に係る符号化装置１は、特に細切れ再生動作を繰り返した場合において、短時間当たりの平均ビット量を低く制御できるものである。割当ビット量計算器１１は、例えば、２．８Ｍｂｐｓ〜１５Ｍｂｐｓ程度のビット量ｂを割り当てる。
【００４６】
また、割当ビット量計算器１１は、計算したビット量ｂが所定の閾値よりも低く、後述するシーンチェンジ位置検出器１２からシーンチェンジがあった旨の信号が供給されたとき、ビット量ｂをシーンチェンジ位置の前後で調整し、調整結果を動画像符号化器１４に出力する。なお、割当ビット量計算器１１による調整については、後述する。
【００４７】
シーンチェンジ位置検出器１２は、入力端子から供給された動画像信号からシーンチェンジ位置を検出し、検出した旨の信号を生成し、割当ビット量計算器１１及び動画像符号化器１４に出力する。
【００４８】
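The specific operation of the scene change position detector 12 is as follows. The scene change position detector 12 detects a scene change position by detecting a large change in a parameter that characterizes each frame of the moving image signal. Any detector that can detect changes in signals such as the luminance level or chroma level of the picture is therefore sufficient. For example, the scene change position detector 12 detects the scene change position using the change in average luminance level, the amount of difference between frames, or the amount of level fluctuation between frames.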
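One of the cues named in paragraph [0048], the change in average luminance level, can be sketched as follows. The frame representation (a flat list of luma samples), the function name `detect_scene_changes`, and the threshold value are assumptions for illustration; the source does not specify them.

```python
def detect_scene_changes(frames, threshold=30.0):
    """Return indices of frames whose mean luma jumps by more than threshold.

    frames: list of frames, each a non-empty list of luma samples (0-255).
    threshold: hypothetical jump in mean luma that counts as a scene change.
    """
    changes = []
    prev_mean = None
    for index, frame in enumerate(frames):
        mean = sum(frame) / len(frame)
        if prev_mean is not None and abs(mean - prev_mean) > threshold:
            changes.append(index)  # first frame of the new scene
        prev_mean = mean
    return changes
```

A practical detector would likely combine several cues, since a cut between two scenes of similar brightness would not be caught by mean luminance alone.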
【００４９】
つぎに、上述したシーンチェンジ位置検出器１２によるシーンチェンジ位置の検出を図５に示すフローチャートを用いて説明する。
【００５０】
ステップＳＴ１において、シーンチェンジ位置検出器１２は、入力された動画像信号からシーンチェンジ位置の検出をしたがどうかを判定する。シーンチェンジ位置を検出した場合には、ステップＳＴ２に進む。
【００５１】
ステップＳＴ２において、シーンチェンジ位置検出器１２は、シーンチェンジ位置を検出した旨の信号を生成し、動画像符号化器１４に出力する。動画像符号化器１４は、シーンチェンジ位置に応じてＧＯＰ構造を変更する。
【００５２】
ステップＳＴ３において、シーンチェンジ位置検出器１２は、シーンチェンジ位置を検出した旨の信号を割当ビット量計算器１１に出力する。割当ビット量計算器１１は、シーンチェンジ位置に応じてシーンチェンジ位置の前後に割り当てるビット量を調整する。
【００５３】
遅延器１３は、少なくとも、符号化難易度計算器１０と割当ビット量計算器１１により動画像信号の単位時間あたりのビット量ｂを計算する時間分だけ、動画像信号を一時記憶する。遅延器１３は、動画像信号を一時記憶した後、動画像信号を動画像符号化器１４に供給する。
【００５４】
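The moving image encoder 14 encodes the supplied moving image signal for each unit time in accordance with the bit amount b supplied from the allocated bit amount calculator 11. When a signal indicating that a scene change has occurred is supplied from the scene change position detector 12, the moving image encoder 14 changes the GOP structure. Specifically, as shown in FIG. 6, it changes the P picture closest in the time direction to the scene change position into an I picture. When the scene change occurs at position A, the P picture (P5) is changed to an I picture (I5); when it occurs at position B, the P picture (P8) is changed to an I picture (I8); and when it occurs at position C, the P picture (P11) is changed to an I picture (I11).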
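The GOP change in paragraph [0054] can be sketched as below. FIG. 6 is not reproduced here, so two things are assumptions: the display-order layout of the 15-picture GOP (chosen to be consistent with the labels I2, P5, P8, and P11 used in the text) and the reading of "closest P picture" as the first P picture at or after the scene change.

```python
# Assumed display-order layout of one 15-picture GOP.
GOP = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7",
       "P8", "B9", "B10", "P11", "B12", "B13", "P14"]


def promote_p_to_i(gop, scene_change_index):
    """Return a copy of gop with the first P picture at or after
    scene_change_index relabelled as an I picture."""
    new_gop = list(gop)
    for i in range(scene_change_index, len(new_gop)):
        if new_gop[i].startswith("P"):
            new_gop[i] = "I" + new_gop[i][1:]
            break
    return new_gop
```

For example, under these assumptions a scene change at index 3 turns P5 into I5, while one at index 9 turns P11 into I11.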
【００５５】
上述したように、ＰピクチャからＩピクチャに変更すると、２〜３倍のビット量が必要となる。単位時間に割り当てられているビット量が所定の閾値よりも高い場合には、問題はないが、ビット量が所定の閾値よりも低い場合には、ＧＯＰ構造の変更によりビット量が足りなくなり、画質が劣化してしまう。そこで、割当ビット量計算器１１では、動画像信号の単位時間に割り当てるビット量ｂが所定の閾値よりも低いときに、シーンチェンジ位置の検出結果に応じて、シーンチェンジ位置に基づき決定される所定の位置以前に割り当てるビット量と上記所定の位置以降に割り当てるビット量とを調整する。
【００５６】
また、図７に示すように、ＰピクチャからＩピクチャに変更せずに、例えば、２ＧＯＰ内でピクチャの移動によりシーンチェンジに対応することができる。この場合には、２ＧＯＰ内でＰピクチャ（Ｐ１１）からＩピクチャ（Ｉ２）の位置を移動するだけなので、２ＧＯＰで必要とするビット量には大きな変化はないが、シーンチェンジ自体のエンコードが難しく、ビット量が多めに必要であるために、却ってシーチェンジ後に通常よりも多くビット量を割り振る必要が生じる。
【００５７】
ここで、割当ビット量計算器１１による調整作業について以下に述べる。割当ビット量計算器１１は、計算したビット量ｂが所定の閾値よりも低いときには、以下の調整を行う。上述したようなシーンチェンジが起こると、予測により符号化した画像が使用できず、シーンチェンジ後の画質が劣化することがある。そこで、後述する動画像符号化器１４では、シーンチェンジ位置検出器１２からシーンチェンジがあった旨の信号が供給されたときに、上記シーンチェンジに基づき、ＧＯＰ構造を変更する処理を行う。こうすることにより、シーンチェンジ時の画質の劣化を防ぐことができる。
【００５８】
しかし、ＧＯＰ構造を変更すると、一般的にそのＧＯＰでは符号化効率が悪くなり、単にＧＯＰ構造を変更しただけでは、一般的な動画像信号と同等の画質を得ることは難しい。そこで、割当ビット量計算器１１は、シーンチェンジ位置検出器１２からシーンチェンジがあった旨の信号が供給された場合に、シーンチェンジ位置に基づき決定される所定の位置以前に割り当てられているビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整を行い、調整結果を動画像符号化器１４に供給する。動画像符号化器１４は、供給された調整結果に基づき、遅延器１３から出力された動画像信号を符号化する。
【００５９】
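For example, let R_pre be the bit amount originally planned for the moving image signal before the predetermined position determined from the scene change position, and let SC_rate be the bit-rate reduction ratio. The allocated bit amount calculator 11 then adjusts the bit amount R_pre' allocated to the moving image signal before the predetermined position according to
R_pre' = R_pre × SC_rate
and, letting R_post be the bit amount originally planned for the moving image signal after the predetermined position, adjusts the bit amount R_post' allocated to the moving image signal after that position according to
R_post' = R_post + (R_pre - R_pre')
【００６０】
The reduction ratio SC_rate varies with the scene change position. For example, SC_rate = 0.6 when the scene change occurs at position A shown in FIG. 6, SC_rate = 0.7 when it occurs at position B, and SC_rate = 0.8 when it occurs at position C.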
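The adjustment in paragraphs [0059] and [0060] can be sketched as a small function. The formulas and the SC_rate values for positions A, B, and C come from the text; the function name `reallocate_bits`, its parameters, and the example numbers are hypothetical.

```python
# Reduction ratio per scene change position, as given in paragraph [0060].
SC_RATE_BY_POSITION = {"A": 0.6, "B": 0.7, "C": 0.8}


def reallocate_bits(r_pre, r_post, position, b, threshold):
    """Shift bits from before the changed position to after it.

    r_pre, r_post: bits originally planned before and after the position.
    position: "A", "B", or "C" (scene change position in FIG. 6).
    b: bit amount calculated for the unit time.
    threshold: the predetermined threshold (value not given in the source).
    """
    if b >= threshold:
        # The text describes the adjustment only for b below the threshold.
        return r_pre, r_post
    sc_rate = SC_RATE_BY_POSITION[position]
    r_pre_adj = r_pre * sc_rate                  # R_pre' = R_pre x SC_rate
    r_post_adj = r_post + (r_pre - r_pre_adj)    # R_post' = R_post + (R_pre - R_pre')
    return r_pre_adj, r_post_adj
```

The total R_pre' + R_post' equals R_pre + R_post, so the adjustment moves bits across the scene change without changing the overall budget.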
【００６１】
動画像符号化器１４は、割当ビット検出器から供給される調整結果に応じて動画像信号に符号化を行う。
【００６２】
このように構成された符号化装置１は、割当ビット量計算器１１の計算により所定の閾値よりも低いビット量ｂを算出した場合、シーンチェンジ位置検出器１２で検出したシーンチェンジ位置に基づき、シーンチェンジ位置のビット量を調整し、シーンチェンジ位置の画像構造を変更し、変更した画像構造にしたがって上記調整に基づき動画像信号を符号化するので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行うことができ、割り当てられたビット量ｂの少ない単位時間内に生じたシーンチェンジによる画質の劣化を防ぐことができる。
【００６３】
なお、本発明の実施の形態は、上述例に限らず、コンピュータにより実行されりプログラムとしても良いし、上記プログラムを記録したコンピュータで読み取り可能な記録媒体としても良い。
【００６４】
【発明の効果】
以上詳細に説明したように、本発明に係る符号化装置は、ビット量算出手段で符号化難易度ｄに基づき、所定の閾値よりも低いビット量ｂを算出した場合、ビット量調整手段でシーンチェンジ位置に基づき決定される所定の位置以前に割り当てるビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整をし、符号化手段で上記調整に基づき、シーンチェンジ位置に基づき画像構造を変更し、変更した画像構造にしたがって動画像信号を符号化するので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行うことができ、割り当てられたビット量ｂの少ない単位時間内に生じたシーンチェンジによる画質の劣化を防ぐことができる。
【００６５】
また、本発明に係る符号化方法は、算出した符号化難易度ｄに基づき、所定の閾値よりも低いビット量ｂを算出した場合、シーンチェンジ位置に基づき決定される所定の位置以前に割り当てるビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整をし、上記調整に基づき、シーンチェンジ位置に基づき画像構造を変更し、変更した画像構造にしたがって動画像信号を符号化するので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行うことができ、割り当てられたビット量ｂの少ない単位時間内に生じたシーンチェンジによる画質の劣化を防ぐことができる。
【００６６】
また、本発明に係るプログラムは、コンピュータにより、算出した符号化難易度ｄに基づき、所定の閾値よりも低いビット量ｂを算出した場合、シーンチェンジ位置に基づき決定される所定の位置以前に割り当てるビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整をし、上記調整に基づき、シーンチェンジ位置に基づき画像構造を変更し、変更した画像構造にしたがって動画像信号を符号化するので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行う工程を実行させるので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行うことができ、割り当てられたビット量ｂの少ない単位時間内に生じたシーンチェンジによる画質の劣化を防ぐことができる。
【００６７】
さらに、本発明に係る記録媒体は、算出した符号化難易度ｄに基づき、所定の閾値よりも低いビット量ｂを算出した場合、シーンチェンジ位置に基づき決定される所定の位置以前に割り当てるビット量を減少し、これによって得た余剰のビット量を上記所定の位置以降に割り当てる調整をし、上記調整に基づき、シーンチェンジ位置に基づき画像構造を変更し、変更した画像構造にしたがって動画像信号を符号化するので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行う工程を実行させるためのプログラムを記録したコンピュータ読み取り可能な媒体であるので、シーンチェンジ位置の画像構造を変更しても符号化効率のよい符号化を行うことができ、割り当てられたビット量ｂの少ない単位時間内に生じたシーンチェンジによる画質の劣化を防ぐことができる。
【図面の簡単な説明】
【図１】予測符号化の処理により非予測符号化画像から予測符号化画像を生成する様子を示す図である。
【図２】ＧＯＰ構造を示す構造図である。
【図３】本発明を適用した符号化装置の構成例を示すブロック図である。
【図４】符号化難易度ｄに対するマクロブロックの出現確率ｈ（ｄ）と、割当符号量ｂとを示す分布図である。
【図５】本発明を適用した符号化装置が備えるシーンチェンジ位置検出器によるシーンチェンジの検出動作を示すフローチャートである。
【図６】シーンチェンジ位置の検出に基づき、ＧＯＰ構造を変更する様子を示す図である。
【図７】２ＧＯＰ構造内でピクチャの移動を行う場合の図である。
【図８】従来の符号化装置の構成例を示すブロック図である。
【図９】従来の１パス方式の可変ビットレート符号化処理を説明するためのフローチャートである。
【符号の説明】
１　符号化装置、１０　符号化難易度計算器、１１　割当ビット量計算器、１２　シーンチェンジ位置検出器、１３　遅延器、１４　動画像符号化器
[0001]
BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an encoding apparatus and method, a program, and a recording medium for high-efficiency encoding, which encodes digital signals efficiently. More particularly, it relates to an encoding apparatus and method, a program, and a recording medium that, in encoding a moving image signal, control the variable bit rate allocation and perform the encoding in a one-pass scheme.
[0002]
2. Description of the Related Art Because a digital video signal contains a very large amount of data, high-efficiency encoding means that encode the video signal efficiently at a high compression ratio are indispensable when the signal is to be recorded for a long time on a small recording medium with limited storage capacity. To meet this demand, high-efficiency coding methods that exploit the correlation of video signals have been proposed, one of which is the MPEG method.
[0003]
MPEG (Moving Picture Experts Group) was discussed in ISO/IEC JTC1/SC29/WG11 and proposed as a draft standard. It is a hybrid scheme that combines motion-compensated predictive coding with discrete cosine transform (DCT) coding. In the MPEG scheme, redundancy along the time axis is first reduced by taking the difference between frames of the video signal, and redundancy along the spatial axes is then reduced using the discrete cosine transform, so that the video signal is encoded efficiently. To perform the motion-compensated predictive coding, the MPEG scheme uses a GOP (Group of Pictures) structure made up of three elements: the I picture (intra-coded picture), the P picture (predictive-coded picture), and the B picture (bidirectionally predictive-coded picture).
[0004]
Generally, a video signal is not stationary, and the information amount of each picture changes with time. Therefore, it is known that higher image quality can be obtained by using variable bit rate coding than by constant bit rate coding with the same code amount.
[0005]
For example, video signals recorded on so-called DVD-Video discs are generally encoded with two-pass variable bit rate encoding. The two-pass method encodes twice: a first pass to measure the code amount, and a second pass that variably controls the bit rate based on the measured code amount. It has the advantage of using the total available coded bits effectively, but because the processing time is about twice the length of the moving image sequence, it is unsuitable for real-time processing.
[0006]
One-pass variable bit rate coding schemes aimed at shortening this processing time are disclosed, for example, in the specifications and drawings of Japanese Patent Application Nos. Hei 7-311418 and Hei 9-113141.
[0007]
FIG. 8 shows a configuration example of a moving image encoding apparatus to which a conventional one-pass variable bit rate encoding method is applied, and FIG. 9 shows a flowchart of the one-pass variable bit rate encoding process.
[0008]
As shown in FIG. 8, the encoding device 2 includes an encoding difficulty calculator 20, an allocated bit amount calculator 21, a delay unit 22, and a moving image encoder 23. In the encoding device 2, the input moving image signal supplied to the input terminal A is sent to the encoding difficulty calculator 20 and the delay unit 22. The output of the encoding difficulty calculator 20 is sent to the allocated bit amount calculator 21, which calculates the allocated bit amount per unit time, and the output of the allocated bit amount calculator 21 is sent to the moving image encoder 23. The moving image encoder 23 encodes the output signal of the delay unit 22 according to the allocated bit amount from the allocated bit amount calculator 21 and outputs the result from the terminal 205 as an encoded bit stream.
[0009]
The operation of the encoding device 2 will now be described with reference to the flowchart of FIG. 9.
[0010]
In step ST10, the encoding device 2 inputs the video signal supplied to the input terminal A to the encoding difficulty calculator 20, and calculates the encoding difficulty d of the input image per unit time. The unit time is, for example, about 0.5 seconds. The calculation of the encoding difficulty is performed, for example, by fixing the quantization step, encoding the input moving image signal, and calculating the generated code amount for each predetermined time.
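Step ST10 can be sketched as summing the bits generated by a fixed-quantization trial encode over each unit time. The trial encode itself is not shown; `bits_per_frame` is assumed to be its per-frame output, and the function name and the choice of 15 frames per unit (roughly 0.5 seconds at about 30 frames per second) are illustrative assumptions.

```python
def encoding_difficulty(bits_per_frame, frames_per_unit=15):
    """Return one difficulty value d per unit time.

    bits_per_frame: bits generated for each frame by a trial encode
        with a fixed quantization step (assumed to be given).
    frames_per_unit: number of frames in one unit time.
    """
    return [
        sum(bits_per_frame[start:start + frames_per_unit])
        for start in range(0, len(bits_per_frame), frames_per_unit)
    ]
```

Because the quantization step is fixed, a unit that produces more bits is one that is harder to encode, which is what makes the generated code amount usable as d.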
[0011]
In step ST11, the encoding device 2 uses the allocated bit amount calculator 21 to calculate the allocated bit amount b for the encoding difficulty d obtained from the encoding difficulty calculator 20. For this purpose, the relationship between the encoding difficulty d per unit time and the allocated bit amount b is established in advance for the case where a reference moving image sequence is variable-bit-rate encoded at a predetermined average bit rate. The sum of the allocated bit amounts per unit time over the reference moving image sequence is set to be no more than the storage capacity of the target recording medium. FIG. 4 shows an example of the relationship between the encoding difficulty d and the allocated bit amount b.
[0012]
In FIG. 4, the horizontal axis indicates the encoding difficulty d, and the vertical axis indicates the probability of occurrence h(d) of the encoding difficulty d within the reference moving image sequence. The allocated bit amount for an arbitrary encoding difficulty is calculated based on the function b(d). This relationship is obtained empirically, through trial and error, by experimentally encoding moving image sequences such as movies at a predetermined average bit rate and evaluating the resulting image quality; it is a general relationship applicable to almost all sequences. How it is obtained is disclosed, for example, in the specification and drawings of Japanese Patent Application No. Hei 7-311418. Based on the relationship of FIG. 4, the allocated bit amount calculator 21 obtains the allocated bit amount b for the encoding difficulty d per unit time of the image input from the input terminal A.
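The lookup of b from d can be sketched as interpolation over a precomputed curve. The actual b(d) is empirical and is not given in this text, so the control points below are invented purely for illustration, as are the names `CURVE` and `allocated_bits`.

```python
from bisect import bisect_right

# Hypothetical (difficulty d, allocated bits b) control points.
# The real b(d) is obtained empirically and is not reproduced here.
CURVE = [(0.0, 1.4e6), (1.0e6, 2.0e6), (4.0e6, 5.0e6), (8.0e6, 7.5e6)]


def allocated_bits(d, curve=CURVE):
    """Piecewise-linear b(d), clamped to the ends of the curve."""
    xs = [x for x, _ in curve]
    if d <= xs[0]:
        return curve[0][1]
    if d >= xs[-1]:
        return curve[-1][1]
    i = bisect_right(xs, d)  # first control point strictly above d
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (d - x0) / (x1 - x0)
```

A monotonically increasing curve matches the stated intent that harder units receive more bits; clamping at both ends keeps the rate within a fixed range.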
[0013]
The delay unit 22 in the one-pass encoding device 2 is provided to delay the input of the image signal to the moving image encoder 23 by one unit time, because the processing by the encoding difficulty calculator 20 and the allocated bit amount calculator 21 for an input image of unit-time length finishes within the unit time.
[0014]
In step ST12, the moving image encoder 23 of the encoding device 2 encodes the input moving image for each unit time so that it meets the corresponding allocated bit amount given by the allocated bit amount calculator 21. That is, the moving image encoder 23 encodes the input moving image for each unit time using a quantization step size based on the allocated code amount.
[0015]
In such a one-pass system, variable bit rate encoding can be performed in almost real time with an optimal allocated bit amount according to the signal encoding difficulty in accordance with the input of an image signal.
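The flow of steps ST10 to ST12, including the one-unit delay, can be sketched as a generator. The three callables stand in for the encoding difficulty calculator, the allocated bit amount calculator, and the moving image encoder; their names and signatures are assumptions for illustration.

```python
from collections import deque


def one_pass_encode(units, difficulty, allocate, encode):
    """Yield encoded units, each delayed by one unit time.

    units: iterable of unit-time chunks of the input signal.
    difficulty(unit) -> d, allocate(d) -> b, encode(unit, b) -> output.
    """
    pending = deque()  # plays the role of the delay unit
    for unit in units:
        # Analysis stage: difficulty and bit allocation for the new unit.
        pending.append((unit, allocate(difficulty(unit))))
        if len(pending) > 1:
            # Encoding stage works on the unit that arrived one step earlier.
            yield encode(*pending.popleft())
    while pending:  # flush the last delayed unit
        yield encode(*pending.popleft())
```

Each unit is analysed once and encoded once, which is what distinguishes this from the two-pass method, where the whole sequence is encoded twice.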
[0016]
[Problems to be solved by the invention]
The relationship in FIG. 4 can be applied to most moving image sequences, but it cannot handle some special sequences. One example is the joint at which different input image signals follow one another in time. Such a joint between input image signals is called a scene change.
[0017]
At a scene change, so-called motion-compensated predictive coding, which extracts and encodes the difference between consecutive images and reconstructs the original image from it, cannot be applied. Therefore, to obtain image quality equivalent to that of a general input image signal, it is necessary either to allocate more bits to the scene change portion or to change the MPEG GOP structure at the scene change portion. For example, changing a P picture to an I picture can prevent the image quality from deteriorating at a scene change.
[0018]
However, when the GOP structure is changed, generally, the coding efficiency of the GOP deteriorates. Therefore, simply changing the GOP structure makes it difficult to obtain image quality equivalent to that of a general input image signal. In particular, when only a low-rate bit amount is allocated to one GOP, the bit amount required when changing from a P picture to an I picture becomes insufficient, and the image quality of a scene in which the GOP structure is changed deteriorates.
[0019]
Therefore, an object of the present invention is to propose an encoding apparatus and method, a program, and a recording medium that can detect a scene change and improve the image quality of a scene change portion even when the GOP structure is changed.
[0020]
[Means for Solving the Problems]
To solve the above problem, an encoding apparatus according to the present invention encodes a moving image signal based on a predetermined image structure and comprises: encoding difficulty calculating means for calculating an encoding difficulty d for each unit time of the moving image signal; bit amount calculating means for calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time; scene change position detecting means for detecting a scene change position from the moving image signal; image structure changing means for changing the predetermined image structure based on the scene change position; bit amount adjusting means for, when the bit amount b is lower than a predetermined threshold, reducing the bit amount allocated before the position at which the image structure changing means changed the image structure and allocating the surplus bit amount thus obtained after that position; delay means for delaying and outputting the moving image signal; and encoding means for encoding the moving image signal output from the delay means, based on the image structure changed by the image structure changing means and in accordance with the adjustment made by the bit amount adjusting means.
[0021]
In such an encoding apparatus, when the bit amount b calculated by the bit amount calculating means is lower than the predetermined threshold, the bit amount adjusting means reduces the bit amount allocated before the position at which the image structure was changed and allocates the surplus bit amount thus obtained after that position, and the encoding means encodes the moving image signal based on the changed image structure and in accordance with that adjustment.
[0022]
To solve the above problem, an encoding method according to the present invention encodes a moving image signal based on a predetermined image structure and comprises: calculating an encoding difficulty d for each unit time of the moving image signal; calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time; detecting a scene change position from the moving image signal; changing the predetermined image structure based on the scene change position; when the bit amount b is lower than a predetermined threshold, reducing the bit amount allocated before the position at which the image structure was changed and allocating the surplus bit amount thus obtained after that position; delaying and outputting the moving image signal; and encoding the delayed moving image signal based on the changed image structure and in accordance with the adjustment.
[0023]
In such an encoding method, when the bit amount b for transferring the moving image signal within the unit time is lower than the predetermined threshold, the bit amount allocated before the position at which the image structure was changed based on the scene change position is reduced, the surplus bit amount thus obtained is allocated after that position, and the moving image signal is encoded based on the changed image structure and in accordance with the adjustment.
[0024]
To solve the above problem, a program according to the present invention causes a computer to execute the steps of: calculating an encoding difficulty d for each unit time of a moving image signal; calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time; detecting a scene change position from the moving image signal; changing the predetermined image structure based on the scene change position; when the bit amount b is lower than a predetermined threshold, reducing the bit amount allocated before the position at which the image structure was changed and allocating the surplus bit amount thus obtained after that position; delaying and outputting the moving image signal; and encoding the delayed moving image signal based on the changed image structure and in accordance with the adjustment.
[0025]
When such a program is executed by a computer and the bit amount b for transferring the moving image signal within the unit time is lower than the predetermined threshold, the bit amount allocated before the position at which the image structure was changed based on the scene change position is reduced, the surplus bit amount thus obtained is allocated after that position, and the moving image signal is encoded based on the changed image structure and in accordance with the adjustment.
[0026]
To solve the above problem, a recording medium according to the present invention is a computer-readable medium on which is recorded a program for executing the steps of: calculating an encoding difficulty d for each unit time of a moving image signal; calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time; detecting a scene change position from the moving image signal; changing the predetermined image structure based on the scene change position; when the bit amount b is lower than a predetermined threshold, reducing the bit amount allocated before the position at which the image structure was changed and allocating the surplus bit amount thus obtained after that position; delaying and outputting the moving image signal; and encoding the delayed moving image signal based on the changed image structure and in accordance with the adjustment.
[0027]
When the program on such a recording medium is executed by a computer able to read it and the bit amount b for transferring the moving image signal within the unit time is lower than the predetermined threshold, the bit amount allocated before the position at which the image structure was changed based on the scene change position is reduced, the surplus bit amount thus obtained is allocated after that position, and the moving image signal is encoded based on the changed image structure and in accordance with the adjustment.
[0028]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0029]
First, the background of the present invention will be described. For example, a single-sided, single-layer DVD (Digital Versatile Disc), which is one of the optical disk media, is a large-capacity medium in which the amount of data that can be recorded is about 4.7 GB, which is about seven times that of a CD. However, if an attempt is made to digitize and record a video signal without performing any processing, long-time recording cannot be performed. For example, when an NTSC video signal is digitized, the data amount per second becomes 20 MB or more. Here, 20 MB is a value calculated assuming that 29.97 screens of 720 × 480 pixels are provided per second with 8 bits for luminance and 8 bits for color per pixel.
[0030]
Consequently, if an NTSC video signal were recorded on the DVD as it is, only about 4 minutes of video could be recorded. The DVD therefore adopts MPEG (Moving Picture Experts Group) 2, which compresses the NTSC video signal so that about 133 minutes of video can be recorded.
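The figures above can be checked with a short calculation. The following is a minimal sketch that uses only the numbers stated in the description; treating 1 GB as 10**9 bytes is an assumption, and the implied compression ratio is only a rough indication because it ignores audio and other data on the disc.

```python
# Raw (uncompressed) NTSC data rate, using the figures stated in the description.
WIDTH, HEIGHT = 720, 480        # pixels per screen
BYTES_PER_PIXEL = 2             # 8 bits luminance + 8 bits color
FRAMES_PER_SECOND = 29.97
DVD_CAPACITY_BYTES = 4.7e9      # assumption: 1 GB taken as 10**9 bytes

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND
raw_minutes = DVD_CAPACITY_BYTES / bytes_per_second / 60

# Rough compression ratio implied by fitting about 133 minutes on the same disc.
implied_ratio = 133 / raw_minutes

print(f"{bytes_per_second / 1e6:.1f} MB/s raw")
print(f"{raw_minutes:.1f} minutes uncompressed")
print(f"about {implied_ratio:.0f}:1 compression implied")
```

The result is a little over 20 MB per second and just under 4 minutes of uncompressed video, which agrees with the values given in the text.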
[0031]
The moving image signal compression method of MPEG2 combines three techniques: the discrete cosine transform (DCT), which is compression using correlation within a screen; motion compensation, which is compression using correlation between screens; and Huffman coding, which is compression using correlation between code strings.
[0032]
Here, the motion compensation will be described. The video signal of the NTSC system is composed of 30 frames per second, and is scanned by an interlace (interlaced scanning) system in order to eliminate flicker. Therefore, the display device displays the screen at 60 fields per second. Note that 30 frames are 30 pictures formed from 525 scanning lines.
[0033]
Looking at each of the 30 pictures per second, successive images often have a large area of the screen occupied by the same element, and do not change much in a short amount of time.
[0034]
Therefore, for example, when only a portion changes across the 30 pictures, only the change (difference) is extracted, and the other portions are recorded only once. If the differences are then combined at the time of reproduction, the amount of information necessary for recording and reproduction can be reduced. Such a technique, in which differences are extracted from consecutive images and encoded and the original images are reconstructed from them, is called predictive encoding because the current image is predicted from the immediately preceding image.
[0035]
In addition, the changing part of the image, that is, the moving part, includes a part whose shape does not change and its position on the screen simply changes with time, and a part whose shape changes with time. In the former case, the shape data can be used as it is. For an element whose position on the screen changes over time without changing its shape, the amount of the change, that is, the amount of motion shift is called a motion vector. By encoding this motion vector and transmitting the encoded signal, the original image can be reconstructed with a smaller amount of data. Note that a method of using this motion vector in predictive coding is called motion compensation.
[0036]
Next, the above-described predictive coding and motion compensation processing will be described with reference to FIG. 1. In the predictive coding process, for example, when there are an image A and an image B to be reproduced in this order, the contents of the images A and B have many elements in common. Therefore, instead of encoding the images A and B as they are, an image C, which is the difference between the images A and B, is generated, and the generated image C is encoded together with the image A. At the time of decoding, the image B can be reconstructed by combining the image A and the image C thus encoded. The image C is called a prediction coded image.
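The relationship between the images A, B, and C can be illustrated with a toy example. This is a minimal sketch only: the 3 × 3 grids of luminance values and the function names `difference` and `reconstruct` are hypothetical, and a real MPEG2 encoder works on blocks with DCT and quantization rather than on raw pixel differences.

```python
# Toy frames as small grids of 8-bit luminance values (hypothetical data).
image_a = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
image_b = [[10, 10, 10], [10, 50, 60], [10, 10, 10]]


def difference(prev, curr):
    """Image C: per-pixel difference between the current and previous image."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def reconstruct(prev, diff):
    """Decoder side: rebuild the current image from image A and image C."""
    return [[p + d for p, d in zip(prow, drow)] for prow, drow in zip(prev, diff)]


image_c = difference(image_a, image_b)
nonzero = sum(v != 0 for row in image_c for v in row)

print(image_c)
print("non-zero difference values:", nonzero)
print("B recovered:", reconstruct(image_a, image_c) == image_b)
```

Because most entries of image C are zero when the two images share most of their content, C can be encoded with far fewer bits than B itself.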
[0037]
In the motion compensation processing, an image is divided into blocks, and a motion vector indicating a direction and how much the image has moved is extracted from a portion where the shape has not changed and only the position has changed, and coding is performed. Combining the above-described motion compensation and predictive coding enables efficient data compression.
[0038]
In MPEG2, in order to perform predictive coding using motion compensation, a GOP (Group of Pictures) structure is used that consists of three kinds of pictures, as shown in FIG. 2: the I picture (Intra coded picture), the P picture (Predictive coded picture), and the B picture (Bidirectionally Predictive coded picture). An I picture is created by intra-frame encoding and does not use prediction from a previous image. A P picture is an inter-frame forward-predicted image created by predictive encoding from the immediately preceding image, and is created based on an I picture. A B picture is a bidirectionally predicted image created by prediction from the preceding and succeeding P pictures. In general, one GOP is formed by a group of 15 pictures: one I picture, four P pictures, and ten B pictures. The bit amount required to generate an I picture is 5 to 6 times the bit amount required to generate a B picture, and the bit amount required to generate a P picture is 2 to 3 times the bit amount required to generate a B picture.
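The ratios above give a rough idea of how the bits of one GOP are distributed and of what happens when one P picture becomes an I picture. The following is a minimal sketch; the weights 5.5 and 2.5 are assumed midpoints of the stated ranges, and the function name `gop_cost` is hypothetical.

```python
# Relative bit cost per picture type, in units of one B picture.
# 5.5 and 2.5 are assumed midpoints of the "5 to 6" and "2 to 3" ranges.
WEIGHTS = {"I": 5.5, "P": 2.5, "B": 1.0}


def gop_cost(counts):
    """Total relative cost of a GOP given the number of pictures of each type."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())


normal_gop = {"I": 1, "P": 4, "B": 10}    # the 15-picture GOP described above
changed_gop = {"I": 2, "P": 3, "B": 10}   # one P picture replaced by an I picture

normal_cost = gop_cost(normal_gop)
changed_cost = gop_cost(changed_gop)

print("normal GOP cost :", normal_cost)
print("changed GOP cost:", changed_cost)
print(f"increase: {100 * (changed_cost / normal_cost - 1):.0f}%")
```

Under these assumed weights, replacing a single P picture with an I picture raises the cost of the GOP by roughly a tenth, which matters most when the bit amount allocated to that unit time is already small.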
[0039]
In general, a video signal is not stationary, and the information amount of each picture changes with time. Therefore, it is known that higher image quality can be obtained by using variable bit rate coding than by fixed bit rate coding that performs coding with the same code amount.
[0040]
For a video signal recorded on a DVD, variable bit rate encoding called a two-pass method is generally used. In the two-pass method, two encoding processes are performed: one for obtaining the code amount, and one for performing encoding while variably controlling the bit rate based on the obtained code amount. This has the advantage that the total amount of available coded bits can be used effectively. However, it has the disadvantage that the processing takes about twice the duration of the moving image sequence, so it is not suitable for real-time processing.
[0041]
Real-time processing requires that the code amount be calculated and the bit rate be variably controlled based on it within a single encoding pass. Such processing is called a one-pass method. The encoding device 1 to which the present invention is applied employs the one-pass method. It detects a scene change position from the input moving image signal and, when the bit amount to be allocated per unit time of the moving image signal is lower than a predetermined threshold, reduces the bit amount originally scheduled to be allocated before a predetermined position determined based on the scene change position, allocates the surplus bit amount thus obtained to the predetermined position and thereafter, and encodes the moving image signal in accordance with this adjustment. Hereinafter, the configuration and operation of the encoding device 1 will be described in detail.
[0042]
As shown in FIG. 3, the encoding device 1 includes an encoding difficulty calculator 10, an allocated bit amount calculator 11, a scene change position detector 12, a delay unit 13, and a moving image encoder 14. The encoding difficulty calculator 10 calculates the encoding difficulty d of the moving image signal supplied from the input terminal for each unit time, and outputs the calculated encoding difficulty d to the allocated bit amount calculator 11. The unit time is an elapsed time of 0.5 seconds or 1 GOP. The encoding difficulty calculator 10 encodes the input moving image signal with the quantization step fixed, and obtains the encoding difficulty d from the code amount generated in each predetermined time. The encoding difficulty calculator 10 calculates a higher encoding difficulty d for a scene with a lot of motion, and a lower encoding difficulty d for a scene with little motion.
[0043]
The assigned bit amount calculator 11 calculates an assigned bit amount b for the inputted encoding difficulty d. In this case, the degree of encoding difficulty d per unit time and the allocated bit amount b when the reference moving image sequence is subjected to variable bit rate encoding at a predetermined average bit rate are associated in advance. Here, the total sum of the allocated bit amounts per unit time with respect to the reference moving image sequence is set to be equal to or less than the storage capacity of the target recording medium. FIG. 4 shows an example of the relationship between the encoding difficulty d and the allocated bit amount b.
[0044]
In FIG. 4, the horizontal axis represents the encoding difficulty d, and the vertical axis represents the appearance probability h(d) of the encoding difficulty d in the reference moving image sequence. The allocated bit amount for an arbitrary encoding difficulty is calculated from the function b(d). This relationship is a general one, applicable to a wide range of sequences, that can be determined empirically by trial and error: a large number of moving image sequences (for example, movies) are encoded at a predetermined average bit rate and their image quality is evaluated. The allocated bit amount calculator 11 calculates the allocated bit amount b for the encoding difficulty d per unit time of the moving image signal based on the relationship in FIG. 4. When the encoding difficulty d is high, the allocated bit amount calculator 11 raises the encoding rate to reduce noise due to encoding. When the encoding difficulty d is low, little noise due to encoding occurs, so the encoding rate is set lower. In this way, the image quality of the entire moving image signal becomes uniform, and the average bit amount per unit time decreases.
[0045]
However, in a conventional encoding device, the algorithm controls the average bit amount per unit time to be low only over a span of 30 minutes or 1 hour, so it may allocate a large bit amount without any particular control during the first few minutes and a small bit amount in the latter half. Under such control, if recording is repeatedly performed in short fragments, for example recording a moving image signal for one minute, stopping, and then recording again for one minute, a large bit amount is allocated each time, and the average bit amount per unit time becomes high. The encoding device 1 according to the present invention can keep the average bit amount low even over short periods, particularly when such fragmentary recording is repeated. The allocated bit amount calculator 11 allocates a bit amount b of, for example, about 2.8 Mbps to 15 Mbps.
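The mapping from encoding difficulty d to allocated bit amount b can be pictured as a lookup on a monotonically increasing curve. The following is a minimal sketch, not the function b(d) of FIG. 4: the control points, the normalization of d to the range 0 to 1, and the name `allocated_bit_rate` are all assumptions made for illustration. Only the 2.8 Mbps and 15 Mbps bounds come from the description.

```python
from bisect import bisect_right

MIN_MBPS, MAX_MBPS = 2.8, 15.0   # range stated in the description

# Hypothetical control points (difficulty d, allocated rate in Mbps).
# The real relationship is determined empirically and is not given here.
CONTROL_POINTS = [(0.0, 2.8), (0.5, 6.0), (1.0, 15.0)]


def allocated_bit_rate(d):
    """Piecewise-linear stand-in for b(d), clamped to the stated range."""
    xs = [x for x, _ in CONTROL_POINTS]
    ys = [y for _, y in CONTROL_POINTS]
    if d <= xs[0]:
        return ys[0]
    if d >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, d)               # first control point to the right of d
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    rate = y0 + (y1 - y0) * (d - x0) / (x1 - x0)
    return min(MAX_MBPS, max(MIN_MBPS, rate))


for d in (0.0, 0.25, 0.5, 0.9, 1.2):
    print(d, round(allocated_bit_rate(d), 2))
```

Difficult scenes receive a higher rate and easy scenes a lower one, while the clamp keeps every unit time within the stated range.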
[0046]
When the calculated bit amount b is lower than a predetermined threshold and a signal indicating that there is a scene change is supplied from the scene change position detector 12 described later, the allocated bit amount calculator 11 adjusts the bit amount b before and after the scene change position and outputs the adjustment result to the moving image encoder 14. The adjustment by the allocated bit amount calculator 11 will be described later.
[0047]
The scene change position detector 12 detects a scene change position from the moving image signal supplied from the input terminal, generates a signal indicating the detection, and outputs the signal to the allocated bit amount calculator 11 and the moving image encoder 14.
[0048]
Here, a specific operation of the scene change position detector 12 will be described below. The scene change position detector 12 detects a scene change position by detecting a large change in a parameter characterizing each frame of the moving image signal. Therefore, the scene change position detector 12 only needs to be able to detect a change in the signal such as the luminance level and the chroma level of the screen. The scene change position detector 12 detects a scene change position using, for example, a change in an average luminance level, a difference amount between frames, a level fluctuation amount between frames, and the like.
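One of the parameters mentioned above, the change in average luminance level between frames, can be sketched as follows. This is a minimal illustration under stated assumptions: the frame representation (lists of luminance rows), the threshold value of 30, and the names `mean_luminance` and `detect_scene_changes` are hypothetical and are not taken from the description.

```python
def mean_luminance(frame):
    """Average luminance of a frame given as a list of rows of pixel values."""
    return sum(sum(row) for row in frame) / (len(frame) * len(frame[0]))


def detect_scene_changes(frames, threshold=30.0):
    """Return indices of frames whose average luminance jumps by more than
    `threshold` relative to the preceding frame (threshold is an assumption)."""
    changes = []
    for i in range(1, len(frames)):
        jump = abs(mean_luminance(frames[i]) - mean_luminance(frames[i - 1]))
        if jump > threshold:
            changes.append(i)
    return changes


def flat_frame(value, size=4):
    """Hypothetical test frame with a uniform luminance value."""
    return [[value] * size for _ in range(size)]


frames = [flat_frame(20), flat_frame(22), flat_frame(200), flat_frame(198)]
print(detect_scene_changes(frames))  # prints [2]
```

A practical detector would likely combine several such measures, for example the inter-frame difference amount as well, since average luminance alone misses cuts between scenes of similar brightness.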
[0049]
Next, detection of a scene change position by the above-described scene change position detector 12 will be described with reference to the flowchart shown in FIG. 5.
[0050]
In step ST1, the scene change position detector 12 determines whether a scene change position has been detected from the input moving image signal. If a scene change position has been detected, the process proceeds to step ST2.
[0051]
In step ST2, the scene change position detector 12 generates a signal indicating that the scene change position has been detected, and outputs the signal to the video encoder 14. The moving picture encoder 14 changes the GOP structure according to the scene change position.
[0052]
In step ST3, the scene change position detector 12 outputs a signal to the effect that the scene change position has been detected to the allocated bit amount calculator 11. The allocated bit amount calculator 11 adjusts the bit amount allocated before and after the scene change position according to the scene change position.
[0053]
The delay unit 13 temporarily stores the moving image signal for at least the time required for the encoding difficulty calculator 10 and the allocated bit amount calculator 11 to calculate the bit amount b per unit time of the moving image signal. The delay unit 13 supplies the moving image signal to the moving image encoder 14 after temporarily storing the moving image signal.
[0054]
The moving image encoder 14 encodes the supplied moving image signal according to the bit amount b per unit time supplied from the allocated bit amount calculator 11. Further, the moving image encoder 14 changes the GOP structure when a signal indicating that there is a scene change is supplied from the scene change position detector 12. Specifically, as shown in FIG. 6, the P picture closest to the scene change position in the time direction is changed to an I picture. The moving image encoder 14 changes the P picture (P5) to an I picture (I5) when a scene change occurs at position A, changes the P picture (P8) to an I picture (I8) when a scene change occurs at position B, and changes the P picture (P11) to an I picture (I11) when a scene change occurs at position C.
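The change of GOP structure can be sketched as a search for the first P picture at or after the scene change. This is an illustration only: the display-order layout of the 15-picture GOP is an assumption inferred from the picture labels (I2, P5, P8, P11), the rule follows the "first P picture after the scene change position" wording of the claims, and the index convention and the name `promote_first_p` are hypothetical.

```python
# Assumed display-order layout of one 15-picture GOP, inferred from the labels.
GOP = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7",
       "P8", "B9", "B10", "P11", "B12", "B13", "P14"]


def promote_first_p(gop, scene_change_index):
    """Return a copy of the GOP in which the first P picture at or after
    `scene_change_index` (index of the first picture of the new scene,
    a hypothetical convention) is changed to an I picture."""
    changed = list(gop)
    for i in range(scene_change_index, len(changed)):
        if changed[i].startswith("P"):
            changed[i] = "I" + changed[i][1:]
            break
    return changed


print(promote_first_p(GOP, 4))   # P5 becomes I5
print(promote_first_p(GOP, 6))   # P8 becomes I8
print(promote_first_p(GOP, 9))   # P11 becomes I11
```

The new I picture gives the pictures of the new scene a reference that does not depend on the previous scene, at the cost of the extra bits an I picture requires.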
[0055]
As described above, changing a P picture to an I picture requires two to three times the bit amount. This poses no problem when the bit amount allocated to the unit time is higher than the predetermined threshold, but when it is lower than the predetermined threshold, the change of the GOP structure leaves the bit amount insufficient and the image quality deteriorates. Therefore, when the bit amount b to be allocated to the unit time of the moving image signal is lower than the predetermined threshold, the allocated bit amount calculator 11 adjusts, according to the detection result of the scene change position, the bit amount allocated before a predetermined position determined based on the scene change position and the bit amount allocated to the predetermined position and thereafter.
[0056]
Further, as shown in FIG. 7, a scene change can also be handled by moving a picture within the 2GOP instead of changing a P picture to an I picture. In this case, the position of the I picture (I2) is merely moved, relative to the P picture (P11), within the 2GOP, so the bit amount required for the 2GOP does not change greatly. However, the scene change itself is difficult to encode and requires a larger bit amount, so a larger bit amount than usual must be allocated after the scene change.
[0057]
Here, the adjustment operation by the allocated bit amount calculator 11 will be described below. When the calculated bit amount b is lower than a predetermined threshold, the allocation bit amount calculator 11 performs the following adjustment. When a scene change as described above occurs, an image encoded by prediction cannot be used, and the image quality after the scene change may deteriorate. Therefore, when a signal indicating that there is a scene change is supplied from the scene change position detector 12, the moving image encoder 14 described later performs a process of changing the GOP structure based on the scene change. This can prevent the image quality from deteriorating at the time of a scene change.
[0058]
However, when the GOP structure is changed, generally the coding efficiency of the GOP becomes poor, and it is difficult to obtain image quality equivalent to that of a general moving image signal by simply changing the GOP structure. Therefore, when a signal indicating that there is a scene change is supplied from the scene change position detector 12, the allocated bit amount calculator 11 calculates the bits allocated before a predetermined position determined based on the scene change position. The amount is reduced, and an adjustment is made to allocate the surplus bit amount thus obtained to the predetermined position and beyond, and the adjustment result is supplied to the moving picture encoder 14. The moving picture encoder 14 codes the moving picture signal output from the delay unit 13 based on the supplied adjustment result.
[0059]
For example, let R_pre be the bit amount originally scheduled to be allocated to the moving image signal before the predetermined position determined based on the scene change position, and let SC_rate be the bit amount reduction rate. The allocated bit amount calculator 11 then adjusts the bit amount R_pre' to be allocated to the moving image signal before the predetermined position according to
R_pre' = R_pre × SC_rate
Likewise, letting R_post be the bit amount originally scheduled to be allocated to the moving image signal from the predetermined position onward, it adjusts the bit amount R_post' to be allocated to the moving image signal from the predetermined position onward according to
R_post' = R_post + (R_pre - R_pre')
[0060]
The reduction rate SC_rate changes according to the scene change position. For example, SC_rate = 0.6 when a scene change occurs at position A shown in FIG. 6, SC_rate = 0.7 when a scene change occurs at position B, and SC_rate = 0.8 when a scene change occurs at position C.
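The adjustment of R_pre and R_post can be written out directly. The following is a minimal sketch: the SC_rate values are the example values given for positions A, B, and C, while the function name `adjust_allocation`, the position keys, and the sample bit amounts are hypothetical.

```python
# Example SC_rate values for the scene change positions A, B, and C of FIG. 6.
SC_RATE_BY_POSITION = {"A": 0.6, "B": 0.7, "C": 0.8}


def adjust_allocation(r_pre, r_post, position):
    """Apply R_pre' = R_pre * SC_rate and R_post' = R_post + (R_pre - R_pre').

    r_pre and r_post are the bit amounts originally scheduled before and
    from the predetermined position onward (any consistent unit).
    """
    sc_rate = SC_RATE_BY_POSITION[position]
    r_pre_adj = r_pre * sc_rate
    r_post_adj = r_post + (r_pre - r_pre_adj)
    return r_pre_adj, r_post_adj


# Hypothetical example: 1000 units scheduled on each side of the position.
for position in ("A", "B", "C"):
    pre, post = adjust_allocation(1000.0, 1000.0, position)
    print(position, round(pre, 1), round(post, 1), round(pre + post, 1))
```

The total R_pre' + R_post' equals R_pre + R_post, so the adjustment only shifts bits from before the predetermined position to after it and does not raise the bit amount for the unit time as a whole.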
[0061]
The moving image encoder 14 encodes the moving image signal according to the adjustment result supplied from the allocated bit amount calculator 11.
[0062]
In the encoding device 1 configured as described above, when the allocated bit amount calculator 11 calculates a bit amount b lower than the predetermined threshold, the bit amount around the scene change position is adjusted based on the scene change position detected by the scene change position detector 12, the image structure at the scene change position is changed, and the moving image signal is encoded according to the changed image structure on the basis of that adjustment. Consequently, encoding with high coding efficiency can be performed even when the image structure is changed, and deterioration of image quality due to a scene change occurring within a unit time with a small allocated bit amount b can be prevented.
[0063]
The embodiment of the present invention is not limited to the above example, and may be a computer-executable program or a computer-readable recording medium on which the program is recorded.
[0064]
[Effects of the Invention]
As described in detail above, in the encoding device according to the present invention, when the bit amount calculating means calculates a bit amount b lower than the predetermined threshold based on the encoding difficulty d, the bit amount adjusting means reduces the bit amount allocated before a predetermined position determined based on the scene change position and allocates the surplus bit amount thus obtained to the predetermined position and thereafter, and the encoding means changes the image structure based on the scene change position and encodes the moving image signal according to the changed image structure on the basis of that adjustment. Consequently, encoding with high coding efficiency can be performed even when the image structure at the scene change position is changed, and deterioration of image quality due to a scene change occurring within a unit time with a small allocated bit amount b can be prevented.
[0065]
Further, in the encoding method according to the present invention, when a bit amount b lower than the predetermined threshold is calculated based on the calculated encoding difficulty d, the bit amount allocated before a predetermined position determined based on the scene change position is reduced, the surplus bit amount thus obtained is allocated to the predetermined position and thereafter, the image structure is changed based on the scene change position, and the moving image signal is encoded according to the changed image structure on the basis of that adjustment. Consequently, encoding with high coding efficiency can be performed even when the image structure at the scene change position is changed, and deterioration of image quality due to a scene change occurring within a unit time with a small allocated bit amount b can be prevented.
[0066]
Further, the program according to the present invention causes a computer to execute steps in which, when a bit amount b lower than the predetermined threshold is calculated based on the calculated encoding difficulty d, the bit amount allocated before a predetermined position determined based on the scene change position is reduced, the surplus bit amount thus obtained is allocated to the predetermined position and thereafter, the image structure is changed based on the scene change position, and the moving image signal is encoded according to the changed image structure on the basis of that adjustment. Consequently, encoding with high coding efficiency can be performed even when the image structure at the scene change position is changed, and deterioration of image quality due to a scene change occurring within a unit time with a small allocated bit amount b can be prevented.
[0067]
Further, the recording medium according to the present invention is a computer-readable medium storing a program for executing steps in which, when a bit amount b lower than the predetermined threshold is calculated based on the calculated encoding difficulty d, the bit amount allocated before a predetermined position determined based on the scene change position is reduced, the surplus bit amount thus obtained is allocated to the predetermined position and thereafter, the image structure is changed based on the scene change position, and the moving image signal is encoded according to the changed image structure on the basis of that adjustment. Consequently, encoding with high coding efficiency can be performed even when the image structure at the scene change position is changed, and deterioration of image quality due to a scene change can be prevented.
[Brief description of the drawings]
FIG. 1 is a diagram showing how a predicted coded image is generated from a non-predicted coded image by a predictive coding process.
FIG. 2 is a structural diagram showing a GOP structure.
FIG. 3 is a block diagram illustrating a configuration example of an encoding device to which the present invention has been applied.
FIG. 4 is a distribution diagram showing a macroblock appearance probability h (d) with respect to an encoding difficulty d and an assigned code amount b.
FIG. 5 is a flowchart showing a scene change detection operation by a scene change position detector included in the encoding device to which the present invention is applied.
FIG. 6 is a diagram illustrating a state where a GOP structure is changed based on detection of a scene change position.
FIG. 7 is a diagram illustrating a case where a picture is moved within a 2GOP structure.
FIG. 8 is a block diagram illustrating a configuration example of a conventional encoding device.
FIG. 9 is a flowchart illustrating a conventional one-pass variable bit rate encoding process.
[Explanation of symbols]
1 encoding device, 10 encoding difficulty calculator, 11 allocated bit amount calculator, 12 scene change position detector, 13 delay unit, 14 moving image encoder

Claims

An encoding device that encodes a moving image signal based on a predetermined image structure,
Encoding difficulty calculating means for calculating the encoding difficulty d per unit time of the moving image signal;
A bit amount calculating means for calculating a bit amount b for transferring the moving image signal in a unit time based on the encoding difficulty d;
A scene change position detecting means for detecting a scene change position from a moving image signal;
Image structure changing means for changing the predetermined image structure based on the scene change position;
Bit amount adjusting means for performing, when the bit amount b is lower than a predetermined threshold, an adjustment that reduces the bit amount to be allocated before the position where the image structure is changed by the image structure changing means and allocates the surplus bit amount thus obtained to that position and thereafter;
Delay means for delaying and outputting the moving image signal; and
Encoding means for encoding the moving image signal output from the delay means, based on the image structure changed by the image structure changing means, in accordance with the adjustment by the bit amount adjusting means.

The encoding device according to claim 1, wherein the image structure is a GOP (Group of Pictures) structure formed by a predetermined number of I pictures (Intra coded pictures), P pictures (Predictive coded pictures), and B pictures (Bidirectionally Predictive coded pictures).

3. The encoding apparatus according to claim 2, wherein said image structure changing means changes the first P picture after the scene change position detected by said scene change position detecting means to an I picture.

The encoding device according to claim 1, wherein, when the bit amount b is lower than the predetermined threshold, the bit amount adjusting means, where R_pre is the bit amount originally scheduled to be allocated to the moving image signal before the position where the image structure is changed and SC_rate is the bit amount reduction rate, adjusts the bit amount R_pre' to be allocated to the moving image signal before the position according to
R_pre' = R_pre × SC_rate
and, where R_post is the bit amount originally scheduled to be allocated to the moving image signal from the position onward, adjusts the bit amount R_post' to be allocated to the moving image signal from the position onward according to
R_post' = R_post + (R_pre - R_pre')

An encoding method for encoding a moving image signal based on a predetermined image structure, comprising:
calculating the encoding difficulty d per unit time of the moving image signal;
calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time;
detecting a scene change position from the moving image signal;
changing the predetermined image structure based on the scene change position;
performing, when the bit amount b is lower than a predetermined threshold, an adjustment that reduces the bit amount to be allocated before the position where the image structure is changed and allocates the surplus bit amount thus obtained to that position and thereafter;
delaying and outputting the moving image signal; and
encoding the delayed moving image signal based on the changed image structure in accordance with the adjustment.

A program for causing a computer to execute:
a step of calculating an encoding difficulty d per unit time of a moving image signal;
a step of calculating, based on the encoding difficulty d, a bit amount b for transferring the moving image signal within the unit time;
a step of detecting a scene change position from the moving image signal;
a step of changing a predetermined image structure based on the scene change position;
a step of performing, when the bit amount b is lower than a predetermined threshold, an adjustment that reduces the bit amount to be allocated before the position where the image structure is changed and allocates the surplus bit amount thus obtained to that position and thereafter;
a step of delaying and outputting the moving image signal; and
a step of encoding the delayed moving image signal based on the changed image structure in accordance with the adjustment.

Calculating an encoding difficulty d per unit time of the moving image signal;
Calculating a bit amount b for transferring the video signal within a unit time based on the encoding difficulty d;
Detecting a scene change position from the moving image signal;
Changing the predetermined image structure based on the scene change position;
A step of performing, when the bit amount b is lower than a predetermined threshold, an adjustment that reduces the bit amount to be allocated before the position where the image structure is changed and allocates the surplus bit amount thus obtained to that position and thereafter;
A step of delaying and outputting the moving image signal;
A computer-readable recording medium storing a program for executing the above steps and a step of encoding the delayed moving image signal based on the changed image structure in accordance with the adjustment.