JP4517468B2

JP4517468B2 - Image information converting apparatus and method, and encoding apparatus and method

Info

Publication number: JP4517468B2
Application number: JP2000208932A
Authority: JP
Inventors: 数史佐藤; 武文名雲; 邦明高橋; 輝彦鈴木; 陽一矢ケ崎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2000-07-10
Filing date: 2000-07-10
Publication date: 2010-08-04
Anticipated expiration: 2020-07-10
Also published as: JP2002027465A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像情報を変換する画像情報変換装置及び方法並びに符号化装置及び方法に関し、詳しくは、離散コサイン変換等の直交変換と動き補償によって圧縮されたＭＰＥＧ等の画像情報（ビットストリーム）を衛星放送、ケーブルＴＶ、インターネット等のネットワークメディアを介して受信する際に、若しくは光、磁気ディスクのような記憶メディア上で処理する際に用いられる画像情報を変換する画像情報変換装置及び方法並びに符号化装置及び方法に関する。
【０００２】
【従来の技術】
近年、画像情報をディジタルとして取り扱い、効率の高い情報の伝送、蓄積を目的とし、画像情報特有の冗長性を利用して、離散コサイン変換等の直交変換と動き補償により圧縮するＭＰＥＧなどの画像情報圧縮方式が提供されている。そして、このような画像情報圧縮方法に準拠した装置は、放送局などの情報配信、及び一般家庭における情報受信の双方において普及しつつある。
【０００３】
特に、ＭＰＥＧ２（ＩＳＯ／ＩＥＣ１３８１８−２）は、飛び越し走査画像及び順次走査画像の双方、並びに標準解像度画像及び高精細画像を網羅する、汎用画像符号化方式として定義されている。
【０００４】
すなわち、ＭＰＥＧ２符号化圧縮方式によれば、例えば、７２０×４８０画素を持つ標準解像度の飛び越し走査画像に４〜８Ｍｂｐｓの符号量（ビットレート）を割り当て、１９２０×１０８８画素を持つ高解像度の飛び越し走査画像に対して１８〜２２Ｍｂｐｓの符号量（ビットレート）を割り当てることにより、高い圧縮率と良好な画質の実現が可能となる。
【０００５】
このようなことから、ＭＰＥＧ２は、プロフェッショナル用途及びコンシューマー用途の広範なアプリケーションに今後とも用いられるものと予想される。
しかし、ＭＰＥＧ２は、主として放送用に適合する高画質符号化を対象としており、例えばＭＰＥＧ１より低い符号量（ビットレート）、つまりより高い圧縮率の符号化方式には対応していなかった。
【０００６】
一方で、近年の携帯端末の普及により、今後とも高い圧縮率の符号化方式のニーズは高まると思われ、これに対応して、高い圧縮率を有するＭＰＥＧ４符号化方式の標準化が行われている。この画像符号化方式に関しては、１９９８年１２月にＩＳＯ／ＩＥＣ１４４９６−２として国際標準の規格が承認された。
【０００７】
ところで、ディジタル放送用に一度符号化されたＭＰＥＧ２画像圧縮情報（ビットストリーム）を、携帯端末上等で処理するのにより適した、より低い符号量（ビットレート）の画像圧縮情報（ビットストリーム）に変換したいというニーズがある。
【０００８】
かかる目的を達成するために、“Field-to-Frame Transcoding with Spatial and Temporal Downsampling”（Susie L Wee, John G. Apostolopoulos, and Nick Feamster, ICIP 99、以下これを文献１と呼ぶ）において画像情報変換装置（トランスコーダ）が提供されている。
【０００９】
この文献１において提供された画像情報変換装置（トランスコーダ）は、図６に示すように、ピクチャタイプ判別部１と、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）２と、間引き部３と、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４と、動きベクトル合成部５と、動きベクトル検出部６とから構成されている。
【００１０】
この画像情報変換装置には、フレーム内で符号化されたイントラ符号化画像（Ｉピクチャ；Ｉ）、表示順序で順方向を参照して予測符号化された順方向予測符号化画像（Ｐピクチャ；Ｐ）及び表示順序で順方向及び逆方向を参照して予測符号化された双方向予測符号化画像（Ｂピクチャ；Ｂ）から構成される飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）が入力される。
【００１１】
このＭＰＥＧ２画像圧縮情報（ビットストリーム）は、ピクチャタイプ判別部１において、Ｉ／Ｐピクチャに関するものか、Ｂピクチャに関するものであるかを判別され、Ｉ／Ｐピクチャのみ後続のＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）２に出力され、Ｂピクチャは破棄される。
【００１２】
ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）２における処理は通常のＭＰＥＧ２画像情報復号化装置と同様に、ＭＰＥＧ２画像圧縮情報（ビットストリーム）を画像信号に復号するものである。
【００１３】
ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）２の出力となる画素値は、間引き部３に入力される。間引き部３は、水平方向には１／２の間引き処理を施し、垂直方向には、第一フィールド若しくは第二フィールドのどちらか一方のデータのみを残し、もう一方を廃棄する。このような間引きによって、入力となる画像情報の１／４の大きさを持つ順次走査画像を生成する。
【００１４】
間引き部３によって生成された順次走査画像はＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４によってフレーム内で符号化されたＩ−ＶＯＰ及び表示順序で順方向を参照して予測符号化されたＰ−ＶＯＰに符号化され、ＭＰＥＧ４画像圧縮情報（ビットストリーム）として出力される。尚、ＶＯＰはVideo object Planeを意味し、ＭＰＥＧ２におけるフレームに相当するものである。
【００１５】
その際、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）中の動きベクトル情報は、動きベクトル合成部５において間引き後の画像情報に対する動きベクトルにマッピングされ、動きベクトル検出部６においては、動きベクトル合成部５において合成された動きベクトル値を元に高精度の動きベクトルを検出する。
【００１６】
文献１は、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）の１／２×１／２の大きさを持つ順次走査画像のＭＰＥＧ４画像圧縮情報（ビットストリーム）を生成する画像情報変換装置に関して記述している。すなわち、例えば入力となるＭＰＥＧ２画圧縮情報（ビットストリーム）がＮＴＳＣ（National Television System Committee）の規格に準拠したものであった場合、出力となるＭＰＥＧ４画像圧縮情報はＳＩＦサイズ（３５２×２４０画素）ということになる。
【００１７】
ところで、図６に示した画像情報変換装置においては、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４における符号量制御が、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）における画質を決定する大きな要因となる。ＩＳＯ／ＩＥＣ１４４９６−２においては、符号量制御の方式に関しては特に規定されておらず、各ベンダが、アプリケーションに応じて、演算量及び出力画質の観点から最適と考えられる方式を用いることが出来る。以下では、代表的な符号量制御方式として、ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５（ＩＳＯ／ＩＥＣＪＴＣ１／ＳＣ２９／ＷＧ１１Ｎ０４００）で述べられている方式について述べる。
【００１８】
この符号量制御のフローを図７に示すフローを用いて説明する。最初のステップＳ１１において、画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４は、目標符号量（ターゲットビットレート）、及び、ＧＯＰ（ｇroup of pictures）構成を入力変数として、各ピクチャへのビット配分を行う。ここで、ＧＯＰとは、ランダムアクセス可能なピクチャの組である。
【００１９】
すなわち、ステップＳ１１において、画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４は、ＧＯＰ内の各ピクチャに対する割り当てビット量を、割り当て対象ピクチャを含めＧＯＰ内でまだ復号化されていないピクチャに対して割り当てられるビット量（以下、これをＲとする）を基に配分する。この配分をＧＯＰ内の符号化ピクチャ順に繰り返す。その際、以下に述べる２つの仮定を用いて各ピクチャへの符号量割り当てを行う。
【００２０】
まず、第一に、各ピクチャを符号化する際に用いる平均量子化スケールコードと発生符号量の積は、画面が変化しない限り、ピクチャタイプ毎に一定値となると仮定する。そこで、各ピクチャを符号化した後、各ピクチャタイプ毎に、画面の複雑さを示す変数Ｘ_i，Ｘ_p，Ｘ_b（global complexity measure）を次の式（１）によって更新する。
【００２１】
【数４】

【００２２】
ここでＳ_i，Ｓ_p，Ｓ_bはピクチャ符号化時の発生符号ビット量であり、Ｑ_i，Ｑ_p，Ｑ_bは、ピクチャ符号化時の平均量子化スケールコードである。また、初期値は、目標符号量（ターゲットビットレート）ｂｉｔ＿ｒａｔｅ［ｂｉｔｓ／ｓｅｃ］を用いて、式（２）で示される値とする。
【００２３】
【数５】

【００２４】
第二に、Ｉピクチャの量子化スケールコードを基準としたＰ，Ｂピクチャの量子化スケールコードの比率Ｋ_p，Ｋ_bが式（３）に定めた値となる場合に常に全体の画質が最適化されると仮定する。
【００２５】
【数６】

【００２６】
すなわち、Ｂピクチャの量子化スケールコードは、Ｉ，Ｐピクチャの量子化スケールコードの常に１．４倍としている。これは、ＢピクチャをＩ，Ｐピクチャに比較して多少粗めに符号化することにより、Ｂピクチャで節約できる符号量をＩ，Ｐピクチャに加えると、Ｉ，Ｐピクチャの画質が改善され、これを参照するＢピクチャの画質も改善されることを想定している。
【００２７】
上記２つの仮定より、ＧＯＰの各ピクチャに対する割り当てビット量（Ｔ_i，Ｔ_p，Ｔ_b）は式（４）に示す値とする。
【００２８】
【数７】

【００２９】
ここでＮ_p，Ｎ_bはＧＯＰ内でまだ符号化されていないＰ，Ｂピクチャの枚数である。
【００３０】
このようにして求めた割当符号量を基にして、各ピクチャをステップＳ１１，Ｓ１２に従って符号化する毎に、ＧＯＰ内の未符号化ピクチャに対して割り当てられるビット量Ｒを式（５）で更新する。
【００３１】
【数８】

【００３２】
また、ＧＯＰの最初のピクチャを符号化する際には、式（６）によりＲを更新する。
【００３３】
【数９】

【００３４】
ＮはＧＯＰ内のピクチャ数である。また、シーケンスの最初でのＲの初期値は０とする。
【００３５】
次に、ステップＳ１２において、画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）４は、仮想バッファを用いたレート制御を行う。すなわち、ステップＳ１２において、画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）４は、ステップＳ１１で式（４）により求められた各ピクチャに対する割当ビット量（Ｔ_i，Ｔ_p，Ｔ_b）と、実際の発生符号量を一致させるため、各ピクチャ毎に独立に設定した３種類の仮想バッファの容量を基に、量子化スケールコードを、マクロブロック単位のフィードバック制御で求める。
【００３６】
まず、ｊ番目のマクロブロック符号化に先立ち、仮想バッファの占有量を式（７）によって求める。
【００３７】
【数１０】

【００３８】
ここで、ｄ₀ ⁱ，ｄ₀ ^p，ｄ₀ ^bは各仮想バッファの初期占有量、Ｂ_jはピクチャの先頭からｊ番目のマクロブロックまでの発生ビット量、ＭＢ＿ｃｎｔは１ピクチャ内のマクロブロック数である。各ピクチャ符号化終了時の仮想バッファ占有量（ｄ_{MB_cnt} ⁱ，ｄ_{MB_cnt} ^p，ｄ_{MB_cnt} ^b）は、それぞれ同一のピクチャタイプで、次のピクチャに対する仮想バッファ占有量の初期値（ｄ₀ ⁱ，ｄ₀ ^p，ｄ₀ ^b）として用いられる。
【００３９】
次に、ｊ番目のマクロブロックに対する量子化スケールコードを式（８）により計算する。
【００４０】
【数１１】

【００４１】
ここで、ｒはリアクションパラメーターと呼ばれるフィードバックループの応答を制御する変数であり、式（９）により与えられる。
【００４２】
【数１２】

【００４３】
尚、符号化開始時における仮想バッファの初期値は式（１０）で与えられる。
【００４４】
【数１３】
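The images for equations (8) to (10) are not reproduced in this text. As a minimal sketch, assuming the forms published in MPEG2 Test Model 5 (the constants 31, 10 and 2 come from that model, not from the text above), the reference quantization scale code of step S12 can be computed as follows. All function names are illustrative.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Equation (9) as assumed from Test Model 5: r = 2 * bit_rate / picture_rate."""
    return 2.0 * bit_rate / picture_rate


def reference_qscale(d_j, r):
    """Equation (8) as assumed from Test Model 5: Q_j = d_j * 31 / r."""
    return d_j * 31.0 / r


def initial_fullness(r, k_p=1.0, k_b=1.4):
    """Equation (10) as assumed from Test Model 5: d0_i = 10 * r / 31, scaled by K_p, K_b."""
    d0_i = 10.0 * r / 31.0
    return {"I": d0_i, "P": k_p * d0_i, "B": k_b * d0_i}


r = reaction_parameter(bit_rate=4_650_000, picture_rate=30)
d0 = initial_fullness(r)
print(r, d0["I"], reference_qscale(d0["I"], r))
```

With these initial values the first I-picture macroblock starts from a reference scale of about 10, because the factors 10/31 and 31 cancel.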

【００４５】
最後に、ステップＳ１３において、画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）４は、視覚特性を考慮したマクロブロック毎の適応量子化を行う。すなわち、ステップＳ１３において、画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）４は、ステップＳ１２で求められた量子化スケールコードを、視覚的に劣化の目立ちやすい平坦部でより細かく量子化し、劣化の比較的目立ちにくい絵柄の複雑な部分で粗く量子化するように、各マクロブロック毎のアクティビティと呼ばれる変数によって変化させている。
【００４６】
アクティビティは、原画の輝度信号画素値を用い、フレーム離散コサイン変換モードにおける４個のブロックと、フィールド離散コサイン変換モードにおける４個のブロックとの、合計８ブロックの画素値を用いて式（１１）で与えられる。
【００４７】
【数１４】

【００４８】
ここで、Ｐ_kは原画の輝度信号ブロック内画素値である。式（１１）において最小値を採るのは、マクロブロック内の一部だけでも平坦部分のある場合には量子化を細かくするためである。
【００４９】
更に、式（１２）によりその値が０．５〜２の範囲を取る正規化アクティビティＮａｃｔ_jを求める。
【００５０】
【数１５】

【００５１】
ここで、ａｖｇ＿ａｃｔは、直前に符号化したピクチャでのａｃｔ_jの平均値である。
【００５２】
視覚特性を考慮した量子化スケールコードｍｑｕａｎｔ_jはステップＳ１２で得られた量子化スケールコードＱ_jを基に式（１３）で与えられる。
【００５３】
【数１６】
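The images for equations (11) to (13) are not reproduced in this text. The sketch below assumes the Test Model 5 forms that match the description of step S13: the activity is one plus the smallest variance among the eight luminance blocks, the normalized activity is (2·act + avg_act)/(act + 2·avg_act), and mquant is Q_j times the normalized activity. Function names are illustrative.

```python
def block_variance(block):
    """Variance of the luminance pixel values P_k of one 8x8 block."""
    mean = sum(block) / len(block)
    return sum((p - mean) ** 2 for p in block) / len(block)


def activity(blocks):
    """act_j: 1 plus the minimum variance over the frame-DCT and field-DCT blocks."""
    return 1.0 + min(block_variance(b) for b in blocks)


def normalized_activity(act, avg_act):
    """Nact_j, which stays between 0.5 and 2."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def mquant(q_j, act, avg_act):
    """Quantization scale code after adaptive quantization."""
    return q_j * normalized_activity(act, avg_act)


flat = [128] * 64
textured = [0, 255] * 32
act = activity([flat, textured])
print(act, mquant(10.0, act, avg_act=50.0))
```

Because the minimum is taken, a macroblock containing even one flat block gets a low activity and is therefore quantized more finely.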

【００５４】
ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５において定められた上記符号量制御方式には以下の制限のあることが知られており、実際の制御を行う場合には、これらの制限に対する対策が必要となる。すなわち、第一の制限は、第一ステップＳ１１はシーンチェンジに対応出来ず、また、シーンチェンジ後には第三ステップＳ１３で用いる媒介変数ａｖｇ＿ａｃｔが間違った値となるということである。第二の制限は、ＭＰＥＧ２及びＭＰＥＧ４において規定されているＶＢＶ（Video Buffer Verifier）の拘束条件を満たす保証がないことである。
【００５５】
ところで、文献”ＭＰＥＧ圧縮効率の理論解析とその符号量制御への応用”（信学技報、ＩＥ−９５，ＤＳＰ９５−１０，１９９５年５月、以下これを文献２と呼ぶ）でも述べられている通り、ＴｅｓｔＭｏｄｅｌ５で定められている符号量制御方式は、ＭＰＥＧ−２画像符号化装置において、必ずしも良好な画質を与えるものではない。
【００５６】
この文献２では、特に、良好な画質を与えるための、ＧＯＰ内における各フレーム毎の最適な符号量配分を与える手法として以下の方式を提案している。すなわち、Ｎ_I，Ｎ_P，Ｎ_Bを、ＧＯＰ内においてまだ符号化されていないＩ，Ｐ，Ｂピクチャの枚数として、これらに割り当てられる符号量をＲ_I，Ｒ_P，Ｒ_Bとする。また、式（１４）で与えられる固定レート条件の下に、それぞれにおける量子化ステップサイズをＱ_I，Ｑ_P，Ｑ_Bとし、ｍを、量子化ステップサイズと再生誤差分散を関係付ける次数（すなわち、量子化ステップサイズをｍ乗したものの平均値の最小化が再生誤差分散を最低にすると仮定する）とする。そして、式（１５）を最小にすることを考える。
【００５７】
【数１７】

【００５８】
【数１８】

【００５９】
尚、それぞれのフレームにおける平均量子化スケールＱ、及び符号量Ｒは、ＴｅｓｔＭｏｄｅｌ５でも用いられる媒体変数である各フレームのコンプレキシティＸと、式（１６）のように関係づけられる。
【００６０】
【数１９】

【００６１】
式（１６）の関係も考慮しつつ、式（１４）の拘束条件の元に式（１５）を最小にするＲ_I，Ｒ_P，Ｒ_Bを、ラグランジェの未定乗数法を用いて算出すると、最適なＲ_I，Ｒ_P，Ｒ_Bとして以下の式のような値が求められる。
【００６２】
【数２０】

【００６３】
α＝１として、式（１７）と、ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５で定められた符号量制御方式における式（４）との関係は以下の通りであると言える。すなわち、式（１７）は、符号量制御の媒介変数であるＫ_p，Ｋ_bを、各フレームのコンプレキシティＸ_I，Ｘ_P，Ｘ_Bに応じて、式（１８）のように適応的に算出していることに他ならない。
【００６４】
【数２１】

【００６５】
文献２では、１／（１＋ｍ）の値として、０．６〜１．２程度に設定することで良好な画質が得られることが示されている。
【００６６】
図６に示した画像情報変換装置内で、ＭＰＥＧ４画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）４において、ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５において定められたのと同様な手法を用いて符号量制御を行った場合、シーンチェンジ等に起因する、ＧＯＰ内でのコンプレキシティの変化に対応することが不可能であるため、安定した符号量制御が困難となり、画質劣化を引き起こすことが考えられる。ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）２において抽出される、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の情報をＭＰＥＧ４画像圧縮情報符号化部（Ｉ／Ｐ−ＶＯＰ）４において利用することでこの問題を回避することが可能であると期待される。
【００６７】
かかる問題を解決するため、本願出願人は、先に図８に示すような画像情報変換装置を提案した。
【００６８】
この画像情報変換装置は、ピクチャタイプ判別部７と、圧縮情報解析部８と、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）９と、間引き部１０と、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１と、動きベクトル合成部１２と、動きベクトル検出部１３と、情報バッファ１４と、コンプレキシティ算出部１５とから構成される。
【００６９】
この画像情報変換装置は、圧縮情報解析部８、情報バッファ１４、コンプレキシティ算出部１５及びＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１における符号量制御以外の動作原理については、図６に示した画像情報変換装置と同様であるため、以下では、圧縮情報解析部８、情報バッファ１４、コンプレキシティ算出部１５における動作原理及びＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１における符号量制御について述べることにする。
【００７０】
圧縮情報解析部８において、復号処理に用いられた量子化スケールのフレーム全体に渡る平均値Ｑ、及び、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）において、当該フレームに割り当てられた総符号量（ビット数）Ｂは、情報バッファ１４に格納される。
【００７１】
コンプレキシティ算出部１５においては、情報バッファ１４に格納されたフレーム毎の情報Ｑ及びＢから、当該フレームに対するコンプレキシティＸを式（１９）により算出する。
【００７２】
【数２２】

【００７３】
式（１９）によって算出された、当該フレームに対するコンプレキシティＸは、１ＧＯＶ（group of VOPs）分バッファリングされた後、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１に符号量制御のための媒介変数として伝送される。このため、１ＧＯＶ分の遅延が必要となる。この遅延は図示しない遅延バッファを用いて実現される。ここで、ＧＯＶとは、ランダムアクセス可能なＶＯＰの組である。
【００７４】
以下では、式（１９）において算出された、ＧＯＶ内の各フレームに対するコンプレキシティＸが、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１においてどのように用いられるかについて述べる。尚、以下では、ピクチャタイプ判別部７がこの画像情報変換装置内に存在せず、フレームレートの変換を行わない場合をも考慮することにする。
【００７５】
式（１８）によって求められたＫ_P，Ｋ_Bの意味するところは、Ｉ−ＶＯＰに対する理想的な平均量子化スケールＱ_{i_ideal}に対するＰ−ＶＯＰ／Ｂ−ＶＯＰに対する理想的な平均量子化スケールＱ_{p_ideal}，Ｑ_{b_ideal}の比が、式（２０）によって与えられるということである。
【００７６】
【数２３】

【００７７】
ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５においては、式（１８）のように適応的にＫ_p，Ｋ_bを算出することを行わず、式（３）に示したような固定値を用いている。
【００７８】
式（１８）及び式（２０）から、或るＶＯＰ１と、或るＶＯＰ２に対するコンプレキシティをそれぞれＸ₁，Ｘ₂とし、理想的な量子化スケールをＱ_{1_ideal}，Ｑ_{2_ideal}とすれば、式（２１）のようになる。
【００７９】
【数２４】

【００８０】
或いはまた、ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５のように、式（３）に示した固定値を用いたい場合には、式（２１）に代えて、式（２２）のようにすれば良い。
【００８１】
【数２５】

【００８２】
今、ＧＯＶ内の未符号化されたＶＯＰに対して割り当てられる総符号量（ビット数）をＲとし、Ｒが、各ＶＯＰに対して、Ｒ₁，Ｒ₂，…Ｒ_nといったように割り当てられる時、当該ＧＯＶに対する画質が最適化されるものとする。ここでＲとＲ₁，Ｒ₂，…Ｒ_nの間には式（２３）のような関係式が成り立つ。
【００８３】
【数２６】

【００８４】
或るＶＯＰ_kに対する平均量子化スケールＱ_k、割当符号量Ｒ_k、コンプレキシティＸ_kの間には式（２４）なる関係があることにも注意して、式（２３）を変形すれば式（２５）が得られる。
【００８５】
【数２７】

【００８６】
【数２８】

【００８７】
式（２５）において、Ｋ（Ｘ₁，Ｘ₂）に関しては、式（２１）に示した値を用いても、式（２２）に示した値を用いても良いが、前者の方が、画像に応じた、より最適な符号量配分を実現することが可能である。その際、１／（１＋ｍ）の値を１．０と設定することで、指数演算を行うことが不要となり、高速な実行が可能となる。また、１／（１＋ｍ）の値を１．０以外に設定する場合にも、予めテーブルを持ち、これを参照して指数演算を行うことで高速な実行が可能となる。
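The image for equation (25) is not reproduced in this text. One way to read the allocation is as follows: if the ideal scales satisfy Q_k ∝ X_k^e with e = 1/(1+m), and each VOP obeys R_k = X_k / Q_k, then the share of VOP k is proportional to X_k^(1-e). The sketch below assumes exactly that. It is a derivation from the relations stated above, not a transcription of the patent's equation, and `vop_targets` and `e` are illustrative names.

```python
def vop_targets(R, complexities, e=1.0):
    """Split R bits over the remaining VOPs.

    Assumes Q_k proportional to X_k ** e and R_k = X_k / Q_k,
    so that R_k is proportional to X_k ** (1 - e).
    """
    weights = [x ** (1.0 - e) for x in complexities]
    total = sum(weights)
    return [R * w / total for w in weights]


print(vop_targets(900.0, [100.0, 200.0], e=1.0))  # equal split
print(vop_targets(900.0, [100.0, 200.0], e=0.0))  # proportional to complexity
```

With e = 1.0 every weight is 1, so no real exponentiation is needed, which is consistent with the remark above about fast execution.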
【００８８】
式（２５）における各ＶＯＰに対するコンプレキシティＸ_kはＭＰＥＧ４画像符号化によるものであるが、ＭＰＥＧ２画像符号化による各フレームに対するコンプレキシティと、ＭＰＥＧ４画像符号化による各フレームに対するコンプレキシティが等しいと仮定すれば、コンプレキシティ算出部１５に格納されたＸ_kを用いることで、式（２５）によって当該ＶＯＰに対する目標符号量を算出することが可能である。
【００８９】
この目標符号量の算出のフローを図９に示す。最初のステップＳ２１において、圧縮情報解析部８は、ＭＰＥＧ２画像情報復号化部９における復号処理に用いられるＧＯＰ内の各フレームに対する平均量子化スケールＱ，及び割当符号量（ビット数）Ｂを抽出する。
【００９０】
ステップＳ２２において、コンプレキシティ算出部１５は、平均量子化スケールＱ及び割当符号量（ビット数）Ｂの積で与えられるコンプレキシティＸを算出する。
【００９１】
ステップＳ２３において、ＭＰＥＧ４画像符号化部（Ｉ／Ｐ−ＶＯＰ）１１は、コンプレキシティＸに応じた目標符号量（ターゲットビット）を算出する。
【００９２】
ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５では、ＧＯＰ内におけるＩ，Ｐ，Ｂピクチャに対するコンプレキシティＸ_i，Ｘ_p，Ｘ_bは一定であると仮定しているが実際にはシーンチェンジをＧＯＰ内に含む場合や、ＧＯＰ内で背景が著しく変化する場合等ではこの仮定が成り立たず、安定した符号量制御の妨げとなり、画質劣化の要因ともなる。図８に示した画像情報変換装置においては、そのような場合にも、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における、各フレームに対するコンプレキシティに基づいた符号量制御を行うため、画質劣化を引き起こすことなく、安定した符号量制御を行うことが可能である。
【００９３】
【発明が解決しようとする課題】
ところで、図７に示した符号量制御方式において、ステップＳ１３における適応量子化が有効に作用するためには、ｊ番目のマクロブロックに対する量子化スケールコード、つまり式（８）におけるＱ_Jがフレーム全体に渡って均一な値を取ることが望ましい。そこで、ＣＣＩＲ（Comite Consultantif Internationale des Radio Communications）テストシーケンスの一つである“ＦｌｏｗｅｒＧａｒｄｅｎ”を、ｎ＝１５；ｍ＝３の条件の元、４Ｍｂｐｓに圧縮したＭＰＥＧ２画像圧縮情報（ビットストリーム）を、図８に示した画像情報変換装置を用いて、ｎ＝５；ｍ＝１のＭＰＥＧ４画像圧縮情報（ビットストリーム）に変換する際の、あるＶＯＰに対するＱ_Jがどのような値を取るかを図１０に示す。
【００９４】
先述の通り、理想的にはＱ_JがＶＯＰ全体に渡って均一な値を取ることが望ましいが、実際には、式（７）における仮想バッファ占有量（ｄ_j ⁱ，ｄ_j ^p，ｄ_j ^b）がマクロブロック毎に変化するため、ＶＯＰ全体に渡って均一な値とならない。
【００９５】
本発明は、上述の実情に鑑みて提案されるものであって、Ｑjの変動を抑制して符号量を制御するような画像情報変換装置及び方法並びに符号化装置及び方法を提供することを目的とする。
【００９６】
【課題を解決するための手段】
上述の課題を解決するために、本発明は、第１の圧縮符号化方式で圧縮された飛び越し走査の入力画像圧縮情報を、第２の圧縮符号化方式で圧縮された順次走査の出力画像圧縮情報に変換するものであって、上記入力画像圧縮情報及び上記出力画像圧縮情報を構成する符号化画像は、それぞれ複数の画素からなる画素ブロックから構成され、上記出力画像圧縮情報の符号化画像における水平方向一列分の画素ブロック群からなる疑似画素ブロック列を構成する画素ブロックに対応する上記入力画像圧縮情報の符号化画像における画素ブロック群の平均量子化スケール及び割当符号量を上記入力画像圧縮情報から抽出する解析手段と、この解析手段で検出した平均量子化スケール及び割当符号量を格納する情報バッファと、上記疑似画素ブロック列に対する目標符号量を算出する目標符号量算出手段と、上記目標符号量算出手段で算出した目標符号量を用い、画像情報を上記出力画像圧縮情報に符号化する符号化手段とを有し、上記解析手段は、上記情報バッファに格納した平均量子化スケール及び割当符号量を用いて、上記出力画像圧縮情報及び符号化画像における疑似画素ブロック列に対するコンプレキシティを算出し、上記目標符号量算出手段は、上記出力画像圧縮情報を構成する符号化画像に対する目標符号量及び各疑似画素ブロック列に対するコンプレキシティを用いて、上記出力画像圧縮情報を構成する符号化画像を構成する疑似画素ブロック列に対する目標符号量を算出する。
また、本発明にかかる符号化装置は、第１の圧縮符号化方式で圧縮された飛び越し走査の入力画像圧縮情報を復号して得られる画像情報を、第２の圧縮符号化方式で圧縮された順次走査の出力画像圧縮情報に符号化する符号化装置において、上記入力画像圧縮情報及び上記出力画像圧縮情報を構成する符号化画像は、それぞれ複数の画素からなる画素ブロックから構成され、上記入力画像圧縮情報から抽出され、上記出力画像圧縮情報の符号化画像における水平方向一列分の画素ブロック群からなる疑似画素ブロック列を構成する画素ブロックに対応する上記入力画像圧縮情報の符号化画像における画素ブロック群の平均量子化スケール及び割当符号量を受け取る受取手段と、この受取手段からの平均量子化スケール及び割当符号量を格納する情報バッファと、上記疑似画素ブロック列に対する目標符号量を算出する目標符号量算出手段と、上記目標符号量算出手段で算出した目標符号量を用い、画像情報を上記出力画像圧縮情報に符号化する符号化手段とを有し、上記受取手段は、上記情報バッファに格納した平均量子化スケール及び割当符号量を用いて算出された上記出力画像圧縮情報及び符号化画像における疑似画素ブロック列に対するコンプレキシティを受け取り、上記目標符号量算出手段は、上記出力画像圧縮情報を構成する符号化画像に対する目標符号量及び各疑似画素ブロック列に対するコンプレキシティを用いて算出された上記出力画像圧縮情報を構成する符号化画像を構成する疑似画素ブロック列に対する目標符号量を算出する。
【００９７】
本発明は、飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）を入力画像圧縮情報とし、順次走査のＭＰＥＧ４画像圧縮情報（ビットストリーム）を出力画像圧縮情報とする。これらＭＰＥＧ２画像圧縮情報（ビットストリーム）及びＭＰＥＧ４画像圧縮情報は、複数の画素から構成される画素ブロックすなわちマクロブロックから構成され、疑似画素ブロック列すなわち疑似スライスを利用している。
【００９８】
また、ＭＰＥＧ２画像圧縮情報（ビットストリーム）及びＭＰＥＧ４画像圧縮情報（ビットストリーム）は、画像群すなわちＧＯＰ（group of pictures）及びＧＯＶ（group of VOPs）からそれぞれ構成されている。そして、画像群であるＧＯＰ及びＧＯＶは、複数の符号化画像すなわちピクチャ（picture）及びＶＯＰ（video object plane）からそれぞれ構成されている。
【００９９】
すなわち、本発明は、飛び越し走査のＭＰＥＧ２画像情報圧縮情報（ビットストリーム）を入力とし、ピクチャタイプ判別部、圧縮情報解析部、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）、間引き部、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）、動きベクトル合成部、動きベクトル検出部、情報バッファ、ＶＯＰコンプレキシテイ算出部、擬似スライスコンプレキシティ算出部、ＶＯＰ目標符号量算出部、擬似スライス目標符号量算出部を兼ね備え、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における、各スライスに対するコンプレキシティ情報を用いて、ＭＰＥＧ４画像符号化の際に擬似スライス単位の目標符号量（ターゲットビット）を与えることで、符号量制御のステップ２に伴う参照量子化スケールの変動を最小限に抑え、各マクロブロックに対する符号量割当が画像に対して最適化された状態で、順次走査のＭＰＥＧ４画像圧縮情報（ビットストリーム）を出力する手段を提供するものである。
【０１００】
上記構成において、ピクチャタイプ判別部は、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内で、Ｉ／Ｐピクチャに関するものだけ残してＢピクチャに関するものは廃棄する。圧縮情報解析部は、１ＧＯＰ分の遅延を実現すると同時に、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内で、各フレームに対して割り当てられた符号量（ビット数）及び各フレームにおける平均量子化スケール、並びに、各フレームにおいて、後段のＭＰＥＧ４画像圧縮情報でそれぞれの擬似スライスを構成するマクロブロック全体に渡る平均量子化スケール及び発生符号量（ビット数）を抽出する。ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）は、ピクチャタイプ判別部の出力となる、Ｉ／Ｐピクチャに関する圧縮情報（ビットストリーム）を、水平方向垂直方向ともに、８次の離散コサイン係数全てを用いた、若しくはその低域成分のみを用いた復号処理を行う。間引き部は、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）の出力である画像情報の第一フィールド若しくは第二フイールドのみを取り出して順次走査画像への変換を行うと同時に、所望の画枠サイズに変換するためのダウンサンプリングを行う。ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）は、間引き部の出力となる画像情報をＭＰＥＧ４符号化方式により符号化する。動きベクトル合成部は、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）で検出された、入力となる画像圧縮情報（ビットストリーム）内の動きベクトル値を元に、走査変換後の画像データに対する動きベクトル値にマッピングを行う。動きベクトル検出部は、動きベクトル合成部から出力される動きベクトル値を元に、高精度の動きベクトル検出を行う。情報バッファは、圧縮情報解析部において抽出された、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における各フレームに割り当てられた符号量（ビット数）及び各フレームに対する平均量子化スケール、並びに、各フレームにおいて、後段のＭＰＥＧ４画像圧縮情報でそれぞれの擬似スライスを構成するマクロブロック全体に渡る平均量子化スケール及び発生符号量（ビット数）を格納する。ＶＯＰコンプレキシテイ算出部は、情報バッファに格納された、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における、各フレームに割り当てられた符号量（ビット数）、及び各フレームに対する平均量子化スケールから、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）における各ＶＯＰに対するコンプレキシティの推定値を算出する。擬似スライスコンプレキシティ算出部においては、情報バッファに格納された、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における、各擬似スライスに割り当てられた符号量（ビット数）、及び各擬似スライスに対する平均量子化スケールから、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）における各擬似スライスに対するコンプレキシティの推定値を算出する。ＶＯＰ目標符号量算出部は、ＶＯＰコンプレキシティ算出部において算出された、各ＶＯＰに対するコンプレキシティに基づいて、各ＶＯＰに対する目標符号量（ターゲットビット）の算出を行い、擬似スライス目標符号量算出部においては、ＶＯＰ目標符号量算出部において算出された各ＶＯＰに対する目標符号量（ターゲットビット）、及び擬似スライスコンプレキシティ算出部において算出された、各擬似スライスに対するコンプレキシティから、各擬似スライスに対する目標符号量（ターゲットビット）を算出し、その情報をＭＰＥＧ４画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）に伝送する。
【０１０１】
なお、ピクチャタイプ判別部を持たず、フレームレートの変換を行わない装置構成も可能である。また、ＶＯＰ目標符号量算出部において、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における各フレームに対するコンプレキシティを用いず、ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５に定められているのと同様の方式により各ＶＯＰに対する目標符号量を算出する装置構成も考えられる。
【０１０２】
【発明の実施の形態】
以下、図面を参照し、本発明の実施例について説明する。
【０１０３】
まず、本発明を適用した第１の実施の形態の画像情報変換装置について、図１を参照して説明する。
【０１０４】
この画像情報処理装置は、ピクチャタイプ判別部１６と、圧縮情報解析部１７と、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１８と、間引き部１９と、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０と、動きベクトル合成部２１と、動きベクトル検出部２２と、情報バッファ２３と、ＶＯＰコンプレキシティ算出部２４と、擬似スライスコンプレキシティ算出部２５と、ＶＯＰ目標符号量算出部２６と、擬似スライス目標符号量算出部２７とから構成される。
【０１０５】
この画像情報変換装置には、フレーム内で符号化されたイントラ符号化画像（Ｉピクチャ；Ｉ）、表示順序で順方向を参照して予測符号化された順方向予測符号化画像（Ｐピクチャ；Ｐ）及び表示順序で順方向及び逆方向を参照して予測符号化された双方向予測符号化画像（Ｂピクチャ；Ｂ）から構成される飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）が入力される。
【０１０６】
このＭＰＥＧ２画像圧縮情報（ビットストリーム）は、ピクチャタイプ判別部１６において、Ｉ／Ｐピクチャに関するものか、Ｂピクチャに関するものであるかを判別され、Ｉ／Ｐピクチャのみ後続の圧縮情報解析部１７に出力され、Ｂピクチャは破棄される。
【０１０７】
圧縮情報解析部１７において、ＭＰＥＧ２画像情報復号化装置（Ｉ／Ｐピクチャ）１８における復号処理に用いられる量子化スケールのフレーム全体に渡る平均値Ｑ、及び、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）において、当該フレームに割り当てられた総符号量（ビット数）Ｂは、情報バッファ２３に格納される。
【０１０８】
ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１８における処理は通常のＭＰＥＧ２画像情報復号化装置と同様に、ＭＰＥＧ２画像圧縮情報（ビットストリーム）を画像信号に復号するものである。ここで、Ｂピクチャに関するデータはピクチャタイプ判別部１６において廃棄されているので、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１８は、Ｉ／Ｐピクチャのみを復号化出来る機能を有すればよい。
【０１０９】
ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１８の出力となる画素値は、間引き部１９に入力される。間引き部１９は、水平方向には１／２の間引き処理を施し、垂直方向には、第一フィールド若しくは第二フィールドのどちらか一方のデータのみを残し、もう一方を廃棄する。このような間引きによって、入力となる画像情報の１／４の大きさを持つ順次走査画像を生成する。
【０１１０】
ところで、間引き部１９から出力された画像をＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０において１６×１６画素で構成されるマクロブロック単位で符号化するためには、水平方向、垂直方向ともに、その画素数が１６の倍数である必要が有る。間引き部１９においては、このための画素の補填若しくは廃棄を、間引きと同時に行う。
【０１１１】
例えば、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）がＮＴＳＣ（National Television System Committee）の規格に準拠したもの、つまり７２０×４８０画素、３０Ｈｚの飛び越し走査画像であった場合、間引き後の画枠はＳＩＦ（３６０×２４０画素）サイズということになる。この画像に対して、間引き部１９において、例えば水平方向の右端若しくは左端の８ラインを廃棄して３５２×２４０画素とする。
【０１１２】
なお、間引き部１９における動作の変更を行うことで、これ以外の画枠、例えば上記の例で、約１／４×１／４の画枠であるＱＳＩＦ（１７６×１１２画素）サイズの画像に変換することも可能である。
【０１１３】
更に、上述した文献１は、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１８における処理として、水平方向、垂直方向それぞれについて、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の、８次の離散コサイン変換係数すべてを用いた復号処理を行う画像情報変換装置について述べられているが、図１に示した装置に関してはその限りではなく、水平方向のみ、或いは水平方向、垂直方向ともに、８次の離散コサイン変換係数のうちの低域成分のみを用いた復号処理を行い、画質劣化を最小限に抑えながら、復号処理に伴う演算量とビデオメモリ容量を削減することが可能である。
【０１１４】
間引き部１９によって生成された順次走査画像はＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０によってフレーム内で符号化されたＩ−ＶＯＰ及び表示順序で順方向を参照して予測符号化されたＰ−ＶＯＰに符号化され、ＭＰＥＧ４画像圧縮情報（ビットストリーム）として出力される。
【０１１５】
尚、ＶＯＰはVideo object Planeを意味し、ＭＰＥＧ２におけるフレームに相当するものである。また、Ｉ−ＶＯＰはＩピクチャに対応するイントラ符号化ＶＯＰ、Ｐ−ＶＯＰはＰピクチャに対応する順方向予測符号化ＶＯＰ、Ｂ−ＶＯＰはＢピクチャに対応する双方向予測符号化ＶＯＰである。
【０１１６】
ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０における符号化の際には、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）中の動きベクトル情報は、動きベクトル合成部２１において間引き後の画像情報に対する動きベクトルにマッピングされ、動きベクトル検出部２２においては、動きベクトル合成部２１において合成された動きベクトル値を元に高精度の動きベクトルを検出する。
【０１１７】
ここで、本実施の形態の画像情報変換装置で用いられる、擬似スライスの概念について述べる。
【０１１８】
ＭＰＥＧ２画像圧縮情報（ビットストリーム）には、図２に示すようなスライス層が存在する。すなわち、スライス層では、画面内で図２のように横長の帯状の領域を示し、（図２のａ，ｂ，ｃ…の領域）画面を複数のスライスで構成することにより、あるスライス層でエラーが発生しても、次のスライス層の開始（ｓｌｉｃｅ＿ｓｔａｒｔ＿ｃｏｄｅ）からの同期でエラー回復が可能となる。スライス層は１個以上のマクロブロックから構成され、ラスタスキャンオーダで、左から右、上から下に並び、その長さや開始位置は自由で、画面毎に変更可能である。但し、並列処理や効果的なエラー耐性を目的として、一つのスライスは右方向にのみ伸び、下方にまで伸びることはない。
【０１１９】
ＭＰＥＧ４画像圧縮情報（ビットストリーム）においては、低ビットレートにおける符号化効率を考慮して、図２に示したようなスライス層は構文上定義されていないが、本実施の形態においては、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０における符号化処理のため、図３に示すような擬似スライスを定義する。すなわち、例えば、図１に示した画像情報変換装置によって、入力となる、飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）が、１／２×１／２の画枠を持つ順次走査のＭＰＥＧ４画像圧縮情報（ビットストリーム）に変換される場合を考える。この時、図３のＡにおける入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の４つのマクロブロックＭＢ₀，ＭＢ₁，ＭＢ_2m，ＭＢ_2m+1が、図３のＢに示す出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）においては１つのマクロブロックＭｂ₀に対応することになる。この時、図３のＡにおいてマクロブロックＭ_B0，Ｍ_B1，…ＭＢ_4m-1から構成される領域ａに対応して、図３のＢに示す出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）における、マクロブロックＭｂ₀，Ｍｂ₁，…，Ｍｂ_m-1から構成されるマクロブロック群ａを擬似スライス０と定義する。疑似スライス１以降に関しても同様である。
【０１２０】
圧縮情報解析部１７においては、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の、各フレームに対する割当符号量（ビット数）Ｂ_k、及び各フレームにおける平均量子化スケールＱ_kに関する情報を抽出し、情報バッファ２３に格納する。同時に、各フレームにおいて、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）において擬似スライス１を構成するマクロブロック群に対する割当符号量（ビット数）Ｂ_{pseudo_slice1}，及び平均量子化スケールＱ_{pseudo_slice1}を情報バッファ２３に格納する。図３のＡにおいて、ｎを整数として、入力となる画像圧縮情報（ビットストリーム）内のあるマクロブロックＭＢ_nに対する発生符号量及び量子化スケールをそれぞれＢ_MBn，Ｑ_MBnとすれば、擬似スライス０に対して、次の式（２６）が成り立つ。
【０１２１】
【数２９】
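The image for equation (26) is not reproduced in this text. As a hedged sketch of the aggregation it describes, the code below assumes that the bit count of a pseudo slice is the sum of the bit counts of the 4m input macroblocks it covers (two input macroblock rows of 2m macroblocks each) and that its quantization scale is their arithmetic mean. The function name and the list-based layout are illustrative.

```python
def pseudo_slice_stats(mb_bits, mb_qscale, mbs_per_row, rows_per_slice=2):
    """Aggregate per-macroblock MPEG2 statistics into pseudo-slice statistics.

    mb_bits, mb_qscale: per-macroblock values in raster-scan order.
    mbs_per_row: macroblocks per input row (2m in the text).
    Returns a list of (B_pseudo_slice, Q_pseudo_slice) tuples.
    """
    group = mbs_per_row * rows_per_slice
    stats = []
    for start in range(0, len(mb_bits), group):
        bits = mb_bits[start:start + group]
        q = mb_qscale[start:start + group]
        stats.append((sum(bits), sum(q) / len(q)))
    return stats


bits = list(range(16))            # 4 rows of 4 macroblocks (m = 2)
qscale = [2] * 8 + [4] * 8
print(pseudo_slice_stats(bits, qscale, mbs_per_row=4))
```

For a 720 × 480 input there are 30 macroblock rows, so this grouping yields 15 pseudo slices, one per macroblock row of the 240-line output.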

【０１２２】
擬似スライス１以降に関しても同様である。但し、ｍが奇数の場合、例えば入力となる飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）の画枠が７２０×４８０画素であった場合、その１／２×１／２は３６０×２４０画素ということになるが、後続のＭＰＥＧ４画像情報符号化装置（Ｉ／Ｐ−ＶＯＰ）２０においてマクロブロック単位の処理を行うためには、間引き部１９において、例えば画枠に対する右４画素を破棄して、３５２×２４０画素とする必要がある。この場合には、式（２６）に示した値をとしてＢ_{pseudo_slice0}，Ｑ_{pseudo_slice0}として用いても良いし、次の式（２７）のようにしても良い。
【０１２３】
【数３０】

【０１２４】
ＶＯＰコンプレキシティ算出部２４においては、情報バッファ２３に格納された、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の、各フレームに対する割当符号量（ビット数）Ｂ_k及び各フレームにおける平均量子化スケールＱ_kに関する情報から、各ＶＯＰに対するコンプレキシティの推測値Ｘ_kを次の式（２８）のように１ＧＯＶ分算出する。
【０１２５】
【数３１】

【０１２６】
ＶＯＰ目標符号量算出部２６においては、ＶＯＰコンプレキシテイ算出部２４を用いて、式（２５）により、各ＶＯＰに対する目標符号量（ターゲットビット）を算出する。以下では、式（２５）によつて求められる各ＶＯＰに対する目標符号量（式ではＲ₁）をＴ_vopと表す。
【０１２７】
擬似スライスコンプレキシティ算出部２５においては、情報バッファ２３に格納された、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）において擬似スライス１を構成するマクロブロック群に対する割当符号量（ビット数）Ｂ_{pseudo_slice1}及び平均量子化スケールＱ_{pseudo_slice1}に関する情報から、各擬似スライスに対するコンプレキシティＸ_{pseudo_slice1}を、次の式（２９）のように算出する。
【０１２８】
【数３２】

【０１２９】
今、当該ＶＯＰが、擬似スライス０，擬似スライス１，…擬似スライスＮ−１から構成されているとすれば、疑似スライス目標量算出部２７においては、疑似スライス１に対する目標符号量Ｔ_{pseudo_slice1}が、次の式（３０）のように算出され、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０に伝送される。
【０１３０】
【数３３】
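The images for equations (29) and (30) are not reproduced in this text. The sketch below assumes that the complexity of a pseudo slice is the product of its bit count and its average quantization scale, as for whole frames, and that the VOP target T_vop is divided among the pseudo slices in proportion to those complexities. The function name is illustrative.

```python
def pseudo_slice_targets(t_vop, slice_stats):
    """Distribute the VOP target over pseudo slices in proportion to complexity.

    slice_stats: list of (B_pseudo_slice, Q_pseudo_slice) tuples taken
    from the input MPEG2 bitstream.
    """
    complexities = [bits * q for bits, q in slice_stats]
    total = sum(complexities)
    return [t_vop * x / total for x in complexities]


targets = pseudo_slice_targets(8000.0, [(100, 2.0), (300, 2.0)])
print(targets)
```

The targets always sum to T_vop, so the per-VOP allocation is unchanged and only its distribution inside the VOP follows the input bitstream.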

【０１３１】
Ｉ−ＶＯＰ，Ｐ−ＶＯＰ，Ｂ−ＶＯＰに対するＴ_{pseudo_slice1}をそれぞれＴ_{i_pseudo_slice1}，Ｔ_{p_pseudo_slice1}，Ｔ_{b_pseudo_slice1}とすれば、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０において、仮想バッファ（ｄ_j ⁱ，ｄ_j ^p，ｄ_j ^b）の占有量を、式（２７）に対応して、次の式（３１）のように算出する。
【０１３２】
【数３４】

【０１３３】
ここで、ｄ₀ ⁱ，ｄ₀ ^p，ｄ₀ ^bは、擬似スライス先頭における各仮想バッファの占有量であり、Ｂ_{pseudo_slice_j}は、擬似スライス先頭からｊ番目のマクロブロックでの発生符号量である。Ｐ＿ＳＬＩＣＥ＿ＣＮＴは１擬似スライスに含まれるマクロブロックの個数で、各擬似スライス符号化終了時における仮想バッファの占有量（ｄ_{P_SLICE_CNT} ⁱ，ｄ_{P_SLICE_CNT} ^p，ｄ_{P_SLICE_CNT} ^b）は、次の擬似スライスに対する仮想バッファの占有量の初期値（ｄ₀ ⁱ，ｄ₀ ^p，ｄ₀ ^b）として用いられる。
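The image for equation (31) is not reproduced in this text. Based on the symbol definitions above, and assuming it mirrors the per-picture virtual buffer update of Test Model 5 with the picture target replaced by the pseudo-slice target, it presumably has the following form. The index l for the current pseudo slice is a hypothetical notation introduced here.

```latex
\begin{aligned}
d_j^{i} &= d_0^{i} + B_{\mathrm{pseudo\_slice}\_{j-1}}
          - \frac{T_{i\_\mathrm{pseudo\_slice}\,l}\,(j-1)}{\mathrm{P\_SLICE\_CNT}} \\
d_j^{p} &= d_0^{p} + B_{\mathrm{pseudo\_slice}\_{j-1}}
          - \frac{T_{p\_\mathrm{pseudo\_slice}\,l}\,(j-1)}{\mathrm{P\_SLICE\_CNT}} \\
d_j^{b} &= d_0^{b} + B_{\mathrm{pseudo\_slice}\_{j-1}}
          - \frac{T_{b\_\mathrm{pseudo\_slice}\,l}\,(j-1)}{\mathrm{P\_SLICE\_CNT}}
\end{aligned}
```

Read this way, the buffer error is measured against the budget of each pseudo slice rather than against a uniform share of the whole VOP, which is what keeps the reference quantization scale from drifting across the VOP.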
【０１３４】
以上のようなＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０における一連の符号量制御の動作原理について、図４を参照して説明する。
【０１３５】
最初のステップＳ３１において、圧縮情報解析部１７は、ピクチャタイプ判別部１６を介して入力されたＭＰＥＧ２画像圧縮情報（ビットストリーム）の構文を解析し、各フレームに対する割当符号量（ビット数）Ｂ_k、及び各フレームにおける平均量子化スケールＱ_kに関する情報を抽出し、情報バッファ２３に格納する。同時に、各フレームにおいて、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）において疑似スライス１を構成するマクロブロック群に対する割当符号量（ビット数）Ｂ_{pseudo_slice1}、及び平均量子化スケールＱ_{pseudo_slice1}を情報バッファ２３に格納する。
【０１３６】
ステップＳ３２において、ＶＯＰコンプレキシティ算出部２４は、情報バッファ２３に格納された、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）内の、各フレームに対する割当符号量（ビット数）Ｂ_k、及び各フレームにおける平均量子化スケールＱ_kに関する情報から、各ＶＯＰに対するコンプレキシティの推測値Ｘ_kを１ＧＯＶ分算出する。
【０１３７】
ステップＳ３３において、ＶＯＰ目標符号量算出部２６は、各ＶＯＰに対する目標符号量（ターゲットビット）を算出する。
【０１３８】
ステップＳ３４において、疑似スライスコンプレキシティ算出部２５は、情報バッファ２３に格納された、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）において疑似スライス１を構成するマクロブロック群に対する割当符号量（ビット数）Ｂ_{pseudo_slice1}、及び平均量子化スケールＱ_{pseudo_slice1}に関する情報から、各疑似スライスに対するコンプレキシティＸ_{pseudo_slice1}を算出する。
【０１３９】
ステップＳ３５において、疑似スライス目標符号量算出部２７は、各疑似スライスに対応する目標符号量（ターゲットビット）を算出し、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０に伝送する。
【０１４０】
ステップＳ３６において、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０は、仮想バッファを用いたレート制御を行う。ステップＳ３７において、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０は、視覚特性を考慮したマクロブロックごとの適応量子化を行う。
【０１４１】
次に、本発明を適用した第２の実施の形態の画像情報変換装置について、図５を参照して説明する。
【０１４２】
この画像情報装置は、ピクチャタイプ判別部２８と、圧縮情報解析部２９と、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）３０と、間引き部３１と、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）３２と、動きベクトル合成部３３と、動きベクトル検出部３４と、情報バッファ３５と、擬似スライスコンプレキシティ算出部３６と、ＶＯＰ目標符号量算出部３７と、擬似スライス目標符号量算出部３８とから構成される。
【０１４３】
図１に示した画像情報変換装置と図５に示した画像情報変換装置における相違点は、図１に示した画像情報変換装置においては、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）２０における、各ＶＯＰに対する目標符号量（ターゲットビット）を式（２５）により算出するのに対し、図５に示した画像情報変換装置においては、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）３２における、各ＶＯＰに対する目標符号量（ターゲットビット）を式（４）により算出する点にある。すなわち、図５に示した画像情報変換装置においては、圧縮情報解析部２９において、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）におけるＧＯＰ構造を抽出してこれを情報バッファ３５に格納し、これより、ＶＯＰ目標符号量算出部３７においては、出力となるＭＰＥＧ４画像圧縮情報（ビットストリーム）におけるＧＯＶ構造を決定し、式（４）に基づいて各ＶＯＰに対する目標符号量（ターゲットビット）の算出を行う。
【０１４４】
以上、入力としてＭＰＥＧ２画像圧縮情報（ビットストリーム）を、出力としてＭＰＥＧ４画像圧縮情報（ビットストリーム）を対象としてきたが、入力、出力ともこれに限らず、例えばＭＰＥＧ−１やＨ．２６３などの画像圧縮情報（ビツトストリーム）でも良い。
【０１４５】
【発明の効果】
以上述べてきた様に、本発明は、飛び越し走査のＭＰＥＧ２画像圧縮情報（ビットストリーム）を入力とし、入力となるＭＰＥＧ２画像圧縮情報（ビットストリーム）における、各スライスに対するコンプレキシティ情報を用いて、ＭＰＥＧ４画像符号化の際に擬似スライス単位の目標符号量（ターゲットビット）を与えることで、符号量制御におけるコンプレキシティを算出するステップに伴う参照量子化スケールの変動を最小限に抑え、各マクロブロックに対する符号量割当が画像に対して最適化された状態で順次走査のＭＰＥＧ４画像圧縮情報（ビツトストリーム）に変換して出力する手段を提供するものである。
【図面の簡単な説明】
【図１】第１の実施の形態の画像情報変換装置の構成を示すブロック図である。
【図２】ＭＰＥＧ２画像圧縮情報（ビットストリーム）におけるスライス層の概念を説明する図である。
【図３】ＭＰＥＧ４画像圧縮情報（ビットストリーム）における疑似スライスの概念を示す図である。
【図４】コンプレキシティを用いて符号量制御を行う動作フローを示す図である。
【図５】第２の実施の形態の画像情報変換装置の構成を示すブロック図である。
【図６】従来の画像情報変換装置の構成を示すブロック図である。
【図７】ＭＰＥＧ２ＴｅｓｔＭｏｄｅｌ５（ＩＳＯ／ＩＥＣＪＴＣ１／ＳＣ２９／ＷＧ１１Ｎ０４００）で述べられている符号量制御方式の動作原理を示すフローチャートである。
【図８】本願出願人が提案した画像情報変換装置の構成を示す図である。
【図９】図８の画像情報変換装置における符号量制御の動作を示すフローチャートである。
【図１０】ＣＣＩＲテストシーケンスの一つである“ＦｌｏｗｅｒＧａｒｄｅｎ”を、ｎ＝１５；ｍ＝３の条件の元、４Ｍｂｐｓに圧縮したＭＰＥＧ２画像圧縮情報（ビットストリーム）を、図８に示した画像情報変換装置を用いて、ｎ＝５；ｍ＝１のＭＰＥＧ４画像圧縮情報（ビットストリーム）に変換する際の、あるＶＯＰに対するＱ_jがどのような値を取るかを示した図である。
【符号の説明】
１６ピクチャタイプ判別部、１７圧縮情報解析部、１８ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）、１９間引き部、２０ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）、２１動きベクトル合成部、２２動きベクトル検出部、２３情報バッファ、２４ＶＯＰコンプレキシテイ算出部、２５擬似スライスコンプレキシティ算出部、２６ＶＯＰ目標符号量算出部、２７擬似スライス目標符号量算出部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image information conversion apparatus and method, and an encoding apparatus and method, for converting image information. More specifically, it relates to an image information conversion apparatus and method and an encoding apparatus and method used when image information (a bitstream) compressed by an orthogonal transform such as the discrete cosine transform and by motion compensation, as in MPEG, is received via network media such as satellite broadcasting, cable TV, or the Internet, or is processed on storage media such as optical or magnetic disks.
[0002]
[Prior art]
In recent years, image information compression schemes such as MPEG have become available. These schemes handle image information digitally and, with the aim of efficient transmission and storage, exploit the redundancy peculiar to image information by compressing it with an orthogonal transform such as the discrete cosine transform and with motion compensation. Apparatuses conforming to such schemes are becoming widespread both for information distribution by broadcasting stations and for information reception in ordinary households.
[0003]
In particular, MPEG2 (ISO / IEC 13818-2) is defined as a general-purpose image encoding method that covers both interlaced scanning images and progressive scanning images, as well as standard resolution images and high-definition images.
[0004]
That is, with the MPEG2 coding scheme, assigning a code amount (bit rate) of 4 to 8 Mbps to a standard-resolution interlaced image of 720 × 480 pixels, and 18 to 22 Mbps to a high-resolution interlaced image of 1920 × 1088 pixels, achieves both a high compression ratio and good image quality.
[0005]
For this reason, MPEG2 is expected to remain in use in a wide range of professional and consumer applications.
However, MPEG2 mainly targets high-quality encoding suited to broadcasting and did not support encoding at code amounts (bit rates) lower than those of MPEG1, that is, at higher compression ratios.
[0006]
On the other hand, with the recent spread of mobile terminals, the need for encoding schemes with a high compression ratio is expected to keep growing, and the MPEG4 encoding scheme, which offers a high compression ratio, has been standardized in response. This image encoding scheme was approved as the international standard ISO/IEC 14496-2 in December 1998.
[0007]
There is also a need to convert MPEG2 image compression information (a bitstream) that was encoded once for digital broadcasting into image compression information (a bitstream) with a lower code amount (bit rate) that is better suited to processing on a mobile terminal or the like.
[0008]
To this end, an image information conversion apparatus (transcoder) is presented in “Field-to-Frame Transcoding with Spatial and Temporal Downsampling” (Susie L Wee, John G. Apostolopoulos, and Nick Feamster, ICIP 99; hereinafter referred to as Reference 1).
[0009]
As shown in FIG. 6, the image information conversion apparatus (transcoder) presented in Reference 1 comprises a picture type determination unit 1, an MPEG2 image information decoding unit (I/P picture) 2, a thinning unit 3, an MPEG4 image information encoding unit (I/P-VOP) 4, a motion vector synthesis unit 5, and a motion vector detection unit 6.
[0010]
The input to this image information conversion apparatus is interlaced MPEG2 image compression information (a bitstream) composed of intra-coded pictures (I pictures; I), which are coded within a frame, forward predictive-coded pictures (P pictures; P), which are predictively coded with reference to the forward direction in display order, and bidirectionally predictive-coded pictures (B pictures; B), which are predictively coded with reference to both the forward and backward directions in display order.
[0011]
The picture type determination unit 1 determines whether each part of this MPEG2 image compression information (bitstream) belongs to an I/P picture or to a B picture. Only I/P pictures are passed to the subsequent MPEG2 image information decoding unit (I/P picture) 2; B pictures are discarded.
[0012]
The processing in the MPEG2 image information decoding unit (I / P picture) 2 is to decode MPEG2 image compression information (bitstream) into an image signal in the same manner as a normal MPEG2 image information decoding apparatus.
[0013]
The pixel values output by the MPEG2 image information decoding unit (I/P picture) 2 are input to the thinning unit 3. The thinning unit 3 decimates by 1/2 in the horizontal direction and, in the vertical direction, keeps the data of only one of the first and second fields and discards the other. This decimation produces a progressive-scan image one quarter the size of the input image.
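The decimation described above can be sketched as follows. This is a minimal illustration, assuming plain 2:1 subsampling without the low-pass filtering a practical decimator would normally apply, and assuming the even-numbered lines form the first field. The function name and the optional cropping argument are illustrative.

```python
def thin_frame(frame, keep_field=0, crop_width=None):
    """Keep one field and halve the width of an interlaced frame.

    frame: list of rows (lists of pixel values), fields interleaved line by line.
    keep_field: 0 keeps even-numbered lines, 1 keeps odd-numbered lines.
    crop_width: if given, columns beyond this width are dropped on the right.
    """
    rows = frame[keep_field::2]              # vertical: one field only
    rows = [row[::2] for row in rows]        # horizontal: 2:1 subsampling
    if crop_width is not None:
        rows = [row[:crop_width] for row in rows]
    return rows


frame = [[y * 720 + x for x in range(720)] for y in range(480)]
out = thin_frame(frame, keep_field=0, crop_width=352)
print(len(out[0]), len(out))
```

For a 720 × 480 input this gives 360 × 240 before cropping; cropping to 352 columns makes the width a multiple of the 16-pixel macroblock size.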
[0014]
The progressive-scan image generated by the thinning unit 3 is encoded by the MPEG4 image information encoding unit (I/P-VOP) 4 into I-VOPs, which are coded within a frame, and P-VOPs, which are predictively coded with reference to the forward direction in display order, and is output as MPEG4 image compression information (a bitstream). VOP stands for Video Object Plane and corresponds to a frame in MPEG2.
[0015]
At that time, the motion vector information in the input MPEG2 image compression information (bitstream) is mapped by the motion vector synthesis unit 5 to motion vectors for the decimated image, and the motion vector detection unit 6 detects high-precision motion vectors starting from the motion vector values synthesized by the motion vector synthesis unit 5.
[0016]
Reference 1 describes an image information conversion apparatus that generates MPEG4 image compression information (a bitstream) of a progressive-scan image 1/2 × 1/2 the size of the input MPEG2 image compression information (bitstream). For example, if the input MPEG2 image compression information (bitstream) conforms to the NTSC (National Television System Committee) standard, the output MPEG4 image compression information has SIF size (352 × 240 pixels).
[0017]
In the image information conversion apparatus shown in FIG. 6, code amount control in the MPEG4 image information encoding unit (I/P-VOP) 4 is a major factor determining the image quality of the output MPEG4 image compression information (bitstream). ISO/IEC 14496-2 does not prescribe a code amount control method, so each vendor can use the method it considers best for its application in terms of computational cost and output image quality. The method described in MPEG2 Test Model 5 (ISO/IEC JTC1/SC29/WG11 N0400) is described below as a representative code amount control method.
[0018]
This code amount control is explained with reference to the flow shown in FIG. 7. In the first step S11, the image information encoding unit (I/P-VOP) 4 allocates bits to each picture, taking the target code amount (target bit rate) and the GOP (group of pictures) structure as input variables. Here, a GOP is a set of pictures that can be randomly accessed.
[0019]
That is, in step S11, the image information encoding unit (I/P-VOP) 4 distributes the bits allocated to each picture in the GOP on the basis of the bit amount (hereinafter R) available for the pictures in the GOP that have not yet been encoded, including the picture being allocated. This distribution is repeated in the coding order of the pictures in the GOP. In doing so, the code amount is allocated to each picture using the two assumptions described below.
[0020]
First, it is assumed that the product of the average quantization scale code used to encode each picture and the generated code amount is constant for each picture type as long as the scene does not change. Accordingly, after each picture is encoded, the variables X_i, X_p, X_b that indicate the complexity of the picture (the global complexity measures) are updated for each picture type by the following equation (1).
[0021]
[Expression 4]

[0022]
Here, S_i, S_p, S_b are the numbers of bits generated when the picture was encoded, and Q_i, Q_p, Q_b are the average quantization scale codes used when the picture was encoded. The initial values are given by equation (2) using the target code amount (target bit rate) bit_rate [bits/sec].
[0023]
[Equation 5]
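The images for equations (1) and (2) are not reproduced in this text. As a minimal sketch, assuming the forms published in MPEG2 Test Model 5 (X = S · Q for the update, and the constants 160, 60, 42 and 115 for the initial values, which are taken from that model and not from the text above), the complexity bookkeeping can be written as follows. Function names are illustrative.

```python
def initial_complexity(bit_rate):
    """Equation (2) as assumed from Test Model 5."""
    return {
        "I": 160.0 * bit_rate / 115.0,
        "P": 60.0 * bit_rate / 115.0,
        "B": 42.0 * bit_rate / 115.0,
    }


def update_complexity(X, picture_type, generated_bits, avg_qscale):
    """Equation (1): X = S * Q for the picture type that was just encoded."""
    updated = dict(X)
    updated[picture_type] = generated_bits * avg_qscale
    return updated


X = initial_complexity(bit_rate=1_150_000)
X = update_complexity(X, "P", generated_bits=40_000, avg_qscale=12.5)
print(X)
```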

[0024]
Second, it is assumed that the overall image quality is always optimized when the ratios K_p, K_b of the quantization scale codes of P and B pictures to the quantization scale code of the I picture take the values defined in equation (3).
[0025]
[Formula 6]
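The equation image is missing. Test Model 5 uses the following fixed values, which agree with the factor of 1.4 stated in the surrounding text.

```latex
K_p = 1.0, \qquad K_b = 1.4 \tag{3}
```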

[0026]
That is, the quantization scale code for B pictures is always 1.4 times the quantization scale code for I and P pictures. The assumption is that if B pictures are quantized somewhat more coarsely than I and P pictures, and the code amount saved in the B pictures is given to the I and P pictures, the image quality of the I and P pictures improves, and the image quality of the B pictures that refer to them improves as well.
[0027]
Based on the above two assumptions, the bit amounts (T_i, T_p, T_b) allocated to each picture in the GOP take the values shown in equation (4).
[0028]
[Expression 7]
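The equation image is missing. The target bit amounts published in Test Model 5 have the form below. The symbol picture_rate (pictures per second) comes from Test Model 5 and does not appear in the surrounding text, so it should be read as an assumption of this reference form.

```latex
\begin{aligned}
T_i &= \max\left\{ \frac{R}{1 + \dfrac{N_p X_p}{X_i K_p} + \dfrac{N_b X_b}{X_i K_b}},\;
       \frac{\mathit{bit\_rate}}{8 \cdot \mathit{picture\_rate}} \right\} \\
T_p &= \max\left\{ \frac{R}{N_p + \dfrac{N_b K_p X_b}{K_b X_p}},\;
       \frac{\mathit{bit\_rate}}{8 \cdot \mathit{picture\_rate}} \right\} \\
T_b &= \max\left\{ \frac{R}{N_b + \dfrac{N_p K_b X_p}{K_p X_b}},\;
       \frac{\mathit{bit\_rate}}{8 \cdot \mathit{picture\_rate}} \right\}
\end{aligned} \tag{4}
```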

[0029]
Here, N_p, N_b are the numbers of P and B pictures that have not yet been encoded in the GOP.
[0030]
Based on the allocated code amount obtained in this manner, the bit amount R allocated to the unencoded pictures in the GOP is updated by equation (5) each time a picture is encoded according to steps S11 and S12.
[0031]
[Equation 8]

[0032]
When the first picture of the GOP is encoded, R is updated by equation (6).
[0033]
[Equation 9]

[0034]
N is the number of pictures in the GOP. The initial value of R at the beginning of the sequence is 0.
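The first step described above can be sketched as follows. This is a minimal illustration that assumes the equations take their published Test Model 5 forms (the equation images are not reproduced in this text). The function names and the parameter picture_rate are hypothetical, chosen for the example.

```python
K_P, K_B = 1.0, 1.4  # fixed quantizer ratios of equation (3)


def initial_complexity(bit_rate):
    """Initial X_i, X_p, X_b (Test Model 5 constants, equation (2))."""
    return {"I": 160 * bit_rate / 115,
            "P": 60 * bit_rate / 115,
            "B": 42 * bit_rate / 115}


def start_gop(R, bit_rate, n_pictures, picture_rate):
    """Equation (6): add the bit budget of a new GOP to the carried-over R."""
    return R + bit_rate * n_pictures / picture_rate


def target_bits(ptype, R, n_p, n_b, X, bit_rate, picture_rate):
    """Equation (4): target bits for the next picture of type 'I', 'P' or 'B'."""
    if ptype == "I":
        denom = 1 + n_p * X["P"] / (X["I"] * K_P) + n_b * X["B"] / (X["I"] * K_B)
    elif ptype == "P":
        denom = n_p + n_b * K_P * X["B"] / (K_B * X["P"])
    else:
        denom = n_b + n_p * K_B * X["P"] / (K_P * X["B"])
    # Lower bound used by Test Model 5 so that no picture is starved of bits.
    return max(R / denom, bit_rate / (8 * picture_rate))


def after_encoding(ptype, R, X, generated_bits, avg_qscale):
    """Equations (1) and (5): update the complexity and the remaining budget."""
    X = dict(X)  # leave the caller's dictionary untouched
    X[ptype] = generated_bits * avg_qscale
    return R - generated_bits, X
```

A caller would invoke start_gop once per GOP, then alternate target_bits and after_encoding for each picture in coding order, decrementing n_p or n_b as pictures are consumed.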
[0035]
Next, in step S12, the image information encoding unit (I/P-VOP) 4 performs rate control using virtual buffers. That is, in order to make the actual generated code amount match the bit amounts (T_i, T_p, T_b) allocated to each picture in step S11, the quantization scale code is obtained by feedback control in macroblock units, based on the occupancy of three virtual buffers set independently for each picture type.
[0036]
First, prior to encoding the j-th macroblock, the occupancy amount of the virtual buffer is obtained by Expression (7).
[0037]
[Expression 10]
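The equation image is missing. In the published Test Model 5 description the virtual buffer occupancies are given in the following form, shown here for reference.

```latex
\begin{aligned}
d_j^{i} &= d_0^{i} + B_{j-1} - \frac{T_i\,(j-1)}{\mathit{MB\_cnt}} \\
d_j^{p} &= d_0^{p} + B_{j-1} - \frac{T_p\,(j-1)}{\mathit{MB\_cnt}} \\
d_j^{b} &= d_0^{b} + B_{j-1} - \frac{T_b\,(j-1)}{\mathit{MB\_cnt}}
\end{aligned} \tag{7}
```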

[0038]
Here, d_0^i, d_0^p, d_0^b are the initial occupancies of the respective virtual buffers, B_j is the number of bits generated from the beginning of the picture up to the j-th macroblock, and MB_cnt is the number of macroblocks in one picture. The virtual buffer occupancies at the end of encoding each picture (d_{MB_cnt}^i, d_{MB_cnt}^p, d_{MB_cnt}^b) are used as the initial virtual buffer occupancies (d_0^i, d_0^p, d_0^b) for the next picture of the same picture type.
[0039]
Next, the quantization scale code for the j-th macroblock is calculated according to equation (8).
[0040]
[Expression 11]

[0041]
Here, r is a variable that controls the response of the feedback loop, called a reaction parameter, and is given by equation (9).
[0042]
[Expression 12]

[0043]
The initial value of the virtual buffer at the start of encoding is given by equation (10).
[0044]
[Formula 13]
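The second step can be sketched as below, again assuming the published Test Model 5 forms for equations (7) to (10), since the equation images are not reproduced here. The function names are hypothetical, and the per-macroblock bit counts are supplied as input only to keep the example self-contained; a real encoder produces them as it quantizes each macroblock. The end-of-picture occupancy is computed as the initial occupancy plus all generated bits minus the target, which is one common reading and should be treated as an assumption.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Equation (9): r = 2 * bit_rate / picture_rate."""
    return 2 * bit_rate / picture_rate


def initial_fullness(r, k_p=1.0, k_b=1.4):
    """Equation (10): initial occupancy of the three virtual buffers."""
    d0_i = 10 * r / 31
    return {"I": d0_i, "P": k_p * d0_i, "B": k_b * d0_i}


def macroblock_qscales(d0, target, mb_bits, r):
    """Equations (7) and (8): reference quantization scale Q_j per macroblock.

    d0      -- buffer occupancy at the start of the picture
    target  -- target bits T for the picture from step S11
    mb_bits -- bits generated by each macroblock (example input)
    Returns the list of Q_j and the occupancy carried to the next picture.
    """
    mb_cnt = len(mb_bits)
    generated = 0  # B_{j-1}: bits produced before macroblock j
    qscales = []
    for j in range(1, mb_cnt + 1):
        d_j = d0 + generated - target * (j - 1) / mb_cnt
        qscales.append(d_j * 31 / r)
        generated += mb_bits[j - 1]
    return qscales, d0 + generated - target
```

When every macroblock produces exactly its share of the target, the occupancy stays constant and Q_j is uniform over the picture; any deviation feeds back into the following macroblocks.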

[0045]
Finally, in step S13, the image information encoding unit (I/P-VOP) 4 performs adaptive quantization for each macroblock in consideration of visual characteristics. That is, the quantization scale code obtained in step S12 is modified by a per-macroblock variable called activity, so that flat regions, where degradation is visually conspicuous, are quantized more finely, and regions with complicated patterns, where degradation is relatively inconspicuous, are quantized more coarsely.
[0046]
The activity is computed from the luminance pixel values of the original picture, using a total of eight blocks: four blocks in frame discrete cosine transform mode and four blocks in field discrete cosine transform mode. It is given by equation (11).
[0047]
[Expression 14]
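The equation image is missing. The activity measure published in Test Model 5 has the following form, shown here for reference; the block index sblk and the block mean are Test Model 5 notation, not taken from the surrounding text.

```latex
\begin{aligned}
\mathit{act}_j &= 1 + \min_{\mathit{sblk} = 1, \dots, 8} \mathit{var}_{\mathit{sblk}}, \\
\mathit{var}_{\mathit{sblk}} &= \frac{1}{64} \sum_{k=1}^{64} \left( P_k - \bar{P} \right)^2, \qquad
\bar{P} = \frac{1}{64} \sum_{k=1}^{64} P_k
\end{aligned} \tag{11}
```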

[0048]
Here, P_k are the pixel values in a luminance signal block of the original picture. The minimum is taken in equation (11) so that quantization becomes fine even when only part of the macroblock is flat.
[0049]
Furthermore, the normalized activity Nact_j, whose value lies in the range 0.5 to 2, is obtained by equation (12).
[0050]
[Expression 15]

[0051]
Here, avg_act is the average value of act_j in the picture encoded immediately before.
[0052]
The quantization scale code mquant_j, which takes visual characteristics into account, is given by equation (13) based on the quantization scale code Q_j obtained in step S12.
[0053]
[Expression 16]
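The third step can be sketched as follows, assuming the published Test Model 5 forms of equations (11) to (13) (the equation images are not reproduced in this text). Function names are hypothetical. Each block is passed as a flat list of luminance values, and the caller is assumed to supply the eight frame-mode and field-mode blocks.

```python
def block_variance(block):
    """Variance of the luminance values in one 8x8 block (flat list)."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n


def activity(blocks):
    """Equation (11): 1 plus the smallest block variance in the macroblock."""
    return 1 + min(block_variance(b) for b in blocks)


def normalized_activity(act, avg_act):
    """Equation (12): maps the activity into the open range (0.5, 2)."""
    return (2 * act + avg_act) / (act + 2 * avg_act)


def mquant(q_j, act, avg_act):
    """Equation (13): quantization scale adjusted for visual characteristics."""
    return q_j * normalized_activity(act, avg_act)
```

A macroblock containing one flat block gets an activity near 1 and therefore a finer quantization scale, even if its other blocks are busy; this is the effect of taking the minimum.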

[0054]
The code amount control method defined in MPEG2 Test Model 5 is known to have the following limitations, and countermeasures are required when it is used for actual control. The first limitation is that the first step S11 cannot cope with scene changes, and the parameter avg_act used in the third step S13 takes an incorrect value after a scene change. The second limitation is that there is no guarantee that the constraints of the VBV (Video Buffer Verifier) defined in MPEG2 and MPEG4 are satisfied.
[0055]
As also described in "Theoretical analysis of MPEG compression efficiency and its application to code amount control" (Science Technical Report, IE-95, DSP 95-10, May 1995, hereinafter referred to as Document 2), the code amount control method defined in Test Model 5 does not necessarily give good image quality in an MPEG2 image encoding apparatus.
[0056]
Document 2 proposes the following method for giving an optimal code amount distribution to each frame in a GOP so as to obtain good image quality. Let N_I, N_P, N_B be the numbers of I, P, and B pictures that have not yet been encoded in the GOP, and let R_I, R_P, R_B be the code amounts allocated to them. Further, let Q_I, Q_P, Q_B be the respective quantization step sizes, and let m be the order relating the quantization step size to the reproduction error variance (that is, it is assumed that minimizing the average of the quantization step size raised to the m-th power minimizes the reproduction error variance). Then consider minimizing equation (15) under the fixed rate condition given by equation (14).
[0057]
[Expression 17]

[0058]
[Expression 18]

[0059]
The average quantization scale Q and the code amount R of each frame are related through the complexity X of each frame, an intermediate variable also used in Test Model 5, as shown in equation (16).
[0060]
[Equation 19]

[0061]
Taking the relationship of equation (16) into account, when the R_I, R_P, R_B that minimize equation (15) under the constraint of equation (14) are calculated by the method of Lagrange multipliers, the following values are obtained as the optimal R_I, R_P, R_B.
[0062]
[Expression 20]
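The equation images for (14) to (17) are not reproduced in this text. As a sketch of the Lagrange calculation only, assume that the fixed-rate constraint is N_I R_I + N_P R_P + N_B R_B = R, that the quantity to be minimized is N_I Q_I^m + N_P Q_P^m + N_B Q_B^m, and that the complexity relation takes the form Q R = X (the α = 1 case). These forms are assumptions inferred from the surrounding description and may differ from the original equations.

```latex
\begin{aligned}
L &= \sum_{t \in \{I,P,B\}} N_t \left( \frac{X_t}{R_t} \right)^{m}
     + \lambda \Bigl( \sum_{t \in \{I,P,B\}} N_t R_t - R \Bigr), \\
\frac{\partial L}{\partial R_t} &= -m\, N_t X_t^{m} R_t^{-(m+1)} + \lambda N_t = 0
     \;\Longrightarrow\; R_t \propto X_t^{m/(1+m)}, \\
R_t &= \frac{R\, X_t^{m/(1+m)}}{\sum_{s \in \{I,P,B\}} N_s X_s^{m/(1+m)}}, \qquad
\frac{Q_P}{Q_I} = \left( \frac{X_P}{X_I} \right)^{1/(1+m)}, \quad
\frac{Q_B}{Q_I} = \left( \frac{X_B}{X_I} \right)^{1/(1+m)}.
\end{aligned}
```

Under these assumptions the ratios of the quantization scales depend on the complexities only through the exponent 1/(1+m).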

[0063]
With α = 1, the relationship between equation (17) and equation (4) of the code amount control method defined in MPEG2 Test Model 5 can be stated as follows. Equation (17) corresponds to calculating the code amount control parameters K_p, K_b adaptively from the complexities X_I, X_P, X_B of the frames, as shown in equation (18).
[0064]
[Expression 21]

[0065]
Document 2 shows that a good image quality can be obtained by setting the value of 1 / (1 + m) to about 0.6 to 1.2.
[0066]
Suppose that, in the image information conversion apparatus shown in FIG. 6, the MPEG4 image information encoding unit (I/P-VOP) 4 performs code amount control using a method similar to that defined in MPEG2 Test Model 5. In this case, changes in complexity within the GOP caused by scene changes or the like cannot be handled, so stable code amount control becomes difficult and image quality may deteriorate. This problem is expected to be avoided if the information in the input MPEG2 image compression information (bitstream), extracted in the MPEG2 image information decoding unit (I/P picture) 2, is used in the MPEG4 image information encoding unit (I/P-VOP) 4.
[0067]
In order to solve such a problem, the applicant of the present application has previously proposed an image information conversion apparatus as shown in FIG.
[0068]
This image information conversion apparatus includes a picture type determination unit 7, a compression information analysis unit 8, an MPEG2 image information decoding unit (I/P picture) 9, a thinning unit 10, an MPEG4 image information encoding unit (I/P-VOP) 11, a motion vector synthesis unit 12, a motion vector detection unit 13, an information buffer 14, and a complexity calculation unit 15.
[0069]
The operation principle of this image information conversion apparatus is the same as that of the image information conversion apparatus shown in FIG. 6, except for the compression information analysis unit 8, the information buffer 14, the complexity calculation unit 15, and the code amount control in the MPEG4 image information encoding unit (I/P-VOP) 11. The operation principle of the compression information analysis unit 8, the information buffer 14, and the complexity calculation unit 15, and the code amount control in the MPEG4 image information encoding unit (I/P-VOP) 11, are described below.
[0070]
The compression information analysis unit 8 stores in the information buffer 14 the average value Q, over the entire frame, of the quantization scale used in the decoding process, and the total code amount (number of bits) B allocated to the frame in the input MPEG2 image compression information (bitstream).
[0071]
The complexity calculation unit 15 calculates the complexity X for the frame from the information Q and B for each frame stored in the information buffer 14 using Equation (19).
[0072]
[Expression 22]

[0073]
The complexity X for each frame calculated by equation (19) is buffered for one GOV (group of VOPs) and then transmitted to the MPEG4 image information encoding unit (I/P-VOP) 11 as a parameter for code amount control. This requires a delay of one GOV, which is realized using a delay buffer (not shown). Here, a GOV is a set of VOPs that can be randomly accessed.
[0074]
The following describes how the complexity X calculated for each frame in the GOV by equation (19) is used in the MPEG4 image information encoding unit (I/P-VOP) 11. The case considered is one in which the image information conversion apparatus has no picture type determination unit 7 and the frame rate is not converted.
[0075]
K_P, K_B obtained by equation (18) mean that, when the ideal average quantization scale for an I-VOP is Q_{i_ideal}, the ideal average quantization scales Q_{p_ideal}, Q_{b_ideal} for a P-VOP and a B-VOP are given by equation (20).
[0076]
[Expression 23]

[0077]
In MPEG2 Test Model 5, K_p, K_b are not calculated adaptively as in equation (18); the fixed values shown in equation (3) are used.
[0078]
From equations (18) and (20), if the complexities of a certain VOP1 and a certain VOP2 are X_1, X_2 and their ideal quantization scales are Q_{1_ideal}, Q_{2_ideal}, equation (21) is obtained.
[0079]
[Expression 24]

[0080]
Alternatively, when it is desired to use the fixed value shown in the equation (3) as in MPEG2 Test Model 5, the equation (22) may be used instead of the equation (21).
[0081]
[Expression 25]

[0082]
Now, let R be the total code amount (number of bits) allocated to the unencoded VOPs in the GOV, and assume that the image quality for the GOV is optimized when R is distributed to the VOPs as R_1, R_2, ..., R_n. Then the relation of equation (23) holds between R and R_1, R_2, ..., R_n.
[0083]
[Equation 26]

[0084]
Noting that the relation of equation (24) holds among the average quantization scale Q_k, the assigned code amount R_k, and the complexity X_k of a given VOP_k, modifying equation (23) gives equation (25).
[0085]
[Expression 27]

[0086]
[Expression 28]

[0087]
For K(X_1, X_2) in equation (25), either the value shown in equation (21) or the value shown in equation (22) may be used, but the former makes it possible to realize a code amount distribution better suited to the image. In that case, setting the value of 1/(1+m) to 1.0 removes the need for exponentiation and allows high-speed execution. Even when 1/(1+m) is set to a value other than 1.0, high-speed execution is possible by preparing a table in advance and performing the exponentiation by referring to this table.
[0088]
The complexity X_k for each VOP in equation (25) is that of MPEG4 image coding. However, if the complexity of each frame under MPEG2 image coding is assumed to be equal to the complexity of the corresponding frame under MPEG4 image coding, the target code amount for the VOP can be calculated by equation (25) using the X_k stored in the complexity calculation unit 15.
[0089]
FIG. 9 shows the flow for calculating the target code amount. In the first step S21, the compression information analysis unit 8 extracts, for each frame in the GOP, the average quantization scale Q used for the decoding process in the MPEG2 image information decoding unit 9 and the allocated code amount (number of bits) B.
[0090]
In step S22, the complexity calculation unit 15 calculates the complexity X given by the product of the average quantization scale Q and the assigned code amount (number of bits) B.
[0091]
In step S23, the MPEG4 image encoding unit (I / P-VOP) 11 calculates a target code amount (target bit) corresponding to the complexity X.
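Steps S21 to S23 can be sketched as follows. The complexity is the product of Q and B as stated in step S22. The distribution in S23 is only an illustration: it assumes that the relation among Q_k, R_k and X_k has the form Q_k R_k = X_k and that the ratio K(X_a, X_b) = Q_a / Q_b has the form (X_a / X_b) raised to the power 1/(1+m). The equation images are not reproduced in this text, so both forms, and all function names, are assumptions of this example.

```python
def frame_complexity(avg_qscale, bits):
    """Step S22: complexity X as the product of Q and B for one frame."""
    return avg_qscale * bits


def adaptive_ratio(x_a, x_b, exponent):
    """Assumed form of K(X_a, X_b) = Q_a / Q_b, with exponent = 1 / (1 + m)."""
    return (x_a / x_b) ** exponent


def vop_targets(total_bits, complexities, exponent):
    """Step S23 (illustrative): split total_bits over the unencoded VOPs.

    Derived from R = sum(R_j), Q_j * R_j = X_j and Q_j = K(X_j, X_k) * Q_k,
    which gives R_k = R * X_k / sum_j(X_j / K(X_j, X_k)).
    """
    targets = []
    for x_k in complexities:
        denom = sum(x_j / adaptive_ratio(x_j, x_k, exponent)
                    for x_j in complexities)
        targets.append(total_bits * x_k / denom)
    return targets


# Example: (Q, B) pairs extracted in step S21 for two frames of a GOP.
frames = [(8.0, 50.0), (5.0, 20.0)]
X = [frame_complexity(q, b) for q, b in frames]   # complexities 400 and 100
targets = vop_targets(300.0, X, exponent=0.5)     # bit targets per VOP
```

By construction the targets always sum to the total budget, so the per-VOP allocation can be recomputed after each VOP is encoded without drifting from the remaining bit amount R.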
[0092]
MPEG2 Test Model 5 assumes that the complexities X_i, X_p, X_b for I, P, and B pictures are constant within the GOP. However, this assumption does not hold when the GOP contains a scene change or when the background changes significantly within the GOP, which hinders stable code amount control and causes image quality deterioration. In the image information conversion apparatus shown in FIG. 8, even in such cases, code amount control is performed based on the complexity of each frame in the input MPEG2 image compression information (bitstream), so stable code amount control is possible without causing image quality deterioration.
[0093]
[Problems to be solved by the invention]
In the code amount control method shown in FIG. 7, for the adaptive quantization in step S13 to work effectively, the quantization scale code for the j-th macroblock, that is, Q_j in equation (8), should take a uniform value over the entire frame. FIG. 10 shows the values taken by Q_j for a certain VOP when MPEG2 image compression information (bitstream), obtained by compressing “Flower Garden”, one of the CCIR (Comité Consultatif International des Radiocommunications) test sequences, to 4 Mbps under the condition n = 15, m = 3, is converted to MPEG4 image compression information (bitstream) with n = 5, m = 1 using the image information conversion apparatus shown in FIG.
[0094]
As mentioned above, Q_j should ideally be uniform over the entire VOP. In practice, however, the virtual buffer occupancies (d_j^i, d_j^p, d_j^b in equation (7)) vary from macroblock to macroblock, so Q_j does not take a uniform value over the entire VOP.
[0095]
The present invention has been proposed in view of the above circumstances, and its object is to provide an image information conversion apparatus and method, and an encoding apparatus and method, that control the code amount while suppressing variation in Q_j.
[0096]
[Means for Solving the Problems]
In order to solve the above problems, the image information conversion apparatus according to the present invention converts interlaced-scan input image compression information compressed by a first compression coding method into progressive-scan output image compression information compressed by a second compression coding method. The encoded images constituting the input image compression information and the output image compression information are each composed of pixel blocks, each consisting of a plurality of pixels. The apparatus comprises: analyzing means for extracting, from the input image compression information, the average quantization scale and the assigned code amount of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being a group of pixel blocks for one horizontal row in the encoded image of the output image compression information; an information buffer for storing the average quantization scale and the assigned code amount detected by the analyzing means; target code amount calculating means for calculating a target code amount for the pseudo pixel block sequence; and encoding means for encoding image information into the output image compression information using the target code amount calculated by the target code amount calculating means. The analyzing means calculates the complexity for the pseudo pixel block sequence in the encoded image of the output image compression information, using the average quantization scale and the assigned code amount stored in the information buffer. The target code amount calculating means calculates the target code amount for each pseudo pixel block sequence constituting an encoded image of the output image compression information, using the target code amount for that encoded image and the complexity for each pseudo pixel block sequence.
In addition, the encoding apparatus according to the present invention encodes image information, obtained by decoding interlaced-scan input image compression information compressed by a first compression coding method, into progressive-scan output image compression information compressed by a second compression coding method. The encoded images constituting the input image compression information and the output image compression information are each composed of pixel blocks, each consisting of a plurality of pixels. The apparatus comprises: receiving means for receiving the average quantization scale and the assigned code amount, extracted from the input image compression information, of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being a group of pixel blocks for one horizontal row in the encoded image of the output image compression information; an information buffer for storing the average quantization scale and the assigned code amount from the receiving means; target code amount calculating means for calculating a target code amount for the pseudo pixel block sequence; and encoding means for encoding the image information into the output image compression information using the target code amount calculated by the target code amount calculating means. The receiving means receives the complexity for the pseudo pixel block sequence in the encoded image of the output image compression information, calculated using the average quantization scale and the assigned code amount stored in the information buffer. The target code amount calculating means calculates the target code amount for each pseudo pixel block sequence constituting an encoded image of the output image compression information, using the target code amount for that encoded image and the complexity for each pseudo pixel block sequence.
[0097]
In the present invention, interlaced-scan MPEG2 image compression information (bitstream) is used as the input image compression information, and progressive-scan MPEG4 image compression information (bitstream) is used as the output image compression information. The MPEG2 image compression information (bitstream) and the MPEG4 image compression information are composed of pixel blocks each consisting of a plurality of pixels, that is, macroblocks, and the invention uses pseudo pixel block sequences, that is, pseudo slices.
[0098]
In addition, MPEG2 image compression information (bit stream) and MPEG4 image compression information (bit stream) are composed of image groups, that is, GOP (group of pictures) and GOV (group of VOPs), respectively. The GOP and GOV that are image groups are each composed of a plurality of encoded images, that is, a picture and a VOP (video object plane).
[0099]
That is, the present invention takes interlaced-scan MPEG2 image compression information (bitstream) as input and comprises a picture type determination unit, a compression information analysis unit, an MPEG2 image information decoding unit (I/P picture), a thinning unit, an MPEG4 image information encoding unit (I/P-VOP), a motion vector synthesis unit, a motion vector detection unit, an information buffer, a VOP complexity calculation unit, a pseudo slice complexity calculation unit, a VOP target code amount calculation unit, and a pseudo slice target code amount calculation unit. Using the complexity information for each slice in the input MPEG2 image compression information (bitstream), it calculates a target code amount (target bits) in units of pseudo slices for MPEG4 image encoding. This minimizes the variation of the reference quantization scale in step 2 of the code amount control, optimizes the code amount allocation to each macroblock for the image, and provides means for outputting progressive-scan MPEG4 image compression information (bitstream).
[0100]
In the above configuration, the picture type determination unit passes only the parts of the input MPEG2 image compression information (bitstream) that relate to I/P pictures and discards the parts that relate to B pictures. The compression information analysis unit realizes a delay of one GOP and, at the same time, extracts from the input MPEG2 image compression information (bitstream) the code amount (number of bits) allocated to each frame and the average quantization scale of each frame, as well as, within each frame, the average quantization scale and the generated code amount (number of bits) over all the macroblocks that constitute each pseudo slice of the subsequent MPEG4 image compression information. The MPEG2 image information decoding unit (I/P picture) decodes the compression information (bitstream) relating to I/P pictures output from the picture type determination unit, using either all of the eighth-order discrete cosine transform coefficients or only their low-frequency components, in both the horizontal and vertical directions. The thinning unit extracts only the first field or the second field of the image information output from the MPEG2 image information decoding unit (I/P picture) to convert it into a progressive-scan image, and at the same time downsamples it to the desired image frame size. The MPEG4 image information encoding unit (I/P-VOP) encodes the image information output from the thinning unit using the MPEG4 encoding method. The motion vector synthesis unit maps the motion vector values in the input image compression information (bitstream), detected by the MPEG2 image information decoding unit (I/P picture), to motion vector values for the image data after scan conversion. The motion vector detection unit performs high-precision motion vector detection based on the motion vector values output from the motion vector synthesis unit.
The information buffer stores the following items extracted by the compression information analysis unit: the code amount (number of bits) allocated to each frame in the input MPEG2 image compression information (bitstream), the average quantization scale of each frame, and, within each frame, the average quantization scale and the generated code amount (number of bits) over all the macroblocks that constitute each pseudo slice of the subsequent MPEG4 image compression information. The VOP complexity calculation unit calculates an estimate of the complexity of each VOP in the output MPEG4 image compression information (bitstream) from the code amount (number of bits) allocated to each frame and the average quantization scale of each frame in the input MPEG2 image compression information (bitstream) stored in the information buffer. The pseudo slice complexity calculation unit calculates an estimate of the complexity of each pseudo slice in the output MPEG4 image compression information (bitstream) from the code amount (number of bits) allocated to each pseudo slice and the average quantization scale of each pseudo slice in the input MPEG2 image compression information (bitstream) stored in the information buffer. The VOP target code amount calculation unit calculates a target code amount (target bits) for each VOP based on the complexity of each VOP calculated by the VOP complexity calculation unit. The pseudo slice target code amount calculation unit calculates a target code amount (target bits) for each pseudo slice from the target code amount (target bits) for each VOP calculated by the VOP target code amount calculation unit and the complexity of each pseudo slice calculated by the pseudo slice complexity calculation unit, and transmits this information to the MPEG4 image information encoding unit (I/P-VOP).
[0101]
An apparatus configuration that has no picture type determination unit and performs no frame rate conversion is also possible. A configuration is also conceivable in which the VOP target code amount calculation unit does not use the complexity of each frame in the input MPEG2 image compression information (bitstream) and instead calculates the target code amount for each VOP by the same method as defined in MPEG2 Test Model 5.
[0102]
[Detailed Description of the Invention]
Embodiments of the present invention will be described below with reference to the drawings.
[0103]
First, an image information conversion apparatus according to a first embodiment to which the present invention is applied will be described with reference to FIG.
[0104]
This image information conversion apparatus includes a picture type determination unit 16, a compression information analysis unit 17, an MPEG2 image information decoding unit (I/P picture) 18, a thinning unit 19, an MPEG4 image information encoding unit (I/P-VOP) 20, a motion vector synthesis unit 21, a motion vector detection unit 22, an information buffer 23, a VOP complexity calculation unit 24, a pseudo slice complexity calculation unit 25, a VOP target code amount calculation unit, and a pseudo slice target code amount calculation unit 27.
[0105]
The input to this image information conversion apparatus is interlaced-scan MPEG2 image compression information (bitstream) composed of intra-coded pictures (I pictures; I) encoded within a frame, forward predictive-coded pictures (P pictures; P) predictively encoded with reference to the forward direction in display order, and bidirectionally predictive-coded pictures (B pictures; B) predictively encoded with reference to the forward and backward directions in display order.
[0106]
The picture type determination unit 16 determines whether each part of the MPEG2 image compression information (bitstream) relates to an I/P picture or a B picture. Only the I/P pictures are output to the subsequent compression information analysis unit 17, and the B pictures are discarded.
[0107]
The compression information analysis unit 17 stores in the information buffer 23 the average value Q, over the entire frame, of the quantization scale used for the decoding process in the MPEG2 image information decoding unit (I/P picture) 18, and the total code amount (number of bits) B allocated to the frame in the input MPEG2 image compression information (bitstream).
[0108]
The MPEG2 image information decoding unit (I/P picture) 18 decodes the MPEG2 image compression information (bitstream) into an image signal in the same manner as an ordinary MPEG2 image information decoding apparatus. Since the data relating to B pictures has been discarded in the picture type determination unit 16, the MPEG2 image information decoding unit (I/P picture) 18 only needs to be able to decode I/P pictures.
[0109]
The pixel values output from the MPEG2 image information decoding unit (I/P picture) 18 are input to the thinning unit 19. The thinning unit 19 decimates by 1/2 in the horizontal direction and, in the vertical direction, keeps only the data of either the first field or the second field and discards the other. This decimation produces a progressive-scan image one quarter the size of the input image information.
[0110]
For the MPEG4 image information encoding unit (I/P-VOP) 20 to encode the image output from the thinning unit 19 in units of macroblocks of 16 × 16 pixels, the number of pixels must be a multiple of 16 in both the horizontal and vertical directions. The thinning unit 19 therefore pads or discards pixels at the same time as the thinning.
[0111]
For example, if the input MPEG2 image compression information (bitstream) conforms to the NTSC (National Television System Committee) standard, that is, a 720 × 480 pixel, 30 Hz interlaced-scan image, the image frame after thinning is SIF size (360 × 240 pixels). For this image, the thinning unit 19 discards, for example, 8 pixel columns at the right or left edge in the horizontal direction to obtain 352 × 240 pixels.
[0112]
By changing the operation of the thinning unit 19, conversion to other image frames is also possible. In the above example, for instance, the image can be converted to QSIF size (176 × 112 pixels), an image frame of about 1/4 × 1/4.
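The frame sizes quoted above can be reproduced with a short calculation. The sketch below assumes that the thinning unit discards surplus pixels rather than padding them, which is one of the two options the text allows. The function name and parameters are hypothetical.

```python
def thinned_frame_size(width, height, h_factor=2, v_factor=2, mb=16):
    """Frame size after decimation, cropped down to whole macroblocks.

    h_factor, v_factor -- horizontal and vertical decimation factors
    mb                 -- macroblock edge length in pixels
    """
    w = width // h_factor
    h = height // v_factor
    # Discard the pixels that do not fill a complete macroblock.
    return (w - w % mb, h - h % mb)


sif = thinned_frame_size(720, 480)          # 1/2 x 1/2 of the NTSC frame
qsif = thinned_frame_size(720, 480, 4, 4)   # about 1/4 x 1/4
```

For the NTSC input this gives 352 × 240 for the half-size frame and 176 × 112 for the quarter-size frame, matching the SIF and QSIF figures in the text.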
[0113]
Furthermore, Document 1 describes an image information conversion apparatus in which, as the processing in the MPEG2 image information decoding unit (I/P picture) 18, decoding is performed using all of the eighth-order discrete cosine transform coefficients in the input MPEG2 image compression information (bitstream) in both the horizontal and vertical directions. The apparatus shown in FIG. 1 is not limited to this. By decoding with only the low-frequency components of the eighth-order discrete cosine transform coefficients, in the horizontal direction only or in both the horizontal and vertical directions, the amount of calculation and the video memory capacity required for decoding can be reduced while keeping image quality degradation to a minimum.
[0114]
The progressive-scan image generated by the thinning unit 19 is encoded by the MPEG4 image information encoding unit (I/P-VOP) 20 into I-VOPs, which are encoded within the frame, and P-VOPs, which are predictively encoded with reference to the forward direction in display order, and is output as MPEG4 image compression information (bitstream).
[0115]
VOP stands for Video Object Plane and corresponds to a frame in MPEG2. An I-VOP is an intra-coded VOP corresponding to an I picture, a P-VOP is a forward predictive-coded VOP corresponding to a P picture, and a B-VOP is a bidirectionally predictive-coded VOP corresponding to a B picture.
[0116]
When encoding is performed in the MPEG4 image information encoding unit (I/P-VOP) 20, the motion vector synthesis unit 21 maps the motion vector information in the input MPEG2 image compression information (bitstream) to motion vectors for the image information after decimation, and the motion vector detection unit 22 detects high-precision motion vectors based on the motion vector values synthesized by the motion vector synthesis unit 21.
[0117]
Here, the concept of the pseudo slice used in the image information conversion apparatus of the present embodiment will be described.
[0118]
The MPEG2 image compression information (bitstream) has a slice layer. As shown in FIG. 2, a slice is a horizontally long band-shaped region of the screen, and the screen is composed of a plurality of slices (regions a, b, c, ... in FIG. 2). Even if an error occurs in one slice layer, it can be recovered by resynchronizing at the start code (slice_start_code) of the next slice layer. A slice consists of one or more macroblocks arranged in raster scan order, from left to right and from top to bottom. Its length and start position are arbitrary and can be changed for each screen. However, for the purposes of parallel processing and effective error resilience, one slice extends only to the right and does not continue downward.
[0119]
In MPEG4 image compression information (bitstream), a slice layer like the one shown in FIG. 2 is not syntactically defined, in consideration of coding efficiency at low bit rates. In this embodiment, however, a pseudo slice as shown in FIG. 3 is defined for the encoding process in the MPEG4 image information encoding unit (I / P-VOP) 20. Consider, for example, the case where the interlaced MPEG2 image compression information (bitstream) input to the image information conversion apparatus shown in FIG. 1 is converted into progressive-scan MPEG4 image compression information (bitstream) having a 1/2 × 1/2 image frame. In this case, the four macroblocks MB_0, MB_1, MB_{2m}, MB_{2m+1} in the input MPEG2 image compression information (bitstream), shown in part A of the figure, correspond to the single macroblock Mb_0 in the output MPEG4 image compression information (bitstream), shown in part B. The region in part B composed of the macroblocks Mb_0, Mb_1, ..., Mb_{m-1} of the output MPEG4 image compression information (bitstream), which corresponds to the region a composed of MB_0, MB_1, ..., MB_{4m-1} in part A, is defined as pseudo slice 0. The same applies to pseudo slice 1 and subsequent pseudo slices.
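As a rough illustration of the correspondence described above, the following Python sketch maps an input MPEG2 macroblock index to the output MPEG4 macroblock and the pseudo slice that contain it. It assumes raster-order numbering, an input picture 2m macroblocks wide, and an output picture m macroblocks wide. The function names are hypothetical and do not appear in the specification.

```python
def output_macroblock(n: int, m: int) -> int:
    """Index of the output macroblock Mb that covers input macroblock MB_n."""
    row, col = divmod(n, 2 * m)       # position in the 2m-wide input picture
    return (row // 2) * m + col // 2  # each 2x2 group of input macroblocks -> 1 output macroblock


def pseudo_slice(n: int, m: int) -> int:
    """Index of the pseudo slice (one output macroblock row) that contains MB_n."""
    return n // (4 * m)               # each pseudo slice spans two input rows of 2m macroblocks
```

With m = 3, for instance, MB_0, MB_1, MB_6, and MB_7 all map to Mb_0, and MB_0 through MB_11 all fall in pseudo slice 0.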
[0120]
The compression information analysis unit 17 extracts the allocated code amount (number of bits) B_k for each frame in the input MPEG2 image compression information (bitstream) and the average quantization scale Q_k of each frame, and stores them in the information buffer 23. At the same time, for each frame, it stores in the information buffer 23 the allocated code amount (number of bits) B_{pseudo_slice_l} and the average quantization scale Q_{pseudo_slice_l} for the macroblock group corresponding to the l-th pseudo slice of the output MPEG4 image compression information (bitstream). In FIG. 3A, letting n be an integer and denoting the generated code amount and the quantization scale of a macroblock MB_n in the input image compression information (bitstream) by B_{MB_n} and Q_{MB_n}, the following equation (26) holds for pseudo slice 0.
[0121]
[Expression 29]
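The equation image is not reproduced in this text. Given that pseudo slice 0 covers the input macroblocks MB_0 through MB_{4m-1}, equation (26) presumably sums the generated code amounts and averages the quantization scales over those macroblocks. The form below is a reconstruction under that assumption, not the published formula.

```latex
B_{\mathrm{pseudo\_slice}0} = \sum_{n=0}^{4m-1} B_{MB_n},
\qquad
Q_{\mathrm{pseudo\_slice}0} = \frac{1}{4m} \sum_{n=0}^{4m-1} Q_{MB_n}
```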

[0122]
The same applies to pseudo slice 1 and subsequent pseudo slices. However, when m is odd, for example when the image frame of the input interlaced MPEG2 image compression information (bitstream) is 720 × 480 pixels, the 1/2 × 1/2 frame is 360 × 240 pixels, but to allow macroblock-unit processing in the subsequent MPEG4 image information encoding unit (I / P-VOP) 20, the thinning unit 19 must discard, for example, the rightmost eight pixels of the image frame to obtain 352 × 240 pixels. In this case, B_{pseudo_slice0} and Q_{pseudo_slice0} given by equation (26) may instead be obtained as in the following equation (27).
[0123]
[Expression 30]
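The equation image for (27) is not reproduced in this text. One plausible reading of the cropped case is that input macroblock columns with no counterpart in the cropped output are left out of the sum and the average. The Python sketch below aggregates per-macroblock statistics on that assumption. The function name, the argument layout, and the column-dropping rule are all illustrative assumptions rather than the specification's method.

```python
def pseudo_slice_stats(bits, qscale, out_cols):
    """Aggregate per-macroblock statistics into per-pseudo-slice (B, Q) pairs.

    bits, qscale: 2-D lists indexed [row][col] over the input macroblocks.
    out_cols: width of the output picture in macroblocks, after cropping.
    """
    used_cols = 2 * out_cols  # input columns that have an output counterpart (assumption)
    stats = []
    for top in range(0, len(bits) - 1, 2):  # two input macroblock rows per pseudo slice
        b = [bits[r][c] for r in (top, top + 1) for c in range(used_cols)]
        q = [qscale[r][c] for r in (top, top + 1) for c in range(used_cols)]
        stats.append((sum(b), sum(q) / len(q)))  # total bits, average quantization scale
    return stats
```

For a 720 × 480 input (45 macroblock columns) and a 352 × 240 output (22 columns), this would use the first 44 input columns and ignore the last one.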

[0124]
The VOP complexity calculation unit 24 calculates the complexity X_k of each VOP for one GOV, as shown in the following equation (28), from the allocated code amount (number of bits) B_k for each frame in the input MPEG2 image compression information (bitstream) and the average quantization scale Q_k of each frame, both stored in the information buffer 23.
[0125]
[Expression 31]
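The equation image is not reproduced in this text. Assuming the complexity takes the same product form as the global complexity measure of MPEG2 Test Model 5 (bits generated multiplied by the average quantization scale), equation (28) would read as follows. This is a reconstruction, not the published formula.

```latex
X_k = B_k \cdot Q_k
```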

[0126]
The VOP target code amount calculation unit 26 uses the complexity obtained by the VOP complexity calculation unit 24 to calculate a target code amount (target bits) for each VOP using equation (25). In the following, the target code amount for each VOP obtained by equation (25) (R_1 in the equation) is denoted T_vop.
[0127]
The pseudo slice complexity calculation unit 25 calculates the complexity X_{pseudo_slice_l} of each pseudo slice, as in the following equation (29), from the allocated code amount (number of bits) B_{pseudo_slice_l} and the average quantization scale Q_{pseudo_slice_l}, both stored in the information buffer 23, for the macroblock group corresponding to the l-th pseudo slice of the output MPEG4 image compression information (bitstream).
[0128]
[Expression 32]

[0129]
Now, assuming that the VOP is composed of pseudo slice 0, pseudo slice 1, ..., pseudo slice N-1, the pseudo slice target code amount calculation unit 27 calculates the target code amount T_{pseudo_slice_l} for the l-th pseudo slice as in the following equation (30) and transmits it to the MPEG4 image information encoding unit (I / P-VOP) 20.
[0130]
[Expression 33]
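The equation images for (29) and (30) are not reproduced in this text. The Python sketch below shows one way the allocation could work, assuming that the complexity of a pseudo slice is the product of its code amount and its average quantization scale, and that the VOP target T_vop is divided among the pseudo slices in proportion to their complexities. Both forms are assumptions inferred from the surrounding definitions, and the function name is hypothetical.

```python
def pseudo_slice_targets(t_vop, stats):
    """Split the VOP target t_vop across pseudo slices in proportion to complexity.

    stats: list of (B, Q) pairs, one per pseudo slice.
    """
    complexity = [b * q for b, q in stats]          # assumed form of equation (29)
    total = sum(complexity)
    return [t_vop * x / total for x in complexity]  # assumed form of equation (30)
```

The per-slice targets sum to T_vop by construction. The sketch does not guard against a total complexity of zero, which a practical implementation would have to handle.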

[0131]
Let T_{pseudo_slice_l} for an I-VOP, a P-VOP, and a B-VOP be denoted T_{i_pseudo_slice_l}, T_{p_pseudo_slice_l}, and T_{b_pseudo_slice_l}, respectively. The MPEG4 image information encoding unit (I / P-VOP) 20 then calculates the virtual buffer occupancies (d_j^i, d_j^p, d_j^b) according to the following equation (31), which corresponds to equation (27).
[0132]
[Expression 34]

[0133]
Here, d_0^i, d_0^p, and d_0^b are the occupancies of the respective virtual buffers at the head of the pseudo slice, and B_{pseudo_slice_j} is the code amount generated from the head of the pseudo slice up to the j-th macroblock. P_SLICE_CNT is the number of macroblocks contained in one pseudo slice, and the virtual buffer occupancies at the end of encoding each pseudo slice (d_{P_SLICE_CNT}^i, d_{P_SLICE_CNT}^p, d_{P_SLICE_CNT}^b) are used as the initial values (d_0^i, d_0^p, d_0^b) of the virtual buffer occupancies for the next pseudo slice.
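The equation image for (31) is not reproduced in this text. The Python sketch below models the update on the virtual buffer rule of MPEG2 Test Model 5, with the pseudo slice taking the place of the picture: occupancy equals the initial occupancy, plus the bits generated so far, minus the share of the target budgeted so far. That form, the function name, and the treatment of the per-macroblock bit counts as given inputs are assumptions. In a real encoder the bit counts depend on the quantizer chosen from the occupancy.

```python
def encode_pseudo_slice(d0, target, mb_bits):
    """Track virtual buffer occupancy across one pseudo slice.

    d0: occupancy at the head of the pseudo slice.
    target: target code amount for this pseudo slice.
    mb_bits: bits generated by each macroblock, in coding order.
    Returns the occupancy seen before each macroblock and the final occupancy.
    """
    count = len(mb_bits)  # plays the role of P_SLICE_CNT
    generated = 0
    occupancy = []
    for j, bits in enumerate(mb_bits):
        # bits spent so far minus the share of the target budgeted so far
        occupancy.append(d0 + generated - target * j / count)
        generated += bits
    final = d0 + generated - target
    return occupancy, final
```

The returned final occupancy would be passed as d0 when the next pseudo slice is encoded, which mirrors the carry-over described above.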
[0134]
The operation principle of a series of code amount control in the MPEG4 image information encoding unit (I / P-VOP) 20 will be described with reference to FIG.
[0135]
In the first step S31, the compression information analysis unit 17 analyzes the syntax of the MPEG2 image compression information (bitstream) input via the picture type determination unit 16, extracts the allocated code amount (number of bits) B_k for each frame and the average quantization scale Q_k of each frame, and stores them in the information buffer 23. At the same time, for each frame, it stores in the information buffer 23 the allocated code amount (number of bits) B_{pseudo_slice_l} and the average quantization scale Q_{pseudo_slice_l} for the macroblock group corresponding to the l-th pseudo slice of the output MPEG4 image compression information (bitstream).
[0136]
In step S32, the VOP complexity calculation unit 24 calculates the complexity X_k of each VOP for one GOV from the allocated code amount (number of bits) B_k for each frame in the input MPEG2 image compression information (bitstream) and the average quantization scale Q_k of each frame, both stored in the information buffer 23.
[0137]
In step S33, the VOP target code amount calculation unit 26 calculates a target code amount (target bit) for each VOP.
[0138]
In step S34, the pseudo slice complexity calculation unit 25 calculates the complexity X_{pseudo_slice_l} of each pseudo slice from the allocated code amount (number of bits) B_{pseudo_slice_l} and the average quantization scale Q_{pseudo_slice_l}, both stored in the information buffer 23, for the macroblock group corresponding to the l-th pseudo slice of the output MPEG4 image compression information (bitstream).
[0139]
In step S35, the pseudo slice target code amount calculation unit 27 calculates a target code amount (target bits) for each pseudo slice and transmits it to the MPEG4 image information encoding unit (I / P-VOP) 20.
[0140]
In step S36, the MPEG4 image information encoding unit (I / P-VOP) 20 performs rate control using a virtual buffer. In step S37, the MPEG4 image information encoding unit (I / P-VOP) 20 performs adaptive quantization for each macroblock in consideration of visual characteristics.
[0141]
Next, an image information conversion apparatus according to a second embodiment to which the present invention is applied will be described with reference to FIG.
[0142]
This image information conversion apparatus includes a picture type determination unit 28, a compression information analysis unit 29, an MPEG2 image information decoding unit (I / P picture) 30, a thinning unit 31, an MPEG4 image information encoding unit (I / P-VOP) 32, a motion vector synthesis unit 33, a motion vector detection unit 34, an information buffer 35, a pseudo slice complexity calculation unit 36, a VOP target code amount calculation unit 37, and a pseudo slice target code amount calculation unit 38.
[0143]
The image information conversion apparatus shown in FIG. 5 differs from the one shown in FIG. 1 in the following respect. In the apparatus of FIG. 1, the target code amount (target bits) for each VOP used in the MPEG4 image information encoding unit (I / P-VOP) 20 is calculated by equation (25), whereas in the apparatus of FIG. 5, the target code amount (target bits) for each VOP used in the MPEG4 image information encoding unit (I / P-VOP) 32 is calculated by equation (4). That is, in the apparatus of FIG. 5, the compression information analysis unit 29 extracts the GOP structure of the input MPEG2 image compression information (bitstream) and stores it in the information buffer 35. The VOP target code amount calculation unit 37 determines the GOV structure of the output MPEG4 image compression information (bitstream) and calculates the target code amount (target bits) for each VOP based on equation (4).
[0144]
In the above description, MPEG2 image compression information (bitstream) is the input and MPEG4 image compression information (bitstream) is the output. However, the input and output are not limited to these and may be other image compression information (bitstream) such as H.263.
[0145]
[Effect of the invention]
As described above, the present invention takes interlaced MPEG2 image compression information (bitstream) as input and, using the complexity information for each slice in the input MPEG2 image compression information (bitstream), gives a target code amount (target bits) in units of pseudo slices at the time of MPEG4 image encoding. This minimizes the fluctuation of the reference quantization scale associated with the complexity calculation step in code amount control. The present invention thus provides means for converting the input into progressive-scan MPEG4 image compression information (bitstream) and outputting it with the code amount allocation for each macroblock optimized for the image.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of an image information conversion apparatus according to a first embodiment.
FIG. 2 is a diagram for explaining a concept of a slice layer in MPEG2 image compression information (bit stream).
FIG. 3 is a diagram showing a concept of a pseudo slice in MPEG4 image compression information (bit stream).
FIG. 4 is a diagram illustrating an operation flow for performing code amount control using complexity.
FIG. 5 is a block diagram illustrating a configuration of an image information conversion apparatus according to a second embodiment.
FIG. 6 is a block diagram illustrating a configuration of a conventional image information conversion apparatus.
FIG. 7 is a flowchart showing the operation principle of the code amount control method described in MPEG2 Test Model 5 (ISO / IEC JTC1 / SC29 / WG11 N0400).
FIG. 8 is a diagram showing a configuration of an image information conversion apparatus proposed by the present applicant.
FIG. 9 is a flowchart showing an operation of code amount control in the image information conversion apparatus of FIG. 8;
FIG. 10 is a diagram showing the values taken by Q_j for a certain VOP when MPEG2 image compression information (bitstream), obtained by compressing "Flower Garden", one of the CCIR test sequences, to 4 Mbps under the condition n = 15; m = 3, is converted into MPEG4 image compression information (bitstream) of n = 5; m = 1 using the image information conversion apparatus shown in FIG.
[Explanation of symbols]
16 picture type determination unit, 17 compression information analysis unit, 18 MPEG2 image information decoding unit (I / P picture), 19 thinning unit, 20 MPEG4 image information encoding unit (I / P-VOP), 21 motion vector synthesis unit, 22 motion vector detection unit, 23 information buffer, 24 VOP complexity calculation unit, 25 pseudo slice complexity calculation unit, 26 VOP target code amount calculation unit, 27 pseudo slice target code amount calculation unit

Claims

In an image information conversion apparatus for converting input image compression information of interlaced scanning compressed by a first compression encoding method into output image compression information of sequential scanning compressed by a second compression encoding method,
The encoded image constituting the input image compression information and the output image compression information is composed of pixel blocks each consisting of a plurality of pixels,
Analysis means for extracting, from the input image compression information, the average quantization scale and the allocated code amount of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being composed of the pixel block group for one horizontal row in the encoded image of the output image compression information;
An information buffer for storing an average quantization scale and an assigned code amount detected by the analysis means;
Target code amount calculating means for calculating a target code amount for the pseudo pixel block sequence;
Encoding means for encoding image information into the output image compression information using the target code amount calculated by the target code amount calculation means;
The analysis means calculates the complexity for the pseudo pixel block sequence in the output image compression information and the encoded image using the average quantization scale and the allocated code amount stored in the information buffer,
The target code amount calculation means calculates, using the target code amount for the encoded image constituting the output image compression information and the complexity for each pseudo pixel block sequence, the target code amount for the pseudo pixel block sequences constituting that encoded image.

The image information conversion apparatus according to claim 1, wherein the analysis means uses the average quantization scale and the assigned code amount stored in the information buffer to calculate the complexity for the k-th pseudo pixel block sequence in the output image compression information and the encoded image by the following equation.

Here, for the k-th pseudo pixel block sequence, the average quantization scale is Q_{pseudo_slice_k}, the assigned code amount is B_{pseudo_slice_k}, and the complexity is X_{pseudo_slice_k}.

The image information conversion apparatus according to claim 2, wherein the target code amount calculation means uses the target code amount for the encoded image constituting the output image compression information and the complexity for each pseudo pixel block sequence to calculate the target code amount for the k-th pseudo pixel block sequence constituting that encoded image by the following equation.

Here, the target code amount for the encoded image is T_vop, the complexity for the l-th pseudo pixel block sequence is X_{pseudo_slice_l}, and the target code amount for the k-th pseudo pixel block sequence is T_{pseudo_slice_k}.

The image information conversion apparatus according to claim 3, wherein the input image compression information is composed of intra-coded images encoded within a frame, forward-predicted coded images encoded with reference to the forward direction in display order, and bi-directionally predicted coded images encoded with reference to the forward and reverse directions in display order, and is composed of image groups each consisting of a plurality of encoded images; the analysis means analyzes the structure of the image groups constituting the input image compression information and extracts the complexity for each frame constituting the input image compression information; and the target code amount calculating means uses that complexity to calculate the target code amount for the encoded image of the output image compression information.

The output image compression information is composed of intra-coded images encoded within a frame, forward-predicted coded images encoded with reference to the forward direction in display order, and bi-directionally predicted coded images encoded with reference to the forward and reverse directions in display order, and the occupancy of the virtual buffer at the j-th pixel block constituting each type of encoded image is given by the following equation: the image information conversion apparatus described.

Here, the target code amounts T_{pseudo_slice_l} of the l-th pseudo pixel block sequence for the intra-coded image, the forward-predicted coded image, and the bi-directionally predicted coded image are T_{i_pseudo_slice_l}, T_{p_pseudo_slice_l}, and T_{b_pseudo_slice_l}, respectively; the virtual buffer occupancies at the j-th pixel block are d_j^i, d_j^p, d_j^b; and the initial values of the virtual buffer occupancies are d_0^i, d_0^p, d_0^b. In addition, the code amount (number of bits) generated from the beginning of the pseudo pixel block sequence to the j-th macroblock is B_{pseudo_slice_j}, and the number of pixel blocks constituting the pseudo pixel block sequence is P_SLICE_CNT.

The image information conversion apparatus according to claim 5, wherein the conversion means uses the occupancy of the virtual buffer at the end of the processing of each pseudo pixel block sequence as the initial value of the occupancy of the virtual buffer for the next pseudo pixel block sequence.

The input image compression information is composed of intra-coded images encoded within a frame, forward-predicted coded images encoded with reference to the forward direction in display order, and bi-directionally predicted coded images encoded with reference to the forward and reverse directions in display order, and the apparatus further comprises a determination unit that passes the intra-coded images and the forward-predicted coded images but discards the bi-directionally predicted coded images: the image information conversion apparatus described.

The image information conversion apparatus according to claim 1, wherein the first compression encoding method is MPEG2, and the second compression encoding method is MPEG4.

In an image information conversion method for converting input image compression information of interlaced scanning compressed by a first compression encoding method into output image compression information of sequential scanning compressed by a second compression encoding method,
The encoded image constituting the input image compression information and the output image compression information is composed of pixel blocks each consisting of a plurality of pixels,
An analysis step of extracting, from the input image compression information, the average quantization scale and the allocated code amount of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being composed of the pixel block group for one horizontal row in the encoded image of the output image compression information;
A step of storing, in an information buffer, the average quantization scale and the assigned code amount extracted in the analysis step;
Calculating a target code amount for a pseudo pixel block sequence composed of pixel blocks in the encoded image of the output image compression information;
Using the target code amount, and encoding the image information into the output image compression information,
In the analysis step, using the average quantization scale and the allocated code amount stored in the information buffer, the complexity for the pseudo pixel block sequence in the output image compression information and the encoded image is calculated,
In the target code amount calculation step, the target code amount for the pseudo pixel block sequences constituting the encoded image of the output image compression information is calculated using the target code amount for that encoded image and the complexity for each pseudo pixel block sequence.

The image information obtained by decoding the interlaced scan input image compression information compressed by the first compression encoding method is encoded into the progressive scan output image compression information compressed by the second compression encoding method. In the encoding device,
The encoded image constituting the input image compression information and the output image compression information is composed of pixel blocks each consisting of a plurality of pixels,
Receiving means for receiving the average quantization scale and the assigned code amount, extracted from the input image compression information, of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being composed of the pixel block group for one horizontal row in the encoded image of the output image compression information;
An information buffer for storing the average quantization scale and the assigned code amount from the receiving means;
Target code amount calculating means for calculating a target code amount for the pseudo pixel block sequence;
Encoding means for encoding image information into the output image compression information using the target code amount calculated by the target code amount calculation means;
The receiving means receives the output image compression information calculated using the average quantization scale and the assigned code amount stored in the information buffer and the complexity for the pseudo pixel block sequence in the encoded image,
The target code amount calculation means calculates, using the target code amount for the encoded image constituting the output image compression information and the complexity for each pseudo pixel block sequence, the target code amount for the pseudo pixel block sequences constituting that encoded image.

The image information obtained by decoding the interlaced scan input image compression information compressed by the first compression encoding method is encoded into the progressive scan output image compression information compressed by the second compression encoding method. In the encoding method,
The encoded image constituting the input image compression information and the output image compression information is composed of pixel blocks each consisting of a plurality of pixels,
A receiving step of receiving the average quantization scale and the assigned code amount, extracted from the input image compression information, of the pixel block group in the encoded image of the input image compression information that corresponds to the pixel blocks constituting a pseudo pixel block sequence, the pseudo pixel block sequence being composed of the pixel block group for one horizontal row in the encoded image of the output image compression information;
A step of storing, in an information buffer, the average quantization scale and the assigned code amount received in the receiving step;
A target code amount calculating step of calculating a target code amount for the pseudo pixel block sequence;
Using the target code amount calculated in the target code amount calculation step, and encoding the image information into the output image compression information,
In the receiving step, the output image compression information calculated using the average quantization scale and the assigned code amount stored in the information buffer and the complexity for the pseudo pixel block sequence in the encoded image are received,
In the target code amount calculation step, the target code amount for the pseudo pixel block sequences constituting the encoded image of the output image compression information is calculated using the target code amount for that encoded image and the complexity for each pseudo pixel block sequence.