JP3918510B2 - Moving picture coding apparatus and moving picture coding method

Info

Publication number: JP3918510B2
Application number: JP2001334295A
Authority: JP
Inventors: Kenji Sugiyama (杉山 賢二)
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2001-10-31
Filing date: 2001-10-31
Publication date: 2007-05-23
Anticipated expiration: 2021-10-31
Also published as: JP2003143608A

Description

【０００１】
【発明の属する技術分野】
本発明は動画像符号化装置及び動画像符号化方法に係り、特に画像を効率的に伝送、蓄積、表示するために、画像情報をより少ない符号量でディジタル信号にする高能率符号化で、ピクチャ内独立、片方向予測、双方向予測の３種類の符号化手法を用いる動画像符号化において、符号化効率の低下を抑えながら符号列の編集を容易にする動画像符号化装置及び動画像符号化方法に関する。
【０００２】
【従来の技術】
従来より、動画像の高能率圧縮符号化方式としてＭＰＥＧ（Moving Picture Experts Group）方式が知られている。このＭＰＥＧ方式では、画像間予測の方法により３種類のピクチャタイプを持つ。Ｉピクチャと呼ばれるピクチャ内独立符号化ピクチャと、Ｐピクチャと呼ばれる片方向予測符号化（フレーム間又はフィールド間順方向予測符号化）ピクチャと、Ｂピクチャと呼ばれる双方向予測符号化ピクチャである。Ｉピクチャはランダムアクセスやチャンネル切替えに対応するもので、そこから復号が可能となる。ここで、ピクチャは動画像の１フレームないし１フィールドを指す。
【０００３】
符号列はピクチャが複数束ねられ、符号列群すなわちＧＯＰ（Group Of Picture）が形成される。このＧＯＰにおいては、Ｉピクチャがひとつは必ず入る形となる。通常のＧＯＰの構成は、Ｂピクチャから始まりＰピクチャで終わる。このピクチャ構成を図６（ａ）に示す。符号列においてＰ（Ｉ）ピクチャとＢピクチャの順番が入れ替わるため、このような構成となる。
【０００４】
一方、最初のＢピクチャは前のＧＯＰのＰピクチャからも予測されるため、画像間予測が途切れず符号列をＧＯＰ単位で入れ替えることができない。そこで、最初のＢピクチャを無くし、前ＧＯＰのＰピクチャの直後をＩピクチャとする方法がある。これはクローズド（Closed）ＧＯＰと呼ばれるもので、各ＧＯＰは前後ＧＯＰと関係なくなるため、ＧＯＰ単位で符号列の編集が可能になる。このＧＯＰ構成を図６（ｂ）に示す。この場合、最初の部分が周期的な処理でなくなるので、処理がやや面倒になる。また、符号量が少ないＢピクチャが削除されるので平均符号量が増加する。
【０００５】
一方、本発明者が先に特開平１１−１６４３０７号公報にて開示したように、主たる符号列とは別に副符号列として、ピクチャ内独立符号化された符号列を多重化する動画像符号化装置及び動画像復号化装置がある。この動画像符号化装置では、入力される動画像に対して、画像内独立符号化または画像間予測符号化をフレーム又はフィールド単位で切り替えて行い、得られた主符号列を出力する主符号化手段と、前記主符号化手段において画像間予測符号化が行われるフレーム又はフィールドのうち所定フレーム又はフィールドを、画像内独立符号化し、得られた副符号列を出力する副符号化手段と、前記所定フレーム又はフィールドの主符号列の隣接部に前記所定フレームまたはフィールドの副符号列を挿入し、多重化された符号列を得る符号列多重化手段とより構成したことを特徴とする。
【０００６】
また、上記の動画像復号化装置は、入力される符号列のタイプ（主符号列／副符号列）を符号列のヘッダーより検出し、符号列のタイプ情報を出力するタイプ検出手段と、前記符号列のタイプ情報に基づき、連続した画像の復号化が行われていない場合は、入力されるいずれの符号列も復号化処理に導き、連続した画像の復号化が行われている場合は、副符号列を放棄して主符号列のみを復号化処理に導く符号列制御手段と、前記符号列制御手段から与えられる符号列に対して、画像内復号化又は画像間予測復号化を行い、得られた再生画像を出力する復号化手段とを有する構成である。
【０００７】
この本発明者の先の提案になる動画像符号化装置及び動画像復号化装置によれば、通常の復号化ではピクチャ内独立符号化された符号列は用いずに復号化し、ランダムアクセスやチャンネル切替え時にのみ、ピクチャ内独立符号化された符号列から復号化することが可能になる。
【０００８】
また、ピクチャ内独立符号化した局部復号画像とピクチャ間予測画像の両方を用いて画質を高める手法がある。例えば、本発明者が先に特開平５−１３０５９１号公報にて開示した動画像符号化装置では、ピクチャ内独立符号化の再生画像とピクチャ間予測画像を適応的に加算し、予測信号を形成するものである。
【０００９】
図７は従来の動画像符号化装置の一例のブロック図を示す。同図において、画像入力端子１より入来する動画像信号は、すべてがフレーム遅延器２に供給される一方、Ｉピクチャとして符号化する信号のみがスイッチ１９を介してＤＣＴ２０に供給される。フレーム遅延器２は、ＰピクチャをＢピクチャに先行して符号化するために、Ｂピクチャのみをフレーム時間遅延させる。順番が入れ替えられた各画像は、減算器３に与えられる。
【００１０】
フレーム遅延器２からの画像信号は、減算器３において後述する加算器９からの予測信号と減算されて予測残差とされてＤＣＴ４に入力される。ＤＣＴ４は予測残差に対して離散コサイン変換（ＤＣＴ：Discrete Cosine Transform）の変換処理を行い、得られた係数を量子化器５に供給する。量子化器５は所定のステップ幅で入力係数を量子化し、固定長の符号となった係数を可変長符号化器６と逆量子化器１０に供給する。可変長符号化器６は、固定長の予測残差を可変長符号で圧縮して、得られた符号を多重化器１３に供給する。
【００１１】
一方、逆量子化器１０及び逆ＤＣＴ１１ではＤＣＴ４及び量子化器５の逆処理が行われ、予測残差を再生する。得られた再生予測残差は加算器１２で、加算器９からの予測信号と加算されて再生画像とされ、画像間予測器７に入力される。画像間予測器７はこの再生画像を参照画像として用いて画像間予測信号を形成し、乗算器８に供給する。乗算器８は、後述の特定画像設定器１８よりの制御情報に従って再生画像に０から１の値を乗じて、加算器９に供給する。
【００１２】
Ｉピクチャの符号化は、上記Ｐピクチャとして符号化される画像の内、周期的に設定した一部の画像について行う。Ｉピクチャの符号化は、予測残差に対する上記の処理と同様で、ＤＣＴ２０、量子化器２１及び可変長符号化器２２からなる回路部で符号化されるが、この処理はＩ（Ｐ）ピクチャに対するＤＣＴ４、量子化器５及び可変長符号化器６からなる回路部の処理と同様である。得られた符号は可変長符号化器２２から多重化器１３に入力される。
【００１３】
一方、逆量子化器１５及び逆ＤＣＴ１６ではＤＣＴ２０及び量子化器２１の逆処理が行われ、画像を再生する。得られた再生画像（Ｉピクチャ局部復号画像）は、乗算器１７に与えられる。乗算器１７は、後述する特定画像設定器１８からの制御情報に従って局部復号画像に０から１の値を乗じて、加算器９に供給する。
【００１４】
加算器９は、乗算器８からの画像間予測画像と、乗算器１７からのＩピクチャ局部復号画像とを加算して最終的な予測画像を得る。乗算器８の乗算係数と乗算器１７の乗算係数とは、それらの和が１となるもので、画像の相関により制御されてもよい。Ｉピクチャの無い非特定ピクチャでは乗算器８で１、乗算器１７で０が乗算され、通常のＰピクチャの処理となる。Ｂピクチャは予測の参照画像とならないので、この加算処理は関係ない。
【００１５】
特定画像設定器１８は所定周期毎のＰピクチャを特定ピクチャとして設定し、その制御情報をスイッチ１９、乗算器８、１７、多重化器１３に与える。多重化器１３は、特定ピクチャの情報と各ピクチャの符号列を多重化し、符号列出力端子１４より出力する。
【００１６】
次に、従来の動画像符号列について説明する。従来のＧＯＰ（画像群）の符号列構成は、通常のＧＯＰの場合は図８（ａ）に示すように、クローズド（Closed）ＧＯＰの場合は図８（ｂ）に示すようになる。図８で区切りは各ピクチャの符号列を示し、Ｉ、Ｂ、Ｐはピクチャタイプ、数字は再生表示ピクチャ番号である。符号列は、ＢピクチャとＰ（Ｉ）ピクチャの順番が逆転しているのが判る。その結果ＧＯＰの最後はＰピクチャにならず、その前のＢピクチャとなる。
【００１７】
通常のＧＯＰ構成の動画像符号列は、ＧＯＰ単位で編集を行うと最初のＢピクチャが復号化できなくなる。これはその前のＰピクチャが前のＧＯＰに属し、ＧＯＰ単位の編集によりＰピクチャが他の画像に変化してしまうので、正しい参照画像が得られなくなるためである。この場合、復号化装置でＢピクチャの画像を復号化しないようにするため、編集が行われていることを示すフラグ（Bloken Link）を立てる必要がある。
【００１８】
一方、クローズド（Closed）ＧＯＰ構成の動画像符号列は、最初のＢピクチャがないので、ＧＯＰ単位で編集を行っても復号化に影響しない。これは画像間予測がＧＯＰで閉じているためで、編集が行われていることを示すフラグ（Bloken Link）を立てる必要はない。
【００１９】
図７に示した従来の動画像符号化装置に対応する従来の動画像復号化装置は、予測信号の形成において、図７の局部復号部分と同様に独立フレーム復号画像と画像間予測画像が適応的に加算する構成である。
【００２０】
一方、通常のＧＯＰ構成の動画像符号列で、ＧＯＰ単位で編集が行われ、ブロークンリンク（Bloken Link）フラグが立っている場合、復号化では、編集点以降でＩピクチャより前のＢピクチャは復号化せず、前の画像などで置き換える。クローズド（Closed）ＧＯＰ構成の動画像符号列では、ＧＯＰ単位での符号列編集の影響は受けないが、Ｐピクチャの周期が不連続となるので、それに応じた復号化処理が必要になる。
【００２１】
【発明が解決しようとする課題】
従来の動画像編集では、Ｉピクチャの周期で束ねられた符号列群、すなわちＧＯＰ（Group Of Picture）単位を持つ符号列の編集を行うが、通常のＧＯＰ構成では符号列の編集が困難であり、最初のＢピクチャがないクローズド（Closed）ＧＯＰ構成では、符号化効率が低下し、また、Ｐピクチャの周期が不連続になるという問題がある。
【００２２】
また、Ｐピクチャにおいて副符号列としてＩピクチャも持つ従来の手法は、ランダムアクセスなどには有効であるが、重複するＩピクチャ分だけ符号量が増加し、符号列編集に対応したＧＯＰ構造になっていない。
【００２３】
更に、同一フレームのＩピクチャ局部復号画像と画像間予測信号から予測信号を形成する従来の手法は、符号化効率は良いが、両方の符号列がないと復号化ができないので、符号列の編集はできない。
【００２４】
本発明は以上の点に鑑みなされたもので、所定ＰピクチャではＩピクチャも持ち、両者の再生画像を加算したものを再生画像とすることで、編集可能でありながら再生画像の画質を改善できる動画像符号化装置及び動画像符号化方法を提供することを目的とする。
【００２５】
【課題を解決するための手段】
上記の目的を達成するため、本発明の動画像符号化装置は、ピクチャ内独立、片方向予測、双方向予測の３種類の符号化手法で動画像の各ピクチャを符号化する動画像符号化装置において、入力画像信号を片方向予測又は双方向予測で符号化し、局部復号して片方向予測符号化の局部復号画像を得る第１の符号化局部復号化手段と、片方向予測で符号化されるピクチャの一部を特定ピクチャとし、特定ピクチャでは片方向予測符号化と共にピクチャ内独立でも符号化し、局部復号して、ピクチャ内独立符号化の局部復号画像を得る第２の符号化局部復号化手段と、特定ピクチャにおいて、片方向予測符号化の局部復号画像とピクチャ内独立符号化の局部復号画像を加算して、他ピクチャの画像間予測処理の参照画像とする画像間予測手段とを有する構成としたものである。
【００２６】
この発明では、特定ピクチャでは片方向予測符号化されたピクチャ符号列の他に、ピクチャ内独立符号化されたピクチャ符号列も持ち、これら２種類のピクチャ符号列が重複することになるが、両者の局部復号画像を加算することで、再生画像のＳ／Ｎを改善でき、また、その再生画像を画像間予測の参照画像とすることで、画像間予測効率も改善できる。
【００２７】
また、上記の目的を達成するため、本発明の動画像符号化方法は、ピクチャ内独立、片方向予測、双方向予測の３種類の符号化手法で動画像の各ピクチャを符号化する動画像符号化方法において、入力画像信号を片方向予測で符号化し、局部復号して片方向予測符号化の局部復号画像を得る第１のステップと、片方向予測で符号化されるピクチャの一部を特定ピクチャとし、特定ピクチャでは片方向予測符号化と共にピクチャ内独立でも符号化し、局部復号してピクチャ内独立符号化の局部復号画像を得る第２のステップと、特定ピクチャにおいて、片方向予測符号化の局部復号画像とピクチャ内独立符号化の局部復号画像を加算して、他ピクチャの画像間予測処理の参照画像とする第３のステップとを含むことを特徴とする。
【００２８】
この発明では、特定ピクチャでは片方向予測符号化されたピクチャ符号列の他に、ピクチャ内独立符号化されたピクチャ符号列も持ち、これら２種類のピクチャ符号列が重複することになるが、両者の局部復号画像を加算した信号を、他ピクチャの画像間予測処理の参照画像とすることで、再生画像のＳ／Ｎを改善でき、また、その再生画像を画像間予測の参照画像とすることで、画像間予測効率も改善できる。
【００３１】
【発明の実施の形態】
次に、本発明の実施の形態について図面と共に説明する。図１は本発明になる動画像符号化装置の一実施の形態のブロック図を示す。同図中、図７と同一構成部分には同一符号を付してある。また、本明細書中、「ピクチャ」とは、一つのフレームないしフィールドを指すものとする。
【００３２】
図１において、画像入力端子１より入来する動画像信号は、すべてがフレーム遅延器２に与えられ、Ｉピクチャとして符号化するもののみがスイッチ１９を介してＤＣＴ２０に与えられる。フレーム遅延器２は、ＰピクチャをＢピクチャに先行して符号化するために、Ｂピクチャのみを遅延させる。順番が入れ替えられた各画像は、減算器３に与えられる。
【００３３】
フレーム遅延器２により遅延された入力画像信号は、減算器３において画像間予測器２７から与えられる予測信号と減算され、予測残差とされてＤＣＴ４に入力される。ＤＣＴ４は予測残差に対してＤＣＴ（Discrete Cosine Transform）の変換処理を行い、得られた係数を量子化器５に与える。量子化器５は与えられた係数を所定のステップ幅で量子化し、固定長の符号となった係数を可変長符号化器６と逆量子化器１０に供給する。可変長符号化器６は、量子化器５からの固定長の予測残差を可変長符号で圧縮し、得られたＰピクチャ又はＢピクチャの可変長符号は多重化器１３に供給される。
【００３４】
一方、逆量子化器１０及び逆ＤＣＴ１１ではＤＣＴ４及び量子化器５の逆処理が行われ、予測残差を再生する。得られた再生予測残差は加算器１２において画像間予測器２７からの予測信号と加算されて局部復号画像となり、乗算器２５に供給される。乗算器２５は、特定画像設定器１８からの制御情報に従って局部復号画像に０から１の値を乗じて、加算器２６に供給する。
【００３５】
Ｉピクチャの符号化は、Ｐピクチャとして符号化される画像の内、周期的に設定した一部の画像について行う。このＩピクチャの符号化は、予測残差に対する上記の処理と同様にして行われる。すなわち、Ｉピクチャは、ＤＣＴ２０及び量子化器２１を通して可変長符号化器２２に入力されて可変長符号化されるが、この処理はＰ（Ｂ）ピクチャに対するＤＣＴ４、量子化器５及び可変長符号化器６の処理と同様である。可変長符号化器２２により得られたＩピクチャの可変長符号は多重化器１３に入力される。
【００３６】
一方、逆量子化器１５及び逆ＤＣＴ１６ではＤＣＴ２０及び量子化器２１の逆処理が行われ、局部復号画像を再生する。得られた局部復号画像は、乗算器１７に与えられる。乗算器１７は、特定画像設定器１８からの制御情報に従って局部復号画像に０から１の値を乗じて、加算器２６に供給する。
【００３７】
加算器２６は乗算器２５、１７からの２種類の局部復号画像を加算して画像間予測処理のための参照画像を得る。画像間予測器２７は、この参照画像を用いて画像間予測信号を形成する。この画像間予測信号は減算器３及び加算器１２にそれぞれ供給される。
【００３８】
特定画像設定器１８は所定周期毎のＰピクチャを特定ピクチャとして設定し、その制御情報をスイッチ１９、乗算器１７、２５、多重化器１３に与える。多重化器１３は、特定ピクチャの情報と各ピクチャの符号列を多重化し、符号列出力端子１４より出力する。スイッチ１９は上記の特定ピクチャのときにのみオンとされ、それ以外の非特定ピクチャのときにはオフとされる。
【００３９】
次に、加算器２６における２種類の局部復号画像の加算処理について説明する。まず、非特定ピクチャでは、Ｉピクチャはないので、乗算器２５は１を乗じ、乗算器１７は０を乗じる。すなわち一般的なＰピクチャの符号化と変わらない。なお、Ｂピクチャは参照画像とならないので、加算処理はそもそも関係しない。
【００４０】
一方、特定ピクチャでは、Ｐピクチャの局部復号画像とＩピクチャの局部復号画像の加算を行うために、乗算器２５と乗算器１７は共に係数０.５を入力局部復号画像に乗じる。互いの画像に含まれる雑音成分が白色雑音の場合は、加算により３ｄＢのＳ／Ｎが改善できるが、Ｐピクチャの局部復号画像のノイズ成分とＩピクチャの局部復号画像のノイズ成分は、それぞれ処理方法が異なるものの、高い周波数成分で量子化が粗くなっているなど共通点もあるので、雑音成分にも相関があり、３ｄＢの改善は得られない。しかし、同一ではないので、ある程度の改善は見込まれる。仮に半分の１.５ｄＢであるとすると、符号量でこれに見合う改善を行うためには３０％程度符号量を増加させる必要がある。
【００４１】
一般に、量子化器５、２１の各量子化ステップ幅を各々設定することで、ＩピクチャはＰピクチャより再生画像の品質を高めに設定する。これは、ＧＯＰのすべての画像の参照画像の基となるＩピクチャの品質を高めにすることが、ＧＯＰ全体の画質向上に寄与するためである。一方、Ｐピクチャの再生画像とＩピクチャの再生画像でＳ／Ｎが異なると、加算はあまり有効でなくなる。そこで、Ｉピクチャの符号量をある程度減らすと、ＰピクチャとＳ／Ｎが同等になり、最大の効果が得られる。
【００４２】
本発明は、通常のＧＯＰ構成に対しＰピクチャが追加されているので、その分符号量が多くなるが、Ｉピクチャの符号量を減らしてＳ／Ｎを下げても、ＩピクチャとＰピクチャの加算で参照画像のＳ／Ｎが保持できれば、再生画像、符号量共に通常のＧＯＰと同等となる。
【００４３】
ここで、発生符号量を通常ＧＯＰ及びクローズド（Closed）ＧＯＰと比較してみる。Ｉピクチャの平均符号量を１０００ｋｂｉｔ、Ｐピクチャの平均符号量を３００ｋｂｉｔ、Ｂピクチャの平均符号量を１００ｋｂｉｔとする。毎秒３０フレームの画像で、Ｐ（Ｉ）ピクチャの周期を３フレームとする通常ＧＯＰの場合、ＧＯＰの長さを１５フレームとすると、１秒中の各ピクチャ平均数から平均転送レートは６.４Ｍｂｐｓとなる。
【００４４】
一方、クローズド（Closed）ＧＯＰの場合は、ＧＯＰの大きさが通常ＧＯＰとは異なり、ＧＯＰの長さが１３フレームで平均転送レートが６.９２Ｍｂｐｓ、ＧＯＰの長さが１６フレームで平均転送レートが６.５６Ｍｂｐｓとなり、いずれも通常ＧＯＰに比べて平均転送レートが増加する。また、ＧＯＰの長さが１３フレームではアクセス性がやや向上するが、１６フレームの場合は低下する。両者から１５フレーム相当の符号量を得ると６.６８Ｍｂｐｓとなり、通常のＧＯＰに対して４．４％の符号量増加となる。
【００４５】
本実施の形態の場合は、Ｉピクチャの平均符号量を通常のＧＯＰやクローズドＧＯＰと同じとすると平均転送レートは７.０Ｍｂｐｓとなるが、３０％落として７００ｋｂｉｔとすると平均転送レートが６.４Ｍｂｐｓとなり、通常ＧＯＰの場合と同じになる。これは通常ＧＯＰのＩピクチャの符号量を、ＩピクチャとＰピクチャに割り振った形となる。
【００４６】
次に、動画像符号列について説明する。図１に示した符号化装置で符号化された符号列の形成において、特定ピクチャのＰピクチャ符号列をＧＯＰ（画像群）の最後にし、ＩピクチャをＧＯＰの最初にする。従って、特定ピクチャにおいては、Ｐピクチャ、Ｉピクチャの順で符号列が配置され、一つのＧＯＰで見るとＩピクチャで始まり、Ｐピクチャで終わる。この本実施の形態のＧＯＰ構成を図６（ｃ）に示す。
【００４７】
一方、符号列ではＢピクチャとＰ（Ｉ）ピクチャは逆転するので、最後はＰピクチャにならず、その前のＢピクチャとなる。すなわち、形成されるＧＯＰ（画像群）の符号列は、図８（ｃ）に示すように、特定ピクチャのピクチャ内独立符号化されたＩピクチャ符号列Ｉ１で始まり、次の特定ピクチャの直前にある双方向予測符号化されたＢピクチャ符号列Ｂ１５で終了する。
【００４８】
このＧＯＰ構成は、特定フレームの重複は無視して１ＧＯＰだけを比較するとクローズド（Closed）ＧＯＰと同様であり、特定フレームでＩピクチャまたはＰピクチャの一方を削除すると、削除された方によりＧＯＰの構成は変化するが、ピクチャの並びは通常ＧＯＰの並びと同様になる。すなわち、本実施の形態のＧＯＰは、クローズド（Closed）ＧＯＰと通常ＧＯＰの両方の特性を兼ね備えることができる。
【００４９】
前記動画像符号列は、クローズド（Closed）ＧＯＰの場合と同様にＧＯＰ単位で符号列の編集が可能になる。その様子を図２に示す。同図に示すように、１行目に示す符号列ＡのあるＧＯＰとＧＯＰの間に、３行目に示す符号列Ｂの１ＧＯＰが挿入されて、２行目に示すような編集された符号列が得られる。ここで、各ＧＯＰはその最初と最後が重複ピクチャとなっている。従って、従来の編集装置と処理が異なる。
【００５０】
まず、画像の長さについて、ＧＯＰの最後のＰピクチャは、ＧＯＰの長さ（時間）には組み入れないで、編集時間の計算を行う。従って、本実施の形態のＧＯＰ構成の符号列が１６フレームであっても、１５フレームと見なす。
【００５１】
次に、特定ピクチャの再生制御で、ＧＯＰ単位で編集を行った場合、編集点となる特定ピクチャは、前のＧＯＰのＰピクチャ、後のＧＯＰのＩピクチャいずれもが復号化再生可能である。一方、編集が行われているので画像内容は異なる。符号列が重複する点を積極的に利用する方法としては、再生時にどちらの画像を出力するか、制御情報を入れておけば、同じ符号列で編集点を１ピクチャ前後させることができる。
【００５２】
また、従来クローズド（Closed）ＧＯＰでは復号化装置で処理変更がないので、編集が行われていることを示すフラグ（Bloken Link）を立てる必要はなかったが、本手法においては復号化処理を切り替える必要があるので、ブロークンリンク（Bloken Link）のフラグを立てる必要がある。
【００５３】
次に、動画像復号化装置の各例について説明する。図３は動画像復号化装置の一例のブロック図を示す。この動画像復号化装置は、図１に示した本発明の動画像符号化装置の一実施の形態に対応する復号化装置の構成を示しており、これは編集が行われてない画像連続性が保たれた符号列の場合である。
【００５４】
図３において、符号列入力端子３１より入来する符号列は、多重化分離器３２によりピクチャのヘッダに基づきＩピクチャの符号列とそれ以外の符号列に分離される。ＰピクチャやＢピクチャの符号列は、可変長復号化器３３に供給され、Ｉピクチャの符号列は可変長復号化器３４に供給される。
【００５５】
Ｐ（Ｂ）ピクチャの符号列は、可変長復号化器３３で予測残差の可変長符号が固定長の符号に戻され、逆量子化器３５に供給される。逆量子化器３５は、入力された固定長符号を、量子化パラメータに従って逆量子化して予測残差の再生ＤＣＴ係数値を得、これを逆ＤＣＴ３６に供給する。
【００５６】
逆ＤＣＴ３６は８×８個の係数を復号予測残差信号に変換し、加算器３７に供給する。加算器３７は上記の復号予測残差信号に、画像間予測器４５から与えられる予測信号を加算して復号画像信号を得る。この様にして得られたＰ（Ｂ）ピクチャの復号画像信号は、乗算器４２に供給される。
【００５７】
一方、多重化分離器３２で分離されたＩピクチャの符号列は、可変長復号化器３４で復号化され、逆量子化器３８で逆量子化され、逆ＤＣＴ３９で復号化されて再生画像信号とされた後、乗算器４１に入力される。可変長復号化器３４、逆量子化器３８、逆ＤＣＴ３９の動作は、可変長復号化器３３、逆量子化器３５、逆ＤＣＴ３６と同様であるが、パラメータはＩピクチャ用のものとなる。
【００５８】
また、多重分離器３２は入力された符号列中のピクチャヘッダからピクチャのＩＤを検出して、その結果情報が特定画像制御器４０に供給される。特定画像制御器４０は、特定ピクチャを検出し、その制御情報を乗算器４１及び４２にそれぞれ供給する。乗算器４１は、上記の制御情報に従って逆ＤＣＴ３９からの再生画像信号に０から１の値を乗じて、加算器４３に与える。他方、乗算器４２は、上記の制御情報に従って加算器３７からのＰ（Ｂ）ピクチャの復号画像信号に０から１の値を乗じて、加算器４３に与える。
【００５９】
加算器４３は、乗算器４１及び４２から取り出された２種類の復号画像信号を加算して再生画像信号を得る。加算器４３による加算は特定ピクチャのみで行われ、このとき乗算器４１、乗算器４２共に係数０．５が乗算される。それ以外では、乗算器４２で係数１と復号画像信号との乗算が、乗算器４１で係数０と復号画像信号との乗算がそれぞれ行われるため、加算器４３からは加算器３７からのＰ（Ｂ）ピクチャの復号画像信号がそのまま出力される。
【００６０】
特定ピクチャでは再生画像信号は加算器４３での加算により、乗算器４１、４２から取り出された各復号画像信号よりＳ／Ｎが改善されたものとなる。このような復号化の様子を図４（ａ）に示す。
【００６１】
加算器４３から出力された再生画像信号は、Ｂピクチャではスイッチ４６を介して再生画像出力端子４７よりそのまま出力される。一方、加算器４３から出力された再生画像信号は、Ｐ（Ｉ）ピクチャでは画像メモリ４４にいったん蓄えられ、画像間予測処理のための参照画像とされると共に遅延させられた後、画像間予測器２２に供給され、ここでこの参照画像を用いて予測信号とされて加算器３７に入力される。スイッチ４６は遅延されたＢピクチャと、画像メモリ４４で遅延されたＰ（Ｉ）ピクチャを選択して出力端子４７へ出力する。
【００６２】
次に、動画像復号化装置の他の例について説明する。図５は図１の動画像符号化装置に対応する動画像復号化装置の他の例のブロック図を示す。図５中、図３と同一構成部分には同一符号を付し、その説明を省略する。図５の復号化装置は符号列編集が行われ、画像連続性が保たれない場合の復号化を行う装置である。
【００６３】
加算器３７から出力されるＢピクチャの復号画像信号は、スイッチ５１を介して再生画像出力端子４７よりそのまま出力される。一方、加算器３７から出力されるＰピクチャの復号画像信号は、画像メモリ４８に一旦保持される。また、逆ＤＣＴ３９から出力されるＩピクチャの復号画像信号は、画像メモリ４９に一旦保持される。
【００６４】
ここで、ＧＯＰは編集が行われているので、特定ピクチャのＰピクチャとＩピクチャは形式的に同一ピクチャとなっているが、Ｐピクチャは前ＧＯＰのものであり、Ｉピクチャは後のＧＯＰのものである。そこでスイッチ５０は、画像メモリ４８及び４９からの２種類の復号画像信号から次のように参照画像として適切な方を選択する。
【００６５】
特定ピクチャの復号化の次には、前ＧＯＰのＢピクチャの復号化が行われるが、それには画像メモリ４８に保持されているＰピクチャの復号画像を選択する。続けて、次のＧＯＰのＰピクチャ及びＢピクチャの復号化では、画像メモリ４９に保持されているＩピクチャの復号画像を選択する。この場合の復号化の様子を図４（ｂ）に示す。図で矢印は画像間予測の関係である。
【００６６】
図５に示した復号化装置の復号化では、画像間予測の参照画像が符号化装置の参照画像と若干異なることになるが、いずれも同一画像に対する復号画像であり、量子化雑音成分以外の元の画像は共通である。参照画像の変化は、編集点直前は２ピクチャのみ、編集点後は１ＧＯＰに影響する。しかし、編集点後は予測残差成分が順次加算されるので、参照画像変化の影響は次第に少なくなる。一方、視覚特性を考慮すると、編集でシーンが変わった場合、劣化にはかなり気づき難く、特に変化直後は０.１秒程度の間検知能力が大きく低下するといわれている。従って、劣化の視覚的影響は極めて小さい。
【００６７】
再生画像出力は、スイッチ５１で選択される。スイッチ５１の動作で特定ピクチャ以外は図３のスイッチ４６と同様である。特定フレームではＩピクチャとＰピクチャのいずれを出力することも可能であるので、どちらを選択するかあらかじめ決められていてもよいが、符号列編集装置にて符号列に制御情報が入れられている場合は、それに従って制御する。
【００６８】
なお、以上の実施の形態では、動画像符号化装置及び方法について説明したが、本発明はこれに限定されるものではなく、片方向予測で符号化されるピクチャの一部を特定ピクチャとし、その特定ピクチャでは片方向予測符号化と共にピクチャ内独立でも符号化した符号列があり、特定ピクチャのピクチャ内独立符号化された符号列で始まり、特定ピクチャの直前のピクチャの双方向予測符号化された符号列で終了する符号列を一つの符号列群とし、この符号列群単位で合成された符号列を、所望の伝送路を介して伝送するようにしてもよい。
【００６９】
【発明の効果】
以上説明したように、本発明によれば、特定ピクチャでは片方向予測符号化されたピクチャ符号列の他に、ピクチャ内独立符号化されたピクチャ符号列も持ち、これら２種類のピクチャ符号列が重複することになるが、両者の局部復号画像を加算することで、再生画像のＳ／Ｎを改善するようにしたため、再生画像の画質を改善できる。また、その再生画像を画像間予測の参照画像とすることで、画像間予測効率も改善するようにしたため、上記の特定ピクチャではその分総符号量を減らすこともできる。
【００７０】
また、本発明によれば、符号列の群（ＧＯＰ）構成を特定ピクチャに含まれる片方向予測符号化ピクチャと、ピクチャ内独立符号化ピクチャの間で区切る構成とすることにより、ＧＯＰ単位で符号列が編集された場合の不連続点では、復号化装置においてＧＯＰ終端は片方向予測符号化ピクチャを、ＧＯＰ始端はピクチャ内独立符号化ピクチャを参照画像として他ピクチャの画像間予測を行うようにしたため、通常ＧＯＰと同等の符号化効率で片方向予測符号化ピクチャの周期性を保ちながら、クローズド（Closed）ＧＯＰ同様に、ＧＯＰ単位の符号列編集ができる符号列を生成することができる。
【図面の簡単な説明】
【図１】本発明の動画像符号化装置の一実施の形態のブロック図である。
【図２】動画符号列の編集の様子の一例を示す図である。
【図３】動画像復号化装置の一例のブロック図である。
【図４】復号化の各例をピクチャ単位で示す図である。
【図５】動画像復号化装置の他の例のブロック図である。
【図６】ＧＯＰ構成の各例を示す図である。
【図７】従来の動画像符号化装置の一例のブロック図である。
【図８】ＧＯＰ（画像群）の符号列構成の各例を示す図である。
【符号の説明】
１画像入力端子
２フレーム遅延器
３減算器
４、２０ＤＣＴ
５、２１量子化器
６、２２可変長符号化器
１０、１５、３５、３８逆量子化器
１１、１６、３６、３９逆ＤＣＴ
１２、２６、３７、４３加算器
１３多重化器
１４符号列出力端子
１７、２５、４１、４２乗算器
１８特定画像設定器
１９、４６スイッチ
２７、４５画像間予測器
３１符号列入力端子
３２多重化分離器
３３、３４可変長復号化器
４０特定画像制御器
４７再生画像出力端子
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a moving picture coding apparatus and a moving picture coding method, and in particular to high-efficiency coding that converts image information into a digital signal with a smaller code amount so that images can be transmitted, stored, and displayed efficiently. More specifically, it relates to a moving picture coding apparatus and a moving picture coding method that, in moving picture coding using three coding methods (intra-picture independent coding, unidirectional prediction, and bidirectional prediction), make code strings easy to edit while limiting the loss of coding efficiency.
[0002]
[Prior art]
Conventionally, an MPEG (Moving Picture Experts Group) method is known as a high-efficiency compression encoding method for moving images. This MPEG system has three picture types according to the inter-picture prediction method. They are an intra-picture independent coded picture called an I picture, a unidirectional predictive coded (interframe or interfield forward predictive coded) picture called a P picture, and a bidirectional predictive coded picture called a B picture. The I picture corresponds to random access and channel switching, and can be decoded therefrom. Here, a picture indicates one frame or one field of a moving image.
[0003]
In the code string, a plurality of pictures are bundled to form a code string group, that is, a GOP (Group Of Pictures). Each GOP always contains at least one I picture. A normal GOP starts with a B picture and ends with a P picture. This picture configuration is shown in FIG. 6(a). The configuration arises because the order of the P (I) pictures and the B pictures is swapped in the code string.
[0004]
However, the first B picture is also predicted from the P picture of the previous GOP, so inter-picture prediction is not cut off at the GOP boundary and the code string cannot be swapped in units of GOPs. One remedy is to eliminate the first B picture so that the I picture immediately follows the P picture of the previous GOP. This is called a closed GOP: each GOP becomes independent of the preceding and following GOPs, so the code string can be edited in units of GOPs. This GOP configuration is shown in FIG. 6(b). In this case the beginning of the GOP no longer follows the periodic processing pattern, which makes processing somewhat cumbersome. In addition, since B pictures, which have a small code amount, are removed, the average code amount increases.
[0005]
Meanwhile, as the present inventor previously disclosed in Japanese Patent Application Laid-Open No. 11-164307, there are a moving picture coding apparatus and a moving picture decoding apparatus that multiplex an intra-picture independently coded code string as a sub code string, separate from the main code string. This moving picture coding apparatus comprises: main coding means that switches between intra-picture independent coding and inter-picture predictive coding in units of frames or fields for the input moving picture and outputs the resulting main code string; sub coding means that independently intra-codes predetermined frames or fields among those subjected to inter-picture predictive coding by the main coding means and outputs the resulting sub code string; and code string multiplexing means that inserts the sub code string of each predetermined frame or field adjacent to the main code string of that frame or field to obtain a multiplexed code string.
[0006]
The corresponding moving picture decoding apparatus comprises: type detection means that detects the type of each input code string (main code string or sub code string) from its header and outputs the type information; code string control means that, based on the type information, passes every input code string to the decoding process while continuous picture decoding is not in progress, and discards the sub code string and passes only the main code string while continuous picture decoding is in progress; and decoding means that performs intra-picture decoding or inter-picture predictive decoding on the code string supplied from the code string control means and outputs the resulting reproduced picture.
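The routing rule of the code string control means can be sketched as follows. This is only an illustration under stated assumptions: the labels `"main"` and `"sub"`, the function name `route_to_decoder`, and the tuple layout are hypothetical and do not come from the cited publication.

```python
def route_to_decoder(units, continuous):
    """Return the code strings that are passed on to the decoding process.

    units:      list of (stream_type, payload); stream_type is "main" or "sub"
                (hypothetical labels standing in for the header type information)
    continuous: True while consecutive pictures are already being decoded
    """
    passed = []
    for stream_type, payload in units:
        if stream_type not in ("main", "sub"):
            raise ValueError("unknown code string type: %r" % (stream_type,))
        # During continuous decoding the intra-coded sub code string is
        # redundant, so it is discarded and only the main code string is kept.
        if continuous and stream_type == "sub":
            continue
        passed.append((stream_type, payload))
    return passed
```

When `continuous` is false (for example right after random access or a channel switch), every unit is passed through, so decoding can start from the intra-coded sub code string.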
[0007]
According to the moving picture coding apparatus and moving picture decoding apparatus proposed previously by the present inventor, in normal decoding, decoding is performed without using an intra-picture independent coded code string, and random access or channel Only at the time of switching, decoding can be performed from a code string that has been independently encoded within a picture.
[0008]
There is also a technique that improves picture quality by using both a locally decoded image obtained by intra-picture independent coding and an inter-picture prediction image. For example, the moving picture coding apparatus previously disclosed by the present inventor in Japanese Patent Application Laid-Open No. 5-130591 adaptively adds the reproduced image of intra-picture independent coding and the inter-picture prediction image to form the prediction signal.
[0009]
FIG. 7 is a block diagram showing an example of a conventional moving picture encoding apparatus. In the figure, all the moving image signals coming from the image input terminal 1 are supplied to the frame delay device 2, while only the signal to be encoded as an I picture is supplied to the DCT 20 via the switch 19. The frame delay unit 2 delays only the B picture by a frame time in order to encode the P picture before the B picture. Each image whose order has been changed is given to the subtracter 3.
[0010]
In the subtracter 3, the prediction signal from an adder 9 (described later) is subtracted from the image signal from the frame delay unit 2 to obtain a prediction residual, which is input to the DCT 4. The DCT 4 applies the discrete cosine transform (DCT) to the prediction residual and supplies the resulting coefficients to the quantizer 5. The quantizer 5 quantizes the input coefficients with a predetermined step width and supplies the resulting fixed-length codes to the variable-length encoder 6 and the inverse quantizer 10. The variable-length encoder 6 compresses the fixed-length prediction residual with a variable-length code and supplies the resulting code to the multiplexer 13.
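As a rough illustration of the transform and quantization stages described here, the following sketch applies an orthonormal 2-D DCT-II to one block and quantizes the coefficients with a uniform step. The 8 x 8 block size, the rounding rule, and the function names are assumptions made for the example; the text only specifies a DCT followed by quantization with a predetermined step width.

```python
import math

N = 8  # block size (assumed; the paragraph does not state it)

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out

def quantize(coeffs, step):
    """Uniform quantization: each coefficient becomes an integer level."""
    return [[int(round(coef / step)) for coef in row] for row in coeffs]
```

For a flat residual block in which every sample is 10, the only nonzero coefficient is the DC term (80 with this normalization), which a step width of 16 maps to level 5.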
[0011]
Meanwhile, the inverse quantizer 10 and the inverse DCT 11 perform the inverse of the processing of the DCT 4 and the quantizer 5 to reproduce the prediction residual. In the adder 12, the reproduced prediction residual is added to the prediction signal from the adder 9 to form a reproduced image, which is input to the inter-picture predictor 7. The inter-picture predictor 7 forms an inter-picture prediction signal using this reproduced image as the reference image and supplies it to the multiplier 8. The multiplier 8 multiplies the reproduced image by a value from 0 to 1 in accordance with control information from the specific image setting unit 18 (described later) and supplies the result to the adder 9.
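The local decoding path (inverse quantizer, inverse DCT, adder 12) can be sketched for a single block as below. As before, the 8 x 8 block size, the uniform step, and the function names are assumptions for illustration; motion compensation inside the inter-picture predictor is omitted and the prediction block is simply passed in.

```python
import math

N = 8  # block size (assumed)

def idct_2d(coeffs):
    """Inverse of an orthonormal 2-D DCT-II for an N x N block."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            acc = 0.0
            for u in range(N):
                for v in range(N):
                    acc += (scale(u) * scale(v) * coeffs[u][v]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = acc
    return out

def local_decode(levels, step, prediction):
    """Rebuild one block the way the decoder will see it."""
    coeffs = [[lv * step for lv in row] for row in levels]  # inverse quantizer
    residual = idct_2d(coeffs)                              # inverse DCT
    return [[p + r for p, r in zip(prow, rrow)]             # adder: prediction + residual
            for prow, rrow in zip(prediction, residual)]
```

Because the encoder predicts from this reconstructed block rather than from the original input, encoder and decoder reference images stay in step despite quantization loss.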
[0012]
The I picture is coded for a subset of the pictures coded as P pictures, selected periodically. The I picture is coded in the same way as the prediction residual above, by the circuit section consisting of the DCT 20, the quantizer 21, and the variable length encoder 22; this processing is the same as that of the circuit section consisting of the DCT 4, the quantizer 5, and the variable length encoder 6 for I (P) pictures. The resulting code is input from the variable length encoder 22 to the multiplexer 13.
[0013]
On the other hand, in the inverse quantizer 15 and the inverse DCT 16, the inverse processing of the DCT 20 and the quantizer 21 is performed to reproduce an image. The obtained reproduced image (I picture local decoded image) is supplied to the multiplier 17. The multiplier 17 multiplies the locally decoded image by a value from 0 to 1 according to control information from the specific image setting unit 18 described later, and supplies the result to the adder 9.
[0014]
The adder 9 adds the inter-picture prediction image from the multiplier 8 and the I picture local decoded image from the multiplier 17 to obtain a final prediction image. The multiplication coefficient of the multiplier 8 and the multiplication coefficient of the multiplier 17 have a sum of 1 and may be controlled by correlation of images. For a non-specific picture without an I picture, 1 is multiplied by the multiplier 8 and 0 is multiplied by the multiplier 17, and normal P picture processing is performed. Since the B picture does not become a prediction reference image, this addition processing is irrelevant.
[0015]
The specific image setting unit 18 sets a P picture for each predetermined period as a specific picture, and supplies the control information to the switch 19, multipliers 8 and 17, and the multiplexer 13. The multiplexer 13 multiplexes the information of the specific picture and the code string of each picture, and outputs them from the code string output terminal 14.
[0016]
Next, a conventional moving picture code string will be described. The code string configuration of a conventional GOP (group of pictures) is shown in FIG. 8(a) for a normal GOP and in FIG. 8(b) for a closed GOP. In FIG. 8, each division represents the code string of one picture, I, B, and P are the picture types, and the numbers are the picture numbers in reproduction (display) order. In the code string, the order of the B pictures and the P (I) pictures is reversed. As a result, the GOP ends not with the P picture but with the B picture that precedes it.
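The reversal of B pictures and P (I) pictures between display order and code string order can be sketched as follows. The picture numbers in the usage note and the function name `coded_order` are illustrative assumptions and are not taken from FIG. 8.

```python
def coded_order(display):
    """Reorder pictures from display order into code string order.

    display: list of (picture_type, display_number) tuples.
    Each run of B pictures is emitted after the I or P picture that follows
    it in display order, because that picture is one of its references.
    """
    out = []
    pending_b = []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)      # hold until the next anchor is coded
        else:
            out.append(pic)            # I or P picture goes first
            out.extend(pending_b)      # then the B pictures that precede it
            pending_b = []
    out.extend(pending_b)              # trailing B pictures, if any (simplified)
    return out
```

For a display sequence B1 B2 I3 B4 B5 P6, the function yields I3 B1 B2 P6 B4 B5, so the code string of the group ends with a B picture even though the displayed group ends with a P picture.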
[0017]
When a moving picture code string with a normal GOP configuration is edited in units of GOPs, the first B pictures can no longer be decoded. The P picture that precedes them belongs to the previous GOP, and GOP-unit editing replaces that P picture with a different image, so the correct reference image is no longer available. In this case, a flag (Broken Link) indicating that editing has been performed must be set so that the decoding apparatus does not decode those B pictures.
[0018]
On the other hand, a moving picture code string with a closed GOP configuration has no leading B pictures, so editing in units of GOPs does not affect decoding. Because inter-picture prediction is closed within each GOP, there is no need to set the flag (Broken Link) indicating that editing has been performed.
[0019]
The conventional moving picture decoding apparatus corresponding to the conventional moving picture coding apparatus of FIG. 7 forms its prediction signal in the same way as the local decoding section of FIG. 7: the independently decoded frame image and the inter-picture prediction image are added adaptively.
[0020]
On the other hand, when a moving picture code string with a normal GOP configuration has been edited in units of GOPs and the Broken Link flag is set, the decoder does not decode the B pictures that follow the edit point and precede the I picture; it replaces them with, for example, the previous picture. A code string with a closed GOP configuration is unaffected by GOP-unit editing, but the period of the P pictures becomes discontinuous, so decoding must be adapted accordingly.
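The decoder behavior for an edited normal GOP can be sketched as a per-picture decision. This is a simplified model, not the decoder of any particular standard: the action names, the function name `decode_plan`, and the picture numbers in the usage note are hypothetical, and the GOP is given in code string order.

```python
def decode_plan(coded_gop, broken_link):
    """Decide, per picture, whether to decode it or substitute another image.

    coded_gop:   list of (picture_type, display_number) in code string order,
                 starting with the I picture of the GOP
    broken_link: True if the flag marking a GOP-unit edit is set
    """
    plan = []
    anchors_seen = 0
    for pic in coded_gop:
        if pic[0] in ("I", "P"):
            anchors_seen += 1
            plan.append((pic, "decode"))
        elif broken_link and anchors_seen == 1:
            # These B pictures come before the I picture in display order and
            # would need the last P picture of the replaced previous GOP.
            plan.append((pic, "repeat_previous"))
        else:
            plan.append((pic, "decode"))
    return plan
```

With the code string I3 B1 B2 P6 B4 B5 and the flag set, only B1 and B2 are substituted; B4 and B5 reference pictures inside the same GOP and decode normally.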
[0021]
[Problems to be solved by the invention]
In conventional moving picture editing, editing is performed on code strings organized in code string groups bundled at the I picture period, that is, GOPs (Groups Of Pictures). However, code strings are difficult to edit in the normal GOP configuration, while the closed GOP configuration, which lacks the leading B pictures, lowers coding efficiency and makes the period of the P pictures discontinuous.
[0022]
In addition, the conventional method in which a P picture also carries an I picture as a sub code string is effective for random access and the like, but the code amount increases by the amount of the duplicated I pictures, and the GOP structure is not suited to code string editing.
[0023]
Furthermore, the conventional method that forms the prediction signal from the I picture locally decoded image and the inter-picture prediction signal of the same frame has good coding efficiency, but decoding is impossible without both code strings, so the code string cannot be edited.
[0024]
The present invention has been made in view of the above points. Its object is to provide a moving picture coding apparatus and a moving picture coding method in which predetermined P pictures also have an I picture and the sum of the two reproduced images is used as the reproduced image, so that picture quality is improved while the code string remains editable.
[0025]
[Means for Solving the Problems]
To achieve the above object, the moving picture coding apparatus of the present invention codes each picture of a moving picture using three coding methods (intra-picture independent coding, unidirectional prediction, and bidirectional prediction) and comprises: first coding and local decoding means that codes the input image signal by unidirectional or bidirectional prediction and locally decodes it to obtain a locally decoded image of unidirectional predictive coding; second coding and local decoding means that designates some of the pictures coded by unidirectional prediction as specific pictures, codes each specific picture by intra-picture independent coding in addition to unidirectional predictive coding, and locally decodes it to obtain a locally decoded image of intra-picture independent coding; and inter-picture prediction means that, for each specific picture, adds the locally decoded image of unidirectional predictive coding and the locally decoded image of intra-picture independent coding and uses the sum as the reference image for inter-picture prediction of other pictures.
[0026]
In this invention, a specific picture has an intra-picture independently coded picture code string in addition to the unidirectionally predictive coded one, so the two kinds of picture code string overlap. However, adding the two locally decoded images improves the S/N of the reproduced image, and using that reproduced image as the reference image for inter-picture prediction also improves inter-picture prediction efficiency.
[0027]
To achieve the above object, the moving picture coding method of the present invention codes each picture of a moving picture using three coding methods (intra-picture independent coding, unidirectional prediction, and bidirectional prediction) and comprises: a first step of coding the input image signal by unidirectional prediction and locally decoding it to obtain a locally decoded image of unidirectional predictive coding; a second step of designating some of the pictures coded by unidirectional prediction as specific pictures, coding each specific picture by intra-picture independent coding in addition to unidirectional predictive coding, and locally decoding it to obtain a locally decoded image of intra-picture independent coding; and a third step of, for each specific picture, adding the locally decoded image of unidirectional predictive coding and the locally decoded image of intra-picture independent coding and using the sum as the reference image for inter-picture prediction of other pictures.
[0028]
In this invention, a specific picture has an intra-picture independently coded picture code string in addition to the unidirectionally predictive coded one, so the two kinds of picture code string overlap. However, using the sum of the two locally decoded images as the reference image for inter-picture prediction of other pictures improves the S/N of the reproduced image and also improves inter-picture prediction efficiency.
[0031]
DETAILED DESCRIPTION OF THE INVENTION
Next, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of a moving picture coding apparatus according to the present invention. In the figure, the same components as in FIG. 7 are given the same reference numerals. In this specification, "picture" refers to one frame or one field.
[0032]
In FIG. 1, all the moving image signals coming from the image input terminal 1 are given to the frame delay unit 2, and only what is encoded as an I picture is given to the DCT 20 via the switch 19. The frame delay unit 2 delays only the B picture in order to encode the P picture before the B picture. Each image whose order has been changed is given to the subtracter 3.
[0033]
In the subtracter 3, the prediction signal supplied from the inter-picture predictor 27 is subtracted from the input image signal delayed by the frame delay unit 2 to obtain a prediction residual, which is input to the DCT 4. The DCT 4 applies the DCT (Discrete Cosine Transform) to the prediction residual and passes the resulting coefficients to the quantizer 5. The quantizer 5 quantizes the coefficients with a predetermined step width and supplies the resulting fixed-length codes to the variable length encoder 6 and the inverse quantizer 10. The variable length encoder 6 compresses the fixed-length prediction residual from the quantizer 5 with a variable length code, and the resulting variable length code of the P picture or B picture is supplied to the multiplexer 13.
[0034]
On the other hand, the inverse quantizer 10 and the inverse DCT 11 perform the inverse processing of the DCT 4 and the quantizer 5 to reproduce the prediction residual. The obtained reproduction prediction residual is added to the prediction signal from the inter-picture predictor 27 in the adder 12 to form a locally decoded image, which is supplied to the multiplier 25. The multiplier 25 multiplies the locally decoded image by a value from 0 to 1 according to the control information from the specific image setting unit 18 and supplies the result to the adder 26.
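The residual coding and local decoding path described above can be sketched as follows. This is a simplified illustration only: the DCT and the variable length coding are omitted, a uniform quantizer with rounding is assumed, and the function names and the step width of 4 are hypothetical.

```python
def quantize(values, step):
    """Uniform quantization: map each value to an integer level."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantization: reconstruct values from integer levels."""
    return [level * step for level in levels]


def encode_and_locally_decode(samples, prediction, step):
    # Subtracter 3: prediction residual = input - prediction
    residual = [s - p for s, p in zip(samples, prediction)]
    levels = quantize(residual, step)          # what would be entropy-coded
    rec_residual = dequantize(levels, step)    # role of inverse quantizer 10
    # Adder 12: locally decoded image = prediction + reproduced residual
    local_decoded = [p + r for p, r in zip(prediction, rec_residual)]
    return levels, local_decoded


levels, decoded = encode_and_locally_decode([101, 105, 97], [98, 98, 98], step=4)
print(levels, decoded)
```

Because the encoder reconstructs the picture from the quantized levels, its reference image contains the same quantization error that a decoder would see.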
[0035]
The coding of the I picture is performed for some of the pictures coded as P pictures, selected periodically. The I picture is encoded in the same manner as the above processing of the prediction residual. That is, the I picture is input to the variable length encoder 22 through the DCT 20 and the quantizer 21 and is variable length encoded. This processing is the same as that of the DCT 4, the quantizer 5, and the variable length encoder 6 for the P (B) picture. The variable length code of the I picture obtained by the variable length encoder 22 is input to the multiplexer 13.
[0036]
On the other hand, the inverse quantizer 15 and the inverse DCT 16 perform the inverse processing of the DCT 20 and the quantizer 21 to reproduce the locally decoded image. The obtained locally decoded image is given to the multiplier 17. The multiplier 17 multiplies the local decoded image by a value from 0 to 1 according to the control information from the specific image setting unit 18 and supplies the result to the adder 26.
[0037]
The adder 26 adds the two types of locally decoded images from the multipliers 25 and 17 to obtain a reference image for inter-picture prediction processing. The inter-picture predictor 27 forms an inter-picture prediction signal using this reference picture. This inter-picture prediction signal is supplied to the subtracter 3 and the adder 12, respectively.
[0038]
The specific image setting unit 18 sets a P picture for each predetermined period as a specific picture, and supplies the control information to the switch 19, multipliers 17 and 25, and the multiplexer 13. The multiplexer 13 multiplexes the information of the specific picture and the code string of each picture, and outputs them from the code string output terminal 14. The switch 19 is turned on only for the above-mentioned specific picture, and is turned off for the other non-specific pictures.
[0039]
Next, the addition of the two types of locally decoded images in the adder 26 will be described. First, since a non-specific picture has no I picture, the multiplier 25 multiplies by 1 and the multiplier 17 multiplies by 0. That is, the processing does not differ from general P picture encoding. In addition, since a B picture never becomes a reference image, the addition is irrelevant to it in the first place.
[0040]
On the other hand, in the specific picture, both the multiplier 25 and the multiplier 17 multiply the input locally decoded image by the coefficient 0.5 so that the locally decoded image of the P picture and the locally decoded image of the I picture are added. If the noise component contained in each image were white noise, the addition would improve the S/N by 3 dB. Although the P picture and the I picture are processed by different methods, their noise components have points in common, such as the coarse quantization of high-frequency components, so the noise components are correlated and the full 3 dB improvement is not obtained. However, since they are not identical, some improvement can be expected. Even assuming half of that, 1.5 dB, obtaining the same improvement through the code amount would require increasing the code amount by about 30%.
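The coefficient selection for the multipliers 25 and 17 and the addition in the adder 26 can be sketched as below. The function names are hypothetical, images are represented as flat lists of sample values, and a missing I-picture path is treated as a zero input; these are assumptions made for illustration.

```python
def reference_weights(is_specific_picture):
    """Coefficients for multiplier 25 (P path) and multiplier 17 (I path)."""
    return (0.5, 0.5) if is_specific_picture else (1.0, 0.0)


def make_reference(p_decoded, i_decoded, is_specific_picture):
    """Role of adder 26: weighted sum of the two locally decoded images."""
    w_p, w_i = reference_weights(is_specific_picture)
    if i_decoded is None:
        # Non-specific picture: no I-picture path exists, so treat it as zero.
        i_decoded = [0.0] * len(p_decoded)
    return [w_p * p + w_i * i for p, i in zip(p_decoded, i_decoded)]


print(make_reference([100, 120], [104, 116], True))   # specific picture
print(make_reference([100, 120], None, False))        # ordinary P picture
```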
[0041]
Generally, the quantization step widths of the quantizers 5 and 21 are set so that the I picture has a higher reproduced image quality than the P picture. This is because raising the quality of the I picture, which is the basis of the reference images of all pictures in the GOP, contributes to improving the image quality of the entire GOP. On the other hand, if the S/N differs between the reproduced image of the P picture and that of the I picture, the addition is not very effective. Therefore, if the code amount of the I picture is reduced to some extent, its S/N becomes equivalent to that of the P picture and the maximum effect is obtained.
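The dependence of the addition gain on noise correlation and on the balance between the two S/N values can be checked numerically. The sketch below assumes a simple additive noise model in which each locally decoded image equals the original plus zero-mean noise; the function name, the correlation coefficient `rho`, and the choice to measure the gain against the better of the two inputs are assumptions for illustration and do not appear in the specification.

```python
import math


def averaging_gain_db(sigma_p, sigma_i, rho=0.0):
    """S/N gain (dB) of the 0.5/0.5 sum over the better of the two inputs.

    sigma_p, sigma_i: noise standard deviations of the P and I decoded images
    rho: correlation coefficient between the two noise components
    """
    var_sum = (sigma_p**2 + sigma_i**2 + 2 * rho * sigma_p * sigma_i) / 4
    best_input_var = min(sigma_p, sigma_i) ** 2
    return 10 * math.log10(best_input_var / var_sum)


print(averaging_gain_db(1.0, 1.0, rho=0.0))   # equal, uncorrelated noise
print(averaging_gain_db(1.0, 1.0, rho=0.4))   # equal, partly correlated noise
print(averaging_gain_db(1.0, 0.5, rho=0.0))   # I picture much cleaner than P
```

Under this model the gain is largest when the two noise levels are equal and uncorrelated, and it can even become negative when one input is much cleaner than the other, which is consistent with the reasoning for lowering the I picture code amount.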
[0042]
In the present invention, since a P picture is added to the normal GOP configuration, the code amount increases accordingly. However, even if the code amount of the I picture is reduced and its S/N is lowered, as long as the S/N of the reference image is maintained by adding the I picture and the P picture, the reproduced image and the code amount are equivalent to those of a normal GOP.
[0043]
Here, the generated code amount will be compared with those of a normal GOP and a closed GOP. Assume that the average code amount is 1000 kbit for an I picture, 300 kbit for a P picture, and 100 kbit for a B picture. For a normal GOP with 30 frames per second, a P (I) picture period of 3 frames, and a GOP length of 15 frames, the average number of pictures of each type per second gives an average transfer rate of 6.4 Mbps.
[0044]
On the other hand, in the case of a closed GOP, the GOP size differs from that of the normal GOP. With a GOP length of 13 frames the average transfer rate is 6.92 Mbps, and with a GOP length of 16 frames it is 6.56 Mbps, so in both cases the average transfer rate is higher than that of the normal GOP. In addition, accessibility is slightly improved when the GOP length is 13 frames but is lowered when the GOP length is 16 frames. If a code amount corresponding to 15 frames is derived from the two, it is 6.68 Mbps, which is a 4.4% increase in code amount with respect to a normal GOP.
[0045]
In the case of this embodiment, if the average code amount of an I picture is the same as that of a normal GOP or closed GOP, the average transfer rate is 7.0 Mbps, but if it is reduced by 30% to 700 kbit, the average transfer rate is 6.4 Mbps, the same as for a normal GOP. In effect, the code amount of the I picture of a normal GOP is divided between an I picture and a P picture.
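The transfer rates quoted in this comparison can be reproduced with a short calculation. The per-GOP picture counts used below (for example 1 I, 4 P, and 8 B pictures for the 13-frame closed GOP) are not stated explicitly in the text; they are inferred so that the results match the quoted figures, and the linear interpolation to a 15-frame equivalent is likewise an assumption. The function and variable names are hypothetical.

```python
FPS = 30
I_KBIT, P_KBIT, B_KBIT = 1000, 300, 100   # average code amounts from the text


def rate_mbps(n_i, n_p, n_b, gop_frames, i_kbit=I_KBIT):
    """Average transfer rate for a GOP with the given picture counts.

    gop_frames is the display length of the GOP in frames.
    """
    kbit_per_gop = n_i * i_kbit + n_p * P_KBIT + n_b * B_KBIT
    return kbit_per_gop * FPS / gop_frames / 1000


normal = rate_mbps(1, 4, 10, 15)            # normal GOP, 15 frames
closed13 = rate_mbps(1, 4, 8, 13)           # closed GOP, 13 frames
closed16 = rate_mbps(1, 5, 10, 16)          # closed GOP, 16 frames
# Assumed linear interpolation between 13 and 16 frames to 15 frames
closed15 = closed16 + (closed13 - closed16) * (16 - 15) / (16 - 13)
embodiment_full = rate_mbps(1, 5, 10, 15)                  # I picture at 1000 kbit
embodiment_reduced = rate_mbps(1, 5, 10, 15, i_kbit=700)   # I picture at 700 kbit

for name, value in [("normal", normal), ("closed 13", closed13),
                    ("closed 16", closed16), ("closed 15 equiv.", closed15),
                    ("embodiment", embodiment_full),
                    ("embodiment, reduced I", embodiment_reduced)]:
    print(f"{name}: {value:.2f} Mbps")
```

The embodiment carries one more P picture per 15 frames than a normal GOP, so the 300 kbit removed from the I picture exactly offsets the added P picture.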
[0046]
Next, the moving image code string will be described. In forming a code string encoded by the encoding apparatus shown in FIG. 1, the P picture code string of a specific picture is the last of the GOP (image group), and the I picture is the first of the GOP. Therefore, in a specific picture, a code string is arranged in the order of P picture and I picture, and when viewed in one GOP, it starts with I picture and ends with P picture. The GOP configuration of this embodiment is shown in FIG.
[0047]
On the other hand, since the order of the B pictures and the P (I) pictures is reversed in the code string, the last picture in the code string is not the P picture but the B picture preceding it. That is, as shown in FIG. 8C, the code string of the GOP (image group) to be formed starts with the I picture code string I1, obtained by intra-picture independent coding of a specific picture, and ends with the bidirectionally predictive-coded B picture code string B15 immediately before the next specific picture.
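The code string order of one GOP can be generated with a small sketch. The labels I1 and B15 follow the text; the remaining labels (for example P16 for the closing P picture of the next specific picture), the function name, and the parameters are assumptions made for illustration.

```python
def gop_code_string(first=1, gop_frames=15, ref_period=3):
    """Code string order of one GOP of this embodiment.

    Pictures are numbered in display order starting at `first`, which is the
    specific picture coded as an I picture. The next specific picture
    (first + gop_frames) appears here as the closing P picture.
    """
    order = [f"I{first}"]
    for ref in range(first + ref_period, first + gop_frames + 1, ref_period):
        order.append(f"P{ref}")
        # B pictures lying between the previous reference picture and this one
        order.extend(f"B{n}" for n in range(ref - ref_period + 1, ref))
    return order


code_string = gop_code_string()
print(code_string)
print(len(code_string), code_string[0], code_string[-1])
```

With the default parameters the list holds 16 coded pictures, starting with I1 and ending with B15, even though the GOP spans 15 frames of display time.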
[0048]
If the overlap of the specific frame is ignored and a single GOP is considered, this GOP configuration is similar to a closed GOP. If either the I picture or the P picture of a specific frame is deleted, the GOP configuration changes depending on which one is deleted, but the picture sequence is the same as that of a normal GOP. That is, the GOP according to the present embodiment can have the characteristics of both a closed GOP and a normal GOP.
[0049]
The moving image code string can be edited in units of GOPs, as in the case of a closed GOP. This is shown in FIG. As shown in the drawing, one GOP of the code string B shown in the third line is inserted between two GOPs of the code string A shown in the first line, and the edited code string shown in the second line is obtained. Here, each GOP has overlapping pictures at its beginning and end, so the processing differs from that of a conventional editing apparatus.
[0050]
First, regarding the image length, the editing time is calculated without including the last P picture of the GOP in the GOP length (time). Therefore, even if the code string of the GOP configuration of this embodiment contains 16 frames, it is regarded as 15 frames.
[0051]
Next, regarding reproduction control of the specific picture: when editing is performed in units of GOPs, both the P picture of the preceding GOP and the I picture of the following GOP can be decoded and reproduced as the specific picture at the edit point. Because editing has been performed, however, their image contents differ. As a way of making positive use of this overlap of code strings, if control information indicating which image is to be output at the time of reproduction is inserted, the edit point can be moved forward or backward by one picture with the same code string.
[0052]
Further, with a conventional closed GOP the processing in the decoding apparatus does not change, so it is not necessary to set the flag (Broken Link) indicating that editing has been performed. With this method, however, the decoding processing must be switched, so the Broken Link flag must be set.
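The GOP-unit editing described above (insertion of a GOP, the 15-frame editing length, and the Broken Link flag) can be sketched as follows. The `Gop` class, the function `insert_gop`, and in particular the choice of which GOPs receive the flag (here, each GOP that directly follows an edit point) are assumptions for illustration; the text only states that the flag must be set.

```python
from dataclasses import dataclass, replace


@dataclass
class Gop:
    source: str                  # which code string the GOP came from
    coded_pictures: int = 16     # pictures in the code string, overlap included
    broken_link: bool = False

    @property
    def edit_length(self):
        # The closing P picture overlaps the next GOP's I picture, so it is
        # not counted in the GOP length used for the editing time.
        return self.coded_pictures - 1


def insert_gop(stream, index, gop):
    """Insert `gop` before stream[index]; flag the GOPs that follow an edit point."""
    edited = stream[:index] + [replace(gop, broken_link=True)] + stream[index:]
    if index + 1 < len(edited):
        edited[index + 1] = replace(edited[index + 1], broken_link=True)
    return edited


stream_a = [Gop("A"), Gop("A")]
edited = insert_gop(stream_a, 1, Gop("B"))
print([(g.source, g.broken_link) for g in edited])
print(sum(g.edit_length for g in edited), "frames")
```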
[0053]
Next, examples of the moving picture decoding apparatus will be described. FIG. 3 is a block diagram illustrating an example of a moving picture decoding apparatus. It shows the configuration of a decoding apparatus corresponding to the embodiment of the moving picture encoding apparatus of the present invention shown in FIG. 1, for the case of a code string that has not been edited and in which image continuity is maintained.
[0054]
In FIG. 3, a code string coming from a code string input terminal 31 is separated into a code string of I picture and other code strings by a demultiplexer 32 based on the header of the picture. The code sequence of P picture and B picture is supplied to the variable length decoder 33, and the code sequence of I picture is supplied to the variable length decoder 34.
[0055]
The code string of the P (B) picture is returned to a fixed length code by the variable length decoder 33, and the fixed length code of the prediction residual is supplied to the inverse quantizer 35. The inverse quantizer 35 inversely quantizes the input fixed length code according to the quantization parameter to obtain reproduced DCT coefficient values of the prediction residual, and supplies them to the inverse DCT 36.
[0056]
The inverse DCT 36 converts 8 × 8 coefficients into a decoded prediction residual signal and supplies it to the adder 37. The adder 37 adds the prediction signal given from the inter-picture predictor 45 to the decoded prediction residual signal to obtain a decoded image signal. The decoded image signal of the P (B) picture obtained in this way is supplied to the multiplier 42.
[0057]
On the other hand, the I picture code string separated by the demultiplexer 32 is decoded by the variable length decoder 34, inversely quantized by the inverse quantizer 38, and transformed by the inverse DCT 39 into a reproduced image signal, which is input to the multiplier 41. The operations of the variable length decoder 34, the inverse quantizer 38, and the inverse DCT 39 are the same as those of the variable length decoder 33, the inverse quantizer 35, and the inverse DCT 36, except that the parameters are those for the I picture.
[0058]
The demultiplexer 32 detects the picture ID from the picture header in the input code string, and the result information is supplied to the specific image controller 40. The specific image controller 40 detects the specific picture and supplies the control information to the multipliers 41 and 42, respectively. The multiplier 41 multiplies the reproduced image signal from the inverse DCT 39 by a value from 0 to 1 in accordance with the control information described above, and gives the result to the adder 43. On the other hand, the multiplier 42 multiplies the decoded image signal of the P (B) picture from the adder 37 by a value from 0 to 1 according to the control information described above, and supplies the result to the adder 43.
[0059]
The adder 43 adds the two types of decoded image signals output from the multipliers 41 and 42 to obtain a reproduced image signal. The addition by the adder 43 is effective only for a specific picture, in which case both the multiplier 41 and the multiplier 42 multiply by the coefficient 0.5. Otherwise, the multiplier 42 multiplies the decoded image signal by the coefficient 1 and the multiplier 41 multiplies the decoded image signal by the coefficient 0, so the decoded image signal of the P (B) picture is output as it is.
[0060]
In a specific picture, the addition in the adder 43 improves the S/N of the reproduced image signal relative to the individual decoded image signals output from the multipliers 41 and 42. A state of such decoding is shown in FIG.
[0061]
For a B picture, the reproduced image signal output from the adder 43 is output as it is from the reproduced image output terminal 47 via the switch 46. For a P (I) picture, on the other hand, the reproduced image signal output from the adder 43 is temporarily stored in the image memory 44, where it is delayed and used as the reference image for inter-picture prediction processing. The reference image is converted into a prediction signal and input to the adder 37. The switch 46 selects between the B picture and the P (I) picture delayed in the image memory 44 and outputs the selected picture to the output terminal 47.
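The output selection by the switch 46 restores display order from coding order. A minimal sketch, assuming a single-picture image memory and hypothetical picture labels, is given below; it ignores the merging of the overlapping P and I pictures of a specific picture, which the adder 43 handles before this stage.

```python
def to_display_order(coding_order):
    """B pictures pass straight through; each reference (I/P) picture is held
    in the image memory and output when the next reference picture arrives."""
    display = []
    held = None  # reference picture currently delayed in the image memory
    for label in coding_order:
        if label.startswith("B"):
            display.append(label)
        else:
            if held is not None:
                display.append(held)   # release the delayed reference picture
            held = label
    if held is not None:
        display.append(held)           # flush the last reference picture
    return display


print(to_display_order(["I1", "P4", "B2", "B3", "P7", "B5", "B6"]))
```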
[0062]
Next, another example of the moving picture decoding apparatus will be described. FIG. 5 shows a block diagram of another example of a moving picture decoding apparatus corresponding to the moving picture encoding apparatus of FIG. 1. In FIG. 5, the same components as those in FIG. 3 are denoted by the same reference numerals, and description thereof is omitted. The decoding apparatus of FIG. 5 performs decoding when the code string has been edited and image continuity is not maintained.
[0063]
The decoded picture signal of the B picture output from the adder 37 is output as it is from the reproduced picture output terminal 47 via the switch 51. On the other hand, the decoded picture signal of the P picture output from the adder 37 is temporarily held in the picture memory 48. Also, the decoded picture signal of the I picture output from the inverse DCT 39 is temporarily held in the picture memory 49.
[0064]
Here, since the code string has been edited in GOP units, the P picture and the I picture of the specific picture are formally the same picture, but the P picture belongs to the preceding GOP and the I picture belongs to the following GOP. Therefore, the switch 50 selects the appropriate one of the two decoded image signals from the image memories 48 and 49 as the reference image, as follows.
[0065]
Following the decoding of the specific picture, the B picture of the previous GOP is decoded by selecting the decoded picture of the P picture held in the picture memory 48. Subsequently, in decoding of the P picture and B picture of the next GOP, a decoded image of the I picture held in the image memory 49 is selected. The state of decoding in this case is shown in FIG. In the figure, the arrows indicate the inter-image prediction relationship.
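The selection rule of the switch 50 can be expressed compactly. The function name and the way a picture is tagged as belonging to the preceding GOP are hypothetical; the sketch only encodes the rule stated above for the reference image derived from the specific picture.

```python
def choose_reference_memory(picture_type, in_previous_gop):
    """Return the image memory that switch 50 connects as the reference source.

    48 holds the decoded P picture (end of the preceding GOP);
    49 holds the decoded I picture (start of the following GOP).
    """
    if picture_type == "B" and in_previous_gop:
        return 48
    return 49


# (picture type, belongs to the preceding GOP?) for pictures around an edit point
pictures = [("B", True), ("B", True), ("P", False), ("B", False)]
print([choose_reference_memory(t, prev) for t, prev in pictures])
```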
[0066]
In decoding by the decoding apparatus shown in FIG. 5, the reference image for inter-picture prediction differs slightly from the reference image used in the encoding apparatus, but both are decoded images of the same picture, and apart from the quantization noise component the original image is common to both. The change in the reference image affects only the 2 pictures immediately before the edit point and the 1 GOP after the edit point. Moreover, since prediction residual components are added successively after the edit point, the influence of the reference image change gradually decreases. In terms of visual characteristics, degradation is said to be difficult to notice when the scene changes at an edit, and the detection capability is said to be greatly reduced for about 0.1 seconds immediately after the change. Therefore, the visual impact of the degradation is very small.
[0067]
The reproduced image output is selected by the switch 51. The operation of the switch 51 is the same as that of the switch 46 in FIG. 3 except for the specific picture. In a specific frame, either the I picture or the P picture can be output. Which one is selected may be determined in advance, but if control information has been inserted into the code string by the code string editing apparatus, the selection is controlled accordingly.
[0068]
In the above embodiment, the moving picture encoding apparatus and method have been described, but the present invention is not limited to this. Some of the pictures encoded by unidirectional prediction may be set as specific pictures, each specific picture having a code string encoded independently within the picture in addition to its unidirectionally predictive-coded code string; a code string that starts with the intra-picture independently coded code string of a specific picture and ends with the bidirectionally predictive-coded code string of the picture immediately before the next specific picture may be defined as one code string group; and the code string composed in units of such code string groups may be transmitted via a desired transmission path.
[0069]
[Effects of the Invention]
As described above, according to the present invention, a specific picture has a picture code string that is independently encoded within the picture in addition to a picture code string that is unidirectionally predictive-coded. Although these two types of picture code strings overlap, the S/N of the reproduced image is improved by adding the locally decoded images of both, so the quality of the reproduced image can be improved. In addition, since the inter-picture prediction efficiency is improved by using that reproduced image as the reference image for inter-picture prediction, the total code amount can be reduced by the amount added for the specific picture.
[0070]
Further, according to the present invention, the code string group (GOP) boundary is placed between the unidirectionally predictive-coded picture and the intra-picture independently coded picture of the specific picture. At a discontinuity produced when the code string is edited in GOP units, the decoding apparatus can therefore perform inter-picture prediction of other pictures using the unidirectionally predictive-coded picture as the reference picture at the end of a GOP and the intra-picture independently coded picture as the reference picture at the start of a GOP. Accordingly, it is possible to generate a code string that can be edited in units of GOPs in the same manner as a closed GOP, while maintaining the periodicity of the unidirectionally predictive-coded pictures and a coding efficiency equivalent to that of a normal GOP.
[Brief description of the drawings]
FIG. 1 is a block diagram of an embodiment of a moving image encoding apparatus of the present invention.
FIG. 2 is a diagram illustrating an example of how a moving image code string is edited.
FIG. 3 is a block diagram illustrating an example of a moving picture decoding apparatus.
FIG. 4 is a diagram illustrating each example of decoding in units of pictures.
FIG. 5 is a block diagram of another example of the video decoding device.
FIG. 6 is a diagram illustrating each example of a GOP configuration.
FIG. 7 is a block diagram of an example of a conventional video encoding device.
FIG. 8 is a diagram illustrating each example of a code string configuration of a GOP (image group).
[Explanation of symbols]
1 Image input terminal
2 Frame delay unit
3 Subtractor
4, 20 DCT
5, 21 Quantizer
6, 22 Variable length encoder
10, 15, 35, 38 Inverse quantizer
11, 16, 36, 39 Inverse DCT
12, 26, 37, 43 Adder
13 Multiplexer
14 Code string output terminal
17, 25, 41, 42 Multiplier
18 Specific image setting unit
19, 46 Switch
27, 45 Inter-picture predictor
31 Code string input terminal
32 Demultiplexer
33, 34 Variable length decoder
40 Specific image controller
47 Playback image output terminal

Claims

A moving picture coding apparatus that encodes each picture of a moving picture by three types of coding methods, namely intra-picture independent coding, unidirectional prediction, and bidirectional prediction, the apparatus comprising:
first encoding and local decoding means for encoding an input image signal by the unidirectional prediction or the bidirectional prediction and locally decoding it to obtain a locally decoded image of unidirectional predictive coding;
second encoding and local decoding means for setting some of the pictures to be encoded by the unidirectional prediction as specific pictures, encoding each specific picture by the intra-picture independent coding in addition to the unidirectional predictive coding, and locally decoding it to obtain a locally decoded image of intra-picture independent coding; and
inter-picture prediction means for using, for the specific picture, a signal obtained by adding the locally decoded image of the unidirectional predictive coding and the locally decoded image of the intra-picture independent coding as the reference picture for inter-picture prediction processing of other pictures.

A moving picture coding method for coding each picture of a moving picture by three types of coding methods, namely intra-picture independent coding, unidirectional prediction, and bidirectional prediction, the method comprising:
a first step of encoding an input image signal by the unidirectional prediction and locally decoding it to obtain a locally decoded image of unidirectional predictive coding;
a second step of setting some of the pictures to be encoded by the unidirectional prediction as specific pictures, encoding each specific picture by the intra-picture independent coding in addition to the unidirectional predictive coding, and locally decoding it to obtain a locally decoded image of intra-picture independent coding; and
a third step of using, for the specific picture, a signal obtained by adding the locally decoded image of the unidirectional predictive coding and the locally decoded image of the intra-picture independent coding as the reference picture for inter-picture prediction processing of other pictures.