JP3599942B2

JP3599942B2 - Moving picture coding method and moving picture coding apparatus

Info

Publication number: JP3599942B2
Application number: JP4730797A
Authority: JP
Inventors: 吉宏堀; 真裕美丹羽
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1997-02-13
Filing date: 1997-02-13
Publication date: 2004-12-08
Anticipated expiration: 2017-02-13
Also published as: JPH10229563A

Description

【０００１】
【発明の属する技術分野】
本発明は、フレーム間予測符号化方式を利用してテレビジョン信号等の動画像映像信号を圧縮符号化する動画符号化方法と装置に関する。特に、場面切換（シーンチェンジ）時に、フレーム間予測符号化方式に於いて参照されることのない非参照フレームの符号量が増大する等の悪影響を防止することで画質の安定化を達成する動画像符号化方法と装置に関する。
【０００２】
【従来の技術】
動画像信号を圧縮符号化する方式として、ＭＰＥＧ方式が、ＭＰＥＧ１−Ｖｉｄｅｏ（ＩＳＯ／ＩＥＣ・１１１７２−２）、及びＭＰＥＧ２−Ｖｉｄｅｏ（ＩＳＯ／ＩＥＣ・１３８１８−２）として規格化され、記録分野ばかりでなく、ディジタル放送やディジタル伝送等の放送・通信分野でも広く用いられている。
【０００３】
ＭＰＥＧ方式では、複数のフレーム単位（１フィールド，１フレーム等）の集合であるＧＯＰ（ＧｒｏｕｐｏｆＰｉｃｔｕｒｅ）が、フレームシーケンス中に構成される。ＧＯＰ中では、フレーム単位内圧縮により少なくとも１枚のフレームが圧縮符号化され、そのフレーム以外は、他のフレームを参照して圧縮符号化される（フレーム単位間圧縮＝フレーム間予測符号化）される。フレーム単位内圧縮により圧縮符号化される画像はＩピクチャであり、フレーム単位間圧縮を利用して圧縮符号化される画像はＰピクチャとＢピクチャである。
【０００４】
フレーム単位内圧縮では、ＤＣＴ（ＤｉｓｃｒｅｔｅＣｏｓｉｎｅＴｒａｎｓｆｏｒｍ）、量子化、可変長符号化（ＶＬＣ）という技法が採用されている。
まず、フレーム内の画像データがマクロブロックと呼ばれる１６×１６画素の小領域に分割され、この小領域中の８×８画素ブロック毎にＤＣＴ変換が施されて、周波数領域が求められる。
【０００５】
人間の視覚特性は高周波に対して鈍感なため、上述の周波数領域中の高周波領域への符号割当を少なくすることにより、人間の視覚的には大きな劣化を感じさせることなくデータ量を削減することができる。この割当は、量子化によって行われる。量子化は、８×８画素のブロックのＤＣＴ変換により得られた８×８の周波数領域データ（係数行列）を、あるマトリックスで除算することにより実現される。なお、復号器側では、可変長復号して得たデータに対して上記マトリックスを乗算（逆量子化）することにより、元の８×８の周波数領域データ（係数行列）を得ることが可能になる。即ち、逆ＤＣＴ演算を行うべきマトリックスを得ることが可能になる。
【０００６】
上記に於いて、高次のデータ、つまり高周波領域のデータに対する除算値を大きく設定することにより、高周波領域の符号化精度を下げることができる。つまり、高周波領域に相当するデータ量を大きく削減することができる。こうして得られた量子化データは、ゼロ値の多いものとなる。このため、このデータは、ゼロ値の数と、それに続く実値という組で可変長符号化テーブルが形成される。可変長符号化とは、エントロピー符号化の一種であり、出現頻度の高いデータに対しては短い符号を割当て、出現頻度の低いデータに対しては長い符号を割当てることにより、効率の高い圧縮を行う方式である。
【０００７】
フレーム単位間圧縮では、上述のＤＣＴ、量子化、可変長符号化という技法に加えて、動きベクトルを用いた動き補償という技法が採用されている。
まず、フレーム内の画像データが、フレーム単位内圧縮の場合と同様にマクロブロックと呼ばれる１６×１６画素の小領域に分割される。フレーム単位間圧縮では、符号化対象の画像データ（現画像データ）ばかりでなく、現画像データが参照するべき他フレームの画像データ（参照画像データ）が必要となる。参照画像データとしては、現画像データから時間的に近い位置に存在するフレームが用いられる。各マクロブロック毎に、参照フレームの画像データ中から、現在の符号化対象のマクロブロック（現マクロブロック）とデータ的に近い領域（マクロブロック）が検索される。「データ的に近い（近似する）領域」とは、通常、画素データの輝度のみを比較して、その自乗誤差平均が最も少ない値を示す領域を言う。ＭＰＥＧでは、この検索は半画素単位の精度で行われる。また、検索範囲は、現フレーム内の現マクロブロックの位置に対応する参照フレーム内の位置から所定範囲の領域である。この検索を「動きベクトル」の検出という。また、フレーム間予測符号化に於いて「動きベクトル」により参照マクロブロックと現マクロブロックとの位置ズレを補正することを「動き補償」という。この「動きベクトル」の値も可変長符号化される。
【０００８】
参照する画像データが存在する場合には、復号器側に於いて上記「データ的に近い」領域を切り出すことが可能である。しかし、切り出される参照マクロブロックの画像データと、符号化対象の現マクロブロックの画像データとは、完全同一ではない。このため、この差を補償してやる必要がある。ＭＰＥＧでは、この差を補償するために、各画素毎（輝度、色差）に差分を取って差分マクロブロックを構成し、その中の８×８画素の各領域毎に、フレーム単位内圧縮の場合と同様にＤＣＴ演算と量子化を行う。ただし、この場合の量子化は、対象が差分データであるため、通常、全体的に精度を落として行われる。このため、差分データに対しては、通常、高周波領域と低周波領域を区別しない量子化テーブルがデフォルトとして用意されている。勿論、この量子化テーブルをエンコーダ側で自由に変更することは可能である。
【０００９】
こうして得られた圧縮符号が逆量子化及び逆ＤＣＴされる。参照されるフレームの画像データがデコーダ内部に存在している場合には、上述の逆ＤＣＴ後の値を、その動き量から指定されるフレーム領域と加算することにより、当該マクロブロックの復号が行われることになる。
【００１０】
ＭＰＥＧでは、フレーム単位間圧縮を利用して圧縮符号化されるフレームは２種類存在する。過去のフレームからのみ参照が行われるフレームをＰピクチャと呼び、過去及び未来の双方から参照が行われるフレームをＢピクチャと呼ぶ。Ｐピクチャは、時間的に（＝表示順で）過去に位置するＩ又はＰピクチャから予測される。Ｂピクチャは時間的に（＝表示順で）過去及び／又は未来に位置するＩ又はＰピクチャから予測される。つまり、Ｂピクチャは、動き予測のために参照されない非参照フレーム（＝差分データの算出用として参照されない非参照フレーム）であり、Ｉ及びＰピクチャは、動き予測のために参照され得る参照フレーム（＝差分データの算出用として参照され得るフレーム）である。
【００１１】
これらの圧縮手法を組合わせて圧縮された差分圧縮符号の符号量は、もとの画像データのデータ量と比較して、数分の一〜数百分の一となり、しかも、比較的高い画質を保持している。しかし、他のフレームを参照するという方法を用いているため、復号器側に於いて参照フレームの画像データを保持するための構成が必要になる。また、復号器側では、参照フレームの画像データとして、復号された画像データを用いることになるため、エンコーダ側では、このことを考慮して符号化を行なわなくてはならない。つまり、参照されるフレームであるＩピクチャとＰピクチャを符号化した後、復号器と同じ条件で復号し、その復号後の参照フレームの画像データを保持する必要がある。そして、この復号した画像データを参照して、ＰピクチャとＢピクチャを符号化しなくてはならない。
【００１２】
図４に、ＭＰＥＧ動画像のフレームシーケンス例を示す。図内（Ａ）は、各フレームの表示順である。ＭＰＥＧで最も一般的に用いられるフレーム種類の順を示しているが、勿論これに限られるものではない。矢印は、終点のフレームを符号化／復号化する際に、始点のフレームを参照することを示す。例えば、Ｂピクチャ５１，５２は、Ｉピクチャ５３を参照して符号化／復号化されるフレームであるため、Ｂピクチャ５１，５２の符号化時／復号化時には、既に、Ｉピクチャ５３が復号されて画像メモリに保持されている必要がある。同様に、Ｂピクチャ５４，５５は、Ｉピクチャ５３とＰピクチャ５６とを参照して符号化／復号化されるフレームであるため、Ｂピクチャ５４，５５の符号化時／復号化時には、既に、Ｉピクチャ５３とＰピクチャ５６とが復号されて画像メモリに保持されている必要がある。同様に、Ｐピクチャ５６は、Ｉピクチャ５３を参照して符号化／復号化されるフレームであるため、Ｐピクチャ５６の符号化時／復号化時には、既に、Ｉピクチャ５３が復号されて画像メモリに保持されている必要がある。
【００１３】
この理由から、上述の各フレームの記録順は、図内（Ｂ）のようになる。この記録順に復号を行い、ＩピクチャやＰピクチャの復号画像データを順次保持することにより、上述のＢピクチャやＰピクチャの符号化／復号化が可能になる。また、このような参照を行うため、Ｉピクチャは、それ自体で画質を維持する必要がある。また、Ｐピクチャも他のＰピクチャやＢピクチャの参照に利用されるために、高い画質を維持する必要がある。このため、通常、Ｉピクチャに最も多くの符号量を割当て、次にＰピクチャ、Ｂピクチャの順に符号量を割り当てることにより、全体としての画質を維持している。
【００１４】
【発明が解決しようとする課題】
しかしながら、従来技術において、データ量の割り当て量が最も少ないＢピクチャに於いて場面切換（シーンチェンジ）が発生すると、過去及び未来の双方向のフレームを参照して符号化を行う、いわゆる両方向予測符号化を行うことができなくなる。このため、Ｂピクチャに於いて高い圧縮率を維持することができなくなり、当該Ｂピクチャの符号量が増大する。
【００１５】
また、動画像符号化装置では、該装置から出力される符号量をモニタして量子化スケールを制御することにより全体の符号量を制御するレート制御を行っているため、上述のように、シーンチェンジの発生したＢピクチャの符号量が増大すると、他のフレームの量子化が粗く制御される結果、該他のフレームの画質が劣化するという問題も生ずる。
【００１６】
本発明は、シーンチェンジの発生したＢピクチャの画質を劣化させることなく該Ｂピクチャの符号量の増大を抑圧し、もって、他のフレームの画質の劣化をも防止することを目的とする。
【００１７】
【課題を解決するための手段】
請求項１の発明は、Ｉフレーム、ＰフレームおよびＢフレームを含む一連のフレーム動画像信号を入力信号として、Ｉフレームに対してはフレーム内符号化を行い、ＰフレームおよびＢフレームに対しては予め設定された参照先フレームとの差分データを符号化した符号化データに当該参照先フレームを指示する予測方向情報を結合して出力するフレーム間符号化を行う動画像符号化方法において、前記Ｂフレームでシーンチェンジが検出されたとき、当該Ｂフレームに対する前記差分データをゼロで置換し、且つ、当該Ｂフレームの予測方向情報を前記設定された参照先フレームを指示するものから当該Ｂフレームに対し表示順が近いＩまたはＰフレームを参照先フレームとして指示するものに置き換えて出力する、ことを特徴とする動画像符号化方法である。
【００１８】
請求項２の発明は、請求項１において、前記Ｂフレームでシーンチェンジが検出されたとき、当該Ｂフレームから前記置き換え後の参照フレームまでの間に存在するＢフレームについても、前記差分データをゼロで置換し、且つ、前記予測方向情報を前記設定された参照先フレームを指示するものから前記置き換え後の参照先フレームを指示するものに置き換えて出力する、ことを特徴とする動画像符号化方法である。
【００１９】
請求項３の発明は、Ｉフレーム、ＰフレームおよびＢフレームを含む一連のフレーム動画像信号が入力されるとともに、Ｉフレームに対してはフレーム内符号化を行い、ＰフレームおよびＢフレームに対しては予め設定された参照先フレームとの差分データを符号化した符号化データに当該参照先フレームを指示する予測方向情報を結合して出力するフレーム間符号化を行う動画像符号化装置において、前記一連のフレーム動画像信号を構成する各フレームのうちシーンチェンジが行われるフレームを検出するシーンチェンジ検出手段と、前記シーンチェンジ検出手段によってシーンチェンジが検出されたフレームが前記Ｂフレームであるとき、当該Ｂフレームに対する前記差分データをゼロで置換し、且つ、当該Ｂフレームの予測方向情報を前記設定された参照先フレームを指示するものから当該Ｂフレームに対し表示順が近いＩまたはＰフレームを参照先フレームとして指示するものに置換する置換手段と、を有することを特徴とする動画像符号化装置である。
【００２０】
請求項４の発明は、請求項３において、前記置換手段は、前記シーンチェンジ検出手段によってシーンチェンジが検出されたフレームが前記Ｂフレームであるとき、当該Ｂフレームから前記置換後の参照フレームまでの間に存在するＢフレームについても、前記差分データをゼロで置換し、且つ、前記予測方向情報を前記設定された参照先フレームを指示するものから前記置換後の参照先フレームを指示するものに置き換えて出力する、ことを特徴とする動画像符号化方法である。
なお、シーンチェンジの検出は、例えば、動画像の高域成分の変化、フレームのエッジ成分の移動、低域成分の変動等によって検出することができる。なお、シーンチェンジ検出の手法は、特開平６−１５３１４６号公報、特開平７−３８８４２号公報、特開平７−７９４３１号公報に開示されているように公知である。
【００２６】
【発明の実施の形態】
１．典型的なＭＰＥＧエンコーダ（図３）．
まず、図３を参照して、典型的なＭＰＥＧエンコーダを説明する。なお、ＭＰＥＧエンコーダを例にとって説明しているが、ＭＰＥＧ規格に準拠しない動画像符号化装置や方法であっても、本発明の構成を具備し得る。
【００２７】
１１は入力手段、１２はフレーム並換器、１３はラスタースキャンからマクロブロックへの走査変換プロセッサ、１４は減算器、１６はＤＣＴプロセッサ、１８は重み付けを行う量子化器、２０は可変長符号化器、２２はエンコーダからの読出バッファ、２３出力手段、２４はレート制御器、２６は逆量子化器、２８は逆ＤＣＴプロセッサ、３０は加算器、３２はフレームメモリ、３２ａと３２ｂはフレームメモリ３２のデータの読出バス、３４は動き補償プロセッサ、３６はモード判定器、３８は動き検出器である。
【００２８】
まず、入力手段１１からフレーム並換器１２へ、フレーム信号が、図４（Ａ）の順で入力される。フレーム並換器１２は、図４（Ａ）の順から同図（Ｂ）の順のようにフレーム順を入れ換える。即ち、Ｉピクチャとして圧縮符号化されるフレームが先頭とされ、以下、参照されるフレームが参照するフレームに対して先行するように、フレームの順が並び換えられる。
【００２９】
次に、走査変換・マクロブロック化プロセッサ１３へデータが入力される。通常、画像データは、ラスタースキャンフォーマットで記録されている。これは、画面の左上から１画素づつデータが出力されて１ラインが出力された後、次ラインのデータが出力される形式である。ＭＰＥＧでは、１６×１６画素のマクロブロックと呼ばれる小領域を単位としてデータが処理されるため、ラスタースキャンフォーマットをマクロブロックフォーマットに変換する必要がある。なお、ＭＰＥＧ１では、マクロブロックは輝度データ１６×１６画素、色差データ各８×８画素である。ＭＰＥＧ２では、色差の割合がフレームフォーマットによって変更されるが、輝度で１６×１６画素に相当する範囲がマクロブロックとして処理され、動きベクトルの検索もこれを単位として行われる。
【００３０】
以下、走査変換・マクロブロック化プロセッサ１３から出力されるデータの処理を、圧縮手法別に説明する。
【００３１】
１−１．Ｉピクチャの場合．
Ｉピクチャのデータは、通常、フレーム間予測符号化されず、動きベクトルを検索されない（但し、ＭＰＥＧでは、エラー復帰のために、例外的にＩピクチャへの動きベクトルの導入が許容されている）ため、そのままＤＣＴプロセッサ１６に入力される。該ＤＣＴプロセッサ１６で画像データの空間周波数への変換が行われた後、量子化器１８によって各周波数領域への重み付けが行われる。量子化後の係数データが可変長符号化器２０に入力されて可変長符号化される。
【００３２】
可変長符号化後の符号は、一定の転送レートでデータを出力することができない。このため、読出バッファ２２で転送時のレートへの緩衝が行われ、出力手段２３から圧縮符号列が一定の転送レートで出力される。
【００３３】
可変長符号化器２２から出力される符号量はレート制御器２４により監視され、画像のタイプ（Ｉ，Ｐ，Ｂピクチャ）毎に設定されている符号量との差異が大きい場合には、量子化スケールが変更される。量子化に際しては前述の量子化マトリックスに量子化スケールが乗算されるのであるが、この量子化スケールが、上述のようにレート制御器２４により制御される。量子化スケールが大きくされると量子化が粗く行われる結果、発生符号量は小さくなるが、画質は劣化する。量子化スケールが小さくされると量子化が細かく行われる結果、発生符号量は増大するが、画質は良好となる。
【００３４】
量子化器１８の出力は、逆量子化器２６へも入力される。これは、前述の参照用の画像データを復号するためである。つまり、逆量子化後、逆ＤＣＴが逆ＤＣＴプロセッサ２８で行われ、復号された画像データが画像メモリ３２に格納される。こうして、画像メモリには、参照用のＩピクチャが保持される。
【００３５】
１−２．Ｐピクチャの場合．
Ｐピクチャの符号化に際しては、各マクロブロック毎に動きベクトルが検索される。この検索は、画像メモリ３２に記憶されているＩピクチャもしくはＰピクチャを参照して行われる。但し、全てのマクロブロックについて動きベクトルを検索するわけではない。例えば動きの速いフレームの場合、左に大きくパンするフレーム等では、右端側の領域は当該フレームに於いて初めて出現する領域であるため、かかる領域に関しては、参照フレーム内から似た領域を検索できない。このため、かかる場合には、Ｉピクチャのマクロブロックの場合と同様に、動きベクトルを検索しないモードが適用される。他に、背景処理を簡単に行うため、同位置のブロックデータをそのまま切り出すモードがある。何れのモードが当該マクロブロックの符号化に有効であるかは、モード判定器３６で判定される。
【００３６】
モード判定器３６が動き補償を行う旨の判断を出すと、その命令が動き補償器３４に入力される。また、動き検出器３８によって得られた動きベクトルも動き補償器３４に入力される。動き補償器３４はこれらのデータに基づいて、対応する領域のデータを画像メモリ３２から読み出して減算器１４へ出力させる。減算器１４では現マクロブロックとの差分が取られて、該差分データがＤＣＴ１６及び量子化器１８にて前述のＩピクチャの場合と同様に符号化される。Ｐピクチャの場合も、時間的に未来に位置する他のＰピクチャや、過去及び未来に位置するＢピクチャによって参照されるため、画像メモリ３２に記憶される必要がある。このため、逆量子化器２６と逆ＤＣＴ２８とにより処理されるが、そのままでは、差分データであって、もとの画像データではない。このため、当該マクロブロックの符号化に用いた動きベクトルを再度用いて、画像メモリ３２から加算器３０へ対応領域の画像データを出力させ、上記復号差分データに加算することにより、もとの画像データが復号される。この画像データが画像メモリ３２に格納されて、上述の参照に供される。
【００３７】
１−３．Ｂピクチャの場合．
Ｂピクチャの場合の符号化手法はＰピクチャの場合と基本的に同じである。但し、Ｂピクチャの場合は、過去及び未来のＩピクチャやＰピクチャを参照して圧縮符号化されるため、動きベクトルを、Ｐピクチャの場合の倍、検索する必要がある。また、Ｂピクチャは参照用には用いられないため、復号画像データを画像メモリ３２に格納する必要はない。Ｂピクチャの場合、マクロブロックの符号化のモードは、過去及び未来からの予測、過去からの予測、未来からの予測、単位内符号化、の四種類から選択される。
こうして、ＭＰＥＧ方式の圧縮画像符号のビットストリームが生成される。
【００３８】
２．第１実施例（図１）．
本発明の一実施の形態の例を図１に示す。図１中、図３と同一の機能を奏するブロックについては同一の符号を付して、説明を省略する。
【００３９】
１０２はシーンチェンジ検出器、１０３は置換判定器、１０４は切換器、１０６は切換器、１０８は切換器である。切換器１０４，１０６，１０８は、何れも、置換判定器１０３からの制御信号によって切り換えられる。
【００４０】
シーンチェンジ検出手段１０２は、入力手段１１から入力された画像データに基づいて場面切換の有無を判定する。例えば、画像データの高域成分の変化、フレームのエッジ成分の移動、低域成分の変動などによって、場面切換（シーンチェンジ）の有無を判定することができる。本発明では、シーンチェンジの検出技法は限定されない。また、現フレームのシーンチェンジの判定の基準となるフレームとしては、時間軸上で、現フレームの直前に位置するフレームの他、現フレームが符号化に際して参照するフレームを用いることもできる。
【００４１】
置換判定器１０３は、シーンチェンジ検出器１０２によりシーンチェンジが検出された場合、そのシーンチェンジが、Ｂピクチャで行われたか、又は、ＩピクチャやＰピクチャで行われたかを判定する。換言すれば、前記差分データの算出用として参照されない非参照フレームであるＢピクチャで行われたか、又は、差分データの算出用として参照され得るフレームであるＩピクチャやＰピクチャで行われたかを判定する。
【００４２】
その結果、Ｉピクチャ又はＰピクチャでシーンチェンジが検出されたと判定された場合には、置換判定器１０３は、切換器１０４，１０６，１０８に対して通常の位置（端子ａ）を維持するように命令する。この場合は、前述の図１と同様の動作が行われる。一方、Ｂピクチャでシーンチェンジが検出されたと判定された場合、置換判定器１０３は、当該Ｂピクチャの各マクロブロックについて、切換器１０４，１０６，１０８に対して通常の位置（端子ａ）からの切換を命令する。
【００４３】
これにより、切換器１０４は、当該Ｂピクチャの各マクロブロックについて、減算器１４からの入力（差分データの入力）に換えて、ゼロデータをＤＣＴプロセッサ１６に入力させるように、端子ｂ側へ切り換えられる。このため、量子化器１８から出力される差分圧縮符号はゼロデータとなる。
【００４４】
また、切換器１０６は、当該Ｂピクチャの各マクロブロックについて、検出された動きベクトルに換えて、動きが無いことを示すゼロデータを可変長符号化器２０へ送るように、端子ｂ側へ切り換えられる。これにより、当該マクロブロックの圧縮符号（この場合は、上述のようにゼロデータである）に結合される動きベクトルは、動きの無いことを示すゼロデータとなる。
【００４５】
また、切換器１０８は、当該Ｂピクチャの各マクロブロックについて、判定された符号化モードに代えて、当該Ｂピクチャに表示順が近く差分データの算出用として参照され得るフレーム（Ｉピクチャ又はＰピクチクャ）の方向を予測方向とするマクロブロックタイプ情報を可変長符号化器２０へ送るように、端子ｂ側（前方予測）又は端子ｃ側（後方予測）へ切り換えられる。これにより、当該マクロブロックの圧縮符号（この場合は、上述のようにゼロデータである）に結合されるマクロブロックタイプ情報は、上記予測方向を示す情報となる。なお、端子ｂ側（前方予測）又は端子ｃ側（後方予測）の何れを選択するかは、例えば、時間的に近い参照フレームの方向を選択するようにすると、容易に判定することができる。なお、端子ｂ側（前方予測）又は端子ｃ側（後方予測）の何れか一方のみを用意する構成も可能である。
【００４６】
このように、切換器１０４，１０６，１０８の切換が行われる結果、可変長符号化器２０には、上記Ｂピクチャの期間に渡り、全て「０」の圧縮符号と、動きの無いことを示す動きベクトル「０」情報と、当該Ｂピクチャに表示順が近いＩピクチャ又はＰピクチャを指示するマクロブロックタイプ情報とが入力される。これらは可変長符号化された後、結合されて出力される。なお、ＭＰＥＧ方式では、マクロブロック内のデータが特定の条件を満たす場合、該マクロブロックに対するデータを符号化する必要がない。可変長符号化手段２０への上述の入力はこの条件を満たしている。したがって、必ず符号化を義務づけられた少数のマクロブロックのみについて符号化が行われる。このため、当該Ｂピクチャについての符号量は、極めて少ないものに抑えられる。
【００４７】
こうして、シーンチェンジの検出されたＢピクチャについては、当該Ｂピクチャを、当該Ｂピクチャに表示順が近いＩピクチャ又はＰピクチャで置換する旨の情報が出力されることになる。このため、不図示のデコーダ側では、上述のＢピクチャに代えて、上記置換する旨の情報で指示されるＩピクチャ又はＰピクチャが再生出力される。
【００４８】
なお、図１では、切換器１０６により端子ｂ側へ切り換えられた動きベクトル情報と、切換器１０８により端子ｂ側又は端子ｃ側へ切り換えられたマクロブロックタイプ情報とを、動き補償器３４へ入力させているが、この動き補償器３４への入力としては、動き検出器３８から出力される通常の動きベクトル情報と、モード判定器３６から出力される通常のマクロブロックタイプ情報とを採用してもよい。
【００４９】
また、上記では、シーンチェンジの発生したＢピクチャを、当該Ｂピクチャに表示順が近いＩピクチャ又はＰピクチャで置換する旨の情報を出力するように制御する例を説明しているが、当該Ｂピクチャと、当該Ｂピクチャに表示順が近いＩピクチャ又はＰピクチャとの間に、他のＢピクチャが在る場合には、該他のＢピクチャ内の各マクロブロックについて、シーンチェンジの発生したＢピクチャと同様に処理するように構成することもできる。
【００５０】
例えば、図４のＢピクチャ５８でシーンチェンジが検出された場合に、上述のように、Ｂピクチャ５８をＰピクチャ５９で置換するように制御するとともに、他のＢピクチャ５７をＰピクチャ５６（Ｐピクチャ５９でもよい）で置換するように制御する構成も可能である。この構成は、請求項１０，１１に対応する。
【００５１】
３．第２実施例（図２）．
本発明における別の実施の形態の例を図２に示す。図２中、図１や図３と同一の機能を奏するブロックについては同一の符号を付して、説明を省略する。
【００５２】
図２では、切換器１０４を走査変換マクロブロック化器１３と減算器１４との間に配置し、且つ、切換器１０４の端子ｂに、画像メモリ３２から読み出されるデータを入力させている。つまり、切換器１０４が端子ｂに切り換えられた場合に、減算器１４にの両入力端子に、画像メモリ３２から読み出される同一の画像データが入力されるように構成されている。このため、切換器１０４が端子ｂ側に切り換えられた場合には、ＤＣＴプロセッサ１６にはゼロデータが入力されることになり、図１の場合と同じ効果を得ることができる。
【００５３】
【発明の効果】
本発明によると、非参照フレームに於いてシーンチェンジ発生した場合に、該非参照フレームの符号量の増大を抑圧することことができる。このため、シーンチェンジ以外での符号化効率を高めることもできる。その結果、高画質で安定な圧縮符号化を実現することができる。
【図面の簡単な説明】
【図１】本発明の動画像符号化装置の一実施の形態の例を示すブロック図。
【図２】本発明の動画像符号化装置の別の実施の形態の例を示すブロック図。
【図３】典型的な動画像符号化装置を示すブロック図。
【図４】フレーム間予測符号化の各フレームの参照関係の説明図。
【符号の説明】
１０２シーンチェンジ検出器
１０３置換判定器
１０４切換器
１０６切換器
１０８切換器

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a moving image encoding method and apparatus for compressing and encoding a moving image video signal, such as a television signal, using inter-frame predictive coding. In particular, it relates to a moving image encoding method and apparatus that stabilize image quality by preventing adverse effects at a scene change, such as an increase in the code amount of a non-reference frame, that is, a frame that is not referred to in inter-frame predictive coding.
[0002]
[Prior art]
As a method for compressing and encoding a moving image signal, the MPEG method has been standardized as MPEG1-Video (ISO/IEC 11172-2) and MPEG2-Video (ISO/IEC 13818-2). It is widely used not only in the recording field but also in broadcasting and communication fields such as digital broadcasting and digital transmission.
[0003]
In the MPEG system, a GOP (Group of Pictures), which is a set of a plurality of frame units (one field, one frame, or the like), is configured in a frame sequence. In a GOP, at least one frame is compression-coded by intra-frame compression, and the other frames are compression-coded with reference to other frames (inter-frame compression, i.e. inter-frame predictive coding). Images compressed and encoded by intra-frame compression are I pictures, and images compressed and encoded using inter-frame compression are P pictures and B pictures.
[0004]
In intra-frame compression, techniques such as DCT (Discrete Cosine Transform), quantization, and variable length coding (VLC) are employed.
First, the image data in a frame is divided into small regions of 16 × 16 pixels called macroblocks, and a DCT is applied to each 8 × 8 pixel block within a macroblock to obtain frequency-domain data.
[0005]
Since human vision is insensitive to high frequencies, assigning fewer code bits to the high-frequency part of the above frequency-domain data reduces the amount of data without causing deterioration that a human viewer would perceive as significant. This assignment is performed by quantization. Quantization is realized by dividing the 8 × 8 frequency-domain data (coefficient matrix), obtained by the DCT of a block of 8 × 8 pixels, by a certain matrix. On the decoder side, the 8 × 8 frequency-domain data (coefficient matrix) can be recovered by multiplying the data obtained by variable-length decoding by the same matrix (inverse quantization). That is, the matrix on which the inverse DCT is to be performed can be obtained.
[0006]
Here, setting large divisors for the high-order data, that is, the data in the high-frequency region, lowers the encoding accuracy of the high-frequency region. In other words, the amount of data corresponding to the high-frequency region can be greatly reduced. The quantized data thus obtained contains many zero values. For this reason, the variable-length coding table for this data is built from pairs consisting of a count of zero values and the nonzero value that follows them. Variable-length coding is a type of entropy coding: it achieves efficient compression by assigning short codes to data that occurs frequently and long codes to data that occurs rarely.
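The quantization and zero-run pairing described above can be sketched as follows. This is only an illustration under stated assumptions: the matrix values, the sample coefficients, and the function names are hypothetical, and the coefficients are scanned row by row for simplicity rather than in the scan order an actual MPEG encoder would use.

```python
def quantize(coeffs, matrix):
    """Divide each coefficient by the matching matrix entry and round."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, matrix)]


def dequantize(levels, matrix):
    """Decoder side: multiply each level by the same matrix entry."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, matrix)]


def run_level_pairs(values):
    """Pair each nonzero value with the number of zeros that precede it.

    Trailing zeros produce no pair.
    """
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs


# Hypothetical matrix: the divisor grows with the frequency indices u and v,
# so high-frequency coefficients are quantized more coarsely.
MATRIX = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]

# Hypothetical 8x8 coefficient block with a few nonzero entries.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[7][7] = 240.0, -36.0, 25.0, 20.0

levels = quantize(coeffs, MATRIX)
flat = [v for row in levels for v in row]  # row-major scan (an assumption)
pairs = run_level_pairs(flat)
restored = dequantize(levels, MATRIX)
```

In this example the high-frequency coefficient at position (7, 7) is quantized to zero, while the low-frequency coefficients survive with small rounding errors after inverse quantization.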
[0007]
In the inter-frame compression, a technique called motion compensation using a motion vector is employed in addition to the techniques of DCT, quantization, and variable-length coding described above.
First, the image data in a frame is divided into small regions of 16 × 16 pixels called macroblocks, as in intra-frame compression. Inter-frame compression requires not only the image data to be encoded (current image data) but also the image data of another frame to which the current image data refers (reference image data). A frame located close in time to the current image data is used as the reference image data. For each macroblock, the image data of the reference frame is searched for a region (macroblock) whose data is close to that of the macroblock currently being encoded (current macroblock). The “region that is close (approximate) in terms of data” usually means the region for which the mean squared error, computed on the luminance of the pixel data only, is smallest. In MPEG, this search is performed with half-pixel accuracy. The search range is a region within a predetermined range from the position in the reference frame that corresponds to the position of the current macroblock in the current frame. This search is called “motion vector” detection. In inter-frame predictive coding, correcting the positional shift between the reference macroblock and the current macroblock by means of the “motion vector” is called “motion compensation”. The value of the “motion vector” is also variable-length coded.
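A minimal sketch of the motion vector search described above, assuming luminance frames stored as lists of rows. It performs an exhaustive search at integer-pixel accuracy only; the half-pixel refinement mentioned in the text is omitted, and the function names and the `search` range parameter are hypothetical.

```python
def block_mse(cur, cx, cy, ref, rx, ry, size):
    """Mean squared error between a block of cur and a block of ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            d = cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx]
            total += d * d
    return total / (size * size)


def find_motion_vector(cur, ref, x, y, size=16, search=4):
    """Return ((mx, my), error) for the best match within +/- search pixels.

    (x, y) is the top-left corner of the current macroblock. Candidate
    positions that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_err = None, None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = x + mx, y + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            err = block_mse(cur, x, y, ref, rx, ry, size)
            if best_err is None or err < best_err:
                best_mv, best_err = (mx, my), err
    return best_mv, best_err
```

The returned vector points from the current macroblock position to the matching region in the reference frame; an error of zero means the region matches exactly.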
[0008]
If the image data to be referred to exists, the above region that is “close in terms of data” can be cut out on the decoder side. However, the image data of the extracted reference macroblock and the image data of the current macroblock to be encoded are not exactly identical, so this difference must be compensated for. In MPEG, to compensate for this difference, a difference macroblock is constructed by taking the difference for each pixel (luminance and chrominance), and the DCT operation and quantization are performed on each 8 × 8 pixel region within it, as in the case of intra-frame compression. In this case, however, the quantization is usually performed with lower accuracy overall, because the target is difference data. For this reason, a quantization table that does not distinguish between the high-frequency and low-frequency regions is usually provided as the default for difference data. Of course, the encoder is free to change this quantization table.
[0009]
The compressed code thus obtained is subjected to inverse quantization and inverse DCT. When the image data of the referenced frame exists inside the decoder, the macroblock is decoded by adding the values obtained after the inverse DCT to the frame region specified by the motion amount.
[0010]
In MPEG, there are two types of frames that are compression-encoded using inter-frame compression. A frame that refers only to a past frame is called a P picture, and a frame that refers to both the past and the future is called a B picture. A P picture is predicted from an I or P picture located in the past in time (= in display order). A B picture is predicted from I or P pictures located in the past and/or the future in time (= in display order). That is, the B picture is a non-reference frame that is not referred to for motion prediction (= a non-reference frame that is not referred to for calculating difference data), whereas the I and P pictures are reference frames that can be referred to for motion prediction (= frames that can be referred to for calculating difference data).
[0011]
The code amount of the differential compression code produced by combining these compression methods ranges from a fraction to one several-hundredth of the data amount of the original image data, while relatively high image quality is retained. However, because the method refers to other frames, the decoder needs a structure for holding the image data of the reference frames. Moreover, since the decoder uses decoded image data as the image data of the reference frame, the encoder must perform encoding with this in mind. That is, after encoding the I pictures and P pictures, which are the frames to be referred to, the encoder must decode them under the same conditions as the decoder and hold the decoded image data of the reference frames. The P pictures and B pictures must then be encoded with reference to this decoded image data.
[0012]
FIG. 4 shows an example of a frame sequence of an MPEG moving image. Part (A) of the figure shows the display order of the frames. It shows the ordering of frame types most commonly used in MPEG, but the ordering is of course not limited to this. An arrow indicates that the frame at its start point is referred to when the frame at its end point is encoded or decoded. For example, since the B pictures 51 and 52 are frames that are encoded/decoded with reference to the I picture 53, the I picture 53 must already have been decoded and held in the image memory when the B pictures 51 and 52 are encoded/decoded. Similarly, since the B pictures 54 and 55 are frames that are encoded/decoded with reference to the I picture 53 and the P picture 56, the I picture 53 and the P picture 56 must already have been decoded and held in the image memory when the B pictures 54 and 55 are encoded/decoded. Similarly, since the P picture 56 is a frame that is encoded/decoded with reference to the I picture 53, the I picture 53 must already have been decoded and held in the image memory when the P picture 56 is encoded/decoded.
[0013]
For this reason, the recording order of the frames described above is as shown in part (B) of the figure. By decoding in this recording order and successively holding the decoded image data of the I and P pictures, the B and P pictures described above can be encoded/decoded. Because it is referred to in this way, the I picture must maintain image quality by itself. The P picture must also maintain high image quality, since it is used as a reference by other P pictures and by B pictures. For this reason, overall image quality is usually maintained by allocating the largest code amount to the I picture, followed by the P picture and then the B picture.
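The reordering from display order to recording order can be sketched as below. The picture labels follow the reference numerals used for FIG. 4 in the text; the figure itself is not reproduced here, so the resulting order is an assumption that is merely consistent with the stated reference relationships. The function name is hypothetical.

```python
def display_to_recording_order(frames):
    """Reorder (picture_type, label) tuples from display to recording order.

    Each I or P picture is emitted first, followed by the B pictures that
    precede it in display order, since those B pictures may refer to it.
    """
    recorded, waiting_b = [], []
    for frame in frames:
        if frame[0] == "B":
            waiting_b.append(frame)
        else:
            recorded.append(frame)
            recorded.extend(waiting_b)
            waiting_b = []
    # B pictures with no later reference picture in this list stay at the end.
    recorded.extend(waiting_b)
    return recorded


display = [("B", 51), ("B", 52), ("I", 53), ("B", 54), ("B", 55),
           ("P", 56), ("B", 57), ("B", 58), ("P", 59)]
recording = display_to_recording_order(display)
```

With this input, the I picture 53 comes first, and each P picture is recorded ahead of the two B pictures that are displayed before it.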
[0014]
[Problems to be solved by the invention]
However, in the related art, when a scene change occurs in a B picture, which receives the smallest data allocation, so-called bidirectional predictive coding, in which coding is performed with reference to frames in both the past and future directions, can no longer be performed. For this reason, a high compression rate cannot be maintained for the B picture, and the code amount of the B picture increases.
[0015]
In addition, a moving image encoding apparatus performs rate control, which controls the overall code amount by monitoring the code amount output from the apparatus and controlling the quantization scale. Consequently, when the code amount of the B picture in which the scene change occurred increases as described above, the quantization of other frames is controlled to be coarser, and the image quality of those other frames deteriorates.
[0016]
It is an object of the present invention to suppress the increase in the code amount of a B picture in which a scene change has occurred without deteriorating the image quality of that B picture, and thereby also to prevent deterioration of the image quality of other frames.
[0017]
[Means for Solving the Problems]
The invention of claim 1 is a moving image encoding method that takes as an input signal a series of frame moving image signals including I frames, P frames, and B frames, performs intra-frame encoding on the I frames, and performs, on the P frames and B frames, inter-frame encoding in which prediction direction information indicating a preset reference frame is combined with encoded data obtained by encoding the difference data from that reference frame, and the result is output. The method is characterized in that, when a scene change is detected in a B frame, the difference data for that B frame is replaced with zero, and the prediction direction information of that B frame is replaced, from information indicating the preset reference frame to information indicating, as the reference frame, an I or P frame that is close to that B frame in display order, before being output.
[0018]
The invention of claim 2 is the moving image encoding method of claim 1, characterized in that, when a scene change is detected in the B frame, for each B frame located between that B frame and the reference frame after the replacement, the difference data is likewise replaced with zero, and the prediction direction information is replaced, from information indicating the preset reference frame to information indicating the reference frame after the replacement, before being output.
[0019]
The invention of claim 3 is a moving image encoding apparatus that receives a series of frame moving image signals including I frames, P frames, and B frames, performs intra-frame encoding on the I frames, and performs, on the P frames and B frames, inter-frame encoding in which prediction direction information indicating a preset reference frame is combined with encoded data obtained by encoding the difference data from that reference frame, and the result is output. The apparatus is characterized by comprising: scene change detecting means for detecting, among the frames constituting the series of frame moving image signals, a frame in which a scene change occurs; and replacing means for, when the frame in which the scene change is detected by the scene change detecting means is a B frame, replacing the difference data for that B frame with zero and replacing the prediction direction information of that B frame, from information indicating the preset reference frame to information indicating, as the reference frame, an I or P frame that is close to that B frame in display order.
[0020]
The invention of claim 4 is characterized in that, in claim 3, when the frame in which the scene change is detected by the scene change detecting means is a B frame, the replacing means also replaces with zero the difference data of each B frame located between that B frame and the reference frame after the replacement, and replaces the prediction direction information of each such B frame, from information indicating the preset reference frame to information indicating the reference frame after the replacement, before output.
The scene change can be detected by, for example, a change in a high-frequency component of a moving image, a movement of an edge component of a frame, a change in a low-frequency component, and the like. The method of detecting a scene change is known as disclosed in JP-A-6-153146, JP-A-7-38842, and JP-A-7-79431.
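The replacement rule of claim 1 can be sketched as follows, assuming that picture types are given as a list in display order and that a coded B frame is represented by a small dictionary. The dictionary keys, the function names, and the tie-breaking choice (preferring the past reference when both are equally close) are assumptions made for illustration and are not specified in the text.

```python
def nearest_reference(picture_types, index):
    """Find the I or P picture closest in display order to picture `index`.

    Returns ("forward", i) for a past reference or ("backward", i) for a
    future reference. Ties go to the past reference (an assumption).
    """
    past = next((i for i in range(index - 1, -1, -1)
                 if picture_types[i] in ("I", "P")), None)
    future = next((i for i in range(index + 1, len(picture_types))
                   if picture_types[i] in ("I", "P")), None)
    if past is None and future is None:
        raise ValueError("no I or P picture available as a reference")
    if future is None or (past is not None and index - past <= future - index):
        return "forward", past
    return "backward", future


def replace_b_on_scene_change(picture_types, index, coded):
    """Zero the difference data and redirect the prediction of a B frame.

    Frames that are not B frames are returned unchanged.
    """
    if picture_types[index] != "B":
        return coded
    direction, ref = nearest_reference(picture_types, index)
    return {
        "direction": direction,
        "reference": ref,
        "residual": [0] * len(coded["residual"]),
    }


# Display order matching pictures 51..59 in the text: B B I B B P B B P.
types = ["B", "B", "I", "B", "B", "P", "B", "B", "P"]
original = {"direction": "bidirectional", "reference": (5, 8),
            "residual": [3, -1, 4]}
replaced = replace_b_on_scene_change(types, 7, original)
```

Here index 7 corresponds to B picture 58, whose nearest reference in display order is P picture 59 at index 8, so the prediction becomes a backward prediction with all-zero difference data.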
[0026]
BEST MODE FOR CARRYING OUT THE INVENTION
1. Typical MPEG encoder (FIG. 3).
First, a typical MPEG encoder will be described with reference to FIG. Although the description has been made by taking the MPEG encoder as an example, a moving picture encoding device or method that does not conform to the MPEG standard may have the configuration of the present invention.
[0027]
Reference numeral 11 denotes an input means, 12 a frame reorderer, 13 a scan conversion processor for converting raster scan into macroblocks, 14 a subtractor, 16 a DCT processor, 18 a quantizer that performs weighting, 20 a variable-length encoder, 22 a read buffer from the encoder, 23 an output means, 24 a rate controller, 26 an inverse quantizer, 28 an inverse DCT processor, 30 an adder, 32 a frame memory, 32a and 32b data read buses of the frame memory 32, 34 a motion compensation processor, 36 a mode determiner, and 38 a motion detector.
[0028]
First, frame signals are input from the input means 11 to the frame reorderer 12 in the order of FIG. 4(A). The frame reorderer 12 rearranges the frames from the order of FIG. 4(A) into the order of FIG. 4(B). That is, the frame to be compression-encoded as an I picture is placed at the head, and the remaining frames are reordered so that each referenced frame precedes the frames that refer to it.
[0029]
Next, the data is input to the scan conversion / macroblock processor 13. Image data is usually recorded in raster scan format, in which data is output one pixel at a time starting from the upper left of the screen, and after one line has been output, the data of the next line is output. In MPEG, data is processed in units of small regions of 16 × 16 pixels called macroblocks, so the raster scan format must be converted into a macroblock format. In MPEG1, a macroblock consists of 16 × 16 pixels of luminance data and 8 × 8 pixels for each of the chrominance data. In MPEG2, the proportion of chrominance varies with the frame format, but the range corresponding to 16 × 16 luminance pixels is processed as a macroblock, and the motion vector search is also performed in this unit.
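The conversion from raster scan order to macroblocks can be sketched as follows for a single luminance plane. The function name is hypothetical, and the sketch assumes the frame dimensions are multiples of the macroblock size.

```python
def raster_to_macroblocks(pixels, width, height, size=16):
    """Split a flat raster-order pixel list into size x size blocks.

    Blocks are returned left to right, top to bottom; each block is a list
    of rows.
    """
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    if width % size or height % size:
        raise ValueError("frame dimensions must be multiples of the block size")
    blocks = []
    for by in range(0, height, size):
        for bx in range(0, width, size):
            block = []
            for y in range(size):
                start = (by + y) * width + bx
                block.append(pixels[start:start + size])
            blocks.append(block)
    return blocks
```

For example, a 32 × 16 frame yields two macroblocks: the first takes pixels 0 to 15 of every line, and the second takes pixels 16 to 31.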
[0030]
Hereinafter, processing of data output from the scan conversion / macroblock processor 13 will be described for each compression method.
[0031]
1-1. For I-picture.
I-picture data is normally not inter-frame predictive coded and no motion vector is searched for it (although MPEG exceptionally allows motion vectors to be introduced into I pictures for error recovery), so it is input directly to the DCT processor 16. After the DCT processor 16 converts the image data into spatial frequencies, the quantizer 18 applies a weighting to each frequency region. The quantized coefficient data is input to the variable-length encoder 20 and variable-length coded.
[0032]
The code produced by variable-length coding cannot be output at a constant transfer rate as it is. Therefore, the read buffer 22 absorbs the difference from the transfer rate, and the output means 23 outputs the compressed code string at a constant transfer rate.
[0033]
The code amount output from the variable-length encoder 20 is monitored by the rate controller 24, and if it differs greatly from the code amount set for each picture type (I, P, or B picture), the quantization scale is changed. At quantization, the quantization matrix described above is multiplied by the quantization scale, and it is this quantization scale that the rate controller 24 controls. When the quantization scale is increased, quantization becomes coarser; the generated code amount decreases, but the image quality deteriorates. When the quantization scale is decreased, quantization becomes finer; the generated code amount increases, but the image quality improves.
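The feedback described above can be sketched as a simple update rule. The text does not specify how the rate controller 24 adjusts the scale, so the step size of one, the 10% tolerance, and the function name are assumptions; the range of 1 to 31 reflects the quantizer scale range used in MPEG video.

```python
def update_quantizer_scale(scale, produced_bits, target_bits,
                           tolerance=0.1, min_scale=1, max_scale=31):
    """Raise the scale when too many bits were produced, lower it when too few.

    The scale is left unchanged while the produced amount stays within
    `tolerance` of the target for the current picture type.
    """
    if produced_bits > target_bits * (1 + tolerance):
        scale += 1  # coarser quantization, fewer bits
    elif produced_bits < target_bits * (1 - tolerance):
        scale -= 1  # finer quantization, more bits
    return max(min_scale, min(max_scale, scale))
```

A B picture that suddenly produces far more bits than its target pushes the scale up under this rule, which is how its cost can spill over into coarser quantization of the frames that follow.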
[0034]
The output of the quantizer 18 is also input to the inverse quantizer 26. This is for decoding the aforementioned reference image data. That is, after the inverse quantization, the inverse DCT is performed by the inverse DCT processor 28, and the decoded image data is stored in the image memory 32. In this way, the reference I picture is held in the image memory.
[0035]
1-2. For a P picture.
When encoding a P picture, a motion vector is searched for each macroblock. This search is performed with reference to the I picture or P picture stored in the image memory 32. However, motion vectors are not searched for every macroblock. For example, in a fast-moving frame, such as one that pans far to the left, the region at the right edge appears for the first time in that frame, so no similar region can be found in the reference frame for it. In such a case, a mode in which no motion vector is searched is applied, as for a macroblock of an I picture. There is also a mode that simply cuts out the block data at the same position, for simple handling of the background. The mode determiner 36 determines which mode is effective for encoding the macroblock.
[0036]
When the mode determiner 36 decides that motion compensation is to be performed, the instruction is input to the motion compensator 34. The motion vector obtained by the motion detector 38 is also input to the motion compensator 34. Based on these data, the motion compensator 34 reads the data of the corresponding region from the image memory 32 and outputs it to the subtractor 14. The subtractor 14 takes the difference from the current macroblock, and the difference data is encoded by the DCT processor 16 and the quantizer 18 in the same manner as for the I picture described above. A P picture must also be stored in the image memory 32, because it is referred to by other P pictures located in the future in time and by B pictures located in the past and the future. It is therefore processed by the inverse quantizer 26 and the inverse DCT processor 28, but the result as it stands is difference data, not the original image data. For this reason, the motion vector used for encoding the macroblock is used again to output the image data of the corresponding region from the image memory 32 to the adder 30, where it is added to the decoded difference data to decode the original image data. This image data is stored in the image memory 32 and used for the reference described above.
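The local decoding loop for a P picture can be sketched as below. To keep the example short, the DCT is omitted and a single uniform step size stands in for the quantizer; both simplifications, along with the function names and sample values, are assumptions.

```python
def encode_residual(current, prediction, step):
    """Subtract the prediction and quantize the difference (DCT omitted)."""
    return [round((c - p) / step) for c, p in zip(current, prediction)]


def reconstruct(levels, prediction, step):
    """Inverse-quantize the difference and add the prediction back.

    This mirrors what a decoder computes, so the encoder stores this result,
    not the original pixels, as the reference image.
    """
    return [p + lv * step for lv, p in zip(levels, prediction)]


current = [100, 104, 97, 90]      # pixels of the block being encoded
prediction = [98, 100, 99, 90]    # motion-compensated block from the reference
levels = encode_residual(current, prediction, step=3)
stored_reference = reconstruct(levels, prediction, step=3)
```

The stored reference differs slightly from the original pixels because quantization is lossy, which is why the encoder has to decode its own output to stay in step with the decoder.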
[0037]
1-3. For B picture.
The coding method for a B picture is basically the same as that for a P picture. However, because a B picture is compression-encoded with reference to both past and future I and P pictures, twice as many motion vector searches are required as for a P picture. Further, since the B picture is not used for reference, it is not necessary to store the decoded image data in the image memory 32. In the case of a B picture, the encoding mode of the macroblock is selected from four types: prediction from both the past and the future, prediction from the past, prediction from the future, and intra-frame-unit encoding.
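As a rough illustration of choosing among the four B-picture macroblock modes, the following sketch picks the prediction with the smallest sum of absolute differences and falls back to intra coding when every prediction is poor. The selection criterion, the threshold, and all names are assumptions for illustration; the text does not state how the mode determiner 36 makes its decision.

```python
def sad(a, b):
    """Sum of absolute differences between two sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_b_mode(current, past_ref, future_ref, intra_threshold=64):
    """Pick one of the four B-picture modes (hypothetical criterion)."""
    # Bidirectional prediction modeled as the rounded average of both references.
    interpolated = [(p + f + 1) // 2 for p, f in zip(past_ref, future_ref)]
    costs = {
        "bidirectional": sad(current, interpolated),
        "forward": sad(current, past_ref),     # prediction from the past
        "backward": sad(current, future_ref),  # prediction from the future
    }
    best = min(costs, key=costs.get)
    return best if costs[best] <= intra_threshold else "intra"
```

For example, a block identical to the past reference yields `"forward"`, while a block unlike either reference yields `"intra"`.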
Thus, a bit stream of the compressed image code of the MPEG system is generated.
[0038]
2. First embodiment (FIG. 1).
FIG. 1 shows an example of an embodiment of the present invention. In FIG. 1, blocks having the same functions as those in FIG. 3 are denoted by the same reference numerals, and description thereof will be omitted.
[0039]
Reference numeral 102 denotes a scene change detector, 103 denotes a replacement determiner, and 104, 106, and 108 denote switches. Each of the switches 104, 106, and 108 is switched by a control signal from the replacement determiner 103.
[0040]
The scene change detector 102 determines whether or not there is a scene change based on the image data input from the input unit 11. For example, the presence or absence of a scene change can be determined based on a change in the high-frequency component of the image data, a shift in the edge component of the frame, a change in the low-frequency component, and the like. In the present invention, the technique for detecting a scene change is not limited. Further, as the frame used as a reference for determining a scene change of the current frame, either the frame located immediately before the current frame on the time axis or a frame that the current frame refers to at the time of encoding can be used.
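Since the detection technique is left open, the following is only one possible sketch: it compares the mean luminance (a crude low-frequency measure) of the current frame with that of a comparison frame. The function names, the flat list of luminance samples, and the threshold value are all assumptions for illustration.

```python
def mean_luma(frame):
    """Average luminance of a frame given as a flat list of samples."""
    return sum(frame) / len(frame)


def is_scene_change(current, reference, threshold=30.0):
    """Hypothetical detector: flag a scene change when the mean luminance jumps.

    `reference` may be the immediately preceding frame or a frame that the
    current frame refers to during encoding, as the text allows either.
    """
    return abs(mean_luma(current) - mean_luma(reference)) > threshold
```

A practical detector would likely combine several measures, such as the high-frequency and edge components mentioned above; this sketch only shows the thresholded-comparison structure.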
[0041]
When a scene change is detected by the scene change detector 102, the replacement determiner 103 determines whether the scene change has occurred in a B picture, or in an I picture or a P picture. In other words, it determines whether the scene change has occurred in a B picture, which is a non-reference frame not referred to for calculating the difference data, or in an I picture or P picture, which can be referred to for calculating the difference data.
[0042]
As a result, when it is determined that a scene change has been detected in an I picture or a P picture, the replacement determiner 103 commands the switches 104, 106, and 108 to maintain the normal position (terminal a). In this case, the same operation as in FIG. 1 is performed. On the other hand, when it is determined that a scene change has been detected in a B picture, the replacement determiner 103 commands each of the switches 104, 106, and 108 to switch away from the normal position (terminal a) for the macroblocks of that B picture.
[0043]
As a result, for each macroblock of the B picture, the switch 104 is switched to the terminal b side so that zero data is input to the DCT processor 16 instead of the input from the subtractor 14 (the difference data). Therefore, the differential compression code output from the quantizer 18 becomes zero data.
[0044]
In addition, for each macroblock of the B picture, the switch 106 is switched to the terminal b side so that zero data indicating no motion is sent to the variable-length encoder 20 in place of the detected motion vector. Thereby, the motion vector combined with the compression code of the macroblock (in this case, zero data as described above) becomes zero data indicating no motion.
[0045]
In addition, for each macroblock of the B picture, the switch 108 is switched to the terminal b side (forward prediction) or the terminal c side (backward prediction) so that, in place of the coding mode determined for the macroblock, macroblock type information is sent to the variable-length encoder 20 whose prediction direction is the direction of a frame (I picture or P picture) that is close to the B picture in display order and can be referred to for calculating difference data. As a result, the macroblock type information combined with the compression code of the macroblock (in this case, zero data as described above) becomes information indicating that prediction direction. Whether to select the terminal b side (forward prediction) or the terminal c side (backward prediction) can be easily determined by, for example, selecting the direction of the reference frame that is closer in time. A configuration in which only one of the terminal b side (forward prediction) and the terminal c side (backward prediction) is provided is also possible.
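The control of the three switches described in the preceding paragraphs can be summarized in a short sketch. The function name, the dictionary keys, the distance arguments (in frames of display order), and the tie-break toward forward prediction are assumptions for illustration only.

```python
def switch_positions(picture_type, scene_change, dist_to_past_ref, dist_to_future_ref):
    """Return terminal positions for switches 104, 106, and 108 (hypothetical model).

    Terminal a is the normal position. For a B picture with a detected scene
    change, switches 104 and 106 go to terminal b (zero difference data, zero
    motion vector) and switch 108 selects forward (b) or backward (c)
    prediction according to the temporally closer reference frame.
    """
    if not (scene_change and picture_type == "B"):
        return {"sw104": "a", "sw106": "a", "sw108": "a"}
    # Tie-break toward forward prediction is an arbitrary choice in this sketch.
    sw108 = "b" if dist_to_past_ref <= dist_to_future_ref else "c"
    return {"sw104": "b", "sw106": "b", "sw108": sw108}
```

For instance, a B picture one frame before its future reference and two frames after its past reference would select terminal c (backward prediction) for the switch 108.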
[0046]
As a result of the switching of the switches 104, 106, and 108, over the period of the B picture the variable-length encoder 20 receives a compression code of all "0", motion vector "0" information indicating that there is no motion, and macroblock type information indicating the I picture or P picture whose display order is close to the B picture. These are variable-length coded and then combined and output. In the MPEG system, when the data in a macroblock satisfies a specific condition, there is no need to encode data for that macroblock. The above-described input to the variable-length encoder 20 satisfies this condition. Therefore, encoding is performed only for the small number of macroblocks that must be encoded. For this reason, the code amount of the B picture is suppressed to an extremely small amount.
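To give a feel for how few macroblocks remain to be coded, the sketch below counts them under two stated assumptions: each row of macroblocks forms one slice, and only the first and last macroblock of each slice must be coded while the rest can be skipped. The frame size is also an arbitrary example; none of these figures come from the text.

```python
def count_coded_macroblocks(mb_rows, mbs_per_row):
    """Count macroblocks that still need coding when all others can be skipped.

    Assumes one slice per macroblock row, with the first and last macroblock
    of each slice always coded (an assumption of this sketch).
    """
    coded_per_slice = min(mbs_per_row, 2)
    return mb_rows * coded_per_slice


# Example: a 720x480 frame divided into 16x16 macroblocks.
mbs_per_row = 720 // 16
mb_rows = 480 // 16
total = mb_rows * mbs_per_row
coded = count_coded_macroblocks(mb_rows, mbs_per_row)
```

Under these assumptions only a few percent of the macroblocks carry any code, and those that do carry zero motion vectors and no coefficient data.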
[0047]
In this way, for a B picture in which a scene change is detected, information indicating that the B picture is replaced with an I picture or a P picture whose display order is close to that of the B picture is output. For this reason, the decoder (not shown) reproduces and outputs the I picture or the P picture indicated by the information to be replaced, instead of the B picture.
[0048]
In FIG. 1, the motion vector information switched to the terminal b by the switch 106 and the macroblock type information switched to the terminal b or the terminal c by the switch 108 are input to the motion compensator 34. However, the normal motion vector information output from the motion detector 38 and the normal macroblock type information output from the mode determiner 36 may instead be adopted as the inputs to the motion compensator 34.
[0049]
In the above description, an example has been described in which control is performed so as to output information indicating that a B picture in which a scene change has occurred is replaced with an I picture or a P picture whose display order is close to that B picture. If there is another B picture between that B picture and the I picture or P picture whose display order is close to it, the apparatus can also be configured to process each macroblock in the other B picture in the same way as for that B picture.
[0050]
For example, when a scene change is detected in the B picture 58 of FIG. 4, control is performed so that the B picture 58 is replaced with the P picture 59 as described above, and the other B picture 57 is replaced with the P picture 56 (or, alternatively, the P picture 59). This configuration corresponds to claims 10 and 11.
[0051]
3. Second embodiment (FIG. 2).
FIG. 2 shows an example of another embodiment of the present invention. In FIG. 2, blocks having the same functions as those in FIGS. 1 and 3 are denoted by the same reference numerals, and description thereof will be omitted.
[0052]
In FIG. 2, the switch 104 is disposed between the scan conversion macroblock generator 13 and the subtractor 14, and data read from the image memory 32 is input to the terminal b of the switch 104. That is, the same image data read from the image memory 32 is input to both input terminals of the subtractor 14 when the switch 104 is switched to the terminal b. Therefore, when the switch 104 is switched to the terminal b, zero data is input to the DCT processor 16, and the same effect as in FIG. 1 can be obtained.
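The effect of placing the switch 104 ahead of the subtractor can be checked with a small sketch. The function names and the flat-list block representation are assumptions for illustration.

```python
def subtractor_input(switch_104, current_mb, prediction_mb):
    """Select what reaches the first input of the subtractor 14.

    Terminal a passes the current macroblock; terminal b passes the data read
    from the image memory 32 (the same data that feeds the other input).
    """
    return current_mb if switch_104 == "a" else prediction_mb


def difference(switch_104, current_mb, prediction_mb):
    """Output of the subtractor 14 for the selected switch position."""
    minuend = subtractor_input(switch_104, current_mb, prediction_mb)
    return [m - p for m, p in zip(minuend, prediction_mb)]
```

With the switch at terminal b, both inputs of the subtractor are identical, so the difference is zero for every sample regardless of the picture content.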
[0053]
[Effects of the Invention]
According to the present invention, when a scene change occurs in a non-reference frame, an increase in the code amount of the non-reference frame can be suppressed. For this reason, the coding efficiency of the frames other than the frame at the scene change can be improved. As a result, stable compression encoding with high image quality can be realized.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an example of an embodiment of a moving image encoding device according to the present invention.
FIG. 2 is a block diagram showing an example of another embodiment of the video encoding device of the present invention.
FIG. 3 is a block diagram showing a typical moving picture encoding device.
FIG. 4 is an explanatory diagram of a reference relationship between frames in inter-frame predictive coding.
[Explanation of symbols]
102 Scene Change Detector
103 Replacement decision unit
104 switch
106 switch
108 switch

Claims

In a moving picture coding method in which a series of frame video signals including an I frame, a P frame, and a B frame is used as an input signal, intra-frame coding is performed for the I frame, and, for the P frame and the B frame, inter-frame coding is performed in which prediction direction information indicating a preset reference frame is combined with coded data obtained by coding difference data from that reference frame and is output,
when a scene change is detected in the B frame, the difference data for the B frame is replaced with zero, and the prediction direction information of the B frame is replaced, from information indicating the preset reference frame, with information indicating as the reference frame an I or P frame whose display order is close to the B frame, and is output.
A moving picture coding method characterized by the above-mentioned.

In claim 1,
When a scene change is detected in the B frame, for any B frame existing between that B frame and the reference frame after the replacement, the difference data is likewise replaced with zero, and the prediction direction information is replaced, from information indicating the preset reference frame, with information indicating the reference frame after the replacement, and is output,
A moving picture coding method characterized by the above-mentioned.

In a moving picture coding apparatus that receives as input a series of frame moving image signals including an I frame, a P frame, and a B frame, performs intra-frame coding for the I frame, and performs, for the P frame and the B frame, inter-frame coding in which prediction direction information indicating a preset reference frame is combined with coded data obtained by coding difference data from that reference frame and is output,
scene change detection means for detecting a frame in which a scene change has occurred among the frames constituting the series of frame moving image signals, and
replacement means for, when the frame in which a scene change is detected by the scene change detection means is the B frame, replacing the difference data for the B frame with zero, and replacing the prediction direction information of the B frame, from information indicating the preset reference frame, with information indicating as the reference frame an I or P frame whose display order is close to the B frame,
A moving picture coding apparatus comprising the above means.

In claim 3,
When the frame in which the scene change is detected by the scene change detection means is the B frame, the replacement means also replaces with zero the difference data for any B frame existing between that B frame and the reference frame after the replacement, and replaces the prediction direction information of that frame, from information indicating the preset reference frame, with information indicating the reference frame after the replacement, and outputs it,
A moving picture coding apparatus characterized by the above-mentioned.