JP4264535B2

JP4264535B2 - Image processing apparatus and method, recording medium, and program

Info

Publication number: JP4264535B2
Application number: JP2003026701A
Authority: JP
Inventors: 恭一竹内
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-02-04
Filing date: 2003-02-04
Publication date: 2009-05-20
Anticipated expiration: 2023-02-04
Also published as: JP2004241879A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置および方法、記録媒体、並びにプログラムに関し、特に、画像情報をMPEG（Moving Picture Experts Group）方式で圧縮する際のVBV（Video Buffer Verifier）バッファアンダーフローを抑制できるようにした画像処理装置および方法、記録媒体、並びにプログラムに関する。
【０００２】
【従来の技術】
近年、MPEG方式による画像圧縮伸張技術が一般に普及している。
【０００３】
画像がMPEG方式で圧縮されることにより、MPEGビットストリームが生成され、その生成されたMPEGビットストリームが伝送路を介して受信側の装置に送信される。この際、MPEG方式で画像を圧縮して、送信する送信側の装置は、これから送信しようとするMPEGビットストリームが受信側の装置で、十分に再生可能なものであるか否かを監視しつつ送信している。
【０００４】
より詳細には、図１で示すように、送信側の装置、すなわちエンコーダは、受信側の装置（デコーダ）がMPEGビットストリームを受信する際に使用するバッファの状態をVBVバッファにより仮想的に再現して、バッファオーバーフローやバッファアンダーフローが発生しないように監視しつつエンコードしている。
【０００５】
すなわち、VBVバッファの占有量は、図２で示すように変化することになる。ここで、図２においては、縦軸がVBVバッファの占有量を示し、横軸が時間を示す。従って、図２における直線の傾きは、ビットレートを示す。VBVバッファは、その最大容量（図中のMax１）までMPEGビットストリームが入ると占有量は状態Ｓ１となる。この状態Ｓ１のVBVバッファから1ピクチャ（Iピクチャ（イントラピクチャ）、Pピクチャ（前方向予測ピクチャ）、または、Bピクチャ（双方向予測ピクチャ））分のMPEGビットストリームがデコードのために引き抜かれると、状態は、状態Ｓ１から状態Ｓ２に変化する。1ピクチャ分の発生量がVBVバッファの下限を超えない場合、デコーダは1ピクチャ分のデータを全て受け取ることができるので正常にデコードできることになる。
【０００６】
続いて、時間Ｔ（= 1/フレームレート）が経過した後、VBVバッファにMPEGビットストリームが入れられると占有量は状態Ｓ３となる。ここでVBVバッファから1ピクチャ分のMPEGビットストリームが引き抜かれる場合、図中のデータ量（Ｌ１＋Ｌ２）が１ピクチャ分のデータ量であるとき、状態Ｓ３においては、図中データ量Ｌ１分のデータしかVBVバッファには蓄えられておらず、状態Ｓ４で示すようにMPEGビットストリームの状態は、１ピクチャ分のMPEGビットストリームに対してデータ量Ｌ２が不足した状態となり、MPEGビットストリームが途切れてしまうことになる。このため、ストリームの受信側の装置となるデコーダでは1ピクチャ分のデコードが途中でできなくなってしまう。このように、VBVバッファにおいて、蓄えられるデータ量が、１ピクチャ分に満たない状態となり、デコードが不能になってしまう状態をVBVバッファアンダーフローと呼ぶ。
【０００７】
エンコーダは、VBVバッファアンダーフローが生じないように、状態Ｓ３の時点で、1ピクチャの発生量を最大発生ビット量（MaxGenBit＝Ｌ１）以内に抑えるようにストリームを制御し、VBVバッファアンダーフローが生じないようにエンコードしている。
【０００８】
VBVバッファアンダーフローを生じないようにエンコードする（1ピクチャの発生量を最大発生ビット量（MaxGenBit＝Ｌ１）以内に抑えるようにストリームを制御しつつエンコードする）方法の一つとして、１フレームのビット発生量を抑えるためマクロブロック（MB）単位でエンコードにより生じるパラメータを必要最小限（最少パラメータ）にして、圧縮処理（以下、最少パラメータ処理と称する）することが提案されている。この最少パラメータ処理はピクチャタイプによって異なり、Iピクチャの場合、全てのMBのDC（Direct Current）成分のみのパラメータとする処理を実行し、Bピクチャ、または、Pピクチャの場合、全てのMBをスキップドマクロブロック（skipped MB）にする処理を実行する。ここで、スキップドマクロブロックとは、各ピクチャにおける、各スライスの第1マクロブロックヘッダ（MBヘッダ）と各スライスの最終MBヘッダのみから構成されるデータである。
【０００９】
また、再生画像の画質を安定させつつ、目標ビットレートで符号化できるようにするものがある（例えば、特許文献１参照）。
【００１０】
【特許文献１】
特開平１０−１５５１５２号公報
【００１１】
【発明が解決しようとする課題】
ところが、以上のように最少パラメータ処理により、発生ビット量を最小にしても、結果として、発生するビット量は０にすることができないため、VBVバッファの占有量が、図２で示すデータ量Ｌ１以下になった場合に最少パラメータによる圧縮処理を行っても、最大発生ビット量Ｌ１以内に１ピクチャ分の発生ビット量の全てが収まり切らず、結果として、１ピクチャ分のデータに不足するデータ量Ｌ２が発生してしまう恐れがあった。
【００１２】
そこで、図３で示すように、VBVバッファにマージンを設けることにより、マージン＋最大発生ビット量と、VBVバッファ占有量との比較から、発生ビット量が、マージン＋最大発生ビット量を超えてしまうような場合、上述のような最少パラメータによる処理を実行することにより、VBVバッファアンダーフローを回避させる方法が提案されている。
【００１３】
すなわち、図４で示すように、VBVバッファの最大容量Max２（＝Max１＋マージン（margin））が設定される。そして、状態Ｓ１’で、最大容量Max２までMPEGビットストリームを蓄え、この状態から１ピクチャ分のデータが引き抜かれると、状態Ｓ２’に変化する。そして、再び、上述と同様に、時間Ｔだけ経過した後、状態Ｓ３’となったところで、再び１ピクチャ分のデータが引き抜かれるとき、VBVバッファの占有量は、マージンとなるレベルまでのデータ量Ｌ１となり、上述のように１ピクチャ分のMPEGビットストリームのデータ量に満たないため、最少パラメータによる圧縮処理が行われることになるが、この際、マージンを見込んでいるため、最少パラメータにより発生するデータ量Ｌ２がマージンにより吸収されることになるため、結果として、VBVバッファアンダーフローが発生しないことになる。
【００１４】
ところで、上述のマージンの設定は、予め所定の複数の設定画像から最少パラメータを求める、いわゆる、チューニング処理により、その最少パラメータにより発生される発生ビット量を求め、最大となるものがマージンとして設定されるようにされている。しかしながら、このようなチューニング処理によるマージンの設定では、既知の複数の設定画像には含まれていない、未知の画像のMPEGビットストリームのデータ量には対応することができない恐れがあり、その未知の画像のMPEGビットストリームのデータ量により最少パラメータの発生ビット量が、マージンとして設定されていた値よりも大きくなる恐れがあり、結果として、VBVバッファアンダーフローの発生を必ずしも抑制することができないという課題があった。
【００１５】
また、マージンは、チューニングに用いる画像によりばらつきが生じてしまうことになるため、必要以上にマージンを大きく設定してしまうことがあり、結果として、VBVバッファアンダーフローは抑制できるものの、設定されたVBVバッファの容量を最大限に利用することができない恐れがあった。
【００１６】
本発明はこのような状況に鑑みてなされたものであり、VBVバッファを最大限に使用できるようにし、VBVバッファアンダーフローを抑制できるようにするものである。
【００１７】
【課題を解決するための手段】
本発明の画像処理装置は、入力された画像のピクチャタイプを判別する判別手段と、画像のピクチャタイプが、 I ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、マクロブロック数および I ピクチャ内の DC 成分により発生するビット量のそれぞれの積和によりVBVバッファのマージンを演算し、画像のピクチャタイプが、 P ピクチャ、または、 B ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、スライス数およびスライス内の先頭マクロブロックと最終マクロブロックとのビット量の和のそれぞれの積和により VBV バッファのマージンを演算する演算手段と、演算手段により演算された、ピクチャタイプ別のVBVバッファのマージンを記憶する記憶手段と、MPEG方式で画像を圧縮する圧縮手段と、圧縮手段により圧縮された画像情報の最大ビット発生量、および判別手段により判別されたピクチャタイプ別のVBVバッファのマージンの和が、 VBV バッファの占有量より大きいとき、VBVバッファアンダーフローを検出するアンダーフロー検出手段と、アンダーフロー検出手段により、VBVバッファアンダーフローが検出された場合、ピクチャタイプが I ピクチャであるとき、全てのマクロブロックを DC 成分にし、 P ピクチャ、または、 B ピクチャであるとき、全てのマクロブロックをスキップドマクロブロックにすることにより、画像を最少パラメータで圧縮するように前記圧縮手段を制御する制御手段とを備えることを特徴とする。
【００２１】
本発明の画像処理方法は、入力された画像のピクチャタイプを判別する判別ステップと、画像のピクチャタイプが、 I ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、マクロブロック数および I ピクチャ内の DC 成分により発生するビット量のそれぞれの積和によりVBVバッファのマージンを演算し、画像のピクチャタイプが、 P ピクチャ、または、 B ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、スライス数およびスライス内の先頭マクロブロックと最終マクロブロックとのビット量の和のそれぞれの積和により VBV バッファのマージンを演算する演算ステップと、演算ステップの処理で演算された、ピクチャタイプ別のVBVバッファのマージンを記憶する記憶ステップと、MPEG方式で画像を圧縮する圧縮ステップと、圧縮ステップの処理で圧縮された画像情報の最大ビット発生量、および判別手段により判別されたピクチャタイプ別のVBVバッファのマージンの和が、 VBV バッファの占有量より大きいとき、VBVバッファアンダーフローを検出するアンダーフロー検出ステップと、アンダーフロー検出ステップの処理で、VBVバッファアンダーフローが検出された場合、ピクチャタイプが I ピクチャであるとき、全てのマクロブロックを DC 成分にし、 P ピクチャ、または、 B ピクチャであるとき、全てのマクロブロックをスキップドマクロブロックにすることにより、画像を最少パラメータで圧縮するように圧縮ステップの処理を制御する制御ステップとを含むことを特徴とする。
【００２２】
本発明の記録媒体のプログラムは、入力された画像のピクチャタイプを判別する判別ステップと、画像のピクチャタイプが、 I ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、マクロブロック数および I ピクチャ内の DC 成分により発生するビット量のそれぞれの積和によりVBVバッファのマージンを演算し、画像のピクチャタイプが、 P ピクチャ、または、 B ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、スライス数およびスライス内の先頭マクロブロックと最終マクロブロックとのビット量の和のそれぞれの積和により VBV バッファのマージンを演算する演算ステップと、演算ステップの処理で演算された、ピクチャタイプ別のVBVバッファのマージンを記憶する記憶ステップと、MPEG方式で画像を圧縮する圧縮ステップと、圧縮ステップの処理で圧縮された画像情報の最大ビット発生量、および判別手段により判別されたピクチャタイプ別のVBVバッファのマージンの和が、 VBV バッファの占有量より大きいとき、VBVバッファアンダーフローを検出するアンダーフロー検出ステップと、アンダーフロー検出ステップの処理で、VBVバッファアンダーフローが検出された場合、ピクチャタイプが I ピクチャであるとき、全てのマクロブロックを DC 成分にし、 P ピクチャ、または、 B ピクチャであるとき、全てのマクロブロックをスキップドマクロブロックにすることにより、画像を最少パラメータで圧縮するように圧縮ステップの処理を制御する制御ステップとを含むことを特徴とする。
【００２３】
本発明のプログラムは、入力された画像のピクチャタイプを判別する判別ステップと、画像のピクチャタイプが、 I ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、マクロブロック数および I ピクチャ内の DC 成分により発生するビット量のそれぞれの積和によりVBVバッファのマージンを演算し、画像のピクチャタイプが、 P ピクチャ、または、 B ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、スライス数およびスライス内の先頭マクロブロックと最終マクロブロックとのビット量の和のそれぞれの積和により VBV バッファのマージンを演算する演算ステップと、演算ステップの処理で演算された、ピクチャタイプ別のVBVバッファのマージンを記憶する記憶ステップと、MPEG方式で画像を圧縮する圧縮ステップと、圧縮ステップの処理で圧縮された画像情報の最大ビット発生量、および判別手段により判別されたピクチャタイプ別のVBVバッファのマージンの和が、 VBV バッファの占有量より大きいとき、VBVバッファアンダーフローを検出するアンダーフロー検出ステップと、アンダーフロー検出ステップの処理で、VBVバッファアンダーフローが検出された場合、ピクチャタイプが I ピクチャであるとき、全てのマクロブロックを DC 成分にし、 P ピクチャ、または、 B ピクチャであるとき、全てのマクロブロックをスキップドマクロブロックにすることにより、画像を最少パラメータで圧縮するように圧縮ステップの処理を制御する制御ステップとを含む処理をコンピュータに実行させることを特徴とする。
【００２４】
本発明の画像処理装置および方法、並びにプログラムにおいては、入力された画像のピクチャタイプが判別され、画像のピクチャタイプが、 I ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、マクロブロック数および I ピクチャ内の DC 成分により発生するビット量のそれぞれの積和によりVBVバッファのマージンが演算され、画像のピクチャタイプが、 P ピクチャ、または、 B ピクチャの場合、所定の定数およびパイプライン遅延により発生するビット量、スライス数およびヘッダにより発生するビット量、並びに、スライス数およびスライス内の先頭マクロブロックと最終マクロブロックとのビット量の和のそれぞれの積和により VBV バッファのマージンが演算され、演算された、ピクチャタイプ別のVBVバッファのマージンが記憶され、MPEG方式で画像が圧縮され、圧縮された画像情報の最大ビット発生量、および判別されたピクチャタイプ別のVBVバッファのマージンの和が、 VBV バッファの占有量より大きいとき、VBVバッファアンダーフローが検出され、VBVバッファアンダーフローが検出された場合、ピクチャタイプが I ピクチャであるとき、全てのマクロブロックが DC 成分にされ、 P ピクチャ、または、 B ピクチャであるとき、全てのマクロブロックがスキップドマクロブロックにされることにより、画像が最少パラメータで圧縮されるように制御される。
【００２５】
【発明の実施の形態】
図５は、本発明を適用したエンコーダの基本符号化処理部１の一実施の形態の構成を示している。
【００２６】
図５のエンコーダの基本符号化処理部１は、画像データをMPEG方式で符号化し、この符号化データをエンコーダ送出バッファ２に出力する。レート制御部３は、基本符号化処理部１によって画像データをMPEG方式で符号化するときの量子化幅ｑ_scaleを制御している。
【００２７】
ここで、図６を参照して、MPEG方式の符号化処理の概略を説明する。符号化される画像データの各ピクチャは、複数のマクロブロックに分割されて符号化される。１つのマクロブロックは、輝度について、１６×１６画素ブロックのデータ（更に基本符号化処理単位である４つの８×８画素ブロックに分けられる）、色差について、基本符号化処理単位である２つの８×８画素ブロックのデータに分けられ、これらのブロックが符号化される。また、１つのマクロブロックにおいて、マクロブロック内のブロックの符号化方法および量子化幅等が決定される。
【００２８】
スライスは、複数のマクロブロックを含むデータ単位であり、複数のスライスから１つのピクチャが構成される。ピクチャは、その符号化方法として、ピクチャ自体がそのまま符号化されるIピクチャ（イントラピクチャ）、時間的に過去のピクチャからの動きを予測した上で、符号化されるPピクチャ、時間的に過去、および、未来の両方、もしくはいずれか一方のピクチャからの動きを予測した上で、符号化されるBピクチャがある。
【００２９】
図６における各Ｉ，Ｐ，Ｂピクチャの配置は、その典型的な例であり、最初のIピクチャを用いて３枚先のＰピクチャが予測されて符号化され、その間に含まれる各Ｂピクチャが各Ｉ，Ｐピクチャの両方から予測されて符号化されている。したがって、最初にＩピクチャが符号化され、次にＰピクチャが符号化され、更にＢピクチャが符号化されることになる。このために、本来の時間の進行に沿った各Ｉ，Ｐ，Ｂピクチャの順序が変更されてから、これらのピクチャが符号化されることになる。
【００３０】
さらに、Ｉピクチャから始まる複数のピクチャからなるＧＯＰ（グループオブピクチャ）が構成され、任意の数のＧＯＰで１つのビデオシーケンスが構成される。
【００３１】
ピクチャタイプ決定部１１は、入力された画像の各ピクチャのピクチャタイプを判別して決定し、符号化する順番にならび替え、並び替えた状態で走査変換部１２、および、動き検出部２１に出力する。また、ピクチャタイプ決定部１１は、決定したピクチャタイプと共に、そのピクチャのMB（マクロブロック）数、スライス数、および、クロマフォーマットの情報をレート制御部３に出力する。
【００３２】
走査変換部１２は、並び替えられたピクチャを符号化される単位であるマクロブロックに変換し、これらのマクロブロックを順次減算器１３に出力する。減算器１３は、走査変換部１２より入力されたマクロブロックと動き補償付き予測部１９からの予測データとの差分を求め、この差分を予測誤差として求め、DCT（Discrete Cosine Transform）変換部１４に出力する。
【００３３】
DCT変換部１４は、モード判定部２０からのモード判定結果に基づいて、この予測誤差を８×８画素ブロック単位でDCT変換し、このDCT変換により得られた各変換係数を重み付け量子化部１５に出力する。
【００３４】
重み付け量子化部１５は、レート制御部３より入力される量子化幅q_scaleに基づいて、各変換係数を量子化し、これにより得た量子化データを、可変長符号化部１６および逆量子化部１７に出力する。また、重み付け量子化部１５は、レート制御部３より、最少パラメータによる符号化の指令を受信した場合、上述のように最少パラメータによる符号化処理を実行し、Iピクチャに対しては、DC成分のみを符号化し、Pピクチャ、または、Ｂピクチャに対しては、スキップドマクロブロックによる符号化を行い、符号化量を小さくする。
【００３５】
可変長符号化部１６は、モード判定部２０からのモード判定結果、および、動き検出部２１からの動きベクトルに基づいて、量子化データを可変長符号化して、圧縮符号化された画像データを形成し、この圧縮符号化された画像データを、所望の伝送レートで伝送するためにエンコーダ送出バッファ２に一旦蓄積させて出力する。より詳細には、可変長符号化部１６は、各マクロブロックの符号化が終了する度に、MBエンドタイミング信号をレート制御部３に送出すると共に、１ピクチャの符号化が開始する度に、ピクチャ開始タイミング信号をレート制御部３へ送る。
【００３６】
逆量子化部１７は、重み付け量子化された画像データを逆量子化して、逆DCT変換部１８に出力する。逆DCT変換部１８は、入力された逆量子化されている画像データを逆DCT変換し、加算器２２に出力する。
【００３７】
加算器２２は、動き補償付き予測部１９から入力される予測データと、逆DCT変換された画像データを加算して、動き補償付き予測部１９に出力する。動き補償付き予測部１９は、加算器２２より入力される予測データが加算された、再生された画像データ、動き検出部２１より入力された動きベクトル、および、モード判定部２０より入力されるモード判定結果に基づいて、動きが予測されたピクチャを示す予測データを形成し、減算器１３、および、加算器２２に出力する。
【００３８】
動き検出部２１は、各マクロブロック毎に、画像の動きベクトルを算出し、この動きベクトルを動き補償付き予測部１９に入力するとともにモード判定部２０に出力する。
【００３９】
モード判定部２０は、動き検出部２１から入力された動きベクトルに基づいて、動き補償予測モードを決定し、DCT変換部１４、可変長符号化部１６、および、動き補償付き予測部１９に出力する。
【００４０】
レート制御部３は、ピクチャタイプ決定部１１より入力されてくるピクチャタイプ、MB数、スライス数、およびクロマフォーマットの情報に基づいて、上述のVBVバッファのマージンサイズを決定し、可変長符号化部１６から出力されたビットストリームのビット発生量からVBVバッファアンダーフローの有無を判定すると共に、VBVバッファアンダーフロー判定結果に応じて、重み付け量子化部１５に対して、上述の最少パラメータによる処理を実行するように指令を出力する。
【００４１】
次に、図７のブロック図を参照して、図５のレート制御部３の詳細な構成について説明する。
【００４２】
制御部４１は、MBエンドタイミング信号が入力される度に、可変長符号化部１６からのビットストリームに基づいて１マクロブロック当たりの実際のビット発生量を求め、VBVバッファの占有量として加算して、VBVバッファアンダーフロー判定部４２に供給すると共に、VBVバッファの占有量に基づいて、各マクロブロック毎に、量子化幅ｑ_scaleを求め、この量子化幅ｑ_scaleを基本符号化処理部１の重み付け量子化部１５に供給する。
【００４３】
より詳細には、制御部４１は、入力されるMBエンドタイミング信号、および、ピクチャ開始タイミング信号に基づいて、MBエンドタイミング信号の度に、MBの計数値に基づいて１マクロブロック当たりの実際のビット発生量を求め、VBVバッファの占有量として加算する。
【００４４】
VBVバッファアンダーフロー判定部４２は、ピクチャタイプ決定部１１より入力されるピクチャタイプ、MB数、および、スライス数に基づいて、演算部４３を制御して、VBVバッファの最大値を設定する際のIピクチャ、および、B、または、Pピクチャに対応するマージンを演算させマージンサイズメモリ４４に記憶させる。また、VBVバッファアンダーフロー判定部４２は、マージンサイズメモリ４４に記憶されたマージンサイズに基づいて、VBVバッファの最大値（図４で示した最大値Max２）を設定し、制御部４１により求められたVBVバッファの占有量を監視し、VBVバッファアンダーフローとなるか否かを判定する。そして、VBVバッファアンダーフロー判定部４２は、その判定結果に応じて、VBVバッファアンダーフローの状態に近づくと、アンダーフロー判定結果を重み付け量子化部１５に出力する。
【００４５】
次に、演算部４３によるマージンサイズの演算方法について説明する。
【００４６】
マージンサイズは、最少パラメータを用いた圧縮方法により必ず発生する発生ビット量を設定すればよいことになる。従って、Iピクチャの場合、以下の式（１）で示す発生ビット量がマージンMargin_Ipictureとして演算されることになる。
【００４７】
Margin_Ipicture＝α×GB_MB＋β×GB_HD＋γ×GB_DC ・・・（１）
【００４８】
ここで、GB_MBはチップ内のパイプライン遅延により発生するビット量を、GB_HDは、ヘッダにより発生するビット量を、さらに、GB_DCは、Iピクチャ内のDC成分により発生するビット量をそれぞれ示している。αは、制御することができずに、発生してしまうマクロブロック数（ハードウェアに依存する定数）を示している（ビット発生量を制御するCPUが1 MB毎に発生ビット量を検出するまでの時間内に発生してしまう（制御することができない）MB数）。βは、スライス数、γは、マクロブロック数をそれぞれ示している。
【００４９】
一方、Bピクチャ、または、Pピクチャの場合、以下の式（２）で示す発生ビット量がマージンMargin_B/Ppictureとして演算されることになる。
【００５０】
Margin_B/Ppicture＝α×GB_MB＋β×GB_HD＋β×GB_SMB ・・・（２）
【００５１】
ここで、GB_SMBは、スライス内の第１MBのビット量と最終MBのビット量の和を示している。
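Equations (1) and (2) can be sketched in code as follows. This is a minimal illustration rather than the patent's implementation; the function and parameter names are assumptions chosen to mirror the symbols α, β, γ, GB_MB, GB_HD, GB_DC and GB_SMB.

```python
def margin_i_picture(alpha, beta, gamma, gb_mb, gb_hd, gb_dc):
    """Equation (1): VBV margin for an I picture, in bits.

    alpha: number of macroblocks the hardware cannot control
    beta:  number of slices
    gamma: number of macroblocks
    """
    return alpha * gb_mb + beta * gb_hd + gamma * gb_dc


def margin_bp_picture(alpha, beta, gb_mb, gb_hd, gb_smb):
    """Equation (2): VBV margin for a B or P picture, in bits.

    The last term scales with the slice count beta because each slice
    keeps only its first and last macroblock.
    """
    return alpha * gb_mb + beta * gb_hd + beta * gb_smb
```

The only difference between the two functions is the last term: per-macroblock DC bits for I pictures versus per-slice first/last macroblock bits for B and P pictures.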
【００５２】
例えば、クロマフォーマットがａ：ｂ：ｃである場合、GB_MBは、以下のように求められる。
【００５３】
GB_MB＝（DCT係数の最大ビット量）×（ａ＋ｂ＋ｃ）
×（１ブロックの画素数）＋（MBヘッダの発生ビット数）・・・（３）
【００５４】
ここで、DCT係数の最大ビット量は、例えば、ITU-T Rec.H.262(2000E)に規定されているFirst_DCT_coefficient（非イントラブロック（インターブロック）の第１非零DCT係数）、または、Subsequent_DCT_coefficients（後続のDCT係数）の最大値である。尚、上述の最大値は、インターマクロブロックの値であるが、イントラマクロブロックの値よりは大きなものであるので、全てにおいて、その値を使用するものとしてもよい。ITU-T Rec.H.262(2000E)においては、First_DCT_coefficientが、２乃至２４ビット、Subsequent_DCT_coefficientsが、３乃至２４ビットと規定されているので、２４ビットとするようにしてもよい（ITU-T Rec.H.262(2000E)の第３６頁６．２．６章参照）。
【００５５】
１ブロックの画素数は、MPEG２においては、８画素×８画素であるので６４である。
【００５６】
また、MBヘッダの発生ビット量は、例えば、ITU-T Rec.H.262(2000E)で規定されるMPEG２である場合、macroblock_address_increment（現MBアドレスと前MBアドレスの差）、quantiser_scale_code（MB量子化スケールコード）、marker_bit（マーカ）、macroblock type（マクロブロックのタイプ）、spatial_temporal_weight_code（アップサンプル用の時空間重み付けコード）、frame_motion_type（フレーム構造の動き補償タイプ）、dct_type（DCTのタイプ（フレームorフィールド）、motion_vertical_field_select[0][s]（予測に用いる参照フィールドの選択情報）、motion_vertical_field_select[1][s] （予測に用いる参照フィールドの選択情報）、motion_code[0][s][0]（動きベクトルがmotion_vector(0, s)である場合の基本差分動きベクトル）、motion_residual[0][s][0]（動きベクトルがmotion_vector(0, s)である場合の残差ベクトル）、dmvector[0]（動きベクトルがmotion_vector(0, s)である場合のデュアルプライム用差分ベクトル）、motion_code[0][s][1] （動きベクトルがmotion_vector(0, s)である場合の基本差分動きベクトル）、motion_residual[0][s][1] （動きベクトルがmotion_vector(0, s)である場合の残差ベクトル）、dmvector[1] （動きベクトルがmotion_vector(0, s)である場合のデュアルプライム用差分ベクトル）、motion_code[1][s][0]（動きベクトルがmotion_vector(1, s)である場合の基本差分動きベクトル）、motion_residual[1][s][0] （動きベクトルがmotion_vector(1, s)である場合の残差ベクトル）、dmvector[0]（動きベクトルがmotion_vector(1, s)である場合のデュアルプライム用差分ベクトル）、motion_code[1][s][1] （動きベクトルがmotion_vector(1, s)である場合の基本差分動きベクトル）、motion_residual[1][s][1] （動きベクトルがmotion_vector(1, s)である場合の残差ベクトル）、dmvector[1] （動きベクトルがmotion_vector(1, s)である場合のデュアルプライム用差分ベクトル）、coded_block_pattern_420（コードブロックパターン）、および、coded_block_pattern_2（コードブロックパターン）といったもののそれぞれのビット量の合計である。尚、ｓは、０である場合、前方向予測を示し、１である場合、双方向予測を示すパラメータであるが、いずれであってもよいため、ｓのまま表示されている。
【００５７】
ITU-T Rec.H.262(2000E)の場合、macroblock_address_increment（現MBアドレスと前MBアドレスの差）が1ビット、quantiser_scale_code（MB量子化スケールコード）が５ビット、marker_bit（マーカ）が1ビット、macroblock type（マクロブロックのタイプ）が9ビット、spatial_temporal_weight_code（アップサンプル用の時空間重み付けコード）が２ビット、frame_motion_type（フレーム構造の動き補償タイプ）が２ビット、dct_type（DCTのタイプ（フレームorフィールド）が1ビット、motion_vertical_field_select[0][s]（予測に用いる参照フィールドの選択情報）が1ビット、motion_vertical_field_select[1][s] （予測に用いる参照フィールドの選択情報）が1ビット、motion_code[0][s][0]（動きベクトルがmotion_vector(0, s)である場合の基本差分動きベクトル）が１１ビット、motion_residual[0][s][0]（動きベクトルがmotion_vector(0, s)である場合の残差ベクトル）が８ビット、dmvector[0]（動きベクトルがmotion_vector(0, s)である場合のデュアルプライム用差分ベクトル）が２ビット、motion_code[0][s][1] （動きベクトルがmotion_vector(0, s)である場合の基本差分動きベクトル）が１１ビット、motion_residual[0][s][1] （動きベクトルがmotion_vector(0, s)である場合の残差ベクトル）が８ビット、dmvector[1] （動きベクトルがmotion_vector(0, s)である場合のデュアルプライム用差分ベクトル）が２ビット、motion_code[1][s][0]（動きベクトルがmotion_vector(1, s)である場合の基本差分動きベクトル）が１１ビット、motion_residual[1][s][0] （動きベクトルがmotion_vector(1, s)である場合の残差ベクトル）が８ビット、dmvector[0]（動きベクトルがmotion_vector(1, s)である場合のデュアルプライム用差分ベクトル）が２ビット、motion_code[1][s][1] （動きベクトルがmotion_vector(1, s)である場合の基本差分動きベクトル）が１１ビット、motion_residual[1][s][1] （動きベクトルがmotion_vector(1, s)である場合の残差ベクトル）が１１ビット、dmvector[1] （動きベクトルがmotion_vector(1, s)である場合のデュアルプライム用差分ベクトル）が２ビット、coded_block_pattern_420（コードブロックパターン）が９ビット、coded_block_pattern_2（コードブロックパターン）が６ビットにそれぞれ規定されているので、その合計である１２０ビットとするようにしてもよい（ITU-T Rec.H.262(2000E)の第３３乃至３６頁６．２．５章参照）。
【００５８】
チップ内のパイプライン遅延により発生するビット量GB_MBは、例えば、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、クロマフォーマットが４：２：０のとき、以下に示す式（４）のように求められる。
【００５９】
GB_MB＝２４×（４＋２＋０）×６４＋１２０＝９３３６（ビット）・・・（４）
【００６０】
また、クロマフォーマットが４：２：２のときは、以下に示す式（５）のように求められる。
【００６１】
GB_MB＝２４×（４＋２＋２）×６４＋１２０＝１２４０８（ビット）・・・（５）
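The calculation of GB_MB in equations (3) to (5) can be checked with a short sketch. The constant and function names below are assumptions for illustration; the numeric values (24 bits per DCT coefficient, 64 pixels per block, 120 header bits) are the ones stated in the text.

```python
BLOCK_PIXELS = 8 * 8        # one MPEG-2 block is 8x8 pixels
MAX_DCT_COEFF_BITS = 24     # upper bound on one DCT coefficient used in the text
MB_HEADER_BITS = 120        # macroblock header total used in the text


def gb_mb(chroma_format):
    """Equation (3) for a chroma format given as a tuple (a, b, c)."""
    a, b, c = chroma_format
    return MAX_DCT_COEFF_BITS * (a + b + c) * BLOCK_PIXELS + MB_HEADER_BITS


print(gb_mb((4, 2, 0)))  # equation (4)
print(gb_mb((4, 2, 2)))  # equation (5)
```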
【００６２】
さらに、GB_HDは、ヘッダのビット量であるので、例えば、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合（Vertical_size≦1088であるものとすると）、slice_start_code（スライススタートコード＋スライス垂直位置）、q_scale_code（量子化スケールコード）、および、extra_bit_slice（スライスの拡張ビット）などの合計ビット量となる。尚、ITU-T Rec.H.262(2000E)において、slice_start_codeは32ビット、q_scale_codeは５ビット、extra_bit_sliceは１ビットに、それぞれ規定されているので、その合計ビット量である、GB_HD＝３８ビットがヘッダのビット量GB_HDの固定値とされるようにしてもよい（ITU-T Rec.H.262(2000E)の第３２ページ６．２．４章参照）。
【００６３】
また、GB_DCは、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、クロマフォーマットがａ：ｂ：ｃのとき、Iピクチャであることからpicture_coding_type＝I pictureとなり、motion vectorは存在しないことを前提として以下の式（６）のように定義される。
【００６４】
GB_DC＝macroblock_address_increment＋macroblock_mode
＋dct_dc_size_luminance×ａ
＋dct_dc_diff_luminance×ａ
＋dct_dc_size_chrominance×（ｂ＋ｃ）
＋dct_dc_diff_chrominance×（ｂ＋ｃ）
＋end_of_block×（ａ＋ｂ＋ｃ）・・・（６）
【００６５】
ここで、macroblock_address_incrementは、現MBアドレスと前MBアドレスの差であり、macroblock_modeは、macroblock_typeとdct_typeの和であり、それぞれ、マクロブロックの符号化タイプと、DCTタイプを示している。
【００６６】
また、dct_dc_size_luminanceは、DCT輝度DC係数差分サイズを、dct_dc_diff_luminanceは、DCT輝度DC係数差分値を、dct_dc_size_chrominanceは、DCT色差DC係数差分サイズを、dct_dc_diff_chrominanceは、DCT色差DC係数差分値を、end_of_blockは、ブロック内DCT係数終了フラグをそれぞれ示している。
【００６７】
例えば、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、macroblock_address_incrementは1ビットに、macroblock_modeは2ビットに、dct_dc_size_luminanceは9ビットに、dct_dc_diff_luminanceは１１ビットに、dct_dc_size_chrominanceは１０ビットに、dct_dc_diff_chrominanceは１１ビットに、end_of_blockは4ビットにそれぞれ規定されている（ITU-T Rec.H.262(2000E)の第３３，３４，３６頁６．２．５章、６．２．５．１章、６．２．６章参照）。このため、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、クロマフォーマットが４：２：０のとき、DC成分GB_DCは、以下の式（７）のように演算される。
【００６８】
GB_DC＝１＋２＋９×４＋１１×４＋１０×２＋１１×２＋４×６
＝１４９・・・（７）
【００６９】
また、クロマフォーマットが４：２：２のとき、DC成分GB_DCは、以下の式（８）のように演算される。
【００７０】
GB_DC＝１＋２＋９×４＋１１×４＋１０×４＋１１×４＋４×８
＝１９９・・・（８）
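Equations (6) to (8) can be reproduced with the sketch below. The constant names are assumptions that follow the syntax element names in the text, and the bit widths are the ones the text lists for MPEG-2.

```python
# Bit widths as listed in the text for MPEG-2 (ITU-T Rec. H.262).
MACROBLOCK_ADDRESS_INCREMENT = 1
MACROBLOCK_MODE = 2
DCT_DC_SIZE_LUMINANCE = 9
DCT_DC_DIFF_LUMINANCE = 11
DCT_DC_SIZE_CHROMINANCE = 10
DCT_DC_DIFF_CHROMINANCE = 11
END_OF_BLOCK = 4


def gb_dc(chroma_format):
    """Equation (6): bits for a DC-only intra macroblock, chroma format (a, b, c)."""
    a, b, c = chroma_format
    return (
        MACROBLOCK_ADDRESS_INCREMENT
        + MACROBLOCK_MODE
        + (DCT_DC_SIZE_LUMINANCE + DCT_DC_DIFF_LUMINANCE) * a
        + (DCT_DC_SIZE_CHROMINANCE + DCT_DC_DIFF_CHROMINANCE) * (b + c)
        + END_OF_BLOCK * (a + b + c)
    )


print(gb_dc((4, 2, 0)))  # equation (7)
print(gb_dc((4, 2, 2)))  # equation (8)
```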
【００７１】
さらに、スライス内の第１MBのビット量と最終MBのビット量の和GB_SMBは、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、以下の式（９）のように定義される。
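【００７２】
GB_SMB＝macroblock_escape_F＋macroblock_address_increment_F
＋q_scale_code＋macroblock_type_F
＋frame(or field)_motion_type_F＋motion_vector_F
＋macroblock_escape_L＋macroblock_address_increment_L
＋macroblock_type_L＋frame(or field)_motion_type_L
＋motion_vector_L・・・（９）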
【００７３】
ここで、macroblock_escape_Fは第１MBのMBアドレス拡張用のビット量を、macroblock_address_increment_Fは第１MBの現MBアドレスと前MBアドレスの差を、q_scale_codeは第１MBのｑスケールコードを、macroblock_type_Fは第１MBのMB符号化タイプを、frame(or field)_motion_type_Fは、フレーム構造の（または、フィールド構造の）動き補償タイプを、motion_vector_Fは、第１MBの動きベクトルのビット量を示しており、_Lが_Fに代えて付されているものは、対応するそれぞれの最終MBのビット量を示している。
【００７４】
ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、macroblock_escape_Fは０ビットに、macroblock_address_increment_Fは１ビットに、q_scale_codeは５ビットに（sliceの第１MBはq_scale_codeが必要となる）、macroblock_type_Fは６ビットに（ITU-T Rec.H.262(2000E)の第122頁の Table B.2乃至B.4より）、frame_motion_type_Fまたはfield_motion_type_Fは２ビットに、motion_vector_Fは３ビット（motion_vertical_field_select[0][s]、motion_code[r][s][0]、および、motion_code[r][s][1]がそれぞれ１ビットずつ）に、macroblock_escape_Lは１１ビットに、macroblock_address_increment_Lは８ビットに、macroblock_type_Lは３ビットに、frame_motion_type_L、または、field_motion_type_Lは２ビットに、motion_vector_Lは３ビットに（motion_vertical_field_select[0][s]、motion_code[r][s][0]および、motion_code[r][s][1]が１ビットずつ）それぞれ、規定されている（ITU-T Rec.H.262(2000E)の第３３，３４頁６．２．５章、６．２．５．１章、６．２．５．２章参照）。
【００７５】
このため、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、スライス内の第１MBのビット量と最終MBのビット量の和GB_SMBは、以下の式（１０）のように求められる。
【００７６】
GB_SMB＝０＋１＋５＋６＋２＋３＋１１＋８＋３＋２＋３
＝４４・・・（１０）
【００７７】
式（１０）より、スライス内の第１MBのビット量と最終MBのビット量の和GB_SMBは、４４ビットの固定値となる。
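The fixed value of 44 bits can be confirmed by summing the per-field bit widths given in the text for the first and last macroblock of a slice. The dictionary layout and key names below are assumptions made for readability.

```python
# Bit widths for the first macroblock of a slice, as listed in the text.
FIRST_MB_BITS = {
    "macroblock_escape": 0,
    "macroblock_address_increment": 1,
    "q_scale_code": 5,          # only the first MB of a slice carries this
    "macroblock_type": 6,
    "motion_type": 2,           # frame_motion_type or field_motion_type
    "motion_vector": 3,
}

# Bit widths for the last macroblock of a slice, as listed in the text.
LAST_MB_BITS = {
    "macroblock_escape": 11,
    "macroblock_address_increment": 8,
    "macroblock_type": 3,
    "motion_type": 2,
    "motion_vector": 3,
}

GB_SMB = sum(FIRST_MB_BITS.values()) + sum(LAST_MB_BITS.values())
print(GB_SMB)
```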
【００７８】
以上をまとめると、例えば、ITU-T Rec.H.262(2000E)により規定されるMPEG２の場合、クロマフォーマットが、４：２：０、および、４：２：２であるとき、Iピクチャのマージンは、式（１）に基づいて、以下の式（１１），式（１２）で、それぞれ求められることになる。
【００７９】
Margin_Ipicture＝９３３６α＋３８β＋１４９γ ・・・（１１）
Margin_Ipicture＝１２４０８α＋３８β＋１９９γ ・・・（１２）
【００８０】
また、同様にして、Bピクチャ、または、Pピクチャのマージンは、クロマフォーマットが、４：２：０、および、４：２：２であるとき、式（２）に基づいて、以下の式（１３），式（１４）で、それぞれ求められることになる。
【００８１】
Margin_B/Ppicture＝９３３６α＋３８β＋４４β ・・・（１３）
Margin_B/Ppicture＝１２４０８α＋３８β＋４４β ・・・（１４）
【００８２】
このように、マージンは、クロマフォーマットが確定すると、ハードウェアにより制御できないMB数α、スライス数β、および、MB数γに基づいて、算出することが可能となる。ここで、MB数αは、ハードウェア固有の値として設定されるため、その値は、予め設定される。
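As a worked example of the 4:2:0 case, the sketch below evaluates both margins for one picture size using the coefficients derived in the text (9336, 38, 149 and 44 bits). The 720×480 picture size, the choice of one slice per macroblock row, and α = 2 are hypothetical values picked only for illustration; the text does not specify them.

```python
def macroblock_grid(width, height):
    """Return (columns, rows) of 16x16 macroblocks covering the picture."""
    return (width + 15) // 16, (height + 15) // 16


ALPHA = 2  # hypothetical hardware-dependent constant; not given in the text

cols, rows = macroblock_grid(720, 480)   # hypothetical picture size
beta = rows                              # assumption: one slice per macroblock row
gamma = cols * rows                      # total macroblocks in the picture

# 4:2:0 coefficients: GB_MB = 9336, GB_HD = 38, GB_DC = 149, GB_SMB = 44
margin_i = 9336 * ALPHA + 38 * beta + 149 * gamma
margin_bp = 9336 * ALPHA + 38 * beta + 44 * beta

print(margin_i, margin_bp)
```

Under these assumptions the I-picture margin is roughly ten times the B/P margin, because the DC term grows with the macroblock count while the skipped-macroblock term grows only with the slice count.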
【００８３】
また、演算部４３は、Iピクチャ、並びに、Bピクチャ、または、Pピクチャ毎に、クロマフォーマットに対応した係数を予め内蔵するメモリに記憶しており、ピクチャタイプ決定部１１より入力されるクロマフォーマットとピクチャタイプの情報に基づいてその係数を読み出すと共に、同時にピクチャタイプ決定部１１より送信されてくるスライス数、および、MB数の情報に基づいて、マージンを計算し、マージンサイズメモリ４４に記憶させる。
【００８４】
１つのストリームにおいては、スライス数やMB数は変化しないため、演算部４３は、Iピクチャ、並びに、Pピクチャ、または、Bピクチャが最初に入力されたときにのみ、マージンの計算を行ってマージンサイズメモリ４４に記憶させる。VBVバッファアンダーフロー判定部４２は、上述のように演算されてマージンサイズメモリ４４に記憶されているマージンを用いてVBVバッファアンダーフローの有無を判定する。
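The compute-once behaviour of the margin size memory 44 can be sketched as a small cache keyed by picture type. The class name, method names, and the `computations` counter are assumptions added for illustration; the patent describes the behaviour, not this interface.

```python
class MarginSizeMemory:
    """Caches one margin for I pictures and one shared margin for B/P pictures."""

    def __init__(self, compute_margin):
        # compute_margin is a caller-supplied function taking "I" or "BP".
        self._compute_margin = compute_margin
        self._margins = {}
        self.computations = 0  # counts how often the margin was actually computed

    def get(self, picture_type):
        key = "I" if picture_type == "I" else "BP"
        if key not in self._margins:          # first picture of this type
            self._margins[key] = self._compute_margin(key)
            self.computations += 1
        return self._margins[key]

    def reset(self):
        """Clear the stored margins at the end of a stream."""
        self._margins.clear()


memory = MarginSizeMemory(lambda key: {"I": 220962, "BP": 21132}[key])
for picture_type in "IBBPBBP":
    memory.get(picture_type)
print(memory.computations)
```

The margin values passed in the usage lines are placeholder numbers, not figures from the patent.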
【００８５】
次に、図８のフローチャートを参照して、レート制御部３によるVBVバッファアンダーフロー監視処理について説明する。
【００８６】
ステップＳ１において、レート制御部３のVBVバッファアンダーフロー判定部４２は、ピクチャタイプ決定部１１より入力されてくるピクチャタイプ、MB数、スライス数、および、クロマフォーマットの情報を取得する。
【００８７】
ステップＳ２において、VBVバッファアンダーフロー判定部４２は、マージンサイズメモリ４４に、既に、マージンサイズが演算されているか否かを判定する。このとき、VBVバッファアンダーフロー判定部４２は、ステップＳ１において、取得したピクチャタイプのマージンサイズが既に演算されて、記憶されているか否かを判定する。
【００８８】
ステップＳ２において、マージンサイズが演算されていないと判定された場合、ステップＳ３において、VBVバッファアンダーフロー判定部４２は、演算部４３に取得したピクチャタイプ、MB数、スライス数、および、クロマフォーマットの情報を供給し、上述の式（１）、または、式（２）の演算を実行させ、演算結果をマージンサイズメモリ４４に記憶させる。
【００８９】
ステップＳ４において、VBVバッファアンダーフロー判定部４２は、入力された画像がIピクチャであるか否かを判定し、例えば、Iピクチャであると判定された場合、その処理は、ステップＳ５に進む。
【００９０】
ステップＳ５において、VBVバッファアンダーフロー判定部４２は、マージンサイズメモリ４４に記憶されているIピクチャのマージンサイズを読み出す。
【００９１】
ステップＳ６において、VBVバッファアンダーフロー判定部４２は、VBVバッファの占有量が、引き出されるピクチャの最大発生ビット量MaxGenBitとマージンとの和よりも大きいか否かを判定し、例えば、VBVバッファの占有量が、引き出されるピクチャの最大発生ビット量MaxGenBitとマージンとの和よりも大きくない、すなわち、VBVバッファの占有量が、引き出されるピクチャの最大発生ビット量MaxGenBitとマージンとの和よりも小さいと判定された場合、その処理は、ステップＳ７に進む。
【００９２】
ステップＳ７において、VBVバッファアンダーフロー判定部４２は、VBVバッファアンダーフローが発生するとみなし、重み付け量子化部１５に対して、最少パラメータ処理を実行するように指令する。
【００９３】
ステップＳ８において、VBVバッファアンダーフロー判定部４２は、次のピクチャが存在するか否かを判定し、次のピクチャが存在する場合、その処理は、ステップＳ１に戻り、次のピクチャが存在しない場合、ステップＳ９において、マージンサイズメモリ４４をリセットして、その処理を終了する。
【００９４】
また、ステップＳ２において、マージンサイズが演算されている、すなわち、Ｉピクチャが入力された場合、Ｉピクチャのマージンサイズが、また、Ｐピクチャ、または、Ｂピクチャが入力された場合、Ｐピクチャ、または、Ｂピクチャのマージンサイズが、既に、演算されているとき、ステップＳ３の処理は、スキップされる。
【００９５】
さらに、ステップＳ４において、Ｉピクチャではない、すなわち、Ｂピクチャ、または、Ｐピクチャであった場合、ステップＳ１０において、VBVバッファアンダーフロー判定部４２は、マージンサイズメモリ４４に記憶されているＢピクチャ、または、Ｐピクチャのマージンサイズを読み出す。
【００９６】
また、ステップＳ６において、VBVバッファの占有量が、引き出されるピクチャの最大発生ビット量MaxGenBitとマージンとの和よりも大きいと判定された場合、ステップＳ７の処理はスキップされ、その処理は、ステップＳ８に進む。
【００９７】
すなわち、以上の処理をまとめると、ステップＳ３により、演算部４３が、Ｉピクチャ、もしくは、Ｐピクチャ、または、Ｂピクチャのマージンサイズを演算する。このとき、演算部４３は、上述のように、クロマフォーマットに対応する係数GB_MB、GB_HD、GB_DCを予め記憶しているので、ステップＳ１の処理で取得されるクロマフォーマットとピクチャタイプの情報に応じて選択し、さらに、予め設定されている、発生してしまうMB数αと、ステップＳ１の処理で取得されたスライス数β、および、MB数γを用いて、式（１）、または、式（２）を演算してマージンサイズを演算する。
【００９８】
ただし、ステップＳ２の処理により、マージンサイズメモリ４４に記憶されたピクチャタイプのマージンサイズは、ステップＳ３がスキップされることにより、それ以降の処理においては、演算されないことになるので、１ストリームの処理については、ピクチャタイプ毎に１回だけ演算されることになる。
【００９９】
ステップＳ５，Ｓ１０において、ピクチャタイプ毎にマージンサイズが読み出される。そして、ステップＳ６において、最大発生ビット量MaxGenBitと読み出されたマージンサイズの和が、VBVバッファの占有量と比較され、VBVバッファの占有量が、引き出されるピクチャの最大発生ビット量MaxGenBitとマージンの和よりも小さい場合、最少パラメータ処理がなされるように指令が出される。
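The decision made in steps S4 to S7 of the flowchart can be sketched as a loop over pictures. The function name, the tuple layout of the input, and the margin values in the usage lines are assumptions. Treating the case where the occupancy exactly equals MaxGenBit plus the margin as an underflow follows the "not greater than" wording of step S6.

```python
def monitor_pictures(pictures, margins):
    """Return, per picture, whether minimum parameter processing is commanded.

    pictures: iterable of (picture_type, vbv_occupancy, max_gen_bit) in bits
    margins:  dict with keys "I" and "BP" holding the stored margin sizes
    """
    commands = []
    for picture_type, vbv_occupancy, max_gen_bit in pictures:
        # Steps S4, S5, S10: pick the stored margin for this picture type.
        margin = margins["I"] if picture_type == "I" else margins["BP"]
        # Step S6: is the occupancy greater than MaxGenBit + margin?
        safe = vbv_occupancy > max_gen_bit + margin
        # Step S7: if not, command minimum parameter processing.
        commands.append(not safe)
    return commands


example = [("I", 300000, 50000), ("B", 60000, 50000), ("P", 80000, 50000)]
print(monitor_pictures(example, {"I": 220962, "BP": 21132}))
```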
【０１００】
結果として、これまで、経験的に設定されていたマージンが、ストリーム毎に発生される最大のビット量として演算されて設定されることになるので、過不足のないマージン設定が可能となるため、VBVバッファを最大限利用することができ、VBVバッファのアンダーフローの発生をほぼ１００％抑制することが可能となる。
【０１０１】
上述した一連の処理は、ハードウェアにより実行させることもできるが、ソフトウェアにより実行させることもできる。一連の処理をソフトウェアにより実行させる場合には、そのソフトウェアを構成するプログラムが、専用のハードウェアに組み込まれているコンピュータ、または、各種のプログラムをインストールすることで、各種の機能を実行させることが可能な、例えば汎用のパーソナルコンピュータなどに記録媒体からインストールされる。
【０１０２】
図９は、エンコーダをソフトウェアにより実現する場合のパーソナルコンピュータの一実施の形態の構成を示している。パーソナルコンピュータのCPU１０１は、パーソナルコンピュータの全体の動作を制御する。また、CPU１０１は、バス１０４および入出力インタフェース１０５を介してユーザからキーボードやマウスなどからなる入力部１０６から指令が入力されると、それに対応してROM(Read Only Memory)１０２に格納されているプログラムを実行する。あるいはまた、CPU１０１は、ドライブ１１０に接続された磁気ディスク１１１、光ディスク１１２、光磁気ディスク１１３、または半導体メモリ１１４から読み出され、記憶部１０８にインストールされたプログラムを、RAM(Random Access Memory)１０３にロードして実行する。これにより、上述したエンコーダの機能が、ソフトウェアにより実現されている。さらに、CPU１０１は、通信部１０９を制御して、外部と通信し、データの授受を実行する。
【０１０３】
プログラムが記録されている記録媒体は、図９に示すように、コンピュータとは別に、ユーザにプログラムを提供するために配布される、プログラムが記録されている磁気ディスク１１１（フレキシブルディスクを含む）、光ディスク１１２（CD-ROM(Compact Disk-Read Only Memory)，DVD（Digital Versatile Disk）を含む）、光磁気ディスク１１３（MD（Mini-Disc）を含む）、もしくは半導体メモリ１１４などよりなるパッケージメディアにより構成されるだけでなく、コンピュータに予め組み込まれた状態でユーザに提供される、プログラムが記録されているROM１０２や、記憶部１０８に含まれるハードディスクなどで構成される。
【０１０４】
尚、本明細書において、記録媒体に記録されるプログラムを記述するステップは、記載された順序に沿って時系列的に行われる処理は、もちろん、必ずしも時系列的に処理されなくとも、並列的あるいは個別に実行される処理を含むものである。
【０１０５】
【発明の効果】
本発明によれば、VBVバッファを最大限利用することができ、VBVバッファアンダーフローをほぼ１００％抑制することが可能となる。
【図面の簡単な説明】
【図１】 VBVバッファを説明する図である。
【図２】 VBVバッファアンダーフローを説明する図である。
【図３】 VBVバッファのマージンを説明する図である。
【図４】 VBVバッファのマージンを説明する図である。
【図５】本発明を適用したエンコーダの構成を示すブロック図である。
【図６】図５のエンコーダによる処理を説明する図である。
【図７】図５のレート制御部の構成を示すブロック図である。
【図８】図７のレート制御部によるVBVバッファアンダーフロー監視処理を説明するフローチャートである。
【図９】媒体を説明する図である。
【符号の説明】
１基本符号化処理部，２エンコーダ送出バッファ，３レート制御部，
１１ピクチャタイプ決定部，１２走査変換部，１４ DCT変換部，
１５重み付け量子化部，１６可変長符号化部，２０モード判定部，
４１制御部，４２ VBVバッファアンダーフロー判定部，４３演算部，
４４マージンサイズメモリ

[0001]
[Technical field of the invention]
The present invention relates to an image processing apparatus and method, a recording medium, and a program, and in particular to an image processing apparatus and method, a recording medium, and a program capable of suppressing VBV (Video Buffer Verifier) buffer underflow when image information is compressed by the MPEG (Moving Picture Experts Group) method.
[0002]
[Prior art]
In recent years, an image compression / decompression technique based on the MPEG method has been widely used.
[0003]
When an image is compressed by the MPEG method, an MPEG bit stream is generated, and the generated MPEG bit stream is transmitted to a receiving device via a transmission path. The transmitting device, which compresses the image by the MPEG method and transmits it, does so while monitoring whether the MPEG bit stream about to be transmitted can be adequately reproduced by the receiving device.
[0004]
More specifically, as shown in FIG. 1, the transmitting device, that is, the encoder, uses the VBV buffer to virtually reproduce the state of the buffer that the receiving device (decoder) uses when receiving the MPEG bit stream, and encodes while monitoring so that neither buffer overflow nor buffer underflow occurs.
[0005]
That is, the occupancy of the VBV buffer changes as shown in FIG. 2. In FIG. 2, the vertical axis indicates the occupancy of the VBV buffer and the horizontal axis indicates time, so the slope of the straight lines in FIG. 2 indicates the bit rate. When the MPEG bit stream fills the VBV buffer up to its maximum capacity (Max1 in the figure), the occupancy is in state S1. When the MPEG bit stream for one picture (an I picture (intra picture), a P picture (forward-predicted picture), or a B picture (bidirectionally predicted picture)) is extracted from the VBV buffer in state S1 for decoding, the state changes from state S1 to state S2. If the amount generated for one picture does not exceed the lower limit of the VBV buffer, the decoder can receive all of the data for one picture and can therefore decode it normally.
[0006]
Subsequently, after the time T (= 1 / frame rate) has elapsed and more of the MPEG bit stream has entered the VBV buffer, the occupancy reaches state S3. Suppose the MPEG bit stream for one picture is now extracted from the VBV buffer and the data amount for that picture is (L1 + L2) in the figure. In state S3 the VBV buffer holds only the data amount L1, so, as state S4 shows, the MPEG bit stream falls short of one picture by the data amount L2 and is interrupted. As a result, the decoder on the receiving side cannot finish decoding the picture. This condition, in which the amount of data stored in the VBV buffer is less than one picture and decoding becomes impossible, is called VBV buffer underflow.
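The fill-and-extract behaviour described for FIG. 2 can be sketched as a simple simulation. This is a simplified model under stated assumptions (constant bit rate, instantaneous picture removal, buffer initially full); the function name and parameters are illustrative and not taken from the patent or from the MPEG specification.

```python
def simulate_vbv(picture_bits, bit_rate, frame_rate, vbv_size):
    """Simulate VBV occupancy and report the first underflow.

    picture_bits: bits consumed by each picture, in decoding order
    Returns (occupancy_before_each_removal, index_of_underflow_or_None).
    """
    occupancy = vbv_size                    # state S1: buffer filled to its maximum
    bits_per_period = bit_rate / frame_rate  # bits arriving during time T
    history = []
    for index, bits in enumerate(picture_bits):
        history.append(occupancy)
        if bits > occupancy:                # picture needs L1 + L2 but only L1 is stored
            return history, index           # VBV buffer underflow
        occupancy -= bits                   # picture extracted for decoding
        occupancy = min(vbv_size, occupancy + bits_per_period)
    return history, None


history, underflow_at = simulate_vbv([800, 700], bit_rate=3000, frame_rate=10, vbv_size=1000)
print(history, underflow_at)
```

In the usage lines the second picture needs 700 bits while only 500 are buffered, which corresponds to L1 = 500 and L2 = 200 in the terms of FIG. 2. The numbers are made up for illustration.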
[0007]
To prevent VBV buffer underflow, the encoder controls the stream at the time of state S3 so that the amount generated for one picture stays within the maximum generated bit amount (MaxGenBit = L1).
[0008]
As one method of encoding without causing VBV buffer underflow (that is, encoding while controlling the stream so that the amount generated for one picture stays within the maximum generated bit amount (MaxGenBit = L1)), it has been proposed to suppress the bits generated for one frame by reducing the parameters produced by encoding to the bare minimum (minimum parameters) in units of macroblocks (MB) (hereinafter referred to as minimum parameter processing). Minimum parameter processing differs by picture type: for an I picture, every MB is reduced to only its DC (Direct Current) component; for a B picture or a P picture, every MB is made a skipped macroblock (skipped MB). Here, skipped macroblocks are data consisting only of the first macroblock header (MB header) and the last MB header of each slice in each picture.
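The picture-type dependence of minimum parameter processing can be sketched as a small selector. The enum, the function name, and the returned mode strings are hypothetical labels introduced only for this illustration.

```python
from enum import Enum


class PictureType(Enum):
    I = "I"
    P = "P"
    B = "B"


def minimum_parameter_mode(picture_type):
    """Return which minimum parameter processing applies to a picture type."""
    if picture_type is PictureType.I:
        # I picture: keep only the DC component of every macroblock.
        return "dc_only"
    # P or B picture: turn every macroblock into a skipped macroblock.
    return "skipped_macroblocks"


for picture_type in PictureType:
    print(picture_type.name, minimum_parameter_mode(picture_type))
```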
[0009]
There is also a technique that enables encoding at a target bit rate while stabilizing the quality of the reproduced image (see, for example, Patent Document 1).
[0010]
[Patent Document 1]
JP-A-10-155152
[0011]
[Problems to be solved by the invention]
However, even when the generated bit amount is minimized by the minimum parameter processing described above, it cannot be reduced to zero. Consequently, if compression with the minimum parameters is performed only once the occupancy of the VBV buffer has fallen to the data amount L1 shown in FIG. 2 or below, the bits generated for one picture may still not fit within the maximum generated bit amount L1, and a shortfall of the data amount L2 relative to one picture may result.
[0012]
Therefore, as shown in FIG. 3, a method has been proposed in which a margin is provided in the VBV buffer and the sum of the margin and the maximum generated bit amount is compared with the VBV buffer occupancy; when the generated bit amount would exceed the margin plus the maximum generated bit amount, the minimum parameter processing described above is executed to avoid VBV buffer underflow.
[0013]
That is, as shown in FIG. 4, the maximum capacity of the VBV buffer is set to Max2 (= Max1 + margin). In state S1', the MPEG bit stream is stored up to the maximum capacity Max2, and when the data for one picture is extracted, the state changes to state S2'. As before, after the time T has elapsed the buffer reaches state S3'. When the data for one picture is extracted again, the occupancy of the VBV buffer above the margin level is the data amount L1, which is less than the data amount of the MPEG bit stream for one picture, so compression with the minimum parameters is performed. Because the margin has been allowed for, the data amount L2 generated with the minimum parameters is absorbed by the margin, and VBV buffer underflow does not occur.
[0014]
In the margin setting described above, the minimum parameters are obtained in advance from a plurality of predetermined setting images, the bit amount generated with those minimum parameters is determined by a so-called tuning process, and the largest value is set as the margin. However, a margin set by such a tuning process may not cope with the data amount of the MPEG bit stream of an unknown image that is not among the known setting images. For such an unknown image, the bit amount generated with the minimum parameters may exceed the value set as the margin, so the occurrence of VBV buffer underflow cannot always be suppressed.
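The weakness of the tuning approach can be illustrated with a short sketch: the margin is the maximum observed over the tuning set, so any unseen image that produces more bits is not covered. The function names and all bit counts below are made-up illustrative values, not measurements.

```python
def tuned_margin(min_parameter_bits_per_image):
    """Tuning process: take the largest minimum-parameter bit count observed."""
    return max(min_parameter_bits_per_image)


def margin_is_sufficient(margin, min_parameter_bits):
    """True if the margin absorbs the bits generated with minimum parameters."""
    return min_parameter_bits <= margin


tuning_set = [18000, 20500, 19750]      # hypothetical bit counts from tuning images
margin = tuned_margin(tuning_set)
unseen_image_bits = 21132               # hypothetical bit count for an unseen image
print(margin, margin_is_sufficient(margin, unseen_image_bits))
```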
[0015]
In addition, because the margin varies with the images used for tuning, it may be set larger than necessary. In that case VBV buffer underflow can be suppressed, but the configured capacity of the VBV buffer may not be fully utilized.
[0016]
The present invention has been made in view of such a situation, and makes it possible to use a VBV buffer to the maximum extent and to suppress a VBV buffer underflow.
[0017]
[Means for Solving the Problems]
  An image processing apparatus according to the present invention comprises: determination means for determining the picture type of an input image; calculation means for calculating a margin of the VBV buffer, when the picture type of the image is an I picture, as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by a header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture, and, when the picture type of the image is a P picture or a B picture, as the sum of the products of the predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice; storage means for storing the margin of the VBV buffer for each picture type calculated by the calculation means; compression means for compressing the image by the MPEG method; underflow detection means for detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed by the compression means and the margin of the VBV buffer for the picture type determined by the determination means is larger than the occupancy of the VBV buffer; and control means for controlling the compression means, when the underflow detection means detects a VBV buffer underflow, so as to compress the image with the minimum parameters by reducing all macroblocks to their DC components when the picture type is an I picture and by making all macroblocks skipped macroblocks when the picture type is a P picture or a B picture.
[0021]
An image processing method of the present invention includes: a determination step of determining the picture type of an input image; a calculation step of calculating a margin of the VBV buffer, in which, when the picture type of the image is an I picture, the margin is calculated as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture, and in which, when the picture type of the image is a P picture or B picture, the margin is calculated as the sum of the products of the predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice; a storage step of storing the margin of the VBV buffer for each picture type calculated in the calculation step; a compression step of compressing the image in the MPEG format; an underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the compression step and the margin of the VBV buffer for the picture type determined in the determination step is larger than the VBV buffer occupancy; and a control step of controlling the processing of the compression step, when a VBV buffer underflow is detected in the underflow detection step, so as to compress the image with the minimum parameters, by reducing all macroblocks to their DC components when the picture type is an I picture and by making all macroblocks skipped macroblocks when the picture type is a P picture or B picture.
[0022]
The program recorded on the recording medium of the present invention includes: a determination step of determining the picture type of an input image; a calculation step of calculating a margin of the VBV buffer, in which, when the picture type of the image is an I picture, the margin is calculated as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture, and in which, when the picture type of the image is a P picture or B picture, the margin is calculated as the sum of the products of the predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice; a storage step of storing the margin of the VBV buffer for each picture type calculated in the calculation step; a compression step of compressing the image in the MPEG format; an underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the compression step and the margin of the VBV buffer for the picture type determined in the determination step is larger than the VBV buffer occupancy; and a control step of controlling the processing of the compression step, when a VBV buffer underflow is detected in the underflow detection step, so as to compress the image with the minimum parameters, by reducing all macroblocks to their DC components when the picture type is an I picture and by making all macroblocks skipped macroblocks when the picture type is a P picture or B picture.
[0023]
The program of the present invention causes a computer to execute: a determination step of determining the picture type of an input image; a calculation step of calculating a margin of the VBV buffer, in which, when the picture type of the image is an I picture, the margin is calculated as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture, and in which, when the picture type of the image is a P picture or B picture, the margin is calculated as the sum of the products of the predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice; a storage step of storing the margin of the VBV buffer for each picture type calculated in the calculation step; a compression step of compressing the image in the MPEG format; an underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the compression step and the margin of the VBV buffer for the picture type determined in the determination step is larger than the VBV buffer occupancy; and a control step of controlling the processing of the compression step, when a VBV buffer underflow is detected in the underflow detection step, so as to compress the image with the minimum parameters, by reducing all macroblocks to their DC components when the picture type is an I picture and by making all macroblocks skipped macroblocks when the picture type is a P picture or B picture.
[0024]
In the image processing apparatus and method and the program of the present invention, the picture type of the input image is determined. When the picture type of the image is an I picture, the margin of the VBV buffer is calculated as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture. When the picture type of the image is a P picture or B picture, the margin of the VBV buffer is calculated as the sum of the products of the predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice. The calculated margin of the VBV buffer is stored for each picture type, and the image is compressed in the MPEG format. A VBV buffer underflow is detected when the sum of the maximum generated bit amount of the compressed image information and the margin of the VBV buffer for the determined picture type is larger than the VBV buffer occupancy. When a VBV buffer underflow is detected, the image is controlled to be compressed with the minimum parameters, by reducing all macroblocks to their DC components when the picture type is an I picture and by making all macroblocks skipped macroblocks when the picture type is a P picture or B picture.
[0025]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 5 shows a configuration of an embodiment of a basic encoding processing unit 1 of an encoder to which the present invention is applied.
[0026]
The basic encoding processing unit 1 of the encoder shown in FIG. 5 encodes image data by the MPEG method, and outputs the encoded data to the encoder transmission buffer 2. The rate control unit 3 controls the quantization width q_scale when the basic encoding processing unit 1 encodes image data by the MPEG method.
[0027]
Here, the outline of the MPEG encoding process will be described with reference to FIG. 6. Each picture of the image data to be encoded is divided into a plurality of macroblocks and encoded. One macroblock consists of a 16 × 16 pixel block of luminance data, which is further divided into four 8 × 8 pixel blocks (the basic encoding processing unit), and color difference data divided into two 8 × 8 pixel blocks, which are likewise basic encoding processing units; these blocks are encoded. The coding method and the quantization width of the blocks are determined per macroblock.
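As a rough illustration of the macroblock division described above, the following Python sketch counts the macroblocks needed to cover a picture. The function name and the picture sizes are assumptions chosen for illustration and do not appear in the specification; pictures whose dimensions are not multiples of 16 are assumed to be padded up to whole macroblocks.

```python
import math

MB_SIZE = 16  # one macroblock covers 16 x 16 luminance pixels


def macroblock_count(width, height):
    """Return the number of macroblocks needed to cover a picture.

    Assumption: partial macroblocks at the right and bottom edges are
    padded, so both dimensions are rounded up to a multiple of 16.
    """
    columns = math.ceil(width / MB_SIZE)
    rows = math.ceil(height / MB_SIZE)
    return columns * rows


# Hypothetical 720 x 480 picture: 45 columns x 30 rows of macroblocks.
print(macroblock_count(720, 480))
```

The resulting count corresponds to the number of MBs that the picture type determination unit reports to the rate control unit later in the description.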
[0028]
A slice is a data unit made up of a plurality of macroblocks, and one picture is made up of a plurality of slices. As for the coding method, there are three picture types: the I picture (intra picture), in which the picture itself is coded as it is; the P picture, which is coded after predicting motion from a temporally past picture; and the B picture, which is coded after predicting motion from a temporally past picture, a future picture, or both.
[0029]
The arrangement of the I, P, and B pictures in FIG. 6 is a typical example: the first I picture is used to predict and encode the P picture three frames ahead, and each B picture lying between them is predicted and encoded from both the I picture and the P picture. Therefore, the I picture is encoded first, then the P picture, and then the B pictures. For this reason, the I, P, and B pictures are rearranged from their original temporal order before being encoded.
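The rearrangement described above can be sketched in Python as follows. This is a minimal illustration, not the encoder's actual implementation: the function name is hypothetical, and it assumes that every run of B pictures is emitted immediately after the I or P picture that follows it in display order.

```python
def display_to_coding_order(display_types):
    """Reorder pictures from display order to coding order.

    display_types is a sequence of "I", "P" and "B" in display order.
    Returns a list of (display_index, picture_type) in coding order.
    Assumption: B pictures are coded right after the next I/P picture,
    because that picture is one of their prediction references.
    """
    coding = []
    pending_b = []  # B pictures waiting for their later reference picture
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append((index, picture_type))
        else:
            coding.append((index, picture_type))
            coding.extend(pending_b)
            pending_b = []
    coding.extend(pending_b)  # trailing B pictures with no later reference
    return coding


# Display order I B B P B B P becomes coding order I P B B P B B.
print(display_to_coding_order("IBBPBBP"))
```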
[0030]
Furthermore, a GOP (group of pictures) composed of a plurality of pictures starting from an I picture is configured, and one video sequence is configured by an arbitrary number of GOPs.
[0031]
The picture type determination unit 11 determines the picture type of each picture of the input image, rearranges the pictures into encoding order, and outputs them in that order to the scan conversion unit 12 and the motion detection unit 21. The picture type determination unit 11 also outputs the determined picture type, the number of MBs (macroblocks), the number of slices, and the chroma format information of the picture to the rate control unit 3.
[0032]
The scan conversion unit 12 converts the rearranged pictures into macroblocks, which are the units to be encoded, and sequentially outputs these macroblocks to the subtractor 13. The subtractor 13 obtains the difference between the macroblock input from the scan conversion unit 12 and the prediction data from the motion compensated prediction unit 19, and outputs this difference as a prediction error to the DCT (Discrete Cosine Transform) conversion unit 14.
[0033]
Based on the mode determination result from the mode determination unit 20, the DCT conversion unit 14 performs DCT conversion on the prediction error in units of 8 × 8 pixel blocks and outputs the resulting transform coefficients to the weighting quantization unit 15.
[0034]
The weighting quantization unit 15 quantizes each transform coefficient based on the quantization width q_scale input from the rate control unit 3 and outputs the resulting quantized data to the variable length coding unit 16 and the inverse quantization unit 17. In addition, when the weighting quantization unit 15 receives a command from the rate control unit 3 to encode with the minimum parameters, it performs the minimum parameter processing described above: an I picture is encoded with only its DC components, and a P picture or B picture is encoded with skipped macroblocks, to reduce the amount of generated code.
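The choice of minimum parameter processing by picture type can be summarized in a small Python sketch. The function name and the returned mode labels are hypothetical and serve only to illustrate the rule stated in the background section (DC components only for an I picture, skipped macroblocks for a P or B picture).

```python
def minimum_parameter_mode(picture_type):
    """Return the minimum parameter processing for a picture type.

    The labels "dc_only" and "skipped_macroblocks" are illustrative
    names, not identifiers taken from the specification.
    """
    if picture_type == "I":
        # Every macroblock is reduced to its DC component.
        return "dc_only"
    if picture_type in ("P", "B"):
        # Every macroblock becomes a skipped macroblock.
        return "skipped_macroblocks"
    raise ValueError(f"unknown picture type: {picture_type!r}")
```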
[0035]
The variable length coding unit 16 performs variable length coding on the quantized data based on the mode determination result from the mode determination unit 20 and the motion vector from the motion detection unit 21, thereby generating compressed and encoded image data. The compressed and encoded image data is temporarily stored in the encoder transmission buffer 2 and then output at the desired transmission rate. In addition, the variable length coding unit 16 sends an MB end timing signal to the rate control unit 3 each time encoding of a macroblock is completed, and sends a picture start timing signal to the rate control unit 3 each time encoding of a picture starts.
[0036]
The inverse quantization unit 17 performs inverse quantization on the weighted quantized image data and outputs the image data to the inverse DCT conversion unit 18. The inverse DCT transform unit 18 performs inverse DCT transform on the input inversely quantized image data, and outputs the result to the adder 22.
[0037]
The adder 22 adds the prediction data input from the motion compensated prediction unit 19 and the image data subjected to inverse DCT conversion, and outputs the result to the motion compensated prediction unit 19 as reproduced image data. Based on the reproduced image data input from the adder 22, the motion vector input from the motion detection unit 21, and the mode determination result input from the mode determination unit 20, the motion compensated prediction unit 19 forms prediction data indicating a picture whose motion has been predicted and outputs it to the subtractor 13 and the adder 22.
[0038]
The motion detection unit 21 calculates a motion vector of the image for each macroblock and outputs the motion vector to the motion compensated prediction unit 19 and the mode determination unit 20.
[0039]
The mode determination unit 20 determines a motion compensation prediction mode based on the motion vector input from the motion detection unit 21 and outputs it to the DCT conversion unit 14, the variable length coding unit 16, and the motion compensated prediction unit 19.
[0040]
The rate control unit 3 determines the margin size of the VBV buffer described above based on the picture type, the number of MBs, the number of slices, and the chroma format information input from the picture type determination unit 11. It also determines whether a VBV buffer underflow will occur from the bit generation amount of the bit stream output from the variable length coding unit 16 and, depending on the result of this determination, outputs to the weighting quantization unit 15 a command to execute the minimum parameter processing described above.
[0041]
Next, the detailed configuration of the rate control unit 3 in FIG. 5 will be described with reference to the block diagram in FIG.
[0042]
Each time the MB end timing signal is input, the control unit 41 obtains the actual bit generation amount per macroblock from the bit stream output by the variable length coding unit 16 and accumulates it as the VBV buffer occupancy. It then determines the quantization width q_scale for each macroblock based on the VBV buffer occupancy and outputs the quantization width q_scale to the weighting quantization unit 15 of the basic encoding processing unit 1.
[0043]
More specifically, based on the input MB end timing signal and picture start timing signal, the control unit 41 obtains the actual bit generation amount per macroblock from the MB count value at each MB end timing signal and accumulates it as the VBV buffer occupancy.
[0044]
The VBV buffer underflow determination unit 42 controls the calculation unit 43 based on the picture type, the number of MBs, and the number of slices input from the picture type determination unit 11 so that the margins used to set the maximum value of the VBV buffer are calculated for the I picture and for the B or P picture and stored in the margin size memory 44. Further, the VBV buffer underflow determination unit 42 sets the maximum value of the VBV buffer (maximum value Max2 shown in FIG. 4) based on the margin size stored in the margin size memory 44, and monitors the VBV buffer occupancy obtained by the control unit 41 to determine whether a VBV buffer underflow will occur. When the determination result indicates that the VBV buffer is approaching an underflow state, the VBV buffer underflow determination unit 42 outputs the underflow determination result to the weighting quantization unit 15.
[0045]
Next, a margin size calculation method by the calculation unit 43 will be described.
[0046]
The margin size need only be set to the bit amount that is unavoidably generated even by the compression method using the minimum parameters. Therefore, in the case of an I picture, the generated bit amount represented by the following expression (1) is calculated as the margin Margin_Ipicture.
[0047]
Margin_Ipicture = α × GB_MB + β × GB_HD + γ × GB_DC (1)
[0048]
Here, GB_MB indicates the bit amount generated by pipeline delay in the chip, GB_HD indicates the bit amount generated by the header, and GB_DC indicates the bit amount generated by the DC component in the I picture. α is a hardware-dependent constant indicating the number of macroblocks that cannot be controlled, that is, the number of MBs that pass uncontrolled until the CPU that controls the generated bit amount detects the bit amount generated for each MB. β represents the number of slices, and γ represents the number of macroblocks.
[0049]
On the other hand, in the case of a B picture or a P picture, the generated bit amount represented by the following equation (2) is calculated as the margin Margin_B/Ppicture.
[0050]
Margin_B/Ppicture = α × GB_MB + β × GB_HD + β × GB_SMB (2)
[0051]
Here, GB_SMB indicates the sum of the bit amount of the first MB and the bit amount of the final MB in the slice.
[0052]
For example, when the chroma format is a: b: c, GB_MB is obtained as follows.
[0053]
GB_MB = (maximum bit amount of DCT coefficient) × (a + b + c)
× (Number of pixels in one block) + (Number of bits generated in MB header) (3)
[0054]
Here, the maximum bit amount of the DCT coefficient is, for example, the maximum value of First_DCT_coeficient (the first non-zero DCT coefficient of a non-intra block (inter block)) or Subsequent_DCT_coeficients (the subsequent DCT coefficients) specified in ITU-T Rec. H.262 (2000E). This maximum is the value for an inter macroblock, but since it is larger than the value for an intra macroblock, it may be used in all cases. In ITU-T Rec. H.262 (2000E), First_DCT_coeficient is defined as 2 to 24 bits and Subsequent_DCT_coeficients as 3 to 24 bits, so the value may be set to 24 bits (see ITU-T Rec. H.262 (2000E), page 36, Chapter 6.2.6).
[0055]
The number of pixels in one block is 64 because it is 8 pixels × 8 pixels in MPEG2.
[0056]
The bit amount generated by the MB header is, for example in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), the total of the bit amounts of the following fields: macroblock_address_increment (difference between the current MB address and the previous MB address), quantizer_scale_code (MB quantization scale code), marker_bit (marker), macroblock_type (macroblock type), spatial_temporal_weight_code (space-time weighting code for upsampling), frame_motion_type (motion compensation type of the frame structure), dct_type (DCT type (frame or field)), motion_vertical_field_select[0][s] (reference field selection information used for prediction), motion_vertical_field_select[1][s] (reference field selection information used for prediction), motion_code[0][s][0] (basic differential motion vector when the motion vector is motion_vector(0, s)), motion_residual[0][s][0] (residual vector when the motion vector is motion_vector(0, s)), dmvector[0] (dual prime differential vector when the motion vector is motion_vector(0, s)), motion_code[0][s][1] (basic differential motion vector when the motion vector is motion_vector(0, s)), motion_residual[0][s][1] (residual vector when the motion vector is motion_vector(0, s)), dmvector[1] (dual prime differential vector when the motion vector is motion_vector(0, s)), motion_code[1][s][0] (basic differential motion vector when the motion vector is motion_vector(1, s)), motion_residual[1][s][0] (residual vector when the motion vector is motion_vector(1, s)), dmvector[0] (dual prime differential vector when the motion vector is motion_vector(1, s)), motion_code[1][s][1] (basic differential motion vector when the motion vector is motion_vector(1, s)), motion_residual[1][s][1] (residual vector when the motion vector is motion_vector(1, s)), dmvector[1] (dual prime differential vector when the motion vector is motion_vector(1, s)), coded_block_pattern_420 (code block pattern), and coded_block_pattern_2 (code block pattern).
Note that s is a parameter that indicates forward prediction when it is 0 and bidirectional prediction when it is 1.
[0057]
In ITU-T Rec. H.262 (2000E), these fields are defined as follows: macroblock_address_increment is 1 bit, quantizer_scale_code is 5 bits, marker_bit is 1 bit, macroblock_type is 9 bits, spatial_temporal_weight_code is 2 bits, frame_motion_type is 2 bits, dct_type is 1 bit, motion_vertical_field_select[0][s] is 1 bit, motion_vertical_field_select[1][s] is 1 bit, motion_code[0][s][0] is 11 bits, motion_residual[0][s][0] is 8 bits, dmvector[0] (for motion_vector(0, s)) is 2 bits, motion_code[0][s][1] is 11 bits, motion_residual[0][s][1] is 8 bits, dmvector[1] (for motion_vector(0, s)) is 2 bits, motion_code[1][s][0] is 11 bits, motion_residual[1][s][0] is 8 bits, dmvector[0] (for motion_vector(1, s)) is 2 bits, motion_code[1][s][1] is 11 bits, motion_residual[1][s][1] is 11 bits, dmvector[1] (for motion_vector(1, s)) is 2 bits, coded_block_pattern_420 is 9 bits, and coded_block_pattern_2 is 6 bits. The bit amount generated by the MB header may therefore be set to 120 bits, which is their total (see ITU-T Rec. H.262 (2000E), pages 33 to 36, Chapter 6.2.5).
[0058]
The bit amount GB_MB generated by the pipeline delay in the chip is calculated, for example, as in the following equation (4) when the chroma format is 4:2:0 in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E).
[0059]
GB_MB = 24 × (4 + 2 + 0) × 64 + 120 = 9336 (bits) (4)
[0060]
Further, when the chroma format is 4: 2: 2, the following equation (5) is obtained.
[0061]
GB_MB = 24 × (4 + 2 + 2) × 64 + 120 = 12408 (bits) (5)
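Equations (3) to (5) can be checked with a short Python sketch. The constant and function names are assumptions introduced for illustration; the numeric values (24 bits per DCT coefficient, 64 pixels per block, 120 bits for the MB header) are the ones stated in the text above.

```python
MAX_DCT_COEFF_BITS = 24   # maximum bit amount of one DCT coefficient
PIXELS_PER_BLOCK = 64     # 8 x 8 pixels per block in MPEG2
MB_HEADER_BITS = 120      # bit amount of the MB header as stated in the text


def gb_mb(chroma_format):
    """Evaluate equation (3) for a chroma format given as (a, b, c)."""
    a, b, c = chroma_format
    blocks = a + b + c
    return MAX_DCT_COEFF_BITS * blocks * PIXELS_PER_BLOCK + MB_HEADER_BITS


print(gb_mb((4, 2, 0)))  # equation (4): 9336 bits
print(gb_mb((4, 2, 2)))  # equation (5): 12408 bits
```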
[0062]
Furthermore, GB_HD is the bit amount of the header. For example, in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E) (assuming that Vertical_size ≦ 1088), it is the total bit amount of slice_start_code (slice start code + slice vertical position), q_scale_code (quantization scale code), and extra_bit_slice (extension bit of the slice). In ITU-T Rec. H.262 (2000E), slice_start_code is defined as 32 bits, q_scale_code as 5 bits, and extra_bit_slice as 1 bit, so GB_HD = 38 bits, which is their total. The header bit amount GB_HD may therefore be a fixed value (see ITU-T Rec. H.262 (2000E), page 32, Chapter 6.2.4).
[0063]
In addition, in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), when the chroma format is a:b:c, GB_DC is defined as in the following equation (6), on the assumption that picture_coding_type is I picture and therefore no motion vector exists.
[0064]
GB_DC = macroblock_address_increment + macroblock_mode
+ dct_dc_size_luminance × a
+ dct_dc_diff_luminance × a
+ dct_dc_size_chrominance × (b + c)
+ dct_dc_diff_chrominance × (b + c)
+ end_of_block × (a + b + c) (6)
[0065]
Here, macroblock_address_increment is the difference between the current MB address and the previous MB address, and macroblock_mode is the sum of macroblock_type and dct_type, which indicate the encoding type of the macroblock and the DCT type, respectively.
[0066]
dct_dc_size_luminance is the size of the DC coefficient difference of the DCT luminance, dct_dc_diff_luminance is the value of the DC coefficient difference of the DCT luminance, dct_dc_size_chrominance is the size of the DC coefficient difference of the DCT color difference, dct_dc_diff_chrominance is the value of the DC coefficient difference of the DCT color difference, and end_of_block is the flag indicating the end of the DCT coefficients of each block.
[0067]
For example, in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), macroblock_address_increment is defined as 1 bit, macroblock_mode as 2 bits, dct_dc_size_luminance as 9 bits, dct_dc_diff_luminance as 11 bits, dct_dc_size_chrominance as 10 bits, dct_dc_diff_chrominance as 11 bits, and end_of_block as 4 bits (see ITU-T Rec. H.262 (2000E), pages 33, 34, and 36, Chapters 6.2.5, 6.2.5.1, and 6.2.6). Therefore, in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), when the chroma format is 4:2:0, the DC component bit amount GB_DC is calculated as in the following equation (7).
[0068]
GB_DC = 1 + 2 + 9 × 4 + 11 × 4 + 10 × 2 + 11 × 2 + 4 × 6
= 149 (7)
[0069]
Further, when the chroma format is 4: 2: 2, the DC component GB_DC is calculated as in the following equation (8).
[0070]
GB_DC = 1 + 2 + 9 × 4 + 11 × 4 + 10 × 4 + 11 × 4 + 4 × 8
= 199 (8)
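The arithmetic of equations (6) to (8) can be reproduced with the following Python sketch. The dictionary and function names are assumptions for illustration; the field widths are those quoted in the text for ITU-T Rec. H.262 (2000E).

```python
# Field bit widths quoted in the text for an I-picture macroblock
# that carries only DC components.
DC_FIELD_BITS = {
    "macroblock_address_increment": 1,
    "macroblock_mode": 2,
    "dct_dc_size_luminance": 9,
    "dct_dc_diff_luminance": 11,
    "dct_dc_size_chrominance": 10,
    "dct_dc_diff_chrominance": 11,
    "end_of_block": 4,
}


def gb_dc(chroma_format):
    """Evaluate equation (6) for a chroma format given as (a, b, c)."""
    a, b, c = chroma_format
    f = DC_FIELD_BITS
    luminance = (f["dct_dc_size_luminance"] + f["dct_dc_diff_luminance"]) * a
    chrominance = (f["dct_dc_size_chrominance"]
                   + f["dct_dc_diff_chrominance"]) * (b + c)
    end_flags = f["end_of_block"] * (a + b + c)
    header = f["macroblock_address_increment"] + f["macroblock_mode"]
    return header + luminance + chrominance + end_flags


print(gb_dc((4, 2, 0)))  # equation (7): 149 bits
print(gb_dc((4, 2, 2)))  # equation (8): 199 bits
```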
[0071]
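Furthermore, the sum GB_SMB of the bit amounts of the first MB and the final MB in the slice is defined as in the following equation (9) in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E).
[0072]
GB_SMB = macroblock_escape_F + macroblock_address_increment_F
+ q_scale_code + macroblock_type_F
+ frame (or field)_motion_type_F + motion_vector_F
+ macroblock_escape_L + macroblock_address_increment_L
+ macroblock_type_L + frame (or field)_motion_type_L
+ motion_vector_L (9)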
[0073]
Here, macroblock_escape_F is the bit amount for MB address extension of the first MB, macroblock_address_increment_F is the difference between the current MB address of the first MB and the previous MB address, q_scale_code is the q scale code of the first MB, macroblock_type_F is the MB coding type of the first MB, frame (or field)_motion_type_F is the motion compensation type of the frame structure (or field structure), and motion_vector_F is the bit amount of the motion vector of the first MB. The names in which _F is replaced with _L indicate the corresponding bit amounts of the final MB.
[0074]
In the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), macroblock_escape_F is 0 bits, macroblock_address_increment_F is 1 bit, q_scale_code is 5 bits (the first MB of a slice requires q_scale_code), macroblock_type_F is 6 bits (from Tables B.2 to B.4 on page 122 of ITU-T Rec. H.262 (2000E)), frame_motion_type_F or field_motion_type_F is 2 bits, motion_vector_F is 3 bits (motion_vertical_field_select[0][s], motion_code[r][s][0], and motion_code[r][s][1] are 1 bit each), macroblock_escape_L is 11 bits, macroblock_address_increment_L is 8 bits, macroblock_type_L is 3 bits, frame_motion_type_L or field_motion_type_L is 2 bits, and motion_vector_L is 3 bits (motion_vertical_field_select[0][s], motion_code[r][s][0], and motion_code[r][s][1] are 1 bit each) (see ITU-T Rec. H.262 (2000E), pages 33 and 34, Chapters 6.2.5, 6.2.5.1, and 6.2.5.2).
[0075]
Therefore, in the case of MPEG2 defined by ITU-T Rec. H.262 (2000E), the sum GB_SMB of the bit amount of the first MB and the bit amount of the final MB in the slice is obtained as in the following equation (10).
[0076]
GB_SMB = 0 + 1 + 5 + 6 + 2 + 3 + 11 + 8 + 3 + 2 + 3
= 44 (10)
[0077]
From equation (10), the sum GB_SMB of the bit amount of the first MB and the bit amount of the final MB in the slice is a fixed value of 44 bits.
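The 44-bit total can be verified by summing the field widths listed above for the first and final MB of a slice. The following Python sketch does this; the dictionary names are assumptions for illustration, and "motion_type" stands for frame_motion_type or field_motion_type.

```python
# Bit widths of the first MB in a slice (suffix _F in the text).
FIRST_MB_BITS = {
    "macroblock_escape": 0,
    "macroblock_address_increment": 1,
    "q_scale_code": 5,
    "macroblock_type": 6,
    "motion_type": 2,      # frame_motion_type or field_motion_type
    "motion_vector": 3,
}

# Bit widths of the final MB in a slice (suffix _L in the text).
LAST_MB_BITS = {
    "macroblock_escape": 11,
    "macroblock_address_increment": 8,
    "macroblock_type": 3,
    "motion_type": 2,      # frame_motion_type or field_motion_type
    "motion_vector": 3,
}


def gb_smb():
    """Sum the bit amounts of the first and final MB in a slice."""
    return sum(FIRST_MB_BITS.values()) + sum(LAST_MB_BITS.values())


print(gb_smb())  # 44 bits, the fixed value stated in the text
```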
[0078]
In summary, for example, in the case of MPEG2 specified by ITU-T Rec. H.262 (2000E), the margin of an I picture is obtained from equation (1) as the following equations (11) and (12) when the chroma format is 4:2:0 and 4:2:2, respectively.
[0079]
Margin_Ipicture = 9336α + 38β + 149γ (11)
Margin_Ipicture = 12408α + 38β + 199γ (12)
[0080]
Similarly, the margin of a B picture or a P picture is obtained from equation (2) as the following equations (13) and (14) when the chroma format is 4:2:0 and 4:2:2, respectively.
[0081]
Margin_B/Ppicture = 9336α + 38β + 44β (13)
Margin_B/Ppicture = 12408α + 38β + 44β (14)
[0082]
In this way, once the chroma format is determined, the margin can be calculated from the number α of MBs that cannot be controlled by the hardware, the number of slices β, and the number of MBs γ. Since α is a hardware-specific value, it is set in advance.
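The margin calculation can be illustrated with the following Python sketch, which evaluates equations (1) and (2) using the coefficients derived in the text. The names, the value α = 2, and the picture geometry (1350 MBs, 30 slices) are assumptions chosen only for the example; the actual α depends on the hardware.

```python
# Coefficients in bits for each chroma format, as derived in the text.
COEFFICIENTS = {
    "4:2:0": {"GB_MB": 9336, "GB_DC": 149},
    "4:2:2": {"GB_MB": 12408, "GB_DC": 199},
}
GB_HD = 38    # slice header bits
GB_SMB = 44   # first MB + final MB bits of a slice


def margin_i_picture(chroma_format, alpha, beta, gamma):
    """Equation (1): margin for an I picture."""
    c = COEFFICIENTS[chroma_format]
    return alpha * c["GB_MB"] + beta * GB_HD + gamma * c["GB_DC"]


def margin_bp_picture(chroma_format, alpha, beta):
    """Equation (2): margin for a B picture or P picture."""
    c = COEFFICIENTS[chroma_format]
    return alpha * c["GB_MB"] + beta * GB_HD + beta * GB_SMB


# Hypothetical stream: alpha = 2 uncontrollable MBs, 30 slices, 1350 MBs.
print(margin_i_picture("4:2:0", alpha=2, beta=30, gamma=1350))  # 220962
print(margin_bp_picture("4:2:0", alpha=2, beta=30))             # 21132
```

With these assumed values the I-picture margin is roughly ten times the B/P-picture margin, which reflects the per-macroblock DC term γ × GB_DC that only the I picture carries.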
[0083]
The calculation unit 43 stores, in a built-in memory, coefficients corresponding to the chroma format in advance for the I picture and for the B picture or P picture. It reads out the coefficients based on the chroma format and picture type information input from the picture type determination unit 11, calculates the margin based on the number of slices and the number of MBs supplied from the picture type determination unit 11 at the same time, and stores the margin in the margin size memory 44.
[0084]
Since the number of slices and the number of MBs do not change within one stream, the calculation unit 43 calculates the margin only when an I picture, or a P picture or B picture, is input for the first time, and stores it in the margin size memory 44. The VBV buffer underflow determination unit 42 determines the presence or absence of VBV buffer underflow using the margin calculated and stored in the margin size memory 44 in this way.
[0085]
Next, the VBV buffer underflow monitoring process by the rate control unit 3 will be described with reference to the flowchart of FIG.
[0086]
In step S1, the VBV buffer underflow determination unit 42 of the rate control unit 3 acquires the picture type, number of MBs, number of slices, and chroma format information input from the picture type determination unit 11.
[0087]
In step S2, the VBV buffer underflow determination unit 42 determines whether the margin size has already been calculated and stored in the margin size memory 44. More precisely, it determines whether the margin size for the picture type acquired in step S1 has already been calculated and stored.
[0088]
When it is determined in step S2 that the margin size has not been calculated, in step S3 the VBV buffer underflow determination unit 42 supplies the acquired picture type, number of MBs, number of slices, and chroma format information to the calculation unit 43, causes it to execute the calculation of the above formula (1) or formula (2), and stores the calculation result in the margin size memory 44.
[0089]
In step S4, the VBV buffer underflow determination unit 42 determines whether or not the input image is an I picture. For example, if it is determined that the input image is an I picture, the process proceeds to step S5.
[0090]
In step S5, the VBV buffer underflow determination unit 42 reads the margin size of the I picture stored in the margin size memory 44.
[0091]
In step S6, the VBV buffer underflow determination unit 42 determines whether or not the VBV buffer occupancy is larger than the sum of the maximum generated bit amount MaxGenBit of the picture to be extracted and the margin. If the VBV buffer occupancy is not larger than this sum, that is, if it is smaller than the sum of the maximum generated bit amount MaxGenBit of the extracted picture and the margin, the process proceeds to step S7.
[0092]
In step S7, the VBV buffer underflow determination unit 42 considers that a VBV buffer underflow has occurred, and instructs the weighting quantization unit 15 to execute the minimum parameter processing.
[0093]
In step S8, the VBV buffer underflow determination unit 42 determines whether or not a next picture exists. If a next picture exists, the process returns to step S1. If no next picture exists, the margin size memory 44 is reset in step S9, and the process ends.
[0094]
If it is determined in step S2 that the margin size has already been calculated, that is, when an I picture is input and the margin size of the I picture has already been calculated, or when a P picture or B picture is input and the margin size of the P picture or B picture has already been calculated, the process of step S3 is skipped.
[0095]
Furthermore, if it is determined in step S4 that the input image is not an I picture, that is, it is a B picture or a P picture, in step S10 the VBV buffer underflow determination unit 42 reads the margin size of the B picture or P picture stored in the margin size memory 44.
[0096]
If it is determined in step S6 that the VBV buffer occupancy is larger than the sum of the maximum generated bit amount MaxGenBit of the picture to be extracted and the margin, the process of step S7 is skipped, and the process proceeds to step S8.
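The control flow of steps S1 to S10 can be summarized in a short sketch. It is an illustration under stated assumptions rather than the actual rate control unit: the function name, the tuple layout of a picture, and the shared "PB" key for P and B pictures are hypothetical, and the margin calculation of step S3 is passed in as a callable so that only the flow is shown.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# (picture type, VBV buffer occupancy, MaxGenBit), all assumed to be in bits
Picture = Tuple[str, int, int]


def monitor_underflow(pictures: Iterable[Picture],
                      compute_margin: Callable[[str], int]) -> List[bool]:
    """Return, per picture, whether minimum parameter processing is requested."""
    margin_memory: Dict[str, int] = {}   # stands in for margin size memory 44
    flags: List[bool] = []
    for picture_type, occupancy, max_gen_bit in pictures:     # S1: next picture
        # S4 decides between I and P/B; the key is derived early so that the
        # S2 check can use it. One shared key for P and B is an assumption.
        key = "I" if picture_type == "I" else "PB"
        if key not in margin_memory:                          # S2: not yet calculated?
            margin_memory[key] = compute_margin(key)          # S3: calculate and store
        margin = margin_memory[key]                           # S5 or S10: read margin
        flags.append(not occupancy > max_gen_bit + margin)    # S6, then S7 if True
    margin_memory.clear()                                     # S8: no next picture; S9: reset
    return flags
```

Because the memory is consulted before the calculation, `compute_margin` runs at most once per key for a given stream, which matches the skipping of step S3 described above.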
[0097]
To summarize the above processing, in step S3 the calculation unit 43 calculates the margin size of the I picture, the P picture, or the B picture. As described above, the calculation unit 43 stores in advance the coefficients GB_MB, GB_HD, and GB_DC for each chroma format, so it selects the coefficients corresponding to the chroma format and picture type information acquired in step S1. Then, using the preset number α of MBs to be generated, the number of slices β acquired in step S1, and the number of MBs γ, it evaluates formula (1) or formula (2) to calculate the margin size.
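As a rough sketch of this calculation, the code below evaluates a margin for each picture type. The form of the two sums is reconstructed from the wording of the claims (a sum of products) and may differ in detail from formulas (1) and (2). The meaning attached to each coefficient, the name gb_edge for the combined bit amount of the first and last macroblock of a slice, and every numeric value are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Per-chroma-format bit costs; values used with this class are placeholders."""
    gb_mb: int    # GB_MB: assumed bits per MB generated during pipeline delay
    gb_hd: int    # GB_HD: assumed bits generated per slice header
    gb_dc: int    # GB_DC: assumed bits per MB for the DC component (I picture)
    gb_edge: int  # hypothetical name: bits of first MB + last MB in a slice (P/B)


def margin_size(picture_type: str, c: Coefficients,
                alpha: int, beta: int, gamma: int) -> int:
    """alpha: preset number of MBs, beta: number of slices, gamma: number of MBs."""
    common = alpha * c.gb_mb + beta * c.gb_hd
    if picture_type == "I":
        return common + gamma * c.gb_dc    # I picture sum, as read from the claims
    if picture_type in ("P", "B"):
        return common + beta * c.gb_edge   # P/B picture sum, as read from the claims
    raise ValueError(f"unknown picture type: {picture_type!r}")


# Placeholder coefficients for one chroma format (not values from the patent).
COEFFS_BY_CHROMA = {"4:2:0": Coefficients(gb_mb=10, gb_hd=5, gb_dc=2, gb_edge=3)}
```

Keeping the coefficients in a table keyed by chroma format mirrors the statement that they are stored in advance, so only α, β, and γ vary per stream.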
[0098]
However, for a picture type whose margin size is already stored in the margin size memory 44, the determination in step S2 causes step S3 to be skipped, so the margin size is not recalculated in subsequent processing. That is, the margin size is calculated only once for each picture type.
[0099]
In steps S5 and S10, the margin size is read for each picture type. In step S6, the sum of the maximum generated bit amount MaxGenBit of the picture to be extracted and the read margin size is compared with the VBV buffer occupancy, and if the VBV buffer occupancy is less than that sum, a command is issued so that the minimum parameter processing is performed.
[0100]
As a result, the margin, which until now has been set empirically, is calculated and set as the maximum bit amount that can be generated for each stream. The margin can therefore be set without excess or deficiency, the VBV buffer can be used to the maximum, and the occurrence of VBV buffer underflow can be suppressed almost 100%.
[0101]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium onto a computer incorporated in dedicated hardware or onto, for example, a general-purpose personal computer that can execute various functions by installing various programs.
[0102]
FIG. 9 shows the configuration of an embodiment of a personal computer for the case where the encoder is realized by software. The CPU 101 of the personal computer controls the overall operation of the personal computer. When the user inputs a command from the input unit 106, such as a keyboard or a mouse, via the bus 104 and the input/output interface 105, the CPU 101 executes a program stored in the ROM (Read Only Memory) 102 in response. Alternatively, the CPU 101 loads into the RAM (Random Access Memory) 103 a program that has been read from the magnetic disk 111, the optical disk 112, the magneto-optical disk 113, or the semiconductor memory 114 connected to the drive 110 and installed in the storage unit 108, and executes it. The functions of the encoder described above are thereby realized by software. The CPU 101 also controls the communication unit 109 to communicate with the outside and exchange data.
[0103]
As shown in FIG. 9, the recording medium on which the program is recorded is configured not only by package media that are distributed to provide the program to the user separately from the computer, namely the magnetic disk 111 (including a flexible disk), the optical disk 112 (including a CD-ROM (compact disk-read only memory) and a DVD (digital versatile disk)), the magneto-optical disk 113 (including an MD (mini-disc)), or the semiconductor memory 114 on which the program is recorded, but also by the ROM 102 in which the program is recorded and the hard disk included in the storage unit 108, which are provided to the user pre-installed in the computer.
[0104]
In this specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the order described, but also processing that is executed in parallel or individually and not necessarily in time series.
[0105]
[Effect of the invention]
According to the present invention, the VBV buffer can be used to the maximum, and the VBV buffer underflow can be suppressed by almost 100%.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a VBV buffer.
FIG. 2 is a diagram illustrating a VBV buffer underflow.
FIG. 3 is a diagram illustrating a margin of a VBV buffer.
FIG. 4 is a diagram illustrating a margin of a VBV buffer.
FIG. 5 is a block diagram showing a configuration of an encoder to which the present invention is applied.
FIG. 6 is a diagram for explaining processing by the encoder of FIG. 5.
FIG. 7 is a block diagram showing a configuration of a rate control unit in FIG. 5.
FIG. 8 is a flowchart illustrating a VBV buffer underflow monitoring process by the rate control unit.
FIG. 9 is a diagram illustrating a medium.
[Explanation of symbols]
1 basic encoding processing unit, 2 encoder transmission buffer, 3 rate control unit,
11 picture type determination unit, 12 scan conversion unit, 14 DCT conversion unit,
15 weighting quantization unit, 16 variable length coding unit, 20 mode determination unit,
41 control unit, 42 VBV buffer underflow determination unit, 43 calculation unit,
44 margin size memory

Claims

An image processing apparatus that compresses and transmits an image using the MPEG method, comprising:
Discrimination means for discriminating the picture type of the input image;
Calculation means for calculating a margin of the VBV buffer, when the picture type of the image is an I picture, as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by a header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture,
and, when the picture type of the image is a P picture or a B picture, as the sum of the products of the predetermined constant and the bit amount generated by the pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice;
Storage means for storing the margin of the VBV buffer for each picture type calculated by the calculation means;
Compression means for compressing the image by the MPEG method;
Underflow detection means for detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed by the compression means and the margin of the VBV buffer for the picture type discriminated by the discrimination means is greater than the occupancy of the VBV buffer; and
Control means for controlling the compression means, when the VBV buffer underflow is detected by the underflow detection means, so as to compress the image with minimum parameters by making all macroblocks DC components when the picture type is the I picture, and by making all macroblocks skipped macroblocks when the picture type is the P picture or the B picture.

An image processing method of an image processing apparatus that compresses and transmits an image using the MPEG method, comprising:
A determining step of determining the picture type of the input image;
A calculation step of calculating a margin of the VBV buffer, when the picture type of the image is an I picture, as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by a header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture,
and, when the picture type of the image is a P picture or a B picture, as the sum of the products of the predetermined constant and the bit amount generated by the pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice;
A storage step of storing the margin of the VBV buffer for each picture type calculated in the processing of the calculation step;
A compression step of compressing the image by the MPEG method;
An underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the processing of the compression step and the margin of the VBV buffer for the determined picture type is greater than the occupancy of the VBV buffer; and
A control step of controlling the processing of the compression step, when the VBV buffer underflow is detected in the processing of the underflow detection step, so as to compress the image with minimum parameters by making all macroblocks DC components when the picture type is the I picture, and by making all macroblocks skipped macroblocks when the picture type is the P picture or the B picture.

A recording medium on which is recorded a program that causes a computer to perform image processing to compress and transmit an image using the MPEG method, the program comprising:
A determining step of determining the picture type of the input image;
A calculation step of calculating a margin of the VBV buffer, when the picture type of the image is an I picture, as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by a header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture,
and, when the picture type of the image is a P picture or a B picture, as the sum of the products of the predetermined constant and the bit amount generated by the pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice;
A storage step of storing the margin of the VBV buffer for each picture type calculated in the processing of the calculation step;
A compression step of compressing the image by the MPEG method;
An underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the processing of the compression step and the margin of the VBV buffer for the determined picture type is greater than the occupancy of the VBV buffer; and
A control step of controlling the processing of the compression step, when the VBV buffer underflow is detected in the processing of the underflow detection step, so as to compress the image with minimum parameters by making all macroblocks DC components when the picture type is the I picture, and by making all macroblocks skipped macroblocks when the picture type is the P picture or the B picture.

A program that causes a computer, which performs image processing to compress and transmit an image using the MPEG method, to execute processing comprising:
A determining step of determining the picture type of the input image;
A calculation step of calculating a margin of the VBV buffer, when the picture type of the image is an I picture, as the sum of the products of a predetermined constant and the bit amount generated by pipeline delay, of the number of slices and the bit amount generated by a header, and of the number of macroblocks and the bit amount generated by the DC component in the I picture,
and, when the picture type of the image is a P picture or a B picture, as the sum of the products of the predetermined constant and the bit amount generated by the pipeline delay, of the number of slices and the bit amount generated by the header, and of the number of slices and the sum of the bit amounts of the first macroblock and the last macroblock in a slice;
A storage step of storing the margin of the VBV buffer for each picture type calculated in the processing of the calculation step;
A compression step of compressing the image by the MPEG method;
An underflow detection step of detecting a VBV buffer underflow when the sum of the maximum generated bit amount of the image information compressed in the processing of the compression step and the margin of the VBV buffer for the determined picture type is greater than the occupancy of the VBV buffer; and
A control step of controlling the processing of the compression step, when the VBV buffer underflow is detected in the processing of the underflow detection step, so as to compress the image with minimum parameters by making all macroblocks DC components when the picture type is the I picture, and by making all macroblocks skipped macroblocks when the picture type is the P picture or the B picture.