JP3812267B2

JP3812267B2 - Video encoding apparatus and method

Info

Publication number: JP3812267B2
Application number: JP2000042466A
Authority: JP
Inventors: 一彦森田
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2000-02-21
Filing date: 2000-02-21
Publication date: 2006-08-23
Anticipated expiration: 2020-02-21
Also published as: JP2001238215A

Description

【０００１】
【発明の属する技術分野】
本発明は、動画像の高能率符号化に係り、特に、ほぼリアルタイムで固定ビットレート及び可変ビットレート符号化を行う際に好適な符号量制御装置及びその方法に関する。
【０００２】
【従来の技術】
TV信号などの動画像を高能率に符号化する技術の国際標準として既にMPEG２が規定されている。
MPEG２は、動画像を構成する「フレーム」画像を「マクロブロック」と呼ばれる１６×１６画素のブロックに分割し、各マクロブロック単位に、時間的に前または後に所定の数フレーム離れた参照画像と被符号化画像の間で「動きベクトル」と呼ばれる動き量を求め、この動き量を基に参照画像から被符号化画像を構成する「動き補償予測」技術と、動き補償予測の誤差信号または符号化画像そのものに対して、直交変換の一種であるDCT(離散コサイン変換)を用いて情報量を圧縮する「変換符号化」技術の２つの画像符号化の要素技術をベースに規定されている。
【０００３】
従来のMPEG２の動画像符号化装置の一構成例を図７に示す。
また、符号化ピクチャ構造の一例を図６に示す。
動き補償予測では、図６に示した符号化ピクチャ構造のように、Iピクチャ（フレーム内符号化）、Pピクチャ(順方向予測符号化)、Bピクチャ(双方向予測符号化)と呼ばれる、予測方法の異なる3種類のピクチャの組合わせによって構成される。
【０００４】
図７に示されるように、変換符号化では、Iピクチャでは符号化画像そのものに対し、P、Bピクチャでは動き補償器７９による動き補償予測の誤差信号である減算器７１の出力に対して、DCTがDCT器７２で夫々施される。
このDCT器７２で得られたDCT係数に対して量子化が、符号量制御部９０の出力により制御されて量子化器７３によってなされた後に、動きベクトル等のその他の付帯情報と共に可変長符号化が可変長符号化器７５でなされ、符号列が「ビットストリーム」としてバッファ７６に記憶された後に出力される。
この際、バッファ７６の充足度に応じて符号量制御部９０で量子化スケールが制御される。
一方、量子化器７３の出力係数は、逆量子化器７７、IDCT器７８に供給されて、局部復号された後に、加算器８０を介して、ブロック毎にフレームメモリ８１に貯えられる。
【０００５】
MPEG２は可変長符号化であるため、単位時間当りの発生符号量(ビットレート)は一定ではない。
そこで量子化器７３での量子化の際の量子化スケールをマクロブロック単位に適宜変更することにより、所要のビットレートに制御することが可能になっている。
MPEG２Test Model５では、GOP単位で発生符号量を一定にする固定ビットレート制御方法が提案されている。この固定ビットレート制御方法は、一定の転送レートが要求される用途に適した方法である。
【０００６】
Test Model 5における、図７の符号量制御部の動作に相当する、固定ビットレート制御方法の概略は次の通りである。
目標ビットレートをBitRate、１秒当りのフレーム数をPictureRate、１つの符号化単位である1GOP(通常はＩピクチャの間隔)のフレーム数をＮとすると、1GOPに割当てられる符号量Ｒは次式(1)で与えられる。
R ＝ (BitRate／PictureRate)・N (1)
【０００７】
(1)式の符号量ＲをGOP内の各画像に配分することになるが、ここで各ピクチャタイプの符号化直後の画像について、１フレームの発生符号量と平均量子化スケールの積を画面複雑度 (Complexity) Xi(Ｉピクチャ)、Xp(Ｐピクチャ)、Xb(Ｂピクチャ)として求め、これから符号化する画像を含むGOP内の画像が一様に前記画面複雑度 (Complexity)に等しいと仮定して、これから符号化する画像の目標割当符号量を決定する。
【０００８】
現在のGOPで符号化の終了していないＰ、Ｂピクチャのフレーム数をNp、Nb、Ｉピクチャに対するＰ、Ｂピクチャの量子化スケールの設定比率をKp、Kbとする。
この時、Ｉ、Ｐ、Ｂ各ピクチャタイプの目標割当符号量Ti, Tp, Tbは次式(2)(3)(4)で与えられる。なお、MAX[A, B]はAとBのいずれか大きい方を選択する動作を示す。
【０００９】
Ti ＝ MAX[ R／(1＋Np・Xp／(Xi・Kp)＋Nb・Xb／(Xi・Kb)), BitRate／(8・PictureRate) ] (2)
Tp ＝ MAX[ R／(Np＋Nb・Kp・Xb／(Kb・Xp)), BitRate／(8・PictureRate) ] (3)
Tb ＝ MAX[ R／(Nb＋Np・Kb・Xp／(Kp・Xb)), BitRate／(8・PictureRate) ] (4)

【００１０】
なお、符号量Ｒの値は１フレーム符号化が終了する毎に、そのフレームの発生符号量を減算し、GOPの先頭(Ｉピクチャ)において、(1)式の値を加算する。
つぎに、上式(2)(3)(4)で決定した目標割当符号量と、バッファ７６で検出される各マクロブロックの発生符号量をもとに、各マクロブロックの量子化スケールを決定する。
【００１１】
一方、DVD-Videoのように可変転送レートが可能な用途に適した方法として、可変ビットレート制御方法がある。特開平6−１４１２９８号公報には、可変ビットレート制御による符号化装置が開示されている。
この装置では、最初に入力動画像に対して固定量子化スケールによって仮符号化を行い、単位時間毎に発生符号量がカウントされる。次に入力動画像全体の発生符号量が所要値になるように、仮符号化時の発生符号量に基づいて各部分の目標転送レートを設定する。
そしてこの目標転送レートに合致するように制御を行いながら、入力動画像に対して２回目の符号化、言い換えると実符号化が行われる。
【００１２】
しかし、上記従来例では、出力ビットストリームを得るためには少なくとも２回の符号化を行わなければならず、リアルタイム性を要求されるような用途ではこの装置のような２パス方式の可変ビットレート制御は使用出来ない。
【００１３】
これに対し、動画像をほぼリアルタイムで符号化するための可変ビットレート制御方法、すなわち１パス方式の可変ビットレート制御方法も存在する。
特開平１０−１６４５７７号公報には、１パス方式の可変ビットレート制御方法による符号化装置が前記公報の図６等に開示されている。
この従来例における動画像符号化装置の一構成例を図８に示す。なお、図７と同一構成部材に対しては同一符号を付してその説明は省略する。
【００１４】
この従来例の装置では、バッファ７６に記憶した符号量を発生符号量検出器８３に供給し、この発生符号量検出器８３による発生符号量と、量子化器７３からの量子化スケールを平均量子化スケール検出器８２に供給し、この平均量子化スケール検出器８２による画面内の量子化スケールの平均値との積を、「画面複雑度」として画面複雑度検出器８４で求め、過去の画面複雑度の平均値に対する現在の画面複雑度の割合を基に、画面の目標発生符号量または目標量子化スケールを決定することにより、可変ビットレート制御を符号量制御器７４で実現している。
【００１５】
【発明が解決しようとする課題】
上記従来例においてはこれから符号化する画像の画面複雑度が、直前に符号化した同じピクチャタイプの画面複雑度と同程度であると仮定して符号化制御が行われている。
しかしながら、入力動画像にシーンチェンジのような大きな変化が生じた場合、変化点の前後で画像の性質が変わるため画面複雑度自体が変化するばかりでなく、予測符号化を行うＰ及びＢピクチャでは、変化の生じた前と後の画像の間では予測がほとんど当たらないため、変化直後の画像ではほとんどのマクロブロックが面内符号化を行うイントラになる場合が多い。
【００１６】
そのような画像では変化前の画像に比べ、実際の画面複雑度が極端に高くなるのにも関らず上記従来例では、変化前の画像の画面複雑度を基準にして符号量割当が行われるため、割当符号量が不足して、量子化スケールが上昇し、結果として変化点の直後で画質劣化が生じてしまうという問題があった。
【００１７】
一方、変化直後の画像から次に予測を行う画像では、画像が小刻みに変化する場合を除いては比較的予測が当たりやすくなり、実際の画面複雑度は変化直後の画像よりも低くなるにも関らず、上記従来例では変化直後の高い画面複雑度を基準にして符号量割当が行われるため、割当符号量が過剰となって量子化スケールが不必要に下降し、符号量に無駄が生じてしまうという問題があった。
本発明は以上の問題を解決して、入力動画像にシーンチェンジのような大きな変化が発生した場合でも、より適切な符号量割当を行うことが出来る固定ビットレート及び１パス方式の可変ビットレート制御方法を実現することを目的とするものである。
【００１８】
【課題を解決するための手段】
本発明は、上記課題を解決するために、以下１）〜８）に記載の手段よりなる。
すなわち、
１）入力動画像を動き補償予測手段、直交変換手段、量子化手段、及び可変長符号化手段を有して符号化する動画像符号化装置において、
前記入力動画像の各画像の発生符号量を検出する手段と、
前記入力動画像の各画像の平均量子化スケールを検出する手段と、
前記入力動画像及び前記動き補償予測手段によって生成される動き補償予測画像のうち、少なくとも前記入力動画像の輝度値の分散、または、画素間差分値によりアクティビティを検出して前記入力動画像の符号化画像特性を得るための手段と、
前記符号化画像特性を得るための手段によって得た符号化画像特性と、この符号化画像特性を得た前記入力動画像の直前に符号化された前記入力動画像の符号化画像特性と、の比が予め決められた所定の閾値を超えたことを判定し、この判定結果に基づいてシーンチェンジ画像情報を検出する手段と、
前記発生符号量を検出する手段によって検出された発生符号量と、前記平均量子化スケールを検出する手段によって検出された平均量子化スケールと、の積から、過去の画像の実測画面複雑度を算出し、
前記算出した過去の画像の実測画面複雑度と、前記符号化画像特性を検出する手段によって検出された符号化画像特性と前記シーンチェンジ画像情報を検出する手段によって検出されたシーンチェンジ画像情報の画像特性との比と、の積から現在の画像の推定画面複雑度を算出する手段と、
前記シーンチェンジ画像情報により検出されたＰ，Ｂピクチャのシーンチェンジ画像、シーンチェンジ画像の次の画像では前記算出された現在の画像の推定画面複雑度を、これらのシーンチェンジ画像及び次の画像以外の場合は前記算出された過去の画像の実測画面複雑度を選択することにより次に符号化する画像の割当符号量を決定し、前記割当符号量から前記次に符号化する画像の量子化スケールを決定する手段と、
を備えたことを特徴とする動画像符号化装置。
２）１）に記載された動画像符号化装置において、
前記現在の画像の推定画面複雑度を算出する手段は、
前記得られた現在の画像の符号化画像特性と、それと同じピクチャタイプ（Ｉピクチャ、Pピクチャ、Bピクチャ）の直前の画像において得られた符号化画像特性と、の比を比例定数とする関数を、前記直前の画像における前記実測画面複雑度に乗算することによって前記推定画面複雑度を算出することを特徴とする動画像符号化装置。
３）１）に記載された動画像符号化装置において、
前記次に符号化する画像の割当符号量を決定する手段において使用する画面複雑度は、
少なくとも前記現在の画像が前記検出されたシーンチェンジ画像情報によってシーンチェンジ画像、またはシーンチェンジ画像の次の画像と判定された場合は、
前記現在の画像の推定画面複雑度を使用することを特徴とする動画像符号化装置。
４）１）乃至３）のいずれかに記載された動画像符号化装置において、
前記発生符号量と前記実測画面複雑度、もしくは前記推定画面複雑度から次に符号化する画像の割当符号量を決定する手段は、前記実測画面複雑度の一定期間における平均値に対する前記推定画面複雑度の割合を、平均割当符号量に乗ずることによって前記割当符号量を決定することを特徴とする動画像符号化装置。
５）入力動画像を動き補償予測ステップ、直交変換ステップ、量子化ステップ、及び可変長符号化ステップを有して符号化する動画像符号化方法において、
前記入力動画像の各画像の発生符号量を検出するステップと、
前記入力動画像の各画像の平均量子化スケールを検出するステップと、
前記入力動画像及び前記動き補償予測ステップによって生成される動き補償予測画像のうち、少なくとも前記入力動画像の輝度値の分散、または、画素間差分値によりアクティビティを検出して前記入力動画像の符号化画像特性を得るためのステップと、
前記符号化画像特性を得るためのステップによって得た符号化画像特性と、この符号化画像特性を得た前記入力動画像の直前に符号化された前記入力動画像の符号化画像特性と、の比が予め決められた所定の閾値を超えたことを判定し、この判定結果に基づいてシーンチェンジ画像情報を検出するステップと、
前記発生符号量を検出するステップによって検出された発生符号量と、前記平均量子化スケールを検出するステップによって検出された平均量子化スケールと、の積から、過去の画像の実測画面複雑度を算出し、
前記算出した過去の画像の実測画面複雑度と、前記符号化画像特性を検出するステップによって検出された符号化画像特性と前記シーンチェンジ画像情報を検出するステップによって検出されたシーンチェンジ画像情報の画像特性との比と、の積から現在の画像の推定画面複雑度を算出するステップと、
前記シーンチェンジ画像情報により検出されたＰ，Ｂピクチャのシーンチェンジ画像、シーンチェンジ画像の次の画像では前記算出された現在の画像の推定画面複雑度を、これらのシーンチェンジ画像及び次の画像以外の場合は前記算出された過去の画像の実測画面複雑度を選択することにより次に符号化する画像の割当符号量を決定し、前記割当符号量から前記次に符号化する画像の量子化スケールを決定するステップと
を備えたことを特徴とする動画像符号化方法。
６）５）に記載された動画像符号化方法において、
前記現在の画像の推定画面複雑度を算出するステップは、
前記得られた現在の画像の符号化画像特性と、それと同じピクチャタイプ(Iピクチャ、Pピクチャ、Bピクチャ)の直前の画像において得られた符号化画像特性と、の比を比例定数とする関数を、前記直前の画像における前記実測画面複雑度に乗算することによって前記推定画面複雑度を算出することを特徴とする動画像符号化方法。
７）５）に記載された動画像符号化方法において、
前記次に符号化する画像の割当符号量を決定するステップにおいて使用する画面複雑度は、
少なくとも前記現在の画像が前記検出されたシーンチェンジ画像情報によってシーンチェンジ画像、またはシーンチェンジ画像の次の画像と判定された場合は、
前記現在の画像の推定画面複雑度を使用することを特徴とする動画像符号化方法。
８）５）乃至７）のいずれかに記載された動画像符号化方法において、
前記発生符号量と前記実測画面複雑度、もしくは前記推定画面複雑度から次に符号化する画像の割当符号量を決定するステップは、前記実測画面複雑度の一定期間における平均値に対する前記推定画面複雑度の割合を、平均割当符号量に乗ずることによって前記割当符号量を決定することを特徴とする動画像符号化方法。
【００１９】
よって、本発明では、MPEG２等の動き補償予測、直交変換、量子化、可変長符号化の各手段を備えた動画像符号化装置において、各画像の発生符号量と平均量子化スケールを検出し、所定区間内の割当符号量をその間の各画像に配分するために、各画像の発生符号量と平均量子化スケールの積に対して所定の操作を施して既に符号化した画像の実測画面複雑度を求める。
それと同時に、各画像の符号化画像特性(アクティビティ)を検出し、同じピクチャタイプの直前の画像との符号化画像特性の比が算出されると共に、この符号化画像特性からシーンチェンジ検出を行う。
【００２０】
各画像の符号化画像特性としては入力画像のアクティビティの他に、Ｐ及びＢピクチャについては動き補償予測における誤差画像または動きベクトル検出における符号化画像と参照画像の差分画像、及び動きベクトルのアクティビティを用い、その中から１つあるいは複数を組合せて使用する。
符号化画像特性の各要素の値が所定の範囲を超えた場合、その画像をシーンチェンジ画像と判定する。
【００２１】
各ピクチャタイプにおいてシーンチェンジ画像と判定された画像では、前記した符号化画像特性の比を因数とする所定の関数と直前の画像の実測画面複雑度から、推定画面複雑度が算出され、推定画面複雑度が実測画面複雑度よりも大きな値となるため、シーンチェンジ画像により多くの符号量を割当てることが出来るようにする。
【００２２】
逆に直前の画像がシーンチェンジ画像であった場合は、実測画面複雑度を用いるのではなく、直前の画像の符号量割当の際に算出する実測画面複雑度、すなわちシーンチェンジ前の画像の画面複雑度に対し、シーンチェンジ前の画像とこれから符号化する画像の符号化画像特性の比を因数とする所定の関数とから、これから符号化する画像の画面複雑度の推定を行うことにより、シーンチェンジ画像の次の画像に不必要に多くの符号量を割当てることが防止される。
【００２３】
また、１パス方式の可変ビットレート制御の場合は、符号化画像特性の比と直前の画像の実測画面複雑度から、これから符号化する画像の推定画面複雑度が算出されるが、各ピクチャタイプにおいてシーンチェンジ画像と判定された画像では、前記した符号化画像特性の比を因数とする所定の関数と直前の画像の実測画面複雑度から推定画面複雑度が算出され、推定画面複雑度が通常の場合より加算される。
【００２４】
逆に直前の画像がシーンチェンジ画像であった場合は実測画面複雑度の代りに、直前の画像の符号量割当の際に算出する、シーンチェンジ前の画像の実測画面複雑度、またはシーンチェンジ画像における加算前の推定画面複雑度を用いて、符号化画像の画面複雑度の推定を行うことにより、シーンチェンジ画像の次の画像における画面複雑度の値を適正化出来る。
【００２５】
さらに１パス方式の可変ビットレート制御では、算出された推定画面複雑度と一定区間内のピクチャタイプ別平均画面複雑度の割合を目標ビットレートによる所定区間の符号量割当に反映させることにより、シーンチェンジに対応した所定区間の符号量割当が可能になる。
【００２６】
【発明の実施の形態】
（第１の実施例）
本発明の動画像符号化装置及びその方法の第１の実施例について、図と共に以下に説明する。
図１に示されている本発明の動画像符号化装置の第１の実施例は、減算器１１、ＤＣＴ器１２、量子化器１３、符号量制御器１４、可変長符号化器１５、バッファ１６、逆量子化器１７、ＩＤＣＴ器１８、動き補償予測器１９、加算器２０、フレームメモリ２１、平均量子化スケール検出器２２、発生符号量検出器２３、画面複雑度算出器２４、画面複雑度メモリ２４Ｍ、画像特性検出器２５、及びシーンチェンジ検出器２８より構成されている。
【００２７】
第１の実施例は本発明を固定ビットレート符号化に適用した場合である。
なお、以下の説明において、ｉはＩピクチャ、ｐはＰピクチャ、ｂはＢピクチャと対応している。
原動画像は画像ブロック分割器（図示されていない）によって、予めマクロブロック単位に分割されているものとする。
分割された原動画像は、Ｉピクチャについては動き補償予測が行われず、原動画像ブロックそのものが減算器１１を介してDCT器１２に送られ、DCTされた後に量子化器１３で符号量制御器１４から送られる量子化スケールによって量子化される。
【００２８】
その量子化された信号は、可変長符号化器１５で符号に変換されて、次のバッファ１６で調整された後に符号が出力される。
一方、量子化器１３の出力係数は逆量子化器１７、IDCT器１８で局部復号されて、動き補償予測器１９の出力が加算器２０で加算されることなく、ブロック毎にフレームメモリ２１に貯えられる。
【００２９】
P及びBピクチャについては、分割された原動画像とフレームメモリ２１に貯えられた所定の局部復号画像ブロックが動き補償予測器１９に供給され、ここで動きベクトル検出及び動き補償が行われて、予測画像ブロックが減算器１１で原画像ブロックとの間で画素間差分が取られ、差分値である誤差画像ブロックがDCT器１２に供給される。
【００３０】
この後はIピクチャと同様にして、DCT器１２で差分値がDCTされ、量子化器１３で符号量制御器１４から送られる量子化スケールによって量子化された後に、可変長符号化器１５で符号に変換されて、次のバッファ１６で調整された後に符号が出力される。
一方、量子化器１３の出力係数は、逆量子化器１７とIDCT器１８とで局部復号された後に前記予測画像ブロックが加算器２０によって画素毎に加算され、ブロック毎にフレームメモリ２１に貯えられる。
【００３１】
また、各ピクチャについて、量子化器１３からマクロブロック毎の量子化スケールが平均量子化スケール検出器２２に送られ、そこで1フレーム分の量子化スケールが加算され、1フレームの平均量子化スケールが算出される。
一方、バッファ１６において発生符号量が監視され、その値が発生符号量検出器２３に供給される。
この発生符号量検出器２３において、発生符号量がフレーム単位に加算され、1フレームの発生符号量が検出される。
【００３２】
フレーム毎について検出された平均量子化スケール、及び発生符号量は各々画面複雑度算出器２４にフレーム毎に供給される。
画面複雑度算出器２４では、供給された各フレームの平均量子化スケールと発生符号量が乗じられた後に所定の操作を施して、MPEG２Test Model 5におけるComplexityに相当する各フレームの実測画面複雑度Xi-p, Xp-p, Xb-pが求められる。
【００３３】
一方、画像特性検出器２５では、入力時に原画像を分割した原動画像が供給され、原動画像の各フレームについてマクロブロック単位に画像特性を示すパラメータであるアクティビティが検出され、フレーム単位に加算されて、その結果が１フレーム毎に画面複雑度算出器２４及びシーンチェンジ検出器２８に供給される。
【００３４】
ここで、画像特性検出器２５で画像特性を検出する動作は、実際の符号化動作に先行して検出している。
画像特性を示すパラメータとしては輝度値の分散、画素間差分値などが考えられるが、画像特性を示すものであれば、その他のパラメータでも当然よい。
【００３５】
シーンチェンジ検出器２８では、これから符号化する現在の画像のアクティビティACTi, ACTp, ACTbと、直前に符号化した同じピクチャタイプの画像のアクティビティACTi-p, ACTp-p, ACTb-pの比率(ACTi / ACTi-p)，(ACTp / ACTp-p)，(ACTb / ACTb-p)が計算される。
【００３６】
そして、この計算された比率（現在のアクティビティ／直前のアクティビティ）が一定範囲を超えた時、例えば、
(比率)＜Amin または (比率)＞Amax ( 0＜Amin＜1, Amax＞1 )
の場合、これら２つの画像の間でシーンチェンジが起こったとシーンチェンジ検出器２８は判定し、このフレームの位置情報を画面複雑度算出器２４に送る。
シーンチェンジが起こったと判定された時、これから符号化する現在の画像を以後、シーンチェンジ画像と呼ぶ。
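The threshold test performed by the scene change detector 28 can be sketched in Python as follows. This is only an illustrative sketch: the function name, the handling of a missing reference picture, and the default thresholds (Amin = 0.67, Amax = 1.5, borrowed from the example values given later in the description) are assumptions.

```python
def is_scene_change(act_cur, act_prev, a_min=0.67, a_max=1.5):
    """Return True when the activity ratio leaves the range [a_min, a_max].

    act_cur  : activity of the picture about to be encoded
    act_prev : activity of the previously encoded picture of the same type
    """
    if act_prev <= 0:
        # Assumption: with no usable reference activity, make no decision.
        return False
    ratio = act_cur / act_prev
    return ratio < a_min or ratio > a_max
```

A picture for which this returns True corresponds to what the text calls a scene change image.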
【００３７】
シーンチェンジが発生したというフレームの位置情報が画面複雑度算出器２４に送られると、画面複雑度算出器２４では、Ｐ及びＢピクチャについて、シーンチェンジ画像における発生符号量の増加を補うための画面複雑度の推定(割増し)が行われる。
【００３８】
ここで、先に求めた実測画面複雑度は、その次の同じピクチャタイプの画面複雑度の推定に使用するため、画面複雑度メモリ２４Ｍに貯えられる。
シーンチェンジ画像における推定画面複雑度Xp, Xbは次式(5)〜(8)の通りになる。
なお、f1〜f4はアクティビティの比率(ACTp / ACTp-p)，(ACTb / ACTb-p)を因数とする所定の関数である。
前記の所定の関数の一実施例として、f1〜f4＝(ACTk / ACTk-p)・Ck1+ Ck2
（但し k＝p or b）が適当であるが、これに限定されるものではない。
【００３９】
（Ｐピクチャ）
(ACTp / ACTp-p)＞Amaxの場合
推定画面複雑度 Xp＝Xp-p・f1 (5)
(ACTp / ACTp-p)＜Aminの場合
推定画面複雑度 Xp＝Xp-p・f2 (6)
（Ｂピクチャ）
(ACTb / ACTb-p)＞Amaxの場合
推定画面複雑度 Xb＝Xb-p・f3 (7)
(ACTb / ACTb-p)＜Aminの場合
推定画面複雑度 Xb＝Xb-p・f4 (8)
【００４０】
例えば、Amax＝1.5, Amin＝0.67, f1＝(ACTp / ACTp-p)・2, f2＝1,
f3＝(ACTb / ACTb-p)・２, f4＝1と設定することにより、アクティビティが増加する場合はアクティビティの比率の２倍の値を推定画面複雑度として与え、アクティビティが減少する場合はシーンチェンジ前と同程度の画面複雑度を推定画面複雑度として与えることにより、シーンチェンジによってＰ及びＢピクチャでイントラマクロブロックの増大による発生符号量の増加を補うことが出来る。
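Equations (5) to (8) with the example settings of this paragraph (Amax = 1.5, Amin = 0.67, f1 = f3 = ratio · 2, f2 = f4 = 1) might be coded as below. The function name and the fall-through branch for pictures that are not scene change images are assumptions added for illustration.

```python
def estimate_scene_change_complexity(x_prev, act_cur, act_prev,
                                     a_min=0.67, a_max=1.5):
    """Estimated complexity Xp or Xb of a scene change image.

    x_prev is the measured complexity (Xp-p or Xb-p) of the previous
    picture of the same type.
    """
    ratio = act_cur / act_prev
    if ratio > a_max:
        # Equations (5) and (7) with f1 = f3 = ratio * 2.
        return x_prev * (ratio * 2.0)
    if ratio < a_min:
        # Equations (6) and (8) with f2 = f4 = 1.
        return x_prev * 1.0
    # Not a scene change image: keep the measured complexity (assumption).
    return x_prev
```

For instance, a tripled activity (ratio 3.0) with a previous complexity of 100 yields an estimate of 600, while a halved activity leaves the estimate at 100.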
【００４１】
また、シーンチェンジ画像の次の同じピクチャタイプのＰまたはＢピクチャについては、シーンチェンジ画像の増加した発生符号量から計算される画面複雑度Xp-p, Xb-pを基準として画面複雑度を計算すると、必要以上に大きな値となってしまう。
【００４２】
そこで、Xp-p, Xb-pの代りに、シーンチェンジ画像の時に使用した、(画面複雑度メモリ２４Mに貯えられた)実測画面複雑度Xp-pold, Xb-poldと、その実測画面複雑度の対象となった画像のアクティビティACTp-pold, ACTb-poldと、これから符号化する現在の画像のアクティビティACTp, ACTbから、(9)(10)式によって画面複雑度を推定することにより、画面複雑度の適正化が図られる。
【００４３】
（Ｐピクチャ）
Xp＝Xp-pold・(ACTp / ACTp-pold) (9)
（Ｂピクチャ）
Xb＝Xb-pold・(ACTb / ACTb-pold) (10)
なお、画面複雑度メモリ２４Ｍには、シーンチェンジ画像における実測画面複雑度の他に、各ピクチャタイプにおける直前数フレームの画像のアクティビティが貯えられる。
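Equations (9) and (10), which apply to the picture of the same type that follows a scene change image, could be sketched as follows. The record type stands in for the contents of the screen complexity memory 24M; its name and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreSceneChangeRecord:
    """Values kept in memory for the picture encoded before the scene change."""
    x_pold: float    # measured complexity Xp-pold or Xb-pold
    act_pold: float  # activity ACTp-pold or ACTb-pold of that same picture


def estimate_after_scene_change(record, act_cur):
    """Equations (9)/(10): scale the pre-scene-change complexity by activity."""
    return record.x_pold * (act_cur / record.act_pold)
```

Using the stored pre-scene-change values avoids basing the estimate on the inflated code amount of the scene change image itself.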
【００４４】
第１の実施例においては、Ｐ、Ｂピクチャのシーンチェンジ画像、シーンチェンジ画像の次の画像では推定画面複雑度、それ以外の通常のＰ、Ｂピクチャ及びＩピクチャについては、実測画面複雑度をMPEG２Test Model５の(2)〜(4)式のXi, Xp, Xbに代入して、これから符号化する画像の目標割当符号量Ti, Tp, Tbを決定することにより、シーンチェンジに対応した符号量割当を行うことが出来る。この目標割当符号量と、バッファ１６で検出される各マクロブロックの発生符号量をもとに、MPEG２Test Model５の方法を用いて各マクロブロックの量子化スケールを決定する。
【００４５】
なお、画像特性検出器２５からは符号量制御器１４へも各マクロブロックのアクティビティが送られ、MPEG２Test Model５におけるアクティビティに基づいて各マクロブロックの量子化スケールを変更する適応量子化制御に使用されるが、この適応量子化制御は行わなくてもよい。
【００４６】
符号量制御器１４から出力される各マクロブロックの量子化スケールが量子化器１３に送られ、現在の画像(DCT後の分割された原動画像または動き補償予測の誤差画像ブロック)がこの量子化スケールで量子化器１３で量子化され、可変長符号化器１５で可変長符号化されて、次のバッファ１６で調整された後に符号が出力される。量子化器１３のマクロブロック毎の量子化スケール、バッファ１６で監視される発生符号量がそれぞれ、平均量子化スケール検出器２２、発生符号量検出器２３に送られ、次のピクチャの符号量制御に使用される。
【００４７】
なお、本実施例ではMPEG２Test Model５に本発明を適用した例であるが、本発明はそれに限らず、他のレート制御にも適用可能である。
例えばアクティビティがACTk-p(k=i,p,b)である同じピクチャタイプの過去の画像の発生符号量から、次に符号化する画像の割当符号量Ｓkが定まる場合、アクティビティがACTkであるその画像がシーンチェンジ画像だとすると、補正割当符号量Ｓk'は次の(11)式の通りになる。
Ｓk' ＝Ｓk・ｆk(ACTk / ACTk-p) (11)
但し、 k=i,p,b ｆkは(ACTk / ACTk-p)を因数とする関数
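Equation (11) can be illustrated with a short Python sketch. The text leaves the function fk open, so the identity default used here is purely an assumption, as is the function name.

```python
def corrected_allocation(s_k, act_cur, act_prev, f_k=lambda ratio: ratio):
    """Equation (11): Sk' = Sk * fk(ACTk / ACTk-p).

    s_k is the allocation derived from past pictures of the same type;
    f_k is any function of the activity ratio (identity by default here).
    """
    return s_k * f_k(act_cur / act_prev)
```

With an allocation of 100,000 bits and an activity ratio of 1.5, the identity default gives 150,000 bits; passing `f_k=lambda r: 2 * r` would double that.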
【００４８】
（第２の実施例）
つぎに、本発明の動画像符号化装置の第２の実施例について、図２と共に以下に説明する。
第２の実施例は本発明を１パス方式の可変ビットレート符号化に適用した場合であり、第１の実施例と比較して、平均画面複雑度算出器２９が追加され、画面複雑度算出器２４と符号量制御器１４の構成及びその動作のみが異なるので、それ以外の部分についての説明は省略する。
【００４９】
第１の実施例と同様に、画面複雑度算出器２４には各フレームの平均量子化スケールと発生符号量、シーンチェンジが発生したというフレームの位置情報、及び各フレームの画像特性、すなわちアクティビティが供給される。
画面複雑度算出器２４では、供給された各フレームの平均量子化スケールと発生符号量が乗じられた後に乗算結果に所定の変換が施されて、それを基準として各フレームの実測画面複雑度が求められる。
【００５０】
実測画面複雑度は平均画面複雑度算出器に送られ、ここで符号化ピクチャタイプ別に一定期間内の値が加算された後に、その期間内の同じピクチャタイプのフレーム数で除算されて、Ｉ、Ｐ、B各ピクチャタイプの平均画面複雑度 Xi-ave(Ｉピクチャ)、Xp-ave(Ｐピクチャ)、Xb-ave(Ｂピクチャ) が算出される。
【００５１】
ここで言う一定期間内は、符号化の終了したばかりの画像から時間的に前に予め定めるフレーム数、例えば１５フレームとか、３００フレームといった一定のフレーム数の場合もあり、符号化開始フレームから符号化の終了したばかりの画像までのように、順次フレーム数が増加する場合もある。
なお、前者の一定フレーム数の場合でも、符号化したフレーム数が定めた一定期間を満たさない場合は後者と同様に順次フレーム数が増加していくことになる。
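The per-picture-type averaging described above, with either a fixed window or a window that grows from the start of encoding, might look like this in Python. The class and method names are hypothetical; `window=None` is used here to represent the growing-window case.

```python
from collections import deque


class AverageComplexity:
    """Running average of measured complexity, kept per picture type."""

    def __init__(self, window=None):
        # window=None: average over everything encoded so far.
        # window=n   : average over the last n pictures of each type.
        self._history = {t: deque(maxlen=window) for t in "IPB"}

    def add(self, picture_type, measured_x):
        self._history[picture_type].append(measured_x)

    def average(self, picture_type):
        values = self._history[picture_type]
        if not values:
            return None  # no picture of this type encoded yet
        return sum(values) / len(values)
```

Because a bounded deque simply holds fewer entries until it fills, the fixed-window case behaves like the growing-window case at the start of encoding, as the text notes.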
【００５２】
つぎに画面複雑度算出器２４では、求めた実測画面複雑度と平均画面複雑度算出器２９で求めた平均画面複雑度、及びアクティビティから画面複雑度の推定を行う。
これから符号化する現在の画像の画面複雑度 Xi, Xp, Xb は、現在の画像のアクティビティACTi, ACTp, ACTb、直前に符号化した同じピクチャタイプの画像の画面複雑度Xi-p, Xp-p, Xb-p、直前に符号化した同じピクチャタイプの画像のアクティビティACTi-p, ACTp-p, ACTb-pより、次式(12)(13)(14)で推定出来る。
【００５３】
（Ｉピクチャ）
現在の画像の画面複雑度Xi＝Xi-p・(ACTi / ACTi-p) (12)
（Ｐピクチャ）
現在の画像の画面複雑度Xp＝Xp-p・(ACTp / ACTp-p) (13)
（Ｂピクチャ）
現在の画像の画面複雑度Xb＝Xb-p・(ACTb / ACTb-p) (14)
【００５４】
なお、初期状態において、同じピクチャタイプの符号化の終了したフレームが存在しない場合は予め幾つかの画像で各ピクチャタイプの画像の画面複雑度とアクティビティを求めておき、それを平均的な動画像の発生頻度に合わせて統計的に平均してそれを初期値とすればよい。
【００５５】
一方、これから符号化する現在の画像がシーンチェンジ画像の場合、より多くの符号量を割当てるため、Ｐ及びＢピクチャでは画面複雑度算出器２４において推定画面複雑度の再計算が行われる。
ここで、直前に符号化した同じピクチャタイプの画像の実測画面複雑度は、その次の同じピクチャタイプの画面複雑度の推定に使用するため、画面複雑度メモリ２４Ｍに貯えられる。
【００５６】
シーンチェンジ画像における推定画面複雑度Xp, Xbは第１の実施例の(5)〜(8)式と同様にして求める。
なお、f1〜f4はアクティビティの比率(ACTp / ACTp-p)，(ACTb / ACTb-p)を因数とする所定の関数であるが、これを第１の実施例とは異なる関数を使用してもよい。
【００５７】
さらに、シーンチェンジ画像の次の同じピクチャタイプのＰまたはＢピクチャについては、第１の実施例の(9)(10)式と同様に、画面複雑度メモリ２４Ｍに貯えられた実測画面複雑度Xp-pold, Xb-poldと、その実測画面複雑度の対象になった画像のアクティビティACTp-pold, ACTb-pold、これから符号化する現在の画像のアクティビティACTp, ACTbから、画面複雑度を推定することにより、画面複雑度の適正化が図られる。
【００５８】
なお、第2の実施例においては、シーンチェンジの有無に関らず推定画面複雑度の計算を行うので、シーンチェンジ画像の際に画面複雑度メモリ２４Ｍに貯える値を実測画面複雑度ではなく、前記(13)(14)式で求めた再計算前の推定画面複雑度Xp-old, Xb-oldに変更し、またシーンチェンジ画像のつぎの同じピクチャタイプの画像における画面複雑度の推定では、次の(15)(16)式によって計算してもよい。
この場合、アクティビティを画面複雑度メモリ２４Ｍに貯える画像をシーンチェンジの有無によって切り替える必要がないため、装置の簡略化が出来る。
【００５９】
（Ｐピクチャ）
現在の画像の画面複雑度Xp＝Xp-old・(ACTp / ACTp-p) (15)
（Ｂピクチャ）
現在の画像の画面複雑度Xb＝Xb-old・(ACTb / ACTb-p) (16)
【００６０】
このようにして計算された、これから符号化する現在の画像の推定画面複雑度 Xi, Xp, Xbと、各ピクチャタイプの平均画面複雑度 Xi-ave, Xp-ave, Xb-aveは符号量制御器１４に送られる。
【００６１】
符号量制御器１４では、つぎに(これから)符号化する画像の割当符号量の設定(決定)、及び可変ビットレート制御のための量子化スケールの設定(決定)が行われる。
目標平均ビットレートをBitRate、1秒当りのフレーム数をPictureRate、1つの符号化単位である1GOP(通常はIピクチャの間隔)のフレーム数をNとすると、1GOPの平均割当符号量Raveは次式(17)で与えられる。
【００６２】
1GOPの平均割当符号量Rave＝(BitRate／PictureRate)・N (17)
上式のRaveは平均画面複雑度の時の1GOPの必要割当符号量とすると、これから符号化する現在の画像を含む1GOPの画像が一様に前記画面複雑度算出器２４で求めた現在の画像の推定画面複雑度に等しいと仮定すると、画質を一定に保持する場合に必要な1GOPの必要割当符号量Rcは次式(18)(19)(20)で与えられる。
【００６３】
(Ｉピクチャ)
必要割当符号量Rc ＝ Rave・( Xi / Xi-ave ) (18)
(Ｐピクチャ)
必要割当符号量Rc ＝ Rave・( Xp / Xp-ave ) (19)
(Ｂピクチャ)
必要割当符号量Rc ＝ Rave・( Xb / Xb-ave ) (20)
【００６４】
これら上式の必要割当符号量Rcを1GOPの各ピクチャに適切に割り振ることにより、これから符号化する現在の画像の目標符号量を算出し、各マクロブロックの量子化スケールを決定する。
例えばMPEG２Test Model５の方法を用いると、上で求めた現在の画像の推定画面複雑度 Xi, Xp, Xbを(2)〜(4)式のXi, Xp, Xbに、割当符号量Rcを(2)〜(4)式のRに代入して、これから符号化する画像の目標割当符号量Ti, Tp, Tbを決定する。
【００６５】
但し、第２の実施例においては、GOP毎に一定符号量にする必要がないので、Rcの値は(2)〜(4)式のＲのように各フレームの発生符号量で減算したり、GOPの先頭で加算する必要はない。また(2)〜(4)式のNp, Nbは常に一定値(GOP先頭での値)となる。
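Equations (17) to (20) reduce to one scaling step, sketched below in Python. The function name and the sample figures are assumptions for illustration only.

```python
def required_gop_allocation(bit_rate, picture_rate, n_frames, x_est, x_ave):
    """Required code amount Rc for one GOP at constant picture quality.

    x_est : estimated complexity of the current picture (Xi, Xp or Xb)
    x_ave : average complexity of the same picture type
    """
    r_ave = bit_rate / picture_rate * n_frames  # equation (17)
    return r_ave * (x_est / x_ave)              # equations (18) to (20)


# Example: 6 Mbit/s average, 30 frames/s, 15-frame GOP,
# current picture 20 % more complex than average.
rc = required_gop_allocation(6_000_000, 30, 15, x_est=120.0, x_ave=100.0)
```

Unlike R in the fixed bit rate case, this Rc is recomputed for each picture rather than decremented by the code amount generated so far.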
【００６６】
その後は第１の実施例と同様に、目標割当符号量と、バッファ１６で検出される各マクロブロックの発生符号量をもとに、各マクロブロックの量子化スケールを決定し、必要に応じて各マクロブロックのアクティビティによって量子化スケールを変更する適応量子化制御を行う。
このようにして、１パス方式の可変ビットレート制御においても、シーンチェンジに対応した符号量割当が可能になる。
【００６７】
（第３の実施例）
つぎに、本発明の動画像符号化装置の第３の実施例について、図３と共に以下に説明する。
第３の実施例も、本発明を１パス方式の可変ビットレート符号化に適用した場合であるが、第２の実施例と比較して、図４に示す画像特性検出器２５及びシーンチェンジ検出器２８の構成及びその動作及び、画像特性検出器２５からシーンチェンジ検出器２８に送られるアクティビティの信号と、シーンチェンジ検出器２８から画面複雑度算出器２４に送られるシーンチェンジ情報の内容のみが異なる。
また、図３は画像特性検出器２５に対して動き補償予測器１９より動き補償信号が供給されている点が図２と異なっており、それ以外の部分についての説明は省略する。
【００６８】
図４に示す画像特性検出器２５は、ACTcur検出器２５Ａ、ACTpred検出器２５Ｂ、ACTmv検出器２５Ｃと各々の累積器２５Ｄ,２５Ｅ,２５Ｆ、及びピクチャアクティビティ算出器２５Ｇより構成されている。
また、シーンチェンジ検出器２８は、SCcur検出器２８Ａ、SCpred検出器２８Ｂ、SCmv検出器２８Ｃ、及びシーンチェンジ判定器２８Ｄより構成されている。
【００６９】
図３、図４の実施例において、画像特性検出器２５への入力は、Ｉピクチャの場合は動き補償予測が行われないため、第２の実施例と同じくマクロブロック単位に分割された原動画像のみが入力され、マクロブロック単位に画像特性を示すパラメータであるアクティビティ(ACTcur)が検出され、フレーム単位に加算され、ＩピクチャのアクティビティACTiとして画面複雑度算出器２４に送られる。
また、ACTcur検出器２５Ａの出力値は累積器２５Ｄによってフレーム単位に加算された後にシーンチェンジ検出器２８に送られ、シーンチェンジ判定に使用される。
【００７０】
一方、図４に示す画像特性検出器２５への入力は、Ｐ及びＢピクチャの場合は分割された原動画像の他に、マクロブロック単位の動き補償予測における誤差画像または動きベクトル検出における符号化画像と参照画像との差分画像と、動き補償予測で使用した動きベクトルが図３に示す動き補償予測器１９から入力される。
分割された原動画像からはＩピクチャの場合と同様にマクロブロック単位に(原画像)アクティビティACTcurが検出される。
【００７１】
一方、マクロブロック単位の動き補償予測における誤差画像または動きベクトル検出における符号化画像と参照画像との差分画像は、その中で絶対値和または２乗誤差和がとられ、予測アクティビティACTpredとして検出される。
さらに、動き補償予測で使用した動きベクトルの方は、隣接マクロブロックとの間で各成分毎に差分の絶対値がとられるなどして、動きベクトルアクティビティACTmvとして検出される。
【００７２】
そして、各マクロブロック毎に次式(21)の演算により、マクロブロックアクティビティACTmbが算出され、それが１フレーム分加算されて、Ｐ及びＢピクチャのアクティビティACTp及びACTbとして画面複雑度算出器２４に送られる。
また、ACTcur、ACTpred、ACTmv検出器２５Ａ,２５Ｂ,２５Ｃの値は各々累積器２５Ｄ，２５Ｅ，２５Ｆによってフレーム単位に加算された後にシーンチェンジ検出器２８に送られ、シーンチェンジ判定に使用される。
マクロブロックアクティビティACTmb
＝ａ・ACTcur ＋ｂ・ACTpred ＋ｃ・ACTmv (21)
【００７３】
なお、各定数ａ、ｂ、ｃの値はピクチャ別、マクロブロックの予測モード別(イントラか片方向予測か双方向予測か)などで変化させる。
例えばイントラの場合はIピクチャと同様に予測を行わないので、ｂ＝ｃ＝０となり、予測を行うブロックに比べて発生符号量が多くなると考えられるので、ａの値を大きくする。
このように、予測モード等に即したアクティビティ検出を行うことにより、第２の実施例に比べ、より符号化特性に即した画面複雑度の推定が可能になる。
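The weighted sum of equation (21) and its accumulation over one frame can be sketched as follows. The weight table is entirely hypothetical; the text only states that b = c = 0 for intra macroblocks and that a should then be larger.

```python
# Hypothetical weights (a, b, c) per prediction mode.
WEIGHTS = {
    "intra": (2.0, 0.0, 0.0),          # no prediction: b = c = 0, larger a
    "forward": (1.0, 1.0, 0.5),
    "bidirectional": (1.0, 0.8, 0.5),
}


def macroblock_activity(act_cur, act_pred, act_mv, mode):
    """Equation (21): ACTmb = a*ACTcur + b*ACTpred + c*ACTmv."""
    a, b, c = WEIGHTS[mode]
    return a * act_cur + b * act_pred + c * act_mv


def picture_activity(macroblocks):
    """Sum ACTmb over all macroblocks of one frame."""
    return sum(macroblock_activity(*mb) for mb in macroblocks)
```

Each entry of `macroblocks` is a tuple `(act_cur, act_pred, act_mv, mode)`.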
【００７４】
また、シーンチェンジ検出器２８には画像特性検出器２５から、フレーム単位に加算されたACTcur、ACTpred、ACTmvの値が送られる。
Ｉピクチャの場合はACTcurしか送られないので、ACTcurについて第２の実施例と同様に、これから符号化する現在の画像のアクティビティと、直前に符号化したＩピクチャのアクティビティの比率が計算される。
そしてこのアクティビティの比率が一定範囲を超えた時、例えば、
(比率)＜Amin または (比率)＞Amax
但し、(比率)＝(符号化画像のアクティビティ) / (直前の画像のアクティビティ)
0＜Amin＜1, Amax＞1
の場合、これら２つの画像の間でシーンチェンジが起こったと判定する。
【００７５】
一方、Ｐ及びＢピクチャの場合はACTcur、ACTpred、ACTmvの３つの値が画像特性検出器２５から送られてくるので、３つの値各々について、これから符号化する現在の画像のアクティビティと、直前に符号化した同じピクチャタイプの画像のアクティビティの比率を計算する。
そして、
・(ACTcurの比率) ＜A1min または (ACTcurの比率)＞A1max
・(ACTpredの比率)＜A2min または (ACTpredの比率)＞A2max
・(ACTmvの比率) ＜A3min または (ACTmvの比率) ＞A3max
但し、(比率)＝(符号化画像のアクティビティ) / (直前の画像のアクティビティ)
０＜A1min, A2min, A3min＜１, A1max, A2max, A3max＞１
のいずれかを満たす時、これら２つの画像の間でシーンチェンジが起こったと判定する。
【００７６】
なお、１つの判定方法では誤判定する可能性が高いと判断した場合は、２つまたは３つ全ての条件を満たす場合のみシーンチェンジが起こったと判定するように条件を限定してもよい。
【００７７】
シーンチェンジ画像の位置情報は、ACTcur、ACTpred、ACTmvのどの判定でシーンチェンジと判断したかという情報も同時に、画面複雑度算出器２４に送られる。
また、画面複雑度算出器２４では、どのアクティビティ判定でシーンチェンジと判断したかによって、(5)〜(8)式のf1〜f4の関数を変更する。
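The three-way test for P and B pictures, including the stricter variant that requires two or three criteria to agree, might be sketched as below. The dictionary keys, function names and the `required` parameter are assumptions; the list of criteria that fired corresponds to the information passed on to the screen complexity calculator 24.

```python
def fired_criteria(ratios, limits):
    """Return the names of the activity ratios that left their allowed range.

    ratios : e.g. {"cur": 1.0, "pred": 3.0, "mv": 1.2}
    limits : per-criterion (a_min, a_max) pairs with the same keys
    """
    return [name for name, r in ratios.items()
            if r < limits[name][0] or r > limits[name][1]]


def is_scene_change_multi(ratios, limits, required=1):
    """Scene change when at least `required` criteria fired (1, 2 or 3)."""
    return len(fired_criteria(ratios, limits)) >= required
```

With `required=1` any single criterion triggers a scene change; raising it to 2 or 3 trades sensitivity for fewer false detections.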
【００７８】
このように第３の実施例では、原動画像のアクティビティACTcurだけでなく、動き補償予測における誤差画像や動きベクトルのアクティビティACTpred、ACTmvもシーンチェンジ判定に使用することにより、シーンチェンジ判定の精度を向上させることが出来る。
なお、以上の第３の実施例では、第２の実施例の１パス方式の可変ビットレート符号化に対して適用したが、同様な方法を第１の実施例である固定ビットレート符号化に適用してもよい。
【００７９】
なお、以上の実施例では、ピクチャ符号化構造が図６のようなＩピクチャ、Ｐピクチャ、Ｂピクチャの３種類存在するとして説明したが、ＩピクチャとＰピクチャ、ＩピクチャとＢピクチャのような２種類のみであってもよい。
また、全てのピクチャが動き補償予測が行われないＩピクチャであってもよい。
但し、このＩピクチャのみの場合における第３の実施例は、画像特性検出器２５への入力が分割された原動画像のみとなるため、第２の実施例と全く同一になる。
【００８０】
また、以上の実施例ではＩ、Ｐ、Ｂ各々のピクチャタイプについて、別々にシーンチェンジ判定を行い、シーンチェンジ画像を決定していた。
しかしながら、M＞２の場合で、図５に示されるように、Ｂピクチャを挟むＰまたはＩピクチャの間でシーンチェンジが生じた場合は、その間の複数のＢピクチャ全てが通常の場合より予測が当たりにくくなる可能性がある。
【００８１】
そこで、図５のように、入力画像順でＢピクチャの直後となるＰまたはＩピクチャ、及び、前記Ｂピクチャの直後となるＰまたはＩピクチャと直前のＰまたはＩピクチャに挟まれるＢピクチャのいずれかでシーンチェンジ画像と判定された場合、間のＢピクチャを全てシーンチェンジ画像と判定してもよい。
【００８２】
このように連続するＢピクチャを全てシーンチェンジ画像と判定した場合は、それらのＢピクチャについて、最初のシーンチェンジ画像における「直前の」Ｂピクチャをシーンチェンジ直前の画像として、その後のＢピクチャでも共通に使用する。
シーンチェンジ画像の次の画像は最後のシーンチェンジ画像の次のＢピクチャとし、比較するシーンチェンジ直前の画像は前記した最初のシーンチェンジ画像における直前のＢピクチャを用いる。
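One possible reading of the rule for consecutive B pictures is sketched below: if any B picture in a run, or the I or P picture that closes the run in input order, is judged a scene change image, every B picture in that run is marked. The function name and the data layout are assumptions, not something the text prescribes.

```python
def propagate_b_scene_change(types, flags):
    """Mark whole runs of B pictures as scene change images.

    types : picture types in input order, e.g. "IBBPBBP"
    flags : per-picture scene change decisions (list of bool)
    Returns a new list of flags; the input list is not modified.
    """
    out = list(flags)
    run = []  # indices of the current run of consecutive B pictures
    for i, t in enumerate(types):
        if t == "B":
            run.append(i)
            continue
        # An I or P picture closes the current run of B pictures.
        if run and (flags[i] or any(flags[j] for j in run)):
            for j in run:
                out[j] = True
        run = []
    return out
```

A run of B pictures at the very end of the sequence, with no closing I or P picture yet, is left unchanged in this sketch.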
【００８３】
但しこの場合は、シーンチェンジ画像の次のＢピクチャにおける画面複雑度の推定で(15)(16)式を使用する場合でも、そのＢピクチャの画面複雑度の推定が終了するまで、最初のシーンチェンジ画像の直前のＢピクチャのアクティビティ・実測画面複雑度を画面複雑度メモリ２４Ｍに保持しておく必要があるため、アクティビティを画面複雑度メモリ２４Ｍに貯える画像をシーンチェンジの有無、シーンチェンジ画像の構成によって適宜切り替える必要がある。
【００８４】
なお、第２、第3の実施例において、1GOPの必要割当符号量Rcを求める際に必要となる平均画面複雑度は符号化ピクチャタイプ別に求めていたが、これをピクチャタイプで区別せず、一定期間内における各フレームの画面複雑度を加算した後にその期間内のフレーム数で除算した値を平均画面複雑度 X-ave として求め、それと現在の画像の推定画面複雑度Xk(k＝ i or p or b)から、次式(22)によって1GOPの必要割当符号量Rcを求めてもよい。
【００８５】
必要割当符号量Rc ＝ Rave・( Xk / X-ave ) (22)

【００８６】
【発明の効果】
以上のように本発明によると、固定ビットレート制御または１パス方式の可変ビットレート制御で動画像を符号化する際に、符号化の終了した一定区間の画像の発生符号量と平均量子化スケールと、一定区間及びこれから符号化する現在の画像の符号化画像特性(アクティビティ)を検出し、発生符号量と平均量子化スケールの積に対して所定の操作を施すことによって得られる値を実測画面複雑度として求めた上で、同じピクチャタイプの直前の画像に対する、現在の画像のアクティビティの比率からシーンチェンジの有無を判定し、これから符号化する画像の画面複雑度を、シーンチェンジ画像については前記アクティビティの比率を因数とする所定の関数を実測画面複雑度に乗算することによって推定し、またシーンチェンジ画像の次の画像については、シーンチェンジ直前の画像に対する、現在の画像のアクティビティの比率をシーンチェンジ画像における実測画面複雑度に乗算することによって推定して、この推定画面複雑度を用いてシーンチェンジ時の符号量割当を行うことにより、シーンチェンジといった画像の変化点の直後で、画像劣化が発生せず、逆に符号量に無駄も発生せずに、対応した過不足のない符号量割当を行うことが可能になる。
【図面の簡単な説明】
【図１】本発明の動画像符号化装置及びその方法の第１の実施例のブロック構成を示した図である。
【図２】本発明の動画像符号化装置及びその方法の第２の実施例のブロック構成を示した図である。
【図３】本発明の動画像符号化装置及びその方法の第３の実施例のブロック構成を示した図である。
【図４】本発明の第３の実施例における画像特性検出器とシーンチェンジ検出器の一実施例のブロック構成を示した図である。
【図５】本発明におけるＢピクチャのシーンチェンジ判定の一実施例を説明した図である。
【図６】符号化ピクチャ構造の一例を示した図である。
【図７】一般的な動画像符号化装置の一構成例のブロック構成を示した図である。
【図８】従来における動画像符号化装置の一構成例のブロック構成を示した図である。
【符号の説明】
１１減算器
１２ＤＣＴ器（直交変換器）
１３量子化器
１４符号量制御器
１５可変長符号化器
１６バッファ
１７逆量子化器
１８ＩＤＣＴ器
１９動き補償予測器
２０加算器
２１フレームメモリ
２２平均量子化スケール検出器
２３発生符号量検出器
２４画面複雑度算出器
２４Ｍ画面複雑度メモリ
２５画像特性検出器
２５Ａ ACTcur検出器
２５Ｂ ACTpred検出器
２５Ｃ ACTmv検出器
２５Ｄ,２５Ｅ，２５Ｆ累積器
２５Ｇピクチャアクティビティ算出器
２８シーンチェンジ検出器
２８Ａ SCcur検出器
２８Ｂ SCpred検出器
２８Ｃ SCmv検出器
２８Ｄシーンチェンジ判定器
２９平均画面複雑度算出器
ACTcur 原画像アクティビティ
ACTi, ACTp, ACTb 現在の画像のアクティビティ
ACTi-p, ACTp-p, ACTb-p 直前に符号化した同じピクチャタイプの画像のアクティビティ
ACTmb マクロブロックアクティビティ
ACTmv 動きベクトル特性
ACTp-pold, ACTb-pold 実測画面複雑度の対象になった画像のアクティビティ
ACTpred 誤差画像アクティビティ
Ｒ符号量
Rave 平均割当符号量
Rc 画像の割当符号量
Xi, Xp, Xb 現在の画像の画面複雑度
Xi-ave, Xp-ave, Xb-ave 平均画面複雑度
Xi-p, Xp-p, Xb-p 各フレームの実測画面複雑度
Xp-pold, Xb-pold 実測画面複雑度
[0001]
[Technical field of the invention]
The present invention relates to high-efficiency encoding of moving images, and more particularly, to a code amount control apparatus and method suitable for performing fixed bit rate and variable bit rate encoding in substantially real time.
[0002]
[Prior art]
MPEG2 has already been defined as an international standard for technology for efficiently encoding moving images such as TV signals.
MPEG2 divides each "frame" image of a moving image into 16 × 16-pixel blocks called "macroblocks", and is defined on the basis of two elemental image coding techniques. The first is "motion-compensated prediction": for each macroblock, a motion amount called a "motion vector" is obtained between the image to be encoded and a reference image a predetermined number of frames earlier or later in time, and the image to be encoded is constructed from the reference image based on this motion amount. The second is "transform coding": the amount of information in the motion-compensated prediction error signal, or in the image to be encoded itself, is compressed using the DCT (discrete cosine transform), a kind of orthogonal transform.
[0003]
An example of the configuration of a conventional MPEG2 moving image encoding apparatus is shown in FIG. 7.
An example of the coded picture structure is shown in FIG. 6.
In motion-compensated prediction, as in the coded picture structure shown in FIG. 6, the sequence is composed of a combination of three types of pictures with different prediction methods: I pictures (intra-frame coding), P pictures (forward predictive coding), and B pictures (bidirectional predictive coding).
[0004]
As shown in FIG. 7, in transform coding the DCT is applied by the DCT unit 72 to the image to be encoded itself for I pictures, and to the output of the subtracter 71, which is the error signal of motion-compensated prediction by the motion compensator 79, for P and B pictures.
The DCT coefficients obtained by the DCT unit 72 are quantized by the quantizer 73 under the control of the output of the code amount control unit 90, and are then variable-length coded by the variable-length encoder 75 together with other side information such as motion vectors. The resulting code string is stored in the buffer 76 as a "bitstream" and then output.
At this time, the quantization scale is controlled by the code amount control unit 90 in accordance with the fullness of the buffer 76.
Meanwhile, the output coefficients of the quantizer 73 are supplied to the inverse quantizer 77 and the IDCT unit 78, locally decoded, and then stored block by block in the frame memory 81 via the adder 80.
[0005]
Since MPEG2 uses variable-length coding, the amount of code generated per unit time (the bit rate) is not constant.
Therefore, it is possible to control to a required bit rate by appropriately changing the quantization scale at the time of quantization in the quantizer 73 in units of macroblocks.
MPEG2 Test Model 5 proposes a fixed bit rate control method in which the amount of generated code is constant in GOP units. This fixed bit rate control method is suitable for applications that require a constant transfer rate.
[0006]
An outline of the fixed bit rate control method corresponding to the operation of the code amount control unit of FIG. 7 in Test Model 5 is as follows.
If the target bit rate is BitRate, the number of frames per second is PictureRate, and the number of frames in 1 GOP (usually the interval between I pictures), which is one coding unit, is N, the code amount R allocated to 1 GOP is given by the following equation (1).
R = (BitRate / PictureRate) · N (1)
[0007]
The code amount R of equation (1) is to be distributed among the images in the GOP. For the most recently encoded image of each picture type, the product of the generated code amount of one frame and the average quantization scale is obtained as the screen complexity (Complexity) Xi (I picture), Xp (P picture), Xb (B picture). The target allocated code amount of the image about to be encoded is then determined on the assumption that the images in the GOP, including the image about to be encoded, all have that screen complexity.
[0008]
Let Np and Nb be the numbers of P and B pictures not yet encoded in the current GOP, and let Kp and Kb be the ratios of the quantization scales of P and B pictures to that of I pictures.
At this time, the target allocation code amounts Ti, Tp, and Tb for the I, P, and B picture types are given by the following equations (2), (3), and (4). Note that MAX [A, B] indicates the operation of selecting the larger of A and B.
[0009]
Ti = MAX[ R / (1 + Np · Xp / (Xi · Kp) + Nb · Xb / (Xi · Kb)), BitRate / (8 · PictureRate) ] (2)
Tp = MAX[ R / (Np + Nb · Kp · Xb / (Kb · Xp)), BitRate / (8 · PictureRate) ] (3)
Tb = MAX[ R / (Nb + Np · Kb · Xp / (Kp · Xb)), BitRate / (8 · PictureRate) ] (4)

[0010]
Note that each time the encoding of one frame is completed, the generated code amount of that frame is subtracted from the code amount R, and at the head of a GOP (the I picture) the value of equation (1) is added to R.
Next, the quantization scale of each macroblock is determined based on the target allocated code amount determined by equations (2), (3), and (4) above and the generated code amount of each macroblock detected at the buffer 76.
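As a rough illustration of equations (1) to (4), the following Python sketch computes the GOP budget and the per-picture targets, using the target formulas in the form published for Test Model 5. The function names, the defaults Kp = 1.0 and Kb = 1.4 (the values commonly quoted for Test Model 5), and the sample numbers are assumptions for illustration, not values stated in this text.

```python
def gop_budget(bit_rate, picture_rate, n_frames):
    """Equation (1): code amount R allocated to one GOP."""
    return bit_rate / picture_rate * n_frames


def tm5_targets(r, xi, xp, xb, n_p, n_b, bit_rate, picture_rate,
                kp=1.0, kb=1.4):
    """Target code amounts Ti, Tp, Tb for the next I, P or B picture."""
    floor = bit_rate / (8.0 * picture_rate)  # lower bound inside MAX[A, B]
    ti = max(r / (1.0 + n_p * xp / (xi * kp) + n_b * xb / (xi * kb)), floor)
    tp = max(r / (n_p + n_b * kp * xb / (kb * xp)), floor)
    tb = max(r / (n_b + n_p * kb * xp / (kp * xb)), floor)
    return ti, tp, tb


# Example: 6 Mbit/s, 30 frames/s, 15-frame GOP, 4 P and 10 B pictures left.
R = gop_budget(6_000_000, 30, 15)
Ti, Tp, Tb = tm5_targets(R, xi=160.0, xp=60.0, xb=42.0, n_p=4, n_b=10,
                         bit_rate=6_000_000, picture_rate=30)
# After a picture is encoded, its generated code amount is subtracted from R;
# here the I picture is assumed to spend exactly its target.
R_after_i = R - Ti
```

In an encoder loop, Np or Nb would also be decremented after each P or B picture, and the value of equation (1) added back to R at the start of the next GOP.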
[0011]
On the other hand, there is a variable bit rate control method as a method suitable for an application capable of a variable transfer rate such as DVD-Video. Japanese Patent Laid-Open No. 6-141298 discloses an encoding apparatus based on variable bit rate control.
In this apparatus, first, provisional encoding is performed on an input moving image using a fixed quantization scale, and a generated code amount is counted every unit time. Next, the target transfer rate of each part is set based on the generated code amount at the time of temporary encoding so that the generated code amount of the entire input moving image becomes a required value.
Then, the second encoding, that is, the actual encoding is performed on the input moving image while performing control so as to match the target transfer rate.
[0012]
However, in the above conventional example, encoding must be performed at least twice to obtain the output bitstream, so a two-pass variable bit rate control such as that of this apparatus cannot be used in applications that require real-time operation.
[0013]
On the other hand, there is a variable bit rate control method for encoding a moving image in almost real time, that is, a one-pass variable bit rate control method.
Japanese Patent Application Laid-Open No. 10-164577 discloses, in FIG. 6 and elsewhere in that publication, an encoding apparatus using a one-pass variable bit rate control method.
FIG. 8 shows an example of the configuration of the moving image encoding apparatus in this conventional example. Components identical to those in FIG. 7 are given the same reference numerals, and their description is omitted.
[0014]
In this conventional apparatus, the code amount stored in the buffer 76 is supplied to the generated code amount detector 83, and the quantization scale from the quantizer 73 is supplied to the average quantization scale detector 82. The screen complexity detector 84 obtains, as the "screen complexity", the product of the generated code amount from the generated code amount detector 83 and the average quantization scale within the picture from the average quantization scale detector 82. The code amount controller 74 realizes variable bit rate control by determining the target generated code amount or the target quantization scale of the picture based on the ratio of the current screen complexity to the average of past screen complexities.
[0015]
[Problems to be solved by the invention]
In the above conventional example, encoding control is performed on the assumption that the screen complexity of an image to be encoded is about the same as the screen complexity of the same picture type encoded immediately before.
However, when a large change such as a scene change occurs in the input moving image, the screen complexity itself changes because the nature of the image differs before and after the change point. In addition, in P and B pictures, which use predictive coding, prediction between the images before and after the change almost always fails, so in the image immediately after the change most macroblocks are often intra-coded.
[0016]
In such an image, although the actual screen complexity is extremely high compared to the image before the change, in the above conventional example, code amount allocation is performed based on the screen complexity of the image before the change. Therefore, there is a problem that the allocated code amount is insufficient, the quantization scale is increased, and as a result, the image quality is deteriorated immediately after the change point.
[0017]
On the other hand, for the image that is next predicted from the image immediately after the change, prediction succeeds relatively easily unless the image keeps changing in rapid succession, so the actual screen complexity is lower than that of the image immediately after the change. Nevertheless, in the above conventional example the code amount is allocated based on the high screen complexity immediately after the change, so the allocated code amount becomes excessive, the quantization scale is lowered unnecessarily, and code amount is wasted.
The present invention solves the above problems. Its object is to realize a fixed bit rate control method and a one-pass variable bit rate control method that can allocate code amount more appropriately even when a large change such as a scene change occurs in the input moving image.
[0018]
[Means for Solving the Problems]
  In order to solve the above-mentioned problems, the present invention comprises means described in 1) to 8) below.
That is,
  1) A moving image encoding apparatus that encodes an input moving image and has motion compensation prediction means, orthogonal transform means, quantization means, and variable length encoding means, the apparatus comprising:
  means for detecting a generated code amount of each image of the input moving image;
  means for detecting an average quantization scale of each image of the input moving image;
  means for obtaining a coded image characteristic of the input moving image by detecting an activity from the variance of luminance values, or from inter-pixel difference values, of at least the input moving image, out of the input moving image and the motion-compensated prediction image generated by the motion compensation prediction means;
  means for determining that the ratio between the coded image characteristic obtained by the means for obtaining the coded image characteristic and the coded image characteristic of the input moving image encoded immediately before the input moving image for which that characteristic was obtained exceeds a predetermined threshold, and for detecting scene change image information based on the result of this determination;
  means for calculating a measured screen complexity of a past image from the product of the generated code amount detected by the means for detecting the generated code amount and the average quantization scale detected by the means for detecting the average quantization scale, and for calculating an estimated screen complexity of the current image from the product of the calculated measured screen complexity of the past image and the ratio between the coded image characteristic detected by the means for detecting the coded image characteristic and the image characteristic of the scene change image information detected by the means for detecting the scene change image information; and
  means for determining an allocated code amount of the image to be encoded next by selecting the calculated estimated screen complexity of the current image for a scene change image of a P or B picture detected from the scene change image information and for the image following the scene change image, and selecting the calculated measured screen complexity of the past image in all other cases, and for determining a quantization scale of the image to be encoded next from the allocated code amount.
  2) In the video encoding apparatus described in 1),
  the means for calculating the estimated screen complexity of the current image
  calculates the estimated screen complexity by multiplying the measured screen complexity of the immediately preceding image of the same picture type (I picture, P picture, B picture) by a function having, as a proportionality constant, the ratio between the obtained encoded image characteristic of the current image and the encoded image characteristic obtained for that immediately preceding image.
  3) In the video encoding device described in 1),
  the screen complexity used in the means for determining the allocated code amount of the image to be encoded next is
  the estimated screen complexity of the current image, at least when the current image is determined from the detected scene change image information to be a scene change image or the image next to a scene change image.
  4) In the video encoding device described in any one of 1) to 3),
  the means for determining the allocated code amount of the image to be encoded next from the generated code amount and the measured screen complexity or the estimated screen complexity determines the allocated code amount by multiplying the average allocated code amount by the ratio of the estimated screen complexity to the average value of the measured screen complexity over a certain period.
  5) A moving picture encoding method for encoding an input moving image, having a motion compensation prediction step, an orthogonal transform step, a quantization step, and a variable length encoding step, the method comprising:
  a step of detecting a generated code amount of each image of the input moving image;
  a step of detecting an average quantization scale of each image of the input moving image;
  a step of obtaining an encoded image characteristic of the input moving image by detecting, from among the input moving image and the motion compensated prediction image generated in the motion compensation prediction step, the activity of at least the input moving image from the variance of luminance values or from inter-pixel difference values;
  a step of determining whether the ratio between the encoded image characteristic obtained in the step of obtaining the encoded image characteristic and the encoded image characteristic of the input moving image encoded immediately before the input moving image for which that characteristic was obtained exceeds a predetermined threshold value, and detecting scene change image information based on the determination result;
  a step of calculating the measured screen complexity of a past image from the product of the generated code amount detected in the step of detecting the generated code amount and the average quantization scale detected in the step of detecting the average quantization scale;
  a step of calculating the estimated screen complexity of the current image from the product of the calculated measured screen complexity of the past image and the ratio between the encoded image characteristic obtained in the step of obtaining the encoded image characteristic and the encoded image characteristic of the image indicated by the scene change image information detected in the step of detecting the scene change image information; and
  a step of determining an allocated code amount of the image to be encoded next by selecting the calculated estimated screen complexity of the current image for a scene change image of a P or B picture detected from the scene change image information and for the image next to the scene change image, and selecting the calculated measured screen complexity of the past image in cases other than the scene change image and the next image, and determining a quantization scale of the image to be encoded next from the allocated code amount.
  6) In the video encoding method described in 5),
  the step of calculating the estimated screen complexity of the current image
  calculates the estimated screen complexity by multiplying the measured screen complexity of the immediately preceding image of the same picture type (I picture, P picture, B picture) by a function having, as a proportionality constant, the ratio between the obtained encoded image characteristic of the current image and the encoded image characteristic obtained for that immediately preceding image.
  7) In the video encoding method described in 5),
  the screen complexity used in the step of determining the allocated code amount of the image to be encoded next is
  the estimated screen complexity of the current image, at least when the current image is determined from the detected scene change image information to be a scene change image or the image next to a scene change image.
  8) In the video encoding method described in any one of 5) to 7),
  the step of determining the allocated code amount of the image to be encoded next from the generated code amount and the measured screen complexity or the estimated screen complexity determines the allocated code amount by multiplying the average allocated code amount by the ratio of the estimated screen complexity to the average value of the measured screen complexity over a certain period.
[0019]
Accordingly, in the present invention, a moving picture encoding apparatus provided with motion compensation prediction, orthogonal transform, quantization, and variable length coding means, as in MPEG2, detects the generated code amount and the average quantization scale of each image. In order to distribute the code amount allocated to a predetermined section among the images in that section, it obtains the measured screen complexity of each already-encoded image by applying a predetermined operation to the product of that image's generated code amount and average quantization scale.
At the same time, the encoded image characteristic (activity) of each image is detected, the ratio of the encoded image characteristic with the immediately preceding image of the same picture type is calculated, and scene change detection is performed from this encoded image characteristic.
[0020]
As the encoded image characteristic of each image, in addition to the activity of the input image, the following may be used for P and B pictures: the activity of the error image in motion compensation prediction (or of the difference image between the image being encoded and the reference image in motion vector detection) and the activity of the motion vectors. One of these is used, or several are used in combination.
When the value of each element of the encoded image characteristic exceeds a predetermined range, the image is determined as a scene change image.
[0021]
For an image determined to be a scene change image, in each picture type, the estimated screen complexity is calculated from the measured screen complexity of the immediately preceding image and a predetermined function whose factor is the ratio of the encoded image characteristics described above. Since this estimated screen complexity is larger than the measured screen complexity, a larger code amount can be allocated to the scene change image.
[0022]
Conversely, when the immediately preceding image is a scene change image, its measured screen complexity is not used. Instead, the screen complexity of the image to be encoded is estimated from the measured screen complexity that was used when allocating the code amount of the preceding image, that is, the screen complexity of the image before the scene change, and a predetermined function whose factor is the ratio of the encoded image characteristics of the image before the scene change and the image to be encoded. This prevents an unnecessarily large code amount from being allocated to the image following the scene change image.
[0023]
In one-pass variable bit rate control, the estimated screen complexity of the image to be encoded is normally calculated from the ratio of the encoded image characteristics and the measured screen complexity of the immediately preceding image. For an image determined to be a scene change image in step S2, the estimated screen complexity is instead calculated from the predetermined function whose factor is the ratio of the encoded image characteristics and the measured screen complexity of the immediately preceding image, so that it is increased relative to the normal case.
[0024]
Conversely, if the immediately preceding image was a scene change image, the screen complexity of the image to be encoded is estimated using, instead of the measured screen complexity of the scene change image, either the measured screen complexity of the image before the scene change (calculated when allocating the code amount of the preceding image) or the estimated screen complexity of the scene change image before the increase described above. This optimizes the screen complexity value for the image following the scene change image.
[0025]
Furthermore, in one-pass variable bit rate control, the ratio of the estimated screen complexity to the average screen complexity of each picture type within a certain period is reflected in the code amount that the target bit rate allocates to a predetermined section, so that a code amount matching the change in the image can be allocated to that section.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
(First embodiment)
A first embodiment of the moving picture coding apparatus and method according to the present invention will be described below with reference to the drawings.
The first embodiment of the moving picture encoding apparatus of the present invention shown in FIG. 1 comprises a subtractor 11, a DCT unit 12, a quantizer 13, a code amount controller 14, a variable length encoder 15, a buffer 16, an inverse quantizer 17, an IDCT unit 18, a motion compensation predictor 19, an adder 20, a frame memory 21, an average quantization scale detector 22, a generated code amount detector 23, a screen complexity calculator 24, a screen complexity memory 24M, an image characteristic detector 25, and a scene change detector 28.
[0027]
The first embodiment is a case where the present invention is applied to constant bit rate coding.
In the following description, i corresponds to an I picture, p corresponds to a P picture, and b corresponds to a B picture.
It is assumed that the original moving image has been divided into macroblock units in advance by an image block divider (not shown).
For an I picture, the divided original moving image is not subjected to motion compensated prediction; the original moving image block itself is sent to the DCT unit 12 via the subtractor 11 and, after DCT, is quantized by the quantizer 13 using the quantization scale sent from the code amount controller 14.
[0028]
The quantized signal is converted into a code by the variable length encoder 15, adjusted by the next buffer 16, and then the code is output.
Meanwhile, the output coefficients of the quantizer 13 are locally decoded by the inverse quantizer 17 and the IDCT unit 18 and, without the output of the motion compensation predictor 19 being added in the adder 20, are stored in the frame memory 21 block by block.
[0029]
For P and B pictures, the divided original moving image and a predetermined locally decoded image block stored in the frame memory 21 are supplied to the motion compensation predictor 19, where motion vector detection and motion compensation are performed. The subtractor 11 takes the inter-pixel difference between the predicted image block and the original image block, and the resulting error image block is supplied to the DCT unit 12.
[0030]
After that, as with the I picture, the difference values are transformed by the DCT unit 12, quantized by the quantizer 13 using the quantization scale sent from the code amount controller 14, converted into a code by the variable length encoder 15, adjusted in the following buffer 16, and output as a code.
Meanwhile, the output coefficients of the quantizer 13 are locally decoded by the inverse quantizer 17 and the IDCT unit 18, the predicted image block is then added pixel by pixel in the adder 20, and the result is stored in the frame memory 21 block by block.
[0031]
For each picture, the quantization scale of each macroblock is sent from the quantizer 13 to the average quantization scale detector 22, where the quantization scales of one frame are summed and the average quantization scale of the frame is calculated.
On the other hand, the generated code amount is monitored in the buffer 16, and the value is supplied to the generated code amount detector 23.
In the generated code amount detector 23, the generated code amount is added in units of frames, and the generated code amount of one frame is detected.
[0032]
The average quantization scale and generated code amount detected for each frame are supplied to the screen complexity calculator 24 for each frame.
The screen complexity calculator 24 multiplies the supplied average quantization scale by the generated code amount and then performs a predetermined operation on the product to obtain the measured screen complexity Xi-p, Xp-p, Xb-p of each frame, which corresponds to Complexity in MPEG2 Test Model 5.
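The calculation in paragraphs [0031] and [0032] can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the "predetermined operation" is the identity (as in MPEG2 Test Model 5, where complexity is the plain product), and the function and parameter names are hypothetical.

```python
def average_quantization_scale(mb_qscales):
    """Average the per-macroblock quantization scales of one frame (role of detector 22)."""
    if not mb_qscales:
        raise ValueError("frame has no macroblocks")
    return sum(mb_qscales) / len(mb_qscales)


def measured_complexity(generated_bits, mb_qscales):
    """Measured screen complexity: generated code amount times average quantization scale.

    Assumption: no further scaling is applied to the product.
    """
    return generated_bits * average_quantization_scale(mb_qscales)


# Hypothetical frame: 400,000 bits generated, four macroblocks.
x_measured = measured_complexity(400_000, [8, 10, 12, 10])
```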
[0033]
Meanwhile, the image characteristic detector 25 is supplied with the divided original moving image as input. For each frame of the original moving image, it detects the activity, a parameter indicating the image characteristic, for each macroblock and accumulates it over the frame. The result is supplied frame by frame to the screen complexity calculator 24 and the scene change detector 28.
[0034]
Here, the image characteristic detector 25 detects the image characteristic prior to the actual encoding operation.
Possible parameters indicating the image characteristic include the variance of luminance values and inter-pixel difference values, but any other parameter may be used as long as it indicates the image characteristic.
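The two activity measures named above can be sketched as follows. This is an illustrative sketch only: macroblocks are represented as lists of pixel rows, the inter-pixel difference is assumed to be taken between horizontally adjacent pixels (the text does not specify the direction), and all names are hypothetical.

```python
def block_variance(block):
    """Variance of the luminance values in one macroblock (list of pixel rows)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def block_pixel_difference(block):
    """Sum of absolute differences between horizontally adjacent pixels (assumed direction)."""
    return sum(abs(row[i + 1] - row[i]) for row in block for i in range(len(row) - 1))


def frame_activity(macroblocks, measure=block_variance):
    """Accumulate the per-macroblock activity over one frame."""
    return sum(measure(mb) for mb in macroblocks)


flat = [[5, 5], [5, 5]]  # uniform block: no activity
edge = [[0, 2], [0, 2]]  # block with a vertical edge
act = frame_activity([flat, edge])
```

Either measure can be passed as `measure`; which one suits a given encoder is a design choice the text leaves open.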
[0035]
The scene change detector 28 calculates the ratios (ACTi / ACTi-p), (ACTp / ACTp-p), and (ACTb / ACTb-p) between the activities ACTi, ACTp, ACTb of the current image to be encoded and the activities ACTi-p, ACTp-p, ACTb-p of the image of the same picture type encoded immediately before.
[0036]
When this calculated ratio (current activity / previous activity) falls outside a certain range, for example when
(Ratio) < Amin or (Ratio) > Amax (0 < Amin < 1, Amax > 1),
the scene change detector 28 determines that a scene change has occurred between these two images and sends the position information of this frame to the screen complexity calculator 24.
When it is determined that a scene change has occurred, the current image to be encoded is hereinafter referred to as a scene change image.
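The threshold test of paragraph [0036] can be sketched as below. The default thresholds are the example values Amin = 0.67 and Amax = 1.5 given later in the description; the function name and the handling of a non-positive previous activity are assumptions made for illustration.

```python
def is_scene_change(act_cur, act_prev, a_min=0.67, a_max=1.5):
    """Return True when the activity ratio leaves the range [a_min, a_max].

    act_cur  : activity of the current image to be encoded
    act_prev : activity of the previous image of the same picture type
    """
    if act_prev <= 0:
        # Assumption: the ratio is undefined here, so the caller must handle it.
        raise ValueError("previous activity must be positive")
    ratio = act_cur / act_prev
    return ratio < a_min or ratio > a_max
```

For instance, an activity that doubles (ratio 2.0) or halves (ratio 0.5) is flagged, while a 10% increase is not.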
[0037]
When frame position information indicating that a scene change has occurred is sent to the screen complexity calculator 24, the calculator estimates (increases) the screen complexity for P and B pictures in order to compensate for the increase in generated code amount in the scene change image.
[0038]
Here, the measured screen complexity obtained previously is stored in the screen complexity memory 24M to be used for estimating the screen complexity of the next same picture type.
Estimated screen complexity Xp and Xb in the scene change image are as shown in the following equations (5) to (8).
Note that f1 to f4 are predetermined functions whose factors are activity ratios (ACTp / ACTp-p) and (ACTb / ACTb-p).
As an example of the predetermined function, f1 to f4 = (ACTk / ACTk-p) · Ck1 + Ck2 (where k = p or b) is appropriate, but the function is not limited to this.
[0039]
(P picture)
When (ACTp / ACTp-p)> Amax
Estimated screen complexity Xp ＝ Xp-p ・ f1 (5)
When (ACTp / ACTp-p) <Amin
Estimated screen complexity Xp ＝ Xp-p ・ f2 (6)
(B picture)
When (ACTb / ACTb-p)> Amax
Estimated screen complexity Xb ＝ Xb-p ・ f3 (7)
When (ACTb / ACTb-p) <Amin
Estimated screen complexity Xb ＝ Xb-p ・ f4 (8)
[0040]
For example, set Amax = 1.5, Amin = 0.67, f1 = (ACTp / ACTp-p) · 2, f2 = 1, f3 = (ACTb / ACTb-p) · 2, and f4 = 1. Then, when the activity increases, the measured screen complexity multiplied by twice the activity ratio is given as the estimated screen complexity, and when the activity decreases, the same screen complexity as before the scene change is given as the estimated screen complexity. This compensates for the increase in generated code amount caused by the increase in intra macroblocks in P and B pictures at a scene change.
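Equations (5) to (8) with the example settings of paragraph [0040] can be sketched as follows. The function name and the `gain` parameter are hypothetical; returning the measured complexity unchanged when no scene change is detected reflects the first embodiment's use of the measured value for ordinary pictures.

```python
def estimate_scene_change_complexity(x_prev, act_cur, act_prev,
                                     a_min=0.67, a_max=1.5, gain=2.0):
    """Estimated screen complexity of a P or B scene change image.

    x_prev   : measured screen complexity of the previous image of the same type
    act_cur  : activity of the current image
    act_prev : activity of that previous image
    """
    ratio = act_cur / act_prev
    if ratio > a_max:
        # Equations (5)/(7) with f1 = f3 = ratio * 2
        return x_prev * ratio * gain
    if ratio < a_min:
        # Equations (6)/(8) with f2 = f4 = 1
        return x_prev * 1.0
    # Not a scene change: the measured complexity is used as-is.
    return x_prev


# Activity triples at the scene change, so the estimate is six times the measured value.
x_est = estimate_scene_change_complexity(1000.0, 300.0, 100.0)
```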
[0041]
Also, for the next P or B picture of the same picture type following the scene change image, calculating the screen complexity from the screen complexity Xp-p or Xb-p obtained from the generated code amount of the scene change image would give a value larger than necessary.
[0042]
Therefore, instead of Xp-p and Xb-p, the screen complexity is estimated by equations (9) and (10) from the measured screen complexity Xp-pold, Xb-pold used for the scene change image (stored in the screen complexity memory 24M), the activities ACTp-pold, ACTb-pold of the image to which that measured screen complexity belongs, and the activities ACTp, ACTb of the current image to be encoded. This optimizes the screen complexity.
[0043]
(P picture)
Xp = Xp-pold ・ (ACTp / ACTp-pold) (9)
(B picture)
Xb = Xb-pold ・ (ACTb / ACTb-pold) (10)
In the screen complexity memory 24M, the activity of the image of the last few frames in each picture type is stored in addition to the actual screen complexity of the scene change image.
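Equations (9) and (10) can be sketched as follows. The `StoredPicture` record is a hypothetical stand-in for one entry of the screen complexity memory 24M, holding the pre-scene-change measured complexity together with the activity of the same picture; the names are not from the source.

```python
from dataclasses import dataclass


@dataclass
class StoredPicture:
    """One hypothetical entry of the screen complexity memory."""
    complexity: float  # measured screen complexity (Xp-pold or Xb-pold)
    activity: float    # activity of the same picture (ACTp-pold or ACTb-pold)


def complexity_after_scene_change(stored, act_cur):
    """Equations (9)/(10): X = X-pold * (ACT / ACT-pold)."""
    return stored.complexity * (act_cur / stored.activity)


# Picture before the scene change: complexity 1000, activity 100.
before = StoredPicture(complexity=1000.0, activity=100.0)
# Picture following the scene change image has activity 250.
x_next = complexity_after_scene_change(before, 250.0)
```

The estimate scales the pre-scene-change complexity by the activity ratio, so it does not inherit the inflated complexity measured on the scene change image itself.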
[0044]
In the first embodiment, the estimated screen complexity is used for scene change images of P and B pictures and for the image following a scene change image, and the measured screen complexity is used for all other P and B pictures and for I pictures. Substituting these values for Xi, Xp, and Xb in equations (2) to (4) of MPEG2 Test Model 5 and determining the target allocated code amounts Ti, Tp, Tb of the image to be encoded makes it possible to allocate a code amount that responds to the scene change. Based on the target allocated code amount and the generated code amount of each macroblock detected in the buffer 16, the quantization scale of each macroblock is determined using the method of MPEG2 Test Model 5.
[0045]
Note that the activity of each macroblock is also sent from the image characteristic detector 25 to the code amount controller 14 and used for the adaptive quantization control of MPEG2 Test Model 5, which changes the quantization scale of each macroblock based on its activity. This adaptive quantization control may, however, be omitted.
[0046]
The quantization scale of each macroblock output from the code amount controller 14 is sent to the quantizer 13, where the current image (the divided original image or the motion compensated prediction error image, after DCT) is quantized with that quantization scale. The result is variable-length encoded by the variable length encoder 15, adjusted in the following buffer 16, and output as a code. The quantization scale of each macroblock from the quantizer 13 and the generated code amount monitored in the buffer 16 are sent to the average quantization scale detector 22 and the generated code amount detector 23, respectively, and are used for the code amount control of the next picture.
[0047]
In the present embodiment, the present invention is applied to the MPEG2 Test Model 5, but the present invention is not limited to this and can be applied to other rate control.
For example, suppose the allocated code amount Sk of the image to be encoded next is determined from the generated code amount of a past picture of the same picture type whose activity is ACTk-p (k = i, p, b). If the image to be encoded, whose activity is ACTk, is a scene change image, the corrected allocated code amount Sk' is given by the following equation (11).
Sk' = Sk · fk(ACTk / ACTk-p) (11)
where k = i, p, b, and fk is a function whose factor is (ACTk / ACTk-p).
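Equation (11) can be sketched as below. The text leaves fk open, so the identity used as the default here is purely an assumption for illustration, as are the function and parameter names.

```python
def corrected_allocation(s_k, act_cur, act_prev, f_k=lambda ratio: ratio):
    """Equation (11): Sk' = Sk * fk(ACTk / ACTk-p).

    s_k : allocated code amount before correction
    f_k : function of the activity ratio (assumed identity by default)
    """
    return s_k * f_k(act_cur / act_prev)


# Activity rises by 50% at the scene change, so the allocation rises by 50%.
s_corrected = corrected_allocation(100_000, 300.0, 200.0)
```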
[0048]
(Second embodiment)
Next, a second embodiment of the moving picture coding apparatus of the present invention will be described below with reference to FIG.
The second embodiment applies the present invention to one-pass variable bit rate encoding. Compared with the first embodiment, an average screen complexity calculator 29 is added, and only the configuration and operation of the screen complexity calculator 24 and the code amount controller 14 differ, so the description of the other parts is omitted.
[0049]
As in the first embodiment, the screen complexity calculator 24 is supplied with the average quantization scale and generated code amount of each frame, the frame position information indicating that a scene change has occurred, and the image characteristic of each frame, that is, the activity.
The screen complexity calculator 24 multiplies the supplied average quantization scale by the generated code amount of each frame and then performs a predetermined operation on the product to obtain the measured screen complexity of each frame.
[0050]
The measured screen complexity is sent to the average screen complexity calculator 29, where the values within a certain period are summed for each encoded picture type and then divided by the number of frames of the same picture type within that period, giving the average screen complexity Xi-ave (I picture), Xp-ave (P picture), and Xb-ave (B picture) of each picture type.
[0051]
The certain period here may be a predetermined fixed number of frames, such as 15 frames or 300 frames, counted back from the image that has just been encoded, or it may grow frame by frame, running from the first encoded frame to the image that has just been encoded.
Even in the former case of a fixed number of frames, while the number of encoded frames has not yet reached the predetermined period, the number of frames grows sequentially as in the latter case.
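The averaging over a fixed or growing period can be sketched as follows. This is a minimal illustration with hypothetical names: the window counts frames of all picture types together, and the average for a type is taken over the frames of that type inside the window. Returning `None` when a type has no frame in the window is an assumption.

```python
from collections import deque


class AverageComplexity:
    """Per-picture-type average of measured screen complexity over a period."""

    def __init__(self, window=None):
        # window=None: the period grows from the first encoded frame.
        # window=N:    only the last N encoded frames are kept.
        self._history = deque(maxlen=window)

    def add(self, picture_type, measured):
        """Record the measured complexity of a frame just encoded ('I', 'P' or 'B')."""
        self._history.append((picture_type, measured))

    def average(self, picture_type):
        """Average complexity of the given type within the period, or None if absent."""
        values = [x for t, x in self._history if t == picture_type]
        if not values:
            return None
        return sum(values) / len(values)


avg = AverageComplexity(window=3)
for ptype, x in [("I", 900.0), ("P", 300.0), ("B", 100.0), ("P", 500.0)]:
    avg.add(ptype, x)
```

With a window of three frames, the I picture has already dropped out after the fourth frame, so only the two P pictures and the B picture contribute. Until the window fills, the same class behaves like the growing-period case.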
[0052]
Next, the screen complexity calculator 24 estimates the screen complexity from the actually measured screen complexity obtained, the average screen complexity obtained by the average screen complexity calculator 29, and the activity.
The screen complexities Xi, Xp, Xb of the current image to be encoded are estimated by the following equations (12), (13), and (14) from the activities ACTi, ACTp, ACTb of the current image, the screen complexities Xi-p, Xp-p, Xb-p of the image of the same picture type encoded immediately before, and the activities ACTi-p, ACTp-p, ACTb-p of that image.
[0053]
(I picture)
Screen complexity of the current image Xi = Xi-p · (ACTi / ACTi-p) (12)
(P picture)
Screen complexity of the current image Xp = Xp-p · (ACTp / ACTp-p) (13)
(B picture)
Screen complexity of the current image Xb = Xb-p · (ACTb / ACTb-p) (14)
[0054]
In the initial state, when no frame of the same picture type has yet been encoded, the screen complexity and activity of each picture type may be obtained in advance for several images, statistically averaged according to their frequency of occurrence in average moving images, and used as initial values.
[0055]
On the other hand, when the current image to be encoded is a scene change image, the screen complexity calculator 24 recalculates the estimated screen complexity for P and B pictures in order to allocate a larger amount of code.
Here, the measured screen complexity of the image of the same picture type encoded immediately before is stored in the screen complexity memory 24M for use in estimating the screen complexity of the next same picture type.
[0056]
Estimated screen complexity Xp, Xb in the scene change image is obtained in the same manner as the equations (5) to (8) in the first embodiment.
Note that f1 to f4 are predetermined functions whose factors are the activity ratios (ACTp / ACTp-p) and (ACTb / ACTb-p), and they may differ from those in the first embodiment.
[0057]
Further, for the next P or B picture of the same picture type after the scene change image, the screen complexity is estimated, as in equations (9) and (10) of the first embodiment, from the measured screen complexity Xp-pold, Xb-pold stored in the screen complexity memory 24M, the activities ACTp-pold, ACTb-pold of the image to which that measured screen complexity belongs, and the activities ACTp, ACTb of the current image to be encoded. This optimizes the screen complexity.
[0058]
In the second embodiment, the estimated screen complexity is calculated regardless of whether a scene change occurs. Therefore, the value stored in the screen complexity memory 24M for a scene change image need not be the measured screen complexity: it may instead be the estimated screen complexity Xp-old, Xb-old before recalculation, obtained by equations (13) and (14) above, in which case the screen complexity of the next image of the same picture type after the scene change image is calculated by the following equations (15) and (16).
In this case, since it is not necessary to switch the image for storing activities in the screen complexity memory 24M depending on the presence or absence of a scene change, the apparatus can be simplified.
[0059]
(P picture)
Screen complexity of the current image Xp = Xp-old · (ACTp / ACTp-p) (15)
(B picture)
Screen complexity of the current image Xb = Xb-old · (ACTb / ACTb-p) (16)
[0060]
The estimated screen complexities Xi, Xp, Xb of the current image to be encoded and the average screen complexities Xi-ave, Xp-ave, Xb-ave of each picture type calculated in this way are sent to the code amount controller 14.
[0061]
The code amount controller 14 sets (determines) an assigned code amount of an image to be encoded next (from now on) and sets (determines) a quantization scale for variable bit rate control.
Let the target average bit rate be BitRate, the number of frames per second be PictureRate, and the number of frames in 1 GOP, one coding unit (usually the interval between I pictures), be N. The average allocated code amount Rave for 1 GOP is then given by the following equation (17).
[0062]
1GOP average allocated code amount Rave = (BitRate / PictureRate) · N (17)
Let Rave in the above equation be the code amount required for 1 GOP at the average screen complexity. Assuming that the images of the 1 GOP that includes the current image to be encoded are all equal to the estimated screen complexity of the current image obtained by the screen complexity calculator 24, the required allocated code amount Rc for 1 GOP needed to keep the image quality constant is given by the following equations (18), (19), and (20).
[0063]
(I picture)
Necessary allocated code amount Rc = Rave · (Xi / Xi-ave) (18)
(P picture)
Necessary allocated code amount Rc = Rave · (Xp / Xp-ave) (19)
(B picture)
Necessary allocated code amount Rc = Rave · (Xb / Xb-ave) (20)
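Equations (17) to (20) can be sketched as follows. The function names and the numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def average_gop_allocation(bit_rate, picture_rate, n):
    """Equation (17): Rave = (BitRate / PictureRate) * N."""
    return (bit_rate / picture_rate) * n


def required_gop_allocation(r_ave, x_est, x_ave):
    """Equations (18) to (20): Rc = Rave * (X / X-ave) for the current picture type."""
    return r_ave * (x_est / x_ave)


# Hypothetical stream: 6 Mbit/s, 30 frames/s, 15 frames per GOP.
r_ave = average_gop_allocation(6_000_000, 30, 15)
# The current P picture is estimated to be 1.5 times as complex as the recent average.
r_c = required_gop_allocation(r_ave, x_est=3000.0, x_ave=2000.0)
```

A picture more complex than the recent average thus raises the code amount budgeted for its GOP, and a simpler one lowers it, which is what lets the bit rate vary with the content.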
[0064]
By appropriately allocating the necessary allocation code amount Rc of the above equation to each picture of 1 GOP, the target code amount of the current image to be encoded is calculated, and the quantization scale of each macroblock is determined.
For example, when the method of MPEG2 Test Model 5 is used, the estimated screen complexities Xi, Xp, Xb of the current image obtained above are substituted for Xi, Xp, Xb in equations (2) to (4), and the required allocated code amount Rc is substituted for R in equations (2) to (4), to determine the target allocated code amounts Ti, Tp, and Tb of the image to be encoded.
[0065]
However, in the second embodiment it is not necessary to keep the code amount of each GOP constant, so Rc, used as R in equations (2) to (4), does not need to be reduced by the generated code amount of each frame or increased at the head of each GOP. In addition, Np and Nb in equations (2) to (4) are always constant values (the values at the GOP head).
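The substitution described in paragraphs [0064] and [0065] can be sketched as below. The sketch assumes that equations (2) to (4), which are not reproduced in this section, have the standard MPEG2 Test Model 5 form with the constants Kp = 1.0 and Kb = 1.4 and the lower bound BitRate / (8 · PictureRate); the function name and numeric values are hypothetical.

```python
K_P = 1.0  # Test Model 5 constant for P pictures
K_B = 1.4  # Test Model 5 constant for B pictures


def tm5_targets(r, n_p, n_b, x_i, x_p, x_b, bit_rate, picture_rate):
    """Target code amounts (Ti, Tp, Tb) in the assumed Test Model 5 form.

    r        : code amount for the GOP (Rc in the second embodiment)
    n_p, n_b : numbers of P and B pictures, fixed at their GOP-head values
    x_i..x_b : estimated screen complexities of the picture types
    """
    floor = bit_rate / (8 * picture_rate)  # lower bound on any target
    t_i = max(r / (1 + n_p * x_p / (x_i * K_P) + n_b * x_b / (x_i * K_B)), floor)
    t_p = max(r / (n_p + n_b * K_P * x_b / (K_B * x_p)), floor)
    t_b = max(r / (n_b + n_p * K_B * x_p / (K_P * x_b)), floor)
    return t_i, t_p, t_b


# Hypothetical GOP: Rc = 3,000,000 bits, 4 P pictures, 10 B pictures.
t_i, t_p, t_b = tm5_targets(3_000_000, 4, 10, 160.0, 60.0, 42.0,
                            bit_rate=6_000_000, picture_rate=30)
```

Because `r` is passed in fresh for every picture and `n_p`, `n_b` stay fixed, no running remainder is carried from frame to frame, matching the variable bit rate behaviour described above.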
[0066]
After that, as in the first embodiment, the quantization scale of each macroblock is determined based on the target allocated code amount and the generated code amount of each macroblock detected in the buffer 16, and, if necessary, adaptive quantization control is performed to change the quantization scale according to the activity of each macroblock.
In this way, even in the variable bit rate control of the 1-pass method, it is possible to assign the code amount corresponding to the scene change.
[0067]
(Third embodiment)
Next, a third embodiment of the moving picture encoding apparatus of the present invention will be described below with reference to FIG.
The third embodiment also applies the present invention to one-pass variable bit rate encoding. Compared with the second embodiment, it differs only in the configuration and operation of the image characteristic detector 25 and the scene change detector 28 shown in FIG., in the activity signal sent from the image characteristic detector 25 to the scene change detector 28, and in the contents of the scene change information sent from the scene change detector 28 to the screen complexity calculator 24.
FIG. 3 differs from FIG. 2 in that a motion compensation signal is supplied from the motion compensation predictor 19 to the image characteristic detector 25; the description of the other parts is omitted.
[0068]
The image characteristic detector 25 shown in FIG. 4 includes an ACTcur detector 25A, an ACTpred detector 25B, an ACTmv detector 25C, accumulators 25D, 25E, and 25F, and a picture activity calculator 25G.
The scene change detector 28 includes an SCcur detector 28A, an SCpred detector 28B, an SCmv detector 28C, and a scene change determiner 28D.
[0069]
In the embodiment of FIGS. 3 and 4, motion compensation prediction is not performed for an I picture, so the input to the image characteristic detector 25 is, as in the second embodiment, the original moving image divided into macroblocks. The activity (ACTcur), a parameter indicating the image characteristic, is detected for each macroblock, accumulated over the frame, and sent to the screen complexity calculator 24 as the I picture activity ACTi.
The output of the ACTcur detector 25A is accumulated in frame units by the accumulator 25D and then sent to the scene change detector 28, where it is used for scene change determination.
[0070]
On the other hand, in the case of P and B pictures, the input to the image characteristic detector 25 shown in FIG. 4 includes, in addition to the divided original moving image, the error image of motion compensation prediction for each macroblock, or the image to be encoded and the reference image used in motion vector detection, together with the motion vectors used in motion compensation prediction. These are supplied from the motion compensated predictor 19 shown in FIG. 3.
From the divided original moving image, the (actual image) activity ACTcur is detected for each macroblock as in the case of the I picture.
[0071]
In addition, from the error image of motion compensation prediction for each macroblock, or from the difference image between the image to be encoded and the reference image in motion vector detection, the sum of absolute values or the sum of squared errors is taken and detected as the prediction activity ACTpred.
Further, from the motion vectors used in motion compensation prediction, the absolute value of the difference in each component between adjacent macroblocks is taken and detected as the motion vector activity ACTmv.
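The two detections described above can be sketched as follows. This is a minimal illustration, not the detector circuits of FIG. 4: the function names are hypothetical, the choice of which neighbouring macroblock counts as "adjacent" is left to the caller, and summing the per-component absolute differences into a single ACTmv value is an assumption.

```python
from typing import List, Tuple

Block = List[List[int]]          # one macroblock of luminance samples
MotionVector = Tuple[int, int]   # (horizontal, vertical) components


def act_pred(current: Block, reference: Block, squared: bool = False) -> int:
    """Prediction activity: sum of absolute (or squared) pixel differences."""
    total = 0
    for cur_row, ref_row in zip(current, reference):
        for cur, ref in zip(cur_row, ref_row):
            diff = cur - ref
            total += diff * diff if squared else abs(diff)
    return total


def act_mv(mv: MotionVector, neighbour: MotionVector) -> int:
    """Motion vector activity: absolute difference per component.

    Assumption: the two component differences are added into one value.
    """
    return abs(mv[0] - neighbour[0]) + abs(mv[1] - neighbour[1])
```

A 2x2 block is used in the checks only to keep the arithmetic visible; a real macroblock would be 16x16.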
[0072]
Then, the macroblock activity ACTmb is calculated for each macroblock by the following equation (21), accumulated over one frame, and sent to the screen complexity calculator 24 as the P and B picture activities ACTp and ACTb.
The outputs of the ACTcur, ACTpred, and ACTmv detectors 25A, 25B, and 25C are accumulated over one frame by the accumulators 25D, 25E, and 25F, respectively, and then sent to the scene change detector 28 for use in scene change determination.
ACTmb = a · ACTcur + b · ACTpred + c · ACTmv (21)
[0073]
Note that the values of the constants a, b, and c are changed for each picture and for each macroblock prediction mode (intra, unidirectional or bidirectional prediction).
For example, for an intra macroblock, prediction is not performed, as with an I picture, so b = c = 0; and since the generated code amount is considered to be larger than for a predicted block, the value of a is increased.
By detecting the activity according to the prediction mode and the like in this way, the screen complexity can be estimated in a manner that follows the encoding characteristics more closely than in the second embodiment.
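The weighted sum of equation (21) and its accumulation over a frame might look like the sketch below. The numeric weights are purely hypothetical placeholders (the text gives no values), and the table is keyed only by prediction mode, whereas the text says the constants may also change per picture.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical weights (a, b, c) per macroblock prediction mode.
# For intra blocks b = c = 0 and a is larger, as the text describes.
WEIGHTS: Dict[str, Tuple[float, float, float]] = {
    "intra": (2.0, 0.0, 0.0),
    "unidirectional": (1.0, 1.0, 0.5),
    "bidirectional": (1.0, 0.5, 0.5),
}


def macroblock_activity(mode: str, act_cur: float,
                        act_pred: float = 0.0, act_mv: float = 0.0) -> float:
    """Equation (21): ACTmb = a*ACTcur + b*ACTpred + c*ACTmv."""
    a, b, c = WEIGHTS[mode]
    return a * act_cur + b * act_pred + c * act_mv


def picture_activity(
        macroblocks: Iterable[Tuple[str, float, float, float]]) -> float:
    """Accumulate ACTmb over one frame to obtain ACTp or ACTb."""
    return sum(macroblock_activity(*mb) for mb in macroblocks)
```

Keeping the weights in a lookup table makes it straightforward to swap in a different table per picture type if that refinement is wanted.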
[0074]
In addition, the ACTcur, ACTpred, and ACTmv values added in units of frames are sent from the image characteristic detector 25 to the scene change detector 28.
For an I picture only ACTcur is sent, so, as in the second embodiment, the ratio of the ACTcur activity of the current image to be encoded to that of the I picture encoded immediately before is calculated.
When this activity ratio falls outside a certain range, for example when
(Ratio) < Amin or (Ratio) > Amax
where (Ratio) = (activity of the image to be encoded) / (activity of the immediately preceding image),
0 < Amin < 1, Amax > 1,
it is determined that a scene change has occurred between these two images.
[0075]
On the other hand, in the case of P and B pictures, the three values ACTcur, ACTpred, and ACTmv are sent from the image characteristic detector 25, so for each of the three values the ratio of the activity of the current image to be encoded to that of the image of the same picture type encoded immediately before is calculated.
The conditions are:
・ (ACTcur ratio) < A1min or (ACTcur ratio) > A1max
・ (ACTpred ratio) < A2min or (ACTpred ratio) > A2max
・ (ACTmv ratio) < A3min or (ACTmv ratio) > A3max
where (Ratio) = (activity of the image to be encoded) / (activity of the immediately preceding image),
0 < A1min, A2min, A3min < 1, and A1max, A2max, A3max > 1.
When any one of these conditions is satisfied, it is determined that a scene change has occurred between these two images.
[0076]
If it is determined that there is a high possibility of erroneous determination with one determination method, the condition may be limited so that it is determined that a scene change has occurred only when two or all three conditions are satisfied.
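The ratio tests of paragraphs [0074] to [0076] can be sketched as a small voting function. The threshold values and the names `THRESHOLDS`, `out_of_range`, `detect_scene_change`, and `min_votes` are assumptions made for illustration; treating a non-positive previous activity as "no decision" is also an assumption, since the text does not cover that case.

```python
from typing import Dict, List, Tuple

# Hypothetical (min, max) thresholds per activity, with 0 < min < 1 < max.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "ACTcur": (0.5, 2.0),
    "ACTpred": (0.5, 2.0),
    "ACTmv": (0.4, 2.5),
}


def out_of_range(current: float, previous: float,
                 low: float, high: float) -> bool:
    """True when current/previous lies outside [low, high]."""
    if previous <= 0:
        return False  # assumption: ratio undefined, so no decision is made
    ratio = current / previous
    return ratio < low or ratio > high


def detect_scene_change(current: Dict[str, float],
                        previous: Dict[str, float],
                        min_votes: int = 1) -> Tuple[bool, List[str]]:
    """Return the decision and the activities whose ratio test fired.

    For an I picture pass only ACTcur; for P and B pictures pass all three.
    min_votes=1 fires on any single condition; 2 or 3 is the stricter rule.
    """
    triggered = [
        name
        for name, (low, high) in THRESHOLDS.items()
        if name in current and name in previous
        and out_of_range(current[name], previous[name], low, high)
    ]
    return len(triggered) >= min_votes, triggered
```

Returning the list of triggering activities alongside the decision keeps the information a downstream stage would need to know which test fired.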
[0077]
The position information of the scene change image is sent to the screen complexity calculator 24 together with information indicating which of ACTcur, ACTpred, or ACTmv the scene change was determined from.
The screen complexity calculator 24 then changes the functions f1 to f4 in equations (5) to (8) depending on which activity the scene change was determined from.
[0078]
As described above, in the third embodiment, not only the activity ACTcur of the moving image but also the error image and motion vector activities ACTpred and ACTmv in motion compensation prediction are used for scene change determination, so the accuracy of scene change determination can be improved.
In the third embodiment above, the present invention is applied to the one-pass variable bit rate encoding of the second embodiment, but the same method may also be applied to the fixed bit rate encoding of the first embodiment.
[0079]
In the above embodiments, the picture coding structure has been described as having the three types I picture, P picture, and B picture as shown in FIG. 6, but there may be only two types, such as I and P pictures or I and B pictures.
Further, all pictures may be I pictures, for which motion compensation prediction is not performed.
In that all-I-picture case, however, the only input to the image characteristic detector 25 is the divided original moving image, so the third embodiment is exactly the same as the second embodiment.
[0080]
In the above embodiment, scene change determination is performed separately for each of the I, P, and B picture types to determine a scene change image.
However, when M > 2, as shown in FIG. 5, if a scene change occurs between the P or I pictures that sandwich B pictures, prediction may be less accurate than usual for all of the B pictures in between.
[0081]
Therefore, as shown in FIG. 5, if any one of the P or I picture immediately following the B pictures in input order, or of the B pictures sandwiched between that P or I picture and the immediately preceding P or I picture, is determined to be a scene change image, all of the B pictures in between may be determined to be scene change images.
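The propagation rule above amounts to an "any implies all" test over one run of B pictures. The sketch below assumes the per-picture scene change flags have already been computed; the function name and the flag representation are hypothetical.

```python
from typing import List


def propagate_to_b_pictures(b_flags: List[bool],
                            next_anchor_flag: bool) -> List[bool]:
    """Mark every B picture in the run as a scene change image when
    any of them, or the P/I picture that follows them in input order,
    was flagged as a scene change image.

    b_flags          -- flags of the B pictures between two P/I pictures
    next_anchor_flag -- flag of the P or I picture right after the run
    """
    if next_anchor_flag or any(b_flags):
        return [True] * len(b_flags)
    return list(b_flags)
```

For example, with two B pictures of which only the second was flagged, both come back flagged; with no flags at all, the run is returned unchanged.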
[0082]
When all of the consecutive B pictures are determined to be scene change images, the B picture immediately preceding the first scene change image is used as the image immediately before the scene change, and it is used in common for the subsequent B pictures as well.
The image following the scene change images is the B picture that comes next after the last scene change image, and the B picture immediately before the first scene change image is again used as the pre-scene-change image with which it is compared.
[0083]
In this case, however, even when equations (15) and (16) are used to estimate the screen complexity of the B picture following the scene change images, the activity and measured screen complexity of the B picture immediately before the first scene change image must be kept in the screen complexity memory 24M until the estimation for that B picture is completed. The image whose activity is stored in the screen complexity memory 24M must therefore be switched appropriately according to the presence or absence of a scene change and the configuration of the scene change images.
[0084]
In the second and third embodiments, the average screen complexity needed to obtain the required allocation code amount Rc of 1 GOP is obtained separately for each encoded picture type. Alternatively, without distinguishing picture types, the screen complexities of the frames in a certain period may be summed and divided by the number of frames in that period to obtain an average screen complexity X-ave, and the required allocation code amount Rc of 1 GOP may be obtained from X-ave and the estimated screen complexity Xk (k = i, p, or b) by the following equation (22).
[0085]

[0086]
[Effect of the invention]
As described above, according to the present invention, when a moving image is encoded under fixed bit rate control or one-pass variable bit rate control, the generated code amount and the average quantization scale of the images in a fixed section are detected after encoding, and a value obtained by applying a predetermined operation to the product of the generated code amount and the average quantization scale is taken as the measured screen complexity. The presence or absence of a scene change is then determined from the ratio of the activity of the current image to that of the immediately preceding image of the same picture type. For a scene change image, the screen complexity of the image about to be encoded is estimated by multiplying the measured screen complexity by a predetermined function of the activity ratio. For the image following a scene change image, it is estimated by multiplying the measured screen complexity of the scene change image by the ratio of the activity of the current image to that of the image immediately before the scene change. By allocating the code amount at a scene change using this estimated screen complexity, code amount allocation with neither excess nor deficiency becomes possible immediately after an image change point such as a scene change, without causing image degradation and, conversely, without wasting code amount.
[Brief description of the drawings]
FIG. 1 is a diagram showing a block configuration of a first embodiment of a moving picture encoding apparatus and method according to the present invention.
FIG. 2 is a diagram showing a block configuration of a second embodiment of the moving picture coding apparatus and method according to the present invention.
FIG. 3 is a diagram showing a block configuration of a third embodiment of the moving picture coding apparatus and method according to the present invention.
FIG. 4 is a diagram showing a block configuration of an embodiment of an image characteristic detector and a scene change detector according to a third embodiment of the present invention.
FIG. 5 is a diagram for explaining an example of scene change determination for a B picture according to the present invention.
FIG. 6 is a diagram illustrating an example of a coded picture structure.
FIG. 7 is a diagram illustrating a block configuration of a configuration example of a general moving image encoding device.
FIG. 8 is a diagram showing a block configuration of a configuration example of a conventional moving image encoding device.
[Explanation of symbols]
11 Subtractor
12 DCT unit (orthogonal transformer)
13 Quantizer
14 Code amount controller
15 Variable length encoder
16 Buffer
17 Inverse quantizer
18 IDCT device
19 Motion compensated predictor
20 Adder
21 Frame memory
22 Average quantization scale detector
23 Generated code amount detector
24 Screen complexity calculator
24M Screen complexity memory
25 Image characteristic detector
25A ACTcur detector
25B ACTpred detector
25C ACTmv detector
25D, 25E, 25F Accumulators
25G Picture activity calculator
28 Scene change detector
28A SCcur detector
28B SCpred detector
28C SCmv detector
28D Scene change determiner
29 Average screen complexity calculator
ACTcur Original image activity
ACTi, ACTp, ACTb Current image activity
ACTi-p, ACTp-p, ACTb-p Activity of images of the same picture type coded immediately before
ACTmb Macroblock activity
ACTmv Motion vector activity
ACTp-pold, ACTb-pold Activity of the image subject to the actual screen complexity
ACTpred error image activity
R Code amount
Rave Average allocated code amount
Rc Image allocation code amount
Xi, Xp, Xb Screen complexity of the current image
Xi-ave, Xp-ave, Xb-ave Average screen complexity
Xi-p, Xp-p, Xb-p Measured screen complexity of each frame
Xp-pold, Xb-pold Measured screen complexity

Claims

In a moving image encoding apparatus that encodes an input moving image by including motion compensation prediction means, orthogonal transform means, quantization means, and variable length encoding means,
Means for detecting a generated code amount of each image of the input moving image;
Means for detecting an average quantization scale of each image of the input moving image;
Means for obtaining an encoded image characteristic of the input moving image by detecting an activity from the variance of luminance values, or from difference values between pixels, of at least the input moving image among the input moving image and the motion compensated prediction image generated by the motion compensation prediction means;
Means for determining whether the ratio between the encoded image characteristic obtained by the means for obtaining the encoded image characteristic and the encoded image characteristic of the input moving image encoded immediately before the input moving image from which that characteristic was obtained exceeds a predetermined threshold value, and for detecting scene change image information based on the determination result;
Means for calculating the measured screen complexity of a past image from the product of the generated code amount detected by the means for detecting the generated code amount and the average quantization scale detected by the means for detecting the average quantization scale;
Means for calculating the estimated screen complexity of the current image from the product of the calculated measured screen complexity of the past image and the ratio of the encoded image characteristic detected by the means for obtaining the encoded image characteristic to the image characteristic of the scene change image information detected by the means for detecting the scene change image information;
Means for determining the allocated code amount of the image to be encoded next by selecting the calculated estimated screen complexity of the current image for a scene change image of P and B pictures detected from the scene change image information and for the image following the scene change image, and selecting the calculated measured screen complexity of the past image for images other than the scene change image and the following image, and for determining the quantization scale of the image to be encoded next from the allocated code amount;
A moving picture encoding apparatus comprising:

The moving picture encoding apparatus according to claim 1,
The means for calculating the estimated screen complexity of the current image is:
A function of the ratio of the encoded image characteristic obtained for the current image to the encoded image characteristic obtained for the immediately preceding image of the same picture type (I picture, P picture, or B picture), and a proportionality constant, are multiplied by the measured screen complexity of the immediately preceding image to calculate the estimated screen complexity.

The moving picture encoding apparatus according to claim 1,
The screen complexity used in the means for determining the allocated code amount of the image to be encoded next is the estimated screen complexity of the current image at least when the current image is determined, from the detected scene change image information, to be a scene change image or the image following a scene change image.

In the moving image encoder according to any one of claims 1 to 3,
The means for determining the allocated code amount of the image to be encoded next from the generated code amount and the measured screen complexity or the estimated screen complexity determines the allocated code amount by multiplying the ratio of the estimated screen complexity to the average value of the measured screen complexity over a certain period by an average allocated code amount.

In a video encoding method for encoding an input video having a motion compensation prediction step, an orthogonal transform step, a quantization step, and a variable length encoding step,
Detecting a generated code amount of each image of the input moving image;
Detecting an average quantization scale of each image of the input moving image;
A step of obtaining an encoded image characteristic of the input moving image by detecting an activity from the variance of luminance values, or from difference values between pixels, of at least the input moving image among the input moving image and the motion compensated prediction image generated in the motion compensation prediction step;
A step of determining whether the ratio between the encoded image characteristic obtained in the step of obtaining the encoded image characteristic and the encoded image characteristic of the input moving image encoded immediately before the input moving image from which that characteristic was obtained exceeds a predetermined threshold value, and of detecting scene change image information based on the determination result;
A step of calculating the measured screen complexity of a past image from the product of the generated code amount detected in the step of detecting the generated code amount and the average quantization scale detected in the step of detecting the average quantization scale;
A step of calculating the estimated screen complexity of the current image from the product of the calculated measured screen complexity of the past image and the ratio of the encoded image characteristic detected in the step of obtaining the encoded image characteristic to the image characteristic of the scene change image information detected in the step of detecting the scene change image information;
A step of determining the allocated code amount of the image to be encoded next by selecting the calculated estimated screen complexity of the current image for a scene change image of P and B pictures detected from the scene change image information and for the image following the scene change image, and selecting the calculated measured screen complexity of the past image for images other than the scene change image and the following image, and of determining the quantization scale of the image to be encoded next from the allocated code amount;
A moving picture encoding method comprising the above steps.

In the moving image encoding method according to claim 5,
Calculating the estimated screen complexity of the current image,
A function of the ratio of the encoded image characteristic obtained for the current image to the encoded image characteristic obtained for the immediately preceding image of the same picture type (I picture, P picture, or B picture), and a proportionality constant, are multiplied by the measured screen complexity of the immediately preceding image to calculate the estimated screen complexity.

In the moving image encoding method according to claim 5,
The screen complexity used in the step of determining the allocated code amount of the image to be encoded next is the estimated screen complexity of the current image at least when the current image is determined, from the detected scene change image information, to be a scene change image or the image following a scene change image.

In the moving image encoding method according to any one of claims 5 to 7,
The step of determining the allocated code amount of the image to be encoded next from the generated code amount and the measured screen complexity or the estimated screen complexity determines the allocated code amount by multiplying the ratio of the estimated screen complexity to the average value of the measured screen complexity over a certain period by an average allocated code amount.