JP4720284B2

JP4720284B2 - Image compression encoding device

Info

Publication number: JP4720284B2
Application number: JP2005142675A
Authority: JP
Inventors: 悟史宮地; 康弘滝嶋
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2005-05-16
Filing date: 2005-05-16
Publication date: 2011-07-13
Anticipated expiration: 2025-05-16
Also published as: JP2006319868A

Description

The present invention relates to an image compression encoding apparatus, and in particular to an encoding apparatus based on a motion-predictive DCT (Discrete Cosine Transform) encoding method.

Conventionally, H.263 and MPEG-4 have been available as encoding methods for multimedia applications. From an application standpoint, these methods did not need strictly specified buffer constraints. This is because they are mainly used either for applications unaffected by transmission characteristics, such as download distribution or CD-ROM storage, or for applications such as Internet streaming that are affected by transmission characteristics but place little emphasis on quality assurance. Even for applications that require low delay, such as videophones, tolerating fluctuation in the playback interval on the receiving side meant that there was no practical problem without strictly specified buffer constraints.

In contrast, there is the H.264 encoding method, which targets low bit rates (several hundred kbit/s or less) and offers high coding efficiency for moving images. This method has been adopted for the mobile service of digital terrestrial broadcasting (generally called "one-segment broadcasting").

From the viewpoint of encoding control, H.264 has the following features.
(1) Improved compression efficiency through a better prediction method in the intra-frame coding mode
(2) Improved prediction efficiency through adaptive block-size selection
(3) Adoption of RD optimization or an equivalent mode selection method
RD optimization means selecting the macroblock mode that minimizes the amount of information (Rate) and the distortion (Distortion) together; the optimal selection (Optimization) is made while actually performing provisional encoding.
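As a rough illustration of RD-optimized mode selection, the sketch below picks the macroblock mode with the smallest combined cost. The weighted sum D + λ·R is a common formulation and is an assumption here, not something the description specifies; the function name `select_mode`, the mode names, and the trial numbers are all hypothetical.

```python
def select_mode(candidates, lam):
    """Return the mode name with the smallest cost D + lam * R.

    candidates: dict mapping mode name -> (distortion, rate_bits), as would be
    measured by provisionally encoding the macroblock in each mode.
    lam: weight trading rate against distortion (assumed Lagrangian form).
    """
    best_name, best_cost = None, float("inf")
    for name, (distortion, rate_bits) in candidates.items():
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


# Hypothetical provisional-encoding results: mode -> (distortion, bits)
trial = {
    "intra16x16": (1200.0, 310),
    "inter16x16": (1500.0, 90),
    "inter8x8": (1100.0, 240),
}
cheap_rate_choice = select_mode(trial, lam=0.5)  # distortion dominates the cost
costly_rate_choice = select_mode(trial, lam=5.0)  # rate dominates the cost
```

With a small weight the lowest-distortion mode tends to win, and with a large weight the cheapest mode in bits tends to win, which is why the choice of weight matters for rate control.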

Because this encoding method is adopted for broadcasting, the encoder must deliver high quality, immediacy, and a uniquely determined delay. The encoder must therefore strictly comply with the bit rate and delay requirements.

Rate control methods based on H.264 include an HRD (Hypothetical Reference Decoder) buffer constraint method built on the information amount allocation scheme of MPEG-2 TM5 (see, for example, Non-Patent Document 1). The HRD buffer corresponds to the VBV (Video Buffering Verifier) of MPEG-2.

As a development of Non-Patent Document 1, there is also a method used in the reference encoder (JM: Joint Model, the reference software) (see, for example, Non-Patent Document 2). The JM is used to verify the performance or confirm the correctness of encoding methods.

These prior arts assume a constant rate per GOP (Group Of Pictures) and allocate an amount of information to each frame by weighting according to picture type. Rate control in the prior art is performed in combination with macroblock QP (Quantization Parameter) setting based on a quadratic rate-distortion model (see, for example, Non-Patent Document 3). These methods do not take frame-by-frame image properties into account. In addition, when determining QP within a frame, they reuse the local distortion of the macroblock at the same position in the previous frame, which is unreliable for images with large motion.

As another approach, there is a method that, when determining QP within a frame, performs quantization control by focusing on the relationship between the proportion of zero coefficients and the amount of generated information (see, for example, Non-Patent Document 4). There is also a method that performs RD optimization while varying the rate-distortion coupling coefficient λ and selects the result that yields the desired rate (see, for example, Non-Patent Document 5).

FIG. 1 is a functional configuration diagram of a prior-art image compression encoding apparatus.

The image compression encoding apparatus 1 receives picture data (frame by frame) and outputs the encoded data. The current frame memory 101 stores one input frame. The motion prediction unit 102 predicts, for each macroblock of the current frame, the motion relative to the frame in the local decode memory 107 (the previous locally decoded frame) by block matching or a similar method, and outputs a motion vector. The motion compensation unit 111 applies the previously calculated motion vector, macroblock by macroblock, to the previous locally decoded frame and outputs a predicted picture that accounts for the motion of the subject between successive frames. The DCT unit 112 takes the macroblock of the current frame in intra mode, or the difference between the macroblock of the current frame and the macroblock of the motion-compensated predicted picture in inter mode, divides it further into four blocks, applies an orthogonal transform to each block, and outputs DCT coefficients representing frequency components. The quantization unit 113 quantizes the orthogonally transformed DCT coefficients into discrete values based on a predetermined quantization parameter. The variable length coding unit 116 converts the motion vectors, mode information, and information generated by quantization into the codes that are actually transmitted. In doing so, it assigns codewords so that the code length becomes short, based on statistical properties such as the frequency of occurrence of this information, and performs the encoding. The buffer unit 117 temporarily buffers the compressed encoded data for transmission.

Rate control of the quantization unit in the prior art controls the quantization parameter of the quantization unit 113 according to the occupancy of the data accumulated in the buffer unit.

The inverse quantization unit 114 converts the quantized DCT coefficient data back to the original values. The inverse DCT unit 115 restores ordinary pixel information from the DCT coefficients. The local decode memory 107 temporarily stores the same frame as the one reproduced on the receiving side: the restored pixel information itself in intra mode, or the restored pixel information added to the motion-compensated image information in inter mode. The inter-frame prediction error calculation unit 103 and the intra-frame prediction error calculation unit 104 optimize parameters such as the prediction block size for the inter (inter-frame) and intra (intra-frame) modes respectively, and output the prediction error. The mode determination unit 105 selects either inter mode or intra mode so that the prediction error becomes small. The mode information memory 106 temporarily stores the selected mode and sets the switches according to that mode.

Non-Patent Document 1: S. Ma, W. Gao, F. Wu, and Yan Lu, "Rate control for JVT video coding scheme with HRD considerations", IEEE ICIP 2003, Vol. III, pp. 793-796, Sep. 2003.
Non-Patent Document 2: Z. Li, W. Gao, and F. Pan, "Adaptive rate control with HRD consideration", Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, JVT-H014, May 2003.
Non-Patent Document 3: T. Chiang and Y. Zhang, "A new rate control scheme using quadratic rate distortion model", IEEE Trans. Circuits Syst. Video Technol., Vol. 7, pp. 287-331, Apr. 1997.
Non-Patent Document 4: S. Milani, L. Celetto, and G. A. Mian, "A rate control algorithm for the H.264 encoder", 6th Baiona Workshop on Signal Processing in Communications, pp. 390-396, Sep. 2003.
Non-Patent Document 5: M. M. Mahdi and M. Ghanbar, "A lagrangian optimized rate control algorithm for the H.264/AVC encoder", IEEE ICIP 2004, Vol. I, pp. 123-126, 2004.
Non-Patent Document 6: G. Sullivan, T. Wiegand, and K. P. Lim, "Joint Model Reference Encoding Methods and Decoding Concealment Methods", Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, JVT-I049, Sep. 2003.
Non-Patent Document 7: ITU-T Recommendation H.264 + Cor1, "Advanced video coding for generic audiovisual services", May 2004.

In digital terrestrial broadcasting, one channel is divided into 13 segments. Depending on the required bandwidth, 12 segments are used for an HDTV broadcast, or 4 segments × 3 programs for standard-definition television, and the remaining one segment is allocated to mobile broadcasting. In other words, a fixed bandwidth is allocated.

All of the prior arts aim to allocate a constant amount of information per GOP under the bit rate constraint. Any difference between the information generated by already encoded frames and their allocation is controlled so as to be absorbed in encoding the remaining frames of the GOP. Furthermore, after a GOP has been encoded, the difference between its generated and allocated information amounts is reflected in the target information amount of the next GOP.
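The GOP-level carry-over described above can be sketched as follows. This is a minimal illustration under assumed conventions (the nominal GOP budget is the bit rate times the GOP duration, and the whole surplus or deficit is passed to the next GOP); the function name `gop_targets` and the example numbers are hypothetical and are not taken from the cited prior art.

```python
def gop_targets(actual_bits_per_gop, bitrate, fps, gop_len):
    """Return the target bit budget assigned to each GOP in turn.

    Each GOP nominally receives bitrate * gop_len / fps bits. The difference
    between a GOP's target and the bits it actually produced is carried into
    the next GOP's target.
    """
    nominal = bitrate * gop_len / fps
    carry = 0.0
    targets = []
    for actual in actual_bits_per_gop:
        target = nominal + carry
        targets.append(target)
        carry = target - actual  # positive: unused bits, negative: overspend
    return targets


# 128 kbit/s, 15 frames/s, 15-frame GOPs: nominal budget of 128000 bits per GOP
targets = gop_targets([140000, 120000, 128000], bitrate=128000, fps=15, gop_len=15)
```

In this example the first GOP overspends by 12000 bits, so the second GOP's target shrinks by the same amount. Nothing in this scheme looks at the decoder buffer size or the initial delay, which is the limitation the description goes on to discuss.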

In such rate control methods, the generated information amount is controlled to be uniform across all GOPs, so the HRD buffer size is implicitly determined to be about the information amount of the largest frame in a GOP. However, because the control is not explicitly aware of the HRD buffer size and the initial delay, it is difficult to perform encoding control under buffer constraints tailored to a specific transmission purpose (for example, videophone use with bidirectional ultra-low delay, or live broadcast use with one-way low delay).

In addition, because image properties are not considered when allocating information to frames, distortion tends to increase in scenes that generate a large amount of information, while scenes that generate little tend to be quantized more finely than necessary. This not only lowers inter-frame prediction efficiency, but the resulting fluctuation in quality is also visually unpleasant.

In this case, not only does the image quality deteriorate, but picture disturbance due to buffer failure on the receiving side and jerky motion (judder) caused by frames not being displayed at the specified time also occur. These must be strictly avoided, particularly in broadcasting.

Conversely, if encoding were performed using the generated information amount derived from the image properties directly as the target value, in order to take image properties into account, the encoding would reflect the properties of each frame but would ignore the transition of the HRD buffer. Allocating exactly the amount of information each frame requires is equivalent to encoding with a fixed QP value. This is not a problem when the HRD buffer is infinitely large, but under a buffer constraint the specified buffer conditions must not be violated.

Thus, none of the prior arts described above proposes encoding control with an explicitly specified buffer size or delay, or encoding control that considers the image properties of individual frames.

Accordingly, an object of the present invention is to provide an image compression encoding apparatus that, with application to digital terrestrial one-segment broadcasting (which requires high quality and low delay) in mind, achieves both stability by taking buffer constraints into account and improved image quality by reflecting image properties.

An image compression encoding apparatus according to the present invention comprises:
quantization means for performing quantization based on a quantization parameter;
buffer constraint means for outputting a frame target generated information amount, obtained by correcting the basic generated information amount required by each frame so that the buffer occupancy transition of a virtual reception buffer, viewed from immediately after encoding of the first frame of a first GOP up to immediately before encoding of the first frame of a second GOP, is centered between the overflow limit and the underflow limit; and
quantization parameter control means for deriving a quantization parameter from a macroblock target generated information amount calculated from the frame target generated information amount, and outputting the quantization parameter to the quantization means.

According to another embodiment of the image compression encoding apparatus of the present invention, it is also preferable that the apparatus further comprises sum-of-absolute-differences calculation means for calculating, for each macroblock, the sum of absolute differences SAD, which is the prediction error after inter/intra mode determination, and that the buffer constraint means comprises:
frame basic generated information amount estimation means for estimating the frame basic generated information amount based on the sum of absolute differences SAD;
information amount correction means for outputting a scaling coefficient, which is the ratio of the generated information amount of a frame to the average generated information amount per frame obtained from the bit rate, such that the buffer occupancy transition of the virtual reception buffer, viewed from immediately after encoding of the first frame of the first GOP up to immediately before encoding of the first frame of the second GOP, is centered between the overflow limit and the underflow limit; and
multiplication means for multiplying the estimated basic generated information amount by the scaling coefficient and outputting the frame target generated information amount.

According to another embodiment of the image compression encoding apparatus of the present invention, it is also preferable that the information amount correction means operates GOP by GOP, outputting a small scaling coefficient when the occupancy of encoded data in the buffer is running low and a large scaling coefficient when the occupancy of encoded data is running high.
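One way to picture this behaviour is the sketch below. The formula is an illustrative assumption, not the computation defined by this description: it spreads the gap between the current occupancy and the centre of the buffer evenly over the frames remaining in the GOP, then expresses the resulting per-frame budget as a ratio to the average bits per frame. The name `scaling_coefficient` and all parameters are hypothetical.

```python
def scaling_coefficient(occupancy, center, frames_left, bitrate, fps):
    """Hypothetical scaling coefficient relative to the average bits per frame.

    occupancy:   current virtual reception buffer occupancy in bits
    center:      midpoint between the overflow and underflow limits, in bits
    frames_left: frames remaining before the next GOP starts
    The buffer gains bitrate / fps bits per frame interval, so spending
    avg + (occupancy - center) / frames_left bits per frame would bring the
    occupancy to the centre by the end of the GOP.
    """
    avg_bits = bitrate / fps
    per_frame_budget = avg_bits + (occupancy - center) / frames_left
    return max(per_frame_budget, 0.0) / avg_bits


low = scaling_coefficient(32000, 64000, 14, bitrate=128000, fps=15)
mid = scaling_coefficient(64000, 64000, 14, bitrate=128000, fps=15)
high = scaling_coefficient(96000, 64000, 14, bitrate=128000, fps=15)
```

Under these assumptions a low occupancy yields a coefficient below 1 and a high occupancy yields a coefficient above 1, matching the qualitative behaviour stated above.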

Furthermore, according to another embodiment of the image compression encoding apparatus of the present invention, it is also preferable that the frame basic generated information amount estimation means stores in advance a function f, or estimated values, for estimating the generated information amount from the sum of absolute differences SAD and a fixed quantization parameter QP.

Furthermore, according to another embodiment of the image compression encoding apparatus of the present invention, it is also preferable that the buffer constraint means further comprises clip means for clipping the frame target generated information amount output from the multiplication means so that the buffer does not reach its overflow limit or underflow limit.

Furthermore, according to another embodiment of the image compression encoding apparatus of the present invention, it is also preferable that the quantization parameter control means comprises:
macroblock target generated information amount determination means for outputting a macroblock target generated information amount based on the frame target generated information amount; and
quantization parameter calculation means for calculating the quantization parameter of a macroblock from the sum of the actually generated information amounts up to just before that macroblock, the sum of the target generated information amounts up to just before that macroblock, and the average of the quantization parameters up to just before that macroblock, and outputting it to the quantization means.

According to the image compression encoding apparatus of the present invention, when applied to digital terrestrial one-segment broadcasting, which requires high quality and low delay, both stability through consideration of buffer constraints and improved image quality through reflection of image properties can be achieved. In particular, rate control in the prior art allocates a fixed amount per GOP, so the buffer size and image quality could not be set strictly according to the application.

According to the present invention, buffer constraints are taken into account by newly introducing the concept of a scaling coefficient obtained from the buffer transition state of a GOP, which is applied to the predicted generated information amount to obtain the amount of information to allocate. To reflect image properties, the prediction error of each frame is calculated in advance and the optimal generated information amount is predicted from it. The amount of information allocated in this way reflects the relative properties of each frame under a fixed buffer constraint.

The best mode for carrying out the present invention is described in detail below with reference to the drawings.

FIG. 2 is a functional configuration diagram of the image compression encoding apparatus according to the present invention.

Compared with FIG. 1, the image compression encoding apparatus 1 further includes a frame memory 121, an SAD (Sum of Absolute Differences; inter-frame prediction error) calculation unit 122, a buffer constraint unit 123, and a quantization parameter control unit 124. The frame memory 121 receives the inter-mode or intra-mode prediction error. The SAD calculation unit 122 receives the prediction error and outputs S_j, the SAD of each macroblock; ΣS_j, the total of S_j within the frame; and meanSAD, the average over the number of macroblocks. Hereinafter, i denotes the frame number and j denotes the macroblock number within the frame.

The purpose of the buffer constraint unit 123 is compliance with the HRD (Hypothetical Reference Decoder) buffer constraint. The buffer constraint unit 123 receives the meanSAD output from the SAD calculation unit 122 and outputs the frame target generated information amount T_i. The HRD buffer is assumed to operate at a fixed bit rate.

The purpose of the quantization parameter control unit 124 is to allocate information so as to achieve the target generated information amount, based on the coding characteristics of H.264. The quantization parameter control unit 124 receives S_j, the SAD of each macroblock; ΣS_j, the total of S_j within the frame; and the frame target generated information amount T_i, and outputs a QP (quantization parameter) value. The quantization unit 113 is controlled based on this QP value.

Each functional unit is described in detail below.

[SAD calculation unit]
The SAD calculation unit 122 receives the prediction error of each macroblock from the frame memory. Accordingly, the inter prediction error is used for inter-mode macroblocks, and the intra prediction error is used for intra-mode macroblocks. The SAD calculation unit 122 then calculates, for each macroblock of the frame, the SAD of the prediction error after mode determination.

The present invention focuses on the correlation between the magnitude of the prediction error obtained after inter prediction or intra prediction and the amount of information actually generated, and determines the basic information allocation at the frame level. The sum of absolute differences SAD is used to evaluate the magnitude of the prediction error. The SAD calculation unit 122 outputs S_j, the SAD of each macroblock; ΣS_j, the total of S_j within the frame; and meanSAD, the average over the number of macroblocks.

In H.264, to make mode determination in intra prediction effective, the conventional technique performs mode determination while encoding macroblock by macroblock, using adjacent macroblocks that have already been encoded. In contrast, because the SAD calculation unit 122 of the present invention calculates the SAD of the prediction error before encoding, it uses the original image as the reference image for intra mode determination. At encoding time, only the intra mode determination is performed again, using the locally decoded image. The motion vectors and inter-mode prediction error values obtained during SAD calculation are reused at encoding time to avoid unnecessary duplicate processing.

For each macroblock m for which mode determination has been performed, the SAD S_m is calculated by the following formula (1), where e_m(p, q) is the prediction error at each pixel and the sums run over the pixels (p, q) of macroblock m.
Formula (1)
S_m = Σ_p Σ_q |e_m(p, q)|

The intra-frame average of the SAD, meanSAD, is calculated by the following formula (2), in which x and y are the image dimensions and each macroblock is 16 × 16 pixels, so that (x / 16) · (y / 16) is the number of macroblocks in the frame.
Formula (2)
meanSAD = ( Σ_m S_m ) / ( (x / 16) · (y / 16) )
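The two quantities can be computed as in the following sketch, which assumes 16 × 16 macroblocks, frame dimensions that are multiples of 16, and a prediction-error frame given as a plain list of pixel rows. The function names `macroblock_sads` and `mean_sad` and the toy frame are hypothetical.

```python
MB = 16  # macroblock edge length in pixels (assumed 16 x 16 macroblocks)


def macroblock_sads(error, mb=MB):
    """Return the SAD of every macroblock, in raster order.

    error: 2-D list (rows of pixels) holding the signed prediction error of
    each pixel after mode determination. Width and height are assumed to be
    multiples of mb.
    """
    height, width = len(error), len(error[0])
    sads = []
    for top in range(0, height, mb):
        for left in range(0, width, mb):
            sads.append(sum(abs(error[r][c])
                            for r in range(top, top + mb)
                            for c in range(left, left + mb)))
    return sads


def mean_sad(sads):
    """Average SAD over the number of macroblocks in the frame."""
    return sum(sads) / len(sads)


# Toy 32 x 16 frame: error +2 in the left macroblock, -1 in the right one
frame_error = [[2] * 16 + [-1] * 16 for _ in range(16)]
sads = macroblock_sads(frame_error)
frame_mean = mean_sad(sads)
```

The left macroblock contributes 256 × 2 and the right one 256 × 1, so the absolute value removes the sign before the per-frame average is taken.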

[Buffer constraint unit]
The buffer constraint unit 123 includes a frame basic generated information amount estimation unit 1231, an information amount correction unit 1232, a memory 1233, a multiplication unit 1234, and a clip unit 1235. The frame basic generated information amount estimation unit 1231 receives the SAD average meanSAD and outputs the frame basic generated information amount T_B based on each QP. The information amount correction unit 1232 outputs, as the scaling coefficient C_Si, the ratio of the actual generated information amount to the generated information amount per frame obtained from the bit rate. The multiplication unit 1234 multiplies the scaling coefficient C_Si by the frame basic generated information amount T_B and outputs the frame target generated information amount T_i. The clip unit 1235 clips the frame target generated information amount T_i so that the buffer does not actually reach its overflow limit or underflow limit. Each functional unit is described in detail below.
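The multiply-then-clip flow of the buffer constraint unit can be sketched as below. The clipping bounds are an assumption made for illustration: the buffer is modelled as a decoder-side buffer that loses a frame's bits when the frame is removed and then gains a fixed number of bits before the next frame. The function names, the `margin` parameter, and the numbers are hypothetical.

```python
def clip_target(target, occupancy, buffer_size, bits_per_interval, margin=0.0):
    """Clip a frame target so the modelled buffer stays within its limits.

    Underflow limit: the frame cannot take more bits than are in the buffer.
    Overflow limit: after removing the frame and receiving the next interval's
    bits, the occupancy must not exceed buffer_size.
    """
    upper = occupancy - margin
    lower = max(0.0, occupancy + bits_per_interval - buffer_size + margin)
    return min(max(target, lower), upper)


def frame_target(basic_bits, scale, occupancy, buffer_size, bits_per_interval):
    """T_i: basic amount T_B times scaling coefficient, then clipped."""
    return clip_target(scale * basic_bits, occupancy, buffer_size, bits_per_interval)


# 64000-bit buffer receiving 8000 bits per frame interval
too_big = frame_target(10000, 1.2, occupancy=9000, buffer_size=64000, bits_per_interval=8000)
too_small = frame_target(1000, 0.5, occupancy=60000, buffer_size=64000, bits_per_interval=8000)
unchanged = frame_target(5000, 1.0, occupancy=30000, buffer_size=64000, bits_per_interval=8000)
```

A target larger than the current occupancy is pulled down to avoid underflow, a target too small to make room for incoming data is pushed up to avoid overflow, and a target inside both bounds passes through unchanged.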

[Frame basic generated information amount estimation unit]
The frame basic generated information amount estimation unit 1231 receives meanSAD, the SAD average over the number of macroblocks. From meanSAD and each QP value, it calculates the frame basic generated information amount T_B using the function f. The frame basic generated information amount is the amount of information that each frame requires in its own right, before buffer constraints are considered.
Formula (3)
T_B = f(meanSAD, QP)

FIG. 3 is a graph showing the relationship between SAD and the amount of generated information. The graph of FIG. 3 was obtained by preliminary experiment.

In the graph of FIG. 3, the horizontal axis is the per-macroblock average of the intra-frame SAD total, and the vertical axis is the amount of generated information. For each QP value, the plotted points lie on almost a single curve. This correlation, obtained by prior measurement, can be seen to hold irrespective of the QP value. This means that the amount of information a frame requires can be predicted from the frame SAD.

For these relationships, a least-squares fit of the following polynomial in S was performed for each QP, giving the coefficients shown in Table 1. Here, bits is the frame generated information amount and S is the per-macroblock average of the frame SAD.
bits = a·S³ + b·S² + c·S + d

Here, the target QP range is 5 to 50. This is because, whereas conventional encoding methods define the quantization step size as linear in QP with a maximum QP of 31, H.264 makes the step size proportional to 2 to the power (QP/6) so that QP and the amount of distortion have a linear relationship, and specifies a maximum QP of 51. Because the fit showed the coefficients of the third and higher orders to be sufficiently small, the following quadratic expression is used.
bits = b·S² + c·S + d

FIG. 4 is a graph approximating the results of FIG. 3.

The function f in Equation (3) represents the correlation between SAD and the amount of generated information obtained from FIG. 4. It allows the amount of information generated by a frame to be predicted once the prediction error has been calculated. Because the prediction error of every macroblock is already available at that point, the amount of information generated by each macroblock in the frame can be predicted at the same time, and this prediction error information can also be used when determining each QP value.
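The prediction step described here can be sketched as follows. This is a minimal illustration, assuming the quadratic model bits = b·S² + c·S + d for one fixed QP; the coefficient values and the names `COEFFS`, `mean_sad_per_macroblock`, and `predict_frame_bits` are hypothetical and are not the values of Table 1.

```python
def mean_sad_per_macroblock(mb_sads):
    """Average the per-macroblock SAD values S_j over one frame."""
    return sum(mb_sads) / len(mb_sads)


def predict_frame_bits(mean_sad, b, c, d):
    """Evaluate the quadratic model bits = b*S^2 + c*S + d."""
    return b * mean_sad ** 2 + c * mean_sad + d


# Hypothetical coefficients for one fixed QP (NOT the values of Table 1).
COEFFS = {"b": 0.5, "c": 20.0, "d": 1000.0}

mb_sads = [100, 200, 300]                    # illustrative S_j values
S = mean_sad_per_macroblock(mb_sads)         # per-macroblock mean SAD
predicted = predict_frame_bits(S, **COEFFS)  # predicted frame bits
```

In practice one coefficient set per QP would be stored in advance, since the fit is performed separately for each QP.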

[Information amount correction unit]
The information amount correction unit 1232 outputs, as the scaling coefficient C_Si, the ratio of the actual generated information amount to the per-frame generated information amount obtained from the bit rate.

FIG. 5 is a transition graph of the HRD buffer in the present invention.

In FIG. 5, the vertical axis is the buffer occupancy and the horizontal axis is the elapsed time. The current time is taken to be just after encoding the IDR (Instantaneous Decoding Refresh) frame at the head of the GOP. An IDR frame corresponds to a conventional I frame. In general, the HRD buffer level drops sharply when an IDR frame, which carries a large amount of information, is encoded, and then recovers over the following non-IDR frames (P frames or B frames).

In H.264, where the distance to a reference frame can be changed, an IDR frame guarantees that no reference relationship whatsoever crosses it. In digital broadcasting, a receiver must be able to start decoding from the middle of a transmitted stream, so IDR frames must be inserted periodically.

Here, consider the conditions that the HRD buffer transition must satisfy until encoding of one GOP is complete.

To prevent underflow, it is not enough to avoid underflow immediately after encoding each IDR frame; the buffer must also hold, immediately before encoding the head frame of the next GOP, the amount of information that the IDR frame may generate.

To prevent overflow, it suffices that the buffer level immediately before encoding each frame does not exceed the maximum buffer size. However, to use all of the generated information amount efficiently for encoding the image, without generating unnecessary stuffing bits, a further condition is that no overflow at all occurs until immediately before encoding the next GOP.

The reason is as follows. In a sequence with little information, the amount of information generated by each P frame is generally smaller than the average, so the buffer level rises monotonically over the P-frame interval. If a frame partway through the GOP generates so little information that an overflow is predicted immediately before encoding the next frame, stuffing is performed to avoid it. The next frame, however, is likely to have similar image properties, in which case stuffing continues to be needed for subsequent frames as well. This could be avoided by a special correction, such as making the quantization parameter used for the frame encoded immediately after stuffing sufficiently small. That, however, would run counter to the uniformity of image properties that the present invention aims for, so the condition must be that no overflow occurs through to the end of the GOP.

In view of the above, the underflow limit and the overflow limit partway through a GOP are defined as B_U(i) and B_O(i) in FIG. 5, respectively.

According to FIG. 5, the buffer should ideally transition inside the triangle bounded by the two limits and the GOP boundary line (with vertices at points A, C, and D). To that end, information is allocated by the following procedure.

Immediately after encoding the head frame (IDR) of the GOP, the slope of the buffer transition for that GOP is determined. This is equivalent to determining the target generated information amount for the GOP. The maximum information amount of an IDR frame is the current buffer level minus a margin (10% of the buffer size). The frame target generated information amount is the generated information amount predicted from each frame's SAD, adjusted by this slope. Within the frame, information is then allocated according to the SAD of each macroblock.

The buffer state is reflected in this slope, that is, in the target generated information amount for each GOP (with the direction of increase and decrease reversed). When the current buffer state is close to underflow and the generated information amount must be held down, the slope takes a large value; when the buffer state is close to overflow, the opposite holds. According to the present invention, the target generated information amount used during encoding is obtained by applying the slope that the buffer transition should take to the generated information amount predicted in advance from the SAD, which makes it possible both to reflect the image properties and to satisfy the buffer constraints.

Conventionally, frame skipping (frame dropping) according to the buffer state has sometimes been used to avoid reaching the underflow limit. Frame skipping, however, does not control the information amount of the frame itself; it merely places the HRD buffer at a higher position when the next frame is encoded. It can therefore be expected to help prevent the underflow limit from being reached, but it is not a fundamental solution. The jerkiness caused by frame skipping can also worsen the subjective impression. The present invention therefore does not perform frame skipping.

According to FIG. 5, the buffer state immediately before encoding the i-th frame in the GOP satisfies the following Equation (4).
Equation (4)
$$Bh_i = Bh_{i-1} - b_{i-1} + \frac{BR}{FR}$$
Bh_i (bit): HRD buffer level immediately before encoding frame i
b_i (bit): amount of information generated by encoding frame i
BR (bit/s): encoding bit rate
FR (frame/s): encoding frame rate
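The recursion of Equation (4) can be traced with a short sketch. The function name `hrd_levels` and all numeric values below are illustrative assumptions, chosen only to show the large drop after an IDR frame and the gradual recovery over smaller frames.

```python
def hrd_levels(initial_level, frame_bits, bit_rate, frame_rate):
    """Return Bh_0 .. Bh_n, the buffer level just before encoding each frame.

    Applies Bh_i = Bh_(i-1) - b_(i-1) + BR/FR once per encoded frame.
    """
    per_frame = bit_rate / frame_rate   # bits arriving during one frame period
    levels = [initial_level]
    for bits in frame_bits:
        levels.append(levels[-1] - bits + per_frame)
    return levels


# Illustrative values: one large IDR frame followed by two small P frames.
levels = hrd_levels(initial_level=100000,
                    frame_bits=[60000, 5000, 5000],
                    bit_rate=150000, frame_rate=15)
```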

Next, the underflow limit and the overflow limit are defined as follows.

The underflow limit B_U(i) is defined by Equation (5) as the straight line connecting the buffer position just after encoding the IDR frame at the head of the GOP, point A (0, Bh_0 − b_0), with the buffer position reserved for the head IDR frame of the next GOP, point C (N, be).
Equation (5)
$$B_U(i) = \frac{be - (Bh_0 - b_0)}{N}\, i + (Bh_0 - b_0)$$
be (bit): amount of generated information allowed for the head frame of the next GOP
N (frame): number of frames in the GOP

The overflow limit B_O(i) is defined by Equation (6) as the straight line connecting the buffer position just before encoding the second frame from the head of the GOP, point B (1, Bh_0 − b_0 + BR/FR), with the buffer maximum just before encoding the head IDR frame of the next GOP, point D (N, B).
Equation (6)
$$B_O(i) = \frac{B - \left(Bh_0 - b_0 + \frac{BR}{FR}\right)}{N - 1}\,(i - 1) + \left(Bh_0 - b_0 + \frac{BR}{FR}\right)$$
B: buffer size

As shown in FIG. 5, from the standpoint of preventing underflow and overflow, the actual buffer transition should ideally run through the middle of the quadrilateral region bounded by the two limits. The straight line B_I(i) is defined by Equation (7) as the line connecting the midpoint (1/2, Bh_0 − b_0 + BR/(2·FR)) of the two points on the left side of the quadrilateral, A (0, Bh_0 − b_0) and B (1, Bh_0 − b_0 + BR/FR), with the midpoint (N, (B + be)/2) of the two points on the right side, C (N, be) and D (N, B).
Equation (7)
$$B_I(i) = \frac{\frac{B + be}{2} - \left(Bh_0 - b_0 + \frac{BR}{2\,FR}\right)}{N - \frac{1}{2}}\left(i - \frac{1}{2}\right) + \left(Bh_0 - b_0 + \frac{BR}{2\,FR}\right)$$
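The three lines can be built directly from the endpoints A, B, C, and D given in the text. The sketch below is an illustration under that reading; the helper names `line_through` and `gop_limits` and the numeric values are assumptions, not part of the source.

```python
def line_through(p, q):
    """Return the function i -> y of the straight line through points p and q."""
    (x0, y0), (x1, y1) = p, q
    slope = (y1 - y0) / (x1 - x0)
    return lambda i: y0 + slope * (i - x0)


def gop_limits(bh0, b0, be, buf_size, n_frames, bit_rate, frame_rate):
    """Build the underflow limit B_U, overflow limit B_O, and center line B_I."""
    per_frame = bit_rate / frame_rate
    a = (0, bh0 - b0)                    # just after the head IDR frame
    b = (1, bh0 - b0 + per_frame)        # just before the second frame
    c = (n_frames, be)                   # reserve for the next IDR frame
    d = (n_frames, buf_size)             # buffer maximum at the GOP boundary
    mid_left = (0.5, bh0 - b0 + per_frame / 2)
    mid_right = (n_frames, (buf_size + be) / 2)
    return line_through(a, c), line_through(b, d), line_through(mid_left, mid_right)


# Illustrative values only.
under, over, ideal = gop_limits(bh0=100000, b0=60000, be=60000,
                                buf_size=120000, n_frames=15,
                                bit_rate=150000, frame_rate=15)
```

With these values the center line stays strictly between the two limits for every frame index from 1 to N.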

From the standpoint of preventing buffer failure, the HRD buffer transition should ideally stay close to the straight line B_I(i); that is, B_I(i) should pass midway between the buffer position just after encoding frame i and the buffer position just before encoding the next frame. In that case Equation (8) holds, and the generated information amount b_i of frame i must be controlled so as to satisfy it.
Equation (8)
$$Bh_i - b_i + \frac{BR}{2\,FR} = B_I(i)$$

Next, the ratio of b_i to the per-frame generated information amount b_f (= BR/FR) obtained from the bit rate is defined as the scaling coefficient C_Si, as in Equation (9).
Equation (9)
$$C_{Si} = \frac{b_i}{b_f}$$

Equation (10) follows from Equations (8) and (9): solving Equation (8) for b_i gives b_i = Bh_i − B_I(i) + BR/(2·FR), and dividing by b_f = BR/FR yields the following.
Equation (10)
$$C_{Si} = \frac{Bh_i - B_I(i)}{BR/FR} + \frac{1}{2}$$
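As a worked illustration of Equation (10), read with the difference Bh_i − B_I(i) divided by BR/FR as a whole (which is what Equations (8) and (9) imply), the coefficient can be computed as below. The function name `scaling_coefficient` and the numeric values are assumptions for illustration.

```python
def scaling_coefficient(bh_i, ideal_level, bit_rate, frame_rate):
    """C_Si = (Bh_i - B_I(i)) / (BR/FR) + 1/2.

    bh_i is the buffer level just before encoding frame i and ideal_level
    is the value of the center line B_I(i) at that frame.
    """
    per_frame = bit_rate / frame_rate
    return (bh_i - ideal_level) / per_frame + 0.5


# Illustrative values: BR/FR = 10000 bits per frame, center line at 45000.
on_track = scaling_coefficient(50000, 45000, 150000, 15)   # half a frame budget above the line
low = scaling_coefficient(45000, 45000, 150000, 15)        # buffer lower than ideal
high = scaling_coefficient(60000, 45000, 150000, 15)       # buffer higher than ideal
```

The coefficient equals 1 when the level just before encoding sits half a frame budget above the center line, so that spending exactly BR/FR bits leaves the line midway between the levels after and before encoding.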

The scaling coefficient C_Si calculated in this way is output from the information amount correction unit 1232.

The memory 1233 receives the scaling coefficient C_Si from the information amount correction unit 1232 and stores it temporarily.

[Multiplication unit]
The multiplication unit 1234 multiplies the scaling coefficient C_Si by the basic generated information amount T_B (T_Bi for frame i) and outputs the frame target generated information amount T_i. The scaling coefficient is applied to the previously calculated basic generated information amount so that the target reflects the information each frame requires while preventing underflow and overflow of the HRD buffer. That is, the target generated information amount T_i is calculated by Equation (11).
Equation (11)
$$T_i = T_{Bi} \cdot C_{Si}$$

The target generated information amount T_i is thereby controlled as follows. When the HRD buffer is transitioning at a low level, a small scaling coefficient is applied to the basic generated information amount, which holds down the generated information amount and avoids buffer underflow. When the HRD buffer is transitioning at a high level, a large scaling coefficient is applied, which increases the generated information amount and avoids buffer overflow.

Because the scaling coefficient corrects the basic generated information amount, it should not be updated frequently; it is preferable that it stay constant over a certain range. The reason is as follows. According to Equation (11), if C_Si is updated frequently, the per-frame image properties implicitly reflected in T_B are ignored, which lowers the correlation between the frame target generated information amount T_i and the image properties. The scaling coefficient is therefore updated once per GOP and applied to every frame. The update is performed after encoding the head IDR frame of the GOP and immediately before encoding the next P frame, which corresponds to setting i = 1 in Equation (10). In this case, the scaling coefficient represents the ratio of the information amount allocated to the GOP to the average bit rate.
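The effect of holding the coefficient fixed over a GOP can be seen in a small sketch of Equation (11). The function name `frame_targets` and the numbers are hypothetical; the point is only that one GOP-wide coefficient rescales every frame's predicted amount without changing the ratios between frames.

```python
def frame_targets(basic_bits, gop_scale):
    """Apply T_i = T_Bi * C_S with one coefficient C_S for the whole GOP."""
    return [t_b * gop_scale for t_b in basic_bits]


# Hypothetical basic amounts T_Bi predicted from each frame's SAD.
basic = [8000, 12000, 10000]
targets = frame_targets(basic, gop_scale=0.75)   # buffer running low: scale down

# The relative sizes of the frames are unchanged by the scaling.
ratio_before = basic[1] / basic[0]
ratio_after = targets[1] / targets[0]
```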

[Clip unit]
The clip unit 1235 clips the frame target generated information amount T_i so that the buffer does not actually reach the overflow limit or the underflow limit. Because Equation (11) allocates information per GOP with the buffer state taken into account, buffer failure does not occur in principle. To rule it out strictly, however, the following buffer constraints are applied.
Equation (12)
$$Bh_i - b_i \geq mgn$$
Equation (13)
$$Bh_i - b_i + \frac{BR}{FR} \leq B - mgn$$
Here, mgn is a margin with respect to the upper and lower limits of the buffer and is defined as follows.
Equation (14)
$$mgn = 0.2 \cdot \frac{BR}{FR}$$
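Solving Equations (12) and (13) for b_i gives an upper bound Bh_i − mgn and a lower bound Bh_i + BR/FR − B + mgn, so the clipping can be sketched as below. The function name `clip_target`, the floor at zero, and the numeric values are assumptions added for illustration.

```python
def clip_target(target_bits, bh_i, buf_size, bit_rate, frame_rate):
    """Clamp a frame target so that Equations (12) and (13) both hold."""
    per_frame = bit_rate / frame_rate
    mgn = 0.2 * per_frame                          # Equation (14)
    upper = bh_i - mgn                             # from Bh_i - b_i >= mgn
    lower = bh_i + per_frame - buf_size + mgn      # from the overflow constraint
    lower = max(lower, 0.0)                        # assumption: a target is never negative
    return min(max(target_bits, lower), upper)


# Illustrative values: buffer size 120000 bits, BR/FR = 10000 bits per frame.
too_big = clip_target(60000, bh_i=50000, buf_size=120000, bit_rate=150000, frame_rate=15)
unchanged = clip_target(8000, bh_i=50000, buf_size=120000, bit_rate=150000, frame_rate=15)
too_small = clip_target(3000, bh_i=115000, buf_size=120000, bit_rate=150000, frame_rate=15)
```

Here the first target is cut to the underflow-side bound, the second passes through untouched, and the third is raised to the overflow-side bound.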

[Quantization parameter control unit]
The quantization parameter control unit 124 includes a macroblock target generated information amount determination unit 1241, a target generated information amount summation unit 1242, an actual generated information amount summation unit 1243, a quantization parameter calculation unit 1244, and a quantization parameter averaging unit 1245. The macroblock target generated information amount determination unit 1241 determines the target generated information amount TB_j of each macroblock from the frame target generated information amount T_i, the SAD (prediction error) S_j of each macroblock, and the intra-frame total ΣS_j of S_j. The target generated information amount summation unit 1242 sums the macroblock target generated information amounts TB_j and outputs the target generated information amount S_TB. The actual generated information amount summation unit 1243 sums the actual generated information amounts BM_j of the macroblocks and outputs the actual generated information amount S_BM. The quantization parameter calculation unit 1244 determines a QP value from S_TB, S_BM, and the average QP value, and outputs that QP value to the quantization unit. The quantization parameter averaging unit 1245 calculates the average of the QP values and outputs the average QP value to the quantization parameter calculation unit 1244.

[Macroblock target generated information amount determination unit]
The macroblock target generated information amount determination unit 1241 determines the target generated information amount TB_j of each macroblock from the frame target generated information amount T_i, the prediction error S_j of each macroblock, and the intra-frame total ΣS_j of S_j.

For information allocation in units of macroblocks as well, the present invention aims to allocate information so that it reflects, as far as possible, the amount of information each part requires, thereby minimizing QP variation and keeping image quality uniform. The required information amount is taken from the SAD value S_j output by the SAD calculation unit 122, and the target generated information amount TB_j is determined by Equation (15).
Equation (15)
$$TB_j = \frac{S_j}{\sum S_j} \cdot T_i$$
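Equation (15) is a proportional split of the frame target by SAD share, which can be sketched as follows. The function name `macroblock_targets`, the even split for an all-zero frame, and the sample values are assumptions for illustration.

```python
def macroblock_targets(frame_target, mb_sads):
    """Split T_i across macroblocks in proportion to S_j (Equation (15))."""
    total = sum(mb_sads)
    if total == 0:
        # Assumption: with no prediction error anywhere, split evenly.
        return [frame_target / len(mb_sads)] * len(mb_sads)
    return [frame_target * s / total for s in mb_sads]


# Hypothetical frame target and per-macroblock SAD values.
tb = macroblock_targets(8000, [100, 300, 400, 200])
```

The macroblock targets always sum to the frame target, so the frame-level budget is preserved.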

[Quantization parameter calculation unit]
To absorb the difference between the target generated information amount and the amount of information actually generated by encoding, the QP value is controlled as follows. The amount of information generated so far and the target generated information amount so far are given by Equations (16) and (17), respectively.
m: macroblock currently being encoded
BM_j: amount of information actually generated by macroblock j
Equation (16)

Equation (17)

The QP value QP_m for macroblock m is set so as to absorb this difference. According to H.264 (the decoder standard; see, for example, Non-Patent Document 7), the relationship between the quantization parameter QP and the quantization step is the exponential function of Equation (18), defined so that the quantization step doubles each time the quantization parameter increases by 6.
Equation (18)

Equation (19) therefore holds.
Equation (19)

From this, Equation (20) follows.
Equation (20)
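The one property stated in the text, that the quantization step doubles for every increase of 6 in QP, can be illustrated numerically. This sketch shows only that relationship and its inverse; it is not a reconstruction of Equations (18) to (20), which are not reproduced in the text, and the function names are hypothetical.

```python
import math


def step_ratio(delta_qp):
    """Factor by which the quantization step changes when QP changes by delta_qp."""
    return 2.0 ** (delta_qp / 6.0)


def delta_qp_for_step_ratio(ratio):
    """QP change that scales the quantization step by the given factor."""
    return 6.0 * math.log2(ratio)


doubled = step_ratio(6)                    # QP + 6
quadrupled = step_ratio(12)                # QP + 12
back_to_qp = delta_qp_for_step_ratio(2.0)  # step doubled
```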

The quantization parameter averaging unit 1245 averages the QP values output from the quantization parameter calculation unit 1244. As noted above, the present invention allocates information in units of macroblocks so as to reflect the information each part requires as far as possible, minimizing QP variation and keeping image quality uniform. From this standpoint, the average QP value of the previous frame is considered useful for mode determination and is used for that purpose. QP_0 is a moving average of past QP values. Because information is allocated to frames and to macroblocks according to need, QP variation is kept to a minimum, and the past QP average can be used effectively.
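A moving average of past QP values can be kept with a fixed-length window, as in the sketch below. The text does not state the window length or the averaging granularity, so the class name `QpMovingAverage`, the window of 3, and the sample QP values are all assumptions.

```python
from collections import deque


class QpMovingAverage:
    """Keep a simple moving average over the most recent QP values."""

    def __init__(self, window=3):          # window length is an assumption
        self._values = deque(maxlen=window)

    def update(self, avg_qp):
        """Add one average QP value and return the current moving average."""
        self._values.append(avg_qp)
        return sum(self._values) / len(self._values)


qp_avg = QpMovingAverage(window=3)
history = [qp_avg.update(qp) for qp in [30, 32, 34, 36]]
```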

Finally, to confirm the effectiveness of the present invention, a software simulation experiment was conducted, assuming terrestrial digital broadcasting for mobile receivers (one-segment broadcasting).

FIG. 6 is a graph showing the frame-by-frame transition of SNR for each image.

In FIG. 6, the horizontal axis is the frame number and the vertical axis is PSNR (dB). FIG. 6 shows that the present invention suppresses quality fluctuation at GOP boundaries compared with the conventional JM method.

Table 2 compares the SNR and its standard deviation for the present invention and the JM method. According to Table 2, the average SNR of the present invention is improved over the JM method, and its standard deviation is smaller.

The improvement in average SNR is only about 0.3 dB, but locally the SNR fluctuation was reduced to less than half and the maximum degradation was improved by about 2 dB. The buffer constraints are satisfied at the same time.

Table 3 compares the average QP and its standard deviation for the present invention and the JM method. According to Table 3, the average QP value of the present invention fluctuates less than that of the JM method, and its standard deviation is smaller.

Reducing quality fluctuation in this way also improves coding efficiency: the average QP value decreases, which appears as an improvement in average SNR.

In the JM method, when a frame with a large generated information amount is present, such as an IDR refresh or a scene change, the information allocated to the other frames in that GOP is reduced and quality deteriorates. In the present invention, by contrast, the frame target generated information amount is determined by predicting each frame's generated information amount in advance, so information is allocated according to what the scene requires, which prevents local or periodic quality degradation.

FIG. 7 is a graph showing the HRD buffer transition for each frame.

In FIG. 7, the horizontal axis is the frame number and the vertical axis is the HRD buffer occupancy. The present invention takes image properties into account, whereas the prior-art JM method controls encoding by considering only the buffer constraints, not the image properties. Consequently, although the buffer occupancy of the present invention varies more than that of the JM method because image properties are considered, neither underflow nor overflow occurs. Rate control based on scaling that accounts for the buffer transition thus makes it possible to allocate information according to image properties while complying with the HRD buffer constraints.

Regarding computational cost, the present invention predicts the generated information amount in advance from the SAD before encoding. This prediction, however, reuses the motion prediction and intra prediction performed in the normal encoding process, so the computational cost of the present invention is almost the same as that of the conventional method.

For H.264 encoding rate control, the present invention focuses on the correlation between the inter-frame prediction error (SAD) and the generated information amount, and provides a rate control method based on advance prediction of the generated information amount. The generated information amount predicted before encoding is regarded as the amount of information the picture requires, and the frame target generated information amount is determined from it. To account for the HRD buffer constraints, a scaling coefficient based on the buffer transition tendency is introduced, and information is allocated while preserving the relative relationship among the generated information amounts the frames require. This achieves both stability, by accounting for the buffer constraints, and improved image quality, by reflecting image properties.

From the various embodiments of the present invention described above, those skilled in the art can readily make various changes, modifications, and omissions within the scope of the technical idea and viewpoint of the present invention. The foregoing description is merely an example and is not intended to be limiting. The present invention is limited only as defined by the claims and their equivalents.

FIG. 1 is a functional block diagram of an image compression encoding apparatus in the prior art.
FIG. 2 is a functional block diagram of the image compression encoding apparatus in the present invention.
FIG. 3 is a graph showing the relationship between SAD and the amount of generated information.
FIG. 4 is a graph approximating the results of the graph of FIG. 3.
FIG. 5 is a transition graph of the HRD buffer in the present invention.
FIG. 6 is a graph showing the frame-by-frame transition of SNR for each image.
FIG. 7 is a graph showing the HRD buffer transition for each frame.

Explanation of symbols

1 Image compression encoding apparatus
101 Current frame memory
102 Motion prediction unit
103 Inter-mode inter-frame prediction cost calculation unit
104 Intra-mode inter-frame prediction cost calculation unit
105 Mode determination unit
106 Mode information memory
107 Local decode memory
111 Motion compensation unit
112 DCT (discrete cosine transform) unit
113 Quantization unit
114 Inverse quantization unit
115 Inverse DCT unit
116 Variable-length encoding unit
117 Buffer unit
121 Frame memory
122 Sum of absolute differences (SAD) calculation means
123 Buffer constraint unit
1231 Frame basic generated information amount estimation unit
1232 Information amount correction unit
1233 Memory
1234 Multiplication unit
1235 Clip unit
124 Quantization parameter control unit
1241 Macroblock target generated information amount determination unit
1242 Target generated information amount summation unit
1243 Actual generated information amount summation unit
1244 Quantization parameter calculation unit
1245 Quantization parameter averaging unit

Claims

An image compression encoding apparatus comprising:
quantization means for performing quantization based on a quantization parameter;
buffer constraint means for outputting a frame target generated information amount by correcting the basic generated information amount required by each frame, which is obtained from the prediction error of the frame to be encoded using a relationship between prediction error and generated information amount determined by prior measurement, such that the buffer occupancy transition of a virtual reception buffer, from the time immediately after encoding the head frame of a first GOP to the time immediately before encoding the head frame of a second GOP, lies at the center between an overflow limit and an underflow limit, the underflow limit being the straight line connecting the buffer position after encoding the IDR frame at the head of the GOP with the buffer position reserved for the head IDR frame of the next GOP, and the overflow limit being the straight line connecting the buffer position before encoding the second frame from the head of the GOP with the buffer maximum value before encoding the head IDR frame of the next GOP; and
quantization parameter control means for deriving a quantization parameter from a macroblock target generated information amount calculated based on the frame target generated information amount, and outputting the quantization parameter to the quantization means.

The image compression encoding apparatus according to claim 1, further comprising difference absolute value sum calculating means for calculating, for each macroblock, a sum of absolute differences SAD that is the prediction error after the inter/intra mode determination,
wherein the buffer restriction means includes:
frame basic generation information amount estimation means for estimating a frame basic generation information amount based on the sum of absolute differences SAD;
information amount correction means for outputting a scaling coefficient, which is the ratio of the generated information amount of the frame to the average generated information amount per frame obtained from the bit rate, so that the buffer occupancy transition of the virtual reception buffer, from the time immediately after encoding the first frame of the first GOP to the time immediately before encoding the first frame of the second GOP, lies at the center between the overflow limit and the underflow limit; and
multiplication means for multiplying the estimated frame basic generation information amount by the scaling coefficient and outputting the frame target generation information amount.
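
A minimal sketch of the scaling and multiplication steps follows. The claim fixes only that the scaling coefficient is a ratio relative to the average bits per frame and that it multiplies the estimated basic amount. The steering rule used here (spread the deviation from the center over `horizon_frames` frames) and all names are assumptions, not the patented correction.

```python
def average_bits_per_frame(bit_rate, frame_rate):
    """Average generated information amount per frame obtained from the bit rate."""
    return bit_rate / frame_rate


def scaling_coefficient(occupancy, center, avg_bits, horizon_frames):
    """Hypothetical rule: remove the deviation of the virtual reception buffer
    from the center over horizon_frames frames. Occupancy above the center
    yields a coefficient above 1, occupancy below it a coefficient below 1."""
    desired_bits = avg_bits + (occupancy - center) / horizon_frames
    return max(desired_bits, 0.0) / avg_bits


def frame_target_bits(basic_bits, scale):
    """Multiplication step: basic amount times the scaling coefficient."""
    return basic_bits * scale


if __name__ == "__main__":
    avg = average_bits_per_frame(300_000, 15)
    s = scaling_coefficient(60_000, 50_000, avg, 10)
    print(avg, s, frame_target_bits(18_000, s))
```

The direction of the adjustment matches the behavior stated for the information amount correction means: low occupancy gives a small coefficient, high occupancy a large one.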

The image compression encoding apparatus according to claim 2, wherein the information amount correction means operates in units of GOPs, outputs a small scaling coefficient when the occupancy amount of the encoded data in the buffer is transitioning low, and outputs a large scaling coefficient when the occupancy amount of the encoded data is transitioning high.

3. The frame basic generation information amount estimating means stores in advance a function f, or estimated values, for estimating the generated information amount from the sum of absolute differences SAD and a fixed quantization parameter QP. The image compression encoding apparatus described in 1.
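
One way to picture a stored estimate of this kind is a table of (SAD, bits) pairs measured in advance at a fixed QP, read back with linear interpolation. The sketch below is an assumption for illustration: the table values are invented, and the claim does not state that the function f is piecewise linear.

```python
from bisect import bisect_left


def estimate_basic_bits(sad, table):
    """Estimate generated bits from a frame SAD.

    table: list of (sad, bits) pairs sorted by sad, measured beforehand at a
    fixed quantization parameter (hypothetical data). Values outside the
    measured range are held at the nearest end point.
    """
    sads = [s for s, _ in table]
    if sad <= sads[0]:
        return float(table[0][1])
    if sad >= sads[-1]:
        return float(table[-1][1])
    i = bisect_left(sads, sad)          # first entry with table SAD >= sad
    (s0, b0), (s1, b1) = table[i - 1], table[i]
    return b0 + (b1 - b0) * (sad - s0) / (s1 - s0)


if __name__ == "__main__":
    measured = [(0, 0), (1000, 4000), (3000, 8000)]  # invented sample points
    print(estimate_basic_bits(2000, measured))
```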

The image compression encoding apparatus according to any one of claims 2 to 4, wherein the buffer restriction means further includes clipping means for clipping the frame target generation information amount output from the multiplication means so that the buffer does not reach the overflow limit or the underflow limit.
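
The clipping step can be sketched as below, assuming a virtual reception buffer whose occupancy drops by the frame's bits when the frame is removed for decoding. The parameter names and this buffer model are illustrative assumptions.

```python
def clip_frame_target(target_bits, occupancy_before, under_limit, over_limit):
    """Clip the frame target so that occupancy_before - bits stays between
    under_limit and over_limit (all values in bits).

    Too many bits would push the occupancy below the underflow limit;
    too few would leave it above the overflow limit.
    """
    lowest = max(0.0, occupancy_before - over_limit)      # needed to avoid overflow
    highest = max(lowest, occupancy_before - under_limit)  # allowed before underflow
    return min(max(target_bits, lowest), highest)


if __name__ == "__main__":
    print(clip_frame_target(70_000, 80_000, 20_000, 90_000))
```

In this example the target of 70,000 bits is reduced to 60,000 bits, the largest amount that keeps the occupancy at or above the underflow limit.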

The image compression encoding apparatus according to claim 1, wherein the quantization parameter control means includes:
macroblock target generation information amount determination means for outputting a macroblock target generation information amount based on the frame target generation information amount; and
quantization parameter calculation means for calculating the quantization parameter of the macroblock from the sum of the actually generated information amounts up to the immediately preceding macroblock, the sum of the target generation information amounts up to the immediately preceding macroblock, and the average value of the quantization parameters up to the immediately preceding macroblock, and outputting the quantization parameter to the quantization means.
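
A sketch of a quantization parameter update driven by these three inputs is given below. The claim names the inputs but not the formula. The logarithmic rule here is an assumption based on the H.264 property that the quantization step size doubles for every increase of 6 in QP, and the function name is hypothetical. The clamp range 0 to 51 is the H.264 QP range for 8-bit video.

```python
import math


def next_macroblock_qp(actual_sum, target_sum, avg_qp, qp_min=0, qp_max=51):
    """Hypothetical QP update for the next macroblock.

    actual_sum: bits actually generated up to the preceding macroblock
    target_sum: target bits up to the preceding macroblock
    avg_qp:     average QP up to the preceding macroblock
    Overspending raises QP, underspending lowers it.
    """
    if actual_sum <= 0 or target_sum <= 0:
        qp = avg_qp                      # no usable history yet
    else:
        qp = avg_qp + 6.0 * math.log2(actual_sum / target_sum)
    return int(min(max(round(qp), qp_min), qp_max))


if __name__ == "__main__":
    print(next_macroblock_qp(2000, 1000, 30))
```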