JP4936557B2

JP4936557B2 - Encoder

Info

Publication number: JP4936557B2
Application number: JP2008013327A
Authority: JP
Inventors: 文貴中山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-01-24
Filing date: 2008-01-24
Publication date: 2012-05-23
Anticipated expiration: 2028-01-24
Also published as: JP2009177443A

Description

The present invention relates to an encoding device that compresses and encodes moving images.

With the development of multimedia in recent years, various moving image compression encoding methods have been proposed. Representative examples include MPEG-1, MPEG-2, MPEG-4 and H.26L. These methods achieve a high compression rate by combining intra-frame encoding with inter-frame predictive encoding. By adding motion compensation and motion prediction, inter-frame predictive encoding achieves a higher compression rate than simple inter-frame encoding, which merely encodes the difference between frames. To avoid error propagation, frames at predetermined intervals are compressed using an intra-frame picture type, and the frames between them are compressed using inter-frame predictive picture types. An intra-coded frame is called an I picture. Among inter-frame predictive coded frames, a frame that uses unidirectional prediction is called a P picture, and a frame that uses bidirectional prediction is called a B picture. Encoding in field units also exists, but the description here assumes encoding in frame units.

When a moving image is compressed and encoded, the generated code amount varies greatly depending on the spatial frequency characteristics of each picture, the quantization scale value, and the degree of change between scenes. A code amount control technique that maintains at least a certain level of image quality while limiting the generated code amount is therefore normally employed. For example, TM5 (Test Model 5) is known as one such code amount control algorithm (Patent Document 1).

The TM5 code amount control algorithm controls the code amount in the following three steps so that the bit rate is constant for each GOP (Group of Pictures). A GOP consists of, for example, 16 pictures.

(Step 1)
The target code amount of the picture about to be encoded is determined. Rgop, the code amount available in the current GOP, is calculated by equation (1):
Rgop = (ni + np + nb) * (bits_rate / picture_rate)   (1)
Here, ni, np, and nb are the numbers of I pictures, P pictures, and B pictures remaining in the current GOP, respectively. bits_rate is the target bit rate, and picture_rate is the picture rate.

Furthermore, the picture complexities Xi, Xp, and Xb are calculated from the encoding results for the I picture, P picture, and B picture, respectively, according to equation (2):
Xi = Ri * Qi
Xp = Rp * Qp   (2)
Xb = Rb * Qb
Ri, Rp, and Rb are the code amounts obtained by encoding the I picture, P picture, and B picture, respectively. Qi, Qp, and Qb are the average values of the Q scale (quantization scale) over all macroblocks in the I picture, P picture, and B picture, respectively.

From equations (1) and (2), the target code amounts Ti, Tp, and Tb for the I picture, P picture, and B picture are obtained by equation (3):
Ti = max{ Rgop / (1 + (Np * Xp) / (Xi * Kp) + (Nb * Xb) / (Xi * Kb)), bit_rate / (8 * picture_rate) }
Tp = max{ Rgop / (Np + (Nb * Kp * Xb) / (Kb * Xp)), bit_rate / (8 * picture_rate) }   (3)
Tb = max{ Rgop / (Nb + (Np * Kb * Xp) / (Kp * Xb)), bit_rate / (8 * picture_rate) }
Here, Np and Nb are the numbers of P pictures and B pictures remaining in the current GOP, respectively. The constants are Kp = 1.0 and Kb = 1.4.
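Equations (1) to (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the encoder's actual implementation: the function names are hypothetical, and the caller is assumed to pass nonzero complexities and at least one remaining P or B picture so that no denominator is zero.

```python
def gop_budget(ni, np_, nb, bits_rate, picture_rate):
    """Equation (1): code amount available to the pictures remaining in the GOP."""
    return (ni + np_ + nb) * (bits_rate / picture_rate)


def complexity(r_bits, q_avg):
    """Equation (2): complexity = generated code amount * average Q scale."""
    return r_bits * q_avg


def target_bits(r_gop, n_p, n_b, x_i, x_p, x_b, bit_rate, picture_rate,
                k_p=1.0, k_b=1.4):
    """Equation (3): target code amounts (Ti, Tp, Tb).

    Each target is floored at bit_rate / (8 * picture_rate).
    Assumes x_i, x_p, x_b are nonzero and n_p + n_b > 0 (illustrative
    precondition, not stated in the text).
    """
    floor = bit_rate / (8 * picture_rate)
    t_i = max(r_gop / (1 + (n_p * x_p) / (x_i * k_p) + (n_b * x_b) / (x_i * k_b)), floor)
    t_p = max(r_gop / (n_p + (n_b * k_p * x_b) / (k_b * x_p)), floor)
    t_b = max(r_gop / (n_b + (n_p * k_b * x_p) / (k_p * x_b)), floor)
    return t_i, t_p, t_b
```

The complexities passed to `target_bits` would come from `complexity` applied to the most recently encoded picture of each type.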

(Step 2)
A virtual buffer is used for each of the I picture, P picture, and B picture to manage the difference between the target code amount obtained by equation (3) and the generated code amount. Based on the amount of data accumulated in each virtual buffer, a reference value of the Q scale is set for the macroblock to be encoded next so that the actual generated code amount approaches the target code amount. For example, when the current picture type is a P picture, the difference between the target code amount and the generated code amount is obtained by equation (4):
d_{p,j} = d_{p,0} + B_{p,j-1} - (Tp * (j - 1)) / MB_cnt   (4)
Here, the subscript j is the number of the macroblock within the picture. d_{p,0} is the initial fullness of the virtual buffer. B_{p,j} is the total code amount up to the j-th macroblock. MB_cnt is the number of macroblocks in the picture.

Next, d_{p,j} (hereinafter written as "dj") is used to obtain the reference value of the Q scale for the j-th macroblock, as shown in equation (5):
Qj = (dj * 31) / r   (5)
where
r = 2 * bits_rate / picture_rate   (6)
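As a rough illustration of equations (4) to (6), the following Python sketch computes the virtual buffer fullness and the Q scale reference value for one macroblock. The function names and the sample numbers in the usage note are assumptions made for the example only.

```python
def buffer_fullness(d0, bits_so_far, target, j, mb_cnt):
    """Equation (4): d_j = d_0 + B_{j-1} - T * (j - 1) / MB_cnt.

    bits_so_far is B_{j-1}, the code amount generated up to macroblock j-1.
    """
    return d0 + bits_so_far - (target * (j - 1)) / mb_cnt


def reaction_parameter(bits_rate, picture_rate):
    """Equation (6): r = 2 * bits_rate / picture_rate."""
    return 2 * bits_rate / picture_rate


def q_reference(d_j, r):
    """Equation (5): Q scale reference value for the j-th macroblock."""
    return (d_j * 31) / r
```

For instance, with a hypothetical 6 Mbit/s stream at 30 pictures per second, `reaction_parameter(6_000_000, 30)` gives the value of r that `q_reference` then uses; a fuller buffer (the picture is running over budget) yields a larger Q scale reference.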

(Step 3)
The quantization scale is finally determined based on the spatial activity of the macroblock to be encoded, so that the decoded image looks visually good. Specifically,
ACTj = 1 + min(vblk1, vblk2, ..., vblk8)   (7)
vblk1 to vblk4 are the spatial activities of the 8 × 8 sub-blocks of the macroblock in frame structure. vblk5 to vblk8 are the spatial activities of the 8 × 8 sub-blocks of the macroblock in field structure. The spatial activity itself is obtained by equations (8) and (9):
vblk = Σ (Pi - Pbar)^2   (8)
Pbar = (1/64) * Σ Pi   (9)
Here, Pi is the i-th pixel value of the 8 × 8 sub-block. Σ in equations (8) and (9) denotes summation over i = 1 to 64.

ACTj obtained by equation (7) is normalized by equation (10):
N_ACTj = (2 * ACTj + AVG_ACT) / (ACTj + 2 * AVG_ACT)   (10)
Here, AVG_ACT is the reference value of ACTj in the previously encoded picture. Finally, the quantization scale (Q scale value) MQUANTj is obtained by equation (11):
MQUANTj = Qj * N_ACTj   (11)
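Equations (7) to (11) can be illustrated with the short Python sketch below. It follows the formulas as written in this description; the function names are hypothetical, and each sub-block is assumed to be given as a flat list of 64 pixel values.

```python
def block_variance_sum(pixels):
    """Equations (8) and (9): sum of squared deviations from the mean of 64 pixels."""
    assert len(pixels) == 64
    p_bar = sum(pixels) / 64                      # equation (9)
    return sum((p - p_bar) ** 2 for p in pixels)  # equation (8)


def activity(vblks):
    """Equation (7): 1 + the minimum over the eight sub-block activities."""
    return 1 + min(vblks)


def normalized_activity(act, avg_act):
    """Equation (10): normalized activity, which lies between 0.5 and 2."""
    return (2 * act + avg_act) / (act + 2 * avg_act)


def mquant(q_ref, n_act):
    """Equation (11): final quantization scale for the macroblock."""
    return q_ref * n_act
```

A flat macroblock has a low activity, so its normalized activity falls below 1 and `mquant` returns a quantization scale smaller than the reference value, which gives that macroblock more code amount.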

With the TM5 algorithm described above, the processing in step 1 allocates a large code amount to the I picture. In addition, within a picture, more code amount is distributed to flat portions (portions with low spatial activity), where degradation is visually conspicuous. Such code amount control and quantization control suppress image quality degradation within a predetermined bit rate.

This code amount control algorithm achieves high image quality by balancing the code amounts given to intra-frame coded pictures and inter-frame coded pictures. However, for images that are difficult to encode, the luminance peak after encoding is known to vary from one picture type to another, which causes a visual degradation called "flicker". When the result is played back as a moving image, this flicker is perceived as a flickering disturbance.

Patent Document 1 describes a method for reducing this kind of flicker. The input image is analyzed block by block, a weighting coefficient is determined from the analysis result, and the image quality of each block is controlled by multiplying the wavelet transform coefficients by the weighting coefficient.

Patent Document 2 describes suppressing flicker by removing the random noise that results from gain-up, on the view that flicker tends to occur frequently during gain-up; to this end, the filter characteristics are strengthened in conjunction with the gain-up. Patent Document 3 describes a technique for controlling the code amount and the strength of smoothing processing according to the gain of a camera.
Patent Document 1: JP 2001-326936 A
Patent Document 2: International Publication No. WO 97/05745
Patent Document 3: JP 2007-134880 A

FIG. 2 shows the mechanism by which flicker arises in an encoding method that combines intra-frame encoding and inter-frame encoding. FIG. 2(A) shows the change over time in the level of the signal reproduced from I pictures. The horizontal axis indicates time (or frame), and the vertical axis indicates the reproduced video signal level. The peak luminance of a noise component superimposed on a flat video signal can be reconstructed to some extent by intra-frame encoding. This is because the code amount control algorithm described above allocates more code amount to the I picture than to the other picture types.

FIG. 2(B) shows the change over time in the level of the signal reproduced from inter-frame coded pictures such as P pictures and B pictures. The horizontal axis indicates time (or frame), and the vertical axis indicates the reproduced video signal level. Because a highly complex image has low correlation between frames, the amount of inter-frame difference information for P pictures and B pictures increases under normal encoding. As a result, encoding degrades the video signal, and the luminance peak can no longer be reconstructed. In FIG. 2(B), the peak luminance differs from that of the I picture shown in FIG. 2(A). Consequently, luminance flicker occurs during moving image playback, as shown in FIG. 2(C), in which the horizontal axis again indicates time (or frame) and the vertical axis indicates the reproduced video signal level.

However, not every large peak luminance difference is perceived as flicker. Human vision detects degradation in images with little motion more readily than degradation in images with large motion. This is why a noisy, unsettled flat area in a still image is bothersome. Accordingly, in an image with large motion, the peak luminance difference goes undetected and is rarely perceived as flicker. When a peak luminance difference occurs in an image with little motion, it stands out as flicker.

Applying the technique described in Patent Document 1 to such flicker makes it possible to reduce flicker in fine units such as macroblocks. However, it is difficult to suppress the peak luminance difference described above when the image is viewed as a whole.

The technique of changing the filter characteristics as described in Patent Document 2 is one of the general methods for reducing encoding distortion at low S/N, and a partial effect can be expected from it. However, removing the luminance flicker described above with a filter alone requires a sufficiently high filter strength. Raising the filter strength causes a large loss of resolution and produces afterimages, which instead degrades image quality.

The present invention solves these conventional problems, and its object is to provide an encoding device that outputs a high-quality encoded signal in which flicker is suppressed.

An encoding device according to the present invention comprises: encoding means for encoding input moving image data including a plurality of frames by using intra-frame encoding and inter-frame encoding; code amount control means which repeats, in the order of the frames in an encoding unit consisting of a predetermined number of frames, a setting process that sets a target code amount for each of a plurality of picture types, including an intra-frame coded picture subjected to intra-frame encoding and an inter-frame coded picture subjected to inter-frame encoding, for those frames in the encoding unit that have not yet been encoded, so that the code amount of the encoding unit becomes a target code amount, and which controls, for each frame, the code amount of the moving image data encoded by the encoding means in accordance with the target code amount set for each picture type; local decoding means for decoding the moving image data encoded by the encoding means and outputting locally decoded data; feature detection means for detecting the complexity of each of a plurality of frames included in the input image data; motion detection means for detecting the amount of motion between frames in the input moving image data for each of the plurality of frames; encoding distortion calculation means for calculating the encoding distortion amount of each of the plurality of frames by using the input moving image data and the locally decoded data; and flicker detection means for detecting, for each frame of the input moving image data, that flicker will occur in the moving image data encoded by the encoding means, in accordance with the output of the feature detection means, the output of the motion detection means, and the output of the encoding distortion calculation means, wherein, when the flicker detection means detects that flicker will occur, it changes the target code amounts of the intra-frame coded picture and the inter-frame coded picture from the target code amounts set by the setting process.

In the present invention, an image in which flicker is likely to occur during playback is detected prior to encoding, and for such an image the ratio of the code amounts given to the intra-frame coded picture and the inter-frame coded picture is changed. This makes it possible to generate encoded image data in which flicker is suppressed.

Hereinafter, an embodiment of the present invention is described in detail with reference to the drawings.

FIG. 1 is a schematic block diagram of an embodiment of the present invention. FIG. 3 is a schematic diagram illustrating the target code amounts in this embodiment.

Image data constituting a moving image is input to the input terminal 10 from outside in screen order, that is, in frame order. The image data is divided into blocks of a size that depends on the encoding method. In MPEG, the block size is 16 pixels × 16 lines or 8 pixels × 8 lines. This embodiment assumes the MPEG method, in which the smallest basic block consists of 8 pixels × 8 lines and a macroblock consists of 16 pixels × 16 lines.

The frame rearrangement device 12 rearranges the image data input from the input terminal 10, frame by frame, into the order of picture types determined by the encoding type. The frame rearrangement device 12 outputs the image data of each rearranged frame, in block order, to the adder/subtractor 14, the motion prediction and motion compensation device 34, and the PSNR (Peak Signal to Noise Ratio) calculation device 46.

For intra-picture encoding, that is, intra-frame encoding (intra coding), the adder/subtractor 14 outputs the image data from the frame rearrangement device 12 to the orthogonal transformer 16 unchanged. For inter-picture predictive encoding, that is, inter-frame predictive encoding (inter coding), the adder/subtractor 14 subtracts a motion-compensated prediction value, described later, from the image data from the frame rearrangement device 12 and outputs the difference value to the orthogonal transformer 16.

The orthogonal transformer 16 orthogonally transforms the image data from the adder/subtractor 14 in macroblock units using an orthogonal transform such as the discrete cosine transform, and outputs the transform coefficient data to the quantizer 18. The quantizer 18 quantizes the transform coefficient data from the orthogonal transformer 16 with a designated quantization scale. The variable length encoding device 20 encodes the quantized transform coefficient data from the quantizer 18 using a variable length encoding method such as Huffman encoding. The code data output from the variable length encoding device 20 is temporarily stored in the buffer 22 and then output to the outside from the output terminal 24. As described later, the amount of code data stored in the buffer 22 is referred to for code amount control.

The output data of the quantizer 18 is decoded locally for inter-frame predictive encoding using motion prediction and motion compensation. That is, the inverse quantizer 26 inversely quantizes the output data of the quantizer 18, and the inverse orthogonal transformer 28 applies the inverse orthogonal transform to the output data of the inverse quantizer 26. The output data of the inverse orthogonal transformer 28 is decoded image data (for intra coding) or difference image data (for inter coding). For intra coding, the adder 30 outputs the output data of the inverse orthogonal transformer 28 unchanged. For inter coding, the adder 30 adds the prediction value to the output data of the inverse orthogonal transformer 28 and outputs the result. This addition converts the difference values back into image values.

The video buffer 32 temporarily stores several frames of the locally decoded image data output from the adder 30 for use in motion prediction and motion compensation. The video buffer 32 has a storage capacity of several frames. The motion prediction and motion compensation device 34 compares the image data of the current frame from the frame rearrangement device 12 with a reference frame for inter-frame prediction held in the video buffer 32, and calculates a motion vector. The motion prediction and motion compensation device 34 then calculates a motion-compensated prediction value from the calculated motion vector. The calculated prediction value is supplied to the adder/subtractor 14 and is applied to the adder 30 via the switch 36, which is closed during inter coding.

Although not shown in the drawing, the motion vector values calculated by the motion prediction and motion compensation device 34 are needed for decoding. They are therefore supplied to the buffer 22, multiplexed with the code data, and output to the output terminal 24.

The portion described so far has a configuration that is well known in moving image compression encoding devices that combine intra-frame encoding with motion-compensated inter-frame encoding.

The characteristic portion of this embodiment, namely the portion related to code amount control and quantization control, is described next.

The code amount control device 38 refers to the amount of code data stored in the buffer 22 and sets a target code amount for each picture. Specifically, the bit amount allocated to each picture in the GOP is distributed based on the bit amount available to the pictures in the GOP that have not yet been encoded, including the picture being allocated. This distribution is repeated in the encoding order of the pictures in the GOP, and a picture target code amount is set for each picture.

To make the actual generated code amount of each picture match its target code amount, the quantization control device 40 determines the reference value of the quantization scale based on the capacity of the virtual buffer. For this purpose, the generated code amount in macroblock units output from the variable length encoding device 20 is fed back from the buffer 22 to the quantization control device 40.

The quantization parameter used by the quantizer 18 is determined from the reference value of the quantization scale using equation (11), based on the activity calculated by block feature detection (not shown). If this activity is small, the quantization parameter is made small so that a larger code amount is allocated. The operation up to this point corresponds to steps 1 to 3 described in the background art.

In this embodiment, a frame feature detection device 42, a frame motion detection device 44, a PSNR calculation device 46, and a flicker detection device 48 are provided to reduce flicker. Their operation is described below.

The frame feature detection device 42 calculates, as the frame activity, the complexity of the image about to be encoded from the image data from the input terminal 10. In this embodiment, the amount of AC components of the image data, preferably the amount of high-frequency components, is used as the complexity. Specifically, the image data of one screen is divided into blocks of a predetermined size, and the variance is calculated for each block. The sum of the variances over all blocks of the image is taken as the high-frequency component amount. Instead of the variance, a frequency transform such as the DCT (discrete cosine transform) or the Hadamard transform may be applied and its frequency components used.
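The variance-based frame activity described above might be computed as in the following Python sketch. The 8 × 8 block size, the list-of-rows frame representation, and the function name are assumptions for illustration; the description only says "blocks of a predetermined size".

```python
def frame_activity(frame, block=8):
    """Sum of per-block pixel variances over one frame.

    frame is a list of rows of luminance values. Partial blocks at the
    right and bottom edges are ignored in this sketch.
    """
    height, width = len(frame), len(frame[0])
    total = 0.0
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            pixels = [frame[y][x]
                      for y in range(by, by + block)
                      for x in range(bx, bx + block)]
            mean = sum(pixels) / len(pixels)
            # Variance of this block, accumulated into the frame total.
            total += sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return total
```

A completely flat frame yields an activity of zero, while a frame with fine detail in every block yields a large value.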

The frame motion detection device 44 correlates the image data from the input terminal 10 between adjacent frames and calculates how far the image about to be encoded has moved as a whole. Specifically, the image of one screen is divided into blocks of a predetermined size, and for each block the coordinate shift that gives the highest correlation is calculated while shifting the coordinates of one image relative to the adjacent screen. The sum over the screen of the motion vector amounts calculated for the blocks is taken as the inter-frame motion amount. This inter-frame motion amount represents the global motion (global vector) and can also be calculated by methods other than the one described here.
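One way to picture the inter-frame motion amount is the block-matching sketch below. It is only an illustration under several assumptions: the sum of absolute differences (SAD) stands in for the correlation measure, the search window is ±2 pixels, ties are broken toward the smaller displacement, and all names are hypothetical.

```python
def block_sad(cur, ref, by, bx, dy, dx, block):
    """Sum of absolute differences between a block of cur and a shifted block of ref."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(block) for x in range(block))


def frame_motion(cur, ref, block=8, search=2):
    """Sum over all blocks of the magnitude of the best-matching shift."""
    height, width = len(cur), len(cur[0])
    total = 0.0
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip shifts that would read outside the reference frame.
                    if not (0 <= by + dy and by + dy + block <= height and
                            0 <= bx + dx and bx + dx + block <= width):
                        continue
                    cost = block_sad(cur, ref, by, bx, dy, dx, block)
                    key = (cost, abs(dy) + abs(dx))  # prefer smaller shifts on ties
                    if best is None or key < best[0]:
                        best = (key, dy, dx)
            total += (best[1] ** 2 + best[2] ** 2) ** 0.5
    return total
```

Two identical frames give a motion amount of zero, and a frame whose content has shifted relative to the reference gives a positive value.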

In this embodiment, the PSNR calculation device 46 is provided as the encoding distortion calculation means that calculates the encoding distortion amount from the input image and the locally decoded image. The PSNR calculation device 46 first calculates the PSNR of each macroblock from the image data from the input terminal 10 and the locally decoded image data (the output image data of the adder 30). It then outputs the sum over the screen of the per-macroblock PSNR values as the final PSNR. The PSNR calculated here is that of an already encoded image, that is, an image input at least one frame before the image about to be encoded.
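The per-macroblock PSNR and its sum over the screen could look like the Python sketch below. The 8-bit peak value of 255, the 100 dB cap used when a macroblock is reproduced without error, and the function names are assumptions added for the example; the description does not specify them.

```python
import math


def block_psnr(orig, dec, by, bx, block=16, peak=255.0, cap=100.0):
    """PSNR in dB of one macroblock; capped because the MSE can be zero."""
    squared_error = sum((orig[by + y][bx + x] - dec[by + y][bx + x]) ** 2
                        for y in range(block) for x in range(block))
    mse = squared_error / (block * block)
    if mse == 0:
        return cap
    return min(cap, 10 * math.log10(peak * peak / mse))


def frame_psnr_sum(orig, dec, block=16):
    """Sum of the per-macroblock PSNR values over one frame."""
    height, width = len(orig), len(orig[0])
    return sum(block_psnr(orig, dec, by, bx, block)
               for by in range(0, height - block + 1, block)
               for bx in range(0, width - block + 1, block))
```

Because the output is a sum rather than an average, any reference value it is compared against would have to scale with the number of macroblocks in the frame.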

The flicker detection device 48 receives the frame activity from the frame feature detection device 42, the inter-frame motion amount from the frame motion detection device 44, and the PSNR from the PSNR calculation device 46. Based on these three parameter values, the flicker detection device 48 detects whether flicker is likely to occur in the image about to be encoded.

As described above, flicker occurs when the luminance peak value after encoding differs between picture types and the image has little motion. An image develops a luminance peak difference after encoding when it satisfies two conditions: 1) the image is highly complex, and 2) the encoded image is degraded. The first condition means that the high-frequency component amount calculated by the frame feature detection device 42 is high. The second condition means that the PSNR calculated by the PSNR calculation device 46 is low. If either condition is not satisfied, a luminance peak difference cannot be said to occur. For example, an image containing many high-frequency components is highly complex and satisfies the first condition. However, when the bit rate is high, the encoded image is not degraded, so no luminance peak difference occurs. Conversely, a degraded encoded image satisfies the second condition. However, when the bit rate is low, the encoded image is degraded even if the image complexity is low, and in this case no luminance peak difference occurs either. The condition for an image with little motion is that the motion of the image as a whole is small, which means that the inter-frame motion amount calculated by the frame motion detection device 44 is small.

フリッカ検出装置４８は、高周波成分量がその基準値より高く、ＰＳＮＲがその基準値より低く、フレーム動き量がその基準値より少ない場合、今から符号化しようとする画像でフリッカが発生する可能性が高いと判断する。フリッカ発生の可能性が高い場合、フリッカ検出装置４８は、符号化による輝度ピーク差を生じさせないように、符号量制御装置３８にピクチャタイプ毎の符号量配分の変更、たとえば、比率の変更を指示する。 The flicker detection device 48 determines that flicker is likely to occur in the image about to be encoded when the high-frequency component amount is higher than its reference value, the PSNR is lower than its reference value, and the frame motion amount is smaller than its reference value. When flicker is likely, the flicker detection device 48 instructs the code amount control device 38 to change the code amount distribution among picture types, for example by changing the ratio, so that encoding does not produce a luminance peak difference.
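The three-way threshold test described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the flicker detection device 48: the names `FlickerThresholds` and `flicker_likely` are hypothetical, and the numeric reference values are placeholders, since the text gives no concrete reference values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlickerThresholds:
    # Hypothetical reference values; the source gives no concrete numbers.
    min_high_freq: float = 1000.0  # reference for the high-frequency component amount
    max_psnr_db: float = 35.0      # reference for the PSNR, in dB
    max_motion: float = 2.0        # reference for the inter-frame motion amount


def flicker_likely(high_freq: float, psnr_db: float, motion: float,
                   th: FlickerThresholds = FlickerThresholds()) -> bool:
    """Return True only when all three conditions hold at the same time."""
    complex_image = high_freq > th.min_high_freq  # condition 1: high complexity
    degraded = psnr_db < th.max_psnr_db           # condition 2: encoded image degraded
    little_motion = motion < th.max_motion        # image with little motion
    return complex_image and degraded and little_motion
```

A complex, nearly static image coded at a high bit rate (high PSNR) fails the second test and is therefore not flagged, which matches the counterexample given in the text.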

図３（Ａ）は、フリッカ発生可能性を検出しない場合の符号量配分例を示す。図３（Ｂ）は、本実施例によるフリッカ発生可能性を検出した場合の符号量配分例を示す。図３（Ａ）はいわば、従来のＴＭ５方式の符号量制御アルゴリズムによる符号量配分例を示す。 FIG. 3A shows an example of code amount distribution when the possibility of occurrence of flicker is not detected. FIG. 3B shows an example of code amount distribution when the possibility of occurrence of flicker is detected according to this embodiment. FIG. 3A shows an example of code amount distribution by a conventional TM5 code amount control algorithm.

図３（Ａ）に示すように、フリッカ発生可能性が無い又は低い場合には、イントラ符号化であるＩピクチャに対する目標符号量が多く、インター符号化であるＰピクチャとＢピクチャの目標符号量が低く設定される。この符号量配分により、上述したように、インター符号化ピクチャではイントラ符号化ピクチャよりも劣化が大きくなり、輝度のピークが再構成できない。その結果、インター符号化ピクチャとイントラ符号化ピクチャの間で輝度ピーク差が生じてフリッカのように見えてしまう。 As shown in FIG. 3A, when there is no or only a low possibility of flicker, the target code amount for the intra-coded I picture is set high, and the target code amounts for the inter-coded P and B pictures are set low. With this code amount distribution, as described above, inter-coded pictures are degraded more than intra-coded pictures, and their luminance peaks cannot be reconstructed. As a result, a luminance peak difference arises between inter-coded and intra-coded pictures and looks like flicker.

本実施例では、フリッカ発生可能性を検出すると、フリッカ検出装置４８は、符号量制御装置３８に、図３（Ａ）に示すような符号量配分を、図３（Ｂ）に示すような符号量配分に変更するように指示する。具体的には、全体の発生符号量を一定に維持しつつ、Ｉピクチャの符号量を少なくし、Ｐピクチャ及びＢピクチャの発生符号量を増加させる。この符号量配分の変更は、式（３）における係数Ｋｐ，Ｋｂを変更することで実現できる。 In this embodiment, upon detecting the possibility of flicker, the flicker detection device 48 instructs the code amount control device 38 to change the code amount distribution from that shown in FIG. 3A to that shown in FIG. 3B. Specifically, while the total generated code amount is kept constant, the code amount of the I picture is reduced and the generated code amounts of the P and B pictures are increased. This change in code amount distribution can be realized by changing the coefficients Kp and Kb in equation (3).

フレームアクティビティ、フレーム間動き量及びＰＳＮＲの各値に従い、係数Ｋｐ，Ｋｂを段階的に又は連続的に変更してもよい。ピクチャタイプ毎の符号量配分を決定するパラメータＫｐは、通常、１．０であり、Ｋｂは１．４である。これに対し、フリッカが発生しそうな画像に対して、係数Ｋｐ，Ｋｂの値を小さくすれば、インター符号化ピクチャの目標符号量が高くなり、イントラ符号化ピクチャに対する目標符号量が低く変更される。 The coefficients Kp and Kb may be changed stepwise or continuously according to the values of the frame activity, the inter-frame motion amount, and the PSNR. Of the parameters that determine the code amount distribution among picture types, Kp is normally 1.0 and Kb is normally 1.4. For an image in which flicker is likely to occur, reducing the values of Kp and Kb raises the target code amount of inter-coded pictures and lowers the target code amount of intra-coded pictures.
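The effect of lowering Kp and Kb can be illustrated with the standard TM5 target bit allocation formulas. The sketch below assumes that equation (3) has the usual TM5 form, which this passage does not restate; the function name `tm5_targets`, the complexity values Xi, Xp, Xb, the GOP layout, and the reduced coefficients 0.5 and 0.7 are hypothetical illustration values. The lower bound that TM5 places on each target is omitted for brevity.

```python
def tm5_targets(R, Np, Nb, Xi, Xp, Xb, Kp=1.0, Kb=1.4):
    """TM5-style targets for a GOP with R bits, Np P pictures and Nb B pictures.

    Xi, Xp, Xb are the global complexity measures of each picture type.
    Returns (Ti, Tp, Tb), where Tp and Tb are evaluated after the I picture
    is assumed to have consumed exactly Ti bits.
    """
    Ti = R / (1.0 + Np * Xp / (Xi * Kp) + Nb * Xb / (Xi * Kb))
    R_left = R - Ti  # bits remaining for the P and B pictures
    Tp = R_left / (Np + Nb * Kp * Xb / (Kb * Xp))
    Tb = R_left / (Nb + Np * Kb * Xp / (Kp * Xb))
    return Ti, Tp, Tb


# Hypothetical GOP: 1 I picture, 4 P pictures, 10 B pictures, 1.2 Mbit budget.
gop = dict(R=1_200_000, Np=4, Nb=10, Xi=160.0, Xp=60.0, Xb=42.0)
normal = tm5_targets(**gop)                    # Kp = 1.0, Kb = 1.4
reduced = tm5_targets(**gop, Kp=0.5, Kb=0.7)   # both coefficients lowered
```

With the lowered coefficients the I-picture target falls and the P- and B-picture targets rise, while Ti + Np*Tp + Nb*Tb still equals R, so the total code amount of the GOP is unchanged.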

このような目標符号量の変更により、ピクチャタイプ間の輝度ピーク差を生じにくくなり、再生時にフリッカとして発生することを低減できる。 Such a change in the target code amount makes it difficult to produce a luminance peak difference between picture types, and can reduce occurrence of flicker during reproduction.

アクティビティ、フレーム間動き量及びＰＳＮＲによるフリッカ発生可能性の検出エリア単位を小さくし、その検出エリア単位毎に目標符号量を変化させる。 The detection area unit of the possibility of occurrence of flicker due to the activity, interframe motion amount and PSNR is reduced, and the target code amount is changed for each detection area unit.

図４を参照して、本実施例の動作を説明する。図４に示すように、１画面（例えば、１フレーム）の画像を複数のエリアに分割し、エリア毎にフレームアクティビティ、フレーム間動き量及びＰＳＮＲを算出する。その後、フレーム単位で行った処理と同様の判定を行い、フリッカが起こりそうであるエリアを判定する。 The operation of this embodiment will be described with reference to FIG. As shown in FIG. 4, an image of one screen (for example, one frame) is divided into a plurality of areas, and frame activity, interframe motion amount and PSNR are calculated for each area. Thereafter, the same determination as that performed for each frame is performed, and an area where flicker is likely to occur is determined.

図４（Ａ），（Ｂ）では、斜線を付したエリアが、フリッカが発生しそうと判断したエリアである。イントラ符号化ピクチャでは、図４（Ａ）に示すように、フリッカが起こりそうなエリアに対する目標符号量を少なくする。また、インター符号化ピクチャでは、図４（Ｂ）に示すように、フリッカが起こりそうなエリアに対する目標符号量を多くする。一方、フリッカが起こりにくいと判定されたエリアに対しては通常の目標符号量を設定する。このようなエリア毎の判定と目標符号量の調整を行うことで、フリッカが発生しそうなエリアに限定した処理を行うことが可能になる。 In FIGS. 4A and 4B, the hatched areas are those in which flicker has been judged likely to occur. In an intra-coded picture, as shown in FIG. 4A, the target code amount for areas where flicker is likely is reduced. In an inter-coded picture, as shown in FIG. 4B, the target code amount for areas where flicker is likely is increased. Areas judged unlikely to flicker are given the normal target code amount. Performing the determination and the target code amount adjustment area by area in this way confines the processing to the areas where flicker is likely to occur.
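The per-area adjustment can be sketched as follows. This is only an illustration under stated assumptions: the function name `area_targets`, the uniform base target per area, and the adjustment ratio `scale` are hypothetical, since the text does not say by how much the target of a flagged area is changed.

```python
def area_targets(base_target, flicker_mask, is_intra, scale=0.25):
    """Return a grid of per-area target code amounts.

    flicker_mask is a 2-D list of booleans, True where flicker was judged
    likely. Flagged areas get fewer bits in an intra-coded picture and more
    bits in an inter-coded picture; all other areas keep the normal target.
    """
    factor = 1.0 - scale if is_intra else 1.0 + scale
    return [[base_target * factor if flagged else base_target
             for flagged in row]
            for row in flicker_mask]


# Hypothetical 2x2 partition of one frame with two flagged areas.
mask = [[False, True],
        [True, False]]
intra = area_targets(100.0, mask, is_intra=True)   # as in FIG. 4A
inter = area_targets(100.0, mask, is_intra=False)  # as in FIG. 4B
```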

上述した各実施例における各処理は、各処理の機能を実現する為のプログラムをメモリから読み出してコンピュータのＣＰＵ（中央演算装置）が実行することによりその機能を実現させるものであってもよい。 Each process in each embodiment described above may be realized by reading a program for realizing the function of each process from the memory and executing it by a CPU (central processing unit) of the computer.

また、ＣＰＵがアクセスする上記メモリには、ＨＤＤ、光ディスク、フラッシュメモリ等の不揮発性メモリや、ＣＤ−ＲＯＭ等の読み出しのみが可能な記録媒体、ＲＡＭ以外の揮発性のメモリなどがある。または、これらの組合せによるコンピュータ読み取り、書き込み可能な記録媒体より構成されてもよい。 The memory accessed by the CPU may be a nonvolatile memory such as an HDD, an optical disc, or a flash memory, a read-only recording medium such as a CD-ROM, or a volatile memory other than RAM. Alternatively, it may be a computer-readable and writable recording medium formed from a combination of these.

また、上述した各実施例における各処理の機能を実現する為のプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することにより各処理を行っても良い。なお、ここでいう「コンピュータシステム」とは、ＯＳや周辺機器等のハードウェアを含む。記録媒体から読み出されたプログラムが、コンピュータの機能拡張ボードや機能拡張ユニットに備わるＣＰＵなどで実行されることで、上述した実施例の機能の一部または全部が実現される場合も含む。 Alternatively, each process may be performed by recording a program for realizing the functions of the processes in the embodiments described above on a computer-readable recording medium, and having a computer system read and execute the program recorded on that medium. The "computer system" here includes an OS and hardware such as peripheral devices. This also covers the case where the program read from the recording medium is executed by a CPU or the like provided on a function expansion board or function expansion unit of the computer, thereby realizing some or all of the functions of the embodiments described above.

また、「コンピュータ読み取り可能な記録媒体」とは、ＣＤ−ＲＯＭやＤＶＤ等の光ディスクや半導体メモリカードといった可搬媒体、或いはコンピュータシステムに内蔵されるハードディスク等の記憶装置のことをいう。さらに「コンピュータ読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムが送信された場合のサーバやクライアントの揮発メモリ（ＲＡＭ）のように、一定時間プログラムを保持しているものを含む。 The "computer-readable recording medium" refers to a portable medium such as an optical disc (a CD-ROM, a DVD, or the like) or a semiconductor memory card, or to a storage device such as a hard disk built into a computer system. The "computer-readable recording medium" also includes media that hold the program for a certain period of time, such as the volatile memory (RAM) of a server or client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.

また、上記プログラムは、このプログラムを記憶装置等に格納したコンピュータシステムから、伝送媒体を介して、あるいは、伝送媒体中の伝送波により他のコンピュータシステムに伝送されてもよい。ここで、プログラムを伝送する「伝送媒体」は、インターネット等のネットワーク（通信網）や電話回線等の通信回線（通信線）のように情報を伝送する機能を有する媒体のことをいう。 The program may be transmitted from a computer system storing the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in the transmission medium. Here, the “transmission medium” for transmitting the program refers to a medium having a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication line) such as a telephone line.

また、上記プログラムは、前述した機能の一部を実現する為のものであっても良い。さらに、前述した機能をコンピュータシステムに既に記録されているプログラムとの組合せで実現できるもの、いわゆる差分ファイル（差分プログラム）であっても良い。 The program may realize only a part of the functions described above. Furthermore, it may be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

また、上記のプログラムを記録したコンピュータ読み取り可能な記録媒体等のプログラムプロダクトも本発明の実施形態として適用することができる。上記のプログラム、記録媒体、伝送媒体およびプログラムプロダクトは、本発明の範疇に含まれる。 A program product such as a computer-readable recording medium in which the above program is recorded can also be applied as an embodiment of the present invention. The above program, recording medium, transmission medium, and program product are included in the scope of the present invention.

以上、この発明の実施形態について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and includes designs and the like that do not depart from the gist of the present invention.

本発明の一実施例の概略構成ブロック図である。 A schematic block diagram of one embodiment of the present invention.

フリッカ発生のメカニズムを説明する模式図である。 A schematic diagram explaining the mechanism of flicker generation.

フリッカ発生可能性の有無に対する符号量配分の模式図であり、（Ａ）は、フリッカ発生可能性の無い場合の符号量配分を示し、（Ｂ）はフリッカ発生可能性のある場合の符号量配分を示す。 A schematic diagram of code amount distribution with and without the possibility of flicker: (A) shows the code amount distribution when there is no possibility of flicker, and (B) shows the code amount distribution when there is a possibility of flicker.

１画面を分割するエリア単位でフリッカ発生時の目標符号量を決定する場合の説明図である。 An explanatory diagram of determining the target code amount upon flicker occurrence for each area into which one screen is divided.

Explanation of symbols

１０・・・入力端子 10 ... Input terminal
１２・・・フレーム並べ替え装置 12 ... Frame rearrangement device
１４・・・加減算器 14 ... Adder/subtractor
１６・・・直交変換器 16 ... Orthogonal transformer
１８・・・量子化器 18 ... Quantizer
２０・・・可変長符号化装置 20 ... Variable length encoding device
２２・・・バッファ 22 ... Buffer
２４・・・出力端子 24 ... Output terminal
２６・・・逆量子化器 26 ... Inverse quantizer
２８・・・逆直交変換器 28 ... Inverse orthogonal transformer
３０・・・加算器 30 ... Adder
３２・・・ビデオバッファ（フレームメモリ） 32 ... Video buffer (frame memory)
３４・・・動き予測動き補償装置 34 ... Motion prediction and motion compensation device
３６・・・スイッチ 36 ... Switch
３８・・・符号量制御装置 38 ... Code amount control device
４０・・・量子化制御装置 40 ... Quantization control device
４２・・・フレーム特徴検出装置 42 ... Frame feature detection device
４４・・・フレーム動き検出装置 44 ... Frame motion detection device
４６・・・ＰＳＮＲ算出装置 46 ... PSNR calculation device
４８・・・フリッカ検出装置 48 ... Flicker detection device

Claims

1. An encoding device comprising:
encoding means for encoding input moving image data including a plurality of frames, using intra-frame coding and inter-frame coding;
code amount control means for repeating, in frame order within a coding unit including a predetermined number of frames, a setting process that sets, for the frames in the coding unit that have not yet been encoded, a target code amount for each of a plurality of picture types including an intra-frame coded picture subjected to intra-frame coding and an inter-frame coded picture subjected to inter-frame coding, so that the code amount of the coding unit becomes a target code amount, and for controlling, frame by frame, the code amount of the moving image data encoded by the encoding means in accordance with the target code amount set for each picture type;
local decoding means for decoding the moving image data encoded by the encoding means and outputting locally decoded data;
feature detection means for detecting the complexity of each of the plurality of frames included in the input moving image data;
motion detection means for detecting, for each of the plurality of frames, an inter-frame motion amount in the input moving image data;
coding distortion calculation means for calculating a coding distortion amount of each of the plurality of frames using the input moving image data and the locally decoded data; and
flicker detection means for detecting, for each frame of the input moving image data, that flicker occurs in the moving image data encoded by the encoding means, in accordance with the output of the feature detection means, the output of the motion detection means, and the output of the coding distortion calculation means,
wherein the flicker detection means, upon detecting that the flicker occurs, changes the target code amount of each of the intra-frame coded picture and the inter-frame coded picture relative to the target code amount set by the setting process.

2. The encoding device according to claim 1, wherein the flicker detection means determines that flicker occurs in the encoded image when the complexity detected by the feature detection means is greater than its reference value, the inter-frame motion amount detected by the motion detection means is smaller than its reference value, and the coding distortion amount calculated by the coding distortion calculation means is greater than its reference value.

3. The encoding device according to claim 1 or 2, wherein the flicker detection means, upon detecting occurrence of the flicker, reduces the target code amount of the intra-frame coded picture and increases the target code amount of the inter-frame coded picture.

4. The encoding device according to any one of claims 1 to 3, wherein the feature detection means uses, as the complexity, an AC component amount obtained by orthogonally transforming the input moving image data.

5. The encoding device according to any one of claims 1 to 3, wherein the feature detection means calculates AC components by orthogonally transforming blocks obtained by dividing the input moving image data into a predetermined size, and calculates the sum of the AC components of all blocks as the complexity.

6. The encoding device according to any one of claims 1 to 5, wherein the motion detection means calculates a correlation between adjacent frames of the input moving image data while shifting the coordinates of the image of one of the frames, and calculates, as the motion amount, the coordinate shift amount that maximizes the correlation.

7. The encoding device according to claim 1, wherein the motion detection means calculates, for each block obtained by dividing a screen of the input moving image data, the coordinate shift amount that maximizes the correlation between adjacent frames while shifting the coordinates of the image of one of the frames, and calculates the sum of the calculated coordinate shift amounts over all blocks as the motion amount.

8. The encoding device according to claim 1, wherein the coding distortion calculation means calculates a PSNR (Peak Signal to Noise Ratio) of the locally decoded image from the input moving image data and the locally decoded image.

9. The encoding apparatus according to claim 1, wherein the intra-frame encoded picture is an I picture, and the inter-frame encoded picture is a P picture or a B picture.