JP2007060452A

JP2007060452A - Method, device, and program for moving image predictive coding and computer readable recording medium having the program recorded thereon

Info

Publication number: JP2007060452A
Application number: JP2005245209A
Authority: JP
Inventors: Atsushi Shimizu; 淳清水; Ryuichi Tanida; 隆一谷田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-08-26
Filing date: 2005-08-26
Publication date: 2007-03-08
Anticipated expiration: 2025-08-26
Also published as: JP4246722B2

Abstract

PROBLEM TO BE SOLVED: To provide a new moving image predictive coding technique capable of improving subjective image quality while suppressing a decline in coding efficiency.
SOLUTION: Whether or not an encoding target region is a flat region is judged. When the encoding target region is judged to be a flat region, the motion vector is searched for and the predictive coding mode is decided using a coding cost calculated only from the magnitude of the prediction residual signal, in order to suppress the decline in subjective image quality and the generation of unnatural motion. When the encoding target region is judged not to be a flat region, the motion vector is searched for and the predictive coding mode is decided using a coding cost calculated from the magnitude of the prediction residual signal and the code amount generated for coding information other than the prediction residual signal, in order to prevent an increase in the generated code amount.
COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a moving picture predictive coding method and apparatus that encode a moving picture using motion-compensated prediction, to a moving picture predictive coding program used to implement the method, and to a computer-readable recording medium on which the program is recorded.

Many video coding schemes, such as H.264 (see, for example, Non-Patent Document 1), employ motion-compensated predictive coding. This scheme improves coding efficiency by compensating for the motion of the video between a reference frame and the frame to be encoded.

The coded data of a motion-compensated predictive coding scheme consists mainly of the coded data of the prediction residual signal and the coded data of the overhead part, such as motion vectors. Because coding efficiency is generally determined by both the magnitude of the coding distortion and the amount of generated code, simply selecting a predictive coding mode with a small prediction residual signal does not necessarily improve coding efficiency when the code amount of the overhead part is large. (Here, a predictive coding mode indicates how the motion-compensated prediction signal is generated, for example the motion compensation block size, the reference frame, and the prediction direction.)

Coding efficiency can therefore be improved by performing the coding process with the code amount of the overhead part taken into account.

FIG. 5 shows a flowchart of a typical coding process. As the figure shows, the coding process searches for a motion vector, selects a predictive coding mode, and then encodes the prediction residual signal and the overhead part.

The H.264 reference software (see, for example, Non-Patent Document 2) takes the code amount of the overhead part into account not only when selecting the predictive coding mode but also during motion search. The evaluation function of the reference software is expressed by the following equation.

Cost = SAD + λ · MVCost
Here, SAD denotes the prediction residual power and MVCost denotes the code amount generated for the motion vector.

In H.264, a motion vector is coded as its difference from a predicted vector derived from the motion vectors of neighboring blocks. Consequently, the closer a motion vector is to the predicted vector, the smaller MVCost becomes.

λ is a value that is varied according to the quantization step size, for the following reason. As the quantization step size increases, quantization becomes coarser, the code amount generated for the prediction residual signal decreases, and the overhead part accounts for most of the coding cost. Conversely, as the quantization step size decreases, the code generated for the prediction residual signal accounts for most of the coding cost. In other words, the cost of the overhead part relative to the prediction residual power changes with the quantization step size, and λ is varied accordingly.
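The evaluation function above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MVCost is approximated by the bit length of the signed Exp-Golomb code that H.264 uses for motion vector differences in its non-arithmetic entropy coding mode, and `lam` is passed in by the caller because the text gives no formula relating λ to the quantization step size. The function names are hypothetical and are not taken from the reference software.

```python
def signed_exp_golomb_bits(value: int) -> int:
    """Bit length of the signed Exp-Golomb code for an integer."""
    # Map signed values 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ...
    code_num = 2 * value - 1 if value > 0 else -2 * value
    # An unsigned Exp-Golomb code uses 2 * floor(log2(code_num + 1)) + 1 bits.
    return 2 * (code_num + 1).bit_length() - 1


def mv_cost_bits(mv, pred_mv):
    """Assumed MVCost: bits needed to code the difference from the predicted vector."""
    return sum(signed_exp_golomb_bits(c - p) for c, p in zip(mv, pred_mv))


def conventional_cost(sad, mv, pred_mv, lam):
    """Cost = SAD + lambda * MVCost."""
    return sad + lam * mv_cost_bits(mv, pred_mv)


# A vector equal to the predicted vector costs the minimum of 1 bit per component.
print(mv_cost_bits((0, 0), (0, 0)))               # 2
print(conventional_cost(100, (2, 0), (0, 0), 4))  # 100 + 4 * 6 = 124
```

Because the bit count grows with the distance from the predicted vector, a larger `lam` pulls the search toward the predicted vector, which is the behavior the surrounding text describes.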

Coding efficiency can be maximized by calculating the coding cost with such an evaluation function and selecting the motion vector and predictive coding mode that minimize the coding cost.
Non-Patent Document 1: ITU-T Rec. H.264, "Advanced video coding for generic audiovisual services," 2003.
Non-Patent Document 2: http://iphome.hhi.de/suehring/tml/download/

As described above, coding efficiency can be improved by encoding with both the magnitude of the prediction residual signal and the code amount of the overhead part taken into account.

However, PSNR (Peak Signal to Noise Ratio, a measure of the coding error between the input image and the decoded image), which is commonly used as an index of coding distortion, is known not to agree necessarily with subjective image quality. For example, even at the same PSNR, subjective image quality differs between flat regions and textured regions.

For this reason, the conventional coding method, which considers the magnitude of the prediction residual signal and the code amount of the overhead part, improves coding efficiency when PSNR is used as the index of coding distortion, but it cannot be said to improve coding efficiency when subjective image quality is used as the index.

In a flat region, the prediction residual power is generally small, and the difference between the prediction residual power at the motion vector that minimizes it and the prediction residual power at nearby vectors is also often small. Consequently, if the code amount of the overhead part dominates the magnitude of the prediction residual signal when the coding cost is calculated, the predictive coding mode that minimizes the prediction residual signal is not selected, and coding distortion may increase. Because distortion is conspicuous in flat regions, as noted above, an increase in coding distortion may lower subjective image quality.

Furthermore, because the code amount of a motion vector depends on the predicted vector calculated from the motion vector information of neighboring blocks, when the code amount of the overhead part dominates the coding cost, a vector close to the predicted vector, rather than the actual motion vector, may be selected as the motion vector for coding. If the prediction residual signal is then not transmitted sufficiently, the decoded image contains regions that move differently from the actual video, or the shape of the subject changes from frame to frame.

Next, this problem is explained concretely with reference to FIG. 6, using the coding of a video of clouds as an example. Although the pixel values in a cloud region are nearly flat, the shape of the cloud can be recognized from its outline and shading. Clouds also often drift (move) slowly. In the example of FIG. 6, the cloud is assumed to be moving from left to right.

When the motion vector of the cloud region is detected by minimizing the conventional coding cost, a motion vector with a smaller motion vector code amount may be selected in preference to one with a smaller prediction residual signal.

For example, of motion vector A and motion vector B in FIG. 6, motion vector B represents the actual motion. However, if the surrounding sky is assumed to be a static region, the predicted vector is close to (0, 0), and as a result motion vector A is selected. When motion compensation is performed with motion vector A and the quantization step size is large, the prediction residual signal is not transmitted, and the shape of the prediction residual signal carries over directly into the decoded image. As a result, the cloud appears to stretch gradually, as shown in FIG. 6. Likewise, when the cloud is moving slowly and the predicted vector does not match the cloud's motion, the motion of the cloud in the decoded image looks unnatural.

Thus, in flat regions, subjective image quality degrades easily and motion-compensated prediction can produce unnatural motion.

On the other hand, if the motion vector search and the predictive coding mode decision are simply performed without considering the code amount of the overhead part, the loss of subjective image quality and the occurrence of unnatural motion in flat regions are suppressed. In that case, however, the generated code amount increases.

As explained above, with the conventional method of calculating the coding cost, when the code amount of the overhead part dominates the coding cost, coding efficiency measured with PSNR as the index of coding distortion can be optimized, but subjective image quality degrades. Conversely, if the coding cost is simply calculated from the magnitude of the prediction residual signal alone, the generated code amount increases.

The present invention has been made in view of these circumstances, and its object is to provide a new moving picture predictive coding technique that can improve subjective image quality while suppressing a decrease in coding efficiency.

[1] Configuration of the present invention
To achieve this object, the moving picture predictive coding apparatus of the present invention, which encodes a moving picture using motion-compensated prediction, comprises: (1) calculating means for calculating the degree of flatness of an encoding target region; (2) determining means for determining, from the flatness calculated by the calculating means, the region type to which the encoding target region belongs; (3) control means for controlling the coding cost calculation method and the available predictive coding modes according to the region type determined by the determining means; and (4) execution means for comparing the quantization step size of the encoding target region with a threshold given in advance and enabling the control by the control means only when the quantization step size is larger than the threshold.

In this configuration, when the encoding target region belongs to a flat region, the control means performs control so that the coding cost is calculated only from the magnitude of the prediction residual signal. When the encoding target region belongs to a region other than a flat region, the control means performs control so that the coding cost is calculated from the magnitude of the prediction residual signal and the code amount generated for coding information other than the prediction residual signal.

When the encoding target region belongs to a flat region, the control means also restricts the available predictive coding modes according to the number of motion vectors needed to generate the prediction residual signal. For example, it restricts the available predictive coding modes to two: a motion-compensated predictive coding mode that generates the prediction residual signal from a single motion vector, and a skip mode that transmits no motion vector.

In addition, the determining means may compare the flatness of the encoding target region calculated by the calculating means both with the average flatness of the frame to be encoded or of an already-encoded frame and with a threshold given in advance, and determine the encoding target region to be a flat region when its flatness is smaller than the average and also smaller than the threshold.

The calculating means may also divide the encoding target region into a plurality of small blocks, calculate the flatness of each small block, and obtain the minimum and maximum of those values as the flatness of the encoding target region. In this case, the determining means may compare the maximum with the average flatness of the frame to be encoded or of an already-encoded frame, compare the minimum with a threshold given in advance, and determine the encoding target region to be a flat region when the maximum is smaller than the average and the minimum is smaller than the threshold.

The moving picture predictive coding method of the present invention, realized by the operation of the processing means described above, can also be implemented as a computer program. The program may be provided on a suitable computer-readable recording medium or over a network, and it realizes the present invention when installed and run on control means such as a CPU.

[2] Processing of the present invention
The moving picture predictive coding apparatus of the present invention configured in this way calculates the flatness of the encoding target region by calculating, for example, the variance of the pixel values, the sum of absolute values of inter-pixel differences, or the sum of squares of inter-pixel differences.

Because the encoding target region may contain areas with several different degrees of flatness, the apparatus may instead divide the encoding target region into a plurality of small blocks, calculate the flatness of each small block, and obtain the minimum and maximum of those values as the flatness of the encoding target region.

When the calculated flatness shows that the encoding target region is flat, the coding cost for that region is calculated only from the magnitude of the prediction residual signal, and the motion vector search and predictive coding mode decision use that cost, in order to suppress the loss of subjective image quality and the occurrence of unnatural motion. If, however, the coding cost were calculated only from the magnitude of the prediction residual signal for the entire frame, the generated code amount would increase.

Therefore, encoding target regions that are relatively conspicuous within the frame are extracted, and each extracted region is then judged as to whether it is also flat in an absolute sense. This determines whether the encoding target region is a flat region.

Specifically, the calculated flatness of the encoding target region is compared both with the average flatness of the frame to be encoded or of an already-encoded frame and with a threshold given in advance. The encoding target region is determined to be a flat region when its flatness is smaller than the average, meaning that it is relatively conspicuous, and also smaller than the threshold, meaning that it is flat in an absolute sense.

When the encoding target region is divided into a plurality of small blocks and the minimum and maximum of the per-block flatness values are obtained, the maximum is compared with the average flatness of the frame to be encoded or of an already-encoded frame, and the minimum is compared with a threshold given in advance. The encoding target region is determined to be a flat region when the maximum is smaller than the average, meaning that the region is relatively conspicuous, and the minimum is smaller than the threshold, meaning that it is flat in an absolute sense.

Once it has been determined in this way whether the encoding target region is a flat region, processing proceeds as follows. When the encoding target region belongs to a flat region, the coding cost is calculated only from the magnitude of the prediction residual signal, and the motion vector search and predictive coding mode decision use that cost, in order to suppress the loss of subjective image quality and the occurrence of unnatural motion. When the encoding target region belongs to a region other than a flat region, the coding cost is calculated from the magnitude of the prediction residual signal and the code amount generated for coding information other than the prediction residual signal, and the motion vector search and predictive coding mode decision use that cost, in order to prevent an increase in the generated code amount.

When the encoding target region belongs to a flat region, calculating the coding cost only from the magnitude of the prediction residual signal may increase the generated code amount through a higher overhead cost. To account for this, the available predictive coding modes are restricted according to the number of motion vectors needed to generate the prediction residual signal, and the predictive coding mode that minimizes the coding cost is selected from among them. For example, the available predictive coding modes are restricted to two: a motion-compensated predictive coding mode that generates the prediction residual signal from a single motion vector, and a skip mode that transmits no motion vector.

When the encoding target region belongs to a region other than a flat region, the coding cost is calculated from the magnitude of the prediction residual signal and the code amount generated for coding information other than the prediction residual signal. Because restricting the predictive coding modes would prevent this coding cost from being minimized, no restriction is placed on the available predictive coding modes.

In this processing, when the quantization step size is small, the code amount generated for the prediction residual signal becomes large and the code amount of the overhead part no longer dominates, so a loss of subjective image quality is less likely to be perceived. In that case, regardless of whether the encoding target region is a flat region, the coding cost is calculated from the magnitude of the prediction residual signal and the code amount generated for coding information other than the prediction residual signal.
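The control rules described in this section can be summarized as a small decision function. The sketch below is a minimal illustration, not the patented implementation: the mode labels, the `Control` type, and the function name are hypothetical, and mapping "a mode that uses a single motion vector" to a 16x16 inter mode is an assumption made only for the example.

```python
from dataclasses import dataclass

# Hypothetical mode labels; the source only distinguishes modes by motion vector count.
ALL_MODES = ("SKIP", "INTER_16x16", "INTER_16x8", "INTER_8x16", "INTER_8x8")
FLAT_MODES = ("SKIP", "INTER_16x16")  # skip mode plus one single-vector mode


@dataclass(frozen=True)
class Control:
    use_overhead_cost: bool  # True: residual + overhead code amount; False: residual only
    allowed_modes: tuple


def select_control(is_flat: bool, q_step: float, q_step_threshold: float) -> Control:
    """Choose the cost calculation and the available modes for one region."""
    # The special handling applies only to flat regions, and only when the
    # quantization step size exceeds the threshold given in advance.
    if is_flat and q_step > q_step_threshold:
        return Control(use_overhead_cost=False, allowed_modes=FLAT_MODES)
    return Control(use_overhead_cost=True, allowed_modes=ALL_MODES)


print(select_control(is_flat=True, q_step=30, q_step_threshold=20))
print(select_control(is_flat=True, q_step=10, q_step_threshold=20))
```

Keeping the quantization step check in the same function as the flatness check makes it explicit that a flat region encoded with fine quantization falls back to the conventional cost.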

As explained above, the present invention extracts flat regions, in which coding distortion and unnatural motion are conspicuous, and encodes them with a coding cost calculated only from the magnitude of the prediction residual signal. This improves subjective image quality in flat regions while suppressing an increase in the generated code amount.

The present invention is described in detail below according to an embodiment.

The present invention detects flat regions in which subjective image quality degrades easily and switches the coding control method for them, thereby improving subjective image quality while suppressing a decrease in coding efficiency.

FIG. 1 shows a flowchart of the processing executed by the present invention to achieve this. As the flowchart shows, the present invention follows this procedure:
[1] The flatness of the encoding target region is calculated, for example by calculating the variance of the pixel values (step 10).
[2] Based on the calculated flatness, the region type of the encoding target region is determined by evaluating whether it is a flat region in which subjective image quality degrades easily (steps 11, 12).
[3] Based on the determination result, the motion vector is searched for with a cost function that minimizes the prediction residual signal for a flat encoding target region (step 13), and with a cost function that minimizes the coding cost including the code amount of the overhead part for a non-flat encoding target region (step 15).
[4] Based on the search, the predictive coding mode is selected from a restricted set of modes for a flat encoding target region (step 14), and from the unrestricted set of modes for a non-flat encoding target region (step 16).
[5] Encoding is then performed (step 17).
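The branch structure of this procedure can be sketched as follows. This is a simplified illustration under several assumptions: the motion search is reduced to choosing among candidates whose SAD and motion vector bit counts are already known, the candidate numbers are invented for the example, mode selection is represented only by the set of modes that would be considered, and the actual encoding of step 17 is omitted. All names are hypothetical.

```python
def encode_region(flatness, frame_avg, threshold, mv_candidates, lam):
    """Sketch of steps 11 to 16 for one region.

    mv_candidates: list of (mv, sad, mv_bits) tuples from a motion search.
    """
    # Steps 11-12: decide whether the region is a flat region.
    is_flat = flatness < frame_avg and flatness < threshold
    if is_flat:
        cost = lambda c: c[1]                 # step 13: residual magnitude only
        modes = ["skip", "one_vector"]        # step 14: restricted mode set
    else:
        cost = lambda c: c[1] + lam * c[2]    # step 15: residual + overhead
        modes = ["skip", "one_vector", "multi_vector"]  # step 16: no restriction
    best = min(mv_candidates, key=cost)
    return {"is_flat": is_flat, "mv": best[0], "modes": modes}


# Invented candidates: the zero vector is cheap to code, (3, 0) has the smaller residual.
candidates = [((0, 0), 120, 2), ((3, 0), 100, 12)]
print(encode_region(5, 50, 10, candidates, lam=4)["mv"])   # flat region: (3, 0)
print(encode_region(80, 50, 10, candidates, lam=4)["mv"])  # non-flat region: (0, 0)
```

With these numbers the flat region keeps the vector with the smaller residual, while the non-flat region prefers the cheaper vector, which mirrors the cloud example discussed earlier.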

Next, the detailed processing executed in each step is described.

(i) Method of calculating the flatness
First, the method of calculating the flatness of the encoding target region, executed in step 10, is described.

The flatness is calculated from, for example, the variance of the pixel values, the sum of absolute values of inter-pixel differences, or the sum of squares of inter-pixel differences.

That is, the flatness of the encoding target region is calculated by computing the variance of the pixel values according to the L2 variance or L1 variance formula. Here, s(i, j) denotes the pixel value at position (i, j) in the encoding target region, and <s> denotes the average of those values.

Alternatively, the flatness of the encoding target region is calculated as the sum act of the absolute values of the inter-pixel differences.

Alternatively, the flatness of the encoding target region is calculated as the sum act of the squares of the inter-pixel differences.
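The flatness measures named above can be sketched as follows. The equations themselves are not reproduced in this text, so the details here are assumptions: the variances are normalized by the number of pixels, and the inter-pixel differences are taken between horizontally and vertically adjacent pixels. A region is given as a list of rows of pixel values, and the function names are hypothetical.

```python
def l2_variance(block):
    """Mean squared deviation of the pixel values from their average <s>."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def l1_variance(block):
    """Mean absolute deviation of the pixel values from their average <s>."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)


def neighbour_diffs(block):
    """Yield differences between horizontally and vertically adjacent pixels."""
    height, width = len(block), len(block[0])
    for i in range(height):
        for j in range(width):
            if j + 1 < width:
                yield block[i][j + 1] - block[i][j]
            if i + 1 < height:
                yield block[i + 1][j] - block[i][j]


def act_abs(block):
    """Sum of absolute values of inter-pixel differences."""
    return sum(abs(d) for d in neighbour_diffs(block))


def act_sq(block):
    """Sum of squares of inter-pixel differences."""
    return sum(d * d for d in neighbour_diffs(block))


edge = [[0, 4], [0, 4]]
print(l2_variance(edge), l1_variance(edge), act_abs(edge), act_sq(edge))  # 4.0 2.0 8 32
```

For all four measures a smaller value means a flatter region, which is the convention the determination step relies on.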

When the encoding target region is large, it may contain areas with several different degrees of flatness, so the flatness can also be calculated after dividing the encoding target region into small blocks. In this case, as many flatness values are obtained as there are small blocks, and the minimum and maximum among them are used as the flatness of the encoding target region.
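The block-wise variant can be sketched as below. The block size, the choice of variance as the per-block measure, and the function names are assumptions made for the illustration; any of the flatness measures could be passed in as `measure`.

```python
def variance(block):
    """Per-block flatness measure used in this sketch (mean squared deviation)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def block_flatness_range(region, block_size, measure=variance):
    """Split a region into small blocks and return (min, max) of their flatness."""
    height, width = len(region), len(region[0])
    values = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Slicing clips automatically at the region border.
            block = [row[left:left + block_size]
                     for row in region[top:top + block_size]]
            values.append(measure(block))
    return min(values), max(values)


# Left half is perfectly flat, right half contains a vertical edge.
region = [[1, 1, 0, 8],
          [1, 1, 0, 8]]
print(block_flatness_range(region, block_size=2))  # (0.0, 16.0)
```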

(ii) Method of detecting flat regions in which subjective image quality degrades easily
Next, the method executed in steps 11 and 12 for detecting whether the encoding target region is a flat region in which a loss of subjective image quality is conspicuous is described.

Whether the region is a flat region is determined by comparing its flatness with the following two values:
・a fixed threshold given in advance
・the average flatness of the frame to be encoded or of an already-encoded frame

When the average flatness of the frame to be encoded is used, the flatness of that frame must be obtained beforehand.

When the flatness of the encoding target region is smaller than both of these values, the encoding target region is determined to be a flat region. The first condition evaluates the absolute magnitude of the flatness, and the second condition evaluates its relative magnitude within the frame to be encoded.

These two evaluations are performed in order to exclude regions for which the prediction residual does not need to be minimized. For example, in a video whose entire frame is textured, a region whose flatness is below the frame average still has a large absolute flatness value, so coding control that prioritizes coding efficiency is preferable. Conversely, in a video whose entire picture is flat, judging by the absolute threshold alone would classify most regions as flat and could increase the generated code amount. If the allocated code amount is insufficient, this can interfere with code amount allocation control. For this reason, coding control that minimizes the prediction residual signal is applied only to encoding target regions that are relatively conspicuous.
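The two-condition test and the two cases it is meant to exclude can be illustrated with a short sketch. The numeric values are invented for the example, and the function name is hypothetical.

```python
def is_flat_region(flatness, fixed_threshold, frame_average):
    """A region is flat only if it passes both the absolute and the relative test."""
    absolutely_flat = flatness < fixed_threshold   # first condition
    relatively_flat = flatness < frame_average     # second condition
    return absolutely_flat and relatively_flat


# Ordinary frame: the region is flat both in absolute and in relative terms.
print(is_flat_region(flatness=5, fixed_threshold=10, frame_average=50))   # True
# Fully textured frame: below the frame average, but still large in absolute terms.
print(is_flat_region(flatness=40, fixed_threshold=10, frame_average=50))  # False
# Fully flat frame: below the threshold, but not flatter than the rest of the frame.
print(is_flat_region(flatness=5, fixed_threshold=10, frame_average=3))    # False
```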

When the flatness is calculated by dividing the encoding target region into small blocks, whether the region is flat can also be determined as follows.

First, the minimum and maximum of the flatness values obtained for the small blocks are found. The small-block maximum is the flatness of the part of the encoding target region where distortion is least noticeable, and the small-block minimum is the flatness of the part where distortion is most noticeable.

Next, the small-block maximum is compared with the average flatness of the encoding target frame or of an already encoded frame. At the same time, the small-block minimum is compared with the fixed threshold given in advance. An encoding target region that satisfies both conditions (small-block maximum at or below the average, and small-block minimum at or below the threshold) is a region where distortion is relatively conspicuous within the frame and that contains a flat area where degradation of subjective image quality is easily perceived.

(iii) Method of switching the encoding control method
Next, the switching of the encoding control method based on the flat-region determination result, performed in steps 13 to 16, is described.

The encoding control method is switched by switching the cost function and by restricting the prediction coding modes.

That is, a cost function that minimizes the prediction residual signal is used in flat regions, and a cost function that minimizes the coding cost is used in other regions. Because flat regions use a cost function that minimizes the prediction residual signal, the generated code amount may grow as the cost of the overhead portion increases. To limit the overhead cost, the prediction coding modes are restricted, and the mode that minimizes the prediction residual signal is selected from among the restricted modes. For example, in a flat region the selection is restricted to the prediction coding modes with the fewest motion vectors (that is, the modes are restricted according to the number of motion vectors).
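The switching described above can be illustrated with a small sketch. Several things here are assumptions rather than part of the text: the weighted form `residual + lam * overhead_bits` for the coding cost, the `Candidate` record, and the `max_mv_flat` parameter used to express the restriction by number of motion vectors.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    mode: str             # hypothetical mode label
    motion_vectors: int   # motion vectors the mode needs
    residual: float       # magnitude of the prediction residual (e.g. SAD)
    overhead_bits: float  # code amount for everything except the residual


def coding_cost(c: Candidate, flat: bool, lam: float) -> float:
    """Residual-only cost in flat regions; residual plus weighted overhead elsewhere."""
    if flat:
        return c.residual
    # Assumed form of the combined cost; the text only says both terms are used.
    return c.residual + lam * c.overhead_bits


def select_mode(candidates: List[Candidate], flat: bool, lam: float,
                max_mv_flat: int = 1) -> str:
    """Pick the cheapest mode, restricting the candidates first in flat regions."""
    if flat:
        # Restrict the modes by motion vector count to cap the overhead.
        candidates = [c for c in candidates if c.motion_vectors <= max_mv_flat]
    return min(candidates, key=lambda c: coding_cost(c, flat, lam)).mode


candidates = [
    Candidate("16x16", 1, 100.0, 10.0),
    Candidate("8x8", 4, 60.0, 40.0),
]
```

With these sample numbers, the flat path keeps only the single-vector mode, while the non-flat path trades residual against overhead according to `lam`. The sketch assumes at least one candidate survives the restriction.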

The degradation of subjective image quality that is a problem in the conventional method becomes apparent when the code amount of the overhead portion becomes dominant.

In other words, when the quantization step size is small, the code amount generated for the prediction residual signal is large and the overhead code amount is no longer dominant, so degradation of subjective image quality is hard to perceive. Therefore, before switching the encoding control, the likelihood of subjective degradation may be evaluated from the quantization step size of the encoding target region. For example, flat-region detection and switching of the encoding control are performed for an encoding target region only when its quantization step size is larger than a threshold given in advance.

FIG. 2 shows a flowchart of the present invention when this method is used. In this example, the quantization step size is first evaluated in step 100 to determine whether degradation of subjective image quality will become apparent, and this decides whether to enter the processing of the flowchart shown in FIG. 1.

Next, the present invention is described in detail according to an embodiment.

FIG. 3 shows an embodiment of a moving image encoding apparatus 1 provided with the present invention.

The moving image encoding apparatus 1 of the present invention includes an encoding processing unit 10, an L1 variance calculation unit 11, an L1 variance average calculation unit 12, an L1 variance comparison unit 13, a prediction mode control unit 14, a switching unit 15, and a quantization step evaluation unit 16.

The encoding processing unit 10 includes a "first cost function calculation unit", which calculates the coding cost from the magnitude of the prediction residual signal and the generated code amount required to encode information other than the prediction residual signal, and a "second cost function calculation unit", which calculates the coding cost from the magnitude of the prediction residual signal alone. Using the coding costs calculated by these units, it searches for motion vectors and decides the prediction coding mode, and thereby encodes the encoding target region (encoding target block).

The L1 variance calculation unit 11 calculates the flatness of each small block obtained by dividing the encoding target block by calculating the L1 variance of its luminance signal. It then uses a "maximum value calculation unit" to obtain the maximum L1 variance among them and a "minimum value calculation unit" to obtain the minimum L1 variance among them.

The L1 variance average calculation unit 12 calculates the average flatness by calculating the average of the minimum L1 variances in the frame encoded immediately before.
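One way to maintain this per-frame average is a simple accumulator, sketched below. The class name `FrameFlatnessAverage` is hypothetical, and returning 0.0 when no block has been recorded (for example, before the first frame) is an assumption; the text does not say what value is used in that case.

```python
class FrameFlatnessAverage:
    """Accumulates the minimum L1 variance of each block in one frame."""

    def __init__(self) -> None:
        self._total = 0.0
        self._count = 0

    def add(self, act_min: float) -> None:
        # Called once per encoded block with that block's minimum L1 variance.
        self._total += act_min
        self._count += 1

    def average(self) -> float:
        # Assumption: 0.0 when empty, so no block can fall below the average.
        return self._total / self._count if self._count else 0.0


frame_stats = FrameFlatnessAverage()
frame_stats.add(2.0)
frame_stats.add(4.0)
```

The value returned by `average()` at the end of a frame would then serve as the reference for the blocks of the next frame.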

The L1 variance comparison unit 13 includes a "maximum value comparison unit", which compares the maximum L1 variance calculated by the L1 variance calculation unit 11 with the average flatness calculated by the L1 variance average calculation unit 12, and a "minimum value comparison unit", which compares the minimum L1 variance calculated by the L1 variance calculation unit 11 with a prescribed threshold. Based on these comparison results, it determines whether the encoding target block is a flat region.

Based on the determination result of the L1 variance comparison unit 13, the prediction mode control unit 14 notifies the encoding processing unit 10 of the available prediction coding modes and of the type of cost function to use.

The switching unit 15 selects either the control instruction from the prediction mode control unit 14 or a control instruction that does not come from the prediction mode control unit 14 (the control instruction of the prior art) and passes it to the encoding processing unit 10.

The quantization step evaluation unit 16 evaluates the quantization step size used by the encoding processing unit 10 and, based on that evaluation, instructs the switching unit 15 which control instruction to select.

In the moving image encoding apparatus 1 configured in this way, the encoding target block is 16×16 pixels and is divided into four small blocks of 8×8 pixels to calculate the flatness. The flatness is calculated as the L1 variance of the luminance signal, and the minimum and maximum L1 variances within the encoding target block are obtained. The average flatness is the average of the minimum L1 variances in the frame encoded immediately before. The minimum and maximum L1 variances are used to determine flat regions. A determination based on the quantization step size is also made, to judge whether degradation of subjective image quality will become apparent.

When the encoding target block is determined to be a flat region, the block size for motion-compensated prediction is 16×16 pixels, and the prediction coding mode is selected from the 16×16 motion-compensated prediction coding mode and the skip mode (a motion-compensated prediction mode in which no motion vector is transmitted for the encoding target block). When the encoding target block is determined not to be a flat region, the coding cost is calculated from the magnitude of the prediction residual signal and the generated code amount required to encode information other than the prediction residual signal, and the motion vector and prediction coding mode are decided on that basis.

FIG. 4 shows the flowchart executed by the moving image encoding apparatus 1 configured in this way. The processing executed by the apparatus is described in detail below, following this flowchart.

[1] Evaluation of the quantization step size
In the moving image encoding apparatus 1 of the present invention, first, in step 20, the quantization step size QP of the encoding target block is compared with a threshold TH_QP, which is a predetermined constant. If QP ≥ TH_QP, processing proceeds to step 21; if QP < TH_QP, processing moves directly to step 27, described later.
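The branch taken in step 20 can be written as a one-line gate. The function name `next_step_after_qp_check` is hypothetical; the returned integers are the step numbers of the FIG. 4 flowchart as named in the text.

```python
def next_step_after_qp_check(qp: int, th_qp: int) -> int:
    """Return the flowchart step that follows step 20.

    Step 21 starts the flat-region analysis; step 27 is the ordinary
    motion vector search that includes the overhead code amount.
    """
    return 21 if qp >= th_qp else 27
```

Blocks quantized finely enough (QP below the threshold) therefore never reach the flatness analysis and are always encoded with the ordinary cost function.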

[2] Calculation of the L1 variance
Next, in step 21, the L1 variance act_n of the luminance signal is calculated according to the following equation for each of the 8×8 small blocks obtained by dividing the encoding target block into four.

Here, s_y(i, j) is the pixel value of the luminance signal of small block n, and <s_y> is the average value of the luminance signal of small block n.
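The equation itself did not survive in this text, so the sketch below assumes the usual reading of an L1 variance: the sum over the small block of the absolute difference between each luminance sample s_y(i, j) and the block mean <s_y>. Whether the original divides by the number of pixels is not known; no normalization is applied here. The helper names `l1_variance` and `split_into_quadrants` are hypothetical.

```python
from typing import List

Block = List[List[int]]  # rows of luminance samples


def l1_variance(block: Block) -> float:
    """Sum of absolute deviations of the samples from the block mean (assumed form)."""
    samples = [value for row in block for value in row]
    mean = sum(samples) / len(samples)
    return sum(abs(value - mean) for value in samples)


def split_into_quadrants(macroblock: Block, size: int = 8) -> List[Block]:
    """Split a 16x16 block into four 8x8 small blocks in raster order."""
    return [
        [row[x:x + size] for row in macroblock[y:y + size]]
        for y in (0, size)
        for x in (0, size)
    ]


# Usage: a uniform block with a single brighter sample in the top-left quadrant.
macroblock = [[128] * 16 for _ in range(16)]
macroblock[0][0] = 192
acts = [l1_variance(b) for b in split_into_quadrants(macroblock)]
```

In this example only the top-left small block has a nonzero value, which is the kind of spread between small blocks that the minimum and maximum in the next step are meant to capture.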

[3] Calculation of the minimum and maximum L1 variance
Next, in step 22, the following operations are applied to the four L1 variances obtained in step 21:
act_min = min(act₀, act₁, act₂, act₃)
act_max = max(act₀, act₁, act₂, act₃)
This yields the minimum value act_min and the maximum value act_max of the L1 variance.

[4] Determination of the region type
Next, in steps 23 and 24, it is evaluated whether the following two conditions hold for the minimum value act_min and the maximum value act_max of the L1 variance:
・act_max < act_avg
・act_min < TH_ACT
If both conditions are satisfied at the same time, the encoding target block is determined to be a flat region.

Here, act_avg is the average of act_min over the frame encoded immediately before, and TH_ACT is the threshold for flat-region determination.

If this determination finds the block to be a flat region, processing proceeds to step 25; otherwise, processing proceeds to step 27.
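Steps 22 to 24 can be sketched together as follows. The function name `is_flat_block` is hypothetical; the comparisons use the strict inequalities given in the two conditions above.

```python
from typing import Sequence


def is_flat_block(acts: Sequence[float], act_avg: float, th_act: float) -> bool:
    """Apply steps 22-24 to the L1 variances of the small blocks of one block."""
    act_min = min(acts)  # most distortion-sensitive small block
    act_max = max(acts)  # least distortion-sensitive small block
    # Relative test on the maximum, absolute test on the minimum.
    return act_max < act_avg and act_min < th_act
```

A return value of True corresponds to continuing with step 25, and False to continuing with step 27.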

[5] Motion vector search at 16×16 size
For an encoding target block determined to be a flat region, a motion vector is then searched for in step 25 using the 16×16 size only. The code amount of the motion vector information (the code amount of the overhead portion) is not taken into account; the search looks for the motion vector that minimizes the prediction residual power.

[6] Skip mode determination
For an encoding target block determined to be a flat region, the prediction residual power of the skip mode is then calculated in step 26 and compared with the result of the 16×16 motion vector search. If the difference between the prediction residual signals is at most a fixed value OFFSET_SAD, the skip mode is selected; otherwise, the prediction coding mode that generates the prediction residual signal from one motion vector is selected.

The skip mode is selected when this difference is at most OFFSET_SAD because the skip mode generally produces a smaller generated code amount. Accordingly, instead of taking this difference, the coding cost of each option may actually be calculated, and the choice between the skip mode and the prediction coding mode that generates the prediction residual signal from one motion vector may be made by selecting the option with the smaller coding cost.
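The decision in step 26 can be sketched as below. The function name `choose_flat_mode` and the mode labels are hypothetical, and the difference is assumed to be taken as the skip-mode residual minus the 16×16 search residual, which the text does not state explicitly.

```python
def choose_flat_mode(skip_sad: float, inter16_sad: float, offset_sad: float) -> str:
    """Choose between the two modes allowed for a flat block.

    Assumption: the difference is skip_sad - inter16_sad.
    """
    if skip_sad - inter16_sad <= offset_sad:
        # The skip mode is nearly as accurate and usually cheaper to code.
        return "SKIP"
    return "INTER_16x16"
```

A larger OFFSET_SAD favors the skip mode and therefore a lower code amount, at the price of tolerating a larger residual.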

[7] Motion vector search
For an encoding target block determined not to be a flat region, in step 27 a search is made, at each block size available for the encoding target block, for the motion vector that minimizes the coding cost including the code amount of the overhead portion such as motion vectors.

[8] Prediction coding mode selection
For an encoding target block determined not to be a flat region, in step 28 the prediction coding mode that minimizes the coding cost, including the code amount of the overhead portion, is then selected.

[9] Encoding
When the processing of step 26 or step 28 is finished, the encoding process is finally executed in step 29.

In this way, the moving image encoding apparatus 1 of the present invention configured as in FIG. 3 follows this procedure to determine whether the encoding target block is a flat region and, based on the result, switches the coding cost calculation method and restricts the prediction coding modes. This improves subjective image quality in flat regions while suppressing any increase in the generated code amount.

FIG. 1 is a flowchart executed by the present invention.
FIG. 2 is a flowchart executed by the present invention.
FIG. 3 shows an embodiment of the moving image encoding apparatus of the present invention.
FIG. 4 is a flowchart executed by the moving image encoding apparatus of the present invention.
FIG. 5 is a flowchart of a general encoding process.
FIG. 6 is an explanatory diagram illustrating the problem of the prior art.

Explanation of symbols

1 Moving image encoding apparatus
10 Encoding processing unit
11 L1 variance calculation unit
12 L1 variance average calculation unit
13 L1 variance comparison unit
14 Prediction mode control unit
15 Switching unit
16 Quantization step evaluation unit

Claims

A moving image predictive encoding method for encoding a moving image by performing motion-compensated prediction, the method comprising:
a step of calculating the flatness of an encoding target region;
a step of determining, from the calculated flatness, the region type to which the encoding target region belongs; and
a step of controlling the coding cost calculation method and the available prediction coding modes in accordance with the determined region type.

The moving image predictive encoding method according to claim 1, further comprising
a step of comparing the quantization step size of the encoding target region with a predetermined threshold and enabling the control only when the quantization step size is larger than the threshold.

The moving image predictive encoding method according to claim 1 or 2, wherein
in the controlling step, when the encoding target region belongs to a flat region, the coding cost is calculated from the magnitude of the prediction residual signal alone, and when the encoding target region belongs to a region other than a flat region, the coding cost is calculated from the magnitude of the prediction residual signal and the generated code amount required to encode information other than the prediction residual signal.

The moving image predictive encoding method according to any one of claims 1 to 3, wherein
in the controlling step, when the encoding target region belongs to a flat region, the available prediction coding modes are restricted according to the number of motion vectors needed to generate the prediction residual signal.

The moving image predictive encoding method according to claim 4, wherein
in the controlling step, when the encoding target region belongs to a flat region, the available prediction coding modes are restricted to two: a motion-compensated prediction coding mode that generates the prediction residual signal from one motion vector, and a skip mode that does not transmit a motion vector.

The moving image predictive encoding method according to any one of claims 1 to 5, wherein
in the step of determining the region type, the calculated flatness of the encoding target region is compared with the average flatness of the encoding target frame or of an already encoded frame and with a threshold given in advance, and the encoding target region is determined to be a flat region when its calculated flatness is smaller than the average and smaller than the threshold.

The moving image predictive encoding method according to any one of claims 1 to 5, wherein
in the step of calculating the flatness, the encoding target region is divided into a plurality of small blocks, the flatness is calculated for each small block, and the minimum and maximum of those flatness values are calculated as the flatness of the encoding target region.

The moving image predictive encoding method according to claim 7, wherein
in the step of determining the region type, the maximum is compared with the average flatness of the encoding target frame or of an already encoded frame, the minimum is compared with a threshold given in advance, and the encoding target region is determined to be a flat region when the maximum is smaller than the average and the minimum is smaller than the threshold.

The moving image predictive encoding method according to any one of claims 1 to 8, wherein
in the step of calculating the flatness, the flatness of the encoding target region is calculated using any one of the variance of pixel values, the sum of absolute inter-pixel differences, and the sum of squared inter-pixel differences.

A moving image predictive encoding apparatus that encodes a moving image by performing motion-compensated prediction, the apparatus comprising:
means for calculating the flatness of an encoding target region;
means for determining, from the calculated flatness, the region type to which the encoding target region belongs; and
means for controlling the coding cost calculation method and the available prediction coding modes in accordance with the determined region type.

The moving image predictive encoding apparatus according to claim 10, further comprising
means for comparing the quantization step size of the encoding target region with a predetermined threshold and enabling the control only when the quantization step size is larger than the threshold.

A moving picture predictive coding program for causing a computer to execute processing used to realize the moving picture predictive coding method according to claim 1.

A computer-readable recording medium having recorded thereon a moving picture predictive coding program for causing a computer to execute processing used to realize the moving picture predictive coding method according to any one of claims 1 to 9.