JP4478480B2

JP4478480B2 - Video encoding apparatus and method

Info

Publication number: JP4478480B2
Application number: JP2004048173A
Authority: JP
Inventors: 克己大塚; 秀昭服部
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-02-24
Filing date: 2004-02-24
Publication date: 2010-06-09
Anticipated expiration: 2024-02-24
Also published as: JP2005244346A

Description

The present invention relates to a moving image encoding apparatus and method that, in moving image encoding performed in predetermined units, encode an input image to a predetermined target code amount using a weighting parameter of the quantization process.

Dramatic advances in digital signal processing in recent years have made it practical to record moving images on storage media and to transmit them over transmission paths, both of which were previously difficult. In these applications, each picture of the moving image is compression-encoded so that the amount of data is greatly reduced. A representative compression encoding scheme is MPEG (Moving Picture Experts Group).

When a series of pictures in a moving image is compression-encoded at a constant bit rate in accordance with MPEG, the amount of generated code varies greatly with the scenes in the moving image, the spatial frequency characteristics of the pictures, and the quantization scale (Q scale) value. In a compression encoding apparatus with these characteristics, code amount control is the key technique for minimizing coding distortion.

Many algorithms for code amount control have been proposed. One of the best known is the algorithm used in TM5 (Test Model Editing Committee: "Test Model 5", ISO/IEC JTC/SC29/WG11/N0400 (Apr. 1993)), which was proposed during standardization of the MPEG-2 encoding scheme (hereinafter TM5; Non-Patent Document 1).

<Prior art 1>
TM5 consists of the following three steps (steps 1 to 3) and controls the Q (quantization) scale so that the bit rate is constant for each GOP (Group of Pictures).

[Step 1: Bit allocation]: The target code amount of the picture to be encoded next is calculated from the code amount remaining in the GOP.

[Step 2: Code amount control]: The Q scale is calculated from the target code amount obtained in step 1, according to the state of a virtual buffer.

[Step 3: Q scale adjustment]: The final Q scale is determined from the spatial activity of the macroblock.

Of these three steps, step 1 has the greatest influence on coding distortion. Its processing is described in detail below.

[Step 1: Bit allocation]
Suppose that, as shown in FIG. 14, the target code amount of picture P3 (a P picture), the tenth picture in the current GOP, is to be calculated before P3 is encoded. The processing of step 1 is expressed by the following equation.

Here, R_gop is the code amount allocated to the current GOP; N_i, N_p and N_b are the numbers of I, P and B pictures remaining in the current GOP; bits_rate is the target bit rate; and picture_rate is the picture rate. In addition, for each of the I, P and B pictures, the picture complexities X_i, X_p and X_b are calculated from the encoding results by the following equation.

Here, R_i, R_p and R_b are the code amounts obtained by encoding the I, P and B pictures, respectively, and Q_i, Q_p and Q_b are the average Q scale values over all macroblocks in the I, P and B pictures, respectively. From equations (1) and (2), the target code amounts T_i, T_p and T_b for the I, P and B pictures are calculated using the following equations.

Here, K_p = 1.0 and K_b = 1.4.
Based on the target code amount T_p of picture P3 obtained by the above processing, the Q scale is calculated in step 2 and the subsequent steps.
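
The equation images for (1) to (3) are not reproduced in this text. As a hedged illustration only, the sketch below follows the step 1 bit allocation formulas as published in Test Model 5, which may differ in notation from the equations of this document. All function and parameter names are hypothetical. `r_remaining` stands for the code amount still available in the current GOP, and the lower bound bit_rate / (8 × picture_rate) is taken from TM5 itself rather than from the description above.

```python
K_P = 1.0  # constant for P pictures (value stated in the text)
K_B = 1.4  # constant for B pictures (value stated in the text)


def complexity(coded_bits: float, mean_q_scale: float) -> float:
    """Picture complexity X = R * Q: generated bits times the average Q scale."""
    return coded_bits * mean_q_scale


def gop_budget(n_pictures: int, bit_rate: float, picture_rate: float,
               carry_over: float = 0.0) -> float:
    """Bits assigned to a GOP of n_pictures pictures, plus any carry-over."""
    return bit_rate * n_pictures / picture_rate + carry_over


def target_bits(picture_type: str, r_remaining: float, n_p: int, n_b: int,
                x_i: float, x_p: float, x_b: float,
                bit_rate: float, picture_rate: float) -> float:
    """Target code amount T_i, T_p or T_b for the next picture (TM5 step 1)."""
    floor = bit_rate / (8.0 * picture_rate)  # TM5 lower bound on any target
    if picture_type == "I":
        t = r_remaining / (1.0 + n_p * x_p / (x_i * K_P) + n_b * x_b / (x_i * K_B))
    elif picture_type == "P":
        t = r_remaining / (n_p + n_b * K_P * x_b / (K_B * x_p))
    elif picture_type == "B":
        t = r_remaining / (n_b + n_p * K_B * x_p / (K_P * x_b))
    else:
        raise ValueError("picture_type must be 'I', 'P' or 'B'")
    return max(t, floor)
```

Under these assumptions, the target for a P picture such as P3 would be obtained with `target_bits("P", ...)`, passing the complexities measured on the most recently encoded I, P and B pictures.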

<Prior art 2>
As a code amount control technique that uses a prefilter, Patent Document 1 discloses a "prefilter control method and apparatus in moving picture coding". In this technique, an LPF placed before the encoding unit as a prefilter (hereinafter, the prefilter LPF) controls the spatial frequency of each picture input to the encoding unit, thereby reducing quantization distortion.

To control the prefilter LPF, the balance function shown in FIG. 15 is defined so as to reconcile quantization distortion with loss of image sharpness. The two curves in FIG. 15 correspond to the following two functions F1 and F2, respectively.

F1(motion amount, filter coefficient, Q scale, code amount)
F2(filter coefficient, Q scale)
The intersection of functions F1 and F2 is called the balance point. At this point, the Q scale and the filter coefficient of the prefilter LPF that best balance code amount against image quality are said to be obtained.

<Prior art 3>
Similarly, as another prior art code amount control technique that uses a prefilter, Patent Document 2 proposes a "moving image encoding method".

In this technique, an encoding difficulty Y is first calculated for each of the I, P and B pictures using a function F, as follows.

Y = F(cumulative code amount, average Q scale)
Next, a filter coefficient parameter Z is calculated by the following equation from the encoding difficulties Y_i, Y_p and Y_b calculated for the I, P and B pictures, respectively.

According to the value of the filter coefficient parameter Z obtained from equation (4), the actual filter coefficient S is selected from the preset values S0, S1 and S3 using the graph shown in FIG. 16.
That is, by giving a width to the range of the filter coefficient parameter Z that corresponds to each filter coefficient S, abrupt changes in the filter coefficient are avoided.
Non-Patent Document 1: Test Model Editing Committee, "Test Model 5", ISO/IEC JTC/SC29/WG11/N0400 (Apr. 1993)
Patent Document 1: Japanese Patent No. 2894137
Patent Document 2: JP 2002-247576 A

However, TM5, the prior art 1 described in Non-Patent Document 1, has the following problem.

In steps 2 and 3, the final Q scale is derived only from the deviation between the target code amount of the picture and the code amount generated in the picture up to the current macroblock, and from the spatial activity of the macroblock.

In other words, TM5 tries to adjust for visual characteristics while encoding is actually in progress, after the target code amount of the picture has already been fixed. As a result, neither the quantitative degree of image quality degradation nor human visual characteristics are sufficiently reflected.

Prior art 2, described in Patent Document 1, attempts to solve the problem of TM5 by using a prefilter LPF. However, function F1 requires a large-scale circuit to calculate the motion amount that is one of its arguments. Moreover, Patent Document 1 says nothing about how functions F1 and F2 are defined or how the balance point shown in FIG. 15 is calculated, so the control method and the effect of the prefilter LPF are unclear.

Prior art 3, described in Patent Document 2, attempts to solve the problem of prior art 2 by avoiding abrupt changes when the filter coefficient is switched. However, because it merely predicts the filter coefficient from the cumulative code amount and the average Q scale, it can hardly be said to take the degree of image quality degradation or human visual characteristics into account.

The present invention has been made in view of the above problems. Its object is to provide a moving image encoding apparatus and a control method thereof that can generate encoded moving image data in which the code amount and the coding distortion amount are optimal under the assigned target code amount, taking into account the degree of image quality degradation and human visual characteristics.

To achieve the above object, a moving image encoding apparatus according to the present invention has the following configuration. That is, it is
a moving image encoding apparatus that, in moving image encoding performed in predetermined units, encodes an input image to a predetermined target code amount using a weighting parameter of the quantization process, the apparatus comprising:
variance calculation means for calculating the variance of the input image;
filter means for filtering the input image with a given filter characteristic;
encoding means for quantizing and encoding the input image filtered by the filter means;
decoding means for decoding the encoded data output by the encoding means;
detection means for detecting the block distortion amount caused by the encoding means from the image input to the encoding means and the reconstructed image output by the decoding means;
R-D model calculation means for determining in advance, using the variance of the image input to the encoding means and the coding distortion amount caused by the encoding means, a defining equation for an R-D model, which is the relationship between the code amount (R) and the coding distortion amount (D) of the encoding means, and for calculating the target code amount of the input image from the defining equation;
visual sensitivity model calculation means for evaluating visual sensitivity by adding the filter distortion amount caused by the filter means, the coding distortion amount caused by the encoding means, and the block distortion amount detected by the detection means for the image immediately preceding the input image;
parameter calculation means for calculating a parameter corresponding to the variance of the image input to the encoding means from the target code amount of the input image, the coding distortion amount calculated by the visual sensitivity model calculation means, and the variance calculated by the variance calculation means;
filter characteristic calculation means for selecting, from among variance characteristics between the image input to the filter means and the image output from the filter means that are determined in advance for a plurality of filter coefficients, the filter characteristic closest to the relationship between the parameter calculated by the parameter calculation means and the variance calculated by the variance calculation means; and
R-Q model calculation means for calculating the weighting parameter of the quantization process from the target code amount calculated by the R-D model calculation means,
wherein the parameter calculation means calculates in advance the variance of the input image filtered by the filter means and the coding distortion amount caused by the encoding means, using the method of Lagrange undetermined multipliers such that the evaluation equation for evaluating the visual sensitivity is maximized or minimized under the constraint that the target code amount of the input image equals the code amount of the encoding means obtained from the defining equation.

To achieve the above object, a moving image encoding method according to the present invention has the following configuration. That is, it is
a moving image encoding method that, in moving image encoding performed in predetermined units, encodes an input image to a predetermined target code amount using a weighting parameter of the quantization process, the method comprising:
a variance calculation step of calculating the variance of the input image;
a filter step of filtering the input image with a given filter characteristic;
an encoding step of quantizing and encoding the input image filtered in the filter step;
a decoding step of decoding the encoded data output by the encoding step;
a detection step of detecting the block distortion amount caused by the encoding step from the image input to the encoding step and the reconstructed image output by the decoding step;
an R-D model calculation step of determining in advance, using the variance of the image input to the encoding step and the coding distortion amount caused by the encoding step, a defining equation for an R-D model, which is the relationship between the code amount (R) and the coding distortion amount (D) of the encoding step, and of calculating the target code amount of the input image from the defining equation;
a visual sensitivity model calculation step of evaluating visual sensitivity by adding the filter distortion amount caused by the filter step, the coding distortion amount caused by the encoding step, and the block distortion amount detected in the detection step for the image immediately preceding the input image;
a parameter calculation step of calculating a parameter corresponding to the variance of the image input to the encoding step from the target code amount of the input image, the coding distortion amount calculated in the visual sensitivity model calculation step, and the variance calculated in the variance calculation step;
a filter characteristic calculation step of selecting, from among variance characteristics between the image input to the filter step and the image output from the filter step that are determined in advance for a plurality of filter coefficients, the filter characteristic closest to the relationship between the parameter calculated in the parameter calculation step and the variance calculated in the variance calculation step; and
an R-Q model calculation step of calculating the weighting parameter of the quantization process from the target code amount calculated in the R-D model calculation step,
wherein the parameter calculation step calculates in advance the variance of the input image filtered in the filter step and the coding distortion amount caused by the encoding step, using the method of Lagrange undetermined multipliers such that the evaluation equation for evaluating the visual sensitivity is maximized or minimized under the constraint that the target code amount of the input image equals the code amount of the encoding step obtained from the defining equation.

According to the present invention, it is possible to provide a moving image encoding apparatus and method that can generate encoded moving image data in which the code amount and the coding distortion amount are optimal under the assigned target code amount, taking into account the degree of image quality degradation and human visual characteristics.

Embodiments of the present invention are described in detail below with reference to the drawings.

The following describes in detail examples in which the present invention is applied to lossy encoding schemes, which involve coding distortion.

Embodiment 1 shows an example applied to a general lossy encoding scheme, without restricting the scheme. Embodiment 2 shows an example applied to the MPEG encoding scheme.

<Embodiment 1>
FIG. 1 is a block diagram showing the configuration of the moving image encoding apparatus of Embodiment 1 of the present invention. FIG. 2 is a flowchart showing the processing executed by that apparatus.

The operation of the moving image encoding apparatus 100 of Embodiment 1 is described in detail below with reference to FIGS. 1 and 2.

The weighting parameter of the quantization process in the moving image encoding apparatus is assumed to be the Q scale.

As shown in FIG. 1, the moving image encoding apparatus 100 broadly consists of the following blocks: a prefilter unit 101, an encoding unit 102, a local decoding unit 103, and a code amount control unit 104. Each block may be implemented in hardware, or some or all of the blocks may be implemented as software under the control of a CPU, RAM and ROM.

In describing the operation of the moving image encoding apparatus 100, it is assumed that, as shown in FIG. 3, encoding has been completed up to picture I2 and that picture I3 is to be encoded next.

First, in step S200 of FIG. 2, the target code amount R_t of picture I3 is set by an external block (not shown). How R_t is calculated does not depend on the present invention. With the CBR method shown in the prior art, for example, it corresponds to the processing of step 1 of TM5.

The moving image encoding apparatus 100 does not calculate the Q scale of the encoding unit 102 directly from the set target code amount R_t. Instead, it uses the visual sensitivity model calculation unit 107 and the R-D model calculation unit 109 to divide the coding distortion amount expected from the target code amount R_t optimally between the prefilter unit 101 and the encoding unit 102.

In step S201, the variance calculation unit 105 calculates the variance S_i of picture I3. The variance S_i is calculated, for example, as follows.
Let (x, y) be the pixel coordinates, M × N pixels the picture size, and AVE the mean of the picture of interest. The variance S_i of the picture is then calculated by the following equation.
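
The equation image is not reproduced in this text. Assuming the standard definition of variance, S_i = (1 / (M·N)) · Σ_x Σ_y (P(x, y) − AVE)², where P(x, y) is a hypothetical name for the pixel value at (x, y), step S201 could be sketched as follows. The function name and the list-of-rows picture layout are illustrative choices, not part of the description.

```python
from typing import Sequence


def picture_variance(pixels: Sequence[Sequence[float]]) -> float:
    """Variance of a picture given as N rows of M samples each."""
    n_rows = len(pixels)
    m_cols = len(pixels[0])
    count = n_rows * m_cols
    # AVE: mean sample value of the picture
    ave = sum(sum(row) for row in pixels) / count
    # Mean of the squared deviations from AVE
    return sum((p - ave) ** 2 for row in pixels for p in row) / count
```

A flat picture yields a variance of zero, and the value grows with the amount of detail, which is why the description uses it as a measure of how hard the picture is to encode.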

Next, the R-D model (R-D defining equation) of the encoding unit 102 and the visual sensitivity model (visual sensitivity evaluation equation), which are used in steps S203 to S204, are described.

The R-D model R_c(S_f, MSE_c) of the encoding unit 102 applied in Embodiment 1 is calculated using the following equation.

Here, I_c and Θ_c are constants. With I_c = 1 and Θ_c = 0.5, the equation is the well-known expression from information theory, known as rate-distortion theory, that relates the code amount to the coding distortion amount.
S_f is the variance of the picture input to the encoding unit 102, that is, the variance of the picture output by the prefilter unit 101. S_f is a variable that changes with the variance S_i of the picture input to the moving image encoding apparatus 100 of Embodiment 1 and with the filter characteristic of the prefilter unit 101.

MSE_c is the coding distortion amount caused by the encoding unit 102. It is a variable corresponding to the sum of squared differences between the picture input to the encoding unit 102 and the picture output by the local decoding unit 103.

I_c and Θ_c are defined as parameters that depend on the encoding scheme of the encoding unit 102. Because Embodiment 1 assumes that the encoding scheme of the encoding unit 102 is not restricted, I_c = 1 and Θ_c = 0.5 are applied.

FIG. 4 illustrates the characteristic of equation (5). It shows the relationship between the code amount R_c and the coding distortion amount MSE_c for S_f = 2300, I_c = 1 and Θ_c = 0.5.
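
Equation (5) itself is not reproduced in this text, so the sketch below covers only the special case that the description identifies, I_c = 1 and Θ_c = 0.5. It assumes that this case is the classical rate-distortion function of a Gaussian source, R = 0.5 · log2(S_f / MSE_c) bits per sample. How I_c and Θ_c enter the general equation is deliberately not assumed here, and the function names are hypothetical.

```python
import math


def rate_from_distortion(s_f: float, mse_c: float) -> float:
    """Code amount R_c in bits per sample for variance s_f and distortion mse_c.

    Assumed special case (I_c = 1, Theta_c = 0.5): R = 0.5 * log2(s_f / mse_c).
    The rate is clamped to zero when the distortion reaches the variance.
    """
    if s_f <= 0.0 or mse_c <= 0.0:
        raise ValueError("variance and distortion must be positive")
    return max(0.0, 0.5 * math.log2(s_f / mse_c))


def distortion_from_rate(s_f: float, rate: float) -> float:
    """Inverse relation: MSE_c = s_f * 2 ** (-2 * rate)."""
    return s_f * 2.0 ** (-2.0 * rate)
```

With S_f = 2300 as in FIG. 4, this assumed relation trades one additional bit per sample for a fourfold reduction in MSE_c, which gives the monotonically decreasing curve shape that such a figure would show.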

The visual sensitivity model H_vs(S_f, MSE_c) used in step S203 is defined in Embodiment 1 by the following equation.

Here, MSE_f is the filter distortion amount caused by the prefilter unit 101, B_cprev is the block distortion amount detected by the block distortion detection unit 106 when the immediately preceding picture was encoded, and S_cprev is the variance S_f of the immediately preceding picture input to the encoding unit 102.
The filter distortion amount MSE_f in equation (6) is in turn defined by the following equation.

Here, α is a constant that depends on the type of filter in the prefilter unit 101.
FIG. 5 lists the variables and constants used in equations (5) to (7).

The characteristics of the visual sensitivity model H_vs(S_f, MSE_c) of equations (6) and (7) used in Embodiment 1 are as follows.

Characteristic 1: By taking into account not only the coding distortion amount MSE_c caused by the encoding unit 102 but also the filter distortion amount MSE_f(S_f) caused by the prefilter unit 101, the distortion of the moving image encoding apparatus 100 as a whole can be evaluated, which enables highly accurate control of image quality.

Characteristic 2: Adding the block distortion amount B_cprev as an evaluation quantity allows image quality to be evaluated in a way that is close to human visual sensitivity.

Using equations (6) and (7), in step S202 the visual sensitivity model H_vs(S_f, MSE_c) is calculated from the variance S_i of the picture input to the moving image encoding apparatus 100, the variance S_cprev of the immediately preceding picture input to the encoding unit 102, and the block distortion amount B_cprev of the immediately preceding picture.

Next, the method by which the parameter calculation unit 108 calculates the variance S_f and the coding distortion amount MSE_c in step S203 is described.

In Embodiment 1, the method of Lagrange undetermined multipliers is used to calculate the variance S_f and the coding distortion amount MSE_c that optimize the relationship between the two models, the visual sensitivity model H_vs(S_f, MSE_c) and the R-D model R_c(S_f, MSE_c), under the constraint of the picture target code amount given to the moving image encoding apparatus 100.

That is, if the target code amount of the picture given to the moving image encoding apparatus 100 is R_t, the constraint is expressed by the following equation.
[Constraint equation]

Furthermore, defining the undetermined multiplier as λ gives the following equation.

The following equations are then defined as the necessary conditions.
[Necessary condition equations]

Therefore, to obtain the optimal solution for the variance S_f and the coding distortion amount MSE_c from equations (10) and (8), the following equation is computed in step S203.

Here, I_c = 1 and Θ_c = 0.5 in Embodiment 1.
α is a coefficient that depends on the type of filter used in the prefilter unit 101 and is a constant fixed in advance when the moving image encoding apparatus 100 is configured.
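
The equation images for the constraint, the Lagrangian and the necessary conditions of step S203 are not reproduced in this text. As a reading aid only, the general pattern of the method of Lagrange undetermined multipliers, applied to the quantities named above, is sketched below. The sign convention and the symbols g and L are assumptions, and the closed-form solution of the description is not reconstructed.

```latex
\begin{aligned}
&\text{Constraint:} && g(S_f, \mathrm{MSE}_c) = R_c(S_f, \mathrm{MSE}_c) - R_t = 0 \\
&\text{Lagrangian:} && L(S_f, \mathrm{MSE}_c, \lambda) = H_{vs}(S_f, \mathrm{MSE}_c) + \lambda\, g(S_f, \mathrm{MSE}_c) \\
&\text{Necessary conditions:} && \frac{\partial L}{\partial S_f} = 0, \qquad
\frac{\partial L}{\partial \mathrm{MSE}_c} = 0, \qquad
\frac{\partial L}{\partial \lambda} = 0
\end{aligned}
```

Solving the three conditions simultaneously eliminates λ and yields the pair (S_f, MSE_c) at which the visual sensitivity evaluation is extremal while the encoder still meets the target code amount R_t.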

Next, in step S204, the filter characteristic calculation unit 110 determines the filter characteristic of the prefilter unit 101. In Embodiment 1, the filter characteristic is selected using the change in variance between the input and output pictures of the prefilter unit 101.

The variance S_i of the input picture has already been calculated in step S201, and the variance S_f of the output picture in step S203.

From among the input-output picture variance characteristics of the prefilter unit 101, which are determined in advance for a plurality of filter coefficients, the one filter coefficient whose characteristic is closest to the relationship between these two variances S_i and S_f is selected.

FIG. 6 shows five curves representing the input-output picture variance characteristics of the prefilter unit 101 for the predetermined filter coefficients C1 to C5, and shows that filter coefficient C2, which is closest to the calculated relationship between the variances S_i and S_f, is selected.
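
The selection in step S204 can be sketched as a nearest-curve lookup. In the sketch below, each filter coefficient is represented by a function that maps an input variance to the output variance it would produce. The curve shapes in `EXAMPLE_CURVES` are invented for illustration and are not the characteristics of FIG. 6, and the function and variable names are hypothetical.

```python
from typing import Callable, Dict

VarianceCurve = Callable[[float], float]

# Hypothetical characteristics: stronger low-pass filtering removes more variance.
EXAMPLE_CURVES: Dict[str, VarianceCurve] = {
    "C1": lambda s_i: 1.00 * s_i,
    "C2": lambda s_i: 0.85 * s_i,
    "C3": lambda s_i: 0.70 * s_i,
    "C4": lambda s_i: 0.55 * s_i,
    "C5": lambda s_i: 0.40 * s_i,
}


def select_filter(s_i: float, s_f: float,
                  curves: Dict[str, VarianceCurve]) -> str:
    """Return the coefficient whose predicted output variance is closest to s_f."""
    return min(curves, key=lambda name: abs(curves[name](s_i) - s_f))
```

For example, with the invented curves above, an input variance of 2700 and a desired output variance of 2300 select `"C2"`, because 0.85 × 2700 = 2295 is the closest prediction.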

The prefilter unit 101 receives one of the parameters C1 to C5 from the filter characteristic calculation unit 110 and changes its filter coefficients to obtain the corresponding filter characteristic.

ステップＳ２０５で、Ｒ−Ｄモデル算出部１０９において、式（１１）から得られた符号化歪み量ＭＳＥ_c及び分散Ｓ_fを用いて、Ｒ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）から符号化部１０２の目標符号量Ｒ_cを算出する。 In step S205, the R-D model calculation unit 109 calculates the target code amount R_c of the encoding unit 102 from the R-D model R_c(S_f, MSE_c), using the coding distortion amount MSE_c and the variance S_f obtained from equation (11).

これは、式（５）のＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）に、当該符号化歪み量ＭＳＥ_c及び分散Ｓ_fを代入することで算出する。 It is calculated by substituting this coding distortion amount MSE_c and variance S_f into the R-D model R_c(S_f, MSE_c) of equation (5).

ステップＳ２０６で、ステップＳ２０５で算出した目標符号量Ｒ_cを用いて、符号化部１０２のＱスケールを算出する。Ｑスケールの算出には、符号化部１０２のＲ−Ｑモデルを使用する。実施形態１においては、符号化部１０２のＲ−ＱモデルＲＱ_c（Ｒ_c，Ｓ_f）として、次の一次式で表現する。 In step S206, the Q scale of the encoding unit 102 is calculated using the target code amount R_c calculated in step S205. The R-Q model of the encoding unit 102 is used for this calculation. In the first embodiment, the R-Q model RQ_c(R_c, S_f) of the encoding unit 102 is expressed by the following linear expression.

ここで、Ｒ_cはステップＳ２０５において算出した目標符号量Ｒ_cであり、Ｓ_fはステップＳ２０３において算出した符号化部１０２の入力ピクチャの分散Ｓ_fである。 Here, R_c is the target code amount calculated in step S205, and S_f is the variance of the input picture of the encoding unit 102 calculated in step S203.
また、β_cは定数であり、直前のピクチャにおいて使用したＲ_c、Ｓ_i及びＱ_cの値を、式（１２）に再度代入することで得る。但し、実施形態１においては、Ｑスケールを算出する精度を高めるために、次式を用いてステップＳ２０９で、Ｒ−ＱモデルＲＱ_c（Ｒ_c，Ｓ_f）を更新する。 Further, β_c is a constant, obtained by substituting the values of R_c, S_i, and Q_c used for the immediately preceding picture back into equation (12). In the first embodiment, however, to improve the accuracy of the Q scale calculation, the R-Q model RQ_c(R_c, S_f) is updated in step S209 using the following equation.

但し、ｎはＲ−ＱモデルＲＱ_c（Ｒ_c，Ｓ_f）に反映させる過去のピクチャ数に相当する。 Here, n is the number of past pictures reflected in the R-Q model RQ_c(R_c, S_f).
以上、ステップＳ２００〜ステップＳ２０６までの処理が完了した後、ステップＳ２０７で、プリフィルタ部１０１及び符号化部１０２の処理を実行する。 After the processing of steps S200 to S206 described above is complete, the prefilter unit 101 and the encoding unit 102 perform their processing in step S207.

また、符号化部１０２の符号化処理と並行して、ステップＳ２０８ではブロック歪み検出部１０６で、ブロック歪み量Ｂ_cprevの検出を行う。ブロック歪み量Ｂ_cprevは、符号化部１０２の入力ピクチャと局所復号化部１０３の出力ピクチャを用いる。 In parallel with the encoding process of the encoding unit 102, the block distortion detection unit 106 detects the block distortion amount B_cprev in step S208. B_cprev is computed from the input picture of the encoding unit 102 and the output picture of the local decoding unit 103.

ここで、人間の視覚感度として、ブロック歪みに非常に敏感であることが知られている。このブロック歪みは、８×８画素の正方ブロック単位で直交変換及び量子化処理を施していることがその発生原因である。 Human vision is known to be very sensitive to block distortion. Block distortion arises because the orthogonal transform and quantization are applied in units of 8 × 8-pixel square blocks.

ブロック歪み量Ｂ_cprevの検出方法は、本発明には依存せずに自由に実装が可能であるが、たとえ同じピクチャに対してブロック歪みを検出したとしても、ブロック歪み量Ｂ_cprevは検出方法に依存して異なる。 The method of detecting the block distortion amount B_cprev does not depend on the present invention and can be implemented freely. However, even for the same picture, the value of B_cprev differs depending on the detection method.

しかし、その違いは、式（６）の視覚モデルＨ_vs（Ｓ_f，ＭＳＥ_c）を考慮し、定数をＢ_cprevに乗じれば良い。この定数は、ブロック歪み検出部１０６の検出方法が定まれば、実施形態１の動画像符号化装置１００を構成した際には一意に決定される値である。 However, this difference can be compensated for by multiplying B_cprev by a constant, taking the visual model H_vs(S_f, MSE_c) of equation (6) into account. Once the detection method of the block distortion detection unit 106 is fixed, this constant is uniquely determined when the moving image encoding apparatus 100 of the first embodiment is configured.

実施形態１においては、ブロック歪み検出部１０６の検出方法として、８×８ブロック境界の差分二乗和ＭＳＥ_blkとピクチャ全体の差分二乗和ＭＳＥ_allの比を使って算出する。 In the first embodiment, the block distortion detection unit 106 calculates the block distortion amount using the ratio of the sum of squared differences at 8 × 8 block boundaries, MSE_blk, to the sum of squared differences over the entire picture, MSE_all.

ここで、符号化部１０２の入力ピクチャの水平方向の画素数をx_size及び垂直方向の画素数をy_sizeとする。水平方向の座標がＪ及び垂直方向の座標がＩの、符号化部１０２の入力ピクチャの画素値をＣＩＮ（Ｊ，Ｉ）とし、同様に、局所復号化部１０３の出力ピクチャの画素値をＣＯＵＴ（Ｊ，Ｉ）とすれば、ブロック歪み量Ｂ_cprevは、以下の（１４）式で算出する。 Here, it is assumed that the number of pixels in the horizontal direction of the input picture of the encoding unit 102 is x_size and the number of pixels in the vertical direction is y_size. The pixel value of the input picture of the encoding unit 102 with the horizontal coordinate J and the vertical coordinate I is CIN (J, I), and similarly, the pixel value of the output picture of the local decoding unit 103 is COUT. If (J, I), the block distortion amount B _cprev is calculated by the following equation (14).
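Equation (14) itself is not reproduced here, but the ratio-based measure described above can be sketched as follows. This is a minimal illustration, not the patent's exact formula. It assumes that "block boundary" pixels are those in the first or last row or column of each 8 × 8 block, and that the ratio is scaled by a detection-method-dependent constant, here called `gamma`. The function name and the list-of-rows picture layout are also hypothetical.

```python
def block_distortion(cin, cout, gamma=1.0, block=8):
    """Return gamma * MSE_blk / MSE_all for two equally sized pictures.

    cin[i][j] and cout[i][j] are pixel values at vertical coordinate i and
    horizontal coordinate j. Boundary pixels are assumed to be those in the
    first or last row or column of a block (an illustrative choice).
    """
    mse_all = 0.0  # sum of squared differences over the whole picture
    mse_blk = 0.0  # sum of squared differences at block-boundary pixels
    edge = block - 1
    for i, (row_in, row_out) in enumerate(zip(cin, cout)):
        for j, (p_in, p_out) in enumerate(zip(row_in, row_out)):
            sq = (p_in - p_out) ** 2
            mse_all += sq
            if i % block in (0, edge) or j % block in (0, edge):
                mse_blk += sq
    if mse_all == 0.0:
        return 0.0  # identical pictures: no distortion to attribute
    return gamma * mse_blk / mse_all


# A uniform error spreads evenly, so the ratio equals the share of boundary
# pixels in an 8 x 8 block: (64 - 36) / 64.
flat = [[0] * 16 for _ in range(16)]
shifted = [[1] * 16 for _ in range(16)]
print(block_distortion(flat, shifted))
```

Under this definition, values well above 28/64 would indicate that the error is concentrated at block edges.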

ここで、ＭＳＥ_allは，ＣＩＮ（Ｊ，Ｉ）とＣＯＵＴ（Ｊ，Ｉ）とのピクチャ全体における差分二乗和であり、γはブロック歪み検出部１０６の検出方法に依存する定数である。 Here, MSE_all is the sum of squared differences between CIN(J, I) and COUT(J, I) over the entire picture, and γ is a constant that depends on the detection method of the block distortion detection unit 106.
以上説明したように、実施形態１によれば、以上のステップＳ２００〜ステップＳ２０９の処理を、動画像符号化処理装置１００にピクチャを入力する毎に繰り返し行うことにより、画質の劣化具合や人間の視覚特性を考慮した、プリフィルタ部１０１及び符号化部１０２の制御を実現することができる。 As described above, according to the first embodiment, steps S200 to S209 are repeated each time a picture is input to the moving image encoding apparatus 100, which makes it possible to control the prefilter unit 101 and the encoding unit 102 in a way that takes the degree of image quality degradation and human visual characteristics into account.

よって、割り当てられた目標符号量の条件下において符号量と符号化歪み量が最適な符号化動画像データを得ることができる。 Therefore, it is possible to obtain encoded moving image data in which the code amount and the encoding distortion amount are optimal under the condition of the allocated target code amount.

＜実施形態２＞ <Embodiment 2>
実施形態２として、符号化部にＭＰＥＧ−４符号化方式に適用した例を詳細に説明する。 As the second embodiment, an example in which the MPEG-4 encoding method is applied to the encoding unit is described in detail.

図７は本発明の実施形態２の動画像符号化装置の構成を示すブロック図である。また、図８は本発明の実施形態２の動画像符号化装置が実行する処理を示すフローチャートである。 FIG. 7 is a block diagram showing the configuration of the moving picture coding apparatus according to the second embodiment of the present invention. FIG. 8 is a flowchart showing processing executed by the video encoding apparatus according to the second embodiment of the present invention.

ここで、図７の実施形態２の動画像符号化装置８００を構成する各ブロックと、図１の実施形態１の動画像符号化装置１００を構成する各ブロックとの相違点は、次の二つである。 Here, the blocks of the moving image encoding apparatus 800 of the second embodiment in FIG. 7 differ from the blocks of the moving image encoding apparatus 100 of the first embodiment in FIG. 1 in the following two respects.

ブロックの相違点１：図１のプリフィルタ部１０１が、図７のバターワースフィルタ部８０１に相当する。 Block Difference 1: The prefilter unit 101 in FIG. 1 corresponds to the Butterworth filter unit 801 in FIG.

ブロックの相違点２：図１の符号化部１０２が、図７のＭＰＥＧ符号化部８０２に相当する。 Block Difference 2: The encoding unit 102 in FIG. 1 corresponds to the MPEG encoding unit 802 in FIG.

尚、符号量制御部８０４の内部ブロック構成は、図１の符号量制御部１０４の内部ブロック構成と同様である。 The internal block configuration of the code amount control unit 804 is the same as the internal block configuration of the code amount control unit 104 in FIG.

また、ＭＰＥＧ符号化部８０２は、動き検出部（ＭＥ）８０５、ＤＣＴ部８０６、量子化部（ＱＴＺ）８０７、可変長符号化部（ＶＬＣ）８０８を有している。また、局所ＭＰＥＧ復号化部８０３は、動き補償部（ＭＣ）８０９、逆ＤＣＴ部（ＩＤＣＴ）８１０、逆量子化部（ＩＱＴＺ）８１１のブロックを有している。 Also, the MPEG encoding unit 802 includes a motion detection unit (ME) 805, a DCT unit 806, a quantization unit (QTZ) 807, and a variable length encoding unit (VLC) 808. Also, the local MPEG decoding unit 803 has blocks of a motion compensation unit (MC) 809, an inverse DCT unit (IDCT) 810, and an inverse quantization unit (IQTZ) 811.

また、これらの各ブロックは、ハードウェアで実現されても良いし、ブロックの一部あるいはすべてがソフトウェアとして、ＣＰＵやＲＡＭ、ＲＯＭを用いる制御によって実現されても良い。 Each of these blocks may be realized by hardware, or a part or all of the blocks may be realized as software by control using a CPU, RAM, and ROM.

次に、図８の実施形態２の動画像符号化装置が実行する処理を示すフローチャートと、図２の実施形態１の動画像符号化装置１００が実行する処理を示すフローチャートとの相違点は、次の二つである。 Next, the flowchart of FIG. 8, which shows the processing executed by the moving image encoding apparatus of the second embodiment, differs from the flowchart of FIG. 2, which shows the processing executed by the moving image encoding apparatus 100 of the first embodiment, in the following two respects.

処理の相違点１：図８のステップＳ９０４及び９０６の処理で用いるＲ−Ｄモデルが、図２のステップＳ２０３及び２０５で用いるＲ−Ｄモデルと異なる。 Process Difference 1: The RD model used in the processes in steps S904 and 906 in FIG. 8 is different from the RD model used in steps S203 and 205 in FIG.

処理の相違点２：図８のステップＳ９０５で行うフィルタ特性の選択方法が、図２のステップＳ２０４で行うフィルタ特性の選択方法と異なる。 Processing Difference 2: The filter characteristic selection method performed in step S905 in FIG. 8 is different from the filter characteristic selection method performed in step S204 in FIG.

以後、ＭＰＥＧ−４符号化方式における全体の処理の中で、実施形態２の動画像符号化装置８００の処理が対応する部分について説明した後、前記２つの処理の相違点ついて、それぞれ詳細に説明する。 Hereinafter, the part of the overall MPEG-4 encoding process to which the processing of the moving image encoding apparatus 800 of the second embodiment corresponds is described first, and then the two processing differences are each described in detail.

[全体の処理の中で対応する部分] [Corresponding part of the overall process]
実施形態２では、図９に示すように、ストリーム全体を、複数のピクチャからなるシーケンスに分割する。符号量制御は、このシーケンスを一つの単位として行い、シーケンス単位で、それぞれ同一のビット・レートになるように符号化する。例えば、このシーケンスは、ＭＰＥＧ−４符号化方式のシンタックスにおいて、Ｇｒｏｕｐ＿ｏｆ＿ＶｉｄｅｏＯｂｊｅｃｔＰｌａｎｅ（）に対応する。 In the second embodiment, as shown in FIG. 9, the entire stream is divided into sequences, each consisting of a plurality of pictures. Code amount control treats each sequence as one unit, and each sequence is encoded to the same bit rate. For example, such a sequence corresponds to Group_of_VideoObjectPlane() in the MPEG-4 syntax.

図１０は、一つのシーケンスにおけるＭＰＥＧ−４符号化方式の処理を示すフローチャートである。シーケンスを構成するピクチャの数及びシーケンスの目標符号量は、本発明には依存しない。 FIG. 10 is a flowchart showing the processing of the MPEG-4 encoding method in one sequence. The number of pictures constituting the sequence and the target code amount of the sequence do not depend on the present invention.

例えば、ステップＳ１０００で、シーケンスの目標符号量が、従来技術の式（１）中のＲ_gopに対応しているとする。この場合、ステップＳ１００１で、シーケンスを構成する一つのピクチャの目標符号量Ｒ_tを算出するため、従来技術の式（２）及び（３）を用いることができる。 For example, in step S1000, it is assumed that the target code amount of the sequence corresponds to R _gop in equation (1) of the prior art. In this case, in order to calculate the target code amount R _t of one picture constituting the sequence in step S1001, the conventional equations (2) and (3) can be used.

シーケンスを構成する一つのピクチャの目標符号量Ｒ_tが算出された後、ステップＳ１００２で、図８の処理を繰り返すことで、シーケンスを構成するすべてのピクチャの符号化を行う。 After the target code amount R_t of one picture in the sequence has been calculated, the processing of FIG. 8 is repeated in step S1002 to encode all the pictures in the sequence.

[処理の相違点１] [Processing difference 1]
実施形態２においても、実施形態１と同様に、ＭＰＥＧ符号化部８０２のＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）を定義する。尚、実施形態２の動画像符号化処理装置８００が対象とするピクチャ・タイプはＩピクチャ及びＰピクチャの２つとする。 In the second embodiment as well, the R-D model R_c(S_f, MSE_c) of the MPEG encoding unit 802 is defined as in the first embodiment. The moving image encoding apparatus 800 of the second embodiment handles two picture types, I pictures and P pictures.

式（５）のＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）の２つの定数Ｉ_c及びΘ_cの値を、実施形態２のＭＰＥＧ符号化部８０２の符号量Ｒ_cと符号化歪み量ＭＳＥ_cとの関係を表すように定義する。 The values of the two constants I_c and Θ_c of the R-D model R_c(S_f, MSE_c) in equation (5) are defined so as to represent the relationship between the code amount R_c and the coding distortion amount MSE_c of the MPEG encoding unit 802 of the second embodiment.

ＭＰＥＧ−４符号化方式のＰピクチャの符号化においては、ピクチャ内のみ情報を用いるＩピクチャの符号化と異なり、隣り合うピクチャ間の相関を利用して差分演算を行う。 In the coding of a P picture in the MPEG-4 coding method, unlike the coding of an I picture that uses information only within a picture, a difference calculation is performed using the correlation between adjacent pictures.

この差分演算は、図７中の動き検出処理を行うＭＥ８０５及び動き補償処理を行うＭＣ８０９のそれぞれ２つのブロックにより実現される。 This difference calculation is realized by two blocks of ME 805 for performing motion detection processing and MC 809 for performing motion compensation processing in FIG.

即ち、ＭＰＥＧ符号化部８０２の入力ピクチャが同一である場合でも、ＩピクチャあるいはＰピクチャを符号化しているのかによって、直交変換処理を行うＤＣＴ８０６の入力ピクチャの分散が異なってしまい、ＭＰＥＧ符号化部８０２のＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）を表現することができないという課題が生じる。 That is, even when the input picture of the MPEG encoding unit 802 is the same, the variance of the input picture of the DCT 806, which performs the orthogonal transform, differs depending on whether an I picture or a P picture is being encoded. As a result, the R-D model R_c(S_f, MSE_c) of the MPEG encoding unit 802 cannot be expressed.

この課題を解決するためには、Ｉピクチャ及びＰピクチャのいずれの符号化時においても、ＤＣＴ８０６の入力ピクチャの分散Ｓ_fを算出して、それを式（５）中の分散Ｓ_fと定義すれば良い。但し、この場合には、新たに、ＭＥ８０５及びＭＣ８０９の処理を考慮した分散モデルを新たに定義する必要がある。 To solve this problem, it would suffice to calculate the variance S_f of the input picture of the DCT 806 when encoding either an I picture or a P picture, and to define it as the variance S_f in equation (5). In that case, however, a new variance model that accounts for the processing of the ME 805 and the MC 809 would have to be defined.

そこで、実施形態２においては、ピクチャ・タイプに応じた二つのＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）を定義する。 Therefore, in the second embodiment, two RD models R _c (S _f , MSE _c ) corresponding to picture types are defined.

図１１は、ＭＰＥＧ符号化部８０２のＩピクチャにおけるＲ−ＤモデルＲ_ic（Ｓ_f，ＭＳＥ_c）の符号量Ｒ_cと符号化歪み量ＭＳＥ_cの関係を示した図である。 FIG. 11 shows the relationship between the code amount R_c and the coding distortion amount MSE_c of the R-D model R_ic(S_f, MSE_c) for I pictures of the MPEG encoding unit 802.

図１１中の「−▲−」で示される曲線は、式（５）中の二つの定数Ｉ_c及びΘ_cをそれぞれ、Ｉ_c＝１及びΘ_c＝０.５としたＲ−Ｄモデルに、「−■−」で示される曲線は、Ｉ_c＝０.１及びΘ_c＝０.２５とした実施形態２のＭＰＥＧ符号化部８０２のＩピクチャに対応したＩピクチャＲ−ＤモデルＲ_ic（Ｓ_f，ＭＳＥ_c）にそれぞれ対応する。更には、「−◆−」で示される曲線は、実際にＭＰＥＧ−４符号化方式でＩピクチャを符号化した場合の実測値を示している。 In FIG. 11, the curve marked “−▲−” corresponds to the R-D model in which the two constants in equation (5) are set to I_c = 1 and Θ_c = 0.5, and the curve marked “−■−” corresponds to the I-picture R-D model R_ic(S_f, MSE_c) of the MPEG encoding unit 802 of the second embodiment, in which I_c = 0.1 and Θ_c = 0.25. The curve marked “−◆−” shows values measured when I pictures were actually encoded with the MPEG-4 encoding method.

図１１において、符号量Ｒ_cが０.５以上の領域において、実測値の曲線とＩピクチャＲ−ＤモデルＲ_ic（Ｓ_f，ＭＳＥ_c）の曲線の間に大きな乖離が生じている。 In FIG. 11, in the region where the code amount R _c is 0.5 or more, there is a large divergence between the measured value curve and the curve of the I picture RD model R _ic (S _f , MSE _c ).

尚、符号量Ｒ_c＝０.５に相当するビット・レートは、ＭＰＥＧ符号化部８０２の入力ピクチャの画像サイズがＶＧＡ、サブサンプル４−２−０及びフレーム・レート３０ｆｐｓの場合に、６.６Ｍｂｐｓという非常に高いビット・レートに相当する。 Note that when the input picture of the MPEG encoding unit 802 is VGA size with 4:2:0 subsampling and a frame rate of 30 fps, a code amount of R_c = 0.5 corresponds to a very high bit rate of 6.6 Mbps.
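The quoted figure can be checked with a short calculation. The sketch below shows one reading that reproduces it, under two assumptions not stated in the text: the code amount is counted per sample including chroma (1.5 samples per pixel for 4:2:0), and "Mbps" uses a binary mega (2^20). The function name is hypothetical.

```python
def bit_rate_bps(bits_per_sample, width, height, fps, samples_per_pixel=1.5):
    """Bit rate in bit/s for a given code amount per sample.

    samples_per_pixel = 1.5 models 4:2:0 subsampling (one luma sample plus
    two chroma planes at quarter resolution); this is an assumption.
    """
    return bits_per_sample * width * height * samples_per_pixel * fps


rate = bit_rate_bps(0.5, 640, 480, 30)   # VGA, 30 fps, R_c = 0.5
print(rate)                              # bit/s
print(round(rate / 2**20, 1))            # in units of 2**20 bit/s
```

Counting luma pixels only would give about 4.6 × 10^6 bit/s instead, so the chroma factor appears to be needed to arrive at 6.6.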

ここで、ＭＰＥＧ符号化部８０２において、このような高ビット・レートによる符号化を行った際には、視覚的に目立つ程度にブロック歪みが発生することは稀であり、そもそもバターワースフィルタ部８０１においてブロック歪みを緩和するようなプリフィルタ処理が必要とされない。 When the MPEG encoding unit 802 encodes at such a high bit rate, block distortion rarely becomes visually noticeable, so prefiltering by the Butterworth filter unit 801 to reduce block distortion is not needed in the first place.

即ち、ステップＳ９０３〜ステップＳ９０６までのバターワースフィルタ部８０１のフィルタ特性を制御するための処理を省略することができる。 That is, the process for controlling the filter characteristics of the Butterworth filter unit 801 from step S903 to step S906 can be omitted.

よって、ステップＳ９０２で、動画像符号化装置８００に与えられたピクチャの目標符号量が０.５ｂｉｔ／ｐｉｘｅｌ以上の場合には、ステップＳ９０７へ分岐することとする。 Therefore, in step S902, if the target code amount of the picture given to the moving picture coding apparatus 800 is 0.5 bit / pixel or more, the process branches to step S907.

一方、ＭＰＥＧ符号化部８０２のＰピクチャに対応した、ＰピクチャＲ−ＤモデルＲ_pc（Ｓ_f，ＭＳＥ_c）の符号量Ｒ_cと符号化歪み量ＭＳＥ_cの関係を図１２に示す。 Meanwhile, FIG. 12 shows the relationship between the code amount R_c and the coding distortion amount MSE_c of the P-picture R-D model R_pc(S_f, MSE_c), which corresponds to P pictures of the MPEG encoding unit 802.

ＰピクチャＲ−ＤモデルＲ_pc（Ｓ_f，ＭＳＥ_c）は、図１２中の「−■−」で示される曲線に対応し、式（５）中の二つの定数Ｉ_c及びΘ_cが、それぞれＩ_c＝０.１５及びΘ_c＝０.１５である。また、「−▲−」は図１１と同様のＲ−Ｄモデルに対応し、「−◆−」で示される曲線は、実際にＭＰＥＧ−４符号化方式でＰピクチャを符号化した場合の実測値を示している。 The P picture R-D model R _pc (S _f , MSE _c ) corresponds to the curve indicated by “− ■ −” in FIG. 12, and the two constants I _c and Θ _c in the equation (5) are I _c = 0.15 and Θ _c = 0.15 respectively. “− ▲ −” corresponds to the RD model similar to FIG. 11, and the curve indicated by “− ◆ −” indicates the actual measurement when the P picture is actually encoded by the MPEG-4 encoding method. The value is shown.

図１２において、ＩピクチャＲ−ＤモデルＲ_ic（Ｓ_f，ＭＳＥ_c）と同様に、符号量Ｒ_cが０.５以上の領域において実測値と大きく乖離を生じるが、この領域においてはＰピクチャＲ−ＤモデルＲ_pc（Ｓ_f，ＭＳＥ_c）を使用しない。 In FIG. 12, as with the I picture RD model R _ic (S _f , MSE _c ), there is a large divergence from the actual measurement value in a region where the code amount R _c is 0.5 or more. The RD model R _pc (S _f , MSE _c ) is not used.

前記の通りにステップＳ９０４及び９０６が、図２に示す実施形態１のステップＳ２０３及びステップＳ２０５とそれぞれ異なる点は、実施形態１のＲ−ＤモデルＲ_c（Ｓ_f，ＭＳＥ_c）の代わりに、ピクチャ・タイプに応じて、２つのＩピクチャＲ−ＤモデルＲ_ic（Ｓ_f，ＭＳＥ_c）及びＰピクチャＲ−ＤモデルＲ_pc（Ｓ_f，ＭＳＥ_c）を使用する点のみである。 As described above, Steps S904 and 906 are different from Steps S203 and S205 of Embodiment 1 shown in FIG. 2 in that, instead of the RD model R _c (S _f , MSE _c ) of Embodiment 1, Depending on the picture type, only two I picture RD models R _ic (S _f , MSE _c ) and P picture RD model R _pc (S _f , MSE _c ) are used.

よって、ステップＳ９０４及び９０６においては、実施形態１で示したステップＳ２０３及びステップＳ２０５の処理をピクチャ・タイプに応じて、式（５）の定数Ｉ_c及びΘ_cを前記の通り定義して行えば良い。 Therefore, steps S904 and S906 perform the processing of steps S203 and S205 of the first embodiment with the constants I_c and Θ_c of equation (5) set, according to the picture type, to the values defined above.
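The picture-type-dependent constants and the 0.5 bit/pixel branch of step S902 can be gathered into a small lookup, sketched below. The constant values and the threshold come from the description above. The names `RD_CONSTANTS`, `rd_constants`, and `needs_prefilter_control` are hypothetical, and the R-D model of equation (5) itself is not reproduced.

```python
# (I_c, Theta_c) for equation (5), per picture type, as given for Embodiment 2.
RD_CONSTANTS = {
    "I": (0.1, 0.25),
    "P": (0.15, 0.15),
}

# At or above this target code amount (bit/pixel) the flow branches straight
# to step S907, skipping the prefilter control of steps S903 to S906.
PREFILTER_SKIP_THRESHOLD = 0.5


def rd_constants(picture_type):
    """Return (I_c, Theta_c) for picture type 'I' or 'P'."""
    return RD_CONSTANTS[picture_type]


def needs_prefilter_control(target_bits_per_pixel):
    """True when steps S903 to S906 should run for this picture."""
    return target_bits_per_pixel < PREFILTER_SKIP_THRESHOLD


print(rd_constants("P"), needs_prefilter_control(0.3))
```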

次に、ステップＳ９０５における処理について説明する。 Next, the process in step S905 will be described.

実施形態２の動画符号化装置８００では、プリフィルタ部としてバターワース特性をもつバターワースフィルタ部８０１を用いる。 In the moving image coding apparatus 800 according to the second embodiment, a Butterworth filter unit 801 having Butterworth characteristics is used as a prefilter unit.

バターワースフィルタは、最大平坦特性をもつことが知られており、周波数応答特性が次数により決定されることが特徴である。 The Butterworth filter is known to have a maximally flat characteristic, and its frequency response is determined by its order.

実施形態２においては、カットオフ周波数を固定とし、バターワースフィルタの次数を変化させることでバターワースフィルタ部８０１のフィルタ特性を変化させる。 In the second embodiment, the filter characteristic of the Butterworth filter unit 801 is changed by changing the order of the Butterworth filter while fixing the cutoff frequency.
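How the order alone shapes the response at a fixed cutoff can be illustrated with the standard analog Butterworth low-pass magnitude, |H(f)| = 1 / sqrt(1 + (f / f_c)^(2n)). The sketch below is a textbook illustration rather than the filter of the Butterworth filter unit 801. The normalized cutoff of 0.25 and the function name are assumptions.

```python
import math


def butterworth_gain(f, cutoff, order):
    """Magnitude response of an order-n Butterworth low-pass filter (n >= 1)."""
    return 1.0 / math.sqrt(1.0 + (f / cutoff) ** (2 * order))


CUTOFF = 0.25  # hypothetical normalized cutoff frequency, held fixed

# Every order has the same gain (1/sqrt(2)) at the cutoff; higher orders are
# flatter below the cutoff and roll off faster above it.
for order in range(1, 6):
    gains = [butterworth_gain(f, CUTOFF, order) for f in (0.125, 0.25, 0.5)]
    print(order, [round(g, 4) for g in gains])
```

Since all orders share the same gain at the cutoff, changing the order adjusts how sharply high frequencies are removed without moving the cutoff.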

次数を１から５まで変化させた場合のバターワースフィルタ部８０１の入力ピクチャの周波数Ｆ_iと当該フィルタ通過後の周波数Ｆ_fとの関係を表すグラフを図１３に示す。 A graph showing the relationship between the frequency F _i of the input picture of the Butterworth filter unit 801 and the frequency F _f after passing through the filter when the order is changed from 1 to 5 is shown in FIG.

ステップＳ９０１で算出したバターワースフィルタ部８０１の入力ピクチャの分散Ｓ_iと、ステップＳ９０４で得られたバターワースフィルタ部８０１の出力ピクチャの分散Ｓ_fの関係を用いて、予め定めておいた図１３に示す次数に応じたバターワースフィルタ部８０１の周波数Ｆ_iとＦ_fの関係を表す曲線の中から、分散Ｓ_iとＳ_fの関係と、最も近い周波数Ｆ_iとＦ_fの関係を表す次数を選択すれば良い。 Using the relationship between the variance S_i of the input picture of the Butterworth filter unit 801 calculated in step S901 and the variance S_f of its output picture obtained in step S904, the order is selected from the predetermined curves of FIG. 13, which represent the relationship between the frequencies F_i and F_f of the Butterworth filter unit 801 for each order. The order chosen is the one whose relationship between F_i and F_f is closest to the relationship between S_i and S_f.

尚、次数０の場合には、バターワースフィルタの機能はオフになる。 If the order is 0, the Butterworth filter function is turned off.

以上説明したように、実施形態２によれば、ＭＰＥＧ−４符号化方式においても、実施形態１と同様の効果を得ることができる。 As described above, according to the second embodiment, the same effect as that of the first embodiment can be obtained even in the MPEG-4 encoding method.

以上説明したように、本発明によれば、プリフィルタ部及び符号化部から構成される動画像符号化装置において、画質の劣化具合や人間の視覚特性を考慮し、プリフィルタ部及び符号化部を制御することで、割り当てられた目標符号量の条件下において、符号量と符号化歪み量が最適な符号化動画像データを得ることができる。 As described above, according to the present invention, in the moving image encoding apparatus including the prefilter unit and the encoding unit, the prefilter unit and the encoding unit are considered in consideration of the degradation of image quality and human visual characteristics. By controlling the above, it is possible to obtain encoded moving image data in which the code amount and the encoding distortion amount are optimum under the condition of the assigned target code amount.

具体的には、動画像符号化装置に対して予め決定したピクチャの目標符号量を設定する。次に、動画像符号化装置への入力ピクチャの分散Ｓ_iを算出する。直前のピクチャを符号化する際に、ブロック歪み量Ｂ_cprevを、符号化部の入力ピクチャと局所復号化部の出力ピクチャから予め算出しておく。 Specifically, a predetermined target code amount for the picture is set in the moving image encoding apparatus. Next, the variance S_i of the input picture to the apparatus is calculated. The block distortion amount B_cprev is calculated beforehand, when the immediately preceding picture is encoded, from the input picture of the encoding unit and the output picture of the local decoding unit.

分散Ｓ_iとブロック歪み量Ｂ_cprevから視覚感度モデルの評価式を決定する。 A visual sensitivity model evaluation formula is determined from the variance S _i and the block distortion amount B _cprev .

決定した視覚感度モデルの評価式及び符号化部の符号量と符号化歪み量の関係を規定する規定式(Ｒ−Ｄモデル)を用いて、プリフィルタ部の通過後のピクチャの分散Ｓ_f及び符号化部で発生する符号化歪み量ＭＳＥ_cを、入力ピクチャの目標符号量を拘束条件としたラグランジュ未定乗数法の解として算出する。 Using the determined evaluation formula of the visual sensitivity model and the defining formula (RD model) that defines the relationship between the coding amount of the coding unit and the coding distortion amount, the variance S _f of the picture after passing through the pre-filter unit and The encoding distortion amount MSE _c generated in the encoding unit is calculated as a solution of the Lagrange undetermined multiplier method with the target code amount of the input picture as a constraint condition.
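In generic form, the constrained optimization described above can be written as follows. This is only a schematic statement of the method of Lagrange multipliers. It assumes that the evaluation formula is H_vs(S_f, MSE_c), with S_i and B_cprev entering as fixed parameters, that the R-D model is R_c(S_f, MSE_c), and that R_t denotes the picture's target code amount. The concrete forms of equations (5) and (6) are not reproduced.

```latex
\begin{aligned}
\mathcal{L}(S_f, \mathit{MSE}_c, \lambda)
  &= H_{vs}(S_f, \mathit{MSE}_c)
   + \lambda \bigl( R_c(S_f, \mathit{MSE}_c) - R_t \bigr), \\
\frac{\partial \mathcal{L}}{\partial S_f} &= 0, \qquad
\frac{\partial \mathcal{L}}{\partial \mathit{MSE}_c} = 0, \qquad
\frac{\partial \mathcal{L}}{\partial \lambda}
   = R_c(S_f, \mathit{MSE}_c) - R_t = 0 .
\end{aligned}
```

Solving these three conditions simultaneously yields the pair (S_f, MSE_c) that extremizes the visual sensitivity evaluation while meeting the target code amount.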

分散Ｓ_i及びＳ_fをパラメータとして、プリフィルタ部のフィルタ特性を決定する。 The filter characteristics of the prefilter unit are determined using the variances S _i and S _f as parameters.

更には、符号化歪み量ＭＳＥ_cとＲ−Ｄモデルから符号化部の目標符号量Ｒ_cを決定する。 Furthermore, the target code amount R _c of the encoding unit is determined from the encoding distortion amount MSE _c and the RD model.

決定した目標符号量Ｒ_cを用いて、符号化部の符号量及び量子化処理の重み付けパラメータの関係を規定する規定式（Ｒ−Ｑモデル）から、量子化処理の重み付けパラメータを算出する。 Using the determined target code amount R _c , a weighting parameter for the quantization process is calculated from a defining formula (RQ model) that defines the relationship between the code amount of the encoding unit and the weighting parameter for the quantization process.

尚、視覚感度モデルは、実施形態１で使用した式（６）の評価式に限定されるものではなく、符号化部のＲ−Ｄモデルの符号化歪み量ＭＳＥ_cに相当する変数と、プリフィルタ部の出力ピクチャの分散Ｓ_fを変数として含めば良い。 Note that the visual sensitivity model is not limited to the evaluation formula (6) used in the first embodiment, and a variable corresponding to the coding distortion amount MSE _c of the RD model of the coding unit, The output picture variance S _f of the filter unit may be included as a variable.

更には、Ｒ−Ｄモデルから得られる符号化部の目標符号量から、Ｑスケールを算出するＲ−Ｑモデルも式（１２）に限定されないことは言うまでもない。 Furthermore, it goes without saying that the RQ model for calculating the Q scale from the target code amount of the encoding unit obtained from the RD model is not limited to the equation (12).

以上、実施形態例を詳述したが、本発明は、例えば、システム、装置、方法、プログラムもしくは記憶媒体等としての実施態様をとることが可能であり、具体的には、複数の機器から構成されるシステムに適用しても良いし、また、一つの機器からなる装置に適用しても良い。 Although the embodiments have been described in detail above, the present invention can take an embodiment as, for example, a system, an apparatus, a method, a program, or a storage medium, and specifically includes a plurality of devices. The present invention may be applied to a system that is configured, or may be applied to an apparatus that includes a single device.

尚、本発明は、前述した実施形態の機能を実現するソフトウェアのプログラム（実施形態では図に示すフローチャートに対応したプログラム）を、システムあるいは装置に直接あるいは遠隔から供給し、そのシステムあるいは装置のコンピュータが該供給されたプログラムコードを読み出して実行することによっても達成される場合を含む。 In the present invention, a software program (in the embodiment, a program corresponding to the flowchart shown in the drawing) that realizes the functions of the above-described embodiment is directly or remotely supplied to the system or apparatus, and the computer of the system or apparatus Is also achieved by reading and executing the supplied program code.

従って、本発明の機能処理をコンピュータで実現するために、該コンピュータにインストールされるプログラムコード自体も本発明を実現するものである。つまり、本発明は、本発明の機能処理を実現するためのコンピュータプログラム自体も含まれる。 Accordingly, since the functions of the present invention are implemented by computer, the program code installed in the computer also implements the present invention. In other words, the present invention includes a computer program itself for realizing the functional processing of the present invention.

その場合、プログラムの機能を有していれば、オブジェクトコード、インタプリタにより実行されるプログラム、ＯＳに供給するスクリプトデータ等の形態であっても良い。 In that case, as long as it has the function of a program, it may be in the form of object code, a program executed by an interpreter, script data supplied to the OS, or the like.

プログラムを供給するための記録媒体としては、例えば、フロッピー（登録商標）ディスク、ハードディスク、光ディスク、光磁気ディスク、ＭＯ、ＣＤ−ＲＯＭ、ＣＤ−Ｒ、ＣＤ−ＲＷ、磁気テープ、不揮発性のメモリカード、ＲＯＭ、ＤＶＤ（ＤＶＤ−ＲＯＭ，ＤＶＤ−Ｒ）などがある。 As a recording medium for supplying the program, for example, floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card ROM, DVD (DVD-ROM, DVD-R) and the like.

その他、プログラムの供給方法としては、クライアントコンピュータのブラウザを用いてインターネットのホームページに接続し、該ホームページから本発明のコンピュータプログラムそのもの、もしくは圧縮され自動インストール機能を含むファイルをハードディスク等の記録媒体にダウンロードすることによっても供給できる。また、本発明のプログラムを構成するプログラムコードを複数のファイルに分割し、それぞれのファイルを異なるホームページからダウンロードすることによっても実現可能である。つまり、本発明の機能処理をコンピュータで実現するためのプログラムファイルを複数のユーザに対してダウンロードさせるＷＷＷサーバも、本発明に含まれるものである。 As another program supply method, a client computer browser is used to connect to an Internet homepage, and the computer program of the present invention itself or a compressed file including an automatic installation function is downloaded from the homepage to a recording medium such as a hard disk. Can also be supplied. It can also be realized by dividing the program code constituting the program of the present invention into a plurality of files and downloading each file from a different homepage. That is, the present invention includes a WWW server that allows a plurality of users to download a program file for realizing the functional processing of the present invention on a computer.

また、本発明のプログラムを暗号化してＣＤ−ＲＯＭ等の記憶媒体に格納してユーザに配布し、所定の条件をクリアしたユーザに対し、インターネットを介してホームページから暗号化を解く鍵情報をダウンロードさせ、その鍵情報を使用することにより暗号化されたプログラムを実行してコンピュータにインストールさせて実現することも可能である。 In addition, the program of the present invention is encrypted, stored in a storage medium such as a CD-ROM, distributed to users, and key information for decryption is downloaded from a homepage via the Internet to users who have cleared predetermined conditions. It is also possible to execute the encrypted program by using the key information and install the program on a computer.

また、コンピュータが、読み出したプログラムを実行することによって、前述した実施形態の機能が実現される他、そのプログラムの指示に基づき、コンピュータ上で稼動しているＯＳなどが、実際の処理の一部または全部を行ない、その処理によっても前述した実施形態の機能が実現され得る。 In addition to the functions of the above-described embodiments being realized by the computer executing the read program, the OS running on the computer based on an instruction of the program is a part of the actual processing. Alternatively, the functions of the above-described embodiment can be realized by performing all of them and performing the processing.

さらに、記録媒体から読み出されたプログラムが、コンピュータに挿入された機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに書き込まれた後、そのプログラムの指示に基づき、その機能拡張ボードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行ない、その処理によっても前述した実施形態の機能が実現される。 Furthermore, after the program read from the recording medium is written to a memory provided in a function expansion board inserted into the computer or a function expansion unit connected to the computer, the function expansion board or The CPU or the like provided in the function expansion unit performs part or all of the actual processing, and the functions of the above-described embodiments are also realized by the processing.

FIG. 1: 本発明の実施形態１の動画像符号化装置の構成を示すブロック図である。 Block diagram showing the configuration of the moving image encoding apparatus of the first embodiment of the present invention.
FIG. 2: 本発明の実施形態１の動画像符号化装置が実行する処理を示すフローチャートである。 Flowchart showing the processing executed by the moving image encoding apparatus of the first embodiment of the present invention.
FIG. 3: 本発明の実施形態の符号化対象となるピクチャを説明するための図である。 Diagram for explaining the pictures to be encoded in the embodiments of the present invention.
FIG. 4: 本発明の実施形態１で使用するＲ−Ｄモデルの特性を示す図である。 Diagram showing the characteristics of the R-D model used in the first embodiment of the present invention.
FIG. 5: 本発明の実施形態１で使用するパラメータと係数の関係を説明する図である。 Diagram explaining the relationship between the parameters and coefficients used in the first embodiment of the present invention.
FIG. 6: 本発明の実施形態１のプリフィルタの特性の選択を説明するための図である。 Diagram for explaining the selection of the prefilter characteristic in the first embodiment of the present invention.
FIG. 7: 本発明の実施形態２の動画像符号化装置の構成を示すブロック図である。 Block diagram showing the configuration of the moving image encoding apparatus of the second embodiment of the present invention.
FIG. 8: 本発明の実施形態２の動画像符号化装置が実行する処理を示すフローチャートである。 Flowchart showing the processing executed by the moving image encoding apparatus of the second embodiment of the present invention.
FIG. 9: 本発明の実施形態２におけるシーケンスの構造を説明するための図である。 Diagram for explaining the structure of a sequence in the second embodiment of the present invention.
FIG. 10: 本発明の実施形態２のＭＰＥＧ−４符号化方式の処理を示すフローチャートである。 Flowchart showing the MPEG-4 encoding processing of the second embodiment of the present invention.
FIG. 11: 本発明の実施形態２で使用するＩピクチャＲ−Ｄモデルの特性を示す図である。 Diagram showing the characteristics of the I-picture R-D model used in the second embodiment of the present invention.
FIG. 12: 本発明の実施形態２で使用するＰピクチャＲ−Ｄモデルの特性を示す図である。 Diagram showing the characteristics of the P-picture R-D model used in the second embodiment of the present invention.
FIG. 13: 本発明の実施形態２の入力ピクチャの周波数とフィルタ通過後の周波数との関係を示す図である。 Diagram showing the relationship between the frequency of the input picture and the frequency after filtering in the second embodiment of the present invention.
FIG. 14: 従来技術のＴＭ５を説明するための図である。 Diagram for explaining TM5 of the prior art.
FIG. 15: 従来の技術のバランス関数を説明するグラフを示す図である。 Graph explaining the balance function of the prior art.
FIG. 16: 従来の技術のフィルタ係数を決定するグラフを示す図である。 Graph used to determine the filter coefficient in the prior art.

Explanation of symbols

１００ 動画像符号化装置 Moving image encoding apparatus
１０１ プリフィルタ部 Prefilter unit
１０２ 符号化部 Encoding unit
１０３ 局所復号化部 Local decoding unit
１０４ 符号量制御部 Code amount control unit
１０５ 分散算出部 Variance calculation unit
１０６ ブロック歪み検出部 Block distortion detection unit
１０７ 視覚感度モデル算出部 Visual sensitivity model calculation unit
１０８ パラメータ算出部 Parameter calculation unit
１０９ Ｒ−Ｄモデル算出部 R-D model calculation unit
１１０ フィルタ特性算出部 Filter characteristic calculation unit
１１１ Ｒ−Ｑモデル算出部 R-Q model calculation unit

Claims

A moving image encoding apparatus that, in moving image encoding in which a moving image is encoded in predetermined units, encodes an input image to a predetermined target code amount using a weighting parameter in quantization processing, the apparatus comprising:
Variance calculating means for calculating a variance of the input image;
Filter means for performing a filtering process on the input image with a given filter characteristic;
Encoding means for performing quantization and encoding on the input image filtered by the filter means;
Decoding means for performing a decoding process on the encoded data output by the encoding means;
Detecting means for detecting an amount of block distortion caused by the encoding means, from the input image to the encoding means and a reconstructed image output by the decoding means;
R-D model calculating means in which a defining formula for an R-D model, which is the relationship between the code amount (R) and the coding distortion amount (D) in the encoding means, is defined in advance using the variance of the input image to the encoding means and the coding distortion amount generated by the encoding means, and which calculates a target code amount of the input image from the defining formula;
Visual sensitivity model calculating means for evaluating visual sensitivity by adding together, for the image immediately preceding the input image, the filter distortion amount generated by the filter means, the coding distortion amount generated by the encoding means, and the block distortion amount detected by the detecting means;
Parameter calculating means for calculating a parameter corresponding to the variance of the input image to the encoding means, from the target code amount of the input image, the coding distortion amount calculated by the visual sensitivity model calculating means, and the variance calculated by the variance calculating means;
Filter characteristic calculating means for selecting, from among variance characteristics between the input image to the filter means and the output image from the filter means obtained by varying a plurality of predetermined filter coefficients, the filter characteristic closest to the relationship between the parameter corresponding to the variance of the input image to the encoding means, calculated by the parameter calculating means, and the variance calculated by the variance calculating means; and
R-Q model calculating means for calculating the weighting parameter in the quantization processing from the target code amount calculated by the R-D model calculating means,
Wherein the parameter calculating means calculates in advance the variance of the input image filtered by the filter means and the coding distortion amount generated by the encoding means, using Lagrange's method of undetermined multipliers to maximize or minimize the evaluation formula for evaluating the visual sensitivity, under the constraint that the target code amount of the input image obtained from the defining formula is equal to the code amount.
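
The filter characteristic selection recited above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes that each predetermined filter characteristic can be summarized by a single ratio of output variance to input variance, and the function name `select_filter_characteristic`, the `candidates` table, and the sample values are all hypothetical.

```python
def select_filter_characteristic(input_variance, target_variance, candidates):
    """Return the candidate whose variance ratio is closest to the target ratio.

    candidates: list of (name, output_variance / input_variance) pairs.
    The single-ratio summary of a filter is an assumption of this sketch.
    """
    if input_variance <= 0:
        raise ValueError("input_variance must be positive")
    # Relationship between the variance wanted at the encoder input
    # and the variance measured on the unfiltered input image.
    target_ratio = target_variance / input_variance
    return min(candidates, key=lambda c: abs(c[1] - target_ratio))


# Hypothetical table: stronger low-pass filters keep less of the variance.
candidates = [("none", 1.00), ("weak", 0.85), ("medium", 0.60), ("strong", 0.35)]
chosen = select_filter_characteristic(400.0, 250.0, candidates)
print(chosen[0])  # medium
```

Here the target ratio is 250 / 400 = 0.625, so the entry with ratio 0.60 is the nearest of the four candidates.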

A moving picture coding method that, in moving picture coding in which a moving picture is coded in predetermined units, codes an input image to a predetermined target code amount using a weighting parameter in quantization processing, the method comprising:
A variance calculating step of calculating variance of the input image;
A filtering step for performing a filtering process on the input image with given filter characteristics;
An encoding step of performing quantization processing and encoding the input image filtered in the filtering step;
A decoding step of performing a decoding process on the encoded data output by the encoding step;
A detection step of detecting an amount of block distortion caused by the encoding step, from the input image to the encoding step and a reconstructed image output by the decoding step;
An R-D model calculating step in which a defining formula for an R-D model, which is the relationship between the code amount (R) and the coding distortion amount (D) in the encoding step, is defined in advance using the variance of the input image to the encoding step and the coding distortion amount generated by the encoding step, and in which a target code amount of the input image is calculated from the defining formula;
A visual sensitivity model calculating step of evaluating visual sensitivity by adding together, for the image immediately preceding the input image, the filter distortion amount generated by the filtering step, the coding distortion amount generated by the encoding step, and the block distortion amount detected in the detection step;
A parameter calculating step of calculating a parameter corresponding to the variance of the input image to the encoding step, from the target code amount of the input image, the coding distortion amount calculated in the visual sensitivity model calculating step, and the variance calculated in the variance calculating step;
A filter characteristic calculating step of selecting, from among variance characteristics between the input image to the filtering step and the output image from the filtering step obtained by varying a plurality of predetermined filter coefficients, the filter characteristic closest to the relationship between the parameter corresponding to the variance of the input image to the encoding step, calculated in the parameter calculating step, and the variance calculated in the variance calculating step; and
An R-Q model calculating step of calculating the weighting parameter in the quantization processing from the target code amount calculated in the R-D model calculating step,
Wherein, in the parameter calculating step, the variance of the input image filtered in the filtering step and the coding distortion amount generated by the encoding step are calculated in advance using Lagrange's method of undetermined multipliers to maximize or minimize the evaluation formula for evaluating the visual sensitivity, under the constraint that the target code amount of the input image obtained from the defining formula is equal to the code amount.
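
The claims above refer to a defining formula for the R-D model without stating it. As a rough illustration only, the sketch below substitutes the classical rate-distortion function of a memoryless Gaussian source, R(D) = max(0, 0.5 * log2(variance / D)); this formula is an assumption of the sketch and may differ from the defining formula used in the embodiments. The function names are hypothetical.

```python
import math


def code_amount_bits_per_sample(variance, distortion):
    """Assumed model: R(D) = max(0, 0.5 * log2(variance / distortion))."""
    if variance <= 0 or distortion <= 0:
        raise ValueError("variance and distortion must be positive")
    return max(0.0, 0.5 * math.log2(variance / distortion))


def distortion_for_code_amount(variance, bits_per_sample):
    """Inverse of the assumed model: D(R) = variance * 2 ** (-2 * R)."""
    return variance * 2.0 ** (-2.0 * bits_per_sample)


rate = code_amount_bits_per_sample(variance=16.0, distortion=1.0)
print(rate)  # 2.0
print(distortion_for_code_amount(16.0, rate))  # 1.0
```

Under this assumed model, lowering the variance of the encoder input (for example by pre-filtering) reduces the code amount needed for a given coding distortion, which is the kind of trade-off between filter distortion and coding distortion that the parameter calculation balances.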